An Evaluation of Microarray Visualization Tools for Biological Insight
Presented by Tugrul Ince and Nir Peer, University of Maryland
Purvi Saraiya, Chris North: Dept. of Computer Science, Virginia Polytechnic Institute and State University
Karen Duca: Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University
2
Goals
  Evaluate five popular visualization tools
    Cluster/Treeview
    TimeSearcher
    Hierarchical Clustering Explorer (HCE)
    Spotfire
    GeneSpring
  Do so in the context of bioinformatics data exploration
3
Goals
  Research Questions
    How successful are these tools in stimulating insight?
    How do various visualization techniques affect the users’ perception of data?
    How does the users’ background affect tool usage?
    How do these tools support hypothesis generation?
    Can insight be measured in a controlled
  Typically, evaluations consist of
    controlled measurements of user performance and accuracy on predetermined tasks
  We are looking for an evaluation that better simulates a bioinformatics data analysis scenario
  We use a protocol that focuses on
    recognition and quantification of insights gained from actual exploratory use of visualizations
5
Insights
  Hard to define what an “insight” is
  We need this term to be quantifiable and reproducible
  Solution
    Encourage users to think aloud and report any findings they have about the dataset
    Videotape a session to capture and characterize individual insights as they occur
      generally provides more information than subjective measures from post-experiment surveys
6
Insights
  Define insight as
    an individual observation about the data by the participant
    a unit of discovery
  Essentially, any data observation made during the think-aloud protocol
  Now we can quantify some characteristics of each insight
  2 participants per dataset per tool
  Have at least a Bachelor’s degree in a biological field
  Assigned to tools they had never worked with before
    to prevent advantage
    measure learning time

    Senior researchers with extensive experience in microarray experiments and microarray data analysis
  11 Domain Novices
    Lab technicians or graduate student research assistants
  9 Software Developers
    Professionals who implement microarray software tools
11
Protocol and Measures
  Chose new users with only minimal tool training
    Success in the initial usage period is critical for the tool’s adoption by biologists
  Participants received initial training
    Background description about the dataset
    15-minute tool tutorial
  Participants listed some analysis questions
  Instructed to examine the data with the tool as long as needed
  They were allowed to ask for help about the tool
    Simulates training by colleagues
12
Protocol and Measures
  Every 15 minutes, participants estimated the percent of total potential insight they had obtained so far
  Finally, assessed overall experience with the tools during the session
  Entire session was videotaped for later analysis
    Later, all individual occurrences of insights were identified and codified
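A minimal sketch, not from the slides, of how the coded insights from one videotaped session might be stored and summarized. The record fields, the 1-5 domain-value scale, and the function names are assumptions chosen for illustration; the slides only say that insights were identified and codified.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Insight:
    """One coded insight (hypothetical record layout)."""
    minute: float            # minutes since session start when it was voiced
    domain_value: int        # importance assigned by the coder (assumed 1-5 scale)
    unexpected: bool         # True if the participant was not looking for it
    led_to_hypothesis: bool  # True if it prompted a new hypothesis


def summarize_session(insights: List[Insight]) -> Dict[str, Optional[float]]:
    """Aggregate the coded insights of a single session."""
    ordered = sorted(insights, key=lambda i: i.minute)
    return {
        "count": len(ordered),
        "total_domain_value": sum(i.domain_value for i in ordered),
        "time_to_first": ordered[0].minute if ordered else None,
        "unexpected": sum(i.unexpected for i in ordered),
        "hypotheses": sum(i.led_to_hypothesis for i in ordered),
    }


def mean_time_to_first(sessions: List[List[Insight]]) -> Optional[float]:
    """Average time to first insight over sessions that produced any insight."""
    firsts = [
        summary["time_to_first"]
        for summary in map(summarize_session, sessions)
        if summary["time_to_first"] is not None
    ]
    return mean(firsts) if firsts else None
```

Grouping sessions by tool and calling mean_time_to_first on each group would give a per-tool average of the kind reported on the results slides; sessions with no insights are skipped rather than counted as zero.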
Avg. Time to First Insight
  ClusterView: very short time to first insight
  TimeSearcher 1 and Spotfire are also quick

Unexpected Insights
  HCE revealed several unexpected results
  ClusterView provided a few
  TimeSearcher 1 for time series data
  Spotfire contributed to 2 unexpected insights

Hypotheses
  A few insights led to hypotheses

  Overview of genes in general
  Expression Patterns
    Searching patterns is critical
    Clustering is useful
  Grouping
    Some users wanted to group genes
    GeneSpring enables grouping
  Detail Information
    Users want detailed information about genes that are familiar to them
28
Visual Representations and Interactions
  Although some tools have many visualization techniques, users tend to use only a few
    Spotfire users preferred heat-maps
    GeneSpring users preferred parallel coordinates
  Lupus dataset: visualized best with heat-maps
  Most users preferred outputs of clustering algorithms
  HCE not useful when a particular column arrangement is useful
29
Running out of time, so, wrap up
  Use a visualization tool (that’s why we’re here!)
  Spotfire: best general performance
  GeneSpring: hard to use
  Dataset dictates best tool!
    Time series data: TimeSearcher
    Others: Spotfire, GeneSpring?
  Interaction is the key
  Grouping and clustering are necessary features
30
Critique
  In all fairness, measuring insights is really hard! Here are some possible issues
  Subjectivity
    Experiment relies on users always thinking aloud
    Also depends on a domain expert to evaluate insights
    Results may vary widely based on participants’ expertise (only two per tool-dataset pair)
    Some insight characteristics are inherently subjective
      Domain Value
      Breadth vs. Depth
31
Critique
  How does one count insights?
    Assumes honest reporting by participants
    Some insights may be of no great value
    What if a discovery just reaffirms a known fact? Is that an insight?
  Measuring time taken to reach an insight
    Maybe instead of measuring from the beginning of the session we should measure from the last insight
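The two timing conventions in the last point can be compared with a small sketch. This is an illustration only; the function names and the use of minutes as the unit are assumptions, not something the slides specify.

```python
from typing import List


def times_from_start(insight_minutes: List[float]) -> List[float]:
    """Time to each insight measured from the beginning of the session."""
    return sorted(insight_minutes)


def times_from_previous(insight_minutes: List[float]) -> List[float]:
    """Time to each insight measured from the previous insight.

    The first insight has no predecessor, so it is measured from
    the session start (minute 0).
    """
    gaps = []
    previous = 0.0
    for minute in sorted(insight_minutes):
        gaps.append(minute - previous)
        previous = minute
    return gaps
```

For insights voiced at minutes 5, 12 and 30, the first convention yields 5, 12 and 30, while the second yields 5, 7 and 18, so an insight that arrives late in a long session is not penalized for the time spent reaching earlier ones.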