
Dec 19, 2015


XmdvTool: Interactive Visual Data Exploration System for High-dimensional Data Sets

Matthew O. Ward, Elke A. Rundensteiner, Jing Yang, Punit Doshi, Geraldine Rosario, Allen R. Martin, Ying-Huey Fua, Daniel Stroe

Worcester Polytechnic Institute

http://davis.wpi.edu/~xmdv

This work partially funded by NSF Grants IIS-9732897, IRIS-9729878 and IIS-0119276


XmdvTool Features

• Hierarchical visualization and interaction tools for exploring very large high-dimensional data sets to discover patterns, trends and outliers

• Applications:
– Bioterrorism Detection
– Bioinformatics and Drug Discovery
– Space Science
– Geology and Geochemistry
– Systems Monitoring and Performance Evaluation
– Economics and Business
– Simulation Design and Analysis

• Multi-platform support (Unix, Linux, Windows)

• Public domain software: http://davis.wpi.edu/~xmdv


Xmdv: Main Features

• Scale-up to High Dimensions: Visual Hierarchical Dimension Reduction

• Scale-up to Large Data Sets: Interactive Hierarchical Displays, Database Backend with MinMax Encoding, Semantic Caching and Adaptive Prefetching

• Interlinked Multi-Displays: Parallel Coordinates, Glyphs, Scatterplot Matrices, Dimensional Stacking

• Visual Interaction Tools: N-Dimensional Brushes, Structure-Based Brushing, InterRing


Scale-Up for Large Number of Dimensions

Solution to High Dimensional Datasets:

• Group Similar Dimensions into Dimension Hierarchy
• Navigate Dimension Hierarchy by InterRing
• Form Lower Dimensional Spaces by Dimension Clusters
• Convey Dimension Cluster Information by Dissimilarity Display
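
As a rough illustration of the first step (grouping similar dimensions into a dimension hierarchy), the sketch below runs a simple agglomerative clustering over the columns of a data set. The dissimilarity measure (one minus the absolute Pearson correlation), the average-linkage rule and all function names are assumptions made for illustration; the slides do not say how XmdvTool measures similarity between dimensions.

```python
from itertools import combinations
from statistics import mean, pstdev


def dissimilarity(a, b):
    """1 - |Pearson correlation| between two dimensions (assumed measure)."""
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        return 1.0  # a constant dimension is treated as unrelated to everything
    ma, mb = mean(a), mean(b)
    cov = mean((x - ma) * (y - mb) for x, y in zip(a, b))
    return 1.0 - abs(cov / (sa * sb))


def build_dimension_hierarchy(columns):
    """Agglomerative (average-linkage) clustering over dimensions.

    columns: dict mapping dimension name -> list of values.
    Returns a nested tuple tree whose leaves are dimension names.
    """
    # Each cluster is a pair: (subtree, list of member dimension names).
    clusters = [(name, [name]) for name in columns]

    def linkage(pair):
        (_, members1), (_, members2) = pair
        return mean(
            dissimilarity(columns[p], columns[q])
            for p in members1
            for q in members2
        )

    while len(clusters) > 1:
        # Merge the two clusters whose dimensions are most similar on average.
        c1, c2 = min(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]
```

Each internal node of the returned tree corresponds to a dimension cluster; picking one representative per cluster at some depth would give a lower-dimensional subspace of the kind the slide describes.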


Visual Hierarchical Dimension Reduction Process


Visual Hierarchical Dimension Reduction Process

[Figure panels: A 42-dimensional Data Set; Dimension Hierarchy Interaction Tool: InterRing; A 4-Dimensional Subspace]


InterRing - Dimension Hierarchy Navigation and Manipulation

• Roll-up/Drill-down
• Rotate
• Zoom in/out
• Distort
• Modify


Dissimilarity Display

Three Axes Method

Mean-Band Method

Diagonal Plot Method

Axis Width Method


Scale-up for Large Number of Records

Solution to Large Scale Datasets:

• Group Similar Records into Data Hierarchy
• Navigate Data Hierarchy by Structure-Based Brushing
• Represent Data Clusters by Mean-Band Method
• Provide Database Backend Support using MinMax Tree, Caching, Prefetching


Interactive Hierarchical Display

[Figure, 2D example: Hierarchical Clustering; Structure-Based Brushing]


Interactive Hierarchical Display

[Figure, Mean-Band Method in Parallel Coordinates: Flat Display; Hierarchical Display]
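
A minimal sketch of the data behind a mean-band, assuming the band for a cluster is bounded on each parallel-coordinates axis by the cluster's minimum and maximum, with the mean drawn as the center line. The function name and the `(low, mean, high)` layout are hypothetical, and rendering details such as opacity are left out.

```python
def mean_band(records):
    """Summarize one cluster for a mean-band display.

    records: non-empty list of equal-length numeric tuples, one per data point.
    Returns one (low, mean, high) triple per dimension.
    """
    if not records:
        raise ValueError("cluster must contain at least one record")
    band = []
    for values in zip(*records):  # walk the cluster dimension by dimension
        band.append((min(values), sum(values) / len(values), max(values)))
    return band
```

In a hierarchical display, one such summary per visible cluster would replace the individual polylines that a flat display draws for every record.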


Interactive Hierarchical Display

[Figure, Mean-Band Method in Parallel Coordinates: Flat Display; Hierarchical Display]


Scalability of Data Access

• Approach
– Attach database system to visualization front-end

• MinMax hierarchy encoding
– Key idea: avoid recursive processing
– Pre-computed

• Caching
– Key idea: reduce response time and network traffic

• Prefetching
– Key idea: use application hints and predict user patterns
– Performed during idle time


Scalability of Data Access: MinMax Hierarchy Encoding

• Pre-compute object positions
– level-of-detail (L)
– extent values (x, y)
– preserve tree structure

• New query semantics
– objects are now rectangles
– select objects that touch L
– select objects that touch (x, y)
– structure-based brush = intersection of two selections

[Figure: hierarchy annotated with level of detail (L) and extent values (x, y); query = (x, y, L)]
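
The encoding idea can be sketched as follows: a one-time labeling pass stores an extent interval and a level-of-detail range on every node, so that a structure-based brush (x, y, L) becomes a flat filter with no tree traversal at query time. The field names, the use of leaf positions as extents, and the rule that a node is visible while `size <= L < parent size` are all assumptions for illustration, not the scheme as implemented in XmdvTool.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    size: float                    # level-of-detail value of this cluster (assumed, e.g. its radius)
    children: list = field(default_factory=list)
    lo: int = 0                    # extent: first leaf position covered
    hi: int = 0                    # extent: last leaf position covered
    top: float = float("inf")      # parent's size; node is shown while size <= L < top


def label(node, next_leaf=0, parent_size=float("inf")):
    """Off-line pass that pre-computes extents and level ranges for every node."""
    node.top = parent_size
    if not node.children:
        node.lo = node.hi = next_leaf
        return next_leaf + 1
    node.lo = next_leaf
    for child in node.children:
        next_leaf = label(child, next_leaf, node.size)
    node.hi = next_leaf - 1
    return next_leaf


def flatten(node):
    """Store the labeled nodes as flat rows, as a database table would."""
    rows = [node]
    for child in node.children:
        rows.extend(flatten(child))
    return rows


def brush(rows, x, y, level):
    """Structure-based brush: intersection of a level selection and an extent selection."""
    touches_level = [n for n in rows if n.size <= level < n.top]
    return [n for n in touches_level if n.lo <= y and n.hi >= x]
```

Because both conditions in `brush` are plain range comparisons on pre-computed columns, the same selection could be expressed as a single non-recursive query against a table of labeled nodes.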


Scalability of Data Access: Caching

• Purpose
– reduce response time and network traffic

• Issues
– visual query cannot directly translate into object IDs, hence a high-level cache specification is needed to avoid complete scans

• Semantic caching
– queries are cached rather than objects
– minimize cost of cache lookup
– dynamically adapt cached queries to patterns of queries
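
To make "queries are cached rather than objects" concrete, here is a small sketch in which cache entries are keyed by the query region (x, y, L) and a new query is answered locally whenever a cached query at the same level covers its extent range. The class name, the `fetch` callback, the simplification of results to `(position, payload)` rows, and the least-recently-used eviction are assumptions for illustration only.

```python
from collections import OrderedDict


class SemanticCache:
    """Caches results by query region (x, y, level) rather than by object ID."""

    def __init__(self, fetch, capacity=8):
        self.fetch = fetch            # hypothetical backend call: fetch(x, y, level) -> [(position, payload), ...]
        self.capacity = capacity
        self.entries = OrderedDict()  # (x, y, level) -> rows, least recently used first
        self.hits = 0
        self.misses = 0

    def query(self, x, y, level):
        # A cached query answers the new one if it is at the same level
        # and its extent range contains the requested range.
        hit = next(
            (k for k in self.entries if k[2] == level and k[0] <= x and y <= k[1]),
            None,
        )
        if hit is not None:
            self.entries.move_to_end(hit)
            self.hits += 1
            return [row for row in self.entries[hit] if x <= row[0] <= y]
        self.misses += 1
        rows = self.fetch(x, y, level)
        self.entries[(x, y, level)] = rows
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used query
        return rows
```

The lookup compares a handful of query descriptors instead of scanning cached objects, which is one way to keep the cost of a cache lookup low.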


Scalability of Data Access: Prefetching

• Strategy
– Speculative (no specific hints): navigation remains local; both user and data set influence exploration
– Adaptive (strategy changes over time): evolves as more knowledge becomes available
– Non-pure (interruptible prefetching): leave buffer in consistent state

• Requirements
– non-pure prefetching + large transactions & small object size + semantic caching → small granularity (object level)
– speculative, non-pure prefetcher → cache replacement policy + guessing method


Scalability of Data Access: Experimental Evaluation

[Chart: Effectiveness of Caching. Response time (seconds), axis 0 to 200, for four caching configurations: Client OFF / Server OFF, Client OFF / Server ON, Client ON / Server OFF, Client ON / Server ON]

[Chart: Effectiveness of Prefetcher. % improvement in response time (axis 0 to 30) against delay between user operations (0 to 8 seconds)]

Conclusions:
• Caching reduces response time by 80%
• Prefetching further reduces response time by 30%
• Designing better prefetching strategies might help further reduce response time


Scalability of Data Access: Prefetching

[Figure: Localized Speculative Strategies]

• Random Strategy
• Direction Strategy: (m-1), m, (m+1)
• Focus Strategy: hot regions, current navigation window
• Vector Strategies
– Mean Strategy: m(n-2), m(n-1), m(n), m(n+1)
– Exponential Weight Average Strategy: m(n-2), m(n-1), m(n), m(n+1)
• Data Set Driven Strategy
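
The Mean and Exponential Weight Average strategies are only named on the slide, so the following is a guess at their general shape rather than the authors' formulas: each predicts the next move m(n+1) of the navigation window from the past moves m(n-2), m(n-1), m(n). The function names, the tuple representation of a move and the default `alpha` are assumptions.

```python
def mean_strategy(moves):
    """Predict the next move as the plain average of the past move vectors."""
    n = len(moves)
    dims = len(moves[0])
    return tuple(sum(m[i] for m in moves) / n for i in range(dims))


def ewa_strategy(moves, alpha=0.5):
    """Predict the next move as an exponentially weighted average.

    moves are ordered oldest first; alpha (an assumed value) is the weight
    given to each newer move relative to the running estimate.
    """
    estimate = moves[0]
    for move in moves[1:]:
        estimate = tuple(
            alpha * current + (1 - alpha) * previous
            for current, previous in zip(move, estimate)
        )
    return estimate
```

A prefetcher could add the predicted move to the current brush position and request that region during idle time; recent moves dominate the exponentially weighted estimate, while the plain mean treats all past moves equally.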


Xmdv System Implementation

• Tools
– C/C++
– TCL/TK
– OpenGL
– Oracle 8i
– Pro*C

[Architecture diagram. Region labels: OFF-LINE PROCESS, ON-LINE PROCESS, MEMORY. Component labels: User, GUI, Exploration Variables, Estimator, Rewriter, Translator, Queries, Buffer, Loader, MinMax Labeling, Schema Info, Flat Data, Hierarchical Data, DB, Prefetcher, Library (Random, Direction, Focus, EWA, Mean)]


Publications (available at http://davis.wpi.edu/~xmdv)

• Jing Yang, Matthew O. Ward and Elke A. Rundensteiner, "InterRing: An Interactive Tool for Visually Navigating and Manipulating Hierarchical Structures", InfoVis 2002, to appear

• Punit R. Doshi, Elke A. Rundensteiner, Matthew O. Ward and Daniel Stroe, "Prefetching For Visual Data Exploration", Technical Report WPI-CS-TR-02-07, 2002

• Jing Yang, Matthew O. Ward and Elke A. Rundensteiner, "Interactive Hierarchical Displays: A General Framework for Visualization and Exploration of Large Multivariate Data Sets", Computers and Graphics Journal, 2002, to appear

• Daniel Stroe, Elke A. Rundensteiner and Matthew O. Ward, "Scalable Visual Hierarchy Exploration", Database and Expert Systems Applications, pages 784-793, Sept. 2000

• Ying-Huey Fua, Matthew O. Ward and Elke A. Rundensteiner, "Hierarchical Parallel Coordinates for Exploration of Large Datasets", IEEE Proceedings of Visualization, pages 43-50, Oct. 1999

• Ying-Huey Fua, Matthew O. Ward and Elke A. Rundensteiner, "Navigating Hierarchies with Structure-Based Brushes", IEEE Proceedings of Visualization, pages 43-50, Oct. 1999