Polaris Query, Analysis, and Visualization of Large Hierarchical Relational Databases Pat Hanrahan With Chris Stolte and Diane Tang Computer Science Department Stanford University

Dec 20, 2015

Polaris

Query, Analysis, and Visualization of

Large Hierarchical Relational Databases

Pat Hanrahan

With Chris Stolte and Diane Tang

Computer Science Department

Stanford University

Motivation

Large databases have become very common

Corporate data warehouses: Amazon, Walmart, …

Scientific projects: Human Genome Project, Sloan Digital Sky Survey

Need tools to extract meaning from these databases

Related Work

Formalisms for graphics: Bertin’s “Semiology of Graphics”, Mackinlay’s APT, Roth et al.’s Sage and SageBrush, Wilkinson’s “Grammar of Graphics”

Visual exploration of databases: DeVise, DataSplash/Tioga-2

Visualization and data mining: SGI’s MineSet, IBM’s Diamond

Formalism

Polaris Formalism

UI interpreted as visual specification that defines:

Table configuration

Type of graphic in each pane

Encoding of data as visual properties of marks

Data transformations and queries

Schema

Ordinal fields (categorical): Market, State, Year, Quarter, Month, Product Type, Product

Quantitative fields (measures): Profit, Sales, Payroll, Marketing, Inventory, Margin, COGS, ...

Coffee chain data [Visual Insights]

Polaris Visual Encodings

Principle of Importance Ordering: Encode the most important information in the most effective way [Cleveland & McGill]

The Pivot Table Interface

Common interface to statistical packages/Excel

Cross-tabulations

Simple interface based on drag-and-drop

Data Cubes

Structure relation as n-dimensional cube

Each cell aggregates all measures for those dimensions

Each cube axis corresponds to a dimension in the relation
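
As a minimal sketch of the idea above, the snippet below builds a two-dimensional cube from a handful of fact rows. The rows, the choice of Quarter and Product Type as axes, and the use of a sum as the aggregate are illustrative assumptions, not data or code from the talk.

```python
from collections import defaultdict

# Hypothetical fact rows: (quarter, product_type, profit, sales).
rows = [
    ("Qtr1", "Coffee", 10, 100),
    ("Qtr1", "Coffee", 5, 50),
    ("Qtr1", "Tea", 3, 30),
    ("Qtr2", "Coffee", 7, 70),
]


def build_cube(rows):
    """Sum every measure for each (quarter, product_type) cell."""
    cube = defaultdict(lambda: {"profit": 0, "sales": 0})
    for quarter, product_type, profit, sales in rows:
        cell = cube[(quarter, product_type)]  # one cell per axis combination
        cell["profit"] += profit
        cell["sales"] += sales
    return dict(cube)


cube = build_cube(rows)
```

Each key of `cube` is one cell address (one value per axis), and the cell holds the aggregated measures for the rows that fall into it.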

Table Algebra: Operands

Ordinal fields: interpret domain as a set that partitions table into rows and columns:

Quarter = {(Qtr1),(Qtr2),(Qtr3),(Qtr4)}

Quantitative fields: treat domain as single element set and encode spatially as axes:

Profit = {(Profit)}

Concatenation (+) Operator

Ordered union of two sets

Quarter + ProductType

= {(Qtr1),(Qtr2),(Qtr3),(Qtr4)}+{(Coffee),(Espresso)}

= {(Qtr1),(Qtr2),(Qtr3),(Qtr4),(Coffee),(Espresso)}

Profit + Sales

= {(Profit),(Sales)}
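
The operands and the concatenation operator can be sketched with ordered lists of 1-tuples. This is an illustration only: the `concat` helper is a hypothetical name, and dropping entries that already appear in the left operand is an assumption about what "ordered union" means here.

```python
# Field domains as on the slides; each entry is a 1-tuple.
QUARTER = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
PRODUCT_TYPE = [("Coffee",), ("Espresso",)]
PROFIT = [("Profit",)]  # quantitative field: single-element set
SALES = [("Sales",)]


def concat(a, b):
    """Ordered union: entries of a, then entries of b not already in a."""
    return a + [entry for entry in b if entry not in a]


quarter_plus_type = concat(QUARTER, PRODUCT_TYPE)
profit_plus_sales = concat(PROFIT, SALES)
```

Lists are used instead of Python sets because the order of entries determines the order of rows, columns, or axes in the table.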

Cross (×) Operator

Direct-product of two sets

Quarter × ProductType =

{(Qtr1,Coffee), (Qtr1, Tea), (Qtr2, Coffee), (Qtr2, Tea),

(Qtr3, Coffee), (Qtr3, Tea), (Qtr4, Coffee), (Qtr4,Tea)}

ProductType × Profit =

{(Coffee, Profit), (Tea, Profit)}
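
A minimal sketch of the cross operator, assuming the same list-of-tuples representation of field domains; the `cross` helper is a hypothetical name. Crossing an ordinal field with a quantitative field pairs every ordinal value with the single measure entry.

```python
from itertools import product

# Field domains as on this slide; each entry is a 1-tuple.
QUARTER = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
PRODUCT_TYPE = [("Coffee",), ("Tea",)]
PROFIT = [("Profit",)]


def cross(a, b):
    """Direct product: join every entry of a with every entry of b, in order."""
    return [x + y for x, y in product(a, b)]


quarter_by_type = cross(QUARTER, PRODUCT_TYPE)
type_by_profit = cross(PRODUCT_TYPE, PROFIT)
```

`itertools.product` varies the right operand fastest, which gives the same ordering as the listing on the slide.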

SQL Dataflow

Notes: Aggregation operators are applied after the sort. Only one layer is shown; additional z-sort.

[Diagram: Relational Table, Tuples in Panes, Marks in Panes, with a Sort step]

Multiscale Visualization

Hierarchical Structure

Challenge: these databases are very large

Queries/Vis should not require all the records

Augment database with hierarchical structure

Provide meaningful levels of abstraction

Derived from domain or clustering

Provides metadata (missing data for context)

Hierarchies and Data Cubes

Each dimension in the cube is structured as a tree

Each level in tree corresponds to level of detail
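
To make the level-of-detail idea concrete, here is a small sketch that rolls a measure up one level of a Time hierarchy. The month-to-quarter mapping, the profit values, and the `roll_up` helper are all hypothetical and only illustrate the tree structure described above.

```python
from collections import defaultdict

# Hypothetical slice of a Time hierarchy: each month points to its parent quarter.
MONTH_TO_QUARTER = {"Jan": "Qtr1", "Feb": "Qtr1", "Mar": "Qtr1", "Apr": "Qtr2"}

# Hypothetical profit at the finest level of detail (month).
monthly_profit = {"Jan": 4, "Feb": 6, "Mar": 5, "Apr": 9}


def roll_up(values, parent_of):
    """Aggregate child-level values into their parent level by summing."""
    totals = defaultdict(int)
    for child, value in values.items():
        totals[parent_of[child]] += value
    return dict(totals)


quarterly_profit = roll_up(monthly_profit, MONTH_TO_QUARTER)
```

Applying `roll_up` again with a quarter-to-year mapping would move one more level up the tree, giving a coarser view of the same cube.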

Schema: Star Schema

Fact table: State, Month, Product, Profit, Sales, Payroll, Marketing, Inventory, Margin, ... (Profit onward are the measures)

Existence tables:

Location: Market, State

Time: Year, Quarter, Month

Products: Product Type, Product Name

Generalizations

• Snowflake schemas

• Lattices (DAGs)

Categorical Hierarchies

Quarter × Month

Direct product of two sets

Would create twelve entries for each quarter, including meaningless ones such as (Qtr1, December)

Quarter / Month

Based on tuples in database not semantics

Would only create three entries per quarter

Can be expensive to compute

Quarter . Month

Based on tuples in existence tables (not db)
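
The difference between the three operators can be sketched as follows. The existence table, the sparse fact rows, and the helper names `cross`, `nest`, and `dot` are assumptions made for illustration; they are not taken from the Polaris implementation.

```python
from itertools import product

QUARTERS = ["Qtr1", "Qtr2", "Qtr3", "Qtr4"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical existence table: the valid (quarter, month) pairs of the hierarchy.
EXISTENCE = [(f"Qtr{i // 3 + 1}", month) for i, month in enumerate(MONTHS)]

# Hypothetical fact rows (quarter, month, profit); only some months have data.
fact_rows = [("Qtr1", "January", 10), ("Qtr1", "January", 4),
             ("Qtr1", "March", 7), ("Qtr2", "April", 3)]


def cross(a, b):
    """Quarter x Month: every combination, valid or not."""
    return list(product(a, b))


def nest(rows):
    """Quarter / Month: distinct pairs found by scanning the fact rows."""
    seen = []
    for quarter, month, _ in rows:
        if (quarter, month) not in seen:
            seen.append((quarter, month))
    return seen


def dot(existence):
    """Quarter . Month: pairs read from the small existence table."""
    return list(existence)
```

With data for every month, `nest` yields three entries per quarter; on the sparse sample it yields fewer, and it always has to scan the fact rows, which is why it can be expensive. `dot` returns the twelve valid pairs without touching the fact data.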

Cartographic Generalization

Canterbury and East Kent

Map scales: 1:50,000 and 1:625,000

Generalization: Techniques

Selection

Simplification

Exaggeration

Regularization

Displacement

Aggregation

Summary

Polaris

Spreadsheet or table-based displays

Simple drag-and-drop interface

Built on a formalism that allows algebraic manipulation of visual mapping of tuples to marks

Multiscale visualizations using data and visual abstraction

Connects to SQL/MDX servers

See http://www.graphics.stanford.edu/projects/polaris

Future Work

Articulate full set of multiscale design patterns

Transition between levels of detail

Develop system infrastructure for browsing VLDB

Support layers/lenses/linking with tuple flow

Device independence through graphical encodings

Extend formalism to 3D

Couple scientific and information visualization