Interactive Query Processing in Scientific Applications David Liu UC Berkeley Computer Science Division

Dec 21, 2015

Page 1

Interactive Query Processing in Scientific Applications

David Liu

UC Berkeley

Computer Science Division

Page 2

Problem: Data management in Scientific Experiments

Big science is about managing big data

Scientific applications
  lots of storage
  distributed architectures
  lots of computation

Embarrassingly parallel

Embarrassingly slow

Page 3

Interactive Query Processing + $$$

Can locality and random sampling go together?

[Diagram, three panels.
  Panel 1: data, Query Processor, interactive random samples
  Panel 2: data, Query Processor, cache, batch
  Panel 3: data, Query Processor, cache, interactive random samples]

Page 4

Fine-grain computation scheduling

Three queries active in the system

Process A first, then B

Q1: G, B, A
Q2: D, B, A
Q3: E, C, A

Page 5

Fine-grain computation scheduling

All queries partially satisfied

Q1 aborted at 66% complete

Q1: G, B, A
Q2: D, B, A
Q3: E, C, A

Page 6

Fine-grain computation scheduling

D, E, and C satisfy one query apiece. Which one to go for?

E or C, because either one improves the more neglected query

Q2: D, B, A
Q3: E, C, A

Page 7

Fine-grain scheduling

Use fine-grain scheduling to improve overall mirth
  Schedule more popular tuples first
  Schedule neglected queries first
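
The two rules above can be sketched in a few lines of Python. This is only an illustrative reading of the slides, not the author's implementation: the names `progress`, `next_tuple`, and `schedule` are hypothetical, "popularity" is assumed to mean the number of active queries that need a tuple, and "neglect" is assumed to mean one minus the fraction of a query's tuples already processed. Ties are broken alphabetically, which the slides do not specify.

```python
from typing import Dict, List, Optional, Set


def progress(needed: Set[str], done: Set[str]) -> float:
    """Fraction of a query's tuples that have already been processed."""
    if not needed:
        return 1.0
    return len(needed & done) / len(needed)


def next_tuple(queries: Dict[str, Set[str]], done: Set[str]) -> Optional[str]:
    """Pick the next tuple: most popular first, then most neglected query first."""
    pending = sorted({t for needed in queries.values() for t in needed} - done)
    best_key, best = None, None
    for t in pending:
        wanting = [q for q, needed in queries.items() if t in needed]
        popularity = len(wanting)  # assumption: number of queries needing t
        # assumption: neglect of the least-complete query that wants t
        neglect = 1.0 - min(progress(queries[q], done) for q in wanting)
        key = (popularity, neglect)
        if best_key is None or key > best_key:  # strict '>' keeps alphabetical ties
            best_key, best = key, t
    return best


def schedule(queries: Dict[str, Set[str]]) -> List[str]:
    """Return the full processing order for the given active queries."""
    done: Set[str] = set()
    order: List[str] = []
    while (t := next_tuple(queries, done)) is not None:
        order.append(t)
        done.add(t)
    return order


# The three queries from the slides.
queries = {"Q1": {"G", "B", "A"}, "Q2": {"D", "B", "A"}, "Q3": {"E", "C", "A"}}
first_two = schedule(queries)[:2]

# After A and B, with Q1 aborted, only Q2 and Q3 remain.
remaining = {"Q2": queries["Q2"], "Q3": queries["Q3"]}
choice = next_tuple(remaining, {"A", "B"})
```

Under these assumptions the sketch processes A first (needed by all three queries), then B (needed by two), and with Q1 aborted it then picks a tuple of Q3, the query that has made the least progress.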

Page 8

Vision: A Flexible Querying System

Flexible Querying: sacrifice fidelity for performance

Ways to sacrifice fidelity:
  partial answer
  approximate answer
  alternative answer

Another way to improve perceived performance: interactive results
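
One possible reading of "approximate answer" combined with "interactive results" is a running estimate that is refined as more randomly chosen rows are processed. The sketch below assumes the query is a simple mean over an in-memory list; the function name `running_mean` and the `batch` and `seed` parameters are hypothetical and do not come from the slides.

```python
import random
from typing import Iterator, Sequence, Tuple


def running_mean(
    values: Sequence[float], batch: int = 100, seed: int = 0
) -> Iterator[Tuple[float, float]]:
    """Yield (fraction_seen, estimate) after each batch of a random-order scan."""
    order = list(range(len(values)))
    random.Random(seed).shuffle(order)  # visit rows in random order
    total = 0.0
    for i, idx in enumerate(order, start=1):
        total += values[idx]
        # Report after every full batch and once more at the very end.
        if i % batch == 0 or i == len(order):
            yield i / len(order), total / i


data = list(range(1000))
results = list(running_mean(data, batch=250))
```

A caller can stop consuming the generator as soon as the estimate is good enough, trading fidelity for response time; running it to the end yields the exact mean.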

Page 9

A Flexible Querying System

[Diagram components: Query Processor, User [Agent], Costs and Utilities, Partial / Approximate / Alternative Answers, Preferences, Cache, Flexibility Rules, Data]