Top Banner
HOLISTIC OPTIMIZATION BY PREFETCHING QUERY RESULTS Karthik Ramachandra & S. Sudarshan Indian Institute of Technology Bombay SUPPORTED BY MSR India PhD fellowship Yahoo! Key Scientific Challenges Award 2011
33

Holistic Optimization by Prefetching Query Results

Feb 23, 2016

Download

Documents

Myrrh

Holistic Optimization by Prefetching Query Results. Karthik Ramachandra & S. Sudarshan Indian Institute of Technology Bombay. Supported by. MSR India PhD fellowship Yahoo! Key Scientific Challenges Award 2011. The Latency problem. - PowerPoint PPT Presentation
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Holistic Optimization by Prefetching Query Results

HOLISTIC OPTIMIZATION BY PREFETCHING QUERY RESULTS

Karthik Ramachandra & S. SudarshanIndian Institute of Technology Bombay

SUPPORTED BY MSR India PhD fellowship Yahoo! Key Scientific Challenges Award 2011

Page 2: Holistic Optimization by Prefetching Query Results

2

THE LATENCY PROBLEM Applications that interact with databases/web

services experience lot of latency due to Network round trips to the data source Disk IO and processing at the data source

ApplicationDatabase

Disk IO and query executionNetwork time

Query

Result

Page 3: Holistic Optimization by Prefetching Query Results

3

MOTIVATION

Multiple queries could be issued concurrently Allows the database to share work across multiple queries

Application performs other processing while query executes Significantly reduces the impact of network round-trip and

server/disk IO latency

Performance of applications can be significantly improved by prefetching query results.

Manually inserting prefetch instructions is hard. Need to identify earliest and safe points in the code to perform

prefetching For queries within nested procedures prefetching has to be done

in the calling procedure to get benefits Hard to manually maintain as code changes occurOur Goal: Automate the insertion of prefetches

Page 4: Holistic Optimization by Prefetching Query Results

4

EXAMPLE OF PREFETCHING

executeQuery () – normal execute query submit() – non-blocking call that initiates query and returns

immediately; once the results arrive, they are stored in a cache executeQuery() – checks the cache and blocks if results are

not yet available

for (…) {

… genReport(custId, city);}void genReport(int custId, String city) { city = … while (…){

… } rs1 = executeQuery(q1, custId); rs2 = executeQuery(q2, city); …}

for (…) {

… genReport(custId, city);}void genReport(int custId, String city) { submit(q1, custId); city = … submit(q2, city); while (…){

… } rs1 = executeQuery(q1, custId); rs2 = executeQuery(q2, city); …}

Page 5: Holistic Optimization by Prefetching Query Results

5

EXAMPLE OF PREFETCHINGfor (…) {

… genReport(custId, city);}void genReport(int custId, String city) { city = … while (…){

… } rs1 = executeQuery(q1, custId); rs2 = executeQuery(q2, city); …}

for (…) {

… genReport(custId, city);}void genReport(int custId, String city) { submit(q1, custId); city = … submit(q2, city); while (…){

… } rs1 = executeQuery(q1, custId); rs2 = executeQuery(q2, city); …}

What is the earliest point when we can prefetch? Will prefetch potentially get wasted?

Page 6: Holistic Optimization by Prefetching Query Results

6

EXAMPLE OF PREFETCHINGfor (…) {

… genReport(custId, city);}void genReport(int custId, String city) { city = … while (…){

… } rs1 = executeQuery(q1, custId); rs2 = executeQuery(q2, city); …}

for (…) { submit(q1, custId); … genReport(custId, city);}void genReport(int custId, String city) { city = … submit(q2, city); while (…){

… } rs1 = executeQuery(q1, custId); rs2 = executeQuery(q2, city); …}

What is the earliest point when we can prefetch? Will prefetch potentially get wasted? Intra- vs. Inter- procedural prefetching

Page 7: Holistic Optimization by Prefetching Query Results

7

RELATED WORK Software prefetching extensively used in

compilers, databases and other areas of computer science

Predicting future accesses is based on Spatial and temporal locality Request patterns and statistical methods Static analysis

Query result prefetching based on request patterns

Fido (Palmer et.al 1991), AutoFetch (Ibrahim et.al ECOOP 2006), Scalpel (Bowman et.al. ICDE 2007), etc.

Predict future queries using traces, traversal profiling, logs

Missed opportunities due to limited applicability Potential for wasted prefetches

Page 8: Holistic Optimization by Prefetching Query Results

8

STATIC ANALYSIS BASED APPROACHES

Manjhi et. al. 2009 – insert prefetches based on static analysis No details of how to automate Only consider straight line

intraprocedural code Prefetches may go waste

Our earlier work Guravannavar et. al. VLDB 08 –

given query in a loop, rewrite loop to create batched query Chavan et. al. ICDE 11 – as above but using asynchronous

query submission Consider: Loop calls procedure, which executes query

Common in many database applications Our earlier work is applicable, but requires very intrusive

rewriting of procedures

getAllReports() { for (custId in …) { … genReport(custId); }}void genReport(int cId) { … r = executeQuery(q, cId); …}

Page 9: Holistic Optimization by Prefetching Query Results

9

OUR CONTRIBUTIONS IN THIS PAPER

Prefetching algorithm purely based on static analysis Inserts prefetches at earliest possible point in the

program Uses notion of query anticipability; no wasted prefetches*

Works in the presence of loops and interprocedural code Enhancements that optimize prefetches

Code motion, chaining and rewriting prefetch requests Increasing applicability

Integrating with loop fission Applicability for Hibernate, Web Services

Experimental study on real world applications

* except due to exceptions

Page 10: Holistic Optimization by Prefetching Query Results

11

INTRAPROCEDURAL PREFETCHING

void report(int cId,String city){ city = … while (…){ … } c = executeQuery(q1, cId); d = executeQuery(q2, city); …}

Approach: Identify valid points of prefetch

insertion within a procedure Place prefetch request submit(q, p)

at the earliest point

Valid points of insertion of prefetch All the parameters of the query

should be available, with no intervening assignments No intervening updates to the database Should be guaranteed that the query will be

executed subsequently Systematically found using Query Anticipability analysis

extension of a dataflow analysis technique called anticipable expressions analysis

Page 11: Holistic Optimization by Prefetching Query Results

13

QUERY ANTICIPABILITY ANALYSIS

void report(int cId,String city){ city = … while (…){ … } rs1 = executeQuery(q1, cId); rs2 = executeQuery(q2, city); …}

Bit vector = (q1,q2) = anticipable (valid) = not anticipable (invalid) Backward data flow in the

control flow graph

n1n2n3

n4n5

( , )

( , )

( , )( , )

( , )

( , )( , )

( , )

( , )

( , )( , )

start

n1

n2

n4

n5

n3

end

( , )

Page 12: Holistic Optimization by Prefetching Query Results

14

QUERY ANTICIPABILITY ANALYSISDefinition 3.1. A query execution statement q is anticipable at a program point u if every path in the CFG from u to End contains an execution of q which is not preceded by any statement that modifies the parameters of q or affects the results of q.

Data flow information Stored as bit vectors (1 bit per query) Propagated against the direction of control flow

(Backward Dataflow Analysis) Captured by a system of data flow equations Solve equations iteratively till a fixpoint is

reached Details in the paper

u

End

q

Page 13: Holistic Optimization by Prefetching Query Results

15

Data dependence barriers Due to assignment to

query parameters or UPDATEs

Append prefetch to the barrier statement

Control dependence barriers Due to conditional

branching(if-else or loops)

Prepend prefetch to the barrier statement

INTRAPROCEDURAL PREFETCH INSERTION Analysis identifies all points in the program where q

is anticipable; we are interested in earliest points

n1: x =…

n2

nq: executeQuery(q,x)

n2

nq: executeQuery(q,x)n3

n1: if(…)

submit(q,x)submit(q,x)

Page 14: Holistic Optimization by Prefetching Query Results

16

INTRAPROCEDURAL PREFETCH INSERTION

q2 only achieves overlap with the loop q1 can be prefetched at the beginning of the

method

void report(int cId,String city){

city = …

while (…){ … } rs1 = executeQuery(q1, cId); rs2 = executeQuery(q2, city); …}

void report(int cId,String city){ submit(q1, cId); city = … submit(q2, city); while (…){ … } rs1 = executeQuery(q1, cId); rs2 = executeQuery(q2, city); …}

Page 15: Holistic Optimization by Prefetching Query Results

17

INTRAPROCEDURAL PREFETCH INSERTION

q2 only achieves overlap with the loop q1 can be prefetched at the beginning of the

method Can be moved to the method that invokes report()

void report(int cId,String city){

city = …

while (…){ … } rs1 = executeQuery(q1, cId); rs2 = executeQuery(q2, city); …}

void report(int cId,String city){ submit(q1, cId); city = … submit(q2, city); while (…){ … } rs1 = executeQuery(q1, cId); rs2 = executeQuery(q2, city); …}

Page 16: Holistic Optimization by Prefetching Query Results

18

INTERPROCEDURAL PREFETCHING

Benefits of prefetching can be greatly improved by moving prefetches across method invocations

Intuition: if a prefetch can be submitted at the beginning of a procedure, it can instead be moved to all its call sites

Use call graph of the program, and CFGs of all procedures

Assumption: Call graph is a DAG (we currently do not handle recursive calls)

Page 17: Holistic Optimization by Prefetching Query Results

19

INTERPROCEDURAL PREFETCHING ALGORITHM (INTUITION)

Iterate through the vertices of the call graph in reverse topological order

Perform intraprocedural prefetching at each method M If first statement is a submit(), then move it to all

callers of M at the point of invocation Replace formal parameters with actual arguments

void generateAllReports() { … genReport(custId, city);}void genReport(int cId, String city) { submit(q2, cId); … rs1 = executeQuery(q2, cId); …}

void generateAllReports() { … submit(q2, custId); genReport(custId, city);}void genReport(int cId, String city) { … rs1 = executeQuery(q2, cId); …}

Page 18: Holistic Optimization by Prefetching Query Results

20

PREFETCHING ALGORITHM: SUMMARY Our algorithms ensure that:

The resulting program preserves equivalence with the original program.

All existing statements of the program remain unchanged.

No prefetch request is wasted. At times, prefetches may lead to no benefits

Enhancements to getbeneficial prefetcheseven in presence ofbarriers

Equivalence preserving program and query transformations

void proc(int cId){ int x = …; while (…){ … } if (x > 10) c = executeQuery(q1, cId); …}

Page 19: Holistic Optimization by Prefetching Query Results

21

ENHANCEMENTS

INCREASING APPLICABILITY

SYSTEM DESIGN AND EXPERIMENTAL EVALUATION

PREFETCH INSERTION ALGORITHM

Page 20: Holistic Optimization by Prefetching Query Results

22

1. TRANSITIVE CODE MOTION (STRONG ANTICIPABILITY)

General Algorithm: Control dependence barrier:

Transform it into a data dependence barrier by rewriting it as a guarded statement

Data dependence barrier: Apply anticipability analysis on the barrier statements Move the barrier to its earliest point followed by the prefetch

void genReport(int cId){ int x = …;

while (…){ … } if (x > 10) rs1 = executeQuery(q1, cId); …}

void genReport(int cId){ int x = …; boolean b = (x > 10); if (b) submit(q1, cId); while (…){ … } if (b) rs1 = executeQuery(q1, cId); …}

Page 21: Holistic Optimization by Prefetching Query Results

23

2. CHAINING PREFETCH REQUESTS Output of a query forms a parameter to another –

commonly encountered Prefetch of query 2 can be issued immediately

after results of query 1 are available. submitChain similar to submit ; details in paper

void report(int cId,String city){ … c = executeQuery(q1, cId); while (c.next()){ accId = c.getString(“accId”); d = executeQuery(q2, accId); }}

void report(int cId,String city){ submitChain({q1, q2’}, {{cId}, {}}); … c = executeQuery(q1, cId); while (c.next()){ accId = c.getString(“accId”); d = executeQuery(q2, accId); }}

q2’ is q2 with its ? replaced by q1.accId

q2 cannot be beneficially prefetchedas it depends on accId which comesfrom q1

Typo in paper: In Section 5.2 on chaining and in Figure 10: replace all occurrences of q2 by q4

Page 22: Holistic Optimization by Prefetching Query Results

24

3. REWRITING CHAINED PREFETCH REQUESTS

Chained SQL queries have correlating parameters between them (q1.accId)

Can be used to rewrite them into one query using known techniques such as OUTER APPLY or LEFT OUTER LATERAL operators

Results are split into individual result sets in cache Reduces network round trips, aids in selection of set

oriented query plans

submitChain({“SELECT * FROM accounts WHERE custid=?”, “SELECT * FROM transactions WHERE

accId=:q1.accId”}, {{cId}, {}});

SELECT ∗FROM (SELECT ∗ FROM accounts WHERE custId = ?)

OUTER APPLY(SELECT ∗ FROM transactions WHERE transactions.accId =

account.accId)

Page 23: Holistic Optimization by Prefetching Query Results

25

INCREASING APPLICABILITY

PREFETCH INSERTION ALGORITHM

ENHANCEMENTS

SYSTEM DESIGN AND EXPERIMENTAL EVALUATION

Page 24: Holistic Optimization by Prefetching Query Results

26

INTEGRATION WITH LOOP FISSION

for (…) { … genReport(custId);}

void genReport(int cId) { … r=executeQuery(q, cId); …}

for (…) { … submit(q,cId); genReport(custId);}

void genReport(int cId) { … r=executeQuery(q, cId); …}

for (…) { … addBatch(q, cId);}submitBatch(q);for (…) { genReport(custId);}

void genReport(int cId) { … r=executeQuery(q, cId); …}

Original program Loop Fission

Interprocedural Prefetching enables our earlier work (VLDB08 and ICDE11) on loop fission for Batching/Asynchronous submission

Page 25: Holistic Optimization by Prefetching Query Results

27

HIBERNATE AND WEB SERVICES Lot of enterprise and web applications

Are backed by O/R mappers like Hibernate They use the Hibernate API which internally generate

SQL Well known performance problems when accessing data

in a loop Are built on Web Services

Typically accessed using APIs that wrap HTTP requests and responses

To apply our techniques here, Transformation algorithm has to be aware of the

underlying data access API Runtime support to issue asynchronous prefetches

Page 26: Holistic Optimization by Prefetching Query Results

28

SYSTEM DESIGN AND EXPERIMENTAL EVALUATION

PREFETCH INSERTION ALGORITHM

ENHANCEMENTS

INCREASING APPLICABILITY

Page 27: Holistic Optimization by Prefetching Query Results

29

SYSTEM DESIGN: DBRIDGE Our techniques have been incorporated into the

DBridge holistic optimization tool Two components:

Java source-to-source program Transformer Uses SOOT framework for static analysis and

transformation (http://www.sable.mcgill.ca/soot/) Preserves readability

Prefetch API (Runtime library) For issuing prefetch requests Thread and cache management Can be used with manual writing/rewriting or automatic

rewriting by DBridge transformer Currently works for JDBC API; being extended

for Hibernate and Web services

Page 28: Holistic Optimization by Prefetching Query Results

30

EXPERIMENTS Conducted on 4 applications

Two public benchmark applications (Java/JDBC) A real world commercial ERP application(Java/JDBC) Twitter Dashboard application (Java/Web Service)

Environments A widely used commercial database system – SYS1 PostgreSQL

Both running on a 64 bit dual-core machine with 4 GB of RAM

Page 29: Holistic Optimization by Prefetching Query Results

31

AUCTION APPLICATION (JAVA/JDBC): INTRAPROCEDURAL PREFETCHING

Single procedure with nested loop Overlap of loop achieved; varying iterations of outer loop Consistent 50% improvement

for(…) { for(…) { … } exec(q);}

for(…) { submit(q); for(…) { … } exec(q);}

Page 30: Holistic Optimization by Prefetching Query Results

32

WEB SERVICE (HTTP/JSON): INTERPROCEDURAL PREFETCHING

Twitter dashboard: monitors 4 keywords for new tweets (uses Twitter4j library)

Interprocedural prefetching; no rewrite possible 75% improvement at 4 threads Server time constant; network overlap leads to significant gain

Note: Our system does not automatically rewrite web service programs, this example was manually rewritten using our algorithms

Page 31: Holistic Optimization by Prefetching Query Results

33

ERP APPLICATION: IMPACT OF OUR TECHNIQUES

Intraprocedural: moderate gains Interprocedural: substantial gains (25-30%) Enhanced (with rewrite): significant gain(50% over Inter) Shows how these techniques work together

Page 32: Holistic Optimization by Prefetching Query Results

34

FUTURE WORK Which calls to prefetch? And where to place

them? Cost-based speculative prefetching Implementation

Cross-thread transaction support in runtime library Completely support Hibernate, web services

CONCLUSION Automatically prefetching query results using

static analysis is widely applicable in many real

applications can lead to significant gains.

Page 33: Holistic Optimization by Prefetching Query Results

35

MORE QUESTIONS?

Today, 15:00 – 16:30PODS/SIGMOD Research Plenary Poster SessionLocation: Vaquero Ballroom B–C

PROJECT WEBSITE: http://www.cse.iitb.ac.in/infolab/dbridge

QUESTIONS?