Multi pass algorithms. Nested-Loop joins Tuple-Based Nested-loop Join Algorithm: FOR each tuple s in S DO FOR each tuple r in R DO IF r and s join to.

Multi pass algorithms

Nested-Loop joins• Tuple-Based Nested-loop Join Algorithm:

FOR each tuple s in S DO

FOR each tuple r in R DO

IF r and s join to make a tuple t THEN

output t

What’s the complexity?

Block-based nested loops• Assume B(S) ≤ B(R), and B(S) > M • Read M-1 blocks of S into main memory and compare to

all of R, block by block

FOR each chunk of M-1 blocks of S DO

FOR each block b of R DO

FOR each tuple t of b DO

find the tuples of S in memory that join with t

output the join of t with each of these tuples

Example• B(R) = 1000, B(S) = 500, M = 101

• Outer loop iterates 5 times• At each iteration we read M-1 (i.e. 100) blocks of S and all

of R (i.e. 1000) blocks.• Total time: 5*(100 + 1000) = 5500 I/O’s

• Question: What if we reversed the roles of R and S?• We would iterate 10 times, and in each we would read

100+500 blocks, for a total of 10*(100+500) = 6000 I/O’s.

• Compare with one-pass join, if it could be done!• We would need 1500 disk I/O’s if B(S) M-1

Analysis of blocks nested loops• Number of disk I/O’s:

[B(S)/(M-1)] * (M-1 + B(R))

or

B(S) + B(S)B(R)/(M-1)

or approximately B(S)*B(R)/M

Two-pass algorithms based on sorting• This special case of multi-pass algorithms is sufficient for

most of the relation sizes.

Main idea for unary operations on R • Suppose B(R) M (main memory size in blocks)

• First pass: – Read M blocks of R into Main Memory– Sort the content of Main Memory– Write the sorted result (sublist/run) into M blocks on disk.

• Second pass: create final result

Duplicate elimination using sorting• In the second phase (merging) we don’t sort but copy each

tuple just once.

• We can do that because the identical tuples will show up “at the same time,” i.e. they will be all the first ones at the buffers (for the sorted sublists).

• As usual, if one buffer gets empty we refill it.

Duplicate-Elimination using Sorting Example• Assume M=3, each buffer holds 2 records and relation R

consists of the following 17 tuples:

2, 5, 2, 1, 2, 2, 4, 5, 4, 3, 4, 2, 1, 5, 2, 1, 3

• After the first pass the following sorted sub-lists are created:

1, 2, 2, 2, 2, 5

2, 3, 4, 4, 4, 5

1, 1, 2, 3, 5

• In the second pass we dedicate a memory buffer to each sub-list.

Example (Cont’d)

Example (Cont’d)

Example (Cont’d)

Analysis of (R) • 2B(R) to create sorted sublists, B(R) to read each sublist in

phase 2. Total: 3B(R)

• How large can R be?– There can be no more than M sublists since we need one

buffer for each one. So, B(R)/M ≤ M, (B(R)/M is the number of sublists)

i.e. B(R) ≤ M2

• To compute (R) we need at least sqrt(B(R)) blocks of Main Memory.

Sort-based , , -Example: set union.

• Analysis: 3(B(R) + B(S)) disk I/O’s• Condition: B(R) + B(S) ≤ M2

• Similar algorithms for sort based intersection and difference (bag or set versions).

• Create sorted sublists of R and S• Use input buffers for sorted sublists of R and S, one buffer

per sublist.• Output each tuple once.

- We can do that since all the identical tuples appear “at the same time.”

Join• A problem for joins but not for the previous operators: The

number of joining tuples from the two relations can exceed what fits in memory.

• A solution? • Maximize the number of output buffers.• Minimize the number of sorted sublists (since we need a

buffer for each one of them).

Simple sort-based join

• For R(X,Y) S(Y,Z) with M buffers of memory:• Completely: sort R on Y, sort S on Y

Merge phase• Use 2 input buffers: 1 for R, 1 for S.• Pick tuple t with smallest Y value in the buffer for R• If t doesn’t match with the first tuple in the buffer for S, then

just remove t.• Otherwise, read all the tuples from R with the same Y value

as t and put them in the M-2 part of the memory. • When the input buffer for R is exhausted fill it again and

again.

• Then, read the tuples of S that match. For each one we produce the join of it with all the tuples of R in the M-2 part of the memory.

Example of sort join• B(R) = 1000, B(S) = 500, M= 101• To sort R, we need 4*B(R) I/O’s, same for S.

– Number of I/O’s = 4*(B(R) + B(S))• Doing the join in the merge phase:

– Number of I/O’s = B(R) + B(S) • Total disk I/O’s = 5*(B(R) + B(S)) = 7500

• Memory Requirement? • To be able to do the sort, we should have B(R) ≤ M2 and

B(S) ≤ M2

• Recall: for nested-loop join, we needed 5500 disk I/O’s, but the memory requirement was quadratic (it is linear, here), i.e., nested-loop join is not good for joining relations that are much larger than MM.

Potential problem ...R(X , Y)----------- x1 a x2 a … xn a

S(Y, Z)--------- a z1

a z2

... a zm

What if Size of n+1 tuples > M-1 andSize of m+1 tuples> M-1?

• If the tuples from R (or S) with the same value y of Y do not fit in M-1 buffers, then we use all M-1 buffers to do a nested-loop join on the tuples with Y-value y from both relations.

• Observe that we can “smoothly” continue with the nested loop join when we see that the R tuples with Y-value y do not fit in M-1 buffers.

• Do we really need the fully sorted files?• Suppose we are not worried about many common Y values

Can We Improve on Sort Join?

R

S

Join?

sorted runs

A more efficient sort-based join• Suppose we are not worried about many common Y values

• Create Y-sorted sublists of R and S• Bring first block of each sublist into a buffer (assuming we

have at most M sublists)• Find smallest Y-value from heads of buffers. Join with other

tuples in heads of buffers, use other possible buffers, if there are “many” tuples with the same Y values.

• Disk I/O: 3*(B(R) + B(S))• Requirement: B(R) + B(S) ≤ M2

Example• B(R) = 1000, B(S) = 500, M= 101

• Total of 15 sorted sublists• If too many tuples join on a value y, use the remaining 86

MM buffers for a one pass join on y

• Total cost: 3*(1000 + 500) = 4500 disk I/O’s• M2 =10201 > B(R) + B(S), so the requirement is satisfied

Summary of sort-based algorithms

Operators Approx. M required Disk I/O

, Sqrt( B ) 3B

, , - Sqrt( B(R) + B(S) ) 3(B(R)+B(S))

Sqrt( max( B(R),B(S) ) ) 5(B(R)+B(S))

Sqrt( B(R)+B(S) ) 3(B(R)+B(S))

Two-pass algorithms based on hashing

Main idea: • Instead of sorted sublists, create partitions, based on

hashing.• Second pass creates result from partitions using one pass

algorithms.

Creating partitions• Partitions (buckets) are created based on all attributes of the relation

except for grouping and join, where the partitions are based on the grouping and join-attributes respectively.

• Why bucketize? Tuples with “matching” values end up in the same bucket.

Initialize M-1 buckets using M-1 empty buffers;FOR each block b of relation R DO

read block b into the M-th buffer; FOR each tuple t in b DO

IF the buffer for bucket h(t) has no room for t THENcopy the buffer to disk;initialize a new empty block in that buffer;

copy t to the buffer for bucket h(t);ENDIF;

ENDFOR;FOR each bucket DO

IF the buffer for this bucket is not empty THENwrite the buffer to disk;

Hash-based duplicate elimination• Pass 1: create partitions by hashing on all attributes• Pass 2: for each partition, use the one-pass method for

duplicate elimination

• Cost: 3B(R) disk I/O’s

• Requirement: B(R) ≤ M*(M-1)

(B(R)/(M-1) is the approximate size of one bucket)

i.e. the req. is approximately B(R) ≤ M2

Hash-based grouping and aggregation

• Pass 1: create partitions by hashing on grouping attributes• Pass 2: for each partition, use one-pass method.

• Cost: 3B(R), Requirement: B(R) ≤ M2

• More exactly the requirement is:

MM

RB L )1(

))(((

Hash-based set union• Pass 1: create partitions R1,…,RM-1 of R, and S1,…,SM-1 of S

(with the same hash function)• Pass 2: for each pair Ri, Si compute Ri Si using the one-

pass method.

• Cost: 3(B(R) + B(S))• Requirement: min(B(R),B(S)) ≤ M2

• Similar algorithms for intersection and difference (set and bag versions)

Partition hash-join

• Pass 1: create partitions R1, ..,RM-1 of R, and S1, ..,SM-1 of S, based on the join attributes (the same hash function for both R and S)

• Pass 2: for each pair Ri, Si compute Ri Si using the one-pass method.

• B(R) = 1000 blocks• B(S) = 500 blocks• Memory available = 101 blocks• R S on common attribute C• Use 100 buckets

– Read R– Hash– Write buckets

Example

...

...

10 blocks

100R

Same for S

• Read one R bucket• Build memory hash table• Read corresponding S bucket block by block.

RS

...

R

Memory...

Cost• “Bucketize:”

– Read + write R– Read + write S

• Join– Read R– Read S

Total cost = 3*[1000+500] = 4500

In general:

Cost: 3(B(R) + B(S))

Req.: min(B(R),B(S)) ≤ M2

Summary of hash-based methods

Operators Approx. M required

Disk I/O

, Sqrt(B) 3B

, , - Sqrt(B(S)) 3(B(R)+B(S))

Sqrt(B(S)) 3(B(R)+B(S))

Sort vs. Hash based algorithms• Hash-based algorithms have a size requirement that

depends only on the smaller of the two arguments rather than on the sum of the argument sizes, as for sort-based algorithms.

• Sort-based algorithms allow us to produce the result in sorted order and take advantage of that sort later. The result can be used in another sort-based algorithm later.

• Hash-based algorithms depend on the buckets being of nearly equal size. Well, what about a join with a very few values for the join attribute…

Multi pass algorithms. Nested-Loop joins Tuple-Based Nested-loop Join Algorithm: FOR each tuple s in S DO FOR each tuple r in R DO IF r and s join to.

Documents