
Database and Stream Mining using GPUs Naga K. Govindaraju UNC Chapel Hill.

Dec 27, 2015

Page 1: Database and Stream Mining using GPUs

Naga K. Govindaraju

UNC Chapel Hill

Page 2: Goal

• Utilize graphics processors for fast computation of common database operations
  • Conjunctive selections
  • Aggregations
  • Semi-linear queries
• Essential components

Page 3: Motivation: Fast operations

• Increasing database sizes
• Faster processor speeds but low improvement in query execution time
  • Memory stalls
  • Branch mispredictions
  • Resource stalls
• Ref: [Ailamaki99,01] [Boncz99] [Manegold00,02] [Meki00] [Shatdal94] [Rao99] [Ross02] [Zhou02] …

Page 4: Fast Database Operations

[Figure: system diagram with labels CPU (3 GHz), System Memory (2 GB), AGP Memory (512 MB), PCI-e Bus (4 GB/s), Video Memory (256 MB), GPU (500 MHz); the annotations "Ours" and "Others" mark regions of the diagram.]

Page 5

                              | NVIDIA GeForceFX 6800 Ultra        | NVIDIA GeForceFX 5900 Ultra       | Intel Pentium 4
Memory Bandwidth              | 35.2 GBps                          | 27.2 GBps                         | 6.4 GBps (DDR2 400 RDRAM)
Peak SIMD Instructions        | 6 Vertex Ops, 16 Pixel Ops (Float) | 4 Vertex Ops, 4 Pixel Ops (Float) | 4 Float Ops (SSE), 2 Double Ops (SSE2)
Vector Ops per Clock          | 16 vector4 (float)                 | 4 vector4 (float)                 | 1 vector4 (float)
Peak Comparison Ops per Clock | 64                                 | 16                                | 4
Clock                         | 400 MHz                            | 450 MHz                           | 3.4 GHz

Page 6: Graphics Processors: Design Issues

• Relatively low bandwidth to CPU
  • Design database operations avoiding frame buffer readbacks
• No arbitrary writes
  • Design algorithms avoiding data rearrangements
• Programmable pipeline has poor branching
  • Design algorithms without branching in the programmable pipeline; evaluate branches using fixed function tests

Page 7: Basic DB Operations

Basic SQL query

Select A
From T
Where C

A = attributes or aggregations (SUM, COUNT, MAX, etc.)
T = relational table
C = Boolean combination of predicates (using operators AND, OR, NOT)

Page 8: Database Operations

• Predicates
  • a_i op constant or a_i op a_j
  • op: <, >, <=, >=, !=, =, TRUE, FALSE
• Boolean combinations
  • Conjunctive Normal Form (CNF)
• Aggregations
  • COUNT, SUM, MAX, MEDIAN, AVG

Page 9: Data Representation

• Attribute values a_i are stored in 2D textures on the GPU
• A fragment program is used to copy attributes to the depth buffer

Page 10: Copy Time to the Depth Buffer

Page 11: Data Representation: Issues

• Floating point and fixed point representations are different
  • Need to define scaling operations
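One way to picture such a scaling operation is the minimal sketch below, which maps integer attribute values onto depth values in [0, 1] and back. The 24-bit depth precision, the restriction to non-negative integers, and the helper names `to_depth` and `from_depth` are assumptions made for illustration; the slides do not specify them.

```python
DEPTH_BITS = 24                      # assumed depth-buffer precision
DEPTH_MAX = (1 << DEPTH_BITS) - 1    # largest representable fixed-point code


def to_depth(value: int) -> float:
    """Scale an integer attribute in [0, DEPTH_MAX] to a depth in [0.0, 1.0]."""
    if not 0 <= value <= DEPTH_MAX:
        raise ValueError("attribute does not fit in the assumed depth range")
    return value / DEPTH_MAX


def from_depth(depth: float) -> int:
    """Invert to_depth by rounding back to the nearest fixed-point code."""
    return round(depth * DEPTH_MAX)
```

The mapping is monotonic, so comparisons between scaled values agree with comparisons between the original attributes.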

Page 12: Predicate Evaluation

• a_i op constant (d)
• Copy the attribute values a_i into depth buffer
• Specify the comparison operation used in the depth test
• Draw a screen filling quad at depth d and perform the depth test

Page 13

[Figure (labels: Screen, P, d): a quad at depth d is drawn over the screen and a_i op d is evaluated. If (a_i op d), pass fragment; else, reject fragment.]
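The pass/reject rule above can be emulated on the CPU as a rough sketch. This is plain Python standing in for the fixed-function depth test, not GPU code; the `DEPTH_FUNCS` table, the `depth_test` helper, and the sample values are illustrative assumptions.

```python
import operator

# Comparison operators a depth test can be configured with (names are illustrative).
DEPTH_FUNCS = {
    "<": operator.lt, "<=": operator.le, ">": operator.gt,
    ">=": operator.ge, "=": operator.eq, "!=": operator.ne,
}


def depth_test(depth_buffer, op, d):
    """Emulate drawing a screen-filling quad at depth d.

    Returns one boolean per pixel: True where (a_i op d) holds, i.e. where
    the fragment would pass, and False where it would be rejected.
    """
    compare = DEPTH_FUNCS[op]
    return [compare(a_i, d) for a_i in depth_buffer]


attributes = [3, 11, 13, 7, 5, 1, 7, 10, 2]   # values copied into the depth buffer
passed = depth_test(attributes, ">=", 7)
```

Here `passed` flags the five values that satisfy a_i >= 7. On the GPU every pixel is tested by the same fixed-function unit, so no per-record branch is executed in a program.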

Page 14: Predicate Evaluation

CPU implementation: Intel compiler 7.1 with SIMD optimizations

Page 15: Predicate Evaluation

• a_i op a_j
• Equivalent to (a_i - a_j) op 0
• Semi-linear queries
  • Defined as linear combination of attribute values compared against a constant
  • Linear combination is computed as a dot product of two vectors
  • Utilize the vector processing capabilities of GPUs

Σ_i a_i s_i
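A small CPU-side sketch of a semi-linear query follows: each record's attributes are combined with a coefficient vector by a dot product, and the result is compared against a constant. The function name `semi_linear_select`, the coefficient vector, and the sample rows are hypothetical and only illustrate the idea.

```python
import operator


def semi_linear_select(rows, coeffs, op, constant):
    """Return a pass/fail flag per row for (coeffs . row) op constant."""
    flags = []
    for row in rows:
        # Dot product of the coefficient vector with the row's attributes.
        combo = sum(s * a for s, a in zip(coeffs, row))
        flags.append(op(combo, constant))
    return flags


rows = [(4, 1), (2, 5), (3, 3)]            # each row holds attributes (a_1, a_2)
# a_1 > a_2 rewritten as (1*a_1 - 1*a_2) > 0
flags = semi_linear_select(rows, (1, -1), operator.gt, 0)
```

The attribute-to-attribute comparison a_i op a_j is the special case with coefficients (1, -1) and constant 0, as in the usage above.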

Page 16: Semi-linear Query

Page 17: Boolean Combination

• CNF:
  • (A_1 AND A_2 AND … AND A_k) where A_i = (B_{i,1} OR B_{i,2} OR … OR B_{i,m_i})
• Performed using stencil test recursively
  • C_1 = (TRUE AND A_1) = A_1
  • C_i = (A_1 AND A_2 AND … AND A_i) = (C_{i-1} AND A_i)
• Different stencil values are used to code the outcome of C_i
  • Positive stencil values: pass predicate evaluation
  • Zero: fail predicate evaluation

Page 18: A_1 AND A_2

[Figure: regions A_1, B_{2,1}, B_{2,2}, B_{2,3}, with A_2 = (B_{2,1} OR B_{2,2} OR B_{2,3})]

Page 19: A_1 AND A_2

[Figure: region A_1; stencil value = 1]

Page 20: A_1 AND A_2

[Figure: region A_1; stencil value = 0; stencil value = 1 for TRUE AND A_1]

Page 21: A_1 AND A_2

[Figure: region A_1; stencil = 0; stencil = 1; regions B_{2,1}, B_{2,2}, B_{2,3}, each labeled stencil = 2]

Page 22: A_1 AND A_2

[Figure: region A_1; stencil = 0; stencil = 1; regions B_{2,1}, B_{2,2}, B_{2,3}, each labeled stencil = 2]

Page 23: A_1 AND A_2

[Figure: stencil = 0; stencil = 2 for A_1 AND B_{2,1}, A_1 AND B_{2,2}, and A_1 AND B_{2,3}]
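The stencil bookkeeping in the A_1 AND A_2 walk-through can be emulated with a short CPU-side sketch. The exact encoding is an assumption chosen to match the figures (value i after clause i for records still passing, 0 for records that failed); `evaluate_cnf` and the sample predicates are hypothetical names and data, and each inner loop stands in for one rendering pass.

```python
def evaluate_cnf(records, clauses):
    """Evaluate (A_1 AND ... AND A_k), each A_i an OR of predicates.

    Assumed encoding: after clause i, stencil == i marks records that
    satisfy A_1..A_i and stencil == 0 marks records that failed.
    """
    stencil = [0] * len(records)          # 0 also serves as "nothing tested yet"
    for i, clause in enumerate(clauses, start=1):
        for predicate in clause:          # one rendering pass per B_{i,j}
            for r, record in enumerate(records):
                # Stencil test: only survivors of C_{i-1} are considered.
                if stencil[r] == i - 1 and predicate(record):
                    stencil[r] = i
        # Survivors of C_{i-1} that matched no B_{i,j} fail the conjunction.
        stencil = [0 if s == i - 1 else s for s in stencil]
    return stencil


records = [3, 11, 13, 7, 5, 1, 7, 10, 2]
clauses = [
    [lambda a: a >= 5],                       # A_1
    [lambda a: a < 7, lambda a: a == 13],     # A_2 = (B_{2,1} OR B_{2,2})
]
stencil = evaluate_cnf(records, clauses)
```

After the last clause, records with a positive stencil value satisfy the whole conjunction; in this usage only 13 and 5 end with stencil value 2.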

Page 24: Multi-Attribute Query

Page 25: Range Query

• Compute a_i within [low, high]
  • Evaluated as (a_i >= low) AND (a_i <= high)
• Use NVIDIA depth bounds test to evaluate both conditionals in a single clock cycle
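As a rough CPU-side illustration, a range query can be written as a single pass that checks both bounds for each stored value, which is the effect the depth bounds test provides in hardware. The helper name `depth_bounds_test` and the sample values are assumptions for this sketch.

```python
def depth_bounds_test(depth_buffer, low, high):
    """Emulate a depth bounds test: pass where low <= a_i <= high."""
    # Both comparisons are applied to the stored value in one pass,
    # instead of two separate depth-test passes combined with AND.
    return [low <= a_i <= high for a_i in depth_buffer]


attributes = [3, 11, 13, 7, 5, 1, 7, 10, 2]
in_range = depth_bounds_test(attributes, 5, 10)
```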

Page 26: Range Query

Page 27: Aggregations

• COUNT, MAX, MIN, SUM, AVG

Page 28: COUNT

• Use occlusion queries to get the number of pixels passing the tests
• Syntax:
  • Begin occlusion query
  • Perform database operation
  • End occlusion query
  • Get count of number of attributes that passed database operation
• Involves no additional overhead!
  • Efficient selectivity computation
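The begin / perform / end / get-count sequence can be mimicked with a small Python stand-in. The `OcclusionQuery` class below is hypothetical and only counts passing fragments on the CPU; it is not a graphics API binding.

```python
class OcclusionQuery:
    """Hypothetical stand-in for a GPU occlusion query (pass counter)."""

    def __init__(self):
        self.count = 0

    def __enter__(self):          # "Begin occlusion query"
        self.count = 0
        return self

    def __exit__(self, *exc):     # "End occlusion query"
        return False

    def record(self, passed_flags):
        """Count fragments that passed the tests of one rendering pass."""
        self.count += sum(1 for flag in passed_flags if flag)


attributes = [3, 11, 13, 7, 5, 1, 7, 10, 2]
with OcclusionQuery() as query:
    # "Perform database operation": here, the selection a_i >= 7.
    query.record(a_i >= 7 for a_i in attributes)

selectivity = query.count / len(attributes)
```

Because the count falls out of the same pass that evaluates the predicate, the selectivity of a selection is obtained without a separate scan.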

Page 29: MAX, MIN, MEDIAN

• Kth-largest number
• Traditional algorithms require data rearrangements
• We perform
  • no data rearrangements
  • no frame buffer readbacks

Page 30: K-th Largest Number

• Let v_k denote the k-th largest number
• How do we generate a number m equal to v_k?
• Without knowing v_k's value
• Using occlusion queries to obtain the number of values ≥ some given value
• Starting from the most significant bit, determine the value of one bit at a time

Page 31: K-th Largest Number

• Given a set S of values
  • c(m): number of values ≥ m
  • v_k: the k-th largest number
• We have
  • If c(m) > k-1, then m ≤ v_k
  • If c(m) ≤ k-1, then m > v_k

Page 32: 2nd Largest in 9 Values

0011 1011 1101
0111 0101 0001
0111 1010 0010

m = 0000, v_2 = 1011

Page 33: Draw a Quad at Depth 8, Compute c(1000)

Values as on page 32. m = 1000, v_2 = 1011

Page 34

Values as on page 32. m = 1000, v_2 = 1011, c(m) = 3, 1st bit = 1

Page 35: Draw a Quad at Depth 12, Compute c(1100)

Values as on page 32. m = 1100, v_2 = 1011

Page 36

Values as on page 32. m = 1100, v_2 = 1011, c(m) = 1, 2nd bit = 0

Page 37: Draw a Quad at Depth 10, Compute c(1010)

Values as on page 32. m = 1010, v_2 = 1011

Page 38

Values as on page 32. m = 1010, v_2 = 1011, c(m) = 3, 3rd bit = 1

Page 39: Draw a Quad at Depth 11, Compute c(1011)

Values as on page 32. m = 1011, v_2 = 1011

Page 40

Values as on page 32. m = 1011, v_2 = 1011, c(m) = 2, 4th bit = 1

Page 41: Our algorithm

• Initialize m to 0
• Start with the MSB and scan all bits till LSB
• At each bit, put 1 in the corresponding bit-position of m
• If c(m) ≤ k-1, make that bit 0
• Proceed to the next bit
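The steps above can be sketched in Python as follows. The counting function stands in for an occlusion query over a quad drawn at depth m; the function names and the explicit `bits` parameter are assumptions of this sketch, and the input is the nine 4-bit values from the worked example.

```python
def count_at_least(values, m):
    """c(m): number of values >= m (one occlusion query on the GPU)."""
    return sum(1 for v in values if v >= m)


def kth_largest(values, k, bits):
    """Build v_k bit by bit, from the MSB down, without moving any data."""
    m = 0
    for bit in reversed(range(bits)):
        candidate = m | (1 << bit)        # tentatively set this bit to 1
        if count_at_least(values, candidate) > k - 1:
            m = candidate                 # candidate <= v_k, so keep the 1
        # otherwise candidate > v_k and the bit stays 0
    return m


values = [0b0011, 0b1011, 0b1101, 0b0111, 0b0101, 0b0001, 0b0111, 0b1010, 0b0010]
second_largest = kth_largest(values, k=2, bits=4)
```

For k = 2 the loop visits m = 1000, 1100, 1010, 1011 and ends at 1011, the same sequence as in the worked example. The number of counting passes equals the bit width of the values, independent of how many values there are.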

Page 42: Kth-Largest

Page 43: Median

Page 44: Top K Frequencies

• Given n values in frame buffer, compute the top k frequencies without performing data rearrangements and using comparisons

Page 45: Accumulator, Mean

• Possible algorithms
  • Use fragment programs: requires very few renderings
  • Use mipmaps [Harris et al. 02], fragment programs [Coombe et al. 03]
  • Issue: overflow in floating point values
• Our approach: bit-based algorithm
  • Mean computed using accumulator and divide by n

Page 46: Accumulator

• Data representation is of form 2^k a_k + 2^{k-1} a_{k-1} + … + a_0

Sum = 2^k Σ a_k + 2^{k-1} Σ a_{k-1} + … + Σ a_0

Σ a_i = number of values with i-th bit as 1

Current GPUs support no bit-masking operations
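A minimal sketch of the bit-based sum follows. For clarity it tests bits with CPU shifts and masks, which is exactly what the GPUs described here could not do; on the GPU the per-bit count would come from a counting pass instead. The function name `bit_count_sum` and the sample values are illustrative.

```python
def bit_count_sum(values, bits):
    """Sum non-negative integers from per-bit counts instead of adding values.

    For each bit i, count how many values have bit i set (on the GPU this
    count would come from an occlusion query), then weight it by 2**i.
    """
    total = 0
    for i in range(bits):
        ones = sum(1 for v in values if (v >> i) & 1)   # number of values with bit i set
        total += (1 << i) * ones
    return total


values = [3, 11, 13, 7, 5, 1, 7, 10, 2]
total = bit_count_sum(values, bits=4)
mean = total / len(values)
```

Only integer counts are accumulated, one per bit position, which sidesteps the floating point overflow issue noted for the other approaches.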

Page 47: TestBit

• Read the data value from texture, say a_i
• F = frac(a_i / 2^k)
• If F >= 0.5, then k-th bit of a_i is 1
• Set F to alpha value. Alpha test passes a fragment if alpha value >= 0.5

Page 48: Accumulator

Page 49: Stream Mining

• Streams are continuous sequences of data values arriving at a port
  • A few common examples include networking data, stock market and financial data, and data collected from sensors
• Goal: Efficiently approximate order statistics such as frequencies and quantiles on data streams
  • Exact computations require infinite memory

Page 50: Issues

• Data streaming applications have real-time processing requirements
• Applications also require a small or limited memory footprint

Page 51: Issues

• Efficient CPU algorithms perform histogram computations and are either
  • Compute-limited, and therefore cannot process data faster than its arrival rate
  • Memory-limited, and therefore use memory hierarchies on disks and are slow. Alternatively, load shedding algorithms, which drop excess items, are also used

Page 52: Histogram Computation

• Efficient sorting is fundamental for histogram computations
• Our new sorting network algorithm uses texture mapping and blending functionality of GPUs to perform fast sorting on GPUs.
  • The comparator mapping is performed using texture mapping
  • The conditional assignments (MIN and MAX) are implemented using blending
  • Maps efficiently to rasterization and is fast!
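To show how a sorting network reduces to MIN and MAX assignments, the sketch below uses an odd-even transposition network. That network is chosen here only for brevity; the slides do not say which network the authors use, and the function name is hypothetical.

```python
def sort_with_network(values):
    """Sort with an odd-even transposition network of MIN/MAX comparators."""
    data = list(values)
    n = len(data)
    for step in range(n):                       # n steps suffice for n inputs
        start = step % 2                        # alternate even and odd pairs
        for i in range(start, n - 1, 2):        # comparators in a step are independent
            lo, hi = min(data[i], data[i + 1]), max(data[i], data[i + 1])
            data[i], data[i + 1] = lo, hi       # MIN/MAX assignment instead of a data-dependent swap
    return data


sorted_values = sort_with_network([3, 11, 13, 7, 5, 1, 7, 10, 2])
```

The comparator pattern depends only on the step number, never on the data, which is what lets a fixed mapping (texture coordinates on the GPU) select each pair while MIN and MAX do the conditional assignment.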

Page 53: Further details

Fast and Approximate Stream Mining of Quantiles and Frequencies Using Graphics Processors

Naga K. Govindaraju, Nikunj Raghuvanshi, Dinesh Manocha

Proc. of ACM SIGMOD 2005

Page 54: Advantages

• Algorithms progress at GPU growth rate
• Offload CPU work
  • Streaming processor parallel to CPU
• Fast
  • Massive parallelism on GPUs
  • High memory bandwidth
  • No branch mispredictions
• Commodity hardware!

Page 55: Conclusions

• Novel algorithms to perform database operations on GPUs
• Evaluation of predicates, boolean combinations of predicates, aggregations
• Algorithms take into account GPU limitations
• No data rearrangements
• No frame buffer readbacks

Page 56: Conclusions

• Algorithms map well to rasterization and GPUs
• Preliminary comparisons with optimized CPU implementations are promising
• GPU as a useful co-processor

Page 57: Future Work

• Improve performance of many of our algorithms
• More database operations such as join, sorting, classification and clustering
• Queries on spatial and temporal databases