Approximate Query Processing using Wavelets
Kaushik Chakrabarti (Univ. of Illinois), Minos Garofalakis (Bell Labs), Rajeev Rastogi (Bell Labs), Kyuseok Shim (KAIST and AITrc)
Presented at the 26th VLDB Conference, Cairo, Egypt
Presented by Supriya Sudheendra

Feb 23, 2016





Outline

Introduction
- Approximate query processing is a viable solution for:
  - Huge amounts of data
  - High query complexities
  - Stringent response-time requirements
- Decision Support Systems (DSS)
  - Support business and organizational decision-making activities
  - Help decision makers compile useful information from raw data, solve problems, and make decisions

Introduction
- DSS users pose very complex queries to the DBMS
  - These queries require complex operations over gigabytes or terabytes of disk-resident data
  - Exact answers take a very long time to compute
- In a number of scenarios, users prefer a fast, approximate answer

Prior Work
- Previous approximate query processing techniques
  - Focused on specific forms of aggregate queries
  - Data reduction mechanism: how to obtain the synopses of the data
- Sampling-based techniques
  - A join operator applied to two uniform random samples yields a non-uniform sample with very few tuples
  - For non-aggregate queries, sampling produces a small subset of the exact answer, which may be empty when joins are involved
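The "very few tuples" point about joining samples can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two toy relations (one tuple per key, so the full join has exactly `n` tuples), the 10% sampling rate, and the helper name `join_size` are all hypothetical and not taken from the paper. It shows the shrinkage in join size only, not the non-uniformity of the result.

```python
import random

def join_size(r_keys, s_keys):
    """Size of the equi-join of two key lists in which every key is unique."""
    return len(set(r_keys) & set(s_keys))

rng = random.Random(0)           # fixed seed so the run is repeatable
n, k = 10_000, 1_000             # relation size and sample size (a 10% sample)
r = list(range(n))               # toy relation R: one tuple per key
s = list(range(n))               # toy relation S: same keys, so |R join S| = n

r_sample = rng.sample(r, k)      # uniform random sample of R
s_sample = rng.sample(s, k)      # independent uniform random sample of S

full = join_size(r, s)
sampled = join_size(r_sample, s_sample)
print(full, sampled)
```

A key survives in the joined samples only if it was drawn in both samples, so the expected join size here is about n * 0.1 * 0.1 = 100 tuples, roughly 1% of the full join rather than 10%.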

Prior Work
- Histogram-based techniques
  - Problematic for high-dimensional data
  - Storage overhead
  - High construction cost
- Wavelet-based techniques
  - Mathematical tool for hierarchical decomposition of functions
  - Apply wavelet decomposition to the input data collection to obtain a data synopsis
  - Avoid high construction costs and storage overhead

Contribution of the Paper
- Viability and effectiveness of wavelets as a generic tool for high-dimensional DSS
- New, I/O-efficient wavelet decomposition algorithm for relational tables
- Novel query processing algebra for wavelet-coefficient data synopses
- Extensive experiments

Background
- Wavelets are a mathematical tool to hierarchically decompose functions
- Coarse overall approximation together with detail coefficients that influence the function at various scales
- Haar wavelets are conceptually simple and fast to compute
- Variety of applications, such as image editing and querying

One-Dimensional Haar Wavelets
- How to compute, given a data array:
  - Average the values together pairwise to get a lower-resolution representation of the data
  - Detail coefficients: for each pair, the difference between the pairwise average and the second value of the pair
  - The original data array can be reconstructed from the averages and the detail coefficients

Why Detail Coefficients

One-Dimensional Haar Wavelets
- Wavelet transform: the overall average followed by the detail coefficients in increasing order of resolution; each entry is a wavelet coefficient
- W_A = [4, -2, 0, -1]
- For vectors containing similar values, most detail coefficients are small and can be eliminated
- Eliminating them introduces only small errors
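The pairwise averaging and differencing steps can be sketched in a few lines of Python. The slides do not show the input array here, so the input [2, 2, 5, 7] is an assumption, chosen because it reproduces the transform [4, -2, 0, -1] given above. The sign convention (detail = average minus the second value of the pair) and the function names `haar_decompose` and `haar_reconstruct` are also assumptions made for illustration.

```python
def haar_decompose(data):
    """Return [overall average] + detail coefficients in increasing resolution."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    averages = list(data)
    details = []
    while len(averages) > 1:
        pairs = [(averages[i], averages[i + 1]) for i in range(0, len(averages), 2)]
        # Detail = pairwise average minus the second value = (a - b) / 2.
        level_details = [(a - b) / 2 for a, b in pairs]
        averages = [(a + b) / 2 for a, b in pairs]
        # Coarser levels are computed later, so they go in front.
        details = level_details + details
    return averages + details

def haar_reconstruct(coeffs):
    """Invert haar_decompose: expand each average into (avg + d, avg - d)."""
    values = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(values)]
        values = [v for avg, d in zip(values, level) for v in (avg + d, avg - d)]
        pos += len(level)
    return values

print(haar_decompose([2, 2, 5, 7]))       # [4.0, -2.0, 0.0, -1.0]
print(haar_reconstruct([4, -2, 0, -1]))   # [2, 2, 5, 7]
```

Reconstruction needs only the overall average and the details, which is why coefficients that are zero or close to zero can be dropped from the synopsis with little or no error.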

One-Dimensional Haar Wavelets
- The overall average is more important than any detail coefficient
- To normalize the final entries of W_A, each wavelet coefficient is divided by $\sqrt{2^l}$, where $l$ is the level of resolution
- Normalized transform: W_A = $[4, -2, 0, -1/\sqrt{2}]$

Multi-Dimensional Haar Wavelets
- Haar wavelets can be extended to multi-dimensional arrays
- Standard decomposition
  - Fix an ordering for the data dimensions (1, 2, ..., d)
  - Apply the complete 1-D wavelet transform to each 1-D row of array cells along dimension k, for each k in turn
- Nonstandard decomposition
  - Alternates between dimensions during successive steps of pairwise averaging and differencing for each 1-D row of array cells along dimension k
  - Repeated recursively on the quadrant containing the averages across all dimensions

Nonstandard Decomposition
- Pairwise averaging and differencing for one positioning of the 2x2 box with root $[2i_1, 2i_2]$
- Distribution of the results in the wavelet transform array
- The process is recursed on the lower-left quadrant of W_A
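Returning to the one-dimensional normalization step, the sketch below takes the divisor to be sqrt(2^l), the usual Haar normalization, with l = 0 at the coarsest level (the overall average and the first detail coefficient). The function name `normalize` and the level numbering are assumptions made for illustration.

```python
import math

def normalize(coeffs):
    """Divide each coefficient by sqrt(2**l), where l is its resolution level."""
    out = [float(coeffs[0])]             # overall average sits at level 0
    level, pos = 0, 1
    while pos < len(coeffs):
        width = 2 ** level               # number of detail coefficients at this level
        scale = math.sqrt(2 ** level)
        out.extend(c / scale for c in coeffs[pos:pos + width])
        pos += width
        level += 1
    return out

print(normalize([4, -2, 0, -1]))
```

For the four-entry transform used earlier, only the two finest-level coefficients are rescaled, so the last entry becomes -1/sqrt(2) while 4 and -2 are unchanged.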

Example Decomposition of a 4 X 4 Array
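A minimal sketch of the nonstandard decomposition for a square array is shown below. The 4x4 input is a hypothetical array, not the example from the slides. The sign convention for the three detail coefficients and their placement in the transform array are assumptions. Index [0][0] is treated as the lower-left corner, so the averages land in the low-index quadrant that is recursed on.

```python
def nonstandard_haar_2d(a):
    """Nonstandard Haar transform of an n x n list of lists (n a power of two)."""
    w = [row[:] for row in a]
    size = len(a)
    while size > 1:
        half = size // 2
        out = [row[:] for row in w]
        for i in range(half):
            for j in range(half):
                # The 2x2 box rooted at [2i, 2j].
                p, q = w[2 * i][2 * j], w[2 * i][2 * j + 1]
                r, s = w[2 * i + 1][2 * j], w[2 * i + 1][2 * j + 1]
                out[i][j] = (p + q + r + s) / 4                # average
                out[i][j + half] = (p - q + r - s) / 4         # detail along 2nd index
                out[i + half][j] = (p + q - r - s) / 4         # detail along 1st index
                out[i + half][j + half] = (p - q - r + s) / 4  # diagonal detail
        w = out
        size = half        # recurse on the quadrant holding the averages
    return w

A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
W = nonstandard_haar_2d(A)
for row in W:
    print(row)
```

After the last step, `W[0][0]` holds the overall average of the array (8.5 for this input), and every other entry is a detail coefficient at some resolution level.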

Multi-Dimensional Haar Coefficients: Semantics and Representation
- The d-dimensional Haar basis function corresponding to a coefficient w is defined by:
  - A d-dimensional rectangular support region
  - Quadrant sign information

Support Regions for the 16 Nonstandard 2-D Haar Basis Functions
- Blank areas: regions of A whose reconstruction is independent of the coefficient
- W_A[0,0]: the overall average
- W_A[3,3]: contributes only to the upper-right quadrant
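The support region of a nonstandard 2-D coefficient can be computed from its position in the transform array. The sketch below assumes a particular placement convention: the details of the box with index (b1, b2) at a level whose averaged quadrant has side `half` are stored at offsets 0 or `half` from (b1, b2). The function name `support_region` is hypothetical. The convention agrees with the two cases named above (position [0,0] covers the whole array, position [3,3] covers only the high-index quadrant of a 4x4 array), but it is not confirmed by the slides.

```python
def support_region(n, i1, i2):
    """Return ((low1, high1), (low2, high2)): inclusive cell ranges in the
    n x n data array that the coefficient at position [i1, i2] contributes to."""
    m = max(i1, i2)
    if m == 0:
        return (0, n - 1), (0, n - 1)     # overall average covers everything
    half = 1 << (m.bit_length() - 1)      # largest power of two <= max(i1, i2)
    side = n // half                      # side length of the supported block
    b1, b2 = i1 % half, i2 % half         # which block at this resolution level
    return (b1 * side, (b1 + 1) * side - 1), (b2 * side, (b2 + 1) * side - 1)

for pos in [(0, 0), (0, 1), (2, 1), (3, 3)]:
    print(pos, support_region(4, *pos))
```

Each region is returned as a low and a high boundary per dimension, which is a compact way to store a hyper-rectangle.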

Haar Coefficients: Semantics and Representation
- W.R: the d-dimensional support hyper-rectangle of W; it encloses all cells in A to which W contributes
- The hyper-rectangle is represented by low and high boundaries across each dimension j, 1