Big Data Management

Big Data Management – Challenges and Opportunities –

an Incomplete Survey

Jiaheng Lu
Renmin University of China

Joint work with Yu Liu

Tutorial on HotDB

Tutorial objectives

• Big data challenges
• Big data management new principles
• Big data management research
– Indexes
– Transaction
– Architecture
– Application
– Benchmark

Big data challenge

• Big data
– Science data
– Finance data
– Streaming data
– Internet data

Big data management challenge

The growth in database transactions and volumes has a large impact on response times.

Source: http://www.codefutures.com/database-sharding/

Many techniques have evolved:

• Master/Slave

• Cluster Computing

• Table Partitioning

• Federated Tables

Four new principles in big data management

New principle in big data management (1)

• Partition everything and use key-value storage

• "Divide all things in order to govern them"

• 1st normal form cannot be satisfied
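The "partition everything" principle can be illustrated with a minimal sketch of a hash-partitioned key-value store. The class name `PartitionedStore`, the helper `shard_for`, and the choice of MD5 as a stable hash are assumptions made for illustration only; they are not taken from the tutorial or from any particular system.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class PartitionedStore:
    """Toy key-value store whose data is split across in-memory shards."""

    def __init__(self, num_shards: int = 4):
        # Each dict stands in for a separate machine.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = PartitionedStore(num_shards=4)
for i in range(100):
    store.put(f"user:{i}", {"id": i})
print(store.get("user:42"))
print([len(s) for s in store.shards])
```

Each key lives on exactly one shard, so a lookup touches a single partition. The price is that a record is stored as an opaque value under its key rather than as normalized relational rows.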


New principle in big data management (2)

• Embrace inconsistency

• "Tolerate differences to achieve great harmony"

• ACID properties are not satisfied


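New principle in big data management (3)

• Back up everything with three copies

• "A cunning rabbit keeps three burrows, and only then sleeps soundly"

• Guarantee 99.999999% safety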


New principle in big data management (4)

• Scalable and high performance

• "Plan across the vast ocean, with capacity to hold it all"

Big data management

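• Partition everything ("divide all things in order to govern them")
• Embrace inconsistency ("tolerate differences to achieve great harmony")
• Back up data with three copies ("a cunning rabbit keeps three burrows, and only then sleeps soundly")
• Scalable and high performance ("plan across the vast ocean, with capacity to hold it all")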

Big Data Management

Indexes on Big Data

Transaction on Big Data

Processing Architecture on Big Data

Applications in MapReduce Parallel Processing

Benchmark of Big Data Management System

Related Papers

[Figure: chart of the number of related papers (axis from 0 to 14) by year (2009, 2010, 2011) and venue (SIGMOD, VLDB, ICDE)]

Related Papers

[Figure: chart of the number of related papers (axis from 0 to 4.5) by topic (Index on Big Data, Architecture, Applications, Benchmark) and year (2009, 2010, 2011)]

Big data papers (incomplete data)

• Indexes on Big Data: ~4 papers
• Transaction on Big Data: 4~5 papers
• Processing Architecture on Big Data: 6~7 papers
• Applications in MapReduce Parallel Processing: 6~7 papers
• Benchmark of Big Data Management System: 3~4 papers






Indexes on Big Data

Construct indexes which can be maintained in an incremental way.

Avoid bottlenecks in the tree-like structure so that reads and writes can proceed concurrently.

Distributed B-Tree

Goal: perform consistent concurrent updates while allowing high concurrency (read)

M. K. Aguilera, W. Golab, et al. A Practical Scalable Distributed B-Tree. VLDB 2008

Indexes on Big Data

Distributed B-Tree

3 techniques:
• Transaction: optimistic concurrency control
• Lazy replication of version numbers at clients
• Eager replication of version numbers at servers

M. K. Aguilera, W. Golab, et al. A Practical Scalable Distributed B-Tree. VLDB 2008
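Optimistic concurrency control with version numbers can be sketched as follows. This is a single-process toy rather than the paper's protocol: the names `VersionedStore`, `commit`, and `optimistic_update` are hypothetical, and replication of version numbers across clients and servers is not modeled.

```python
class VersionedStore:
    """Toy server: every node carries a version number bumped on each write."""

    def __init__(self):
        self._data = {}  # node_id -> (version, value)

    def read(self, node_id):
        # Unknown nodes are reported as version 0 with no value.
        return self._data.get(node_id, (0, None))

    def commit(self, read_set, write_set):
        """Validate the versions that were read, then apply the writes.

        read_set:  {node_id: version seen by the transaction}
        write_set: {node_id: new value}
        Returns False (and writes nothing) if any version is stale.
        """
        for node_id, seen_version in read_set.items():
            if self.read(node_id)[0] != seen_version:
                return False
        for node_id, value in write_set.items():
            current_version = self.read(node_id)[0]
            self._data[node_id] = (current_version + 1, value)
        return True


def optimistic_update(store, node_id, fn, max_retries=10):
    """Read without locking, compute, and retry if validation fails."""
    for _ in range(max_retries):
        version, value = store.read(node_id)
        if store.commit({node_id: version}, {node_id: fn(value)}):
            return True
    return False


store = VersionedStore()
optimistic_update(store, "leaf-1", lambda keys: (keys or []) + [17])
print(store.read("leaf-1"))
# A transaction that still holds the old version 0 is rejected:
print(store.commit({"leaf-1": 0}, {"leaf-1": []}))
```

Readers never block. A writer pays only when validation detects that a node it read has changed in the meantime, which suits workloads where such conflicts are rare.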

Indexes on Big Data

• Use the BATON overlay to support range queries
• Local B+-tree index and Cloud Global (CG) index
• Publish only a few local index nodes to the global index to get high throughput and concurrency

Sai Wu, Dawei Jiang, et al. Efficient B-tree Based Indexing for Cloud Data Processing. VLDB 2010

Indexes on Big Data

BATON overlay

Steps to retrieve data:
1. Search in the BATON tree (lookup());
2. For all overlapping nodes in the global index, find the corresponding nodes (and local index);
3. Search in the local B+-tree index to retrieve data.

Sai Wu, Dawei Jiang, et al. Efficient B-tree Based Indexing for Cloud Data Processing. VLDB 2010
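The three retrieval steps can be sketched with a two-level index. This is a simplified illustration under stated assumptions: a plain list scan stands in for the BATON overlay, a sorted list with binary search stands in for each local B+-tree, and the names `GlobalIndex`, `LocalIndex`, `publish`, and `range_query` are hypothetical.

```python
from bisect import bisect_left, bisect_right


class LocalIndex:
    """Stand-in for one node's local B+-tree: keys kept in sorted order."""

    def __init__(self, items):
        pairs = sorted(items, key=lambda kv: kv[0])
        self.keys = [k for k, _ in pairs]
        self.values = [v for _, v in pairs]

    def range(self, low, high):
        # Binary search for the slice of keys inside [low, high].
        start = bisect_left(self.keys, low)
        end = bisect_right(self.keys, high)
        return list(zip(self.keys[start:end], self.values[start:end]))


class GlobalIndex:
    """Stand-in for the global index: published key ranges mapped to nodes."""

    def __init__(self):
        self.entries = []  # (low, high, local_index)

    def publish(self, low, high, node):
        self.entries.append((low, high, node))

    def lookup(self, low, high):
        # Steps 1 and 2: find every node whose range overlaps the query.
        return [n for lo, hi, n in self.entries if lo <= high and low <= hi]


def range_query(global_index, low, high):
    results = []
    for node in global_index.lookup(low, high):
        # Step 3: search the local index of each overlapping node.
        results.extend(node.range(low, high))
    return sorted(results)


node_a = LocalIndex([(1, "a"), (5, "e"), (9, "i")])
node_b = LocalIndex([(12, "l"), (15, "o")])
cg = GlobalIndex()
cg.publish(1, 9, node_a)
cg.publish(12, 15, node_b)
print(range_query(cg, 5, 12))  # [(5, 'e'), (9, 'i'), (12, 'l')]
```

Only the coarse range entries are published globally, so updates inside a node's local index do not touch the global structure unless the node's published range changes.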

Indexes on Big Data






The CAP Theorem

• Consistency: once a writer has written, all readers will see that write.
• Availability: the system is available during software and hardware upgrades and node failures.
• Partition tolerance: the system can continue to operate in the presence of network partitions.

Theorem: You can have at most two of these properties for any shared-data system.

Consistency

• Two kinds of consistency:
– strong consistency: ACID (Atomicity, Consistency, Isolation, Durability)
– weak consistency: BASE (Basically Available, Soft-state, Eventual consistency)

[Figure: "A tailor" illustration of the RDBMS, labeled 3NF, TRANSACTION, LOCK, ACID, SAFETY]

“Not all data need to be treated at the same level of consistency.”

Consistency Rationing

• Goal: minimize the overall cost of operations in the cloud
• Define consistency guarantees on the data instead of at the transaction level
• Switch consistency guarantees at runtime, automatically
• 3 categories

T. Kraska, M. Hentschel, et al. Consistency Rationing in the Cloud: Pay only when it matters. VLDB 2009



Category C: Session Consistency
• (Temporary) inconsistency is acceptable
• Read-your-own-writes monotonicity
• Converges and achieves eventual consistency at some interval

Category A: Serializable
• A consistency violation results in large penalty costs

Category B: trade-off between cost per operation and consistency level
• Adaptive: switch between session consistency and serializability at runtime


Category B: trade-off between cost per operation and consistency level

• General Policy: a higher consistency level needs to be provided when the conflict (update) rate is high.
• Time Policy: more commits as the "deadline" approaches.
• Fixed Threshold Policy (for numeric types)
• Dynamic Policy (for numeric types)

Y: sum of update values
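How a Category B record might switch consistency levels at runtime can be sketched as below. This is only an illustrative reading of the General Policy and the Fixed Threshold Policy. The function names, the threshold values, and the direction of the threshold test (a numeric value near its limit, such as a low stock count, gets the stronger level) are all assumptions, not definitions from the paper.

```python
from enum import Enum


class Level(Enum):
    SESSION = "session consistency"
    SERIALIZABLE = "serializability"


def general_policy(conflict_probability: float, threshold: float = 0.1) -> Level:
    """Use the stronger level when conflicts are likely.

    Both the probability estimate and the threshold are hypothetical inputs.
    """
    if conflict_probability > threshold:
        return Level.SERIALIZABLE
    return Level.SESSION


def fixed_threshold_policy(value: float, threshold: float) -> Level:
    """Numeric records: pay for serializability only near the limit.

    Assumed reading: a value at or below the threshold is critical.
    """
    if value <= threshold:
        return Level.SERIALIZABLE
    return Level.SESSION


print(general_policy(0.5))              # Level.SERIALIZABLE
print(general_policy(0.01))             # Level.SESSION
print(fixed_threshold_policy(3, 10))    # Level.SERIALIZABLE
print(fixed_threshold_policy(500, 10))  # Level.SESSION
```

The point of both policies is the same: the expensive guarantee is used only where an inconsistency is likely or costly, and the cheaper one is used everywhere else.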



• Datalog and coordination complexity: theoretical results from PODS aspects

(PODS keynote 2011 Joseph M. Hellerstein, UC Berkeley)

Datalog

• Main expressive advantage: recursive queries.
• More convenient for analysis: papers look better.
• Without recursion but with negation it is equivalent in power to relational algebra.
• Has affected real practice (e.g., recursion in SQL3, magic sets transformations).

Datalog

• Example Datalog program:

    parent(bill,mary).
    parent(mary,john).

    ancestor(X,Y) :- parent(X,Y).
    ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).

    ?- ancestor(bill,X)
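The recursive ancestor rules can be evaluated by repeating the rules until no new facts appear (naive bottom-up evaluation). The sketch below hard-codes the two rules of this example in Python; the function name `ancestors` is hypothetical and this is not a general Datalog engine.

```python
def ancestors(parent_facts):
    """Compute the ancestor relation as the fixpoint of the two rules."""
    # Rule 1: ancestor(X,Y) :- parent(X,Y).
    ancestor = set(parent_facts)
    while True:
        # Rule 2: ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
        derived = {
            (x, y)
            for (x, z) in parent_facts
            for (z2, y) in ancestor
            if z == z2
        }
        if derived <= ancestor:  # nothing new: fixpoint reached
            return ancestor
        ancestor |= derived


parent = {("bill", "mary"), ("mary", "john")}

# ?- ancestor(bill,X)
answers = sorted(y for (x, y) in ancestors(parent) if x == "bill")
print(answers)  # ['john', 'mary']
```

Facts are only ever added, never retracted, so the result does not depend on the order in which rules fire. This is the monotonicity property that the conjectures later in the deck build on.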

Joseph’s Conjecture (1)

• CONJECTURE 1. Consistency And Logical Monotonicity (CALM).
• A program has an eventually consistent, coordination-free execution strategy if and only if it is expressible in (monotonic) Datalog.

Joseph’s Conjecture (2)

• CONJECTURE 2. Causality Required Only for Non-monotonicity (CRON).
• Program semantics require causal message ordering if and only if the messages participate in non-monotonic derivations.

Joseph’s Conjecture (3)

• CONJECTURE 3. The minimum number of Dedalus timesteps required to evaluate a program on a given input data set is equivalent to the program’s Coordination Complexity.

Joseph’s Conjecture (4)

• CONJECTURE 4. Any Dedalus program P can be rewritten into an equivalent temporally-minimized program P’ such that each inductive or asynchronous rule of P’ is necessary: converting that rule to a deductive rule would result in a program with no unique minimal model.

Circumstance has presented a rare opportunity—call it an imperative—for the database community to take its place in the sun, and help create a new environment for parallel and distributed computation to flourish.

------Joseph M. Hellerstein (UC Berkeley)







Make MapReduce more powerful, especially on complicated analysis

Merge cloud computing systems and PDBMSs

MapReduce online testing platform

• Cloudcomputing.ruc.edu.cn

• Automatic evaluation of Hadoop MapReduce code

• Theoretical questions

Open MapReduce testing platform: cloudcomputing.ruc.edu.cn

“Sort-merge implementation in Hadoop poses fundamental barrier to incremental one-pass analysis”

New Hash-Based Platform


B. Li, E. Mazur, et al. A Platform for Scalable One-Pass Analytics using MapReduce. SIGMOD 2011
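The contrast between sort-merge grouping and hash-based grouping can be sketched in a few lines. This is a single-process illustration of the idea only, not the platform described in the paper; the function names `sort_merge_count` and `hash_group_count` are hypothetical.

```python
from collections import defaultdict


def sort_merge_count(records):
    """Sort-merge grouping: needs the complete input before the first answer."""
    result = []
    for key in sorted(records):  # blocking step: all records must be present
        if result and result[-1][0] == key:
            result[-1] = (key, result[-1][1] + 1)
        else:
            result.append((key, 1))
    return result


def hash_group_count(stream):
    """Hash-based grouping: one pass, with a running answer after every record."""
    counts = defaultdict(int)
    for key in stream:
        counts[key] += 1
        yield key, counts[key]  # an up-to-date count is available immediately


clicks = ["a", "b", "a"]
print(sort_merge_count(clicks))        # [('a', 2), ('b', 1)]
print(list(hash_group_count(clicks)))  # [('a', 1), ('b', 1), ('a', 2)]
```

The hash-based version never sorts and can consume an unbounded stream, which is what makes incremental one-pass analysis possible. The trade-off is that the hash table of keys has to fit in memory or be spilled.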

Fast Join Processing in Data Warehouse

• Partitioning data into vertical groups dynamically

Y. Lin, D. Agrawal, et al. Llama: Leveraging Columnar Storage for Scalable Join Processing in the MapReduce Framework. SIGMOD 2011


Fast Join Processing in Data Warehouse

• Partitioning data into vertical groups dynamically
• Concurrent join
• More map-side joins
• Basic patterns: star pattern and chain pattern



Make MapReduce more powerful, especially on complicated analysis

Merge cloud computing systems and PDBMSs

HadoopDB

• Combination of parallel DBMS (performance) and MapReduce (scalability, fault tolerance)
• Communication layer: MapReduce
• Nodes: single-node DBMS instances
• SMS Planner: SQL to MapReduce job to SQL







A. Okcan, M. Riedewald. Processing Theta-Joins using MapReduce. SIGMOD 2011

• Discusses several algorithms for theta-joins (inequality joins)


R. Vernica, M. J. Carey, et al. Efficient Set-Similarity Joins Using MapReduce. SIGMOD 2010

Uses the MapReduce framework to perform set-similarity joins: given two files (or one), find all pairs of records (a, b) such that a and b are similar, i.e. sim(a, b) > t.

Gives algorithms that cope with large amounts of data, along with an experimental evaluation.
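A set-similarity self-join in the map/reduce style can be sketched as follows. This is a deliberately naive single-process version, not the paper's algorithm: it groups records by every token rather than by a filtered prefix, it assumes Jaccard similarity for sim, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two token collections (assumed choice of sim)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_phase(records):
    """Map: emit (token, record id) so records sharing a token meet in reduce."""
    for rid, tokens in records.items():
        for token in set(tokens):
            yield token, rid


def reduce_phase(pairs):
    """Reduce: every pair of records grouped under one token is a candidate."""
    groups = defaultdict(list)
    for token, rid in pairs:
        groups[token].append(rid)
    candidates = set()
    for rids in groups.values():
        candidates.update(combinations(sorted(rids), 2))
    return candidates


def similarity_self_join(records, t):
    """Verify candidates and keep the pairs with sim(a, b) > t."""
    candidates = reduce_phase(map_phase(records))
    return sorted(
        (a, b) for a, b in candidates if jaccard(records[a], records[b]) > t
    )


docs = {
    1: ["big", "data", "management"],
    2: ["big", "data", "systems"],
    3: ["graph", "mining"],
}
print(similarity_self_join(docs, 0.4))  # [(1, 2)]
```

Records that share no token are never compared, which is what keeps the join from degenerating into a full cross product. Frequent tokens still create large groups, and that is the cost more refined filtering aims to cut.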








Comparison of the performance between the MapReduce paradigm and parallel DBMSs

Performance: PDBMSs >> MR systems (except data loading)

Comparison:
• Schema support
• Indexing
• Programming model
• Data distribution
• Execution strategy
• Flexibility
• Fault tolerance

A. Pavlo, E. Paulson, et al. A Comparison of Approaches to Large-Scale Data Analysis. SIGMOD 2009



How do architectures affect the performance of cloud computing for database applications, especially for OLTP?

D. Kossmann, T. Kraska, et al. An Evaluation of Alternative Architectures for Transaction Processing in the Cloud. SIGMOD 2010





Conclusion

• Big Data Management: HOT DB topic
• Research topics: indexing, transaction, join, architecture, application, benchmark

References

• Sai Wu, Dawei Jiang, et al. Efficient B-tree Based Indexing for Cloud Data Processing. VLDB 2010
• David Chiu, A. Shetty, et al. Evaluating and Optimizing Indexing Schemes for a Cloud-based Elastic Key-Value Store. 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2011
• J. Wang, S. Wu, et al. Indexing Multi-dimensional Data in a Cloud System. SIGMOD 2010
• D. Kossmann, T. Kraska, et al. An Evaluation of Alternative Architectures for Transaction Processing in the Cloud. SIGMOD 2010
• T. Kraska, M. Hentschel, et al. Consistency Rationing in the Cloud: Pay only when it matters. VLDB 2009
• H. T. Vo, C. Chen, et al. Towards Elastic Transactional Cloud Storage with Range Query Support. VLDB 2010
• H. Kllapi, E. Sitaridi, et al. Schedule Optimization for Data Processing Flows on the Cloud. SIGMOD 2011
• M. K. Aguilera, W. Golab, et al. A Practical Scalable Distributed B-Tree. VLDB 2008

References

• E. Friedman, P. Pawlowski, et al. SQL/MapReduce: A Practical approach to self-describing, polymorphic, and parallelizable user-defined functions. VLDB 2009
• R. Vernica, M. J. Carey, et al. Efficient Set-Similarity Joins Using MapReduce. SIGMOD 2010
• S. Blanas, J. M. Patel, et al. A Comparison of Join Algorithms for Log Processing in MapReduce. SIGMOD 2010
• D. Logothetis, K. Yocum. Ad-Hoc Data Processing in the Cloud. VLDB 2008
• B. Panda, J. S. Herbach, et al. PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce. VLDB 2009
• A. Okcan, M. Riedewald. Processing Theta-Joins using MapReduce. SIGMOD 2011
• K. Morton, M. Balazinska, et al. ParaTimer: A Progress Indicator for MapReduce DAGs. SIGMOD 2010
• Y. Cao, C. Chen, et al. ES2: A Cloud Data Storage System for Supporting Both OLTP and OLAP. ICDE 2011
• K. Morton, A. Friesen, et al. Estimating the Progress of MapReduce Pipelines. ICDE 2010

References

• W. Lang, J. M. Patel. Energy Management for MapReduce Clusters. VLDB 2010
• T. Nykiel, M. Potamias, et al. MRShare: Sharing Across Multiple Queries in MapReduce. VLDB 2010
• C. Olston, G. Chiou, et al. Nova: Continuous Pig/Hadoop Workflows. SIGMOD 2011
• Y. Lin, D. Agrawal, et al. Llama: Leveraging Columnar Storage for Scalable Join Processing in the MapReduce Framework. SIGMOD 2011
• B. Li, E. Mazur, et al. A Platform for Scalable One-Pass Analytics using MapReduce. SIGMOD 2011
• D. G. Campbell, G. Kakivaya, et al. Extreme Scale with Full SQL Language Support in Microsoft SQL Azure. SIGMOD 2010
• A. Abouzeid, K. B-Pawlikowski, et al. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. VLDB 2009
• Y. Xu, P. Kostamaa, et al. Integrating Hadoop and Parallel DBMS. SIGMOD 2010
• J. A. Q-Ruiz, C. Pinkel, et al. RAFT at Work: Speeding-Up MapReduce Applications under Task and Node Failures. SIGMOD 2011
• A. Pavlo, E. Paulson, et al. A Comparison of Approaches to Large-Scale Data Analysis. SIGMOD 2009
