Big Data Management – Challenges and Opportunities – an Incomplete Survey Jiaheng Lu Renmin University of China Joint work with Yu Liu Tutorial on HotDB
Mar 31, 2015
Tutorial objectives
• Big data challenges
• Big data management new principles
• Big data management research
– Indexes
– Transaction
– Architecture
– Application
– Benchmark
Big data challenge
• Big data
– Science data
– Finance data
– Streaming data
– Internet data
Big data management challenge
The growth in database transactions and volumes has a large impact on response times.
Source: http://www.codefutures.com/database-sharding/
Many techniques have evolved:
• Master/Slave
• Cluster Computing
• Table Partitioning
• Federated Tables
Four new principles in big data management
New principle in big data management ( 1 )
• Partition Everything and key-value storage
• Partition all things in order to manage them
• First normal form (1NF) cannot be satisfied
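A minimal sketch of the "partition everything" idea for a key-value store, assuming plain hash partitioning. The class name `PartitionedKV`, the partition count, and the choice of MD5 as a stable hash are illustrative assumptions and do not come from the tutorial.

```python
import hashlib


class PartitionedKV:
    """Toy key-value store that spreads keys over a fixed number of partitions."""

    def __init__(self, num_partitions=4):
        # Each partition is an independent dict; in a real system each one
        # would live on its own node.
        self.partitions = [dict() for _ in range(num_partitions)]

    def _partition_of(self, key):
        # Use a stable hash: the built-in hash() is salted per process for str.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.partitions)

    def put(self, key, value):
        self.partitions[self._partition_of(key)][key] = value

    def get(self, key, default=None):
        return self.partitions[self._partition_of(key)].get(key, default)


store = PartitionedKV(num_partitions=4)
store.put("user:1", {"name": "alice"})
store.put("user:2", {"name": "bob"})
```

Every read or write touches exactly one partition, which is what lets the partitions scale out independently. The value is an opaque nested object rather than a flat 1NF row.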
New principle in big data management ( 2 )
• Embrace Inconsistency
• Tolerating differences leads to great harmony
• ACID properties are not satisfied
New principle in big data management ( 3 )
• Backup everything with three copies
• A wily rabbit keeps three burrows and so rests easy
• Guarantee 99.999999% safety
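One simple way to place three copies is to store each partition on a primary node and on the next two nodes of a ring. The sketch below assumes that scheme; the function names `replica_nodes` and `survives` are hypothetical, and real systems usually also take racks and data centers into account.

```python
def replica_nodes(partition_id, num_nodes, copies=3):
    """Return the nodes holding a partition: a primary plus the next nodes on a ring."""
    if copies > num_nodes:
        raise ValueError("need at least as many nodes as copies")
    primary = partition_id % num_nodes
    return [(primary + i) % num_nodes for i in range(copies)]


def survives(partition_id, num_nodes, failed_nodes, copies=3):
    """A partition stays readable while at least one of its replica nodes is alive."""
    return any(node not in failed_nodes
               for node in replica_nodes(partition_id, num_nodes, copies))
```

With five nodes, partition 4 is stored on nodes 4, 0 and 1, so data is lost only if all three fail together.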
New principle in big data management ( 4 )
• Scalable and high performance
• Plan for an ocean of data and accommodate it all
Big data management
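• Partition Everything: partition all things in order to manage them
• Embrace Inconsistency: tolerating differences leads to great harmony
• Backup data with three copies: a wily rabbit keeps three burrows and so rests easy
• Scalable and high performance: plan for an ocean of data and accommodate it all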
Big Data Management
• Indexes on Big Data
• Transaction on Big Data
• Processing Architecture on Big Data
• Applications in MapReduce Parallel Processing
• Benchmark of Big Data Management System
[Figure: Related Papers. Chart with axis ticks from 0 to 14, for the years 2009, 2010 and 2011, with series SIGMOD, VLDB and ICDE]
[Figure: Related Papers. Chart with axis ticks from 0 to 4.5, by topic (Index on Big Data, Transaction on Big Data, Architecture, Applications, Benchmark), with series 2009, 2010 and 2011]
Big data papers (incomplete data)
• Indexes on Big Data: ~4 papers
• Transaction on Big Data: 4~5 papers
• Processing Architecture on Big Data: 6~7 papers
• Applications in MapReduce Parallel Processing: 6~7 papers
• Benchmark of Big Data Management System: 3~4 papers
Indexes on Big Data
Construct indexes which can be maintained in an incremental way.
Avoid bottleneck in the tree-like structure to provide concurrent reading and writing operations
Distributed B-Tree
Goal: perform consistent concurrent updates while allowing high concurrency (reads)
M. K. Aguilera, W. Golab, et al. A Practical Scalable Distributed B-Tree. VLDB 2008
Indexes on Big Data
Distributed B-Tree
3 techniques:
• Transactions with optimistic concurrency control
• Lazy replication of version numbers at clients
• Eager replication of version numbers at servers
M. K. Aguilera, W. Golab, et al. A Practical Scalable Distributed B-Tree. VLDB 2008
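The first technique can be illustrated with a small sketch of optimistic concurrency control driven by version numbers. This is a generic validate-then-commit scheme and not the paper's implementation; the class `VersionedStore` and the key `"leaf7"` are hypothetical names.

```python
class VersionedStore:
    """Server-side store in which every B-tree node (key) carries a version number."""

    def __init__(self):
        self.data = {}     # key -> value
        self.version = {}  # key -> int, bumped on every committed write

    def read(self, key):
        return self.data.get(key), self.version.get(key, 0)

    def commit(self, read_set, write_set):
        """Validate, then apply: abort if any version that was read has changed."""
        for key, seen_version in read_set.items():
            if self.version.get(key, 0) != seen_version:
                return False  # conflict detected; the caller should retry
        for key, value in write_set.items():
            self.data[key] = value
            self.version[key] = self.version.get(key, 0) + 1
        return True


store = VersionedStore()
_, v0 = store.read("leaf7")
ok_first = store.commit({"leaf7": v0}, {"leaf7": ["k1"]})  # validates against version 0
ok_stale = store.commit({"leaf7": v0}, {"leaf7": ["k2"]})  # still assumes version 0
```

Readers never take locks. A writer pays only when validation fails, which is why the approach suits workloads where conflicts are rare.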
Indexes on Big Data
• Use the BATON overlay to support range queries
• Local B+-tree index & Cloud Global (CG) index
• Publish only part of each local index to the global index to get high throughput and concurrency
Sai Wu, Dawei Jiang, et al. Efficient B-tree Based Indexing for Cloud Data Processing. VLDB 2010
Indexes on Big Data
BATON overlay
Steps to retrieve data:
1. Search in the BATON tree (lookup()).
2. For all overlapping nodes in the global index, find the corresponding nodes (and their local indexes).
3. Search in the local B+-tree index to retrieve the data.
Sai Wu, Dawei Jiang, et al. Efficient B-tree Based Indexing for Cloud Data Processing. VLDB 2010
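The three retrieval steps can be sketched as a two-level lookup. This is a simplification under stated assumptions: a plain list of key ranges stands in for the BATON overlay, a sorted list searched with `bisect` stands in for each local B+-tree, and the node names and keys are made up.

```python
from bisect import bisect_left, bisect_right

# Local indexes: one sorted key list per node (stand-in for a local B+-tree).
local_index = {
    "node_a": [1, 4, 9, 15],
    "node_b": [20, 22, 31],
    "node_c": [40, 47, 58],
}

# Global index: key ranges published by the nodes (stand-in for the BATON overlay).
global_index = [(1, 15, "node_a"), (20, 31, "node_b"), (40, 58, "node_c")]


def range_query(low, high):
    """Return all keys in [low, high], visiting only nodes whose range overlaps."""
    results = []
    # Steps 1-2: find the global entries whose range overlaps [low, high].
    for g_low, g_high, node in global_index:
        if g_low <= high and low <= g_high:
            # Step 3: search that node's local index for the matching keys.
            keys = local_index[node]
            results.extend(keys[bisect_left(keys, low):bisect_right(keys, high)])
    return results
```

A query for the range 9 to 22 touches `node_a` and `node_b` only; `node_c` is pruned by the global index.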
Transaction on Big Data
The CAP Theorem
• Consistency: once a writer has written, all readers will see that write.
• Availability: the system is available during software and hardware upgrades and node failures.
• Partition tolerance: the system can continue to operate in the presence of network partitions.
• Theorem: you can have at most two of these properties for any shared-data system.
Consistency
• Two kinds of consistency:
– strong consistency: ACID (Atomicity, Consistency, Isolation, Durability)
– weak consistency: BASE (Basically Available, Soft-state, Eventual consistency)
[Figure: "A tailor". Labels: RDBMS, 3NF, TRANSACTION, LOCK, ACID, SAFETY]
“Not all data need to be treated at the same level of consistency.”
Consistency Rationing
• Goal: minimize the overall cost of operations in the cloud
• Define consistency guarantees on the data instead of at the transaction level
• Switch consistency guarantees at runtime, automatically
• 3 categories
T. Kraska, M. Hentschel, et al. Consistency Rationing in the Cloud: Pay only when it matters. VLDB 2009
Transaction on Big Data
• Category A: Serializable. Consistency violations result in large penalty costs.
• Category B: Adaptive. Trade-off between cost per operation & consistency level; switches between session consistency and serializability at runtime.
• Category C: Session consistency. (Temporary) inconsistency is acceptable; provides read-your-own-writes monotonicity; replicas converge and achieve eventual consistency at some interval.
T. Kraska, M. Hentschel, et al. Consistency Rationing in the Cloud: Pay only when it matters. VLDB 2009
Category B: trade-off between cost per operation & consistency level
• General Policy: “a higher consistency level needs to be provided when the conflict (update) rate is high.”
• Time Policy: as the “deadline” approaches, more commits.
• Fixed Threshold Policy (for numeric types)
• Dynamic Policy (for numeric types)
– Y: sum of update values
T. Kraska, M. Hentschel, et al. Consistency Rationing in the Cloud: Pay only when it matters. VLDB 2009
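The sketch below shows how a Category B record might pick its consistency level at runtime. It is an illustration under assumptions, not the paper's exact model: updates are assumed to arrive as a Poisson process, the cost figures are arbitrary units, the threshold direction (low values get strong consistency) is assumed, and all function names are hypothetical.

```python
import math


def conflict_probability(update_rate, window):
    """Chance of at least one concurrent update in `window` (Poisson arrivals assumed)."""
    return 1.0 - math.exp(-update_rate * window)


def general_policy(update_rate, window, cost_serializable, cost_session, penalty):
    """Pick serializability when the expected cost of an inconsistency outweighs its price."""
    expected_session_cost = cost_session + conflict_probability(update_rate, window) * penalty
    return "serializable" if expected_session_cost > cost_serializable else "session"


def fixed_threshold_policy(value, threshold):
    """Numeric data: use strong consistency once the value is at or below the threshold."""
    return "serializable" if value <= threshold else "session"
```

For example, a rarely updated record stays on cheap session consistency, while a stock counter that is close to zero switches to serializability because overselling carries a penalty.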
Transaction on Big Data
• Datalog and coordination complexity: theoretical results from a PODS perspective (PODS keynote 2011, Joseph M. Hellerstein, UC Berkeley)
Datalog
• Main expressive advantage: recursive queries.
• More convenient for analysis: papers look better.
• Without recursion but with negation, it is equivalent in power to relational algebra.
• Has affected real practice (e.g., recursion in SQL3, magic sets transformations).
Datalog
• Example Datalog program:
  parent(bill,mary).
  parent(mary,john).
  ancestor(X,Y) :- parent(X,Y).
  ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
  ?- ancestor(bill,X)
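To make the recursion concrete, the example program can be evaluated by hand with naive bottom-up (fixpoint) evaluation. The Python below is a sketch of that evaluation strategy for this one program; the function name `ancestors` is an assumption, and real Datalog engines use more efficient semi-naive evaluation.

```python
parent = {("bill", "mary"), ("mary", "john")}


def ancestors(parent_facts):
    """Naive bottom-up evaluation: apply both rules until no new fact appears."""
    ancestor = set(parent_facts)  # ancestor(X,Y) :- parent(X,Y).
    while True:
        # ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
        derived = {(x, y)
                   for (x, z) in parent_facts
                   for (z2, y) in ancestor
                   if z == z2}
        if derived <= ancestor:   # fixpoint reached: nothing new was derived
            return ancestor
        ancestor |= derived


# ?- ancestor(bill,X)
answer = sorted(y for (x, y) in ancestors(parent) if x == "bill")
```

Here `answer` is `["john", "mary"]`: mary comes from the first rule and john from one application of the recursive rule.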
Joseph’s Conjecture (1)
• CONJECTURE 1. Consistency And Logical Monotonicity (CALM).
• A program has an eventually consistent, coordination-free execution strategy if and only if it is expressible in (monotonic) Datalog.
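The intuition behind CALM can be seen in a toy experiment: deliver the same messages in every possible order and compare the results. The message format and the function `run` are assumptions made for this sketch. Accumulating facts is monotonic; asking whether a fact is absent is a negation and therefore non-monotonic.

```python
from itertools import permutations


def run(messages):
    """Process messages in the given order; return the final state and query answers."""
    seen = set()
    answers = []
    for kind, item in messages:
        if kind == "add":
            seen.add(item)                    # monotonic: the set only grows
        elif kind == "ask_absent":
            answers.append(item not in seen)  # non-monotonic: depends on what has arrived
    return frozenset(seen), tuple(answers)


messages = [("add", "a"), ("add", "b"), ("ask_absent", "a")]
outcomes = [run(order) for order in permutations(messages)]
final_states = {state for state, _ in outcomes}
query_answers = {answers for _, answers in outcomes}
```

Every delivery order ends in the same final set, so the monotonic part needs no coordination. The negated query yields different answers depending on the order, which is where coordination becomes necessary.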
Joseph’s Conjecture (2)
• CONJECTURE 2. Causality Required Only for Non-monotonicity (CRON).
• Program semantics require causal message ordering if and only if the messages participate in non-monotonic derivations.
Joseph’s Conjecture (3)
• CONJECTURE 3. The minimum number of Dedalus timesteps required to evaluate a program on a given input data set is equivalent to the program’s Coordination Complexity.
Joseph’s Conjecture (4)
• CONJECTURE 4. Any Dedalus program P can be rewritten into an equivalent temporally-minimized program P’ such that each inductive or asynchronous rule of P’ is necessary: converting that rule to a deductive rule would result in a program with no unique minimal model.
Circumstance has presented a rare opportunity—call it an imperative—for the database community to take its place in the sun, and help create a new environment for parallel and distributed computation to flourish.
------Joseph M. Hellerstein (UC Berkeley)
Processing Architecture on Big Data
Make MapReduce more powerful, especially on complicated analysis
Merge cloud computing systems and PDBMSs
MapReduce online testing platform
• Cloudcomputing.ruc.edu.cn
• Automatic evaluation of Hadoop MapReduce code
• Theoretical questions
Open MapReduce testing platform: cloudcomputing.ruc.edu.cn
“Sort-merge implementation in Hadoop poses fundamental barrier to incremental one-pass analysis”
New Hash-Based Platform
Processing Architecture on Big Data
B. Li, E. Mazur, et al. A Platform for Scalable One-Pass Analytics using MapReduce. SIGMOD 2011
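The contrast between sort-merge and hash-based grouping can be sketched for a simple per-key count. This is a generic illustration and not the platform described in the paper; the function names and the sample click stream are assumptions.

```python
from collections import defaultdict
from itertools import groupby


def sort_merge_counts(records):
    """Blocking: nothing can be emitted until the whole input has been sorted."""
    return {key: len(list(group)) for key, group in groupby(sorted(records))}


def hash_incremental_counts(records):
    """One pass: per-key state is updated on arrival and can be read at any time."""
    state = defaultdict(int)
    for key in records:
        state[key] += 1
        yield dict(state)  # snapshot of the running answer after this record


clicks = ["a", "b", "a", "c", "a", "b"]
snapshots = list(hash_incremental_counts(clicks))
```

Both approaches agree on the final counts, but only the hash-based version has a usable partial answer after each record, which is what incremental one-pass analytics needs.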
Fast Join Processing in Data Warehouse
• Partitioning data into vertical groups dynamically
Y. Lin, D. Agrawal, et al. Llama: Leveraging Columnar Storage for Scalable Join Processing in the MapReduce Framework. SIGMOD 2011
Processing Architecture on Big Data
Fast Join Processing in Data Warehouse
• Partitioning data into vertical groups dynamically
• Concurrent join
• More map-side joins
• Basic patterns: star pattern & chain pattern
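For readers unfamiliar with map-side joins, here is a generic sketch for a star pattern in which a small dimension table is held in memory by every mapper. It is not Llama's algorithm, which relies on columnar storage and vertical groups; the table layout and the function `map_side_join` are assumptions.

```python
def map_side_join(fact_split, dimension_table):
    """Join one split of the fact table against a small in-memory dimension table.

    No shuffle or reduce phase is needed, because every mapper holds the
    whole dimension table as a hash map.
    """
    lookup = {row["id"]: row for row in dimension_table}
    for fact in fact_split:
        dim = lookup.get(fact["product_id"])
        if dim is not None:
            yield {"order": fact["order"], "product": dim["name"], "qty": fact["qty"]}


products = [{"id": 1, "name": "disk"}, {"id": 2, "name": "cpu"}]
orders = [
    {"order": 100, "product_id": 2, "qty": 1},
    {"order": 101, "product_id": 1, "qty": 4},
    {"order": 102, "product_id": 9, "qty": 2},  # no matching product, so it is dropped
]
joined = list(map_side_join(orders, products))
```

Avoiding the shuffle is the main saving: the large fact table is read once and never moved across the network.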
Processing Architecture on Big Data
Make MapReduce more powerful, especially on complicated analysis
Merge cloud computing systems and PDBMSs
HadoopDB
• Combination of parallel DBMS (performance) and MapReduce (scalability, fault tolerance)
• Communication layer: MapReduce
• Nodes: single-node DBMS instances
• SMS Planner: SQL → MapReduce job → SQL
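The division of labour can be sketched with in-memory SQLite databases standing in for the single-node DBMS instances. This only mimics the idea of pushing SQL down to the nodes and merging in a reduce step; it does not use HadoopDB or Hadoop, and the table, data and helper `make_node` are made up.

```python
import sqlite3
from collections import Counter


def make_node(rows):
    """Create a single-node DBMS instance (in-memory SQLite) holding one data partition."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


nodes = [
    make_node([("east", 10), ("west", 5)]),
    make_node([("east", 7), ("north", 3)]),
]

# "Map" side: push the SQL aggregation down to each node's local database.
LOCAL_SQL = "SELECT region, SUM(amount) FROM sales GROUP BY region"
partials = [row for conn in nodes for row in conn.execute(LOCAL_SQL)]

# "Reduce" side: merge the partial sums returned by the nodes.
totals = Counter()
for region, partial_sum in partials:
    totals[region] += partial_sum
```

Each node does as much work as possible inside its own database, and the coordinating layer only combines small partial results.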
Processing Architecture on Big Data
Discusses algorithms for theta-joins (inequality joins)
A. Okcan, M. Riedewald. Processing Theta-Joins using MapReduce. SIGMOD 2011
Applications in MapReduce Parallel Processing
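One generic way to run an arbitrary theta-join in a MapReduce style is to lay the reducers out as a grid over the cross product of the two inputs. The sketch below assumes that scheme with random row and column assignment. It is offered as an illustration and makes no claim to match the paper's algorithms in detail; the function `grid_theta_join` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def grid_theta_join(R, S, theta, rows=2, cols=2, seed=0):
    """Join R and S on an arbitrary predicate using a rows x cols grid of reducers."""
    rng = random.Random(seed)
    reducers = defaultdict(lambda: ([], []))  # (row, col) -> (R tuples, S tuples)
    # Map phase: replicate each R tuple across one grid row,
    # and each S tuple down one grid column.
    for r in R:
        i = rng.randrange(rows)
        for j in range(cols):
            reducers[(i, j)][0].append(r)
    for s in S:
        j = rng.randrange(cols)
        for i in range(rows):
            reducers[(i, j)][1].append(s)
    # Reduce phase: each grid cell joins only the tuples it received.
    output = []
    for r_part, s_part in reducers.values():
        output.extend((r, s) for r in r_part for s in s_part if theta(r, s))
    return output


R = [1, 5, 8]
S = [2, 6, 7]
result = grid_theta_join(R, S, lambda r, s: r < s)
```

Each pair (r, s) meets in exactly one cell, the one at r's row and s's column, so the result is complete and free of duplicates whatever the random assignment.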
R. Vernica, M. J. Carey, et al. Efficient Set-Similarity Joins Using MapReduce. SIGMOD 2010
Uses the MapReduce framework to perform set-similarity joins: given two files (or one), find all pairs of records (a, b) such that a and b are similar, i.e. sim(a, b) > t.
Gives algorithms that cope with large amounts of data, together with an experimental evaluation.
Applications in MapReduce Parallel Processing
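A simplified set-similarity self-join can be written in a map/shuffle/reduce shape as follows. This sketch assumes Jaccard similarity, a threshold t ≥ 0, and candidate generation over all shared tokens; it is not the paper's algorithm, and the names `jaccard` and `similarity_self_join` and the sample records are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    return len(a & b) / len(a | b)


def similarity_self_join(records, t):
    """records: dict of id -> set of tokens. Return id pairs with Jaccard similarity > t."""
    # Map: emit (token, record id). Shuffle: group record ids by token.
    by_token = defaultdict(set)
    for rid, tokens in records.items():
        for token in tokens:
            by_token[token].add(rid)
    # Reduce: records sharing a token become candidate pairs (deduplicated in a set).
    candidates = set()
    for rids in by_token.values():
        candidates.update(combinations(sorted(rids), 2))
    # Verify: compute the exact similarity only for the candidates.
    return sorted(p for p in candidates if jaccard(records[p[0]], records[p[1]]) > t)


docs = {
    1: {"big", "data", "index"},
    2: {"big", "data", "indexes"},
    3: {"cap", "theorem"},
}
pairs = similarity_self_join(docs, t=0.4)
```

Records 1 and 2 share two of four distinct tokens (similarity 0.5), so they are the only pair returned. Record 3 shares no token with the others and is never compared.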
Benchmark of Big Data Management System
Comparison of the performance of the MapReduce paradigm and parallel DBMSs
Performance: PDBMSs >> MR systems (except data loading)
Comparison:
• Schema support
• Indexing
• Programming model
• Data distribution
• Execution strategy
• Flexibility
• Fault tolerance
A. Pavlo, E. Paulson, et al. A Comparison of Approaches to Large-Scale Data Analysis. SIGMOD 2009
How do architectures affect the performance of cloud computing for database applications, especially for OLTP?
D. Kossmann, T. Kraska, et al. An Evaluation of Alternative Architectures for Transaction Processing in the Cloud. SIGMOD 2010
Benchmark of Big Data Management System
Conclusion
• Big Data Management: a hot DB topic
• Research topics: indexing, transaction, join, architecture, application, benchmark
References
• Sai Wu, Dawei Jiang, et al. Efficient B-tree Based Indexing for Cloud Data Processing. VLDB 2010
• David Chiu, A. Shetty, et al. Evaluating and Optimizing Indexing Schemes for a Cloud-based Elastic Key-Value Store. 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2011
• J. Wang, S. Wu, et al. Indexing Multi-dimensional Data in a Cloud System. SIGMOD 2010
• D. Kossmann, T. Kraska, et al. An Evaluation of Alternative Architectures for Transaction Processing in the Cloud. SIGMOD 2010
• T. Kraska, M. Hentschel, et al. Consistency Rationing in the Cloud: Pay only when it matters. VLDB 2009
• H. T. Vo, C. Chen, et al. Towards Elastic Transactional Cloud Storage with Range Query Support. VLDB 2010
• H. Kllapi, E. Sitaridi, et al. Schedule Optimization for Data Processing Flows on the Cloud. SIGMOD 2011
• M. K. Aguilera, W. Golab, et al. A Practical Scalable Distributed B-Tree. VLDB 2008
• E. Friedman, P. Pawlowski, et al. SQL/MapReduce: A Practical Approach to Self-describing, Polymorphic, and Parallelizable User-defined Functions. VLDB 2009
• R. Vernica, M. J. Carey, et al. Efficient Set-Similarity Joins Using MapReduce. SIGMOD 2010
• S. Blanas, J. M. Patel, et al. A Comparison of Join Algorithms for Log Processing in MapReduce. SIGMOD 2010
• D. Logothetis, K. Yocum. Ad-Hoc Data Processing in the Cloud. VLDB 2008
• B. Panda, J. S. Herbach, et al. PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce. VLDB 2009
• A. Okcan, M. Riedewald. Processing Theta-Joins using MapReduce. SIGMOD 2011
• K. Morton, M. Balazinska, et al. ParaTimer: A Progress Indicator for MapReduce DAGs. SIGMOD 2010
• Y. Cao, C. Chen, et al. ES2: A Cloud Data Storage System for Supporting Both OLTP and OLAP. ICDE 2011
• K. Morton, A. Friesen, et al. Estimating the Progress of MapReduce Pipelines. ICDE 2010
• W. Lang, J. M. Patel. Energy Management for MapReduce Clusters. VLDB 2010
• T. Nykiel, M. Potamias, et al. MRShare: Sharing Across Multiple Queries in MapReduce. VLDB 2010
• C. Olston, G. Chiou, et al. Nova: Continuous Pig/Hadoop Workflows. SIGMOD 2011
• Y. Lin, D. Agrawal, et al. Llama: Leveraging Columnar Storage for Scalable Join Processing in the MapReduce Framework. SIGMOD 2011
• B. Li, E. Mazur, et al. A Platform for Scalable One-Pass Analytics using MapReduce. SIGMOD 2011
• D. G. Campbell, G. Kakivaya, et al. Extreme Scale with Full SQL Language Support in Microsoft SQL Azure. SIGMOD 2010
• A. Abouzeid, K. B-Pawlikowski, et al. HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. VLDB 2009
• Y. Xu, P. Kostamaa, et al. Integrating Hadoop and Parallel DBMS. SIGMOD 2010
• J. A. Q-Ruiz, C. Pinkel, et al. RAFT at Work: Speeding-Up MapReduce Applications under Task and Node Failures. SIGMOD 2011
• A. Pavlo, E. Paulson, et al. A Comparison of Approaches to Large-Scale Data Analysis. SIGMOD 2009