Big Data Hadoop – Hands On Workshop. Data Processing Solutions – Comparison Guide. Big Data Workshop Series. Danairat T.
Big data Hadoop Analytic and Data warehouse comparison guide
[Figure: Core Hadoop vs. Traditional Data Warehouse. Core Hadoop (cloud) has two steps, Data Inputs and Results. The traditional data warehouse has six steps from Data Inputs to Results, across the Data Source, Data Staging, Data Warehouse, Data Mart, Cube, and Analytic Results layers.]
Big Data Hadoop
Solution 1. Core Hadoop processing
No data staging transformation and no data movement required.
Flow (Cloud Ready): 1. Data Inputs, 2. Analytic Results
Top Benefits
1. Cloud- and IoT-ready architecture roadmap
2. No data duplication, which reduces data storage cost
3. Fast data processing, with fault tolerance built into all processing
4. Aligns with a unified data architecture and data governance
5. Fewer data processing steps than a traditional DWH
The Effort Investment
1. Learn core Hadoop
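To make "core Hadoop processing" concrete, the sketch below simulates the map, shuffle, and reduce phases of the MapReduce model on raw text lines, in plain Python and on one machine. It is an illustration of the processing model only, not the Hadoop API; the function names and the sample lines are made up for this example.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(lines: Iterable[str]) -> Iterator[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in the raw input lines."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group values by key, as the framework does between map and reduce."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: dict[str, list[int]]) -> dict[str, int]:
    """Sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run the three phases in sequence on the raw input."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical raw input; no staging or format conversion happens first.
    raw_input_lines = ["sensor a ok", "sensor b fail", "sensor a ok"]
    print(word_count(raw_input_lines))
```

On a real cluster the map and reduce steps would run in parallel across nodes, with the framework handling the shuffle and the retry of failed tasks.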
Solution 2. Using BI Tools to analyze Hadoop data
Requires a single transformation to CSV raw text, stored in Hadoop HDFS, so that BI tools can connect to it and present visualizations.
Flow (Cloud Ready): 1. Data Inputs, 2. Hadoop HDFS (CSV raw text), 3. Results
Top Benefits
1. Lower cost with a cloud- and IoT-ready architecture
2. Fast data processing, with fault tolerance built into all processing
3. Fewer data processing steps than a traditional DWH
The Effort Investment
1. Learn Hadoop
2. Requires transformation to CSV raw text for BI tools
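The single transformation to CSV raw text could look roughly like the following sketch, which uses Python's standard `csv` module to turn a list of records into CSV text. The record fields (`device`, `reading`, `note`) and the function name are assumptions made for illustration; the slides do not specify the input format.

```python
import csv
import io
from typing import Iterable, Mapping, Sequence


def records_to_csv(records: Iterable[Mapping[str, object]], columns: Sequence[str]) -> str:
    """Serialize records to CSV text with a header row and a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(columns), lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing fields become empty strings; fields not in `columns` are dropped.
        writer.writerow({name: record.get(name, "") for name in columns})
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample records.
    sample = [
        {"device": "pump-1", "reading": 20.5, "note": "ok, stable"},
        {"device": "pump-2", "reading": 19.0},
    ]
    print(records_to_csv(sample, ["device", "reading", "note"]), end="")
```

The resulting text can be written to a file and copied into HDFS, for example with `hdfs dfs -put`, so that a BI tool can read it from there.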
Solution 3. Creating data warehouse in Hadoop
Requires a single transformation, plus a DWH set up on Hadoop for BI tools.
Flow (Cloud Ready): 1. Data Inputs, 2. Hadoop HDFS, 3. Hadoop DWH (Hive, or Cassandra, HBase), 4. Results
Top Benefits
1. Lower cost with a cloud- and IoT-ready architecture
2. Fast data processing, with fault tolerance built into all processing
3. Fewer data processing steps than a traditional DWH
The Effort Investment
1. Learn core Hadoop
2. Requires transformation to CSV raw text for BI tools
3. Requires setting up a DWH on Hadoop (Hive, Cassandra, HBase)
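One way to set up the Hive variant of this solution is an external table defined over the CSV files already stored in HDFS. The sketch below only builds the HiveQL statement as a string; the table name, column list, and HDFS directory are hypothetical, and it assumes simple comma-delimited files with one header line and no quoted fields.

```python
from __future__ import annotations


def hive_external_table_ddl(table: str, columns: list[tuple[str, str]], hdfs_dir: str) -> str:
    """Build a HiveQL CREATE EXTERNAL TABLE statement over comma-delimited text files."""
    column_sql = ",\n".join(f"  `{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n"
        f"{column_sql}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        "FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_dir}'\n"
        # Skip the CSV header row when the table is queried.
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


if __name__ == "__main__":
    # Hypothetical table layout and HDFS path.
    print(hive_external_table_ddl(
        "readings",
        [("device", "STRING"), ("reading", "DOUBLE"), ("note", "STRING")],
        "/data/readings",
    ))
```

Because the table is external, Hive reads the files in place and dropping the table leaves the data in HDFS, which fits the "no data duplication" goal stated for the Hadoop-based solutions.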
Solution 4. Implementing traditional data warehouse
[Figure: traditional data warehouse flow in six steps from Data Inputs to Results, across the Data Source, Data Staging, Data Warehouse, Data Mart, Cube, and Analytic Results layers. Annotation: the more the data grows, the slower the data processing.]
Top Concerns from Traditional Data Warehouse Architecture
1. Heavy data duplication drives up data storage cost
2. Very slow data processing, and a job must be restarted or rolled back if any step fails
3. Data security issues from keeping too many copies of the data in various formats
Benefits Comparison Summary
Benefit criteria (columns):
A. Cloud-ready architecture
B. Built-in parallel processing
C. IoT architecture roadmap
D. Without DB cube investment
E. Without data mart investment
F. Without DWH investment
G. Without staging data (raw text)
H. Unstructured and raw source content processing
Solution                                       A       B       C       D       E       F       G       H
1. Core Hadoop                                 Yes     Yes     Yes     Yes     Yes     Yes     Yes     Yes
2. Hadoop and Pentaho/Power BI                 Yes     Yes     Yes     Yes     Yes     Yes     No (1)  No (1)
3. Hadoop and Cognos, RapidMiner, BO, Tableau  Yes     Yes     Yes     Yes     Yes     No (2)  No (2)  No (2)
4. Traditional Data Warehouse                  No      No      No      No      No      No      No      No
(1) Requires CSV.
(2) Requires Hive connector.
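The comparison above can also be held as a small lookup structure, which makes it easy to ask which solutions satisfy a given set of criteria. This is only a sketch: the Yes/No values are copied from the summary, while the short criterion keys and the function name are made up for this example.

```python
from __future__ import annotations

# Short keys for the eight benefit criteria, in the summary's column order.
CRITERIA = [
    "cloud_ready", "parallel_processing", "iot_roadmap", "no_cube",
    "no_data_mart", "no_dwh", "no_staging", "raw_content",
]

# True = "Yes", False = "No" (including "No, requires CSV / Hive connector").
SUMMARY = {
    "1. Core Hadoop": [True] * 8,
    "2. Hadoop and Pentaho/Power BI": [True] * 6 + [False] * 2,
    "3. Hadoop and Cognos, RapidMiner, BO, Tableau": [True] * 5 + [False] * 3,
    "4. Traditional Data Warehouse": [False] * 8,
}


def solutions_meeting(required: list[str]) -> list[str]:
    """Return the solutions marked Yes for every required criterion."""
    indexes = [CRITERIA.index(name) for name in required]
    return [
        solution
        for solution, flags in SUMMARY.items()
        if all(flags[i] for i in indexes)
    ]


if __name__ == "__main__":
    print(solutions_meeting(["cloud_ready", "no_dwh"]))
```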
Appendix
Pentaho supports Big Data Inputs
Power BI supports Big Data Inputs
Tableau supports Big Data Inputs
RapidMiner supports Big Data Inputs
Hadoop Cluster Installation and Excel Parser Processing
Clone the Hadoop master node to slave1 and slave2 (nodes: master, slave1, slave2)
At the master node: edit the hosts file
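The slide shows this step only as a screenshot. As a rough sketch, assuming the host file is `/etc/hosts` on a Linux node, the entries for the three nodes could be generated as below. The IP addresses are placeholders from the 192.0.2.0/24 documentation range, not the workshop's real addresses, and the function name is made up.

```python
from __future__ import annotations

import ipaddress


def hosts_entries(nodes: dict[str, str]) -> str:
    """Render one 'IP<TAB>hostname' line per node, validating each address."""
    lines = []
    for hostname, address in nodes.items():
        ipaddress.ip_address(address)  # raises ValueError on a malformed address
        lines.append(f"{address}\t{hostname}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder addresses; replace with the cluster's real ones.
    cluster = {
        "master": "192.0.2.10",
        "slave1": "192.0.2.11",
        "slave2": "192.0.2.12",
    }
    print(hosts_entries(cluster), end="")
```

The printed lines would be appended to the host file so that the nodes can reach each other by name.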
At the master node: copy the key file to slave1 and slave2
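The slide does not say which tool copies the key. A common choice is `ssh-copy-id`, which appends a public key to the remote user's authorized keys; the sketch below only builds and prints the commands and does not run them. The `hadoop` user name and the key path are assumptions for illustration.

```python
from __future__ import annotations

import shlex


def key_copy_commands(user: str, hosts: list[str], public_key: str) -> list[list[str]]:
    """Build one ssh-copy-id command (as an argument list) per target host."""
    return [["ssh-copy-id", "-i", public_key, f"{user}@{host}"] for host in hosts]


if __name__ == "__main__":
    # Hypothetical user and key path; the host names come from the slides.
    commands = key_copy_commands(
        "hadoop", ["slave1", "slave2"], "/home/hadoop/.ssh/id_rsa.pub"
    )
    for command in commands:
        print(shlex.join(command))
```

Keeping each command as an argument list means it could later be passed to `subprocess.run` without going through a shell.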