Hadoop, Big Data, and the Future of the Enterprise Data Warehouse
Post on 26-May-2015
1
Hosted by Barry Thompson,
Founder & CTO of Tervela
What We’ll Discuss Today…
• How is the role of the data warehouse changing in the face of big data?
• How are Hadoop and other big data technologies coexisting with traditional data warehouses?
• What happens when we have multiple big data sources (and multiple versions of the truth)?
• How do I use replication, data loading, cloud integration, and other technologies during this transition period?
2
About the presenter...
• Barry Thompson
• Founder and CTO of Tervela
• Visionary with 20 years of experience
• Background in transformative technologies (robotics, imaging & traditional enterprise)
• Technology leadership for AIG, NatWest and UBS
• X-Prize board of trustees
3
Data Complexity Exploding
4
End Point Proliferation
• 30 million networked sensor nodes with 30% annual growth
• 5 billion mobile phones in use in 2010
• 40 billion devices connected to the Internet by the end of the decade
Data Explosion
• 30 billion pieces of content shared on Facebook every month
• 64 exabytes of data moved around the Internet per month by the end of the decade
Regulatory Requirements
• Dodd-Frank
• Basel III
• Consumer Protection
• HIPAA
Global Distribution
• 78% of North America is on the Internet
• 58% of Europe is on the Internet
• 23% of Asia is on the Internet
More Data, By More People and Apps, In More Places, Faster
It Should Be Easy
5
• Operational Data Stores: Transactional Data
• Traditional Data Warehouses: Structured Analysis
• Hadoop Map-Reduce: Unstructured Analysis
But It’s Not
6
• Operational Data Stores: Transactional Data
• Traditional Data Warehouses: Structured Analysis
• Hadoop Map-Reduce: Unstructured Analysis
• NoSQL
• Real-Time Decision Support
• Real-Time Operations
• Real-Time Analytics
• ETL Replacement
What’s Driving This Activity?
7
Accessibility of Big Data Streams
Multi-Format, Multi-Type
Inconsistent Ingest Rates
Scaling Across Geographies
Explosion in Real-Time Analytics
A Question For You…
8
What is the relationship at your company between Hadoop and your corporate data warehouse?
1) We have an integrated Hadoop - Data Warehouse strategy
2) We aren't sure how Hadoop should fit with our warehouse
3) There's no interaction between Hadoop and our Data Warehouse
4) We aren't running Hadoop
5) I don’t know
Warehousing… The Old Way
9
• Sources: Operational Data Store (Database), Data Feeds & Web Services, Flat Files
• Pipeline: ETL, Data Warehouse, Data Marts
• Consumers: Business Report, Analytic App
Problems:
• Slows down data availability
• Single location, single point of failure
• Inflexible data formats ("I don’t fit")
The New Warehouse Paradigm
10
• Sources: Operational Data Store (Database), Data Feeds & Web Services, Flat Files
• Data Fabric
• Stores: ETL, Data Warehouse, Backup Warehouse, Hadoop, Data Marts
• Consumers: Business User, Analytic Apps, Real-Time Console
Benefits:
• Real-time apps get immediate access to data
• The right format, the right processing
• DR & Backup for Big Data
What is a Data Fabric?
11
Tervela Data Fabric: Software, Hardware Appliances or Cloud Services
Features
• Data Capture
• Data Movement
• Data Availability
• Data Protection
• Data Management
Requirements
• High performance
• No loss
• Centralized Management & Visibility
• Ease of integration
• Five 9’s of reliability
Data Sources: apps & SOA, file systems, DBs, ODS/clusters, clouds
Data Stores: clouds, warehouses, analytics
High-Performance & Parallel Loading
12
• Guaranteed delivery of data into multiple systems
• Buffered and streamed to deal with slow consumers
• Efficient multi-casting avoids excessive network traffic
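To make the buffering idea concrete, here is a minimal Python sketch of fanning one stream out to several consumers, each behind its own bounded buffer. It is an illustration only, not Tervela's API: the `fan_out` function, the `buffer_size` parameter, and the list-based sinks are all hypothetical names chosen for this example.

```python
import queue
import threading

def fan_out(records, sinks, buffer_size=100):
    """Deliver every record to every sink, each through its own bounded buffer.

    Hypothetical sketch: a sink is any callable that accepts one record.
    """
    buffers = [queue.Queue(maxsize=buffer_size) for _ in sinks]
    done = object()  # sentinel marking end of stream

    def drain(buf, sink):
        # One worker per sink, so a slow sink only delays its own buffer
        # until that buffer fills up.
        while True:
            item = buf.get()
            if item is done:
                break
            sink(item)

    workers = [threading.Thread(target=drain, args=(b, s))
               for b, s in zip(buffers, sinks)]
    for w in workers:
        w.start()
    for record in records:
        for buf in buffers:
            buf.put(record)  # blocks (backpressure) only when this buffer is full
    for buf in buffers:
        buf.put(done)
    for w in workers:
        w.join()

# Usage: load the same five records into two stand-in targets.
warehouse, hadoop = [], []
fan_out(range(5), [warehouse.append, hadoop.append])
```

Blocking the producer when a buffer fills is one way to avoid dropping records; a production system would more likely spill to disk or apply flow control per consumer.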
Real-Time Analytics
13
• Streaming avoids bottlenecks in ETL or warehousing
• Delivers the right format for your analytic system
• Best way to handle the explosion of analytic apps
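The contrast with batch ETL can be sketched in a few lines of Python: instead of waiting for a load to finish, an analytic result is updated as each event arrives. The `running_totals` function and the sample event tuples are assumptions made up for this illustration.

```python
from collections import defaultdict

def running_totals(events):
    """Yield the updated per-key total after each (key, amount) event."""
    totals = defaultdict(float)
    for key, amount in events:
        totals[key] += amount
        # The consumer sees a fresh value immediately, with no batch window.
        yield key, totals[key]

# Hypothetical event stream: (category, amount) pairs.
events = [("fx", 10.0), ("equities", 5.0), ("fx", 2.5)]
snapshots = list(running_totals(events))
```

Each element of `snapshots` is the state an analytic app would see right after the corresponding event, rather than after the next scheduled load.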
Cloud Integration
14
• Buffering simplifies big data transfer over slow WANs
• Stream data between cloud apps without temp storage
• Bridge your cloud apps with on-premise systems
Global Data Synchronization
15
• Backup heterogeneous Big Data over unreliable WANs
• Create active-active configuration for DR & scale
• Geographic distribution for better local performance
Big Data Replication
16
• 10-100x faster than existing / native replication over WAN
• Multi-cast replication saves bandwidth
• Local data improves performance
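The bandwidth claim for multi-cast can be illustrated with simple arithmetic. The sketch below is a deliberately simplified model, assuming no protocol overhead or retransmissions; the function name and parameters are hypothetical.

```python
def source_bytes_sent(payload_bytes, replicas, multicast):
    """Bytes the source must send for one payload to reach all replicas.

    Simplified model: unicast sends one copy per replica, multicast sends
    a single copy that the network or fabric duplicates downstream.
    """
    copies = 1 if multicast else replicas
    return payload_bytes * copies

# Replicating 1 GB to 4 sites.
unicast_cost = source_bytes_sent(10**9, 4, multicast=False)
multicast_cost = source_bytes_sent(10**9, 4, multicast=True)
```

Under these assumptions the source uplink carries 4 GB with unicast and 1 GB with multicast, so the saving grows linearly with the number of replicas.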
For More Information
• @tervela
• barry@tervela.com
• www.tervela.com
17
Request a trial: http://tervela.com/download
Read some case studies: http://tervela.com/customers
Thank you!
18