Copyright © 2014, Oracle and/or its affiliates. All rights reserved. 1
Big Data: Myths & Realities
Oleksiy Razborshchuk
Distinguished Solution Architect
Oracle Canada ULC
May 21st, 2014
People. Process. Portfolio.
Agenda
Big Data
Oracle’s Big Data Solution and Differentiators
Use Cases and Implementation Examples
True or False?
What Makes Big Data BIG DATA?
• Volume: very large quantities of data
• Velocity: extremely fast streams of data
• Variety: wide range of datatype characteristics
• Value: high potential business value if harnessed
Example sources pictured: blogs, telematics, social
Challenge: Exploiting Synergies
Big Data. Big Architecture.
ACQUIRE → ORGANIZE → ANALYZE → DECIDE
Basics Of Hadoop
[Diagram: the Name Node keeps, in memory, a map of each file's pieces (File 1 Piece 1, Piece 2, Piece 3) to the Data Nodes that store them. The Job Tracker distributes the job JAR to a Task Tracker on each of the four Data Nodes, where the Map and Reduce tasks run.]
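The Name Node's in-memory bookkeeping can be illustrated with a small Python sketch. This is a toy model, not HDFS code: the class `NameNode`, its methods, the node names `dn1` to `dn4`, and the round-robin placement are all assumptions made for illustration (real HDFS placement is rack-aware). The replication factor of 3 mirrors the usual HDFS default.

```python
class NameNode:
    """Toy stand-in for the Name Node: keeps the piece-to-node map in memory."""

    def __init__(self, data_nodes, replication=3):
        if replication > len(data_nodes):
            raise ValueError("replication cannot exceed the number of data nodes")
        self.data_nodes = list(data_nodes)
        self.replication = replication
        self.block_map = {}  # (file name, piece number) -> list of data nodes
        self._next = 0       # index of the node that receives the next first replica

    def add_file(self, name, size, piece_size):
        """Split a file of `size` bytes into pieces and record where each replica lives."""
        pieces = -(-size // piece_size)  # ceiling division
        for piece in range(1, pieces + 1):
            # Hypothetical round-robin placement; real HDFS uses a rack-aware policy.
            nodes = [
                self.data_nodes[(self._next + r) % len(self.data_nodes)]
                for r in range(self.replication)
            ]
            self.block_map[(name, piece)] = nodes
            self._next = (self._next + 1) % len(self.data_nodes)
        return pieces

    def locate(self, name, piece):
        """Return the data nodes holding a given piece of a file."""
        return self.block_map[(name, piece)]


if __name__ == "__main__":
    name_node = NameNode(["dn1", "dn2", "dn3", "dn4"])
    count = name_node.add_file("File 1", size=300, piece_size=128)
    for piece in range(1, count + 1):
        print(f"File 1 Piece {piece}: {name_node.locate('File 1', piece)}")
```

Because every piece lives on several Data Nodes, losing one node does not lose data, and a Map task can be scheduled on any node that already holds the piece it needs.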
MapReduce Example
Input: Hello World Goodbye World
Map output <K,V>: <Hello,1> <World,1> <Goodbye,1> <World,1>
Grouped by key <K,V,V,V,V>: <World,1,1> <Hello,1> <Goodbye,1>
Reduce output: <Goodbye,1> <Hello,1> <World,2>
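The word-count flow above can be reproduced in a few lines of plain Python. This is a single-process sketch of the idea, not Hadoop API code; the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are illustrative assumptions.

```python
from collections import defaultdict


def map_phase(text):
    """Emit a <word, 1> pair for every word in the input."""
    return [(word, 1) for word in text.split()]


def shuffle_phase(pairs):
    """Group values by key, turning <K,V> pairs into <K,[V,V,...]>."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped):
    """Sum the values for each key and return the results sorted by key."""
    return sorted((key, sum(values)) for key, values in grouped.items())


def word_count(text):
    """Run the three phases in order on a single input string."""
    return reduce_phase(shuffle_phase(map_phase(text)))


if __name__ == "__main__":
    for word, count in word_count("Hello World Goodbye World"):
        print(f"<{word},{count}>")
```

In a real cluster the map and reduce functions run in parallel on different nodes and the framework performs the grouping step between them; here all three phases run sequentially in one process.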
Wrap Up
Hadoop Architecture
Cloudera Stack
Active Archive
Transformation and Processing
Self-Service Exploratory BI
Advanced Analytics
Enterprise Data Hub (EDH)
What Is a Big Data Environment?
VS &
Unified Data Analytics Environment
• Unified Analytics API: SQL, R, MR
• Unified Analytics Processing Platform: Hadoop, RDBMS
• Management Framework and Tools
Big Data in the Enterprise Information Architecture Strategy
Agenda
Big Data
Oracle’s Big Data Solution and Differentiators
Use Cases and Implementation Examples
Oracle Big Data Appliance
• Better TCO
• Faster Time to Value
• Optimized
Lower risk. Engineered to perform.
What Do We Mean by Commodity DIY?
Stack layers: Applications, Hadoop Distribution, OS, Compute & Storage, Networking
DIY components pictured: Red Hat / CentOS; CPU, RAM, Blade, Rack; Cisco
• Commodity DIY: different platform every time; 120+ separate parts; months from start to production
• Big Data Appliance: integrated, tuned, optimized, identical; 1 Big Data Appliance; unpack to production in days
18 © 2014 Oracle Corporation and CIBC – Proprietary and Confidential
Why Oracle Big Data Appliance vs. Commodity
With proof points on the following slides
• Designed and Engineered by Cloudera & Oracle (OEM)
• Big Data Best Practices already implemented
• Pre-Integrated, pre-optimized, and pre-tuned before arrival
• Comprehensive (all h/w, s/w, tools, integration labour)
• Manageability top-to-bottom
• Secure and hardened
• Shorter deployment and time to market
• Faster Performance
• Lower TCO
BDA TCO Beats Build Your Own Hadoop Cluster
[Chart: cost from $0 to $1,400,000 over Year 1 to Year 5 for Oracle BDA, HP+Cloudera, Cisco+Cloudera, Dell+Cloudera, and IBM+Cloudera]
Engineered by Cloudera and Oracle
Managed Distribution
– Components certified to work together and on Oracle Big Data Appliance in regular updates, on the same hardware/software stack as all our customers
Cloudera’s Hadoop Knowledge Engineered into the system
– Master service layout, settings for Hadoop parameters
– Optimized data block size, number of MapReduce slots
– InfiniBand fabric optimized
Enterprise Hadoop Features jointly developed
– Multi-Homing for Hadoop
– Highly Available NameNode Solution
– Tight integration between Oracle Enterprise Manager and Cloudera Manager
– Sentry security (invented by Cloudera and Oracle)
Engineered for Quicker Time and Lower Cost
Compared with a DIY Cluster
ESG believes that a "buy" versus "do-it-yourself" approach will yield roughly one-third faster time-to-market benefit improvement...
[…] nearly 40% cost savings versus IT architecting, designing, procuring, configuring, and implementing its own big data infrastructure.
Source: http://www.oracle.com/us/corporate/analystreports/industries/esg-big-data-wp-1914112.pdf
[Chart: Time to Market (Weeks), axis 0 to 30; Oracle Big Data Appliance vs. Build it yourself]
[Chart: Cost: Initial Infrastructure/Tasks, axis 0 to 800,000; Oracle Big Data Appliance vs. Build it yourself]
Engineered for Performance
Compared with a DIY Cluster
• Configured for exceptional performance on delivery
• 6x faster than custom 20-node Hadoop cluster for large batch transformation jobs
• Engineering done by Oracle and Cloudera:
– OS and File System Tuning
– Java Virtual Machine Tuning
– Hadoop Configuration and Setup
[Chart: Time (hours), axis 0 to 10; Big Data Appliance vs. DIY Hadoop Cluster; callout: 6x]
Enterprise-Grade Big Data
[Table comparing BDA 2.5 with DIY CDH 4.6 on the following features]
• Integrated Management Console
• Single Command, Full Stack Patching and Upgrade
• Automatic Cluster Re-Configuration
• Encryption and Auditing out-of-box
• Authentication, Access Control
• HA / DR
• Engineered by Cloudera for EDH
• Tuned and Optimized Performance (OS, Java, Hadoop, InfiniBand)
Oracle Unified Information Reference Architecture
Native integration between BDA and Exadata (like iPhone and iPad)
Stages: Stream, Acquire, Organize, Analyze, Decide
Components pictured:
• Oracle BI Foundation Suite
• Oracle Real-Time Decisions
• Endeca Information Discovery
• Oracle Event Processing
• Oracle Big Data Connectors
• Oracle Data Integrator
• Oracle Advanced Analytics
• Oracle Database
• Oracle OLAP, Spatial, Graph
• Apache Flume
• Oracle GoldenGate
• Oracle NoSQL Database
• Cloudera Hadoop
• Oracle R Distribution
• Oracle Coherence
Platforms: Oracle Big Data Appliance, Oracle Exadata
Big Data Connectors and Data Integrator
Big Data Appliance + Hadoop
Exadata + Oracle Data Warehouse
• 15 TB / hour
• 10x faster
Agenda
Big Data
Oracle’s Big Data Solution and Differentiators
Use Cases and Implementation Examples
Big Data Solutions for Financial Services
IT Optimization
• ETL and batch processing
• Extended Data Warehouse
• Mainframe offloading
• Active Archiving
Big Data Analytics
• Customer 360
• Omni-channel CX
• Cross-selling / Geo-fencing
• Payment Analytics
Business Process Transformation
• AML / Anti-Fraud
• Risk Management
• Pricing Management
• Compute Offload (VAR)
Customer 360 with NGData Lily
Oracle’s Big Data Value Added Partner
Individual Customer Behaviour Translated into Industry Specific KPIs
Customers include: [logos not reproduced]
Customer DNA:
• Socio-demo
• Life Time Events
• Mobility
• Affluence
• Social
• Affinity
• Lifestyle
• Competitor
• Segment
• Communication Preferences
• Communication History
• Customer Status
• Products
• Usage
• Customer Engagement
• CLTV
• Loyalty
• Customer Experience
Case Study: Lowering Costs by Simplifying IT Infrastructure
Objectives
• Comply with regulations requiring more data to support stress testing
• Reduce IT costs & streamline processing by eliminating duplicate data stores
Solution
• Single, reliable BDA/Exadata-based ODS supporting all downstream systems
• Landing zone & archival repository for both structured & unstructured data
• Use Exadata as “19th” BDA node
- Toyota Global Vision
Benefits
• Fast access to 85% more data
• Lower costs, simplified architecture and fast time to value
[Diagram: Operational Data Store. Sources: Mainframe, RDBMS, more. BDA: agile business model, all data, de-normalized & partial-normalized. Exadata: normalized, aggregate data, EDW. Tools: Oracle Enterprise Manager, Oracle Data Integrator. Data Delivery: Master S1, Master S2, Master Sn, SOA/API, CRMS, Other.]
3 Key Takeaways from this presentation
• Big Data is not just Hadoop
• Key BD use cases: Active Archive, Data Processing, BI Analytics
• Oracle+Cloudera = most complete & integrated solution in the industry