Instant Results with Infinite Storage
David Parker, VP SAP Big Data Platform
Dec 15, 2015
© 2013 SAP AG. All rights reserved. 2
Big Data Economics
Lots of Opportunity
Predictive incl. data mining and machine learning
“Unstructured” Analysis incl. text, media, spatial, etc.
Scalable Storage
Streaming Data incl. sensors, social and mobile
Urgent Need
Open Source Community Gift
Apache Hadoop
Predictive incl. data mining and machine learning
“Unstructured” Analysis incl. text, media, spatial, etc.
Scalable Storage
Streaming Data incl. sensors, social and mobile
Urgent Need
Apache Hadoop
► [Commons]
► [Projects]
► [Distributions]
► [Data Scientists]
The Big Data Phenomenon
Big Data is More Than Just Hadoop…
Business Trends
• Exploding data volumes
• Accelerating data velocity
• Increasing data variety
Technology Trends
• Storage / Memory / CPU advances
• Data Mining/Predictive analysis
• In-memory computing
• Hadoop & distributed MPP
• Complex event processing
“Enterprise” Big Data
SAP HANA + Hadoop
Real-Time Big Data for the Enterprise
Combine INSTANT Results with INFINITE Storage
Real-time, Infinite Storage, Instant Results
SAP HANA
• Modern in-memory platform
• Transact/analyze in real time
• Native predictive, text, and spatial algorithms
HADOOP
• Distributed disk platform
• Store infinite amounts of unstructured data
• NoSQL access
Example: Data Analysis of Cancer Genome
Goal: Analytics for Personalized Medicine
MKI Design Decisions to Improve Speed of Processing
Use Hadoop for Pre-Processing; SAP HANA for Advanced Analytics
“Genomic DNA analysis in real-time will transform how we enable comprehensive patient care to fight against cancer. SAP HANA will be the mission critical and reliable data platform to make real-time cancer analytics a reality. Separately, our internal technical comparison demonstrated that SAP HANA outperforms a traditional disk-based system by a factor of 408,000 when performing other types of data analysis.”
Yukihisa Kato, Director & Executive Officer, CTO, Research and Development Center, MITSUI KNOWLEDGE INDUSTRY CO., LTD.
Benefits
Accelerated predictive & correlation analysis with in-memory processing
Reduced time to detect variant DNA
Optimized treatment plans based on DNA mutations
408,000x faster than traditional disk-based systems in PoC
216x faster DNA analysis results - from 2-3 days to 20 minutes
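The 216x figure can be checked against the quoted runtimes. The short sketch below is only an arithmetic check, assuming that "2-3 days" means 48 to 72 hours of wall-clock time; the function name `speedup` is chosen here for illustration and does not come from the deck.

```python
MINUTES_PER_DAY = 24 * 60


def speedup(before_days: float, after_minutes: float) -> float:
    """Ratio of the old runtime to the new runtime."""
    return before_days * MINUTES_PER_DAY / after_minutes


# Runtimes quoted on the slide: 2-3 days before, 20 minutes after.
low = speedup(2, 20)
high = speedup(3, 20)
print(f"speedup range: {low:.0f}x to {high:.0f}x")
```

Under that assumption the quoted 216x corresponds to the 3-day end of the range; the 2-day end gives 144x.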
SAP HANA + HADOOP + R
SAP HANA + Hadoop for Advanced Analytics
Results: Deliver Personalized Results More Quickly
SAP HANA Platform
More than Just a Database
Converges database, data processing and application platform capabilities, and provides libraries for business functions, planning, predictive, text, and spatial analytics to enable business to operate in real time.
Supports any device
Any apps: any app server; SAP Business Suite and BW on the ABAP app server; other apps
Open connectivity: SQL, MDX, R, JSON
Transaction, Unstructured, Machine, HADOOP, Real-time, Locations
HANA Platform
SQL, SQLScript, JavaScript
Application & UI Services
Integration Services
Database Services
Stored Procedure & Data Models
Planning Engine, Rules Engine
Business Function Library
Predictive Analysis Library
Search, Text Mining
Spatial
SAP HANA Platform
Applications
Analytics
Landscape management
SAP HANA In Memory
Modeling & lifecycle management
Apache Hadoop Distributions
Roles, security, governance, compliance, audits
Consume
Store & Process
Ingest Replication Framework
Data Services
Transactional
Planning & Simulation
Graph Analytical
Machine Learning & Predictive
Native HANA Apps & Services
Spatial
Consume
Process
ESP, IM
Extended Storage (IQ)
Tiered Storage (Hot-warm-cold)
Smart Data Access
Text, Social Media Processing
Exploration, Dashboards, Reports, Charting, Visualization
SAP HANA Platform for Big Data
End-to-End Environment for the Enterprise
SAP HANA Platform for Big Data
Open Hadoop Strategy
Big Data Science Services
SAP HANA Platform
SAP Data Services
Data Connectors
Acquire
Accelerate
Analyze
Sybase IQ SAP HANA
Geospatial, Predictive, Text Analysis
Visualize and Act
Industry/LOB Apps, Custom Apps, Analytic Apps
SQL, XS Engine, R
Enterprise Scenarios: SAP HANA and Hadoop
4 Common Use Cases
Hadoop
Data storage (Hadoop Distributed File system)
Job Management
Computation Engine(s)
Hadoop as a flexible data store
Hadoop as a simple database
Hadoop as a processing engine
Hadoop for advanced analytics
Reference Data
Streaming Data
Enterprise Data
Transaction Data
Social Media
SAP HANA Platform
SAP Data Services
Data Connectors
Acquire
Accelerate
Sybase IQ SAP HANA
Analyze
Geospatial, Predictive, Text Analysis
SQL, XS Engine, R
Visualize and Act
Applications, Analytic Tools
Hive
SAP HANA
Execute Query
Query Federation
Split Query
Execute
Consolidate
Execute
Execute
New Technology: Smart Data Access
Rapid Analysis of Big Data without Data Movement
BI and analytics software from SAP
In-memory
Disk-based data warehouse (SAP Sybase IQ)
… and/or ...
SAP HANA
Analytic engine
Analytic engine
Hadoop
Data storage (Hadoop Distributed File system)
Job Management
Computation Engine(s)
Hive HBase …
Users
• Submit a query that accesses remote data like local data
• Leverage the full power of SAP HANA – analytics, predictive, text search, geospatial – during processing
• Synthesize enterprise data regardless of location, size, representation – without moving it
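The split / execute / consolidate flow described for query federation can be pictured with a toy model. The sketch below is not how SAP HANA implements Smart Data Access; it is a minimal illustration that assumes two in-memory lists stand in for a local table and a remote (for example Hive) table, and every name in it (`LOCAL_ORDERS`, `REMOTE_PRODUCTS`, `execute_local`, `execute_remote`, `federated_query`) is hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical stand-ins: a local in-memory table and a remote table.
LOCAL_ORDERS: List[dict] = [
    {"product_id": 1, "qty": 3},
    {"product_id": 2, "qty": 5},
    {"product_id": 1, "qty": 2},
]
REMOTE_PRODUCTS: List[dict] = [
    {"product_id": 1, "name": "widget"},
    {"product_id": 2, "name": "gadget"},
    {"product_id": 3, "name": "gizmo"},
]


def execute_local() -> Dict[int, int]:
    """Local sub-query: total quantity per product."""
    totals: Dict[int, int] = {}
    for row in LOCAL_ORDERS:
        totals[row["product_id"]] = totals.get(row["product_id"], 0) + row["qty"]
    return totals


def execute_remote(product_ids: Set[int]) -> List[dict]:
    """Remote sub-query: the filter runs at the source, so only
    matching rows travel back instead of the whole table."""
    return [r for r in REMOTE_PRODUCTS if r["product_id"] in product_ids]


def federated_query() -> List[dict]:
    """Split the work, execute both parts, then consolidate with a join."""
    totals = execute_local()
    products = execute_remote(set(totals))
    joined = [
        {"name": p["name"], "total_qty": totals[p["product_id"]]}
        for p in products
    ]
    return sorted(joined, key=lambda r: r["name"])


print(federated_query())
```

The point of the sketch is the shape of the flow: the caller sees one query, while the remote rows are filtered where they live and only the reduced result is joined with local data.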
SAP HANA Smart Data Access
How to Use
• Data Needs: Decide what remote data sources are needed for the application running on SAP HANA
• Sources: Define the remote data source with the appropriate security credentials
• References: Create virtual tables which reference the remote data source (table)
• Application: Write your application, using HANA tables and virtual tables. The query processor in HANA does the rest of the optimizations and data access.
Example: Smart Data Access
Steps For Creating and Using Virtual Tables
1. Create a table in Hive.
2. On SAP HANA, create a DSN, e.g. “hive1”.
3. With SAP HANA Studio or a DDL command, create a remote source:
   CREATE REMOTE SOURCE HIVE1 ADAPTER "hiveodbc" CONFIGURATION 'DSN=hive1'
   WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=dftest;password=dftest';
4. Using a DDL command, create a virtual table for Hive:
   CREATE VIRTUAL TABLE "HIVE1_PRODUCT" AT "HIVE1"."default"."default"."product";
5. Execute a query on the virtual table:
   SELECT * FROM HIVE1_PRODUCT;
6. Drop the remote source (CASCADE also drops the virtual tables that depend on it):
   DROP REMOTE SOURCE HIVE1 CASCADE;
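The statement sequence above can also be generated and submitted from a script. The following is a minimal sketch, not an SAP-provided tool: the helper names (`quote_ident`, `quote_literal`, `create_remote_source`, `create_virtual_table`, `run_all`) are hypothetical, and `run_all` assumes only that it is handed a cursor object with an `execute(sql)` method, as Python DB-API drivers provide. The source name, adapter, DSN, and credentials are the example values from the slide.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote an SQL string literal, escaping embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_remote_source(source: str, adapter: str, dsn: str,
                         user: str, password: str) -> str:
    """Build the CREATE REMOTE SOURCE statement from step 3."""
    config = quote_literal("DSN=" + dsn)
    credentials = quote_literal("user=" + user + ";password=" + password)
    return (
        f"CREATE REMOTE SOURCE {source} ADAPTER {quote_ident(adapter)} "
        f"CONFIGURATION {config} "
        f"WITH CREDENTIAL TYPE 'PASSWORD' USING {credentials}"
    )


def create_virtual_table(table: str, source: str, database: str,
                         schema: str, remote_table: str) -> str:
    """Build the CREATE VIRTUAL TABLE statement from step 4."""
    path = ".".join(quote_ident(p)
                    for p in (source, database, schema, remote_table))
    return f"CREATE VIRTUAL TABLE {quote_ident(table)} AT {path}"


def smart_data_access_statements() -> List[str]:
    """Steps 3 to 6 in order, using the example values from the slide."""
    return [
        create_remote_source("HIVE1", "hiveodbc", "hive1", "dftest", "dftest"),
        create_virtual_table("HIVE1_PRODUCT", "HIVE1",
                             "default", "default", "product"),
        "SELECT * FROM HIVE1_PRODUCT",
        "DROP REMOTE SOURCE HIVE1 CASCADE",
    ]


def run_all(cursor, statements: List[str]) -> None:
    """Send each statement to a cursor that exposes execute(sql)."""
    for statement in statements:
        cursor.execute(statement)


for statement in smart_data_access_statements():
    print(statement + ";")
```

Keeping the statement builders separate from `run_all` lets the generated SQL be reviewed or logged before anything is sent to a live system. In practice the credentials would come from a secure store rather than literals in the script.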
Amplify the Value of Big Data
Marry SAP HANA and Hadoop for real-time business results
SAP HANA Platform
• Unprecedented insights: combine structured and unstructured data for insights never seen before
• Advanced analytics: predictive, text, and spatial analytics
• Instant results: with in-memory processing
Apache Hadoop
• Data exploration / mining: uncover nuggets of information from large volumes of unknown data
• Infinite storage: offload or archive cold data
Get SAP HANA for Free
www.saphana.com/docs/DOC-369
Participate in SAP HANA Academy
www.saphana.com/community/hana-academy
• Developer Sandbox: available for hands-on training
• Projects: create HTML5 applications, create data marts, etc.
• Online Courses and Videos: SAP HANA Predictive Analytics Library, R Integration, SAP Lumira (visualizations), and more…
Join the Big Data Geek Challenge by SAP
www.sap.com/bigdata/challenge
How to Enter*:
1) Get SAP Lumira (for free)
2) Design a data visualization to analyze Big Data quickly
3) Make a video to present your use case
4) Submit your entry
5) Winners to be announced by December 2013
* Challenge only available to residents of the United States. See official entry rules for details.
SAP Startup Focus Program
www.saphana.com/community/learn/startups
SAP Lumira & SAP Predictive Analysis
SAP Startup Focus Program
SAP HANA + HADOOP
Instant Results + Infinite Storage
Visit Booth 311
Thank you
Contact information: David [email protected]
www.sap.com/bigdata
facebook.com/sapanalytics
twitter.com/#!/@sapinmemory