#tduniv Big Data A Real Life Revolution April 2014, Teradata Universe Manuel Sevilla Global CTO Business Information Management @msevillatweets
Jan 27, 2015
#tduniv
Big Data A Real Life Revolution
April 2014, Teradata Universe
Manuel Sevilla Global CTO
Business Information Management
@msevillatweets
Big Data & Analytics
Copyright © 2014 Capgemini. All rights reserved.
What is Big Data?
• 76 million smart meters in 2009… 200m by 2014
• 2+ billion people on the Web by end 2011
• 100s of millions of GPS-enabled devices sold annually
• 4.6 billion camera phones worldwide
• 30 billion RFID tags today (1.3B in 2005)
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• Many PBs of data every day
• 80% of the world's data is unstructured
The 3Vs: Volume, Variety & Velocity
The real 3V explanation
Volume: BIG DATA
• More and more devices
• Each device generates more and more data
Variety: ALL DATA
• Unstructured and structured data together
• This is about an interconnected world with many external partners working with unmodeled or lightly modeled data
Velocity: FAST DATA
• This is about the value of information and insights
• The value of data and insights decreases every minute!
In-memory has changed the game
An in-memory appliance: 40 x86 cores, 1TB of RAM, for only 100K EUR!
Moving a standard application from disk to in-memory (without redesign) means a performance increase of between 100 and 1000 times.
• 1 to 100 ratio: 2 minutes become about 1 second
• 1 to 1000 ratio: a 48-hour process may run in about 3 minutes!
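As a quick sanity check of the two ratios above, here is a minimal sketch that divides a runtime by an assumed speedup factor. The function name `accelerated_runtime` is hypothetical, and the 100x and 1000x factors are the slide's own claims rather than measurements.

```python
from datetime import timedelta

def accelerated_runtime(original: timedelta, speedup: float) -> timedelta:
    """Return the runtime after dividing by an assumed speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return original / speedup

# 1 to 100 ratio applied to a 2-minute job, result in seconds.
print(accelerated_runtime(timedelta(minutes=2), 100).total_seconds())  # 1.2

# 1 to 1000 ratio applied to a 48-hour job, result in minutes.
minutes = accelerated_runtime(timedelta(hours=48), 1000).total_seconds() / 60
print(round(minutes, 2))  # 2.88
```

The slide's "1 second" and "3 minutes" are therefore rounded figures: the exact results are 1.2 seconds and 2.88 minutes.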
5,760 cores for $3,000!
With two GK110 chips, TITAN Z is powered by a total of 5,760 processing cores, or 2,880 cores per GPU.
Analytical power is not a limit anymore!
From Predictive Asset Management…
• Dust from dirty wagons
• Dynamic forces from trains rubbing the ballast particles together
• Ballast choked with fines – poor drainage
• The stiffness of the underlying formation
• Voids under track
• Embankments can degrade through drying out and shrinkage, esp. in SE
• Environment: dead vegetation
• Strength of underlying earthwork
• Wet bed
to Predictive Human Healthcare
BI Appliances
• Expensive dedicated HW
• Built for performance
• Designed for high volumes (e.g. 10s of TB)
• High availability (limited due to price pressure)
• Initially developed using relational databases
• Very mature solutions (skills, SW, HW, administration)
• Designed for modelled and structured data
• Business As Usual ways to design, build and deliver
• Teradata, Exadata, Netezza, HANA...
Hadoop
• Commodity PCs
• Built for extreme scalability (batch oriented)
• Designed for extreme volumes (10s of PB and more)
• Very high availability
• Initially developed for web ranking
• Not as mature
• Hadoop = Data is distributed over many machines
• MapReduce = Computing is distributed and executed where data is (grid solution)
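The "computing is executed where data is" idea can be illustrated with a minimal single-process sketch in plain Python. This is not the Hadoop API: the `partitions` list stands in for data blocks held on separate machines, and the names `map_partition` and `reduce_counts` are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical data blocks, standing in for file blocks stored on three machines.
partitions = [
    ["big data", "fast data"],
    ["all data"],
    ["big volume", "big variety"],
]

def map_partition(lines):
    """Map step (with local combining): count words where the data lives."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right

# On a real cluster each call would run on the node holding that partition.
partial_counts = [map_partition(p) for p in partitions]
total = reduce(reduce_counts, partial_counts, Counter())

print(total["data"])  # 3
print(total["big"])   # 3
```

Only the small partial counts travel between machines, not the raw lines, which is why this model suits very large, batch-oriented workloads.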
Hadoop is first of all an extremely sustainable solution
[Diagram: the Hadoop ecosystem]
• Treatment: Key, Object, SQL, Database, Search, In memory
• Engines and tools: Impala, Spark, HBase, MapReduce, HAWQ, SolR, SAS HPA, HIVE, PIG, JAQL, R support, ETL…, HCatalog, Stinger
• Connectors: Flume, Storm, Sqoop
• Storage: Hadoop Distributed File System
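IT world was split into 2 categories
[Chart: data volume against event response time]
• Volume axis: GB, TB, PB for historical data; KB/s, MB/s, GB/s for streaming data (events)
• Events response time axis: Day, Hour, Min, Sec, SubSec
• Categories plotted: Data Warehouses, OLTP Databases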
Big Data is about new Solutions, new opportunities
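[Chart: data volume against event response time]
• Volume axis: GB, TB, PB for historical data; KB/s, MB/s, GB/s for streaming data (events)
• Events response time axis: Day, Hour, Min, Sec, SubSec
• Categories plotted: Data Warehouses, OLTP Databases, Event Processing Tools, Hadoop, In-memory Databases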
What is really innovative behind the Big Data buzzword?
• Hadoop & NoSQL: low-cost ways to store, manage and analyze massive volumes of data
• Cloud: ability to rent, reduce infrastructure cost, reduce time to market
• Event processing tools: ability to analyze and detect trends in real-time streaming events (monitoring, Next Best Action, fraud...)
• In-memory technologies: a new way to guarantee response time even for very complex calculations
• Explosion of analytics usage: R is open source, and high-performance statistics make analytics usable by every process that needs them
• Multiple external data sources: easier to tackle new data sources, e.g. social media, traffic, GPS sensors, open data
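To make the event processing item in this list more concrete, here is a minimal sketch of a sliding-window detector that flags a burst of events, the kind of check a fraud monitor might run. It is plain Python rather than any vendor's event processing product, and the class name, window size, threshold and sample timestamps are all hypothetical.

```python
from collections import deque

class SlidingWindowDetector:
    """Flag a burst when too many events fall inside a moving time window."""

    def __init__(self, window_seconds: float, threshold: int):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._timestamps = deque()

    def observe(self, timestamp: float) -> bool:
        """Record one event; return True if the window now exceeds the threshold."""
        self._timestamps.append(timestamp)
        # Drop events that have left the window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold

detector = SlidingWindowDetector(window_seconds=60, threshold=3)
# Hypothetical payment timestamps (in seconds) for one customer.
events = [0, 10, 200, 205, 210, 215, 400]
alerts = [t for t in events if detector.observe(t)]
print(alerts)  # [215]
```

Each event is examined once as it arrives, so the answer is available within the response time of the stream rather than after a batch run.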
Multiple internal views – consistently compromised
[Diagram: how each internal view gets its data]
• Views: Corporate, Ad-hoc, LOB management, Operations, Market
• Stores: EDW, ODS, LOB mart, Spreadsheets
• Sources: Line of business transactional systems (CRM, ERP, PLM), Web
• Compromises between views: Fit, Detail, Freshness, Fidelity
The Business Data Lake Solution
[Diagram: the Business Data Lake]
• HDFS: load everything, keep the history
• Distill and structure: business-driven views (Country Sales, Asian Marketing campaign, Operations data mart)
• Sources: Transactional systems (CRM, PLM, ERP), Meters, Grid, Web, Social Media, Market, Supplier
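The flow on this slide (load everything, keep the history, then distill business-driven views) can be sketched in a few lines. This is a toy in-memory stand-in, not HDFS or any Capgemini product; the `DataLake` class, its methods and the sample records are hypothetical.

```python
from collections import defaultdict

class DataLake:
    """Toy in-memory stand-in for the lake: raw records are only ever appended."""

    def __init__(self):
        self._raw = []  # full history, never updated in place

    def load(self, source: str, record: dict) -> None:
        """Load everything: store the record as-is, tagged with its source."""
        self._raw.append({"source": source, **record})

    def distill(self, predicate, key: str, value: str) -> dict:
        """Distill a business view: filter raw records, then sum `value` per `key`."""
        view = defaultdict(float)
        for rec in self._raw:
            if predicate(rec):
                view[rec[key]] += rec[value]
        return dict(view)

lake = DataLake()
lake.load("ERP", {"country": "FR", "amount": 100.0})
lake.load("ERP", {"country": "DE", "amount": 50.0})
lake.load("Web", {"country": "FR", "amount": 25.0})
lake.load("Meters", {"meter_id": "m1", "kwh": 3.2})

# A hypothetical "Country Sales" view built only from the sources relevant to it.
country_sales = lake.distill(lambda r: r["source"] in {"ERP", "Web"}, "country", "amount")
print(country_sales)  # {'FR': 125.0, 'DE': 50.0}
```

Because the raw records are kept untouched, a new view (for example one over the meter readings) can be distilled later without reloading anything.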
You only need to be a few percentage points better than your competitors
The New Analytics Organization
PoC Analysts teams
• A Proof of Concept (PoC) is usually delivered in months
• Fail-fast methodology!
• This has to be considered as Research & Development: discovering new patterns, new ways to reduce cost, improve customer experience, work on Predictive Asset Maintenance, Fraud Detection
• Massive use of Cloud to reduce time to market
• CAPEX vs OPEX, as-a-service mode
Fast Analysts teams
• Very close to business and IT, often a mixed team (internal + suppliers)
• Dedicated to delivering new insights fast
• Full access to many company data sources
• Deep knowledge of IT data sources and modeling
• Deliver temporary solutions, planned to be used as prototypes for industrialization
Data monetization : new revenue?
Data Privacy: 3 levels
Level 1: Law & Regulation
• May be at federal, country or state level
• May concern usage, storage, Information Lifecycle Management
• May include legal requisitions (as with telco companies)
Level 2: Company level
• This is how you want, and do not want, to use all this information!
• May be defined per BU, country…
• May include partners / suppliers
Level 3: Personal level
• Every customer wants to be able to accept or reject any information usage
• This may be linked to a contract, a checkbox, a call, a spam declaration…
• This is not business as usual
• You need to give your customers the ability to define their own data access and usage wishes (authorizations)!
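One way to read the three levels is as successive filters: a given use of customer data is acceptable only if the law, the company and the customer all allow it. The sketch below illustrates that reading. The `PrivacyPolicy` class, the purpose names and the opt-in default (no recorded consent means no usage) are assumptions made for illustration, not something the presentation prescribes.

```python
from dataclasses import dataclass, field

@dataclass
class PrivacyPolicy:
    """Hypothetical three-level policy: each level lists the purposes it allows."""
    legal_allowed: set = field(default_factory=set)        # Level 1: law & regulation
    company_allowed: set = field(default_factory=set)      # Level 2: company rules
    customer_consent: dict = field(default_factory=dict)   # Level 3: per-customer choices

    def may_use(self, customer_id: str, purpose: str) -> bool:
        """A usage is permitted only if all three levels accept it."""
        return (
            purpose in self.legal_allowed
            and purpose in self.company_allowed
            and purpose in self.customer_consent.get(customer_id, set())
        )

policy = PrivacyPolicy(
    legal_allowed={"billing", "marketing"},
    company_allowed={"billing", "marketing"},
    customer_consent={"alice": {"billing"}, "bob": {"billing", "marketing"}},
)
print(policy.may_use("alice", "marketing"))  # False: Alice did not opt in
print(policy.may_use("bob", "marketing"))    # True
print(policy.may_use("carol", "billing"))    # False: no recorded consent
```

Keeping the third level as per-customer data rather than a global switch is what lets each customer define their own authorizations.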
Our Big Data Service Center Framework
Delivery management
Technical coordination
Service Center Governance and PMO
Functional Coordination
Lean / Continuous Improvement / Best Practices
Functional & Technical expertise
Factories governed by people, process and technology
POC Design Build RUN
Transformation – Big Data Business Cases
New Big Data development
Data usage (API, Analytics…)
Adding new data sources
Demand management
Resource management
Quality and performance management (Progress and quality)
Configuration and release management
Infrastructure architecture
Sandbox
Link with legacy
Datacenter (countries)
Privacy management
Business Analysis
Datalab
Industrialization
GO Live
Innovation
Big Data Governance
Design authority (Big Data architecture/roadmap)
BI vs Big Data governance
Change management
Prioritization and sponsorship
Transversal Delivery
Strategy & Innovation
Infrastructure
Data Scientists
Analytics
Privacy & ILM
VM Package
Network Management
Architecture
Product / Solution choice
Delivery Model selection
Information Lifecycle Management
Support and Administration
Analytics as a Service
SMAC is dead. Long live SMAC!
SMAC | New SMAC
Social | Security
Mobility | Movement
Analytics | Artificial Intelligence
Cloud | Creativity
Capgemini Global BIM Service Line Driving innovation with its technology partners
Capgemini's global reach, with operations in 44 countries and a focus on BIM with over 9,600 BIM practitioners.
A uniquely integrated approach to Information Strategy based around the Capgemini "Intelligence Enterprise".
Deep industry sector knowledge supported by sector-specific BIM offerings.
Capgemini's best-in-class Rightshore® capability for development and management of BIM, with 4,000 BIM experts in the India CoE.
An unmatched (and vendor-independent) depth of technology experience. Capgemini works with all the major BI software vendors to deliver solutions appropriate to the customer's needs.
850+ M EUR revenue in 2013
Europe: Austria, Finland, France, Italy, Germany, Norway, Netherlands, Poland, Spain, Sweden, Switzerland, UK
Other countries: South Africa, Argentina, Brazil, Mexico, United States, Canada, Saudi Arabia, India, Australia, China, Morocco
Capgemini Group focuses on Big Data
Capgemini Group's focus on Big Data reaches the top executive level: its CEO, Paul Hermelin, supports the French Ministry of Industry on the Big Data project (34 French industrial projects).
www.redressement-productif.gouv.fr/files/la-nouvelle-france-industrielle.pdf#page=27
Paul Hermelin, Capgemini Group CEO, live on BFM Business® on 11/10/13, presenting the French government's Big Data project for industry.
Contact
Manuel Sevilla Global CTO
Business Information Management
@msevillatweets
#tduniv
The information contained in this presentation is proprietary.
Copyright ©2014 Capgemini. All rights reserved.
Rightshore® is a trademark belonging to Capgemini.
www.capgemini.com/bim
About Capgemini
With more than 130,000 people in over 40 countries, Capgemini
is one of the world's foremost providers of consulting, technology
and outsourcing services. The Group reported 2013 global
revenues of EUR 10.1 billion.
Together with its clients, Capgemini creates and delivers
business and technology solutions that fit their needs and drive
the results they want. A deeply multicultural organization,
Capgemini has developed its own way of working, the
Collaborative Business Experience™, and draws on Rightshore®,
its worldwide delivery model.