MODERNISE YOUR EDW – DATA LAKE
CHARLES SEVIOR, CTO EMERGING TECHNOLOGY DIVISION
© Copyright 2016 EMC Corporation. All rights reserved.

Charles Sevior, EMC, Presentation at The Chief Data & Analytics Officer Forum, Singapore

Feb 08, 2017


Data & Analytics

Corinium Global
Transcript
Page 1

MODERNISE YOUR EDW – DATA LAKE
CHARLES SEVIOR, CTO EMERGING TECHNOLOGY DIVISION

Page 2

ALL ORGANISATIONS ARE ON A JOURNEY TO…

• 1000X MORE DATA
• REAL TIME OPERATION
• ANALYTIC INSIGHTS
• PERSONALISATION & ENHANCED SERVICES

Page 3

THE JOURNEY TO DIGITAL BREAKS TRADITIONAL IT INFRASTRUCTURE

[Chart. Labels: Gartner IT Budget Growth; TRADITIONAL DATA; NEW DATA SOURCES]

NEW DATA SOURCES
• Clickstream
• Geolocation
• Web Data
• Internet of Things
• Docs, emails
• Server logs

Page 4

CHALLENGES WITH ENTERPRISE DATA WAREHOUSES

1. Expensive storage – 70% of data in a typical EDW is unused
2. Expensive processing – On average 55% of EDW CPU utilisation is low value ETL
3. Expensive licensing…
4. New data sources – Traditional systems are unable to capture and use new data sources, such as unstructured or semi-structured data

Page 5

COST DRIVERS

ENTERPRISE DATA WAREHOUSE
• OPERATIONS 50%, ANALYTICS 20%, ETL/ELT 30%
• COLD DATA 70%, HOT DATA 30%

HADOOP WITH ENTERPRISE GRADE STORAGE SOLUTION
• ETL/ELT OFFLOAD
• ACTIVE ARCHIVE

Cost Comparison
• ENTERPRISE DATA WAREHOUSE: > $16 K per TB
• HADOOP WITH ENTERPRISE GRADE STORAGE SOLUTION: < $1 K per TB
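The Cost Comparison figures above can be combined into a rough blended cost per TB. The sketch below is illustrative only: the slide gives bounds ("> $16 K" and "< $1 K"), so the boundary values are used as stand-ins, and it assumes that all of the 70% cold data could move to the cheaper tier. The function and variable names are hypothetical and do not come from the deck.

```python
def blended_cost_per_tb(cold_pct, edw_cost_per_tb, lake_cost_per_tb):
    """Average cost per TB when the cold share sits on the cheaper tier.

    cold_pct is an integer percentage (0-100) so the arithmetic stays exact.
    """
    if not 0 <= cold_pct <= 100:
        raise ValueError("cold_pct must be between 0 and 100")
    hot_pct = 100 - cold_pct
    return (hot_pct * edw_cost_per_tb + cold_pct * lake_cost_per_tb) / 100


# Assumptions: boundary values taken from the slide's "> $16 K" and "< $1 K".
EDW_COST_PER_TB = 16_000
LAKE_COST_PER_TB = 1_000
COLD_PCT = 70  # "COLD DATA 70%" on the same slide

blended = blended_cost_per_tb(COLD_PCT, EDW_COST_PER_TB, LAKE_COST_PER_TB)
saving = 1 - blended / EDW_COST_PER_TB

print(f"Blended cost per TB: ${blended:,.0f}")
print(f"Reduction vs. keeping everything in the EDW: {saving:.1%}")
```

Because the slide states inequalities, the real figures could be better or worse than this estimate; the sketch only shows how the two numbers interact.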

Page 6

CHALLENGES WITH EXISTING EDW INFRASTRUCTURE

1. Throw data away
2. Waste capacity on low value workloads
3. Unable to leverage new data sources

[Diagram labels: ARCHIVE; ELT]

Page 7

DATA ARCHITECTURE OPTIMISATION WITH HADOOP

1. Don’t throw data away
2. Reclaim Enterprise Data Warehouse for high value BI
3. Leverage new data sources

[Diagram labels: ARCHIVE; ETL+ELT+BI]

Page 8

SAMPLE PROBLEM SCENARIO

ALBERT wants to:
• Optimise the existing data infrastructure spend
• Enable analytics on all data, structured and unstructured
• Lay the solid foundation of Self-Service BI

• Albert has an existing 1 PB Enterprise Data Warehouse infrastructure. With rapid growth in data volume, he needs to add 500 TB of capacity to it.
• At an average cost of $13,000 per TB of EDW storage, adding 500 TB of capacity is estimated to cost $6.5 million.

[Chart: EDW Cost, 2013 to 2016, with a labelled value of 6.5M]
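The arithmetic behind the $6.5 million estimate can be checked in a few lines. The second figure below is not in this scenario: it is an assumption that reuses the "< $1 K per TB" bound from the earlier Cost Comparison slide as an upper limit, and the names are hypothetical.

```python
def expansion_cost(capacity_tb, cost_per_tb):
    """Cost of adding capacity_tb terabytes at a flat price per terabyte."""
    return capacity_tb * cost_per_tb


EXPANSION_TB = 500          # capacity Albert needs to add
EDW_COST_PER_TB = 13_000    # average EDW storage cost quoted in the scenario

edw_expansion = expansion_cost(EXPANSION_TB, EDW_COST_PER_TB)
print(f"EDW expansion: ${edw_expansion:,}")

# Assumption: the "< $1 K per TB" bound from the Cost Comparison slide,
# treated here as an upper limit for the same 500 TB.
LAKE_COST_PER_TB_UPPER = 1_000
lake_upper_bound = expansion_cost(EXPANSION_TB, LAKE_COST_PER_TB_UPPER)
print(f"Same capacity at under $1 K per TB: less than ${lake_upper_bound:,}")
```

The first result reproduces the $6.5 million on the slide. The second is only a bound under the stated assumption, not a quoted price.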

Page 9

MODERNISE YOUR EDW

Page 10

DATA LAKE SOLUTION FOR EDW MODERNISATION

[Architecture diagram. Labels: EXISTING SOURCES (ERP, CRM); NEW SOURCES (Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs); HORTONWORKS DATA PLATFORM (HADOOP CORE, DATA SERVICES, OPERATIONAL SERVICES); Data Management; Commodity Compute; Isilon; Business Analytics; Visualization & Dashboards; IT Applications; OFFLOAD. Legend: Current Data Flow, New Data Flow.]

Numbered callouts (1 to 5) in the diagram:
• ETL/ELT OFFLOAD
• ACTIVE ARCHIVE
• ENRICH WITH NEW DATA TYPES
• MULTI-PROTOCOL ACCESS (NFS, SMB, HTTP, Swift)
• ENTERPRISE-GRADE DATA MANAGEMENT

Page 11

ENTERPRISE EVOLUTION PROCESS

Typical Evolution Process (Every customer journey is different)

COST DRIVERS, REVENUE DRIVERS
• Enterprise Data Warehouse is Processing Limited
• Enterprise Data Warehouse is Capacity Limited
• Need to add new data source types

HADOOP WITH ENTERPRISE GRADE STORAGE SOLUTION
• ETL/ELT OFFLOAD
• ACTIVE ARCHIVE
• ENRICH WITH NEW DATA TYPES

Page 12

DATA SILO CONSOLIDATION

Page 13

DATA SILO CONSOLIDATION

• Home Directories & File Shares
• Surveillance
• Next-Gen Application
• Hadoop & Analytics
• Transaction Logs
• BLOBS
• EDW
• Content Shares
• Marketing M&E
• Social & Next-Gen
• Archive & Backup Target
• Data Monetization
• Design, Test & Manufacture
• Application Test

Page 15

DATA SILO CONSOLIDATION

DATA LAKE

• Home Directories & File Shares
• Surveillance
• Next-Gen Application
• Hadoop & Analytics
• Transaction Logs
• BLOBS
• EDW
• Content Shares
• Marketing M&E
• Social & Next-Gen
• Archive & Backup Target
• Data Monetization
• Design, Test & Manufacture
• Application Test

Page 16

DATA LAKE

• SCALE-OUT SINGLE REPOSITORY
• IN-PLACE ANALYTICS
• MULTI-PROTOCOL / WORKLOAD TIERS
• ENTERPRISE FEATURES
• MANAGE PBs

Page 17

EMC INFOARCHIVE

An Enterprise Information Archiving Platform that unlocks data of all types, trapped in siloed applications, lowering IT costs, preserving compliance and putting application data to work.

Leave No Application Data Behind

Page 18

VALUE PROPOSITIONS

• REDUCE COSTS: Eliminate costly old legacy apps and systems while still retaining data and content for compliance purposes
• OPTIMIZE: Make data-hungry applications run more efficiently by archiving static information in a governed manner
• ANALYZE: Enable better strategic decisions by leveraging all the formerly siloed information in your enterprise
• CONTROL: Ensure compliance with regulatory and legal mandates by applying necessary retention and eDiscovery policies

Page 19

INFOARCHIVE WITH A DATA LAKE

Big Data Analytics
• Applications built using Hadoop & 3rd party tools
• Hadoop
• Storage (Isilon)
• A solution for scalable big data analytics

Compliant Preservation
• InfoArchive
• Storage (Isilon)
• A compliant solution for application decommissioning, active archiving & data reuse

Data shared by InfoArchive to enable analytics

Page 21

DATA LAKE BENEFITS

1. Active Archive – Optimise Enterprise Data Warehouse storage by archiving cold data while still analysing it as needed
2. ETL Offload – Improve EDW performance by offloading ETL processing to Hadoop
3. Semi/Unstructured Data Analytics – Increase confidence in business decisions with new data sources
4. Multi-protocol Access – Enable applications to access/update Hadoop data using NFS, SMB, HTTP, Swift and other file/object based access methods
5. Data Management – Enterprise-grade data management at Hadoop economics

Unique to Isilon
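The Active Archive benefit depends on deciding which data counts as cold. The deck does not say how that decision is made, so the following is a minimal sketch under stated assumptions: a hypothetical table inventory, and a hypothetical rule that anything not accessed for 365 days is cold. None of the names or figures come from the presentation.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class TableStats:
    """Hypothetical per-table record from an EDW usage report."""
    name: str
    size_tb: float
    last_accessed: date


def split_hot_cold(tables, today, cold_after_days):
    """Partition tables into (hot, cold) by days since last access."""
    cutoff = today - timedelta(days=cold_after_days)
    hot = [t for t in tables if t.last_accessed >= cutoff]
    cold = [t for t in tables if t.last_accessed < cutoff]
    return hot, cold


# Invented sample inventory, for illustration only.
inventory = [
    TableStats("sales_2016", 40.0, date(2017, 1, 30)),
    TableStats("sales_2012", 60.0, date(2014, 6, 1)),
    TableStats("web_logs_2013", 100.0, date(2015, 2, 14)),
]

hot, cold = split_hot_cold(inventory, today=date(2017, 2, 8), cold_after_days=365)
cold_tb = sum(t.size_tb for t in cold)

print("Archive candidates:", [t.name for t in cold])
print(f"Capacity that could leave the EDW: {cold_tb} TB")
```

A real deployment would base the cut-off on query logs and retention policy rather than a single fixed age; the threshold here is only a placeholder.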

Page 22

EMC CONSULTING SERVICES

Business
• Assess: Big Data Vision Workshop
• Prove: Big Data Proof Of Value
• Deploy: Big Data Applied Analytics Implementation

Technology
• Assess: Big Data Technology Advisory
• Prove: Big Data Proof of Technology
• Deploy: Big Data Technology Implementation
