Modern Data Architecture – JD Kiev v05

Alexey Grishchenko

Feb 13, 2017

Page 1: Modern Data Architecture – JD Kiev v05

Pivotal Confidential–Internal Use Only

Modern Data Architecture
Alexey Grishchenko

Page 2: About me

Enterprise Architect @ Pivotal

•  7 years in data processing
•  5 years with MPP
•  4 years with Hadoop
•  Spark contributor
•  http://0x0fff.com

Pages 3-11: How it started…

[Diagram, built up one box per slide: Front End, Back End, DBMS]

What about BI?

Just put it there!

[Diagram: BI added to Front End, Back End, DBMS]

Was it fast?

[Diagram latency labels: 10ms, 100ms, 200ms, 1-2 min]

yes, single server…

Pages 12-25: First Issues

[Diagram: Front End, Back End, DBMS, BI. Latency labels: 10ms, 100ms, 200ms, 1-2 min]

More users got workstations

[Latency labels: 10ms, 400ms, 800ms, 1-2 min]

Split!

[Latency labels: 10ms, 300ms, 600ms, 1-2 min]

Even more users?

Split!

[Diagram: three Front End / Back End pairs. Latency labels: 10ms, 100ms, 400ms, 1-2 min]

What about automated systems?

[Diagram: five Front End / Back End pairs. Latency labels: 10ms, 100ms, 1 sec, 5-10 min]

Database, please, live!

[Latency labels: 10ms, 100ms, 800ms, 15-20 min]

What if “split” didn’t help this time?

Split more! Eventually it will help…

Pages 26-27: First Issues

[Diagram: five Front End / Back End pairs, BI, and four more DBMS boxes next to the original DBMS. Latency labels: 10ms, 100ms, 300ms, 35-40 min]
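
The slides do not say how BI reads its data once the DBMS is split, but the report time on the diagram grows from minutes toward hours. One plausible reason is that a report now has to visit every database and merge the partial results. The sketch below illustrates that scatter-gather pattern under stated assumptions: in-memory SQLite databases stand in for the split DBMS boxes, and the `orders` table, its columns, and the function names are hypothetical.

```python
import sqlite3


def make_dbms(rows):
    """Create one in-memory database standing in for a single split DBMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def sales_by_region(databases):
    """Run the same aggregate on every database and merge the partial sums."""
    totals = {}
    query = "SELECT region, SUM(amount) FROM orders GROUP BY region"
    for conn in databases:
        for region, amount in conn.execute(query):
            totals[region] = totals.get(region, 0.0) + amount
    return totals


# Hypothetical data spread over three of the split databases.
databases = [
    make_dbms([("north", 10.0), ("south", 5.0)]),
    make_dbms([("north", 2.5)]),
    make_dbms([("south", 7.5), ("east", 1.0)]),
]
report = sales_by_region(databases)
```

Every report pays one round trip per database, and the merge logic lives in the reporting tool, which is the kind of load the data warehouse on the following slides takes away from the operational systems.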

Pages 28-31: First Issues

[Diagram: as on page 27]

Sales went 10% up!

Sales went 20% down!

[Latency labels: 10ms, 100ms, 600ms, 2-3 hrs]

Stop loading my system with your stupid reports!

Pages 32-35: The Era of Data Warehouse

[Diagram: five FE / BE pairs, five DBMS boxes, ETL, DWH, BI. Labels: 100ms, DBMS 300ms, DWH 1 day, 2 days]

We need more reports!

[Diagram: Data Mining, OLAP, … added. The 2 days label becomes 3-4 days]

We need secondary site!

Pages 36-40: The Era of Data Warehouse

[Diagram, primary site: five FE / BE pairs, five DBMS boxes, ETL, DWH, BI, Data Mining, OLAP, …. Labels: 100ms, 300ms, DWH 1 day, 3-4 days]

[Diagram, secondary site added: five FE / BE pairs, five DBMS boxes. Labels: WAL Replication, 3-5 minutes late]

Where is our DWH? We need this data now!

Pages 41-43: The Era of Data Warehouse

[Diagram, primary site: as on page 40. Labels: 100ms, 300ms, ETL, DWH 1 day, 3-4 days]

[Diagram, secondary site: five FE / BE pairs, five DBMS boxes, DWH, BI, Data Mining, OLAP, …. Labels: WAL Replication, 3-5 minutes late; NAS, NAS, Backup / Restore, 3 days late DWH; 5-7 days]

Why is this data so old?

Pages 44-45: Advanced Architecture – ELT

[Diagram: the two-site architecture from page 43; on page 45 the ETL labels become ELT]

[Inset, ETL: DBMS, DBMS, DBMS, …; ETL; DDS; Data Marts, Reports, Aggregates, OLAP]

[Inset, ELT: DBMS, DBMS, DBMS, …; ODS, ODS, ODS, …; ELT; DDS; Data Marts, Reports, Aggregates, OLAP]
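
The ETL and ELT insets differ in where the transformation runs. A minimal sketch of the ELT order, assuming the warehouse is an SQL database (an in-memory SQLite database stands in for it here): source rows are first loaded unchanged into an ODS table, and the transformation into a DDS table then runs as SQL inside the warehouse. All table and column names are hypothetical.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")

# Extract + Load: copy source rows as-is into the ODS layer of the warehouse.
warehouse.execute(
    "CREATE TABLE ods_orders (id INTEGER, customer TEXT, amount_cents INTEGER)"
)
source_rows = [(1, " alice ", 1250), (2, "BOB", 300), (3, "alice", 750)]
warehouse.executemany("INSERT INTO ods_orders VALUES (?, ?, ?)", source_rows)

# Transform: runs inside the warehouse, reading ODS and writing the DDS layer.
warehouse.execute(
    """
    CREATE TABLE dds_orders AS
    SELECT id,
           LOWER(TRIM(customer)) AS customer,
           amount_cents / 100.0  AS amount
    FROM ods_orders
    """
)

# A small aggregate of the kind a data mart would expose, built on the DDS.
mart = warehouse.execute(
    "SELECT customer, SUM(amount) FROM dds_orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
```

Because the raw rows stay in the ODS table, a changed transformation can be rerun in the warehouse without going back to the source databases.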

Page 46: Advanced Architecture – CDC

[Diagram: the two-site architecture from page 45]

[Inset, ELT: DBMS, DBMS, DBMS, …; ELT; DDS; Data Marts, Reports, Aggregates, OLAP; ODS, ODS, ODS, …]

[Inset, ELT with CDC: DBMS, DBMS, DBMS, …; ELT; DDS; Data Marts, Reports, Aggregates, OLAP; ODS, ODS, ODS, …; CDC]

[Labels on the insets: 1 day, 1 hour]
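
Change data capture ships only the rows that changed since the last sync instead of re-extracting whole tables. A minimal sketch, assuming change events arrive in order as (operation, key, row) tuples read from the source database's log; the event format and all names are hypothetical.

```python
def apply_changes(target, changes):
    """Apply an ordered list of change events to a dict keyed by primary key."""
    for op, key, row in changes:
        if op in ("insert", "update"):
            target[key] = row
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# State of the ODS copy after the previous sync.
ods_customers = {1: {"name": "Alice"}, 2: {"name": "Bob"}}

# Changes captured from the source since then (hypothetical event format).
captured = [
    ("update", 2, {"name": "Robert"}),
    ("insert", 3, {"name": "Carol"}),
    ("delete", 1, None),
]
apply_changes(ods_customers, captured)
```

The work per sync is proportional to the number of changes rather than to the table size, which is what makes a shorter refresh interval practical.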

Pages 47-48: Advanced Architecture – CDC

[Diagram: ELT is now labeled ELT CDC, and a CDC box and a DWH box are listed after the secondary site's DBMS boxes. Labels change: DWH 1 day becomes DWH 3-24 hrs, 3-4 days becomes 1-4 days, 5-7 days becomes 4-7 days. Unchanged: 100ms, 300ms; WAL Replication, 3-5 minutes late; NAS, NAS, Backup / Restore, 3 days late]

Why is our secondary site’s DWH so old?

Pages 49-53: Moving Forward

[Diagram: the two-site architecture from page 47, unchanged]

Our problems are

•  Time to action takes up to 7 days
•  Amount of data is growing
•  DWH MPP storage is expensive

Pages 54-55: Modern Architectures

[Diagram and “Our problems are” list repeated from page 53]

Data Lake

Lambda

Pages 56-58: Modern Architectures – Data Lake

[Diagram, pages 56-57: the two-site architecture from page 47, unchanged]

[Inset, page 56: Hadoop; DBMS, DBMS, DBMS, …; ELT; DDS; OLAP, Data Marts, Aggregates, Reports; ODS, ODS, ODS, …; CDC; DWH, ODS, UDS; Analytical Archives; BI, Data Mining, OLAP; SQL-on-Hadoop; Data Mining At Scale]

[Diagram, page 58: each site now shows three FE / BE pairs and three DBMS boxes; DWH, Hadoop, Hadoop and a “?” are added. Labels: 100ms, 300ms, DWH 3-24 hrs, 1-4 days; NAS, NAS, Backup / Restore, 2 days late; 3-6 days; WAL Replication, 3-5 minutes late]

Page 59: Modern Architectures – Lambda

[Diagram: the two-site architecture from page 58, unchanged]

[Inset, Lambda architecture: Source Data; Speed Layer, Batch Layer; Serving Layer; Query, Query; Master Dataset; Batch View ×3; Real-time View ×3]
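
In the Lambda architecture the batch layer periodically recomputes views from the full master dataset, the speed layer keeps incremental views for events the last batch run has not covered, and a query merges both. A minimal sketch using page-view counts; the event shape and the function names are assumptions made for illustration.

```python
from collections import Counter


def build_batch_view(master_dataset):
    """Batch layer: recompute the view from the full master dataset."""
    return Counter(event["page"] for event in master_dataset)


def update_realtime_view(realtime_view, event):
    """Speed layer: fold one new event into the incremental view."""
    realtime_view[event["page"]] += 1


def query(batch_view, realtime_view, page):
    """Serving side: merge the batch result with the recent increments."""
    return batch_view[page] + realtime_view[page]


# Events already covered by the last batch run.
master_dataset = [{"page": "/home"}, {"page": "/cart"}, {"page": "/home"}]
batch_view = build_batch_view(master_dataset)

# Events that arrived after the batch run started.
realtime_view = Counter()
for event in [{"page": "/home"}, {"page": "/pay"}]:
    update_realtime_view(realtime_view, event)
```

When the next batch run finishes, the recent events are part of the master dataset and the real-time view can be reset, so errors in the speed layer do not accumulate.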

Pages 60-61: Modern Architectures – Lambda

[Diagram, page 60: the two-site architecture from page 58, unchanged]

[Diagram, page 61: two In-Memory Data Store boxes added, with RTDM, BI, Data Mining. Labels change: DWH 3-24 hrs becomes DWH 0-24 hrs, 1-4 days becomes 0-4 days. Unchanged: 100ms, 300ms; NAS, NAS, Backup / Restore, 2 days late; OLAP …, 3-6 days; WAL Replication, 3-5 minutes late; DWH, Hadoop, Hadoop and a “?”]

Pages 62-67: Modern Architectures

[Diagram: the architecture from page 61, unchanged]

Our problems are

•  Too many standby systems
•  How to replicate Hadoop cluster?
•  How to sync data in real-time systems?
•  How to better sync DWH?

Pipelining

Pages 68-78: Modern Data Architecture – Pipelining

[Diagram: the architecture from page 61, repeated on every slide]

[Pipeline diagram, built up one stage per slide. Labels in the order they appear:]

FE: App, App, App, …
HTTP
BE: Srv, Srv, Srv, …
SOAP
OLTP: SP, JDBC, Log, Table
CDC: copy, Parse, Batch
ETL
cp, Batch
ETL, load
DWH: ODS, DDS, Data Mart
BI
JDBC

Pages 79-82: Modern Data Architecture – Pipelining

[Diagram: the architecture from page 61, repeated on every slide]

[Pipeline diagram, page 79: as on page 78, but without the SOAP and “load” labels]

[Pipeline diagram, pages 80-82: the “ETL, cp Batch, ETL” chain is gone and “load” is back. New labels, added step by step: API; Queue, ETL; ETL, Batch, App; ETL, Batch; load; load, ETL. Other labels as on page 79: FE, BI, App ×3, HTTP, BE, Srv ×3, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, ODS, DDS, Data Mart, DWH, JDBC]

Page 83: Modern Data Architecture – JD Kiev v05

Modern Data Architecture – Pipelining

[Diagram: same pipeline as on Page 82, with added labels: STG, Batch App, Hadoop, HDFS, SQL on Hadoop]

Page 84: Modern Data Architecture – JD Kiev v05

Modern Data Architecture – Pipelining

[Diagram: same pipeline as on Page 83, with added label: RTI App]

Page 85: Modern Data Architecture – JD Kiev v05

Modern Data Architecture – Pipelining

[Diagram: same pipeline as on Page 84, with added label: Replicate]

Page 86: Modern Data Architecture – JD Kiev v05

Modern Data Architecture – Pipelining

[Overview diagram labels: FE/BE application pairs, DBMS, ELT, CDC, DWH, NAS Backup / Restore, WAL Replication, Hadoop, In-Memory Data Store, OLAP, Data Mining, BI, RTDM. Latency annotations: 100ms, 300ms, 0-24 hrs, 0-4 days, 2 days late, 3-6 days, 3-5 minutes late]

Page 87: Modern Data Architecture – JD Kiev v05

Modern Data Architecture – Pipelining

[Diagram labels: FE/BE application pairs, DBMS, ELT, CDC, DWH, Hadoop, In-Memory Data Store, OLAP, Data Mining, BI, RTBI. Annotations: Replication Queue, 3-5 minutes late; WAL Replication, 3-5 minutes late]
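The CDC and replication-queue labels on this slide can be illustrated with a small Python sketch: every write on the source emits a change event, the events travel through a queue, and a consumer applies them to a downstream copy. This is only an illustration under simplifying assumptions (a single process, plain dictionaries standing in for databases); `SourceTable`, `apply_changes` and the event format are hypothetical names, not part of any product shown in the deck.

```python
import queue
from typing import Dict, Optional, Tuple

# (operation, key, value); value is None for deletes. Hypothetical event format.
Change = Tuple[str, str, Optional[str]]


class SourceTable:
    """Stand-in for an OLTP table that emits a change event for every write."""

    def __init__(self, changes: "queue.Queue[Change]") -> None:
        self.rows: Dict[str, str] = {}
        self.changes = changes

    def upsert(self, key: str, value: str) -> None:
        self.rows[key] = value
        self.changes.put(("upsert", key, value))

    def delete(self, key: str) -> None:
        # Only emit an event if the row actually existed.
        if self.rows.pop(key, None) is not None:
            self.changes.put(("delete", key, None))


def apply_changes(changes: "queue.Queue[Change]", replica: Dict[str, str]) -> int:
    """Drain the queue in order and apply each change to the replica.

    Returns the number of change events applied.
    """
    applied = 0
    while True:
        try:
            op, key, value = changes.get_nowait()
        except queue.Empty:
            return applied
        if op == "upsert" and value is not None:
            replica[key] = value
        elif op == "delete":
            replica.pop(key, None)
        applied += 1
```

Until `apply_changes` runs, the replica lags behind the source; that gap is the kind of delay the slide annotates as "3-5 minutes late".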

Page 88: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram, Pivotal product map: BI; Pivotal Cloud Foundry (HTTP, FE App ×3, Queue, BE App ×3); Pivotal GemFire (App); Spring XD (Streaming); Pivotal HD (Streaming Data, Pivotal HAWQ, Data Mart); Pivotal Greenplum (ES, DDS, Data Mart, ODS); PostgreSQL (SP, Table); ETL (×2)]

Page 89: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

• Pivotal Labs – agile software development for next-generation applications

• Pivotal Cloud Foundry – PaaS for customer applications

• RabbitMQ – distributed message queue service on top of PCF

• Spring IO – foundation platform for modern applications

Page 90: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

Pivotal GemFire and Apache Geode (incubating) – in-memory data grid enabling real-time data processing and real-time decision making for enterprises

Page 91: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

Spring XD – unified, distributed and extensible framework for data pipelining: ingesting, batching, processing and exporting
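As a rough illustration of the four stages named above (ingesting, batching, processing, exporting), here is a minimal Python sketch built from generators. It does not use Spring XD or its API; the function names, the `user,amount` input format and the list used as a sink are all assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, List


def ingest(lines: Iterable[str]) -> Iterator[dict]:
    """Ingest: parse raw 'user,amount' lines into records, skipping malformed ones."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue
        user, amount = parts
        try:
            yield {"user": user, "amount": float(amount)}
        except ValueError:
            continue


def batch(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Batch: group records into lists of at most `size` items."""
    chunk: List[dict] = []
    for record in records:
        chunk.append(record)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def process(batches: Iterable[List[dict]]) -> Iterator[Dict[str, float]]:
    """Process: sum amounts per user within each batch."""
    for chunk in batches:
        totals: Dict[str, float] = {}
        for record in chunk:
            totals[record["user"]] = totals.get(record["user"], 0.0) + record["amount"]
        yield totals


def export(results: Iterable[Dict[str, float]], sink: list) -> None:
    """Export: append each result to a sink (a list standing in for a real target)."""
    for result in results:
        sink.append(result)


def run_pipeline(lines: Iterable[str], batch_size: int = 2) -> list:
    """Wire the four stages together; records flow lazily through the generators."""
    sink: list = []
    export(process(batch(ingest(lines), batch_size)), sink)
    return sink
```

Because each stage is a generator, records are pulled through one at a time, so the same wiring works for a finite file or an unbounded stream of lines.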

Page 92: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

• Pivotal HD – leading Hadoop distribution based on ODP

• Pivotal HAWQ and Apache HAWQ (incubating) – bring the power of MPP to the Hadoop cluster; best-in-class SQL-on-Hadoop solution

• Apache Spark – component of the Pivotal HD distribution, a modern framework for distributed data processing

Page 93: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

• Pivotal PostgreSQL – open source distribution of PostgreSQL, commercially supported by Pivotal

Page 94: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

Pivotal Greenplum – leading analytical MPP database, foundation for enterprise data warehousing systems and advanced analytics

Page 95: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

Data Lake

Page 96: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

Lambda Architecture
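The Lambda Architecture pairs a batch layer, which periodically recomputes views from the full dataset, with a speed layer that covers events the batch view has not absorbed yet; queries merge the two. A minimal single-process Python sketch of that idea follows. The class and method names are hypothetical and no Pivotal product API is used; in terms of the slide, the batch side would presumably sit on the Hadoop or warehouse products and the speed side on the streaming and in-memory ones, but that mapping is an assumption.

```python
from collections import Counter
from typing import List, Tuple

Event = Tuple[str, int]  # (key, count increment); hypothetical event shape


class LambdaCounts:
    """Per-key counters served from a batch view plus a speed view."""

    def __init__(self) -> None:
        self.master: List[Event] = []        # append-only master dataset
        self.batch_view: Counter = Counter()  # recomputed from scratch by run_batch
        self.speed_view: Counter = Counter()  # incremental, covers events since last batch

    def ingest(self, event: Event) -> None:
        """Store the raw event and update the speed view immediately."""
        self.master.append(event)
        key, n = event
        self.speed_view[key] += n

    def run_batch(self) -> None:
        """Batch layer: rebuild the view from the whole master dataset,
        then discard the speed view it now supersedes."""
        view: Counter = Counter()
        for key, n in self.master:
            view[key] += n
        self.batch_view = view
        self.speed_view = Counter()

    def query(self, key: str) -> int:
        """Serving layer: merge the batch and speed views."""
        return self.batch_view[key] + self.speed_view[key]
```

Resetting the speed view inside `run_batch` is only safe here because the sketch is single-threaded; a real system has to track which events each batch run covered.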

Page 97: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

Pipelining

Page 98: Modern Data Architecture – JD Kiev v05

Pivotal and Modern Data Architecture

[Diagram: same Pivotal product map as on Page 88]

Page 99: Modern Data Architecture – JD Kiev v05

Questions?

Page 100: Modern Data Architecture – JD Kiev v05

BUILT FOR THE SPEED OF BUSINESS