
Introduction to Big Data Analytics - Dell EMC Spain · Title: Microsoft PowerPoint - BREAKOUT_2_3_12H45_13H15_EMC Greenplum EMC Forum 2012_Luis Zamora Author: gparra Created Date:

Oct 12, 2018

Page 1

© Copyright 2011 EMC Corporation. All rights reserved.

Introduction to Big Data Analytics

Luis Zamora - Sales Manager Iberia Greenplum

Pedro Algaba - EMC Greenplum Solutions Architect

Page 2

BIG DATA: Challenges and Requirements

• Big Data Analytics imposes more demanding requirements that traditional Business Intelligence solutions do not meet

– Massive data analysis (hundreds of TB up to PB)

– Data external to the organization's systems (non-operational) and in many cases unstructured

– More agile and iterative analytic processes

– Integration with traditional informational data systems

Page 3

BIG DATA USE CASE

Optimizing risk models by incorporating external data

[Figure: underwriting risk on a LOW to HIGH scale, contrasting "Traditional data leveraged" with "Big data leveraged". Diagram labels: Legacy System, BI Reporting, Monthly Risk Model Updates, Greenplum Database, Greenplum In-Database Analytics, Greenplum Big Data Analytics, Daily Risk Model Updates, "Unstructured Data Sources Enrich The Data".]

Page 4

Page 5

Greenplum Unified Analytics Platform

Page 6

Greenplum Database: Extreme Performance on Commodity Hardware

• Optimized for BI and analytics

• Parallel data processing and loading

• Shared-nothing MPP architecture with linear scalability

• Integration with external data repositories
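
To make the shared-nothing MPP idea concrete, here is a minimal Python sketch, not Greenplum's actual implementation. It assumes rows are hash-distributed by a key across a hypothetical number of segments (`NUM_SEGMENTS`), each segment aggregates only its own rows, and a coordinator merges the partial results. The function names and the use of threads as stand-ins for segment hosts are illustrative assumptions.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_SEGMENTS = 4  # hypothetical segment count, chosen for illustration


def segment_for(key: str) -> int:
    """Map a distribution key to a segment with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SEGMENTS


def distribute(rows):
    """Place each (key, amount) row on exactly one segment (shared nothing)."""
    segments = [[] for _ in range(NUM_SEGMENTS)]
    for key, amount in rows:
        segments[segment_for(key)].append((key, amount))
    return segments


def local_sum(segment_rows):
    """Aggregate the rows held by one segment, without touching the others."""
    totals = defaultdict(float)
    for key, amount in segment_rows:
        totals[key] += amount
    return dict(totals)


def parallel_sum(rows):
    """Run the per-segment aggregation concurrently and merge the partials."""
    segments = distribute(rows)
    with ThreadPoolExecutor(max_workers=NUM_SEGMENTS) as pool:
        partials = list(pool.map(local_sum, segments))
    merged = {}
    for partial in partials:
        # A key lives on a single segment, so partial results never collide.
        merged.update(partial)
    return merged
```

Because every row with the same key lands on the same segment, the merge step only concatenates disjoint results. That is the property that lets this style of design add segments without adding coordination between them.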

Page 7

Greenplum HD: Enterprise-Ready Hadoop

• Apache Hadoop

• EMC 24x7 services and support

• Assured scalability

– Greenplum Analytics Workbench

• Integration with Greenplum Database

Page 8

Greenplum In-Database Analytics

• SAS: HPA, Access, and Scoring Accelerator

• MADlib: open-source library of advanced analytic functions

• Supported analytic extensions: PostGIS (geospatial support), PL/R (statistical computing), PL/Java, PL/Perl

Page 9

Greenplum Chorus: Streamlining Big Data Analytics

• A single interface for all data

– Search, explore, visualize, and import data from any repository

– SAS datasets, databases, or Hadoop files

• Automatic provisioning of virtual databases

• Collaborative: create, share, publish

– Data sources, analytic models, insights

Page 10

Unified Data Co-Processing

[Diagram labels: Analytic Productivity Tools & Apps; Data Computing Interfaces (SAS PROC, SQL, MapReduce, In-Database Analytics, Parallel Data Loading); Greenplum Database (SQL DB Engine, Compute, Storage); Hadoop (MapReduce Engine, Compute, Storage); parallel data exchange; Network.]

All Data Types

• unstructured data

• structured data

• temporal data

• geospatial data

• sensor data

• spatial data

Page 11

Greenplum Data Computing Appliance

• The only modular appliance for co-processing structured and unstructured data

– Standard Intel servers and GigE switching

• Unified platform for Big Data analytics

– High-performance internal interconnect network

– Modules for structured data (Greenplum DB)

– Modules for unstructured data (Greenplum HD)

– Modules for ETL / BI analytic applications (Greenplum DIA)

Page 12

Modular Configuration for Big Data Analytics

[Diagram labels: 1st Rack; Aggregation Rack; "Add ¼ rack increments"; Functional Module, shown as a Greenplum Database Module, a Greenplum HD Module, or a Greenplum DIA Module.]

Page 13

DEMO

Page 14