Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Jan 15, 2015

Data & Analytics

Hortonworks

As an enterprise's big data program matures and Apache Hadoop becomes more deeply embedded in critical operations, the ability to support and operate it efficiently and reliably becomes increasingly important. To help enterprises operate a modern data architecture at scale, Red Hat and Hortonworks have collaborated to integrate Hortonworks Data Platform with Red Hat's proven platform technologies. Join us for this interactive 3-part webinar series, in which we'll demonstrate how Red Hat JBoss Data Virtualization can integrate with Hadoop through Hive and give users easy access to data.
Transcript
Page 1: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved

Q&A box is available for your questions

Webinar will be recorded for future viewing

Thank you for joining!

We’ll get started soon…

Page 2: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

An Open Source Modern Data Architecture …with Red Hat and Apache Hadoop

We do Hadoop.

Page 3: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Your speakers…

John Kreisa (@marked_man), VP Strategic Marketing, Hortonworks

Rob Cardwell, VP Middleware Technologies, Red Hat

Syed Rasheed, Sr. Solution Marketing Manager, Red Hat

Page 4: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Topics

•  Poll – Where are you on your Hadoop Journey?

•  Why an open source Modern Data Architecture?

•  Hortonworks and Red Hat partnership for the open MDA

•  Open source MDA roadmap

Page 5: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Poll: Where are you in your Hadoop journey?

1.  Researching our options

2.  Currently evaluating some software

3.  Deep in a trial

4.  What’s Hadoop?

Page 6: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Big Data Market Trends & Projections

Big Data Explosion

% by which organizations leveraging modern information management systems outperform peers by 2015

Hadoop-enabled DBMSs

85% from new data types

50x data growth, 2010 to 2020

1 Zettabyte (ZB) = 1 billion TBs

15x growth rate of machine-generated data by 2020

The US has 1/3 of the world’s data

Big Data is 1 of 5 US GDP game changers: $325 billion incremental annual GDP from big data analytics in retail and manufacturing by 2020

Page 7: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

A data architecture under pressure from new data

•  Silos of Data

•  Costly to Scale

•  Constrained Schemas

…and difficult to manage new data

APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications

DATA SYSTEM: RDBMS, EDW, MPP

SOURCES: Existing Sources (CRM, ERP,…)

New Data Types: Clickstream; Geolocation; Sentiment, Web Data; Sensor, Machine Data; Unstructured docs, emails; Server logs

Page 8: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Hadoop within an emerging Modern Data Architecture

Hortonworks architected and led development of YARN

Common data set, multiple applications

•  Optionally land all data in a single cluster

•  Batch, interactive & real-time use cases

•  Support multi-tenant access, processing & segmentation of data

YARN: Architectural center of Hadoop

•  Consistent security, governance & operations

•  Ecosystem applications certified by Hortonworks to run natively in Hadoop

SOURCES: Existing Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured

APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications

DATA SYSTEM: RDBMS, EDW, MPP; YARN: Data Operating System (Interactive, Real-Time, Batch) over HDFS (Hadoop Distributed File System), nodes 1 to N

Page 9: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Hadoop: typically used for new analytic applications

Chart axes: SCALE, SCOPE

New Analytic Apps: New types of data, LOB-driven

Page 10: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Hadoop Value: New types of data

Clickstream: Capture and analyze website visitors’ data trails and optimize your website

Sensors: Discover patterns in data streaming automatically from remote sensors and machines

Server Logs: Research logs to diagnose process failures and prevent security breaches

Sentiment: Understand how your customers feel about your brand and products – right now

Geographic: Analyze location-based data to manage operations where they occur

Unstructured: Understand patterns in files across millions of web pages, emails, and documents

Page 11: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Unlock New Applications from New Types of Data

Table columns: Industry, Use Case, Sentiment & Web, Clickstream & Behavior, Machine & Sensor, Geographic, Server Logs, Structured & Unstructured

Financial Services

•  New Account Risk Screens ✔ ✔

•  Trading Risk ✔

•  Insurance Underwriting ✔ ✔ ✔

Telecom

•  Call Detail Records (CDR) ✔ ✔

•  Infrastructure Investment ✔ ✔

•  Real-time Bandwidth Allocation ✔ ✔ ✔

Retail

•  360° View of the Customer ✔ ✔ ✔

•  Localized, Personalized Promotions ✔

•  Website Optimization ✔

Manufacturing

•  Supply Chain and Logistics ✔

•  Assembly Line Quality Assurance ✔

•  Crowd-sourced Quality Assurance ✔

Healthcare

•  Use Genomic Data in Medical Trials ✔ ✔ ✔

•  Monitor Patient Vitals in Real-Time

Pharmaceuticals

•  Recruit and Retain Patients for Drug Trials ✔ ✔

•  Improve Prescription Adherence ✔ ✔ ✔ ✔

Oil & Gas

•  Unify Exploration & Production Data ✔ ✔ ✔ ✔

•  Monitor Rig Safety in Real-Time ✔ ✔ ✔

Government

•  ETL Offload/Federal Budgetary Pressures ✔ ✔

•  Sentiment Analysis for Government Programs ✔

Page 12: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Hadoop incrementally delivers a ‘Data Lake’

Chart axes: SCALE, SCOPE

A Modern Data Architecture/Data Lake

New Analytic Apps: New types of data, LOB-driven

RDBMS, MPP, EDW

Governance & Integration, Security, Operations, Data Access, Data Management

Data Lake: An architectural shift in the data center that uses Hadoop to deliver deeper insight across a large, broad, diverse set of data at efficient scale

Page 13: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

HDP is deeply integrated in the data center

OPERATIONAL TOOLS, DEV & DATA TOOLS, INFRASTRUCTURE

SOURCES: Existing Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured

DATA SYSTEM: RDBMS, EDW, MPP, HANA

APPLICATIONS: BusinessObjects BI

HDP 2.1: Governance & Integration, Security, Operations, Data Access, Data Management, YARN

•  Enables millions of JBoss developers to quickly build applications with Hadoop

•  Simplifies deployment of Hadoop on OpenStack

•  Develops and deploys Apache Hadoop as integrated components of the open modern data architecture

Page 14: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Rob Cardwell, VP Middleware Technologies, Red Hat

Page 15: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Companies strengthen relationship to bring Enterprise Apache Hadoop to the open modern data architecture

•  Engineering alignment

•  Corporate alignment

•  Field alignment

Page 16: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Engineering Collaboration Benefits

Integration with JBoss Data Virtualization: Enable agile Big Data Hadoop integration with existing enterprise assets and maximize universal data utilization to enable self-service analytics

Integration with multiple products in the Red Hat JBoss Middleware family: Enables millions of JBoss developers to quickly build applications with Hadoop

Integration with Red Hat Storage: Enables Hadoop to use the secure, resilient Red Hat Storage pool for data applications

Integration with Red Hat Enterprise Linux OpenStack Platform: Simplifies automated deployment of Hadoop on OpenStack

Integration with Red Hat Enterprise Linux and OpenJDK: Develop and deploy Apache Hadoop as an integrated component for multiple deployment scenarios

Page 17: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Red Hat + Hortonworks Delivering Value for both Business and IT organizations

Business analysts and users: Consume big data using existing tools and skills

Application developers: Easily build new big data analytical applications based on Hadoop and existing sources

Enterprise architects: Agile big data integration and creation of dynamic data supply chain to maximize data utilization and analytics at scale

IT Operations: Enable Apache Hadoop as an integrated, complementary component of the operational architecture

Page 18: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Red Hat + Hortonworks Deliver Open Source Modern Data Architecture

•  A deeper strategic alliance
–  Engineer solutions for seamless customer experience
–  Joint go to market activities
–  Integrated customer support

•  Available now
–  HDP on Red Hat Storage beta program
–  Red Hat JBoss Data Virtualization with HDP
–  HDP on Red Hat Enterprise Linux with OpenJDK

Page 19: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Syed Rasheed, Sr. Solution Marketing Manager, Red Hat

Page 20: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Information & Agility Gap

Over 70% of BI project effort lies in finding and integrating source data

Only 28% of users have any meaningful data access

Decision-makers Are Demanding Improved Use Of Data And Analytics

•  Improve the use of data and analytics to improve business decisions and outcomes: 72%

•  Identify new ways IT can better support business/marketing objectives: 66%

•  Improve IT project delivery performance: 56%

Sources: Gartner CIO Agenda Report 2013; Forrester, Information Fabric 3.0, August 8, 2013

Page 21: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Data Challenges Getting Bigger…

NoSQL

Hive

MapReduce

HDFS

Storm

HBase Spark

Page 22: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Make Big Data Accessible for Everyone

Page 23: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Data Supply and Integration Solution

Data Virtualization sits in front of multiple data sources and

✔  allows them to be treated as a single source

✔  delivering the desired data

✔  in the required form

✔  at the right time

✔  to any application and/or user.

THINK VIRTUAL MACHINE FOR DATA

Page 24: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Easy Access to Big Data

•  Reporting tool accesses the data virtualization server via a rich SQL dialect

•  The data virtualization server translates the rich SQL dialect to HiveQL

•  Hive translates HiveQL to MapReduce jobs

•  MapReduce runs the MR job on the big data

Diagram: Analytical Reporting Tool, Data Virtualization Server, Hadoop (Hive, MapReduce, HDFS), Big Data
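
The last two steps of the flow above can be pictured with a toy, in-memory sketch. This is not Hive's actual query planner: the rows, the function names, and the `clicks` table mentioned in the comments are all hypothetical, and the three functions only mimic the map, shuffle, and reduce phases that a simple GROUP BY count would roughly correspond to.

```python
from collections import defaultdict

# Hypothetical clickstream rows: (page, visitor_id). Not taken from the webinar.
ROWS = [
    ("/home", "u1"),
    ("/pricing", "u2"),
    ("/home", "u3"),
    ("/docs", "u1"),
    ("/home", "u2"),
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, like a mapper for COUNT(*) ... GROUP BY page."""
    for page, _visitor in rows:
        yield page, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, like a reducer computing the count."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_page(rows):
    """Rough equivalent of: SELECT page, COUNT(*) FROM clicks GROUP BY page."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_page(ROWS))
```

On a real cluster the map and reduce functions run in parallel across many nodes and the shuffle moves data over the network; the sketch only shows the shape of the computation.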

Page 25: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Different Users Different Views of Big Data

•  Logical tables with different forms of aggregation

•  Logical tables containing extra derived data

•  Logical tables with filtered data

•  All reports/users share the same specifications

MapReduce

HDFS

Hive
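
The three kinds of logical table listed above can be illustrated with ordinary SQL views. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for a data virtualization server; the `clicks` table, its columns, and the view names are assumptions made for the example, not part of any Red Hat or Hortonworks product.

```python
import sqlite3

# In-memory database standing in for the virtualization layer (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clicks (region TEXT, page TEXT, seconds INTEGER);
    INSERT INTO clicks VALUES
        ('EU', '/home', 30),
        ('EU', '/docs', 90),
        ('US', '/home', 45),
        ('US', '/pricing', 15);

    -- Logical table with aggregation
    CREATE VIEW clicks_by_region AS
        SELECT region, COUNT(*) AS visits, SUM(seconds) AS total_seconds
        FROM clicks GROUP BY region;

    -- Logical table with an extra derived column
    CREATE VIEW clicks_with_minutes AS
        SELECT region, page, seconds, seconds / 60.0 AS minutes FROM clicks;

    -- Logical table with filtered data
    CREATE VIEW eu_clicks AS
        SELECT region, page, seconds FROM clicks WHERE region = 'EU';
    """
)

# Each report queries its own view; all views share the one underlying table.
by_region = conn.execute(
    "SELECT * FROM clicks_by_region ORDER BY region"
).fetchall()
eu_pages = conn.execute("SELECT page FROM eu_clicks ORDER BY page").fetchall()
docs_minutes = conn.execute(
    "SELECT minutes FROM clicks_with_minutes WHERE page = '/docs'"
).fetchone()[0]
```

Because every view is defined once over the same base table, reports that use them share one specification, which is the point the slide makes.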

Page 26: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Caching the Big Data

•  Caches to speed up interactive reporting

•  Caches to create a consistent view of big data

•  Different caches for different reports

MapReduce

HDFS

Hive
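
As a rough illustration of the caching idea above, here is a minimal time-to-live cache placed in front of a slow query function. It is a sketch under stated assumptions: `QueryCache` and `slow_backend` are hypothetical names, and a real data virtualization server would manage its caches and refresh policies in its own way.

```python
import time


class QueryCache:
    """Tiny time-to-live cache keyed by query text (illustrative only)."""

    def __init__(self, run_query, ttl_seconds, clock=time.monotonic):
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # query text -> (expiry time, result)

    def get(self, query):
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the backend entirely
        result = self._run_query(query)
        self._entries[query] = (now + self._ttl, result)
        return result


calls = []


def slow_backend(query):
    """Stand-in for a long-running Hive query; records each real execution."""
    calls.append(query)
    return f"result of {query}"


# One cache per report, each with its own freshness window.
dashboard_cache = QueryCache(slow_backend, ttl_seconds=60)
query = "SELECT region, COUNT(*) FROM clicks GROUP BY region"
first = dashboard_cache.get(query)
second = dashboard_cache.get(query)  # served from the cache
```

Until an entry expires, every reader of that cache sees the same result, which is one way to obtain the consistent view the slide mentions. Giving each report its own `QueryCache` with a different `ttl_seconds` corresponds to the last bullet.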

Page 27: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Integration of Big Data with “Small Data”

•  Integrating small data with big data is easy

•  Integration specifications can be shared or be developed for individual reports

MapReduce

HDFS

Hive Application Database Server
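
To make the integration point concrete, the sketch below joins a hypothetical aggregate, of the kind a Hive query might return, with a small lookup table from an application database. All names and values are invented for illustration; a virtualization server would express the same join declaratively in SQL across the two sources.

```python
# Hypothetical aggregate that a Hive query might return: (customer_id, page_views).
big_data_rows = [(101, 5400), (102, 120), (103, 87)]

# Hypothetical "small data" from an application database: customer_id -> name.
crm_customers = {101: "Acme Corp", 102: "Globex", 104: "Initech"}


def join_on_customer(big_rows, small_lookup):
    """Inner-join the two sources on customer_id, as a federated query would."""
    return [
        (customer_id, small_lookup[customer_id], views)
        for customer_id, views in big_rows
        if customer_id in small_lookup
    ]


joined = join_on_customer(big_data_rows, crm_customers)
```

Customers that appear in only one source (103 and 104 here) drop out, as they would in a SQL inner join.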

Page 28: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Security and Big Data

•  Hadoop security is file-based

•  Data virtualization can offer finer-grained security

•  JBoss Data Virtualization can offer table, row, column, and value level security on big data

•  Works in conjunction with other SQL-on-Hadoop implementations
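
The difference between file-level and finer-grained security can be sketched as a per-role row filter plus column masking applied before results reach the user. This is a conceptual sketch only: the roles, the policy structure, and the `mask` helper are hypothetical and do not reflect how JBoss Data Virtualization actually configures its security.

```python
ROWS = [
    {"region": "EU", "customer": "Acme Corp", "card_number": "4111111111111111", "amount": 120},
    {"region": "US", "customer": "Globex", "card_number": "5500000000000004", "amount": 75},
]

# Hypothetical policy: which rows a role may see and which columns are masked.
POLICIES = {
    "eu_analyst": {"row_filter": lambda r: r["region"] == "EU", "masked": {"card_number"}},
    "auditor": {"row_filter": lambda r: True, "masked": set()},
}


def mask(value):
    """Keep the last four characters and replace the rest with '*'."""
    text = str(value)
    return "*" * (len(text) - 4) + text[-4:]


def secure_view(rows, role):
    """Apply the role's row filter, then mask its restricted columns."""
    policy = POLICIES[role]
    visible = []
    for row in rows:
        if not policy["row_filter"](row):
            continue  # row-level security: hide the whole row
        visible.append(
            {k: mask(v) if k in policy["masked"] else v for k, v in row.items()}
        )
    return visible


eu_result = secure_view(ROWS, "eu_analyst")
```

With file-based permissions alone, a user who can read the underlying file sees every row and column in it; a filter of this kind narrows that to the rows and values a given role is allowed to see.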

Page 29: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Benefits of Data Virtualization on Big Data

•  Enterprise democratization of big data

•  Any reporting or analytical tool can be used

•  Easy access to big data

•  Seamless integration of big data and small data

•  Sharing of integration specifications

•  Collaborative development on big data

•  Fine-grained security of big data

•  Speedy delivery of reports on big data

You Need A Data Virtualization Strategy To Avoid Falling Behind

“Without a data virtualization strategy, you risk knowing less about your customer, delivering fewer real-time business insights, losing competitive advantage, and spending more to address data challenges.”

Information Fabric 3.0, August 8, 2013

Page 30: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Red Hat + Hortonworks Making it Easier for Enterprises to Harness the Power Of Big Data

•  Integrating Hadoop into existing information infrastructure.

•  Building enterprise-grade, data-centric applications with Hadoop.

•  Operationalizing Hadoop and delivering high-quality services around it.

Page 31: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Thank you!

Page 32: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Next Steps...

Download the Hortonworks Sandbox

Learn Hadoop

Build Your Analytic App

Try Hadoop 2

More about Red Hat & Hortonworks http://hortonworks.com/partner/redhat

Contact us: [email protected]

Page 33: Hortonworks and Red Hat Webinar_Sept.3rd_Part 1

Don’t Forget to Register for our Next Webinar!

September 7th, 10 AM PST Red Hat JBoss Data Virtualization and Hortonworks Data Platform

http://info.hortonworks.com/RedHatSeries_Hortonworks.html