Introduction to Hortonworks Data Platform for Windows

May 10, 2015

Hortonworks

According to IDC, Windows Server runs on more than 50% of the servers in the enterprise data center. Hortonworks has worked closely with Microsoft to port Apache Hadoop to Windows so that organizations can take advantage of this emerging Big Data technology. Join us in this informative webinar to hear about the new Hortonworks Data Platform for Windows.

In less than an hour, you’ll learn:

- Key capabilities available in Hortonworks Data Platform for Windows
- How HDP for Windows integrates with Microsoft tools
- Key workloads and use cases for driving Hadoop today
Page 1: Introduction to Hortonworks Data Platform for Windows

© Hortonworks Inc. 2013

Quick Housekeeping Rules

• A Q&A panel is available if you have any questions during the webinar

• There will be time for Q&A at the end

• We will record the webinar for future viewing

• All attendees will receive a copy of the slides and recording

Page 2: Introduction to Hortonworks Data Platform for Windows

Introducing Hortonworks Data Platform for Windows

Enterprise Apache Hadoop for Windows Environments

March 2013

Page 3: Introduction to Hortonworks Data Platform for Windows

Our Speakers

• John Kreisa, VP, Strategic Marketing

• Saptak Sen, Sr. Product Manager

• Rohit Bakshi, Product Manager

Page 4: Introduction to Hortonworks Data Platform for Windows

Agenda

• Why Hadoop on Windows?

• Hortonworks Data Platform for Windows

• Microsoft - Big Data and Apache Hadoop

• Hortonworks Data Platform under the covers

• Q&A

Page 5: Introduction to Hortonworks Data Platform for Windows

Polling Question

Where are you with Hadoop?

__ We are running it in production

__ We have it running in our labs

__ We are just investigating Hadoop

__ What is Hadoop?

Page 7: Introduction to Hortonworks Data Platform for Windows

Why Apache Hadoop on Windows?

• According to IDC, Windows Server held 73% market share in 2012

– Hadoop was traditionally built for Linux servers, so there are a large number of underserved organizations

• Apache Hadoop: de facto platform for processing massive amounts of unstructured data

– Complementary to existing Microsoft technologies

– There is a huge untapped community of Windows developers and ecosystem partners

• A strong Microsoft-Hortonworks partnership and 18 months of development make this a natural next step

Page 8: Introduction to Hortonworks Data Platform for Windows

What Makes Up Big Data?

[Figure: data volume grows from megabytes through gigabytes and terabytes to petabytes as data variety and complexity increase, moving from ERP through CRM and WEB to BIG DATA.]

Example data types shown: purchase detail, purchase record, payment record; offer details, support contacts, customer touches, segmentation; web logs, offer history, A/B testing, dynamic pricing, affiliate networks, search marketing, behavioral targeting, dynamic funnels; user generated content, mobile web, SMS/MMS, sentiment, external demographics, HD video, audio, images, speech to text, product/service logs, social interactions & feeds, business data feeds, user click stream, sensors / RFID / devices, spatial & GPS coordinates.

Transactions + Interactions + Observations = BIG DATA

Page 9: Introduction to Hortonworks Data Platform for Windows

Big Data: Big and Getting Bigger Fast!

• Unstructured data growth exceeds 80% year/year in most enterprises

– Machine-generated data is a key driver in data growth

• IDC projects the digital universe will reach 40 zettabytes (ZB) by 2020

– 1 ZB = 1,000,000,000,000 GB!

– Projected to increase 15x by 2020

• According to a 2012 Barclays CIO study, big data outranks virtualization as the #1 spending initiative

*2012 IDC Digital Universe Study
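
The unit arithmetic on this slide can be checked with a short sketch. It assumes decimal (SI) units, where 1 GB is 10^9 bytes and 1 ZB is 10^21 bytes. The "implied baseline" line is only one possible reading of the slide: it assumes the 15x growth figure ends at the 40 ZB projection, which the slide does not state explicitly.

```python
# Decimal (SI) byte units; an assumption, since the slide does not say
# whether it means decimal or binary units.
GB = 10**9    # bytes in one gigabyte
ZB = 10**21   # bytes in one zettabyte


def zb_to_gb(zettabytes: float) -> float:
    """Convert zettabytes to gigabytes using decimal units."""
    return zettabytes * ZB / GB


gb_per_zb = ZB // GB               # how many GB fit in one ZB
projected_2020_gb = zb_to_gb(40)   # the 40 ZB projection, expressed in GB

# Assumption: "15x by 2020" describes growth up to the 40 ZB projection,
# so the starting point would be 40 / 15 ZB.
implied_baseline_zb = 40 / 15

print(f"1 ZB = {gb_per_zb:,} GB")
print(f"40 ZB = {projected_2020_gb:,.0f} GB")
print(f"Implied baseline: {implied_baseline_zb:.2f} ZB")
```

The first printed line reproduces the slide's "1 ZB = 1,000,000,000,000 GB" figure.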

Page 10: Introduction to Hortonworks Data Platform for Windows

Enter Apache Hadoop

OSS that delivers high-scale storage & processing with enterprise-ready platform services

Hortonworkers are the original architects, operators, and builders of core Hadoop

• HADOOP CORE: HDFS, MapReduce

• PLATFORM SERVICES: Enterprise Readiness

The core of the next generation data platform…

Page 12: Introduction to Hortonworks Data Platform for Windows

Introducing HDP for Windows

Hortonworks Data Platform (HDP) for Windows

• 100% Open Source Enterprise Hadoop

• Component and version compatible with Microsoft HDInsight

• Availability

– Beta release available now

– GA early 2Q 2013

[Diagram: HDP for Windows stack]

• OPERATIONAL SERVICES: Manage & Operate at Scale

• DATA SERVICES: Store, Process and Access Data

• HADOOP CORE: Distributed Storage & Processing

• PLATFORM SERVICES: Enterprise Readiness

Page 13: Introduction to Hortonworks Data Platform for Windows

Hortonworks Data Platform for Windows

• Enterprise-grade Apache Hadoop on Windows

– Enables same experience for Hadoop on Windows & Linux

• More partners, more developers for Hadoop

– Makes native Apache Hadoop available to Windows ecosystem

– More options for Windows-focused organizations

• Hortonworks focus: Enterprise Apache Hadoop for all platforms

– Trusted, reliable, production-ready distribution for on-premises Hadoop on Windows deployments

• Built with joint investment and contributions from Microsoft

– Deep engineering relationship ensures tight integration and maximum performance

HDP: the first and only distribution available on Windows & Linux

Page 14: Introduction to Hortonworks Data Platform for Windows

Hortonworks: Best In Class Hadoop Support

• Experienced enterprise support team

– Experience supporting enterprise clients in production

– Core engineers have real operational experience: built and supported 44+K nodes in production

– Extensive experience in commercial big data offerings including HDP, MapR, Karmasphere

• Global 24x7 operation

– Support based in Sunnyvale, UK & India

• Stringent case management processes ensure high-quality customer service & responsiveness

Page 15: Introduction to Hortonworks Data Platform for Windows

Transferring Our Hadoop Expertise to You

The expert source for Apache Hadoop training & certification

• World-class training programs designed to help you learn fast

– Role-based hands-on classes with 50% lab time

– New HDP on Windows course

• Expert consulting services

– Programs designed to transfer knowledge

• Industry-leading Hadoop Sandbox program

– Fastest way to learn Apache Hadoop

– Multi-level tutorials for wide applicability

– Customizable and updateable

Page 16: Introduction to Hortonworks Data Platform for Windows

Hortonworks Snapshot

• We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform

• We engineer, test & certify HDP for enterprise usage

• We employ the core architects, builders and operators of Apache Hadoop

• We drive innovation within Apache Software Foundation projects

• We are uniquely positioned to deliver the highest quality of Hadoop support

• We enable the ecosystem to work better with Hadoop

Develop Distribute Support

We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution

Endorsed by Strategic Partners

Headquarters: Palo Alto, CA

Employees: 180+ and growing

Investors: Benchmark, Index, Yahoo

Page 18: Introduction to Hortonworks Data Platform for Windows

Microsoft Big Data

Microsoft Big Data:

– Simplifies data management for IT

– Enables IT and users to easily enrich their data with the world’s data

– Delivers agility to end users through familiar tools like Excel

microsoft.com/bigdata

Simplicity for IT

Agility for End Users

Page 19: Introduction to Hortonworks Data Platform for Windows

Microsoft End-To-End Big Data Platform

Page 21: Introduction to Hortonworks Data Platform for Windows

Enhancing the Core of Apache Hadoop

Deliver high-scale storage & processing with enterprise-ready platform services

Unique Focus Areas:

• Bigger, faster, more flexible: continued focus on speed & scale and enabling near-real-time apps

• Tested & certified at scale: run ~1300 system tests on large clusters for every release

• Enterprise-ready services: high availability, disaster recovery, snapshots, security, …

Hortonworkers are the architects, operators, and builders of core Hadoop

• HADOOP CORE: HDFS, WebHDFS, MapReduce

• PLATFORM SERVICES: Enterprise Readiness
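
WebHDFS, listed in the Hadoop core above, exposes HDFS over HTTP with URLs of the form `http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>`. The sketch below only builds such a URL and does not contact a cluster. The host name, the port 50070, and the HDFS path are placeholder assumptions for illustration, not values from the webinar.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # The operation name and any extra parameters go in the query string.
    query = urlencode({"op": op, **params})
    # quote() percent-encodes unsafe characters but keeps "/" separators.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical NameNode address and directory, for illustration only.
url = webhdfs_url("namenode.example.com", 50070, "/user/demo/logs", "LISTSTATUS")
print(url)
```

Because the interface is plain HTTP, a URL like this can be requested from any Windows tool that speaks HTTP, which is part of what makes WebHDFS useful for integration.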

Page 22: Introduction to Hortonworks Data Platform for Windows

Data Services for Full Data Lifecycle

Provide data services to store, process & access data in many ways

Unique Focus Areas:

• Apache HCatalog: metadata services for consistent table access to Hadoop data

• Apache Hive: explore & process Hadoop data via SQL & ODBC-compliant BI tools

Hortonworks enables Hadoop data to be accessed via existing tools & systems

• DATA SERVICES: HCatalog, Hive, Pig, Sqoop

• HADOOP CORE: Distributed Storage & Processing

• PLATFORM SERVICES: Enterprise Readiness

Page 25: Introduction to Hortonworks Data Platform for Windows

Operational Services for Ease of Use

Include complete operational services for productive operations & management

• Apache Oozie: Manage and schedule job execution for Hadoop jobs

Only Hortonworks provides a complete open source Hadoop management tool

• OPERATIONAL SERVICES: Oozie

• DATA SERVICES: Store, Process and Access Data

• HADOOP CORE: Distributed Storage & Processing

• PLATFORM SERVICES: Enterprise Readiness

Page 26: Introduction to Hortonworks Data Platform for Windows

Inside HDP for Windows

Hortonworks Data Platform (HDP) for Windows

• 100% Open Source Enterprise Hadoop

• Component and version compatible with Microsoft HDInsight

• Availability

– Beta release available now

– GA early 2Q 2013

• OPERATIONAL SERVICES (Manage & Operate at Scale): Oozie

• DATA SERVICES (Store, Process and Access Data): HCatalog, Hive, Pig, Sqoop

• HADOOP CORE (Distributed Storage & Processing): HDFS, WebHDFS, MapReduce

• PLATFORM SERVICES

Page 27: Introduction to Hortonworks Data Platform for Windows

Seamless Interoperability with Your Microsoft Tools

• Integrated with Microsoft tools for native big data analysis

– Bi-directional connectors for SQL Server and SQL Azure through SQOOP

– Excel ODBC integration through Hive

• Addressing demand for Hadoop on Windows

– Ideal for Windows customers with Hadoop operational experience

• Enables all common Hadoop workloads

– Data refinement and ETL offload for high-volume data landing

– Data exploration for discovery of new business opportunities
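
As a rough illustration of the bi-directional SQL Server connector mentioned above, the sketch below assembles the argument lists for a Sqoop import (SQL Server to HDFS) and a Sqoop export (HDFS to SQL Server). It only builds the commands and does not run them. The server, database, table, user, and directory names are hypothetical, and the executable name `sqoop` is an assumption about how the tool is invoked on a given install.

```python
def sqlserver_jdbc_url(host: str, database: str, port: int = 1433) -> str:
    """Build a SQL Server JDBC connection URL (1433 is the usual default port)."""
    return f"jdbc:sqlserver://{host}:{port};databaseName={database}"


def sqoop_import_args(jdbc_url: str, username: str, table: str, target_dir: str) -> list:
    """Arguments for copying a database table into an HDFS directory."""
    return ["sqoop", "import", "--connect", jdbc_url, "--username", username,
            "-P",  # prompt for the password instead of placing it on the command line
            "--table", table, "--target-dir", target_dir]


def sqoop_export_args(jdbc_url: str, username: str, table: str, export_dir: str) -> list:
    """Arguments for copying an HDFS directory back into a database table."""
    return ["sqoop", "export", "--connect", jdbc_url, "--username", username,
            "-P", "--table", table, "--export-dir", export_dir]


# Hypothetical names, for illustration only.
jdbc_url = sqlserver_jdbc_url("sqlserver.example.com", "sales")
import_cmd = sqoop_import_args(jdbc_url, "etl_user", "orders", "/data/landing/orders")
export_cmd = sqoop_export_args(jdbc_url, "etl_user", "order_summary", "/data/refined/order_summary")

print(" ".join(import_cmd))
print(" ".join(export_cmd))
```

The import direction matches the "data refinement and ETL offload" workload, while the export direction returns refined results to SQL Server for downstream Microsoft tools.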

[Diagram layers: APPLICATIONS; DATA SYSTEMS; DATA SOURCES]

• Microsoft Applications

• HORTONWORKS DATA PLATFORM for Windows

• Mobile data

• OLTP, POS systems

• Traditional Sources (RDBMS, OLTP, OLAP)

• New Sources (web logs, email, sensor data, social media)

Page 28: Introduction to Hortonworks Data Platform for Windows

Demo Time!

Excel integration with HDP

• Interact with HDP through Excel

• Use Data Explorer to explore and turn raw data into valuable information

Page 29: Introduction to Hortonworks Data Platform for Windows

Maximize Your Hadoop Deployment Choice

• Use HDP for Windows for on-premises deployment on Windows Server

– Ideal for Windows users with Hadoop experience

– Perfect next step for those who are ready to move from POC to production

• Use HDInsight for Microsoft tooling, management, and provisioning

– HDInsight Service offers the full benefit of Windows Azure (e.g. elasticity & low cost); available in Preview today

– HDInsight Server for full integration of Hadoop with Microsoft tools on premises; Developer Preview available today

• Full interoperability and deployment choice across platforms

– Implement big data applications that run on-premises & in the cloud

– Leveraging open source HDP enables seamless interoperability across environments: Linux, Windows, Windows Azure

Page 30: Introduction to Hortonworks Data Platform for Windows

Next Steps

Download Hortonworks Sandbox: www.hortonworks.com/sandbox

Download Hortonworks Data Platform for Windows (Beta): www.hortonworks.com/download

Follow: @hortonworks, @hortonworks_U