
Big Data and Hadoop - key drivers, ecosystem and use cases

Jan 24, 2015

Technology

Jeff Kelly

Overview of the Big Data market.
Transcript
Page 1: Big Data and Hadoop - key drivers, ecosystem and use cases

© Wikibon 2008 © Wikibon 2011 | Confidential www.wikibon.org

[[The Wikibon Project]]

Big Data and Hadoop: Key Drivers, Ecosystem and Use Cases November 2011

Page 2: Big Data and Hadoop - key drivers, ecosystem and use cases

What is Big Data?

Big Data (n.): Data sets whose size, type and/or speed make them impractical to process and analyze with traditional database technologies and related data management tools.

Page 3: Big Data and Hadoop - key drivers, ecosystem and use cases

Why is Big Data Important?

Big Data is the new definitive source of competitive advantage across industries …

… For those organizations that embrace Big Data, the possibilities for innovation, improved agility, and increased profitability are nearly endless.

Page 4: Big Data and Hadoop - key drivers, ecosystem and use cases

Three Key Big Data Drivers

1.  Volume, Variety, Velocity

2.  Hardware Commoditization

3.  Cloud Computing

Page 5: Big Data and Hadoop - key drivers, ecosystem and use cases

Characteristics of Big Data

Page 6: Big Data and Hadoop - key drivers, ecosystem and use cases

Sources of Big Data

Page 7: Big Data and Hadoop - key drivers, ecosystem and use cases

Hadoop

Open source framework for processing, storing and analyzing Big Data.

Fundamental concept: Rather than banging away at one, huge block of data with a single machine, Hadoop breaks up Big Data into multiple parts so each part can be processed and analyzed in parallel.
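
The split-and-process-in-parallel idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not Hadoop code and uses none of Hadoop's APIs, the function names (`split_into_chunks`, `count_words`, `merge_counts`, `parallel_word_count`) are hypothetical, and threads on one machine stand in for the many machines of a real cluster. Word counting is used simply because each part can be counted without looking at any other part.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, num_chunks):
    """Split a list of records into roughly equal, contiguous parts."""
    size = max(1, -(-len(records) // num_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_words(chunk):
    """Process one part on its own: count the words it contains."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(partial_counts):
    """Combine the per-part results into a single answer."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def parallel_word_count(records, num_workers=4):
    """Break the data into parts, process the parts concurrently, then merge."""
    chunks = split_into_chunks(records, num_workers)
    # Assumption: worker threads stand in for separate cluster nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    return merge_counts(partial_counts)


if __name__ == "__main__":
    sample = ["big data is big", "hadoop processes big data", "data in parallel"]
    print(parallel_word_count(sample, num_workers=2))
```

Because no part depends on another, adding workers (or, in a real cluster, machines) lets more parts be handled at the same time; only the final merge needs to see every partial result.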

Page 8: Big Data and Hadoop - key drivers, ecosystem and use cases

Hadoop: The Pros and Cons

First the pros … Hadoop is a time- and cost-effective approach to storing, processing and analyzing large volumes of unstructured data, allowing for new and unprecedented types of analytics.

Now the cons … Hadoop is complex and difficult to deploy and manage; there’s a dearth of Hadoop-savvy engineers and Data Scientists on the job market; the risk of forking and vendor lock-in remains.

Page 9: Big Data and Hadoop - key drivers, ecosystem and use cases

Hadoop: The Pros and Cons cont.

More pros … Many bright minds are contributing to Hadoop, resulting in rapid development, and an ecosystem of vendors is emerging to make Hadoop enterprise-ready.

Page 10: Big Data and Hadoop - key drivers, ecosystem and use cases

The Big Data Ecosystem

Page 11: Big Data and Hadoop - key drivers, ecosystem and use cases

Big Data Pioneers

Yahoo!

•  Largest Hadoop instance on the planet … 40,000 nodes handling 200+ PB of data.

•  Used to support research for ad systems and Web search.

•  Match ads with users, detect spam in Yahoo! Mail, pick relevant top stories.

Page 12: Big Data and Hadoop - key drivers, ecosystem and use cases

Big Data Pioneers cont.

Facebook

•  Two major clusters processing and storing over 30 PB of data.

•  Uses HDFS to store copies of internal log and dimension data.

•  Developed Hive to perform large-scale analytics on user data.

•  Using HBase to store, manage and retrieve Facebook Messenger data.

Page 13: Big Data and Hadoop - key drivers, ecosystem and use cases

Big Data Pioneers cont.

LinkedIn

•  Uses Hadoop to support the “People You May Know” feature.

•  Tailors its search engine to return most relevant results for recruiters, employers and job seekers.

•  Created a visualization tool to allow users to explore their professional network to discover hidden patterns.

Page 14: Big Data and Hadoop - key drivers, ecosystem and use cases

Big Data in Financial Services

•  Over 30,000 databases and 15,000 applications spread across 7 business units.

•  Using Hadoop as the basis of its Common Data Platform.

•  Looking to establish a 360-degree view of the customer for upsell and cross-sell opportunities.

Page 15: Big Data and Hadoop - key drivers, ecosystem and use cases

Big Data in Financial Services cont.

•  Risk management and analysis to understand financial exposure.

•  Detecting fraudulent transactions and potentially criminal activity.

•  Conducting sentiment analysis on social media data.

Page 16: Big Data and Hadoop - key drivers, ecosystem and use cases

Thank You

Jeffrey F. Kelly Principal Research Contributor

The Wikibon Project

[email protected] @jeffreyfkelly

www.wikibon.org www.siliconangle.com