Tuesday, May 1, 12

Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Aug 20, 2015

Inside Analysis
Transcript
Page 1: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Tuesday, May 1, 12

Page 2: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Twitter Tag: #briefr

Page 3: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Reveal the essential characteristics of enterprise software, good and bad

Provide a forum for detailed analysis of today’s innovative technologies

Give vendors a chance to explain their product to savvy analysts

Allow audience members to pose serious questions... and get answers!

Page 4: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

May: Analytics

June: Intelligence

July: Governance

August: Analytics

Page 5: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Ultimately, analytics is about helping businesses make optimal decisions, although the range of technologies in this area is wide: statistical analysis, data mining, process mining, predictive analytics, predictive modeling, business process modeling and complex event processing.

With the advent of big data, analytics has become “big analytics”, with organizations diving into large heaps of data that were previously unavailable or unusable.

Open source technologies (Hadoop, etc.) in conjunction with the cloud have expanded the range of what is possible and considerably reduced the price of leveraging new and often very substantial data sources.

Page 6: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Robin Bloor is Chief Analyst at The Bloor Group.

Page 7: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Pervasive Software, a provider of data integration and database software, introduced Pervasive DataRush, a parallel data flow development platform, several years ago.

Aside from marketing that capability, it has been using it to build data integration and data-flow-enabled BI products that exploit the DataRush capability.

Pervasive RushAnalyzer is one of the new parallel BI products that have been built using DataRush. It is aimed squarely at solving problems in the management and analysis of big data, and at delivering new capabilities.

Page 8: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

David Inbar is Senior Director, Pervasive Big Data Products & Solutions, leading the business and product management functions for Pervasive’s Big Data Products group. Previously he led the global marketing and international channels teams for Pervasive’s Integration Products group, as well as the company’s Innovation Lab. David has driven innovative business models and technology adoption strategies for many application development and data management products.

Jim Falgout is Chief Technologist, Pervasive Big Data Products and Solutions. In that role he is responsible for setting innovative design principles that guide Pervasive engineering teams as they develop new big data-focused releases and products. Jim is also responsible for the architectural design of a software development platform for parallel applications that deliver high throughput on big data.

Page 9: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

bigdata.pervasive.com

Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

The Briefing Room

May 1, 2012

Page 10: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

The Internet is the Fuel for the Fire

Source: IBM Corporation

Page 11: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

The Real Culprit: an Internet of Things

Source: McKinsey Global Institute report on Big Data, May 2011

Page 12: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Big Data Hotspots

Page 13: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Big Data Pain Points

Volume/Velocity

Collect: monitor, log, ingest, event capture, decrypt

Prepare: profile, match, cleanse, aggregate, audit

Analyze: sample, model, discover, visualize, predict

Consume: report, chart, dashboard, alert, closed loop

Data Scientists

Data Analysts

Business Analysts

Decision Makers

Operational Intelligence

Data Integrators

App Developers

Page 14: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Time to Insight Falling Behind Data Growth

Page 15: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Big Data Analytics Software Requirements

Additional Requirements

• Must be usable by business users and analysts

• Graphical/visual environment

• Option to extend via scripting

• Scalable and cross-platform: laptop, desktop, Hadoop cluster

Page 16: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Page 17: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

DEMO

Page 18: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Pervasive RushAnalyzer: Big Data Prep & Analytics

Page 19: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Pervasive RushAnalyzer Key Differentiators

• Comprehensive ETL and data preparation

• Analytics data scientists will love: machine learning

• Works with existing toolsets

• No cost to get started

• Scales from laptop to server to Hadoop clusters

• True distributed computing on Hadoop clusters

Page 20: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Page 21: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Page 22: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

At the moment, Big Data is often managed as “a project on the side”, isolated from the normal data flows associated with data warehousing.

This situation will not last. Either the large data heaps are ephemeral or they are here to stay. But once you start gathering data, you don’t usually stop.

If the big data heaps are here to stay, they require data flow architecture. In that sense, the Hadoop-Hive-HBase-Pig arrangement is really just a big prototype.

That data flow architecture must serve both big data analysis and traditional data warehousing.

Page 23: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Page 24: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

We not only have the challenges of big data and big data flow, we also have the problem of data pool proliferation and the opportunities provided by data mashup/discovery.

If we extrapolate from now we run into a complexity of data flows that can no longer be managed by point-to-point thinking.

In effect we get a combinatorial explosion - which dictates the need - in fact the necessity - for data flow architecture and data analysis architecture.
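
As a rough illustration of that combinatorial explosion (a sketch, not a figure from the briefing): if any data pool may need a direct feed to any other pool, the number of possible point-to-point flows grows with the square of the number of pools, whereas routing every pool through a shared data flow layer grows linearly. The function names and the single shared-layer model below are assumptions made for this example.

```python
def point_to_point_flows(pools: int) -> int:
    """Distinct pairs of pools, assuming any pool may exchange data with any other."""
    return pools * (pools - 1) // 2


def shared_layer_flows(pools: int) -> int:
    """One connection per pool to a shared data flow layer (hypothetical hub model)."""
    return pools


if __name__ == "__main__":
    # Compare the two models for a few illustrative pool counts.
    for pools in (5, 20, 100):
        print(pools, point_to_point_flows(pools), shared_layer_flows(pools))
```

Under these assumptions, 20 data pools allow 190 possible pairwise flows, against 20 connections to a shared layer.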

If it didn’t deliver value, no-one would do it.

Page 25: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

The PC revolution, the Internet revolution and the mobile revolution were all surprises, even for those who saw them coming. They all brought more data and more data distribution.

The coming embedded revolution could be characterized as “the web of intelligent things” - things that know their state, report their state, can respond to their state or can respond collectively.

Think of:

A cup that knows what’s in it

A house that knows who’s home

A car that knows how much you had to drink

Page 26: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

The Challenge is Speed and Complexity

Big Data has only just begun:

Think of current big data projects as the early spreadsheets

Data flow architecture is already an issue.

Complexity is increasing

Speed is the enabler or the barrier

Page 27: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Questions

It is not clear to me what product classification this falls under. It appears to be a data flow architecture design and implementation capability. Is that the case?

What does RushAnalyzer complement? What does it compete with?

What interfaces does it have to different data sources?

Clearly this is very fast operationally, because of the underlying parallelism. Can you give us some idea of how this compares in speed terms with, for example, a Hadoop arrangement aimed at a similar set of capabilities?

What skills are required to make best use of this capability?

Page 28: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Questions

Who have been the early adopters of this kind of capability and what kind of business problems are they trying to solve?

Which vertical business sectors have shown most interest and which have shown least interest?

Quo vadis?

Page 29: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

Page 30: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

• May: Analytics

• June: Intelligence

• July: Governance

• August: Analytics

Page 31: Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics
