Transcript
Page 1: BDIA Findings

Grab some coffee and enjoy the pre-show banter before the top of the hour!

Page 2: BDIA Findings

“Making Way For Big Data” Findings Webcast | June 25, 2014

Page 3: BDIA Findings

Featuring

Eric Kavanagh CEO, The Bloor Group

Robin Bloor Chief Analyst, The Bloor Group

Page 4: BDIA Findings

Findings Webcast June 25, 2014

Big Data Information Architecture

Roundtable Webcast April 9, 2014

Exploratory Webcast January 22, 2014

#BigDataArch

Page 5: BDIA Findings

The Sequence of Topics

The Great Disruption

Events

New Architectures for OLD

Page 6: BDIA Findings

1. The Great Disruption

Page 7: BDIA Findings

Moore’s Law Cubed

• The biggest databases are NEW databases

• They grow at the cube of Moore’s Law

• Moore’s Law = 10x every 6 years

• VLDB: 1000x every 6 years
  • 1991/2: megabytes
  • 1997/8: gigabytes
  • 2003/4: terabytes
  • 2009/10: petabytes
  • 2015/16: exabytes
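The arithmetic on this slide can be checked with a short sketch. It uses only the slide's own figures (10x every 6 years, cubed for the largest databases, starting from megabytes in 1991); the constant and function names are illustrative and not from the webcast.

```python
# Sketch of the "Moore's Law cubed" arithmetic, using the slide's figures.
MOORE_FACTOR = 10                  # slide: Moore's Law = 10x every 6 years
PERIOD_YEARS = 6
VLDB_FACTOR = MOORE_FACTOR ** 3    # cubed: 1000x every 6 years

# Each 1000x step moves the largest databases up one storage unit.
UNITS = ["megabytes", "gigabytes", "terabytes", "petabytes", "exabytes"]


def vldb_timeline(start_year: int, periods: int) -> list:
    """Return (year, unit) pairs, one unit step per 6-year period."""
    return [(start_year + i * PERIOD_YEARS, UNITS[i]) for i in range(periods)]


timeline = vldb_timeline(1991, len(UNITS))
for year, unit in timeline:
    print(year, unit)
```

Four periods of 1000x take the scale from megabytes to exabytes, which matches the 1991/2 to 2015/16 progression listed above.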

Page 8: BDIA Findings

Technology Evolution

Page 9: BDIA Findings

Observations…

• Software architectures change: centralized, C/S, 3 tier/web, SOA, etc.

• Applications migrate according to latencies

• Wholly new applications appear because of lower latencies, e.g., VMs, CEP

• THIS CURVE IS NO LONGER VALID…

Page 10: BDIA Findings

Memory is Becoming a Hierarchical Store

• On-chip speed vs. RAM
  • L1 (32 KB) = 100x
  • L2 (256 KB) = 30x
  • L3 (8-20 MB) = 8.6x

• RAM vs. SSD
  • RAM = 300x

• SSD vs. disk
  • SSD = 10x

• Disk will soon turn up its toes

Note: Vector instructions and data compression
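To see what these ratios imply end to end, the sketch below multiplies them down the hierarchy. It assumes the quoted figures are speed ratios between adjacent tiers and that they compose multiplicatively, which the slide does not state; the names are illustrative.

```python
# Speed ratios as quoted on the slide (each tier relative to the next one down).
L1_VS_RAM = 100
L2_VS_RAM = 30
L3_VS_RAM = 8.6
RAM_VS_SSD = 300
SSD_VS_DISK = 10


def speedup_over_disk(vs_ram: float) -> float:
    """Assumption: adjacent-tier ratios multiply to give an overall ratio."""
    return vs_ram * RAM_VS_SSD * SSD_VS_DISK


ram_over_disk = RAM_VS_SSD * SSD_VS_DISK        # RAM relative to disk
l1_over_disk = speedup_over_disk(L1_VS_RAM)     # L1 cache relative to disk

print("RAM vs disk:", ram_over_disk)
print("L1 vs disk:", l1_over_disk)
print("L3 vs disk (approx.):", round(speedup_over_disk(L3_VS_RAM)))
```

Under that assumption, RAM is about 3,000 times faster than disk and L1 cache about 300,000 times faster, which is the gap behind the claim that disk is on its way out.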

Page 11: BDIA Findings

Putting a SoC in IT

• It’s possible that the CPU/Memory split will vanish, possibly soon

• This requires the emergence of the commodity SoC

• There are already ARM SoCs that run Linux

• Grids of SoCs would replace grids of servers

Page 12: BDIA Findings

Parallelism: The Imp is Out of the Bottle

• Multicore chips enabled parallelism

• It has changed the whole performance equation

• It enabled Big Data

• Big Data is really Big Processing

Page 13: BDIA Findings

2. Events

Page 14: BDIA Findings

Tech Revolutions

TECH REVOLUTION       ARCHITECTURE
Computer              Batch
On-line               Centralized
PC                    Client/server
Internet              Multi-tier
Mobile                Service orientation
Internet of things    Event driven/Big Data

Page 15: BDIA Findings

Event Types

• Instantiation Event

• A State Report

• A Trigger Event

• A Correction Event

We also need to consider: Data Refinement | Aggregations | Homogeneous collections | Derived Data
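One way to make the four event types concrete is to model them as a small data structure. This is a minimal sketch: the webcast names the types but does not define a schema, so the `Event` fields, the `count_by_type` helper, and the sample sensor data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class EventType(Enum):
    """The four event types named on the slide."""
    INSTANTIATION = "instantiation event"
    STATE_REPORT = "state report"
    TRIGGER = "trigger event"
    CORRECTION = "correction event"


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; the field names are assumptions."""
    source_id: str
    event_type: EventType
    payload: dict = field(default_factory=dict)


def count_by_type(events):
    """A simple aggregation: how many events of each type were seen."""
    return Counter(event.event_type for event in events)


# Made-up sample stream for illustration only.
events = [
    Event("sensor-1", EventType.INSTANTIATION),
    Event("sensor-1", EventType.STATE_REPORT, {"temperature": 21.5}),
    Event("sensor-1", EventType.STATE_REPORT, {"temperature": 22.0}),
    Event("sensor-1", EventType.CORRECTION, {"temperature": 21.9}),
]
counts = count_by_type(events)
print(counts)
```

The count per type is a trivial instance of the "Aggregations" item above: derived data computed from a homogeneous collection of events.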

Page 16: BDIA Findings

The Traffic Cop

Page 17: BDIA Findings

The Evolution of Hadoop

• Hadoop is far too useful and popular to fade away

• YARN and Tez have changed the picture

• Hadoop will become the default scale out file system

• And a critical component of the DATA HUB

Page 18: BDIA Findings

Hadoop as a Clip-On

Page 19: BDIA Findings

3. New Architecture For Old?

Page 20: BDIA Findings

Big Data and Analytics

There MAY be some Big Data applications that are not about data analytics.

If so, nobody is talking about them…

Page 21: BDIA Findings

A Process, Not an Activity

• Data Analytics is a multi-disciplinary end-to-end process

• Until recently it was a walled garden. But the walls were torn down by:
  • Data availability
  • Scalable technology
  • Open source tools

Page 22: BDIA Findings

Data Flow (The Paradox)

Our Architectures need to cater for DATA FLOW, not data at rest

However, DO NOT MOVE THE DATA unless you absolutely have to

Page 23: BDIA Findings

The Corporate Data Flows

• There need to be two data flows (at minimum)

• Currently we can distinguish between:
  • Real-time/business-time applications
  • Analytical applications

• We will build specific architectures for each
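The split into two flows can be sketched as a simple router. This is only an illustration: the webcast does not say how records are assigned to a flow, so the `needs_immediate_action` flag, the `FlowRouter` class, and the in-memory queues are hypothetical stand-ins.

```python
from collections import deque
from enum import Enum


class Flow(Enum):
    """The two data flows distinguished on the slide."""
    REAL_TIME = "real-time/business-time"
    ANALYTICAL = "analytical"


class FlowRouter:
    """Hypothetical router: sends each record down exactly one flow."""

    def __init__(self):
        # One in-memory queue per flow, standing in for real infrastructure.
        self.queues = {flow: deque() for flow in Flow}

    def route(self, record: dict) -> Flow:
        # Assumption: a boolean flag on the record decides the flow.
        if record.get("needs_immediate_action"):
            flow = Flow.REAL_TIME
        else:
            flow = Flow.ANALYTICAL
        self.queues[flow].append(record)
        return flow


router = FlowRouter()
router.route({"id": 1, "needs_immediate_action": True})
router.route({"id": 2})
router.route({"id": 3, "needs_immediate_action": False})
print({flow.value: len(queue) for flow, queue in router.queues.items()})
```

In practice the routing rule would depend on the latency each application needs; the flag here only marks where that decision sits.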

Page 24: BDIA Findings

Data Flow

The role of Hadoop is as the STAGING AREA FOR REFINEMENT

And also as a SCALE-OUT FILE SYSTEM

Page 25: BDIA Findings

A BDIA in Overview

Think Logical, Implement Physical

Page 26: BDIA Findings

The BDIA: Two Data Flows

Page 27: BDIA Findings

Within the Data Hub

Page 28: BDIA Findings

The CRITICAL Workload Issue

• Previously, we viewed database workloads as an I/O optimization problem

• With the BDIA, the workload is a variable MIX of I/O, transformation and calculation

• No databases were built precisely for this – not even Big Data databases

Page 29: BDIA Findings

In Summary…

The Great Disruption

Events

New Architectures for OLD

Page 30: BDIA Findings

Questions?

#BigDataArch or

USE THE Q&A

Page 31: BDIA Findings

THANK YOU!

FIND ALL BDIA WEBCASTS & RESEARCH AT: http://insideanalysis.com/research/big-data-information-architecture