The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Aug 20, 2015

Inside Analysis
Transcript
Page 1: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

The Briefing Room

The Intelligent Thing—Using In-Memory for Big Data and Beyond

Page 2: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Twitter Tag: #briefr

The Briefing Room

Welcome

Host: Eric Kavanagh

[email protected]

Page 3: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Mission

• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!

Page 4: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

June: DATABASE

July: CLOUD

August: HIGH PERFORMANCE ANALYTICS

September: ANALYTICS

Page 5: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Database

Page 6: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Analyst: John O’Brien

John O’Brien is Founder and CEO of Radiant Advisors

Page 7: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Teradata

• Teradata is known for its data analytics solutions with a focus on integrated data warehousing, big data analytics and business applications
• It offers a broad suite of technology platforms and solutions; data management applications; and data mining capabilities
• Teradata Intelligent Memory is a new capability that provides automated management of data based on temperature

Page 8: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Alan Greenspan

Alan Greenspan is Product Marketing Manager for Teradata Corporation. He is responsible for product marketing for the Teradata database, key database technologies, security and performance. Alan has more than 20 years with Teradata Corporation. 

Page 9: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Alan Greenspan

TERADATA INTELLIGENT MEMORY

Page 10: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

10 Teradata Confidential

Exploiting Technology Trends

Trends
• Memory is 3,000 times faster than disk
• Memory per node is increasing
  > 96GB -> 256GB -> 512GB -> 768GB -> 1TB
• Cost of memory is decreasing

Issues
• Memory still 80x more expensive than disk
• Not all data fits into memory
• Not all data worth 80x premium

Teradata Solution
• Create a new extended memory space for most frequently accessed data
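
The cost argument on this slide can be made concrete with a little arithmetic. The sketch below is only an illustration: it takes the 80x memory premium from the slide, and it assumes (this is not stated in the deck) that all data stays on disk while the hot fraction is additionally copied into memory. The function name `relative_cost` is hypothetical.

```python
def relative_cost(hot_fraction, memory_premium=80.0):
    """Storage cost relative to an all-disk system holding the same data.

    Assumptions (not from the slides): every unit of data lives on disk
    at cost 1.0, and the hot fraction is also held in memory at
    `memory_premium` times the disk cost per unit.
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    return 1.0 + hot_fraction * memory_premium


# Compare a small hot set against keeping everything in memory.
for fraction in (0.01, 0.10, 1.00):
    cost = relative_cost(fraction)
    print(f"{fraction:.0%} of data in memory: {cost:.1f}x the all-disk cost")
```

Under these assumptions, holding 1% of the data in memory adds 0.8x the all-disk cost, while holding all of it adds 80x, which is the premium the slide says not all data is worth.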

Page 11: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Teradata Intelligent Memory
Innovative In-Memory Technology

• New extended memory space
• Improves query performance
• A smarter approach than in-memory databases
• Leverage large memory capacities in new platforms

Page 12: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

New Extended Memory Space

• Sophisticated algorithms to track usage, measure temperature, and rank data
• Complements FSG cache
• Dynamically adjusts to new query patterns

Intelligent Memory (most frequently used data)
Hottest data placed and maintained in memory, aged out as it cools
[Diagram: very hot data moves in; cool data moves out]

FSG Cache (most recently used data)
Temporarily stores data required for current queries; purges least recently used
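
The difference between a recency-based cache and a frequency-based ("temperature") tier can be sketched in a few lines. This is a minimal illustration, not Teradata's algorithm: the class names, the per-interval decay factor, and the block identifiers are all assumptions made for the example.

```python
from collections import OrderedDict, defaultdict


class LruCache:
    """Recency-based cache: evicts the least recently used block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block_id):
        if block_id in self.blocks:
            # Mark the block as most recently used.
            self.blocks.move_to_end(block_id)
        else:
            self.blocks[block_id] = True
            if len(self.blocks) > self.capacity:
                # Drop the block that was used longest ago.
                self.blocks.popitem(last=False)


class TemperatureTier:
    """Frequency-based tier: keeps the blocks with the highest temperature.

    Temperature here is an access count that is multiplied by `decay`
    at the end of each interval (a hypothetical cooling rule).
    """

    def __init__(self, capacity, decay=0.5):
        self.capacity = capacity
        self.decay = decay
        self.temperature = defaultdict(float)

    def access(self, block_id):
        self.temperature[block_id] += 1.0

    def end_interval(self):
        # Cool every block so data that is no longer queried ages out.
        for block_id in self.temperature:
            self.temperature[block_id] *= self.decay

    def resident(self):
        # Rank by temperature (ties broken by id) and keep the hottest.
        ranked = sorted(self.temperature, key=lambda b: (-self.temperature[b], b))
        return set(ranked[: self.capacity])


# Two blocks are queried repeatedly, then a one-off scan touches four others.
lru = LruCache(capacity=2)
tier = TemperatureTier(capacity=2)
for block in ["A", "B"] * 5 + ["C", "D", "E", "F"]:
    lru.access(block)
    tier.access(block)

print("LRU cache holds:", sorted(lru.blocks))
print("Temperature tier holds:", sorted(tier.resident()))
```

In this trace the one-off scan pushes the frequently used blocks out of the recency-based cache, while the frequency-based tier still holds them. That is one way to read the slide's claim that the two memory areas complement each other.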

Page 13: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Improves Query Performance
Performance of in-memory databases without their cost

• 1% of data satisfies 43% of query activity
• Hottest data in memory, not all the data
• Integrated into Teradata system
• No need for separate appliance

Page 14: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

A Smarter Approach than In-Memory Databases
Extend multi-temperature data management to memory

• Automatic
• Transparent
• No DBA effort
• No SQL changes
• Maintain user access to ALL data for analysis

Page 15: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Leverage Large Memory Capacities in New Platforms

• Memory Capacities Growing Exponentially
• Traditional Cache Reaches Diminishing Returns
• Data is stored compressed and in columns and rows
• Created extended memory space beyond cache
• Use it in a new innovative way

Page 16: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Teradata Workload-Specific Platforms

Data Mart Appliance: 670
Extreme Data Appliance: Future
Data Warehouse Appliance: 2700
Active Enterprise Data Warehouse: 6700

Teradata Intelligent Memory

Page 17: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Configuration Requirements

• All Members of the Workload-Specific Platform Family
• Minimum Memory Requirements
  > Recent models only
  > May require memory upgrade
• Requires Teradata Database 14.10
• Teradata Virtual Storage is not required

Model                                  Memory/Node  FSG Cache + I.M./AMP
Active Enterprise Data Warehouse 6700  512GB        8GB
Data Warehouse Appliance 2700          256GB        5GB
Data Mart Appliance 670H               256GB        TBD
Extreme Data Appliance Future          TBD          TBD

Page 18: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Teradata Intelligent Memory and UDA

• Teradata SQL-H allows Hadoop data to take advantage of Teradata Intelligent Memory
• Hadoop data that is persisted in Teradata and becomes very hot will dynamically move into Teradata Intelligent Memory

Page 19: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Intelligent Memory vs In-Memory Databases

                          Teradata Intelligent Memory  In-Memory Databases
All data in memory        Wrong goal                   Small data sets
Big data per node         Yes                          No
Columnar                  Yes + rows                   Yes
Memory-speed performance  Yes                          Yes
Compression               Yes                          Yes
Recovery snapshot         Yes                          Yes
SSD/HDD logging           Yes                          Yes
Indexes, aggregates       Yes                          No
Large node memories       Yes                          Yes

Page 20: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Reload on Reboot

Candidate VH (very hot) cylinders and their temperatures:

Cylinder  Temp
Cyl 56    100
Cyl 21    100
Cyl 22    99
Cyl 88    99
Cyl 42    98
Cyl 66    95

Intelligent memory
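
One way to picture "reload on reboot" is as a ranked candidate list that is replayed into memory after a restart. The sketch below is an assumption about the general idea, not a description of Teradata's implementation: the function name, the capacity parameter, and the rule of filling memory hottest-first are all hypothetical. The cylinder numbers and temperatures are the ones shown on the slide.

```python
# (cylinder id, temperature) pairs as listed on the slide.
CANDIDATES = [(56, 100), (21, 100), (22, 99), (88, 99), (42, 98), (66, 95)]


def cylinders_to_reload(candidates, capacity):
    """Return the ids of the hottest cylinders, up to `capacity` of them.

    Python's sort is stable, including with reverse=True, so cylinders
    with equal temperature keep their order from the candidate list.
    """
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [cylinder for cylinder, _temperature in ranked[:capacity]]


# Suppose memory has room for four cylinders after the restart.
print(cylinders_to_reload(CANDIDATES, capacity=4))
```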

Page 21: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Not Every Workload is IO Bound

• In-memory expectations
  > All “my queries” are faster
• Business value
  > Majority of queries are faster
  > Improved response time
• Intelligent Memory won’t help
  > CPU constrained queries
  > Deep history queries
    – Very Hot + cold data joins
  > 1-3 second queries
  > Data loading
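
Why a faster memory tier helps some queries far more than others can be shown with a simple elapsed-time model. This is a rough sketch under stated assumptions: CPU time and I/O time are treated as additive with no overlap, a memory hit is taken to cost 1/3000 of a disk read (the ratio quoted earlier in the deck), and the example workloads are invented for illustration.

```python
def query_time(cpu_seconds, io_seconds, memory_hit_fraction, memory_speedup=3000.0):
    """Estimated elapsed time when part of the I/O is served from memory.

    Assumes CPU and I/O simply add up, and that the fraction of I/O
    served from memory runs `memory_speedup` times faster than disk.
    """
    if not 0.0 <= memory_hit_fraction <= 1.0:
        raise ValueError("memory_hit_fraction must be between 0 and 1")
    fast_io = io_seconds * memory_hit_fraction / memory_speedup
    slow_io = io_seconds * (1.0 - memory_hit_fraction)
    return cpu_seconds + fast_io + slow_io


def speedup(cpu_seconds, io_seconds, memory_hit_fraction):
    """Ratio of the all-disk elapsed time to the elapsed time with memory hits."""
    before = cpu_seconds + io_seconds
    after = query_time(cpu_seconds, io_seconds, memory_hit_fraction)
    return before / after


# Hypothetical workloads: (CPU seconds, I/O seconds, share of I/O that is hot).
workloads = {
    "I/O bound, mostly hot data": (1.0, 9.0, 0.9),
    "CPU constrained": (9.0, 1.0, 0.9),
    "deep history (hot + cold join)": (1.0, 9.0, 0.1),
}
for name, (cpu, io, hot) in workloads.items():
    print(f"{name}: {speedup(cpu, io, hot):.2f}x faster")
```

In this model the I/O-bound query with mostly hot data speeds up by more than 5x, while the CPU-constrained query and the join that still reads mostly cold data each improve by less than 10%. That matches the slide's list of cases where Intelligent Memory won't help.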

Page 22: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Teradata Intelligent Memory Sample Quotes
May 2013 Coverage Report

Teradata takes on SAP's HANA with in-memory technologies push
"Teradata Intelligent Memory technology is built into the data warehouse and customers don't have to buy a separate appliance

Teradata gets into the in-memory biz to take on SAP’s HANA
Data analytics veteran Teradata will not let the new era of data-analysis architectures pass it by without a fight. It has already built products to address massive data volumes and Hadoop

Teradata boosts DRAM on appliances for in-memory queries
You don't need no stinkin' HANA or Exalytics

Teradata enters the in-memory fray, intelligently
Teradata Intelligent Memory combines RAM and disk for high-performance Big Data without the extreme requirement of exclusive in-memory operation

Teradata Extends In-Memory Computing Reach
Teradata Intelligent Memory, an approach to in-memory computing that allows the workloads running on a Teradata database appliance to make use of extended memory.

Teradata Leverages In-Memory Technology For Big Data
Teradata (TDC) introduced Intelligent Memory, a new database technology that creates extended memory space

Page 23: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Page 24: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Perceptions & Questions

Analyst: John O’Brien

Page 25: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

© Copyright 2013 Radiant Advisors. All Rights Reserved

REDEFINING HOT AND COLD DATA

Inside Analysis – Teradata Intelligent Memory System
June 11, 2013
John O’Brien | Principal and Founder, Radiant Advisors
@obrienjw @radiantadvisors [email protected]

Page 26: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

ILM PRINCIPLE
Redefining Hot and Cold Data

Information Management Lifecycle: “Storage is optimized when the value of information is persisted on the corresponding storage cost.”

By using the age of the information, users can define its value as hot, warm, or cold temperatures, then leverage corresponding tiers of data storage…

Page 27: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

PREVIOUS DATA AGING STORAGE TIERS
Redefining Hot and Cold Data

Information Lifecycle Challenges:
• Requires business usage definition to script migration
• Different business data may have different aging policies
• Not all policies are time based (status based)
• Marking read-only, backups
• Isolate data, partition-based

- Very operational oriented -

Try to analyze 3 years of business activity by demographic, products, or locations (hits weakest link storage tier)

[Diagram: Database Server (SMP) attached to Storage Sub-Systems. Defining and Managing Individual Data Record Policies for each tier]

Media                        Connectivity  Capacity  Data age    Usage         Cost
Fast Disks (SSD)             Fast          Smaller   1-30 days   This Month    $$$$$
Medium Disks (15,000 rpm)    Fast          Medium    30-90 days  This Quarter  $$$$
Medium Disks (7,200 rpm)     Slow          Medium    3-12 mos.   This Year     $$$
Slow/Bulk Disks (5,400 rpm)  Slow          High      12-24 mos.  Yr-over-Yr    $$
Tape                         Slow          Highest   2+ years    compliance    $
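
The age-based policy this slide critiques can be sketched as a lookup from record age to an aging band. This is a minimal illustration that assumes the five age bands shown on the slide and converts "12 mos." and "24 mos." to 365 and 730 days. The function names are hypothetical.

```python
# Upper bound in days for each aging band; the labels come from the slide,
# the day counts for the month-based bands are an assumed conversion.
AGE_BANDS = [
    (30, "1-30 days"),
    (90, "30-90 days"),
    (365, "3-12 mos."),
    (730, "12-24 mos."),
]
OLDEST_BAND = "2+ years"


def band_for_age(age_days):
    """Return the aging band (and so the storage tier) for a record's age."""
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    for upper_bound_days, band in AGE_BANDS:
        if age_days <= upper_bound_days:
            return band
    return OLDEST_BAND


def bands_touched(oldest_age_days):
    """List every band a query reads when it scans back `oldest_age_days`."""
    touched = []
    for age in range(oldest_age_days + 1):
        band = band_for_age(age)
        if band not in touched:
            touched.append(band)
    return touched


# A three-year analysis reads from every band, including the slowest tier.
print(bands_touched(3 * 365))
```

Because the three-year query touches all five bands, its elapsed time is governed by the slowest tier involved. This is the "weakest link" problem the slide describes.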

Page 28: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

MPP SOLVES ANALYTIC WORKLOADS
Redefining Hot and Cold Data

MPP Multi-tier challenges:
• Still requires business usage definition to script data migration
• Partition key setting and data skewness

Parallelism overcomes weakest link partition isolation
Does the age of a record correspond to its value in analytics?

[Diagram: Database Server (MPP) with a Node Array. Defining and Managing Individual Data Record Policies for each MPP tier]

Nodes                                          Disks                                Data age            Usage
Fast Nodes, Small Capacity, More CPU/Memory    Fast Disks (Solid State Disks)       1-30 days           This Month
Medium Nodes, Medium Capacity, Avg CPU/Memory  Medium Disks (1s terabytes per node) 1 – 12 mos.         This Year
Bulk Nodes, High Capacity, Low CPU/Memory      Slow Disks (10s Terabytes per node)  1 - n years online  History

Page 29: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

OPTIMIZING THE MPP PLATFORM
Redefining Hot and Cold Data

Intelligent Memory:
• Determines value of data by its usage in the business via activity metrics and algorithms
• Automatically and transparently moves data to the appropriate tier
• Bi-directional data movement as data heats up or cools off
• Loaded data can start hot and cool off

MPP to overcome partitioning
Usage to govern storage tiers

[Diagram: Database Server (MPP) with a Node Array. Intelligent Memory management based on business usage]

Nodes                                          Disks                                Temperature  Usage
Fast Nodes, Small Capacity, More CPU/Memory    Fast Disks (Solid State Disks)       Hot          Most Often
Medium Nodes, Medium Capacity, Avg CPU/Memory  Medium Disks (1s terabytes per node) Warm         Occasional Use
Bulk Nodes, High Capacity, Low CPU/Memory      Slow Disks (10s Terabytes per node)  Cold         Rarely Used and Available
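
Usage-governed tiering, as opposed to age-governed tiering, can be sketched by ranking data sets on observed activity and comparing two snapshots. The code below is only an illustration of the idea: the data set names, access counts, slot sizes, and function names are hypothetical, and a plain access count stands in for whatever activity metrics the real system uses.

```python
def assign_temperatures(access_counts, hot_slots, warm_slots):
    """Label each data set hot, warm, or cold by its rank in access counts."""
    ranked = sorted(access_counts, key=lambda name: (-access_counts[name], name))
    temperatures = {}
    for position, name in enumerate(ranked):
        if position < hot_slots:
            temperatures[name] = "hot"
        elif position < hot_slots + warm_slots:
            temperatures[name] = "warm"
        else:
            temperatures[name] = "cold"
    return temperatures


def movements(before, after):
    """Return the data sets whose tier changed, as (old tier, new tier)."""
    return {
        name: (before[name], after[name])
        for name in after
        if name in before and before[name] != after[name]
    }


# Hypothetical activity: an old data set suddenly becomes popular.
last_week = {"orders_2013": 900, "orders_2012": 40, "orders_2009": 2}
this_week = {"orders_2013": 300, "orders_2012": 5, "orders_2009": 700}

before = assign_temperatures(last_week, hot_slots=1, warm_slots=1)
after = assign_temperatures(this_week, hot_slots=1, warm_slots=1)
print(movements(before, after))
```

Here the oldest data set moves from cold to hot because it is being used, not because of its age, while the newest one cools from hot to warm. Movement happens in both directions, as the slide describes.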

Page 30: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

THE NEW PARADIGM FOR ANALYTICS
Redefining Hot and Cold Data

By using the age of the information, users can define its value as hot, warm or cold temperatures matching corresponding tiers of storage…

Which metadata represents analytic value?

By monitoring a BI system’s analytic usage, the system can define the data’s analytic value as hot, warm, or cold temperatures and then transparently persist data intelligently

Page 31: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

THANK YOU!

For more information

www.RadiantAdvisors.com

Twitter: @RadiantAdvisors #ModernBI #RediscoveringBI

RSS: feed://radiantadvisors.com/feed/ Email us at: [email protected]

LinkedIn: www.linkedin.com/company/radiant-advisors

Subscribe: Rediscovering BI monthly newsletter www.radiantadvisors.com.rediscoveringbi

Page 32: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

QUESTIONS

• If Teradata Intelligent Memory can optimize a BI system’s storage persistence, how do you know what percentage of each storage tier to configure beforehand? Is it simply an economic decision at that point (the most memory and fast disk that I can afford)?
• For the secret-sauce algorithms being used in the IOPS monitoring by TIM, generally how fast do data sets “warm up” or “cool off” with usage?
• If I can anticipate high usage for a given data set on an upcoming Monday morning event, is there a way to bypass warming up and designate the data as hot?
• What are the boundaries for TIM optimization within Teradata Aster and are there future plans for expansion and enhancements?

Page 33: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Page 34: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Upcoming Topics

July: CLOUD
August: HIGH PERFORMANCE ANALYTICS
September: ANALYTICS

www.insideanalysis.com

Page 35: The Intelligent Thing -- Using In-Memory for Big Data and Beyond

Thank You for Your Attention