Uri Cohen Head of Product @ GigaSpaces @uri1803 github.com/uric In-Memory Data Grids, Demystified
In Memory Data Grids, Demystified!

Jan 15, 2015

Uri Cohen

The principles and foundations of in memory data grids
Transcript
Page 1: In Memory Data Grids, Demystified!

Uri Cohen, Head of Product @ GigaSpaces, @uri1803, github.com/uric

In-Memory Data Grids, Demystified

Page 2: In Memory Data Grids, Demystified!

Agenda

• Why IMDG?
• Brief History
• How It Works
  – Data model & placement
  – HA and fault tolerance
  – Consistency
  – Internals

Page 3: In Memory Data Grids, Demystified!

Why IMDG?

Today, more than ever, there are many choices when it comes to storing your data

Page 4: In Memory Data Grids, Demystified!

® Copyright 2011 Gigaspaces Ltd. All Rights Reserved

But There Are Many Solutions

Page 5: In Memory Data Grids, Demystified!

Just A Few Years Back

Page 6: In Memory Data Grids, Demystified!

So Why Indeed??

Page 7: In Memory Data Grids, Demystified!

The Need for Speed, In Real Time…

Page 8: In Memory Data Grids, Demystified!

Some Facts

Page 9: In Memory Data Grids, Demystified!

Memory will always be faster than disk (usually by orders of magnitude)

Page 10: In Memory Data Grids, Demystified!

Recent Survey

Page 11: In Memory Data Grids, Demystified!

67%

The share of IT managers who think that real-time analysis is the biggest challenge for big data implementations

Page 12: In Memory Data Grids, Demystified!

40%

• Plan to use in-memory technologies for big data projects
• Only 32% mentioned Hadoop

Page 13: In Memory Data Grids, Demystified!

Stream Processing

Page 14: In Memory Data Grids, Demystified!

Hell, Even Gartner Thinks So

“In memory computing (IMC) … provides transformational opportunities. The execution of certain types of hours-long batch processes can be squeezed into minutes or even seconds … Millions of events can be scanned in a matter of a few tens of milliseconds to detect correlations and patterns pointing at emerging opportunities and threats "as things happen."”

Page 15: In Memory Data Grids, Demystified!

And nowadays, HW and SW just make it a whole lot cheaper

Page 16: In Memory Data Grids, Demystified!

Some Common Use Cases

Page 17: In Memory Data Grids, Demystified!

Fast, Transactional Data Access

• Inventory management
• Financial reference data
• Real-time transactional data

Page 18: In Memory Data Grids, Demystified!

Real Time Stream Processing

• Fraud detection
• Click stream analysis
• Real-time analytics
• Continuous calculation

Page 19: In Memory Data Grids, Demystified!

Heavyweight Offline Calculations

• Trade reconciliation
• Pattern analysis and detection
• Number crunching

Page 20: In Memory Data Grids, Demystified!

Caching

• Database offloading
• Content-heavy websites

Page 21: In Memory Data Grids, Demystified!

The Evolution of Data Grids

Page 22: In Memory Data Grids, Demystified!

First There Were Local Caches

Cache (in-process caching of a Key->Value data structure) → Distributed Cache (partitioned cache nodes) → IMDG (partitioned system of record) → IMDG.next()

• Good for repetitive data reads
• Limited in capacity
• Doesn’t handle write-heavy scenarios
• Reads are only part of the latency path

Page 23: In Memory Data Grids, Demystified!

Then Came Distributed Caches

Cache (in-process caching of a Key->Value data structure) → Distributed Cache (partitioned cache nodes) → IMDG (partitioned system of record) → IMDG.next()

• Increased capacity
• Still no support for write-heavy scenarios
• Limited to ID-based reads
• Reads are only part of the latency path

Page 24: In Memory Data Grids, Demystified!

In Memory Data Grids

Cache (in-process caching of a Key->Value data structure) → Distributed Cache (partitioned cache nodes) → IMDG (partitioned system of record) → IMDG.next()

• Increased capacity
• Write scalability
• Can serve as system of record with querying & transaction semantics
• Still limited in capacity
• Latency can come from other parts of your app

Page 25: In Memory Data Grids, Demystified!

How It Works

Page 26: In Memory Data Grids, Demystified!

Data Models

Page 27: In Memory Data Grids, Demystified!

Data Placement – Fixed Hashing

hash(key) % #nodes
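
The placement rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not any product's implementation: the function name `node_for` and the sample keys are made up for the example, and CRC32 stands in for whatever hash function a real grid would use.

```python
import zlib


def node_for(key: str, num_nodes: int) -> int:
    """Return the index of the node that owns `key` under fixed hashing."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    # CRC32 stands in for hash(): Python's built-in hash() of a str is
    # randomized per process, so it would not give a stable placement.
    return zlib.crc32(key.encode("utf-8")) % num_nodes


# Hypothetical keys, used only to show what happens when the cluster grows.
keys = [f"user:{i}" for i in range(1000)]
before = {k: node_for(k, 4) for k in keys}   # 4-node cluster
after = {k: node_for(k, 5) for k in keys}    # same keys after adding a node
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys changed owner")
```

Every client that knows the node count can compute the owner locally, with no lookup service. The trade-off is that changing the node count changes the modulus, so many keys map to a different node afterwards.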

Page 28: In Memory Data Grids, Demystified!

Fixed Hashing - HA

hash(key) % #nodes
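
One way to add high availability on top of fixed hashing is to keep a backup copy of each partition on a second node. The sketch below assumes the backup lives on the next node in index order; the slide does not say how backups are placed, so that rule, the names `placement` and `read_node`, and the sample key are all illustrative assumptions.

```python
import zlib
from typing import Set, Tuple


def placement(key: str, num_nodes: int) -> Tuple[int, int]:
    """Return (primary, backup) node indices for `key`."""
    if num_nodes < 2:
        raise ValueError("need at least 2 nodes to hold a backup copy")
    primary = zlib.crc32(key.encode("utf-8")) % num_nodes
    # Assumption for this sketch: the backup sits on the next node.
    backup = (primary + 1) % num_nodes
    return primary, backup


def read_node(key: str, num_nodes: int, failed: Set[int]) -> int:
    """Pick the node to serve `key`, failing over to the backup if needed."""
    primary, backup = placement(key, num_nodes)
    if primary not in failed:
        return primary
    if backup not in failed:
        return backup  # the backup takes over for the failed primary
    raise RuntimeError(f"both copies of {key!r} are unavailable")


primary, backup = placement("order:42", 4)
print(read_node("order:42", 4, failed=set()) == primary)
print(read_node("order:42", 4, failed={primary}) == backup)
```

Because the primary and backup are always on different nodes, a single node failure never loses both copies of a key.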

Page 35: In Memory Data Grids, Demystified!

Data Consistency

Since we’re dealing with distributed data, consistency cannot be taken for granted:
• Read after write
• Read after read
• Write-write consistency

Page 36: In Memory Data Grids, Demystified!

Solution 1: Single Master

Page 37: In Memory Data Grids, Demystified!

Solution 2: Read/Write Quorums
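
The quorum idea can be sketched as a toy in-process model. With N replicas, a write must reach W of them and a read must consult R of them; choosing R + W > N means every read set overlaps every write set, so a read always sees the latest acknowledged write. The class `QuorumStore`, its method signatures, and the explicit version numbers are assumptions made for this example, not an API from the talk.

```python
from typing import Any, Dict, List, Optional, Tuple


class QuorumStore:
    """Toy model of N replicas with write quorum W and read quorum R."""

    def __init__(self, n: int, w: int, r: int) -> None:
        if r + w <= n:
            raise ValueError("need R + W > N so read and write sets overlap")
        # Each replica maps key -> (version, value).
        self.replicas: List[Dict[str, Tuple[int, Any]]] = [{} for _ in range(n)]
        self.w, self.r = w, r

    def write(self, key: str, value: Any, version: int, targets: List[int]) -> None:
        """Store the value on the replicas in `targets` (those that acked)."""
        if len(set(targets)) < self.w:
            raise RuntimeError("write quorum not reached")
        for i in set(targets):
            self.replicas[i][key] = (version, value)

    def read(self, key: str, targets: List[int]) -> Optional[Any]:
        """Ask the replicas in `targets` and return the newest value seen."""
        if len(set(targets)) < self.r:
            raise RuntimeError("read quorum not reached")
        found = [self.replicas[i][key] for i in set(targets) if key in self.replicas[i]]
        if not found:
            return None
        return max(found, key=lambda versioned: versioned[0])[1]


store = QuorumStore(n=3, w=2, r=2)
store.write("k", "v1", version=1, targets=[0, 1, 2])
store.write("k", "v2", version=2, targets=[0, 1])   # replica 2 missed this write
print(store.read("k", targets=[1, 2]))              # overlaps the write set at replica 1
```

In the usage above, replica 2 still holds the stale value, but any two replicas include at least one that received the second write, and the version number lets the reader pick the newer value.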

Page 38: In Memory Data Grids, Demystified!

Some More Concerns

• Transactions
• Querying
• Failure detection
• Leader election
• Persistency
• Interoperability

Page 39: In Memory Data Grids, Demystified!

IMDG.next()

Using IMDG for messaging, BL

Page 40: In Memory Data Grids, Demystified!

IMDG.next()

SSD FTW!

Page 41: In Memory Data Grids, Demystified!

Thank You!

docs.gigaspaces.com