
Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

May 15, 2015


In Memory Data Grids in Action with Oracle Coherence, presented to NoSQL users.
The "transactions" chapter is missing as it has been rescheduled to another session.
Transcript
Page 1: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

In Memory Data Grid in Action with Oracle Coherence for Paris NoSQL User Group

Cyrille Le Clerc

Transactions chapter will be presented during another session

Wednesday, May 25, 2011

Page 2: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Speaker

Cyrille Le Clerc

@cyrilleleclerc

blog.xebia.fr

Open Source (Apache CXF, ...)

In Memory Data Grid

Large Scale

“you build it, you run it”


Page 3: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Once upon a time...

Page 4: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

On the Financial side

- Released Coherence in 2001
- Started as a distributed cache

- Released Gigaspaces XAP in 2001
- Started as a data grid

Needs within financial market:

• Very low latency
• Rich queries & transactions
• Scalability
• Data consistency

Page 5: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Let’s define an In Memory Data Grid ...

Page 6: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Let’s define an In Memory Data Grid

eXtreme Scale

This is an In Memory Data Grid


Page 7: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Let’s define an In Memory Data Grid

This is Network Attached Memory


Page 8: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Let’s define an In Memory Data Grid

Similarities with document-oriented NoSQL
Partitioned, distributed hashtable, schema-less, value is not opaque, scale-out scalability

Very fast
In memory (persistence coming), business logic inside the data

Consistent and Available
Transactional, redundant

Written in Java, data are POJOs
Not necessarily

Clients in Java, Microsoft, etc.

Page 9: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Use cases for this presentation

Page 10: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Train Booking System

trains, stations, seats, booking and passengers


Page 11: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

eCommerce Web Site

[Figure: warehouse stocks & customers shopping carts]

Warehouse stock entry:
{ "name": "Barbie Computer", "stock": 637, "weight": 200 }

Customers shopping carts:
- canon-eos: 1, ipod: 1, headphone: 1, iphone: 1, ...
- ipad: 1, iphone: 1
- barbie: 1, iphone: 1, cabbage-doll: 1

Page 12: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

In Memory Data Grids Key Principles

Page 13: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Store Everything in a Mainframe !

IBM z11 (http://ibm.com/)
- 3 TB of RAM
- 80 x 5.2 GHz cores
- Much more than $1,000,000

Page 14: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Spread on Inexpensive Servers

Mainframe (http://ibm.com/)

Cheap Servers! (http://1userverrack.net/)

Page 15: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Partition Data

Partition for scalability

Mainframe -> small servers:
- Partition alpha
- Partition beta
- Partition gamma
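To make the partitioning idea concrete, here is a minimal sketch in plain Java. It is not the Coherence or eXtreme Scale implementation: the class `PartitionedStore`, its methods and the hash-modulo rule are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a key/value store split into a fixed number of partitions. */
class PartitionedStore {
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedStore(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    /** Assumed routing rule: the same key always maps to the same partition. */
    int partitionOf(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    void put(String key, String value) {
        partitions.get(partitionOf(key)).put(key, value);
    }

    String get(String key) {
        return partitions.get(partitionOf(key)).get(key);
    }

    int sizeOfPartition(int index) {
        return partitions.get(index).size();
    }
}
```

Each partition could live on a different small server; adding servers then spreads the partitions rather than growing a single machine.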

Page 16: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Duplicate Data

Duplicate data for high availability

Partition alpha: Master -> Standby Backup, synchronous synchronization
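The master / standby backup pair can be sketched as follows. This is a single-JVM simulation under stated assumptions: `ReplicatedPartition` and `failMaster` are hypothetical names, and a real grid would send the backup write over the network before acknowledging.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of one partition with a master copy and a standby backup. */
class ReplicatedPartition {
    private Map<String, Integer> master = new HashMap<>();
    private Map<String, Integer> backup = new HashMap<>();

    /** Synchronous replication: the write returns only after both copies are updated. */
    synchronized void put(String key, int value) {
        master.put(key, value);
        backup.put(key, value);
    }

    synchronized Integer get(String key) {
        return master.get(key);
    }

    /** Simulates losing the master: the backup is promoted and a fresh backup is rebuilt. */
    synchronized void failMaster() {
        master = backup;
        backup = new HashMap<>(master);
    }
}
```

Because the backup is written before `put` returns, promoting it after a failure loses no acknowledged write, at the price of a slower write path.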

Page 17: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Access Patterns

Page 18: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Access Patterns

This is not traditional Java EE coding style!
Change management challenge!

Can apply very complex business logic inside the data
Stored Procedures Style

Page 19: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Pattern: Targeted Operation

Page 20: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Pattern: Targeted Operation

Book Train Tickets

{ "train-id": "tgv-3071-20110512", "time": "2011/05/12 12:15", "departure": "Paris", "arrival": "Marseille", "seats": 3 }

“train-id” is indexed: the booking is routed to the single partition that owns this train.

- Partition alpha (Search Trains)
- Partition beta (Search Trains)
- Partition gamma (Search Trains)
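A targeted operation can be sketched like this in plain Java. The class `TargetedBooking`, the `partitionsVisited` counter and the hash-based routing are assumptions for illustration, not a vendor API; the point is that the booking logic runs on one partition only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a booking is executed only on the partition owning the train. */
class TargetedBooking {
    /** Each partition maps a train id to its number of free seats. */
    private final List<Map<String, Integer>> partitions = new ArrayList<>();
    /** Counts how many partitions were touched, to show the operation is targeted. */
    int partitionsVisited = 0;

    TargetedBooking(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    private Map<String, Integer> owner(String trainId) {
        partitionsVisited++;
        return partitions.get(Math.floorMod(trainId.hashCode(), partitions.size()));
    }

    void addTrain(String trainId, int freeSeats) {
        owner(trainId).put(trainId, freeSeats);
    }

    /** Runs the booking logic next to the data; returns false if not enough seats. */
    boolean book(String trainId, int seats) {
        Map<String, Integer> partition = owner(trainId);
        Integer free = partition.get(trainId);
        if (free == null || free < seats) {
            return false;
        }
        partition.put(trainId, free - seats);
        return true;
    }

    int freeSeats(String trainId) {
        return owner(trainId).getOrDefault(trainId, 0);
    }
}
```

Booking 3 seats on "tgv-3071-20110512" visits exactly one partition, whatever the number of partitions in the grid.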

Page 21: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Pattern: Map Reduce Style Operation

Page 22: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Pattern: Map Reduce

Distributed “Search Train Ticket”

{ "departure": "Paris", "arrival": "Marseille", "time": "2011/05/12 12:00", "seats": 3 }

The query is sent to every partition:
- Partition alpha (Search Trains)
- Partition beta (Search Trains)
- Partition gamma (Search Trains)

Page 23: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Pattern: Map Reduce

Distributed “Search Train Ticket”

Each partition (alpha, beta, gamma) runs “Search Trains” and returns its partial result:
{ "Paris -> Marseille : 12:15", "Paris -> Marseille : 13:15" }
{ #NONE# }
{ "Paris -> Lyon -> Marseille : 12:40" }

Page 24: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Pattern: Map Reduce

Distributed “Search Train Ticket”

The partial results of partitions alpha, beta and gamma are merged:
{ "Paris -> Marseille : 12:15", "Paris -> Lyon -> Marseille : 12:40", "Paris -> Marseille : 13:15" }
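The three steps above (send the query everywhere, search locally, merge) can be sketched in plain Java. `DistributedSearch` and the "from -> ... -> to : hh:mm" string format are assumptions taken from the slide examples; a real grid would run `searchPartition` in parallel inside each partition.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a scatter/gather search over several partitions. */
class DistributedSearch {
    /** Map step: one partition scans its own itineraries ("from -> ... -> to : hh:mm"). */
    static List<String> searchPartition(List<String> itineraries, String from, String to) {
        List<String> matches = new ArrayList<>();
        for (String itinerary : itineraries) {
            // Assumes every itinerary contains the " : " separator before the time.
            String route = itinerary.substring(0, itinerary.lastIndexOf(" : "));
            if (route.startsWith(from + " ->") && route.endsWith("-> " + to)) {
                matches.add(itinerary);
            }
        }
        return matches;
    }

    /** Reduce step: merge the partial results and sort them by departure time. */
    static List<String> search(List<List<String>> partitions, String from, String to) {
        List<String> merged = new ArrayList<>();
        for (List<String> partition : partitions) {
            merged.addAll(searchPartition(partition, from, to));
        }
        merged.sort((a, b) -> departureTime(a).compareTo(departureTime(b)));
        return merged;
    }

    private static String departureTime(String itinerary) {
        return itinerary.substring(itinerary.lastIndexOf(" : ") + 3);
    }
}
```

Note that every partition scans all of its entries here, which is exactly the "distributed table scan" cost the deck warns about.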

Page 25: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Access Patterns

This is not traditional Java EE coding style
Change management

Don’t forget “Map Reduce” = “Distributed Table Scan”
Use Indexes

Page 26: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

CAP Theorem & In Memory Data Grids

Page 27: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

CAP Theorem and In Memory Data Grid

Consistency
Availability
Partition Tolerance

Only 2 of these 3 properties can be achieved at any given moment in time

Brewer’s Conjecture
http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf

Page 28: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

CAP Theorem and In Memory Data Grid

Consistency
Availability
Partition Tolerance

Data Grids

Page 29: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Cross Data Center Data Consistency

Worldwide replication for financial market

Tokyo, New York, London

Page 30: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Cross Data Center Data Consistency

Warehouse stocks

West Coast: { "name": "Barbie Computer", "stock": 147, "weight": 200 }
East Coast: { "name": "Barbie Computer", "stock": 147, "weight": 200 }

Page 31: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Cross Data Center Data Consistency

West Coast: { "name": "Barbie Computer", "stock": 147, "weight": 200 }
East Coast: { "name": "Barbie Computer", "stock": 147, "weight": 200 }

One site applies “set stock to 146”: propagation delay before the other site sees it!

Page 32: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Cross Data Center Data Consistency

West Coast: { "name": "Barbie Computer", "stock": 147, "weight": 200 }
East Coast: { "name": "Barbie Computer", "stock": 147, "weight": 200 }

One site applies “set stock to 146” while the other applies “set weight 175”: reconciliation API needed!

Page 33: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Cross Data Center Data Consistency

West Coast: { "name": "Barbie Computer", "stock": 147, "weight": 200 }
East Coast: { "name": "Barbie Computer", "stock": 147, "weight": 200 }

“set stock to 146” on one site, “set weight 175” on the other

Network partitioning
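What a reconciliation API might do with the two concurrent updates can be sketched as a field-level three-way merge. This is only one possible strategy, not the API of any product: `Reconciler`, the `base` snapshot and the "fail on conflict" rule are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of a field-level reconciliation between two data centers. */
class Reconciler {
    /**
     * Three-way merge: 'base' is the last state both sites agreed on.
     * A field changed on one site only is accepted; a field changed to
     * different values on both sites is a conflict and raises an exception.
     * Only the fields present in 'base' are considered.
     */
    static Map<String, Object> reconcile(Map<String, Object> base,
                                         Map<String, Object> west,
                                         Map<String, Object> east) {
        Map<String, Object> merged = new HashMap<>(base);
        for (String field : base.keySet()) {
            Object original = base.get(field);
            Object w = west.get(field);
            Object e = east.get(field);
            boolean westChanged = !Objects.equals(original, w);
            boolean eastChanged = !Objects.equals(original, e);
            if (westChanged && eastChanged && !Objects.equals(w, e)) {
                throw new IllegalStateException("Conflict on field: " + field);
            }
            merged.put(field, westChanged ? w : e);
        }
        return merged;
    }
}
```

With the slide's example, "stock" changed on one coast and "weight" on the other merge cleanly into stock 146 and weight 175; two different stock values would have to be resolved by business rules.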

Page 34: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling

Page 35: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling

Dominant Question Driven Design
Opposite to Relational which is Domain Driven Design

Constrained Tree Schema
Because RPC matters

Denormalized
Due to dominant questions and CTS

Page 36: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling

Typical relational data model

- TrainStop (date)
- TrainStation (code, name)
- Train (code, type)
- Seat (number, price)
- Booking (reduction)
- Passenger (name)

Page 37: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling

Find the root entity and denormalize

Root entity: Train (code, type)

Partitioning ready entities tree, under the root entity:
- TrainStop (date)
- Seat (number, price)
- Booking (reduction)
- Passenger (name)

Reference data, duplicated in each grid node:
- TrainStation (code, name)

Page 38: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling

Remove unused data

Partitioned:
- Train (code, type)
- TrainStop (date)
- Seat (number, price)
- Booking (reduction) and Passenger (name), reduced to a "booked" attribute

Replicated:
- TrainStation (code, name)

Page 39: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling

Data Grid Ready data structure

Partitioned:
- Train (code, type)
- TrainStop (date)
- Seat (number, price, booked)

Replicated:
- TrainStation (code, name)

Page 40: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling is Hard !

Page 41: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling is Hard !

Two root entities for the same MoneyTransfer!

- Account (number), with its CashWithdrawal (date, amount)
- MoneyTransfer (id, date, amount): "from" one Account, "to" the other
- Account (number), with its CashWithdrawal (date, amount)

Page 42: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling is Hard !

Split MoneyTransfer

- Account (number), with its CashWithdrawal (date, amount)
- MoneyTransferOut (id, date, amount)
- MoneyTransferIn (id, date, amount)
- Account (number), with its CashWithdrawal (date, amount)

Page 43: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling is Hard !

Split MoneyTransfer

- Account (number): CashWithdrawal (date, amount), MoneyTransferOut (id, date, amount)
- Account (number): CashWithdrawal (date, amount), MoneyTransferIn (id, date, amount)

Page 44: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Modeling is Hard !

Data Grid Ready data structure

Root entity: Account (number)
- CashWithdrawal (date, amount)
- MoneyTransferOut (id, date, amount)
- MoneyTransferIn (id, date, amount)
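The split of MoneyTransfer into an "out" record and an "in" record can be sketched in plain Java. The class layout below is an assumption (the `Movement` helper, amounts in cents and the `balance` method are not in the slides); it only shows that each Account tree becomes self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: each Account is a root entity that owns its own movements. */
class Account {
    /** One movement stored inside the account tree (amounts in cents, negative = debit). */
    static class Movement {
        final String kind;
        final String transferId;
        final long amount;

        Movement(String kind, String transferId, long amount) {
            this.kind = kind;
            this.transferId = transferId;
            this.amount = amount;
        }
    }

    final String number;
    final List<Movement> movements = new ArrayList<>();

    Account(String number) {
        this.number = number;
    }

    void withdrawCash(long amount) {
        movements.add(new Movement("CashWithdrawal", null, -amount));
    }

    /** Sum of all movements; needs no data from any other account. */
    long balance() {
        long total = 0;
        for (Movement m : movements) {
            total += m.amount;
        }
        return total;
    }

    /** One logical transfer becomes two records sharing the same id, one per account. */
    static void transfer(String transferId, Account from, Account to, long amount) {
        from.movements.add(new Movement("MoneyTransferOut", transferId, -amount));
        to.movements.add(new Movement("MoneyTransferIn", transferId, amount));
    }
}
```

In a real grid the two accounts may sit on different partitions, so the two writes in `transfer` would need a transaction or a compensation mechanism, which this sketch leaves out.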

Page 45: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Grid Internals

Page 46: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Serialization

Used for data transfer and byte oriented storage

A hot topic: see Apache Thrift, Apache Avro, Google Protocol Buffers

Must support evolvable data structures
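"Evolvable" can be illustrated with a small stdlib-only sketch: an old reader keeps the bytes it does not understand and writes them back untouched. The wire format (version, then fields) and the class `TrainV1` are assumptions for illustration; this is not the POF, Thrift, Avro or Protocol Buffers format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch: version 1 of a Train that survives data written by newer versions. */
class TrainV1 {
    static final int IMPL_VERSION = 1;

    String code;
    /** Version found in the stream; may be higher than IMPL_VERSION. */
    int dataVersion = IMPL_VERSION;
    /** Bytes of fields added by newer versions, kept untouched and written back. */
    byte[] futureData = new byte[0];

    static TrainV1 read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            TrainV1 train = new TrainV1();
            train.dataVersion = in.readInt();
            train.code = in.readUTF();
            // Everything after the known fields belongs to a newer version.
            train.futureData = new byte[in.available()];
            in.readFully(train.futureData);
            return train;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    byte[] write() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(Math.max(dataVersion, IMPL_VERSION));
            out.writeUTF(code);
            out.write(futureData);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

A version 1 node can thus read, update and re-store an entry written by a version 2 node without dropping the fields it does not know, which is what allows rolling upgrades of the grid.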

Page 47: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Storage

Store Java Beans in the grid
- No need to unmarshal for in-process operations
- Beware of garbage collector!

Store byte arrays in the grid
- Pay unmarshalling at each read and write
- Slightly more garbage collector friendly
- Low-level / byte-oriented APIs to read data

Page 48: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Communication Protocols

UDP multicast (Coherence, Gigaspaces)

TCP/IP (WebSphere eXtreme Scale)

Page 49: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Topology

Partitions made of shards (1 primary + 0..* backups)

Dynamic shards location (changes at runtime and at restart)

Can use dedicated “directory servers” or embed the directory in the “data nodes”

Page 50: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

JVM and Memory

Many vendors recommend tiny 1.4 GB JVMs!
Garbage collector hell

More than ten JVMs per server
Management hell

More and more IMDGs support large heaps

Page 51: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

APIs

Page 52: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Raw Java Mapping with Oracle Coherence

Hand-coded serialization: JUnit is your friend!

Mapped entities: Train (code, type), Seat (number, price, booked), TrainStop (date)

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.tangosol.io.AbstractEvolvable;
import com.tangosol.io.pof.PofReader;
import com.tangosol.io.pof.PofWriter;
import com.tangosol.io.pof.PortableObject;

public class Train extends AbstractEvolvable implements PortableObject {
    enum Type { HIGH_SPEED, NORMAL }

    /** Key of the Cache */
    String code;

    /** Indexed */
    String name;

    Type type;

    List<Seat> seats = new ArrayList<Seat>();

    int version;

    List<TrainStop> trainStops = new ArrayList<TrainStop>();

    @Override
    public int getImplVersion() {
        return 1;
    }

    // The property indexes (0..5) must match between readExternal and writeExternal.
    @Override
    public void readExternal(PofReader pofReader) throws IOException {
        this.code = pofReader.readString(0);
        this.name = pofReader.readString(1);
        this.type = (Type) pofReader.readObject(2);
        pofReader.readCollection(3, this.seats);
        pofReader.readCollection(4, this.trainStops);
        this.version = pofReader.readInt(5);
    }

    @Override
    public void writeExternal(PofWriter pofWriter) throws IOException {
        pofWriter.writeString(0, this.code);
        pofWriter.writeString(1, this.name);
        pofWriter.writeObject(2, this.type);
        pofWriter.writeCollection(3, this.seats, Seat.class);
        pofWriter.writeCollection(4, this.trainStops, TrainStop.class);
        pofWriter.writeInt(5, this.version);
    }
}

Page 53: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

JPA Style Mapping with Websphere eXtreme Scale

Sub entities can have cross relations

Mapped entities: Train (code, type), Seat (number, price, booked), TrainStop (date)

@Entity(schemaRoot=true)
public class Train {
    @Id
    String code;

    @Index
    @Basic
    String name;

    @OneToMany(cascade=CascadeType.ALL)
    List<Seat> seats = new ArrayList<Seat>();

    @Version
    int version;

    ...
}

Page 54: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Map API with Oracle Coherence

Map API

NamedCache trainCache = CacheFactory.getCache("train-cache");

/** Save */
void persist(Train train) {
    trainCache.put(train.getCode(), train);
}

/** Find by key */
Train findByCode(String code) {
    return (Train) trainCache.get(code);
}

/** Find by Query Language */
Train findByTrainName(String name) {
    Filter filter = QueryHelper.createFilter("name = :name", Collections.singletonMap("name", name));
    Set<Map.Entry<String, Train>> trainEntrySet = trainCache.entrySet(filter);
    if (trainEntrySet.isEmpty()) {
        return null;
    } else {
        return trainEntrySet.iterator().next().getValue();
    }
}

Page 55: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

JPA Style with Websphere eXtreme Scale

JPA Style Entity Manager

/** Save */
void persist(Train train) {
    entityManager.persist(train);
}

/** Find by key */
Train findByCode(String code) {
    return (Train) entityManager.find(Train.class, code);
}

/** Query Language */
Train findByTrainName(String name) {
    Query q = entityManager.createQuery("select t from Train t where t.name=:name");
    q.setParameter("name", name);

    return (Train) q.getSingleResult();
}

Page 56: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Creating Indexes

Map reduce (without index) = Distributed Table Scan !
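The difference between a scan and an index lookup can be sketched inside a single partition in plain Java. `IndexedTrainCache` and its `entriesInspected` counter are hypothetical names used only to count the work done; they are not part of any grid API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: the same lookup with a full scan and with an index. */
class IndexedTrainCache {
    /** Primary storage: train code -> train name. */
    private final Map<String, String> nameByCode = new HashMap<>();
    /** Secondary index: train name -> codes of the trains with that name. */
    private final Map<String, List<String>> codesByName = new HashMap<>();
    /** Number of entries inspected by the last query. */
    int entriesInspected = 0;

    void put(String code, String name) {
        String previous = nameByCode.put(code, name);
        if (previous != null) {
            codesByName.get(previous).remove(code);
        }
        codesByName.computeIfAbsent(name, n -> new ArrayList<>()).add(code);
    }

    /** Table scan: every entry is inspected. */
    List<String> findByNameWithScan(String name) {
        entriesInspected = 0;
        List<String> codes = new ArrayList<>();
        for (Map.Entry<String, String> entry : nameByCode.entrySet()) {
            entriesInspected++;
            if (entry.getValue().equals(name)) {
                codes.add(entry.getKey());
            }
        }
        return codes;
    }

    /** Index lookup: only the matching entries are touched. */
    List<String> findByNameWithIndex(String name) {
        List<String> codes = codesByName.getOrDefault(name, new ArrayList<>());
        entriesInspected = codes.size();
        return new ArrayList<>(codes);
    }
}
```

The index costs extra memory and extra work on each `put`, which is the usual trade-off when deciding which attributes to index.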

Page 57: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Indexes with Oracle Coherence

class Train {
    String name;

    Collection<String> getTrainStationsCodes() {
        return Collections2.transform(trainStops, ...);
    }

    ...
}

{
    NamedCache trainCache = CacheFactory.getCache("train-cache");

    trainCache.addIndex(new ReflectionExtractor("getName"), false, null);
    trainCache.addIndex(new ReflectionExtractor("getTrainStationsCodes"), false, null);
}

Page 58: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Indexes with Websphere eXtreme Scale

@Entity(schemaRoot=true)
class Train {
    @Index
    @Basic
    String name;

    @Index
    Collection<String> getTrainStationsCodes() {
        return Collections2.transform(trainStops, ...);
    }

    ...
}

Query query = em.createQuery("select t from Train t where t.name=:name");
query.getPlan();

This is an execution plan (eXtreme Scale):

for q2 in Train ObjectMap using INDEX on name = ( ?name)
  filter ( q2.c[0] = ?name )
  returning new Tuple( q2 )

Page 59: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

More APIs

Another Java EE versus Spring battle? JSR 347 Data Grids vs. Spring Data

Unified API on top of NoSQL stores?

Serialization / Object to Tuple Mapping API?

Page 60: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database Interactions

Page 61: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database

Data Grids are “In Memory” -> we need to persist data on disk !

Page 62: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database

update / insert / delete

“select directly modified in DB”

Page 63: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database

Data Grid -> Relational Database

Highly available write behind queues + SQL batched statements to the backend DB
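The write-behind idea can be sketched in plain Java. `WriteBehindQueue` is a hypothetical class: the "database" is simulated by a list of batches, and the high availability of the queue (its replication to a backup node) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a write-behind queue in front of a slow backend database. */
class WriteBehindQueue {
    /** Pending writes, keyed by primary key so repeated updates are coalesced. */
    private final Map<String, String> pending = new LinkedHashMap<>();
    /** Stand-in for the database: each element is one batch sent in a single round trip. */
    final List<Map<String, String>> flushedBatches = new ArrayList<>();

    /** Called on each grid update: returns immediately, no database access. */
    synchronized void enqueue(String primaryKey, String row) {
        pending.put(primaryKey, row);
    }

    /** Called periodically: sends all pending rows as one batch. */
    synchronized int flush() {
        if (pending.isEmpty()) {
            return 0;
        }
        Map<String, String> batch = new LinkedHashMap<>(pending);
        flushedBatches.add(batch);
        pending.clear();
        return batch.size();
    }
}
```

Two updates of the same key between flushes reach the database as a single row, which is where most of the write throughput gain comes from; with JDBC the batch would typically map to `addBatch` / `executeBatch` on a prepared statement.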

Page 64: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database

Data Grid -> Relational Database

Constrained Tree Schema <-> Relational Impedance Mismatch

- TrainStop (date)
- TrainStation (code, name)
- Seat (number, price, booked)
- Train (code, type)

Page 65: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database

DB writes MUST succeed!

Align the database on the Data Grid model!
- Denormalize the database
- Remove the foreign keys, use same PKs in DB and data grid
- Support unordered SQL statements

Prefer raw SQL rather than reused business logic

Page 66: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database

Relational Database -> Data Grid

Data Grid Originated Scheduled Refresh from the backend DB (Oracle System Change Number, etc)

select * from train where last_modif > ?
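The scheduled refresh can be sketched with an in-memory stand-in for the `train` table. `ScheduledRefresh`, `Row` and the numeric `lastModif` watermark are assumptions for illustration; a real implementation would run the SQL query above through JDBC on a timer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a grid-originated refresh driven by a "last modified" watermark. */
class ScheduledRefresh {
    /** Stand-in for one row of the 'train' table. */
    static class Row {
        final String code;
        final String name;
        final long lastModif;

        Row(String code, String name, long lastModif) {
            this.code = code;
            this.name = name;
            this.lastModif = lastModif;
        }
    }

    final Map<String, String> gridCache = new HashMap<>();
    private long watermark = Long.MIN_VALUE;

    /** In-memory equivalent of: select * from train where last_modif > ? */
    static List<Row> selectModifiedSince(List<Row> table, long since) {
        List<Row> rows = new ArrayList<>();
        for (Row row : table) {
            if (row.lastModif > since) {
                rows.add(row);
            }
        }
        return rows;
    }

    /** One refresh cycle; returns how many rows were reloaded into the grid. */
    int refresh(List<Row> table) {
        List<Row> modified = selectModifiedSince(table, watermark);
        for (Row row : modified) {
            gridCache.put(row.code, row.name);
            watermark = Math.max(watermark, row.lastModif);
        }
        return modified.size();
    }
}
```

The watermark only moves forward, so each cycle reloads just the rows changed since the previous one; rows deleted directly in the database are not detected by this kind of polling.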

Page 67: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database

Relational Database -> Data Grid

Database Originated Push from the backend DB
JMS = durable subscription
(Oracle Database Change Notification, etc)

Page 68: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grid <-> Relational Database

In Memory -> prepare for reloading after maintenance operations !

Prepare consistency checkers

Need for “graceful shutdown with disk persistence”

Page 69: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Transactions

Page 70: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

We didn’t have the time to talk about transactions.

Another session is planned at the Paris NoSQL User Group for this.

Page 71: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Let’s go live !

Page 72: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grids and Operations

Standard packaging?
Do It Yourself (layout, scripts, etc)

Limited Management
Do It Yourself (stop/start, detecting data loss, etc)

Limited debugging tools
Do It Yourself (debugging consoles, troubleshooting agents)

JVM pandemic
Dozens of JVMs to manage!

Page 73: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Data Grids and Operations

Dev / Ops collaboration is required

Experts only !


Page 74: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

The right tool for the right job

Page 75: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

The right tool for the right job

Incredibly fast ! Even with transactions !

Scalable

Good at data replication (when it implements it)

Very geeky on both dev and ops side

“Quite” expensive

Not an enterprise grade data store

Reconciliation api, etc

Requires very skilled people + change management

If you solve the data loading issue


Page 76: Paris NoSQL User Group - In Memory Data Grids in Action (without transactions chapter)

Questions / Answers