Apache Geode (incubating), and Pivotal's leadership role in open sourcing GemFire, by Nitin Lamba

Pivotal's effort on Apache Geode

Jan 18, 2017

Transcript
Page 1: Pivotal's effort on Apache Geode

Apache Geode (incubating), and Pivotal's leadership role in open sourcing GemFire

Nitin Lamba

Page 2: Pivotal's effort on Apache Geode

Agenda

• Pivotal's Open Source strategy
• What is Apache Geode?
• History
• Differentiators
• Basic Concepts
• Resources
• Q & A

Page 3: Pivotal's effort on Apache Geode


Page 4: Pivotal's effort on Apache Geode

In 2015, Pivotal released the components of its Big Data Suite as open source:

• 6 million lines of code
• 4 new open source communities

Page 5: Pivotal's effort on Apache Geode

[Timeline: May 2015, Sept 2015, Sept 2015, Oct 2015]

Page 6: Pivotal's effort on Apache Geode

From GEMFIRE to GEODE…


Page 7: Pivotal's effort on Apache Geode

What is GEODE?

A distributed, memory-based data management platform for data-oriented apps that need:

• high performance, scalability, resiliency and continuous availability
• fast access to critical data sets
• location-aware distributed data processing
• event-driven data architecture

Page 8: Pivotal's effort on Apache Geode

Incubating but ROCK solid…

• 1000+ systems in production (real customers)
• Cutting edge use cases

[Timeline axis: <2000, 2004, 2008, 2012, 2016]

Early drivers
• Data volumes
• Margins/transactions
• IT maintenance costs
• Elasticity needs

Real-time needs
• Real-time response
• Time to market needs
• Flexible data models
• Persistent + in-memory

Global data
• Visibility across DC
• Fast ingest
• Device to enterprise
• Uptime (always on)

Open Source!
• Apache Incubation
• GemFire > Geode
• Geode M1 release
• 1st Geode Summit

Customer segments: Financial Services, US DoD, Trade Clearing, Travel Portal, Online Gambling, Telcos, Manufacturing, Auto Insurance, Payroll processing, Rail systems

Page 9: Pivotal's effort on Apache Geode

…with both SCALE and SPEED, …

• 40K transactions per second
• 3 TB data in-memory
• 17B records in-memory
• 120K concurrent users

Page 10: Pivotal's effort on Apache Geode

… and impacting a LOT of people!

• China Railway Corporation
• Indian Railways

17% + 19% = 36% of the world population

Page 11: Pivotal's effort on Apache Geode

High-level Architecture

Powerful app development kit
• APIs: Java & REST
• Adapters: Redis, Lucene*, Spark*, …

Multiple persistence options
• Filesystem, RDBMS or HDFS*
• Sync: read-through, write-through
• Async: write-behind

Durable <K,V> cache/store
• Data replicated or partitioned
• Redundant storage in-memory/disk
• Flexible data retention policies

[Diagram: A Peer-2-Peer in-memory Distributed System, showing a Locator and four Servers, with REST access]

* Experimental and awaiting community feedback
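The persistence modes named on this slide (read-through, write-through, write-behind) can be illustrated with a small plain-Java sketch. This is not Geode code and does not use the Geode API: the class `PersistenceModesSketch`, its method names, and the in-memory `Map` standing in for a filesystem or RDBMS are all hypothetical, chosen only to show how the three modes differ in when the backing store is touched.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy cache illustrating read-through, write-through and write-behind. Not Geode code. */
class PersistenceModesSketch {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> backingStore; // assumption: stands in for a filesystem or RDBMS
    private final Queue<String[]> writeBehindQueue = new ArrayDeque<>();

    PersistenceModesSketch(Map<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    /** Read-through: on a miss, load from the backing store and keep the value in memory. */
    String get(String key) {
        String value = memory.get(key);
        if (value == null) {
            value = backingStore.get(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    /** Write-through (sync): the backing store is updated before the call returns. */
    void putWriteThrough(String key, String value) {
        backingStore.put(key, value);
        memory.put(key, value);
    }

    /** Write-behind (async): memory is updated now, the store update is only queued. */
    void putWriteBehind(String key, String value) {
        memory.put(key, value);
        writeBehindQueue.add(new String[] {key, value});
    }

    /** Drains the queue into the store; a real system would do this on a background thread. */
    int flush() {
        int flushed = 0;
        String[] entry;
        while ((entry = writeBehindQueue.poll()) != null) {
            backingStore.put(entry[0], entry[1]);
            flushed++;
        }
        return flushed;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("k1", "v1");
        PersistenceModesSketch cache = new PersistenceModesSketch(store);
        System.out.println("read-through k1: " + cache.get("k1"));
        cache.putWriteBehind("k2", "v2");
        System.out.println("store has k2 before flush: " + store.containsKey("k2"));
        cache.flush();
        System.out.println("store has k2 after flush: " + store.containsKey("k2"));
    }
}
```

The trade-off the sketch makes visible: write-through pays the store latency on every put, while write-behind returns immediately but leaves a window in which the store lags behind memory.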

Page 12: Pivotal's effort on Apache Geode

What makes it go FAST?

• Minimize copying
• Minimize contention points
• Run user code in-process
• Partitioning & parallelism
• Avoid disk seeks
• Automated benchmarks

Page 13: Pivotal's effort on Apache Geode

Let’s talk about a few BASIC CONCEPTS…

• Cache
• Region
• Member
• Client Cache
• Persistence
• Functions

Page 14: Pivotal's effort on Apache Geode

What is a CACHE?

• In-memory storage and management for your data
• Configurable through XML, Java API or CLI
• Collection of Regions

Page 15: Pivotal's effort on Apache Geode

What is a REGION?

• Distributed java.util.Map on steroids (Key/Value)
• Consistent API regardless of where or how data is stored
• Observable (reactive)
• Highly available, with redundant copies on cache member(s)
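To make the "observable map" idea concrete, here is a minimal plain-Java sketch of a key/value store that notifies listeners on every put. It is only an illustration under stated assumptions: `ObservableRegionSketch` and `addListener` are hypothetical names, not the Geode Region API, and the sketch is single-process, with no distribution or redundancy.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

/** Toy observable key/value store; hypothetical, single-process, not the Geode API. */
class ObservableRegionSketch<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final List<BiConsumer<K, V>> listeners = new CopyOnWriteArrayList<>();

    /** Registers a callback invoked after every put with the key and the new value. */
    void addListener(BiConsumer<K, V> listener) {
        listeners.add(listener);
    }

    /** Stores the value, then notifies every listener; returns the previous value or null. */
    V put(K key, V value) {
        V previous = entries.put(key, value);
        for (BiConsumer<K, V> listener : listeners) {
            listener.accept(key, value);
        }
        return previous;
    }

    V get(K key) {
        return entries.get(key);
    }

    public static void main(String[] args) {
        ObservableRegionSketch<String, String> region = new ObservableRegionSketch<>();
        region.addListener((k, v) -> System.out.println("event: " + k + " -> " + v));
        region.put("k1", "v1");
        region.put("k1", "v2");
    }
}
```

Callers use the same `put`/`get` calls whether or not anyone is listening, which is the sense in which the API stays consistent while the store is observable.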

Page 16: Pivotal's effort on Apache Geode

Region: Types & Options

• Local, Replicated or Partitioned
• In-memory or persistent
• Redundant
• LRU
• Overflow

Region types:

LOCAL, LOCAL_HEAP_LRU, LOCAL_OVERFLOW, LOCAL_PERSISTENT, LOCAL_PERSISTENT_OVERFLOW

PARTITION, PARTITION_HEAP_LRU, PARTITION_OVERFLOW, PARTITION_PERSISTENT, PARTITION_PERSISTENT_OVERFLOW, PARTITION_PROXY, PARTITION_PROXY_REDUNDANT, PARTITION_REDUNDANT, PARTITION_REDUNDANT_HEAP_LRU, PARTITION_REDUNDANT_OVERFLOW, PARTITION_REDUNDANT_PERSISTENT, PARTITION_REDUNDANT_PERSISTENT_OVERFLOW

REPLICATE, REPLICATE_HEAP_LRU, REPLICATE_OVERFLOW, REPLICATE_PERSISTENT, REPLICATE_PERSISTENT_OVERFLOW, REPLICATE_PROXY
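The LRU and Overflow options can be sketched in plain Java with a `LinkedHashMap` in access order. This is an illustration, not Geode's implementation: the class `LruOverflowSketch` is hypothetical, it evicts on a fixed entry count (an assumption made for simplicity, rather than on heap usage), and a second in-memory map stands in for overflow files on disk.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU store that overflows evicted entries to a stand-in "disk" map. Not Geode code. */
class LruOverflowSketch {
    private final Map<String, String> disk = new HashMap<>(); // assumption: stands in for overflow files
    private final LinkedHashMap<String, String> heap;

    LruOverflowSketch(final int maxHeapEntries) {
        // accessOrder = true keeps the least recently used entry first in iteration order.
        this.heap = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > maxHeapEntries) {
                    disk.put(eldest.getKey(), eldest.getValue()); // overflow instead of discarding
                    return true;
                }
                return false;
            }
        };
    }

    void put(String key, String value) {
        disk.remove(key); // the in-memory copy becomes the only current one
        heap.put(key, value);
    }

    /** Returns the value, faulting it back into memory if it had overflowed. */
    String get(String key) {
        String value = heap.get(key);
        if (value == null && disk.containsKey(key)) {
            value = disk.remove(key);
            heap.put(key, value); // may push another entry out to disk
        }
        return value;
    }

    boolean inHeap(String key) {
        return heap.containsKey(key);
    }

    boolean onDisk(String key) {
        return disk.containsKey(key);
    }

    public static void main(String[] args) {
        LruOverflowSketch region = new LruOverflowSketch(2);
        region.put("k1", "v1");
        region.put("k2", "v2");
        region.put("k3", "v3");
        System.out.println("k1 on disk: " + region.onDisk("k1"));
        System.out.println("k1 value: " + region.get("k1"));
    }
}
```

Dropping the `disk.put` line would turn the overflow behaviour into plain LRU eviction, where the least recently used entry is simply discarded.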

Page 17: Pivotal's effort on Apache Geode

Persistent Regions

• Durability
• Write-ahead log (WAL) for efficient writing
• Consistent recovery
• Compaction

[Diagram: Server 1 through Server N]

Page 18: Pivotal's effort on Apache Geode

What is a MEMBER?

• A process that has a connection to the system
• A process that has created a cache
• Embeddable within your application

[Diagram labels: Client, Locator, Server]

Page 19: Pivotal's effort on Apache Geode

What is a CLIENT CACHE?

• A process connected to the Geode server(s)
• Can have a local copy of the data
• Run OQL queries on local data
• Can be notified about events on the servers

Page 20: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: Server 1, Server 2, Server 3]

Page 21: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: Servers 1, 2 and 3 holding primary and secondary copies of buckets B1, B2 and B3]
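The primary/secondary bucket layout in the diagram can be sketched as a simple placement rule. The slide does not state how buckets are assigned to servers, so the round-robin rule below is purely an assumption for illustration, and `BucketPlacementSketch` and its methods are hypothetical names rather than Geode classes.

```java
/** Toy bucket placement with one redundant copy per bucket. Placement rule is assumed. */
class BucketPlacementSketch {
    private final int buckets;
    private final int servers;

    BucketPlacementSketch(int buckets, int servers) {
        if (buckets < 1 || servers < 2) {
            throw new IllegalArgumentException("need at least 1 bucket and 2 servers");
        }
        this.buckets = buckets;
        this.servers = servers;
    }

    /** Maps a key to a bucket index in [0, buckets). */
    int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), buckets);
    }

    /** Assumed rule: bucket b has its primary copy on server b mod servers. */
    int primaryServer(int bucket) {
        return bucket % servers;
    }

    /** Assumed rule: the secondary copy lives on the next server, so never with the primary. */
    int secondaryServer(int bucket) {
        return (bucket + 1) % servers;
    }

    /** True if every bucket still has at least one copy after the given server is lost. */
    boolean survivesLossOf(int failedServer) {
        for (int b = 0; b < buckets; b++) {
            if (primaryServer(b) == failedServer && secondaryServer(b) == failedServer) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BucketPlacementSketch layout = new BucketPlacementSketch(3, 3);
        for (int b = 0; b < 3; b++) {
            System.out.println("B" + (b + 1) + ": primary on server " + (layout.primaryServer(b) + 1)
                    + ", secondary on server " + (layout.secondaryServer(b) + 1));
        }
    }
}
```

Because each server persists only the copies it hosts, nothing is shared between servers on disk, and any single server can be lost while every bucket keeps one copy.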

Page 22: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: Servers 1, 2 and 3 holding primary and secondary copies of buckets B1, B2 and B3]

Page 23: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: Servers 1, 2 and 3 holding primary and secondary copies of buckets B1, B2 and B3]

Page 24: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: Servers 1, 2 and 3 holding primary and secondary copies of buckets B1, B2 and B3]

Server 1 waits for others when it starts

Page 25: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: Servers 1, 2 and 3 holding primary and secondary copies of buckets B1, B2 and B3]

Fetches missed operations on restart

Page 26: Pivotal's effort on Apache Geode

Persistence - Operational Logs

[Diagram: Member 1 handles Put k6->v6 and appends it to the operation log, which spans Oplog1.crf and Oplog2.crf]

Logged operations:
• Create k1->v1
• Create k2->v2
• Modify k1->v3
• Create k4->v4
• Modify k1->v5
• Create k6->v6

Page 27: Pivotal's effort on Apache Geode

Persistence - Operational Logs: Compaction

[Diagram: Member 1 handles Put k6->v6 and appends it to the operation log, which spans Oplog1.crf and Oplog2.crf]

Logged operations:
• Create k1->v1
• Create k2->v2
• Modify k1->v3
• Create k4->v4
• Modify k1->v5
• Create k6->v6

Compaction: copy live data forward
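The append-then-compact idea can be sketched with the operations shown on the slide. This is a plain-Java illustration, not Geode's oplog format: `OplogSketch` is a hypothetical class, the log is an in-memory list rather than `.crf` files, and the operations are assumed to be logged in the order listed above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only operation log with compaction. Hypothetical; not Geode's file format. */
class OplogSketch {
    /** One logged create or modify. */
    static final class Op {
        final String key;
        final String value;

        Op(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Op> log = new ArrayList<>();

    /** Writes never update in place: every create or modify is appended to the end. */
    void append(String key, String value) {
        log.add(new Op(key, value));
    }

    int size() {
        return log.size();
    }

    /** Replays the log from the start; the last record for a key wins. */
    Map<String, String> recover() {
        Map<String, String> live = new LinkedHashMap<>();
        for (Op op : log) {
            live.put(op.key, op.value);
        }
        return live;
    }

    /** Copies only live values forward into a new log, dropping superseded records. */
    OplogSketch compact() {
        OplogSketch fresh = new OplogSketch();
        for (Map.Entry<String, String> e : recover().entrySet()) {
            fresh.append(e.getKey(), e.getValue());
        }
        return fresh;
    }

    public static void main(String[] args) {
        OplogSketch oplog = new OplogSketch();
        oplog.append("k1", "v1");
        oplog.append("k2", "v2");
        oplog.append("k1", "v3");
        oplog.append("k4", "v4");
        oplog.append("k1", "v5");
        oplog.append("k6", "v6");
        OplogSketch compacted = oplog.compact();
        System.out.println("records before: " + oplog.size() + ", after: " + compacted.size());
        System.out.println("recovered state: " + compacted.recover());
    }
}
```

In this sketch the records k1->v1 and k1->v3 are garbage once k1->v5 is logged, so compaction shrinks the log while recovery yields the same state.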

Page 28: Pivotal's effort on Apache Geode

Functions

• Used for distributed concurrent processing (Map/Reduce, stored procedure)
• Highly available
• Data oriented
• Member oriented
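A data-oriented function, in the Map/Reduce sense mentioned above, runs where each partition of the data lives and returns partial results that the caller combines. The sketch below models that in plain Java with a thread pool standing in for the servers. It is an assumption-laden illustration: `FunctionExecutionSketch` and `onRegion` are hypothetical names, not the Geode function execution API, and each `Map` plays the role of one partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Toy data-oriented function execution over partitions. Hypothetical; not the Geode API. */
class FunctionExecutionSketch {
    /** Runs fn against every partition in parallel; returns partial results in partition order. */
    static <R> List<R> onRegion(List<Map<String, Integer>> partitions,
                                Function<Map<String, Integer>, R> fn) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (Map<String, Integer> partition : partitions) {
                Callable<R> task = () -> fn.apply(partition); // "map" step, one task per partition
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> server1 = new HashMap<>();
        server1.put("k1", 10);
        server1.put("k2", 20);
        Map<String, Integer> server2 = new HashMap<>();
        server2.put("k3", 30);

        List<Integer> partialSums = onRegion(Arrays.asList(server1, server2),
                p -> p.values().stream().mapToInt(Integer::intValue).sum());
        int total = partialSums.stream().mapToInt(Integer::intValue).sum(); // "reduce" step
        System.out.println("partial sums: " + partialSums + ", total: " + total);
    }
}
```

Only the small partial results cross between the partitions and the caller; the entries themselves stay where they are stored, which is the point of moving the function to the data.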

Page 29: Pivotal's effort on Apache Geode

Functions


Page 30: Pivotal's effort on Apache Geode

Join the Community!

• Check out: http://geode.incubator.apache.org
• Subscribe: [email protected]
• Download: http://geode.incubator.apache.org/releases/

Page 31: Pivotal's effort on Apache Geode


Thank you!

Page 32: Pivotal's effort on Apache Geode

Additional Slides


Page 33: Pivotal's effort on Apache Geode

Built for PERFORMANCE…

[Chart: operations per second (0 to 1,000,000) for Cassandra and Geode across YCSB workloads: A Reads, A Updates, B Reads, B Updates, C Reads, D Inserts, D Reads, F Reads, F Updates]

Page 34: Pivotal's effort on Apache Geode

…and horizontal, consistent SCALABILITY!

Horizontal scaling for reads, consistent latency and CPU

[Chart: speedup, latency (ms) and CPU% versus number of server hosts (2, 4, 6, 8, 10)]

• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers
• Partitioned region with redundancy and 1K data size

Page 35: Pivotal's effort on Apache Geode

High Availability
