Apache Geode
Performance is key. Consistency is a must.
Yogesh..BG
11th Mar 2016
Agenda
05/03/2023 Confidential 2
• Introduction
• Applications
• Architecture
• Features
• Performance
• Comparison with other products
Introduction
• GemStone's first product was built for Smalltalk.
• First deployed in the financial sector as the data engine for Wall Street trading platforms.
• Low latency, high concurrency data management system.
• In memory data management platform.
Applications
• GIRE [Rapipago], a leading financial company in Argentina: 19 million transactions per month.
• Southwest Airlines: Southwest.com is the world's largest airline website by number of visitors.
• SBI, China CITIC Bank, Philips, BMW, Union Bank, Allstate
Architecture
Elements are:
• Cache
• Region: local, replicated and partitioned
• Locators
• Functions
• Listeners
[Diagram: Geode Cluster with two caches. Cache1: region1, R:region3, R:region2, region4. Cache2: R:region1, region3, region2, R:region4.]
Features
• Distributed cloud architecture
• Pools memory, CPU, network resources, and optionally local disk
• Uses dynamic replication and data partitioning techniques
• Reliable asynchronous event notifications
• Thousands of concurrent distributed transactions (JTA compliant)
• Shared nothing persistence architecture
Features
• Asynchronous and synchronous cache update propagation. Delta propagation.
• Horizontally scalable
• Querying and Indexing
• Super fast write-ahead-logging (WAL) persistence
• Compression, eviction and expiration of data
• User functions
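The delta propagation bullet above can be illustrated with a minimal sketch: instead of sending a whole updated value to other members, only the fields that changed are sent and then applied on the receiving side. The function names `compute_delta` and `apply_delta` and the dictionary-based value are assumptions made for this illustration; they are not Geode's API.

```python
def compute_delta(old, new):
    """Describe how to turn `old` into `new`: changed or added fields, plus removed keys."""
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    removed = [k for k in old if k not in new]
    return {"changed": changed, "removed": removed}


def apply_delta(value, delta):
    """Apply a delta produced by compute_delta to a copy of `value`."""
    updated = dict(value)
    updated.update(delta["changed"])
    for key in delta["removed"]:
        updated.pop(key, None)
    return updated


# Hypothetical cached value with one large field that does not change.
before = {"uid": 7, "name": "Asha", "age": 31, "bio": "x" * 1000}
after = dict(before, age=32)

delta = compute_delta(before, after)
# Only the small `age` field travels over the wire, not the 1000-character bio.
assert apply_delta(before, delta) == after
```

The benefit grows with the size of the unchanged part of the value; for small values the bookkeeping may cost more than it saves.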
Features
• HDFS store for analytics jobs
• Rebalancing
• Integrated security: DATA_READ, DATA_WRITE, MONITOR, ADMIN [HTTP/HTTPS authentication for REST]
• JVSD for analyzing performance issues
• Off-heap memory
• REST APIs
Internals of Geode
• Optimized caching layer with minimal thread and process context switches.
• Highly concurrent data structures to minimize contention points.
• Servers manage object graphs in serialized form, which reduces garbage-collection pressure.
• Batched operations to the backing database.
• Uses TCP/IP, UDP unicast and UDP multicast for member communication.
• Serialization
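The idea of batching operations to the database can be sketched as a write-behind queue: cache updates are collected, repeated writes to the same key are coalesced, and the database receives one batch instead of one call per update. The class `WriteBehindQueue`, its `batch_size` parameter and the `flush_fn` callback are hypothetical names for this illustration, not Geode classes or settings.

```python
class WriteBehindQueue:
    """Collects updates and hands them to `flush_fn` in batches (illustrative only)."""

    def __init__(self, flush_fn, batch_size=3):
        self._flush_fn = flush_fn
        self._batch_size = batch_size
        # dict keeps insertion order; a later write to the same key replaces the earlier one
        self._pending = {}

    def put(self, key, value):
        self._pending[key] = value
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send everything pending as one batch; does nothing when the queue is empty."""
        if self._pending:
            batch = list(self._pending.items())
            self._pending.clear()
            self._flush_fn(batch)


# Stand-in for a database: each element of `written` is one batched write.
written = []
queue = WriteBehindQueue(written.append, batch_size=2)
queue.put("uid-1", "v1")
queue.put("uid-1", "v2")  # coalesced with the previous write to uid-1
queue.put("uid-2", "v1")  # second distinct key reaches batch_size and triggers a flush
queue.flush()             # nothing pending, so no extra batch is written
```

A production queue would also flush on a timer and handle failures of the database call; both are left out here to keep the sketch short.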
How to use?
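[Diagram: a client inserts Person(UID, name, age). The hash of the key, #(UID) = 2, selects Bucket 2, and the entry is replicated. Bucket groups shown: Bucket 1 and Bucket 2; Bucket 3 and Bucket 2; Bucket 1 and Bucket x.]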
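The flow on this slide (hash the UID to pick a bucket, then replicate the entry) can be sketched as follows. This is a toy model under stated assumptions: CRC32 as the hash, 8 buckets, round-robin placement of redundant copies, and the names `bucket_for`, `owners_for` and `ToyPartitionedRegion` are all chosen for illustration and do not reflect Geode's actual hashing or placement.

```python
import zlib


def bucket_for(key, total_buckets):
    """Map a key to a bucket id using a stable hash (CRC32 is an illustrative choice)."""
    return zlib.crc32(str(key).encode("utf-8")) % total_buckets


def owners_for(bucket_id, servers, redundant_copies):
    """Return the primary server first, then the servers holding redundant copies."""
    if redundant_copies >= len(servers):
        raise ValueError("need more servers than redundant copies")
    start = bucket_id % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(redundant_copies + 1)]


class ToyPartitionedRegion:
    """Hypothetical in-process model of a partitioned region; not Geode's API."""

    def __init__(self, servers, total_buckets=8, redundant_copies=1):
        self.servers = list(servers)
        self.total_buckets = total_buckets
        self.redundant_copies = redundant_copies
        # server name -> bucket id -> {key: value}
        self.stores = {name: {} for name in self.servers}

    def put(self, key, value):
        bucket = bucket_for(key, self.total_buckets)
        owners = owners_for(bucket, self.servers, self.redundant_copies)
        for name in owners:  # write the primary and every redundant copy
            self.stores[name].setdefault(bucket, {})[key] = value
        return bucket, owners

    def get(self, key):
        bucket = bucket_for(key, self.total_buckets)
        primary = owners_for(bucket, self.servers, self.redundant_copies)[0]
        return self.stores[primary].get(bucket, {}).get(key)


region = ToyPartitionedRegion(["server1", "server2", "server3"])
bucket, owners = region.put("uid-42", {"name": "Asha", "age": 31})
print(bucket, owners, region.get("uid-42"))
```

Because the bucket is derived from the key alone, any member can compute where an entry lives without consulting a central directory, and losing one server still leaves a copy of each of its buckets on another.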
Query and index
Performance
• 10 times the read-and-write throughput of traditional disk-based databases.
• 4 to 40 times better application performance.
• 10 million concurrent users.
• Proven latency of 10-100 ms in the China Railway system.
Horizontal Scaling: Consistent Latency and CPU
Geode and Redis
• GemFireRedisServer understands the Redis protocol.
• Keys represent regions, and the namespace is within the OQL boundary.
• Redis is a single-threaded server. It is not designed to benefit from multiple CPU cores.
• With Redis Cluster you can scale the number of data structures, but not the data structures themselves (Geode scales them with partitioned regions).
• Replication: Redis slaves lose their data when they start up and sync with the master. In Geode, you can have up to 3 redundant copies (for partitioned regions). Replication is asynchronous in Redis.
• Persistence: the Redis AOF keeps keys and values in the same file, so a restart has to parse the entire file.
• Redis uses Sentinel for managing HA.
• Network Partition
Condition    No pipelining and 1KB payloads    Pipelining 16 requests at a time
Operation    Redis        GemFireRedis         Redis         GemFireRedis
SET          100894.94    87627.06             109277.91     109109.55
GET          103504.02    102988.52            113583.70     113523.87
INCR         99662.14     92251.61             1061300.75    575023.25
SADD         99559.35     92254.50             989119.69     644678.81
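To make the benchmark table easier to read, the short script below computes GemFireRedis throughput as a fraction of Redis throughput for each operation and condition. The numbers are copied from the table; the dictionary layout and the helper name `relative_throughput` are assumptions for this illustration, and the slide does not state the unit of the figures.

```python
# (redis, gemfire_redis) pairs copied from the table above.
NO_PIPELINING = {
    "SET": (100894.94, 87627.06),
    "GET": (103504.02, 102988.52),
    "INCR": (99662.14, 92251.61),
    "SADD": (99559.35, 92254.50),
}
PIPELINING_16 = {
    "SET": (109277.91, 109109.55),
    "GET": (113583.70, 113523.87),
    "INCR": (1061300.75, 575023.25),
    "SADD": (989119.69, 644678.81),
}


def relative_throughput(results):
    """Return GemFireRedis throughput divided by Redis throughput per operation."""
    return {op: gemfire / redis for op, (redis, gemfire) in results.items()}


plain = relative_throughput(NO_PIPELINING)
piped = relative_throughput(PIPELINING_16)
for op in NO_PIPELINING:
    print(f"{op:5s} no pipelining: {plain[op]:.1%}   pipelining 16: {piped[op]:.1%}")
```

The ratios show the pattern in the table: SET and GET are close to parity in both conditions, while pipelined INCR and SADD are where Redis pulls clearly ahead.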
Geode and Cassandra
Geode and Hazelcast
https://hazelcast.com/resources/benchmark-pivotal-gemfire-vs-hazelcast/
Thank You
Yogesh..BG