HOW TO WRITE YOUR DATABASE
The story about Event Store
By Victor Haydin, Head of R&D, ELEKS
May 25, 2015
Agenda
1. Problem definition
2. Event Store: introduction
3. Under the hood
4. Lessons learned
5. Q&A
A few notes before we start
1. This story is about the development, technology and lessons learned, not a product
2. Nor is it about Event Sourcing
3. Outsourcing point of view
4. I was there during the pre-sales and initiation phases. When the team was formed I left the project.
Survey
How many of you can understand code?
Event Sourcing?
Event Store?
SEDA?
140 characters long definition:
Event sourcing - persisting entities by appending all business events to transaction log. To rebuild the state, we replay this log.
Example: BankAccount
GotSalary: +100
GotCashFromAtm: -10
MadeInternetPayment: -15
MadeTerminalPayment: -5
PutCashToAccount: +20
Account balance: 100 - 10 - 15 - 5 + 20 = 90
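The replay in this example can be sketched in a few lines of Python. This is only an illustration of the idea: the `Event` class and the `replay` function are names invented here, not part of Event Store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: a name and a signed change to the balance."""
    name: str
    amount: int


def replay(events, initial=0):
    """Rebuild the balance by folding over the log, oldest event first."""
    balance = initial
    for event in events:
        balance += event.amount
    return balance


log = [
    Event("GotSalary", +100),
    Event("GotCashFromAtm", -10),
    Event("MadeInternetPayment", -15),
    Event("MadeTerminalPayment", -5),
    Event("PutCashToAccount", +20),
]

print(replay(log))  # 90
```

The state is never stored directly; it is always derived from the log, so replaying a prefix of the log gives the balance at that earlier point.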
Event Sourcing
Pros                              Cons
Performance                       Performance
Simplification                    Versioning
Audit Trail                       Querying
Integration with other systems    Not a common approach
Troubleshooting                   Tastes better with CQRS
Fixing errors
Testing
Flexibility
The marketing says
[Slide image: Event Store feature list, with the presenter's notes]
and time-series data
for single-node configuration
AtomPub protocol for HTTP
33000 writes per second on a single node
Master-Slave replication for the Premium version
I’ll talk more about Mono later
web-based
Client API
1. Append to Stream
2. Read Events
3. Subscribe
4. Transactions
5. Stream Metadata
6. Delete Stream
7. System Settings
8. Connection Lifecycle
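To make the first three operations concrete, here is a toy in-memory model in Python. It is not the Event Store client API: the class, method names, and the `ANY` sentinel are assumptions made for illustration, and transactions, metadata and deletion are left out.

```python
from collections import defaultdict


class WrongExpectedVersion(Exception):
    pass


class InMemoryStreamStore:
    """Hypothetical stand-in for a stream store; everything lives in memory."""
    ANY = -2  # assumed sentinel: skip the optimistic concurrency check

    def __init__(self):
        self._streams = defaultdict(list)
        self._subscribers = defaultdict(list)

    def append(self, stream, events, expected_version=ANY):
        current = len(self._streams[stream]) - 1  # -1 means the stream is empty
        if expected_version != self.ANY and expected_version != current:
            raise WrongExpectedVersion(f"expected {expected_version}, got {current}")
        for event in events:
            self._streams[stream].append(event)
            version = len(self._streams[stream]) - 1
            for callback in self._subscribers[stream]:
                callback(stream, version, event)
        return len(self._streams[stream]) - 1

    def read(self, stream, start=0, count=None):
        events = self._streams.get(stream, [])
        end = None if count is None else start + count
        return events[start:end]

    def subscribe(self, stream, callback):
        self._subscribers[stream].append(callback)


store = InMemoryStreamStore()
seen = []
store.subscribe("account-1", lambda stream, version, event: seen.append((version, event)))
store.append("account-1", ["GotSalary"], expected_version=-1)
store.append("account-1", ["GotCashFromAtm"], expected_version=0)
print(store.read("account-1"))  # ['GotSalary', 'GotCashFromAtm']
```

The expected version turns append into an optimistic concurrency check: a writer that read a stale version of the stream gets an error instead of silently interleaving its events.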
Under the hood
Disclaimer
SEDA
Time to take a look at code
SEDA outcome
1. Very few locks in code
2. Simple state management
3. Easy to understand system composition
4. Easy to extend
5. Easy to monitor (DEMO)
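A rough Python sketch of the staged approach shows where these properties come from. The `Stage` class is invented for this illustration and is not taken from the Event Store code base; each stage owns a queue and a single worker thread, so its state needs no locks of its own.

```python
import queue
import threading


class Stage:
    """One SEDA-style stage: a private queue drained by a single worker thread."""
    _STOP = object()

    def __init__(self, name, handler, next_stage=None):
        self.name = name
        self.handler = handler
        self.next_stage = next_stage
        self.inbox = queue.Queue()  # the only synchronisation point
        self.processed = 0          # touched only by this stage's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def publish(self, message):
        self.inbox.put(message)

    def stop(self):
        self.inbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self.inbox.get()
            if message is self._STOP:
                break
            result = self.handler(message)
            self.processed += 1
            if self.next_stage is not None:
                self.next_stage.publish(result)


def run_pipeline(words):
    results = []
    store = Stage("store", results.append)
    upper = Stage("upper", str.upper, next_stage=store)
    for stage in (upper, store):
        stage.start()
    for word in words:
        upper.publish(word)
    upper.stop()  # drains the first stage before the second one is stopped
    store.stop()
    return results, upper.processed, store.processed


print(run_pipeline(["a", "b", "c"]))  # (['A', 'B', 'C'], 3, 3)
```

Monitoring falls out of the structure: `inbox.qsize()` and `processed` per stage already show where messages pile up.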
License is MIT; the code is available on GitHub. Enjoy!
Storage Engine
All events are stored in one transaction log (split into chunks)
Writer Chaser Readers Scavenger
Storage pipeline
Writer
1. Single-threaded
2. Append-only
3. Write data in chunks
4. Flush depending on load and settings
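The four points can be sketched as a small Python class. All names, the length-prefixed record layout and the `flush_every` setting are assumptions for illustration; the real writer's chunk format and flush policy are not described on the slide.

```python
import os
import struct
import tempfile


class ChunkedLogWriter:
    """Append-only writer, meant to be driven by a single thread."""

    def __init__(self, directory, chunk_size=1024, flush_every=1):
        self.directory = directory
        self.chunk_size = chunk_size    # assumed limit per chunk file, in bytes
        self.flush_every = flush_every  # assumed setting: appends between flushes
        self.chunk_number = 0
        self.position = 0               # logical position across all chunks
        self._pending = 0
        self._file = self._open_chunk()

    def _open_chunk(self):
        name = f"chunk-{self.chunk_number:06d}.log"
        return open(os.path.join(self.directory, name), "ab")

    def append(self, payload):
        record = struct.pack("<I", len(payload)) + payload  # length prefix + data
        used = self._file.tell()
        if used > 0 and used + len(record) > self.chunk_size:
            self._roll()
        start = self.position
        self._file.write(record)
        self.position += len(record)
        self._pending += 1
        if self._pending >= self.flush_every:
            self.flush()
        return start

    def flush(self):
        self._file.flush()
        os.fsync(self._file.fileno())
        self._pending = 0

    def _roll(self):
        """Finish the current chunk and continue in a new file."""
        self.flush()
        self._file.close()
        self.chunk_number += 1
        self._file = self._open_chunk()

    def close(self):
        self.flush()
        self._file.close()


with tempfile.TemporaryDirectory() as directory:
    writer = ChunkedLogWriter(directory, chunk_size=32)
    positions = [writer.append(b"0123456789") for _ in range(3)]
    writer.close()
    chunk_files = sorted(os.listdir(directory))

print(positions, chunk_files)
```

Raising `flush_every` trades durability for throughput, which is one simple way to read "flush depending on load and settings".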
Chaser
1. Checks what the writer has appended to the log and sends acknowledgements
2. Updates index
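A minimal sketch of this role, assuming the log is a plain Python list and the writer exposes a checkpoint (the count of records it has made durable). The `Chaser` class and its arguments are hypothetical.

```python
class Chaser:
    """Follows the writer: indexes and acknowledges records it has not seen yet."""

    def __init__(self, log, index, on_ack):
        self.log = log          # list of (stream, version, data), appended to by the writer
        self.index = index      # dict: (stream, version) -> log position
        self.on_ack = on_ack    # called once per processed log position
        self.checkpoint = 0     # next log position to process

    def chase(self, writer_checkpoint):
        """Process everything up to, but not including, the writer's checkpoint."""
        while self.checkpoint < writer_checkpoint:
            position = self.checkpoint
            stream, version, _data = self.log[position]
            self.index[(stream, version)] = position
            self.on_ack(position)
            self.checkpoint += 1
        return self.checkpoint


log = [("account-1", 0, b"a"), ("account-2", 0, b"b"), ("account-1", 1, b"c")]
index, acks = {}, []
chaser = Chaser(log, index, acks.append)
chaser.chase(writer_checkpoint=2)
print(acks, index)
```

Because the chaser only reads behind the writer's checkpoint, the two never touch the same record at the same time.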
Readers
1. Read events from the transaction log with the help of the Read Index
Read Index
1. Maps stream name and version to position in transaction log
2. Index = MemTable + PTables
3. MemTable = Dictionary of Lists
4. Index entry – record of fixed length
5. PTable = file with sorted sequence of entries and cached midpoints
6. Binary search FTW!
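The layout described above can be sketched in Python as follows. The 16-byte entry format, the CRC32 stream hash and the `memtable_limit` are assumptions made for this illustration; hash collisions are ignored, and the cached midpoints are omitted because the sketch keeps each PTable in memory rather than in a file.

```python
import struct
import zlib

ENTRY = struct.Struct("<IiQ")  # stream hash, event version, log position (16 bytes)


def stream_hash(stream):
    return zlib.crc32(stream.encode("utf-8"))


class PTable:
    """Immutable sorted run of fixed-length entries, searched by binary search."""

    def __init__(self, entries):
        ordered = sorted(entries)
        self._data = b"".join(ENTRY.pack(*entry) for entry in ordered)
        self._count = len(ordered)

    def _entry(self, i):
        return ENTRY.unpack_from(self._data, i * ENTRY.size)

    def find(self, key):
        lo, hi = 0, self._count
        while lo < hi:
            mid = (lo + hi) // 2
            if self._entry(mid)[:2] < key:
                lo = mid + 1
            else:
                hi = mid
        if lo < self._count and self._entry(lo)[:2] == key:
            return self._entry(lo)[2]
        return None


class ReadIndex:
    def __init__(self, memtable_limit=4):
        self.memtable = {}   # stream hash -> list of (version, position)
        self.ptables = []    # newest first
        self.memtable_limit = memtable_limit
        self._size = 0

    def add(self, stream, version, position):
        self.memtable.setdefault(stream_hash(stream), []).append((version, position))
        self._size += 1
        if self._size >= self.memtable_limit:
            self._dump()

    def _dump(self):
        """Turn the full MemTable into a sorted PTable and start a fresh one."""
        entries = [(h, v, p) for h, items in self.memtable.items() for v, p in items]
        self.ptables.insert(0, PTable(entries))
        self.memtable, self._size = {}, 0

    def find(self, stream, version):
        h = stream_hash(stream)
        for v, p in self.memtable.get(h, []):
            if v == version:
                return p
        for table in self.ptables:
            position = table.find((h, version))
            if position is not None:
                return position
        return None


read_index = ReadIndex(memtable_limit=4)
for n, (stream, version) in enumerate(
    [("account-1", 0), ("account-2", 0), ("account-1", 1), ("account-2", 1), ("account-1", 2)]
):
    read_index.add(stream, version, position=n * 100)

print(read_index.find("account-1", 1))  # 200
```

Fixed-length entries are what make the binary search cheap: entry `i` always starts at byte `i * 16`, so no scanning or parsing is needed to jump to the middle.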
Scavenger
1. Background process that goes through transaction log chunks and cleans them of obsolete data (e.g. deleted streams, $maxCount, $maxAge etc.)
2. Simply writes data to a new chunk file
3. Last one out shuts the lights off
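The copy-instead-of-modify idea can be sketched like this. The `scavenge` function and its arguments are hypothetical, chunks are modelled as Python lists, and the sketch assumes the chunk holds the newest events of each stream; `$maxAge` is left out.

```python
def scavenge(chunk, deleted_streams=frozenset(), max_count=None):
    """Return a new chunk without obsolete records; the input is left untouched.

    chunk: list of (stream, version, data) in log order.
    max_count: dict stream -> number of newest events to keep (in the spirit of $maxCount).
    """
    max_count = max_count or {}
    latest = {}
    for stream, version, _data in chunk:
        latest[stream] = max(version, latest.get(stream, -1))

    survivors = []
    for stream, version, data in chunk:
        if stream in deleted_streams:
            continue  # the whole stream was deleted
        limit = max_count.get(stream)
        if limit is not None and version <= latest[stream] - limit:
            continue  # older than the newest `limit` events
        survivors.append((stream, version, data))
    return survivors


old_chunk = [
    ("audit", 0, b"a"), ("temp", 0, b"x"), ("audit", 1, b"b"),
    ("audit", 2, b"c"), ("orders", 0, b"o"),
]
new_chunk = scavenge(old_chunk, deleted_streams={"temp"}, max_count={"audit": 2})
print(new_chunk)  # [('audit', 1, b'b'), ('audit', 2, b'c'), ('orders', 0, b'o')]
```

Since the old chunk is never modified, readers that are still using it are not disturbed; it can be dropped once nobody reads it any more.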
High Availability
1. Master-Slave replication
2. Automatic master elections (N/2 + 1 quorum)
3. The node with the most up-to-date state wins
4. Always write to the master; read from the master or a slave, depending on the client’s choice
5. Byte stream replication
6. Available in the commercial version
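The quorum rule and the "most up-to-date state wins" rule can be sketched in a few lines. This is a simplification with hypothetical names; it compares only the last written log position and does not model the actual election protocol.

```python
def quorum(cluster_size):
    """Smallest majority of the cluster: N/2 + 1 with integer division."""
    return cluster_size // 2 + 1


def elect_master(cluster_size, reachable):
    """Pick a master among the nodes that answered.

    reachable: dict node name -> last written log position.
    Returns None when fewer than a quorum of nodes can be reached.
    """
    if len(reachable) < quorum(cluster_size):
        return None
    # The most advanced log wins; the node name breaks ties deterministically.
    return max(reachable, key=lambda node: (reachable[node], node))


print(quorum(3), quorum(5))                                  # 2 3
print(elect_master(3, {"node-a": 100, "node-b": 120}))       # node-b
print(elect_master(3, {"node-a": 100}))                      # None
```

Requiring a majority means two halves of a partitioned cluster can never both elect a master.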
Projections
Projection – the process of taking an event stream and converting it to some other form (e.g. another event stream or state object)
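As a sketch of this definition, a projection can be modelled as a fold over the stream with one handler per event type. The code below is Python and purely illustrative; the `project` function and the event dictionaries are assumptions, not the query language of the Projections Engine.

```python
def project(events, handlers, initial_state):
    """Fold the event stream into a state object, one handler per event type."""
    state = initial_state
    for event in events:
        handler = handlers.get(event["type"])
        if handler is not None:  # events without a handler are skipped
            state = handler(state, event)
    return state


def count_payment(state, event):
    return {
        "payments": state["payments"] + 1,
        "spent": state["spent"] - event["amount"],
    }


events = [
    {"type": "GotSalary", "amount": 100},
    {"type": "MadeInternetPayment", "amount": -15},
    {"type": "MadeTerminalPayment", "amount": -5},
]
handlers = {
    "MadeInternetPayment": count_payment,
    "MadeTerminalPayment": count_payment,
}

summary = project(events, handlers, {"payments": 0, "spent": 0})
print(summary)  # {'payments': 2, 'spent': 20}
```

Emitting a new event from a handler instead of returning state would give the other form mentioned above, a derived event stream.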
Projections Engine
1. Built-in query language – JavaScript
2. Based on Google’s V8
Demo: Projections Engine
Lessons Learned
Immutability is good
Pure async is fast, although hard to develop
Lockless programming is hard
Distributed programming is even harder
Debugger is almost useless
Unit-testing is a limited tool
Verbose logs rock
Real-life testing rocks like nothing else
Bugs in .NET
[System.Security.SecuritySafeCritical]
public virtual void Flush(Boolean flushToDisk) {
    // This code is duplicated in Dispose
    if (_handle.IsClosed) __Error.FileNotOpen();
    if (_writePos > 0) {
        FlushWrite(false);
        if (flushToDisk) {
            if (!Win32Native.FlushFileBuffers(_handle)) {
                __Error.WinIOError();
            }
        }
    }
    else if (_readPos < _readLen && CanSeek) {
        FlushRead();
    }
    _readPos = 0;
    _readLen = 0;
}
*Appears to be fixed in 4.5
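As the listing reads, FlushFileBuffers is reached only when _writePos > 0, so a Flush(true) issued after the buffer has already been emptied would not appear to force the data to disk. For comparison, here is a Python sketch of the two separate steps a durable append needs. The `durable_append` helper is a hypothetical name; `file.flush()` and `os.fsync()` are standard library calls.

```python
import os
import tempfile


def durable_append(path, payload):
    """Append bytes and ask the OS to push them to the storage device."""
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()             # user-space buffer -> operating system cache
        os.fsync(f.fileno())  # operating system cache -> storage device


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "log.bin")
    durable_append(path, b"event-1;")
    durable_append(path, b"event-2;")
    with open(path, "rb") as f:
        contents = f.read()

print(contents)  # b'event-1;event-2;'
```

The second step must run whether or not the first one had anything left to write; tying the two together is the kind of mistake the listing appears to show.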
Bugs in Mono
TCP/IP Stack deadlocks
Concurrent Stack/Queue
XML Serializer
File Stream issues
General slowness: 20K w/s
Disk caches: fake flush and reordering
*By default, can be disabled
Mind the HDD/SSD difference
*It is possible to optimize for both
There is a long way to simplicity
Experiments: the way it works
Further plans
1. Horizontal scalability
2. Hosted service
3. Adding other features
Links
1. GetEventStore.com – official web site
2. GitHub.com/EventStore/EventStore – source code
3. OreDev.org/2012/sessions/a-deep-look-into-the-event-store – A deep look into The Event Store by Greg Young
4. MartinFowler.com/eaaDev/EventSourcing.html – Event Sourcing overview by Martin Fowler
5. Google <Event Sourcing|Event Store>