RAVENDB 4.0
Oren [email protected]
WORK SCHEDULE
Timeline
2012 2013 2014 2015 2016 2017
2.5 Beta 2.5 Release 2.5 Maintenance 2.5 End of life
3.0 Beta 3.0 Release 3.0 Maintenance 3.0 End of life
3.5 Beta 3.5 Release 3.5 Maintenance
4.0 Beta 4.0 Release
RAVENDB IS…
• Design / Development started in 2008
• In production since 2010
LET’S DO SOMETHING ABOUT IT
CORE DESIGN GUIDELINES
• Complexity is not maintainable
• We control the stack
• Top to bottom
• At least an order of magnitude better performance
• Where does this hurt?
• Stop doing that
GOODBYE ESENT
• We ran into multiple crashing bugs in Esent
• Some can be worked around
• Some cannot
• Doesn’t run on Linux
• Forces a high degree of separation from storage
VORON
• We can freely modify
• Runs on Windows & Linux
• Optimized storage for our needs
• Memory mapped files
• Remember this
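To illustrate why memory mapped files matter here, a minimal sketch in Python (not Voron's actual code): the file is mapped into memory and a single page is read by offset, without reading the pages before it. The page size, the file layout, and the function names are assumptions made up for this example.

```python
import mmap
import os
import struct
import tempfile

PAGE_SIZE = 4096  # hypothetical page size, chosen for illustration only


def write_pages(path, values):
    """Write one fixed-size page per value; each page starts with a 64-bit integer."""
    with open(path, "wb") as f:
        for v in values:
            f.write(struct.pack("<q", v).ljust(PAGE_SIZE, b"\0"))


def read_page_value(path, page_number):
    """Map the file and read the leading integer of one page by its offset."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            offset = page_number * PAGE_SIZE
            (value,) = struct.unpack_from("<q", mapped, offset)
            return value


def demo():
    """Write three pages to a temporary file and read the last one back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_pages(path, [10, 20, 30])
        return read_page_value(path, 2)
    finally:
        os.remove(path)
```

The operating system decides which pages are actually loaded from disk, so the reader only touches the bytes it asks for.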
THE BLITTABLE FORMAT
• Random access
• Use unmanaged memory
• Pointer arithmetic access
• No parsing
• Once loaded, zero cost to use
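The idea of random access without parsing can be sketched as follows. This is not the real blittable layout: the header, the offset table, and the names `encode` and `read_field` are assumptions invented for the example. The point is that one field is located through an offset table and sliced out of the buffer, while the other values are never decoded.

```python
import struct

HEADER = struct.Struct("<I")     # number of fields
ENTRY = struct.Struct("<IIII")   # name offset, name length, value offset, value length


def encode(doc):
    """Pack a dict of str -> str as: header, offset table, then raw name/value bytes."""
    data_start = HEADER.size + ENTRY.size * len(doc)
    table = bytearray()
    data = bytearray()
    for key, value in doc.items():
        name_bytes = key.encode("utf-8")
        value_bytes = value.encode("utf-8")
        name_off = data_start + len(data)
        data += name_bytes
        value_off = data_start + len(data)
        data += value_bytes
        table += ENTRY.pack(name_off, len(name_bytes), value_off, len(value_bytes))
    return bytes(HEADER.pack(len(doc)) + table + data)


def read_field(buf, name):
    """Find one field via the offset table and decode only that field's value."""
    view = memoryview(buf)
    wanted = name.encode("utf-8")
    (count,) = HEADER.unpack_from(view, 0)
    for i in range(count):
        n_off, n_len, v_off, v_len = ENTRY.unpack_from(view, HEADER.size + i * ENTRY.size)
        if view[n_off:n_off + n_len] == wanted:
            return bytes(view[v_off:v_off + v_len]).decode("utf-8")
    return None
```

For example, `read_field(encode({"Name": "Arava", "Kind": "Dog"}), "Kind")` jumps to the bytes of `Kind` without touching the value of `Name`.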
BIGGEST CHANGE BY FAR
• Blittable + Voron = Different ball game
• Prefetching
• Index batches
• Caching
• Parsing
• Managed allocations
• The great simplification
INDEXING
• No need to batch indexes together
• Indexes can now run independently of one another
• Indexes can be given different priorities
• Indexing code is about an order of magnitude simpler
ON DISK STORAGE
Each collection is separated, and they are all connected
Users/1
• Etag: 2
Users/2
• Etag: 4
Companies/1
• Etag: 1
Companies/2
• Etag: 3
[Diagram: etags per collection (Users: 2, 4; Companies: 1, 3) and across all documents (1, 2, 3, 4)]
INDEXING
• Each index can cover just the documents in the collections it indexes
• Changes to one collection won’t make other indexes stale
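A small model of per-collection etags shows why a write to one collection leaves other indexes fresh. This is a sketch under stated assumptions, not RavenDB's implementation: the `Store` and `Index` classes and their fields are hypothetical, and only the document ids and etags of the first four writes come from the slide above.

```python
from collections import defaultdict


class Store:
    """Assigns one global etag per write and keeps a separate etag list per collection."""

    def __init__(self):
        self.last_etag = 0
        self.by_collection = defaultdict(list)  # collection -> [(etag, doc_id), ...]

    def put(self, collection, doc_id):
        self.last_etag += 1
        self.by_collection[collection].append((self.last_etag, doc_id))
        return self.last_etag


class Index:
    """Covers a single collection and remembers the last etag it processed."""

    def __init__(self, collection):
        self.collection = collection
        self.last_indexed_etag = 0

    def is_stale(self, store):
        docs = store.by_collection[self.collection]
        return bool(docs) and docs[-1][0] > self.last_indexed_etag

    def run(self, store):
        """Process only this collection's documents newer than the last indexed etag."""
        indexed = []
        for etag, doc_id in store.by_collection[self.collection]:
            if etag > self.last_indexed_etag:
                indexed.append(doc_id)
                self.last_indexed_etag = etag
        return indexed


def demo():
    store = Store()
    # Same etag order as the slide: Companies/1=1, Users/1=2, Companies/2=3, Users/2=4
    for collection, doc_id in [("Companies", "Companies/1"), ("Users", "Users/1"),
                               ("Companies", "Companies/2"), ("Users", "Users/2")]:
        store.put(collection, doc_id)
    users_index = Index("Users")
    users_index.run(store)
    store.put("Companies", "Companies/3")  # hypothetical extra write
    return users_index.is_stale(store)
```

In `demo`, the extra write to Companies does not make the Users index stale, because that index only looks at the etags of its own collection.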
INDEXES
• Lucene is the only game in town
• Corax can’t compare
• Lucene isn’t reliable
• Voron is reliable
BETTING ON CORECLR
• Port RavenDB to CoreCLR
• Don’t just copy the code
• Adjust based on new design
• Make it work
• Make it work right
• Make it work fast
AUTO INDEXES
• In CoreCLR RC1 – no support for dynamic loading of assemblies
• Means, no indexing
• 4.0 has auto indexes
• More efficient (no LINQ)
• Users
• Name
• Email (Analyzed)
STATIC INDEXES
• POC on CoreCLR RC2 works
• TBD
IN THE MEANWHILE…
MAP/REDUCE
Theory Practice
MAP/REDUCE
LOAD DOCUMENT
WEB SOCKETS
• We have bidirectional connections
• Let’s use that
• Previously, we had to manage multiple unidirectional connections or multiple requests
AUTHENTICATION
• API keys
• Single web socket request
• Deferred / external
• Windows auth
• Certificates
BULK INSERT
• Blittable data directly from client
• No parsing costs server side
• Lazy transaction
• ACID at the end
REPLICATION
• Persistent web socket connections
• Rely on TCP state
• Aware of liveness
• Per peer
• Large cluster handling
CHANGE VECTOR
• Gossip between servers
• Conflict detection
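Conflict detection with a change vector can be sketched with the generic version-vector comparison below. The slide does not spell out the rules, so this is an assumption: each document version carries a map of server id to counter, and two versions conflict when each is ahead of the other on some server. The function names and server ids are hypothetical.

```python
def compare(a, b):
    """Compare two change vectors (dicts of server id -> counter)."""
    servers = set(a) | set(b)
    a_ahead = any(a.get(s, 0) > b.get(s, 0) for s in servers)
    b_ahead = any(b.get(s, 0) > a.get(s, 0) for s in servers)
    if a_ahead and b_ahead:
        return "conflict"   # concurrent edits: neither version has seen the other
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(a, b):
    """Combine two vectors after a conflict is resolved: per-server maximum."""
    return {s: max(a.get(s, 0), b.get(s, 0)) for s in set(a) | set(b)}
```

For example, `{"A": 2, "B": 1}` against `{"A": 1, "B": 2}` is a conflict, while `{"A": 2, "B": 1}` against `{"A": 1, "B": 1}` is simply a newer version.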
MORE SERVER RESPONSIBILITIES
• Server side:
• Wait for non stale results
• Write assurances
PERFORMANCE
TRIE BASED ROUTING
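The slide gives only the title, so the following is a generic sketch of trie based routing, not RavenDB's router. Routes are stored segment by segment in nested dictionaries, and a request is matched by walking one node per path segment. The `*` wildcard, the `RouteTrie` class, and the example paths are assumptions made up for illustration.

```python
HANDLER = object()  # sentinel key marking a node where a route ends


class RouteTrie:
    def __init__(self):
        self.root = {}

    def add(self, path, handler):
        """Insert a route, one trie node per path segment ('*' matches any segment)."""
        node = self.root
        for segment in path.strip("/").split("/"):
            node = node.setdefault(segment, {})
        node[HANDLER] = handler

    def match(self, path):
        """Walk the trie; prefer a literal segment, fall back to '*', no backtracking."""
        node = self.root
        for segment in path.strip("/").split("/"):
            if segment in node:
                node = node[segment]
            elif "*" in node:
                node = node["*"]
            else:
                return None
        return node.get(HANDLER)


def build_example():
    routes = RouteTrie()
    routes.add("/databases/*/docs", "docs")        # hypothetical route
    routes.add("/databases/*/indexes", "indexes")  # hypothetical route
    routes.add("/build/version", "version")        # hypothetical route
    return routes
```

Lookup cost depends on the number of segments in the request path, not on the number of registered routes.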
THE COST OF BLITTABLE
WHAT’S LEFT?
• LOTS
• Client
• Bundles
• Studio
• Performance
• Performance
• Performance
• Stability
• Performance
SCHEDULE
• Dec 2016 – Public Preview
• Mar 2017 – Beta
• May 2017 – Release Candidate
• July 2017 – RTM
QUESTIONS?