NoSQL MongoDB and Redis as alternatives to traditional RDBMS

NoSQL solutions

Jan 16, 2017


Felix Crisan
Transcript
Page 1: NoSQL solutions

NoSQL

MongoDB and Redis as alternatives to traditional RDBMS

Page 2: NoSQL solutions

Then...

Page 3: NoSQL solutions

...and now

*This thing weighs less than 50g

Page 4: NoSQL solutions

Meaning of NoSQL

1970 = We have no SQL
1980 = Know SQL
2000 = No SQL!
2005 = Not only SQL
2014 = No, SQL

(slide adapted from @markmadsen)

Page 5: NoSQL solutions

MongoDB

Page 6: NoSQL solutions

MongoDB

● It is the “new MySQL”
● Project started in 2007 by 10gen (now MongoDB Inc)
● Cross-platform, open-source
● 5th most used DBMS & most used Document Store* (the next document store, CouchDB, ranks 21st)

* According to db-engines.com as of Oct 2014

Page 7: NoSQL solutions

Characteristics

● “It's really a hybrid database with features from a few different places.” (Gaetan Voyer-Perrault on Quora)

● Document Oriented but NO SCHEMA!
● Documents grouped in Collections
● Binary JSON (BSON) format
● Load Balancing (automated sharding, sharding key can be user defined)
● Replication (Replica Sets)
● Automated failover
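
To make the "user-defined sharding key" idea concrete, here is a minimal Python sketch of hash-based routing. It is a toy scheme, not MongoDB's actual algorithm (MongoDB splits a collection into chunks by ranges of the shard key or of its hash); the function name `shard_for`, the `user_id` field, and the shard count of 4 are all assumptions made up for illustration.

```python
import hashlib

def shard_for(shard_key_value: str, num_shards: int) -> int:
    """Map a shard-key value to a shard index with a stable hash (toy scheme)."""
    digest = hashlib.md5(shard_key_value.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards

# Hypothetical documents, sharded on the user-chosen key "user_id".
docs = [{"user_id": f"user{i}", "score": i} for i in range(1000)]

placement = {}
for doc in docs:
    shard = shard_for(doc["user_id"], num_shards=4)
    placement.setdefault(shard, []).append(doc)

# Every document lands on exactly one shard, and the same key
# always routes to the same shard.
docs_per_shard = {shard: len(items) for shard, items in placement.items()}
```

The point of choosing the key yourself is that it decides which documents end up together: a key with few distinct values would pile most documents onto one shard.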

Page 8: NoSQL solutions

Characteristics - continued

● Primary and Secondary Indexes
● JavaScript for user-defined functions (UDF)
● MapReduce
● Capped Collections
● Aggregation Framework since 2.2
● Ad-hoc Query Support
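
A capped collection behaves like a fixed-size ring buffer: it keeps insertion order and, once full, overwrites the oldest documents. The sketch below imitates that behaviour in plain Python with `collections.deque`. It is only an analogy: MongoDB caps a collection by size in bytes (optionally also by document count), while this toy caps by entry count, and the `capped_log` name and limit of 3 are invented for the example.

```python
from collections import deque

# A "capped collection" that holds at most 3 documents (toy limit).
capped_log = deque(maxlen=3)

for n in range(1, 6):
    capped_log.append({"seq": n, "msg": f"event {n}"})

# Only the three most recent inserts survive, still in insertion order.
remaining = [entry["seq"] for entry in capped_log]
```

This is why capped collections suit logs and buffers: writes stay cheap and old data ages out without a cleanup job.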

Page 9: NoSQL solutions

Caveats

Page 10: NoSQL solutions

Generic performance tips

● Use 64-bit OS
● Lots of RAM, fast disks (was anyone expecting something else?)
● Ensure that at least indexes + working set fit in RAM (check with db.stats(), db.<coll>.stats()) - if not, you might want to try TokuMX

● Design for de-normalized data models
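
The "indexes + working set fit in RAM" rule can be turned into a quick back-of-the-envelope check. The Python sketch below works on a plain dictionary shaped like the output of `db.stats()` (`dataSize` and `indexSize` are real fields there, reported in bytes by default). The numbers, the 25% "hot data" fraction, and the 80% headroom factor are assumptions picked for illustration, not measurements.

```python
GIB = 1024 ** 3

def fits_in_ram(index_size_bytes, working_set_bytes, ram_bytes, headroom=0.8):
    """Return True if indexes plus working set fit in a share of RAM.

    headroom is the assumed fraction of RAM available to the database.
    """
    return index_size_bytes + working_set_bytes <= ram_bytes * headroom

# Hypothetical values, in the shape returned by db.stats().
stats = {"dataSize": 6 * GIB, "indexSize": 2 * GIB}

# Assumption: a quarter of the data is accessed frequently.
hot_fraction = 0.25
working_set = stats["dataSize"] * hot_fraction

ok = fits_in_ram(stats["indexSize"], working_set, ram_bytes=8 * GIB)
```

The working set is the hard part to estimate; the fraction used here is a placeholder you would replace with what your access patterns actually show.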

Page 11: NoSQL solutions

Generic performance tips

● Write-Concerns
● Shard early
● Fixed (or at least bounded) record size => better write performance
● Use short attribute names (reduces index & data size, OFC!)
● EXT4 or XFS
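
Short attribute names matter because a schemaless store keeps the field names inside every document. The Python sketch below estimates the saving using compact JSON as a rough stand-in for BSON (an assumption: BSON adds type and length bytes, but the field names cost the same number of bytes in both). The two sample documents are invented for the example.

```python
import json

# The same record with verbose and with short field names (hypothetical).
long_doc = {"customer_first_name": "Ana", "customer_last_name": "Pop", "account_balance": 125}
short_doc = {"fn": "Ana", "ln": "Pop", "bal": 125}

def encoded_size(doc):
    """Size in bytes of the compact JSON encoding (a proxy for BSON size)."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))

saved_per_doc = encoded_size(long_doc) - encoded_size(short_doc)

# The saving repeats in every document, so it scales with collection size.
saved_for_130m_docs = saved_per_doc * 130_000_000
```

The trade-off is readability: short names usually call for a mapping layer in the application so the code can still use descriptive names.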

Page 12: NoSQL solutions

IRL

● Virtualized server, 8G RAM, 4 vCPU - no sharding, no replica sets
● 100 inserts/s into a 130M doc collection WITH a secondary index (avg doc size 0.6k)
● 20 inserts/s into a 3M doc collection WITH 18 secondary indexes (avg doc size 10k)

Page 13: NoSQL solutions

Use Cases

● Logs
● Location Data (Mongo has built-in Geospatial ops)
● Account and User Profiles
● Messaging
● (complex) Config Data
● http://www.mongodb.com/who-uses-mongodb (hint: Expedia, Business Insider, The Weather Channel, Foursquare, eBay)

Page 14: NoSQL solutions

Redis

Page 15: NoSQL solutions

Redis

● Salvatore Sanfilippo (@antirez)
● Started in 2009
● Key-Value Store
● 11th most used DBMS & most used KV Store* (the next key-value store, memcached, ranks 19th)
● Sponsored by Pivotal (spinoff EMC/VMware)

* According to db-engines.com as of Oct 2014

Page 16: NoSQL solutions

Characteristics

● Holds all data in memory, persists on disk
● Data Models
  ○ Strings/Blobs/Bit-Maps (not really Bitmaps)
  ○ Hashes (hashtables)
  ○ Lists (linked lists)
  ○ Sets
  ○ Sorted Sets
● HyperLogLog (since 2.8.9 - trades accuracy for memory)
● Master-Slave Replication
● High Availability (through Sentinel)

Page 17: NoSQL solutions

Characteristics - continued

● Redis Cluster in the works (not production ready yet) - sharding
  ○ asynchronous replication
  ○ does not guarantee strong consistency (may ‘forget’ writes)
● AOF fsync - default policy is everysec (once per second)
● Does not support secondary indexes
● Pub/Sub mode since 2.0
● Key expiry
● Server scripting with Lua
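
Key expiry is what makes Redis convenient as a cache or session store: a key can carry a time-to-live and disappears on its own (in Redis via commands such as EXPIRE or SETEX). The Python sketch below imitates the idea with an in-memory dictionary. It is a toy, not how Redis is implemented: Redis removes expired keys both lazily on access and through periodic background sampling, while this sketch does only the lazy part. `ExpiringStore` and the injected fake clock are invented for the example.

```python
import time

class ExpiringStore:
    """Toy key-value store with per-key time-to-live (lazy expiry only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock  # injectable so the example needs no sleeping

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # expired: drop it on access
            return None
        return value

# Drive the store with a fake clock instead of real time.
now = [0.0]
store = ExpiringStore(clock=lambda: now[0])
store.set("session:42", "alice", ttl_seconds=30)

before = store.get("session:42")  # read while the key is still alive
now[0] = 31.0                     # 31 "seconds" later
after = store.get("session:42")   # read after the TTL has passed
```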

Page 18: NoSQL solutions

IRL

● Virtualized server, 4G RAM, 1 vCPU
● 50k+ get/set per second (redis-benchmark)
● Only 128 queries out of 1165550375 took over 10ms (0.00001%)
  ○ uptime_in_days:439
  ○ used_memory_human:424.09M
  ○ used_memory_peak_human:834.94M
  ○ total_connections_received:1352935
  ○ db0:keys=610884,expires=355397
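
The percentages behind these figures are easy to re-derive. The short Python calculation below uses only the numbers quoted on the slide; the variable names are mine.

```python
slow_queries = 128
total_queries = 1165550375

# Share of queries slower than 10 ms, as a percentage.
slow_percent = slow_queries / total_queries * 100

# Equivalent "one in N" figure.
one_in_n = total_queries / slow_queries

# Share of keys in db0 that carry an expiry (keys=610884, expires=355397).
expiring_share = 355397 / 610884
```

So the slow-query rate is about 0.00001%, or roughly one query in nine million, and a bit more than half of the keys have a time-to-live set.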

Page 19: NoSQL solutions

Generic performance tips

● Use short key names (reduces data size, OFC!)
● You can create secondary indexes (but you have to maintain them, e.g. using SET)
● You can have ad-hoc queries (actually is query): using SORT
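
Maintaining a secondary index by hand means that every write updates two things: the record itself and a set of record ids per indexed value. The Python sketch below shows the bookkeeping with an in-memory stand-in for the store. `TinyKV`, `save_user`, and the `user:<id>` / `idx:city:<name>` key layout are invented for illustration; the `sadd`, `srem`, and `smembers` methods mirror the Redis set commands of the same names but run entirely in Python.

```python
from collections import defaultdict

class TinyKV:
    """In-memory stand-in for a key-value store with hashes and sets."""

    def __init__(self):
        self.hashes = {}
        self.sets = defaultdict(set)

    def sadd(self, key, member):
        self.sets[key].add(member)

    def srem(self, key, member):
        self.sets[key].discard(member)

    def smembers(self, key):
        return set(self.sets.get(key, set()))

def save_user(kv, user_id, city):
    """Write the record and keep the by-city index in step with it."""
    key = f"user:{user_id}"
    old = kv.hashes.get(key)
    if old is not None:
        # Remove the id from the index entry of the previous city.
        kv.srem(f"idx:city:{old['city']}", user_id)
    kv.hashes[key] = {"city": city}
    kv.sadd(f"idx:city:{city}", user_id)

kv = TinyKV()
save_user(kv, "1", "Cluj")
save_user(kv, "2", "Cluj")
save_user(kv, "3", "Iasi")
save_user(kv, "2", "Iasi")  # user 2 moves: the index must follow

cluj_users = kv.smembers("idx:city:Cluj")
iasi_users = kv.smembers("idx:city:Iasi")
```

Against a real server the two updates should happen atomically (for example inside a transaction or a Lua script); otherwise a crash between them leaves the index out of step with the data.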

Page 20: NoSQL solutions

Use Cases

● Cache
● IPSS/IPC
● Queue mechanisms (see e.g. Resque)
● Log/Task buffers
● Statistics and aggregation datastore
● (anywhere you use memcached)
● http://redis.io/topics/whos-using-redis (hint: Twitter, GitHub, Snapchat, StackOverflow, among others)

Page 21: NoSQL solutions

Recap

One size does NOT fit all!

Page 22: NoSQL solutions

Further reading

● Must read: http://blog.andreamostosi.name/big-data/ (almost exhaustive list of all things NoSQL and BigData)