• Why SQL / NoSQL
• Alternatives to SQL
• Big Data
• Best of both worlds – InnoDB/memcached
• MySQL Cluster – NDB/memcached
• Q&A
Synopsis – How to use MySQL as a relational data store according to Codd & Date while gaining the ability to access schema-less data and looking cool while doing it.
Wikipedia: Edgar Frank "Ted" Codd was an English computer scientist who invented the relational model for database management, the theoretical basis for relational databases. ...Codd continued to develop and extend his relational model, sometimes in collaboration with Chris Date
The purpose of the relational model is to provide a declarative method for specifying data and queries: users directly state what information the database contains and what information they want from it, and let the database management system software take care of describing data structures for storing the data and retrieval procedures for answering queries. Wikipedia
● Optional local caching: three options – “cache-only”, “innodb-only”, and “caching” (both “cache” and “innodb store”). These local options can apply to each of four Memcached operations (set, get, delete and flush).
● Batch operations: users can specify the batch commit size for InnoDB memcached operations via “daemon_memcached_r_batch_size” and “daemon_memcached_w_batch_size” (default 32).
● Supports all memcached configuration options through the MySQL configuration variable “daemon_memcached_option”.
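To make the batch commit size concrete, here is a minimal Python sketch of the idea behind “daemon_memcached_w_batch_size”: writes accumulate in an open transaction and are committed once a set number of write operations has been reached. This is a toy in-memory model, not the plugin's implementation; the class BatchedStore and its attribute names are hypothetical.

```python
class BatchedStore:
    """Toy model of batched commits (hypothetical; not the InnoDB memcached plugin)."""

    def __init__(self, w_batch_size=32):
        self.w_batch_size = w_batch_size  # write operations per commit
        self.committed = {}               # data visible to other sessions, e.g. SQL readers
        self.pending = {}                 # writes made since the last commit
        self.writes_since_commit = 0
        self.commits = 0

    def set(self, key, value):
        # Record the write in the open transaction, then commit once the batch is full.
        self.pending[key] = value
        self.writes_since_commit += 1
        if self.writes_since_commit >= self.w_batch_size:
            self.commit()

    def commit(self):
        # Publish all pending writes and start a new, empty transaction.
        self.committed.update(self.pending)
        self.pending.clear()
        self.writes_since_commit = 0
        self.commits += 1

    def get(self, key):
        # The writing session sees its own uncommitted writes first.
        if key in self.pending:
            return self.pending[key]
        return self.committed.get(key)


store = BatchedStore(w_batch_size=3)
for i in range(7):
    store.set(f"k{i}", i)
# Seven writes with a batch size of 3: two commits, one write still pending.
print(store.commits, len(store.pending))
```

A larger batch size means fewer commits and higher throughput, at the cost of a longer delay before other sessions see the data and more uncommitted work at risk if the server stops.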
Fault tolerant, auto sharding, shared nothing, data on redundant boxes, 99.999% uptime, ACID, geographical replication between clusters, & no single point of failure
Option 1: co-locate the memcached API with the data nodes
The applications can connect to any of the memcached API nodes – if one should fail just switch to another as it can access the exact same data instantly. As you add more data nodes you also add more memcached servers and so the data access/storage layer can scale out (until you hit the 48 data node limit).
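The "if one should fail just switch to another" behaviour can be sketched on the client side as below. This is an illustrative Python sketch only: the function connect_to_any and the host names are hypothetical, and a real memcached client library may already provide its own failover. Port 11211 is the usual memcached default.

```python
import socket


def connect_to_any(nodes, timeout=1.0, connect=socket.create_connection):
    """Return (node, connection) for the first node that accepts a connection.

    `nodes` is a list of (host, port) pairs. `connect` can be swapped out,
    which lets the demo below run without a real cluster.
    """
    last_error = None
    for node in nodes:
        try:
            return node, connect(node, timeout)
        except OSError as exc:
            # This node is unreachable; every node sees the same data, so try the next.
            last_error = exc
    raise ConnectionError(f"no memcached API node reachable: {last_error}")


# Demo with a stand-in connect function (hypothetical host names, no network access).
def fake_connect(node, timeout):
    if node[0] == "ndb1.example":
        raise OSError("connection refused")
    return f"connection to {node[0]}"


nodes = [("ndb1.example", 11211), ("ndb2.example", 11211)]
chosen, conn = connect_to_any(nodes, connect=fake_connect)
print(chosen, conn)
```

Because every memcached API node reads and writes the same MySQL Cluster data, the client does not need to care which node it ends up on.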
For maximum flexibility, you can have a separate Memcached layer so that the application, the Memcached API & MySQL Cluster can all be scaled independently.
Another simple option is to co-locate the Memcached API with the application. In this way, as you add more application nodes you also get more Memcached throughput. If you need more data storage capacity you can independently scale MySQL Cluster by adding more data nodes. One nice feature of this approach is that failures are handled very simply – if one App/Memcached machine should fail, all of the other applications just continue accessing their local Memcached API.
In all of the last three examples, there has been a single source for the data (it’s all in MySQL Cluster).
● If you choose, you can still have all or some of the data cached within the memcached server (and specify whether that data should also be persisted in MySQL Cluster); you choose how to treat different pieces of your data. For example, data that is written to and read from frequently can be stored just in MySQL Cluster; data that is written rarely but read very often can also be cached in memcached; and data that has a short lifetime and wouldn’t benefit from being stored in MySQL Cluster can be held only in memcached. The beauty is that you configure this on a per-key-prefix basis (through tables in MySQL Cluster) and the application doesn’t have to care: it just uses the memcached API and relies on the software to store data in the right place(s) and to keep everything in sync.
● Of course if you want to access the same data through SQL then you’d make sure that it was configured to be stored in MySQL Cluster.
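The per-key-prefix idea above can be illustrated with a small Python sketch. It is a toy model under stated assumptions: two dictionaries stand in for the memcached cache and MySQL Cluster, the policy names and the PrefixRouter class are hypothetical, and longest-prefix matching is an assumption of this sketch. In the real system the mapping lives in configuration tables in MySQL Cluster rather than in application code.

```python
CACHE_ONLY, CLUSTER_ONLY, BOTH = "cache-only", "cluster-only", "both"


class PrefixRouter:
    """Route keys to a cache, a persistent store, or both, based on key prefix."""

    def __init__(self, policies, default=CLUSTER_ONLY):
        # Check longer prefixes first so "user:session:" wins over "user:".
        self.policies = sorted(policies.items(), key=lambda kv: len(kv[0]), reverse=True)
        self.default = default
        self.cache = {}    # stand-in for the memcached server's local cache
        self.cluster = {}  # stand-in for MySQL Cluster

    def policy_for(self, key):
        for prefix, policy in self.policies:
            if key.startswith(prefix):
                return policy
        return self.default

    def set(self, key, value):
        policy = self.policy_for(key)
        if policy in (CACHE_ONLY, BOTH):
            self.cache[key] = value
        if policy in (CLUSTER_ONLY, BOTH):
            self.cluster[key] = value

    def get(self, key):
        policy = self.policy_for(key)
        if policy != CLUSTER_ONLY and key in self.cache:
            return self.cache[key]
        if policy == CACHE_ONLY:
            return None
        value = self.cluster.get(key)
        if value is not None and policy == BOTH:
            self.cache[key] = value  # repopulate the cache after a miss
        return value


router = PrefixRouter({
    "counter:": CLUSTER_ONLY,  # written and read frequently
    "profile:": BOTH,          # written rarely, read very often
    "session:": CACHE_ONLY,    # short-lived, not worth persisting
})
router.set("counter:hits", 10)
router.set("profile:42", "Ada")
router.set("session:abc", "token")
print(sorted(router.cluster))  # keys that reached the stand-in cluster
```

The application code only ever calls set and get; which store holds a given key is decided entirely by the prefix table, which is the point the slide makes.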