Mashing the Data: Real-Time replication from MySQL to Google Cloud Datastore


Feb 15, 2017


Felix Crisan
Transcript
Page 1: Mashing the data

Mashing the Data
Real-Time replication from MySQL to Google Cloud Datastore

Page 2: Mashing the data

Ingredients
● MySQL
● NodeJS
● ZongJi
● Google Cloud Datastore

Page 3: Mashing the data

There are two types of DBAs:
1) DBAs that do backups
2) DBAs that will do backups

Page 4: Mashing the data

MySQL
● Most used open source DB, second place overall after Oracle (but almost equal)*
● Since 1995
● Currently at version 5.7 (5.7.16 in Oct’16)
● Several forks: MariaDB, Percona
● Several storage engines, most used is InnoDB
● NDB Cluster and Master-Master Replication for HA

* According to http://db-engines.com/en/ranking

Page 5: Mashing the data

A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"

Page 6: Mashing the data

MySQL replication
● Master - Slave(s)
● Slaves can be Masters in their turn (Master->Slave->Slave->...->Slave)
  ○ log_slave_updates
● Only data-modifying queries are logged (Create, Update, Delete; not Reads)
● 2 ½ types of replication
  ○ Statement Based (SBR) -> binary log records the queries (UPDATE … SET …), which are then replayed on the slave
  ○ Row Based (RBR) -> binary log records the values of each affected row before and after the change is applied
  ○ Mixed -> binary log records a mix of SBR and RBR (default is SBR, but for certain statements and storage engines, logging automatically switches to row-based)

Page 7: Mashing the data

Q: Why do you never ask SQL people to help you move your furniture?

A: They sometimes drop the table

Page 8: Mashing the data

MySQL replication (cont’d)
● SBR is good when changes affect lots of rows (e.g. for 1k modified rows we only send a few bytes across the wire)
● SBR has problems when there are inconsistencies between master and slave or when queries are not deterministic (e.g. UPDATE … SET … LIMIT 100)
● RBR is good at maintaining better consistency (as every changed row is replicated)
● RBR can be problematic when many rows are changed with a single statement (lots of traffic over the network)
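
The LIMIT problem can be shown with a toy in-memory model. This is only a sketch, not how MySQL works internally: the tables are plain arrays, and `applyStatement` and `applyRowImages` are hypothetical helpers invented for the illustration. The slaves hold the same rows as the master but in a different physical order, which is enough to make an unordered UPDATE … LIMIT touch different rows when it is replayed as a statement.

```javascript
// Emulates: UPDATE t SET status = 'done' WHERE status = 'new' LIMIT <limit>
// Without ORDER BY, which rows are hit depends on each server's own row order.
// Returns the before/after images of the rows it changed (what RBR would log).
function applyStatement(table, limit) {
  const images = [];
  for (const row of table) {
    if (images.length === limit) break;
    if (row.status === 'new') {
      images.push({ before: { ...row }, after: { ...row, status: 'done' } });
      row.status = 'done';
    }
  }
  return images;
}

// Emulates RBR on the slave: apply the exact "after" image to the matching row.
function applyRowImages(table, images) {
  for (const { after } of images) {
    const target = table.find((r) => r.id === after.id);
    if (target) Object.assign(target, after);
  }
}

// Sorted ids of the rows that ended up with status 'done'.
const doneIds = (table) =>
  table.filter((r) => r.status === 'done').map((r) => r.id).sort((a, b) => a - b);

const makeTable = (ids) => ids.map((id) => ({ id, status: 'new' }));

const master = makeTable([1, 2, 3]);
const slaveSbr = makeTable([3, 1, 2]); // same rows, different physical order
const slaveRbr = makeTable([3, 1, 2]);

const images = applyStatement(master, 2); // master updates ids 1 and 2
applyStatement(slaveSbr, 2);              // SBR: the statement is replayed
applyRowImages(slaveRbr, images);         // RBR: the row images are applied

const masterDone = doneIds(master);
const sbrDone = doneIds(slaveSbr);
const rbrDone = doneIds(slaveRbr);
console.log({ masterDone, sbrDone, rbrDone });
```

In this model the SBR slave drifts from the master while the RBR slave matches it, at the cost of shipping one row image per changed row.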

Page 9: Mashing the data
Page 10: Mashing the data

Google Cloud Datastore

Page 11: Mashing the data

What is GCD
● NoSQL document database
● Automatic scaling
● High performance
● Flexible storage

Page 12: Mashing the data

GCD (cont’d)
● Balance of strong and eventual consistency
  ○ Entity lookups by key and ancestor queries always receive strongly consistent data
  ○ Other queries are eventually consistent
● Encryption at rest
  ○ Encrypts all data before it is written to disk
● Querying of data through GQL
  ○ Similar to “classic” SQL; e.g. SELECT * FROM myKind WHERE myProp >= 100 AND myProp < 200 or SELECT * FROM myKind ORDER BY myProp DESC LIMIT 100
● By default all properties are indexed; composite indexes are supported (a bit more work to enable them though)
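
From NodeJS, one way to express the two GQL examples above is the chained query builder of the Google Cloud Datastore client. The sketch below assumes `datastore` is an already initialized client from the @google-cloud/datastore package and uses its `createQuery`, `filter`, `order` and `limit` methods; the helper names `rangeQuery` and `topQuery` are made up for the illustration, and newer client versions may prefer a different filter syntax.

```javascript
// Roughly: SELECT * FROM myKind WHERE myProp >= 100 AND myProp < 200
function rangeQuery(datastore) {
  return datastore
    .createQuery('myKind')
    .filter('myProp', '>=', 100)
    .filter('myProp', '<', 200);
}

// Roughly: SELECT * FROM myKind ORDER BY myProp DESC LIMIT 100
function topQuery(datastore) {
  return datastore
    .createQuery('myKind')
    .order('myProp', { descending: true })
    .limit(100);
}

// Usage (assumption: a real client is available):
//   datastore.runQuery(rangeQuery(datastore)).then(function (result) {
//     var entities = result[0];
//   });
```

Neither query uses a key lookup or an ancestor, so both fall under the eventually consistent case described above.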

Page 13: Mashing the data

Our Setup

Page 14: Mashing the data

Setup

MySQL Master --(SBR)--> MySQL Slave --(RBR)--> NodeJS App --(Google Cloud Node modules)--> Google Cloud Datastore

Page 15: Mashing the data

Details about NodeJS App
● Uses ZongJi (https://github.com/nevill/zongji - MySQL binlog listener)

var ZongJi = require('zongji');
var zongji = new ZongJi(config.database);

zongji.on('binlog', function(evt) { doSomething('binlog', evt); });
zongji.on('query', function(evt) { doSomething('query', evt); });
zongji.on('writerows', function(evt) { doSomething('insert', evt); });
zongji.on('updaterows', function(evt) { doSomething('update', evt); });
zongji.on('deleterows', function(evt) { doSomething('delete', evt); });

Page 16: Mashing the data

NodeJS (cont’d)

zongji.start({
  startAtEnd: true,
  includeSchema: {"yourDBhere": true, "yourOtherDBHere": true}, // config.monitor
  includeEvents: ['tablemap', 'writerows', 'updaterows', 'deleterows', 'query', 'rotate']
});

var doSomething = function(type, event) {
  // event has a rows attribute containing every modified row
  // it also has a tableMap containing table metadata (most important - table name)
};
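
A body for doSomething could start by flattening each rows event into one change per row. The sketch below is a guess at the event shape, not ZongJi's documented contract: it assumes the table name lives at `event.tableMap[event.tableId].tableName`, that insert and delete events carry plain row objects, and that update events carry `{before, after}` pairs. The helper `toChanges` and the sample event are hypothetical.

```javascript
// Turns one rows event into a list of {type, table, data} changes.
function toChanges(type, event) {
  // Assumption: tableMap is keyed by tableId and exposes tableName.
  var table = event.tableMap[event.tableId].tableName;
  return event.rows.map(function (row) {
    // Assumption: update rows look like {before: {...}, after: {...}}.
    var data = type === 'update' ? row.after : row;
    return { type: type, table: table, data: data };
  });
}

// Hand-written sample event mimicking the assumed shape.
var sampleUpdate = {
  tableId: 42,
  tableMap: { 42: { parentSchema: 'yourDBhere', tableName: 'users' } },
  rows: [
    { before: { id: 7, name: 'old' }, after: { id: 7, name: 'new' } }
  ]
};

var sampleChanges = toChanges('update', sampleUpdate);
console.log(JSON.stringify(sampleChanges));
```

Keeping this step free of any Datastore calls makes it easy to check against hand-written events before wiring it to a live binlog.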

Page 17: Mashing the data

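NodeJS (last one, I promise)

// datastore is assumed to be an initialized Google Cloud Datastore client
var sendToDataStore = function(namespace, idfldname, row) {
  // note: the first element of the key path is used as the entity kind
  var k = datastore.key([namespace, row[idfldname]]);
  datastore.save({key: k, data: row}, function(err, res) {
    if (err) console.log("ERROR", err);
    else console.log("OK", JSON.stringify(res));
  });
};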
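
The function above covers inserts and updates; deletes need a different call. The sketch below is one possible extension, not the author's code: it takes a hypothetical change object of the form `{type, table, data}`, uses the table name as the entity kind, and relies on the client's `key`, `save` and `delete` methods. It assumes `save` overwrites an existing entity with the same key, so inserts and updates share one path. The client is passed in as a parameter so the routing can be exercised with a stand-in object.

```javascript
// Applies one change to Datastore. `datastore` is assumed to be a
// Google Cloud Datastore client; `idField` is the MySQL primary key column.
function applyChange(datastore, idField, change) {
  var key = datastore.key([change.table, change.data[idField]]);
  if (change.type === 'delete') {
    return datastore.delete(key);
  }
  // insert and update both end up as a save of the full row
  return datastore.save({ key: key, data: change.data });
}

// Stand-in client that only records what would be sent (for illustration).
function makeRecordingClient() {
  var log = [];
  return {
    log: log,
    key: function (path) { return { path: path }; },
    save: function (entity) { log.push(['save', entity]); return Promise.resolve(); },
    delete: function (key) { log.push(['delete', key]); return Promise.resolve(); }
  };
}

var recorder = makeRecordingClient();
applyChange(recorder, 'id', { type: 'update', table: 'users', data: { id: 7, name: 'new' } });
applyChange(recorder, 'id', { type: 'delete', table: 'users', data: { id: 7, name: 'new' } });
console.log(JSON.stringify(recorder.log));
```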

Page 18: Mashing the data

Demo Time

Page 19: Mashing the data

In case the demo does not work

Page 20: Mashing the data

Thank you!