Page 1: Breaking the-database-type-barrier-replicating-across-different-dbms

© Continuent 2010

Linas Virbalas Continuent, Inc.

Page 2: Breaking the-database-type-barrier-replicating-across-different-dbms

/  Definition & Motivation
/  Scoping the Challenge
/  MySQL ->
   •  PostgreSQL
   •  Oracle
   •  MongoDB
/  Demo 1
/  PostgreSQL ->
   •  MySQL
/  Demo 2
/  Q&A

Page 4: Breaking the-database-type-barrier-replicating-across-different-dbms

Heterogeneous Replication

Replication between different types of DBMS

Page 5: Breaking the-database-type-barrier-replicating-across-different-dbms

1.  Real-time integration of data between different DBMS types
2.  Seamless migration out of one DBMS type to another
3.  Data warehousing (real-time) from different DBMS types
4.  Leveraging specific SQL power of other DBMS types

Page 6: Breaking the-database-type-barrier-replicating-across-different-dbms

/  Name: Linas Virbalas
/  Country: Lithuania
/  Implementing for Tungsten:
   •  MySQL -> PostgreSQL
   •  MySQL -> Greenplum
   •  MySQL -> Oracle
   •  PostgreSQL WAL
   •  PostgreSQL Streaming Replication
   •  PostgreSQL Logical Replication via Slony logs
/  Blog: http://flyingclusters.blogspot.com

Page 8: Breaking the-database-type-barrier-replicating-across-different-dbms

1.  MySQL -> …
   •  Replicating from MySQL to PostgreSQL/Greenplum, Oracle, MongoDB
2.  PostgreSQL -> …
   •  Replicating from PostgreSQL to MySQL

Page 9: Breaking the-database-type-barrier-replicating-across-different-dbms

With Tungsten Replicator

Page 10: Breaking the-database-type-barrier-replicating-across-different-dbms

/  Open Source GPL v2
/  Java
/  Interfaces to implement new:
   •  Extractors
   •  Filters
   •  Appliers
/  Multiple replication services per one process

Page 11: Breaking the-database-type-barrier-replicating-across-different-dbms

Technology: Replication Pipelines

Page 13: Breaking the-database-type-barrier-replicating-across-different-dbms

/  Statement Based Replication

/  Row Based Replication

Page 15: Breaking the-database-type-barrier-replicating-across-different-dbms

MySQL -> PostgreSQL pipeline (diagram)

Master Replicator: MySQL Extractor, Filters, Transaction History Log
Slave Replicator: Transaction History Log, Filters, PostgreSQL Applier

Page 16: Breaking the-database-type-barrier-replicating-across-different-dbms

/  Provisioning
/  Data Type Differences
/  Database vs. Schema
/  Default (Implicitly Defined) Schema Selection
/  SQL Dialect Differences
   •  Statement Replication vs. Row Replication
/  Character Sets and Binary Data
/  Old Versions of MySQL

Page 17: Breaking the-database-type-barrier-replicating-across-different-dbms

Provisioning

/  Harder way: Dump data explicitly

/  Easier way: Replicate a mysqldump backup

Replicator

Page 18: Breaking the-database-type-barrier-replicating-across-different-dbms

     MySQL                 PostgreSQL
!    TINYINT               SMALLINT
     SMALLINT              SMALLINT
     INTEGER               INTEGER
     BIGINT                BIGINT
!    CHAR(1)               CHAR(5) = {'true', 'false'}
     CHAR(x)               CHAR(x)
     VARCHAR(x)            VARCHAR(x)
     DATE                  DATE
     TIMESTAMP             TIMESTAMP
!    TEXT (diff. sizes)    TEXT
!    BLOB                  BYTEA

/  Note the type differences between MySQL and PG (rows marked with !)
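
The mapping above can be written down as a small lookup table. The following is a minimal Python sketch for illustration only, not Tungsten code: `MYSQL_TO_PG` and `map_column_type` are hypothetical names. It covers the TINYINT, TEXT and BLOB rows from the slide; expanding "TEXT (diff. sizes)" into TINYTEXT, MEDIUMTEXT and LONGTEXT is an assumption, and the CHAR(1) row is left out because the slide does not say when it applies.

```python
import re

# Type pairs taken from the slide. Types that are not listed pass through unchanged.
MYSQL_TO_PG = {
    "TINYINT": "SMALLINT",
    "TINYTEXT": "TEXT",    # assumption: "TEXT (diff. sizes)" means the sized MySQL TEXT variants
    "MEDIUMTEXT": "TEXT",
    "LONGTEXT": "TEXT",
    "BLOB": "BYTEA",
}


def map_column_type(mysql_type: str) -> str:
    """Return a PostgreSQL type for a MySQL column type such as 'VARCHAR(20)'."""
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*(\(.*\))?\s*", mysql_type)
    if match is None:
        return mysql_type  # not a simple type name, leave it alone
    base = match.group(1).upper()
    args = match.group(2) or ""
    if base in MYSQL_TO_PG:
        # Length/display arguments are dropped, e.g. TINYINT(1) -> SMALLINT.
        return MYSQL_TO_PG[base]
    return base + args
```

For example, `map_column_type("tinyint(1)")` gives `SMALLINT`, while `map_column_type("VARCHAR(20)")` is returned unchanged.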

Page 19: Breaking the-database-type-barrier-replicating-across-different-dbms

Database vs. Schema

/  In MySQL these are the same:
   CREATE DATABASE foo
   CREATE SCHEMA foo
/  In PostgreSQL these are very different:
   CREATE DATABASE foo
   CREATE SCHEMA foo
/  Tungsten uses filters to map MySQL databases to PostgreSQL schemas
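
As a rough illustration of such a rewrite, the sketch below turns a MySQL `CREATE DATABASE` statement into `CREATE SCHEMA` for the PostgreSQL side. It is a minimal Python sketch under simple assumptions (one statement per string, keyword at the start), and `database_to_schema` is a hypothetical helper, not a Tungsten filter.

```python
import re

# Match CREATE DATABASE only at the start of the statement, so string
# literals that happen to contain those words are left untouched.
_CREATE_DATABASE = re.compile(r"^\s*CREATE\s+DATABASE\b", re.IGNORECASE)


def database_to_schema(statement: str) -> str:
    """Rewrite 'CREATE DATABASE x' as 'CREATE SCHEMA x'; return other statements as-is."""
    return _CREATE_DATABASE.sub("CREATE SCHEMA", statement, count=1)
```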

Page 20: Breaking the-database-type-barrier-replicating-across-different-dbms

MySQL Implicit               MySQL Explicit
CREATE SCHEMA s;             CREATE SCHEMA s;
USE s;
CREATE TABLE t (i int);      CREATE TABLE s.t (i int);
INSERT INTO t VALUES (1);    INSERT INTO s.t VALUES (1);

/  MySQL: Trivial to use `USE`
/  MySQL: Going without `USE` generates different events
/  PG: Extract the default schema from the event
/  PG: Set it before applying

MySQL      PostgreSQL
USE s;  >  SET search_path TO s, "$user";
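
The `USE` translation above can be sketched in a few lines. This is an illustrative Python sketch, not the applier's actual code; `use_to_search_path` is a hypothetical name, and it assumes a plain, unquoted or backtick-quoted schema name.

```python
import re

# USE <name>; with an optional pair of MySQL backticks and an optional semicolon.
_USE = re.compile(r"^\s*USE\s+`?([A-Za-z_][A-Za-z0-9_]*)`?\s*;?\s*$", re.IGNORECASE)


def use_to_search_path(statement: str) -> str:
    """Translate MySQL 'USE s;' into the PostgreSQL search_path statement from the slide."""
    match = _USE.match(statement)
    if match is None:
        return statement  # not a USE statement, leave unchanged
    schema = match.group(1)
    return f'SET search_path TO {schema}, "$user";'
```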

Page 21: Breaking the-database-type-barrier-replicating-across-different-dbms

MySQL:       CREATE TABLE complex (id INTEGER AUTO_INCREMENT PRIMARY KEY, i INT);
PostgreSQL:  CREATE TABLE complex (id SERIAL PRIMARY KEY, i INT);

MySQL:       CREATE TABLE dt (i TINYINT);
PostgreSQL:  CREATE TABLE dt (i SMALLINT);
…

/  Differences between DDL and DML statement SQL dialects
/  Row Replication resolves issues arising from differences in DML, but still leaves DDL to handle
/  Tungsten Replicator Filters come to the rescue!
   •  Simple to develop Java or JavaScript extensions
   •  Event structure IN -> Filter -> Event structure OUT
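
The "event in, filter, event out" idea can be sketched as a chain of small functions. The code below is a Python illustration only (real Tungsten filters are written in Java or JavaScript); the `Event` class, the two filter functions and `run_filters` are hypothetical names, and the two rewrites are the DDL examples from this slide.

```python
import re
from dataclasses import dataclass, replace
from typing import Callable, Iterable


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a replicated event: a default schema plus one SQL statement."""
    schema: str
    sql: str


Filter = Callable[[Event], Event]


def auto_increment_to_serial(event: Event) -> Event:
    # MySQL "INTEGER AUTO_INCREMENT" becomes PostgreSQL "SERIAL".
    sql = re.sub(r"\bINT(?:EGER)?\s+AUTO_INCREMENT\b", "SERIAL", event.sql, flags=re.IGNORECASE)
    return replace(event, sql=sql)


def tinyint_to_smallint(event: Event) -> Event:
    # MySQL "TINYINT" (with or without a display width) becomes "SMALLINT".
    sql = re.sub(r"\bTINYINT\b(\s*\(\d+\))?", "SMALLINT", event.sql, flags=re.IGNORECASE)
    return replace(event, sql=sql)


def run_filters(event: Event, filters: Iterable[Filter]) -> Event:
    """Pass the event through each filter in order and return the final event."""
    for apply_filter in filters:
        event = apply_filter(event)
    return event
```

Each filter takes an event and returns a new one, so filters can be added, removed or reordered without touching the extractor or applier.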

Page 22: Breaking the-database-type-barrier-replicating-across-different-dbms

MySQL:       INSERT INTO embedded_blob (key, data) VALUES (1, '?\0^Es\0^\0\'')
PostgreSQL:  ARGH!!! (SQL statement fails)

MySQL:       create table xlate(id int, d1 varchar(25) character set latin1, d2 varchar(25) character set utf8);
PostgreSQL:  ARGH!!! (no way to translate to common charset)

/  Statement replication: MySQL syntax is "permissive"
/  Embedded binary / alternate charsets
/  Different charsets for different clients
/  Row replication: database/table/column charsets may differ
/  Answer: Stick with one character set throughout; use row replication to move binary data
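
To see why mixed character sets hurt, consider the same text stored in a latin1 column and in a utf8 column: the bytes differ, so a replicator has to know each column's charset before it can hand the value to the target. The following Python sketch only illustrates that point; `transcode` is a hypothetical helper and is not part of Tungsten.

```python
def transcode(raw: bytes, source_charset: str, target_charset: str = "utf-8") -> bytes:
    """Decode bytes using the column's charset, then re-encode them in the target charset."""
    return raw.decode(source_charset).encode(target_charset)


# The same four-character string, as it would be stored in two different columns.
latin1_value = "caf\u00e9".encode("latin-1")  # 4 bytes
utf8_value = "caf\u00e9".encode("utf-8")      # 5 bytes

# Interpreting UTF-8 bytes as latin1 silently produces the wrong text.
misread = utf8_value.decode("latin-1")
```

With a single character set throughout, no such per-column bookkeeping is needed, which is the point of the slide's answer.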

Page 23: Breaking the-database-type-barrier-replicating-across-different-dbms

MySQL Versions

/  Problem: Data stored on hard-to-replicate MySQL versions or configurations
   •  Row replication not enabled (5.1)
   •  No row replication support (5.0, 4.1)
   •  Tungsten cannot read binlog (4.1)
/  Answer: MySQL blackhole replication
   •  (Blackhole = no store, just a binlog)
   •  Caveat: Check MySQL docs carefully

Replicator

Page 25: Breaking the-database-type-barrier-replicating-across-different-dbms

MySQL -> Oracle pipeline (diagram)

Master Replicator: MySQL Extractor, Filters, Transaction History Log
Slave Replicator: Transaction History Log, Filters, Oracle Applier

Page 26: Breaking the-database-type-barrier-replicating-across-different-dbms

/  TEXT length limitation
   •  VARCHAR(4000) => CLOB
/  Primary Keys and PrimaryKeyFilter
   •  Goal:
      UPDATE t SET c1 = x1, c2 = x2, c3 = x3 WHERE p = p1
   •  NOT:
      UPDATE t SET c1 = x1, c2 = x2, c3 = x3 WHERE p = p1 AND c1 = x1 AND c2 = x2 AND c3 = x3 AND …
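
The goal above amounts to dropping every non-key column from the WHERE clause of a row update. The sketch below shows that idea in Python; it is not the implementation of Tungsten's PrimaryKeyFilter, the function names are hypothetical, and the `%s` placeholders assume a driver that uses that parameter style.

```python
from typing import Any, Dict, List, Tuple


def keep_primary_key(before_image: Dict[str, Any], primary_key: List[str]) -> Dict[str, Any]:
    """Keep only the primary key columns of the row's old values."""
    return {column: before_image[column] for column in primary_key}


def build_update(table: str, new_values: Dict[str, Any],
                 key_values: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build a parameterized UPDATE whose WHERE clause uses only the given key columns."""
    set_clause = ", ".join(f"{column} = %s" for column in new_values)
    where_clause = " AND ".join(f"{column} = %s" for column in key_values)
    sql = f"UPDATE {table} SET {set_clause} WHERE {where_clause}"
    params = list(new_values.values()) + list(key_values.values())
    return sql, params


# Old row values as a row-based event might carry them, with p as the primary key.
before = {"p": 1, "c1": "a", "c2": "b"}
key_only = keep_primary_key(before, ["p"])
statement, parameters = build_update("t", {"c1": "x1", "c2": "x2"}, key_only)
```

Passing the whole `before` dictionary instead of `key_only` would produce the long WHERE clause the slide warns against.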

Page 28: Breaking the-database-type-barrier-replicating-across-different-dbms

> use mydb
switched to db mydb
> db.test.insert( {"test": "test value", "anumber" : 5 } )
> db.test.find()
{ "_id" : ObjectId("4dce9a4f3d6e186ffccdd4bb"), "test" : "test value", "anumber" : 5 }
> exit

Page 29: Breaking the-database-type-barrier-replicating-across-different-dbms

/  MySQL binary log doesn't hold column names
   •  mysql> INSERT INTO foo (id, data) VALUES (1, 'hello from MySQL!');
   •  If nothing is done, this becomes:
      > db.foo.find();
      { "_id" : ObjectId("4dc55e45ad90a25b9b57909d"), " " : "1", " " : "hello from MySQL!" }
   •  Solution: fill in column names on the master side. Then:
      > db.foo.find();
      { "_id" : ObjectId("4dc55e45ad90a25b9b57909d"), " " : "1", " " : "hello from MySQL!" }
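
Once column names are available alongside the row values, building the document is a matter of pairing them up. The Python sketch below illustrates that step using the `foo (id, data)` table from the INSERT above; `row_to_document` is a hypothetical helper, not the MongoDB applier's code.

```python
from typing import Any, Dict, List


def row_to_document(column_names: List[str], values: List[Any]) -> Dict[str, Any]:
    """Pair each column name with its row value to form a document."""
    if len(column_names) != len(values):
        raise ValueError("column names and row values must have the same length")
    return dict(zip(column_names, values))


# Column names as they would be filled in on the master side for table foo.
document = row_to_document(["id", "data"], ["1", "hello from MySQL!"])
```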

Page 30: Breaking the-database-type-barrier-replicating-across-different-dbms

MySQL -> MongoDB: The Pipeline

Page 33: Breaking the-database-type-barrier-replicating-across-different-dbms

                                              Logical   Physical
MySQL Statement Based                            x
MySQL Row Based                                  x
MySQL Mixed                                      x
PostgreSQL WAL Shipping                                    x
PostgreSQL Streaming Replication                           x
Filters (data transformation) possible           +         -
Different data/structure on slave possible       +         -

/  A transaction is not accessible to the replicator under physical replication
/  Tungsten Replicator manages WAL/Streaming Replication

Page 34: Breaking the-database-type-barrier-replicating-across-different-dbms

                                                   Logical   Physical
MySQL Statement Based                                 x
MySQL Row Based                                       x
MySQL Mixed                                           x
PostgreSQL WAL Shipping                                         x
PostgreSQL Streaming Replication                                x
Tungsten Replicator w/ PostgreSQLSlonyExtractor       x
Filters (data transformation) possible                +         -
Different data/structure on slave possible            +         -

/  With PostgreSQLSlonyExtractor, the transaction goes through the Replicator pipeline

Page 35: Breaking the-database-type-barrier-replicating-across-different-dbms

PostgreSQL -> MySQL pipeline (diagram)

Master Replicator: PostgreSQL SlonyExtractor, Filters, Transaction History Log
Slave Replicator: Transaction History Log, Filters, MySQLApplier

Page 37: Breaking the-database-type-barrier-replicating-across-different-dbms

/  We've reviewed an open source heterogeneous replicator (professional services available upon request)
/  Tungsten Replicator encapsulates the complexity and corner cases of the subject
/  Replicating:
   •  out of MySQL – now;
   •  out of PostgreSQL – prototype;
   •  out of Oracle – designs ready, awaiting sponsorship.

Page 39: Breaking the-database-type-barrier-replicating-across-different-dbms

Open Source http://tungsten-replicator.org #tungsten @ irc.freenode.net

My Blog: http://flyingclusters.blogspot.com

Commercial [email protected]

Continuent Web Site: http://www.continuent.com
