MySQL High Availability Solutions
Lenz Grimmer <[email protected]>
http://lenzg.net/ | Twitter: @lenzgr
2011-01-26 | San Francisco MySQL Meetup | USA
MySQL High Availability Solutions
MySQL Storage Engines (InnoDB, MyISAM, PBXT...)
Disk Storage (XFS, ReiserFS, JFS, ext3...)
MySQL Client
Why HA?
● Something can and will fail
● Service Maintenance
● Downtime is expensive
● Adding HA to an existing system is complex
Elimination of the SPOF (single point of failure)
● Identify what will fail eventually
  ● Hard disks
  ● Fans
● Consider what might fail
  ● Application crashes
  ● OOM situations, kernel panics
  ● Network connections, cables
  ● Power supply
What is HA Clustering?
● Redundancy
● One system or service goes down → others take over
● IP address takeover, service takeover
● Ensuring data availability & integrity
● Not designed for high performance
High Availability Levels
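Availability levels are commonly quoted as a percentage of uptime (the "nines"). As a rough illustration that is not taken from the slide itself, the Python sketch below converts such a percentage into the downtime per year it permits. The function name and the 365-day year are assumptions made for this example.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the maximum downtime per year (in hours) for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the yearly downtime budget for a few common availability levels.
    for level in (99.0, 99.9, 99.99, 99.999):
        hours = allowed_downtime_hours(level)
        print(f"{level}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is why higher levels become disproportionately harder to reach.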
Rules of High Availability
● Prepare for failure
● Aim to ensure that no important data is lost
● Keep it simple, stupid (KISS)
● Complexity is the enemy of reliability
● Automate it
● Test your setup frequently!
HA Components
● Heartbeat
  ● Checks that the monitored services are alive
  ● Can check individual servers, software services, networking etc.
● HA Monitor
  ● Configuration of the services
  ● Ensures proper shutdown and startup
  ● Allows manual control
● Shared storage / Replication
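The heartbeat idea can be sketched in a few lines of Python: every node reports in periodically, and a node that stays silent for longer than a timeout is considered dead. This is a minimal illustration, not how Heartbeat or Pacemaker are implemented; the class names, the timeout value, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes that went silent."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so the example can run without sleeping
        self.last_seen: Dict[str, float] = {}

    def record_heartbeat(self, node: str) -> None:
        """Remember the time at which the node last reported in."""
        self.last_seen[node] = self.clock()

    def dead_nodes(self) -> List[str]:
        """Return the nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout_seconds)


class FakeClock:
    """Hypothetical manual clock used only to demonstrate the monitor."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


if __name__ == "__main__":
    clock = FakeClock()
    monitor = HeartbeatMonitor(timeout_seconds=3.0, clock=clock)
    monitor.record_heartbeat("db1")
    monitor.record_heartbeat("db2")
    clock.now = 2.0
    monitor.record_heartbeat("db1")   # db1 keeps reporting, db2 goes silent
    clock.now = 4.0
    print(monitor.dead_nodes())
```

A real HA monitor would react to a dead node by starting the takeover procedure rather than just listing it.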
Split-Brain
● Communications failures can lead to separated partitions of the cluster
● If those partitions each try to take control of the cluster, it's called a split-brain condition
● If this happens, both partitions may accept writes independently, and data can diverge or be corrupted
● Use Fencing or Moderation/Arbitration to avoid it
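One common form of arbitration is a quorum rule: a partition may take control only if it can see a strict majority of the cluster. The following Python sketch shows the rule under that assumption; the function name is hypothetical and real cluster managers combine this with fencing.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True if a partition sees a strict majority of the cluster."""
    if cluster_size <= 0 or not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("invalid node counts")
    # Strict majority: two disjoint partitions can never both satisfy this.
    return 2 * reachable_nodes > cluster_size


if __name__ == "__main__":
    print(has_quorum(3, 5))  # majority side of a 3/2 split
    print(has_quorum(2, 5))  # minority side of the same split
    print(has_quorum(1, 2))  # either half of a two-node cluster
```

In a two-node cluster neither half can reach a majority after a communication failure, which is why such setups need fencing or an external arbitrator.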
Redundancy Using MySQL Replication
MySQL Replication
MySQL Replication
● Unidirectional
● Statement-based or row-based (row-based since MySQL 5.1)
● Built into MySQL
● Easy to use and set up
● One Master, many Slaves
● Asynchronous – Slaves can lag behind
● New in MySQL 5.5: Semi-synchronous replication
MySQL Replication (2)
● Master maintains Binary logs & index
● Replication on Slave is single-threaded
Statement-based Replication
● Pro
  ✔ Proven (around since MySQL 3.23)
  ✔ Smaller log files
  ✔ Auditing of actual SQL statements
  ✔ No primary key requirement for replicated tables
● Con
  ✗ Non-deterministic functions and UDFs
  ✗ LOAD_FILE(), UUID(), CURRENT_USER(), FOUND_ROWS() (but RAND() and NOW() work)
Row-based Replication
● Pro
  ✔ All changes can be replicated
  ✔ Similar technology used by other RDBMSes
  ✔ Fewer locks required for some INSERT, UPDATE or DELETE statements
● Con
  ✗ More data to be logged
  ✗ Log file size increases (backup/restore implications)
  ✗ Replicated tables require explicit primary keys
  ✗ Possible different result sets on bulk INSERTs
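The difference between the two formats for non-deterministic functions can be illustrated with a small Python simulation. It is a toy model only: `run_statement` stands in for a statement such as an INSERT that calls UUID(), the tables are plain lists, and all function names are assumptions made for this example.

```python
import uuid
from typing import List


def run_statement(table: List[str]) -> None:
    """Stand-in for an INSERT that calls UUID(): the value is generated locally."""
    table.append(str(uuid.uuid4()))


def replicate_statement_based(master: List[str], slave: List[str]) -> None:
    """The slave re-executes the statement text, so it generates its own value."""
    run_statement(master)
    run_statement(slave)


def replicate_row_based(master: List[str], slave: List[str]) -> None:
    """The slave applies the row the master produced instead of re-running it."""
    run_statement(master)
    slave.append(master[-1])


if __name__ == "__main__":
    m1, s1 = [], []
    replicate_statement_based(m1, s1)
    print("statement-based identical:", m1 == s1)

    m2, s2 = [], []
    replicate_row_based(m2, s2)
    print("row-based identical:", m2 == s2)
```

The row-based variant ships more data (the row itself rather than the statement), which matches the larger log size listed among its drawbacks.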
Replication Topologies
Master > Slave
Masters > Slave (Multi-Source)
Master < > Master (Multi-Master)
Master > Slaves
Ring (Multi-Master)
Master > Slave > Slaves
Master-Master Replication
● Two nodes are both master and slave to each other
● Useful for easier failover
● Not suitable for (write) load-balancing
● Don't write to both masters simultaneously!
● Use Sharding or Partitioning instead (e.g. MySQL Proxy)
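Why simultaneous writes to both masters are dangerous can be shown with a minimal Python sketch. It assumes asynchronous replication, so each master applies its own write first and the peer's write afterwards; the function names and the single-value "row" are simplifications made for this example.

```python
from typing import List, Tuple


def apply_events(initial: int, events: List[int]) -> int:
    """Apply a sequence of 'SET x = value' events in order; return the final x."""
    value = initial
    for new_value in events:
        value = new_value
    return value


def simulate_dual_writes(write_on_a: int, write_on_b: int,
                         initial: int = 0) -> Tuple[int, int]:
    """Both masters update the same row at the same time."""
    # Each master applies its local write first, then the one replicated from its peer.
    final_a = apply_events(initial, [write_on_a, write_on_b])
    final_b = apply_events(initial, [write_on_b, write_on_a])
    return final_a, final_b


if __name__ == "__main__":
    a, b = simulate_dual_writes(write_on_a=10, write_on_b=20)
    print(f"master A ends with x={a}, master B ends with x={b}")
```

Both nodes report successful replication, yet they end up with different data, and nothing in the replication stream detects or resolves the conflict.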
MySQL Replication as a HA Solution
● What happens if the Master fails?
● What happens if the Slave fails?
● This doesn’t sound like High Availability!
● Yes!
● Replication is only part of a HA configuration
Pacemaker (Linux-HA)
● Supports 2 or more Nodes (v2)
● Resource monitoring (Apps and HW)
● Active fencing mechanism (STONITH)
● Node failure detection in seconds
● Supports many applications (incl. MySQL)
● http://clusterlabs.org/
● http://www.clusterlabs.org/wiki/Load_Balanced_MySQL_Replicated_Cluster
Replication & HA
● Combined with Pacemaker
● Virtual IP takeover
● Slave gets promoted to Master
● Side benefits: load balancing & backup
● Can be tricky to fail back
● No automatic conflict resolution
● Proper failover needs to be scripted
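Part of a scripted failover is deciding which slave to promote. A reasonable rule, sketched below in Python, is to pick the reachable slave that has applied the most of the master's binary log. The `SlaveStatus` record and its field names are hypothetical; they are loosely modeled on the log coordinates that SHOW SLAVE STATUS reports, and a real script would also have to repoint the remaining slaves and move the virtual IP.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SlaveStatus:
    """Hypothetical snapshot of how far a slave has applied the master's binlog."""
    name: str
    master_log_file: str   # e.g. "mysql-bin.000042"
    master_log_pos: int    # position within that file
    reachable: bool = True


def binlog_sequence(log_file: str) -> int:
    """Extract the numeric suffix of a binlog file name for ordering."""
    return int(log_file.rsplit(".", 1)[1])


def pick_promotion_candidate(slaves: List[SlaveStatus]) -> Optional[SlaveStatus]:
    """Return the reachable slave that is furthest ahead, or None if there is none."""
    candidates = [s for s in slaves if s.reachable]
    if not candidates:
        return None
    return max(candidates,
               key=lambda s: (binlog_sequence(s.master_log_file), s.master_log_pos))


if __name__ == "__main__":
    slaves = [
        SlaveStatus("slave1", "mysql-bin.000042", 100),
        SlaveStatus("slave2", "mysql-bin.000042", 900),
        SlaveStatus("slave3", "mysql-bin.000043", 10, reachable=False),
    ]
    best = pick_promotion_candidate(slaves)
    print(best.name if best else "no candidate")
```

Because replication is asynchronous, even the most advanced slave may be missing the master's last transactions, which is one reason failing back can be tricky.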
Redundancy With Disk Replication
Disk Replication
DRBD
● Distributed Replicated Block Device
● “RAID-1 over network”
● Synchronous/asynchronous block replication
● Automatic resynchronisation
● Application-agnostic
● Can mask local I/O errors
● Active/passive configuration
● Dual-primary mode (requires cluster file system)
DRBD in Detail
● DRBD replicates data blocks between two block devices
● DRBD can be combined with Linux-HA and other HA solutions
● MySQL runs normally on primary node
● MySQL is not active on the secondary node
● DRBD is Linux only
[Diagram labels: Applications, Virtual IP, Active Node, Passive Node, DRBD]
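The difference between synchronous and asynchronous block replication can be modeled with a toy Python class. This is only a conceptual sketch of the "RAID-1 over network" idea, not DRBD's actual protocol; the class, its methods, and the in-memory dictionaries standing in for disks are assumptions made for this example.

```python
from typing import Dict, List, Tuple


class ReplicatedBlockDevice:
    """Toy model of a block device mirrored to a secondary node."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.primary: Dict[int, bytes] = {}
        self.secondary: Dict[int, bytes] = {}
        self.pending: List[Tuple[int, bytes]] = []  # blocks not yet on the secondary

    def write(self, block_no: int, data: bytes) -> None:
        """Write a block locally and replicate it according to the mode."""
        self.primary[block_no] = data
        if self.synchronous:
            # The write only completes once the peer has the block as well.
            self.secondary[block_no] = data
        else:
            # The write completes immediately; the block is shipped later.
            self.pending.append((block_no, data))

    def flush(self) -> None:
        """Ship all queued blocks to the secondary (asynchronous mode)."""
        for block_no, data in self.pending:
            self.secondary[block_no] = data
        self.pending.clear()

    def in_sync(self) -> bool:
        """Return True if both nodes hold identical blocks."""
        return self.primary == self.secondary


if __name__ == "__main__":
    sync_dev = ReplicatedBlockDevice(synchronous=True)
    sync_dev.write(0, b"data")
    print("synchronous in sync:", sync_dev.in_sync())

    async_dev = ReplicatedBlockDevice(synchronous=False)
    async_dev.write(0, b"data")
    print("asynchronous in sync before flush:", async_dev.in_sync())
```

If the primary fails while blocks are still pending, the asynchronous secondary is missing those writes, whereas the synchronous one pays for its consistency with network latency on every write.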
Redundancy Using Shared Storage
Replication vs. SAN
● Data Consistency / Integrity
● Synchronous vs. asynchronous
● SAN can become the SPOF
● Cold caches
● “Split-brain” situations
● SAN/NAS I/O Overhead
Redundancy with MySQL Cluster
MySQL Cluster
● “Shared-nothing” architecture
● Automatic partitioning
● Distributed fragments
● Synchronous replication
● Fast, automatic fail-over of data nodes
● Automatic resynchronisation
● Transparent to MySQL applications
● Supports transactions
● http://mysql.com/products/database/cluster/
MySQL Cluster
● In-memory indexes
● Not suitable for all query patterns (cross-table JOINs, range scans)
● No support for foreign keys
● Not suitable for long running transactions
● Network latency crucial
● Can be combined with MySQL replication (RBR)
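Automatic partitioning into distributed fragments can be pictured as hashing each row's primary key to a fragment number. The Python sketch below illustrates that idea only: CRC32 is a stand-in and not the hash function MySQL Cluster actually uses, and the function names and fragment count are assumptions made for this example.

```python
import zlib
from typing import Dict, Iterable, List


def fragment_for_key(primary_key: str, fragment_count: int) -> int:
    """Map a primary key to a fragment number (CRC32 is only a stand-in hash)."""
    if fragment_count <= 0:
        raise ValueError("fragment_count must be positive")
    return zlib.crc32(primary_key.encode("utf-8")) % fragment_count


def place_rows(keys: Iterable[str], fragment_count: int) -> Dict[int, List[str]]:
    """Distribute rows across fragments by hashing their primary keys."""
    fragments: Dict[int, List[str]] = {i: [] for i in range(fragment_count)}
    for key in keys:
        fragments[fragment_for_key(key, fragment_count)].append(key)
    return fragments


if __name__ == "__main__":
    layout = place_rows((f"user{i}" for i in range(20)), fragment_count=4)
    for fragment, rows in layout.items():
        print(fragment, rows)
```

A lookup by primary key touches exactly one fragment, while a range scan or a cross-table JOIN may have to visit many of them over the network, which fits the query patterns listed as unsuitable.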
MySQL Cluster & Replication
● MySQL Cluster
  ● Easy failover from one MySQL node to another
  ● Scaling write load using multiple SQL nodes
● Asynchronous replication from Cluster to regular MySQL slaves
  ● Slaves take read load (InnoDB/MyISAM)
  ● Quick setup of new slaves (Cluster Online
Galera Replication
● Patch for InnoDB plus external library
● Synchronous replication
● Single- or multi-master
● Multicast replication
● HA plus load sharing possible
● Certification-based replication (instead of 2PC)
● http://codership.com/products/mysql_galera
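The idea behind certification-based replication can be sketched as follows: write sets are delivered to every node in the same global order, and each node independently runs the same deterministic check, so all nodes reach the same commit or abort decision without a two-phase commit round. The Python class below is a simplified first-committer-wins model written for illustration; it is not Galera's implementation, and the class, field, and key names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Certifier:
    """Toy certification check over a totally ordered stream of write sets."""
    commit_seqno: int = 0
    # Row key -> sequence number of the last committed transaction that wrote it.
    last_writer: Dict[str, int] = field(default_factory=dict)

    def certify(self, snapshot_seqno: int, write_set: Set[str]) -> bool:
        """Commit the write set unless it conflicts with a later commit."""
        # Conflict: a row was changed by a transaction committed after the
        # snapshot this transaction was based on.
        for key in write_set:
            if self.last_writer.get(key, 0) > snapshot_seqno:
                return False
        self.commit_seqno += 1
        for key in write_set:
            self.last_writer[key] = self.commit_seqno
        return True


if __name__ == "__main__":
    node = Certifier()
    print(node.certify(snapshot_seqno=0, write_set={"row1"}))          # commits
    print(node.certify(snapshot_seqno=0, write_set={"row1", "row2"}))  # conflicts on row1
    print(node.certify(snapshot_seqno=0, write_set={"row3"}))          # no overlap, commits
```

Since every node feeds the same ordered stream into the same check, a conflicting transaction is rejected everywhere, which is what makes multi-master writes workable in this model.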
Solaris Cluster / OpenHA Cluster
● Provides failover and scalability services
● Solaris / OpenSolaris (Project Colorado)
● Kernel-level components plus userland
● Agents monitor applications
● Geo Edition to provide Disaster Recovery using Replication
● Open Source since June 2007
● http://hub.opensolaris.org/bin/view/Community+Group+ha-clusters/WebHome