Top Banner
IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th , 2006 Database Replication and Change Propagation Technologies for Continuous Availability
20

IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

Apr 01, 2015

Download

Documents

Sawyer Haskett
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group

®

Serge Bourbonnais

Database ReplicationSilicon Valley laboratory

WMO TECO-WIS Convention

Seoul, November 8th, 2006

Database Replication and Change Propagation Technologies for Continuous Availability

Page 2: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Abstract - IBM Database Replication Technologies• Database replication technologies allow an IT infrastructure to achieve continuous

availability of the enterprise operations, by providing solutions for Disaster Recovery, Workload Isolation, and Information Integration.

• When used for Disaster Recovery, each replicated database can be fully active, and copies do not need to be identical. Some trade-offs include administrative costs, and the overhead of capturing and applying changes.

• For Workload Isolation, the replication process can manage conflicts that may arise from the application workload, database constraints, loading a target while changes are still occurring at the source, or changes arriving out of order in a multi-node configuration. Conflict resolution either relies on timestamp and the origin of each change, or on a designated master. Configurations for data distribution and consolidation to/from hundreds of databases can also be deployed.

• For Information Integration, the replication process deals with heterogeneous data schemas, data stores, or even data models.

IBM Database replication technologies can capture and propagate changes with low-latency at high-throughput over long distances, while preserving database transactional integrity, and tolerating system outages or intermittent connectivity.

Page 3: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Agenda

Database Replication Technologies for continuous availability In support of the Global Enterprise: From Continuous Availability to

Business Integration

Where Database Replication fits

Where replication does not fit IBM Product Architecture and Capabilities

Capture, Apply, Federation, and Transforms

Topologies, Conflict Detection and Resolution Sample Implementations

Page 4: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Why Replication in an Information System?

1. Disaster Recovery Goal: High-Availability

Applications: Standby copy for failover, Scheduled and Unscheduled outages

Requirements: Minimize recovery time and eliminate or reduce data loss. Preserve transactional consistency.

2. Workload Isolation Goal: High-Availability, Improve Performance

Applications: Data Distribution/Consolidation, Regional Data Centers, Caches

Requirements: Maintain live copies or subsets for working in disconnected mode, often geographically distributed. Need to detect and resolve conflicts, if any. Data mappings and transformations.

3. Information Integration Goal: High-Availability, Improve Performance, Global Enterprise View

Applications: Analytics, Enterprise Business Integration

Requirements: Moving data to/from heterogeneous data stores. Cleansing and transformations. Assembling objects with data from several sources.

Requirements

More

RequirementsLess

Page 5: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Relational tablesRelational tables Business ObjectsBusiness Objects

<order><oid>197</oid> <pid>AS207</pid> <desc>Wheel</desc> <qty>1</qty></order>

DatabaseDatabase

Log Shipping•HADR (LUW)Disk Mirroring•GDPS•PPRC

Logical Replication•Data Propagator•Q Replication

Event Publishing•Q Event Publish•II FederationIntegration Software•DataStage

Maintain a full database copy for Disaster Recovery

Maintain a logicaldatabase subset for Disaster Recovery,Workload Isolation

Publish changes (with transformations) forDisaster Recovery,Workload Isolation,Data Integration

Technologies and Products

Propagated Objects

Requirementsand Scenarios

From Disaster Recovery to Information Integration: Replication needs are cumulating - Semantics are increasing.

Database Replication Application Space

Page 6: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Application Space for Database Replication Technologies

Database Replication is a good fit when: Asynchronous Capture and Delivery

Outages. Network, servers, a site, the RDBMS.Occasionally connected

Non-identical sources and targets

Different platforms. OS, RDBMS, even data modelsDifferent shapes. Sub-setting requiredRow-level transformations., Codepages, Schemas

Update-Anywhere with possible conflicts

Only possible with replication When some data loss is tolerable in case of a major disaster.

Often, solution can be designed to limit loss to a few seconds. Fast delivery over large distances (1000s km)

Several 10 000s rows/second achievable (up to 100 000row/sec) Avoid or minimize full-refresh of data at the target

Other factors: Minimize down-time, administrative cost, application performance impact

Replication Technologies guarantee Transactional Consistency with Resilience

Page 7: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Limits of Database Replication technologies

Non-zero Data Loss required in case of Disaster (fire, flood) Use Synchronous technologies instead, i.e., HADR, PPRC

Set-level .transformations are required on the data Use ETL software instead

However, replication can be used to feed a staging area for ETL tools. Replication can hide the differences between the target and the source (database schema, data model, codepage, hardware architecture) and provide a continuous, asynchronous feed.

Business Objects need to be assembled Develop applications in the application layer

Other factors Cost-Benefit analysis of the solutions, given the requirements

Page 8: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

DB2

IBM SQL Replication

Source server

Non-DB2Non-DB2

Triggers

Staging Tables

Apply DB2 Information Integrator

•Informix•Oracle•Sybase•SQL Server•Teradata

database recoverylog

Capture SQL

Apply

DB2

Apply

DB2

DB2

Apply•z/OS•iSeries•UDB LUW

Staging is in relational tables Control, and Monitoring information also in relational tables Transport is over a database connection

Target servers

Control tables •Informix

•Oracle•Sybase•SQL Server

Page 9: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

A parenthesis: Database Federation

Nicknames appear as local tables. For example: > db2 list tablesTable/View Schema Type------------------------------- --------------- -----T1 BOURBON TORAT3 BOURBON NCUSTOMERS BOURBON T

T1 is a Table; ORAT3 a Nickname

Remote objects (structured files, tables, spreadsheets) appear to the application as if local tables in a DB2 database

Local and non-local data can be manipulated in the same SQL statement

CREATE NICKNAME ORAT3 FOR ORACLE9.SCOTT.T3

INSERT INTO ORAT3 VALUES(5)

SELECT * FROM ORAT3

Page 10: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

ApplyApply

DB2

DB2

IBM Q Replication

Source server

database recoverylog

Capture

Staging and Transport over MQSeries persistent message queues High-throughput, Low-latency. Apply with parallel agents

Target servers

DB2

Apply•z/OS•UDB LUWApplyApply

Control tables

•z/OS•UDB LUW•VM/VSE

WebSphere Queue Manager (or client)

Admin queue

Restart queue

Send queues

Control Tables

Non-DB2

Apply DB2 Information Integrator

•Informix•Oracle•Sybase•SQL Server•Teradata

ApplyApplyControl Tables

Page 11: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Performance

Q Replication is between 3 to 10 times faster than SQL replication

Higher throughput and shorter latency

Capture measured throughput: 49000rows/second (V9.1)

Latency less than 2 seconds achievable over 1000s of kilometers

Measured Time to clear up receive queues after an outage 1,000,000 rows accumulated in target receive queue

Continuous arrival rate: 5,000 rows per second

Time to re-sync target database = 91 seconds

(1) Turbo Freeway (2064-216) 2 LPARs 4CP for the source system and 4cp for the target system.

Page 12: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Q Replication subscriptions - defining target copies Projection over columns and rows of a table:

Only changes for subscribed tables are sent

Some transactions can be ignored (e.g., by owner ID, trans ID, with signal or command)

Some operations can be ignored (e.g., delete)

Filter rows with a predicate (e.g., WHERE :LOCATION ='EAST' AND :SALES > (SELECT SUM(expense) FROM STORES WHERE stores.deptno = :DEPTNO)

Database Schema mappings examples: 1 column to N columns, e.g., [ :C1 || :C2]

N columns 1 columns, e.g., [substr(:C2,2,3)]

Generated columns, e.g., [CURRENT TIMESTAMP]

Capture side: Apply side ORDERS

oid priceIBMORDERS

ibmID price ts

Replication handles codepage conversion, architecture difference

Page 13: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Q Replication Subscription Types

Primary Secondary/backup

Unidirectional Changes are replicated in one direction

1:N – N:1 topologies – Distribution and Consolidation

Changes can be filtered and transformed Bidirectional – master/slave

Changes replicated in both directions

Conflicts detected on data values:

• Conflict rules: Check key, changed only, or all columns One server designated as winner

• Conflict action: Force, ignore, merge change Tree topologies only

Minimum overhead Peer to peer – no master, use timestamps

Conflicts resolved by using most recent version, no master copy - Handles out of order arrivals (e.g., delete before insert)

Requires extra columns and triggers

Source

Target(s)

Page 14: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Data Distribution from a (CCD) staging area

Read/write

Source Table

Read-only

CCD Table

Target Table

Q Replication

SQL Apply

… Target Table

Target Table

MQSeries

SQL Connections

Q ApplyQ Capture

SQL Apply

SQL Apply

Page 15: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Consistent Changed Data (CCD) Apply targets

Usages: AUDIT trail of database changes.

• Answer: Who changed what, when, and how? Staging table for data distribution (with SQL Apply)

•For updates, before values can be optionally present in the CCD (e.g., XPARTNO)•Condensed CCD: Contains only the latest changed value of each row•Complete CCD: Initially created with values for all rows from the source table.

5.03

AUTHID OPERATIONCOMMITSEQ LOGMARKER XPARTNO PARTNO XPRICE PRICE

1 USER_A Ucurrent

timestamp A7571 A7571 4.31

2 USER_B Icurrent

timestamp null A7981 null 121.03

PARTS_CCD

3 USER_A Dcurrent

timestamp null A7981 null null

Page 16: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Usage Building the Data Warehouse Business Integration Auditing requirements

Function Capture changed data in real time Correlate by transactions within a single database Output: XML or CSV

Log-basedcapture

User Application

DB2 z/OS and LUW

VSAM

Capture WebSphere MQWebSphere MQ Integrator Broker

Target DBs

DataStage

WebSphere Business Integration

JMS-aware Application

Event Publishing

Software AGAdabas

IMS

CA IDMS

Page 17: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group

17

Mazda

Support 700 dealers in USA

Trouble matching customer demand with available inventory

More current data needed to track sales achievements with period-end goals

Sales and inventory information is replicated every minute to portal server

Improved access to current data without changes to existing IT infrastructure

Challenge

Solution

Business benefits Increased auto sales Improved dealer satisfaction Currency of information improved by

93%

Technology benefits Re-used existing application and data

base infrastructure Decreased network load compared to

full data refreshes 4 times an hour Ease and speed of deployment

“Within 5 weeks of receiving the [WebSphere] Information Integrator product we were able to implement it in our … environments. It now provides us up to the minute sales activity.”

Joe Neria, Software Consultant. Mazda

Page 18: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group

18

International provider of financial & investment services

Corporate initiative to provide customers better performing real-time queries by utilizing multiple sites.

Replication of critical order processing details for core business functionality

Q Replication for high speed movement of up to 10 Million transactions to secondary site several thousand miles away. Current implementation is Uni-Directional with peer-to-peer plans.

Challenge

Solution

Business benefits Replicating 5-10 Million

transactions with less than 2 seconds latency.

More efficient and cost-effective resource utilization

Secondary platform services reporting and business intelligence queries and acts as backup to primary

Technology benefits Real-time back up of secondary

system provides results in increased capacity for peak workloads.

Page 19: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group

19

CitiStreet

Support single sign-on access through both Web and IVR applications ensuring 24x7 portal access for plan participants and sponsors

Support redundant, active single sign-on applications for failover processing replicating profile changes between them in real time.

Challenge

Solution

Overview• CitiStreet is one of the largest and most

experienced global benefits providers servicing over 9 million plan participants across all markets. CitiStreet was formed in partnership between subsidiaries of State Street Corporation and Citigroup

Business benefits Ensure application availability for plan

participants and sponsors The new solutions from IBM will

improve data integrity with a reduced level of maintenance

Technology benefits Maintain bi-directional synchronization

of profile updates (approx 175,000 updates daily) in real time

“Since nearly 10 million of CitiStreet customers are offered 24-hour access to their retirement accounts, the company can't afford downtime and must be able to replicate data changes when they happen. We fully replicate our database over redundancy data lines, so to us the stability and speed of that asynchronous replication is strategic for us." Barry Strasnick , CIO

CitiStreet

Page 20: IBM Software Group ® Serge Bourbonnais Database Replication Silicon Valley laboratory WMO TECO-WIS Convention Seoul, November 8 th, 2006 Database Replication.

IBM Software Group | DB2 Information Management Software

Summary

IBM develops Data Propagation technologies to provide Continuous Availability and achieve a Global Integrated view of the enterprise in an heterogeneous environment

Q Replication (IBM WebSphere Replication Server) delivers low latency, high throughput, and resilience. It is best-of-breed for heavy OLTP workloads, providing resilience and preserving transactional integrity throughout outages while minimizing the need for full data refreshes.