Data Migration Strategy for AFP Reengineering Project

Version 1.0

TCS Confidential


ABOUT THIS DOCUMENT

Purpose

The purpose of this document is to lay out the structure for data migration for an application reengineering project.

Intended Audience

This document is primarily for the use of consultants associated with data migration projects.

Glossary

TCS - Tata Consultancy Services


Contents

1 Introduction
  1.1 Background
  1.2 Scope
  1.3 Assumptions
  1.4 Open Items
  1.5 System Description
    1.5.1 Source System Description
    1.5.2 Target System Description
2 Migration Approach
  2.1 Introduction
  2.2 Planning
  2.3 Analysis
    2.3.1 Analysis of Source Inventory
    2.3.2 Source Data Analysis
    2.3.3 Data Cleansing
    2.3.4 Extraction Programs
    2.3.5 Analysis of Target Database
  2.4 Strategy Definition
    2.4.1 Proof of Concept
  2.5 Design
    2.5.1 Mapping Rules
    2.5.2 Data Format – Source to Text File
    2.5.3 Non-key Source Fields Becoming Key Fields in Target
    2.5.4 Date and Time Stamp / Load Date Fields and User ID
  2.6 Construction
    2.6.1 Data Migration Approach
    2.6.2 Source System (VSAM / DB2) to Staging Database (Oracle)
    2.6.3 Staging Database (Oracle) to Target Database (Oracle)
    2.6.4 Cleansing
    2.6.5 Audit Trail Data, Summary Data
    2.6.6 Reports
    2.6.7 Special Requirements
  2.7 Testing
    2.7.1 Validation
    2.7.2 Audit
    2.7.3 Testing Lifecycle
  2.8 Pre-Implementation (Dry Runs)
  2.9 Implementation
    2.9.1 Cutover Considerations
    2.9.2 Change Control
      Scope of Change Control
    2.9.3 Traceability
    2.9.4 Backup and Recovery
3 Risks
4 Guidelines
5 Recommendation
6 Responsibility Matrix


1 INTRODUCTION

1.1 Background

ING has initiated a program to replace the existing Pension Fund Management applications running on mainframe systems with a J2EE application. This project will replace these legacy systems with more flexible systems built on up-to-date technology platforms and functionality.

As part of the replacement, the data from the existing mainframe applications has to be moved to the target Oracle database. ING has invited Tata Consultancy Services (TCS) Limited to prepare the data migration strategy document. This document details the various steps in the life cycle of the data migration project that will move the legacy data to the target Oracle database.

1.2 Scope

The scope of this document is to define the strategy for the various phases of data migration. The phases in this data migration project are as follows.

Preparation Stage

o Planning

o Analysis

o Strategy Definition

o Design

o Construction

o Testing

Implementation Stage

o Pre-Implementation/Dry Runs

o Implementation/Production data migration

This document also addresses:

Tools

Cutover Considerations

Proofs of Concept

Guidelines

Special Requirements


Change Control and Traceability

Challenges and Risks

Roadmap

1.3 Assumptions

The target data model will be developed iteration-wise and so may undergo several changes; source data analysis therefore has to be done against the evolving target data model. Once the target data model is baselined, unmapped source fields will be analyzed further to confirm whether they can actually be ignored.

ING will define the strategy and perform the analysis, design and construction of scripts for data cleansing. TCS will support and complement this.

The production cutover window for implementation is expected to be 48 hours over a weekend. This could change based on record volumes and the relationships between tables, which define the order of migration.

The source inventory and corresponding data are based on the assumption that the go-live date will be on a weekend that does not fall on a month-end.

The current strategy is to extract the data from the mainframe source using Informatica PowerExchange and to use Informatica PowerCenter to transform and load the target Oracle database.

Existing master data will not be updated during the migration window.

Data to be migrated is frozen before the start of the migration.

No application accessing the data will hold an explicit lock on the data to be migrated during the outage window.

The current existing model is baselined and assumed to be 100% complete.

The scope of the data migration project is to migrate only the data that will be accessed by the target application system.

ING will provide the list of concurrent activities during the outage window. Their impact will be studied and the outage window size decided.


1.4 Open Items

The need for migrating the historic and backup data on tapes that will not be accessed by the target application, the corresponding target tables, and the strategy for this migration will be analyzed by ING, then discussed and finalized. ING and TCS will jointly discuss and resolve the extra effort involved and the impact on the plan.

A possible solution is a one-time migration, either through a regular interface or using scripts, followed by incremental migration through a regular interface.

The scope of migrating the data present on tapes that are rarely used by the application needs to be finalized. The feasibility of the target application system accessing the same tapes needs to be studied.

Risk analysis, implementation details, rollback strategy and exception handling are yet to be finalized.

The migration strategy for backup data whose layout differs from the current source layout is yet to be finalized.

1.5 System Description

The scope of the data migration project is to migrate the data from the existing mainframe system to an Oracle database. The system architecture related to these systems is:

1.5.1 Source System Description

Sl  System         Operating System  Software Platform    Database
1   IBM Mainframe  OS/390            COBOL, VSAM, CICS    DB2

1.5.2 Target System Description

Sl  System  Operating System  Software Platform  Database
1           UNIX              Java/J2EE          Oracle


2 Migration Approach

2.1 Introduction

Data migration is the process by which data is moved from source databases to target databases. Currently the source data resides in VSAM files, flat files and DB2 tables on the mainframe. This data needs to be moved to target databases in Oracle. The various phases involved in this endeavor are described below.

Preparation Stage

o Planning

o Analysis

o Strategy Definition

o Design

o Construction

o Testing

Implementation Stage

o Pre-Implementation/Dry Runs

o Implementation/Production data migration

The preparation stage will be used to develop the data migration strategy and the data migration programs, which will be tested in a non-production environment. All the factors that influence the implementation stage, such as business requirements, data volumes and infrastructure constraints, should be taken into account in the preparation stage. This stage is vital to the success of any data migration program. It will be done in seven iterations, synchronized with the iterations of the ING Core AFP project.

The actual execution of the data migration programs on the production data will be done in the implementation stage. Implementation is planned in two phases. Each implementation will be preceded by a pre-implementation or dry run to test the data migration scripts with production data in a simulated test environment.


2.2 Planning

All planning activities required for data migration will be done in this phase. Other activities taken up in this phase are the finalization of the source inventory, the creation of standards, the strategy for data analysis, cleansing and implementation, and the selection of tools.

Assumptions

Project Plan is available

Activities

SL  Category            Task                                                                                        Schedule (Week-Day)
1   Planning            Conduct kick-off meeting for the phase
2   Planning            Prepare detailed plan for the strategy documentation phase
3   Planning            Prepare detailed plan for the iterations
4   Planning            Consolidate source inventory
5   Planning            Creation of standards
6   Planning            Identify and evaluate tools for data migration
7   Planning            Set up environment for next phase
-   Planning            Identify candidates for Proof of Concept (POC)
14  Documentation       Document results of proof of concept (POC) for identified candidates
15  Tools               Finalize the list of tools & environment setup definitions
16  Environment         Identify development/testing environment
18  Configuration Data  Identify, document and obtain approval for the configuration and reference data requirements
26  Acceptance          Define Acceptance Criteria

Deliverables

Updated Project Plan
Source Inventory list
Inventory list for POC

Tools

The tools required for the various phases of data migration were identified during the POC; the list is given below.

Sl  Process          Sub-process       Tools
1   Extraction       VSAM              Informatica PowerExchange
                     DB2               Informatica PowerExchange
2   File Comparison                    DFSORT, COBOL
3   Transformation                     Informatica PowerCenter, COBOL
4   Loading                            Informatica PowerCenter Source Analyzer and Warehouse Designer
5   Cleansing        Pre-Extraction    << ING >>
                     Extraction        << ING >>
                     Transformation    << ING / TCS >>
                     Target Database   << ING >>
6   Data Analysis                      Manual / SQL / Excel
7   Audit                              Informatica
8   Validation                         Informatica Reports
9   Reporting                          Informatica
10  Scheduling                         Informatica PowerCenter Workflow Manager


2.3 Analysis

Detailed analysis of the source and target databases will be carried out in this phase. Data analysis will be carried out to understand the contents of the source data, and the findings will be documented. Data cleansing requirements are documented, and the criteria for extraction, audit and validation of source data are agreed upon.

2.3.1 Analysis of Source Inventory

The VSAM files, DB2 tables and flat files (structures, data and copybook layouts) are assumed to be baselined for inventory purposes. Since archive data will be migrated only if the archives are in the current source format, their inventory needs to be documented.

When data is migrated from VSAM and DB2 to Oracle, the data that needs to be migrated and the data that is left behind in the source (because of duplication, etc.) need to be identified as part of scope analysis.

Sl  Description                              Quantity  Link for the list
1   No of VSAM files in inventory            667
2   No of DB2 tables in inventory            313
3   No of VSAM files to be migrated
4   No of DB2 tables to be migrated
5   No of VSAM backups
6   No of DB2 backups
7   Volume of data
8   Size of DB2 database                     25 GB
9   Size of VSAM database                    245 GB
10  No of DB2 tables with reference data
11  No of VSAM files with reference data
12  No of DB2 tables with transaction data
13  No of VSAM files with transaction data
14  No of DB2 tables with master data
15  No of VSAM files with master data
16  No of databases in the system

2.3.2 Source Data Analysis

Data analysis for all the source entities needs to be documented. This will be done iteration-wise, based on the evolving target data model. ING will provide the field descriptions, ranges and domain values for all the fields. This will help in deciding whether an unmapped source field can be ignored or not. The following Excel format has been agreed upon, and ING and TCS will jointly complete it for all the VSAM file and DB2 table attributes and their descriptions.

As a standardization measure, the domain values of the source database may have to be standardized for the target (based on international standards, ING specifics or the new application design). Such domain values should be agreed upon and signed off well in advance, as part of the analysis phase.

The analysis should also cover the following aspects of the source and target data models:

- Business dependencies between the entities
- Understanding of multiple record layouts
- Technical dependencies between the entities
- Database-specific constraints that may have a potential impact on the data conversion (for example, the impact of migrating COMP-3, OCCURS, REDEFINES, etc. from a mainframe environment to Unix/Oracle)

2.3.3 Data Cleansing

Based on the data analysis, the fields that need to be cleansed should be identified. Data cleansing is required to ensure that only accurate, consistent and complete data is loaded into the target database. Data cleansing will be required for:


- Junk characters / characters not supported by Oracle, such as nulls
- Invalid domain values
- Domain value standardization
- Values not within the range of the field
- Format consolidation (e.g., dates, amount fields)
- Referential integrity (e.g., an affiliate RUT in any transaction table should also be present in the affiliate master)

The cleansing requirements should be documented clearly, stating the present conditions and the proposed corrective action. The field analysis template itself can be used for documenting cleansing requirements. Data cleansing requirements and routines will be provided by ING. We also need to identify at what stage the cleansing rules can be applied (extraction, transformation or load).
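For illustration, profiling queries of the following kind can surface cleansing candidates in the staging database. This is a sketch only: the table names (STG_AFFILIATE_MASTER, STG_AFFILIATE_TXN), column names and domain values are hypothetical, not the project's actual schema.

    -- Values outside an agreed domain for a status code (NULLs included)
    SELECT status_cd, COUNT(*) AS row_count
    FROM   stg_affiliate_master
    WHERE  status_cd IS NULL
       OR  status_cd NOT IN ('A', 'I', 'D')   -- placeholder domain values
    GROUP  BY status_cd;

    -- Referential integrity: transaction RUTs missing from the affiliate master
    SELECT t.rut, COUNT(*) AS orphan_rows
    FROM   stg_affiliate_txn t
    WHERE  NOT EXISTS (SELECT 1
                       FROM   stg_affiliate_master m
                       WHERE  m.rut = t.rut)
    GROUP  BY t.rut;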

2.3.4 Extraction programs

The extraction rules will be based on the business need and the data required for each iteration. The rules to extract data from the source (VSAM / DB2) need to be defined jointly by ING and TCS, and the same will be incorporated in the extraction programs.

2.3.5 Analysis of Target Database

Once the target database design is completed and baselined, the following table will be updated:

Sl  Table Name  Total  Not Null  Date  Unique Key
1
Total
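Once the target schema exists, one way to populate the column counts in this table is from the Oracle data dictionary. The following is an illustrative sketch only, assuming the queries are run from the schema owner's account.

    -- Column profile per table: total, NOT NULL and DATE column counts
    SELECT table_name,
           COUNT(*)                                            AS total_cols,
           SUM(CASE WHEN nullable = 'N' THEN 1 ELSE 0 END)     AS not_null_cols,
           SUM(CASE WHEN data_type = 'DATE' THEN 1 ELSE 0 END) AS date_cols
    FROM   user_tab_columns
    GROUP  BY table_name
    ORDER  BY table_name;

    -- Columns participating in primary or unique keys, per table
    SELECT c.table_name,
           COUNT(DISTINCT cc.column_name) AS key_cols
    FROM   user_constraints c
    JOIN   user_cons_columns cc ON cc.constraint_name = c.constraint_name
    WHERE  c.constraint_type IN ('P', 'U')
    GROUP  BY c.table_name;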


Assumptions

Updated Project Plan is available

Finalized Source inventory list for current iteration is available

Target data model for current iteration is available

Activities

SL  Category        Task                                                                                        Schedule (Week-Day)
1   Analysis        Document the base-lined source inventory
2   Analysis        Categorize the source entities as Reference, Transaction and Master
3   Analysis        Identify candidate fields; analyze and understand the domains and the range/set of valid values of the identified candidate fields
4   Analysis        Analyze the source and target data models for cardinality, optionality and relationships
5   Analysis        Understand the record identifiers for data stores with multiple layouts (internal to COBOL programs; may be hidden in the data definition)
6   Analysis        Understand the impact of environment-specific constructs such as compressed data items (COMP variables in COBOL), repeating data groups (OCCURS clause), reuse of storage space (REDEFINES and VALUE clauses) and date structures (a date may lack the century part, or may be a Julian date)
7   Analysis        Identify system dependencies (e.g., the character set on the mainframe is EBCDIC while on UNIX it is ASCII; the date format is date + time in target Oracle while this may not be the case in the source)
8   Analysis        Classify the entities that must be converted for the target, entities used only for transformation, entities that are redundant, entities not required for the target, and entities that are in question; identify the owner for the entities in question
9   Analysis        Finalize and document the criteria for data extraction
10  Analysis        Identify the right source, and the right instance of the data, based on discussions with the maintenance and business teams
11  Analysis        Define the general flow for the migration process (VSAM extract flat files versus master files)
12  Analysis        Review the standards for data mapping from target to source
13  Data Cleansing  Identify and document data cleansing requirements

Deliverables

Data analysis findings
Updated inventory list


Challenges

It is essential to baseline both the source and target data models to reduce rework; however, this is not practical when analysis is done in iterations. It is vital that any change to the source or target baseline is communicated to the data migration team immediately. The change should be analyzed at once and the data analysis document updated.

All environment-specific constructs should be identified, and it should be verified whether the Informatica tools can handle them. Where a tool cannot, suitable solutions should be identified for migrating the affected data to the target. During the POC the following were identified:

o The mainframe and UNIX character sets differ: the mainframe uses EBCDIC while UNIX uses ASCII. Informatica PowerCenter is able to handle this conversion.

o OCCURS and REDEFINES clauses can be handled by Informatica PowerCenter.

o For OCCURS DEPENDING ON, the data has to be altered manually to the maximum occurrence count before loading into Informatica PowerCenter. Use of PowerExchange will address this problem.

o Loading DB2 null data into Oracle was found to be a problem. An extra field holding the null indicator was manually added before every column that may contain a null. Use of PowerExchange will address this problem.

o In Oracle, a DATE is stored as YYYY-MM-DD plus a time component, but in VSAM files dates can appear in any combination of formats. A transformation rule was written in PowerCenter to transform source dates to the target format.

o No Julian dates were found during the POC, so a strategy for transforming them has not been identified. Further analysis is to be done to check whether the ING Core AFP system uses Julian dates.

2.4 Strategy definition

The various strategies related to data migration are defined in this phase, and the data migration strategy document is prepared. A proof of concept has been done to validate the migration strategy for extraction, transformation and load. This document will be updated with best practices and lessons learnt after each iteration.

2.4.1 Proof of concept

The migration of the following VSAM files and DB2 tables was the scope of the proof of concept. Extraction, transformation and load were done for this sample data in the development environment.

VSAM

1. CUENTAS.PROD.PMC321D1
2. CUENTAS.PROD.PMC321D2
3. CUENTAS.PROD.COT905D1
4. BENEFIC.PROD.PCB150D1
5. BENEFIC.PROD.PCT200D1
6. BENEFIC.PROD.PPR100D1
7. INCORPOR.DESA.EAE02M
8. INCORPOR.PROD.EAE03M

DB2


1. PER_INC_REC
2. RECLAMO
3. EMPLEADO
4. DIRECCION_POSTAL
5. DIRECCION_PERSONA

The proof of concept is complete and the following have been proven:

1. Extraction of a VSAM file to a flat file and FTP to a text file
2. Extraction of DB2 data to a flat file and FTP to a text file
3. Mapping and transformation between source and staging tables using Informatica PowerCenter
4. Mapping and transformation between staging and target tables using Informatica PowerCenter
5. Loading of the VSAM and DB2 extract flat files into staging tables using Informatica PowerCenter
6. Moving data from the staging database to the target database by executing the mapping and transformation scripts in an Informatica PowerCenter workflow
7. Transfer of scripts and integration between offshore and onsite

Assumptions

Project Plan is available

Activities

SL  Category             Task                                                    Schedule (Week-Day)
1   Strategy definition  Define data migration strategy
2   Strategy definition  Define testing strategy
3   Strategy definition  Define implementation strategy
4   Strategy definition  Create data migration strategy document
5   POC                  Do proof of concept
6   Review               Review the data migration strategy document
7   Presentation         Presentation to selected audience
8   Sign-off             Obtain sign-off from clients on the strategy documents

Deliverables

Data Migration Strategy Document

2.5 Design

The objective of this phase is to define a set of rules to transform data from source to target. The mapping rules are based on the source and target data structures and the domain information provided by ING. A mapping repository is created to maintain the list of mapping rules. The following template is used for the mapping repository.


2.5.1 Mapping rules

Direct mapping

Identify target fields with a one-to-one relationship with the source and specify the source value to be used.

Transformation rule mapping

For the remaining target fields, document the transformation rule in detail, specifying the source fields and the computation clearly.

Default value mapping

Identify target fields that have no relation to the source and specify the default value to be populated. Functional and design people need to be involved in these decisions.

Unmapped fields in source

Unmapped fields in the source will be analyzed and the risk of not migrating this data will be estimated. This analysis will be done only if a field remains unmapped after all iterations are completed.
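For illustration, the first three mapping types can be expressed in a single staging-to-target statement, as in the sketch below. The table and column names, the domain decode and the default value are all hypothetical.

    -- Direct, transformation-rule and default-value mappings in one statement
    INSERT INTO tgt_affiliate (rut, full_name, status_cd, source_system)
    SELECT rut,                                          -- direct mapping (one-to-one)
           TRIM(first_name) || ' ' || TRIM(last_name),   -- transformation rule mapping
           DECODE(status, '1', 'A', '2', 'I', 'U'),      -- domain-value transformation
           'LEGACY-AFP'                                  -- default value mapping
    FROM   stg_affiliate;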

2.5.2 Data Format – Source to Text File

VSAM to flat file (any COBOL layout to free-format layout)

All the following conversions will be done by Informatica PowerCenter itself, based on the standards:

VSAM Data Type  Flat File                                    Remarks
COMP-3          Free-format sign-edited text numeric field
COMP-2          Free-format numeric display field
Signed Decimal  Sign-edited text field
COMP            Free-format sign-edited text numeric field
Numeric         Numeric

DB2 to flat file

DB2 Data Type                  Flat File                             Remarks
SMALLINT                       PIC -9(4)                             1 <= n <= 15
INTEGER                        PIC -9(9)                             16 <= n <= 31
DECIMAL (p,s) or NUMBER (p,s)  PIC -9(p).9(p-s) or PIC 9(p).9(p-s)   p = precision, s = scale; 1 <= p <= 31 and 0 <= s <= p
CHAR (n)                       PIC X(n)                              1 <= n <= 255

2.5.3 Non-key source fields becoming key fields in target

For source data where non-key fields become key fields in the target, integrity and the order of migration must be handled properly, so that the complete information is retained without data inconsistency or data redundancy. Unique and non-unique constraints will be analyzed and the proper validation technique ascertained, so that there is no undefined information in the system. Proper indexes will be defined in the target system so that the access time stays within the SLA.
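As an illustration, a uniqueness check of the following kind can be run in staging before a non-key source field is promoted to a key in the target. The table STG_CONTRACT and the column FOLIO_NUMBER are hypothetical names.

    -- Duplicate values in a candidate key field; zero rows means the field
    -- can safely become a unique key in the target
    SELECT folio_number, COUNT(*) AS dup_count
    FROM   stg_contract
    GROUP  BY folio_number
    HAVING COUNT(*) > 1;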

2.5.4 Date and time stamp / load date fields and user id

Dates will use the Oracle format mm/dd/ccyy, with default values set by the business. Timestamps will also use the default Oracle timestamp. Load date and update user id fields will be assigned the date on which the loading/migration is done and a default user id.
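A minimal sketch of this stamping during the load is shown below. The tables, columns and the source date format are hypothetical; only the load date and the fixed user id reflect the rule stated above.

    INSERT INTO tgt_account (account_no, open_date, load_date, update_user_id)
    SELECT account_no,
           TO_DATE(open_date_txt, 'MM/DD/YYYY'),  -- source date text to Oracle DATE
           SYSDATE,                               -- load date: when the migration runs
           'MIGR_BATCH'                           -- default user id for migrated rows
    FROM   stg_account;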

Assumptions

Baselined source and data model for the current iteration is available

Data analysis findings are available

Activities

SL  Category  Task                           Schedule (Week-Day)
1   Design    Create mapping repository
2   Review    Review the mapping repository

Deliverables

Mapping repository

2.6 Construction

The objective of this phase is the development of the data migration suite. This phase consists of the creation of extraction, transformation and load scripts for data migration.


2.6.1 Data migration approach

The data migration will occur in two stages. In the first stage, data is migrated from the source systems to the staging Oracle database in the same layout as the source file layouts. In the second stage, the data is moved from the staging database to the target Oracle database. The following diagram depicts the data migration steps:

2.6.2 Source System (VSAM / DB2) to Staging database (Oracle)

2.6.2.1 Extract

The extraction strategy given here does not use Informatica PowerExchange. The impact of PowerExchange on the extraction process will be analyzed, and this document will be updated accordingly after iteration 1.

The data from the VSAM files and DB2 tables is extracted by the following steps.

Steps for extraction of VSAM files

1. REPRO JCLs to extract the VSAM files into flat files will be written. A temporary variable is to be used in the JCL, and the name of the file is to be hard-coded in only one place.
2. The JCL should also contain a step to FTP the flat file in binary format to the FTP server.
3. The logical grouping of files in one JCL should be determined and standardized.

Steps for extraction of DB2 tables

1. DB2 unload JCLs to extract the DB2 tables' data into flat files will be written. A temporary variable is to be used in the JCL, and the names of the table and the load file are to be hard-coded in only one place.
2. The JCL should also contain a step to FTP the flat file in binary format to the FTP server.
3. The logical grouping of tables in one JCL should be determined and standardized.


Pre-processing - Informatica PowerCenter

1. The COBOL format programs with copybook names will be written with a ".CBL" extension.
2. The copybooks will be copied into the same folder with a ".CPY" extension.
3. The source descriptions will be defined in Informatica Source Analyzer.
4. Using the source descriptions, the target table descriptions (staging Oracle DB) will be defined in Informatica Warehouse Designer.
5. The staging target tables are created in the database.

Note: Any compatibility issues between mainframe data and loading it into Informatica PowerCenter will be analyzed; the extraction process may be impacted, and the document will be updated accordingly.

The following are also done as part of the extraction process:

- Some degree of data cleansing will be performed as part of the extraction process, including replacing junk characters with blanks and substituting zero for invalid numeric fields.
- A reporting mechanism for each extraction process will also be developed. It will report the details of rejected records, bad records, excluded records and bad data.
- Transfer of the text files from the mainframe to the UNIX environment will be performed by regular FTP. Each file to be transferred will be split into a number of files, and the split files will be compressed with PKZIP software. The compressed files will be transferred through the UNIX box to the Informatica server, where they will be decompressed with PKUNZIP software and loaded into the Informatica server.

2.6.2.2 Transform

The following steps need to be followed in Informatica PowerCenter:

1. The mapping rules are defined and linked between the source and target in the Mapping Designer.
2. The transformation rules are designed and scripted in the Transformation Developer.
3. Some degree of data cleansing will be performed as part of this process.

2.6.2.3 Load

The following steps are involved in loading the data from VSAM and DB2 into the staging Oracle database:

1. The reusable sessions that define the mapping are created.
2. The workflow is created in the Workflow Manager; it defines which sessions need to be executed and the sequence and time of execution.
3. The workflow is executed to load the data from the source into the staging database. The number of workflows will be decided based on the sequence of the migration.
4. Referential integrity will not be maintained in this database.
5. Indexes will be created based on the performance requirements.

2.6.3 Staging database (Oracle) to Target database (Oracle)

2.6.3.1 Extract


The data from the staging database is not extracted to files; the movement is represented as mappings and transformations, and Informatica PowerCenter moves the data from the staging database to the target database.

Pre-processing - Informatica PowerCenter

1. The source descriptions (the staging database needs to be defined as a source) will be defined in Informatica Source Analyzer.
2. Using the logical target database design, the target table descriptions (target Oracle DB) will be defined in Informatica Warehouse Designer.
3. The target tables will already be available in the database, created by the application team.

2.6.3.2 Transform

The following steps need to be followed in Informatica PowerCenter:

1. The mapping rules are defined and linked between the source and target in the Mapping Designer.
2. The transformation rules are designed and scripted in the Transformation Developer.
3. Cleansing activities will also be done here.
4. The data will be ported to the Informatica server through the UNIX box.

2.6.3.3 Load

The following steps are involved in loading the data from the staging Oracle database into the target Oracle database:

1. The reusable sessions that define the mapping are created.
2. The workflow is created in the Workflow Manager; it defines which sessions need to be executed and the sequence and time of execution.
3. The workflow is executed to load the data from the staging database into the target database.
4. Referential integrity will be maintained in this database, so data loading has to be performed in the defined loading sequence (see the sketch after this list).
5. Indexes will be created based on the target database schema requirements.
6. Additional indexes may also be necessary to meet performance requirements.
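Because referential integrity is enforced in the target, parent tables must be loaded before their child tables. As an illustration, the parent-child dependencies that drive the workflow sequence can be read from the Oracle data dictionary in the target schema; the query below is a sketch, not project code.

    -- Parent-child table dependencies derived from foreign-key constraints
    SELECT child.table_name  AS child_table,
           parent.table_name AS parent_table
    FROM   user_constraints child
    JOIN   user_constraints parent
           ON parent.constraint_name = child.r_constraint_name
    WHERE  child.constraint_type = 'R'   -- 'R' = referential (foreign key)
    ORDER  BY parent.table_name, child.table_name;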

Note: Input source data that does not require cleansing in staging will be migrated directly to the target. The cleansing analysis for the files plays a major role in deciding this strategy. This approach will save considerable time during the implementation.


2.6.4 Cleansing

2.6.4.1 Pre Migration (Production Phase)

This process will cleanse all the non-voluminous and business non-critical data. The main purpose of cleaning the data directly in production is to avoid cleaning similar data again in the subsequent migration: data that has been cleaned will remain clean throughout the different phases of migration. The types of data that will be cleaned are:

Name: entity property data, such as customer name, customer address, dealer name, bank name and DSSO name.

Comment: entity attribute data, such as descriptions, comments, attention fields and any other fields that do not participate in business validation.


Appropriation: standardization data, such as customer name standardization and address standardization.

2.6.4.2 Extraction Process

This level of cleansing cleans voluminous, business non-critical data. It also includes data that is routine and statically cleaned. The data cleaned in this process includes:

- Technical data (does not need any business intervention)
- Default data (handling of spaces, nulls, dates)
- Cleaning of junk characters
- User-identified incorrect data

2.6.4.3 During Transformation

The major part of the data cleansing rules is applied at this stage. Cleansing at transformation covers both transforming data from source to staging and transforming data from staging to target. It includes:

- Inconsistency in business domain values (ZIP code, RUT)
- Unmapped data

2.6.4.4 In staging

Some level of cleansing will be done on the data present in the staging tables. Either Java programs or SQL will be written to clean the data present in staging.
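For illustration, SQL-based cleansing in staging could look like the sketch below. The table, columns and rules are hypothetical; the actual cleansing rules will come from the requirements ING provides.

    -- Replace NUL and TAB control characters with blanks. The leading 'x' in
    -- both character lists is an Oracle idiom: TRANSLATE treats an empty
    -- replacement string as NULL, so a dummy character is kept on both sides.
    UPDATE stg_affiliate_master
    SET    full_name = TRANSLATE(full_name, 'x' || CHR(0) || CHR(9), 'x  ')
    WHERE  INSTR(full_name, CHR(0)) > 0
       OR  INSTR(full_name, CHR(9)) > 0;

    -- Substitute zero for numeric text fields that do not hold a valid number
    UPDATE stg_affiliate_master
    SET    balance_txt = '0'
    WHERE  NOT REGEXP_LIKE(balance_txt, '^-?[0-9]+(\.[0-9]+)?$');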

Note: If cleansing is to be done during both transformation and staging, ING and TCS will have to analyze the impact on the effort involved and the changes to the plan.

2.6.5 Audit trail data, summary data

Audit trail data will contain the total number of records migrated and the summation of selected numeric fields. This will be re-validated in the target system to confirm the correctness of the file transfer. Record rejections and record appropriations can also be included in the audit data.

Summary data will contain the key information for the migration of a particular entity. For example, for the Affiliate Master, the RUT, the name of the affiliate and any other information critical to the entity will be considered.
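An illustrative sketch of capturing such an audit record follows. The audit table MIG_AUDIT_TRAIL, its columns and the entity name are hypothetical.

    -- One audit row per migrated entity: record count plus a checksum on a
    -- business-important numeric field, for re-validation in the target
    INSERT INTO mig_audit_trail (entity_name, record_count, amount_checksum, run_date)
    SELECT 'AFFILIATE_MASTER',
           COUNT(*),
           SUM(NVL(account_balance, 0)),
           SYSDATE
    FROM   tgt_affiliate_master;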


2.6.6 Reports

Exception reports will be analyzed, the gaps will be studied, and new or changed data cleansing definitions will be incorporated. Both the rule definitions and the programs will be configured. The use of a reporting tool will be analyzed and finalized.

2.6.7 Special Requirements

Any special requirements that arise as part of data analysis will be documented and updated frequently.

Assumptions

Baselined source and target data model for the current iteration is available

Mapping repository available

Data cleansing requirements available

Activities

SL  Category        Task                                                                                        Schedule (Week-Day)
1   Construction    Extraction routines are written for extracting data from the mainframe
2   Construction    Source and target definitions are created in Informatica PowerCenter using information from the source and target data models
3   Construction    Mapping and transformation rules are created in Informatica PowerCenter based on the information collected in the mapping repository
4   Construction    Sessions and workflows are created using PowerCenter for executing the mapping and transformation rules
5   Data Cleansing  Data cleansing rules are also written in this stage, if required
6   Validation      Finalize and document the criteria for data validation, to verify the correctness of migration (business validation)
7   Audit           Finalize and document the criteria to verify the completeness of migration (technical validation)

Deliverables

Extraction routines
Source and target definitions
Mapping and transformation rules
Load routines (sessions and workflows)
Audit and validation routines

2.7 Testing

This phase comprises testing the data migration suite for each iteration. Testing will check all the transformations, mappings, workflows, cleansing, audits and validations. Individual test cases need to be prepared for testing the various functionalities. The following matrix illustrates the broad areas that the test cases will pertain to:

Attribute: Business-important fields for checksum
Measurement plan:
1. Identify all business-important fields that can be used for summation checks on data extracts and in target tables.
2. Perform summations on the identified fields in the incoming data files and match the sums.
3. Perform summations on the identified fields in the ODS and match them with those of the incoming data.
Remarks: Business-important fields that can be used for checksums need to be requested from ING users and should be included in the extracts.

Attribute: Business rules
Measurement plan: All data elements are to be mapped to business rules. All data elements and relationships should pass the associated business rules (e.g., a data attribute can contain only one out of a set of values).
Remarks: All business rules should be provided by ING users, and TCS will do a feasibility analysis for them.

Attribute: Integrity checks
Measurement plan:
1. Identify all integrity constraints.
2. All data must pass the associated integrity constraints (e.g., there can be no detail records in the absence of a master); see the sketch after this matrix.
Remarks: Integrity constraints should be specified by ING users and then verified and validated.

Attribute: Outlier conditions
Measurement plan: Identify the minimum, maximum and default values for data attributes. All data attributes should contain a valid value. Raise an alert when invalid values are detected.
Remarks: Min, max and default values should be provided, then verified and validated.

Attribute: Alert mechanism
Measurement plan:
1. Identify all steps which need to generate an alert (e.g., invalid incoming data, failed integrity checks, outliers, load failures).
2. Raise alerts.
Remarks: Any specific alert requirements should be specified in the ETL strategy so that they are incorporated in development.

Attribute: Correctness of calculations
Measurement plan: Identify fields involving complex calculations. Recalculate once loading is complete. Match with the previously calculated values.
Remarks: ING users are to specify critical fields involving complex calculations, and the same will be incorporated.

Attribute: Audit trail
Measurement plan: Identify the data to be captured in the audit trail (e.g., file name, number of records on file, records inserted from the file). Capture audit attributes during the load process and store them in the audit table.
Remarks: Any specific audit requirements should be specified in the ETL specs and will be incorporated.

Attribute: Incoming data summary
Measurement plan: Identify the summary information for input data to be sent in an additional file (file name, number of records, date) and perform checks on the incoming data (match the record count in the control file against the actual number of records received). Raise an alert in case of mismatch.
Remarks: The incoming control summary file specification is to be provided, and the same should be incorporated in the extract.

Attribute: Business test cases
Measurement plan: TCS will write 40 to 50 test cases to check the business scenarios for audit. Business test cases could be SQL queries that get the data from the target and verify it against the existing mainframe data. The business criteria can be identified from the legacy reports or may be provided by ING.
Remarks: Information on critical reports is to be provided by ING.
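For illustration, the integrity-check row above translates into SQL of the following kind; the table names are hypothetical, and the expected result is zero orphan rows.

    -- Detail records with no corresponding master record
    SELECT COUNT(*) AS orphan_details
    FROM   tgt_contribution d
    LEFT   JOIN tgt_affiliate_master m ON m.rut = d.rut
    WHERE  m.rut IS NULL;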

2.7.1 Validation

The following validations will be performed to ensure the correctness of the data migrated.

No  Category                               Source                           Destination                      Criteria
1   Number of physical records             All entities                     All entities                     Exact match or deviation justified
2   Sum                                    Field-1, Table-1;                Field-1, Table-1;                Exact match or deviation justified
                                           Field-2, Table-2;                Field-2, Table-2;
                                           Field-1, Table-1                 Field-1, Table-3
3   Sum against a branch                   Field-1, Table-1                 Field-1, Table-1                 Exact match or deviation justified
4   Total number of active affiliates      Field-1, Table-1                 Field-1, Table-1                 Exact match or deviation justified
5   Total number of deceased affiliates    Field-1, Table-1                 Field-1, Table-1                 Exact match or deviation justified
6   Totals                                 Field-1, Table-1                 Field-1, Table-1                 Exact match or deviation justified
7   Status fields                          Group-by count                   Group-by count                   Exact match or deviation justified
8   Null fields                            Count Field-1; Count Field-2     Count Field-1; Count Field-2     Exact match
9   Blank fields                           Count Field-1; Count Field-2     Count Field-1; Count Field-2     Exact match
10  Not-null fields                        Count Field-1                    Count Field-1                    Exact match
11  Duplicate rows                         Table-1; Table-2                 Table-1; Table-2                 Exact match
12  Deleted rows                                                                                             Justify
13  Key fields (RUT, folio number)         Group by range                   Group by range                   Exact match
14  Name fields                            Compare by key                   Compare by key                   Exact match
16  Round off                              Verify correct decimal places    Verify correct decimal places    Exact match
17  Truncation error on identified field   Correct truncation               Correct truncation               Exact match
18  Exceptions                             Defined                          Defined                          Validate
19  Bad records                            Identify                         Defined                          Validate
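Several of the validations tabled above can be written as plain SQL; the sketch below shows two of them against hypothetical staging and target tables (staging serves as the proxy for the source side).

    -- 1. Number of physical records: counts on both sides should match
    SELECT (SELECT COUNT(*) FROM stg_affiliate_master) AS staging_count,
           (SELECT COUNT(*) FROM tgt_affiliate_master) AS target_count
    FROM   dual;

    -- 7. Status fields: group-by counts, run on both sides and compared
    SELECT status_cd, COUNT(*) AS row_count
    FROM   tgt_affiliate_master
    GROUP  BY status_cd;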

2.7.2 Audit

Audit rules are expected to be defined by the ING Core AFP Data Migration team in the following format. The auditing should be done based on reliable reports from the business; the business reports to be used for auditing will be provided by ING.

No Category Source Destination Criteria

2.7.3 Testing Lifecycle

The construction and unit testing will be done by the TCS onsite/offshore team after the finalization of the design document. This will be done on an ongoing basis.

The migration components will be delivered by TCS upon completion of construction and unit testing. The components will be validated by the ING Data Migration team.

After this primary validation, a larger Revolution Unit Testing will be performed by the ING Data Migration team once the ING Core AFP application is delivered. TCS will support this testing.

After the completion of this phase, Performance Testing and Revolution Unit Testing will be performed in parallel. TCS will support these two types of testing.


Assumptions

Data migration suite available (Extraction, transformation and load routines)

Audit and validation routines available

Source Data for migration is available

Activities

SL  Category              Task                                                                                Schedule (Week-Day)
1   Testing               Test the data migration suite
2   Audit and validation  Run the audit and validation scripts and verify the completeness and correctness of data migration

Deliverables

Tested data migration suite

2.8 Pre-Implementation (Dry Runs)

A pre-implementation or dry run is a simulation of the production implementation in the test environment. The objective is to understand the complexities of implementation in terms of the data migration window and infrastructure requirements, and to fine-tune the programs and implementation procedures if required. Data migration implementation is planned in two phases, so a pre-implementation run will be done for each of these phases. This will be done by the ING Data Migration team and the Business Capability team; TCS will support this testing.

The strategy defines go/no-go checkpoints after the different stages, and a root cause analysis (RCA) will be done at each checkpoint. Based on the RCA, the data mapping, data model, design, migration design and migration component code will be revisited and necessary actions taken.

Assumptions

Tested Data migration suite available for the current implementation phase

Test environment that is simulated based on production is available

Source Data for pre-implementation dry run is available

Activities

SL  Category              Task                                                                                Schedule (Week-Day)
1   Pre-implementation    Test the data migration suite
2   Audit and validation  Run the audit and validation scripts and verify the completeness and correctness of data migration
3   Performance           Performance tuning of the data migration suite, if required


Deliverables

Full-volume tested data migration suite

2.9 Implementation

The implementation phase comprises the activities for the actual production data migration. The implementation of data migration depends mainly on the implementation window, the volume of data to be migrated and the type of data. The implementation strategy will be finalized after further analysis of the data and further discussions.

For now, only the phase 1 rollout implementation is considered; the document will be updated for the phase 2 rollout after further analysis.

All backup data will be migrated two weeks ahead, reference data one week ahead, and transaction and master data on the weekend before go-live. The same is depicted in the figure below.

Points to be considered in adopting this approach:

1. All the backup data can be extracted in 48 hours.
2. All the reference data can be extracted in 48 hours.
3. All the transaction, master and catch-up reference data can be extracted, cleansed, transformed and loaded in 48 hours.
4. It is assumed that data migrated on the first weekend is not going to change at all.
5. It is assumed that data migrated on the second weekend (reference data) may not change in one week.
6. Additional effort is involved in doing catch-up for reference data.
7. Testing of the data will occur during the parallel-run period.
8. Incremental migration may be required for reference data.


Note: Data cleansing implementation is not considered here. The cleansing implementation will have an impact on the strategy defined in this section, and this document will be updated based on the cleansing implementation.

The source files of the ING Core AFP system, split by module, together with the best strategy and time for migrating the data, will be tabulated in the following format once the approach is finalized.

# The numbers of records and database sizes below are based on the information available from production.

Sl  System      Data         # (M)  Volume (GB)  Vertical Split (By Design)  Horizontal Split  Special Treatment  Link for the List of Tables/Files  Proposed Date of Migration
1   Contracts   Transaction
2   Contracts   Master
3   Contracts   Reference
4   Accounts-1  Transaction
5   Accounts-1  Master
6   Accounts-1  Reference
7   Claims-1    Transaction
8   Claims-1    Master
9   Claims-1    Reference
10  Accounts-2  Transaction
11  Accounts-2  Master
12  Accounts-2  Reference
13  Claims-2    Transaction
14  Claims-2    Master
15  Claims-2    Reference
16  Pensions    Transaction
17  Pensions    Master
18  Pensions    Reference
19  Bonds       Transaction
20  Bonds       Master
21  Bonds       Reference


Assumptions

Full-volume tested data migration suite is available for the current implementation phase

Activities

SL  Category              Task                                                                                Schedule (Week-Day)
1   Implementation        Back up the source data to be migrated, if required
2   Implementation        Back up the target data in the phase 2 implementation, as the target database will be operational between the phase 1 and phase 2 implementations
3   Implementation        Execute the data migration suite (extraction, cleansing, transformation and load scripts)
4   Implementation        Resolve and reconcile any data errors
5   Audit and validation  Execute the audit and validation scripts and verify the completeness and correctness of data migration
6   Implementation        Resolve and reconcile any errors encountered
7   Implementation        Invoke fallback procedures if unable to resolve and reconcile the errors encountered
8   Implementation        Make the target application go live

Deliverables

Data migrated to the target tables as per the data migration requirements.

2.9.1 Cutover Considerations

Sl  Candidate                          Issue
1   Master files                       Weekend cutover will not have any issue. Go-live on a weekday will require the files to be kept on hold.
2   Quarterly backup files             These files, which are not going to be modified, can be migrated two weeks ahead.
3   Contracts                          All contracts-related files should go live at month-end only.
4   Deceased data                      Data related to the deceased can be migrated well in advance, as it is not going to be modified.
5   Closed claims                      All data pertaining to closed claims can be migrated well in advance.
6   Inactive affiliates                Data related to inactive affiliates can be migrated well in advance, as it is not going to be modified.
7   DB2 tables                         Weekend cutover will not have any issue. Go-live on a weekday will require the records to be locked.
8   Maintenance changes                Stop online users from making maintenance transactions in the last 3-4 days before implementation. This will make the database more static.
9   Regulatory changes                 Stop applying regulatory changes in the last month before implementation.
10  Final backups prior to migration   After the completion of the batch cycle, final backups need to be taken.

2.9.2 Change Control

Scope of Change Control

Sl  Artifact                        Owner                  Repository  Formal
1   Source Database Schema          Business Analyst Team
2   Source Data                     Business Analyst Team
3   Target Data Model               Business Analyst Team
4   Extraction Rules                Business Analyst Team
5   Extraction Programs             Technical Team
6   Extraction Jobs/Schedules       Technical Team
7   Extracted Data on Mainframe     Technical Team
8   Transfer Programs               Technical Team
9   Transformation Rules            Business Analyst Team
10  Transformation Programs         Technical Team
11  Transformation Jobs/Schedules   Technical Team
12  Transferred Data (in UNIX)      Technical Team
13  Conversion Database             Technical Team
14  Transformed Data                Technical Team
15  Loading Programs                Technical Team
16  Loaded Data                     Technical Team
17  Loading Jobs/Schedules          Technical Team
18  Cleansing Rules                 Business Analyst Team
19  Cleansing Programs/Scripts      Technical Team
20  Cleansing Report                Technical Team
21  Validation Rules                Technical Team
22  Validation Programs             Technical Team
23  Validation Reports              Technical Team
24  Test Case (Unit/Integration)    Technical Team
25  Test Script (Unit/Integration)  Technical Team
26  Test Result (Unit/Integration)  Technical Team
27  Test Report (Unit/Integration)  Technical Team
28  Audit Rules                     Business Analyst Team
29  Audit Programs                  Technical Team
30  Audit Reports                   Technical Team


2.9.3 Traceability

The data migration artifacts (documents and programs) are to be traced from the target fields through to the audit and validation routines in the ING Core AFP system. The following diagram depicts the traceability requirements at the different stages.

The following example can be used as a template for the traceability matrix:

Sl  Trace                           Tracing to
    Field Number                    <Table Number>-<Field Number>
1   Target Field
2   Target Table
3   Source Field
4   Source Table / File
5   Clean Rule
6   Clean Program
7   Clean Report
8   Extract Rule
9   Extract Program
10  Extract Job
11  Extract Schedule
12  Transfer Spec
13  Transfer Program
14  Transfer Jobs
15  Transfer Schedule
16  Transform Spec
17  Transform Program
18  Transform Job
19  Transform Schedule
20  Clean Rule
21  Clean Program
22  Clean Report
23  Load Spec
24  Load Program
25  Load Job
26  Load Schedule
27  Clean Rule
28  Clean Program
29  Clean Report
30  Test Case (Unit/Integration)
31  Test Script (Unit/Integration)
32  Test Result (Unit/Integration)
33  Test Report (Unit/Integration)
34  Validation Rule
35  Validation Script
36  Validation Report
37  Audit Rule
38  Audit Script
39  Audit Report

2.9.4 Backup and Recovery

For backup and recovery, the version-controlled artifacts will be placed in the Rational ClearCase tool under the corresponding folders. The frequency of the backups will depend on the type of artifact.

The strategy developed for the data migration has been modularized to enable restart of the process at any stage of failure.

Exception handling during the outage window will be decided based on the business criticality of the data being migrated.

The possible ways of handling exceptions are:

1. Stop the migration; delete everything and start again from the beginning.
2. Write the exception to a separate file and continue the migration without inserting that record (see the sketch after this list).
3. Write the exception to a separate file and continue the migration, inserting the record with predefined values.
4. Stop the migration; analyze the exception, solve it and restart the migration from the last commit point.
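Option 2 can be sketched in PL/SQL as below. This is illustrative only: the tables are hypothetical, and row-by-row processing is shown for clarity, whereas bulk operations would be preferred at production volumes.

    BEGIN
      FOR rec IN (SELECT rut, account_no, balance_txt FROM stg_account) LOOP
        BEGIN
          INSERT INTO tgt_account (rut, account_no, balance)
          VALUES (rec.rut, rec.account_no, TO_NUMBER(rec.balance_txt));
        EXCEPTION
          WHEN OTHERS THEN
            -- log the failing record and continue with the migration
            INSERT INTO mig_exceptions (entity_name, key_value, error_msg, run_date)
            VALUES ('ACCOUNT', rec.account_no, SQLERRM, SYSDATE);
        END;
      END LOOP;
      COMMIT;
    END;
    /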


3 Risks

The target database design is not available on time. This may impact the definition of the mapping and transformation rules, and the whole migration process.

Delay in source inventory analysis by ING.

Delay in data cleansing activities by ING.

The production cutover window for implementation is expected to be 48 hours over a weekend. This window might get reduced.

Environment readiness

Cutover window – Network, Link, Database, Extended Production Window

Software Version Change (Oracle, Informatica, OS)

Major changes in source due to SAFP Regulatory Changes

Change in the layout of the files

4 Guidelines

Pre-extraction data cleansing is advisable for voluminous non-critical data.

A vertical split of the source data is preferable only when the design of the target data model demands it. A vertical split to handle voluminous data is not recommended.

The scripts written for extraction and all other activities need to be written in standard formats, and backups taken periodically.

Non-critical and backup data can be migrated two weeks before the system goes live.


5 Recommendation

1. The target data model for the conversion database is yet to be firmed up. Once the target model is firm, the mapping can be started. However, the iterative development model of the ING Core AFP project demands an iterative construction and unit testing phase for the data migration programs.

2. The assumption of a month-end-weekend implementation of the whole ING Core AFP data migration may not hold good. The DM POC can be used as a contingency plan for the complete implementation.

3. The fine line between the several interfaces and the cutoff scenario of data migration has to be properly monitored. Several cutoff issues relate to the handling of the interfaces during the cutover window. We recommend formal weekly interaction among the interface team, migration team, business capability team and maintenance team.

4. The complete migration life cycle (extraction to loading into the target database) has been designed with redundancy and modularity, to ensure that at every logical break point one can commit or restart.

5. To minimize risk, the implementation strategy assumes a one-off data migration on the weekend and an incremental build over five days. This portion will include only the data that is unlikely to change. The data related to active accounts will be migrated on the production cutover weekend.

6. We recommend a comprehensive traceability matrix based on the target fields of the target database. This will provide proper insight into the project as well as help the change control mechanism.

7. The data migrations for phase I and phase II are assumed to be two separate implementations.

8. Candidate field analysis for all the source data is recommended upfront, to identify the potential data cleansing requirements.

6 Responsibility Matrix
