Test Data Management – The Next Hype?
Leif Bäck, MainSoft, Finland
Europe’s Premier Software Testing Event
Stockholmsmässan, Sweden
WWW.EUROSTARCONFERENCES.COM
“Testing For Real, Testing For Now”
Test Data Management -
The Next Hype?
EuroSTAR2009
Leif Bäck, MainSoft
My Background
• CapGemini 77-85
• ICS 85-88
• Cominvest 88-92
• MainSoft 93-
– System Software (sales, marketing,
implementation, training ….)
– The last 15 years specializing in Test Data
Management solutions
Agenda
• Test Data – overview
• TDM-functionality
• TDM-solution
– Concept, implementation, success factors ...
• Cases
• Summary
Test Data - overview
• Test Data has, until recently, been the forgotten
component in software development and testing
• Developers and Testers often struggle with Test
Data
• Creating and managing Test Data is time
consuming, complex and prone to human error
• The enterprise is getting more complex every
day (also the data models and the testing)
Traditional Model
[Diagram: a single Program between Input and Output]
Database Application
[Diagram: several Programs working against one shared DATABASE]
The challenge: Complicated structures
Test Data – overview cont…
• Test Data cases are easily consumed
• Test Data often has a “best before date”
• Cloning (of production data) is for many not an
option any more (security, long run times…)
• The quality of your Test Data affects the quality
of your testing – this has a direct effect on the
bottom line!!!!
• The need for Test Data Management is rapidly
growing – is it the next hype?!?
Some TDM-phases
• Mid-1990s: many powerful tools hit the market
– The usage at the start was often disorganized (the tools were used as utilities to solve specific problems by those who had enough skill)
– At the same time, automated testing tools had their first boom
• Y2K testing was an opportunity, but the focus was elsewhere (too much money around?!?)
• Companies start to organize the usage of the tools (at the same time, big investments in other testing tools)
• Today there seems to be a fast-growing interest in TDM-solutions, with drivers like:
– SOA
– Better predictability
– Need to cut costs
– Need to collect ”silent knowledge”
Requirements on a TDM-toolset - Extract
[Diagram: related tables CUSTOMERS, ORDERS, DETAILS and ITEMS]
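The extract requirement, pulling selected customers together with their related ORDERS, DETAILS and ITEMS rows, can be sketched roughly as follows. This is a minimal illustration using an in-memory SQLite database, not how any particular TDM toolset works; the column names, the `RELATIONSHIPS` list and the `extract` and `build_demo_db` functions are all assumptions made for the example.

```python
import sqlite3

# Assumed relationships: (from_table, from_column, to_table, to_column).
RELATIONSHIPS = [
    ("CUSTOMERS", "cust_id", "ORDERS", "cust_id"),
    ("ORDERS", "order_id", "DETAILS", "order_id"),
    ("DETAILS", "item_id", "ITEMS", "item_id"),
]


def extract(conn, table, column, values, result=None):
    """Fetch rows of `table` whose `column` is in `values`, then follow relationships.

    Expects a connection whose row_factory is sqlite3.Row.
    """
    result = {} if result is None else result
    values = sorted(set(values))
    if not values:
        return result
    marks = ",".join("?" for _ in values)
    rows = conn.execute(
        f"SELECT * FROM {table} WHERE {column} IN ({marks}) ORDER BY 1", values
    ).fetchall()
    result[table] = [dict(row) for row in rows]
    for src, src_col, dst, dst_col in RELATIONSHIPS:
        if src == table:
            extract(conn, dst, dst_col, [row[src_col] for row in rows], result)
    return result


def build_demo_db():
    """Create a tiny, made-up database with the four tables from the slide."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript("""
        CREATE TABLE CUSTOMERS (cust_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE ORDERS (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
        CREATE TABLE ITEMS (item_id INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE DETAILS (detail_id INTEGER PRIMARY KEY, order_id INTEGER, item_id INTEGER);
        INSERT INTO CUSTOMERS VALUES (1, 'Cust1'), (2, 'Cust2');
        INSERT INTO ORDERS VALUES (10, 1), (11, 1), (20, 2);
        INSERT INTO ITEMS VALUES (100, 'bolt'), (101, 'nut'), (102, 'washer');
        INSERT INTO DETAILS VALUES (1000, 10, 100), (1001, 11, 101), (2000, 20, 102);
    """)
    return conn


# Extract customer 1 with everything that belongs to it.
subset = extract(build_demo_db(), "CUSTOMERS", "cust_id", [1])
print({table: len(rows) for table, rows in subset.items()})
```

The point of the sketch is that the start keys (here one customer) drive the whole extract, so the result stays referentially complete without copying the full database.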
Requirements on a TDM-toolset – Extract/Insert
[Diagram: the MyOrd-extract is taken from the Production DB and inserted into the Test DB; Cust1, Cust2 and Cust3 are carried over through a Mapping]
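A rough sketch of an insert with mapping follows. The slide does not say what the mapping covers, so the example assumes the simplest case: production customer keys are renamed to keys that are free in the test database. The data, the `key_map` and the `insert_with_mapping` function are hypothetical, and the test database is modeled as a plain dictionary.

```python
# Hypothetical extract: rows keyed by production customer id.
my_ord_extract = [
    {"cust": "Cust1", "order": "A-1"},
    {"cust": "Cust2", "order": "B-7"},
    {"cust": "Cust3", "order": "C-3"},
]

# Assumed mapping from production keys to keys that are free in the test DB.
key_map = {"Cust1": "T-Cust1", "Cust2": "T-Cust2", "Cust3": "T-Cust3"}


def insert_with_mapping(rows, test_db, key_map):
    """Insert extracted rows into test_db, translating customer keys on the way."""
    for row in rows:
        new_key = key_map.get(row["cust"], row["cust"])
        if new_key in test_db:
            # Refuse to overwrite test data that somebody else may be using.
            raise ValueError(f"{new_key} already exists in the test DB")
        test_db[new_key] = {**row, "cust": new_key}
    return test_db


test_db = insert_with_mapping(my_ord_extract, {}, key_map)
print(sorted(test_db))
```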
Requirements on a TDM-toolset - Delete
[Diagram: the rows contained in the MyOrd-extract (Cust1, Cust2, Cust3) are deleted from the Orders DB]
Requirements on a TDM-toolset - Compare
[Diagram: a Before-extract and an After-extract go into Compare]
Compare-results, on "paper" or on screen:
- new rows
- deleted rows
- changed rows
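The compare requirement can be illustrated with a small sketch that reports new, deleted and changed rows between two extracts. It assumes each extract is available as a dictionary from primary key to row; the `compare_extracts` function and the sample data are made up for the illustration.

```python
def compare_extracts(before, after):
    """Compare two extracts given as {primary_key: row_dict}."""
    new = sorted(key for key in after if key not in before)
    deleted = sorted(key for key in before if key not in after)
    changed = {}
    for key in sorted(before.keys() & after.keys()):
        columns = sorted(before[key].keys() | after[key].keys())
        diff = {
            col: (before[key].get(col), after[key].get(col))
            for col in columns
            if before[key].get(col) != after[key].get(col)
        }
        if diff:
            changed[key] = diff  # column -> (old value, new value)
    return {"new": new, "deleted": deleted, "changed": changed}


before = {1: {"name": "Cust1", "balance": 100}, 2: {"name": "Cust2", "balance": 50}}
after = {1: {"name": "Cust1", "balance": 80}, 3: {"name": "Cust3", "balance": 10}}
report = compare_extracts(before, after)
print(report)
```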
Requirements on a TDM-toolset – Test Cycle
[Diagram: Extract at T1, run ProgX, Extract at T2, then Compare the two extracts]
Requirements on a TDM-toolset - Regression Test
[Diagram: on Monday, Insert the Test Data, run Prog V1 and Extract; on Tuesday, Insert the Test Data again, run Prog V2 and Extract; then Compare the two extracts]
Requirements on a TDM-toolset - Database Subsetting
[Diagram: the test database holds separate subsets for Jill's, Arthur's, Jerry's and Jim's test cases; one test case is labeled C4]
Requirements on a TDM-toolset - Subsetting Scenario
[Diagram: the Test DB holds test cases C2, C3, C4, C5, C7 and C9; the PROGRAM is tested against C4]
1 Extract before
2 Run Program
3 Extract after
4 Compare
5 Delete
6 Insert original
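The six steps of the subsetting scenario (extract before, run program, extract after, compare, delete, insert original) can be sketched as one repeatable cycle. The test database is modeled as a dictionary, and `run_test_cycle` and `program` are hypothetical names; a real toolset would do the same steps against the DBMS.

```python
import copy


def run_test_cycle(test_db, keys, program):
    """Run `program` against the test cases in `keys` and restore them afterwards."""
    # 1 Extract before
    before = {k: copy.deepcopy(test_db[k]) for k in keys if k in test_db}
    # 2 Run Program
    program(test_db)
    # 3 Extract after
    after = {k: copy.deepcopy(test_db[k]) for k in keys if k in test_db}
    # 4 Compare
    changed = sorted(k for k in keys if before.get(k) != after.get(k))
    # 5 Delete
    for k in keys:
        test_db.pop(k, None)
    # 6 Insert original
    test_db.update(copy.deepcopy(before))
    return changed


def program(db):
    """Stand-in for the program under test: it bills test case C4."""
    db["C4"]["status"] = "billed"


test_db = {"C2": {"status": "open"}, "C4": {"status": "open"}, "C5": {"status": "open"}}
changed = run_test_cycle(test_db, ["C4"], program)
print(changed, test_db["C4"])
```

Because the original rows are put back at the end, the same test case can be reused without being "consumed" by the run.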
What is a TDM-solution?
• An application, unique for every company, with which you control your Test Data (in different Test Environments)
• Built on a powerful toolset (that takes care of the dirty work against the DBs)
• Company-unique add-ons (depending on ambitions, resources, application portfolio, data model, number of environments, other testing tools, testing culture ...)
Test Data Management - concept
[Diagram labels: Company, Test, IT; Organisation, Management; Test plan, Test analysis; Test env. plan, Test env. resp.; Test Data plan, Testdata resp.; Application area; System architecture; Testenv. architecture; Databases and Test Data; TDM application]
Implementation of Test Data Management
[Diagram labels: Business Organisation, Test, IT; Requirements, Design, Construction, Unit test, System test, Acceptance test, Production]
TDM architecture
[Diagram labels: Production, TDW, Acceptance Test, System Test, Development, Local development environments, X% of production, Fast Track, De-identification, Updated x times / year, Search]
Functionality in a TDM-application
• Subsetting: keys to extract production data
• Data owners: keeps track of who owns Test Data
• Validation: validation and control of technical keys
• Environments: keeps track of which environment Test Data is loaded in
• De-identification: de-identification of sensitive data
• Integration: Test Data synchronization in integrated applications
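One possible way to de-identify sensitive data while keeping integrated applications in sync is a keyed, deterministic pseudonym: the same input always yields the same replacement, so keys still match across systems. The sketch below is only an assumption about how this could be done, not the method used in any product; `SECRET`, `pseudonym`, `de_identify` and the column names are made up, and the result does not preserve the format of a real social security number.

```python
import hashlib
import hmac

# Assumption: the secret is managed outside the test environments.
SECRET = b"replace-with-a-secret-kept-outside-test"


def pseudonym(value, prefix="P"):
    """Return a deterministic replacement for a sensitive value."""
    digest = hmac.new(SECRET, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}{digest[:12]}"


def de_identify(row, sensitive=("ssn", "name")):
    """Replace the sensitive columns of a row and leave the others untouched."""
    return {
        col: pseudonym(str(val)) if col in sensitive else val
        for col, val in row.items()
    }


row = {"ssn": "000000-0000", "name": "Test Person", "policy": "POL-1"}
masked = de_identify(row)
print(masked)
```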
Categorization of Test Data and Test Cases
[Diagram labels: Categorization; Test strategy with optimized Test Data; System and regression testing; Acceptance and release testing; Performance Tests; Subsetting; Integrated applications; Cyclical tests with soft date; ”Right answers”; Computerized compare; Test Data in full volume; Simulation of production; Focus on critical applications; Business analysis; De-identified Test Data; Simulation of future events]
Key factors to success
• Planning: Test Data plan, Test env. plan, Test plan
• Organization: Test management, Test env. management, Test data management
• Test strategy: Test Data subsetting, Regression test, Acceptance test
• Tools: TDM-application, Generic Toolset
Case: Company X
• This company maintains the centralized register of employer payments of the Labour Pension Insurance Scheme of Finland
• These registers were previously held by each labour pension insurance company separately, but since the start of 2007 they have been concentrated in Company X
• Large project with more than 1000 persons involved, spread across several different companies
– up to 200 external consultants in Company X alone
• Providing good test material was an essential success factor
Case: Company X
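[Diagram: five numbered steps between the Pension Insurance company, the TEST and PROD environments, the PH-system and Bookkeeping, on z/OS and AIX. Step 1 is an FTP-file with social security numbers; a PROGR appears in the flow; the other steps are not labeled]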
Case: Company Y
• Company Y is a software company that
develops systems for several different labour
pension insurance companies
• When building and testing applications
Company Y needs test material in their test
databases
– This material (or the basis for it) is usually
extracted from production
Movements of Test Data between different environments
[Diagram: DEV, SYSTEM TEST and PRO environments, each divided into CUST A, CUST B, CUST C and CUST D, plus a Common part]
Material Extraction
• Company Y had used a generic tool to extract material from their production DB2 for several years
– this task had been centralized to a group of 8 testers
– as input from the ”orderer” they got a filled-in Excel form
• However, the tasks were executed ”by hand” (using ready-made processes)
– the number of manual steps to be performed could easily exceed 100, and the operators needed lots of coordination and expert knowledge
• These testers were heavily overloaded (and not that happy)!!
Excel File to Order Test Data
Features in company Y’s TDM-application
• Reads the Excel file and saves information in a database
• Checks validity of request
• Checks existence of keys (persons and insurance policies)
• Obtains secondary keys (e.g. policyperson)
• Resolves dependencies between applications
– e.g. EMOP,ASPA
• Checks if keys are already reserved by somebody else (conflicts)
– information on who is reserving them is shown
• Generates tasks for the generic tool and executes them
– instrumentation to monitor amount of CPU etc...
• Reserves the keys (bookkeeping)
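The key checks and the reservation bookkeeping in the list above could look roughly like this. It is a simplified sketch, not Company Y's implementation: the order is reduced to a list of keys, production and the reservation register are plain Python collections, and `process_request` is a hypothetical name.

```python
def process_request(requester, keys, production_keys, reservations):
    """Validate a Test Data order and reserve its keys, or report why not."""
    # Check existence of keys in production.
    missing = sorted(k for k in keys if k not in production_keys)
    # Check whether keys are already reserved by somebody else.
    conflicts = {
        k: reservations[k]
        for k in sorted(keys)
        if k in reservations and reservations[k] != requester
    }
    if missing or conflicts:
        # Nothing is reserved when the order cannot be delivered as a whole.
        return {"ok": False, "missing": missing, "conflicts": conflicts}
    for k in keys:
        reservations[k] = requester  # bookkeeping
    return {"ok": True, "missing": [], "conflicts": {}}


production_keys = {"P1", "P2", "P3"}
reservations = {"P2": "jill"}

rejected = process_request("jim", ["P1", "P2", "P9"], production_keys, reservations)
accepted = process_request("jim", ["P1", "P3"], production_keys, reservations)
print(rejected)
print(accepted, reservations)
```

Showing who holds a conflicting reservation, as the rejected result does here, lets the requester sort out the conflict without involving the operators.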
Benefits Company Y achieved
• Test Data integrity
• Speed and precision in Test Data deliveries
• Manual work decreased dramatically
• Bookkeeping
• Less dependence on individual persons
• Visibility, traceability
• Cost awareness, monitoring
Case: Company Z
• Company Z is a Pension insurance company that specializes in Labour Pension Insurances
• Company Z recently renewed nearly all of its application portfolio (billing, payouts, insurance, DW, actuary, extranet...)
• Large project with up to 150 persons involved
Case: Company Z
• Technical platforms: >5
• Kinds of DBMS's: 4
• Number of databases: ~20
• Number of tables: >1200
• Number of integration interfaces: ~100
• Number of batches: 150
• Number of online dialogs: 100
• Number of test cases: > 1400
Company Z: Life Cycle Tests
• 10 months' worth of batches were run on each person
• at a rate of about 7 min/month
• soft date was used to simulate time flow
• before/after comparisons were made
• A set of 30 persons with different profiles
• Extracted from converted data
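The soft date idea, letting batches read a settable date instead of the system clock so that months of processing can be run in minutes, can be sketched as below. The `SoftDate` class, the `monthly_batch` function and the person record are assumptions for the illustration; the start date is arbitrary.

```python
from datetime import date


class SoftDate:
    """A settable 'today' that batch code reads instead of the system clock."""

    def __init__(self, start):
        self.today = start

    def advance_month(self):
        # Move to the first day of the next month.
        index = self.today.year * 12 + (self.today.month - 1) + 1
        self.today = date(index // 12, index % 12 + 1, 1)


def monthly_batch(person, clock):
    """Stand-in for a real batch: it only records that the month was processed."""
    person["months_processed"] += 1
    person["last_run"] = clock.today


clock = SoftDate(date(2009, 1, 1))
person = {"id": "profile-01", "months_processed": 0, "last_run": None}
before = dict(person)  # snapshot for a before/after comparison

for _ in range(10):  # 10 months' worth of batches
    monthly_batch(person, clock)
    clock.advance_month()

print(before, person, clock.today)
```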
Case: Company Z
[Diagram: PROD, ACCEPTANCE, SYSTEST and DEV environments, with three TDS flows: TDS/Fast track (1-10 at a time), TDS/Restore and TDS/Baseline]
Case: Company Z
EXTRACT Process Report
Extract File : FLY.TDSRES.B1968.S004.EAF.SEXT.XF
Access Definition : TDS.EAF.EXTRACT
Created by : Job K87376, using SQLID K87376 on DB2 Subsystem DB2P
Time Started : 2009-04-05 23.40.08
Time Finished : 2009-04-06 00.06.43
Process Options:
Process Mode : Batch
Retrieve Data using : DB2
Limit Extract Rows : 40000000
RowList : 'FLY.TDSRES.B1968.S004.EAF.SEXT.PNS'
Total Number of Extract Tables : 120
Total Number of Extracted Rows : 4266656
Total Number of First Pass Start Table Rows : 12994
~0,5% of a total of 2M persons
Test Data Management - Summary
• Collect requirements from those who will use the Test Data (testing needs dictate)
• Analyze the application portfolio (enables minimizing of Test Data)
• Centralize (who is responsible?) and plan (roadmap)
• Use a toolset (let the tool take care of the dirty work)
• The TDM-application should be a part of the company’s ITIL and meet the requirements from development and release control
• Take care of monitoring (report about usage)
• Inform the organization about achieved benefits (internal success stories are the best sales arguments)
• Create a demand to use the TDM-application in the company (avoid forcing people to use it)
• Try to find persons that are on fire for this (it makes it a lot easier)!!!
• An area where high ROI can be expected!!!
• A well-working TDM-application can make the difference between success and disaster!!!