PowerCenter Upgrades with Zero Downtime Presented by Greg Wade & Ed Wagner, Lockheed Martin
Jan 26, 2015
Topics
• About Us
• Our Environment
• Requirements
• Approach
• Development / Test Transition
• Emergency Release Strategy
• Lessons Learned
• Q&A
© 2013 Lockheed Martin Corporation. All Rights Reserved.
About Us
• Who we are
• Greg Wade, Information Systems Architect
• Ed Wagner, ETL Lead Developer
• What we do
• Build large scale active data warehouses with near real time data loads and high availability for Department of Defense (DoD) customers
• Where we work
• Lockheed Martin IS&GS, a leading federal services and information technology contractor
Our Environment
[Diagram: Input, Data Acquisition, Staging (Oracle), EDW (Teradata), ETL]
• ETL server: SPARC Enterprise M5000
• 144 GB RAM
• Oracle Solaris 10 OS
• Informatica toolsets: PowerCenter, B2BDT, Metadata Manager
Excess ETL server and repository capacity needed to perform a Zero Downtime Upgrade.
Our Environment
[Diagram: environment overview]
• Informatica PowerCenter: 1 Domain Per 6 Environments
• 7 Integration Services, 24 x 7 Operations
• Data Acquisitions: ~125K (Flat Files, XML)
• Staging: Oracle
• Operational Data Store: Oracle
• EDW: Teradata
• TPT Files: ~65K
• Oracle Change Files
• 400+ Workflows
• Multiple Sessions (200+ Trans)
• Initialize Command Tasks
• Finalize Command Tasks
Requirements
• Develop a reusable upgrade strategy
• Version 8.6.1 HF 11 to 9.1 HF 3 … and future versions
• Reusable for major upgrades … not hot fix installation
• Orchestration of upgrade in six environments
• Minimize / eliminate production system outage
• Zero data loss
• Complete regression and performance testing required prior to production install
Production must be upgraded while it is running, with a fully tested process and baseline.
Requirements
• Complete or partial roll-back must be supported
• Must be able to support any Production Emergency Release (ER) contingency during any point in the upgrade lifecycle
• Test and development of ongoing products had to continue uninterrupted
Development, test, and deployment on both versions must be possible for the length of the upgrade.
Approach – Plan, Development, Testing, Production
In Place Upgrades
• “Go Big or Go Home” Approach
• Pros
• Complex orchestration not required
• Can be accomplished quickly
• Preserves history (statistics, logs, etc.)
• Cons
• Roll-back is complex
• Requires production downtime
• Baseline differences in each environment may produce unpredictable results
• May lose SCM-controlled baseline
Too much risk to SLAs … we were scared!
High Level Approach
Dual installations not possible under Windows.
• Dual instances of 8.6 and 9.1 stood up in each environment
• Production baseline migrated from external CM (8.6 to 9.1)
• Regression and performance testing performed before Go-Live
• Data sources migrated one at a time
• 9.1 objects migrated between environments via XML
Start Date: Oct-2011
Complete Date: Mar-2012
Approach – The Upgrade Begins
• Install 9.1 in development environment
• Clean install … DO NOT UPGRADE 8.6
• Create a second PowerCenter repository
• Load SCM controlled production baseline into 9.1
• XML object migration from 8.6 to 9.1
• Only possible when there are no repository changes between versions
• When the repository changes between versions, a same-version migration followed by an upgrade is required
• Authoritative dev baseline in 8.6
Before you proceed, make sure you are licensed to install multiple repositories.
Approach – The Upgrade Begins
• Created second Oracle schema
• v8.6.1 – IN86_RPS
• v9.1.0 – IN91_RPS
• Ensure sufficient file system space available for two complete PowerCenter installations
• Use symbolic links
• Change installation directory
• Update UNIX Profile
• Created a script to source the v8 instead of the new default v9 profile
• INFA_HOME
• IFCMPath (Content Manager / B2BDT)
• PATH / CLASSPATH
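The profile switch described above can be sketched roughly as follows. This is a minimal illustration, not the presenters' script: the install directories and the `server/bin` layout are hypothetical, and only INFA_HOME and PATH are handled here (IFCMPath and CLASSPATH would be switched the same way).

```python
import os

# Hypothetical install locations; substitute the real directories.
INSTALLS = {
    "8.6.1": "/opt/informatica/8.6.1",
    "9.1.0": "/opt/informatica/9.1.0",
}


def build_env(version, base_env=None):
    """Return a copy of the environment that points at one PowerCenter install."""
    if version not in INSTALLS:
        raise KeyError(f"unknown PowerCenter version: {version}")
    env = dict(os.environ if base_env is None else base_env)
    home = INSTALLS[version]
    # Drop every PATH entry that belongs to any PowerCenter install so the
    # two versions never mix binaries, then prepend the selected one.
    kept = [
        entry
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry and not any(entry.startswith(h) for h in INSTALLS.values())
    ]
    env["INFA_HOME"] = home
    env["PATH"] = os.pathsep.join([os.path.join(home, "server", "bin")] + kept)
    return env
```

The returned mapping could be passed as `env=` to `subprocess.run` when a command has to run against the 8.6 installation while 9.1 is the default profile.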
Better space allocation during next HW refresh will make future upgrades easier.
Open new ports in host based firewalls and any infrastructure firewalls.
Approach – The Upgrade Begins
• Established second set of ports
• PowerCenter Web-based Administration Console - SSL
• v8.6.1 – XXXX
• v9.1.0 – ZXXX
• Gateway Host
• v8.6.1 – YYYY
• v9.1.0 – ZYYY
• Remember to target new ports during Server Installation
• Install new GUI client tools, configured to use new Gateway ports
• Old and new client tools available concurrently to monitor and correct issues with either existing or new installation
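Before installing, it may help to confirm that the two port sets do not overlap. The sketch below is one possible check; the service names and port numbers are hypothetical placeholders, not the values used in this project.

```python
def find_port_conflicts(old_ports, new_ports):
    """Return {port: [(install, service), ...]} for ports claimed more than once."""
    claims = {}
    for install, ports in (("8.6.1", old_ports), ("9.1.0", new_ports)):
        for service, port in ports.items():
            claims.setdefault(port, []).append((install, service))
    return {port: users for port, users in claims.items() if len(users) > 1}


# Hypothetical port numbers, for illustration only.
old = {"admin_console_ssl": 8443, "gateway": 6005}
new = {"admin_console_ssl": 9443, "gateway": 6005}  # gateway clashes on purpose
conflicts = find_port_conflicts(old, new)
```

An empty result means each service has its own port in both installations.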
Approach – The Upgrade Begins
• Resolve any import problems
• None were encountered from 8.6 -> 9.1
• Validate a sampling of data exchanges for functionality
• Export version 9.1 mappings and workflows and place under SCM
Initial baseline of 9.1 configuration complete
Approach – Regression Testing
• Clean install of Version 9.1
• Content imported from SCM
• Baseline created in development
• Authoritative dev baseline remains in 8.6
9.1 installed into test and ready for regression and performance testing.
Approach – Regression Testing
• Environment
• Do both server versions successfully start up?
• Validates dual profiles
• How to select what to test
• Test it all?
• Select a representative subset?
• What to look for when you test
• Rows in/out
• Do the rows match?
Approach – Regression Testing
Columns: Name | Workflows | Mapping | Command Task | Logging | Parameter Files | Source/Target Files | Connections | Joiners | Expressions | Update Strategies | Aggregators
wf_A.XML: ✓ ✓ ✓ ✓ ✓ ✓ ✓ N/A ✓
wf_B.XML: ✓ X ✓ ✓ ✓ ✓ ✓ ✓ N/A
wf_C.XML: X ✓ ✓ ✓ ✓ ✓ ✓ N/A N/A
SELECT DISTINCT
    REP_WORKFLOWS.SUBJECT_AREA,
    REP_WORKFLOWS.WORKFLOW_NAME,
    OPB_TASK_INST_RUN.INSTANCE_NAME,
    OPB_TASK_INST_RUN.TASK_ID,
    OPB_TASK_VAL_LIST.PM_VALUE,
    OPB_TASK_VAL_LIST.EXEC_ORDER
FROM REP_WORKFLOWS REP_WORKFLOWS,
    OPB_TASK_INST_RUN OPB_TASK_INST_RUN,
    OPB_TASK_VAL_LIST OPB_TASK_VAL_LIST
WHERE UPPER(SUBJECT_AREA) IN ('FOLDER_A', 'FOLDER_B', 'FOLDER_C')
    AND REP_WORKFLOWS.WORKFLOW_ID = OPB_TASK_INST_RUN.WORKFLOW_ID
    AND OPB_TASK_INST_RUN.TASK_ID = OPB_TASK_VAL_LIST.TASK_ID
    AND OPB_TASK_INST_RUN.INSTANCE_NAME NOT IN ('Start', 'tmr_Wait')
    AND OPB_TASK_VAL_LIST.PM_VALUE NOT IN ('$$cmd_a', '$$cmd_b $PMWorkflowRunId', '$$cmd_c')
ORDER BY REP_WORKFLOWS.SUBJECT_AREA;
Review subset of workflows
Approach – Performance Testing
• Identical test datasets were repeatedly run against both PowerCenter versions using identical SCM-sourced components
• Intentional backlog created an opportunity to measure all affected system component behaviors under stressed conditions
• Created and ran a Unix script to capture real-time ETL Server memory / CPU utilization
• Executed Repository queries
• Determine workflow run times
• Input and output rows per second
• Compared all results from both versions to assess scope and depth of potential differences
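The rows-per-second comparison above might look something like the sketch below. It assumes the run statistics have already been pulled from each repository into plain Python structures; the function names, the input shape, and the 20% tolerance are illustrative assumptions, not the presenters' method.

```python
from statistics import median


def rows_per_second(rows, seconds):
    """Throughput of one run; zero-length runs count as 0 rows/sec."""
    return rows / seconds if seconds > 0 else 0.0


def compare_throughput(runs_old, runs_new, tolerance=0.20):
    """Flag workflows whose median rows/sec dropped by more than `tolerance`.

    runs_old / runs_new: {workflow_name: [(rows, seconds), ...]}
    Returns {workflow_name: (old_median, new_median)} for the slower ones.
    """
    slower = {}
    for wf in sorted(runs_old.keys() & runs_new.keys()):
        old = median(rows_per_second(r, s) for r, s in runs_old[wf])
        new = median(rows_per_second(r, s) for r, s in runs_new[wf])
        if old > 0 and (old - new) / old > tolerance:
            slower[wf] = (old, new)
    return slower
```

Using the median across repeated runs keeps a single slow run, for example one caused by other test environment activity, from flagging a workflow on its own.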
Approach – Repository SQL Review
• Executed SQL against repository tables in both 8.6 and 9.1
• Regression testing by comparing input_rows, output_rows, failed_rows, and trans_errs
• Row counts, failures, and errors should be the same in 8.6 and 9.1
• Depending on your design, some failures and errors may be “normal”
• Performance testing by comparing number of runs and runtimes
• Runtimes will not match exactly; look for outliers
• Input and output rows per second provide a good picture of performance
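The regression comparison described above can be sketched as a simple diff of the four counters. This assumes the repository query results for each version have already been loaded into dictionaries keyed by session name; that structure and the function name are assumptions for illustration.

```python
COUNTERS = ("input_rows", "output_rows", "failed_rows", "trans_errs")


def diff_counters(stats_old, stats_new):
    """List every counter that differs between the 8.6 and 9.1 runs.

    stats_old / stats_new: {session_name: {counter_name: value}}
    Returns tuples of (session, counter, old_value, new_value). A session
    present on only one side is reported once with counter "missing".
    """
    mismatches = []
    for session in sorted(stats_old.keys() | stats_new.keys()):
        old = stats_old.get(session)
        new = stats_new.get(session)
        if old is None or new is None:
            mismatches.append((session, "missing", old is not None, new is not None))
            continue
        for counter in COUNTERS:
            if old.get(counter) != new.get(counter):
                mismatches.append((session, counter, old.get(counter), new.get(counter)))
    return mismatches
```

Failures and errors that are "normal" for a given design still match across versions, so they do not show up in the result.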
Approach – Testing
• Test results reviewed with engineering team
• GO / NO GO Decision point
Validation of 9.1 configuration complete
Are we ready for production?
Approach – Ready for Production?
• 9.1 baseline does not match current production baseline
• Development and Testing of new projects continued in 8.6 while testing 9.1
Baseline merges need to be performed.
Approach – Ready for Production?
• Resync 9.1 Baseline
• Migrate Development to 9.1
• Begin testing new releases on 9.1
• Production deployment
Production changes need to be minimized during final resync
Approach – Install Production Content
During content migration ensure 8.6 infrastructure is not modified.
• 8.6 processes production workload.
• Final 9.1 baseline loaded.
• 8.6/9.1 configured to use same file system store.
• 8.6/9.1 configured to use same target DB.
Approach – Interface Cutover
DO NOT run a workflow simultaneously in 8.6 and 9.1.
• Workflows cut over one at a time or in groups by Integration Service (IS)
• Cutover performed manually
• Decided not to automate
• Rollback by moving interface back to 8.6
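Because both versions share the same file system store and target DB, a quick check that no workflow is scheduled on both sides can guard against the double-run warning above. A minimal sketch, assuming the lists of scheduled workflow names have been gathered from each installation by some other means:

```python
def find_double_scheduled(scheduled_86, scheduled_91):
    """Return workflow names that are scheduled in both 8.6 and 9.1."""
    return sorted(set(scheduled_86) & set(scheduled_91))


def safe_to_schedule_in_91(workflow, scheduled_86):
    """A workflow may be scheduled in 9.1 only once it is unscheduled in 8.6."""
    return workflow not in set(scheduled_86)
```

The same check works in reverse for a rollback, where an interface moves back to 8.6.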
Approach – Interface Cutover
• Risk reduction
• Selecting the best time
• Having support on hand
• Internal
• Oracle Database Administrators / Unix System Administrators
• Ops Team aware of, and involved in, the Production upgrades
• Corporate and Customer Management – Available to make critical go / no-go decisions as issues were identified and actioned
• External – Informed Informatica Support that the major upgrade was ongoing and that time-sensitive support may be needed
• The balance between tracking down problems ourselves and getting the Pros involved early was weighted more heavily towards Informatica Support than usual
• All SRs were dispositioned quickly and effectively
Approach – Interface Cutover
• Checklist – Repeated for each Workflow
• Unschedule Workflow in 8.6
• Wait for Workflow to complete, if running
• Disable operational alerts for 8.6 workflow
• Remove IS from Workflow in 8.6
• Record sequencer values for 8.6 workflow
• Assign workflow to IS in 9.1
• Reset sequencer values in 9.1 workflow
• Schedule workflow in 9.1
• Enable operational alerts for 9.1 workflow
• Validate successful completion of workflow
Steps may vary slightly in your environment (external schedulers, etc.).
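The cutover itself was performed manually, but the per-workflow checklist lends itself to a small tracker that enforces the step order. The sketch below is one hypothetical way to do that; it records steps a person has completed and does not drive PowerCenter in any way.

```python
from dataclasses import dataclass, field

# Steps taken from the cutover checklist, in the order they must be done.
CUTOVER_STEPS = (
    "Unschedule workflow in 8.6",
    "Wait for workflow to complete, if running",
    "Disable operational alerts for 8.6 workflow",
    "Remove IS from workflow in 8.6",
    "Record sequencer values for 8.6 workflow",
    "Assign workflow to IS in 9.1",
    "Reset sequencer values in 9.1 workflow",
    "Schedule workflow in 9.1",
    "Enable operational alerts for 9.1 workflow",
    "Validate successful completion of workflow",
)


@dataclass
class CutoverChecklist:
    workflow: str
    done: list = field(default_factory=list)

    def next_step(self):
        """Return the next outstanding step, or None when all are done."""
        if len(self.done) < len(CUTOVER_STEPS):
            return CUTOVER_STEPS[len(self.done)]
        return None

    def complete(self, step):
        """Mark a step done; refuse steps that are out of order."""
        expected = self.next_step()
        if expected is None:
            raise ValueError(f"{self.workflow}: checklist already finished")
        if step != expected:
            raise ValueError(f"{self.workflow}: expected {expected!r}, got {step!r}")
        self.done.append(step)

    def finished(self):
        return len(self.done) == len(CUTOVER_STEPS)
```

Rejecting out-of-order steps means a workflow cannot be marked as scheduled in 9.1 before it has been unscheduled and removed from its IS in 8.6.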
Development / Test Transition
Final Dev and Test Migration
• v8.6.1 was no longer needed to support Emergency Releases – Production at v9.1
• v8.6.1 Development and Test work complete; snapshots checked into SCM
• v8.6.1 Repository build complete; Import into v9.1 Repository
• Goodbye v8.6.1 and Hello v9.1 in all environments!
Emergency Release Strategy
Approach – Emergency Releases
• Development PowerCenter components were maintained in both the new and old Repository formats
• Required fixes were made in v8.6.1 until successful testing was completed
• Changes were made in the v8.6.1 version and propagated through the upstream environments for testing and Production deployment
• Final version was also imported into v9.1 Repositories
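Since each emergency fix had to land in both repository formats, a simple record of which baseline has received which change can catch omissions. The sketch below is illustrative only; the change identifiers and the input structure are hypothetical.

```python
BASELINES = ("8.6.1", "9.1.0")


def missing_baselines(changes):
    """Report emergency changes that have not reached every baseline.

    changes: {change_id: set of baseline versions where it was applied}
    Returns {change_id: [baselines still missing the change]}.
    """
    gaps = {}
    for change_id, applied in changes.items():
        missing = [b for b in BASELINES if b not in applied]
        if missing:
            gaps[change_id] = missing
    return gaps
```

For example, a change applied and tested in v8.6.1 but not yet imported into the v9.1 repositories is reported with `["9.1.0"]` as its missing baseline.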
Lessons Learned
• Regression Testing Results may be surprising
• Direct comparison of ETL runs may reveal problems not caused by the upgrade.
• Bad mapping logic produced nondeterministic results
• Performance Testing
• Many factors not part of the upgrade may impact performance results
• Other test environment activity (DB, Network)
• HW configuration issues
• New and old PowerCenter versions may require different OS tuning
• System may need to be reconfigured between runs
• More system resources may need to be added
Lessons Learned
• Production Cut Over
• Requires large manual effort
• Best to work in pairs to avoid mistakes
• Can be done in multiple live environments without user impact … they never noticed!
• Have engineering, SA, DBA support on hand
• Development and Test
• Extra coordination required across projects
• Are all changes made to the correct baseline?
• Emergency changes need to be made in multiple baselines
• Ensure all developers have both sets of development tools installed and functioning early
• Create additional Integration Service names due to PM locks
Lessons Learned
• The upgrade was considered a success by all stakeholders
• Reach out to the pros at Informatica when things happen that you don't understand
• Would we do it again?
PowerCenter can successfully be upgraded in a production environment with ZERO downtime.
YES!
Questions and Answers