Page 1: Robertson.brent

B. Robertson, NASA GSFC, NASA Project Management Challenge 2011

Lessons Learned during Development of the Solar Dynamics Observatory

Brent Robertson

NASA Goddard Space Flight Center

MMS Deputy Project Manager

Previously, SDO Observatory Manager

Page 2: Robertson.brent

SDO Lessons Learned

• Lessons learned have been developed by looking at:

– Problems encountered during build and test of SDO

– Technical and Programmatic Issues reported by Project over the life of the Project

– Residual Risks accepted by Project

– On-orbit operational performance

Page 3: Robertson.brent

SDO Mission/Project Overview

• SDO is a NASA Category 1 mission, first Living With a Star (LWS) Mission, part of Sun-Earth Connection theme

• Characterize the dynamic state of the Sun enhancing the understanding of solar processes and space weather

• NASA GSFC:

– Managed the mission

– Built the S/C in-house

– Managed and integrated the instruments

– Developed/managed the Ground System and Mission Operations

– Performed Observatory environmental testing at GSFC

• Principal Investigators responsible for development of their Instrument & Science Operations Center

• Feb 2010 Atlas-V launch from KSC into GEO-Transfer Orbit (GTO), then circularized to GEO-Sync Orbit, inclined 28.5 degrees

• Design Drivers:

– Continuous high data rate/volume

– Geosynchronous orbit (mass to orbit, radiation)

– 5 year mission life

– Instrument pointing and stability

Page 4: Robertson.brent

SDO Investigations

Helioseismic and Magnetic Field Investigation (HMI, Stanford University):

HMI will observe "filtergrams" of the Sun which will be used to produce dopplergrams and magnetograms. Analysis of these measurements will allow us to understand the interior processes governing the transition from solar minimum to solar maximum, to probe the dynamics of the near-surface shear layer, to observe local strong flux regions before they reach the photosphere, and to measure the highly variable magnetic field.

EUV Variability Experiment (EVE, University of Colorado, Laboratory for Atmospheric and Space Physics):

EVE will specify the spectral irradiance with a sensitivity that allows us to gauge the energy input into the complex processes of the Earth's atmosphere and near-Earth space. Its temporal resolution will allow us, for the first time, to understand the flare-induced impacts on these processes.

Atmospheric Imaging Assembly (AIA, Lockheed-Martin Solar and Astrophysical Laboratory):

AIA will capture the initiation and progression of dynamic processes, with the spatial resolution necessary to understand their connection to the magnetic field and the spectral coverage to infer the processes at multiple temperatures.

Page 5: Robertson.brent

SDO images of the Sun in 10X HD resolution

Page 6: Robertson.brent

Solar Prominence Eruption captured by SDO

Page 7: Robertson.brent

SDO Project Milestones

[Figure: SDO project schedule chart, as of Feb. 11, 2010, spanning CY 2000 Q4 through CY 2010 Q1. Legend: progress bar, schedule reserve, original date.]

Chart rows: Mission Milestones (Formulation: Phase A, Phase B; Implementation: Phase C/D; Ops: 5 yrs); Inst. Development (AIA - LMSAL, HMI - SU, EVE - LASP); S/C Development & Integration; Observatory Test; Ground Systems Development; Launch Vehicle Development @ KSC; PCA Milestones.

Reviews and events marked on the chart include AO release and instrument selections, SRR/SCR, ICR, PDR, CR, CDR, PER, PSR, GS CDR, GS MOR, GS TRR, FORR, LRR, instrument ship dates, start of I&T, commissioning, and launch (2/10). The chart also marks an LV delay due to Atlas manifest issues.

SU = Stanford University; LMSAL = Lockheed Martin Solar Astrophysics Laboratory; LASP = Laboratory for Atmospheric and Space Physics

Page 8: Robertson.brent

SDO PR / PFR Process

• SDO Project utilized the on-line GSFC Problem Reporting / Problem Failure Reporting (PR/PFR) System

– Applied to Engineering Test Unit (ETU) testing (post acceptance), Flight box assembly & testing, Instrument testing (while resident at GSFC), Observatory integration & test, and associated GSE

• A PR is a minor anomaly, and must be described in detail and dispositioned before it can be closed

– PRs were typically procedural errors that could be closed in a short time period (days)

• A PFR is a more serious problem and is scrutinized by more people. It goes through a number of stages during problem investigation and resolution

– Fields required to be completed for each PFR include Problem Description, Subsystem, Disposition, Defect Cause, Cause of Problem, Action Taken, Residual Risk Determination

– Approval for closure required by Product Development Lead / Instrument Lead, Systems Engineer, Quality & Project

• Database of SDO PFRs provides lessons learned for future Projects

• SDO Project used a rigorous process to uncover, before launch, potential problems that might otherwise surface during flight
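The PFR closure rules on this slide (required fields plus four sign-offs) can be pictured as a simple record check. The following is a minimal sketch only: the class, field names, and approver labels are hypothetical renderings of the slide text, not the schema of the actual GSFC PR/PFR system.

```python
from dataclasses import dataclass, field

# Approver roles taken from the slide; the string labels are illustrative.
REQUIRED_APPROVERS = frozenset({
    "Product Development Lead / Instrument Lead",
    "Systems Engineer",
    "Quality",
    "Project",
})

# Hypothetical attribute names for the required fields listed on the slide.
REQUIRED_FIELDS = (
    "problem_description",
    "subsystem",
    "disposition",
    "defect_cause",
    "cause_of_problem",
    "action_taken",
    "residual_risk_determination",
)


@dataclass
class PFR:
    """Illustrative Problem Failure Report record (not the real GSFC schema)."""
    problem_description: str = ""
    subsystem: str = ""
    disposition: str = ""
    defect_cause: str = ""
    cause_of_problem: str = ""
    action_taken: str = ""
    residual_risk_determination: str = ""
    approvals: set = field(default_factory=set)

    def missing_fields(self):
        # A field counts as missing if it is empty or only whitespace.
        return [name for name in REQUIRED_FIELDS if not getattr(self, name).strip()]

    def missing_approvals(self):
        return sorted(REQUIRED_APPROVERS - self.approvals)

    def can_close(self):
        # Closure needs every required field and every required sign-off.
        return not self.missing_fields() and not self.missing_approvals()


example = PFR(
    problem_description="Heater cycling disturbed power bus",
    subsystem="Attitude Control",
    disposition="Fix",
    defect_cause="Design",
    cause_of_problem="Interaction between subsystems",
    action_taken="Flight Software change",
    residual_risk_determination="None",
)
example.approvals.update({"Systems Engineer", "Quality"})
print(example.can_close(), example.missing_approvals())
```

In this sketch the example report cannot be closed until the remaining two roles sign off, which mirrors the multi-party approval the slide describes.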

Page 9: Robertson.brent

SDO PFR History (1/2)

• 484 total PFRs generated during development and test of SDO

– Includes problems discovered with GSE, ETU and Flight

• Majority of PFRs written after start of I&T

• 28 PFRs written after shipment to launch processing site

– SDO spent more time at launch processing facility than originally planned while waiting for launch

Page 10: Robertson.brent

SDO PFR History (2/2)

• Project struggled to close PFRs until Observatory environmental testing was completed

• Increase in number of Open PFRs occurred after shipment to launch processing site

• Majority of PFRs were closed within ~9 months of being opened, but some remained open 2 years to monitor for recurrence, trending, or for final closure testing

Page 11: Robertson.brent

PFR Defect Cause

• Majority of SDO PFRs were caused by design error

– Most design errors were corrected by minor fixes such as Flight Software table updates and Ground Support Equipment rework

• Of 484 PFRs, 19 PFRs resulted in serious Project Issues and/or Residual Risk at launch

• Project Issues were mostly caused by design errors and workmanship issues

Page 12: Robertson.brent

Subsystem PFRs

• In general, complexity of Subsystem was related to number of PFRs discovered

• Electrical Ground Support Equipment (EGSE) had a disproportionately large number of PFRs

• Instrument issues were not captured in GSFC PFR System until Project took delivery

– Instrument problems resolved during build are not represented

• Most complex instruments (AIA, HMI) had most Project Issues / Residual Risk

Page 13: Robertson.brent

SDO Project Issues

• Project reported Issues on a monthly basis

• Programmatic Issues were more prevalent early in the Project mainly due to funding issues and late deliveries

• Technical issues occurred throughout Observatory integration and test

– Most technical issues were a result of PFRs after I&T start

• Additional technical issues were reported after shipment to the launch processing facility

Page 14: Robertson.brent

High Cost Technical Issues

• During SDO development, 19 “high cost” technical issues were encountered

– PFRs requiring rework of Flight hardware; PFRs whose resolution held up I&T; PFRs resulting in Residual Risk

• Over half of all technical issues were discovered during Observatory test

– Many issues were due to “interactive complexity” that could only be discovered after system integration

• Most PFRs resulting in Residual Risk were a result of Instrument issues discovered during Observatory test

Problem | Problem Discovered During | Defect Cause | Impact
Pyrovalve did not actuate | Bench testing | Design: Simultaneous ignition results in failure | Operational fix; Lengthy investigation diverted resources
Leaking Fill & Drain Valves | Subsystem Acceptance Test | Vendor workmanship | Valve leak fixed after Propulsion System delivery
Ka Band Transmitter Build Issue | Bench testing | Design: AlSi housing subject to cracking | Refab of housing delayed Ka Transmitter delivery
HMI Processor Reset Anomaly | Instrument Thermal Vac Test | Design: Interaction of PS turn on transient | Use as Is; Residual risk
Star Tracker Reset Anomaly | Component Thermal Vac Test | Random Part Failure | Return to vendor for repair
ACE LPSC anomaly | Observatory Test | Vendor workmanship of hybrid part | De-integration of box from Observatory, replacement of card
AIA/HMI LVDS ESD Issue | Subsystem Integration | Operator Error | De-integration of box from Observatory for repair; Residual risk
HGAS Over-temperature Bake-out | Subsystem Thermal Bakeout | Procedure | Rebuild subsystem with spare parts
EVE Filter Crack | Observatory Inspection | Design: Thin film sensitive to vibration | De-integration from Observatory for repair
IRU Heater Cycling | Observatory Test | Design: System impact of 100 Hz heater cycling | Change to Flight Software
DSS Shorted Diode | Observatory Test | Random Part Failure | De-integration from Observatory; Return to vendor for repair
S-Band Receiver Lock Issue | Observatory Test | Vendor workmanship | De-integration from Observatory; Return to vendor for repair
LPSC Design Flaw | Box Acceptance Test | Design: Low marginal to chassis short | De-integration of multiple boxes from Observatory for repair
HMI Image Corruption | Observatory Test | Design: Rare data corruption | Use as Is; Residual risk
AIA Image Corruption | Observatory Test | Design: Rare data corruption | Software patch; Lengthy investigation diverted resources
HMI LED Intensity Trend | Observatory Thermal Vac Test | Unknown | Use as Is; Residual risk
PSE Oscillator | Post Ship Observatory Test | Workmanship | Use as Is; Residual risk
SDO AIA HOPA Circuit Anomaly | Post Ship Observatory Test | Workmanship | Use as Is; Residual risk
SDO Spacecraft Box Grounding Issue | Post Ship Observatory Test | Design: Cold flow of grounding material | Investigation diverted resources late in I&T flow
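The "over half" statement on this slide can be checked against the "Problem Discovered During" column of the table. The tally below is a rough sketch, not the author's own analysis. It assumes that every phase whose name contains "Observatory" (including Observatory Inspection and Post Ship Observatory Test) counts as Observatory-level test; a narrower definition would give a smaller share.

```python
# "Problem Discovered During" entries for the 19 high-cost issues, in table order.
discovered_during = [
    "Bench testing",
    "Subsystem Acceptance Test",
    "Bench testing",
    "Instrument Thermal Vac Test",
    "Component Thermal Vac Test",
    "Observatory Test",
    "Subsystem Integration",
    "Subsystem Thermal Bakeout",
    "Observatory Inspection",
    "Observatory Test",
    "Observatory Test",
    "Observatory Test",
    "Box Acceptance Test",
    "Observatory Test",
    "Observatory Test",
    "Observatory Thermal Vac Test",
    "Post Ship Observatory Test",
    "Post Ship Observatory Test",
    "Post Ship Observatory Test",
]

# Assumption: any phase naming the Observatory is treated as Observatory-level test.
observatory_level = [phase for phase in discovered_during if "Observatory" in phase]

total = len(discovered_during)
share = len(observatory_level) / total
print(f"{len(observatory_level)} of {total} issues ({share:.0%}) found at Observatory level")
```

Under that assumption, 11 of the 19 issues fall in Observatory-level phases, which is consistent with the slide's claim.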

Page 15: Robertson.brent

Why did these Costly Issues Occur? (1/2)

• The number of HMI and AIA instrument PFRs was related to the complexity of their science camera high speed bus operation

– Ground System changes were made during I&T in order to better evaluate instrument performance

– HMI and AIA were under more schedule pressure to deliver to I&T than other efforts

• In some cases, design errors arose when one subsystem's design adversely impacted another subsystem without the team realizing it, due to interactive complexity

– The Attitude Control Subsystem IRU heater cycling frequency was found to have an adverse impact on the Power Subsystem operation

– Mechanical/Thermal Subsystem use of Choseal proved to be inadequate for Electrical Subsystem grounding of boxes

• Some issues with vendor components could not be discovered until after their integration and test into the Subsystem or Observatory

– Digital Sun Sensor electronics issue (shorted diode) was not discovered until DSS was powered by a fully redundant Attitude Control Electronics (ACE) on Observatory

– S-Band Transponder workmanship issue (incorrect number of windings on an inductor) was not discovered until Observatory Thermal Vac when it experienced loss of lock in a narrow temperature range

– Fill and Drain valve issue (leak due to workmanship / contamination) was not found until Propulsion Subsystem proof test

• Interactive complexity caused issues that were difficult to find until after system integration

Page 16: Robertson.brent

Why did these Costly Issues Occur? (2/2)

• Although ETUs were built and mitigated much risk, some design errors were not discovered until Flight build

– Problems with Ka Transmitter Al-Silica enclosure (cracking and chipping) were not discovered until Flight assembly

– Low Power Switch Card design flaw (marginal board via to heat sink short) was not discovered until buildup of a Flight spare LPSC

• In some cases, complacency was a factor

– Many bake-outs had been performed on SDO hardware with no issue

– The SDO High Gain Antenna Subsystem (HGAS) was subjected to a damaging over-temperature condition because facility control software for test heaters was inadvertently left off and nobody noticed

– SDO Project increased oversight of flight hardware in test after this occurred

Page 17: Robertson.brent

SDO On-Orbit Performance: A Measure of Success

• SDO on-orbit performance continues to exceed science requirements

• Two issues that eluded ground testing were discovered and resolved during on-orbit checkout

– Fuel slosh interaction with on-orbit Fault Detection & Correction

– Gyro bias wandering in zero G

• After 10 months of on-orbit operations, 10 low severity anomalies have occurred

– No hardware failures have occurred

• No issues have occurred as a result of residual risks accepted by Project at launch

SDO On-Orbit Anomalies through Dec 3, 2010 (All Low Severity)
SDO-0001: (ground system antenna) SDO1 Man HPA Faults
SDO-0002: TSM 282 Failed on Threshold 3 Prior to Separation
SDO-0003: Oscillations Within Spacecraft Attitude in Inertial Mode
SDO-0004: RTSs in RAM not Checksummed
SDO-0005: AMF2 Burn Aborted - Fuel Slosh
SDO-0006: AIA Unable to Jam Instrument Clock
SDO-0007: ST2 Low Data Quality Flag
SDO-0008: Adjusted PSEB Battery State of Charge calculation onboard
SDO-0009: (ground system antenna) SDO1 S-Band Tracking Receivers Bad
SDO-0010: EVE MEGS-B channel degradation

Page 18: Robertson.brent

Looking Back: SDO Strengths (1/2)

• ETUs were invaluable

– ETUs mitigated Flight manufacturing and technical risk

– Power Subsystem Electronics (PSE) ETU and Ka Transmitter ETU were utilized on Observatory to keep I&T moving while Flight Units were built

• We had the right balance between Systems, Quality Assurance and Project

– Prevalent in all processes such as Risk Management, PFR Closure and Configuration Control Board Meetings

• We had the right people doing the job early in the Project

– We had a very strong team from the beginning and we got stronger as we brought more people on

• We had very good working relationships

– Communication was good even though we were a large team

– The Project leadership was strong

– We complemented each other’s strengths

– The science goal was compelling and understandable

– There was a high level of commitment to the mission from everyone.  Issues were not hidden.

– We had fun

– There was little infighting, bickering or other destructive behavior

Page 19: Robertson.brent

Looking Back: SDO Strengths (2/2)

• We had a good technical understanding of our deliverables

– Although we didn’t build the instruments, we understood them well enough and as a result were able to better deal with issues at the Observatory level.

• We listened to our Product Development Lead engineers when they asked for something beyond scope

• We utilized a Management Information System (MIS) that allowed 24/7 availability of documentation, pictures and information

– This increased efficiency, especially in I&T

– Pictures proved invaluable during investigation of some anomalies

• We had a co-located team during I&T

• We had strong Project Support in Configuration Management, Resources, Scheduling, and Property

– We valued their opinion, gave them a say, and maybe that’s why we had a strong team

• We handled distractions well

– Knowing another in-house Project had priority over us could have been a bigger distraction but we did not waste much time on it

Page 20: Robertson.brent

Looking Back: SDO Technical Lessons Learned (1/2)

• Expect unexpected problems that will use up cost and schedule reserve

– We never expected that we would damage our High Gain Antenna Subsystem (HGAS) in TV Bakeout so badly that we would need to scrap and rebuild it with spare parts

– It was not expected that we would need to de-integrate nine different boxes from the Observatory for repair during SDO I&T

• Common products used in multiple subsystems are both a blessing and a curse

– Multiple users allow discovery of potential marginal issues thereby enabling reliability growth

– Consequence of a design error is multiplied by the number of users of the common design

– SDO used a common Low Power Switch Card (LPSC) which was found to have a design error on the spare unit, the 9th build. This was very costly because it required de-integration of 5 boxes to fix; however, it surfaced a potentially disabling short to ground.

• Pay more attention to Electrical Ground Support Equipment

– EGSE had more PFRs than most Subsystems / Instruments

– Resolution of EGSE problems took up valuable I&T time

• Do not underestimate the difficulty of getting EEE parts

– We struggled with hybrid parts issues (Solid State Power Converters) and our parts procurement process

Page 21: Robertson.brent

Looking Back: SDO Technical Lessons Learned (2/2)

• Pay more attention to verifications by analysis

– SDO discovered a costly High Gain Antenna blockage design error shortly after PDR

– We thought we had three different requirement verification analyses but in fact they all used the same STK model which had an error

• Do not underestimate the difficulty of developing new technology

– We knew the Ka Transmitter effort was going to be difficult, but still encountered multiple issues despite our best plans

– Ka Transmitter effort experienced most cost growth of any subsystem and was the last component to be delivered to I&T

• Don’t do anything for the first time while away from home

– Problems were encountered when installing the flight battery for the first time at the launch processing facility which would have been resolved more quickly if discovered at GSFC

• Beware of interactive complexity; use a thorough “Test as You Fly” approach

Page 22: Robertson.brent

Looking Back: Programmatic SDO Lessons Learned

• Get your ETUs built and tested by CDR

– SDO did not achieve this; we thought we could make up time between CDR and start of I&T but were wrong

• Realize that you are taking your eye off the ball if you are working a big issue

– We spent much time solving a High Gain Antenna blockage design error after PDR

– Its resolution took our eye off the ball on some things

• Never lose your launch slot

– In Oct 2007, SDO lost its Aug 2008 launch slot when LRD slipped by 4 months to Dec 2008 due to late spacecraft and instrument deliveries to I&T

– SDO’s wait for launch until Feb 2010, due to launch vehicle manifest issues, was very costly

• Be diligent in closing paperwork as soon as possible

SDO Launch, Feb 11, 2010

Page 23: Robertson.brent

Summary

• SDO Project developed a technically challenging observatory relying on new technology to accomplish its mission in a threatening GEO environment

• GSFC in-house flight project was led by a strong Project, Systems and QA team

• SDO Project successfully used a rigorous PR/PFR process to surface potential problems prior to flight

• SDO Project encountered programmatic and technical issues that were effectively managed with its risk management process

• Unexpected problems will always occur despite a Project’s best plans

• “SDO will change our understanding of the sun and its processes, which affect our lives and society. This mission will have a huge impact on science, similar to the impact of the Hubble Space Telescope on modern astrophysics.”

Post launch quote, Dick Fisher, director of the Heliophysics Division at NASA Headquarters