Raising the Bar: Equipping Systems Engineers to Excel with DOE
Presented to: INCOSE Luncheon, September 2009
Greg Hutto, Wing Ops Analyst, 46th Test Wing
[email protected]
Plan
                          In Front                In Back
                          Face East  Face West    Face East  Face West
Eyes Open    Left Hand    0.43       0.58         0.52       0.40
             Right Hand   0.62       0.29         0.28       0.36
Eyes Closed  Left Hand    0.62       0.57         0.47       0.40
             Right Hand   0.42       0.26         0.42       0.47
Test design is not an art … it is a science
Talented scientists in T&E Enterprise; however … limited knowledge in test design … alpha, beta, sigma, delta, p, & n
Our decisions are too important to be left to professional opinion alone … our decisions should be based on mathematical fact
53d Wg, AFOTEC, AFFTC, and 46 TW/AAC experience
Teaching DOE as a sound test strategy is not enough
Leadership from senior executives (SPO & Test) is key
Purpose: DOD adopts experimental design as the default approach to test, wherever it makes sense
Exceptions include demos, lack of trained testers, no control
Background -- Greg Hutto
B.S. US Naval Academy, Engineering - Operations Analysis
M.S. Stanford University, Operations Research
USAF Officer -- TAWC Green Flag, AFOTEC Lead Analyst
Consultant -- Booz Allen & Hamilton, Sverdrup Technology
Mathematics Chief Scientist -- Sverdrup Technology
Wing OA and DOE Champion – 53rd Wing, now 46 Test Wing
USAF Reserves – Special Assistant for Test Methods (AFFTC/CT) and Master Instructor in DOE for USAF TPS
Practitioner, Design of Experiments -- 19 Years
Green Flag EW Exercises ‘79
F-16C IOT&E ‘83
AMRAAM, JTIDS ‘84
NEXRAD, CSOC, Enforcer ‘85
Peacekeeper ‘86
B-1B, SRAM ‘87
MILSTAR ‘88
MSOW, CCM ‘89
Joint CCD T&E ‘90
1994-1996 - Bare Base Study led to a 50% reduction in weight and cost, and improved sustainability, through COTS solutions
$20,000, 60,000 BTU/hr (5 ton) crash-survivable, miniaturized air conditioner replaced by 1 Window A/C - $600
Google Specs: Frigidaire FAM18EQ2A Window Mounted Heavy Duty Room Air Conditioner, 18,000/17,800 BTU Cool, 16,000 BTU (Heat), approximately 1,110 sq. ft. cool area, 9.7 EER, 11" max. wall thickness, $640.60
Overview
Four Challenges – The 80% JPADS
3 DOE Fables for Systems Engineers
Policy & Deployment
Summary
Systems Engineering Challenges
Life-cycle stages: Concept; Product & Process Design; Production; Operations; Decommission/End-use
Methods across the life cycle:
Requirements - Quality Function Deployment (QFD)
AoA - Feasibility Studies
Process Flow Analysis
Designed Experiments – Improve Performance, Reduce Variance
SPC – Defect Sources
Supply Chain Management
Lean Manufacturing
Statistical Process Control (Acceptance Test)
Serviceability
Availability
Maintainability
Reliability
Failure Modes & Effects Analysis (FMEA)
Robust Product Design
Simulation
Historical / Empirical Data Analysis
Systems Engineering Simulations of Reality
At each stage of development, we conduct experiments
Ultimately – how will this device function in service (combat)?
Simulations of combat differ in fidelity and cost
Differing goals (screen, optimize, characterize, reduce variance, robust design, trouble-shoot)
Same problems – distinguish truth from fiction: What matters? What doesn’t?
[Diagram: acquisition phase versus simulation of reality]
Acq Phase: Reqt's Dev, AoA, Concepts, Risk Reduction, EMD, Prod & Mfr, Sustain
Simulation of Reality: M&S; Hardware; System/Flight Test
Other labels: Warfare, Physics, HWIL/SIL, Captive, Subsystem, Prototype, Prod Rep, Production
Industry Statistical Methods: General Electric
Fortune 500 ranked #5 in 2005 - Revenues
Global Most Admired Company - Fortune 2005
America’s Most Admired #2, World’s Most Respected #1 (7 years running), Forbes 2000 list #2
2004 Revenues $152 B
Products and Services
Began in late 1980’s facing foreign competition
1998, Six Sigma Quality becomes one of three company-wide initiatives
‘98 Invest $0.5B in training; Reap $1.5B in benefits!
“Six Sigma is embedding quality thinking - process thinking - across every level and in every operation of our Company around the globe” 1
“Six Sigma is now the way we work – in everything we do and in every product we design” 1
1 Jack Welch - General Electric website at ge.com
What are Statistically Designed Experiments?
Purposeful, systematic changes in the inputs in order to observe corresponding changes in the outputs
Results in a mathematical model that predicts system responses for specified factor settings
Responses = f(Factors)
PROCESS: Air-to-Ground Munitions
INPUTS (Factors): Altitude, Delivery Mode, Weapon type, Impact Velocity, Impact Angle
OUTPUTS (Responses): Miss Distance, Impact Velocity Delta, Impact Angle Delta
Noise: weather, training, TLE, launch conditions
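As a minimal sketch of what "a mathematical model that predicts system responses for specified factor settings" can look like, the Python below runs a two-level full factorial in two of the slide's factors (Altitude, Impact Angle), estimates the model coefficients, and predicts the response at a chosen setting. The miss-distance values, the coded -1/+1 levels, the choice of two factors, and all function names are illustrative assumptions, not data or methods from the briefing.

```python
from itertools import product

# Factor names come from the slide; the coded levels (-1 = low, +1 = high)
# and the two-factor design size are assumptions for illustration.
FACTORS = ["altitude", "impact_angle"]

def full_factorial(k):
    """All 2**k runs of a two-level design in coded units."""
    return [list(run) for run in product((-1, 1), repeat=k)]

def fit_model(design, y):
    """Coefficients of y = b0 + b1*x1 + b2*x2 + b12*x1*x2.

    The columns of a two-level full factorial are orthogonal, so each
    least-squares coefficient is just the average of (column * response).
    """
    n = len(design)
    coeffs = {"intercept": sum(y) / n}
    for j, name in enumerate(FACTORS):
        coeffs[name] = sum(run[j] * yi for run, yi in zip(design, y)) / n
    coeffs["interaction"] = sum(
        run[0] * run[1] * yi for run, yi in zip(design, y)
    ) / n
    return coeffs

def predict(coeffs, x1, x2):
    """Predicted response at coded settings x1 (altitude), x2 (impact angle)."""
    return (coeffs["intercept"] + coeffs["altitude"] * x1
            + coeffs["impact_angle"] * x2 + coeffs["interaction"] * x1 * x2)

design = full_factorial(2)
# Hypothetical miss distances (m), one per run, in the order of `design`.
miss_distance = [10.0, 14.0, 16.0, 28.0]

coeffs = fit_model(design, miss_distance)
print(coeffs)
print("prediction at the design centre:", predict(coeffs, 0, 0))
```

With real data, each run would be replicated and the noise variables (weather, training, TLE, launch conditions) would show up as scatter around these predictions.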
Why DOE? Scientific Answers to Four Fundamental Test Challenges
Four Challenges
1. How many? Depth of Test – effect of test size on uncertainty
2. Which Points? Breadth of Testing – searching the vast employment battlespace
3. How Execute? Order of Testing – insurance against “unknown-unknowns”
4. What Conclusions? Test Analysis – drawing objective, supported conclusions
DOE effectively addresses all these challenges!
[Diagram: Inputs (X’s) → PROCESS → Outputs (Y’s), with Noise]
Today’s Example – Precision Air Drop System
Just when you think of a good class example – they are already building it!
46 TS – 46 TW Testing JPADS
The dilemma for airdropping supplies has always been a stark one. High-altitude airdrops often go badly astray and become useless or even counter-productive. Low-level paradrops face significant dangers from enemy fire, and reduce delivery range. Can this dilemma be broken? A new advanced concept technology demonstration shows promise, and is being pursued by U.S. Joint Forces Command (USJFCOM), the U.S. Army Soldier Systems Center at Natick, the U.S. Air Force Air Mobility Command (USAF AMC), the U.S. Army Project Manager Force Sustainment and Support, and industry. The idea? Use the same GPS-guidance that enables precision strikes from JDAM bombs, coupled with software that acts as a flight control system for parachutes. JPADS (the Joint Precision Air-Drop System) has been combat-tested successfully in Iraq and Afghanistan, and appears to be moving beyond the test stage in the USA… and elsewhere.
Requirements: Probability of Arrival; Unit Cost $XXXX; Damage to payload; Payload; Accuracy; Time on target; Reliability …
Capability: Assured SOF re-supply of material
A beer and a blemish …
1906 – W.S. Gosset, a Guinness chemist
Draw a yeast culture sample
Yeast in this culture?
Guess too little – incomplete fermentation; too much -- bitter beer
He wanted to get it right
1998 – Mike Kelly, an engineer at a contact lens company
Draw sample from 15K lot
How many defective lenses?
Guess too little – mad customers; too much -- destroy good product
He wanted to get it right
The central test challenge …
In all our testing – we reach into the bowl (reality) and draw a sample of JPADS performance
Consider an “80% JPADS”
Suppose a required 80% P(Arrival)
Is the Concept version acceptable?
We don’t know in advance which bowl God hands us …
The one where the system works or,
The one where the system doesn’t
The central challenge of test – what’s in the bowl?
Start -- Blank Sheet of Paper
Let’s draw a sample of _n_ drops
How many is enough to get it right?
3 – because that’s how much $/time we have
8 – because I’m an 8-guy
10 – because I’m challenged by fractions
30 – because something good happens at 30!
Let’s start with 10 and see …
=> Switch to Excel File – JPADS Pancake.xls
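The Excel file is not part of this transcript, so the following is only a stand-in sketch of the same idea: repeatedly "reach into the bowl", draw n = 10 drops from a JPADS whose true P(Arrival) is 80%, and tally how many hits each simulated test produces. The number of simulated tests (1000), the random seed, and the function name are assumptions.

```python
import random
from collections import Counter

def simulate_tests(p_arrival, n_drops=10, n_tests=1000, seed=1):
    """Tally hits per test over many simulated tests of n_drops each."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(n_tests):
        # Each drop arrives with probability p_arrival, independently.
        hits = sum(rng.random() < p_arrival for _ in range(n_drops))
        tally[hits] += 1
    return tally

tally = simulate_tests(p_arrival=0.80)

# Crude text histogram: one '#' per 10 simulated tests.
for hits in range(11):
    print(f"{hits:2d} hits: {'#' * (tally[hits] // 10)} {tally[hits]}")
```

Even though the system in this bowl meets the 80% requirement, a noticeable share of the simulated 10-drop tests score 6 or fewer hits, which is the spread the following slides quantify.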
A false positive – declaring JPADS is degraded (when it’s not)
In this bowl – JPADS performance is acceptable
Suppose we fail JPADS when it has 4 or more misses (6 or fewer hits in 10 drops)
We’ll be wrong (on average) about 10% of the time
We can tighten the criteria (fail on 7 or fewer hits) at the cost of failing to field more good systems
We can loosen the criteria (fail on 5 or fewer hits) at the cost of missing more real degradations
Let’s see how often we miss such degradations …
[Chart: “Maverick OK -- 80% We Should Field” (applied to JPADS): Frequency versus Hits (3 to 10); annotation: Wrong ~10% of time]
A false negative – we field JPADS (when it’s degraded)
In this bowl – JPADS P(A) decreased 10% – it is degraded
Use the failure criteria from the previous slide
If JPADS scores 7 or more hits, we field it and fail to detect the degradation
If JPADS has degraded, with n=10 drops, we’re wrong about 65% of the time
We can, again, tighten or loosen our criteria, but at the cost of increasing the other error
[Chart: “Maverick Poor -- 70% Pk -- We Should Fail” (applied to JPADS): Frequency versus Hits (3 to 10); annotation: Wrong 65% of time]
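The two error rates quoted on these slides can be checked with exact binomial arithmetic. The sketch below assumes each of the 10 drops is an independent hit-or-miss trial, takes 80% as the acceptable P(Arrival) and 70% as the degraded one, and reads "fail on k" as "fail JPADS when it scores k or fewer hits"; that reading and the function names are assumptions.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent drops."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_fail(n, p, max_hits_to_fail):
    """P(hits <= max_hits_to_fail), i.e. the test fails the system."""
    return sum(binom_pmf(k, n, p) for k in range(max_hits_to_fail + 1))

N_DROPS = 10
P_GOOD, P_DEGRADED = 0.80, 0.70

errors = {}
for max_hits_to_fail in (5, 6, 7):
    # alpha: false positive, we fail a system that is actually acceptable.
    alpha = prob_fail(N_DROPS, P_GOOD, max_hits_to_fail)
    # beta: false negative, we field a system that is actually degraded.
    beta = 1 - prob_fail(N_DROPS, P_DEGRADED, max_hits_to_fail)
    errors[max_hits_to_fail] = (alpha, beta)
    print(f"fail on <= {max_hits_to_fail} hits: "
          f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```

For the rule on the slide (fail on 6 or fewer hits) this gives alpha of about 0.12 and beta of about 0.65, in line with the "~10%" and "65%" annotations. Moving the cutoff to 7 raises alpha and lowers beta; moving it to 5 does the reverse.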
We seek to balance our chance of errors
Combining, we can trade one error for the other (α for β)
We can also increase sample size to decrease our risks in testing
These statements are not opinion – they are mathematical fact and an inescapable challenge in testing
There are two other ways out … factorial designs and real-valued MOPs
Enough to Get It Right: Confidence in stating results; Power to find small differences
[Charts: “Maverick OK -- 80% We Should Field” (Wrong 10% of time) and “Maverick Poor -- 70% Pk -- We Should Fail” (Wrong 65% of time); each plots Frequency versus Hits (3 to 10) for JPADS P(A)]
A Drum Roll, Please …
For α = β = 10%, δ = 10% degradation in P(A)
N=120!
But if we measure miss distance for same confidence and power
N=8
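The slide does not say how N=120 was computed. One common route that lands close to it is the normal-approximation sample size for a one-sided test of a proportion, sketched below for an acceptable P(A) of 80% against a degraded 70% with α = β = 10%. The formula choice, the one-sided framing, and the function name are assumptions rather than the author's stated method.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_for_proportion_test(p0, p1, alpha, beta):
    """Approximate n for a one-sided test of H0: p = p0 against p = p1 < p0.

    Uses n = ((z_alpha*sqrt(p0*q0) + z_beta*sqrt(p1*q1)) / (p0 - p1))**2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # controls false positives
    z_beta = NormalDist().inv_cdf(1 - beta)    # controls false negatives
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p0 - p1)) ** 2)

n_drops = n_for_proportion_test(p0=0.80, p1=0.70, alpha=0.10, beta=0.10)
print("drops needed for hit/miss scoring:", n_drops)
```

The result is roughly 120 drops, which illustrates why pass/fail scoring is so expensive: each drop carries only one bit of information, whereas a real-valued measure such as miss distance uses the full measurement.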
Recap – First Challenge
Challenge 1: effect of sample size on errors – Depth of Test
So -- it matters how many we do and it matters what we measure
Now for the 2nd challenge – Breadth of testing – selecting points to search the employment battlespace
Challenge 2: Breadth -- How Do Designed Experiments Solve This?
Designed Experiment (n). Purposeful control of the inputs (factors) in such a way as to deduce their relationships (if any) with the output (responses).
Test: JPADS Payload Arrival
Inputs (Conditions): JPADS Concept A B C …; Tgt Sensor (TP, Radar); Payload Type; Platform (C-130, C-117)
Outputs (MOPs): Hits/misses; RMS Trajectory Dev; P(payload damage); Miss distance (m)
Statistician G.E.P. Box said …
“All math models are false …but some are useful.”
“All experiments are designed … most, poorly.”
Type: Objective, Subjective
Measure of Performance: Target acquisition range; Target standoff (altitude); Launch range; Mean radial arrival distance; Probability of damage; Reliability; Interoperability; Human factors; Tech data; Support equipment; Tactics
Battlespace Conditions for JPADS Case
Systems Engineering Question: Does JPADS perform at required capability level across the planned battlespace?
Name        Setting   Low Level   High Level
SCP         460.00    300.00      500.00
Turn Rate   2.80      0.00        4.00
Ride        Hard      Medium      Hard
Airspeed    180.00    160.00      230.00
Performance Predictions
                     Prediction   95% PI low   95% PI high
Deviation from SCP   6.96         4.93         8.98
Pilot Ratings        3.62         3.34         3.90
Design of Experiments Test Process is Well-Defined
[Flowchart; legend: Start, Process Step, Decision (Yes/No), Output]
Planning: Factors Desirable and Nuisance
Desired Factors and Responses
Design Points
Test Matrix
Results and Analysis
Model Build
Discovery, Prediction
A-o-A, Sideslip, Stabilizer, LEX Type
We understand operations, aero, mechanics, materials, physics, electro-magnetics …
To our good science, DOE introduces the Science of Test
Bonus: Match faces to names – Ohm, Oppenheimer, Einstein, Maxwell, Pascal, Fisher, Kelvin
It applies to our tests: DOE in 50+ operations over 20 years
IR Sensor Predictions
Ballistics 6 DOF Initial Conditions
Wind Tunnel fuze characteristics
Camouflaged Target JT&E ($30M)
AC-130 40/105mm gunfire CEP evals
AMRAAM HWIL test facility validation
60+ ECM development + RWR tests
GWEF Maverick sensor upgrades
30mm Ammo over-age LAT testing
Contact lens plastic injection molding
30mm gun DU/HEI accuracy (A-10C)
GWEF ManPad Hit-point prediction
AIM-9X Simulation Validation
Link 16 and VHF/UHF/HF Comm tests
TF radar flight control system gain opt
New FCS software to cut C-17 PIO
AIM-9X+JHMCS Tactics Development
MAU 169/209 LGB fly-off and eval
Characterizing Seek Eagle Ejector Racks
SFW altimeter false alarm trouble-shoot
TMD safety lanyard flight envelope
Penetrator & reactive frag design
F-15C/F-15E Suite 4 + Suite 5 OFPs
PLAID Performance Characterization
JDAM, LGB weapons accuracy testing
Best Autonomous seeker algorithm
SAM Validation versus Flight Test
ECM development ground mounts (10’s)
AGM-130 Improved Data Link HF Test
TPS A-G WiFi characterization
MC/EC-130 flare decoy characterization
SAM simulation validation vs. live-fly
Targeting Pod TLE estimates
Chem CCA process characterization
Medical Oxy Concentration T&E
Multi-MDS Link 16 and Rover video test
Three DOE Stories for T&E
Requirements: SDB II Build-up SDD Shot Design
Acquisition: F-15E Suite 4E+ OFP Qualification
Test: Combining Digital-SIL-Live Simulations
We’ve selected these from 1000’s to show T&E Transformation
Plan, Ponder, Process, Produce
[Repeated title graphic: the Plan test matrix and the Cause-Effect (CNX) Diagram, with causes Manpower, Materials, Methods, Machines, Measurements, and Milieu (Environment) leading to the Response (Effect)]
Testing to Diverse Requirements: SDB II Shot Design
Test Objective:
SPO requests help – is 46 shots the right N?
Power analysis – what can we learn?
Consider Integrated Test with AFOTEC
What are the variables? We do not know yet …
How can we plan?
What “management reserve”
Power (1-β) 80% 97.50% 80% detect shifts equal or ++
Same
Case: Integration of Sim-HWIL-Captive-Live Fire Events
Test Objective:
Most test programs face this – AIM-9X, AMRAAM, JSF, SDB II, etc…
Multiple simulations of reality with increasing credibility but increasing cost
Multiple test conditions to screen for most vital to performance
How to strap together these simulations with prediction and validation?
DOE Approach:
• In digital sims screen 15-20 variables with fractional factorials and predict performance
• In HWIL, confirm digital prediction (validate model) and further screen 8-12 factors; predict
• In live fly, confirm prediction (validate) and test 3-5 most vital variables
• Prediction Discrepancies offer chance to improve sims
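To make the screening step concrete, here is a small sketch of a two-level fractional factorial: a 2^(7-4) design that screens 7 factors in 8 runs instead of the 128 a full factorial would need. The slide speaks of 15-20 variables; 7 is used here only to keep the example short, and the generator choice (D=AB, E=AC, F=BC, G=ABC), the factor letters, and the function names are assumptions.

```python
from itertools import product

# Generators for a 2^(7-4) resolution III design:
# A, B, C are base factors; the other four are products of base columns.
GENERATORS = {"D": "AB", "E": "AC", "F": "BC", "G": "ABC"}

def fractional_factorial():
    """Return 8 runs, each a dict of coded -1/+1 levels for 7 factors."""
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        run = {"A": a, "B": b, "C": c}
        for name, word in GENERATORS.items():
            level = 1
            for base in word:
                level *= run[base]
            run[name] = level
        runs.append(run)
    return runs

def is_orthogonal(runs):
    """True if every pair of factor columns has zero dot product."""
    names = sorted(runs[0])
    return all(
        sum(r[x] * r[y] for r in runs) == 0
        for i, x in enumerate(names)
        for y in names[i + 1:]
    )

design = fractional_factorial()
names = sorted(design[0])
print(" ".join(f"{n:>2}" for n in names))
for run in design:
    print(" ".join(f"{run[n]:+d}" for n in names))
print("runs:", len(design), "versus full factorial:", 2 ** len(names))
print("columns orthogonal:", is_orthogonal(design))
```

The price of so few runs is aliasing: in a resolution III design each main effect is confounded with two-factor interactions, which is acceptable for a first screen in cheap digital simulation and is then resolved as the surviving factors move on to HWIL and live fly.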
Results:
• Approach successfully used in 53d Wing EW Group
• SIL labs at Eglin/PRIMES > HWIL on MSTE Ground Mounts > live fly (MSTE/NTTR) for jammers and receivers
• Trimmed live fly sorties from 40-60 to 10-20 (typical) today
• AIM-9X, AMRAAM, ATIRCM: 90% sim reduction
[Diagram: 1000’s Digital Mod/Sim (15-20 factors) → Predict / Validate → 100’s HWIL or captive (8-12 factors) → Predict / Validate → 10’s Live Shot (3-5 factors); $ and Credibility increase (+) toward live shots]
A Strategy to be the Best … Using Design of Experiments
Inform Leadership of Statistical Thinking for Test
Adopt most powerful test strategy (DOE)
Train & mentor total team
Combo of AFIT, Center, & University
Revise AF Acq policy, procedures
Share these test improvements
53d Wing Policy Model: Test Deeply & Broadly with Power & Confidence
From 53d Wing Test Manager’s Handbook*: “While this [list of test strategies] is not an all-inclusive list, these are well suited to operational testing. The test design policy in the 53d Wing supplement to AFI 99-103 mandates that we achieve confidence and power across a broad range of combat conditions. After a thorough examination of alternatives, the DOE methodology using factorial designs should be used whenever possible to meet the intent of this policy.”
* Original Wing Commander Policy April 2002
March 2009: OTA Commanders Endorse DOE for both OT & DT
“Experimental design further provides a valuable tool to identify and mitigate risk in all test activities. It offers a framework from which test agencies may make well-informed decisions on resource allocation and scope of testing required for an adequate test. A DOE-based test approach will not necessarily reduce the scope of resources for adequate testing.
Successful use of DOE will require a cadre of personnel within each OTA organization with the professional knowledge and expertise in applying these methodologies to military test activities. Utilizing the discipline of DOE in all phases of program testing from initial developmental efforts through initial and follow-on operational test endeavors affords the opportunity for rigorous systematic improvement in test processes.”
Nov 2008: AAC Endorses DOE for RDT&E Systems Engineering
AAC Standard Systems Engineering Processes and Practices
July 2009: 46 TW Adopts DOE as default method of test
We Train the Total Test Team … but first, our Leaders!
OA/TE Practitioner Series
10 sessions and 1 week each
Reading--Lecture--Seatwork
Basic Statistics Review (1 week)
Random Variables and Distributions
Descriptive & Inferential Statistics
Thorough treatment of t Test
Applied DOE I and II (1 week each)
Advanced Undergraduate treatment
Graduates know both how and why
Journeymen Testers
Leadership Series
DOE Orientation (1 hour)
DOE for Leaders (half day)
Introduction to Designed Experiments (PMs- 2 days)
Ongoing Monday Continuation Training
Weekly seminars online
Topics wide ranging: New methods; New applications; Problem decomposition; Analysis challenges; Reviewing the basics; Case studies; Advanced techniques
DoD web conferencing
Q&A session following
Monday 1400-1500, CR 22 Bldg 1 or via DCO at desk
Operations Analyst Forum/DOE Continuation Training for 04 May 09:
Location/Date/Time: B1, CR220, Monday, 04 May 09, 1400-1500 (CST)
Purpose: OPS ANALYST FORUM, DOE CONTINUATION TRAINING, USAF T&E COLLABORATION