Challenges in Modeling
Jan 31, 2016
COMPLEXITIES OF MODELS
• Large State Space (e.g. Bedrock, Wireless handoff)
– Model construction problem
– Model solution problem
• Model Stiffness: fast and slow rates acting together
– Failure and Recovery/Repair (HSP Markov model in Bedrock)
– Performance and failure (Wireless handoff)
COMPLEXITIES OF MODELS (Continued)
• Modeling Non-Exponential Distributions (e.g. N+1 problem)
• Believability/Understandability/Usability
• What about software?
Potential Solutions
• Largeness
– Largeness Tolerance
– Largeness Avoidance
LARGENESS TOLERANCE
• Automated Model Construction
– Loops in the specification of CTMC (SHARPE)
– Stochastic Petri nets (SPNP, SHARPE)
– High level languages (SAVE, QNAP, ASSIST, SDM)
– Fault-Tree + Recovery Info (HARP)
– Object-Oriented Approaches (TANGRAM)
LARGENESS TOLERANCE (Continued)
• Efficient numerical solution techniques
– Sparse Storage
– Accurate and Efficient Solution Methods
We have generated and solved models
with 1,000,000 states (this limit has gone up
considerably recently)
Steady-State: Near-Optimal SOR
Transient: Modified Jensen's method
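The two ideas on this slide, sparse storage and an iterative steady-state solver, can be sketched together as follows. This is a minimal illustration, not the SHARPE or SPNP implementation: the function name `steady_state_sor` and the dictionary-of-rates layout are assumptions, the chain is assumed irreducible, and the relaxation factor is fixed by the caller rather than chosen near-optimally.

```python
def steady_state_sor(rates, n, omega=1.0, tol=1e-12, max_iter=10_000):
    """Solve pi * Q = 0 with sum(pi) = 1 for an irreducible CTMC.

    rates: dict mapping (i, j) -> rate from state i to state j (i != j).
    Only nonzero rates are stored, which is the sparse-storage idea.
    """
    # Group the nonzero rates by destination state and sum exit rates.
    incoming = [[] for _ in range(n)]
    exit_rate = [0.0] * n
    for (i, j), r in rates.items():
        incoming[j].append((i, r))
        exit_rate[i] += r

    pi = [1.0 / n] * n
    for _ in range(max_iter):
        delta = 0.0
        for j in range(n):
            # Balance equation: pi_j * exit_j = sum_i pi_i * q_ij
            gauss_seidel = sum(pi[i] * r for i, r in incoming[j]) / exit_rate[j]
            new = (1.0 - omega) * pi[j] + omega * gauss_seidel
            delta = max(delta, abs(new - pi[j]))
            pi[j] = new
        # Renormalize so the probabilities sum to one.
        total = sum(pi)
        pi = [p / total for p in pi]
        if delta < tol:
            break
    return pi


# Hypothetical example: two units, failure rate lam each, one repairer (rate mu).
lam, mu = 0.001, 0.1
example_rates = {(0, 1): 2 * lam, (1, 2): lam, (1, 0): mu, (2, 1): mu}
example_pi = steady_state_sor(example_rates, 3)
```

With `omega=1.0` the sweep reduces to Gauss-Seidel. Memory and work per sweep grow with the number of nonzero rates, not with the square of the number of states.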
MODEL SPECIFICATION LANGUAGES
• Different languages can be used to specify a
single model type:
SAVE, QNAP, SPNP all appear very different;
underlying model type is Markov
• Same language can be used to specify different
model types: SPNP input language used for
Markovian SPN analytic-numeric solution or
non-Markovian SPN simulation solution
MODEL SPECIFICATION LANGUAGES (Continued)
• Languages can be domain specific:
– Reliability: HARP, SDM
– Availability: SAVE
– Performance: RESQ, QNAP
• Language can be domain independent:
– SHARPE, SPNP
LARGENESS AVOIDANCE
• Non-State-Space methods
– Reliability block diagrams
– Fault-trees
– Product-Form Queuing Networks
• Approximate solutions
– State Truncation
SAVE, SPNP (Kantz and Trivedi: PNPM91)
Case Study: JPL REE System Availability Modeling in Spacecraft Architecture
LARGENESS AVOIDANCE (Cont.)
• Stochastic Petri Nets (State-space-based modeling)
• State truncation by introducing guard function
Guard g is defined as
if (mark("…_dn") >= K)
    return (0);  /* disabled: K or more tokens already in the place */
else
    return (1);  /* enabled */
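The effect of such a guard on the generated state space can be sketched outside SPNP. The example below is an assumption-laden illustration, not CSPL: it models a hypothetical pool of `n_units` identical units with one repairer, takes the state to be the number of units down, and uses a guard that disables further failures once `k_max` units are down.

```python
from collections import deque


def build_truncated_chain(n_units, lam, mu, k_max):
    """Generate the reachable states and rates under a truncating guard.

    State = number of units down. The guard mirrors the slide: it returns 0
    (failure transition disabled) once k_max or more units are down.
    """
    def guard(down):
        return 0 if down >= k_max else 1

    rates = {}
    seen = {0}
    frontier = deque([0])
    while frontier:
        down = frontier.popleft()
        successors = []
        if down < n_units and guard(down):
            # Any of the remaining up units may fail.
            successors.append((down + 1, (n_units - down) * lam))
        if down > 0:
            # Single repairer restores one unit at a time.
            successors.append((down - 1, mu))
        for nxt, rate in successors:
            rates[(down, nxt)] = rate
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return sorted(seen), rates


# 100 units would give 101 states; the guard keeps only states 0..3.
states, rates = build_truncated_chain(n_units=100, lam=1e-4, mu=1.0, k_max=3)
```

Truncation is an approximation: the probability mass of the discarded states is small when failures are rare relative to repairs, but it should be checked rather than assumed.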
SPN MODELING
AVAILABILITY MEASURES
LARGENESS AVOIDANCE (Continued)
• Approximate solutions
– Hierarchical Decomposition
and Fixed-Point Iteration among submodels:
• Heidelberger and Trivedi; IEEE-TC,1983
(Queueing Models)
• Ciardo and Trivedi; PNPM91 (SPN Models)
• Tomek and Trivedi (Availability Models)
• Lanus, Liang & Trivedi: (Bedrock)
• Wireless handoff work: Ma, Han & Trivedi
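Fixed-point iteration among submodels can be illustrated with a deliberately simplified handoff model. Everything here is an assumption made for the sketch and is not taken from the cited papers: a homogeneous network, a cell modeled as an Erlang loss system with no guard channels, and the names `erlang_b` and `handoff_fixed_point`. The handoff arrival rate into a cell is an input to the cell submodel, but it must equal the handoff departure rate that the same submodel produces, so the two are iterated until they agree.

```python
def erlang_b(channels, offered_load):
    """Blocking probability of an M/M/c/c loss system (stable recursion)."""
    b = 1.0
    for c in range(1, channels + 1):
        b = offered_load * b / (c + offered_load * b)
    return b


def handoff_fixed_point(channels, lam_new, mu_call, mu_handoff,
                        tol=1e-10, max_iter=1000):
    """Iterate on the handoff arrival rate until input equals output."""
    lam_h = 0.0  # initial guess for the handoff arrival rate
    for iteration in range(1, max_iter + 1):
        # Submodel: loss system fed by new calls plus handoff calls.
        load = (lam_new + lam_h) / (mu_call + mu_handoff)
        blocking = erlang_b(channels, load)
        busy = load * (1.0 - blocking)   # mean number of busy channels
        new_lam_h = mu_handoff * busy    # handoff departures from the cell
        if abs(new_lam_h - lam_h) < tol:
            return new_lam_h, blocking, iteration
        lam_h = new_lam_h
    raise RuntimeError("fixed-point iteration did not converge")


lam_h, blocking, iterations = handoff_fixed_point(
    channels=20, lam_new=10.0, mu_call=1.0, mu_handoff=0.5)
```

Each pass solves only a small submodel, so the overall model is never generated as a single large chain. Convergence and uniqueness of the fixed point have to be argued model by model.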
LARGENESS AVOIDANCE (Continued)
• Approximate solutions
– Performability:
Multiprocessor example
– Fluid Approximation:
Mitra; Kulkarni; Ciardo; Nicol, and Trivedi;
FSPN
Difficulties in Modeling Using MRMs
• Stiffness: causes numerical difficulties in solution
– Stiffness Tolerance: develop stiffness-tolerant numerical solution methods
– Stiffness Avoidance: avoid generating stiff models through decomposition
Potential Solutions (Continued)
• Stiffness
– Stiffness Tolerance
– Stiffness Avoidance
• Modeling Non-Exponential Distributions
– Stage-type expansion, MRGP, NHCTMC, DES
STIFFNESS TOLERANCE
• Automatic Detection of Stiffness (HARP)
• Special Stable ODE Solver
Reibman and Trivedi (TR-BDF2)
Computers and Operations Research, 1988.
Malhotra and Trivedi (Pade, Implicit RK)
STIFFNESS TOLERANCE (Continued)
• Uniformization for Stiff Markov Chains
Muppala and Trivedi
We can solve models with rate ratios of 10^8 or
higher
Implemented in SHARPE & SPNP
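For orientation, plain uniformization (Jensen's method) for transient state probabilities can be sketched as below. This is the textbook form, not the stiff-chain variant credited above, and the function name `transient_uniformization` and the dense list-of-lists generator are assumptions chosen for brevity. The number of terms grows with q*t and the leading Poisson weight exp(-q*t) underflows for large q*t, which is exactly the difficulty that stiff models create.

```python
import math


def transient_uniformization(Q, p0, t, eps=1e-12, max_terms=100_000):
    """Transient state probabilities p(t) = p0 * exp(Q t) by uniformization.

    Q is a dense generator matrix (rows sum to zero); p0 is the initial
    probability vector. Returns (p(t), number of Poisson terms used).
    """
    n = len(Q)
    q = 1.02 * max(-Q[i][i] for i in range(n))  # uniformization rate
    # Embedded DTMC: P = I + Q / q
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
         for i in range(n)]
    weight = math.exp(-q * t)          # Poisson probability of 0 jumps
    vec = list(p0)                     # p0 * P^k, starting with k = 0
    result = [weight * v for v in vec]
    mass = weight                      # accumulated Poisson mass
    k = 0
    # Stop once the neglected Poisson tail is below eps.
    while 1.0 - mass > eps and k < max_terms:
        k += 1
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= q * t / k
        mass += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result, k


# Hypothetical two-state availability model: state 0 = up, state 1 = down.
lam, mu = 0.5, 5.0
Q = [[-lam, lam], [mu, -mu]]
p_t, terms_used = transient_uniformization(Q, [1.0, 0.0], t=2.0)
```

All terms in the sum are nonnegative, so there is no cancellation, and the truncation error is bounded by the neglected Poisson tail `eps`.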
STIFFNESS AVOIDANCE
• Model-level decomposition
– Hierarchical Composition (SHARPE)
Composition of Submodel solutions without
generating a single one-level overall model
(Bedrock example)
– Fixed-Point Iteration (Wireless handoff example)
STIFFNESS AVOIDANCE (Continued)
• Importance Sampling (simulation)
– Lewis, Goyal, Heidelberger, Shahabuddin, Geist, Nicola
– Can also apply to analytic-numeric methods
(Heidelberger, Muppala, and Trivedi; Performance 93)
• Importance splitting (Simulation)
– Tuffin and Trivedi; TOOLS 2000
Non-Exponential Behavior
• Non state space models: Fault Trees, Reliability
Graphs, RBDs; no problem
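The "no problem" remark can be made concrete: with independent components, a series-parallel reliability block diagram is evaluated directly from component reliabilities R(t), whatever their distribution. The sketch below uses Weibull lifetimes; the helper names and the example parameters are assumptions for illustration only.

```python
import math


def weibull_reliability(t, shape, scale):
    """R(t) = exp(-(t / scale) ** shape) for a Weibull lifetime."""
    return math.exp(-((t / scale) ** shape))


def series(*reliabilities):
    """All blocks must work (independent components)."""
    return math.prod(reliabilities)


def parallel(*reliabilities):
    """At least one block must work (independent components)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical system: two redundant processors in series with one bus.
t = 500.0
r_cpu = weibull_reliability(t, shape=2.0, scale=1000.0)
r_bus = weibull_reliability(t, shape=1.5, scale=5000.0)
r_system = series(parallel(r_cpu, r_cpu), r_bus)
```

No state space is built, so the non-exponential lifetimes cost nothing extra. The price is the independence assumption, which rules out shared repair and similar dependencies.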
Non-Exponential Behavior in State Space Models
NON-EXPONENTIAL DISTRIBUTIONS
• Phase-Type Expansions
– N+1 example
• Non-Homogeneous Markov Chains
CARE III, HARP
Soft Rel model with imperfect repairs solved
using SHARPE
NON-EXPONENTIAL DISTRIBUTIONS (Continued)
• Semi-Markov Chains: N+1 example
• Markov Regenerative Processes: Choi, Logothetis, Kulkarni, Trivedi
• DSPN and MRSPN: Choi, Kulkarni, Trivedi
• Discrete-Event Simulation: now in SPNP (FSPN and Non-Markovian SPN
Simulation), RESQ, QNAP, Bones, SES workbench
CASE STUDY: AT&T
• GSHARPE:
– A preprocessor to SHARPE developed at Bell Labs by a Duke student.
– User can specify Weibull failure times and lognormal and other repair time distributions.
– GSHARPE fits these to phase-type distributions and produces a Markov model for processing by SHARPE.
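How a non-exponential distribution might be replaced by exponential stages can be sketched with simple moment matching. This is not GSHARPE's fitting algorithm, which the slides do not describe. It is a common textbook heuristic, and all function names are assumptions: an Erlang-k chain when the coefficient of variation is at most 1, and a two-branch hyperexponential with balanced means when it exceeds 1.

```python
import math


def weibull_mean_cv(shape, scale):
    """Mean and coefficient of variation of a Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return scale * g1, math.sqrt(g2 - g1 * g1) / g1


def fit_phase_type(mean, cv):
    """Return branches as (probability, number_of_stages, stage_rate)."""
    if cv <= 1.0:
        # Erlang-k: matches the mean exactly, the cv only approximately.
        k = math.ceil(1.0 / (cv * cv))
        return [(1.0, k, k / mean)]
    # Hyperexponential with balanced means: matches mean and cv exactly.
    s = math.sqrt((cv * cv - 1.0) / (cv * cv + 1.0))
    p = 0.5 * (1.0 + s)
    return [(p, 1, 2.0 * p / mean), (1.0 - p, 1, 2.0 * (1.0 - p) / mean)]


def moments_of_fit(branches):
    """Mean and cv of a mixture of Erlang branches."""
    m1 = sum(p * k / r for p, k, r in branches)
    m2 = sum(p * k * (k + 1) / (r * r) for p, k, r in branches)
    return m1, math.sqrt(m2 - m1 * m1) / m1


# Hypothetical wear-out failure time: Weibull with shape 2, scale 1000 hours.
mean, cv = weibull_mean_cv(shape=2.0, scale=1000.0)
branches = fit_phase_type(mean, cv)
```

Each stage becomes an extra state in the Markov model, so a low coefficient of variation (many Erlang stages) trades accuracy against state-space size.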
Potential Solutions (Continued)
• Believability/Understandability/Usability
– GUI, many practical examples, short-courses, tools, Boeing SDM project
• Incorporation in the design process
– VHDL Availability Model,
– C Program Perf. Model
– Ada Program SPN Perf. Model (SPC)
• Connection between measurements & models
BELIEVABILITY/UNDERSTANDABILITY
• Integration of Measurements and Models
– Measurements Provide Parameters to Models
– Models Provide Guidelines For Measurements
– Models Validated Against Measurements
• Integration of Different Modeling Tools
– Boeing SDM project
BELIEVABILITY/UNDERSTANDABILITY
(Continued)
• Many Case-Studies of Validations Needed
– Vaxcluster Availability Model: Wein & Sathaye
– Hsueh, Iyer and Trivedi; IEEE-TC, Apr. 1988
– Lucent Validation of ESS; Veena Mendiratta
• Technology Transfer
– Short courses
– Development and Dissemination of Tools
(SHARPE, SPNP)
BELIEVABILITY/UNDERSTANDABILITY
(Continued)
• Application of the Techniques and Tools
– Motorola
– Cisco
– 3Com
– HP
– Sun
CASE STUDY: BOEING
• An Integrated Reliability Environment
• A working prototype
• Developed a high-level modeling language (SDM)
• Designed and implemented an intelligent interpreter
CASE STUDY: BOEING (Continued)
• Interpreter determines which solution method is applicable
• Translator translates the SDM input file into an input file of any of the engines down below
• Five different modeling engines are integrated:
– CAFTA, SETS, EHARP, SHARPE and SPNP.
MODELING AND MEASUREMENTS: INTERFACES
• Measurements supply Input Parameters to Models
(Model Calibration or Parameterization)
Confidence Intervals should be obtained
Boeing, Draper, Union Switch projects
• Model Sensitivity Analysis can suggest which Parameters to Measure More Accurately: Blake, Reibman and Trivedi: SIGMETRICS 1988; Fricks and Trivedi: 1997
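One way to see which parameters deserve more accurate measurement is to rank them by scaled sensitivity, (θ / A) · ∂A/∂θ. The sketch below estimates the derivatives by central differences on a hypothetical two-component series availability model. The model, the parameter names, and the numbers are assumptions for illustration and do not come from the cited papers.

```python
def availability(params):
    """Series system of independent components, each with A = mu / (lam + mu)."""
    a = 1.0
    for name in ("cpu", "disk"):
        lam = params["lam_" + name]
        mu = params["mu_" + name]
        a *= mu / (lam + mu)
    return a


def scaled_sensitivities(model, params, rel_step=1e-6):
    """Central-difference estimate of (theta / A) * dA/dtheta per parameter."""
    base = model(params)
    result = {}
    for key, value in params.items():
        h = value * rel_step
        up = dict(params)
        up[key] = value + h
        down = dict(params)
        down[key] = value - h
        derivative = (model(up) - model(down)) / (2.0 * h)
        result[key] = derivative * value / base
    return result


# Hypothetical failure and repair rates (per hour).
params = {"lam_cpu": 1e-4, "mu_cpu": 1.0, "lam_disk": 1e-3, "mu_disk": 0.5}
sens = scaled_sensitivities(availability, params)
ranking = sorted(sens, key=lambda k: abs(sens[k]), reverse=True)
```

Parameters at the top of `ranking` are the ones whose measurement error moves the result most, so they are the natural targets for tighter confidence intervals.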
MODEL CALIBRATION
What is λ?
• Fault Model for Each Component
– Design, Manufacturing: Heisenbugs, Bohrbugs
– Operational: Permanent, Intermittent, Transient
– Human
• Fault Arrival Processes (PP, Weibull, NHPP)
• Failure Rates (Sources: MIL-STD)
MODEL CALIBRATION (Continued)
What is c?
• Field Data
• Fault/Error Injection (FIAT, MESSALINE)
• Analytic Coverage Model
What is μ?
• Maintenance Model
– Corrective: dispatch, travel, repair time, dead on arrival, imperfect repair
– Preventive
MODEL CALIBRATION (Continued)
What is r?
• Binary: Up & Down
• Capacity-Oriented: Number of Operational Resources in Each State
• Performance-Oriented: Evaluate Perf. in Each Degraded Level of Syst. Config.
1. Measurements
2. Simulation Model
3. Analytic Model -- SHARPE, SPNP
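The three kinds of reward rate differ only in the vector r attached to the states; the measure is always the expected reward Σ r_i π_i. A small sketch follows, assuming a hypothetical two-processor system with one repairer and made-up throughput figures for the performance-oriented case.

```python
def steady_state_two_units(lam, mu):
    """Steady-state probabilities for states 0, 1, 2 = number of units down.

    Birth-death chain: failure rates 2*lam and lam, repair rate mu.
    """
    weights = [1.0, 2.0 * lam / mu, 2.0 * lam * lam / (mu * mu)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_reward(pi, rewards):
    """Expected steady-state reward rate: sum of r_i * pi_i."""
    return sum(p * r for p, r in zip(pi, rewards))


pi = steady_state_two_units(lam=0.001, mu=0.1)

# Binary rewards: system is up while at least one unit works.
availability = expected_reward(pi, [1.0, 1.0, 0.0])
# Capacity-oriented rewards: number of operational units in each state.
mean_capacity = expected_reward(pi, [2.0, 1.0, 0.0])
# Performance-oriented rewards: hypothetical throughput (jobs/s) per state,
# which in practice would come from measurements or a performance model.
mean_throughput = expected_reward(pi, [180.0, 100.0, 0.0])
```

The same state probabilities serve all three measures, so changing the question means changing only the reward vector, not re-solving the model.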
VALIDATION & VERIFICATION
– Validation: Does the conceptual model faithfully
reflect the behavior of the system?
– Verification: Has the conceptual model been
correctly implemented?
MODEL VALIDATION (Continued)
• Three-step process outlined by Naylor and Finger
– Face validation: Discussion with the experts
– Input-Output validation: Compare results obtained from model with those from measurements
– Validation of model assumptions: Either prove that the assumptions are correct or do statistical testing
MODEL ASSUMPTIONS/ERRORS
• Errors in Model Structure
– Missing or Extra Arcs
– Missing or Extra States
– Use Face Validation to avoid these errors.
• Errors Due to Non-Independence
• Distributional Errors
• Parametric Errors
MODEL ASSUMPTIONS/ERRORS (Continued)
• Errors Due to Approximations
– Decomposition/Aggregation/Iteration
– State Truncation
• Numerical Solution Errors
– Discretization Errors
– Round-Off Errors
Model Verification
• Programming Errors
• Approximation errors: Tight bounds due to
approximations are desirable
• Numerical: Errors in numerical algorithms
should be bounded
What about software?
• Testing phase
– Software reliability estimation
• Black-box based approach
• Architecture-based approach
• Operational phase
– Fault tolerance coverage (c in Markov model)
– Countering software aging
• Symptom-based fault management
Conclusions
• Availability evaluation is very important in
characterizing systems
• Evaluation can be performed through measurements, simulation, or analytical modeling
• Model verification and validation should form an integral part of the modeling process