How to use SHARPE and SPNP
Dr. Dan (Dong-Seong) Kim
University of Canterbury, New Zealand
http://www.cosc.canterbury.ac.nz/dongseong.kim
University of Canterbury
University of Canterbury (UC)
• Originated in 1873 in the centre of Christchurch as Canterbury College, the first constituent college of the University of New Zealand
• Ernest Rutherford: physicist, Nobel Prize in Chemistry
• John Key: politician, currently Prime Minister of New Zealand
The Computer Science and Software Engineering department at UC has been ranked among the top 101-150 Computer Science departments in the 2011 International QS World University Rankings.
About myself
Lecturer (Assistant Professor in US) since Aug. 2011
• Full time/permanent
• Computer science and software engineering Dept.
• Research/teaching: Computer and Network Security
Postdoc at Duke U. from June 2008 to July 2011
• Kishor S. Trivedi group
U of Maryland, USA in 2007
• Virgil D. Gligor group (former ACM SIGSAC chair)
Studied at KAU in Korea (BS, MS, PhD)
• JongSou Park group (Penn. State PhD)
[Figure: research areas. Security, Hardware/Software, Network, Ubiquitous computing, Cyber physical systems, Intrusion Detection/Tolerance systems, Sensor Nets Dependability and Security, Cloud/VDC, Fault tolerance, Reliability of Sat./UAV, Embedded System, Smart Grid, REASSURE analysis (REliability, Availability, Security, SUrvivability, REsilience)]
Outline - SHARPE
Brief Intro. To SHARPE
Download
Analytic Modeling using SHARPE
• Non-state space models
o Reliability Block Diagram (RBD)
o Fault Tree
o Reliability Graph
• State space models
o Continuous Time Markov Chains (CTMC)
o Stochastic Petri nets/Stochastic Reward nets
• Hierarchical Models
• Others
Copyright © by Kishor S. Trivedi
Motivation: Dependence on Computer Systems
[Figure: application domains. Health & Medicine, Communication, Avionics, Entertainment, Banking]
Dependability: An umbrella term
Trustworthiness of a computer system such that reliance can justifiably be placed on the service it delivers
Dependability (IFIP WG10.4)
• Attributes: Availability, Reliability, Safety, Maintainability
• Means: Fault Prevention, Fault Removal, Fault Tolerance, Fault Forecasting
• Threats: Faults, Errors, Failures
A failure occurs when the delivered service no longer complies with the desired output.
An error is that part of the system state which is liable to lead to subsequent failure.
A fault is the adjudged or hypothesized cause of an error.
Faults are the cause of errors that may lead to failures:
Fault → Error → Failure
Dependability Attributes or Measures
• Reliability: "The ability of a system to perform a required function under given conditions for a given time interval." No recovery is assumed after the system fails (there can be recovery after a component failure).
• Availability: "The ability of a system to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval."
Reliability, Availability, Performance
Reliability-Based
• Availability: Steady-state, Transient, Interval
• Downtime
• Reliability: R(t), System MTTF
Performance
• Throughput, Loss Probability, Response Time
"Does it work, and for how long?"
"Given that it works, how well does it work?"
Composite Performance and Availability
Need Techniques and Tools That Can Evaluate
• Performance, Availability and Their Combinations
"How much work will be done (lost) in a given interval including the effects of failure/repair/contention?"
Performability
Methods of Evaluation
Measurement-Based
• Most believable, most expensive
• Not always possible or cost effective during system
design
Model-Based
• Less believable, less expensive
• Discrete-Event Simulation vs. Analytic
o Numerical solution tool
o Closed-form solution
Evaluation Methods
Model-based
Discrete-event simulation
Hybrid
Analytic Models
Quantitative evaluation
Measurement-based
Overview of SHARPE
SHARPE: Symbolic-Hierarchical Automated Reliability and Performance Evaluator
Well-known modeling tool (Installed at over 300 Sites; companies and universities)
Combines flexibility of Markov models and efficiency of combinatorial models
Ported to most architectures and operating systems
Used for Education, Research, Engineering Practice
Overview of SHARPE (cont.)
Graphical User Interface is now available
Used for analysis of performance, dependability
and performability
Hierarchy facilitates largeness & stiffness
avoidance
Steady-state as well as transient analysis
Written in C language
Architecture of SHARPE
[Figure: SHARPE architecture. Model types: Fault tree, Reliability graph, Reliability Block Diagrams, Task graph, Pfqn, Mfqn, Semi-Markov chain, Markov chain, Petri net (GSPN & SRN), combined through Hierarchical & Hybrid Compositions. Measures: Availability/Reliability, Performance, Performability.]
Non-state space models
Analytic models
Non-state space model types
Series Parallel reliability block diagrams
(RBDs)
Non-SP reliability block diagrams
(reliability graph: Relgraph)
Fault trees (FTs)
Fault trees with repeated events
Non-state space models (cont.)
Reliability block diagrams, Fault trees and
Reliability graphs
• Commonly used for reliability and availability
• These model types are similar in that they capture
conditions that make a system fail in terms of the
structural relationships between the system
components.
Non-state space models (cont.)
Non-state space modeling techniques (like RBDs, relgraphs and FTs) are easy to use. Assuming statistical independence, they can be solved for system reliability, system availability and system MTTF, and they can be used to find bottlenecks.
Each component can have attached to it
• A probability of failure
• A failure rate
• A distribution of time to failure
• Steady-state and instantaneous unavailability
Non-state space models (cont.)
Non-state space models can be solved using fast algorithms, assuming stochastic independence between system components (all implemented in SHARPE):
• Sum of Disjoint Products (SDP) algorithms.
• Binary Decision Diagrams (BDD) algorithms.
• Factoring (conditioning) algorithms.
• Series-parallel composition algorithms.
• Bounding algorithm for relgraphs
Failure/Repair Dependencies are often present:
• RBDs and FTs cannot easily handle these (e.g., shared repair, warm/cold spares, imperfect coverage, non-zero switching time, travel time of repair person, reliability with repair).
Non-state space models (cont.)
Reliability block diagrams (Control-voice channel example)
Fault Tree (Control-voice channel example)
Reliability block diagrams
2 Control and 3 Voice Channels Example
[Figure: RBD with two control channels in parallel, in series with three voice channels in parallel]
Description
• Each control channel has a reliability R_c(t)
• Each voice channel has a reliability R_v(t)
• The system is up if at least one control channel and at least one voice channel are up.
Reliability:
$$R(t) = \left[1 - (1 - R_c(t))^2\right]\left[1 - (1 - R_v(t))^3\right]$$
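The series-parallel reliability expression for the 2-control, 3-voice channel system can be evaluated numerically with a few lines of Python. This is a minimal sketch, not SHARPE's algorithm: it assumes independent channels with exponentially distributed lifetimes, and the function names and the rates `lam_c` and `lam_v` are hypothetical.

```python
import math


def parallel(*reliabilities):
    """Reliability of independent blocks in parallel (at least one must work)."""
    prob_all_fail = 1.0
    for r in reliabilities:
        prob_all_fail *= 1.0 - r
    return 1.0 - prob_all_fail


def series(*reliabilities):
    """Reliability of independent blocks in series (all must work)."""
    prob_all_work = 1.0
    for r in reliabilities:
        prob_all_work *= r
    return prob_all_work


def system_reliability(t, lam_c, lam_v):
    """R(t) for 2 parallel control channels in series with 3 parallel voice channels.

    Assumes exponential lifetimes: R_c(t) = exp(-lam_c t), R_v(t) = exp(-lam_v t).
    """
    r_c = math.exp(-lam_c * t)
    r_v = math.exp(-lam_v * t)
    return series(parallel(r_c, r_c), parallel(r_v, r_v, r_v))


if __name__ == "__main__":
    # Hypothetical failure rates (per hour), chosen only for illustration.
    for t in (0, 100, 1000, 5000):
        print(t, system_reliability(t, lam_c=1e-4, lam_v=2e-4))
```

Composing `series` and `parallel` mirrors the block structure of the diagram, so other series-parallel RBDs can be evaluated by nesting the same two helpers.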
Exercise: non-virtualized vs. virtualized system
H. Ramasamy and M. Schunter, "Architecting Dependable Systems Using Virtualization," In Workshop on DSN-2007.
Fault Tree
Markov chains
Markov chains
To model complex interactions between components, use models like Markov chains or more generally state space models.
• Markov reliability models will have one or more absorbing states
• Markov availability models will have no absorbing states
Many examples of dependencies among system components have been observed in practice and captured by continuous-time Markov chains (CTMCs).
Modeling Taxonomy
Abstract models
Discrete-event simulation
Hybrid
Analytic models
Non-state-space models
State-space models
State-Space Models
States and labeled state transitions
State can keep track of:
• Number of functioning resources of each type
• States of recovery for each failed resource
• Number of tasks of each type waiting at each resource
• Allocation of resources to tasks
A transition:
• Can occur from any state to any other state
• Can represent a simple or a compound event
Markov Availability model WebSphere AP Server
Application server and proxy server (with escalated levels of recovery)
• Delay and imperfect coverage in each step of recovery modeled
[Figure: CTMC availability model with states UA, UR, UB, RE, UO, UP, UN, 1D, 1N and 2N; the transition rate labels are not recoverable from the extracted text]
Failure detection By WLM
By Node Agent
Manual detection
Recovery Node Agent
Auto process restart
Manual recovery Process restart
Node reboot
Repair
State-Space model taxonomy
(discrete) State space
models
Markovian models
non-Markovian models
discrete-time Markov chains (DTMC)
continuous-time Markov chains (CTMC)
Markov reward models (MRM)
Semi-Markov process (SMP)
Markov regenerative process
Non-Homogeneous Markov
Can relax the assumption of exponential distributions
Problem with State Space Models
State space explosion problem or the largeness problem
Stochastic Petri nets (SPNs) and related formalisms for easy specification and automated generation/solution of underlying Markov model
Or use hierarchical (Multilevel) model composition.
• e.g. Upper level : FT or RBD, lower level: Markov chains
• Many practical examples of the use of hierarchical models exist
Markov Reward Models (MRMs)
Modeling any system with a pure reliability/availability model can lead to incomplete or, at least, less precise results.
• Gracefully degrading systems may be able to survive the failure of one or more of their active components and continue to provide service at a reduced level.
• The Markov reward model is a commonly used technique for modeling gracefully degradable systems.
Two-State Markov Availability
Model in SHARPE
Availability measures
Steady-state Availability
Steady-state Unavailability
Transient Availability
Average Cumulative Downtime
Snapshot of the GUI
bind
lambda 0.0033
mu 1
end
* We define a model of type Markov with the name m2_state
markov m2_state readprob
*specify each transition and its rate
1 0 lambda
0 1 mu
end
* specify that state 1 is the initial state with probability 1
1 1
end
* define variable A as the steady-state probability of
* the Markov chain m2_state being in state 1
var A prob(m2_state,1)
var U prob(m2_state,0)
var downtime 60*8760*U
*ask sharpe to compute and print A, U and downtime
expr A, U, downtime
*define function A(t) as the transient probability for the Markov chain
* to be in the up state 1 at time t
func A(t) tvalue(t;m2_state,1)
* we ask SHARPE to compute and print pointwise availability
* for time points 0 through 1000 in steps of 50 hours
loop t,0,1000,50
expr A(t)
end
end
Outputs from SHARPE
A: 9.9668e-01
U: 3.3223e-03
Downtime: 1.7462e+03
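The two-state model has a well-known closed-form solution, which gives a quick cross-check of the SHARPE output. The sketch below is plain Python, not SHARPE input, and its function names are illustrative. It also shows that the printed values correspond to lambda = 1/300; with the rounded value 0.0033 from the listing the downtime comes out near 1729 minutes per year instead.

```python
import math

MINUTES_PER_YEAR = 60 * 8760


def two_state(lam, mu):
    """Steady-state availability, unavailability and yearly downtime (minutes)."""
    availability = mu / (lam + mu)
    unavailability = lam / (lam + mu)
    return availability, unavailability, MINUTES_PER_YEAR * unavailability


def transient_availability(t, lam, mu):
    """A(t) for a two-state CTMC that starts in the up state at t = 0."""
    total = lam + mu
    return mu / total + (lam / total) * math.exp(-total * t)


if __name__ == "__main__":
    for lam in (0.0033, 1 / 300):
        a, u, downtime = two_state(lam, mu=1.0)
        print(f"lambda={lam:.6f}  A={a:.4e}  U={u:.4e}  downtime={downtime:.4e}")
    # Pointwise availability at the same time points as the SHARPE loop.
    for t in range(0, 1001, 50):
        print(t, transient_availability(t, 1 / 300, 1.0))
```

With a repair rate of 1 per hour the transient term decays within a few hours, so A(t) is already at its steady-state value at the first non-zero time point (t = 50).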
[Figure: instantaneous availability A(t) versus time, for time 0 to 1000 in steps of 50; vertical axis from 0.995 to 1.001]
Outputs
Two component system: Markov availability model
Assume we have a two-component parallel redundant system with repair rate μ.
Assume that the failure rate of each component is λ.
When both components have failed, the system is considered to have failed.
Markov availability model (Cont.)
Let the number of properly functioning components be the state of the system. The state space is {0,1,2} where 0 is the system down state.
We wish to examine effects of shared vs. non-shared repair
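[Figure: two three-state CTMCs over states 2, 1, 0. In both, failures occur from 2 to 1 at rate 2λ and from 1 to 0 at rate λ.
Non-shared (independent) repair: repair from 1 to 2 at rate μ and from 0 to 1 at rate 2μ.
Shared repair: repair from 1 to 2 and from 0 to 1, both at rate μ.]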
Markov availability model (Cont.)
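Note: The non-shared case can be modeled and solved using an RBD or an FT, but the shared case needs the use of Markov chains.
For the non-shared case, with component availability A:
$$A = \frac{\mu}{\lambda + \mu}, \qquad A_{sys} = 1 - (1 - A)^2 = 1 - \left(1 - \frac{\mu}{\lambda + \mu}\right)^2$$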
Markov availability model (Cont.)
Markov availability model (Cont.)
markov shared
2 1 2*lambda
* Could be also written
* 2 1 2/MTTF
1 0 lambda
1 2 mu
0 1 mu
end
Shared Case
bind
mu 1
lambda 0.1
end
var U prob(shared,0)
var downtime 60*8760*U
loop j ,2, 5, 0.5
bind lambda 1.0 *10^-j
expr downtime
end
end
Markov availability model (Cont.)
Non-shared case
Markov availability model (Cont.)
markov non_shared
2 1 2*lambda
1 0 lambda
* independent repair: both failed components are repaired in parallel
0 1 2*mu
1 2 mu
end
Non-shared Case
bind
mu 1
end
var U prob(non_shared,0)
var downtime 60*8760*U
loop j ,2, 5, 0.5
bind lambda 1.0 *10^-j
expr downtime
end
end
Markov availability model (Cont.)
Non-shared case
Comparing Shared/Non-shared cases
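For three-state chains this small, the steady-state unavailability can be written in closed form, which makes the shared versus non-shared comparison easy to reproduce outside SHARPE. The sketch below assumes the usual rates: failures 2→1 at 2λ and 1→0 at λ; shared repair at μ from both states 1 and 0; non-shared repair at μ from state 1 and 2μ from state 0. The function names are illustrative.

```python
MINUTES_PER_YEAR = 60 * 8760


def unavailability_shared(lam, mu):
    """P(state 0) with a single shared repair facility."""
    rho = lam / mu
    return 2 * rho**2 / (1 + 2 * rho + 2 * rho**2)


def unavailability_non_shared(lam, mu):
    """P(state 0) with independent repair; equals (1 - A)^2 with A = mu/(lam+mu)."""
    rho = lam / mu
    return (rho / (1 + rho)) ** 2


if __name__ == "__main__":
    mu = 1.0
    j = 2.0
    while j <= 5.0:  # same sweep as the SHARPE loop: lambda = 10^-j
        lam = 10.0 ** (-j)
        d_shared = MINUTES_PER_YEAR * unavailability_shared(lam, mu)
        d_indep = MINUTES_PER_YEAR * unavailability_non_shared(lam, mu)
        print(f"j={j:.1f}  shared={d_shared:.4e}  non-shared={d_indep:.4e}")
        j += 0.5
```

For λ much smaller than μ the shared-repair downtime approaches twice the non-shared downtime, since the only difference is the repair rate out of state 0.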
CTMC: WFS example
WFS Example
A Workstations-Fileserver Example
Computing system consisting of:
• A file-server
• Two workstations
• Computing network connecting them
System operational as long as:
• One of the Workstations
and
• The file-server are operational
Computer network is assumed to be fault-free.
The WFS Example
Assuming exponentially distributed times to failure
• λ_w: failure rate of workstation
• λ_f: failure rate of file-server
Assume that components are repairable
• μ_w: repair rate of workstation
• μ_f: repair rate of file-server
File-server has (preemptive) priority for repair over workstations (such repair priority cannot be captured by non-state-space models)
Markov Chain for WFS Example
Markov Availability Model for WFS
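[Figure: CTMC with states 2_1, 1_1, 0_1, 2_0, 1_0 and 0_0. Transition labels: workstation failures (2λ_w, λ_w), file-server failures (λ_f), workstation repairs (μ_w) and file-server repairs (μ_f).]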
Since each state is reachable from every other state, the
CTMC is irreducible. Furthermore, all states are positive
recurrent (since it is a finite state CTMC).
In the previous figure, the label (i_j) of each state is
interpreted as follows: i represents the number of
workstations that are still functioning and j is ‘1’ or ‘0’
depending on whether the file-server is up or down
respectively.
Note that in the text, no component failures are
allowed from system failure states; this is commonly
assumed by many engineers in practice. Here we
allow component failures from system failure states to
show that this situation can also be modeled.
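As a cross-check of the SHARPE model, the six-state WFS chain can also be built and solved directly. The sketch below is a plain-Python illustration under stated assumptions: the transition structure is inferred from the description (failures allowed in every state, workstation repair only while the file-server is up because of its repair priority), and the four rate values are hypothetical, since the slides do not list them.

```python
# States are (i, j): i = working workstations (0..2), j = 1 if the file-server is up.
# Rates below are illustrative assumptions, not values from the slides.
LAM_W, LAM_F = 0.0001, 0.00005   # failure rates per hour (assumed)
MU_W, MU_F = 1.0, 0.5            # repair rates per hour (assumed)

STATES = [(i, j) for i in (2, 1, 0) for j in (1, 0)]
INDEX = {s: k for k, s in enumerate(STATES)}


def transitions(i, j):
    """Yield (next_state, rate) pairs for state (i, j)."""
    if i > 0:
        yield (i - 1, j), i * LAM_W      # one of the i working workstations fails
    if j == 1:
        yield (i, 0), LAM_F              # the file-server fails
        if i < 2:
            yield (i + 1, 1), MU_W       # workstation repair (file-server is up)
    else:
        yield (i, 1), MU_F               # file-server repair has priority


def steady_state():
    """Solve pi Q = 0, sum(pi) = 1 by Gaussian elimination."""
    n = len(STATES)
    q = [[0.0] * n for _ in range(n)]
    for src in STATES:
        for dst, rate in transitions(*src):
            q[INDEX[src]][INDEX[dst]] += rate
            q[INDEX[src]][INDEX[src]] -= rate
    # Transpose Q and replace the last balance equation by the normalization.
    a = [[q[r][c] for r in range(n)] for c in range(n)]
    b = [0.0] * n
    a[-1] = [1.0] * n
    b[-1] = 1.0
    for col in range(n):                 # forward elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):       # back substitution
        tail = sum(a[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = (b[r] - tail) / a[r][r]
    return dict(zip(STATES, pi))


if __name__ == "__main__":
    pi = steady_state()
    # The system is up when the file-server and at least one workstation are up.
    print("steady-state availability:", pi[(2, 1)] + pi[(1, 1)])
```

Replacing one balance equation with the normalization condition is valid here because the chain is irreducible, so the remaining balance equations determine the solution up to scale.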
Markov Availability Model for WFS (cont.)
Model in SHARPE GUI
Outputs; steady state
Analysis Frame
Output Generated by SHARPE
Markov Reward Models (MRMs)
Markov Reward Models (MRMs)
Continuous Time Markov Chains are useful
models for performance as well as availability
prediction
Extension of CTMC to Markov reward models makes them even more useful
Attach a reward rate (or a weight) r_i to state i of the CTMC. Let Z(t) = r_{X(t)} be the instantaneous reward rate of the CTMC at time t
Markov Reward Models (MRMs) (Continued)
Define Y(t) as the accumulated reward in the interval (0, t]:
$$Y(t) = \int_0^t Z(\tau)\, d\tau$$
Computing the expected values of these measures is easy
For computing the distribution of Y(t), see the Performability monograph edited by Haverkort, Marie, Rubino & Trivedi
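For the two-state availability model, the expected accumulated reward has a closed form, which makes a compact illustration of E[Y(t)]. The sketch below assumes reward rates `r_up` and `r_down` attached to the up and down states and a chain that starts in the up state; with r_up = 1 and r_down = 0, E[Y(t)] is the expected uptime and E[Y(t)]/t is the expected interval availability. Function names and parameter values are hypothetical.

```python
import math


def transient_availability(t, lam, mu):
    """P(up at time t) for a two-state CTMC that starts in the up state."""
    total = lam + mu
    return mu / total + (lam / total) * math.exp(-total * t)


def expected_uptime(t, lam, mu):
    """Integral of the transient availability over (0, t]."""
    total = lam + mu
    return (mu / total) * t + (lam / total**2) * (1.0 - math.exp(-total * t))


def expected_accumulated_reward(t, lam, mu, r_up=1.0, r_down=0.0):
    """E[Y(t)] = r_up * E[time up in (0, t]] + r_down * E[time down in (0, t]]."""
    up = expected_uptime(t, lam, mu)
    return r_up * up + r_down * (t - up)


if __name__ == "__main__":
    lam, mu = 0.01, 1.0  # illustrative rates per hour
    for t in (1.0, 10.0, 100.0):
        y = expected_accumulated_reward(t, lam, mu)
        print(f"t={t:6.1f}  E[Y(t)]={y:.6f}  interval availability={y / t:.6f}")
```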
3-State Markov Reward Model with Sample
Paths of X(t) and Z(t) Processes
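[Figure: 3-state CTMC (states 1, 2, 3) with reward rates r1 = 1.7, r2 = 1.0 and r3 = 0.0, and sample paths of X(t) and Z(t); the transition rate labels are not recoverable from the extracted text]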
CTMC: Security models
Dependability and Security Models
A Dependability and Security Model Classification
[Figure: model classes by attribute combination: R, A, IR, CR, IA, CA, CIA]
Legend
• A: Availability
• C: Confidentiality
• I: Integrity
• R: Reliability
Security Modeling Taxonomy
Analytic models
Non-state space models
State-space models
Attack tree
SMP
CTMC
Security compromise of smart card
Confidentiality compromise
Availability compromise
Integrity compromise
Get PIN Unauthorized access
Dump communication
Applications/algorithm Data Protocol
Block access
Hardware damage
Denial Of service
Non-state space model: Attack Tree
Fault tree -> attack tree
I1 I2 I3
Non-state space model: Attack graph
Reliability graph -> attack graph
Will be covered in more detail in the security talk
Probabilistic Security Quantification
State transition diagram of system security states/ attacker behavior, and use Markov chains, Semi Markov Process, Stochastic Petri Nets
Attack (Response/countermeasure) Trees for incorporating both attacker and system behavior.
Hierarchical and fixed-point iterative models in future?
Security Quantification of SITAR
[Figure: system security vs. attack rate, for threat level 1 and threat level 3]
[Figure: mean time to severe security failure vs. attack rate, for threat level 1 and threat level 3]
State Space Explosion
State space explosion can be handled in two ways:
• Large model tolerance must apply to specification, storage and solution of the model. If the storage and solution problems can be solved, the specification problem can be solved by using more concise (and smaller) model specifications that can be automatically transformed into Markov models (GSPN and SRN models).
• Large models can be avoided by using hierarchical model composition.
Ability of SHARPE to combine results from different kinds of models:
• Possibility to use state-space methods for those parts of a system that require them, and use non-state-space methods for the more "well-behaved" parts of the system.
Dependability and Security Evaluation Methods
Model-based
Discrete-event simulation
Hybrid
Analytic Models
Quantitative evaluation
Measurement-based
Hierarchical models
largeness
Combinatorial models
Efficiency, simplicity
State-space models
Dependency capture
VDC example
Introduction
VMware Virtualization
• Without virtualization vs. with virtualization
Introduction
VM live migration (e.g., VMotion in VMware)
• Enables the live migration of virtual machines across hosts
How useful is this ?
Introduction
VM live migration (VMotion) (cont)
[Figure: three VMware hosts (host1, host2, host3) running App/OS virtual machines, with CPU utilization shown per host; annotations: Call for Upgrade, Upgrade Completion]
Introduction
VM live migration (VMotion) (cont)
• Automatic resource allocation to bottleneck
[Figure: VMotion across ESX Server 1, ESX Server 2 and ESX Server 3, each labeled "Bottle App"]
Introduction
VMware HA (High Availability)
• Provides easy-to-use, cost-effective high availability.
[Figure: resource pool with a host failure]
Revisit: non-virtualized vs. virtualized system
H. Ramasamy and M. Schunter, "Architecting Dependable Systems Using Virtualization," In Workshop on DSN-2007.
A System Architecture
VMM1
Host1
VMM2
Host2
VM2
APP2
VM1
APP1
SAN
CPU
Mem
Power
NIC
Cooler
Hardware
Failure classification in virtualized two hosts
Failures
• Hardware failures: Power failure, Memory failure, CPU failure, Network failure, SAN failure, Cooler failure
• Software failures (non-aging related Mandelbugs): VMM, VM, Application
How can you represent this system using a stochastic model?
Any idea?
A system availability model of virtualized two nodes (old)
System Failure
AND
HW1
CPU Mem Net Pwr
host1
VMM1 HW2
CPU Mem Net Pwr
host2
VMM2
SAN SAN
VMs VMs
A system hierarchical model of virtualized two hosts
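[Figure: hierarchical fault tree with top event System Failure. Inputs: VM sub (Markov submodel), SAN, and two host subtrees combined through AND and OR gates. Host1: HW1 (CPU1, Mem1, NIC1, Pwr1), Coo1, VMM1. Host2: HW2 (CPU2, Mem2, NIC2, Pwr2), Coo2, VMM2. The exact gate wiring is not recoverable from the extracted text.]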
Compute the equivalent failure and recovery rates of one host (subtree) in the fault tree, and use them in the VM Markov model
Markov submodel
CPU submodel
D1 RP
VMM1
host1
VMM2
host2
VM2
APP2
VM1
APP1
SAN
CPU Mem Power NIC Cooler
MTTFeq and MTTReq computation of VM host
Top level is a host fault subtree (boxed)
1. Solve the Markov submodels (CPU, MEM, NIC, PWR, COO) in the host fault subtree.
• We can compute MTTFeq and MTTReq of the Markov submodels
2. Compute MTTFeq and MTTReq of the host fault subtree model
3. Use the MTTFeq and MTTReq values in the VM sub Markov model, shown later
Ref: Dazhi Wang, MS Thesis
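Step 1 reduces each Markov submodel to a single equivalent failure rate and repair rate. The slides do not spell out the definition, so the sketch below uses a common aggregation as an assumption: the equivalent failure rate is the steady-state flow from up states to down states divided by P(up), and the equivalent repair rate is the flow from down states to up states divided by P(down). It is illustrated on the shared-repair two-component chain from earlier; all function names are hypothetical.

```python
def shared_repair_steady_state(lam, mu):
    """Closed-form steady state of the chain 2 -> 1 -> 0 with shared repair."""
    rho = lam / mu
    norm = 1 + 2 * rho + 2 * rho**2
    return {2: 1 / norm, 1: 2 * rho / norm, 0: 2 * rho**2 / norm}


def equivalent_rates(pi, up_states, rates):
    """Aggregate a CTMC into one up/down pair (assumed flow-based definition).

    pi: dict state -> steady-state probability
    rates: dict (source, destination) -> transition rate
    Returns (lambda_eq, mu_eq); MTTFeq = 1/lambda_eq and MTTReq = 1/mu_eq.
    """
    p_up = sum(pi[s] for s in up_states)
    p_down = 1.0 - p_up
    flow_down = sum(pi[s] * r for (s, d), r in rates.items()
                    if s in up_states and d not in up_states)
    flow_up = sum(pi[s] * r for (s, d), r in rates.items()
                  if s not in up_states and d in up_states)
    return flow_down / p_up, flow_up / p_down


if __name__ == "__main__":
    lam, mu = 0.1, 1.0  # illustrative values
    pi = shared_repair_steady_state(lam, mu)
    rates = {(2, 1): 2 * lam, (1, 0): lam, (1, 2): mu, (0, 1): mu}
    lam_eq, mu_eq = equivalent_rates(pi, up_states={2, 1}, rates=rates)
    print("MTTFeq =", 1 / lam_eq, " MTTReq =", 1 / mu_eq)
    # The two-state equivalent preserves the steady-state availability.
    print("availability:", mu_eq / (lam_eq + mu_eq), "vs", pi[2] + pi[1])
```

Because the up-to-down and down-to-up flows are equal in steady state, the equivalent two-state model has exactly the same availability as the original chain, which is what makes it usable one level up in the hierarchy.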
System flow review
[Figure: flow from the Markov submodels (CPU, Mem, NIC, Pwr, VMM, Coo, SAN) and their input parameter values, through the host fault subtree, to the VM sub Markov chain model and the system fault tree]
1) MTTF_eq/MTTR_eq of the Markov submodels
2) MTTF_eq/MTTR_eq of a host fault subtree
3) Use the MTTFeq and MTTReq values in the VM sub Markov model, shown later
Measure of interest: availability
VMware HA (High Availability)
• After a host failure is detected, the VM is restarted on the other host
VM migration
• After the host is repaired, the VM is moved back to the original host
[Figure: resource pool sequence: host failure, host failure detected, restart, host repair started (RP), host repaired (UP)]
Revisit: HA and VM migration
Availability model of virtualized two nodes system : VMsub model
Up state
Down state
Host failure
Host failure
detected
VM is restarted
on other host
Host is repaired
VM is migrated
Input Parameters values in Markov submodels
These values are for test execution only; real values are to be substituted
alpha_sp 2
lambda_cpu 2/8760
mu_cpu 2
lambda_mem 2/8760
mu_mem 2
lambda_net 2/8760
mu_net 2
mu2_net 1
lambda_pwr 2/8760
mu_pwr 2
mu2_pwr 1
lambda_coo 2/8760
lambda2_coo 2/8760
alpha_coo 1
mu2_coo 1
mu_coo 2
delta_vmm 1
b_vmm 0.9
beta_vmm 6
mu_vmm 1
lambda_vmm 1/2880
1) Compute MTTFeq and MTTReq of Markov submodels in the host fault subtree
Results from SHARPE execution
Submodel   MTTFeq            MTTReq
CPU        2.19050000e+003   5.00000000e-001
Memory     1.09550000e+003   5.00000000e-001
Network    2.19049994e+003   9.99942929e-001
Power      2.19000000e+003   4.99942929e-001
Cooler     4.38099977e+003   5.00399446e-001
VMM        2.88000000e+003   1.31666667e+000
2) Compute MTTFeq and MTTReq of host fault subtree model
MTTF equivalent : 3.49899881e+002
MTTR equivalent : 6.79629790e-001
They are used in VM sub Markov chain model in the next slides.
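The host-level values can be cross-checked from the submodel table. The sketch below assumes the host fails when any one of the six submodels fails (an OR gate over independent components), so the equivalent failure rate is the sum of the submodel failure rates and MTTReq follows from the product of the submodel availabilities. This aggregation rule is an assumption, not a formula stated in the slides, but it closely reproduces the reported numbers.

```python
# (MTTFeq, MTTReq) per submodel, copied from the SHARPE results table.
SUBMODELS = {
    "CPU": (2.19050000e+003, 5.00000000e-001),
    "Memory": (1.09550000e+003, 5.00000000e-001),
    "Network": (2.19049994e+003, 9.99942929e-001),
    "Power": (2.19000000e+003, 4.99942929e-001),
    "Cooler": (4.38099977e+003, 5.00399446e-001),
    "VMM": (2.88000000e+003, 1.31666667e+000),
}


def series_equivalent(submodels):
    """Equivalent MTTF/MTTR of independent components under an OR (series) gate."""
    availability = 1.0
    failure_rate = 0.0
    for mttf, mttr in submodels.values():
        availability *= mttf / (mttf + mttr)   # component steady-state availability
        failure_rate += 1.0 / mttf             # component equivalent failure rate
    mttf_eq = 1.0 / failure_rate
    # Flow balance: availability / mttf_eq = (1 - availability) / mttr_eq
    mttr_eq = mttf_eq * (1.0 - availability) / availability
    return mttf_eq, mttr_eq


if __name__ == "__main__":
    mttf_eq, mttr_eq = series_equivalent(SUBMODELS)
    print(f"MTTF equivalent : {mttf_eq:.8e}")
    print(f"MTTR equivalent : {mttr_eq:.8e}")
```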
Steady state availability of system fault tree
Steady state availability : 9.99770797e-001
Down time : 1.20469077e+002 (min)
Other examples index
Non-state space models
• Reliability Block Diagram (RBD): voice, IBM, NEC, CISCO
• Fault Tree: the same
• Reliability Graph
State space models
• Continuous Time Markov Chains (CTMC):
o Simple two components failure-recovery: Bluebook
o Other failure-recovery model: IBM, NEC-VDC
o Software rejuvenation
o Security model: SITAR
o UAV model
• Stochastic Petri nets/Stochastic Reward nets
o NEC-VDC, SITAR, NEC-warm rejuvenation, Disaster tolerance
Reliability models in practice
Availability models in practice
Expected interval availability
Stochastic Petri nets/SRN
[Figure: state-transition diagram of the availability model with only the first host failure. State labels include UUXUUX, UFaXUUX, UDaXUUX, UPaXUUX, UFvXUUX, UDvXUUX, UPvXUUX, FUXUUX, DXXUUR, DXXUUU, UXXFUU and their mirror images for the second host; the rate labels are not recoverable from the extracted text]
Model with only 1st Host Failure (IEEE-TR)
[Figure: state-transition diagram of the availability model that also includes a second host failure. It extends the previous diagram with states such as DXXFUU, DXXURR, DXXDUU, FUUDXX, URRDXX, DUUDXX, UXXURR, URRUXX, UXXUUR and UURUXX; the rate labels are not recoverable from the extracted text]
Model with 2nd Host Failure
Sensitivity of MTTF_VMs
Parameter   Model with only 1st Failure   Model with 2nd Failure
lambda_h    -1.99866136e+00               -1.99869163e+00
m_v          9.99931045e-01                5.20949060e-02
lambda_a     1.04508774e-03                1.04508774e-03
lambda_v     3.25669697e-05                3.25669696e-05
r_v         -2.00199738e-05               -2.24829226e-05
mu_v        -1.72413369e-05               -1.72413369e-05
The order of parameters in the sensitivity ranking is the same for both models
Obs: In the paper, the sensitivity values of some parameters were wrong because of a computation error that has since been detected and fixed. The fix only swaps r_v and mu_v in the ranking (6th -> 5th, 5th -> 6th).
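The slides do not define how the sensitivity values were computed. One common choice, assumed here, is the scaled sensitivity (θ / MTTF) · ∂MTTF/∂θ, which can be approximated with central finite differences. The sketch below applies it to a much simpler stand-in model, the MTTF of a two-component parallel system with repair, (3λ + μ) / (2λ²), and not to the VM models of the slides; all names and values are illustrative.

```python
def mttf_parallel_with_repair(lam, mu):
    """MTTF of a 2-component parallel system with repair from the one-failed state."""
    return (3 * lam + mu) / (2 * lam**2)


def scaled_sensitivity(f, params, name, rel_step=1e-6):
    """Approximate (theta / f) * df/dtheta with a central finite difference."""
    base = f(**params)
    up = dict(params)
    down = dict(params)
    up[name] = params[name] * (1 + rel_step)
    down[name] = params[name] * (1 - rel_step)
    derivative = (f(**up) - f(**down)) / (2 * rel_step * params[name])
    return derivative * params[name] / base


if __name__ == "__main__":
    params = {"lam": 0.001, "mu": 1.0}  # hypothetical rates per hour
    for name in params:
        s = scaled_sensitivity(mttf_parallel_with_repair, params, name)
        print(f"{name}: {s:+.6f}")
```

In this stand-in model the failure rate has a scaled sensitivity close to -2 and the repair rate close to +1, the same pattern as lambda_h and m_v in the first column of the table.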
Unavailability and MTTF_VMs
Metric           Model with only 1st Failure   Model with 2nd Failure
Unavailability   3.96688083e-10                4.01031499e-07
MTTF             3.85133268e+07                2.00774300e+06
Unavailability is larger and MTTF is smaller in the second model
SRN models
Refer to Trivedi's SPN lecture notes if necessary.
Open the SHARPE project file.
SPN model for WFS example
Modeling Practices using SHARPE
Performance models in practice
Performability models in practice
SPNP: Stochastic Petri Net Package.
Version 6.0
Outline - SPNP
Brief Intro. To SPNP
Download
Analytic Modeling using SPNP
• Stochastic Petri nets/Stochastic Reward nets
• Others
SPNP
Well-known modeling tool (installed at over 180 sites; companies & universities)
Ported to Most Architectures and Operating Systems
Used For Performance, Dependability and
Performability
Steady-State as well as Transient Analysis
Written in C Language
SPNP GUI supports animation
Stochastic Reward Nets
Bipartite graph with two kinds of nodes:
places and transitions
Timed and immediate transitions
Inhibitor arc, variable cardinality arc,
guard function, and priority
Reward rate based measures
Fluid stochastic Petri net: fluid places and
fluid arcs
Characteristics
Structural characteristics:
• Marking dependency
• Resampling policies, for general distributions
Stochastic characteristics:
• Allow definition of reward rates in terms of net level entities
• Automatically generate the reward rates for the markings
Characteristics (cont.)
Non-Markovian SPNs as well as Fluid Stochastic Petri Nets (FSPNs) can be described and solved
Besides the analytic-numeric solution of Markovian models, discrete-event simulation is now available
A user-friendly GUI is now available
Available Distribution functions
Transition Type
• Exponential
• Deterministic (Constant)
• Uniform(a,b)
• Geometric(p)
• Weibull(a,b)
• Lognormal
• Truncated Normal
• Gamma
• Beta
• Poisson
• Binomial
• Negative Binomial
• 2-stage Hyperexponential
• 3-stage Hypoexponential
• Erlang
• Pareto
• Cauchy
Solution Techniques
[Figure: solution flow for an SRN model, which may be a Markovian net, a non-Markovian net or an FSPN.
Analytic-numeric path: build the Extended Reachability Graph (ERG), generate all markings (tangible + vanishing), generate reward rates for tangible markings.
Discrete Event Simulation (DES) path: make correspondence list, schedule transition events, schedule fluid events, change states, modify clock, compute statistics and confidence interval.]
Steps in Analytic Numeric Analysis of Markovian SRN
1. Stochastic Reward Net to Extended Reachability Graph: generate all markings of the SRN by considering all the enabled transitions in each marking, and classify the markings as tangible or vanishing.
2. Extended Reachability Graph to Markov Reward Model: eliminate the vanishing markings.
3. Solve the MRM for steady-state and transient behavior using known methods.
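The marking-generation step can be illustrated with a breadth-first search over a tiny net. The sketch below is not SPNP code: it uses a hypothetical two-place net (places `up` and `down`, timed transitions `fail` and `repair`) with no immediate transitions, so every marking is tangible and there is nothing to eliminate.

```python
from collections import deque

# Hypothetical net: tokens move from "up" to "down" on failure and back on repair.
PLACES = ("up", "down")
# Each transition: (name, input place, output place), one token per arc.
TRANSITIONS = (("fail", "up", "down"), ("repair", "down", "up"))


def enabled(marking):
    """Return the transitions whose input place holds at least one token."""
    return [t for t in TRANSITIONS if marking[PLACES.index(t[1])] > 0]


def fire(marking, transition):
    """Return the marking reached by firing an enabled transition."""
    _, src, dst = transition
    tokens = list(marking)
    tokens[PLACES.index(src)] -= 1
    tokens[PLACES.index(dst)] += 1
    return tuple(tokens)


def reachability_graph(initial):
    """Explore all markings reachable from the initial marking (breadth-first)."""
    seen = {initial}
    arcs = []
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for transition in enabled(marking):
            nxt = fire(marking, transition)
            arcs.append((marking, transition[0], nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, arcs


if __name__ == "__main__":
    markings, arcs = reachability_graph((2, 0))  # two tokens initially in "up"
    for src, name, dst in arcs:
        print(src, f"--{name}-->", dst)
```

Attaching a rate to each arc of the resulting graph gives the underlying CTMC; for this net it has the same three-state shape as the two-component availability model.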
Discrete Event Simulation
FSPN (Fluid Stochastic Petri net)
Used as a model for: • Systems involving fluid variables
• Approximation of models with a large number of tokens
No need to generate the reachability graph
Possibility to give the number of replications or the desired relative error.
SRN models in SPNP
Some examples
References
K. Trivedi’s lecture notes on SHARPE demonstration
SHARPE portal
SPNP site at K. Trivedi’s homepage.