-
Formal Methods meets Machine Learning: Explorations in Cyber-Physical System Design
Sanjit A. Seshia
Professor, UC Berkeley
Joint work with: Jyo Deshmukh, Tommaso Dreossi, Alex Donze, Dorsa Sadigh, Susmit Jha, Xiaoqing Jin, Tomoyuki Kaga, Tomoya Yamaguchi, S. Shankar Sastry
Stanford University, December 4, 2017
-
Cyber-Physical Systems (CPS): Integration of computation with physical processes, defined by both cyber & physical [E. A. Lee]
Application domains:
  Power generation and distribution
  Military systems
  Transportation (air traffic control at SFO)
  Avionics
  Telecommunications
  Instrumentation (Soleil Synchrotron)
  Automotive
  Building systems
  Factory automation
Image credits: Kuka Robotics Corp.; Doug Schmidt; General Electric; E-Corner, Siemens; Daimler-Chrysler
Automotive domain representative of key societal challenges:
  Smart Cities / Infrastructure
  Energy Efficiency
  Climate Change
  Humans and Automation
-
Growing Use of Machine Learning/AI in Cyber-Physical Systems
Many Safety-Critical Systems
-
Growing Features, Growing Costs
  70 to 100 ECUs in modern luxury cars, close to 100M LOC
  Engine control: 1.7M LOC
  F-22 Raptor: 1.7M LOC; Boeing 787: 6.5M LOC
  Frost & Sullivan: 200M to 300M LOC
  Electronics & software: 35-40% of luxury car cost
[Chart from J. Deshmukh; time axis: 1988, 1997, 2002, 2009]
Source: Charette, R., This Car Runs on Code, IEEE Spectrum, http://spectrum.ieee.org/transportation/systems/this-car-runs-on-code
High Cost of Failures
  Safety-critical: human life at risk
  Recalls, production delays, lawsuits, etc.
  Toyota unintended acceleration (UA): $1.2B settlement with DoJ in 2014, lawsuits
  Tesla autopilot incidents: reasons still unclear
-
Formal Methods to the Rescue?
Industry need for higher assurance: increasing interest in Formal Methods
Formal methods = mathematical, algorithmic techniques for modeling, design, analysis
  Specification: WHAT the system must/must not do
  Verification: WHY it meets the spec (or not)
  Synthesis: HOW it meets the spec (correct-by-construction design)
Major success story: digital circuit design
Can we address the challenges of CPS design?
-
Formal Methods meets Machine Learning
Machine Learning → Formal Methods
  Greater efficiency, ease of use/applicability
  Formal Inductive Synthesis
Formal Methods → Machine Learning
  Stronger assurances of safety/correctness for learning systems
Further details:
1. S. A. Seshia, Combining Induction, Deduction, and Structure for Verification and Synthesis, Proceedings of the IEEE, November 2015.
2. S. A. Seshia, D. Sadigh, and S. S. Sastry, Towards Verified Artificial Intelligence, July 2016, http://arxiv.org/abs/1606.08514
-
Outline
Synthesizing Requirements for Closed-Loop Control Systems
  Industrial Tech Transfer to Toyota
Falsification of Deep Learning based CPS
  Context: autonomous driving
Conclusion
-
Mining Requirements for Closed-Loop Control Systems
[Jin, Donze, Deshmukh, Seshia, HSCC 2013, TCAD 2015; Yamaguchi et al., FMCAD 2016]
-
Challenges for Verification of Control Systems
Closed-loop setting very complex
  software + physical artifacts
  nonlinear dynamics
  large look-up tables
  large amounts of switching
Requirements incomplete/informal
  Specifications often created concurrently with the design!
  Designers often only have informal intuition about what is "good behavior"
  "shape recognition"
[Figure: Experimental Engine Control Model]
Mining Requirements from Closed-Loop Models
-
Industry Problem: Applying Formal Methods to Legacy Systems
"It's working, but I don't understand why!"
Our Solution: Requirements Mining
Value added by mining:
  Mined requirements become useful documentation
  Use for code maintenance and revision
  Use during tuning and testing
-
Designer's View of Our Solution
Tool extracts properties of closed-loop design using a simulator
Designer reviews mined requirements
  "Settling time is 6.25 ms"
  "Overshoot is 100 units"
  Expressed in Signal Temporal Logic [Maler & Nickovic, 04]
[Figure: closed-loop response annotated with 6.25 ms and 100]
-
CounterExample Guided Inductive Synthesis (CEGIS)
Find "Tightest" Properties
  Template: Settling Time is ??, Overshoot is ??, Upper Bound on x is ??
  Candidate: Settling Time is 5 ms, Overshoot is 5 KPa, Upper Bound on x is 3.6
Are there behaviors that do NOT satisfy these requirements?
[Figure: Experimental Engine Control Model]
[Jin, Donze, Deshmukh, Seshia, HSCC 13; TCAD 15]
Idea of CEGIS goes back to ASPLOS 2006 paper by Solar-Lezama et al.
-
CounterExample Guided Inductive Synthesis (CEGIS)
Find "Tightest" Properties
  Template: Settling Time is ??, Overshoot is ??, Upper Bound on x is ??
  Candidate: Settling Time is 5.3 ms, Overshoot is 5.1 KPa, Upper Bound on x is 3.8
Are there behaviors that do NOT satisfy these requirements?
  Counterexamples (fed back to the property-finding step)
[Figure: Experimental Engine Control Model]
-
CounterExample Guided Inductive Synthesis
Find "Tightest" Properties
  Template: Settling Time is ??, Overshoot is ??, Upper Bound on x is ??
  Candidate: Settling Time is 6.3 ms, Overshoot is 5.6 KPa, Upper Bound on x is 4.1
Are there behaviors that do NOT satisfy these requirements?
  Counterexamples (fed back to the property-finding step)
  NO: the candidate becomes the mined requirement
Mined Requirement: Settling Time is 6.3 ms, Overshoot is 5.6 KPa, Upper Bound on x is 4.1
[Figure: Experimental Engine Control Model]
-
CounterExample Guided Inductive Synthesis
Requirement templates: Parametric Signal Temporal Logic (PSTL)
Find "Tightest" Properties: Parameter Synthesis (exploits monotonicity)
  Template: Settling Time is ??, Overshoot is ??, Upper Bound on x is ??
  Candidate: Settling Time is 6.3 ms, Overshoot is 5.6 KPa, Upper Bound on x is 4.1
Are there behaviors that do NOT satisfy these requirements?: Optimization-based Falsification
  Counterexamples (fed back to the property-finding step)
  NO: the candidate becomes the mined requirement
Mined Requirement: Settling Time is 6.3 ms, Overshoot is 5.6 KPa, Upper Bound on x is 4.1
[Figure: Experimental Engine Control Model]
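The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' tool: `simulate` is a hypothetical stand-in for the closed-loop model, the only mined parameter is an upper bound `pi` on x, and the falsifier is a plain search over a fixed list of candidate inputs rather than an optimizer.

```python
def simulate(u):
    # Hypothetical stand-in for the closed-loop model: a decaying overshoot.
    return [u * (1.0 + 0.5 * 0.6 ** k) for k in range(20)]

def robustness(pi, trace):
    # Quantitative satisfaction of "always x <= pi" on one trace.
    return min(pi - x for x in trace)

def synthesize(traces, delta):
    # Tightest bound satisfied by every collected trace, padded by delta.
    return max(max(trace) for trace in traces) + delta

def falsify(pi, inputs):
    # Look for the candidate input whose trace violates the bound the most.
    worst = min(inputs, key=lambda u: robustness(pi, simulate(u)))
    return worst if robustness(pi, simulate(worst)) < 0 else None

def mine_upper_bound(seed_input, inputs, delta=0.01, max_iters=20):
    traces = [simulate(seed_input)]
    for iteration in range(1, max_iters + 1):
        pi = synthesize(traces, delta)      # find "tightest" property
        counterexample = falsify(pi, inputs)  # any violating behavior?
        if counterexample is None:
            return pi, iteration            # NO: pi is the mined requirement
        traces.append(simulate(counterexample))
    raise RuntimeError("no requirement mined within the iteration budget")

candidate_inputs = [2.0 * i / 100 for i in range(101)]
mined_pi, iterations = mine_upper_bound(0.5, candidate_inputs)
print(mined_pi, iterations)
```

Each counterexample enlarges the trace set, so the synthesized bound can only loosen until the falsifier finds no violation.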
-
Signal Temporal Logic (STL)
Extension of Linear Temporal Logic (LTL) and Metric Temporal Logic (MTL)
Quantitative semantics: satisfaction of a property over a trace given real-valued interpretation
  Greater value: more easily satisfied
  Non-negative satisfaction value corresponds to Boolean satisfaction
Example: "For all time points between 60 and 100, the absolute value of x is below 0.1"
  $\mathbf{G}_{[60,100]}\,(|x| < 0.1)$
[Figure: signal x versus time t over [0, 100], with the band from -0.1 to +0.1 and t = 60 marked]
-
Quantitative Satisfaction Function for STL
Function $\rho$ that maps an STL formula $\varphi$ and a given trace (valuation of signals) to a numeric value
Example: for "|x| is below 0.1 at all times between 60 and 100", $\rho(\varphi, x) = \inf_{t \in [60,100]} (0.1 - |x(t)|)$
Quantifies "how much" a trace satisfies a property
  Large positive value: trace easily satisfies
  Small positive value: trace close to violating
  Negative value: trace does not satisfy
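As a small worked illustration of the example above, the following Python sketch evaluates the quantitative satisfaction value on a sampled trace. The function name and the two sample traces are hypothetical, and on sampled data the infimum reduces to a minimum over the samples inside the time window.

```python
def robustness_always_abs_below(trace, t_start, t_end, bound):
    """Satisfaction value of "for all t in [t_start, t_end], |x(t)| < bound".

    trace is a list of (t, x) samples; the result is min(bound - |x|)
    over the samples that fall inside the window.
    """
    window = [bound - abs(x) for t, x in trace if t_start <= t <= t_end]
    if not window:
        raise ValueError("trace has no samples in the time window")
    return min(window)

# Hypothetical traces: one settles to 0.05 before t = 60, one spikes at t = 80.
settled = [(t, 1.0 if t < 50 else 0.05) for t in range(0, 101)]
violating = [(t, 0.2 if t == 80 else 0.0) for t in range(0, 101)]

rho_settled = robustness_always_abs_below(settled, 60, 100, 0.1)
rho_violating = robustness_always_abs_below(violating, 60, 100, 0.1)
print(rho_settled, rho_violating)
```

The first value is positive (the trace satisfies the property with some margin) and the second is negative (the spike at t = 80 violates it).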
-
Parametric Signal Temporal Logic (PSTL)
Constants in STL formula replaced with parameters
  Scale parameters
  Time parameters
Examples:
  Between some time $\tau$ and 10 seconds, x remains greater than some value $\pi$
  After transmission shifts to gear 2, it remains in gear 2 for at least $\tau$ secs
-
Parameter Synthesis = Find $\delta$-tight values of params (for suitably small $\delta$)
Find "Tightest" Properties
Want the value of $\pi$ corresponding to the tightest satisfaction over a set of traces
[Figure: signal x over time from 0 to 100, with the levels 2.9, 3 and 1 000 000 marked; the bound 1 000 000 is labeled "Too loose"]
-
Parameter Synthesis
Find "Tightest" Properties
Non-linear optimization problem
  Satisfaction function for STL is non-linear in general
Naive (strawman) approach:
  grid parameter space to $\delta$ precision
  evaluate satisfaction value at each point
  pick valuation with smallest non-negative satisfaction value
Problem: Exponential number of grid points (in #parameters)
-
Satisfaction Monotonicity
Find "Tightest" Properties
If upper bound of all signals is 3, any number > 3 is also an upper bound
Satisfaction function monotonic in parameter value
Example: $\rho(\pi, x) = \inf_t (\pi - x(t))$
  For all x, $\rho(\pi, x)$ is a monotonic function of $\pi$
Advantage: If monotonic, use binary search over parameter space, otherwise exhaustive search
[Figure: signal x over time from 0 to 100 (tick at 50), with the candidate bounds 3 and 4 marked]
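The binary search mentioned above can be sketched as follows for the upper-bound example. This is an illustration with hypothetical traces and function names, assuming a single parameter whose satisfaction is monotonically increasing; it returns a satisfying bound that lies within `delta` of a violating one.

```python
def satisfies_all(pi, traces):
    # True when every trace satisfies "always x <= pi" (value >= 0).
    return all(min(pi - x for x in trace) >= 0 for trace in traces)

def tightest_upper_bound(traces, lower, upper, delta):
    # Invariant during the search: `upper` satisfies, `lower` does not.
    if not satisfies_all(upper, traces):
        raise ValueError("initial upper value does not satisfy the traces")
    if satisfies_all(lower, traces):
        return lower
    while upper - lower > delta:
        middle = (lower + upper) / 2.0
        if satisfies_all(middle, traces):
            upper = middle   # still satisfied: tighten from above
        else:
            lower = middle   # violated: the tight value lies above
    return upper

# Hypothetical traces whose true maximum is 3.0.
traces = [[1.0, 2.5, 2.9], [0.5, 3.0, 2.0]]
pi_tight = tightest_upper_bound(traces, lower=0.0, upper=1_000_000.0, delta=0.01)
print(pi_tight)
```

Starting from the very loose bound 1 000 000, the search needs about 27 halvings to reach precision 0.01, instead of enumerating a grid of 100 million points.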
-
Deciding Satisfaction Monotonicity
Find "Tightest" Properties
Need to decide whether: For all x, $\rho(\pi, x)$ is a monotonic function of $\pi$
Theorem: Deciding monotonicity of a PSTL formula is undecidable
Use an encoding to satisfiability modulo theories (SMT) solving
  Quantified formulas involving uninterpreted functions, and arithmetic over reals
  Linear arithmetic if PSTL predicates are linear
  "Solved easily" with Z3
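One way to write the condition being checked, stated here as an assumption about the encoding rather than a quote from the paper, uses Boolean satisfaction for the increasing case. The notation $x \models \varphi(\pi)$ (trace x satisfies the formula with parameter value $\pi$) is introduced for illustration. The solver is asked whether the negation has a model; an "unsat" answer establishes monotonicity, and since the problem is undecidable in general the solver may also fail to answer.

```latex
% Satisfaction is monotonically increasing in \pi:
\forall x,\ \forall \pi, \pi' :\quad
  \pi \le \pi' \;\wedge\; x \models \varphi(\pi)
  \;\Longrightarrow\; x \models \varphi(\pi')

% Query sent to the solver (monotonic if unsatisfiable):
\exists x,\ \pi,\ \pi' :\quad
  \pi \le \pi' \;\wedge\; x \models \varphi(\pi)
  \;\wedge\; x \not\models \varphi(\pi')
```

The decreasing case is symmetric, with $\pi \ge \pi'$ in place of $\pi \le \pi'$.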
-
Black-Box Falsification Procedure
Are there behaviors that do NOT satisfy these requirements?
[Figure: the Falsification Tool supplies an input u to the model and observes the output trace S(u)]
-
Falsification as Optimization
Are there behaviors that do NOT satisfy these requirements?
Solve $\rho^* = \min_u \rho(\varphi, S(u))$
  Leverages quantitative semantics of STL
  Relies on standard numerical optimization methods (e.g. Nelder-Mead)
If $\rho^* < 0$, found falsifying trace!
Nonlinear optimization problem, no exact solution, limited theoretical guarantees
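A minimal sketch of the idea, under loud assumptions: the model `simulate`, the property "always x <= 2.5", and the search routine are all hypothetical, and the optimizer is a simple derivative-free step-halving descent used in place of Nelder-Mead to keep the example dependency-free.

```python
def simulate(u):
    # Hypothetical model S(u): a trace with a decaying overshoot scaled by u.
    return [u * (1.0 + 0.5 * 0.6 ** k) for k in range(20)]

def rho(u, bound=2.5):
    # Quantitative satisfaction of "always x <= bound" on the trace S(u).
    return min(bound - x for x in simulate(u))

def falsify_by_descent(objective, u0, lower, upper, step=0.5, min_step=1e-3):
    # Minimize the objective over [lower, upper]; stop once it goes negative.
    u, best = u0, objective(u0)
    while best >= 0 and step >= min_step:
        neighbours = [min(upper, u + step), max(lower, u - step)]
        candidate = min(neighbours, key=objective)
        value = objective(candidate)
        if value < best:
            u, best = candidate, value
        else:
            step /= 2.0  # no improvement: refine the search locally
    return u, best

u_star, rho_star = falsify_by_descent(rho, u0=1.0, lower=0.0, upper=2.0)
print(u_star, rho_star)
```

A negative `rho_star` means `simulate(u_star)` is a falsifying trace. A non-negative result proves nothing, which matches the slide's caveat about limited theoretical guarantees.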
-
Experimental Evaluation Summary [details in TCAD 15 paper]
Defined templates for common requirements in automotive control (all monotonic PSTL!!)
  Dwell-time requirements
  Timed/untimed safety properties
  Timed inevitability (bounded liveness)
  Input profiles: assumptions on shape of input signals
  Control-theoretic requirements on output signals (bounded overshoot/undershoot, settling time, error from reference signal, etc.)
Three benchmarks
  Simple Simulink Automatic Transmission Model
  Toyota HSCC 14 Challenge Air-Fuel Ratio controller
  Toyota Experimental Diesel Engine Airpath controller
-
Results on Industrial Airpath Controller
Found max overshoot with 7000+ simulations in 13 hours
Attempt to mine maximum observed settling time:
  stops after 4 iterations
  gives answer t_settle = simulation time horizon
[Figure: trace from the Experimental Engine Control Model showing this behavior]
[Jin, Donze, Deshmukh, Seshia, HSCC 2013]
-
Mining can expose deep bugs
Uncovered a tricky bug
  Discussion with control designer revealed it to be a real bug
  Root cause identified as wrong value in a look-up table; bug was fixed
Why mining could be useful for bug-finding:
  Can uncover subtle relations that should not hold
  Looking for bugs: mine for negation of bug
-
Bug fixed: settling time successfully mined
[Figure: traces labeled OLD and NEW]
-
Industrial Case Studies with Toyota
Work with group @ Toyota Japan on enabling software verification by mining specifications on the closed-loop system
Useful in a production setting:
  Finds issues where previous methods fell short!
  Reduced human effort by 70%
[Yamaguchi et al., FMCAD 16]
-
Toyota Unit's Experience with Model Checking
[Chart: work time per activity: making model, making property, revising property, revising model, executing model checking, mapping counterexample; legible labels: 1 trial, 7 trials, 70 min, 40 min]
Total work hours: 560 min
Making/revising property: 110 min
Mapping counterexample: 280 min
for just 1 module
[Yamaguchi et al., FMCAD 16]
-
Overview of Methodology
1. Pre-condition mining (Breach): pre-condition for software module
2. Software model checking (SLDV/CBMC): module-level counterexample
3. Simulation-based verification (Breach): system-level counterexample
[Figure: closed loop of controller, sensor and plant; the controller contains several modules, including the target module, with in/out signals such as Pedal, Brake, WaterTemp and AirTemp]
[Yamaguchi et al., FMCAD 16]
-
From CEGIS to Oracle-Guided Inductive Synthesis
Inductive Synthesis: Learning from Examples (ML)
Formal Inductive Synthesis: Learn from Examples while satisfying a Formal Specification
General Approach: Oracle-Guided Learning
  Combine Learner with Oracle (e.g., Verifier) that answers Learner's Queries
[Figure: LEARNER sends a query to ORACLE; ORACLE returns a response]
[Jha & Seshia, A Theory of Formal Synthesis via Inductive Learning, 2015, Acta Informatica 2017.]
-
Falsification of Cyber-Physical Systems with Machine Learning Components
T. Dreossi, A. Donze, and S. A. Seshia. Compositional Falsification of Cyber-Physical Systems with Machine Learning Components. In NASA Formal Methods Symposium, May 2017.
-
Challenges for Verified AI
[Figure: a verifier takes System S, Environment E and Specification $\varphi$ and answers "Does S || E satisfy $\varphi$?" with YES [+ proof] or NO [+ counterexample]]
Design Correct-by-Construction instead?
Counterexamples, etc. from Rich Signal Spaces?
S. A. Seshia, D. Sadigh, S. S. Sastry. Towards Verified Artificial Intelligence. July 2016. https://arxiv.org/abs/1606.08514.
-
Problem: Verify Automotive System (CPS) that uses ML-based Perception
Focus:
  Falsification: finding scenarios that violate safety properties
  Test (Data) Generation: generate "interesting" data for training / testing, to improve accuracy
  Deep Neural Networks, given the increasing interest and use in the automotive context
[Figure: closed loop of Environment, Sensor Input, Learning-Based Perception, Controller and Plant]
-
Automatic Emergency Braking System (AEBS)
Goal: Brake when an obstacle is near, to maintain a minimum safety distance
  Controller, Plant, Env models in Matlab/Simulink
Object detection/classification system based on deep neural networks
  Inception-v3, AlexNet, trained on ImageNet
  more recent: squeezeDet, Yolo, trained on KITTI
[Figure: closed loop of Environment, Deep Learning-Based Object Detection, AEBS Controller and Plant]
-
Our Approach: Use a System-Level Specification
Verify the Deep Neural Network Object Detector
Verify the System containing the Deep Neural Network
Formally Specify the End-to-End Behavior of the System
STL Formula: G (dist(ego vehicle, env object) > $\Delta$), with $\Delta$ the minimum safety distance
[Figure: closed loop of Environment, Learning-Based Perception, Controller and Plant]
-
Approach: Simulation-Based Falsification
Challenge: Very High Dimensionality of Input Space!
Standard solution: Use Compositional (Modular) Verification
However: no formal spec. for neural network component!
Compositional Verification without Compositional Specification?!!
-
Our Approach: Combine Temporal Logic CPS Falsifier with ML Analyzer
CPS Falsifier uses abstraction of ML component
  Optimistic analysis: assume ML classifier is always correct
  Pessimistic analysis: assume classifier is always wrong
Difference is the region of uncertainty where output of the ML component "matters"
Compositional: CPS Falsifier and ML Analyzer can be designed and run independently (& communicate)!
[Figure: the CPS Falsifier takes the Spec and Model and passes the Region of Uncertainty to the ML Analyzer; the ML Analyzer returns interesting sensor data (images) that may cause an error]
-
Identifying Region of Uncertainty (ROU) for Automatic Emergency Braking System
Perform Optimistic and Pessimistic Analyses on the Deep Neural Network
[Figure: three panels: ML always correct; ML always wrong; potentially unsafe region (ROU) depending on ML component (yellow)]
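The two analyses can be illustrated with a toy braking model. Everything in this sketch is a hypothetical assumption made for illustration (the kinematics, the deceleration, the time horizon and the safety margin); it is not the Simulink AEBS model from the talk.

```python
def safe_if_detected(distance, speed, decel=8.0, margin=2.0):
    # Optimistic abstraction: the detector is always correct, so the car
    # brakes immediately and stops after speed^2 / (2 * decel) metres.
    stopping_distance = speed * speed / (2.0 * decel)
    return distance - stopping_distance > margin

def safe_if_missed(distance, speed, horizon=5.0, margin=2.0):
    # Pessimistic abstraction: the detector is always wrong, so the car
    # never brakes during the time horizon.
    return distance - speed * horizon > margin

def classify(distance, speed):
    if not safe_if_detected(distance, speed):
        return "unsafe"  # violates the spec even with perfect perception
    if safe_if_missed(distance, speed):
        return "safe"    # satisfies the spec even with failed perception
    return "rou"         # outcome depends on the ML component

# Grid of initial conditions: distance in metres, speed in metres per second.
grid = [(d, v) for d in range(0, 101, 10) for v in range(0, 31, 5)]
region_of_uncertainty = [(d, v) for d, v in grid if classify(d, v) == "rou"]
print(len(region_of_uncertainty), "of", len(grid), "grid points lie in the ROU")
```

Only the points labeled "rou" need to be handed to the ML analyzer, which is what makes the decomposition pay off.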
-
Machine Learning Analyzer
Systematically Explore ROU in the Image (Sensor) Space
[Figure: an abstraction map relates the abstract space A (axes: brightness, car x-pos, car z-pos) to the feature space of images; the neural network classifies each sampled image]
Systematic Sampling (low-discrepancy sampling)
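As one concrete example of low-discrepancy sampling, the sketch below generates Halton points in a three-dimensional unit cube whose axes stand for brightness, car x-pos and car z-pos. The choice of the Halton sequence and the function names are assumptions made for illustration; the slide does not say which sequence is used.

```python
def radical_inverse(index, base):
    # Reflect the base-`base` digits of `index` about the radix point.
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result

def halton(count, bases=(2, 3, 5)):
    # One coprime base per dimension; indices start at 1 to skip the origin.
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, count + 1)]

# Each point is (brightness, car x-pos, car z-pos), all scaled to [0, 1).
samples = halton(8)
for brightness, x_pos, z_pos in samples[:3]:
    print(round(brightness, 3), round(x_pos, 3), round(z_pos, 3))
```

Unlike independent random draws, consecutive Halton points fill the cube evenly, so a small number of rendered images already covers the abstract space.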
-
Sample Result
Inception-v3 Neural Network (pre-trained on ImageNet using TensorFlow)
[Figure: misclassifications]
This misclassification may not be of concern
-
Sample Result
Inception-v3 Neural Network (pre-trained on ImageNet using TensorFlow)
[Figure: misclassifications, with a corner case image highlighted]
But this one is a real hazard!
-
Image Streams
Superimposition of tests on background
Blind spots
Results on squeezeDet NN and KITTI dataset for autonomous driving
[Dreossi, Ghosh, et al., ICML 2017 workshop]
-
Verifier-Guided Training of Deep Neural Networks
Instance of Oracle-Guided Inductive Synthesis
Oracle is Verifier (CPS+ML Falsifier) used to perform counterexample-guided training of DNNs
Substantially increase accuracy with only few additional examples
[Figure: loop between DEEP NEURAL NETWORK and FALSIFIER (CPS + ML), exchanging the Learned Classifier]
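The counterexample-guided training loop can be illustrated on a deliberately tiny problem. This sketch is a toy under stated assumptions: the "network" is a one-dimensional threshold classifier, `ground_truth` is a hypothetical labeling oracle, and the falsifier simply scans a fixed list of candidate inputs.

```python
def ground_truth(x):
    # Hypothetical oracle label: obstacle present when the feature is >= 0.7.
    return 1 if x >= 0.7 else 0

def train(samples):
    # Toy learner: threshold midway between the largest negative example
    # and the smallest positive example.
    largest_negative = max(x for x, label in samples if label == 0)
    smallest_positive = min(x for x, label in samples if label == 1)
    return (largest_negative + smallest_positive) / 2.0

def predict(threshold, x):
    return 1 if x >= threshold else 0

def falsify(threshold, candidates):
    # Stand-in for the falsifier: every candidate the classifier gets wrong.
    return [(x, ground_truth(x)) for x in candidates
            if predict(threshold, x) != ground_truth(x)]

def counterexample_guided_training(samples, candidates, max_rounds=10):
    samples = list(samples)
    for round_number in range(1, max_rounds + 1):
        threshold = train(samples)
        counterexamples = falsify(threshold, candidates)
        if not counterexamples:
            return threshold, round_number
        samples.extend(counterexamples)  # augment the training set
    return train(samples), max_rounds

candidates = [i / 20 for i in range(21)]
threshold, rounds = counterexample_guided_training([(0.1, 0), (0.9, 1)], candidates)
print(threshold, rounds)
```

Starting from two labeled examples, the loop adds only the misclassified points and stops once the falsifier finds none.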
-
Conclusion: Formal Methods meets Machine Learning
Formal Methods can play an important role in CPS design with high assurance
  Industrial scale and machine learning pose particular challenges
Machine Learning → Formal Methods
  Formal Inductive Synthesis (of specifications, programs, etc.)
Formal Methods → Machine Learning
  Compositional reasoning about learning-based systems
-
Towards Verified Learning-based CPS
S. A. Seshia, D. Sadigh, S. S. Sastry. Towards Verified Artificial Intelligence. July 2016. https://arxiv.org/abs/1606.08514.
Challenges, each with its corresponding Principle:
1. Environment (incl. Human) Modeling: Data-Driven, Introspective Environment Modeling
2. Specification: System-Level Specification; Robustness/Quantitative Spec.
3. Learning Systems Complexity: Abstract & Explain
4. Efficient Training, Testing, Verification: Verification-Guided, Adversarial Analysis and Improvisation
5. Design for Correctness: Formal Inductive Synthesis
Exciting Times Ahead!!! Thank you!