Formal Methods meets Machine Learning: Explorations in Cyber-Physical System Design. Sanjit A. Seshia, Professor, UC Berkeley. Joint work with: Jyo Deshmukh, Tommaso Dreossi, Alex Donze, Dorsa Sadigh, Susmit Jha, Xiaoqing Jin, Tomoyuki Kaga, Tomoya Yamaguchi, S. Shankar Sastry. Stanford University, December 4, 2017

Jun 05, 2018

Transcript
  • Formal Methods meets Machine Learning: Explorations in Cyber-Physical System Design

    Sanjit A. Seshia, Professor, UC Berkeley

    Joint work with: Jyo Deshmukh, Tommaso Dreossi, Alex Donze, Dorsa Sadigh, Susmit Jha, Xiaoqing Jin, Tomoyuki Kaga, Tomoya Yamaguchi, S. Shankar Sastry

    Stanford University, December 4, 2017

  • Cyber-Physical Systems (CPS): Integration of computation with physical processes, defined by both cyber & physical

    Example domains [E. A. Lee]: power generation and distribution; military systems; transportation (air traffic control at SFO); avionics; telecommunications; instrumentation (Soleil Synchrotron); automotive; building systems; factory automation

    Image credits: Kuka Robotics Corp.; Doug Schmidt; General Electric; E-Corner, Siemens; Daimler-Chrysler

    Automotive domain is representative of key societal challenges: Smart Cities / Infrastructure; Energy Efficiency; Climate Change; Humans and Automation

  • Growing Use of Machine Learning/AI in Cyber-Physical Systems

    S. A. Seshia 3

    Many Safety-Critical Systems

  • Growing Features, Growing Costs

    70 to 100 ECUs in modern luxury cars, close to 100M LOC
    Engine control: 1.7M LOC
    F-22 Raptor: 1.7M LOC, Boeing 787: 6.5M LOC
    Frost & Sullivan: 200M to 300M LOC
    Electronics & Software: 35-40% of luxury car cost

    [Figure: trend charts with time axis 1988, 1997, 2002, 2009; from J. Deshmukh]

    Charette, R., This Car Runs on Code, IEEE Spectrum, http://spectrum.ieee.org/transportation/systems/this-car-runs-on-code

    High Cost of Failures
      Safety-critical: human life at risk
      Recalls, production delays, lawsuits, etc.
      Toyota UA: $1.2B settlement with DoJ in 2014, lawsuits
      Tesla autopilot incidents: reasons still unclear

  • Formal Methods to the Rescue?

    Industry need for higher assurance: increasing interest in Formal Methods

    Formal methods = Mathematical, Algorithmic techniques for modeling, design, analysis
      Specification: WHAT the system must/must not do
      Verification: WHY it meets the spec (or not)
      Synthesis: HOW it meets the spec (correct-by-construction design)

    Major success story: Digital circuit design
    Can we address the challenges of CPS design?

  • Formal Methods meets Machine Learning

    Machine Learning → Formal Methods
      Greater efficiency, ease of use/applicability
      Formal Inductive Synthesis

    Formal Methods → Machine Learning
      Stronger assurances of safety/correctness for learning systems

    Further details:
    1. S. A. Seshia, Combining Induction, Deduction, and Structure for Verification and Synthesis, Proceedings of the IEEE, November 2015.
    2. S. A. Seshia, D. Sadigh, and S. S. Sastry, Towards Verified Artificial Intelligence, July 2016, http://arxiv.org/abs/1606.08514

  • Outline

    Synthesizing Requirements for Closed-Loop Control Systems
      Industrial Tech Transfer to Toyota

    Falsification of Deep Learning based CPS
      Context: autonomous driving

    Conclusion

  • Mining Requirements for Closed-Loop Control Systems

    [Jin, Donze, Deshmukh, Seshia, HSCC 2013, TCAD 2015; Yamaguchi et al., FMCAD 2016]

  • Challenges for Verification of Control Systems

    Closed-loop setting very complex
      software + physical artifacts
      nonlinear dynamics
      large look-up tables
      large amounts of switching

    Requirements Incomplete/Informal
      Specifications often created concurrently with the design!
      Designers often only have informal intuition about what is good behavior
      shape recognition

    [Figure: Experimental Engine Control Model]

    Mining Requirements from Closed-Loop Models 9

  • Industry Problem: Applying Formal Methods to Legacy Systems

    "It's working, but I don't understand why!"

    Our Solution: Requirements Mining

    Value added by mining:
      Mined requirements become useful documentation
      Use for code maintenance and revision
      Use during tuning and testing

  • Designer's View of Our Solution

    Tool extracts properties of closed-loop design using a Simulator

    Designer reviews mined requirements
      Settling time is 6.25 ms
      Overshoot is 100 units
      Expressed in Signal Temporal Logic [Maler & Nickovic, '04]

    [Figure: step response annotated with settling time 6.25 ms and overshoot 100]

  • CounterExample Guided Inductive Synthesis (CEGIS)

    Requirement template: Settling Time is ??, Overshoot is ??, Upper Bound on x is ??

    Find "Tightest" Properties: Settling Time is 5 ms, Overshoot is 5 kPa, Upper Bound on x is 3.6

    Are there behaviors that do NOT satisfy these requirements?

    [Figure: loop around the Experimental Engine Control Model]

    [Jin, Donze, Deshmukh, Seshia, HSCC'13; TCAD'15]

    Idea of CEGIS goes back to ASPLOS 2006 paper by Solar-Lezama et al.

  • CounterExample Guided Inductive Synthesis (CEGIS)

    Requirement template: Settling Time is ??, Overshoot is ??, Upper Bound on x is ??

    Are there behaviors that do NOT satisfy these requirements? Counterexamples are returned to the "Find Tightest Properties" step.

    Find "Tightest" Properties (updated candidate): Settling Time is 5.3 ms, Overshoot is 5.1 kPa, Upper Bound on x is 3.8

    [Figure: loop around the Experimental Engine Control Model]

  • CounterExample Guided Inductive Synthesis

    Requirement template: Settling Time is ??, Overshoot is ??, Upper Bound on x is ??

    Find "Tightest" Properties: Settling Time is 6.3 ms, Overshoot is 5.6 kPa, Upper Bound on x is 4.1

    Are there behaviors that do NOT satisfy these requirements? Counterexamples are fed back until the answer is NO.

    Mined Requirement: Settling Time is 6.3 ms, Overshoot is 5.6 kPa, Upper Bound on x is 4.1

    [Figure: loop around the Experimental Engine Control Model]

  • CounterExample Guided Inductive Synthesis

    Components of the loop:
      Requirement template (Settling Time is ??, Overshoot is ??, Upper Bound on x is ??): Parametric Signal Temporal Logic (PSTL)
      Find "Tightest" Properties: Parameter Synthesis (exploits monotonicity)
      Are there behaviors that do NOT satisfy these requirements?: Optimization-based Falsification

    Counterexamples are fed back until the answer is NO.

    Mined Requirement: Settling Time is 6.3 ms, Overshoot is 5.6 kPa, Upper Bound on x is 4.1

    [Figure: loop around the Experimental Engine Control Model]
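    The loop on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' tool: `simulate` is a hypothetical stand-in for the closed-loop model, the only requirement mined is an upper bound on x, and the falsifier is plain random sampling rather than a real optimizer.

```python
import random

def simulate(u, steps=40):
    """Hypothetical closed-loop model: output rises geometrically toward input level u."""
    return [u * (1.0 - 0.8 ** k) for k in range(steps)]

def synthesize(traces, margin):
    """Parameter synthesis: tightest upper bound (plus a small margin) over the traces seen so far."""
    return max(max(tr) for tr in traces) + margin

def falsify(bound, rng, budget=300):
    """Falsification by random sampling: look for a trace violating G(x < bound)."""
    for _ in range(budget):
        trace = simulate(rng.uniform(0.0, 3.0))
        if max(trace) >= bound:
            return trace  # counterexample
    return None  # none found within the budget (not a proof)

def mine_upper_bound(seed=0, margin=0.05, max_iters=100):
    rng = random.Random(seed)
    traces = [simulate(1.0)]  # initial trace
    for iteration in range(1, max_iters + 1):
        bound = synthesize(traces, margin)
        counterexample = falsify(bound, rng)
        if counterexample is None:
            return bound, iteration  # mined requirement: G(x < bound)
        traces.append(counterexample)
    raise RuntimeError("no fixed point within max_iters")

bound, iterations = mine_upper_bound()
print(f"mined requirement: x stays below {bound:.3f} (after {iterations} iterations)")
```

    Each counterexample raises the candidate bound by at least `margin`, so the loop terminates on this bounded toy model; the answer "NO" only means the sampler found nothing within its budget.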


  • Signal Temporal Logic (STL)

    Extension of Linear Temporal Logic (LTL) and Metric Temporal Logic (MTL)

    Quantitative semantics: satisfaction of a property over a trace given real-valued interpretation
      Greater value ⇒ more easily satisfied
      Non-negative satisfaction value ⇔ Boolean satisfaction

    Example: "For all time points between 60 and 100, the absolute value of x is below 0.1"
      $\varphi = \mathbf{G}_{[60,100]}(|x| < 0.1)$

    [Figure: trace of x over t from 0 to 100, with the band from -0.1 to +0.1 marked from t = 60 onward]

  • Quantitative Satisfaction Function for STL

    Function $\rho$ that maps an STL formula $\varphi$ and a given trace (valuation of signals) to a numeric value

    Example: for $\varphi = \mathbf{G}_{[60,100]}(|x| < 0.1)$, $\rho(\varphi, x) = \inf_{t \in [60,100]} (0.1 - |x(t)|)$

    Quantifies "how much" a trace satisfies a property
      Large positive value: trace easily satisfies
      Small positive value: trace close to violating
      Negative value: trace does not satisfy
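    As a rough illustration of the example on this slide, the satisfaction value can be computed directly on a sampled trace. The trace below (a decaying oscillation) and the function name are assumptions made for the sketch; on sampled data the infimum becomes a minimum over the samples in the interval.

```python
import math

def robustness_always_abs_below(trace, t_lo, t_hi, bound):
    """Satisfaction value of G_[t_lo,t_hi](|x| < bound) on a sampled trace of (t, x) pairs."""
    margins = [bound - abs(x) for t, x in trace if t_lo <= t <= t_hi]
    if not margins:
        raise ValueError("trace has no samples in the interval")
    return min(margins)

# Hypothetical trace: decaying oscillation sampled once per time unit on [0, 100].
trace = [(t, math.exp(-t / 20.0) * math.cos(t / 3.0)) for t in range(0, 101)]

rho = robustness_always_abs_below(trace, 60, 100, 0.1)
print("satisfied" if rho >= 0 else "violated", rho)
```

    Tightening the bound from 0.1 to 0.01 on the same trace drives the value negative, which is the "trace does not satisfy" case.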

  • Parametric Signal Temporal Logic (PSTL)

    Constants in STL formula replaced with parameters
      Scale parameters
      Time parameters

    Examples:
      Between some time $\tau$ and 10 seconds, x remains greater than some value $\pi$
      After transmission shifts to gear 2, it remains in gear 2 for at least $\tau$ secs


  • Parameter Synthesis = Find $\delta$-tight values of params (for suitably small $\delta$)

    Find "Tightest" Properties

    Want the value of $\pi$ corresponding to the tightest satisfaction over a set of traces

    [Figure: trace of x over t from 0 to 100, with candidate upper bounds 3, 2.9, and 1 000 000 marked; the last is labeled "Too loose"]

  • Parameter Synthesis

    Find "Tightest" Properties

    Non-linear optimization problem
      Satisfaction function for STL is non-linear in general

    Naive (strawman) approach:
      grid parameter space to precision $\delta$
      evaluate satisfaction value at each point
      pick valuation with smallest satisfaction value

    Problem: Exponential number of grid points (in #parameters)
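    The strawman approach can be sketched for a single scale parameter. Assumptions for this sketch: the formula is an upper bound G(x < pi), traces are plain lists of samples, and "smallest satisfaction value" is read as smallest non-negative value so that the chosen bound still holds on every trace. The helper `grid_size` shows how the point count grows with the number of parameters.

```python
def rho_upper_bound(pi, trace):
    """Satisfaction value of G(x < pi) on a sampled trace."""
    return min(pi - x for x in trace)

def grid_axis(lo, hi, delta):
    """Grid points from lo to hi at spacing delta."""
    n = int(round((hi - lo) / delta))
    return [lo + i * delta for i in range(n + 1)]

def tightest_on_grid(traces, lo, hi, delta):
    """Evaluate every grid point; keep the one with the smallest non-negative satisfaction value."""
    best = None
    for pi in grid_axis(lo, hi, delta):
        value = min(rho_upper_bound(pi, tr) for tr in traces)
        if value >= 0 and (best is None or value < best[1]):
            best = (pi, value)
    return best

def grid_size(num_params, lo, hi, delta):
    """Number of grid points when every parameter shares the same range and precision."""
    return len(grid_axis(lo, hi, delta)) ** num_params

traces = [[0.0, 1.5, 3.25, 2.0], [0.5, 2.75, 1.0]]
print(tightest_on_grid(traces, 0.0, 5.0, 0.5))
print([grid_size(k, 0.0, 5.0, 0.5) for k in (1, 2, 3)])
```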

  • Satisfaction Monotonicity

    Find "Tightest" Properties

    If upper bound of all signals is 3, any number > 3 is also an upper bound

    Satisfaction function monotonic in parameter value
      Example: $\rho(\pi, x) = \inf_t (\pi - x(t))$
      For all x, $\rho(\pi, x)$ is a monotonic function of $\pi$

    Advantage: If monotonic, use binary search over parameter space, otherwise exhaustive search

    [Figure: trace of x over t from 0 to 100, with upper bounds 3 and 4 marked]
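    For the upper-bound example, monotonicity means a binary search is enough to find a bound within a chosen precision of the tightest one. The sketch below assumes sampled traces as plain lists and a hypothetical precision argument `delta`; it is an illustration of the idea, not the synthesis procedure from the paper.

```python
def satisfies(pi, traces):
    """True if G(x <= pi) holds on every sampled trace."""
    return all(x <= pi for trace in traces for x in trace)

def tightest_bound(traces, lo, hi, delta):
    """Binary search for a satisfying bound within delta of the tightest one.

    Relies on monotonicity: if pi satisfies, every larger value satisfies too.
    """
    if not satisfies(hi, traces):
        raise ValueError("upper end of the search range is already violated")
    if satisfies(lo, traces):
        return lo
    # Invariant: lo is violated, hi is satisfied.
    while hi - lo > delta:
        mid = (lo + hi) / 2.0
        if satisfies(mid, traces):
            hi = mid
        else:
            lo = mid
    return hi

traces = [[0.0, 1.5, 3.25, 2.0], [0.5, 2.75, 1.0]]
print(tightest_bound(traces, 0.0, 5.0, 0.01))
```

    The number of evaluations grows with log((hi - lo) / delta) instead of (hi - lo) / delta, which is the advantage the slide points to.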

  • Deciding Satisfaction Monotonicity

    Find "Tightest" Properties

    Need to decide whether: for all x, $\rho(\pi, x)$ is a monotonic function of $\pi$

    Theorem: Deciding monotonicity of a PSTL formula is undecidable

    Use an encoding to satisfiability modulo theories (SMT) solving
      Quantified formulas involving uninterpreted functions, and arithmetic over reals
      Linear arithmetic if PSTL predicates are linear
      Solved easily with Z3
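    One way to read the check, stated as a sketch under assumptions: for a single parameter $\pi$, "monotonically increasing" can be taken to mean that raising $\pi$ never turns a satisfying trace into a violating one. A solver would then be asked whether the negation has a model. The symbols $\pi_1, \pi_2$ and the exact shape of the query are illustrative and are not taken from the slides.

```latex
% Monotonically increasing in \pi (assumed definition for this sketch):
\forall x.\ \forall \pi_1, \pi_2.\quad
  \bigl(\pi_1 \le \pi_2 \;\wedge\; x \models \varphi(\pi_1)\bigr)
  \;\Rightarrow\; x \models \varphi(\pi_2)

% Equivalent query: the formula is monotonic iff the following is unsatisfiable
\exists x, \pi_1, \pi_2.\quad
  \pi_1 \le \pi_2 \;\wedge\; x \models \varphi(\pi_1) \;\wedge\; x \not\models \varphi(\pi_2)
```

    In such an encoding the signal x would play the role of an uninterpreted function of time, which matches the slide's mention of quantified formulas over uninterpreted functions and real arithmetic.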


  • Black-Box Falsification Procedure

    Are there behaviors that do NOT satisfy these requirements?

    [Figure: the Falsification Tool chooses an input u, the system S is simulated as a black box, and the resulting trace S(u) is returned to the tool]

  • Falsification as Optimization

    Are there behaviors that do NOT satisfy these requirements?

    Solve $\rho^* = \min_u \rho(\varphi, S(u))$
      Leverages quantitative semantics of STL
      Relies on standard numerical optimization methods (e.g., Nelder-Mead)

    If $\rho^* < 0$, found falsifying trace!

    Nonlinear Optimization Problem, No exact solution, Limited theoretical guarantees
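    A minimal sketch of falsification as optimization, under stated assumptions: `simulate` is a hypothetical black-box system, the requirement is G(x < 2.5), and plain random search stands in for the Nelder-Mead style optimizers named on the slide. The search stops as soon as the satisfaction value goes negative.

```python
import random

def simulate(u, steps=40):
    """Hypothetical black-box system S: output rises geometrically toward input level u."""
    return [u * (1.0 - 0.8 ** k) for k in range(steps)]

def rho(trace, bound=2.5):
    """Satisfaction value of G(x < bound): worst-case margin over the trace."""
    return min(bound - x for x in trace)

def falsify(u_lo, u_hi, budget=200, seed=0):
    """Random search for an input minimizing rho; stops at the first negative value."""
    rng = random.Random(seed)
    best_u, best_rho = None, float("inf")
    for _ in range(budget):
        u = rng.uniform(u_lo, u_hi)
        value = rho(simulate(u))
        if value < best_rho:
            best_u, best_rho = u, value
        if best_rho < 0:
            break  # falsifying trace found
    return best_u, best_rho

u_star, rho_star = falsify(0.0, 3.0)
print("falsified" if rho_star < 0 else "no violation found", u_star, rho_star)
```

    If the budget runs out with a non-negative value, nothing has been proved; that is the "limited theoretical guarantees" caveat.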

  • Experimental Evaluation Summary [details in TCAD'15 paper]

    Defined Templates for Common Requirements in Automotive Control (all monotonic PSTL!!)
      Dwell-Time requirements
      Timed/Untimed Safety properties
      Timed Inevitability (bounded liveness)
      Input Profiles: assumptions on shape of input signals
      Control-theoretic requirements on output signals (bounded overshoot/undershoot, settling time, error from reference signal, etc.)

    Three Benchmarks
      Simple Simulink Automatic Transmission Model
      Toyota HSCC'14 Challenge Air-Fuel Ratio controller
      Toyota Experimental Diesel Engine Airpath controller

  • Results on Industrial Airpath Controller

    Found max overshoot with 7000+ simulations in 13 hours

    Attempt to mine maximum observed settling time:
      stops after 4 iterations
      gives answer t_settle = simulation time horizon (shown in trace below)

    [Figure: trace from the Experimental Engine Control Model]

    [Jin, Donze, Deshmukh, Seshia, HSCC 2013]

  • Mining can expose deep bugs

    Uncovered a tricky bug
      Discussion with control designer revealed it to be a real bug
      Root cause identified as wrong value in a look-up table; bug was fixed

    Why mining could be useful for bug-finding:
      Can uncover subtle relations that should not hold
      Looking for bugs: mine for negation of bug

    [Figure: trace from the Experimental Engine Control Model]

  • Bug fixed: settling time successfully mined

    [Figure: OLD and NEW traces]

  • Industrial Case Studies with Toyota

    Work with group @ Toyota Japan on enabling software verification by mining specifications on the closed-loop system

    Useful in a production setting:
      Finds issues where previous methods fell short!
      Reduced human effort by 70%

    [Yamaguchi et al., FMCAD'16]

  • Toyota Unit's Experience with Model Checking

    [Figure: chart of work hours by activity (making model, making property, revising property, revising model, executing model checking, mapping counterexample); annotations: total work hour 560 min, 1 trial, 70 min, 40 min, 7 trials]

    Making/revising property: 110 min
    Mapping counterexample: 280 min for just 1 module

    [Yamaguchi et al., FMCAD'16]

  • Overview of Methodology

    1. Pre-condition mining (Breach): produces a pre-condition for the software module
    2. Software model checking (SLDV/CBMC): produces a module-level counterexample
    3. Simulation-Based Verification (Breach): produces a system-level counterexample

    [Figure: closed-loop model with controller, sensor, and plant, connected through in and out; the controller contains several modules, one of which is the target; signals shown include Pedal, Brake, WaterTemp, AirTemp]

    [Yamaguchi et al., FMCAD'16]

  • From CEGIS to Oracle-Guided Inductive Synthesis

    Inductive Synthesis: Learning from Examples (ML)
    Formal Inductive Synthesis: Learn from Examples while satisfying a Formal Specification

    General Approach: Oracle-Guided Learning
    Combine Learner with Oracle (e.g., Verifier) that answers Learner's Queries

    [Figure: LEARNER sends a query to ORACLE; ORACLE returns a response]

    [Jha & Seshia, A Theory of Formal Synthesis via Inductive Learning, 2015, Acta Informatica 2017.]

  • Falsification of Cyber-Physical Systems with Machine Learning Components

    S. A. Seshia 37

    T. Dreossi, A. Donze, and S. A. Seshia. Compositional Falsification of Cyber-Physical Systems with Machine Learning Components, In NASA Formal Methods Symposium, May 2017.

  • Challenges for Verified AI

    Inputs: System S, Environment E, Specification $\varphi$

    Does S || E satisfy $\varphi$?
      YES [+ proof]
      NO [+ counterexample]

    Design Correct-by-Construction instead?
    Counterexamples, etc. from Rich Signal Spaces?

    S. A. Seshia, D. Sadigh, S. S. Sastry. Towards Verified Artificial Intelligence. July 2016. https://arxiv.org/abs/1606.08514.

  • Problem: Verify Automotive System (CPS) that uses ML-based Perception

    [Figure: closed loop of Environment, Sensor Input, Learning-Based Perception, Controller, and Plant]

    Focus:
      Falsification: finding scenarios that violate safety properties
      Test (Data) Generation: generate "interesting" data for training / testing to improve accuracy
      Deep Neural Networks, given the increasing interest and use in the automotive context

  • Automatic Emergency Braking System (AEBS)

    [Figure: closed loop of Environment, Deep Learning-Based Object Detection, AEBS Controller, and Plant]

    Goal: Brake when an obstacle is near, to maintain a minimum safety distance
      Controller, Plant, Env models in Matlab/Simulink

    Object detection/classification system based on deep neural networks
      Inception-v3, AlexNet, trained on ImageNet
      more recent: squeezeDet, YOLO, trained on KITTI

  • Our Approach: Use a System-Level Specification

    Verify the Deep Neural Network Object Detector

    Verify the System containing the Deep Neural Network

    Formally Specify the End-to-End Behavior of the System

    [Figure: closed loop of Environment, Learning-Based Perception, Controller, and Plant]

    STL Formula: $\mathbf{G}(\mathrm{dist}(\text{ego vehicle}, \text{env object}) > \Delta)$, where $\Delta$ is the minimum safety distance

  • Approach: Simulation-Based Falsification

    Challenge: Very High Dimensionality of Input Space!

    Standard solution: Use Compositional (Modular) Verification

    However: no formal spec. for neural network component!

    Compositional Verification without Compositional Specification?!!

  • Our Approach: Combine Temporal Logic CPS Falsifier with ML Analyzer

    [Figure: the CPS Falsifier takes the Spec and Model and passes a Region of Uncertainty to the ML Analyzer; the ML Analyzer returns interesting sensor data (images) and whether there is an error]

    CPS Falsifier uses abstraction of ML component
      Optimistic analysis: assume ML classifier is always correct
      Pessimistic analysis: assume classifier is always wrong
      Difference is the region of uncertainty where output of the ML component "matters"

    Compositional: CPS Falsifier and ML Analyzer can be designed and run independently (& communicate)!
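    The optimistic/pessimistic split can be illustrated on a toy braking scenario. Everything in this sketch is an assumption made for illustration: a one-dimensional vehicle model, the constants `decel`, `horizon`, and `delta`, and the reading of "always correct" as "always brakes" and "always wrong" as "never brakes". Initial conditions that are safe under the first reading but unsafe under the second form the region of uncertainty.

```python
def min_distance(d0, v0, brakes, decel=8.0, horizon=3.0, dt=0.01):
    """Closest distance to the obstacle over the horizon (hypothetical 1-D model)."""
    distance, speed, t = d0, v0, 0.0
    closest = distance
    while t < horizon and speed > 0:
        if brakes:
            speed = max(0.0, speed - decel * dt)
        distance -= speed * dt
        closest = min(closest, distance)
        t += dt
    return closest

def classify(d0, v0, delta=2.0):
    """Label an initial condition by comparing the two abstractions of the ML component."""
    optimistic_safe = min_distance(d0, v0, brakes=True) > delta    # detector always correct
    pessimistic_safe = min_distance(d0, v0, brakes=False) > delta  # detector always wrong
    if pessimistic_safe:
        return "safe"        # safe whatever the detector outputs
    if not optimistic_safe:
        return "unsafe"      # unsafe even with a perfect detector
    return "uncertain"       # region of uncertainty: detector output matters

for d0, v0 in [(100.0, 10.0), (30.0, 10.0), (10.0, 30.0)]:
    print(d0, v0, classify(d0, v0))
```

    Only the "uncertain" initial conditions need to be handed to the ML analyzer, which is what keeps the two tools decoupled.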

  • Identifying Region of Uncertainty (ROU) for Automatic Emergency Braking System

    Perform Optimistic and Pessimistic Analyses on the Deep Neural Network

    [Figure: three panels: ML always correct; ML always wrong; potentially unsafe region (ROU) depending on ML component (yellow)]

  • Machine Learning Analyzer

    Systematically Explore ROU in the Image (Sensor) Space

    [Figure: an abstraction map relates the abstract space A (axes: brightness, car z-pos, car x-pos) to the feature space of images; the neural network classifies each generated image]

    Systematic Sampling (low-discrepancy sampling)
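    Low-discrepancy sampling of a three-dimensional abstract space can be sketched with a Halton sequence. The choice of Halton, the parameter ranges, and the `to_scene` helper are assumptions for this illustration; the slide only says that sampling is systematic and low-discrepancy.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result

def halton(count, bases=(2, 3, 5)):
    """First `count` points of the Halton sequence in the unit cube."""
    return [tuple(radical_inverse(i, b) for b in bases) for i in range(1, count + 1)]

def to_scene(point, x_range=(-1.0, 1.0), z_range=(5.0, 50.0), brightness_range=(0.2, 1.0)):
    """Map a unit-cube point to hypothetical scene parameters (car x-pos, car z-pos, brightness)."""
    ranges = (x_range, z_range, brightness_range)
    return tuple(lo + p * (hi - lo) for p, (lo, hi) in zip(point, ranges))

for point in halton(5):
    print(to_scene(point))
```

    Each sampled parameter tuple would then be rendered into an image and passed to the classifier; the rendering step is outside this sketch.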

  • Sample Result

    Inception-v3 Neural Network (pre-trained on ImageNet using TensorFlow)

    [Figure: misclassifications found by sampling]

    This misclassification may not be of concern

  • Sample Result

    Inception-v3 Neural Network (pre-trained on ImageNet using TensorFlow)

    [Figure: misclassifications found by sampling, with a corner-case image highlighted]

    But this one is a real hazard!

  • Image Streams

    Superimposition of tests on background
    Blind spots

    Results on squeezeDet NN and KITTI dataset for autonomous driving

    [Dreossi, Ghosh, et al., ICML 2017 workshop]

  • Verifier-Guided Training of Deep Neural Networks

    Instance of Oracle-Guided Inductive Synthesis
    Oracle is Verifier (CPSML Falsifier) used to perform counterexample-guided training of DNNs
    Substantially increase accuracy with only few additional examples

    [Figure: loop between DEEP NEURAL NETWORK and FALSIFIER (CPS + ML), producing the Learned Classifier]
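    The shape of counterexample-guided training can be shown on a deliberately tiny problem. This is a sketch under strong assumptions: the "network" is a one-dimensional threshold classifier, the ground truth `true_label` is hypothetical, and the oracle simply scans a grid for a misclassified point. It illustrates only the learner/oracle loop, not the DNN experiments.

```python
def true_label(x):
    """Hypothetical ground truth used by the oracle."""
    return x >= 0.7

def train(examples):
    """Learner: threshold halfway between the largest negative and smallest positive example."""
    positives = [x for x, label in examples if label]
    negatives = [x for x, label in examples if not label]
    return (max(negatives) + min(positives)) / 2.0

def find_counterexample(threshold, resolution=1000):
    """Oracle: first grid point in [0, 1] that the current classifier gets wrong, if any."""
    for i in range(resolution + 1):
        x = i / resolution
        if (x >= threshold) != true_label(x):
            return x
    return None

def counterexample_guided_training(max_rounds=50):
    examples = [(0.0, False), (1.0, True)]  # small initial training set
    for _ in range(max_rounds):
        threshold = train(examples)
        counterexample = find_counterexample(threshold)
        if counterexample is None:
            return threshold, len(examples)
        examples.append((counterexample, true_label(counterexample)))
    raise RuntimeError("oracle still finds counterexamples after max_rounds")

threshold, num_examples = counterexample_guided_training()
print(threshold, num_examples)
```

    Every added example is one the current classifier gets wrong, so a small number of additions is enough on this toy problem.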

  • Conclusion: Formal Methods meets Machine Learning

    Formal Methods can play an important role in CPS Design with high assurance
      Industrial scale and machine learning pose particular challenges

    Machine Learning → Formal Methods
      Formal Inductive Synthesis (of specifications, programs, etc.)

    Formal Methods → Machine Learning
      Compositional reasoning about learning-based systems

  • Towards Verified Learning-based CPS

    Challenges, each with its Principle:
      1. Environment (incl. Human) Modeling: Data-Driven, Introspective Environment Modeling
      2. Specification: System-Level Specification; Robustness/Quantitative Spec.
      3. Learning Systems Complexity: Abstract & Explain
      4. Efficient Training, Testing, Verification: Verification-Guided, Adversarial Analysis and Improvisation
      5. Design for Correctness: Formal Inductive Synthesis

    S. A. Seshia, D. Sadigh, S. S. Sastry. Towards Verified Artificial Intelligence. July 2016. https://arxiv.org/abs/1606.08514.

    Exciting Times Ahead!!! Thank you!
