Automated Testing of Autonomous Driving Assistance Systems, Lionel Briand, VVIoT, Sweden, 2018

Automated Testing of Autonomous Driving Assistance Systemsvviot2018/slides/VViOT2018-Briand.pdf · human oracle cost and multi-objective optimisation. Finally, Section V concludes

Sep 23, 2020

Page 1

software verification & validation (SVV)

Automated Testing of Autonomous Driving Assistance Systems

Lionel Briand

VVIoT, Sweden, 2018

Page 2

Collaborative Research @ SnT

• Research in context

• Addresses actual needs

• Well-defined problem

• Long-term collaborations

• Our lab is the industry

Page 3

Software Verification and Validation @ SnT Centre

• Group established in 2012

• Focus: Automated, novel, cost-effective V&V solutions

• ERC Advanced Grant

• ~ 25 staff members

• Industry and public partnerships

Page 4

Introduction

Page 5

Autonomous Systems

• May be embodied in a device (e.g., robot) or reside entirely in the cyber world (e.g., financial decisions)

• Gaining, encoding, and appropriately using knowledge is a bottleneck for developing intelligent autonomous systems

• Machine learning, e.g., deep learning, is often an essential component

Page 6

Motivations

• Dangerous tasks

• Tedious, repetitive tasks

• Significant improvements in safety

• Significant reduction in cost, energy, and resources

• Significant optimization of benefits

Page 7

Autonomous CPS

• Read sensors, i.e., collect data about their environment

• Make predictions about their environment

• Make (optimal) decisions about how to behave to achieve some objective(s) based on predictions

• Send commands to actuators according to decisions

• Often mission or safety critical

Page 8

A General and Fundamental Shift

• Increasingly, it is easier to learn behavior from data using machine learning than to specify and code it

• Deep learning, reinforcement learning …

• Assumption: data captures desirable behavior, in a comprehensive manner

• Example: Neural networks (deep learning)

• Millions of weights learned

• No explicit code, no specifications

• Verification, testing?

Page 9

Many Domains

• CPS (e.g., robotics)

• Visual recognition

• Finance, insurance

• Speech recognition

• Speech synthesis

• Machine translation

• Games

• Learning to produce art

Page 10

Testing Implications

• Test oracles? No explicit, expected test behavior

• Test completeness? No source code, no specification

Page 11

CPS Development Process

Model-in-the-Loop Stage

• Functional modeling: controllers, plant, decision

• Continuous and discrete Simulink models

• Model simulation and testing

Software-in-the-Loop Stage

• Architecture modelling: structure, behavior, traceability

• System engineering modeling (SysML)

• Analysis: model execution and testing, model-based testing, traceability and change impact analysis, ...

• (partial) Code generation

Hardware-in-the-Loop Stage

• Deployed executables on target platform

• Hardware (Sensors ...)

• Analog simulators

• Testing (expensive)

Page 12

MiL Components

[Diagram: Sensor, Controller, Actuator, Decision, Plant]

Page 13

Opportunities and Challenges

• Early functional models (MiL) offer opportunities for early functional verification and testing

• But a challenge for constraint solvers and model checkers:

• Continuous mathematical models, e.g., differential equations

• Discrete software models for code generation, but with complex operations

• Library functions in binary code

Page 14

Automotive Environment

• Highly varied environments, e.g., road topology, weather, buildings and pedestrians …

• Huge number of possible scenarios, e.g., determined by trajectories of pedestrians and cars

• ADAS play an increasingly critical role

• A challenge for testing

Page 15

Testing Advanced Driver Assistance Systems

Page 16

Objective

• Testing ADAS

• Identify and characterize most critical/risky scenarios

• Test oracle: Safety properties

• Need scalable test strategy due to large input space

Page 17

Automated Emergency Braking System (AEB)

[Diagram: Vision (Camera) Sensor → Objects’ position/speed → Decision making → “Brake-request” when braking is needed to avoid collisions → Brake Controller]

Page 18

Example Critical Situation

• “AEB properly detects a pedestrian in front of the car with a high degree of certainty and applies braking, but an accident still happens where the car hits the pedestrian with a relatively high speed”

Page 19

Testing via Physics-based Simulation

Page 20

Simulation

[Diagram: the SUT and the simulator connected in a feedback loop. The SUT is connected to sensors, cameras and actuators. The simulator contains dynamic models of the ego vehicle (physical plant) and of the environment: mobile objects (pedestrians, other vehicles) and static aspects (road, traffic sign, weather).]

Inputs:
- the initial state of the physical plant and the mobile environment objects
- the static environment aspects

Outputs: time-stamped vectors for:
- the SUT outputs
- the states of the physical plant and the mobile environment objects

Page 21

Our Goal

• Developing an automated testing technique for ADAS

• To help engineers efficiently and effectively explore the complex test input space of ADAS

• To identify critical (failure-revealing) test scenarios

• Characterization of input conditions that lead to most critical situations

Page 22

ADAS Testing Challenges

• Test input space is large, complex and multidimensional

• Explaining failures and fault localization are difficult

• Execution of physics-based simulation models is computationally expensive

Page 23

Our Approach

• Effectively combine evolutionary computing algorithms and decision tree classification models

• Evolutionary computing is used to search the input space for safety violations

• We use decision trees to guide the search-based generation of tests faster towards the most critical regions, and characterize failures

• In turn, we use search algorithms to refine classification models to better characterize critical regions of the ADAS input space

Page 24

AEB Domain Model

[UML class diagram; diagram regions: Static input, Dynamic input, Output]

Test Scenario
- simulationTime: Real
- timeStep: Real

Weather
- visibility: VisibilityRange
- fog: Boolean
- fogColor: FogColor

Subclasses of Weather: Normal; Rain (- rainType: RainType); Snow (- snowType: SnowType)

WeatherC {OCL}: self.fog = false implies self.visibility = “300” and self.fogColor = None

Road
- frictionCoeff: Real

Subclasses of Road: Straight; Ramped (- height: RampHeight); Curved (- radius: CurvedRadius)

Vehicle
- v0: Real

Pedestrian
- four attributes of type Real

Mobile object, with a Position vector

Position
- x: Real
- y: Real

AEB Output
- TTC: Real
- certaintyOfDetection: Real
- braking: Boolean

Output functions
- two attributes of type Real

«enumeration» RainType: ModerateRain, HeavyRain, VeryHeavyRain, ExtremeRain

«enumeration» SnowType: ModerateSnow, HeavySnow, VeryHeavySnow, ExtremeSnow

«enumeration» FogColor: DimGray, Gray, DarkGray, Silver, LightGray, None

«enumeration» CurvedRadius (CR): 5, 10, 15, 20, 25, 30, 35, 40

«enumeration» RampHeight (RH): 4, 6, 8, 10, 12

«enumeration» VisibilityRange: 10 to 300 in steps of 10

Symbols appearing in the diagram: xp0, yp0, vp0, θp0, vc0, v1, v2, v3, F1, F2
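The WeatherC constraint in the domain model can be read as an input-validity check that a test generator would apply before running a simulation. The following is a minimal sketch under stated assumptions: the class name `Weather`, its field names and the `is_valid` method are hypothetical, visibility is modelled as an integer rather than the string literal used in the OCL expression, and Python's `None` stands in for the `None` literal of FogColor.

```python
from dataclasses import dataclass
from typing import Optional

# Enumeration values taken from the domain model on the slide.
FOG_COLORS = {"DimGray", "Gray", "DarkGray", "Silver", "LightGray"}
VISIBILITY_RANGE = set(range(10, 301, 10))  # 10, 20, ..., 300


@dataclass
class Weather:
    """Hypothetical container for the Weather attributes of a test scenario."""
    visibility: int
    fog: bool
    fog_color: Optional[str]  # None models the "None" literal of FogColor

    def is_valid(self) -> bool:
        # Attribute values must come from the enumerations.
        if self.visibility not in VISIBILITY_RANGE:
            return False
        if self.fog_color is not None and self.fog_color not in FOG_COLORS:
            return False
        # WeatherC: fog = false implies visibility = 300 and fogColor = None.
        if not self.fog:
            return self.visibility == 300 and self.fog_color is None
        return True


print(Weather(300, False, None).is_valid())   # no fog, full visibility
print(Weather(100, False, None).is_valid())   # violates WeatherC
print(Weather(100, True, "Gray").is_valid())  # fog with reduced visibility
```

A search algorithm could use such a check to repair or discard candidate scenarios that mutation or crossover pushed outside the valid input space.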

Page 25

Search-Based Software Testing

• Express test generation problem as a search problem

• Search for test input data with certain properties, i.e., constraints

• Non-linearity of software (if, loops, …): complex, discontinuous, non-linear search spaces (Baresel)

• Many search algorithms (metaheuristics), from local search to global search, e.g., Hill Climbing, Simulated Annealing and Genetic Algorithms

Excerpt shown on the slide, from “Search-Based Software Testing: Past, Present and Future”, Phil McMinn:

Section IV discusses future directions for Search-Based Software Testing, comprising issues involving execution environments, testability, automated oracles, reduction of human oracle cost and multi-objective optimisation. Finally, Section V concludes with closing remarks.

II. SEARCH-BASED OPTIMIZATION ALGORITHMS

The simplest form of an optimization algorithm, and the easiest to implement, is random search. In test data generation, inputs are generated at random until the goal of the test (for example, the coverage of a particular program statement or branch) is fulfilled. Random search is very poor at finding solutions when those solutions occupy a very small part of the overall search space. Such a situation is depicted in Figure 2, where the number of inputs covering a particular structural target are very few in number compared to the size of the input domain. Test data may be found faster and more reliably if the search is given some guidance. For meta-heuristic searches, this guidance can be provided in the form of a problem-specific fitness function, which scores different points in the search space with respect to their ‘goodness’ or their suitability for solving the problem at hand. An example fitness function is plotted in Figure 3, showing how, in general, inputs closer to the required test data that execute the structure of interest are rewarded with higher fitness values than those that are further away. A plot of a fitness function such as this is referred to as the fitness landscape. Such fitness information can be utilized by optimization algorithms, such as a simple algorithm called Hill Climbing. Hill Climbing starts at a random point in the search space. Points in the search space neighbouring the current point are evaluated for fitness. If a better candidate solution is found, Hill Climbing moves to that new point, and evaluates the neighbourhood of that candidate solution. This step is repeated, until the neighbourhood of the current point in the search space offers no better candidate solutions; a so-called ‘local optimum’. If the local optimum is not the global optimum (as in Figure 3a), the search may benefit from being ‘restarted’ and performing a climb from a new initial position in the landscape (Figure 3b).

An alternative to simple Hill Climbing is Simulated Annealing [22]. Search by Simulated Annealing is similar to Hill Climbing, except movement around the search space is less restricted. Moves may be made to points of lower fitness in the search space, with the aim of escaping local optima. This is dictated by a probability value that is dependent on a parameter called the ‘temperature’, which decreases in value as the search progresses (Figure 4). The lower the temperature, the less likely the chances of moving to a poorer position in the search space, until ‘freezing point’ is reached, from which point the algorithm behaves identically to Hill Climbing. Simulated Annealing is named so because it was inspired by the physical process of annealing in materials.

Figure 2. Random search may fail to fulfil low-probability test goals (labels: input domain; portion of input domain denoting required test data; randomly-generated inputs)

Figure 3. The provision of fitness information to guide the search with Hill Climbing. From a random starting point, the algorithm follows the curve of the fitness landscape until a local optimum is found. The final position may not represent the global optimum (part (a), climbing to a local optimum), and restarts may be required (part (b), restarting, on this occasion resulting in a climb to the global optimum). Axes: fitness against input domain.

Figure 4. Simulated Annealing may temporarily move to points of poorer fitness in the search space. Axes: fitness against input domain.

Figure 5. Genetic Algorithms are global searches, sampling many points in the fitness landscape at once. Axes: fitness against input domain.

Genetic Algorithm
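Hill Climbing with restarts, as described in the excerpt, can be sketched in a few lines. This is an illustrative sketch only: the integer search space, the unit-step neighbourhood, and the toy fitness landscape `two_peaks` (one local and one global optimum) are assumptions made for the example, not part of the talk.

```python
import random


def hill_climb(fitness, start, lo, hi):
    """Climb from `start` until no neighbour has strictly higher fitness."""
    current = start
    while True:
        # Neighbourhood: one step left or right, kept inside [lo, hi].
        neighbours = [n for n in (current - 1, current + 1) if lo <= n <= hi]
        best = max(neighbours, key=fitness)
        if fitness(best) <= fitness(current):
            return current  # local optimum reached
        current = best


def hill_climb_with_restarts(fitness, lo, hi, restarts, rng):
    """Repeat the climb from random starting points and keep the best result."""
    best = None
    for _ in range(restarts):
        candidate = hill_climb(fitness, rng.randint(lo, hi), lo, hi)
        if best is None or fitness(candidate) > fitness(best):
            best = candidate
    return best


def two_peaks(x):
    """Toy landscape: local optimum at x = 20, global optimum at x = 70."""
    return max(5 - abs(x - 20), 10 - abs(x - 70))


print(hill_climb(two_peaks, 10, 0, 100))  # → 20 (stuck on the local optimum)
print(hill_climb(two_peaks, 60, 0, 100))  # → 70 (reaches the global optimum)
print(hill_climb_with_restarts(two_peaks, 0, 100, 20, random.Random(0)))
```

A single climb ends on whichever peak is uphill from its starting point; restarts raise the chance that at least one climb starts in the basin of the global optimum.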


Page 26

Multiple Objectives: Pareto Front

Individual A Pareto dominates individual B if A is at least as good as B in every objective and better than B in at least one objective.

[Figure: objective space with axes F1 and F2, showing the Pareto front and the region dominated by a point x]

• A multi-objective optimization algorithm (e.g., NSGA II) must:

• Guide the search towards the global Pareto-Optimal front.

• Maintain solution diversity in the Pareto-Optimal front.
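The dominance definition above translates directly into code. The sketch below assumes that every objective is minimised and that individuals are represented as tuples of objective values; the function names `dominates` and `pareto_front` are illustrative. It only filters a finite set of points and does not implement NSGA II itself.

```python
def dominates(a, b):
    """True if a Pareto-dominates b, assuming every objective is minimised."""
    at_least_as_good = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (F1, F2) values for five individuals.
population = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
print(pareto_front(population))  # → [(1, 5), (2, 2), (5, 1)]
```

Here (3, 3) and (4, 4) are dominated by (2, 2), while the three remaining points are mutually non-dominated because each is better than the others in one objective.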

Page 27

Decision Trees

Partition the input space into homogeneous regions

[Figure: decision tree; children are indented under their parent]

All points: Count 1200, “non-critical” 79%, “critical” 21%
    Count 564: “non-critical” 59%, “critical” 41%
        Count 412: “non-critical” 49%, “critical” 51%
            Count 230: “non-critical” 31%, “critical” 69%
            Count 182: “non-critical” 72%, “critical” 28%
        Count 152: “non-critical” 84%, “critical” 16%
    Count 636: “non-critical” 98%, “critical” 2%

Split conditions appearing in the figure, from the root downwards:

• RoadTopology(CR = 5, Straight, RH = [4-12] (m)) / RoadTopology(CR = [10-40] (m))

• θp0 < 218.6° / θp0 >= 218.6°

• vp0 >= 7.2km/h / vp0 < 7.2km/h
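To illustrate how a decision tree arrives at more homogeneous regions, the sketch below computes a single split (a decision stump) on one numeric input by minimising the weighted Gini impurity of the two resulting regions. The Gini criterion, the function names and the sample data are assumptions for illustration; the talk does not state which tree-learning algorithm or splitting criterion was used.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means fully homogeneous)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best split 'value < threshold'."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # no split: impurity of the whole set
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: one input value per scenario, 1 = critical, 0 = non-critical.
speeds = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
critical = [1, 1, 1, 0, 0, 0]
print(best_split(speeds, critical))  # → (7.0, 0.0)
```

A full tree learner applies the same step recursively to each region and over all input variables, which is how path conditions such as those in the figure are obtained.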

Page 28

Search Algorithm (NSGAII-DT)

• We use a multi-objective search algorithm (NSGAII)

• Three objectives (CB): Minimum distance between the pedestrian and the field of view, the car speed at the time of collision, and the probability that the object detected in front of the car is a pedestrian

• Inputs are vectors of values containing static and dynamic variables: precipitation, fogginess, road shape, visibility range, car-speed, person-speed, person-position (x,y), person-orientation

• Each search iteration calls simulations to compute fitness

• We use decision tree classification models to predict scenario criticality

Page 29

NSGAII-DT

1. Generate an initial representative set of input scenarios and run the simulator to label each scenario as critical or non-critical

2. Build a decision tree model

[Diagram: a decision tree whose conditions (yes/no branches) separate critical and non-critical scenarios into one critical region and two non-critical regions]

3. Run the NSGAII search algorithm for the elements inside each critical leaf

[Diagram: NSGAII loop: mutation and crossover, NDS (non-dominated sorting), select best scenarios]

The new scenarios are added to the initial population

4. Rebuild the decision tree (step 2) or stop the process

[Diagram: rebuilt decision tree with conditions (yes/no branches) leading to the most critical region, i.e., the region in the input space that is likely to contain more critical scenarios]

Page 30

Initial Classification Model

[Figure: the initial decision tree, identical to the tree on Page 27 (root: Count 1200, “non-critical” 79%, “critical” 21%)]

We focus on generating more scenarios in the critical region, respecting the conditions that lead to that region

Page 31

Refined Classification Model

[Figure: decision tree; children are indented under their parent]

All points: Count 3367, “non-critical” 58%, “critical” 42%
    Count 2198: “non-critical” 43%, “critical” 57%
        Count 338: “non-critical” 17%, “critical” 83%
        Count 1860: “non-critical” 47%, “critical” 53%
            Count 1438: “non-critical” 42%, “critical” 58%
                Count 553: “non-critical” 29%, “critical” 71%
                Count 885: “non-critical” 51%, “critical” 49%
                    Count 548: “non-critical” 37%, “critical” 63%
                    Count 337: “non-critical” 73%, “critical” 27%
            Count 422: “non-critical” 64%, “critical” 36%
    Count 1169: “non-critical” 88%, “critical” 12%

Split conditions appearing in the figure, each combined (∧) with a RoadTopology condition over Straight, RH = [4-12] and CR = [5-40]:

• xp0 >= 37.4 / xp0 < 37.4

• θp0 < 232.5° / θp0 >= 232.5°

• xp0 < 33 / xp0 >= 33

• θp0 >= 185.6° / θp0 < 185.6°

• yp0 < 57.7 / yp0 >= 57.7

We get a more refined decision tree with more critical regions and more homogeneous areas

Page 32

Research Questions

• RQ1: Does the decision tree technique help guide the evolutionary search and make it more effective?

• RQ2: Does our approach help characterize and converge towards homogeneous critical regions?

• Failure explanation

• Usefulness (feedback from engineers)

Page 33

RQ1: NSGAII-DT vs. NSGAII

NSGAII-DT outperforms NSGAII

[Figure: three plots comparing NSGAII-DT and NSGAII on the HV, GD and SP quality indicators over time (6 to 24 h)]

Page 34

RQ1: NSGAII-DT vs. NSGAII

• NSGAII-DT generates 78% more distinct, critical test scenarios compared to NSGAII

Page 35

RQ2: NSGAII-DT (evaluation of the generated decision trees)

[Figure: three panels, (a), (b) and (c), plotting RegionSize, GoodnessOfFit and GoodnessOfFit-crt against tree generations (1 to 7)]

The generated critical regions consistently become smaller, more homogeneous and more precise over successive tree generations of NSGAII-DT

Page 36

Failure explanation

• A characterization of the input space showing under what input conditions the system is likely to fail

• Visualized by decision trees or dedicated diagrams

• Path conditions in trees

[Diagram: road and sidewalk scene annotated with the distances 50m, 76m, 36m and 32m, θ [15m-40m], and the conditions “vehicle speed > 36km/h” and “pedestrian speed < 6km/h”]

Page 37

Usefulness

• The characterizations of the different critical regions can help with:

(1) Debugging the system model (or the simulator)

(2) Identifying possible hardware changes to increase ADAS safety

(3) Providing proper warnings to drivers

Page 38

Automated Testing of Feature Interactions Using Many Objective Search

Page 39

System Integration

[Diagram: System Under Test (SUT) comprising feature 1, feature 2, ..., feature n and an integration component, connected to sensors, cameras and actuators]

Page 40

Case Study: SafeDrive

• Our case study describes an automotive system consisting of four advanced driver assistance features:

• Cruise Control (ACC)

• Traffic Sign Recognition (TSR)

• Pedestrian Protection (PP)

• Automated Emergency Braking (AEB)

Page 41

Simulation

[Diagram: the SUT and the simulator connected in a feedback loop. The SUT is connected to sensors, cameras and actuators. The simulator contains dynamic models of the ego vehicle (physical plant) and of the environment: mobile objects (pedestrians, other vehicles) and static aspects (road, traffic sign, weather).]

Inputs:
- the initial state of the physical plant and the mobile environment objects
- the static environment aspects

Outputs: time-stamped vectors for:
- the SUT outputs
- the states of the physical plant and the mobile environment objects

Page 42

Actuator Command Vectors

Page 43

Safety Requirements

Page 44

Features

• Behavior of features based on machine learning algorithms processing sensor and camera data

• Interactions between features may lead to violating safety requirements, even if features are correct

• E.g., ACC is controlling the car by ordering it to accelerate since the leading car is far away, while a pedestrian starts crossing the road. PP starts sending braking commands to avoid hitting the pedestrian.

• Complex: predict and analyze possible interactions at the requirements level in a complex environment

• Resolution strategies cannot always be determined statically and may depend on environment

Page 45

Objective

• Automated and scalable testing to help ensure that resolution strategies are safe

• Detect undesired feature interactions

• Assumptions: IntC is white-box (integrator is testing), features were previously tested

• Extremely large input space since environmental conditions and scenarios can vary a great deal

Page 46

Input Variables

46

Page 47: Automated Testing of Autonomous Driving Assistance Systemsvviot2018/slides/VViOT2018-Briand.pdf · human oracle cost and multi-objective optimisation. Finally, Section V concludes

Search

• Input space is large

• Dedicated search algorithm (many objectives) directed/guided by test objectives (fitness functions)

• Fitness (distance) functions: reward test cases that are more likely to reveal integration failures leading to safety violations

• Combine three types of functions: (1) safety violations, (2) unsafe overriding by IntC, (3) coverage of the decision structure of integration component

• Many test objectives to be satisfied by the test suite


Failure Distance

• Reveal safety requirements violations

• Fitness functions based on the trajectory vectors for the ego car, the leading car and the pedestrian, generated by the simulator

• PP fitness: Minimum distance between the car and the pedestrian during the simulation time.

• AEB fitness: Minimum distance between the car and the leading car during the simulation time.
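A minimal sketch of the two failure distances, assuming the simulator returns each trajectory as a list of 2-D positions sampled at the same time steps. The function name `min_distance` and the sample trajectories are hypothetical, and the actors are treated as points, which ignores vehicle geometry.

```python
import math


def min_distance(traj_a, traj_b):
    """Smallest Euclidean distance between two equally sampled trajectories."""
    if len(traj_a) != len(traj_b):
        raise ValueError("trajectories must have the same number of samples")
    return min(math.dist(p, q) for p, q in zip(traj_a, traj_b))


# Hypothetical positions in metres, one sample per simulation step.
ego = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0), (15.0, 0.0)]
pedestrian = [(12.0, 6.0), (12.0, 4.0), (12.0, 2.0), (12.0, 0.5)]
leading_car = [(30.0, 0.0), (33.0, 0.0), (36.0, 0.0), (39.0, 0.0)]

pp_fitness = min_distance(ego, pedestrian)    # lower means closer to hitting the pedestrian
aeb_fitness = min_distance(ego, leading_car)  # lower means closer to hitting the leading car
```

The search minimises these values, so test cases that bring the ego car closer to the pedestrian or the leading car are favoured.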


Distance Functions

When any of the functions yields zero, a safety failure corresponding to that function is detected.


Unsafe Overriding Distance

• Goal: Find failures that are more likely to be due to faults in the integration component

• Reward test cases generating integration outputs deviating from the individual feature outputs, in such a way as to possibly lead to safety violations.

• Example: A feature f issues a braking command while the integration component issues no braking command or a braking command with a lower force than that of f .
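The braking example can be turned into a distance in the usual search-based-testing style. The sketch below is one possible formulation, not necessarily the one used in the talk: `overriding_distance`, the per-step braking-force lists, and the offset `k` are assumptions. The distance is zero as soon as the integration component brakes less than the feature requested at some step.

```python
def overriding_distance(feature_brake, intc_brake, k=1.0):
    """Distance to an unsafe braking override; 0.0 means one occurred.

    feature_brake[i] is the braking force requested by feature f at step i,
    intc_brake[i] is the force actually issued by the integration component.
    """
    best = float("inf")  # stays infinite if f never asks to brake
    for requested, issued in zip(feature_brake, intc_brake):
        if requested <= 0.0:
            continue  # nothing to override at this step
        gap = issued - requested  # negative when IntC brakes less than f asked
        step_distance = 0.0 if gap < 0.0 else gap + k
        best = min(best, step_distance)
    return best


overridden = overriding_distance([0.0, 0.5, 0.8], [0.0, 0.5, 0.3])
not_overridden = overriding_distance([0.0, 0.5], [0.0, 0.7])
```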


Branch Distance

• Branch coverage of IntC

• Fitness: Approach level and branch distance d (standard for code coverage)

• d(b,tc) = 0 when tc covers b
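For reference, a sketch of the standard coverage fitness for a single `lhs > rhs` condition. The helper names and the offset `k` are assumptions; the normalisation d / (d + 1) is a common choice in search-based testing, not something stated on the slide.

```python
def normalize(d):
    """Map a non-negative distance into [0, 1)."""
    return d / (d + 1.0)


def branch_distance_gt(lhs, rhs, k=1.0):
    """Distance to making `lhs > rhs` true; 0.0 when it already holds."""
    return 0.0 if lhs > rhs else (rhs - lhs) + k


def coverage_fitness(approach_level, branch_dist):
    """Approach level (number of missed control dependencies) plus normalized branch distance."""
    return approach_level + normalize(branch_dist)
```

A test case that reaches the condition and satisfies it gets `coverage_fitness(0, 0.0) == 0.0`, matching d(b,tc) = 0 when tc covers b.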


Combining Distance Functions

• Goal: Execute every branch of IntC such that while executing that branch, IntC unsafely overrides every feature f and its outputs violate every safety requirement related to f.

Indicates that tc has not covered the branch j

Branch covered but did not cause unsafe override of f

Branch covered, unsafe override, but did not violate requirement I
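The three cases above suggest a layered function in which each unmet condition adds a penalty level. The sketch below is a simplified illustration under that assumption; the constants 2.0 and 1.0 and the name `hybrid_distance` are hypothetical and are not taken from the slides.

```python
import math


def normalize(d):
    """Map a non-negative distance into [0, 1]; infinity maps to 1."""
    return 1.0 if math.isinf(d) else d / (d + 1.0)


def hybrid_distance(branch_d, override_d, failure_d):
    """Combine the three distances for one (branch, feature, requirement) objective."""
    if branch_d > 0.0:
        # tc has not covered the branch
        return 2.0 + normalize(branch_d)
    if override_d > 0.0:
        # branch covered, but no unsafe override of f
        return 1.0 + normalize(override_d)
    # branch covered, unsafe override, requirement possibly not yet violated
    return normalize(failure_d)
```

With this layering, any test case that covers the branch always scores better than one that does not, and the combined value is zero only when all three conditions hold.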


Search Algorithm

• Best test suite covers all search objectives, i.e., for all IntC branches and all safety requirements

• Not a Pareto front optimization problem

• Objectives compete with each other

• Example: cannot have the ego car violating the speed limit after hitting the leading car in one test case

• Tailored, many-objective genetic algorithm

• Must be efficient (test case executions are very expensive)


Search Algorithm

Randomly generated TCs: compute fitness

Tests are evolved: crossover, mutation

Fittest tests selected

Correct constraint violations

Archive covering tests
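The loop above can be sketched as a small archive-based genetic algorithm. This is a toy stand-in, not the tailored algorithm from the talk: a test case is a single number, `evolve` and its parameters are hypothetical, and the two objectives are chosen so that no single test can cover both, which is why covering tests are archived per objective.

```python
import random


def evolve(objectives, bounds, pop_size=20, generations=50, seed=0):
    """Return {objective index: test case that drives that objective to 0}."""
    rng = random.Random(seed)
    lo, hi = bounds
    archive = {}

    def update_archive(tests):
        # Archive the first test found that covers each objective.
        for tc in tests:
            for j, obj in enumerate(objectives):
                if j not in archive and obj(tc) == 0.0:
                    archive[j] = tc

    # Randomly generated test cases.
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        update_archive(population)
        uncovered = [j for j in range(len(objectives)) if j not in archive]
        if not uncovered:
            break
        # Select the fittest tests for each objective that is still uncovered.
        keep = max(2, pop_size // (2 * len(uncovered)))
        parents = []
        for j in uncovered:
            parents.extend(sorted(population, key=objectives[j])[:keep])
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            child = w * a + (1.0 - w) * b             # arithmetic crossover
            child += rng.gauss(0.0, 0.1 * (hi - lo))  # mutation
            children.append(min(hi, max(lo, child)))  # repair constraint violations
        population = children
    update_archive(population)
    return archive


# Two competing toy objectives: one needs x >= 0.8, the other x <= 0.2.
objectives = [lambda x: max(0.0, 0.8 - x), lambda x: max(0.0, x - 0.2)]
archive = evolve(objectives, bounds=(0.0, 1.0))
```

Because selection only considers uncovered objectives, the search effort shifts to what remains once an objective has been archived, which matters when each test execution is an expensive simulation.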


Evaluation

[Figure: number of integration errors found (y-axis, 0 to 7) over time in hours (x-axis, 0 to 12), FITest vs. Baseline]


Discussion


Observations

• We will rarely have precise and complete requirements, and we face great diversity in the physical environment, including many possible scenarios.

• It is possible, however, to define properties characterizing unacceptable situations (safety)

• Notion of test coverage is elusive: No specification or code/models for some key (decision) components based on ML

• Failure is not clear cut: It is a matter of risk, trade-off …

• We have executable/simulable functional models (e.g., Simulink) at early stages


Conclusions

• We proposed solutions based on:

• Efficient and realistic (hardware, physics) simulation

• Metaheuristic search, e.g., evolutionary computing

• Guided by fitness functions derived from properties of interest (e.g., safety requirements)

• Machine learning, e.g., to speed up search

• No guarantees though


Generalizing

• Examples presented from (safety-critical) cyber-physical systems, e.g., safety requirements

• Can a similar strategy be applied in other domains to test for bias or any other undesirable properties (e.g., legal), when system behavior is driven by machine learning?

• Executable models of environment and users?


Summary

• Machine learning plays an increasingly prominent role in autonomous systems

• No (complete) requirements, specifications, or even code

• Some safety and mission-critical requirements

• Neural networks (deep learning) with millions of weights

• How do we gain confidence in such software in a scalable and cost-effective way?


Acknowledgements

• Raja Ben Abdessalem

• Shiva Nejati

• Annibale Panichella

• IEE, Luxembourg


References

• R. Ben Abdessalem et al., "Testing Advanced Driver Assistance Systems Using Multi-Objective Search and Neural Networks", IEEE ASE 2016

• R. Ben Abdessalem et al., "Testing Vision-Based Control Systems Using Learnable Evolutionary Algorithms", IEEE/ACM ICSE 2018


Automated Testing of Autonomous Systems

Lionel Briand

VVIoT, Sweden, 2018