Page 1: Vargas@computer.org1 Part 3 Unified HW+SW Reliability Insertion & Estimation.

Part 3

Unified HW+SW Reliability Insertion & Estimation

Page 2:

- Part 1/3: Reliability Insertion & Estimation for HW-SW Systems (High-Level X Gate-Level Fault Injection)
- Part 2/3: Reliability-Oriented HW-SW Partitioning (Controlling system functions mapping into HW-SW to optimize reliability)
- Part 3/3: Testability-Oriented HW-SW Partitioning (Controlling system functions mapping into HW-SW to optimize testability)

Page 3:

Design Methodology Details

[Flowchart. Box labels, grouped by step:]

Step I: Initial System Specification and Partitioning Step
- Initial Description (SystemC)
- Partitioning and Communication Channels Generation (focusing on real-time applications)
- HW Part (VHDL), SW Part (C/C++), Communication Channels

Step II: System Reliability Insertion Step
- HW/SW-Level Fault-Tolerance Generator
- Fault-Tolerant HW Part (VHDL), Fault-Tolerant SW Part (C/C++), Fault-Tolerant Communication Channels

Step III: Real-Time Response and System Reliability Estimation Steps (Pre-Implementation)
- Simulation of Critical System Functions
- System-Level Behavior Reports. Desired Real-Time Response? (Yes/No)
- If No: repartition the system differently. If necessary, divide the HW-SW parts into two or more blocks (moving from the concept of a centralized to a distributed system) to meet real-time constraints.
- Mutation Constraints
- Fault Injection and Simulation (Mutation Analysis)
- System-Level Fault Coverage & Reliability Reports. Desired reliability level? (Yes/No)
- If No: modify the implemented fault-tolerant functions or add new ones to the design.
- C/C++ Code Compilation and HW Synthesis

Step IV: Laboratory Verification Step (Post-Implementation Reliability)
- Electromagnetic Compatibility (EMC) Tests
- System-Level Fault Coverage & Reliability Reports. Desired reliability level? (Yes/No)
- End Design Process

Page 4:

Part 1/3

Reliability Insertion & Estimation for HW-SW Systems (High-Level X Gate-Level Fault Injection)

Page 5:

Summary

1. Introduction: overview on previous works

2. The Methodology

2.1. Initial System Reliability Estimation

2.2. System Reliability Insertion & Estimation

3. Prototyping Environments

4. Case Studies

5. Conclusions


Page 6:

1. Introduction: overview on previous works

Previous works suggest how to design systems with respect to: monetary cost, performance, communication rates, power consumption, silicon area, testability, and memory size.

None of these strategies suggests how to partition the design and estimate the reliability of a system on a common HW/SW basis.

For embedded systems that are safety-critical and particularly complex to design, integrating reliability constraints during HW-SW partitioning may yield very good returns.

Page 7:

1. Introduction: overview on previous works

To address this gap, design teams have developed approaches well suited to adding reliability during the design phase of embedded safety-critical computing systems:

- First, partition the system and estimate its reliability as it is.
- Next, if necessary, insert reliability functions into the HW and SW parts of the system.
- Then, estimate the reliability of the modified system.

Page 8:

2. The Methodology

Initial Partitioning and Reliability Estimation

[Flowchart. Box labels: System Description in C; Partitioning; SW Part (C description); HW Part (Handel-C description); High-Reliability Functions Library; Generation of the High-Reliability HW; High-Reliability HW Part; Generation of the Com. Channels with Error Detection and/or Correction Code; Mutation Constraints; Weak Mutants Generator; System Reliability Verification; System-Level Fault Coverage & Reliability Reports; Desired Reliability Level? NO: try to partition the system differently. YES: C Code Compilation and HW Synthesis.]

Fig. 1. Reliability estimation of the system as it is.

Page 9:

2. The Methodology

Added-Reliability Estimation and Repartitioning

[Flowchart. The recovered box labels are identical to those of Fig. 1.]

Fig. 2. Reliability insertion into the HW - SW parts / Modified system reliability estimation.

Page 10:

2. The Methodology

Embedded Reliability Functions:

HW:
- Built-In Self Test (BIST)
- Triplication with Voter (Fault Masking)
- Duplication with Comparator (Performance Degradation: stop & go!)
- Error Detection & Correction Codes (Minimized, Combined Performance/Area Degradation)
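Two of the HW reliability functions listed above, triplication with voter and duplication with comparator, can be sketched behaviourally. This is a minimal Python model rather than the deck's VHDL or Handel-C; the function names and the bit-flip fault model are assumptions made for illustration.

```python
def majority_vote(a, b, c):
    """Bitwise 2-out-of-3 majority of three replica outputs (fault masking)."""
    return (a & b) | (a & c) | (b & c)


def duplicate_and_compare(a, b):
    """Return the first replica's output and an error flag raised on mismatch."""
    return a, a != b


good = 0b1011
faulty = good ^ 0b0100            # assumed fault model: one bit flipped in a replica

masked = majority_vote(good, faulty, good)              # voter hides the fault
value, error = duplicate_and_compare(good, faulty)      # comparator only flags it
```

The voter returns the correct word despite the faulty replica, whereas the comparator can only signal the mismatch, which is why the slide associates duplication with a stop-and-go performance penalty.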

Page 11:

2. The Methodology

Embedded Reliability Functions:

SW: Robustness
- Acceptance Tests (Specification & Placement)
- Capability Check (Checks for System Capabilities at a Given Time)
- Recovery Blocks (Primary & Alternate Programs)
- Stress Testing (Abnormal Situations)
- Performance Testing (Real-Time Applications)
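The recovery-block idea listed above (a primary program, alternate programs, and an acceptance test) can be sketched as follows. This is an illustrative Python sketch; the integer square root routines and the acceptance test are hypothetical examples, and state checkpointing is omitted because the routines are pure functions.

```python
def recovery_block(acceptance_test, primary, alternates, *args):
    """Run the primary, then each alternate, until a result passes the acceptance test."""
    for routine in [primary, *alternates]:
        try:
            result = routine(*args)
        except Exception:
            continue                      # a crash counts as a failed attempt
        if acceptance_test(result, *args):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def fast_isqrt(n):
    """Hypothetical primary routine, deliberately wrong for most inputs."""
    return n // 2


def safe_isqrt(n):
    """Hypothetical alternate routine: slow but correct integer square root."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def accept(result, n):
    """Acceptance test: result must be the floor of the square root of n."""
    return result * result <= n < (result + 1) * (result + 1)


root = recovery_block(accept, fast_isqrt, [safe_isqrt], 10)
```

For the input 10 the primary returns 5, which the acceptance test rejects, so the alternate's result is used instead.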

Page 12:

2. The Methodology

The goal: partition the description and estimate the reliability of the system (at a high-level description) against transient or permanent faults.

The solution: an adaptation of the Mutation Analysis approach, originally proposed for software testing in 1978 by DeMillo.

- Original goal: development of a criterion for the selection of test vectors. The idea was to apply a set of test vectors to the original program and to its mutated versions in order to determine which vectors distinguish the program from its mutated versions.

Page 13:

2. The Methodology

- The original approach used a fault injection technique: small syntactic changes were generated in the original code, and it was determined which test vectors were able to detect the mutated versions of the code.

- Change of paradigm: use mutation analysis as a criterion for fault-coverage estimation, i.e., system reliability verification at a high-level description.

Note: it must be shown that the stuck-at fault coverage at the gate level is greater than or equal to (≥) the coverage obtained by means of mutation analysis on a VHDL HW description, at the system level.

Page 14:

2. The Methodology

The measurement of the fault coverage:

If a program P has M mutants, E of which are equivalent, and a test set T kills K mutants, the mutation score is defined to be:

$$MS(P,T) = \frac{K}{M - E}$$

where M is the number of faults injected, E the number of equivalent faults, and K the number of faults detected.
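As a worked example of the mutation score, read as the K killed mutants divided by the M - E non-equivalent mutants, the sketch below computes it for made-up counts. The function name and the numbers are illustrative assumptions, not results from the deck.

```python
def mutation_score(killed, total, equivalent):
    """Mutation score MS = K / (M - E) for K killed, M total, E equivalent mutants."""
    non_equivalent = total - equivalent
    if non_equivalent <= 0:
        raise ValueError("no non-equivalent mutants to score")
    return killed / non_equivalent


# Hypothetical campaign: 50 mutants injected, 5 equivalent, 36 killed.
score = mutation_score(killed=36, total=50, equivalent=5)   # 36 / 45
```

Equivalent mutants are excluded from the denominator because no test vector can ever distinguish them from the original program.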

Page 15:

2. The Methodology

The MDS for a given program consists of two parts: an array I representing the program input test vectors, in which each element points to an array C containing the name and state of all comparator outputs in the program during program execution.

[Figure: the Input Test Vectors Array (i1, i2, ..., ik); each entry points to one of k arrays storing the names and states (C1/St1, C2/St2, ..., Cn/Stn) of the n program checker outputs.]

Fig. 2. Mutant Data Structure (MDS) for the weak mutation generation procedure.

Page 16:

2. The Methodology

If any of the comparator outputs is set to "1", then the injected fault (i.e., the mutated statement in the code) is detected. Otherwise, the fault can be classified as redundant or as undetectable by the additional HW blocks/SW routines (thus lowering system reliability).
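The Mutant Data Structure and the detection rule can be sketched as a list of records, one per input test vector, each holding the state of every checker output. The Python below is a minimal illustration; the mutated adder, the two checkers C1 and C2, and the test vectors are hypothetical.

```python
def build_mds(checkers, test_vectors):
    """Return the MDS as a list of (input vector, {checker name: state}) records."""
    return [
        (vector, {name: int(bool(check(vector))) for name, check in checkers.items()})
        for vector in test_vectors
    ]


def is_detected(mds):
    """A mutant is detected if any checker output is 1 for any test vector."""
    return any(state == 1 for _, states in mds for state in states.values())


def mutated_add(a, b):
    """Hypothetical mutant of an adder: '+' replaced by '-'."""
    return a - b


checkers = {
    # C1: duplication with comparator against a fault-free reference
    "C1": lambda v: mutated_add(*v) != v[0] + v[1],
    # C2: residue check modulo 3
    "C2": lambda v: mutated_add(*v) % 3 != (v[0] % 3 + v[1] % 3) % 3,
}
vectors = [(0, 0), (2, 0), (1, 3)]
mds = build_mds(checkers, vectors)
```

Here only the vector (1, 3) sets a checker output (C1) to 1; with the first two vectors alone the mutant would go undetected, which is the situation the slide describes as lowering system reliability.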

Page 17:

2. The Methodology

Type   Description
AOR    Arithmetic Operator Replacement
ABS    Absolute Value Insertion
CR     Constant Replacement
CVR    Constant for Variable Replacement
LOR    Logical Operator Replacement
ROR    Relational Operator Replacement
ODR    Operation for Delay Replacement
OSR    Operation for Skip Replacement
VCR    Variable for Constant Replacement
VR     Variable Replacement
UOI    Unary Operator Insertion
BOR    Bit Operator Replacement

Table 1. Mutation operators set for VHDL/C++ functional descriptions.
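To illustrate how operators such as AOR and ROR produce mutants, the sketch below applies them to Python source through the standard `ast` module. It is a stand-in for a VHDL/C++ mutant generator and not the authors' tool; the operator families chosen and the function name are assumptions, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

AOR = (ast.Add, ast.Sub, ast.Mult)                           # arithmetic operators
ROR = (ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq)  # relational operators


def generate_mutants(source):
    """Return (operator_type, mutated_source) pairs, one operator changed per mutant."""
    mutants = []
    n_nodes = len(list(ast.walk(ast.parse(source))))
    for site in range(n_nodes):
        for kind, family in (("AOR", AOR), ("ROR", ROR)):
            for new_op in family:
                tree = ast.parse(source)             # fresh tree for every mutant
                node = list(ast.walk(tree))[site]    # same traversal order each time
                if kind == "AOR" and isinstance(node, ast.BinOp) and isinstance(node.op, AOR):
                    if isinstance(node.op, new_op):
                        continue                     # skip the original operator
                    node.op = new_op()
                elif kind == "ROR" and isinstance(node, ast.Compare) and isinstance(node.ops[0], ROR):
                    if isinstance(node.ops[0], new_op):
                        continue
                    node.ops[0] = new_op()
                else:
                    continue
                mutants.append((kind, ast.unparse(tree)))
    return mutants


aor_mutants = generate_mutants("temp1 = entrada_info + sum_const")
```

Each mutant differs from the original in exactly one operator, matching the "small syntactic change" model of mutation analysis.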

Page 18:

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity CRYPT is
  port (
    entrada_info  : in  integer range 0 to 3;
    entrada_chave : in  integer range 0 to 100;
    saida         : out integer range 0 to 100
  );
end CRYPT;

architecture ARCH_NAME of CRYPT is
begin
  process (entrada_info, entrada_chave, saida)
    variable temp1 : integer range 0 to 6;
    variable temp2 : integer range 0 to 18;
    variable temp3 : integer range 0 to 18;
    variable temp4 : integer range 0 to 118;
    constant sum_const : integer := 3;
    constant mul_const : integer := 2;
    constant sub_const : integer := 1;
  begin
    temp1 := entrada_info + sum_const;
    temp1 := entrada_info - sum_const;   -- Mutant 1: AOR
    temp1 := entrada_info + temp2;       -- Mutant 2: CVR

    temp2 := temp1 * mul_const;
    temp2 := temp1 * sum_const;          -- Mutant 3: CR
    delay;                               -- Mutant 4: ODR

    temp3 := temp2 - sub_const;
    temp3 := sum_const - sub_const;      -- Mutant 5: VCR
    temp3 := temp3 - sub_const;          -- Mutant 6: VR

    temp4 := entrada_chave + temp3;
    skip;                                -- Mutant 7: OSR

    saida <= temp4;
  end process;
end ARCH_NAME;

Fig. 3. Example of fault injection in a VHDL description. (The "Mutant n" comments identify mutated statements; each mutant replaces the original statement that heads its group.)
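A behavioural Python model of the CRYPT process shows how a test vector kills a mutant by producing a different output. This is a sketch under assumptions: only mutants 1, 3 and 5 are modelled, a mutant counts as killed when its output differs from the original's, and the VHDL integer range constraints and variable persistence across process activations are not modelled.

```python
SUM_CONST, MUL_CONST, SUB_CONST = 3, 2, 1


def crypt(info, chave, mutant=0):
    """Python model of the CRYPT process; `mutant` selects one mutated statement."""
    temp1 = info - SUM_CONST if mutant == 1 else info + SUM_CONST         # Mutant 1: AOR
    temp2 = temp1 * SUM_CONST if mutant == 3 else temp1 * MUL_CONST       # Mutant 3: CR
    temp3 = SUM_CONST - SUB_CONST if mutant == 5 else temp2 - SUB_CONST   # Mutant 5: VCR
    return chave + temp3


def killed_mutants(test_vectors, mutants=(1, 3, 5)):
    """Mutants whose output differs from the original for at least one vector."""
    return [
        m for m in mutants
        if any(crypt(i, k, m) != crypt(i, k) for i, k in test_vectors)
    ]


killed = killed_mutants([(0, 10)])
```

In this model each of the three mutants shifts the output by a non-zero amount for every legal input, so a single test vector is enough to kill all of them.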

Page 19:

3. Prototyping Environments

Photo 1. Altera UP1 + Texas TMS320C67 DSP uProcessor.

Page 20:

3. Prototyping Environments

Photo 2. Altera Excalibur + SOPC HW

Page 21:

4. Case Study

Fault coverage comparison: stuck-at faults X mutation analysis

Circuit          Number of gates   Test vectors generated   Detectable stuck-at faults   Stuck-at faults detected (%)
Multiplier 2x2   19                9                        116                          100
Multiplier 4x3   110               15                       622                          100
Multiplier 6x6   431               30                       2420                         99.50
Multiplier 8x4   353               23                       1996                         99.30
Multiplier 8x6   565               25                       3211                         99.78
Multiplier 8x8   809               36                       4548                         99.74

Table 2. Stuck-at fault testing summary for the 6 Multiplier Circuit operand widths.

Circuit          Number of gates   Test vectors generated   Generated mutants   Mutants killed (%)
Multiplier 2x2   19                9                        22                  95.45
Multiplier 4x3   110               15                       106                 94.34
Multiplier 6x6   431               30                       432                 88.19
Multiplier 8x4   353               23                       364                 88.88
Multiplier 8x6   565               25                       600                 88.67
Multiplier 8x8   809               36                       832                 88.34

Table 3. Mutation analysis summary for the 6 Multiplier Circuit operand widths.
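The multiplier figures from Tables 2 and 3 can be checked against the earlier requirement that gate-level stuck-at coverage be at least as high as the coverage measured by mutation analysis. The Python sketch below only re-uses the percentages reported in the two tables; the helper name is an assumption.

```python
# circuit: (stuck-at faults detected %, mutants killed %), copied from Tables 2 and 3
multipliers = {
    "2x2": (100.00, 95.45),
    "4x3": (100.00, 94.34),
    "6x6": (99.50, 88.19),
    "8x4": (99.30, 88.88),
    "8x6": (99.78, 88.67),
    "8x8": (99.74, 88.34),
}


def coverage_gaps(results):
    """Stuck-at coverage minus mutation score, in percentage points, per circuit."""
    return {name: round(stuck - mut, 2) for name, (stuck, mut) in results.items()}


gaps = coverage_gaps(multipliers)
mutation_is_lower_bound = all(gap >= 0 for gap in gaps.values())
```

For every multiplier the gap is positive, so on this data the mutation score behaves as a conservative estimate of the gate-level stuck-at coverage.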

Page 22:

4. Case Study

Circuit         Number of gates   Test vectors generated   Detectable stuck-at faults   Stuck-at faults detected (%)
ALU - 4 bits    71                18                       452                          99.55
ALU - 8 bits    155               22                       980                          98.98
ALU - 12 bits   239               21                       1508                         98.80
ALU - 16 bits   323               21                       1908                         98.53

Table 4. Stuck-at fault testing summary for the 4 ALU operand widths.

Circuit         Number of gates   Test vectors generated   Generated mutants   Mutants killed (%)
ALU - 4 bits    71                18                       92                  94.56
ALU - 8 bits    155               22                       204                 92.15
ALU - 12 bits   239               21                       316                 92.40
ALU - 16 bits   323               21                       428                 87.38

Table 5. Mutation analysis summary for the 4 ALU operand widths.

Page 23:

4. Case Study

Adder architecture (4 bits)   Number of gates   Test vectors generated   Detectable stuck-at faults   Stuck-at faults detected (%)
Simple Adder                  49                10                       296                          100
Manchester                    76                12                       343                          99.56
Carry Lookahead               64                10                       372                          99.73

Table 6. Stuck-at fault testing summary for the 3 Adder Circuit architectures.

Adder architecture (4 bits)   Number of gates   Test vectors generated   Generated mutants   Mutants killed (%)
Simple Adder                  49                10                       46                  86.95
Manchester                    76                12                       57                  83.37
Carry Lookahead               64                10                       62                  82.26

Table 7. Weak mutation analysis summary for the 3 Adder Circuit architectures.

Page 24:

5. Conclusions

A unified fault injection campaign in the HW + SW parts of systems specified in VHDL/C may reduce design cycle time and produce trustworthy results that help designers make reliability-related decisions at the very early steps of the design process.

Page 25:

Part 2/3

Reliability-Oriented HW-SW Partitioning (Controlling system functions mapping into HW-SW to optimize reliability)

We could think of partitioning the system into HW + SW parts and using the unified fault injection methodology described previously to verify which is the most reliable configuration.

Page 26:

Summary

1. The Methodology/Example

Add fault tolerance (FT) to the HW part and check it by means of the mutation analysis technique. The final goal is to derive a methodology that helps the designer partition the system into HW and SW parts according to FT criteria.

2. Case Study

3. Conclusions

Page 27:

1. The Methodology

Hereafter we add fault tolerance to the HW part and estimate the obtained result by means of mutation analysis at the VHDL HW description level.

(a)

void crypt() {
  tocrypt = info ^ xor_const;              /* tocrypt ← info XOR constant "xor_const" */
  tocrypt = tocrypt + add_const;           /* tocrypt ← tocrypt + constant "add_const" */
  tocrypt = (tocrypt * mult_const) <- 8;   /* tocrypt ← tocrypt * low-byte of constant "mult_const" */
  tocrypt = tocrypt + key;                 /* tocrypt ← tocrypt + variable "key" */
} /* end of routine crypt */

(b)

[Figure layout: each statement of void crypt is paired with its check along the execution-time axis.]

  tocrypt = info ^ xor_const;
      if (tocrypt == info ^ xor_const); else error ! 0; stop;
  tocrypt = tocrypt + add_const;
      if (residue(tocrypt + add_const) == residue(tocrypt) + residue(add_const)); else error ! 0; stop;
  tocrypt = (tocrypt * mult_const) <- 8;
      if (residue(tocrypt * mult_const) == residue(tocrypt) * residue(mult_const)) <- 8; else error ! 0; stop;
  tocrypt = tocrypt + key;
      if (residue(tocrypt + key) == residue(tocrypt) + residue(key)); else error ! 0; stop;

Fig. 1. Translating the user Handel-C code into a reliable version: (a) original routine crypt; (b) reliable version of this routine.
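The residue checks of the reliable version can be illustrated with a small Python model. This is a sketch under assumptions: the slide does not state the check base, so modulus 3 is chosen here; the right-hand side is reduced again with `residue` so that both sides are comparable; `fault_mask` is a hypothetical way to inject a bit flip into the result; and the low-byte truncation is left out.

```python
import operator

MODULUS = 3   # assumed check base; not given in the slides


def residue(x):
    """Residue of a non-negative integer modulo the check base."""
    return x % MODULUS


def checked(op, a, b, fault_mask=0):
    """Apply op(a, b), optionally corrupt the result, and verify it by residue."""
    result = op(a, b) ^ fault_mask                 # fault_mask models flipped result bits
    expected = residue(op(residue(a), residue(b)))
    if residue(result) != expected:
        raise ArithmeticError("residue check failed")
    return result


clean_sum = checked(operator.add, 7, 5)
clean_product = checked(operator.mul, 7, 5)
```

A fault that changes the result by a multiple of the modulus escapes the check: with `fault_mask=3` the sum 12 becomes 15, which has the same residue, while `fault_mask=1` is caught.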

Page 28:

1. The Methodology

Program "Cryptography": C code, total length 30 lines, consisting mainly of three routines: is_valid, crypt, set_bit.

void crypt() {
  tocrypt = info ^ xor_const;
  tocrypt = info & xor_const;               /* MUTANT 1 */
  tocrypt = info ^ key;                     /* MUTANT 2 */

  tocrypt = tocrypt + add_const;
  tocrypt = tocrypt + xor_const;            /* MUTANT 3 */
  delay;                                    /* MUTANT 4 */

  tocrypt = (tocrypt * mult_const) <- 8;
  tocrypt = (tocrypt - mult_const) <- 8;    /* MUTANT 5 */
  tocrypt = (tocrypt * mult_const) \\ 8;    /* MUTANT 6 */

  tocrypt = tocrypt + key;
  tocrypt = tocrypt + xor_const;            /* MUTANT 7 */
  skip;                                     /* MUTANT 8 */
}

Fig. 2. Example of fault injection in a Handel-C description. (The MUTANT n comments identify mutated statements; each mutant replaces the original statement that heads its group.)

Page 29:

2. Case Study

After running this example ...

System Partitioning            System            Number of Mutants Generated
is_valid   crypt   set_bit     Reliability (%)   Detected   Not Detected   Total
S          S       S           00.00             0          47             47
S          H       S           85.26             324        56             380
H          S       S           91.58             511        47             558
S          S       H           84.66             320        58             378
H          H       S           92.93             539        41             580
H          S       H           92.90             563        43             606
S          H       H           87.00             348        52             400
H          H       H           93.83             563        37             600

(S = software implementation, H = hardware implementation)

Table 1. System partitioning possibilities and resulting reliability.
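The reliability column of this table is consistent with detected mutants divided by total mutants, which the sketch below recomputes before picking the most reliable partition. It assumes the flattened S/H letters read column by column (is_valid, crypt, set_bit), which is the only reading that yields eight distinct configurations; the function and variable names are illustrative.

```python
# (is_valid, crypt, set_bit, mutants detected, mutants total), as read from Table 1
partitions = [
    ("S", "S", "S", 0, 47),
    ("S", "H", "S", 324, 380),
    ("H", "S", "S", 511, 558),
    ("S", "S", "H", 320, 378),
    ("H", "H", "S", 539, 580),
    ("H", "S", "H", 563, 606),
    ("S", "H", "H", 348, 400),
    ("H", "H", "H", 563, 600),
]


def reliability(detected, total):
    """System reliability in percent: share of injected mutants that were detected."""
    return 100.0 * detected / total


best = max(partitions, key=lambda row: reliability(row[3], row[4]))
```

On this data the all-hardware mapping scores highest (93.83 %), closely followed by the two configurations that keep is_valid in hardware.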

Page 30:

3. Conclusions

For critical applications, partitioning the system into HW + SW parts according to reliability constraints at the early steps of the design may be of interest (reduction of design cycle time).

Use the unified fault injection methodology in the HW + SW parts to help estimate the most reliable configuration for the system.

Page 31:

Part 3/3

Testability-Oriented HW-SW Partitioning (Controlling system functions mapping into HW-SW to optimize testability)

We could think of partitioning the system into HW + SW parts and using the unified fault injection methodology described previously to verify which is the most testable configuration.

Page 32:

1. Methodology

Yves Le Traon, Ghassan Al Hayek, Chantal Robach [ITC'96]: Testability-Oriented Hardware-Software Partitioning

- Test-based HW/SW partitioning approach for a co-design specification.
- Depending on the HW or SW implementation choice for each unit-level component, the test cost for the system is evaluated.
- The unit test costs are estimated by means of mutation-based analysis with respect to the implementation choices.

Page 33:

1. Methodology

The number of test vectors used for testing the SW implementation (Nsoft) and the number of test vectors for testing the HW implementation (Nhard) are computed at the unit-level component (process) and used by an algorithm to evaluate the testing effort for the global system.

Page 34:

1. Methodology

To evaluate the testing cost, the algorithm is based on a flowgraph that represents the control flow structure of the system.

Each node represents a process (unit-level component).

The Testing Cost (TC) of the whole specification is the sum of all costs necessary to test each path of the specification graph.

[Figure: four prime flowgraphs, labelled If, If then/else, Interruptions, Case Struc]

Fig 1. Prime flowgraphs.

Page 35:

2. Case Study

Application: robot to collect precious objects in deep waters

Equipment:
- Frontal sensor, to detect obstacles and objects
- Boxes to place fragile objects
- Boxes for non-fragile objects
- Hand to pick up objects and an electric battery to provide energy

Routines:
- Turn-right, turn-left, turn-back, advance, object analysis

Page 36:

2. Case Study

After running this example ...

HW/SW Implementation                                              System Testing Cost
Turn-right   Turn-left   Turn-back   Advance   Object-Analysis
S            X           S           S         S                  84
S            X           H           S         S                  85
H            X           S           S         S                  87
H            X           H           S         S                  88
S            X           X           H         S                  163
H            X           X           H         S                  164
S            X           S           X         H                  498
S            X           H           X         H                  499
H            X           S           X         H                  501
H            X           H           X         H                  502

Table 1. Robot testing costs by implementation choices.

Page 37:

3. Conclusions

For critical applications, partitioning the system into HW + SW parts according to test-effort constraints at the early steps of the design may be of interest (reduction of design cycle time).

Use the unified fault injection methodology in the HW + SW parts to help estimate the lowest test effort for the system.