Model-based Methods to Make Distributed Services Fault-Tolerant and Dependable
Humberto Nicolás Castejón Martínez, Institutt for Telematikk, NTNU

Dec 13, 2015

Model-based Methods to Make Distributed Services Fault-Tolerant and Dependable

Humberto Nicolás Castejón Martínez, Institutt for Telematikk, NTNU

Outline

• Definition of Distributed System and Service

• Characteristics of Model-Driven Development (MDD)

• Dependability overview

◊ Means to achieve dependability

◊ Benefits from MDD

◊ Literature approaches

• Summary

Distributed Systems and Services

• A Distributed System consists of separate autonomous components that operate concurrently and interact with each other, by message passing, in order to provide some service to the system’s environment/end-users

◊ E.g. Public Switched Telephone Network (PSTN)

• A Service is an identified functionality, with value for the end-users of the system, that results from a collaboration between components of the system

“Elaboration” Development

[Figure: Problem domain/Requirements → Implementation]

Developing a service amounts to writing executable code that fulfills the user requirements

• Incomplete high-level descriptions of functionality

• Source code becomes the only complete view of the system, and the only one maintained

Tackling Development Complexity

Two golden rules for tackling development complexity:

• Separation of concerns:

Identify aspects that are as independent as possible and describe them separately.

• Conceptual abstraction:

Replace low-level concepts representing technical detail with higher-level abstract concepts better suited to describing and understanding the problem at hand.

Use models!

Model Driven Development

• The goal is to reduce the gap between problem and implementation domains through the use of models describing the system at multiple levels of abstraction and from different perspectives, and through automated techniques for model transformation and analysis

[Figure: Implementation Oriented vs. Model Oriented development. Implementation oriented: Problem domain/Requirements → Implementation. Model oriented: Problem domain/Requirements → Specification Models → Design Models → Implementation, with V&V, Automatic Model Transformation and Automatic Code Generation]

Dependability

• Dependability of a system is

“the ability to avoid service failures that are more frequent and more severe than is acceptable” [Avizienis2004]

• Dependability is a property of the system in its environment!

Dependability Tree

• Dependability

◊ Attributes: Availability, Reliability, Safety

◊ Threats: Faults, Errors, Failures

◊ Means: Fault Prevention, Fault Removal, Fault Tolerance, Fault Prediction

Source: IFIP WG 10.4

Dependability Attributes

• A dependable system must be

◊ Available: ready to be used when needed

◊ Reliable: works properly and continuously in a time interval

◊ Safe: operates without catastrophic consequences on the environment

• More emphasis on one or another attribute depending on the particular system/service
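The availability and reliability attributes are commonly quantified with textbook measures. The sketch below is an illustration only and is not taken from the slides: it assumes a constant failure rate (exponentially distributed time to failure), and the function names and parameter values are hypothetical.

```python
import math


def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time the service is ready: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def reliability(failure_rate_per_hour: float, interval_hours: float) -> float:
    """Probability of working continuously over the interval.

    Assumes a constant failure rate, so R(t) = exp(-lambda * t).
    """
    return math.exp(-failure_rate_per_hour * interval_hours)


# Hypothetical figures: one failure every 999 h on average, 1 h to repair.
availability = steady_state_availability(999.0, 1.0)
survival_100h = reliability(1e-3, 100.0)
```

A system can score high on one measure and low on the other: short, frequent outages hurt reliability far more than availability.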

Dependability Threats

• Failure: The delivered service deviates from the correct service, as observed by the end-user

◊ No compliance to specification

◊ No compliance to requirements/user’s needs

• Error: Deviation from correctness in the internal system state that may lead to failure

• Fault: The cause of an error

Dependability Threats: Causality Chain

... → Fault → Error → Failure → Fault → Error → Failure → ...

[Figure: in Component C1, an internal dormant fault or an external fault leads to an error; error propagation leads to the failure of C1, which in turn causes an error in Component C2]

Failure of C1 = Fault for C2
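The chain can be made concrete with a toy simulation. This is only a sketch under simplifying assumptions: the `Component` class and `run_chain` function are hypothetical, a fault is activated on the first request, and there is no fault tolerance, so every error reaches the service interface.

```python
class Component:
    """A component whose service can be corrupted by internal or external faults."""

    def __init__(self, name: str, dormant_fault: bool = False):
        self.name = name
        self.dormant_fault = dormant_fault  # internal fault, harmless until activated
        self.error = False                  # erroneous internal state
        self.failed = False                 # incorrect service seen at the interface

    def serve(self, input_ok: bool = True) -> bool:
        """Run one service request; return True if the delivered service is correct."""
        # Fault activation: an internal dormant fault or an external fault
        # (here, incorrect input) corrupts the internal state.
        if self.dormant_fault or not input_ok:
            self.error = True
        # Without fault tolerance, the error propagates to the service interface.
        self.failed = self.error
        return not self.failed


def run_chain(fault_in_c1: bool):
    """C2 consumes the service delivered by C1."""
    c1 = Component("C1", dormant_fault=fault_in_c1)
    c2 = Component("C2")
    c1_output_ok = c1.serve()
    c2.serve(input_ok=c1_output_ok)  # failure of C1 = external fault for C2
    return c1, c2
```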

Fault Prevention and Removal with MDD

• Fault Prevention: Avoid the occurrence or introduction of faults

◊ Abstraction and separation of concerns: better understanding, hence fewer specification mistakes

◊ Automatic model transformations and code generation: compliance between source and target models

• Fault Removal: Reduce the number or severity of faults

◊ Formal model verification and validation

◊ Model animation/simulation

◊ Automatic generation of test cases. E.g. The “Model-based Generation of Tests for Dependable Embedded Systems” (MOGENTES) EU project
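As an illustration of automatic test generation from a model, the sketch below derives one test per transition of a small state machine (transition coverage). It is not the MOGENTES approach; the `MODEL` state machine, its state and event names, and all function names are hypothetical.

```python
from collections import deque

# Hypothetical service model: (state, event) -> next state
MODEL = {
    ("Idle", "call"): "Ringing",
    ("Ringing", "answer"): "Connected",
    ("Ringing", "hangup"): "Idle",
    ("Connected", "hangup"): "Idle",
}


def shortest_paths(model, initial):
    """Breadth-first search: shortest event sequence reaching each state."""
    paths = {initial: []}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for (src, event), dst in model.items():
            if src == state and dst not in paths:
                paths[dst] = paths[state] + [event]
                queue.append(dst)
    return paths


def generate_tests(model, initial):
    """One test per reachable transition: (event sequence, expected final state)."""
    paths = shortest_paths(model, initial)
    return [(paths[src] + [event], dst)
            for (src, event), dst in model.items() if src in paths]


def execute_on_model(model, initial, events):
    """Reference execution; a real test harness would drive the implementation."""
    state = initial
    for event in events:
        state = model[(state, event)]
    return state


tests = generate_tests(MODEL, "Idle")
```

Each generated pair is an input sequence plus the state the implementation is expected to reach, so a mismatch points at a fault to remove.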

Fault Prediction with MDD

• Fault Prediction: Estimate the present and future number of faults, and their likely consequences, by means of qualitative and quantitative evaluation

• Traditional approach: based on the system description, a dependability expert builds one or more dependability models

◊ Big gap between system design process and dependability modeling and analysis

• MDD approach: A dependability expert annotates the system design models with dependability-related information, and dependability models are automatically constructed
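The MDD idea of deriving a dependability model from annotated design elements can be sketched as follows. This is a minimal illustration, not any of the cited approaches: the `Component` annotations, the fault-tree encoding as nested tuples, and the assumption of independent failures are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Component:
    name: str
    failure_prob: float  # annotation: probability of failing during the mission
    replicas: int = 1    # annotation: number of redundant replicas deployed


def to_fault_tree(components):
    """Model transformation: the system fails if any component group fails (OR);
    a replicated group fails only if all of its replicas fail (AND)."""
    return ("OR", [
        ("AND", [("BASIC", c.name, c.failure_prob)] * c.replicas)
        for c in components
    ])


def evaluate(node) -> float:
    """Top-event probability, assuming independent basic events."""
    kind = node[0]
    if kind == "BASIC":
        return node[2]
    child_probs = [evaluate(child) for child in node[1]]
    if kind == "AND":
        return prod(child_probs)
    return 1.0 - prod(1.0 - p for p in child_probs)  # OR gate


# Hypothetical annotated design: a duplicated server and a single database.
design = [Component("server", 0.1, replicas=2), Component("database", 0.05)]
system_failure_prob = evaluate(to_fault_tree(design))
```

The designer only edits the annotations; the fault tree is rebuilt from the design each time, which is what narrows the gap between design and dependability analysis.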

Some MDD Approaches for Fault Prediction

• [Addouche2006] - Extended UML state machines and communication diagrams are converted into Probabilistic Timed Automata for verification of probabilistic temporal properties related to the dependability of real time systems

• [Huszerl2002] - UML state machines annotated with timing and probabilistic properties are transformed into Stochastic Reward Nets (SRNs)

• [Leangsuksun2003] - UML deployment models are mapped into Fault Tree and Markov Chain models for the detection of hardware failures

• [Pai2002] and [Majzik2002] transform annotated UML structural diagrams (e.g. class and deployment diagrams) into Dynamic Fault Trees and Timed Petri Nets, respectively

Fault Tolerance

• Fault Tolerance: Deliver correct service despite the occurrence of faults

• Mainly achieved by means of redundancy (in hardware and software)

• Three types of redundancy

◊ Static Redundancy: Tries to mask a fault by using redundant components/services

◊ Dynamic Redundancy: Based on error detection and error recovery

◊ Hybrid Redundancy: Combination of static and dynamic redundancy

Dynamic Redundancy

• Once an error is detected, appropriate actions are taken to return the system to a valid state

• Two types of error recovery

◊ Backward error recovery (BER): Restores the system to a previous valid state (i.e. to a saved recovery point)

• Can be used to mask unanticipated faults

• Not useful with highly interactive systems

◊ Forward error recovery (FER): Continues from an erroneous state by making selective corrections to the system state (e.g. by means of exception handling mechanisms)

• Depends on accurate identification of the cause of errors
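The difference between the two recovery styles can be sketched on a toy example. Everything here is hypothetical (the `Account` class, the `NegativeBalance` exception, the chosen correction policy); the point is only that BER rolls back to a saved recovery point, while FER keeps the current state and repairs it.

```python
import copy


class Account:
    def __init__(self, balance: int):
        self.balance = balance


class NegativeBalance(Exception):
    """Raised by error detection when the state violates its invariant."""


def check(account: Account) -> None:
    if account.balance < 0:
        raise NegativeBalance(account.balance)


def withdraw_ber(account: Account, amount: int) -> bool:
    """Backward error recovery: restore the saved recovery point on error."""
    checkpoint = copy.deepcopy(vars(account))  # save recovery point
    account.balance -= amount
    try:
        check(account)
    except NegativeBalance:
        vars(account).update(checkpoint)       # roll back to the valid state
        return False
    return True


def withdraw_fer(account: Account, amount: int) -> int:
    """Forward error recovery: correct the erroneous state and continue."""
    account.balance -= amount
    try:
        check(account)
    except NegativeBalance:
        shortfall = -account.balance
        account.balance = 0                    # selective correction, no rollback
        return amount - shortfall              # pay out only what was available
    return amount
```

Note that `withdraw_fer` needs to know what went wrong in order to pick a correction, whereas `withdraw_ber` works for any detected error.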

Fault Tolerance with MDD

• Software-based fault-tolerance mechanisms are just software, so their implementation can certainly benefit from the MDD approach

◊ Separate models/views for normal behavior and fault-tolerant mechanisms + model composition/weaving

◊ Refinement for adding e.g. exception handling behavior

◊ Early test of fault-tolerance solutions

◊ Deployment models for specification of hardware redundancy and static software redundancy
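Keeping normal behavior and fault-tolerance behavior in separate models and then weaving them can be sketched with plain transition tables. This is a deliberately simplified illustration, not the weaving technique of any cited work; `NORMAL`, `ASPECT`, the state and event names, and the `weave` function are all hypothetical.

```python
# Normal behavior model: (state, event) -> next state
NORMAL = {
    ("Idle", "request"): "Processing",
    ("Processing", "done"): "Idle",
}

# Fault-tolerance aspect, modeled separately from the normal behavior.
ASPECT = {
    "applies_to": lambda state: state != "Idle",  # where the aspect is woven in
    "on_event": "error",                          # event handled by the aspect
    "target": "Recovering",                       # state entered on that event
    "extra": {("Recovering", "recovered"): "Idle"},
}


def weave(normal, aspect):
    """Compose both models into one integrated transition table."""
    woven = dict(normal)  # the normal model itself is left untouched
    states = {src for (src, _) in normal} | set(normal.values())
    for state in sorted(states):
        if aspect["applies_to"](state):
            woven.setdefault((state, aspect["on_event"]), aspect["target"])
    woven.update(aspect["extra"])
    return woven


integrated = weave(NORMAL, ASPECT)
```

Because the integrated model is generated, the recovery behavior can be analyzed or tested early while the normal behavior model stays readable.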

Some MDD approaches for Fault Tolerance

• [Reddy2005] - Fault-tolerant mechanisms are described by aspect models (with parameterized class and sequence diagrams) and automatically composed with the overall system model. Analysis of the integrated model is also provided.

• [Domokos2005] - Aspect oriented modeling is used to design the architecture of fault tolerant systems. A model weaver generates both an integrated design model and an associated dependability model based on SPNs.

• [Bucchiarone2007] - System architecture is modeled with a UML 2 component diagram, following the pattern dictated by the idealized fault tolerant component, i.e. with differentiated parts for normal and exceptional behaviors. Each part has its own state machine (in an extended version). Test cases are automatically created.

Modeling Dependable Systems with Patterns

[Figure, from [Sand2006]: Pattern Repository, Static System Model, Binding, Automatic Expansion, Concrete Scenarios, Dynamic System Model, Verification]

Summary

• Model Driven Development can positively contribute to all four means of achieving dependability

◊ By helping to reduce the number of faults in services

◊ By automatically detecting service faults through model V&V

◊ By allowing an integrated and precise development of normal and fault-tolerant behaviors at different levels of abstraction

◊ By automatically constructing dependability models through model transformations

• Some approaches that exploit MDD for achieving dependability already exist, but more work remains to be done (as is also the case for MDD itself)

Thank you!

References

• [Addouche2006] – N. Addouche, C. Antoine, J. Montmain, “Methodology for UML Modeling and Formal Verification of Real-Time Systems”, Intl. Conf. on Computational Intelligence for Modelling Control and Automation, and Intl. Conf. on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06), IEEE CS, 2006

• [Avizienis2004] – A. Avizienis, J-C. Laprie, B. Randell, C. Landwehr, “Basic Concepts and Taxonomy of Dependable and Secure Computing”, IEEE Transactions on Dependable and Secure Computing, vol. 1, no. 1, 2004

• [Bucchiarone2007] – A. Bucchiarone, H. Muccini and P. Pelliccione, “Architecting Fault-tolerant Component-based Systems: from requirements to testing”, Electronic Notes in Theoretical Computer Science, vol. 168, Elsevier, 2007

• [Domokos2005] – P. Domokos and I. Majzik, “Design and Analysis of Fault Tolerant Architectures by Model Weaving”, 9th IEEE Intl. Symposium on High-Assurance Systems Engineering (HASE’05), IEEE CS, 2005

• [Huszerl2002] – G. Huszerl, I. Majzik, A. Pataricza, K. Kosmidis, M. Dal Cin, “Quantitative Analysis of UML Statechart Models of Dependable Systems”, The Computer Journal, Vol 45(3), May 2002

• [Leangsuksun2003] – C. Leangsuksun, H. Song, L. Shen, “Reliability Modeling Using UML”, Int. Conf. on Software Engineering Research and Practice (SERP'03), CSREA Press, 2003

• [Majzik2002] – I. Majzik, A. Pataricza, A. Bondavalli, “Stochastic Dependability Analysis of System Architecture Based on UML Models”, ICSE 2002 Workshop on Software Architectures for Dependable Systems, LNCS 2677, Springer, 2002

• [Pai2002] – G. J. Pai, J. B. Dugan, “Automatic Synthesis of Dynamic Fault Trees from UML System Models”, 13th Intl. Symposium on Software Reliability Engineering (ISSRE’02), IEEE CS, 2002

References (II)

• [Reddy2005] – R. Reddy, R. France, G. Georg, “An Aspect Oriented Approach to Analyzing Dependability Features”, Workshop on Aspect Oriented Modeling at Intl. Conf. on Aspect Oriented Software Development (AOM-AOSD 2005)

• [Sand2006] – M. Sand, “Patternbasierte Verifikation objektorientierter Modelle - Methodik, Semantik und Verfahren”, PhD Thesis, University of Erlangen-Nürnberg, 2006

Static Redundancy: N-version Programming

• The elements of n-version programming are:

◊ Variants: modules with different design but providing the same service

◊ Controller: responsible for the coordinated execution of the variants

◊ Adjudicator: responsible for checking the results offered by the variants

• Useless if not combined with hardware redundancy!

[Figure: the Controller coordinates Variant 1, Variant 2, ..., Variant n, whose results are checked by the Adjudicator]
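The three roles can be sketched in a few lines. This is an illustration under assumptions: the variants are hypothetical (one is deliberately faulty for the input 3), they run sequentially in one process rather than on separate hardware, and the adjudicator uses majority voting, which is only one possible adjudication policy.

```python
from collections import Counter


def variant_1(x: int) -> int:
    return x * x


def variant_2(x: int) -> int:
    return x ** 2


def variant_3(x: int) -> int:
    # Hypothetical design fault: wrong result for one particular input.
    return x * x + 1 if x == 3 else x * x


def adjudicator(results, n_variants: int):
    """Accept a result only if a strict majority of all variants agree on it."""
    if results:
        value, count = Counter(results).most_common(1)[0]
        if 2 * count > n_variants:
            return value
    raise RuntimeError("no majority among variants")


def controller(variants, x):
    """Run every variant on the same input and hand the results to the adjudicator."""
    results = []
    for variant in variants:
        try:
            results.append(variant(x))
        except Exception:
            pass  # a crashed variant simply contributes no vote
    return adjudicator(results, len(variants))


masked = controller([variant_1, variant_2, variant_3], 3)
```

The fault in `variant_3` is masked without any explicit error detection or recovery step, which is what makes this a static form of redundancy.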

Coordinated Atomic Actions (CAAs)

• Support for error recovery of multiple interacting components in a distributed system

• A CAA is designed as a set of participants cooperating inside the CAA and a set of resources accessed by those participants

• The CAA starts when all participants have been activated and finishes when all of them reach the end of the CAA (i.e. produce a normal outcome)

• If an error is detected inside a CAA (i.e. a participant raises an exception), all participants are involved in recovery

• If recovery is successful, the action completes normally. Otherwise, a failure exception is propagated to the containing CAA
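The control flow of a CAA can be sketched as follows. This is a strong simplification and an assumption-laden illustration: participants run sequentially instead of concurrently, resources are a plain dictionary, and the names `run_caa`, `CAAFailure`, `producer` and `consumer` are hypothetical.

```python
class CAAFailure(Exception):
    """Signals to the containing CAA that recovery inside this CAA failed."""


def run_caa(participants, handlers, resources):
    """participants: name -> callable(resources)
    handlers:     name -> callable(resources, exc), one per participant
    """
    try:
        for body in participants.values():
            body(resources)
    except Exception as exc:
        # One participant raised an exception: all participants take part in recovery.
        try:
            for handler in handlers.values():
                handler(resources, exc)
        except Exception as recovery_exc:
            raise CAAFailure("recovery failed") from recovery_exc
    return "normal"  # normal outcome, possibly reached after successful recovery


def producer(res):
    res["buffer"].append("item")


def consumer(res):
    raise ValueError("corrupt item")


handlers = {
    "producer": lambda res, exc: res["buffer"].clear(),
    "consumer": lambda res, exc: res["log"].append(str(exc)),
}

resources = {"buffer": [], "log": []}
outcome = run_caa({"producer": producer, "consumer": consumer}, handlers, resources)
```

A nested CAA would call `run_caa` from inside a participant of the containing CAA, so a `CAAFailure` raised by the inner action triggers recovery in the outer one.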

Dependability Annotations in UML models

• Several proposals in the literature to annotate UML models with dependability-related information

◊ Each one covers only certain dependability aspects

• UML profile for Modeling Quality of Service & Fault Tolerance Characteristics & Mechanisms (QoS&FT)

◊ Flexible, but heavy-weight mechanisms

◊ May require the creation of extra objects just for annotation purposes

• Dependability Analysis Modelling (DAM) profile [Bernardi2008]

◊ Emphasis on quantitative analysis

◊ Aims at unifying best practices reported in literature

◊ Compliant with MARTE profile

• Dependability-specific data types defined with MARTE’s mechanisms (i.e. Non-Functional Properties framework and Value Specification Language)

• Specializes concepts from MARTE’s generic quantitative analysis model

DAM’s Conceptual Model

• Represents the main dependability concepts from literature

• System Core: Concepts for the description of the system to be analyzed, and for the description of redundancy structures

• Threats: Concepts for modeling threats and their relationships

• Maintenance: Concepts for modeling repair/recovery actions

DAM’s Conceptual Model: Core

• Structural view: System as set of components interconnected via connectors

• Behavioral view:

◊ System delivers high-level services (i.e. behavior as observed by the users) upon user service requests

◊ Components interact to deliver the high-level services, by providing and requesting basic services to each other

◊ Service = sequence of steps (i.e. component states and actions)

DAM’s Conceptual Model: Redundancy

• Represents redundancy structures that may characterize a system

• Components can play different roles within a redundant structure:

◊ Variants: modules with different design but providing the same service, and allocated over different spares

◊ Controller: responsible for the coordinated execution of the variants

◊ Adjudicator: responsible for checking the results offered by the variants

DAM’s Conceptual Model: Threats