1 7. Fault Tolerance Introduction Process Resilient Recovery.

1

7. Fault Tolerance

Introduction

Process Resilience

Recovery

2

Learning Objectives

To understand the basic concepts of fault tolerance and different types of failures in DS, and failure masking by redundancy;

To study the main design issues in process resilience, and how process replication (process groups) can be used to mask failures and reach agreement in faulty systems;

To understand the concept of error recovery, and the basic idea of combining checkpointing with message logging.

3

Introduction: Basic Concept

• A characteristic feature of DSs is the notion of partial (independent) failures: each component of the system can fail independently, leaving the others still running, and the failure may not be immediately known to the other components.

• Fault tolerance: An important goal in DSs is to construct the system in such a way that it can automatically recover from partial failures without seriously affecting the overall performance.

• Being fault-tolerant is strongly related to the notion of dependable systems: including availability, reliability, safety, and maintainability.

4

Introduction: Basic Concept

Availability: A system is ready to be used immediately.

Reliability: A system can run continuously without failure. It is defined in terms of a time interval (whereas availability is defined in terms of an instant in time).

Safety: When a system temporarily fails to operate correctly, nothing catastrophic happens.

Maintainability: How easily a failed system can be repaired.

A system is said to fail when it cannot meet its promises. An error is a part of a system’s state that may lead to a failure. The cause of an error is called a fault.

5

Introduction: Failure Models and Masking Failure by Redundancy

To get a better understanding of how serious a failure in a DS actually is, several classification schemes have been developed, as shown in the next slide.

The key technique for masking faults is to use redundancy, including information redundancy, time redundancy, and physical redundancy.

Information redundancy: Extra bits are added to allow recovery from garbled bits, for example by using a Hamming code.

Time redundancy: An action is performed, and then, if need be, it is performed again, such as transactions.

Physical redundancy: Extra physical components exist to provide fault tolerance.
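Information redundancy can be illustrated with a minimal sketch of a Hamming(7,4) code, which adds three parity bits to four data bits so that any single flipped bit can be located and corrected. The function names and the bit layout below are assumptions made for this illustration, not part of the slides.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 0 means no error was detected
    if error_position:
        c[error_position - 1] ^= 1  # flip the garbled bit back
    return [c[2], c[4], c[5], c[6]]


# Usage: garble one bit in transit and recover the original data.
sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1  # simulate a single-bit fault
recovered = hamming74_decode(received)
```

The three extra bits are the redundancy: they carry no new information, but they let the receiver mask a single-bit fault without asking for a retransmission.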

6

Failure Models: Different types of failures

Type of failure: Description

Crash failure: A server halts, but is working correctly until it halts

Omission failure: A server fails to respond to incoming requests
– Receive omission: A server fails to receive incoming messages
– Send omission: A server fails to send messages

Timing failure: A server's response lies outside the specified time interval

Response failure: The server's response is incorrect
– Value failure: The value of the response is wrong
– State transition failure: The server deviates from the correct flow of control

Arbitrary failure: A server may produce arbitrary responses at arbitrary times

7

Process Resilience

It is critical to protect against process failures, which is achieved by replicating processes into groups.

The purpose of having groups is to allow processes to deal with collections of processes as a single abstraction. When a message is sent to the group itself, all members of the group can receive it.

Process groups may be dynamic. New groups can be created and old groups can be removed. A process can join or leave a group during system operation. A process can be a member of several groups at the same time.

Internal structure of a group: In a flat group, all processes are equal. In a hierarchical group, some kind of hierarchy exists, for example one process acting as the coordinator and all others as workers.

8

Flat Groups versus Hierarchical Groups

a) Communication in a flat group.

b) Communication in a simple hierarchical group

9

Recovery

Once a failure has occurred, it is essential that the faulty process can recover to a correct state.

An error is that part of a system’s state that may lead to a failure. There are two forms of error recovery: forward recovery and backward recovery.

In forward recovery, when the system has just entered an erroneous state, an attempt is made to bring the system into a correct new state from which it can continue to execute. The main difficulty is that it has to be known in advance which errors may occur.

The backward recovery is to bring the system from its present erroneous state back into a previous correct state.

10

Recovery

For backward recovery, it is necessary to record the system’s state from time to time. Each time (part of) the system’s state is recorded, a checkpoint is said to be made.

Backward recovery techniques have been widely applied as a general error recovery mechanism, as they are often independent of any specific system or process.

Taking a checkpoint is often a costly operation. Thus many fault-tolerant DSs combine checkpointing with message logging.

In message logging, after a checkpoint has been taken, a process logs its messages before sending them off (for possible later replay), which makes it possible to restore a state that lies beyond the most recent checkpoint without the cost of checkpointing.
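The combination of checkpointing and message logging can be sketched as follows. This is only an illustration under simplifying assumptions: one sender, one receiver, in-memory storage, and hypothetical class and method names. The sender logs each message before sending it; the receiver checkpoints occasionally and, after a crash, restores the checkpoint and asks the sender to replay everything logged after it.

```python
import copy


class Sender:
    """Logs every message before sending it, so it can be replayed later."""

    def __init__(self):
        self.log = []       # list of (sequence number, amount)
        self.next_seq = 0

    def send(self, receiver, amount):
        entry = (self.next_seq, amount)
        self.log.append(entry)      # log first, then send
        self.next_seq += 1
        receiver.deliver(*entry)

    def replay_after(self, receiver, last_seq):
        """Resend every logged message newer than last_seq."""
        for seq, amount in self.log:
            if seq > last_seq:
                receiver.deliver(seq, amount)


class Receiver:
    def __init__(self):
        self.state = {"total": 0, "last_seq": -1}
        self.checkpoint = copy.deepcopy(self.state)

    def deliver(self, seq, amount):
        self.state["total"] += amount
        self.state["last_seq"] = seq

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)

    def crash_and_restore(self):
        """Lose the current state and fall back to the last checkpoint."""
        self.state = copy.deepcopy(self.checkpoint)
        return self.state["last_seq"]


# Usage: two messages, a checkpoint, two more messages, then a crash.
sender, receiver = Sender(), Receiver()
sender.send(receiver, 5)
sender.send(receiver, 7)
receiver.take_checkpoint()          # state: total = 12
sender.send(receiver, 11)
sender.send(receiver, 13)           # state: total = 36
last_seq = receiver.crash_and_restore()   # back to total = 12
sender.replay_after(receiver, last_seq)   # replay brings total back to 36
```

Only one checkpoint was taken, yet the state after the last message is recovered; the log replaces the checkpoints that would otherwise be needed after every message.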

11

Summary

A system is fault tolerant if it can continue to operate in the presence of failures.

Several types of failures such as crash failure, omission failure, timing failure, response failure, and Byzantine (arbitrary) failure in DS have been identified.

Redundancy is the key technique needed to achieve fault tolerance. It can be applied to processes, so that a group of processes works closely together to provide a service.

Recovery in fault-tolerant systems is invariably achieved by checkpointing the state of the system on a regular basis.

12

8. Security

Introduction and Cryptography

Security Channels: Authentication

Security Channel: Message Confidentiality and Integrity

Access Control

Security Management

Case Study

13

Learning Objectives

To become familiar with the range of security threats faced by networked and distributed systems (DSs);

To examine various cryptographic techniques fundamental to security in DSs, such as symmetric cryptosystems and asymmetric cryptosystems;

To fully study the two main parts in security in DS: secure channel and authorization (access control), using main techniques of encryption, authentication, and access control;

To gain an understanding of the major methods in security management.

14

Introduction

The security problems in DSs arise from the openness of the Internet and of distributed systems.

Security measures must be incorporated into computer systems whenever they are potential targets for malicious or mischievous attacks.

Security in computer systems is strongly related to the notion of dependability: a dependable system is one that we can justifiably trust to deliver its services. Confidentiality and integrity are two major properties of such systems.

Confidentiality: the information is disclosed only to authorized parties.

Integrity: Alteration to a system’s assets (hardware, software, data etc) can be made only in an authorized way.

15

Security Model: Threats and forms of attack

Masquerading – assuming the identity of another user/principal

Eavesdropping (Interception)– obtaining private or secret information

Message tampering (Modification)– altering the content of messages in transit

Replaying (Fabrication)– storing secure messages and sending them at a later date

Denial of service (Interruption)– flooding a channel or other resource, denying access to others

16

Security Policy and Mechanisms

Security policy is a set of requirements and guidelines to ensure a desired level of security for the activities that are performed in the system.

Security mechanisms are employed to implement the security policy.

Security in DSs can be roughly divided into two major parts: secure channel and authorization.

Secure channel: to ensure secure communication, including authentication, message confidentiality and integrity.

Authorization (access control): to ensure that a process gets only those access rights to the resources in a DS it is entitled to.

17

Introduction: Secure channels

Properties
– Each process is sure of the identity of the other (authentication)
– Data is private and protected against eavesdropping (confidentiality)
– Data is protected against alteration (integrity)

Employs cryptographic techniques
– Authentication based on proof of ownership of secrets
– Confidentiality and integrity based on cryptographic techniques

[Figure: Principal A with process p, Principal B with process q, a secure channel between them (cryptography), and the enemy.]

18

Introduction: Authorization

[Figure labels: Principal (user), Client, invocation, result, Network, Server, Principal (server), Access rights, Object.]

Object (or resource)
– Mailbox, system file, part of a commercial web site

Principal
– User or process that has authority (rights) to perform actions
– Identity of principal is important

19

Important Security Mechanisms

Encryption: Using cryptographic techniques, encryption transforms data into something an attacker cannot understand (for confidentiality). It also provides support for integrity checks.

Authentication: It is used to verify the claimed identity of a user, client, server and so on.

Authorization: It is necessary to check whether a client is authorized to perform the action required.

Auditing: It is used to trace which clients accessed what, and in which way, for later security analysis.

20

Cryptography

Fundamental to security in DSs is the use of cryptographic techniques: the sender first encrypts message P (plaintext) into an unintelligible message C (ciphertext) , and then sends C to the receiver who must decrypt C into its original form P.

Encryption and decryption are achieved by using cryptographic methods parameterized by keys, as shown in the next slide, which can protect against eavesdropping (for confidentiality) and tampering (for integrity).

Notations:

C = EK(P) – ciphertext C is obtained by encrypting plaintext P using key K in a cryptographic method (encryption function).

P = DK(C) – plaintext P is obtained by decrypting ciphertext C using key K in a cryptographic method.

21

Cryptography (1)

Intruders and eavesdroppers in communication.

22

Cryptography: Symmetric Cryptosystem

Message P, key K, published encryption functions E, D

Symmetric (secret key)

C = EK(P); P = DK(EK(P))

* Same key K for E and D;

* P must be hard (infeasible) to compute if K is not known;

* Usual form of attack is brute-force: try all possible key values for a known pair P, C. Resisted by making K sufficiently large ~ 128 bits.
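The defining property P = DK(EK(P)), with the same key K on both sides, can be shown with a deliberately simple toy cipher. The construction below (XOR with a keystream derived from SHA-256) is an assumption chosen because it needs only the standard library; it is for illustration only and must not be used to protect real data.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def E(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt by XOR-ing the plaintext with the key-dependent keystream."""
    stream = keystream(key, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def D(key: bytes, ciphertext: bytes) -> bytes:
    """Decrypt: XOR with the same keystream undoes the encryption."""
    return E(key, ciphertext)


# Usage: the same 128-bit key K is used for both E and D.
K = bytes(range(16))
P = b"transfer 100 to Bob"
C = E(K, P)
recovered = D(K, C)
```

With a 128-bit key, a brute-force attacker has 2^128 (about 3.4 × 10^38) candidate keys to try, which is why keys of that size resist exhaustive search.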

23

Cryptography: Asymmetric Cryptosystem

Message P, keys Ke, Kd, published encryption functions E, D

Asymmetric (public key)

Separate encryption and decryption keys: Ke and Kd

C = EKe(P); P = DKd(EKe(P))

• One of the keys is kept private, the other is made public;

• Depends on the use of a trap-door function to make the keys. A trap-door function is a one-way function with a secret exit, e.g. the product of two large prime numbers: easy to multiply, very hard (infeasible) to factorize;

• E has high computational cost;

• Very large keys > 512 bits.
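A minimal sketch of the public-key idea, using textbook RSA with two tiny primes, shows the separate keys Ke and Kd at work. The primes, the exponent and the function names are illustrative assumptions; real keys are far larger and real systems add padding, so this is not a usable implementation.

```python
# Two small primes; multiplying them is easy, recovering them from n is
# the hard direction of the trap-door function (for large primes).
p, q = 61, 53
n = p * q                    # 3233, part of both keys
phi = (p - 1) * (q - 1)      # 3120, computable only if p and q are known

e = 17                       # public exponent:  Ke = (e, n)
d = pow(e, -1, phi)          # private exponent: Kd = (d, n); needs Python 3.8+


def encrypt(message: int) -> int:
    """C = E_Ke(P): anyone holding the public key can do this."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """P = D_Kd(C): only the holder of the private key can do this."""
    return pow(ciphertext, d, n)


# Usage: the message must be an integer smaller than n.
plaintext = 65
ciphertext = encrypt(plaintext)
recovered = decrypt(ciphertext)
```

Knowing e and n does not reveal d unless n can be factorized into p and q, which is the secret exit of the trap-door function.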

24

Cryptography (2)

Other notations used in this chapter.

Notation: Description

KA,B: Secret key shared by A and B

KA+: Public key of A

KA-: Private key of A

25

Cryptography: Examples of Symmetric Cryptographic Algorithm

DES: The US Data Encryption Standard (1977). No longer strong in its original form. 56-bit key, operating on 64-bit blocks of data.

TEA: A simple but effective algorithm developed at Cambridge U (1994) for teaching and explanation. 128-bit key.

Triple-DES: Applies DES three times with two different keys. 112-bit key.

IDEA: International Data Encryption Algorithm (1990). Resembles TEA. 128-bit key.

AES: The US Advanced Encryption Standard (proposed 1997, standardized 2001). 128/192/256-bit key.

There are also many other effective algorithms.

26

Cryptography: Examples of Asymmetric Cryptographic Algorithm

RSA: The first practical algorithm (Rivest, Shamir and Adleman 1978) and still the most frequently used. Key length is variable, 512-2048 bits. The security of RSA comes from the fact that no methods are known to efficiently find the prime factors of large numbers.

Elliptic curve: A recently-developed method, shorter keys and faster.

Asymmetric algorithms are about 1000 times slower and are therefore not practical for bulk encryption, but their other properties make them ideal for key distribution and for authentication uses.

27

Security Channels

The client-server model has been used as a convenient way to organize a DS. When considering security in DSs, it is also useful to think in terms of clients and servers.

Two major issues in security: secure channel and authorization.

A secure channel should be achieved by authentication and protection for message confidentiality and integrity.

Intuitively, authentication and message integrity are strongly related and should go together (why?).

A session key is a shared key that is generally used only for as long as the channel exists. It is commonly used for secret-key cryptography to ensure message confidentiality and integrity after authentication.

28

Security Channels: Authentication Based on Shared Secret Key

Authentication based on a shared secret key: Suppose that Alice (A) and Bob (B) have a shared secret key KA,B (how they obtain KA,B will be discussed later). A protocol known as the challenge-response protocol is shown in the next slide.

In this protocol, after B receives A’s identity and her request to set up a communication channel between A and B (Message 1), he sends a challenge RB (Message 2; it could be a random number) to A. A is required to encrypt the challenge with KA,B and return KA,B(RB) (Message 3) to B. A also sends a challenge RA (Message 4), to which B responds by returning KA,B(RA) (Message 5). If B finds that KA,B(RB) decrypts to his challenge RB, and A finds that KA,B(RA) decrypts to her challenge RA, each can be sure of the other’s identity.
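The five messages can be sketched as below. One substitution is made and should be kept in mind: instead of encrypting the challenge with KA,B, this sketch computes an HMAC over it with the shared key, because the Python standard library has no symmetric cipher. The proof of identity is the same in spirit (only a holder of the key can produce the response), but it is not the slides' exact construction, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Stand-in for K_A,B(R): a keyed digest only a key holder can compute."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Check a response against the challenge that was actually issued."""
    return hmac.compare_digest(respond(shared_key, challenge), response)


def mutual_authentication(key_held_by_a: bytes, key_held_by_b: bytes) -> bool:
    """Run Messages 2 to 5; Message 1 (A's identity) is omitted."""
    r_b = secrets.token_bytes(16)                 # Message 2: B -> A, challenge RB
    response_a = respond(key_held_by_a, r_b)      # Message 3: A -> B
    if not verify(key_held_by_b, r_b, response_a):
        return False                              # B rejects A
    r_a = secrets.token_bytes(16)                 # Message 4: A -> B, challenge RA
    response_b = respond(key_held_by_b, r_a)      # Message 5: B -> A
    return verify(key_held_by_a, r_a, response_b) # A accepts or rejects B


# Usage: authentication succeeds only when both sides hold the same key.
k_ab = secrets.token_bytes(16)
ok = mutual_authentication(k_ab, k_ab)
impostor = mutual_authentication(secrets.token_bytes(16), k_ab)
```

Fresh random challenges matter here: if RB were predictable or reused, an eavesdropper could record an old response and replay it.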

29

Authentication Based on Shared Secret Key (1)

Authentication based on a shared secret key.

30

Security Channels: Authentication Using a Key Distribution Center (KDC)

The Key Distribution Center (KDC) shares a secret key with each of the hosts, e.g., the KDC has a shared secret key KA,KDC with A. A system with N hosts then has to manage only N keys, rather than the N(N-1)/2 keys needed when every pair of hosts shares its own key.

The principle of using a KDC is shown in Slide 31. A first sends a message to the KDC, telling it that she wants to talk to B. The KDC returns a shared secret key KA,B (encrypted with the secret key KA,KDC). The KDC also sends KA,B to B (encrypted with KB,KDC).

A may want to start setting up a secure channel with B even before B has received the shared key from the KDC. The KDC may therefore just pass KB,KDC(KA,B), called a ticket, to A and let A take care of connecting to B, as shown in Slide 32. This is actually a variation of the well-known Needham-Schroeder authentication protocol shown in Slide 33.

31

Authentication Using a Key Distribution Center (1)

The principle of using a KDC.

32


Using a ticket and letting Alice set up a connection to Bob.

33


The Needham-Schroeder authentication protocol.

34

Security Channels: Authentication Using a Key Distribution Center (KDC)

In the Needham-Schroeder protocol in Slide 33, the challenge RA1 in Message 1 from A to the KDC is also known as a nonce: a random number that is used only once.

A nonce is mainly used to uniquely relate two messages to each other, such as Message 1 and Message 2 in the protocol, which helps to prevent replay attacks.

The problem with the Needham-Schroeder protocol: if the intruder, Chuck (C), gets hold of an old key KA,B, he can replay Message 3 and get B to set up a channel. To avoid such a replay by C, Message 3 needs to be related to Message 1.
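How a nonce ties a reply to one specific request can be sketched in a few lines. The sketch leaves out all encryption and keeps only the bookkeeping; the class, the function and the message fields are hypothetical names chosen for illustration.

```python
import secrets


class Requester:
    """Plays the role of A: issues nonces and accepts each one only once."""

    def __init__(self):
        self.pending = set()

    def new_request(self, peer: str) -> dict:
        nonce = secrets.token_hex(16)   # fresh random value, used once
        self.pending.add(nonce)
        return {"nonce": nonce, "peer": peer}

    def accept_reply(self, reply: dict) -> bool:
        nonce = reply.get("nonce")
        if nonce in self.pending:
            self.pending.remove(nonce)  # a nonce can never be accepted twice
            return True
        return False


def kdc_reply(request: dict) -> dict:
    """Stand-in for the KDC's answer, echoing the nonce (encryption omitted)."""
    return {
        "nonce": request["nonce"],
        "peer": request["peer"],
        "session_key": secrets.token_hex(16),
    }


# Usage: the genuine reply is accepted, a replayed copy of it is not.
alice = Requester()
reply = kdc_reply(alice.new_request("Bob"))
first_time = alice.accept_reply(reply)
replayed = alice.accept_reply(reply)
```

The same idea motivates the fix described above: if Message 3 carried a value that B could relate to the current run, a replayed Message 3 from an old run would be recognized as stale.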

35

Security Channels: Authentication Using Public Key Cryptography

Consider the situation that A wants to set up a secure channel to B, and both A and B have the other’s public key. A typical authentication protocol based on public-key is shown in the next slide.

In the protocol, A first sends a challenge RA to B encrypted with B’s public key. Only B can decrypt the message.

After B receives A’s request, he returns the decrypted challenge, along with his own challenge to authenticate A and a newly generated session key KA,B. These are put into one message, which is encrypted with A’s public key.

A finally returns her response to B’s challenge using the session key, proving her identity by showing that she could decrypt Message 2.

36

Authentication Using Public-Key Cryptography

Mutual authentication in a public-key cryptosystem.

37

Security Overview

Two major parts of security in DS:

Security channel: authentication, message confidentiality, and message integrity

Authorization (access control): access control list and capabilities

Cryptography: symmetric (secret key) cryptosystem and asymmetric (public key) cryptosystem

Examples of symmetric (secret key) cryptosystem algorithms (DES, TEA etc) and asymmetric (public key) cryptosystem algorithms (RSA, Elliptic curve)

38

Security

Security Channel:

* Authentication: based on shared secret key, or Key Distribution Center (Needham-Schroeder protocol); or based on public key cryptography

* Message confidentiality: using cryptographic methods

* Message integrity: using digital signatures based on cryptography

Authorization (access control): Access control list and capabilities

Security management: Key distribution and authorization management

Case study: Kerberos system

39

Security Channels: Message Confidentiality and Integrity

In addition to authentication, a secure channel should also provide guarantees for message confidentiality and integrity.

Message confidentiality can be achieved by simply encrypting a message before sending it, which can be done either through a secret key, or using the receiver’s public key.

Protecting message integrity is more complicated. Digital signatures are often used to ensure message integrity.

A message m can be signed by a principal A by encrypting a copy of m with a key KA, and attaching it to a plaintext copy of m (as well as A’s identifier).

40

Digital Signatures

• Requirements of digital signatures:
– To authenticate stored document files as well as messages;
– To protect against forgery;
– To prevent the signer from repudiating a signed document (denying their responsibility).

• One popular form is to use a public-key (asymmetric) cryptosystem.

• The public-key method is particularly well suited to generating digital signatures, as it is relatively simple and does not require any communication between the sender and receiver.

41

Digital Signatures with Public Keys

• When A (Alice) sends a message m to B (Bob), she encrypts it with her private key KA-.

• If A also wants to keep the message content secret, she can use B’s public key KB+ to encrypt the message, that is, send KB+(m, KA-(m)) to B.

• When the message arrives at B, B first decrypts it using his own private key KB- to get (m, KA-(m)). He then uses A’s public key KA+ to decrypt the signed version of m (i.e., KA-(m)), and compares KA+(KA-(m)) with m.

• If they are the same, it means that the message really came from A, and has not been modified.

42

Digital Signatures (1)

Digitally signing a message using public-key cryptography.

43


Major points in using public keys:

* B (Bob) must be sure that the public key he has is indeed owned by A (Alice) – using certificates, described later;

* The validity of A’s signature holds only as long as A’s private key remains secret – A must keep the private key secret;

* Once A changes her private key, statements she signed and sent to B earlier become worthless – using a central authority to keep track of the keys;

* A encrypts the entire message, which may be very long, resulting in a high cost – using a message digest, shown in the next slide.

44


Instead of encrypting the message M itself, we encrypt its secure digest.

A secure digest function computes a fixed-length hash H(M) that characterizes the document M;

H(M) should be:

- Fast to compute – easy to compute H(M) given M;

- Hard to invert – hard to compute M given H(M);

- Given M and H(M), it is computationally infeasible to find another M’, M≠M’, such that H(M)=H(M’).

45

Digital Signature with Secure Digest Functions

• Some popular digests (hash functions):

* MD5: Developed by Rivest (1992). Computes a 128-bit digest.

* SHA: (1995) based on Rivest's MD4 but made more secure by producing a 160-bit digest.

Digitally signing a message using a message digest is shown in the next slide.

A first computes a message digest H(m), encrypts it using her private key, and sends KA-(H(m)), together with m, to B. B then decrypts KA-(H(m)) using A’s public key and compares the result with the message digest he calculates from m himself. If they match, B knows that the message has been signed by A.
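The sign-the-digest scheme can be sketched with textbook RSA and SHA-256. Several things here are assumptions made for illustration: the key is built from two publicly known Mersenne primes (so it offers no security at all), the digest is simply reduced modulo n instead of being padded as real signature schemes require, and the function names are hypothetical.

```python
import hashlib

# Toy key pair for A. Both primes are well known, so this is insecure.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                              # public exponent,  KA+ = (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, KA- = (D, N)


def digest(message: bytes) -> int:
    """H(m), reduced modulo N so that it fits the toy key."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """KA-(H(m)): only the private exponent D can produce this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Apply KA+ to the signature and compare with a freshly computed H(m)."""
    return pow(signature, E, N) == digest(message)


# Usage: A sends (m, signature); B verifies with A's public key.
m = b"pay Bob 10 euros"
signature = sign(m)
genuine = verify(m, signature)
tampered = verify(b"pay Bob 1000 euros", signature)
```

Only the short, fixed-length digest goes through the expensive private-key operation, regardless of how long m is, which is the cost saving the scheme is after.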

46

Digital Signatures (2)

Digitally signing a message using a message digest.

47

Access Control

In the client-server model, the client can issue requests that are to be carried out by the server once a secure channel between client and server has been set up.

Requests from a client involve carrying out operations on resources controlled by the server. Such requests can be carried out only if the client has sufficient access rights for them.

Verifying access rights is referred to as access control, whereas authorization is about granting access rights. The two terms are strongly related and are often used interchangeably.

48

Access Control: ACL and Capability

Access control list (ACL): Each object maintains a list (ACL) of the access rights of the subjects that may access it (e.g., Unix file access permissions). It corresponds to distributing the access control matrix (one row per subject, one column per object) column-wise, leaving out empty entries.

Using ACL, when a client sends a request to a server to access an object, the server’s reference monitor will check whether it knows the client and if the client has the right to access the object by checking the object’s ACL.

Capabilities: A capability corresponds to an entry in the access control matrix (the access control matrix is distributed row-wise).
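The two ways of slicing the access control matrix can be made concrete with a small sketch. The subjects, objects and rights below are invented sample data, and the function names are hypothetical.

```python
# Access control matrix: one row per subject, one column per object.
# Missing entries mean "no rights".
matrix = {
    "alice": {"file1": {"read", "write"}, "file2": {"read"}},
    "bob": {"file1": {"read"}},
}


def to_acls(access_matrix):
    """Column-wise view: each object lists the subjects that may use it."""
    acls = {}
    for subject, row in access_matrix.items():
        for obj, rights in row.items():
            acls.setdefault(obj, {})[subject] = set(rights)
    return acls


def to_capability_lists(access_matrix):
    """Row-wise view: each subject holds its own list of capabilities."""
    return {
        subject: {obj: set(rights) for obj, rights in row.items()}
        for subject, row in access_matrix.items()
    }


# Usage: the same rights, stored with the objects or with the subjects.
acls = to_acls(matrix)
capabilities = to_capability_lists(matrix)
```

With ACLs the server looks up the requesting subject in the object's list; with capabilities the subject presents its own entry, so the server does not need a per-object list of subjects.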

49

Access Control: ACL and Capability

A capability is similar to a ticket (or a key): its holder is given rights associated with the ticket. It should be protected against modifications.

One way to protect capabilities is to use issuer’s signature. A capability may have the format: <resource id, permitted operations, checking code> with signature in the checking code.

Using capabilities, a client simply sends its request to the server that will not check the client’s identity (why?). The server needs only check whether the capability is valid and the requested operation is in the capability.

Problem with capabilities: eavesdropping, difficulty of cancellation etc.
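A capability of the form <resource id, permitted operations, checking code> can be sketched as follows. The checking code here is an HMAC computed with a secret known only to the issuing server, which is one possible way to realize the issuer's signature mentioned above; the secret, the field encoding and the function names are all assumptions for illustration.

```python
import hashlib
import hmac

# Hypothetical secret known only to the issuing server.
SERVER_SECRET = b"hypothetical-server-secret"


def _checking_code(resource_id: str, operations: str) -> str:
    payload = f"{resource_id}|{operations}".encode()
    return hmac.new(SERVER_SECRET, payload, hashlib.sha256).hexdigest()


def issue_capability(resource_id: str, operations: set) -> tuple:
    """Return <resource id, permitted operations, checking code>."""
    ops = ",".join(sorted(operations))
    return (resource_id, ops, _checking_code(resource_id, ops))


def is_allowed(capability: tuple, operation: str) -> bool:
    """Check validity and the requested operation; no client identity needed."""
    resource_id, ops, code = capability
    valid = hmac.compare_digest(_checking_code(resource_id, ops), code)
    return valid and operation in ops.split(",")


# Usage: a genuine capability works, a modified one is rejected.
cap = issue_capability("file1", {"read"})
can_read = is_allowed(cap, "read")
can_write = is_allowed(cap, "write")
forged = ("file1", "read,write", cap[2])   # holder tries to add a right
forged_write = is_allowed(forged, "write")
```

The check protects against modification but not against the problems listed above: anyone who overhears the capability can present it, and revoking it requires extra machinery.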

50

ACL and capability

Comparison between ACLs and capabilities for protecting objects.

a) Using an ACL

b) Using capabilities.

51

Security Management: Key Distribution

Security management includes key distribution and certification, and authorization management.

In a symmetric cryptosystem (using secret keys), the initial shared secret key must be delivered along a secure channel that provides authentication and confidentiality. If no key is available to set up the initial secure channel, some means of communication other than the network should be used, such as snail mail.

In an asymmetric cryptosystem (using public keys), the private key needs to be delivered in the same way as a secret key. The public key should be distributed in such a way that the receivers can be sure that the key is indeed paired with the claimed private key.

52

Case Study: Kerberos

Secures communication with servers on a local network
– Developed at MIT in the 1980s to provide security across a large campus network (> 5000 users);
– Based on the Needham-Schroeder protocol.

Standardized and now included in many operating systems
– Internet RFC 1510, OSF DCE
– BSD UNIX, Linux, Windows 2000, NT, XP, etc.
– Available from MIT

Kerberos server creates a shared secret key for any required server and sends it (encrypted) to the user's computer.

53


Kerberos can be viewed as a security system that assists clients in setting up a secure channel with a server. Security is based on shared secret keys.

There are two components: the Authentication Server (AS), which authenticates a user and provides a key used to set up secure channels with servers, and the Ticket Granting Server (TGS), which hands out tickets used to convince a server of the identity of the client.

The Kerberos system can be explained by the figure in the next slide, in which Alice (A) wants to set up a secure channel with Bob (B) using Kerberos.

54

Example: Kerberos (1)

Authentication in Kerberos.

55


A logs onto the system using any workstation available (Message 1). The workstation then sends her name (plaintext) to the AS (Message 2).

The AS then returns a session key KA,TGS and a ticket KAS,TGS(A, KA,TGS ) for her to hand over to the TGS (Message 3). Message 3 is encrypted with the secret key KA,AS shared between A and AS.

After the workstation receives the response (Message 3) from the AS, it prompts A for her password (Messages 4 and 5) and uses the password to generate the key KA,AS, with which Message 3 can be decrypted. After that, A’s password can be discarded.

A can now consider herself logged into the system, and she can contact other users or servers.

56


Assume that A now wants to talk to B. She requests the TGS to generate a session key for B (Message 6).

Message 6 contains the ticket KAS,TGS(A, KA,TGS) (to prove she is A) and a timestamp t encrypted with the key KA,TGS. The timestamp prevents intruders from maliciously replaying Message 6 later in an attempt to set up a channel to B: if it differs by more than a few minutes from the current time, the request for a ticket is rejected.
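The timestamp test applied to Message 6 amounts to a simple freshness check, sketched below. The five-minute window and the function name are assumptions for illustration; the slides say only "a few minutes".

```python
import time

MAX_SKEW_SECONDS = 5 * 60   # assumed acceptance window


def is_fresh(timestamp, now=None, max_skew=MAX_SKEW_SECONDS):
    """Accept a request only if its timestamp is close to the current time."""
    if now is None:
        now = time.time()
    return abs(now - timestamp) <= max_skew


# Usage with explicit clock values (seconds), so the result is repeatable.
recent = is_fresh(timestamp=1_000_000.0, now=1_000_060.0)    # 1 minute old
stale = is_fresh(timestamp=1_000_000.0, now=1_003_600.0)     # 1 hour old
```

A check like this requires the clocks of the client and the TGS to be roughly synchronized, and it does not by itself stop a replay that arrives inside the window.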

The TGS then responds with a session key KA,B, again encapsulated in a ticket that A will later have to pass to B (Message 7).

Setting up a secure channel with B is straightforward, and is shown in the next slide.

57

Example: Kerberos (2)

Setting up a secure channel in Kerberos.

58

Summary I

It is essential to protect the resources, communication channels and interfaces of distributed systems and applications against any attacks.

This is achieved by the use of secure channels and authorization (access control) mechanisms.

A secure channel ensures secure communication: providing authentication, message confidentiality and integrity.

Public-key and secret-key cryptography provide the basis for setting up secure channels. It is common practice to use public-key (asymmetric) cryptography for distributing short term shared secret keys (session keys).

59

Summary II

Authorization (access control) deals with protecting resources in such a way that only processes that have the proper access rights can actually access and use those resources. Access control always takes place after authentication.

There are two ways of implementing access control: access control lists (ACLs) and capabilities.

Two important issues in security management are key management and authorization management.

Kerberos is a widely used security system based on shared secret keys. Its main focus is on authentication, although it also incorporates protocols for access control and delegation of access rights.

60

Tutorial-I

Q1. Each of the security attacks: Masquerading (impostoring), eavesdropping, message tampering, replaying, and denial of service, is generally prevented by which mechanism?

Q2. Discuss the differences between using a secret key and using a public key for secure communication. What are the advantages and disadvantages of the two approaches, respectively?

Q3. Assume that Alice wants to send a message m to Bob. Instead of encrypting m with Bob’s public key KB+, she generates a session key KA,B and then sends [KA,B (m), KB+ (KA,B)]. Why is this scheme generally better? (Hint: consider performance issues.)

61

Tutorial-I

Q4. What is the major problem in using a shared secret key for authentication? How can authentication based on a Key Distribution Center (KDC) be used to overcome it?

Q5. Would it be safe to join message 3 and message 4 in the authentication protocol shown in Slide 3 into KA,B(RB,RA)?

Q6. Why is it not necessary in the figure in Slide 4 for the KDC to know for sure it was talking to Alice when it receives a request for a secret key that Alice can share with Bob?

62

Tutorial-II

Q1. Can we safely adapt the authentication protocol shown in the figure in the next slide, such that message 3 consists only of RB?

Q2. Can a secret key be used for a digital signature? List the main features if a secret key is used for a digital signature.

Q3. Initial exchanges of public keys are vulnerable to the man-in-the-middle attack. Describe as many ways against it as you can.

Q4. Does it make sense to restrict the lifetime of a session key? If so, give an example of how that could be established.

63

Tutorial-II

Q1: Mutual authentication in a public-key cryptosystem.

64

Tutorial-II

Q5. How are ACLs implemented in a UNIX file system?

Q6. In message 2 of the Needham-Schroeder authentication protocol, the ticket is encrypted with the secret key shared between Alice and the KDC. Is this encryption necessary?

Q7. Complete the figure in the next slide by adding the communication for authentication between Alice and Bob.

65

Tutorial-II

Q7: Authentication in Kerberos.
