Distributed Systems
Principles and Paradigms

Chapter 12 (version April 7, 2008)

Maarten van Steen

Vrije Universiteit Amsterdam, Faculty of Science
Dept. Mathematics and Computer Science

Room R4.20. Tel: (020) 598 7784
E-mail: [email protected], URL: www.cs.vu.nl/~steen/

01 Introduction
02 Architectures
03 Processes
04 Communication
05 Naming
06 Synchronization
07 Consistency and Replication
08 Fault Tolerance
09 Security
10 Distributed Object-Based Systems
11 Distributed File Systems
12 Distributed Web-Based Systems
13 Distributed Coordination-Based Systems

Distributed Web-Based Systems

Essence: The WWW is a huge client-server system with millions of servers; each server hosting thousands of hyperlinked documents:

[Figure: a browser (on top of the OS) on the client machine and a Web server on the server machine. 1. Get document request (HTTP); 2. Server fetches document from local file; 3. Response.]

• Documents are generally represented in text (plain text, HTML, XML)

• Alternative types: images, audio, video, but also applications (PDF, PS)

• Documents may contain scripts that are executed by the client-side software

12 – 1 Distributed Web-Based Systems/12.1 Architecture

Multi-tiered Architectures

Observation: Very soon, Web sites were organized into three tiers:

[Figure: a Web server (with an HTTP request handler), a CGI process running a CGI program, and a database server. Steps shown: 1. Get request; 3. Start process to fetch document; 4. Database interaction; 5. HTML document created; 6. Return result.]
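The three tiers can be illustrated with a minimal sketch. It assumes an in-memory SQLite database in place of the database server and plain function calls in place of starting a CGI process; the names `make_database`, `cgi_program`, and `handle_request` are hypothetical and chosen only for this illustration.

```python
import sqlite3
from html import escape


def make_database():
    """Third tier: a tiny in-memory database holding document bodies."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE docs (name TEXT PRIMARY KEY, body TEXT)")
    db.execute("INSERT INTO docs VALUES (?, ?)", ("welcome", "Hello & welcome"))
    return db


def cgi_program(db, name):
    """Second tier: query the database and create an HTML document."""
    row = db.execute("SELECT body FROM docs WHERE name = ?", (name,)).fetchone()
    if row is None:
        return None
    return f"<html><body><p>{escape(row[0])}</p></body></html>"


def handle_request(db, path):
    """First tier: the request handler passes the request on and returns the result."""
    html = cgi_program(db, path.lstrip("/"))
    if html is None:
        return 404, "<html><body><p>Not found</p></body></html>"
    return 200, html


status, document = handle_request(make_database(), "/welcome")
```

A real Web server would start a separate process for the CGI program and talk to the database over a connection; the function calls here only mirror the order of the steps.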

12 – 2 Distributed Web-Based Systems/12.1 Architecture

Web Services

Observation: At a certain point, people started recognizing that the Web was about more than just user ↔ site interaction: sites could offer services to other sites ⇒ standardization is then badly needed.

[Figure: Web service architecture. The server machine publishes its service description (WSDL) in a directory service (UDDI), where the client machine looks up a service. On both sides a stub is generated from the WSDL description. The client application and the server application communicate through their stubs and communication subsystems using SOAP.]

12 – 3 Distributed Web-Based Systems/12.1 Architecture

Clients: Web browsers

Observation: Browsers form the Web’s most important client-side software. They used to be simple, but that was long ago.

[Figure: components of a Web browser: user interface, browser engine, rendering engine, network communication, HTML/XML parser, client-side script interpreter, display back end.]

12 – 4 Distributed Web-Based Systems/12.2 Processes

Apache Web Server

Observation: More than 70% of all Web sites are based on Apache. The server is internally organized more or less according to the steps needed to process an HTTP request:

[Figure: the Apache core passes a request through a sequence of hooks until the response is produced. Modules provide functions; each function is linked to a hook, and the core calls the functions registered per hook.]
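The hook organization can be sketched in a few lines. This is not Apache's actual API: the class `Core`, the method names, and the hook names `translate`, `type_check`, and `respond` are hypothetical and only illustrate how functions from modules are linked to hooks that the core walks in a fixed order.

```python
from collections import defaultdict


class Core:
    """Toy stand-in for a server core that walks a fixed sequence of hooks."""

    def __init__(self, hook_names):
        self.hook_names = list(hook_names)
        # hook name -> functions linked to it, in registration order
        self.functions = defaultdict(list)

    def link(self, hook_name, function):
        """Link a module's function to a hook."""
        if hook_name not in self.hook_names:
            raise ValueError(f"unknown hook: {hook_name}")
        self.functions[hook_name].append(function)

    def process(self, request):
        """Pass the request through every hook, calling the functions per hook."""
        for hook_name in self.hook_names:
            for function in self.functions[hook_name]:
                request = function(request)
        return request


def translate_name(request):
    # Hypothetical module function: map the URL path to a local file name.
    return {**request, "file": "/var/www" + request["path"]}


def add_content_type(request):
    # Hypothetical module function: attach a content type.
    return {**request, "content_type": "text/html"}


core = Core(["translate", "type_check", "respond"])
core.link("translate", translate_name)
core.link("type_check", add_content_type)
response = core.process({"path": "/index.html"})
```

Hooks without linked functions (here `respond`) are simply skipped, which is what lets modules be added or removed without changing the core.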

12 – 5 Distributed Web-Based Systems/12.2 Processes

Server Clusters (1/2)

Essence: To improve performance and availability, WWW servers are often clustered in a way that is transparent to clients:

[Figure: a front end connected over a LAN to several Web servers; the front end handles all incoming requests and outgoing responses.]

Problem: The front end may easily get overloaded, so that special measures need to be taken.

Transport-layer switching: Front end simply passes the TCP request to one of the servers, taking some performance metric into account.

Content-aware distribution: Front end reads the content of the HTTP request and then selects the best server.
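The difference between the two approaches can be sketched as two selection functions. The choice of metric (number of active connections) and the hashing of the request path are assumptions made for this illustration, and the function names are hypothetical.

```python
import zlib


def transport_layer_pick(active_connections):
    """Pick a server without looking at the request: least active connections.

    `active_connections` maps a server name to its current connection count;
    ties are broken alphabetically so the choice is deterministic.
    """
    return min(sorted(active_connections), key=lambda name: active_connections[name])


def content_aware_pick(servers, path):
    """Pick a server based on the requested path.

    Hashing the path sends every request for the same document to the same
    server, so that server is likely to have the document in its cache.
    """
    index = zlib.crc32(path.encode("utf-8")) % len(servers)
    return servers[index]


servers = ["web1", "web2", "web3"]
least_loaded = transport_layer_pick({"web1": 3, "web2": 1, "web3": 1})
chosen = content_aware_pick(servers, "/home/steen/mbox")
```

The content-aware variant needs the HTTP request, so the front end can only decide after the TCP connection has been set up, which is what makes it more expensive than transport-layer switching.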

12 – 6 Distributed Web-Based Systems/12.2 Processes

Server Clusters (2/2)

Question: Why can content-aware distribution be so much better?

[Figure: a client, a switch, two distributors, a dispatcher, and two Web servers. The client's setup request reaches the switch, after which: 1. Pass setup request to a distributor; 2. Dispatcher selects server; 3. Hand off TCP connection; 4. Inform switch; 5. Forward other messages; 6. Server responses.]

12 – 7 Distributed Web-Based Systems/12.2 Processes

Communication (1/2)

Essence: Communication in the Web is generally based on HTTP, a relatively simple client-server transfer protocol having the following request messages:

Operation  Description
Head       Request to return the header of a document
Get        Request to return a document to the client
Put        Request to store a document
Post       Provide data that are to be added to a document (collection)
Delete     Request to delete a document
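As a rough illustration of what such a request message looks like on the wire, the sketch below formats an HTTP/1.1 request for one of the five operations listed above. The helper `build_request` is hypothetical; it only builds the bytes and does not open a connection.

```python
OPERATIONS = {"HEAD", "GET", "PUT", "POST", "DELETE"}


def build_request(operation, path, host, body=b""):
    """Format a minimal HTTP/1.1 request message as bytes."""
    method = operation.upper()
    if method not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        # A message that carries data states how long that data is.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


get_message = build_request("Get", "/home/steen/mbox", "www.cs.vu.nl")
put_message = build_request("Put", "/home/steen/mbox", "www.cs.vu.nl", b"hello")
```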

12 – 8 Distributed Web-Based Systems/12.3 Communication

Communication (2/2)

Header               C/S  Contents
Accept               C    The type of documents the client can handle
Accept-Charset       C    The character sets that are acceptable for the client
Accept-Encoding      C    The document encodings the client can handle
Accept-Language      C    The natural language the client can handle
Authorization        C    A list of the client’s credentials
WWW-Authenticate     S    Security challenge the client should respond to
Date                 C+S  Date and time the message was sent
ETag                 S    The tags associated with the returned document
Expires              S    The time for how long the response remains valid
From                 C    The client’s e-mail address
Host                 C    The TCP address of the document’s server
If-Match             C    The tags the document should have
If-None-Match        C    The tags the document should not have
If-Modified-Since    C    Tells the server to return a document only if it has been modified since the specified time
If-Unmodified-Since  C    Tells the server to return a document only if it has not been modified since the specified time
Last-Modified        S    The time the returned document was last modified
Location             S    A document reference to which the client should redirect its request
Referer              C    Refers to the client’s most recently requested document
Upgrade              C+S  The application protocol the sender wants to switch to
Warning              C+S  Information about status of the data in the message
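Several of these headers work together in conditional requests. The sketch below shows, under simplifying assumptions, how a server might evaluate If-None-Match and If-Modified-Since against a document's ETag and Last-Modified values. The function `respond` is hypothetical, handles only a single document, and compares tags by plain string equality.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(request_headers, etag, last_modified, body):
    """Return (status, headers, body) for a Get on one document."""
    validators = {
        "ETag": etag,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    tags = request_headers.get("If-None-Match")
    since = request_headers.get("If-Modified-Since")
    if tags is not None:
        # The client already holds a copy carrying one of these tags.
        if etag in [tag.strip() for tag in tags.split(",")]:
            return 304, validators, b""
    elif since is not None:
        # Only consulted when no tag condition was given.
        if last_modified <= parsedate_to_datetime(since):
            return 304, validators, b""
    return 200, validators, body


modified = datetime(2008, 4, 7, 12, 0, tzinfo=timezone.utc)
status, headers, _ = respond({}, '"v1"', modified, b"<html></html>")
```

A client or proxy that kept the returned ETag or Last-Modified value can send it back later; a 304 status then means the cached copy is still valid and no document body needs to be transferred.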

12 – 9 Distributed Web-Based Systems/12.3 Communication

SOAP

Simple Object Access Protocol: Based on XML, this is the standard protocol for communication between Web services.

• SOAP is bound to an underlying protocol (i.e., it is not independent from its carrier)

• Conversational exchange style: Send a document one way, get a filled-in response back.

• RPC-style exchange: Used to invoke a Web service.

12 – 10 Distributed Web-Based Systems/12.3 Communication

A Note on XML

Observation: XML has the advantage of allowing self-describing documents. Full stop (i.e., it introduces performance problems and is not meant to be read by human beings).

<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <n:alertcontrol xmlns:n="http://example.org/alertcontrol">
      <n:priority>1</n:priority>
      <n:expires>2001-06-22T14:00:00-05:00</n:expires>
    </n:alertcontrol>
  </env:Header>
  <env:Body>
    <m:alert xmlns:m="http://example.org/alert">
      <m:msg>Pick up Mary at school at 2pm</m:msg>
    </m:alert>
  </env:Body>
</env:Envelope>
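To show what "self-describing" means for a receiver, the following sketch parses the envelope above with Python's standard `xml.etree.ElementTree` module and pulls out two fields. The prefix-to-namespace mapping `NS` is written out by hand for this example; a real SOAP stack would do far more (validation, fault handling).

```python
import xml.etree.ElementTree as ET

MESSAGE = """\
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <n:alertcontrol xmlns:n="http://example.org/alertcontrol">
      <n:priority>1</n:priority>
      <n:expires>2001-06-22T14:00:00-05:00</n:expires>
    </n:alertcontrol>
  </env:Header>
  <env:Body>
    <m:alert xmlns:m="http://example.org/alert">
      <m:msg>Pick up Mary at school at 2pm</m:msg>
    </m:alert>
  </env:Body>
</env:Envelope>
"""

# Prefixes used in the search paths below, mapped to their namespace URIs.
NS = {
    "env": "http://www.w3.org/2003/05/soap-envelope",
    "n": "http://example.org/alertcontrol",
    "m": "http://example.org/alert",
}

root = ET.fromstring(MESSAGE)
priority = root.find("env:Header/n:alertcontrol/n:priority", NS).text
msg = root.find("env:Body/m:alert/m:msg", NS).text
```

The amount of markup needed to carry one short sentence also hints at the performance cost mentioned in the observation.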

12 – 11 Distributed Web-Based Systems/12.3 Communication

Naming: URL

URL: Uniform Resource Locator tells how and where to access a resource.

(a) http://www.cs.vu.nl/home/steen/mbox        Scheme, Host name, Pathname
(b) http://www.cs.vu.nl:80/home/steen/mbox     Scheme, Host name, Port, Pathname
(c) http://130.37.24.11:80/home/steen/mbox     Scheme, Host name, Port, Pathname

Examples:

http    HTTP          http://www.cs.vu.nl:80/globe
mailto  Mail          mailto:[email protected]
ftp     FTP           ftp://ftp.cs.vu.nl/pub/minix/README
file    Local file    file:/edu/book/work/chp/11/11
data    Inline data   data:text/plain;charset=iso-8859-7,%e1%e2%e3
telnet  Remote login  telnet://flits.cs.vu.nl
tel     Telephone     tel:+31201234567
modem   Modem         modem:+31201234567;type=v32
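The decomposition into scheme, host name, port, and pathname can be reproduced with Python's standard `urllib.parse` module. The sketch below is only an illustration of the URL structure; it does not contact any server.

```python
from urllib.parse import urlsplit

# Form (b): scheme, host name, port, and pathname.
parts = urlsplit("http://www.cs.vu.nl:80/home/steen/mbox")
scheme, host, port, path = parts.scheme, parts.hostname, parts.port, parts.path

# Form (a): no explicit port, so `port` is None and the scheme's default applies.
without_port = urlsplit("http://www.cs.vu.nl/home/steen/mbox")

# Form (c): the host is given as an IP address instead of a DNS name.
with_address = urlsplit("http://130.37.24.11:80/home/steen/mbox")
```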

12 – 12 Distributed Web-Based Systems/12.4

Synchronization: WebDAV

Problem: There is a growing need for collaborative authoring of Web documents, but bare-bones HTTP can’t help here. Solution: Web Distributed Authoring and Versioning.

• Supports exclusive and shared write locks, which operate on entire documents

• A lock is passed by means of a lock token; the server registers the client(s) holding the lock

• Clients modify the document locally and post it back to the server along with the lock token

Note: There is no specific support for crashed clients holding a lock.
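The locking scheme can be sketched with an in-memory server. This is not the WebDAV protocol itself: the class `DocumentServer`, its methods, and the use of random hexadecimal strings as lock tokens are assumptions made for this illustration.

```python
import uuid


class LockError(Exception):
    """Raised when a lock cannot be granted or a lock token is not valid."""


class DocumentServer:
    def __init__(self):
        self.documents = {}
        self.locks = {}  # document name -> (mode, {lock token: client})

    def lock(self, name, client, mode="exclusive"):
        """Grant a write lock on a whole document and return its lock token."""
        held = self.locks.get(name)
        if held is None:
            holders = {}
            self.locks[name] = (mode, holders)
        else:
            held_mode, holders = held
            # Only shared locks can coexist with other shared locks.
            if mode == "exclusive" or held_mode == "exclusive":
                raise LockError(f"{name} is already locked")
        token = uuid.uuid4().hex
        holders[token] = client  # the server registers who holds the lock
        return token

    def put(self, name, content, token):
        """Store a modified document; the client must present its lock token."""
        held = self.locks.get(name)
        if held is None or token not in held[1]:
            raise LockError("missing or invalid lock token")
        self.documents[name] = content

    def unlock(self, name, token):
        mode, holders = self.locks[name]
        del holders[token]
        if not holders:
            del self.locks[name]


server = DocumentServer()
token = server.lock("report.html", "alice")
server.put("report.html", "<p>draft 2</p>", token)
```

Nothing in this sketch ever expires a lock, so a client that crashes before calling `unlock` keeps the document locked, which is the situation the note above points at.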

12 – 13 Distributed Web-Based Systems/12.5 Synchronization

Web Proxy Caching

Basic idea: Sites install a separate proxy server that handles all outgoing requests. Proxies subsequently cache incoming documents. Cache-consistency protocols:

• Always verify validity by contacting server

• Age-based consistency:

$T_{expire} = \alpha \cdot (T_{cached} - T_{last\_modified}) + T_{cached}$

• Cooperative caching, by which you first check your neighbors on a cache miss:

[Figure: three Web proxies, each with its own cache and its own group of clients, and a Web server. On an HTTP Get request from a client: 1. Look in local cache; 2. Ask neighboring proxy caches; 3. Forward request to Web server.]
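Both ideas can be sketched briefly: the age-based expiration formula as a function, and cooperative caching as a lookup that tries the local cache, then the neighbors, then the server. The class `Proxy`, the dictionary standing in for the Web server, and the example values are hypothetical.

```python
def expiry_time(t_cached, t_last_modified, alpha):
    """T_expire = alpha * (T_cached - T_last_modified) + T_cached."""
    return alpha * (t_cached - t_last_modified) + t_cached


class Proxy:
    def __init__(self, origin):
        self.cache = {}
        self.neighbors = []
        self.origin = origin  # a dict standing in for the Web server

    def get(self, url):
        """Return (document, where it was found)."""
        if url in self.cache:  # 1. look in local cache
            return self.cache[url], "local"
        for neighbor in self.neighbors:  # 2. ask neighboring proxy caches
            if url in neighbor.cache:
                self.cache[url] = neighbor.cache[url]
                return self.cache[url], "neighbor"
        self.cache[url] = self.origin[url]  # 3. forward request to Web server
        return self.cache[url], "origin"


origin = {"/index.html": "<html>home</html>"}
proxy_a, proxy_b = Proxy(origin), Proxy(origin)
proxy_a.neighbors.append(proxy_b)
proxy_b.neighbors.append(proxy_a)

first = proxy_a.get("/index.html")   # miss everywhere, fetched from the server
second = proxy_b.get("/index.html")  # found in neighbor proxy_a's cache
third = proxy_b.get("/index.html")   # now in proxy_b's own cache
```

With the formula, a document that had gone unmodified for a long time when it was cached is assumed to stay valid for longer: a document cached at time 1000 and last modified at time 500 expires at time 1250 for alpha = 0.5.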

12 – 14 Distributed Web-Based Systems/12.6 Consistency and Replication

Replication in Web Hosting Systems

Observation: By and large, Web hosting systems are adopting replication to increase performance. Much research is done to improve their organization, which follows the lines of self-managing systems:

[Figure: feedback control loop for a Web hosting system. Inputs: reference input, initial configuration, and uncontrollable parameters (disturbance / noise). The observed output goes through metric estimation; the measured output is compared (+/-) with the reference input in the analysis, which produces adjustment triggers. The resulting corrections are applied through replica placement, consistency enforcement, and request routing.]

12 – 15 Distributed Web-Based Systems/12.6 Consistency and Replication

Handling Flash Crowds

Observation: We need dynamic adjustment to balance resource usage. Flash crowds introduce a serious problem:

[Figure: four plots, (a) to (d), covering time spans of 2 days, 2 days, 6 days, and 2.5 days respectively.]

12 – 16 Distributed Web-Based Systems/12.6 Consistency and Replication

Server Replication

Content Delivery Network: CDNs act as Web hosting services to replicate documents across the Internet, providing their customers guarantees on high availability and performance (example: Akamai).

[Figure: a client, an origin server, a CDN server with a cache, a CDN DNS server, and the regular DNS system. 1. Get base document; 2. Document with refs to embedded documents; 3. DNS lookups; 4. Return IP address client-best server; 5. Get embedded documents; 6. Get embedded documents (if not already cached); 7. Embedded documents.]

Question: How would consistency be maintained in this system?

12 – 17 Distributed Web-Based Systems/12.6 Consistency and Replication

Replication of Web Apps. (1/3)

Observation: Replication becomes more difficult when dealing with databases and such. No single best solution.

[Figure: a client sends queries to a server on the edge-server side and receives responses. The edge server may hold a content-blind cache, a content-aware cache, or a database copy, together with a schema. The origin-server side has a server with the authoritative database and its schema. Between the two sides: full/partial data replication, and full schema replication / query templates.]

Assumption: Updates are carried out at the origin server, and propagated to edge servers.

12 – 18 Distributed Web-Based Systems/12.6 Consistency and Replication

Replication of Web Apps. (2/3)

[Figure: repeated from Replication of Web Apps. (1/3).]

• Full replication: high read/write ratio, often in combination with complex queries. Note: replication may slow down performance when the R/W ratio goes down.

• Partial replication: high read/write ratio, but in combination with simple queries

12 – 19 Distributed Web-Based Systems/12.6 Consistency and Replication

Replication of Web Apps. (3/3)

[Figure: repeated from Replication of Web Apps. (1/3).]

• Content-aware caching: Check for queries at the local database, and subscribe for invalidations at the server. Works well with range queries and complex queries.

• Content-blind caching: Simply cache the result of previous queries. Works great with simple queries that address unique results (e.g., no range queries).
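Content-blind caching is the simpler of the two and can be sketched directly. The example assumes an in-memory SQLite database in place of the origin server's authoritative database; the class `ContentBlindCache` and the table `staff` are hypothetical.

```python
import sqlite3


class ContentBlindCache:
    """Edge-side cache that stores query results without interpreting them."""

    def __init__(self, origin):
        self.origin = origin
        self.results = {}
        self.hits = 0

    def query(self, sql, params=()):
        # The key is the literal query text plus its parameters.
        key = (sql, tuple(params))
        if key in self.results:
            self.hits += 1
        else:
            self.results[key] = self.origin.execute(sql, params).fetchall()
        return self.results[key]

    def invalidate(self):
        """Drop everything, for instance after an update at the origin."""
        self.results.clear()


origin = sqlite3.connect(":memory:")
origin.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT)")
origin.executemany("INSERT INTO staff VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

cache = ContentBlindCache(origin)
by_id = "SELECT name FROM staff WHERE id = ?"
first = cache.query(by_id, (1,))
second = cache.query(by_id, (1,))  # identical query: served from the cache
in_range = cache.query("SELECT name FROM staff WHERE id BETWEEN ? AND ?", (1, 1))
```

The range query returns the same row as the earlier lookup, yet it misses the cache because its text differs. A content-aware cache would be able to answer it from local data, which is why it suits range queries better.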

12 – 20 Distributed Web-Based Systems/12.6 Consistency and Replication

Security: TLS (SSL)

Transport Layer Security: Modern version of the Secure Sockets Layer (SSL), which “sits” between the transport layer and application protocols. A relatively simple protocol that can support mutual authentication using certificates:

[Figure: message exchange between client and server, numbered 1 to 5: Possibilities; Choices; $[K^+_S]_{CA}$; $[K^+_C]_{CA}$; $K^+_S([R]_C)$.]
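As a configuration sketch, Python's standard `ssl` module can express the requirement that both sides present certificates. The helper names are hypothetical, and the certificate file names in the comments are placeholders, so those calls are left commented out; no connection is made.

```python
import ssl


def make_server_context():
    """Server side: require the client to present a certificate as well."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.verify_mode = ssl.CERT_REQUIRED  # this is what makes authentication mutual
    # context.load_cert_chain("server.pem")      # placeholder: server certificate and key
    # context.load_verify_locations("ca.pem")    # placeholder: CA that signed client certificates
    return context


def make_client_context():
    """Client side: verify the server's certificate and host name."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # PROTOCOL_TLS_CLIENT enables certificate and host name checking by default.
    # context.load_cert_chain("client.pem")      # placeholder: client certificate and key
    # context.load_verify_locations("ca.pem")    # placeholder: CA that signed the server certificate
    return context


server_context = make_server_context()
client_context = make_client_context()
```

The negotiation of algorithms and the exchange of certificates are carried out by the library during the handshake; the contexts only state what each side demands of the other.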

12 – 21 Distributed Web-Based Systems/12.6 Consistency and Replication