Distributed Systems Principles and Paradigms
Chapter 12 (version April 7, 2008)
Maarten van Steen
Vrije Universiteit Amsterdam, Faculty of Science
Dept. Mathematics and Computer Science
Room R4.20. Tel: (020) 598 7784
E-mail: [email protected], URL: www.cs.vu.nl/~steen/

01 Introduction
02 Architectures
03 Processes
04 Communication
05 Naming
06 Synchronization
07 Consistency and Replication
08 Fault Tolerance
09 Security
10 Distributed Object-Based Systems
11 Distributed File Systems
12 Distributed Web-Based Systems
13 Distributed Coordination-Based Systems
Observation: At a certain point, people started recognizing that it was more than just user ↔ site interaction: sites could offer services to other sites ⇒ standardization is then badly needed.
[Figure: a client machine (client application, stub, communication subsystem) and a server application (stub, communication subsystem) communicating through SOAP, with a service description (WSDL) on each side.]
Observation: More than 70% of all Web sites are based on Apache. The server is internally organized more or less according to the steps needed to process an HTTP request:
Essence: Communication in the Web is generally based on HTTP, a relatively simple client-server transfer protocol having the following request messages:
Operation   Description
Head        Request to return the header of a document
Get         Request to return a document to the client
Put         Request to store a document
Post        Provide data that are to be added to a document (collection)
Delete      Request to delete a document
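To give a feel for how simple these request messages are, the sketch below formats them as plain text. The helper name `build_request`, the host `www.example.org`, and the paths are assumptions made for illustration; a real client would normally rely on an HTTP library instead.

```python
def build_request(method, path, host, body=None, headers=None):
    """Format an HTTP/1.1 request message as bytes (illustrative helper)."""
    allowed = {"HEAD", "GET", "PUT", "POST", "DELETE"}
    if method not in allowed:
        raise ValueError(f"unsupported operation: {method}")
    # Request line, followed by the mandatory Host header.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    payload = b"" if body is None else body.encode("utf-8")
    if payload:
        # Put and Post carry a document, so announce its length.
        lines.append(f"Content-Length: {len(payload)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + payload


# Hypothetical usage: fetch a document, then store one.
print(build_request("GET", "/index.html", "www.example.org").decode())
print(build_request("PUT", "/doc.txt", "www.example.org", body="hello").decode())
```

Head and Delete messages have the same shape as Get; only the operation name in the request line differs.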
12 – 8 Distributed Web-Based Systems/12.3 Communication
Communication (2/2)
Header               C/S   Contents
Accept               C     The type of documents the client can handle
Accept-Charset       C     The character sets that are acceptable for the client
Accept-Encoding      C     The document encodings the client can handle
Accept-Language      C     The natural language the client can handle
Authorization        C     A list of the client’s credentials
WWW-Authenticate     S     Security challenge the client should respond to
Date                 C+S   Date and time the message was sent
ETag                 S     The tags associated with the returned document
Expires              S     The time for how long the response remains valid
From                 C     The client’s e-mail address
Host                 C     The TCP address of the document’s server
If-Match             C     The tags the document should have
If-None-Match        C     The tags the document should not have
If-Modified-Since    C     Tells the server to return a document only if it has been modified since the specified time
If-Unmodified-Since  C     Tells the server to return a document only if it has not been modified since the specified time
Last-Modified        S     The time the returned document was last modified
Location             S     A document reference to which the client should redirect its request
Referer              C     Refers to the client’s most recently requested document
Upgrade              C+S   The application protocol the sender wants to switch to
Warning              C+S   Information about the status of the data in the message
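Several of these headers work together to support conditional requests. The sketch below shows, under simplifying assumptions, how a server might use If-None-Match and If-Modified-Since against a document's ETag and Last-Modified values to answer either 200 (send the document) or 304 (not modified). The function name `respond` and the sample tag values are hypothetical, and weak validators are ignored.

```python
from email.utils import formatdate, parsedate_to_datetime


def respond(request_headers, etag, last_modified_ts):
    """Return (status, response_headers) for a Get on one document."""
    response = {
        "ETag": etag,
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
    }
    # If-None-Match: the tags the document should NOT have.
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        tags = [t.strip() for t in inm.split(",")]
        if etag in tags or "*" in tags:
            return 304, response  # client already holds this version
        return 200, response
    # If-Modified-Since: return the document only if it changed afterwards.
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        since = parsedate_to_datetime(ims).timestamp()
        if int(last_modified_ts) <= since:
            return 304, response
    return 200, response


# Hypothetical usage: the first request is unconditional, the second revalidates.
status, headers = respond({}, '"v1"', 1_200_000_000)
print(status, headers)
print(respond({"If-None-Match": headers["ETag"]}, '"v1"', 1_200_000_000)[0])
```

A cache that revalidates this way transfers the document body again only when the tag or modification time no longer matches.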
SOAP
Simple Object Access Protocol: Based on XML, this is the standard protocol for communication between Web services.
• SOAP is bound to an underlying protocol (i.e., it is not independent from its carrier)
• Conversational exchange style: Send a document one way, get a filled-in response back.
• RPC-style exchange: Used to invoke a Web service.
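As a minimal sketch of the RPC-style exchange, the code below wraps an operation name and its parameters in a SOAP envelope and unpacks it again on the receiving side. The service namespace `urn:example:stock`, the operation `GetQuote`, and both helper functions are assumptions for illustration; the envelope namespace is the one defined for SOAP 1.2. In practice the resulting XML would travel in the body of a carrier message such as an HTTP Post.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace


def make_rpc_request(service_ns, operation, params):
    """Build an Envelope/Body holding one operation element with its parameters."""
    ET.register_namespace("env", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_rpc_request(xml_text):
    """Server-side stub: recover the operation name and parameters."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    operation = call.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


# Hypothetical usage: the client stub marshals, the server stub unmarshals.
message = make_rpc_request("urn:example:stock", "GetQuote", {"symbol": "VU"})
print(message)
print(read_rpc_request(message))
```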
A Note on XML
Observation: XML has the advantage of allowing self-describing documents. Full stop (i.e., it introduces performance problems and is not meant to be read by human beings).
data Inline data data:text/plain;charset=iso-8859-7,%e1%e2%e3
telnet Remote login telnet://flits.cs.vu.nl
tel Telephone tel:+31201234567
modem Modem modem:+31201234567;type=v32
12 – 12 Distributed Web-Based Systems/12.4
Synchronization: WebDAV
Problem: There is a growing need for collaborative authoring of Web documents, but bare-bones HTTP can’t help here. Solution: Web Distributed Authoring and Versioning.
• Supports exclusive and shared write locks, which operate on entire documents
• A lock is passed by means of a lock token; the server registers the client(s) holding the lock
• Clients modify the document locally and post it back to the server along with the lock token
Note: There is no specific support for crashed clients holding a lock.
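The locking scheme above can be sketched as a small server-side registry. This is a toy model under stated assumptions, not the WebDAV protocol itself: the class `LockManager`, its method names, and the token format are hypothetical. Like the scheme described in the note, it has no timeouts, so a client that crashes keeps its lock registered.

```python
import uuid


class LockManager:
    """Toy registry of write locks that apply to entire documents."""

    def __init__(self):
        # path -> {"mode": "exclusive" | "shared", "holders": {token: client}}
        self._locks = {}

    def lock(self, path, client, mode="exclusive"):
        """Return a lock token, or None if the request conflicts."""
        entry = self._locks.get(path)
        if entry is not None and (mode == "exclusive" or entry["mode"] == "exclusive"):
            return None
        if entry is None:
            entry = self._locks[path] = {"mode": mode, "holders": {}}
        token = f"urn:uuid:{uuid.uuid4()}"
        entry["holders"][token] = client  # the server registers the holder
        return token

    def may_write(self, path, token):
        """A posted document is accepted only with a registered token."""
        entry = self._locks.get(path)
        return entry is not None and token in entry["holders"]

    def unlock(self, path, token):
        entry = self._locks.get(path)
        if entry is None or token not in entry["holders"]:
            return False
        del entry["holders"][token]
        if not entry["holders"]:
            del self._locks[path]
        return True


# Hypothetical usage: alice holds an exclusive lock, so bob is refused.
manager = LockManager()
token = manager.lock("/report.html", "alice")
print(manager.lock("/report.html", "bob"))      # None
print(manager.may_write("/report.html", token))  # True
```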
Basic idea: Sites install a separate proxy server that handles all outgoing requests. Proxies subsequently cache incoming documents. Cache-consistency protocols:
• Always verify validity by contacting server
• Age-based consistency:
• Cooperative caching, by which you first check your neighbors on a cache miss:
[Figure: cooperative caching among three Web proxies, each with its own cache and its own clients. On an HTTP Get request from a client, a proxy (1) looks in its local cache, (2) asks neighboring proxy caches, and (3) forwards the request to the Web server.]
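The three lookup steps of cooperative caching can be sketched as follows. Everything here is an illustrative assumption: the `Proxy` class, the dictionary standing in for the Web server, and in particular `expiry_time`, which uses one commonly described age-based heuristic (a document that has been unchanged for a long time is assumed to stay valid for a while longer) with an arbitrarily chosen weight `alpha`.

```python
def expiry_time(t_cached, t_last_modified, alpha=0.2):
    """Age-based heuristic (assumed here): T_expire = T_cached + alpha * age."""
    return t_cached + alpha * (t_cached - t_last_modified)


class Proxy:
    def __init__(self, name, origin):
        self.name = name
        self.origin = origin   # url -> (document, last_modified); stands in for the Web server
        self.cache = {}        # url -> (document, expires_at)
        self.neighbors = []

    def peek(self, url, now):
        """Return a fresh cache entry, or None on a miss or an expired entry."""
        entry = self.cache.get(url)
        if entry is not None and now < entry[1]:
            return entry
        return None

    def get(self, url, now):
        entry = self.peek(url, now)                  # 1. look in local cache
        if entry is not None:
            return entry[0], "local"
        for neighbor in self.neighbors:              # 2. ask neighboring proxy caches
            entry = neighbor.peek(url, now)
            if entry is not None:
                self.cache[url] = entry
                return entry[0], f"neighbor:{neighbor.name}"
        document, last_modified = self.origin[url]   # 3. forward request to Web server
        self.cache[url] = (document, expiry_time(now, last_modified))
        return document, "origin"


# Hypothetical usage with two cooperating proxies.
origin = {"/a": ("A", 0.0)}
p1, p2 = Proxy("p1", origin), Proxy("p2", origin)
p1.neighbors, p2.neighbors = [p2], [p1]
print(p1.get("/a", now=100.0))
print(p2.get("/a", now=115.0))
```

Always verifying validity with the server would correspond to skipping the freshness test in `peek` and contacting the origin on every request.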
12 – 14 Distributed Web-Based Systems/12.6 Consistency and Replication
Replication in Web Hosting Systems
Observation: By and large, Web hosting systems are adopting replication to increase performance. Much research is done to improve their organization, which follows the lines of self-managing systems:
[Figure: feedback control loop for a Web hosting system. Labels: reference input, initial configuration, uncontrollable parameters (disturbance / noise), observed output, metric estimation, measured output, analysis, adjustment triggers, and corrections (+/-) applied through replica placement, consistency enforcement, and request routing.]
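A feedback loop of this kind can be sketched in a few lines. The model below is deliberately crude and entirely hypothetical: latency is assumed to be proportional to the load per replica, the reference input is a target latency, and the only correction is adding or removing one replica (a stand-in for replica placement).

```python
def control_step(measured_ms, reference_ms, replicas, tolerance=0.1, max_replicas=10):
    """Compare the measured output with the reference input and correct the replica count."""
    error = measured_ms - reference_ms
    if error > tolerance * reference_ms and replicas < max_replicas:
        return replicas + 1   # too slow: place one more replica
    if error < -tolerance * reference_ms and replicas > 1:
        return replicas - 1   # over-provisioned: remove one replica
    return replicas


def simulate(load, reference_ms=100.0, replicas=1):
    """Run the loop over a load trace; returns the replica count after each step."""
    history = []
    for requests in load:
        latency = requests / replicas            # metric estimation (assumed model)
        replicas = control_step(latency, reference_ms, replicas)
        history.append(replicas)
    return history


# Hypothetical load trace: a burst followed by a return to normal.
print(simulate([100, 300, 300, 300, 100, 100]))
```

The load trace plays the role of the uncontrollable parameters: the loop cannot influence it and can only react to its effect on the measured output.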
Handling Flash Crowds
Observation: We need dynamic adjustment to balance resource usage. Flash crowds introduce a serious problem:
[Figure: four plots, panels (a) to (d), covering time spans of 2 days, 2 days, 6 days, and 2.5 days.]
Server Replication
Content Delivery Network: CDNs act as Web hosting services that replicate documents across the Internet, providing their customers with guarantees on high availability and performance (example: Akamai).
[Figure: a client, the origin server, a CDN server with a cache, the CDN DNS server, and the regular DNS system. Steps:
1. Get base document
2. Document with refs to embedded documents
3. DNS lookups
4. Return IP address of client-best server
5. Get embedded documents
6. Get embedded documents (if not already cached)
7. Embedded documents]
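The redirection steps in the figure can be sketched roughly as below. All names are assumptions for illustration: the CDN domain `cdn.example.net`, the way embedded references are rewritten, the latency table used to pick the "client-best" server, and the `CDNServer` class. The sketch makes no claim about how any particular CDN chooses servers.

```python
CDN_SUFFIX = "cdn.example.net"  # hypothetical CDN domain


def rewrite_embedded(refs, origin_host):
    """Step 2: refer to embedded documents by names that the CDN's DNS server resolves."""
    return [f"http://{origin_host}.{CDN_SUFFIX}{path}" for path in refs]


def cdn_dns_lookup(client_region, servers):
    """Steps 3-4: return the IP address of the client-best server (lowest estimated latency here)."""
    return min(servers, key=lambda s: s["latency_ms"][client_region])["ip"]


class CDNServer:
    def __init__(self, origin):
        self.origin = origin   # path -> document; stands in for the origin server
        self.cache = {}

    def get(self, path):
        """Steps 5-7: serve from the cache, fetching from the origin on a miss."""
        if path not in self.cache:
            self.cache[path] = self.origin[path]   # 6. get embedded document
        return self.cache[path]                    # 7. embedded document


# Hypothetical usage with two CDN servers and a client in region "eu".
servers = [
    {"ip": "192.0.2.10", "latency_ms": {"eu": 20, "us": 90}},
    {"ip": "198.51.100.7", "latency_ms": {"eu": 95, "us": 15}},
]
print(rewrite_embedded(["/img/logo.png"], "www.example.org"))
print(cdn_dns_lookup("eu", servers))
```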
Question: How would consistency be maintained in this system?
Replication of Web Apps. (1/3)
Observation: Replication becomes more difficult when dealing with databases and such. No single best solution.
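[Figure: replication alternatives for Web applications. Edge-server side: a server and schema with a content-blind cache, a content-aware cache, or a database copy, serving the client. Origin-server side: a server and schema with the authoritative database. Arrows between the two sides: query, response, full/partial data replication, and full schema replication / query templates.]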
Assumption: Updates are carried out at the origin server and propagated to edge servers.
Replication of Web Apps. (2/3)
• Full replication: high read/write ratio, often in combination with complex queries. Note: replication may degrade performance when the R/W ratio goes down.
• Partial replication: high read/write ratio, but in combination with simple queries
Replication of Web Apps. (3/3)
• Content-aware caching: Check for queries at the local database, and subscribe for invalidations at the server. Works well with range queries and complex queries.
• Content-blind caching: Simply cache the result of previous queries. Works well with simple queries that address unique results (e.g., no range queries).
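The difference between the two caching styles can be sketched as follows. Both classes are simplified illustrations under stated assumptions: the content-blind cache keys results on the literal query text, while the content-aware cache understands key ranges, answers a query locally when it is contained in a range fetched earlier, and drops affected ranges when the origin server reports an update (standing in for the invalidation subscription). Class and method names are hypothetical.

```python
class ContentBlindCache:
    """Caches results keyed by the exact query text; knows nothing about the data."""

    def __init__(self, origin_query):
        self.origin_query = origin_query   # callable: query text -> result
        self.results = {}
        self.misses = 0

    def query(self, text):
        if text not in self.results:
            self.misses += 1
            self.results[text] = self.origin_query(text)
        return self.results[text]


class ContentAwareCache:
    """Keeps rows per fetched key range; answers contained range queries locally."""

    def __init__(self, origin_range):
        self.origin_range = origin_range   # callable: (lo, hi) -> {key: row}
        self.cached = {}                   # (lo, hi) -> {key: row}
        self.misses = 0

    def query_range(self, lo, hi):
        for (a, b), rows in self.cached.items():
            if a <= lo and hi <= b:        # query is contained in a cached range
                return {k: v for k, v in rows.items() if lo <= k <= hi}
        self.misses += 1
        rows = dict(self.origin_range(lo, hi))
        self.cached[(lo, hi)] = rows
        return dict(rows)

    def invalidate(self, key):
        """Invoked on an invalidation from the origin: drop every range covering the key."""
        self.cached = {r: rows for r, rows in self.cached.items()
                       if not (r[0] <= key <= r[1])}


# Hypothetical usage against an in-memory table acting as the authoritative database.
table = {k: f"row{k}" for k in range(1, 21)}
aware = ContentAwareCache(lambda lo, hi: {k: v for k, v in table.items() if lo <= k <= hi})
aware.query_range(1, 10)
print(aware.query_range(3, 5), aware.misses)   # answered locally, still one miss
```

A content-blind cache would treat the ranges 1 to 10 and 3 to 5 as unrelated queries and contact the origin server for both, which is why it suits simple queries with unique results better than range queries.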
Security: TLS (SSL)
Transport Layer Security: Modern version of the Secure Sockets Layer (SSL), which “sits” between the transport layer and application protocols. Relatively simple protocol that can support mutual authentication using certificates:
[Figure: TLS handshake with mutual authentication between client and server, in five steps. Message labels: Possibilities; Choices; [K_S^+]_CA; [K_C^+]_CA; K_S^+([R]_C).]
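As a minimal sketch of how mutual authentication might be configured in practice, the code below uses Python's standard `ssl` module to build a server context that demands a client certificate and a client context that verifies the server. The function names and the certificate, key, and CA file parameters are assumptions; no files are loaded unless paths are supplied, and the handshake itself is left to the library.

```python
import ssl


def make_server_context(certfile=None, keyfile=None, client_ca=None):
    """Server side: present our certificate and require one from the client."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED          # refuse clients without a valid certificate
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)   # the server's certificate and private key
    if client_ca:
        ctx.load_verify_locations(cafile=client_ca)  # CA that signed the client certificates
    return ctx


def make_client_context(certfile=None, keyfile=None, server_ca=None):
    """Client side: verify the server and optionally present our own certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=server_ca)
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)   # the client's certificate and private key
    return ctx


# Hypothetical usage without certificate files, only to inspect the settings.
server_ctx = make_server_context()
client_ctx = make_client_context()
print(server_ctx.verify_mode, client_ctx.verify_mode, client_ctx.check_hostname)
```

Setting `verify_mode` to `CERT_REQUIRED` on the server is what makes the authentication mutual; with the default server setting only the server would authenticate itself.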