EEC-681/781 Distributed Computing Systems Lecture 3 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University [email protected].


Fall Semester 2006

Outline

• Mock quiz

• Techniques for scaling

• Middleware

• Distributed system models
  – Fundamental model


Techniques for Scaling

• Hiding communication latencies

• Distribution

• Replication


Hiding Communication Latencies

• Applicable to geographical scalability

• Technique #1: Avoid waiting for responses to remote service requests
  – Use asynchronous communication style

• Technique #2: Reduce the overall communication by moving part of the computation from server to client
  – Code/process migration
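
Technique #1 can be illustrated with a minimal sketch. The remote service is simulated with a sleep, and the names `remote_request`, `sequential`, `overlapped`, and `timed` are hypothetical helpers, not part of any middleware. The point is only that issuing all requests before waiting lets their latencies overlap.

```python
import asyncio
import time

async def remote_request(name, latency):
    """Stand-in for a remote service call; the network delay is simulated."""
    await asyncio.sleep(latency)
    return f"reply from {name}"

async def sequential(latency):
    # Synchronous style: wait for each reply before sending the next request.
    return [await remote_request(f"server-{i}", latency) for i in range(3)]

async def overlapped(latency):
    # Asynchronous style: send all requests first, then collect the replies.
    calls = (remote_request(f"server-{i}", latency) for i in range(3))
    return await asyncio.gather(*calls)

def timed(coro_fn, latency=0.05):
    """Run one of the two styles and return (replies, elapsed seconds)."""
    start = time.perf_counter()
    replies = asyncio.run(coro_fn(latency))
    return replies, time.perf_counter() - start
```

Calling `timed(sequential)` and `timed(overlapped)` should return the same replies, with the overlapped version taking roughly one latency instead of three.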


Scaling by Distribution

• Distribution: Partition data and computations across multiple machines

• Example: Domain Name System (DNS)
  – The DNS name space is hierarchically organized into a tree of domains, which are divided into nonoverlapping zones
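
The zone idea can be sketched as a lookup that starts at the root zone and follows delegations downward. The `ZONES` table, the names, and the address are made up for illustration, and real DNS resolution involves much more (caching, record types, multiple name servers per zone).

```python
# Hypothetical zone data: each zone stores its own records and the child
# zones it delegates to. No zone stores another zone's records.
ZONES = {
    ".": {"delegations": ["edu."], "records": {}},
    "edu.": {"delegations": ["example.edu."], "records": {}},
    "example.edu.": {
        "delegations": [],
        "records": {"www.example.edu.": "192.0.2.10"},
    },
}

def resolve(name):
    """Return (address or None, list of zones visited), starting at the root."""
    zone = "."
    path = [zone]
    while True:
        data = ZONES[zone]
        if name in data["records"]:
            return data["records"][name], path
        # Follow the delegation whose domain is a suffix of the queried name.
        child = next(
            (z for z in data["delegations"]
             if name == z or name.endswith("." + z)),
            None,
        )
        if child is None:
            return None, path
        zone = child
        path.append(zone)
```

Each zone answers only for its own part of the tree, so no single machine has to hold the whole name space.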


Decentralized Naming Service


Scaling by Replication

• Replication: Make copies of data available at different machines across the distributed system

• Examples:
  – Replicated file servers
  – Replicated databases
  – Mirrored Web sites
  – Large-scale distributed shared memory systems


Scaling by Replication

• Replication benefits
  – Increasing availability
  – Load balancing
  – Increasing geographical scalability by placing copies near different users


Problem with Scaling by Replication

• Having multiple copies might lead to inconsistencies:
  – Modifying one copy makes that copy different from the rest
  – Always keeping copies consistent, and in a general way, requires global synchronization on each modification
  – Global synchronization precludes large-scale solutions

• Global synchronization is needed to minimize inconsistencies
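
A minimal sketch of the trade-off, using in-memory dictionaries as stand-ins for replicas. `Replica`, `write_local`, `write_all`, and `consistent` are hypothetical names. Updating one copy makes the copies diverge, while updating all of them on every write keeps them equal at the cost of contacting every replica.

```python
class Replica:
    """One copy of the data, held in memory for illustration."""
    def __init__(self):
        self.store = {}

def write_local(replica, key, value):
    # Modify a single copy only; the remaining copies become stale.
    replica.store[key] = value

def write_all(replicas, key, value):
    # Global synchronization: every copy is updated on each modification.
    # The return value is the number of replicas that had to be contacted.
    for r in replicas:
        r.store[key] = value
    return len(replicas)

def consistent(replicas, key):
    """True if all replicas hold the same value for key."""
    return len({r.store.get(key) for r in replicas}) == 1
```

The cost returned by `write_all` grows with the number of replicas, which is why synchronizing on every modification does not scale.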


Scaling by Caching

• Caching: A special form of replication. It allows client processes to access local copies
  – Web caches (browser/Web proxy)
  – File caching (at server and client)

• Similarity to replication: making a copy of a resource, generally in the proximity of the client accessing that resource

• Difference from replication: caching is a decision made by the client of a resource, not by the owner of the resource
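
A minimal sketch of client-side caching, assuming the remote owner is reached through a plain function passed in as `fetch`. `CachingClient` and its methods are hypothetical names. The client alone decides to keep a local copy, and the owner is not involved in that decision.

```python
class CachingClient:
    """Client that keeps local copies of resources it has already fetched."""

    def __init__(self, fetch):
        self._fetch = fetch       # function that contacts the resource owner
        self._cache = {}          # local copies, keyed by resource name
        self.remote_calls = 0     # how often the owner was actually contacted

    def get(self, key):
        if key not in self._cache:
            self.remote_calls += 1
            self._cache[key] = self._fetch(key)
        return self._cache[key]

    def invalidate(self, key):
        # Drop the local copy so the next get() goes back to the owner.
        self._cache.pop(key, None)
```

Repeated `get` calls for the same key are served locally. The sketch ignores staleness, which raises the same consistency problem as replication.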


Middleware

• Middleware:
  – Position: in between a distributed application and the operating system
  – Functionality: it provides common services for distributed computing
  – Implementation: it incorporates many distributed systems design principles


Why Middleware

• Better productivity
  – Quicker implementation of business logic because you don’t have to worry about many low-level distributed computing issues

• Better performance
  – Middleware is written by experts

• Better interoperability, extensibility, etc.

• Better security


Middleware Services

• Many middleware systems offer a more-or-less complete collection of services and discourage using anything else but their interfaces to those services


• Communication services: Abandon primitive socket-based message passing in favor of:
  – Procedure calls across networks
  – Remote-object method invocation
  – Message-queuing systems
  – Advanced communication streams
  – Event notification service


• Information system services: Help manage data in a distributed system:
  – Large-scale, system-wide naming services
  – Advanced directory services (search engines)
  – Location services for tracking mobile objects
  – Persistent storage facilities
  – Data caching and replication


• Control services: Giving applications control over when, where, and how they access data:
  – Distributed transaction processing
  – Code migration

• Security services: Secure processing and communication:
  – Authentication and authorization services
  – Simple encryption services
  – Auditing service


Distributed System Models

• Fundamental models
  – Concerned with a more formal description of the properties that are common to all of the architectural models

• Architectural models
  – Concerned with the placement of a system's parts and the relationships between them


Fundamental Models

• Interaction models
  – Synchronous distributed systems
  – Asynchronous distributed systems

• Failure models
  – Timing faults
  – Process faults

• Security models
  – Protecting processes
  – Protecting communication channels
  – Protecting objects against unauthorized access


Common Properties on Interactions

• Communications over computer networks
  – Latency, throughput, jitter

• Synchronization of distributed processes
  – No global time; each machine has its own “view” of time
  – As a result, one cannot fully control process execution, message delivery, or clock drift


Synchronous Distributed Systems

• Definition
  – Time to execute each step of a process has known lower and upper bounds
  – Each message transmitted over a channel is received within a known bounded time
  – Each process has a local clock whose drift rate from real time has a known bound
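
One consequence of these bounds, sketched under stated assumptions: if the maximum one-way message delay, the maximum processing time, and the maximum clock drift rate are all known, a process can compute a timeout after which a missing reply can only mean a failure. The function `safe_timeout` and its parameters are hypothetical, and the model (one request, one reply, drift rate as a fraction of real time) is a simplification.

```python
def safe_timeout(max_delay, max_processing, drift_rate):
    """Local-clock timeout after which a silent peer can be declared failed.

    max_delay      -- known upper bound on one-way message delay
    max_processing -- known upper bound on the peer's processing time
    drift_rate     -- known bound on local clock drift (e.g. 1e-4)
    """
    # Request travels there, is processed, and the reply travels back.
    real_bound = 2 * max_delay + max_processing
    # The local clock may run fast by up to drift_rate, so wait a bit longer
    # on the local clock to be sure real_bound has elapsed in real time.
    return real_bound * (1 + drift_rate)
```

For example, with a 10 ms delay bound, a 5 ms processing bound, and no drift, the timeout is 25 ms. Any nonzero drift bound lengthens it slightly.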


Synchronous Distributed Systems

• How to build one
  – For each task, guarantee that sufficient resources are allocated, e.g., processor cycles and network bandwidth
  – Clocks with bounded drift rates


Asynchronous Distributed Systems

• Definition
  – Unbounded time to execute each step of a process
  – Unbounded message transmission delays
  – Unbounded clock drift rates from real time


Asynchronous Distributed Systems

• Actual systems are very often asynchronous because of the need for sharing of processors and network

• Protocols designed for asynchronous systems can be used in any system


Impossibility Results

• In an asynchronous distributed system, processes cannot be guaranteed to reach consensus if even one process may fail
  – One cannot distinguish a failed node from a slow one
  – Implication: Perfect common knowledge is not achievable in an asynchronous distributed system
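
The "failed versus slow" point can be sketched with a timeout-based detector. This is only an illustration of the intuition, not a proof of the impossibility result, and `suspect` and `slow_delay_for` are hypothetical names. Because delays are unbounded, for any timeout there is a live but slow peer that looks exactly like a crashed one.

```python
def suspect(reply_delay, timeout):
    """Timeout-based detector: suspect the peer if no reply arrives in time.

    reply_delay is None for a crashed peer, which never replies.
    """
    return reply_delay is None or reply_delay > timeout

def slow_delay_for(timeout):
    # With unbounded delays, a live peer can always be slower than the timeout.
    return timeout * 2 + 1

def indistinguishable(timeout):
    """Return the detector's verdicts for a crashed peer and a slow live peer."""
    crashed_verdict = suspect(None, timeout)
    slow_verdict = suspect(slow_delay_for(timeout), timeout)
    return crashed_verdict, slow_verdict
```

Whatever timeout is chosen, `indistinguishable(timeout)` returns `(True, True)`: the detector gives the same verdict for a crashed peer and for one that is merely slow.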
