Topic 1 Intro to Net and Distribute Sys

8/3/2019 Topic 1 Intro to Net and Distribute Sys


INTRODUCTION

The latter part of the 20th century set the stage for the age of information, during which technologies for information gathering, processing and distribution were developed. One of the most important technologies developed for information management was the computer. Computers were introduced after the Second World War. Within half a century, computer technology and the scope of its applications developed very quickly, and computers are now as common as cars and TV sets.

In the early years of their evolution, computers were large and expensive. Only governments and large organisations had them for computation. Since the mid-1980s, two important developments have radically changed the face of computers (and computing technologies).

LEARNING OUTCOMES

By the end of this topic, you should be able to:

1. Differentiate between networks and distributed systems;

2. Explain the role of a network in a distributed system;

3. Outline the challenges of designing and implementing distributed systems;

4. Describe the architectural models of distributed systems; and

5. Identify the fundamental models of distributed systems.

Topic 1: Introduction to Network and Distributed System




1. The first development was in the power of microprocessors. The computational power of microprocessors has doubled roughly every 1.5 years, while the price has fallen drastically. For instance, in 1945 a $100 million machine could execute one instruction per second. Now a $10,000 machine can execute 10^9 instructions per second. That is a price/performance gain of 10^13.

2. The second development was the invention of computer networking. A local area network (LAN) can connect hundreds of computers within a building or on a campus and provide very high data-transmission rates between them. The Internet allows machines all over the world to communicate with each other.
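The price/performance figure quoted in the first development can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name price_performance_gain is made up for this example, and the prices and instruction rates are the ones quoted in the text.

```python
from fractions import Fraction

def price_performance_gain(old_price, old_ips, new_price, new_ips):
    """Return how many times more instructions per second each dollar buys."""
    old_ratio = Fraction(old_ips, old_price)   # instructions per second per dollar, then
    new_ratio = Fraction(new_ips, new_price)   # instructions per second per dollar, now
    return new_ratio / old_ratio

# Figures from the text: $100 million for 1 instruction/s, $10,000 for 10^9 instructions/s.
gain = price_performance_gain(100_000_000, 1, 10_000, 10**9)
print(gain == 10**13)
```

The machine is 10^4 times cheaper and 10^9 times faster, and the two factors multiply to give the overall gain.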

Now, powerful computers connected through networks are found everywhere. They are used in every aspect of business, including advertising, production, shipping, planning, billing and accounting. Many companies have multiple networks, and even primary and secondary schools are using computer networks to provide students and teachers with access to the Internet. Networks can be used to provide some complicated services and share important resources. However, since a computer operating system cannot directly and effectively work with general network protocols to provide complicated remote services and handle issues related to the remote services, we need some additional software to handle these tasks and allow computers and networks to cooperate properly. We call this linkage system a distributed system.

1.1 NETWORKS

Networks, or computer networks, have been growing rapidly. They are now an essential part of our computer systems. According to Tanenbaum (1995), a network is an inter-connected collection of autonomous computers. Most computers in your organisation, whether PCs or servers, are most probably connected to a network. A computer network consists of a series of computers that are connected together so that they can communicate with each other. Computer networks can also share peripheral devices like printers.

1.1.1 Network Goals

To Overcome Geographic Separation

The most obvious goal for network communication is to overcome geographic separation. Just as a telephone network transfers voices over a distance, computer networks provide data transmission services between separated computers or data devices. For example, if a user wants to print a document, he or she will send the document from his or her computer, through a network, to the printer, which may be located far away from the user. The printer will then print out the document.

Computer networks also enhance human communication. Branches of companies and organisations that have been set up overseas can still communicate efficiently and effectively with their headquarters through computer networks. Overseas trading is as simple as local trading, and international cooperation among companies exists all over the world.

Networks can be classified into three different geographic scopes.

1. A wide area network (WAN) spans the longest distance, such as a city, a country or even the whole world. The most common example of a WAN is the Internet. Almost all networked computers are connected to it.

2. A local area network (LAN) is usually limited to a geographic scope of a few kilometres. LANs are commonly used within a campus, a building or even an office. A LAN usually supports a distributed system.

3. A metropolitan area network (MAN) is one that covers a metropolitan area, such as a cable TV (CATV) network.

To Share Resources

Another important goal of networking is to enable resource sharing. Some devices (such as printers) are expensive, so many users should be able to access them. An obvious solution is to connect the devices to the network, so that every user who is connected to the network can share them. For example, many users can share an expensive high-speed colour laser printer by connecting it to a network with other computers. Resource sharing does not need to be limited to physical (hardware) devices; data files and application software can also be shared.

To Support Distributed Systems

Another important but less obvious goal is to support the operations of distributed systems. All distributed services need computer networks to send requests and to return the corresponding results. The result may be a simple "yes" or "no" answer, or a big binary file, depending on the nature of the distributed service and the request.




1.1.2 Design Issues of Networks

Computer networks are complicated systems. Many network technologies exist, and each technology has features that distinguish it from the others. Many different commercial products are developed in different ways, and sometimes they may be combined to form a complicated network. To help you understand the basic concepts of computer networks, the following main design issues of computer networks are introduced: transmission media, network hardware, and network software. All of them are elaborated further in Topic 2.

Transmission Media

Transmission media are used for the actual transmission (transportation) of data or information. At the lowest level, computer networks encode data into a form of signals, such as electromagnetic or optical signals, and send them through a transmission medium. For example, copper wires are used to transmit data in the form of electromagnetic waves from a sender to its corresponding receiver.

Each transmission medium has its own characteristics, such as bandwidth, delay, cost and ease of installation and maintenance. You are introduced to different transmission media in Topic 2.

Network Hardware

Let us consider which hardware should be included in a network. Figure 1.1 shows a simple high-level model of a network.

Figure 1.1: A simple high-level model of a network

The sender generates a message and puts it into the network. The network receives the message and then transfers it to the receiver. The receiver takes the message out and gives it to its application program. Note that there may be many small networks (called sub-networks) connected to each other to form a big network.

Structurally, a network includes a set of nodes inter-connected by a set of transmission lines, and each connection is called a link. There are usually many senders and receivers. Senders may send many messages and, at the same time, receivers may need to receive many messages from different senders. In practice, senders and receivers can be computers, workstations or terminals. We assume users run application programs on those machines, and thus we can call them hosts or host computers (they can also be called nodes, end-stations, machines or end-users).

There are many ways in which nodes and links are inter-connected to form sub-networks. Usually, we classify them into two types of transmission technology:

1. Point-to-point communication: A network with point-to-point communication consists of many connections between individual pairs of machines. To go from the source to the destination, the sender needs to provide the destination's address (location), and the message might need to travel through some intermediate nodes before reaching its destination.

2. Shared-point communication (often called broadcast communication): In this type of network, a single communication channel is shared by all machines on the network. If a machine in the network wants to transmit a message to its corresponding destination, it uses the shared communication channel to do the transmission, and the destination has to receive (copy, or pick up) the message from the channel. Only the destination can copy the message from the channel; the others do not have the right to do so.

Usually, we classify a network by considering its coverage area. Thus, we have the following classes of network:

Local Area Networks (LANs): These are small networks that are usually implemented for use in offices, buildings and campuses up to a few square kilometres in size. They are widely used to connect personal computers, workstations and devices in company offices to exchange information and share resources. The three common kinds of network topology for LANs are star, bus and ring. The size of a LAN is small (the coverage area should be within a few kilometres), so the transmission delay (the average time taken between sending and receiving a message) is short (less than 10 ms). The data-transmission rate (the average speed of transmitting a message from a sender to a receiver) is high (from 10^7 to 10^9 bits per second). We focus on LANs in our study, because LANs are usually built as distributed systems.

Metropolitan Area Networks (MANs): A MAN is like a LAN, but it inter-connects computers and computing resources that span a single city, for example, a large business organisation with several or many buildings located throughout a city. Each building has its own LAN, and all of these LANs are connected to one another, forming a big network that spans the whole city, i.e. a MAN.

Some of the following businesses in Hong Kong are likely to have their own MAN: Park'n Shop, Marks & Spencer, G2000, and so on. Each of these firms has a chain of shops in Hong Kong, and each shop's computer network is connected to a broader MAN system.

Wide Area Networks (WANs): They span a large geographical area such as a country or a continent, or even the world. They connect many machines together (at least thousands). Usually the transmission delay is long (up to a few seconds) and the transmission rate is low (it may now be up to 10^6 bits per second, but it is still lower than that of LANs). One of the most common examples of a WAN is the Internet.

Wireless Networks (or Mobile Networks) are another type of network. They use a wireless transmission medium. Many users who have desktop machines on LANs and WANs want to work outside with their computers. This is impossible if their computers are wired, so there is a lot of interest in exploring the use of wireless networks. The terminals used in wireless networks are mobile computers or personal digital assistants (PDAs). Wireless networks are necessary, especially when the environment is difficult for cabling or when users are always "on the move".
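To get a feel for what the delay and data-rate figures quoted for LANs and WANs mean in practice, the time needed to deliver a message can be roughly estimated as the transmission delay plus the message size divided by the data rate. The sketch below is a simplification under stated assumptions: the function name transfer_time, the 1 MB file size and the 2-second WAN delay are illustrative choices, not values from the text, while the 10 ms delay and the 10^7 and 10^6 bits-per-second rates are taken from the ranges quoted above.

```python
def transfer_time(size_bits, rate_bps, delay_s):
    """Rough delivery time: fixed delay plus time to push the bits onto the link."""
    return delay_s + size_bits / rate_bps

FILE_BITS = 8 * 10**6  # assumed example: a file of 10^6 bytes

# LAN: low end of the quoted rate range (10^7 bps), 10 ms delay.
lan_seconds = transfer_time(FILE_BITS, 10**7, 0.010)

# WAN: quoted rate of 10^6 bps, assumed delay of 2 seconds.
wan_seconds = transfer_time(FILE_BITS, 10**6, 2.0)

print(f"LAN: about {lan_seconds:.2f} s, WAN: about {wan_seconds:.2f} s")
```

Under these assumptions the same file takes well under a second on the LAN and around ten seconds on the WAN, which is one reason the text treats LANs as the natural base for distributed systems.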

1.1.3 Network Software

Modern networks are highly complicated systems, but your learning and understanding would be incomplete if we only considered the hardware side of things. Network software is also highly structured and complicated. To reduce the design complexity, most networks are organised as a series of layers, as shown in Figure 1.2.

SELF-TEST 1.1

1. Give three reasons for using computer networks. Briefly explain each.

2. State the names of the different transmission media that you know.

3. Suppose there are n computers and you want to have a communication path between any two of them. This is to be achieved by direct (point-to-point) connection with links only (no switching nodes or routers). How many links are required? What implication can you draw from your answer?




Figure 1.2: Layers, protocols, and interfaces

A layer is defined by the service it provides to the layer above it. The number of layers and the functions of each one differ from network to network. Each layer is independent of the others, but there is a communication interface between two adjacent layers.

Consider two computers, host 1 and host 2, that communicate with each other. In this example, both have the same number of layers. In general, the number of layers in two computers does not need to be the same, but when a sender wants to communicate with a receiver, each host must have corresponding layers in its system.

For example, for layer 2 of host 1 to communicate with host 2, host 2 must have the corresponding (peer) layer 2 in its system. The set of rules governing the message exchange between two machines in layer n is called the n-peer protocol or simply the layer n protocol. The messages exchanged between these two layers are called n-PDUs (Protocol Data Units of layer n). The format and the meaning of the fields in n-PDUs are specified in the layer n protocol.

To carry out the communication, layer n uses the services provided by layer n-1, and the interface between layer n and layer n-1 (the layer n-1/n interface) is standardised. When a sender, say host 1, wants to send a message to its corresponding receiver, say host 2, the message first passes to the highest layer, say N, of host 1. After processing by layer N, the resulting message is passed to layer N-1. By repeating the above steps, the message reaches the lowest layer, layer 1 of host 1, which sends the final message to the receiver through the physical transmission medium. Layer 1 of host 2:




1. Receives the message through the physical transmission medium;

2. Processes the message; and

3. Passes it to its upper layer, layer 2.

By repeating the above steps, the highest layer of host 2 will receive the original message and then pass it to the corresponding application program. The whole process, which sends messages from one side to the other, is governed by what is called a network protocol.
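The passage of a message down the layers of host 1 and up the layers of host 2 can be sketched as a toy program. This is only an illustration and not a real protocol stack: the functions send_down and receive_up and the "H1|", "H2|" header format are invented for this example. Each layer on the sending side wraps the data it receives from the layer above with its own header, and the peer layer on the receiving side removes exactly that header.

```python
def send_down(message: str, num_layers: int) -> str:
    """Pass a message from layer N down to layer 1, adding one header per layer."""
    pdu = message
    for layer in range(num_layers, 0, -1):
        pdu = f"H{layer}|{pdu}"  # layer n wraps what it was handed by layer n+1
    return pdu  # this is what layer 1 puts on the physical medium

def receive_up(pdu: str, num_layers: int) -> str:
    """Pass a received PDU from layer 1 up to layer N, removing one header per layer."""
    for layer in range(1, num_layers + 1):
        header, _, pdu = pdu.partition("|")
        if header != f"H{layer}":
            # The peer layer does not understand this header: protocol mismatch.
            raise ValueError(f"layer {layer} expected header H{layer}, got {header!r}")
    return pdu  # the original message, handed to the application program

wire = send_down("hello", 3)
print(wire)                 # H1|H2|H3|hello
print(receive_up(wire, 3))  # hello
```

The outermost header belongs to layer 1 because it was added last, which is why host 2 can process the message layer by layer starting from the bottom.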

1.2 REFERENCE MODELS

You might (or might not) know that there are many different networks and network protocols in the world. If we did not have a standard model to follow, it would be difficult for users to communicate with others whose network protocol is different from theirs. It is like two computers "speaking" different dialects (or perhaps different languages). Software designers would need a great deal of effort to overcome this problem.

Around the early 1980s, the International Organization for Standardization (ISO) proposed the Open Systems Interconnection (OSI) reference model. The model aimed to standardise network components to allow multi-vendor development and support. This OSI reference model was expected to become the dominant standard in the computer network market. It is a layered reference model with seven layers - the physical layer, data link layer, network layer, transport layer, session layer, presentation layer, and application layer. They are all well defined and well structured, and each layer has its own networking function(s).

However, the ISO OSI reference model is too complicated (too many layers). In the 1970s, the United States (US) Department of Defence developed a research network called ARPANET. After further development, its protocol architecture became the TCP/IP reference model (some texts refer to TCP/IP as a suite of network protocols), which was released in the commercial market in the 1980s. It quickly became the dominant model or standard in the computer networks market, which is a major reason why the Internet developed so rapidly: there was a common network protocol suite. The most important reason why the TCP/IP reference model succeeded (over the ISO OSI reference model) was its simplicity and ease of operation. The TCP/IP reference model, shown in Figure 1.3, has only four layers - the host-to-network layer, the network or IP (Internet Protocol) layer, the transport or TCP (Transmission Control Protocol) layer, and the application layer.




Application

TCP

IP

Host-to-network

Figure 1.3: The TCP/IP reference model

1. The host-to-network layer handles all physical transmission issues - the format of the physical signals, the network topology, the hardware devices (transmitters, receivers and routers) and the low-level network protocols (related to physical networks). All physical network configurations are related to this layer.

2. IP (Internet Protocol) is the second layer. It has two main functions:

(a) To define a unique and well-defined IP address for each machine and to define the format of its PDU (i.e. the datagram); and

(b) To provide services to route a datagram from a sender to its corresponding receiver through a network.

3. TCP (Transmission Control Protocol) is the third layer. It provides end-to-end reliable communication between two user processes on two different machines. Note that there is also a connectionless protocol defined over IP. It is called the User Datagram Protocol (UDP).

4. The application layer is the highest layer. It provides some simple application services for end-users on top of TCP, such as FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol), and HTTP (Hypertext Transfer Protocol).
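The difference between the two transport protocols mentioned in the list above can be seen from an application's point of view. The sketch below uses Python's standard socket module on the loopback address 127.0.0.1; the helper names udp_roundtrip and tcp_roundtrip are made up for this illustration, and running both ends in one process is a simplification (in practice the sender and receiver would be separate processes on different machines). With UDP, a datagram is simply sent to an address. With TCP, a connection is set up first and the data then flows as a reliable byte stream.

```python
import socket

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback UDP; no connection is established."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()

def tcp_roundtrip(payload: bytes) -> bytes:
    """Send bytes over loopback TCP; a connection is set up before any data moves."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    listener.settimeout(2.0)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        client.connect(listener.getsockname())  # connection set-up
        conn, _client_addr = listener.accept()
        with conn:
            conn.settimeout(2.0)
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)     # signal "no more data"
            chunks = []
            while True:
                chunk = conn.recv(4096)
                if not chunk:                   # empty read means the sender finished
                    break
                chunks.append(chunk)
            return b"".join(chunks)
    finally:
        client.close()
        listener.close()

print(udp_roundtrip(b"datagram"))
print(tcp_roundtrip(b"byte stream"))
```

On a real network a UDP datagram may be lost without notice, whereas TCP retransmits lost data. On the loopback interface used here both normally arrive.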

Because of the popularity of the TCP/IP reference model, you are given a complete picture of it in Topic 3. In Topic 3, you learn the functions of IP and TCP, and how they support the application layer. You also learn some simple application services such as FTP and email services.




1.3 DISTRIBUTED SYSTEMS

As mentioned previously, a computer operating system cannot directly and effectively work with general network protocols to provide complicated remote services and handle issues related to remote services. The reason is that, in the beginning, the design of computer operating systems did not include support for remote services. Computers and computer operating systems were introduced after the Second World War, but networking became popular (or even possible) more than 30 years after the first computers were developed. Therefore, it is easy to see that operating systems were designed for standalone computers, not for networked computers.

You might say that operating systems could be upgraded to support remote services, but it is usually easier and cheaper to install additional software to handle remote services than to upgrade the original operating systems. Therefore, distributed systems software is the additional software that supports remote services for a set of computers connected by a computer network.

SELF-TEST 1.2

1. Which type of network (LAN, MAN or WAN) supports the following scenarios:

(a) A large shopping centre (e.g. Festival Walk)?

(b) The School of Science and Technology in the Open University of Malaysia?

(c) World Wide Web (WWW)?

Justify your answers.

2. Which layer of the TCP/IP reference model handles each of the following functions:

(a) Corrects the errors of the physical signals?

(b) Shows the path that a message travelled from a sender to its corresponding receiver?

(c) Provides an email service?

(d) Corrects the errors of the received data bits?

Justify your answers.




A typical distributed system is shown in Figure 1.4. It shows the components of a distributed system based on a LAN. Such a system, equipped with appropriate software, can support the needs of a substantial number of computers (users), thereby performing a function similar to a single powerful computer.

Figure 1.4: A simple distributed system

A distributed system is defined as a collection of autonomous computers linked by a network, with software designed to produce an integrated computing facility. Tanenbaum (2002) also provides a simple and direct definition of distributed systems: a distributed system is a collection of independent computers that appears to its users as a single coherent system. Keep this definition in mind, but you will not look into the details of distributed systems until you have finished Topic 4 and Topic 5.

1.3.1 The Relationship between Networks and Distributed Systems

A beginner can easily get confused between the definitions of a network and a distributed system. A network, or specifically a computer network, is an inter-connected collection of autonomous computers. Two computers are inter-connected if they can exchange messages or information. The connection between them is through a transmission medium such as copper wire, fibre optics, microwaves or satellite.




The key difference between the two is that, in a distributed system, the existence of multiple autonomous computers is transparent to the users: the system appears to them as a single computer. Users can use the services provided by the distributed system, input some data (parameters or files) to the system and wait for the output from the system. Users do not need to know exactly how the remote services work or where they are in the system.

In a network, users must explicitly log on to a machine, explicitly know what the machine can do, explicitly submit data to the correct location, and explicitly tell the machine how to return their results (e.g. give their own logical addresses to the machine).

In fact, a distributed system is built on top of a network. Networks are just one of the resources of distributed systems, and distributed systems use them to deliver and receive data. For example, both distributed systems and networks support file movement, but users in networks need to know the locations of the sender and receiver, the network configuration and which network protocol is used, whereas users in distributed systems do not need to know these things. In fact, they should not need to know these details.

1.3.2 The Advantages and Disadvantages of Distributed Systems

The main advantage of building a distributed system is resource sharing, which is similar to what a network offers but is done more efficiently and effectively. Compared to standalone computer systems, distributed systems can share common databases and some expensive peripherals, e.g. printers, in a better way. The following are the advantages of distributed systems over standalone computers:

Economy: Microprocessors in distributed systems offer a better price/performance ratio than standalone computer systems. A standalone computer system uses an expensive, high-performance and high-reliability CPU, whereas we can achieve the same performance using a large number of cheap CPUs together in a distributed system.

Consider the following example. A process requires ten hours of execution on a high-speed computer. In a distributed system, we can instead use ten cheap, slow CPUs in parallel, each ten times slower than the high-speed one, to run the same process. Provided the work can be divided evenly among them, both approaches take the same amount of time to finish the job, but the cost of ten slow CPUs will be much lower than that of the high-speed one.

Inherent distribution: Some applications involve spatially separated machines. For example, a supermarket chain may have many stores. Management needs to keep track of inventory at each store and update this kind of information at headquarters. To implement this application, a commercial distributed system is a natural choice.

Reliability: A distributed system is more reliable than a standalone computer. A standalone computer will crash if its CPU crashes. However, in a distributed system, if a single machine crashes, the rest of the system can still survive and operate properly with some fault tolerance facilities (i.e. hardware and/or software). Also, as the CPU of a standalone computer system is usually very expensive, it would cost a lot to replace. It is much cheaper to replace malfunctioning components in a distributed system.

Incremental growth: Computing power cannot be upgraded easily in a standalone computer system, since the cost of upgrading the CPU is very high. However, for distributed systems, we do not need to upgrade all machines at the same time. We can upgrade the CPU of each individual machine (one at a time) so that the system grows incrementally, step by step.
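The arithmetic behind the Economy example above can be written out as a short calculation. The sketch below assumes the job divides perfectly across the CPUs, which real workloads rarely do. The function name elapsed_hours and the two prices are hypothetical, chosen only to illustrate the comparison; they are not figures from the text.

```python
from fractions import Fraction

def elapsed_hours(work, cpu_speed, num_cpus):
    """Hours needed when `work` is split evenly across identical CPUs."""
    return Fraction(work) / (Fraction(cpu_speed) * num_cpus)

WORK = 10  # the job needs 10 hours on the fast CPU, whose speed is taken as 1

fast_time = elapsed_hours(WORK, 1, 1)
slow_time = elapsed_hours(WORK, Fraction(1, 10), 10)  # ten CPUs, each ten times slower

# Hypothetical prices, for illustration only.
FAST_CPU_PRICE = 50_000
SLOW_CPU_PRICE = 500
fast_cost = FAST_CPU_PRICE * 1
slow_cost = SLOW_CPU_PRICE * 10

print(fast_time, slow_time)  # both 10 hours
print(fast_cost, slow_cost)  # the ten slow CPUs cost less under these assumed prices
```

The elapsed time is the same because ten CPUs at one tenth of the speed add up to the capacity of one fast CPU. The saving comes entirely from the price difference.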

Nevertheless, distributed systems also have some disadvantages compared with standalone computer systems:

Software complexity: Designing the software for a distributed system is much more complex than designing for a standalone computer system, because the software for distributed systems has to take care of many machines and their interactions. For example, we need to design a system that allows for sharing resources while maintaining operational consistency across the users' machines, but consistency is much simpler in standalone computer systems.

Communication delay: There is almost no communication delay in standalone computer systems. In distributed systems, however, the delay can be significant because there are no dedicated links: all users share the many paths in the network. Sometimes, because of the long communication delays, users prefer to execute a program locally rather than on a remote server.

Security: It is difficult to protect confidential data in distributed systems, because many users can access shared resources and the system must be capable of identifying unauthorised users. Most standalone computer systems do not need this capability: they are closed systems and it is difficult (or impossible) to access them from the outside. Distributed systems, however, are usually connected to the Internet, and thus anyone can try to access (hack) their systems via this worldwide WAN.




1.3.3 Characteristics of Distributed Systems

To design and implement a distributed system, the following characteristics should be considered:

Resource sharing

Heterogeneity

Openness

Security

Scalability

Fault-handling

Concurrency

Transparency.

Note that all of the above characteristics should be considered, but they do not need to be implemented at the same time. Sometimes, the characteristics that should be implemented depend on the nature of the application services provided (or desired). Thus, we can say the above points bring challenges when building a distributed system.

READING

Topic 1, sections 1.11.2 (Introduction), 27.

SELF-TEST 1.3

1. Name one network and one distributed service.

2. Give another example to show the difference between computer networks and distributed systems.

Resource Sharing

As mentioned, resource sharing is the most important characteristic or advantage of a distributed system, and thus all distributed systems should deal with this issue. The term "resource" is abstract, since it can represent hardware (e.g. printers, CPUs) or data (e.g. shared databases, shared executable files). To manage the resource(s) effectively, a program called a resource manager is required to provide an interface between the resource and users. The resource manager should provide the resource name, identify the resource location, map the resource name to a communication address, and coordinate concurrent accesses to ensure consistency.
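The duties of a resource manager listed above (naming, locating, mapping a name to a communication address, and coordinating concurrent access) can be sketched as a small class. This is a minimal illustration under stated assumptions, not a description of any real system: the class ResourceManager, its method names, the resource name "colour-printer" and the address shown are all hypothetical, and a real resource manager would run as a network service rather than as an in-process object.

```python
import threading

class ResourceManager:
    """Toy resource manager: maps resource names to addresses and serialises access."""

    def __init__(self):
        self._addresses = {}  # resource name -> communication address
        self._locks = {}      # resource name -> lock guarding that resource
        self._registry_lock = threading.Lock()

    def register(self, name, address):
        """Give a resource a name and record where it can be reached."""
        with self._registry_lock:
            self._addresses[name] = address
            self._locks[name] = threading.Lock()

    def locate(self, name):
        """Map a resource name to its communication address."""
        with self._registry_lock:
            return self._addresses[name]

    def use(self, name, operation):
        """Run `operation(address)` while no other user is using the same resource."""
        with self._registry_lock:
            address = self._addresses[name]
            lock = self._locks[name]
        with lock:  # concurrent requests for one resource are handled one at a time
            return operation(address)

manager = ResourceManager()
manager.register("colour-printer", ("10.0.0.7", 9100))  # hypothetical address

# The user only names the resource; the manager supplies the address.
result = manager.use("colour-printer", lambda addr: f"job sent to {addr[0]}:{addr[1]}")
print(result)
```

Users refer to the printer only by name, so it can be moved to another address by re-registering it without any change on the users' side.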

Heterogeneity

Heterogeneity applies here to the variety of different hardware and software components operating together at the different levels of a distributed system:

Networks

Computer hardware

Operating systems

Programming languages

Implementations by different developers.

Since a distributed system can be implemented by more than one group of developers and might be supported by different hardware and software, a standard protocol or interface is essential for all of them to ensure that the system works properly.

For networks, the most common approach to linking different components together is to use the Internet protocols, i.e. TCP/IP. You investigate TCP/IP in Topic 3.

For computer hardware and operating systems, heterogeneity is not a major problem, because computers communicate through networks, and thus it will not affect a distributed system if the exchanged messages are standardised.

For programming languages, heterogeneity might cause problems because programs might produce different results if they are compiled and executed on different machines. Now, however, some programming languages, such as Java, are platform independent, and produce programs that are compatible and interoperable on all machines.

To handle the different implementation approaches of different developers, standard interfaces are needed for each application service, i.e. the input data and output result formats are standardised. Thus, even if a client requests an application service from a server that was implemented by other developers, the client and server, even though they are different, can still communicate with each other, as they "speak the same dialect".
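The idea of a standard interface between independently developed clients and servers can be illustrated with an agreed message format. In the sketch below, JSON is used as the shared format; this is an assumption made for the example, since the text does not name any particular format. The "service", "params", "status" and "result" fields and the three function names are likewise invented. The point is only that the client code and the server code share nothing except the format of the request and the reply.

```python
import json

# --- Written by the client's developers ---
def client_build_request(service, params):
    """Encode a request in the agreed format."""
    return json.dumps({"service": service, "params": params}).encode("utf-8")

def client_read_reply(raw):
    """Decode a reply in the agreed format."""
    return json.loads(raw.decode("utf-8"))

# --- Written by a different group: the server's developers ---
def server_handle(raw):
    """Decode a request, perform the service, and encode the reply."""
    request = json.loads(raw.decode("utf-8"))
    if request["service"] == "add":
        reply = {"status": "ok", "result": sum(request["params"])}
    else:
        reply = {"status": "error", "result": None}
    return json.dumps(reply).encode("utf-8")

# The bytes below are what would travel over the network between the two sides.
request_bytes = client_build_request("add", [2, 3])
reply = client_read_reply(server_handle(request_bytes))
print(reply)
```

Either side could be rewritten in another programming language or run on different hardware without affecting the other, as long as the message format stays the same.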




Openness

Openness is the characteristic that determines whether a distributed system can be extended or expanded in various ways. For hardware, we should be concerned about whether additional peripherals, memory or communication interfaces can be put into the system or not. For software, additional operating system functions, communication protocols and resource-sharing devices should be able to join the system without any modification to the system.

Security

Many information resources are maintained in a distributed system for the users to share. However, some critical resources should not be available to unauthorised users and need to be protected. There are two kinds of protection.

1. Resources that should not (must not) be accessed by unauthorised users must be protected. A firewall is usually used to form a barrier around a distributed system so that all incoming and outgoing traffic will be inspected.

2. If sensitive information is sent in a message over a network, you also need a security procedure to protect the message so that unauthorised users cannot access the content of the message. Security is not always implemented in a distributed system, because it depends on how sensitive the information resources are.

Scalability

Distributed systems can operate effectively at many different scales. A system is scalable if it remains stable when the number of users and the amount of resources are increased significantly. In other words, adding users does not adversely affect the way the system works. Usually there are three kinds of scale:

1. The smallest one is two workstations with one file server;

2. The middle and most common one is a distributed system within a LAN. Hundreds of workstations and several file servers and printer servers might be interconnected; and

3. The largest one involves inter-networking. Several inter-connected LANs might contain thousands of computers and many shared resources.

Fault-Handling

Sometimes, distributed systems fail. Some output results might be incorrect, some incoming requests might be lost, a server might be down, or some services might stop before they complete the computation. A good distributed system should be capable of detecting, correcting, or even preventing such failures, although the failures are difficult to handle.

Generally there are two kinds of failures - hardware and software. Hardware redundancy is used to handle hardware failures, i.e. redundant components replace the failed ones. Programs should be designed to tolerate, or automatically recover from, software failures.

Fault-handling, like security, is not always important and is not always implemented in a distributed system, because it depends on how critical the information resources are. Since it is difficult to implement (sometimes the cost of the system may double if fault-handling is included), it would be considered only if the resources are extremely critical (e.g. a banking database).
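The idea of redundancy described above, where a redundant component takes over when one fails, can be sketched in software as a client that tries a list of replicated servers in turn. This is a minimal sketch, assuming failures show up as an exception: the ServerDown exception, the call_with_failover function and the two toy servers are all hypothetical names for this example.

```python
class ServerDown(Exception):
    """Raised by a (simulated) server that cannot answer."""

def call_with_failover(replicas, request):
    """Try each replica in order and return the first successful answer."""
    for server in replicas:
        try:
            return server(request)
        except ServerDown:
            continue  # this replica failed; fall through to the next one
    raise ServerDown(f"all {len(replicas)} replicas failed")

def primary(request):
    raise ServerDown("primary is down")  # simulate a crashed machine

def backup(request):
    return f"balance for {request}: 100"  # simulated answer from the redundant server

answer = call_with_failover([primary, backup], "acct-1")
print(answer)
```

The user still gets an answer even though the primary server has failed. The price is the extra (redundant) server, which is the cost the text warns about.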

Concurrency

Since there are many clients (users) and several servers in a distributed system, it is possible to have more than one process executing in parallel. Concurrency is one of the intrinsic characteristics of distributed systems. There are two reasons for parallel executions to occur:

1. Many users simultaneously invoke commands or interact with (the same) application programs.

2. Many server processes run concurrently, each corresponding to a single request from a client process.

The major drawback of this characteristic is the problem of inconsistency. For example, if more than one process executes in parallel with the same database, some processes might get inconsistent data. To overcome this, some database updating algorithms might be added to avoid the inconsistencies.
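One common way to avoid such inconsistencies is to make each update an indivisible step. The sketch below assumes a single shared record held in memory and uses a lock from Python's standard `threading` module. `Account` and `run_deposits` are hypothetical names chosen for illustration, not part of any real database system.

```python
import threading


class Account:
    """A shared record updated by several concurrent server threads."""

    def __init__(self, balance=0):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # Holding the lock makes the read-modify-write sequence indivisible,
        # so two threads cannot both read the same old balance.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run_deposits(account, workers=8, deposits_each=1000):
    """Start several threads that all deposit into the same account."""

    def work():
        for _ in range(deposits_each):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance
```

Without the lock, two threads could read the same balance and one deposit could overwrite the other, which is the kind of inconsistency described above.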

Transparency

Transparency means that the distributed nature of the system is hidden from the user and the (application) programmer. A distributed system is transparent if it presents the image of a single system, so that everyone perceives the collection of independent components as simply a single time-sharing system.

There are several transparencies that should be achieved in a distributed system:

Access transparency enables local and remote resources to be accessed using identical operations. That means it does not matter whether the resources are on local or remote machines; the way to access them should be identical (or very similar at the least).




Location transparency enables resources to be accessed without knowing their location. Users might need to know the name of the resources but not their location.

Concurrency transparency enables several processes to operate concurrently (simultaneously, at the same time) using shared resources without interference between them. The most common example of this transparency is multiple processes accessing a shared database. All processes can retrieve and save data from the shared database, but the database must still maintain the consistency of the base data.

Replication transparency enables multiple instances (copies) of resources to be used to increase reliability and performance without the users or application programmers knowing anything about the replicas. The multiple instances of resources are usually distributed uniformly (through the system), so that users can always find one of the instances close to them.

Failure transparency enables the concealment of faults, and allows users and application programs to complete their tasks despite the failure of hardware or software components. The power of failure transparency is highly dependent on how many resources are held in reserve within this fault-tolerance scheme.

Mobility transparency allows the movement of resources and clients within a system, without affecting how the users or the programs operate. It is highly related to location transparency.

Performance transparency allows the system to be reconfigured to improve performance as loads vary. This is very difficult to implement if the load on a system increases a lot, because a distributed system may have no way to improve its performance significantly through reconfiguration alone.

Scaling transparency allows the system and applications to expand in scale without changing the system structure or the application algorithms.
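Location and mobility transparency are often achieved with a naming service that maps resource names to their current hosts. The following is a minimal sketch under that assumption. `NameService` and `Cluster` are hypothetical classes, and the hosts are plain dictionaries rather than real machines.

```python
class NameService:
    """Hypothetical directory mapping resource names to their current host."""

    def __init__(self):
        self._host_of = {}

    def register(self, name, host):
        self._host_of[name] = host

    def lookup(self, name):
        return self._host_of[name]


class Cluster:
    """A toy system whose hosts are dictionaries of named resources."""

    def __init__(self):
        self.names = NameService()
        self.hosts = {}  # host name -> {resource name: contents}

    def store(self, host, name, contents):
        self.hosts.setdefault(host, {})[name] = contents
        self.names.register(name, host)

    def migrate(self, name, new_host):
        # Move the resource and update the directory; callers are not told.
        old_host = self.names.lookup(name)
        contents = self.hosts[old_host].pop(name)
        self.store(new_host, name, contents)

    def read(self, name):
        # The caller supplies only a name, never a location.
        host = self.names.lookup(name)
        return self.hosts[host][name]
```

A caller uses `cluster.read("report.txt")` in exactly the same way before and after `cluster.migrate("report.txt", "hostB")`, which is the behaviour that location and mobility transparency describe.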

Topic 1, section 1.4, pp. 16–25.

READING




1.3.4 Architectural Models of Distributed Systems

An architectural model is a structural model of separately specified components that provides a consistent frame of reference for the design. A distributed system can be described from the following perspectives:

Its layered software model;

Its system architecture; and

Its design requirements.

Layered Software Model

Consider the layer structure of a standalone computer system in Figure 1.5. It shows the four main hardware and software layers: applications, run-time support for programming languages (for example, interpreters and libraries), the operating system and hardware components. Note that the hardware component includes both computer and network hardware.

Application

Run-time support

Operating system

Hardware

Figure 1.5: The layered structure of a standalone computer system

For distributed systems, however, there is a different layered structure. The platform consists of the lowest-level hardware and operating system layers, which are independent of the distributed system. Middleware is a layer designed to implement a distributed system. It provides distributed services to application programs.

1. Name the resources that can be shared in a distributed system.

2. Which two transparencies do you consider to be the most important? Justify your answer.

SELF-TEST 1.4




System Architecture

In a distributed system, processes are arranged together to perform useful tasks. The following are the four types of system architecture.

1. The client-server model is the most widely used model. Clients send invocations (requests to an authority) to the servers (the authority) for their remote services. The server then executes the remote services based on the invocations and sends the results back to the clients.

2. Services provided by multiple servers. This model is usually used to provide complicated services such as fault tolerance or security.

3. Proxy servers and caches. A cache is a fast store that holds the most recently used data objects. When a client requests an object, the caching service first checks the cache and supplies the object from the cache if it is available. If not, the object must be fetched from the Web servers. The cache is updated when the object has been found: this most recently sought object is added to the cache.

4. Peer processes. Sometimes, some processes play similar roles, interacting co-operatively as peers to perform a task. This model is usually used because of the nature of the application, e.g. group communication.
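The caching behaviour described in item 3 can be sketched as follows. This is a simplified illustration, not a real proxy server: `CachingProxy` is a hypothetical class, the origin server is passed in as an ordinary function (`origin_fetch`), and evicting the least recently used object when the cache is full is an assumed policy.

```python
from collections import OrderedDict


class CachingProxy:
    """Serves objects from a small cache, fetching from the origin on a miss."""

    def __init__(self, origin_fetch, capacity=2):
        self._fetch = origin_fetch      # function standing in for the Web server
        self._capacity = capacity       # maximum number of cached objects
        self._cache = OrderedDict()     # ordered from least to most recently used
        self.origin_requests = 0        # how often the origin had to be contacted

    def get(self, key):
        if key in self._cache:
            # Cache hit: mark the object as most recently used and return it.
            self._cache.move_to_end(key)
            return self._cache[key]
        # Cache miss: fetch from the origin, then add the object to the cache.
        self.origin_requests += 1
        value = self._fetch(key)
        self._cache[key] = value
        if len(self._cache) > self._capacity:
            # Evict the least recently used object to make room.
            self._cache.popitem(last=False)
        return value
```

Repeated requests for the same object reach the origin only once, so `origin_requests` grows more slowly than the number of client requests.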

Design Requirements

There are four requirements in the design of a distributed system:

1. Performance issues concern how a distributed system functions or performs when it executes some application services. Since an application service usually exchanges messages through a network, the network performance is highly related to the performance of a distributed system. There are two main parameters for network performance:

(a) Message transmission delay (the time taken to send a message from a sender to its corresponding receiver).

(b) Throughput (the data transfer rate).

Also, the software-processing rate and the computational-load balancing (the load distribution of machines in a distributed system) are factors of performance.

2. Quality of Service (QoS). The QoS experienced by clients and users covers reliability, security and performance. The concern here is whether fault tolerance can be achieved in a distributed system to maintain its reliability and availability. As for security, a reasonable degree of security should be applied to the data that are stored and transmitted within a distributed system.

3. Use of caching and replication. Both caches and replicated servers should be used to improve the performance and availability within a distributed system. The concern is how to validate a cached response, how to refresh the cache and how to maintain the consistency of caches and replicated servers.

4. Dependability issues. The dependability of computer systems is defined as correctness, security and fault tolerance. Note that QoS is related to the client side, whereas dependability issues are related to the server side.
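The two network parameters listed under item 1 can be combined into a rough estimate of how long a message takes to arrive. The sketch below assumes a simple model in which the total time is the transmission delay plus the message size divided by the throughput. The function name and the sample figures are illustrative, not measurements.

```python
def transfer_time(message_bits, delay_s, throughput_bps):
    """Estimate the seconds needed to deliver one message.

    Assumed model: fixed delay + (message size / data transfer rate).
    """
    if throughput_bps <= 0:
        raise ValueError("throughput must be positive")
    return delay_s + message_bits / throughput_bps


# Illustrative figures: a 1,000,000-bit message, 5 ms delay, 100 Mbit/s link.
estimate = transfer_time(1_000_000, 0.005, 100_000_000)
```

Under this model the delay dominates for short messages and the throughput dominates for long ones, which is why both parameters matter.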

1.3.5 Fundamental Models

Distributed systems can be classified using the following models, which are based on the fundamental properties of systems.

Interaction model. In these kinds of distributed system, processes interact by passing messages that result in communication and coordination among processes. This is the most common model. Examples are the client-server and group communication models.

Failure model. Since processes and communication networks might fail in a distributed system, this model defines how the system should recover when such failures occur.

Security model. Since resources are shared within a distributed system, this model defines how to protect those resources from being accessed by unauthorised users, and it provides a secure way for authorised users to access the shared resources.
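As a small illustration of the interaction model, the sketch below has a client and a server that coordinate only by passing messages. It assumes both run as threads in one program and uses queues from Python's standard library in place of a network. `server` and `run_exchange` are hypothetical names.

```python
import queue
import threading


def server(requests, replies):
    """Receive request messages and send back a reply for each one."""
    while True:
        message = requests.get()
        if message is None:
            # A None message is this sketch's agreed signal to shut down.
            break
        replies.put(message * 2)


def run_exchange(values):
    """Act as the client: send each value and wait for the matching reply."""
    requests, replies = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=server, args=(requests, replies))
    worker.start()
    results = []
    for value in values:
        requests.put(value)            # send a request message
        results.append(replies.get())  # block until the reply arrives
    requests.put(None)
    worker.join()
    return results
```

The two sides share no variables. All coordination happens through the messages, as in the client-server example named above.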

Topic 2, section 2.2, pp. 31–47.

READING




• Topic 1 has looked at some basic concepts of networks and distributed systems. By now you should know the basic definition of networks and their characteristics, which you will find very useful when you study Topic 2.

• You should also know the definition of distributed systems and be aware of their advantages over standalone computer systems. Moreover, you should now understand the differences between networks and distributed systems.

• You have also learned the basic characteristics of distributed systems. The section dealing with architectural models demonstrates very clearly how the layered software in distributed systems is significantly different from the layered software in conventional standalone computer systems.

• You should also understand the four system architectures and their design requirements, and the fundamental models of distributed systems.

• You should now be ready to move on to a more detailed study of networks, the focus of Topics 2–3.

Tanenbaum, A S and van Steen, M (2002) Distributed Systems: Principles and Paradigms, Upper Saddle River, NJ: Prentice Hall.

1. Instead of building a middleware layer, is it possible to build the distributed application services into the application layer for a layered structure of distributed systems? Justify your answer.

2. Give a practical example for each fundamental model.

SELF-TEST 1.3
