♦ Data are distributed » Data may have to reside on multiple computers for administrative and ownership reasons
♦ Computation is distributed » Applications take advantage of parallelism and multiple processors » Scalability and heterogeneity are characteristic features of distributed systems
♦ Users are distributed » Users communicate and interact via applications (shared objects)
♦ 1940. The British Government came to the conclusion that 2 or 3 computers would be sufficient for the UK.
♦ 1960. Mainframe computers took up a few hundred square feet.
♦ 1970. First Local Area Networks (LANs) such as Ethernet. ♦ 1980. First network cards for PCs. ♦ 1990. First wide area networks: the Internet, which evolved from the US Advanced Research Projects Agency network (ARPANET, 4 nodes in 1969) and was later fuelled by the rapid increase in network bandwidth and the invention of the World Wide Web at CERN in 1989.
A collection of components that execute on different computers. Interaction is achieved using a computer network.
A distributed system consists of a collection of autonomous computers, connected through a network and distributed operating system software, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.
♦ Non-autonomous parts: The system possesses full control. ♦ Homogeneous: Constructed using the same technology
(e.g., same programming language and compiler for all parts).
♦ Component shared by all users all the time. ♦ All resources accessible. ♦ Software runs in a single process. ♦ Single point of control. ♦ Single point of failure (either the system works or it does not).
♦ Multiple autonomous components. ♦ Heterogeneous. ♦ Components are not shared by all users. ♦ Resources may not be accessible. ♦ Software runs in concurrent processes on different
processors. ♦ Multiple Points of control. ♦ Multiple Points of failure (but more fault tolerant!).
♦ Enables local and remote information objects to be accessed using identical operations, that is, the interface to a service request is the same for communication between components on the same host and components on different hosts.
♦ Example: File system operations in Unix Network File System (NFS).
♦ A component whose access is not transparent cannot easily be moved from one host to the other. All other components that request services would first have to be changed to use a different interface.
♦ Enables information objects to be accessed without knowledge of their physical location.
♦ Example: Pages in the Web. ♦ Example: When an NFS administrator moves a partition, for instance because a disk is full, application programs that access files in that partition would have to be changed if file location is not transparent to them.
♦ Allows the movement of information objects within a system without affecting the operations of users or application programs.
♦ It is useful, as it sometimes becomes necessary to move a component from one host to another (e.g., due to an overload of the host or to a replacement of the host hardware).
♦ Without migration transparency, a distributed system becomes very inflexible as components are tied to particular machines and moving them requires changes in other components.
♦ Enables multiple instances of information objects to be used to increase reliability and performance without knowledge of the replicas by users or application programs.
♦ Enables several processes to operate concurrently using shared information objects without interference between them. Neither user nor application engineers have to see how concurrency is controlled.
♦ Example: Bank applications. ♦ Example: Database management system.
0 LAST Session Summary + additional material. 1 Motivation 2 The CORBA Object Model 3 The OMG Interface Definition Language (IDL) 4 Other Approaches 5 Summary
A distributed system consists of a collection of autonomous computers, connected through a network and distributed operating system software, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.
Certain common characteristics can be used to assess distributed systems: Resource Sharing, Openness, Concurrency, Scalability, Fault Tolerance, and Transparency
Three Tiered Architecture is an information model with distinct pieces -- client, applications services and data sources -- that can be distributed across a network.
Client Tier -- The user-facing component: it displays information and handles graphics, communications, keyboard input and local applications.
Applications Service Tier -- A set of sharable multitasking components that interact with clients and the data tier. It provides the controlled view of the underlying data sources.
Data Source Tier -- One or more sources of data such as mainframes, servers, databases, data warehouses, legacy applications etc.
Variants of the three-tier architecture:
♦ Three tier with transaction processing monitor technology
♦ Three tier with a message server
♦ Three tier with an application server
♦ Three tier with an ORB architecture (e.g., CORBA)
♦ Distributed/collaborative enterprise architecture
Model describes components, states, interactions and other concepts
OMG/IDL is a language for expressing all concepts of the CORBA object model. » separation of interface from implementation » Enables interoperability and transparency » IDL compiles into client stubs and server skeletons » Stubs and skeletons serve as proxies for clients
interface Stock {
  readonly attribute string symbol;    // Get the stock symbol.
  readonly attribute string full_name; // Get the full name.
  double price ();                     // Get the price.
};
Attributes, operations, and exceptions are properties defined in object types.
An object type captures the properties shared by similar objects; individual objects differ only in their identity and the values of their attributes.
Objects may export these properties to other objects.
Objects are instances of types.
Object types are specified through interfaces that determine the operations clients can request; that is, they define a contract that binds the interaction between client and server objects.
Operations either modify the state of an object or just compute functions of it.
Operations are used for service requests. Each operation has a signature that consists of
» a name, » a list of in, out, or inout parameters, » a return value type (result) or void if none, and » a list of exceptions that the operation can raise.
2.5 Exceptions Service requests may not be executed properly. Exceptions have a unique name and may declare additional data structures. Exceptions are used to explain (and locate) the reason for a failure to the requester of the operation. Operation execution failures may be
» generic (system), raised by the middleware, e.g., an unreachable server object; or
» specific, raised by the server object, when the execution of a request would violate the object’s integrity, e.g., not enough money in a bank account.
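The signature and exception concepts above can be sketched in Java, where a checked exception plays the role of an IDL `raises` clause. All names here (Account, withdraw, InsufficientFunds) are illustrative assumptions, not part of any CORBA specification:

```java
// A server-object-specific exception: raised when a request would
// violate the object's integrity (e.g., not enough money).
class InsufficientFunds extends Exception {
    InsufficientFunds(String msg) { super(msg); }
}

// The operation signature: name, one "in" parameter, a result type,
// and the exception the operation can raise.
interface Account {
    double withdraw(double amount) throws InsufficientFunds;
}

class SimpleAccount implements Account {
    private double balance;
    SimpleAccount(double opening) { balance = opening; }
    public double withdraw(double amount) throws InsufficientFunds {
        if (amount > balance) throw new InsufficientFunds("balance too low");
        balance -= amount;
        return balance; // result returned to the requester
    }
}

public class SignatureDemo {
    public static void main(String[] args) throws Exception {
        Account acc = new SimpleAccount(100.0);
        System.out.println(acc.withdraw(40.0));
        try {
            acc.withdraw(1000.0); // violates integrity: exception raised
        } catch (InsufficientFunds e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

Note how the declared exception lets the requester distinguish a specific failure from a generic middleware failure.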
In distributed systems, services are syntactically specified through interfaces that capture the names of the functions available together with types of the parameters, return values, possible exceptions, etc.
There is no legal way a process can access or manipulate the state of an object other than invoking methods made available to it through an object’s interface.
Behaviour of an operation as defined in a super-type may not be appropriate for a subtype.
The operation can then be re-defined in the subtype. Binding messages to operations is dynamic. The operation signature must not be changed. Operations in (abstract) super-types need not be implemented (deferred operations).
Objects can be assigned to an attribute or passed as a parameter, even though they are instances of subtypes of the attribute’s/parameter’s respective type.
Attributes, parameters and operations are polymorphic.
Example: Using polymorphism, instances of type ATM can be assigned to the attribute controls, which the subtype has inherited from Ctrl.
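Subtype polymorphism and dynamic binding can be sketched in a few lines of Java; the Till/ATM names are illustrative stand-ins for a supertype/subtype pair:

```java
// A supertype with an operation whose behaviour a subtype re-defines.
class Till {
    String describe() { return "generic till"; }
}

// The subtype re-defines the operation without changing its signature.
class ATM extends Till {
    @Override String describe() { return "automated teller machine"; }
}

public class PolymorphismDemo {
    public static void main(String[] args) {
        Till t = new ATM();               // subtype instance assigned where the supertype is expected
        System.out.println(t.describe()); // dynamic binding selects ATM's re-definition
    }
}
```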
Inspired by the electrical power grid's pervasiveness, reliability and ease of use, computer scientists in the mid-1990s began exploring the design and development of an analogous infrastructure: the computational power Grid.
To build an environment that enables » sharing, » selection, and » aggregation of a wide variety of geographically distributed resources, including » supercomputers, » storage systems and data sources, and » specialised devices owned by different organisations, for solving large-scale resource-intensive problems in science, engineering, and commerce (Buyya, 2002).
Motivation: Small computing resources such as PCs have the potential to provide vast computing power when connected. And yet…
Many of these resources lie idle most of the time. Millions of online PCs are only involved in tasks like word processing or browsing the Internet. The computing resources of many organisations are often severely under-utilised, especially outside of peak business hours.
At the same time, there are many individuals and organisations that have intensive computations to perform but only have limited access to resources that are available to execute them.
Analyze the value of an investment portfolio in minutes rather than hours?
Unite research teams with others around the world to take advantage of the most up-to-date knowledge?
Significantly accelerate the drug discovery process? Scale your business to meet cyclical demand? Cut the design time of your products in half while reducing the instances of defects?
Source: http://www-1.ibm.com/grid/about_grid/index.shtml
Global Grid Forum (http://www.gridforum.org/): community-initiated forum of 5000+ individual researchers and practitioners working on distributed computing, or "grid" technologies
GridComputing (http://www.gridcomputing.com/) myGrid (http://www.mygrid.org.uk/), an EPSRC project Platforms:
Last session we have discussed an object-oriented component model. Common properties of similar components are modeled as object types (interfaces). Services offered by distributed components are modeled as operations of these object types.
This session, we are going to consider the following problem: What communication primitives are needed in a distributed system and how are they used to implement service requests?
Interactions between components are not fully defined in the model.
There is no concept of abstract or deferred types. The model does not include primitives for the behavioural specification of operations. The semantics of the model is only defined informally. See Bastide R. et al.: “Petri Net Based Behavioural Specification of CORBA Systems.”
Lecture Notes in Computer Science, Vol. 1630: Application and Theory of Petri Nets 1999, 20th Int Conference, ICATPN'99, Williamsburg, Virginia, USA, pp. 66-85. Springer-Verlag, June 1999.
P1 builds a message in its address space and executes a system call. The operating system fetches the message and transmits it over the network to P2. Issues and agreements:
» Meaning of the bits being sent? » Volts used to signal a 0-bit or a 1-bit? » Which was the last bit sent? » Error detection? » How long are numbers, strings, etc.? » How are they represented?
At this layer, data packets are encoded and decoded into bits. It furnishes transmission protocol knowledge and management and handles errors in the physical layer, flow control and frame synchronization.
The data link layer is divided into two sub-layers: the Media Access Control (MAC) layer and the Logical Link Control (LLC) layer. » The MAC sub-layer controls how a computer on the network gains access to the data and permission to transmit it. » The LLC sub-layer controls frame synchronization, flow control and error checking.
This layer provides switching and routing technologies, creating logical paths, known as virtual circuits, for transmitting data from node to node in a WAN.
Routing means choosing the best path. Routing and forwarding are functions of this layer, as well as addressing, internetworking, error handling, congestion control and packet sequencing.
This session we are going to review two layers of the model that are important for the implementation of service requests in general, and of CORBA operation invocation requests in particular.
• Level 4 of the ISO/OSI Reference Model • Concerned with the transparent transport of information through the network • Responsible for end-to-end error recovery and flow control; it ensures complete data transfer • It is the lowest level at which messages (not packets) are handled. Messages are addressed to communication ports • Protocols may be connection-oriented or connectionless • Two facets in Unix: TCP and UDP
The transport layer implements transport of data on the basis of some network layer (the network layer itself may be implemented as the Internet Protocol (IP) or OSI's X-25 protocol).
There are a number of transport layer implementations, though the most prominent ones are TCP and UDP that are available in virtually all UNIX operating system variants.
TCP is connection-oriented. This means that a connection between two distributed components has to be maintained by the session layer.
UDP is connectionless. The session layer is not required when transport is UDP based.
TCP provides bi-directional stream of bytes (unstructured data) between two distributed components. » A component using TCP is unaware that data is broken into segments for transmission
over the network.
UNIX rsh, rcp and rlogin are based on TCP.
Reliable; often used over unreliable network protocols » (e.g., a telephone line used with the Serial Line Internet Protocol (SLIP)), » or with the Internet Protocol (IP). Applications such as ftp that need a reliable connection for prolonged periods of time establish TCP connections.
Slow! As the two ends connected by the stream may have different computation speeds, TCP buffers the stream so that the two processes are (partially) decoupled.
• When a data segment is received correctly at destination, an acknowledgement (ACK) segment is sent to the sending TCP
• ACK contains sequence number of the last byte correctly received incremented by 1
• The network can fail to deliver a segment. If the sending TCP waits for too long for an acknowledgement, it times out and re-sends the segment, on the assumption that the datagram has been lost
• Then network can potentially deliver duplicated segments, and can deliver segments out of order. TCP buffers out of order segments or discards duplicates, using byte count for identification
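The "bi-directional stream of bytes" view of TCP described above can be sketched with a loopback echo connection in Java; the port is chosen by the OS and the message text is illustrative. The component sees only a stream — segmentation, acknowledgements and retransmission happen below this API:

```java
import java.io.*;
import java.net.*;

public class TcpEchoDemo {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // OS picks a free port
            // Receiver end of the connection: echoes one line back.
            Thread echo = new Thread(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(s.getInputStream()));
                     PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                    out.println(in.readLine());
                } catch (IOException ignored) {}
            });
            echo.start();
            // Sender end: writes bytes into the stream, reads the reply.
            try (Socket client = new Socket("localhost", server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                out.println("hello over tcp"); // bytes enter the stream...
                System.out.println(in.readLine()); // ...and arrive complete and in order
            }
            echo.join();
        }
    }
}
```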
This layer provides independence from differences in data representation (e.g., encryption) by translating from application to network format, and vice versa.
The presentation layer works to transform data into the form that the application layer can accept.
This layer formats and encrypts data to be sent across a network, providing freedom from compatibility problems. It is sometimes called the syntax layer.
There is a considerable mismatch between the complex types used at the application layer, such as records, lists and unions of other complex types in IDL, and those that can be transported by TCP and UDP.
A further complication arises from the fact that atomic types are represented differently on different hardware platforms.
The task of the presentation layer is to resolve this heterogeneity and to transform complex data structures into forms that are suitable for transport layers such as TCP and UDP.
Different hardware and operating system platforms use different representations for elementary data types such as integers and characters: » Most modern operating systems represent 16-bit integers as two bytes, with the most significant byte first. Older machines, such as IBM mainframes, represent these integers exactly the other way around.
» There are also different encodings for character sets. Characters may be encoded as 7-bit ASCII, in the ISO 8-bit character set or in the emerging 16-bit representation, which accounts for the representation of Asian characters as well.
Distributed operating systems resolve these differences within the presentation layer so as to enable heterogeneous components to communicate with each other.
Big endian means that the most significant byte of any multibyte data field is stored at the lowest memory address, which is also the address of the larger field (Sun's SPARC, Motorola 68K, the Java Virtual Machine).
Little endian means that the least significant byte of any multibyte data field is stored at the lowest memory address, which is also the address of the larger field (Intel 80x86 processors).
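The two byte orders can be made visible with java.nio.ByteBuffer, which lets the same 16-bit value be laid out either way:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    public static void main(String[] args) {
        short value = 0x1234;
        ByteBuffer big = ByteBuffer.allocate(2).order(ByteOrder.BIG_ENDIAN);
        ByteBuffer little = ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN);
        big.putShort(value);
        little.putShort(value);
        // Big endian: most significant byte (0x12) at the lowest address.
        System.out.printf("big:    %02x %02x%n", big.get(0), big.get(1));
        // Little endian: least significant byte (0x34) at the lowest address.
        System.out.printf("little: %02x %02x%n", little.get(0), little.get(1));
    }
}
```

Two components on platforms with different native orders must agree on one order (or a common representation such as XDR) before exchanging such a value.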
There are different approaches. One is to convert data during marshalling into a common, well-defined representation. An example of this is Sun's External Data Representation (XDR), which is used in most Remote Procedure Call (RPC) implementations. » For each platform, a mapping between the common and the platform-specific representation is provided.
Another approach is the Abstract Syntax Notation ASN.1 that was standardised by the CCITT. It provides a notation for including the type definition together with each value into the marshalled representation.
Marshalling flattens complex data structures into a transportable representation, usually a stream of bytes, which may be split into a sequence of messages if necessary.
The stream of bytes not only contains the data itself, but also meta-information, such as the length of a certain entry, or an encoding for its types.
The presentation layer at the receiving component then performs the reverse mapping, which is called unmarshalling. It reconstructs the complex type from data and meta data that is included in the stream received.
Note, that marshalling in practice is rarely programmed manually. It is being taken care of by the distributed operating system, such as an ORB in CORBA.
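Marshalling and unmarshalling can be sketched with Java object serialization standing in for an ORB's presentation layer; the serialized stream contains both the data and meta-information (types, lengths), as described above. The helper names are illustrative:

```java
import java.io.*;

public class MarshalDemo {
    // Flatten a complex structure into a transportable stream of bytes.
    static byte[] marshal(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o); // writes data plus meta-information
        }
        return bytes.toByteArray();
    }

    // Reverse mapping: reconstruct the structure from data and meta-data.
    static Object unmarshal(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        java.util.List<String> original = java.util.Arrays.asList("alpha", "beta");
        byte[] wire = marshal(original);   // what would travel over TCP/UDP
        Object copy = unmarshal(wire);     // receiving side rebuilds the list
        System.out.println(copy);
    }
}
```

In CORBA or RMI this code is generated or hidden by the middleware, which is exactly the point made above: marshalling is rarely programmed manually.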
This layer supports application and end-user processes. Communication partners are identified, quality of service is identified, user authentication and privacy are considered, and any constraints on data syntax are identified. Everything at this layer is application-specific.
This layer provides application services for file transfers, e-mail, and other network software services. Telnet and FTP are applications that exist entirely in the application level. Tiered application architectures are part of this layer.
Bi-directional communication. The sender expects the delivery of a result from the receiver. Requester receives reply message. Request/reply messages contain marshalled parameters/results.
It is hard to prove whether or not a system is deadlock-free and most distributed operating systems therefore do not do much about them and leave it to the designer to avoid them.
To avoid deadlocks: Waits-for relation has to be acyclic!
With asynchronous message delivery, the sender does not wait until the receiver has acknowledged the receipt of the message delivery, but continues as soon as the message has been passed to the local transport layer.
It may be delayed still, if message buffers of the transport layer are exhausted.
» The sender and receiver are decoupled and do not depend on each other.
» This usually results in a higher degree of concurrency between sender and receiver and increases the overall distributed system performance.
» The most important advantage is probably that the system is less likely to run into a deadlock.
» The sender does not know whether or not the receiver has actually received the message. Asynchronous delivery can therefore not reasonably be used together with unreliable transport layer implementations.
» Additional overhead is required if the message order has to be maintained.
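Asynchronous delivery as described above can be sketched with a bounded queue standing in for the transport layer's message buffer; the names are illustrative. The sender returns as soon as the message is buffered, and blocks only if the buffer is exhausted:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AsyncSendDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for the local transport layer's message buffer.
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(16);

        // Sender: continues as soon as the message is in the buffer,
        // without waiting for the receiver to acknowledge receipt.
        buffer.put("request-1");
        System.out.println("sender continues immediately");

        // Receiver: picks the message up later, decoupled from the sender.
        System.out.println("received: " + buffer.take());
    }
}
```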
If the client can cope with the “maybe” quality of service, the client may not want to wait for the server to finish the service. This protocol, however, is unsuitable if the service has to return data or the client has to know what happened to the service execution.
The advantages are that » only one message is involved, so the network is not unnecessarily overloaded, and » the client can continue execution as soon as delivery of the message has been acknowledged (by the local transport layer for an asynchronous send, or by the receiver for a synchronous send).
To be applied if the client expects a result from the server. The client requests service execution from the server through a request message; the service result is delivered in a reply message. If the reply message is not received after a certain period of time, this can have many reasons (the server has not finished the execution yet; the reply message has been lost).
Servers therefore keep a history of reply messages and clients may resend the request and the server then resends the reply.
3.4 RRA Protocol Depending on the number of client/server communication cycles, the maintenance of a history may involve serious overhead! The RRA protocol is designed to limit this overhead. RRA adds to RR an additional acknowledgement message, which is sent by the client as soon as a reply has been received. The receipt of an acknowledgement message enables the server to dump the reply message of that communication cycle (and all previously unacknowledged replies).
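The server-side reply history used by RR, and the pruning that RRA's acknowledgement enables, can be sketched as follows; request ids and method names are illustrative assumptions:

```java
import java.util.HashMap;
import java.util.Map;

public class ReplyHistoryDemo {
    // RR: the server caches replies keyed by request id, so a re-sent
    // request gets the cached reply instead of being re-executed.
    private final Map<Integer, String> history = new HashMap<>();

    String handle(int requestId, String request) {
        return history.computeIfAbsent(requestId, id -> "reply-to-" + request);
    }

    // RRA: the client's acknowledgement lets the server dump the reply.
    void acknowledge(int requestId) {
        history.remove(requestId);
    }

    public static void main(String[] args) {
        ReplyHistoryDemo server = new ReplyHistoryDemo();
        System.out.println(server.handle(1, "debit")); // executed and cached
        System.out.println(server.handle(1, "debit")); // duplicate: cached reply re-sent
        server.acknowledge(1);                         // history entry dropped
        System.out.println(server.history.size());
    }
}
```

A real RR implementation would also prune old entries by timeout; this sketch only shows why the history exists and how RRA bounds it.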
Further Reading:
Object Management Group. Common Object Request Broker: Architecture and Specification. Rev. 2.0, Chapter 3: OMG IDL Syntax and Semantics. Framingham, Mass., July 1995 (available at http://www.omg.org/).
International Telecommunication Union. CCITT Recommendation X.720: Information Technology - Open Systems Interconnection - Structure of Management Information: Management Information Model. Geneva, Switzerland. 1993 (available at http://www.itu.ch/).
Microsoft’s Distributed Component Object Model. Information at http://www.microsoft.com/com/
4. Transport Layer connects two distributed components and isolates upper layers from concerns as to how reliable lower layers are. Responsible for end-to-end error recovery; it ensures complete data transfer.
3. Network Layer isolates the higher layers from routing and switching considerations
2. Data Link Layer Maps the physical circuit (the cable) and converts it into a point-to-point link that appears relatively error-free (checksums, parity checking is done here)
1. Physical Layer Concerned with transmission of bits over a physical circuit
7. Application Layer concerned with distributed components and their interaction. CORBA objects and their interactions are one example. Remote procedure calls are another
6. Presentation Layer has to resolve differences in information representation between distributed components. (Only needed for connection-oriented protocols)
5. Session Layer provides facilities to support and maintain associations between two or more distributed components
Java is an object-oriented programming language developed by Sun Microsystems that is both compiled and interpreted: A Java compiler creates byte-code, which is interpreted by a virtual machine (VM).
Java is portable: Different VMs interpret the byte-code on different hardware and operating system platforms.
In RMI, the Client object does not directly instantiate the Service,
BUT gets a reference to its interface through the RMI Naming service.
This interface hooks up the client system to the server through a series of layers and proxies until it reaches the actual methods provided by the Service object.
2.33 The Interface: Advantages There are several advantages to using an interface that make RMI a more robust platform. » Security by preventing decompiling. » The interface is significantly smaller than the actual remote object's class, so the client is lighter in weight. » Maintainability: if changes are made to the underlying remote object, they need to be propagated to the clients, otherwise serious errors can occur.
» From an architectural standpoint, the interface is cleaner. The code in the remote object will never run on the client, and the interface acts appropriately as a contract between the caller and the class performing the work remotely.
To write an RMI application, proceed as follows:
1) Define a remote interface (server services) by extending java.rmi.Remote and have its methods throw java.rmi.RemoteException.
2) Implement the remote interface. You must provide a Java server class that implements the interface; it must be derived from the class java.rmi.UnicastRemoteObject.
3) Compile the server class using javac.
4) Run the stub compiler rmic against your (.class) file to generate client stubs and server skeletons for your remote classes (remember: proxies for marshalling and unmarshalling).
5) Start the RMI registry on your server (call rmiregistry &). The registry retrieves and registers server objects. In contrast to CORBA, it is not persistent.
6) Start the server object and register it with the registry using the bind method in java.rmi.Naming.
7) Write the client code using java.rmi.Naming to locate the server objects.
8) Compile the client code using javac.
9) Start the client.
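Step 1 above can be sketched as follows. The Hello/sayHello names match the HelloImpl example later in these notes; the local HelloCheck class is an illustrative assumption, added only so the contract can be exercised without a running registry:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Step 1: the remote interface. It extends Remote, and each method
// declares java.rmi.RemoteException.
interface Hello extends Remote {
    String sayHello() throws RemoteException;
}

// A plain local implementation, just to show the contract. A real server
// would extend UnicastRemoteObject and be registered with rmiregistry.
public class HelloCheck implements Hello {
    public String sayHello() { return "Hello World!"; }

    public static void main(String[] args) throws Exception {
        Hello h = new HelloCheck();
        System.out.println(h.sayHello());
    }
}
```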
RemoteException superclass for exceptions specific to remote objects thrown by RMI runtime (broken connection, a reference mismatch, e.g.)
The Remote interface embraces all remote objects (Does not define methods, but serves to flag remote objects)
The RemoteObject class corresponds to Java’s Object class. It implements remote versions of methods such as hashCode, equals, toString
The class RemoteServer provides methods for creating and exporting servers (e.g., getClientHost, getLog); i.e., it is the common superclass of server implementations and provides the framework to support a wide range of remote reference semantics.
UnicastRemoteObject: your server must directly or indirectly extend this class and inherit its remote behaviour. It implements a special kind of server with the following characteristics: » all references to remote objects are only valid during the life of the process that created the remote object; » it requires a TCP connection-based protocol; » parameters, invocations, etc. are communicated via streams.
Implementation of the server (constructor, method, main):

  public HelloImpl(String s) throws RemoteException { super(); name = s; }

  public String sayHello() throws RemoteException { return "Hello World!"; }

  public static void main(String[] args) {
      System.setSecurityManager(new RMISecurityManager());
      try {
          HelloImpl obj = new HelloImpl("HelloServer");
          Naming.rebind("//myhost/HelloServer", obj); // or "localhost"
      } catch (Exception e) { /* handle error */ }
  }
The class Naming is the bootstrap mechanism for obtaining references to remote objects, based on Uniform Resource Locator (URL) syntax. The URL for a remote object is specified using the usual host, port and name:

  rmi://host:port/name

  host = host name of the registry (defaults to the current host)
  port = port number of the registry (defaults to the registry port number)
  name = name for the remote object
A registry exists on every node that allows RMI connections to servers on that node. The registry on a particular node contains a transient database that maps names to remote objects. When the node boots, the registry database is empty. The names stored in the registry are pure and are not parsed. A service storing itself in the registry may want to prefix its name of the service by a package name (although not required), to reduce name collisions in the registry.
A problem that arises is to locate that server in a network which supports the program with the desired remote procedures.
This problem is referred to as binding. Binding can be done statically or dynamically. The binding
we have seen in the last example was static because the hostname was determined at compile time.
Static binding is fairly simple, but seriously limits migration and replication transparency.
With dynamic binding the selection of the server is performed at run-time. This can be done in a way that migration and replication transparency is retained.
There is limited support for dynamic server location with the LocateRegistry class, which is used to obtain the bootstrap Registry on some host. Usage (minus exception handling):
  // Server wishes to make itself available to others:
  SomeSRVC service = ...; // remote object for the service
  Registry registry = LocateRegistry.getRegistry();
  registry.bind("I Serve", service);

  // The client wishes to make requests of the above service:
  Registry registry = LocateRegistry.getRegistry("foo.services.com");
  SomeSRVC service = (SomeSRVC) registry.lookup("I Serve");
  service.requestService(...);
Programs can be easily migrated from one server to another and be replicated over multiple hosts with full transparency for clients.
The client process’s role is to invoke the method on a remote object. The only two things that are necessary for this to happen are the remote interface and stub classes.
The server, which “owns” the remote object in its address space, requires all parts of the RMI interchange.
When the client wants to invoke a method on a remote object, it is given a surrogate that implements the same interface, the stub. The client gets this stub from the RMI server as a serialized object and reconstitutes it using the local copy of that class.
The third part of the system is the object registry. When you register objects with the registry, clients are able to obtain access to them and invoke their methods.
The purpose of the stub on the client is to communicate via serialized objects with the registry on the server. It becomes the proxy for communication back to the server.
Summary The critical parts of a basic RMI system include the client, server, RMI registry, remote object and its matching stub, skeleton and interface.
A remote object must have an interface to represent it on the client, since it will actually only exist on the server. A stub which implements the same interface acts as a proxy for the remote object.
The server is responsible for making its remote objects available to clients by instantiating and registering them with Naming service.
The Remote Method Invocation (RMI) is a Java system that can be used to easily develop distributed object-based applications. RMI, which makes extensive use of object serialization, can be expressed by the following formula:
RMI = Sockets + Object Serialization + Some Utilities
The utilities are the RMI registry and the compiler to generate stubs and skeletons.
If you are familiar with RMI, you will know that developing distributed object-based applications in RMI is much simpler than using sockets.
So why bother with sockets and object serialization then?
• The advantages of RMI in comparison with sockets are: • Simplicity: RMI is much easier to work with than sockets • No protocol design: unlike sockets, when working with RMI there is no need to worry about designing a protocol between the client and server -- a process that is error-prone.
• The simplicity of RMI, however, comes at the expense of the network. • There is a communication overhead involved when using RMI and that is due to the RMI registry and client stubs or proxies that make remote invocations transparent. For each RMI remote object there is a need for a proxy, which slows the performance down.
0.0 Review: RMI
♦ RMI – Remote Method Invocation » RPC in Java technology and more » A concrete programming technology » Designed to solve the problems of writing and organizing executable code » Native to Java, an extension of the core language » Benefits from specific features of Java
0.1 RMI: Benefits
♦ Invoke object methods and have them execute on remote Java Virtual Machines (JVMs)
♦ Entire objects can be passed and returned as parameters » Unlike many other remote-procedure-call-based mechanisms, which require either primitive data types as parameters, or structures composed of primitive data types
♦ New Java objects can be passed as parameters » Can move behavior (class implementations) from client to server and from server to client
0.2 RMI: Benefits
♦ Enables use of design patterns » Use the full power of object-oriented technology in distributed computing, such as two- and three-tier systems (pass behavior and use OO design patterns)
♦ Safe and secure » RMI uses built-in Java security mechanisms
♦ Easy to write / easy to use » A remote interface is an actual Java interface
♦ Distributed garbage collection » Collects remote server objects that are no longer referenced by any client in the network
0.4 Developing RMI
♦ Define a remote interface
» Define a remote interface that specifies the signatures of the methods to be provided by the server and invoked by clients.
» It must be declared public, in order for clients to be able to load remote objects which implement the remote interface.
» It must extend the Remote interface, to fulfil the requirement for making the object a remote one.
» Each method in the interface must declare java.rmi.RemoteException in its throws clause.
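As a sketch, the three rules above can be seen in a minimal remote interface; the Greeter name and its method are hypothetical, not part of any real application:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical remote interface illustrating the three rules:
// declared public, extends Remote, and every method declares RemoteException.
public interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}
```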
Developing RMI
♦ Implement the remote interface
♦ Develop the server
» Create an instance of RMISecurityManager and install it
» Create an instance of the remote object
» Register the object created with the RMI registry
♦ Develop the client
» First obtain a reference to the remote object from the RMI registry

Developing RMI
♦ Running the application
» Generate stubs and skeletons – rmic
» Compile the server and the client – javac
» Start the RMI registry – rmiregistry
» Start the server and the client
1.0 The Object Management Group
♦ The OMG is a non-profit consortium created in 1989 with the purpose of promoting the theory and practice of object technology in distributed computing systems, to reduce complexity, lower costs, and hasten the introduction of new software applications.
♦ Originally formed by 13 companies, OMG membership grew to over 500 software vendors, developers and users.
♦ OMG realises its goals by creating standards which allow interoperability and portability of distributed object-oriented applications. It does not produce software or implementation guidelines.
1.3 CORBA Concepts
♦ CORBA’s theoretical underpinnings are based on three important concepts:
» An object-oriented model
» An open distributed computing environment
» Component integration and reuse
♦ CORBA provides:
» Uniform access to services
» Uniform discovery of resources and object names
» Uniform error handling methods
» Uniform security policies
1.4 The OMG Object Model
♦ The OMG Object Model defines common object semantics for specifying the externally visible characteristics of objects in a standard and implementation-independent way.
♦ In this model, clients request services from objects (which will also be called servers) through a well-defined interface.
♦ This interface is specified in OMG IDL (Interface Definition Language). A client accesses an object by issuing a request to the object.
♦ The request is an event, and it carries information including an operation, the object reference of the service provider, and actual parameters (if any).
1.5 About CORBA Objects
♦ CORBA objects differ from typical objects in three ways:
» CORBA objects can run on any platform.
» CORBA objects can be located anywhere.
» CORBA objects can be written in any language that has an IDL mapping.
♦ A CORBA object is a virtual programming entity that consists of an identity, an interface, and an implementation, which is known as a Servant.
» It is virtual in the sense that it does not really exist unless it is made concrete by an implementation written in a programming language.
1.6 Objects and Applications
♦ CORBA applications are composed of objects.
♦ Typically, there are many instances of an object of a single type – for example, an e-commerce website would have many shopping cart object instances, all identical in functionality but differing in that each is assigned to a different customer and contains data representing the merchandise that its particular customer has selected.
♦ For other types, there may be only one instance. When a legacy application, such as an accounting system, is wrapped in code with CORBA interfaces and opened up to clients on the network, there is usually only one instance.
2.2 CORBAservices
♦ CORBAservices
» Provide basic functionality that almost every object needs
– Naming Service – name binding, associating names and references
– Event Service – asynchronous event notification
– Concurrency Control Service – mediates simultaneous access
♦ CORBAfacilities (sometimes called Horizontal CORBAfacilities)
» Between CORBAservices and Application Objects
» Potentially useful across business domains
– Printing, Secure Time Facility, Internationalization Facility, Mobile Agent Facility
2.3 OMA model
♦ Domain (Vertical) CORBAfacilities
» Domain-based; provide functionality for specific domains such as telecommunications, electronic commerce, or health care.
♦ Application Objects
» Topmost part of the OMA hierarchy
» Customised for an individual application, so they do not need standardisation
2.5 CORBA Architecture
♦ A general CORBA request structure: a request travels from a client to an object implementation over IIOP.
♦ A request consists of:
» Target object (identified by a unique object reference)
» Operation
» Parameters (the input, output, and inout parameters defined for the operation; may be specified individually or as a list)
» Optional request context
» Results (the result values returned by the operation)
♦ CORBA provides both static and dynamic interfaces to its services.
» This came about because there were two strong proposals: one from HyperDesk and Digital based on a dynamic API, and one from Sun and HP based on a static API. “Common” stands for the merged two-API proposal.
2.7 Object Request Broker, ORB
♦ The core of CORBA: middleware that establishes the client/server relationship between objects.
♦ This is the object manager in CORBA, the software that implements the CORBA specification (it implements the session, transport and network layers) and provides object location transparency, communication and activation, i.e. it will:
» Find the object implementation for a request (providing location transparency)
» Prepare the object implementation to receive the request
» Communicate the data making up the request
(Vendors & products: ORBIX from IONA, VisiBroker from Inprise, JavaIDL from JavaSoft)
2.8 CORBA Architecture: ORB
♦ On the client side the ORB is responsible for:
» Accepting requests for a remote object
» Finding the implementation of the object
» Creating a client-side reference to the remote object (converted to a language-specific form, e.g. a Java stub object)
» Routing client method calls through the object reference to the object implementation
♦ On the server side the ORB:
» Lets object servers register new objects
» Receives requests from the client ORB
» Uses the object’s skeleton interface to invoke the object activation method
» Creates a reference for the new object and sends it back to the client
♦ Client Stubs
» Provide the static interfaces to object services. These precompiled stubs define how clients invoke the corresponding services on the server. From a client’s perspective, the stub acts like a local call – it is a local proxy for a remote server object. Stubs are generated by the IDL compiler (there are as many stubs as there are interfaces!).
♦ Server Skeleton
» Provides static interfaces to each service exported by the server. It performs the unmarshalling and the actual method invocation on the server object.
♦ ORB Interface
» Interface to a few ORB operations common to all objects, e.g. an operation which returns an object’s interface type.
♦ Object -- This is a CORBA programming entity that consists of an identity, an interface, and an implementation, which is known as a Servant.
♦ Servant -- This is an implementation programming-language entity that defines the operations that support a CORBA IDL interface. Servants can be written in a variety of languages, including C, C++, Java, Smalltalk, and Ada.
♦ Client -- This is the program entity that invokes an operation on an object implementation. Accessing the services of a remote object should be transparent to the caller. Ideally, it should be as simple as calling a method on an object. The remaining components help to support this level of transparency.
2.11 CORBA Architecture: DII
♦ Dynamic Invocation Interface (DII)
» Static invocation interfaces are determined at compile time, and they are presented to the client using stubs.
» The DII allows client applications to use server objects without knowing the type of those objects at compile time.
– The client obtains an instance of a CORBA object and makes invocations on that object by dynamically creating requests.
» The DII uses the interface repository to validate and retrieve the signature of the operation on which a request is made.
2.12 CORBA Architecture: DSI
♦ Dynamic Skeleton Interface (DSI)
» The server-side dynamic skeleton interface.
» Allows servers to be written without skeletons, or without compile-time knowledge of which objects will be called remotely.
» Provides a runtime binding mechanism for servers that need to handle incoming method calls for components that do not have IDL-based compiled skeletons.
» Useful for implementing generic bridges between ORBs.
» Also used for interactive software tools based on interpreters and for distributed debuggers.
2.13 CORBA Architecture: IR
♦ Interface Repository
» Allows you to obtain and modify the descriptions of all registered component interfaces (methods supported and their parameters, i.e. method signatures).
» It is a run-time distributed database that contains machine-readable versions of the IDL interfaces.
» Interfaces can be added to the interface repository.
» Enables clients to:
– locate an object that is unknown at compile time
– find information about its interface
– build a request to be forwarded through the ORB
♦ Object Adapter
» Purpose: interfaces an object’s implementation with its ORB.
» The primary way that an object implementation accesses services provided by the ORB.
» Sits on top of the ORB’s core communication services and accepts requests for service on behalf of server objects, passing requests to them and assigning them IDs (object references).
» Registers the classes it supports and their run-time instances with the implementation repository.
» In summary, its duties are:
– Object reference generation and interpretation, method invocation, security of interactions, and implementation of object activation and deactivation.
2.14 Implementation Repository
♦ Provides a run-time repository of information about the classes a server supports, the objects that are instantiated, and their IDs.
♦ Also serves as a common place to store additional information associated with implementations of ORBs,
» e.g. trace information, audit trails and other administrative data.
♦ Clients perform requests using object references.
♦ Clients may issue requests through object interface stubs (static) or through the dynamic invocation interface (dynamic).
» CORBA specifies GIOP, a high-level standard protocol for communication between ORBs.
♦ The General Inter-ORB Protocol (GIOP) is a collection of message requests an ORB can make over a network.
♦ GIOP maps ORB requests to different transports:
» The Internet Inter-ORB Protocol (IIOP) uses TCP/IP to carry the messages, and hence fits well into the Internet world.
» Environment-Specific Inter-ORB Protocols (ESIOPs) complement GIOP, enabling interoperability with environments that do not have CORBA support.
3.3 Other Features
♦ CORBA Messaging
» CORBA 2.0 provides three different techniques for operation invocations:
– Synchronous: the client invokes an operation, then pauses, waiting for a response.
– Deferred synchronous: the client invokes an operation, then continues processing. It can go back later to either poll or block waiting for a response.
– One-way: the client invokes an operation, and the ORB provides a guarantee that the request will be delivered. In one-way operation invocations there is no response.
3.4 New Features
» Two newer, enhanced invocation mechanisms are introduced:
– Callback: the client supplies an additional object reference with each request invocation. When the response arrives, the ORB uses that object reference to deliver the response back to the client.
– Polling: the client invokes an operation that immediately returns a valuetype that can be used to either poll or wait for the response.
» The callback and polling techniques are available for clients using statically typed stubs generated from IDL interfaces (not for the DII).
4.2 Example: Hello World Server

import HelloApp.*;
import org.omg.CosNaming.*;
import org.omg.CosNaming.NamingContextPackage.*;
import org.omg.CORBA.*;

class HelloServant extends _HelloImplBase {
    public String sayHello() {
        return "\nHello world !!\n";
    }
}

public class HelloServer {
    public static void main(String args[]) {
        try {
            // create and initialize the ORB
            ORB orb = ORB.init(args, null);

            // create the servant and register it with the ORB
            HelloServant helloRef = new HelloServant();
            orb.connect(helloRef);

            // get the root naming context
            org.omg.CORBA.Object objRef =
♦ Object type Object.
♦ Initialisation of the object request broker.
♦ Initialisation of client / server applications.
♦ Programming interface to the interface repository.
4.6 Object Identification
♦ Objects are uniquely identified by object identifiers.
♦ Object identifiers are persistent.
♦ Identifiers can be externalised (converted into a string) and internalised.
♦ Identifiers can be obtained:
» from a naming or a trading service,
» by reading attributes,
» from an operation result, or
» by internalising an externalised reference.
♦ IDL operations are handled synchronously.
♦ For notifications, it may not be necessary to await the server, if the operation does not:
» have a return value,
» have out or inout parameters, and
» raise specific exceptions.
♦ Such a notification can be implemented as a oneway operation in IDL.
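Such a notification might be declared in IDL along the following lines; the Logger interface and its operation are hypothetical examples, not part of any standard service:

```idl
// Hypothetical IDL: a notification with no return value, no out/inout
// parameters and no raises clause may be declared as a oneway operation.
interface Logger {
    oneway void notify_event(in string message);
};
```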
CORBA AND JAVA
♦ 1997: RMI introduced with JDK 1.1.
♦ 1998: JavaIDL with JDK 1.2 – a Java ORB supporting IIOP. The ORB also supports RMI over IIOP ⇒ remote objects written in the Java programming language are accessible from any language via IIOP.
♦ CORBA provides the network transparency, Java provides the implementation transparency.
6.1 RMI vs CORBA
♦ RMI is a Java-centric distributed object system. Currently the only way to integrate code written in other languages into an RMI system is to use the Java native-code interface to link a remote object implementation in Java to C or C++ code. This is possible, but complicated.
♦ CORBA, on the other hand, is designed to be language-independent. Object interfaces are specified in a language that is independent of the actual implementation language. This interface description can then be compiled into whatever implementation language suits the job and the environment.
♦ Relatively speaking, RMI can be easier to master than CORBA, especially for experienced Java programmers. CORBA is a rich, extensive family of standards and interfaces, and delving into the details of these interfaces is sometimes overkill for the task at hand.
6.3 RMI vs CORBA (ctd.)
♦ CORBA is a more mature standard than RMI and has had time to gain richer implementations. The CORBA standard is fairly comprehensive in terms of distributed objects, and there are CORBA implementations that provide many more services and distribution options than RMI or Java. The CORBA Services specifications, for example, include comprehensive high-level interfaces for naming, security, and transaction services.
♦ So which is better, CORBA or RMI? Basically, it depends. If you are building a system from scratch, with no hooks to legacy systems and fairly mainstream requirements in terms of performance and other language features, then RMI may be the most effective and efficient tool for you to use.
♦ On the other hand, if you are linking your distributed system to legacy services implemented in other languages, or if there is a possibility that subsystems of your application will need to migrate to other languages in the future, or if your system depends strongly on services that are available in CORBA but not in RMI, or if critical subsystems have highly specialised requirements that Java cannot meet, then CORBA may be your best bet.
Taking Stock: Module Outline
1 Motivation
2 Distributed Software Engineering
3 Communication
4 RMI
5 CORBA vs RMI
6 Building Distributed Systems with CORBA – Common Problems in Distributed Systems
7 Naming and Trading
8 Concurrent Processes and Threads
9 Transactions
10 Security
0.0 CORBA IDL
♦ CORBA IDL is very expressive and widely available on many platforms for different programming languages. This has motivated the use of CORBA as a mechanism to explain, study and experiment with principles of distributed systems.
0.3 Last Session Summary
♦ Revisited CORBA/IDL
♦ Static vs Dynamic Invocation
♦ Interface Repository
♦ Dynamic Invocation Interface (DII)
♦ Dynamic Skeleton Interface (DSI)
♦ Basic Object Adapter
♦ CORBA Communication and the IIOP Protocol
♦ Hello World Example
♦ Compare and Contrast: CORBA and Java RMI
Outline
♦ To actually develop distributed systems, an IDL is not sufficient. The operations declared in the interface need to be implemented in order to be used.
♦ For both the implementation and the use of distributed operations, bindings to existing programming languages need to be defined. The standardisation of these programming language bindings then facilitates interoperability between distributed objects that are implemented in different programming languages, forming so-called polylingual applications.
♦ A further prerequisite for distributed object-oriented applications is the ability to create distributed objects in a location-transparent way. Moreover, objects may have to be copied or relocated, and in doing so may have to be migrated to different platforms. Objects may also have to be removed.
1.1 Polylingual Applications
♦ Distributed computing frameworks such as CORBA are not only used for the construction and a-priori integration of new components. They are probably more often used for the a-posteriori integration of applications from existing components.
♦ Polylingual applications have components in different programming languages.
♦ To achieve interoperability between these components, language bindings are needed that map different language concepts onto each other.
♦ Problem: with n different languages, n(n-1) different language bindings are needed.
♦ Solution: one language (such as IDL) acts as a mediator. This requires only n bindings.
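The counting argument can be made concrete with a small sketch; the class and method names are illustrative only, and the numbers are the point:

```java
// With n languages and direct (directional) pairwise bindings, n*(n-1)
// bindings are needed; with one mediator language such as IDL, only n.
public class BindingCount {
    static int direct(int n) { return n * (n - 1); }
    static int viaMediator(int n) { return n; }

    public static void main(String[] args) {
        System.out.println(direct(5));      // 5 languages pairwise: 20 bindings
        System.out.println(viaMediator(5)); // via a mediator IDL: 5 bindings
    }
}
```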
1.2 Standardisation of Bindings
♦ Facilitates portability:
» If different ORB vendors used different programming language bindings, neither object implementations nor clients of these implementations would be portable. As this is very undesirable, the OMG has standardised a number of language bindings.
» ORB vendors must respect these language bindings to be able to claim that they are CORBA compliant.
♦ Decreases the learning curve for developers:
» Developers who have studied one language binding do not have to learn the binding again if they switch to an ORB from another vendor.
For an object request broker product to comply with the CORBA standard, it is sufficient to provide one of these bindings. Most brokers, however, provide more than one binding. Nevertheless, no product is currently available that implements all bindings.
1.4 What Bindings Need to Address
♦ Atomic data types and type constructors
♦ Constants
♦ Interfaces and multiple inheritance
♦ Object references
♦ Attribute accesses
♦ Operation execution requests
♦ Exceptions
♦ Invocation of ORB operations
1.5.1 Modules
♦ As an example, assume that an interface Account is included in the IDL module BankApplication. This interface will be represented in Java as the class Account inside the package BankApplication. From outside the package, the class can be accessed as BankApplication.Account.
♦ Note that in this way the avoidance of name clashes is supported, which makes the approach particularly useful for the construction of large distributed systems.
IDL                                Java
short / unsigned short             short
long / unsigned long               int
long long / unsigned long long     long
float                              float
double                             double
char                               char
boolean                            boolean
octet                              byte
string                             String
1.5.2 Atomic Types (ctd.)
♦ Most atomic types map naturally to Java.
♦ Java’s platform independence is of great value here. In the IDL-to-C++ mapping, for example, there is the problem of different representations on different platforms: shorts can be 32 or 64 bits on Unix and 16 bits on PCs, and the significance of a byte may differ (little-endian vs big-endian architectures). Therefore IDL-to-C++ does not map anything to atomic C++ types. Java does not have this problem because the Java Virtual Machine is standardised.
1.5.3 Enumerations
♦ IDL provides an enumeration type:
» An ordered list of identifiers whose values are assigned in ascending order according to their position in the enumeration.

module Addresses {
    enum Sex {male, female};
};

♦ Java has no enumeration type and therefore has to implement an enumeration as a class!
♦ The class provides constants of the enumeration type, which is internally realised as integers.
♦ Additionally, the Java class provides a method to convert integers to the enumeration type.
Enumeration (2)
♦ One shortcoming of Java is the missing enumeration type. An IDL enumeration is mapped to an enumeration class in Java.
♦ Example: the above IDL enumeration is implemented by the Java code below. The constants can be accessed as Sex.male and Sex.female. Integers (0 and 1 in this case) can be translated to the enumeration type via from_int.

package Addresses;
public final class Sex implements java.lang.Cloneable {
    public static final int _male = 0;
    public static final Sex male = new Sex(_male);
    public static final int _female = 1;
    public static final Sex female = new Sex(_female);
    public static final Sex IT_ENUM_MAX = new Sex(Integer.MAX_VALUE);

    public int value() { return ___value; }

    public static Sex from_int(int value) {
        switch (value) {
            case _male   : return male;
            case _female : return female;
            default      : throw new org.omg.CORBA.BAD_PARAM("Enum out of range");
        }
    }

    private Sex(int value) { ___value = value; }
    private int ___value;

    public java.lang.Object clone() { return from_int(___value); }
}
IDL:

struct Info {
    long height;
    short weight;
};

Java:

final public class Info {
    public int height;
    public short weight;
    public Info() {}
    public Info(int height, short weight) {
        this.height = height;
        this.weight = weight;
    }
}

♦ Likewise, IDL records (structs) are mapped to a Java class.
♦ Attributes of the record are mapped to public attributes of the class.
♦ Names used in IDL are used directly in Java.
♦ IDL interfaces are translated into public Java interfaces. The reasons for this are obvious: the inheritance and subtype relationships in IDL can be mapped to inheritance in Java, and interface components such as attributes and operations can be implemented as Java methods.
♦ The interface name can be kept as the class name, because no name conflicts can occur in Java that would not already have been detected in IDL.
1.5.6 Attributes (ctd.)
♦ IDL attributes are implemented as Java class attributes with access methods. For readonly attributes a single (get) method is generated; for other attributes a pair of (set and get) methods is created.
♦ An access to an attribute of a remote object can fail for similar reasons as an operation execution request. These failures are handled using exceptions in both Java and IDL.
♦ The visibility of the methods that implement attributes is public. This is necessary to retain the IDL semantics that any attribute that is declared can be accessed from other classes.
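As a sketch of this mapping, a hypothetical IDL interface `interface Person { attribute string name; readonly attribute short age; };` would yield a Java interface along these lines (accessor methods are named after the attribute, following the IDL-to-Java convention):

```java
// Hypothetical mapping sketch: each attribute becomes public accessor methods.
public interface Person {
    String name();            // get method for attribute "name"
    void name(String value);  // set method (not generated for readonly attributes)
    short age();              // readonly attribute "age": get method only
}
```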
1.5.7 Operations (ctd.)
♦ Operations defined in an IDL interface are mapped to public Java methods of the class that represents the interface.
♦ The method name is retained, because again this cannot cause scoping problems in Java that would not have been detected in IDL. Likewise, parameter names are retained, as they cannot cause name clashes.
♦ The mapping of parameter types is more complicated. Since Java does NOT provide pointers, parameters of atomic type are passed by value.
♦ IMPORTANT STUFF
♦ Truth #1: Everything in Java is passed by value. Objects, however, are never passed at all; ONLY their references are (by value, again).
♦ Truth #2: The values of variables are always primitives or references, never objects.
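A short, self-contained sketch of these two truths; all names are illustrative:

```java
public class PassByValue {
    // The reference is copied, so both names refer to the same object:
    // mutations through the copy are visible to the caller.
    static void mutate(StringBuilder sb) { sb.append(" world"); }

    // Reassigning the copied reference has no effect on the caller's variable.
    static void reassign(StringBuilder sb) { sb = new StringBuilder("other"); }

    public static void main(String[] args) {
        StringBuilder s = new StringBuilder("hello");
        mutate(s);
        System.out.println(s);  // prints "hello world"
        reassign(s);
        System.out.println(s);  // still prints "hello world"
    }
}
```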
Operations ctd. (in, out, inout parameters)
♦ IN: to be passed with a meaningful value.
» The value of the actual parameter is copied into the formal parameter when the operation is invoked. Modification of the formal parameter affects only the formal parameter, not the actual parameter. This is the most common form of parameter passing and is the only one provided in C and Java (CALL-BY-VALUE).
♦ OUT: whose value will be changed by the operation.
» The value of the formal parameter is copied into the actual parameter when the procedure returns. Modifications to the formal parameter do not affect the actual parameter until the function returns (CALL-BY-RESULT).
» So it really should be passed by reference (to be modified!).
♦ INOUT: a combination of IN and OUT.
♦ E.g. consider f(s) and a call f(g): s is the formal parameter and g the actual parameter.
Operations ctd. (implementing out, inout)
♦ CORBA IDL in parameters implement call-by-value semantics. Java supports this, so in maps to normal Java parameters and requires no additional effort.
♦ IDL’s out and inout parameters, however, do NOT have Java counterparts, so some additional mechanism is required for call-by-result, etc.
♦ The Java mapping creates for every type a holder class: a container object which wraps up the value. Since object references can be passed by value, out/inout parameters can now be realised in Java programs.
♦ I.e. clients instantiate an instance of the appropriate holder class, which is then passed by value.
♦ To support portable stubs and skeletons, holder classes also implement the org.omg.CORBA.portable.Streamable interface, to allow for marshalling and unmarshalling. (The whole object is sent!)
♦ The short holder class of the above example is part of the org.omg.CORBA package:

package org.omg.CORBA;
public final class ShortHolder {
    public short value;
    public ShortHolder() {}
    public ShortHolder(short s) { value = s; }
}

♦ In language bindings providing pointers, out/inout parameters are realised by pointers.
♦ The contents of the holder instance are modified by the server invocation; the client then uses the possibly changed contents.
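A sketch of how a holder realises an out parameter. To stay self-contained (the org.omg.CORBA package is no longer shipped with current JDKs), a minimal local stand-in for ShortHolder is defined here, and the temperature service is hypothetical:

```java
public class HolderDemo {
    // Local stand-in mirroring the shape of org.omg.CORBA.ShortHolder.
    static final class ShortHolder {
        public short value;
        public ShortHolder() {}
        public ShortHolder(short s) { value = s; }
    }

    // "Server" side writes its result into the holder the client passed in.
    static void currentTemperature(ShortHolder out) { out.value = 21; }

    public static void main(String[] args) {
        ShortHolder t = new ShortHolder();
        currentTemperature(t);       // the holder reference is passed by value,
        System.out.println(t.value); // but its contents were changed: prints 21
    }
}
```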
1.5.8 Inheritance (ctd.)
♦ Inheritance between IDL interfaces is implemented as inheritance between the respective Java interfaces.
♦ Note: Java interfaces do allow multiple inheritance, whereas Java classes do not.
♦ Therefore IDL interfaces with multiple inheritance map to Java interfaces with multiple inheritance.
♦ When implementing such a Java interface one uses the implements keyword, and therefore inherits only the names of methods and attributes, not any code. The Java class implementing a Java interface with multiple inheritance implements every single method/attribute of the interface and is therefore in control.
Exceptions (2)
♦ The previous example leads to the following Java class:

package Exception.EmployeePackage;
public final class too_young extends org.omg.CORBA.UserException
        implements java.lang.Cloneable {
    public String explanation;
    public short age;
    public too_young() { super(); }
    public too_young(String explanation, short age) {
        super();
        this.explanation = explanation;
        this.age = age;
    }
    ...
}

♦ Note that programming languages such as C, which do not provide exceptions, model exceptions by additional parameters to methods (much faster, but easier to ignore).
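A sketch of how such a user exception might be raised and handled. A minimal stand-in for the generated class is used so the example runs without an ORB (the real class extends org.omg.CORBA.UserException), and the hire operation is hypothetical:

```java
public class ExceptionDemo {
    // Stand-in mirroring the generated too_young exception class.
    static final class TooYoung extends Exception {
        final String explanation;
        final short age;
        TooYoung(String explanation, short age) {
            this.explanation = explanation;
            this.age = age;
        }
    }

    // Hypothetical server-side operation raising the exception.
    static void hire(short age) throws TooYoung {
        if (age < 16) throw new TooYoung("minimum working age is 16", age);
    }

    public static void main(String[] args) {
        try {
            hire((short) 12);
        } catch (TooYoung e) {
            System.out.println(e.explanation + ": " + e.age);
        }
    }
}
```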
2.1 Introduction
♦ Component creation in a distributed system is more complicated than in a centralised system, mainly because:
» 1) Often the component is to be created on a non-local machine. The component creation mechanisms available in programming languages (such as constructors in Java) cannot be used, because a location specification has to be included.
» 2) The location has to be identified, and this identification must be transparent.
♦ More problems arise for the duplication and migration of components:
» due to potentially heterogeneous source and target platforms, and also 2) above.
♦ The deletion of components is more difficult as well:
» Garbage collection techniques assume that all objects are available in one address space. This is not the case in a distributed system, so these techniques cannot be directly applied.
♦ Object creation is done in the CORBA life cycle service by so-called Factory objects. These are plain CORBA objects themselves, which export an operation that creates and returns new objects.
♦ The factory objects use object constructors for the implementation of these creation operations. The new objects therefore run in the same address space as the factory object.
♦ An example is a personFactory object, which can be used to create a new object of type person. To do so, a client who wishes to create a person object calls the operation createPerson, which returns a reference to a newly created person object. This person object will run on the same machine as the personFactory object.
♦ The problem of location transparency then comes down to locating factory objects.
2.2 Object Creation (ctd.)
♦ Object creation is done in the CORBA life cycle service by factory objects.
♦ The life cycle module exports the FactoryFinder interface, which supports factory location.
♦ A client wishing to locate a factory can invoke the find_factories operation, which returns a sequence of Factory objects. The parameter of the find_factories operation is a key that can be considered an external (and location-independent) name. If no factories are found with that name, the NoFactory exception is raised.
♦ Factories register with a factory finder using a private protocol. This protocol is likely to be defined in an interface that inherits from the FactoryFinder interface. This, however, is transparent to clients.
♦ Factory finders are not only used for the immediate location of a factory (for creation purposes); they are also used as proxies (placeholders) for location information that is passed to move and copy operations.
2.2 Object Creation (ctd.)
♦ It would be fairly costly if a factory interface had to be created for each object type. This would immediately double the number of interfaces in the distributed application.
♦ The life cycle service therefore defines the GenericFactory interface. It exports an operation by means of which it can be checked whether the factory is able to create an object of a particular type (whose name is given as a key).
♦ A second operation allows clients to create an instance of the type whose name is given as a key.
♦ In this way, type-specific factories are no longer needed. In addition, resources can be managed for instances of different types that reside in one location.
♦ As a disadvantage, however, a type-specific initialisation (which can be achieved within an object-constructing operation of a specific factory) is not possible through this generic interface.
Example

import org.omg.CosNaming.*;
import org.omg.CosLifeCycle.*;
import org.omg.CORBA.*;

// 1) Instantiating the factory from an interoperable object reference stored in a file
String factoryIOR;
factoryIOR = getFactoryIOR("genfac.ior");
org.omg.CORBA.Object genFacRef = orb.string_to_object(factoryIOR);
GenericFactory fact = GenericFactoryHelper.narrow(genFacRef);

// 2) Using the factory to create an object
// struct NameComponent { Istring id; Istring kind; };
NameComponent nc = new NameComponent("sBuyer::BuyerServer",
2.3 Object Duplication
♦ The interface to duplicate objects is LifeCycleObject.
» The copy operation takes a FactoryFinder as a parameter. This factory finder defines the (set of) locations at which the copy should be created.
♦ Object types that are to be copied or moved to other locations have to be subtypes of LifeCycleObject.
♦ To accomplish type-specific implementations of the copy operation, while retaining a unique and generic interface that is seen by clients, subtypes of LifeCycleObject redefine the copy operation.
♦ interface LifeCycleObject {
      LifeCycleObject copy (in FactoryFinder there, in Criteria the_criteria)
          raises (NoFactory, NotCopyable, InvalidCriteria, CannotMeetCriteria);
      void move (in FactoryFinder there, in Criteria the_criteria)
          raises (NoFactory, NotMovable, InvalidCriteria, CannotMeetCriteria);
      void remove () raises (NotRemovable);
  };
♦ The copying of an object cannot be implemented by the ORB itself.
» The copy operation uses a factory to create a new object on the target machine. In this way, the problem of heterogeneous machine code of object implementations is resolved.
♦ Attribute values are transferred either:
» through parameters of the object-constructing operation, or
» through explicit operation invocations performed after the object has been created. In this way the heterogeneity of data representations is resolved.
2.4 Object Deletion
♦ Objects that are created also have to be removed. In many object-oriented programming languages this is done implicitly when the object is no longer referenced.
♦ This requires reference counting and garbage collection techniques which are not applicable to distributed objects because they are too expensive in a distributed setting!
♦ Deletion of an object is defined in the LifeCycleObject interface as well. To free the resources allocated by an object clients explicitly invoke the remove operation.
2.5 Object Migration
♦ Migration is the relocation of an object implementation from one location to another.
♦ The client view of migration is also defined by the LifeCycleObject interface, by means of the move operation.
♦ Interfaces defining specific objects that inherit from LifeCycleObject redefine move in an application-specific way! It is often possible to use the copy and the remove operations for that purpose. This, however, is not done generically, as more efficient ways may be possible in application-specific situations.
2.6 What’s Missing: Replication
♦ No relationship is maintained between two objects once they have been copied. They therefore do not evolve together but are completely independent of each other.
♦ This means that the life cycle service does not support replication, and replication transparency is therefore not supported in the CORBA framework.
♦ There are integrations of particular CORBA products (Orbix, for instance) with replication middleware components (ISIS). These integrations, however, are not standardised and applications that use them will not be portable.
♦ The advantages of replication are that it allows a higher load to be handled and that it supports fault tolerance, because the state of an object can be recovered from a replica if an implementation has crashed.
Alternatives for locating interface definitions:
♦ Any interface inherits the operation InterfaceDef get_interface() from Object.
♦ Associative search using lookup_name.
♦ Navigation through the interface repository using the contents and defined_in attributes.
♦ The location transparency principle suggests keeping the physical location of components transparent for both the component itself and all clients of the component. Only then can the component be migrated to other servers without having to change the component or its clients.
♦ In the CORBA framework, location transparency is already supported by the fact that objects are identified by object references, which are independent of the object's location. In addition:
♦ Naming supports the definition of external names for components.
♦ Trading supports the definition of service characteristics for a component.
2.1 NFS Directories (ctd.)
♦ NFS is based on directories. Directories contain a number of name bindings, each of which maps a name to a file or a subdirectory.
♦ Names are unique within the scope of the directory and can be composed into path names by delimiting the name components with '/'.
♦ Every file or directory of the file system must have at least one entry in some directory. If the last binding is removed, the file or directory ceases to exist. NO NAME, NO LIFE!
♦ A file or directory can have more than one name. An example is the directory that is shared by users 'ed' and 'jam'. In 'ed''s home directory that directory has the name 'web', while user 'jam' has given it the name 'www'.
♦ The naming scheme for files in NFS supports location transparency, because files can be identified using path names rather than physical addresses (such as the hard-disk drive name C:) or the IP address of the server machine to which a partition of the file system is connected.
♦ The X.500 Directory Service is a recommendation of the International Telecommunication Union (ITU), formerly known as CCITT.
♦ X.500 defines a global name space and is therefore the basis for component identification in wide area networks, while the network file system is merely used in local area networks.
♦ X.500 defines a directory tree and components can have only one name. Having a name is not existential for a component and there may well be subordinate components that are not named but can be identified otherwise.
♦ X.500 directory service entries not only have a name, but also a role attribute, given in brackets. In file systems these roles are sometimes indicated informally by using file name extensions, such as '.cc' for a C++ file or '.doc' for a word processor document.
♦ Another global name service that has become very prominent recently is the Internet Domain Name Service (DNS). The root of DNS is maintained by a machine called ns.nasa.gov that is operated by the US space agency NASA.
♦ Each DNS node maintains a table with domains of which it knows the name servers. The root node, for instance would have entries identifying the domains '.de' and '.ac.uk' representing Germany and all academic sites in the UK.
♦ A name lookup performed by a machine of City’s local network of a machine in the network of 'uni-paderborn.de' would then first be performed by nameserv.city.ac.uk. If that name server could not resolve the binding, it would ask the next higher level name server and so on until it gets to the root.
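The lookup behaviour described above can be sketched in Java. This is a hedged model, not the DNS protocol: each name server knows the bindings of its own zone, delegates suffixes it owns to lower-level servers, and otherwise asks the next higher level. All zone names and addresses in the example are invented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of hierarchical name resolution as described above.
class NameServer {
    private final NameServer parent;                      // next higher level, null for the root
    private final Map<String, String> bindings = new HashMap<>();
    private final Map<String, NameServer> delegations = new HashMap<>();

    NameServer(NameServer parent) { this.parent = parent; }
    void bind(String name, String address) { bindings.put(name, address); }
    void delegate(String suffix, NameServer ns) { delegations.put(suffix, ns); }

    String resolve(String name) {
        if (bindings.containsKey(name)) return bindings.get(name);
        // delegate to the name server responsible for a matching domain suffix
        for (Map.Entry<String, NameServer> d : delegations.entrySet())
            if (name.endsWith(d.getKey())) return d.getValue().resolve(name);
        // cannot resolve the binding locally: ask the next higher level server
        return parent != null ? parent.resolve(name) : null;
    }
}
```

A lookup starting at a local server thus climbs towards the root until some server knows a delegation that covers the name, then descends into that domain.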
2.2 Common Characteristics
♦ All the naming services we looked at include the concept of external names that can be defined for distributed components, be they file names, names of organisations or Internet domain names.
♦ All names are defined within the scope of hierarchically organised name spaces. These are directories in NFS, the X.500 directory tree, or name servers in the Internet.
♦ All naming services provide two fundamental operations to define and lookup names. The operation that defines a new name is usually referred to as 'bind', while the operation that searches for a component is commonly denoted as 'resolve'.
♦ Moreover, the name bindings are stored persistently by the name servers. Directory and file names are stored as part of the file system on disks. Directory entries in X.500 are stored persistently by the respective servers, and the Internet domain name servers store name bindings persistently in configuration databases.
Naming
The CORBA Naming service was defined in 1993 as the very first CORBA service. Its purpose is to provide a basic mechanism by means of which external names can be defined for CORBA object references.
♦ Names are hierarchically organised in so-called naming contexts. Name bindings have to be unique within a context (i.e., no other name binding with the same name occurs in the context). However, one object can have different names in the same context, or even the same name within different contexts.
♦ Note that it is not necessary to bind a name to every CORBA object; name bindings are thus not existential for CORBA objects (as opposed to file names in NFS). Other ways in which objects can be located include:
» Accessing attributes whose type is a subtype of Object.
» Executing operations whose result is a subtype of Object.
» Using the CORBA Trading service.
» Using the CORBA Query or Relationship facilities.
2.3. CORBA Names
♦ Names in the CORBA naming service are sequences of simple names. They are composed in a similar way to path names in NFS, as sequences of a number of directory names and a file name.
♦ A simple name is a (value, kind) tuple (à la X.500).
♦ Only the value component is used for resolving the name.
♦ The kind attribute is used to store and provide additional information about the role that an object or naming context has.
♦ A simple name in the above example would be ("Chelsea","Club") or ("England","League"), while the composite name identifying Athletic Bilbao within the context of the UEFA would consist of: {("Spain","1. Liga"),("Bilbao","Club")}.
2.3. The IDL Interfaces
♦ The Naming Service is specified by two IDL interfaces:
» NamingContext defines operations to bind objects to names and resolve name bindings.
» BindingIterator defines operations to iterate over a set of names defined in a naming context. An iterator is an object that can be used to enumerate a collection of objects and visit single elements or chunks of these objects successively.
2.3. Naming Context (ctd.)
♦ Operation bind creates a name binding in the naming context identified by the naming context that executes the operation and all name components but the last included in the first parameter n. In that naming context, bind inserts a name that equals the last name component and associates it with obj.
♦ Operation resolve returns the object that is identified by the executing naming context and the name n. If there is no such name binding in that context, the exception NotFound is raised.
♦ Operation unbind deletes the name binding identified by the executing naming context and the name n.
♦ Operations new_context and bind_new_context create new naming context objects. The latter operation also creates a name binding as identified by the name n.
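The bind/resolve semantics just described can be sketched in plain Java. This is not the OMG NamingContext API; the Context class and its methods are invented for illustration. A compound name is resolved component by component, and bind inserts the last component in the context reached through all the earlier components.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the naming-context semantics described above.
class Context {
    private final Map<String, Object> bindings = new HashMap<>();

    // insert the last name component in the context reached via the others
    void bind(String[] name, Object obj) {
        contextFor(name).bindings.put(name[name.length - 1], obj);
    }

    Object resolve(String[] name) {
        Object o = contextFor(name).bindings.get(name[name.length - 1]);
        if (o == null) throw new RuntimeException("NotFound"); // models the NotFound exception
        return o;
    }

    Context bindNewContext(String[] name) {   // mirrors bind_new_context
        Context c = new Context();
        bind(name, c);
        return c;
    }

    // walk all name components but the last, starting from this context
    private Context contextFor(String[] name) {
        Context c = this;
        for (int i = 0; i < name.length - 1; i++)
            c = (Context) c.bindings.get(name[i]);
        return c;
    }
}
```

Note that, as in the slides, a name is only meaningful relative to the context that executes the operation; the same object could be bound under different names in different contexts.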
♦ Operation list is used to obtain all name bindings in the naming context. Parameter how_many gives an upper bound for the number of name bindings that are to be included in the out parameter bl. If there are more than how_many bindings in the naming context, a binding iterator will be created and returned as out parameter bi.
♦ Operations provided by BindingIterator will be used after list has been executed on a naming context. They will then provide successive bindings that were not included in the BindingList returned by list.
♦ Operation next_one returns just one binding while operation next_n returns as many bindings as the client requests through the in parameter how_many.
♦ Both operations have a return value that indicates whether there are further bindings available in the context that have not yet been obtained.
Server Side: Creating a Name Space
  ORB orb = ORB.init(args, null);
  org.omg.CORBA.Object objRef = orb.resolve_initial_references("NameService");
  NamingContext rootContext = NamingContextHelper.narrow(objRef);
2.4 Limitations
♦ Limitation of Naming: the client always has to identify the server by name. (White Pages)
♦ Inappropriate if the client just wants to use a service at a certain quality but does not know from whom:
» Automatic cinema ticketing;
» Video on demand;
» Electronic commerce.
3.1 Trading Characteristics
♦ The principal idea of a trading service: have a mediator that acts as a broker between clients and servers.
♦ This broker enables a client to change its perspective when it tries to locate a server component from:» locating individual server components (`WHO` is the server that you are interested in? – i.e., White Pages)
» to the set of services the client is interested in (`WHAT` are the services that you need? – i.e., Yellow Pages).
♦ The broker then selects a suitable service provider on behalf of the client.
♦ Other examples: yellow pages, insurance & stock Brokers
3.1 Trading Characteristics
♦ A language for expressing types of services that both client and server understand.
♦ The language must be expressive enough to define the different types and qualities of service that a server offers or that a client may wish to use:
» performance, reliability or privacy.
♦ The quality of service may be defined statically or dynamically.
» A static definition is appropriate (because it is simpler) if the quality of service is independent of the state of the server.
» This might be the case for qualities such as precision, privacy or reliability.
♦ For qualities such as performance, however, the server may not be able to ensure a particular quality of service statically at the time it registers the service with the trader.
» Then a dynamic definition of the quality would be used, which makes the trading service inquire about the quality when a client needs to know it.
3.1 Trading Characteristics: Steps
1. SERVERS have to register the services they offer with the trader.
» The trader is then in a position to respond to service inquiries from clients.
2. CLIENTS then use the common language to ask the trader for a server that provides the type of service the client is interested in.
» Clients may or may not include specifications of the quality of service that they expect the server to provide.
3.a The TRADER then reacts to such client inquiries in different ways.
» Service matching: The trader may itself attempt to match the client's request with the best offer and just return the identification of a single server that provides the service with the intended quality.
3.b The TRADER may also compile a list of those servers that offer a service which matches the client's request.
» Service shopping: The trader returns the list to the client, and the client selects the most appropriate server.
3.3 Properties (ctd.)
♦ A property is a name-value structure, where a property name is a string and a property value can be of any type.
♦ The type Property could be used to specify, for instance, response time by setting the name to the string response_time and the value to 0.1 (seconds).
♦ As services usually have more than one property, the type PropertySeq can be used to declare all the properties that a service has.
♦ Type SpecifiedProps is a variant record (union) that is used by clients to tell the trader about those properties they expect a service to have:
» if the discriminator of the variant is set to none, the client does not care about the properties a service has;
» if it is set to all, the client expects the service to meet all properties;
» if it is set to some, the component prop_names specifies a sequence of properties that the client is expecting.
3.3 Register (ctd.)
♦ The operation export is used by the server to make a new service known to the trader. As arguments it passes an object reference to the object that implements the service, a string denoting the service name, and the properties defining the qualities of that service. The export operation returns a unique identifier for the offer, which is used for referring to the offer in other operations.
♦ By invoking operation withdraw a server deletes the service identified by the offer identifier.
♦ Using operation modify, the server can dynamically change the qualities of service the trader advertises. Again the service is identified by the offer identifier passed as the first parameter. The properties named in the second parameter are deleted and the properties identified in the last parameter change their value.
♦ The most important parameter of the query operation is the name of the service the client is interested in. Parameter pref identifies whether the client wants the trader to do service matching or whether the client wants to do service shopping among the servers implementing some service. Parameter desired_props identifies the qualities of service the client wants the server to guarantee. The usual iterator pattern is applied to pass the matching servers through the out parameter offers.
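The export/withdraw/query behaviour described above can be sketched as follows. This is a hedged model, not the CosTrading IDL: the Offer and Trader classes and their methods are invented, properties are modelled as simple (name, value) pairs, and the query implements the "service shopping" style by returning every matching offer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical trader sketch: servers export offers with properties,
// clients query by service type and the properties they require.
class Offer {
    final Object reference;                  // the object implementing the service
    final String serviceType;
    final Map<String, Object> properties;
    Offer(Object ref, String type, Map<String, Object> props) {
        this.reference = ref; this.serviceType = type; this.properties = props;
    }
}

class Trader {
    private final List<Offer> offers = new ArrayList<>();

    // the returned Offer object serves as the offer identifier
    Offer export(Object ref, String type, Map<String, ?> props) {
        Offer o = new Offer(ref, type, new HashMap<>(props));
        offers.add(o);
        return o;
    }

    void withdraw(Offer id) { offers.remove(id); }

    // service shopping: return every offer of the requested type whose
    // properties include all the desired (name, value) pairs
    List<Object> query(String type, Map<String, ?> desiredProps) {
        List<Object> result = new ArrayList<>();
        for (Offer o : offers)
            if (o.serviceType.equals(type)
                    && o.properties.entrySet().containsAll(desiredProps.entrySet()))
                result.add(o.reference);
        return result;
    }
}
```

Service matching would be a thin layer on top of this: the trader itself would pick one element of the result list according to the client's preference.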
How can multiple components of a distributed system use a shared component concurrently without violating the integrity of the component?
♦ This question is of fundamental importance, as there are only very few distributed systems in which every component is used by only a single other component at a time.
1 Motivation (ctd.)
♦ Resources accessed concurrently may be hardware components (e.g. a printer), operating system resources (e.g. files or sockets), databases (e.g. the bank accounts kept by different banks) or CORBA objects.
♦ For some types of accesses, resources may have to be accessed in mutual exclusion.
» It does not make sense to have print jobs of different users being printed in an interleaved way;
» Only one user should be editing a file at a time, otherwise the changes made by other users would be overwritten if the last user saves his or her file;
» integrity of databases or CORBA objects may be lost through concurrent updates.
♦ Hence, the need arises to restrict the concurrent access of multiple components to a shared resource in a sensible way.
2.1 Assessment Criteria
♦ Serialisability: Concurrent threads are serialisable if they could be executed one after another with the same effect on shared resources. It can be proven that serialisable threads do not lead to lost updates and inconsistent analysis.
♦ Deadlock freedom: Concurrency control techniques that use locking may force threads to wait for other threads to release a lock before they can access a resource. This may lead to situations where the wait-for relationship is cyclic and threads are deadlocked.
♦ Fairness: refers to whether all threads have the same chance of getting access to resources.
♦ Complexity: On the other hand, computing precisely those and only those schedules that are serialisable may be very complex, and we are interested in the complexity of a concurrency control scheme in order to estimate its performance overhead.
♦ Concurrency!!!: We are also interested in the degree of concurrency that a control scheme permits. It is obviously undesirable to disallow schedules that do not cause serialisability problems.
♦ The principal component that implements 2PL is a lock manager, from which concurrent processes or threads acquire locks on every shared resource they access.
♦ The lock manager investigates the request and compares it with the locks that were already granted on the resource.
» If the requested lock does not conflict with an already granted lock, the lock manager will grant the lock and note that the requester is now using the resource.
2.2 Locking
♦ 2PL is based on the assumption that processes or threads always acquire locks before they access a shared resource and release a lock when they no longer need the resource.
♦ In 2PL, a process does not acquire any further locks once it has released a lock.
♦ This means that threads operate in cycles, where each cycle has a lock acquisition phase followed by a lock release phase.
2.2 Lock Compatibility
♦ The lock manager grants locks to requesting processes or threads on the basis of already granted locks and their compatibility with the requested lock.
♦ The very core of any pessimistic concurrency control technique that is based on locking is the definition of a lock compatibility matrix. It defines the different lock modes and the compatibility between them.
2.2 Locking Conflicts
♦ Locking conflict: when access cannot be granted due to incompatibility between the requested lock and a previously granted lock.
♦ On the occasion of a locking conflict, the requester cannot use the resource until the conflicting lock has been released.
♦ There are two approaches to handling locking conflicts.
» The requesting process can be forced to wait until the conflicting locks are released. This may, however, be too restrictive since the process or thread may well do other computations in between.
» Alert the process or thread that the lock cannot be granted. It can then continue with other processing until a point in time when it definitely needs to get access to the resource.
♦ Several 2PL implementations provide two locking operations, a blocking and a non-blocking one, so that the requester can decide.
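The lock-manager behaviour described above can be sketched in Java. This is a minimal, invented API (only shared read and exclusive write modes, no lock upgrading or deadlock handling): tryLock is the non-blocking variant that reports a conflict, and lock is the blocking variant that waits until the conflicting locks are released.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal lock manager sketch: read locks are shared, write locks exclusive.
class LockManager {
    private final Map<Object, int[]> readers = new HashMap<>(); // resource -> reader count
    private final Set<Object> writers = new HashSet<>();        // write-locked resources

    // non-blocking request: returns false on a locking conflict
    synchronized boolean tryLock(Object res, boolean write) {
        boolean conflict = writers.contains(res)
                || (write && readers.getOrDefault(res, new int[]{0})[0] > 0);
        if (conflict) return false;
        if (write) writers.add(res);
        else readers.computeIfAbsent(res, r -> new int[1])[0]++;
        return true;
    }

    // blocking request: wait until the conflicting locks are released
    synchronized void lock(Object res, boolean write) throws InterruptedException {
        while (!tryLock(res, write)) wait();
    }

    synchronized void unlock(Object res, boolean write) {
        if (write) writers.remove(res);
        else readers.get(res)[0]--;
        notifyAll();                 // wake up waiting requesters
    }
}
```

With tryLock, a thread that hits a conflict can continue with other processing and retry later; with lock, it simply waits.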
♦ Before the account objects are changed, the debit and credit operations request a lock on the account object from the lock manager.
♦ The lock manager then detects a write/write locking conflict and forces the second process to wait until the first process has released its lock. The second process then reads the up-to-date value of the balance of the account and modifies it without losing the update of the first process.
2.2 Deadlocks
♦ Recall that the lock manager may force processes or threads to wait for other processes to release locks.
♦ This solves the problems of lost updates and inconsistent analysis.
♦ Processes may request locks on more than one object.
♦ Situations may arise where two or more processes or threads are mutually waiting for each other to release their locks.
♦ These situations are called deadlocks, and they are very undesirable as they block threads and prevent them from finishing their jobs.
2.2.1 Deadlock Detection and Resolution
♦ Deadlocks are resolved by lock managers.
♦ The manager maintains an up-to-date representation of the waiting graph.
♦ The manager records every locking conflict by inserting a graph edge.
♦ When a conflict is resolved by releasing a conflicting lock, the respective edge is deleted.
♦ The manager uses the waiting graph to detect deadlocks.
♦ Resolution: break cycles, i.e. select one process or thread that participates in such a cycle and abort it.
» Select a node that has the maximum number of incoming or outgoing edges, to reduce the chance of further deadlocks.
» Aborting a process requires undoing all actions that the process has performed and releasing all locks the process has held!!!
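The waiting-graph bookkeeping described above can be sketched as follows. The API is invented for illustration: an edge a -> b means "a waits for b", and a deadlock is exactly a cycle in this graph, found here with a depth-first search.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of a lock manager's waiting graph and cycle-based deadlock detection.
class WaitingGraph {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    void addConflict(String waiter, String holder) {      // locking conflict recorded
        waitsFor.computeIfAbsent(waiter, w -> new HashSet<>()).add(holder);
    }

    void removeConflict(String waiter, String holder) {   // conflicting lock released
        Set<String> s = waitsFor.get(waiter);
        if (s != null) s.remove(holder);
    }

    // deadlock detection: is any node reachable from itself?
    boolean hasDeadlock() {
        for (String node : waitsFor.keySet())
            if (reaches(node, node, new HashSet<>())) return true;
        return false;
    }

    private boolean reaches(String from, String target, Set<String> visited) {
        for (String next : waitsFor.getOrDefault(from, Set.of())) {
            if (next.equals(target)) return true;
            if (visited.add(next) && reaches(next, target, visited)) return true;
        }
        return false;
    }
}
```

A real lock manager would additionally record which resources are involved, so that aborting the selected victim releases the right locks and deletes the right edges.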
2.2.1 Locking Granularity
♦ Two-phase locking is applicable to resources of any granularity.
» It works for CORBA objects as well as for files and directories, or even complete databases.
♦ However, the degree of concurrency that is achieved with 2PL depends on the granularity that is used for locking.
» A high degree of concurrency is achieved with small locking granules.
♦ The disadvantage of choosing a small locking granularity is that a huge number of locks have to be acquired if bigger granules have to be locked.
♦ Trade-off: degree of concurrency vs. locking overhead.
» If we decrease the granularity we can serve more processes concurrently, but have to be prepared to spend higher costs on the management of locks.
♦ The dilemma can be resolved using an optimisation: hierarchical locking.
2.3 Hierarchical Locking
♦ Allows locking of all objects contained in a composite object (container).
♦ BUT also allows a process to indicate, at container level, the sub-resources that it intends to use in a particular mode.
♦ Hierarchical locking schemes therefore introduce intention locks, such as intention read and intention write locks.
♦ I.e. intention locks are acquired for a composite object before a process requests a real lock for an object that is contained in the composite object.
♦ Intention locks signal to those processes that wish to lock an entire composite object that some other process currently holds locks on objects contained in the composite object.
2.3.1 Hierarchical Locking
♦ Intention Read: indicates that some process has acquired, or is about to acquire, a read lock on objects inside a composite object.
♦ Intention Write: indicates that some process has acquired, or is about to acquire, write locks on objects in a composite object.
♦ Processes that want to lock a certain resource would then acquire intention locks on the container of that resource and all its containers.
♦ The lock compatibility matrix is defined in such a way that a locking conflict arises if a container object is already locked in either read or write mode:

         IR    R    IW    W
   IR    +     +    +     -
   R     +     +    -     -
   IW    +     -    +     -
   W     -     -    -     -
2.3.2 Hierarchical Locking
♦ NB: Intention read and intention write are compatible with each other because they do not actually correspond to any locks.
♦ Other modes:
» An IR lock is compatible with an R lock, because accessing the object for reading does not change values.
» An IR lock is incompatible with a W lock, because it is not possible to modify every element of the composite object while some other process is reading the state of an object of the composite.
» etc.
♦ Hence the advantage of hierarchical locking is that it enables different lock granularities to be used at the same time.
♦ The overhead is that, for every individual object, intention locks have to be acquired on every composite object in which the object is contained (an object may be contained in more than one container).
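The compatibility rules just listed can be written down as code. This is a hedged sketch with an invented API: a table over the four modes read (R), write (W), intention read (IR) and intention write (IW), filled in according to the rules above (intention locks never conflict with each other; W conflicts with everything).

```java
// Lock compatibility matrix for hierarchical locking, as described above.
class LockCompatibility {
    static final String[] MODES = {"IR", "R", "IW", "W"};

    // rows: granted mode, columns: requested mode; true = compatible
    private static final boolean[][] MATRIX = {
        //            IR     R      IW     W
        /* IR */ { true,  true,  true,  false },
        /* R  */ { true,  true,  false, false },
        /* IW */ { true,  false, true,  false },
        /* W  */ { false, false, false, false },
    };

    static boolean compatible(String granted, String requested) {
        return MATRIX[index(granted)][index(requested)];
    }

    private static int index(String mode) {
        for (int i = 0; i < MODES.length; i++)
            if (MODES[i].equals(mode)) return i;
        throw new IllegalArgumentException("unknown lock mode: " + mode);
    }
}
```

A hierarchical lock manager would consult this table both for the real lock on the resource and for the intention locks on all of its containers.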
2.4 Transparency of Locking
♦ The last question that we have to discuss is WHO acquires the locks, i.e. who invokes the lock operation for a resource. The options are:
» the concurrency control infrastructure, such as the concurrency control manager of a database management system;
» the implementation of components; or
» the clients of the components.
♦ The first option is very desirable, as concurrency control would then be transparent to the application programmers of both the component and its clients.
♦ Unfortunately this is only possible on limited occasions (in a database system), because the concurrency control manager would have to manage all resources and would have to be informed about every single resource access.
♦ The last option is very undesirable, and it is in fact always avoidable. Hence distributed components should be designed so that concurrency control is hidden within their implementation, not exposed at their interface, and is transparent to the designers of CLIENTS.
2.4 Optimistic Concurrency Control
♦ In general, the complexity of two-phase locking is linear in the number of accessed resources. With hierarchical locking it is even slightly higher, as the containers of resources also have to be locked in intention mode.
♦ This overhead, however, is unreasonable if the probability of a locking conflict is very limited.
♦ Given the motivating examples we discussed earlier, it is quite unlikely that you withdraw cash from an ATM in that very millisecond when a clerk credits a cheque.
♦ This is where optimistic concurrency control comes in.
» It follows a laissez-faire approach and works as a watchdog that checks for conflicts only at the end.
♦ 1. Read:
» The process/transaction executes, reading shared values and writing to a private copy.
♦ 2. Validation:
» When the process completes, the manager checks whether the process could possibly have conflicted with any other concurrent process. If there is a possibility, the process aborts and restarts.
♦ 3. Write:
» If there is no possibility of a conflict, the transaction commits.
♦ If there are few conflicts,
» validation can be done efficiently and leads to better performance than other concurrency control methods. Unfortunately, if there are many conflicts, the cost of repeatedly restarting operations hurts performance significantly.
2.3 Validation Prerequisites
♦ As a prerequisite for optimistic concurrency control, the overall sequence of operations a process performs must be separated into distinguishable units. A validation of the access pattern of a unit is then performed during a validation phase at the end of each unit.
♦ For each unit the following information has to be gathered:
» Starting time of the unit ST(U).
» Time stamp for the start of validation TS(U).
» Ending time of the unit E(U).
» Read and write sets RS(U) and WS(U) (the sets of resources U has accessed in read and write mode).
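Using this per-unit information, the validation check can be sketched as follows. This is one common backward-validation rule, written with an invented API and not taken verbatim from the slides: a unit U passes validation if no unit that committed while U was running wrote a resource that U read.

```java
import java.util.Set;

// Sketch of optimistic validation over the per-unit data described above.
class Unit {
    final long start;            // ST(U)
    final long end;              // E(U)
    final Set<String> readSet;   // RS(U)
    final Set<String> writeSet;  // WS(U)
    Unit(long start, long end, Set<String> rs, Set<String> ws) {
        this.start = start; this.end = end; this.readSet = rs; this.writeSet = ws;
    }
}

class Validator {
    // u is valid if no committed unit that overlapped u wrote something u read
    static boolean valid(Unit u, Iterable<Unit> committedUnits) {
        for (Unit v : committedUnits) {
            boolean concurrent = v.end > u.start;   // v committed while u was running
            if (concurrent && intersects(v.writeSet, u.readSet)) return false;
        }
        return true;   // no possible conflict: u may enter its write phase
    }

    private static boolean intersects(Set<String> a, Set<String> b) {
        for (String x : a) if (b.contains(x)) return true;
        return false;
    }
}
```

Note that comparing ST and E across machines presumes comparable timestamps, which is exactly the synchronised-clock assumption criticised in the comparison that follows.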
2.4 Comparison
♦ Both pessimistic and optimistic techniques
» guarantee serialisability of processes;
» impose serious complexity in that they need the ability to undo the effects of processes and threads.
♦ Pessimistic techniques
» cause a considerable concurrency control overhead through locking, and
» they are not deadlock-free.
» However, they are sufficiently efficient when conflicts are likely.
♦ Serious advantages of optimistic techniques:
» a negligible overhead when conflicts are unlikely;
» furthermore, they are deadlock-free.
» However, the computation of conflict sets is very difficult and complex in a distributed setting. Moreover, optimistic techniques assume the existence of synchronised clocks, which are generally not available in a distributed setting.
♦ In summary, the disadvantages of optimistic concurrency control outweigh its advantages, and in most distributed systems concurrency is controlled using pessimistic techniques.
3 Lock Compatibility
♦ The Concurrency Control service supports hierarchical locking, as many CORBA objects take the role of container objects.
♦ As a further optimisation, the service defines a lock type for upgrade locks.
♦ Upgrade locks are read locks that are not compatible with themselves. Upgrade locks are used on occasions when the requester knows that it only needs a read lock to start with, but will later have to acquire a write lock on the resource as well.
♦ If two processes are in this situation, they would run into a deadlock if they used only read locks. With upgrade locks the deadlock is prevented, as the second process trying to acquire the upgrade lock is delayed already.
3 Locksets
♦ The central object type defined by the Concurrency Control service is the lockset. A lockset is associated with a resource.
♦ With the Concurrency Control service, concurrency control has to be managed by the implementation of a shared resource. Hence the implementation of a resource would usually have a hidden lockset attribute.
♦ Operation implementations included in that resource acquire locks before they access or modify the resource.
♦ A LocksetFactory facilitates the creation of new locksets. The create operation of that interface would usually be executed during the construction of an object that implements a shared resource.
♦ The Lockset interface provides operations to lock, unlock and upgrade locks. The difference between lock and try_lock is that the former is blocking, while the latter returns control to the caller even when the lock has not been granted.
♦ Locksets are used internally by the servant; clients do not see them.
What is Required? Transactions
♦ Clusters a sequence of object requests together such that they are performed with the ACID properties:
» i.e. a transaction is either performed completely or not at all;
» leads from one consistent state to another;
» is executed in isolation from other transactions;
» once completed, it is durable.
♦ Used in databases and distributed systems.
♦ For example, consider the bank account scenario from the last session:
• A funds transfer involving a debit operation on one account and a credit operation on another account would be regarded as a transaction.
• Both operations have to be executed, or none at all.
• They should be isolated from other transactions.
• They should be durable once the transaction is completed.
2.1.1 Atomicity
♦ Transactions are either performed completely or no modification is made.
» I.e. either every operation in the cluster is performed successfully, or none is performed.
» e.g. both debit and credit in the scenario.
♦ The start of a transaction is a continuation point to which it can roll back.
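The atomicity of the funds-transfer example can be sketched as follows. The API is invented for illustration: the state at the start of the transaction is recorded as the continuation point, and if either operation fails the accounts are rolled back to it (a real system would restore from a persistent log rather than from local copies).

```java
// Sketch of an atomic funds transfer: both debit and credit, or neither.
class BankAccount {
    private long balance;
    BankAccount(long balance) { this.balance = balance; }
    long balance() { return balance; }
    void debit(long amount) {
        if (amount > balance) throw new IllegalStateException("insufficient funds");
        balance -= amount;
    }
    void credit(long amount) { balance += amount; }
}

class Transfer {
    static boolean transfer(BankAccount from, BankAccount to, long amount) {
        long fromBefore = from.balance();    // continuation point
        long toBefore = to.balance();
        try {
            from.debit(amount);
            to.credit(amount);
            return true;                     // commit: both operations performed
        } catch (RuntimeException e) {
            restore(from, fromBefore);       // abort: roll back to the start state
            restore(to, toBefore);
            return false;                    // no modification is visible
        }
    }

    private static void restore(BankAccount a, long balance) {
        long diff = balance - a.balance();
        if (diff > 0) a.credit(diff); else a.debit(-diff);
    }
}
```

Isolation and durability are not modelled here; they would require concurrency control around the two accounts and persistent logging of the committed state.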
3 Phase Two
♦ Called the completion phase.
♦ The co-ordinator collates all votes, including its own, and decides to
» commit if everyone voted 'Yes';
» abort if anyone voted 'No'.
♦ All voters that voted 'Yes' are sent
» 'DoCommit' if the transaction is to be committed;
» otherwise 'Abort'.
♦ Servers acknowledge DoCommit once they have committed.
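The coordinator's decision rule in the completion phase can be stated in a few lines of code. This is a sketch with invented names (the messaging to the servers is omitted): the coordinator collates all votes, including its own, and decides DoCommit only if everyone voted Yes.

```java
import java.util.List;

// Sketch of the two-phase-commit completion decision described above.
class Coordinator {
    enum Vote { YES, NO }
    enum Decision { DO_COMMIT, ABORT }

    static Decision collate(Vote ownVote, List<Vote> serverVotes) {
        if (ownVote == Vote.NO) return Decision.ABORT;
        for (Vote v : serverVotes)
            if (v == Vote.NO) return Decision.ABORT;  // abort if anyone voted No
        return Decision.DO_COMMIT;                    // commit only if all voted Yes
    }
}
```

After deciding, the coordinator would persist the decision and then send DoCommit (or Abort) to every Yes-voter, retransmitting on restart as described below.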
♦ Failures prior to the start of 2PC result in an abort. ♦ If a server fails prior to voting, it aborts. ♦ If it fails after voting, it sends GetDecision. ♦ If it fails after committing, it (re)sends a HaveCommitted message.
♦Coordinator failure prior to transmitting DoCommit messages results in abort (since no server has already committed).
♦ After this point, co-ordinator will retransmit all DoCommit messages on restart.» This is why servers have to store even their provisional changes in a persistent way.
» The coordinator itself needs to store the set of participating servers in a persistent way too.
3 Complexity
Assuming N participating servers and a coordinator:
♦ (N) requests from servers to register.
♦ (N) voting requests from coordinator to servers.
♦ (N) completion requests from coordinator to servers (worst case; may be fewer if some had aborted).
♦Hence, complexity of requests is linear (O(3N)=O(N)) in the number of participating servers.
♦ Cannot use the same mechanism to commit nested transactions, as:
» subtransactions can abort independently of their parent;
» subtransactions must have made their decision to commit or abort before the parent transaction does.
♦ Top level transaction needs to be able to communicate its decision down to all subtransactions so they may react accordingly.
♦ Abort is handled as normal.♦ Provisional commit means that coordinator and transactional servers are willing to commit the sub-transactions but have not yet done so.
♦Why not commit? Because the topmost transaction may ask them to abort.
0.1 What is Required? Transactions♦Clusters a sequence of object requests together such that they are performed with ACID properties» i.e transaction is either performed completely or not at all» leads from one consistent state to another» is executed in isolation from other transactions» once completed it is durable
♦ Used in databases and distributed systems.
♦ For example, consider the bank-account scenario from the last session.
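The atomicity ("completely or not at all") property can be illustrated with a toy bank-account transfer; the data structures and rollback mechanism below are assumptions made for the sketch, not how a real database implements it:

```python
# Toy illustration of atomicity for a bank-account transfer.
def transfer(accounts, src, dst, amount):
    snapshot = dict(accounts)            # enough state to roll back this toy update
    try:
        accounts[src] -= amount
        if accounts[src] < 0:
            raise ValueError("insufficient funds")
        accounts[dst] += amount          # either both updates happen, or neither
    except ValueError:
        accounts.clear()
        accounts.update(snapshot)        # roll back: all-or-nothing
        return False
    return True

accounts = {"alice": 100, "bob": 50}
transfer(accounts, "alice", "bob", 30)    # succeeds: {'alice': 70, 'bob': 80}
transfer(accounts, "alice", "bob", 999)   # fails and leaves balances unchanged
```

The failed transfer leaves the accounts exactly as they were, so the system moves only between consistent states.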
1 Effects of Insecurity♦Confidential Data may be stolen, e.g.:» corporate plans.» new product designs.» medical/financial records (e.g. Access bills....).
♦Data may be altered, e.g.:» finances made to seem better than they are.» results of tests, e.g. on drugs, altered.» examination results amended (up or down).
2 Threats
♦ Categorisation of attacks (and goals of attacks) that may be made on a system.
♦ Five main areas:
» leakage: information leaving the system.
» tampering: unauthorised altering of information.
» resource stealing: illegal use of resources.
» vandalism: disturbing correct system operation.
» denial of service: disrupting legitimate use of the system.
♦ Used to specify what the system is proof, or secure, against.
♦ Leakage denotes the disclosure of information to unauthorised subjects.
» Baazi hacking into a CAD system of Rolls Royce in order to obtain the latest designs of RR's jet engines.
» Although fatal in this case, leakage is probably the category that causes the least damage of the above.
♦ Tampering denotes the unauthorised modification of data.
» We would have a case of tampering if you hacked into the School's database in order to alter the marks of your Distributed Systems courseworks (which you cannot, because for security reasons it is not connected to the network!).
♦ Resource stealing identifies the illegal, unpaid-for use of resources, e.g. CPU time, bandwidth, air time of mobiles.
» A case of resource stealing occurred when hackers broke into the computers of telephone companies and managed to have their phone calls charged to other customers' accounts.
♦ Vandalism denotes the disturbance of correct system operation.
» The security of the CS Dept. in Milan was broken, super-user privileges were acquired, and the system's hard disks were formatted. This caused serious damage to the departmental operations for a session.
♦ Eavesdropping
» Request parameters from client to server may contain sensitive information, e.g. PINs, balances.
» Stubs marshal these into a standard data representation.
» By listening to (sniffing) traffic, attackers can obtain and decode request parameters --> eavesdropping.
♦ Tampering
» Attacker modifies request parameters before they reach the server, e.g. a credit amount.
♦ Replaying
» Attacker intercepts and stores a message and has the server repeatedly execute the operation.
» NB the attacker does not have to interpret the message, so encryption alone doesn't help!
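One standard defence against replay (not stated in the slides, but a common technique) is for the server to remember an identifier, or nonce, carried by each request and refuse to execute the same one twice. A minimal sketch, with hypothetical names:

```python
# Sketch: a server rejects replayed requests by remembering nonces it has seen.
seen_nonces = set()

def handle_request(nonce, operation):
    if nonce in seen_nonces:
        return "rejected (replay)"
    seen_nonces.add(nonce)
    return f"executed {operation}"

first  = handle_request(42, "credit 100")   # executed normally
replay = handle_request(42, "credit 100")   # identical message again: rejected
```

This works even though the attacker never decrypted the message: the replayed ciphertext carries the same nonce, which the server has already seen.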
3.1 Cryptographic Terminology
♦ Plain text: the message before encryption.
♦ Cipher text: the message after encryption.
♦ Key: information needed to convert from plain text to cipher text (or vice-versa).
♦ Function: the encryption or decryption algorithm used, in conjunction with the key, to encrypt or decrypt a message.
♦ Key distribution: how to distribute keys between senders and receivers.
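The terminology can be made concrete with a deliberately insecure toy cipher (XOR with a repeating key, chosen here purely for illustration):

```python
# Toy XOR cipher to make the terminology concrete; NOT a secure algorithm.
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # The same function is both the encryption and decryption function:
    # applying the key a second time restores the original input.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plain_text  = b"transfer 100 to bob"       # the message before encryption
key         = b"secret"                    # must be shared by sender and receiver
cipher_text = xor_cipher(plain_text, key)  # the message after encryption
recovered   = xor_cipher(cipher_text, key) # decryption with the same key
```

Because sender and receiver need the same key, this is secret-key (symmetric) encryption, and key distribution becomes the hard part.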
Secret Key Encryption for Distributed Objects
♦ Encryption is done after marshalling (and decryption before unmarshalling), once it has been noted that the server object is not local.
♦ The encrypted object request transmitted via the network is secured against eavesdropping and message tampering.
♦ Note that the encryption can be kept entirely transparent to client and server programmers, as it is done by the middleware or by the stubs created by the middleware.
♦ NB Disadvantage: for secret-key encryption between distributed objects, the number of keys needed grows quadratically with the number of objects (one key per pair of communicating objects…).
♦ Public Key (aka Asymmetric) Encryption overcomes this problem
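The quadratic vs. linear key-count argument is simple arithmetic: with secret keys, every pair of communicating objects needs its own key, i.e. N(N−1)/2 keys; with public-key encryption, each object needs just one key pair. A quick sketch:

```python
# Counting keys: secret-key needs one key per pair, public-key two keys per object.
def secret_key_count(n):
    # One shared key per unordered pair of communicating objects: n*(n-1)/2.
    return n * (n - 1) // 2

def public_key_count(n):
    # One key pair (public + private) per object.
    return 2 * n

pairs = secret_key_count(1000)   # 499500 keys for 1000 objects: quadratic growth
keys  = public_key_count(1000)   # 2000 keys: linear growth
```

For 1000 objects the difference is 499,500 secret keys versus 2,000 public/private keys, which is why asymmetric encryption scales so much better for key management.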
Asymmetric encryption is very versatile: besides secure transmission, it can be used to sign messages.
Question: How do you sign a message and send it securely?
3.3 Asymmetric Encryption with RSA: How does it work?
♦ Rivest, Shamir and Adleman (Boston, Aug. ’77) developed the RSA algorithm.
♦ We need a one-way function (e.g. y^x mod P) with a trap door.
♦ Solution:
» Private key: p, q (both large prime numbers); Public key: N = p·q and e.
» Encryption: C = M^e mod N.
» Decryption: calculate d such that e·d ≡ 1 mod (p−1)(q−1); then M = C^d mod N.
♦ Can it be attacked? No practical attack is known: exponentiation in modular arithmetic is a one-way function, and computing p, q from N does not work either, as prime factorisation is another one-way function (it is believed to be computationally hard to factor such a number).
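The RSA recipe above can be worked through with small primes (real keys use primes of hundreds of digits; the values p = 61, q = 53, e = 17 are a standard textbook example):

```python
# Worked toy RSA example with small primes.
p, q = 61, 53                      # private: the two primes
N = p * q                          # public modulus: 3233
e = 17                             # public exponent
phi = (p - 1) * (q - 1)            # 3120, computable only if you know p and q
d = pow(e, -1, phi)                # private exponent: e*d = 1 mod (p-1)(q-1) -> 2753

M = 65                             # plain text, as a number smaller than N
C = pow(M, e, N)                   # encryption: C = M^e mod N  -> 2790
assert pow(C, d, N) == M           # decryption: M = C^d mod N recovers the message

# Signing reverses the roles: "encrypt" with the private d;
# anyone holding the public (N, e) can then verify the signature.
sig = pow(M, d, N)
assert pow(sig, e, N) == M
```

This also answers the earlier signing question: sign with your own private key, then encrypt the signed message with the recipient's public key, so only they can read it and anyone can check who sent it.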
3.3 DES, RSA and PGP – some history
♦ Equivalents of RSA (and of public-key key exchange) had been independently discovered earlier, in the early 1970s, by Ellis, Cocks and Williamson at GCHQ, the UK Government's top-secret communications HQ.
♦ Strong cryptography such as DES and RSA was not freely available to the public (export-classified as a weapon!).
♦ In 1991, Zimmermann released PGP (Pretty Good Privacy) as freeware!» And got to meet some nice fellows from the FBI…
Hybrid: Secure Sockets Layer (SSL) Protocol
♦ Used in Netscape for secure downloads.
♦ Uses RSA encryption.
♦ SSL client:
» generates a secret key for one session; that key is encrypted using the server's public key.
♦ The session key is then forwarded to the server and used for further communication between client and server.
♦ Most O-O middleware uses SSL rather than straight TCP as the transport protocol, to prevent eavesdropping on and tampering with object request traffic.
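The hybrid idea can be sketched in a few lines: RSA protects only the short session key, and all bulk traffic then uses that key with a fast symmetric cipher. (The toy RSA key below reuses the small textbook primes and is wildly insecure; it is only meant to show the message flow.)

```python
# Hybrid sketch: RSA wraps the session key; bulk traffic uses symmetric encryption.
import secrets

N, e, d = 3233, 17, 2753           # toy server RSA key pair (illustration only)

# Client side: pick a random session key and send it RSA-encrypted to the server.
session_key = secrets.randbelow(N - 2) + 2
wrapped = pow(session_key, e, N)   # only the server's private key can unwrap this

# Server side: recover the session key with the private exponent d.
unwrapped = pow(wrapped, d, N)
# From here on, both sides use session_key with a fast symmetric cipher.
```

The design choice is pragmatic: asymmetric encryption is slow, so it is used once per session to agree a key, and cheap secret-key encryption carries the actual traffic.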
Authentication: Proving you are who you claim to be.♦ In centralised systems: Password check at session start.
♦ In distributed systems:» Ensuring that each message came from claimed source.» Ensuring that each message has not been altered.» Ensuring that each message has not been replayed.
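One common way (an assumption here, not named in the slides) to get all three guarantees on each message is a keyed message-authentication code plus a sequence number: the MAC covers source and integrity, the sequence number rules out replay. A sketch with hypothetical names:

```python
# Sketch: HMAC over (sequence number + body) covers source, integrity, and replay.
import hmac, hashlib

SHARED_KEY = b"shared secret"      # assumed pre-agreed between the two parties

def tag(seq: int, body: bytes) -> bytes:
    msg = str(seq).encode() + b"|" + body
    return hmac.new(SHARED_KEY, msg, hashlib.sha256).digest()

def accept(seq: int, body: bytes, mac: bytes, last_seq: int) -> bool:
    genuine = hmac.compare_digest(mac, tag(seq, body))  # from claimed source, unaltered
    fresh = seq > last_seq                              # not a replay
    return genuine and fresh

mac = tag(1, b"debit 50")
ok = accept(1, b"debit 50", mac, last_seq=0)        # accepted
replayed = accept(1, b"debit 50", mac, last_seq=1)  # rejected: old sequence number
```

A tampered body fails the MAC check, a forged sender lacks the shared key, and a replayed message reuses an already-seen sequence number.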
5 Security Systems: Kerberos
♦ Kerberos is a network authentication protocol:
» allows users and services to authenticate themselves to each other.
♦ Based on the Needham/Schroeder protocol.
♦ Developed by Steiner et al. at MIT (1988).
♦ Used in:
» OSF/DCE (the OSF Distributed Computing Environment);
» Unix NFS;
» an adapted version of it is used in Microsoft Windows.
5 Security Systems: CORBA
Supports the following security functionality:
♦ Authentication of users.
♦ Authentication between objects.
♦ Authorisation and access control.
♦ Security auditing.
♦ Non-repudiation.
♦ Administration of security information.
Cryptography is not exposed at interfaces: the OMG has taken explicit care to avoid exposing keys and any other confidential knowledge within the specs. This was done so that the CORBA security specification would not be classified by the US Government as a weapon and, as such, be unavailable for use outside the US.
How To
♦ First get the past exams, to get a better idea of what the exam will be like.
♦ Fast revision: for each session, read:
» its introduction;
» its summary; and
» the summary for it that is at the beginning of the next session!
♦ Then read each session (+ notes!) and try to come up with questions of your own for them.
♦ Answer these questions and those in the past exams.
♦ Feel free to collaborate on this – use Cityspace.
» I will be correcting any wrong answers in Cityspace (but not providing correct answers to begin with).
Session 2 – Distributed SW Eng.♦ Distributed Systems consist of multiple components.
♦ Components are heterogeneous.♦ Components still have to be interoperable.♦ There has to be a common model for components, which expresses» component states,» component services, and» interaction of components with other components.