♦ Data are distributed » Data may have to reside on multiple computers for administrative and ownership reasons
♦ Computation is distributed » Applications take advantage of parallelism and multiple processors » Scalability and heterogeneity are characteristic features of distributed systems
♦ Users are distributed » Users communicate and interact via applications (shared objects)
♦ 1940. The British Government came to the conclusion that 2 or 3 computers would be sufficient for the UK.
♦ 1960. Mainframe computers took up a few hundred square feet.
♦ 1970. First Local Area Networks (LANs) such as Ethernet. ♦ 1980. First network cards for PCs. ♦ 1990. First wide area networks: the Internet, which evolved from the US Advanced Research Projects Agency network (ARPANET, 4 nodes in 1969) and was later fuelled by the rapid increase in network bandwidth and the invention of the World Wide Web at CERN in 1989.
A collection of components that execute on different computers. Interaction is achieved using a computer network.
A distributed system consists of a collection of autonomous computers, connected through a network and distributed operating system software, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.
♦ Non-autonomous parts: The system possesses full control. ♦ Homogeneous: Constructed using the same technology
(e.g., same programming language and compiler for all parts).
♦ Component shared by all users all the time. ♦ All resources accessible. ♦ Software runs in a single process. ♦ Single point of control. ♦ Single point of failure (either the system works or it does not).
♦ Multiple autonomous components. ♦ Heterogeneous. ♦ Components are not shared by all users. ♦ Resources may not be accessible. ♦ Software runs in concurrent processes on different
processors. ♦ Multiple Points of control. ♦ Multiple Points of failure (but more fault tolerant!).
♦ Enables local and remote information objects to be accessed using identical operations, that is, the interface to a service request is the same for communication between components on the same host and components on different hosts.
♦ Example: File system operations in Unix Network File System (NFS).
♦ A component whose access is not transparent cannot easily be moved from one host to the other. All other components that request services would first have to be changed to use a different interface.
♦ Enables information objects to be accessed without knowledge of their physical location.
♦ Example: Pages in the Web. ♦ Example: When an NFS administrator moves a partition, for instance because a disk is full, application programs that access files in that partition would have to be changed if file location is not transparent to them.
♦ Allows the movement of information objects within a system without affecting the operations of users or application programs.
♦ It is useful, as it sometimes becomes necessary to move a component from one host to another (e.g., due to an overload of the host or to a replacement of the host hardware).
♦ Without migration transparency, a distributed system becomes very inflexible as components are tied to particular machines and moving them requires changes in other components.
♦ Enables multiple instances of information objects to be used to increase reliability and performance without knowledge of the replicas by users or application programs.
♦ Enables several processes to operate concurrently using shared information objects without interference between them. Neither user nor application engineers have to see how concurrency is controlled.
♦ Example: Bank applications. ♦ Example: Database management system.
0 LAST Session Summary + additional material. 1 Motivation 2 The CORBA Object Model 3 The OMG Interface Definition Language (IDL) 4 Other Approaches 5 Summary
A distributed system consists of a collection of autonomous computers, connected through a network and distributed operating system software, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.
Certain common characteristics can be used to assess distributed systems: Resource Sharing, Openness, Concurrency, Scalability, Fault Tolerance, and Transparency
Three Tiered Architecture is an information model with distinct pieces -- client, applications services and data sources -- that can be distributed across a network.
Client Tier -- The user-facing component: it displays information and handles graphics, communications, keyboard input and local applications.
Applications Service Tier -- A set of sharable multitasking components that interact with clients and the data tier. It provides the controlled view of the underlying data sources.
Data Source Tier -- One or more sources of data such as mainframes, servers, databases, data warehouses, legacy applications etc.
Variants of the three-tier architecture:
♦ Three tier with transaction processing monitor technology
♦ Three tier with a message server
♦ Three tier with an application server
♦ Three tier with an ORB architecture (e.g., CORBA)
♦ Distributed/collaborative enterprise architecture
Model describes components, states, interactions and other concepts
OMG/IDL is a language for expressing all concepts of the CORBA object model. » separation of interface from implementation » Enables interoperability and transparency » IDL compiles into client stubs and server skeletons » Stubs and skeletons serve as proxies for clients
interface Stock {
  readonly attribute string symbol;    // Get the stock symbol.
  readonly attribute string full_name; // Get the full name.
  double price ();                     // Get the price.
};
Attributes, operations, and exceptions are properties defined in object types.
An object type captures the properties shared by similar objects; individual objects differ only in their identity and the values of their attributes.
Objects may export these properties to other objects.
Objects are instances of types.
Object types are specified through interfaces that determine the operations clients can request; that is, they define a contract that binds the interaction between client and server objects.
Operations either modify the state of an object or just compute functions of it.
Operations are used for service requests. Each operation has a signature that consists of
» a name, » a list of in, out, or inout parameters, » a return value type (result) or void if none, and » a list of exceptions that the operation can raise.
2.5 Exceptions Service requests may not be executed properly. Exceptions have a unique name and may declare additional data structures. Exceptions are used to explain (and locate) the reason for a failure to the requester of the operation. Operation execution failures may be
» generic (system), raised by the middleware, e.g., an unreachable server object; or
» specific, raised by the server object, when the execution of a request would violate the object’s integrity, e.g., not enough money in a bank account.
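The signature and exception concepts above can be sketched in Java, where a checked exception plays the role of an IDL `raises` clause. All names here (Account, withdraw, InsufficientFunds) are illustrative assumptions, not part of any CORBA specification:

```java
// A server-object-specific exception: raised when a request would
// violate the object's integrity (e.g., not enough money).
class InsufficientFunds extends Exception {
    InsufficientFunds(String msg) { super(msg); }
}

// The operation signature: name, one "in" parameter, a result type,
// and the exception the operation can raise.
interface Account {
    double withdraw(double amount) throws InsufficientFunds;
}

class SimpleAccount implements Account {
    private double balance;
    SimpleAccount(double opening) { balance = opening; }
    public double withdraw(double amount) throws InsufficientFunds {
        if (amount > balance) throw new InsufficientFunds("balance too low");
        balance -= amount;
        return balance; // result returned to the requester
    }
}

public class SignatureDemo {
    public static void main(String[] args) throws Exception {
        Account acc = new SimpleAccount(100.0);
        System.out.println(acc.withdraw(40.0));
        try {
            acc.withdraw(1000.0); // violates integrity: exception raised
        } catch (InsufficientFunds e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

Note how the declared exception lets the requester distinguish a specific failure from a generic middleware failure.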
In distributed systems, services are syntactically specified through interfaces that capture the names of the functions available together with types of the parameters, return values, possible exceptions, etc.
There is no legal way a process can access or manipulate the state of an object other than invoking methods made available to it through an object’s interface.
Behaviour of an operation as defined in a super-type may not be appropriate for a subtype.
The operation can then be re-defined in the subtype. Binding messages to operations is dynamic. The operation signature must not be changed. Operations in (abstract) super-types need not be implemented (deferred operations).
Objects can be assigned to an attribute or passed as a parameter, even though they are instances of subtypes of the attribute’s/parameter’s respective type.
Attributes, parameters and operations are polymorphic.
Example: Using polymorphism, instances of type ATM can be assigned to the attribute controls, which the subtype has inherited from Ctrl.
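Subtype polymorphism and dynamic binding can be sketched in a few lines of Java; the Till/ATM names are illustrative stand-ins for a supertype/subtype pair:

```java
// A supertype with an operation whose behaviour a subtype re-defines.
class Till {
    String describe() { return "generic till"; }
}

// The subtype re-defines the operation without changing its signature.
class ATM extends Till {
    @Override String describe() { return "automated teller machine"; }
}

public class PolymorphismDemo {
    public static void main(String[] args) {
        Till t = new ATM();               // subtype instance assigned where the supertype is expected
        System.out.println(t.describe()); // dynamic binding selects ATM's re-definition
    }
}
```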
Inspired by the electrical power grid's pervasiveness, reliability and ease of use, computer scientists in the mid-1990s began exploring the design and development of an analogous infrastructure: the computational power Grid.
To build an environment that enables » sharing, » selection, and » aggregation of a wide variety of geographically distributed resources, including » supercomputers, » storage systems and data sources, and » specialised devices owned by different organisations, for solving large-scale resource-intensive problems in science, engineering, and commerce (Buyya, 2002).
Motivation: Small computing resources such as PCs have the potential to provide vast computing power when connected. And yet…
Many of these resources lie idle most of the time. Millions of online PCs are only involved in tasks like word processing or browsing the Internet. The computing resources of many organisations are often severely under-utilised, especially outside of peak business hours.
At the same time, there are many individuals and organisations that have intensive computations to perform but only have limited access to resources that are available to execute them.
Analyze the value of an investment portfolio in minutes rather than hours?
Unite research teams with others around the world to take advantage of the most up-to-date knowledge?
Significantly accelerate the drug discovery process? Scale your business to meet cyclical demand? Cut the design time of your products in half while reducing the instances of defects?
Source: http://www-1.ibm.com/grid/about_grid/index.shtml
Global Grid Forum (http://www.gridforum.org/): community-initiated forum of 5000+ individual researchers and practitioners working on distributed computing, or "grid" technologies
GridComputing (http://www.gridcomputing.com/) myGrid (http://www.mygrid.org.uk/), an EPSRC project Platforms:
Last session we have discussed an object-oriented component model. Common properties of similar components are modeled as object types (interfaces). Services offered by distributed components are modeled as operations of these object types.
This session, we are going to consider the following problem: What communication primitives are needed in a distributed system and how are they used to implement service requests?
Interactions between components are not fully defined in the model.
There is no concept of abstract or deferred types. The model does not include primitives for the behavioural specification of operations. The semantics of the model is only defined informally. See Bastide R. et al.: “Petri Net Based Behavioural Specification of CORBA Systems.”
Lecture Notes in Computer Science, Vol. 1630: Application and Theory of Petri Nets 1999, 20th Int Conference, ICATPN'99, Williamsburg, Virginia, USA, pp. 66-85. Springer-Verlag, June 1999.
P1 builds a message in its address space and executes a system call. The operating system fetches the message and transmits it over the network to P2. Issues and agreements:
» Meaning of the bits being sent? » Volts used to signal a 0-bit or a 1-bit? » Which was the last bit sent? » Error detection? » How long are numbers, strings, etc.? » How are they represented?
At this layer, data packets are encoded and decoded into bits. It furnishes transmission protocol knowledge and management and handles errors in the physical layer, flow control and frame synchronization.
The data link layer is divided into two sub-layers: the Media Access Control (MAC) layer and the Logical Link Control (LLC) layer. » The MAC sub-layer controls how a computer on the network gains access to the data and permission to transmit it. » The LLC sub-layer controls frame synchronization, flow control and error checking.
This layer provides switching and routing technologies, creating logical paths, known as virtual circuits, for transmitting data from node to node in a WAN.
Routing means choosing the best path. Routing and forwarding are functions of this layer, as well as addressing, internetworking, error handling, congestion control and packet sequencing.
This session we are going to review two layers of the model that are important for the implementation of service requests in general, and of CORBA operation invocation requests in particular.
• Level 4 of the ISO/OSI Reference Model • Concerned with the transparent transport of information through the network • Responsible for end-to-end error recovery and flow control; it ensures complete data transfer • It is the lowest level at which messages (not packets) are handled. Messages are addressed to communication ports • Protocols may be connection-oriented or connectionless • Two facets in Unix: TCP and UDP
The transport layer implements transport of data on the basis of some network layer (the network layer itself may be implemented as the Internet Protocol (IP) or OSI's X-25 protocol).
There are a number of transport layer implementations, though the most prominent ones are TCP and UDP that are available in virtually all UNIX operating system variants.
TCP is connection-oriented. This means that a connection between two distributed components has to be maintained by the session layer.
UDP is connectionless. The session layer is not required when transport is UDP based.
TCP provides bi-directional stream of bytes (unstructured data) between two distributed components. » A component using TCP is unaware that data is broken into segments for transmission
over the network.
UNIX rsh, rcp and rlogin are based on TCP.
Reliable; often used over unreliable network protocols » (e.g., a telephone line used with the Serial Line Internet Protocol (SLIP)), » or with the Internet Protocol (IP). Applications such as ftp that need a reliable connection for prolonged periods of time establish TCP connections.
Slow! As the two ends connected by the stream may have different computation speeds, TCP buffers the stream so that the two processes are (partially) decoupled.
• When a data segment is received correctly at destination, an acknowledgement (ACK) segment is sent to the sending TCP
• ACK contains sequence number of the last byte correctly received incremented by 1
• The network can fail to deliver a segment. If the sending TCP waits for too long for an acknowledgement, it times out and re-sends the segment, on the assumption that the datagram has been lost
• Then network can potentially deliver duplicated segments, and can deliver segments out of order. TCP buffers out of order segments or discards duplicates, using byte count for identification
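The "bi-directional stream of bytes" view of TCP described above can be sketched with a loopback echo connection in Java; the port is chosen by the OS and the message text is illustrative. The component sees only a stream — segmentation, acknowledgements and retransmission happen below this API:

```java
import java.io.*;
import java.net.*;

public class TcpEchoDemo {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // OS picks a free port
            // Receiver end of the connection: echoes one line back.
            Thread echo = new Thread(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(s.getInputStream()));
                     PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                    out.println(in.readLine());
                } catch (IOException ignored) {}
            });
            echo.start();
            // Sender end: writes bytes into the stream, reads the reply.
            try (Socket client = new Socket("localhost", server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                out.println("hello over tcp"); // bytes enter the stream...
                System.out.println(in.readLine()); // ...and arrive complete and in order
            }
            echo.join();
        }
    }
}
```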
This layer provides independence from differences in data representation (e.g., encryption) by translating from application to network format, and vice versa.
The presentation layer works to transform data into the form that the application layer can accept.
This layer formats and encrypts data to be sent across a network, providing freedom from compatibility problems. It is sometimes called the syntax layer.
There is a considerable mismatch between the complex types used at the application layer, such as records, lists and unions of other complex types in IDL, and those that can be transported by TCP and UDP.
A further complication arises from the fact that atomic types are represented differently on different hardware platforms.
The task of the presentation layer is to resolve this heterogeneity and to transform complex data structures into forms that are suitable for transport layers such as TCP and UDP.
Different hardware and operating system platforms use different representations for elementary data types such as integers and characters: » Most modern operating systems represent 16-bit integers as two bytes, with the most significant byte first. Older machines, such as IBM mainframes, represent these integers exactly the other way around.
» There are also different encodings for character sets. Characters may be encoded as 7-bit ASCII, in the ISO 8-bit character set or in the emerging 16-bit representation, which accounts for the representation of Asian characters as well.
Distributed operating systems resolve these differences within the presentation layer so as to enable heterogeneous components to communicate with each other.
Big endian means that the most significant byte of any multibyte data field is stored at the lowest memory address, which is also the address of the larger field (Sun's SPARC, Motorola 68K, the Java Virtual Machine).
Little endian means that the least significant byte of any multibyte data field is stored at the lowest memory address, which is also the address of the larger field (Intel 80x86 processors).
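The two byte orders can be made visible with java.nio.ByteBuffer, which lets the same 16-bit value be laid out either way:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    public static void main(String[] args) {
        short value = 0x1234;
        ByteBuffer big = ByteBuffer.allocate(2).order(ByteOrder.BIG_ENDIAN);
        ByteBuffer little = ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN);
        big.putShort(value);
        little.putShort(value);
        // Big endian: most significant byte (0x12) at the lowest address.
        System.out.printf("big:    %02x %02x%n", big.get(0), big.get(1));
        // Little endian: least significant byte (0x34) at the lowest address.
        System.out.printf("little: %02x %02x%n", little.get(0), little.get(1));
    }
}
```

Two components on platforms with different native orders must agree on one order (or a common representation such as XDR) before exchanging such a value.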
There are different approaches. One is to convert data during marshalling into a common, well-defined representation. An example of this is Sun's External Data Representation (XDR), which is used in most Remote Procedure Call (RPC) implementations. » For each platform, a mapping between the common and the platform-specific representation is provided.
Another approach is the Abstract Syntax Notation ASN.1 that was standardised by the CCITT. It provides a notation for including the type definition together with each value into the marshalled representation.
Marshalling flattens complex data structures into a transportable representation, usually a stream of bytes, which may be split into a sequence of messages if necessary.
The stream of bytes not only contains the data itself, but also meta-information, such as the length of a certain entry, or an encoding for its types.
The presentation layer at the receiving component then performs the reverse mapping, which is called unmarshalling. It reconstructs the complex type from data and meta data that is included in the stream received.
Note, that marshalling in practice is rarely programmed manually. It is being taken care of by the distributed operating system, such as an ORB in CORBA.
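Marshalling and unmarshalling can be sketched with Java object serialization standing in for an ORB's presentation layer; the serialized stream contains both the data and meta-information (types, lengths), as described above. The helper names are illustrative:

```java
import java.io.*;

public class MarshalDemo {
    // Flatten a complex structure into a transportable stream of bytes.
    static byte[] marshal(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o); // writes data plus meta-information
        }
        return bytes.toByteArray();
    }

    // Reverse mapping: reconstruct the structure from data and meta-data.
    static Object unmarshal(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        java.util.List<String> original = java.util.Arrays.asList("alpha", "beta");
        byte[] wire = marshal(original);   // what would travel over TCP/UDP
        Object copy = unmarshal(wire);     // receiving side rebuilds the list
        System.out.println(copy);
    }
}
```

In CORBA or RMI this code is generated or hidden by the middleware, which is exactly the point made above: marshalling is rarely programmed manually.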
This layer supports application and end-user processes. Communication partners are identified, quality of service is identified, user authentication and privacy are considered, and any constraints on data syntax are identified. Everything at this layer is application-specific.
This layer provides application services for file transfers, e-mail, and other network software services. Telnet and FTP are applications that exist entirely in the application level. Tiered application architectures are part of this layer.
Bi-directional communication. The sender expects the delivery of a result from the receiver. Requester receives reply message. Request/reply messages contain marshalled parameters/results.
It is hard to prove whether or not a system is deadlock-free and most distributed operating systems therefore do not do much about them and leave it to the designer to avoid them.
To avoid deadlocks: Waits-for relation has to be acyclic!
With asynchronous message delivery, the sender does not wait until the receiver has acknowledged the receipt of the message delivery, but continues as soon as the message has been passed to the local transport layer.
It may be delayed still, if message buffers of the transport layer are exhausted.
» The sender and receiver are decoupled and do not depend on each other.
» This usually results in a higher degree of concurrency between sender and receiver and increases the overall distributed system performance.
» The most important advantage is probably that the system is less likely to run into a deadlock.
» The sender does not know whether or not the receiver has actually received the message. Asynchronous delivery can therefore not reasonably be used together with unreliable transport layer implementations.
» Additional overhead is required if the message order has to be maintained.
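Asynchronous delivery as described above can be sketched with a bounded queue standing in for the transport layer's message buffer; the names are illustrative. The sender returns as soon as the message is buffered, and blocks only if the buffer is exhausted:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AsyncSendDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for the local transport layer's message buffer.
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(16);

        // Sender: continues as soon as the message is in the buffer,
        // without waiting for the receiver to acknowledge receipt.
        buffer.put("request-1");
        System.out.println("sender continues immediately");

        // Receiver: picks the message up later, decoupled from the sender.
        System.out.println("received: " + buffer.take());
    }
}
```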
If the client can cope with the “maybe” quality of service, the client may not want to wait for the server to finish the service. This protocol, however, is unsuitable if the service has to return data or the client has to know what happened to the service execution.
The advantages are that » only one message is involved, so the network is not unnecessarily overloaded, and » the client can continue execution as soon as delivery of the message has been acknowledged (by the local transport layer for an asynchronous send, or by the receiver for a synchronous send).
To be applied if the client expects a result from the server. The client requests service execution from the server through a request message; the service result is delivered in a reply message. If the reply message is not received after a certain period of time, this can have many reasons (the server has not finished the execution yet; the reply message has been lost).
Servers therefore keep a history of reply messages and clients may resend the request and the server then resends the reply.
3.4 RRA Protocol Depending on the number of client/server communication cycles, the maintenance of a history may involve serious overhead! The RRA protocol is designed to limit this overhead. RRA adds to RR an additional acknowledgement message, which is sent by the client as soon as a reply has been received. The receipt of an acknowledgement message enables the server to dump the reply message of that communication cycle (and all previously unacknowledged replies).
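The server-side reply history used by RR, and the pruning that RRA's acknowledgement enables, can be sketched as follows; request ids and method names are illustrative assumptions:

```java
import java.util.HashMap;
import java.util.Map;

public class ReplyHistoryDemo {
    // RR: the server caches replies keyed by request id, so a re-sent
    // request gets the cached reply instead of being re-executed.
    private final Map<Integer, String> history = new HashMap<>();

    String handle(int requestId, String request) {
        return history.computeIfAbsent(requestId, id -> "reply-to-" + request);
    }

    // RRA: the client's acknowledgement lets the server dump the reply.
    void acknowledge(int requestId) {
        history.remove(requestId);
    }

    public static void main(String[] args) {
        ReplyHistoryDemo server = new ReplyHistoryDemo();
        System.out.println(server.handle(1, "debit")); // executed and cached
        System.out.println(server.handle(1, "debit")); // duplicate: cached reply re-sent
        server.acknowledge(1);                         // history entry dropped
        System.out.println(server.history.size());
    }
}
```

A real RR implementation would also prune old entries by timeout; this sketch only shows why the history exists and how RRA bounds it.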
Further Reading:
Object Management Group. Common Object Request Broker: Architecture and Specification. Rev. 2.0, Chapter 3: OMG IDL Syntax and Semantics. Framingham, Mass., July 1995 (available at http://www.omg.org/).
International Telecommunication Union. CCITT Recommendation X.720: Information Technology - Open Systems Interconnection - Structure of Management Information: Management Information Model. Geneva, Switzerland. 1993 (available at http://www.itu.ch/).
Microsoft’s Distributed Component Object Model. Information at http://www.microsoft.com/com/
4. Transport Layer connects two distributed components and isolates upper layers from concerns as to how reliable lower layers are. Responsible for end-to-end error recovery; it ensures complete data transfer.
3. Network Layer isolates the higher layers from routing and switching considerations
2. Data Link Layer Maps the physical circuit (the cable) and converts it into a point-to-point link that appears relatively error-free (checksums, parity checking is done here)
1. Physical Layer Concerned with transmission of bits over a physical circuit
7. Application Layer concerned with distributed components and their interaction. CORBA objects and their interactions are one example. Remote procedure calls are another
6. Presentation Layer has to resolve differences in information representation between distributed components. (Only needed for connection-oriented protocols)
5. Session Layer provides facilities to support and maintain associations between two or more distributed components
Java is an object-oriented programming language developed by Sun Microsystems that is both compiled and interpreted: A Java compiler creates byte-code, which is interpreted by a virtual machine (VM).
Java is portable: Different VMs interpret the byte-code on different hardware and operating system platforms.
In RMI, the Client object does not directly instantiate the Service,
BUT gets a reference to its interface through the RMI Naming service.
This interface hooks up the client system to the server through a series of layers and proxies until it reaches the actual methods provided by the Service object.
2.33 The Interface: Advantages There are several advantages to using an interface that make RMI a more robust platform. » Security by preventing decompiling. » The interface is significantly smaller than the actual remote object's class, so the client is lighter in weight. » Maintainability: if changes are made to the underlying remote object, they need to be propagated to the clients, otherwise serious errors can occur.
» From an architectural standpoint, the interface is cleaner. The code in the remote object will never run on the client, and the interface acts appropriately as a contract between the caller and the class performing the work remotely.
To write an RMI application, proceed as follows:
1) Define a remote interface (server services) by extending java.rmi.Remote and have its methods throw java.rmi.RemoteException.
2) Implement the remote interface. You must provide a Java server class that implements the interface; it must be derived from the class java.rmi.UnicastRemoteObject.
3) Compile the server class using javac.
4) Run the stub compiler rmic against your (.class) file to generate client stubs and server skeletons for your remote classes (remember: proxies for marshalling and unmarshalling).
5) Start the RMI registry on your server (call rmiregistry &). The registry retrieves and registers server objects. In contrast to CORBA, it is not persistent.
6) Start the server object and register it with the registry using the bind method in java.rmi.Naming.
7) Write the client code using java.rmi.Naming to locate the server objects.
8) Compile the client code using javac.
9) Start the client.
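Step 1 above can be sketched as follows. The Hello/sayHello names match the HelloImpl example later in these notes; the local HelloCheck class is an illustrative assumption, added only so the contract can be exercised without a running registry:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Step 1: the remote interface. It extends Remote, and each method
// declares java.rmi.RemoteException.
interface Hello extends Remote {
    String sayHello() throws RemoteException;
}

// A plain local implementation, just to show the contract. A real server
// would extend UnicastRemoteObject and be registered with rmiregistry.
public class HelloCheck implements Hello {
    public String sayHello() { return "Hello World!"; }

    public static void main(String[] args) throws Exception {
        Hello h = new HelloCheck();
        System.out.println(h.sayHello());
    }
}
```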
RemoteException superclass for exceptions specific to remote objects thrown by RMI runtime (broken connection, a reference mismatch, e.g.)
The Remote interface embraces all remote objects (Does not define methods, but serves to flag remote objects)
The RemoteObject class corresponds to Java’s Object class. It implements remote versions of methods such as hashCode, equals, toString
The class RemoteServer provides methods for creating and exporting servers (e.g., getClientHost, getLog); i.e., it is the common superclass of server implementations and provides the framework to support a wide range of remote reference semantics.
UnicastRemoteObject: your server must directly or indirectly extend this class and inherit its remote behaviour. It implements a special kind of server with the following characteristics: » all references to remote objects are only valid during the life of the process that created the remote object; » it requires a TCP connection-based protocol; » parameters, invocations, etc. are communicated via streams.
Implementation of the server (constructor, method, main):

  public HelloImpl(String s) throws RemoteException { super(); name = s; }

  public String sayHello() throws RemoteException { return "Hello World!"; }

  public static void main(String[] args) {
      System.setSecurityManager(new RMISecurityManager());
      try {
          HelloImpl obj = new HelloImpl("HelloServer");
          Naming.rebind("//myhost/HelloServer", obj); // or "localhost"
      } catch (Exception e) { /* handle error */ }
  }
The class Naming is the bootstrap mechanism for obtaining references to remote objects, based on Uniform Resource Locator (URL) syntax. The URL for a remote object is specified using the usual host, port and name:

  rmi://host:port/name

  host = host name of the registry (defaults to the current host)
  port = port number of the registry (defaults to the registry port number)
  name = name for the remote object
A registry exists on every node that allows RMI connections to servers on that node. The registry on a particular node contains a transient database that maps names to remote objects. When the node boots, the registry database is empty. The names stored in the registry are pure and are not parsed. A service storing itself in the registry may want to prefix its name of the service by a package name (although not required), to reduce name collisions in the registry.
A problem that arises is to locate that server in a network which supports the program with the desired remote procedures.
This problem is referred to as binding. Binding can be done statically or dynamically. The binding
we have seen in the last example was static because the hostname was determined at compile time.
Static binding is fairly simple, but seriously limits migration and replication transparency.
With dynamic binding the selection of the server is performed at run-time. This can be done in a way that migration and replication transparency is retained.
There is limited support for dynamic server location with the LocateRegistry class, which is used to obtain the bootstrap Registry on some host. Usage (minus exception handling):
  // Server wishes to make itself available to others:
  SomeSRVC service = ...; // remote object for the service
  Registry registry = LocateRegistry.getRegistry();
  registry.bind("I Serve", service);

  // The client wishes to make requests of the above service:
  Registry registry = LocateRegistry.getRegistry("foo.services.com");
  SomeSRVC service = (SomeSRVC) registry.lookup("I Serve");
  service.requestService(...);
Programs can be easily migrated from one server to another and be replicated over multiple hosts with full transparency for clients.
The client process’s role is to invoke the method on a remote object. The only two things that are necessary for this to happen are the remote interface and stub classes.
The server, which “owns” the remote object in its address space, requires all parts of the RMI interchange.
When the client wants to invoke a method on a remote object, it is given a surrogate that implements the same interface, the stub. The client gets this stub from the RMI server as a serialized object and reconstitutes it using the local copy of that class.
The third part of the system is the object registry. When you register objects with the registry, clients are able to obtain access to them and invoke their methods.
The purpose of the stub on the client is to communicate via serialized objects with the registry on the server. It becomes the proxy for communication back to the server.
Summary The critical parts of a basic RMI system include the client, server, RMI registry, remote object and its matching stub, skeleton and interface.
A remote object must have an interface to represent it on the client, since it will actually only exist on the server. A stub which implements the same interface acts as a proxy for the remote object.
The server is responsible for making its remote objects available to clients by instantiating and registering them with Naming service.
The Remote Method Invocation (RMI) is a Java system that can be used to easily develop distributed object-based applications. RMI, which makes extensive use of object serialization, can be expressed by the following formula:
RMI = Sockets + Object Serialization + Some Utilities
The utilities are the RMI registry and the compiler to generate stubs and skeletons.
If you are familiar with RMI, you will know that developing distributed object-based applications in RMI is much simpler than using sockets.
So why bother with sockets and object serialization then?
• The advantages of RMI in comparison with sockets are: • Simplicity: RMI is much easier to work with than sockets • No protocol design: unlike sockets, when working with RMI there is no need to worry about designing a protocol between the client and server -- a process that is error-prone.
• The simplicity of RMI, however, comes at the expense of the network. • There is a communication overhead involved when using RMI and that is due to the RMI registry and client stubs or proxies that make remote invocations transparent. For each RMI remote object there is a need for a proxy, which slows the performance down.
0.0 Review: RMI
♦ RMI – Remote Method Invocation » RPC in Java technology and more » A concrete programming technology » Designed to solve the problems of writing and organizing executable code » Native to Java, an extension of the core language » Benefits from specific features of Java
0.1 RMI: Benefits
♦ Invoke object methods and have them execute on remote Java Virtual Machines (JVMs)
♦ Entire objects can be passed and returned as parameters » Unlike many other remote-procedure-call-based mechanisms, which require either primitive data types as parameters, or structures composed of primitive data types
♦ New Java objects can be passed as parameters » Can move behavior (class implementations) from client to server and from server to client
0.2 RMI: Benefits
♦ Enables use of design patterns » Use the full power of object-oriented technology in distributed computing, such as two- and three-tier systems (pass behavior and use OO design patterns)
♦ Safe and secure » RMI uses built-in Java security mechanisms
♦ Easy to write / easy to use » A remote interface is an actual Java interface
♦ Distributed garbage collection » Collects remote server objects that are no longer referenced by any client in the network
0.4 Developing RMI
♦ Define a remote interface
» Define a remote interface that specifies the signatures of the methods to be provided by the server and invoked by clients.
» It must be declared public, in order for clients to be able to load remote objects which implement the remote interface.
» It must extend the Remote interface, to fulfil the requirement for making the object a remote one.
» Each method in the interface must declare java.rmi.RemoteException in its throws clause.
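As a sketch, the three rules above can be seen in a minimal remote interface; the Greeter name and its method are hypothetical, not part of any real application:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical remote interface illustrating the three rules:
// declared public, extends Remote, and every method declares RemoteException.
public interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}
```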
Developing RMI
♦ Implement the remote interface
♦ Develop the server
» Create an instance of RMISecurityManager and install it
» Create an instance of the remote object
» Register the object created with the RMI registry
♦ Develop the client
» First obtain a reference to the remote object from the RMI registry

Developing RMI
♦ Running the application
» Generate stubs and skeletons – rmic
» Compile the server and the client – javac
» Start the RMI registry – rmiregistry
» Start the server and the client
1.0 The Object Management Group
♦ The OMG is a non-profit consortium created in 1989 with the purpose of promoting the theory and practice of object technology in distributed computing systems, to reduce complexity, lower costs, and hasten the introduction of new software applications.
♦ Originally formed by 13 companies, OMG membership grew to over 500 software vendors, developers and users.
♦ OMG realises its goals by creating standards which allow interoperability and portability of distributed object-oriented applications. It does not produce software or implementation guidelines.
1.3 CORBA Concepts
♦ CORBA’s theoretical underpinnings are based on three important concepts:
» An object-oriented model
» An open distributed computing environment
» Component integration and reuse
♦ CORBA provides:
» Uniform access to services
» Uniform discovery of resources and object names
» Uniform error handling methods
» Uniform security policies
1.4 The OMG Object Model
♦ The OMG Object Model defines common object semantics for specifying the externally visible characteristics of objects in a standard and implementation-independent way.
♦ In this model, clients request services from objects (which will also be called servers) through a well-defined interface.
♦ This interface is specified in OMG IDL (Interface Definition Language). A client accesses an object by issuing a request to the object.
♦ The request is an event, and it carries information including an operation, the object reference of the service provider, and actual parameters (if any).
1.5 About CORBA Objects
♦ CORBA objects differ from typical objects in three ways:
» CORBA objects can run on any platform.
» CORBA objects can be located anywhere.
» CORBA objects can be written in any language that has an IDL mapping.
♦ A CORBA object is a virtual programming entity that consists of an identity, an interface, and an implementation, which is known as a Servant.
» It is virtual in the sense that it does not really exist unless it is made concrete by an implementation written in a programming language.
1.6 Objects and Applications
♦ CORBA applications are composed of objects.
♦ Typically, there are many instances of an object of a single type – for example, an e-commerce website would have many shopping cart object instances, all identical in functionality but differing in that each is assigned to a different customer and contains data representing the merchandise that its particular customer has selected.
♦ For other types, there may be only one instance. When a legacy application, such as an accounting system, is wrapped in code with CORBA interfaces and opened up to clients on the network, there is usually only one instance.
2.2 CORBAservices
♦ CORBAservices
» Provide basic functionality that almost every object needs
– Naming Service – name binding, associating names and references
– Event Service – asynchronous event notification
– Concurrency Control Service – mediates simultaneous access
♦ CORBAfacilities (sometimes called Horizontal CORBAfacilities)
» Between CORBAservices and Application Objects
» Potentially useful across business domains
– Printing, Secure Time Facility, Internationalization Facility, Mobile Agent Facility
2.3 OMA model
♦ Domain (Vertical) CORBAfacilities
» Domain-based; provide functionality for specific domains such as telecommunications, electronic commerce, or health care.
♦ Application Objects
» Topmost part of the OMA hierarchy
» Customised for an individual application, so they do not need standardisation
2.5 CORBA Architecture
♦ A general CORBA request structure: a request travels from a client to an object implementation over IIOP.
♦ A request consists of:
» Target object (identified by a unique object reference)
» Operation
» Parameters (the input, output, and inout parameters defined for the operation; may be specified individually or as a list)
» Optional request context
» Results (the result values returned by the operation)
♦ CORBA provides both static and dynamic interfaces to its services.
» This came about because there were two strong proposals: one from HyperDesk and Digital based on a dynamic API, and one from Sun and HP based on a static API. “Common” stands for the merged two-API proposal.
2.7 Object Request Broker, ORB
♦ The core of CORBA: middleware that establishes the client/server relationship between objects.
♦ This is the object manager in CORBA, the software that implements the CORBA specification (it implements the session, transport and network layers) and provides object location transparency, communication and activation, i.e. it will:
» Find the object implementation for a request (providing location transparency)
» Prepare the object implementation to receive the request
» Communicate the data making up the request
(Vendors & products: ORBIX from IONA, VisiBroker from Inprise, JavaIDL from JavaSoft)
2.8 CORBA Architecture: ORB
♦ On the client side the ORB is responsible for:
» Accepting requests for a remote object
» Finding the implementation of the object
» Creating a client-side reference to the remote object (converted to a language-specific form, e.g. a Java stub object)
» Routing client method calls through the object reference to the object implementation
♦ On the server side the ORB:
» Lets object servers register new objects
» Receives requests from the client ORB
» Uses the object’s skeleton interface to invoke the object activation method
» Creates a reference for the new object and sends it back to the client
♦ Client Stubs
» Provide the static interfaces to object services. These precompiled stubs define how clients invoke the corresponding services on the server. From a client’s perspective, the stub acts like a local call – it is a local proxy for a remote server object. Stubs are generated by the IDL compiler (there are as many stubs as there are interfaces!).
♦ Server Skeleton
» Provides static interfaces to each service exported by the server. It performs the unmarshalling and the actual method invocation on the server object.
♦ ORB Interface
» Interface to a few ORB operations common to all objects, e.g. an operation which returns an object’s interface type.
♦ Object -- This is a CORBA programming entity that consists of an identity, an interface, and an implementation, which is known as a Servant.
♦ Servant -- This is an implementation programming-language entity that defines the operations that support a CORBA IDL interface. Servants can be written in a variety of languages, including C, C++, Java, Smalltalk, and Ada.
♦ Client -- This is the program entity that invokes an operation on an object implementation. Accessing the services of a remote object should be transparent to the caller. Ideally, it should be as simple as calling a method on an object. The remaining components help to support this level of transparency.
2.11 CORBA Architecture: DII
♦ Dynamic Invocation Interface (DII)
» Static invocation interfaces are determined at compile time, and they are presented to the client using stubs.
» The DII allows client applications to use server objects without knowing the type of those objects at compile time.
– The client obtains an instance of a CORBA object and makes invocations on that object by dynamically creating requests.
» The DII uses the interface repository to validate and retrieve the signature of the operation on which a request is made.
2.12 CORBA Architecture: DSI
♦ Dynamic Skeleton Interface (DSI)
» The server-side dynamic skeleton interface.
» Allows servers to be written without skeletons, or without compile-time knowledge of which objects will be called remotely.
» Provides a runtime binding mechanism for servers that need to handle incoming method calls for components that do not have IDL-based compiled skeletons.
» Useful for implementing generic bridges between ORBs.
» Also used for interactive software tools based on interpreters and for distributed debuggers.
2.13 CORBA Architecture: IR
♦ Interface Repository
» Allows you to obtain and modify the descriptions of all registered component interfaces (methods supported and their parameters, i.e. method signatures).
» It is a run-time distributed database that contains machine-readable versions of the IDL interfaces.
» Interfaces can be added to the interface repository.
» Enables clients to:
– locate an object that is unknown at compile time
– find information about its interface
– build a request to be forwarded through the ORB
♦ Object Adapter
» Purpose: interfaces an object’s implementation with its ORB.
» The primary way that an object implementation accesses services provided by the ORB.
» Sits on top of the ORB’s core communication services and accepts requests for service on behalf of server objects, passing requests to them and assigning them IDs (object references).
» Registers the classes it supports and their run-time instances with the implementation repository.
» In summary, its duties are:
– Object reference generation and interpretation, method invocation, security of interactions, and implementation of object activation and deactivation.
2.14 Implementation Repository
♦ Provides a run-time repository of information about the classes a server supports, the objects that are instantiated, and their IDs.
♦ Also serves as a common place to store additional information associated with implementations of ORBs,
» e.g. trace information, audit trails and other administrative data.
♦ Clients perform requests using object references.
♦ Clients may issue requests through object interface stubs (static) or through the dynamic invocation interface (dynamic).
» CORBA specifies GIOP, a high-level standard protocol for communication between ORBs.
♦ The General Inter-ORB Protocol (GIOP) is a collection of message requests an ORB can make over a network.
♦ GIOP maps ORB requests to different transports:
» The Internet Inter-ORB Protocol (IIOP) uses TCP/IP to carry the messages, and hence fits well into the Internet world.
» Environment-Specific Inter-ORB Protocols (ESIOPs) complement GIOP, enabling interoperability with environments that do not have CORBA support.
3.3 Other Features
♦ CORBA Messaging
» CORBA 2.0 provides three different techniques for operation invocations:
– Synchronous: the client invokes an operation, then pauses, waiting for a response.
– Deferred synchronous: the client invokes an operation, then continues processing. It can go back later to either poll or block waiting for a response.
– One-way: the client invokes an operation, and the ORB provides a guarantee that the request will be delivered. In one-way operation invocations there is no response.
3.4 New Features
» Two newer, enhanced invocation mechanisms are introduced:
– Callback: the client supplies an additional object reference with each request invocation. When the response arrives, the ORB uses that object reference to deliver the response back to the client.
– Polling: the client invokes an operation that immediately returns a valuetype that can be used to either poll or wait for the response.
» The callback and polling techniques are available for clients using statically typed stubs generated from IDL interfaces (not for the DII).
4.2 Example: Hello World Server

import HelloApp.*;
import org.omg.CosNaming.*;
import org.omg.CosNaming.NamingContextPackage.*;
import org.omg.CORBA.*;

class HelloServant extends _HelloImplBase {
    public String sayHello() {
        return "\nHello world !!\n";
    }
}

public class HelloServer {
    public static void main(String args[]) {
        try {
            // create and initialize the ORB
            ORB orb = ORB.init(args, null);

            // create the servant and register it with the ORB
            HelloServant helloRef = new HelloServant();
            orb.connect(helloRef);

            // get the root naming context
            org.omg.CORBA.Object objRef =
♦ Object type Object.
♦ Initialisation of the object request broker.
♦ Initialisation of client / server applications.
♦ Programming interface to the interface repository.
4.6 Object Identification
♦ Objects are uniquely identified by object identifiers.
♦ Object identifiers are persistent.
♦ Identifiers can be externalised (converted into a string) and internalised.
♦ Identifiers can be obtained:
» from a naming or a trading service,
» by reading attributes,
» from an operation result, or
» by internalising an externalised reference.
♦ IDL operations are handled synchronously.
♦ For notifications, it may not be necessary to await the server, if the operation does not:
» have a return value,
» have out or inout parameters, and
» raise specific exceptions.
♦ Such a notification can be implemented as a oneway operation in IDL.
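Such a notification might be declared in IDL along the following lines; the Logger interface and its operation are hypothetical examples, not part of any standard service:

```idl
// Hypothetical IDL: a notification with no return value, no out/inout
// parameters and no raises clause may be declared as a oneway operation.
interface Logger {
    oneway void notify_event(in string message);
};
```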
CORBA AND JAVA
♦ 1997: RMI introduced with JDK 1.1.
♦ 1998: JavaIDL with JDK 1.2 – a Java ORB supporting IIOP. The ORB also supports RMI over IIOP ⇒ remote objects written in the Java programming language are accessible from any language via IIOP.
♦ CORBA provides the network transparency, Java provides the implementation transparency.
6.1 RMI vs CORBA
♦ RMI is a Java-centric distributed object system. Currently the only way to integrate code written in other languages into an RMI system is to use the Java native-code interface to link a remote object implementation in Java to C or C++ code. This is possible, but complicated.
♦ CORBA, on the other hand, is designed to be language-independent. Object interfaces are specified in a language that is independent of the actual implementation language. This interface description can then be compiled into whatever implementation language suits the job and the environment.
♦ Relatively speaking, RMI can be easier to master than CORBA, especially for experienced Java programmers. CORBA is a rich, extensive family of standards and interfaces, and delving into the details of these interfaces is sometimes overkill for the task at hand.
6.3 RMI vs CORBA (ctd.)
♦ CORBA is a more mature standard than RMI and has had time to gain richer implementations. The CORBA standard is fairly comprehensive in terms of distributed objects, and there are CORBA implementations that provide many more services and distribution options than RMI or Java. The CORBA Services specifications, for example, include comprehensive high-level interfaces for naming, security, and transaction services.
♦ So which is better, CORBA or RMI? Basically, it depends. If you are building a system from scratch, with no hooks to legacy systems and fairly mainstream requirements in terms of performance and other language features, then RMI may be the most effective and efficient tool for you to use.
♦ On the other hand, if you are linking your distributed system to legacy services implemented in other languages, or if there is a possibility that subsystems of your application will need to migrate to other languages in the future, or if your system depends strongly on services that are available in CORBA but not in RMI, or if critical subsystems have highly specialised requirements that Java cannot meet, then CORBA may be your best bet.
Taking Stock: Module Outline
1 Motivation
2 Distributed Software Engineering
3 Communication
4 RMI
5 CORBA vs RMI
6 Building Distributed Systems with CORBA – Common Problems in Distributed Systems
7 Naming and Trading
8 Concurrent Processes and Threads
9 Transactions
10 Security
0.0 CORBA IDL
♦ CORBA IDL is very expressive and widely available on many platforms for different programming languages. This has motivated the use of CORBA as a mechanism to explain, study and experiment with principles of distributed systems.
0.3 Last Session Summary
♦ Revisited CORBA/IDL
♦ Static vs Dynamic Invocation
♦ Interface Repository
♦ Dynamic Invocation Interface (DII)
♦ Dynamic Skeleton Interface (DSI)
♦ Basic Object Adapter
♦ CORBA Communication and the IIOP Protocol
♦ Hello World Example
♦ Compare and Contrast: CORBA and Java RMI
Outline
♦ To actually develop distributed systems, an IDL is not sufficient. The operations declared in the interface need to be implemented in order to be used.
♦ For both the implementation and the use of distributed operations, bindings to existing programming languages need to be defined. The standardisation of these programming language bindings then facilitates interoperability between distributed objects that are implemented in different programming languages, forming so-called polylingual applications.
♦ A further prerequisite for distributed object-oriented applications is the ability to create distributed objects in a location-transparent way. Moreover, objects may have to be copied or relocated, and in doing so may have to be migrated to different platforms. Objects may also have to be removed.
1.1 Polylingual Applications
♦ Distributed computing frameworks such as CORBA are not only used for the construction and a-priori integration of new components. They are probably more often used for the a-posteriori integration of applications from existing components.
♦ Polylingual applications have components in different programming languages.
♦ To achieve interoperability between these components, language bindings are needed that map different language concepts onto each other.
♦ Problem: with n different languages, n(n-1) different language bindings are needed.
♦ Solution: one language (such as IDL) acts as a mediator. This requires only n bindings.
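The counting argument can be made concrete with a small sketch; the class and method names are illustrative only, and the numbers are the point:

```java
// With n languages and direct (directional) pairwise bindings, n*(n-1)
// bindings are needed; with one mediator language such as IDL, only n.
public class BindingCount {
    static int direct(int n) { return n * (n - 1); }
    static int viaMediator(int n) { return n; }

    public static void main(String[] args) {
        System.out.println(direct(5));      // 5 languages pairwise: 20 bindings
        System.out.println(viaMediator(5)); // via a mediator IDL: 5 bindings
    }
}
```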
1.2 Standardisation of Bindings
♦ Facilitates portability:
» If different ORB vendors used different programming language bindings, neither object implementations nor clients of these implementations would be portable. As this is very undesirable, the OMG has standardised a number of language bindings.
» ORB vendors must respect these language bindings to be able to claim that they are CORBA compliant.
♦ Decreases the learning curve for developers:
» Developers who have studied one language binding do not have to learn the binding again if they switch to an ORB from another vendor.
For an object request broker product to comply with the CORBA standard, it is sufficient to provide one of these bindings. Most brokers, however, provide more than one binding. Nevertheless, no product is currently available that implements all bindings.
1.4 What Bindings Need to Address
♦ Atomic data types and type constructors
♦ Constants
♦ Interfaces and multiple inheritance
♦ Object references
♦ Attribute accesses
♦ Operation execution requests
♦ Exceptions
♦ Invocation of ORB operations
1.5.1 Modules
♦ As an example, assume that an interface Account is included in the IDL module BankApplication. This interface will be represented in Java as the class Account inside the package BankApplication. From outside the package, the class can be accessed as BankApplication.Account.
♦ Note that in this way the avoidance of name clashes is supported, which makes the approach particularly useful for the construction of large distributed systems.
IDL                                Java
short / unsigned short             short
long / unsigned long               int
long long / unsigned long long     long
float                              float
double                             double
char                               char
boolean                            boolean
octet                              byte
string                             String
1.5.2 Atomic Types (ctd.)
♦ Most atomic types map naturally to Java.
♦ Java’s platform independence is of great value here. In the IDL-to-C++ mapping, for example, there is the problem of different representations on different platforms: shorts can be 32 or 64 bits on Unix and 16 bits on PCs, and the significance of a byte may differ (little-endian vs big-endian architectures). Therefore IDL-to-C++ does not map anything to atomic C++ types. Java does not have this problem because the Java Virtual Machine is standardised.
1.5.3 Enumerations
♦ IDL provides an enumeration type:
» An ordered list of identifiers whose values are assigned in ascending order according to their position in the enumeration.

module Addresses {
    enum Sex {male, female};
};

♦ Java has no enumeration type and therefore has to implement an enumeration as a class!
♦ The class provides constants of the enumeration type, which is internally realised as integers.
♦ Additionally, the Java class provides a method to convert integers to the enumeration type.
Enumeration (2)
♦ One shortcoming of Java is the missing enumeration type. An IDL enumeration is mapped to an enumeration class in Java.
♦ Example: the above IDL enumeration is implemented by the Java code below. The constants can be accessed as Sex.male and Sex.female. Integers (0 and 1 in this case) can be translated to the enumeration type via from_int.

package Addresses;
public final class Sex implements java.lang.Cloneable {
    public static final int _male = 0;
    public static final Sex male = new Sex(_male);
    public static final int _female = 1;
    public static final Sex female = new Sex(_female);
    public static final Sex IT_ENUM_MAX = new Sex(Integer.MAX_VALUE);

    public int value() { return ___value; }

    public static Sex from_int(int value) {
        switch (value) {
            case _male   : return male;
            case _female : return female;
            default      : throw new org.omg.CORBA.BAD_PARAM("Enum out of range");
        }
    }

    private Sex(int value) { ___value = value; }
    private int ___value;

    public java.lang.Object clone() { return from_int(___value); }
}
IDL:

struct Info {
    long height;
    short weight;
};

Java:

final public class Info {
    public int height;
    public short weight;
    public Info() {}
    public Info(int height, short weight) {
        this.height = height;
        this.weight = weight;
    }
}

♦ Likewise, IDL records (structs) are mapped to a Java class.
♦ Attributes of the record are mapped to public attributes of the class.
♦ Names used in IDL are used directly in Java.
♦ IDL interfaces are translated into public Java interfaces. The reasons for this are obvious: the inheritance and subtype relationships in IDL can be mapped to inheritance in Java, and interface components such as attributes and operations can be implemented as Java methods.
♦ The interface name can be kept as the class name, because no name conflicts can occur in Java that would not already have been detected in IDL.
1.5.6 Attributes (ctd.)
♦ IDL attributes are implemented as Java class attributes with access methods. For readonly attributes a single (get) method is generated; for other attributes a pair of (set and get) methods is created.
♦ An access to an attribute of a remote object can fail for similar reasons as an operation execution request. These failures are handled using exceptions in both Java and IDL.
♦ The visibility of the methods that implement attributes is public. This is necessary to retain the IDL semantics that any attribute that is declared can be accessed from other classes.
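As a sketch of this mapping, a hypothetical IDL interface `interface Person { attribute string name; readonly attribute short age; };` would yield a Java interface along these lines (accessor methods are named after the attribute, following the IDL-to-Java convention):

```java
// Hypothetical mapping sketch: each attribute becomes public accessor methods.
public interface Person {
    String name();            // get method for attribute "name"
    void name(String value);  // set method (not generated for readonly attributes)
    short age();              // readonly attribute "age": get method only
}
```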
1.5.7 Operations (ctd.)
♦ Operations defined in an IDL interface are mapped to public Java methods of the class that represents the interface.
♦ The method name is retained, because again this cannot cause scoping problems in Java that would not have been detected in IDL. Likewise, parameter names are retained, as they cannot cause name clashes.
♦ The mapping of parameter types is more complicated. Since Java does NOT provide pointers, parameters of atomic type are passed by value.
♦ IMPORTANT STUFF
♦ Truth #1: Everything in Java is passed by value. Objects, however, are never passed at all; ONLY their references are (by value, again).
♦ Truth #2: The values of variables are always primitives or references, never objects.
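A short, self-contained sketch of these two truths; all names are illustrative:

```java
public class PassByValue {
    // The reference is copied, so both names refer to the same object:
    // mutations through the copy are visible to the caller.
    static void mutate(StringBuilder sb) { sb.append(" world"); }

    // Reassigning the copied reference has no effect on the caller's variable.
    static void reassign(StringBuilder sb) { sb = new StringBuilder("other"); }

    public static void main(String[] args) {
        StringBuilder s = new StringBuilder("hello");
        mutate(s);
        System.out.println(s);  // prints "hello world"
        reassign(s);
        System.out.println(s);  // still prints "hello world"
    }
}
```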
Operations ctd. (in, out, inout parameters)
♦ IN: to be passed with a meaningful value.
» The value of the actual parameter is copied into the formal parameter when the operation is invoked. Modification of the formal parameter affects only the formal parameter, not the actual parameter. This is the most common form of parameter passing and is the only one provided in C and Java (CALL-BY-VALUE).
♦ OUT: whose value will be changed by the operation.
» The value of the formal parameter is copied into the actual parameter when the procedure returns. Modifications to the formal parameter do not affect the actual parameter until the function returns (CALL-BY-RESULT).
» So it really should be passed by reference (to be modified!).
♦ INOUT: a combination of IN and OUT.
♦ E.g. consider f(s) and a call f(g): s is the formal parameter and g the actual parameter.
Operations ctd. (implementing out, inout)
♦ CORBA IDL in parameters implement call-by-value semantics. Java supports this, so in maps to normal Java parameters and requires no additional effort.
♦ IDL’s out and inout parameters, however, do NOT have Java counterparts, so some additional mechanism is required for call-by-result, etc.
♦ The Java mapping creates for every type a holder class: a container object which wraps up the value. Since object references can be passed by value, out/inout parameters can now be realised in Java programs.
♦ I.e. clients instantiate an instance of the appropriate holder class, which is then passed by value.
♦ To support portable stubs and skeletons, holder classes also implement the org.omg.CORBA.portable.Streamable interface, to allow for marshalling and unmarshalling. (The whole object is sent!)
♦ The short holder class of the above example is part of the org.omg.CORBA package:

package org.omg.CORBA;
public final class ShortHolder {
    public short value;
    public ShortHolder() {}
    public ShortHolder(short s) { value = s; }
}

♦ In language bindings providing pointers, out/inout parameters are realised by pointers.
♦ The contents of the holder instance are modified by the server invocation; the client then uses the possibly changed contents.
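A sketch of how a holder realises an out parameter. To stay self-contained (the org.omg.CORBA package is no longer shipped with current JDKs), a minimal local stand-in for ShortHolder is defined here, and the temperature service is hypothetical:

```java
public class HolderDemo {
    // Local stand-in mirroring the shape of org.omg.CORBA.ShortHolder.
    static final class ShortHolder {
        public short value;
        public ShortHolder() {}
        public ShortHolder(short s) { value = s; }
    }

    // "Server" side writes its result into the holder the client passed in.
    static void currentTemperature(ShortHolder out) { out.value = 21; }

    public static void main(String[] args) {
        ShortHolder t = new ShortHolder();
        currentTemperature(t);       // the holder reference is passed by value,
        System.out.println(t.value); // but its contents were changed: prints 21
    }
}
```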
1.5.8 Inheritance (ctd.)
♦ Inheritance between IDL interfaces is implemented as inheritance between the respective Java interfaces.
♦ Note: Java interfaces do allow multiple inheritance, whereas Java classes do not.
♦ Therefore IDL interfaces with multiple inheritance map to Java interfaces with multiple inheritance.
♦ When implementing such a Java interface one uses the implements keyword, and therefore inherits only the names of methods and attributes, not any code. The Java class implementing a Java interface with multiple inheritance implements every single method/attribute of the interface and is therefore in control.
Exceptions (2)
♦ The previous example leads to the following Java class:

package Exception.EmployeePackage;
public final class too_young extends org.omg.CORBA.UserException
        implements java.lang.Cloneable {
    public String explanation;
    public short age;
    public too_young() { super(); }
    public too_young(String explanation, short age) {
        super();
        this.explanation = explanation;
        this.age = age;
    }
    ...
}

♦ Note that programming languages such as C, which do not provide exceptions, model exceptions by additional parameters to methods (much faster, but easier to ignore).
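A sketch of how such a user exception might be raised and handled. A minimal stand-in for the generated class is used so the example runs without an ORB (the real class extends org.omg.CORBA.UserException), and the hire operation is hypothetical:

```java
public class ExceptionDemo {
    // Stand-in mirroring the generated too_young exception class.
    static final class TooYoung extends Exception {
        final String explanation;
        final short age;
        TooYoung(String explanation, short age) {
            this.explanation = explanation;
            this.age = age;
        }
    }

    // Hypothetical server-side operation raising the exception.
    static void hire(short age) throws TooYoung {
        if (age < 16) throw new TooYoung("minimum working age is 16", age);
    }

    public static void main(String[] args) {
        try {
            hire((short) 12);
        } catch (TooYoung e) {
            System.out.println(e.explanation + ": " + e.age);
        }
    }
}
```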
2.1 Introduction
♦ Component creation in a distributed system is more complicated than in a centralised system, mainly because:
» 1) Often the component is to be created on a non-local machine. The component creation mechanisms available in programming languages (such as constructors in Java) cannot be used, because a location specification has to be included.
» 2) The location has to be identified, and this identification must be transparent.
♦ More problems arise for the duplication and migration of components:
» due to potentially heterogeneous source and target platforms, and also 2) above.
♦ The deletion of components is more difficult as well:
» Garbage collection techniques assume that all objects are available in one address space. This is not the case in a distributed system, so these techniques cannot be directly applied.
♦ Object creation is done in the CORBA life cycle service by so-called Factory objects. These are plain CORBA objects themselves, which export an operation that creates and returns new objects.
♦ The factory objects use object constructors for the implementation of these creation operations. The new objects therefore run in the same address space as the factory object.
♦ An example is a personFactory object, which can be used to create a new object of type person. To do so, a client who wishes to create a person object calls the operation createPerson, which returns a reference to a newly created person object. This person object will run on the same machine as the personFactory object.
♦ The problem of location transparency then comes down to locating factory objects.
2.2 Object Creation (ctd.)
♦ Object creation is done in the CORBA life cycle service by factory objects.
♦ The life cycle module exports the FactoryFinder interface, which supports factory location.
♦ A client wishing to locate a factory can invoke the find_factories operation, which returns a sequence of Factory objects. The parameter of the find_factories operation is a key that can be considered an external (and location-independent) name. If no factories are found with that name, the NoFactory exception is raised.
♦ Factories register with a factory finder using a private protocol. This protocol is likely to be defined in an interface that inherits from the FactoryFinder interface. This, however, is transparent to clients.
♦ Factory finders are not only used for the immediate location of a factory (for creation purposes); they are also used as proxies (placeholders) for location information that is passed to move and copy operations.
2.2 Object Creation (ctd.)
♦ It would be fairly costly if a factory interface had to be created for each object type. This would immediately double the number of interfaces in the distributed application.
♦ The life cycle service therefore defines the GenericFactory interface. It exports an operation by means of which it can be checked whether the factory is able to create an object of a particular type (whose name is given as a key).
♦ A second operation allows clients to create an instance of the type whose name is given as a key.
♦ In this way, type-specific factories are no longer needed. In addition, resources can be managed for instances of different types that reside in one location.
♦ As a disadvantage, however, a type-specific initialisation (which can be achieved within an object-constructing operation of a specific factory) is not possible through this generic interface.
Example

import org.omg.CosNaming.*;
import org.omg.CosLifeCycle.*;
import org.omg.CORBA.*;

// 1) Instantiating the factory from an interoperable object reference stored in a file
String factoryIOR;
factoryIOR = getFactoryIOR("genfac.ior");
org.omg.CORBA.Object genFacRef = orb.string_to_object(factoryIOR);
GenericFactory fact = GenericFactoryHelper.narrow(genFacRef);

// 2) Using the factory to create an object
// struct NameComponent { Istring id; Istring kind; };
NameComponent nc = new NameComponent("sBuyer::BuyerServer",
2.3 Object Duplication
♦ The interface to duplicate objects is LifeCycleObject.
» The copy operation takes a FactoryFinder as a parameter. This factory finder defines the (set of) locations at which the copy should be created.
♦ Object types that are to be copied or moved to other locations have to be subtypes of LifeCycleObject.
♦ To accomplish type-specific implementations of the copy operation, while retaining a unique and generic interface that is seen by clients, subtypes of LifeCycleObject redefine the copy operation.
♦ interface LifeCycleObject {
      LifeCycleObject copy (in FactoryFinder there, in Criteria the_criteria)
          raises (NoFactory, NotCopyable, InvalidCriteria, CannotMeetCriteria);
      void move (in FactoryFinder there, in Criteria the_criteria)
          raises (NoFactory, NotMovable, InvalidCriteria, CannotMeetCriteria);
      void remove () raises (NotRemovable);
  };
♦ The copying of an object cannot be implemented by the ORB itself.
» The copy operation uses a factory to create a new object on the target machine. In this way, the problem of heterogeneous machine code of object implementations is resolved.
♦ Attribute values are transferred either:
» through parameters of the object-constructing operation, or
» through explicit operation invocations performed after the object has been created. In this way the heterogeneity of data representations is resolved.
2.4 Object Deletion
♦ Objects that are created also have to be removed. In many object-oriented programming languages this is done implicitly when the object is no longer referenced.
♦ This requires reference counting and garbage collection techniques which are not applicable to distributed objects because they are too expensive in a distributed setting!
♦ Deletion of an object is defined in the LifeCycleObject interface as well. To free the resources allocated by an object clients explicitly invoke the remove operation.
2.5 Object Migration
♦ Migration is the relocation of an object implementation from one location to another.
♦ The client view of migration is also defined by the LifeCycleObject interface, by means of the move operation.
♦ Interfaces defining specific objects that inherit from LifeCycleObject redefine move in an application-specific way! It is often possible to use the copy and the remove operations for that purpose. This, however, is not done generically, as more efficient ways may be possible in application-specific situations.
2.6 What’s Missing: Replication
♦ No relationship is maintained between two objects once they have been copied. They therefore do not evolve together but are completely independent of each other.
♦ This means that the life cycle service does not support replication, and replication transparency is therefore not supported in the CORBA framework.
♦ There are integrations of particular CORBA products (Orbix, for instance) with replication middleware components (ISIS). These integrations, however, are not standardised and applications that use them will not be portable.
♦ The advantages of replication are that it allows a higher load to be handled and that it supports fault tolerance, because the state of an object can be recovered from a replica if an implementation has crashed.
Alternatives for locating interface definitions:
♦ Any interface inherits the operation InterfaceDef get_interface() from Object.
♦ Associative search using lookup_name.
♦ Navigation through the interface repository using the contents and defined_in attributes.
♦ The location transparency principle suggests keeping the physical location of components transparent for both the component itself and all clients of the component. Only then can the component be migrated to other servers without having to change the component or its clients.
♦ In the CORBA framework, location transparency is already supported by the fact that objects are identified by object references, which are independent of the object's location. In addition:
♦ Naming supports the definition of external names for components.
♦ Trading supports the definition of service characteristics for a component.
2.1 NFS Directories (ctd.)
♦ NFS is based on directories. Directories contain a number of name bindings, each of which maps a name to a file or a subdirectory.
♦ Names are unique within the scope of the directory and can be composed into path names by delimiting the name components with '/'.
♦ Every file or directory of the file system must have at least one entry in some directory. If the last binding is removed, the file or directory ceases to exist. NO NAME, NO LIFE!
♦ A file or directory can have more than one name. An example is the directory that is shared by users 'ed' and 'jam'. In 'ed''s home directory that directory has the name 'web', while user 'jam' has given it the name 'www'.
♦ The naming scheme for files in NFS supports location transparency, because files can be identified using path names rather than physical addresses (such as the hard-disk drive name C:) or the IP address of the server machine to which a partition of the file system is connected.
♦ The X.500 Directory Service is a recommendation of the International Telecommunication Union (ITU), formerly known as CCITT.
♦ X.500 defines a global name space and is therefore the basis for component identification in wide area networks, while the network file system is merely used in local area networks.
♦ X.500 defines a directory tree and components can have only one name. Having a name is not existential for a component and there may well be subordinate components that are not named but can be identified otherwise.
♦ X.500 directory service entries not only have a name, but also a role attribute, given in brackets. In file systems these roles are sometimes indicated informally by using file name extensions, such as '.cc' for a C++ file or '.doc' for a word processor document.
♦ Another global name service that has become very prominent recently is the Internet Domain Name Service (DNS). The root of DNS is maintained by a machine called ns.nasa.gov that is operated by the US space agency NASA.
♦ Each DNS node maintains a table with domains of which it knows the name servers. The root node, for instance would have entries identifying the domains '.de' and '.ac.uk' representing Germany and all academic sites in the UK.
♦ A name lookup performed by a machine of City’s local network of a machine in the network of 'uni-paderborn.de' would then first be performed by nameserv.city.ac.uk. If that name server could not resolve the binding, it would ask the next higher level name server and so on until it gets to the root.
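The lookup behaviour described above can be sketched in Java. This is a hedged model, not the DNS protocol: each name server knows the bindings of its own zone, delegates suffixes it owns to lower-level servers, and otherwise asks the next higher level. All zone names and addresses in the example are invented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of hierarchical name resolution as described above.
class NameServer {
    private final NameServer parent;                      // next higher level, null for the root
    private final Map<String, String> bindings = new HashMap<>();
    private final Map<String, NameServer> delegations = new HashMap<>();

    NameServer(NameServer parent) { this.parent = parent; }
    void bind(String name, String address) { bindings.put(name, address); }
    void delegate(String suffix, NameServer ns) { delegations.put(suffix, ns); }

    String resolve(String name) {
        if (bindings.containsKey(name)) return bindings.get(name);
        // delegate to the name server responsible for a matching domain suffix
        for (Map.Entry<String, NameServer> d : delegations.entrySet())
            if (name.endsWith(d.getKey())) return d.getValue().resolve(name);
        // cannot resolve the binding locally: ask the next higher level server
        return parent != null ? parent.resolve(name) : null;
    }
}
```

A lookup starting at a local server thus climbs towards the root until some server knows a delegation that covers the name, then descends into that domain.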
2.2 Common Characteristics
♦ All the naming services we looked at include the concept of external names that can be defined for distributed components, be they file names, names of organisations or Internet domain names.
♦ All names are defined within the scope of hierarchically organised name spaces. These are directories in NFS, the X.500 directory tree, or name servers in the Internet.
♦ All naming services provide two fundamental operations to define and lookup names. The operation that defines a new name is usually referred to as 'bind', while the operation that searches for a component is commonly denoted as 'resolve'.
♦ Moreover, the name bindings are stored persistently by the name servers. Directory and file names are stored as part of the file system on disks. Directory entries in X.500 are stored persistently by the respective servers, and the Internet domain name servers store name bindings persistently in configuration databases.
Naming
The CORBA Naming service was defined in 1993 as the very first CORBA service. Its purpose is to provide a basic mechanism by means of which external names can be defined for CORBA object references.
♦ Names are hierarchically organised in so-called naming contexts. Name bindings have to be unique within a context (i.e., no other name binding with the same name occurs in the context). However, one object can have different names in the same context, or even the same name within different contexts.
♦ Note that it is not necessary to bind a name to every CORBA object; name bindings are thus not existential for CORBA objects (as opposed to file names in NFS). Other ways in which objects can be located include:
» Accessing attributes whose type is a subtype of Object.
» Executing operations whose result is a subtype of Object.
» Using the CORBA Trading service.
» Using the CORBA Query or Relationship facilities.
2.3. CORBA Names
♦ Names in the CORBA naming service are sequences of simple names. They are composed in a similar way to path names in NFS, as sequences of a number of directory names and a file name.
♦ A simple name is a (value, kind) tuple (à la X.500).
♦ Only the value component is used for resolving the name.
♦ The kind attribute is used to store and provide additional information about the role that an object or naming context has.
♦ A simple name in the above example would be ("Chelsea","Club") or ("England","League"), while the composite name identifying Athletic Bilbao within the context of the UEFA would consist of: {("Spain","1. Liga"),("Bilbao","Club")}.
2.3. The IDL Interfaces
♦ The Naming Service is specified by two IDL interfaces:
» NamingContext defines operations to bind objects to names and resolve name bindings.
» BindingIterator defines operations to iterate over a set of names defined in a naming context. An iterator is an object that can be used to enumerate a collection of objects and visit single elements or chunks of these objects successively.
2.3. Naming Context (ctd.)
♦ Operation bind creates a name binding in the naming context identified by the naming context that executes the operation and all name components but the last included in the first parameter n. In that naming context, bind inserts a name that equals the last name component and associates it with obj.
♦ Operation resolve returns the object that is identified by the executing naming context and the name n. If there is no such name binding in that context, the exception NotFound is raised.
♦ Operation unbind deletes the name binding identified by the executing naming context and the name n.
♦ Operations new_context and bind_new_context create new naming context objects. The latter operation also creates a name binding as identified by the name n.
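The bind/resolve semantics just described can be sketched in plain Java. This is not the OMG NamingContext API; the Context class and its methods are invented for illustration. A compound name is resolved component by component, and bind inserts the last component in the context reached through all the earlier components.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the naming-context semantics described above.
class Context {
    private final Map<String, Object> bindings = new HashMap<>();

    // insert the last name component in the context reached via the others
    void bind(String[] name, Object obj) {
        contextFor(name).bindings.put(name[name.length - 1], obj);
    }

    Object resolve(String[] name) {
        Object o = contextFor(name).bindings.get(name[name.length - 1]);
        if (o == null) throw new RuntimeException("NotFound"); // models the NotFound exception
        return o;
    }

    Context bindNewContext(String[] name) {   // mirrors bind_new_context
        Context c = new Context();
        bind(name, c);
        return c;
    }

    // walk all name components but the last, starting from this context
    private Context contextFor(String[] name) {
        Context c = this;
        for (int i = 0; i < name.length - 1; i++)
            c = (Context) c.bindings.get(name[i]);
        return c;
    }
}
```

Note that, as in the slides, a name is only meaningful relative to the context that executes the operation; the same object could be bound under different names in different contexts.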
♦ Operation list is used to obtain all name bindings in the naming context. Parameter how_many gives an upper bound for the number of name bindings that are to be included in the out parameter bl. If there are more than how_many bindings in the naming context, a binding iterator will be created and returned as out parameter bi.
♦ Operations provided by BindingIterator will be used after list has been executed on a naming context. They will then provide successive bindings that were not included in the BindingList returned by list.
♦ Operation next_one returns just one binding while operation next_n returns as many bindings as the client requests through the in parameter how_many.
♦ Both operations have a return value that indicates whether there are further bindings available in the context that have not yet been obtained.
Server Side: Creating a Name Space
  ORB orb = ORB.init(args, null);
  org.omg.CORBA.Object objRef = orb.resolve_initial_references("NameService");
  NamingContext rootContext = NamingContextHelper.narrow(objRef);
2.4 Limitations
♦ Limitation of Naming: the client always has to identify the server by name. (White Pages)
♦ Inappropriate if the client just wants to use a service at a certain quality but does not know from whom:
» Automatic cinema ticketing;
» Video on demand;
» Electronic commerce.
3.1 Trading Characteristics
♦ The principal idea of a trading service: have a mediator that acts as a broker between clients and servers.
♦ This broker enables a client to change its perspective when it tries to locate a server component from:» locating individual server components (`WHO` is the server that you are interested in? – i.e., White Pages)
» to the set of services the client is interested in (`WHAT` are the services that you need? – i.e., Yellow Pages).
♦ The broker then selects a suitable service provider on behalf of the client.
♦ Other examples: yellow pages, insurance & stock Brokers
3.1 Trading Characteristics
♦ A language for expressing types of services that both client and server understand.
♦ The language must be expressive enough to define the different types and qualities of service that a server offers or that a client may wish to use:
» performance, reliability or privacy.
♦ The quality of service may be defined statically or dynamically.
» A static definition is appropriate (because it is simpler) if the quality of service is independent of the state of the server.
» This might be the case for qualities such as precision, privacy or reliability.
♦ For qualities such as performance, however, the server may not be able to ensure a particular quality of service statically at the time it registers the service with the trader.
» Then a dynamic definition of the quality would be used, which makes the trading service inquire about the quality when a client needs to know it.
3.1 Trading Characteristics: Steps
1. SERVERS have to register the services they offer with the trader.
» The trader is then in a position to respond to service inquiries from clients.
2. CLIENTS then use the common language to ask the trader for a server that provides the type of service the client is interested in.
» Clients may or may not include specifications of the quality of service that they expect the server to provide.
3.a The TRADER then reacts to such client inquiries in different ways.
» Service matching: The trader may itself attempt to match the client's request with the best offer and just return the identification of a single server that provides the service with the intended quality.
3.b The TRADER may also compile a list of those servers that offer a service which matches the client's request.
» Service shopping: The trader returns the list to the client, and the client selects the most appropriate server.
3.3 Properties (ctd.)
♦ A property is a name-value structure, where a property name is a string and a property value can be of any type.
♦ The type Property could be used to specify, for instance, response time by setting the name to the string response_time and the value to 0.1 (seconds).
♦ As services usually have more than one property, the type PropertySeq can be used to declare all the properties that a service has.
♦ Type SpecifiedProps is a variant record (union) that is used by clients to tell the trader about those properties they expect a service to have:
» if the discriminator of the variant is set to none, the client does not care about the properties a service has;
» if it is set to all, the client expects the service to meet all properties;
» if it is set to some, the component prop_names specifies a sequence of properties that the client is expecting.
3.3 Register (ctd.)
♦ The operation export is used by the server to make a new service known to the trader. As arguments it passes an object reference to the object that implements the service, a string denoting the service name, and the properties defining the qualities of that service. The export operation returns a unique identifier for the offer, which is used for referring to the offer in other operations.
♦ By invoking operation withdraw a server deletes the service identified by the offer identifier.
♦ Using operation modify, the server can dynamically change the qualities of service the trader advertises. Again the service is identified by the offer identifier passed as the first parameter. The properties named in the second parameter are deleted and the properties identified in the last parameter change their value.
♦ The most important parameter of the query operation is the name of the service the client is interested in. Parameter pref identifies whether the client wants the trader to do service matching or whether the client wants to do service shopping among the servers implementing some service. Parameter desired_props identifies the qualities of service the client wants the server to guarantee. The usual iterator pattern is applied to pass the matching servers through the out parameter offers.
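The export/withdraw/query behaviour described above can be sketched as follows. This is a hedged model, not the CosTrading IDL: the Offer and Trader classes and their methods are invented, properties are modelled as simple (name, value) pairs, and the query implements the "service shopping" style by returning every matching offer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical trader sketch: servers export offers with properties,
// clients query by service type and the properties they require.
class Offer {
    final Object reference;                  // the object implementing the service
    final String serviceType;
    final Map<String, Object> properties;
    Offer(Object ref, String type, Map<String, Object> props) {
        this.reference = ref; this.serviceType = type; this.properties = props;
    }
}

class Trader {
    private final List<Offer> offers = new ArrayList<>();

    // the returned Offer object serves as the offer identifier
    Offer export(Object ref, String type, Map<String, ?> props) {
        Offer o = new Offer(ref, type, new HashMap<>(props));
        offers.add(o);
        return o;
    }

    void withdraw(Offer id) { offers.remove(id); }

    // service shopping: return every offer of the requested type whose
    // properties include all the desired (name, value) pairs
    List<Object> query(String type, Map<String, ?> desiredProps) {
        List<Object> result = new ArrayList<>();
        for (Offer o : offers)
            if (o.serviceType.equals(type)
                    && o.properties.entrySet().containsAll(desiredProps.entrySet()))
                result.add(o.reference);
        return result;
    }
}
```

Service matching would be a thin layer on top of this: the trader itself would pick one element of the result list according to the client's preference.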
How can multiple components of a distributed system use a shared component concurrently without violating the integrity of the component?
♦ This question is of fundamental importance, as there are only very few distributed systems in which every component is used by only a single other component at a time.
1 Motivation (ctd.)
♦ Resources accessed concurrently may be hardware components (e.g. a printer), operating system resources (e.g. files or sockets), databases (e.g. the bank accounts kept by different banks) or CORBA objects.
♦ For some types of accesses, resources may have to be accessed in mutual exclusion.
» It does not make sense to have print jobs of different users being printed in an interleaved way;
» Only one user should be editing a file at a time, otherwise the changes made by other users would be overwritten if the last user saves his or her file;
» integrity of databases or CORBA objects may be lost through concurrent updates.
♦ Hence, the need arises to restrict the concurrent access of multiple components to a shared resource in a sensible way.
2.1 Assessment Criteria
♦ Serialisability: Concurrent threads are serialisable if they could be executed one after another with the same effect on shared resources. It can be proven that serialisable threads do not lead to lost updates and inconsistent analysis.
♦ Deadlock freedom: Concurrency control techniques that use locking may force threads to wait for other threads to release a lock before they can access a resource. This may lead to situations where the wait-for relationship is cyclic and threads are deadlocked.
♦ Fairness: refers to whether all threads have the same chance of getting access to resources.
♦ Complexity: On the other hand, computing precisely those and only those schedules that are serialisable may be very complex, and we are interested in the complexity of a concurrency control scheme in order to estimate its performance overhead.
♦ Concurrency!!!: We are also interested in the degree of concurrency that a control scheme permits. It is obviously undesirable to disallow schedules that do not cause serialisability problems.
♦ The principal component that implements 2PL is a lock manager, from which concurrent processes or threads acquire locks on every shared resource they access.
♦ The lock manager investigates the request and compares it with the locks that were already granted on the resource.
» If the requested lock does not conflict with an already granted lock, the lock manager will grant the lock and note that the requester is now using the resource.
2.2 Locking
♦ 2PL is based on the assumption that processes or threads always acquire locks before they access a shared resource and release a lock when they no longer need the resource.
♦ In 2PL, a process does not acquire any further locks once it has released a lock.
♦ This means that threads operate in cycles, where each cycle has a lock acquisition phase followed by a lock release phase.
2.2 Lock Compatibility
♦ The lock manager grants locks to requesting processes or threads on the basis of already granted locks and their compatibility with the requested lock.
♦ The very core of any pessimistic concurrency control technique that is based on locking is the definition of a lock compatibility matrix. It defines the different lock modes and the compatibility between them.
2.2 Locking Conflicts
♦ Locking conflict: when access cannot be granted due to incompatibility between the requested lock and a previously granted lock.
♦ On the occasion of a locking conflict, the requester cannot use the resource until the conflicting lock has been released.
♦ There are two approaches to handling locking conflicts.
» The requesting process can be forced to wait until the conflicting locks are released. This may, however, be too restrictive since the process or thread may well do other computations in between.
» Alert the process or thread that the lock cannot be granted. It can then continue with other processing until a point in time when it definitely needs to get access to the resource.
♦ Several 2PL implementations provide two locking operations, a blocking and a non-blocking one, so that the requester can decide.
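The lock-manager behaviour described above can be sketched in Java. This is a minimal, invented API (only shared read and exclusive write modes, no lock upgrading or deadlock handling): tryLock is the non-blocking variant that reports a conflict, and lock is the blocking variant that waits until the conflicting locks are released.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal lock manager sketch: read locks are shared, write locks exclusive.
class LockManager {
    private final Map<Object, int[]> readers = new HashMap<>(); // resource -> reader count
    private final Set<Object> writers = new HashSet<>();        // write-locked resources

    // non-blocking request: returns false on a locking conflict
    synchronized boolean tryLock(Object res, boolean write) {
        boolean conflict = writers.contains(res)
                || (write && readers.getOrDefault(res, new int[]{0})[0] > 0);
        if (conflict) return false;
        if (write) writers.add(res);
        else readers.computeIfAbsent(res, r -> new int[1])[0]++;
        return true;
    }

    // blocking request: wait until the conflicting locks are released
    synchronized void lock(Object res, boolean write) throws InterruptedException {
        while (!tryLock(res, write)) wait();
    }

    synchronized void unlock(Object res, boolean write) {
        if (write) writers.remove(res);
        else readers.get(res)[0]--;
        notifyAll();                 // wake up waiting requesters
    }
}
```

With tryLock, a thread that hits a conflict can continue with other processing and retry later; with lock, it simply waits.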
♦ Before the account objects are changed, the debit and credit operations request a lock on the account object from the lock manager.
♦ The lock manager then detects a write/write locking conflict and forces the second process to wait until the first process has released its lock. The second process then reads the up-to-date value of the balance of the account and modifies it without losing the update of the first process.
2.2 Deadlocks
♦ Recall that the lock manager may force processes or threads to wait for other processes to release locks.
♦ This solves the problems of lost updates and inconsistent analysis.
♦ Processes may request locks on more than one object.
♦ Situations may arise where two or more processes or threads are mutually waiting for each other to release their locks.
♦ These situations are called deadlocks, and they are very undesirable as they block threads and prevent them from finishing their jobs.
2.2.1 Deadlock Detection and Resolution
♦ Deadlocks are resolved by lock managers.
♦ The manager maintains an up-to-date representation of the waiting graph.
♦ The manager records every locking conflict by inserting a graph edge.
♦ When a conflict is resolved by releasing a conflicting lock, the respective edge is deleted.
♦ The manager uses the waiting graph to detect deadlocks.
♦ Resolution: break cycles, i.e. select one process or thread that participates in such a cycle and abort it.
» Select a node that has the maximum number of incoming or outgoing edges, to reduce the chance of further deadlocks.
» Aborting a process requires undoing all actions that the process has performed and releasing all locks the process has held!!!
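The waiting-graph bookkeeping described above can be sketched as follows. The API is invented for illustration: an edge a -> b means "a waits for b", and a deadlock is exactly a cycle in this graph, found here with a depth-first search.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of a lock manager's waiting graph and cycle-based deadlock detection.
class WaitingGraph {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    void addConflict(String waiter, String holder) {      // locking conflict recorded
        waitsFor.computeIfAbsent(waiter, w -> new HashSet<>()).add(holder);
    }

    void removeConflict(String waiter, String holder) {   // conflicting lock released
        Set<String> s = waitsFor.get(waiter);
        if (s != null) s.remove(holder);
    }

    // deadlock detection: is any node reachable from itself?
    boolean hasDeadlock() {
        for (String node : waitsFor.keySet())
            if (reaches(node, node, new HashSet<>())) return true;
        return false;
    }

    private boolean reaches(String from, String target, Set<String> visited) {
        for (String next : waitsFor.getOrDefault(from, Set.of())) {
            if (next.equals(target)) return true;
            if (visited.add(next) && reaches(next, target, visited)) return true;
        }
        return false;
    }
}
```

A real lock manager would additionally record which resources are involved, so that aborting the selected victim releases the right locks and deletes the right edges.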
2.2.1 Locking Granularity
♦ Two-phase locking is applicable to resources of any granularity.
» It works for CORBA objects as well as for files and directories, or even complete databases.
♦ However, the degree of concurrency that is achieved with 2PL depends on the granularity that is used for locking.
» A high degree of concurrency is achieved with small locking granules.
♦ The disadvantage of choosing a small locking granularity is that a huge number of locks have to be acquired if bigger granules have to be locked.
♦ Trade-off: degree of concurrency vs. locking overhead.
» If we decrease the granularity we can serve more processes concurrently, but have to be prepared to spend higher costs on the management of locks.
♦ The dilemma can be resolved using an optimisation: hierarchical locking.
2.3 Hierarchical Locking
♦ Allows locking of all objects contained in a composite object (container).
♦ BUT also allows a process to indicate, at container level, the sub-resources that it intends to use in a particular mode.
♦ Hierarchical locking schemes therefore introduce intention locks, such as intention read and intention write locks.
♦ I.e. intention locks are acquired for a composite object before a process requests a real lock for an object that is contained in the composite object.
♦ Intention locks signal to those processes that wish to lock an entire composite object that some other process currently holds locks on objects contained in the composite object.
2.3.1 Hierarchical Locking
♦ Intention Read: indicates that some process has acquired, or is about to acquire, a read lock on objects inside a composite object.
♦ Intention Write: indicates that some process has acquired, or is about to acquire, write locks on objects in a composite object.
♦ Processes that want to lock a certain resource would then acquire intention locks on the container of that resource and all its containers.
♦ The lock compatibility matrix is defined in such a way that a locking conflict arises if a container object is already locked in either read or write mode:

         IR    R    IW    W
   IR    +     +    +     -
   R     +     +    -     -
   IW    +     -    +     -
   W     -     -    -     -
2.3.2 Hierarchical Locking
♦ NB: Intention read and intention write are compatible with each other because they do not actually correspond to any locks.
♦ Other modes:
» An IR lock is compatible with an R lock, because accessing the object for reading does not change values.
» An IR lock is incompatible with a W lock, because it is not possible to modify every element of the composite object while some other process is reading the state of an object of the composite.
» etc.
♦ Hence the advantage of hierarchical locking is that it enables different lock granularities to be used at the same time.
♦ The overhead is that, for every individual object, intention locks have to be acquired on every composite object in which the object is contained (an object may be contained in more than one container).
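The compatibility rules just listed can be written down as code. This is a hedged sketch with an invented API: a table over the four modes read (R), write (W), intention read (IR) and intention write (IW), filled in according to the rules above (intention locks never conflict with each other; W conflicts with everything).

```java
// Lock compatibility matrix for hierarchical locking, as described above.
class LockCompatibility {
    static final String[] MODES = {"IR", "R", "IW", "W"};

    // rows: granted mode, columns: requested mode; true = compatible
    private static final boolean[][] MATRIX = {
        //            IR     R      IW     W
        /* IR */ { true,  true,  true,  false },
        /* R  */ { true,  true,  false, false },
        /* IW */ { true,  false, true,  false },
        /* W  */ { false, false, false, false },
    };

    static boolean compatible(String granted, String requested) {
        return MATRIX[index(granted)][index(requested)];
    }

    private static int index(String mode) {
        for (int i = 0; i < MODES.length; i++)
            if (MODES[i].equals(mode)) return i;
        throw new IllegalArgumentException("unknown lock mode: " + mode);
    }
}
```

A hierarchical lock manager would consult this table both for the real lock on the resource and for the intention locks on all of its containers.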
2.4 Transparency of Locking
♦ The last question that we have to discuss is WHO acquires the locks, i.e. who invokes the lock operation for a resource. The options are:
» the concurrency control infrastructure, such as the concurrency control manager of a database management system;
» the implementation of components; or
» the clients of the components.
♦ The first option is very desirable, as concurrency control would then be transparent to the application programmers of both the component and its clients.
♦ Unfortunately this is only possible on limited occasions (in a database system), because the concurrency control manager would have to manage all resources and would have to be informed about every single resource access.
♦ The last option is very undesirable, and it is in fact always avoidable. Hence distributed components should be designed so that concurrency control is hidden within their implementation, not exposed at their interface, and is transparent to the designers of CLIENTS.
2.4 Optimistic Concurrency Control
♦ In general, the complexity of two-phase locking is linear in the number of accessed resources. With hierarchical locking it is even slightly higher, as the containers of resources also have to be locked in intention mode.
♦ This overhead, however, is unreasonable if the probability of a locking conflict is very limited.
♦ Given the motivating examples we discussed earlier, it is quite unlikely that you withdraw cash from an ATM in that very millisecond when a clerk credits a cheque.
♦ This is where optimistic concurrency control comes in.
» It follows a laissez-faire approach and works as a watchdog that checks for conflicts only at the end.
♦ 1. Read:
» The process/transaction executes, reading shared values and writing to a private copy.
♦ 2. Validation:
» When the process completes, the manager checks whether the process could possibly have conflicted with any other concurrent process. If there is a possibility, the process aborts and restarts.
♦ 3. Write:
» If there is no possibility of a conflict, the transaction commits.
♦ If there are few conflicts,
» validation can be done efficiently and leads to better performance than other concurrency control methods. Unfortunately, if there are many conflicts, the cost of repeatedly restarting operations hurts performance significantly.
2.3 Validation Prerequisites
♦ As a prerequisite for optimistic concurrency control, the overall sequence of operations a process performs must be separated into distinguishable units. A validation of the access pattern of a unit is then performed during a validation phase at the end of each unit.
♦ For each unit the following information has to be gathered:
» Starting time of the unit ST(U).
» Time stamp for the start of validation TS(U).
» Ending time of the unit E(U).
» Read and write sets RS(U) and WS(U) (the sets of resources U has accessed in read and write mode).
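Using this per-unit information, the validation check can be sketched as follows. This is one common backward-validation rule, written with an invented API and not taken verbatim from the slides: a unit U passes validation if no unit that committed while U was running wrote a resource that U read.

```java
import java.util.Set;

// Sketch of optimistic validation over the per-unit data described above.
class Unit {
    final long start;            // ST(U)
    final long end;              // E(U)
    final Set<String> readSet;   // RS(U)
    final Set<String> writeSet;  // WS(U)
    Unit(long start, long end, Set<String> rs, Set<String> ws) {
        this.start = start; this.end = end; this.readSet = rs; this.writeSet = ws;
    }
}

class Validator {
    // u is valid if no committed unit that overlapped u wrote something u read
    static boolean valid(Unit u, Iterable<Unit> committedUnits) {
        for (Unit v : committedUnits) {
            boolean concurrent = v.end > u.start;   // v committed while u was running
            if (concurrent && intersects(v.writeSet, u.readSet)) return false;
        }
        return true;   // no possible conflict: u may enter its write phase
    }

    private static boolean intersects(Set<String> a, Set<String> b) {
        for (String x : a) if (b.contains(x)) return true;
        return false;
    }
}
```

Note that comparing ST and E across machines presumes comparable timestamps, which is exactly the synchronised-clock assumption criticised in the comparison that follows.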
2.4 Comparison
♦ Both pessimistic and optimistic techniques
» guarantee serialisability of processes;
» impose serious complexity in that they need the ability to undo the effects of processes and threads.
♦ Pessimistic techniques
» cause a considerable concurrency control overhead through locking, and
» they are not deadlock-free.
» However, they are sufficiently efficient when conflicts are likely.
♦ Serious advantages of optimistic techniques:
» a negligible overhead when conflicts are unlikely;
» furthermore, they are deadlock-free.
» However, the computation of conflict sets is very difficult and complex in a distributed setting. Moreover, optimistic techniques assume the existence of synchronised clocks, which are generally not available in a distributed setting.
♦ In summary, the disadvantages of optimistic concurrency control outweigh its advantages, and in most distributed systems concurrency is controlled using pessimistic techniques.
3 Lock Compatibility
♦ The Concurrency Control service supports hierarchical locking, as many CORBA objects take the role of container objects.
♦ As a further optimisation, the service defines a lock type for upgrade locks.
♦ Upgrade locks are read locks that are not compatible with themselves. Upgrade locks are used on occasions when the requester knows that it only needs a read lock to start with, but will later have to acquire a write lock on the resource as well.
♦ If two processes are in this situation, they would run into a deadlock if they used only read locks. With upgrade locks the deadlock is prevented, as the second process trying to acquire the upgrade lock is delayed already.
3 Locksets
♦ The central object type defined by the Concurrency Control service is the lockset. A lockset is associated with a resource.
♦ With the Concurrency Control service, concurrency control has to be managed by the implementation of a shared resource. Hence the implementation of a resource would usually have a hidden lockset attribute.
♦ Operation implementations included in that resource acquire locks before they access or modify the resource.
♦ A LocksetFactory facilitates the creation of new locksets. The create operation of that interface would usually be executed during the construction of an object that implements a shared resource.
♦ The Lockset interface provides operations to lock, unlock and upgrade locks. The difference between lock and try_lock is that the former is blocking, while the latter returns control to the caller even when the lock has not been granted.
♦ Locksets are used internally by the servant; clients do not see them.
What is Required? Transactions
♦ Clusters a sequence of object requests together such that they are performed with the ACID properties:
» i.e. a transaction is either performed completely or not at all;
» leads from one consistent state to another;
» is executed in isolation from other transactions;
» once completed, it is durable.
♦ Used in databases and distributed systems.
♦ For example, consider the bank account scenario from the last session:
• A funds transfer involving a debit operation on one account and a credit operation on another account would be regarded as a transaction.
• Both operations have to be executed, or none at all.
• They should be isolated from other transactions.
• They should be durable once the transaction is completed.
2.1.1 Atomicity
♦ Transactions are either performed completely or no modification is made.
» I.e. either every operation in the cluster is performed successfully, or none is performed.
» e.g. both debit and credit in the scenario.
♦ The start of a transaction is a continuation point to which it can roll back.
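The atomicity of the funds-transfer example can be sketched as follows. The API is invented for illustration: the state at the start of the transaction is recorded as the continuation point, and if either operation fails the accounts are rolled back to it (a real system would restore from a persistent log rather than from local copies).

```java
// Sketch of an atomic funds transfer: both debit and credit, or neither.
class BankAccount {
    private long balance;
    BankAccount(long balance) { this.balance = balance; }
    long balance() { return balance; }
    void debit(long amount) {
        if (amount > balance) throw new IllegalStateException("insufficient funds");
        balance -= amount;
    }
    void credit(long amount) { balance += amount; }
}

class Transfer {
    static boolean transfer(BankAccount from, BankAccount to, long amount) {
        long fromBefore = from.balance();    // continuation point
        long toBefore = to.balance();
        try {
            from.debit(amount);
            to.credit(amount);
            return true;                     // commit: both operations performed
        } catch (RuntimeException e) {
            restore(from, fromBefore);       // abort: roll back to the start state
            restore(to, toBefore);
            return false;                    // no modification is visible
        }
    }

    private static void restore(BankAccount a, long balance) {
        long diff = balance - a.balance();
        if (diff > 0) a.credit(diff); else a.debit(-diff);
    }
}
```

Isolation and durability are not modelled here; they would require concurrency control around the two accounts and persistent logging of the committed state.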
3 Phase Two
♦ Called the completion phase.
♦ The co-ordinator collates all votes, including its own, and decides to
» commit if everyone voted 'Yes';
» abort if anyone voted 'No'.
♦ All voters that voted 'Yes' are sent
» 'DoCommit' if the transaction is to be committed;
» otherwise 'Abort'.
♦ Servers acknowledge DoCommit once they have committed.
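The coordinator's decision rule in the completion phase can be stated in a few lines of code. This is a sketch with invented names (the messaging to the servers is omitted): the coordinator collates all votes, including its own, and decides DoCommit only if everyone voted Yes.

```java
import java.util.List;

// Sketch of the two-phase-commit completion decision described above.
class Coordinator {
    enum Vote { YES, NO }
    enum Decision { DO_COMMIT, ABORT }

    static Decision collate(Vote ownVote, List<Vote> serverVotes) {
        if (ownVote == Vote.NO) return Decision.ABORT;
        for (Vote v : serverVotes)
            if (v == Vote.NO) return Decision.ABORT;  // abort if anyone voted No
        return Decision.DO_COMMIT;                    // commit only if all voted Yes
    }
}
```

After deciding, the coordinator would persist the decision and then send DoCommit (or Abort) to every Yes-voter, retransmitting on restart as described below.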
♦ Failures prior to the start of 2PC result in an abort. ♦ If a server fails prior to voting, it aborts. ♦ If it fails after voting, it sends GetDecision. ♦ If it fails after committing, it (re)sends a HaveCommitted message.
♦Coordinator failure prior to transmitting DoCommit messages results in abort (since no server has already committed).
♦ After this point, co-ordinator will retransmit all DoCommit messages on restart.» This is why servers have to store even their provisional changes in a persistent way.
» The coordinator itself needs to store the set of participating servers in a persistent way too.
3 Complexity
Assuming N participating servers and a coordinator:
♦ (N) requests from servers to register.
♦ (N) voting requests from coordinator to servers.
♦ (N) completion requests from coordinator to servers (worst case; may be fewer if some had aborted).
♦Hence, complexity of requests is linear (O(3N)=O(N)) in the number of participating servers.
♦ Cannot use the same mechanism to commit nested transactions, as:
» subtransactions can abort independently of their parent;
» subtransactions must have made their decision to commit or abort before the parent transaction does.
♦ Top level transaction needs to be able to communicate its decision down to all subtransactions so they may react accordingly.
♦ Abort is handled as normal.♦ Provisional commit means that coordinator and transactional servers are willing to commit the sub-transactions but have not yet done so.
♦Why not commit? Because the topmost transaction may ask them to abort.
0.1 What is Required? Transactions♦Clusters a sequence of object requests together such that they are performed with ACID properties» i.e transaction is either performed completely or not at all» leads from one consistent state to another» is executed in isolation from other transactions» once completed it is durable
♦ Used in databases and distributed systems.
♦ For example, consider the bank-account scenario from the last session.
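The atomicity ("completely or not at all") property can be illustrated with a toy bank-account transfer; the data structures and rollback mechanism below are assumptions made for the sketch, not how a real database implements it:

```python
# Toy illustration of atomicity for a bank-account transfer.
def transfer(accounts, src, dst, amount):
    snapshot = dict(accounts)            # enough state to roll back this toy update
    try:
        accounts[src] -= amount
        if accounts[src] < 0:
            raise ValueError("insufficient funds")
        accounts[dst] += amount          # either both updates happen, or neither
    except ValueError:
        accounts.clear()
        accounts.update(snapshot)        # roll back: all-or-nothing
        return False
    return True

accounts = {"alice": 100, "bob": 50}
transfer(accounts, "alice", "bob", 30)    # succeeds: {'alice': 70, 'bob': 80}
transfer(accounts, "alice", "bob", 999)   # fails and leaves balances unchanged
```

The failed transfer leaves the accounts exactly as they were, so the system moves only between consistent states.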
1 Effects of Insecurity♦Confidential Data may be stolen, e.g.:» corporate plans.» new product designs.» medical/financial records (e.g. Access bills....).
♦Data may be altered, e.g.:» finances made to seem better than they are.» results of tests, e.g. on drugs, altered.» examination results amended (up or down).
2 Threats
♦ Categorisation of attacks (and goals of attacks) that may be made on a system.
♦ Five main areas:
» leakage: information leaving the system.
» tampering: unauthorised altering of information.
» resource stealing: illegal use of resources.
» vandalism: disturbing correct system operation.
» denial of service: disrupting legitimate use of the system.
♦ Used to specify what the system is proof, or secure, against.
♦ Leakage denotes the disclosure of information to unauthorised subjects.
» Baazi hacking into a CAD system of Rolls Royce in order to obtain the latest designs of RR's jet engines.
» Although fatal in this case, leakage is probably the category that causes the least damage of the above.
♦ Tampering denotes the unauthorised modification of data.
» We would have a case of tampering if you hacked into the School's database in order to alter the marks of your Distributed Systems courseworks (which you cannot, because for security reasons it is not connected to the network!).
♦ Resource stealing identifies the illegal, unpaid-for use of resources, e.g. CPU time, bandwidth, air time of mobiles.
» A case of resource stealing occurred when hackers broke into the computers of telephone companies and managed to have their phone calls charged to other customers' accounts.
♦ Vandalism denotes the disturbance of correct system operation.
» The security of the CS Dept. in Milan was broken, super-user privileges were acquired, and the system's hard disks were formatted. This caused serious damage to the departmental operations for a session.
♦ Eavesdropping
» Request parameters from client to server may contain sensitive information, e.g. PINs, balances.
» Stubs marshal these into a standard data representation.
» By listening to (sniffing) traffic, attackers can obtain and decode request parameters --> eavesdropping.
♦ Tampering
» Attacker modifies request parameters before they reach the server, e.g. a credit amount.
♦ Replaying
» Attacker intercepts and stores a message and has the server repeatedly execute the operation.
» NB the attacker does not have to interpret the message, so encryption alone doesn't help!
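One standard defence against replay (not stated in the slides, but a common technique) is for the server to remember an identifier, or nonce, carried by each request and refuse to execute the same one twice. A minimal sketch, with hypothetical names:

```python
# Sketch: a server rejects replayed requests by remembering nonces it has seen.
seen_nonces = set()

def handle_request(nonce, operation):
    if nonce in seen_nonces:
        return "rejected (replay)"
    seen_nonces.add(nonce)
    return f"executed {operation}"

first  = handle_request(42, "credit 100")   # executed normally
replay = handle_request(42, "credit 100")   # identical message again: rejected
```

This works even though the attacker never decrypted the message: the replayed ciphertext carries the same nonce, which the server has already seen.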
3.1 Cryptographic Terminology
♦ Plain text: the message before encryption.
♦ Cipher text: the message after encryption.
♦ Key: information needed to convert from plain text to cipher text (or vice-versa).
♦ Function: the encryption or decryption algorithm used, in conjunction with the key, to encrypt or decrypt a message.
♦ Key distribution: how to distribute keys between senders and receivers.
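The terminology can be made concrete with a deliberately insecure toy cipher (XOR with a repeating key, chosen here purely for illustration):

```python
# Toy XOR cipher to make the terminology concrete; NOT a secure algorithm.
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # The same function is both the encryption and decryption function:
    # applying the key a second time restores the original input.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plain_text  = b"transfer 100 to bob"       # the message before encryption
key         = b"secret"                    # must be shared by sender and receiver
cipher_text = xor_cipher(plain_text, key)  # the message after encryption
recovered   = xor_cipher(cipher_text, key) # decryption with the same key
```

Because sender and receiver need the same key, this is secret-key (symmetric) encryption, and key distribution becomes the hard part.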
Secret Key Encryption for Distributed Objects
♦ Encryption is done after marshalling (and decryption before unmarshalling), once it has been noted that the server object is not local.
♦ The encrypted object request transmitted via the network is secured against eavesdropping and message tampering.
♦ Note that the encryption can be kept entirely transparent to client and server programmers, as it is done by the middleware or by the stubs created by the middleware.
♦ NB Disadvantage: for secret-key encryption between distributed objects, the number of keys needed grows quadratically with the number of objects (one key per pair of communicating objects…).
♦ Public Key (aka Asymmetric) Encryption overcomes this problem
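The quadratic vs. linear key-count argument is simple arithmetic: with secret keys, every pair of communicating objects needs its own key, i.e. N(N−1)/2 keys; with public-key encryption, each object needs just one key pair. A quick sketch:

```python
# Counting keys: secret-key needs one key per pair, public-key two keys per object.
def secret_key_count(n):
    # One shared key per unordered pair of communicating objects: n*(n-1)/2.
    return n * (n - 1) // 2

def public_key_count(n):
    # One key pair (public + private) per object.
    return 2 * n

pairs = secret_key_count(1000)   # 499500 keys for 1000 objects: quadratic growth
keys  = public_key_count(1000)   # 2000 keys: linear growth
```

For 1000 objects the difference is 499,500 secret keys versus 2,000 public/private keys, which is why asymmetric encryption scales so much better for key management.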
Asymmetric encryption is very versatile: besides secure transmission, it can be used to sign messages.
Question: How do you sign a message and send it securely?
3.3 Asymmetric Encryption with RSA: How does it work?
♦ Rivest, Shamir and Adleman (Boston, Aug. ’77) developed the RSA algorithm.
♦ We need a one-way function (e.g. y^x mod P) with a trap door.
♦ Solution:
» Private key: p, q (both large prime numbers); Public key: N = p·q and e.
» Encryption: C = M^e mod N.
» Decryption: calculate d such that e·d ≡ 1 mod (p−1)(q−1); then M = C^d mod N.
♦ Can it be attacked? No practical attack is known: exponentiation in modular arithmetic is a one-way function, and computing p, q from N does not work either, as prime factorisation is another one-way function (it is believed to be computationally hard to factor such a number).
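The RSA recipe above can be worked through with small primes (real keys use primes of hundreds of digits; the values p = 61, q = 53, e = 17 are a standard textbook example):

```python
# Worked toy RSA example with small primes.
p, q = 61, 53                      # private: the two primes
N = p * q                          # public modulus: 3233
e = 17                             # public exponent
phi = (p - 1) * (q - 1)            # 3120, computable only if you know p and q
d = pow(e, -1, phi)                # private exponent: e*d = 1 mod (p-1)(q-1) -> 2753

M = 65                             # plain text, as a number smaller than N
C = pow(M, e, N)                   # encryption: C = M^e mod N  -> 2790
assert pow(C, d, N) == M           # decryption: M = C^d mod N recovers the message

# Signing reverses the roles: "encrypt" with the private d;
# anyone holding the public (N, e) can then verify the signature.
sig = pow(M, d, N)
assert pow(sig, e, N) == M
```

This also answers the earlier signing question: sign with your own private key, then encrypt the signed message with the recipient's public key, so only they can read it and anyone can check who sent it.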
3.3 DES, RSA and PGP – some history
♦ Equivalents of RSA (and of public-key key exchange) had been independently discovered earlier, in the early 1970s, by Ellis, Cocks and Williamson at GCHQ, the UK Government's top-secret communications HQ.
♦ Strong cryptography such as DES and RSA was not freely available to the public (export-classified as a weapon!).
♦ In 1991, Zimmermann released PGP (Pretty Good Privacy) as freeware!» And got to meet some nice fellows from the FBI…
Hybrid: Secure Sockets Layer (SSL) Protocol
♦ Used in Netscape for secure downloads.
♦ Uses RSA encryption.
♦ SSL client:
» generates a secret key for one session; that key is encrypted using the server's public key.
♦ The session key is then forwarded to the server and used for further communication between client and server.
♦ Most O-O middleware uses SSL rather than straight TCP as the transport protocol, to prevent eavesdropping on and tampering with object request traffic.
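The hybrid idea can be sketched in a few lines: RSA protects only the short session key, and all bulk traffic then uses that key with a fast symmetric cipher. (The toy RSA key below reuses the small textbook primes and is wildly insecure; it is only meant to show the message flow.)

```python
# Hybrid sketch: RSA wraps the session key; bulk traffic uses symmetric encryption.
import secrets

N, e, d = 3233, 17, 2753           # toy server RSA key pair (illustration only)

# Client side: pick a random session key and send it RSA-encrypted to the server.
session_key = secrets.randbelow(N - 2) + 2
wrapped = pow(session_key, e, N)   # only the server's private key can unwrap this

# Server side: recover the session key with the private exponent d.
unwrapped = pow(wrapped, d, N)
# From here on, both sides use session_key with a fast symmetric cipher.
```

The design choice is pragmatic: asymmetric encryption is slow, so it is used once per session to agree a key, and cheap secret-key encryption carries the actual traffic.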
Authentication: Proving you are who you claim to be.♦ In centralised systems: Password check at session start.
♦ In distributed systems:» Ensuring that each message came from claimed source.» Ensuring that each message has not been altered.» Ensuring that each message has not been replayed.
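One common way (an assumption here, not named in the slides) to get all three guarantees on each message is a keyed message-authentication code plus a sequence number: the MAC covers source and integrity, the sequence number rules out replay. A sketch with hypothetical names:

```python
# Sketch: HMAC over (sequence number + body) covers source, integrity, and replay.
import hmac, hashlib

SHARED_KEY = b"shared secret"      # assumed pre-agreed between the two parties

def tag(seq: int, body: bytes) -> bytes:
    msg = str(seq).encode() + b"|" + body
    return hmac.new(SHARED_KEY, msg, hashlib.sha256).digest()

def accept(seq: int, body: bytes, mac: bytes, last_seq: int) -> bool:
    genuine = hmac.compare_digest(mac, tag(seq, body))  # from claimed source, unaltered
    fresh = seq > last_seq                              # not a replay
    return genuine and fresh

mac = tag(1, b"debit 50")
ok = accept(1, b"debit 50", mac, last_seq=0)        # accepted
replayed = accept(1, b"debit 50", mac, last_seq=1)  # rejected: old sequence number
```

A tampered body fails the MAC check, a forged sender lacks the shared key, and a replayed message reuses an already-seen sequence number.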
5 Security Systems: Kerberos
♦ Kerberos is a network authentication protocol:
» allows users and services to authenticate themselves to each other.
♦ Based on the Needham/Schroeder protocol.
♦ Developed by Steiner et al. at MIT (1988).
♦ Used in:
» OSF/DCE (the OSF Distributed Computing Environment);
» Unix NFS;
» an adapted version of it is used in Microsoft Windows.
5 Security Systems: CORBA
Supports the following security functionality:
♦ Authentication of users.
♦ Authentication between objects.
♦ Authorisation and access control.
♦ Security auditing.
♦ Non-repudiation.
♦ Administration of security information.
Cryptography is not exposed at interfaces: the OMG has taken explicit care to avoid exposing keys and any other confidential knowledge within the specs. This was done so that the CORBA security specification would not be classified by the US Government as a weapon and, as such, be unavailable for use outside the US.
How To
♦ First get the past exams, to get a better idea of what the exam will be like.
♦ Fast revision: for each session, read:
» its introduction;
» its summary; and
» the summary for it that is at the beginning of the next session!
♦ Then read each session (+ notes!) and try to come up with questions of your own for them.
♦ Answer these questions and those in the past exams.
♦ Feel free to collaborate on this – use Cityspace.
» I will be correcting any wrong answers in Cityspace (but not providing correct answers to begin with).
Session 2 – Distributed SW Eng.♦ Distributed Systems consist of multiple components.
♦ Components are heterogeneous.♦ Components still have to be interoperable.♦ There has to be a common model for components, which expresses» component states,» component services, and» interaction of components with other components.