Naming in Distributed System G.Ramesh Babu

Jan 03, 2016


Transcript
Page 1: Naming in Distributed System

Naming in Distributed System
G.Ramesh Babu

Page 2: Naming in Distributed System

Contents

• Naming Entities
  – Names, Identifiers and Address
  – Name Spaces
• Name Resolution
  – Closure Mechanism
  – Linking and Mounting
• Implementation of Name Space
• Implementation of Resolution
• Conclusion

Page 3: Naming in Distributed System

Why is naming important?

• Names are used to
  – Share resources
  – Uniquely identify entities
  – Refer to locations, and so on

• Name resolution allows a process to access the named entity

Page 4: Naming in Distributed System

Naming Entities

• Name: a string of characters used to refer to an entity
  – An entity in a DS can be anything, e.g., hosts, printers, disks, files, mailboxes, web pages, etc.
• Access point: a special kind of entity through which another entity is accessed
• Address: the name of an access point
• The access points of an entity may change

Page 5: Naming in Distributed System

Identifiers and True Identifiers

• We need a single name for an entity that is independent of the entity's address, i.e., a location-independent name
• Identifier: a name that uniquely identifies an entity
• A true identifier has three properties
  – It refers to at most one entity
  – Each entity is referred to by at most one identifier
  – It always refers to the same entity, i.e., it is never reused
• These properties are what differentiate an identifier from an address

Page 6: Naming in Distributed System

Name Space

• Names in a DS are organized into name spaces
• A name space is represented as a labeled, directed graph
• Leaf node: has no outgoing edges
• Directory node: has a number of labeled outgoing edges
  – Stores a directory table containing an entry for each outgoing edge as a pair (edge label, node identifier)
• Root node: has only outgoing edges
• Path name: a sequence of edge labels
  – Absolute path: the first node in the path name is the root
  – Relative path: the first node is not the root
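The naming graph described on this slide can be sketched in a few lines of Python. This is only an illustration: the node identifiers (`n0`, `n1`, ...), the edge labels and the `resolve` helper are all made up for the example.

```python
# Hypothetical naming graph. Each directory node stores a directory table
# that maps an edge label to a node identifier. Leaf nodes have no table.
DIRECTORIES = {
    "n0": {"home": "n1", "etc": "n2"},  # root node
    "n1": {"faraz": "n3"},
    "n3": {"mbox": "n4"},               # n4 is a leaf node
}
ROOT = "n0"


def resolve(path, start=ROOT):
    """Follow the labels of a path name and return the node identifier reached.

    A path beginning with "/" is absolute and starts at the root;
    any other path is relative and starts at `start`.
    """
    node = ROOT if path.startswith("/") else start
    for label in path.strip("/").split("/"):
        if label == "":
            continue  # empty path component, e.g. the path "/"
        table = DIRECTORIES.get(node)
        if table is None or label not in table:
            raise KeyError(f"cannot resolve label {label!r} at node {node}")
        node = table[label]
    return node


print(resolve("/home/faraz/mbox"))        # absolute path name
print(resolve("faraz/mbox", start="n1"))  # relative path name
```

Both calls end at the same leaf node; only the starting node differs.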

Page 7: Naming in Distributed System

General Naming Graph

Page 8: Naming in Distributed System

Name Resolution

• The process of looking up a name
• Closure mechanism: knowing how and where to start name resolution
• Mounting: a transparent way to resolve names across different name spaces
• Mounted file system: lets a directory node store the identifier of a directory node from a different name space (the foreign name space)
  – Mount point: the directory node storing the node identifier
  – Mounting point: the directory node in the foreign name space
• Normally the mounting point is the root of the foreign name space

Page 9: Naming in Distributed System

Mounted File System

• During resolution, the mounting point is looked up and resolution proceeds by accessing its directory table
• Mounting requires at least
  – The name of an access protocol (for communication)
  – The name of the server (resolved to an address)
  – The name of the mounting point in the foreign name space (resolved to a node identifier in the foreign NS)
• Each of these names needs to be resolved
• The three names can be represented as a URL: nfs://oslab.khu.ac.kr/home/faraz
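As a small illustration of how one URL carries the three names, the sketch below splits the slide's example URL with Python's standard `urllib.parse.urlsplit`. The function name `mount_names` and the dictionary keys are assumptions chosen for this example.

```python
from urllib.parse import urlsplit


def mount_names(url):
    """Split a mount URL into the three names that mounting needs."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,      # name of the access protocol
        "server": parts.hostname,      # name of the server
        "mounting_point": parts.path,  # name of the mounting point
    }


print(mount_names("nfs://oslab.khu.ac.kr/home/faraz"))
```

Each of the three values still has to be resolved separately: the protocol name to an implementation, the server name to an address, and the path to a node identifier in the foreign name space.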

Page 10: Naming in Distributed System

Mounted File System

Page 11: Naming in Distributed System

Global Name Service (GNS)

• Another way to merge different name spaces
• Mechanism: add a new root node and make the existing root nodes its children
• Problem
  – Existing names need to be changed, e.g., home/faraz becomes people/home/faraz
• This expansion is generally hidden from the user
• Has a significant performance overhead when merging hundreds or thousands of name spaces

Page 12: Naming in Distributed System

Global Name Service (GNS)

Page 13: Naming in Distributed System

Implementation of Name Space

• For large-scale DS, name spaces are organized hierarchically
• Name spaces are partitioned into three logical layers
  – Global layer: formed by the highest-level nodes
  – Administrational layer: formed by directory nodes managed within a single organization
  – Managerial layer: formed by nodes that may typically change regularly

Page 14: Naming in Distributed System

Implementation of Name Space

Page 15: Naming in Distributed System

Implementation of Name Space

Item                             Global     Administrational  Managerial
Geographical scale of network    Worldwide  Organization      Department
Total number of nodes            Few        Many              Vast numbers
Responsiveness to lookups        Seconds    Milliseconds      Immediate
Update propagation               Lazy       Immediate         Immediate
Number of replicas               Many       None or few       None
Is client-side caching applied?  Yes        Yes               Sometimes

Page 16: Naming in Distributed System

Implementation of Name Resolution

• Assumptions
  – No replication of name servers
  – No client-side caching
  – Each client has access to a local name server
• Two possible implementations
  – Iterative name resolution
    • A server resolves the path name as far as it can and returns each intermediate result to the client
  – Recursive name resolution
    • Instead of returning the intermediate result to the client, a name server passes it to the next name server it finds
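The difference between the two schemes can be sketched with a toy model in which each name server resolves exactly one label. The server names, the label sequence and the address below are hypothetical; the address comes from a range reserved for documentation.

```python
# Hypothetical zone data: each name server knows only the next label.
SERVERS = {
    "root":  {"nl": "ns-nl"},
    "ns-nl": {"vu": "ns-vu"},
    "ns-vu": {"cs": "ns-cs"},
    "ns-cs": {"ftp": "192.0.2.10"},  # made-up address of the named host
}


def iterative(labels):
    """The client contacts every server itself.

    Returns (address, number of requests sent by the client).
    """
    server, requests = "root", 0
    for label in labels:
        # The server resolves one label and returns the intermediate
        # result (the next server, or finally the address) to the client.
        server = SERVERS[server][label]
        requests += 1
    return server, requests


def recursive(labels, server="root"):
    """Each server forwards the rest of the name to the next server.

    The client sends a single request to the root and gets the address back.
    """
    result = SERVERS[server][labels[0]]
    if len(labels) == 1:
        return result
    return recursive(labels[1:], result)


name = ["nl", "vu", "cs", "ftp"]
print(iterative(name))
print(recursive(name))
```

In this model the client sends four requests in the iterative case and one in the recursive case, while the servers in the recursive case do the forwarding work themselves.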

Page 17: Naming in Distributed System

Iterative Name Resolution

• Advantage
  – Less burden on each name server
• Disadvantage
  – Higher communication cost

Page 18: Naming in Distributed System

Recursive Name Resolution

• Advantages
  – Caching results is more effective
  – Reduced communication cost
• Disadvantage
  – Demands higher performance from each name server

Page 19: Naming in Distributed System

Domain Name System (DNS)

• An example implementation of name resolution
• Primarily used for looking up host addresses and mail servers
• The DNS name space is hierarchically organized as a rooted tree
• A label is a case-insensitive string with a maximum length of 63 characters
• The maximum length of a complete path name is 255 characters
• The root is represented by a dot
  – We generally omit this dot for readability
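The length limits from this slide can be checked with a short helper. This is a minimal sketch that applies only the two limits quoted above (63 characters per label, 255 for the whole name) and treats names as case-insensitive; the function names are invented for the example, and real DNS software applies further rules.

```python
MAX_LABEL = 63   # maximum label length, as quoted on the slide
MAX_NAME = 255   # maximum length of a complete path name, as quoted


def strip_root(name):
    """Drop the optional trailing dot that represents the root."""
    return name[:-1] if name.endswith(".") else name


def is_valid_dns_name(name):
    """Check the label and total length limits of a dotted name."""
    name = strip_root(name)
    if not name or len(name) > MAX_NAME:
        return False
    return all(1 <= len(label) <= MAX_LABEL for label in name.split("."))


def same_dns_name(a, b):
    """Compare two names ignoring case and the trailing root dot."""
    return strip_root(a).lower() == strip_root(b).lower()


print(is_valid_dns_name("ftp.cs.vu.nl."))
print(same_dns_name("FTP.cs.vu.NL", "ftp.cs.vu.nl."))
```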

Page 20: Naming in Distributed System

Locating Mobile Entities

Page 21: Naming in Distributed System

Naming versus Locating Entities

• Entities are named for lookup and subsequent access
  – Human-friendly names
  – Identifiers
  – Addresses
• Virtually all naming systems maintain a mapping from human-friendly names to addresses
• Partitioning of the name space
  – Global level
  – Administrational level
  – Managerial level

Page 22: Naming in Distributed System

Naming versus Locating Entities

[Figure: name-space diagrams with nodes labeled cs.vu.nl, abc, ftp.cs.vu.nl, ftp.abc.cs.vu.nl and ftp.khu.ac.kr]

Page 23: Naming in Distributed System

Naming versus Locating Entities

• Possible solutions
  – Record the address of the new machine
    • The lookup operation still works
    • Another update to the database is required if the address changes again
  – Record the name of the new machine
    • Less efficient, because the lookup now has two parts
      – Find the name of the new machine
      – Look up the address associated with that name
    • Adds a step to the lookup operation
• For highly mobile entities, it only gets worse

Page 24: Naming in Distributed System

Naming versus Locating Entities

• Direct, single-level mapping between names and addresses.
• Two-level mapping using identifiers.
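A two-level mapping can be sketched with two dictionaries: a naming service that maps a name to a stable identifier, and a location service that maps the identifier to the current address. The identifier `id-42` and the addresses are made up for this illustration.

```python
# Level 1: human-friendly name -> identifier (does not change on a move).
naming_service = {"ftp.cs.vu.nl": "id-42"}

# Level 2: identifier -> current address (updated whenever the entity moves).
location_service = {"id-42": "192.0.2.10"}


def lookup(name):
    """Resolve a name to an address through its identifier."""
    identifier = naming_service[name]
    return location_service[identifier]


def move(identifier, new_address):
    """Record a move; only the location service is touched."""
    location_service[identifier] = new_address


print(lookup("ftp.cs.vu.nl"))
move("id-42", "198.51.100.7")
print(lookup("ftp.cs.vu.nl"))
```

The name keeps working after the move because the naming service never stored an address in the first place.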

Page 25: Naming in Distributed System

Simple solutions: Broadcasting and multicasting

• A location service accepts an identifier as input and returns the current address of the identified entity.
• Simple solutions exist that work in a local-area network.
• The Address Resolution Protocol (ARP) uses broadcasting to map the IP address of a machine to its data-link address.
• Multicasting can be used to locate entities in point-to-point networks (such as the Internet).
• Each multicast address can be associated with multiple replicated entities.
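The broadcasting idea behind ARP can be imitated in a few lines: a request is shown to every host on the local network and only the host that owns the requested IP address answers. This is a toy simulation with invented hosts and addresses, not real ARP traffic.

```python
# Hypothetical hosts on one local-area network.
HOSTS = [
    {"ip": "192.0.2.1", "mac": "02:00:00:00:00:01"},
    {"ip": "192.0.2.2", "mac": "02:00:00:00:00:02"},
    {"ip": "192.0.2.3", "mac": "02:00:00:00:00:03"},
]


def broadcast_lookup(ip):
    """Ask every host 'who has this IP?' and return the owner's data-link address."""
    replies = []
    for host in HOSTS:          # the broadcast reaches every host
        if host["ip"] == ip:    # only the owner of the address replies
            replies.append(host["mac"])
    return replies[0] if replies else None


print(broadcast_lookup("192.0.2.2"))
```

Every host has to inspect every request, which is why this approach is limited to local-area networks.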

Page 26: Naming in Distributed System

Forwarding Pointers (1)

• The principle of forwarding pointers using (proxy, skeleton) pairs.

Page 27: Naming in Distributed System

Forwarding Pointers (2)

• Redirecting a forwarding pointer, by storing a shortcut in a proxy.
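Following a chain of forwarding pointers and then storing a shortcut can be sketched as below. The classes are a simplification invented for this example: a `Location` stands in for the skeleton an entity leaves behind when it moves, and `Proxy.invoke` only reports where the entity was found.

```python
class Location:
    """A place the entity has been; `forward` points to where it went next."""

    def __init__(self, name):
        self.name = name
        self.forward = None  # None means the entity is currently here


def follow_chain(start):
    """Follow forwarding pointers; return (current location, hops taken)."""
    location, hops = start, 0
    while location.forward is not None:
        location = location.forward
        hops += 1
    return location, hops


class Proxy:
    """Client-side reference that caches a shortcut after each invocation."""

    def __init__(self, target):
        self.target = target

    def invoke(self):
        location, hops = follow_chain(self.target)
        self.target = location  # store the shortcut in the proxy
        return location.name, hops


# The entity moved from A to B to C, leaving a forwarding pointer each time.
a, b, c = Location("A"), Location("B"), Location("C")
a.forward, b.forward = b, c

proxy = Proxy(a)
print(proxy.invoke())  # first call walks the whole chain
print(proxy.invoke())  # second call goes straight to the current location
```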

Page 28: Naming in Distributed System

Home-Based Approaches

• Example: The principle of Mobile IP. (Perkins, 1997)
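The home-based idea can be sketched as a home agent that remembers where the mobile host currently is. The class and method names below are illustrative only and do not follow any particular Mobile IP implementation; the addresses are invented.

```python
class HomeAgent:
    """Keeps track of the current (care-of) address of one mobile host."""

    def __init__(self, home_address):
        self.home_address = home_address
        self.care_of_address = None  # set while the host is away from home

    def register(self, care_of_address):
        """Called by the mobile host after it moves to a foreign network."""
        self.care_of_address = care_of_address

    def deregister(self):
        """Called when the mobile host returns home."""
        self.care_of_address = None

    def deliver_to(self):
        """Address to which traffic for the home address is sent."""
        return self.care_of_address or self.home_address


agent = HomeAgent("192.0.2.10")
print(agent.deliver_to())      # host is at home
agent.register("198.51.100.7")
print(agent.deliver_to())      # traffic is forwarded to the care-of address
```

Clients always contact the fixed home address first, which is what makes the home location a potential bottleneck.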

Page 29: Naming in Distributed System

Hierarchical Approaches (1)

• Hierarchical organization of a location service into domains, each having an associated directory node.

Page 30: Naming in Distributed System

Hierarchical Approaches (2)

• An example of storing information of an entity having two addresses in different leaf domains.

Page 31: Naming in Distributed System

Hierarchical Approaches (3)

• Looking up a location in a hierarchically organized location service.

Page 32: Naming in Distributed System

Hierarchical Approaches (4)

a) An insert request is forwarded to the first node that knows about entity E.

b) A chain of forwarding pointers to the leaf node is created.
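The lookup and insert operations of a hierarchical location service can be sketched as follows. This is a simplified model under stated assumptions: the domain names are invented, a leaf node stores an address as a string, and every higher node stores a list of child nodes to follow for the entity.

```python
class DirNode:
    """Directory node of one domain in the location hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # entity -> address (str) at a leaf, or list of child DirNodes above
        self.records = {}


def insert(leaf, entity, address):
    """Store the address at the leaf, then create pointers upward until a
    node is reached that already knows about the entity."""
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        already_known = entity in node.records
        node.records.setdefault(entity, []).append(child)
        if already_known:
            break
        child, node = node, node.parent


def lookup(leaf, entity):
    """Go up until some node knows the entity, then follow pointers down."""
    node = leaf
    while entity not in node.records:
        if node.parent is None:
            return None  # even the root does not know the entity
        node = node.parent
    while not isinstance(node.records[entity], str):
        node = node.records[entity][0]
    return node.records[entity]


root = DirNode("root")
d1, d2 = DirNode("D1", root), DirNode("D2", root)
l1, l2, l3 = DirNode("L1", d1), DirNode("L2", d2), DirNode("L3", d1)

insert(l1, "E", "addr-in-L1")
print(lookup(l3, "E"))  # found at D1, without going up to the root
print(lookup(l2, "E"))  # has to go up to the root first
```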

Page 33: Naming in Distributed System

Pointer Caches (1)

• Caching a reference to a directory node of the lowest-level domain in which an entity will reside most of the time.

Page 34: Naming in Distributed System

Pointer Caches (2)

• A cache entry that needs to be invalidated because it returns a nonlocal address, while such an address is available.

Page 35: Naming in Distributed System

Scalability Issues

• The scalability issues related to uniformly placing subnodes of a partitioned root node across the network covered by a location service.

Page 36: Naming in Distributed System

The Problem of Unreferenced Objects

• An example of a graph representing objects containing references to each other.

Page 37: Naming in Distributed System

Reference Counting (1)

• The problem of maintaining a proper reference count in the presence of unreliable communication.

Page 38: Naming in Distributed System

Reference Counting (2)

a) Copying a reference to another process and incrementing the counter too late

b) A solution.

Page 39: Naming in Distributed System

Advanced Reference Counting (1)

a) The initial assignment of weights in weighted reference counting

b) Weight assignment when creating a new reference.

Page 40: Naming in Distributed System

Advanced Reference Counting (2)

c) Weight assignment when copying a reference.
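Weighted reference counting can be sketched as below. This is one possible formulation, written for illustration: the skeleton keeps a total weight and its own partial weight, every reference holds a partial weight, and the invariant is that the total equals the sum of all partial weights. The class names and the initial weight of 128 are assumptions, and the indirection needed when a weight reaches 1 is not implemented.

```python
class Skeleton:
    """Server-side part of a remote object."""

    def __init__(self, total_weight=128):
        self.total = total_weight    # weight held by skeleton plus all references
        self.partial = total_weight  # weight still held by the skeleton itself

    def create_reference(self):
        """Give half of the skeleton's partial weight to a new reference."""
        half = self.partial // 2
        self.partial -= half
        return Reference(self, half)

    def is_garbage(self):
        """No remote references remain once all their weight has come back."""
        return self.total == self.partial


class Reference:
    """Client-side reference carrying a partial weight."""

    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def copy(self):
        """Split the weight with the copy; the skeleton is not contacted."""
        if self.weight <= 1:
            raise ValueError("weight exhausted; an indirection would be needed")
        half = self.weight // 2
        self.weight -= half
        return Reference(self.skeleton, half)

    def delete(self):
        """Send the weight back so the skeleton can subtract it from the total."""
        self.skeleton.total -= self.weight
        self.weight = 0


s = Skeleton()
r1 = s.create_reference()
r2 = r1.copy()
print(s.partial, r1.weight, r2.weight)
r1.delete()
r2.delete()
print(s.is_garbage())
```

Copying a reference needs no message to the skeleton, which is the main gain over a plain counter.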

Page 41: Naming in Distributed System

Advanced Reference Counting (3)

• Creating an indirection when the partial weight of a reference has reached 1.

Page 42: Naming in Distributed System

Advanced Reference Counting (4)

• Creating and copying a remote reference in generation reference counting.

Page 43: Naming in Distributed System

Reference Listing (1)

• The skeleton keeps track of its proxies
  – Instead of counting them, it maintains an explicit list of references
• Adding a proxy that is already in the list, or removing one that is not, has no effect
• Idempotent operations
  – Repeatable without affecting the end result
• Increment/decrement operations are clearly not idempotent
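The effect of a duplicated message on a counter and on a reference list can be compared directly. The two classes below are a minimal sketch with invented names; a Python `set` plays the role of the reference list.

```python
class CountingSkeleton:
    """Keeps only a reference count; increment is not idempotent."""

    def __init__(self):
        self.count = 0

    def add(self, proxy):
        self.count += 1

    def remove(self, proxy):
        self.count -= 1


class ListingSkeleton:
    """Keeps an explicit set of proxies; add and remove are idempotent."""

    def __init__(self):
        self.proxies = set()

    def add(self, proxy):
        self.proxies.add(proxy)

    def remove(self, proxy):
        self.proxies.discard(proxy)


counting, listing = CountingSkeleton(), ListingSkeleton()

# The same "add" message for proxy p1 is delivered twice.
for skeleton in (counting, listing):
    skeleton.add("p1")
    skeleton.add("p1")

print(counting.count)        # the duplicate was counted
print(len(listing.proxies))  # the duplicate had no effect
```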

Page 44: Naming in Distributed System

Reference Listing (2)

• Advantages
  – Does not require reliable communication
  – Duplicate messages need not be detected
  – Only insertions and deletions have to be acknowledged
  – Easier to keep the system consistent in case of process failures
• Drawback
  – Scales badly
• Solution
  – Leasing

Page 45: Naming in Distributed System

Identifying Unreachable Entities

• Trace-based garbage collection
  – Scalability problems
• Naïve tracing
  – Mark-and-sweep collectors
    • White, grey, and black marks
• Drawbacks
  – The reachability graph needs to remain the same during both phases
  – No process can run while the GC is running
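A single-process mark-and-sweep pass with white, grey and black marks can be sketched as follows. The object graph is a hypothetical example, and the sketch assumes the graph does not change while the collector runs, which is exactly the drawback listed above.

```python
def mark_and_sweep(graph, roots):
    """Return the set of unreachable objects.

    graph: dict mapping each object to the list of objects it references
    roots: objects that are reachable by definition
    """
    # Mark phase. White = not yet seen, grey = seen but references not yet
    # followed, black = seen and all references followed.
    color = {obj: "white" for obj in graph}
    grey = []
    for root in roots:
        color[root] = "grey"
        grey.append(root)
    while grey:
        obj = grey.pop()
        for ref in graph[obj]:
            if color[ref] == "white":
                color[ref] = "grey"
                grey.append(ref)
        color[obj] = "black"

    # Sweep phase: everything still white is garbage.
    return {obj for obj, c in color.items() if c == "white"}


objects = {
    "root": ["a"],
    "a": ["b"],
    "b": [],
    "c": ["d"],  # c and d reference each other but are unreachable
    "d": ["c"],
}
print(mark_and_sweep(objects, ["root"]))
```

Unlike reference counting, tracing also collects the cycle formed by `c` and `d`.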

Page 46: Naming in Distributed System

Tracing in Groups (1)

• Initial marking of skeletons.

Page 47: Naming in Distributed System

Tracing in Groups (2)

• After local propagation in each process.

Page 48: Naming in Distributed System

Tracing in Groups (3)

• Final marking.

Page 49: Naming in Distributed System

Conclusion

• Naming, the organization of names, and name resolution are key issues in any distributed system

• Locating entities is an open research issue. There are a few methods, such as forwarding pointers, hierarchical approaches, home-based approaches and pointer caches, but each has its own shortcomings

• Reference counting, advanced reference counting and reference listing are a few of the methods that can be used to handle unreferenced objects

Page 50: Naming in Distributed System

- All is well that ends well !

Thank you all

Questions / Comments?