Naming in Distributed Systems
Hong-Linh Truong
Distributed Systems Group, Vienna University of Technology
[email protected]
dsg.tuwien.ac.at/staff/truong
Distributed Systems, WS 2014
What is this lecture about?
- Understand how to create names/identifiers for entities in distributed systems
- Understand how to manage names and how to resolve them to obtain further details about entities
- Examine the main techniques, frameworks, and services for creating and managing names in distributed systems
Learning Materials
- Main reading: Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Chapter 5
- George Coulouris, Jean Dollimore, Tim Kindberg, Gordon Blair, "Distributed Systems – Concepts and Design", 5th Edition, Chapters 10 & 13
- Test the examples in the lecture
Outline
- Basic concepts and design principles
- Flat naming
- Structured naming
- Attribute-based naming
- Some naming systems in the Web
- Summary
Why are naming systems important?
- Entity: any kind of object we see in distributed systems: process, file, printer, host, communication endpoint, etc.
- The usefulness of naming services
  - Identification
  - Providing detailed descriptions
  - Foundations for communication, security, auditing, etc.
Q: Can you list some entities that are relevant to the implementation of communication in distributed systems?
Why are naming systems complex?
- Diverse types of entities, with complex dependencies among them at different levels
  - E.g., a printing service → network-level communication end points → data-link-level communication end points
- There are so many entities: how do we create and manage names, and how do we identify an entity?
Names, identifiers, and addresses
- Name: a set of bits/characters used to identify/refer to an entity, a collective of entities, etc. in a context
  - By simply comparing two names, we might not be able to tell whether they refer to the same entity
- Identifier: a name that uniquely identifies an entity
  - The identifier is unique and refers to only one entity
- Address: the name of an access point, i.e., the location of an entity
[Figure: a Resource, an Access Point, and a Process, connected by the labels "accesses", "Identifier refers to", and "Address binds".]
Naming design principles
- Data models/structures for naming services: information about names
- Processes in naming services
  - E.g., creation, management, update, query, and resolution activities
Naming design principles
- Name space
  - Contains all valid names recognized and managed by a service
  - A valid name might not be bound to any entity
  - Alias: a name that refers to another name
- Naming domain
  - A name space with a single administrative authority that manages its names
- Name resolution
  - A process to look up information/attributes from a name
Naming design principles
- Naming design is based on specific system organizations and characteristics
Examples
- Broadcast link network: Ethernet
  - Identifier: IP and MAC address
  - Name resolution: the network address to the data link address
- Network of independent nodes: P2P systems
  - Identifier: m-bit key
  - Name resolution: distributed hash tables
Naming design principles
- Structures and characteristics of names are based on different purposes
- Data structure
  - Can be simple, with no structure at all, e.g., a set of bits:
    $ uuid
    bcff7102-3632-11e3-8d4a-0050b6590a3a
  - Can be complex: include several data items to reflect different aspects of a single entity
- Names may or may not include location information/references, e.g., GLN (Global Location Number) in logistics
- Readability: human-readable or machine-processable formats
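A flat identifier like the uuid output on this slide can also be produced with Python's standard uuid module. This is only a minimal sketch; the variable names are illustrative and not part of the lecture.

```python
import uuid

# Version 1 UUIDs combine a timestamp with a node id;
# version 4 UUIDs are random. Both are 128-bit flat identifiers.
time_based = uuid.uuid1()
random_based = uuid.uuid4()

print(time_based, "version", time_based.version)
print(random_based, "version", random_based.version)

# The identifier itself carries no description of the entity it names.
print(len(random_based.bytes) * 8, "bits")
```

Comparing two such identifiers only tells us whether they are equal, not anything else about the entities behind them.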
Naming design principles
- Diverse name-to-address binding mechanisms
  - How a name is associated with an address, or how an identifier is associated with an entity
  - Names can change over time and are valid in specific contexts
  - Dynamic or static binding?
- Distributed or centralized management
  - Naming data is distributed over many places or not
- Discovery/resolution protocol
  - Names are managed by distributed services
  - No one/no single system can have a complete view of all names
Examples of relationships among different names/identifiers
[Figure: the URL http://www.cdk5.net:8888/WebExamples/earth.html is mapped by a DNS lookup to a resource ID (IP number 55.55.55.55, port number 8888, pathname WebExamples/earth.html). Further labels: network address 2:60:8c:2:b0:5a, Web server, socket, file.]
Source: Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design, Edn. 5
Flat naming
Q: For which types of systems is flat naming suitable?
- A simple way to represent identifiers
- Do not contain additional information for understanding the entity
- Examples
  - Internet address at the network layer
  - m-bit numbers in distributed hash tables
Unstructured/flat names: identifiers have no structured description, e.g., just a set of bits
Broadcast-based Name Resolution
- Principles
  - Assume that we want to find the access point of the entity en
  - Broadcast the identifier of en, e.g., broadcast(ID(en))
  - When the broadcast message reaches the nodes, only en returns its access point
- Examples
  - ARP: from IP address to MAC address (the data link access point)
mail.infosys.tuwien.ac.at (128.131.172.240) at 00:19:b9:f2:07:55 [ether] on eth0
sw-ea-1.kom.tuwien.ac.at (128.131.172.1) at 00:08:e3:ff:fc:c8 [ether] on eth0
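The broadcast principle can be simulated in a few lines. This sketch is not an ARP implementation: the Node class and the broadcast function are hypothetical, and only the two address pairs are taken from the ARP output above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    identifier: str     # name being resolved, e.g., an IP address
    access_point: str   # address to return, e.g., a MAC address

    def on_broadcast(self, wanted: str) -> Optional[str]:
        # Only the node that owns the identifier answers the request.
        return self.access_point if wanted == self.identifier else None


def broadcast(nodes: List[Node], wanted: str) -> Optional[str]:
    # Every node receives the request; at most one of them replies.
    replies = [node.on_broadcast(wanted) for node in nodes]
    answers = [reply for reply in replies if reply is not None]
    return answers[0] if answers else None


link = [
    Node("128.131.172.240", "00:19:b9:f2:07:55"),
    Node("128.131.172.1", "00:08:e3:ff:fc:c8"),
]
print(broadcast(link, "128.131.172.1"))
```

Every request reaches every node, which is why this approach suits a local broadcast link but not a large network.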
Dynamic systems
- Nodes form a system that has no centralized coordination
  - In an overlay network
  - Nodes can join/leave/fail at any time
- There is a large number of nodes, but a node knows only a subset of them
- Examples
  - Large-scale P2P systems, e.g., Chord, CAN (Content Addressable Network), and Pastry
How do we define identifiers for such a system?
Distributed Hash Tables
Q: Can you explain the data models and the processes for naming in DHT?
- Main concepts
  - An m-bit keyspace is used for identifiers
  - The (processing) node identifier nodeID is one key in the keyspace
  - An entity en is identified through a hash function: k = hash(en)
  - A node with ID p is responsible for managing entities associated with a range of keys
    - If k = hash(en) ∈ range(p), then put(k, en) will store en in p
- Nodes relay messages (including entities/name resolution requests) until the messages reach the right destination
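The put/get idea can be sketched as follows. The assumptions are loud ones: an 8-bit keyspace, SHA-1 truncated to m bits as the hash function, four nodes with fixed contiguous key ranges, and a linear search that stands in for message relaying. None of these choices comes from the lecture.

```python
import hashlib

M = 8  # assumed keyspace size in bits


def key_of(name: str, m: int = M) -> int:
    # Map an entity name to an m-bit key by truncating a SHA-1 digest.
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


class Node:
    def __init__(self, low: int, high: int):
        self.low, self.high = low, high  # range(p) = [low, high]
        self.store = {}

    def owns(self, k: int) -> bool:
        return self.low <= k <= self.high


def find_node(nodes, k):
    # Stands in for relaying a message until it reaches the right node.
    return next(n for n in nodes if n.owns(k))


def put(nodes, name, entity):
    k = key_of(name)
    find_node(nodes, k).store[k] = entity
    return k


def get(nodes, name):
    k = key_of(name)
    return find_node(nodes, k).store[k]


# Four nodes, each responsible for 64 consecutive keys.
nodes = [Node(i * 64, i * 64 + 63) for i in range(4)]
k = put(nodes, "ds.txt", "lecture notes")
print(k, get(nodes, "ds.txt"))
```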
Example - Chord
- A ring network with positions $[0, 2^m - 1]$ for nodes, arranged clockwise
- nodeID = hash(IP)
- The successor of k, successor(k), is the smallest node identifier that is $\geq k$ (in mod $2^m$)
- A key k = hash(en) of entity en is managed by the node p = successor(k), i.e., the first node clockwise from k
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
http://pdos.csail.mit.edu/papers/chord:sigcomm01/
Q: If you want to manage files on 8 computers, how many bits would you use for the keyspace?
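The successor rule can be written directly as a search over the sorted node identifiers. A minimal sketch; the node identifiers and m = 5 are illustrative values chosen for this example.

```python
from bisect import bisect_left


def successor(node_ids, k, m):
    """Return the smallest node identifier >= k on a ring of size 2**m."""
    ring = sorted(node_ids)
    i = bisect_left(ring, k % (2 ** m))  # index of the first node id >= k
    return ring[i % len(ring)]           # wrap around past the largest id


nodes = [1, 4, 9, 11, 14, 18, 20, 21, 28]  # illustrative node ids, m = 5
for key in (9, 12, 26, 29):
    print(key, "->", successor(nodes, key, 5))
```

Key 29 wraps around the ring and is managed by node 1, the first node clockwise from it.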
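Example - Chord
- Resolving at p
  - Keep m entries in a finger table FT:
    $FT_p[i] = \mathrm{successor}\left((p + 2^{i-1}) \bmod 2^m\right), \quad i = 1, \dots, m$
  - If p < k = hash(en) <= successor of p, return the successor of p
  - Otherwise, forward the request to the finger q that most closely precedes k = hash(en)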
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
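The finger-table rule and the two resolution cases can be sketched as below. This is a simplified model, not the Chord protocol itself: every node's finger table is computed from a global, sorted list of node identifiers, and the ring values and helper names are assumptions made for illustration.

```python
from bisect import bisect_left


def successor(ring, k, m):
    # Smallest node id >= k on a ring of size 2**m (ring must be sorted).
    i = bisect_left(ring, k % (2 ** m))
    return ring[i % len(ring)]


def in_half_open(x, a, b, m):
    # True if x lies in the ring interval (a, b], walking clockwise from a.
    size = 2 ** m
    return ((x - a) % size or size) <= ((b - a) % size or size)


def finger_table(ring, p, m):
    # FT_p[i] = successor(p + 2^(i-1)) for i = 1..m
    return [successor(ring, p + 2 ** (i - 1), m) for i in range(1, m + 1)]


def lookup(ring, start, k, m):
    p, visited = start, [start]
    while True:
        fingers = finger_table(ring, p, m)
        if in_half_open(k, p, fingers[0], m):
            return fingers[0], visited  # the successor of p manages k
        # Otherwise forward to the farthest finger that still precedes k.
        p = next(q for q in reversed(fingers)
                 if q != k and in_half_open(q, p, k, m))
        visited.append(p)


ring = [1, 4, 9, 11, 14, 18, 20, 21, 28]  # sorted, illustrative ids, m = 5
print(lookup(ring, 1, 26, 5))
print(lookup(ring, 28, 12, 5))
```

Each call returns the node responsible for the key together with the nodes that forwarded the request; the long first hops show why the finger table shortens the path compared with walking the ring node by node.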
Name spaces
- Names are organized into a name space, which can be modeled as a graph
  - Leaf node versus directory node
  - Each leaf node represents an entity; nodes are also entities
[Figure: a naming graph illustrating an absolute path name, a relative path name, and a directory table of (label, identifier) pairs.]
"Absolute" or "relative" is based on specific contexts
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
Name resolution – Closure Mechanism
- Closure mechanism: determines where and how name resolution starts
  - E.g., name resolution for /home/truong/ds.txt? Or for https://me.yahoo.com/a/.....
- Name resolution: N:<label1, label2, label3, …, labeln>
  - Start from node N
  - Look up (label1, identifier1) in N's directory table
  - Look up (label2, identifier2) in identifier1's directory table, and so on
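The lookup steps and a simple closure mechanism can be sketched with directory tables held in a dictionary. The node identifiers (n0 to n3), the choice of root, and the working directory are all made up for illustration.

```python
# Each directory node maps labels to identifiers of other nodes;
# a leaf node holds the entity itself. All identifiers are hypothetical.
NODES = {
    "n0": {"home": "n1"},          # root directory
    "n1": {"truong": "n2"},
    "n2": {"ds.txt": "n3"},
    "n3": "contents of ds.txt",    # leaf node
}


def resolve(start, labels):
    # Follow one (label, identifier) pair per directory table.
    current = start
    for label in labels:
        table = NODES[current]
        if not isinstance(table, dict):
            raise NotADirectoryError(current)
        current = table[label]
    return current


def resolve_path(path, root="n0", cwd="n2"):
    # Closure mechanism: a leading "/" means "start at the root",
    # anything else starts at the current working directory.
    start = root if path.startswith("/") else cwd
    labels = [part for part in path.split("/") if part]
    return resolve(start, labels)


print(resolve_path("/home/truong/ds.txt"))
print(resolve_path("ds.txt"))
```

Both calls end at the same node, which shows that "absolute" and "relative" differ only in the node where resolution starts.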
Enabling Alias Using Links
- Hard links: multiple absolute path names referring to the same node
- Symbolic links: a leaf node storing an absolute path name
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
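Both kinds of links can be observed on a POSIX file system with Python's os module. This sketch assumes a platform and file system that support hard and symbolic links; the file names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "ds.txt")
    with open(target, "w") as handle:
        handle.write("naming")

    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")
    os.link(target, hard)     # a second path name for the same node (inode)
    os.symlink(target, soft)  # a new leaf node that stores a path name

    same_inode = os.stat(target).st_ino == os.stat(hard).st_ino
    stored_path = os.readlink(soft)

print(same_inode, stored_path == target)
```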
Name resolution - Mounting
A directory node (mounting point) in a remote server can be mounted into a local node (mount point)
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
Name space implementation
- Distributed name management
  - Several servers are used for managing names
- Many distribution layers
  - Global layer: the root node and the nodes close to it
  - Administrational layer: directory nodes managed within a single organization
  - Managerial layer: nodes that typically change regularly
Example in Domain Name System
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
Characteristics of distribution layers
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
Name Resolution
[Figure: three panels, each showing a Name Resolver and Name Servers 1-3 connected by numbered steps: iterative name resolution at the resolver side, iterative name resolution at the server side, and recursive name resolution.]
Example -- Iterative name resolution
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
Example -- Recursive name resolution
Q: What are the pros and cons of recursive name resolution?
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
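The difference between the two styles can be simulated with a toy hierarchy of name servers. Everything here is hypothetical: the server names, the referral table, and the address 192.0.2.10 are invented for the sketch and do not describe the real DNS data for this name.

```python
# Hypothetical name servers: each maps one label either to the next
# server ("referral") or to the final address ("address").
SERVERS = {
    "root":      {"at": ("referral", "ns-at")},
    "ns-at":     {"ac": ("referral", "ns-ac")},
    "ns-ac":     {"tuwien": ("referral", "ns-tuwien")},
    "ns-tuwien": {"www": ("address", "192.0.2.10")},  # made-up address
}


def iterative(labels):
    # The resolver itself contacts every server along the path.
    server, resolver_requests = "root", 0
    for label in labels:
        resolver_requests += 1
        kind, value = SERVERS[server][label]
        if kind == "address":
            return value, resolver_requests
        server = value
    raise LookupError(labels)


def recursive(labels, server="root"):
    # Each server forwards the rest of the name and relays the answer back,
    # so the resolver only ever talks to the first server.
    kind, value = SERVERS[server][labels[0]]
    if kind == "address":
        return value
    return recursive(labels[1:], value)


labels = "www.tuwien.ac.at".split(".")[::-1]  # ["at", "ac", "tuwien", "www"]
print(iterative(labels))
print(recursive(labels))
```

In the iterative variant the resolver sends one request per server, while in the recursive variant it sends a single request and the servers carry the remaining load.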
Example -- Domain Name System (DNS) in the Internet
- We tend to remember "human-readable" machine names, which form a name hierarchy
  - E.g., www.facebook.com
- But machines in the Internet use IP addresses
  - E.g., 31.13.84.33
  - Application communication uses IP addresses and ports
- DNS
  - Mapping from the domain name hierarchy to IP addresses
www.facebook.com canonical name = star.c10r.facebook.com.
Name: star.c10r.facebook.com
Address: 31.13.84.33
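The lookup output above shows an alias (canonical name) being followed before an address is returned. A minimal sketch of that chain, using a local table that contains only the two records shown in the output; the table layout and the function name are assumptions, and no real DNS query is made.

```python
# Records copied from the lookup output above; the table is local data,
# not a live DNS query.
RECORDS = {
    ("www.facebook.com", "CNAME"): "star.c10r.facebook.com",
    ("star.c10r.facebook.com", "A"): "31.13.84.33",
}


def resolve_a(name, max_hops=8):
    # Follow canonical-name aliases until an address record is found.
    for _ in range(max_hops):
        if (name, "A") in RECORDS:
            return name, RECORDS[(name, "A")]
        if (name, "CNAME") not in RECORDS:
            break
        name = RECORDS[(name, "CNAME")]
    raise LookupError(name)


print(resolve_a("www.facebook.com"))
```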
Domain Name System (DNS) in Internet
Information in records of DNS namespace
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
DNS Name Servers
- Authoritative name server: answers requests for a zone
  - Primary and secondary servers: the main server and the replicated server (which maintains data copied from the main server)
- Caching server
[Figure: a hierarchy of administered zones, each with its own name server, below the root name servers. Example labels: root, com, at, ac, tuwien.]
DNS Queries
- Simple host name resolution
  - What is the IP address of www.tuwien.ac.at?
- Email server name resolution
  - Which is the email server for [email protected]?
- Reverse resolution
  - From IP to hostname
- Host information
- Other services
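Reverse resolution works by turning the IP address into a name in a special domain and looking that name up. Python's standard ipaddress module can build this name; the address below is the mail server address from the ARP example earlier in the lecture, used here only as sample input.

```python
import ipaddress

# Build the domain name that a reverse (PTR) query would ask for.
address = ipaddress.ip_address("128.131.172.240")
ptr_name = address.reverse_pointer

print(ptr_name)  # 240.172.131.128.in-addr.arpa
```

The octets appear in reverse order so that the name follows the usual hierarchy, with the most general part on the right.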
Examples
Iterative hostname resolution: http://www.simpledns.com/lookup-dg.aspx
Mail server resolution: https://www.mailive.com/mxlookup/
Attributes/Values
- A tuple (attribute, value) can be used to describe a property
  - E.g., ("country", "Austria"), ("language", "German")
- A set of tuples (attribute, value) can be used to describe an entity

AustriaInfo
Attribute      Value
CountryName    Austria
Language       German
MemberofEU     Yes
Capital        Vienna
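An entity described by (attribute, value) tuples can be found by querying on any subset of its attributes. A minimal sketch: the AustriaInfo entry mirrors the table above, while the GermanyInfo entry and the search function are additions made only for illustration.

```python
directory = {
    "AustriaInfo": {
        "CountryName": "Austria",
        "Language": "German",
        "MemberofEU": "Yes",
        "Capital": "Vienna",
    },
    # Second entry added for illustration only.
    "GermanyInfo": {
        "CountryName": "Germany",
        "Language": "German",
        "MemberofEU": "Yes",
        "Capital": "Berlin",
    },
}


def search(entries, **wanted):
    # Scan the whole space and keep entries matching every requested pair.
    return sorted(
        name
        for name, attributes in entries.items()
        if all(attributes.get(a) == v for a, v in wanted.items())
    )


print(search(directory, Language="German"))
print(search(directory, Language="German", Capital="Vienna"))
```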
Attribute-based naming systems
- Employ (attribute, value) tuples for describing entities
  - Why are flat and structured naming not enough?
- Also called directory services
- Name resolution
  - Usually based on a querying mechanism
  - Querying usually deals with the whole space
- Implementations
  - LDAP
  - RDF (Resource Description Framework)
LDAP data model
- Object class: describes information about objects/entities using (attribute, value) tuples
  - Hierarchical object classes
- Directory entry: an object entry for a particular object, an alias entry for alternative naming, and a subentry for other information
- Directory Information Base (DIB): the collection of all directory entries
  - Each entry is identified by a distinguished name (DN)
- Directory Information Tree (DIT): the tree structure for the entries in the DIB
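The relationship between a DN and the DIT can be sketched by splitting a DN into its components and walking a tree from the root. This is a simplification: the DN below is hypothetical, the split ignores escaped commas that real DNs may contain, and the nested-dictionary tree is just one possible representation.

```python
def dn_to_path(dn):
    # Naive split: real DNs may contain escaped commas, ignored here.
    rdns = [part.strip() for part in dn.split(",")]
    # A DN lists the most specific component first, so reverse it
    # to obtain the path from the root of the tree.
    return [tuple(rdn.split("=", 1)) for rdn in reversed(rdns)]


def add_entry(tree, dn, attributes):
    node = tree
    for rdn in dn_to_path(dn):
        node = node.setdefault("children", {}).setdefault(rdn, {})
    node["attributes"] = attributes


dit = {}
add_entry(dit, "cn=www,ou=dsg,o=tuwien,c=at", {"objectClass": "device"})
print(dn_to_path("cn=www,ou=dsg,o=tuwien,c=at"))
```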
LDAP – Lightweight Directory Access Protocol
- http://tools.ietf.org/html/rfc4510
- Example of attributes/values
Source: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems – Principles and Paradigms, 2nd Edition, 2007, Prentice-Hall
LDAP -- Interaction
- Client-server protocol
[Figure: a client (Directory User Agent) exchanges queries/results with LDAP servers (Directory System Agents). Each server holds a Directory Information Base (DIB) fragment, the servers are connected by referrals, and the fragments form the Directory Information Tree for the whole service.]
- Example with Apache DS/DS Studio
  - http://directory.apache.org/
  - Apache DS: a directory service supporting LDAP and others
  - Apache Directory Studio: tooling platform for LDAP
Web services – service identifier
- Web service: basically an entity that offers software functions via well-defined, interoperable interfaces that can be accessed through the network
  - E.g., http://www.webservicex.net/globalweather.asmx
- Web service identifier
  - A web service can be described via WSDL
  - Inside the WSDL, there are several "addresses" that identify where and how to call the service access points
Web services -- discovery
[Figure: a Web Services Provider provides Web services and publishes them to a Web Services Registry (backed by storage); a Web Services Consumer searches the registry, receives results, and uses the Web services.]
- Registry implementations
  - WSO2 Governance Registry - http://wso2.com/products/governance-registry/
  - Java UDDI (jUDDI) - http://juddi.apache.org/
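The publish/search/use roles can be sketched with an in-memory registry. This is not the API of WSO2 Governance Registry or jUDDI: the Registry class, its methods, the keywords, and the second endpoint URL are all hypothetical.

```python
class Registry:
    def __init__(self):
        self._entries = {}  # service name -> (endpoint URL, keywords)

    def publish(self, name, endpoint, keywords):
        # Called by a provider to make a service discoverable.
        self._entries[name] = (endpoint, set(keywords))

    def search(self, keyword):
        # Called by a consumer; returns (name, endpoint) pairs.
        return sorted(
            (name, endpoint)
            for name, (endpoint, keywords) in self._entries.items()
            if keyword in keywords
        )


registry = Registry()
registry.publish(
    "GlobalWeather",
    "http://www.webservicex.net/globalweather.asmx",
    ["weather"],
)
registry.publish("Echo", "http://services.example.org/echo", ["test"])

print(registry.search("weather"))
```

The consumer gets back an endpoint address and then contacts the service directly, so the registry is only involved in discovery.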
OpenID – people identifier in the Web
- Several services offer individual identifiers
  - Your Google ID, your Yahoo ID, etc.
- But there will be no single provider for all people
- The OpenID standard enables identifiers for people that can be accepted by several service providers
- An OpenID identifier is described as a URL
  - E.g., https://me.yahoo.com/a/.....
- We need mechanisms to accept identifiers from different providers
Q: Why can an OpenID identifier be considered unique?
OpenID interactions
[Figure: interactions among a User Agent (e.g., Web browser), a Relying Party (e.g., Web site) that holds the entities, and an OpenID Provider that provides the OpenID identifier. Labeled steps: accesses an entity (1); accesses (2); establishes shared secret (3); redirects authentication (4); authenticates (5); redirects authentication result (6); verify authentication result; access entities (7); returns result (8).]
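The role of the shared secret in steps (3) and (6) can be illustrated with a keyed hash. This is a strong simplification and not the OpenID message format: the identifier URL, the function names, and the use of HMAC-SHA256 over the bare identifier are assumptions made for the sketch.

```python
import hashlib
import hmac
import secrets

# Step (3): relying party and provider agree on a shared secret.
shared_secret = secrets.token_bytes(32)


def provider_sign(identifier: str, secret: bytes) -> str:
    # The provider signs the authentication result it sends back.
    return hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def relying_party_verify(identifier: str, signature: str, secret: bytes) -> bool:
    # The relying party recomputes the signature and compares it safely.
    expected = hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


identifier = "https://openid.example.org/alice"  # hypothetical identifier
signature = provider_sign(identifier, shared_secret)

print(relying_party_verify(identifier, signature, shared_secret))
print(relying_party_verify("https://openid.example.org/bob", signature, shared_secret))
```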
Problems
- A very big organization in the EU has many services and its own employees at different locations. It uses distributed LDAP servers for managing the names/identifiers of its employees and services
- The organization has a lot of external users from different companies as well as freelancers (external partners)
  - Some companies are big, with a lot of people working for the organization for a short term; some have only a few people
- The organization wants to support collaboration among members of different teams; a team consists of people from the organization and from external partners
- The organization does not want to manage external people, but it trusts its external partners
Approach to solution
- The organization asked us for possible solutions for managing team members while allowing them to access different services of the organization
- We suggested that the organization
  - Develop an OpenID service so that the organization is also an OpenID provider, using OpenID-to-LDAP software to interface with the internal LDAP servers
  - Develop a naming service that interfaces with external OpenID servers and the organization's OpenID service
  - Each team consists of a set of members; each member is uniquely identified by an OpenID
  - Each team is associated with a set of services that it can use; the service information is stored in the LDAP server
Homework: design your solution based on our suggestion so that, given a team, you can find out member details and team services
Summary
- Naming is a complex issue
  - Fundamental for other topics, e.g., communication and access control in distributed systems
- Data models/structures versus processes
- Different models
  - Flat, structured, and attribute-based naming
- Different techniques to manage names
  - Centralized versus distributed
- Different protocols for name resolution
- Don't forget to play with some simple examples to understand existing concepts
Thanks for your attention
Hong-Linh Truong
Distributed Systems Group
Vienna University of Technology
[email protected]
http://dsg.tuwien.ac.at/staff/truong