BSIT 63 Advanced Computer Networks

Chapter 1 Application Layer

This chapter presents some of the important functions of the application layer, mainly DNS, Email, and the WWW.

1.1 INTRODUCTION

The application layer is the outermost layer of the TCP/IP architecture. This layer is responsible for many of the user applications such as the WWW, Email, FTP, and DNS. In this chapter the reader will get a basic understanding of the concepts behind some popular application layer functions. The lower layers of the TCP/IP model provide transport support. However, some support functions at the application layer are still essential for applications to run; one of the most important of these is DNS.

1.2 DOMAIN NAME SYSTEM (DNS)

It is well known that IP addresses are used to identify devices on the Internet, such as routers and servers. In the absence of a domain name for an email server, we would have ended up with representations such as [email protected] or [email protected]. Such a representation is very difficult to remember, and practically impossible if there are hundreds of such email IDs. Moreover, if the email service is moved to a different machine with a different IP address, this scheme breaks down entirely. If this is the case for email, what about the thousands of websites? For example, http://202.16.70.2/~index.html is a URL: we would need to remember the entire number to access the page. IP addresses are thus difficult to remember, and DNS is the solution to this problem.
- Domain names are case insensitive; for example, edu, Edu, and EDU are the same. Each component name can be up to 63 characters, and a full path name should not exceed 255 characters.
- Naming follows organizational boundaries, not physical networks.
Resource Records:
- Every domain is associated with a set of resource records.
- In response to every query, the resolver is supplied with resource records.
- Thus the primary function of DNS is to map domain names onto resource records.
- Resource records have five components:
  1. Domain name
  2. Time to live
  3. Class
  4. Type
  5. Value

I. Domain name: tells the domain to which this record applies. It is the primary search key.
II. Time to live: indicates how stable the record is. A highly stable record has a value of 86400 (the number of seconds in one day); unstable records may have a value as low as 60 (one minute).
III. Class: for Internet information, its value is IN; other codes are used for other applications.
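The five components listed above can be pictured as a simple record structure. The following is a minimal sketch in Python; the domain and server names are hypothetical, chosen only to illustrate the field layout.

```python
from collections import namedtuple

# A DNS resource record has the five components listed above.
ResourceRecord = namedtuple(
    "ResourceRecord", "domain_name ttl rr_class rr_type value")

# A stable record for a (hypothetical) mail server: TTL of one day
# (86400 seconds), class IN for Internet information.
rr = ResourceRecord("example.edu", 86400, "IN", "MX", "mail.example.edu")

print(rr.domain_name)   # example.edu
print(rr.ttl)           # 86400
print(rr.rr_class)      # IN
```

The resolver would hand back a list of such records in answer to a query; the record type (MX here) tells the application how to interpret the value field.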
The message transfer system is concerned with relaying messages from the originator to the recipient. The simplest way to do this is to establish a transport connection from the source machine to the destination machine and then just transfer the message.
SMTP: The Simple Mail Transfer Protocol
Within the Internet, email is delivered by having the source machine establish a TCP connection to port 25 of the destination machine. Listening on this port is an email daemon that speaks SMTP (Simple Mail Transfer Protocol). This daemon accepts incoming connections and copies messages from them into the appropriate mailboxes. If a message cannot be delivered, an error report containing the first part of the undeliverable message is returned to the sender. SMTP is a simple ASCII protocol. After establishing the TCP connection to port 25, the sending machine, operating as the client, waits for the receiving machine, operating as the server, to talk first. The server starts by sending a line of text giving its identity and telling whether it is prepared to receive mail. If it is not, the client releases the connection and tries again later.
If the server is willing to accept email, the client announces whom the email is coming from and whom it is going to. If such a recipient exists at the destination, the server gives the client the go-ahead to send the message. Then the client sends the message and the server acknowledges it. No checksums are needed because TCP provides a reliable byte stream. If there is more email, it is now sent. When all the email has been exchanged in both directions, the connection is released. Finally, although the syntax of the four-character commands from the client is rigidly specified, the syntax of the replies is less rigid: only the numerical code really counts, and each implementation can put whatever string it wants after the code.
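The dialogue described above can be sketched in a few lines. This is not a working mail client, just an illustration of the command sequence and of the rule that only the numeric reply code matters; the host names, addresses, and function names are hypothetical.

```python
def smtp_commands(sender, recipient, body, client_host="client.example.com"):
    """Return the ordered ASCII command lines an SMTP client would send."""
    return [
        f"HELO {client_host}",        # client identifies itself
        f"MAIL FROM:<{sender}>",      # announce whom the email is from
        f"RCPT TO:<{recipient}>",     # announce whom it is going to
        "DATA",                       # ask to send the message body
        body,
        ".",                          # a line with a lone dot ends the body
        "QUIT",                       # release the connection
    ]

def reply_code(server_line):
    """Only the three-digit code at the start of a server reply counts;
    the text after it is implementation-defined."""
    return int(server_line[:3])

cmds = smtp_commands("alice@example.com", "bob@example.com", "Hello")
print(cmds[0])                                    # HELO client.example.com
print(reply_code("220 mail.isp.com ESMTP ready")) # 220
```

Note how the client speaks only after the server's 220 greeting, mirroring the "server talks first" behavior described above.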
To get a better feel for how SMTP and some of the other protocols described in this chapter work, try them out. In all cases, first go to a machine connected to the Internet. On a UNIX system, in a shell, type

telnet mail.isp.com 25

substituting the DNS name of your ISP's mail server for mail.isp.com. On a Windows system, click Start, then Run, and type the command in the dialog box. This command will establish a telnet (i.e., TCP) connection to port 25 on that machine. Port 25 is the SMTP port; you will probably get a greeting response from the server.
verifies the cache and responds if the file exists; otherwise it invokes a disk search, caches the file, and also sends the file to the client.

At any instant of time t, out of k modules, k - x modules may be free to take requests while x modules may be waiting in the queue for disk access or a cache search. If the number of disks is increased, it is possible to increase the speed.
Figure 1.8 A Multi-threaded Web Server
Each module does the following:
1. Resolve the name of the Web page requested, e.g., http://www.cisco.com. There is no file name here; the default is index.html.
2. Authenticate the client. This is needed because some pages are not available to the public.
3. Perform access control on the client: check to see whether there are any restrictions on this client.
4. Perform access control on the Web page: check access restrictions on the page itself.
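The per-request steps above can be sketched as a small request handler. This is only an illustration: the names `PRIVATE_PAGES` and `handle_request` are inventions for this sketch, and a real server module would back the authentication and access-control steps with actual logic rather than a set lookup.

```python
from urllib.parse import urlparse

# Pages not available to the public (step 2's motivation); hypothetical.
PRIVATE_PAGES = {"/admin.html"}

def resolve_page(url):
    """Step 1: resolve the requested page; default to index.html."""
    path = urlparse(url).path
    if path in ("", "/"):
        path = "/index.html"
    return path

def handle_request(url, client_authenticated=False):
    """Steps 2-4: authenticate the client and apply access control to
    both the client and the page itself (stubbed here)."""
    page = resolve_page(url)
    if page in PRIVATE_PAGES and not client_authenticated:
        return 403, page          # access denied
    return 200, page              # OK: serve the page

print(handle_request("http://www.cisco.com"))   # (200, '/index.html')
```

Each worker module in the multithreaded server of Figure 1.8 would run a loop around a handler like this one.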
- To discuss the two types of connections for effecting datagram transfer between networks
- To discuss direct and indirect routing
- To discuss different routing protocols
2.1 INTRODUCTION
One of the main objectives of the network layer is to deliver the packets to the destination. The
delivery of packets is often accomplished using either a connection-oriented or a connectionless
network service. In a connection-oriented approach, the network layer protocol first makes a
connection with the network layer protocol at the remote site before sending a packet. When the connection
is established, a sequence of packets from the same source to the same destination can be sent one after
another. In this case, there is a relationship between packets. They are sent on the same path where they
follow each other. A packet is logically connected to the packet traveling before it and to the packet traveling after it. When all packets of a message have been delivered, the connection is terminated. In a connection-oriented approach, the decision about the route of a sequence of packets with the same source and destination addresses is made only once, when the connection is established; the network device does not compute the route again and again for each arriving packet. In a connectionless situation, the network protocol treats each packet independently, with each packet having no relationship to any other packet. The packets in a message may not travel the same path to their destination. The Internet Protocol (IP) is a connectionless protocol. It handles each packet transfer separately. This means each packet may follow a different route to the destination.
- Note also that the last delivery is always a direct delivery. In an indirect delivery, the sender uses the destination IP address and a routing table to find the IP address of the next router to which the packet should be delivered.
- The sender then uses the ARP protocol to find the physical address of that next router. Note that in direct delivery, the address mapping is between the IP address of the final destination and the physical address of the final destination.
- In an indirect delivery, the address mapping is between the IP address of the next router and the physical address of the next router.
Figure 2.2 Indirect Delivery
Routing tables are used in routers. A routing table contains a list of IP addresses of neighboring routers. When a router receives a packet to be forwarded, it looks at this table to find the route to the final destination. However, this simple solution is impractical today in an internetwork such as the Internet, because the number of entries in the routing table would make table lookups inefficient. Several techniques can keep the size of the routing table manageable and handle issues such as security.

A static routing table contains information entered manually. The administrator enters the route for each destination into the table. Once created, the table cannot update automatically when there is a change in the internet; it must be altered manually by the administrator. A static routing table can be used in a small internet that does not change very much, or in an experimental internet for troubleshooting. It is not a good strategy to use a static routing table in a large internet such as the Internet.
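A manually entered table and its lookup can be sketched with Python's `ipaddress` module. The networks and next-hop addresses below are hypothetical; the lookup uses longest-prefix match, as real routers do, so the most specific matching entry wins.

```python
import ipaddress

# A static routing table: (destination network, next-hop router).
# Entries are entered manually by the administrator; all values here
# are hypothetical.
STATIC_ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "10.0.0.1"),
    (ipaddress.ip_network("192.168.0.0/16"), "10.0.0.2"),
    (ipaddress.ip_network("0.0.0.0/0"),      "10.0.0.254"),  # default route
]

def next_hop(destination):
    """Return the next hop for a destination, preferring the most
    specific (longest-prefix) matching entry."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in STATIC_ROUTES if dest in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(next_hop("192.168.1.5"))   # 10.0.0.1  (most specific /24 wins)
print(next_hop("172.16.0.9"))    # 10.0.0.254 (falls to the default route)
```

A dynamic routing protocol, described next, would update `STATIC_ROUTES` automatically instead of relying on the administrator.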
2.5 DYNAMIC ROUTING TABLE
A dynamic routing table is updated periodically using one of the dynamic routing protocols such as RIP,
OSPF, or BGP. Whenever there is a change in the Internet, such as the shutdown of a router or breaking
of a link, the dynamic routing protocols update all of the tables in the routers.
2.6 ROUTING INFORMATION PROTOCOL (RIP)
The Routing Information Protocol, or RIP, as it is more commonly called, is one of the most enduring
of all routing protocols. RIP is also one of the more easily confused protocols because a variety of RIP-
like routing protocols proliferated, some of which even used the same name! RIP and the myriad RIP-
like protocols were based on the same set of algorithms that use distance vectors to mathematically
compare routes to identify the best path to any given destination address.
Today’s open standard version of RIP, sometimes referred to as IP RIP, is formally defined in two
documents: Request For Comments (RFC) 1058 and Internet Standard (STD) 56. As IP-based networks
became both more numerous and greater in size, it became apparent to the Internet Engineering Task
Force (IETF) that RIP needed to be updated. Consequently, the IETF released RFC 1388 in January
1993, which was then superseded in November 1994 by RFC 1723, which describes RIP 2 (the second
version of RIP). These RFCs described an extension of RIP’s capabilities but did not attempt to obsolete
the previous version of RIP. RIP 2 enabled RIP messages to carry more information, which permitted the
use of a simple authentication mechanism to secure table updates. More importantly, RIP 2 supported
subnet masks, a critical feature that was not available in RIP.
Routing Updates

RIP sends routing-update messages at regular intervals and when the network topology changes.
When a router receives a routing update that includes changes to an entry, it updates its routing table to
reflect the new route. The metric value for the path is increased by 1, and the sender is indicated as the
next hop. RIP routers maintain only the best route (the route with the lowest metric value) to a destination.
After updating its routing table, the router immediately begins transmitting routing updates to inform other
network routers of the change. These updates are sent independently of the regularly scheduled updates
that RIP routers send.
RIP Routing Metric

RIP uses a single routing metric (hop count) to measure the distance between the source and a
destination network. Each hop in a path from source to destination is assigned a hop count value, which is
typically 1. When a router receives a routing update that contains a new or changed destination network
entry, the router adds 1 to the metric value indicated in the update and enters the network in the routing
table. The IP address of the sender is used as the next hop.
RIP Stability Features
RIP prevents routing loops from continuing indefinitely by implementing a limit on the number of hops
allowed in a path from the source to a destination. The maximum number of hops in a path is 15. If a
router receives a routing update that contains a new or changed entry, and if increasing the metric value
by 1 causes the metric to be infinity (that is, 16), the network destination is considered unreachable. The
downside of this stability feature is that it limits the maximum diameter of a RIP network to less than 16
hops.
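The metric arithmetic described above (add 1 per hop, treat 16 as infinity) can be sketched as follows. The table layout and function name are inventions for illustration, and real RIP also involves timers, split horizon, and hold-down, all omitted here.

```python
INFINITY = 16   # in RIP, a metric of 16 means the destination is unreachable

def process_update(table, neighbor, advertised):
    """Apply a RIP routing update received from `neighbor`.

    `table` maps destination network -> (metric, next_hop).
    `advertised` maps destination network -> metric as seen by the neighbor.
    Returns True if any entry changed (triggering an immediate update).
    """
    changed = False
    for dest, metric in advertised.items():
        new_metric = min(metric + 1, INFINITY)   # add 1 hop for this link
        current = table.get(dest)
        # Install if the route is new, strictly better, or comes from the
        # neighbor that is already our next hop (which may worsen it).
        if (current is None or new_metric < current[0]
                or current[1] == neighbor):
            if current != (new_metric, neighbor):
                table[dest] = (new_metric, neighbor)
                changed = True
    return changed

table = {}
process_update(table, "R2", {"10.1.0.0/16": 1})
print(table["10.1.0.0/16"])   # (2, 'R2')
```

If the neighbor later advertises the route at metric 15, adding 1 makes it 16 and the destination is considered unreachable, exactly as described above.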
RIP includes a number of other stability features that are common to many routing protocols. These
features are designed to provide stability despite potentially rapid changes in a network’s topology. For
example, RIP implements the split horizon and hold down mechanisms to prevent incorrect routing
information from being propagated.
RIP Timers
RIP uses numerous timers to regulate its performance. These include a routing-update timer, a route-
timeout timer, and a route-flush timer. The routing-update timer clocks the interval between periodic
routing updates. Generally, it is set to 30 seconds, with a small random amount of time added whenever
the timer is reset. This is done to help prevent congestion, which could result from all routers simultaneously
attempting to update their neighbors. Each routing table entry has a route-timeout timer associated with it.
When the route-timeout timer expires, the route is marked invalid but is retained in the table until the
route-flush timer expires.
Packet Formats
The following section focuses on the IP RIP and IP RIP 2 packet formats illustrated in Figures 2.3 and 2.4. Each illustration is followed by descriptions of the fields illustrated.
The following descriptions summarize the IP RIP 2 packet format fields illustrated in Figure 2.4 :
- Command—Indicates whether the packet is a request or a response. The request asks that a router send all or part of its routing table. The response can be an unsolicited regular routing update or a reply to a request. Responses contain routing table entries. Multiple RIP packets are used to convey information from large routing tables.
- Version—Specifies the RIP version used. In a RIP packet implementing any of the RIP 2 fields or using authentication, this value is set to 2.
- Unused—Has a value set to zero.
- Address-family identifier (AFI)—Specifies the address family used. RIPv2's AFI field functions identically to RFC 1058 RIP's AFI field, with one exception: if the AFI for the first entry in the message is 0xFFFF, the remainder of the entry contains authentication information. Currently, the only authentication type is simple password.
- Route tag—Provides a method for distinguishing between internal routes (learned by RIP) and external routes (learned from other protocols).
- IP address—Specifies the IP address for the entry.
- Subnet mask—Contains the subnet mask for the entry. If this field is zero, no subnet mask has been specified for the entry.
- Next hop—Indicates the IP address of the next hop to which packets for the entry should be forwarded.
- Metric—Indicates how many internetwork hops (routers) have been traversed in the trip to the destination. This value is between 1 and 15 for a valid route, or 16 for an unreachable route.
2.7 OPEN SHORTEST PATH FIRST
Open Shortest Path First (OSPF) is a routing protocol developed for Internet Protocol (IP) networks
by the Interior Gateway Protocol (IGP) working group of the Internet Engineering Task Force (IETF).
The working group was formed in 1988 to design an IGP based on the Shortest Path First (SPF) algorithm
for use in the Internet. Similar to the Interior Gateway Routing Protocol (IGRP), OSPF was created because, in the mid-1980s, the Routing Information Protocol (RIP) was increasingly incapable of serving large, heterogeneous internetworks. This chapter examines the OSPF routing environment, the underlying routing algorithm, and general protocol components.

OSPF was derived from several research efforts, including Bolt, Beranek, and Newman's (BBN's) SPF algorithm developed in 1978 for the ARPANET (a landmark packet-switching network developed in
The following descriptions summarize the OSPF packet header fields.
- Version number—Identifies the OSPF version used.
- Type—Identifies the OSPF packet type as one of the following:
  - Hello—Establishes and maintains neighbor relationships.
  - Database description—Describes the contents of the topological database. These messages are exchanged when an adjacency is initialized.
  - Link-state request—Requests pieces of the topological database from neighbor routers. These messages are exchanged after a router discovers (by examining database-description packets) that parts of its topological database are outdated.
  - Link-state update—Responds to a link-state request packet. These messages also are used for the regular dispersal of LSAs. Several LSAs can be included within a single link-state update packet.
  - Link-state acknowledgment—Acknowledges link-state update packets.
- Packet length—Specifies the packet length, including the OSPF header, in bytes.
- Router ID—Identifies the source of the packet.
- Area ID—Identifies the area to which the packet belongs. All OSPF packets are associated with a single area.
- Checksum—Checks the entire packet contents for any damage suffered in transit.
- Authentication type—Contains the authentication type. All OSPF protocol exchanges are authenticated. The authentication type is configurable on a per-area basis.
- Authentication—Contains authentication information.
- Data—Contains encapsulated upper-layer information.
Additional OSPF Features
Additional OSPF features include equal-cost multipath routing and routing based on upper-layer type-of-service (TOS) requests. TOS-based routing supports those upper-layer protocols that can specify particular types of service. An application, for example, might specify that certain data is urgent. If OSPF has high-priority links at its disposal, these can be used to transport the urgent datagrams.

OSPF supports one or more metrics. If only one metric is used, it is considered to be arbitrary, and TOS is not supported. If more than one metric is used, TOS is optionally supported through the use of a separate metric (and, therefore, a separate routing table) for each of the eight combinations created by the three IP type-of-service bits.
BGP is a very robust and scalable routing protocol, as evidenced by the fact that BGP is the routing
protocol employed on the Internet. At the time of this writing, the Internet BGP routing tables number
more than 90,000 routes. To achieve scalability at this level, BGP uses many route parameters, called
attributes, to define routing policies and maintain a stable routing environment.
In addition to BGP attributes, classless interdomain routing (CIDR) is used by BGP to reduce the size
of the Internet routing tables. For example, assume that an ISP owns the IP address block 195.10.x.x
from the traditional Class C address space. This block consists of 256 Class C address blocks, 195.10.0.x
through 195.10.255.x. Assume that the ISP assigns a Class C block to each of its customers. Without
CIDR, the ISP would advertise 256 Class C address blocks to its BGP peers. With CIDR, BGP can
supernet the address space and advertise one block, 195.10.x.x. This block is the same size as a traditional
Class B address block. The class distinctions are rendered obsolete by CIDR, allowing a significant
reduction in the BGP routing tables.
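The supernetting example above can be checked with Python's `ipaddress` module: the 256 Class C blocks 195.10.0.0/24 through 195.10.255.0/24 collapse into the single block 195.10.0.0/16, which is what the ISP would advertise with CIDR.

```python
import ipaddress

# The 256 traditional Class C blocks owned by the ISP in the example.
blocks = [ipaddress.ip_network(f"195.10.{i}.0/24") for i in range(256)]

# CIDR lets BGP advertise these as one supernet instead of 256 routes.
supernet = list(ipaddress.collapse_addresses(blocks))
print(supernet)   # [IPv4Network('195.10.0.0/16')]
```

One advertised prefix instead of 256 is exactly the routing-table reduction the paragraph describes.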
BGP neighbors exchange full routing information when the TCP connection between them is first established. When changes to the routing table are detected, BGP routers send their neighbors only those routes that have changed. BGP routers do not send periodic routing updates, and BGP routing updates advertise only the optimal path to a destination network.
BGP Attributes
Routes learned via BGP have associated properties that are used to determine the best route to a
destination when multiple paths exist to a particular destination. These properties are referred to as BGP
attributes, and an understanding of how BGP attributes influence route selection is required for the design
of robust networks. This section describes the attributes that BGP uses in the route selection process:
- Weight
- Local preference
- Multi-exit discriminator
- Origin
- AS_path
- Next hop
- Community
Weight Attribute
Weight is a Cisco-defined attribute that is local to a router. The weight attribute is not advertised to
neighboring routers. If the router learns about more than one route to the same destination, the route with
the highest weight will be preferred. In Figure 2.8, Router A is receiving an advertisement for network
When a route advertisement passes through an autonomous system, the AS number is added to an
ordered list of AS numbers that the route advertisement has traversed. Figure 2.11 shows the situation in which a route is passing through three autonomous systems.
AS 1 originates the route to 172.16.1.0 and advertises it to AS 2 and AS 3, with the AS_path attribute equal to {1}. AS 3 will advertise back to AS 1 with AS_path attribute {3,1}, and AS 2 will advertise back to AS 1 with AS_path attribute {2,1}. AS 1 will reject these routes when its own AS number is detected in the route advertisement; this is the mechanism that BGP uses to detect routing loops. AS 2 and AS 3 propagate the route to each other with their AS numbers added to the AS_path attribute. These routes will not be installed in the IP routing table, because AS 2 and AS 3 are learning a route to 172.16.1.0 from AS 1 with a shorter AS_path list.
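The loop-detection rule just described can be sketched in a few lines; the function name is an invention for illustration. A router rejects any advertisement whose AS_path already contains its own AS number, and otherwise prepends its AS before propagating.

```python
def receive_advertisement(my_as, as_path):
    """Return the AS_path to propagate, or None if the route is rejected
    because our own AS number already appears in it (a routing loop)."""
    if my_as in as_path:
        return None               # loop detected: reject the route
    return [my_as] + as_path      # prepend our AS and propagate

# AS 1 sees its own number in {3,1} and rejects the route.
print(receive_advertisement(1, [3, 1]))   # None
# AS 2 accepts the route from AS 1 and propagates it as {2,1}.
print(receive_advertisement(2, [1]))      # [2, 1]
```

This reproduces the figure's behavior: AS 1 rejects {3,1} and {2,1}, while AS 2 and AS 3 keep extending the path as they propagate the route.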
Next-Hop Attribute

The EBGP next-hop attribute is the IP address that is used to reach the advertising router. For EBGP peers, the next-hop address is the IP address of the connection between the peers. For IBGP, the EBGP next-hop address is carried into the local AS, as illustrated in Figure 2.12.
Figure 2.15 demonstrates the internet community attribute. There are no limitations to the scope of the
route advertisement from AS 1.
Figure 2.15 BGP Internet Community Attribute
BGP Path Selection
BGP may receive multiple advertisements for the same route from multiple sources, but it selects only one path as the best path. When the path is selected, BGP puts the selected path in the IP routing table and propagates the path to its neighbors. BGP uses the following criteria, in the order presented, to select a path for a destination:
- If the path specifies a next hop that is inaccessible, drop the update.
- Prefer the path with the largest weight.
- If the weights are the same, prefer the path with the largest local preference.
- If the local preferences are the same, prefer the path that was originated by BGP running on this router.
- If no route was originated, prefer the route that has the shortest AS_path.
- If all paths have the same AS_path length, prefer the path with the lowest origin type (where IGP is lower than EGP, and EGP is lower than incomplete).
- If the origin codes are the same, prefer the path with the lowest MED attribute.
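The decision list above (through the MED step) can be sketched as a single sort key. This is a simplification: the "originated by BGP on this router" step is omitted, and the attribute dictionary layout is an invention for illustration.

```python
# Lower origin type is preferred: IGP < EGP < incomplete.
ORIGIN_RANK = {"IGP": 0, "EGP": 1, "incomplete": 2}

def best_path(paths):
    """Each path is a dict of BGP attributes. Inaccessible next hops are
    dropped first; then higher weight and local preference win, followed
    by shorter AS_path, lower origin type, and lower MED."""
    candidates = [p for p in paths if p.get("next_hop_reachable", True)]
    return min(candidates, key=lambda p: (
        -p["weight"],               # prefer the largest weight
        -p["local_pref"],           # then the largest local preference
        len(p["as_path"]),          # then the shortest AS_path
        ORIGIN_RANK[p["origin"]],   # then the lowest origin type
        p["med"],                   # then the lowest MED
    ))

a = {"weight": 100, "local_pref": 100, "as_path": [3, 1],
     "origin": "IGP", "med": 0}
b = {"weight": 200, "local_pref": 100, "as_path": [2, 4, 1],
     "origin": "IGP", "med": 0}
print(best_path([a, b]) is b)   # True: largest weight wins first
```

Note that path b wins despite its longer AS_path, because weight is compared before AS_path length in the ordered criteria.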
Clearly, these new killer applications have grown on top of the basic applications such as:
- Email
- Web
- Remote login
- File sharing

It is to be noted that the Internet is the largest dynamic network, and it works on the simple concept of best-effort service. This means that the packets released by the internet layer are not guaranteed final delivery to their respective destinations, in spite of its best effort. While conventional email, Web commerce, and other off-line applications have no problem with this, real-time applications with large data volumes suffer many limitations. In other words, multimedia applications are sensitive to end-to-end delay and delay variation, but can tolerate occasional loss of data. In this chapter we will examine how multimedia applications can be designed to make the best of the best-effort Internet, which provides no end-to-end delay guarantees. We will also examine a number of activities currently under way to extend the Internet architecture to provide explicit support for the service requirements of multimedia applications.
We know that timing considerations and tolerance of data loss are particularly important for networked multimedia applications. Timing considerations are important because many multimedia applications are highly delay-sensitive. We will see shortly that in many multimedia applications, packets that incur a sender-to-receiver delay of more than a few hundred milliseconds are essentially useless. On the other hand, networked multimedia applications are for the most part loss-tolerant: occasional loss only causes occasional glitches in the audio/video playback, and these losses can often be partially or fully concealed.
These delay-sensitive but loss-tolerant characteristics are clearly different from those of elastic applications
such as the Web, e-mail, FTP, and Telnet. For elastic applications, long delays are annoying but not
particularly harmful, and the completeness and integrity of the transferred data is of paramount importance.
3.2 MULTIMEDIA APPLICATIONS
We know that many multimedia applications have already invaded the Internet. In this chapter we confine ourselves to a few applications that are based on audio and video. The technologies pertaining to these are presented in this section. We consider three broad classes of multimedia applications.

3.2.1 Streaming Stored Audio and Video

In this class of applications, clients request on-demand compressed audio or video files that are stored
on servers. Stored audio files might contain audio from a professor’s lecture, rock songs, symphonies,
archives of famous radio broadcasts, or archived historical recordings. Stored video files might contain
video of a professor’s lecture, full-length movies, prerecorded television shows, documentaries, video
archives of historical events, cartoons, or music video clips. This class of applications has three key
distinguishing features.
- Stored media. The multimedia content has been prerecorded and is stored at the server. As a result, a user may pause, rewind, fast-forward, or index through the multimedia content. The time from when a client makes a request to when playout begins should be on the order of one to ten seconds for acceptable responsiveness.
- Streaming. In a streaming stored audio/video application, a client begins playout of the audio/video a few seconds after it begins receiving the file from the server. This means that the client will be playing out audio/video from one location in the file while it is receiving later parts of the file from the server. This technique, known as streaming, avoids having to download the entire file (and incurring a potentially long delay) before beginning playout. There are many streaming multimedia products, such as RealPlayer, QuickTime, and Media Player.
- Continuous playout. Once playout of the multimedia content begins, it should proceed according to the original timing of the recording. This places critical delay constraints on data delivery: data must be received from the server in time for its playout at the client. Although stored media applications have continuous playout requirements, their end-to-end delay constraints are nevertheless less stringent than those for live, interactive applications such as Internet telephony and video conferencing.
3.2.2 Streaming Live Audio and Video
This class of applications is similar to traditional broadcast radio and television, except that transmission can originate from any corner of the world. Since streaming live audio/video is not stored, a client cannot fast-forward through the media. However, with local storage of received data, other interactive operations such as pausing and rewinding through live multimedia transmissions are possible in some applications. Live, broadcast-like applications often have many clients receiving the same audio/video program. Distribution of live audio/video to many receivers can be accomplished efficiently using IP multicast. However, live audio/video distribution is more often accomplished through multiple separate unicast streams. As with streaming stored multimedia, continuous playout is required, although the timing constraints are less stringent than for real-time interactive applications. Delays of up to tens of seconds, from when the user requests the delivery/playout of a live transmission to when playout begins, can be tolerated.
3.2.3 Real-Time Interactive Audio and Video
This class of applications allows people to use audio/video to communicate with each other in real
time. Real-time interactive audio over the Internet is often referred to as Internet phone, since, from the
user’s perspective, it is similar to the traditional circuit-switched telephone service. Internet phone can
potentially provide PBX (private branch exchange), local, and long-distance telephone service at very low
cost. It can also facilitate the deployment of new services that are not easily supported by traditional circuit-switched networks, including Web-phone integration, group real-time communication, directory services, caller filtering, and more. There are many Internet telephone products currently available, providing PC-to-phone and PC-to-PC voice calls.
With real-time interactive video, also called video conferencing, individuals communicate visually as
well as orally. There are also many real-time interactive video products currently available for the Internet,
including Microsoft’s NetMeeting. Note that in a real-time interactive audio/video application, a user can
speak or move at any time. For a conversation with interaction among multiple speakers, the delay from
when a user speaks or moves until the action is manifested at the receiving hosts should be less than a few
hundred milliseconds. For voice, delays smaller than 150 milliseconds are not perceived by a human
listener, delays between 150 and 400 milliseconds can be acceptable, and delays exceeding 400 milliseconds
can result in frustrating, if not completely unintelligible, voice conversations.
3.3 MULTIMEDIA ON THE INTERNET: CURRENT SCENARIO
Recall that the IP protocol deployed in the Internet today provides a best-effort service to all the
packets it carries. In other words, the Internet makes its best effort to move each datagram from sender
to receiver as quickly as possible, but it does not make any promises whatsoever about the end-to-end
delay for an individual packet. Nor does the service make any promise about the variation of packet delay
within a packet stream. Because TCP and UDP run over IP, it follows that neither of these transport
protocols makes any delay guarantees to invoking applications. Due to the lack of any special effort to deliver packets in a timely manner, it is an extremely challenging problem to develop successful multimedia networking applications for the Internet.
To date, multimedia over the Internet has achieved significant but limited success. For example,
streaming stored audio/video with user-interactivity delays of five to ten seconds is now commonplace in
the Internet. But during peak traffic periods, performance may be unsatisfactory, particularly when
intervening links are congested (such as congested transoceanic links). Internet phone and real-time
interactive video has, to date, been less successful than streaming stored audio/video. Indeed, real-time
interactive voice and video impose rigid constraints on packet delay and packet jitter. Packet jitter is the
variability of packet delay within the same packet stream. Real-time voice and video can work well in regions where bandwidth is plentiful, and hence delay and jitter are minimal. But quality can deteriorate
to unacceptable levels as soon as the real-time voice or video packet stream hits a moderately congested
link.
The design of multimedia applications would certainly be more straightforward if there were some sort
of first-class and second-class Internet services, whereby first-class packets were limited in number and
received priority service in router queues. Such a first-class service could be satisfactory for delay-
sensitive applications. But to date, the Internet has mostly taken an egalitarian approach to packet scheduling
in router queues. All packets receive equal service; no packets, including delay-sensitive audio and video
packets, receive special priority in the router queues.
So for the time being we have to live with best-effort service. But given this constraint, we can make
several design decisions and employ a few tricks to improve the user-perceived quality of a multimedia
networking application. For example, we can send the audio and video over UDP, and thereby circumvent
TCP’s low throughput when TCP enters its slow-start phase. We can delay playback at the receiver by
100 msecs or more in order to diminish the effects of network-induced jitter. We can timestamp packets
at the sender so that the receiver knows when the packets should be played back. For stored audio/video
we can pre-fetch data during playback when client storage and extra bandwidth are available. We can
even send redundant information in order to mitigate the effects of network-induced packet loss.
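Two of these tricks, receiver-side playout delay and sender timestamps, can be sketched together. The 100 ms delay and the packet timings below are illustrative only:

```python
PLAYOUT_DELAY_MS = 100  # fixed playout delay added at the receiver

def playout_time(sender_timestamp_ms: int) -> int:
    """A packet stamped t at the sender is scheduled for playback
    at t + PLAYOUT_DELAY_MS, absorbing up to 100 ms of jitter."""
    return sender_timestamp_ms + PLAYOUT_DELAY_MS

def is_late(sender_timestamp_ms: int, arrival_ms: int) -> bool:
    """A packet arriving after its scheduled playout time is useless
    and is discarded rather than played out of order."""
    return arrival_ms > playout_time(sender_timestamp_ms)

# A packet stamped at t=0 that arrives 80 ms later still makes its
# slot; one delayed by 130 ms misses it and is dropped.
print(is_late(0, 80))    # False
print(is_late(0, 130))   # True
```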
3.4 CHANGES NEEDED FOR THE INTERNET TO SUPPORT
MULTIMEDIA
Today there is a tremendous and sometimes ferocious debate about how the Internet should evolve in
order to accommodate multimedia traffic with its rigid timing constraints better. At one extreme, some
researchers argue that fundamental changes should be made to the Internet so that applications can
explicitly reserve end-to-end bandwidth. These researchers believe that if a user wants to make, for
example, an Internet phone call from host A to host B, then the user’s Internet phone application should
be able to reserve bandwidth explicitly in each link along a route between the two hosts. But permitting
applications to make reservations and requiring the network to honor the reservations requires some big changes.
1. We need a protocol that, on the behalf of applications, reserves bandwidth from the senders to
their receivers.
2. We must modify scheduling policies in the router queues so that bandwidth reservations can be honored.
At the other extreme, other researchers argue for a simpler approach: define a small number of traffic classes (possibly just two classes), assign each datagram to one of the classes, give datagrams different levels of service according to their class in the router queues, and charge users according to the class of packets that they are sending into the network.
3.5 NEED FOR AUDIO AND VIDEO COMPRESSION
Raw audio and video samples occupy a large amount of space after digitization. Therefore, audio and video are compressed before being sent through the network. The need for digitization is obvious:
computer networks transmit bits, so all transmitted information must be represented as a sequence of bits.
Compression is important because uncompressed audio and video consume a tremendous amount of
storage and bandwidth; removing the inherent redundancies in digitized audio and video signals can reduce
the amount of data that needs to be stored and transmitted by orders of magnitude.
As an example, a single image consisting of 1024 pixel * 1024 pixels, with each pixel encoded into 24
bits (eight bits each for the colors red, green, and blue), requires 3 MBytes of storage without compression.
It would take seven minutes to send this image over a 64 kbps link. If the image is compressed at a
modest 10:1 compression ratio, the storage requirement is reduced to 300 Kbytes and the transmission
time also drops by a factor of ten. The fields of audio and video compression are vast. They have been
active areas of research for more than 50 years, and there are now literally hundreds of popular techniques
and standards for both audio and video compression. Most universities offer entire courses on audio and
video compression and often offer separate courses on each. We therefore provide here a brief and high-
level introduction to the subject.
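As a quick check of the arithmetic in the example above (1024 x 1024 pixels, 24 bits per pixel, a 64 kbps link, 10:1 compression):

```python
pixels = 1024 * 1024
bits_per_pixel = 24
link_bps = 64_000

raw_bits = pixels * bits_per_pixel
raw_mbytes = raw_bits / 8 / (1024 * 1024)       # 3 MBytes uncompressed
seconds_uncompressed = raw_bits / link_bps      # transmission time at 64 kbps
seconds_compressed = seconds_uncompressed / 10  # after 10:1 compression

print(raw_mbytes)                            # 3.0
print(round(seconds_uncompressed / 60, 1))   # ~6.6 minutes ("seven minutes")
print(round(seconds_compressed, 2))          # ~39 seconds, a factor of ten less
```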
Audio Compression
A continuously varying analog audio signal (which could emanate from speech or music) is normally
converted to a digital signal as follows:
• The analog audio signal is first sampled at some fixed rate, for example, at 8,000 samples per second. The value of each sample is an arbitrary real number.
• Each of the samples is then "rounded" to one of a finite number of values. This operation is referred to as "quantization". The number of finite values, called quantization values, is typically a power of two, for example, 256 quantization values.
• Each of the quantization values is represented by a fixed number of bits. For example, if there are 256 quantization values, then each value, and hence each sample, is represented by one byte. Each of the samples is converted to its bit representation. The bit representations of all the samples are concatenated together to form the digital representation of the signal.
As an example, if an analog audio signal is sampled at 8,000 samples per second and each sample is quantized and represented by 8 bits, then the resulting digital signal will have a rate of 64,000 bits per second. This digital signal can then be converted back (that is, decoded) to an analog signal for playback. However, the decoded analog signal is typically different from the original audio signal. By increasing the sampling rate and the number of quantization values, the decoded signal can approximate the original analog signal. Thus, there is a clear trade-off between the quality of the decoded signal and the storage and bandwidth requirements of the digital signal. The basic encoding technique that we just described is called pulse code modulation (PCM). Speech encoding often uses PCM, with a sampling rate of 8,000 samples per second and eight bits per sample, giving a rate of 64 kbps. The audio compact disc (CD) also uses PCM, with a sampling rate of 44,100 samples per second with 16 bits per sample; this gives a rate of 705.6 kbps for mono and 1.411 Mbps for stereo.
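The PCM rates quoted above follow directly from rate = samples/second x bits/sample x channels, which a few lines can confirm:

```python
def pcm_bit_rate(samples_per_sec: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate of a PCM stream: samples/s * bits per sample * channels."""
    return samples_per_sec * bits_per_sample * channels

print(pcm_bit_rate(8_000, 8))        # 64000   (telephone speech, 64 kbps)
print(pcm_bit_rate(44_100, 16))      # 705600  (CD mono, 705.6 kbps)
print(pcm_bit_rate(44_100, 16, 2))   # 1411200 (CD stereo, 1.411 Mbps)
```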
A bit rate of 1.411 Mbps for stereo music exceeds most access rates, and even 64 kbps speech
exceeds the access rate for a dial-up modem user. For these reasons, PCM encoded speech and music
are rarely used in the Internet. Instead compression techniques are used to reduce the bit rates of the
stream. Popular compression techniques for speech include GSM (13 kbps), G.729 (8 kbps), and G.723.1 (both 6.4 and 5.3 kbps), as well as a large number of proprietary techniques, including those used by Real Networks.
MP3
A popular compression technique for near-CD-quality stereo music is MPEG 1 layer 3, more commonly known as MP3. MP3 encoders typically compress to rates of 96 kbps, 128 kbps, and 160 kbps, and
produce very little sound degradation. When an MP3 file is broken up into pieces, each piece is still
playable. This header-less file format allows MP3 music files to be streamed across the Internet (assuming
the playback bit rate and speed of the Internet connection are compatible). The MP3 compression
standard is complex, using psychoacoustic masking, redundancy reduction, and bit reservoir buffering.
Video Compression
A video is a sequence of frames, with frames typically being displayed at a constant rate, for example
at 24 or 30 frames per second. An uncompressed, digitally encoded image consists of an array of pixels,
with each pixel encoded into a number of bits to represent luminance and color.
Video has two types of redundancies:
• Spatial redundancy
• Temporal redundancy
Spatial redundancy is the redundancy within a given image. For example, an image that consists of mostly white space can be efficiently compressed. Temporal redundancy reflects repetition from image to subsequent image. If, for example, an image and the subsequent image are exactly the same, there is no reason to re-encode the subsequent image; it is more efficient simply to indicate during encoding that the subsequent image is identical to the previous one.
• Decompression. Audio/video is almost always compressed to save disk storage and network bandwidth. A media player must decompress the audio/video on the fly during playout.
• Jitter removal. Packet jitter is the variability of source-to-destination delays of packets within the same packet stream. Since audio and video must be played out with the same timing with which it was recorded, a receiver will buffer received packets for a short period of time to remove this jitter.
• Error correction. Due to unpredictable congestion in the Internet, a fraction of packets in the packet stream can be lost. If this fraction becomes too large, user-perceived audio/video quality becomes unacceptable. To this end, many streaming systems attempt to recover from losses by either (1) reconstructing lost packets through the transmission of redundant packets, (2) having the client explicitly request retransmission of lost packets, or (3) masking loss by interpolating the missing data from the received data.
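Loss-masking option (3), interpolating the missing data from received data, might look like this for audio samples. This is a toy linear interpolation; real players use far more sophisticated concealment:

```python
def conceal_losses(samples):
    """Replace each lost sample (None) with the average of its nearest
    available neighbors - a minimal interpolation sketch. Earlier
    concealed values may feed into later ones (left-to-right pass)."""
    out = list(samples)
    for i, s in enumerate(out):
        if s is None:
            left = next((out[j] for j in range(i - 1, -1, -1) if out[j] is not None), 0)
            right = next((out[j] for j in range(i + 1, len(out)) if out[j] is not None), left)
            out[i] = (left + right) // 2
    return out

print(conceal_losses([10, None, 30]))         # [10, 20, 30]
print(conceal_losses([4, 8, None, None, 16])) # [4, 8, 12, 14, 16]
```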
The media player has a graphical user interface with control knobs. This is the actual interface that
the user interacts with. It typically includes volume controls, pause/resume buttons, sliders for making
temporal jumps in the audio/video stream, and so on.
Plug-ins may be used to embed the user interface of the media player within the window of the Web
browser. For such embeddings, the browser reserves screen space on the current Web page, and it is up
to the media player to manage the screen space. But whether appearing in a separate window or within
the browser window (as a plug-in), the media player is a program that is being executed separately from
the browser.
3.7 ACCESSING AUDIO AND VIDEO THROUGH A WEB SERVER
Stored audio/video can reside either on a Web server that delivers the audio/video to the client over
HTTP, or on an audio/video streaming server that delivers the audio/video over non-HTTP protocol
(protocols that can be either proprietary or open standards). In this subsection, we examine delivery of
audio/video from a Web server; in the next subsection, we examine delivery from a streaming server.
Consider first the case of audio streaming. When an audio file resides on a Web server, the audio file is an ordinary object in the server's file system, just as HTML and JPEG files are. When a user wants to
hear the audio file, the user’s host establishes a TCP connection with the Web server and sends an HTTP
request for the object. Upon receiving a request, the Web server encapsulates the audio file in an HTTP
response message and sends the response message back into the TCP connection.
Because the media player (that is, the helper application) must interact with the server through a Web browser as an intermediary, the entire object must be downloaded before the browser passes the object to the helper application. The resulting delay before playout can begin is typically unacceptable for audio/video clips of moderate length.
For this reason, audio/video streaming implementations typically have the server send the audio/video
file directly to the media player process. In other words, a direct socket connection is made between the
server process and the media player process. As shown in Figure 3.2, this is typically done by making use
of a meta file, a file that provides information (for example, URL or type of encoding) about the audio/
video file that is to be streamed.
A direct TCP connection between the server and the media player is obtained as follows:
1. The user clicks on a hyperlink for an audio/video file.
2. The hyperlink does not point directly to the audio/video file, but instead to a meta file. The meta file contains the URL of the actual audio/video file. The HTTP response message that
encapsulates the meta file includes a content type header line that indicates the specific audio/
video application.
3. The client browser examines the content type header line of the response message, launches
the associated media player, and passes the entire body of the response message (that is, the
meta file) to the media player.
4. The media player sets up a TCP connection directly with the HTTP server. The media player
sends an HTTP request message for the audio/video file into the TCP connection.
5. The audio/video file is sent within an HTTP response message to the media player. The media
player streams out the audio/video file.
The importance of the intermediate step of acquiring the meta file is clear: when the browser sees the content type of the file, it can launch the appropriate media player, and thereby have the media player contact the server directly.
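The browser-side dispatch in steps 2 and 3 can be sketched as follows. The content type, player name, and meta file format (a single line holding the media URL) are all invented for illustration, not taken from any real product:

```python
# Hypothetical mapping from Content-Type header values to helper
# applications registered with the browser.
PLAYERS = {"audio/x-example-meta": "example-player"}

def dispatch(content_type: str, body: str):
    """Browser side: pick a media player from the Content-Type header
    and hand it the meta file body (which carries the media URL)."""
    player = PLAYERS.get(content_type)
    if player is None:
        raise ValueError("no helper registered for " + content_type)
    media_url = body.strip()   # the meta file holds the real media URL (step 2)
    return player, media_url   # launch this player with the URL (step 3)

player, url = dispatch("audio/x-example-meta", "http://example.com/song.ra\n")
print(player, url)   # the player now opens its own TCP connection (step 4)
```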
We have just learned how a meta file can allow a media player to communicate directly with a Web
server that stores an audio/video file. Yet many companies that sell products for audio/video streaming do
not recommend the architecture we just described. This is because the architecture has the media player
communicate with the server over HTTP and hence also over TCP. HTTP is often considered insufficiently
rich to allow for satisfactory user interaction with the server; in particular, HTTP does not easily allow a
user (through the media player) to send pause/resume, fast-forward, and temporal jump commands to the server.
Figure 3.2 Web server sends audio/video directly to the media player
3.8 TRANSMISSION OF MULTIMEDIA DATA FROM A
STREAMING SERVER TO A HELPER APPLICATION
In order to get around HTTP and /or TCP, audio/video can be stored on and sent from a streaming
server to the media player. This streaming server could be a proprietary streaming server, such as those
marketed by Real Networks and Microsoft, or could be a public-domain streaming server. With a streaming
server, audio/video can be sent over UDP (rather than TCP) using application-layer protocols that may be
better tailored than HTTP to audio/video streaming.
This architecture requires two servers, as shown in Figure 3.3. One server, the HTTP server, serves Web pages (including meta files). The second server, the streaming server, serves the audio/video files.
The two servers can run on the same end system or on two distinct end systems. The steps for this
architecture are similar to those described in the preceding subsection. However, now the media player
requests the file from a streaming server rather than from a Web server, and now the media player and
streaming server can interact using their own protocols. These protocols can allow for rich user interaction with the media stream. There are several options for how the media can be transmitted to the client:
1. The media is sent over UDP at a constant rate equal to the client's drain rate.
2. The media is sent over UDP, but the client delays playout briefly and buffers the received media to absorb short-term variations in the sending rate.
3. The media is sent over TCP. The server pushes the media file into the TCP socket as quickly
as it can; the client (that is, media player) reads from the TCP socket as quickly as it can, and
places the compressed video into media player buffer. After an initial two to five second delay,
the media player reads from its buffer at a rate d and forwards the compressed media to decompression and playback. Because TCP retransmits lost packets, it has the potential to provide better sound quality than UDP. On the other hand, the fill rate x(t) now fluctuates with packet loss, TCP congestion control, and window flow control. In fact, after packet loss, TCP congestion control may reduce the instantaneous rate to less than d for long periods of time.
This can empty the client buffer and introduce undesirable pauses into the output of the audio/
video stream at the client.
For the third option, the behavior of x(t) will very much depend on the size of the client buffer (which
is not to be confused with the TCP receive buffer). If this buffer is large enough to hold all of the media file (possibly spilling over to disk storage), then TCP will make use of all the instantaneous bandwidth available to
the connection, so that x(t) can become much larger than d . If x(t) becomes much larger than d for long
periods of time, then a large portion of media is pre-fetched into the client, and subsequent client starvation
is unlikely. If, on the other hand, the client buffer is small, then x(t) will fluctuate around the drain rate d.
Risk of client starvation is much larger in this case.
Figure 3.4 Client buffer being filled at rate x(t) and drained at rate d
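The fill-and-drain behavior pictured in Figure 3.4 can be simulated crudely. The rates and buffer sizes below are made-up illustrative numbers:

```python
def starves(fill_rates, drain_rate, initial_buffer):
    """Simulate a client buffer filled at x(t) (one value per time step)
    and drained at a constant rate d; report whether it ever empties."""
    level = initial_buffer
    for x in fill_rates:
        level = level + x - drain_rate
        if level < 0:
            return True     # client starvation: playback would pause
    return False

# Steady fill above the drain rate: no starvation.
print(starves([120] * 10, drain_rate=100, initial_buffer=50))   # False
# A sustained dip below the drain rate (e.g. TCP backing off after
# loss) empties the small buffer and stalls playback.
print(starves([120, 120, 40, 40, 40, 40], 100, 50))             # True
```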
3.9 REAL-TIME STREAMING PROTOCOL (RTSP)
Many Internet multimedia users (particularly those who grew up with a TV remote control in hand)
will want to control the playback of continuous media by pausing playback, repositioning playback to a
future or past point of time, fast-forwarding playback visually, rewinding playback visually, and so on.
This functionality is similar to what a user has with a DVD player when watching a DVD video or with a
CD player when listening to a music CD. To allow a user to control playback, the media player and server
need a protocol for exchanging playback control information. Real-time streaming protocol (RTSP),
defined in RFC 2326, is such a protocol.
Before getting into the details of RTSP, let us first indicate what RTSP does not do.
• RTSP does not define compression schemes for audio and video.
• RTSP does not define how audio and video are encapsulated in packets for transmission over a network; encapsulation for streaming media can be provided by RTP or by a proprietary protocol. (RTP is discussed in Section 6.4.) For example, Real Networks' audio/video servers and players use RTSP to send control information to each other, but the media stream itself can be encapsulated in RTP packets or in some proprietary data format.
• RTSP does not restrict how streamed media is transported; it can be transported over UDP or TCP.
• RTSP does not restrict how the media player buffers the audio/video. The audio/video can be played out as soon as it begins to arrive at the client, it can be played out after a delay of a few seconds, or it can be downloaded in its entirety before playout.
So if RTSP doesn't do any of the above, what does it do? RTSP is a protocol that allows a media player to control the transmission of a media stream. As mentioned above, control actions include pause/resume, repositioning of playback, fast-forward, and rewind. RTSP is an out-of-band protocol. In particular, the RTSP messages are sent out-of-band, whereas the media stream, whose packet structure is not defined by RTSP, is considered "in-band". RTSP messages use a different port number, 554, from the media stream. The RTSP specification (RFC 2326) permits RTSP messages to be sent over either TCP or UDP.
Recall that the file transfer protocol (FTP) also uses the out-of-band notion. In particular, FTP uses two client/server pairs of sockets, each pair with its own port number: one client/server socket pair supports a TCP connection that transports control information; the other client/server socket pair supports a TCP connection that actually transports the file. The RTSP channel is in many ways similar to FTP's control channel. The RTSP server keeps track of the state of the client for each ongoing RTSP session. For example, the server
keeps track of whether the client is in an initialization state, a play state, or a pause state (see the programming
assignment for this chapter). The session and sequence numbers, which are part of each RTSP request
and response, help the server keep track of the session state. The session number is fixed throughout the
entire session; the client increments the sequence number each time it sends a new message; the server
echoes back the session number and the current sequence number.
As shown in the example, the client initiates the session with the SETUP request, providing the URL of the file to be streamed and the RTSP version. The setup message also indicates the client port number to which the media should be sent.
The setup message also indicates that the media should be sent over UDP using the packetization protocol RTP. Notice that in this example, the player chose not to play back the complete presentation, but instead only the low-fidelity portion of the presentation.
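The session and sequence-number bookkeeping described above can be sketched as follows. The URL and session id are invented; real message syntax is defined in RFC 2326:

```python
class RtspClient:
    """Minimal bookkeeping sketch: CSeq increments with every request,
    and the Session id is carried on every message after SETUP."""
    def __init__(self, url):
        self.url = url
        self.cseq = 0        # sequence number, incremented per request
        self.session = None  # fixed session number, assigned by the server

    def request(self, method):
        self.cseq += 1
        lines = [f"{method} {self.url} RTSP/1.0", f"CSeq: {self.cseq}"]
        if self.session is not None:
            lines.append(f"Session: {self.session}")
        return "\r\n".join(lines) + "\r\n\r\n"

c = RtspClient("rtsp://example.com/movie")   # hypothetical URL
print(c.request("SETUP"))   # carries CSeq: 1, no Session yet
c.session = 4231            # session number returned by the server
print(c.request("PLAY"))    # carries CSeq: 2 and Session: 4231
```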
The RTSP protocol is actually capable of doing much more than described in this brief introduction. In
particular, RTSP has facilities that allow clients to stream toward the server (for example, for recording).
RTSP has been adopted by Real Networks, one of the industry leaders in audio/video streaming.
3.10 LIMITATIONS OF A BEST-EFFORT SERVICE
We mentioned that the best-effort service can lead to packet loss, excessive end-to-end delay, and
packet jitter. Let's examine these issues in more detail.
Packet Loss
Consider one of the UDP segments generated by our Internet phone application. The UDP segment
is encapsulated in an IP datagram. As the datagram wanders through the network, it passes through
buffers (that is, queues) in the routers in order to access outbound links. It is possible that one or more of
the buffers in the route from sender to receiver is full and cannot admit the IP datagram. In this case, the
IP datagram is discarded, never to arrive at the receiving application.
Loss could be eliminated by sending the packets over TCP rather than over UDP. Recall that TCP
retransmits packets that do not arrive at the destination. However, retransmission mechanisms are often
considered unacceptable for interactive real-time audio applications such as Internet phone, because they
increase end-to-end delay. Furthermore, due to TCP congestion control, after packet loss the transmission
rate at the sender can be reduced to a rate that is lower than the drain rate at the receiver. This can have
a severe impact on voice intelligibility at the receiver. For these reasons, almost all existing Internet phone
applications run over UDP and do not bother to retransmit lost packets.
But losing packets is not necessarily as disastrous as one might think. Indeed, packet loss rates
between 1 and 20 percent can be tolerated, depending on how the voice is encoded and transmitted, and
on how the loss is concealed at the receiver. For example, forward error correction (FEC) can help
conceal packet loss. We’ll see below that with FEC, redundant information is transmitted along with the
original information so that some of the lost original data can be recovered from the redundant information.
Nevertheless, if one or more of the links between sender and receiver is severely congested, and packet
loss exceeds 10-20 percent, then there is really nothing that can be done to achieve acceptable sound
quality. Clearly, best effort service has its limitations.
End-to-End Delay
End-to-end delay is the accumulation of transmission, processing, and queuing delays in routers;
propagation delays in the links; and end-system processing delays. For highly interactive audio applications,
such as Internet phone, end-to-end delays smaller than 150 milliseconds are not perceived by a human listener; delays between 150 and 400 milliseconds can be acceptable but are not ideal; and delays exceeding
400 milliseconds can seriously hinder the interactivity in voice conversations. The receiving side of an
Internet phone application will typically disregard any packets that are delayed more than a certain threshold,
for example, more than 400 milliseconds. Thus, packets that are delayed by more than the threshold are
effectively lost.
Packet Jitter
A crucial component of end-to-end delay is the random queuing delays in the routers. Because of
these varying delays within the network, the time from when a packet is generated at the source until it is
received at the receiver can fluctuate from packet to packet. This phenomenon is called jitter.
As an example, consider two consecutive packets within a talk spurt in our Internet phone application.
The sender sends the second packet 20 msec after sending the first packet. But at the receiver, the
spacing between these packets can become greater than 20 msec. To see this, suppose the first packet
arrives at a nearly empty queue at a router, but just before the second packet arrives at the queue a large
number of packets from other sources arrive at the same queue. Because the first packet suffers a small
queuing delay and the second packet suffers a large queuing delay at this router, the first and second
packets become spaced by more than 20 msecs. The spacing between consecutive packets can also
become less than 20 msecs. To see this, again consider two consecutive packets within a talk spurt.
Suppose the first packet joins the end of a queue with large number of packets, and the second packet
arrives at the queue before packets from other sources arrive at the queue. In this case, our two packets
find themselves one right after the other in the queue. If the time it takes to transmit a packet on the
router's outbound link is less than 20 msecs, then the first and second packets become spaced apart by less than 20 msecs.
In contrast, streaming of stored audio/video can tolerate significantly larger delays. Indeed, when a
user requests an audio/video clip, the user may find it acceptable to wait five seconds or more before
playback begins. And most users can tolerate similar delays after interactive actions such as a temporal
jump within the media stream. This greater tolerance for delay gives the application developer greater
flexibility when designing stored media applications.
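A receiver typically tracks the jitter described above with a smoothed running estimate. The sketch below uses the exponential average (gain 1/16) from the RTP specification; the per-packet transit times are made up:

```python
def update_jitter(jitter, transit_prev, transit_now):
    """RTP-style interarrival jitter estimate: J = J + (|D| - J)/16,
    where D is the change in one-way transit time between two
    consecutive packets."""
    d = abs(transit_now - transit_prev)
    return jitter + (d - jitter) / 16.0

j = 0.0
transits = [50, 70, 55, 90]   # per-packet transit times in msec (made up)
for prev, now in zip(transits, transits[1:]):
    j = update_jitter(j, prev, now)
print(round(j, 2))   # 4.17
```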
3.11 PROTOCOLS FOR REAL-TIME INTERACTIVE APPLICATIONS
Real-time interactive applications, including Internet phone and video conferencing, promise to drive
much of the future Internet growth. It is therefore not surprising that standards bodies, such as the IETF
and ITU, have been busy for many years (and continue to be busy!) at hammering out standards for this class of applications. With the appropriate standards in place for real-time interactive applications,
independent companies will be able to create new and compelling products that interoperate with each
other. In this section we examine RTP, SIP and H.323 for real-time interactive applications. All three
sets of standards are enjoying widespread implementation in industry products.
3.12 RTP: REAL-TIME TRANSPORT PROTOCOL
In the previous section we learnt that the sender side of a multimedia application appends header fields
to the audio/video chunks before passing them to the transport layer. These header fields include sequence numbers and timestamps. Since most multimedia networking applications can make use of sequence
numbers and timestamps, it is convenient to have a standardized packet structure that includes fields for
audio/video data, sequence number, and timestamp, as well as other potentially useful fields. RTP, defined
in RFC 1889, is such a standard.
RTP Basics
RTP typically runs on top of UDP. The sending side encapsulates a media chunk within an RTP packet, then encapsulates the packet in a UDP segment, and then hands the segment to IP. The receiving side extracts the RTP packet from the UDP segment, then extracts the media chunk from the RTP packet, and then passes the chunk to the media player for decoding and rendering.
As an example, consider the use of RTP to transport voice. Suppose the voice source is PCM-
encoded (that is, sampled, quantized, and digitized) at 64 kbps. Further suppose that the application
collects the encoded data in 20 msec chunks, that is, 160 bytes in a chunk. The sending side precedes
each chunk of the audio data with an RTP header that includes the type of audio encoding, a sequence
number, and a timestamp. The RTP header is normally 12 bytes.
The audio chunk along with the RTP header form the RTP packet. The RTP packet is then sent into the UDP socket interface. At the receiver side, the application receives the RTP packet from its socket
interface. The application extracts the audio chunk from the RTP packet and uses the header fields of the
RTP packet to properly decode and play back the audio chunk.
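The 12-byte RTP header described above can be packed with Python's standard struct module. The field layout below follows RFC 1889 (version 2, with no padding, extension, marker, or CSRC entries), and the SSRC value is an arbitrary example:

```python
import struct

def rtp_header(payload_type, seq, timestamp, ssrc):
    """Pack a minimal 12-byte RTP header: V=2, P=X=CC=M=0, then the
    7-bit payload type, 16-bit sequence number, 32-bit timestamp,
    and 32-bit SSRC, all in network byte order."""
    vpxcc = 2 << 6   # version 2 in the top two bits; P, X, CC all zero
    return struct.pack("!BBHII", vpxcc, payload_type & 0x7F, seq, timestamp, ssrc)

# PCM audio (payload type 0), one 160-byte chunk every 20 msec at 8 kHz:
hdr = rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234)
print(len(hdr))             # 12
packet = hdr + bytes(160)   # header plus one 20 msec audio chunk
print(len(packet))          # 172
```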
If an application incorporates RTP instead of a proprietary scheme to provide payload type, sequence
numbers, or timestamps then the application will more easily interoperate with other networked multimedia
applications. For example, if two different companies develop Internet phone software and they both
incorporate RTP into their product, there may be some hope that a user using one of the Internet phone
products will be able to communicate with a user using the other Internet phone product. In Section 6.4.3
we’ll see that RTP is often used in conjunction with the Internet telephony standards.
It should be emphasized that RTP in itself does not provide any mechanism to ensure timely delivery of
data or provide other quality of service guarantees; it does not even guarantee delivery of packets or
prevent out-of-order delivery of packets. Indeed, RTP encapsulation is seen only at the end systems.
Routers do not distinguish between IP datagrams that carry RTP packets and IP datagrams that don’t.
RTP allows each source (for example, a camera or a microphone) to be assigned its own independent
RTP stream of packets. For example, for a video conference between two participants, four RTP streams could be opened: two streams for transmitting the audio (one in each direction) and two streams for transmitting the video (again, one in each direction). However, many popular encoding techniques, including MPEG 1 and MPEG 2, bundle the audio and video into a single stream during the encoding process. When
the audio and video are bundled by the encoder, then only one RTP stream is generated in each direction.
RTP packets are not limited to unicast applications. They can also be sent over one-to-many and
many-to-many multicast trees. For a many-to-many multicast session, all of the session’s senders and
sources typically use the same multicast group for sending their RTP streams. RTP multicast streams
belonging together, such as audio and video streams emanating from multiple senders in a video conference
application, belong to an RTP session.
3.13 SUMMARY
In this chapter we have learned a wealth of information about multimedia data transport across the Internet. In particular, we looked into audio and video streaming across the Internet, the limitations of the present Internet, and the changes needed to make the existing Internet support multimedia information. Some of the protocols used for real-time streaming were also presented.
rates, occupational safety concerns, and licensing requirements. As these problems have been addressed,
the popularity of wireless LANs has grown rapidly.
Wireless LANs have been developed over the last 30 years. ALOHANET, the first operating wireless network, was implemented in Hawaii in 1971. It was started as a research project of the University of
Hawaii. It allowed seven campuses across four islands to communicate via satellite with a central
computer. The protocol used for ALOHA went through multiple iterations before a good throughput was
achieved.
Ham radio operators developed terminal node controllers (TNCs) in the 1980s, which they used to
connect their computers to the ham radio network. The TNCs modulated the computer signal and used
packet switching to transmit the data. Ham radio associations began sponsoring forums for the development
of wireless WANs in the early 1980s.
In the mid-1980s, the FCC authorized public use of the Industrial, Scientific, and Medical (ISM) frequency bands. The ISM band is designated for short-range, low-power devices; therefore, licensing is not required
to manufacture or use equipment operating in this range. This move by the FCC encouraged the
development of wireless LAN components. Early development, as with most new technology, resulted in
a lot of proprietary wireless equipment. This equipment was also expensive, which prevented widespread
use.
In the late 1980s, commercial industry standards development began for wireless LANs. The Institute
of Electrical and Electronics Engineers (IEEE) 802 Working Group created the 802.11 Working Group to
develop wireless LAN standards. They defined the physical and media access control specifications. As
time has progressed, the initial standards were finalized and extended to cover multiple frequencies and
access speeds. Equipment prices are now falling and performance is increasing. Wireless LANs have become a viable solution in both homes and industry.
4.2 SOME BASIC DEFINITIONS
Modulation
Data signals, whether at a few bits per minute or all the way up to 100 Mbps, do not by themselves have
radio characteristics sufficient to allow free movement through the air. To make data move through the air, it
must be mixed with a frequency that has good free-air transmission characteristics. The frequency that
carries the data is called the carrier frequency.
In Figure 4.1, we see a block diagram of a simple transmitter. Note that, as the data enters on the left
of the figure, it is mixed with the carrier frequency in a functional box called a modulator. A generator
produces the carrier frequency. When the intelligence is mixed with that frequency, it creates an output
signal that may resemble the output shown in the antenna.
The mixing, or modulating, of intelligence with the carrier frequency comes in various forms. Common
methods are AM, CCK, PBCC, FM, BPSK, and QPSK.
Figure 4.1: Block Diagram of Simple Transmitter
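The mixing described above can be sketched in a few lines of Python as a simple amplitude-modulation example. Everything here is illustrative: the carrier frequency, sample rate, and modulation depth are made-up values, not anything prescribed by the text.

```python
import math

# Hypothetical parameters: a 1 kHz carrier sampled at 8 kHz.
CARRIER_HZ = 1000.0
SAMPLE_RATE = 8000.0

def am_modulate(bits, samples_per_bit=8, depth=0.5):
    """Mix binary data with a carrier (simple amplitude modulation)."""
    out = []
    for i, bit in enumerate(bits):
        for k in range(samples_per_bit):
            t = (i * samples_per_bit + k) / SAMPLE_RATE
            carrier = math.cos(2 * math.pi * CARRIER_HZ * t)
            # The data raises or lowers the carrier's amplitude.
            amplitude = 1.0 + depth * (1 if bit else -1)
            out.append(amplitude * carrier)
    return out

signal = am_modulate([1, 0, 1, 1])
print(len(signal))  # 4 bits * 8 samples each = 32 samples
```

Real 802.11 equipment uses the more elaborate schemes listed above (CCK, BPSK, QPSK, etc.), but the principle is the same: intelligence shapes a carrier.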
Carriers
If you tune the radio in your home to 103.9 FM, you will receive the same station all the time. In the US, this is because the FCC regulates this range of frequencies. However, the frequency bands used for
wireless LANs - both the 2.4 and 5 GHz ranges - are unlicensed. There is no ownership of any one frequency.
Interference could become a problem if fixed carrier frequencies were used. To overcome this problem,
carrier frequencies are consistently changed via several approaches. The major approach used in wireless
LANs is called spread spectrum. The height of the carrier is reduced (a suppressed carrier), and the carrier
frequency is consistently changed within a predefined range, following a pattern known to both the receiver
and the transmitter.
Spread Spectrum Methods
Frequency hopping spread spectrum (FHSS) uses a pseudo-random carrier hop method. In theory, FHSS
is more secure because of the difficulty involved in predicting and capturing carriers generated in
pseudo-random patterns.
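The shared-pattern idea can be sketched as follows: transmitter and receiver derive the same pseudo-random hop sequence from a shared seed, while anyone without the seed gets a different sequence. The channel count and seed values below are illustrative, not taken from any standard.

```python
import random

def hop_sequence(seed, n_channels=79, length=10):
    """Generate a pseudo-random carrier-hop pattern.
    Both transmitter and receiver derive the same pattern from a
    shared seed, so only they can follow the carrier."""
    rng = random.Random(seed)
    return [rng.randrange(n_channels) for _ in range(length)]

tx = hop_sequence(seed=42)
rx = hop_sequence(seed=42)
eavesdropper = hop_sequence(seed=7)
print(tx == rx)            # True: same seed, same hops
print(tx == eavesdropper)  # different seed, different hops
```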
When a signal is sent into space, it mixes with radio noise. Once this happens, it is difficult to separate
the two. In radio communications, you may have two units of signal strength, but if one unit is noise, you really have only one usable unit of signal. The relationship between signal and noise is called the signal-to-noise ratio. The lower the signal-to-noise ratio, the lower the overall data performance. In Figure 4.4, we
see a radio signal and noise for a fixed carrier signal.
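The signal-to-noise relationship is usually expressed in decibels. A minimal sketch of the calculation, using the two-units-of-power example above:

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels: 10 * log10(S/N)."""
    return 10 * math.log10(signal_power / noise_power)

# Two units of received power, one of which is noise:
# equal signal and noise gives an SNR of 0 dB.
print(snr_db(1.0, 1.0))    # 0.0 dB
print(snr_db(100.0, 1.0))  # 20.0 dB -- a comfortable margin
```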
Figure 4.5 shows how a spread spectrum signal would look with noise. The suppressed carriers
operate just above the typical noise floor, making spread spectrum look like noise to the untrained eye.
The receiving stations must detect the carrier shift pattern and match their demodulation patterns to the
existing modulation pattern in order to recover data.
Figure 4.5: Spread Spectrum Signal and Noise
Bandwidth
Bandwidth alone should not be the deciding factor in equipment purchase and installation. In a wired
environment, many devices share the same wires. In a wireless environment, many devices share the
same radio spectrum. However, with the use of spread-spectrum technology, the resources are reused
many times over.
It is said that bigger is better, so more bandwidth is better, right? It may not be. In wired networks,
sometimes the rating of the wire’s clock speed is confused with traffic throughput. Because Ethernet
uses CSMA/CD with statistical multiplexing, the general rule is to design networks in which the throughput
does not exceed 30% of the rating, so an Ethernet-based 10 Mbps link would have an average throughput of 3 Mbps.
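The 30% rule of thumb can be turned into a one-line calculation. Note that 30% is the design guideline quoted above, not a hard protocol limit:

```python
def usable_throughput(link_rate_mbps, utilization=0.30):
    """Design rule of thumb for CSMA/CD Ethernet: keep the average
    load below about 30% of the nominal link rate."""
    return link_rate_mbps * utilization

print(usable_throughput(10))   # ~3 Mbps on a 10 Mbps link
print(usable_throughput(100))  # ~30 Mbps on Fast Ethernet
```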
But what if I need more bandwidth for killer applications? We have been waiting for that killer
application for some time now. VoIP, Videoconferencing, and even on-line interactive training courses
use much less bandwidth than one would think. An interactive videoconference uses around 2 MHz of stream
l Cost: For example, high-speed Ethernet adapters cost on the order of 10 pounds, while wireless
LAN adapters, e.g., as PC-Cards, start at around 100 pounds.
l Proprietary solutions: Due to slow standardization procedures, many companies have come up with proprietary solutions offering standardized functionality plus many enhanced features.
However, these additional features only work in a homogeneous environment.
l Restrictions: Several government and non-government institutions worldwide regulate the
operation and restrict frequencies to minimize interference. Consequently, it takes a very long
time to establish global solutions such as IMT-2000. WLANs are limited to low-power senders
and certain license-free frequency bands.
l Safety and security: Using radio waves for data transmission might interfere with other high-
tech equipment in, e.g., hospitals. Additionally, the open radio interface makes eavesdropping much easier in WLANs than, e.g., in the case of fiber optics.
4.5 INFRARED VS. RADIO TRANSMISSION
Infrared light transmission is used for directed links, e.g., to connect different buildings via laser
links. Radio transmission works in the 2.4 GHz range. Both technologies can be used to set up ad hoc
connections for work groups, to connect, e.g., a desktop with a printer without a wire, or to support
mobility within a small area.
Infrared technology uses diffuse light reflected at walls, furniture, etc., or directed light if a line-of-sight
(LOS) exists between sender and receivers. Senders can be simple light emitting diodes (LEDs) or laser
diodes, whereas photodiodes act as receivers.
l The main advantages of infrared technology are its simple and extremely cheap senders and
receivers, which are integrated into almost all mobile devices available today. PDAs,
laptops, notebooks, mobile phones etc. have an infrared data association (IrDA) interface. Version
1.0 of this industry standard implements data rates of up to 115 kbps, while IrDA 1.1 defines
higher data rates of 1.152 and 4 Mbps. No licenses are required for infrared technology and
shielding is very simple. Furthermore, electrical devices do not interfere with infrared transmission.
l Disadvantages of infrared transmission are its low bandwidth compared to other LAN
technologies. Typically, IrDA devices are internally connected to a serial port, limiting transfer
rates to 115 kbps. Infrared is quite easily shielded and cannot penetrate walls or other obstacles;
for good transmission quality and high data rates, a LOS, i.e., a direct connection, is typically needed.
There are many networks that use radio transmission, e.g. GSM at 900, 1,800 and 1,900 MHz, DECT
at 1,880 MHz etc.
l Advantages of radio transmission include the long-term experience gained with radio transmission for wide area networks (e.g., microwave links) and mobile cellular phones. Radio transmission
can cover larger areas and can penetrate (thinner) walls, furniture, plants etc. Thus, radio typically
does not need a LOS if the frequencies are not too high (then radio waves behave more and
more like light). Current radio-based products offer higher transmission rates (e.g., 10 Mbps)
than infrared.
l Shielding is not so simple, however: radio transmission can interfere with other senders, and electrical
devices can disturb data transmission via radio. Additionally, radio transmission is only permitted
in certain frequency bands. Very limited ranges of license-free bands are available worldwide
and those available are typically not the same in all countries.
WLAN technologies:
1. IEEE 802.11: infrared and radio both
2. HIPERLAN: radio only
3. Bluetooth: radio only
4.6 IEEE 802.11 ARCHITECTURE
The IEEE standard 802.11 specifies the most famous family of WLANs in which many products are
already available. The standard's number indicates that it belongs to the group of 802.x LAN
standards, e.g., 802.3 Ethernet or 802.5 Token Ring. This means that the standard specifies the physical
and the medium access layer adapted to the special requirements of wireless LANs. The primary goal of
the standard was the specification of a simple and robust WLAN which offers time-bounded and asynchronous
services. Furthermore, the MAC layer should be able to operate with multiple physical layers, each of
which exhibits a different medium sense and transmission characteristic. Candidates for physical layers
were infrared and spread spectrum radio transmission techniques. Additional features of the WLAN
should include the support of power management to save battery power, the handling of hidden nodes, and
Fig 4.7: Architecture of an infrastructure-based IEEE 802.11
Several nodes, called stations (STAi), are connected to an access point (AP). Stations are terminals
with access mechanisms to the wireless medium and radio contact to the AP. The stations and the AP
within the same radio coverage form a basic service set (BSSi). The example shows two BSSs
- BSS1 and BSS2 - which are connected via a distribution system. A distribution system connects
several BSSs via the APs to form a single network and thereby extends the wireless coverage area. This
network is now called an extended service set (ESS). Furthermore, the distribution system connects
the wireless networks via the APs with a portal, which forms the internetworking unit to other LANs.
The architecture of the distribution system is not specified further in the IEEE 802.11. It could consist
of bridged IEEE LANs, wireless links or any other networks. However, distribution system services
are defined in the standard. The APs support roaming; the distribution system then handles data transfer
between the different APs. Furthermore, APs provide synchronization within a BSS, support power
management, and can control medium access to support time-bounded service.
IEEE 802.11 allows the building of ad hoc networks between stations, thus forming one or more BSSs.
Fig. 4.9 : IEEE 802.11 protocol architecture and bridging
The MAC management supports the association and re-association of a station to an access point and
roaming between different access points. Furthermore, it controls authentication mechanisms, encryption,
synchronization of a station with regard to an access point, and power management to save battery power.
MAC management also maintains the MAC management information base (MIB). The main tasks of the
PHY management include channel tuning and PHY MIB maintenance. Finally, station management
interacts with both management layers and is responsible for additional higher-layer functions.
PHYSICAL LAYER
IEEE 802.11 supports three different physical layers: one layer based on infrared and two layers on
the basis of radio transmission. All PHY variants include the provision of the clear channel assessment
signal (CCA). The PHY layer offers a service access point (SAP) with 1 or 2 Mbps transfer rate to the
MAC layer.
Frequency hopping spread spectrum
Frequency hopping spread spectrum (FHSS) is a spread spectrum technique, which allows for the
coexistence of multiple networks in the same area by separating different networks using different hopping
sequences. Figure 4.10 shows a frame of the physical layer used with FHSS. The frame consists of two
basic parts: the PLCP part and the payload part. While the PLCP part is always transmitted at 1 Mbps,
the payload, i.e., MAC data, can use 1 or 2 Mbps.
l Synchronization: The PLCP preamble starts with an 80-bit synchronization pattern. This pattern is used
for synchronization of potential receivers and signal detection by the CCA.
l Start frame delimiters (SFD): The 16 bits indicate the start of the frame and thus provide
frame synchronization.
l PLCP_PDU length word (PLW): The first field of the PLCP header indicates the length of
the payload in bytes, including the 32-bit CRC at the end of the payload. PLW can range between
0 and 4,095.
l PLCP signaling fields (PSF): Only one bit is currently specified in this 4-bit field indicating the
data rate of the payload (1 or 2 Mbit/s).
l Header error check (HEC): The PLCP header is protected by a 16-bit checksum with the standard
ITU-T generator polynomial G(x) = x^16 + x^12 + x^5 + 1.
Fig. 4.10 : Format of an IEEE 802.11 PHY frame using FHSS
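The HEC computation can be sketched with a straightforward bit-by-bit CRC using the generator polynomial above (x^16 + x^12 + x^5 + 1 corresponds to the value 0x1021). The initial register value of 0xFFFF and the sample header bytes are assumptions for illustration:

```python
def crc16_ccitt(data: bytes, init=0xFFFF):
    """CRC-16 with the ITU-T polynomial x^16 + x^12 + x^5 + 1 (0x1021),
    computed bit by bit, most significant bit first."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

header = bytes([0x0F, 0xFF, 0x08])  # made-up PLW/PSF header bytes
print(hex(crc16_ccitt(header)))
```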
4.8 DIRECT SEQUENCE SPREAD SPECTRUM
Direct sequence spread spectrum (DSSS) is the alternative spread spectrum method, separating users by
code and not by frequency. In the case of IEEE 802.11 DSSS, spreading is achieved using the 11-chip
sequence (+1, -1, +1, +1, -1, +1, +1, +1, -1, -1, -1), also called the Barker code.
IEEE 802.11 DSSS PHY also uses the 2.4 GHz ISM band and offers both 1 and 2 Mbit/s data rates. The
system uses differential binary phase shift keying (DBPSK) for 1Mbit/s transmission and differential
quadrature phase shift keying (DQPSK) for 2 Mbps as modulation schemes.
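The spreading operation itself is simple to sketch: each data bit is replaced by the 11-chip Barker sequence (or its inverse), and the receiver recovers the bit by correlating each 11-chip group against the sequence. The bit-to-sign mapping below is an illustrative convention, not taken from the standard:

```python
# The 11-chip Barker sequence from the text.
BARKER = [+1, -1, +1, +1, -1, +1, +1, +1, -1, -1, -1]

def spread(bits):
    """Spread each data bit over the 11-chip Barker code.
    A 1 bit sends the sequence as-is, a 0 bit sends it inverted."""
    chips = []
    for bit in bits:
        sign = 1 if bit else -1
        chips.extend(sign * c for c in BARKER)
    return chips

def despread(chips):
    """Correlate each 11-chip group against the Barker code:
    a strongly positive correlation means 1, negative means 0."""
    bits = []
    for i in range(0, len(chips), 11):
        corr = sum(c * b for c, b in zip(chips[i:i + 11], BARKER))
        bits.append(1 if corr > 0 else 0)
    return bits

tx = spread([1, 0, 1])
print(len(tx))       # 3 bits -> 33 chips
print(despread(tx))  # [1, 0, 1]
```

The correlation step is what makes spread spectrum robust: narrowband noise corrupts only a few chips per bit, and the sum still comes out with the right sign.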
Figure 4.11 shows a frame of the physical layer using DSSS. The frame consists of two basic parts,
the PLCP part (preamble and header) and the payload part. The PLCP part is always transmitted at 1
Mbit/s; the payload, i.e., MAC data, can use 1 or 2 Mbit/s. The fields of the frame have the following functions:
l Synchronization: The first 128 bits are not only used for synchronization, but also for gain setting,
energy detection (for the CCA), and frequency offset compensation.
l Start frame delimiters (SFD): This 16-bit field is used for synchronization at the beginning of
a frame.
l Signal: Only two values have been defined for this field to indicate the data rate of the payload.
The MAC layer has to fulfill several tasks. First of all, it has to control medium access, but it can also
offer support for roaming, authentication, and power conservation. The basic services provided by the
MAC layer are the mandatory asynchronous data service and an optional time-bounded service. While
802.11 only offers the asynchronous service in ad hoc network mode, both service types can be offered
using an infrastructure-based network together with the access point coordinating medium access. The
asynchronous service supports broadcast and multicast packets and packet exchange is based on a best
effort model, i.e., no delay bounds can be given for transmission.
Three access mechanisms are defined: the mandatory basic method based on a version of CSMA/CA, an
optional method avoiding the hidden terminal problem, and finally a contention-free polling method for
time-bounded service. The first two methods are also summarized as the distributed coordination function
(DCF); the third method is called the point coordination function (PCF). DCF only offers asynchronous
service, while PCF offers both
asynchronous and time-bounded service but needs an access point to control medium access and to avoid
contention. The MAC mechanisms are also called distributed foundation wireless medium access control
(DFWMAC).
Figure 4.12 shows three different parameters defining the priorities of medium access. The medium,
as shown, can be busy or idle (which is detected by the CCA). If the medium is busy, this can be due to
data frames or other control frames.
Fig. 4.12: Medium access and inter-frame spacing
DCF inter-frame spacing (DIFS): This parameter denotes the longest waiting time and thus the
lowest priority for medium access. This waiting time is used for asynchronous data service within a
contention period.
PCF inter-frame spacing (PIFS): A waiting time between DIFS and SIFS (and thus a medium
priority) is used for a time-bounded service. That is, an access point polling other nodes only has to wait
PIFS for medium access.
Short inter-frame spacing (SIFS): The shortest waiting time for medium access (and thus the
highest priority) is defined for short control messages, such as acknowledgements of data packets or
polling responses.
Basic DFWMAC-DCF Using CSMA/CA
The mandatory access mechanism of IEEE 802.11 is based on carrier sense multiple access with
collision avoidance (CSMA/CA). The basic CSMA/CA mechanism is shown in the following figure 4.13.
Fig. 4.13 : CSMA / CA Mechanism
If the medium is sensed idle for at least the duration of DIFS, a node can access the medium at
once. This allows for short access delays under light load. But as soon as more and more nodes try to
access the medium, additional mechanisms are needed.
If the medium is busy, nodes have to wait for the duration of DIFS, entering a contention phase
afterwards. Each node now chooses a random backoff time within a contention window and additionally
delays medium access for this random amount of time. As soon as a node senses the channel is busy it
has lost this cycle and has to wait for the next chance, i.e. until the medium is idle again for at least DIFS.
But if the randomized additional waiting time for a node is over and the medium is still idle, the node can
access the medium immediately.
The additional waiting time is measured in multiples of slots. Slot time is derived from the medium
propagation delay, transmitter delay, and other PHY-dependent parameters. To provide fairness, IEEE
802.11 adds a backoff timer. Again, each node selects a random waiting time within the range of the
contention window. As soon as the counter expires, the node accesses the medium. This means that
deferred stations do not choose a randomized backoff time again but continue to count down. Thus,
longer-waiting stations have an advantage over newly entering stations.
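The countdown behaviour can be sketched as a toy simulation. The contention-window size is illustrative, and ties (which would cause a real collision) are not modelled here:

```python
import random

def contention_round(n_stations, cw=15, rng=random.Random(1)):
    """Each station picks a random backoff slot in [0, cw]; the
    smallest counter wins the medium. Deferred stations keep their
    remaining count for the next round, which is what gives
    longer-waiting stations their advantage (fairness)."""
    backoffs = [rng.randint(0, cw) for _ in range(n_stations)]
    winner = min(range(n_stations), key=lambda i: backoffs[i])
    # Losers count down by the winner's slots and resume later
    # instead of drawing a fresh random backoff.
    remaining = [b - backoffs[winner] for b in backoffs]
    return winner, remaining

winner, remaining = contention_round(4)
print(winner, remaining)  # the winner's remaining count is 0
```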
Figure 4.15 shows a sender accessing the medium and sending its data. But now the receiver answers
directly with an acknowledgement (ACK).
The receiver accessed the medium after waiting for duration of SIFS and, thus, no other station can
access the medium in the meantime and cause a collision. The other stations have to wait for DIFS plus
their backoff time. This acknowledgement ensures the correct reception of the frame on the MAC layer, which is especially important in error-prone environments such as wireless connections. If no ACK is
returned, the sender automatically retransmits the frame. But now the sender has to wait again and
compete for the access right.
DFWMAC-DCF with RTS/CTS EXTENSION
We have discussed the problem of hidden terminals, a situation that can also occur in IEEE 802.11 networks.
The problem occurs if one station can receive two others, but those stations cannot receive each other.
Then those two stations may sense the channel idle, send a frame, and cause a collision at the receiver in
the middle. To deal with this problem, the standard defines an additional mechanism using two control
packets, RTS and CTS. The use of the mechanism is optional; however, every 802.11 node has to implement
the functions to react properly upon reception of RTS/CTS control packets.
Fig. 4.16: Use of RTS / CTS
Figure 4.16 illustrates the use of RTS and CTS. After waiting for DIFS, the sender can issue a request
to send (RTS) control packet. The RTS packet is thus not given any higher priority compared to other
data packets. The RTS packet includes the receiver of the data transmission to come and the duration of
the whole transmission. This duration specifies the time interval necessary to transmit the whole data frame
and the acknowledgement related to it. Every node receiving this RTS now has to set its net allocation
vector (NAV) in accordance with the duration field. The NAV specifies the earliest point in time at which
the station can try to access the medium again.
If the receiver of the data transmission receives the RTS, it answers with a clear to send (CTS)
message after waiting for SIFS. This CTS packet contains the duration field again and all stations receiving
this packet from the receiver of the intended data transmission have to adjust their NAV. The latter set of
receivers need not be the same as the first set receiving the RTS packet. Now all nodes within receiving
distance around sender and receiver are informed that they have to wait more time before accessing the
medium. Basically, this mechanism reserves the medium for one sender exclusively.
Finally, the sender can send the data after SIFS. The receiver waits for SIFS after receiving the data
packet and then acknowledges whether the transfer was correct. Now the transmission has been completed
and thus the NAV in each node marks the medium as free and the standard cycle can start again.
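The NAV arithmetic implied by the duration field can be sketched as follows. The frame timings are made-up microsecond values for illustration, not figures from the standard:

```python
def nav_after_rts(now, sifs, cts_time, data_time, ack_time):
    """Duration field of an RTS: the medium is reserved until the
    whole exchange (CTS + data + ACK, each preceded by SIFS) is
    over. Nodes overhearing the RTS set their NAV to this time."""
    duration = 3 * sifs + cts_time + data_time + ack_time
    return now + duration

# Hypothetical timings in microseconds:
nav = nav_after_rts(now=0, sifs=10, cts_time=40, data_time=300, ack_time=40)
print(nav)  # 410: earliest time another station may contend again
```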
However, the mechanism of fragmenting a user data packet into several smaller parts should be
transparent for a user. Furthermore, the MAC layer should have the possibility of adjusting the transmission
frame size to the current error rate on the medium. Therefore, the IEEE 802.11 standard specifies a
fragmentation mode. Again, a sender can send an RTS control packet to reserve the medium after a
waiting time of DIFS. This RTS packet now includes the duration for the transmission of the first fragment
and the corresponding acknowledgement. A certain set of nodes may receive this RTS; the intended
receiver then answers with a CTS, again including the duration of the transmission up to the
acknowledgement. A set of receivers gets this CTS message and sets the NAV.
As shown in figure 6.10, the sender can now send the first data frame, frag1, after waiting only for
SIFS. The new aspect of this fragmentation mode is that it includes another duration value in the frame
frag1.
This duration field reserves the medium for the duration of the transmission comprising the second
fragment and its acknowledgement. Several nodes may receive this reservation and adjust their NAV.
The receiver of frag1 answers directly after SIFS with the acknowledgement packet ACK1 including
the reservation for the next transmission as shown in figure 6.10.
If frag2 was not the last frame of this transmission, it would also include a new duration for the third
consecutive transmission. The receiver acknowledges this second fragment, not reserving the medium
again. After ACK2, all nodes can compete for the medium again after having waited for DIFS.
DFWMAC-PCF with Polling
The two access mechanisms presented so far cannot guarantee a maximum access delay or minimum
transmission bandwidth. To provide a time-bounded service, the standard specifies a point co-ordination
function (PCF) on top of the standard DCF mechanisms. Using PCF, which requires an access point that
l Data: The MAC frame may contain arbitrary data (max. 2312 byte), which is transferred
transparently from sender to the receiver(s).
l Checksum (CRC): Finally, a 32-bit checksum is used to protect the frame, as is common
procedure in all 802.x networks.
MAC frames can be transmitted between mobile stations, between mobile stations and an access
point, and between access points over a distribution system.
4.10 MAC MANAGEMENT
MAC management plays a central role in an IEEE 802.11 station, as it more or less controls all functions
related to system integration, i.e., integration of a wireless station into a BSS, formation of an ESS,
synchronization of stations, etc.
The functional groups include:
i) Synchronization:
Each node of an 802.11 network maintains an internal clock. To synchronize the clocks of all nodes,
IEEE 802.11 specifies a timing synchronization function (TSF). Synchronized clocks are needed for
power management, but also for coordination of the PCF and for synchronization of the hopping sequence in
an FHSS system. Using PCF, the local timer of a node can predict the start of a superframe, i.e., the
contention-free and contention periods. FHSS physical layers need the same hopping sequences for all the
nodes to be able to communicate within a BSS.
Within a BSS, timing is conveyed by the periodic transmission of a beacon frame. A beacon contains
a timestamp and other management information used for power management and roaming. The timestamp
is used by a node to adjust its local clock. The node is not required to hear every beacon to stay synchronized;
however, from time to time internal clocks should be adjusted. The transmission of a beacon frame is not
always periodic, but may be deferred if the medium is busy.
Within infrastructure-based networks, the AP performs synchronization by transmitting the periodic
beacon signal, whereas all other wireless nodes adjust their local timers to the timestamp. This is shown in
Figure 4.20. The AP is not always able to send its beacon B periodically if the medium is busy.
However, the AP always tries to schedule transmissions according to the expected beacon interval (target
beacon transmission time), i.e., beacon intervals are not shifted if one beacon is delayed. The timestamp
of a beacon always reflects the real transmit time, not the scheduled time.
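The two rules described here can be sketched in a few lines: a station adopts the timestamp from a received beacon, and target beacon transmission times stay on a fixed grid even when an individual beacon is delayed. The time values are arbitrary illustrative units:

```python
def on_beacon(local_clock, beacon_timestamp):
    """Infrastructure mode: the AP's clock is authoritative, so a
    station simply adopts the timestamp carried in the beacon."""
    return beacon_timestamp

def next_target_beacon(start, interval, now):
    """Target beacon transmission times stay on a fixed grid;
    a delayed beacon does not shift subsequent beacon intervals."""
    elapsed = now - start
    return start + ((elapsed // interval) + 1) * interval

print(on_beacon(local_clock=1_000, beacon_timestamp=1_024))  # 1024
# Even if the beacon scheduled at t=300 went out late (t=357),
# the next target time is still t=400:
print(next_target_beacon(start=0, interval=100, now=357))    # 400
```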
For ad hoc networks, the situation is slightly more complicated as they do not have an AP for beacon
transmission. In this case, each node maintains its own synchronization timer and starts the transmission
of a beacon frame after the beacon interval. Figure 4.21 shows an example where multiple stations try to
send their beacon. However, the standard random backoff algorithm is also applied to the beacon frames
and thus, typically only one beacon wins. Now all other stations adjust their internal clock according to the
received beacon and suppress their beacons for this cycle. If collision occurs, the beacon is lost. In this
scenario, the beacon intervals can be shifted slightly in time because all clocks may vary and, thus also the
start of a beacon interval from a node’s point of view. However, after synchronization all nodes again
have the same consistent view.
Fig. 4.21: Multiple stations try to send their beacon.
ii) Power management
Wireless devices are battery powered. Therefore, power-saving mechanisms are crucial for the success
of such devices. Standard LAN protocols assume that stations are always ready to receive data, although
receivers are idle most of the time in lightly loaded networks. However, this permanent readiness of the
receiving module is critical for battery lifetime, as the receiver current may be up to 100 mA.
Fig. 4.23: Simple ad hoc network with two stations
Figure 4.23 shows a simple ad hoc network with two stations. Again, the beacon interval is determined
by a distributed function (different stations may send the beacon). However, due to this synchronization,
all stations within the ad hoc network wake up at the same time. All stations stay awake for the ATIM
interval as shown in the first steps, and go to sleep again if no frame is buffered for them. In the third step,
station 1 has data buffered for station 2. This is indicated in an ATIM transmitted by station 1. Station 2
acknowledges this ATIM and stays awake for the transmission. After the ATIM window, station 1 can
transmit the data frame, and station 2 acknowledges its receipt. In this case, the stations stay awake for the
next beacon.
iii) Roaming
Typical wireless networks within buildings require more than just one access point to cover all rooms.
Depending on the solidity and material of the walls, one AP has a transmission range of 10-20 m if
transmission is to have a decent quality. If a user walks around with a wireless station, the station has to
move from one AP to another to provide uninterrupted service. Moving between APs is called roaming.
The steps for roaming between AP are the following:
l A station decides that the current link quality to its AP1 is too poor. The station then starts scanning for
another AP.
l Scanning involves the active search for another BSS and can also be used for setting up a new
BSS in the case of ad hoc networks. IEEE 802.11 specifies scanning on single or multiple channels
and differentiates between passive scanning and active scanning. Passive scanning means
listening into the medium to find other networks, i.e., receiving the beacons issued by the
synchronization function within an AP. Active scanning
comprises sending a probe on each channel and waiting for response. Beacon and probe
response contain the information necessary to join the new BSS.
l The station then selects the best AP for roaming based on, e.g., signal strength, and sends an
association request to the selected AP2.
l The new AP2 answers with an association response. If the response is successful, the station
has roamed to the new AP2; otherwise, the station has to continue scanning for new APs.
l The AP accepting an association request indicates the new station in its BSS to the distribution
system (DS). The DS then updates its database, which contains the current location of the
wireless stations. This database is needed for forwarding frames between different BSSs, i.e.,
between the different APs controlling the BSSs, which combine to form an ESS.
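The decision logic in the roaming steps above can be sketched as follows. The dBm threshold, AP names, and scan-result format are illustrative assumptions, not part of the standard:

```python
def select_ap(scan_results, current_ap, threshold=-75):
    """Pick the strongest AP from a scan once the current link
    drops below a signal threshold (values in dBm)."""
    if scan_results.get(current_ap, -100) >= threshold:
        return current_ap  # link still good enough: no roaming
    # Otherwise associate with the best candidate found while scanning.
    return max(scan_results, key=scan_results.get)

# Hypothetical scan: signal strength per AP in dBm.
scan = {"AP1": -82, "AP2": -55, "AP3": -70}
print(select_ap(scan, current_ap="AP1"))  # roams to AP2
```

Real stations weigh more than signal strength (load, supported rates, etc.), but the threshold-then-select structure matches the steps listed above.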
4.11 SUMMARY
In this chapter we presented the definition of signals and their characteristics. Specifically, we introduced
the concepts of modulation, carrier signals, noise, and bandwidth. These serve as the basis for WLANs. We
have touched upon different IEEE standards used in wireless applications. The architecture and protocols
of WLAN are covered in detail. Some of the related topics in MAC layer and power management are
also discussed.
4.12 QUESTIONS
1. What are WLANs?
2. What is modulation?
3. What is a carrier signal?
4. Define SNR?
5. What is BW?
6. Compare 802.11a, 802.11b, 802.11g and Bluetooth.
7. List out the advantages and disadvantages of WLAN?
8. Compare Infrared and Radio transmission?
9. Discuss the architecture of WLAN?
10. Briefly explain the WLAN protocol architecture?
Up until the mid 1970’s cryptography was an arcane science practised largely by government and
military security experts. A more serious attempt to control the field occurred in 1980, when the NSA (National
Security Agency) funded the American Council on Education to examine the issue with a view to
persuading Congress to give it legal control of publications in the field of cryptography. As the eighties
progressed, pressure focused more on the practice than the study of cryptography. This gave rise to the
wide use of cryptography in all the fields of computer as well as Internet.
With the introduction of the computer, the need for automated tools for protecting files and other
information stored on the computer became evident. This is especially the case for a shared system such
as a time-sharing system, and the need is even more acute for systems that can be accessed over a public
telephone or data network. The generic name for the collection of tools designed to protect data and to
thwart hackers is computer security.
The second major change that affected security is the introduction of distributed systems and the use
of networks and their communications facilities for carrying data between terminal user and computer and
also between computers. Network security measures are needed to protect data during their transmission.
5.2 DEFINITION OF CRYPTOGRAPHY AND CRYPTANALYSIS
Cryptography is the science of using mathematics to encrypt and decrypt data. Cryptography enables
us to store sensitive information or transmit it across insecure networks (like the Internet) so that it cannot
mail message, and a transferred file may contain sensitive or confidential information. It is necessary to
prevent the opponent from learning the contents of the transmissions.
- Traffic analysis: The second type of passive attack, traffic analysis, is more subtle. Suppose
that we had a way of masking the contents of messages or other information traffic so that
opponents, even if they captured the message, could not extract the information from it. The
common technique for masking contents is encryption. Even with encryption protection in
place, an opponent might still be able to observe the pattern of these messages. The opponent
could determine the location and identity of the communicating hosts and could observe the
frequency and length of messages being exchanged. This information might be useful in
guessing the nature of the communication that was taking place.
Passive attacks are very difficult to detect because they do not involve any alteration of the data. The
emphasis in dealing with passive attacks is on prevention of the attack rather than detection.
Active attacks
These attacks involve some modification of the data stream or the creation of a false stream, and can
be divided into four categories: masquerade, replay, modification of messages, and denial of service.
Masquerade: This takes place when one entity pretends to be a different entity. A masquerade attack
usually includes one of the other forms of active attack, i.e. replay, modification of messages, or denial of service.
Replay: This involves the passive capture of a data unit and its subsequent retransmission to produce an
unauthorized effect.
Modification of messages: This means that some portion of the message is altered or that messages are delayed or reordered to produce an unauthorized effect.
Denial of service: This prevents or inhibits the normal use or management of communications facilities.
This attack may have a specific target; for example, an entity may suppress all messages directed to a
particular destination. Another form of service denial is the disruption of an entire network, either by
disabling the network or by overloading it with messages so as to degrade performance.
Active attacks present the opposite characteristics of passive attacks. Whereas passive attacks are
difficult to detect, active attacks are difficult to prevent absolutely, because prevention would require
physical protection of all communications facilities and paths at all times. Instead, the goal is to detect
them and to recover from any disruption or delays caused by them. Because detection has a deterrent
effect, it may also contribute to prevention.
5.4.2 SECURITY SERVICES
Computer and network security research and development have instead focused on a few general
security services that encompass the various functions required of an information security facility.
A model for network security is shown in figure 5.3. A message is to be transferred from one
party to another across some sort of Internet. The two parties, who are the principals in this transaction,
must cooperate for the exchange to take place. A logical information channel is established by defining a
route through the Internet from source to destination and by the cooperative use of communication protocols
by the two principals.
Security aspects come into play when it is necessary or desirable to protect the information transmission
from an opponent who may present a threat to confidentiality, authenticity and so on. All the techniques
for providing security have two components:
Figure 5.3 Model for network security
- A security-related transformation on the information to be sent.
- Some secret information shared by the two principals and, it is hoped, unknown to the opponent.
A trusted third party may be needed to achieve secure transmission. For example, a third party may be
responsible for distributing the secret information to the two principals while keeping it from any opponent.
- Service threats: exploit service flaws in computers to inhibit use by legitimate users.
The security mechanisms needed to cope with unwanted access fall into two broad categories, as
shown in figure 5.4. The first category might be termed a gatekeeper function. It includes password-
based login procedures that are designed to deny access to all but authorized users, and screening logic
that is designed to detect and reject viruses and other similar attacks.
Once either an unwanted user or unwanted software gains access, the second line of defense consists
of a variety of internal controls that monitor activity and analyze stored information in an attempt to detect
the presence of unwanted intruders.
5.6 CONVENTIONAL ENCRYPTION
Conventional encryption, also referred to as symmetric or single-key encryption, was the only type of
encryption in use prior to the development of public-key encryption. There are two general types of
encryption techniques (classical and modern), and key-based algorithms fall into two classes: symmetric
and public-key algorithms. In conventional (symmetric) algorithms the encryption key can be calculated
from the decryption key and vice versa; in most such algorithms, the encryption key and the decryption
key are the same. These algorithms are also called secret-key or one-key algorithms. The sender and
receiver must agree on a key before they can communicate securely. The security of a symmetric
algorithm rests entirely in the key: anyone who knows the key can encrypt and decrypt messages.
Encryption and decryption with a conventional algorithm are denoted by:
Ek(M) = C
Dk(C) = M
where M is the message (plaintext), C is the ciphertext (encrypted message), E and D denote encryption
and decryption, and the subscript k denotes the key; thus Dk(Ek(M)) = M.
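The relationship above can be demonstrated with a toy cipher. The following sketch is not from this chapter: it uses XOR with a repeating key purely to show that the same key k serves both Ek and Dk. A real symmetric cipher such as DES is far more elaborate, and repeating-key XOR is not secure.

```python
# Toy symmetric cipher (illustrative only, NOT secure): the same key k is
# used for encryption E_k and decryption D_k, so D_k(E_k(M)) = M.
def E(k: bytes, m: bytes) -> bytes:
    """Encrypt: C = E_k(M), XORing each byte with the repeating key."""
    return bytes(b ^ k[i % len(k)] for i, b in enumerate(m))

def D(k: bytes, c: bytes) -> bytes:
    """Decrypt: M = D_k(C). For XOR, decryption is the same operation."""
    return E(k, c)

key = b"secretkey"
msg = b"meet me tomorrow"
cipher = E(key, msg)
assert D(key, cipher) == msg   # D_k(E_k(M)) = M
```

Note how the sender and receiver must share the same key in advance, exactly as the text describes for conventional algorithms.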
5.7 CONVENTIONAL ALGORITHMS
Conventional or symmetric algorithms can be divided into two categories. The first category, stream
algorithms or stream ciphers, operates on a single bit or byte of the plaintext at a time. The second
category, block algorithms or block ciphers, operates on the plaintext in groups of bits at a time. Figure
5.5 shows the general model of encryption and decryption of a message.
5.7.1 Model of message encryption and decryption
A message is nothing but plaintext (also called cleartext). The process of disguising a message in such
a way as to hide its substance is called encryption.
2. The number of keys used: If both sender and receiver use the same key, the system is
referred to as symmetric, single-key, secret-key, or conventional encryption. If the sender and
receiver each use a different key, the system is referred to as asymmetric, two-key or public-
key encryption.
3. The way in which the plaintext is processed: A block cipher processes the input one block
of elements at a time, producing an output block for each input block; a stream cipher processes
the input elements continuously.
5.8.1 Cryptanalysis
The whole point of cryptography is to keep the plaintext (or the key, or both) secret from the opponents
(also called adversaries, attackers, interceptors, interlopers, intruders, or simply the enemy).
The process of attempting to discover the plaintext or the key (or both) is known as cryptanalysis. There
are four general types of cryptanalytic attacks.
1. Ciphertext-only attack: The cryptanalyst has the ciphertext of several messages, all of
which have been encrypted using the same encryption algorithm. The cryptanalyst’s job is to
recover the plaintext of as many messages as possible, or better yet to deduce the key used to
encrypt the messages, in order to decrypt other messages encrypted with the same key.
Given: C1 = Ek(P1), C2 = Ek(P2), ..., Ci = Ek(Pi)
Deduce: either P1, P2, ..., Pi, k; or an algorithm to infer Pi+1 from Ci+1 = Ek(Pi+1)
2. Known-plaintext attack: The cryptanalyst has access to the ciphertext as well as the plaintext
of several messages. The cryptanalyst’s job is to deduce the key (or keys) used to encrypt the
messages, or an algorithm to decrypt any new messages encrypted with the same key (or keys).
Given: P1, C1 = Ek(P1), P2, C2 = Ek(P2), ..., Pi, Ci = Ek(Pi)
Deduce: either k, or an algorithm to infer Pi+1 from Ci+1 = Ek(Pi+1)
3. Chosen-plaintext attack: The cryptanalyst not only has access to the ciphertext and associated
plaintext for several messages, but also chooses the plaintext that gets encrypted. This is more
powerful than a known-plaintext attack, because the cryptanalyst can choose specific plaintext
blocks to encrypt, which might yield more information about the key.
Given: P1, C1 = Ek(P1), P2, C2 = Ek(P2), ..., Pi, Ci = Ek(Pi), where the cryptanalyst gets to
choose P1, P2, ..., Pi
Deduce: either k, or an algorithm to infer Pi+1 from Ci+1 = Ek(Pi+1)
4. Adaptive chosen-plaintext attack: This is a special case of the chosen-plaintext attack, in
which the cryptanalyst can choose and modify the plaintext that is encrypted based on the
results of previous encryptions.
The ciphertext-only attack is the easiest to defend against, because the opponent has the least amount
of information to work with. In many cases, however, the analyst is able to capture one or more plaintext
messages as well as their encryptions. For example, a file that is encoded in the PostScript format always
begins with the same pattern, or there may be a standardized header or banner on an electronic funds
transfer message, and so on. These are examples of known plaintext. From this knowledge, the analyst
may be able to deduce the key on the basis of the way in which the known plaintext is transformed.
In a chosen-plaintext attack, the analyst is able to choose the messages to be encrypted, deliberately
using patterns that can be expected to reveal the structure of the key.
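A concrete illustration of a known-plaintext attack, assuming a toy repeating-key XOR cipher (an illustrative assumption, not an algorithm from this chapter): a single captured plaintext/ciphertext pair, such as a standardized banner, immediately reveals the key.

```python
# Known-plaintext attack on a toy repeating-key XOR cipher: since
# C = P XOR k, the key is recovered as k = P XOR C from one known pair.
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ b[i % len(b)] for i, x in enumerate(a))

key = b"k3y"                          # unknown to the analyst
p1 = b"standard header"               # known plaintext (a fixed banner)
c1 = xor(p1, key)                     # its ciphertext, observed on the wire
recovered = xor(p1, c1)[:len(key)]    # k = P1 XOR C1
assert recovered == key

c2 = xor(b"secret funds transfer", key)   # a new intercepted message
assert xor(c2, recovered) == b"secret funds transfer"
```

This is exactly why ciphers whose key falls out of one plaintext/ciphertext pair are considered broken under the known-plaintext model.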
5.9 STEGANOGRAPHY
Steganography hides the secret message within other messages. Typically the sender writes an innocuous
message and then conceals a secret message on the same piece of paper. Historical tricks include invisible
inks, tiny pin punctures on selected characters, minute differences between handwritten characters, pencil
marks on typewritten characters, grilles which cover most of the message except for a few characters, and
so on.
Some examples are listed below:
- Character marking: Selected letters of printed or typewritten text are overwritten in pencil. The marks are ordinarily not visible unless the paper is held at an angle to bright light.
- Invisible ink: A number of substances can be used for writing but leave no visible trace until
heat or some chemical is applied to the paper.
- Pin punctures: Small pin punctures on selected letters are ordinarily not visible unless the
paper is held up in front of a light.
- Typewriter correction ribbon: Used between lines typed with a black ribbon, the results of
typing with the correction tape are visible only under a strong light.
The advantage of steganography is that it can be employed by parties who have something to lose
should the fact of their secret communication (not merely its content) be discovered. But it has several
disadvantages when compared to encryption: it requires a lot of overhead to hide relatively few bits of
information, and once the system is discovered it becomes useless. These drawbacks can be mitigated
by first encrypting the message and then hiding it using steganography, which preserves the secrecy of
the information even if the hiding scheme is exposed.
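As a toy illustration of the idea (an illustrative sketch, not a technique named in the chapter), a secret word can be hidden as the first letter of each word of an innocuous cover text, in the same spirit as the historical tricks listed above.

```python
# Toy steganography: the secret is spelled by the initial letters of an
# otherwise innocuous sentence.  cover_words maps each secret letter to a
# harmless word beginning with that letter (hypothetical example values).
def hide(secret: str, cover_words: dict) -> str:
    """Build a sentence whose word initials spell the secret."""
    return " ".join(cover_words[ch] for ch in secret)

def reveal(stego_text: str) -> str:
    """Recover the secret by reading the first letter of each word."""
    return "".join(word[0] for word in stego_text.split())

words = {"h": "have", "i": "it", "d": "done", "e": "early"}
stego = hide("hide", words)        # "have it done early"
assert reveal(stego) == "hide"
```

Note the overhead the text mentions: a whole word of cover text is spent per hidden letter, and the scheme is worthless once the rule "read the initials" is discovered.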
5.10 CLASSICAL ENCRYPTION TECHNIQUES
This section provides a brief introduction to classical encryption and its different techniques. A study of
these techniques illustrates the basic approaches to conventional encryption in use today. Before computers,
cryptography consisted of character-based algorithms, and a number of different cryptographic algorithms
were used.
Most good cryptographic algorithms combine the elements of substitution and transposition.
In the classical encryption techniques there are four types of substitution ciphers:
1. A simple substitution cipher or monoalphabetic cipher is one in which each character of
the plaintext is replaced with a corresponding character of ciphertext. The cryptograms in
newspapers are simple substitution ciphers.
2. A homophonic substitution cipher is like a simple substitution cryptosystem, except that a
single character of plaintext can map to one of several characters of ciphertext.
3. A polygram substitution cipher is one in which blocks of characters are encrypted in groups.
4. A polyalphabetic substitution cipher is made up of multiple simple substitution ciphers. For
example, there might be five different simple substitution ciphers used; the particular one used
changes with the position of each character of the plaintext.
5.10.1 SUBSTITUTION TECHNIQUES OR SUBSTITUTION CIPHERS
A substitution cipher is one in which each character in the plaintext is substituted for another
character in the ciphertext. The receiver decrypts the ciphertext, or encrypted message, to recover the
plaintext.
Caesar Cipher
This is the most famous substitution algorithm, in which each plaintext character is replaced by the
character three positions to the right, modulo 26 (i.e., A is replaced by D, B by E). For example:
Plain: m e e t m e t o m o r r o w
Cipher: PHHW PH WRPRUURZ
Note that the alphabet is wrapped around, so that the letter following Z is A. Assigning each letter a
numerical value (a = 0, b = 1, ..., z = 25), the transformation can be expressed as C = E(p) = (p + 3) mod 26,
and decryption as p = D(C) = (C - 3) mod 26.
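The shift-by-three rule above can be sketched in a few lines (an illustrative sketch; negative shifts give decryption):

```python
# Caesar cipher: replace each letter by the letter `shift` positions to the
# right, modulo 26; non-letters are passed through unchanged.
def caesar(text: str, shift: int = 3) -> str:
    out = []
    for ch in text.lower():
        if ch.isalpha():
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)

cipher = caesar("meet me tomorrow").upper()   # encrypt with shift +3
assert cipher == "PHHW PH WRPRUURZ"
plain = caesar(cipher.lower(), -3)            # decrypt with shift -3
assert plain == "meet me tomorrow"
```

Because there are only 25 possible nonzero shifts, an attacker can simply try them all, which is why the Caesar cipher is trivially broken by a ciphertext-only attack.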
Two principal methods are used in substitution ciphers to lessen the extent to which the structure
of the plaintext survives in the ciphertext. One approach is to encrypt multiple letters of plaintext, and the
other is to use multiple cipher alphabets.
5.10.3 Playfair Cipher
The best-known multiple-letter encryption cipher is the Playfair, which treats digrams in the plaintext
as single units and translates these units into ciphertext digrams. The Playfair algorithm is based on the
use of a 5x5 matrix of letters constructed using a keyword. For example, with the keyword monarchy:

M O N A R
C H Y B D
E F G I/J K
L P Q S T
U V W X Z

The matrix is constructed by filling in the letters of the keyword (minus duplicates) from left to right and
from top to bottom, and then filling in the remainder of the matrix with the remaining letters in alphabetic
order. The letters I and J count as one letter. Plaintext is encrypted two letters at a time, according to the
following rules:
1. Repeating plaintext letters that would fall in the same pair are separated with a filler letter, such
as x, so that balloon would be enciphered as ba lx lo on.
2. Plaintext letters that fall in the same row of the matrix are each replaced by the letter to the
right, with first element of the row circularly following the last. For example, ar is encrypted as
RM.
3. Plaintext letters that fall in the same column are each replaced by the letter beneath, with the top
element of the column circularly following the last. For example, mu is encrypted as CM.
4. Otherwise, each plaintext letter in a pair is replaced by the letter that lies in its own row and in
the column occupied by the other plaintext letter. Thus, hs becomes BP and ea becomes IM (or
JM, as the encipherer wishes).
The Playfair cipher is a great advance over simple monoalphabetic ciphers. For one thing, whereas
there are only 26 letters, there are 26 x 26 = 676 digrams, so that identification of individual digrams is
more difficult.
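The matrix construction and the four rules can be sketched in code. This is an illustrative sketch (the digram splitting here always uses x as the filler letter, and I/J are merged as the text describes):

```python
# Playfair cipher sketch for the keyword "monarchy".
KEYWORD = "monarchy"

def build_matrix(keyword: str):
    """5x5 matrix: keyword letters (minus duplicates), then the rest; no j."""
    seen, letters = set(), []
    for ch in keyword + "abcdefghiklmnopqrstuvwxyz":
        ch = "i" if ch == "j" else ch
        if ch not in seen:
            seen.add(ch)
            letters.append(ch)
    return [letters[i:i + 5] for i in range(0, 25, 5)]

def pos(matrix, ch):
    for r, row in enumerate(matrix):
        if ch in row:
            return r, row.index(ch)

def encrypt(plaintext: str) -> str:
    m = build_matrix(KEYWORD)
    text = plaintext.replace("j", "i").replace(" ", "").lower()
    digrams, i = [], 0
    while i < len(text):                     # rule 1: split, filler 'x'
        a = text[i]
        b = text[i + 1] if i + 1 < len(text) else "x"
        if a == b:
            digrams.append((a, "x"))
            i += 1
        else:
            digrams.append((a, b))
            i += 2
    out = []
    for a, b in digrams:
        ra, ca = pos(m, a)
        rb, cb = pos(m, b)
        if ra == rb:                         # rule 2: same row, shift right
            out += [m[ra][(ca + 1) % 5], m[rb][(cb + 1) % 5]]
        elif ca == cb:                       # rule 3: same column, shift down
            out += [m[(ra + 1) % 5][ca], m[(rb + 1) % 5][cb]]
        else:                                # rule 4: rectangle, swap columns
            out += [m[ra][cb], m[rb][ca]]
    return "".join(out).upper()

assert encrypt("ar") == "RM"
assert encrypt("mu") == "CM"
assert encrypt("hs") == "BP"
```

The asserts reproduce the worked examples from the rules above (ar becomes RM, mu becomes CM, hs becomes BP).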
5.10.4 Hill Cipher
This multiple-letter cipher was developed by the mathematician Lester Hill in 1929. The
encryption algorithm takes m successive plaintext letters and substitutes for them m ciphertext letters.
The substitution is determined by m linear equations in which each character is assigned a numerical
value (a = 0, b = 1, ..., z = 25). For m = 3 the system can be described as follows:
C1 = (k11 p1 + k12 p2 + k13 p3) mod 26
C2 = (k21 p1 + k22 p2 + k23 p3) mod 26
C3 = (k31 p1 + k32 p2 + k33 p3) mod 26
This can be expressed in terms of column vectors and matrices: C = KP mod 26, where C and P are
column vectors of length 3 representing the ciphertext and plaintext, and K is a 3 x 3 matrix representing
the encryption key.
The determinant of a square matrix is the sum of all the products that can be formed by taking exactly
one element from each row and exactly one element from each column, with certain of the product terms
preceded by a minus sign. For a 2 x 2 matrix the determinant is k11 k22 - k12 k21. For a 3 x 3 matrix
the value of the determinant is k11 k22 k33 + k21 k32 k13 + k31 k12 k23 - k31 k22 k13 - k21 k12 k33
- k11 k32 k23. If a square matrix A has a nonzero determinant, then the inverse of the matrix is computed
as [A^-1]ij = (-1)^(i+j) (Dji) / det(A), where Dji is the subdeterminant formed by deleting the jth row and
the ith column of A, and det(A) is the determinant of A. For our purposes, all arithmetic is done mod 26.
In general terms, the Hill system can be expressed as follows:
C = Ek(P) = KP mod 26
P = Dk(C) = K^-1 C mod 26 = K^-1 KP = P
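These equations can be exercised with a small sketch for m = 2. The key matrix K and its inverse mod 26 below are illustrative assumptions (any matrix whose determinant is invertible mod 26 would do), not values taken from the chapter:

```python
# Hill cipher sketch for m = 2, letters numbered a=0 ... z=25.
# C = KP mod 26 encrypts a digram; P = K^-1 C mod 26 decrypts it.
K    = [[5, 8], [17, 3]]     # encryption key (determinant 9, invertible mod 26)
KINV = [[9, 2], [1, 15]]     # K^-1 mod 26, so KINV * K = I (mod 26)

def apply(key, pair):
    """Multiply a 2x2 key matrix by a 2-letter column vector, mod 26."""
    return [(key[r][0] * pair[0] + key[r][1] * pair[1]) % 26 for r in (0, 1)]

def crypt(text: str, key) -> str:
    nums = [ord(c) - ord("a") for c in text]
    out = []
    for i in range(0, len(nums), 2):          # process digram by digram
        out += apply(key, nums[i:i + 2])
    return "".join(chr(n + ord("a")) for n in out)

cipher = crypt("meet", K)
assert crypt(cipher, KINV) == "meet"          # P = K^-1 (K P) mod 26
```

The roundtrip assert is exactly the identity Dk(C) = K^-1 KP = P stated above.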
As with Playfair, the strength of the Hill cipher is that it completely hides single-letter frequencies.
Indeed, with Hill, the use of a larger matrix hides more frequency information: a 3x3 Hill cipher
hides not only single-letter but also two-letter frequency information. Although strong against a
ciphertext-only attack, it is easily broken with a known-plaintext attack.
For an m x m Hill cipher, suppose we have m plaintext-ciphertext pairs, each of length m. We label the
pairs Pj = (p1j, p2j, ..., pmj) and Cj = (c1j, c2j, ..., cmj) such that Cj = KPj for 1 <= j <= m and for some
unknown key matrix K. Now define two m x m matrices X = (pij) and Y = (cij). Then we can form the
matrix equation Y = KX. If X has an inverse, then we can determine K = YX^-1. If X is not invertible,
then a new version of X can be formed with additional plaintext-ciphertext pairs until an invertible X is
obtained.
5.10.5 Transposition Ciphers
In a transposition cipher the plaintext remains the same, but the order of characters is shuffled around.
In a simple columnar transposition cipher, the plaintext is written horizontally onto a piece of graph paper
of fixed width and the ciphertext is read off vertically, as in the following example. Decryption is a matter
of writing the ciphertext vertically onto a piece of graph paper of identical width and then reading the
plaintext off horizontally.
Plaintext: COMPUTER GRAPHICS MAY BE SLOW BUT AT LEAST IT’S
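The write-horizontally, read-vertically procedure can be sketched as follows. This is an illustrative sketch (with a shorter plaintext than the example above); it pads the final row with x so that the rectangle is full, which keeps decryption simple:

```python
# Simple columnar transposition: write the plaintext in rows of fixed
# width (padded with 'x'), then read the ciphertext down the columns.
def encrypt(plaintext: str, width: int) -> str:
    text = plaintext.replace(" ", "").lower()
    if len(text) % width:                      # pad to a full rectangle
        text += "x" * (width - len(text) % width)
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    return "".join(row[c] for c in range(width) for row in rows)

def decrypt(ciphertext: str, width: int) -> str:
    height = len(ciphertext) // width          # rows in the original grid
    cols = [ciphertext[i:i + height] for i in range(0, len(ciphertext), height)]
    return "".join(col[r] for r in range(height) for col in cols)

c = encrypt("meet me tomorrow", 4)    # grid rows: meet / meto / morr / owxx
assert decrypt(c, 4) == "meetmetomorrowxx"
```

Note that every plaintext character survives unchanged; only their order is shuffled, which is the defining property of a transposition cipher.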