Multimedia Networking Network Requirements and Protocols Ahrar Naqvi [email protected].

Multimedia Networking

Network Requirements and Protocols

Ahrar Naqvi

[email protected]

Agenda

• Overview of media characteristics from a networking standpoint

• Network requirements for multimedia communications

• Network architectures, techniques and protocols

Multimedia Classification

• Real Time: Requires bounds on end-to-end packet delay and jitter. Subdivided into:
  – Discrete Media: e.g. MSN/Yahoo Messenger, stock quotes
  – Continuous Media: Continuous message stream with inter-message dependency. Further divided into:
    • Delay Tolerant, e.g. Internet webcast
    • Delay Intolerant, e.g. audio and video streams in conferencing systems

• Non-Real Time: No strict delay constraints (e.g. text, image files)
  – May be highly sensitive to errors

Key Media Characteristics from Networking Standpoint

• Bandwidth usage

• Sensitivity to error

• Real-time nature

Text

• Bandwidth requirements depend on size
  – Can be easily reduced by compression techniques

• Some text applications require complete freedom from loss & errors – use TCP. E.g. FTP

• Others are error and loss tolerant – use UDP e.g. instant messaging

Audio

• Bandwidth requirements depend on dynamic range and/or spectrum
  – Narrowband speech (300-3400 Hz)
    • 5.3/6.3 Kbps (G.723.1) to 64 Kbps (G.711)
  – Wideband audio (CD quality music), 10 Hz-22 KHz
    • 112-128 Kbps (MP3)

• Can tolerate 1-2% packet loss

• Real-time nature depends on extent of interactivity
  – VoIP requires strong bounds on delay/jitter (Real-Time Intolerant)
    • < 250 ms end-to-end delay
  – Internet webcast is more delay/jitter tolerant (Real-Time Tolerant)

Graphics and Animation

• Examples: digital images, Flash presentations

• Large in size but lend themselves well to compression

• Progressive compression techniques enable an image to be displayed initially in low quality and gradually improved as more information is received

• Error-tolerant and can sustain packet loss provided application knows how to deal with packet loss

• No real-time constraints

Video

• High bandwidth requirements

• Efficient compression schemes
  – MPEG-1 (1.2 Mbps): VCR quality compression
  – MPEG-2 (3-100 Mbps): broadcast quality video, HDTV
  – MPEG-4 (64 Kbps): low bandwidth video compression; supports audio, video, graphics, animation, text
  – H.261 (p x 64 Kbps): video over ISDN
  – H.263 (18-64 Kbps): video over POTS

• Error requirements and real-time characteristics similar to audio

Network Requirements

• Traffic Requirements: Have implications for basic Internet infrastructure
  – Limits on delay and jitter
  – Bandwidth
  – Reliability

• Functional Requirements: Require enhancements to the TCP/IP stack in the form of additional network protocols
  – Multicasting
    • Single source of communication with multiple simultaneous receivers
    • Can be one way (Internet radio) or two way (multi-party audio/video conferencing)
  – Security
  – Mobility
  – Session Management

Delay Related Metrics

• Maximum end-to-end delay

• Delay variance
  – Jitter: non-monotonic variation in delay within a given stream
    • For a video stream, jitter would result in a shaky picture
    • Jitter can be removed by buffering at the receiver side
  – Skew: constantly increasing difference between the expected arrival time and the actual arrival time
    • For a video stream, skew could appear as a slower or faster moving picture

Delay: Packet Processing Delay

• Constant amount of delay at both source and destination
  – A/D and D/A conversion time, plus the time taken to packetize the data through the different protocol layers

• Typically a characteristic of the operating system and the multimedia application

• Delay can become significant under high load conditions

• Reductions in delay imply software enhancements, including use of multimedia operating systems that provide enhanced resource, file and memory management with real-time scheduling

Delay: Packet Transmission Delay

• Time taken by the physical layer at the source to transmit packets. Depends on:
  – Number of active sessions: the physical layer typically processes packets in FIFO order. Delay can become significant if the OS does not support real-time scheduling for multimedia traffic
  – MAC access delay: Widespread Ethernet networks cannot provide any firm guarantees on medium access delay due to the inherent indeterminism of CSMA/CD (carrier sense multiple access/collision detection). Isochronous Ethernet (802.9), an integrated voice/data LAN, and demand priority Ethernet (802.12) provide QoS, but market potential remains low

Delay: Propagation Delay

• Flight time of packets – limited by speed of light. Can’t do anything about it

• For a distance of 20,000 km this would be about 0.067 sec

• Significant part of a desirable ~200 msec delay budget
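The propagation figure above can be checked with a short calculation. This is a minimal sketch that assumes the signal travels at the vacuum speed of light; in fiber or copper the speed is lower, so real delays are somewhat larger. The function name and the 200 ms budget constant are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # vacuum value; an optimistic assumption


def propagation_delay_s(distance_km: float,
                        speed_m_per_s: float = SPEED_OF_LIGHT_M_PER_S) -> float:
    """Return the one-way flight time in seconds for the given distance."""
    return distance_km * 1000 / speed_m_per_s


delay = propagation_delay_s(20_000)
budget_s = 0.200  # the desirable end-to-end budget quoted on the slide
print(f"propagation: {delay * 1000:.1f} ms, "
      f"{delay / budget_s:.0%} of the delay budget")
```

Roughly a third of the budget is consumed before any processing, transmission or queuing delay is counted.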

Delay: Routing and Queuing Delay

• Best-effort Internet treats every packet equally

• Packets arriving at a queue have to wait a random amount of time depending on current router load

• Delay is variable and is the major contributor to jitter

• Techniques to reduce this include
  – IntServ
  – MPLS
  – DiffServ

Bandwidth Requirements

• Multimedia traffic streams have high bandwidth

• Uncontrolled transmissions at high rates can cause heavy congestion in the network

• Elastic applications that use TCP take advantage of built-in congestion control

• Most multimedia applications use UDP for transmitting media streams

• It is left to the discretion of the application to dynamically adapt to congestion

• To remove these shortcomings an enhanced Internet service model would require
  – Admission control: application must first get permission from some authority to send traffic at a given rate with given traffic characteristics
  – Bandwidth reservation: if admission is given, appropriate resources (buffers, bandwidth) will get reserved along the path
  – Traffic policing mechanisms: to ensure that applications do not send data at a rate higher than what was negotiated

Bandwidth Related Metrics

• Cost of protocol processing operations relates more directly to packet processing rate than to the bandwidth in terms of bit rate

• In addition to a bit-rate oriented specification of bandwidth, packet processing rate is an important metric

• Cost of packet processing is largely dependent on number of packets and less so on packet size

• Packetization related metrics include maximum and average packet size and packet rate

Reliability

• Pertains to loss and corruption of data

• Can be measured in terms of loss probability

• Requires methods for dealing with erroneous/lost data

Error Correction

• Sender Based Repair
  – Active Repair: Automatic Repeat reQuest (ARQ), i.e. retransmission
    • Suitable for error intolerant applications
  – Passive Repair
    • Interleaving
    • FEC: Forward Error Correction
      » Media Independent: independent of the content/nature of the stream
      » Media Dependent: uses knowledge of the stream in the repair process

• Error Concealment (Receiver Based Repair)

FEC

• Introduce repair data into the traffic, from which lost packets may be recovered

• Media Independent: use block or algebraic codes to produce additional packets which aid in loss recovery
  – Each code takes a codeword of k data packets and generates n-k additional check packets
  – The i-th bit in a check packet is generated from the i-th bits of each associated data packet
    • Parity Coding: XOR is applied across groups of packets to generate parity packets
    • Reed-Solomon Coding: Based on properties of polynomials over particular number bases
      – Take a set of codewords and use these as coefficients of a polynomial f(x)
      – The transmitted codeword is determined by evaluating the polynomial for all nonzero values of x over the number base
  – Disadvantage: causes additional delay, increases bandwidth usage and exacerbates congestion
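Parity coding is the simplest of these schemes and can be sketched in a few lines. The example below is illustrative only: it assumes equal-length packets, one parity packet per group (k data packets, n = k + 1), and hypothetical function names. A single parity packet can repair at most one loss per group.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """XOR all data packets of a group into one parity packet."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild the group; None marks a lost packet (at most one allowed)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("one parity packet can repair only one loss")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with every surviving packet yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


group = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(group)
print(recover([b"abcd", None, b"ijkl"], parity))
```

The cost named on the slide is visible here: one extra packet per group of three is a third more bandwidth, and the receiver must wait for the whole group before it can repair.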

Media Dependent FEC

• Exploit media characteristics

• For audio, could send each unit of audio in multiple packets
  – Primary encoding: first transmission
  – Secondary encoding: additional transmissions
  – Secondary encoding could be of lower bandwidth and quality than the primary encoding
  – May not be necessary to transmit FEC for every packet due to nature of media

• Advantage: low latency, since only a single packet delay is added
  – Suitable for interactive applications

* A Survey of Packet Loss Recovery Techniques for Streaming Audio, Colin Perkins et al IEEE Network Sep/Oct 1998

Interleaving

• Can be used when media unit size is smaller than packet size (as may be the case with audio) and end-to-end delay is not important

• Units are resequenced before transmission so that originally adjacent units are separated by a guaranteed distance, and are returned to their original order at the receiver

• Disperses the effect of packet loss: loss of a single packet causes multiple smaller gaps among the original media units

• In the case of audio, a phoneme originally encapsulated in one packet would be split across multiple packets

• Loss of small parts of several phonemes is easier to deal with than loss of entire phonemes

• Disadvantage: increased latency, so not well suited for interactive applications

• Advantage: does not increase bandwidth usage, so does well for non-interactive use
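The resequencing step can be illustrated with a simple block interleaver. This is a sketch under assumed parameters (12 media units spread over 4 packets) and hypothetical function names, not a description of any particular codec.

```python
from typing import List, Optional, Sequence


def interleave(units: Sequence, n_packets: int) -> List[list]:
    """Packet j carries units j, j + n_packets, j + 2 * n_packets, ..."""
    return [list(units[j::n_packets]) for j in range(n_packets)]


def deinterleave(packets: List[Optional[list]], n_units: int) -> list:
    """Restore original order; units from lost packets (None) stay None."""
    n_packets = len(packets)
    restored = [None] * n_units
    for j, packet in enumerate(packets):
        if packet is None:
            continue  # the whole packet was lost in transit
        for k, unit in enumerate(packet):
            restored[j + k * n_packets] = unit
    return restored


packets = interleave(list(range(12)), n_packets=4)
packets[1] = None  # simulate the loss of one packet
print(deinterleave(packets, n_units=12))
```

Losing one packet leaves three isolated single-unit gaps instead of one gap three units long. The latency penalty comes from the sender having to collect all 12 units before the first packet can leave.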

Error Concealment

• Produces a replacement for a lost packet which is similar to the original

• Works for relatively small loss rates (< 15%) and for small packets (4-40 ms)

• Types, in increasing order of computational cost and performance:
  – Insertion-based: insert a fill-in packet that contains silence, noise or a repetition of an adjacent packet
  – Interpolation-based: some form of pattern matching and interpolation to derive the missing packet (waveform, pitch or timescale based)
  – Regeneration-based: derive decoder state from packets surrounding the loss and generate a lost packet from that (model based recovery)

Error Recovery for Different Applications

• Non-interactive Applications, e.g. multicasts (radio)
  – Interleaving is suitable (bandwidth efficient, though high latency)
  – Use error concealment: repetition with fading
  – Media-independent FEC is better than a retransmission based scheme

• Interactive Applications (e.g. IP telephony)
  – Media Dependent FEC
  – Error concealment using packet repetition

Admission Control

• Pro-active form of congestion control

• Takes the requested traffic description as input, expressed in terms of leaky bucket parameters (b, r), including
  – Maximum burst size (b = bucket size)
  – Peak rate
  – Average rate

• Decides to accept or reject a flow, including consideration of the impact on existing flows

Admission Control

• Leaky bucket parameters provide a very loose upper bound on the traffic rate

• When traffic becomes bursty network utilization can become very low if admission control is solely based on parameters provided at call setup time

• Admission control unit must also use measurements of current network load and packet delay in its admission decisions

Traffic Shaping/Policing

• Token bucket algorithm is used for traffic shaping. It limits the average rate and allows a degree of burstiness.
  – Token bucket of depth ‘b’ in which tokens are collected at rate ‘r’
  – When the bucket becomes full, extra tokens are dropped
  – Source can send data only if it can grab and destroy sufficient tokens from the bucket

• Leaky bucket algorithm is used for traffic policing, in which excessive traffic is dropped
  – Bucket of depth ‘b’ with a hole at the bottom
  – If the bucket is full, extra packets are dropped
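A token bucket with depth b and fill rate r can be sketched as follows. The class and method names are illustrative, time is passed in explicitly so the behaviour is reproducible, and starting with a full bucket is an assumption of this sketch.

```python
class TokenBucket:
    """Token bucket: tokens arrive at `rate` per second, capped at `depth`."""

    def __init__(self, rate: float, depth: float, now: float = 0.0) -> None:
        self.rate = rate
        self.depth = depth
        self.tokens = depth  # assumption: the bucket starts full
        self.last = now

    def allow(self, size: float, now: float) -> bool:
        """Return True and consume tokens if `size` units may be sent at `now`."""
        elapsed = now - self.last
        # Refill for the elapsed time; tokens beyond the depth are dropped.
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


bucket = TokenBucket(rate=10, depth=20)
print(bucket.allow(20, now=0.0))   # a burst up to the bucket depth
print(bucket.allow(5, now=0.25))   # only 2.5 tokens have accumulated
print(bucket.allow(5, now=0.5))    # 5 tokens are now available
```

The depth bounds the burst size while the fill rate bounds the long-term average, which is the "degree of burstiness" the slide refers to. A shaper would queue a non-conforming packet until enough tokens arrive, whereas a policer would drop it.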

Packet Classification

• In order to prevent all packets from being treated equally some mechanism to distinguish between real-time and non-real time packets is needed

• Done by packet marking e.g. use Type of Service (ToS) field in IP header

• MPLS uses short labels

Packet Scheduling

• FIFO scheduling traditionally used in routers needs to be replaced with more sophisticated queuing

• E.g. priority queuing. Lower priority queues served after higher priority queues

• Disadvantage: possible starvation of low priority flows

• Weighted Fair Queuing has different queues for different classes. However every queue is assigned a certain weight. Packets in that queue get a fraction of the total bandwidth proportional to their weight
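The proportional split under Weighted Fair Queuing can be worked through numerically. The sketch below only computes the long-run bandwidth shares of queues that currently have packets waiting; it is not a packet scheduler, and the queue names, weights and link speed are hypothetical.

```python
from typing import Dict


def wfq_shares(link_bps: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split link bandwidth among backlogged queues in proportion to weight."""
    total = sum(weights.values())
    return {queue: link_bps * weight / total
            for queue, weight in weights.items()}


# Hypothetical 10 Mbps link with three backlogged classes.
print(wfq_shares(10_000_000, {"voice": 5, "video": 3, "data": 2}))
# If the video queue goes idle, its share is redistributed to the others.
print(wfq_shares(10_000_000, {"voice": 5, "data": 2}))
```

Unlike strict priority queuing, every backlogged queue receives a nonzero share, so low priority flows are not starved.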

Packet Dropping

• Routers can randomly drop packets under congestion

• This can be a problem since certain packets may carry more information than others

QoS Based Routing

• OSPF, RIP, BGP are best-effort routing protocols using single-objective optimization algorithms: hop count or line cost

• All traffic routed to shortest path can lead to congestion on some links and leave other links under-utilized

• If link congestion is used to derive line cost such that congested routes are costlier, such algorithms can cause oscillations in the network, thus increasing jitter

• In QoS based routing paths are determined based on some knowledge of resource availability in the network as well as the QoS requirement of the flows

Integrated Services

• Integrated Services (IntServ) model developed by the IETF

• Requires applications to know their QoS requirements beforehand and signal the intermediate network to reserve resources (bandwidth, buffers)

• Requires use of packet classifiers as well as packet schedulers

• Almost exclusively concerned with controlling the queuing component of end-to-end delay

• Has three service classes
  – Guaranteed Service (for Real-Time Intolerant traffic, RTI)
  – Controlled Load (for Real-Time Tolerant traffic, RTT)
  – Best-effort

IntServ

• Flow descriptor specifies QoS requirements
  – Filter spec: specifies the information for identifying a packet with a given flow
  – Flow spec: specifies the traffic in terms of token bucket parameters (Tspec) and the QoS parameters (Rspec) in terms of bandwidth, delay, jitter and packet loss

• Resource Reservation Protocol (RSVP) used to signal network nodes about required resources

RSVP

• The sender sends a PATH message to the receiver specifying traffic characteristics

• Every intermediate router along the path forwards the PATH message to the next hop determined by the routing protocol

• The receiver responds with a RESV message, which travels back along the reverse path and requests the reservation at each router

Disadvantages of IntServ

• Routers having to maintain per-flow state for every flow is a large overhead

• Does not scale in the core network

• Router state has to be refreshed at regular intervals increasing traffic overhead

• However, RSVP has a place at the edge of the network

DiffServ

• Removes some of the shortcomings of the IntServ architecture

• Divides network into regions called DS domains. Each domain is controlled by a single entity

• To provide service guarantees the entire path between source and destination must be in some DS domain

• Nodes in a DS domain can be of following types:– Boundary node– Interior node

DiffServ Boundary Node

• Performs admission control to limit the number of flows in a domain

• Performs packet classification by marking each packet with a service class called “Behavior Aggregate”

• Each Behavior Aggregate is assigned a 6-bit code word called a DS code point (DSCP), carried in the 8-bit DS field

• IP ToS field is updated with the code-point
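On an end host the same marking can be requested through the sockets API. The DSCP occupies the upper six bits of the former ToS byte, so the value is shifted left by two before it is written. This is a sketch: the helper names are illustrative, `IP_TOS` support varies by operating system, and routers at a domain boundary may re-mark the packet regardless of what the host sets.

```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point (RFC 3246)


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP in the upper six bits of the 8-bit ToS/DS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Create a UDP socket whose outgoing packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(hex(dscp_to_tos(EF_DSCP)))
```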

DiffServ Interior Node

• Lies completely within a DS domain. Connects with other interior nodes or boundary nodes within the domain

• Only performs packet forwarding

• Packets are forwarded according to some pre-defined rule associated with the packet class (as indicated by the code point)

• These pre-defined rules are called Per-Hop Behaviors (PHBs)

DiffServ PHB

• Two commonly used PHBs are
  – Assured Forwarding (AF): Divides incoming traffic into four classes, where each class is guaranteed a minimum bandwidth and buffer space
    • Within each class packets are further assigned one of three drop priorities
  – Expedited Forwarding (EF): Departure rate of a traffic class must equal or exceed the configured rate
    • Queuing delay is guaranteed to be bounded
    • Used to provide Premium Service
    • Requires strict admission control and traffic policing

MPLS

• A router determines the next hop of a packet by doing a longest prefix match of the IP destination address against entries in a routing table

• This introduces some latency as routing tables can be very large

• The same process is repeated for every packet even if the packets are in the same flow

• IP switching gets around this:
  – A short label is attached to every packet and is updated at every hop
  – This label is used at the next hop as an index into the routing table to get the next hop (in constant time) and the next label
  – The label is replaced and the packet is forwarded to the next hop
  – Lends itself to being done in (low-cost) hardware, resulting in very high speeds
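The label swap at each hop amounts to a single table lookup. The sketch below uses hypothetical router names and label values, with a dictionary standing in for the hardware table; it shows one packet being switched across two hops.

```python
from typing import Dict, Tuple

# Hypothetical per-router tables: incoming label -> (outgoing label, next hop)
LabelTable = Dict[int, Tuple[int, str]]

TABLES: Dict[str, LabelTable] = {
    "lsr-a": {17: (42, "lsr-b")},
    "lsr-b": {42: (99, "egress")},
}


def switch(label: int, table: LabelTable) -> Tuple[int, str]:
    """Swap the label and pick the next hop with one exact-match lookup."""
    return table[label]  # no longest-prefix match is needed


label, router = 17, "lsr-a"
while router in TABLES:
    label, router = switch(label, TABLES[router])
    print(f"forward with label {label} to {router}")
```

An exact-match lookup on a short label is what makes a constant-time hardware implementation straightforward, in contrast to a longest prefix match over a large routing table.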

MPLS

• Like DiffServ, an MPLS network is divided into domains with boundary nodes called Label Edge Routers (LER) and interior nodes called Label Switching Routers (LSR)

• Packets entering an MPLS domain are assigned a label at the ingress LER and are switched inside a domain by a simple label lookup

• Labels determine QoS

• Labels may get stripped off at the egress LER, after which packets are routed conventionally outside the domain

• A sequence of LSRs to be followed by a packet in an MPLS domain is called a Label Switched Path (LSP)

• To guarantee QoS both source and destination have to be attached to the same domain, or the different domains have to have service level agreements among them

MPLS

• A group of packets that are forwarded in the same manner are said to belong to the same Forwarding Equivalence Class (FEC)
  – FECs form the basis of service differentiation

• No limit on number and granularity of FECs

• Labels only have local significance
  – Two LSRs agree upon using a particular label for a given FEC

• Label assignment, including label allocation and label-to-FEC binding, must be done on every hop of the LSP before the traffic flow can use the LSP

• Labels can be stacked in last-in, first-out (LIFO) order, which can be used in tunneling applications

• In a domain only the topmost label is used to make forwarding decisions

• Can be useful in providing mobility
  – Home agent can push a label on incoming packets and forward them to a foreign agent
  – Foreign agent pops off its label and forwards the packet to the destination mobile host

MPLS Label Assignment

• Label assignment can be done in the following ways:
  – Topology-driven: LSPs for every possible FEC are automatically set up between every pair of LSRs (zero call setup delay)
  – Request-driven: LSPs are set up based on explicit requests
    • RSVP can be used
  – Traffic-driven: LSPs are set up only when an LSR identifies traffic patterns requiring a new LSP
    • Traffic patterns not requiring an established LSP use the normal routing method
    • Combines advantages of the above two

Multicasting – IP Multicast

• Can be done in several ways

• Send packets to a multicast IP address (Class D)

• Hosts willing to receive multicast messages for particular multicast groups inform their immediately neighboring routers using IGMP

• Multicast routers exchange group information using a variety of algorithms:
  – Flooding
  – Spanning tree
  – Reverse path broadcasting
  – Reverse path multicasting

• Protocols that use some of these algorithms include
  – Distance Vector Multicast Routing Protocol (DVMRP)
  – Multicast extension to Open Shortest Path First (MOSPF)
  – Protocol Independent Multicast (PIM)

Multicasting – Overlay Network (MBONE)

• Virtual network implemented on top of some portions of the Internet

• Islands of multicast capable networks are connected by virtual links called tunnels
  – Tunneled packets are encapsulated as IP over IP such that they look like normal unicast packets to intervening routers

Application Layer Multicasting

• SIP and H.323 support multicasting through a multi-point control unit that provides mixing and conferencing functionality

Session Management - Media Description

• Enables an application to distribute session information
  – Media Type
  – Encoding Scheme
  – Session Start Time
  – Session Stop Time
  – IP Addresses of involved hosts

Session Description Protocol

• SDP, developed by the IETF, can be used to describe the media type and media encoding used for a session

• More of a description syntax than a protocol – augmented by SIP for media negotiation

• Media descriptions are encoded in text format

• An SDP message contains a series of lines called fields with single letter abbreviations. Each field has a <tag>=<value> format
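Because each field is a single letter followed by "=", a session description is easy to split into tag and value pairs. The sample below is illustrative: the host name, addresses, ports and timestamps are made up, and the parser is a minimal sketch that does no validation beyond the field shape.

```python
from typing import List, Tuple

# Illustrative session description; all values are invented for the example.
SAMPLE_SDP = """
v=0
o=alice 2890844526 2890844526 IN IP4 host.example.com
s=Weekly seminar
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 31
"""


def parse_sdp(text: str) -> List[Tuple[str, str]]:
    """Split an SDP message into (tag, value) pairs, preserving order."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        tag, sep, value = line.partition("=")
        if sep != "=" or len(tag) != 1:
            raise ValueError(f"not a <tag>=<value> field: {line!r}")
        fields.append((tag, value))
    return fields


for tag, value in parse_sdp(SAMPLE_SDP):
    print(tag, "->", value)
```

Order is preserved because it is significant in SDP: fields that follow an `m=` line describe that media stream rather than the session as a whole.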

Session Management - Session Announcement

• Allows participants to announce future sessions

• E.g. for Internet radio stations to distribute information about scheduled shows

Session Announcement Protocol

• Used for advertising multicast conferences and sessions

• SAP announcer periodically multicasts announcement packets to a well-known multicast address and port (9875) with the same scope as the session being announced

• Recipients of announcement are also potential recipients of sessions being advertised

• Multiple announcers may announce a single session for more robustness

• Announcement interval is chosen to ensure total bandwidth used by announcements is below a pre-configured limit

• Each announcer is expected to listen to other announcements in order to determine the total number of sessions being announced on a group

• Involves a large startup delay before the complete set of announcements is heard by a listener

• Contains mechanisms for ensuring integrity, authenticating the origin and encrypting announcements

Session Management - Session Control

• Information in multiple media streams may be inter-related

• Network must guarantee to maintain such relationships – Multimedia Synchronization

• Can be achieved by putting timestamps in every media packet

• Internet multimedia users may want to control playback of continuous media – similar to what a VCR or CD player provides

Session Control - RTP

• RTP runs on top of UDP

• Carries chunks of real-time (audio/video) data

• Provides
  – Sequencing: sequence number in the RTP header helps detect lost packets
  – Payload Identification: payload identifier included in each RTP packet describes the encoding of the media
  – Frame Indication: video and audio are sent in logical units called frames. A frame marker bit indicates the beginning and end of a frame
  – Source Identification: a Synchronization Source (SSRC) identifier identifies the originator of a frame in a multicast session
  – Intramedia Synchronization: to compensate for different delay and jitter for packets within the same stream, RTP provides timestamps, which are needed by play-out buffers

• Additional media information can be inserted using profile headers and extensions
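The fields listed above all live in the 12-byte fixed RTP header. The sketch below unpacks that header with the standard `struct` module; it ignores CSRC entries and header extensions, and the type and function names are illustrative.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (CSRC list and extensions ignored)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,           # top two bits of the first byte
        marker=bool(b1 & 0x80),    # frame marker bit
        payload_type=b1 & 0x7F,    # identifies the media encoding
        sequence=seq,              # used to detect loss and reordering
        timestamp=ts,              # used by the play-out buffer
        ssrc=ssrc,                 # identifies the source of the stream
    )


sample = struct.pack("!BBHII", 0x80, 0x80, 7, 160, 0xDEADBEEF) + b"payload"
print(parse_rtp_header(sample))
```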

Real-Time Control Protocol - RTCP

• RTCP is a control protocol that works in conjunction with RTP

• Provides useful statistics: packets sent, lost, jitter, round-trip time

• Sources can use this to adjust their data rate

• Other information includes email address, phone number and name, allowing users to know the identities of other users in the session
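The jitter statistic carried in RTCP reports is a running estimate rather than a raw variance. The sketch below follows the interarrival jitter formula from the RTP specification (RFC 3550), J = J + (|D| - J) / 16, where D is the change in transit time between consecutive packets. The function name is illustrative, and both timestamp lists are assumed to be in the same units.

```python
from typing import Sequence


def interarrival_jitter(send_ts: Sequence[float],
                        recv_ts: Sequence[float]) -> float:
    """Running jitter estimate: J += (|D| - J) / 16 for each new packet."""
    jitter = 0.0
    for i in range(1, len(send_ts)):
        # D: how much the spacing at the receiver differs from the sender's.
        d = (recv_ts[i] - recv_ts[i - 1]) - (send_ts[i] - send_ts[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


# Constant network delay: spacing is preserved, so the estimate stays at zero.
print(interarrival_jitter([0, 160, 320], [1000, 1160, 1320]))
# One packet arrives 16 units late relative to its predecessor.
print(interarrival_jitter([0, 160], [0, 176]))
```

The 1/16 gain smooths out isolated spikes, so a source adapting its rate reacts to sustained changes in network conditions rather than to a single late packet.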

Real-Time Streaming Protocol

• RTSP is an out-of-band control protocol that allows the media player to control the transmission of the media stream, including functions such as
  – Pause
  – Resume
  – Repositioning
  – Playback

H.323

• Umbrella recommendation that specifies components, protocols and procedures for multimedia conferencing over a packet network

• Defines four components
  – Terminals: These are the endpoints
  – Gateway: For interoperation between clients using different H.32x flavors
  – Gatekeeper: Control functions including admission control, bandwidth management, call routing
  – Multi-point Control Unit: For point to multipoint conferencing capability

• Uses H.245 to determine common capabilities of terminals

• Two kinds of call control models
  – Gatekeeper routed (preferred mode in carrier environments)
  – Direct (not scalable)

H.323 Protocol Stack

H.323 Stack

• RTP & RTCP are used for media

• H.225 RAS (Registration, Admission and Status) is used by endpoints and gateways for
  – Gatekeeper discovery and registration
  – Requesting call admission and bandwidth allocation
  – Clearing a call

• Q.931 signaling protocol is used for call setup and teardown between two endpoints – lightweight version of the ISDN protocol

• H.245 media control protocol is used for negotiating media processing capabilities such as A/V codec

• T.120 is used for data-conferencing capabilities such as whiteboard sharing, instant messaging

Session Initiation Protocol - SIP

• Application-layer signaling protocol for initiating, modifying and terminating interactive sessions. Defined in RFC 3261

• Does not define what a “session” is. Session is carried opaquely in SIP messages.

• Text-encoded protocol based on elements from HTTP and SMTP

• “SIP supports five facets of establishing and terminating multimedia communications:
  – User location: determination of the end system to be used for communication;
  – User availability: determination of the willingness of the called party to engage in communications;
  – User capabilities: determination of the media and media parameters to be used;
  – Session setup: "ringing", establishment of session parameters at both called and calling party;
  – Session management: including transfer and termination of sessions, modifying session parameters, and invoking services” (RFC 3261)

SIP – Key Capabilities

• A stateful SIP server can split or "fork" an incoming call so that several extensions can be rung at once

• The first extension to answer can take the call

• SIP can return different media types within a single session

• Participants can be invited to existing sessions

• Media can be added to (or removed from) an existing session

• Supports mobility

Elements of a SIP Network

• Three main elements in a SIP network
  – User Agent: end device in a SIP network. A User Agent Client (UAC) initiates requests. A User Agent Server (UAS) responds to requests. Roles may change in the course of a session.
  – Server: There are three main types
    • Proxy: Receives requests from UAs or other proxies and forwards the request to another location
    • Redirect: Receives a request from a UA or proxy and returns a redirect response (3XX) indicating where the request should be retried
    • Registrar: Receives SIP registration requests and updates the UA’s information in a location server (e.g. LDAP server) or other database
  – Location Server: General term for a database. A non-SIP protocol is used to interact with it.

SIP Methods

SIP Methods are commands supported by SIP:

• INVITE: Invites a user to a call

• ACK: Used to facilitate reliable message exchange for INVITEs

• BYE: Terminates a connection between users or declines a call

• CANCEL: Terminates a request, or search, for a user

• OPTIONS: Solicits information about a server's capabilities

• REGISTER: Registers a user's current location

• INFO: Used for mid-session signalling

SIP Responses

The following are SIP responses:

• 1xx Informational (e.g. 100 Trying, 180 Ringing)

• 2xx Successful (e.g. 200 OK, 202 Accepted)

• 3xx Redirection (e.g. 302 Moved Temporarily)

• 4xx Request Failure (e.g. 404 Not Found, 482 Loop Detected)

• 5xx Server Failure (e.g. 501 Not Implemented)

• 6xx Global Failure (e.g. 603 Decline)
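Since the class of a response is given by the first digit of its status code, a client can classify any code without knowing every individual value. The sketch below uses illustrative function names. It also distinguishes provisional responses (1xx) from final ones (2xx to 6xx), a distinction defined in RFC 3261.

```python
SIP_RESPONSE_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Failure",
    6: "Global Failure",
}


def classify_sip_status(code: int) -> str:
    """Map a SIP status code to its response class by the leading digit."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP status code: {code}")
    return SIP_RESPONSE_CLASSES[code // 100]


def is_final(code: int) -> bool:
    """1xx responses are provisional; everything from 200 up ends the transaction."""
    return code >= 200


for code in (180, 200, 302, 404, 603):
    print(code, classify_sip_status(code), "final" if is_final(code) else "provisional")
```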

SIP Signaling

SIP & H.323 Comparison

• SIP is largely equivalent to the Q.931 and H.225 components of H.323 (sipcenter.com)

PHILOSOPHY

SIP: "New World" - a relative of Internet protocols - simple, open and horizontal
H.323: "Old World" - complex, deterministic and vertical

SIP: IETF
H.323: ITU

SIP: Carrier-class solution addressing the wide area
H.323: Born of the LAN - focusing on enterprise conferencing priorities

CHARACTERISTICS

SIP: A simple toolkit upon which smart clients and applications can be built. It re-uses Net elements (URLs, MIME and DNS)
H.323: Specifies everything including the codec for the media and how you carry the packets in RTP

SIP: Leaves issues of reliability to underlying network
H.323: Assumes fallibility of network - an unnecessary overhead

SIP: Messages are formatted as text. (Text processing lies behind the web and email)
H.323: Binary format doesn't sit well with the Internet - this adds complexity

SIP: Allows for standards-based extensions to perform specific functions
H.323: Extensions are added by using vendor-specific non-standard elements

SIP: Hierarchical URL style addressing scheme that scales
H.323: Addressing scheme doesn't scale well

SIP: Minimal delay - simplified signalling scheme makes it faster
H.323: Possibilities of delay (up to 7 or 8 seconds!)

SIP: Slim and pragmatic
H.323: The suite is too cumbersome to deploy easily

SIP & H.323 Comparison

SERVICES

SIP: Standard IP Centrex services
H.323: Standard IP Centrex services

SIP: Ability to 'fork' calls
H.323: Not possible in the existing standard

SIP: User profiling
H.323: -

SIP: 'Unified messaging'
H.323: -

SIP: Presence management
H.323: -

SIP: Unique ability to mix media (e.g. IVR)
H.323: Cannot mix media within a session

SIP: URLs can be embedded in web browsers and email tools
H.323: Has no URL format

SIP: Works smoothly with media gateway controllers controlling multiple gateways - crucial in a multi-operator environment
H.323: "Shoehorn" interworking with SS7 is problematic - H.323 has trouble connecting calls to and from PSTN endpoints

SIP: Seamless interaction with other media - services are only limited by the developer's imagination
H.323: Services are nailed-down and constricted - voice only ceiling

STATUS

SIP: Industry endorsed
H.323: Popularity due to the fact that it was the first set of agreed-upon standards

SIP: Many vendors developing products
H.323: The majority of existing IP telephony products rely on the H.323 suite

Security

• Integrity

• Authenticity

• Encryption

• Intellectual rights protection
  – Digital watermarking techniques embed extra information into multimedia data
  – Imperceptible to normal user and irremovable

Security

• At the IP layer security can be provided by IPSec

• Secure RTP (RFC 3711)
  – Provides 128-bit AES encryption
  – Confidentiality, authentication and replay protection
  – HMAC-SHA1 (a keyed hash based on SHA-1, the Secure Hash Algorithm) for authentication

• Does not deal with key exchange
