Dec 20, 2015
Agenda
• Overview of media characteristics from a networking standpoint
• Network requirements for multimedia communications
• Network architectures, techniques and protocols
Multimedia Classification
• Real Time: Require bounds on end-to-end packet delay & jitter. Subdivided into:
– Discrete Media: MSN/Yahoo Messenger, stock quotes
– Continuous Media: Continuous message stream with inter-message dependency. Further divided into:
• Delay Tolerant, e.g. Internet webcast
• Delay Intolerant, e.g. audio, video streams in conferencing systems
• Non-Real Time: No strict delay constraints (e.g. text, image files)
– May be highly sensitive to errors
Key Media Characteristics from Networking Standpoint
• Bandwidth usage
• Sensitivity to error
• Real-time nature
Text
• Bandwidth requirements depend on size
– Can be easily reduced by compression techniques
• Some text applications require complete freedom from loss & errors – use TCP. E.g. FTP
• Others are error and loss tolerant – use UDP e.g. instant messaging
Audio
• Bandwidth requirements depend on dynamic range and/or spectrum
– Narrowband speech (300-3300 Hz)
• 6.4 Kbps (G.723.1) to 64 Kbps (G.711)
– Wideband audio (CD quality music), 10 Hz-22 KHz
• 112-128 Kbps (MP3)
• Can tolerate 1-2% packet loss
• Real-time nature depends on extent of interactivity
– VoIP requires strong bounds on delay/jitter (Real-Time Intolerant)
• < 250 ms end to end delay
– Internet Webcast is more delay/jitter tolerant (Real-Time Tolerant)
Graphics and Animation
• Examples: Digital images, flash presentations
• Large in size but lend themselves well to compression
• Progressive compression techniques enable an image to be initially displayed in low quality and gradually improved as more information is received
• Error-tolerant and can sustain packet loss provided application knows how to deal with packet loss
• No real-time constraints
Video
• High bandwidth requirements
• Efficient compression schemes
– MPEG-I (1.2 Mbps) VCR quality compression
– MPEG-II (3-100 Mbps) broadcast quality video, HDTV
– MPEG-IV (64 Kbps) for low bandwidth video compression; supports audio, video, graphics, animation, text
– H.261 (px64 Kbps) video over ISDN
– H.263 (18-64 Kbps) video over POTS
• Error requirements and real-time characteristics similar to audio
Network Requirements
• Traffic Requirements: Have implications for basic Internet infrastructure
– Limits on Delay & Jitter
– Bandwidth
– Reliability
• Functional Requirements: Require enhancements to TCP/IP stack in the form of additional network protocols
– Multicasting
• Single source of communication with multiple simultaneous receivers
• Can be one way (internet radio) or two way (multi-party audio/video conferencing)
– Security
– Mobility
– Session Management
Delay Related Metrics
• Maximum end to end delay
• Delay variance
– Jitter: non-monotonic variation in delay in a given stream
• For a video stream jitter would result in a shaky picture
• Jitter can be removed by buffering at the receiver side
– Skew: constantly increasing difference between the expected arrival time and the actual arrival time
• For a video stream skew could be a slower or faster moving picture
Delay: Packet Processing Delay
• Constant amount of delay at both source and destination
– A/D, D/A conversion time and time taken to packetize it through different layers of protocols
• Typically a characteristic of the operating system and the multimedia application
• Delay can become significant under high load conditions
• Reductions in delay imply software enhancements including use of multimedia operating systems that provide enhanced resource, file and memory management with real-time scheduling
Delay: Packet Transmission Delay
• Time taken by the physical layer at the source to transmit packets. Depends on
– Number of active sessions. Typically physical layer processes packets in FIFO order. Delay can become significant if OS does not support real-time scheduling for multimedia traffic
– MAC access delay: Widespread Ethernet networks cannot provide any firm guarantees on medium access delay due to inherent indeterminism in CSMA/CD (carrier sense multiple access/collision detection). Isochronous Ethernet (802.9) integrated voice data LAN and demand priority Ethernet (802.12) provide QoS but market potential remains low
Delay: Propagation Delay
• Flight time of packets – limited by speed of light. Can’t do anything about it
• For a distance of 20,000 km this would be about 0.067 sec
• Significant part of a desirable ~200 msec delay budget
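The figure above can be checked with a short calculation. This is a minimal sketch that assumes propagation at roughly the vacuum speed of light (3 x 10^8 m/s); signals in fiber or copper travel slower, so real delays are somewhat larger. The function name is illustrative only.

```python
SPEED_OF_LIGHT_M_PER_S = 3.0e8  # assumption: vacuum speed of light, rounded


def propagation_delay_s(distance_km: float,
                        speed_m_per_s: float = SPEED_OF_LIGHT_M_PER_S) -> float:
    """One-way flight time in seconds over the given distance."""
    return distance_km * 1000.0 / speed_m_per_s


delay = propagation_delay_s(20_000)
budget_fraction = delay / 0.200  # share of a 200 ms delay budget
print(f"{delay * 1000:.1f} ms, {budget_fraction:.0%} of a 200 ms budget")
```

Under these assumptions, propagation alone consumes about a third of the delay budget before any processing or queuing delay is counted.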
Delay: Routing and Queuing Delay
• Best-effort Internet treats every packet equally
• Packets arriving at a queue have to wait a random amount of time depending on current router load
• Delay is variable and is the major contributor to jitter
• Techniques to reduce this include
– IntServ
– MPLS
– DiffServ
Bandwidth Requirements
• Multimedia traffic streams have high bandwidth
• Uncontrolled transmissions at high rates can cause heavy congestion in the network
• Elastic applications that use TCP take advantage of built in congestion control
• Most multimedia applications use UDP for transmitting media streams
• It is left to the discretion of the application to dynamically adapt to congestion
• To remove these shortcomings an enhanced internet service model would require
– Admission control: application must first get permission from some authority to send traffic at a given rate with given traffic characteristics
– Bandwidth reservation: if admission is given, appropriate resources (buffers, bandwidth) will get reserved along the path
– Traffic policing mechanisms: to ensure that applications do not send data at a rate higher than what was negotiated
Bandwidth Related Metrics
• Cost of protocol processing operations relates more directly to packet processing rate than to the bandwidth in terms of bit rate
• In addition to a bit-rate oriented specification of bandwidth, packet processing rate is an important metric
• Cost of packet processing is largely dependent on number of packets and less so on packet size
• Packetization related metrics include maximum and average packet size and packet rate
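To illustrate why packet rate matters separately from bit rate, the sketch below computes payload size and packet rate for a 64 Kbps stream (the G.711 rate quoted earlier) at two packetization intervals. The 10 ms and 20 ms intervals and the function names are assumptions chosen for illustration.

```python
def payload_bytes(bit_rate_bps: int, packet_ms: int) -> int:
    """Media bytes carried in one packet covering packet_ms of signal."""
    return bit_rate_bps * packet_ms // 8000


def packet_rate_pps(bit_rate_bps: int, packet_ms: int) -> float:
    """Packets per second needed to sustain the bit rate."""
    return bit_rate_bps / (8 * payload_bytes(bit_rate_bps, packet_ms))


for interval_ms in (20, 10):
    size = payload_bytes(64_000, interval_ms)
    rate = packet_rate_pps(64_000, interval_ms)
    print(f"{interval_ms} ms packets: {size} bytes each, {rate:.0f} packets/s")
```

The bit rate is the same in both cases, but halving the packet interval doubles the number of packets each router has to process.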
Reliability
• Pertains to loss and corruption of data
• Can be measured in terms of loss probability
• Requires methods for dealing with erroneous/lost data
Error Correction
• Sender Based Repair
• Active Repair - Automatic retransmission request (ARQ)
– Suitable for error intolerant applications
• Passive Repair
– Interleaving
– FEC: Forward Error Correction.
» Media Independent – independent of the content/nature of the stream
» Media Dependent - use knowledge of the stream in the repair process
• Error Concealment (Receiver Based Repair)
FEC
• Introduce repair data in traffic from which lost packets may be recovered
• Media Independent: use block or algebraic codes to produce additional packets which aid in loss recovery
– Each code takes a codeword of k data packets and generates n-k additional check packets
– i-th bit in check packet is generated from the i-th bits of each associated data packet
• Parity Coding: XOR is applied across groups of packets to generate parity packets
• Reed-Solomon Coding: Based on properties of polynomials over particular number bases
– Take a set of codewords and use these as coefficients of a polynomial f(x)
– The transmitted codeword is determined by evaluating the polynomial for all nonzero values of x over the number base
– Disadvantage: Cause additional delay, increase bandwidth usage and exacerbate congestion
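Parity coding can be sketched in a few lines. The example below is a minimal illustration, assuming equal-length packets and a single parity packet per group, so it can recover at most one lost packet in the group. The function names are hypothetical.

```python
from functools import reduce
from typing import Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: Sequence[bytes]) -> bytes:
    """Parity packet: XOR across all data packets in the group."""
    return reduce(xor_bytes, packets)


def recover_missing(received: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing packet from the survivors and the parity."""
    return reduce(xor_bytes, received, parity)


group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)

# Suppose the second packet is lost in transit.
recovered = recover_missing([group[0], group[2]], parity)
print(recovered)  # b'BBBB'
```

The cost is visible here: one extra packet per group of three adds a third more traffic, and the receiver must wait for the rest of the group before it can repair a loss.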
Media Dependent FEC
• Exploit media characteristics
• For audio, could send each unit of audio in multiple packets
– Primary encoding: first transmission
– Secondary encoding: additional transmissions
– Secondary encoding could be of lower bandwidth and quality than the primary coding
– May not be necessary to transmit FEC for every packet due to nature of media
• Advantage: low latency – only single packet delay added
– Suitable for interactive applications
* A Survey of Packet Loss Recovery Techniques for Streaming Audio, Colin Perkins et al IEEE Network Sep/Oct 1998
Interleaving
• Can be used when media unit size is smaller than packet size (as may be the case with audio) and end-to-end delay is not important
• Units are resequenced before transmission so that originally adjacent units are separated by a guaranteed distance and returned to original order at the receiver
• Disperses the effect of packet loss – loss of a single packet causes multiple smaller gaps among original media units
• In case of audio a phoneme originally encapsulated in one packet would get split across multiple packets
• Loss of small parts of several phonemes is easier to deal with than loss of entire phonemes
• Disadvantage: increased latency – not well suited for interactive applications
• Advantage: does not increase bandwidth usage – does well for non-interactive use
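The resequencing described above can be sketched as a simple block interleaver. This is one possible scheme, not necessarily the one used by any particular codec; it assumes the number of units is an exact multiple of the units per packet, and it marks a lost packet as None.

```python
def interleave(units, units_per_packet):
    """Spread adjacent units across packets: packet j gets units j, j+n, j+2n, ..."""
    n_packets = len(units) // units_per_packet
    return [units[j::n_packets] for j in range(n_packets)]


def deinterleave(packets, units_per_packet):
    """Restore original order; units from lost packets (None) stay as None."""
    n_packets = len(packets)
    out = [None] * (n_packets * units_per_packet)
    for j, packet in enumerate(packets):
        if packet is not None:
            out[j::n_packets] = packet
    return out


units = list(range(1, 13))          # 12 media units
packets = interleave(units, 4)      # 3 packets of 4 units each
packets[1] = None                   # the second packet is lost
print(deinterleave(packets, 4))
```

Losing one packet leaves four isolated single-unit gaps instead of one gap four units long, which is what makes concealment easier. The receiver has to collect all three packets before playing any of them, which is the latency cost noted above.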
Error Concealment
• Producing a replacement for a lost packet which is similar to the original
• Works for relatively small loss rates (< 15%) and for small packets (4-40 ms)
• Types in increasing order of computational cost and improved performance:
– Insertion based: insert a fill-in packet that contains silence, noise or a repetition of an adjacent packet
– Interpolation-based: some form of pattern matching and interpolation to derive the missing packet (waveform, pitch or timescale based)
– Regeneration-based: derive decoder state from packets surrounding the loss and generate a lost packet from that (model based recovery)
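The cheapest of these, insertion-based concealment, can be sketched as follows. This is a minimal illustration that repeats the previous good packet and falls back to a silence packet at the start of a stream; it omits the fading a real implementation would likely apply. Lost packets are represented as None, and the function name is hypothetical.

```python
def conceal_by_repetition(packets, silence):
    """Replace each lost packet (None) with the last good packet, or silence."""
    out = []
    last_good = None
    for packet in packets:
        if packet is None:
            out.append(last_good if last_good is not None else silence)
        else:
            out.append(packet)
            last_good = packet
    return out


stream = [b"p1", None, b"p3", None, None]
print(conceal_by_repetition(stream, silence=b"\x00\x00"))
```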
Error Recovery for Different Applications
• Non-interactive Applications – Multicasts (e.g. radio)
• Interleaving is suitable (bandwidth efficient, though high latency)
• Use error concealment – repetition with fading
• Media-independent FEC better than a retransmission based scheme
• Interactive Applications (e.g. IP telephony)
– Media Dependent FEC
– Error concealment using packet repetition
Admission Control
• Pro-active form of congestion control
• Takes requested traffic description as input, expressed in terms of leaky bucket parameters (b, r), including
– Maximum burst size (b = bucket size)
– Peak rate
– Average rate
• Decides to accept or reject a flow, including consideration of impact on existing flows
Admission Control
• Leaky bucket parameters provide a very loose upper bound on the traffic rate
• When traffic becomes bursty network utilization can become very low if admission control is solely based on parameters provided at call setup time
• Admission control unit must also use measurements of current network load and packet delay in its admission decisions
Traffic Shaping/Policing
• Token bucket algorithm is used for traffic shaping. Limits the average rate and allows a degree of burstiness.
– Token bucket of depth ‘b’ in which tokens are collected at rate ‘r’
– When bucket becomes full extra tokens are dropped
– Source can send data only if it can grab and destroy sufficient tokens from the bucket
• Leaky bucket algorithm is used for traffic policing, in which excessive traffic is dropped
– Bucket depth ‘b’ with hole at the bottom
– If bucket is full extra packets are dropped
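The token bucket rules above can be sketched as a small class. This is an illustrative version, assuming one token per byte, a bucket that starts full, and timestamps passed in explicitly so the behavior is deterministic. A real shaper would queue non-conforming packets rather than just report them.

```python
class TokenBucket:
    """Token bucket with fill rate r (tokens/s) and depth b (tokens)."""

    def __init__(self, rate: float, depth: float):
        self.rate = rate
        self.depth = depth
        self.tokens = depth   # assumption: bucket starts full
        self.last = 0.0       # time of the previous update, in seconds

    def allow(self, now: float, size: float) -> bool:
        """Return True and consume tokens if a packet of `size` may be sent."""
        elapsed = now - self.last
        # Tokens accumulate at rate r; anything beyond depth b is discarded.
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


bucket = TokenBucket(rate=100, depth=200)
print(bucket.allow(0.0, 150))  # True: burst fits in the full bucket
print(bucket.allow(0.0, 100))  # False: only 50 tokens remain
print(bucket.allow(1.0, 100))  # True: 100 tokens were added over one second
```

The depth bounds how large a burst can be, while the fill rate bounds the long-term average.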
Packet Classification
• In order to prevent all packets from being treated equally some mechanism to distinguish between real-time and non-real time packets is needed
• Done by packet marking e.g. use Type of Service (ToS) field in IP header
• MPLS uses short labels
Packet Scheduling
• FIFO scheduling traditionally used in routers needs to be replaced with more sophisticated queuing
• E.g. priority queuing. Lower priority queues served after higher priority queues
• Disadvantage: possible starvation of low priority flows
• Weighted Fair Queuing has different queues for different classes, and every queue is assigned a weight. Packets in a queue get a fraction of the total bandwidth proportional to that queue's weight
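The bandwidth split under Weighted Fair Queuing can be illustrated with a short calculation. This sketch only computes the share each class receives when every queue has packets waiting; it does not model the scheduler itself. The class names, weights and link speed are made up for the example.

```python
def wfq_shares(link_bps: float, weights: dict) -> dict:
    """Bandwidth per queue when all queues are backlogged."""
    total = sum(weights.values())
    return {name: link_bps * w / total for name, w in weights.items()}


shares = wfq_shares(10_000_000, {"voice": 5, "video": 3, "data": 2})
for name, bps in shares.items():
    print(f"{name}: {bps / 1e6:.1f} Mbps")
```

Unlike strict priority queuing, the lowest-weight class still receives a guaranteed share, so it cannot be starved.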
Packet Dropping
• Routers can randomly drop packets under congestion
• This can be a problem since certain packets may carry more information than others
QoS Based Routing
• OSPF, RIP, BGP are best-effort routing protocols using single-objective optimization algorithms: hop count or line cost
• All traffic routed to shortest path can lead to congestion on some links and leave other links under-utilized
• If link congestion is used to derive line cost such that congested routes are costlier, such algorithms can cause oscillations in the network, thus increasing jitter
• In QoS based routing paths are determined based on some knowledge of resource availability in the network as well as the QoS requirement of the flows
Integrated Services
• IntServ Internet Services model developed by IETF
• Requires applications to know their QoS requirements beforehand and signal intermediate network to reserve resources (bandwidth, buffers)
• Requires use of packet classifiers as well as packet schedulers
• Almost exclusively concerned with controlling the queuing component of end to end delay
• Has three service classes
– Guaranteed Service (RTI)
– Controlled Load (RTT)
– Best-effort
IntServ
• Flow descriptor specifies QoS requirements
– Filter spec – specifies information for identifying a packet with a given flow
– Flow spec – specifies traffic spec in terms of token bucket parameters (Tspec) and QoS parameters (Rspec) in terms of bandwidth, delay, jitter and packet loss
• Resource Reservation Protocol (RSVP) used to signal network nodes about required resources
RSVP
• The sender sends a PATH message to the receiver specifying traffic characteristics
• Every intermediate router along the path forwards the PATH message to the next hop determined by the routing protocol
• The receiver responds with RESV message
Disadvantages of IntServ
• Routers having to maintain per-flow state for every flow is a large overhead
• Does not scale in the core network
• Router state has to be refreshed at regular intervals increasing traffic overhead
• However, RSVP has a place at the edge of the network
DiffServ
• Removes some of the shortcomings of the IntServ architecture
• Divides network into regions called DS domains. Each domain is controlled by a single entity
• To provide service guarantees the entire path between source and destination must be in some DS domain
• Nodes in a DS domain can be of following types:
– Boundary node
– Interior node
DiffServ Boundary Node
• Performs admission control to limit the number of flows in a domain
• Performs packet classification by marking each packet with a service class called “Behavior Aggregate”
• Each Behavior Aggregate is assigned a 6-bit code word called a DS code point (DSCP), carried in the 8-bit DS field
• IP ToS field is updated with the code-point
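The marking step can be sketched at the bit level. In the standardized layout (RFC 2474), the code point occupies the upper six bits of the byte that used to be the IP ToS field; the lower two bits are used for ECN (RFC 3168). The example uses 46, the recommended code point for Expedited Forwarding. The function names are illustrative.

```python
EF_DSCP = 46  # recommended code point for Expedited Forwarding (binary 101110)


def dscp_to_ds_byte(dscp: int, ecn: int = 0) -> int:
    """Place a 6-bit code point in the upper bits of the 8-bit DS field."""
    if not 0 <= dscp < 64:
        raise ValueError("DSCP must fit in 6 bits")
    return (dscp << 2) | (ecn & 0b11)


def ds_byte_to_dscp(ds_byte: int) -> int:
    """Extract the code point an interior node would use to select a PHB."""
    return ds_byte >> 2


marked = dscp_to_ds_byte(EF_DSCP)
print(hex(marked), ds_byte_to_dscp(marked))  # 0xb8 46
```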
DiffServ Interior Node
• Lies completely within a DS domain. Connects with other interior nodes or boundary nodes within the domain
• Only performs packet forwarding
• Packets are forwarded according to some pre-defined rule associated with the packet class (as indicated by the code point)
• These pre-defined rules are called Per-Hop Behaviors (PHBs)
DiffServ PHB
• Two commonly used PHBs are
– Assured Forwarding (AF): Divides incoming traffic into four classes where each class guarantees a minimum bandwidth and buffer space
• Within each class packets are further assigned one of three drop priorities
– Expedited Forwarding (EF): Departure rate of a traffic class must equal or exceed the configured rate
• Queuing delay is guaranteed to be bounded
• Used to provide Premium Service
• Requires strict admission control and traffic policing
MPLS
• A router determines the next hop of a packet by doing a longest prefix match of an IP destination address against entries in a routing table
• This introduces some latency as routing tables can be very large
• Same process repeated for every packet even if these are in the same flow
• IP switching gets around this:
– Short label is attached to every packet and is updated at every hop
– This label is used at the next hop as an index into the routing table to get the next hop (happens in constant time) and next label
– Label is replaced and packet is forwarded to the next hop
– Lends itself to being done in (low-cost) hardware resulting in very high speeds
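The label lookup and swap can be sketched with a dictionary standing in for the label table of one router. The labels, next-hop names and packet layout below are hypothetical and chosen only to show the constant-time lookup.

```python
# Hypothetical label table for one router:
# incoming label -> (outgoing label, next hop)
LABEL_TABLE = {
    17: (42, "lsr-b"),
    42: (99, "lsr-c"),
}


def switch(packet: dict) -> tuple:
    """Look up the incoming label, swap it, and return (packet, next hop)."""
    out_label, next_hop = LABEL_TABLE[packet["label"]]  # exact-match lookup
    swapped = dict(packet, label=out_label)
    return swapped, next_hop


packet = {"label": 17, "payload": b"voice frame"}
swapped, next_hop = switch(packet)
print(swapped["label"], next_hop)  # 42 lsr-b
```

The lookup is an exact match on a short key, in contrast to the longest prefix match over a large routing table that conventional IP forwarding performs for every packet.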
MPLS
• Like DiffServ, an MPLS network is divided into domains with boundary nodes called Label Edge Routers (LER) and interior nodes called Label Switching Routers (LSR)
• Packets entering an MPLS domain are assigned a label at the ingress LER and are switched inside a domain by a simple label lookup
• Labels determine QoS
• Labels may get stripped off at egress LER and then get routed conventionally outside the domain
• A sequence of LSR to be followed by a packet in an MPLS domain is called a Label Switched Path (LSP)
• To guarantee QoS both source and destination have to be attached to the same domain or the different domains have to have service level agreements among them
MPLS
• A group of packets that are forwarded in the same manner are said to belong to the same Forwarding Equivalence Class (FEC)
– FECs form the basis of service differentiation
• No limit on number and granularity of FECs
• Labels only have local significance
– Two LSRs agree upon using a particular label for a given FEC
• Necessary to do label assignments including label allocation and label to FEC binding on every hop of the LSP before the traffic flow can use the LSP
• Labels can be stacked in FILO order – can be used in tunneling applications
• In a domain only the topmost label is used to make forwarding decisions
• Can be useful in providing mobility
– Home agent can push a label on incoming packets and forward them to a foreign agent
– Foreign agent pops off its label and forwards the packet to the destination mobile host
MPLS Label Assignment
• Label assignment can be done in the following ways:
– Topology-driven: LSPs for every possible FEC are automatically set up between every pair of LSR (zero call setup delay)
– Request driven: LSPs set up based on explicit requests.
• RSVP can be used
– Traffic driven: LSPs set up only when LSR identifies traffic patterns requiring a new LSP
• Traffic patterns not requiring an established LSP use the normal routing method
• Combines advantages of above two
Multicasting – IP Multicast
• Can be done in several ways
• Send packets to multicast IP address (Class D)
• Hosts willing to receive multicast messages for particular multicast groups inform immediate-neighboring routers using IGMP
• Multicast routers exchange group information using a variety of algorithms:
– Flooding
– Spanning tree
– Reverse path broadcasting
– Reverse path multicasting
• Protocols that use some of these algorithms include
– Distance Vector Multicast Routing Protocol (DVMRP)
– Multicast extension to Open Shortest Path First (MOSPF)
– Protocol Independent Multicast (PIM)
Multicasting – Overlay Network (MBONE)
• Virtual network implemented on top of some portions of the Internet
• Islands of multicast capable networks are connected by virtual links called tunnels
– Tunneled packets are encapsulated as IP over IP such that they look like normal unicast packets to intervening routers
Application Layer Multicasting
• SIP and H.323 support multicasting through a multi-point control unit that provides mixing and conferencing functionality
Session Management - Media Description
• Enables application to distribute session information
– Media Type
– Encoding Scheme
– Session Start Time
– Session Stop Time
– IP Addresses of involved hosts
Session Description Protocol
• SDP developed by IETF can be used to describe media type, media encoding used for session
• More of a description syntax than a protocol – augmented by SIP for media negotiation
• Media descriptions encoded in text format
• SDP message contains a series of lines called fields with single letter abbreviations. Each field has a <tag>=<value> format
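The <tag>=<value> structure makes SDP easy to read programmatically. The sketch below splits a description into (tag, value) pairs; it is not a validating parser. The sample session is illustrative, with made-up user, host and address values, and a list is used rather than a dictionary because tags such as m= may appear more than once.

```python
SAMPLE_SDP = """\
v=0
o=alice 2890844526 2890844526 IN IP4 host.example.com
s=Example session
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0
"""


def parse_sdp(text: str) -> list:
    """Split an SDP description into (tag, value) pairs, preserving order."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        tag, _, value = line.partition("=")
        fields.append((tag, value))
    return fields


for tag, value in parse_sdp(SAMPLE_SDP):
    print(tag, "->", value)
```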
Session Management - Session Announcement
• Allows participants to announce future sessions
• E.g. for Internet radio stations to distribute information about scheduled shows
Session Announcement Protocol
• Used for advertising multicast conferences and sessions
• SAP announcer periodically multicasts announcement packets to a well-known multicast address and port (9875) with the same scope as the session being announced
• Recipients of announcement are also potential recipients of sessions being advertised
• Multiple announcers may announce a single session for more robustness
• Announcement interval chosen to ensure total bandwidth used by announcements is below a pre-configured limit
• Each announcer is expected to listen to other announcements in order to determine the total number of sessions being announced on a group
• Involves large startup delay before complete set of announcements is heard by a listener
• Contains mechanisms for ensuring integrity, authenticating the origin and encryption of announcements
Session Management - Session Control
• Information in multiple media streams may be inter-related
• Network must guarantee to maintain such relationships – Multimedia Synchronization
• Can be achieved by putting timestamps in every media packet
• Internet multimedia users may want to control playback of continuous media – similar to what a VCR or CD player provides
Session Control - RTP
• RTP runs on top of UDP
• Carries chunks of real-time (audio/video) data
• Provides
– Sequencing: sequence number in RTP header helps detect lost packets
– Payload Identification: payload identifier included in each RTP packet describes encoding of the media
– Frame Indication: video and audio sent in logical units called frames. A frame marker bit indicates the beginning and end of a frame
– Source Identification: a Synchronization Source (SSRC) identifier identifies the originator of a frame in a multicast session
– Intramedia Synchronization: To compensate for different delay and jitter for packets within the same stream RTP provides timestamps, which are needed by play-out buffers
• Additional media information can be inserted using profile headers and extensions
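The fields listed above map onto the 12-byte fixed RTP header defined in RFC 3550. The sketch below unpacks that fixed part only; it ignores CSRC entries and header extensions. The sample packet values are made up for illustration.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": (b0 >> 5) & 1,
        "extension": (b0 >> 4) & 1,
        "csrc_count": b0 & 0x0F,
        "marker": b1 >> 7,            # frame marker bit
        "payload_type": b1 & 0x7F,    # identifies the media encoding
        "sequence": seq,              # used to detect loss and reordering
        "timestamp": timestamp,       # used by play-out buffers
        "ssrc": ssrc,                 # identifies the source
    }


# Illustrative packet: version 2, marker set, payload type 0, plus payload bytes.
sample = struct.pack("!BBHII", 0x80, 0x80, 1234, 160000, 0xDEADBEEF) + b"audio"
header = parse_rtp_header(sample)
print(header["version"], header["sequence"], hex(header["ssrc"]))
```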
Real-Time Control Protocol - RTCP
• RTCP is a control protocol that works in conjunction with RTP
• Provides useful statistics: packets sent, lost, jitter, round-trip time
• Sources can use this to adjust their data rate
• Other information includes email address, phone number, name – allow users to know the identities of other users in the session
Real-Time Streaming Protocol
• RTSP is an out-of-band control protocol that allows the media player to control the transmission of the media stream including functions such as
– Pause
– Resume
– Repositioning
– Playback
H.323
• Umbrella recommendation that specifies components, protocols and procedures for multimedia conferencing over a packet network
• Defines four components
– Terminals: These are the endpoints
– Gateway: For interoperation between clients using different H.32x flavors
– Gatekeeper: Control functions including admission control, bandwidth management, call routing
– Multi-point Control Unit: For point to multipoint conferencing capability
• Uses H.245 to determine common capabilities of terminals
• Two kinds of call control models
– Gatekeeper routed (preferred mode in carrier environments)
– Direct (not scalable)
H.323 Protocol Stack
H.323 Stack
• RTP & RTCP used for media
• H.225 RAS (Registration, Admission and Status) used by endpoints and gateways for
– Gatekeeper discovery and registration
– Requesting call admission, bandwidth allocation
– Clearing a call
• Q.931 signaling protocol is used for call setup and teardown between two endpoints – lightweight version of the ISDN protocol
• H.245 media control protocol is used for negotiating media processing capabilities such as A/V codec
• T.120 is used for data-conferencing capabilities such as whiteboard sharing, instant messaging
Session Initiation Protocol - SIP
• Application-layer signaling protocol for initiating, modifying and terminating interactive sessions. Defined in RFC 3261
• Does not define what a “session” is. Session is carried opaquely in SIP messages.
• Text-encoded protocol based on elements from HTTP and SMTP
• “SIP supports five facets of establishing and terminating multimedia communications:
– User location: determination of the end system to be used for communication;
– User availability: determination of the willingness of the called party to engage in communications;
– User capabilities: determination of the media and media parameters to be used;
– Session setup: "ringing", establishment of session parameters at both called and calling party;
– Session management: including transfer and termination of sessions, modifying session parameters, and invoking services” (RFC 3261)
SIP – Key Capabilities
• A stateful SIP server can split or "fork" an incoming call so that several extensions can be rung at once
• The first extension to answer can take the call
• SIP can return different media types within a single session
• Participants can be invited to existing sessions
• Media can be added to (or removed from) an existing session
• Supports mobility
Elements of a SIP Network
• Three main elements in a SIP network
– User Agent: end device in a SIP network. User Agent Client (UAC) initiates requests. User Agent Server (UAS) responds to requests. Roles may change in the course of a session.
– Server: There are three main types
• Proxy: Receives requests from UAs or other proxies and forwards the request to another location
• Redirect: Receives a request from a UA or proxy and returns a redirect response (3XX) indicating where the request should be retried
• Registrar: Receives SIP registration requests and updates UA’s information to a location server (e.g. LDAP server) or other database
– Location Server: General term for a database. Non-SIP protocol is used to interact with it.
SIP Methods
SIP Methods are commands supported by SIP:
• INVITE: Invites a user to a call
• ACK: Used to facilitate reliable message exchange for INVITEs
• BYE: Terminates a connection between users or declines a call
• CANCEL: Terminates a request, or search, for a user
• OPTIONS: Solicits information about a server's capabilities
• REGISTER: Registers a user's current location
• INFO: Used for mid-session signalling
SIP responses
The following are SIP responses:
• 1xx Informational (e.g. 100 Trying, 180 Ringing)
• 2xx Successful (e.g. 200 OK, 202 Accepted)
• 3xx Redirection (e.g. 302 Moved Temporarily)
• 4xx Request Failure (e.g. 404 Not Found, 482 Loop Detected)
• 5xx Server Failure (e.g. 501 Not Implemented)
• 6xx Global Failure (e.g. 603 Decline)
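Because the response class is carried in the first digit of the status code, a client can classify any response without knowing every individual code. A minimal sketch, using the class names from the list above; the function name is hypothetical.

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Failure",
    6: "Global Failure",
}


def response_class(status_code: int) -> str:
    """Map a SIP status code to its response class via the leading digit."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


for code in (180, 200, 302, 404, 501, 603):
    print(code, response_class(code))
```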
SIP Signaling
SIP & H.323 Comparison
• SIP is largely equivalent to the Q.931 and H.225 components of H.323 (sipcenter.com)
SIP | H.323
PHILOSOPHY
"New World" - a relative of Internet protocols - simple, open and horizontal | "Old World" - complex, deterministic and vertical
IETF | ITU
Carrier-class solution addressing the wide area | Born of the LAN - focusing on enterprise conferencing priorities
CHARACTERISTICS
A simple toolkit upon which smart clients and applications can be built. It re-uses Net elements (URLs, MIME and DNS) | H.323 specifies everything including the codec for the media and how you carry the packets in RTP
Leaves issues of reliability to underlying network | Assumes fallibility of network - an unnecessary overhead
SIP messages are formatted as text. (Text processing lies behind the web and email) | Binary format doesn't sit well with the internet - this adds complexity
SIP allows for standards-based extensions to perform specific functions. | Extensions are added by using vendor-specific non-standard elements
Hierarchical URL style addressing scheme that scales | Addressing scheme doesn't scale well
Minimal delay - simplified signalling scheme makes it faster | Possibilities of delay (up to 7 or 8 seconds!)
Slim and Pragmatic | The suite is too cumbersome to deploy easily
SIP & H.323 Comparison
SIP | H.323
SERVICES
Standard IP Centrex services | Standard IP Centrex services
Ability to 'fork' calls | Not possible in the existing standard
User profiling | -
'Unified messaging' | -
Presence management | -
Unique ability to mix media (e.g. IVR) | Cannot mix media within a session
URLs can be embedded in web browsers and email tools | H.323 has no URL format
Works smoothly with media gateway controllers controlling multiple gateways - crucial in a multi-operator environment | "Shoehorn" interworking with SS7 is problematic - H.323 has trouble connecting calls to and from PSTN endpoints
Seamless interaction with other media - services are only limited by the developer's imagination | Services are nailed-down and constricted - voice only ceiling
STATUS
Industry endorsed | Popularity due to the fact that it was the first set of agreed-upon standards
Many vendors developing products | The majority of existing IP telephony products rely on the H.323 suite
Security
• Integrity
• Authenticity
• Encryption
• Intellectual rights protection
– Digital watermarking techniques embed extra information into multimedia data
– Imperceptible to normal user and irremovable
Security
• At the IP layer security can be provided by IPSec
• Secure RTP (RFC 3711)
– Provides 128 bit AES encryption
– Confidentiality, authentication and replay protection
– SHA-1 (Secure Hash Algorithm) for authentication
• Does not deal with key exchange