1 Multimedia Communications Introduction
Multimedia Communications: 1. Introduction
Institut Sains dan Teknologi Nasional - Jakarta
Reference text: Fred Halsall, Multimedia Communications: Applications, Networks, Protocols and Standards, Addison-Wesley, 1st edition (2002), ISBN 0-201-39818-4.
What is Multimedia?
01/22/2007
Multimedia is a combination of text, art, sound, animation, and video.
Slide: Courtesy, Hung Nguyen
Multimedia Description
Introduction to Multimedia
Multimedia is an integration of continuous media (e.g. audio, video) and discrete media (e.g. text, graphics, images) through which digital information can be conveyed to the user in an appropriate way.
Multi: many, much, multiple
Medium: an intervening substance through which something is transmitted or carried on
Why Multimedia Computing?
Application driven, e.g. medicine, sports, entertainment, education
Information can often be better represented using audio/video/animation rather than text, images and graphics alone.
Information is distributed using computer and telecommunication networks.
Integration of multiple media places demands on:
  computation power
  storage requirements
  networking requirements
Multimedia Information Systems
Technical challenges:
  Sheer volume of data
    Need to manage huge volumes of data
  Timing requirements
    among components of data computation and communication
    Must work internally with given timing constraints; real-time performance is required.
  Integration requirements
    Need to process traditional media (text, images) as well as continuous media (audio/video).
    Media are not always independent of each other; synchronization among the media may be required.
High Data Volume of Multimedia Information
Source             Parameters                                  Raw data rate / size
Speech             8,000 samples/s, 1 byte/sample              8 Kbytes/s
CD audio           44,100 samples/s, 2 bytes/sample, stereo    176 Kbytes/s
Satellite imagery  180 x 180 km^2 area, 30 m^2 resolution      600 MB/image (60 MB compressed)
NTSC video         30 fps, 640 x 480 pixels, 3 bytes/pixel     about 30 Mbytes/s (2-8 Mbits/s compressed)
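The raw rates on this slide follow from simple multiplication. The short Python sketch below reproduces them under two assumptions that the slide does not state explicitly: speech uses 1 byte per sample, and CD audio has 2 channels. The function names are illustrative only.

```python
# Raw (uncompressed) data rates for the examples on this slide.
# Assumptions: 1 byte/sample for speech, 2 channels (stereo) for CD audio.

def audio_rate(samples_per_s, bytes_per_sample, channels=1):
    """Bytes per second of uncompressed PCM audio."""
    return samples_per_s * bytes_per_sample * channels

def video_rate(fps, width, height, bytes_per_pixel):
    """Bytes per second of uncompressed video."""
    return fps * width * height * bytes_per_pixel

speech = audio_rate(8000, 1)          # 8,000 bytes/s
cd_audio = audio_rate(44100, 2, 2)    # 176,400 bytes/s
ntsc = video_rate(30, 640, 480, 3)    # 27,648,000 bytes/s

# Satellite image: 180 km x 180 km sampled every 30 m in each direction.
pixels_per_side = 180_000 // 30           # 6,000
satellite_pixels = pixels_per_side ** 2   # 36,000,000 pixels

print(speech, cd_audio, ntsc, satellite_pixels)
```

The NTSC result is about 27.6 Mbytes/s, which the slide rounds to 30 Mbytes/s. The 600 MB satellite figure works out to roughly 17 bytes per pixel, which would be consistent with several spectral bands per pixel, although the slide does not say so.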
Technology Incentive
Growth in computational capacity:
  MM workstations with audio/video processing capability
  Dramatic increase in CPU processing power
  Dedicated compression engines for audio, video etc.
Rise in storage capacity:
  Large capacity disks (several gigabytes)
  Increase in storage bandwidth, e.g. disk array technology
Surge in available network bandwidth:
  High speed fiber optic networks (gigabit networks)
  Fast packet switching technology
Application Areas
Residential Services:
  video-on-demand
  video phone/conferencing systems
  multimedia home shopping (MM catalogs, product demos and presentation)
  self-paced education
Business Services:
  Corporate training
  Desktop MM conferencing, MM e-mail
Application Areas (cont.)
Education:
  Distance education: MM repository of class videos
  Access to digital MM libraries over high speed networks
Science and Technology:
  computational visualization and prototyping
  astronomy, environmental science
Medicine:
  Diagnosis and treatment, e.g. MM databases that provide support for queries on scanned images, X-rays, assessments, response etc.
Classification of Media
Perception Medium: How do humans perceive information in a computer?
  Through seeing: text, images, video
  Through hearing: music, noise, speech
Representation Medium: How is the computer information encoded?
  Using formats for representing information: ASCII (text), JPEG (image), MPEG (video)
Presentation Medium: Through which medium is information delivered by the computer or introduced into the computer?
  Via I/O tools and devices:
    paper, screen, speakers (output media)
    keyboard, mouse, camera, microphone (input media)
Classification of Media (cont.)
Storage Medium: Where will the information be stored?
  Storage media: floppy disk, hard disk, tape, CD-ROM etc.
Transmission Medium: Over what medium will the information be transmitted?
  Using information carriers that enable continuous data transmission (networks):
    wire, coaxial cable, fiber optics
Information Exchange Medium: Which information carrier will be used for information exchange between different places?
  Direct transmission using computer networks
  Combined use of storage and transmission media (e.g. electronic mail)
Media Concepts
Each medium defines:
  Representation values, which determine the information representation of different media
    Continuous representation values (e.g. electromagnetic waves)
    Discrete representation values (e.g. text characters in digital form)
  Representation space, which determines the surrounding where the media are presented
    Visual representation space (e.g. paper, screen)
    Acoustic representation space (e.g. stereo)
Media Concepts (cont.)
Representation dimensions of a representation space are:
  Spatial dimensions:
    two-dimensional (2D graphics)
    three-dimensional (holography)
  Temporal dimensions:
    Time independent (document): discrete media
      Information consists of a sequence of individual elements without a time component.
    Time dependent (movie): continuous media
      Information is expressed not only by its individual value but also by its time of occurrence.
Multimedia Systems
Qualitative and quantitative evaluation of multimedia systems:
  Combination of media: continuous and discrete
  Levels of media independence: some media types (audio/video) may be tightly coupled, others may not
  Computer-supported integration: timing, spatial and semantic synchronization
  Communication capability
Data Streams
Distributed multimedia communication systems: data of discrete and continuous media are broken into individual units (packets) and transmitted.
Data stream: a sequence of individual packets that are transmitted in a time-dependent fashion.
Transmission of information carrying different media leads to data streams with varying features:
  Asynchronous
  Synchronous
  Isochronous
Data Stream Characteristics
Asynchronous transmission mode
  Provides for communication with no time restriction.
  Packets reach the receiver as quickly as possible, e.g. protocols for email transmission.
Synchronous transmission mode
  Defines a maximum end-to-end delay for each packet of a data stream.
  May require intermediate storage.
  E.g. audio connection established over a network.
Isochronous transmission mode
  Defines a maximum and a minimum end-to-end delay for each packet of a data stream; delay jitter of individual packets is bounded.
  E.g. transmission of video over a network.
  Intermediate storage requirements are reduced.
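The three transmission modes differ only in the bounds they place on end-to-end delay. As a minimal sketch, the Python function below checks a list of measured delays against each mode. The function names, the delay values, and the bounds are hypothetical and chosen only for illustration.

```python
def satisfies_mode(delays, mode, max_delay=None, min_delay=None):
    """Check measured end-to-end delays (in seconds) against a mode's bounds."""
    if mode == "asynchronous":
        return True  # no time restriction at all
    if mode == "synchronous":
        # only an upper bound on the delay of each packet
        return all(d <= max_delay for d in delays)
    if mode == "isochronous":
        # both a lower and an upper bound, so jitter is bounded too
        return all(min_delay <= d <= max_delay for d in delays)
    raise ValueError(f"unknown mode: {mode}")

def max_jitter(delays):
    """Largest difference between any two packet delays."""
    return max(delays) - min(delays)

# Hypothetical measurements for four packets.
delays = [0.040, 0.055, 0.048, 0.090]

print(satisfies_mode(delays, "asynchronous"))
print(satisfies_mode(delays, "synchronous", max_delay=0.100))
print(satisfies_mode(delays, "isochronous", max_delay=0.060, min_delay=0.030))
print(max_jitter(delays))
```

In the isochronous case the jitter can never exceed `max_delay - min_delay`, which is why less intermediate storage is needed at the receiver.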
Data Stream Characteristics (cont.)
Data stream characteristics for continuous media can be based on:
  Time intervals between complete transmission of consecutive packets
    Strongly periodic data streams: constant time interval
    Weakly periodic data streams: intervals vary as a periodic function with finite period
    Aperiodic data streams
  Data size: amount of data in consecutive packets
    Strongly regular data streams: constant amount of data
    Weakly regular data streams: amount varies periodically with time
    Irregular data streams
  Continuity
    Continuous data streams
    Discrete data streams
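The interval-based and size-based classifications use the same test: is the sequence constant, does it repeat with a finite period, or neither? The Python sketch below applies that test to packet times and packet sizes. It assumes exact integer values (for example, milliseconds and bytes); measured data would need a tolerance. All names are illustrative.

```python
def classify(values):
    """Return 'strong', 'weak' or 'none' for a sequence of intervals or sizes.

    'strong': every value is the same.
    'weak':   the sequence is a shorter pattern repeated at least twice.
    'none':   neither of the above.
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    if all(v == values[0] for v in values):
        return "strong"
    for period in range(2, n // 2 + 1):
        if all(values[i] == values[i % period] for i in range(n)):
            return "weak"
    return "none"

INTERVAL_NAMES = {"strong": "strongly periodic", "weak": "weakly periodic", "none": "aperiodic"}
SIZE_NAMES = {"strong": "strongly regular", "weak": "weakly regular", "none": "irregular"}

def describe_stream(times, sizes):
    """Classify a stream from packet completion times and packet sizes."""
    intervals = [b - a for a, b in zip(times, times[1:])]
    return INTERVAL_NAMES[classify(intervals)], SIZE_NAMES[classify(sizes)]

# Constant 10 ms spacing, sizes cycling through three values.
stream_a = describe_stream([0, 10, 20, 30, 40, 50], [100, 200, 300, 100, 200, 300])
# Spacing alternating 5 ms / 15 ms, sizes with no pattern.
stream_b = describe_stream([0, 5, 20, 25, 40, 45, 60], [100, 350, 80, 120, 990, 10, 7])
print(stream_a, stream_b)
```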
Classification based on time intervals
[Figure: timelines of a strongly periodic data stream, a weakly periodic data stream, and an aperiodic data stream; packet intervals are labeled T, T1, T2, T3.]
Classification based on packet size
[Figure: packet sizes over time t for a strongly regular data stream, a weakly regular data stream, and an irregular data stream; packet sizes are labeled D1, D2, D3, ..., Dn and the period is labeled T.]
Classification based on continuity
[Figure: a continuous data stream and a discrete data stream; packets are labeled D1, D2, D3, D4.]
Logical Data Units
Continuous media consist of a time-dependent sequence of individual information units called Logical Data Units (LDUs).
  A symphony consists of independent sentences.
  A sentence consists of notes.
  Notes are sequences of samples.
Granularity of LDUs:
  symphony, sentence, individual notes, grouped samples, individual samples
  film, clip, frame, raster, pixel
Duration of LDU:
  open LDU: duration not known in advance
  closed LDU: predefined duration
Granularity of Logical Data Units
Film
Clip
Frame
Blocks
Pixels
Multimedia Components Simplified
Multimedia can be viewed as the combination of audio, video and data, and how they interact with the user (more than the sum of the individual components).
[Figure: Multimedia at the overlap of Audio, Video and Data.]
Background
Fast paced emergence in applications in medicine, education, travel etc.
Characterized by large documents that must be communicated with short delays
Glamorous applications such as distance learning, video teleconferencing
Applications that are enhanced by video are often seen as a driver for the development of multimedia networks
Forces Driving Communications That Facilitate Multimedia Communications
Evolution of communications and data networks
Increasing availability of almost unlimited bandwidth on demand
Availability of ubiquitous access to the network
Ever increasing amount of memory and computational power
Sophisticated terminals
Digitization of virtually everything
New Information System Paradigm
[Figure labels: Integration; Multimedia Integrated Communication; Multimedia Processing; Broadband Link; Workstation, PC.]
Slide: Courtesy, Hung Nguyen
Elements of Multimedia Systems
Two key communication modes:
  Person-to-person
  Person-to-machine
[Figure: person-to-person mode: User Interface, Transport, User Interface; person-to-machine mode: User Interface, Transport, Processing, Storage and Retrieval.]
Slide: Courtesy, Hung Nguyen
Multimedia Networks
The world has been wrapped in copper and glass fiber and can be viewed as a “hair ball” with physical, wireless and satellite entry/exit points.
  Physical: LAN-WAN connections
  Wireless: cellular telephony, wireless PC connectivity
  Satellite: INMARSAT, Thuraya, ACeS etc.
Multimedia Communication Model
Partitioning of information objects into distinct types, e.g., text, audio, video
Standardization of service components per information type
Creation of platforms at two levels – network service and multimedia communication
Define general applications for multiple use in various multimedia environments
Define specific applications, e.g. e-commerce, tele-training, … using building blocks from platform and general applications
Requirements
User Requirements:
  Fast preparation and presentation
  Dynamic control of multimedia applications
  Intelligent support to users
  Standardization
Network Requirements:
  High speed and variable bit rates
  Multiple virtual connections using the same access
  Synchronization of different information types
  Suitable standardized services along with support
Network Requirements
ATM-BISDN and SS7 have enabled the switching based communications capabilities over the PSTN that support the necessary services
ATM-BISDN-SS7 will evolve to all optical “switchless” networks based on packet transfer
Packet Transfer Concept
Allows voice, video and data to be dealt with in a common format
More flexible than circuit switching, which it can emulate while allowing the multiplexing of varied bit rate data streams
Dynamic allocation of bandwidth
Handles Variable Bit Rate (VBR) directly
Considerations
Buffering is required for constant bit rate data such as audio.
Re-sequencing and recovery capabilities must be provided over networks where packets may be received in an order different from that transmitted, or may be dropped.
  In an ATM network some packets can be dropped while others may not (e.g. voice vs. bank transfer data packets).
Optimum packet lengths for voice, video and data differ in an ATM network.
IP packets over the Internet may arrive in a different order or be dropped.
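Re-sequencing with loss detection can be sketched in a few lines of Python. The generator below assumes each packet carries a sequence number starting at 0, and it uses a hypothetical `max_held` threshold: once more than that many later packets are waiting behind a gap, the missing packet is declared lost. Real protocols use timers and their own rules; this is only an illustration of the idea.

```python
def resequence(arrivals, max_held=3):
    """Reorder (seq, payload) pairs that arrive out of order.

    Yields (seq, payload) in increasing seq order. If more than `max_held`
    later packets are waiting behind a missing one, it is declared lost and
    yielded as (seq, None) so the receiver can conceal it or request it again.
    """
    held = {}      # out-of-order packets waiting for their turn
    next_seq = 0   # next sequence number owed to the application
    for seq, payload in arrivals:
        if seq < next_seq:
            continue  # duplicate or too late: already delivered or given up on
        held[seq] = payload
        while held:
            if next_seq in held:
                yield next_seq, held.pop(next_seq)
            elif len(held) > max_held:
                yield next_seq, None  # give up on the missing packet
            else:
                break  # keep waiting for the missing packet
            next_seq += 1
    # End of stream: flush what is left, marking remaining gaps as lost.
    while held:
        yield next_seq, held.pop(next_seq, None)
        next_seq += 1

# Packet 1 arrives late, packet 3 never arrives.
arrivals = [(0, "a"), (2, "c"), (1, "b"), (4, "e"), (5, "f"), (6, "g"), (7, "h")]
delivered = list(resequence(arrivals, max_held=2))
print(delivered)
```

A larger `max_held` tolerates more reordering at the price of more buffering delay, which is the trade-off a constant bit rate stream such as audio has to manage.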
Digital Video Signal Transport
This figure will be examined over the course of the semester.
[Figure: digital video signal transport from the Video source to the Users. Blocks and their functions, in the order shown:
  Encoder: transformation, quantization, entropy coding, bit-rate control
  Application: data structuring
  Network: multiplexing/routing
  Overhead (FEC), retransmission
  Error detection, loss detection, error correction, erasure correction
  Application: re-synchronization
  Decoder: de-quantization, entropy decoding, inverse transform, loss concealment, post-processing]
Quality of Service (QoS)
The set of parameters that defines the properties of media streams
Four QoS layers can be defined:
  1. User QoS: perception of the multimedia data at the user interface (“qualitative”)
  2. Application QoS: parameters such as end-to-end delay (“quantitative”)
  3. System QoS: requirements on the communications services derived from the application QoS
  4. Network QoS: parameters such as network load and performance
Applications of Multimedia
Business - Business applications for multimedia include presentations, training, marketing, advertising, product demos, databases, catalogues, instant messaging, and networked communication.
Schools - Educational software can be developed to enrich the learning process.
Slide: Courtesy, Hung Nguyen
Applications of Multimedia (cont.)
Home - Most multimedia projects reach the homes via television sets or monitors with built-in user inputs.
Public places - Multimedia will become available at stand-alone terminals or kiosks to provide information and help.
Slide: Courtesy, Hung Nguyen
Compact Disc Read-Only (CD-ROM)
CD-ROM is the most cost-effective distribution medium for multimedia projects.
It can contain up to 80 minutes of full-screen video or sound.
CD burners are used for reading discs and for writing audio, video, and data to recordable discs.
Slide: Courtesy, Hung Nguyen
Digital Versatile Disc (DVD)
Multilayered DVD technology increases the capacity of current optical technology to 18 GB.
DVD authoring and integration software is used to create interactive front-end menus for films and games.
DVD burners are used for reading discs and for writing audio, video, and data to recordable discs.
Slide: Courtesy, Hung Nguyen
Multimedia Communications
Multimedia communications is the delivery of multimedia to the user by electronic or digitally manipulated means.
[Figure: Multimedia Communications combines Audio Communications (telephony, sound, broadcast), Video Communications (video telephony, TV/HDTV), and Data, Text, Image Communications (data transfer, fax…).]
Slide: Courtesy, Hung Nguyen
Multimedia Terms
Alternative Types of Media used in Multimedia Applications
Multimedia Communications Networks
Multimedia Networks and Their Services
Audio-Visual Integration
Application in Biometrics – Bimodal Person Verification
Existing methods for person verification are mainly based on a single modality, which limits their security and robustness.
Audio-visual integration using a camera and a microphone makes person verification more reliable.
Slide: Courtesy, Hung Nguyen
Joint Audio-Video Coding
Correlation between audio and video can be used to achieve more efficient coding.
  Predictive coding of audio and video information is used to construct an estimate of the current frame (cross-modal redundancy).
  The difference between the original and the estimated signal can be transmitted as parameters.
  The decision on what and how to send is based on Rate-Distortion (R-D) criteria.
  Reconstruction is done at the receiver according to agreed-upon decoding rules.
Slide: Courtesy, Hung Nguyen
Cross-Modal Predictive Coding
[Figure: Visual Analysis yields Parameter X; an A-to-V Mapping yields the estimate X̂; the Decision Module (R-D) chooses among sending Parameter X, sending the difference X − X̂, or sending nothing.]
Slide: Courtesy, Hung Nguyen
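The three-way choice made by the decision module can be illustrated with a Lagrangian cost J = D + λR, a common way to express a rate-distortion criterion. The Python sketch below is not the scheme from the slides: the scalar parameter, the uniform quantizer with step `step`, the crude bit-count model in `bits_for`, and the value of `lam` are all assumptions made for illustration.

```python
import math

def quantize(value, step):
    """Uniform scalar quantizer (assumed for this sketch)."""
    return round(value / step) * step

def bits_for(value, step):
    """Crude rate model: sign bit plus magnitude bits of the quantizer index."""
    index = round(value / step)
    return 1 + math.ceil(math.log2(abs(index) + 1))

def choose_mode(x, x_hat, step=0.5, lam=0.1):
    """Pick the option with the lowest cost J = D + lam * R.

    x     : visual parameter measured at the encoder
    x_hat : estimate predicted from the audio (also available at the decoder)
    """
    residual = x - x_hat
    costs = {
        # Send nothing: decoder uses x_hat as is, zero bits.
        "nothing": (x - x_hat) ** 2,
        # Send the quantized difference: decoder adds it to x_hat.
        "residual": (residual - quantize(residual, step)) ** 2
                    + lam * bits_for(residual, step),
        # Send the quantized parameter itself: decoder ignores x_hat.
        "parameter": (x - quantize(x, step)) ** 2 + lam * bits_for(x, step),
    }
    return min(costs, key=costs.get), costs

good_prediction, _ = choose_mode(10.2, 10.1)    # audio predicts the video well
fair_prediction, _ = choose_mode(10.2, 7.0)     # prediction is off by a few steps
bad_prediction, _ = choose_mode(10.2, -40.0)    # prediction is useless
print(good_prediction, fair_prediction, bad_prediction)
```

With these made-up numbers the module sends nothing when the audio-based estimate is close, sends the difference when the estimate is roughly right, and falls back to sending the parameter itself when the estimate is far off.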
Importance of Interaction
Multimedia is more than the combination of text, audio, video and data.
Interaction among media is important.
Consider a poorly dubbed movie:
  Audio not synchronized with video
  Lip movements inconsistent with language
  Audio dynamic range inconsistent with the scene
Slide: Courtesy, Hung Nguyen
Media Interaction
[Figure: media interaction among Audio, Text, Image and Video within Multimedia (Process and Model). Labeled interactions: lip synch, face animation, joint A/V coding; compression, synthesis, 3D sound; sign language, lip reading; speech recognition, text-to-speech; compression, graphics, database indexing/retrieval; translation, natural language.]
Slide: Courtesy, Hung Nguyen
Bimodality of Human Speech
Human speech is produced by vibration of the vocal cords and configuration of the vocal tract, with muscles that also generate facial expressions.
Audio   Visual   Perceived
ba      ga       da
pa      ga       ta
ma      ga       na
Slide: Courtesy, Hung Nguyen
Basic Definitions
The basic unit of acoustic speech is called a phoneme.
In the visual domain, the basic unit of mouth movement is called a viseme.
  A viseme is the smallest visibly distinguishable unit of speech.
  Several phonemes can share one viseme and thus form one viseme group.
  The mapping between phonemes and visemes is many-to-one.
Slide: Courtesy, Hung Nguyen
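The many-to-one mapping can be written as a plain lookup table. The Python sketch below uses a small illustrative subset of viseme groups; the group names and the exact membership are assumptions, since published viseme inventories differ from one study to another.

```python
# Illustrative viseme groups (hypothetical names, partial phoneme inventory).
VISEME_GROUPS = {
    "lips_closed": ["p", "b", "m"],
    "lip_to_teeth": ["f", "v"],
    "tongue_behind_teeth": ["t", "d", "n"],
    "open_vowel": ["ae", "aa"],
}

# Invert the groups into a many-to-one phoneme -> viseme table.
PHONEME_TO_VISEME = {
    phoneme: viseme
    for viseme, phonemes in VISEME_GROUPS.items()
    for phoneme in phonemes
}

def to_visemes(phonemes):
    """Map a phoneme sequence to its viseme sequence."""
    return [PHONEME_TO_VISEME[p] for p in phonemes]

bat = to_visemes(["b", "ae", "t"])
pat = to_visemes(["p", "ae", "t"])
mat = to_visemes(["m", "ae", "t"])
print(bat, bat == pat == mat)
```

Words that differ only in phonemes from the same group produce the same viseme sequence, which is why lip reading alone cannot tell them apart.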
Lip Reading System
Application to support hearing-impaired persons
People learn to understand spoken language by combining visual content with lexical, syntactic, semantic and pragmatic information.
Automated lip reading systems:
  Speech recognition is possible using only visual information.
  Can be integrated with speech recognition systems to improve accuracy.
Slide: Courtesy, Hung Nguyen
Lip Synchronization
Applications:
  In VTC (video teleconferencing), where video frames are dropped (low bandwidth requirement) but audio must still be continuous
  In non-real-time use such as dubbing in a studio, where the recorded voice is full of background noise
Time-warping is commonly used in both audio and video modes:
  Time-frequency analysis
  Video time-warping could be used for VTC
  Audio time-warping could be used for dubbing
Slide: Courtesy, Hung Nguyen
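One widely used form of time-warping is dynamic time warping (DTW), which finds the lowest-cost alignment between two sequences of different lengths. The slides do not say which algorithm is meant, so the Python sketch below is only an assumption about how such an alignment could be computed. The sequences stand for one-dimensional features, such as a mouth-opening value per video frame and a matching audio feature per audio frame.

```python
def dtw(a, b):
    """Align sequences a and b by dynamic time warping.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    meaning a[i] is matched with b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats a match
                                 cost[i][j - 1],      # b[j-1] repeats a match
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

# The second sequence lingers on the value 1 for one extra frame.
total, path = dtw([0, 1, 2, 3], [0, 1, 1, 2, 3])
print(total, path)
```

In the example, element 1 of the first sequence is matched to two elements of the second, which is the stretching that lets a shorter track stay in step with a longer one.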
Lip Tracking
To prevent too much jerkiness in the motion rendering and too much loss in lip synchronization
Involves real-time analysis of the three-dimensional video signal (two spatial dimensions plus one temporal dimension)
Produces meaningful parameters:
  Classification of mouth images into visemes
  Measures of dimension, e.g. mouth widths and heights
Analysis tools: Fourier Transform, Karhunen-Loeve Transform (KLT), Probability Density Function (pdf) estimation
Slide: Courtesy, Hung Nguyen
Audio-to-Visual Mapping for Lip Tracking
Conversion of acoustic speech to mouth shape parameters
A mapping of phonemes to visemes:
  Could be most precisely implemented with a complete speech recognizer followed by a look-up table
  High computational overhead plus table look-up complexity
There is no need to recognize the spoken word to achieve audio-to-visual mapping:
  Physical relationships exist between vocal tract shape and sound produced
  Functional relationships exist between speech and visual parameters
Slide: Courtesy, Hung Nguyen
Classification-Based Conversion Approaches for Lip Tracking
Two-step process:
  Classification of the acoustic signal using VQ (vector quantization), HMM (hidden Markov model) and NN (neural network)
  Mapping of the acoustic classes into corresponding visual outputs, which are averaged to get a centroid
Shortcomings:
  Error resulting from averaging visual vectors to get the visual centroid
  Not a continuous mapping: finite output levels
Slide: Courtesy, Hung Nguyen
Classification-Based Conversion
[Figure: mapping from classes in Phoneme Space to a Centroid in Viseme Space.]
Slide: Courtesy, Hung Nguyen
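The two-step process can be sketched with vector quantization as the classifier. In the Python example below, the acoustic codebook, the training pairs, and the choice of (mouth width, mouth height) as the visual vector are all made up for illustration; a real system would train the codebook from speech features.

```python
def nearest(codebook, vec):
    """Step 1: index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], vec)))

def train_visual_centroids(codebook, pairs):
    """Average the visual vectors of all training pairs in each acoustic class."""
    sums, counts = {}, {}
    for acoustic, visual in pairs:
        k = nearest(codebook, acoustic)
        acc = sums.setdefault(k, [0.0] * len(visual))
        for i, v in enumerate(visual):
            acc[i] += v
        counts[k] = counts.get(k, 0) + 1
    return {k: [s / counts[k] for s in sums[k]] for k in sums}

def audio_to_visual(codebook, centroids, acoustic):
    """Step 2: output the visual centroid of the acoustic class."""
    return centroids[nearest(codebook, acoustic)]

# Hypothetical two-class acoustic codebook and (acoustic, visual) training pairs.
codebook = [[0.0, 0.0], [1.0, 1.0]]
pairs = [
    ([0.1, 0.0], [10.0, 4.0]),
    ([0.0, 0.2], [12.0, 6.0]),
    ([0.9, 1.1], [20.0, 14.0]),
]
centroids = train_visual_centroids(codebook, pairs)
mouth = audio_to_visual(codebook, centroids, [0.2, 0.1])
print(centroids, mouth)
```

Both shortcomings are visible here: the output [11.0, 5.0] matches neither training mouth shape in that class exactly (averaging error), and any input can only ever produce one of two outputs (finite output levels).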
Audio and Visual Integration for Lip Reading Applications
Three major steps:
  Audio-visual pre-processing: Principal Component Analysis (PCA) has been used for feature extraction
  Pattern recognition strategy (HMM, NN, time-warping…)
  Integration strategy (decision making)
    Heuristic rules to incorporate knowledge of phonemes about the two modalities
    Combination of independent evaluation scores for each modality
Slide: Courtesy, Hung Nguyen
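Combining independent evaluation scores is often done with a weighted sum, where the weight reflects how much each modality is trusted. The Python sketch below shows that idea only. The scores, class labels, and weights are invented for the example and do not come from any real recognizer.

```python
def fuse(audio_scores, visual_scores, audio_weight=0.5):
    """Weighted sum of per-class scores from independent audio and visual recognizers.

    Returns (best_class, fused_scores).
    """
    classes = audio_scores.keys() & visual_scores.keys()
    fused = {
        c: audio_weight * audio_scores[c] + (1 - audio_weight) * visual_scores[c]
        for c in classes
    }
    return max(fused, key=fused.get), fused

# Hypothetical per-class scores from the two recognizers.
audio = {"ba": 0.40, "da": 0.35, "ga": 0.25}
visual = {"ba": 0.10, "da": 0.30, "ga": 0.60}

equal_trust, _ = fuse(audio, visual, audio_weight=0.5)
trust_audio, _ = fuse(audio, visual, audio_weight=0.9)
print(equal_trust, trust_audio)
```

Lowering `audio_weight` would be a simple way to rely more on the visual channel when the audio is noisy, which is the situation where lip reading helps most.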