DIVISION OF COMPUTER SCIENCE AND ENGINEERING
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY KOCHI-22
CERTIFICATE
This is to certify that the technical report entitled “MPEG 7” that is being submitted by
SREEKUMAR K.R in partial fulfilment for the award of the Degree of Bachelor of
Technology in COMPUTER SCIENCE AND ENGINEERING of COCHIN
UNIVERSITY OF SCIENCE AND TECHNOLOGY is a bonafide work carried out by
her under my guidance and supervision.
Mr. Sudheep Elayidom Dr. David Peter S
Seminar Guide Head of Division
MPEG 7 A SEMINAR REPORT
Submitted by SREEKUMAR K.R
(12080082)
in partial fulfilment of requirement of the Degree of
Bachelor of Technology (B.Tech)
in
Computer Science & Engineering
of
Cochin University of Science And Technology
DIVISION OF COMPUTER SCIENCE SCHOOL OF ENGINEERING
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
KOCHI-682022
AUGUST 2010
ACKNOWLEDGMENT
I express my sincere thanks to Dr. David Peter, Head of
Department, Division of Computer Science, CUSAT. I
express my heartfelt gratitude to my respected guide, Mr. Sudheep
Elayidom, for his kind and inspiring advice, which helped me to understand
the subject and its semantic significance.
I also extend my sincere thanks to all other members of the faculty
of the Computer Science and Engineering Department and to my friends for their
co-operation and encouragement.
SREEKUMAR K.R
ABSTRACT
As more and more audiovisual information becomes available from many
sources around the world, many people would like to use this information for
various purposes. This challenging situation led to the need for a solution that
quickly and efficiently searches for and/or filters various types of multimedia
material that’s interesting to the user.
For example, finding information by rich-spoken queries, hand-drawn images,
and humming improves the user-friendliness of computer systems and finally
addresses what most people have been expecting from computers. For
professionals, a new generation of applications will enable high-quality
information search and retrieval. For example, TV program producers can search
with “laser-like precision” for occurrences of famous events or references to
certain people, stored in thousands of hours of audiovisual records, in order to
collect material for a program. This will reduce program production time and
increase the quality of its content.
MPEG-7 is a multimedia content description standard (to be defined by
September 2001) that addresses how humans expect to interact with computer
systems, since it develops rich descriptions that reflect those expectations.
TABLE OF CONTENTS
CHAPTER NO.   TITLE
1   INTRODUCTION
2   DIFFERENT VIDEO FORMATS
3   WHAT ARE THE MPEG STANDARDS
4   DEFINING MPEG-7
5   MPEG-7 TECHNICAL ACTIVITIES
6   MPEG-7 APPLICATION DOMAINS
7   MPEG-7 IN THE 21ST CENTURY MEDIA LANDSCAPE
8   ADVANTAGES OF MPEG-7 – A SUMMARY
9   CONCLUSION AND FUTURE SCOPE
    REFERENCES
MPEG 7
Department Of Computer Science Engineering 1
CHAPTER 1
INTRODUCTION
The Moving Picture Experts Group, abbreviated MPEG, is a working group of the
International Organization for Standardization (ISO) and the International
Electrotechnical Commission (IEC), and defines standards for digital
video and digital audio. The original task of this group was to develop a format to
play back video and audio in real time from a CD. Since then the demands have
grown: besides the CD, the DVD needs to be supported, as well as transmission
channels such as satellite links and networks. All these uses are covered by a
broad selection of standards. The best known are MPEG-1, MPEG-2,
MPEG-4 and MPEG-7. Each standard provides levels and profiles to support
specific applications in an optimized way.
It's clearly much more fun to develop multimedia content than to index it.
The amount of multimedia content available -- in digital archives, on the World
Wide Web, in broadcast data streams and in personal and professional databases
-- is growing out of control. But this enthusiasm has led to increasing difficulties
in accessing, identifying and managing such resources due to their volume and
complexity and a lack of adequate indexing standards. The large number of
recently funded DLI-2 projects related to the resource discovery of different
media types, including music, speech, video and images, indicates an
acknowledgement of this problem and the importance of this field of research for
digital libraries.
MPEG-7 is being developed by the Moving Picture Experts Group
(MPEG), a working group of ISO/IEC. Unlike the preceding MPEG standards
(MPEG-1, MPEG-2, MPEG-4) which have mainly addressed coded
representation of audio-visual content, MPEG-7 focuses on representing
information about the content, not the content itself.
The goal of the MPEG-7 standard, formally called the "Multimedia
Content Description Interface", is to provide a rich set of standardized tools to
describe multimedia content.
A single standard which can provide a simple, flexible, interoperable
solution to the problems of indexing, searching and retrieving multimedia
resources will be extremely valuable and widely deployed. Resources described
using such a standard will acquire enhanced value. Compliant hardware and
software tools capable of efficiently generating and interpreting such
standardized descriptions will be in great demand.
CHAPTER 2
DIFFERENT VIDEO FORMATS
Avid PC users will almost certainly remember the first time they were
able to view a video clip on their computer. The clips were about the size of a
postage stamp and were generously referred to as "multimedia". Later, the first
acceptable video clips were used in the opening scenes of computer games. In
some cases, there were even digital 3D animations that couldn't be created in
real-time with the hardware and software that was available in those days. As the
video clips demanded extensive storage space (despite their short length), they
were only available on CD-ROM drives that had recently become popular.
Because of this, many PCs became multimedia-compatible, in a restricted sense,
by the integration of a CD-ROM drive and a soundcard. However, their
limitations soon became apparent: it wasn't possible to run the video clip
smoothly in fullscreen mode even with the most powerful hardware available.
With the development of high performance graphic chips, faster processors and
corresponding software interfaces, today's users are now able to run video clips
in all the usual formats (including fullscreen mode) without problems. We'll
continue with a look at the most common video formats and then provide an
overview of their specific applications.
The AVI Format
One of the oldest formats in the x86 computer world is AVI. The
abbreviation 'AVI' stands for 'Audio Video Interleave'. This video format was
created by Microsoft and introduced along with Windows 3.1. AVI, the
proprietary format of Microsoft's "Video for Windows" application, merely
provides a framework for various compression algorithms such as Cinepak, Intel
Indeo, Microsoft Video 1, Clear Video or IVI. In its first version, AVI supported
a maximum resolution of 160 x 120 pixels at a frame rate of 15 frames per
second. The format attained widespread popularity when the first video editing
systems and software that used AVI by default appeared. Examples of such
editing boards included Fast's AV Master and Miro/Pinnacle's DC10 to DC50.
However, there were a number of restrictions: for example, an AVI video that
had been processed using an AV Master could not be directly processed using an
interface board from Miro/Pinnacle. The manufacturers adapted the open AVI
format according to their own requirements. AVI is subject to additional
restrictions under Windows 98, which make professional work at higher
resolutions more difficult. For example, the maximum file size under the FAT16
file system is 2 GB. The FAT32 file system (introduced with Windows 95 OSR2 and
Windows 98) brought an improvement: in connection with the latest DirectX 6 module
'DirectShow', files with a size of 8 GB can (at least in theory) be created. In
practice, however, many interface cards lack the corresponding driver support, so
Windows NT 4.0 and NTFS are strongly recommended. Despite its age and
numerous problems, the AVI format is still used in semi-professional video
editing cards. Many TV cards and graphic boards with a video input also use the
AVI format. These are able to grab video clips at low resolutions (mostly 320 x
240 pixels).
Apple's Format
The MOV format, which originated in the Macintosh world, was also
ported to x86-based PCs. It is the proprietary standard of Apple's QuickTime
application and stores audio and video data simultaneously. Between 1993 and
1995, QuickTime was superior to Microsoft's AVI format in both functionality
and quality. The functionality of the latest generation (QuickTime 4.0) also
includes the streaming of Internet videos (the real-time transmission of videos
without the need to first download the entire file to the computer). Despite this,
Apple's proprietary format is continually losing popularity with the increasing
use of MPEG. Video clips coded with Apple's format are still found on some
CDs because of QuickTime's ability to run on both Macintosh and x86
computers.
MPEG Formats
The MPEG formats are by far the most popular standard. MPEG stands
for "Moving Picture Experts Group", an international working group that develops
standards for the encoding of moving images. In order to attain widespread use,
the MPEG standard only specifies a data model for the compression of moving
pictures and audio signals. In this way, MPEG remains platform independent.
One can currently differentiate between four standards: MPEG-1, MPEG-2,
MPEG-4 and MPEG-7. Let's take a brief look at each format separately.
MPEG-1 was released in 1993 with the objective of achieving acceptable
frame rates and the best possible image quality for moving images and their
sound signals for media with a low bandwidth (1 MBit/s up to 1.5 MBit/s). The
design goal of MPEG-1 is the ability to randomly access a sequence within half a
second, without a noticeable loss in quality. For most home user applications
(digitizing of vacation videos) and business applications (image videos,
documentation), the quality offered by MPEG-1 is adequate.
MPEG-2 has been in existence since 1995 and its basic structure is the
same as that of MPEG-1; however, it allows data rates up to 100 MBit/s and is
used for digital TV, video films on DVD-ROM and professional video studios.
MPEG-2 allows the scaling of resolution and the data rate over a wide range.
Due to its high data rate compared with MPEG-1 and the increased requirement
for memory space, MPEG-2 is currently only suitable for playback in the home
user field. The attainable video quality is noticeably better than with MPEG-1 for
data rates of approximately 4 MBit/s.
MPEG-4 is one of the latest video formats and its objective is to get the
highest video quality possible for extremely low data rates in the range between
10 KBit/s and 1 MBit/s. Furthermore, the need for data integrity and loss-free
data transmission is paramount as these play an important role in mobile
communications. Something completely new in MPEG-4 is the organization of
the image contents into independent objects in order to be able to address or
process them individually. MPEG-4 is used for video transmission over the
Internet for example. Some manufacturers plan to transmit moving images to
mobile phones in the future. MPEG-4 is intended to form a platform for this type
of data transfer.
MPEG-7 is the latest MPEG family project. It is a standard to describe
multimedia data and can be used independently of other MPEG standards.
MPEG-7 will probably reach the status of an international standard by the year
2001.
The MJPEG Format
The abbreviation MJPEG stands for "Motion JPEG". This format is
practically an intermediate step between a still image and video format, as an
MJPEG clip is a sequence of JPEG images. This is one reason why the format is
often implemented by video editing cards and systems. MJPEG is a compression
method that is applied to every image. Video editing cards such as Fast's AV
Master or Miro's DC50, or the much less expensive Matrox Marvel product
series reduce the resulting data stream of a standard television signal from
approximately 30 MB/s (!) to 6 MB/s (MJPEG file). This corresponds to a
compression ratio of 5:1. However, a standard for the synchronization of audio
and video data during recording has not been implemented in the MJPEG format
so that the manufacturers of video editing cards have had to create their own
implementations.
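The compression figure quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two data rates given in the text; the one-minute clip length is an assumption added purely to illustrate the storage saving.

```python
# Data rates quoted in the text (approximate values).
raw_rate_mb_s = 30.0    # uncompressed standard television signal
mjpeg_rate_mb_s = 6.0   # MJPEG stream produced by the editing card

compression_ratio = raw_rate_mb_s / mjpeg_rate_mb_s

# Storage needed for a clip of assumed length (60 s is a hypothetical value).
clip_seconds = 60
raw_size_mb = raw_rate_mb_s * clip_seconds
mjpeg_size_mb = mjpeg_rate_mb_s * clip_seconds

print(f"compression ratio: {compression_ratio:.0f}:1")
print(f"one minute: {raw_size_mb:.0f} MB raw, {mjpeg_size_mb:.0f} MB as MJPEG")
```

At these rates one minute of video shrinks from 1800 MB to 360 MB, which is why MJPEG made capture to the hard disks of that period practical.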
CHAPTER 3
WHAT ARE THE MPEG STANDARDS?
The Moving Picture Experts Group (MPEG) is a working group of
the Geneva-based ISO/IEC standards organizations (International Organization for
Standardization/International Electrotechnical Commission) in charge of the
development of international standards for compression, decompression,
processing, and coded representation of moving pictures, audio, and a
combination of the two. MPEG-7, then, is an ISO/IEC standard being developed
by MPEG, the committee that also developed the Emmy Award-winning
standards known as MPEG-1 and MPEG-2, and the 1999 MPEG-4 standard.
• MPEG-1: For the storage and retrieval of moving pictures and audio on
storage media.
• MPEG-2: For digital television, it’s the timely response for the satellite
broadcasting and cable television industries in their transition from analog to
digital formats.
• MPEG-4: Codes content as objects and enables those objects to be
manipulated individually or collectively on an audiovisual scene.
MPEG-1, -2, and -4 make content available. MPEG-7 lets you find the
content you need.
Besides these standards, MPEG is currently also working on MPEG-21, a
Technical Report about the Multimedia Framework.
CHAPTER 4
DEFINING MPEG-7
MPEG-7 is a standard for describing features of multimedia content.
Qualifying MPEG-7
MPEG-7 provides the world’s richest set of audio-visual descriptions.
These descriptions are based on catalogue (e.g., title, creator, rights),
semantic (e.g., the who, what, when, where information about objects and events)
and structural (e.g., the colour histogram, a measurement of the amount of colour
associated with an image, or the timbre of a recorded instrument) features of the
AV content, and they leverage the AV data representations defined by MPEG-1, 2 and
4.
Comprehensive Scope of Data Interoperability
MPEG-7 uses XML Schema as the language of choice for content
description. MPEG-7 will be interoperable with other leading standards such as the
SMPTE Metadata Dictionary, Dublin Core, EBU P/Meta, and TV Anytime.
The Key Role of MPEG-7
MPEG-7, formally named “Multimedia Content Description Interface,” is
the standard that describes multimedia content so users can search, browse, and
retrieve that content more efficiently and effectively than they could using
today’s mainly text-based search engines. It’s a standard for describing the
features of multimedia content.
MPEG-7 will not standardize the (automatic) extraction of AV
descriptions/features. Nor will it specify the search engine (or any other program)
that can make use of the description. It will be left to the creativity and
innovation of search engine companies, for example, to manipulate and massage
the MPEG-7-described content into search indices that can be used by their
browser and retrieval tools (see Figure 1).
CHAPTER 5
MPEG-7 TECHNICAL ACTIVITIES
It is important to note that MPEG-7 addresses many different applications in
many different environments, which means that it needs to provide a flexible and
extensible framework for describing audio-visual data. Therefore, MPEG-7 will
define a multimedia library of methods and tools. It will standardize:
• A set of descriptors: A descriptor (D) is a representation of a feature that
defines the syntax and semantics of the feature representation.
• A set of description schemes: A description scheme (DS) specifies the
structure and semantics of the relationships between its components, which
may be both descriptors and description schemes.
• A language that specifies description schemes, the Description Definition
Language (DDL): It also allows for the extension and modification of
existing description schemes. MPEG-7 adopted XML Schema Language as
the MPEG-7 DDL. However, the DDL requires some specific extensions to
XML Schema Language to satisfy all the requirements of MPEG-7. These
extensions are currently being discussed through liaison activities between
MPEG and W3C, the group standardizing XML.
• One or more ways (textual, binary) to encode descriptions: A coded
description is a description that’s been encoded to fulfill relevant
requirements such as compression efficiency, error resilience, and random
access.
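The relationship between descriptors, description schemes and a textual encoding can be pictured with a small sketch. This is only an illustration under stated assumptions: the class names `Descriptor` and `DescriptionScheme`, the helper `to_xml`, and the element names (`Description`, `Creation`, `Title`, `Creator`) are hypothetical and are not taken from the normative MPEG-7 schema, and the output is plain XML that has not been validated against the DDL.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Descriptor:
    """A descriptor (D): one feature together with its represented value."""
    feature: str
    value: str


@dataclass
class DescriptionScheme:
    """A description scheme (DS): a named structure of Ds and nested DSs."""
    name: str
    components: List[Union["Descriptor", "DescriptionScheme"]] = field(default_factory=list)


def to_xml(node):
    """Textual encoding: turn a D or DS tree into an XML element."""
    if isinstance(node, Descriptor):
        element = ET.Element(node.feature)
        element.text = node.value
        return element
    element = ET.Element(node.name)
    for component in node.components:
        element.append(to_xml(component))
    return element


# Hypothetical description of one programme (names and values are invented).
creation = DescriptionScheme(
    "Creation",
    [Descriptor("Title", "Evening News"), Descriptor("Creator", "Newsroom")],
)
description = DescriptionScheme("Description", [creation])

xml_text = ET.tostring(to_xml(description), encoding="unicode")
print(xml_text)
```

A binary encoding would carry the same tree in a more compact form; the sketch covers only the textual case.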
Figure 1: The Scope of MPEG-7
Organization of MPEG-7 Description Tools
Over 100 MPEG-7 Description Tools are currently being developed and
refined. The relationships between the MPEG-7 Description Tools are outlined in
Figure 2. The basic elements, at the lower level, deal with basic data types,
mathematical structures, schema tools, linking and media localization tools, as
well as basic DSs, which are elementary components of more complex DSs. The
Schema tools section specifies elements for creating valid MPEG-7 schema
instance documents and description fragments.
In addition, this section specifies tools for managing and organizing the
elements and datatypes of the schema. Based on this lower level, content
description and management elements can be defined. These elements describe
the content from several viewpoints. Currently five viewpoints are defined:
creation and production, media, usage, structural aspects, and conceptual aspects.
The first three elements primarily address information that’s related to the
management of the content (content management), whereas the last two are
mainly devoted to the description of perceivable information (content
description).
Figure 2: Overview of MPEG-7 Multimedia Description Schemes (DSs)
• Creation and Production: Contains meta information that describes the
creation and production of the content; typical features include title, creator,
classification, and purpose of the creation. Most of the time this information is
author-generated since it can’t be extracted from the content.
• Usage: Contains meta information that’s related to the usage of the content;
typical features involve rights holders, access rights, publication, and financial
information. This information may be subject to change during the lifetime of the
AV content.
• Media: Contains the description of the storage media; typical features include
the storage format, the encoding of the AV content, and elements for the
identification of the media. Note: Several instances of storage media for the same
AV content can be described.
• Structural aspects: Contains the description of the AV content from the
viewpoint of its structure. The description is structured around segments that
represent physical, spatial, temporal, or spatio-temporal components of the AV
content. Each segment may be described by signal-based features (color, texture,
shape, motion, audio) and some elementary semantic information.
• Conceptual Aspects: Contains a description of the AV content from the
viewpoint of its conceptual notions.
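The grouping of the five viewpoints into content management and content description can be sketched as a simple lookup. The dictionary keys, the `Group` enum, the helper `split_by_group` and the sample clip are all hypothetical names and values chosen for illustration; they are not part of the standard.

```python
from enum import Enum


class Group(Enum):
    CONTENT_MANAGEMENT = "content management"
    CONTENT_DESCRIPTION = "content description"


# The five viewpoints and the group each belongs to, as stated in the text.
VIEWPOINTS = {
    "creation_and_production": Group.CONTENT_MANAGEMENT,
    "media": Group.CONTENT_MANAGEMENT,
    "usage": Group.CONTENT_MANAGEMENT,
    "structural_aspects": Group.CONTENT_DESCRIPTION,
    "conceptual_aspects": Group.CONTENT_DESCRIPTION,
}


def split_by_group(description):
    """Split a flat description into management and description parts."""
    result = {group: {} for group in Group}
    for viewpoint, features in description.items():
        result[VIEWPOINTS[viewpoint]][viewpoint] = features
    return result


# Hypothetical description of one news clip (all values are invented).
clip = {
    "creation_and_production": {"title": "Evening News", "creator": "Newsroom"},
    "usage": {"rights_holder": "Example Broadcaster"},
    "media": {"storage_format": "MPEG-2"},
    "structural_aspects": {"segments": [{"start_s": 0, "end_s": 42}]},
    "conceptual_aspects": {"event": "press conference"},
}

parts = split_by_group(clip)
print(sorted(parts[Group.CONTENT_MANAGEMENT]))
print(sorted(parts[Group.CONTENT_DESCRIPTION]))
```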
The five sets of Description Tools are presented here as separate entities;
however, they are interrelated and may be partially included in each other. For
example, Media, Usage or Creation & Production elements can be attached to
individual segments involved in the structural description of the content. Tools
are also defined for navigation and access, and there is another set of tools for
Content organization, which addresses the organization of content by
classification, by the definition of collections and by modeling. Finally, the last
set of tools is User Interaction, which describes users’ preferences for the
consumption of multimedia content and their usage history.
MPEG-7 Working Groups
Currently MPEG-7 concentrates on the specification of description tools
(Descriptors and Description Schemes), together with the development of the
MPEG-7 reference software, known as XM (eXperimentation Model). The XML
Schema Language was chosen as the base for the Description Definition
Language (DDL).
The MPEG-7 Audio group develops a range of Description Tools, from generic
audio descriptors (e.g., waveform and spectrum envelopes, fundamental
frequency) to more sophisticated description tools like Spoken Content and
Timbre. Generic Audio Description tools will allow the search for similar voices,
by searching similar envelopes and fundamental frequencies of a voice sample
against a database of voices. The Spoken Content Description Scheme (DS) is
designed to represent the output of a great number of state-of-the-art Automatic
Speech Recognition systems, containing both word and phoneme
representations and the most likely transitions. This alleviates the problem of
out-of-vocabulary words, allowing retrieval even when the original word was wrongly
decoded. The Timbre descriptors (Ds) describe the perceptual features of
instrument sounds that make two sounds having the same pitch and loudness
appear different to the human ear. These descriptors allow searching for melodies
independently of the instruments.
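As a rough illustration of searching for similar voices by fundamental frequency, the sketch below estimates the fundamental of a signal and looks up the closest entry in a small database. MPEG-7 does not standardize feature extraction, so the autocorrelation method used here is simply one common choice, and the function names, the synthetic test tone and the database values are all assumptions made for the example.

```python
import math


def synth_tone(freq_hz, sample_rate=8000, n_samples=2048):
    """Generate a pure sine tone standing in for a recorded voice sample."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]


def estimate_f0(samples, sample_rate=8000, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency from the strongest autocorrelation lag."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation at this lag.
        score = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def closest_voice(query_f0, database):
    """Return the database entry whose stored f0 is nearest to the query."""
    return min(database, key=lambda name: abs(database[name] - query_f0))


# Hypothetical database of stored fundamental frequencies in Hz.
voices = {"voice_a": 120.0, "voice_b": 215.0, "voice_c": 310.0}

query_f0 = estimate_f0(synth_tone(220.0))
match = closest_voice(query_f0, voices)
print(f"estimated f0: {query_f0:.1f} Hz, closest match: {match}")
```

Because the lag is an integer number of samples, the estimate is only accurate to within a few hertz at this sample rate; a real system would compare envelopes as well as the fundamental.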
The MPEG-7 Visual group is developing four groups of description tools:
Color, Texture, Shape and Motion. Color and Texture Description Tools will
allow the search and filtering of visual content (images, graphics, video) by
dominant color or textures in some (arbitrarily shaped) regions or the whole
image. Shape Description Tools will facilitate “query by sketch” or by contour
similarity in image databases, or, for example, searching trademarks in
registration databases. Motion Description Tools will allow searching of videos
with similar motion patterns that can be applicable to news (e.g. similar
movements in a soccer or football game) or to surveillance applications (e.g.,
detect intrusion as a movement towards the safe zone).
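Search by colour can be sketched with a plain colour histogram and the histogram-intersection measure. This is a generic technique chosen for illustration, not one of the normative MPEG-7 colour descriptors; the function names, the bin count and the tiny synthetic images are assumptions.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalised RGB histogram; pixels is a list of (r, g, b) values in 0-255."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1.0
    return [count / len(pixels) for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the two colour distributions are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_color(query_pixels, database):
    """Return database image names sorted from most to least similar to the query."""
    query_hist = color_histogram(query_pixels)
    scores = {
        name: histogram_intersection(query_hist, color_histogram(pixels))
        for name, pixels in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 100-pixel images, described only by their pixel colours.
mostly_red = [(250, 10, 10)] * 90 + [(10, 10, 250)] * 10
database = {
    "red_with_green": [(240, 20, 5)] * 80 + [(10, 250, 10)] * 20,
    "all_blue": [(5, 5, 240)] * 100,
}

ranking = rank_by_color(mostly_red, database)
print(ranking)
```

The same comparison could be restricted to the pixels of one region to approximate a search by dominant colour in an arbitrarily shaped area.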
The MPEG-7 Multimedia Description Schemes group is developing the
description tools dealing with generic, audiovisual and archival features. Its
central tools deal with content management and content description as outlined in
section 2.1.
The MPEG-7 Implementation Studies group is designing and implementing
the MPEG-7 Reference Software known as XM.
The MPEG-7 Systems group is developing the DDL and the binary format
(known as BiM), and is also working on the definition of the terminal architecture
and access units.
CHAPTER 6
MPEG-7 APPLICATION DOMAINS
The elements that MPEG-7 standardizes will support a broad range of
applications (for example, multimedia digital libraries, broadcast media selection,
multimedia editing, home entertainment devices, etc.). MPEG-7 will also make
the web as searchable for multimedia content as it is searchable for text today.
This would apply especially to large content archives, which are being made
accessible to the public, as well as to multimedia catalogues enabling people to
identify content for purchase. The information used for content retrieval may also
be used by agents, for the selection and filtering of broadcast "push" material
or for personalized advertising. Additionally, MPEG-7 descriptions will allow
fast and cost-effective usage of the underlying data, by enabling semi-automatic
multimedia presentation and editing. All domains making use of multimedia will
benefit from MPEG-7, including:
Digital libraries, Education (image catalogue, musical dictionary, Bio-
medical imaging catalogues…)
Multimedia editing (personalised electronic news service, media
authoring)
Cultural services (history museums, art galleries, etc.),