Page 1: Introduction to Multimedia Databases

Module 1

INTRODUCTION TO MULTIMEDIA DATABASES

Prof. Dr. Naomie Salim
Faculty of Computer Science & Information Systems

Universiti Teknologi Malaysia

Page 2: Introduction to Multimedia Databases

The Explosion of Digital Multimedia Information

• We interact with multimedia every day
• Large amounts of text, images, speech & video are converted to digital form
• Advantages of digitized data over analog:
  – Easy storage
  – Easy processing
  – Easy sharing

Page 3: Introduction to Multimedia Databases

Give examples of multimedia applications that deal with storing, retrieving, processing and sharing of multimedia data.

Page 4: Introduction to Multimedia Databases

Eg 1. Journalism

• A journalist is to write an article about the influence of alcohol on driving
• Investigation involves:
  – Collecting news articles about accidents, scientific reports, television commercials, police interviews, and interviews with medical experts
• Illustration:
  – Search photo archives and stock footage companies for “good” photos – “shocking”, “funny”, etc.

Page 5: Introduction to Multimedia Databases

Other examples

• Searching movies
  – Based on the “taste” of movies already seen
  – Based on movies a friend favors
• Searching on the web
  – Eg. searching the Australian Open website (http://www.ausopen.org)
  – Integrate conceptual terms + interesting events
  – “Give info about video segments showing female American tennis players going to the net”

Page 6: Introduction to Multimedia Databases

Retrieval problems

• EMPLOYEE (Name: char(20), City: char(20), Photo: Image)
  – How do you select employees in “Skudai”?
  – How do you select employees that wear a “tudung”, wear glasses, are fair-skinned and have a mole under the lips? (see the sketch below)
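To make the gap concrete, here is a minimal Python sketch (the table contents and names are purely illustrative) contrasting the two queries: the city condition maps directly to a SQL WHERE clause, while the appearance condition has no column to test.

# Minimal sketch (illustrative only): the "Skudai" query maps directly to SQL,
# but the appearance-based query has no alphanumeric column to test.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Name CHAR(20), City CHAR(20), Photo BLOB)")
conn.execute("INSERT INTO EMPLOYEE VALUES ('Aisyah', 'Skudai', NULL)")

# Query 1: descriptive attribute -- a plain WHERE clause is enough.
rows = conn.execute("SELECT Name FROM EMPLOYEE WHERE City = 'Skudai'").fetchall()
print(rows)

# Query 2: "wears a tudung, wears glasses, has a mole under the lips" --
# there is no column encoding this; answering it needs features or annotations
# extracted from the Photo column (content-based metadata), not plain SQL.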

Page 7: Introduction to Multimedia Databases

Characteristics of Media Data

• Medium: a type of information representation
  – Alphanumeric
  – Audio, video and image, traditionally in analog representation
• Static vs dynamic
  – Static: no time dimension (alphanumeric data, images, graphics)
  – Dynamic: have a time dimension (video, animation, audio)
• Multimedia
  – Collection of media types used together
  – At least one media type must be non-alphanumeric

Page 8: Introduction to Multimedia Databases

Digital representation of text

• OCR (Optical Character Recognition) techniques convert analog text to digital text
• Eg. of digital representation: ASCII
  – Uses 8 bits per character
  – Chinese characters require more space
  – Storage requirements depend on the number of characters
• Structured documents are becoming more popular
  – Docs consist of titles, chapters, sections, paragraphs, etc.
  – Standards like HTML and XML are used to encode structured information
• Compression of text
  – Huffman coding, arithmetic coding
  – Since storage requirements are not too high, compression is less important than for other multimedia data
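As a rough illustration of the Huffman coding mentioned above, here is a minimal Python sketch (not the exact variant used by real compressors) that builds a prefix code from character frequencies and compares the coded size with 8-bit ASCII:

# Minimal Huffman coding sketch: build a prefix code from character frequencies.
import heapq
from collections import Counter

def huffman_codes(text):
    freq = Counter(text)
    if len(freq) == 1:                        # degenerate case: one symbol
        return {next(iter(freq)): "0"}
    # Each heap entry: (frequency, tie-breaker, {char: code-so-far})
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in c1.items()}
        merged.update({ch: "1" + code for ch, code in c2.items()})
        counter += 1
        heapq.heappush(heap, (f1 + f2, counter, merged))
    return heap[0][2]

text = "multimedia database"
codes = huffman_codes(text)
print(codes)
print("Huffman bits:", sum(len(codes[ch]) for ch in text), "vs ASCII bits:", 8 * len(text))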

Page 9: Introduction to Multimedia Databases

Digital representation of audio

• Audio
  – Air pressure waves with frequency and amplitude
  – Humans hear roughly 20–20,000 Hertz
  – Low amplitude – soft sound

Page 10: Introduction to Multimedia Databases

Digitizing pressure waveforms

• Transform into an electrical signal (by microphone)
• Convert into discrete values
  – Sampling: the continuous time axis is divided into small, fixed intervals
  – Quantization: determination of the amplitude of the signal at the beginning of each time interval
  – Humans cannot notice the difference between analog & digital given a high enough sampling rate and precise quantization
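A minimal Python sketch of sampling and quantization (the 440 Hz tone, 8 kHz sampling rate and 8-bit quantization are assumed values chosen purely for illustration):

# Minimal sketch: sample a 440 Hz sine tone and quantize each sample to 8 bits.
import math

sample_rate = 8000          # samples per second (assumed for illustration)
bits = 8                    # bits per sample -> 2**8 = 256 quantization levels
levels = 2 ** bits

def sample_and_quantize(duration_s):
    samples = []
    for n in range(int(duration_s * sample_rate)):
        t = n / sample_rate                            # sampling: fixed time intervals
        amplitude = math.sin(2 * math.pi * 440 * t)    # analog value in [-1, 1]
        q = round((amplitude + 1) / 2 * (levels - 1))  # quantization: nearest level
        samples.append(q)
    return samples

digital = sample_and_quantize(0.01)                    # 10 ms of audio
print(len(digital), "samples, first few:", digital[:5])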

Page 11: Introduction to Multimedia Databases

Audio storage requirements

• Example of CD audio
  – 16 bits per sample
  – 44,100 samples per second
  – Two (stereo) channels
  – Requirements = 16 * 44,100 * 2 bits ≈ 1.4 Mbit per second (worked out below)
• Compression (examples)
  – Masking: discard soft sounds that are made inaudible by louder sounds
  – Speech: coding of lower-frequency sounds only
  – MPEG: audio compression standards
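A quick check of the CD-audio figure above, using the numbers from the slide:

# Quick check of the CD-audio bit rate and the storage needed for one minute.
bits_per_sample = 16
samples_per_second = 44_100
channels = 2

bit_rate = bits_per_sample * samples_per_second * channels    # bits per second
print(bit_rate / 1_000_000, "Mbit/s")                         # ~1.41 Mbit/s
print(bit_rate * 60 / 8 / 1_000_000, "MB per minute")         # ~10.6 MB per minute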

Page 12: Introduction to Multimedia Databases

Digital representation of image

• Scan analog photos & pictures using a scanner
  – The analog image is approximated by a rectangle of small dots
  – In a digital camera, the ADC is built in
• An image consists of many small dots or picture elements (pixels)
  – Gray scale: 1 byte (8 bits) per pixel
  – Color: 3 color components (RGB) of one byte each
  – Data required for one rectangular screen:

    A = x * y * b

    where A: number of bytes needed, x: # pixels per horizontal line, y: # horizontal lines, b: # bytes per pixel
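A small Python sketch applying A = x * y * b (the 1024 x 768 resolution below is just an assumed example):

# Storage for one uncompressed image: A = x * y * b
def image_bytes(x_pixels, y_lines, bytes_per_pixel):
    return x_pixels * y_lines * bytes_per_pixel

print(image_bytes(1024, 768, 1) / 1_000_000, "MB (gray scale)")  # ~0.79 MB
print(image_bytes(1024, 768, 3) / 1_000_000, "MB (RGB color)")   # ~2.36 MB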

Page 13: Introduction to Multimedia Databases

Image compression

• Exploit redundancy in the image & properties of human perception
  – Spatial redundancy: pixels in a certain area often appear similar (golden sand, blue sky)
  – Human tolerance: some error still allows effective communication
• Eg. of image compression
  – Transform coding
  – Fractal image coding

Page 14: Introduction to Multimedia Databases

Digital representation of video

• Sequence of frames or images presented at a fixed rate
  – Digital video obtained by digitizing analog videos or from digital cameras
  – Playing 25 frames per second gives the illusion of continuous motion
• Amount of data to represent video
  – 1 second of video: 512 lines, 512 pixels per line, 24 bits per pixel, 25 frames per second
  – 512 * 512 * 3 * 25 ≈ 19 Mbytes

Page 15: Introduction to Multimedia Databases

Compression of video

• Compressing the frames of a video: similar to image compression
  – Reduce redundancy & exploit human perception properties
• Temporal redundancy: neighboring frames are normally similar; removed by applying motion estimation & compensation (a small block-matching sketch follows this list)
  – Each image is divided into fixed-sized blocks
  – For each block in an image, the most similar block in the previous image is determined & the pixel difference computed
  – Together with the displacement between the two blocks, this difference is stored or transmitted
• MPEG-1 (VHS quality, pixel-based coding): coding of video data at up to 1.5 Mbits per second
• MPEG-2 (pixel-based coding): coding of video data at up to 10 Mbits per second
• MPEG-4 (multimedia data, object-based coding): coding of video data at up to 40 Mbits per second; tools for decoding & representing video objects; supports content-based indexing & retrieval
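A minimal Python sketch of the block-matching step described above (tiny synthetic frames and an exhaustive search within a small window; real encoders are far more elaborate):

# Minimal block-matching sketch: for one block of the current frame, find the
# most similar block in the previous frame and keep the displacement + difference.

def block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]

def block_difference(b1, b2):
    # Sum of absolute pixel differences between two equally sized blocks.
    return sum(abs(p - q) for r1, r2 in zip(b1, b2) for p, q in zip(r1, r2))

def best_match(prev_frame, cur_block, top, left, size, search=2):
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= len(prev_frame) - size and 0 <= x <= len(prev_frame[0]) - size:
                d = block_difference(cur_block, block(prev_frame, y, x, size))
                if best is None or d < best[0]:
                    best = (d, (dy, dx))
    return best  # (pixel difference, displacement) is what gets stored or sent

# Two tiny 6x6 "frames": the bright square moves one pixel to the right.
prev = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
for y in range(2, 4):
    prev[y][1] = prev[y][2] = 255
    cur[y][2] = cur[y][3] = 255

print(best_match(prev, block(cur, 2, 2, 2), 2, 2, 2))  # difference 0, displacement (0, -1)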

Page 16: Introduction to Multimedia Databases

How to search for images or multimedia data?

• Analyze the objects one by one? No! That takes too long!
• Have to use metadata – instead of searching the data directly, search the metadata that has been added to it
• Metadata requirements to be valuable for searching:
  – The description of the multimedia object should be as complete as possible
  – Storage of metadata must not take too much overhead
  – Comparison of two metadata values must be fast

Page 17: Introduction to Multimedia Databases

Metadata of Multimedia Objects

• Descriptive data
  – Gives format or factual info about the multimedia object
  – Eg.: author name, creation date, length of the multimedia object, representation technique
  – Eg. standard for descriptive data: Dublin Core
  – Can use SQL (metadata condition in the “WHERE” clause)

Page 18: Introduction to Multimedia Databases

Metadata of Multimedia Objects (cont.)

• Annotations
  – Textual description of the contents of objects
  – Eg.: photo descriptions in Facebook
  – Either free format or a sequence of keywords
  – Manual text annotations allow Information Retrieval techniques to be used, but
    • Time consuming, expensive
    • Subjective, incomplete
  – Structured concepts (eg. semantic web, ER-like schemas) can be used to describe content through concepts, their relationships to each other & to the MM object, but
    • Also slow and expensive

Page 19: Introduction to Multimedia Databases

Metadata of Multimedia Objects (cont.)

• Features
  – Characteristics derived from the MM object itself
  – Need a language to describe features, eg. MPEG-7
  – The process of capturing features from an MM object is called feature extraction
    • Performed automatically, sometimes with human support
  – Two feature classes
    • Low-level features
    • High-level features

Page 20: Introduction to Multimedia Databases

Low-level Features

• Grasp data patterns & statistics of the MM object
• Depend strongly on the medium
• Extraction performed automatically
• Eg. for text
  – List of keywords with frequency indicators
• Eg. for audio
  – Representation
    • Amplitude-time sequence: quantification of air pressure at each sample
    • Silence: 0; above the silence level: positive amplitude; below: negative amplitude
  – Eg. of low-level features derived
    • Energy (loudness of the signal), ZCR (zero crossing rate – frequency of sign changes; high values indicate speech), silence ratio (low indicates music)
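A minimal Python sketch of these three audio features over a list of samples (the silence threshold of 0.05 and the toy signal are assumed values for illustration):

# Minimal sketch: energy, zero crossing rate (ZCR) and silence ratio of a signal.
import math

def energy(samples):
    return sum(s * s for s in samples) / len(samples)

def zero_crossing_rate(samples):
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    return crossings / (len(samples) - 1)

def silence_ratio(samples, threshold=0.05):   # threshold is an assumed value
    return sum(1 for s in samples if abs(s) < threshold) / len(samples)

# Toy signal: a 440 Hz tone sampled at 8 kHz, followed by a stretch of silence.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
signal = tone + [0.0] * 200

print("energy:", round(energy(signal), 3))
print("ZCR:", round(zero_crossing_rate(signal), 3))
print("silence ratio:", round(silence_ratio(signal), 3))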

Page 21: Introduction to Multimedia Databases

Low-level features (cont.)

• Eg. for images
  – Color histograms: # pixels having a color in a certain range
  – Spatial relationships: eg. a blue pattern appears above a yellow one (beach photo)
  – Contrast: # dark spots neighboring light spots
• Eg. for video
  – Use the low-level features for images
  – Eg. of the temporal dimension: shot change – when the pixel difference between two images is higher than a certain threshold
  – Shot: a sequence of images taken with the same camera position
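A minimal Python sketch of a coarse color histogram and of histogram-difference shot-change detection (the 4-bucket quantization, the threshold and the toy frames are assumed values):

# Minimal sketch: coarse color histogram per frame, and a shot change flagged
# when the difference between consecutive histograms exceeds a threshold.

def histogram(frame, buckets=4):
    # frame: list of (r, g, b) pixels with values 0-255; quantize each channel.
    hist = {}
    step = 256 // buckets
    for r, g, b in frame:
        key = (r // step, g // step, b // step)
        hist[key] = hist.get(key, 0) + 1
    return hist

def histogram_difference(h1, h2):
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0) - h2.get(k, 0)) for k in keys)

def shot_changes(frames, threshold=50):        # threshold is an assumed value
    hists = [histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_difference(hists[i - 1], hists[i]) > threshold]

blue_sky = [(30, 60, 200)] * 100               # two similar "sky" frames...
indoor = [(120, 80, 40)] * 100                 # ...then a very different frame
print(shot_changes([blue_sky, blue_sky, indoor]))   # -> [2]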

Page 22: Introduction to Multimedia Databases

High-level features

• Features which are meaningful to the end user, such as golf course, forest
• How can we bridge the semantic gap between low-level and high-level features?
  – High-level feature extraction from low-level features
  – Eg. text containing the words “football”, “referee” – football match text
  – Eg. speech-to-text translators (low-level audio features to text)
  – Eg. video, domain-specific: a loud sound from the crowd, a round object passing a white line, followed by a sharp whistle – a goal
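A minimal Python sketch of the keyword-based example above: mapping low-level text features (keywords) to a high-level label with a hand-written, domain-specific rule (the rules and labels are illustrative assumptions only):

# Minimal sketch: infer a high-level label from low-level keyword features
# using a hand-written, domain-specific rule.

RULES = {
    "football match": {"football", "referee"},   # illustrative rule only
    "tennis match": {"tennis", "serve"},
}

def high_level_label(keywords):
    for label, required in RULES.items():
        if required <= keywords:                  # all required keywords present
            return label
    return "unknown"

doc_keywords = {"football", "referee", "goal", "crowd"}
print(high_level_label(doc_keywords))             # -> football match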

Page 23: Introduction to Multimedia Databases

Multimedia Information Retrieval System (MIRS)

Page 24: Introduction to Multimedia Databases

Component of MIRS - Archiving

• MM data is stored separately from its metadata
  – Voluminous
  – Visible or audible delays in playback are unacceptable
• MM data is managed separately in an MM content server
  – Objects get an identification at storage time, to be used by other parts of the MIRS
  – Has to deal with compression and protection

Page 25: Introduction to Multimedia Databases

Component of MIRS – Feature Extraction (Indexing)

• Extraction of metadata (annotations, descriptions, features) from incoming multimedia objects
• Algorithms have to consider extraction dependencies. Eg.:
  – Segment the video object, choose a key frame for each segment
  – Extract low-level features from the key frame
  – Based on the low-level features, classify into shots of the audience, fields, close-ups
  – For field shots, detect the positions of players
  – Extract body-related features of the players
  – Determine where net play begins and ends
• Has to consider incremental maintenance (modification of MM objects, extractors, extraction dependencies)

Page 26: Introduction to Multimedia Databases

Incremental Maintenance in ACOI Feature Extraction Architecture

Page 27: Introduction to Multimedia Databases

Component of MIRS - Searching

• Multimedia queries are diverse and can be specified in many different ways
• No exact match; there are many ways to describe MM objects
• Specifying the information need
  – Direct – the user specifies the info need herself
  – Indirect – the user relies on other users

Page 28: Introduction to Multimedia Databases

Possible Querying Scenarios

Page 29: Introduction to Multimedia Databases

Possible Querying Scenarios (cont.)

• Queries based on Profile
  – Users expose their preferences in one way or another
  – Preferences are stored in a user profile in the MIRS
  – Can use the profile of a “friend” if not sure & trusted
• Queries based on Descriptive Data
  – Based on format and facts about the MM object
  – Eg. all movies with Director = “Steven Spielberg”

Page 30: Introduction to Multimedia Databases

Possible Querying Scenarios (cont.)

• Queries based on Annotations
  – Text-based: keywords or natural language
  – Eg. “Show me the video in which Barack Obama shakes hands with Mahathir Mohamad”
    • A set of keywords is derived from the query & compared with the keywords in the annotations of movies
• Queries based on Features
  – Content-based queries
  – Features derived (semi-)automatically from the content of the MM object
  – Low & high-level features are used
  – Eg. “Find all photos with a color distribution like this photo”
  – Eg. “Give me all football videos in which a goal is scored within the last ten minutes”
    • “Goal” is a high-level feature that must be known to the MIRS

Page 31: Introduction to Multimedia Databases

Possible Querying Scenarios (cont.)

• Query by example
  – Give an example MM object
  – The MIRS extracts all kinds of features from the MM object
  – The resulting query is based on these features
• Similarity
  – Degree to which the query & an MM object in the MIRS are similar
  – Similarity is calculated by the MIRS based on the metadata of the MM object & the query
  – Tries to estimate the relevance of the MM object to the user
  – Output is a list of MM objects in descending order of similarity value
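A minimal Python sketch of query-by-example ranking over feature vectors (cosine similarity over assumed color-histogram-like vectors; a real MIRS would combine many features and similarity measures):

# Minimal sketch: rank stored objects by cosine similarity of feature vectors
# against the features extracted from the example object.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Assumed, illustrative feature vectors for a few stored photos.
collection = {
    "beach.jpg":  [0.1, 0.2, 0.7],
    "forest.jpg": [0.1, 0.8, 0.1],
    "sunset.jpg": [0.6, 0.2, 0.2],
}
query_features = [0.2, 0.2, 0.6]      # features extracted from the example photo

ranked = sorted(collection.items(),
                key=lambda item: cosine_similarity(query_features, item[1]),
                reverse=True)          # descending order of similarity
for name, features in ranked:
    print(name, round(cosine_similarity(query_features, features), 3))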

Page 32: Introduction to Multimedia Databases

General Retrieval Model

Page 33: Introduction to Multimedia Databases

Relevance Feedback

• Helps when the user does not know exactly what he is looking for, which causes problems in query formulation
• Interactive approach
• The user issues a starting query, the MIRS composes a result set, the user judges the output (relevant/not relevant), and the MIRS uses this feedback to improve the retrieval process
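One common way to use such feedback (not named on the slide) is Rocchio-style query modification: move the query feature vector towards the judged-relevant results and away from the non-relevant ones. A minimal Python sketch, with assumed weights and vectors:

# Minimal Rocchio-style sketch: adjust the query feature vector using the
# vectors of objects the user judged relevant or non-relevant.

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    # alpha/beta/gamma are assumed, commonly used weights.
    def mean(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(query))]

    rel_mean, nonrel_mean = mean(relevant), mean(non_relevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel_mean, nonrel_mean)]

query = [0.2, 0.2, 0.6]
relevant = [[0.1, 0.3, 0.6], [0.2, 0.4, 0.4]]      # judged relevant by the user
non_relevant = [[0.9, 0.05, 0.05]]                 # judged not relevant

print(rocchio(query, relevant, non_relevant))      # updated query for the next round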

Page 34: Introduction to Multimedia Databases

Component of MIRS - Browsing

• Users sometimes cannot precisely specify what they want, but can recognize it when they see it
• Browsing lets the user scan through objects
  – Exploits hyperlinks which lead the user from one object to another
  – When an object is shown, the user judges its relevance & proceeds accordingly
  – If objects are huge, icons are used
• Starting point
  – A query that describes the info need, or the system provides a starting point
  – The user can ask for another starting point if not satisfied
  – Objects can be classified by topics & subtopics

Page 35: Introduction to Multimedia Databases

Component of MIRS – Output Presentation (Play)

• When the MIRS returns a list of objects, the system has to decide whether the user has the right to see them
• The user interface should be able to show all kinds of MM data
• What if objects are huge and the result set is large?
  – Give the user a perception of the content of each object
  – Extract & present essential info for the user to browse & select objects
    • Text: title, summary, places where keywords occur
    • Audio: tune, start of a song
    • Images: summary of images – thumbnails
    • Video: cut into scenes and choose a prime image for each scene

Page 36: Introduction to Multimedia Databases

Component of MIRS – Output Presentation (cont.)

• Streaming
  – Content is sent to the client at a specific rate and, except for buffering, played directly
  – Audio & video are delivered as a continuous stream of packets
  – When resources become scarce:
    • Use switched Ethernet instead of shared Ethernet
    • Use disk striping
    • Skip frames during play-back
    • Fragment content over several content servers (need a logical component between client & servers to direct client requests to the corresponding server)

Page 37: Introduction to Multimedia Databases

Quality of MIRS

• Recall = r / R

• Precision = r / n

• Relevance is judged by humans; refer to TREC (Text Retrieval Conference)

r: # relevant objects returned by the system, n: # objects retrieved, R: # relevant objects in the collection
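A quick worked example of both measures (the numbers are made up purely for illustration):

# Worked example of recall and precision (made-up numbers).
r = 8      # relevant objects returned by the system
n = 20     # objects retrieved in total
R = 40     # relevant objects in the whole collection

recall = r / R         # fraction of all relevant objects that were found
precision = r / n      # fraction of retrieved objects that are relevant
print("recall:", recall, "precision:", precision)   # 0.2 and 0.4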

Page 38: Introduction to Multimedia Databases

Exercise

• Discuss the role of a DBMS in storing MM objects

• Discuss the role of Information Retrieval systems in storing MM objects

Page 39: Introduction to Multimedia Databases

End of Module 1