Special Topics in Computer Science: Advanced Topics in Information Retrieval. Lecture 5 (book chapter 11): Multimedia IR: Models and Languages. Alexander Gelbukh, www.Gelbukh.com

Mar 27, 2015


Aaron Archer
Transcript
Page 1: Special Topics in Computer Science Advanced Topics in Information Retrieval Lecture 5 (book chapter 11) : Multimedia IR: Models and Languages Alexander.

Special Topics in Computer Science

Advanced Topics in Information Retrieval

Lecture 5 (book chapter 11):

Multimedia IR: Models and Languages

Alexander Gelbukh

www.Gelbukh.com


Previous Chapter: Conclusions

Inverted files seem to be the best option
Other structures are good for specific cases
o Genetic databases
Sequential searching is an integral part of many indexing-based search techniques
o Many methods to improve sequential searching

Compression can be integrated with search


Previous Chapter: Research topics

Possibly, new details in the integration of compression and search
“Linguistic” indexing: allowing linguistic variations
o Search in plural or only singular
o Search with or without synonyms


Motivation

Applications:
o office,
o CAD,
o medical,
o Internet
Example:
o An artist sings a melody and sees all the songs with a similar melody


What’s different

Different from text IR:
o Structure of data is more complex. Efficiency is an issue
o Use of metadata
o Characteristics of multimedia data
o Operations to be performed
Aspects:
o Data modeling: extract and maintain the features of objects
o Data retrieval: based not only on description but also on content


Retrieval process

Query specification
o fuzzy predicates: similar to
o content predicates: images containing an apple
o data type predicates: video, ...
Query processing and optimization
o Parsed, compiled, optimized for order of execution
o Problem: many data types, different processing for each
Answer
o Relevance: similarity to query
Iteration
o Answer quality is often poor, so the query needs to be refined


Modeling


Data modeling

To model is to simplify, in order to make manageable. “We will represent an image as...”
o From the user’s point of view
o From the system’s point of view (technically)
A problem: very large storage size. Modeling needed
Objects are represented as feature vectors
o Images / Video: shape. House, car, ...
o Sound: style. Music: Merry, sad, ...
Features are defined directly or by comparison
o Degree of certainty is stored
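
One way to picture a feature vector with stored certainty is the minimal Python sketch below. The class name `MediaObject`, the feature names, and the certainty values are assumptions made for illustration; they are not taken from the lecture or from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    """A stored object reduced to features; each feature has a certainty in [0, 1]."""
    name: str
    features: dict = field(default_factory=dict)  # feature name -> certainty

    def certainty(self, feature: str) -> float:
        # A feature that was never detected is treated as certainty 0.
        return self.features.get(feature, 0.0)


# Hypothetical objects: an image described by shapes, a song described by style.
photo = MediaObject("photo_17", {"house": 0.9, "car": 0.4})
song = MediaObject("track_03", {"merry": 0.7, "sad": 0.1})
```

Storing the certainty next to each feature is what later allows fuzzy matching and ranking instead of a plain yes/no answer.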


Multimedia support in commercial DBMSs (1999)

Variable length data.
o Non-standard
o Different and usually very limited sets of operations
SQL3:
o provides user-extensible data types
o Object-oriented
o Implemented partially in many systems
Example: data blades of Informix
o Content-based functions on text and images
o E.g.: date = 1997 AND contains (car)
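
The combination of an exact attribute condition with a content-based function can be sketched in Python as follows. This is not Informix syntax or its API: the record layout, the `contains` helper, and the sample data are all hypothetical.

```python
# Hypothetical records: an exact attribute (year) plus objects recognized in the image.
images = [
    {"id": 1, "year": 1997, "objects": {"car", "road"}},
    {"id": 2, "year": 1997, "objects": {"house"}},
    {"id": 3, "year": 1995, "objects": {"car"}},
]


def contains(image, label):
    """Content-based predicate: was an object with this label found in the image?"""
    return label in image["objects"]


# Rough equivalent of: date = 1997 AND contains (car)
hits = [img["id"] for img in images if img["year"] == 1997 and contains(img, "car")]
```

Here only image 1 satisfies both conditions. In a real DBMS the content predicate would be evaluated by an extension function rather than a set lookup.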


Spatial data types

Informix: 2D, 3D data blades
Boxes, vectors, ...
Operations: intersect, contains, center, ...
Text: containWords, ....
Supports querying images by content
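
The spatial operations named above can be illustrated with a small Python sketch of a 2D box type. The `Box` class and its method names are assumptions chosen to mirror the slide; they do not reproduce the Informix interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned 2D box given by its lower-left and upper-right corners."""
    x1: float
    y1: float
    x2: float
    y2: float

    def intersects(self, other: "Box") -> bool:
        # Boxes overlap unless one lies entirely to one side of the other.
        return not (self.x2 < other.x1 or other.x2 < self.x1 or
                    self.y2 < other.y1 or other.y2 < self.y1)

    def contains(self, other: "Box") -> bool:
        # True when `other` lies completely inside this box (borders included).
        return (self.x1 <= other.x1 and self.y1 <= other.y1 and
                self.x2 >= other.x2 and self.y2 >= other.y2)

    def center(self) -> tuple:
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


outer = Box(0, 0, 4, 4)
inner = Box(1, 1, 2, 2)
far = Box(5, 5, 6, 6)
```

With these values, `outer.contains(inner)` holds, `outer.intersects(far)` does not, and `outer.center()` is the point (2.0, 2.0).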


Example: MULTOS

Multimedia document server
Documents are described by:
o logical structure: title, intro, chapter, ...
o layout structure: pages, frames, ...
o conceptual structure: allows content-based queries
o Docs similar in conceptual structures are grouped into conceptual types
o Example: Generic_Letter


Example of conceptual structure...


...continued


Image data in MULTOS

Analysis
o low level: detect objects and positions
o high level: image interpretation
Result of analysis:
o description of objects found and their classes
o certainty values
Indices are used for fast access to this info
o Object index. Includes pointers to objects and certainty values
o Cluster index, with fuzzy clusters of similar images
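
An object index of the kind described above might look like the following Python sketch. It only illustrates the idea of pointers plus certainty values; the data layout, the `lookup` function, and the sample analysis results are assumptions, not the actual MULTOS structures.

```python
from collections import defaultdict

# Hypothetical output of image analysis: image id -> {object class: certainty}.
analysis = {
    "img1": {"house": 0.9, "tree": 0.6},
    "img2": {"house": 0.5},
    "img3": {"car": 0.8},
}

# Object index: object class -> list of (image id, certainty) pointers.
object_index = defaultdict(list)
for image_id, found in analysis.items():
    for object_class, certainty in found.items():
        object_index[object_class].append((image_id, certainty))


def lookup(object_class, min_certainty=0.0):
    """Return images containing the class, most certain first."""
    entries = [e for e in object_index.get(object_class, []) if e[1] >= min_certainty]
    return sorted(entries, key=lambda e: e[1], reverse=True)
```

A call such as `lookup("house", min_certainty=0.6)` then answers a content predicate without re-analyzing any image.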


Internet

How does Google do it?
No image processing. Textual context!
File names, nearby words
Distance from image to words
“give me images with flower in the file name or near the image”
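
The text-context idea can be sketched in Python as a scoring function. This is a toy illustration, not a description of how any search engine actually works: the function name `score`, the `window` parameter, and the decay formula are all assumptions.

```python
def score(query, file_name, page_words, image_position, window=5):
    """Return 1.0 for a file-name hit, else a score that decays with word distance.

    `image_position` is the index in `page_words` at which the image appears.
    """
    query = query.lower()
    if query in file_name.lower():
        return 1.0
    distances = [abs(i - image_position)
                 for i, w in enumerate(page_words) if w.lower() == query]
    if not distances or min(distances) > window:
        return 0.0
    return 1.0 / (1 + min(distances))


words = "a photo of a flower in my garden".split()
```

An image named `red_flower.jpg` scores 1.0 for the query "flower" regardless of the page text, while an image named `img01.jpg` placed two words away from "flower" gets a lower, distance-based score.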


Languages


Query languages

As a query, either a description of the object or an example object is submitted
o “show me images similar to this one”
o in what respects similar?!

Exact match is inadequate. Additional means are needed

Content is not a single feature


What defines a query language

Interface. How to enter the query
Types of conditions to specify
Handling of uncertainty, proximity, weights


Interface

Browsing and navigation
Search: description or query by example
Query by example:

o specify what features are important. Give me all houses with similar shape but different colors

o Libraries of examples can be provided
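
Query by example with a choice of important features ("similar shape but different colors") can be sketched in Python as below. The feature vectors, the thresholds `max_shape` and `min_color`, and the use of Euclidean distance are assumptions made for illustration.

```python
import math

# Hypothetical feature vectors: each image has a "shape" and a "color" descriptor.
images = {
    "example": {"shape": (1.0, 0.0), "color": (1.0, 0.0, 0.0)},
    "house_a": {"shape": (0.9, 0.1), "color": (0.0, 0.0, 1.0)},
    "house_b": {"shape": (0.0, 1.0), "color": (1.0, 0.0, 0.0)},
}


def similar_shape_different_color(example_id, max_shape=0.3, min_color=0.5):
    """Images whose shape is close to the example's but whose color is far from it."""
    ex = images[example_id]
    return [name for name, f in images.items()
            if name != example_id
            and math.dist(f["shape"], ex["shape"]) <= max_shape
            and math.dist(f["color"], ex["color"]) >= min_color]
```

With this data the query returns only `house_a`: its shape is close to the example and its color differs, whereas `house_b` has the same color but a different shape.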


Conditions...

Attribute predicates
o structured content – the predefined types extracted beforehand
o Exact match. E.g.: size, type (video, audio, ...)
Structural predicates
o structure: title, sections, ...
o metadata are used. Find objects containing an image and a video clip
Semantic predicates
o unrestricted content.
o Find all red houses: red = ?, house = ? Fuzzy


... conditions

Predicates
o Spatial: contain, intersect, is contained in, is adjacent to ...
o Temporal: Find audio where first politics and then economy is discussed
o Spatial and temporal predicates can be combined: Find clips where the logo disappears and then a graph appears at the same place
A predicate can be applied to a part of a document
o As path expressions in OO databases
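
The temporal example ("first politics and then economy") can be sketched in Python over annotated audio segments. The segment annotation format, the topic labels, and the `discussed_before` helper are hypothetical; a real system would first have to derive such annotations from the audio.

```python
# Hypothetical annotation: each recording is a list of (start_second, topic) segments.
recordings = {
    "news_1": [(0, "politics"), (120, "sports"), (300, "economy")],
    "news_2": [(0, "economy"), (200, "politics")],
    "news_3": [(0, "weather")],
}


def discussed_before(segments, first, then):
    """Temporal predicate: some `first` segment starts before some `then` segment."""
    first_starts = [t for t, topic in segments if topic == first]
    then_starts = [t for t, topic in segments if topic == then]
    return (bool(first_starts) and bool(then_starts)
            and min(first_starts) < max(then_starts))


matches = [name for name, segs in recordings.items()
           if discussed_before(segs, "politics", "economy")]
```

Only `news_1` matches here, because in `news_2` the economy segment comes before politics.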


Uncertainty, proximity, weights

Similarity function
The user can assign importance weights to individual predicates in a complex query
This gives ranking, as in text IR
The same models can be used, e.g., probabilistic model
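
How user-assigned weights turn per-predicate scores into a ranking can be sketched in Python as follows. The weighted average used here is just one simple choice of similarity function, and the scores and weights are made-up values.

```python
def weighted_score(predicate_scores, weights):
    """Weighted average of per-predicate similarity scores, each in [0, 1]."""
    total = sum(weights.values())
    return sum(weights[p] * predicate_scores.get(p, 0.0) for p in weights) / total


# Hypothetical per-predicate scores for three candidate images.
candidates = {
    "img1": {"shape": 0.9, "color": 0.2},
    "img2": {"shape": 0.5, "color": 0.9},
    "img3": {"shape": 0.1, "color": 0.1},
}
weights = {"shape": 3.0, "color": 1.0}  # the user cares mostly about shape

ranking = sorted(candidates,
                 key=lambda c: weighted_score(candidates[c], weights),
                 reverse=True)
```

With shape weighted three times as heavily as color, `img1` ranks above `img2`; making the two weights equal would reverse their order.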


Examples of query languages: SQL3

Functions and stored procedures: user-defined data manipulation
Active database support: the database reacts to events, not only commands. This enforces integrity constraints
Good news: rather standard
Bad news: no ranking supported!
Effort to integrate SQL3 with IR techniques.
SQL/MM Full Text and other similar languages


... examples: MULTOS

One of the design goals: easy navigation
o Paths are supported
Identification of components by type, not by position
o All images in the document, not the image in 3rd chapter
Types of predicates:
o on data attributes, on textual components, on images (image type, objects contained, ...)

Example:


MULTOS example


Another example of MULTOS


Research topics

How can a similarity function be defined?
What features of images (video, sound) are there?
How to better specify the importance of individual features? (Give me similar houses: similar = size? color? structure? architectural style?)
How to determine the objects in an image?
Integration with DBMSs and SQL for fast access and rich semantics
o Integration with XML
o Ranking: by similarity, taking into account history, profile


Conclusions

Basically, images are handled as text that describes them
o Namely, feature vectors (or feature hierarchies)
o Context can be used when available to determine features
Also, queries by example are common
From the point of view of DBMS, integration with IR and multimedia-specific techniques is needed
o Object-oriented technology is adequate


Thank you!
Till ??, 6 pm