Special Topics in Computer Science
Advanced Topics in Information Retrieval
Lecture 5 (book chapter 11): Multimedia IR: Models and Languages
Alexander Gelbukh
www.Gelbukh.com
Inverted files seem to be the best option
Other structures are good for specific cases
o Genetic databases
Sequential searching is an integral part of many indexing-based search techniques
o Many methods to improve sequential searching
Compression can be integrated with search
Previous Chapter: Research topics
Perhaps, new details in integration of compression and search
“Linguistic” indexing: allowing linguistic variations
o Search in plural or only singular
o Search with or without synonyms
Motivation
Applications:
o office,
o CAD,
o medical,
o Internet
Example:
o An artist sings a melody and sees all the songs with a similar melody
What’s different
Different from text IR:
o Structure of data is more complex. Efficiency is an issue
o Use of metadata
o Characteristics of multimedia data
o Operations to be performed
Aspects:
o Data modeling: extract and maintain the features of objects
o Data retrieval: based not only on description but also on content
Retrieval process
Query specification
o fuzzy predicates: similar to
o content predicates: images containing an apple
o data type predicates: video, ...
Query processing and optimization
o Parsed, compiled, optimized for order of execution
o Problem: many data types, different processing for each
Answer
o Relevance: similarity to query
Iteration
o Bad quality, so need to refine
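The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MediaObject class, the two-number feature vectors, and the way relevance is computed (content certainty times feature similarity) are all hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class MediaObject:
    name: str
    media_type: str                                # data type: "image", "video", ...
    contents: dict = field(default_factory=dict)   # detected object -> certainty in [0, 1]
    features: tuple = ()                           # hypothetical numeric feature vector


def similarity(a, b):
    """Map Euclidean distance to a similarity score in (0, 1]."""
    dist = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


def search(collection, media_type, must_contain, example_features):
    """Filter by the data type and content predicates, then rank by relevance."""
    answers = []
    for obj in collection:
        if obj.media_type != media_type:                  # data type predicate
            continue
        certainty = obj.contents.get(must_contain, 0.0)   # content predicate
        if certainty == 0.0:
            continue
        # Fuzzy predicate: "similar to" the example, weighted by certainty.
        relevance = certainty * similarity(obj.features, example_features)
        answers.append((relevance, obj.name))
    return sorted(answers, reverse=True)


collection = [
    MediaObject("a.jpg", "image", {"apple": 0.9}, (1.0, 0.0)),
    MediaObject("b.jpg", "image", {"apple": 0.5}, (1.0, 0.0)),
    MediaObject("c.avi", "video", {"apple": 0.9}, (1.0, 0.0)),
    MediaObject("d.jpg", "image", {"car": 0.8}, (1.0, 0.0)),
]
ranked = search(collection, "image", "apple", (1.0, 0.0))
```

The answer is a ranked list, not a set: the user inspects it and, if the quality is bad, changes the example or the predicates and runs the query again.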
Modeling
Data modeling
To model is to simplify, in order to make manageable. “We will represent an image as...”
o From the user’s point of view
o From the system’s point of view (technically)
A problem: very large storage size. Modeling needed
Objects are represented as feature vectors
o Images / Video: shape. House, car, ...
o Sound: style. Music: Merry, sad, ...
Features are defined directly or by comparison
o Degree of certainty is stored
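As a rough illustration of a feature defined by comparison, the sketch below assigns a music style by finding the closest prototype and stores a degree of certainty with it. The prototypes, the two measurements (tempo, brightness), and the certainty formula are assumptions made for this example only.

```python
from math import sqrt

# Hypothetical prototypes: (tempo, brightness), both scaled to [0, 1].
PROTOTYPES = {"merry": (0.9, 0.8), "sad": (0.2, 0.3)}


def distance(a, b):
    """Euclidean distance between two measurements."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(measurement):
    """Return (label, certainty) for the closest prototype.

    Certainty is one minus the closest prototype's share of the total
    distance, so it equals 1.0 on an exact match with a prototype.
    """
    distances = {label: distance(measurement, p) for label, p in PROTOTYPES.items()}
    label = min(distances, key=distances.get)
    total = sum(distances.values())
    certainty = 1.0 if total == 0 else 1.0 - distances[label] / total
    return label, certainty


# The feature vector stores the value together with its certainty.
feature_vector = {"style": classify((0.8, 0.7))}
```

A feature defined directly would skip the comparison step: the value is entered or measured as is, with certainty 1.0.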
Multimedia support in commercial DBMSs (1999)
Variable length data.
o Non-standard
o Different and usually very limited sets of operations
SQL3:
o provides user-extensible data types
o Object-oriented
o Implemented partially in many systems
Example: data blades of Informix
o Content-based functions on text and images
o E.g.: date = 1997 AND contains (car)
Spatial data types
Informix: 2D, 3D data blades
Boxes, vectors, ...
Operations: intersect, contains, center, ...
Text: containWords, ...
Supports querying images by content
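To make the spatial operations concrete, here is a minimal Python sketch of a 2D box type with intersect, contains, and center. It is not the Informix interface: the Box class and its method names are hypothetical and only mirror the operations listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned 2D box with x1 <= x2 and y1 <= y2."""
    x1: float
    y1: float
    x2: float
    y2: float

    def intersects(self, other):
        """True if the two boxes share at least one point."""
        return (self.x1 <= other.x2 and other.x1 <= self.x2
                and self.y1 <= other.y2 and other.y1 <= self.y2)

    def contains(self, other):
        """True if `other` lies entirely inside this box."""
        return (self.x1 <= other.x1 and other.x2 <= self.x2
                and self.y1 <= other.y1 and other.y2 <= self.y2)

    def center(self):
        """Midpoint of the box as an (x, y) pair."""
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


page = Box(0, 0, 10, 10)
photo = Box(2, 2, 6, 4)
margin_note = Box(9, 9, 12, 12)
```

With these values, page contains photo, page intersects margin_note without containing it, and photo and margin_note do not intersect.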
Example: MULTOS
Multimedia document server
Documents are described by:
o logical structure: title, intro, chapter, ...
o layout structure: pages, frames, ...
o conceptual structure: allows content-based queries
o Documents with similar conceptual structures are grouped into conceptual types
o Example: Generic_Letter
Example of conceptual structure (two slides; the figure is not preserved in this transcript)
Image data in MULTOS
Analysis
o low level: detect objects and positions
o high level: image interpretation
Result of analysis:
o description of objects found and their classes
o certainty values
Indices are used for fast access to this information
o Object index. Includes pointers to objects and certainty values
o Cluster index, with fuzzy clusters of similar images
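The two indices might look roughly as follows. This is a sketch, not the MULTOS data structures: the analysis results, the cluster names, and the membership degrees are invented for illustration.

```python
from collections import defaultdict

# Hypothetical analysis results: image -> {object class: certainty}.
analysis = {
    "img1": {"house": 0.9, "tree": 0.4},
    "img2": {"house": 0.6},
    "img3": {"car": 0.8},
}

# Object index: object class -> list of (certainty, image), best first.
object_index = defaultdict(list)
for image, found in analysis.items():
    for object_class, certainty in found.items():
        object_index[object_class].append((certainty, image))
for postings in object_index.values():
    postings.sort(reverse=True)

# Cluster index: fuzzy clusters, so an image may belong to several
# clusters, each with its own membership degree (assumed values).
cluster_index = {
    "buildings": {"img1": 0.9, "img2": 0.7},
    "nature": {"img1": 0.3},
}


def lookup(object_class, min_certainty=0.0):
    """Images containing the class with at least the given certainty."""
    return [image for certainty, image in object_index.get(object_class, [])
            if certainty >= min_certainty]


def cluster_members(cluster, min_degree=0.5):
    """Images that belong to the cluster with at least the given degree."""
    members = cluster_index.get(cluster, {})
    return sorted(image for image, degree in members.items() if degree >= min_degree)
```

Both lookups avoid scanning the images themselves: the expensive analysis is done once, and queries read only the stored descriptions and certainty values.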
Internet
How does Google do it?
No image processing. Textual context!
File names, nearby words
Distance from image to words
“Give me images with flower in the file name or near the image”
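The idea of searching images through their textual context can be sketched as below. This is not a description of how any real search engine is implemented; the function, the word-position model of a page, and the max_distance parameter are assumptions for illustration.

```python
def image_matches(page_words, image_position, file_name, term, max_distance=5):
    """True if `term` is in the file name or within `max_distance` words of the image.

    `page_words` is the page text as a list of words, and `image_position`
    is the word index at which the image appears on the page.
    """
    term = term.lower()
    if term in file_name.lower():
        return True
    for position, word in enumerate(page_words):
        if word.lower() == term and abs(position - image_position) <= max_distance:
            return True
    return False


words = "Our garden in spring a red flower near the old fence".split()
# Assume the image sits after the word "spring", at word index 4.
near_image = image_matches(words, 4, "img001.jpg", "flower")
```

No pixel of the image is examined: the match depends only on the file name and on the distance from the image to the words around it.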
Languages
Query languages
As a query, either a description of the object or an example object is submitted
o “show me images similar to this one”
o in what respects similar?!
Exact match is inadequate. Additional means are needed
Content is not a single feature
What defines a query language
Interface. How to enter the query
Types of conditions to specify
Handling of uncertainty, proximity, weights
Interface
Browsing and navigation
Search: description or query by example
Query by example:
o specify what features are important. Give me all houses with similar shape but different colors
o Libraries of examples can be provided
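One possible way to say which features matter in a query by example is to attach a weight to each feature. The sketch below assumes, purely for illustration, that shape and color are each reduced to a single number in [0, 1]; real features are far richer, and the weighting scheme is not taken from the lecture.

```python
def feature_similarity(a, b):
    """Similarity in [0, 1] for feature values scaled to [0, 1]."""
    return 1.0 - abs(a - b)


def score(example, candidate, weights):
    """Weighted sum of per-feature similarities.

    A positive weight rewards similarity in that feature,
    a negative weight rewards difference.
    """
    return sum(w * feature_similarity(example[f], candidate[f])
               for f, w in weights.items())


example = {"shape": 0.8, "color": 0.2}
houses = {
    "same_shape_same_color": {"shape": 0.8, "color": 0.2},
    "same_shape_other_color": {"shape": 0.8, "color": 0.9},
    "other_shape_other_color": {"shape": 0.1, "color": 0.9},
}

# "Similar shape but different colors": reward shape, penalize same color.
weights = {"shape": 1.0, "color": -0.5}
ranking = sorted(houses, key=lambda name: score(example, houses[name], weights),
                 reverse=True)
```

Changing the weights changes the meaning of "similar" without changing the example, which is why the query language needs a way to express them.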
Conditions...
Attribute predicates
o structured content: the predefined types extracted beforehand
o Exact match. E.g.: size, type (video, audio, ...)