A Hierarchical Framework for Content-Based Image Retrieval - Dipti Vaidya.

A Hierarchical Framework for Content-Based Image Retrieval

- Dipti Vaidya

Outline of Presentation

• Introduction and Motivation
• Related Work
• Shortcomings of Related Work
• Problem Statement
• Solution Approach
• Details
• Future Work

Introduction and Motivation

• Gigabytes of images are generated and stored every day

• Simple matching methods using text-based retrieval of these images are not appropriate

• The information must be organized to allow efficient browsing, searching, and retrieval


• This information, or metadata, can be split into 3 main categories:

1. Catalogue Info: type of image, author, date, etc.
2. Syntactic Content: information about primary features like color, texture, shape, and spatial relationships
3. Semantic Content: information or knowledge about the content of the image, that is, what the image represents. Example: a smiling girl represents a happy person.


CBIR systems fall into 2 main categories:

1. Image retrieval by Syntactic Content:
• Images can be presented to the system either as an actual image or as a sketch
• These images are processed and the closest matches are returned

2. Image retrieval by Semantic Content:
• Queries are posed and images are retrieved by matching the query to the encoded knowledge

Related Work

Syntactic Image Retrieval (RIE):
• Most RIE systems use color or texture features
• General approach: record a distribution of colors or textures in the images. Images with the smallest difference values from the example image are the matches.

For example, they use a simple color histogram to record the predominant colors in the following example image:
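The histogram approach can be sketched as follows. This is a minimal illustration, not the method of any particular RIE system: the quantization into 4 levels per channel, the L1 distance, and the toy "images" (flat lists of RGB pixels) are all assumptions made for the example.

```python
from collections import Counter

def color_histogram(pixels, levels=4):
    """Normalized histogram over quantized RGB colors (assumes a non-empty pixel list)."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}

def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 means identical)."""
    colors = set(h1) | set(h2)
    return sum(abs(h1.get(c, 0.0) - h2.get(c, 0.0)) for c in colors)

def rank_by_example(example, library):
    """Return library image names ordered from smallest to largest difference."""
    target = color_histogram(example)
    return sorted(
        library,
        key=lambda name: histogram_distance(target, color_histogram(library[name])),
    )

# Hypothetical toy images: each is just a list of RGB pixels.
BROWN, GREEN, BLUE = (150, 75, 0), (0, 160, 0), (0, 0, 200)
example = [BROWN] * 30 + [GREEN] * 70  # a brown blob on a green background
library = {
    "dog_on_grass": [BROWN] * 25 + [GREEN] * 75,
    "bear_in_forest": [BROWN] * 40 + [GREEN] * 60,
    "dog_on_beach": [BROWN] * 30 + [BLUE] * 70,
}
ranking = rank_by_example(example, library)
print(ranking)
```

In this toy library the matcher ranks the bear picture above the second dog picture, because it only compares color distributions.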

Related Work

This causes problems when querying. Instead of looking for pictures of a dog, the system looks for images which have a brown blob on a green background.

[Figure: query image and results]

Image not found, because the system has no notion of semantics

Related Work

Retrieval by Semantic Content:
• The knowledge-based spatial image model defines a 3-layer model for representing knowledge about the domain-specific content of the images (Chu, Hsu, Tiara)
• Ontology-based photo annotation: an agent, object, action approach using an ontology (University of Amsterdam)
• Structured knowledge representation for image retrieval using DL, based on image regions (Meghini et al.)

Related Work

Retrieval by semantic content has been shown to be successful, but it has the following drawbacks:

• Requires significant effort by domain experts during development

• Unlikely to be extensible beyond a specific problem domain

Related Work – Bob’s System

Proposes a system which uses the complementary strengths of semantic and syntactic retrieval methods:

• Create a small domain database that processes semantic queries related to the problem domain and generates a set of example images

• These synthesized example images can be sent to a larger image database for matching by example.

Advantages of Hierarchical Framework

• Provides a semantically-relevant method of querying an image database

• Decouples the knowledge organization from the image matching mechanism
– Requires expert involvement to encode knowledge of a smaller set of data
– Images and image matching algorithms in the large target database can change and improve with no impact on the knowledge
– Multiple, distributed domain databases can be used against one target database

Hierarchical CBIR Framework Diagram

[Diagram: Domain #1, Domain #2, ..., Domain #n databases each take a user semantic query, extract semantic values, and select/synthesize example images for image matching. The target database passes each example image through a pre-processor to a search engine over the pre-processed image library; its RETRIEVE function returns an ordered list of the closest matched images.]

Related Work – Bob’s System

Here's what Bob's system uses:
• Structured annotation (agent, action, object) to specify semantic values of interest
• A domain-specific ontology to represent agents and objects, with an image stored for each concept introduced in the ontology
• Spatial relationships to encode actions
• A query is posed, and it is semantically processed to generate a set of example images
• These images are then sent to GIFT for retrieval by example

Example (Bob's System)

Ontology
• object: two-wheelers (bike, motorbike), hat
• agent: man

Role
• Wears(agent, object) = object above agent
• Rides(agent, two-wheeler) = agent above two-wheeler

Sample database: man, bike, hat

Example (simple queries)
Query: man rides a bike
Query: man wears hat
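A minimal sketch of how roles defined as spatial relationships could drive the layout of a synthesized example image. The `Placement` type, the `ROLES` table, and the `layout` function are hypothetical names introduced for illustration; only the two "above" rules come from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Placement:
    concept: str
    row: int  # 0 is the top of the synthesized image

# Each role returns its two arguments in top-to-bottom order.
ROLES = {
    "wears": lambda agent, obj: [obj, agent],  # object above agent
    "rides": lambda agent, obj: [agent, obj],  # agent above two-wheeler
}

def layout(agent, role, obj):
    """Vertical ordering of the stored concept images for one (agent, role, object) query."""
    ordered = ROLES[role](agent, obj)
    return [Placement(concept, row) for row, concept in enumerate(ordered)]

print(layout("man", "rides", "bike"))
print(layout("man", "wears", "hat"))
```

A real synthesizer would also need horizontal alignment and scaling of the stored images; this sketch only captures the ordering.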

Issues With Bob’s System

• Scalability and maintenance: if we want to introduce a new concept in the domain ontology, we have to do it manually

• Cannot store composite images: there is no method to reuse the synthesized images

• The results returned from GIFT (the image matching server) could be improved so that they are more closely related to the domain query.

Problem Statement

• We need to build a system that represents image content in a way that is hierarchical, so that we can make semantic queries, and compositional, so that we can build complex terms from simple terms and thus reuse the synthesized images

• We need to retrieve better results from the image matching server, given example images.

Problem Formulation

• Let $D_A$ be a domain database, $S_{D_A}$ be the semantic values in $D_A$, and $Q$ be a query that resolves to a specific set of semantic values in $D_A$:

$\{S_{D_A}\} \leftarrow \mathrm{QUERY}(Q, D_A)$ (1)

• Furthermore, let $D_A$ have the property that each semantic value can be mapped to a set of images $\{I\}$:

$\{I\} \leftarrow \mathrm{MAP}(\{S_{D_A}\}, D_A)$ (2)

• Then there is a resultant mapping of $Q$ to a set of example images $\{I_{ex}\}$ that can be represented as:

$\{I_{ex}\} \leftarrow \mathrm{MAP}(\mathrm{QUERY}(Q, D_A), D_A)$ (3)

Problem Formulation (Cont)

• Now suppose an RIE database $T$ exists and has a RETRIEVE function which returns all images $\{i\}$ that match any of a set of example images $\{I_{ex}\}$:

$\{i\} \leftarrow \mathrm{RETRIEVE}(\{I_{ex}\}, T)$ (4)

• Combining equations (3) and (4) we get:

$\{i\} \leftarrow \mathrm{RETRIEVE}(\mathrm{MAP}(\mathrm{QUERY}(Q, D_A), D_A), T)$ (5)

• Equation (5) shows we can make a semantic query in one collection of knowledge ($D_A$) and retrieve matching images from another ($T$).

• This approach can be considered a hierarchy of RIE and RSC systems.
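The composition in equation (5) can be sketched as three small functions. Everything concrete here is an assumption for illustration: the domain database is a plain dictionary from semantic value to example image names, the query is resolved by word lookup, and the matcher passed to `retrieve` is a toy stand-in for a real image matching server.

```python
def query(q, domain_db):
    """QUERY(Q, DA): the semantic values of the domain database named in q."""
    words = q.lower().split()
    return {s for s in domain_db if s in words}

def map_images(semantic_values, domain_db):
    """MAP(S, DA): union of the example images stored for each semantic value."""
    examples = set()
    for s in semantic_values:
        examples |= domain_db[s]
    return examples

def retrieve(examples, target_db, matches):
    """RETRIEVE(Iex, T): target images that match any of the example images."""
    return {i for i in target_db if any(matches(ex, i) for ex in examples)}

def semantic_retrieve(q, domain_db, target_db, matches):
    """Equation (5): RETRIEVE(MAP(QUERY(Q, DA), DA), T)."""
    return retrieve(map_images(query(q, domain_db), domain_db), target_db, matches)

# Hypothetical data: image names stand in for real images.
domain_db = {"man": {"man_01"}, "bike": {"bike_01", "bike_02"}, "hat": {"hat_01"}}
target_db = {"photo_man_a", "photo_bike_b", "photo_tree_c"}

def same_concept(example, image):
    """Toy matcher: an example matches a target whose name mentions its concept."""
    return example.split("_")[0] in image.split("_")

result = semantic_retrieve("man rides a bike", domain_db, target_db, same_concept)
print(sorted(result))
```

Because `retrieve` only sees example images, the target database and its matcher can be replaced without touching the domain knowledge.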

Problem Formulation

In order to realize the hierarchical framework, we need to solve several problems:

• A method to define and encode domain knowledge such that we can use it for semantic queries

• A method to represent the semantic content of the composite images and map it to the domain knowledge base

• A method to define the actions

• A method to reuse the synthesized images

• A way to interface with the RIE such that we get more specific, domain-related results from the target database

The solutions to the above problems will be my contribution to the system.

Solution Approach

Our primary aim is to investigate whether Description Logics in general can be used to represent the contents of the domain-specific database in a way that is hierarchical and compositional

Or could we make do with Semantic Networks, thus reducing the computational cost?

Hierarchical CBIR System Diagram

[Diagram components: User Query; Racer DL system (domain knowledge base); Image DB; Query Results; Process Query Results; Image Synthesis; Synthesized Images; Feature selection; RIE - GIFT; Target DB; Returned images; R.F.]

Description Logic

What is Description Logic?
It is a language that allows reasoning about information, in particular supporting the classification of descriptors

Description Logic models a domain in terms of 3 things:

Individuals – which represent instances of the objects we are modeling

Concepts – denoting a collection of individuals or instances

Roles – relationships between or attributes of concepts or individuals

Example

• Concept Example
– person represents all human beings
– fruit represents all the fruits

• Individual Example
– Man, woman are individuals of the concept person
– Banana is an individual of the concept fruit

• Role Example
– Eat(person, fruit) is a relationship describing a person and something they are eating

DL

• Using these small blocks we can build more complex expressions

• Example
– Eat(person, fruits)
– Eat(person, fruits) & Sits(person, chair)

• Example
– cool-student = student & drives(student, Ferrari)

Reasoning with DL

• Subsumption ($\sqsubseteq$)
– Basic inferencing tool
– Checks whether one concept is more general than another
– Example: mother $\sqsubseteq$ woman

Reasoning with DL

Classification
– A collection of descriptions can be classified using subsumption, providing a hierarchy of descriptions ranging from general to specific.

Example
person driving car and wearing hat $\sqsubseteq$ person driving car $\sqsubseteq$ person

New concept: person wearing hat?
person driving car and wearing hat $\sqsubseteq$ person driving car $\sqsubseteq$ person
person wearing hat $\sqsubseteq$ person

Automatic Classification

• person
– person driving car
– – person driving car and wearing hat

New concept or query: person wearing hat?
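A minimal sketch of automatic classification, under a strong simplifying assumption: each description is a pure conjunction of atomic terms, so subsumption reduces to a subset test. A real DL reasoner such as Racer handles far richer descriptions; the `TAXONOMY` encoding and function names below are hypothetical.

```python
# Each description is the set of atomic conjuncts it requires.
TAXONOMY = {
    "person": frozenset({"person"}),
    "person driving car": frozenset({"person", "drives-car"}),
    "person driving car and wearing hat": frozenset({"person", "drives-car", "wears-hat"}),
}

def subsumes(general, specific):
    """general subsumes specific if every conjunct of general also appears in specific."""
    return general <= specific

def classify(new, taxonomy):
    """Return (most specific subsumers, most general subsumees) of a new description."""
    above = {n for n, d in taxonomy.items() if d < new}  # strictly more general
    below = {n for n, d in taxonomy.items() if d > new}  # strictly more specific
    parents = {n for n in above if not any(taxonomy[n] < taxonomy[m] for m in above)}
    children = {n for n in below if not any(taxonomy[m] < taxonomy[n] for m in below)}
    return parents, children

person_wearing_hat = frozenset({"person", "wears-hat"})
parents, children = classify(person_wearing_hat, TAXONOMY)
print(parents, children)
```

Here the new concept "person wearing hat" is placed directly below "person" and directly above "person driving car and wearing hat", without anyone stating those links by hand.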

Architecture of DL

• DL is described as being split into 2 parts: T-Box and A-Box
– T-Box => subsumption and classification
– A-Box => reasons about relationships between individuals, thus providing classification and retrieval

[Diagram labels: concepts mammal, vehicle, person, dog, bus, bike, person wearing cap, person driving bus, person wearing cap & driving bus; individuals Ted, Mary's neighbor, nimbus; role driving]

Details

Describing the semantic contents of the image:

We must describe three types of spaces: that of the images themselves, that of the real-world concepts they contain, and that of what each action or role means.

For example, the following image can be described as:

Image instance: Image1

Image1 contains (person driving car, wearing hat)

Image1 contains (person driving car)

Image1 contains (person wearing hat)

Image1 contains (person); Image1 contains (car)

Image1 contains (hat)

Tools Used

Describing the Semantic Content:

In order to describe the world-concepts type-space and progressively link the image instances with these concepts, we use a DL system called RACER

Racer DL
• RACER is a semantic web inference engine for developing ontologies
• RACER is a Description Logic reasoning system with support for
– TBoxes with generalized concept inclusions
– ABoxes

Example of Racer Files

T-Box:
(signature
  :atomic-concepts (human person female male woman man
                    parent mother father
                    grandmother aunt uncle
                    sister brother
                    only-child pet organization politicalorganisation
                    politician malepolitician image)
  :roles ((has-descendant :transitive t)
          (has-pet :domain person :range pet)
          (covers :transitive t :domain image :range human)
          (has-child :parent has-descendant
                     :domain parent
                     :range person)))

Example of Racer Files

A-Box:
(instance image01 image)
(instance image01 malepolitician)
(related image01 jonmajor covers)
(instance jonmajor malepolitician)
(instance alice mother)
(related alice betty has-child)
(related alice charles has-child)

Queries:
(concept-instances sister)
(concept-ancestors mother)
(concept-descendants man)
(individual-fillers alice has-descendant)
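To show why the last query returns anything at all, here is a toy re-creation of the role reasoning involved. It is a sketch under stated assumptions, not Racer's algorithm: it only models the role hierarchy (has-child has parent role has-descendant) and transitivity, using the A-Box assertions from the example above. The names `ROLE_PARENT`, `TRANSITIVE`, `RELATED`, and `individual_fillers` are chosen for illustration.

```python
ROLE_PARENT = {"has-child": "has-descendant"}  # from (has-child :parent has-descendant)
TRANSITIVE = {"has-descendant"}                # from (has-descendant :transitive t)
RELATED = [                                    # (subject, object, role) assertions
    ("alice", "betty", "has-child"),
    ("alice", "charles", "has-child"),
]

def role_and_ancestors(role):
    """The role itself followed by all of its parent roles."""
    chain = [role]
    while chain[-1] in ROLE_PARENT:
        chain.append(ROLE_PARENT[chain[-1]])
    return chain

def individual_fillers(individual, role):
    """Individuals related to `individual` via `role`, directly or by inference."""
    edges = {}
    for subj, obj, asserted in RELATED:
        if role in role_and_ancestors(asserted):  # sub-role assertions also count
            edges.setdefault(subj, set()).add(obj)
    fillers = set(edges.get(individual, ()))
    if role in TRANSITIVE:                        # follow chains of the role
        frontier = list(fillers)
        while frontier:
            for nxt in edges.get(frontier.pop(), ()):
                if nxt not in fillers:
                    fillers.add(nxt)
                    frontier.append(nxt)
    return fillers

print(sorted(individual_fillers("alice", "has-descendant")))
```

No has-descendant assertion exists in the A-Box; betty and charles are returned only because has-child is declared a sub-role of has-descendant.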

Identifying objects in an image

• Segment sections of images and associate them with concepts

[Figure: example image with segmented regions labeled "2 men standing" and "mountains"]

Identifying Objects in an Image

Implemented an image annotator which allows the user to identify the objects in the image and store information about their regions of interest

This data is stored in the form of an XML file, which can be parsed during the synthesis process.
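A minimal sketch of parsing such an annotation file during synthesis. The slides do not give the annotator's XML schema, so the element and attribute names below (`annotation`, `object`, `concept`, `region`, and the rectangle fields) are purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation file content for one image.
ANNOTATION_XML = """
<annotation image="image01.jpg">
  <object concept="man"><region x="40" y="10" width="60" height="150"/></object>
  <object concept="hat"><region x="55" y="0" width="30" height="15"/></object>
</annotation>
"""

def parse_annotation(xml_text):
    """Return the image name and a {concept: (x, y, width, height)} mapping."""
    root = ET.fromstring(xml_text.strip())
    regions = {}
    for obj in root.findall("object"):
        region = obj.find("region")
        regions[obj.get("concept")] = tuple(
            int(region.get(key)) for key in ("x", "y", "width", "height")
        )
    return root.get("image"), regions

image_name, regions = parse_annotation(ANNOTATION_XML)
print(image_name, regions)
```

With regions in this form, the synthesis step can crop each concept out of its source image and place it according to the role definitions.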

Resolving queries

Query: man driving a car and wearing a hat

1. Attempt to find an image described by the query (no synthesis needed); if not found, then

2. Break the query into components (synthesis needed): man driving a car, man wears a hat; if not found, then

3. Break the query into further components (synthesis needed): man, car, hat, actions… (for actions, we can use a geo-spatial model which maps the action; Bob's definitions can be used here)
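The fallback steps above can be sketched as a loop over decomposition levels. This is a simplification under stated assumptions: the decomposition of the query is supplied by hand rather than derived by the DL system, the image database is a plain dictionary, and a level is accepted only if every one of its components has a stored image. The names `resolve`, `levels`, and `image_db` are hypothetical.

```python
def resolve(levels, image_db):
    """Return (images, needs_synthesis) for the first level fully covered by image_db."""
    for depth, descriptions in enumerate(levels):
        if all(d in image_db for d in descriptions):
            # Level 0 is the whole query: one stored image, nothing to synthesize.
            return [image_db[d] for d in descriptions], depth > 0
    return [], False

# Hand-written decomposition of the query, from whole query down to atomic concepts.
levels = [
    ["man driving a car and wearing a hat"],
    ["man driving a car", "man wearing a hat"],
    ["man", "car", "hat"],
]
# Hypothetical domain image database.
image_db = {
    "man": "man.png",
    "car": "car.png",
    "hat": "hat.png",
    "man driving a car": "man_car.png",
}

images, needs_synthesis = resolve(levels, image_db)
print(images, needs_synthesis)
```

Here the first two levels fail (no stored image for the whole query or for "man wearing a hat"), so the atomic images are returned and flagged for synthesis.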

In DLs, the query language and the description language are unified

Example 2 (composite queries)
Query: girl rides a bike and wears a hat

Hierarchical CBIR System Diagram

[Diagram components: User Query; Racer DL system (domain knowledge base); Image DB; Query Results; Process Query Results; Image Synthesis; Synthesized Images; Feature selection; RIE - GIFT; Target DB; Returned images; R.F.]

RIE

The query is posed to generate a set of example images

These example images will then be sent to the Image Matching Server for Retrieval by Example from the target database

This server uses various image features such as color, texture, and shape to retrieve similar images

Problem 2

To improve the quality of the images retrieved from the image matching server, the returned images should be more domain specific

Suggestion: use feature selection (color, texture, or shape) for handling queries in RIE

Problem 2

In a CBIR system where feature selection is used, the following problems should be solved:

• It must select a subset of features that provides the best input to the matching algorithm of the server (GIFT)

• Since we want feature selection to take place at every query, it must be time efficient

• It must be able to handle example sets as small as 3 or 5 images
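One way to meet these three requirements can be sketched as follows. This heuristic is an assumption for illustration only, not the method proposed in the slides and not part of GIFT: it keeps the features on which the example images agree most, measured by the mean pairwise L1 distance between their feature vectors. The feature vectors and the `select_features` name are hypothetical.

```python
from itertools import combinations

def spread(vectors):
    """Mean pairwise L1 distance between feature vectors (needs at least 2 vectors)."""
    pairs = list(combinations(vectors, 2))
    total = sum(sum(abs(a - b) for a, b in zip(u, v)) for u, v in pairs)
    return total / len(pairs)

def select_features(examples, keep=1):
    """Names of the `keep` features on which the example images are most consistent."""
    names = examples[0].keys()
    scores = {name: spread([ex[name] for ex in examples]) for name in names}
    return sorted(scores, key=scores.get)[:keep]

# Hypothetical feature vectors for 3 synthesized example images.
examples = [
    {"color": (0.30, 0.70), "texture": (0.9, 0.1), "shape": (0.2, 0.8)},
    {"color": (0.32, 0.68), "texture": (0.1, 0.9), "shape": (0.6, 0.4)},
    {"color": (0.29, 0.71), "texture": (0.5, 0.5), "shape": (0.3, 0.7)},
]
print(select_features(examples))
print(select_features(examples, keep=2))
```

The cost grows with the square of the example-set size, which is negligible for sets of 3 to 5 images, so the selection can run on every query.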

Proposed Solution

• Users can manually assign image feature combinations in our user interface for image retrieval.

• The query request is computed and processed by the query server.

• The server processes all procedures, finds the closest images, displays them in the query viewer, and transmits user feedback to index the images

• More accurate query results would be obtained in the next round of search.

Future Work

• Need to implement the image synthesis part

• Populate the domain knowledge base with more concepts and images

• Find a method to implement the improved RIE

• Test the retrieval results using different algorithms based on texture, color, and shape
