Retrieval by spatial similarity: an algorithm and a comparative evaluation

Pattern Recognition Letters 25 (2004) 1633–1645
www.elsevier.com/locate/patrec

E. Di Sciascio a,*, M. Mongiello a, F.M. Donini b, L. Allegretti a

a Politecnico di Bari, Dipartimento di Elettrotecnica ed Elettronica, Via Re David, 200, 70125 Bari, Italy
b University of Tuscia, Viterbo, Italy

Received 1 April 2003; received in revised form 24 May 2004
Available online 15 July 2004

Abstract

We present an algorithm for retrieval by spatial similarity of symbolic images. The proposed algorithm is based on graph-matching; it is invariant to scaling, rotation and translation, recognizes multiple rotation variants, and can deal with multiple instances of a symbolic object in an image. It is particularly suitable for a query by sketch approach, in that it requires that at least the objects in the query be present in the database image. Our approach is compared with two other well-known algorithms. Peculiarities are analyzed and motivated. Results of a comparative evaluation on a test dataset of symbolic images are presented and discussed.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Spatial arrangement; Similarity retrieval; Query by sketch; Symbolic images

1. Introduction

In recent years, methods for content-based image retrieval based on primitive feature extraction, such as color, texture and shape, have been widely explored and implemented in research prototypes and commercial systems. Methods for primitive feature extraction are now mature enough to provide efficient retrieval. Yet, to carry out content-based image retrieval at a higher level of abstraction, the input to the problem cannot be just an array of pixels. An image has to be considered as an arrangement of parts that can be obtained either as the output of a segmentation process, or by using a graphical language to describe the components. Given a sketch or a description with a few parts arranged according to the user's specification, the problem is to establish whether the sketch can be recognized in the image, and if so to what extent.

This work was carried out in the framework of projects CNOSSO and MS3DI.
* Corresponding author. Tel.: +39-0805-460641; fax: +39-0805-460410.
E-mail addresses: [email protected] (E. Di Sciascio), [email protected] (M. Mongiello), [email protected] (F.M. Donini), [email protected] (L. Allegretti).
0167-8655/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.patrec.2004.06.010

Recently, new languages have been proposed for the description of two- and three-dimensional


objects, based on structured building blocks, and on syntactical objects that can be grouped and modified together. Structured descriptions of three-dimensional images are present, e.g., in languages for virtual reality like VRML, or in hierarchical object modeling. They also appear in languages adopted in graphical environments for describing three-dimensional models for architectural design, such as AutoCAD, Illustrator, CorelDraw, Freehand, etc. Interesting applications also involve the novel Scalable Vector Graphics (SVG) language, a W3C approved standard for describing two-dimensional graphics at the level of objects rather than individual points, with descriptions based on XML.

With reference to previous work on the subject, methods for retrieval by spatial similarity can be classified into symbolic projection, graph-matching, geometric, and spatial reasoning methods.

The first class includes approaches in which retrieval of images basically reverted to string matching (Chang et al., 1983). The modeling of iconic images was presented in terms of 2D strings, each string accounting for the position of icons along one of the two planar dimensions. Variants of the 2D string, such as the 2D-G-string (Chang and Jungert, 1991) and the 2D-C-string (Lee and Hsu, 1990), have been proposed to deal with situations of overlapping objects with complex shapes.

Methods of the second class describe domain objects included in an image and their spatial relations using a Spatial Orientation Graph (SOG) (Gudivada and Raghavan, 1995). Objects in a symbolic image are associated with vertexes in a weighted graph. The spatial relations among the objects are represented through the list of the edges connecting pairs of centroids. A similarity function computes the degree of closeness between the two edge lists representing the query and the database picture as a measure of the matching between the two spatial graphs. Several variants of methods based on graph-matching have been proposed. A recent paper on the topic is (El-Kwae and Kabuka, 1999), where the SIM_DTC algorithm is proposed as an extension of the spatial-graph approach including both topological and directional constraints. The topological extension of the objects can obviously be useful in determining further differences between images. The similarity algorithm extends the graph-matching proposed in (Gudivada and Raghavan, 1995) and retains the properties of the original approach, including its invariance to scaling, rotation and translation; it is also able to recognize multiple rotation variants.

Gudivada proposed a logical representation of an image based on the so-called HR-string (Gudivada, 1998). Such a representation also provides a geometry-based approach to iconic indexing, based on spatial relationships between the iconic objects in an image, each identified by its centroid coordinates. Translation, rotation and scale variant images, and the variants generated by an arbitrary composition of these three geometric transformations, are considered. The similarity between a database and a query image is obtained through a spatial similarity algorithm, SIM_G, that measures the degree of similarity between a query and a database image by comparing their HR-strings. The algorithm recognizes rotation, scale and translation variants of the arrangement of objects, and also subsets of the arrangements. A constraint limiting the practical use of this approach is the assumption that an image can contain at most one instance of each icon or object. An extension of this approach has been proposed in (Zhou et al., 2001), where objects are not identified by a single point, and their orientation is taken into account.

In this paper we propose an algorithm for retrieval by spatial similarity that measures the degree of similarity between the spatial layout of objects in a sketched query and the layout of objects in a symbolic image. The proposed algorithm is invariant to scaling, rotation and translation, and can deal with multiple instances of an object in an image. The algorithm is particularly suitable for a query by sketch approach, in that it requires that at least the objects in the query be present in the database image. The rationale is that a user will first sketch the objects that he/she believes most relevant to his/her needs, and eventually add more details if the retrieved set is too large. In our setting, the user in the query/retrieval loop refines his/her search by adding new details, but he/she can be sure that at least the elements explicitly included in the


query will be present in the retrieved set. The algorithm is sound with respect to a formalism that some of us proposed in (Di Sciascio et al., 2002b). There, a theory was proposed for representing images, together with an algorithm for reasoning about images that is sound and complete with respect to the semantics of the proposed formalism. The method was also extended to support structured, content-based indexing and retrieval of SVG documents (Di Sciascio et al., 2002c).

The remainder of the paper is organized as follows: in Section 2 we describe the three algorithms we consider: our proposal and the algorithms we adopt for comparison, i.e., SIM_G and SIM_DTC. Then, in Section 3, we compare the behavior of the algorithms in a controlled setting and discuss relevant results. We draw conclusions in the last section.

2. Algorithms for spatial similarity

We start by describing the three analyzed algorithms: SIM_G (Gudivada, 1998), SIM_DTC (El-Kwae and Kabuka, 1999), and SIM_L, the one we propose. All three algorithms measure the degree of similarity between the spatial layout of objects in a graphical query and the layout of objects in a database image.

In their current definition, SIM_G and SIM_DTC cannot be directly compared with SIM_L, since they have more restrictive requirements. We properly modified their definitions in order to cope with the requirements of SIM_L, without affecting their structure.

Fig. 1. SIM_G algorithm.

In the following subsections we describe the three algorithms and explain their characteristics with reference to a single pair of images: a query image Iq and a database image Id obtained as a rotation variant. Image variants can be perfect or multiple. A perfect variant is generated by applying to all the objects in the image the same transformation with the same magnitude; otherwise the variant is multiple.

2.1. SIM_G algorithm

The SIM_G algorithm is based on the definition of the HR-string by Gudivada (1998). An image I = {O_0, O_1, ..., O_{n-1}} composed of n objects is represented as a symbolic image by associating a name with each object. The objects are ordered according to increasing values of the angle θ that the line joining the centroid of each object to the centroid of the image subtends with the positive x-axis. The notation assumes that there exist neither multiple instances of an object type, nor objects having the same centroid coordinates.
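This centroid-angle ordering can be sketched as follows; the object names and coordinates are illustrative, not taken from the paper's dataset.

```python
from math import atan2, pi

def order_by_angle(objects):
    """Order objects by the angle theta that the line from the image
    centroid to each object centroid subtends with the positive x-axis."""
    # Image centroid: mean of the object centroids.
    cx = sum(x for _, x, _ in objects) / len(objects)
    cy = sum(y for _, _, y in objects) / len(objects)

    def theta(obj):
        _, x, y = obj
        # atan2 gives the subtended angle; normalize into [0, 2*pi).
        return atan2(y - cy, x - cx) % (2 * pi)

    return sorted(objects, key=theta)

# Hypothetical symbolic image: (name, centroid_x, centroid_y)
image = [("monitor", 9, 8), ("joypad", 8, 2), ("headset", 2, 1), ("camera", 1, 9)]
print([name for name, _, _ in order_by_angle(image)])
# -> ['monitor', 'camera', 'headset', 'joypad']
```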

HR-strings are obtained by associating each object with information concerning those objects that precede and follow it in the considered order.

Fig. 1 pictures a query image Iq having four objects, a joypad, a headset, a camera and a monitor, and a database image Id that is a rotation variant of Iq. Objects are ordered considering the angle θ between the positive x-axis and the segment joining the centroid of the image with the centroid of each object. The order is given by the joypad, the headset, the camera and the monitor; the objects are given indexes O^q_0, O^q_1, O^q_2, O^q_3. In the


database image there is a different order, since the image is a rotation variant of the query: the order is given by the monitor, the joypad, the headset and the camera, and the objects are given indexes O^d_0, O^d_1, O^d_2, O^d_3.

The HR-string is obtained by associating each object with an index representing the object name, the indexes of the two objects respectively on the left and on the right side of it (from the viewpoint of an observer posed in the center of the group), and the two distances between the centroid of the object and the centroids of the left and right neighbors.

The algorithm compares the two HR-strings representing the query image and the database image and returns a similarity value. Similarity is computed by summing three contributions, given by an object factor, a scale factor and a spatial factor.

The object factor sums a contribution due to the existence of corresponding objects in the two HR-strings. The spatial factor indicates the degree of importance of the relationship among the objects. The scale factor, which includes a function of the right and left neighbor distances, adds a contribution that measures the scale variations in the image.

In the following we explain the SIM_G algorithm in more detail, in order to compare it with SIM_L. Given the HR-string of the query image, the algorithm searches the HR-string of the database image for the index of the corresponding object. If found, it adds a contribution due to the object factor to the similarity measure. It also compares the corresponding left and right neighbors, and adds a spatial and a scale factor to the similarity value if they represent the same objects. In the example, the algorithm compares the two HR-strings starting at the corresponding object, O^q_1 in Iq and O^d_2 in Id. It adds an object factor for the two corresponding objects, a spatial factor and a scale factor for the left neighbor, since both O^q_0 and O^d_1 represent a joypad, and finally a spatial factor and a scale factor for the right neighbor, since both O^q_2 and O^d_3 represent a camera. The steps are repeated for all the objects in the HR-string of Iq, and the final similarity value is returned.

2.2. SIM_DTC algorithm

SIM_DTC (El-Kwae and Kabuka, 1999) computes similarity between two images as a function of three factors: the number of common objects, the closeness of directional spatial relationships, and the closeness of topological spatial relationships.

Directional spatial relationships are represented using a spatial orientation graph, SOG. Each edge in a SOG connects the centroids of two objects in the image. An edge list is the set of all the edges in the SOG, and it has n(n-1)/2 edges for an image having n objects. Given a query image Iq and a database image Id, the algorithm extracts the edge lists Eq and Ed for Iq and Id. A similarity degree between the two images is computed as a similarity function between Eq and Ed that returns a real number in the range [0, 1]. Given a pair of objects in Iq and Id, the algorithm computes the difference angle between corresponding edges eq and ed, i.e., edge eq connects the same objects connected by ed. The angle is computed by translating eq and ed so that their starting points coincide with the origin, and by considering the smaller angle.

Fig. 2 pictures an example using the query and database images of Fig. 1. Objects are named with the same indexes used in Fig. 1. The algorithm extracts the edge lists for both Iq and Id; the objects in Iq are O_0, O_1, O_2, O_3, and considering all the pairs of vertexes in the graph, the obtained edge list is Eq = {O_0O_1, O_0O_2, O_0O_3, O_1O_2, O_1O_3, O_2O_3}; Ed is computed similarly.

The bottom of Fig. 2 shows the computation of the difference angle for two example edges, respectively of Iq and Id. We consider the joypad and the camera, with the spatial layout they have in Iq and Id, and the two relative edges. The directional spatial relationship is represented by the angle θ_O2, i.e., the smaller angle between the two edges. Considering all the six pairs of edges in the edge lists Eq and Ed, the algorithm extracts the six angles θ that are used for computing the similarity measure, summing a function of cos θ over all edge pairs. If the two images are identical, this leads to a maximum value of 1. If the edges of Ed and Eq do not have the same slope or orientation, the contributing factor from each edge will


Fig. 2. SIM_DTC algorithm.


depend on the degree by which the corresponding edge orientations differ.

The θ angle is used in computing the rotation correction angle θ_RCA, which aligns the image and the query as closely as possible, in order to obtain a more accurate similarity. Since the algorithm also recognizes rotational variants of an image, by computing the angle of rotation, the query is rotated in the direction opposite to the original rotation, according to the θ_RCA angle. In the case of perfect variants, the correction angle will perfectly align the query with the original image.

Topological properties are considered by representing each object with a minimum bounding rectangle, i.e., the minimum size rectangle that completely encloses a given object; the topological relations between a pair of objects are disjoint, meets, contains, inside, overlap, covers, covered by, and equals. Topological relationships are compared by means of a binary function of the type of topological relationship between corresponding pairs of objects in the query and in the database image. For each pair of objects in Iq and Id, the function returns 1 if the two objects have the same topological relationship, 0 otherwise.
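A sketch of this binary topological check for axis-aligned minimum bounding rectangles; only a subset of the eight relations is distinguished here, and the rectangle format (x1, y1, x2, y2) is an assumption.

```python
def mbr_relation(a, b):
    """Classify the topological relation between two minimum bounding
    rectangles given as (x1, y1, x2, y2). Simplified: only disjoint,
    equals, inside, contains and overlap are told apart."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disjoint"
    if a == b:
        return "equals"
    if bx1 <= ax1 and by1 <= ay1 and ax2 <= bx2 and ay2 <= by2:
        return "inside"
    if ax1 <= bx1 and ay1 <= by1 and bx2 <= ax2 and by2 <= ay2:
        return "contains"
    return "overlap"

def topo_match(pair_q, pair_d):
    """Binary factor: 1 if the query pair and the database pair stand in
    the same topological relation, 0 otherwise."""
    return 1 if mbr_relation(*pair_q) == mbr_relation(*pair_d) else 0

print(mbr_relation((0, 0, 2, 2), (5, 5, 7, 7)))  # -> disjoint
print(topo_match(((0, 0, 2, 2), (1, 1, 3, 3)), ((0, 0, 4, 4), (2, 2, 6, 6))))  # -> 1
```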

The overall similarity is computed as a weighted sum of the number of common objects and of the directional and topological components. The algorithm sums the contributions of the directional and topological components over all the pairs of corresponding objects in Iq and Id.

2.3. SIM_L algorithm

SIM_L provides a scale-, rotation- and translation-invariant measure of similarity.

Fig. 3 pictures the same query and database images adopted for describing SIM_G and SIM_DTC. For each object, SIM_L extracts all the oriented angles obtained by joining the centroid of the object with those of all the other objects in the image. The algorithm performs the following steps: it considers the object O^q_0 in the query image, i.e., the joypad, and the pivot vector with origin in the joypad and vertex in the next object, O^q_1, the headset; it computes the angle α_0 between the pivot vector and the vector with vertex in the next object, O^q_2; in the same manner it computes the angle α_1, considering the pivot vector and the vector with vertex in object O^q_3. In the next step, the algorithm


Fig. 3. SIM_L algorithm.


considers a different pivot vector, with origin in O^q_0 and vertex in the next object O^q_2, and extracts the remaining angle α_2. The corresponding angles are extracted for the database image; in Fig. 3 they are denoted β_i.

For a given object, the algorithm computes the maximum error between corresponding angles. The similarity measure is obtained as a function of the maximum error over all the groups of objects. We use a function U(x, fx, fy) to change a distance x (in which 0 corresponds to perfect matching) into a similarity measure (in which the value 1 corresponds to perfect matching), and to "smooth" the changes of the quantity x, depending on two parameters fx, fy.

More formally, the algorithm can be described as follows:

Algorithm SIM_L(Iq, Id);
input: a query image Iq = {O^q_0, O^q_1, ..., O^q_{n-1}},
  a subset Id = {O^d_0, O^d_1, ..., O^d_{n-1}} of shapes of a database image
output: sim_L
begin
  for i in {0, ..., n-1} do
    Dspatial[i] = 0;
    j = i + 1;
    if j = n then j = 0;
    while j ≠ i do
      compute pivot vector r between O^d_j and O^d_i;
      compute pivot vector u between O^q_j and O^q_i;
      k = j + 1;
      if k = n then k = 0;
      while k ≠ i do
        compute vector s between O^d_k and O^d_i;
        β = ComputeAngle(r, s);
        compute vector v between O^q_k and O^q_i;
        α = ComputeAngle(u, v);
        Dangle = |α − β|;
        if Dspatial[i] < Dangle then Dspatial[i] = Dangle;
        k = k + 1;
        if k = n then k = 0;
      endwhile
      j = j + 1;
      if j = n then j = 0;
    endwhile
  endfor
  return sim_L = U(max_{i=0,...,n-1} {Dspatial[i]}, fx_spatial, fy_spatial)
end

The algorithm SIM_L executes calls to the simple ComputeAngle algorithm, which is reported hereafter.

Algorithm ComputeAngle(p, q);
input: two vectors p = l1·i + m1·j, q = l2·i + m2·j
output: the angle between p and q
begin
  Acos = (l1·l2 + m1·m2) / (sqrt(l1² + m1²) · sqrt(l2² + m2²));
  Angle = arccos(Acos);


return Angle

end
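The pseudocode above can be turned into a runnable sketch as follows. The smoothing function U is not fully specified in the text except for its endpoints, so the exponential decay controlled by fx and fy is a placeholder assumption, and objects are reduced to their centroids.

```python
from math import acos, exp, hypot

def compute_angle(p, q):
    """Angle between two vectors p and q via the normalized dot product."""
    a = (p[0] * q[0] + p[1] * q[1]) / (hypot(*p) * hypot(*q))
    return acos(max(-1.0, min(1.0, a)))  # clamp against rounding noise

def smooth(x, fx, fy):
    """Placeholder for U(x, fx, fy): maps distance 0 to similarity fy,
    decaying as x grows. The exact shape of U is an assumption."""
    return fy * exp(-x / fx)

def sim_l(query, database, fx=1.0, fy=1.0):
    """Spatial similarity between two equally indexed centroid lists,
    following the nested-loop structure of the SIM_L pseudocode."""
    n = len(query)
    d_spatial = [0.0] * n
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            # pivot vectors from object i to object j in both images
            u = (query[j][0] - query[i][0], query[j][1] - query[i][1])
            r = (database[j][0] - database[i][0], database[j][1] - database[i][1])
            for k in range(n):
                if k == i or k == j:
                    continue
                v = (query[k][0] - query[i][0], query[k][1] - query[i][1])
                s = (database[k][0] - database[i][0], database[k][1] - database[i][1])
                d_angle = abs(compute_angle(u, v) - compute_angle(r, s))
                d_spatial[i] = max(d_spatial[i], d_angle)
    return smooth(max(d_spatial), fx, fy)

# A database image that is a scaled, translated copy of the query:
# all angle differences are 0, so the similarity is maximal.
query = [(0, 0), (4, 0), (4, 3), (0, 3)]
database = [(10, 10), (18, 10), (18, 16), (10, 16)]
print(round(sim_l(query, database), 3))  # -> 1.0
```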

In their original definition, the SIM_G and SIM_DTC algorithms cannot be compared with the SIM_L algorithm, since their requirements are different. We now briefly outline the main differences.

First of all, both SIM_G and SIM_DTC are applicable to images containing at most one instance of each object. This assumption limits the practical application of the algorithms in real situations. Instead, SIM_L can deal with multiple instances of each object.

The algorithms can compare even images with different subsets of objects: for example, if a strict subset of the objects in Iq is present in Id, the similarity value just decreases, but it does not drop to 0. In this case, for both SIM_G and SIM_DTC, the greater the number of objects in the query image Iq absent from the database image Id, the smaller the similarity between the two images. On the other hand, SIM_L satisfies the downward refinement property, i.e., the similarity value decreases, or at least remains the same, when the number of components in Iq increases.

Moreover, although all approaches recognize scaled variants of the image, they provide a similarity value that is affected by the scaling differences. If image Id is a scaled variant of Iq, the more the scale factor differs from 1, the smaller the similarity value returned by SIM_G. Instead, algorithm SIM_L returns a similarity value independent of the scale factor.

Algorithm SIM_L takes into account only spatial relationships between shapes, since topological ones are considered in the scale similarity. Hence, SIM_L differs from SIM_DTC, which also considers topological components in the similarity measure.

To cope with all these differences, it was necessary to make some changes to the other two algorithms we used for comparison. The modified SIM_G' and SIM_DTC' satisfy the same constraints as SIM_L. These changes do not affect their structure.

Algorithm SIM_DTC was modified in order to consider all the possible N_groups groups of objects similar to the query Iq. Given the N_groups possible groups G_k of objects in the database image Id, each group contains n_q objects, where n_q is the number of objects in the query image Iq. Each group contains an instance of each object present in the query image Iq.

The directional component of the spatial similarity is obtained as the maximum over the SIM^k_DIR, i.e., the directional components of the spatial similarity between the group G_k and the query image Iq. In this way, SIM_DTC' depends on the group that yields the highest similarity.

In the weighted sum used in computing the overall similarity measure described in Section 2.2, the weights of the topological component and of the number of common objects have been annulled, and the weight of the directional similarity is set to 1.
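The group construction just described can be sketched with itertools: one database instance is drawn per query object name, and the directional score is maximized over all such groups. The scoring function is left abstract and the names are illustrative.

```python
from itertools import product

def candidate_groups(query_names, database):
    """All groups G_k: one database instance per query object name.
    `database` maps each object name to its list of instances (centroids)."""
    pools = [[(name, inst) for inst in database[name]] for name in query_names]
    return [list(group) for group in product(*pools)]

# Hypothetical database image with two radios and two headsets.
db = {
    "radio": [(1, 1), (6, 2)],
    "headset": [(3, 4), (7, 7)],
    "monitor": [(5, 5)],
}
groups = candidate_groups(["radio", "headset", "monitor"], db)
print(len(groups))  # -> 4 (2 radios x 2 headsets x 1 monitor)

def best_directional(query, db, score):
    """SIM_DTC'-style maximum of a directional score over all groups."""
    names = [name for name, _ in query]
    return max(score(query, g) for g in candidate_groups(names, db))
```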

Algorithm SIM_G has been modified as follows, to consider all the groups of objects. It extracts the HR-string for all the N_groups groups G_k of objects in the database image Id. It compares the HR-string of the query image with those of the G_k groups of objects in the database image. It then returns the maximum similarity among all the obtained values. The spatial factor and the scale factor are chosen using the same relation proposed by Gudivada, such that 2·SpatialFactor + 2·ScaleFactor = 1. The function adopted in refining the scale factor is the same function U we defined for SIM_L. The right and left neighbor distances are weighted on the magnitude of the groups of objects, i.e., the distances of the centroids of objects in Iq and G_k.

The modified SIM_G' algorithm has the same properties as the algorithm proposed in (Di Sciascio et al., 2002a).

3. Comparative evaluation

3.1. Experimental setting

The algorithms have been tested on the image collection shown in Figs. 4 and 5. The database consists of 23 images, whose components are represented as icons. The database images were obtained by composing the seven symbolic objects in Fig. 6, and were built considering rotational and scaling variants, both multiple and perfect. We assume the generally accepted distinction between perfect and


Fig. 4. Test dataset of symbolic images.


multiple image variants. In a perfect variant the same transformation is applied to all image objects with the same magnitude; otherwise the variant is multiple.

Images are grouped into five classes. To the first group belong images 1, 2, 3 and 4. The group was used to test the differences in behavior of the SIM_L and SIM_DTC' algorithms with respect to SIM_G'. The purpose was to check whether the algorithms can distinguish between images 1 and 4. In these two images the distance between adjacent objects is the same, so SIM_G' does not distinguish between them. Images 2 and 3 are intermediate transformations of image 1 into image 4.

The second group contains images 5, 6, 7, and 8; it is used to verify the behavior of the


Fig. 5. Continued test dataset of symbolic images.

Fig. 6. Symbolic objects used in the collection.


algorithms with respect to perfect variants of an image, obtained through rotational, scaling and translation transformations. Image 6 is a 120% scaled version of image 5; image 7 is obtained by rotating image 5 by 45° counter-clockwise; image 8 is obtained from image 5 by rotating it 35° clockwise, scaling it by 80%, and translating the objects by 20 pixels along both coordinate axes.

Images in Group 3 were created to check the behavior of the algorithms with respect to multiple variants. The images that belong to this group are 9, 10 and 11, and the couple 21, 22. The first three images are rotational variants of image 5, while the other two are variants of image 12. Image 22 is a scaled multiple variant of image 12.

Group 4 consists of images 12 through 20. The images in this group were generated to test the algorithms when varying the position of some objects. Image 13 is obtained from image 12 by inverting the positions of the headset and the joypad; images 14 and 16 are both obtained from image 13 by inverting, respectively, the positions of the videocamera and the radio, and of the joypad and the camera; image 15 is obtained by inverting the position of the monitor with the


joypad, and of the videocamera with the radio, in image 12. Image 17 is obtained by scaling image 16 in the vertical direction by 50%.

Images 18 and 19 are rotational and perfect scale variants of image 12. In particular, image 18 is obtained by rotating image 12 by 20° clockwise; image 19 is a rotation variant of image 12, scaled by 80% and rotated by 180°. Image 20 is obtained by reflecting image 12 along the horizontal direction.

Images in Group 5 are obtained from Group 4 by adding image 23. The group is used to test the behavior of the algorithms in identifying the images most similar to the query. Such a test is useful for comparing the behavior of the algorithms in the presence of more than one instance of a single object: image 23 contains two radios and two headsets.

3.2. Comparison of the three algorithms

To provide a basis for comparison, each image in the collection was assumed in turn as a query image and compared to all the others. For each query, the algorithms returned a classification of the images, ranked by similarity measure.

Retrieval effectiveness can be evaluated by comparing the classifications returned by the algorithms to a user-provided rank ordering of the images. To this purpose we used the Rnorm measure (Bollmann et al., 1985).

Rnorm requires two rank orderings of the database images with respect to the query image. The first rank ordering is the system-provided one, and the second is the expert-provided one. Values of Rnorm range in the interval [0, 1]. An Rnorm value of 1 indicates that the rank ordering is acceptable with respect to the expert-provided rank ordering; lower values indicate an increasing disagreement between the two rank orderings.
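Following Bollmann et al. (1985), Rnorm can be computed from the numbers of concordant (S+) and discordant (S-) preference pairs between the system ranking and the expert ranking, as Rnorm = (1 + (S+ - S-)/S+max)/2, where S+max is the number of preference pairs stated by the expert. The paper does not spell this formula out, so the sketch below is our reading of the measure, and the rankings are illustrative.

```python
from itertools import combinations

def r_norm(system_rank, expert_rank):
    """Rnorm per Bollmann et al. (1985). Rankings map item -> rank value
    (lower is better); ties in the expert ranking yield no preference pair."""
    items = list(expert_rank)
    s_plus = s_minus = s_plus_max = 0
    for a, b in combinations(items, 2):
        if expert_rank[a] == expert_rank[b]:
            continue  # the expert states no preference for this pair
        s_plus_max += 1
        # orient the pair so that the expert prefers `a` over `b`
        if expert_rank[a] > expert_rank[b]:
            a, b = b, a
        if system_rank[a] < system_rank[b]:
            s_plus += 1   # the system agrees with the expert preference
        elif system_rank[a] > system_rank[b]:
            s_minus += 1  # the system contradicts it
    return (1 + (s_plus - s_minus) / s_plus_max) / 2

expert = {"img1": 1, "img2": 2, "img3": 3}
print(r_norm({"img1": 1, "img2": 2, "img3": 3}, expert))  # -> 1.0
print(r_norm({"img1": 3, "img2": 2, "img3": 1}, expert))  # -> 0.0
```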

The expert was asked to provide a rank ordering of the images with respect to each query image in the test set. Table 1 describes the expert-provided rank ordering. In each row, images are ordered by decreasing relevance; images on the right side of each row have lower similarity with the query. The expert gave the same relevance to images in the same cells.

The last row concerns image 23, which is not comparable with all the others, since it has a different number of objects with respect to images 1 through 11, and different objects with respect to images 12 through 22. For this reason its results are simply numerically ordered. Table 2 reports a numerical comparison of the Rnorm results obtained for the three algorithms. It can be noted that SIM_L provides the highest average values of Rnorm, and thus has the best response to user expectations. Nevertheless, the results for SIM_G' and SIM_DTC' also show an effective behavior of both algorithms.

3.3. Discussion

In this section we describe some interesting behaviors of the three algorithms with respect to significant images in the database. Results of applying the algorithms to Group 1 are shown in Figs. 7 and 8, for queries 1 and 4 respectively. It can be noted that images 1 and 4 are perfectly equivalent for SIMG0. Even though adjacent objects lie at equal distances, the spatial layouts are different, so the 100% similarity assigned by SIMG0 is incorrect. SIML and SIMDTC0, on the other hand, provide correct similarity values, which decrease when comparing query 1 with images 1, 2, 3 and 4 and, respectively, when comparing query 4 with images 1, 2, 3 and 4. As far as Group 2 and a subset of Group 3 (containing images 9, 10 and 11) are concerned, results for query 5 are shown in Fig. 9. All the algorithms recognize images 6, 7 and 8 as perfect variants of image 5, returning a 100% similarity value. Images 9, 10 and 11 are recognized as multiple variants of image 5; in fact, the similarity value decreases as the rotation factor increases. This test shows that all the algorithms are able to recognize both perfect and multiple rotation variants.

The fourth test is conducted on the images in Group 4 and in a subset of Group 3 containing images 21 and 22. Results obtained for queries 12 and 20 are shown in Tables 3 and 4. Images 18 and 19 are perfect variants of query 12, so it is expected that the algorithms will have a


Table 1
Expert-user rankings



Fig. 7. Results of applying SIMG0, SIML and SIMDTC0 to images of Group 1 using query image no. 1.

Fig. 8. Results of applying SIMG0, SIML and SIMDTC0 to images of Group 1 using query image no. 4.

Fig. 9. Results of applying SIMG0, SIML and SIMDTC0 to images of Group 2 using image no. 5 as query image.

Table 2
Comparison of Rnorm values for SIML, SIMG0 and SIMDTC0

Query    SIML    SIMG0    SIMDTC0
1        0.92    0.79     0.87
2        0.78    0.73     0.78
3        0.83    0.79     0.76
4        0.75    0.88     0.79
5        0.98    0.90     0.90
6        0.97    0.90     0.90
7        0.98    0.90     0.90
8        0.98    0.90     0.90
9        0.97    0.91     0.90
10       0.97    0.90     0.88
11       0.97    0.81     0.88
12       0.98    0.90     0.94
13       0.95    0.94     0.94
14       0.95    0.77     0.92
15       0.89    0.91     0.94
16       0.91    0.94     0.96
17       0.96    0.91     0.95
18       0.98    0.90     0.95
19       0.98    0.90     0.94
20       0.98    0.60     0.83
21       0.99    0.90     0.95
22       0.97    0.90     0.96
23       1.00    1.00     1.00
Average  0.94    0.87     0.90


similarity value of 100%, which is reflected in Table 3. Similar results are obtained for images 21 and 22, which are multiple variants of query 12. However, image 20 is also a perfect variant of image 12, obtained as a horizontal reflection, so its similarity value should be 100%; SIML is the only algorithm that assigns the correct similarity (see Table 4). SIMDTC0 returns a similarity value of 73% and SIMG0 returns a null similarity.
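In a centroid-based symbolic model, the horizontal reflection that turns image 12 into image 20 simply mirrors each object's x coordinate, which is why a reflection-tolerant measure should still score the pair at 100%. A hypothetical sketch, with the function name and coordinate model assumed for illustration:

```python
def reflect_horizontally(objects, width):
    """Mirror object centroids about the vertical axis of the image.

    `objects` maps a label to an (x, y) centroid; `width` is the image
    width. This is the kind of transformation that turns image 12 into
    image 20: object labels and mutual distances are preserved, only
    the spatial arrangement is mirrored.
    """
    return {label: (width - x, y) for label, (x, y) in objects.items()}
```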


Table 3
Results of applying SIMG0, SIML and SIMDTC0 to Group 4 images using query no. 12 (images 12-20: Group 4; images 21-22: Group 3b)

Image    12    13   14   15   16   17   18    19    20    21    22
SIML     100%  28%  28%  28%  28%  28%  100%  100%  100%  90%   97%
SIMG0    100%  57%  43%  14%  55%  49%  100%  100%  0%    95%   97%
SIMDTC0  100%  93%  85%  86%  88%  88%  100%  100%  73%   100%  100%

Table 4
Results of applying SIMG0, SIML and SIMDTC0 to Group 4 images using query no. 20 (images 12-20: Group 4; images 21-22: Group 3b)

Image    12    13   14   15   16   17   18    19    20    21   22
SIML     100%  28%  28%  28%  28%  28%  100%  100%  100%  90%  97%
SIMG0    0%    14%  43%  43%  0%   0%   0%    0%    100%  0%   0%
SIMDTC0  73%   77%  85%  78%  80%  80%  80%   73%   100%  78%  77%


4. Conclusion

In this paper an algorithm for retrieval by spatial similarity was proposed and compared with two other algorithms. The algorithm was tested on a dataset of symbolic images purposely constructed to include rotation, translation and scaling variants of an image. The experiments showed that the algorithm is robust with respect to rotation, translation and scaling. The characteristics of the proposed algorithm were compared with those of two well-known algorithms, SIMG proposed by Gudivada and SIMDTC proposed by El-Kwae and Kabuka. The comparative evaluation was carried out using the Rnorm measure to determine the degree of matching between the rankings provided by the algorithms and those provided by expert users.

References

Bollmann, P., Jochum, F., Reiner, U., Weissmann, V., Zuse, H., 1985. The LIVE-Project: Retrieval experiments based on evaluation viewpoints. In: Proceedings of the 8th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'85), ACM, New York, pp. 213–214.

Chang, S.K., Jungert, E., 1991. Pictorial data management based upon the theory of symbolic projections. J. Visual Lang. Comput. 2 (3), 195–215.

Chang, S.K., Shi, Q.Y., Yan, C.W., 1983. Iconic indexing by 2D strings. IEEE Trans. Pattern Anal. Machine Intell. 9 (3), 413–428.

Di Sciascio, E., Donini, F.M., Mongiello, M., 2002a. Spatial layout representation for query by sketch content based image retrieval. Pattern Recognit. Lett. 23 (13), 1599–1612.

Di Sciascio, E., Donini, F.M., Mongiello, M., 2002b. Structured knowledge representation for image retrieval. J. Artif. Intell. Res. 16, 209–257.

Di Sciascio, E., Donini, F.M., Mongiello, M., 2002. A logic for SVG documents query and retrieval. In: Workshop on Multimedia Semantics (SOFSEM 2002), Milovy, Czech Republic, November 28–29.

El-Kwae, E.A., Kabuka, M.R., 1999. A robust framework for content-based retrieval by spatial similarity in image databases. ACM Trans. Inf. Syst. 17 (2), 174–198.

Gudivada, V.N., 1998. θR-string: A geometry-based representation for efficient and effective retrieval of images by spatial similarity. IEEE Trans. Knowledge Data Eng. 10 (3), 504–512.

Gudivada, V.N., Raghavan, J.V., 1995. Design and evaluation of algorithms for image retrieval by spatial similarity. ACM Trans. Inf. Syst. 13 (2), 115–144.

Lee, S.Y., Hsu, F.J., 1990. 2D C-string: A new spatial knowledge representation for image database systems. Pattern Recognit. 23, 1077–1088.

Zhou, X.M., Ang, C.H., Ling, T.W., 2001. Image retrieval based on object's orientation spatial relationship. Pattern Recognit. Lett. 22, 469–477.