
Database architecture for content-based image retrieval

Toshikazu Kato *)

Electrotechnical Laboratory, 1-1-4 Umezono, Tsukuba Science City, 305 Japan

ABSTRACT

This paper describes visual interaction mechanisms for image database systems. The typical mechanisms for visual interaction are query by visual example (QVE) and query by subjective descriptions (QBD). The former includes a sketch retrieval function and a similarity retrieval function, and the latter includes a sense retrieval function. We adopt both an image model and a user model to interpret and operate on the contents of image data from the user's viewpoint. The image model describes the graphical features of image data, while the user model reflects the visual perception processes of the user. These models, automatically created by image analysis and statistical learning, are referred to as abstract indexes and are stored in relational tables. These algorithms have been developed on our experimental database systems, TRADEMARK and ART MUSEUM.

1. INTRODUCTION

"A picture is worth a thousand words. " A human interface plays an important role in a multimedia information system. Forinstance, we request a content based visual interface in order to communicate visual information itself to and from a multimediadatabase system 2 The algorithms of multimedia operations have to suit user's subjective viewpoint, such as a similaritymeasure, a sense of taste, etc. Thus, we have to design a multimedia database system to provide flexible interaction facilities.

We expect an image database system to manage image data as well as alphanumeric data. We also expect it to provide a human interface that accomplishes flexible man-machine communication in a user-friendly manner. Then, what is needed in visual interaction? We can summarize the essential needs as follows [4].
(a) Visual interaction requires interpreting the contents of image data. For instance, we would like to provide an image as a pictorial key to retrieve related images from the system.
(b) Such interpretation algorithms have to suit the visual perception processes of each user. In similarity retrieval, we often expect to get suitable candidates according to our subjective measures.
(c) Some queries include multimedia data from different domains as keys. In content-based retrieval, we expect the system to retrieve image data when their contents are described in text.

In this paper, we describe the mechanisms for visual interaction and their implementation on relational database systems. In our mechanisms, the system refers to abstract indexes to evaluate the contents of visual information. These indexes are defined on the image model of graphical features and on the user model of the visual perception process of each user. The latter meets the subjective viewpoint of each user. Abstract indexes are managed as relational tables so that we can easily implement the visual interaction mechanisms on conventional relational database systems. Our approach gives a general framework for content-based image retrieval.

2. INTELLIGENT VISUAL INTERFACE

2.1 Related wo&s on visual intethe

Several experimental image database systems have been proposed to provide visual interfaces. The QPE system provides a visual schema of pictorial data in a graphic form as well as in a tabular form [5]. Graphic elements such as line segments are arranged and displayed in a map style. The visual query only accepts alphanumeric data, such as instance values in a table or coordinate values on a graphic display. Such a query is only a substitute for a query language on alphanumeric data. In the icon-based system, the two-dimensional arrangements of objects are (manually) described with each picture [6]. Such 2D-strings are referred to as a pictorial index. A user may place icons on a graphic display as an example to specify the target images.

*) From March 1992, he will be visiting the Dept. of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT, U.K.


The system evaluates the 2D-string of the example against those of the pictorial index in retrieval. Therefore, it is difficult to perform similarity retrieval according to the subjective measure of each user. The hypermedia system provides an indexing mechanism for multimedia data in a uniform style [7]. Although this system enables subjective indexing which associates multimedia data on different domains, the indexing process owes much to the user's effort in defining many links.

A cognitive aspect has been pointed out in man-machine system design [8]. In this aspect, the most important issue is how to give a correct system image to a novice user. To do this, the system should be designed by evaluating the mental model of the novice user, i.e. the user model, of the system image. Recently, this aspect has also been discussed for database schema design [9]. How to show a database schema to a novice user is closely related to semantic data models and conceptual data models. While this work noted the importance of the user model, these discussions do not cover how we interpret and express visual information.

2.2 Requirements in visual interfaces

As summarized above, these approaches suggest the essential facilities a visual interface needs to perform user-friendly man-machine interaction. We want to process image data itself based on our subjective views, without any indexing effort. Then, how do we organize such a visual interface? Let us show the general framework of visual interaction through typical user query requests in our applications; our basic ideas are QVE (query by visual example) and QBD (query by subjective description). Let us show the requirements and the technical problems associated with visual interaction. Query styles for image databases can be summarized as shown in Table 1. In this classification, the following three cases are tightly related to visual interaction.

Table 1. Query styles for image database systems

Alphanumeric & alphanumeric domains:
  Objective criteria: (partial) string matching of keywords attached to image data.
  Subjective criteria: string matching of keywords using a pre-defined thesaurus.
Pictorial & pictorial domains:
  Objective criteria: query by visual example (sketch retrieval), showing a (rough) sketch to find the original image data.
  Subjective criteria: query by visual example (similarity retrieval), showing a visual example to find the (subjectively) similar image data.
Alphanumeric & pictorial domains:
  Objective criteria: (partial) string matching of keywords using character recognition.
  Subjective criteria: query by subjective descriptions (sense retrieval), showing descriptions to find the (subjectively) suitable image data.

2.2.1 QVE with pictorial domains

A visual query may include image data as a pictorial key. A user often wishes to find an original image, which he keeps in his mind, in an image database. In QVE, he has only to draw a rough sketch and show it as a pictorial key to the system to retrieve the original image. This facility is sketch retrieval.

SELECT figure.name, figure.pattern
FROM figureDB
WHERE figure.pattern ≈ sample.pattern

Here, the attribute pattern belongs to an image data type, and the value of sample.pattern is provided by the user as the visual example. Note that QVE should evaluate the similarity x ≈ y between the sketch and the images on pictorial domains. Thus, we have to design an evaluation mechanism on a robust image model.

2.2.2 QVE with subjective criteria

A user may wish to find some images which give him a similar impression from his own view.
SELECT figure.name, figure.pattern
FROM figureDB
WHERE figure.pattern ≈ sample.pattern

We have to notice that the criterion for similarity is a subjective human factor. The system should evaluate similarity according to the user's subjective criterion. Therefore, the system should analyze and learn the subjective similarity measure on the images for each user.


2.2.3 QBD with multimedia domains and subjective criteria

A visual query may also include different media data in its operands. For instance, a user often wishes to see several images which leave him certain visual impressions. In QBD, he has only to describe such impressions in his own words to retrieve the images. This facility is sense retrieval.

SELECT painting.name, painting.pattern
FROM paintingDB
WHERE painting.pattern ≈ "lively"

This is the most complicated visual query style. Here, the attribute pattern belongs to a pictorial domain, while "lively" belongs to a text domain. We also have to notice that our visual impressions may differ for each of us, even when viewing the same painting. The system should evaluate the similarity x ≈ y based on the user's subjective criterion while x and y belong to different domains (x ∈ D1, y ∈ D2). Thus, we have to design a learning algorithm and an evaluation mechanism which correlate the different domains, such as subjective descriptions of visual impressions and the graphical features of images, for each user.

3. QUERY BY VISUAL EXAMPLE — Based on graphic features —

3.1 Visual perception and image model

This chapter describes the image models and algorithms for query by visual example (QVE), i.e. sketch retrieval. A user has only to draw a sketch of the target image to retrieve the original one. At first, let us discuss the technical problems associated with sketch retrieval. We can point out these problems:
(i) Pixelwise pattern matching is quite a time-consuming task.
(ii) We cannot give a pictorial key which is exactly the same as the original image.

One of the most reasonable approaches is to parameterize the image data in the database, as well as the user's sketch, based on our visual perception process. We have investigated how we perceive a graphic symbol, as a typical example, in human vision. In our psychological examination, each examinee classified a set of graphic symbols into several non-overlapping groups in accordance with his sense of similarity. (Here, shape was the only consideration in classification.) The signal features were dominant factors in about 60% of the results. The structural features also affected about 40% of the results. From these results, we have assumed an image model based on the following graphic features (GF).

[GF space for graphic symbols]
(1) Spatial distribution of the gray level, Gray8 and Edge8: The distribution of black pixels represents the outline of the graphic symbol. For this purpose, the graphic symbol is divided into 8 x 8 square meshes. Gray8 denotes the number of black pixels m_ij in each mesh (Figure 1(b)):
    Gray8 = {m_ij} (0 ≤ i ≤ 7, 0 ≤ j ≤ 7).
Similarly, we defined Edge8 with the contour of the graphic symbol.
(2) Spatial frequency, RunB/W and RunW: The spatial frequency measures the complexity of graphic symbols. RunB/W approximates the frequency by the run-length distribution of each rectangular mesh. Here, the figure is divided into four horizontal meshes as well as four vertical meshes (Figure 1(c)). Similarly, we defined RunW without distinguishing the black and white runs.
(3) Local correlation measure and local contrast measure, Corr4 and Cont4: The local correlation and the local contrast show the spatial structure, such as the regularity of arrangement of partial figures (Figure 1(d)):


    Corr4 = {m_ij · m_ij'},  Cont4 = {(m_ij - m_ij') / (m_ij + m_ij')}  (0 ≤ i ≤ 3, 0 ≤ j ≤ 3),
where m_ij and m_ij' are adjacent meshes. These parameters are defined on 4 x 4 square meshes.

Figure 1. GF vectors of a graphic symbol: (a) a graphic symbol, (b) gray level, (c) spatial frequency, (d) local correlation and contrast.
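As an illustration only (not the paper's code), the following sketch computes GF-style features for a binarized symbol stored as a NumPy array with black pixels equal to 1; the exact run-length statistics, the Edge8 component and the mesh pairing for Corr4/Cont4 are our assumptions.

    import numpy as np

    def gf_vector(img):
        # img: binary image whose height and width are multiples of 8 (1 = black pixel).
        h, w = img.shape

        # Gray8: number of black pixels in each of the 8 x 8 square meshes.
        gray8 = img.reshape(8, h // 8, 8, w // 8).sum(axis=(1, 3))

        # RunB/W-style spatial frequency: black/white transitions counted in four
        # horizontal bands and four vertical bands (a crude run-length surrogate).
        h_bands = img.reshape(4, h // 4, w)
        v_bands = img.T.reshape(4, w // 4, h)
        runs = np.array([(np.abs(np.diff(b, axis=1)) > 0).sum()
                         for b in list(h_bands) + list(v_bands)])

        # Corr4 / Cont4 on 4 x 4 meshes: correlation and contrast between
        # horizontally adjacent mesh densities m_ij and m_ij'.
        m = img.reshape(4, h // 4, 4, w // 4).sum(axis=(1, 3)).astype(float)
        eps = 1e-9
        corr4 = m[:, :-1] * m[:, 1:]
        cont4 = (m[:, :-1] - m[:, 1:]) / (m[:, :-1] + m[:, 1:] + eps)

        return np.concatenate([gray8.ravel(), runs, corr4.ravel(), cont4.ravel()])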

3.2 Sketch retrieval on graphic features

We may expect that neighboring graphic symbols in a GF space have a similar shape. For example, a fine copy and its rough sketch are neighbors in the GF space. Therefore, the system can retrieve similar graphic symbols by comparing their GF vectors. Let us show the sketch retrieval algorithm for graphic symbols. Figure 2 shows the outline of this algorithm.

[Algorithm 1] Sketch retrieval on GF space
(1) Normalize the image size of the sketch, i.e. the visual example.
(2) Calculate the GF vector p_0 of the sketch.
(3) Calculate the distance d_i between the sketch p_0 and each graphic symbol p_i in the database:
    d_i = Σ_{k=1..K} w_k |p_ik - p_0k|,
    where p_ik and w_k denote the k-th GF component and its weight factor.
(4) Choose the graphic symbols in ascending order of d_i.
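A minimal sketch of the ranking step of Algorithm 1, assuming the GF vectors of the registered symbols are stacked as rows of an array (the names gf_db and weights are ours, not the paper's):

    import numpy as np

    def sketch_retrieval(gf_sketch, gf_db, weights, top_n=10):
        # Step (3): d_i = sum_k w_k * |p_ik - p_0k| (weighted city-block distance).
        d = np.abs(gf_db - gf_sketch) @ weights
        # Step (4): candidates in ascending order of d_i.
        order = np.argsort(d)[:top_n]
        return order, d[order]

With uniform weights this reduces to an L1 nearest-neighbour search over the pictorial index.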


3.3 Experimental results of sketch retrieval

We have been developing an image database system called TRADEMARK+). The TRADEMARK database is a collection of graphic symbols. These figures are protected as intellectual property. At a patent office, an examiner compares each figure with tens of thousands of existing registered graphic symbols. This burdensome task can be avoided if the image database system accepts pictorial keys through a sketch retrieval facility. The TRADEMARK system refers to the GF vectors as the pictorial index of the graphic symbols.

Figure 3 shows an example of sketch retrieval. A user has drawn the sketch shown as "your visual example" in the QVE window. The system searches for the most similar graphic symbols on the pictorial index by comparing their GF vectors. The QVE window also shows the candidates for similar graphic symbols in descending order of priority. The first candidate is the original design of the rough sketch. We can also find similar graphic symbols in Figure 3.

We have evaluated this algorithm in an experiment in which we showed fair copies, hand-written sketches and rough sketches for each of 100 visual examples. (Currently, the TRADEMARK database manages about 2,000 graphic symbols.) We have tested the recall ratio. Here, the recall ratio is the rate at which the original graphic symbol is retrieved among the best ten candidates. For a fair copy, the system had an almost 100% recall ratio among the first ten candidates using the GF features. Even for the rough sketches, it had about 95% recall. We may conclude that our GF features satisfy the requirements for a robust image model for sketch retrieval.
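For concreteness, the recall measure used here could be computed as follows (a hypothetical helper, not from the paper; results pairs each test sketch's original identifier with its ranked candidate list):

    def recall_at_ten(results):
        # results: iterable of (original_id, ranked_candidate_ids) pairs.
        hits = sum(1 for original_id, ranked in results if original_id in ranked[:10])
        return hits / len(results)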

+) TRADEMARK: Trademark and sign database with multimedia abstracted image representation on knowledge base.


Figure 2. Overview of sketch retrieval on the graphic feature (GF) space (image model) for query by visual example.


Figure 3. Sketch retrieval of graphic symbols by showing a rough sketch. A hand-written rough sketch appears in the QVE window. The upper and lower windows show the first ten and second ten candidates for the sketch. The right window explains what the original figure is.

4. QUERY BY VISUAL EXAMPLE — Based on subjective features —

This chapter describes another aspect of query by visual example (QVE), i.e. similarity retrieval on subjective criteria.

4.1 Personal view model for similarity retrieval

We have to remember that our visual impressions of graphic symbols are psychologically ambiguous. The similarity measure may differ for each of us even when viewing the same images. Therefore, the system should learn the subjective similarity measure as a personal view model for each user.

Figure 4 shows the outline of the learning algorithm. A user simply groups test samples from the database into several clusters by judging their similarity. The system extracts the GF vector of each graphic symbol. We need a subjective feature (SF) space which reflects the subjective similarity measure. We can construct such an SF space by discriminant analysis, a multivariate analysis method for evaluating a classification. The algorithm to construct the SF space and the personal index is as follows.


[Algorithm 2] Learning a subjective similarity measure (SF space)
(1) Choose appropriate samples from the database to make the learning set P. The user classifies the samples into several clusters without overlapping.
(2) Normalize the image size. Calculate the GF vector p_k of each sample k ∈ P. (The GF vector is the same one used in the sketch retrieval algorithm.)
(3) Apply discriminant analysis to the clustering result given by the user. The linear mapping A is obtained by solving the following eigenvalue problem:
    Σ_B A = Σ_W A Λ,  A' Σ_W A = I,
    where Σ_B and Σ_W denote the inter-group and intra-group covariance matrices of the GF vectors, respectively, A' means the transposed matrix of A, and Λ is the diagonal matrix of eigenvalues. Thus, we can define the SF space for the user:
    r_k = A' p_k,
    where r_k is the SF vector of the graphic symbol k.
(4) Calculate the SF vectors of every image in the database: r_k = A' p_k.

We will refer to the SF space of r_k as the personal index. Note that we do not have to examine the similarity of all the image data in the database. Once the system has learned the linear mapping A, it can automatically construct the personal index from the GF vectors alone. This algorithm reduces the personnel expenses of indexing.
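As one possible realization (our sketch, not the paper's implementation), the mapping A can be obtained from a generalized eigensolver; the cluster labels come from the user's grouping, and a small ridge term keeps Σ_W invertible.

    import numpy as np
    from scipy.linalg import eigh

    def learn_sf_mapping(gf, labels, dims=8, ridge=1e-6):
        # Discriminant analysis for Algorithm 2, step (3): Sigma_B A = Sigma_W A Lambda.
        gf = np.asarray(gf, dtype=float)
        labels = np.asarray(labels)
        d = gf.shape[1]
        mean = gf.mean(axis=0)
        sw = np.zeros((d, d))                      # intra-group covariance Sigma_W
        sb = np.zeros((d, d))                      # inter-group covariance Sigma_B
        for c in np.unique(labels):
            g = gf[labels == c]
            gm = g.mean(axis=0)
            sw += (g - gm).T @ (g - gm)
            sb += len(g) * np.outer(gm - mean, gm - mean)
        # Generalized eigenproblem; eigh normalizes the eigenvectors so A' Sigma_W A = I.
        evals, evecs = eigh(sb, sw + ridge * np.eye(d))
        A = evecs[:, np.argsort(evals)[::-1][:dims]]   # keep the most discriminative axes
        return A

    def personal_index(A, gf_all):
        # SF vectors r_k = A' p_k for every image (Algorithm 2, step (4)).
        return np.asarray(gf_all) @ A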

4.2 Similarity retrieval on personal index

We may expect that neighboring images in an SF space give a similar impression from the user's view. Just as with sketch retrieval, the user shows a visual example for which he wants to see similar images. The system can retrieve similar image data by comparing their SF vectors on the personal index. Then, the system shows suitable candidates. The algorithm for similarity retrieval is as follows (see also Figure 5).

[Algorithm 3] Similarity retrieval on SF space
(1) Apply the linear mapping A to the GF vector p_0 of the visual example:
    r_0 = A' p_0.
(2) Choose the neighboring images r_i on the personal index as the candidates for similarity retrieval.
(3) Calculate the distance d_i between the visual example r_0 and the images r_i:
    d_i = |r_i - r_0|.
(4) Choose the images in ascending order of d_i.
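Continuing the sketch above, similarity retrieval projects the example into the SF space with the learned A and ranks the database by Euclidean distance on the personal index (function and variable names are ours):

    import numpy as np

    def similarity_retrieval(gf_example, A, sf_index, top_n=10):
        r0 = gf_example @ A                        # step (1): r_0 = A' p_0
        d = np.linalg.norm(sf_index - r0, axis=1)  # step (3): d_i = |r_i - r_0|
        order = np.argsort(d)[:top_n]              # steps (2) and (4): nearest candidates
        return order, d[order]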

4.3 Experimental results of similarity retrieval

Figure 6 shows an experimental result of similarity retrieval. The upper QVE window in this figure shows the ten candidates for similarity retrieval on the SF space, while the lower one shows the ten candidates on the GF space.


Figure 4. Learning process of subjective similarity for QVE (subjective feature (SF) space).

Figure 5. Overview of similarity retrieval on SF space (query by visual example, QVE).


The second to the eighth candidates on the SF space matched the classification by this user. The system could not retrieve these candidates in the sketch retrieval on the GF space, since their graphic features differ from those of the visual example.

We have evaluated the learning algorithm and the similarity retrieval algorithm in an experiment with eleven users. In this experiment, we used 230 graphic symbols out of 2,000 as the samples. The system retrieved at least one similar graphic symbol among the first ten candidates with more than a 98% recall ratio. We may conclude that our SF spaces satisfy the subjective similarity measure of each user.

Figure 6. Example of similarity retrieval. The upper QVE window and the lower one show the result of similarity retrieval on the SF space of the user and that of sketch retrieval on the GF space.

5. QUERY BY SUBJECTIVE DESCRIPTIONS — Based on unified features —

5.1 Visual perception and subjective descriptions

This chapter gives a more complex visual interaction algorithm for query by subjective descriptions (QBD), i.e. sense retrieval. A user has only to give several words describing the target images to find them. At first, let us discuss the technical problems associated with sense retrieval. We can point out these problems:
(a) The indexer has to assign many keywords to each painting in the database, which is laborious work.
(b) Such keywords depend on the indexer's personal view, which may differ from the tastes of other users.
(c) A keyword thesaurus itself has no direct relation to pictorial data.


Our approach aims to unify the alphanumeric domain and the pictorial domain so that we can operate on the contents of an image. Thus, we have to model how a person feels certain impressions when viewing an image. Chijiiwa [10] reported that the dominant impression generated by paintings is coloring, although we view paintings from several aspects. This report suggests that there is a reasonable correlation between the coloring of full color images and the words in their reviews. We have parameterized image data and our subjective impressions by the following features. (Figure 7 shows an example of the coloring features of a painting and its visual impression as described by a user.)

Figure 7. Example of a GF vector on coloring features and an SF vector on visual impression. The histogram in the window shows the distribution of RGB intensity values in a subpicture of the painting.

[GF space for coloring features]
(1) Distribution of the RGB intensity values: The distribution of the RGB intensity values in the subpictures describes the spatial arrangement of coloring and the overall composition.
(2) Autocorrelation of the RGB intensity values: This feature also describes the combination of the colors which appear in the picture.

[SF space for visual impression]
Weight vector of keywords: Visual impressions are described as weights on the predefined adjectives, as shown in Figure 7.
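As an illustration only, the coloring features might be approximated as below for an RGB image split into subpictures; the grid size, bin count and autocorrelation lag are our assumptions, since the paper does not specify them.

    import numpy as np

    def coloring_gf(rgb, grid=4, bins=8, lag=4):
        # rgb: H x W x 3 array of 8-bit intensities.
        h, w, _ = rgb.shape
        feats = []
        # (1) RGB intensity distribution in each subpicture (spatial arrangement of coloring).
        for i in range(grid):
            for j in range(grid):
                sub = rgb[i * h // grid:(i + 1) * h // grid,
                          j * w // grid:(j + 1) * w // grid]
                for ch in range(3):
                    hist, _ = np.histogram(sub[..., ch], bins=bins, range=(0, 256), density=True)
                    feats.append(hist)
        # (2) Coarse autocorrelation of the RGB intensities at a small horizontal lag
        #     (which color combinations co-occur in the picture).
        x = rgb.astype(float)
        auto = (x[:, :-lag, :] * x[:, lag:, :]).mean(axis=(0, 1)) / (x.mean(axis=(0, 1)) ** 2 + 1e-9)
        feats.append(auto)
        return np.concatenate(feats)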

5.2 Learning a personal view model on visual impression

Let us show the learning algorithm for a personal view on visual impressions. As described above, we expect that the weights of the keywords and the coloring features correlate with each other. We can construct a unified feature (UF) space by canonical correlation analysis. The algorithm to construct the UF space and the personal index is as follows (Figure 8).


[Algorithm 4] Learning a visual impression measure
(1) Choose appropriate images from the database to make the learning set P.
(2) The user describes his visual impressions as the weights of the adjectives a_k for each image k ∈ P.
(3) Calculate the distribution of the RGB intensity values in the subpictures as the GF vector p_k of each full color image.
(4) Apply canonical correlation analysis to the result of this enquete. The linear mappings F and G maximize the correlation between
    f_k = F' a_k and g_k = G' p_k,
    where F' and G' mean the transposed matrices of F and G, respectively.
(5) Calculate the UF vectors of the paintings in the database from the following formula:
    g_k = G' p_k.

We will refer to the UF space of g_k as the personal index of the user. Note that we do not have to assign the adjectives a_k to every image in the database. Once the system has learned the linear mappings F and G, it can automatically construct the personal index from the GF vectors alone. This is a labor-saving algorithm for indexing.
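The mappings F and G can be computed by a standard whitening-plus-SVD formulation of canonical correlation analysis. The sketch below is our own illustration and assumes the adjective weights a and coloring GF vectors p of the learning set are stacked as rows and already mean-centered.

    import numpy as np

    def inv_sqrt(s, ridge=1e-6):
        # Inverse square root of a symmetric positive (semi)definite matrix.
        vals, vecs = np.linalg.eigh(s + ridge * np.eye(s.shape[0]))
        return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

    def learn_uf_mapping(a, p, dims=4):
        # Algorithm 4, step (4): find F, G maximizing corr(F'a_k, G'p_k).
        n = len(a)
        saa, spp, sap = a.T @ a / n, p.T @ p / n, a.T @ p / n
        wa, wp = inv_sqrt(saa), inv_sqrt(spp)
        u, corr, vt = np.linalg.svd(wa @ sap @ wp)
        F = wa @ u[:, :dims]            # f_k = F' a_k
        G = wp @ vt.T[:, :dims]         # g_k = G' p_k
        return F, G, corr[:dims]        # corr holds the canonical correlation coefficients

    def uf_index(G, p_all):
        # Step (5): the personal index g_k = G' p_k for every painting.
        return p_all @ G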

5.3 Sense retrieval on personal index

The UF space gives a criterion to evaluate text data and image data by their contents. We may expect that neighboring images and words in the UF space give similar impressions to the user. Thus, the user has only to show several words in QBD. The system evaluates the most suitable images for the words according to the personal view. We may call this operation a multimedia join. The algorithm for sense retrieval is as follows (Figure 9).

[Algorithm 5] Sense retrieval on personal index
(1) Apply the linear mappings F and Λ to the adjective vector a_0 of the subjective descriptions in the user's query:
    g_0 = Λ F' a_0,
    where Λ is the regression given by the diagonal matrix of canonical correlation coefficients.
(2) Choose the neighboring images g_i on the personal index as candidates for sense retrieval.

Figure 9. Overview of sense retrieval on UF space (query by subjective descriptions).

For other applications of the UF space, we can retrieve images that give us a similar impression by showing an image as a visual example. We can also infer suitable keywords, simulating the user's personal view, by using the inverse mappings of F and Λ as follows:
    a_0 = (F')^(-1) Λ^(-1) G' p_0.
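Using F, G and the canonical correlations from the sketch above, sense retrieval and the inverse keyword inference could then be written as follows (again our illustration; lam plays the role of the diagonal matrix Λ, and a pseudo-inverse stands in for the inverse when F is not square).

    import numpy as np

    def sense_retrieval(a_query, F, lam, uf_index, top_n=8):
        # Algorithm 5: g_0 = Lambda F' a_0, then pick the nearest paintings on the personal index.
        g0 = lam * (a_query @ F)
        d = np.linalg.norm(uf_index - g0, axis=1)
        return np.argsort(d)[:top_n]

    def infer_keywords(p_image, G, F, lam):
        # Inverse use of the UF space: a_0 = F'^(-1) Lambda^(-1) G' p_0.
        g0 = p_image @ G
        return np.linalg.pinv(F.T) @ (g0 / lam)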

5.4 Experiments on personal view and sense retrieval


Figure 8. Learning process of visual impression for QBD (subjective feature (SF) space).


We have been developing an electronic art gallery called ART MUSEUM+) [4]. ART MUSEUM is a collection of full color paintings. The ART MUSEUM system provides the QBD facility for sense retrieval. It refers to the personal index on unified features (UF), which are derived from graphic features (GF) on coloring and subjective features (SF) on visual impression. Thus, a user can retrieve full color paintings by presenting some words reflecting his personal taste. We have experimented with the sense retrieval algorithm on our ART MUSEUM system.

Figure 10 shows an example of sense retrieval. This figure shows the best eight candidates for the adjectives "romantic, soft and warm". (In this experiment, we adjusted the UF space according to the average answers of female students.) These paintings roughly satisfied the personal view of the subjects. We may conclude that the personal index on the UF space reflects a personal sense of coloring.

Figure 10. Example of sense retrieval in QBD. The best eight paintings in the database appear for the words "romantic", "soft" and "warm". The words in the query and the colorings of the paintings are evaluated on a UF space of female students.

6. IMPLEMENTATION ON RELATIONAL DATABASE

This chapter describes the implementation method on a conventional relational database system. In our conceptual model, each image is regarded as an object. Feature parameters related to an image, which describe the GF, SF and UF spaces, are referred to as attributes of the object. They are actually managed as columns of relational tables in our current implementation on a relational database system. Each object is managed as a tuple of a relational table.

+) ART MUSEUM: Multimedia database with sense of color and composition upon the matter of art.


The sketch retrieval algorithm evaluates the similarity between a sketch and the image data in the database on these relational tables. The specification at the user interface is internally mapped into SQL descriptions.

These relational tables are appropriately defined for each application. The following descriptions define GF parameter tables for an image class and a distance table:

create table GF (name_id int not null, gf1 int not null, ..., gf16 int not null)

create table #VEGF (name_id int not null, gf1 int not null, ..., gf16 int not null)

create table #DISTANCE (name_id int not null, dist float not null)

The sketch retrieval algorithm first calculates the GF vector of the visual example and registers it in the system:
insert #VEGF
select name_id, gf1, ..., gf16
from GF
where name_id = @name_id

Secondly, the algorithm calculates the similarity between the sketch and the image data in the database as the distance between their GF vectors. The following operation calculates the Euclidean distance on the GF space:

insert #DISTANCE
select GF.name_id,
    sqrt(power((GF.gf1 - #VEGF.gf1), 2) + ... + power((GF.gf16 - #VEGF.gf16), 2))
from GF, #VEGF

Finally, the algorithm sorts the results in ascending order.

select name_id, dist
from #DISTANCE
order by dist

In our current implementation, we used Sybase as the SQL server. The visual query systems, such as TRADEMARK, were installed as applications of the relational database system. The visual interface itself is developed on the X Window System. A user can easily access our system from any appropriate workstation on our LAN.

7. SUMMARY

We have described a database architecture for a visual interface and content-based image retrieval facilities. We have developed the algorithms for sketch retrieval, similarity retrieval and sense retrieval to support visual interaction. The sketch retrieval algorithm for query by visual example (QVE) accepts image data as a pictorial example. The similarity retrieval algorithm evaluates similarity based on the personal view of each user. The sense retrieval algorithm for query by subjective descriptions (QBD) evaluates the similarity between text data and image data at the content level based on the personal view. These algorithms are implemented and tested in our experimental multimedia database systems, TRADEMARK and ART MUSEUM. These functions support visual interaction in a user-friendly manner.

We have also described our implementation method for these mechanisms on a relational database system. In our model, each image is regarded as an object, and feature parameters, which describe the GF, SF and UF spaces, are referred to as attributes of the object. Each object is managed as a tuple in a relational table. The visual operations at the user interface are internally mapped into SQL descriptions and processed as application programs on the relational database.

Our research gives a guiding principle for multimedia information systems. Our methods give a basis for user-centered human interface design and multimedia information management.

8. ACKNOWLEDGMENTS

The author would like to thank his colleagues at the Electrotechnical Laboratory, especially Dr. Akio Tojo, Dr. Toshitsugu Yuba, Dr. Kunikatsu Takase, Mr. Koreaki Fujimura and Mr. Takio Kurita, for their support of this research. The author would also like to thank Dr. Hideyuki Tamura, Mr. Osamu Yoshizaki, Mr. Yu-ichi Ban'nai and Mr. Manabu Ohga of the Information Systems


Research Center, Canon Inc., for their cooperation on implementation issues on a relational database system. This research is supported by MITI's national research and development project on Interoperable Database Systems.

9. REFERENCES

1. S. S. Iyenger and R. L. Kashyap, "Image Databases", IEEE Trans. on Software Engineering (special section), Vol. SE-14, No. 5, pp. 608-688, May 1988.

2. W. I. Grosky and R. Mehrotra (eds.), "Image Database Management", COMPUTER (special issue), Vol. 22, No. 12, pp. 7-71, Dec. 1989.

3. T. Kato, T. Kurita, H. Shimogaki, T. Mizutori and K. Fujimura, "A Cognitive Approach to Visual Interaction", Proc. of Multimedia Information Systems MIS'91, pp. 271-278, Dec. 1989.

4. T. Kato, T. Kurita, H. Shimogaki, T. Mizutori and K. Fujimura, "Cognitive View Mechanism for Multimedia Database System", Proc. of International Workshop on Interoperability in Multidatabase Systems, pp. 179-186, Apr. 1991.

5. N. S. Chang and K. S. Fu, "Query-by-Pictorial Example", IEEE Trans. on Software Engineering, Vol. SE-6, No. 6, pp. 519-524, June 1980.

6. S. K. Chang, C. W. Yan, D. C. Dimitroff and T. Arndt, "An Intelligent Image Database System", IEEE Trans. on Software Engineering, Vol. SE-14, No. 5, pp. 681-688, May 1988.

7. N. Yankelovich, B. J. Haan, N. K. Meyrowitz and S. M. Drucker, "Intermedia: The Concept and the Construction of a Seamless Information Environment", COMPUTER, Vol. 21, No. 1, pp. 81-96, Jan. 1988.

8. D. A. Norman, "Cognitive Engineering", in D. A. Norman and S. W. Draper (eds.), User Centered System Design, pp. 31-61, Lawrence Erlbaum Associates, 1986.

9. E. J. Neuhold and M. Kracker, "Cognitive Aspects of Accessing Multi-Media Information", Proc. of Computer World 89, pp. 119-126, Sep. 1989.

10. H. Chijiiwa, Chromatics, Chap. 5, pp. 128-163, Fukumura Printing Co., Tokyo, 1983.

11. T. Kato and T. Mizutori, "Multimedia Data Model for Advanced Image Information Systems", Proc. of Advanced Database System Symposium ADSS'89, pp. 113-120, Dec. 1989.
