Pyramid-Based Multisensor Image Data Fusion with Enhancement of Textural Features

Alberto Del Bimbo (Ed.)

Image Analysis and Processing

Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis and J. van Leeuwen

1310

Advisory Board: W. Brauer D. Gries J. Stoer


Image Analysis and Processing 9th International Conference, ICIAP '97 Florence, Italy, September 17-19, 1997 Proceedings, Volume I

Springer

Series Editors

Gerhard Goos, Karlsruhe University, Germany

Juris Hartmanis, Cornell University, NY, USA

Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editor

Alberto Del Bimbo, Università di Firenze, Dipartimento di Sistemi e Informatica, Via di Santa Marta, 3, I-50139 Firenze, Italy. E-mail: delbimbo@aguirre.ing.unifi.it

Cataloging-in-Publication data applied for

Die Deutsche Bibliothek - CIP-Einheitsaufnahme

Image analysis and processing : 9th international conference ; proceedings / ICIAP '97, Florence, Italy, September 17 - 19, 1997. Alberto Del Bimbo (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; Budapest ; Hong Kong ; London ; Milan ; Paris ; Santa Clara ; Singapore ; Tokyo : Springer

Includes bibliographical references

Vol. 1 (1997) (Lecture notes in computer science ; Vol. 1310)

ISBN 3-540-63507-6

CR Subject Classification (1991): I.4, I.5, I.3.3, I.3.5, I.3.7, I.2.10

ISSN 0302-9743 ISBN 3-540-63507-6 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1997 Printed in Germany

Typesetting: Camera-ready by author SPIN 10551833 06/3142 - 5 4 3 2 1 0 Printed on acid-free paper

Message from the General Chair

This volume collects the proceedings of ICIAP'97, September 17-19, 1997, Florence, Italy. ICIAP'97 is the ninth meeting of the International Conference on Image Analysis and Processing, organized biennially by the Italian Chapter of the International Association for Pattern Recognition (IAPR). Following the successful 1995 meeting in Sanremo, ICIAP'97 is held in the magnificent city of Florence, one of the most beautiful and famous cities in the world, renowned for its artistic and cultural heritage. The 1997 ICIAP conference is one of the largest ever, with over 200 participants coming from almost every part of the world. This confirms the success of this initiative of the IAPR Italian Chapter, as well as the very good work carried out by the organizers of the previous ICIAP meetings.

We received a very large number of submissions, 304 papers from 40 different countries, confirming the intense and ever-growing activity in imaging technology research and development worldwide. Papers covered basic research topics in image analysis, pattern recognition and computer vision, as well as applications of these technologies to real problems. Basic topics addressed included image enhancement, image segmentation, image compression, motion analysis, object recognition, image understanding, and special hardware architectures and systems. Applications were in the fields of biomedicine, character recognition, safety and surveillance, object identification and inspection, and quality control in manufacturing, among others. Growing and emerging research and application topics, such as image and video databases, vision-assisted man-machine interaction, and color image processing, were also strongly represented.

The reviewing process resulted in the selection of 173 papers. Only papers that received high rankings from all the reviewers were accepted for presentation at ICIAP'97. Oral presentations were limited to 42, organized in 12 sessions. Four poster sessions included 131 papers. In setting the conference program, we favored large poster sessions to encourage interactivity between researchers and promote exchanges and the establishment of new links. We invited four distinguished speakers, Dr. Dragutin Petkovic, from IBM Almaden Research Center, Prof. Jake Aggarwal, from the University of Texas at Austin, Prof. Linda Shapiro, from the University of Washington in Seattle, and Prof. Ramesh Jain, from the University of California at San Diego, to predict the state of imaging technologies in 2000 and suggest research perspectives and trends for the near future. For the first time, ICIAP'97 hosts a special session devoted to successful ongoing or recently completed projects in image analysis and processing and computer vision, developed under European Community programs. A total of 9 poster presentations were accepted. Dr. Kostas Glinos, EU officer from DG III in Brussels, was invited to provide a view of forthcoming EU programs and opportunities in these fields for research and industry communities. This session was prepared in cooperation with APRE, the Florence Agency for European Research Development, and will hopefully stimulate interaction and technology transfer between research and industrial communities.


I would like to thank the IAPR Italian Chapter for allowing us to organize this conference in Florence and IAPR for its sponsorship. Moreover, I gratefully thank Provincia di Firenze, and particularly its Vice-President Riccardo Conti, for their financial backing and sponsorship of this initiative and wise sensitivity in understanding our effort in this task. Thanks are also due to CESVIT SpA, Florence, and its President Sergio Bertini and General Manager Silvestro Mitolo; to Bassilichi Sviluppo SpA and its President Luca Bassilichi; to OTE SpA, Florence, and its President Carlo Lastrucci; to Logitron SpA, Florence, and its President Andrea Ripasarti; to SESA SpA, Empoli, and its President Paolo Castellacci; as well as to the University of Florence and its Dean Prof. Paolo Blasi, and to the Italian National Council of Research, who all generously supported this event. I also thank Claudia Bianconi, local coordinator of the APRE, who greatly helped us in organizing the special session on EU projects together with the European Community.

An excellent program committee and their colleagues did great work in carefully reviewing an unexpectedly large number of papers, thus easing the task of selecting the best contributions. Their work is sincerely acknowledged. Special thanks go to Carlo Colombo and Pietro Pala, who made a fundamental, voluntary contribution to this conference, helping to manage and resolve the many problems that a large event like this presents. All the student volunteers of the Visual Information Processing Laboratory at the University of Florence are also gratefully acknowledged. Finally, I thank Consulta Umbria Srl and its administrative staff, especially Simona Sarti and Giuseppina Meniconi, who assisted us in the organization of the conference and helped us in too many situations to be remembered here.

I wish all delegates a very successful conference and hope that many new links will be established and that long-lasting friendships will be formed and reinforced.

Florence, July 1997 Alberto Del Bimbo

General Chair

Alberto Del Bimbo, University of Florence, I

Program Chairs

Vito Cappellini, University of Florence, I
Alberto Del Bimbo, University of Florence, I

Program Committee

Carlo Arcelli, CNR Arco Felice Naples, I
Carlo Braccini, University of Genoa, I
Michael Brady, University of Oxford, UK
Virginio Cantoni, University of Pavia, I
Roberto Cipolla, University of Cambridge, UK
Luigi P. Cordella, University of Naples, I
James L. Crowley, INPG Grenoble, F
Leila De Floriani, University of Genoa, I
Ernst Dickmanns, Universität Bundeswehr München, D
Vito Di Gesù, University of Palermo, I
Marco Ferretti, University of Pavia, I
Herbert Freeman, Rutgers University, USA
Giovanni Garibotto, ELSAG BAILEY, Genoa, I
Marco Gori, University of Siena, I
Concettina Guerra, University of Padova, I
Sebastiano Impedovo, University of Bari, I
Anil K. Jain, Michigan State University, USA
Xiaoyi Jiang, University of Bern, CH
Josef Kittler, University of Surrey, UK
Walter Kropatsch, Technical University of Vienna, A
Stefano Levialdi, University of Roma, I
Piero Mussio, University of Brescia, I
Dragutin Petkovic, IBM Almaden, USA
Matti Pietikäinen, University of Oulu, SF
Vito Roberto, University of Udine, I
Masao Sakauchi, University of Tokyo, J
Alberto Sanfeliu, Universitat Politecnica de Catalunya, E
Gabriella Sanniti di Baja, CNR Arco Felice Naples, I
Jorge L.C. Sanz, IBM Argentina, ARG
Linda G. Shapiro, University of Washington, USA
Arnold W.M. Smeulders, University of Amsterdam, NL
Renato Stefanelli, Politecnico di Milano, I
Anastasios N. Venetsanopoulos, University of Toronto, CAN
Gianni Vernazza, University of Cagliari, I
Juan José Villanueva, Universidad Autonoma de Barcelona, E
Sergio Vitulano, University of Cagliari, I
Hezy Yeshurun, Tel Aviv University, IL
Bertrand Zavidovique, Université Paris XI, F


Local Organizing Committee

Luciano Alparone, University of Florence, I
Stefano Baronti, IROE-CNR Florence, I
Carlo Colombo, University of Brescia, I
Jacopo M. Corridoni, University of Florence, I
Alberto Del Bimbo, University of Florence, I
Marco Lusini, University of Florence, I
Pietro Pala, University of Florence, I
Enrico Vicario, University of Florence, I

Sponsored by:

IAPR - Italian Association for Pattern Recognition
DSI - Dipartimento Sistemi e Informatica, Università degli Studi di Firenze

Supported by:

Università degli Studi di Firenze
CNR - Consiglio Nazionale delle Ricerche
Provincia di Firenze
CESVIT SpA - Firenze
APRE - Firenze
Bassilichi Sviluppo SpA - Firenze
Logitron SpA - Firenze
OTE SpA - Firenze
SESA SpA - Empoli


Pyramid-Based Multi-sensor Image Data Fusion with Enhancement of Textural Features

B. Aiazzi°, L. Alparone*, S. Baronti°, V. Cappellini*, R. Carlà°, L. Mortelli*

°IROE "Nello Carrara" - CNR, Via Panciatichi, 64, 50127 Firenze, Italy
E-mail: baronti@iroe.fi.cnr.it

*Dip. Ing. Elettronica, University of Florence, Via S. Marta, 3, 50139 Firenze, Italy
E-mail: alparone@cosimo.die.unifi.it

Abstract. In this work, a multi-resolution procedure based on a generalized Laplacian pyramid (GLP) with a rational scale factor is proposed to merge image data of any resolution and represent them at any scale. GLP-based data fusion is shown to outperform a similar scheme based on the discrete wavelet transform (WT) according to a set of parameters established in the literature. The pyramid-generating filters can be easily designed for data of any resolution, unlike the WT, whose filter-bank design is non-trivial when the ratio between the scales of the images to be merged is not a power of two. Remotely sensed images from Landsat TM and from panchromatic SPOT are fused together. Textured regions are enhanced without losing their spectral signatures, thereby expediting automatic analyses for contextual interpretation of the environment.

1 Multi-sensor Image Data Fusion

The availability of data from many sensors with different characteristics makes data fusion a topic of ever-increasing relevance in the field of digital image processing. The main goal of pixel-level algorithms [1] is to combine the original images from different sensors in order to synthesize a new set of data with enhanced spatial and/or spectral resolution, or to concentrate significant features of the various bands in a single image, thus compressing information and enhancing contrast and texture. In some cases, processing is carried out with the main objective of extracting significant features [2] by maximizing the spatial contrast on the basis of the whole data set; distortion measures are not considered. In other applications, such as classification, merging algorithms are required to maintain the spectral characteristics of the original data as much as possible [3], to avoid misinterpretation and the introduction of undesired effects.

Approaches based on principal component analysis (PCA), on transformation of the original data into the hue-intensity-saturation (HIS) color space (three bands at a time), and on high-pass filtering (HPF) [3] have been investigated in the literature to achieve the latter objective: HPF proved far more effective than the other algorithms in preserving the spectral features of the enhanced bands. Therefore, space-frequency image representations such as the discrete wavelet transform (WT) and the Laplacian pyramid (LP) have recently been investigated for image fusion aimed at contrast enhancement [2].

Multi-spectral Earth observations from space exhibit limited spatial resolution, unlike those from broad-spectrum imaging sensors, and may therefore be inadequate for specific identification tasks. A typical example is given by the Landsat Thematic Mapper (TM) multi-spectral imaging sensor, which has a 30 m x 30 m ground resolution in seven spectral bands, and by the SPOT panchromatic (PAN) sensor, which provides single-band observations over a broad wavelength interval with a 10 m x 10 m pixel size. Data fusion of Landsat-TM and SPOT-PAN images has been previously considered in the literature [3], owing to their availability and their complementary spatial/spectral features. This paper reports on a pyramid-based approach to data fusion of Landsat-TM and SPOT-PAN images that were previously registered on a common cartographic base, each at its own scale (30 m and 10 m, respectively). The proposed algorithm is a variant of the high-pass filter (HPF) method by Chavez et al. [3], recognized as one of the most efficient. Its generalization is achieved in a pyramid framework, since a generalized pyramid is an efficient structure in which both the high-pass filtering and the contrast enhancement algorithms can be easily implemented. Images are available at several different spatial scales. The expansion/reduction filters can be easily designed to cope with data of any resolution from different sensors. Once new data from different sensors become available on the selected test site (e.g., SAR data, digitized aerial photographs, hyper-spectral high-resolution aircraft data, data from new-generation satellites), they can be easily merged with the existing ones in order to assess any advantages arising from a cooperative analysis based on multiple imaging sources. The algorithm is assessed in terms of both objective scores and visual quality. Spectral feature preservation of Landsat images is evaluated. The performance of the merging procedure is first discussed and assessed in a comparison with an analogous scheme based on the WT [2], recently established in the literature. The pyramid algorithm is found to be superior on the basis of both subjective and objective criteria.

2 Multi-scale Image Analysis

2.1 Wavelet transform

The wavelet transform provides a multi-resolution representation of continuous and discrete signals [4]. When it is applied to a sequence of discrete data $f(n)$, the original signal can be regarded as the coefficients of the projection of a continuous function onto the highest-resolution subspace. The coefficients relative to the lower-resolution subspace and to its orthogonal complement are obtained by subsampling the discrete convolution of $f(n)$ with the impulse responses of two digital filters $H(\omega)$ and $G(\omega)$, respectively low-pass and high-pass [4]. The two output sequences represent a smoothed version of $f(n)$ and a detail signal, respectively; the latter, being the output of a high-pass filter, highlights the points at which rapid changes of the signal occur. In a similar manner, the higher-resolution data can be retrieved from the lower-resolution projections by up-sampling and low-pass filtering. Therefore, the wavelet representation is closely related to a two-channel sub-band decomposition scheme.
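The two-channel split described above can be sketched in a few lines. The text does not specify the filters, so the Haar pair is assumed here purely for illustration; the function names `haar_analysis` and `haar_synthesis` are hypothetical.

```python
import numpy as np

def haar_analysis(f):
    """Split an even-length sequence into smooth and detail halves (Haar assumed)."""
    f = np.asarray(f, dtype=float)
    even, odd = f[0::2], f[1::2]
    smooth = (even + odd) / np.sqrt(2.0)  # low-pass output, subsampled by 2
    detail = (even - odd) / np.sqrt(2.0)  # high-pass output, subsampled by 2
    return smooth, detail

def haar_synthesis(smooth, detail):
    """Rebuild the higher-resolution sequence from the two half-rate channels."""
    f = np.empty(2 * smooth.size)
    f[0::2] = (smooth + detail) / np.sqrt(2.0)
    f[1::2] = (smooth - detail) / np.sqrt(2.0)
    return f

# A step signal: the detail channel is non-zero only where the jump occurs.
signal = np.array([4.0, 4.0, 4.0, 10.0, 10.0, 10.0, 10.0, 10.0])
smooth, detail = haar_analysis(signal)
restored = haar_synthesis(smooth, detail)
```

With this short filter pair the synthesis step is written in its polyphase form; longer filters would require explicit up-sampling and filtering, as the text describes.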

Fig. 1. Two-level, seven-sub-band wavelet transform scheme (a) and associated spatial-frequency sub-bands LL, LH, HL, HH (b).

If two-dimensional signals are dealt with, a wavelet representation can be obtained by separately processing the rows and columns of the array. Let $x_0(m,n)$ be the original image with dimensions $M \times N$, and let $x_1^{LL}(m,n)$, $m = 0, \ldots, M/2 - 1$, $n = 0, \ldots, N/2 - 1$, be the lower-resolution sub-sequence obtained by low-pass filtering rows and columns; analogously, let $x_1^{LH}(m,n)$, $x_1^{HL}(m,n)$, and $x_1^{HH}(m,n)$, $m = 0, \ldots, M/2 - 1$, $n = 0, \ldots, N/2 - 1$, be the sub-sequences obtained by combining low-pass and high-pass filtering along the rows or the columns. Since high-pass filtering highlights edges in an image, $x_1^{LH}(m,n)$ and $x_1^{HL}(m,n)$ will contain information about vertical and horizontal contours, respectively. By analogous considerations, $x_1^{HH}(m,n)$ highlights diagonal details. Further splitting of $x_1^{LL}(m,n)$ yields a multi-level decomposition: the signals $x_2^{LL}(m,n)$, $x_2^{LH}(m,n)$, $x_2^{HL}(m,n)$, and $x_2^{HH}(m,n)$, $m = 0, \ldots, M/2^2 - 1$, $n = 0, \ldots, N/2^2 - 1$, are produced at the second level of the decomposition, and general expressions for the higher levels can easily be derived. Figure 1 shows the scheme for a two-level decomposition yielding a seven-sub-band representation; in the figure, the Wavelet Transform block denotes the one-level, four-sub-band separable splitting. The low-frequency coefficients $x_1^{LL}(m,n)$ are further decomposed, thus yielding a wavelet space-frequency representation in which the wavelet coefficients are arranged into sub-bands according to their spatial-frequency content.
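As a rough illustration of the separable splitting, the sketch below applies a one-level decomposition twice to obtain seven sub-bands. Haar filters are again an assumption, the helper names are hypothetical, and the mapping of the labels LH and HL to filtering directions follows one reading of the text (LH responds to vertical contours, HL to horizontal ones).

```python
import numpy as np

def haar_split(a, axis):
    """Haar low-pass/high-pass split with subsampling by 2 along one axis."""
    a = np.moveaxis(np.asarray(a, dtype=float), axis, 0)
    even, odd = a[0::2], a[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def wavelet_level(x):
    """One-level separable split into LL, LH, HL, HH (label convention assumed)."""
    row_low, row_high = haar_split(x, axis=1)   # filter along each row
    ll, hl = haar_split(row_low, axis=0)        # then along each column
    lh, hh = haar_split(row_high, axis=0)
    return ll, lh, hl, hh

image = np.zeros((8, 8))
image[:, 3:] = 1.0                              # a vertical step edge
ll1, lh1, hl1, hh1 = wavelet_level(image)
ll2, lh2, hl2, hh2 = wavelet_level(ll1)         # further split of the LL band
subbands = [ll2, lh2, hl2, hh2, lh1, hl1, hh1]  # seven sub-bands in total
```

For this test image only the LH band at each level carries energy, which matches the statement that it captures vertical contours.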


2.2 Laplacian pyramid

The Gaussian pyramid (GP) is a multi-resolution image representation obtained through a recursive reduction, i.e. low-pass filtering and decimation. Let $G_0(m,n)$, $m = 0,\dots,M-1$, $n = 0,\dots,N-1$, with $M = u \times 2^K$ and $N = v \times 2^K$, be the input image. The GP [5] is defined with a decimation factor of 2 ($\downarrow 2$) as

$$G_k(m,n) = \mathrm{reduce}_2[G_{k-1}](m,n) \triangleq \sum_{i=-L_r}^{L_r} \sum_{j=-L_r}^{L_r} r_2(i)\, r_2(j)\, G_{k-1}(2m+i,\, 2n+j) \qquad (1)$$

for $k = 1,\dots,K$, $m = 0,\dots,M/2^k - 1$, and $n = 0,\dots,N/2^k - 1$, in which $k$ identifies the level of the pyramid. The 2D reduction (low-pass) filter is given as the outer product of a linear symmetric odd-sized kernel $\{r_2(i)\}$, which should cut off at one half of the signal bandwidth to prevent aliasing.
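The reduction in (1) can be sketched directly from the double sum. Two choices below are assumptions, not taken from the paper: the kernel is a 5-tap binomial low-pass filter ($L_r = 2$), and samples beyond the image border are obtained by replicating the nearest edge pixel. The names `KERNEL`, `clamp`, `reduce2` and `gaussian_pyramid` are hypothetical.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]   # assumed r_2(i), i = -2..2, unit DC gain


def clamp(i, n):
    """Replicate border samples: map any index into the range 0..n-1."""
    return max(0, min(n - 1, i))


def reduce2(g):
    """Low-pass filter and decimate by 2 along both axes, as in (1)."""
    rows, cols = len(g), len(g[0])
    half = len(KERNEL) // 2
    out = []
    for m in range(rows // 2):
        out_row = []
        for n in range(cols // 2):
            acc = 0.0
            for i in range(-half, half + 1):
                for j in range(-half, half + 1):
                    weight = KERNEL[i + half] * KERNEL[j + half]   # separable 2D kernel
                    acc += weight * g[clamp(2 * m + i, rows)][clamp(2 * n + j, cols)]
            out_row.append(acc)
        out.append(out_row)
    return out


def gaussian_pyramid(g0, levels):
    """Return [G_0, G_1, ..., G_levels]."""
    pyramid = [g0]
    for _ in range(levels):
        pyramid.append(reduce2(pyramid[-1]))
    return pyramid


g0 = [[float(m + n) for n in range(8)] for m in range(8)]   # 8 x 8 ramp image
pyramid = gaussian_pyramid(g0, 2)
print([(len(g), len(g[0])) for g in pyramid])   # sizes halve at every level
```

Because the assumed kernel is symmetric with unit gain, a ramp is preserved away from the borders: the sample at (1, 1) of the first level equals the sample at (2, 2) of the input.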

From the GP, the Laplacian pyramid (LP) is defined, for $k = 0,\dots,K-1$, as

$$L_k(m,n) \triangleq G_k(m,n) - \mathrm{expand}_2[G_{k+1}](m,n) \qquad (2)$$

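in which $\mathrm{expand}_2[G_{k+1}]$ denotes that the $(k+1)$st level of the GP is expanded by a factor of 2 to match the size of the underlying $k$th level:

$$\mathrm{expand}_2[G_{k+1}](m,n) \triangleq \sum_{\substack{i=-L_e \\ (i+m) \bmod 2 = 0}}^{L_e} \; \sum_{\substack{j=-L_e \\ (j+n) \bmod 2 = 0}}^{L_e} e_2(i)\, e_2(j)\, G_{k+1}\!\left(\frac{i+m}{2},\, \frac{j+n}{2}\right) \qquad (3)$$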

for $m = 0,\dots,M/2^k - 1$, $n = 0,\dots,N/2^k - 1$, and $k = 0,\dots,K-1$. The 2D low-pass filter for expansion is given as the outer product of a linear symmetric odd-sized kernel $\{e_2(i)\}$, which again should cut off at one half of the bandwidth. Summation terms are taken to be null for noninteger values of $(i+m)/2$ and $(j+n)/2$, corresponding to the interleaving zeroes introduced by up-sampling ($\uparrow 2$).
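The following sketch builds a Laplacian pyramid from (1), (2) and (3) and then inverts it by adding each layer back to the expanded coarser level. The kernels are assumptions: a 5-tap binomial filter with unit gain for reduction, and the same taps scaled by 2 for expansion, so that the gain compensates for the interleaved zeroes. Border samples are replicated, which is also an assumption, and all names are hypothetical.

```python
REDUCE_KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]   # assumed r_2, DC gain 1
EXPAND_KERNEL = [1 / 8, 4 / 8, 6 / 8, 4 / 8, 1 / 8]        # assumed e_2, DC gain 2


def clamp(i, n):
    return max(0, min(n - 1, i))


def reduce2(g):
    """Equation (1): low-pass filter, then keep one sample out of two."""
    rows, cols, half = len(g), len(g[0]), len(REDUCE_KERNEL) // 2
    span = range(-half, half + 1)
    return [[sum(REDUCE_KERNEL[i + half] * REDUCE_KERNEL[j + half]
                 * g[clamp(2 * m + i, rows)][clamp(2 * n + j, cols)]
                 for i in span for j in span)
             for n in range(cols // 2)] for m in range(rows // 2)]


def expand2(g):
    """Equation (3): only terms with even (i + m) and (j + n) contribute."""
    rows, cols, half = len(g), len(g[0]), len(EXPAND_KERNEL) // 2
    out = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for m in range(2 * rows):
        for n in range(2 * cols):
            acc = 0.0
            for i in range(-half, half + 1):
                if (i + m) % 2:
                    continue   # interleaved zero along the rows
                for j in range(-half, half + 1):
                    if (j + n) % 2:
                        continue   # interleaved zero along the columns
                    acc += (EXPAND_KERNEL[i + half] * EXPAND_KERNEL[j + half]
                            * g[clamp((i + m) // 2, rows)][clamp((j + n) // 2, cols)])
            out[m][n] = acc
    return out


def combine(a, b, sign):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def laplacian_pyramid(g0, levels):
    """Return ([L_0, ..., L_{levels-1}], G_levels) following (2)."""
    gaussians = [g0]
    for _ in range(levels):
        gaussians.append(reduce2(gaussians[-1]))
    layers = [combine(gaussians[k], expand2(gaussians[k + 1]), -1) for k in range(levels)]
    return layers, gaussians[-1]


def reconstruct(layers, top):
    """Invert (2) level by level: G_k = L_k + expand2[G_{k+1}]."""
    g = top
    for layer in reversed(layers):
        g = combine(layer, expand2(g), +1)
    return g


g0 = [[float((3 * m + 5 * n) % 7) for n in range(8)] for m in range(8)]
layers, top = laplacian_pyramid(g0, 2)
rebuilt = reconstruct(layers, top)
```

Reconstruction is exact up to floating-point rounding whatever kernels are chosen, because each layer stores precisely what the expansion of the coarser level fails to reproduce.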

2.3 Generalized LP with a rational scale factor

The expressions found for (1) and (3) may be generalized to comprise reduction and expansion factors different from 2 [7]. Reduction by q is defined as:

$$\mathrm{reduce}_q[G_k](m,n) \triangleq \sum_{i=-L_r}^{L_r} \sum_{j=-L_r}^{L_r} r_q(i)\, r_q(j)\, G_k(qm+i,\, qn+j) \qquad (4)$$

The reduction (low-pass) filter $\{r_q(i)\}$ should be designed to cut off at one $q$th of the signal bandwidth. Expansion by a factor $p$ is defined as:

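$$\mathrm{expand}_p[G_k](m,n) \triangleq \sum_{\substack{i=-L_e \\ (i+m) \bmod p = 0}}^{L_e} \; \sum_{\substack{j=-L_e \\ (j+n) \bmod p = 0}}^{L_e} e_p(i)\, e_p(j)\, G_k\!\left(\frac{i+m}{p},\, \frac{j+n}{p}\right) \qquad (5)$$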


The expansion filter $\{e_p(i)\}$ should cut off at one $p$th of the signal bandwidth. Summation terms are null for noninteger values of $(i+m)/p$ and $(j+n)/p$.

If $p/q > 1$, with $p$ and $q$ integers, is the scale factor between two images to be merged, (1) becomes the cascade of an expansion by $q$ and a reduction by $p$:

$$G_{k+1} = \mathrm{reduce}_{p/q}[G_k] \triangleq \mathrm{reduce}_p\{\mathrm{expand}_q[G_k]\} \qquad (6)$$

while (3) becomes an expansion by $p$ followed by a reduction by $q$:

$$\mathrm{expand}_{p/q}[G_k] \triangleq \mathrm{reduce}_q\{\mathrm{expand}_p[G_k]\} \qquad (7)$$

When (4) is cascaded to (5), convolution can be skipped after up-sampling in (6), as well as before down-sampling in (7).

The Generalized Laplacian Pyramid (GLP) with $p/q$ scale factor between adjacent layers, $L_k$, can thus be defined as:

$$L_k(m,n) \triangleq G_k(m,n) - \mathrm{expand}_{p/q}\{\mathrm{reduce}_{p/q}[G_k]\}(m,n) \qquad (8)$$

Filter design is usually a trade-off between selectivity (sharp cut-off) and computational cost. Filters with different characteristics have to be designed to cope with the bandwidth requirements of data fusion algorithms. In particular, for a $p/q$ scale ratio, only low-pass filters with $1/p$ and $1/q$ normalized frequency cut-offs are needed. The WT, instead, also requires a high-pass filter (i.e. a complete filter bank), which must generally be re-designed for every value of $p/q$.
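The cascades (6) and (7) and the GLP layer (8) can be illustrated on a one-dimensional signal; the 2D case would apply the same steps separably along rows and columns. The sketch makes several assumptions: crude triangular (linear-interpolation) kernels stand in for properly designed filters, border samples are replicated, and both filters are applied in every cascade, so the shortcut of skipping one convolution is not used. All names are hypothetical.

```python
def clamp(i, n):
    return max(0, min(n - 1, i))


def triangle(width):
    """Taps 1 - |i| / width for i = -(width - 1)..(width - 1); they sum to `width`."""
    return [1 - abs(i) / width for i in range(-(width - 1), width)]


def reduce_by(q, g):
    """1D counterpart of (4): low-pass filter with unit gain, then decimate by q."""
    kernel = [tap / q for tap in triangle(q)]
    half = q - 1
    return [sum(kernel[i + half] * g[clamp(q * m + i, len(g))]
                for i in range(-half, half + 1))
            for m in range(len(g) // q)]


def expand_by(p, g):
    """1D counterpart of (5): only terms with (i + m) divisible by p contribute."""
    kernel = triangle(p)   # gain p compensates for the p - 1 interleaved zeroes
    half = p - 1
    out = []
    for m in range(p * len(g)):
        acc = 0.0
        for i in range(-half, half + 1):
            if (i + m) % p == 0:
                acc += kernel[i + half] * g[clamp((i + m) // p, len(g))]
        out.append(acc)
    return out


def reduce_ratio(p, q, g):
    """Equation (6): expansion by q followed by reduction by p."""
    return reduce_by(p, expand_by(q, g))


def expand_ratio(p, q, g):
    """Equation (7): expansion by p followed by reduction by q."""
    return reduce_by(q, expand_by(p, g))


def glp_layer(p, q, g):
    """Equation (8): return the coarser level and the detail layer."""
    coarse = reduce_ratio(p, q, g)
    detail = [x - y for x, y in zip(g, expand_ratio(p, q, coarse))]
    return coarse, detail


g0 = [float((7 * n) % 5) for n in range(12)]
g1, l0 = glp_layer(3, 2, g0)    # scale ratio p/q = 3/2
print(len(g0), len(g1), len(l0))
```

With a 3/2 ratio the 12 input samples yield 8 coarse samples, while the detail layer keeps the full length of the input. A flat signal produces a detail layer that is zero up to rounding.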

3 Multi-resolution Data Fusion Schemes

The idea of the wavelet-based image fusion algorithm developed by Li et al. [2] is to merge couples of sub-bands of corresponding frequency content on the basis of an activity measure locally computed on 2 x 2 blocks of coefficients. The fused image is produced by taking the inverse transform of the blocks of coefficients chosen as the more active between the two images.
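The block-wise selection rule can be sketched as follows for one pair of corresponding sub-bands. The text does not specify the activity measure, so the maximum absolute coefficient in each 2 × 2 block is used here purely as an assumption; the names `block_activity` and `fuse_subbands` are hypothetical, and the sub-bands are assumed to have even dimensions.

```python
def block_activity(band, r, c):
    """Assumed activity measure: largest absolute coefficient in the 2 x 2 block at (r, c)."""
    return max(abs(band[r + dr][c + dc]) for dr in (0, 1) for dc in (0, 1))


def fuse_subbands(a, b):
    """For every 2 x 2 block, copy the coefficients of the more active input."""
    rows, cols = len(a), len(a[0])
    fused = [[0.0] * cols for _ in range(rows)]
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            # Ties are resolved in favour of the first input.
            source = a if block_activity(a, r, c) >= block_activity(b, r, c) else b
            for dr in (0, 1):
                for dc in (0, 1):
                    fused[r + dr][c + dc] = source[r + dr][c + dc]
    return fused


band_a = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0]]
band_b = [[0.0, 0.0, 0.0, -3.0],
          [0.0, 0.0, 2.0, 0.0]]
fused = fuse_subbands(band_a, band_b)
print(fused)
```

In this toy case the left block is taken from `band_a` and the right block from `band_b`. The fused image would then be obtained by applying the inverse transform to the fused sub-bands, a step not shown here.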

The block diagram reported in Figure 2 describes the data fusion algorithm in the general case of two image data sets, preliminarily registered on the same cartographic base, whose scale ratio is $p/q$. Let $S_1$ be the data set constituted by a single image having smaller scale and $S_2$ the data set made up of several multi-spectral observations with larger scale. The goal is to obtain a set of as many multi-spectral images as in $S_2$, each having the same spatial resolution as $S_1$. The upgrade of $S_2$ to the resolution of $S_1$ is the zero-mean GLP (8) of $S_1$, computed for $k = 0$. The high-pass component from $S_1$ is added to each of the expanded images of $S_2$ to yield an enhanced set of multi-spectral observations, $S_3$.
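The procedure can be sketched on one-dimensional signals under simplifying assumptions: the scale ratio is an integer $p$ (that is, $q = 1$), the kernels are triangular stand-ins for properly designed filters, border samples are replicated, and no separate mean removal is applied to the detail. The names `fuse`, `pan` and `bands` are hypothetical.

```python
def clamp(i, n):
    return max(0, min(n - 1, i))


def triangle(width):
    """Taps 1 - |i| / width for i = -(width - 1)..(width - 1); they sum to `width`."""
    return [1 - abs(i) / width for i in range(-(width - 1), width)]


def reduce_by(p, g):
    """Low-pass filter with unit gain, then keep one sample out of p."""
    kernel = [tap / p for tap in triangle(p)]
    half = p - 1
    return [sum(kernel[i + half] * g[clamp(p * m + i, len(g))]
                for i in range(-half, half + 1))
            for m in range(len(g) // p)]


def expand_by(p, g):
    """Interleave p - 1 zeroes between samples, then interpolate with gain p."""
    kernel = triangle(p)
    half = p - 1
    out = []
    for m in range(p * len(g)):
        acc = 0.0
        for i in range(-half, half + 1):
            if (i + m) % p == 0:
                acc += kernel[i + half] * g[clamp((i + m) // p, len(g))]
        out.append(acc)
    return out


def fuse(pan, bands, p):
    """Add the high-pass layer of `pan` (S_1) to every expanded band of S_2."""
    low_pass = expand_by(p, reduce_by(p, pan))
    detail = [x - y for x, y in zip(pan, low_pass)]   # GLP layer (8) for k = 0, q = 1
    return [[u + d for u, d in zip(expand_by(p, band), detail)] for band in bands]


pan = [float((5 * n) % 7) for n in range(12)]   # high-resolution single image, S_1
band_a = reduce_by(3, pan)                      # a band that is exactly the low-pass of pan
band_b = [2.0 * v + 1.0 for v in band_a]        # a second, differently scaled band
fused_a, fused_b = fuse(pan, [band_a, band_b], 3)
```

As a sanity check, when a low-resolution band coincides with the reduced version of the high-resolution image, as `band_a` does here, the fused band reproduces that image up to rounding. Every fused band has the length of the high-resolution input.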

4 Experimental Results

Figures 3(a) and (b) show Band 6 (thermal infrared) and Band 5 (near infrared) of a Landsat TM image portraying a zone of the Elba island, in Tuscany, Italy.


Fig. 2. Outline of data fusion procedure for two images with a $p/q$ scale ratio.

Due to SNR constraints, TM Band 6 is actually sensed with a ground resolution of 120 m/pel and resampled in order to match the size of the other bands (30 m/pel). Figures 3(c) and (d) show the fusion results of the two algorithms. The results of Fig. 3(c) have been obtained through the FIR implementation of a cubic spline WT [4,6]. Although the wavelet-fused image looks sharper, artifacts are perceivable around edges, due to ringing effects.

Fig. 3. Fusion of Landsat TM Band 6 (a) with Band 5 (b): (c) wavelet scheme; (d) pyramid scheme. Both images are 256 × 256 details.

Table 1. Standard deviations (STD) of the differences obtained by subtracting merged images from expanded TM bands, and percentage of pixels (P ± 1) whose absolute differences are equal to either one or zero. Results are reported for the wavelet (WT) and pyramid (GLP) schemes.

TM Band   WT: STD   GLP: STD   WT: P ± 1   GLP: P ± 1
1         2.93      2.65       40.91 %     53.63 %
2         2.69      2.32       41.89 %     49.35 %
3         2.62      2.48       41.47 %     53.94 %
4         2.24      2.22       40.96 %     64.09 %
5         2.27      2.33       41.66 %     64.08 %
7         2.39      2.34       41.63 %     65.67 %

Spectral feature preservation is evaluated by taking the pixel differences between any of the merged images and a linearly resampled version of Band 6 (both integer valued). These differences are expected to be either zero or very small on homogeneous areas, and relevant on contours or highly textured areas. The standard deviation of such differences and the number of pixels in which they are equal or very close to zero represent two figures of merit for image data fusion [3]. The former should be as close to zero as possible. Due to roundoff to integers, pixel differences are taken to be null if their absolute values do not exceed unity. Table 1 reports the scores of each TM band. It is apparent that the pixel percentages are far larger for the pyramid scheme. Standard deviations are slightly smaller for the pyramid, with the only exception of Band 5. The values of the parameters reported have been optimized over the pyramid-generating filter (15 taps) and are steady. The results of Table 1 are better than those reported in [3], thanks to the multi-resolution framework, which allows a better filter design.
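The two figures of merit can be computed as in the sketch below. It assumes that the standard deviation is the population standard deviation of the differences, which the text does not state explicitly; the function name `fusion_scores` and the sample values are hypothetical and are not data from the experiments.

```python
from statistics import pstdev


def fusion_scores(merged, reference):
    """Return (STD of pixel differences, percentage of pixels with |difference| <= 1)."""
    diffs = [m - r
             for merged_row, reference_row in zip(merged, reference)
             for m, r in zip(merged_row, reference_row)]
    std = pstdev(diffs)   # assumption: population standard deviation
    within_one = sum(abs(d) <= 1 for d in diffs)
    return std, 100.0 * within_one / len(diffs)


# Hypothetical integer-valued 2 x 2 images, for illustration only.
merged = [[10, 12], [20, 17]]
reference = [[10, 11], [18, 20]]
std, pct = fusion_scores(merged, reference)
print(std, pct)
```

Here the differences are 0, 1, 2 and -3, so two of the four pixels fall within the unit tolerance.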

SPOT Panchromatic and Landsat TM data were available for the test area of Metaponto, in Southern Italy. The images were registered on the same cartographic base, each maintaining its own scale. The p/q ratio is 3. Original SPOT-PAN and TM Band-5 images are shown in Figure 4, together with the enhanced TM Band-5 version, in order to visually assess the quality of the results. Contours and textures are highlighted. The local average level is carefully preserved. Such a feature is important in determining spectral signatures, and its alteration may be responsible for misclassification and misinterpretation.

Fig. 4. 256 × 192 detail of the SPOT-PAN image (a), ground resolution 10 m, and TM-5 image (b) of the test area: the resolution is 30 m and a magnification by 3 has been applied for display. TM-5 image pyramid-fused with SPOT-PAN (c). Performance parameters, as defined in Table 1, are STD = 4.05 and P ± 1 = 36.44 %.

Acknowledgments

This work was carried out under grants of ASI -Italian Space Agency- within a joint project on multisource classification, and of CNR -National Research Council- in the framework of the nationwide project on Cultural Heritage.

References

1. R. C. Luo and M. G. Kay, "Multisensor integration and fusion in intelligent systems," IEEE Trans. Systems, Man, and Cybernetics, 19(5), 901-931 (1989).

2. H. Li, B. S. Manjunath, and S. K. Mitra, "Multisensor image fusion using the wavelet transform," CVGIP: Graphical Models and Image Processing, 57(3), 235-245 (1995).

3. P. S. Chavez Jr., S. C. Sides, and J. A. Anderson, "Comparison of three different methods to merge multiresolution and multispectral data: Landsat TM and SPOT panchromatic," Photogram. Engin. Remote Sensing, 57(3), 295-303 (1991).

4. S. Mallat, "A Theory for Multiresolution Signal Decomposition: the Wavelet Representation," IEEE Trans. Pattern Anal. Machine Intell., 11(7), 674-693 (1989).

5. P. J. Burt, "The pyramid as a structure for efficient computation," in Multiresolution Image Processing and Analysis, A. Rosenfeld (Ed.), Berlin, Springer-Verlag (1984).

6. M. Unser and A. Aldroubi, "Polynomial Splines and Wavelets - A Signal Processing Perspective," in Wavelets - A Tutorial in Theory and Applications, C. K. Chui (Ed.), Academic Press, 91-122 (1992).

7. M. G. Kim, I. Dinstein, and L. Shaw, "A Prototype Filter Design Approach to Pyramid Generation," IEEE Trans. Pattern Anal. Machine Intell., 15(12), 1233-1240 (1993).
