
HANDBOOK OF COMPUTER VISION AND APPLICATIONS
Volume 2
Signal Processing and Pattern Recognition
Bernd Jähne, Horst Haußecker, Peter Geißler

ACADEMIC PRESS

Handbook of Computer Vision and Applications
Volume 2
Signal Processing and Pattern Recognition

Handbook of Computer Vision and Applications
Volume 2
Signal Processing and Pattern Recognition

Editors
Bernd Jähne
Interdisciplinary Center for Scientific Computing, University of Heidelberg, Heidelberg, Germany
and Scripps Institution of Oceanography, University of California, San Diego

Horst Haußecker
Peter Geißler
Interdisciplinary Center for Scientific Computing, University of Heidelberg, Heidelberg, Germany

ACADEMIC PRESS
San Diego  London  Boston  New York  Sydney  Tokyo  Toronto

This book is printed on acid-free paper.

Copyright © 1999 by Academic Press. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

The appearance of code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per-copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-1999 chapters are as shown on the title pages; if no fee code appears on the title page, the copy fee is the same as for current chapters. ISBN 0-12-379770-5/$30.00

ACADEMIC PRESS
A Division of Harcourt Brace & Company
525 B Street, Suite 1900, San Diego, CA 92101-4495
http://www.apnet.com

ACADEMIC PRESS
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/

Library of Congress Cataloging-in-Publication Data

Handbook of computer vision and applications / edited by Bernd Jähne, Horst Haussecker, Peter Geissler.
p. cm.
Includes bibliographical references and indexes.
Contents: v. 1. Sensors and imaging -- v. 2. Signal processing and pattern recognition -- v. 3. Systems and applications.
ISBN 0-12-379770-5 (set); ISBN 0-12-379771-3 (v. 1); ISBN 0-12-379772-1 (v. 2); ISBN 0-12-379773-X (v. 3)
1. Computer vision -- Handbooks, manuals, etc. I. Jähne, Bernd, 1953- . II. Haussecker, Horst, 1968- . III. Geissler, Peter, 1966- .
TA1634.H36 1999
006.3'7 dc21    98-42541 CIP

Printed in the United States of America
99 00 01 02 03 DS 9 8 7 6 5 4 3 2 1

Contents

Preface
Contributors

1 Introduction (B. Jähne)
1.1 Signal processing for computer vision
1.2 Pattern recognition for computer vision
1.3 Computational complexity and fast algorithms
1.4 Performance evaluation of algorithms
1.5 References

I Signal Representation

2 Continuous and Digital Signals (B. Jähne)
2.1 Introduction
2.2 Continuous signals
2.3 Discrete signals
2.4 Relation between continuous and discrete signals
2.5 Quantization
2.6 References

3 Spatial and Fourier Domain (B. Jähne)
3.1 Vector spaces and unitary transforms
3.2 Continuous Fourier transform (FT)
3.3 The discrete Fourier transform (DFT)
3.4 Fast Fourier transform algorithms (FFT)
3.5 References

4 Multiresolutional Signal Representation (B. Jähne)
4.1 Scale in signal processing
4.2 Scale filters
4.3 Scale space and diffusion
4.4 Multigrid representations
4.5 References

II Elementary Spatial Processing

5 Neighborhood Operators (B. Jähne)
5.1 Introduction
5.2 Basics
5.3 Linear shift-invariant filters
5.4 Recursive filters
5.5 Classes of nonlinear filters
5.6 Efficient neighborhood operations
5.7 References

6 Principles of Filter Design (B. Jähne, H. Scharr, and S. Körkel)
6.1 Introduction
6.2 Filter design criteria
6.3 Windowing techniques
6.4 Filter cascading
6.5 Filter design as an optimization problem
6.6 Design of steerable filters and filter families
6.7 References

7 Local Averaging (B. Jähne)
7.1 Introduction
7.2 Basic features
7.3 Box filters
7.4 Binomial filters
7.5 Cascaded averaging
7.6 Weighted averaging
7.7 References

8 Interpolation (B. Jähne)
8.1 Introduction
8.2 Basics
8.3 Interpolation in Fourier space
8.4 Polynomial interpolation
8.5 Spline-based interpolation
8.6 Optimized interpolation
8.7 References

9 Image Warping (B. Jähne)
9.1 Introduction
9.2 Forward and inverse mapping
9.3 Basic geometric transforms
9.4 Fast algorithms for geometric transforms
9.5 References

III Feature Estimation

10 Local Structure (B. Jähne)
10.1 Introduction
10.2 Properties of simple neighborhoods
10.3 Edge detection by first-order derivatives
10.4 Edge detection by zero crossings
10.5 Edges in multichannel images
10.6 First-order tensor representation
10.7 References

11 Principles for Automatic Scale Selection (T. Lindeberg)
11.1 Introduction
11.2 Multiscale differential image geometry
11.3 A general scale-selection principle
11.4 Feature detection with automatic scale selection
11.5 Feature localization with automatic scale selection
11.6 Stereo matching with automatic scale selection
11.7 Summary and conclusions
11.8 References

12 Texture Analysis (T. Wagner)
12.1 Importance of texture
12.2 Feature sets for texture analysis
12.3 Assessment of textural features
12.4 Automatic design of texture analysis systems
12.5 References

13 Motion (H. Haußecker and H. Spies)
13.1 Introduction
13.2 Basics: flow and correspondence
13.3 Optical flow-based motion estimation
13.4 Quadrature filter techniques
13.5 Correlation and matching
13.6 Modeling of flow fields
13.7 Confidence measures and error propagation
13.8 Comparative analysis
13.9 References

14 Bayesian Multiscale Differential Optical Flow (E. P. Simoncelli)
14.1 Introduction
14.2 Differential formulation
14.3 Uncertainty model
14.4 Coarse-to-fine estimation
14.5 Implementation issues
14.6 Examples
14.7 Conclusion
14.8 References

15 Nonlinear Diffusion Filtering (J. Weickert)
15.1 Introduction
15.2 Filter design
15.3 Continuous theory
15.4 Algorithmic details
15.5 Discrete theory
15.6 Parameter selection
15.7 Generalizations
15.8 Summary
15.9 References

16 Variational Methods (C. Schnörr)
16.1 Introduction
16.2 Processing of two- and three-dimensional images
16.3 Processing of vector-valued images
16.4 Processing of image sequences
16.5 References

17 Stereopsis - Geometrical and Global Aspects (H. A. Mallot)
17.1 Introduction
17.2 Stereo geometry
17.3 Global stereopsis
17.4 References

18 Stereo Terrain Reconstruction by Dynamic Programming (G. Gimel'farb)
18.1 Introduction
18.2 Statistical decisions in terrain reconstruction
18.3 Probability models of epipolar profiles
18.4 Dynamic programming reconstruction
18.5 Experimental results
18.6 References

19 Reflectance-Based Shape Recovery (R. Klette, R. Kozera, and K. Schlüns)
19.1 Introduction
19.2 Reflection and gradients
19.3 Three light sources
19.4 Two light sources
19.5 Theoretical framework for shape from shading
19.6 Shape from shading
19.7 Concluding remarks
19.8 References

20 Depth-from-Focus (P. Geißler and T. Dierig)
20.1 Introduction
20.2 Basic concepts
20.3 Principles of depth-from-focus algorithms
20.4 Multiple-view depth-from-focus
20.5 Dual-view depth-from-focus
20.6 Single-view depth-from-focus
20.7 References

IV Object Analysis, Classification, Modeling, Visualization

21 Morphological Operators (P. Soille)
21.1 Introduction
21.2 Basics
21.3 Morphological operators
21.4 Efficient computation of morphological operators
21.5 Morphological image processing
21.6 References

22 Fuzzy Image Processing (H. Haußecker and H. R. Tizhoosh)
22.1 Introduction
22.2 Why fuzzy image processing?
22.3 Fuzzy image understanding
22.4 Fuzzy image processing systems
22.5 Theoretical components of fuzzy image processing
22.6 Selected application examples
22.7 Conclusions
22.8 References

23 Neural Net Computing for Image Processing (A. Meyer-Bäse)
23.1 Introduction
23.2 Multilayer perceptron (MLP)
23.3 Self-organizing neural networks
23.4 Radial-basis neural networks (RBNN)
23.5 Transformation radial-basis networks (TRBNN)
23.6 Hopfield neural networks
23.7 References

24 Graph Theoretical Concepts for Computer Vision (D. Willersinn et al.)
24.1 Introduction
24.2 Basic definitions
24.3 Graph representation of two-dimensional digital images
24.4 Voronoi diagrams and Delaunay graphs
24.5 Matching
24.6 Graph grammars
24.7 References

25 Shape Reconstruction from Volumetric Data (R. Eils and K. Sätzler)
25.1 Introduction
25.2 Incremental approach
25.3 Three-dimensional shape reconstruction from contour lines
25.4 Volumetric shape reconstruction
25.5 Summary
25.6 References

26 Probabilistic Modeling in Computer Vision (J. Hornegger, D. Paulus, and H. Niemann)
26.1 Introduction
26.2 Why probabilistic models?
26.3 Object recognition: classification and regression
26.4 Parametric families of model densities
26.5 Automatic model generation
26.6 Practical issues
26.7 Summary, conclusions, and discussion
26.8 References

27 Knowledge-Based Interpretation of Images (H. Niemann)
27.1 Introduction
27.2 Model of the task domain
27.3 Interpretation by optimization
27.4 Control by graph search
27.5 Control by combinatorial optimization
27.6 Judgment function
27.7 Extensions and remarks
27.8 References

28 Visualization of Volume Data (J. Hesser and C. Poliwoda)
28.1 Selected visualization techniques
28.2 Basic concepts and notation for visualization
28.3 Surface rendering algorithms and OpenGL
28.4 Volume rendering
28.5 The graphics library VGL
28.6 How to use volume rendering
28.7 Volume rendering
28.8 Acknowledgments
28.9 References

29 Databases for Microscopes and Microscopical Images (N. Salmon, S. Lindek, and E. H. K. Stelzer)
29.1 Introduction
29.2 Towards a better system for information management
29.3 From flat files to database systems
29.4 Database structure and content
29.5 Database system requirements
29.6 Data flow: how it looks in practice
29.7 Future prospects
29.8 References

Index

Preface

What this handbook is about

This handbook offers a fresh approach to computer vision. The whole vision process from image formation to measuring, recognition, or reacting is regarded as an integral process. Computer vision is understood as the host of techniques to acquire, process, analyze, and understand complex higher-dimensional data from our environment for scientific and technical exploration.

In this sense the handbook takes into account the interdisciplinary nature of computer vision with its links to virtually all natural sciences and attempts to bridge two important gaps. The first is between modern physical sciences and the many novel techniques to acquire images. The second is between basic research and applications. When a reader with a background in one of the fields related to computer vision feels he has learned something from one of the many other facets of computer vision, the handbook will have fulfilled its purpose.

The handbook comprises three volumes. The first volume, Sensors and Imaging, covers image formation and acquisition. The second volume, Signal Processing and Pattern Recognition, focuses on processing of the spatial and spatiotemporal signal acquired by imaging sensors. The third volume, Systems and Applications, describes how computer vision is integrated into systems and applications.

Prerequisites

It is assumed that the reader is familiar with elementary mathematical concepts commonly used in computer vision and in many other areas of natural sciences and technical disciplines. This includes the basics of set theory, matrix algebra, differential and integral equations, complex numbers, Fourier transform, probability, random variables, and graphing. Wherever possible, mathematical topics are described intuitively. In this respect it is very helpful that complex mathematical relations can often be visualized intuitively by images. For a more formal treatment of the corresponding subject including proofs, suitable references are given.

How to use this handbook

The handbook has been designed to cover the different needs of its readership. First, it is suitable for sequential reading. In this way the reader gets an up-to-date account of the state of computer vision. It is presented in a way that makes it accessible for readers with different backgrounds. Second, the reader can look up specific topics of interest. The individual chapters are written in a self-consistent way with extensive cross-referencing to other chapters of the handbook and external references.

The CD that accompanies each volume of the handbook contains the complete text of the handbook in the Adobe Acrobat portable document file format (PDF). This format can be read on all major platforms. Free Acrobat Reader version 3.01 for all major computing platforms is included on the CDs. The texts are hyperlinked in multiple ways. Thus the reader can collect the information of interest with ease. Third, the reader can delve more deeply into a subject with the material on the CDs. They contain additional reference material, interactive software components, code examples, image material, and references to sources on the Internet. For more details see the readme file on the CDs.

Acknowledgments

Writing a handbook on computer vision with this breadth of topics is a major undertaking that can succeed only in a coordinated effort that involves many co-workers. Thus the editors would like to thank first all contributors who were willing to participate in this effort. Their cooperation with the constrained time schedule made it possible that the three-volume handbook could be published in such a short period following the call for contributions in December 1997. The editors are deeply grateful for the dedicated and professional work of the staff at AEON Verlag & Studio who did most of the editorial work. We also express our sincere thanks to Academic Press for the opportunity to write this handbook and for all professional advice. Last but not least, we encourage the reader to send us any hints on errors, omissions, typing errors, or any other shortcomings of the handbook. Actual information about the handbook can be found at the editors' homepage http://klimt.iwr.uni-heidelberg.de.

Heidelberg, Germany and La Jolla, California, December 1998
Bernd Jähne, Horst Haußecker, Peter Geißler

Contributors

Etienne Bertin received the PhD degree in mathematics from Université Joseph Fourier in 1994. From 1990 to 1995 he worked on various topics in image analysis and computational geometry. Since 1995, he has been an assistant professor at the Université Pierre Mendès France in the Laboratoire de statistique et d'analyses de données; he works on stochastic geometry.
Dr. Etienne Bertin, Laboratoire de statistique et d'analyse de données, Université Pierre Mendès France, Grenoble, France, [email protected]

Anke Meyer-Bäse received her M.S. and PhD in electrical engineering from the Darmstadt Institute of Technology in 1990 and 1995, respectively. From 1995 to 1996 she was a postdoctoral fellow with the Federal Institute of Neurobiology, Magdeburg, Germany. Since 1996 she has been a visiting assistant professor with the Dept. of Electrical Engineering, University of Florida, Gainesville, USA. She received the Max-Kade award in Neuroengineering in 1996 and the Lise-Meitner prize in 1997. Her research interests include neural networks, image processing, biomedicine, speech recognition, and the theory of nonlinear systems.
Dr. Anke Meyer-Bäse, Dept. of Electrical Engineering and Computer Science, University of Florida, 454 New Engineering Building 33, Center Drive, PO Box 116130, Gainesville, FL 32611-6130, U.S., [email protected]

Tobias Dierig graduated in 1997 from the University of Heidelberg with a master degree in physics and is now pursuing his PhD at the Interdisciplinary Center for Scientific Computing at Heidelberg University. He is concerned mainly with depth-from-focus algorithms, image fusion, and industrial applications of computer vision within the OpenEye project.
Tobias Dierig, Forschungsgruppe Bildverarbeitung, IWR, Universität Heidelberg, Im Neuenheimer Feld 368, D-69120 Heidelberg, Germany, [email protected], http://klimt.iwr.uni-heidelberg.de/tdierig


Roland Eils studied mathematics and computer science in Aachen, where he received his diploma in 1990. After a two-year stay in Indonesia for language studies he joined the Graduiertenkolleg "Modeling and Scientific Computing in Mathematics and Sciences" at the Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, where he received his doctoral degree in 1995. Since 1996 he has been leading the biocomputing group, Structures in Molecular Biology. His research interests include computer vision, in particular computational geometry, and application of image processing techniques in science and biotechnology.
Dr. Roland Eils, Biocomputing-Gruppe, IWR, Universität Heidelberg, Im Neuenheimer Feld 368, D-69120 Heidelberg, Germany, [email protected], http://www.iwr.uni-heidelberg.de/iwr/bioinf

Peter Geißler studied physics in Heidelberg. He received his diploma and doctoral degree from Heidelberg University in 1994 and 1998, respectively. His research interests include computer vision, especially depth-from-focus, adaptive filtering, and flow visualization, as well as the application of image processing in physical sciences and oceanography.
Dr. Peter Geißler, Forschungsgruppe Bildverarbeitung, IWR, Universität Heidelberg, Im Neuenheimer Feld 368, D-69120 Heidelberg, Germany, [email protected], http://klimt.iwr.uni-heidelberg.de

Georgy Gimel'farb received his PhD degree from the Ukrainian Academy of Sciences in 1969 and his Doctor of Science (the habilitation) degree from the Higher Certifying Commission of the USSR in 1991. In 1962, he began working in the Pattern Recognition, Robotics, and Image Recognition Departments of the Institute of Cybernetics (Ukraine). In 1994-1997 he was an invited researcher in Hungary, the USA, Germany, and France. Since 1997, he has been a senior lecturer in computer vision and digital TV at the University of Auckland, New Zealand. His research interests include analysis of multiband space and aerial images, computational stereo, and image texture analysis.
Dr. Georgy Gimel'farb, Centre for Image Technology and Robotics, Department of Computer Science, Tamaki Campus, The University of Auckland, Private Bag 92019, Auckland 1, New Zealand, [email protected], http://www.tcs.auckland.ac.nz/georgy


Horst Haußecker studied physics in Heidelberg. He received his diploma in physics and his doctoral degree from Heidelberg University in 1994 and 1996, respectively. He was a visiting scientist at the Scripps Institution of Oceanography in 1994. Currently he is conducting research in the image processing research group at the Interdisciplinary Center for Scientific Computing (IWR), where he also lectures on optical flow computation. His research interests include computer vision, especially image sequence analysis, infrared thermography, and fuzzy image processing, as well as the application of image processing in physical sciences and oceanography.
Dr. Horst Haußecker, Forschungsgruppe Bildverarbeitung, IWR, Universität Heidelberg, Im Neuenheimer Feld 368, D-69120 Heidelberg, [email protected], http://klimt.iwr.uni-heidelberg.de

Jürgen Hesser is assistant professor at the Lehrstuhl für Informatik V, University of Mannheim, Germany. He heads the groups on computer graphics, bioinformatics, and optimization. His research interests are real-time volume rendering, computer architectures, computational chemistry, and evolutionary algorithms. In addition, he is co-founder of Volume Graphics GmbH, Heidelberg. Hesser received his PhD and his diploma in physics at the University of Heidelberg, Germany.
Jürgen Hesser, Lehrstuhl für Informatik V, Universität Mannheim, B6, 26, D-68131 Mannheim, Germany, [email protected]

Joachim Hornegger graduated in 1992 and received his PhD degree in computer science in 1996 from the Universität Erlangen-Nürnberg, Germany, for his work on statistical object recognition. Joachim Hornegger was a research and teaching associate at Universität Erlangen-Nürnberg, a visiting scientist at the Technion, Israel, and at the Massachusetts Institute of Technology, U.S. He is currently a research scholar and teaching associate at Stanford University, U.S. Joachim Hornegger is the author of 30 technical papers in computer vision and speech processing and three books. His research interests include 3-D computer vision, 3-D object recognition, and statistical methods applied to image analysis problems.
Dr. Joachim Hornegger, Stanford University, Robotics Laboratory, Gates Building 1A, Stanford, CA 94305-9010, U.S., [email protected], http://www.robotics.stanford.edu/jh


Bernd Jähne studied physics in Saarbrücken and Heidelberg. He received his diploma, doctoral degree, and habilitation degree from Heidelberg University in 1977, 1980, and 1985, respectively, and a habilitation degree in applied computer science from the University of Hamburg-Harburg in 1992. Since 1988 he has been a Marine Research Physicist at Scripps Institution of Oceanography, University of California, and, since 1994, he has been professor of physics at the Interdisciplinary Center of Scientific Computing. He leads the research group on image processing. His research interests include computer vision, especially filter design and image sequence analysis, the application of image processing techniques in science and industry, and small-scale air-sea interaction processes.
Prof. Dr. Bernd Jähne, Forschungsgruppe Bildverarbeitung, IWR, Universität Heidelberg, Im Neuenheimer Feld 368, D-69120 Heidelberg, [email protected], http://klimt.iwr.uni-heidelberg.de

Reinhard Klette studied mathematics at Halle University, received his master degree and doctor of natural science degree in mathematics at Jena University, became a docent in computer science, and was a professor of computer vision at Berlin Technical University. Since June 1996 he has been professor of information technology in the Department of Computer Science at the University of Auckland. His research interests include theoretical and applied topics in image processing, pattern recognition, image analysis, and image understanding. He has published books about image processing and shape reconstruction and was chairman of several international conferences and workshops on computer vision. Recently, his research interests have been directed at 3-D biomedical image analysis with digital geometry and computational geometry as major subjects.
Prof. Dr. Reinhard Klette, Centre for Image Technology and Robotics, Computer Science Department, Tamaki Campus, The Auckland University, Private Bag 92019, Auckland, New Zealand, [email protected], http://citr.auckland.ac.nz/rklette

Christoph Klauck received his diploma in computer science and mathematics from the University of Kaiserslautern, Germany, in 1990. From 1990 to 1994 he worked as research scientist at the German Research Center for Artificial Intelligence Inc. (DFKI GmbH) at Kaiserslautern. In 1994 he finished his dissertation in computer science. Since then he has been involved in the IRIS project at the University of Bremen (Artificial Intelligence Group). His primary research interests include graph grammars and rewriting systems in general, knowledge representation, and ontologies.


Prof. Dr. Christoph Klauck, Dep. of Electrical Eng. and Computer Science, University of Hamburg (FH), Berliner Tor 3, D-20099 Hamburg, Germany, [email protected], http://fbi010.informatik.fh-hamburg.de/klauck

Stefan Körkel is a member of the research groups for numerics and optimization of Prof. Bock and Prof. Reinelt at the Interdisciplinary Center for Scientific Computing at the University of Heidelberg, Germany. He studied mathematics in Heidelberg. Currently he is pursuing his PhD in nonlinear and mixed-integer optimization methods. His research interests include filter optimization as well as nonlinear optimum experimental design.
Stefan Körkel, Interdisciplinary Center for Scientific Computing, Im Neuenheimer Feld 368, 69120 Heidelberg, [email protected], http://www.iwr.uni-heidelberg.de/Stefan.Koerkel/

Ryszard Kozera received his M.Sc. degree in pure mathematics in 1985 from Warsaw University, Poland, his PhD degree in computer science in 1991 from Flinders University, Australia, and finally his PhD degree in mathematics in 1992 from Warsaw University, Poland. He is currently employed as a senior lecturer at the University of Western Australia. Between July 1995 and February 1997, Dr. Kozera was at the Technical University of Berlin and at Warsaw University as an Alexander von Humboldt Foundation research fellow. His current research interests include applied mathematics with special emphasis on partial differential equations, computer vision, and numerical analysis.
Dr. Ryszard Kozera, Department of Computer Science, The University of Western Australia, Nedlands, WA 6907, Australia, [email protected], http://www.cs.uwa.edu.au/people/info/ryszard.html

Tony Lindeberg received his M.Sc. degree in engineering physics and applied mathematics from KTH (Royal Institute of Technology), Stockholm, Sweden in 1987, and his PhD degree in computing science in 1991. He is currently an associate professor at the Department of Numerical Analysis and Computing Science at KTH. His main research interests are in computer vision and relate to multiscale representations, focus-of-attention, and shape. He has contributed to the foundations of continuous and discrete scale-space theory, as well as to the application of these theories to computer vision problems. Specifically, he has developed principles for automatic scale selection, methodologies for extracting salient image structures, and theories for multiscale shape estimation. He is author of the book Scale-Space Theory in Computer Vision.


Tony Lindeberg, Department of Numerical Analysis and Computing Science, KTH, S-100 44 Stockholm, Sweden, [email protected], http://www.nada.kth.se/tony

Steffen Lindek studied physics at the RWTH Aachen, Germany, the EPF Lausanne, Switzerland, and the University of Heidelberg, Germany. He did his diploma and PhD theses in the Light Microscopy Group at the European Molecular Biology Laboratory (EMBL), Heidelberg, Germany, developing high-resolution light-microscopy techniques. Since December 1996 he has been a postdoctoral fellow with the BioImage project at EMBL. He currently works on the design and implementation of the image database, and he is responsible for the administration of EMBL's contribution to the project.
Dr. Steffen Lindek, European Molecular Biology Laboratory (EMBL), Postfach 10 22 09, D-69120 Heidelberg, Germany, [email protected]

Hanspeter A. Mallot studied biology and mathematics at the University of Mainz, where he also received his doctoral degree in 1986. He was a postdoctoral fellow at the Massachusetts Institute of Technology in 1986/87 and held research positions at Mainz University and the Ruhr-Universität Bochum. In 1993, he joined the Max-Planck-Institut für biologische Kybernetik in Tübingen. In 1996/97, he was a fellow at the Institute of Advanced Studies in Berlin. His research interests include the perception of shape and space in humans and machines, cognitive maps, as well as neural network models of the cerebral cortex.
Dr. Hanspeter A. Mallot, Max-Planck-Institut für biologische Kybernetik, Spemannstr. 38, 72076 Tübingen, Germany, [email protected], http://www.kyb.tuebingen.mpg.de/bu/

Heinrich Niemann obtained the degree of Dipl.-Ing. in electrical engineering and Dr.-Ing. at Technical University Hannover in 1966 and 1969, respectively. From 1967 to 1972 he was with Fraunhofer Institut für Informationsverarbeitung in Technik und Biologie, Karlsruhe. Since 1975 he has been professor of computer science at the University of Erlangen-Nürnberg and since 1988 he has also served as head of the research group, Knowledge Processing, at the Bavarian Research Institute for Knowledge-Based Systems (FORWISS). His fields of research are speech and image understanding and the application of artificial intelligence techniques in these fields. He is the author or co-author of 6 books and approximately 250 journal and conference contributions.


Prof. Dr.-Ing. H. Niemann, Lehrstuhl für Mustererkennung (Informatik 5), Universität Erlangen-Nürnberg, Martensstraße 3, 91058 Erlangen, Germany, [email protected], http://www5.informatik.uni-erlangen.de

Dietrich Paulus received a bachelor degree in computer science at the University of Western Ontario, London, Canada (1983). He graduated (1987) and received his PhD degree (1991) from the University of Erlangen-Nürnberg, Germany. He is currently a senior researcher (Akademischer Rat) in the field of image pattern recognition and teaches courses in computer vision and applied programming for image processing. Together with J. Hornegger, he has recently written a book on pattern recognition and image processing in C++.
Dr. Dietrich Paulus, Lehrstuhl für Mustererkennung, Universität Erlangen-Nürnberg, Martensstr. 3, 91058 Erlangen, Germany, [email protected], http://www5.informatik.uni-erlangen.de

Christoph Poliwoda is a PhD student at the Lehrstuhl für Informatik V, University of Mannheim, and leader of the development section of Volume Graphics GmbH. His research interests are real-time volume and polygon raytracing, 3-D image processing, 3-D segmentation, computer architectures and parallel computing. Poliwoda received his diploma in physics at the University of Heidelberg, Germany.
Christoph Poliwoda, Lehrstuhl für Informatik V, Universität Mannheim, B6, 26, D-68131 Mannheim, Germany, [email protected]

Nicholas J. Salmon received the master of engineering degree from the Department of Electrical and Electronic Engineering at Bath University, England, in 1990. Then he worked as a software development engineer for Marconi Radar Systems Ltd., England, helping to create a vastly parallel signal-processing machine for radar applications. Since 1992 he has worked as a software engineer in the Light Microscopy Group at the European Molecular Biology Laboratory, Germany, where he is concerned with creating innovative software systems for the control of confocal microscopes, and image processing.
Nicholas J. Salmon, Light Microscopy Group, European Molecular Biology Laboratory (EMBL), Postfach 10 22 09, D-69120 Heidelberg, Germany, [email protected]


Kurt Sätzler studied physics at the University of Heidelberg, where he received his diploma in 1995. Since then he has been working as a PhD student at the Max-Planck-Institute of Medical Research in Heidelberg. His research interests are mainly computational geometry applied to problems in biomedicine, architecture and computer graphics, image processing and tilted-view microscopy.
Kurt Sätzler, IWR, Universität Heidelberg, Im Neuenheimer Feld 368, D-69120 Heidelberg, or Max-Planck-Institute for Medical Research, Department of Cell Physiology, Jahnstr. 29, D-69120 Heidelberg, Germany, [email protected]

Hanno Scharr studied physics at the University of Heidelberg, Germany, and did his diploma thesis on texture analysis at the Interdisciplinary Center for Scientific Computing in Heidelberg. Currently, he is pursuing his PhD on motion estimation. His research interests include filter optimization and motion estimation in discrete time series of n-D images.
Hanno Scharr, Interdisciplinary Center for Scientific Computing, Im Neuenheimer Feld 368, 69120 Heidelberg, Germany, [email protected], http://klimt.iwr.uni-heidelberg.de/hscharr/

Karsten Schlüns studied computer science in Berlin. He received his diploma and doctoral degree from the Technical University of Berlin in 1991 and 1996. From 1991 to 1996 he was a research assistant in the Computer Vision Group, Technical University of Berlin, and from 1997 to 1998 he was a postdoctoral research fellow in computing and information technology, University of Auckland. Since 1998 he has been a scientist in the image processing group at the Institute of Pathology, University Hospital Charité in Berlin. His research interests include pattern recognition and computer vision, especially three-dimensional shape recovery, performance analysis of reconstruction algorithms, and teaching of computer vision.
Dr. Karsten Schlüns, Institute of Pathology, University Hospital Charité, Schumannstr. 20/21, D-10098 Berlin, Germany, [email protected], http://amba.charite.de/ksch


Christoph Schnörr received the master degree in electrical engineering in 1987, the doctoral degree in computer science in 1991, both from the University of Karlsruhe (TH), and the habilitation degree in computer science in 1998 from the University of Hamburg, Germany. From 1987 to 1992, he worked at the Fraunhofer Institute for Information and Data Processing (IITB) in Karlsruhe in the field of image sequence analysis. In 1992 he joined the Cognitive Systems group, Department of Computer Science, University of Hamburg, where he became an assistant professor in 1995. He received an award for his work on image segmentation from the German Association for Pattern Recognition (DAGM) in 1996. Since October 1998, he has been a full professor at the University of Mannheim, Germany, where he heads the Computer Vision, Graphics, and Pattern Recognition Group. His research interests include pattern recognition, machine vision, and related aspects of computer graphics, machine learning, and applied mathematics.
Prof. Dr. Christoph Schnörr, University of Mannheim, Dept. of Math. & Computer Science, D-68131 Mannheim, Germany, [email protected], http://www.ti.uni-mannheim.de

Eero Simoncelli started his higher education with a bachelor's degree in physics from Harvard University, went to Cambridge University on a fellowship to study mathematics for a year and a half, and then returned to the USA to pursue a doctorate in Electrical Engineering and Computer Science at MIT. He received his PhD in 1993, and joined the faculty of the Computer and Information Science Department at the University of Pennsylvania that same year. In September of 1996, he joined the faculty of the Center for Neural Science and the Courant Institute of Mathematical Sciences at New York University. He received an NSF Faculty Early Career Development (CAREER) grant in September 1996, for teaching and research in Visual Information Processing, and a Sloan Research Fellowship in February 1998.
Dr. Eero Simoncelli, 4 Washington Place, RM 809, New York, NY 10003-6603, [email protected], http://www.cns.nyu.edu/eero

Pierre Soille received the engineering degree from the Université catholique de Louvain, Belgium, in 1988. He gained the doctorate degree in 1992 at the same university and in collaboration with the Centre de Morphologie Mathématique of the Ecole des Mines de Paris. He then pursued research on image analysis at the CSIRO Mathematical and Information Sciences Division, Sydney, the Centre de Morphologie Mathématique of the Ecole des Mines de Paris, and the Abteilung Mustererkennung of the Fraunhofer-Institut IPK, Berlin. During the period 1995-1998 he was lecturer and research scientist at the Ecole des Mines d'Alès and EERIE, Nîmes, France. Now he is a senior research scientist at the Silsoe Research Institute, England. He worked on many applied projects, taught tutorials during international conferences, co-organized the second International Symposium on Mathematical Morphology, wrote and edited three books, and contributed to over 50 scientific publications.
Prof. Pierre Soille, Silsoe Research Institute, Wrest Park, Silsoe, Bedfordshire, MK45 4HS, United Kingdom, [email protected], http://www.bbsrc.ac.uk

Hagen Spies graduated in January 1998 from the University of Heidelberg with a master degree in physics. He also received an MS in computing and information technology from the University of Dundee, Scotland in 1995. In 1998/1999 he spent one year as a visiting scientist at the University of Western Ontario, Canada. Currently he works as a researcher at the Interdisciplinary Center for Scientific Computing at the University of Heidelberg. His interests concern the measurement of optical and range flow and their use in scientific applications.
Hagen Spies, Forschungsgruppe Bildverarbeitung, IWR, Universität Heidelberg, Im Neuenheimer Feld 368, D-69120 Heidelberg, Germany, [email protected], http://klimt.iwr.uni-heidelberg.de/hspies

E. H. K. Stelzer studied physics in Frankfurt am Main and in Heidelberg, Germany. During his Diploma thesis at the Max-Planck-Institut für Biophysik he worked on the physical chemistry of phospholipid vesicles, which he characterized by photon correlation spectroscopy. Since 1983 he has worked at the European Molecular Biology Laboratory (EMBL). He has contributed extensively to the development of confocal fluorescence microscopy and its application in life sciences. His group works on the development and application of high-resolution techniques in light microscopy, video microscopy, confocal microscopy, optical tweezers, single particle analysis, and the documentation of relevant parameters with biological data.
Prof. Dr. E. H. K. Stelzer, Light Microscopy Group, European Molecular Biology Laboratory (EMBL), Postfach 10 22 09, D-69120 Heidelberg, Germany, [email protected]

Hamid R. Tizhoosh received the M.S. degree in electrical engineering from University of Technology, Aachen, Germany, in 1995. From 1993 to 1996, he worked at Management of Intelligent Technologies Ltd. (MIT GmbH), Aachen, Germany, in the area of industrial image processing. He is currently a PhD candidate, Dept. of Technical Computer Science of Otto-von-Guericke-University, Magdeburg, Germany. His research encompasses fuzzy logic and computer vision. His recent research efforts include medical and fuzzy image processing. He is currently involved in the European Union project INFOCUS, and is researching enhancement of medical images in radiation therapy.
H. R. Tizhoosh, University of Magdeburg (IPE), P.O. Box 4120, D-39016 Magdeburg, Germany, [email protected], http://pmt05.et.uni-magdeburg.de/hamid/


Thomas Wagner received a diploma degree in physics in 1991 from the University of Erlangen, Germany. In 1995, he finished his PhD in computer science with an applied image processing topic at the Fraunhofer Institute for Integrated Circuits in Erlangen. Since 1992, Dr. Wagner has been working on industrial image processing problems at the Fraunhofer Institute, from 1994 to 1997 as group manager of the intelligent systems group. Projects in his research team belong to the fields of object recognition, surface inspection, and access control. In 1996, he received the Hans-Zehetmair-Habilitationsförderpreis. He is now working on automatic solutions for the design of industrial image processing systems.
Dr.-Ing. Thomas Wagner, Fraunhofer Institut für Integrierte Schaltungen, Am Weichselgarten 3, D-91058 Erlangen, Germany, [email protected], http://www.iis.fhg.de

Joachim Weickert obtained an M.Sc. in industrial mathematics in 1991 and a PhD in mathematics in 1996, both from Kaiserslautern University, Germany. After receiving the PhD degree, he worked as a post-doctoral researcher at the Image Sciences Institute of Utrecht University, The Netherlands. In April 1997 he joined the computer vision group of the Department of Computer Science at Copenhagen University. His current research interests include all aspects of partial differential equations and scale-space theory in image analysis. He was awarded the Wacker Memorial Prize and authored the book Anisotropic Diffusion in Image Processing.
Dr. Joachim Weickert, Department of Computer Science, University of Copenhagen, Universitetsparken 1, DK-2100 Copenhagen, Denmark, [email protected], http://www.diku.dk/users/joachim/

Dieter Willersinn received his diploma in electrical engineering from Technical University Darmstadt in 1988. From 1988 to 1992 he was with Vitronic Image Processing Systems in Wiesbaden, working on industrial applications of robot vision and quality control. He then took a research position at the Technical University in Vienna, Austria, from which he received his PhD degree in 1995. In 1995, he joined the Fraunhofer Institute for Information and Data Processing (IITB) in Karlsruhe, where he initially worked on obstacle detection for driver assistance applications. Since 1997, Dr. Willersinn has been the head of the group, Assessment of Computer Vision Systems, Department for Recognition and Diagnosis Systems.
Dr. Dieter Willersinn, Fraunhofer Institut IITB, Fraunhoferstr. 1, D-76131 Karlsruhe, Germany, [email protected]


1 Introduction

Bernd Jähne
Interdisziplinäres Zentrum für Wissenschaftliches Rechnen (IWR), Universität Heidelberg, Germany

1.1 Signal processing for computer vision
1.2 Pattern recognition for computer vision
1.3 Computational complexity and fast algorithms
1.4 Performance evaluation of algorithms
1.5 References

The second volume of the Handbook on Computer Vision and Applications deals with signal processing and pattern recognition. The signals processed in computer vision originate from the radiance of an object that is collected by an optical system (Volume 1, Chapter 5). The irradiance received by a single photosensor or a 2-D array of photosensors through the optical system is converted into an electrical signal and finally into arrays of digital numbers (Volume 2, Chapter 2). The whole chain of image formation, from the illumination and interaction of radiation with the object of interest up to the arrays of digital numbers stored in the computer, is the topic of Volume 1 of this handbook (subtitled Sensors and Imaging).

This volume deals with the processing of the signals generated by imaging sensors, and this introduction covers four general topics. Section 1.1 discusses in which aspects the processing of higher-dimensional signals differs from the processing of 1-D time series. We also elaborate on the task of signal processing for computer vision. Pattern recognition (Section 1.2) plays a central role in computer vision because it uses the features extracted by low-level signal processing to classify and recognize objects. Given the vast amount of data generated by imaging sensors, the question of the computational complexity and of efficient algorithms is of utmost importance (Section 1.3). Finally, the performance evaluation of computer vision algorithms (Section 1.4) is a subject that has been neglected in the past. Consequently, a vast number of algorithms exist for which the performance characteristics are not sufficiently known. This constitutes a major obstacle for progress of applications using computer vision techniques.
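The digitization step mentioned above, from a continuous irradiance distribution on the sensor to an array of digital numbers, can be pictured with a short simulation. The following is only an illustrative sketch: the linear sensor response, the full-scale value, and the 8-bit quantization are assumptions chosen for the example, not the sensor and quantization models developed in Chapter 2.

import numpy as np

def digitize_irradiance(irradiance, full_scale=1.0, bits=8):
    # Quantize a continuous irradiance distribution (arbitrary units)
    # into an array of digital numbers with 2**bits gray levels.
    levels = 2**bits - 1
    dn = np.clip(irradiance / full_scale, 0.0, 1.0) * levels
    return np.round(dn).astype(np.uint8)

# A smooth test pattern sampled on a hypothetical 256 x 256 photosensor array
y, x = np.mgrid[0:256, 0:256]
irradiance = 0.5 + 0.5 * np.sin(2 * np.pi * x / 64) * np.cos(2 * np.pi * y / 64)
image = digitize_irradiance(irradiance)   # 2-D array of 8-bit digital numbers
print(image.shape, image.dtype, image.min(), image.max())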

1.1 Signal processing for computer vision

One-dimensional linear signal processing and system theory is a standard topic in electrical engineering and is covered by many standard textbooks, for example, [1, 2]. There is a clear trend that the classical signal processing community is moving into multidimensional signals, as indicated, for example, by the new annual international IEEE conference on image processing (ICIP). This can also be seen from some recently published handbooks on this subject. The digital signal processing handbook by Madisetti and Williams [3] includes several chapters that deal with image processing. Likewise the transforms and applications handbook by Poularikas [4] is not restricted to one-dimensional transforms.

There are, however, only a few monographs that treat signal processing specifically for computer vision and image processing. The monograph of Lim [5] deals with 2-D signal and image processing and tries to transfer the classical techniques for the analysis of time series to 2-D spatial data. Granlund and Knutsson [6] were the first to publish a monograph on signal processing for computer vision and elaborate on a number of novel ideas such as tensorial image processing and normalized convolution that did not have their origin in classical signal processing.

Time series are 1-D, signals in computer vision are of higher dimension. They are not restricted to digital images, that is, 2-D spatial signals (Chapter 2). Volumetric sampling, image sequences, and hyperspectral imaging all result in 3-D signals; a combination of any of these techniques results in even higher-dimensional signals.

How much more complex does signal processing become with increasing dimension? First, there is the explosion in the number of data points. Already a medium resolution volumetric image with 512^3 voxels requires 128 MB if one voxel carries just one byte. Storage of even higher-dimensional data at comparable resolution is thus beyond the capabilities of today's computers. Moreover, many applications require the handling of a huge number of images. This is also why appropriate databases including images are of importance. An example is discussed in Chapter 29.

Higher-dimensional signals pose another problem. While we do not have difficulty in grasping 2-D data, it is already significantly more demanding to visualize 3-D data because the human visual system is built only to see surfaces in 3-D but not volumetric 3-D data. The more dimensions are processed, the more important it is that computer graphics and computer vision come closer together. This is why this volume includes a contribution on visualization of volume data (Chapter 28).

The elementary framework for low-level signal processing for computer vision is worked out in part II of this volume. Of central importance are neighborhood operations (Chapter 5). Chapter 6 focuses on the design of filters optimized for a certain purpose. Other subjects of elementary spatial processing include fast algorithms for local averaging (Chapter 7), accurate and fast interpolation (Chapter 8), and image warping (Chapter 9) for subpixel-accurate signal processing.

The basic goal of signal processing in computer vision is the extraction of suitable features for subsequent processing to recognize and classify objects. But what is a suitable feature? This is still less well defined than in other applications of signal processing. Certainly a mathematically well-defined description of local structure as discussed in Chapter 10 is an important basis. The selection of the proper scale for image processing has recently come into the focus of attention (Chapter 11). As signals processed in computer vision come from dynamical 3-D scenes, important features also include motion (Chapters 13 and 14) and various techniques to infer the depth in scenes, including stereo (Chapters 17 and 18), shape from shading and photometric stereo (Chapter 19), and depth from focus (Chapter 20).

There is little doubt that nonlinear techniques are crucial for feature extraction in computer vision. However, compared to linear filter techniques, these techniques are still in their infancy. There is also no single nonlinear technique but there are a host of such techniques often specifically adapted to a certain purpose [7]. In this volume, a rather general class of nonlinear filters by combination of linear convolution and nonlinear point operations (Chapter 10), and nonlinear diffusion filtering (Chapter 15) are discussed.
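To make the data-volume argument concrete, the following minimal Python sketch (not from the handbook; the chosen array shapes and the one-byte-per-sample assumption are purely illustrative) computes the raw memory footprint of a few multidimensional signals:

```python
# Minimal sketch (illustrative shapes, one byte per sample assumed).
shapes = {
    "2-D image (512 x 512)": (512, 512),
    "volumetric image (512^3)": (512, 512, 512),
    "volumetric image sequence (512^3 x 100 frames)": (512, 512, 512, 100),
}
for name, shape in shapes.items():
    samples = 1
    for n in shape:
        samples *= n                                 # number of data points
    print(f"{name}: {samples / 2**20:.2f} MB")       # 512^3 bytes = 128 MB
```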

1.2 Pattern recognition for computer vision

In principle, pattern classification is nothing complex. Take some appropriate features and partition the feature space into classes. Why is it then so difficult for a computer vision system to recognize objects? The basic trouble is related to the fact that the dimensionality of the input space is so large. In principle, it would be possible to use the image itself as the input for a classification task, but no real-world classification technique (be it statistical, neuronal, or fuzzy) would be able to handle such high-dimensional feature spaces. Therefore, the need arises to extract features and to use them for classification.

Unfortunately, techniques for feature selection have widely been neglected in computer vision. They have not been developed to the same degree of sophistication as classification, where it is meanwhile well understood that the different techniques, especially statistical and neural techniques, can be considered under a unified view [8]. Thus part IV of this volume focuses in part on some more advanced feature-extraction techniques. An important role in this aspect is played by morphological operators (Chapter 21) because they manipulate the shape of objects in images. Fuzzy image processing (Chapter 22) contributes a tool to handle vague data and information.

The remainder of part IV focuses on another major area in computer vision. Object recognition can be performed only if it is possible to represent the knowledge in an appropriate way. In simple cases the knowledge can just be represented by simple models. Probabilistic modeling in computer vision is discussed in Chapter 26. In more complex cases this is not sufficient. The graph theoretical concepts presented in Chapter 24 are one of the bases for knowledge-based interpretation of images as presented in Chapter 27.
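The extract-features-then-classify idea sketched above can be made concrete with a minimal example (not from the handbook; the two features, the synthetic patch classes, and the nearest-mean rule are illustrative assumptions only):

```python
# Minimal sketch: map image patches to a low-dimensional feature space and
# classify with a nearest-mean rule. Features and classes are illustrative.
import numpy as np

def features(patch):
    """Reduce a 2-D patch to a 2-D feature vector (mean, standard deviation)."""
    return np.array([patch.mean(), patch.std()])

rng = np.random.default_rng(0)
# Two synthetic classes: dark, smooth patches vs. bright, noisy patches
train_a = [rng.normal(0.2, 0.05, (16, 16)) for _ in range(50)]
train_b = [rng.normal(0.8, 0.20, (16, 16)) for _ in range(50)]

# Class prototypes: mean feature vector of the training patches of each class
proto_a = np.mean([features(p) for p in train_a], axis=0)
proto_b = np.mean([features(p) for p in train_b], axis=0)

def classify(patch):
    f = features(patch)
    # Assign the class whose prototype is closest in feature space
    return "A" if np.linalg.norm(f - proto_a) < np.linalg.norm(f - proto_b) else "B"

print(classify(rng.normal(0.75, 0.2, (16, 16))))   # expected: "B"
```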

1.3 Computational complexity and fast algorithms

The processing of huge amounts of data in computer vision becomes a serious challenge if the number of computations increases more than linearly with the number of data points, M = N^D (D is the dimension of the signal). Already an algorithm that is of order O(M^2) may be prohibitively slow. Thus it is an important goal to achieve O(M) or at least O(M ld M) performance of all pixel-based algorithms in computer vision.

Much effort has been devoted to the design of fast algorithms, that is, performance of a given task with a given computer system in a minimum amount of time. This does not mean merely minimizing the number of computations. Often it is equally or even more important to minimize the number of memory accesses.

Point operations are of linear order and take cM operations. Thus they do not pose a problem. Neighborhood operations are still of linear order in the number of pixels but the constant c may become quite large, especially for signals with high dimensions. This is why there is already a need to develop fast neighborhood operations. Brute-force implementations of global transforms such as the Fourier transform require cM^2 operations and can thus only be used at all if fast algorithms are available. Such algorithms are discussed in Section 3.4. Many other algorithms in computer vision, such as correlation, correspondence analysis, and graph search algorithms, are also of polynomial order, some of them even of exponential order.

A general breakthrough in the performance of more complex algorithms in computer vision was the introduction of multiresolutional data structures that are discussed in Chapters 4 and 14. All chapters about elementary techniques for the processing of spatial data (Chapters 5–10) also deal with efficient algorithms.
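The following minimal sketch (not from the handbook; the unit constant factors are an illustrative assumption) shows how quickly the different complexity classes diverge for typical signal sizes:

```python
# Minimal sketch: operation counts for O(M), O(M ld M), and O(M^2)
# with constant factors set to 1 (illustrative assumption).
import math

for shape in [(512, 512), (512, 512, 512)]:
    M = math.prod(shape)             # number of data points, M = N^D
    linear = M
    nlogn = M * math.log2(M)         # "ld" denotes the logarithm to base 2
    quadratic = M ** 2
    print(f"{shape}: M = {linear:.2e}, M ld M = {nlogn:.2e}, M^2 = {quadratic:.2e}")
```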

1.4 Performance evaluation of algorithms

A systematic evaluation of the algorithms for computer vision has been widely neglected. For a newcomer to computer vision with an engineering background or a general education in natural sciences this is a strange experience. It appears to him as if one would present results of measurements without giving error bars or even thinking about possible statistical and systematic errors.

What is the cause of this situation? On the one hand, it is certainly true that some problems in computer vision are very hard and that it is even harder to perform a sophisticated error analysis. On the other hand, the computer vision community has to a large extent ignored the fact that any algorithm is only as good as its objective and solid evaluation and verification. Fortunately, this misconception has been recognized in the meantime and there are serious efforts underway to establish generally accepted rules for the performance analysis of computer vision algorithms. We give here just a brief summary and refer for details to Haralick et al. [9] and for a practical example to Volume 3, Chapter 7.

The three major criteria for the performance of computer vision algorithms are:

Successful solution of task. Any practitioner gives this a top priority. But also the designer of an algorithm should define precisely for which task it is suitable and what the limits are.

Accuracy. This includes an analysis of the statistical and systematic errors under carefully defined conditions (such as a given signal-to-noise ratio (SNR), etc.).

Speed. Again this is an important criterion for the applicability of an algorithm.

There are different ways to evaluate algorithms according to the aforementioned criteria. Ideally this should include three classes of studies:

Analytical studies. This is the mathematically most rigorous way to verify algorithms, check error propagation, and predict catastrophic failures.

Performance tests with computer-generated images. These tests are useful as they can be carried out under carefully controlled conditions.

Performance tests with real-world images. This is the final test for practical applications.

Much of the material presented in this volume is written in the spirit of a careful and mathematically well-founded analysis of the methods that are described, although the performance evaluation techniques are certainly more advanced in some areas than in others.
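As a concrete illustration of the second class of studies, the following minimal sketch (not from the handbook; the test signal, the simple box-average "algorithm", and the error measures are illustrative assumptions) runs a performance test with computer-generated data under a controlled noise level:

```python
# Minimal sketch: evaluate an algorithm on computer-generated data with
# known ground truth and a controlled noise level (illustrative setup).
import numpy as np

rng = np.random.default_rng(1)
truth = np.sin(np.linspace(0, 4 * np.pi, 1024))       # known ground truth
sigma = 0.1                                           # controlled noise level
noisy = truth + rng.normal(0.0, sigma, truth.shape)

estimate = np.convolve(noisy, np.ones(5) / 5, mode="same")  # algorithm under test

bias = np.mean(estimate - truth)                   # systematic error
rmse = np.sqrt(np.mean((estimate - truth) ** 2))   # overall error
print(f"bias = {bias:.4f}, rmse = {rmse:.4f}")
```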

1.5 References

[1] Oppenheim, A. V. and Schafer, R. W., (1989). Discrete-time Signal Processing. Prentice-Hall Signal Processing Series. Englewood Cliffs, NJ: Prentice-Hall.
[2] Proakis, J. G. and Manolakis, D. G., (1992). Digital Signal Processing. Principles, Algorithms, and Applications. New York: Macmillan.
[3] Madisetti, V. K. and Williams, D. B. (eds.), (1997). The Digital Signal Processing Handbook. Boca Raton, FL: CRC Press.
[4] Poularikas, A. D. (ed.), (1996). The Transforms and Applications Handbook. Boca Raton, FL: CRC Press.
[5] Lim, J. S., (1990). Two-dimensional Signal and Image Processing. Englewood Cliffs, NJ: Prentice-Hall.
[6] Granlund, G. H. and Knutsson, H., (1995). Signal Processing for Computer Vision. Norwell, MA: Kluwer Academic Publishers.
[7] Pitas, I. and Venetsanopoulos, A. N., (1990). Nonlinear Digital Filters. Principles and Applications. Norwell, MA: Kluwer Academic Publishers.
[8] Schürmann, J., (1996). Pattern Classification, a Unified View of Statistical and Neural Approaches. New York: John Wiley & Sons.
[9] Haralick, R. M., Klette, R., Stiehl, H.-S., and Viergever, M. (eds.), (1999). Evaluation and Validation of Computer Vision Algorithms. Boston: Kluwer.

Part I

Signal Representation

2 Continuous and Digital Signals

Bernd Jähne
Interdisziplinäres Zentrum für Wissenschaftliches Rechnen (IWR), Universität Heidelberg, Germany

2.1 Introduction
2.2 Continuous signals
    2.2.1 Types of signals
    2.2.2 Unified description
    2.2.3 Multichannel signals
2.3 Discrete signals
    2.3.1 Regular two-dimensional lattices
    2.3.2 Regular higher-dimensional lattices
    2.3.3 Irregular lattices
    2.3.4 Metric in digital images
    2.3.5 Neighborhood relations
    2.3.6 Errors in object position and geometry
2.4 Relation between continuous and discrete signals
    2.4.1 Image formation
    2.4.2 Sampling theorem
    2.4.3 Aliasing
    2.4.4 Reconstruction from samples
2.5 Quantization
    2.5.1 Equidistant quantization
    2.5.2 Unsigned or signed representation
    2.5.3 Nonequidistant quantization
2.6 References


2.1 Introduction

Images are signals with two spatial dimensions. This chapter deals with signals of arbitrary dimensions. This generalization is very useful because computer vision is not restricted solely to 2-D signals. On the one hand, higher-dimensional signals are encountered. Dynamic scenes require the analysis of image sequences; the exploration of 3-D space requires the acquisition of volumetric images. Scientific exploration of complex phenomena is significantly enhanced if images not only of a single parameter but of many parameters are acquired. On the other hand, signals of lower dimensionality are also of importance when a computer vision system is integrated into a larger system and image data are fused with time series from point-measuring sensors.

Thus this chapter deals with continuous (Section 2.2) and discrete (Section 2.3) representations of signals with arbitrary dimensions. While the continuous representation is very useful for a solid mathematical foundation of signal processing, real-world sensors deliver and digital computers handle only discrete data. Given the two representations, the relation between them is of major importance. Section 2.4 discusses the spatial and temporal sampling of signals while Section 2.5 treats quantization, the conversion of a continuous signal into digital numbers.

2.2 Continuous signals

2.2.1 Types of signals

An important characteristic of a signal is its dimension. A zero-dimensional signal results from the measurement of a single quantity at a single point in space and time. Such a single value can also be averaged over a certain time period and area. There are several ways to extend a zero-dimensional signal into a 1-D signal (Table 2.1). A time series records the temporal course of a signal in time, while a profile does the same in a spatial direction or along a certain path.

A 1-D signal is also obtained if certain experimental parameters of the measurement are continuously changed and the measured parameter is recorded as a function of some control parameters. With respect to optics, the most obvious parameter is the wavelength of the electromagnetic radiation received by a radiation detector. When radiation is recorded as a function of the wavelength, a spectrum is obtained. The wavelength is only one of the many parameters that could be considered. Others could be temperature, pressure, humidity, concentration of a chemical species, and any other properties that may influence the measured quantity.


Table 2.1: Some types of signals g depending on D parameters

D   Type of signal                                      Function
0   Measurement at a single point in space and time     g
1   Time series                                         g(t)
1   Profile                                             g(x)
1   Spectrum                                            g(λ)
2   Image                                               g(x, y)
2   Time series of profiles                             g(x, t)
2   Time series of spectra                              g(λ, t)
3   Volumetric image                                    g(x, y, z)
3   Image sequence                                      g(x, y, t)
3   Hyperspectral image                                 g(x, y, λ)
4   Volumetric image sequence                           g(x, y, z, t)
4   Hyperspectral image sequence                        g(x, y, λ, t)
5   Hyperspectral volumetric image sequence             g(x, y, z, λ, t)

With this general approach to multidimensional signal processing, it is obvious that an image is only one of the many possibilities of a 2-D signal. Other 2-D signals are, for example, time series of profiles or spectra. With increasing dimension, more types of signals are possible, as summarized in Table 2.1. A 5-D signal is constituted by a hyperspectral volumetric image sequence.
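In practice, such multidimensional signals are simply stored as arrays with one axis per parameter; the following minimal sketch (not from the handbook; the concrete shape is an illustrative assumption) shows a 4-D signal g(x, y, z, t):

```python
# Minimal sketch: a volumetric image sequence stored as a 4-D array
# (axis order and shape are illustrative assumptions).
import numpy as np

g = np.zeros((64, 64, 32, 10), dtype=np.float32)  # x, y, z, t

print(g.ndim)           # D = 4 parameters
print(g.shape)          # samples along x, y, z, t
print(g[..., 0].shape)  # one volumetric image of the sequence
```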

2.2.2 Unified description

Mathematically, all these different types of multidimensional signals can be described in a unified way as continuous scalar functions of multiple parameters or generalized coordinates q_d as

g(q) = g(q_1, q_2, \ldots, q_D)   with   q = [q_1, q_2, \ldots, q_D]^T                    (2.1)

that can be summarized in a D-dimensional parameter vector or generalized coordinate vector q. An element of the vector can be a spatial direction, the time, or any other parameter.

As the signal g represents physical quantities, we can generally assume some properties that make the mathematical handling of the signals much easier.

Continuity. Real signals do not show any abrupt changes or discontinuities. Mathematically this means that signals can generally be regarded as arbitrarily often differentiable.

Finite range. The physical nature of both the signal and the imaging sensor ensures that a signal is limited to a finite range. Some signals are restricted to positive values.

Finite energy. Normally a signal corresponds to the amplitude or the energy of a physical process (see also Volume 1, Chapter 2). As the energy of any physical system is limited, any signal must be square integrable:

\int_{-\infty}^{\infty} |g(q)|^2 \, \mathrm{d}^D q < \infty                    (2.2)

With these general properties of physical signals, it is obvious that the continuous representation provides a powerful mathematical approach. The properties imply, for example, that the Fourier transform (Section 3.2) of the signals exists.

Depending on the underlying physical process the observed signal can be regarded as a stochastic signal. More often, however, a signal is a mixture of a deterministic and a stochastic signal. In the simplest case, the measured signal of a deterministic process g_d is corrupted by additive zero-mean homogeneous noise. This leads to the simple signal model

g(q) = g_d(q) + n                    (2.3)

where n has the variance \sigma_n^2 = \langle n^2 \rangle. In most practical situations, the noise is not homogeneous but rather depends on the level of the signal. Thus in a more general way

g(q) = g_d(q) + n(g)   with   \langle n(g) \rangle = 0,  \langle n^2(g) \rangle = \sigma_n^2(g)                    (2.4)
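The two signal models can be simulated directly; the following minimal sketch (not from the handbook; the deterministic test signal and the noise law sigma_n(g) = sqrt(g), which mimics photon shot noise, are illustrative assumptions) generates one realization of each:

```python
# Minimal sketch of Eqs. (2.3) and (2.4): a deterministic signal corrupted by
# homogeneous and by signal-dependent zero-mean noise (illustrative setup).
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 256)
g_d = 100.0 * (1.0 + np.sin(2 * np.pi * 3 * x))    # deterministic signal g_d(q)

# Eq. (2.3): additive zero-mean homogeneous noise with variance sigma_n^2
sigma_n = 2.0
g_hom = g_d + rng.normal(0.0, sigma_n, g_d.shape)

# Eq. (2.4): zero-mean noise whose standard deviation depends on the signal level
g_inhom = g_d + rng.normal(0.0, np.sqrt(np.maximum(g_d, 0.0)))

print(np.std(g_hom - g_d))     # close to sigma_n = 2.0
print(np.std(g_inhom - g_d))   # close to sqrt(mean(g_d)) = 10.0
```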

A detailed treatment of noise in various types of imaging sensors can be found in Volume 1, Sections 7.5, 9.3.1, and 10.2.3.

2.2.3 Multichannel signals

So far, only scalar signals have been considered. If more than one signal is taken simultaneously, a multichannel signal is obtained. In some cases, for example, taking time series at different spatial positions, the multichannel signal can be considered as just a sampled version of a higher-dimensional signal. In other cases, the individual signals cannot be regarded as samples. This is the case when they are parameters with different units and/or meaning. A multichannel signal provides a vector at each point and is therefore sometimes denoted as a vectorial signal and written as

g(q) = [g_1(q), g_2(q), \ldots, g_D(q)]^T                    (2.5)


Figure 2.1: Representation of 2-D digital images by meshes of regular polygons: a triangles; b squares; c hexagons.

Table 2.2: Properties of tessellations of the 2-D space with regular triangular, square, and hexagonal meshes; N_e: number of neighbors with common edge; N_c: number of neighbors with common edge and/or corner; l: basis length of the regular polygon; d: distance to the nearest neighbor; A: area of the cell

        Triangular                              Square          Hexagonal
N_e     3                                       4               6
N_c     12                                      8               6
l       \sqrt{3} d = (16/3)^{1/4} \sqrt{A}      d = \sqrt{A}    d/\sqrt{3} = (4/27)^{1/4} \sqrt{A}
d       l/\sqrt{3} = (16/27)^{1/4} \sqrt{A}     l = \sqrt{A}    \sqrt{3} l = (4/3)^{1/4} \sqrt{A}
A       (\sqrt{3}/4) l^2 = (3\sqrt{3}/4) d^2    l^2 = d^2       (3\sqrt{3}/2) l^2 = (\sqrt{3}/2) d^2

A multichannel signal is not necessarily a vectorial signal. Depending on the mathematical relation between its components, it could also be a higher-order signal, for example, a tensorial signal. Such types of multichannel images are encountered when complex features are extracted from images. One example is the tensorial description of local structure discussed in Chapter 10.

2.3 Discrete signals

2.3.1 Regular two-dimensional lattices

Computers cannot handle continuous signals but only arrays of digital numbers. Thus it is required to represent signals as D-dimensional arrays of points. We first consider images as 2-D arrays of points. A point on the 2-D grid is called a pixel or pel. Both words are abbreviations of picture element. A pixel represents the irradiance at the corresponding grid position. There are two ways to derive 2-D lattices from continuous signals.


Figure 2.2: Elementary cells of regular grids for 2-D digital images: a triangle grid, b square grid, c hexagonal grid.

First, the continuous 2-D space can be partitioned into space-filling cells. For symmetry reasons, only regular polygons are considered. Then there are only three possible tessellations with regular polygons: triangles, squares, and hexagons, as illustrated in Fig. 2.1 (see also Table 2.2). All other regular polygons do not lead to a space-filling geometrical arrangement. There are either overlaps or gaps. From the mesh of regular polygons a 2-D array of points is then formed by the symmetry centers of the polygons. In case of the square mesh, these points lie again on a square grid. For the hexagonal mesh, the symmetry centers of the hexagons form a triangular grid. In contrast, the symmetry centers of the triangular grid form a more complex pattern, where two triangular meshes are interleaved. The second mesh is offset by a third of the base length l of the triangular mesh.

A second approach to regular lattices starts with a primitive cell. A primitive cell in 2-D is spanned by two not necessarily orthogonal base vectors b_1 and b_2. Thus, the primitive cell is always a parallelogram except for square and rectangular lattices (Fig. 2.2). Only in the latter case are the base vectors b_1 and b_2 orthogonal. Translating the primitive cell by multiples of the base vectors of the primitive cell then forms the lattice. Such a translation vector or lattice vector r is therefore given by

r = n_1 b_1 + n_2 b_2,   n_1, n_2 \in \mathbb{Z}                    (2.6)

The primitive cells of the square and hexagonal lattices (Fig. 2.2b and c) contain only one grid point, located at the origin of the primitive cell. This is not possible for a triangular grid, as the lattice points are not arranged in regular distances along two directions (Fig. 2.1a). Thus, the construction of the triangular lattice requires a primitive cell with two grid points. One grid point is located at the origin of the cell, the other is offset by a third of the length of each base vector (Fig. 2.2a).

The construction scheme to generate the elementary cells of regular shape from the lattice points is illustrated in Fig. 2.3. From one lattice point straight lines are drawn to all other lattice points starting with the nearest neighbors (dashed lines). Then the smallest cell formed by the lines perpendicular to these lines and dividing them into two halves results in the primitive cell. For all three lattices, only the nearest neighbors must be considered for this construction scheme.

Figure 2.3: Construction of the cells of a regular lattice from the lattice points: a triangle lattice; b square lattice; and c hexagonal lattice.

The mathematics behind the formation of regular lattices in two dimensions is the 2-D analog to 3-D lattices used to describe crystals in solid state physics and mineralogy. The primitive cell constructed from the lattice points is, for example, known in solid state physics as the Wigner-Seitz cell. Although there is a choice of three lattices with regular polygons (and many more if irregular polygons are considered), almost exclusively square or rectangular lattices are used for 2-D digital images.

Figure 2.4: Representation of digital images by orthogonal lattices: a square lattice for a 2-D image; and b cubic lattice for a volumetric or 3-D image.
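The construction of a lattice from base vectors, Eq. (2.6), is easy to reproduce in code; the following minimal sketch (not from the handbook; the concrete base vectors are illustrative choices) generates the lattice points of a square grid and of the grid whose elementary cells are hexagons:

```python
# Minimal sketch of Eq. (2.6): lattice points r = n1*b1 + n2*b2
# (base vectors chosen for illustration).
import numpy as np

def lattice(b1, b2, n_max=3):
    """Return all lattice vectors n1*b1 + n2*b2 with |n1|, |n2| <= n_max."""
    pts = [n1 * np.asarray(b1) + n2 * np.asarray(b2)
           for n1 in range(-n_max, n_max + 1)
           for n2 in range(-n_max, n_max + 1)]
    return np.array(pts)

square = lattice([1.0, 0.0], [0.0, 1.0])
# Two base vectors of equal length enclosing 60 degrees generate the point
# set whose elementary (Wigner-Seitz) cell is a hexagon.
hexagonal = lattice([1.0, 0.0], [0.5, np.sqrt(3.0) / 2.0])

print(square.shape, hexagonal.shape)   # (49, 2) each
```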


The position of the pixel is given in the common notation for matrices. The first index, m, denotes the position of the row, the second, n, the position of the column (Fig. 2.4a). M gives the number of rows, N the number of columns. In accordance with the matrix notation, the vertical axis (y axis) runs from top to bottom and not vice versa as is common in graphs. The horizontal axis (x axis) runs as usual from left to right.

2.3.2 Regular higher-dimensional lattices

The considerations in the previous section can be extended to higher dimensions. In 3-D space, lattices are identical to those used in solid-state physics to describe crystalline solids. In higher dimensions, we have serious difficulty in grasping the structure of discrete lattices because we can visualize only projections onto 2-D space. Given the fact that already 2-D discrete images are almost exclusively represented by rectangular lattices (Section 2.3.1), we may ask what we lose if we consider only hypercubic lattices in higher dimensions. Surprisingly, it turns out that this lattice has such significant advantages that it is hardly necessary to consider any other lattice.

Orthogonal lattice. The base vectors of the hypercubic primitive cell are orthogonal to each other. As discussed in Chapter 6, this is a significant advantage for the design of filters. If separable filters are used, they can easily be extended to arbitrary dimensions (see the sketch at the end of this section).

Valid for all dimensions. The hypercubic lattice is the most general solution for digital data as it is the only geometry that exists in arbitrary dimensions. In practice this means that it is generally quite easy to extend image processing algorithms to higher dimensions. We will see this, for example, with the discrete Fourier transform in Section 3.3, with multigrid data structures in Chapter 4, with averaging in Chapter 7, and with the analysis of local structure in Chapter 10.

Only lattice with regular polyhedron. While in 2-D, three lattices with regular polyhedrons exist (Section 2.3.1), the cubic lattice is the only lattice with a regular polyhedron (the hexahedron) in 3-D. None of the other four regular polyhedra (tetrahedron, octahedron, dodecahedron, and icosahedron) is space filling.

These significant advantages of the hypercubic lattice are not outweighed by the single disadvantage that the neighborhood relations, discussed in Section 2.3.5, are more complex on these lattices than, for example, on the 2-D hexagonal lattice.

In 3-D or volumetric images the elementary cell is known as a voxel, an abbreviation of volume element. On a rectangular grid, each voxel represents the mean gray value of a cuboid. The position of a voxel is given by three indices. The first, l, denotes the depth, m the row, and n the column (Fig. 2.4b). In higher dimensions, the elementary cell is denoted as a hyperpixel.
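As noted in the first item above, separability makes dimension-independent filtering straightforward on an orthogonal lattice; the following minimal sketch (not from the handbook; the binomial kernel, the array shapes, and the use of scipy.ndimage are illustrative assumptions) applies the same 1-D kernel along every axis of a 2-D and a 3-D signal:

```python
# Minimal sketch: a separable smoothing filter applied along every axis of an
# N-dimensional signal on an orthogonal lattice (illustrative kernel/shapes).
import numpy as np
from scipy.ndimage import convolve1d

def separable_smooth(signal, kernel):
    """Apply a 1-D kernel along every axis of an N-dimensional signal."""
    out = np.asarray(signal, dtype=float)
    for axis in range(out.ndim):
        out = convolve1d(out, kernel, axis=axis, mode="reflect")
    return out

kernel = np.array([1.0, 2.0, 1.0]) / 4.0                  # simple binomial mask
image = np.random.default_rng(3).random((64, 64))         # 2-D signal
volume = np.random.default_rng(4).random((32, 32, 32))    # 3-D signal

print(separable_smooth(image, kernel).shape)
print(separable_smooth(volume, kernel).shape)
```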

2.3.3 Irregular lattices

Irregular lattices are attractive because they can be adapted to the contents of images. Small cells are only required where the image contains fine details and can be much larger in other regions. In this way, a compact representation of an image seems to be feasible. It is also not difficult to generate an irregular lattice. The general principle for the construction of a mesh from an array of points (Section 2.3.1) can easily be extended to irregularly spaced points. It is known as Delaunay triangulation and results in the dual Voronoi and Delaunay graphs (Chapters 24 and 25).

Processing of image data, however, becomes much more difficult on irregular grids. Some types of operations, such as all classical filter operations, do not even make much sense on irregular grids. In contrast, it poses no difficulty to apply morphological operations to irregular lattices (Chapter 21). Because of the difficulty in processing digital images on irregular lattices, these data structures are hardly ever used to represent raw images. In order to adapt low-level image processing operations to different scales and to provide an efficient storage scheme for raw data, multigrid data structures, for example, pyramids, have proved to be much more effective (Chapter 4). In contrast, irregular lattices play an important role in generating and representing segmented images (Chapter 25).
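The mesh construction for irregularly spaced points can be carried out with standard tools; the following minimal sketch (not from the handbook; the random point set and the use of scipy.spatial are illustrative assumptions) computes a Delaunay triangulation and its dual Voronoi diagram:

```python
# Minimal sketch: Delaunay triangulation and dual Voronoi diagram of an
# irregular point set (illustrative data, scipy.spatial assumed available).
import numpy as np
from scipy.spatial import Delaunay, Voronoi

points = np.random.default_rng(5).random((50, 2))   # irregularly spaced points

tri = Delaunay(points)
vor = Voronoi(points)

print(tri.simplices.shape)   # triangles of the Delaunay graph (point indices)
print(len(vor.regions))      # cells of the dual Voronoi diagram
```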

2.3.4 Metric in digital images

Based on the discussion in the previous two sections, we will focus in the following on hypercubic or orthogonal lattices and discuss in this section the metric of discrete images. This constitutes the base for all length, size, volume, and distance measurements in digital images. It is useful to generalize the lattice vector introduced in Eq. (2.6) that represents all points of a D-dimensional digital image and can be written as

r_n = [n_1 \Delta x_1, n_2 \Delta x_2, \ldots, n_D \Delta x_D]^T                    (2.7)

In the preceding equation, the lattice constants \Delta x_d need not be equal in all directions. For the special cases of 2-D images, 3-D volumetric images, and 4-D spatiotemporal images the lattice vectors are

r_{m,n} = [n \Delta x, m \Delta y]^T,   r_{l,m,n} = [n \Delta x, m \Delta y, l \Delta z]^T,   r_{k,l,m,n} = [n \Delta x, m \Delta y, l \Delta z, k \Delta t]^T                    (2.8)

To measure distances, the Euclidean distance can be computed on an orthogonal lattice by

d_e(x, x') = \| x - x' \| = \left[ \sum_{d=1}^{D} (n_d - n'_d)^2 \, \Delta x_d^2 \right]^{1/2}                    (2.9)

On a square lattice, that is, a lattice with the same grid constant in all directions, the Euclidean distance can be computed more efficiently by

d_e(x, x') = \| x - x' \| = \left[ \sum_{d=1}^{D} (n_d - n'_d)^2 \right]^{1/2} \Delta x                    (2.10)

The Euclidean distance on discrete lattices is somewhat awkward. Although it is a discrete quantity, its values are not integers. Moreover, it cannot be computed very efficiently. Therefore, two other metrics are sometimes considered in image processing. The city block distance

d_b(x, x') = \sum_{d=1}^{D} | n_d - n'_d |                    (2.11)

simply adds up the magnitude of the component differences of two lattice vectors and not the squares as with the Euclidean distance in Eq. (2.10). Geometrically, the city block distance gives the length of a path between the two lattice vectors if we can only walk in directions parallel to the axes. The chessboard distance is defined as the maximum of the absolute differences between two components of the corresponding lattice vectors:

d_c(x, x') = \max_{d=1,\ldots,D} | n_d - n'_d |                    (2.12)

These two metrics have gained some importance for morphological operations (Section 21.2.5). Despite their simplicity they are not of much use as soon as lengths and distances are to be measured. The Euclidean distance is the only metric on digital images that preserves the isotropy of the continuous space. With the city block and chessboard distance, distances in the direction of the diagonals are longer and shorter than the Euclidean distance, respectively.
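The three metrics are one-liners on integer lattice indices; the following minimal sketch (not from the handbook; a unit grid constant is assumed) implements Eqs. (2.10) to (2.12) and reproduces the behavior along a diagonal noted above:

```python
# Minimal sketch of Eqs. (2.10)-(2.12) for points given by their integer
# lattice indices (unit grid constant assumed).
import numpy as np

def euclidean(n, n2):
    return np.sqrt(np.sum((np.asarray(n) - np.asarray(n2)) ** 2))

def city_block(n, n2):
    return np.sum(np.abs(np.asarray(n) - np.asarray(n2)))

def chessboard(n, n2):
    return np.max(np.abs(np.asarray(n) - np.asarray(n2)))

a, b, diag = (0, 0), (3, 4), (3, 3)
print(euclidean(a, b), city_block(a, b), chessboard(a, b))          # 5.0 7 4
# Along the diagonal the city block distance is longer and the chessboard
# distance shorter than the Euclidean distance.
print(euclidean(a, diag), city_block(a, diag), chessboard(a, diag)) # ~4.24 6 3
```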


Figure 2.5: Classification of the cells according to the distance from a given cell for the a triangular, b square, and c hexagonal lattices. The central cell is shaded in light gray, the nearest neighbors in darker gray. The numbers give the ranking in distance from the central cell.

2.3.5 Neighborhood relations

The term neighborhood has no meaning for a continuous signal. How far two points are from each other is simply measured by an adequate metric such as the Euclidean distance function, and this distance can take any value. With the cells of a discrete signal, however, a ranking of the distance between cells is possible. The set of cells with the smallest distance to a given cell are called the nearest neighbors. The triangular, square, and hexagonal lattices have three, four, and six nearest neighbors, respectively (Fig. 2.5). The figure indicates also the ranking in distance from the central cell.

Directly related to the question of neighbors is the term adjacency. A digital object is defined as a connected region. This means that we can reach any cell in the region from any other by walking from one neighboring cell to the next. Such a walk is called a path.

On a square lattice there are two possible ways to define neighboring cells (Fig. 2.5b). We can regard pixels as neighbors either when they have a joint edge or when they have at least one joint corner. Thus a pixel has four or eight neighbors and we speak of a 4-neighborhood or an 8-neighborhood. The definition of the 8-neighborhood is somewhat awkward, as there are neighboring cells with different distances.

The triangular lattice shows an equivalent ambivalence with the 3- and 12-neighborhoods with cells that have either only a joint edge or at least a joint corner with the central cell (Fig. 2.5a). In the 12-neighborhood there are three different types of neighboring cells, each with a different distance (Fig. 2.5a). Only the hexagonal lattice gives a unique definition of neighbors. Each cell has six neighboring cells at the same distance joini