SWE 423: Multimedia Systems Chapter 4: Graphics and Images (4)

Jan 17, 2016


chinara

Transcript
Page 1: SWE 423: Multimedia Systems

SWE 423: Multimedia Systems

Chapter 4: Graphics and Images (4)

Page 2: SWE 423: Multimedia Systems

Image Segmentation

• Assigning a unique number to “object” pixels based on different intensities or colors in the foreground and the background regions of an image
– Can be used in the object recognition process, but it is not object recognition on its own
• Segmentation Methods
– Pixel oriented methods
– Edge oriented methods
– Region oriented methods
– ....

Page 3: SWE 423: Multimedia Systems

Pixel-Oriented Segmentation

• Gray-values of pixels are studied in isolation
• Looks at the gray-level histogram of an image and finds one or more thresholds in the histogram
– Ideally, the histogram has a region without pixels, which is set as the threshold, and hence the image is divided into a foreground and a background based on that (Bimodal Distribution)
• A major drawback of this approach is that object and background histograms often overlap.
– Bimodal distribution rarely occurs in nature.
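The ideal case described above, a histogram with an empty region between two modes, can be sketched in a few lines. This is only an illustration under stated assumptions: the image is a small list of rows of integer gray values, and the helper names `histogram`, `valley_threshold`, and `segment` are hypothetical, not part of the course material.

```python
def histogram(image, levels=256):
    """Count how many pixels take each gray value."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def valley_threshold(counts):
    """Return the middle of the widest run of empty bins between occupied
    gray values, or None when the histogram has no empty region."""
    occupied = [g for g, c in enumerate(counts) if c > 0]
    best_start, best_len = None, 0
    for low, high in zip(occupied, occupied[1:]):
        gap = high - low - 1
        if gap > best_len:
            best_start, best_len = low + 1, gap
    if best_len == 0:
        return None
    return best_start + best_len // 2


def segment(image, threshold):
    """Label pixels above the threshold 1 (foreground) and the rest 0."""
    return [[1 if v > threshold else 0 for v in row] for row in image]


# Toy image (assumed data): dark background, bright object on the right.
image = [
    [10, 12, 11, 200],
    [11, 10, 210, 205],
    [12, 11, 10, 202],
]
t = valley_threshold(histogram(image))
labels = segment(image, t)
```

When the object and background histograms overlap, `valley_threshold` returns `None`, which is exactly the drawback noted on this slide.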

Page 4: SWE 423: Multimedia Systems

Edge-Oriented Segmentation

• Segmentation is carried out as follows
– Edges of an image are extracted (e.g., using the Canny operator)
– Edges are connected to form closed contours around the objects.
• Hough Transform
– Usually very expensive
– Works well with regular curves (application in manufactured parts)
– May work in presence of noise

Page 5: SWE 423: Multimedia Systems

Region-Oriented Segmentation

• A major disadvantage of the previous approaches is the lack of “spatial” relationship considerations of pixels.
– Neighboring pixels normally have similar properties
• The segmentation (region-growing) is carried out as follows
– Start with a “seed” pixel.
– Pixel’s neighbors are included if they have some similarity to the seed pixel, otherwise they are not.
    • Homogeneity condition
    • Uses an eight-neighborhood (8-nbd) model
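The region-growing steps above can be sketched as a breadth-first search over the eight neighbors of each accepted pixel. The similarity test used here (absolute gray-level difference to the seed within a `tolerance`) is an assumption chosen for illustration; the slide only says that neighbors must have "some similarity" to the seed.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Grow a region from `seed` (row, col) using an 8-neighborhood.

    A neighbor joins the region when its gray value differs from the
    seed's gray value by at most `tolerance` (assumed homogeneity test).
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0):
                    continue  # skip the pixel itself
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue  # outside the image
                if (nr, nc) in region:
                    continue  # already accepted
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region


# Toy image (assumed data): a dark diagonal on a bright background.
image = [
    [10, 90, 90],
    [90, 11, 90],
    [90, 90, 12],
]
region = grow_region(image, seed=(0, 0), tolerance=5)
```

In this toy image the dark pixels touch only diagonally, so the region is recovered only because diagonal neighbors are visited; a four-neighborhood would stop at the seed.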

Page 6: SWE 423: Multimedia Systems

Region-Oriented Segmentation

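• Homogeneity criterion: Gray-level mean value of a region is usually used

$$m_k = \frac{1}{n^2} \sum_{i=1}^{N} \sum_{j=1}^{N} P(i,j)$$

• With standard deviation

$$\sigma_k = \sqrt{\frac{1}{n^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( P(i,j) - m_k \right)^2}$$

• Drawbacks: Computationally expensive.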
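As a rough illustration of the homogeneity criterion, the sketch below computes the gray-level mean and standard deviation of a region and uses them to decide whether a candidate pixel fits. It assumes the region is a square block of n × n pixels, so both sums run over n² values; the acceptance rule (within `k` standard deviations of the mean) and the names `region_stats` and `is_homogeneous` are hypothetical.

```python
import math


def region_stats(block):
    """Return (mean, standard deviation) of an n x n block of gray values."""
    n = len(block)
    mean = sum(sum(row) for row in block) / (n * n)
    variance = sum((p - mean) ** 2 for row in block for p in row) / (n * n)
    return mean, math.sqrt(variance)


def is_homogeneous(block, candidate, k=2.0):
    """Assumed rule: accept a pixel within k standard deviations of the mean."""
    mean, sigma = region_stats(block)
    return abs(candidate - mean) <= k * sigma


block = [
    [10, 12],
    [12, 14],
]
mean, sigma = region_stats(block)
```

Recomputing both statistics every time the region grows is one reason the approach is computationally expensive.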

Page 7: SWE 423: Multimedia Systems

Water Inflow Segmentation

• Fill a gray-level image gradually with water.
– Gray-levels of pixels are taken as height.
– The higher the water rises, the more pixels are flooded
    • Hence, you have lands and waters
    • Lands correspond to “objects”
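A minimal sketch of the water-inflow idea follows: gray levels are treated as heights, every pixel at or below the current water level is flooded, and the remaining dry connected areas are the "lands". The use of a four-neighborhood to connect land pixels and the names `flood` and `count_lands` are assumptions made for this example.

```python
def flood(image, water_level):
    """True where a pixel (height = gray level) is at or below the water."""
    return [[v <= water_level for v in row] for row in image]


def count_lands(image, water_level):
    """Count connected dry regions (4-neighborhood) at a given water level."""
    wet = flood(image, water_level)
    rows, cols = len(image), len(image[0])
    seen = set()
    lands = 0
    for r in range(rows):
        for c in range(cols):
            if wet[r][c] or (r, c) in seen:
                continue
            lands += 1  # found a new land; collect all of its pixels
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not wet[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return lands


# Toy height map (assumed data): two plateaus separated by a deep channel.
image = [
    [9, 9, 1, 7, 7],
    [9, 8, 1, 7, 6],
    [2, 2, 1, 3, 3],
]
levels = [count_lands(image, w) for w in (0, 2, 5, 8, 9)]
```

As the water rises, the single dry area splits into two lands, one of them disappears, and finally everything is flooded.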

Page 8: SWE 423: Multimedia Systems

Object Recognition Layer

• Features are analyzed to recognize objects and faces in an image database.
– Features are matched with object models stored in a knowledge base.
– Each template is inspected to find the closest match.
– Exact matches are usually impossible, and matching is generally computationally expensive.
– Occlusion of objects and the existence of spurious features in the image can further diminish the success of matching strategies.

Page 9: SWE 423: Multimedia Systems

Template Matching Techniques

• Fixed Template Matching
– Useful if object shapes do not change with respect to the viewing angle of the camera.
• Deformable Template Matching
– More suitable for cases where objects in the database may vary due to rigid and non-rigid deformations.

Page 10: SWE 423: Multimedia Systems

Fixed Template Matching

• Image Subtraction:
– Difference in intensity levels between the image and the template is used in object recognition.
– Performs well in restricted environments where imaging conditions (such as image intensity) between the image and the template are the same.
• Matching by correlation:
– Utilizes the position of the normalized cross-correlation peak between a template and image.
– Generally immune to noise and illumination effects in the image.
– Suffers from high computational complexity caused by summations over the entire template.
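The two fixed-template techniques can be compared with a small sketch. Both slide the template over every position in the image; image subtraction scores a window by the sum of absolute intensity differences (lower is better), while correlation scores it by the normalized cross-correlation (higher is better). The function names and the toy data are assumptions made for this example.

```python
import math


def patch(image, top, left, height, width):
    """Cut a height x width window out of the image."""
    return [row[left:left + width] for row in image[top:top + height]]


def subtraction_score(window, template):
    """Sum of absolute intensity differences; 0 means identical."""
    return sum(abs(p - t)
               for wr, tr in zip(window, template)
               for p, t in zip(wr, tr))


def ncc_score(window, template):
    """Normalized cross-correlation in [-1, 1]; flat windows score 0."""
    ps = [p for row in window for p in row]
    ts = [t for row in template for t in row]
    mp, mt = sum(ps) / len(ps), sum(ts) / len(ts)
    num = sum((p - mp) * (t - mt) for p, t in zip(ps, ts))
    den = math.sqrt(sum((p - mp) ** 2 for p in ps) *
                    sum((t - mt) ** 2 for t in ts))
    return num / den if den else 0.0


def best_position(image, template, score, pick):
    """Score every window; `pick` is min (subtraction) or max (correlation)."""
    h, w = len(template), len(template[0])
    scored = [(score(patch(image, r, c, h, w), template), (r, c))
              for r in range(len(image) - h + 1)
              for c in range(len(image[0]) - w + 1)]
    return pick(scored)[1]


template = [[1, 9], [9, 1]]
# Same imaging conditions: the template appears unchanged at (1, 2).
same = [[5, 5, 5, 5, 5], [5, 5, 1, 9, 5], [5, 5, 9, 1, 5], [5, 5, 5, 5, 5]]
# Brighter, higher-contrast copy of the template (2 * value + 10) at (1, 2).
brighter = [[20, 20, 20, 20, 20], [20, 20, 12, 28, 20],
            [20, 20, 28, 12, 20], [20, 20, 20, 20, 20]]

found_by_subtraction = best_position(same, template, subtraction_score, min)
found_by_correlation = best_position(brighter, template, ncc_score, max)
```

In the brighter image the subtraction score at the true position is far from zero, whereas the correlation peak is unaffected by the change in intensity. The nested loops over all windows and all template pixels also show where the computational cost comes from.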

Page 11: SWE 423: Multimedia Systems

Deformable Template Matching

• Template is represented as a bitmap describing the characteristic contour/edges of an object shape.
• An objective function with transformation parameters which alter the shape of the template is formulated reflecting the cost of such transformations.
• The objective function is minimized by iteratively updating the transformation parameters to best match the object.
• Applications include: handwritten character recognition and motion detection of objects in video frames.
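The following is a deliberately simplified sketch of the minimization loop, not the formulation used by any particular system. It assumes the template is a list of contour points, the transformation parameters are a translation `(tx, ty)` and a scale `s`, the objective is a symmetric squared point-to-point distance plus a deformation cost `weight * (s - 1)^2`, and the minimizer is a plain coordinate search with a shrinking step. All of these choices are assumptions made for illustration.

```python
def transform(points, tx, ty, s):
    """Scale the template by s, then translate it by (tx, ty)."""
    return [(s * x + tx, s * y + ty) for x, y in points]


def nearest_sq(point, others):
    """Squared distance from `point` to the closest point in `others`."""
    px, py = point
    return min((px - ox) ** 2 + (py - oy) ** 2 for ox, oy in others)


def objective(params, template, edges, weight=0.1):
    """Mismatch between template and edges plus the cost of deforming."""
    tx, ty, s = params
    moved = transform(template, tx, ty, s)
    fit = sum(nearest_sq(p, edges) for p in moved) / len(moved)
    fit += sum(nearest_sq(e, moved) for e in edges) / len(edges)
    return fit + weight * (s - 1.0) ** 2


def fit_template(template, edges, start=(0.0, 0.0, 1.0),
                 step=1.0, min_step=1e-3):
    """Iteratively update one parameter at a time while the cost drops."""
    params = list(start)
    best = objective(params, template, edges)
    while step > min_step:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                cost = objective(trial, template, edges)
                if cost < best:
                    params, best, improved = trial, cost, True
        if not improved:
            step /= 2  # refine the search once no move helps
    return tuple(params), best


# Unit-square template; the object is the same square, doubled and shifted.
template = [(0, 0), (1, 0), (1, 1), (0, 1)]
edges = [(3, 2), (5, 2), (5, 4), (3, 4)]
start_cost = objective((0.0, 0.0, 1.0), template, edges)
(tx, ty, s), cost = fit_template(template, edges)
```

Because the deformation cost penalizes scaling, the recovered scale settles slightly below the true factor of 2; raising `weight` makes the template stiffer.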

Page 12: SWE 423: Multimedia Systems

Prototype System: KMeD

• Medical objects belonging only to patients in a small age group are identified automatically in KMeD.
– Such objects have high contrast with respect to their background and have relatively simple shapes, large sizes, and little or no overlap with other objects.
• KMeD resorts to a human-assisted object recognition process otherwise.

Page 13: SWE 423: Multimedia Systems

Demo

• http://www.cs.washington.edu/research/imagedatabase/demo/cars/ (check car214)

Page 14: SWE 423: Multimedia Systems

Spatial Modeling and Knowledge Representation Layer (1)

• Maintain the domain knowledge for representing spatial semantics associated with image databases.

• At this level, queries are generally descriptive in nature, and focus mostly on semantics and concepts present in image databases.

• Semantics at this level are based on “spatial events” describing the relative locations of multiple objects.
– An example involving such semantics is a range query which involves spatial concepts such as close by, in the vicinity, larger than (e.g., retrieve all images that contain a large tumor in the brain).

Page 15: SWE 423: Multimedia Systems

Spatial Modeling and Knowledge Representation Layer (2)

• Identify spatial relationships among objects, once they are recognized and marked by the lower layer using bounding boxes or volumes.

• Several techniques have been proposed to formally represent spatial knowledge at this layer.
– Semantic networks
– Mathematical logic
– Constraints
– Inclusion hierarchies
– Frames.
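Once objects are marked with bounding boxes, spatial concepts such as "larger than", "close by", and "in" reduce to simple geometric predicates. The sketch below is one possible reading; the `Box` class, the predicate names, the area threshold for "large", and the sample scans are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a recognized object."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def area(self):
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)

    @property
    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def larger_than(a, b):
    return a.area > b.area


def close_by(a, b, max_distance):
    return math.dist(a.center, b.center) <= max_distance


def inside(a, b):
    """True when box a lies completely within box b."""
    return (b.x_min <= a.x_min and a.x_max <= b.x_max and
            b.y_min <= a.y_min and a.y_max <= b.y_max)


def has_large_tumor_in_brain(boxes, min_area):
    """Assumed reading of 'large': tumor area of at least min_area."""
    brains = [b for b in boxes if b.label == "brain"]
    tumors = [b for b in boxes if b.label == "tumor"]
    return any(t.area >= min_area and inside(t, br)
               for t in tumors for br in brains)


brain = Box("brain", 0, 0, 100, 80)
images = {
    "scan_a": [brain, Box("tumor", 60, 30, 80, 50)],   # tumor area 400
    "scan_b": [brain, Box("tumor", 20, 20, 23, 23)],   # tumor area 9
}
hits = [name for name, boxes in images.items()
        if has_large_tumor_in_brain(boxes, min_area=100)]
```

The query "retrieve all images that contain a large tumor in the brain" then becomes a filter over the stored boxes of each image.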

Page 16: SWE 423: Multimedia Systems

Semantic Networks

• First introduced to represent the meanings of English sentences in terms of words and relationships between them.

• Semantic networks are graphs of nodes representing concepts that are linked together by arcs representing relationships between these concepts.

• Efficiency in semantic networks is gained by representing each concept or object once and using pointers for cross references rather than naming an object explicitly every time it is involved in a relation.

• Example: Type Abstraction Hierarchies (KMeD)
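A minimal sketch of a semantic network shows where the efficiency comes from: each concept is stored once as a node, and arcs hold references to node objects rather than repeating names. The class names and the sample concepts are hypothetical and are not taken from KMeD.

```python
class Node:
    """A concept; arcs are (relation, target Node) pairs."""

    def __init__(self, name):
        self.name = name
        self.arcs = []


class SemanticNetwork:
    def __init__(self):
        self.nodes = {}  # each concept is stored exactly once

    def concept(self, name):
        """Return the single node for `name`, creating it on first use."""
        if name not in self.nodes:
            self.nodes[name] = Node(name)
        return self.nodes[name]

    def relate(self, source, relation, target):
        """Add an arc that points at the shared target node."""
        self.concept(source).arcs.append((relation, self.concept(target)))

    def related(self, source, relation):
        """Names of all concepts reached from `source` via `relation`."""
        return [target.name
                for rel, target in self.concept(source).arcs
                if rel == relation]


net = SemanticNetwork()
net.relate("tumor", "is_a", "lesion")
net.relate("cyst", "is_a", "lesion")
net.relate("lesion", "located_in", "brain")
```

Both `is_a` arcs point at the same `lesion` node object, so anything recorded about lesions is stored once and shared by every concept that refers to it.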

Page 17: SWE 423: Multimedia Systems

Brain Lesions Representation

Page 18: SWE 423: Multimedia Systems

TAH Example

Page 19: SWE 423: Multimedia Systems

Constraints-based Methodology

• Domain knowledge is represented using a set of constraints in conjunction with formal expressions such as predicate calculus or graphs.

• A constraint is a relationship between two or more objects that needs to be satisfied.
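As an illustration, a constraint can be modeled as a named predicate over two or more objects, and a candidate interpretation is accepted only if every constraint is satisfied. The sketch below assigns hypothetical person names to hypothetical detected faces by brute force; the `Constraint` class, the `left_of` predicate, and the data are assumptions for this example.

```python
from itertools import permutations


class Constraint:
    """A relationship between two or more named objects that must hold."""

    def __init__(self, name, variables, predicate):
        self.name = name
        self.variables = variables
        self.predicate = predicate

    def satisfied(self, assignment, positions):
        """Evaluate the predicate on the positions of the assigned faces."""
        values = [positions[assignment[v]] for v in self.variables]
        return self.predicate(*values)


def left_of(x_a, x_b):
    return x_a < x_b


def solve(names, positions, constraints):
    """Return every name-to-face assignment that meets all constraints."""
    solutions = []
    for faces in permutations(positions, len(names)):
        assignment = dict(zip(names, faces))
        if all(c.satisfied(assignment, positions) for c in constraints):
            solutions.append(assignment)
    return solutions


# Horizontal centers of three detected faces (assumed data).
positions = {"face_1": 40, "face_2": 120, "face_3": 200}
constraints = [
    Constraint("alice left of bob", ("alice", "bob"), left_of),
    Constraint("bob left of carol", ("bob", "carol"), left_of),
]
solutions = solve(["alice", "bob", "carol"], positions, constraints)
```

With both constraints in place only one assignment survives; dropping one constraint leaves several candidates, which shows how each constraint narrows the interpretation.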

Page 20: SWE 423: Multimedia Systems

Example: PICTION system

• Its architecture consists of a natural language processing module (NLP), an image understanding module (IU), and a control module.

• A set of constraints is derived by the NLP module from the picture captions. These constraints (called Visual Semantics by the author) are used with the faces recognized in the picture by the IU module to identify the spatial relationships among people.

• The control module maintains the constraints generated by the NLP module and acts as a knowledge-base for the IU module to perform face recognition functions.

Page 21: SWE 423: Multimedia Systems
Page 22: SWE 423: Multimedia Systems

Mathematical Logic

• Iconic Indexing by 2D strings: Uses projections of salient objects in a coordinate system.

• These projections are expressed in the form of 2D strings to form a partial ordering of object projections in 2D.

• For query processing, 2D subsequence matching is performed to allow similarity-based retrieval.

• Binary Spatial Relations: Uses Allen's 13 temporal relations to represent spatial relationships.
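Both ideas can be sketched on toy data. The first helper builds a simplified 2D string by ordering object symbols along each axis, writing `<` between different coordinates and `=` between equal ones; it assumes each object is reduced to a single reference point, which is a simplification. The second helper classifies two intervals, such as the projections of two bounding boxes on one axis, into one of Allen's 13 relations. The function names and relation labels are choices made for this example.

```python
def axis_string(objects, axis):
    """Order object symbols by their coordinate on one axis (0 = x, 1 = y)."""
    ordered = sorted(objects.items(), key=lambda kv: (kv[1][axis], kv[0]))
    parts = [ordered[0][0]]
    for (_, prev), (name, cur) in zip(ordered, ordered[1:]):
        parts.append("=" if cur[axis] == prev[axis] else "<")
        parts.append(name)
    return " ".join(parts)


def two_d_string(objects):
    """Return the (x-axis, y-axis) pair of ordering strings."""
    return axis_string(objects, 0), axis_string(objects, 1)


def allen(a, b):
    """Name the Allen relation between intervals a=(a1, a2) and b=(b1, b2)."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met_by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started_by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished_by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped_by"


# Reference points of three objects (assumed data).
objects = {"a": (1, 3), "b": (2, 1), "c": (2, 2)}
u, v = two_d_string(objects)
```

Applying `allen` once to the x-projections and once to the y-projections of two bounding boxes gives a pair of relations that describes their relative placement in the image.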

Page 23: SWE 423: Multimedia Systems

Inclusion Hierarchies

• The approach is object-oriented and uses concept classes and attributes to represent domain knowledge.

• These concepts may represent image features, high-level semantics, semantic operators and conditions.

Page 24: SWE 423: Multimedia Systems

Frames

• A frame usually consists of a name and a list of attribute-value pairs.

• A frame can be associated with a class of objects or with a class of concepts.

• Frame abstractions allow encapsulation of file names, features, and relevant attributes of image objects.
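A frame can be sketched as a name plus a dictionary of attribute-value pairs. The optional `parent` link, which lets an object frame fall back on the attributes of its class frame, is an assumption added for illustration, as are the frame names, file name, and attribute values.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Frame:
    """A named list of attribute-value pairs (slots)."""
    name: str
    slots: Dict[str, Any] = field(default_factory=dict)
    parent: Optional["Frame"] = None  # assumed link to a class-level frame

    def get(self, attribute):
        """Look up an attribute here first, then in the parent frames."""
        frame = self
        while frame is not None:
            if attribute in frame.slots:
                return frame.slots[attribute]
            frame = frame.parent
        raise KeyError(attribute)


# Class-level frame shared by all tumor objects (hypothetical values).
tumor_class = Frame("tumor", {"category": "lesion", "shape": "blob"})

# Object-level frame encapsulating a file name, features, and attributes.
tumor_17 = Frame(
    "tumor_17",
    {"file": "scan_a.png", "area": 400, "shape": "round"},
    parent=tumor_class,
)
```

Here `tumor_17.get("shape")` returns the object's own value, while `tumor_17.get("category")` is answered by the class frame.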