Transcript
1.1 Introduction to Image Processing
A new structure-based interest region detector called Principal Curvature-Based Regions (PCBR) is presented, which we use for object class recognition. The PCBR interest operator detects stable watershed regions within the multi-scale principal curvature image. To detect robust watershed regions, we "clean" a principal curvature image by combining a grayscale morphological close with our new "eigenvector flow" hysteresis threshold. Robustness across scales is achieved by selecting the maximally stable regions across consecutive scales. PCBR typically detects distinctive patterns distributed evenly on the objects, and it shows significant robustness to local intensity perturbations and intra-class variations. We evaluate PCBR both qualitatively (through visual inspection) and quantitatively (by measuring repeatability and classification accuracy in real-world object-class recognition problems). Experiments on different benchmark datasets show that PCBR is comparable or superior to state-of-the-art detectors for both feature matching and object recognition. Moreover, we demonstrate the application of PCBR to symmetry detection.
In many object recognition tasks, within-class changes in pose, lighting, color, and texture can cause considerable variation in local intensities. Consequently, local intensity no longer provides a stable detection cue. As such, intensity-based interest operators (e.g., Harris, Kadir) and the object recognition systems based on them often fail to identify discriminative features. An alternative to local intensity cues is to capture semi-local structural cues such as edges and curvilinear shapes [25]. These structural cues tend to be more robust to intensity, color, and pose variations. As such, they provide the basis for a more stable interest operator, which in turn improves object recognition accuracy. This paper introduces a new detector that exploits curvilinear structures to reliably detect interesting regions. The detector, called the Principal Curvature-Based Region (PCBR) detector, identifies stable watershed regions within the multi-scale principal curvature image.
Curvilinear structures are lines (either curved or straight), such as roads in aerial or satellite images or blood vessels in medical scans. These curvilinear structures can be detected over a range of viewpoints, scales, and illumination changes. The PCBR detector employs the first steps of Steger's curvilinear detector algorithm [25]. It forms an image of the maximum or minimum eigenvalue of the Hessian matrix at each pixel. We call this the principal curvature image, as it measures the principal curvature of the image intensity surface. This process generates a single response for both lines and edges, producing a clearer structural sketch of an image than is usually provided by the gradient magnitude image. We develop a process that detects structural regions efficiently and robustly using the watershed transform of the principal curvature image across scale space. The watershed algorithm provides a more efficient mechanism for defining structural regions than previous methods that fit circles, ellipses, and parallelograms [8, 27]. To improve the watershed's robustness to noise and other small image perturbations, we first "clean" the principal curvature image with a grayscale morphological close operation, followed by a new hysteresis thresholding method based on local eigenvector flow. The watershed transform is then applied to the cleaned principal curvature image, and the resulting watershed regions (i.e., the catchment basins) define the PCBR regions. To achieve robust detections across multiple scales, the watershed is applied to the maxima of three consecutive images in the principal curvature scale space (similar to the local scale-space extrema used by Lowe [13], Mikolajczyk and Schmid [17], and others), and we further search for stable PCBR regions across consecutive scales, an idea adapted from the stable regions detected across multiple threshold levels by the MSER detector [15].
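The first step of this pipeline, forming the principal curvature image from the Hessian, can be illustrated with a short sketch. This is not the authors' implementation: it uses plain central finite differences on a 2-D list of floats and omits the scale space, cleaning, and watershed stages entirely.

```python
import math

def principal_curvature(img):
    """Maximum-eigenvalue image of the 2x2 Hessian at each interior pixel.

    img: 2-D list of floats.  Border pixels are returned as 0.  This is an
    illustrative sketch of only the first PCBR step (no scale space, no
    morphological cleaning, no watershed).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central finite differences for the second derivatives.
            ixx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
            iyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
            ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
                   - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
            # Closed-form eigenvalues of the symmetric 2x2 Hessian:
            # mean of the diagonal plus/minus the discriminant.
            mean = (ixx + iyy) / 2.0
            gap = math.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
            out[y][x] = mean + gap  # maximum eigenvalue
    return out
```

For a dark line on a bright background, the maximum eigenvalue responds positively along the line, giving the single response for line structures described above.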
While PCBR shares ideas with previous detectors, it represents a very different approach to detecting interest regions. Many prior intensity-based detectors search for points with distinctive local differential geometry, such as corners, while ignoring image features such as lines and edges. Conversely, PCBR utilizes line and edge features to construct structural interest regions. Compared to MSER, PCBR differs in two important aspects. First, MSER does not analyze regions in scale space, so it does not provide different levels of region abstraction. Second, MSER's intensity-based threshold process cannot overcome local intensity variations within regions. PCBR, however, overcomes this difficulty by focusing on region boundaries rather than the appearance of region interiors. This work makes two contributions. First, we develop a new interest operator that utilizes principal curvature to extract robust and invariant region structures based on both edge and curvilinear features. Second, we introduce an enhanced principal-curvature-based watershed segmentation and robust region selection process that is robust to intra-class variations and is more efficient than previous structure-based detectors. We demonstrate the value of our PCBR detector by applying it to object-class recognition problems and symmetry detection.
Image processing is a form of signal processing where images and their properties can be used to gather and analyze information about the objects in the image. Digital image processing uses digital images and computer algorithms to enhance, manipulate, or transform images to obtain the necessary information and make decisions accordingly. Examples of digital image processing include improvement and analysis of the images of the Surveyor missions to the moon [15], magnetic resonance imaging scans of the brain, and electronic face recognition packages. These techniques can be used to assist humans with complex tasks and make them easier. A detailed analysis of an X-ray can help a radiologist to decide whether a bone is fractured or not. Digital image processing can increase the credibility of the decisions made by humans.
1.2 Introduction to Medical Imaging
Image processing techniques have developed and are applied to various fields like space programs, aerial and satellite imagery, and medicine [15]. Medical imaging is the set of digital image processing techniques that create and analyze images of the human body to assist doctors and medical scientists. In medicine, imaging is used for planning surgeries, X-ray imaging for bones, magnetic resonance imaging, endoscopy, and many other useful applications [31]. Digital X-ray imaging is used in this thesis project. Figure 1.1 shows the applications of digital imaging in medical imaging. Since Wilhelm Roentgen discovered X-rays in 1895 [14], X-ray technology has improved considerably. In medicine, X-rays help doctors to see inside a patient's body without surgery or any physical damage. X-rays can pass through solid objects without altering the physical state of the object because they have a small wavelength. So when this radiation is passed through a patient's body, objects of different density cast shadows of different intensities, resulting in black-and-white images. Bone, for example, will be shown in white as it is opaque, and air will be shown in black. The other tissues in the body will be in gray. A detailed analysis of the bone structure can be performed using X-rays, and any fractures can be detected. Conventionally, X-rays were taken using special photographic films using silver salts [28]. Digital X-rays can be taken using crystal photodiodes. Crystal photodiodes contain cadmium tungstate or bismuth germanate to capture light as electrical pulses. The signals are then converted from analogue to digital and can be viewed on computers.
Digital X-rays are very advantageous, as they are portable, require less energy than normal X-rays, are less expensive, and are environmentally friendly [28]. A radiologist would look at the X-rays and determine whether a bone was fractured or not. This process is time consuming and unreliable because the probability of a fractured bone is low. Some fractures are easy to detect, and a system can be developed to automatically detect fractures. This would assist doctors and radiologists in their work and improve the accuracy of the results [28]. According to the observations of [27], only 11% of the femur X-rays showed fractured bones, so the radiologist has to look at many X-rays to find a fractured one. An algorithm to automatically detect bone fractures could help the radiologist to find the fractured bones, or at least confidently sort out the healthy ones. But no single algorithm can be used for the whole body because of the complexity of different bone structures. Even though a lot of research has been done in this field, there is no system that completely solves the problem [14]. This is because there are several complicated parts to the problem of fracture detection. Digital X-rays are very detailed and complicated to interpret. Bones have different sizes and can differ in characteristics from person to person, so finding a general method to locate the bone and decide whether it is fractured or not is a complex problem. Some of the main aspects of the problem of automatic bone fracture detection are bone orientation in the X-ray, extraction of bone contour information, bone segmentation, and extraction of relevant features.
1.3 Description of the Problem
This thesis investigates different ways of separating a bone from an X-ray. Methods like edge detection and Active Shape Models are experimented with. The aim of this thesis is to find an efficient and reasonably fast way of separating the bone from the rest of the X-ray. The bone used for the analysis is the tibia. The tibia, also known as the shinbone or shankbone, is the larger and stronger of the two bones in the leg below the knee in vertebrates, and connects the knee with the ankle bones. Details of the X-ray data used are provided in the next section.
2.1 Theory Development
A typical digital image processing system consists of image segmentation, feature extraction, pattern recognition, thresholding, and error classification. Image processing aims at extracting the necessary information from the image. The image needs to be reduced to certain defining characteristics, and the analysis of these characteristics gives the relevant information. Figure 2.1 shows a process flow diagram of a typical digital image processing system, showing the sequence of the operations. Image segmentation is the main focus of this thesis. The other processes are briefly described for completeness and to inform the reader of the processes in the whole system.
2.1.1 Image Segmentation
Image segmentation is the process of extracting the regions of interest from an image. There are many operations for segmenting images, and their usage depends on the nature of the region to be extracted. For example, if an image has strong edges, edge detection techniques can be used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image: it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contour of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.
2 Segmentation of Images - An Overview
Image segmentation can proceed in three different ways:
- Manually
- Automatically
- Semiautomatically
2.1 Manual Segmentation
The pixels belonging to the same intensity range could be pointed out manually, but clearly this is a very time consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. The manual techniques all have in common the amount of time spent in tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.
2.2 Automatic Segmentation
Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could, for instance, be the noise level or the probability of the objects having a special distribution.
2.3 Semiautomatic Segmentation
Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.
- Thresholding
If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a, as follows:

if B(i, j) ≥ a then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all (i, j) over the image B [YGV].

This can be repeated for each region, dividing them by a threshold value, which results in four regions, etc. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but regionally do not belong to the selected region. These pixels could, for instance, appear from noise. The simplest choice of threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.
The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean values of the image corresponding to the two segmented regions, and calculate a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

Repeat until the threshold value does not change any more, and finally choose this value for the threshold segmentation.
To implement the triangle algorithm, construct a histogram of intensities vs. number of pixels, as in Figure 2.1. Draw a line between the maximum value of the histogram, hmax, and the minimum value, hmin, and calculate the distance d between the line and the histogram. Increase hmin and repeat for all h until h = hmax. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek make a weak peak in the histogram.
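Both histogram-based methods above can be sketched in a few lines. The following Python is an illustrative reading of the two descriptions, not code from this thesis; isodata works on a flat list of grey values and the triangle method on a histogram of counts per grey level.

```python
def isodata_threshold(pixels, t0=None, eps=0.5):
    """Isodata: split at t, average the two class means, repeat until the
    threshold stabilises.  `pixels` is a flat list of grey values; the
    initial guess t0 defaults to the global mean."""
    t = t0 if t0 is not None else sum(pixels) / len(pixels)
    while True:
        lo = [p for p in pixels if p <= t]
        hi = [p for p in pixels if p > t]
        if not lo or not hi:          # one class is empty: keep current t
            return t
        t_new = (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2.0
        if abs(t_new - t) < eps:
            return t_new
        t = t_new

def triangle_threshold(hist):
    """Triangle method: draw a line from the histogram peak to the last
    non-empty bin and pick the bin whose histogram point lies farthest
    from that line.  `hist` is a list of pixel counts per grey level."""
    peak = max(range(len(hist)), key=lambda i: hist[i])
    end = max(i for i, c in enumerate(hist) if c > 0)
    # Distance of each point (i, hist[i]) from the line through
    # (peak, hist[peak]) and (end, hist[end]), up to a constant factor.
    dx, dy = end - peak, hist[end] - hist[peak]
    best, best_d = peak, 0.0
    for i in range(peak, end + 1):
        d = abs(dy * (i - peak) - dx * (hist[i] - hist[peak]))
        if d > best_d:
            best, best_d = i, d
    return best
```

As the text notes, the triangle method picks out the weak shoulder next to a dominant background peak, which a simple mean-based threshold would miss.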
- Boundary tracking
Edge-finding by gradients is the method of selecting a boundary manually and automatically following this gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region and will meet problems if the gradient specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels compared to the threshold method only.
The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem where Δ = ∂²/∂x². Assume the boundary is blurred, so the gradient has a shape like that in Figure 2.2. The Laplacian will change sign just around the assumed edge at position 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
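The zero-crossing idea can be illustrated in 1-D with an assumed sketch (not code from this thesis): the second difference plays the role of the Laplacian, and a simple 3-tap average stands in for the smoothing filter mentioned above.

```python
def zero_crossings(signal, smooth_passes=2):
    """Locate edges in a 1-D signal as sign changes of the second
    difference (the 1-D Laplacian).  A few passes of 3-tap smoothing
    suppress noise-induced crossings; returned values are the indices
    (into the original signal) just left of each sign change."""
    s = list(signal)
    for _ in range(smooth_passes):
        s = [s[0]] + [(s[i - 1] + s[i] + s[i + 1]) / 3.0
                      for i in range(1, len(s) - 1)] + [s[-1]]
    # Second difference at interior samples: lap[i-1] ~ f''(i).
    lap = [s[i - 1] - 2 * s[i] + s[i + 1] for i in range(1, len(s) - 1)]
    return [i + 1 for i in range(len(lap) - 1)
            if lap[i] * lap[i + 1] < 0]
```

On a blurred step, the single reported crossing sits at the edge, matching the sign change around position 0 described above.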
- Clustering methods
Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector.
2.1.2 Feature Extraction
Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient, and edges. Generally, a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information. To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.
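The kind of feature vector and weighting described above can be sketched as follows. The feature names and weights here are hypothetical illustrations, not the features used in [22] or [27].

```python
def extract_features(img):
    """Tiny illustrative feature vector for a 2-D grey-level image:
    global mean, standard deviation, and mean absolute horizontal
    gradient (hypothetical choices for illustration)."""
    flat = [p for row in img for p in row]
    n = len(flat)
    mean = sum(flat) / n
    var = sum((p - mean) ** 2 for p in flat) / n
    grads = [abs(row[i + 1] - row[i])
             for row in img for i in range(len(row) - 1)]
    return {"mean": mean, "std": var ** 0.5,
            "grad": sum(grads) / len(grads)}

def weighted_score(features, weights):
    """Combine features with per-feature weights in [0, 1], in the spirit
    of the example above (mean weighted 0.9, std weighted 0.3)."""
    return sum(weights[k] * v for k, v in features.items() if k in weights)
```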
2.1.3 Classifiers and Pattern Recognition
After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of good femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM [22] and found the feature values for GO, MRF, and IGD that gave the best performance overall. Clustering and nearest neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long bone X-ray may point in a certain direction that is very different from the gradient vector of a fractured long bone X-ray. By observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.
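The nearest-neighbour idea mentioned above reduces to a few lines. This is an illustrative sketch only; the cited papers use more elaborate classifiers such as the Gini-SVM.

```python
def classify_nn(feature, examples):
    """1-nearest-neighbour classification: assign the label of the
    training example whose feature vector is closest in Euclidean
    distance.  `examples` is a list of (feature_vector, label) pairs."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(examples, key=lambda ex: dist(feature, ex[0]))[1]
```

For instance, with gradient-direction features for a hypothetical "healthy" and "fractured" prototype, an unknown X-ray is labelled by whichever prototype its own gradient vector lies closer to.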
2.1.4 Thresholding and Error Classification
Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
Thresholding is used at different stages in this thesis. It is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in an image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, which has the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.
(a) The original image (b) The histogram of the image
Figure 2.3: Histogram of image [23]
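Otsu's method [21] is compact enough to sketch from its histogram formulation. The following is a standard between-class-variance implementation, shown for illustration rather than taken from this thesis.

```python
def otsu_threshold(hist):
    """Otsu's method: choose the threshold that maximises the
    between-class variance w0*w1*(m0 - m1)^2, computed directly from
    the grey-level histogram `hist` (counts per grey level)."""
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    w0 = sum0 = 0
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w0 += h                      # pixels at or below t
        if w0 == 0:
            continue
        w1 = total - w0              # pixels above t
        if w1 == 0:
            break
        sum0 += t * h
        m0 = sum0 / w0               # mean of the lower class
        m1 = (total_sum - sum0) / w1 # mean of the upper class
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

On a cleanly bimodal histogram the chosen threshold falls in the valley between the two modes, which is exactly the manual histogram-based choice described above.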
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are most useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality. Contrast stretch, density slicing, edge enhancement, and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes available.
Contrast
Contrast generally refers to the difference in luminance or grey level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image. The larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically, as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
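The linear stretch described above amounts to one linear mapping per band. A minimal sketch, assuming a 2-D list of grey values (illustrative, not the exact mapping of any particular display system):

```python
def linear_stretch(img, out_min=0, out_max=255):
    """Linear contrast stretch: map the darkest input value to out_min,
    the brightest to out_max, and everything in between linearly."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                      # flat image: nothing to stretch
        return [[out_min for _ in row] for row in img]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row]
            for row in img]
```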
Non-Linear Contrast Enhancement
In these methods, the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found to be useful for enhancing the colour contrast between nearby classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values. Thus, the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
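Histogram equalization can be sketched from its cumulative-distribution definition. The following is one common variant (illustrative only), using a simple scaled CDF as the look-up table.

```python
def equalize(img, levels=256):
    """Histogram equalisation: build the cumulative distribution of grey
    values and use it, scaled to the output range, as a look-up table.
    Grouping of adjacent grey values (as described above) falls out of
    the rounding step."""
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    n = len(flat)
    cdf, run = [], 0
    for h in hist:
        run += h
        cdf.append(run)
    lut = [round(c * (levels - 1) / n) for c in cdf]
    return [[lut[p] for p in row] for row in img]
```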
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7, or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
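A minimal 3x3 mean filter matching the description above (interior pixels only, which is exactly why the output shrinks by two rows and two columns):

```python
def mean_filter3(img):
    """3x3 mean (low-pass) filter over the interior of a 2-D list.
    The output is two rows and two columns smaller than the input,
    as noted in the text; border handling is deliberately omitted."""
    h, w = len(img), len(img[0])
    return [[sum(img[y + dy][x + dx]
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
             for x in range(1, w - 1)]
            for y in range(1, h - 1)]
```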
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window. Thus, the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.
The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way, hence the use of this operation produces images with a more natural look than many of the other edge-enhanced images.
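The two linear operators just described can be sketched directly. These are illustrative finite-difference versions, not the exact kernels of any particular remote sensing package.

```python
def horizontal_first_difference(img):
    """Directional first-difference edge image (horizontal direction):
    approximates the first derivative between adjacent pixels."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)]
            for row in img]

def laplacian(img):
    """4-neighbour Laplacian over interior pixels: highlights points,
    lines and edges, and gives zero on uniform or linearly varying
    regions."""
    h, w = len(img), len(img[0])
    return [[img[y - 1][x] + img[y + 1][x]
             + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
             for x in range(1, w - 1)]
            for y in range(1, h - 1)]
```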
Band Ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information not available in any single band that is useful for discriminating between soils and vegetation.
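A band ratio is simply a pixel-wise division of two co-registered bands. A minimal sketch follows; the eps guard is an added assumption to avoid division by zero, not part of the description above.

```python
def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two co-registered bands (2-D lists of equal
    shape).  Illumination differences scale both bands similarly, so
    the ratio suppresses topographic shading effects."""
    return [[a / (b + eps) for a, b in zip(ra, rb)]
            for ra, rb in zip(band_a, band_b)]
```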
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods discussed in this thesis as used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour/Snake Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges in femurs to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object, based on a specified threshold, hence the name bi-level thresholding. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.
Active Shape Models (ASMs), introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying it. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with. The relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.
[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.
The working mechanisms of the methods discussed above are explained in detail in the sections that follow.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no fully automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time and error is studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can equivalently be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in image intensity, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
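The Sobel computation just described can be sketched as follows; this is a minimal illustration (not the thesis code), using SciPy's convolve for the 2D convolutions:

```python
import numpy as np
from scipy.ndimage import convolve

# Standard 3x3 Sobel kernels approximating the horizontal and vertical
# derivatives (note the weight of 2 in the central row/column).
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel_gradient(image):
    """Return gradient magnitude and direction (radians) per pixel."""
    dx = convolve(image, KX)
    dy = convolve(image, KY)
    return np.hypot(dx, dy), np.arctan2(dy, dx)

# A vertical step edge: constant regions give a zero vector,
# pixels on the edge give a nonzero vector pointing across it.
img = np.zeros((5, 5)); img[:, 3:] = 1.0
mag, ang = sobel_gradient(img)
```

As the text notes, the magnitude is exactly zero in the flat regions and positive only where the intensity changes.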
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give weighting to the current pixel while calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the gradient magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used because it is simple, easy to implement and faster than other methods. It is implemented by convolving the input image with 2×2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering produces a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). Then the gradient of the image is calculated, as in filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels that are not local maxima of the gradient magnitude are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then used on the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds, it is set to 1 if it is adjacent or diagonally adjacent to a high-value pixel, and to 0 otherwise [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
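The double-threshold step described above can be sketched as follows; connected-component labeling stands in for the neighbor-by-neighbor tracing, and the threshold values are illustrative:

```python
import numpy as np
from scipy.ndimage import label

def hysteresis(grad, low, high):
    """Double thresholding: pixels above `high` are kept outright;
    pixels between `low` and `high` survive only if their 8-connected
    weak component contains at least one strong pixel."""
    strong = grad > high
    weak = grad > low
    lab, n = label(weak, structure=np.ones((3, 3)))  # 8-adjacency
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(lab[strong])] = True
    keep[0] = False                      # background label is never kept
    return keep[lab]

g = np.array([[0.9, 0.4, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.4]])
out = hysteresis(g, low=0.3, high=0.5)  # the 0.4 next to 0.9 survives
```

The isolated weak pixel in the corner is dropped, while the weak pixel connected to a strong one is promoted, exactly the behavior the text describes.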
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis uses the texture of the image to analyze it: it attempts to quantify the visual or other simple characteristics of an image so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]   (1)
where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g. the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys and cliffs can be detected over a range of viewpoints, scales and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)
or
P(x) = min(λ2(x), 0)   (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue until a total of n = log2(min(w, h)) − 3 octaves are created, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:
MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5   (4)
where MPij = max(Pi,j−1, Pij, Pi,j+1). Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
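As a sketch of Eqs. 2 and 4, the following computes P(x) from the closed-form eigenvalues of the Gaussian-derivative Hessian and then takes the elementwise maximum over consecutive scales; the scale sampling is illustrative and this is not the authors' implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvature(image, sigma):
    """P(x) = max(lambda1(x), 0): the clamped maximum eigenvalue of the
    2x2 Hessian of the Gaussian-smoothed image at each pixel."""
    Ixx = gaussian_filter(image, sigma, order=(0, 2))
    Iyy = gaussian_filter(image, sigma, order=(2, 0))
    Ixy = gaussian_filter(image, sigma, order=(1, 1))
    mean = 0.5 * (Ixx + Iyy)
    root = np.sqrt((0.5 * (Ixx - Iyy)) ** 2 + Ixy ** 2)
    return np.maximum(mean + root, 0.0)  # closed-form largest eigenvalue

# One octave: six scales sigma = k^(j-1) with k = 2^(1/3), then
# MP_j = max(P_{j-1}, P_j, P_{j+1}) elementwise over each triplet.
img = np.ones((32, 32)); img[16, :] = 0.0   # dark line on light background
k = 2 ** (1 / 3)
P = np.stack([principal_curvature(img, k ** (j - 1)) for j in range(1, 7)])
MP = np.stack([np.maximum.reduce(P[j - 1:j + 2]) for j in range(1, 5)])
```

On this test image the response is large along the dark line and essentially zero in the flat background, matching the behavior Eq. 2 is designed for.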
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image: small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit, via PCA, with ellipses that have the same second moment as the watershed regions (Fig. 2(e)).
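A simplified sketch of this cleaning-and-segmentation pipeline: a grayscale closing (a 5×5 square approximates the disk here), a plain threshold in place of the eigenvector-guided hysteresis, and connected components of the 0-valued pixels, which on a binary image coincide with the watershed catchment basins:

```python
import numpy as np
from scipy.ndimage import grey_closing, label

def clean_and_segment(mp, threshold=0.04):
    """Close small 'potholes' in the principal-curvature terrain, binarize,
    and return the catchment basins as connected components of the
    0-valued (non-ridge) pixels. Plain thresholding stands in for the
    paper's eigenvector-flow hysteresis."""
    closed = grey_closing(mp, size=(5, 5))   # square structuring element
    ridges = closed > threshold              # binarized ridge mask
    regions, n = label(~ridges)              # basins between the ridges
    return regions, n

# A single vertical ridge should split the image into two basins.
mp = np.zeros((7, 7)); mp[:, 3] = 1.0
regions, n = clean_and_segment(mp)
```

The ridge column survives the closing and the threshold, and the two sides of it come back as two distinct regions.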
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave; the overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
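The chromatic clustering step can be sketched with plain Lloyd's k-means (the complete-linkage refinement is omitted, and the initialization and data here are illustrative):

```python
import numpy as np

def kmeans_labels(pixels, k, iters=20):
    """Plain Lloyd's k-means on per-pixel chromatic feature vectors:
    the step that groups strokes of similar colour into candidate layers."""
    pixels = np.asarray(pixels, dtype=float)
    # Deterministic initialization: k evenly spaced sample points.
    idx = np.linspace(0, len(pixels) - 1, k).round().astype(int)
    centers = pixels[idx].copy()
    for _ in range(iters):
        d = np.linalg.norm(pixels[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)            # assign to nearest centre
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers

# Two obvious colour groups (hypothetical RGB feature vectors).
pix = np.array([[250, 10, 10], [245, 12, 8], [10, 10, 240], [12, 8, 250]])
labels, centers = kmeans_labels(pix, k=2)
```

After convergence, each cluster's mean chromatic vector plays the role of the label descriptor mentioned in the text.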
3.1 Spatially Coherent Segmentation
We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M way.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means); in other words, we assume each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
min_L  Σ_p ||f_p − c_{L_p}||²_2  +  λ Σ_{{p,q} ∈ N} |e_pq| · δ_T[L_p ≠ L_q]   (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, δ_T is the delta function, and λ weights the spatial regularization. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After the spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
3.2 Curvature-Based Inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Here level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless; therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.7 that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and pairs of line segments are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield Units). 70 ml of nonionic contrast agent, delivered at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) with an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.
B. Method
The stages followed while doing lung segmentation from CTA images in this work are shown in Figure 1. The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels) and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding keeps the parts of the image brighter than 700 HU. At the end of thresholding, the new images are logical (binary):
Thresh = image > 700
In each of these new images, sub-segmental vessels are present in the lung region. At the second step, the following method has been used to get rid of these vessels: first, each 2D image has been considered one by one, and each component in the image has been labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, components whose pixel counts are under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 has been labeled with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component has been kept, the other parts have been removed from the image, and then the complement of the image has been taken, so all 0s turn into 1s and all 1s turn into 0s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach the first or 512th pixel of the image, the parts that meet this condition have been removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image has been labeled with the connected component labeling algorithm, and the components whose pixel counts are below 1000 have been identified as airways and removed from the image. The image now at hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).
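The repeated "label the mask, then drop components under 1000 pixels" step used above for vessels and airways can be sketched as:

```python
import numpy as np
from scipy.ndimage import label

def remove_small_components(mask, min_pixels=1000):
    """Label a binary mask with connected-component labelling and drop
    every component smaller than `min_pixels` (the vessel/airway
    removal step described in the text)."""
    lab, n = label(mask)
    sizes = np.bincount(lab.ravel())     # pixel count per label
    keep = sizes >= min_pixels
    keep[0] = False                      # background is never kept
    return keep[lab]

mask = np.zeros((64, 64), dtype=bool)
mask[2:42, 2:42] = True                  # large component: 1600 pixels
mask[50:55, 50:55] = True                # small component: 25 pixels
cleaned = remove_small_components(mask, min_pixels=1000)
```

The large region survives while the small one is erased, mirroring how sub-segmental vessels and airways are filtered out of the lung mask.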
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions implemented in the open MATLAB language have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed, in extreme close-up view, in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1. Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a. The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data. An area graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b. Area graph of an X-ray CT brain scan
The 3-D Surface Plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a. 3D surface plot of an X-ray CT brain scan generated with histogram values, alpha(0)
Figure 4b. 3D surface plot of an X-ray CT brain scan generated with histogram values, alpha(0.4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
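The behaviour described for 'meshgrid' can be illustrated outside MATLAB too; below is a minimal NumPy sketch of the same idea (the vectors and the function Z are arbitrary examples, not taken from the text):

```python
import numpy as np

# Two coordinate vectors define the domain.
x = np.array([1, 2, 3])
y = np.array([10, 20])

# meshgrid copies x along the rows of X and y along the columns of Y.
X, Y = np.meshgrid(x, y)
# X = [[1, 2, 3],        Y = [[10, 10, 10],
#      [1, 2, 3]]             [20, 20, 20]]

# Evaluating a function of two variables over the whole grid at once:
Z = X ** 2 + Y
```

MATLAB's meshgrid follows the same convention, so surfaces such as those in Figure 4 can be generated by evaluating the function on X and Y.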
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see. 'Lighting' processing can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with colormap scaling ('imagesc' function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through the colors cyan, yellow and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole-Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
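The magnitude and phase responses that FVTool plots can be computed directly from the filter coefficients b and a: evaluate H(e^jw) = B(e^jw) / A(e^jw) on the unit circle. A plain-Python sketch follows; the first-order low-pass coefficients are an illustrative example, not taken from the text:

```python
import cmath
import math

def freq_response(b, a, n_points=512):
    # Evaluate H(e^{jw}) = B(e^{jw}) / A(e^{jw}) at n_points frequencies in [0, pi).
    ws, hs = [], []
    for i in range(n_points):
        w = math.pi * i / n_points
        z = cmath.exp(-1j * w)                       # e^{-jw}
        num = sum(bk * z ** k for k, bk in enumerate(b))
        den = sum(ak * z ** k for k, ak in enumerate(a))
        ws.append(w)
        hs.append(num / den)
    return ws, hs

# Example first-order low-pass filter (illustrative coefficients only).
b = [0.5, 0.5]   # numerator
a = [1.0]        # denominator
w, h = freq_response(b, a)

magnitude_db = [20 * math.log10(abs(v)) for v in h]  # magnitude response in dB
phase = [cmath.phase(v) for v in h]                  # phase response in radians
```

MATLAB's freqz (and FVTool internally) performs this same evaluation; the plots in Figures 11-17 are different views of the resulting response.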
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n x 2 array, where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis and in the code used, a shape will be defined as a 2n x 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).
Algorithm 1 Aligning shapes
Input: a set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align it to x̄0 and scale it to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: the set of aligned shapes and the mean shape
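Algorithm 1 can be sketched in Python with NumPy. This is an illustrative implementation under the usual similarity-alignment assumptions; the function names and the fixed iteration cap are mine, not the thesis code:

```python
import numpy as np

def center(shape):
    # Translate a shape (n x 2 array of points) so its centroid is at the origin.
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    # Scale so the RMS distance of the points from the centroid is 1.
    return shape / np.sqrt((shape ** 2).sum(axis=1).mean())

def align_to(shape, ref):
    # Rotate a centered shape to best match `ref` (orthogonal Procrustes).
    u, _, vt = np.linalg.svd(ref.T @ shape)
    return shape @ (u @ vt).T

def align_shapes(shapes, iters=10):
    shapes = [center(s) for s in shapes]        # step 2: center on the origin
    mean = to_unit_size(shapes[0])              # steps 1 and 3: initial mean shape
    for _ in range(iters):                      # step 4: iterate to convergence
        shapes = [align_to(s, mean) for s in shapes]
        mean = to_unit_size(np.mean(shapes, axis=0))
    return shapes, mean

# A shape and a rotated, translated copy of it align onto each other.
base = np.array([[0., 0.], [2., 0.], [2., 1.], [0., 3.]])
c, s = np.cos(0.5), np.sin(0.5)
rot = np.array([[c, -s], [s, c]])
aligned, mean = align_shapes([base, base @ rot.T + 2.0])
```

Centering removes translation, the unit-size constraint removes scale, and the Procrustes rotation removes orientation, which matches the claim in Section 4.1 that a shape is unchanged by these transformations.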
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images. Figure 4.3 shows the original image and the manually landmarked image used for training. While performing tests with different numbers of landmark points, a subset of these landmark points is chosen. After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. So while the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the result remains a plausible variation of the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape plausible.
The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated, together with the permissible variations in it [24]:

^x = x̄ + Φb    (4.3)

where
^x is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
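Equation 4.3 is a PCA reconstruction, and generating shapes by varying b can be sketched in NumPy. The toy training data and function names below are illustrative assumptions, not the thesis code:

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    # shapes: (m, 2n) array, one aligned training shape per row
    # (x co-ordinates followed by y co-ordinates, as in Section 4.1).
    mean = shapes.mean(axis=0)
    cov = np.cov(shapes, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_modes]    # keep the largest modes of variation
    return mean, vecs[:, order]

def generate_shape(mean, phi, b):
    # Equation 4.3: x_hat = mean + phi @ b.
    return mean + phi @ b

rng = np.random.default_rng(0)
training = rng.normal(size=(20, 8))             # toy data: 20 shapes of 4 points each
mean, phi = build_shape_model(training)
x_hat = generate_shape(mean, phi, np.array([0.0, 0.0]))  # b = 0 gives the mean shape
```

Setting each component of b within a few standard deviations (sqrt of the corresponding eigenvalue) is the usual way to keep the generated shapes plausible.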
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at each landmark are called whiskers, and they help the profile model in analyzing the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix Sg.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d(g) = (g - ḡ)ᵀ Sg⁻¹ (g - ḡ)
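This profile-distance computation can be sketched in NumPy (the function name is illustrative; note that with an identity covariance the measure reduces to the squared Euclidean distance):

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S):
    # Squared Mahalanobis distance between a sampled profile g and the
    # mean profile g_mean, with profile covariance matrix S.
    # Solving S x = d avoids forming the explicit inverse of S.
    d = g - g_mean
    return float(d @ np.linalg.solve(S, d))

# With an identity covariance: (1 - 0)^2 + (2 - 0)^2 = 5.
d2 = mahalanobis_sq(np.array([1.0, 2.0]), np.zeros(2), np.eye(2))
```

During the search, this distance is evaluated at each candidate offset along the whisker and the offset with the lowest distance is chosen.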
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile matches, yet the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image, and a general picture, not a bone X-ray, is shown.
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it will take time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. But the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, that is, started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. So it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. Susan - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
principal curvature image across scale space. The watershed algorithm provides a more efficient mechanism for defining structural regions than previous methods that fit circles, ellipses and parallelograms [8, 27]. To improve the watershed's robustness to noise and other small image perturbations, we first "clean" the principal curvature image with a gray scale morphological close operation, followed by a new hysteresis thresholding method based on local eigenvector flow. The watershed transform is then applied to the cleaned principal curvature image, and the resulting watershed regions (i.e. the catchment basins) define the PCBR regions. To achieve robust detections across multiple scales, the watershed is applied to the maxima of three consecutive images in the principal curvature scale space (similar to the local scale-space extrema used by Lowe [13], Mikolajczyk and Schmid [17], and others), and we further search for stable PCBR regions across consecutive scales, an idea adapted from the stable regions detected across multiple threshold levels used by the MSER detector [15].
While PCBR shares similar ideas with previous detectors, it represents a very different approach to detecting interest regions. Many prior intensity-based detectors search for points with distinctive local differential geometry, such as corners, while ignoring image features such as lines and edges. Conversely, PCBR utilizes line and edge features to construct structural interest regions. Compared to MSER, PCBR differs in two important aspects. First, MSER does not analyze regions in scale space, so it does not provide different levels of region abstraction. Second, MSER's intensity-based threshold process cannot overcome local intensity variations within regions. PCBR, however, overcomes this difficulty by focusing on region boundaries rather than the appearance of region interiors. This work makes two contributions. First, we develop a new interest operator that utilizes principal curvature to extract robust and invariant region structures based on both edge and curvilinear features. Second, we introduce an enhanced principal-curvature-based watershed segmentation and robust region selection process that is robust to intra-class variations and is more efficient than previous structure-based detectors. We demonstrate the value of our PCBR detector by applying it to object-class recognition problems and symmetry detection.
Image processing is a form of signal processing where images and their properties can be used to gather and analyze information about the objects in the image. Digital image processing uses digital images and computer algorithms to enhance, manipulate or transform images, to obtain the necessary information and make decisions accordingly. Examples of digital image processing include the improvement and analysis of the images of the Surveyor missions to the moon [15], magnetic resonance imaging scans of the brain, and electronic face recognition packages. These techniques can be used to assist humans with complex tasks and make them easier. For example, a detailed analysis of an X-ray can help a radiologist to decide whether a bone is fractured or not. Digital image processing can increase the credibility of the decisions made by humans.
1.2 Introduction to Medical Imaging
Image processing techniques have developed and are applied to various fields like space programs, aerial and satellite imagery, and medicine [15]. Medical imaging is the set of digital image processing techniques that create and analyze images of the human body to assist doctors and medical scientists. In medicine, imaging is used for planning surgeries, X-ray imaging of bones, magnetic resonance imaging, endoscopy and many other useful applications [31]. Digital X-ray imaging is used in this thesis project. Figure 1.1 shows the applications of digital imaging in medical imaging. Since Wilhelm Roentgen discovered X-rays in 1895 [14], X-ray technology has improved considerably. In medicine, X-rays help doctors to see inside a patient's body without surgery or any physical damage. X-rays can pass through solid objects without altering the physical state of the object because they have a small wavelength. So when this radiation is passed through a patient's body, objects of different density cast shadows of different intensities, resulting in black-and-white images. Bone, for example, will be shown in white as it is opaque, air will be shown in black, and the other tissues in the body will be in gray. A detailed analysis of the bone structure can be performed using X-rays, and any fractures can be detected. Conventionally, X-rays were taken on special photographic films using silver salts [28]. Digital X-rays can be taken using crystal photodiodes, which contain cadmium tungstate or bismuth germanate to capture light as electrical pulses. The signals are then converted from analogue to digital and can be viewed on computers.
Digital X-rays are very advantageous as they are portable, require less energy than normal X-rays, are less expensive, and are environmentally friendly [28]. A radiologist would look at the X-rays and determine whether a bone was fractured. This system is time consuming and unreliable because the probability of a fractured bone is low. Some fractures are easy to detect, and a system can be developed to detect them automatically. This would assist the doctors and radiologists in their work and improve the accuracy of the results [28]. According to the observations of [27], only 11% of the femur X-rays showed fractured bones, so the radiologist has to look at a lot of X-rays to find a fractured one. An algorithm to automatically detect bone fractures could help the radiologist to find the fractured bones, or at least confidently sort out the healthy ones. But no single algorithm can be used for the whole body because of the complexity of different bone structures. Even though a lot of research has been done in this field, there is no system that completely solves the problem [14]. This is because there are several complicated parts to the problem of fracture detection. Digital X-rays are very detailed and complicated to interpret. Bones have different sizes and can differ in characteristics from person to person, so finding a general method to locate the bone and decide whether it is fractured is a complex problem. Some of the main aspects of the problem of automatic bone fracture detection are: bone orientation in the X-ray, extraction of bone contour information, bone segmentation, and extraction of relevant features.
1.3 Description of the Problem
This thesis investigates different ways of separating a bone from an X-ray. Methods like edge detection and Active Shape Models are experimented with. The aim of this thesis is to find an efficient and reasonably fast way of separating the bone from the rest of the X-ray. The bone that was used for the analysis is the tibia. The tibia, also known as the shinbone or shankbone, is the larger and stronger of the two bones in the leg below the knee in vertebrates, and connects the knee with the ankle bones. Details of the X-ray data used are provided in the next section.
2.1 Theory Development
A typical digital image processing system consists of image segmentation, feature extraction, pattern recognition, thresholding and error classification. Image processing aims at extracting the necessary information from the image. The image needs to be reduced to certain defining characteristics, and the analysis of these characteristics gives the relevant information. Figure 2.1 shows a process flow diagram of a typical digital image processing system, showing the sequence of the operations. Image segmentation is the main focus of this thesis. The other processes are briefly described for completeness and to inform the reader of the processes in the whole system.
2.1.1 Image Segmentation
Image segmentation is the process of extracting the regions of interest from an image. There are many operations to segment images, and their usage depends on the nature of the region to be extracted. For example, if an image has strong edges, edge detection techniques can be used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image: it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contour of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.
2 Segmentation of Images - An Overview
Image segmentation can proceed in three different ways:
- Manually
- Automatically
- Semiautomatically
2.1 Manual Segmentation
The pixels belonging to the same intensity range could be manually pointed out, but clearly this is a very time consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. The manual techniques all have in common the amount of time spent in tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.
2.2 Automatic Segmentation
Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could for instance be the noise level and the probability of the objects having a special distribution.
2.3 Semiautomatic Segmentation
Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.
- Thresholding
If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a, as follows: B(i, j) = 1 (object) if B(i, j) >= a, else B(i, j) = 0 (background), for all i, j over the image B [YGV]. This can be repeated for each region, dividing them by a threshold value, which results in four regions, and so on. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but do not actually belong to the selected region; these pixels could for instance appear from noise. The simplest choice of threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.
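The fixed-threshold rule above, using the image mean as the threshold value, can be sketched in plain Python (the tiny 2x2 "image" is an illustrative stand-in):

```python
def threshold(image, a):
    # B(i, j) >= a -> 1 (object); otherwise 0 (background).
    return [[1 if p >= a else 0 for p in row] for row in image]

# Using the image mean as the fixed threshold value.
image = [[10, 200], [30, 250]]
mean = sum(p for row in image for p in row) / 4   # 122.5
binary = threshold(image, mean)
```

The histogram-derived alternatives mentioned above differ only in how the value a is chosen; the pixel-wise comparison stays the same.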
The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image in each of the two segmented regions, and calculate a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

Repeat until the threshold value does not change any more, and finally choose this value for the threshold segmentation.
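A minimal Python sketch of the isodata iteration described above (the pixel values and starting threshold are illustrative):

```python
def isodata_threshold(pixels, t0, eps=0.5):
    # Split at t, recompute the two region means, and set t to their average;
    # stop when the threshold no longer changes appreciably.
    t = t0
    while True:
        low = [p for p in pixels if p < t]
        high = [p for p in pixels if p >= t]
        t_new = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(t_new - t) < eps:
            return t_new
        t = t_new

# Two well-separated intensity groups: the threshold settles between them.
t = isodata_threshold([10, 12, 14, 200, 210, 220], t0=100)
```

Here the region means are 12 and 210, so the iteration converges to a threshold of 111.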
To implement the triangle algorithm, construct a histogram of intensities vs. number of pixels like in Figure 2.1. Draw a line between the maximum value of the histogram, hmax, and the minimum value, hmin, and calculate the distance d between the line and the histogram. Increase hmin and repeat for all h until h = hmax. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek make a weak peak.
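The triangle method can be sketched as follows; the histogram is an illustrative example with a strong background peak and a weak object tail:

```python
def triangle_threshold(hist):
    # Line from the histogram peak (hmax) to the last occupied bin (hmin);
    # the threshold is the bin h whose point (h, hist[h]) lies furthest
    # (in perpendicular distance) from that line.
    peak = max(range(len(hist)), key=lambda i: hist[i])
    tail = max(i for i, c in enumerate(hist) if c > 0)
    x1, y1, x2, y2 = peak, hist[peak], tail, hist[tail]
    norm = ((y2 - y1) ** 2 + (x2 - x1) ** 2) ** 0.5
    best_h, best_d = peak, 0.0
    for h in range(peak, tail + 1):
        # Standard point-to-line distance formula.
        d = abs((y2 - y1) * h - (x2 - x1) * hist[h] + x2 * y1 - y2 * x1) / norm
        if d > best_d:
            best_h, best_d = h, d
    return best_h

# A strong background peak at bin 0 with a weak object tail.
t = triangle_threshold([100, 60, 30, 10, 4, 2, 1])
```

The threshold lands where the histogram bends away from the peak-to-tail line, which is exactly the "weak peak" situation the method targets.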
- Boundary tracking
Edge-finding by gradients is the method of selecting a boundary manually and automatically following its gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region, and will meet problems if the gradient specifying the boundary varies or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels compared to the threshold method alone.
The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem, where the Laplacian reduces to d^2/dx^2. Assume the boundary is blurred, so the gradient has a shape like in Figure 2.2; the Laplacian will then change sign just around the assumed edge at position 0. For noisy images the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
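A 1D sketch of the zero-crossing procedure: smooth the signal, take the second difference as a discrete Laplacian, and report where it changes sign (the blurred step edge is an illustrative example):

```python
def smooth(signal, k=3):
    # Simple moving-average smoothing filter (window size k).
    half = k // 2
    return [sum(signal[max(0, i - half):i + half + 1]) /
            len(signal[max(0, i - half):i + half + 1])
            for i in range(len(signal))]

def zero_crossings(signal):
    # Discrete Laplacian (second difference), then the sign-change positions.
    lap = [signal[i - 1] - 2 * signal[i] + signal[i + 1]
           for i in range(1, len(signal) - 1)]
    return [i for i in range(1, len(lap)) if lap[i - 1] * lap[i] < 0]

# A blurred step edge: the Laplacian changes sign near the middle of the ramp.
edge = [0, 0, 0, 1, 3, 7, 9, 10, 10, 10]
crossings = zero_crossings(smooth(edge))
```

Without the smoothing step, noise in the flat parts of the signal would produce spurious sign changes, which is the problem the text describes.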
- Clustering methods
Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector, but
2.1.2 Feature Extraction
Feature extraction is the process of reducing the segmented image to a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient and edges. Generally a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify their importance. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.
To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF) and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.
2.1.3 Classifiers and Pattern Recognition
After the feature extraction stage the features have to be analyzed and a pattern needs to be recognized. For example, features like the neck-shaft angle in a femur X-ray image need to be plotted. Patterns can be recognized if the neck-shaft angles of healthy femurs differ from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF and IGD that gave the best performance overall. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a direction that is very different from the gradient vector of a fractured long-bone X-ray. By observing this, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.
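A minimal nearest-neighbour sketch of this idea, with made-up feature values (the two dimensions loosely stand in for a neck-shaft angle and a gradient-direction measure; none of these numbers come from the cited papers):

```python
import numpy as np

def nearest_neighbour_classify(feature, examples, labels):
    """Label an unknown feature vector by its closest training example."""
    dists = np.linalg.norm(examples - feature, axis=1)
    return labels[int(np.argmin(dists))]

# Hypothetical training features: (angle in degrees, gradient-direction measure)
train = np.array([[128.0, 0.1], [130.0, 0.2], [95.0, 1.4], [100.0, 1.2]])
labels = ["healthy", "healthy", "fractured", "fractured"]
print(nearest_neighbour_classify(np.array([98.0, 1.3]), train, labels))  # "fractured"
```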
2.1.4 Thresholding and Error Classification
Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
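Otsu's method can be sketched directly from its definition, picking the threshold that maximises the between-class variance of the grey-level histogram (a NumPy sketch, not the implementation used in the thesis):

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Otsu's method: pick the threshold maximising between-class variance."""
    hist, bin_edges = np.histogram(image, bins=nbins)
    hist = hist.astype(float) / hist.sum()
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2.0
    w0 = np.cumsum(hist)                 # class-0 (background) probability
    w1 = 1.0 - w0                        # class-1 (foreground) probability
    mu = np.cumsum(hist * centers)       # cumulative mean
    mu_total = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (mu_total * w0 - mu) ** 2 / (w0 * w1)
    between[~np.isfinite(between)] = 0.0  # guard the degenerate end bins
    return centers[np.argmax(between)]

# Two well-separated intensity populations: the threshold falls between them
dark = np.full(500, 40.0)
light = np.full(500, 200.0)
t = otsu_threshold(np.concatenate([dark, light]))
```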
Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, with the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.
(a) The original image (b) The histogram of the image
Figure 2.3: Histogram of an image [23]
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality. Contrast stretch, density slicing, edge enhancement and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes available.
Contrast
Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas that initially had only a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image become clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
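A minimal sketch of the linear stretch, assuming an 8-bit display range of 0..255 (the tiny sample band is invented):

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Linearly map the band's min..max range onto the full display range."""
    lo, hi = band.min(), band.max()
    scaled = (band - lo) / float(hi - lo)               # normalise to 0..1
    return (scaled * (out_max - out_min) + out_min).astype(np.uint8)

# A low-contrast band occupying only DNs 60..108 is stretched to 0..255
band = np.array([[60, 84], [96, 108]], dtype=float)
stretched = linear_stretch(band)
```

The darkest input value maps to extreme black (0) and the brightest to extreme white (255), with the remaining values distributed linearly in between, exactly as described above.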
Non-Linear Contrast Enhancement
In these methods the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearby classes and subclasses of a main class. One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse-log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique the histogram of the original image is redistributed to produce a uniform population density, which is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
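Histogram equalization can be sketched as mapping each grey level through the image's scaled cumulative histogram (a NumPy sketch for 8-bit data; the sample pixels are illustrative):

```python
import numpy as np

def equalize_histogram(image, nbins=256):
    """Redistribute grey levels so the cumulative histogram is roughly linear."""
    hist, _ = np.histogram(image.ravel(), bins=nbins, range=(0, 256))
    cdf = hist.cumsum().astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())   # normalise to 0..1
    # Map each grey level through the scaled CDF
    levels = np.floor(cdf * 255).astype(np.uint8)
    return levels[image.astype(np.uint8)]

# Pixels bunched into levels 100..103 get spread across the output range
img = np.array([[100, 100, 101], [101, 102, 103]], dtype=np.uint8)
eq = equalize_histogram(img)
```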
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area of an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3×3, 5×5, 7×7 or 9×9. The simple smoothing operation will, however, blur the image, especially at the edges of objects, and blurring becomes more severe as the size of the kernel increases. Using a 3×3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median and mode filters.
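Option (1), border replication with a 3×3 mean kernel, might look like this (a deliberately unoptimised NumPy sketch):

```python
import numpy as np

def mean_filter_3x3(image):
    """Low-pass (mean) filter; the border is extended by replication so the
    output keeps the original image size."""
    padded = np.pad(image.astype(float), 1, mode="edge")  # repeat border pixels
    out = np.zeros_like(image, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()   # BVout = mean of window
    return out

# A single bright pixel is smeared out by the smoothing
img = np.zeros((5, 5))
img[2, 2] = 9.0
smooth = mean_filter_3x3(img)
```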
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth-science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or non-linear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.
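A sketch of the directional first difference (the direction names and sample image are ours):

```python
import numpy as np

def first_difference(image, direction="horizontal"):
    """Approximate the first derivative as the difference of adjacent pixels."""
    img = image.astype(float)
    if direction == "horizontal":
        return img[:, 1:] - img[:, :-1]
    elif direction == "vertical":
        return img[1:, :] - img[:-1, :]
    else:  # diagonal
        return img[1:, 1:] - img[:-1, :-1]

# A vertical step edge shows up only in the horizontal difference
img = np.array([[1, 1, 5, 5],
                [1, 1, 5, 5]], dtype=float)
d = first_difference(img, "horizontal")
```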
The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way; hence images enhanced with this operation have a more natural look than many other edge-enhanced images.
Band Ratioing
Sometimes differences in the brightness values of identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
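The illumination-cancelling property of a band ratio is easy to see in a sketch (the band values and the `eps` guard are illustrative, not from any sensor):

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Ratio of two co-registered bands; eps guards against division by zero."""
    return band_a.astype(float) / (band_b.astype(float) + eps)

# Hypothetical NIR and red responses for two pixels of the same material
nir = np.array([[80.0, 120.0]])
red = np.array([[40.0, 30.0]])
ratio_full = band_ratio(nir, red)                 # fully sunlit
ratio_shadow = band_ratio(0.5 * nir, 0.5 * red)   # same surface at half illumination
```

Because topographic shading scales both bands by (roughly) the same factor, the factor cancels and the two ratio images are nearly identical.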
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods discussed in this thesis as used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves classification with an accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object, based on a specified threshold, hence the name bi-level thresholding. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.
Active Shape Models (ASMs), introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24] in that the same model is used for a different application. [18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained by the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.
The working mechanisms of the methods discussed above are explained in detail in the sections that follow.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables. It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: both the manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time and error are studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form an edge are generally of the same or nearly the same intensity. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector pointing across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
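A sketch of the operator from its definition, with the gradient magnitude and angle computed as in equations 3.1 and 3.2 (the loop-based implementation is for clarity, not speed):

```python
import numpy as np

def sobel(image):
    """Gradient magnitude and direction from the two 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)   # horizontal derivative Dx
    ky = kx.T                                   # vertical derivative Dy
    img = image.astype(float)
    h, w = img.shape
    dx = np.zeros((h - 2, w - 2))
    dy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            win = img[i:i + 3, j:j + 3]
            dx[i, j] = (win * kx).sum()
            dy[i, j] = (win * ky).sum()
    magnitude = np.hypot(dx, dy)               # |∇D| = sqrt(Dx^2 + Dy^2)
    direction = np.arctan2(dy, dx)             # gradient orientation
    return magnitude, direction

# Constant region -> zero vector; a vertical step edge -> gradient pointing
# across the edge from darker to brighter values (angle 0, along +x)
img = np.array([[0, 0, 10, 10]] * 4, dtype=float)
mag, ang = sobel(img)
```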
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel's row or column while calculating the directional derivative at that point [15][26]. This is why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude of the difference between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used as it is simple, easy to implement and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because at the beginning of the process the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). The gradient of the image is then calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels below a certain threshold are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then applied to the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
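The double-threshold linking step described above can be sketched on its own, operating on a tiny made-up gradient image (real Canny implementations fold this into the edge-tracing pass):

```python
import numpy as np

def hysteresis_threshold(grad, low, high):
    """Double thresholding with edge linking: weak pixels survive only when
    connected (8-neighbourhood) to a strong pixel."""
    strong = grad >= high
    weak = (grad >= low) & ~strong
    out = strong.copy()
    changed = True
    while changed:                      # grow strong edges into weak pixels
        changed = False
        for i, j in np.argwhere(weak & ~out):
            window = out[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
            if window.any():            # touches an accepted edge pixel
                out[i, j] = True
                changed = True
    return out.astype(np.uint8)

# The weak pixel (5) next to a strong pixel (9) is kept; the isolated weak
# pixel in the bottom-left corner is discarded
grad = np.array([[0, 5, 9],
                 [0, 0, 0],
                 [5, 0, 0]], dtype=float)
edges = hysteresis_threshold(grad, low=3, high=8)
```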
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of an image to analyze it, quantifying the visual or other simple characteristics of the image so that it can be analyzed according to them [23]. For example, visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
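Range filtering has a direct sketch: the local maximum minus the local minimum over a sliding window (the window size and the test images are illustrative):

```python
import numpy as np

def range_filter(image, size=3):
    """Texture measure: local range (max - min) over a size x size window."""
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    out = np.zeros(image.shape)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            win = padded[i:i + size, j:j + size]
            out[i, j] = win.max() - win.min()
    return out

# A textured region gives a large local range; a flat region gives zero
flat = np.full((4, 4), 7.0)
textured = np.array([[0, 9, 0, 9]] * 4, dtype=float)
```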
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD) ]
           [ Ixy(x, σD)  Iyy(x, σD) ]    (1)
where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g. the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys and cliffs can be detected over a range of viewpoints, scales and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)    (2)
or
P(x) = min(λ2(x), 0)    (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue until we have created a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale-space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5    (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1). Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
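The per-pixel principal curvature computation of Eq. 2 can be sketched with finite differences; the smoothing scale, kernel truncation and derivative scheme below are our simplifications, not the paper's exact incremental pipeline:

```python
import numpy as np

def principal_curvature(image, sigma=1.0):
    """Maximum eigenvalue of the Hessian at each pixel (Eq. 2), from central
    differences of a Gaussian-smoothed image."""
    # Separable Gaussian smoothing (kernel truncated at 3 sigma)
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    smooth = np.apply_along_axis(lambda m: np.convolve(m, g, mode="same"), 0, image)
    smooth = np.apply_along_axis(lambda m: np.convolve(m, g, mode="same"), 1, smooth)
    # Second-order partial derivatives Ixx, Iyy, Ixy
    Ixx = np.gradient(np.gradient(smooth, axis=1), axis=1)
    Iyy = np.gradient(np.gradient(smooth, axis=0), axis=0)
    Ixy = np.gradient(np.gradient(smooth, axis=0), axis=1)
    # Closed-form maximum eigenvalue of the symmetric 2x2 Hessian
    tr = Ixx + Iyy
    disc = np.sqrt(((Ixx - Iyy) / 2) ** 2 + Ixy**2)
    lam1 = tr / 2 + disc
    return np.maximum(lam1, 0.0)        # P(x) = max(lambda1(x), 0)

# A dark horizontal line on a light background produces a ridge of high
# response along the line, and (near-)zero response in the flat background
img = np.full((21, 21), 200.0)
img[10, :] = 50.0
P = principal_curvature(img)
```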
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image
segmentation It is normally applied either to an intensity image directly or to the gradient
magnitude of an image We instead apply the watershed transform to the principal curvature
image However the watershed transform is sensitive to noise (and other small perturbations)
in the intensity image A consequence of this is that the small image variations form local
minima that result in many small watershed regions Figure 3(a) shows the over
segmentation results when the watershed algorithm is applied directly to the principal
curvature image in Figure 2(b)) To achieve a more stable watershed segmentation we first
apply a grayscale morphological closing followed by hysteresis thresholding The grayscale
morphological closing operation is defined as f bull b = (f oplus b) ordf b where f is the image MP from
Eq 4 b is a 5 times 5 disk-shaped structuring element and oplus and ordf are the grayscale dilation and
erosion respectively The closing operation removes small ldquopotholesrdquo in the principal
curvature terrain thus eliminating many local minima that result from noise and that would
otherwise produce watershed catchment basins Beyond the small (in terms of area of
influence) local minima there are other variations that have larger zones of influence and that
are not reclaimed by the morphological closing To further eliminate spurious or unstable
watershed regions we threshold the principal curvature image to create a clean binarized
principal curvature image However rather than apply a straight threshold or even hysteresis
thresholdingndashboth of which can still miss weak image structuresndashwe apply a more robust
eigenvector-guided hysteresis thresholding to help link structural cues and remove
perturbations Since the eigenvalues of the Hessian matrix are directly related to the signal
strength (ie the line or edge contrast) the principal curvature image may at times become
weak due to low contrast portions of an edge or curvilinear structure These low contrast
segments may potentially cause gaps in the thresholded principal curvature image which in
turn cause watershed regions to merge that should otherwise be separate However the
directions of the eigenvectors provide a strong indication of where curvilinear structures
appear, and they are more robust to these intensity perturbations than is the eigenvalue
magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and
low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a
strong principal curvature response. Pixels with a strong response act as seeds that expand to
include connected pixels that are above the low threshold. Unlike traditional hysteresis
thresholding, our low threshold is a function of the support that each pixel's major
eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing
the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels'
major (or minor) eigenvectors. This can be done by taking the absolute value of the inner
product of a pixel's normalized eigenvector with that of each neighbor. If the average dot
product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a
low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a
low threshold of 0.028). The threshold values are based on visual inspection of detection
results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise
weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor
eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point
indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the
ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite
uniform. This eigenvector-based active thresholding process yields better performance in
building continuous ridges and in handling perturbations, which results in more stable regions
(Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image
(Fig. 2(c)). Since the image is binary, all black (0-valued) pixels become catchment basins,
and the midlines of the thresholded white ridge pixels become watershed lines if they separate
two distinct catchment basins. To define the interest regions of the PCBR detector at one
scale, the resulting segmented regions are fit, via PCA, with ellipses that have the same
second moment as the watershed regions (Fig. 2(e)).
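The eigenvector-flow hysteresis step can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation; the array layout, the helper name, and the 0.75 agreement cutoff (the text only says "high enough") are assumptions:

```python
import numpy as np
from scipy import ndimage

def eigenvector_flow_hysteresis(curvature, eigvecs, t_high=0.04):
    """Hysteresis threshold where each pixel's low threshold depends on how
    well its major eigenvector agrees with those of its 8 neighbors.
    curvature: (H, W) principal-curvature magnitudes.
    eigvecs:   (H, W, 2) unit major eigenvectors."""
    h, w = curvature.shape
    # Average |dot product| of each pixel's eigenvector with its 8 neighbors'.
    support = np.zeros((h, w))
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(eigvecs, dy, axis=0), dx, axis=1)
            support += np.abs((eigvecs * shifted).sum(axis=2))
    support /= 8.0
    # Strong eigenvector agreement -> permissive ratio 0.2, else 0.7.
    ratio = np.where(support > 0.75, 0.2, 0.7)  # 0.75 cutoff is an assumption
    t_low = t_high * ratio                      # per-pixel low threshold
    seeds = curvature >= t_high
    candidates = curvature >= t_low
    # Keep candidate components that contain at least one seed pixel.
    labels, _ = ndimage.label(candidates)
    keep = np.unique(labels[seeds])
    return np.isin(labels, keep) & (labels > 0)
```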
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve
stable region detections. To further improve robustness, we adopt a key idea from MSER and
keep only those regions that can be detected in at least three consecutive scales. Similar to the
process of selecting stable regions via thresholding in MSER, we select regions that are stable
across local scale changes. To achieve this, we compute the overlap error of the detected
regions across each triplet of consecutive scales in every octave. The overlap error is
calculated as in [19]. Overlapping regions that are detected at different scales
normally exhibit some variation. This variation is valuable for object recognition because it
provides multiple descriptions of the same pattern. An object category normally exhibits
large within-class variation in the same area. Since detectors have difficulty locating the
interest area accurately, rather than attempt to detect the "correct" region and extract a single
descriptor vector, it is better to extract multiple descriptors for several overlapping regions,
provided that these descriptors are handled properly by the classifier.
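The scale-selection rule can be sketched as follows (illustrative Python; the boolean-mask representation and the 0.3 error tolerance are assumptions, since [19] defines the overlap error on elliptical regions):

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """1 - intersection/union of two boolean region masks (cf. [19])."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union if union else 1.0

def stable_in_triplet(prev_mask, cur_mask, next_mask, max_err=0.3):
    """Keep a region at scale s only if it reappears (low overlap error)
    at both neighboring scales s-1 and s+1."""
    return (overlap_error(prev_mask, cur_mask) < max_err and
            overlap_error(cur_mask, next_mask) < max_err)
```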
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is,
where lower-layer strokes are visible. The task of recovering layers of strokes involves three
main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to
different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated whenever more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage
clustering to obtain chromatically consistent regions. Under the assumption that brush
strokes of the same layer of the painting have similar colors, the regions in different clusters
are good representatives of brush strokes at different layers in the painting, as shown in Fig.
3c. Note that after the clustering step each pixel of the image is assigned a label
corresponding to its cluster, and each label can be described by the mean chromatic
feature vector. The top layer is then identified by human experts based on visual occlusion
cues, etc. Ideally this step would be fully automatic, but it is challenging and is not the focus
of our current work. Lastly, the regions of the top layer are removed and inpainted by a
k-nearest-neighbor algorithm.
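The clustering step might be sketched as follows. This is an illustrative Python version: the choice of k and the deterministic farthest-point initialization are assumptions, and De-pict's complete-linkage refinement is omitted:

```python
import numpy as np

def cluster_layers(img, k=2, iters=20):
    """Assign each pixel a layer label via k-means on its chromatic feature
    (here simply the RGB vector). Returns (H, W) labels and (k, 3) centers."""
    h, w, _ = img.shape
    feats = img.reshape(-1, 3).astype(float)
    # Farthest-point initialization keeps this sketch deterministic.
    centers = np.empty((k, 3))
    centers[0] = feats[0]
    for j in range(1, k):
        d = ((feats[:, None, :] - centers[None, :j, :]) ** 2).sum(axis=2).min(axis=1)
        centers[j] = feats[d.argmax()]
    for _ in range(iters):
        # Assign each pixel to the nearest mean chromatic vector.
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Re-estimate each center as the mean of its assigned pixels.
        for j in range(k):
            if (labels == j).any():
                centers[j] = feats[labels == j].mean(axis=0)
    return labels.reshape(h, w), centers
```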
3.1 Spatially coherent segmentation
We improve the layer segmentation by combining k-means with a spatial coherence
regularizer in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of
different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other
words, we assume that each layer is modeled as an independent Gaussian with a shared
covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic
vectors, we can refine the segmentation under a spatial coherence prior by minimizing the
following energy function (E-step):
min_L  Σ_p ‖f_p − c_{L_p}‖₂²  +  λ Σ_{{p,q}∈N} |e_{pq}| · δ[L_p ≠ L_q]    (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the
color model for cluster i, |e_{pq}| is the edge length between p and q, and δ is the delta
(indicator) function. The first term in Eq. (1) measures the appearance similarity between the
pixels and the clusters they are assigned to, and the second term penalizes neighboring pixels
belonging to different clusters. By fixing the k appearance models, the minimization problem
can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial
regularization, the optimal labeling of pixels to clusters. After spatially coherent refinement,
we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E
and M steps until convergence or until a predefined number of iterations is reached.
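Evaluating the energy of Eq. (1) for a candidate labeling can be sketched as follows (illustrative Python on a 4-connected grid; the graph-cut minimization itself is delegated to a solver as in [12] and is not reproduced here, and the uniform edge length is an assumption):

```python
import numpy as np

def eq1_energy(labels, feats, centers, lam, edge_len=1.0):
    """Data term plus Potts smoothness term of Eq. (1).
    labels: (H, W) ints; feats: (H, W, C); centers: (k, C)."""
    # First term: squared distance of each pixel feature to its cluster center.
    data = ((feats - centers[labels]) ** 2).sum()
    # Second term: edge length times indicator of differing neighbor labels.
    smooth = edge_len * ((labels[:, 1:] != labels[:, :-1]).sum()
                         + (labels[1:, :] != labels[:-1, :]).sum())
    return data + lam * smooth
```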
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on
reconstructing the geometric structure of (chromatic) intensities, which is usually
represented by level lines [5, 7]. Here, level lines are contours that connect pixels of the
same gray/chromatic intensity in an image. Such methods are therefore well suited to
inpainting images with no or very little texture, because level lines concisely capture
the structure and information of texture-less regions. In van Gogh's paintings,
the brush strokes at each layer are close to textureless, so curvature-based inpainting
can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for
recovering the structures of underlying brush strokes. In this paper we evaluate the recent
method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a
linear program. Unlike other methods, this method is independent of initialization and can
handle general inpainting regions, e.g., regions with holes. In the following we briefly review
Schoenemann et al.'s method in detail. To formulate the problem as a linear program, this
approach models curvature in a discrete sense (a possible reconstruction of a
level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain
connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and
line segment pairs that are used to represent level lines, and the basic regions represent the
pixels. Then, for each potential discrete level line, the curvature is approximated by the sum
of angle changes at all vertices along the level line, with proper weighting by the edge length.
To ensure that regions and level lines are consistent (for instance, level lines should be
continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary
continuation constraints, are imposed on the variables. Finally, the boundary condition (the
intensities of the boundary pixels) of the damaged region can also be easily formulated as
linear constraints. With proper handling of all these constraints, the inpainting problem can be
solved as a linear program. To handle color images, we simply formulate and solve a linear
program for each chromatic channel independently.
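As a minimal stand-in (deliberately not Schoenemann et al.'s linear program), the channel-independent treatment can be illustrated with simple harmonic inpainting, iteratively averaging the 4-neighbors of masked pixels in each chromatic channel:

```python
import numpy as np

def inpaint_channel(channel, mask, iters=500):
    """Fill masked pixels by iteratively averaging their 4-neighbors
    (harmonic inpainting; a crude substitute for the LP formulation)."""
    out = channel.astype(float).copy()
    out[mask] = out[~mask].mean()  # crude initialization from known pixels
    for _ in range(iters):
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0)
               + np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[mask] = avg[mask]      # only masked (damaged) pixels are updated
    return out

def inpaint_rgb(img, mask, iters=500):
    # Handle each chromatic channel independently, as in the text.
    return np.stack([inpaint_channel(img[..., c], mask, iters)
                     for c in range(3)], axis=-1)
```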
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular
Surgery Training and Research Hospital. All pulmonary computed tomographic angiography
exams were performed with a 16-detector CT scanner (Somatom Sensation 16, Siemens AG,
Erlangen, Germany). Patients were informed about the examination and about breath
holding. Imaging was performed with a bolus-tracking program. After the scanogram, a
single slice was taken at the level of the pulmonary truncus. A bolus-tracking region was
placed at the pulmonary truncus and the trigger was adjusted to 100 HU (Hounsfield units).
70 ml of nonionic contrast agent at a rate of 4 mL/sec, delivered with an automated syringe
(Optistat Contrast Delivery System, Liebel-Flarsheim, USA), was used. When opacification
reached the pre-adjusted level, the exam was performed from the supraclavicular region to the
diaphragms. Contrast injection was performed via an 18-20G intravenous cannula placed in
the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm,
pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the
mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG,
Erlangen, Germany) in coronal, sagittal, and axial planes; oblique planes were used if needed.
Each exam consists of 400-500 images at 512x512 resolution.
B Method
The stages followed in this work for lung segmentation from CTA images are shown in
figure 1.
The data set at hand consists of 250 2D CTA images. The first step is thresholding the
image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in
the body (body pixels) and 2) low-intensity pixels in the lung and the surrounding
air (non-body pixels). Due to the large difference in intensity between these two groups,
thresholding leads to a good separation. In this study, thresholding is first applied so as to
retain the parts brighter than 700 HU; at the end of thresholding, the new images are logical
(binary):
Thresh = image > 700
In each of these new images, subsegmental vessels are still present in the lung region. In the
second step, these vessels are removed as follows: each 2D image is considered one by one,
and each component in the image is labeled with the "connected component labeling
algorithm". Then, looking at the size of each labeled piece, items whose pixel counts are
under 1000 are removed from the image (Figure 3).
Next, the image in Figure 3 is labeled with the "connected component labeling
algorithm". The biggest component of logical 1s is the patient's body. This biggest
component is kept and the other parts are removed from the image. The image is then
inverted, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach the 1st or 512th pixel
(the image border), the parts that satisfy this condition are removed, and the lung and
airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway
in Figure 5 is very small compared to the lung, each image is labeled with the "connected
component labeling algorithm" and the components whose pixel counts are below 1000 are
identified as airways and removed from the image. The image now in hand is the segmented
form of the target lung. Before the airways were removed, the edges of the image were found
with the Sobel algorithm and added to the original image, so that the edges of the lung and
airway region are shown on the original image (Figure 6 (b)). Also, by multiplying the
defined lung region with the original CTA lung image, the original segmented lung image is
obtained (Figure 6 (c)).
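The steps above can be sketched end to end. This is an illustrative Python version using SciPy; the function names are assumptions, and a real pipeline would run it over each of the 250 2D slices:

```python
import numpy as np
from scipy import ndimage

def remove_small(mask, min_size=1000):
    """Drop connected components with fewer than min_size pixels."""
    labels, _ = ndimage.label(mask)
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0                                   # never keep the background
    return np.isin(labels, np.flatnonzero(sizes >= min_size))

def segment_lungs(slice_hu, thresh=700, min_size=1000):
    body = slice_hu > thresh                       # step 1: threshold at 700 HU
    body = remove_small(body, min_size)            # step 2: drop small vessel pieces
    labels, _ = ndimage.label(body)                # step 3: keep biggest component
    sizes = np.bincount(labels.ravel()); sizes[0] = 0
    body = labels == sizes.argmax()
    inv = ~body                                    # ...then invert (0s <-> 1s)
    labels, _ = ndimage.label(inv)                 # step 4: remove parts touching
    border = np.unique(np.concatenate(             # the image border (outside air)
        [labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
    lungs = inv & ~np.isin(labels, border)
    return remove_small(lungs, min_size)           # step 5: drop small airways
```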
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image
processing functions and interactive tools for enhancing and analyzing digital images. The
interactive tools allowed us to perform spatial image transformations; morphological
operations such as edge detection and noise removal; region-of-interest processing; filtering;
basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics
objects semitransparent is a useful technique in 3-D visualization, as it furnishes more
information about the spatial relationships of different structures. The toolbox functions,
implemented in the open MATLAB language, have also been used to develop the customized
algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis
and mathematical computing functions such as signal processing, optimization, partial
differential equation solving, etc. It provides interactive tools including thresholding,
correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D
and 3D plotting functions. The image processing operations allowed us to perform noise
reduction and image enhancement, image transforms, colormap manipulation, colorspace
conversions, region-of-interest processing, and geometric operations [4]. The toolbox
functions implemented in the open MATLAB language can be used to develop customized
algorithms.
An X-ray computed tomography (CT) image is composed of pixels whose brightness
corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which
is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes
the pixel region rectangle over the image displayed in the Image Tool, defining the group of
pixels that are displayed in extreme close-up view in the Pixel Region tool window. The
Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its
numeric value [25]. For RGB images we find three numeric values, one for each band of the
image. We can also determine the current position of the pixel region in the target image by
using the pixel information given at the bottom of the tool; in this way we found the x- and
y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool
displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The Histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data plot. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b - Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).
Figure 4a - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0)
Figure 4b - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0.4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
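NumPy's meshgrid behaves the same way, which makes the row/column layout easy to check in a short sketch:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # rows of X are copies of x; columns of Y are copies of y
# X = [[1, 2, 3], [1, 2, 3]]
# Y = [[10, 10, 10], [20, 20, 20]]
Z = X**2 + Y               # evaluate f(x, y) = x^2 + y over the whole grid
# Z = [[11, 14, 19], [21, 24, 29]]
```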
Figure 4c - 3D Surface Plot of X-ray CT brain scan generated with histogram values, mesh
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting
The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'Image' with Colormap Scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
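The m-by-3 colormap layout can be illustrated by building a rough jet-like map in Python (a sketch only: the piecewise-linear ramps merely approximate MATLAB's actual jet):

```python
import numpy as np

def jet_like(m=256):
    """Build an m-by-3 colormap (rows are RGB values in [0, 1]) sweeping
    blue -> cyan -> yellow -> red, loosely like MATLAB's jet."""
    x = np.linspace(0.0, 1.0, m)
    r = np.clip(1.5 - np.abs(4 * x - 3), 0, 1)  # red ramps up toward the end
    g = np.clip(1.5 - np.abs(4 * x - 2), 0, 1)  # green peaks in the middle
    b = np.clip(1.5 - np.abs(4 * x - 1), 0, 1)  # blue dominates the start
    return np.stack([r, g, b], axis=1)
```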
A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole/Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions
and modifications have been made, the basic ASM model works the same way; Cootes and
Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes
and shape models in general. Section 4.2 describes the workings and the components of the
ASM. The parameters and variations that affect the performance of the ASM are explained in
Section 4.3, where the experiments performed in this thesis to improve the performance of the
model are also described. The problem of initializing the model in a test image is tackled in
Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error
function; the performance of the ASM on bone X-rays will be judged according to this error
function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a
diagram showing the points, or as an n × 2 array where the n rows represent the number of
points and the two columns represent the x and y coordinates of the points, respectively. In
this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y
coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic
building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The
lines connecting the points are not part of the shape; they are shown only to make the shape
and the order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives
the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance
between two shapes can be defined as the distance between their corresponding points [24].
There are other ways of defining the distance between two shapes, such as the Procrustes
distance, but in this thesis distance means the Euclidean distance:

d = sqrt((x2 - x1)² + (y2 - y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions
[24]. The centroid can be useful when aligning shapes or finding an automatic
initialization technique (discussed in 4.4). The size of the shape is the root mean square
distance between the points and the centroid. This can be used to measure the
size of the test image, which helps with automatic initialization (discussed in
4.4).
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the 1st shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean
shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and mean shape
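Algorithm 1 can be sketched as follows. This illustrative Python version aligns by translation and scale only, omitting the rotation step of a full Procrustes alignment, and the function name is an assumption:

```python
import numpy as np

def align_shapes(shapes, iters=100, tol=1e-8):
    """shapes: list of (n, 2) point arrays. Returns aligned shapes and mean."""
    shapes = [s - s.mean(axis=0) for s in shapes]        # 2. center on origin
    mean = shapes[0] / np.linalg.norm(shapes[0])         # 3. unit-size reference
    for _ in range(iters):                               # 4. repeat
        # (a) align each shape to the mean (least-squares scale factor)
        aligned = [s * (s.ravel() @ mean.ravel()) / (s.ravel() @ s.ravel())
                   for s in shapes]
        # (b) recalculate the mean shape, (c) constrain it to unit size
        new_mean = np.mean(aligned, axis=0)
        new_mean = new_mean / np.linalg.norm(new_mean)
        if np.linalg.norm(new_mean - mean) < tol:        # 5. until convergence
            break
        mean = new_mean
    return aligned, mean
```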
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone
was separated from a full-body X-ray (as shown in 1.2) and those images were then
resized to the same dimensions. This ensured uniformity in the quality of the data
being used. The training on the images was done by manually selecting landmarks.
Landmarks were placed at approximately equal intervals and were distributed uniformly
over the bone boundary. Such images are called hand-annotated or manually
landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training.
When performing tests with different numbers of landmark points, a subset of these
landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of
sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the
image around them. During training, the algorithm learns the characteristics of the
area around each landmark point and builds a profile model for that point
accordingly. When searching for the shape in the test image, the area near the
tentative landmarks is examined, and the model moves the shape to an area that fits
the profile model closely. The tentative locations of the landmarks are obtained from
the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which
introduces a constraint on the shape. So, as the profile model tries to find the
area in the test image that fits the model, the shape model ensures that
the mean shape is not changed. The profile model acts on individual landmarks,
whereas the shape model acts globally on the image. The two models thus correct
each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual
profiles into an allowable shape. It tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant.
The shape is learnt from manually landmarked training images. These images are
aligned, and a mean shape with its permissible variations is formulated [24]:
x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e., the average of the aligned training shapes x_i,
Φ is the matrix whose columns are the principal eigenvectors of the covariance of the
training shapes, and b is the vector of shape parameters.
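A minimal sketch of building such a model and generating shapes from it (illustrative Python; the data layout and the number of retained modes are assumptions):

```python
import numpy as np

def build_shape_model(shapes):
    """PCA shape model from aligned training shapes.
    shapes: (m, 2n) array, one row per shape (x coordinates, then y)."""
    X = np.asarray(shapes, dtype=float)
    x_mean = X.mean(axis=0)                      # mean shape x-bar
    cov = np.cov(X - x_mean, rowvar=False)       # covariance of training shapes
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]               # largest variation modes first
    return x_mean, vecs[:, order], vals[order]

def generate_shape(x_mean, Phi, b):
    """Eq. (4.3): x-hat = x-bar + Phi b, using the first len(b) modes."""
    return x_mean + Phi[:, :len(b)] @ b
```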
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of
b. The model is varied in height and width to find optimum values for the landmarks.
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image. The lines that are perpendicular to the model are called whiskers, and they help the
profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the
whisker profiles around the landmark points are used for the profile model. A profile
and a covariance matrix are built for each landmark. It is assumed that the profiles
are distributed as a multivariate Gaussian, so they can be described by their
mean profile ḡ and covariance matrix S_g.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape
calculated from the training images is imposed on the image, and the profiles around
the landmark points are searched and examined. The profiles are offset 3 pixels
along the whisker, which is perpendicular to the shape, to find the area
that most closely resembles the mean profile [24]. The distance between a test profile g
and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)
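The per-landmark search can be sketched as follows (illustrative Python; sampling the candidate profiles along the whisker is assumed to happen elsewhere, and the function names are assumptions):

```python
import numpy as np

def mahalanobis(g, g_mean, S_inv):
    """f(g) = (g - g_mean)^T S_g^{-1} (g - g_mean)."""
    d = g - g_mean
    return float(d @ S_inv @ d)

def best_offset(profiles, g_mean, S_inv):
    """Among profiles sampled at offsets along the whisker, pick the one
    with the lowest Mahalanobis distance to the mean profile."""
    return int(np.argmin([mahalanobis(g, g_mean, S_inv) for g in profiles]))
```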
If the model is initialized correctly (discussed in 4.4), one of the profiles will have the
lowest distance. This procedure is done for every landmark point, and then the shape
model confirms that the shape is consistent with the mean shape. The shape model
ensures that the profile model has not changed the shape; if the shape model were
not employed, the profile model might give the best profile results but the resulting
shape could be completely different. So, as mentioned before, the two models restrict
each other. A multi-resolution search is done to make the model more robust. This
enables the model to be more accurate, as it can lock on to the shape from further
away. The model searches over a series of different resolutions of the same image,
called an image pyramid. The resolutions of the images can be set and changed
in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of
the images are given relative to the first image, and a general picture, not a bone
X-ray, is shown.
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters
that it depends on. The number of landmark points and the number of training images are
investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The
profile model of the ASM works with these landmark points to create profiles, so
the position of the landmark points is as important as their number.
In the training images, landmark points are equally spaced along the boundary of
the bone. Images are landmarked with 60 points, and subsets of these points are
chosen to conduct experiments. The impact of the number of landmark points on computing
time and the mean error (defined in Section 4.5) is tested by running the algorithm with
different numbers of landmarks. As the number of landmark points is increased, it is expected
that the computing time increases and the error decreases. The results are explained in
Chapter 5. A training set of images is used to train the ASM. As the number of training images
increases, the model becomes more robust and intelligent. The computing time is expected to
increase, as it takes time to train and create profile models for each image. However, as the
number of training images increases, the mean profile improves and the model performs
better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are
used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes
learnt from the training images, and the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test
image. It creates a mean shape profile from all the training images using landmark points.
However, the ASM starts off where the mean shape is located, which may not be near the
bone in a test image. So the model needs to be initialized, or started, somewhere close to the
bone boundary in the test image. Experiments were conducted to see the effect of
initialization on the error and the tracking of the shape. It was observed that if the
initialization is poor, meaning that the mean shape starts away from the bone in the test
X-ray, the model does not lock on to the bone. The shape and profile models fail to perform,
since the profile model looks for regions similar to those of the training images in regions
away from the bone; it is unable to find the bone because it is looking in a different region
altogether. The error increases considerably if the mean shape is 40-50 pixels away from the
bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean
shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages
774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection.
IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification
learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf.,
pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object
categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105,
2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning,
59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116,
1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure. Image and Vision Computing,
pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV,
60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of
features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from
maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic
b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV,
1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors.
IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky,
T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale
space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on
Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for
generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence.
CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: A new approach to low level image processing.
IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely
invariant regions. BMVC, pages 412–425, 2000.
make them easier A detailed analysis of an X-ray can help a radiologist to decide whether a
bone is fractured or not Digital image processing can increase the credibility of the decisions
made by humans
1.2 Introduction to Medical Imaging
Image processing techniques have developed and are applied to various fields like space programs, aerial and satellite imagery, and medicine [15]. Medical imaging is the set of digital image processing techniques that create and analyze images of the human body to assist doctors and medical scientists. In medicine, imaging is used for planning surgeries, X-ray imaging for bones, magnetic resonance imaging, endoscopy, and many other useful applications [31]. Digital X-ray imaging is used in this thesis project. Figure 1.1 shows the applications of digital imaging in medical imaging. Since Wilhelm Roentgen discovered X-rays in 1895 [14], X-ray technology has improved considerably. In medicine, X-rays help doctors to see inside a patient's body without surgery or any physical damage. X-rays can pass through solid objects without altering the physical state of the object because they have a small wavelength. So when this radiation is passed through a patient's body, objects of different density cast shadows of different intensities, resulting in black-and-white images. Bone, for example, will be shown in white as it is opaque, and air will be shown in black; the other tissues in the body will be in gray. A detailed analysis of the bone structure can be performed using X-rays, and any fractures can be detected. Conventionally, X-rays were taken using special photographic films using silver salts [28]. Digital X-rays can be taken using crystal photodiodes. Crystal photodiodes contain cadmium tungstate or bismuth germanate to capture light as electrical pulses. The signals are then converted from analogue to digital and can be viewed on computers.
Digital X-rays are very advantageous as they are portable, require less energy than conventional X-rays, are less expensive, and are environmentally friendly [28]. A radiologist would look at the X-rays and determine whether a bone is fractured. This process is time consuming and unreliable because the probability of a fractured bone is low. Some fractures are easy to detect, and a system can be developed to automatically detect fractures. This will assist doctors and radiologists in their work and will improve the accuracy of the results [28]. According to the observations of [27], only 11% of the femur X-rays showed fractured bones, so the radiologist has to look at a lot of X-rays to find a fractured one. An algorithm to automatically detect bone fractures could help the radiologist to find the fractured bones, or at least confidently sort out the healthy ones. But no single algorithm can be used for the whole body because of the complexity of different bone structures. Even though a lot of research has been done in this field, there is no system that completely solves the problem [14]. This is because there are several complicated parts to the problem of fracture detection. Digital X-rays are very detailed and complicated to interpret. Bones have different sizes and can differ in characteristics from person to person, so finding a general method to locate the bone and decide whether it is fractured is a complex problem. The main aspects of the problem of automatic bone fracture detection are bone orientation in the X-ray, extraction of bone contour information, bone segmentation, and extraction of relevant features.
1.3 Description of the Problem
This thesis investigates different ways of separating a bone from an X-ray. Methods like edge detection and Active Shape Models are experimented with. The aim of this thesis is to find an efficient and reasonably fast way of separating the bone from the rest of the X-ray. The bone used for the analysis is the tibia. The tibia, also known as the shinbone or shankbone, is the larger and stronger of the two bones in the leg below the knee in vertebrates, and connects the knee with the ankle bones. Details of the X-ray data used are provided in the next section.
2.1 Theory Development
A typical digital image processing system consists of image segmentation, feature extraction, pattern recognition, thresholding, and error classification. Image processing aims at extracting the necessary information from the image. The image needs to be reduced to certain defining characteristics, and the analysis of these characteristics gives the relevant information. Figure 2.1 shows a process flow diagram of a typical digital image processing system, showing the sequence of the operations. Image segmentation is the main focus of this thesis; the other processes are briefly described for completeness and to inform the reader of the processes in the whole system.
2.1.1 Image Segmentation
Image segmentation is the process of extracting the regions of interest from an image. There are many operations to segment images, and their usage depends on the nature of the region to be extracted. For example, if an image has strong edges, edge detection techniques can be used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image: it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contour of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.
2 Segmentation of Images - An Overview
Image segmentation can proceed in three different ways:
• Manually
• Automatically
• Semiautomatically
2.1 Manual Segmentation
The pixels belonging to the same intensity range could manually be pointed out, but clearly this is a very time consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. The manual techniques all have in common the amount of time spent in tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.
2.2 Automatic Segmentation
Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could, for instance, be the noise level, or the probability of the objects having a special distribution.
2.3 Semiautomatic Segmentation
Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.
• Thresholding
If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a as follows:
if B(i, j) ≥ a then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all (i, j) over the image B
[YGV]. This can be repeated for each region, dividing them by the threshold value, which results in four regions, and so on. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but regionally do not belong to the selected region; these pixels could, for instance, appear from noise. The simplest way of choosing the threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.
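As a concrete sketch of the fixed-value case, the following snippet (NumPy assumed; the function name is our own, not from [YGV]) binarises an image at a chosen threshold, using the image mean as the simplest choice:

```python
import numpy as np

def threshold(image, a):
    """Binarise an image: 1 (object) where intensity >= a, else 0 (background)."""
    return (image >= a).astype(np.uint8)

# The simplest fixed threshold: the mean intensity of the image.
img = np.array([[10, 200], [30, 220]], dtype=float)
mask = threshold(img, img.mean())
```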
The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image in each of the two segmented regions and calculate a new threshold value from
threshold_new = (mean_region1 + mean_region2) / 2
and repeat until the threshold value does not change any more. Finally, choose this value for the threshold segmentation.
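The iteration above can be sketched in a few lines of NumPy (a hedged illustration: the variable names and the stopping tolerance are our own choices, and the sketch assumes both regions stay non-empty throughout the iteration):

```python
import numpy as np

def isodata_threshold(image, t0=None, tol=0.5):
    """Isodata: split at t, replace t with the mean of the two region means,
    and repeat until the threshold stops changing (within tol)."""
    t = image.mean() if t0 is None else t0
    while True:
        low, high = image[image < t], image[image >= t]
        t_new = 0.5 * (low.mean() + high.mean())  # mean of the two region means
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
```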
To implement the triangle algorithm, construct a histogram of intensities vs. number of pixels as in Figure 2.1. Draw a line between the maximum value of the histogram, hmax, and the minimum value, hmin, and calculate the distance d between the line and the histogram. Increase hmin and repeat for all h until h = hmax. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek make a weak peak.
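A minimal sketch of the triangle method follows, assuming the weak tail lies to the right of the histogram peak (the function name and the vertical-distance simplification are our own; the classical formulation uses perpendicular distance, which selects the same bin here):

```python
import numpy as np

def triangle_threshold(hist):
    """Triangle method: maximise the distance between the histogram and the
    line joining its peak to its far (lowest-count) end."""
    peak = int(np.argmax(hist))       # hmax: the histogram peak
    end = len(hist) - 1               # assumes the weak tail is right of the peak
    x = np.arange(peak, end + 1)
    # Line through (peak, hist[peak]) and (end, hist[end]).
    line = hist[peak] + (hist[end] - hist[peak]) * (x - peak) / (end - peak)
    d = line - hist[peak:end + 1]     # distance from histogram to the line
    return peak + int(np.argmax(d))   # bin where the distance is maximal
```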
• Boundary tracking
Edge-finding by gradients is the method of selecting a boundary manually and automatically following this gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region and will meet problems if the gradient specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels compared to the threshold method alone.
The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem where ∆ = ∂²/∂x², and assume the boundary is blurred so that the gradient has a shape like in Figure 2.2. The Laplacian will change sign just around the assumed edge, at position = 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
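A 1D sketch of this procedure, under our own simplifying choices (Gaussian smoothing, discrete second difference for the Laplacian), might look like:

```python
import numpy as np

def zero_crossings_1d(signal, sigma=2.0):
    """Smooth, take the discrete second difference (1D Laplacian), and report
    positions where it changes sign -- candidate edge locations."""
    # Gaussian smoothing suppresses the large second derivatives that noise
    # would otherwise produce around zero crossings.
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    smoothed = np.convolve(signal, g, mode="same")
    lap = np.diff(smoothed, n=2)  # discrete d^2/dx^2
    return np.where(np.sign(lap[:-1]) * np.sign(lap[1:]) < 0)[0] + 1
```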
• Clustering methods
Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector, but
2.1.2 Feature Extraction
Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient, and edges. Generally, a combination of features is used to generate a model for the images. Cross validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.
To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD) are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.
2.1.3 Classifiers and Pattern Recognition
After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of good femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM [22] and found the feature values for GO, MRF, and IGD that gave the best performance overall. Clustering and nearest neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long bone X-ray may point in a certain direction that is very different to the gradient vector of a fractured long bone X-ray, so by observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.
2.1.4 Thresholding and Error Classification
Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or lesser than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
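Otsu's method can be sketched directly from its defining idea, choosing the threshold that maximises the between-class variance of the two resulting pixel classes (a naive illustration in our own notation, not the formulation from [21]):

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Pick the threshold that maximises between-class variance."""
    hist, edges = np.histogram(image, bins=nbins)
    mids = 0.5 * (edges[:-1] + edges[1:])
    w = hist / hist.sum()                 # normalised histogram
    best_t, best_var = mids[0], -1.0
    for k in range(1, nbins):
        w0, w1 = w[:k].sum(), w[k:].sum() # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (w[:k] * mids[:k]).sum() / w0
        mu1 = (w[k:] * mids[k:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, mids[k]
    return best_t
```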
Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in an image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, which has the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.
(a) The original image (b) The histogram of the image
Figure 2.3: Histogram of image [23]
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are most useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality. Contrast stretch, density slicing, edge enhancement, and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes.
Contrast
Contrast generally refers to the difference in luminance or grey level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extremely black, and a value at the high end is assigned to extremely white. The remaining pixel values are distributed linearly between these extremes. The features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
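The linear mapping described above, minimum to black and maximum to white with the rest spread linearly, can be sketched as follows (an illustrative helper of our own; it assumes the image is not constant, i.e. max > min):

```python
import numpy as np

def linear_stretch(image, out_min=0, out_max=255):
    """Map the image minimum to out_min and maximum to out_max, distributing
    the remaining grey values linearly between the two extremes."""
    lo, hi = image.min(), image.max()
    scaled = (image - lo) / (hi - lo) * (out_max - out_min) + out_min
    return scaled.astype(np.uint8)
```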
Non-Linear Contrast Enhancement
In these methods, the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. The non-linear contrast enhancement techniques have been found to be useful for enhancing the colour contrast between nearby classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values; thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
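Histogram equalization is usually implemented by remapping grey levels through the normalised cumulative histogram; a minimal sketch for 8-bit data (the function name and lookup-table approach are our own choices):

```python
import numpy as np

def equalize(image, levels=256):
    """Histogram equalisation: map each grey level through the normalised
    cumulative histogram so the output population is roughly uniform."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    cdf = hist.cumsum() / hist.sum()
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)  # lookup table
    return lut[image.astype(np.intp)]
```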
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low pass filters, band pass filters, and high pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3×3, 5×5, 7×7, or 9×9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3×3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low pass filters are mean, median, and mode filters.
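The mean filter with border handling by replicated edge pixels (remedy (1) above) can be sketched as follows; the loop-over-offsets formulation and the function name are our own simplifications:

```python
import numpy as np

def mean_filter(image, n=3):
    """Low-pass: replace each pixel by the mean of its n-by-n neighbourhood.
    The border is handled by replicating edge pixels ('edge' padding), so the
    output keeps the input's size rather than shrinking by n-1 rows/columns."""
    pad = n // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros_like(image, dtype=float)
    for dy in range(n):
        for dx in range(n):
            # Accumulate each shifted copy of the neighbourhood.
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (n * n)
```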
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or nonlinear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.
The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way, hence images produced by this operation have a more natural look than many of the other edge-enhanced images.
Band Ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems; this section explains the methods discussed in this thesis as used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour/Snake Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges in femurs to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or a part of the object, hence the name bi-level thresholding, based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with. The relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application. [18] and [1] analyzed the performance of ASMs using the aspects of the definition of the shape and the gray level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.
The working mechanisms of the methods discussed above are explained in detail in
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The trade-off between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time, and error are studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels or sets of pixels that form the edge are generally of the same, or close to the same, intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; the gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks or kernels, one for the horizontal direction and the other for the vertical direction in an image, that approximate the derivatives in the horizontal and vertical directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
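As an illustration of these mechanics, the standard 3×3 Sobel kernels and the magnitude/direction computation can be sketched as below (a naive, unoptimised sketch of our own, using a 'valid' convolution so the output shrinks by the kernel size; the helper names are not from [26]):

```python
import numpy as np

# The two 3x3 Sobel kernels approximating the horizontal and vertical derivatives.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = KX.T

def convolve2d(image, kernel):
    """Minimal 'valid' 2D convolution (kernel flipped, as convolution requires)."""
    k = np.flipud(np.fliplr(kernel))
    h, w = k.shape
    H, W = image.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + h, j:j + w] * k).sum()
    return out

def sobel(image):
    """Return gradient magnitude and direction at each interior pixel."""
    dx = convolve2d(image, KX)
    dy = convolve2d(image, KY)
    return np.hypot(dx, dy), np.arctan2(dy, dx)
```

On a region of constant intensity the magnitude is zero, and on a ramp it is constant, as the text describes.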
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the pixels nearest the current pixel while calculating the directional derivative at that point [15][26]; this is the reason why Sobel has a weight of 2 in the middle of its kernel columns where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the square root of the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the gradient magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but it is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with two 2×2 kernels.
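A minimal illustration of the Roberts cross (using the standard 2×2 kernels; this code is not from the thesis):

```python
import numpy as np
from scipy.ndimage import convolve

def roberts_cross(image):
    """Differences between diagonally adjacent pixels via the two 2x2
    Roberts kernels, combined as the square root of the sum of squares."""
    k1 = np.array([[1.0, 0.0], [0.0, -1.0]])
    k2 = np.array([[0.0, 1.0], [-1.0, 0.0]])
    g1 = convolve(image.astype(float), k1)
    g2 = convolve(image.astype(float), k2)
    return np.hypot(g1, g2)
```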
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique because it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). Then the gradient of the image is calculated, as in other detectors such as Sobel and Prewitt. Non-maximal suppression is applied after the gradient step, so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the two-level example in Section 2.4, is then applied to the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-valued pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
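The two-level thresholding step described above can be sketched as follows (an illustrative implementation, not the thesis code; `scipy.ndimage.label` with an all-ones structuring element gives the 8-connectivity used for the adjacency test):

```python
import numpy as np
from scipy.ndimage import label

def hysteresis_threshold(gradient, low, high):
    """Pixels above `high` are edges; pixels between `low` and `high`
    survive only if they are 8-connected to a strong pixel; pixels
    below `low` are set to 0."""
    strong = gradient > high
    candidates = gradient > low
    # Label 8-connected components of all candidate pixels.
    structure = np.ones((3, 3), dtype=int)
    labels, n = label(candidates, structure=structure)
    # Keep only components that contain at least one strong pixel.
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False
    return keep[labels]
```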
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of the image to analyze it: it quantifies the visual or other simple characteristics of the image so that the image can be analyzed according to them [23]. For example, visible properties of an image such as roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image
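Both filters can be sketched with SciPy (illustrative only; the 3×3 window size is an assumption, not stated in the text):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter, generic_filter

def range_filter(image, size=3):
    """Local range: difference between the maximum and minimum value
    in each size x size neighborhood."""
    return maximum_filter(image, size) - minimum_filter(image, size)

def std_filter(image, size=3):
    """Local standard deviation in each size x size neighborhood."""
    return generic_filter(image.astype(float), np.std, size=size)
```

Both outputs are near zero in smooth regions and large in textured regions, which is what makes them usable as texture cues for separating bone from mesh.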
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface.
The local shape characteristics of the surface at a particular point can be described by the
Hessian matrix
H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the
local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our
PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points," our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions." As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints,
scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)    (2)

or

P(x) = min(λ2(x), 0)    (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13]
and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5    (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images, we find the stable regions via our watershed algorithm.
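The computation in Eqs. 1-4 can be sketched as follows (a hedged NumPy/SciPy illustration, not the authors' implementation; treating the base image as having scale σ = 1 is an approximation of the schedule above):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvature(image, sigma):
    """P(x) = max(lambda1(x), 0) (Eq. 2): the maximum eigenvalue of the
    Hessian (Eq. 1), built from Gaussian derivatives at scale sigma."""
    I = image.astype(float)
    Ixx = gaussian_filter(I, sigma, order=(0, 2))  # d2/dx2 (axis 1)
    Iyy = gaussian_filter(I, sigma, order=(2, 0))  # d2/dy2 (axis 0)
    Ixy = gaussian_filter(I, sigma, order=(1, 1))
    # Closed-form eigenvalues of the symmetric 2x2 Hessian per pixel.
    half_tr = (Ixx + Iyy) / 2.0
    disc = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    return np.maximum(half_tr + disc, 0.0)   # dark lines on light background

def max_curvature_images(image):
    """One octave: P at scales sigma = k**(j-1) with k = 2**(1/3),
    then MP_j = max over each triplet of consecutive scales (Eq. 4)."""
    k = 2.0 ** (1.0 / 3.0)
    P = [principal_curvature(image, k ** (j - 1)) for j in range(1, 7)]
    return [np.maximum(np.maximum(P[j - 1], P[j]), P[j + 1])
            for j in range(1, 5)]
```

A dark line on a light background produces a positive response along the line and a near-zero response elsewhere, which is the ridge structure the watershed step then segments.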
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; as a consequence, small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that
are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue
magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This is done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection
results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
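The eigenvector-flow low-threshold rule can be sketched as below (illustrative; `min_agreement`, the cutoff on the average absolute dot product, is our own assumed value, since the text only says "high enough"):

```python
import numpy as np

def eigenvector_flow_low_threshold(vx, vy, high=0.04,
                                   agree_ratio=0.2, disagree_ratio=0.7,
                                   min_agreement=0.9):
    """Per-pixel low threshold for hysteresis: (vx, vy) are the
    normalized major eigenvectors; the mean |dot product| with the 8
    neighbours selects ratio 0.2 (low = 0.008) when the flow is
    uniform, otherwise 0.7 (low = 0.028)."""
    h, w = vx.shape
    pad = lambda a: np.pad(a, 1, mode="edge")
    px, py = pad(vx), pad(vy)
    dots = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            nx = px[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            ny = py[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            dots.append(np.abs(vx * nx + vy * ny))  # |inner product|
    agreement = np.mean(dots, axis=0)
    ratio = np.where(agreement > min_agreement, agree_ratio, disagree_ratio)
    return high * ratio
```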
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
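A sketch of the stability test on rasterized region masks (illustrative only; the paper computes the overlap error on fitted ellipses as in [19], and the 0.3 error bound here is an assumed value):

```python
import numpy as np

def stable_regions(masks_per_scale, max_overlap_error=0.3):
    """Keep regions (boolean masks) that reappear, with overlap error
    1 - |A intersect B| / |A union B| below max_overlap_error, at three
    consecutive scales."""
    def overlap_error(a, b):
        inter = np.logical_and(a, b).sum()
        union = np.logical_or(a, b).sum()
        return 1.0 - inter / union if union else 1.0

    stable = []
    for s in range(1, len(masks_per_scale) - 1):
        for m in masks_per_scale[s]:
            prev_ok = any(overlap_error(m, p) < max_overlap_error
                          for p in masks_per_scale[s - 1])
            next_ok = any(overlap_error(m, n) < max_overlap_error
                          for n in masks_per_scale[s + 1])
            if prev_ok and next_ok:
                stable.append(m)
    return stable
```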
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by its mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
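The clustering step can be illustrated with a minimal k-means over chromatic feature vectors (a sketch, not the De-pict implementation; the complete-linkage refinement and the inpainting step are omitted):

```python
import numpy as np

def kmeans_colors(pixels, k, iters=20, seed=0):
    """Minimal k-means over an (n_pixels x 3) array of chromatic
    features: each pixel is assigned the label of its nearest center,
    and each center is re-estimated as the mean of its pixels."""
    pixels = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each pixel to its nearest chromatic center.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Re-estimate each center as the mean of its assigned pixels.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers
```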
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means); in other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
min_L Σ_p ||f_p − c_{L_p}||²_2 + λ Σ_{{p,q}∈N} |e_pq| · δ[L_p ≠ L_q]    (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ[·] is the delta function.
The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
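The objective of Eq. (1) can be evaluated directly; the sketch below scores a labeling on a 4-connected grid (illustrative only; the graph-cut solver itself is not shown, and unit edge lengths |e_pq| = 1 are assumed):

```python
import numpy as np

def segmentation_energy(labels, features, centers, lam=1.0):
    """Evaluate Eq. (1) for a labeling on a 4-connected pixel grid:
    the appearance term ||f_p - c_{L_p}||^2 summed over pixels, plus
    lam times the number of neighbouring pairs with different labels."""
    data = np.sum((features - centers[labels]) ** 2)
    cuts = np.count_nonzero(labels[:, :-1] != labels[:, 1:]) \
         + np.count_nonzero(labels[:-1, :] != labels[1:, :])
    return data + lam * cuts
```

A solver (graph cut, or any approximate minimizer) would search over `labels` to reduce this energy while the centers are held fixed, which is exactly the E-step described above.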
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless; therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, it is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus tracker is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) with an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes; oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.
B Method
The stages followed while doing lung segmentation from CTA images in this work are shown in Figure 1. The available CTA data consist of 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels) and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so that parts with values greater than 700 HU are retained. At the end of thresholding, the new images are of logical (binary) values:
Thresh = image > 700
In each of these new images, subsegment vessels exist in the lung region. In the second step, to get rid of these vessels, each of the 2D images is considered one by one and each component in the image is labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).
Next, the image in Figure 3 is labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept and the other parts are removed from the image. Then the complement is taken, so all 0s turn into 1s and all 1s turn into 0s (Figure 4). Since the parts outside the body in the image shown in Figure 4 touch the image border (pixel columns 1 or 512), the parts that satisfy this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected component labeling algorithm and the components whose pixel counts are below 1000 are determined to be airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image found with the Sobel algorithm were added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
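The thresholding and connected-component steps described above can be sketched in Python (an illustrative translation of the pipeline, not the original implementation; border handling stands in for the "column 1 or 512" test):

```python
import numpy as np
from scipy.ndimage import label

def remove_small_components(mask, min_pixels=1000):
    """Connected component labeling, then drop components smaller than
    min_pixels (the vessel/airway removal rule described above)."""
    labels, _ = label(mask)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_pixels
    keep[0] = False                      # background is never kept
    return keep[labels]

def segment_lungs(slice_hu, threshold=700):
    """Threshold at `threshold` HU, keep the largest component (the
    body), invert it, and drop inverted components touching the image
    border (air around the patient); what remains is lung plus airway."""
    candidates = slice_hu > threshold
    labels, n = label(candidates)
    if n == 0:
        return np.zeros_like(candidates, dtype=bool)
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0
    body = labels == sizes.argmax()      # largest component = body
    inverted = ~body
    lab2, _ = label(inverted)
    border_labels = np.unique(np.concatenate(
        [lab2[0, :], lab2[-1, :], lab2[:, 0], lab2[:, -1]]))
    return inverted & ~np.isin(lab2, border_labels)
```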
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).
Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve is plotted as a magenta line through the data. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot generates a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values, alpha(0.4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
3-D Surface Plot with Contour (the surfc function) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The "image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 - (a) Impulse Response, (b) Pole/Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 - Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).
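The centroid and size definitions can be written directly for the 2n × 1 shape vector used in this thesis (an illustrative sketch):

```python
import numpy as np

def centroid_and_size(shape_vec):
    """Shape stored as a 2n x 1 vector (all x coordinates, then all y).
    The centroid is the mean point; the size is the root mean square
    distance of the points from the centroid."""
    n = len(shape_vec) // 2
    xs, ys = shape_vec[:n], shape_vec[n:]
    cx, cy = xs.mean(), ys.mean()
    size = np.sqrt(np.mean((xs - cx) ** 2 + (ys - cy) ** 2))
    return (cx, cy), size
```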
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: set of aligned shapes and the mean shape
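Algorithm 1 can be sketched as follows. This is a simplified illustration that handles translation and scale only (the rotation step of a full Procrustes alignment is omitted for brevity); all names are ours, not from the thesis code:

```python
import numpy as np

def normalize(shape):
    """Translate a shape to the origin and scale it to unit size."""
    pts = np.asarray(shape, dtype=float)
    pts = pts - pts.mean(axis=0)              # step 2: centre on the origin
    return pts / np.linalg.norm(pts)          # step 3: scale so ||shape|| = 1

def align_shapes(shapes, iterations=10, tol=1e-8):
    """Simplified sketch of Algorithm 1 (translation and scale only)."""
    aligned = [normalize(s) for s in shapes]  # steps 2-3 for every shape
    mean = normalize(shapes[0])               # step 1: reference shape
    for _ in range(iterations):               # step 4: repeat
        # (a) align all shapes to the mean (here just re-normalization)
        aligned = [normalize(s) for s in aligned]
        # (b) recalculate the mean shape, (c) constrain it to unit size
        new_mean = normalize(np.mean(aligned, axis=0))
        if np.linalg.norm(new_mean - mean) < tol:   # step 5: convergence
            break
        mean = new_mean
    return aligned, mean
```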
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and then those images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. While performing tests using different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So while the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant.
The shape is learnt from the manually landmarked training images. These images are aligned and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb   (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix whose columns are the eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters that controls the modes of variation.
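The shape model described here can be sketched with a small PCA routine. This is an illustrative implementation, not the thesis code; it assumes each training shape is stored as a single 2n-vector:

```python
import numpy as np

def build_shape_model(shapes):
    """PCA shape model from aligned training shapes.
    `shapes` is an (m, 2n) array: one flattened shape vector per row."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)                         # x_bar: the mean shape
    cov = np.cov(X - mean, rowvar=False)          # covariance of the shapes
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # put the largest modes first
    return mean, eigvecs[:, order], eigvals[order]

def generate_shape(mean, phi, b):
    """x_hat = x_bar + Phi b: generate a shape from parameters b."""
    b = np.asarray(b, dtype=float)
    return mean + phi[:, :len(b)] @ b
```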
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix Sg.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset by up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

D(g) = (g - ḡ)^T Sg^-1 (g - ḡ)
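A minimal sketch of this profile-fit measure (this computes the squared Mahalanobis distance, which is what is typically minimised in ASM search; names are illustrative):

```python
import numpy as np

def mahalanobis(g, g_mean, S_g):
    """Squared Mahalanobis distance between a sampled profile g and the
    mean profile g_mean, under profile covariance S_g."""
    d = np.asarray(g, dtype=float) - np.asarray(g_mean, dtype=float)
    return float(d @ np.linalg.inv(S_g) @ d)
```

With an identity covariance matrix this reduces to the squared Euclidean distance, which makes the formula easy to sanity-check.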
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is used for illustration.
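An image pyramid of the kind described can be sketched by repeated 2x2 block averaging. Real implementations usually smooth before downsampling; this is only an illustration, and the function name is ours:

```python
import numpy as np

def image_pyramid(image, levels=3):
    """Build a simple image pyramid: each level is half the resolution of
    the previous one, obtained by averaging 2x2 blocks of pixels."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        img = img[:h, :w]                     # crop to even dimensions
        # average each 2x2 block -> half-resolution image
        pyramid.append(img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyramid
```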
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it will take time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started somewhere close to the bone boundary, in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone. So it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
has been done in this field, there is no system that completely solves the problem [14]. This is because there are several complicated parts to this problem of fracture detection. Digital X-rays are very detailed and complicated to interpret. Bones have different sizes and can differ in characteristics from person to person. So finding a general method to locate the bone and decide whether it is fractured or not is a complex problem. Some of the main aspects of the problem of automatic bone fracture detection are: bone orientation in the X-ray, extraction of bone contour information, bone segmentation, and extraction of relevant features.
1.3 Description of the Problem
This thesis investigates different ways of separating a bone from an X-ray. Methods like edge detection and Active Shape Models are experimented with. The aim of this thesis is to find an efficient and reasonably fast way of separating the bone from the rest of the X-ray. The bone that was used for the analysis is the tibia. The tibia, also known as the shinbone or shankbone, is the larger and stronger of the two bones in the leg below the knee in vertebrates, and connects the knee with the ankle bones. Details of the X-ray data used are provided in the next section.
2.1 Theory Development
A typical digital image processing system consists of image segmentation, feature extraction, pattern recognition, thresholding and error classification. Image processing aims at extracting the necessary information from the image. The image needs to be reduced to certain defining characteristics, and the analysis of these characteristics gives the relevant information. Figure 2.1 shows a process flow diagram of a typical digital image processing system, showing the sequence of the operations. Image segmentation is the main focus of this thesis. The other processes are briefly described for completeness and to inform the reader of the processes in the whole system.
2.1.1 Image Segmentation
Image segmentation is the process of extracting the regions of interest from an image. There are many operations to segment images, and their usage depends on the nature of the region to be extracted. For example, if an image has strong edges, edge detection techniques can be used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image: it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contour of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.
2 Segmentation of Images - An Overview
Image segmentation can proceed in three different ways:
- Manually
- Automatically
- Semiautomatically
2.1 Manual Segmentation
The pixels belonging to the same intensity range could be pointed out manually, but clearly this is a very time-consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. The manual techniques all have in common the amount of time spent in tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.
2.2 Automatic Segmentation
Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could for instance be the noise level, or the objects having a special intensity distribution.
2.3 Semiautomatic Segmentation
Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.
- Thresholding
If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a, as follows:

if B(i, j) >= a then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all (i, j) over the image B [YGV].

This can be repeated for each region, dividing them by a threshold value, which results in four regions, etc. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but regionally do not belong to the selected region. These pixels could for instance appear from noise. The simplest way of choosing the threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.
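The threshold rule above, with the mean intensity as the simplest automatic choice of a, can be sketched as (function names are illustrative):

```python
import numpy as np

def threshold(image, a):
    """Divide image B into object (1) and background (0):
    B(i, j) = 1 if B(i, j) >= a, else 0."""
    B = np.asarray(image)
    return (B >= a).astype(np.uint8)

def mean_threshold(image):
    """Simplest automatic choice: threshold at the mean intensity."""
    return threshold(image, np.mean(image))
```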
The isodata algorithm is an iterative process for finding the threshold value [YGV]. First segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image corresponding to each of the two segmented regions, and calculate a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

and repeat until the threshold value does not change any more. Finally, choose this value for the threshold segmentation.
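A sketch of the isodata iteration described above; the initial threshold and the stopping tolerance are our additions:

```python
import numpy as np

def isodata_threshold(image, t0=None, tol=0.5):
    """Isodata thresholding: split at t, average the two region means,
    repeat until the threshold stops changing."""
    img = np.asarray(image, dtype=float)
    t = np.mean(img) if t0 is None else float(t0)   # temporary initial threshold
    while True:
        low, high = img[img < t], img[img >= t]
        if low.size == 0 or high.size == 0:         # degenerate split: stop
            return t
        t_new = (low.mean() + high.mean()) / 2.0    # threshold_new from the text
        if abs(t_new - t) < tol:                    # threshold no longer changes
            return t_new
        t = t_new
```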
To implement the triangle algorithm, construct a histogram of intensities vs. number of pixels like in Figure 2.1. Draw a line between the maximum value of the histogram hmax and the minimum value hmin, and calculate the distance d between the line and the histogram. Increase hmin and repeat for all h until h = hmax. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek make a weak peak.
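The triangle algorithm can be sketched directly on a histogram. This illustration measures the perpendicular distance from each histogram point to the line joining the histogram peak and the far end of the occupied range; all names are ours:

```python
import numpy as np

def triangle_threshold(hist):
    """Triangle algorithm: draw a line from the histogram peak to the last
    non-empty bin and pick the bin farthest (perpendicularly) from it."""
    hist = np.asarray(hist, dtype=float)
    peak = int(np.argmax(hist))                       # h_max: the histogram peak
    nonzero = np.nonzero(hist)[0]
    end = int(nonzero[-1] if nonzero[-1] != peak else nonzero[0])
    lo, hi = sorted((peak, end))
    if lo == hi:                                      # degenerate histogram
        return lo
    xs = np.arange(lo, hi + 1)
    # perpendicular distance from each (x, hist[x]) to the peak-end line
    x1, y1, x2, y2 = peak, hist[peak], end, hist[end]
    d = np.abs((y2 - y1) * xs - (x2 - x1) * hist[lo:hi + 1] + x2 * y1 - y2 * x1)
    d /= np.hypot(y2 - y1, x2 - x1)
    return int(xs[np.argmax(d)])
```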
- Boundary tracking
Edge-finding by gradients is the method of selecting a boundary manually and automatically following the gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region, and will meet problems if the gradient specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels compared to the threshold method only.
The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem where ∆ = ∂²/∂x². Assume the boundary is blurred, so the gradient will have a shape like in Figure 2.2. The Laplacian will change sign just around the assumed edge, at position = 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
- Clustering Methods
Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector, but
2.1.2 Feature Extraction
Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient and edges. Generally, a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.
To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF) and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.
2.1.3 Classifiers and Pattern Recognition
After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of good femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF and IGD that gave the best performance overall. Clustering and nearest neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a certain direction that is very different from the gradient vector of a fractured long-bone X-ray. By observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.
2.1.4 Thresholding and Error Classification
Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or lesser than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
Thresholding is used at different stages in this thesis. It is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, with the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.
(a) The original image (b) The histogram of the image
Figure 2.3: Histogram of image [23]
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are most useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality. Contrast stretch, density slicing, edge enhancement and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement, because of the precision and wide variety of digital processes.
Contrast
Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image. The larger this ratio, the easier it is to interpret the image. Satellite images lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white. The remaining pixel values are distributed linearly between these extremes. The features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically, as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
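A minimal sketch of the linear contrast stretch, mapping the image's grey-value range onto a 0-255 display range (the function and parameter names are ours):

```python
import numpy as np

def linear_stretch(image, out_min=0, out_max=255):
    """Linearly map the image's grey-value range [min, max] onto the
    full display range [out_min, out_max]."""
    img = np.asarray(image, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:                                  # flat image: nothing to stretch
        return np.full(img.shape, out_min, dtype=np.uint8)
    stretched = (img - lo) / (hi - lo) * (out_max - out_min) + out_min
    return stretched.astype(np.uint8)
```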
Non-Linear Contrast Enhancement
In these methods, the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. The non-linear contrast enhancement techniques have been found to be useful for enhancing the colour contrast between nearly similar classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed, to enhance values in the brighter part of the histogram, by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values. Thus, the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
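Histogram equalization as described can be sketched via the cumulative histogram; this illustration assumes 8-bit integer input, and the function name is ours:

```python
import numpy as np

def histogram_equalize(image, levels=256):
    """Redistribute grey values through the cumulative histogram so the
    population density becomes approximately uniform."""
    img = np.asarray(image)
    hist = np.bincount(img.ravel(), minlength=levels)   # intensity histogram
    cdf = np.cumsum(hist).astype(float)                 # cumulative histogram
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())   # normalise to [0, 1]
    lut = np.round(cdf * (levels - 1)).astype(np.uint8) # grey-value remap table
    return lut[img]
```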
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7 or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median and mode filters.
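The low-pass mean filter, with border handling by option (1) above (replicating border pixel values), can be sketched as:

```python
import numpy as np

def mean_filter(image, n=3):
    """Low-pass filter: each output pixel is the mean of its n x n
    neighbourhood. Borders are handled by extending the image with
    replicated border values, so the output keeps the input's size."""
    img = np.asarray(image, dtype=float)
    pad = n // 2
    padded = np.pad(img, pad, mode="edge")    # extend image beyond its border
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + n, j:j + n].mean()
    return out
```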
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window. Thus, the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.
The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way. Hence the result of this operation has a more natural look than many of the other edge-enhanced images.
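The two linear edge enhancements just described, the directional first difference and the Laplacian, can be sketched as follows. The 4-neighbour Laplacian kernel is a common choice and our assumption here, not something the text specifies:

```python
import numpy as np

def horizontal_first_difference(image):
    """Directional first difference: approximates the first derivative
    between horizontally adjacent pixels."""
    img = np.asarray(image, dtype=float)
    return img[:, 1:] - img[:, :-1]

def laplacian(image):
    """4-neighbour Laplacian: highlights points, lines and edges and
    suppresses uniform and smoothly varying regions."""
    img = np.asarray(image, dtype=float)
    out = np.zeros_like(img)
    out[1:-1, 1:-1] = (img[:-2, 1:-1] + img[2:, 1:-1] +
                       img[1:-1, :-2] + img[1:-1, 2:] - 4 * img[1:-1, 1:-1])
    return out
```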
Band Ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods discussed in this thesis as used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour/Snake Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges in femurs to separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel, or roughly parallel, lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object, based on a specified threshold (hence the name bi-level thresholding). Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms; they include methods like edge detection, region extraction, and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used by [31] to extract femur contours in X-ray images after doing edge detection on the image with a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.
Active Shape Models (ASMs), introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24] in that the same model is used for a different application.
[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.
The working mechanisms of the methods discussed above are explained in detail in the following sections.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays; if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which also includes methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or set of pixels, that form the edge are generally of the same or nearly the same intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can equivalently be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, while at a point on an edge it is a vector pointing across the edge, from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction and the other for the vertical direction, which approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
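As a concrete illustration of the two-kernel scheme just described, the following Python/NumPy sketch convolves an image with 3×3 Sobel kernels and combines the directional derivatives into the magnitude and direction of Equations 3.1 and 3.2 (kernel sign and orientation conventions vary between references; this is one common choice):

```python
import numpy as np
from scipy.ndimage import convolve

# Sobel kernels approximating the horizontal (x) and vertical (y)
# derivatives, one common form of the masks in Equations 3.3 and 3.4.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel_gradient(image):
    """Return the gradient magnitude and direction (Eqs. 3.1 and 3.2)."""
    dx = convolve(image.astype(float), KX)
    dy = convolve(image.astype(float), KY)
    magnitude = np.hypot(dx, dy)
    direction = np.arctan2(dy, dx)
    return magnitude, direction
```

On a region of constant intensity the response is a zero vector, as stated above, while at a step edge the magnitude is large.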
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt, however, are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel when calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels for calculating the directional derivatives differ.
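The difference is easy to see by writing the two x-kernels side by side; this small NumPy sketch (with the usual sign convention assumed) confirms they differ only in the center weighting:

```python
import numpy as np

# Prewitt x-kernel (Eqs. 3.5 and 3.6): all rows weighted equally.
PREWITT_X = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]], dtype=float)
PREWITT_Y = PREWITT_X.T

# Sobel x-kernel: the center row carries a weight of 2.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

# The two operators differ only at the two non-zero entries
# of the middle row of the x-kernel.
diff = SOBEL_X - PREWITT_X
```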
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the gradient magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
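A minimal NumPy/SciPy sketch of the Roberts Cross operator, using the two 2×2 diagonal-difference kernels (the exact kernel orientation varies by reference):

```python
import numpy as np
from scipy.ndimage import convolve

# Roberts Cross 2x2 kernels: differences between diagonally adjacent pixels.
R1 = np.array([[1, 0],
               [0, -1]], dtype=float)
R2 = np.array([[0, 1],
               [-1, 0]], dtype=float)

def roberts_magnitude(image):
    """Square root of the sum of squared diagonal differences."""
    d1 = convolve(image.astype(float), R1)
    d2 = convolve(image.astype(float), R2)
    return np.sqrt(d1**2 + d2**2)
```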
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels below a certain threshold are suppressed. A multi-level thresholding technique involving two levels, like the example in 2.4, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
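The double-threshold step just described can be sketched as follows (Python/SciPy). Instead of growing strong seeds pixel by pixel, this version labels the 8-connected components of the weak mask and keeps those containing at least one strong pixel, which yields the same result:

```python
import numpy as np
from scipy.ndimage import label

def hysteresis(grad, low, high):
    """Keep pixels above `high`, plus pixels above `low` that are
    connected (8-connectivity) to a pixel above `high`."""
    strong = grad > high
    weak = grad > low  # includes all strong pixels, since high > low
    lbl, n = label(weak, structure=np.ones((3, 3)))
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(lbl[strong])] = True
    keep[0] = False  # the background label is never kept
    return keep[lbl]
```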
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of an image to analyze it, quantifying the visual or other simple characteristics of the image so that it can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
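A range filter replaces each pixel with the difference between the maximum and minimum values in its neighborhood, so textured regions (such as bone) give large responses and smooth regions give small ones. A minimal NumPy/SciPy sketch, mirroring MATLAB's rangefilt with its default 3×3 neighborhood:

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def range_filter(image, size=3):
    """Local range: max minus min over each size-by-size neighborhood."""
    img = image.astype(float)
    return maximum_filter(img, size=size) - minimum_filter(img, size=size)
```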
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD) ;  Ixy(x, σD)  Iyy(x, σD) ]   (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)
or
P(x) = min(λ2(x), 0)   (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I_11, and then produce increasingly Gaussian smoothed images, I_1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I_11 to I_16. Image I_14 is downsampled to half its size to produce image I_21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, P_ij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:
MP_12  MP_13  MP_14  MP_15
MP_22  MP_23  MP_24  MP_25
...
MP_n2  MP_n3  MP_n4  MP_n5   (4)
where MP_ij = max(P_i,j−1, P_ij, P_i,j+1).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
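A sketch of how the principal curvature image of Eq. 2 and the scale maximum of Eq. 4 might be computed (Python/SciPy). The closed-form eigenvalue of the symmetric 2×2 Hessian avoids an explicit eigendecomposition; taking axis 1 as the x direction is a convention of this sketch, not of the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvature(image, sigma):
    """Maximum Hessian eigenvalue at each pixel, clamped at zero (Eq. 2).
    Second derivatives are taken at Gaussian scale sigma."""
    img = image.astype(float)
    Ixx = gaussian_filter(img, sigma, order=(0, 2))  # d2/dx2 (axis 1 = x)
    Iyy = gaussian_filter(img, sigma, order=(2, 0))  # d2/dy2
    Ixy = gaussian_filter(img, sigma, order=(1, 1))
    # largest eigenvalue of [[Ixx, Ixy], [Ixy, Iyy]] in closed form
    half_trace = 0.5 * (Ixx + Iyy)
    root = np.sqrt(0.25 * (Ixx - Iyy) ** 2 + Ixy ** 2)
    return np.maximum(half_trace + root, 0.0)

def max_over_triplet(p_prev, p_cur, p_next):
    """Pixelwise maximum over three consecutive scales, as in Eq. 4."""
    return np.maximum(np.maximum(p_prev, p_cur), p_next)
```

A dark line on a light background produces a positive response at the line, consistent with the behavior of Eq. 2 described above.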
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either directly to an intensity image or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins.
Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.
In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise, the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.
Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).
The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
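The per-pixel low threshold described above might be sketched as follows (NumPy). The eigenvector field is assumed to be an (H, W, 2) array of normalized major eigenvectors; the agreement cutoff of 0.9 is our assumption, since the text only requires the average dot product to be "high enough", and np.roll wraps at the image border, which a real implementation would handle explicitly:

```python
import numpy as np

HIGH = 0.04  # strong-response threshold from the text

def low_threshold_map(ev, agreement=0.9):
    """Per-pixel low threshold for eigenvector-flow hysteresis.
    ev: (H, W, 2) array of normalized major eigenvectors."""
    H, W, _ = ev.shape
    support = np.zeros((H, W))
    count = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(ev, dy, axis=0), dx, axis=1)
            # |inner product| with each of the 8 neighbors' eigenvectors
            support += np.abs(np.sum(ev * shifted, axis=-1))
            count += 1
    avg = support / count
    # low-to-high ratio: 0.2 where the flow is coherent, 0.7 otherwise
    ratio = np.where(avg > agreement, 0.2, 0.7)
    return HIGH * ratio
```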
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
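On pixel masks, the overlap-error idea reduces to one minus the intersection-over-union of the two regions; a minimal NumPy sketch (note that [19] defines the measure on the fitted ellipses, so applying it to raw pixel masks is a simplification):

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """1 - |A intersect B| / |A union B| for two boolean region masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union if union else 0.0
```

A region would then be kept if its overlap error against some region in each adjacent scale falls below a chosen cutoff.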
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes in the same layer of a painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
3.1 Spatially Coherent Segmentation
We improve the layer segmentation by combining k-means with a spatial coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
min_L  Σ_p ||f_p − c_{L_p}||₂² + λ Σ_{{p,q}∈N} |e_pq| · δ_T[L_p ≠ L_q]   (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ_T is the delta function.
The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
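The E-M loop above can be sketched as follows (NumPy). For illustration only, the graph-cut E-step is replaced by a greedy per-pixel (ICM-style) update of the same energy, edge lengths |e_pq| are taken as 1, and a 4-neighborhood with wrap-around borders is used; all of these are simplifications of the formulation in Eq. (1):

```python
import numpy as np

def em_segment(features, k, lam=1.0, iters=5, seed=0):
    """features: (H, W, C) per-pixel chromatic features.
    Returns per-pixel labels and the k mean chromatic vectors."""
    H, W, C = features.shape
    rng = np.random.default_rng(seed)
    # initialize the k centers from randomly chosen pixels
    centers = features.reshape(-1, C)[
        rng.choice(H * W, k, replace=False)].astype(float)
    labels = np.zeros((H, W), dtype=int)
    for _ in range(iters):
        # E-step: unary color cost ||f_p - c_i||^2 ...
        unary = ((features[..., None, :] - centers) ** 2).sum(axis=-1)
        # ... plus a Potts penalty for disagreeing with current neighbors
        pair = np.zeros_like(unary)
        for axis in (0, 1):
            for shift in (1, -1):
                nb = np.roll(labels, shift, axis=axis)
                pair += lam * (np.arange(k) != nb[..., None])
        labels = np.argmin(unary + pair, axis=-1)
        # M-step: re-estimate each cluster's mean chromatic vector
        for i in range(k):
            if np.any(labels == i):
                centers[i] = features[labels == i].mean(axis=0)
    return labels, centers
```

A proper implementation would solve the E-step globally with graph cuts, as the paper does; the ICM update here only illustrates the energy being traded off.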
3.2 Curvature-Based Inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well-suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail.
To formulate the problem as a linear program, this approach models curvature in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line-segment pairs are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and instructed on breath-holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus, and the trigger is adjusted to 100 HU (Hounsfield Units). 70 mL of nonionic contrast agent is delivered at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.
B. Method
The stages followed to segment the lungs from the CTA images in this work are shown in Figure 1. The CTA data at hand consist of 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so that parts with values greater than 700 HU are retained; after thresholding, the new images are logical (binary):
Thresh = image > 700
In each of these new images, sub-segment vessels are present in the lung region. In the second step these vessels are removed: each 2D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, components whose pixel counts are under 1000 are removed from the image (Figure 3).
Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept, the other parts are removed from the image, and the image is then inverted, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach the image border (pixel coordinate 1 or 512), the parts that satisfy this condition are removed, and the lungs and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected-component labeling algorithm, and components whose pixel counts are below 1000 are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained.
Figure 6 (c)
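The slice-wise pipeline described above (threshold, remove small vessel components, keep the largest component as the body, invert, discard border-connected outside air, remove small airway components) might be sketched like this in Python/SciPy; the threshold and size cutoffs follow the text, while everything else is an illustrative reconstruction:

```python
import numpy as np
from scipy.ndimage import label

def remove_small_components(mask, min_pixels=1000):
    """Drop connected components with fewer than min_pixels pixels."""
    lbl, n = label(mask)
    if n == 0:
        return mask
    sizes = np.bincount(lbl.ravel())
    keep = sizes >= min_pixels
    keep[0] = False  # background is never kept
    return keep[lbl]

def segment_lungs(slice_2d, threshold=700, min_pixels=1000):
    """Sketch of the described pipeline on one 2D CTA slice."""
    body_mask = slice_2d > threshold
    body_mask = remove_small_components(body_mask, min_pixels)
    # keep only the largest component (the patient's body)
    lbl, _ = label(body_mask)
    sizes = np.bincount(lbl.ravel()); sizes[0] = 0
    body = lbl == sizes.argmax()
    # invert: lungs and outside air become 1
    inv = ~body
    # remove components touching the image border (outside-body air)
    lbl2, _ = label(inv)
    border_labels = np.unique(np.concatenate(
        [lbl2[0, :], lbl2[-1, :], lbl2[:, 0], lbl2[:, -1]]))
    lungs = inv & ~np.isin(lbl2, border_labels)
    # finally remove small remaining components (airways)
    return remove_small_components(lungs, min_pixels)
```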
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4].
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed, in extreme close-up view, in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1: Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge detection and image segmentation, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, and range (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as the image histogram or a profile of intensity values (Figure 3a, b).
Figure 3a - The histogram of an X-ray CT image and the plotted fits (2 significant digits). A cubic fitting function is the best-fit model for the histogram data; the fitted curve is plotted as a magenta line through the data. An area graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figure 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object, or can be based on an alphamap, which behaves in a way analogous to a colormap (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The ''meshgrid'' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables: the rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
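The same convention can be sketched outside MATLAB: NumPy's np.meshgrid with its default Cartesian ('xy') indexing behaves the same way, so a minimal illustration (the example vectors are ours) might look like:

```python
import numpy as np

# Two coordinate vectors defining the domain.
x = np.array([1, 2, 3])
y = np.array([10, 20])

# With the default 'xy' indexing, the rows of X are copies of x
# and the columns of Y are copies of y, as in MATLAB's meshgrid.
X, Y = np.meshgrid(x, y)

# The grid can then be used to evaluate a function of two variables.
Z = X ** 2 + Y
```

Evaluating Z on the grid gives one value per (x, y) pair, which is exactly what surface-plot functions consume.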
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. ''Lighting'' is the technique of illuminating an object with a directional light source; it can make subtle differences in surface shape easier to see and can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The ''image'' function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The ''image'' with colormap scaling (the ''imagesc'' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. ''jet'' ranges from blue to red and passes through the colors cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole-Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
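The magnitude response that FVTool plots can also be computed directly from the filter coefficients. As a sketch (in Python rather than MATLAB; the function name and frequency grid are our choices), evaluating |H(e^jw)| = |B(e^jw)/A(e^jw)| on [0, pi) looks like:

```python
import numpy as np

def magnitude_response(b, a, n_points=64):
    """Magnitude response of a digital filter with numerator
    coefficients b and denominator coefficients a, evaluated
    at n_points frequencies on [0, pi)."""
    w = np.linspace(0.0, np.pi, n_points, endpoint=False)
    z = np.exp(1j * w)
    # B(z) = b[0] + b[1] z^-1 + ...  (and likewise for A)
    B = sum(bk * z ** (-k) for k, bk in enumerate(b))
    A = sum(ak * z ** (-k) for k, ak in enumerate(a))
    return w, np.abs(B / A)
```

For example, a two-tap moving average (b = [0.5, 0.5], a = [1]) has unit gain at zero frequency and rolls off towards pi, which is the kind of curve Figures 11 and 16 display.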
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM works in the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns hold their x and y co-ordinates respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis the distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).
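With the 2n × 1 shape vector defined above, the centroid and size can be computed directly. A minimal sketch (the function name is ours):

```python
import numpy as np

def centroid_and_size(shape_vec):
    """shape_vec: 2n-vector holding the n x co-ordinates followed by
    the n y co-ordinates, as used in this thesis."""
    v = np.asarray(shape_vec, dtype=float)
    n = v.size // 2
    pts = np.column_stack((v[:n], v[n:]))
    c = pts.mean(axis=0)  # centroid: mean of the point positions
    # size: root mean square distance from the points to the centroid
    size = np.sqrt(((pts - c) ** 2).sum(axis=1).mean())
    return c, size
```

For a unit square the centroid is its centre and the size is √0.5, the RMS distance from centre to corners.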
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
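Algorithm 1 can be sketched in code. The version below is a simplification (ours, not the thesis code): it aligns shapes by translation and scaling only, whereas a full implementation would also remove rotation, e.g. by Procrustes alignment:

```python
import numpy as np

def align_shapes(shapes, max_iters=50, tol=1e-8):
    """shapes: list of (n, 2) arrays of corresponding points."""
    # Step 2: translate each shape so it is centred on the origin.
    shapes = [s - s.mean(axis=0) for s in shapes]
    # Step 3: scale the reference (first) shape to unit size -> x0.
    mean = shapes[0] / np.linalg.norm(shapes[0])
    for _ in range(max_iters):                       # step 4: repeat
        # (a) align all shapes to the mean shape (scale to unit size)
        aligned = [s / np.linalg.norm(s) for s in shapes]
        # (b) recalculate the mean shape from the aligned shapes
        new_mean = np.mean(aligned, axis=0)
        # (c) constrain the mean: re-centre and scale to unit size
        new_mean = new_mean - new_mean.mean(axis=0)
        new_mean = new_mean / np.linalg.norm(new_mean)
        if np.linalg.norm(new_mean - mean) < tol:    # step 5: converged
            break
        mean = new_mean
    return aligned, mean
```

Two copies of the same shape, even at different positions, align to identical unit-size shapes, and the constrained mean shape always has unit size.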
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the resulting images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks, placed at approximately equal intervals and distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmarks is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations around it [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance of the aligned training shapes, and
b is the vector of shape parameters.
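Equation 4.3 can be sketched as follows (the helper names are ours; Φ is obtained here by a small principal component analysis over the aligned training shape vectors):

```python
import numpy as np

def build_shape_model(shapes, t=2):
    """shapes: (N, 2n) matrix, one aligned shape vector per row.
    Returns the mean shape and the first t eigenvectors (modes)."""
    x_mean = shapes.mean(axis=0)
    cov = np.cov(shapes, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)         # eigen-decomposition
    order = np.argsort(vals)[::-1][:t]       # keep the t largest modes
    return x_mean, vecs[:, order]

def generate_shape(x_mean, phi, b):
    """x_hat = x_mean + phi @ b  (Equation 4.3)."""
    return x_mean + phi @ b
```

Setting b = 0 reproduces the mean shape; varying the entries of b generates the permissible deformations.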
4.2.2 Generating shapes from the model
As Equation 4.3 shows, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers; they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
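The Mahalanobis distance can be sketched directly from its definition (the function name is ours):

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S_g):
    """Squared Mahalanobis distance between a sampled profile g and
    the mean profile g_mean, with profile covariance matrix S_g:
        d^2 = (g - g_mean)^T  S_g^{-1}  (g - g_mean)
    """
    diff = np.asarray(g, float) - np.asarray(g_mean, float)
    # Solve S_g x = diff rather than forming the inverse explicitly.
    return float(diff @ np.linalg.solve(S_g, diff))
```

With Sg equal to the identity matrix this reduces to the squared Euclidean distance; a large variance along some profile direction correspondingly down-weights differences along it.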
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model confirms that the shape remains close to the mean shape: it ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile matches but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust; it enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid, with the sizes of the images given relative to the first image (a general picture, not a bone, is shown).
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, since profile models must be trained and created for each image, but the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and profiles from all the training images using the landmark points, but the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, or started somewhere close to the bone boundary, in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image: it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contour of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.
2 Segmentation of Images - An Overview
Image segmentation can proceed in three different ways:
• Manually
• Automatically
• Semiautomatically
2.1 Manual Segmentation
The pixels belonging to the same intensity range could be pointed out manually, but clearly this is a very time-consuming method if the image is large. A better choice is to mark the contours of the objects. This can be done discretely from the keyboard, giving high accuracy but low speed, or with the mouse, giving higher speed but less accuracy. The manual techniques all have in common the amount of time spent tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures such as ellipses to approximate the boundaries of the objects. This has been done frequently for medical purposes, but the approximations may not be very good.
2.2 Automatic Segmentation
Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could, for instance, be the noise level, or the intensities of the objects having a special distribution.
2.3 Semiautomatic Segmentation
Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods:
• Thresholding
If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a, as follows:
if B(i, j) ≥ a then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all i, j over the image B [YGV].
This can be repeated for each region, dividing them by a threshold value, which results in four regions, and so on. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but do not actually belong to the selected region; these pixels could, for instance, arise from noise. The simplest choice of threshold value is a fixed value, for instance the mean value of the image. A better choice is a histogram-derived threshold: this includes some knowledge of the distribution of the image and results in less misclassification.
The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporary chosen threshold value. Then calculate the mean value of the image in each of the two segmented regions and compute a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

Repeat until the threshold value no longer changes, and finally use this value for the threshold segmentation.
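The isodata iteration can be sketched as follows (a minimal version, ours, which assumes the image contains pixels on both sides of the initial threshold):

```python
import numpy as np

def isodata_threshold(image, tol=0.5):
    """Iterative (isodata) threshold selection: split the image at t,
    take the means of the two regions, and set the new threshold to
    their average; repeat until the threshold stops changing."""
    img = np.asarray(image, dtype=float).ravel()
    t = img.mean()                      # temporary starting threshold
    while True:
        m1 = img[img < t].mean()        # mean of region 1 (below t)
        m2 = img[img >= t].mean()       # mean of region 2 (at/above t)
        t_new = 0.5 * (m1 + m2)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
```

On a cleanly bimodal image the threshold converges to the midpoint between the two mode means.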
To implement the triangle algorithm, construct a histogram of intensity vs. number of pixels, as in Figure 2.1. Draw a line between the maximum value of the histogram, hmax, and the minimum value, hmin, and calculate the distance d between the line and the histogram. Increase hmin and repeat for all h until h = hmax. The threshold value is the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek form a weak peak.
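A sketch of the triangle method on a histogram (assuming, for this version, that the weak tail lies on the high-intensity side of the peak):

```python
import numpy as np

def triangle_threshold(hist):
    """Pick the bin between the histogram peak and its far end whose
    height lies furthest (perpendicularly) from the peak-to-end line."""
    h = np.asarray(hist, dtype=float)
    x1, y1 = int(np.argmax(h)), h.max()      # histogram peak (h_max)
    x2, y2 = len(h) - 1, h[-1]               # far low end   (h_min)
    xs = np.arange(x1, x2 + 1)
    # Perpendicular distance from (x, h[x]) to the line (x1,y1)-(x2,y2).
    d = np.abs((y2 - y1) * xs - (x2 - x1) * h[x1:x2 + 1]
               + x2 * y1 - y2 * x1) / np.hypot(y2 - y1, x2 - x1)
    return int(xs[np.argmax(d)])
```

The bin furthest from the line sits just past the weak peak, which is exactly the behaviour the text describes.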
• Boundary tracking
Edge-finding by gradients is the method of selecting a boundary point manually and automatically following the gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region and runs into problems if the gradient specifying the boundary is varying or very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation; this excludes some wrongly included pixels compared to the threshold method alone.
The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1-D problem where ∇² = ∂²/∂x², and assume the boundary is blurred, so the gradient has a shape like that in Figure 2.2. The Laplacian then changes sign just around the assumed edge at position 0. For noisy images the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
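In 1-D, the zero-crossing procedure can be sketched with a moving-average smoothing filter followed by a discrete second derivative (the function name and parameters are ours):

```python
import numpy as np

def zero_crossings(signal, smooth=3):
    """Smooth the signal, take the discrete Laplacian (second
    difference), and return indices where it changes sign."""
    s = np.convolve(signal, np.ones(smooth) / smooth, mode='same')
    lap = np.diff(s, 2)                 # discrete d^2/dx^2
    sgn = np.sign(lap)
    # A strict sign change between neighbouring samples marks an edge.
    return np.where(sgn[:-1] * sgn[1:] < 0)[0] + 1
```

On a clean ramp edge this marks the inflection point; with noisy input, larger `smooth` values are needed, exactly as the text warns.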
• Clustering methods
Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3-D vector, but
2.1.2 Feature Extraction
Feature extraction is the process of reducing the segmented image to a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen so that they are a good representation of the image and encapsulate the necessary information. Examples of features are image properties such as the mean, standard deviation, gradient, and edges. Generally a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can be assigned weights to signify their relative importance: for example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may be assigned a weight of 0.3. Weights generally range from 0 to 1 and define how important the features are. These features and their respective weights are then used on a test image to extract the relevant information.
To classify a bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look specifically at femur fractures. The best parameter values for the features can be found using various techniques.
2.1.3 Classifiers and Pattern Recognition
After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, features such as the neck-shaft angle in a femur X-ray image need to be plotted; the patterns can be recognized if the neck-shaft angles of healthy femurs differ from those of fractured femurs. Classifiers such as Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF, and IGD that gave the best overall performance. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a direction very different from that of a fractured long-bone X-ray, so a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.
2.1.4 Thresholding and Error Classification
Thresholding and error classification form the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can then be separated into regions or pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
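Otsu's method picks the threshold that maximises the between-class variance of the two pixel groups. A histogram-based sketch (the bin count and function name are our choices):

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Return the grey level maximising the between-class variance
    sigma_B^2(t) = (mg*w0 - m)^2 / (w0 * (1 - w0))."""
    hist, edges = np.histogram(np.asarray(image).ravel(), bins=bins)
    p = hist / hist.sum()                    # normalised histogram
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)                        # prob. of class below t
    m = np.cumsum(p * centers)               # cumulative mean
    mg = m[-1]                               # global mean
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b = (mg * w0 - m) ** 2 / (w0 * (1.0 - w0))
    sigma_b[~np.isfinite(sigma_b)] = 0.0     # empty classes score zero
    return centers[np.argmax(sigma_b)]
```

On a cleanly bimodal image the returned threshold separates the two intensity populations without any manual tuning.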
Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background and foreground intensities may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, with the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.
(a) The original image (b) The histogram of the image
Figure 2.3 Histogram of image [23]
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. A wide variety of techniques exists for improving image quality; contrast stretch, density slicing, edge enhancement, and spatial filtering are the more commonly used ones. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement, because of the precision and the wide variety of digital processes available.
Contrast
Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities, enabling the analyst to discriminate easily between areas that initially had only a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white; the remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image become clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
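A minimal linear contrast stretch, mapping the band minimum to black and the band maximum to white (output range chosen here as 0-255):

```python
import numpy as np

def linear_stretch(band, out_min=0.0, out_max=255.0):
    """Map [band.min(), band.max()] linearly onto [out_min, out_max]."""
    band = np.asarray(band, dtype=float)
    lo, hi = band.min(), band.max()
    return (band - lo) / (hi - lo) * (out_max - out_min) + out_min
```

Applied per band of a multispectral image, this is exactly the operation described above: the extremes are pinned to black and white and everything in between is spread linearly.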
Non-Linear Contrast Enhancement
In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between closely related classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values; thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
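Histogram equalization can be sketched with the textbook mapping through the cumulative distribution function (an illustrative pure-Python version; the `levels` parameter and rounding choice are assumptions, not from the text):

```python
def equalize(pixels, levels=256):
    """Histogram equalization: map grey values through the scaled CDF."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)   # first non-zero CDF value
    if n == cdf_min:                          # degenerate: single grey value
        return [0] * n
    # adjacent grey values can map to the same output level, so the
    # enhanced image has no more (often fewer) grey levels than the input
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))
            for p in pixels]
```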
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7, or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
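The simplest low-pass filter, the 3x3 mean filter, can be sketched as follows; border replication (technique (1) above) keeps the output the same size as the input (illustrative pure-Python, lists of lists):

```python
def mean_filter(img):
    """3x3 mean (low-pass) filter with replicated borders."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)   # replicate border pixels
                    xx = min(max(x + dx, 0), w - 1)
                    s += img[yy][xx]
            out[y][x] = s // 9    # BVout is the mean of the 3x3 neighbourhood
    return out
```

A uniform region passes through unchanged, while an isolated bright pixel is spread across its neighbourhood, which is exactly the blurring effect noted above.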
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window. Thus the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or nonlinear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.
The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way. Hence the result of this operation has a more natural look than many of the other edge-enhanced images.
Band ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information not available in any single band that is useful for discriminating between soils and vegetation.
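The ratio transformation itself is pixel-wise division of two co-registered bands; a minimal sketch (the epsilon guard against zero-valued pixels is an implementation choice, not from the text):

```python
def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two co-registered bands.

    Illumination differences (slope, shadow) scale both bands by roughly
    the same factor, so the ratio of the two bands stays nearly constant
    for the same surface material; eps avoids division by zero."""
    return [[a / (b + eps) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(band_a, band_b)]
```

For instance, a sunlit pixel (100, 50) and a shadowed pixel of the same material (50, 25) both produce a ratio near 2.0, which is how the transformation suppresses the illumination effect.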
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods discussed in this thesis as used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour/Snake Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in Section 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or as a part of the object (hence the name bi-level thresholding) based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in Section 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on shape, size, and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with. The relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.
[18] and [1] analyzed the performance of ASMs using the aspects of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.
The working mechanisms of the methods discussed above are explained in detail later in this chapter.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models. The relationship between the size of the training set, computation time, and error is studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels or sets of pixels that form the edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be seen as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image from light to dark pixels, in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point which is in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks or kernels, one for the horizontal direction and the other for the vertical direction in an image, that approximate the derivatives in the horizontal and vertical directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image and the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
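The Sobel computation described above can be sketched at a single interior pixel as follows (pure Python; the kernel layout follows the standard convention, and the signs of the derivatives depend on that choice):

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal derivative kernel
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical derivative kernel

def sobel(img, y, x):
    """Gradient magnitude and direction at interior pixel (y, x)."""
    dx = dy = 0
    for j in range(3):
        for i in range(3):
            p = img[y + j - 1][x + i - 1]
            dx += SOBEL_X[j][i] * p
            dy += SOBEL_Y[j][i] * p
    # magnitude (Eq. 3.1) and angle (Eq. 3.2) of the gradient vector
    return math.hypot(dx, dy), math.atan2(dy, dx)
```

On a region of constant intensity both kernel responses cancel and the result is the zero vector; on a vertical dark-to-bright step the vector points across the edge toward the brighter side.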
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give weighting to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. So, in simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. But this method is still used, as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detecting technique, as it detects faint edges even when the image is noisy. This is because at the beginning of the process the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the example in Section 2.4 involving two levels, is then used on the data. If the pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
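The double-threshold rule described above can be sketched as follows (an illustrative pure-Python version; it repeats the adjacency test until no pixel changes, so chains of weak pixels connected to a strong pixel survive, which is the standard Canny hysteresis behavior):

```python
def hysteresis(grad, low, high):
    """Double-threshold step of Canny: strong pixels (>= high) survive,
    weak pixels (between low and high) survive only if 8-connected to a
    surviving pixel, everything below low is suppressed."""
    h, w = len(grad), len(grad[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if grad[y][x] >= high:
                out[y][x] = 1
    changed = True
    while changed:                      # grow strong edges into weak pixels
        changed = False
        for y in range(h):
            for x in range(w):
                if out[y][x] or grad[y][x] < low or grad[y][x] >= high:
                    continue
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        if 0 <= y + dy < h and 0 <= x + dx < w and out[y + dy][x + dx]:
                            out[y][x] = 1
                            changed = True
    return out
```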
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of the image to analyze it. Texture analysis attempts to quantify the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD) ]
           [ Ixy(x, σD)  Iyy(x, σD) ]   (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)

or

P(x) = min(λ2(x), 0)   (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:
MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5   (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
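The per-pixel principal curvature of Eq. 2 can be sketched with finite differences (a simplification: the detector uses Gaussian-scale derivatives in scale space, while this illustrative pure-Python version uses plain central differences at a single scale):

```python
import math

def principal_curvature(img, y, x):
    """max(lambda1, 0) from the 2x2 Hessian at an interior pixel (Eq. 2),
    with the Hessian entries approximated by central finite differences."""
    ixx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    # eigenvalues of the symmetric matrix [[ixx, ixy], [ixy, iyy]]
    mean = (ixx + iyy) / 2.0
    d = math.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    lam1 = mean + d                 # maximum eigenvalue
    return max(lam1, 0.0)
```

A dark vertical line on a light background produces a large positive response, while a flat region produces zero, matching the behavior Eq. 2 describes.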
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that the small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may potentially cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors, and the yellow arrows are the minor eigenvectors. To improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
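The per-pixel low-threshold selection in the eigenvector-flow scheme can be sketched as follows (the 0.04 high threshold and the 0.2/0.7 ratios come from the text; the agreement cutoff `agree=0.9` is an assumption, since the text only says the average dot product must be "high enough"):

```python
def low_threshold(vec, neighbors, high=0.04, agree=0.9):
    """Pick a pixel's low hysteresis threshold from eigenvector agreement.

    vec: this pixel's normalized major eigenvector (2-tuple).
    neighbors: normalized major eigenvectors of the 8 adjacent pixels.
    Strong directional support -> permissive ratio 0.2, else 0.7."""
    def absdot(a, b):
        return abs(a[0] * b[0] + a[1] * b[1])
    support = sum(absdot(vec, n) for n in neighbors) / len(neighbors)
    ratio = 0.2 if support >= agree else 0.7
    return high * ratio
```

A pixel whose eigenvector flows in the same direction as its neighbors gets the permissive 0.008 threshold, so weak but coherent ridge segments survive; a pixel with incoherent neighbors gets the conservative 0.028 threshold.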
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage as a clustering step to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M fashion.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||₂² + λ Σ_{{p,q}∈N} |e_{pq}| · δ[L_p ≠ L_q]   (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the
color model for cluster i, |e_{pq}| is the edge length between p and q, and δ is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement we can re-estimate the k models as the mean chromatic vectors (M-step). Then we iterate the E and M steps until convergence or until a predefined number of iterations is reached.
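The E-M iteration can be sketched as follows (a simplified scalar-feature version; the E-step here uses a greedy per-pixel ICM-style update as a stand-in for the graph-cut solver the text relies on, and all names are illustrative):

```python
def em_segment(features, edges, k_centers, lam=1.0, iters=5):
    """Iterative segmentation sketch for the energy in Eq. (1).

    features: one scalar feature per pixel (stand-in for chromatic vectors).
    edges: adjacency map {pixel: [neighbor pixels]} defining N.
    k_centers: initial cluster means; updated in place (M-step)."""
    labels = [min(range(len(k_centers)),
                  key=lambda c: (f - k_centers[c]) ** 2)
              for f in features]
    for _ in range(iters):
        # E-step: data term + penalty for disagreeing with neighbors
        for p, f in enumerate(features):
            def cost(c):
                data = (f - k_centers[c]) ** 2
                smooth = lam * sum(1 for q in edges.get(p, []) if labels[q] != c)
                return data + smooth
            labels[p] = min(range(len(k_centers)), key=cost)
        # M-step: re-estimate each cluster mean from its current members
        for c in range(len(k_centers)):
            members = [f for f, l in zip(features, labels) if l == c]
            if members:
                k_centers[c] = sum(members) / len(members)
    return labels
```

The greedy update only approximates the global optimum that graph cuts would find, but it illustrates the interplay of the data term and the smoothness term.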
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless, so curvature-based inpainting can be superior to exemplar-based methods (for instance, Depict and Criminisi et al. [3]) for recovering the structure of the underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, it is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail.

To formulate the problem as a linear program, this approach models curvature in a discrete sense (Fig. 4 shows a possible reconstruction of the level line with intensity 100). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute the line segments and line-segment pairs used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of the angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
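As a toy illustration of the curvature term, the total curvature of a discrete level line (a polyline on the grid) can be approximated by summing the absolute turning angles at its vertices. This is a hedged sketch only: the edge-length weighting used in the actual linear program is omitted.

```python
import math

def polyline_curvature(points):
    """Sum of absolute turning angles at the interior vertices of a polyline.
    (The paper additionally weights each term by edge length; omitted here.)"""
    total = 0.0
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        a1 = math.atan2(y1 - y0, x1 - x0)   # direction of incoming segment
        a2 = math.atan2(y2 - y1, x2 - x1)   # direction of outgoing segment
        # wrap the angle change to [-pi, pi] and accumulate its magnitude
        total += abs((a2 - a1 + math.pi) % (2 * math.pi) - math.pi)
    return total
```

A straight level line has curvature 0; a single right-angle turn contributes pi/2.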
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice was taken at the level of the pulmonary truncus. A bolus-tracking region was placed at the pulmonary truncus, and the trigger was adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent was injected at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reached the pre-adjusted level, the exam was performed from the supraclavicular region to the diaphragms. Contrast injection was performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.
B. Method
The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.
The data set consists of 250 2D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Because of the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, the image is first thresholded to retain the parts brighter than 700 HU. At the end of thresholding, the new images are binary (logical):
Thresh = image > 700
Each of these new images still contains sub-segmental vessels within the lung region. The second step removes these vessels: each 2D image is considered one by one, and the components in the image are labeled with a connected-component labeling algorithm. Then, based on the size of each labeled piece, items with fewer than 1000 pixels are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled again with the connected-component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component is kept and the other parts are removed from the image. The image is then inverted, so that all 0s become 1s and all 1s become 0s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach the image border (rows or columns 1 or 512) and are logical 1, the components satisfying this condition are removed, and the lungs and airways appear as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared to the lungs, each image is labeled with the connected-component labeling algorithm and the components with fewer than 1000 pixels are identified as airways and removed from the image. The resulting image is the segmented target lung. Before the airways are removed, the edges of the image are found with the Sobel algorithm and added to the original image, so that the boundaries of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
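The steps above can be sketched end to end with NumPy/SciPy. This is a hedged re-implementation for illustration, not the authors' code; `hu_threshold` and `min_size` follow the values quoted in the text.

```python
import numpy as np
from scipy import ndimage

def segment_lung(slice_2d, hu_threshold=700, min_size=1000):
    """Sketch of the described pipeline on a single 2D slice."""
    # 1) Threshold: body pixels are brighter than the threshold
    body_mask = slice_2d > hu_threshold
    # 2) Remove small bright components (sub-segmental vessels)
    lbl, n = ndimage.label(body_mask)
    if n == 0:
        return np.zeros_like(body_mask)
    sizes = ndimage.sum(body_mask, lbl, range(1, n + 1))
    body_mask = np.isin(lbl, 1 + np.flatnonzero(sizes >= min_size))
    # 3) Keep the largest remaining component (the body) and invert it
    lbl, n = ndimage.label(body_mask)
    if n == 0:
        return np.zeros_like(body_mask)
    sizes = ndimage.sum(body_mask, lbl, range(1, n + 1))
    inverted = lbl != (1 + int(np.argmax(sizes)))
    # 4) Drop inverted components touching the image border (air outside body)
    lbl, n = ndimage.label(inverted)
    border = np.unique(np.concatenate([lbl[0], lbl[-1], lbl[:, 0], lbl[:, -1]]))
    lungs = inverted & ~np.isin(lbl, border[border > 0])
    # 5) Remove small remaining components (airways)
    lbl, n = ndimage.label(lungs)
    if n == 0:
        return lungs
    sizes = ndimage.sum(lungs, lbl, range(1, n + 1))
    return np.isin(lbl, 1 + np.flatnonzero(sizes >= min_size))
```

On a synthetic slice with a bright body, two dark lung regions, and a tiny airway hole, the function keeps only the two lung regions.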
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization that furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram that represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data.
Figure 3b - Area Graph of the X-ray CT brain scan, which displays the elements in a variable as one or more curves and fills the area beneath each curve.
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or by two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The "image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string representing a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
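Outside MATLAB, the same responses can be computed with SciPy. The sketch below uses a hypothetical 3-point moving-average filter (its coefficients are an illustrative assumption, not a filter from the text):

```python
import numpy as np
from scipy import signal

# Hypothetical filter: 3-point moving average, numerator b and denominator a
b = np.ones(3) / 3.0
a = np.array([1.0])

# Magnitude and phase response (what FVTool's main view displays)
w, h = signal.freqz(b, a, worN=512)
magnitude = np.abs(h)
phase = np.unwrap(np.angle(h))

# Group delay response, also available in FVTool
w_gd, gd = signal.group_delay((b, a), w=512)
```

For this linear-phase length-3 FIR filter, the DC gain is 1 and the group delay is 1 sample.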
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications exist, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array, where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points, respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)    (4.1)

The centroid of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with automatic initialization (discussed in Section 4.4).
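These definitions translate directly into code. In the sketch below, shapes are held as (n, 2) arrays of points rather than 2n-vectors, purely for readability; averaging the point distances in `shape_distance` is one reasonable convention (summing them is another).

```python
import numpy as np

def point_distance(p, q):
    """Euclidean distance between two points (Equation 4.1)."""
    return np.hypot(q[0] - p[0], q[1] - p[1])

def shape_distance(x1, x2):
    """Distance between two shapes: mean distance between corresponding points."""
    return np.mean(np.linalg.norm(x1 - x2, axis=1))

def centroid(x):
    """Centroid: mean of the point positions."""
    return x.mean(axis=0)

def shape_size(x):
    """Size: root mean square distance of the points from the centroid."""
    return np.sqrt(np.mean(np.sum((x - centroid(x)) ** 2, axis=1)))
```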
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and mean shape
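Algorithm 1 can be sketched as follows, using a standard orthogonal-Procrustes similarity fit for the "align to the mean shape" step. The exact alignment transform and size normalization used in the thesis may differ; shapes here are (n, 2) arrays and reflections are not handled.

```python
import numpy as np

def align_to(shape, target):
    """Least-squares rotation and scale of a centered shape onto a target
    (orthogonal Procrustes; reflections not handled in this sketch)."""
    u, sig, vt = np.linalg.svd(shape.T @ target)
    rotation = u @ vt
    scale = sig.sum() / np.sum(shape ** 2)
    return scale * shape @ rotation

def align_shapes(shapes, n_iter=50, tol=1e-8):
    """Algorithm 1: align a set of shapes and compute their mean shape."""
    shapes = [s - s.mean(axis=0) for s in shapes]        # 2. center on origin
    x0 = shapes[0] / np.linalg.norm(shapes[0])           # 3. unit-size reference
    mean = x0
    for _ in range(n_iter):                              # 4. repeat:
        aligned = [align_to(s, mean) for s in shapes]    #    (a) align to mean
        new_mean = np.mean(aligned, axis=0)              #    (b) recompute mean
        new_mean = align_to(new_mean, x0)                #    (c) constrain to x0
        new_mean /= np.linalg.norm(new_mean)             #        and unit size
        if np.linalg.norm(new_mean - mean) < tol:        # 5. until convergence
            mean = new_mean
            break
        mean = new_mean
    return aligned, mean
```

Two copies of the same shape that differ only by translation, rotation, and scale align exactly onto one another.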
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is formulated [24]:

x_hat = x_bar + Phi * b    (4.3)

where
x_hat is the shape vector generated by the model,
x_bar is the mean shape, the average of the aligned training shapes xi,
Phi is the matrix of eigenvectors of the covariance of the training shapes, and
b is the vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
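A hedged sketch of building the model and generating shapes from Equation 4.3 (the function names and eigendecomposition details are illustrative; shapes are 2n-vectors with the x co-ordinates followed by the y co-ordinates, as in the text):

```python
import numpy as np

def build_shape_model(aligned_shapes):
    """PCA on aligned training shapes: returns the mean shape x_bar and the
    eigenvector matrix Phi (columns sorted by decreasing eigenvalue)."""
    X = np.asarray(aligned_shapes, dtype=float)
    x_bar = X.mean(axis=0)
    cov = np.cov(X - x_bar, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    return x_bar, evecs[:, order]

def generate_shape(x_bar, phi, b):
    """x_hat = x_bar + Phi b: vary the parameter vector b to get new shapes."""
    b = np.asarray(b, dtype=float)
    return x_bar + phi[:, :len(b)] @ b
```

Setting b = 0 reproduces the mean shape exactly.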
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The profiles perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile g_bar and covariance matrix Sg.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile g_bar is calculated using the Mahalanobis distance:

f(g) = (g - g_bar)^T Sg^-1 (g - g_bar)
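The Mahalanobis profile distance can be sketched as follows (this is the standard form of the ASM profile fit function; the exact expression in the thesis may include additional normalization):

```python
import numpy as np

def mahalanobis(g, g_bar, S_g):
    """f(g) = (g - g_bar)^T S_g^{-1} (g - g_bar)."""
    d = np.asarray(g, dtype=float) - np.asarray(g_bar, dtype=float)
    # Solve S_g x = d instead of explicitly inverting the covariance matrix
    return float(d @ np.linalg.solve(S_g, d))
```

With S_g equal to the identity, this reduces to the squared Euclidean distance.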
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid, with the sizes of the images given relative to the first image (a general picture, not a bone, is used).
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts where the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, i.e., started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone, since it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.
• Thresholding
If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a, as follows:

if B(i, j) ≥ a, then B(i, j) = 1 (object); else B(i, j) = 0 (background), for all i, j over the image B [YGV].

This can be repeated for each region, dividing them by a threshold value, which results in four regions, and so on. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions that correctly lie within the specified limits but regionally do not belong to the selected region; such pixels could, for instance, appear from noise. The simplest choice of threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.
The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image in each of the two segmented regions, and calculate a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

and repeat until the threshold value no longer changes. Finally, choose this value for the threshold segmentation.
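The isodata iteration above can be sketched as follows (the stopping tolerance and the choice of the image mean as the initial threshold are assumptions):

```python
import numpy as np

def isodata_threshold(image, tol=0.5):
    """Iterate t = (mean of region below t + mean of region above t) / 2."""
    img = np.asarray(image, dtype=float).ravel()
    t = img.mean()                      # temporary initial threshold
    while True:
        below, above = img[img <= t], img[img > t]
        if len(below) == 0 or len(above) == 0:
            return t                    # degenerate split: stop
        t_new = 0.5 * (below.mean() + above.mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
```

On a bimodal image with values 10 and 200, the threshold settles midway between the two region means.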
To implement the triangle algorithm, construct a histogram of intensity vs. number of pixels, as in Figure 2.1. Draw a line between the maximum value of the histogram, hmax, and the minimum value, hmin, and calculate the distance d between the line and the histogram. Increase hmin and repeat for all h until h = hmax. The threshold value is the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek produce only a weak peak.
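A sketch of the triangle method on a histogram (choosing the longer tail from the peak as the line's far end is a common convention and an assumption here):

```python
import numpy as np

def triangle_threshold(hist):
    """Pick the bin whose count lies furthest from the line joining the
    histogram peak to the far end of its non-empty range."""
    hist = np.asarray(hist, dtype=float)
    peak = int(hist.argmax())
    nonzero = np.flatnonzero(hist)
    # choose the longer tail from the peak (an assumption)
    end = nonzero[-1] if (nonzero[-1] - peak) >= (peak - nonzero[0]) else nonzero[0]
    x1, y1, x2, y2 = peak, hist[peak], end, hist[end]
    lo, hi = sorted((peak, end))
    xs = np.arange(lo, hi + 1)
    # perpendicular distance from each histogram point to the peak-end line
    num = np.abs((y2 - y1) * xs - (x2 - x1) * hist[lo:hi + 1] + x2 * y1 - y2 * x1)
    d = num / np.hypot(y2 - y1, x2 - x1)
    return int(xs[d.argmax()])
```

For a histogram with a sharp peak and a decaying tail, the threshold lands in the "elbow" of the tail.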
• Boundary tracking

Edge-finding by gradients is the method of selecting a boundary manually and automatically following its gradient until returning to the same point [YGV]. Returning to the same point can be a major problem with this method. Boundary tracking will wrongly include all interior holes in the region, and will run into problems if the gradient specifying the boundary varies or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This excludes some wrongly included pixels compared to the threshold method alone.
The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem where the Laplacian is d^2/dx^2, and assume the boundary is blurred, so that the gradient has a shape like that in Figure 2.2. The Laplacian then changes sign just around the assumed edge at position 0. For noisy images, the noise produces large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
• Clustering Methods

Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector, but
2.1.2 Feature Extraction
Feature extraction is the process of reducing the segmented image to a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen so that they are a good representation of the image and encapsulate the necessary information. Examples of features are image properties like the mean, standard deviation, gradient, and edges. Generally, a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can be assigned weights to signify their importance. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3. Weights generally range from 0 to 1 and define how important the features are. These features and their respective weights are then used on a test image to extract the relevant information.
To classify a bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.
2.1.3 Classifiers and Pattern Recognition
After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, features such as the neck-shaft angle in a femur X-ray image need to be plotted; patterns can be recognized if the neck-shaft angles of healthy femurs differ from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF, and IGD that gave the best overall performance. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a direction very different from that of a fractured long-bone X-ray; by observing this, the bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.
2.1.4 Thresholding and Error Classification
Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or less than the threshold value; multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
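Otsu's method can be sketched as follows: for every candidate split of the histogram, compute the between-class variance of the two resulting pixel groups and keep the split that maximizes it (bin count and the histogram-edge threshold convention are implementation choices here):

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Otsu's method: pick the threshold maximizing between-class variance."""
    img = np.asarray(image, dtype=float).ravel()
    hist, edges = np.histogram(img, bins=nbins)
    p = hist / hist.sum()                         # bin probabilities
    centers = 0.5 * (edges[:-1] + edges[1:])
    best_t, best_var = edges[1], -1.0
    for k in range(1, nbins):                     # candidate split points
        w0, w1 = p[:k].sum(), p[k:].sum()         # class weights
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (p[:k] * centers[:k]).sum() / w0    # class means
        mu1 = (p[k:] * centers[k:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, edges[k]
    return best_t
```

On an image with two well-separated intensity groups, the returned threshold separates them cleanly.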
Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, which has the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.
Figure 2.3: Histogram of an image [23]: (a) the original image; (b) the histogram of the image.
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. A wide variety of techniques exists for improving image quality: contrast stretch, density slicing, edge enhancement and spatial filtering are the more commonly used ones. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes available.
Contrast
Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas that initially had only a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white; the remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image become clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
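The linear mapping described above can be sketched in a few lines of NumPy; the 0-255 output range is an assumption for an 8-bit display:

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Map [band.min(), band.max()] linearly onto [out_min, out_max]."""
    band = band.astype(float)
    lo, hi = band.min(), band.max()
    if hi == lo:                       # flat image: nothing to stretch
        return np.full_like(band, float(out_min))
    return (band - lo) / (hi - lo) * (out_max - out_min) + out_min
```

The darkest input pixel becomes 0, the brightest becomes 255, and every other value is placed proportionally in between.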
Non-Linear Contrast Enhancement
In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearly similar classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data with an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density, which is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is smaller than the number of grey levels in the original image.
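Histogram equalization amounts to mapping each pixel through the normalized cumulative distribution of the image's intensities; a compact NumPy sketch (output scaled to [0, 1] rather than grey levels, for simplicity):

```python
import numpy as np

def equalize(image, nbins=256):
    """Histogram equalization via the normalized cumulative distribution."""
    hist, bin_edges = np.histogram(image.ravel(), bins=nbins)
    cdf = hist.cumsum().astype(float)
    cdf /= cdf[-1]                     # normalize the CDF to [0, 1]
    # Map each pixel through the CDF value of its histogram bin.
    return np.interp(image.ravel(), bin_edges[:-1], cdf).reshape(image.shape)
```

Because the CDF is monotone, the relative ordering of pixel intensities is preserved while their distribution is flattened.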
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area of an image, it is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, it is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7 or 9x9. This simple smoothing operation will, however, blur the image, especially at the edges of objects, and blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median and mode filters.
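A sketch of the mean (low-pass) filter with the first border strategy mentioned above, replicating border pixels so the output keeps the original image size:

```python
import numpy as np

def mean_filter(image, k=3):
    """k x k mean (low-pass) filter; borders are extended by replicating
    the border pixel values, so the output keeps the input's size."""
    pad = k // 2
    padded = np.pad(image.astype(float), pad, mode="edge")  # repeat borders
    out = np.zeros_like(image, dtype=float)
    # Sum the k*k shifted copies of the image, then divide by the count.
    for dr in range(k):
        for dc in range(k):
            out += padded[dr:dr + image.shape[0], dc:dc + image.shape[1]]
    return out / (k * k)
```

A median or mode filter would replace the sum-and-divide with `np.median` or a mode computation over the same neighbourhood.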
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or non-linear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.
The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Physiological research on human vision suggests that we see objects in much the same way, so the output of this operation has a more natural look than many other edge-enhanced images.
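The discrete 4-neighbour Laplacian can be written directly as a difference of a pixel against its neighbours; a minimal sketch (borders are left at zero here for simplicity):

```python
import numpy as np

def laplacian(image):
    """4-neighbour Laplacian: highlights points, lines and edges and
    suppresses uniform and smoothly varying regions."""
    f = image.astype(float)
    out = np.zeros_like(f)
    # Sum of the four neighbours minus four times the centre pixel.
    out[1:-1, 1:-1] = (f[:-2, 1:-1] + f[2:, 1:-1] +
                       f[1:-1, :-2] + f[1:-1, 2:] - 4 * f[1:-1, 1:-1])
    return out
```

On a constant region the response is zero; an isolated bright point produces a strong negative centre response ringed by positive values.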
Band Ratioing
Sometimes differences in the brightness values of identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in the sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
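The illumination-cancelling effect of a band ratio can be seen in a tiny sketch: a multiplicative shading factor shared by both bands divides out, so a sunlit and a shadowed pixel of the same material get (nearly) the same ratio. The reflectance numbers below are illustrative only:

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Per-pixel ratio of two co-registered bands; eps avoids division by
    zero. Multiplicative illumination shared by both bands cancels out."""
    return band_a.astype(float) / (band_b.astype(float) + eps)
```

For example, a material with reflectances (0.8, 0.4) in two bands appears as (0.4, 0.2) under half illumination, but both pixels ratio to approximately 2.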
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems, and this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour/Snake Models, texture analysis, edge detection and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], bone fracture detection is not as well researched and published as other areas in medical imaging. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of the femur and separate it from the rest of the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; it tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding) based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms; they include methods like edge detection, region extraction and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used by [31] to extract femur contours in X-ray images after doing edge detection on the image with a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24] in that the same model is used for a different application.
[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained by the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.
The working mechanisms of the methods discussed above are explained in detail in the sections that follow.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no fully automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays: if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The trade-off between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time and error are studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks or kernels, one for the horizontal direction and one for the vertical direction, which approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
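The description above can be sketched directly: convolve the image with the two Sobel kernels and combine the results into magnitude and direction (a plain NumPy version using valid-mode convolution, so the output is slightly smaller than the input):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T
# Prewitt uses the same pattern with a centre weight of 1 instead of 2.

def convolve2d(image, kernel):
    """Valid-mode 2D convolution (kernel flipped, as in true convolution)."""
    k = np.flipud(np.fliplr(kernel))
    h, w = k.shape
    out = np.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + h, c:c + w] * k)
    return out

def sobel(image):
    """Gradient magnitude and direction from the two Sobel derivatives."""
    dx = convolve2d(image.astype(float), SOBEL_X)
    dy = convolve2d(image.astype(float), SOBEL_Y)
    return np.hypot(dx, dy), np.arctan2(dy, dx)
```

On a constant region both derivatives vanish, giving zero magnitude; across a step edge the magnitude is large and the direction points across the edge.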
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image, but the convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel while calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude of the difference between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases when the images are noisy, but the method is still used because it is simple, easy to implement and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because at the beginning of the process the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on any single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
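The double-threshold step described above can be sketched on its own. This version iterates the neighbour check, so weak pixels connected to a strong pixel through other weak pixels also survive, which is how the full Canny hysteresis behaves:

```python
import numpy as np

def hysteresis(grad, low, high):
    """Double-threshold hysteresis on a gradient-magnitude image: strong
    pixels (> high) are kept; weak pixels (>= low) survive only if connected
    to a strong pixel via 8-neighbourhood adjacency."""
    strong = grad > high
    weak = (grad >= low) & ~strong
    keep = strong.copy()
    changed = True
    while changed:                       # propagate through weak pixels
        grown = np.zeros_like(keep)
        # Dilate `keep` by one pixel in all 8 directions.
        grown[1:, :] |= keep[:-1, :]
        grown[:-1, :] |= keep[1:, :]
        grown[:, 1:] |= keep[:, :-1]
        grown[:, :-1] |= keep[:, 1:]
        grown[1:, 1:] |= keep[:-1, :-1]
        grown[:-1, :-1] |= keep[1:, 1:]
        grown[1:, :-1] |= keep[:-1, 1:]
        grown[:-1, 1:] |= keep[1:, :-1]
        new = keep | (grown & weak)
        changed = bool((new != keep).any())
        keep = new
    return keep.astype(np.uint8)
```

An isolated weak pixel with no strong neighbour is dropped, while a chain of weak pixels touching a strong one is retained in full.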
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of an image to analyze it: it quantifies the visual or other simple characteristics of the image so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
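A minimal NumPy sketch of range filtering (the local max-minus-min over a neighbourhood, analogous to MATLAB's `rangefilt`); a standard deviation filter would replace the range with `std` over the same windows:

```python
import numpy as np

def range_filter(image, k=3):
    """Local range (max - min) over a k x k neighbourhood: a simple texture
    measure. Textured regions score high; smooth regions score near zero."""
    pad = k // 2
    p = np.pad(image.astype(float), pad, mode="edge")
    H, W = image.shape
    # Stack the k*k shifted views, then reduce with max and min.
    stack = np.stack([p[dr:dr + H, dc:dc + W]
                      for dr in range(k) for dc in range(k)])
    return stack.max(axis=0) - stack.min(axis=0)
```

A flat region gives 0 everywhere, while a checkerboard pattern gives the full contrast at every pixel.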
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e. straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)
where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g. the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys and cliffs can be detected over a range of viewpoints, scales and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues of a 2x2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)    (2)
or
P(x) = min(λ2(x), 0)    (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13]
and other detectors principal curvature images are calculated in scale space We first double
the size of the original image to produce our initial image I11
and then produce increasingly Gaussian smoothed images I1j with scales of σ = kjminus1 where
k = 213 and j = 26 This set of images spans the first octave consisting of six images I11 to
I16 Image I14 is down sampled to half its size to produce image I21 which becomes the first
image in the second octave We apply the same smoothing process to build the second
octave and continue to create a total of n = log2(min(w h)) minus 3 octaves where w and h are
the width and height of the doubled image respectively Finally we calculate a principal
curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq
2) of the Hessian matrix at each pixel For computational efficiency each smoothed image
and its corresponding Hessian image is computed from the previous smoothed image using
an incremental Gaussian scale Given the principal curvature scale space images we calculate
the maximum curvature over each set of three consecutive principal curvature images to form
the following set of four images in each of the n octaves
MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
MPn2 MPn3 MPn4 MPn5 (4)
where MPij =max(Pijminus1 Pij Pij+1)
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
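The per-pixel eigenvalue computation of Eq. 2 can be sketched in closed form for the symmetric 2x2 Hessian. Note this simplification uses plain finite differences via `np.gradient` rather than the Gaussian-scale derivatives the detector actually uses:

```python
import numpy as np

def principal_curvature(image):
    """Maximum eigenvalue of the 2x2 Hessian at every pixel, clamped at
    zero as in Eq. 2. Derivatives here are plain finite differences; the
    real detector computes them at a Gaussian scale sigma_D."""
    f = image.astype(float)
    fy, fx = np.gradient(f)          # first-order partials (rows, cols)
    fyy, fyx = np.gradient(fy)       # second-order partials
    fxy, fxx = np.gradient(fx)
    # Closed-form eigenvalues of [[fxx, fxy], [fxy, fyy]].
    half_trace = (fxx + fyy) / 2
    root = np.sqrt(((fxx - fyy) / 2) ** 2 + fxy ** 2)
    lam1 = half_trace + root         # maximum eigenvalue
    return np.maximum(lam1, 0)       # Eq. 2: keep positive responses only
```

A dark line on a light background produces a strong positive response along the line (the intensity surface has a valley there), while flat regions respond with zero.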
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; as a consequence, small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5x5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e. the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue
magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
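The "pothole-filling" effect of the grayscale closing can be sketched with max/min filters. This simplification uses a 3x3 square structuring element rather than the 5x5 disk used by the detector:

```python
import numpy as np

def _window_stack(f, k):
    """All k*k shifted views of f with edge-replicated borders."""
    pad = k // 2
    p = np.pad(f, pad, mode="edge")
    H, W = f.shape
    return np.stack([p[dr:dr + H, dc:dc + W]
                     for dr in range(k) for dc in range(k)])

def grey_dilate(f, k=3):
    """Grayscale dilation: local maximum over a k x k square window."""
    return _window_stack(f.astype(float), k).max(axis=0)

def grey_erode(f, k=3):
    """Grayscale erosion: local minimum over a k x k square window."""
    return _window_stack(f.astype(float), k).min(axis=0)

def grey_close(f, k=3):
    """Closing (dilation then erosion): fills small dark 'potholes' that
    would otherwise become spurious watershed catchment basins."""
    return grey_erode(grey_dilate(f, k), k)
```

A single-pixel dark dip in an otherwise flat plateau is filled completely, and the closing never decreases any pixel value.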
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
3.1 Spatially coherent segmentation
We improve the layer segmentation by combining k-means with a spatial coherence regularity in an iterative E-M fashion.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]    (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, T is the delta function, and λ weights the smoothness term. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After the spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
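The E-M loop described above can be sketched as follows. This is a simplified, illustrative reading: the exact E-step uses graph cuts, which is approximated here by iterated conditional modes (ICM), a greedy per-pixel relabelling, and the edge-length weights are taken as unit (a pure Potts penalty). All names are hypothetical.

```python
import numpy as np

def em_segment(feat, k, lam=1.0, iters=5):
    """Approximate E-M layer segmentation. feat: H x W x C chromatic
    features. E-step: ICM with data cost + Potts smoothness over
    4-neighbours (stand-in for the exact graph-cut solve).
    M-step: re-estimate each model as the mean chromatic vector."""
    h, w, c = feat.shape
    flat = feat.reshape(-1, c)
    # deterministic initial models: k centers spread over the feature range
    centers = np.linspace(flat.min(axis=0), flat.max(axis=0), k)
    labels = np.argmin(((feat[:, :, None, :] - centers) ** 2).sum(-1), axis=-1)
    for _ in range(iters):
        # E-step (approximate): relabel each pixel greedily
        for y in range(h):
            for x in range(w):
                data = ((feat[y, x] - centers) ** 2).sum(-1)
                smooth = np.zeros(k)
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        smooth += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = int(np.argmin(data + smooth))
        # M-step: mean chromatic vector per cluster
        for i in range(k):
            if (labels == i).any():
                centers[i] = feat[labels == i].mean(axis=0)
    return labels, centers
```

On a toy two-color image, the loop converges to the two chromatic layers in a single pass; the smoothness weight lam plays the role of λ in Eq. (1).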
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.,7 which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense (Fig. 4 shows a possible reconstruction of the level line with intensity 100). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and instructed in breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice was taken at the level of the pulmonary truncus. A bolus-tracking region was placed at the pulmonary truncus and the trigger was adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 ml/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), was used. When opacification reached the pre-adjusted level, the exam was performed from the supraclavicular region to the diaphragms. Contrast injection was performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in the coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.
B Method
The stages followed for lung segmentation from CTA images in this work are shown in Figure 1. The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to keep the parts brighter than 700 HU. At the end of thresholding, the new images are logical (binary):
Thresh = image > 700
In each of these new images, subsegmental vessels exist in the lung region. In the second step, the following method was used to remove these vessels: first, each 2D image was considered one by one, and each component in the image was labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts were under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 was labeled with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component was kept and the other parts were removed from the image. Then the complement of the image was taken, so that all "0"s turn into "1" and all "1"s turn into "0" (Figure 4).
Since the parts outside the body in the image shown in Figure 4 touch the 1st or 512th pixel (the image border), the components satisfying this condition were removed, leaving the lung and airway as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image was labeled with the connected component labeling algorithm, and the components with fewer than 1000 pixels were identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image was obtained (Figure 6(c)).
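The thresholding and small-component-removal steps above can be sketched as follows. This is a simplified stand-in: a minimal 4-connected labelling routine replaces the toolbox's connected component labelling, and the usage example uses a toy image with a reduced size threshold instead of the 1000-pixel cut-off.

```python
import numpy as np
from collections import deque

def label_components(mask):
    """Minimal 4-connected component labelling (a stand-in for the
    'connected component labelling algorithm' used in the text)."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        count += 1
        labels[sy, sx] = count
        q = deque([(sy, sx)])
        while q:
            y, x = q.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    q.append((ny, nx))
    return labels, count

def remove_small(mask, min_pixels):
    """Drop every connected component smaller than min_pixels, as done
    for the subsegmental vessels (threshold 1000 in the text)."""
    labels, n = label_components(mask)
    out = np.zeros_like(mask)
    for i in range(1, n + 1):
        comp = labels == i
        if comp.sum() >= min_pixels:
            out |= comp
    return out

def threshold_slice(image, hu=700):
    """Step 1 of the method: binarize the CTA slice at 700 HU."""
    return image > hu
```

On a toy slice, thresholding yields one large body component and one stray bright pixel; pruning removes the stray while keeping the body intact.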
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes a pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are shown in extreme close-up in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image from the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram that represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as the mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object, or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
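numpy's meshgrid mirrors the MATLAB function described above; a small illustration of the row/column structure:

```python
import numpy as np

# Expand the domain given by vectors x and y into matrices X and Y,
# then evaluate a function of two variables on the resulting grid.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # rows of X are copies of x; columns of Y are copies of y
Z = X + Y                  # f(x, y) = x + y evaluated at every grid point
```

Z[i, j] holds f(x[j], y[i]), which is exactly the layout the surface-plot functions expect.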
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns represent their x and y coordinates, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector in which the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).
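The centroid and size definitions above can be written directly for the 2n × 1 shape vector used in this thesis; a small sketch (function names are illustrative):

```python
import numpy as np

def centroid(shape):
    """Centroid of a shape stored as a 2n-vector (x coords, then y coords):
    the mean of the point positions."""
    n = len(shape) // 2
    return np.array([shape[:n].mean(), shape[n:].mean()])

def shape_size(shape):
    """Size of the shape: root mean square distance of the points from
    the centroid."""
    n = len(shape) // 2
    pts = np.stack([shape[:n], shape[n:]], axis=1)
    return np.sqrt(((pts - centroid(shape)) ** 2).sum(axis=1).mean())
```

For the unit square, the centroid is (0.5, 0.5) and the size is √0.5, the RMS distance from the centre to the corners.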
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and the mean shape
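Algorithm 1 can be sketched as follows, with shapes held as n × 2 point arrays for readability. The per-shape alignment in step 4(a) is done here by a least-squares similarity fit (scale plus rotation); the thesis does not pin down the alignment method, so treat this as one standard, illustrative choice.

```python
import numpy as np

def align_to(shape, ref):
    """Least-squares similarity alignment (scale + rotation) of one
    centred shape onto a reference; shapes are n x 2 point arrays."""
    denom = (shape ** 2).sum()
    a = (shape * ref).sum() / denom
    b = (shape[:, 0] * ref[:, 1] - shape[:, 1] * ref[:, 0]).sum() / denom
    rot = np.array([[a, -b], [b, a]])
    return shape @ rot.T

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1: centre the shapes, scale the reference to
    unit size, then alternate aligning to the mean and re-estimating the
    mean, constraining the mean back to x0 each round."""
    shapes = [s - s.mean(axis=0) for s in shapes]          # step 2
    x0 = shapes[0] / np.linalg.norm(shapes[0])             # steps 1 and 3
    mean = x0
    for _ in range(iters):                                 # step 4
        shapes = [align_to(s, mean) for s in shapes]       # 4(a)
        mean = np.mean(shapes, axis=0)                     # 4(b)
        mean = align_to(mean, x0)                          # 4(c)
        mean = mean / np.linalg.norm(mean)
    return shapes, mean
```

Two shapes that differ only by translation, rotation, and scale are mapped onto the same aligned shape, and the mean stays at unit size by construction.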
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training was done by manually selecting landmarks on the images. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So while the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape, together with its permissible variations, is formulated [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
Φ is the matrix of eigenvectors of the shape covariance, and
b is the vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
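The shape model and the generation of new shapes from it can be sketched with PCA on the aligned training shape vectors; the helper names are illustrative, and the eigenvector matrix plays the role of Φ in Equation 4.3:

```python
import numpy as np

def build_shape_model(shapes):
    """PCA shape model. shapes: m x 2n matrix of aligned shape vectors.
    Returns the mean shape, the eigenvector matrix Phi (columns sorted
    by decreasing eigenvalue), and the eigenvalues."""
    mean = shapes.mean(axis=0)
    cov = np.cov(shapes - mean, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]
    return mean, vecs[:, order], vals[order]

def generate_shape(mean, phi, b):
    """x_hat = mean + Phi @ b, with b the shape parameter vector
    (only the first len(b) modes are used)."""
    return mean + phi[:, :len(b)] @ b
```

Setting b = 0 reproduces the mean shape; varying one component of b sweeps the shape along the corresponding mode of variation.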
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called "whiskers", and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix S_g.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)
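The Mahalanobis profile distance can be computed directly; a small sketch, assuming the standard quadratic form (g − ḡ)ᵀ S_g⁻¹ (g − ḡ) with the training covariance S_g:

```python
import numpy as np

def mahalanobis(g, g_mean, S_g):
    """Squared Mahalanobis distance between a sampled profile g and the
    model's mean profile g_mean, under covariance S_g. Solving the linear
    system avoids forming the explicit inverse of S_g."""
    d = g - g_mean
    return float(d @ np.linalg.solve(S_g, d))
```

With an identity covariance this reduces to the squared Euclidean distance; a larger variance along one profile dimension correspondingly discounts deviations in that dimension.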
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone); the sizes of the images are given relative to the first image.
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels, compared to the threshold method alone.
The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem, where ∇² = ∂²/∂x². Assume the boundary is blurred, so that the gradient has a shape like that in Figure 2.2. The Laplacian will change sign just around the assumed edge, at position x = 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
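A 1D sketch of the zero-crossing procedure as described: smooth the signal (here with a simple moving average standing in for the smoothing filter), take the discrete second derivative, and mark sign changes. The smoothing width is an illustrative choice.

```python
import numpy as np

def zero_crossings(signal, smooth=3):
    """Zero-crossing edge detection in 1D: moving-average smoothing,
    second difference (discrete Laplacian), then sign-change positions."""
    kernel = np.ones(smooth) / smooth
    s = np.convolve(signal, kernel, mode='same')
    lap = np.diff(s, 2)                  # second difference ~ d^2/dx^2
    signs = np.sign(lap)
    # indices where the Laplacian changes sign between neighbours
    return np.nonzero(signs[:-1] * signs[1:] < 0)[0] + 1
```

Applied to a blurred step (a sigmoid), the detector reports a crossing near the centre of the transition; without the smoothing step, noise would scatter spurious crossings everywhere.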
• Clustering Methods: Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector, but
2.1.2 Feature Extraction
Feature extraction is the process of reducing the segmented image to a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen so that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient, and edges. Generally, a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify their relative importance. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may be assigned a weight of 0.3. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information. To classify a bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look specifically at femur fractures. The best parameter values for the features can be found using various techniques.
2.1.3 Classifiers and Pattern Recognition
After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, such as the neck-shaft angle in a femur X-ray image, need to be plotted. Patterns can be recognized if the neck-shaft angles of healthy femurs differ from those of fractured femurs. Classifiers such as Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF, and IGD that gave the best overall performance. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a direction that is very different from the gradient vector of a fractured long-bone X-ray. By observing this, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.
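The gradient-vector idea can be sketched as a 1-nearest-neighbour rule on the mean gradient direction; the `classify` helper and the reference directions below are hypothetical, not the thesis's classifier.

```python
import numpy as np

def mean_gradient_direction(image):
    """Mean gradient direction of the image, in radians."""
    gy, gx = np.gradient(image.astype(float))
    return float(np.arctan2(gy.mean(), gx.mean()))

def classify(image, healthy_dir, fractured_dir):
    """Label an image by whichever reference direction its own mean
    gradient direction is closer to (a 1-nearest-neighbour rule).
    The reference directions would come from labelled training images."""
    d = mean_gradient_direction(image)
    if abs(d - healthy_dir) <= abs(d - fractured_dir):
        return "healthy"
    return "fractured"
```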
2.1.4 Thresholding and Error Classification
Thresholding and error classification form the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce noise in the image, or it can be used to separate sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
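Otsu's method can be sketched as follows; this is a straightforward implementation of the standard between-class-variance formulation for 8-bit images, not code from the thesis.

```python
import numpy as np

def otsu_threshold(image):
    """Otsu's method: pick the grey level that maximizes the
    between-class variance of the two classes it induces."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    total = hist.sum()
    sum_all = float(np.dot(np.arange(256), hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0.0  # intensity mass of that class
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Pixels above the returned threshold are then taken as foreground.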
Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background and foreground intensities may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, with pixel intensities on the horizontal axis and the number of pixels on the vertical axis.
(a) The original image (b) The histogram of the image
Figure 2.3: Histogram of an image [23]
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are most useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. A wide variety of techniques exists for improving image quality. Contrast stretching, density slicing, edge enhancement, and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes available.
Contrast
Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image. The larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
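A minimal sketch of the linear stretch (the function name and the 8-bit output range are assumptions):

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Map the band's minimum to black and its maximum to white,
    distributing the remaining values linearly in between."""
    lo, hi = float(band.min()), float(band.max())
    if hi == lo:  # flat band: nothing to stretch
        return np.full_like(band, out_min, dtype=np.uint8)
    scaled = (band.astype(float) - lo) / (hi - lo)
    return (out_min + scaled * (out_max - out_min)).round().astype(np.uint8)
```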
Non-Linear Contrast Enhancement
In these methods, the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between closely related classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse-log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values; thus, the number of grey levels in the enhanced image is less than in the original image.
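The two non-linear enhancements above can be sketched as follows; 8-bit input is assumed and the helper names are illustrative.

```python
import numpy as np

def log_stretch(band):
    """Logarithmic scaling: expands dark values, compresses bright ones."""
    out = 255.0 * np.log1p(band.astype(float)) / np.log1p(255.0)
    return out.round().astype(np.uint8)

def histogram_equalize(band):
    """Map grey levels through the normalized cumulative histogram so the
    output population density is approximately uniform."""
    hist, _ = np.histogram(band, bins=256, range=(0, 256))
    cdf = hist.cumsum().astype(float)
    cdf = 255.0 * cdf / cdf[-1]
    return cdf[band].round().astype(np.uint8)
```

Inverting the log (np.expm1 in place of np.log1p, with matching normalization) gives the reversed stretch that enhances the bright end instead.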
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass, band-pass, and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7, or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
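A sketch of the simple mean filter, using border-pixel replication (option 1 above) so the output keeps the input size:

```python
import numpy as np

def low_pass_mean(image, n=3):
    """n x n mean (low-pass) filter. The image is extended beyond its
    border by repeating the border pixel values, so the output has the
    same size as the input."""
    pad = n // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    # Sum the n*n shifted copies of the image, then divide once.
    for dy in range(n):
        for dx in range(n):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (n * n)
```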
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or non-linear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.
The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way; hence, the result of this operation has a more natural look than many other edge-enhanced images.
Band Ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
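A band ratio is a per-pixel division of two co-registered bands; the small epsilon guard below is an implementation assumption.

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Per-pixel ratio of two co-registered bands. Slope and shadow scale
    both bands by roughly the same factor, so the ratio largely cancels
    illumination differences; eps guards against division by zero."""
    return band_a.astype(float) / (band_b.astype(float) + eps)
```

A sunlit pixel (100, 50) and the same material in shadow (50, 25) give nearly identical ratios, which is the cancellation effect described above.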
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods, discussed in this thesis, that researchers have used to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques such as Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant to the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], bone fracture detection is not well researched and published compared to other areas in medical imaging. Research has been done at the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs in order to separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in Section 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks the shape of the femur down into a few parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods such as edge detection, region extraction, and deformable models (discussed in Section 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used by [31] to extract femur contours in X-ray images after performing edge detection on the image with a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application. [18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.
The working mechanisms of the methods discussed above are explained in detail in the following sections.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time, and error are studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods such as ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or set of pixels, that form the edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks, or kernels, one for the horizontal direction and the other for the vertical direction, which approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
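A sketch of the Sobel operator with the standard kernels written out explicitly; the plain "valid" convolution below (which shrinks the output by two rows and columns) is an implementation choice.

```python
import numpy as np

# Sobel kernels: the centre weight of 2 is what distinguishes
# Sobel from Prewitt (which uses 1 there).
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def convolve2d(image, kernel):
    """Plain 'valid' 2-D convolution (kernel flipped, so this is true
    convolution); the output is two rows and columns smaller."""
    k = np.flipud(np.fliplr(kernel))
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = (image[y:y + 3, x:x + 3] * k).sum()
    return out

def sobel(image):
    """Gradient magnitude and direction from the two Sobel derivatives."""
    dx = convolve2d(image.astype(float), KX)
    dy = convolve2d(image.astype(float), KY)
    return np.hypot(dx, dy), np.arctan2(dy, dx)
```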
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give weighting to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column while Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the two-level example in Section 2.1.4, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise, it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
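The double-threshold linking step can be sketched as follows; the threshold values and the iterative growing loop are illustrative choices, not Canny's exact formulation.

```python
import numpy as np

def hysteresis(grad, low, high):
    """Double thresholding with edge linking: pixels above `high` seed
    edges; pixels between `low` and `high` survive only if connected
    (including diagonally) to a seed."""
    strong = grad > high
    weak = grad > low
    edges = strong.copy()
    h, w = grad.shape
    changed = True
    while changed:
        # Grow the edge set by one 8-connected step (zero-padded shifts).
        padded = np.pad(edges, 1, mode="constant")
        grown = np.zeros_like(edges)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        new_edges = edges | (grown & weak)
        changed = bool((new_edges != edges).any())
        edges = new_edges
    return edges
```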
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of an image to analyze it, by quantifying the visual or other simple characteristics of the image so that it can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
3 Principal Curvature-Based Region Detector
31 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σ_D) = [ I_xx(x, σ_D)   I_xy(x, σ_D) ]
            [ I_xy(x, σ_D)   I_yy(x, σ_D) ]        (1)
where I_xx, I_xy, and I_yy are the second-order partial derivatives of the image evaluated at the point x, and σ_D is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term I_xy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ_1(x), 0)        (2)

or

P(x) = min(λ_2(x), 0)        (3)
where λ_1(x) and λ_2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I_{1,1}, and then produce increasingly Gaussian-smoothed images I_{1,j} with scales of σ = k^(j-1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images I_{1,1} to I_{1,6}. Image I_{1,4} is downsampled to half its size to produce image I_{2,1}, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue until we have created a total of n = log2(min(w, h)) - 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image P_{i,j} for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP_{1,2}  MP_{1,3}  MP_{1,4}  MP_{1,5}
MP_{2,2}  MP_{2,3}  MP_{2,4}  MP_{2,5}
   ...
MP_{n,2}  MP_{n,3}  MP_{n,4}  MP_{n,5}        (4)

where MP_{i,j} = max(P_{i,j-1}, P_{i,j}, P_{i,j+1}).
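The per-pixel principal curvature of Eq. 2 can be sketched as follows; simple finite differences stand in for the Gaussian-derivative Hessian, so this is an illustration rather than the authors' implementation.

```python
import numpy as np

def principal_curvature(image):
    """P(x) = max(lambda_1(x), 0): the maximum eigenvalue of the Hessian
    at each pixel, clamped at zero (Eq. 2). Finite differences stand in
    for the Gaussian derivatives of Eq. 1."""
    img = image.astype(float)
    gy, gx = np.gradient(img)
    ixx = np.gradient(gx, axis=1)
    iyy = np.gradient(gy, axis=0)
    ixy = np.gradient(gx, axis=0)
    # Closed-form eigenvalues of the symmetric 2x2 Hessian.
    half_trace = (ixx + iyy) / 2.0
    disc = np.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    lambda1 = half_trace + disc
    return np.maximum(lambda1, 0.0)
```

Per Eq. 4, the per-scale images would then be combined as, e.g., MP = np.maximum(np.maximum(P_prev, P_cur), P_next).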
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of
influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise
weak region. The red arrows are the major eigenvectors, and the yellow arrows are the minor eigenvectors. To improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
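The grayscale closing f • b = (f ⊕ b) ⊖ b can be sketched with flat structuring elements; a 3×3 square mask is used here for brevity, whereas the paper uses a 5×5 disk.

```python
import numpy as np

def _morph(f, b, reduce_fn):
    """Flat grayscale dilation/erosion of f by a square boolean mask b."""
    pad = b.shape[0] // 2
    padded = np.pad(f.astype(float), pad, mode="edge")
    h, w = f.shape
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            window = padded[y:y + b.shape[0], x:x + b.shape[1]]
            out[y, x] = reduce_fn(window[b])
    return out

def grey_close(f, b):
    """f . b = dilation followed by erosion, which fills small dark
    'potholes' narrower than the structuring element."""
    dilated = _morph(f, b, np.max)
    return _morph(dilated, b, np.min)
```

The watershed transform itself would then be run on the cleaned, binarized result, as the text describes.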
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve
stable region detections. To further improve robustness, we adopt a key idea from MSER and
keep only those regions that can be detected in at least three consecutive scales. Similar to the
process of selecting stable regions via thresholding in MSER, we select regions that are stable
across local scale changes. To achieve this, we compute the overlap error of the detected
regions across each triplet of consecutive scales in every octave. The overlap error is
calculated in the same way as in [19]. Overlapping regions that are detected at different scales
normally exhibit some variation. This variation is valuable for object recognition because it
provides multiple descriptions of the same pattern. An object category normally exhibits
large within-class variation in the same area. Since detectors have difficulty locating the
interest area accurately, rather than attempting to detect the "correct" region and extract a single
descriptor vector, it is better to extract multiple descriptors for several overlapping regions,
provided that these descriptors are handled properly by the classifier.
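The triplet-stability rule above can be sketched as follows. For simplicity regions are represented as sets of pixel coordinates rather than the ellipses of [19], and the 0.3 overlap-error bound is an assumed value, not one stated in the text.

```python
def overlap_error(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two regions given as sets of pixels."""
    a, b = set(a), set(b)
    return 1.0 - len(a & b) / len(a | b)

def stable_regions(scales, max_err=0.3):
    """Keep regions re-detected (within max_err overlap error) in three
    consecutive scales. `scales` is a list of lists of pixel-sets, one
    inner list per scale; the 0.3 bound is an assumption."""
    stable = []
    for i in range(1, len(scales) - 1):
        for r in scales[i]:
            prev_ok = any(overlap_error(r, q) <= max_err for q in scales[i - 1])
            next_ok = any(overlap_error(r, q) <= max_err for q in scales[i + 1])
            if prev_ok and next_ok:
                stable.append((i, r))
    return stable
```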
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is,
where lower-layer strokes are visible. The task of recovering layers of strokes involves
mainly three steps:
1. Partition the image into regions with consistent colors/shapes corresponding to
different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated whenever there are more than two layers remaining.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage
clustering to obtain chromatically consistent regions. Under the assumption that brush
strokes of the same layer of the painting are of similar colors, such regions in different clusters
are good representatives of brush strokes at different layers in the painting, as shown in Fig.
3c. Note that after the clustering step each pixel of the image is assigned a label
corresponding to its assigned cluster, and each label can be described by the mean chromatic
feature vector. Then the top layer is identified by human experts based on visual occlusion
cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus
of our current work. Lastly, the regions of the top layer are removed and inpainted by the
k-nearest-neighbor algorithm.
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence
regularity in an iterative E-M manner.10, 11 We model the appearances of brush strokes of
different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other
words, we assume that each layer is modeled as an independent Gaussian with the same
covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic
vectors, we can refine the segmentation with spatially coherent priors by minimizing the
following energy function (E-step):
min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N}  T[L_p ≠ L_q] / |e_pq|        (1)
where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the
color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function.
The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters
they are assigned to, and the second term penalizes the situation where pixels in a
neighborhood belong to different clusters. By fixing the k appearance models, the
minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under
spatial regularization, the optimal labeling of pixels to the different clusters. After spatially coherent
refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then
iterate the E and M steps until convergence or until a predefined number of iterations is reached.
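The E-M loop above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the E-step uses ICM (simple local relabeling) in place of the graph-cut solver, and the edge lengths |e_pq| are taken as 1, so the pairwise term reduces to a plain Potts penalty. Both are simplifying assumptions.

```python
import numpy as np

def em_segment(feats, k, lam=1.0, iters=5):
    """Iterative segmentation sketch for Eq. (1): ICM E-step (a stand-in
    for graph cuts) plus mean-vector M-step. feats is an (H, W, d) array
    of per-pixel chromatic features."""
    h, w, d = feats.shape
    flat = feats.reshape(-1, d)
    # crude initialization: centers from evenly spaced pixels
    centers = flat[np.linspace(0, len(flat) - 1, k).astype(int)].copy()
    labels = np.argmin(((flat[:, None, :] - centers[None]) ** 2).sum(-1),
                       axis=1).reshape(h, w)
    for _ in range(iters):
        # E-step (ICM): relabel each pixel by data term + Potts smoothness
        for y in range(h):
            for x in range(w):
                cost = ((feats[y, x] - centers) ** 2).sum(-1)
                for c in range(k):
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] != c:
                            cost[c] += lam
                labels[y, x] = int(np.argmin(cost))
        # M-step: re-estimate the k mean chromatic vectors
        for c in range(k):
            if (labels == c).any():
                centers[c] = feats[labels == c].mean(axis=0)
    return labels, centers
```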
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on
reconstructing the geometric structure of (chromatic) intensities, which is usually
represented by level lines.5, 7 Here, level lines can be contours that connect pixels of the
same gray/chromatic intensity in an image. Therefore, such methods are well suited for
inpainting images with no or very few textures, due to the fact that level lines capture
concisely the structure and information of the texture-less regions. For van Gogh's paintings,
the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting
can be superior to exemplar-based methods (for instance, in De-pict and Criminisi et al.3) for
recovering the structures of underlying brush strokes. In this paper we evaluate the recent
method proposed by Schoenemann et al.,7 which formulates curvature-based inpainting as a
linear program. Unlike other methods, this method is independent of initialization and can
handle general inpainting regions, e.g., regions with holes. In the following, we briefly review
Schoenemann et al.'s method in detail. To formulate the problem as a linear program, in this
approach curvature is modeled in a discrete sense (where a possible reconstruction of the
level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain
connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and
line segment pairs that are used to represent level lines, and the basic regions represent the
pixels. Then, for each potential discrete level line, the curvature is approximated by the sum
of angle changes at all vertices along the level line, with proper weighting by the edge length.
To ensure that the regions and the level lines are consistent (for instance, level lines should be
continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary
continuation constraints, are imposed on the variables. Finally, the boundary condition (the
intensities of the boundary pixels) of the damaged region can also be easily formulated as
linear constraints. With proper handling of all these constraints, the inpainting problem can be
solved as a linear program. To handle color images, we simply formulate and solve a linear
program for each chromatic channel independently.
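The curvature approximation used above, a sum of angle changes at the vertices of a discrete level line, can be illustrated on a polyline. The squared-turning-angle cost with half-segment-length weighting below is one common discretization chosen for illustration; the exact weighting in Schoenemann et al.'s formulation may differ.

```python
import math

def polyline_curvature_cost(points):
    """Approximate the curvature term of a discrete level line as the sum
    of (turning angle)^2 at each interior vertex, weighted by half the
    length of the two incident segments. An illustrative discretization,
    not the paper's exact weighting."""
    cost = 0.0
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        # wrap the turning angle into (-pi, pi]
        turn = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi
        w = 0.5 * (math.hypot(x1 - x0, y1 - y0) + math.hypot(x2 - x1, y2 - y1))
        cost += w * turn ** 2
    return cost
```

A straight level line has zero cost; a right-angle turn on a unit grid costs (π/2)², which is why the LP prefers smooth reconstructions.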
II MATERIALS AND METHODS
A. Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek thoracic and cardiovascular surgery
training and research hospital. All pulmonary computed tomographic angiography exams were
performed with a 16-detector CT scanner (Somatom Sensation 16; Siemens AG, Erlangen, Germany).
Patients were informed about the examination and about breath holding.
Imaging was performed with a bolus-tracking program. After the scenogram, a single slice was taken at the
level of the pulmonary truncus. A bolus-tracking ROI was placed at the pulmonary truncus and the trigger was
adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec,
delivered with an automated syringe (Optistat Contrast Delivery System; Liebel-Flarsheim, USA), was
used. When opacification reached the pre-adjusted level, the exam was performed from the supraclavicular region to the
diaphragms. Contrast injection was performed via an 18-20G intravenous cannula placed in the
antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch
1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal
window (WW 300, WL 50) on an advanced workstation (Wizard; Siemens AG, Erlangen,
Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam
consists of 400-500 images with 512x512 resolution.
B. Method
The stages followed while doing lung segmentation from CTA images in this work are shown
in Figure 1.
The CTA images at hand number 250 and are 2D. The first step is thresholding the
image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in
the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding
air (non-body pixels). Due to the large difference in intensity between these two groups,
thresholding leads to a good separation. In this study, thresholding first keeps the
parts brighter than 700 HU. At the end of thresholding, the new
images are logical (binary):
Thresh = image > 700
In each of these new images, subsegmental vessels exist in the lung region. At the second step, the
following method has been used to get rid of these vessels: first, each 2D image has been
considered one by one, and each component in the image has been labeled with a connected-component
labeling algorithm. Then, looking at the size of each labeled piece, items whose
pixel counts are under 1000 were removed from the image (Figure 3).
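The labeling-and-removal step can be sketched as follows. A plain BFS labeling stands in for MATLAB's bwlabel/bwareaopen; this is an illustrative reimplementation, not the study's code.

```python
import numpy as np
from collections import deque

def remove_small_components(binary, min_size=1000):
    """Label foreground 8-connected components and zero out those with
    fewer than min_size pixels (1000 in the text)."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    out = binary.copy()
    next_label = 0
    for y in range(h):
        for x in range(w):
            if binary[y, x] and labels[y, x] == 0:
                next_label += 1
                comp = [(y, x)]
                labels[y, x] = next_label
                q = deque(comp)
                while q:  # flood-fill the component
                    cy, cx = q.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w and
                                    binary[ny, nx] and labels[ny, nx] == 0):
                                labels[ny, nx] = next_label
                                comp.append((ny, nx))
                                q.append((ny, nx))
                if len(comp) < min_size:  # remove small vessels
                    for cy, cx in comp:
                        out[cy, cx] = False
    return out, labels
```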
Next, the image in Figure 3 has been labeled with the connected-component labeling
algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component
has been kept and the other parts have been removed from the image. Then the complement of the
image has been taken, so all "0" pixels turn into "1" and all "1" pixels turn into "0" (Figure 4).
Since the parts outside the body in the image shown in Figure 4 touch the image border
(pixel columns 1 or 512) and are logical 1, the parts that satisfy this condition have been removed,
and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because
the airway in Figure 5 is very small compared to the lung, each image
has been labeled with the connected-component labeling algorithm, and the components whose
pixel counts are below 1000 have been identified as airways and then removed from
the image. The image now at hand is the segmented form of the target lung. Before the airways were
removed, the edges of the image were found with the Sobel algorithm and added to the original
image, so that the edges of the lung and airway region are shown in the original image (Figure
6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original
segmented lung image has been obtained (Figure 6(c)).
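The Sobel edge step above can be sketched as follows; this is a minimal stand-in for MATLAB's edge(img, 'sobel'), and the threshold value is illustrative.

```python
import numpy as np

def sobel_edges(img, thresh):
    """Sobel gradient magnitude thresholded to a binary edge map.
    Border pixels are left 0 for simplicity."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = img[y - 1:y + 2, x - 1:x + 2]
            gx[y, x] = (kx * patch).sum()
            gy[y, x] = (ky * patch).sum()
    mag = np.hypot(gx, gy)  # gradient magnitude
    return mag > thresh
```

The resulting binary edge map can be added to (overlaid on) the original image, as done for Figure 6(b).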
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image
processing functions and interactive tools for enhancing and analyzing digital images. The
interactive tools allowed us to perform spatial image transformations; morphological
operations such as edge detection and noise removal; region-of-interest processing; filtering;
basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects
semitransparent is a useful technique in 3-D visualization, which furnishes more information
about the spatial relationships of different structures. The toolbox functions, implemented in the
open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis
and mathematical computing functions such as signal processing, optimization, partial
differential equation solving, etc. It provides interactive tools including thresholding,
correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and
3D plotting functions. The operations for image processing allowed us to perform noise
reduction and image enhancement, image transforms, colormap manipulation, colorspace
conversions, region-of-interest processing, and geometric operations [4]. The toolbox
functions implemented in the open MATLAB language can be used to develop the
customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness
corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which
is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes
the pixel region rectangle over the image displayed in the Image Tool, defining the group of
pixels that are displayed in extreme close-up view in the Pixel Region tool window. The
Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its
numeric value [25]. For RGB images we find three numeric values, one for each band of the
image. We can also determine the current position of the pixel region in the target image by
using the pixel information given at the bottom of the tool. In this way we found the x- and y-
coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays
a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image-segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3. PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data plot. An Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
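The 'meshgrid' semantics described above carry over directly to numpy, whose meshgrid follows the same convention; a minimal sketch:

```python
import numpy as np

# Rows of X are copies of x, columns of Y are copies of y,
# ready for evaluating a function f(X, Y) on the grid.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # X, Y both have shape (len(y), len(x))
Z = X + Y                  # example: f(x, y) = x + y on the grid
```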
3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' function with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
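The m-by-3 colormap convention above is easy to construct by hand. The sketch below builds a much-reduced stand-in for 'jet' that interpolates only from blue to red (the real 'jet' also passes through cyan, yellow, and orange); the function name is ours.

```python
import numpy as np

def simple_colormap(m=64):
    """An m-by-3 colormap matrix with RGB rows in [0.0, 1.0], linearly
    interpolating from blue to red. Illustrative only, not MATLAB's jet."""
    t = np.linspace(0.0, 1.0, m)
    return np.stack([t, np.zeros(m), 1.0 - t], axis=1)  # R, G, B columns
```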
A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4. FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool, we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
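The magnitude response that FVTool plots can be computed directly by evaluating H(e^{jw}) = B(e^{jw})/A(e^{jw}) on the unit circle. The sketch below is a minimal freqz-style equivalent in numpy, written for illustration; the function name and the 512-point default are ours.

```python
import numpy as np

def magnitude_response(b, a, n=512):
    """|H(e^{jw})| of the digital filter with numerator b and
    denominator a, at n frequencies in [0, pi)."""
    w = np.linspace(0.0, np.pi, n, endpoint=False)
    z = np.exp(-1j * w)                               # e^{-jw}
    num = sum(bk * z ** k for k, bk in enumerate(b))  # B(e^{jw})
    den = sum(ak * z ** k for k, ak in enumerate(a))  # A(e^{jw})
    return w, np.abs(num / den)
```

For the two-tap moving average b = [0.5, 0.5], a = [1], the response is 1 at DC and falls off toward the Nyquist frequency, the familiar lowpass shape.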
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions
and modifications, the basic ASM model works the same way. Cootes and Taylor [9]
give a complete description of the classical ASM. Section 4.1 introduces shapes and shape
models in general. Section 4.2 describes the workings and the components of the ASM. The
parameters and variations that affect the performance of the ASM are explained in Section
4.3. The experiments that are performed in this thesis to improve the performance of the
model are also described in this section. The problem of initialization of the model in a test image
is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function.
The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a
diagram showing the points, or as an n × 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively. In
this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates
are listed after the x co-ordinates, as shown in 4.1c. A shape is the basic block of
any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting
the points are not part of the shape, but they are shown to make the shape and order of the
points more clear [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives
the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance
between two shapes can be defined as the distance between their corresponding points [24].
There are other ways of defining distances between two points, like the Procrustes distance,
but in this thesis distance means the Euclidean distance.

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)    (4.1)
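Equation 4.1 and the shape distance built on it can be sketched directly; taking the mean (rather than the sum) over corresponding points is an assumption, since the text does not fix the aggregation.

```python
import math

def point_distance(p1, p2):
    """Euclidean distance between two points (Equation 4.1)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

def shape_distance(s1, s2):
    """Distance between two shapes as the mean distance between
    corresponding points; the mean is an assumed aggregation."""
    return sum(point_distance(p, q) for p, q in zip(s1, s2)) / len(s1)
```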
The centroid x̄ of a shape x can be defined as the mean of the point positions
[24]. The centroid can be useful while aligning shapes or finding an automatic
initialization technique (discussed in 4.4). The size of the shape is the root mean
distance between the points and the centroid. This can be used in measuring the
size of the test image, which will help with the automatic initialization (discussed in
4.4).
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
Output: set of aligned shapes and mean shape
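Algorithm 1 can be sketched as follows, with step 4(a) done by the standard least-squares Procrustes rotation/scale fit; the convergence tolerance and iteration cap are assumed values not given in the text.

```python
import numpy as np

def align_to(shape, ref):
    """Least-squares similarity (rotation + scale) fit of a centred
    (n x 2) shape to ref, via the closed-form 2-D Procrustes solution."""
    a = (shape * ref).sum() / (shape ** 2).sum()
    b = (shape[:, 0] * ref[:, 1] - shape[:, 1] * ref[:, 0]).sum() / (shape ** 2).sum()
    s = np.array([[a, -b], [b, a]])
    return shape @ s.T

def align_shapes(shapes, tol=1e-7, max_iter=100):
    """Algorithm 1: centre each shape, fix scale via the reference, then
    alternate aligning to the mean and re-estimating the mean."""
    shapes = [s - s.mean(axis=0) for s in shapes]       # step 2: centre
    x0 = shapes[0] / np.linalg.norm(shapes[0])          # step 3: unit reference
    mean = x0
    for _ in range(max_iter):                           # step 4
        shapes = [align_to(s, mean) for s in shapes]    # 4(a)
        new_mean = np.mean(shapes, axis=0)              # 4(b)
        new_mean = align_to(new_mean, x0)               # 4(c): align to x0
        new_mean /= np.linalg.norm(new_mean)            # 4(c): unit size
        if np.linalg.norm(new_mean - mean) < tol:       # step 5: convergence
            break
        mean = new_mean
    return shapes, mean
```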
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone
was separated from a full-body X-ray (as shown in 1.2) and then those images were
re-sized to the same dimensions. This ensured uniformity in the quality of data
being used. The training on the images was done by manually selecting landmarks.
Landmarks were placed at approximately equal intervals and were distributed uniformly
over the bone boundary. Such images are called hand-annotated or manually
landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training.
While performing tests using different numbers of landmark points, a subset of these
landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of
sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the
image around them. During training, the algorithm learns the characteristics of the
area around the landmark points and builds a profile model for each landmark point
accordingly. When searching for the shape in the test image, the area near the
tentative landmarks is examined, and the model moves the shape to an area that fits
closely to the profile model. The tentative location of the landmarks is obtained from
the suggested shape.
2. The shape model defines the permissible relative positions of landmarks. This
introduces a constraint on the shape. So, as the profile model tries to find the
area in the test image that best fits the model, the shape model ensures that
the mean shape is not changed. The profile model acts on individual landmarks,
whereas the shape model acts globally on the image. The two models thus correct
each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual
profiles into an allowable shape. It tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant.
The shape is learnt from manually landmarked training images. These images are
aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
Φ is the matrix of the principal eigenvectors of the shape covariance, and
b is the vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of
b. The model is varied in height and width, finding optimum values for the landmarks.
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image. The lines that are perpendicular to the model are called whiskers, and they help the
profile model in analyzing the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the
whisker profiles around the landmark points are used for the profile model. A profile
and a covariance matrix are built for each landmark. It is assumed that the profiles
are distributed as a multivariate Gaussian, and so they can be described by their
mean profile ḡ and the covariance matrix S_g.
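The profile matching built on ḡ and S_g can be sketched as follows; the function names are ours, but the distance is the standard Mahalanobis form used by the classical ASM.

```python
import numpy as np

def mahalanobis(g, g_mean, S_g):
    """Squared Mahalanobis distance between a sampled profile g and the
    model's mean profile: (g - gbar)^T S_g^{-1} (g - gbar)."""
    diff = np.asarray(g, float) - np.asarray(g_mean, float)
    return float(diff @ np.linalg.solve(S_g, diff))

def best_offset(profiles, g_mean, S_g):
    """Pick the candidate profile (one per offset along the whisker)
    with the lowest Mahalanobis distance to the model."""
    d = [mahalanobis(g, g_mean, S_g) for g in profiles]
    return int(np.argmin(d))
```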
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape
calculated from the training images is imposed on the image, and the profiles around
the landmark points are searched and examined. The profiles are offset 3 pixels
along the whisker, which is perpendicular to the shape, to find the area
that most closely resembles the mean shape [24]. The distance between a test profile g
and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in 4.4), one of the profiles will have the
lowest distance. This procedure is done for every landmark point, and then the shape
model confirms that the shape is the same as the mean shape. The shape model
ensures that the profile model has not changed the shape. If the shape model were
not employed, the profile model might give the best profile results, but the resulting
shape could be completely different. So, as mentioned before, the two models restrict
each other. A multi-resolution search is done to make the model more robust. This
enables the model to be more accurate, as it can lock on to the shape from further
away. The model searches over a series of different resolutions of the same image,
called an image pyramid. The resolutions of the images can be set and changed
in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid. The sizes of
the images are given relative to the first image. A general picture, and not a bone
X-ray, is used for illustration.
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters
that it depends on. The number of landmark points and the number of training images are
investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The
profile model of the ASM works with these landmark points to create profiles, so
the position of the landmark points is as important as their number.
In the training images, landmark points are equally spaced along the boundary of
the bone. Images are landmarked with 60 points, and subsets of these points are
chosen to conduct experiments. The impact of the number of landmark points on computing
time and the mean error (defined in Section 4.5) is tested by running the algorithm with
different numbers of landmarks. As the number of landmark points is increased, it is expected
that the computing time increases and the error decreases. The results are explained in
Chapter 5. A training set of images is used to train the ASM. As the number of training images
increases, the model becomes more robust and intelligent. The computing time is expected to
increase, as it will take time to train and create profile models for each image. However, as the
number of training images increases, the mean profile improves and the model performs better, so the
error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the
ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the
unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test
image. It creates a mean shape profile from all the training images using landmark points. But
the ASM starts off where the mean shape is located, which may not be near the bone in a test
image. So the model needs to be initialized, or started, somewhere close to the bone boundary
in the test image. Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape. It was observed that if the initialization is poor, which means
that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the
bone. The shape and profile models fail to perform, as the profile model looks for regions
similar to those of the training images in the regions away from the bone. So it is unable to
find the bone, as it is looking in a different region altogether. The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows
the initialization. The pink contour is the mean shape, and it starts away from the bone, so the
result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
bones. These techniques are also used in [20] to look at femur fractures specifically.
Best parameter values for the features can be found using various techniques.
2.1.3 Classifiers and Pattern Recognition
After the feature extraction stage, the features have to be analyzed and a pattern
needs to be recognized. For example, the features mentioned above, like the
neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be
recognized if the neck-shaft angles of good femurs are different from those of fractured
femurs. Classifiers like Bayesian classifiers and Support Vector Machines are
used to classify features and find the best values for them. For example, [22] used
a support vector machine called the Gini-SVM [22] and found the feature values
for GO, MRF, and IGD that gave the best performance overall. Clustering and nearest-neighbour
approaches can also be used for pattern recognition and classification of
images. For example, the gradient vector of a healthy long-bone X-ray may point
in a certain direction that is very different from the gradient vector of a fractured long-bone
X-ray. By observing this fact, a bone in an unknown X-ray image can be
classified as healthy or fractured using the gradient vector of the image.
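The nearest-neighbour idea above can be sketched in a few lines. The features and labels below are illustrative placeholders, not the actual features used in the thesis.

```python
def nearest_neighbour_classify(sample, examples):
    """Classify a feature vector (e.g. a neck-shaft angle or gradient
    summary) by the label of its nearest training example. `examples`
    is a list of (feature_vector, label) pairs."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(examples, key=lambda ex: dist2(sample, ex[0]))[1]
```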
2.1.4 Thresholding and Error Classification
Thresholding and error classification form the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
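A minimal sketch of Otsu's method, which picks the threshold that maximises the between-class variance of the histogram (assuming an 8-bit grayscale image stored as a NumPy array):

```python
import numpy as np

def otsu_threshold(img):
    """Return the threshold maximising between-class variance
    for an 8-bit grayscale image (Otsu's method)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w0 = 0.0   # pixels at or below the candidate threshold
    sum0 = 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        m0 = sum0 / w0                        # mean of the lower class
        m1 = (sum_all - sum0) / (total - w0)  # mean of the upper class
        var = w0 * (total - w0) * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Two well-separated intensity populations
img = np.array([[10, 12, 11], [200, 205, 198]], dtype=np.uint8)
t = otsu_threshold(img)
binary = img > t
```

With two clearly separated populations, the chosen threshold falls between them, so `img > t` splits the image into exactly those two groups.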
Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, with pixel intensities on the horizontal axis and the number of pixels on the vertical axis.
(a) The original image (b) The histogram of the image
Figure 2.3: Histogram of an image [23]
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are most useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. A wide variety of techniques exists for improving image quality. Contrast stretch, density slicing, edge enhancement and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement, because of the precision and wide variety of digital processes available.
Contrast
Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically, as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
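The linear stretch described above can be sketched as follows, assuming a single 8-bit band held in a NumPy array (the output range of 0-255 is an assumption about the display unit):

```python
import numpy as np

def linear_stretch(band, out_max=255):
    """Map the band's minimum to 0 and its maximum to out_max,
    distributing the remaining values linearly between them."""
    band = band.astype(float)
    lo, hi = band.min(), band.max()
    if hi == lo:                      # flat band: nothing to stretch
        return np.zeros_like(band, dtype=np.uint8)
    return np.round((band - lo) / (hi - lo) * out_max).astype(np.uint8)

# A band occupying only the narrow range 60..90
band = np.array([[60, 70], [80, 90]], dtype=np.uint8)
stretched = linear_stretch(band)      # 60 -> 0, 90 -> 255
```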
Non-Linear Contrast Enhancement
In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearly similar classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density, which is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
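Histogram equalization, as described above, can be sketched via the cumulative histogram (assuming an 8-bit image; the min-shift normalisation is one common convention among several):

```python
import numpy as np

def equalize(img):
    """Redistribute grey levels so the cumulative histogram becomes
    approximately linear (uniform population density)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]         # first occupied grey level
    scale = (cdf - cdf_min) / float(cdf[-1] - cdf_min)
    lut = np.round(scale * 255).astype(np.uint8)
    return lut[img]                   # apply the lookup table

img = np.array([[50, 50], [50, 200]], dtype=np.uint8)
eq = equalize(img)   # the dominant grey level is pushed toward black
```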
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7 or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median and mode filters.
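The simplest low-pass filter described above, a kxk mean filter, can be sketched as follows; border handling uses option (1) from the text, replicating border pixel values so the output keeps the input's size:

```python
import numpy as np

def mean_filter(img, k=3):
    """kxk mean filter; the border is handled by replicating edge
    pixels so the output keeps the input's size."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):                # sum the k*k shifted views
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

img = np.full((5, 5), 100.0)
img[2, 2] = 190.0                      # a single noisy pixel
smooth = mean_filter(img)              # the spike is averaged down
```

The noisy spike is pulled toward its neighbours' mean, and the output stays 5x5 rather than shrinking by two rows and columns.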
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.
The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way; hence the result of this operation has a more natural look than many other edge-enhanced images.
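A sketch of the Laplacian operator mentioned above, using the common 4-neighbour kernel (one of several Laplacian kernels in use, so the exact kernel is an assumption). On a uniform region the response is zero, while an isolated point produces a strong response:

```python
import numpy as np

LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def convolve3(img, kernel):
    """Valid 3x3 convolution (output shrinks by one pixel per side)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out

flat = np.full((5, 5), 7.0)
resp = convolve3(flat, LAPLACIAN)      # zero on uniform regions

point = np.zeros((5, 5))
point[2, 2] = 1.0
resp_point = convolve3(point, LAPLACIAN)  # strong response at the point
```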
Band Ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information not available in any single band that is useful for discriminating between soils and vegetation.
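A minimal sketch of a band ratio. The toy numbers illustrate the point made above: illumination that scales two co-registered bands equally cancels out of their ratio, so sunlit and shaded pixels of the same material map to the same value:

```python
import numpy as np

def band_ratio(b1, b2, eps=1e-6):
    """Per-pixel ratio of two co-registered bands; a small eps
    guards against division by zero."""
    return b1.astype(float) / (b2.astype(float) + eps)

# The same material, fully lit and in shadow: both bands halve together
sunlit = band_ratio(np.array([[120.0]]), np.array([[60.0]]))
shaded = band_ratio(np.array([[40.0]]),  np.array([[20.0]]))
# both ratios are ~2.0: the topographic shading cancels
```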
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems; this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection and some methods that are only relevant to the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of the femur to separate it from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31], after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs, also called classical ASMs by [24], by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.
[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.
The working mechanisms of the methods discussed above are explained in detail in the sections that follow.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time and error are studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form the edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; the gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge, from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks or kernels, one for the horizontal direction and the other for the vertical direction in an image, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
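The Sobel computation described above can be sketched as follows. The two 3x3 kernels are applied by direct summation over the valid interior region (no border handling), which is a simplification of a full convolution:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel(img):
    """Gradient magnitude and direction from the two 3x3 Sobel
    kernels, computed on the valid region only."""
    h, w = img.shape
    dx = np.zeros((h - 2, w - 2))
    dy = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            patch = img[i:i + h - 2, j:j + w - 2]
            dx += SOBEL_X[i, j] * patch
            dy += SOBEL_Y[i, j] * patch
    return np.hypot(dx, dy), np.arctan2(dy, dx)

# A vertical step edge: constant columns, dark left, bright right
img = np.array([[0, 0, 255, 255]] * 4, dtype=float)
mag, ang = sobel(img)
```

On this step edge the vertical derivative is zero and the horizontal one is large, so the gradient points straight across the edge from dark to bright, exactly as the text describes.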
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection and its performance decreases if the images are noisy, but the method is still used as it is simple, easy to implement and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detecting technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
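The double-threshold linking step described above can be sketched as a breadth-first search from the strong pixels; the gradient values and thresholds below are arbitrary illustrations:

```python
import numpy as np
from collections import deque

def hysteresis(grad, low, high):
    """Double thresholding with edge linking: pixels above `high`
    seed edges; 8-connected neighbours above `low` are pulled in."""
    strong = grad >= high
    weak = grad >= low
    out = strong.copy()
    q = deque(zip(*np.nonzero(strong)))
    H, W = grad.shape
    while q:
        y, x = q.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < H and 0 <= nx < W
                        and weak[ny, nx] and not out[ny, nx]):
                    out[ny, nx] = True
                    q.append((ny, nx))
    return out

grad = np.array([[0.9, 0.5, 0.5, 0.1],
                 [0.1, 0.1, 0.5, 0.1]])
edges = hysteresis(grad, low=0.4, high=0.8)
# the chain of 0.5 pixels survives because it connects to the 0.9 seed
```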
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of the image to analyze it, quantifying the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
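Range filtering, as introduced above, can be sketched as a kxk local max-minus-min with replicated borders (the 3x3 neighbourhood size is an assumption). Textured regions produce high values; smooth regions produce zeros:

```python
import numpy as np

def range_filter(img, k=3):
    """Local range (max - min) in a kxk neighbourhood: high values
    indicate textured regions, low values indicate smooth ones."""
    pad = k // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    h, w = img.shape
    stack = np.stack([p[dy:dy + h, dx:dx + w]
                      for dy in range(k) for dx in range(k)])
    return stack.max(axis=0) - stack.min(axis=0)

smooth = np.full((4, 4), 50.0)
textured = smooth.copy()
textured[1, 1] = 90.0                 # one bright, "textured" pixel
r_smooth = range_filter(smooth)       # all zeros
r_text = range_filter(textured)       # nonzero near the bright pixel
```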
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)
where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys and cliffs can be detected over a range of viewpoints, scales and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)    (2)

or

P(x) = min(λ2(x), 0)    (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2..6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5    (4)

where MPij = max(Pij−1, Pij, Pij+1).
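The per-pixel maximum eigenvalue of Eq. 2 can be sketched as follows. This simplified version uses plain finite differences in place of the paper's Gaussian-derivative scale space, so it illustrates the eigenvalue step only, not the full pipeline:

```python
import numpy as np

def principal_curvature(img):
    """Maximum eigenvalue of the 2x2 Hessian at each pixel, clamped
    at zero (Eq. 2), with finite-difference second derivatives."""
    I = img.astype(float)
    Ixx = np.gradient(np.gradient(I, axis=1), axis=1)
    Iyy = np.gradient(np.gradient(I, axis=0), axis=0)
    Ixy = np.gradient(np.gradient(I, axis=1), axis=0)
    # closed-form eigenvalues of a symmetric 2x2 matrix
    tr = Ixx + Iyy
    disc = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    lam1 = tr / 2.0 + disc            # maximum eigenvalue
    return np.maximum(lam1, 0.0)

# A dark horizontal line on a light background gives a ridge response
img = np.full((7, 7), 200.0)
img[3, :] = 50.0
P = principal_curvature(img)
```

The dark line produces a strong positive response along its length, while flat regions give zero, matching the paper's statement that Eq. 2 responds to dark lines on a light background.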
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; a consequence is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
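The eigenvector-guided choice of low threshold described above can be sketched as follows. The high threshold of 0.04 and the 0.2/0.7 low-to-high ratios come from the text; the agreement cutoff of 0.9 and the wrap-around border handling of `np.roll` are simplifying assumptions of this sketch:

```python
import numpy as np

def low_threshold_map(vx, vy, high=0.04):
    """Per-pixel low threshold for eigenvector-flow hysteresis:
    where the 8 neighbouring major eigenvectors agree in direction,
    the low-to-high ratio drops to 0.2; otherwise it stays at 0.7."""
    support = np.zeros_like(vx)
    n = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            # shifted copies of the eigenvector field (borders wrap)
            sx = np.roll(np.roll(vx, dy, axis=0), dx, axis=1)
            sy = np.roll(np.roll(vy, dy, axis=0), dx, axis=1)
            support += np.abs(vx * sx + vy * sy)   # |dot product|
            n += 1
    avg = support / n
    ratio = np.where(avg >= 0.9, 0.2, 0.7)         # agreement cutoff assumed
    return high * ratio

# Perfectly aligned eigenvectors: every pixel gets the permissive threshold
vx = np.ones((4, 4))
vy = np.zeros((4, 4))
lows = low_threshold_map(vx, vy)    # all 0.04 * 0.2 = 0.008
```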
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes in the same layer of a painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenge is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.
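The clustering step described above can be sketched with a minimal k-means over per-pixel chromatic features (a plain NumPy sketch, not the authors' implementation; the complete-linkage refinement is omitted):

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Minimal k-means: assign each pixel's chromatic feature vector to a
    cluster and describe each cluster by its mean chromatic vector."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # assignment step: each feature goes to its nearest center
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: recompute each center as the cluster mean
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers
```

For a painting image, `features` would be the N x 3 array of RGB (or other chromatic) values of the N pixels, and the returned labels give the candidate layer regions.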
3.1 Spatially coherent segmentation
We improve the layer segmentation by combining k-means with a spatial coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in its mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
\min_L \sum_p \| f_p - c_{L_p} \|_2^2 + \lambda \sum_{\{p,q\} \in N} |e_{pq}| \, T[L_p \neq L_q]   (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, λ is a regularization weight, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels into clusters. After the spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
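The E-step above requires a graph-cut solver; as an illustrative approximation only, the same energy can be decreased greedily with iterated conditional modes (ICM), which is simpler but weaker than graph cuts (a sketch under that substitution, not the paper's method):

```python
import numpy as np

def icm_labels(features, centers, neighbors, lam=1.0, sweeps=5):
    """Approximate the E-step of Eq. (1) with ICM instead of graph cuts:
    for each pixel, greedily pick the label minimizing
    data cost + lam * (number of disagreeing neighbors)."""
    n, k = len(features), len(centers)
    # initialize from the pure appearance term (plain nearest-center labels)
    labels = np.argmin(
        np.linalg.norm(features[:, None] - centers[None, :], axis=2), axis=1)
    for _ in range(sweeps):
        for p in range(n):
            data = np.sum((features[p] - centers) ** 2, axis=1)
            smooth = np.array([sum(labels[q] != c for q in neighbors[p])
                               for c in range(k)], float)
            labels[p] = np.argmin(data + lam * smooth)
    return labels
```

Here `neighbors` maps each pixel index to its neighborhood N; the edge-length weighting |e_{pq}| is constant on a uniform pixel grid and is folded into lam.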
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. In van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review
Schoenemann et al.'s method in detail. To formulate the problem as a linear program, this approach models curvature in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and pairs of line segments are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
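The curvature approximation described above, the sum of angle changes at vertices weighted by edge length, can be sketched for a single discrete level line as follows (one plausible discretization for illustration; Schoenemann et al. use a related edge-pair weighting inside the linear program):

```python
import numpy as np

def polyline_curvature(points):
    """Approximate the total curvature of a discrete level line as the sum
    of absolute angle changes at interior vertices, each weighted by the
    average length of the two incident segments."""
    pts = np.asarray(points, float)
    total = 0.0
    for i in range(1, len(pts) - 1):
        a, b = pts[i] - pts[i - 1], pts[i + 1] - pts[i]
        la, lb = np.linalg.norm(a), np.linalg.norm(b)
        cosang = np.clip(np.dot(a, b) / (la * lb), -1.0, 1.0)
        total += np.arccos(cosang) * 0.5 * (la + lb)
    return total
```

A straight line accumulates zero curvature, while a right-angle turn between unit segments contributes pi/2.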
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec is delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in a mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.
B Method
The stages followed for lung segmentation from CTA images in this work are shown in Figure 1.
The dataset at hand consists of 250 2-D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to keep parts brighter than 700 HU. After thresholding, the new images are of logical (binary) type:
Thresh = image > 700
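The MATLAB expression above has a direct NumPy equivalent (an illustrative sketch; the 700 level is taken from the text as-is):

```python
import numpy as np

def threshold_body(image, level=700):
    """Binary mask of high-intensity (body) pixels, mirroring the
    MATLAB expression `Thresh = image > 700` from the text. The 700
    level is the value reported in the paper for these CTA images."""
    return np.asarray(image) > level
```

The result is a logical image in which body pixels are 1 (True) and lung/air pixels are 0 (False).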
In each of these new images, sub-segment vessels are present in the lung region. In the second step, the following method was used to remove these vessels: first, each 2-D image was considered one by one, and each component in the image was labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts were under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 was labeled with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component was kept and the other parts were removed from the image. Then the complement of the image was taken, so that all "0" values turn into "1" and all "1" values turn into "0" (Figure 4).
Since the parts outside the body in the image shown in Figure 4 touch the image border (columns 1 or 512), the parts satisfying this condition were removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image was labeled with the connected component labeling algorithm, and components with fewer than 1000 pixels were identified as airways and removed from the image. The final image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image was obtained (Figure 6(c)).
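The repeated "label components, then drop the small ones" step can be sketched in pure Python with a BFS-based connected component labeling (4-connectivity; an illustrative sketch, not the authors' code, which used MATLAB):

```python
from collections import deque

def remove_small_components(mask, min_pixels=1000):
    """Connected component labeling (4-connectivity, BFS) that keeps only
    components of at least min_pixels, as used in the text to strip
    sub-segment vessels and airways from the binary lung mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[False] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                comp, q = [], deque([(i, j)])
                seen[i][j] = True
                while q:  # flood-fill one component
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) >= min_pixels:  # keep only large components
                    for y, x in comp:
                        out[y][x] = True
    return out
```

In practice a library routine (e.g. a labeling function from an image processing package) would replace this loop, but the logic is the same.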
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2-D and 3-D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4].
An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a The histogram of the X-ray CT image and the plotted fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data.
Figure 3b Area Graph of an X-ray CT brain scan, which displays the elements in a variable as one or more curves and fills the area beneath each curve
The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
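NumPy's `meshgrid` follows the same semantics as MATLAB's, so the behavior described above can be demonstrated directly (illustrative values, not taken from the CT data):

```python
import numpy as np

# rows of X are copies of x; columns of Y are copies of y
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # both X and Y are 2x3 matrices

# a function of two variables can now be evaluated over the whole grid
Z = X + Y                  # Z[i, j] = x[j] + y[i]
```

This grid construction is what feeds the surface-plotting functions (surf, surfc, mesh) discussed in this section.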
3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values, lighting
'Image' creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'Image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, basic ASM models work in the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two shapes, like the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}   (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).
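The centroid and size definitions above can be sketched directly for the 2n-vector shape representation used in this thesis (an illustrative sketch, not the thesis code):

```python
import numpy as np

def centroid_and_size(shape_2n):
    """Centroid (mean point) and size (root mean square distance of the
    points from the centroid) of a shape stored as a 2n-vector with all
    x coordinates followed by all y coordinates."""
    v = np.asarray(shape_2n, float)
    n = len(v) // 2
    pts = np.stack([v[:n], v[n:]], axis=1)   # back to an n x 2 array
    c = pts.mean(axis=0)
    size = np.sqrt(np.mean(np.sum((pts - c) ** 2, axis=1)))
    return c, size
```

For the unit square with corners (0,0), (2,0), (2,2), (0,2), the centroid is (1,1) and the size is sqrt(2).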
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and mean shape
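Algorithm 1 can be sketched with the similarity alignment reduced to translation and scaling (rotation is omitted for brevity, so this is a simplified illustration of the iteration, not a full Procrustes alignment):

```python
import numpy as np

def align_shapes(shapes, iters=10):
    """Simplified Algorithm 1: center each shape on the origin, scale to
    unit size, and iterate re-estimating and re-constraining the mean
    shape. Shapes are n x 2 arrays of point coordinates."""
    # step 2: translate each shape so it is centered on the origin
    shapes = [s - s.mean(axis=0) for s in map(np.asarray, shapes)]

    def unit(s):                      # scale a shape to unit size
        return s / np.sqrt((s ** 2).sum())

    mean = unit(shapes[0])            # step 3: initial mean shape x0
    for _ in range(iters):            # step 4: repeat until convergence
        aligned = [unit(s) for s in shapes]        # (a) align to the mean
        mean = unit(np.mean(aligned, axis=0))      # (b) + (c) re-estimate,
    return aligned, mean                           #     re-constrain
```

With rotation included, step (a) would additionally rotate each shape to minimize its distance to the current mean.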
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated, together with the permissible variations from it [24]:

x̂ = x̄ + Φb   (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
Φ is the matrix of eigenvectors of the covariance of the training shapes, and
b is a vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and the covariance matrix S_g.
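The linear shape model of Equation 4.3 can be sketched with a small PCA over aligned training shapes (an illustrative sketch of the classical construction, not the thesis code):

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """Build x_hat = x_bar + Phi * b by PCA: x_bar is the mean of the
    aligned training shapes (each a 2n-vector) and Phi holds the leading
    eigenvectors of their covariance matrix."""
    X = np.asarray(shapes, float)                # N x 2n matrix of shapes
    x_bar = X.mean(axis=0)
    cov = np.cov(X - x_bar, rowvar=False)
    w, V = np.linalg.eigh(cov)
    Phi = V[:, np.argsort(w)[::-1][:n_modes]]    # top n_modes eigenvectors
    return x_bar, Phi

def generate_shape(x_bar, Phi, b):
    """Generate a new shape by varying the parameter vector b."""
    return x_bar + Phi @ np.asarray(b, float)
```

Setting b = 0 reproduces the mean shape; each entry of b moves the shape along one mode of permissible variation.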
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g - ḡ)^T S_g^{-1} (g - ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape stays close to the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is shown for illustration.
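The Mahalanobis distance used in the profile search can be computed as follows (a small NumPy sketch; g, ḡ, and S_g are the test profile, mean profile, and profile covariance defined above):

```python
import numpy as np

def mahalanobis(g, g_bar, S_g):
    """Mahalanobis distance (g - g_bar)^T S_g^{-1} (g - g_bar) between a
    sampled profile g and the mean profile g_bar with covariance S_g."""
    d = np.asarray(g, float) - np.asarray(g_bar, float)
    # solve S_g x = d rather than inverting S_g explicitly
    return float(d @ np.linalg.solve(S_g, d))
```

With an identity covariance this reduces to the squared Euclidean distance, which makes the weighting role of S_g easy to see.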
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. But the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
(a) The original image. (b) The histogram of the image.
Figure 2.3 Histogram of an image [23]
IMAGE ENHANCEMENT TECHNIQUES
Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are most useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality. Contrast stretch, density slicing, edge enhancement, and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement, because of the precision and wide variety of digital processes available.
Contrast
Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image. The larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm The grey values in the original image and the
modified image follow a linear relation in this algorithm A density number in the low range
of the original histogram is assigned to extremely black and a value at the high end is
assigned to extremely white The remaining pixel values are distributed linearly between
these extremes The features or details that were obscure on the original image will be clear
in the contrast stretched image Linear contrast stretch operation can be represented
graphically as shown in Fig 4 To provide optimal contrast
and colour variation in colour composites the small range of grey values in each band is
stretched to the full brightness range of the output or display unit
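The linear stretch described above can be sketched in a few lines of NumPy. The function name and the 0–255 output range are illustrative assumptions, not part of the text.

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Linearly map the band's min..max grey values onto the full
    output range (a minimal sketch of the stretch described above)."""
    band = band.astype(np.float64)
    lo, hi = band.min(), band.max()
    if hi == lo:                      # flat image: nothing to stretch
        return np.full(band.shape, out_min, dtype=np.uint8)
    stretched = (band - lo) / (hi - lo) * (out_max - out_min) + out_min
    return np.round(stretched).astype(np.uint8)

# A low-contrast band occupying only grey levels 90..110 is expanded
# so that 90 maps to black (0) and 110 to white (255).
band = np.array([[90, 95], [100, 110]], dtype=np.uint8)
out = linear_stretch(band)
```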
Non-Linear Contrast Enhancement
In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearly similar classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed, to enhance values in the brighter part of the histogram, by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique the histogram of the original image is redistributed to produce a uniform population density, which is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
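Histogram equalization as described above can be sketched with a cumulative-histogram lookup table. This is one common formulation, assuming an 8-bit single-band image; it is not reproduced from the text.

```python
import numpy as np

def histogram_equalize(band, levels=256):
    """Redistribute grey values so the cumulative histogram is roughly
    linear: each input level is mapped through the normalized CDF.
    Assumes a non-constant image (a flat band has nothing to equalize)."""
    hist = np.bincount(band.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(np.uint8)
    return lut[band]
```

Because several adjacent input levels can map to the same output level, the equalized image has fewer distinct grey levels than the original, as the text notes.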
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, it is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, it is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote-sensor data processing are low-pass filters, band-pass filters and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial-frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7 or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median and mode filters.
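A minimal sketch of the 3x3 mean filter, using border strategy (1) above (repeating the border pixel values) so the output keeps the original image size:

```python
import numpy as np

def mean_filter_3x3(image):
    """3x3 low-pass (mean) filter. The border is handled by replicating
    edge pixels, one of the two border strategies described above."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    out = np.zeros(image.shape, dtype=np.float64)
    h, w = image.shape
    # Sum the nine shifted copies of the padded image, then divide.
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
    return out / 9.0
```

A single bright pixel is spread over its neighbourhood, which is exactly the blurring effect the text warns about.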
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote-sensing earth-science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or non-linear edge-enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.
The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Research on the physiology of human vision suggests that we see objects in much the same way, hence images produced by this operation have a more natural look than many other edge-enhanced images.
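The horizontal case of the directional first difference above can be sketched directly; the vertical and diagonal cases only change which neighbour is subtracted.

```python
import numpy as np

def first_difference_horizontal(image):
    """Directional first difference: each output pixel is the change in
    brightness from its left-hand neighbour, approximating dI/dx."""
    image = image.astype(np.float64)
    diff = np.zeros(image.shape)
    diff[:, 1:] = image[:, 1:] - image[:, :-1]
    return diff

# A step from grey level 10 to 50 shows up as a single response of 40
# at the step, with zeros in the uniform regions.
row = np.array([[10, 10, 50, 50]], dtype=np.float64)
edges = first_difference_horizontal(row)
```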
Band Ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
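Why a ratio suppresses illumination effects: a multiplicative change (sun angle, slope shading) scales both bands by the same factor, which cancels in the quotient. A minimal sketch (the epsilon guard is an implementation detail added here, not part of the text):

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two bands; eps guards against division by
    zero in fully dark (zero-valued) pixels."""
    return band_a.astype(np.float64) / (band_b.astype(np.float64) + eps)

# The same material, sunlit (80/20) and shadowed at half brightness
# (40/10), yields the same ratio in both pixels.
nir = np.array([80.0, 40.0])
red = np.array([20.0, 10.0])
ratio = band_ratio(nir, red)
```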
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.
[18] and [1] analyzed the performance of ASMs using aspects of the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.
The working mechanisms of the methods discussed above are explained in detail in the sections that follow.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no fully automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays; if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time and error is studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest-point detection and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form an edge are generally of the same or nearly the same intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used for separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point within a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
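A sketch of the Sobel computation, assuming the standard 3x3 Sobel kernels (presumably those given in equations 3.3 and 3.4, which are not reproduced in the text). The 'valid' convolution shrinks the output by the border, as discussed for low-pass filters.

```python
import numpy as np

# Standard Sobel kernels for the horizontal (Dx) and vertical (Dy)
# derivatives; note the weight of 2 on the centre row/column.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
KY = KX.T

def convolve2d_valid(image, kernel):
    """Minimal 'valid' 2-D convolution (kernel flipped, as in true
    convolution) -- enough for 3x3 masks."""
    k = np.flipud(np.fliplr(kernel))
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = np.sum(image[y:y + 3, x:x + 3] * k)
    return out

def sobel(image):
    """Gradient magnitude and direction at each (interior) pixel."""
    image = image.astype(np.float64)
    dx = convolve2d_valid(image, KX)
    dy = convolve2d_valid(image, KY)
    return np.hypot(dx, dy), np.arctan2(dy, dx)
```

As the text predicts, a region of constant intensity yields a zero vector, while a vertical step edge yields a purely horizontal response.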
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image; however, the convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel when calculating the directional derivative at that point [15][26]. This is why Sobel has a weight of 2 in the middle column where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude of the difference between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used as it is simple, easy to implement and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge-detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then used on the data: if a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-valued pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
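The two-level (hysteresis) thresholding step described above can be sketched as iterative region growing: strong pixels seed the edge map, which then absorbs connected weak pixels. The iterative-dilation formulation is an implementation choice made here, not taken from the text.

```python
import numpy as np

def hysteresis_threshold(gradient, low, high):
    """Canny-style hysteresis thresholding: pixels above `high` are
    kept outright; pixels between `low` and `high` are kept only if
    they are 8-connected (through other kept pixels) to a strong pixel."""
    strong = gradient > high
    weak = gradient > low
    result = strong.copy()
    while True:
        # Dilate the accepted set by one pixel (8-connectivity)...
        padded = np.pad(result, 1)
        grown = np.zeros(result.shape, dtype=bool)
        h, w = result.shape
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                grown |= padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        # ...but only keep growth into weak pixels.
        grown &= weak
        if np.array_equal(grown, result):
            return result
        result = grown
```

A chain of weak pixels attached to a strong one survives, while an isolated weak pixel (likely noise) is discarded.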
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of the image to analyze it, quantifying the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used because bones are expected to have more texture than the mesh. Range filtering and standard-deviation filtering were the texture-analysis techniques used in this thesis. Range filtering calculates the local range of an image.
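Range filtering as just described can be sketched as a sliding-window max-minus-min; the 3x3 window size is an assumption made for the example.

```python
import numpy as np

def range_filter_3x3(image):
    """Local range (max - min) over each 3x3 neighbourhood: a simple
    texture measure where textured areas give large values and smooth
    areas give values near zero."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            window = padded[y:y + 3, x:x + 3]
            out[y, x] = window.max() - window.min()
    return out
```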
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix,

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]   (1)

where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the
Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second-moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys and cliffs can be detected over a range of viewpoints, scales and appearance changes. Many previous interest-point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second-moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest-point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)
or
P(x) = min(λ2(x), 0)   (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale-space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5   (4)

where MPij = max(Pij−1, Pij, Pij+1).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
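The single-scale computation of Eq. 2 can be sketched as follows. The paper computes the second derivatives as Gaussian derivatives at scale σD; this sketch substitutes plain finite differences, so it is an illustration of the eigenvalue step, not the authors' implementation.

```python
import numpy as np

def principal_curvature_image(image):
    """P(x) = max(lambda1(x), 0) from Eq. 2: the largest eigenvalue of
    the 2x2 Hessian at every pixel, clamped at zero so that only dark
    lines on a light background (ridges of curvature) respond."""
    I = image.astype(np.float64)
    Iy, Ix = np.gradient(I)          # derivatives along rows (y), cols (x)
    Ixy, Ixx = np.gradient(Ix)
    Iyy = np.gradient(Iy)[0]
    # Closed-form largest eigenvalue of the symmetric 2x2 Hessian.
    half_trace = (Ixx + Iyy) / 2.0
    disc = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    return np.maximum(half_trace + disc, 0.0)
```

A dark line on a bright background produces a strong positive response along the line and (near-)zero response in flat regions, matching the behaviour the text ascribes to Eq. 2.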
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; a consequence is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, they are drawn at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible; nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
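The grayscale closing f • b = (f ⊕ b) ⊖ b used to remove "potholes" can be sketched directly. The paper uses a 5×5 disk-shaped structuring element; this sketch substitutes a square element of odd size to stay dependency-free.

```python
import numpy as np

def grayscale_close(image, size=3):
    """Grayscale closing: dilation (local max) followed by erosion
    (local min), each over a size x size square structuring element.
    Small dark pits narrower than the element are filled."""
    r = size // 2

    def _filter(img, reduce_fn):
        p = np.pad(img, r, mode="edge")
        h, w = img.shape
        return np.array([[reduce_fn(p[y:y + size, x:x + size])
                          for x in range(w)] for y in range(h)])

    dilated = _filter(image.astype(np.float64), np.max)
    return _filter(dilated, np.min)
```

A one-pixel pit in an otherwise flat ridge is removed, which is exactly the kind of spurious catchment basin the closing is meant to eliminate before the watershed transform.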
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
3.1 Spatially Coherent Segmentation
We improve the layer segmentation by incorporating k-means and a spatial-coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means); in other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial-coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]   (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels into the different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
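The E-M alternation above can be sketched as follows. The paper's E-step additionally minimizes the graph-cut spatial-coherence term of Eq. (1); this sketch keeps only the appearance term, so it reduces to plain k-means, and the deterministic initialization is an assumption made for the example.

```python
import numpy as np

def em_segment(features, k=2, iters=10):
    """Alternate an E-step (assign each pixel feature to the nearest
    chromatic center) with an M-step (re-estimate each center as the
    mean chromatic vector of its cluster)."""
    features = features.astype(np.float64)
    centers = features[:k].copy()        # deterministic init for the sketch
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # E-step: squared chromatic distance of every feature to every center.
        d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # M-step: mean chromatic vector per cluster.
        for i in range(k):
            if np.any(labels == i):
                centers[i] = features[labels == i].mean(axis=0)
    return labels, centers
```

Two chromatically distinct groups of pixel features separate into two clusters whose centers converge to the group means.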
3.2 Curvature-Based Inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same grey/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless; therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense. Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4, where a possible reconstruction of a level line with intensity 100 is shown) on the image. The edges constitute line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of the angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface-continuation constraints and boundary-continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 ml/sec is delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in the coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.
B Method
The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.
The dataset at hand consists of 250 2D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is applied first, keeping the parts brighter than 700 HU. After thresholding, the new images are logical (binary):
Thresh = image > 700
In each of these new images, sub-segmental vessels are present in the lung region. In the second step, these vessels are removed: first, each 2D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items with fewer than 1000 pixels are removed from the image (Figure 3).
Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component is kept and the other parts are removed from the image. Then the complement is taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach column 1 or column 512, the parts satisfying this condition are removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is again labeled with the connected-component labeling algorithm, and components with fewer than 1000 pixels are classified as airways and removed from the image. The final image is the segmented form of the target lung. Before the airways are removed, the edges of the image are found with the Sobel algorithm and added to the original image, so the edges of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
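The first two stages of the pipeline (thresholding, then connected-component labeling with removal of small components) can be sketched as follows. This is an illustrative pure-Python stand-in for the MATLAB operations, not the authors' code; the real pipeline uses a 1000-pixel size limit on 512x512 slices, scaled down here for a toy image.

```python
# Illustrative stand-in for the first two segmentation steps:
# thresholding at 700 HU, then 4-connected component labelling and
# removal of small components (the sub-segmental vessels).

def threshold(image, level=700):
    # logical (binary) image: True where the intensity exceeds the level
    return [[v > level for v in row] for row in image]

def label_components(mask):
    # 4-connected component labelling by iterative flood fill
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for i in range(h):
        for j in range(w):
            if mask[i][j] and labels[i][j] == 0:
                current += 1
                stack = [(i, j)]
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and mask[y][x] and labels[y][x] == 0:
                        labels[y][x] = current
                        stack += [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
    return labels, current

def remove_small(mask, min_pixels):
    # drop every component with fewer than min_pixels pixels
    labels, n = label_components(mask)
    sizes = {c: sum(row.count(c) for row in labels) for c in range(1, n + 1)}
    return [[bool(mask[i][j] and sizes[labels[i][j]] >= min_pixels)
             for j in range(len(mask[0]))] for i in range(len(mask))]
```

On a toy image, a 2x2 bright block survives a 3-pixel size limit while an isolated bright pixel is removed.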
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).
Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data points. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0)
Figure 4b - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of X-ray CT brain scan generated with histogram values, mesh
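As a sketch, the meshgrid behaviour described above (rows of X are copies of x, columns of Y are copies of y) can be reproduced in a few lines of Python; this is an illustrative stand-in, not MATLAB's implementation.

```python
# Plain-Python sketch of meshgrid: rows of X copy the vector x,
# columns of Y copy the vector y.

def meshgrid(x, y):
    X = [list(x) for _ in y]        # each row is a copy of x
    Y = [[v] * len(x) for v in y]   # each column is a copy of y
    return X, Y
```

For example, meshgrid([1, 2, 3], [10, 20]) yields two 2x3 matrices suitable for evaluating f(X, Y) element-wise.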
The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). "Image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) function (the string-input counterpart of surfc) creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole/Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
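What FVTool plots as the magnitude response can be computed directly by evaluating the transfer function H(z) = B(z)/A(z) on the unit circle. The sketch below is an illustrative re-implementation in plain Python, not FVTool itself.

```python
# Sketch of a digital filter's magnitude response |H(e^{jw})| for
# numerator b and denominator a, sampled at n_points frequencies in [0, pi].

import cmath
import math

def magnitude_response(b, a, n_points=8):
    out = []
    for k in range(n_points):
        w = math.pi * k / (n_points - 1)
        z = cmath.exp(-1j * w)  # evaluate on the unit circle
        num = sum(bi * z ** i for i, bi in enumerate(b))
        den = sum(ai * z ** i for i, ai in enumerate(a))
        out.append(abs(num / den))
    return out
```

For the two-tap moving average b = [0.5, 0.5], a = [1], the response is 1 at DC and 0 at the Nyquist frequency, as expected for a low-pass filter.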
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. The shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 - Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.
d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).
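The quantities defined above (distance between shapes, centroid, and size) can be written out directly. Below is a small Python sketch, assuming the 2n-vector layout described earlier; it is an illustration, not the thesis code.

```python
# Sketch of the shape quantities: a shape is a 2n-vector [x1..xn, y1..yn];
# shape distance sums the Euclidean distances between corresponding points;
# the centroid is the mean point; size is the RMS distance from the centroid.

from math import sqrt

def points(shape):
    n = len(shape) // 2
    return list(zip(shape[:n], shape[n:]))

def shape_distance(a, b):
    return sum(sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
               for (x1, y1), (x2, y2) in zip(points(a), points(b)))

def centroid(shape):
    pts = points(shape)
    return (sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts))

def size(shape):
    cx, cy = centroid(shape)
    pts = points(shape)
    return sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in pts) / len(pts))
```

For the unit square [0, 1, 1, 0, 0, 0, 1, 1], the centroid is (0.5, 0.5) and the size is √0.5.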
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape)
2. Translate each shape so that it is centered on the origin
3. Scale the reference shape to unit size; call this shape x̄0, the initial mean shape
4. Repeat:
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x̄0, scale to unit size)
5. Until convergence (i.e., the mean shape does not change much)
Output: set of aligned shapes and the mean shape
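Algorithm 1 can be sketched as follows. This toy version is a simplification rather than the thesis code: alignment is restricted to translation and least-squares scaling (no rotation), and shapes are stored as lists of (x, y) points.

```python
# Toy sketch of Algorithm 1 with similarity reduced to translation and
# scaling only (no rotation). Shapes are lists of (x, y) points.

def center(shape):
    # step 2: translate the shape so it is centered on the origin
    cx = sum(x for x, _ in shape) / len(shape)
    cy = sum(y for _, y in shape) / len(shape)
    return [(x - cx, y - cy) for x, y in shape]

def to_unit(shape):
    # scale a centered shape to unit size
    s = sum(x * x + y * y for x, y in shape) ** 0.5
    return [(x / s, y / s) for x, y in shape]

def align_shapes(shapes, iters=5):
    shapes = [center(s) for s in shapes]
    mean = to_unit(shapes[0])             # step 3: reference shape
    for _ in range(iters):                # step 4: repeat
        # (a) align: least-squares scale of each shape onto the mean
        aligned = []
        for s in shapes:
            num = sum(xs * xm + ys * ym for (xs, ys), (xm, ym) in zip(s, mean))
            den = sum(x * x + y * y for x, y in s)
            aligned.append([(x * num / den, y * num / den) for x, y in s])
        # (b) recalculate the mean shape, (c) constrain it to unit size
        n = len(aligned)
        mean = to_unit([(sum(s[i][0] for s in aligned) / n,
                         sum(s[i][1] for s in aligned) / n)
                        for i in range(len(mean))])
    return aligned, mean
```

Two squares that differ only in scale align to the same point set, and the mean shape has unit size by construction.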
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, while the profile model tries to find the area in the test image that fits the model, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:
x̂ = x̄ + P b    (4.3)
where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
P is the matrix of eigenvectors of the shape covariance, and b is a vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
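Generating a shape from the model is a single matrix-vector operation. Below is a minimal sketch of x̂ = x̄ + P b, using a small hand-written (hypothetical) mode matrix P rather than eigenvectors learnt from real training shapes.

```python
# Sketch of shape generation x̂ = x̄ + P b over the 2n shape vector:
# mean is x̄, P holds one column per variation mode, b are the mode weights.

def generate(mean, P, b):
    return [m + sum(P[i][j] * b[j] for j in range(len(b)))
            for i, m in enumerate(mean)]
```

With mean [0, 0], modes P = [[1, 0], [0, 2]], and parameters b = [3, 1], the generated shape vector is [3, 2].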
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by
d = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not distorted the shape: if the shape model were not employed, the profile model might give the best profile results while the resulting shape is completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is used for illustration.
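The Mahalanobis distance between a test profile g and the mean profile ḡ with covariance Sg can be illustrated for 2-element profiles, with the 2x2 inverse written out explicitly. This is a toy sketch; a real implementation would use longer profiles and a linear solver.

```python
# Sketch of the Mahalanobis profile distance d = (g - gmean)^T Sg^{-1} (g - gmean)
# for 2-element profiles, inverting the 2x2 covariance by hand.

def mahalanobis2(g, gmean, Sg):
    d0, d1 = g[0] - gmean[0], g[1] - gmean[1]
    det = Sg[0][0] * Sg[1][1] - Sg[0][1] * Sg[1][0]
    inv = [[Sg[1][1] / det, -Sg[0][1] / det],
           [-Sg[1][0] / det, Sg[0][0] / det]]
    return (d0 * (inv[0][0] * d0 + inv[0][1] * d1)
            + d1 * (inv[1][0] * d0 + inv[1][1] * d1))
```

With an identity covariance the distance reduces to the squared Euclidean distance, e.g. 1² + 2² = 5.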
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
of an image. The larger this ratio, the easier it is to interpret the image. Satellite images lack adequate contrast and require contrast improvement.
Contrast Enhancement
Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.
Linear Contrast Stretch
This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image become clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
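The linear stretch can be sketched on a flat list of grey values, mapping the observed range onto a full 0-255 display range. This is an illustrative sketch of the technique, not any particular system's implementation.

```python
# Sketch of the linear contrast stretch: map the observed grey-value
# range [lo, hi] linearly onto the full display range [0, out_max].

def linear_stretch(pixels, out_max=255):
    lo, hi = min(pixels), max(pixels)
    return [round((v - lo) * out_max / (hi - lo)) for v in pixels]
```

For example, the narrow range [10, 20, 30] is stretched to [0, 128, 255], with the darkest input mapped to black and the brightest to white.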
Non-Linear Contrast Enhancement
In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between closely related classes and subclasses of a main class.
One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density, which is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
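Histogram equalization can be sketched via the cumulative distribution of grey values; note how adjacent grey levels with similar counts can map to the same output level, reducing the number of distinct levels as described above. This is an illustrative sketch, not a production routine.

```python
# Sketch of histogram equalization: map each grey value through the
# cumulative distribution function (CDF) of the image's histogram.

def equalize(pixels, levels=256):
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    cdf, total = [], 0
    for c in counts:
        total += c
        cdf.append(total)
    n = len(pixels)
    # scale the CDF to the output range [0, levels - 1]
    return [round(cdf[v] * (levels - 1) / n) for v in pixels]
```

For the tiny image [0, 0, 1, 2], half the pixels share value 0, so its output level jumps to the middle of the range while the brightest value maps to 255.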
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7, or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
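The simplest low-pass filter described above, the 3x3 mean with the border handled by repeating border pixel values (option (1)), can be sketched as:

```python
# Sketch of the 3x3 mean (low-pass) filter; pixels beyond the image
# border are handled by replicating the nearest border pixel.

def mean_filter(image):
    h, w = len(image), len(image[0])

    def at(i, j):
        # clamp indices so border pixels are repeated beyond the edge
        return image[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    return [[sum(at(i + di, j + dj)
                 for di in (-1, 0, 1) for dj in (-1, 0, 1)) / 9
             for j in range(w)] for i in range(h)]
```

Because of the border replication, the output has the same dimensions as the input, and a constant image passes through unchanged.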
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window. Thus, the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or non-linear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.
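The directional first difference can be sketched as follows (illustrative Python/NumPy, not from the original text); each output pixel is the difference between two adjacent input pixels in the chosen direction:

```python
import numpy as np

def first_difference(image, direction="horizontal"):
    """Directional first difference, approximating the first derivative
    between two adjacent pixels. The output is one row and/or column
    smaller than the input."""
    a = image.astype(float)
    if direction == "horizontal":
        return a[:, 1:] - a[:, :-1]
    if direction == "vertical":
        return a[1:, :] - a[:-1, :]
    if direction == "diagonal":
        return a[1:, 1:] - a[:-1, :-1]
    raise ValueError(direction)

# A vertical step edge produces a strong horizontal difference at the edge.
img = np.zeros((4, 6)); img[:, 3:] = 10.0
dx = first_difference(img, "horizontal")
```

Flat regions give zero, and the step edge shows up as a single column of large differences, which is the edge response being sought.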
The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Research on human vision physiology suggests that we see objects in much the same way, so images produced by this operation have a more natural look than many of the other edge-enhanced images.
Band ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in the angle and intensity of sunlight illumination. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
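A minimal sketch of band ratioing (illustrative, with made-up brightness values): dividing two co-registered bands largely cancels illumination changes that scale both bands equally, such as shadowing:

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Ratio of two co-registered bands. Illumination that scales both
    bands by the same factor cancels in the ratio; eps guards against
    division by zero."""
    return band_a.astype(float) / (band_b.astype(float) + eps)

# The same surface, sunlit vs. shadowed: brightness halves in both
# bands, but the ratio is (nearly) unchanged.
sunlit = band_ratio(np.array([[80.0]]), np.array([[40.0]]))
shadow = band_ratio(np.array([[40.0]]), np.array([[20.0]]))
```

The near-identical ratios for the sunlit and shadowed pixels illustrate why ratioing suppresses topographic and illumination effects.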
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems, and this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used by [31] to extract femur contours in X-ray images after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.
Active Shape Models (ASMs), introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application. [18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained by the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.
The working mechanisms of the methods discussed above are explained in detail in the sections that follow.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables. It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays: if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The trade-off between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time, and error are studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form an edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; the gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge, from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks, or kernels, one for the horizontal direction and one for the vertical direction, which approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
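The kernels and the convolution step can be sketched as follows (illustrative Python/NumPy, not from the thesis; a "valid" convolution is used, so the output is smaller than the input):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)   # horizontal derivative
SOBEL_Y = SOBEL_X.T                             # vertical derivative

def convolve2d(image, kernel):
    """'Valid' 2D convolution (kernel flipped, as in true convolution)."""
    k = np.flipud(np.fliplr(kernel))
    kh, kw = k.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (image[i:i + kh, j:j + kw] * k).sum()
    return out

def sobel(image):
    """Gradient magnitude and direction from the two Sobel kernels."""
    dx = convolve2d(image.astype(float), SOBEL_X)
    dy = convolve2d(image.astype(float), SOBEL_Y)
    return np.hypot(dx, dy), np.arctan2(dy, dx)

img = np.zeros((5, 5)); img[:, 3:] = 1.0       # vertical step edge
mag, ang = sobel(img)
```

As the text states, the magnitude is zero in the constant region and large only where the window straddles the step edge.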
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel while calculating the directional derivative at that point [15][26]; this is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detecting technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels below a certain threshold are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
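The final double-threshold (hysteresis) stage described above can be sketched as follows (illustrative Python/NumPy, not from the thesis; the threshold values are arbitrary):

```python
import numpy as np
from collections import deque

def hysteresis(grad, low, high):
    """Double thresholding as in the final Canny stage: pixels above
    `high` are edges; pixels between `low` and `high` become edges only
    if they are (diagonally) adjacent to an edge pixel."""
    strong = grad >= high
    weak = (grad >= low) & ~strong
    edges = strong.copy()
    q = deque(zip(*np.nonzero(strong)))
    h, w = grad.shape
    while q:
        y, x = q.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and weak[ny, nx] and not edges[ny, nx]:
                    edges[ny, nx] = True
                    q.append((ny, nx))
    return edges

# One strong pixel, one weak pixel next to it, one isolated weak pixel.
g = np.array([[0.0, 0.9, 0.3, 0.0, 0.3],
              [0.0, 0.0, 0.0, 0.0, 0.0]])
e = hysteresis(g, low=0.2, high=0.8)
```

The weak pixel touching the strong one is kept, while the isolated weak pixel is discarded, which is exactly the rule stated in the text.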
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of the image to analyze it. It attempts to quantify the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
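Range filtering as described can be sketched as follows (illustrative Python/NumPy, not from the thesis): the local range (maximum minus minimum) is computed in a sliding window, giving a simple per-pixel texture measure:

```python
import numpy as np

def range_filter(image, n=3):
    """Local range (max - min) in an n x n neighbourhood: a simple
    texture measure that is high in textured areas and ~0 in flat ones."""
    pad = n // 2
    p = np.pad(image.astype(float), pad, mode="edge")
    out = np.empty(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            w = p[i:i + n, j:j + n]
            out[i, j] = w.max() - w.min()
    return out

flat = range_filter(np.full((4, 4), 7.0))                 # flat region
tex = range_filter(np.array([[0.0, 9.0], [9.0, 0.0]]))    # checkerboard
```

The flat region scores zero everywhere, while the checkerboard (a stand-in for textured bone) scores high, which is the contrast the thesis exploits to separate bone from mesh.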
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σD) = [ Ixx(x, σD)   Ixy(x, σD)
             Ixy(x, σD)   Iyy(x, σD) ]   (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)
or
P(x) = min(λ2(x), 0)   (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:
MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
  ...
MPn2  MPn3  MPn4  MPn5   (4)
where MPij = max(Pi,j−1, Pij, Pi,j+1).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
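A rough sketch of the principal curvature image of Eq. 2 (illustrative Python/NumPy, not the authors' implementation: plain finite differences stand in for the Gaussian derivatives at scale σD, and a single scale is shown rather than the full pyramid):

```python
import numpy as np

def principal_curvature(image):
    """P(x) = max(lambda_1(x), 0) (Eq. 2): maximum eigenvalue of the
    2x2 Hessian at each pixel, clamped at zero. Hessian entries come
    from finite differences here, to keep the sketch dependency-free."""
    a = image.astype(float)
    iy, ix = np.gradient(a)          # first derivatives (rows, cols)
    ixy, ixx = np.gradient(ix)       # second derivatives of ix
    iyy, _ = np.gradient(iy)         # second derivative of iy
    # Closed-form largest eigenvalue of the symmetric 2x2 Hessian.
    mean = (ixx + iyy) / 2.0
    dev = np.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    return np.maximum(mean + dev, 0.0)

# Dark horizontal line on a light background: Eq. 2 should respond
# strongly on the line and give zero in the flat background.
img = np.full((7, 7), 10.0); img[3, :] = 0.0
P = principal_curvature(img)
```

The response peaks along the dark line (a valley of the intensity surface) and vanishes in the uniform background, matching the stated behaviour of Eq. 2.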
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
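The grayscale closing f • b can be sketched as follows (illustrative Python/NumPy, not the authors' code; a flat 5x5 square structuring element stands in for the 5x5 disk, and dilation/erosion are implemented as local max/min):

```python
import numpy as np

def gray_dilate(image, n=5):
    """Grayscale dilation with a flat n x n structuring element."""
    pad = n // 2
    p = np.pad(image.astype(float), pad, mode="edge")
    out = np.empty(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = p[i:i + n, j:j + n].max()
    return out

def gray_erode(image, n=5):
    """Grayscale erosion with a flat n x n structuring element."""
    pad = n // 2
    p = np.pad(image.astype(float), pad, mode="edge")
    out = np.empty(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = p[i:i + n, j:j + n].min()
    return out

def gray_close(image, n=5):
    """Closing (dilation then erosion): fills small 'potholes' (local
    minima) narrower than the structuring element."""
    return gray_erode(gray_dilate(image, n), n)

terrain = np.ones((7, 7)); terrain[3, 3] = 0.0   # one-pixel pothole
closed = gray_close(terrain)
```

The one-pixel pothole, which would otherwise seed a spurious catchment basin, is filled to the surrounding level, which is the local-minimum removal the paper relies on before the watershed step.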
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
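The clustering step can be sketched with plain k-means over chromatic feature vectors (illustrative Python/NumPy, not the De-pict implementation; the complete-linkage refinement is omitted, and the centers are initialized from the first k pixels for simplicity):

```python
import numpy as np

def kmeans_colors(pixels, k, iters=20):
    """Plain k-means over per-pixel color features: groups pixels into
    clusters of similarly coloured strokes, each summarized by its mean
    chromatic vector."""
    centers = pixels[:k].astype(float)   # simple init: first k pixels
    labels = np.zeros(len(pixels), dtype=int)
    for _ in range(iters):
        # Assign each pixel to the nearest cluster center.
        d = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Re-estimate each center as the mean of its pixels.
        for c in range(k):
            if (labels == c).any():
                centers[c] = pixels[labels == c].mean(axis=0)
    return labels, centers

# Two well-separated colour groups (hypothetical RGB values).
pts = np.array([[250.0, 10, 10], [10, 10, 250], [245, 12, 8], [8, 12, 245]])
labels, centers = kmeans_colors(pts, k=2)
```

The reddish and bluish pixels end up in separate clusters, each described by its mean chromatic feature vector, as in the labeling described above.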
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
following energy function (E-step)
min
L
X
p
jjfp 1048576 cLp jj22
+ _
X
fpqg2N
T
jepqj
[Lp 6= Lq] (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After this spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
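The E-M iteration can be sketched as follows (illustrative Python/NumPy on a 1D strip of pixels; note that the exact E-step uses graph cuts, which is replaced here by a greedy ICM sweep, and the edge-length weight is taken as 1):

```python
import numpy as np

def segment_em(features, k, lam=1.0, iters=10):
    """E-M sketch of Eq. (1) on a 1-pixel-wide strip. E-step: each pixel
    takes the label minimizing colour distance to its cluster center
    plus lam for every neighbour with a different label (a greedy
    stand-in for the exact graph-cut solve). M-step: re-estimate the
    k appearance models as cluster means."""
    n = len(features)
    centers = features[:k].astype(float)     # simple init: first k pixels
    labels = np.zeros(n, dtype=int)
    for _ in range(iters):
        # E-step (approximate): coordinate-wise label updates.
        for p in range(n):
            costs = ((features[p] - centers) ** 2).sum(axis=1)
            for q in (p - 1, p + 1):         # 1D neighbourhood N
                if 0 <= q < n:
                    costs += lam * (np.arange(k) != labels[q])
            labels[p] = costs.argmin()
        # M-step: refit the k appearance models.
        for c in range(k):
            if (labels == c).any():
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers

# Two colour populations (values ~0 and ~5) along a strip of 6 pixels.
f = np.array([[0.0], [0.1], [5.0], [0.05], [5.1], [5.2]])
labels, centers = segment_em(f, k=2, lam=1.0)
```

The appearance term dominates for well-separated colours, and the pairwise term discourages label flips between neighbours; with a larger lam the spatial prior would smooth away even the isolated odd pixel.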
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well suited for inpainting images with no or very few textures, due to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7] that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and also about breath holding. Imaging was performed with a bolus tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield Units). 70 mL of nonionic contrast agent, at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed at the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.
B Method
The stages followed while doing lung segmentation from CTA images in this work are shown in Figure 1. The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so that the parts brighter than 700 HU are retained. At the end of thresholding, the new images have logical (binary) values:
Thresh = image > 700
In each of these new images subsegment vessels exist in lung region At the second step this
method has been used to get rid of these vessels firstly each of 2D images has been
considered one by one and each of components in the image have labeled with ldquoconnected
component labelling algorithmrdquo Then looking at the size of each labeled piece items whose
pixel numbers are under 1000 were removed from the image Figure 3
Next, the image in Figure 3 was labeled again with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component was kept and the other parts were removed from the image. Then its complement was taken, so that all 0s turn into 1s and all 1s turn into 0s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach the 1st or 512th pixel (i.e., touch the image border), the parts satisfying this condition were removed, leaving the lung and airway as shown in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image was labeled with the connected component labeling algorithm, and components with fewer than 1000 pixels were identified as airways and removed from the image. The final image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6b). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image was obtained (Figure 6c).
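The final masking step (multiplying the defined lung region with the original slice) can be illustrated with toy arrays; the values are assumptions for illustration:

```python
import numpy as np

# Assumed toy data: original slice intensities and the binary lung mask
# produced by the segmentation steps above.
original = np.array([[10, 20, 30],
                     [40, 50, 60]])
lung_mask = np.array([[0, 1, 1],
                      [0, 1, 0]])

# Element-wise multiplication keeps intensities inside the mask, zeros elsewhere.
segmented = original * lung_mask
print(segmented)
```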
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4].
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram representing the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An area graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b - Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0)
Figure 4b - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0.4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of X-ray CT brain scan generated with histogram values, mesh
The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source; it can make subtle differences in surface shape easier to see and can add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). "Image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
Contour plots are useful for delineating organ boundaries in images. A contour plot displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) (or surfc) function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8).
Figure 8 - Contour3 on X-ray CT brain scan
The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of a matrix displays the matrix by graphing the columns as segmented strips (Figure 10).
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).
Figure 11 - Magnitude and Phase Response; frequency scale: a) linear, b) log
Figure 12 - Group Delay Response; frequency scale: a) linear, b) log
Figure 13 - Phase Delay Response; frequency scale: a) linear, b) log
Figure 14 - (a) Impulse Response, (b) Pole-Zero Plot
Figure 15 - Step Response: (a) default, (b) specified length 50
Figure 16 - Magnitude Response Estimate; frequency scale: a) linear, b) log
Figure 17 - Magnitude Response and Round-off Noise Power Spectrum; frequency scale: a) linear, b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n x 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis and in the code used, a shape will be defined as a 2n x 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 - Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and the mean shape
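Algorithm 1 can be sketched roughly as below. This is a simplified version that aligns by translation and scale only, omitting the rotation step of a full Procrustes alignment (an assumption of this sketch); the two toy triangles are illustrative data:

```python
import numpy as np

def center(shape):
    """Translate a shape (n x 2 array of points) so its centroid is the origin."""
    return shape - shape.mean(axis=0)

def to_unit(shape):
    """Scale a centered shape to unit size (root-mean-square point distance)."""
    size = np.sqrt((shape ** 2).sum(axis=1).mean())
    return shape / size

def align_shapes(shapes, iters=10):
    """Simplified Algorithm 1: center, scale, and iterate toward a mean shape."""
    shapes = [center(s.astype(float)) for s in shapes]
    mean = to_unit(shapes[0])            # reference: the first shape
    for _ in range(iters):
        # Align every shape to the mean by scaling (full Procrustes would rotate too).
        shapes = [to_unit(s) for s in shapes]
        new_mean = to_unit(center(np.mean(shapes, axis=0)))
        if np.allclose(new_mean, mean):
            break                        # convergence: mean shape stable
        mean = new_mean
    return shapes, mean

# Two toy triangles differing only in scale and translation.
s1 = np.array([[0, 0], [2, 0], [1, 2]])
s2 = np.array([[10, 10], [14, 10], [12, 14]])
aligned, mean_shape = align_shapes([s1, s2])
```

After alignment the two triangles coincide, and the mean shape has unit size as required by step 3.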
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So while the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape together with its permissible variations is formulated [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes,
Φ is the matrix of the principal eigenvectors of the covariance of the training shapes, and
b is a vector of shape parameters.
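The classical ASM shape model (mean shape plus principal modes of variation) can be sketched with PCA via SVD; the three toy shape vectors and the single retained mode are illustrative assumptions:

```python
import numpy as np

# Toy training set: each row is a shape vector (x co-ordinates followed by
# y co-ordinates, as defined in Section 4.1); assumed illustrative data.
X = np.array([[0., 2., 1., 0., 0., 2.0],
              [0., 2., 1., 0., 0., 2.2],
              [0., 2., 1., 0., 0., 1.8]])

x_bar = X.mean(axis=0)                       # mean shape
U, S, Vt = np.linalg.svd(X - x_bar, full_matrices=False)
Phi = Vt.T[:, :1]                            # first eigenvector (main mode of variation)

# Generate a new shape: x_hat = x_bar + Phi b
b = np.array([0.1])
x_hat = x_bar + Phi @ b
```

Setting b = 0 reproduces the mean shape; varying b within limits generates the permissible shapes.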
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The points perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and covariance matrix Sg.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between the test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g - ḡ)^T Sg^(-1) (g - ḡ)
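The Mahalanobis profile distance can be computed as follows (toy 3-element profiles; the identity covariance is an assumption for illustration, under which the measure reduces to squared Euclidean distance):

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S_g):
    """Squared Mahalanobis distance between a sampled profile g and the
    mean profile: (g - g_mean)^T  S_g^{-1}  (g - g_mean)."""
    d = g - g_mean
    # Solving S_g x = d is more stable than forming the inverse explicitly.
    return float(d @ np.linalg.solve(S_g, d))

g_mean = np.array([1.0, 2.0, 3.0])
S_g = np.eye(3)
print(mahalanobis_sq(np.array([2.0, 2.0, 3.0]), g_mean, S_g))   # 1.0
```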
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and the shape model then confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, and not a bone X-ray, is shown).
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. But the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, since it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is a poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
number of grey levels in the enhanced image is less than the number of grey levels in the
original image
SPATIAL FILTERING
A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.
Low-Frequency Filtering in the Spatial Domain
Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask, or kernel (n), is usually 3x3, 5x5, 7x7, or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects; blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel results in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, and (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are the mean, median, and mode filters.
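The mean (low-pass) filter described above can be sketched naively as follows, including the shrinkage by two rows and two columns for a 3x3 kernel (the ramp image is an assumption for illustration):

```python
import numpy as np

def mean_filter(img, n=3):
    """Low-pass (mean) filter with an n x n kernel; the output shrinks by
    n-1 rows and columns, as noted in the text for a 3x3 kernel."""
    h, w = img.shape
    r = n // 2
    out = np.empty((h - 2 * r, w - 2 * r))
    for i in range(r, h - r):
        for j in range(r, w - r):
            # BVout is the mean of the n x n neighbourhood around BVin.
            out[i - r, j - r] = img[i - r:i + r + 1, j - r:j + r + 1].mean()
    return out

img = np.arange(25, dtype=float).reshape(5, 5)
print(mean_filter(img).shape)   # (3, 3): two rows and two columns smaller
```

A production version would instead pad the border (technique 1 in the text) to keep the output the same size.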
High-Frequency Filtering in the Spatial Domain
High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or nonlinear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.
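The directional first difference can be sketched as simple shifted subtractions (the toy image with a single vertical edge is an assumption):

```python
import numpy as np

def first_difference(img, direction="horizontal"):
    """Directional first difference, approximating the first derivative
    between adjacent pixels in the chosen direction."""
    if direction == "horizontal":
        return img[:, 1:] - img[:, :-1]
    if direction == "vertical":
        return img[1:, :] - img[:-1, :]
    # Diagonal: difference along the main diagonal.
    return img[1:, 1:] - img[:-1, :-1]

img = np.array([[10., 10., 50.],
                [10., 10., 50.],
                [10., 10., 50.]])
print(first_difference(img))   # the vertical edge shows up as a large value
```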
The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way; hence, images produced with this operation have a more natural look than many other edge-enhanced images.
Band ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
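A band ratio can be illustrated with two hypothetical bands (assumed data): pixels of the same material that differ only in illumination, i.e., whose band values are scaled by a common factor, produce the same ratio, which is why ratioing suppresses slope and shadow effects:

```python
import numpy as np

# Two hypothetical bands of the same scene; pixel (0,1) is the same
# material as pixel (0,0) but at half the illumination.
band_a = np.array([[80., 40.], [60., 20.]])
band_b = np.array([[20., 10.], [30., 20.]])

# A small epsilon guards against division by zero in deeply shadowed pixels.
ratio = band_a / (band_b + 1e-6)
print(ratio)
```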
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems; this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; it tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with; the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.
[18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.
The working mechanisms of the methods discussed above are explained in detail in the sections that follow.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The tradeo_ between automizing the algorithm and the accuracy of the results using the
Active Shape and Active Contour Models is examined in [31] If the model is made fully
automatic by estimating the initial conditions the accuracy will be lower than when the
initial conditions of the model are defined by user inputs [31] implements both manual and
automatic approaches and identifies that automatically segmenting bone structures from noisy
X-ray images is a complex problem This thesis project tackles these limitations The manual
and automatic approaches
are tried using Active Shape Models The relationship between the size of the
training set computation time and error are studied
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which also includes methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator performs edge detection by calculating the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector whose components are the partial derivatives in the horizontal and vertical directions; it can equivalently be represented by a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 give the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector gives the direction of the largest increase in image intensity, while its magnitude denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at a point in a region of constant image intensity is a zero vector, while at a point on an edge it is a vector that points across the edge, from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks (kernels), one for the horizontal and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are the convolution kernels that are used.
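The kernels and the two convolutions described above can be sketched as follows (a minimal illustration in Python/scipy; the small test image and variable names are ours, not from the text):

```python
import numpy as np
from scipy.ndimage import convolve

# Hypothetical test image: dark left half, bright right half.
A = np.zeros((5, 5))
A[:, 3:] = 1.0

# Standard Sobel kernels: horizontal (x) and vertical (y) derivatives.
Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
Ky = Kx.T

Dx = convolve(A, Kx)                 # Eq. 3.3: derivative in x
Dy = convolve(A, Ky)                 # Eq. 3.4: derivative in y
magnitude = np.hypot(Dx, Dy)         # Eq. 3.1: |grad D| = sqrt(Dx^2 + Dy^2)
direction = np.arctan2(Dy, Dx)       # Eq. 3.2: gradient angle
```

In the flat region the gradient is a zero vector, while at the vertical step edge only the horizontal derivative responds, matching the description above.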
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt, however, are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel when calculating the directional derivative at that point [15][26]. This is why Sobel has a weight of 2 in the centre of its kernel columns where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.
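The difference between the two kernel families (Equations 3.5/3.6 versus 3.3/3.4) can be written out directly; a small sketch, with variable names of our choosing:

```python
import numpy as np

# Prewitt weights all rows equally (1); Sobel gives the centre of each
# nonzero column a weight of 2, which makes it less sensitive to noise.
prewitt_x = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], float)
sobel_x   = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
prewitt_y = prewitt_x.T
```

Both kernel sets sum to zero, so they give no response in regions of constant intensity.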
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the square root of the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the gradient magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
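A minimal sketch of the Roberts Cross operator under the same conventions (the 2×2 kernels are the standard ones; the test image is ours):

```python
import numpy as np
from scipy.ndimage import convolve

# The two 2x2 Roberts Cross kernels respond to diagonal differences.
Gx = np.array([[1, 0], [0, -1]], float)
Gy = np.array([[0, 1], [-1, 0]], float)

img = np.zeros((4, 4))
img[:, 2:] = 1.0                      # hypothetical vertical step edge

Dx = convolve(img, Gx)
Dy = convolve(img, Gy)
magnitude = np.hypot(Dx, Dy)          # combined diagonal-difference magnitude
```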
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). The gradient of the image is then calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the two-level example in Section 2.4, is then used on the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
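The pipeline described above can be sketched roughly as follows (a simplified, illustrative version: non-maximal suppression is omitted, the thresholds are arbitrary, and hysteresis is approximated by connected-component propagation):

```python
import numpy as np
from scipy import ndimage

def canny_like(img, sigma=1.0, low=0.1, high=0.3):
    """Simplified Canny-style pipeline: Gaussian smoothing, gradient
    magnitude, then double thresholding with hysteresis."""
    smoothed = ndimage.gaussian_filter(img, sigma)
    Dx = ndimage.sobel(smoothed, axis=1)
    Dy = ndimage.sobel(smoothed, axis=0)
    mag = np.hypot(Dx, Dy)
    mag /= mag.max() if mag.max() > 0 else 1.0
    strong = mag > high
    weak = mag > low
    # Hysteresis: keep weak pixels only in components containing a strong pixel.
    labels, n = ndimage.label(weak)
    keep = np.zeros(n + 1, bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False
    return keep[labels]

img = np.zeros((20, 20))
img[:, 10:] = 1.0                     # synthetic step edge
edges = canny_like(img)
```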
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of an image to analyze it, quantifying visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity of the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
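The two texture filters named above can be approximated outside MATLAB as follows (a sketch of rangefilt/stdfilt analogues; the window size and test image are our choices):

```python
import numpy as np
from scipy import ndimage

# Local range over a 3x3 neighbourhood: local max minus local min.
def range_filter(img, size=3):
    return ndimage.maximum_filter(img, size) - ndimage.minimum_filter(img, size)

# Local standard deviation via E[x^2] - E[x]^2 over the same window.
def std_filter(img, size=3):
    mean = ndimage.uniform_filter(img, size)
    mean_sq = ndimage.uniform_filter(img * img, size)
    return np.sqrt(np.clip(mean_sq - mean * mean, 0, None))

img = np.zeros((6, 6))
img[:, 3:] = 1.0                      # textured transition at the step
r = range_filter(img)
s = std_filter(img)
```

Both filters respond only where the local neighbourhood is inhomogeneous, which is what makes them useful as texture cues.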
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σ_D) = [ I_xx(x, σ_D)   I_xy(x, σ_D)
              I_xy(x, σ_D)   I_yy(x, σ_D) ]      (1)
where I_xx, I_xy, and I_yy are the second-order partial derivatives of the image evaluated at the point x, and σ_D is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points," our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions." As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes.
Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term I_xy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ_1(x), 0)      (2)
or
P(x) = min(λ_2(x), 0)      (3)
where λ_1(x) and λ_2(x) are the maximum and minimum eigenvalues of H at x, respectively. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I_11, and then produce increasingly Gaussian-smoothed images I_1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I_11 to I_16. Image I_14 is downsampled to half its size to produce image I_21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log₂(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image P_ij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image are computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale-space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:
MP_12  MP_13  MP_14  MP_15
MP_22  MP_23  MP_24  MP_25
  ⋮
MP_n2  MP_n3  MP_n4  MP_n5      (4)
where MP_ij = max(P_{i,j−1}, P_{i,j}, P_{i,j+1}).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
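The computation of Eq. 2 and the cross-scale maximum of Eq. 4 can be sketched as follows (an illustrative fragment using Gaussian-derivative filters for the Hessian entries; the function names are ours, and the octave construction and incremental smoothing are omitted):

```python
import numpy as np
from scipy import ndimage

def principal_curvature(img, sigma):
    # order=(2, 0) differentiates twice along axis 0 (y), and so on.
    Iyy = ndimage.gaussian_filter(img, sigma, order=(2, 0))
    Ixx = ndimage.gaussian_filter(img, sigma, order=(0, 2))
    Ixy = ndimage.gaussian_filter(img, sigma, order=(1, 1))
    # Closed-form maximum eigenvalue of the symmetric 2x2 Hessian.
    tr = Ixx + Iyy
    disc = np.sqrt((Ixx - Iyy) ** 2 + 4.0 * Ixy ** 2)
    return np.maximum(0.5 * (tr + disc), 0.0)   # Eq. 2: max(lambda_1, 0)

# Dark line on a light background: Eq. 2 gives a high response along it.
img = np.ones((21, 21))
img[10, :] = 0.0
Ps = [principal_curvature(img, s) for s in (1.0, 2 ** (1 / 3), 2 ** (2 / 3))]
MP = np.maximum.reduce(Ps)          # Eq. 4: max over three consecutive scales
```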
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins.
Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to signal strength (i.e., line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.
In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.
Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).
The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
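The cleaning steps described above can be sketched as follows (a simplified version: the 5×5 disk closing and the 0.04 high threshold come from the text, but the eigenvector-flow adaptation of the low threshold is replaced here by a fixed 0.2 ratio for brevity):

```python
import numpy as np
from scipy import ndimage

def disk(radius):
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x * x + y * y) <= radius * radius

def clean_curvature(MP, high=0.04, low_ratio=0.2):
    # Grayscale closing with a 5x5 disk removes small "potholes".
    closed = ndimage.grey_closing(MP, footprint=disk(2))
    strong = closed > high
    weak = closed > high * low_ratio
    # Hysteresis: weak pixels survive only if connected to a strong seed.
    labels, n = ndimage.label(weak)
    keep = np.zeros(n + 1, bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False
    return keep[labels]

# Synthetic ridge with one low-contrast pixel in the middle of it.
MP = np.zeros((16, 16))
MP[8, 2:14] = 0.05
MP[8, 7] = 0.01
binary = clean_curvature(MP)
```

The weak pixel falls below the high threshold, but hysteresis recovers it because it is connected to strong ridge pixels, illustrating how gaps are bridged.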
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
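The overlap test can be sketched with a simple area-based overlap error (one minus the intersection-over-union of two region masks; the exact elliptical formulation of [19] is not reproduced here):

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """1 - |A intersect B| / |A union B| for two binary region masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union if union else 1.0

# Two square regions of 16 px each, offset by one pixel.
a = np.zeros((10, 10), bool); a[2:6, 2:6] = True
b = np.zeros((10, 10), bool); b[3:7, 3:7] = True
err = overlap_error(a, b)   # intersection 9 px, union 23 px
```

A region detected at one scale would be kept if some region at a neighbouring scale has a sufficiently small overlap error against it.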
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting have similar colors, the regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
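The clustering step can be sketched with a hand-rolled k-means over pixel colors (the complete-linkage refinement from the text is omitted, and the two-color "image" is purely illustrative):

```python
import numpy as np

def kmeans(pixels, k, iters=10):
    # Deterministic init: k evenly spaced pixels as initial centers.
    centers = pixels[np.linspace(0, len(pixels) - 1, k).astype(int)].copy()
    for _ in range(iters):
        # Assign each pixel to its nearest center (E-like step).
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its pixels (M-like step).
        for i in range(k):
            if np.any(labels == i):
                centers[i] = pixels[labels == i].mean(axis=0)
    return labels, centers

# Two "layers" of stroke colour: 50 red pixels, then 50 blue pixels.
pixels = np.vstack([np.tile([255.0, 0.0, 0.0], (50, 1)),
                    np.tile([0.0, 0.0, 255.0], (50, 1))])
labels, centers = kmeans(pixels, k=2)
```

Each resulting label plays the role of a candidate stroke layer, described by its mean chromatic vector.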
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M manner.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):
min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N} |e_{pq}| · δ[L_p ≠ L_q]      (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, and δ[·] is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels into the different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
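The energy of Eq. (1) for a candidate labeling can be evaluated directly; in this sketch the edge lengths |e_pq| are taken as 1 on a 4-connected grid, and the graph-cut minimization itself is not reproduced:

```python
import numpy as np

def potts_energy(features, centers, labels, lambda_=1.0):
    # Appearance term: squared distance of each pixel to its cluster center.
    data = np.sum((features - centers[labels]) ** 2)
    # Smoothness term: count 4-connected neighbour pairs with unequal labels.
    smooth = np.count_nonzero(labels[:, 1:] != labels[:, :-1]) \
           + np.count_nonzero(labels[1:, :] != labels[:-1, :])
    return data + lambda_ * smooth

# Synthetic 4x4 "image": left half matches center 0, right half center 1.
features = np.zeros((4, 4, 3))
features[:, 2:] = 1.0
centers = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])

coherent = np.zeros((4, 4), int); coherent[:, 2:] = 1
noisy = coherent.copy(); noisy[0, 0] = 1     # one mislabeled pixel

e_coherent = potts_energy(features, centers, coherent)
e_noisy = potts_energy(features, centers, noisy)
```

The mislabeled pixel raises both the appearance term and the boundary count, so the spatially coherent labeling has strictly lower energy.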
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.7 that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following, we briefly review Schoenemann et al.'s method in detail.
To formulate the problem as a linear program, this approach models curvature in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line-segment pairs are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of the angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice was taken at the level of the pulmonary truncus. A bolus-tracking region was placed at the pulmonary truncus and the trigger was adjusted to 100 HU (Hounsfield Units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec was delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reached the pre-adjusted level, the exam was performed from the supraclavicular region to the diaphragms. Contrast injection was performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, and pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in the coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.
B Method
The stages followed while performing lung segmentation from CTA images in this work are shown in Figure 1.
The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied by keeping the parts greater than 700 HU. At the end of thresholding, the new images are of logical (binary) values:
Thresh = image > 700
Each of these new images contains subsegmental vessels in the lung region. In the second step, to get rid of these vessels, each 2D image was considered one by one, and each component in the image was labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts were under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 was labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component was kept and the other parts were removed from the image. The image was then inverted, so all 0s turn into 1s and all 1s turn into 0s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach the 1st or 512th pixel rows and columns, the parts satisfying this condition were removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image was labeled with the connected component labeling algorithm, and the components whose pixel counts were below 1000 were determined to be airways and removed from the image. The image now at hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image was obtained (Figure 6(c)).
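The thresholding and labeling steps described above can be sketched as follows (a simplified 2D version on a synthetic slice with illustrative HU values; the airway-removal and edge-overlay steps are omitted):

```python
import numpy as np
from scipy import ndimage

def remove_small(mask, min_px=1000):
    """Drop connected components smaller than min_px pixels."""
    labels, n = ndimage.label(mask)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_px
    keep[0] = False
    return keep[labels]

def segment_lungs(slice_hu):
    # Threshold at 700 HU-like value, then drop small components.
    body_and_more = remove_small(slice_hu > 700)
    # Largest remaining component is taken as the patient's body.
    labels, n = ndimage.label(body_and_more)
    sizes = np.bincount(labels.ravel()); sizes[0] = 0
    body = labels == sizes.argmax()
    # Invert: holes inside the body plus the outside air.
    holes = ~body
    labels, n = ndimage.label(holes)
    # Components touching the image border are outside air; remove them.
    border = np.unique(np.concatenate([labels[0], labels[-1],
                                       labels[:, 0], labels[:, -1]]))
    return holes & ~np.isin(labels, border)

slice_hu = np.full((120, 120), 1000)   # body tissue above the threshold
slice_hu[0:5] = -1000                  # air above the body
slice_hu[-5:] = -1000                  # air below the body
slice_hu[30:90, 20:50] = -800          # left "lung"
slice_hu[30:90, 70:100] = -800         # right "lung"
lungs = segment_lungs(slice_hu)
```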
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization that furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed, in extreme close-up view, in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformations, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve is plotted as a magenta line through the data plot. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b - Area graph of X-ray CT brain scan
The 3-D Surface Plot generates a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object, or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
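For readers working outside MATLAB, numpy's meshgrid behaves in the same way; a small sketch:

```python
import numpy as np

# Rows of X are copies of x; columns of Y are copies of y,
# matching the MATLAB meshgrid description above.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)
Z = X + Y   # evaluate a function of two variables over the grid
```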
3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface plot of X-ray CT brain scan generated with histogram values, lighting
The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' function with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
A contour plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
The 3-D lit surface plot (surface plot with colormap-based lighting, surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D ribbon graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole-Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
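The magnitude response that FVTool computes can be sketched directly from the filter's transfer function. This is an illustrative Python evaluation of H(e^jw) = B(e^jw)/A(e^jw) for a filter with numerator b and denominator a; it is a sketch of the underlying mathematics, not FVTool's implementation:

```python
import cmath
import math

def freq_response(b, a, w):
    """Evaluate H(e^jw) = B(e^jw) / A(e^jw) at radian frequency w
    for a digital filter with numerator b and denominator a."""
    z = cmath.exp(-1j * w)  # e^{-jw}, so terms are b[k] * e^{-jwk}
    num = sum(bk * z ** k for k, bk in enumerate(b))
    den = sum(ak * z ** k for k, ak in enumerate(a))
    return num / den

# Two-point moving average: passes DC, rejects the Nyquist frequency.
b, a = [0.5, 0.5], [1.0]
mag_dc = abs(freq_response(b, a, 0.0))       # magnitude at w = 0
mag_nyq = abs(freq_response(b, a, math.pi))  # magnitude at w = pi
```

Sampling this function over a grid of w values yields the magnitude response curve FVTool plots; taking the argument instead of the absolute value gives the phase response.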
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
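The 2n × 1 shape representation described above can be sketched in a few lines. This is an illustrative Python helper (the thesis itself provides no code); the function name is hypothetical:

```python
def points_to_shape_vector(points):
    """Flatten an n x 2 list of (x, y) landmark points into the 2n x 1
    representation used here: all x co-ordinates, then all y co-ordinates."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs + ys

# Three landmark points as (x, y) pairs.
shape = points_to_shape_vector([(1, 4), (2, 5), (3, 6)])
# shape is [1, 2, 3, 4, 5, 6]: x co-ordinates first, then y co-ordinates
```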
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance.
d = √((x2 − x1)² + (y2 − y1)²)   (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).
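The centroid and size definitions above can be sketched as follows. This is an illustrative Python version using the thesis's 2n × 1 shape layout (x co-ordinates first, then y); the helper names are hypothetical:

```python
import math

def centroid(shape):
    """Mean of the point positions; shape is [x1..xn, y1..yn]."""
    n = len(shape) // 2
    return (sum(shape[:n]) / n, sum(shape[n:]) / n)

def shape_size(shape):
    """Root mean square distance between the points and the centroid."""
    n = len(shape) // 2
    cx, cy = centroid(shape)
    return math.sqrt(sum((shape[i] - cx) ** 2 + (shape[n + i] - cy) ** 2
                         for i in range(n)) / n)

# A 2 x 2 axis-aligned square: points (0,0), (2,0), (2,2), (0,2).
square = [0, 2, 2, 0, 0, 0, 2, 2]
# centroid(square) is (1.0, 1.0); every corner lies sqrt(2) from it,
# so shape_size(square) is sqrt(2)
```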
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: set of aligned shapes and mean shape
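Algorithm 1 can be sketched as below. This is a deliberately simplified illustrative Python version (the thesis provides no code): it handles only the translation and scaling steps and omits the rotation component of a full Procrustes alignment, so it is a sketch of the iteration structure rather than a complete implementation:

```python
import math

def center(shape):
    """Step 2: translate a [x1..xn, y1..yn] shape onto the origin."""
    n = len(shape) // 2
    cx = sum(shape[:n]) / n
    cy = sum(shape[n:]) / n
    return [v - cx for v in shape[:n]] + [v - cy for v in shape[n:]]

def to_unit(shape):
    """Scale a shape vector to unit size (unit Euclidean norm)."""
    norm = math.sqrt(sum(v * v for v in shape))
    return [v / norm for v in shape]

def align_shapes(shapes, iters=10):
    """Simplified Algorithm 1: center every shape, then iteratively
    re-estimate the mean shape, constraining it to unit size each pass
    (rotation alignment omitted for brevity)."""
    shapes = [center(s) for s in shapes]
    mean = to_unit(shapes[0])                    # x0, the initial mean
    for _ in range(iters):
        aligned = [to_unit(s) for s in shapes]   # scale-align the set
        cols = list(zip(*aligned))
        mean = to_unit([sum(c) / len(aligned) for c in cols])
    return aligned, mean

# Two squares of different position and size align to the same shape.
shapes = [[0, 2, 2, 0, 0, 0, 2, 2],
          [1, 5, 5, 1, 1, 1, 5, 5]]
aligned, mean = align_shapes(shapes)
# every aligned shape and the final mean shape have unit size
```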
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So while the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models therefore correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:
x̂ = x̄ + Φb   (4.3)
where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters.
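The shape-generation equation can be sketched as below. This is an illustrative Python sketch with a hypothetical single-mode model (the mean, eigenvector, and parameter values are invented for illustration):

```python
def generate_shape(mean, Phi, b):
    """Sketch of the shape model: x_hat = x_bar + Phi * b, where the
    columns of Phi are the retained eigenvectors and b holds the shape
    parameters."""
    return [xm + sum(Phi[i][k] * b[k] for k in range(len(b)))
            for i, xm in enumerate(mean)]

# Hypothetical 4-element shape vector with one mode of variation.
mean = [0.0, 1.0, 0.0, 1.0]
Phi = [[0.5], [0.5], [-0.5], [-0.5]]  # a single eigenvector column
# b = [0.0] reproduces the mean shape exactly; varying the entry of b
# moves the generated shape along this mode of variation.
```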
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix Sg.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by
f = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
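The Mahalanobis computation can be sketched as follows. This is an illustrative Python version that takes the inverse covariance matrix directly; names and test values are invented for illustration:

```python
def mahalanobis_sq(g, g_bar, S_inv):
    """Squared Mahalanobis distance (g - g_bar)^T S^{-1} (g - g_bar)
    between a sampled profile g and the mean profile g_bar, given the
    inverse covariance matrix S_inv."""
    d = [gi - mi for gi, mi in zip(g, g_bar)]
    return sum(d[i] * S_inv[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))

# With the identity covariance this reduces to squared Euclidean distance.
I2 = [[1.0, 0.0], [0.0, 1.0]]
# mahalanobis_sq([3.0, 4.0], [0.0, 0.0], I2) is 25.0
```

Weighting by the inverse covariance means profile directions that vary a lot in training count for less, which is why this distance is preferred over the plain Euclidean one here.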
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, rather than a bone X-ray, is used for illustration.
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters on which it depends. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases the mean profile improves and the model performs better, so the error is expected to decrease. The data set in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is therefore unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
Edge Enhancement in the Spatial Domain
For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or nonlinear edge enhancement techniques.
Linear Edge Enhancement
A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the input image in the horizontal, vertical, and diagonal directions.
The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Research on the physiology of human vision suggests that we see objects in much the same way, so images produced with this operation have a more natural look than many other edge-enhanced images.
Band Ratioing
Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information not available in any single band that is useful for discriminating between soils and vegetation.
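The ratio transformation can be sketched as below. This is an illustrative Python pixel-wise band ratio with invented sample values; the small epsilon guard is a common practical safeguard, not part of the text's definition:

```python
def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two co-registered bands; eps avoids division
    by zero in dark pixels (an assumption added for robustness)."""
    return [[a / (b + eps) for a, b in zip(ra, rb)]
            for ra, rb in zip(band_a, band_b)]

# A sunlit and a shadowed pixel of the same material: the raw values
# differ by the illumination factor, but the band ratio is the same.
nir = [[0.8, 0.4]]   # second pixel is the same surface at half the light
red = [[0.2, 0.1]]
r = band_ratio(nir, red)
# r[0][0] and r[0][1] are both approximately 4.0
```

Because multiplicative illumination effects cancel in the ratio, the transformed image depends mainly on the material rather than on slope, aspect, or shadow.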
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems; this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs to separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding) based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. They include methods like edge detection, region extraction, and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after performing edge detection on the image with a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application. [18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.
The working mechanisms of the methods discussed above are explained in detail in the following sections.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays; if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The trade-off between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user input. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time, and error are studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form an edge are generally of the same or nearly the same intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2-D image is a 2-D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representations of the gradient vector ∇D. The gradient is a measure of the rate of change in an image from light to dark pixels (in the case of grayscale images) at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels (one for the horizontal direction and one for the vertical direction in an image), that approximate the derivatives in the horizontal and vertical directions. The derivatives in the x and y directions are calculated by 2-D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
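The Sobel computation can be sketched as below. This is an illustrative Python version on a tiny invented image, applying the masks as a correlation (the usual convention for these kernels; a true convolution flips the mask, which for the antisymmetric Sobel masks only negates the sign):

```python
def convolve3x3(img, k):
    """Apply a 3x3 mask at every interior pixel (valid mode, no padding).
    Applied as a correlation, the common convention for these masks."""
    h, w = len(img), len(img[0])
    return [[sum(k[j][i] * img[y - 1 + j][x - 1 + i]
                 for j in range(3) for i in range(3))
             for x in range(1, w - 1)]
            for y in range(1, h - 1)]

# Sobel kernels: note the weight of 2 in the centre, unlike Prewitt's 1.
Kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
Ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

# A vertical step edge from dark (0) to bright (1).
img = [[0, 0, 0, 1, 1],
       [0, 0, 0, 1, 1],
       [0, 0, 0, 1, 1]]
Dx = convolve3x3(img, Kx)
Dy = convolve3x3(img, Ky)
# Dx is [[0, 4, 4]]: zero in the flat region and positive across the edge
# (pointing from darker to brighter values); Dy is [[0, 0, 0]].
magnitude = [[(dx * dx + dy * dy) ** 0.5 for dx, dy in zip(rx, ry)]
             for rx, ry in zip(Dx, Dy)]
```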
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation at each pixel of an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel while calculating the directional derivative at that point [15][26]; this is the reason why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude of the difference between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel (also known as an outlier). Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then applied to the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-valued pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
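The double-threshold rule described above can be sketched as follows. This is an illustrative single-pass Python sketch with invented values; a full Canny implementation also iterates so that weak edges can chain through other weak edges:

```python
def hysteresis(img, low, high):
    """Sketch of Canny's double thresholding: pixels above 'high' are
    kept, pixels between 'low' and 'high' survive only if adjacent or
    diagonally adjacent to a strong pixel, and the rest are suppressed.
    Single pass only: chains of weak pixels are not propagated."""
    h, w = len(img), len(img[0])
    strong = [[1 if img[y][x] > high else 0 for x in range(w)]
              for y in range(h)]
    out = [row[:] for row in strong]
    for y in range(h):
        for x in range(w):
            if low < img[y][x] <= high:
                # keep a weak pixel that touches any strong neighbour
                if any(strong[y + dy][x + dx]
                       for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                       if (dy or dx) and 0 <= y + dy < h and 0 <= x + dx < w):
                    out[y][x] = 1
    return out

edges = hysteresis([[0.9, 0.5, 0.1],
                    [0.2, 0.5, 0.9]], low=0.3, high=0.8)
# edges is [[1, 1, 0], [0, 1, 1]]: the 0.5 pixels survive because they
# touch strong (0.9) pixels, while 0.1 and 0.2 are suppressed
```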
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of an image to analyze it. It attempts to quantify the visual or other simple characteristics of the image so that the image can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e. straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σD) = | Ixx(x, σD)  Ixy(x, σD) |
           | Ixy(x, σD)  Iyy(x, σD) |   (1)
where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g. the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them.
Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)
or
P(x) = min(λ2(x), 0)   (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues respectively of H at x
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:
MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5 (4)
where MPij = max(Pi,j−1, Pij, Pi,j+1).
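The scale-space construction above can be sketched in a few lines of numpy/scipy. This is a minimal illustration of one octave, not the authors' implementation: image doubling, the octave loop, and the incremental smoothing optimization are omitted, and only Eq. 2 (dark lines on light background) is shown.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvature(img, sigma):
    """P(x) = max(lambda1(x), 0): the max Hessian eigenvalue, clamped at zero (Eq. 2)."""
    # Second derivatives of the Gaussian-smoothed image (Hessian components).
    Ixx = gaussian_filter(img, sigma, order=(0, 2))
    Iyy = gaussian_filter(img, sigma, order=(2, 0))
    Ixy = gaussian_filter(img, sigma, order=(1, 1))
    # Closed-form max eigenvalue of the 2x2 symmetric Hessian at every pixel.
    half_trace = 0.5 * (Ixx + Iyy)
    disc = np.sqrt((0.5 * (Ixx - Iyy)) ** 2 + Ixy ** 2)
    return np.maximum(half_trace + disc, 0.0)

def max_curvature_images(img, k=2 ** (1.0 / 3.0)):
    """One octave: P_j at sigma = k^(j-1) for the six images, then the four
    MP_j = max(P_{j-1}, P_j, P_{j+1}) images of Eq. 4."""
    P = [principal_curvature(img, k ** (j - 1)) for j in range(1, 7)]
    return [np.maximum.reduce(P[i:i + 3]) for i in range(len(P) - 2)]
```

A dark vertical line on a light background produces a strong positive response along the line in every MP image, which is what the watershed stage consumes next.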
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; a consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ denote grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the direction of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise, the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors, and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
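The eigenvector-flow hysteresis step above can be sketched as follows. This is a hedged illustration, not the authors' code: the neighbor-agreement cutoff `agree` is an assumption (the text does not state its value), the eigenvector direction is computed in closed form from the Hessian components, and borders are treated toroidally for brevity.

```python
import numpy as np
from scipy.ndimage import label

def eigvec_flow_threshold(mp, ixx, iyy, ixy, high=0.04, agree=0.9):
    """Eigenvector-flow hysteresis thresholding (Sec. 3.2), sketched.
    mp            : maximum principal-curvature image (Eq. 4)
    ixx, iyy, ixy : Hessian components at the matching scale
    agree         : assumed cutoff on the mean |dot product| with the 8 neighbors
    """
    # Major-eigenvector direction of the 2x2 symmetric Hessian, as a unit vector.
    theta = 0.5 * np.arctan2(2.0 * ixy, ixx - iyy)
    vx, vy = np.cos(theta), np.sin(theta)
    # Mean |<v_p, v_q>| over the 8 neighbors, via shifted copies (wraps at borders).
    support = np.zeros_like(mp)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            nvx = np.roll(np.roll(vx, dr, axis=0), dc, axis=1)
            nvy = np.roll(np.roll(vy, dr, axis=0), dc, axis=1)
            support += np.abs(vx * nvx + vy * nvy)
    support /= 8.0
    # Per-pixel low threshold: ratio 0.2 where the flow is coherent, else 0.7.
    low = np.where(support > agree, 0.2 * high, 0.7 * high)
    seeds = mp >= high
    candidates = mp >= low
    # Keep only candidate components that contain at least one strong seed.
    labels, _ = label(candidates)
    keep = np.unique(labels[seeds])
    return np.isin(labels, keep) & candidates
```

The per-pixel low threshold is what distinguishes this from plain hysteresis thresholding: a weak pixel survives if its eigenvector direction agrees with its neighbors', even when its curvature magnitude alone would fail a fixed threshold.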
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
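The stability selection above can be sketched with a pixel-mask overlap error. Note the assumptions: the paper computes the ellipse-overlap error of [19], whereas this sketch substitutes a simple mask-based error 1 − |A∩B| / |A∪B|, and the error cutoff `max_err` is a placeholder, not a value from the text.

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """Overlap error between two regions as 1 - |intersection| / |union|
    (a pixel-mask stand-in for the ellipse-overlap measure of [19])."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union if union else 1.0

def stable_across_scales(masks_per_scale, max_err=0.3):
    """Keep a region at scale s only if some region at scales s-1 and s+1
    overlaps it with small error -- i.e., detection in >= 3 consecutive scales."""
    keep = []
    for s in range(1, len(masks_per_scale) - 1):
        for m in masks_per_scale[s]:
            ok_prev = any(overlap_error(m, p) < max_err for p in masks_per_scale[s - 1])
            ok_next = any(overlap_error(m, n) < max_err for n in masks_per_scale[s + 1])
            if ok_prev and ok_next:
                keep.append((s, m))
    return keep
```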
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated whenever more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes in the same layer of the painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.
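The clustering step can be sketched as plain k-means on per-pixel chromatic features. This is a hedged illustration, not the De-pict implementation: the complete-linkage refinement is omitted, and the deterministic evenly spaced initialization is a simplification of ours, not the paper's.

```python
import numpy as np

def kmeans_layers(pixels, k, iters=20):
    """k-means on (N, 3) chromatic feature vectors, so each cluster's mean
    chromatic vector describes one candidate layer of brush strokes."""
    # Deterministic initialization: k evenly spaced pixels as starting centers.
    centers = pixels[np.linspace(0, len(pixels) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        # Assign each pixel to the nearest mean chromatic vector.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Re-estimate each cluster center as the mean of its pixels.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean(axis=0)
    return labels, centers
```

After this step every pixel carries a cluster label, exactly the state the text describes before top-layer identification.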
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L Σ_p ‖f_p − c_{L_p}‖₂² + λ Σ_{{p,q}∈N} |e_pq| · δ[L_p ≠ L_q] (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). Then we iterate the E and M steps until convergence or until a predefined number of iterations is reached.
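The E-M loop above can be sketched as follows. Be aware of the substitution: the paper's E-step uses the graph-cut solver [12], whereas this sketch uses a few sweeps of ICM (iterated conditional modes) as a simple stand-in, with unit edge lengths; `lam` plays the role of the smoothness weight λ in Eq. (1).

```python
import numpy as np

def em_spatial_segment(img, k, lam=1.0, iters=5):
    """Hedged sketch of the E-M refinement of Eq. (1).
    img: (H, W, C) float array of chromatic features."""
    h, w, c = img.shape
    feats = img.reshape(-1, c)
    # Initial models: k mean chromatic vectors from evenly spaced seed pixels.
    centers = feats[np.linspace(0, len(feats) - 1, k).astype(int)].astype(float)
    labels = np.linalg.norm(feats[:, None] - centers[None], axis=2).argmin(1).reshape(h, w)
    for _ in range(iters):
        # E-step (approximate): per-pixel label update against data + smoothness cost.
        for _ in range(3):
            for y in range(h):
                for x in range(w):
                    data = np.linalg.norm(img[y, x] - centers, axis=1) ** 2
                    nbrs = [labels[yy, xx] for yy, xx in
                            ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                            if 0 <= yy < h and 0 <= xx < w]
                    smooth = lam * np.array([sum(l != ci for l in nbrs) for ci in range(k)])
                    labels[y, x] = np.argmin(data + smooth)
        # M-step: re-estimate each mean chromatic vector.
        for ci in range(k):
            if np.any(labels == ci):
                centers[ci] = feats[(labels == ci).ravel()].mean(axis=0)
    return labels, centers
```

The smoothness term is what distinguishes this from plain k-means: an isolated off-color pixel is absorbed into its neighborhood's layer instead of forming a speck.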
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well-suited for inpainting images with no or very few textures, due to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following, we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16; Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice was taken at the level of the pulmonary truncus. A bolus-tracking region was placed at the pulmonary truncus, and the trigger was adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec, delivered with an automated syringe (Optistat Contrast Delivery System; Liebel-Flarsheim, USA), was used. When opacification reached the pre-adjusted level, the exam was performed from the supraclavicular region to the diaphragms. Contrast injection was performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed at 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard; Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.
B. Method
The stages followed for lung segmentation from CTA images in this work are shown in Figure 1.
The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to keep the parts brighter than 700 HU. At the end of thresholding, the new images are of logical (binary) type:
Thresh = image > 700
In each of these new images, subsegmental vessels remain in the lung region. At the second step, the following method has been used to get rid of these vessels: first, each 2D image has been considered one by one, and each component in the image has been labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 has been labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component has been kept and the other parts have been removed from the image. Then the complement has been taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 (those reaching the 1st or 512th pixel of the image) are logical 1, the parts that satisfy this condition have been removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Due to the fact that the airway in Figure 5 is very small compared to the lung, each image has been labeled with the connected component labeling algorithm, and components whose number of pixels is below 1000 have been identified as airways and then removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).
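The steps above can be sketched for one 2D slice as follows. This is a hedged Python rendering (the original work uses MATLAB): the threshold value 700 is taken from the text at face value, and the border test generalizes "the 1st or 512th pixel" to any component touching the image border.

```python
import numpy as np
from scipy.ndimage import label

def remove_small(mask, min_px=1000):
    """Drop connected components smaller than min_px pixels (the repeated
    'connected component labeling' clean-up step in the text)."""
    lab, _ = label(mask)
    sizes = np.bincount(lab.ravel())
    keep = sizes >= min_px
    keep[0] = False                      # never keep the background
    return keep[lab]

def segment_lungs(slice_img, min_px=1000):
    """Sketch of the described 2D lung-segmentation pipeline."""
    thresh = slice_img > 700             # Thresh = image > 700
    body = remove_small(thresh, min_px)  # drop sub-min_px vessel specks
    lab, n = label(body)
    if n:                                # keep the largest component: the body
        sizes = np.bincount(lab.ravel())
        sizes[0] = 0
        body = lab == sizes.argmax()
    inv = ~body                          # complement: all 0s become 1s
    lab, _ = label(inv)                  # remove components touching the border
    border = np.unique(np.concatenate([lab[0], lab[-1], lab[:, 0], lab[:, -1]]))
    lungs = inv & ~np.isin(lab, border)
    return remove_small(lungs, min_px)   # small leftover components = airways
```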
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3. PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data plot. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
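The text describes MATLAB's meshgrid; numpy's function of the same name behaves identically under its default indexing, which may make the row/column convention easier to verify:

```python
import numpy as np

# meshgrid expands vectors x and y into coordinate matrices X and Y
# for evaluating a function of two variables over the whole grid.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # X: rows are copies of x; Y: columns are copies of y
Z = X + Y                  # e.g. evaluate f(x, y) = x + y over the grid
```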
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). "Image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through the colors cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4. FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool, we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 - (a) Impulse Response, (b) Pole/Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
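What FVTool computes for a filter with numerator b and denominator a is the complex frequency response H(e^jw) = B(e^jw)/A(e^jw), from which the magnitude and phase plots are read off. A hedged scipy sketch (the 2-tap averager is an assumed example filter, not one from the text):

```python
import numpy as np
from scipy.signal import freqz

b = [0.5, 0.5]          # simple 2-tap moving-average filter (assumed example)
a = [1.0]
# 512 frequency points from 0 up to (but excluding) the Nyquist frequency pi.
w, h = freqz(b, a, worN=512)
magnitude_db = 20 * np.log10(np.abs(h))   # magnitude response in dB
phase = np.unwrap(np.angle(h))            # phase response in radians
```

For this low-pass averager the magnitude is 1 at DC and falls toward zero near Nyquist, which is exactly the shape the magnitude-response figures display.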
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments that are performed in this thesis to improve the performance of the model are also described in that section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((y2 − y1)² + (x2 − x1)²) (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in 4.4).
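The centroid and size definitions above can be written directly from the thesis's 2n × 1 vector layout (a small sketch, not the thesis's code):

```python
import numpy as np

def centroid_and_size(shape_vec):
    """Centroid = mean of the point positions; size = root-mean-square distance
    from the points to the centroid. shape_vec uses the thesis's 2n x 1 layout:
    all x coordinates first, then all y coordinates."""
    n = len(shape_vec) // 2
    pts = np.column_stack([shape_vec[:n], shape_vec[n:]])  # back to n x 2
    c = pts.mean(axis=0)
    size = np.sqrt(((pts - c) ** 2).sum(axis=1).mean())
    return c, size
```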
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the 1st shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and mean shape
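Algorithm 1 can be sketched as below. Note the simplification: full shape alignment is a similarity transform (translation, scale, and rotation); this sketch handles translation and scale only, which already illustrates the iterate-and-renormalize structure.

```python
import numpy as np

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1 with alignment reduced to translation + scale
    (rotation omitted for brevity). shapes: list of (n, 2) point arrays."""
    # Steps 1-3: center every shape on the origin; unit-size normalization.
    centered = [s - s.mean(axis=0) for s in shapes]
    unit = lambda s: s / np.linalg.norm(s)
    mean = unit(centered[0])             # x0, the initial mean shape
    aligned = centered
    for _ in range(iters):               # step 4
        # (a) align all shapes to the mean (here: normalize scale)
        aligned = [unit(s) for s in centered]
        # (b) recalculate the mean shape, (c) constrain it to unit size
        new_mean = unit(np.mean(aligned, axis=0))
        if np.allclose(new_mean, mean):  # step 5: convergence
            break
        mean = new_mean
    return aligned, mean
```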
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in 1.2), and then those images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image for training. While performing tests using different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits closely to the profile model. The tentative location of the landmarks is obtained from the suggested shape.
2. The shape model defines the permissible relative positions of landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models try to correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance of the training shapes, and
b is a vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The points that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and the covariance matrix Sg.
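Building the shape model and generating shapes from it can be sketched as classical PCA on the aligned training shapes. This is a hedged sketch of the standard Cootes-Taylor construction, not the thesis's code; keeping `t` modes and the choice `t=2` are illustrative assumptions.

```python
import numpy as np

def build_shape_model(shapes, t=2):
    """PCA shape model from aligned training shapes (each row a 2n-vector).
    Returns the mean shape x_bar and the first t eigenvectors Phi, so new
    shapes can be generated as x_hat = x_bar + Phi @ b (Eq. 4.3)."""
    X = np.asarray(shapes, float)
    x_bar = X.mean(axis=0)
    cov = np.cov(X - x_bar, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]       # largest-variance modes first
    return x_bar, vecs[:, order[:t]]

def generate_shape(x_bar, phi, b):
    """Vary b to generate allowable shapes (Sec. 4.2.2)."""
    return x_bar + phi @ b
```

With b = 0 the model reproduces the mean shape; moving b along each column of Φ sweeps out the permissible variations, which is how the model is "varied in height and width".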
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to get the accurate area that closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

(g − ḡ)ᵀ Sg⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model assures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture and not a bone
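The per-landmark profile search can be sketched as follows: among the candidate profiles sampled at offsets along the whisker, pick the one with the smallest Mahalanobis distance to the mean profile. A minimal sketch, assuming the inverse covariance Sg⁻¹ has been precomputed during training.

```python
import numpy as np

def best_offset(profiles, g_bar, S_inv):
    """profiles: list of candidate profile vectors g sampled along the whisker;
    g_bar: mean profile; S_inv: inverse of the profile covariance Sg.
    Returns the index of the profile minimizing (g - g_bar)^T Sg^-1 (g - g_bar)."""
    def mahal(g):
        d = g - g_bar
        return d @ S_inv @ d
    dists = [mahal(g) for g in profiles]
    return int(np.argmin(dists)), dists
```

The winning offset is where the landmark is moved; the shape model then pulls the resulting set of landmarks back toward an allowable shape, as described above.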
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it will take time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are displayed as well.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is therefore unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
Chapter 3
Literature Review and History
The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods, discussed in this thesis, that researchers have used to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel, or roughly parallel, lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31], after performing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour, with a small restriction on shape, size, and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained by the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in the following sections.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time, and error are studied.
3.2 Edge Detection
Edge detection falls under the category of image feature detection, which also includes methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form an edge are generally of the same or nearly the same intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries, or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks, or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
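The Sobel step described above can be sketched as follows; the kernel values are the standard Sobel masks, and `scipy.ndimage.convolve` performs the 2D convolution.

```python
import numpy as np
from scipy.ndimage import convolve

# Standard 3x3 Sobel kernels: horizontal derivative and its transpose.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], float)
SOBEL_Y = SOBEL_X.T

def sobel(image):
    """Return (magnitude, angle) of the Sobel gradient at each pixel."""
    dx = convolve(image, SOBEL_X)
    dy = convolve(image, SOBEL_Y)
    magnitude = np.hypot(dx, dy)   # sqrt(dx^2 + dy^2), eq. 3.1
    angle = np.arctan2(dy, dx)     # gradient direction, eq. 3.2
    return magnitude, angle
```

In a region of constant intensity both derivatives are zero, so the magnitude is zero, as the text states.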
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector: it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel while calculating the directional derivative at that point [15][26]. This is why Sobel has a weight of 2 in the middle of the kernel and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the gradient magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. But this method is still used, as it is simple and easy to implement, and it is faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
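The Prewitt and Roberts kernels discussed in the two sections above can be sketched together; note how Prewitt drops Sobel's center weight of 2 and Roberts uses 2x2 diagonal differences.

```python
import numpy as np
from scipy.ndimage import convolve

# Prewitt: like Sobel but with uniform weights (no 2 in the middle).
PREWITT_X = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]], float)
PREWITT_Y = PREWITT_X.T

# Roberts Cross: 2x2 kernels differencing diagonally adjacent pixels.
ROBERTS_A = np.array([[1, 0],
                      [0, -1]], float)
ROBERTS_B = np.array([[0, 1],
                      [-1, 0]], float)

def gradient_magnitude(image, kx, ky):
    """Gradient magnitude from a pair of directional kernels."""
    return np.hypot(convolve(image, kx), convolve(image, ky))
```

Both operators respond only near intensity changes and return zero in constant regions.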
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient step, so that pixels below a certain threshold are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
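The double-threshold (hysteresis) step described above can be sketched with connected-component labeling: weak pixels survive only if their 8-connected component contains at least one strong pixel. This is an illustrative sketch, not the full Canny pipeline.

```python
import numpy as np
from scipy.ndimage import label

def hysteresis(gradient, low, high):
    """Keep pixels >= high, plus pixels >= low that are 8-connected
    (directly or transitively) to an above-high pixel."""
    strong = gradient >= high
    candidates = gradient >= low
    # label 8-connected components of all candidate pixels
    structure = np.ones((3, 3), int)
    labels, n = label(candidates, structure=structure)
    keep = np.zeros(n + 1, bool)
    keep[np.unique(labels[strong])] = True  # components containing a seed
    keep[0] = False                         # background label
    return keep[labels]
```

A weak pixel chain attached to a strong edge is kept whole, while an isolated weak response is discarded.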
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of the image to analyze it. It attempts to quantify the visual or other simple characteristics of an image so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
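Range filtering, as used above, can be sketched as the difference between the local maximum and minimum in a sliding window; a 3x3 window is assumed here for illustration.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def range_filter(image, size=3):
    """Local range texture measure: max minus min in a size x size
    neighbourhood. Textured regions give larger values than flat ones."""
    return maximum_filter(image, size) - minimum_filter(image, size)
```

Flat regions produce zero range, while any local intensity variation (texture) produces a positive response.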
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = | Ixx(x, σD)  Ixy(x, σD) |
           | Ixy(x, σD)  Iyy(x, σD) |   (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) - k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues of a 2x2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)

or

P(x) = min(λ2(x), 0)   (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j-1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) - 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5   (4)

where MPij = max(Pij-1, Pij, Pij+1). Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; a consequence is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing, followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5x5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins.

Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.

Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
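A minimal sketch of the cleaning-plus-region step above, with stated simplifications: a square 5x5 structuring element stands in for the disk, a single global threshold replaces the eigenvector-guided hysteresis, and, because the thresholded image is binary, connected-component labeling of the below-threshold pixels recovers the catchment basins ("all black pixels become catchment basins").

```python
import numpy as np
from scipy.ndimage import grey_closing, label

def pcbr_regions(mp, high=0.04):
    """Cleaned region extraction from a maximum principal curvature
    image mp: grayscale closing, threshold, then label each connected
    component of dark (below-threshold) pixels as one basin."""
    cleaned = grey_closing(mp, size=(5, 5))  # remove small "potholes"
    binary = cleaned >= high                 # white ridge pixels
    basins, n = label(~binary)               # 0-valued components -> basins
    return basins, n
```

A single ridge line crossing the image splits it into two basins, each a candidate interest region to be fit with an ellipse.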
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
3.1 Spatially Coherent Segmentation
We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p - c_{L_p}||²₂  +  λ Σ_{(p,q)∈N} |e_pq| · δ[L_p ≠ L_q]   (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ[·] is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels into different clusters. After the spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
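The E-M loop above can be sketched as follows, with loudly stated simplifications: the graph-cut E-step is replaced here by iterated conditional modes (ICM), the edge length |e_pq| is taken as 1 over a 4-neighbourhood, and the deterministic seeding is only for illustration. The function name `em_segment` is hypothetical.

```python
import numpy as np

def em_segment(features, k, lam=1.0, iters=5):
    """features: (h, w, c) array of per-pixel color features.
    Returns (labels, centers) after iterating a simplified E/M loop."""
    h, w, c = features.shape
    flat = features.reshape(-1, c)
    # init: pick k evenly spaced pixels as seed centers (illustrative)
    centers = flat[np.linspace(0, len(flat) - 1, k).astype(int)].copy()
    labels = np.argmin(((flat[:, None] - centers[None]) ** 2).sum(2), 1).reshape(h, w)
    for _ in range(iters):
        # E-step (ICM stand-in for graph cut): data term + Potts penalty
        for y in range(h):
            for x in range(w):
                cost = ((features[y, x] - centers) ** 2).sum(1)
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # penalize disagreeing with a neighbour's label
                        cost += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = int(np.argmin(cost))
        # M-step: re-estimate each cluster's mean chromatic vector
        for i in range(k):
            if (labels == i).any():
                centers[i] = features[labels == i].mean(axis=0)
    return labels, centers
```

ICM only finds a local optimum of Eq. (1), whereas the graph-cut solver used in the paper gives stronger guarantees; the sketch is meant to show the interplay of the data term and the smoothness term.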
32 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on
reconstructing the geometric structure of (chromatic) intensities, which is usually
represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the
same gray/chromatic intensity in an image. Therefore, such methods are well suited for
inpainting images with no or very few textures, because level lines concisely capture
the structure and information of the texture-less regions. For van Gogh's paintings,
the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting
can be superior to exemplar-based methods (for instance, De-pict and Criminisi et al. [3]) for
recovering the structures of the underlying brush strokes. In this paper we evaluate the recent
method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a
linear program. Unlike other methods, this method is independent of initialization and can
handle general inpainting regions, e.g., regions with holes. In the following we briefly review
Schoenemann et al.'s method in detail. To formulate the problem as a linear program, in this
approach curvature is modeled in a discrete sense (where a possible reconstruction of the
level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain
connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and
line segment pairs that are used to represent level lines, and the basic regions represent the
pixels. Then, for each potential discrete level line, the curvature is approximated by the sum
of angle changes at all vertices along the level line, with proper weighting by the edge length.
To ensure that the regions and the level lines are consistent (for instance, level lines should be
continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary
continuation constraints, are imposed on the variables. Finally, the boundary condition (the
intensities of the boundary pixels) of the damaged region can also be easily formulated as
linear constraints. With proper handling of all these constraints, the inpainting problem can be
solved as a linear program. To handle color images, we simply formulate and solve a linear
program for each chromatic channel independently.
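The discrete curvature measure described above, the sum of angle changes at the vertices of a level line, can be sketched as follows. This is an illustrative discretization, not Schoenemann et al.'s exact weighting: the edge-length weighting used in the paper is omitted here for clarity.

```python
import math

def polyline_curvature(points):
    """Approximate the total curvature of a discrete level line.

    points: list of (x, y) vertices along the level line.
    Curvature is approximated as the sum of absolute direction
    changes at interior vertices. The paper additionally weights
    each turn by edge length; that weighting is omitted here.
    """
    total = 0.0
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        a1 = math.atan2(y1 - y0, x1 - x0)   # direction of incoming edge
        a2 = math.atan2(y2 - y1, x2 - x1)   # direction of outgoing edge
        # wrap the turn angle into [-pi, pi] before taking its magnitude
        turn = abs((a2 - a1 + math.pi) % (2 * math.pi) - math.pi)
        total += turn
    return total
```

A straight level line has zero curvature under this measure, while a right-angle turn contributes π/2.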
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery
Training and Research Hospital. All pulmonary computed tomographic angiography exams
were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen,
Germany). Patients were informed about the examination and instructed in breath holding.
Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the
level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus, and the trigger is
adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent is injected at a rate of 4 mL/s
with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA).
When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the
diaphragms. Contrast injection is performed via an 18-20 G intravenous cannula placed in the
antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch
1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal
window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen,
Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam
consists of 400-500 images at 512×512 resolution.
B Method
The stages followed for lung segmentation from CTA images in this work are shown in
Figure 1.
The data at hand consists of 250 2D CTA images. The first step is thresholding the
image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in
the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding
air (non-body pixels). Due to the large difference in intensity between these two groups,
thresholding leads to a good separation. In this study, thresholding is first applied so as to keep the
parts brighter than 700 HU. At the end of thresholding, the new images are logical (binary) images:
Thresh = image > 700
In each of these new images, subsegmental vessels remain in the lung region. At the second step, the
following method has been used to get rid of these vessels: first, each 2D image is
considered one by one, and each component in the image is labeled with a connected
component labeling algorithm. Then, looking at the size of each labeled piece, items whose
pixel counts are under 1000 are removed from the image (Figure 3).
Next, the image in Figure 3 is labeled again with the connected component labeling
algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept and the other parts
are removed from the image. The image is then inverted, so all "0"s
turn into "1"s and all "1"s turn into "0"s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach row or column 1 or 512
(the image border), the parts that meet this condition are removed, and the lung and
airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because
the airway in Figure 5 is very small compared to the lung, each image is
labeled with the connected component labeling algorithm, and the components whose
pixel counts are below 1000 are identified as airways and then removed from
the image. The image now at hand is the segmented form of the target lung. Before the airways were
removed, the edges of the image were found with the Sobel algorithm and added to the original
image, so the edges of the lung and airway region are shown on the original image (Figure
6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original
segmented lung image is obtained (Figure 6(c)).
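The per-slice steps above can be sketched as follows. This is a hypothetical reimplementation, not the authors' code: `scipy.ndimage` stands in for the connected component labeling algorithm, and the threshold and size values mirror the text but are data-dependent in practice.

```python
import numpy as np
from scipy import ndimage

def remove_small(mask, min_size):
    """Drop connected components smaller than min_size pixels."""
    labels, n = ndimage.label(mask)
    keep = [i for i in range(1, n + 1) if (labels == i).sum() >= min_size]
    return np.isin(labels, keep)

def segment_lung(slice_2d, thresh=700, min_size=1000):
    """Sketch of the slice-wise segmentation steps described above.

    1. Threshold (body vs. non-body pixels) -> logical image.
    2. Remove small components (subsegmental vessels).
    3. Keep the largest component (the body), then invert it.
    4. Drop components touching the image border (outside air).
    5. Remove small remaining components (airways).
    """
    body = slice_2d > thresh                       # step 1
    body = remove_small(body, min_size)            # step 2
    labels, n = ndimage.label(body)
    if n:                                          # step 3: largest = body
        sizes = ndimage.sum(body, labels, range(1, n + 1))
        body = labels == (1 + int(np.argmax(sizes)))
    inside = ~body                                 # invert: 0 <-> 1
    labels, n = ndimage.label(inside)              # step 4: border components
    border = np.unique(np.concatenate([labels[0], labels[-1],
                                       labels[:, 0], labels[:, -1]]))
    inside &= ~np.isin(labels, border)
    return remove_small(inside, min_size)          # step 5: drop airways
```

Applied to each of the 250 slices, this yields a binary lung mask that can be multiplied with the original CTA image as in Figure 6(c).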
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image
processing functions and interactive tools for enhancing and analyzing digital images. The
interactive tools allowed us to perform spatial image transformations; morphological
operations such as edge detection and noise removal; region-of-interest processing; filtering;
basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects
semitransparent is a useful technique in 3-D visualization, which furnishes more information
about the spatial relationships of different structures. The toolbox functions, implemented in the
open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis
and mathematical computing functions such as signal processing, optimization, partial
differential equation solving, etc. It provides interactive tools including thresholding,
correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and
3D plotting functions. The operations for image processing allowed us to perform noise
reduction and image enhancement, image transforms, colormap manipulation, colorspace
conversions, region-of-interest processing, and geometric operations [4]. The toolbox
functions implemented in the open MATLAB language can be used to develop
customized algorithms.
An X-ray computed tomography (CT) image is composed of pixels whose brightness
corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which
is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes
the pixel region rectangle over the image displayed in the Image Tool, defining the group of
pixels that are displayed in extreme close-up view in the Pixel Region tool window. The
Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its
numeric value [25]. For RGB images there are three numeric values, one for each band of the
image. We can also determine the current position of the pixel region in the target image by
using the pixel information given at the bottom of the tool. In this way we found the x- and y-
coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays
a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figure 3a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve is plotted as a magenta line through the data plot. An Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b - Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
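The meshgrid behaviour described above can be reproduced outside MATLAB; NumPy's `meshgrid` follows the same convention, as this small sketch shows.

```python
import numpy as np

# As described above: rows of X are copies of x, and columns of Y are
# copies of y, ready for evaluating a function of two variables.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # X and Y are both 2-by-3 here
Z = X + Y                  # evaluate f(x, y) = x + y over the grid
```

Surface-plotting functions then take the pair (X, Y) together with Z, exactly as in the MATLAB examples.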
The 3-D Surface Plot with Contour ("surfc") displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The "image" function with colormap scaling (the "imagesc" function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through the colors cyan, yellow, and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
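The indexed-image mechanism described above (each pixel value is a row index into an m-by-3 colormap) can be sketched in a few lines. The 4-entry map below is a crude illustrative stand-in for "jet", not MATLAB's actual table.

```python
import numpy as np

# A colormap is an m-by-3 matrix; each row is one RGB color in [0, 1].
# This tiny map runs blue -> cyan -> yellow -> red, loosely jet-like.
cmap = np.array([[0.0, 0.0, 1.0],   # blue
                 [0.0, 1.0, 1.0],   # cyan
                 [1.0, 1.0, 0.0],   # yellow
                 [1.0, 0.0, 0.0]])  # red
indexed = np.array([[0, 1], [2, 3]])  # a 2x2 indexed "image"
rgb = cmap[indexed]                   # (2, 2, 3) true-color image
```

Colormap scaling, as "imagesc" performs, amounts to linearly mapping the data range onto the row indices 0..m-1 before this lookup.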
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined with numerator b and denominator a. By using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
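The magnitude response FVTool displays is |H(e^jω)| = |B(e^jω)/A(e^jω)| evaluated around the unit circle. A minimal NumPy sketch of that computation (not FVTool itself, whose internals are not shown here):

```python
import numpy as np

def magnitude_response(b, a, n=512):
    """Evaluate |H(e^jw)| for a digital filter with numerator b and
    denominator a, i.e. H(z) = (b0 + b1 z^-1 + ...)/(a0 + a1 z^-1 + ...),
    on n points of the upper unit circle."""
    w = np.linspace(0, np.pi, n)
    z = np.exp(1j * w)
    # polyval treats b[0] as the highest power, so divide by z^(len-1)
    # to recover the negative-power (z^-1) convention.
    B = np.polyval(b, z) / z ** (len(b) - 1)
    A = np.polyval(a, z) / z ** (len(a) - 1)
    return w, np.abs(B / A)
```

For example, the two-tap moving average b = [0.5, 0.5], a = [1.0] is a lowpass filter: unity gain at ω = 0, zero gain at ω = π.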
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions
and modifications have been made, the basic ASM models work the same way. Cootes and Taylor [9]
give a complete description of the classical ASM. Section 4.1 introduces shapes and shape
models in general. Section 4.2 describes the workings and the components of the ASM. The
parameters and variations that affect the performance of the ASM are explained in Section
4.3. The experiments that are performed in this thesis to improve the performance of the
model are also described in this section. The problem of initialization of the model in a test image is tackled in Section
4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function.
The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a
diagram showing the points, or as an n × 2 array where the n rows represent the number of
points and the two columns represent the x and y coordinates of the points respectively. In
this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y
coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic block of
any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but
they are shown to make the shape and the order of the points more clear [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and
(x2, y2). The distance between two shapes can be defined as the distance between
their corresponding points [24]. There are other ways of defining distances between
two points, like the Procrustes distance, but in this thesis the distance means the
Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)   (4.1)
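Equation 4.1 and the corresponding-point shape distance can be written directly. A small sketch, using a summed point-wise distance as one common convention (the text does not fix the aggregation):

```python
import math

def point_distance(p, q):
    """Euclidean distance of Equation 4.1 between points p and q."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def shape_distance(shape_a, shape_b):
    """Distance between two shapes: summed distance between
    corresponding points."""
    return sum(point_distance(p, q) for p, q in zip(shape_a, shape_b))
```

For example, points (0, 0) and (3, 4) are at distance 5.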
The centroid x̄ of a shape x can be defined as the mean of the point positions
[24]. The centroid can be useful while aligning shapes or finding an automatic
initialization technique (discussed in Section 4.4). The size of the shape is the root mean
square distance between the points and the centroid. This can be used in measuring the
size of the test image, which will help with the automatic initialization (discussed in
Section 4.4).
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and mean shape
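Algorithm 1 can be sketched as follows. This is a simplified illustration: the rotation part of step 4(a) (the Procrustes alignment of each shape to the mean) is omitted, so "aligning" here reduces to centering and unit-size scaling.

```python
import numpy as np

def normalize(shape):
    """Center a shape (n-by-2 array) on the origin and scale it to
    unit size (Frobenius norm). Rotation alignment is omitted in
    this simplified sketch."""
    centered = shape - shape.mean(axis=0)
    return centered / np.linalg.norm(centered)

def align_shapes(shapes, iters=10):
    """Algorithm 1, simplified: normalize every shape, then iterate
    mean-shape re-estimation until the mean stabilizes."""
    aligned = [normalize(s) for s in shapes]
    mean = aligned[0]                      # reference shape
    for _ in range(iters):
        # recalculate the mean shape, then re-constrain it to unit size
        new_mean = normalize(np.mean(aligned, axis=0))
        if np.allclose(new_mean, mean):
            break                          # convergence
        mean = new_mean
    return aligned, mean
```

With the rotation step added back (e.g., via an SVD-based Procrustes fit inside the loop), this becomes the full alignment procedure.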
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone
was separated from a full-body X-ray (as shown in Figure 1.2) and then those images were
re-sized to the same dimensions. This ensured uniformity in the quality of the data
being used. The training on the images was done by manually selecting landmarks.
Landmarks were placed at approximately equal intervals and were distributed uniformly
over the bone boundary. Such images are called hand-annotated or manually
landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image for training.
While performing tests using different numbers of landmark points, a subset of these
landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of
sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the
image around them. During training, the algorithm learns the characteristics of the area
around the landmark points and builds a profile model for each landmark point
accordingly. When searching for the shape in the test image, the area near the
tentative landmarks is examined, and the model moves the shape to an area that fits
closely to the profile model. The tentative location of the landmarks is obtained from
the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks. This
introduces a constraint on the shape. So as the profile model tries to find the
area in the test image that fits the model, the shape model ensures that
the shape stays close to the mean shape. The profile model acts on individual landmarks,
whereas the shape model acts globally on the image. The two models correct
each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual
profiles into an allowable shape. It tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
consistent.
The shape is learnt from manually landmarked training images. These images are
aligned, and a mean shape is formulated with the permissible variations in it [24]:

x̂ = x̄ + Φb   (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
Φ is the matrix of eigenvectors of the covariance of the training shapes, and
b is a vector of shape parameters.
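The linear shape model x̂ = x̄ + Φb can be built with a standard eigen-decomposition. A minimal NumPy sketch (illustrative, not the thesis code; function names are the author's own):

```python
import numpy as np

def build_shape_model(shapes, t=2):
    """Build the linear shape model x_hat = x_bar + Phi @ b.

    shapes: (m, 2n) array of aligned training shape vectors x_i.
    Returns the mean shape x_bar and Phi, the 2n-by-t matrix of the
    eigenvectors with the t largest eigenvalues of the covariance.
    """
    x_bar = shapes.mean(axis=0)
    cov = np.cov(shapes, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    phi = vecs[:, ::-1][:, :t]            # keep the top-t eigenvectors
    return x_bar, phi

def generate_shape(x_bar, phi, b):
    """Generate a new shape by varying the parameter vector b."""
    return x_bar + phi @ np.asarray(b)
```

Varying the entries of b within a few standard deviations of the corresponding eigenvalues sweeps out the permissible shape variations, which is exactly how new shapes are generated in Section 4.2.2.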
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of
b. The model is varied in height and width, finding optimum values for the landmarks.
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image. The lines perpendicular to the model are called whiskers, and they help the
profile model in analyzing the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the
whisker profiles around the landmark points are used for the profile model. A profile
and a covariance matrix are built for each landmark. It is assumed that the profiles
are distributed as a multivariate Gaussian, and so they can be described by their
mean profile ḡ and the covariance matrix S_g.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape
calculated from the training images is imposed on the image, and the profiles around
the landmark points are sampled and examined. The profiles are offset 3 pixels
along the whisker, which is perpendicular to the shape, to find the area
that most closely resembles the mean shape [24]. The distance between a test profile g
and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the
lowest distance. This procedure is done for every landmark point, and then the shape
model confirms that the shape remains close to the mean shape. The shape model
ensures that the profile model has not changed the shape. If the shape model were
not employed, the profile model might give the best profile results, but the resulting
shape could be completely different. So, as mentioned before, the two models restrict
each other. A multi-resolution search is done to make the model more robust. This
enables the model to be more accurate, as it can lock on to the shape from further
away. The model searches over a series of different resolutions of the same image,
called an image pyramid. The resolutions of the images can be set and changed
in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid. The sizes of
the images are given relative to the first image; a general picture, and not a bone
X-ray, is used for illustration.
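The Mahalanobis profile test described above can be sketched as follows; this is an illustrative snippet (the function names are the author's own), not the thesis implementation.

```python
import numpy as np

def mahalanobis(g, g_bar, S_g):
    """Mahalanobis distance between a sampled profile g and the
    model's mean profile g_bar with covariance S_g:
    (g - g_bar)^T S_g^{-1} (g - g_bar)."""
    d = np.asarray(g, dtype=float) - np.asarray(g_bar, dtype=float)
    return float(d @ np.linalg.solve(S_g, d))   # solve avoids explicit inverse

def best_offset(profiles, g_bar, S_g):
    """Pick the whisker offset whose sampled profile is closest
    to the model, i.e. has the lowest Mahalanobis distance."""
    dists = [mahalanobis(g, g_bar, S_g) for g in profiles]
    return int(np.argmin(dists))
```

In the search loop, `profiles` would hold the profiles sampled at each offset along a landmark's whisker, and the landmark is moved to the winning offset before the shape model re-constrains the result.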
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters
that it depends on. The number of landmark points and the number of training images are
investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The
profile model of the ASM works with these landmark points to create profiles, so
the position of the landmark points is as important as their number.
In the training images, landmark points are equally spaced along the boundary of
the bone. Images are landmarked with 60 points, and subsets of these points are
chosen to conduct experiments. The impact of the number of landmark points on computing
time and the mean error (defined in Section 4.5) is tested by running the algorithm with
different numbers of landmarks. As the number of landmark points is increased, it is expected
that the computing time increases and the error decreases. The results are explained in
Chapter 5. A training set of images is used to train the ASM. As the number of training images
increases, the model becomes more robust and intelligent. The computing time is expected to
increase, as it will take time to train and create profile models for each image. However, as the
number of training images increases, the mean profile improves and the model performs better, so the
error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the
ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the
unaligned shapes learnt from the training images, and the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test
image. It creates a mean shape profile from all the training images using landmark points. But
the ASM starts off where the mean shape is located, which may not be near the bone in a test
image. So the model needs to be initialized, or started somewhere close to the bone boundary
in the test image. Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape. It was observed that if the initialization is poor, meaning
that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the
bone. The shape and profile models fail to perform, as the profile model looks for regions
similar to those of the training images in the regions away from the bone. It is unable to
find the bone because it is looking in a different region altogether. The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows
the initialization: the pink contour is the mean shape, and it starts away from the bone, so the
result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
3.1 Previous Research
3.1.1 Summary of Previous Research
According to [14], compared to other areas in medical imaging, bone fracture detection
is not well researched and published. Research has been done by the National
University of Singapore to segment and detect fractures in femurs (the thigh bone).
[27] uses a modified Canny edge detector to detect the edges of femurs to separate them
from the X-ray. The X-rays were also segmented using Snakes, or Active Contour
Models (discussed in Section 3.4), and Gradient Vector Flow. According to the experiments
done by [27], their algorithm achieves a classification accuracy of 94.5%.
Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in
X-rays. [31] proposes two methods to extract femur contours from X-rays. The first
is a semi-automatic method which gives priority to reliability and accuracy. This
method tries to fit a model of the femur contour to a femur in the X-ray. The second
method is automatic and uses active contour models. This method breaks down the
shape of the femur into a couple of parallel or roughly parallel lines and a circle at
the top representing the head of the femur. The method detects the strong edges in
the circle and locates the turning point using the point of inflection in the second
derivative of the image. Finally, it optimizes the femur contour by applying shape constraints
to the model.
Hough and Radon transforms are used by [14] to approximate the edges of long
bones. [14] also uses clustering-based algorithms, also known as bi-level or localized
thresholding methods, and global segmentation algorithms to segment X-rays.
Clustering-based algorithms categorize each pixel of the image as either a part of
the background or a part of the object (hence the name bi-level thresholding),
based on a specified threshold. Global segmentation algorithms take the whole
image into consideration and sometimes work better than the clustering-based
algorithms. Global segmentation algorithms include methods like edge detection, region
extraction, and deformable models (discussed in Section 3.4).
Active Contour Models, initially proposed by [19], fall under the class of deformable
models and are used widely as an image segmentation tool. Active Contour Models
are used to extract femur contours in X-ray images by [31] after doing edge detection
on the image using a modified Canny filter. Gradient Vector Flow is also used by
[31] to extract contours, and the results are compared to those of the Active Contour
Model. [3] uses an Active Contour Model with curvature constraints to detect
femur fractures, as the original Active Contour Model is susceptible to noise and
other undesired edges. This method successfully extracts the femur contour with a
small restriction on the shape, size, and orientation of the image.
Active Shape Models, introduced by Cootes and Taylor [9], are another widely used
statistical model for image segmentation. Cootes and Taylor and their colleagues
[5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the
original ASMs (also called classical ASMs by [24]) by modifying them. These papers
investigated the performance of the model with gray-level variation and different
resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to
detect facial features. Some modifications to the original model were suggested and
experimented with. The relationships between landmark points, computing time,
and the number of images in the training data were observed for different sets of
data. The results in this thesis are compared to the results in [24]. The work done
in this thesis is similar to [24], as the same model is used for a different application.
[18] and [1] analyzed the performance of ASMs using the aspects of the definition
of the shape and the gray-level analysis of grayscale images. The data used was
facial data from a face database, and it was concluded that ASMs are an accurate
way of modeling the shape and gray-level appearance. It was observed that the
model allows for flexibility while being constrained on the shape of the object to
be segmented. This is relevant for the problem of bone segmentation, as X-rays are
grayscale and the structure and shape of bones can differ slightly. The flexibility
of the model will be useful for separating bones from X-rays even though one tibia
bone differs from another tibia bone.
The working mechanisms of the methods discussed above are explained in detail in
the sections that follow.
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both
complicated problems. There are many limitations and problems in the segmentation
methods used. Some methods and models are too limited or constrained to match the bone
accurately. Accuracy of results and computing time are conflicting variables.
It is observed in [14] that there is no automatic method of segmenting bones. [14] also
recognizes the need for good initial conditions for Active Contour Models to produce a good
segmentation of bones from X-rays. If the initial conditions are not good, the final results will
be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of
the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long
shaft bones using Computer Aided Design (CAD) techniques.
The trade-off between automating the algorithm and the accuracy of the results using the
Active Shape and Active Contour Models is examined in [31]. If the model is made fully
automatic by estimating the initial conditions, the accuracy will be lower than when the
initial conditions of the model are defined by user inputs. [31] implements both manual and
automatic approaches and identifies that automatically segmenting bone structures from noisy
X-ray images is a complex problem. This thesis project tackles these limitations. The manual
and automatic approaches are tried using Active Shape Models, and the relationships between
the size of the training set, computation time, and error are studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which includes other
methods like ridge detection, blob detection, interest point detection, and scale space models.
In digital imaging, edges are defined as a set of connected pixels that lie on the boundary
between two regions in an image where the image intensity changes, formally known as
discontinuities [15]. The pixels, or sets of pixels, that form an edge are generally of the
same or close to the same intensities. Edge detection can be used to segment images with
respect to these edges and to display the edges separately [26][15]. Edge detection can be
used for separating tibia bones from X-rays, as bones have strong boundaries or edges.
Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator used to do the edge detection calculates the gradient of the image
intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal
and vertical derivatives as its components. The gradient vector can also be seen as a
magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions
respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of
the gradient vector ∇D. It is a measure of the rate of change in an image, from light to dark
pixels in the case of grayscale images, at every point. At each point in the image, the direction
of the gradient vector shows the direction of the largest increase in the intensity of the image,
while the magnitude of the gradient vector denotes the rate of change in that direction [15]
[26]. This implies that the result of the Sobel operator at an image point in a region of
constant image intensity is a zero vector, and at a point on an edge it is a vector that points
across the edge from darker to brighter values. Mathematically, Sobel edge detection is
implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction
and the other for the vertical direction, that approximate the derivatives in the horizontal
and vertical directions. The derivatives in the x and y directions are calculated by 2D
convolution of the original image with the convolution masks. If A is the original image and
Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4
show how the directional derivatives are calculated [26]. The matrices are a representation of
the convolution kernels that are used.
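To make the computation concrete, here is a minimal Python sketch of the Sobel gradient (an illustration only, not the MATLAB code used in this thesis; the helper names `convolve3x3` and `sobel` are ours, and filtering is written as the cross-correlation commonly used in practice):

```python
import math

# Sobel kernels approximating the horizontal (Dx) and vertical (Dy) derivatives.
KX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
KY = [[-1, -2, -1],
      [ 0,  0,  0],
      [ 1,  2,  1]]

def convolve3x3(image, kernel):
    """Cross-correlate a grayscale image (list of row lists) with a 3x3 kernel.
    Border pixels are left at zero for simplicity."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0.0
            for j in range(3):
                for i in range(3):
                    acc += kernel[j][i] * image[y + j - 1][x + i - 1]
            out[y][x] = acc
    return out

def sobel(image):
    """Return gradient magnitude and direction (radians) at each pixel."""
    dx = convolve3x3(image, KX)
    dy = convolve3x3(image, KY)
    h, w = len(image), len(image[0])
    mag = [[math.hypot(dx[y][x], dy[y][x]) for x in range(w)] for y in range(h)]
    ang = [[math.atan2(dy[y][x], dx[y][x]) for x in range(w)] for y in range(h)]
    return mag, ang
```

On a flat region both derivatives are zero, giving the zero vector; on a vertical dark-to-bright edge the gradient points in the positive x direction, across the edge toward the brighter side, as described above.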
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector because it also approximates the
derivatives using convolution kernels to find the localized orientation of each pixel in an
image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is
more prone to noise than Sobel, as it does not give weighting to the current pixel while
calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a
weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show
the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The
same variables as in the Sobel case are used; only the kernels used to calculate the directional
derivatives are different.
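The kernel difference can be shown directly in a small Python sketch (illustrative only; `response` is our hypothetical helper giving the filter output at the center of one 3×3 patch):

```python
# Prewitt kernels (cf. equations 3.5 and 3.6): like Sobel, but with a center
# weight of 1 instead of 2, so the current row is not emphasised.
PREWITT_X = [[-1, 0, 1],
             [-1, 0, 1],
             [-1, 0, 1]]
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def response(kernel, patch):
    """Cross-correlate one 3x3 patch with a kernel (output at the center pixel)."""
    return sum(kernel[j][i] * patch[j][i] for j in range(3) for i in range(3))

patch = [[0, 0, 10],
         [0, 0, 10],
         [0, 0, 10]]  # vertical dark-to-bright edge
```

On this edge patch the Sobel kernel responds with 40 and the Prewitt kernel with 30: the extra center weight makes Sobel emphasise the pixel's own row, which is why it is somewhat less noise-prone.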
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges
by calculating the sum of the squares of the differences between diagonally adjacent
pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question
and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its
performance decreases if the images are noisy. But this method is still used, as it is simple,
easy to implement, and faster than other methods. The implementation is done by
convolving the input image with 2×2 kernels.
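A minimal Python sketch of the Roberts cross (illustrative, with the 2×2 diagonal differences written out directly rather than as an explicit convolution):

```python
import math

def roberts(image):
    """Gradient magnitude via the Roberts cross operator: the two 2x2 kernels
    take differences between diagonally adjacent pixels. The last row and
    column are omitted since the 2x2 window would fall off the image."""
    h, w = len(image), len(image[0])
    out = [[0.0] * (w - 1) for _ in range(h - 1)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = image[y][x] - image[y + 1][x + 1]      # kernel [[1,0],[0,-1]]
            gy = image[y][x + 1] - image[y + 1][x]      # kernel [[0,1],[-1,0]]
            out[y][x] = math.hypot(gx, gy)
    return out
```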
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it
detects faint edges even when the image is noisy. This is because at the beginning
of the process the data is convolved with a Gaussian filter. The Gaussian filtering results in a
blurred image, so the output of the filter does not depend on a single noisy pixel, also known
as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel
and Prewitt. Non-maximal suppression is applied after the gradient so that the pixels that are
below a certain threshold are suppressed. A multi-level thresholding technique, like the
example in 2.4 involving two levels, is then used on the data. If a pixel value is less than the
lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1.
If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-
value pixel, then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows
the X-ray image and the image after Canny edge detection.
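The double-threshold step described above can be sketched in Python (an illustration of the hysteresis stage only, not a full Canny implementation; `hysteresis` is our hypothetical helper operating on a gradient-magnitude array):

```python
from collections import deque

def hysteresis(grad, low, high):
    """Double-threshold hysteresis as in the final Canny stage: pixels above
    `high` are strong seeds; pixels between `low` and `high` survive only if
    connected (8-neighbourhood) to a strong pixel; the rest become 0."""
    h, w = len(grad), len(grad[0])
    out = [[0] * w for _ in range(h)]
    q = deque()
    for y in range(h):
        for x in range(w):
            if grad[y][x] >= high:
                out[y][x] = 1
                q.append((y, x))
    while q:  # grow strong seeds into adjacent between-threshold pixels
        y, x = q.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and not out[ny][nx] \
                        and grad[ny][nx] >= low:
                    out[ny][nx] = 1
                    q.append((ny, nx))
    return out
```

A weak pixel next to a strong one is kept, while an isolated weak pixel is suppressed, which is exactly what lets Canny retain faint but connected edge segments.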
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of an image to analyze it. It attempts to
quantify the visual or other simple characteristics so that the image can be analyzed
according to them [23]. For example, the visible properties of an image, like roughness or
smoothness, can be converted into numbers that describe the pixel layout or brightness
intensity in the region in question. In the bone segmentation problem, image processing
using texture can be used, as bones are expected to have more texture than the mesh.
Range filtering and standard deviation filtering were the texture analysis techniques used
in this thesis. Range filtering calculates the local range of an image.
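A local range filter over a 3×3 neighbourhood can be sketched in Python as follows (illustrative only; MATLAB's `rangefilt` performs the equivalent operation, and `range_filter` is our hypothetical name):

```python
def range_filter(image):
    """3x3 local range: max minus min over each pixel's neighbourhood
    (clipped at the image border). Textured regions such as bone give high
    values; flat regions give values near zero."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [image[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = max(vals) - min(vals)
    return out
```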
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the
orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges.
Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and
valleys of this surface. The local shape characteristics of the surface at a particular point can
be described by the Hessian matrix
H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD) ;
             Ixy(x, σD)  Iyy(x, σD) ]   (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the
point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian
matrix and the related second moment matrix have been applied in several other interest
operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find
image positions where the local image geometry is changing in more than one direction.
Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components
of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find
points of interest. However, our PCBR detector is quite different from these other methods
and is complementary to them. Rather than finding extremal "points", our detector applies
the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface
to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over
a range of viewpoints, scales, and appearance changes. Many previous interest point detectors
[7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency.
The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant,
tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix.
One advantage of the Harris metric is that it does not require explicit computation of the
eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single
Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris
measure produces low values for "long" structures that have a small first or second derivative
in one particular direction. Our PCBR detector complements previous interest point detectors
in that we abandon the Harris measure and exploit those very long structures as detection
cues. The principal curvature image is given by either
curvature image is given by either
P (x) =max(λ1(x) 0) (2)
or
P (x) =min(λ2(x) 0) (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues respectively of H at x
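For a symmetric 2×2 matrix the eigenvalues have a simple closed form, so the principal curvature can be computed directly. A Python sketch (illustrative only, not the authors' code; the function names are ours):

```python
import math

def principal_curvatures(ixx, ixy, iyy):
    """Closed-form eigenvalues of the symmetric 2x2 Hessian
    [[Ixx, Ixy], [Ixy, Iyy]]; returns (lambda1, lambda2), largest first."""
    mean = 0.5 * (ixx + iyy)                           # half the trace
    d = math.sqrt((0.5 * (ixx - iyy)) ** 2 + ixy ** 2)  # discriminant term
    return mean + d, mean - d

def P_dark(ixx, ixy, iyy):
    """Eq. (2): high response for dark lines on a light background."""
    l1, _ = principal_curvatures(ixx, ixy, iyy)
    return max(l1, 0.0)
```

This is the single-Jacobi-rotation shortcut alluded to above: no general eigensolver is needed for the 2×2 case.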
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side
of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13]
and other detectors, principal curvature images are calculated in scale space. We first double
the size of the original image to produce our initial image, I11, and then produce increasingly
Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6.
This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is
downsampled to half its size to produce image I21, which becomes the first image in the
second octave. We apply the same smoothing process to build the second octave, and
continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width
and height of the doubled image, respectively. Finally, we calculate a principal curvature
image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the
Hessian matrix at each pixel. For computational efficiency, each smoothed image and its
corresponding Hessian image is computed from the previous smoothed image using an
incremental Gaussian scale. Given the principal curvature scale space images, we calculate
the maximum curvature over each set of three consecutive principal curvature images to form
the following set of four images in each of the n octaves:
MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5   (4)
where MPij = max(Pi,j−1, Pij, Pi,j+1).
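The scale-space bookkeeping above can be sketched in Python (an illustration under the stated parameters; `scale_space_schedule` and `mp` are our hypothetical helper names, not from the paper):

```python
import math

def scale_space_schedule(width, height):
    """Layout of the PCBR scale space: the input is doubled to form I11,
    each octave holds six images smoothed at sigma = k**(j-1) with
    k = 2**(1/3) and j = 2..6, and n = log2(min(w, h)) - 3 octaves are
    built, with w, h the dimensions of the doubled image."""
    w, h = 2 * width, 2 * height
    k = 2.0 ** (1.0 / 3.0)
    sigmas = [k ** (j - 1) for j in range(2, 7)]   # scales for I12..I16
    n_octaves = int(math.log2(min(w, h))) - 3
    return sigmas, n_octaves

def mp(P_prev, P_cur, P_next):
    """Eq. (4): MP_ij = max(P_i,j-1, P_ij, P_i,j+1), elementwise."""
    return [[max(a, b, c) for a, b, c in zip(r1, r2, r3)]
            for r1, r2, r3 in zip(P_prev, P_cur, P_next)]
```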
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the
principal curvature at each pixel over three consecutive principal curvature images. From
these maximum principal curvature images we find the stable regions via our watershed
algorithm.
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image
segmentation. It is normally applied either to an intensity image directly or to the gradient
magnitude of an image. We instead apply the watershed transform to the principal curvature
image. However, the watershed transform is sensitive to noise (and other small perturbations)
in the intensity image. A consequence of this is that small image variations form local
minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation
that results when the watershed algorithm is applied directly to the principal curvature image
in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale
morphological closing followed by hysteresis thresholding. The grayscale morphological
closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is
a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion,
respectively. The closing operation removes small "potholes" in the principal curvature
terrain, thus eliminating many local minima that result from noise and would otherwise
produce watershed catchment basins. Beyond the small (in terms of area of influence) local
minima, there are other variations that have larger zones of influence and are not reclaimed
by the morphological closing. To further eliminate spurious or unstable watershed regions,
we threshold the principal curvature image to create a clean, binarized principal curvature
image. However, rather than apply a straight threshold, or even hysteresis thresholding (both
of which can still miss weak image structures), we apply a more robust eigenvector-guided
hysteresis thresholding to help link structural cues and remove perturbations. Since the
eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or
edge contrast), the principal curvature image may at times become weak due to low-contrast
portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in
the thresholded principal curvature image, which in turn cause watershed regions to merge
that should otherwise be separate. However, the directions of the eigenvectors provide a
strong indication of where curvilinear structures appear, and they are more robust to these
intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis
thresholding there are two thresholds (high and low), just as in traditional hysteresis
thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response.
Pixels with a strong response act as seeds that expand to include connected pixels that are
above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a
function of the support that each pixel's major eigenvector receives from neighboring pixels.
Each pixel's low threshold is set by comparing the direction of the major (or minor)
eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can
be done by taking the absolute value of the inner product of a pixel's normalized eigenvector
with that of each neighbor. If the average dot product over all neighbors is high enough, we
set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008);
otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold
values are based on visual inspection of detection results on many images. Figure 4
illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are
the major eigenvectors, and the yellow arrows are the minor eigenvectors. To improve
visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow,
we see that the eigenvalue magnitudes are small and the ridge there is almost invisible.
Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based
active thresholding process yields better performance in building continuous ridges and in
handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to
perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is
binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the
thresholded white ridge pixels become watershed lines if they separate two distinct catchment
basins. To define the interest regions of the PCBR detector in one scale, the resulting
segmented regions are fit with ellipses, via PCA, that have the same second moment as the
watershed regions (Fig. 2(e)).
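The per-pixel low-threshold selection can be sketched in Python (an illustration under the thresholds quoted in the text, 0.04, 0.2, and 0.7; the coherence cutoff of 0.9 on the average dot product is our assumption, since the paper does not state it, and `low_threshold` is our hypothetical name):

```python
def low_threshold(evecs, y, x, high=0.04):
    """Eigenvector-flow hysteresis: compare the pixel's unit major eigenvector
    with its 8 neighbours' via the absolute dot product; a coherent flow gets
    the permissive ratio 0.2, otherwise the conservative ratio 0.7.
    `evecs` is a 2D grid of unit vectors stored as (vy, vx) tuples."""
    vy, vx = evecs[y][x]
    dots, n = 0.0, 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            ny, nx = y + dy, x + dx
            if 0 <= ny < len(evecs) and 0 <= nx < len(evecs[0]):
                wy, wx = evecs[ny][nx]
                dots += abs(vy * wy + vx * wx)
                n += 1
    ratio = 0.2 if n and dots / n > 0.9 else 0.7  # coherence cutoff assumed
    return high * ratio
```

A pixel embedded in a uniform eigenvector field thus gets the low threshold 0.008 and survives easily, while a pixel whose neighbours point in unrelated directions must clear 0.028.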
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve
stable region detections. To further improve robustness, we adopt a key idea from MSER and
keep only those regions that can be detected in at least three consecutive scales. Similar to the
process of selecting stable regions via thresholding in MSER, we select regions that are stable
across local scale changes. To achieve this, we compute the overlap error of the detected
regions across each triplet of consecutive scales in every octave. The overlap error is
calculated in the same way as in [19]. Overlapping regions that are detected at different scales
normally exhibit some variation. This variation is valuable for object recognition because it
provides multiple descriptions of the same pattern. An object category normally exhibits
large within-class variation in the same area. Since detectors have difficulty locating the
interest area accurately, rather than attempt to detect the "correct" region and extract a single
descriptor vector, it is better to extract multiple descriptors for several overlapping regions,
provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is,
where lower-layer strokes are visible. The task of recovering layers of strokes involves
mainly three steps:
1. Partition the image into regions with consistent colors/shapes corresponding to
different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated whenever there are more than two layers remaining.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage
as a clustering step to obtain chromatically consistent regions. Under the assumption that brush
strokes of the same layer of painting are of similar colors, such regions in different clusters
are good representatives of brush strokes at different layers in the painting, as shown in Fig.
3c. Note that after the clustering step, each pixel of the image is assigned a label
corresponding to its assigned cluster, and each label can be described by the mean chromatic
feature vector. Then the top layer is identified by human experts based on visual occlusion
cues, etc. Ideally this step should be fully automatic, but this step is challenging and is not the
focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a
k-nearest-neighbor algorithm.
3.1 Spatially Coherent Segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence
regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of
different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other
words, we assume that each layer is modeled as an independent Gaussian with the same
covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic
vectors, we can refine the segmentation with spatially coherent priors by minimizing the
following energy function (E-step):
min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N} |e_{pq}| · δ[L_p ≠ L_q]   (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the
color model for cluster i, |e_{pq}| is the edge length between p and q, and δ is the delta function.
The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters
they are assigned to, and the second term penalizes the situation where pixels in the
neighborhood belong to different clusters. By fixing the k appearance models, the
minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us,
under spatial regularization, the optimal labeling of pixels to different clusters. After spatially
coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step).
We then iterate the E and M steps until convergence or until a predefined number of iterations
is reached.
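The E-M scheme can be sketched in Python on scalar features (illustrative only: a simple ICM sweep stands in for the graph-cut solver of the paper, and `em_segment` with its crude initialization is our hypothetical construction):

```python
def em_segment(features, neighbors, k, lam=1.0, iters=5):
    """Iterative scheme for Eq. (1): the E-step assigns each pixel a cluster
    label trading off feature fit against neighbour disagreement (ICM here,
    graph cut in the paper); the M-step re-estimates each cluster mean.
    `features` holds one scalar per pixel; `neighbors[p]` lists indices
    adjacent to pixel p; `lam` plays the role of lambda in Eq. (1)."""
    centers = [features[i * len(features) // k] for i in range(k)]  # crude init
    labels = [min(range(k), key=lambda c: (f - centers[c]) ** 2) for f in features]
    for _ in range(iters):
        for p, f in enumerate(features):              # E-step (ICM sweep)
            def cost(c):
                smooth = sum(1 for q in neighbors[p] if labels[q] != c)
                return (f - centers[c]) ** 2 + lam * smooth
            labels[p] = min(range(k), key=cost)
        for c in range(k):                            # M-step: new means
            members = [features[p] for p in range(len(features)) if labels[p] == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers
```

ICM only finds a local optimum of Eq. (1), which is why the paper uses graph cuts for the E-step; the sketch is meant to make the energy trade-off concrete, not to reproduce the solver.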
3.2 Curvature-Based Inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on
reconstructing the geometric structure of (chromatic) intensities, which is usually represented
by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same
gray/chromatic intensity in an image. Therefore such methods are well suited for inpainting
images with no or very few textures, due to the fact that level lines concisely capture the
structure and information of the texture-less regions. For van Gogh's paintings, the brush
strokes at each layer are close to textureless. Therefore curvature-based inpainting can be
superior to exemplar-based methods (for instance, in De-pict and Criminisi et al. [3]) for
recovering the structures of underlying brush strokes. In this paper we evaluate the recent
method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as
a linear program. Unlike other methods, this method is independent of initialization and can
handle general inpainting regions, e.g., regions with holes. In the following we briefly review
Schoenemann et al.'s method in detail. To formulate the problem as a linear program in this
approach, curvature is modeled in a discrete sense (where a possible reconstruction of the
level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain
connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and
line segment pairs are used to represent level lines, while the basic regions represent the
pixels. Then, for each potential discrete level line, the curvature is approximated by the sum
of angle changes at all vertices along the level line, with proper weighting by the edge length.
To ensure that regions and level lines are consistent (for instance, level lines should be
continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary
continuation constraints, are imposed on the variables. Finally, the boundary condition (the
intensities of the boundary pixels) of the damaged region can also be easily formulated as
linear constraints. With proper handling of all these constraints, the inpainting problem can
be solved as a linear program. To handle color images, we simply formulate and solve a
linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular
Surgery Training and Research Hospital. All pulmonary computed tomographic angiography
exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG,
Erlangen, Germany). Patients were informed about the examination and also about breath
holding. Imaging was performed with a bolus tracking program. After the scanogram, a single
slice is taken at the level of the pulmonary truncus. A bolus tracking region is placed at the
pulmonary truncus, and the trigger is adjusted to 100 HU (Hounsfield Units). 70 ml of
nonionic contrast agent at the rate of 4 mL/sec with an automated syringe (Optistat Contrast
Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the
pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms.
Contrast injection is performed via an 18-20G intravenous cannula placed at the antecubital
vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2.
Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal
window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen,
Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each
exam consists of 400-500 images with 512x512 resolution.
B Method
The stages that have been followed while doing lung segmentation from CTA images in this
work are shown in Figure 1.
The CTA images in hand number 250 and are 2D. The first step is thresholding the image. A
thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body
(body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air
(non-body pixels). Due to the large difference in intensity between these two groups,
thresholding leads to a good separation. In this study, thresholding was applied so as to
retain the parts brighter than 700 HU. At the end of thresholding, the new images have
logical (binary) values:
Thresh = image > 700
In each of these new images, sub-segment vessels exist in the lung region. In the second step,
the following method has been used to get rid of these vessels: first, each of the 2D images
has been considered one by one, and each of the components in the image has been labeled
with a connected component labeling algorithm. Then, looking at the size of each labeled
piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 has been labeled with the connected component labeling
algorithm. The biggest component, which is logical 1, is the patient's body. This biggest
component has been kept, and the other parts have been removed from the image. Then the
complement is taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach the 1st or 512th pixel
column as logical 1, the parts that satisfy this condition have been removed, and the lung and
airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Due to the fact that
the airway in Figure 5 is very small compared to the lung, each of the images has been
labeled with the connected component labeling algorithm, and the components whose number
of pixels is below 1000 have been determined to be airways and then removed from the
image. The image now in hand is the segmented form of the target lung. Before the airways
were removed, the edges of the image were found with the Sobel algorithm and added to the
original image, so that the edges of the lung and airway region are shown on the original
image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung
image, the original segmented lung image has been obtained (Figure 6(c)).
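The small-component removal used repeatedly above can be sketched in Python (an illustration of the idea, not the MATLAB code used here; `remove_small_components` is our hypothetical name, and 4-connectivity is assumed):

```python
from collections import deque

def remove_small_components(mask, min_pixels=1000):
    """Connected-component labelling (4-connectivity) on a binary mask;
    components smaller than `min_pixels` are erased, as done above to drop
    sub-segment vessels and, later, the airways."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                comp, q = [], deque([(y, x)])
                seen[y][x] = True
                while q:                      # flood fill one component
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] \
                                and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) < min_pixels:
                    for cy, cx in comp:       # erase the small component
                        out[cy][cx] = 0
    return out
```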
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image
processing functions and interactive tools for enhancing and analyzing digital images. The
interactive tools allowed us to perform spatial image transformations; morphological
operations such as edge detection and noise removal; region-of-interest processing; filtering;
basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics
objects semitransparent is a useful technique in 3-D visualization, which furnishes more
information about the spatial relationships of different structures. The toolbox functions
implemented in the open MATLAB language have also been used to develop the customized
algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis
and mathematical computing functions such as signal processing, optimization, partial
differential equation solving, etc. It provides interactive tools including thresholding,
correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D
and 3D plotting functions. The operations for image processing allowed us to perform noise
reduction and image enhancement, image transforms, colormap manipulation, colorspace
conversions, region-of-interest processing, and geometric operations [4]. The toolbox
functions implemented in the open MATLAB language can be used to develop customized
algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness
corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which
is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes
the pixel region rectangle over the image displayed in the Image Tool, defining the group of
pixels that are displayed in extreme close-up view in the Pixel Region tool window. The
Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its
numeric value [25]. For RGB images we find three numeric values, one for each band of the
image. We can also determine the current position of the pixel region in the target image by
using the pixel information given at the bottom of the tool. In this way we found the x- and
y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool
displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The Histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot. The fit curve was plotted as a magenta line through the data plot.
Figure 3b - Area Graph of an X-ray CT brain scan. The Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
The 3-D Surface Plot generates a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, alpha(0)
Figure 4b - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, alpha(4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or by two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, mesh
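The behavior of meshgrid described above can be sketched in a few lines of Python (a pure-Python illustration of the MATLAB semantics, not the MATLAB implementation):

```python
def meshgrid(x, y):
    """MATLAB-style meshgrid: rows of X are copies of the vector x,
    and columns of Y are copies of the vector y."""
    X = [list(x) for _ in y]          # one copy of x per element of y
    Y = [[v] * len(x) for v in y]     # each row of Y repeats one y value
    return X, Y
```

Evaluating a function f(X, Y) elementwise over these two matrices then covers the whole Cartesian grid, which is exactly how surface plots such as Figure 4c are produced.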
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and 'lighting' processing can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of an X-ray CT brain scan generated with histogram values, lighting
The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' function with Colormap Scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through the colors cyan, yellow, and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined with numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole-Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, which also describes the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points or as an n × 2 array, where the n rows correspond to the points and the two columns hold the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean distance between the points and the centroid. This can be used to measure the size of the test image, which helps with automatic initialization (discussed in Section 4.4).
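The centroid and size definitions above can be written directly for the 2n-vector shape representation; a numpy sketch (the unit square is an invented example):

```python
import numpy as np

def centroid_and_size(shape_vec):
    """Centroid and size of a shape stored as a 2n-vector [x1..xn, y1..yn].

    The centroid is the mean point position; the size is taken here as the
    root mean squared distance of the points from the centroid (one common
    reading of "root mean distance").
    """
    n = len(shape_vec) // 2
    xs, ys = shape_vec[:n], shape_vec[n:]
    cx, cy = xs.mean(), ys.mean()
    size = np.sqrt(np.mean((xs - cx) ** 2 + (ys - cy) ** 2))
    return (cx, cy), size

# A unit square: x co-ordinates first, then y co-ordinates.
square = np.array([0., 1., 1., 0., 0., 0., 1., 1.])
(cx, cy), size = centroid_and_size(square)
```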
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and the mean shape
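Algorithm 1 can be sketched in numpy as below; the alignment in step (a) uses the standard SVD-based Procrustes similarity fit, which is one common choice rather than necessarily the thesis's exact implementation, and "unit size" is taken as unit Frobenius norm:

```python
import numpy as np

def center(s):
    """Translate an n-by-2 shape so its centroid is at the origin (step 2)."""
    return s - s.mean(axis=0)

def to_unit(s):
    """Scale a shape to unit size (unit Frobenius norm here)."""
    return s / np.linalg.norm(s)

def align(s, ref):
    """Best similarity (rotation + scale) fit of centred shape s to ref,
    via the standard SVD-based Procrustes solution."""
    u, sv, vt = np.linalg.svd(s.T @ ref)
    r = u @ vt                          # rotation with s @ r ~= ref
    scale = sv.sum() / np.sum(s * s)
    return scale * (s @ r)

def align_shapes(shapes, iters=10):
    """Algorithm 1: iteratively align a set of n-by-2 shapes to their mean."""
    shapes = [center(s) for s in shapes]            # step 2
    x0 = to_unit(shapes[0])                         # step 3: reference
    mean = x0
    for _ in range(iters):                          # step 4
        shapes = [align(s, mean) for s in shapes]   # (a)
        mean = np.mean(shapes, axis=0)              # (b)
        mean = to_unit(align(mean, x0))             # (c) constrain
    return shapes, mean

# Two copies of the same rectangle, one rotated, scaled, and shifted.
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
base = np.array([[0., 0.], [2., 0.], [2., 1.], [0., 1.]])
other = 2.0 * base @ rot.T + np.array([5.0, -3.0])
aligned, mean = align_shapes([base, other])
```

Since the two inputs are exact similarity transforms of each other, the aligned shapes coincide with the unit-size mean shape.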
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 1.2) and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. So while the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images; these images are aligned and a mean shape is formulated together with the permissible variations around it [24]:

x̂ = x̄ + Φb    (4.3)

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes xi), Φ is the matrix of eigenvectors of the covariance of the aligned training shapes, and b is a vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b; the model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points. The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
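The shape model of Equation 4.3 can be built by PCA on the aligned training shapes; a numpy sketch, where the tiny training set (squares differing only in scale) is invented for illustration:

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """Build a linear shape model  x_hat = x_bar + P @ b  from aligned
    training shapes, each given as a 2n-vector [x1..xn, y1..yn]."""
    X = np.asarray(shapes, dtype=float)      # (n_samples, 2n)
    x_bar = X.mean(axis=0)                   # mean shape
    cov = np.cov(X - x_bar, rowvar=False)    # covariance of the shapes
    vals, vecs = np.linalg.eigh(cov)         # eigenvalues ascending
    order = np.argsort(vals)[::-1][:n_modes] # keep the largest modes
    return x_bar, vecs[:, order], vals[order]

def generate_shape(x_bar, P, b):
    """Generate a shape from the model for parameter vector b."""
    return x_bar + P @ b

sq = np.array([0., 1., 1., 0., 0., 0., 1., 1.])  # unit square, [xs, ys]
train = [0.9 * sq, 1.0 * sq, 1.1 * sq]           # scaling variation only
x_bar, P, lam = build_shape_model(train)
```

Setting b = 0 reproduces the mean shape; varying the first component of b moves the shape along the dominant mode (here, overall scale).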
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the shape remains consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is performed to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone X-ray, is used).
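The profile search described above can be sketched as picking, among the profiles sampled along a whisker, the one with the smallest Mahalanobis distance to the mean profile; the profile values and identity covariance below are invented purely for illustration:

```python
import numpy as np

def mahalanobis(g, g_bar, S_inv):
    """Mahalanobis distance f(g) = (g - g_bar)^T S^-1 (g - g_bar)
    between a sampled test profile g and the mean profile g_bar."""
    d = g - g_bar
    return float(d @ S_inv @ d)

def best_offset(profiles, g_bar, S_inv):
    """Pick the offset along the whisker whose sampled profile is
    closest (in the Mahalanobis sense) to the mean profile."""
    dists = [mahalanobis(g, g_bar, S_inv) for g in profiles]
    return int(np.argmin(dists)), dists

g_bar = np.array([0., 1., 0.])          # hypothetical mean profile
S_inv = np.eye(3)                       # identity covariance for simplicity
profiles = [np.array([1., 0., 1.]),     # candidate profiles along a whisker
            np.array([0.1, 0.9, 0.0]),
            np.array([1., 1., 1.])]
idx, dists = best_offset(profiles, g_bar, S_inv)
```

The second candidate, which most resembles the mean profile, wins.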
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour, with a small restriction on the shape, size, and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained to the shape of the object to be segmented. This is relevant to the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.
The working mechanisms of the methods discussed above are explained in detail in
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays: if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches, and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.
3.2 Edge Detection
Edge detection falls under the category of image feature detection, which also includes methods such as ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2-D image is a 2-D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representations of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image the direction of the gradient vector shows the direction of the largest increase in intensity, while its magnitude denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at a point in a region of constant image intensity is a zero vector, and at a point on an edge is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks (kernels), one for the horizontal direction and one for the vertical direction, which approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2-D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are representations of the convolution kernels used.
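A minimal numpy sketch of the Sobel operator described above, using a naive "valid" correlation with no padding (the step-edge test image is invented for illustration):

```python
import numpy as np

# Sobel kernels: horizontal (x) and vertical (y) derivative approximations.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def conv2_valid(img, k):
    """Naive 'valid' 2-D correlation of img with a 3x3 kernel k."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def sobel(img):
    """Gradient magnitude and direction from the Sobel operator."""
    dx = conv2_valid(img, KX)
    dy = conv2_valid(img, KY)
    return np.hypot(dx, dy), np.arctan2(dy, dx)

# A vertical step edge: dark on the left, bright on the right.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
mag, ang = sobel(img)
```

The magnitude is zero in the flat regions and large along the step, and the angle is 0 there (the gradient points across the edge, toward the brighter side).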
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image, but the convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel when calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]; in simple terms, it calculates the magnitude of the change between a pixel and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases when images are noisy, but it is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge-detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering produces a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). The gradient of the image is then calculated, as in other filters such as Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels below a certain threshold are suppressed. A two-level thresholding technique, similar to the example in Section 2.4, is then applied to the data: if a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds, it is set to 1 if it is adjacent or diagonally adjacent to a high-valued pixel, and to 0 otherwise [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
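The two-level (hysteresis) thresholding step can be sketched as follows; this covers only the thresholding stage of Canny, applied to an invented gradient-magnitude array:

```python
import numpy as np

def hysteresis_threshold(mag, low, high, iters=100):
    """Canny-style two-level thresholding: pixels above `high` are edges;
    pixels between `low` and `high` become edges only if they connect
    (8-neighbourhood) to an already-accepted edge pixel."""
    strong = mag >= high
    weak = (mag >= low) & ~strong
    edges = strong.copy()
    for _ in range(iters):
        # Grow the current edge set one step in all 8 directions.
        grown = edges.copy()
        grown[1:, :] |= edges[:-1, :]
        grown[:-1, :] |= edges[1:, :]
        grown[:, 1:] |= edges[:, :-1]
        grown[:, :-1] |= edges[:, 1:]
        grown[1:, 1:] |= edges[:-1, :-1]
        grown[:-1, :-1] |= edges[1:, 1:]
        grown[1:, :-1] |= edges[:-1, 1:]
        grown[:-1, 1:] |= edges[1:, :-1]
        new_edges = edges | (grown & weak)
        if (new_edges == edges).all():
            break
        edges = new_edges
    return edges

mag = np.array([[0.9, 0.4, 0.0],
                [0.0, 0.0, 0.0],
                [0.0, 0.0, 0.4]])
edges = hysteresis_threshold(mag, low=0.3, high=0.8)
```

The weak pixel next to the strong one is kept, while the isolated weak pixel in the corner is discarded.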
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis uses the texture of an image to analyze it: it attempts to quantify visual or other simple characteristics of a region so that the image can be analyzed according to them [23]. For example, visible properties of an image such as roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
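Range filtering as described (local maximum minus local minimum in a neighbourhood) can be sketched in numpy; the step image below is an invented example:

```python
import numpy as np

def range_filter(img, size=3):
    """Local range (max - min) over a size-by-size neighbourhood,
    computed on the 'valid' interior of the image; textured areas give
    large values while flat areas give 0."""
    h, w = img.shape
    r = size // 2
    windows = []
    for dy in range(size):
        for dx in range(size):
            windows.append(img[dy:dy + h - 2 * r, dx:dx + w - 2 * r])
    stack = np.stack(windows)
    return stack.max(axis=0) - stack.min(axis=0)

# Two flat regions separated by a step: only the boundary has texture.
img = np.zeros((5, 7))
img[:, 4:] = 2.0
out = range_filter(img)
```

The filter responds only where the window straddles the intensity step.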
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD) ]
           [ Ixy(x, σD)  Iyy(x, σD) ]    (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)    (2)

or

P(x) = min(λ2(x), 0)    (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image I1,1, and then produce increasingly Gaussian-smoothed images I1,j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images I1,1 to I1,6. Image I1,4 is downsampled to half its size to produce image I2,1, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pi,j for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP1,2 MP1,3 MP1,4 MP1,5
MP2,2 MP2,3 MP2,4 MP2,5
...
MPn,2 MPn,3 MPn,4 MPn,5    (4)

where MPi,j = max(Pi,j−1, Pi,j, Pi,j+1). Figure 2(b) shows one of the maximum curvature images MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
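The principal curvature image of Eq. 2 can be sketched with finite differences in place of the Gaussian derivatives the paper uses (so this omits the scale parameter σD); the dark-line test image is invented for illustration:

```python
import numpy as np

def principal_curvature(img):
    """Maximum eigenvalue of the Hessian at each pixel, clamped at zero
    (Eq. 2), using np.gradient finite differences for brevity."""
    iy, ix = np.gradient(img)        # derivatives along rows (y), cols (x)
    ixy, ixx = np.gradient(ix)       # d(ix)/dy, d(ix)/dx
    iyy, _ = np.gradient(iy)         # d(iy)/dy
    # Closed-form eigenvalues of the symmetric 2x2 Hessian [[ixx,ixy],[ixy,iyy]].
    mean = 0.5 * (ixx + iyy)
    dev = np.sqrt(((ixx - iyy) * 0.5) ** 2 + ixy ** 2)
    lam1 = mean + dev                # maximum eigenvalue
    return np.maximum(lam1, 0.0)     # P(x) = max(lambda1(x), 0)

# A dark horizontal line on a light background: Eq. 2's target structure.
img = np.ones((7, 7))
img[3, :] = 0.0
P = principal_curvature(img)
```

The response is strong along the dark line and zero in the flat background, matching the behaviour described for Eq. 2.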
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image
segmentation It is normally applied either to an intensity image directly or to the gradient
magnitude of an image We instead apply the watershed transform to the principal curvature
image However the watershed transform is sensitive to noise (and other small perturbations)
in the intensity image A consequence of this is that the small image variations form local
minima that result in many small watershed regions Figure 3(a) shows the over
segmentation results when the watershed algorithm is applied directly to the principal
curvature image in Figure 2(b)) To achieve a more stable watershed segmentation we first
apply a grayscale morphological closing followed by hysteresis thresholding The grayscale
morphological closing operation is defined as f bull b = (f oplus b) ordf b where f is the image MP from
Eq 4 b is a 5 times 5 disk-shaped structuring element and oplus and ordf are the grayscale dilation and
erosion respectively The closing operation removes small ldquopotholesrdquo in the principal
curvature terrain thus eliminating many local minima that result from noise and that would
otherwise produce watershed catchment basins Beyond the small (in terms of area of
influence) local minima there are other variations that have larger zones of influence and that
are not reclaimed by the morphological closing To further eliminate spurious or unstable
watershed regions we threshold the principal curvature image to create a clean binarized
principal curvature image However rather than apply a straight threshold or even hysteresis
thresholdingndashboth of which can still miss weak image structuresndashwe apply a more robust
eigenvector-guided hysteresis thresholding to help link structural cues and remove
perturbations Since the eigenvalues of the Hessian matrix are directly related to the signal
strength (ie the line or edge contrast) the principal curvature image may at times become
weak due to low contrast portions of an edge or curvilinear structure These low contrast
segments may potentially cause gaps in the thresholded principal curvature image which in
turn cause watershed regions to merge that should otherwise be separate However the
directions of the eigenvectors provide a strong indication of where curvilinear structures
appear and they are more robust to these intensity perturbations than is the eigenvalue
magnitude In eigenvector-flow hysteresis thresholding there are two thresholds (high and
low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the eight adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.
Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).
The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit, via PCA, with ellipses that have the same second moments as the watershed regions (Fig. 2(e)).
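The second-moment ellipse fit can be sketched as follows. This is not the authors' implementation; it is a minimal NumPy sketch, with hypothetical function and variable names, that treats a region as a set of pixel coordinates and recovers the equivalent ellipse via PCA.

```python
import numpy as np

def fit_ellipse(region_pixels):
    """Fit an ellipse with the same second moments as a pixel region.

    region_pixels: (N, 2) array of (row, col) coordinates of one
    watershed region.  Returns (center, semi_axes, angle), where angle
    is the orientation of the major axis.
    """
    pts = np.asarray(region_pixels, dtype=float)
    center = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)            # 2x2 second-moment matrix
    evals, evecs = np.linalg.eigh(cov)          # principal axes via PCA
    # An ellipse with covariance C has semi-axes proportional to the
    # square roots of C's eigenvalues; 2*sqrt is used here for illustration.
    semi_axes = 2.0 * np.sqrt(np.maximum(evals, 0.0))
    angle = np.arctan2(evecs[1, -1], evecs[0, -1])  # major-axis direction
    return center, semi_axes, angle
```

The eigenvectors give the ellipse orientation and the eigenvalues its axis lengths, which is exactly the "same second moments" condition stated above.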
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave; the overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
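The scale-stability test can be sketched with binary region masks. Note that [19] defines the overlap error on elliptical regions; this simplified sketch, with names of our own choosing, works directly on pixel masks.

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """Overlap error between two regions: 1 - |A ∩ B| / |A ∪ B|."""
    a, b = np.asarray(mask_a, bool), np.asarray(mask_b, bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return 1.0 - np.logical_and(a, b).sum() / union

def stable_in_triplet(prev_regions, cur_regions, next_regions, max_err=0.5):
    """Keep a region of the middle scale only if it overlaps well with
    some region in both the previous and the next scale, i.e. it is
    detected in three consecutive scales."""
    stable = []
    for m in cur_regions:
        if (any(overlap_error(m, p) <= max_err for p in prev_regions)
                and any(overlap_error(m, q) <= max_err for q in next_regions)):
            stable.append(m)
    return stable
```

Applying this to each consecutive triplet in an octave keeps only the regions the text calls stable across local scale changes.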
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering the layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated whenever more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes on the same layer of a painting have similar colors, the regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
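The clustering step can be sketched with plain k-means on per-pixel color vectors. This is a simplified stand-in for De-pict's combination of k-means and complete-linkage; all names are illustrative, and the initial centers are spread deterministically for reproducibility.

```python
import numpy as np

def kmeans_colors(pixels, k, iters=20):
    """Minimal k-means on (N, 3) color features; a stand-in for the
    chromatic clustering step (complete-linkage is omitted here)."""
    feats = np.asarray(pixels, dtype=float)
    # Deterministic initialization: spread initial centers over the input.
    centers = feats[np.linspace(0, len(feats) - 1, k).astype(int)].copy()
    labels = np.zeros(len(feats), dtype=int)
    for _ in range(iters):
        # Assign each pixel to its nearest mean chromatic vector...
        d = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # ...then re-estimate each cluster center as the mean.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)
    return labels, centers
```

Each resulting label plays the role of one layer's mean chromatic feature vector described in the text.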
3.1 Spatially Coherent Segmentation
We improve the layer segmentation by combining k-means with a spatial-coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian; the Gaussians share the same covariance and differ only in their means. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
min_L Σ_p ‖f_p − c_{L_p}‖₂² + λ Σ_{{p,q}∈N} T[L_p ≠ L_q] / |e_pq|    (1)
where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, λ weights the two terms, and T is the delta (indicator) function.
The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After the spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E- and M-steps until convergence or until a predefined number of iterations is reached.
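The E-M loop above can be sketched as follows. For brevity this sketch replaces the graph-cut E-step with a simple iterated-conditional-modes sweep and folds the edge-length weighting into a single parameter `lam`; it illustrates the structure of the iteration, not the paper's actual solver.

```python
import numpy as np

def em_segmentation(features, adjacency, k, lam=1.0, iters=5):
    """Sketch of the iterative E-M layer segmentation.

    features : (N, D) per-pixel color features
    adjacency: list of (p, q) neighbor index pairs
    k        : number of layers (clusters)
    """
    feats = np.asarray(features, dtype=float)
    n = len(feats)
    # Deterministic initialization: spread initial centers over the data.
    centers = feats[np.linspace(0, n - 1, k).astype(int)].copy()
    d = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    neighbors = [[] for _ in range(n)]
    for p, q in adjacency:
        neighbors[p].append(q)
        neighbors[q].append(p)
    for _ in range(iters):
        # E-step (approximate): minimize appearance cost plus a Potts
        # penalty for disagreeing with neighbors (stand-in for graph cut).
        for p in range(n):
            costs = [np.sum((feats[p] - centers[j]) ** 2)
                     + lam * sum(labels[q] != j for q in neighbors[p])
                     for j in range(k)]
            labels[p] = int(np.argmin(costs))
        # M-step: re-estimate each cluster's mean chromatic vector.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)
    return labels, centers
```

The alternation mirrors the text: labels are refined under a smoothness prior, then the k mean chromatic vectors are re-estimated from the new labeling.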
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. In van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, it is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (Fig. 4 shows a possible reconstruction of a level line with intensity 100). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute the line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent is delivered at a rate of 4 mL/s with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragm. Contrast injection is performed via an 18-20 G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed at 1 mm and 5 mm thickness and evaluated in a mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.
B Method
The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.
The available CTA data consist of 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is applied first, retaining the parts greater than 700 HU. At the end of thresholding, the new images are logical (binary):
Thresh = image > 700
In each of these new images, subsegmental vessels remain in the lung region. In the second step, to remove these vessels, each 2D image is considered one by one and each component in the image is labeled with a connected-component labeling algorithm. Then, based on the size of each labeled piece, components whose pixel count is under 1000 are removed from the image (Figure 3).
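The vessel-removal step (connected-component labeling followed by deletion of components under 1000 pixels) can be sketched as follows. This is a minimal pure-Python/NumPy sketch using 4-connectivity, not the code used in the study.

```python
import numpy as np
from collections import deque

def remove_small_components(binary, min_pixels=1000):
    """Label 4-connected components of a binary image and keep only
    those with at least min_pixels pixels."""
    binary = np.asarray(binary, dtype=bool)
    out = np.zeros_like(binary)
    seen = np.zeros_like(binary)
    rows, cols = binary.shape
    for r in range(rows):
        for c in range(cols):
            if binary[r, c] and not seen[r, c]:
                # Breadth-first search collects one connected component.
                comp, queue = [(r, c)], deque([(r, c)])
                seen[r, c] = True
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            comp.append((ny, nx))
                            queue.append((ny, nx))
                if len(comp) >= min_pixels:
                    for y, x in comp:
                        out[y, x] = True
    return out
```

In practice a library labeling routine (e.g. the connected-component labeling in an image processing toolbox) would replace the hand-written search.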
Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component is kept and the other parts are removed from the image. The image is then inverted, so every "0" turns into "1" and every "1" turns into "0" (Figure 4).
Since the parts outside the body in the image shown in Figure 4 contain pixels in row or column 1 or 512 (the image border), the components that satisfy this condition are removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected-component labeling algorithm, and the components whose pixel count is below 1000 are identified as airways and removed from the image. The image now in hand is the segmented form of the target lung. Before the airways are removed, the edges of the image are found with the Sobel algorithm and added to the original image, so the edges of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
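The inversion and border-clearing steps of Figures 4-5, together with the final mask multiplication of Figure 6(c), can be sketched as follows. This is an illustrative reimplementation with hypothetical names, removing the border-connected background with a flood fill.

```python
import numpy as np
from collections import deque

def clear_border_and_mask(body_mask, ct_image):
    """Invert the body mask, drop anything touching the image border
    (the air outside the body), and apply the result to the CT image."""
    inv = ~np.asarray(body_mask, dtype=bool)
    rows, cols = inv.shape
    # Seed a flood fill from every border pixel of the inverted mask.
    seeds = deque()
    for r in range(rows):
        for c in (0, cols - 1):
            if inv[r, c]:
                seeds.append((r, c))
    for c in range(cols):
        for r in (0, rows - 1):
            if inv[r, c]:
                seeds.append((r, c))
    keep = inv.copy()
    while seeds:
        y, x = seeds.popleft()
        if keep[y, x]:
            keep[y, x] = False           # erase border-connected air
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < rows and 0 <= nx < cols and keep[ny, nx]:
                    seeds.append((ny, nx))
    # Multiplying the mask with the CT image yields the segmented lung.
    return keep, ct_image * keep
```

What survives the flood fill is exactly the low-intensity region enclosed by the body, i.e. the lungs and airway of Figure 5.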
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions, implemented in the open MATLAB language, can be used to develop customized algorithms.
An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The Histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data plot. The Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0)
Figure 4b - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0.4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
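The meshgrid behaviour described above can be illustrated with its NumPy analogue, `numpy.meshgrid`, which follows the same convention as MATLAB's:

```python
import numpy as np

# Rows of X are copies of x; columns of Y are copies of y.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # both X and Y are 2x3 matrices
Z = X ** 2 + Y             # evaluate a function of two variables on the grid
```

Here `Z` holds the function values over the full Cartesian grid, ready for surface plotting.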
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The "image" with colormap scaling ("imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8).
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole/Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, where the experiments performed in this thesis to improve the performance of the model are also described. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points, respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 - Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two shapes, like the Procrustes distance, but in this thesis the distance means the Euclidean distance.
d = √((y2 − y1)² + (x2 − x1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).
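The centroid and size definitions can be expressed directly. A small sketch, assuming an (n, 2) point array and the root-mean-square size definition above:

```python
import numpy as np

def centroid_and_size(shape):
    """shape: (n, 2) array of landmark points.
    Returns the centroid and the shape size, taken here as the root
    mean square distance of the points from the centroid."""
    pts = np.asarray(shape, dtype=float)
    centroid = pts.mean(axis=0)
    size = np.sqrt(np.mean(np.sum((pts - centroid) ** 2, axis=1)))
    return centroid, size
```

Both quantities are translation- and rotation-friendly summaries, which is why they are useful for alignment and initialization.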
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and mean shape
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.
Figure 4.3 shows the original image and the manually landmarked image for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits closely to the profile model. The tentative location of the landmarks is obtained from the suggested shape.
2. The shape model defines the permissible relative positions of landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:
x̂ = x̄ + Φb    (4.3)
where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e., the average of the aligned training shapes x_i,
Φ is the matrix whose columns are the principal modes of variation, and
b is the vector of shape parameters.
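The shape model x̂ = x̄ + Φb can be sketched with PCA over the aligned shape vectors. This is a minimal illustration with names of our own choosing, not the thesis code:

```python
import numpy as np

def build_shape_model(aligned_shapes, n_modes=2):
    """PCA shape model: returns the mean shape and the matrix Phi whose
    columns are the strongest modes of variation.
    aligned_shapes: (m, 2n) array of aligned shape vectors."""
    X = np.asarray(aligned_shapes, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1][:n_modes]   # largest eigenvalues first
    return mean, evecs[:, order]

def generate_shape(mean, phi, b):
    """x_hat = mean + Phi b, as in Eq. 4.3."""
    return mean + phi @ np.asarray(b, dtype=float)
```

Setting b = 0 reproduces the mean shape; varying b along each column of Φ sweeps out the permissible variations learnt from the training set.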
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix S_g.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by
f(g) = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, rather than a bone X-ray, is used for illustration).
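The profile search of Section 4.2.3 can be sketched as follows, assuming profiles are plain vectors and S_g is invertible; the names are illustrative.

```python
import numpy as np

def mahalanobis(g, g_mean, S_g):
    """Profile match cost f(g) = (g - g_mean)^T S_g^{-1} (g - g_mean)."""
    d = np.asarray(g, float) - np.asarray(g_mean, float)
    return float(d @ np.linalg.solve(S_g, d))

def best_offset(profiles, g_mean, S_g):
    """Among candidate profiles sampled along the whisker, return the
    index of the one closest to the mean profile."""
    return int(np.argmin([mahalanobis(g, g_mean, S_g) for g in profiles]))
```

Each landmark moves to the whisker offset whose profile minimizes this cost, after which the shape model re-imposes the global shape constraint.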
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile and the model perform better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. But the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.
The trade-off between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy is lower than when the initial conditions of the model are defined by user input. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.
3.2 Edge Detection
Edge detection falls under the category of feature detection in images, which also includes methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels or sets of pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries, or edges. Figure 3.1 is an example of basic edge detection in images.
3.2.1 Sobel Edge Detector
The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; the gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction and the other for the vertical direction in an image, which approximate the derivatives in those two directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
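As an illustrative sketch (in Python with NumPy/SciPy rather than the MATLAB used elsewhere in this thesis; the kernel values are the standard Sobel masks), the directional derivatives, magnitude, and direction described above can be computed as:

```python
import numpy as np
from scipy.signal import convolve2d

# Sobel kernels approximating the derivative in the x (horizontal)
# and y (vertical) directions; note the weight of 2 on the centre entries.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel_gradient(image):
    """Return the gradient magnitude and direction (radians) of a 2D image."""
    dx = convolve2d(image, KX, mode="same", boundary="symm")
    dy = convolve2d(image, KY, mode="same", boundary="symm")
    magnitude = np.hypot(dx, dy)       # sqrt(Dx^2 + Dy^2), as in Eq. 3.1
    direction = np.arctan2(dy, dx)     # angle of steepest increase, Eq. 3.2
    return magnitude, direction
```

On a region of constant intensity the result is the zero vector, while at a step edge the magnitude peaks and the direction points from darker to brighter values.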
3.2.2 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image, but the convolution kernels used by Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.
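The difference between the two detectors can be made concrete by writing out both horizontal-derivative kernels side by side (a sketch; the values are the standard Prewitt and Sobel masks):

```python
import numpy as np

# Prewitt weights all three rows equally; Sobel gives the centre entries a
# weight of 2, which makes it slightly less sensitive to noise.
PREWITT_X = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]], dtype=float)
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
# The vertical-derivative kernels are simply the transposes.
PREWITT_Y = PREWITT_X.T
SOBEL_Y = SOBEL_X.T
```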
3.2.3 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but it is still used as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
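A minimal sketch of the Roberts Cross in Python (the two 2×2 kernels are the standard diagonal-difference masks):

```python
import numpy as np
from scipy.signal import convolve2d

# Each 2x2 kernel takes the difference between one pair of diagonally
# adjacent pixels.
R1 = np.array([[1.0, 0.0],
               [0.0, -1.0]])
R2 = np.array([[0.0, 1.0],
               [-1.0, 0.0]])

def roberts_magnitude(image):
    """Edge magnitude as the root of the sum of squared diagonal differences."""
    g1 = convolve2d(image, R1, mode="same")
    g2 = convolve2d(image, R2, mode="same")
    return np.hypot(g1, g2)
```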
3.2.4 Canny Edge Detector
The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
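The steps above can be sketched as follows (a simplified pipeline: non-maximum suppression is omitted for brevity, and the threshold values are illustrative, not those of any particular implementation):

```python
import numpy as np
from scipy import ndimage

def canny_sketch(image, sigma=1.0, low=0.1, high=0.3):
    """Simplified Canny pipeline: Gaussian smoothing, gradient magnitude,
    then two-level hysteresis thresholding."""
    smoothed = ndimage.gaussian_filter(image, sigma)   # blur away outlier pixels
    dx = ndimage.sobel(smoothed, axis=1)
    dy = ndimage.sobel(smoothed, axis=0)
    mag = np.hypot(dx, dy)
    strong = mag > high                                # definitely an edge
    weak = mag > low                                   # possibly an edge
    # Keep weak pixels only if they are 8-connected to a strong pixel.
    labels, _ = ndimage.label(weak, structure=np.ones((3, 3)))
    keep = np.unique(labels[strong])
    return np.isin(labels, keep[keep > 0])
```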
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis attempts to use the texture of an image to analyze it, quantifying visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
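Both filters can be sketched with standard neighborhood operations (a sketch assuming a square neighborhood; the window size here is illustrative):

```python
import numpy as np
from scipy import ndimage

def range_filter(image, size=3):
    """Local range: max minus min over a size x size neighbourhood.
    Textured areas (e.g. bone) give a high response; smooth areas a low one."""
    return ndimage.maximum_filter(image, size) - ndimage.minimum_filter(image, size)

def stdev_filter(image, size=3):
    """Local standard deviation over a size x size neighbourhood."""
    mean = ndimage.uniform_filter(image, size)
    sq_mean = ndimage.uniform_filter(image * image, size)
    return np.sqrt(np.maximum(sq_mean - mean * mean, 0.0))
```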
3 Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]   (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)
or
P(x) = min(λ2(x), 0)   (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5   (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
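The per-pixel computation behind Eq. 2 can be sketched as follows (not the authors' implementation; second derivatives are taken as Gaussian derivatives at scale σ, and the eigenvalues of the symmetric 2×2 Hessian are obtained in closed form):

```python
import numpy as np
from scipy import ndimage

def principal_curvature(image, sigma=1.0):
    """Principal curvature image P(x) = max(lambda1(x), 0), where lambda1 is
    the maximum eigenvalue of the Hessian at Gaussian scale sigma (Eq. 2)."""
    # Second-order Gaussian derivatives (axis 0 = y/rows, axis 1 = x/columns).
    Ixx = ndimage.gaussian_filter(image, sigma, order=(0, 2))
    Iyy = ndimage.gaussian_filter(image, sigma, order=(2, 0))
    Ixy = ndimage.gaussian_filter(image, sigma, order=(1, 1))
    # Closed-form eigenvalues of the symmetric 2x2 Hessian at every pixel.
    half_trace = 0.5 * (Ixx + Iyy)
    disc = np.sqrt((0.5 * (Ixx - Iyy)) ** 2 + Ixy ** 2)
    lam1 = half_trace + disc
    return np.maximum(lam1, 0.0)
```

A dark line on a light background produces a strong positive response along the line, as the text describes.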
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; a consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue
magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
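The cleaning pipeline can be sketched as follows (a simplification: the per-pixel eigenvector-flow test is omitted, so a single low-to-high ratio of 0.2 stands in for the adaptive one; on a binary image the catchment basins reduce to the connected components of the non-ridge pixels):

```python
import numpy as np
from scipy import ndimage

def disk5():
    """5x5 disk-shaped structuring element, as used by the closing."""
    y, x = np.ogrid[-2:3, -2:3]
    return x * x + y * y <= 4

def clean_watershed_regions(mp, high=0.04, ratio=0.2):
    """Grayscale closing, two-level hysteresis threshold, then regions as
    connected components of the binary complement of the ridge pixels."""
    closed = ndimage.grey_closing(mp, footprint=disk5())   # remove "potholes"
    strong = closed > high
    weak = closed > high * ratio
    labels, _ = ndimage.label(weak, structure=np.ones((3, 3)))
    keep = np.unique(labels[strong])
    ridges = np.isin(labels, keep[keep > 0])
    # Every connected 0-region of the binary image is a catchment basin.
    regions, n = ndimage.label(~ridges)
    return regions, n
```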
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
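The stability test can be sketched as follows (a simplification: [19] computes the overlap error on fitted ellipses, whereas binary region masks are used here; the error threshold is illustrative):

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """1 - intersection/union of two region masks; 0 means identical regions."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union if union else 1.0

def stable_across_scales(masks, max_error=0.3):
    """Keep a region only if its overlap error with both neighbouring scales
    of a consecutive-scale triplet stays below max_error."""
    a, b, c = masks
    return overlap_error(a, b) < max_error and overlap_error(b, c) < max_error
```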
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes
2. Identify the current top layer
3. Inpaint the regions of the top layer
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage as a clustering step to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but the challenge of automating it is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M way.10,11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||_2^2  +  λ Σ_{{p,q}∈N} |e_{pq}| δ_T[L_p ≠ L_q]   (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, and δ_T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
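The E-M loop can be sketched as follows (a sketch only: the exact E-step minimization uses graph cuts,12 and a simple iterated-conditional-modes (ICM) sweep stands in for it here; lam, iters, and the initialization are illustrative):

```python
import numpy as np

def em_segmentation(features, k, lam=1.0, iters=5, seed=0):
    """Alternate (E) label refinement under a Potts-style spatial prior,
    approximated by ICM, and (M) re-estimation of the k mean vectors."""
    h, w, d = features.shape
    rng = np.random.default_rng(seed)
    centers = features.reshape(-1, d)[rng.choice(h * w, k, replace=False)]
    labels = np.argmin(((features[..., None, :] - centers) ** 2).sum(-1), axis=-1)
    for _ in range(iters):
        # E-step (ICM): data term plus a penalty for disagreeing neighbours.
        data = ((features[..., None, :] - centers) ** 2).sum(-1)    # (h, w, k)
        for y in range(h):
            for x in range(w):
                cost = data[y, x].copy()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        cost += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = np.argmin(cost)
        # M-step: re-estimate each cluster's mean chromatic vector.
        for c in range(k):
            if (labels == c).any():
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers
```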
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5,7 Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well suited for inpainting images with no or very few textures, due to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.,7 which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program in this approach, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of the angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek thoracic and cardiovascular surgery training and research hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. The bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield Units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in a mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes; oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.
B. Method
The stages followed while doing lung segmentation from CTA images in this work are shown in Figure 1.
The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to keep the parts greater than 700 HU. At the end of thresholding, the new images are of logical (binary) values:
Thresh = image > 700
In each of these new images, subsegmental vessels exist in the lung region. At the second step, the following method has been used to get rid of these vessels: each of the 2D images has been considered one by one, and each of the components in the image has been labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 has been labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component has been kept, the other parts have been removed from the image, and then its complement has been taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).
As the parts outside the body in the image shown in Figure 4 touch row or column 1 or 512, the parts that meet this condition have been removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Due to the fact that the airway in Figure 5 is very small compared to the lung, each of the images has been labeled with the connected component labeling algorithm, and the components whose number of pixels is below 1000 have been determined to be airways and removed from the image. The image now at hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).
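The labeling-and-removal steps above can be sketched as follows (a sketch in Python/SciPy rather than the original implementation; the size threshold mirrors the 1000-pixel cutoff from the text, though a small value is used in testing):

```python
import numpy as np
from scipy import ndimage

def remove_small_components(binary, min_pixels=1000):
    """Connected-component labelling followed by removal of components with
    fewer than min_pixels pixels (used to drop vessels and airways)."""
    labels, n = ndimage.label(binary)
    sizes = ndimage.sum(binary, labels, index=np.arange(1, n + 1))
    keep = np.flatnonzero(sizes >= min_pixels) + 1
    return np.isin(labels, keep)

def largest_component(binary):
    """Keep only the largest connected component (the patient's body)."""
    labels, n = ndimage.label(binary)
    if n == 0:
        return binary
    sizes = ndimage.sum(binary, labels, index=np.arange(1, n + 1))
    return labels == (np.argmax(sizes) + 1)
```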
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a ''voxel'' [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The Histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data plot. The Area Graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values, alpha(0.4)
The ''meshgrid'' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
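The same idea is available in NumPy under the same name; a minimal sketch (the coordinate vectors and the surface function are illustrative):

```python
import numpy as np

# meshgrid replicates two coordinate vectors into matrices X and Y so that a
# function of two variables can be evaluated over the whole grid at once.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0])
X, Y = np.meshgrid(x, y)   # rows of X copy x; columns of Y copy y
Z = X ** 2 + Y ** 2        # surface heights, ready for surf/mesh-style plotting
```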
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
3-D Surface Plot with Contour (the surfc function) displays a matrix as a surface with a contour plot below. ''Lighting'' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The ''image'' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). ''Image'' with Colormap Scaling (the ''imagesc'' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. ''Jet'' ranges from blue to red and passes through the colors cyan, yellow, and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a. By using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
41 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns hold their x and y co-ordinates respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them; Equation 4.1 gives the formula for the distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two shapes, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)   (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid; this can be used to measure the size of the shape in the test image, which helps with automatic initialization (discussed in Section 4.4).
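The centroid and size measures above can be sketched directly (a toy example, assuming the 2n-by-1 layout with x co-ordinates followed by y co-ordinates):

```python
import numpy as np

def centroid(shape_vec):
    """Mean point position of a shape stored as [x1..xn, y1..yn]."""
    n = len(shape_vec) // 2
    xs, ys = shape_vec[:n], shape_vec[n:]
    return np.array([xs.mean(), ys.mean()])

def shape_size(shape_vec):
    """Root mean square distance of the points from the centroid."""
    n = len(shape_vec) // 2
    pts = np.stack([shape_vec[:n], shape_vec[n:]], axis=1)
    d2 = ((pts - centroid(shape_vec)) ** 2).sum(axis=1)
    return np.sqrt(d2.mean())

# A unit-style example: the corners of a 2x2 square
square = np.array([0.0, 2.0, 2.0, 0.0,   # x co-ordinates
                   0.0, 0.0, 2.0, 2.0])  # y co-ordinates
c = centroid(square)       # centre of the square
s = shape_size(square)     # every corner is sqrt(2) from the centre
```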
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
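Algorithm 1 can be sketched as follows (a simplification that handles translation and scale only; the full method also removes rotation, and the shapes here are (n, 2) arrays rather than 2n-vectors):

```python
import numpy as np

def normalize(shape):
    """Center a shape on the origin and scale it to unit size."""
    centered = shape - shape.mean(axis=0)
    return centered / np.linalg.norm(centered)

def align_shapes(shapes, iters=10):
    shapes = [normalize(s) for s in shapes]   # steps 1-3
    mean = shapes[0]                          # reference: the 1st shape
    for _ in range(iters):                    # step 4
        # (a) align all shapes to the mean (here: just renormalize)
        # (b) recalculate the mean shape, (c) constrain it to unit size
        new_mean = normalize(np.mean(shapes, axis=0))
        if np.allclose(new_mean, mean):
            break                             # step 5: convergence
        mean = new_mean
    return shapes, mean

tri = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])
# A scaled, translated copy aligns to the same normalized shape:
aligned, mean = align_shapes([tri, tri * 2.0 + 5.0])
```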
42 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 1.2) and those images were then re-sized to the same dimensions, ensuring uniformity in the quality of the data being used. Training was done by manually selecting landmarks, placed at approximately equal intervals and distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that fits the model, the shape model ensures that the result stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.
421 The ASM Model
The aim of the model is to try to convert the shape proposed by the individual
profiles into an allowable shape So it tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix whose columns are the principal eigenvectors of the covariance of the aligned training shapes, and
b is a vector of shape parameters weighting those eigenvectors.
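The shape model x̂ = x̄ + Φb can be sketched with PCA (a minimal version, assuming the training shapes are the rows of a matrix with 2n columns):

```python
import numpy as np

def build_shape_model(shapes, t=2):
    """Mean shape and the first t modes of variation."""
    x_bar = shapes.mean(axis=0)
    cov = np.cov(shapes, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]       # largest eigenvalues first
    phi = vecs[:, order[:t]]             # columns = principal eigenvectors
    return x_bar, phi

def generate_shape(x_bar, phi, b):
    """A plausible shape from the mean shape and parameter vector b."""
    return x_bar + phi @ b

rng = np.random.default_rng(0)
shapes = rng.normal(size=(20, 8))        # 20 toy shapes of 4 points each
x_bar, phi = build_shape_model(shapes, t=2)
same = generate_shape(x_bar, phi, np.zeros(2))   # b = 0 gives the mean
```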
422 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b; the model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at the landmark points are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix S_g.
423 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g - ḡ)ᵀ S_g⁻¹ (g - ḡ)
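A minimal sketch of this Mahalanobis profile comparison (the profile and covariance values here are toy data):

```python
import numpy as np

def mahalanobis(g, g_bar, S_g):
    """Mahalanobis distance of profile g from mean profile g_bar."""
    diff = g - g_bar
    return float(diff @ np.linalg.inv(S_g) @ diff)

g_bar = np.array([1.0, 2.0])                 # mean profile
S_g = np.array([[2.0, 0.0],
                [0.0, 0.5]])                 # profile covariance
d = mahalanobis(np.array([2.0, 2.0]), g_bar, S_g)
```

The candidate position along the whisker with the lowest distance is chosen as the new landmark location.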
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model confirms that the result is still consistent with the mean shape; the shape model assures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile matches while the resulting shape is completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust; this makes the model more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, rather than a bone X-ray, is used for illustration.
43 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on; the number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image; however, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The data set in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; Figure 4.6b displays the aligned shapes.
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
pixel in the case of grayscale images. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in image intensity, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector pointing across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction and one for the vertical direction, which approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2-D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
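The computation can be sketched directly (the kernel values are the standard Sobel masks; the 2-D convolution is implemented here by hand over the 'valid' region, and the sign convention follows true convolution, which flips the kernel):

```python
import numpy as np

KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)   # horizontal derivative mask
KY = KX.T                                   # vertical derivative mask

def conv2_valid(A, K):
    """2-D convolution (kernel flipped), 'valid' output region only."""
    kh, kw = K.shape
    h, w = A.shape
    Kf = K[::-1, ::-1]
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (A[i:i + kh, j:j + kw] * Kf).sum()
    return out

# A vertical step edge from dark (0) to bright (10):
A = np.repeat(np.array([[0.0, 0.0, 10.0, 10.0]]), 4, axis=0)
Dx = conv2_valid(A, KX)   # responds to the vertical edge
Dy = conv2_valid(A, KY)   # zero: intensity is constant along y
```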
322 Prewitt Edge Detector
The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image, but the convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel when calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle column where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt; the same variables as in the Sobel case are used, and only the kernels used to calculate the directional derivatives differ.
323 Roberts Edge Detector
The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the gradient magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
324 Canny Edge Detector
The Canny edge detector is considered a very effective edge-detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). The gradient of the image is then calculated, as in other filters such as Sobel and Prewitt. Non-maximal suppression is applied after the gradient step so that pixels below a certain threshold are suppressed. A multi-level thresholding technique, like the two-level example in Section 2.4, is then applied to the data: if a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-valued pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
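The two-level (hysteresis) thresholding step can be sketched as follows (the threshold values here are toy values; strong pixels are kept, and weak pixels are kept only if connected, through 8-adjacency, to an already-kept pixel):

```python
import numpy as np

def hysteresis(img, low, high):
    strong = img >= high
    weak = (img >= low) & ~strong
    out = strong.copy()
    changed = True
    while changed:                      # grow kept set into weak pixels
        changed = False
        for i in range(img.shape[0]):
            for j in range(img.shape[1]):
                if weak[i, j] and not out[i, j]:
                    nb = out[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
                    if nb.any():        # adjacent to a kept pixel
                        out[i, j] = True
                        changed = True
    return out.astype(int)

img = np.array([[0.9, 0.4, 0.1],
                [0.1, 0.4, 0.1],
                [0.1, 0.1, 0.4]])
edges = hysteresis(img, low=0.3, high=0.8)
```

Here the weak pixels form a chain back to the single strong pixel, so all of them survive; an isolated weak pixel would be suppressed.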
33 Image Segmentation
331 Texture Analysis
Texture analysis uses the texture of an image to analyze it: it attempts to quantify the visual or other simple characteristics of a region so that the image can be analyzed according to them [23]. For example, visible properties of an image such as roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis; range filtering calculates the local range of an image.
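Range filtering can be sketched in a few lines (each output pixel is the local range, max minus min, of its 3×3 neighborhood, so textured regions score high and flat regions score zero):

```python
import numpy as np

def range_filter(img):
    """Local range (max - min) over each pixel's 3x3 neighborhood."""
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = img[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
            out[i, j] = win.max() - win.min()
    return out

flat = np.zeros((4, 4))                    # no texture -> range 0
textured = np.arange(16.0).reshape(4, 4)   # varying intensities
```

Standard deviation filtering works the same way with `win.std()` in place of the range.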
3 Principal curvature-based Region Detector
31 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e. straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix
H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]   (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related
second moment matrix have been applied in several other interest operators (eg the Harris
[7] Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the
local image geometry is changing in more than one direction. Likewise, Lowe's maximal
difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at
least approximates the sum of the diagonal elements) to find points of interest However our
PCBR detector is quite different from these other methods and is complementary to them
Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with
extremal points the ridges valleys and cliffs can be detected over a range of viewpoints
scales and appearance changes Many previous interest point detectors [7 19 18] apply the
Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure
is given by det(A) minus k middot tr2(A) gt threshold where det is the determinant tr is the trace and
the matrix A is either the Hessian matrix or the second moment matrix One advantage of the
Harris metric is that it does not require explicit computation of the eigenvalues However
computing the eigenvalues for a 2times2 matrix requires only a single Jacobi rotation to eliminate
the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)   (2)

or

P(x) = min(λ2(x), 0)   (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian smoothed images I1j with scales of σ = k^(j-1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is down-sampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) - 3 octaves, where w and h are
the width and height of the doubled image respectively Finally we calculate a principal
curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq
2) of the Hessian matrix at each pixel For computational efficiency each smoothed image
and its corresponding Hessian image is computed from the previous smoothed image using
an incremental Gaussian scale Given the principal curvature scale space images we calculate
the maximum curvature over each set of three consecutive principal curvature images to form
the following set of four images in each of the n octaves
MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
...
MPn2  MPn3  MPn4  MPn5   (4)

where MPij = max(Pi,j-1, Pi,j, Pi,j+1).
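The per-octave maximum over each triple of consecutive scales (Eq. 4) can be sketched as follows (P is a list of per-scale principal curvature images; the constant toy images stand in for real curvature responses):

```python
import numpy as np

def max_curvature_images(P):
    """MP_j = elementwise max over (P_{j-1}, P_j, P_{j+1})."""
    return [np.maximum(np.maximum(P[j - 1], P[j]), P[j + 1])
            for j in range(1, len(P) - 1)]

# Six scales per octave yield four maximum-curvature images:
P = [np.full((2, 2), v) for v in [1.0, 3.0, 2.0, 5.0, 4.0, 0.5]]
MP = max_curvature_images(P)
```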
Figure 2(b) shows one of the maximum curvature images MP created by maximizing the
principal curvature at each pixel over three consecutive principal curvature images From
these maximum principal curvature images we find the stable regions via our watershed
algorithm
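The per-pixel eigenvalue computation behind Eq. 2 can be sketched directly (a minimal version using finite differences for the second derivatives; for a 2×2 symmetric Hessian the maximum eigenvalue has a closed form, so no iterative eigensolver is needed):

```python
import numpy as np

def principal_curvature(img):
    Iy, Ix = np.gradient(img)          # first derivatives (rows, cols)
    Ixy, Ixx = np.gradient(Ix)         # d(Ix)/dy, d(Ix)/dx
    Iyy, _ = np.gradient(Iy)           # d(Iy)/dy
    tr = Ixx + Iyy
    det = Ixx * Iyy - Ixy ** 2
    disc = np.sqrt(np.maximum(tr ** 2 / 4.0 - det, 0.0))
    lam1 = tr / 2.0 + disc             # maximum eigenvalue of H
    return np.maximum(lam1, 0.0)       # Eq. 2: P(x) = max(lam1(x), 0)

# A dark horizontal line on a light background gives a ridge response:
img = np.ones((7, 7))
img[3, :] = 0.0
P = principal_curvature(img)
```

The response is concentrated on the dark line and vanishes on the flat background, as Eq. 2 requires.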
32 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image
segmentation It is normally applied either to an intensity image directly or to the gradient
magnitude of an image We instead apply the watershed transform to the principal curvature
image However the watershed transform is sensitive to noise (and other small perturbations)
in the intensity image; as a consequence, small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image of Figure 2(b). To achieve a more stable watershed segmentation, we first
apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal
curvature terrain thus eliminating many local minima that result from noise and that would
otherwise produce watershed catchment basins Beyond the small (in terms of area of
influence) local minima there are other variations that have larger zones of influence and that
are not reclaimed by the morphological closing To further eliminate spurious or unstable
watershed regions we threshold the principal curvature image to create a clean binarized
principal curvature image. However, rather than applying a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove
perturbations Since the eigenvalues of the Hessian matrix are directly related to the signal
strength (ie the line or edge contrast) the principal curvature image may at times become
weak due to low contrast portions of an edge or curvilinear structure These low contrast
segments may potentially cause gaps in the thresholded principal curvature image which in
turn cause watershed regions to merge that should otherwise be separate However the
directions of the eigenvectors provide a strong indication of where curvilinear structures
appear and they are more robust to these intensity perturbations than is the eigenvalue
magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to
include connected pixels that are above the low threshold Unlike traditional hysteresis
thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot
product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection
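The per-pixel low-threshold rule can be sketched as follows (the agreement cutoff of 0.8 below is an assumed value for illustration, not a number from the text):

```python
import numpy as np

HIGH = 0.04   # strong-response threshold from the text

def low_threshold(ev, i, j, agree_cutoff=0.8):
    """ev: (h, w, 2) array of normalized major eigenvectors.

    Returns the low threshold for pixel (i, j) based on how well its
    eigenvector agrees with those of its 8 neighbors.
    """
    dots = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            dots.append(abs(ev[i, j] @ ev[i + di, j + dj]))
    ratio = 0.2 if np.mean(dots) >= agree_cutoff else 0.7
    return HIGH * ratio

# Uniform eigenvector field: strong support, so the lower threshold
ev = np.tile(np.array([1.0, 0.0]), (3, 3, 1))
t_uniform = low_threshold(ev, 1, 1)

# Orthogonal neighbors: no support, so the higher threshold
ev2 = np.tile(np.array([0.0, 1.0]), (3, 3, 1))
ev2[1, 1] = [1.0, 0.0]
t_isolated = low_threshold(ev2, 1, 1)
```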
results on many images Figure 4 illustrates how the eigenvector flow supports an otherwise
weak region The red arrows are the major eigenvectors and the yellow arrows are the minor
eigenvectors To improve visibility we draw them at every fourth pixel At the point
indicated by the large white arrow we see that the eigenvalue magnitudes are small and the
ridge there is almost invisible Nonetheless the directions of the eigenvectors are quite
uniform This eigenvector-based active thresholding process yields better performance in
building continuous ridges and in handling perturbations which results in more stable regions
(Fig 3(b)) The final step is to perform the watershed transform on the clean binary image
(Fig 2(c)) Since the image is binary, all black (or 0-valued) pixels become catchment basins and the midlines of the thresholded white ridge pixels become watershed lines if they separate
two distinct catchment basins To define the interest regions of the PCBR detector in one
scale the resulting segmented regions are fit with ellipses via PCA that have the same
second-moment as the watershed regions (Fig 2(e))
33 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq 4) is only one way to achieve
stable region detections To further improve robustness we adopt a key idea from MSER and
keep only those regions that can be detected in at least three consecutive scales Similar to the
process of selecting stable regions via thresholding in MSER we select regions that are stable
across local scale changes To achieve this we compute the overlap error of the detected
regions across each triplet of consecutive scales in every octave The overlap error is
calculated the same as in [19] Overlapping regions that are detected at different scales
normally exhibit some variation This variation is valuable for object recognition because it
provides multiple descriptions of the same pattern An object category normally exhibits
large within-class variation in the same area Since detectors have difficulty locating the
interest area accurately, rather than attempting to detect the "correct" region and extract a single
descriptor vector it is better to extract multiple descriptors for several overlapping regions
provided that these descriptors are handled properly by the classifier
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated whenever more than two layers remain.
21 De-pict Algorithm
Given an image as input De-pict algorithm starts by applying k-means and complete-linkage
as a clustering step to obtain chromatic consistent regions Under the assumption that brush
strokes of the same layer of painting are of similar colors such regions in different clusters
are good representatives of brush strokes at different layers in the painting as shown in Fig
3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. The top layer is then identified by human experts based on visual occlusion cues, etc. Ideally this step would be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.
31 Spatially coherent segmentation
We improve the layer segmentation by combining k-means with a spatial-coherence regularity in an iterative E-M fashion.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means); in other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e. the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):
min_L  Σ_p ||f_p - c_{L_p}||₂²  +  λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]   (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After spatially coherent refinement we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
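The E-M loop can be sketched as follows (a deliberate simplification: the graph-cut E-step is replaced by a simple iterated per-pixel label update, and the edge-length weight is dropped; `lam` is the spatial-coherence weight):

```python
import numpy as np

def segment(features, k, lam=0.5, iters=5):
    h, w, d = features.shape
    flat = features.reshape(-1, d)
    # Deterministic init: k feature vectors spread over the image
    centers = flat[np.linspace(0, h * w - 1, k).astype(int)].copy()
    labels = np.zeros((h, w), dtype=int)
    for _ in range(iters):
        # E-step: assign labels, penalizing disagreement with neighbors
        for i in range(h):
            for j in range(w):
                costs = ((features[i, j] - centers) ** 2).sum(axis=1)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        costs += lam * (np.arange(k) != labels[ni, nj])
                labels[i, j] = costs.argmin()
        # M-step: re-estimate the k mean chromatic vectors
        for c in range(k):
            if (labels == c).any():
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers

# Two flat color regions separate into two clean labels:
img = np.zeros((4, 6, 3))
img[:, 3:] = 1.0
labels, centers = segment(img, k=2)
```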
32 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well-suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless; therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.,7 which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g. regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. For each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e. surface-continuation constraints and boundary-continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images we simply formulate and solve a linear program for each chromatic channel independently.
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular
Surgery Training and Research Hospital. All pulmonary computed tomographic angiography
exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG,
Erlangen, Germany). Patients were informed about the examination and about breath holding.
Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is
taken at the level of the pulmonary truncus. A bolus-tracking ROI is placed at the pulmonary
truncus and the trigger is set to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent
is injected at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery
System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam
is performed from the supraclavicular region to the diaphragms. Contrast injection is
performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning
parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were
reconstructed at 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW
300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in the
coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of
400-500 images at 512x512 resolution.
B. Method
The stages followed in this work to segment the lung from CTA images are shown in
Figure 1. The dataset at hand consists of 250 2D CTA images. The first step is thresholding
the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located
in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air
(non-body pixels). Due to the large difference in intensity between these two groups,
thresholding leads to a good separation. In this study, thresholding is first applied so as to
keep the parts brighter than 700 HU; the resulting images are logical (binary):
Thresh = image > 700
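The same thresholding step can be written outside MATLAB; a pure-Python sketch of the mask computation:

```python
def threshold_mask(image, level=700):
    """Binary mask, True where a pixel exceeds `level` (700 HU in the text),
    mirroring the MATLAB expression Thresh = image > 700."""
    return [[pixel > level for pixel in row] for row in image]
```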
In each of these new images, subsegmental vessels are present in the lung region. In the
second step, these vessels are removed as follows: each 2D image is considered one by one,
and each component in the image is labeled with a connected component labeling algorithm.
Then, based on the size of each labeled piece, items whose pixel counts are under 1000 are
removed from the image (Figure 3).
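The two operations just described, connected component labeling and removal of components below a pixel-count threshold, can be sketched as follows (illustrative, 4-connectivity):

```python
from collections import deque

def label_components(mask):
    """4-connected component labeling of a boolean 2D mask.
    Returns a label image (0 = background) and the size of each label."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    sizes = {}
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue, count = deque([(y, x)]), 0
                while queue:
                    cy, cx = queue.popleft()
                    count += 1
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                sizes[next_label] = count
    return labels, sizes

def remove_small(mask, min_pixels):
    """Drop components below min_pixels (the text uses 1000)."""
    labels, sizes = label_components(mask)
    return [[sizes.get(labels[y][x], 0) >= min_pixels
             for x in range(len(mask[0]))] for y in range(len(mask))]
```

The same labeling routine also supports the later steps (keeping the largest component, removing airway-sized components).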
Next, the image in Figure 3 is labeled with the connected component labeling algorithm.
The biggest component of logical value 1 is the patient's body. This biggest component is
kept and the other parts are removed from the image. The image is then complemented, so
all 0s turn into 1s and all 1s turn into 0s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 (those touching column 1 or
512) become logical 1, the parts satisfying this condition are removed, and the lung and
airways appear as in Figure 5 (segmentation of lung and airways). Because the airways in
Figure 5 are very small compared to the lung, each image is labeled with the connected
component labeling algorithm, and components whose pixel counts are below 1000 are
identified as airways and removed from the image. The resulting image is the segmented
target lung. Before the airways were removed, the edges of the image were found with the
Sobel algorithm and added to the original image, so that the edges of the lung and airway
regions are shown on the original image (Figure 6 (b)). Also, by multiplying the defined lung
region with the original CTA lung image, the original segmented lung image is obtained
(Figure 6 (c)).
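The Sobel edge step mentioned above can be sketched as follows (an illustrative pure-Python version; border pixels are left at zero):

```python
def sobel_magnitude(image):
    """Approximate gradient magnitude with the 3x3 Sobel operator, as used
    in the text to outline the segmented lung on the original image."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal and vertical Sobel responses.
            gx = (image[y-1][x+1] + 2*image[y][x+1] + image[y+1][x+1]
                  - image[y-1][x-1] - 2*image[y][x-1] - image[y+1][x-1])
            gy = (image[y+1][x-1] + 2*image[y+1][x] + image[y+1][x+1]
                  - image[y-1][x-1] - 2*image[y-1][x] - image[y-1][x+1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out
```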
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image
processing functions and interactive tools for enhancing and analyzing digital images. The
interactive tools allowed us to perform spatial image transformations, morphological
operations such as edge detection and noise removal, region-of-interest processing, filtering,
basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects
semitransparent is a useful technique in 3-D visualization, as it furnishes more information
about the spatial relationships of different structures. Toolbox functions implemented in the
open MATLAB language were also used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis
and mathematical computing, with functions for signal processing, optimization, partial
differential equation solving, etc. It provides interactive tools including thresholding,
correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and
2D and 3D plotting functions. The image processing operations allowed us to perform noise
reduction and image enhancement, image transforms, colormap manipulation, colorspace
conversions, region-of-interest processing, and geometric operations [4].
An X-ray computed tomography (CT) image is composed of pixels whose brightness
corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section,
which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes
the pixel region rectangle over the image displayed in the Image Tool, defining the group of
pixels that are displayed in extreme close-up view in the Pixel Region tool window. The
Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its
numeric value [25]. For RGB images there are three numeric values, one for each band of the
image. We can also determine the current position of the pixel region in the target image
from the pixel information given at the bottom of the tool. In this way we found the x- and y-
coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays
a histogram that represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3. PLOT TOOLS
MATLAB provides a collection of plotting tools for generating various types of graphs, such as displaying the image histogram or plotting a profile of intensity values (Figures 3a, b).
Figure 3a - Histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fitted curve is plotted as a magenta line through the data. An Area Graph of an X-ray CT brain scan displays the elements of a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
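The meshgrid behaviour just described can be sketched in pure Python (an illustrative analogue of the MATLAB function, not toolbox code):

```python
def meshgrid(x, y):
    """Analogue of MATLAB's meshgrid: rows of X are copies of x,
    columns of Y are copies of y, for evaluating f(x, y) on a grid."""
    X = [list(x) for _ in y]
    Y = [[yv] * len(x) for yv in y]
    return X, Y
```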
The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The "image with colormap scaling" function ("imagesc") displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4. FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole/Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions
and modifications have been made, the basic ASM model works the same way; Cootes and
Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes
and shape models in general. Section 4.2 describes the workings and the components of the
ASM. The parameters and variations that affect the performance of the ASM are explained in
Section 4.3, along with the experiments performed in this thesis to improve the performance
of the model. The problem of initializing the model in a test image is tackled in Section 4.4.
Section 4.5 elaborates on the training of the ASM and the definition of an error function.
The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a
diagram showing the points, or as an n × 2 array where the n rows represent the number of
points and the two columns represent the x and y coordinates of the points, respectively. In
this thesis and in the code used, a shape is defined as a 2n × 1 vector, where the y
coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic
building block of any ASM, as it stays the same even if it is scaled, rotated, or translated.
The lines connecting the points are not part of the shape, but they are shown to make the
shape and the order of the points clearer [24].
Figure 4.1 - Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1
gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2).
The distance between two shapes can be defined as the distance between their
corresponding points [24]. There are other ways of defining distances between two points,
like the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
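The distance formula above, and one reasonable reading of the shape-to-shape distance (the mean distance over corresponding points), can be written as small helpers (pure Python, illustrative):

```python
def euclidean(p1, p2):
    """Euclidean distance between points (x1, y1) and (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    return ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5

def shape_distance(shape_a, shape_b):
    """Distance between two shapes as the mean distance between
    corresponding points (one reading of the text's definition)."""
    dists = [euclidean(a, b) for a, b in zip(shape_a, shape_b)]
    return sum(dists) / len(dists)
```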
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The
centroid is useful when aligning shapes or when finding an automatic initialization
technique (discussed in Section 4.4). The size of the shape is the root mean square
distance between the points and the centroid. This can be used to measure the size of the
test image, which helps with the automatic initialization (discussed in Section 4.4).
Algorithm 1: Aligning shapes
Input: a set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: the set of aligned shapes and the mean shape
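Algorithm 1 can be sketched as follows. This simplified version solves only for translation and scale (the classical algorithm also aligns rotation), and all names are illustrative:

```python
def centroid(shape):
    n = len(shape)
    return (sum(x for x, _ in shape) / n, sum(y for _, y in shape) / n)

def center(shape):
    """Translate a shape so it is centered on the origin."""
    cx, cy = centroid(shape)
    return [(x - cx, y - cy) for x, y in shape]

def size(shape):
    """Root mean square distance of the points from the centroid."""
    cx, cy = centroid(shape)
    n = len(shape)
    return (sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in shape) / n) ** 0.5

def align_shapes(shapes, iters=10):
    """Translation-and-scale sketch of Algorithm 1."""
    shapes = [center(s) for s in shapes]
    mean = shapes[0]
    for _ in range(iters):
        # Align: scale every centered shape to unit size.
        aligned = [[(x / size(s), y / size(s)) for x, y in s] for s in shapes]
        # Recalculate the mean shape and constrain it to unit size.
        n = len(aligned)
        mean = [(sum(s[i][0] for s in aligned) / n,
                 sum(s[i][1] for s in aligned) / n) for i in range(len(mean))]
        msize = size(mean)
        mean = [(x / msize, y / msize) for x, y in mean]
        shapes = aligned
    return shapes, mean
```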
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone
was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then
resized to the same dimensions. This ensured uniformity in the quality of the data
being used. The training on the images was done by manually selecting landmarks.
Landmarks were placed at approximately equal intervals and were distributed uniformly
over the bone boundary. Such images are called hand-annotated or manually
landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training.
When performing tests with different numbers of landmark points, a subset of these
landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of
sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the
image around them. During training, the algorithm learns the characteristics of the
area around the landmark points and builds a profile model for each landmark point
accordingly. When searching for the shape in the test image, the area near the tentative
landmarks is examined, and the model moves the shape to an area that fits the profile
model closely. The tentative locations of the landmarks are obtained from the suggested
shape.
2. The shape model defines the permissible relative positions of the landmarks. This
introduces a constraint on the shape. So while the profile model tries to find the area in
the test image that best fits the model, the shape model ensures that the mean shape is
not distorted. The profile model acts on individual landmarks, whereas the shape model
acts globally on the image. The two models correct each other until no further
improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an
allowable shape. It tries to find the area in the image that closely matches the profiles of
the individual landmarks, while keeping the overall shape consistent. The shape is learnt
from manually landmarked training images. These images are aligned, and a mean shape
with its permissible variations is formulated [24]:
x̂ = x̄ + Φb    (4.3)
where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
Φ is the matrix of eigenvectors of the covariance of the aligned shapes, and b is the vector of shape parameters.
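The shape-generation equation above reduces to a matrix-vector operation; in the sketch below the eigenvector matrix is supplied by hand for illustration rather than computed by PCA from training data:

```python
def generate_shape(mean_shape, phi, b):
    """x_hat = x_bar + Phi b: mean_shape is the 2n-vector x_bar, phi a
    2n-by-t list-of-rows matrix of eigenvectors (here given, in practice
    obtained from PCA of the aligned shapes), b the t shape parameters."""
    return [xm + sum(phi[i][j] * b[j] for j in range(len(b)))
            for i, xm in enumerate(mean_shape)]
```

Setting b to zero reproduces the mean shape; varying b sweeps the permissible variations.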
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of
b. The model is varied in height and width, finding optimum values for the landmarks.
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image. The lines perpendicular to the model are called whiskers, and they help the
profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the
whisker profiles around the landmark points are used for the profile model. A profile
and a covariance matrix are built for each landmark. It is assumed that the profiles
are distributed as a multivariate Gaussian, so they can be described by their
mean profile ḡ and covariance matrix S_g.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape
calculated from the training images is imposed on the image, and the profiles around
the landmark points are searched and examined. The profiles are offset up to 3 pixels
along the whisker, which is perpendicular to the shape, to find the area that most
closely resembles the mean shape [24]. The distance between a test profile g and the
mean profile ḡ is calculated using the Mahalanobis distance, given by (g − ḡ)ᵀ S_g⁻¹ (g − ḡ).
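For a short profile (2 elements here for brevity), the Mahalanobis comparison can be sketched as below; this illustrative helper returns the squared distance and inverts the 2x2 covariance in closed form:

```python
def mahalanobis2(g, g_mean, S):
    """Squared Mahalanobis distance (g - g_mean)^T S^-1 (g - g_mean)
    for a 2-element profile with 2x2 covariance S."""
    d0, d1 = g[0] - g_mean[0], g[1] - g_mean[1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    inv = [[S[1][1] / det, -S[0][1] / det],
           [-S[1][0] / det, S[0][0] / det]]
    return (d0 * (inv[0][0] * d0 + inv[0][1] * d1)
            + d1 * (inv[1][0] * d0 + inv[1][1] * d1))
```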
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have
the lowest distance. This procedure is carried out for every landmark point, and then the
shape model confirms that the shape remains consistent with the mean shape. The shape
model ensures that the profile model has not changed the shape: without it, the profile
model might give the best profile matches but a completely different resulting shape. So,
as mentioned before, the two models restrict each other. A multi-resolution search is done
to make the model more robust. This enables the model to be more accurate, as it can lock
on to the shape from further away. The model searches over a series of different
resolutions of the same image, called an image pyramid. The resolutions of the images can
be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid
(a general picture, not a bone); the sizes of the images are given relative to the first image.
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on.
The number of landmark points and the number of training images are investigated in this
thesis. The number of landmark points is an important variable that affects the ASM. The
profile model of the ASM works with these landmark points to create profiles, so the
position of the landmark points is as important as their number. In the training images,
landmark points are equally spaced along the boundary of the bone. Images are landmarked
with 60 points, and subsets of these points are chosen to conduct experiments. The impact
of the number of landmark points on computing time and mean error (defined in Section
4.5) is tested by running the algorithm with different numbers of landmarks. As the number
of landmark points increases, the computing time is expected to increase and the error to
decrease. The results are explained in Chapter 5. A training set of images is used to train
the ASM. As the number of training images increases, the model becomes more robust and
intelligent. The computing time is expected to increase, as it takes time to train and create
profile models for each image. However, as the number of training images increases, the
mean profile and the model perform better, so the error is expected to decrease. The model
in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image.
Figure 4.6a shows the unaligned shapes learnt from the training images; Figure 4.6b
displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test
image. It creates a mean shape profile from all the training images using landmark points.
But the ASM starts wherever the mean shape is located, which may not be near the bone in
a test image. So the model needs to be initialized, or started, somewhere close to the bone
boundary in the test image. Experiments were conducted to see the effect of initialization
on the error and on the tracking of the shape. It was observed that if the initialization is
poor, meaning that the mean shape starts away from the bone in the test X-ray, the model
does not lock on to the bone. The shape and profile models fail to perform, because the
profile model looks for regions similar to those of the training images in regions away from
the bone; it cannot find the bone because it is looking in a different region altogether. The
error increases considerably if the mean shape is 40-50 pixels away from the bone in the
test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape,
and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
detects faint edges even when the image is noisy. This is because, at the beginning
of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in
a blurred image, so the output of the filter does not depend on a single noisy pixel, also
known as an outlier. Then the gradient of the image is calculated, as in other filters like
Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels
below a certain threshold are suppressed. A multi-level thresholding technique, like the
example in Section 2.4 involving two levels, is then applied to the data. If a pixel value is
less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it
is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent
to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows
the X-ray image and the image after Canny edge detection.
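The two-level thresholding rule described above can be sketched as follows; for brevity this single-pass version checks only adjacency to a strong pixel rather than propagating connectivity as full Canny hysteresis does:

```python
def hysteresis(grad, low, high):
    """Two-level thresholding: > high -> 1, < low -> 0, and in-between
    pixels kept only if (diagonally) adjacent to a strong pixel."""
    h, w = len(grad), len(grad[0])
    out = [[1 if grad[y][x] > high else 0 for x in range(w)] for y in range(h)]
    for y in range(h):
        for x in range(w):
            if low <= grad[y][x] <= high:
                near_high = any(
                    0 <= y + dy < h and 0 <= x + dx < w and grad[y + dy][x + dx] > high
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0))
                out[y][x] = 1 if near_high else 0
    return out
```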
3.3 Image Segmentation
3.3.1 Texture Analysis
Texture analysis uses the texture of an image to analyze it. It attempts to quantify the
visual or other simple characteristics of the image so that the image can be analyzed
according to them [23]. For example, visible properties of an image, like roughness or
smoothness, can be converted into numbers that describe the pixel layout or brightness
intensity in the region in question. In the bone segmentation problem, texture-based image
processing can be used because bones are expected to have more texture than the mesh.
Range filtering and standard deviation filtering were the texture analysis techniques used
in this thesis. Range filtering calculates the local range of an image.
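Range filtering as just described can be sketched in pure Python (MATLAB's rangefilt computes a similar local max-minus-min measure; window size and border handling here are illustrative choices):

```python
def range_filter(image, radius=1):
    """Local range: max minus min over a (2*radius+1)^2 neighborhood,
    clipped at the image borders. A simple texture measure."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [image[ny][nx]
                    for ny in range(max(0, y - radius), min(h, y + radius + 1))
                    for nx in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = max(vals) - min(vals)
    return out
```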
3. Principal Curvature-Based Region Detector
3.1 Principal Curvature Image
Two types of structures have high curvature in one direction and low curvature in the
orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and
edges. Viewing an image as an intensity surface, the curvilinear structures correspond to
ridges and valleys of this surface. The local shape characteristics of the surface at a
particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the
point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian
matrix and the related second moment matrix have been applied in several other interest
operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find
image positions where the local image geometry is changing in more than one direction.
Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components
of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find
points of interest. However, our PCBR detector is quite different from these other methods
and is complementary to them. Rather than finding extremal "points", our detector applies
the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature
surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be
detected over a range of viewpoints, scales, and appearance changes. Many previous interest
point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a
point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is
the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second
moment matrix. One advantage of the Harris metric is that it does not require explicit
computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix
requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by
Steger [25]. The Harris measure produces low values for "long" structures that have a small
first or second derivative in one particular direction. Our PCBR detector complements
previous interest point detectors in that we abandon the Harris measure and exploit those
very long structures as detection cues. The principal curvature image is given by either
P(x) = max(λ1(x), 0)    (2)
or
P(x) = min(λ2(x), 0)    (3)
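The per-pixel eigenvalue computation behind Eqs. (2) and (3) can be sketched in closed form (illustrative; the three Hessian entries at one pixel are assumed given):

```python
def hessian_eigs(ixx, ixy, iyy):
    """Closed-form eigenvalues of a symmetric 2x2 Hessian at one pixel;
    a single Jacobi rotation (cf. Steger [25]) reduces to this formula."""
    mean = 0.5 * (ixx + iyy)
    d = ((0.5 * (ixx - iyy)) ** 2 + ixy ** 2) ** 0.5
    return mean + d, mean - d  # lambda_1 >= lambda_2

def principal_curvature(ixx, ixy, iyy, dark_lines=True):
    """P(x) per Eq. (2)/(3): max(lambda1, 0) for dark lines on a light
    background, min(lambda2, 0) for light lines on a dark background."""
    l1, l2 = hessian_eigs(ixx, ixy, iyy)
    return max(l1, 0.0) if dark_lines else min(l2, 0.0)
```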
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side
of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13]
and other detectors, principal curvature images are calculated in scale space. We first double
the size of the original image to produce our initial image, I11, and then produce increasingly
Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, …, 6.
This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is
downsampled to half its size to produce image I21, which becomes the first image in the
second octave. We apply the same smoothing process to build the second octave, and
continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and
height of the doubled image, respectively. Finally, we calculate a principal curvature image
Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian
matrix at each pixel. For computational efficiency, each smoothed image and its
corresponding Hessian image are computed from the previous smoothed image using an
incremental Gaussian scale. Given the principal curvature scale space images, we calculate
the maximum curvature over each set of three consecutive principal curvature images to form
the following set of four images in each of the n octaves:
MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
⋮
MPn2 MPn3 MPn4 MPn5 (4)
where MPij = max(Pi,j−1, Pi,j, Pi,j+1).
Figure 2(b) shows one of the maximum curvature images MP, created by maximizing the
principal curvature at each pixel over three consecutive principal curvature images. From
these maximum principal curvature images we find the stable regions via our watershed
algorithm.
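As an illustration of this construction, the following is a minimal numpy sketch of Eq. 2. It is a deliberate simplification, not the detector's implementation: the Hessian is estimated with plain finite differences instead of the Gaussian derivatives used at each scale σ, and no scale-space pyramid is built.

```python
import numpy as np

def principal_curvature(image):
    """Principal curvature image P(x) = max(lambda1(x), 0) (Eq. 2).

    The Hessian is estimated with simple finite differences; in the full
    detector these would be Gaussian derivatives at each scale sigma.
    """
    iy, ix = np.gradient(image.astype(float))
    iyy, iyx = np.gradient(iy)
    ixy, ixx = np.gradient(ix)
    # closed-form maximum eigenvalue of the symmetric 2x2 Hessian
    half_trace = (ixx + iyy) / 2.0
    root = np.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    lam1 = half_trace + root
    return np.maximum(lam1, 0.0)  # responds to dark lines on light background
```

A dark line on a light background yields a strong positive response along the line, which is exactly the behaviour Eq. 2 is chosen for.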
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image
segmentation. It is normally applied either to an intensity image directly or to the gradient
magnitude of an image. We instead apply the watershed transform to the principal curvature
image. However, the watershed transform is sensitive to noise (and other small perturbations)
in the intensity image; as a consequence, small image variations form local minima that
result in many small watershed regions. Figure 3(a) shows the over-segmentation that results
when the watershed algorithm is applied directly to the principal curvature image in Figure
2(b). To achieve a more stable watershed segmentation, we first apply a grayscale
morphological closing followed by hysteresis thresholding. The grayscale morphological
closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a
5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion,
respectively. The closing operation removes small "potholes" in the principal curvature
terrain, thus eliminating many local minima that result from noise and that would otherwise
produce watershed catchment basins. Beyond the small (in terms of area of influence) local
minima, there are other variations that have larger zones of influence and that are not
reclaimed by the morphological closing. To further eliminate spurious or unstable watershed
regions, we threshold the principal curvature image to create a clean, binarized principal
curvature image. However, rather than apply a straight threshold, or even hysteresis
thresholding (both of which can still miss weak image structures), we apply a more robust
eigenvector-guided hysteresis thresholding to help link structural cues and remove
perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal
strength (i.e., the line or edge contrast), the principal curvature image may at times become
weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast
segments may cause gaps in the thresholded principal curvature image, which in turn cause
watershed regions to merge that should otherwise be separate. However, the directions of the
eigenvectors provide a strong indication of where curvilinear structures appear, and they are
more robust to these intensity perturbations than is the eigenvalue magnitude. In
eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in
traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong
principal curvature response. Pixels with a strong response act as seeds that expand to
include connected pixels that are above the low threshold. Unlike traditional hysteresis
thresholding, our low threshold is a function of the support that each pixel's major
eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing
the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels'
major (or minor) eigenvectors. This can be done by taking the absolute value of the inner
product of a pixel's normalized eigenvector with that of each neighbor. If the average dot
product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a
low threshold of 0.04 · 0.2 = 0.008); otherwise, the low-to-high ratio is set to 0.7 (giving a
low threshold of 0.028). The threshold values are based on visual inspection of detection
results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise
weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor
eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated
by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there
is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This
eigenvector-based active thresholding process yields better performance in building
continuous ridges and in handling perturbations, which results in more stable regions
(Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image
(Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins,
and the midlines of the thresholded white ridge pixels become watershed lines if they
separate two distinct catchment basins. To define the interest regions of the PCBR detector in
one scale, the resulting segmented regions are fit via PCA with ellipses that have the same
second moment as the watershed regions (Fig. 2(e)).
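The eigenvector-flow hysteresis threshold described above can be sketched as follows. This is a simplified reading, not the paper's implementation: the unit major-eigenvector field (vx, vy) is assumed given, region growing uses a plain breadth-first search, and the constants mirror the values quoted in the text (high = 0.04, ratios 0.2 and 0.7) plus an assumed direction-agreement cutoff of 0.9.

```python
import numpy as np
from collections import deque

def eigenvector_flow_hysteresis(P, vx, vy, high=0.04, tight=0.2, loose=0.7, agree=0.9):
    """Hysteresis threshold whose low value depends on eigenvector support.

    P      : principal curvature image
    vx, vy : unit major-eigenvector components at each pixel
    Where the neighbouring eigenvector directions agree (mean |dot| >= agree),
    the low threshold is high*tight; elsewhere it is high*loose.
    """
    h, w = P.shape
    support = np.zeros_like(P)
    count = np.zeros_like(P)
    for dy in (-1, 0, 1):          # mean |inner product| with 8 neighbours
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            sy = slice(max(dy, 0), h + min(dy, 0))
            sx = slice(max(dx, 0), w + min(dx, 0))
            ty = slice(max(-dy, 0), h + min(-dy, 0))
            tx = slice(max(-dx, 0), w + min(-dx, 0))
            dot = np.abs(vx[ty, tx] * vx[sy, sx] + vy[ty, tx] * vy[sy, sx])
            support[ty, tx] += dot
            count[ty, tx] += 1
    support /= count
    low = np.where(support >= agree, high * tight, high * loose)

    out = P >= high                         # seeds: strong responses
    q = deque(zip(*np.nonzero(out)))
    while q:                                # grow into pixels above their low threshold
        y, x = q.popleft()
        for ny in range(max(y - 1, 0), min(y + 2, h)):
            for nx in range(max(x - 1, 0), min(x + 2, w)):
                if not out[ny, nx] and P[ny, nx] >= low[ny, nx]:
                    out[ny, nx] = True
                    q.append((ny, nx))
    return out
```

On a ridge whose eigenvectors all point the same way, a single strong seed pixel pulls in the whole weak ridge, which is exactly the gap-bridging behaviour the text describes.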
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve
stable region detections. To further improve robustness, we adopt a key idea from MSER and
keep only those regions that can be detected in at least three consecutive scales. Similar to the
process of selecting stable regions via thresholding in MSER, we select regions that are stable
across local scale changes. To achieve this, we compute the overlap error of the detected
regions across each triplet of consecutive scales in every octave; the overlap error is
calculated as in [19]. Overlapping regions that are detected at different scales normally
exhibit some variation. This variation is valuable for object recognition because it provides
multiple descriptions of the same pattern. An object category normally exhibits large
within-class variation in the same area. Since detectors have difficulty locating the interest
area accurately, rather than attempt to detect the "correct" region and extract a single
descriptor vector, it is better to extract multiple descriptors for several overlapping regions,
provided that these descriptors are handled properly by the classifier.
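The scale-selection rule can be sketched with regions represented as boolean masks. The overlap error below follows the area-overlap definition used in [19]; the triplet test is a simplified stand-in for the full PCBR matching procedure, and the 0.3 error tolerance is an illustrative choice.

```python
import numpy as np

def overlap_error(a, b):
    """1 - |A intersect B| / |A union B| for two boolean region masks (cf. [19])."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return 1.0 - inter / union if union else 1.0

def stable_regions(scales, max_err=0.3):
    """Keep a region only if each adjacent scale contains a matching region,
    i.e. it is detected across three consecutive scales.
    `scales` is a list (one entry per scale) of lists of boolean masks."""
    stable = []
    for i in range(1, len(scales) - 1):
        for r in scales[i]:
            if any(overlap_error(r, p) <= max_err for p in scales[i - 1]) and \
               any(overlap_error(r, n) <= max_err for n in scales[i + 1]):
                stable.append(r)
    return stable
```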
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is,
where lower-layer strokes are visible. The task of recovering layers of strokes involves three
main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to
different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and
complete-linkage clustering to obtain chromatically consistent regions. Under the assumption
that brush strokes of the same layer of a painting have similar colors, such regions in different
clusters are good representatives of brush strokes at different layers in the painting, as shown
in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label
corresponding to its cluster, and each label can be described by the mean chromatic feature
vector. The top layer is then identified by human experts based on visual occlusion cues, etc.
Ideally this step should be fully automatic, but it is challenging and is not the focus of our
current work. Lastly, the regions of the top layer are removed and inpainted by the
k-nearest-neighbor algorithm.
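A toy version of the clustering step (k-means only; the complete-linkage stage is omitted) might look like this. The farthest-point initialization is an illustrative addition for reproducibility, not part of De-pict.

```python
import numpy as np

def kmeans_layers(pixels, k, iters=20):
    """Cluster per-pixel chromatic features (n x 3 array) into k layers.

    Deterministic farthest-point initialization, then standard Lloyd updates.
    """
    centers = [pixels[0]]
    for _ in range(k - 1):  # pick each next centre as the point farthest away
        d = np.min([((pixels - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(pixels[int(np.argmax(d))])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        # assign each pixel to its nearest centre
        labels = np.argmin(((pixels[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):  # re-estimate the mean chromatic vectors
            if (labels == j).any():
                centers[j] = pixels[labels == j].mean(0)
    return labels, centers
```

Each resulting cluster plays the role of one candidate stroke layer, described by its mean chromatic vector.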
3.1 Spatially coherent segmentation
We improve the layer segmentation by combining k-means with a spatial coherence
regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of
different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other
words, we assume that each layer is modeled as an independent Gaussian with the same
covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic
vectors, we can refine the segmentation with spatial coherence priors by minimizing the
following energy function (E-step):
min_L Σ_p ||f_p − c_{L_p}||² + λ Σ_{{p,q}∈N} |e_{pq}| · δ[L_p ≠ L_q] (1)
where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is
the color model for cluster i, |e_{pq}| is the edge length between p and q, and δ[·] is the delta
(indicator) function. The first term in Eq. (1) measures the appearance similarity between the
pixels and the clusters they are assigned to, and the second term penalizes the situation where
neighboring pixels belong to different clusters. Fixing the k appearance models, the
minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us,
under spatial regularization, the optimal labeling of pixels to the different clusters. After
spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors
(M-step). We then iterate the E and M steps until convergence or until a predefined number
of iterations is reached.
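The E-M scheme around Eq. (1) can be sketched as below. As a deliberate simplification, the E-step uses a few ICM (iterated conditional modes) sweeps on the energy instead of the graph-cut solver [12]; the function names and the neighbour-list representation are illustrative choices, not the authors' code.

```python
import numpy as np

def em_segment(features, edges, edge_len, k, lam=1.0, iters=5):
    """E-M sketch of Eq. (1): E-step = labeling under a Potts smoothness prior
    (ICM sweeps stand in for graph cut), M-step = mean chromatic vectors."""
    features = np.asarray(features, dtype=float)
    n = len(features)
    # initialize the k appearance models from evenly spaced pixels
    centers = features[np.linspace(0, n - 1, k).astype(int)].copy()
    labels = np.argmin(((features[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    nbrs = [[] for _ in range(n)]
    for (p, q), w in zip(edges, edge_len):
        nbrs[p].append((q, w))
        nbrs[q].append((p, w))
    for _ in range(iters):
        for _ in range(3):  # E-step: ICM sweeps on the energy of Eq. (1)
            for p in range(n):
                unary = ((features[p] - centers) ** 2).sum(-1)
                pair = np.array([sum(w for q, w in nbrs[p] if labels[q] != l)
                                 for l in range(k)], dtype=float)
                labels[p] = int(np.argmin(unary + lam * pair))
        for j in range(k):  # M-step: re-estimate the mean chromatic vectors
            if (labels == j).any():
                centers[j] = features[labels == j].mean(0)
    return labels, centers
```

ICM only finds a local minimum of the energy, which is why the paper uses graph cut; the sketch is meant to make the two alternating steps concrete.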
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on
reconstructing the geometric structure of (chromatic) intensities, which is usually represented
by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same
gray/chromatic intensity in an image. Such methods are therefore well-suited for inpainting
images with no or very few textures, because level lines concisely capture the structure and
information of the texture-less regions. For van Gogh's paintings, the brush strokes at each
layer are close to textureless. Therefore, curvature-based inpainting can be superior to
exemplar-based methods (for instance, those used in De-pict and by Criminisi et al. [3]) for
recovering the structures of the underlying brush strokes. In this paper we evaluate the recent
method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as
a linear program. Unlike other methods, this method is independent of initialization and can
handle general inpainting regions, e.g., regions with holes. In the following, we briefly
review Schoenemann et al.'s method in detail. To formulate the problem as a linear program,
curvature is modeled in this approach in a discrete sense (where a possible reconstruction of
the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain
connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and
pairs of line segments are used to represent level lines; the basic regions represent the pixels.
Then, for each potential discrete level line, the curvature is approximated by the sum of angle
changes at all vertices along the level line, with proper weighting by the edge length. To
ensure that regions and level lines are consistent (for instance, level lines should be
continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary
continuation constraints, are imposed on the variables. Finally, the boundary condition (the
intensities of the boundary pixels) of the damaged region can also be easily formulated as
linear constraints. With proper handling of all these constraints, the inpainting problem can
be solved as a linear program. To handle color images, we simply formulate and solve a
linear program for each chromatic channel independently.
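The curvature approximation itself, the sum of angle changes at the vertices of a discrete level line, can be written compactly; the edge-length weighting used in the linear program is omitted in this sketch.

```python
import numpy as np

def discrete_curvature(polyline):
    """Curvature cost of a discrete level line: sum of absolute angle changes
    at its interior vertices (edge-length weighting omitted)."""
    pts = np.asarray(polyline, dtype=float)
    total = 0.0
    for i in range(1, len(pts) - 1):
        a = pts[i] - pts[i - 1]          # incoming segment
        b = pts[i + 1] - pts[i]          # outgoing segment
        cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        total += np.arccos(np.clip(cosang, -1.0, 1.0))
    return total
```

A straight level line costs nothing, while every bend adds its turning angle, so the linear program favours smooth continuations of the level lines into the damaged region.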
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular
Surgery Training and Research Hospital. All pulmonary computed tomographic angiography
exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens
AG, Erlangen, Germany). Patients were informed about the examination and instructed on
breath holding. Imaging was performed with a bolus-tracking program. After the scanogram,
a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed
at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of
nonionic contrast agent is delivered at a rate of 4 mL/s with an automated syringe (Optistat
Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the
pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragm.
Contrast injection is performed via an 18-20 G intravenous cannula placed in the antecubital
vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2.
Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal
window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen,
Germany) in the coronal, sagittal, and axial planes; oblique planes were used if needed. Each
exam consists of 400-500 images with 512×512 resolution.
B. Method
The stages followed in performing lung segmentation from CTA images in this work are
shown in Figure 1.
The dataset at hand consists of 250 2-D CTA images. The first step is thresholding the
image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in
the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air
(non-body pixels). Due to the large difference in intensity between these two groups,
thresholding leads to a good separation. In this study, the threshold keeps the parts brighter
than 700 HU; at the end of thresholding, the new images are binary (logical):
Thresh = image > 700
In each of these new images, subsegmental vessels are present in the lung region. The second
step removes these vessels: each 2-D image is considered one by one, and each component in
the image is labeled with a connected-component labeling algorithm. Then, based on the size
of each labeled piece, items whose pixel counts are under 1000 are removed from the image
(Figure 3).
Next, the image in Figure 3 is again labeled with the connected-component labeling
algorithm. The biggest component of logical 1s is the patient's body; this biggest component
is kept and the other parts are removed from the image. The image is then inverted, so every
"0" turns into "1" and every "1" turns into "0" (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach the 1st or 512th pixel
column and are logical 1, the components satisfying this condition are removed, and the lung
and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the
airway in Figure 5 is very small compared to the lung, each image is labeled with the
connected-component labeling algorithm, and components whose pixel counts are below
1000 are identified as airways and removed from the image. The remaining image is the
segmented form of the target lung. Before the airways were removed, the edges of the image
were found with the Sobel algorithm and overlaid on the original image, so the edges of the
lung and airway region are shown in the original image (Figure 6(b)). Also, by multiplying
the defined lung region with the original CTA lung image, the original segmented lung
image is obtained (Figure 6(c)).
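The thresholding and connected-component steps above can be sketched end to end. This is a hypothetical reimplementation, not the authors' MATLAB code: `label_components` is a plain BFS stand-in for `bwlabel`, and the border test generalizes the "1st or 512th pixel" condition to any image size.

```python
import numpy as np
from collections import deque

def label_components(mask):
    """4-connected component labelling; a plain stand-in for MATLAB's bwlabel."""
    labels = np.zeros(mask.shape, dtype=int)
    n = 0
    for y, x in zip(*np.nonzero(mask)):
        if labels[y, x]:
            continue
        n += 1
        labels[y, x] = n
        q = deque([(y, x)])
        while q:
            cy, cx = q.popleft()
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = n
                    q.append((ny, nx))
    return labels, n

def segment_lung(slice_hu, thresh=700, min_size=1000):
    """Threshold, drop small bright components (vessels), keep the largest
    body component, invert, then drop border-touching background and small
    low-intensity components (airways)."""
    body = slice_hu > thresh
    labels, n = label_components(body)
    for i in range(1, n + 1):                 # remove small vessel components
        if (labels == i).sum() < min_size:
            body[labels == i] = False
    labels, n = label_components(body)
    if n:                                     # keep the biggest component: the body
        sizes = [(labels == i).sum() for i in range(1, n + 1)]
        body = labels == 1 + int(np.argmax(sizes))
    inv = ~body                               # invert: 0s become 1s
    labels, n = label_components(inv)
    h, w = inv.shape
    for i in range(1, n + 1):
        ys, xs = np.nonzero(labels == i)
        touches = ys.min() == 0 or xs.min() == 0 or ys.max() == h - 1 or xs.max() == w - 1
        if touches or len(ys) < min_size:     # outside-body air, then airways
            inv[labels == i] = False
    return inv
```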
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image
processing functions and interactive tools for enhancing and analyzing digital images. The
interactive tools allowed us to perform spatial image transformations, morphological
operations such as edge detection and noise removal, region-of-interest processing, filtering,
basic statistics, curve fitting, and the FFT, DCT, and Radon transforms. Making graphics
objects semitransparent is a useful technique in 3-D visualization, as it conveys more
information about the spatial relationships of different structures. The toolbox functions,
implemented in the open MATLAB language, have also been used to develop the
customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis
and mathematical computing, with functions for signal processing, optimization, partial
differential equation solving, etc. It provides interactive tools including thresholding,
correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D
and 3D plotting functions. The image processing operations allowed us to perform noise
reduction and image enhancement, image transforms, colormap manipulation, colorspace
conversions, region-of-interest processing, and geometric operations [4].
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness
corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which
is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes
the pixel-region rectangle over the image displayed in the Image Tool, defining the group of
pixels that are displayed in extreme close-up view in the Pixel Region tool window. The
Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its
numeric value [25]. For RGB images there are three numeric values, one for each band of the
image. We can also determine the current position of the pixel region in the target image by
using the pixel information given at the bottom of the tool; in this way we found the x- and
y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool
displays a histogram representing the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figure 3a, b).
Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An area graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
The 3-D Surface Plot with Contour ("surfc") displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source; it can make subtle differences in surface shape easier to see and can add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of an X-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). "Image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string representing a mathematical function of two variables such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole/Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions
and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a
complete description of the classical ASM. Section 4.1 introduces shapes and shape models
in general. Section 4.2 describes the workings and the components of the ASM. The
parameters and variations that affect the performance of the ASM are explained in Section
4.3; the experiments performed in this thesis to improve the performance of the model are
also described in this section. The problem of initializing the model in a test image is tackled
in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an
error function; the performance of the ASM on bone X-rays will be judged according to this
error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a
diagram showing the points, or as an n × 2 array where the n rows represent the number of
points and the two columns represent the x and y coordinates of the points, respectively. In
this thesis, and in the code used, a shape is defined as a 2n × 1 vector where the y coordinates
are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block
of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines
connecting the points are not part of the shape, but they are shown to make the shape and the
order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives
the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance
between two shapes can be defined as the distance between their corresponding points [24].
There are other ways of defining distances between two shapes, like the Procrustes distance,
but in this thesis distance means the Euclidean distance:
√((x2 − x1)² + (y2 − y1)²) (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The
centroid can be useful when aligning shapes or finding an automatic initialization technique
(discussed in Section 4.4). The size of the shape is the root mean square distance between the
points and the centroid. This can be used to measure the size of the test image, which helps
with the automatic initialization (discussed in Section 4.4).
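Both quantities are one-liners over the 2n-vector representation; a small sketch, assuming the x coordinates precede the y coordinates as described above:

```python
import numpy as np

def centroid(shape):
    """Mean point position of a shape stored as a 2n-vector (x's then y's)."""
    n = len(shape) // 2
    return np.array([shape[:n].mean(), shape[n:].mean()])

def shape_size(shape):
    """Root mean square distance of the points from the centroid."""
    n = len(shape) // 2
    cx, cy = centroid(shape)
    return np.sqrt(((shape[:n] - cx) ** 2 + (shape[n:] - cy) ** 2).mean())
```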
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the 1st shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and the mean shape
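Algorithm 1 can be sketched as follows. The alignment step uses the closed-form least-squares rotation for 2-D point sets; checking for convergence is replaced by a fixed number of iterations for brevity.

```python
import numpy as np

def normalize(s):
    """Centre a shape (an n x 2 array) on the origin and scale it to unit size."""
    s = s - s.mean(0)
    return s / np.sqrt((s ** 2).sum())

def align_to(s, ref):
    """Rotate a centred, unit-size shape s to best match ref (least squares)."""
    a = (s[:, 0] * ref[:, 1] - s[:, 1] * ref[:, 0]).sum()
    b = (s * ref).sum()
    th = np.arctan2(a, b)
    R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    return s @ R.T

def align_shapes(shapes, iters=20):
    """Sketch of Algorithm 1 (translation, scale, and rotation alignment)."""
    shapes = [normalize(np.asarray(s, dtype=float)) for s in shapes]  # steps 1-3
    ref = shapes[0]
    mean = ref
    for _ in range(iters):                               # step 4
        shapes = [align_to(s, mean) for s in shapes]     # (a) align to mean
        mean = normalize(np.mean(shapes, axis=0))        # (b) recompute mean
        mean = align_to(mean, ref)                       # (c) constrain to x0
    return shapes, mean
```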
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was
separated from a full-body X-ray (as shown in Figure 1.2), and those images were then
resized to the same dimensions. This ensured uniformity in the quality of the data being used.
The training was done by manually selecting landmarks. Landmarks were placed at
approximately equal intervals and were distributed uniformly over the bone boundary. Such
images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training.
When performing tests with different numbers of landmark points, a subset of these
landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models
[24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image
around them. During training, the algorithm learns the characteristics of the area around
each landmark point and builds a profile model for it accordingly. When searching for the
shape in the test image, the area near the tentative landmarks is examined, and the model
moves the shape to an area that fits closely to the profile model. The tentative locations of
the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of landmarks. This introduces a
constraint on the shape. So, as the profile model tries to find the area in the test image that
best fits the profiles, the shape model ensures that the shape does not stray far from the
mean shape. The profile model acts on individual landmarks, whereas the shape model
acts globally on the image. The two models correct each other until no further
improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an
allowable shape. It tries to find the area in the image that closely matches the profiles of the
individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned,
and a mean shape with its permissible variations is formulated [24]:
x̂ = x̄ + Φb (4.3)
where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
Φ is the matrix of eigenvectors of the shape covariance, and b is the vector of shape
parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The
model is varied in height and width, finding optimum values for the landmarks. Figure 4.4
shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The
lines perpendicular to the model are called whiskers, and they help the profile model analyze
the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker
profiles around the landmark points are used for the profile model. A profile and a
covariance matrix are built for each landmark. It is assumed that the profiles are distributed
as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance
matrix S_g.
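The shape model of Equation 4.3 reduces to a PCA of the aligned shape vectors; the sketch below builds x̄ and Φ from training shapes and generates new shapes from a parameter vector b. The 98% variance-retention cutoff is an illustrative choice, not a value from this thesis.

```python
import numpy as np

def build_shape_model(aligned, var_keep=0.98):
    """PCA shape model from aligned shapes (rows are 2n-vectors): returns the
    mean shape, the eigenvector matrix Phi, and the kept eigenvalues."""
    X = np.asarray(aligned, dtype=float)
    mean = X.mean(axis=0)
    C = np.cov(X - mean, rowvar=False)
    vals, vecs = np.linalg.eigh(C)            # ascending eigenvalue order
    order = np.argsort(vals)[::-1]            # sort descending
    vals, vecs = vals[order], vecs[:, order]
    t = int(np.searchsorted(np.cumsum(vals) / vals.sum(), var_keep)) + 1
    return mean, vecs[:, :t], vals[:t]

def generate_shape(mean, phi, b):
    """x_hat = x_bar + Phi b (Equation 4.3)."""
    return mean + phi @ b
```

Projecting a training shape onto Φ and regenerating it recovers the shape, which is the sense in which b parameterizes the permissible variations.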
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from
the training images is imposed on the image, and the profiles around the landmark points are
searched and examined. The profiles are offset 3 pixels along the whisker, which is
perpendicular to the shape, to find the area that most closely resembles the mean shape [24].
The distance between a test profile g and the mean profile ḡ is calculated using the
Mahalanobis distance, given by f = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ). If the model is initialized
correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This
procedure is done for every landmark point, and then the shape model confirms that the
shape stays close to the mean shape. The shape model ensures that the profile model has not
changed the shape: if the shape model were not employed, the profile model might give the
best profile results but the resulting shape could be completely different. So, as mentioned
before, the two models restrict each other. A multi-resolution search is done to make the
model more robust. This enables the model to be more accurate, as it can lock on to the
shape from further away. The model searches over a series of different resolutions of the
same image, called an image pyramid; the resolutions of the images can be set and changed
in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid. The sizes of the images
are given relative to the first image; a general picture, rather than a bone X-ray, is shown.
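The profile search at one landmark can be sketched as below, assuming the mean profile ḡ and the inverse covariance S_g⁻¹ have been estimated during training; `best_offset` is a hypothetical helper that simply picks the candidate profile along the whisker with the smallest Mahalanobis distance.

```python
import numpy as np

def mahalanobis(g, g_mean, s_inv):
    """f(g) = (g - g_mean)^T S^-1 (g - g_mean) for one sampled profile."""
    d = np.asarray(g, dtype=float) - g_mean
    return float(d @ s_inv @ d)

def best_offset(profiles, g_mean, s_inv):
    """Index of the candidate profile (sampled at offsets along the whisker)
    closest to the model's mean profile."""
    return int(np.argmin([mahalanobis(g, g_mean, s_inv) for g in profiles]))
```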
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on;
the number of landmark points and the number of training images are investigated in this
thesis.
The number of landmark points is an important variable that affects the ASM. The profile
model of the ASM works with these landmark points to create profiles, so the positions of
the landmark points are as important as their number. In the training images, landmark
points are equally spaced along the boundary of the bone. Images are landmarked with 60
points, and subsets of these points are chosen to conduct experiments. The impact of the
number of landmark points on computing time and on the mean error (defined in Section
4.5) is tested by running the algorithm with different numbers of landmarks. As the number
of landmark points increases, the computing time is expected to increase and the error to
decrease. The results are explained in Chapter 5. A training set of images is used to train the
ASM. As the number of training images increases, the model becomes more robust and
intelligent. The computing time is expected to increase, as it takes time to train and create
profile models for each image. However, as the number of training images increases, the
mean profile improves and the model performs better, so the error is expected to decrease.
The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test
image. Figure 4.6 gives an overview of the ASM; Figure 4.6a shows the unaligned shapes
learnt from the training images, and the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started somewhere close to the bone boundary, in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is a poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
H(x) = [ Ixx(x, σD)  Ixy(x, σD)
         Ixy(x, σD)  Iyy(x, σD) ]   (1)
where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related
second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction.
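Both claims above, that the Harris measure suppresses "long" structures and that explicit 2×2 eigenvalues are cheap, can be checked numerically. The matrices below are hypothetical test data, and k = 0.04 is the conventional choice:

```python
import numpy as np

def harris_measure(A, k=0.04):
    """Harris saliency: det(A) - k * tr(A)^2."""
    return np.linalg.det(A) - k * np.trace(A) ** 2

def eigenvalues_2x2_symmetric(a, b, c):
    """Closed-form eigenvalues of [[a, b], [b, c]]: no iteration is
    needed, which is the point made above about the 2x2 case."""
    mean = 0.5 * (a + c)
    delta = np.sqrt(0.25 * (a - c) ** 2 + b ** 2)
    return mean + delta, mean - delta  # lambda1 >= lambda2

# A "long" ridge-like structure: large curvature in only one direction.
ridge = np.array([[10.0, 0.0], [0.0, 0.1]])
# A corner-like structure: large curvature in both directions.
corner = np.array([[10.0, 0.0], [0.0, 10.0]])
print(harris_measure(ridge) < 0, harris_measure(corner) > 0)  # True True
```

The ridge matrix has a large maximum eigenvalue (10.0), yet its Harris score is negative, so a Harris-style detector rejects exactly the structures PCBR wants to keep.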
Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal
curvature image is given by either
P(x) = max(λ1(x), 0)   (2)
or
P(x) = min(λ2(x), 0)   (3)
where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.
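A minimal sketch of Eqs. (1)-(3): the Hessian entries are obtained with Gaussian derivative filters at scale sigma (playing the role of σD), and the eigenvalues follow in closed form for a 2×2 symmetric matrix. This is an illustration, not the authors' code:

```python
import numpy as np
from scipy import ndimage

def principal_curvature(image, sigma=1.0, mode="dark"):
    """Principal-curvature image. mode="dark" returns max(lambda1, 0)
    (Eq. 2, dark lines on a light background); mode="light" returns
    min(lambda2, 0) (Eq. 3, light lines on a dark background)."""
    # Second-order Gaussian derivative responses along the two image axes.
    I20 = ndimage.gaussian_filter(image, sigma, order=(2, 0))
    I02 = ndimage.gaussian_filter(image, sigma, order=(0, 2))
    I11 = ndimage.gaussian_filter(image, sigma, order=(1, 1))
    mean = 0.5 * (I20 + I02)
    delta = np.sqrt(0.25 * (I20 - I02) ** 2 + I11 ** 2)
    if mode == "dark":
        return np.maximum(mean + delta, 0.0)   # max(lambda1, 0)
    return np.minimum(mean - delta, 0.0)       # min(lambda2, 0)
```

For a dark one-pixel line drawn across a light image, the "dark" response is positive along the line and near zero elsewhere.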
Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:
MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5   (4)
where MPij = max(Pi,j−1, Pij, Pi,j+1).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
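The triplet maximum of Eq. 4 is a one-liner. The sketch below assumes the six per-octave curvature images are already computed, and uses 0-based indices (the text's Pi1..Pi6 and MPi2..MPi5):

```python
import numpy as np

def max_curvature_images(P):
    """Given the six curvature images P[0..5] of one octave, return the
    pixelwise maximum over each triplet of consecutive images, i.e. the
    four interior-scale images MP of Eq. 4."""
    return [np.maximum(np.maximum(P[j - 1], P[j]), P[j + 1])
            for j in range(1, 5)]

# Toy data: constant images with increasing "curvature".
P = [np.full((8, 8), float(j)) for j in range(6)]
print([mp[0, 0] for mp in max_curvature_images(P)])  # [2.0, 3.0, 4.0, 5.0]
```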
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may potentially cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors, and the yellow arrows are the minor eigenvectors. To improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
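The two cleaning steps above can be sketched as follows. This is an illustrative Python approximation, not the authors' implementation: `grey_closing` stands in for the grayscale close with a disk-shaped element, and the low-threshold rule uses the 0.2/0.7 ratios from the text. The 0.9 agreement cutoff for the mean |dot product| is an assumed value, since the text says only "high enough":

```python
import numpy as np
from scipy import ndimage

def clean_curvature_image(mp, radius=2):
    """Grayscale morphological closing (dilation then erosion) with a
    disk-shaped structuring element; fills small "potholes" (local
    minima) in the principal-curvature terrain."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    disk = x ** 2 + y ** 2 <= radius ** 2
    return ndimage.grey_closing(mp, footprint=disk)

def low_threshold_ratio(vectors, y, x, agree=0.9):
    """Eigenvector-flow rule for one pixel: if the unit major eigenvectors
    of the 8 neighbours agree with the centre pixel's (mean |dot product|
    at least `agree`, an assumed cutoff), return the permissive ratio 0.2;
    otherwise the strict ratio 0.7. `vectors` has shape (H, W, 2)."""
    centre = vectors[y, x]
    dots = [abs(float(np.dot(centre, vectors[y + dy, x + dx])))
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0)]
    return 0.2 if np.mean(dots) >= agree else 0.7
```

A field of aligned eigenvectors thus lowers the local threshold to 0.04 · 0.2 = 0.008, letting a weak but coherent ridge survive binarization.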
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
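The scale-stability rule can be illustrated with a simplified overlap test. Note that the overlap error in the text is computed between elliptical regions as in [19]; here, purely for illustration, regions are pixel sets, the error is 1 − IoU, and the 0.3 tolerance is an assumed value:

```python
def overlap_error(region_a, region_b):
    """1 - |intersection| / |union| for two regions given as pixel sets;
    a simplified stand-in for the elliptical overlap error of [19]."""
    union = region_a | region_b
    if not union:
        return 1.0
    return 1.0 - len(region_a & region_b) / len(union)

def stable_regions(triplet, max_err=0.3):
    """Keep the middle scale's regions that match some region in both
    neighbouring scales within the tolerance (an assumed 0.3 here)."""
    lower, middle, upper = triplet
    return [r for r in middle
            if any(overlap_error(r, s) <= max_err for s in lower)
            and any(overlap_error(r, s) <= max_err for s in upper)]
```

A region that appears (with small variation) at three consecutive scales is kept; a region seen at only one scale is discarded.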
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated whenever there are more than two layers remaining.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||² + λ Σ_{(p,q)∈N} |e_pq| · δ[L_p ≠ L_q]   (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels into the different clusters. After the spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
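A toy version of this E-M loop can be sketched as below. This is not the paper's implementation: the E-step uses ICM (a greedy per-pixel update) instead of the graph-cut solver, and the initialization is a simple deterministic pick; it only illustrates how the appearance and smoothness terms of Eq. (1) interact:

```python
import numpy as np

def segment_em(features, edges, k, lam=1.0, iters=10):
    """Sketch of the E-M layer segmentation. features: (N, D) pixel
    features; edges: (p, q, weight) tuples, with weight standing in for
    the edge length |e_pq|."""
    idx = np.linspace(0, len(features) - 1, k).astype(int)
    centers = features[idx].astype(float)      # initial k colour models
    neighbours = [[] for _ in features]
    for p, q, w in edges:
        neighbours[p].append((q, w))
        neighbours[q].append((p, w))
    labels = np.argmin(((features[:, None] - centers) ** 2).sum(-1), axis=1)
    for _ in range(iters):
        # E-step (approximate): appearance cost + spatial disagreement cost
        for p in range(len(features)):
            data = ((features[p] - centers) ** 2).sum(-1)
            smooth = np.array([sum(w for q, w in neighbours[p]
                                   if labels[q] != c) for c in range(k)])
            labels[p] = np.argmin(data + lam * smooth)
        # M-step: re-estimate the k mean chromatic vectors
        for c in range(k):
            if np.any(labels == c):
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers
```

With λ = 0 this degenerates to plain k-means; larger λ increasingly forces neighbouring pixels into the same layer.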
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well suited for inpainting images with no or very few textures, owing to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and pairs of line segments are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus, and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in the coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.
B. Method
The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.
The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to retain the parts greater than 700 HU. At the end of thresholding, the new images are logical (binary):
Thresh = image > 700
In each of these new images, subsegmental vessels exist in the lung region. At the second step, the following method is used to get rid of these vessels: each 2D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).
Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The biggest component of logical 1s is the patient's body. This biggest component is kept, the other parts are removed from the image, and then the complement is taken, so all 0s turn into 1s and all 1s turn into 0s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 reach the image border (the 1st or 512th pixel), the parts that satisfy this condition are removed, and the lungs and airways appear as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared to the lung size, each image is labeled with the connected-component labeling algorithm, and the components whose pixel counts are below 1000 are determined to be airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways are removed, the edges of the image are found with the Sobel algorithm and overlaid on the original image, so that the edges of the lung and airway regions are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
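The pipeline described above can be sketched in Python (the study itself used MATLAB). The 700 HU threshold and the 1000-pixel size cutoff are taken from the text; the border test implements the reading that components touching the first or 512th row/column lie outside the body:

```python
import numpy as np
from scipy import ndimage

def segment_lungs(slice_hu, body_thresh=700, min_size=1000):
    """Sketch of the thresholding / connected-component pipeline for one
    2D CTA slice; returns a boolean lung mask."""
    # Step 1: threshold -> logical image of high-intensity (body) pixels.
    thresh = slice_hu > body_thresh
    # Step 2: remove sub-segmental vessels (components under 1000 px).
    labels, n = ndimage.label(thresh)
    sizes = ndimage.sum(thresh, labels, range(1, n + 1))
    for i, size in enumerate(sizes, start=1):
        if size < min_size:
            thresh[labels == i] = False
    # Step 3: keep the largest component (the body), then complement it.
    labels, n = ndimage.label(thresh)
    sizes = ndimage.sum(thresh, labels, range(1, n + 1))
    body = labels == (np.argmax(sizes) + 1)
    inverted = ~body
    # Step 4: remove components touching the image border (outside air).
    labels, n = ndimage.label(inverted)
    border = np.unique(np.concatenate(
        [labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
    for b in border:
        if b != 0:
            inverted[labels == b] = False
    # Step 5: drop remaining small components (< 1000 px) as airways.
    labels, n = ndimage.label(inverted)
    sizes = ndimage.sum(inverted, labels, range(1, n + 1))
    for i, size in enumerate(sizes, start=1):
        if size < min_size:
            inverted[labels == i] = False
    return inverted
```

Multiplying this mask with the original slice then gives the segmented lung image, as in Figure 6(c).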
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data plot. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b - Area Graph of an X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figure 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object, or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, alpha(0)
Figure 4b - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, alpha(0.4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, mesh
The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source; it can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of an X-ray CT brain scan generated with histogram values, lighting
The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of an X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - surfc on an X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8).
Figure 8 - contour3 on an X-ray CT brain scan
The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of an X-ray CT brain scan
The 3-D Ribbon Graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).
Figure 10 - The 3-D Ribbon Graph of an X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).
Figure 11 - Magnitude and Phase Response; frequency scale: a) linear, b) log
Figure 12 - Group Delay Response; frequency scale: a) linear, b) log
Figure 13 - Phase Delay Response; frequency scale: a) linear, b) log
Figure 14 - (a) Impulse Response, (b) Pole/Zero Plot
Figure 15 - Step Response: (a) default, (b) specified length 50
Figure 16 - Magnitude Response Estimate; frequency scale: a) linear, b) log
Figure 17 - Magnitude Response and Round-off Noise Power Spectrum; frequency scale: a) linear, b) log
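Outside MATLAB, the responses FVTool displays can be computed with SciPy. This sketch uses an arbitrary Butterworth low-pass as the example filter; its b and a coefficients play the role of FVTool's numerator and denominator:

```python
import numpy as np
from scipy import signal

# Example filter: 4th-order Butterworth low-pass, cutoff at 0.3 * Nyquist.
b, a = signal.butter(4, 0.3)
w, h = signal.freqz(b, a, worN=512)      # complex frequency response H(e^jw)
magnitude_db = 20 * np.log10(np.abs(h))  # magnitude response in dB
phase = np.unwrap(np.angle(h))           # unwrapped phase response
print(abs(round(magnitude_db[0], 2)))    # unity (0 dB) gain at DC -> 0.0
```

Plotting `magnitude_db` and `phase` against `w` reproduces the linear-frequency views of Figure 11; group delay is available via `signal.group_delay((b, a))`.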
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((y2 − y1)² + (x2 − x1)²)    (4.1)
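In code, the point distance of Equation 4.1 and the corresponding-points shape distance look like the following sketch; summing the point distances is one convention, since the text does not fix how the per-point distances are aggregated:

```python
import math

def point_distance(p, q):
    """Euclidean distance between two points (x1, y1) and (x2, y2)."""
    return math.sqrt((q[1] - p[1]) ** 2 + (q[0] - p[0]) ** 2)

def shape_distance(s, t):
    """Distance between two shapes: here, the sum of the distances
    between corresponding points (one possible aggregation)."""
    return sum(point_distance(p, q) for p, q in zip(s, t))

# 3-4-5 triangle: the distance between (0, 0) and (3, 4) is 5.
print(point_distance((0, 0), (3, 4)))  # 5.0
```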
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which will help with the automatic initialization (discussed in Section 4.4).
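The centroid and the root-mean-square size described above can be sketched as:

```python
import math

def centroid(shape):
    """Mean of the point positions of a shape given as [(x, y), ...]."""
    n = len(shape)
    return (sum(x for x, _ in shape) / n, sum(y for _, y in shape) / n)

def shape_size(shape):
    """Root mean square distance of the points from the centroid."""
    cx, cy = centroid(shape)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2
                         for x, y in shape) / len(shape))

# A unit square centered at (1, 1): every corner lies sqrt(2) from the centroid.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(centroid(square))  # (1.0, 1.0)
```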
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape)
2. Translate each shape so that it is centered on the origin
3. Scale the reference shape to unit size; call this shape x̄0, the initial mean shape
4. Repeat:
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x̄0, scale to unit size)
5. Until convergence (i.e. the mean shape does not change much)
Output: set of aligned shapes and mean shape
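A minimal sketch of Algorithm 1, assuming alignment by translation and scaling only; a full Procrustes alignment would also rotate each shape onto the mean, which is omitted here for brevity:

```python
import numpy as np

def center(shape):
    """Step 2: translate a shape so it is centered on the origin."""
    return shape - shape.mean(axis=0)

def to_unit(shape):
    """Scale a shape to unit (Frobenius) size."""
    return shape / np.linalg.norm(shape)

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1 without the rotation step."""
    shapes = [center(np.asarray(s, dtype=float)) for s in shapes]
    mean = to_unit(shapes[0])                    # step 3: x0
    for _ in range(iters):                       # step 4
        shapes = [to_unit(s) for s in shapes]    # (a) align (scale only)
        new_mean = np.mean(shapes, axis=0)       # (b) recalculate mean
        new_mean = to_unit(center(new_mean))     # (c) constrain mean
        if np.allclose(new_mean, mean):          # step 5: convergence
            break
        mean = new_mean
    return shapes, mean
```

Two copies of the same triangle at different positions and scales align to the same normalized shape.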
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. While performing tests using different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated, together with the permissible variations [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix whose columns are the eigenvectors of the covariance of the training shapes, and
b is a vector of weights, one per eigenvector.
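The shape model of Equation 4.3 can be sketched with a PCA built from the aligned training shape vectors; the function names are illustrative, not from the thesis code:

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """Shapes given as rows of 2n-vectors (x coords then y coords).
    Returns the mean shape and the first n_modes eigenvectors Phi of
    the covariance matrix, sorted by decreasing eigenvalue."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(X - mean, rowvar=False))
    order = np.argsort(vals)[::-1]          # largest eigenvalues first
    return mean, vecs[:, order[:n_modes]]

def generate_shape(mean, phi, b):
    """x_hat = x_bar + Phi @ b (Eq. 4.3): vary b to generate new shapes."""
    return mean + phi @ b

# With b = 0 the model reproduces the mean shape exactly.
```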
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points. The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
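The Mahalanobis profile distance f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ) and the search over candidate offsets along a whisker can be sketched as:

```python
import numpy as np

def mahalanobis(g, g_mean, S_g):
    """(g - g_mean)^T S_g^{-1} (g - g_mean): distance between a sampled
    test profile g and the learnt mean profile g_mean."""
    d = g - g_mean
    return float(d @ np.linalg.solve(S_g, d))

def best_offset(profiles, g_mean, S_g):
    """Among candidate profiles sampled at successive offsets along the
    whisker, pick the index with the lowest Mahalanobis distance."""
    dists = [mahalanobis(g, g_mean, S_g) for g in profiles]
    return int(np.argmin(dists))
```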
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not distorted the shape: if the shape model were not employed, the profile model might give the best profile results while the resulting shape is completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is used for illustration.
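A simple image pyramid of the kind used in the multi-resolution search can be sketched as follows; 2x2 block averaging stands in for the smoothing-and-subsampling step, and the number of levels is a free parameter, as in the algorithm:

```python
import numpy as np

def image_pyramid(img, levels=3):
    """Multi-resolution pyramid: each level halves the previous one by
    averaging 2x2 pixel blocks (a crude stand-in for Gaussian
    smoothing followed by subsampling)."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        a = pyramid[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
        a = a[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(a)
    return pyramid
```

The search starts on the coarsest level and refines the fit on each finer level.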
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases the mean profile improves and the model performs better, so the error is expected to decrease. The dataset in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:
MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5    (4)
where MPij = max(Pi,j−1, Pij, Pi,j+1).
Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
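The per-pixel maximum Hessian eigenvalue and the across-scale maximum of Eq. 4 can be sketched as follows; simple finite differences stand in for the incremental Gaussian derivatives used in the paper:

```python
import numpy as np

def principal_curvature(img):
    """Maximum eigenvalue of the 2x2 Hessian at every pixel, with the
    Hessian estimated by finite differences (a stand-in for derivatives
    of the Gaussian-smoothed image)."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    Ixx = np.gradient(gx, axis=1)
    Ixy = np.gradient(gx, axis=0)
    Iyy = np.gradient(gy, axis=0)
    # Closed-form largest eigenvalue of [[Ixx, Ixy], [Ixy, Iyy]].
    disc = np.sqrt((Ixx - Iyy) ** 2 + 4 * Ixy ** 2)
    return 0.5 * (Ixx + Iyy + disc)

def max_over_scales(P):
    """MP_ij = max(P_{i,j-1}, P_ij, P_{i,j+1}): the maximum over each
    triplet of consecutive principal curvature images in an octave."""
    return [np.maximum(np.maximum(P[j - 1], P[j]), P[j + 1])
            for j in range(1, len(P) - 1)]
```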
3.2 Enhanced Watershed Region Detection
The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image: small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ denote grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins.
Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. Rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e. the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than the eigenvalue magnitudes.
In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based adaptive thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).
The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
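The eigenvector-flow selection of the per-pixel low threshold can be sketched as follows; the cut-off of 0.75 on the average absolute dot product is our assumption, since the text only says the support must be "high enough":

```python
import numpy as np

def low_thresholds(evec, high=0.04, support=0.75):
    """Per-pixel low threshold from eigenvector flow. evec has shape
    (h, w, 2): a unit major eigenvector per pixel. The low-to-high ratio
    is 0.2 where the eigenvector agrees with its 8 neighbours (average
    |dot product| above `support`, an assumed cut-off) and 0.7 elsewhere."""
    h, w, _ = evec.shape
    ratio = np.full((h, w), 0.7)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dots = [abs(evec[y, x] @ evec[y + dy, x + dx])
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0)]
            if np.mean(dots) > support:
                ratio[y, x] = 0.2
    return high * ratio   # 0.008 where the flow supports, 0.028 otherwise
```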
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering the layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenge is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means); in other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e. the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
min_L  Σ_p ‖f_p − c_{L_p}‖²₂  +  λ Σ_{{p,q}∈N} |e_pq| δ[L_p ≠ L_q]    (1)
where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence, or until a predefined number of iterations is reached.
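A sketch of the E-M loop above, with a simple ICM-style local update standing in for the graph-cut solver of Eq. (1); the deterministic center initialization and the default λ are our own choices:

```python
import numpy as np

def segment_em(feat, k=2, lam=0.5, iters=5):
    """E-step: assign each pixel the label minimising colour distance
    plus a Potts-style penalty for disagreeing with its 4 neighbours
    (an ICM stand-in for graph cuts). M-step: re-estimate the k mean
    chromatic vectors. feat has shape (h, w, c)."""
    h, w, c = feat.shape
    flat = feat.reshape(-1, c).astype(float)
    # Deterministic initialization: k evenly spaced pixels as centers.
    centers = flat[np.linspace(0, h * w - 1, k).astype(int)].copy()
    labels = np.zeros((h, w), dtype=int)
    for _ in range(iters):
        for y in range(h):                                   # E-step
            for x in range(w):
                costs = np.sum((feat[y, x] - centers) ** 2, axis=1)
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        costs += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = int(np.argmin(costs))
        for i in range(k):                                   # M-step
            if np.any(labels == i):
                centers[i] = feat[labels == i].mean(axis=0)
    return labels, centers
```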
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well-suited for inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless; therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of the underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, it is independent of initialization and can handle general inpainting regions, e.g. regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where, for example, a possible reconstruction of a level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of the angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e. surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec is administered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20 G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.
B. Method
The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.
The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels) and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding first keeps the parts whose values are greater than 700 HU. After thresholding, the new images are binary (logical) images:
Thresh = image > 700
In each of these new images, sub-segment vessels exist in the lung region. In the second step, to remove these vessels, each 2D image is considered one by one and each component in the image is labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).
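The "remove components under 1000 pixels" step can be sketched with a small BFS-based connected component labelling; 4-connectivity is assumed here, and the actual pipeline's implementation may differ:

```python
from collections import deque
import numpy as np

def label_components(mask):
    """4-connected component labelling of a binary mask via BFS.
    Returns a label image (0 = background) and the component count."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for y, x in zip(*np.nonzero(mask)):
        if labels[y, x]:
            continue
        count += 1
        labels[y, x] = count
        queue = deque([(y, x)])
        while queue:
            cy, cx = queue.popleft()
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    queue.append((ny, nx))
    return labels, count

def remove_small(mask, min_pixels=1000):
    """Drop every connected component with fewer than min_pixels pixels."""
    labels, n = label_components(mask)
    keep = np.zeros_like(mask, dtype=bool)
    for i in range(1, n + 1):
        if np.sum(labels == i) >= min_pixels:
            keep |= labels == i
    return keep
```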
Next, the image in Figure 3 is labeled with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component is kept, the other parts are removed from the image, and then the complement is taken, so all "0" values turn into "1" and all "1" values turn into "0" (Figure 4).
Since the parts outside the body in the image shown in Figure 4 touch the 1st or 512th pixel column, the parts that satisfy this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared to the lung, each image is labeled with the connected component labeling algorithm, and components whose pixel counts are below 1000 are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations, edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, the FFT, the DCT and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as the mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).
Figure 3a The histogram of an X-ray CT image and the plot fits (2 significant digits). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object, or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The meshgrid function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
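numpy's meshgrid mirrors MATLAB's. A small sketch showing that the rows of X copy x and the columns of Y copy y, and how the grid is used to evaluate a function of two variables (the vectors are illustrative):

```python
import numpy as np

# numpy.meshgrid mirrors MATLAB's meshgrid: rows of X copy x, columns of Y copy y.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # both X and Y have shape (2, 3)

# Evaluate a function of two variables on the grid, as done for surface plots.
Z = X + Y
print(X)
print(Y)
print(Z)
```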
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. Lighting is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d. Surface plot of an X-ray CT brain scan generated with histogram values, lighting
The image function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The image with colormap scaling (the imagesc function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
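The m-by-3 colormap convention can be sketched directly. The tiny 3-entry blue-green-red map below is hypothetical, standing in for jet, and the linear index mapping imitates imagesc-style scaling onto the full colormap:

```python
import numpy as np

# A colormap is an m-by-3 matrix of RGB rows in [0, 1]; here a tiny 3-entry
# blue-to-red map standing in for MATLAB's 'jet' (hypothetical values).
cmap = np.array([[0.0, 0.0, 1.0],   # blue
                 [0.0, 1.0, 0.0],   # green
                 [1.0, 0.0, 0.0]])  # red

# imagesc-style scaling: map data linearly onto the full colormap range.
data = np.array([[5.0, 10.0], [15.0, 20.0]])
idx = np.round((data - data.min()) / (data.max() - data.min()) * (len(cmap) - 1)).astype(int)
rgb = cmap[idx]            # shape (2, 2, 3): one RGB triple per pixel
print(idx)
```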
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6. Contour plot of an X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
The 3-D Lit Surface Plot (surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole-Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
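The magnitude and phase responses FVTool plots come from evaluating H(e^jw) = B(e^jw)/A(e^jw) on a frequency grid. A numpy-only sketch (the first-order low-pass coefficients are illustrative, not from the text):

```python
import numpy as np

def freq_response(b, a, n=8):
    """Evaluate H(e^{jw}) = B(e^{jw}) / A(e^{jw}) at n frequencies in [0, pi)."""
    w = np.linspace(0, np.pi, n, endpoint=False)
    z = np.exp(1j * w)
    num = sum(bk * z ** (-k) for k, bk in enumerate(b))
    den = sum(ak * z ** (-k) for k, ak in enumerate(a))
    return w, num / den

# Hypothetical first-order low-pass filter: H(z) = 0.5 / (1 - 0.5 z^{-1}).
b = [0.5]
a = [1.0, -0.5]
w, h = freq_response(b, a, n=512)
magnitude_db = 20 * np.log10(np.abs(h))    # magnitude response in dB
phase = np.unwrap(np.angle(h))             # unwrapped phase response
print(magnitude_db[0])  # DC gain: H(1) = 0.5/0.5 = 1, i.e. 0 dB
```

Plotting magnitude_db and phase against w reproduces the kind of curves shown in Figures 11-13.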
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM models work the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. The shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which will help with automatic initialization (discussed in Section 4.4).
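The 2n-vector convention, Equation 4.1, the centroid, and the RMS shape size can be sketched together (the 3-point shape is hypothetical):

```python
import numpy as np

# A shape stored as a 2n-by-1 vector, y co-ordinates after x co-ordinates,
# matching the convention used in this thesis (hypothetical 3-point shape).
n = 3
shape = np.array([0.0, 4.0, 0.0,    # x1, x2, x3
                  0.0, 0.0, 3.0])   # y1, y2, y3
points = np.column_stack([shape[:n], shape[n:]])

# Euclidean distance between two corresponding points (Equation 4.1).
d = np.hypot(points[1, 0] - points[0, 0], points[1, 1] - points[0, 1])

# Centroid: the mean of the point positions.
centroid = points.mean(axis=0)

# Shape size: root mean square distance from the points to the centroid.
size = np.sqrt(((points - centroid) ** 2).sum(axis=1).mean())
print(d, centroid, size)
```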
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and mean shape
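A minimal sketch of Algorithm 1, with the rotation step omitted for brevity so that "aligning" here means centering and scale normalisation only (the triangle shapes are illustrative):

```python
import numpy as np

def center(points):
    """Translate a shape so its centroid is at the origin (step 2)."""
    return points - points.mean(axis=0)

def scale_to_unit(points):
    """Scale so the RMS distance of the points to the origin is 1 (step 3)."""
    size = np.sqrt((points ** 2).sum(axis=1).mean())
    return points / size

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1; rotation alignment is omitted, so 'align'
    means centering and scale normalisation only."""
    shapes = [center(s) for s in shapes]
    mean = scale_to_unit(shapes[0])          # reference / initial mean shape
    aligned = shapes
    for _ in range(iters):
        aligned = [scale_to_unit(s) for s in shapes]         # step 4(a)
        new_mean = scale_to_unit(np.mean(aligned, axis=0))   # steps 4(b), 4(c)
        if np.allclose(new_mean, mean):                      # step 5: convergence
            break
        mean = new_mean
    return aligned, mean

# Two hypothetical triangles differing only by translation and scale.
s1 = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
s2 = 3.0 * s1 + np.array([5.0, 7.0])
aligned, mean = align_shapes([s1, s2])
print(np.allclose(aligned[0], aligned[1]))  # True: identical after alignment
```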
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training was done by manually selecting landmarks on the images. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. So while the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Pb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
P is the matrix of eigenvectors of the covariance of the training shapes, and
b is a vector of shape parameters.
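A hedged sketch of building this kind of linear shape model from toy aligned shapes, using eigenvectors of the covariance matrix as the modes of variation (the training matrix is hypothetical):

```python
import numpy as np

# Four hypothetical aligned training shapes, each a 2n-vector with n = 2
# points (x1, x2, y1, y2), standing in for the landmarked bone shapes.
X = np.array([[ 0.0, 1.0, 0.0, 1.0],
              [ 0.2, 1.2, 0.0, 1.0],
              [-0.2, 0.8, 0.0, 1.0],
              [ 0.0, 1.0, 0.1, 0.9]])

x_mean = X.mean(axis=0)                       # the mean shape

# Modes of variation: eigenvectors of the covariance of the training shapes.
eigvals, eigvecs = np.linalg.eigh(np.cov(X.T))
order = np.argsort(eigvals)[::-1]
P = eigvecs[:, order[:2]]                     # keep the two largest modes

# Generate a new allowable shape: x_hat = x_mean + P b
b = np.array([0.5, 0.0])                      # shape parameters
x_hat = x_mean + P @ b
print(x_hat.shape)
```

Setting b to zero recovers the mean shape exactly; varying the entries of b sweeps out the permissible variations.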
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
4.2.3 Searching the test image
After training is complete, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between the test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

(g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
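A minimal numpy sketch of this profile distance, assuming the classical ASM form (g − ḡ)ᵀ Sg⁻¹ (g − ḡ); the profiles and covariances are hypothetical:

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S):
    """Squared Mahalanobis distance (g - g_mean)^T S^{-1} (g - g_mean)."""
    diff = g - g_mean
    return float(diff @ np.linalg.solve(S, diff))

# Hypothetical profile of length 3; with an identity covariance the distance
# reduces to the squared Euclidean distance.
g_mean = np.zeros(3)
d2 = mahalanobis_sq(np.array([1.0, 2.0, 2.0]), g_mean, np.eye(3))
print(d2)  # 1 + 4 + 4 = 9.0
```

During search, the offset along the whisker with the lowest distance is chosen for each landmark.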
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, not a bone X-ray, is shown.
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The data set in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are displayed as well.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, i.e. started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue
magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
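The eigenvector-flow hysteresis step can be sketched as follows. The 0.04 high threshold and the 0.2/0.7 low-to-high ratios come from the text; the agreement cutoff of 0.9 and the toy ridge are assumptions (the text only says the average dot product must be "high enough"):

```python
import numpy as np
from collections import deque

HIGH = 0.04   # high threshold: strong principal-curvature response (from the text)

def low_threshold_map(vx, vy, agree=0.9):
    """Per-pixel low threshold from eigenvector agreement with the 8 neighbours.
    agree=0.9 is an assumed cutoff for 'high enough' average dot product."""
    h, w = vx.shape
    low = np.full((h, w), HIGH * 0.7)            # default low-to-high ratio 0.7
    for i in range(h):
        for j in range(w):
            dots = []
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if (di or dj) and 0 <= ni < h and 0 <= nj < w:
                        dots.append(abs(vx[i, j] * vx[ni, nj] + vy[i, j] * vy[ni, nj]))
            if dots and np.mean(dots) > agree:
                low[i, j] = HIGH * 0.2           # supported by the eigenvector flow
    return low

def eigenvector_flow_hysteresis(mag, vx, vy):
    """Seeds above HIGH grow into 8-connected pixels above their local low threshold."""
    low = low_threshold_map(vx, vy)
    out = mag >= HIGH
    queue = deque(zip(*np.nonzero(out)))
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (0 <= ni < mag.shape[0] and 0 <= nj < mag.shape[1]
                        and not out[ni, nj] and mag[ni, nj] >= low[ni, nj]):
                    out[ni, nj] = True
                    queue.append((ni, nj))
    return out

# A weak horizontal ridge with uniform eigenvector directions: one strong seed
# pulls the whole ridge above the lowered low threshold of 0.008.
mag = np.full((3, 5), 0.001)
mag[1, 0] = 0.05
mag[1, 1:] = 0.01
vx, vy = np.ones((3, 5)), np.zeros((3, 5))
ridge = eigenvector_flow_hysteresis(mag, vx, vy)
print(int(ridge.sum()))  # 5: the entire middle row survives
```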
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
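The overlap error can be sketched on rasterized masks as one minus intersection-over-union; note that [19] computes it on the fitted ellipses after normalizing for scale, so this is only an illustration:

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """Overlap error between two regions: 1 - |A ∩ B| / |A ∪ B|.
    A rasterized sketch; [19] evaluates this on fitted, scale-normalized ellipses."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union

# Two hypothetical 6x6 square regions detected at consecutive scales.
a = np.zeros((10, 10), dtype=bool); a[2:8, 2:8] = True    # 36 pixels
b = np.zeros((10, 10), dtype=bool); b[4:10, 4:10] = True  # 36 pixels
print(overlap_error(a, b))  # intersection 16, union 56 -> 1 - 16/56
```

A region would be kept when this error stays below a tolerance across a triplet of consecutive scales.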
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting have similar colors, regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
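The chromatic clustering step can be sketched with plain k-means (the De-pict pipeline also applies complete-linkage, which is omitted here; the pixel values and the deterministic seeding are illustrative):

```python
import numpy as np

def kmeans(features, k, iters=20):
    """Plain k-means over per-pixel chromatic features: a minimal sketch of
    De-pict's clustering step. For determinism, k evenly spaced pixels seed
    the centers instead of a random initialisation."""
    idx = np.linspace(0, len(features) - 1, k).astype(int)
    centers = features[idx].astype(float).copy()
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # Assign every pixel to its nearest chromatic center.
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Re-estimate each center as the mean chromatic vector of its pixels.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers

# Two well-separated hypothetical "layers": dark strokes and bright strokes.
pixels = np.array([[0.10, 0.10, 0.10], [0.12, 0.10, 0.11],
                   [0.90, 0.85, 0.90], [0.88, 0.90, 0.92]])
labels, centers = kmeans(pixels, k=2)
print(labels)  # [0 0 1 1]: one label per layer
```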
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M fashion.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L Σ_p ||f_p − c_{L_p}||₂² + λ Σ_{{p,q}∈N} |e_{pq}| · δ[L_p ≠ L_q]    (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, and δ[·] is the delta function.
The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes situations where neighboring pixels belong to different clusters. With the k appearance models fixed, the minimization problem can be solved with a graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to clusters. After the spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
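Evaluating the energy in Eq. (1) for a candidate labeling can be sketched directly (the graph-cut minimization itself is not shown, as it needs a max-flow solver; the 3-pixel example is hypothetical):

```python
import numpy as np

def labeling_energy(features, labels, centers, edges, edge_len, lam):
    """Energy of Eq. (1): a data term ||f_p - c_{L_p}||^2 per pixel, plus a
    Potts smoothness term lam * |e_pq| for neighbours with different labels."""
    data = sum(float(np.sum((features[p] - centers[labels[p]]) ** 2))
               for p in range(len(features)))
    smooth = sum(edge_len[i] for i, (p, q) in enumerate(edges)
                 if labels[p] != labels[q])
    return data + lam * smooth

# Three pixels in a row with 1-D color features and two cluster centers.
features = np.array([[0.0], [0.0], [1.0]])
centers = np.array([[0.0], [1.0]])
edges = [(0, 1), (1, 2)]        # neighbourhood N along the row
edge_len = [1.0, 1.0]
e_coherent = labeling_energy(features, [0, 0, 1], centers, edges, edge_len, lam=0.5)
e_uniform = labeling_energy(features, [0, 0, 0], centers, edges, edge_len, lam=0.5)
print(e_coherent, e_uniform)  # 0.5 1.0: the correct labeling has lower energy
```

The smoothness weight λ trades appearance fit against spatial coherence; a graph-cut solver searches over labelings to minimize this same quantity.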
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.,7 which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16; Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent is injected at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System; Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard; Siemens AG, Erlangen, Germany) in coronal, sagittal and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.
B. Method
The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1. The data set at hand consists of 250 2D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to keep the parts greater than 700 HU. After thresholding, the new images have logical (binary) values:

Thresh = image > 700

In each of these new images, subsegmental vessels remain inside the lung region. At the second step, the following method was used to get rid of these vessels: each 2D image was considered one by one, and each component in the image was labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts were under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 was labeled with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component was kept, the other parts were removed from the image, and the result was then inverted, so that every "0" turns into "1" and every "1" turns into "0" (Figure 4).
Since the parts outside the body in the image shown in Figure 4 touch the border pixels (row or column 1 or 512), the parts that meet this condition were removed, and the lung and airways appear as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared to the lung, each image was labeled with the connected component labeling algorithm, and components whose pixel counts were below 1000 were identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6b). Also, by multiplying the defined lung region with the original CTA image, the original segmented lung image was obtained (Figure 6c).
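The first steps of this pipeline (thresholding at 700 HU and removing components under a size cutoff) can be sketched in Python; the 4-connected labelling helper, the toy 6×6 "slice", and the size cutoff of 5 (1000 in the text) are illustrative stand-ins:

```python
import numpy as np
from collections import deque

def label_components(mask):
    """4-connected component labelling: a minimal stand-in for the
    'connected component labelling algorithm' used in the text."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i, j] and labels[i, j] == 0:
                current += 1
                labels[i, j] = current
                queue = deque([(i, j)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels, current

def remove_small(mask, min_size):
    """Drop components smaller than min_size pixels (1000 in the text)."""
    labels, n = label_components(mask)
    keep = np.zeros_like(mask)
    for c in range(1, n + 1):
        if (labels == c).sum() >= min_size:
            keep |= labels == c
    return keep

# Toy CT slice in HU: a body-like block above 700 HU plus a 2-pixel vessel.
ct = np.zeros((6, 6), dtype=int)
ct[1:5, 1:4] = 900                 # body-like region, 12 pixels
ct[0, 5] = 800; ct[1, 5] = 800     # small high-intensity speck, 2 pixels
thresh = ct > 700                  # step 1: thresholding, as in Thresh = image > 700
clean = remove_small(thresh, min_size=5)   # step 2: remove small components
print(clean.sum())  # 12: the body region survives, the speck is removed
```

The remaining steps (keeping the largest component as the body, inverting, and removing border-touching and sub-1000-pixel components) reuse the same labelling helper.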
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image
processing functions and interactive tools for enhancing and analyzing digital images The
interactive tools allowed us to perform spatial image transformations morphological
operations such as edge detection and noise removal region-of-interest processing filtering
basic statistics curve fitting FFT DCT and Radon Transform Making graphics objects
semitransparent is a useful technique in 3-D visualization which furnishes more information
about spatial relationships of different structures The toolbox functions implemented in the
open MATLAB language has also been used to develop the customized algorithms
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figure 3 a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values, alpha(0.4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
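The meshgrid behavior just described can be sketched in a few lines. This pure-Python stand-in mirrors MATLAB's meshgrid for illustration only (the helper names are mine):

```python
# Minimal stand-in for MATLAB's meshgrid: rows of X are copies of x,
# columns of Y are copies of y, as described in the text.
def meshgrid(x, y):
    X = [list(x) for _ in y]           # each row is a copy of x
    Y = [[v] * len(x) for v in y]      # each column is a copy of y
    return X, Y

def eval_on_grid(f, x, y):
    """Evaluate a function of two variables on the grid (e.g. for a surface plot)."""
    X, Y = meshgrid(x, y)
    return [[f(X[i][j], Y[i][j]) for j in range(len(x))] for i in range(len(y))]
```

For example, `eval_on_grid(lambda a, b: a + b, [1, 2, 3], [10, 20])` produces the 2x3 matrix of sums that a surface plot would render.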
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see; it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with Colormap Scaling ('imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through the colors cyan, yellow, and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole/Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
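The magnitude and phase responses that FVTool displays for a filter with numerator b and denominator a can be sketched by direct evaluation of the transfer function H(e^jw) = B(e^jw)/A(e^jw). This Python sketch shows the underlying computation; the function names are mine, not FVTool's API:

```python
# Evaluate the frequency response of a digital filter with numerator b and
# denominator a at radian frequency w, as FVTool does internally.
import cmath
import math

def freq_response(b, a, w):
    """H(e^jw) = sum_k b_k e^{-jwk} / sum_k a_k e^{-jwk}."""
    z = cmath.exp(-1j * w)
    num = sum(bk * z**k for k, bk in enumerate(b))
    den = sum(ak * z**k for k, ak in enumerate(a))
    return num / den

def magnitude_db(b, a, w):
    """Magnitude response in decibels."""
    return 20 * math.log10(abs(freq_response(b, a, w)))

def phase(b, a, w):
    """Phase response in radians."""
    return cmath.phase(freq_response(b, a, w))
```

For a two-tap moving average (b = [0.5, 0.5], a = [1]), the magnitude is 0 dB at w = 0 and falls off toward the Nyquist frequency, which is the low-pass shape such a plot would show.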
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, where the experiments performed in this thesis to improve the performance of the model are also described. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 - x1)² + (y2 - y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).
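The point distance (Eq. 4.1), shape-to-shape distance, centroid, and shape size just defined can be sketched as follows. This is an illustrative helper module, not the thesis code; shapes are represented here as lists of (x, y) points:

```python
# Shape measures from Section 4.1: Euclidean distance, distance between
# corresponding points of two shapes, centroid, and RMS size.
from math import sqrt, isclose

def point_dist(p1, p2):
    """Euclidean distance between two points (Eq. 4.1)."""
    (x1, y1), (x2, y2) = p1, p2
    return sqrt((x2 - x1)**2 + (y2 - y1)**2)

def shape_dist(s1, s2):
    """Distance between two shapes: summed distance of corresponding points."""
    return sum(point_dist(p, q) for p, q in zip(s1, s2))

def centroid(shape):
    """Mean of the point positions."""
    n = len(shape)
    return (sum(p[0] for p in shape) / n, sum(p[1] for p in shape) / n)

def shape_size(shape):
    """Root mean square distance of the points from the centroid."""
    c = centroid(shape)
    return sqrt(sum(point_dist(p, c)**2 for p in shape) / len(shape))
```

For the square [(0,0), (2,0), (2,2), (0,2)], the centroid is (1, 1) and the size is √2, since every corner lies √2 from the centroid.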
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the 1st shape)
2. Translate each shape so that it is centered on the origin
3. Scale the reference shape to unit size; call this shape x̄0, the initial mean shape
4. Repeat:
   (a) Align all shapes to the mean shape
   (b) Recalculate the mean shape from the aligned shapes
   (c) Constrain the current mean shape (align to x̄0, scale to unit size)
5. Until convergence (i.e., the mean shape does not change much)
Output: set of aligned shapes and mean shape
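Algorithm 1 can be sketched as follows. For brevity this illustrative version aligns with translation and scaling only; a full Procrustes implementation would also solve for rotation in step 4a:

```python
# Simplified sketch of Algorithm 1 (no rotation component).
# Shapes are lists of (x, y) points with corresponding indices.
from math import sqrt

def center_and_scale(shape):
    """Translate the centroid to the origin and scale to unit RMS size (steps 2-3)."""
    n = len(shape)
    cx = sum(x for x, _ in shape) / n
    cy = sum(y for _, y in shape) / n
    pts = [(x - cx, y - cy) for x, y in shape]
    size = sqrt(sum(x * x + y * y for x, y in pts) / n)
    return [(x / size, y / size) for x, y in pts]

def mean_shape(shapes):
    """Pointwise average of a set of shapes (step 4b)."""
    n = len(shapes[0])
    return [(sum(s[i][0] for s in shapes) / len(shapes),
             sum(s[i][1] for s in shapes) / len(shapes)) for i in range(n)]

def align_shapes(shapes, iters=10):
    aligned = [center_and_scale(s) for s in shapes]
    mean = aligned[0]                                     # reference = 1st shape
    for _ in range(iters):                                # step 4, fixed iteration cap
        mean = center_and_scale(mean_shape(aligned))      # steps 4b-4c (constrain)
        aligned = [center_and_scale(s) for s in aligned]  # step 4a, minus rotation
    return aligned, mean
```

Two shapes that differ only by translation and scale (e.g. a square and a shifted, doubled copy) come out identical after alignment, and the mean shape has unit size.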
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. While performing tests using different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative location of the landmarks is obtained from the suggested shape.
2. The shape model defines the permissible relative positions of landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that fits the model, the shape model ensures that the overall shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
Φ is the matrix of eigenvectors of the covariance of the training shapes, and
b is the vector of shape parameters.
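A minimal sketch of generating a shape from Eq. 4.3 follows. The clamp on each parameter to ±3 standard deviations is the standard ASM convention rather than something stated above, and the numeric values in the test are purely illustrative:

```python
# Sketch of Eq. 4.3: x_hat = x_bar + Phi * b.
# x_bar is a 2n-vector (x co-ordinates then y co-ordinates),
# Phi is a 2n x t matrix of eigenvectors, b is a t-vector of parameters.
def generate_shape(x_bar, Phi, b):
    return [x_bar[i] + sum(Phi[i][j] * b[j] for j in range(len(b)))
            for i in range(len(x_bar))]

def clamp_params(b, eigvals, limit=3.0):
    """Keep each b_j within +/- limit * sqrt(lambda_j), the usual plausibility bound."""
    out = []
    for bj, lam in zip(b, eigvals):
        bound = limit * (lam ** 0.5)
        out.append(max(-bound, min(bound, bj)))
    return out
```

With b = 0 the model reproduces the mean shape exactly; varying b moves the shape along the learnt modes of variation.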
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix S_g.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g - ḡ)ᵀ S_g⁻¹ (g - ḡ)
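The Mahalanobis profile cost can be sketched as follows. To stay dependency-free, this illustrative version takes the inverse covariance S_inv as a precomputed input (an assumption of the sketch), and the candidate-selection helper is my own naming:

```python
# Mahalanobis distance between a sampled profile g and the mean profile g_bar:
# f(g) = (g - g_bar)^T * S_inv * (g - g_bar).
def mahalanobis_sq(g, g_bar, S_inv):
    d = [gi - mi for gi, mi in zip(g, g_bar)]
    Sd = [sum(S_inv[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return sum(di * sdi for di, sdi in zip(d, Sd))

def best_offset(profiles, g_bar, S_inv):
    """Among candidate profiles sampled along the whisker, pick the lowest cost."""
    costs = [mahalanobis_sq(g, g_bar, S_inv) for g in profiles]
    return min(range(len(costs)), key=costs.__getitem__)
```

With the identity as S_inv the cost reduces to squared Euclidean distance, and the candidate profile closest to ḡ wins, which is the selection step described above.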
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is used in this example.
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image; however, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM search starts where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is a poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
eigenvectors. To improve visibility we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible; nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
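The ellipse-fitting step can be sketched via the second-moment matrix of a region's pixels; an ellipse with the same second moments as the region has semi-axes 2√λ along the eigenvectors of that matrix. This is an illustrative reconstruction, not the authors' code:

```python
# Fit an ellipse with the same second moments as a pixel region:
# compute the 2x2 covariance of the pixel coordinates and take its
# eigen-decomposition analytically.
from math import sqrt, atan2

def fit_ellipse(pixels):
    """pixels: list of (x, y). Returns (center, (major, minor) axes, orientation)."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    sxx = sum((x - cx)**2 for x, _ in pixels) / n
    syy = sum((y - cy)**2 for _, y in pixels) / n
    sxy = sum((x - cx) * (y - cy) for x, y in pixels) / n
    # eigenvalues of [[sxx, sxy], [sxy, syy]]
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    disc = sqrt(max(tr * tr / 4 - det, 0.0))
    l1, l2 = tr / 2 + disc, tr / 2 - disc
    angle = atan2(l1 - sxx, sxy) if sxy else 0.0
    # semi-axes of the equivalent uniform ellipse are 2*sqrt(lambda)
    return (cx, cy), (2 * sqrt(l1), 2 * sqrt(max(l2, 0.0))), angle
```

For an axis-aligned block of pixels, the fitted ellipse is centered on the block with its major axis along the longer side and zero orientation, as expected.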
3.3 Stable Regions Across Scale
Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
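The scale-stability selection can be sketched with regions simplified to pixel sets; the real detector compares elliptical regions using the overlap error of [19], so this is only a schematic with an assumed error threshold:

```python
# Keep a region at scale i only if it overlaps sufficiently (overlap error
# below a threshold) with some region at scales i-1 and i+1, i.e. it is
# detected across a triplet of consecutive scales.
def overlap_error(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two pixel sets."""
    inter = len(a & b)
    union = len(a | b)
    return 1.0 - inter / union if union else 1.0

def stable_regions(scales, max_err=0.3):
    """scales: per-scale lists of frozenset pixel regions.
    Returns (scale index, region) pairs matched at both adjacent scales."""
    kept = []
    for i in range(1, len(scales) - 1):
        for r in scales[i]:
            prev_ok = any(overlap_error(r, q) <= max_err for q in scales[i - 1])
            next_ok = any(overlap_error(r, q) <= max_err for q in scales[i + 1])
            if prev_ok and next_ok:
                kept.append((i, r))
    return kept
```

A region that reappears (with small jitter) at three consecutive scales survives; a region seen at only one scale is discarded.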
2 BACKGROUND AND RELATED WORK
Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:
1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but automating this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and a spatial-coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial-coherence priors by minimizing the following energy function (E-step):
min_L Σ_p ||f_p − c_{L_p}||²₂ + λ Σ_{{p,q} ∈ N} |e_{pq}| · δ[L_p ≠ L_q]    (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, and δ is the delta function.
The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to. The second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After the spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
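The energy in Eq. (1) can be evaluated directly for a candidate labeling. This sketch only computes what the graph-cut solver minimizes; the minimization itself and the M-step re-estimation are not reproduced here:

```python
# Evaluate the E-step energy of Eq. (1): an appearance term (squared distance
# of each pixel's feature to its cluster center) plus a spatial-coherence
# penalty on neighboring pixels with different labels, weighted by edge length.
def energy(labels, features, centers, edges, lam=1.0):
    """labels[p]: cluster of pixel p; features[p]: color vector;
    centers[i]: mean chromatic vector; edges: (p, q, length) neighbor pairs."""
    appearance = sum(
        sum((f - c)**2 for f, c in zip(features[p], centers[labels[p]]))
        for p in range(len(labels)))
    coherence = sum(length for p, q, length in edges if labels[p] != labels[q])
    return appearance + lam * coherence
```

On a two-pixel toy example, assigning each pixel to its own matching center costs only the coherence penalty, while forcing both into one cluster costs more in the appearance term, so the energy correctly prefers the former.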
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper, we evaluate the recent method proposed by Schoenemann et al. [7] that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, in this approach curvature is modeled in a discrete sense (a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus, and the trigger is adjusted to 100 HU (Hounsfield Units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed at the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in the coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.
B Method
The stages followed while performing lung segmentation from the CTA images in this work are shown in Figure 1.
There are 250 2D CTA images in hand. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding has first been applied so as to keep the parts greater than 700 HU. At the end of thresholding, the new images are logical (binary):

Thresh = image > 700
In each of these new images, subsegmental vessels exist in the lung region. In the second step, to get rid of these vessels, each 2D image has been considered one by one and each component in the image has been labeled with the "connected component labeling algorithm". Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).
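The thresholding and small-component removal just described can be sketched as follows. This is a pure-Python stand-in for the MATLAB operations (the flood fill replaces the toolbox labeling routine); the 700 threshold and 1000-pixel cutoff come from the text, while the 4-connectivity is an assumption of the sketch:

```python
# Step 1: threshold (MATLAB: Thresh = image > 700).
# Step 2: label components and remove those under a pixel-count cutoff.
def threshold(image, level=700):
    return [[1 if v > level else 0 for v in row] for row in image]

def remove_small_components(mask, min_pixels=1000):
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in mask]
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                stack, comp = [(sy, sx)], []
                seen[sy][sx] = True
                while stack:               # iterative flood fill (4-connectivity)
                    y, x = stack.pop()
                    comp.append((y, x))
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if len(comp) < min_pixels:  # small component: treat as vessel, erase
                    for y, x in comp:
                        out[y][x] = 0
    return out
```

On a toy image (with a small cutoff standing in for 1000), the connected 3-pixel region survives while the isolated pixel is removed.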
3 PLOT TOOLS MATLAB provides a collection of plotting tools to generate various types of graphs displaying the image histogram or plotting the profile of intensity values (Fig 3ab) Figure
Figure 3 a - The Histogram of X-ray CT image and the plot fits (significant digits 2) A cubic fitting function is the best-fit model for histogram data plot The fit curve was plotted as a magenta line through the data plot Area Graph of X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot generates a matrix as a surface (Figures 4 a b c d) We can also make the faces of a surface transparent to a varying degree Transparency (referred to as the alpha value) can be specified for the whole 3D-object or can be based on an alphamap which behaves in a way analogous to colormaps (Figures 4 a b)
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The rsquorsquomeshgridrsquo function is extremely useful for computing a function of two Cartesian coordinates It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c)
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
3-D Surface Plot with Contour (Surfc) displays a matrix as a surface with contour plot below rsquorsquoLightingrsquorsquo is the technique of illuminating an object with a directional light source This technique can make subtle differences in surface shape easier to see rsquorsquoLightingrsquorsquo processing can also be used to add realism to three-dimensional graphs This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d)
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values lightening
The rsquorsquoImagersquorsquo creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figures colormap or directly as RGB values depending on the data specified (Figure 5a)The rsquorsquoImagersquorsquo with Colormap Scaling (rsquorsquoimagescrsquorsquo function) displays an X-ray CT image and scale to use full colormap MATLAB supports a number of colormaps A colormap is an m-by-3 matrix of real numbers between 00 and 10 Each row is an RGB vector that defines one color Jetrsquorsquo ranges from blue to red and passes through the colors cyan yellow and orange It is a variation of the hsv (hue saturation value) colormap (Figure 5b)
Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) function creates a combined surface/contour graph of f(x,y), where f is a string that represents a mathematical function of two variables x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool) The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).
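FVTool itself is MATLAB-specific, but the same responses can be computed outside MATLAB. A minimal Python sketch (the b, a coefficients here are an illustrative Butterworth design, not a filter from this work):

```python
import numpy as np
from scipy import signal

# Illustrative low-pass filter; FVTool takes the same (b, a) numerator/denominator form.
b, a = signal.butter(4, 0.3)            # 4th-order Butterworth, cutoff at 0.3 * Nyquist
w, h = signal.freqz(b, a, worN=512)     # complex frequency response H(e^jw)

magnitude_db = 20 * np.log10(np.abs(h))  # magnitude response in dB
phase = np.unwrap(np.angle(h))           # phase response
w_gd, gd = signal.group_delay((b, a))    # group delay response
```

Plotting w against magnitude_db and phase reproduces the linear-frequency-scale views shown in Figures 11 and 12.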
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole-Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. The shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two shapes, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root-mean-square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).
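These two definitions can be sketched as follows (Python/NumPy here purely for illustration; the vector layout follows the 2n × 1 convention above, x co-ordinates followed by y co-ordinates):

```python
import numpy as np

def centroid_and_size(shape_vec):
    """shape_vec: length-2n vector, x co-ordinates followed by y co-ordinates."""
    n = len(shape_vec) // 2
    xs, ys = shape_vec[:n], shape_vec[n:]
    cx, cy = xs.mean(), ys.mean()                      # centroid = mean point position
    # size = root-mean-square distance of the points from the centroid
    size = np.sqrt(np.mean((xs - cx) ** 2 + (ys - cy) ** 2))
    return (cx, cy), size

# Unit square: centroid (0.5, 0.5), every corner at distance sqrt(0.5) from it
square = np.array([0, 1, 1, 0, 0, 0, 1, 1], dtype=float)
```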
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centred on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: set of aligned shapes and the mean shape
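Algorithm 1 can be sketched as follows (Python/NumPy for illustration; simplified to translation and scaling only, whereas the full algorithm also removes rotation via Procrustes alignment):

```python
import numpy as np

def center(shape):
    """Translate an (n, 2) point array so its centroid is at the origin (step 2)."""
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    """Scale so the RMS distance of the points from the centroid is 1."""
    return shape / np.sqrt((shape ** 2).sum(axis=1).mean())

def align_shapes(shapes, iters=10):
    """Simplified Algorithm 1: with rotation ignored, aligning a centred shape
    to the mean reduces to scaling it to unit size."""
    shapes = [center(s) for s in shapes]              # step 2
    mean = to_unit_size(shapes[0])                    # step 3: reference shape x0
    for _ in range(iters):                            # step 4
        aligned = [to_unit_size(s) for s in shapes]   # (a) align to the mean
        new_mean = np.mean(aligned, axis=0)           # (b) recalculate the mean
        new_mean = to_unit_size(center(new_mean))     # (c) constrain the mean
        if np.allclose(new_mean, mean):               # step 5: convergence
            break
        mean = new_mean
    return aligned, mean
```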
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked the ASM produces two types of
sub-models [24] These are the profile model and the shape model
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model more closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So while the profile model tries to find the area in the test image that best fits each profile, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that most closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:
x̂ = x̄ + Φb    (4.3)
where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i, and
Φ is the matrix of eigenvectors of the covariance of the training shapes, with b a vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
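Shape generation from the model can be sketched as follows (Python/NumPy for illustration; the training matrix here is random toy data standing in for the aligned landmark shapes):

```python
import numpy as np

# Toy training set: rows are shape vectors (x co-ordinates then y co-ordinates).
# In the real ASM these come from the aligned, hand-annotated images.
rng = np.random.default_rng(0)
X = rng.normal(size=(11, 8))                     # 11 training shapes, 4 points each

x_mean = X.mean(axis=0)                          # mean shape x̄
cov = np.cov(X - x_mean, rowvar=False)           # covariance of the training shapes
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]                # largest variance first
Phi = eigvecs[:, order[:3]]                      # keep the first t = 3 modes

def generate_shape(b):
    """x̂ = x̄ + Φ b  — a new shape from the parameter vector b."""
    return x_mean + Phi @ b

shape0 = generate_shape(np.zeros(3))             # b = 0 reproduces the mean shape
```

Varying one component of b at a time shows the corresponding mode of variation.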
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at each landmark are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and covariance matrix S_g.
4.2.3 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)
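This profile distance can be sketched as (Python/NumPy for illustration):

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S_g):
    """Squared Mahalanobis distance between a sampled profile g and the
    mean profile ḡ with covariance S_g: (g - ḡ)ᵀ S_g⁻¹ (g - ḡ)."""
    d = g - g_mean
    return float(d @ np.linalg.solve(S_g, d))   # solve() avoids an explicit inverse

# With an identity covariance this reduces to the squared Euclidean distance.
```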
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is close to the mean shape. The shape model ensures that the profile model has not distorted the shape; if the shape model were not employed, the profile model might give the best profile matches but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone X-ray, is shown).
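Such an image pyramid can be sketched as (Python/SciPy; Gaussian smoothing followed by subsampling is one common construction, not necessarily the exact one used here):

```python
import numpy as np
from scipy import ndimage

def image_pyramid(image, levels=3):
    """Build a simple multi-resolution pyramid: blur with a Gaussian,
    then halve the resolution at each level."""
    pyramid = [image]
    for _ in range(levels - 1):
        smoothed = ndimage.gaussian_filter(pyramid[-1], sigma=1.0)
        pyramid.append(smoothed[::2, ::2])       # subsample by a factor of 2
    return pyramid

pyr = image_pyramid(np.zeros((64, 64)), levels=3)
```

The search starts on the coarsest level and the result is refined on each finer level.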
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
3. Inpaint the regions of the top layer.
The three steps are repeated as long as more than two layers remain.
2.1 De-pict Algorithm
Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes in the same layer of the painting have similar colors, the regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each cluster can be described by its mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues. Ideally this step should be fully automatic, but automating this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.
3.1 Spatially coherent segmentation
We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e. the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):
min_L Σ_p ‖f_p − c_{L_p}‖₂² + λ Σ_{{p,q}∈N} |e_{pq}| · δ(L_p ≠ L_q)    (1)
where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, δ is the delta function, and λ weights the spatial smoothness term. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels into clusters. After the spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
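A simplified sketch of this E-M loop (Python/NumPy; purely for illustration, the E-step here uses a naive ICM-style per-pixel update in place of the graph-cut solver used in the paper, and edge lengths are taken as 1):

```python
import numpy as np

def segment_layers(features, edges, k=3, lam=1.0, iters=5, seed=0):
    """features: (n_pixels, d) chromatic vectors; edges: list of (p, q)
    neighbor index pairs. Returns per-pixel labels and cluster centers."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # E-step: assign each pixel to minimize appearance + smoothness cost
        for p in range(len(features)):
            app = ((features[p] - centers) ** 2).sum(axis=1)   # ||f_p - c_i||^2
            smooth = np.array([sum(labels[q] != i for a, q in edges if a == p)
                               for i in range(k)], dtype=float)
            labels[p] = int(np.argmin(app + lam * smooth))
        # M-step: re-estimate each center as the mean chromatic vector
        for i in range(k):
            if (labels == i).any():
                centers[i] = features[labels == i].mean(axis=0)
    return labels, centers
```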
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore such methods are well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of the underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g. regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (a possible reconstruction of a level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and pairs of line segments are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of the angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e. surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. By properly handling all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield Units). 70 ml of nonionic contrast agent at a rate of 4 ml/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.
B Method
The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1. The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding keeps the parts whose values are greater than 700 HU. At the end of thresholding, the new images are logical (binary):

Thresh = image > 700
In each of these new images, sub-segment vessels are present in the lung region. In the second step, to get rid of these vessels, each 2D image is considered one by one and each component in the image is labeled with the connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).
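This labeling-and-removal step can be sketched as follows (Python/SciPy for illustration, with scipy.ndimage.label playing the role of the connected component labeling algorithm; the thesis itself works in MATLAB):

```python
import numpy as np
from scipy import ndimage

def remove_small_components(binary, min_pixels=1000):
    """Label the connected components of a binary slice and drop those
    smaller than min_pixels, as done here to remove sub-segment vessels."""
    labels, n = ndimage.label(binary)
    sizes = np.asarray(ndimage.sum(binary, labels, index=range(1, n + 1)))
    keep_ids = 1 + np.flatnonzero(sizes >= min_pixels)   # labels start at 1
    return np.isin(labels, keep_ids) & binary.astype(bool)
```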
Next, the image in Figure 3 is labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept and the other parts are removed from the image; then its complement is taken, so all the "0" pixels turn into "1" and all the "1" pixels turn into "0" (Figure 4). Since the parts outside the body in the image shown in Figure 4 touch the image border (pixel row or column 1 or 512) and are logical 1, the parts that satisfy this condition are removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled again with the connected component labeling algorithm, and the components whose pixel counts are below 1000 are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways are removed, the edges of the image are found with the Sobel algorithm and added to the original image, so the edges of the lung and airway region are shown on the original image (Figure 6b). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6c).
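The edge overlay and masking steps can be sketched as follows (Python/SciPy for illustration; the function and variable names are hypothetical):

```python
import numpy as np
from scipy import ndimage

def overlay_edges_and_mask(original, lung_mask):
    """Find the boundary of the segmented lung region with a Sobel filter,
    add it onto the original slice (as in Figure 6b), and multiply the mask
    with the original image to get the segmented lung (as in Figure 6c)."""
    sx = ndimage.sobel(lung_mask.astype(float), axis=0)
    sy = ndimage.sobel(lung_mask.astype(float), axis=1)
    edges = np.hypot(sx, sy) > 0                      # boundary pixels of the mask
    with_edges = original + edges * original.max()    # draw the edges bright
    segmented = original * lung_mask                  # masked (segmented) lung image
    return with_edges, segmented
```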
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, and the FFT, DCT and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools for thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The "meshgrid" function is extremely useful for evaluating a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or by two vectors x and y, into matrices X and Y for use in evaluating functions of two variables: the rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
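NumPy's meshgrid mirrors MATLAB's behaviour; a small sketch of the row/column copying described above, followed by the evaluation of a function of two variables on the grid:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])

# Rows of X are copies of x; columns of Y are copies of y
X, Y = np.meshgrid(x, y)     # both are 2x3 arrays

# Evaluate a function of two variables over the whole grid at once
Z = X**2 + Y
print(X)
print(Y)
print(Z)
```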
The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source; it can make subtle differences in surface shape easier to see, and can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The "image" with Colormap Scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) (or surfc) function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
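For illustration, the magnitude and phase response that FVTool displays can be computed directly from the filter coefficients. The sketch below (Python/NumPy, not MATLAB, with a made-up 4-point moving-average filter) evaluates H(e^jw) on a frequency grid:

```python
import numpy as np

b = np.ones(4) / 4.0        # numerator: 4-point moving average (assumed example)
a = np.array([1.0])         # denominator: FIR, so a = [1]

w = np.linspace(0, np.pi, 512, endpoint=False)   # frequency grid [0, pi)

def response(coeffs, w):
    # Evaluate sum_k coeffs[k] * e^{-j w k} at each frequency
    return np.exp(-1j * np.outer(w, np.arange(len(coeffs)))) @ coeffs

H = response(b, w) / response(a, w)   # frequency response H(e^jw)
magnitude = np.abs(H)                  # what the magnitude plot shows
phase = np.unwrap(np.angle(H))         # what the phase plot shows
print(magnitude[0])                    # DC gain equals the sum of b
```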
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n x 2 array where the n rows represent the number of points and the two columns hold the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape is defined as a 2n x 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can then be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)   (4.1)
The centroid of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).
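The conventions above (2n-vector representation, centroid, and size) can be sketched in a few lines; the points below are made up for illustration and are not the thesis data:

```python
import numpy as np

# A shape with n = 4 made-up points, stored as an n x 2 array
points = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]])

# Thesis convention: a 2n-vector with all x co-ordinates, then all y
shape_vec = np.concatenate([points[:, 0], points[:, 1]])

centroid = points.mean(axis=0)   # mean of the point positions
# Size: root mean square distance of the points from the centroid
size = np.sqrt(((points - centroid) ** 2).sum(axis=1).mean())
print(shape_vec, centroid, size)
```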
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align it to x0 and scale it to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and the mean shape
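Algorithm 1 can be sketched as follows. This is an illustrative simplification that aligns with translation and scaling only; a full implementation would also solve for rotation (e.g., via Procrustes analysis), and the random shapes are stand-ins for training data:

```python
import numpy as np

def center(s):
    # Step 2: translate shape (n x 2) so its centroid is at the origin
    return s - s.mean(axis=0)

def to_unit_size(s):
    # Scale so the RMS distance of points from the origin is 1
    return s / np.sqrt((s ** 2).sum(axis=1).mean())

rng = np.random.default_rng(0)
shapes = [rng.normal(size=(10, 2)) * k + k for k in (1.0, 2.0, 3.0)]

shapes = [center(s) for s in shapes]       # step 2
x0 = to_unit_size(shapes[0])               # step 3: initial mean shape
mean = x0
for _ in range(10):                        # steps 4-5: iterate
    aligned = [to_unit_size(s) for s in shapes]            # (a), simplified
    mean = to_unit_size(center(np.mean(aligned, axis=0)))  # (b) + (c)

size = np.sqrt((mean ** 2).sum(axis=1).mean())
print(size)   # the constrained mean shape has unit size
```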
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the images were re-sized to the same dimensions; this ensured uniformity in the quality of the data being used. Training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in a test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated together with the permissible variations around it [24]:

x_hat = x_bar + Phi * b   (4.3)

where x_hat is the shape vector generated by the model, x_bar is the mean shape (the average of the aligned training shapes x_i), Phi is the matrix whose columns are the principal modes of variation of the training shapes, and b is the vector of shape parameters.
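The shape-generation step can be sketched numerically. The snippet below is illustrative, not the thesis code: random vectors stand in for aligned training shapes, the basis matrix is obtained from an SVD of the mean-centred training set, and varying b generates new shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(11, 8))        # 11 stand-in training shapes, 2n = 8
x_bar = X.mean(axis=0)              # mean shape

# Principal modes of variation from the SVD of the centred training set
_, _, Vt = np.linalg.svd(X - x_bar, full_matrices=False)
Phi = Vt[:2].T                      # keep the first two modes (8 x 2)

b = np.array([1.5, -0.5])           # shape parameters
x_hat = x_bar + Phi @ b             # generated shape vector
print(x_hat.shape)
```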
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at the landmark points are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile g_bar and covariance matrix Sg.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile g_bar is calculated using the Mahalanobis distance:

d(g) = (g - g_bar)^T Sg^(-1) (g - g_bar)
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the shape remains consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust; this enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image, and a general picture rather than a bone X-ray is used for illustration.
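The profile search above ranks candidate positions by Mahalanobis distance; a minimal sketch with made-up profile values (not trained from real X-rays):

```python
import numpy as np

# Made-up mean profile and diagonal covariance for illustration
g_bar = np.array([0.1, 0.4, 0.9, 0.4, 0.1])   # mean training profile
S_g = np.eye(5) * 0.05                        # profile covariance matrix
S_inv = np.linalg.inv(S_g)

def mahalanobis(g):
    d = g - g_bar
    return float(d @ S_inv @ d)

# A profile resembling g_bar scores lower than a dissimilar one
close = mahalanobis(np.array([0.12, 0.41, 0.88, 0.38, 0.10]))
far = mahalanobis(np.array([0.9, 0.9, 0.9, 0.9, 0.9]))
print(close, far)
```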
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image; however, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, i.e. started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
where Lp ∈ {1, ..., k} is the cluster label of pixel p, fp is the color feature of pixel p, ci is the color model for cluster i, |epq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After this spatially coherent refinement we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E- and M-steps until convergence or until a predefined number of iterations is reached.
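As a hedged sketch of the energy being minimized in Eq. (1): the function below uses squared color distance as the appearance term and unit edge weights (|e_pq| = 1) in the Potts smoothness term, which is a simplification of the paper's formulation:

```python
import numpy as np

def energy(features, labels, models, lam=1.0):
    # Appearance term: squared color distance of each pixel to the
    # model of the cluster it is assigned to
    app = sum(np.sum((features[labels == i] - c) ** 2)
              for i, c in enumerate(models))
    # Potts smoothness term: one penalty per 4-neighbor label disagreement
    smooth = np.sum(labels[:, 1:] != labels[:, :-1]) \
           + np.sum(labels[1:, :] != labels[:-1, :])
    return app + lam * smooth

# Toy example: a 2x2 image whose pixels all match cluster 0's model
features = np.zeros((2, 2, 3))
models = [np.zeros(3), np.ones(3)]
uniform = np.array([[0, 0], [0, 0]])   # coherent labeling
noisy = np.array([[0, 1], [0, 0]])     # one mislabeled pixel
print(energy(features, uniform, models), energy(features, noisy, models))
```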
3.2 Curvature-based inpainting
Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Here, level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless, so curvature-based inpainting can be superior to exemplar-based methods (for instance, De-pict and Criminisi et al.3) for recovering the structures of the underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.,7 which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g. regions with holes. In the following we briefly review Schoenemann et al.'s method. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e. surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
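The discrete curvature measure described above, the sum of angle changes at the vertices of a level line, can be sketched for a polygonal line (illustrative Python, without the edge-length weighting, and not Schoenemann et al.'s implementation):

```python
import math

def turning_angle_sum(points):
    # Approximate curvature of a polyline by summing the absolute
    # turning angles at its interior vertices
    total = 0.0
    for i in range(1, len(points) - 1):
        ax, ay = points[i][0] - points[i-1][0], points[i][1] - points[i-1][1]
        bx, by = points[i+1][0] - points[i][0], points[i+1][1] - points[i][1]
        d = abs(math.atan2(by, bx) - math.atan2(ay, ax))
        total += min(d, 2 * math.pi - d)   # wrap angle difference to [0, pi]
    return total

straight = turning_angle_sum([(0, 0), (1, 0), (2, 0)])   # no turning
corner = turning_angle_sum([(0, 0), (1, 0), (1, 1)])     # one 90-degree turn
print(straight, corner)
```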
II. MATERIALS AND METHODS
A. Data Retrieval
In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice was taken at the level of the pulmonary truncus. A bolus-tracking region was placed at the pulmonary truncus and the trigger was adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec was delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reached the pre-adjusted level, the exam was performed from the supraclavicular region to the diaphragms. Contrast injection was performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes; oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.
B. Method
The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1. The CTA images at hand are 250 2-D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels) and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so that parts brighter than 700 HU are retained; after thresholding, the new images are logical (binary):

Thresh = image > 700;
Each of these new images still contains subsegmental vessels in the lung region. In the second step these vessels are removed: each 2-D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).
Next, the image in Figure 3 is labeled again with the connected-component labeling algorithm. The biggest component of logical 1s is the patient's body; this biggest component is kept, the other parts are removed from the image, and the result is inverted, so all 0s turn into 1s and all 1s turn into 0s (Figure 4).
Since the parts outside the body in the image shown in Figure 4 touch the 1st or 512th pixel (the image border), the parts that meet this condition are removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected-component labeling algorithm, and the components whose pixel counts are below 1000 are determined to be airways and removed from the image. The image now at hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
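The component-filtering steps above (threshold, label connected components, drop small components, keep the largest) can be sketched with scipy.ndimage on a toy slice; the values below are illustrative stand-ins, not the study's data:

```python
import numpy as np
from scipy import ndimage

def remove_small(mask, min_pixels):
    # Label connected components and drop those below the size threshold
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_pixels]
    return np.isin(labels, keep)

def largest_component(mask):
    # Keep only the biggest labeled component (the "body")
    labels, n = ndimage.label(mask)
    sizes = [np.sum(labels == i) for i in range(1, n + 1)]
    return labels == (int(np.argmax(sizes)) + 1)

img = np.zeros((8, 8), dtype=int)
img[1:6, 1:6] = 800          # bright block standing in for the body
img[7, 7] = 900              # tiny bright speck (a "vessel")

thresh = img > 700           # logical image, as in the text
body = largest_component(remove_small(thresh, min_pixels=4))
print(body.sum())            # only the 5x5 block survives
```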
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization which furnishes more information about the spatial relationships of different structures. The toolbox functions implemented in the open MATLAB language have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2-D and 3-D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop the customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness
corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section which
is called a rsquorsquovoxelrsquorsquo [13] The Pixel Region tool provided by MATLAB 701 superimposes
the pixel region rectangle over the image displayed in the Image Tool defining the group of
pixels that are displayed in extreme close-up view in the Pixel Region tool window The
Pixel Region tool shows the pixels at high magnification overlaying each pixel with its
numeric value [25] For RGB images we find three numeric values one for each band of the
image We can also determine the current position of the pixel region in the target image by
using the pixel information given at the bottom of the tool In this way we found the x- and y-
coordinates of pixels in the target image coordinate system The Adjust Contrast tool displays
a histogram which represents the dynamic range of the X-ray CT image (Figure1)
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provide a reference-standard algorithms and graphical tools for image analysis tasks including edge-detection and image segmentation algorithms image transformation measuring image features and statistical functions such as mean median standard deviation range etc (Figure 2
3 PLOT TOOLS MATLAB provides a collection of plotting tools to generate various types of graphs displaying the image histogram or plotting the profile of intensity values (Fig 3ab) Figure
Figure 3 a - The Histogram of X-ray CT image and the plot fits (significant digits 2) A cubic fitting function is the best-fit model for histogram data plot The fit curve was plotted as a magenta line through the data plot Area Graph of X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot generates a matrix as a surface (Figures 4 a b c d) We can also make the faces of a surface transparent to a varying degree Transparency (referred to as the alpha value) can be specified for the whole 3D-object or can be based on an alphamap which behaves in a way analogous to colormaps (Figures 4 a b)
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The rsquorsquomeshgridrsquo function is extremely useful for computing a function of two Cartesian coordinates It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c)
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
3-D Surface Plot with Contour (Surfc) displays a matrix as a surface with contour plot below rsquorsquoLightingrsquorsquo is the technique of illuminating an object with a directional light source This technique can make subtle differences in surface shape easier to see rsquorsquoLightingrsquorsquo processing can also be used to add realism to three-dimensional graphs This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d)
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values lightening
The rsquorsquoImagersquorsquo creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figures colormap or directly as RGB values depending on the data specified (Figure 5a)The rsquorsquoImagersquorsquo with Colormap Scaling (rsquorsquoimagescrsquorsquo function) displays an X-ray CT image and scale to use full colormap MATLAB supports a number of colormaps A colormap is an m-by-3 matrix of real numbers between 00 and 10 Each row is an RGB vector that defines one color Jetrsquorsquo ranges from blue to red and passes through the colors cyan yellow and orange It is a variation of the hsv (hue saturation value) colormap (Figure 5b)
Contour Plot is useful for delineating organ boundaries in images It displays isolines of a surface represented by a matrix (Figure 6) For example Figure
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n x 2 array in which the n rows represent the number of points and the two columns hold the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape is defined as a 2n x 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. The shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them; Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)    (4.1)
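The distances just defined can be sketched directly. One detail the text leaves open is whether the shape-to-shape distance is the sum or the mean of the corresponding point distances; the sketch below uses the mean, which is only one reasonable choice.

```python
import math

def point_dist(p, q):
    """Euclidean distance between two points (x1, y1) and (x2, y2)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def shape_dist(s1, s2):
    """Distance between shapes: mean distance between corresponding points."""
    return sum(point_dist(p, q) for p, q in zip(s1, s2)) / len(s1)

d = point_dist((0, 0), (3, 4))   # the classic 3-4-5 triangle: 5.0
```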
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or devising an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid; this can be used in measuring size in the test image, which helps with the automatic initialization (discussed in Section 4.4).
Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the 1st shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: set of aligned shapes and mean shape
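The steps of Algorithm 1 can be sketched with NumPy. This is a hedged re-implementation, not the thesis code: shapes are (n, 2) arrays, each is centered and scaled to unit size, and rotation to the current mean is solved with orthogonal Procrustes (SVD). Real ASM implementations differ in details such as whether scaling is re-estimated during alignment.

```python
import numpy as np

def center(shape):
    return shape - shape.mean(axis=0)           # move centroid to the origin

def align_to(shape, ref):
    """Rotate a centered shape to best match 'ref' in the least-squares sense."""
    u, _, vt = np.linalg.svd(shape.T @ ref)     # orthogonal Procrustes
    return shape @ (u @ vt)

def align_shapes(shapes, iters=10):
    # Step 2-3: center every shape, scale all to unit size, take 1st as reference.
    shapes = [center(s) / np.linalg.norm(center(s)) for s in shapes]
    mean = shapes[0]
    for _ in range(iters):                      # step 4: iterate to convergence
        shapes = [align_to(s, mean) for s in shapes]
        mean = np.mean(shapes, axis=0)
        mean /= np.linalg.norm(mean)            # constrain the mean shape
    return shapes, mean
```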
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 1.2) and the resulting images were re-sized to the same dimensions; this ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative location of the landmarks is obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which places a constraint on the shape. As the profile model searches for the area of the test image that best fits its profiles, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image, so the two models correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that most closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations about it [24]:

x̂ = x̄ + Pb    (4.3)

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes x_i), P is the matrix of eigenvectors of the covariance of the training shapes, and b is the vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b; the model is varied in height and width to find the optimum values for the landmarks.
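The generation step can be sketched as follows, under the standard ASM construction: the modes P come from PCA on the aligned training shapes (each a 2n-vector with the y co-ordinates after the x co-ordinates), and new shapes are produced by varying b. The training data below is synthetic, purely for illustration.

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """PCA shape model: mean shape plus the top eigenvectors of the covariance."""
    X = np.asarray(shapes)                    # (num_shapes, 2n)
    x_mean = X.mean(axis=0)
    cov = np.cov(X - x_mean, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_modes]  # keep the largest modes
    return x_mean, vecs[:, order]

def generate_shape(x_mean, P, b):
    return x_mean + P @ b                     # the shape equation x_hat = x_mean + P b

shapes = np.random.default_rng(0).normal(size=(11, 8))  # 11 shapes, n = 4 points
x_mean, P = build_shape_model(shapes)
x_hat = generate_shape(x_mean, P, np.zeros(2))          # b = 0 gives the mean shape
```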
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix S_g.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g - ḡ)^T S_g^{-1} (g - ḡ)
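The profile match measure can be sketched directly: the (squared) Mahalanobis distance between a sampled test profile g and the mean profile learnt in training, with S_g the profile covariance matrix. With an identity covariance it reduces to the squared Euclidean distance, which the example exploits.

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S_g):
    """Squared Mahalanobis distance (g - g_mean)^T S_g^{-1} (g - g_mean)."""
    d = g - g_mean
    return float(d @ np.linalg.solve(S_g, d))  # solve instead of explicit inverse

g_mean = np.array([0.2, 0.5, 0.3])
S_g = np.eye(3)                        # identity covariance: squared Euclidean
g = np.array([1.2, 0.5, 0.3])
dist = mahalanobis_sq(g, g_mean, S_g)  # unit offset in one component
```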
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model confirms that the result remains consistent with the mean shape; the shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile matches but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust; this makes the model more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (of a general picture, not a bone); the sizes of the images are given relative to the first image.
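The image pyramid can be sketched as repeated 2x down-sampling; the version below uses plain 2x2 average pooling for brevity, whereas real multi-resolution code typically smooths before subsampling.

```python
import numpy as np

def build_pyramid(img, levels=3):
    """Return a list of images, each level half the size of the previous one."""
    pyramid = [img]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape
        p = pyramid[-1][:h - h % 2, :w - w % 2]       # drop odd edge row/column
        p = p.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))  # 2x2 averaging
        pyramid.append(p)
    return pyramid

pyr = build_pyramid(np.ones((64, 64)))
sizes = [p.shape for p in pyr]   # (64, 64), (32, 32), (16, 16)
```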
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on; the number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The dataset in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and since it starts away from the bone the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
II MATERIALS AND METHODS
A Data Retrieval
In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus; a bolus-tracking region is placed at the pulmonary truncus and the trigger is set to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes; oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.
B Method
The stages followed in performing lung segmentation from the CTA images in this work are shown in Figure 1.
The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding retains the parts brighter than 700 HU; after thresholding, the new images are of logical (binary) type:

Thresh = image > 700
Each of these new images contains subsegmental vessels within the lung region. The second step removes these vessels: each 2D image is considered one by one, and each component in the image is labeled with a connected component labelling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).
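This clean-up step can be sketched in Python (an illustrative re-implementation, not the authors' code): label the connected components of the binary image with scipy and remove every component with fewer than 1000 pixels, keeping the large structures. The same idea, applied again later, separates the small airways from the lungs.

```python
import numpy as np
from scipy import ndimage

def remove_small_components(binary, min_pixels=1000):
    """Keep only connected components of at least min_pixels pixels."""
    labels, n = ndimage.label(binary)        # connected component labelling
    sizes = np.bincount(labels.ravel())      # pixel count per label (0 = background)
    keep = sizes >= min_pixels
    keep[0] = False                          # background is never kept
    return keep[labels]                      # boolean image of surviving components

mask = np.zeros((64, 64), dtype=bool)
mask[2:4, 2:4] = True                        # 4-pixel speck (vessel-like): removed
mask[10:50, 10:50] = True                    # 1600-pixel block (lung-like): kept
cleaned = remove_small_components(mask)
```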
Next, the image in Figure 3 is labeled with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body; this biggest component is kept and the other parts are removed from the image. Then the complement is taken, so all 0s turn into 1s and all 1s turn into 0s (Figure 4). Since the parts outside the body in the image shown in Figure 4 reach column 1 or column 512, the parts that meet this condition are removed, and the lungs and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected component labeling algorithm and the components whose pixel counts are below 1000 are identified as airways and removed from the image. The image now in hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so the edges of the lung and airway region are shown on the original image (Figure 6b). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6c).
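The edge-overlay step can be sketched as follows (a minimal stand-in using scipy's Sobel filter on the binary mask, not the authors' code): the gradient magnitude of the mask is nonzero exactly on its boundary, which can then be drawn onto the original image.

```python
import numpy as np
from scipy import ndimage

def mask_edges(mask):
    """Boundary of a binary mask via Sobel gradient magnitude."""
    gx = ndimage.sobel(mask.astype(float), axis=1)  # horizontal gradient
    gy = ndimage.sobel(mask.astype(float), axis=0)  # vertical gradient
    return np.hypot(gx, gy) > 0                     # True on the boundary only

mask = np.zeros((32, 32), dtype=bool)
mask[8:24, 8:24] = True                             # a square "lung" region
edges = mask_edges(mask)                            # its outline
```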
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it conveys more information about the spatial relationships of different structures. Toolbox functions implemented in the open MATLAB language were also used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. Toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.
An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, called a 'voxel' [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed, in extreme close-up view, in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image from the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformations, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).
Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D surface plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables: the rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
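NumPy's meshgrid mirrors MATLAB's, so the behaviour just described can be demonstrated directly: coordinate vectors become full coordinate matrices, and a function of two variables can then be evaluated on the whole grid at once.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)  # rows of X are copies of x; columns of Y are copies of y
Z = X + Y                 # evaluate f(x, y) = x + y over the entire grid
```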
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
The 3-D surface plot with contour (surfc) displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source; it can make subtle differences in surface shape easier to see and can add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values, lighting
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
There are 250 2D CTA images at hand. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels) and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Because of the large intensity difference between these two groups, thresholding gives a good separation. In this study the images are first thresholded so that only the parts brighter than 700 HU are kept, so the resulting images are binary (logical):
Thresh = image > 700
Each of these new binary images still contains subsegmental vessels in the lung region. In the second step these vessels are removed: each 2D image is considered one by one, every component in the image is labeled with a connected-component labeling algorithm, and, looking at the size of each labeled piece, items with fewer than 1000 pixels are removed from the image (Figure 3).
Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The largest component with value logical 1 is the patient's body. This largest component is kept and the other parts are removed from the image. The image is then inverted, so every "0" turns into "1" and every "1" turns into "0" (Figure 4).
Since the parts outside the body in the image shown in Figure 4 touch the image border (pixel columns 1 or 512) and are logical 1, the parts that meet this condition are removed, and the lungs and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared with the lung, each image is labeled with the connected-component labeling algorithm, and the components with fewer than 1000 pixels are identified as airway and removed from the image. The image now at hand is the segmented form of the target lung. Before the airway is removed, the edges of the binary image are found with the Sobel algorithm and added to the original image, so the edges of the lung and airway region are shown on the original image (Figure 6b). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6c).
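The thresholding and connected-component steps described above can be sketched as follows. This is an illustrative Python/scipy version of the pipeline (the original work uses MATLAB); the 700 HU threshold and the 1000-pixel size limit come from the text, while the function names and the synthetic test data are hypothetical.

```python
import numpy as np
from scipy import ndimage

def remove_small(mask, min_size=1000):
    """Delete connected components with fewer than min_size pixels."""
    labels, n = ndimage.label(mask)
    if n == 0:
        return mask
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    keep = np.zeros(n + 1, dtype=bool)
    keep[1:] = sizes >= min_size
    return keep[labels]

def segment_lung(slice_hu, thresh=700, min_size=1000):
    # 1) threshold: bright body pixels vs. dark lung/air pixels
    body = slice_hu > thresh
    # 2) remove small bright components (subsegmental vessels)
    body = remove_small(body, min_size)
    # 3) keep only the largest component: the patient's body
    labels, n = ndimage.label(body)
    sizes = ndimage.sum(body, labels, index=range(1, n + 1))
    body = labels == (1 + int(np.argmax(sizes)))
    # 4) invert, then drop components touching the border (outside air)
    inv = ~body
    labels, n = ndimage.label(inv)
    edge = np.unique(np.concatenate(
        [labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
    lung = inv & ~np.isin(labels, edge)
    # 5) remaining components under min_size pixels are airway; remove them
    return remove_small(lung, min_size)
```

On a synthetic slice (bright body, a dark lung region, a small bright vessel inside the lung, a small dark airway hole), the function returns a mask covering only the lung region.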
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.
MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop the customized algorithms.
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figure 3a, b).
Figure 3a The Histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data.
The Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables: the rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
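The row/column behaviour of meshgrid described above can be checked with NumPy's np.meshgrid, which follows the same semantics as MATLAB's function; a small sketch:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])
# Rows of X are copies of x; columns of Y are copies of y
X, Y = np.meshgrid(x, y)
# X is [[1, 2, 3], [1, 2, 3]]
# Y is [[10, 10, 10], [20, 20, 20]]
# Evaluating a function of two variables on the grid:
Z = X + Y   # Z[i, j] = x[j] + y[i]
```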
3-D Surface Plot with Contour (the surfc function) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values, lighting
"Image" creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). "Image" with colormap scaling (the "imagesc" function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
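As an aside, the magnitude and phase response that FVTool computes for a filter given by numerator b and denominator a can be reproduced outside MATLAB with scipy.signal.freqz; the filter coefficients below are a hypothetical two-tap moving average, not a filter used in this work.

```python
import numpy as np
from scipy import signal

b = [0.5, 0.5]   # numerator: two-tap moving average (a simple lowpass)
a = [1.0]        # denominator: FIR filter, so just 1
# w: frequencies in rad/sample on [0, pi); h: complex frequency response
w, h = signal.freqz(b, a, worN=512)
magnitude = np.abs(h)
phase = np.angle(h)
# DC gain is b0 + b1 = 1, and the response falls toward zero at Nyquist
```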
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 41 introduces shapes and shape models in general. Section 42 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 43; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initializing the model in a test image is tackled in Section 44. Section 45 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.
41 Shape Models
A shape is a collection of points. As shown in Figure 41, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 41c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 41 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:
d = √((x2 − x1)² + (y2 − y1)²)    (41)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 44). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the bone in the test image, which helps with the automatic initialization (discussed in Section 44).
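The centroid and size defined above can be sketched for the 2n × 1 shape vector used in this thesis (x co-ordinates first, then y); an illustrative Python version with hypothetical function names:

```python
import numpy as np

def centroid(shape_vec):
    """Mean point position of a shape stored as [x1..xn, y1..yn]."""
    n = len(shape_vec) // 2
    return np.array([shape_vec[:n].mean(), shape_vec[n:].mean()])

def shape_size(shape_vec):
    """Root mean square distance of the points from the centroid."""
    n = len(shape_vec) // 2
    pts = np.stack([shape_vec[:n], shape_vec[n:]], axis=1)
    d2 = ((pts - centroid(shape_vec)) ** 2).sum(axis=1)
    return float(np.sqrt(d2.mean()))
```

For a unit square, for example, the centroid is (0.5, 0.5) and the size is √0.5.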
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (i.e. the mean shape does not change much)
output set of aligned shapes and mean shape
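Algorithm 1 can be sketched as follows; an illustrative Python version in which the per-shape alignment step (4a) is an orthogonal Procrustes rotation onto the current mean (the algorithm above does not fix the transform; translation and scale are handled by the normalization):

```python
import numpy as np

def normalize(pts):
    """Center an (n, 2) point array on the origin and scale to unit size."""
    pts = pts - pts.mean(axis=0)
    return pts / np.linalg.norm(pts)

def rotate_onto(shape, ref):
    """Rotation minimizing ||shape @ R - ref|| (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    return shape @ (u @ vt)

def align_shapes(shapes, iters=5):
    shapes = [normalize(s) for s in shapes]   # steps 2-3
    mean = shapes[0]                          # reference shape
    for _ in range(iters):                    # step 4 (fixed iteration count)
        shapes = [rotate_onto(s, mean) for s in shapes]   # 4a
        mean = normalize(np.mean(shapes, axis=0))         # 4b, 4c
    return shapes, mean
```

Aligning a square with a rotated, translated copy of itself brings the two shapes into exact agreement.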
42 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 12), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training was done by manually selecting landmarks on the images. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 43 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.
1 The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2 The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.
421 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that most closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape together with its permissible variations is formulated [24]:
x̂ = x̄ + Φb    (43)
where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi, and
Φb is the term that adds the permissible variations, with Φ the matrix of shape eigenvectors and b the vector of shape parameters.
422 Generating shapes from the model
As seen in Equation 43, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks.
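Equation 43 can be sketched numerically: build the mean shape and eigenvector matrix from aligned training shape vectors, then vary b. This is an illustrative PCA construction in Python, assuming the standard Cootes-Taylor formulation; the variable names are hypothetical.

```python
import numpy as np

def build_shape_model(X):
    """X: one aligned training shape vector (length 2n) per row."""
    x_bar = X.mean(axis=0)                 # mean shape
    cov = np.cov(X, rowvar=False)          # shape covariance matrix
    vals, vecs = np.linalg.eigh(cov)       # eigenvectors of the covariance
    order = np.argsort(vals)[::-1]         # largest-variance modes first
    return x_bar, vecs[:, order], vals[order]

def generate_shape(x_bar, phi, b):
    """Equation 43: x_hat = x_bar + phi @ b, using the first len(b) modes."""
    b = np.asarray(b, dtype=float)
    return x_bar + phi[:, :len(b)] @ b
```

With b = 0 the model reproduces the mean shape exactly; varying each entry of b moves the shape along one mode of variation.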
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix Sg.
423 Searching the test image
After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by
f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
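Assuming the usual form of the Mahalanobis distance for profiles, f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ), the search for the best whisker offset can be sketched as follows (function and variable names are hypothetical):

```python
import numpy as np

def mahalanobis(g, g_mean, S_inv):
    """f(g) = (g - g_mean)^T S_inv (g - g_mean)."""
    d = np.asarray(g, dtype=float) - g_mean
    return float(d @ S_inv @ d)

def best_offset(candidate_profiles, g_mean, S_inv):
    """Index of the candidate profile closest to the mean profile."""
    dists = [mahalanobis(g, g_mean, S_inv) for g in candidate_profiles]
    return int(np.argmin(dists))
```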
If the model is initialized correctly (discussed in Section 44), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and the shape model then confirms that the shape is still consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 45 shows a sample image pyramid (of a general picture, not a bone); the sizes of the images are given relative to the first image.
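The image pyramid used in the multi-resolution search can be sketched as repeated 2×2 block averaging, each level halving the resolution. This is an illustrative version; the actual downsampling filter is not specified in the text.

```python
import numpy as np

def image_pyramid(img, levels=3):
    """Return [full-res, half-res, quarter-res, ...] by 2x2 block averaging."""
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        a = pyr[-1]
        h, w = a.shape[0] // 2 * 2, a.shape[1] // 2 * 2   # trim odd edges
        a = a[:h, :w]
        pyr.append((a[0::2, 0::2] + a[1::2, 0::2]
                    + a[0::2, 1::2] + a[1::2, 1::2]) / 4.0)
    return pyr
```

The search starts at the coarsest level, where the model can lock on from further away, and the result is refined at each finer level.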
43 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 45) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 46a shows the unaligned shapes learnt from the training images, and Figure 46b displays the aligned shapes.
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 47a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
As the 1 or 512 pixels of the parts out of the body in the image which shown in Figure 4 is
going to be logical 1 the parts that achieve this condition have been removed and lung and
airway are appeared like in Figure 5 Fig5 segmentation of lung and airway Due to the fact
that airway in Figure 5 is going to be very small compared to the lung size each of images
have been labeled with ldquoconnected component labeling algorithmrdquo and the component whose
number of pixels are below 1000 have been determined as airways and then removed from
the image The last image in hand is the segmented form of target lung Before airways
removed finding the edges of the image with sobel algoritm it has been gathered to original
image and the edges of lung and airway region have been shown in the original image Figure
6 (b) Also by multiplying defined lung region with the original CTA lung image original
segmented lung image has been carried out
Figure 6 (c)
MATLAB
MATLAB and the Image Processing Toolbox provide a wide range of advanced image
processing functions and interactive tools for enhancing and analyzing digital images The
interactive tools allowed us to perform spatial image transformations morphological
operations such as edge detection and noise removal region-of-interest processing filtering
basic statistics curve fitting FFT DCT and Radon Transform Making graphics objects
semitransparent is a useful technique in 3-D visualization which furnishes more information
about spatial relationships of different structures The toolbox functions implemented in the
open MATLAB language has also been used to develop the customized algorithms
MATLAB is a high-level technical language and interactive environment for data analysis
and mathematical computing functions such as signal processing optimization partial
differential equation solving etc It provides interactive tools including threshold
correlation Fourier analysis filtering basic statistics curve fitting matrix analysis 2D and
3D plotting functions The operations for image processing allowed us to perform noise
reduction and image enhancement image transforms colormap manipulation colorspace
conversions region-of interest processing and geometric operation [4] The toolbox
functions implemented in the open MATLAB language can be used to develop the
customized algorithms
An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes a pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image from the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image-analysis tasks, including edge-detection and image-segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).
Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves analogously to a colormap (Figures 4a, b).
Figure 4a - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0)
Figure 4b - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(4)
The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables: the rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
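NumPy's meshgrid follows the same convention as MATLAB's, so the behaviour described above can be sketched as follows (an illustrative Python analogue, not the MATLAB code used here):

```python
import numpy as np

# Rows of X are copies of x; columns of Y are copies of y,
# matching MATLAB's meshgrid convention.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # X and Y both have shape (len(y), len(x))

# Evaluate a function of two variables over the grid, as done for surface plots.
Z = X**2 + Y
```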
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. "Lighting" is the technique of illuminating an object with a directional light source; it can make subtle differences in surface shape easier to see, and it can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting
The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). "image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "jet" ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
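The m-by-3 colormap layout can be illustrated by constructing one directly. This is a simplified blue-to-red ramp for illustration only, not MATLAB's exact jet table:

```python
import numpy as np

def blue_to_red_colormap(m=256):
    """Build an m-by-3 colormap: each row is an RGB triple in [0.0, 1.0].
    A simplified blue -> red ramp (illustrative, not MATLAB's 'jet')."""
    t = np.linspace(0.0, 1.0, m)
    r = t                              # red rises from 0 to 1
    g = 1.0 - np.abs(2.0 * t - 1.0)    # green peaks in the middle
    b = 1.0 - t                        # blue falls from 1 to 0
    return np.column_stack([r, g, b])

cmap = blue_to_red_colormap(5)   # first row is pure blue, last row pure red
```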
A Contour Plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) (or surfc) function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 - Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 - Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
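The magnitude and phase responses that FVTool plots can also be computed directly from the coefficients b and a by evaluating H(e^jw) = B(e^jw)/A(e^jw) on the unit circle. A NumPy sketch of this idea (illustrative only, not FVTool's implementation):

```python
import numpy as np

def freq_response(b, a, n_points=512):
    """Evaluate H(e^{jw}) = B(e^{jw}) / A(e^{jw}) for a digital filter with
    numerator b and denominator a, at n_points frequencies in [0, pi]."""
    w = np.linspace(0.0, np.pi, n_points)
    z_inv = np.exp(-1j * w)
    # Evaluate the polynomials in z^{-1}: B(z) = b0 + b1*z^{-1} + ...
    num = sum(bk * z_inv**k for k, bk in enumerate(b))
    den = sum(ak * z_inv**k for k, ak in enumerate(a))
    h = num / den
    return w, np.abs(h), np.angle(h)   # frequency, magnitude, phase

# Two-tap moving-average filter: unity gain at DC, a null at the Nyquist rate.
w, mag, phase = freq_response([0.5, 0.5], [1.0])
```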
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
Figure 4.1 - Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((y2 - y1)² + (x2 - x1)²)    (4.1)
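As a sketch, Equation 4.1 and the corresponding-points shape distance can be written as follows (summing the point distances is one possible convention; the thesis does not specify how the per-point distances are combined):

```python
import math

def point_distance(p1, p2):
    """Euclidean distance between two points (x1, y1) and (x2, y2) - Eq. 4.1."""
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((y2 - y1) ** 2 + (x2 - x1) ** 2)

def shape_distance(shape1, shape2):
    """Distance between two shapes: sum of the Euclidean distances
    between corresponding points."""
    return sum(point_distance(p, q) for p, q in zip(shape1, shape2))
```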
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).
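The centroid and the root-mean-square size can be computed directly from the point list; a minimal sketch:

```python
import math

def centroid(shape):
    """Mean of the point positions; shape is a list of (x, y) tuples."""
    n = len(shape)
    return (sum(x for x, _ in shape) / n, sum(y for _, y in shape) / n)

def shape_size(shape):
    """Root-mean-square distance of the points from the centroid."""
    cx, cy = centroid(shape)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in shape)
                     / len(shape))
```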
Algorithm 1: Aligning shapes
Input: a set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align it to x0 and scale it to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: the set of aligned shapes and the mean shape
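Algorithm 1 can be sketched as follows. For simplicity this version aligns shapes using translation and scaling only; the full ASM alignment also removes rotation, so this is an illustration of the loop structure rather than a complete Procrustes alignment:

```python
import numpy as np

def center(shape):
    """Step 2: translate the shape so its centroid is at the origin."""
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    """Scale a (centered) shape to unit size (unit Frobenius norm)."""
    return shape / np.linalg.norm(shape)

def align_shapes(shapes, n_iters=10):
    """Sketch of Algorithm 1; shapes are n-by-2 arrays of points.
    Translation and scaling only (rotation removal omitted)."""
    shapes = [center(s.astype(float)) for s in shapes]    # steps 1-2
    mean_shape = to_unit_size(shapes[0])                  # step 3: x0
    for _ in range(n_iters):                              # steps 4-5
        aligned = [to_unit_size(s) for s in shapes]       # (a)
        mean_shape = to_unit_size(np.mean(aligned, axis=0))  # (b) + (c)
    return aligned, mean_shape

# Two squares differing only in position and scale align to the same shape.
square1 = np.array([[0, 0], [4, 0], [4, 4], [0, 4]])
square2 = np.array([[1, 1], [3, 1], [3, 3], [1, 3]])
aligned, mean_shape = align_shapes([square1, square2])
```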
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training was done by manually selecting landmarks, which were placed at approximately equal intervals and distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the result stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance of the training shapes, and
b is a vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks.
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
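Equation 4.3 can be illustrated by building Φ from the eigenvectors of the covariance of the aligned training shape vectors and varying b. The function names, the toy data, and the choice to keep only the leading modes are illustrative assumptions:

```python
import numpy as np

def build_shape_model(shapes):
    """shapes: k-by-2n array of aligned training shape vectors.
    Returns the mean shape x_bar and the eigenvectors/eigenvalues of the
    covariance matrix (Phi in x^ = x_bar + Phi b), largest modes first."""
    x_bar = shapes.mean(axis=0)
    cov = np.cov(shapes, rowvar=False)
    eigvals, phi = np.linalg.eigh(cov)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]      # put the largest modes first
    return x_bar, phi[:, order], eigvals[order]

def generate_shape(x_bar, phi, b):
    """Equation 4.3: x^ = x_bar + Phi b, using the first len(b) modes."""
    return x_bar + phi[:, :len(b)] @ np.asarray(b)

# Toy training set: three 2-point shapes varying along one direction.
training = np.array([[0, 0, 1, 1],
                     [2, 2, 1, 1],
                     [4, 4, 1, 1]], dtype=float)
x_bar, phi, eigvals = build_shape_model(training)
new_shape = generate_shape(x_bar, phi, b=[np.sqrt(2.0)])
```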
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

D² = (g - ḡ)ᵀ Sg⁻¹ (g - ḡ)
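A sketch of the squared Mahalanobis distance between a sampled profile and the mean profile; with Sg equal to the identity it reduces to the squared Euclidean distance:

```python
import numpy as np

def mahalanobis_sq(g, g_bar, S_g):
    """Squared Mahalanobis distance between a sampled profile g and the
    mean profile g_bar, with profile covariance S_g."""
    d = np.asarray(g, float) - np.asarray(g_bar, float)
    return float(d @ np.linalg.solve(S_g, d))   # d' * inv(S_g) * d

d2 = mahalanobis_sq([1, 2], [0, 0], np.eye(2))  # identity covariance
```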
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape; it ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone X-ray); the sizes of the images are given relative to the first image.
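The image pyramid used in the multi-resolution search can be sketched by repeated 2x downsampling. Real implementations usually smooth before subsampling; this block-averaging version is an illustrative assumption:

```python
import numpy as np

def image_pyramid(image, levels=3):
    """Build a simple image pyramid by repeated 2x downsampling
    (2x2 block averaging) of a 2-D array."""
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
        img = img[:h, :w]                       # trim to even dimensions
        img = (img[0::2, 0::2] + img[1::2, 0::2]
               + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0
        pyramid.append(img)
    return pyramid

pyr = image_pyramid(np.ones((8, 8)), levels=3)  # sizes halve at each level
```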
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, since it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412–425, 2000.
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool
The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image-segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, and range (Figure 2).
3 PLOT TOOLS
MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figures 3a, 3b).
Figure 3a - The Histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.
Figure 3b Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or by two vectors x and y into matrices X and Y for use in evaluating functions of two variables: the rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
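The meshgrid behaviour described above can be illustrated outside MATLAB as well; a small NumPy sketch of the same idea, using `numpy.meshgrid` (the vectors chosen here are purely illustrative):

```python
import numpy as np

# Two coordinate vectors, as in the description above.
x = np.array([1, 2, 3])
y = np.array([10, 20])

# meshgrid replicates x across the rows of X and y down the
# columns of Y, giving grids suitable for evaluating f(x, y).
X, Y = np.meshgrid(x, y)
# X = [[1, 2, 3], [1, 2, 3]]
# Y = [[10, 10, 10], [20, 20, 20]]

# Evaluate a function of two variables over the whole grid at once.
Z = X ** 2 + Y
```

The resulting Z is exactly the matrix a surface plot such as Figure 4c would display.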
3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values, lighting
'Image' creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). 'Image' with Colormap Scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
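The indexed-image/colormap relationship described above is easy to demonstrate; a small NumPy sketch with a hypothetical 4-entry colormap (real colormaps such as jet have many more rows):

```python
import numpy as np

# A tiny colormap: an m-by-3 matrix whose rows are RGB triples
# with components between 0.0 and 1.0.
cmap = np.array([
    [0.0, 0.0, 1.0],   # blue
    [0.0, 1.0, 1.0],   # cyan
    [1.0, 1.0, 0.0],   # yellow
    [1.0, 0.0, 0.0],   # red
])

# An indexed image: each pixel stores a row index into the colormap.
indexed = np.array([[0, 1],
                    [2, 3]])

# Interpreting each element as a colormap index yields an RGB image,
# which is what 'image' does in indexed mode.
rgb = cmap[indexed]                  # shape (2, 2, 3)
```

Colormap scaling ('imagesc') differs only in first mapping the data range linearly onto the index range before the lookup.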
A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
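The magnitude and phase responses that FVTool plots come from evaluating H(z) = B(z)/A(z) on the unit circle; a minimal NumPy sketch of that computation (scipy.signal.freqz performs the same evaluation; the function name and the moving-average example here are illustrative):

```python
import numpy as np

def freq_response(b, a, n=8):
    """Frequency response H(e^{jw}) of a digital filter with
    numerator coefficients b and denominator coefficients a,
    evaluated at n points in [0, pi)."""
    w = np.linspace(0, np.pi, n, endpoint=False)
    z = np.exp(-1j * w)                       # z^{-1} on the unit circle
    num = sum(bk * z ** k for k, bk in enumerate(b))
    den = sum(ak * z ** k for k, ak in enumerate(a))
    return w, num / den

# Example: a 2-point moving average, b = [0.5, 0.5], a = [1].
w, h = freq_response([0.5, 0.5], [1.0])
magnitude = np.abs(h)      # what a magnitude-response plot displays
phase = np.angle(h)        # what a phase-response plot displays
```

For this low-pass example the magnitude is 1 at DC and falls off toward the Nyquist frequency, matching the shape of a typical FVTool magnitude plot.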
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 41 introduces shapes and shape models in general. Section 42 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 43; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 44. Section 45 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.
41 Shape Models
A shape is a collection of points. As shown in Figure 41, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns hold the x and y co-ordinates respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 41c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 41 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)    (41)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when devising an automatic initialization technique (discussed in Section 44). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with automatic initialization (discussed in Section 44).
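These definitions are easy to make concrete; a minimal NumPy sketch of the 2n-vector layout, the centroid, the size, and the Equation 41 distance (the triangle used here is purely illustrative, not thesis data):

```python
import numpy as np

# A shape with n = 3 points, stored as a 2n-vector:
# the x co-ordinates first, then the y co-ordinates.
shape = np.array([0.0, 4.0, 2.0,    # x1, x2, x3
                  0.0, 0.0, 3.0])   # y1, y2, y3
n = len(shape) // 2
xs, ys = shape[:n], shape[n:]

# Centroid: the mean of the point positions.
centroid = np.array([xs.mean(), ys.mean()])

# Size: root mean square distance of the points from the centroid.
size = np.sqrt(((xs - centroid[0]) ** 2 + (ys - centroid[1]) ** 2).mean())

# Euclidean distance (Equation 41) between the first two points.
dist = np.hypot(xs[1] - xs[0], ys[1] - ys[0])
```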
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size; call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and the mean shape
42 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 12) and the images were then re-sized to the same dimensions; this ensured uniformity in the quality of the data being used. The training was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.
Figure 43 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM builds two types of
sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that landmark accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits its profiles, the shape model ensures that the shape does not stray from the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the shape. The two models correct each other until no further improvement in the match is possible.
421 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations around it [24]:

x̂ = x̄ + Φb    (43)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix whose columns are the modes of variation (the eigenvectors of the covariance of the training shapes), and
b is the vector of shape parameters.
422 Generating shapes from the model
As seen in Equation 43, different shapes can be generated by changing the value of b. The model is varied in height and width while optimum values for the landmarks are found.
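This generation step can be sketched as follows; an illustrative NumPy implementation, assuming aligned shapes stored as 2n-vectors and building x̄ and Φ from the eigenvectors of their covariance (the function names and toy training set are hypothetical):

```python
import numpy as np

def build_shape_model(shapes, k=2):
    """Learn the mean shape and the first k modes of variation from
    aligned training shapes, each stored as a 2n-vector."""
    X = np.asarray(shapes, dtype=float)       # (num_shapes, 2n)
    x_bar = X.mean(axis=0)
    # Modes of variation: eigenvectors of the covariance of the
    # training shapes, ordered by decreasing eigenvalue.
    vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    order = np.argsort(vals)[::-1][:k]
    return x_bar, vecs[:, order]

def generate_shape(x_bar, phi, b):
    """Equation 43: x_hat = x_bar + phi @ b."""
    return x_bar + phi @ b

# Toy training set with one mode of variation (a horizontal shift).
base = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 0.0])
mode = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
shapes = [base + t * mode for t in (-1.0, 0.0, 1.0)]

x_bar, phi = build_shape_model(shapes, k=2)
mean_shape = generate_shape(x_bar, phi, np.zeros(2))   # b = 0 gives x_bar
```

Setting b = 0 reproduces the mean shape exactly; varying each component of b within its permissible range generates the family of allowable shapes.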
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at each landmark are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
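Under this multivariate-Gaussian assumption, training the profile model for a single landmark reduces to estimating ḡ and Sg from the sampled profiles; a minimal sketch (the helper name and toy data are illustrative; one such call is made per landmark):

```python
import numpy as np

def train_profile_model(profiles):
    """For ONE landmark: fit the assumed multivariate Gaussian to the
    profiles sampled across all training images (one row per image),
    returning the mean profile g_bar and covariance matrix S_g."""
    P = np.asarray(profiles, dtype=float)
    g_bar = P.mean(axis=0)
    S_g = np.cov(P, rowvar=False)      # sample covariance
    return g_bar, S_g

# Toy data: three training images, a 2-sample profile per image.
g_bar, S_g = train_profile_model([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```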
423 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset by 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance:

D² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
If the model is initialized correctly (discussed in Section 44), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model confirms that the resulting shape is still close to the mean shape; the shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile matches, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 45 shows a sample image pyramid, with the sizes of the images given relative to the first image (a general picture, not a bone X-ray, is shown).
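The profile search at one landmark can be sketched as follows: slide a candidate window along the intensity samples taken along the whisker and keep the offset with the smallest Mahalanobis distance to the mean profile. This is an illustrative sketch, not the thesis code; the helper name, the identity stand-in for Sg⁻¹, and the toy samples are assumptions:

```python
import numpy as np

def best_offset(samples, g_bar, S_g_inv):
    """Slide a window of len(g_bar) along the intensity samples taken
    along one whisker; return the offset whose profile g minimises the
    Mahalanobis distance (g - g_bar)^T S_g^{-1} (g - g_bar)."""
    m = len(g_bar)
    best_off, best_d = 0, np.inf
    for off in range(len(samples) - m + 1):
        diff = samples[off:off + m] - g_bar
        d = float(diff @ S_g_inv @ diff)
        if d < best_d:
            best_off, best_d = off, d
    return best_off, best_d

# Toy search: the trained mean profile reappears at offset 2.
g_bar = np.array([1.0, 2.0, 3.0])
S_g_inv = np.eye(3)                  # identity stands in for inv(S_g)
samples = np.array([9.0, 9.0, 1.0, 2.0, 3.0, 9.0])
off, d = best_offset(samples, g_bar, S_g_inv)
```

In the multi-resolution scheme, the same search is simply repeated at each level of the image pyramid, from coarse to fine.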
43 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 45) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 46a shows the unaligned shapes learnt from the training images; Figure 46b displays the aligned shapes.
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images when it is placed in the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in areas away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 47a shows such an initialization: the pink contour is the mean shape, and since it starts away from the bone, the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points As shown in Figure 41 a shape can be represented by a
diagram showing the points or as a n _ 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively In
this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-
ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of
any ASM as it stays the same even if
it is scaled rotated or translated The lines connecting the points are not part of the shape but
they are shown to make the shape and order of the points more clear [24]
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can then be defined as the distance between their corresponding points [24]. There are other ways of defining distances between shapes, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of a shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).
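These definitions are small enough to state directly in code. A minimal Python/NumPy sketch (the function names are mine, not from the thesis code):

```python
import numpy as np

def centroid(shape):
    """Mean of the point positions; `shape` is an (n, 2) array."""
    return shape.mean(axis=0)

def shape_size(shape):
    """Root mean square distance of the points from the centroid."""
    d = shape - centroid(shape)
    return np.sqrt((d ** 2).sum(axis=1).mean())

def shape_distance(a, b):
    """Distance between two shapes: sum of the Euclidean distances
    between corresponding points."""
    return np.sqrt(((a - b) ** 2).sum(axis=1)).sum()
```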
Algorithm 1: Aligning shapes
Input: a set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align it to x0 and scale it to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: the set of aligned shapes and the mean shape
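Algorithm 1 can be sketched in Python/NumPy as follows. This is a minimal illustration, not the thesis code: the complex-number trick for the least-squares similarity alignment and the use of the vector norm as "unit size" are my own choices.

```python
import numpy as np

def align_to(shape, ref):
    """Similarity-align a centered (n, 2) `shape` to a centered `ref`
    by a least-squares scale-and-rotation (points as complex numbers)."""
    z = shape[:, 0] + 1j * shape[:, 1]
    r = ref[:, 0] + 1j * ref[:, 1]
    a = np.vdot(z, r) / np.vdot(z, z)   # optimal complex scale+rotation
    w = a * z
    return np.column_stack([w.real, w.imag])

def align_shapes(shapes, tol=1e-8, max_iter=100):
    """Algorithm 1: iteratively align a set of shapes to their evolving mean."""
    shapes = [s - s.mean(axis=0) for s in shapes]       # step 2: center
    x0 = shapes[0] / np.linalg.norm(shapes[0])          # step 3: unit-size reference
    mean = x0
    for _ in range(max_iter):
        shapes = [align_to(s, mean) for s in shapes]    # step 4a
        new_mean = np.mean(shapes, axis=0)              # step 4b
        new_mean = align_to(new_mean, x0)               # step 4c: constrain
        new_mean /= np.linalg.norm(new_mean)
        if np.linalg.norm(new_mean - mean) < tol:       # step 5: convergence
            break
        mean = new_mean
    return shapes, mean
```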
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and the resulting images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM builds two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that landmark accordingly. When searching for the shape in a test image, the area near the tentative landmarks is examined and the model moves the shape to the area that fits the profile model most closely. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the result remains an allowable shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models correct each other until no further improvement in the match is possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations around it [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e. the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance matrix of the training shapes (the modes of variation), and
b is the vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model can thus be varied in height and width to find optimum positions for the landmarks.
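Under the usual PCA formulation of a shape model, building x̄ and Φ from the aligned training shapes and generating new shapes by varying b can be sketched as follows (Python/NumPy; a minimal illustration with my own function names, not the thesis code):

```python
import numpy as np

def build_shape_model(shapes, num_modes=2):
    """Build a PCA shape model from aligned shapes.
    `shapes` is an (m, 2n) array: m training shapes as 2n-vectors
    (all x co-ordinates followed by all y co-ordinates)."""
    mean = shapes.mean(axis=0)
    cov = np.cov(shapes - mean, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:num_modes]    # keep the largest modes
    return mean, vecs[:, order]                   # x-bar and Phi

def generate_shape(mean, phi, b):
    """Equation 4.3: x-hat = x-bar + Phi b."""
    return mean + phi @ b
```

In practice b is clamped to a few standard deviations of each mode so that only plausible shapes are generated.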
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at each landmark are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape formed by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. The profiles are assumed to follow a multivariate Gaussian distribution, so they can be described by their mean profile ḡ and covariance matrix Sg.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. Profiles are sampled at offsets of up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the trained profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
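This profile search can be sketched in Python/NumPy, assuming the standard Mahalanobis form d = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ); the helper names are mine, not from the thesis code:

```python
import numpy as np

def mahalanobis(g, g_mean, S_inv):
    """Mahalanobis distance between a sampled profile g and the mean profile,
    given the inverse covariance matrix S_inv of the trained profiles."""
    d = g - g_mean
    return float(d @ S_inv @ d)

def best_offset(profiles, g_mean, S_inv):
    """Among candidate profiles sampled at successive offsets along the
    whisker, pick the index of the profile closest to the model."""
    dists = [mahalanobis(g, g_mean, S_inv) for g in profiles]
    return int(np.argmin(dists))
```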
If the model is initialized correctly (discussed in Section 4.4), one of these profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the suggested shape stays close to the mean shape. The shape model thus ensures that the profile model has not distorted the shape. If the shape model were not employed, the profile model might find the best individual profile matches, yet the resulting shape could be completely different. As mentioned before, the two models restrict each other. A multi-resolution search is performed to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid, with the sizes of the images given relative to the first image (a general picture, not a bone X-ray, is shown).
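The image pyramid used by the multi-resolution search can be sketched as follows (a Python/NumPy illustration using a simple 2×2 box filter; a real implementation would typically smooth before subsampling):

```python
import numpy as np

def downsample(img):
    """Halve an image by averaging 2x2 blocks (simple box filter)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
          + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def image_pyramid(img, levels=3):
    """Build an image pyramid: level 0 is the original image and each
    further level is half the size of the previous one. The ASM search
    starts at the coarsest level and refines on the finer ones."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr
```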
4.3 Parameters and Variations
The performance of the ASM can be improved by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct the experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, since it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and the aligned shapes are also displayed.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and profile models from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, i.e. started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone since it is searching in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
Figure 3b - Area Graph of X-ray CT brain scan
The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).
Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or by two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c).
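As an illustration of the behaviour just described, NumPy's np.meshgrid mirrors MATLAB's meshgrid:

```python
import numpy as np

# The rows of X are copies of x, and the columns of Y are copies of y.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # X and Y both have shape (len(y), len(x))

# Evaluating a function of two variables over the whole grid at once:
Z = X ** 2 + Y
```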
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).
Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting
'Image' creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). 'Image' with Colormap Scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
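The indexed-image and colormap-scaling ideas can be illustrated as follows (Python/NumPy; the 4-entry gray colormap is a toy stand-in for a real map such as jet, and scale_to_cmap is my own name for what imagesc does):

```python
import numpy as np

# A colormap, as described above, is an m-by-3 matrix of RGB rows in [0, 1].
cmap = np.array([
    [0.0, 0.0, 0.0],   # black
    [0.3, 0.3, 0.3],
    [0.7, 0.7, 0.7],
    [1.0, 1.0, 1.0],   # white
])

# An indexed image: each pixel is an index into the colormap.
img = np.array([[0, 1],
                [2, 3]])

rgb = cmap[img]   # shape (2, 2, 3): each pixel replaced by its RGB row

def scale_to_cmap(data, m):
    """Colormap scaling (what 'imagesc' does): map arbitrary data
    linearly onto the full range of colormap indices 0..m-1."""
    d = (data - data.min()) / (data.max() - data.min())
    return np.minimum((d * m).astype(int), m - 1)
```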
A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph displays a matrix by graphing its columns as segmented strips (Figure 10).
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
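FVTool itself is a MATLAB GUI, but the quantity it plots can be computed directly. A small NumPy sketch of the magnitude and phase response of a filter given by b and a (the 2-tap moving-average filter here is a made-up example, not one from the text):

```python
import numpy as np

def freq_response(b, a, n=512):
    """Frequency response H(e^jw) of a digital filter with numerator b
    and denominator a, evaluated at n points on [0, pi)."""
    w = np.linspace(0, np.pi, n, endpoint=False)
    e = np.exp(-1j * np.outer(w, np.arange(max(len(b), len(a)))))
    num = e[:, :len(b)] @ b       # sum_k b[k] e^{-jwk}
    den = e[:, :len(a)] @ a       # sum_k a[k] e^{-jwk}
    return w, num / den

# Magnitude (in dB) and phase, as FVTool would plot them:
b = np.array([0.5, 0.5])   # simple 2-tap moving-average (low-pass) filter
a = np.array([1.0])
w, h = freq_response(b, a)
mag_db = 20 * np.log10(np.abs(h))
phase = np.angle(h)
```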
Figure 11 - Magnitude and Phase Response - Frequency scale: a) linear, b) log
Figure 12 - Group Delay Response - Frequency scale: a) linear, b) log
Figure 13 - Phase Delay Response - Frequency scale: a) linear, b) log
Figure 14 - (a) Impulse Response, (b) Pole-Zero Plot
Figure 15 - Step Response: (a) Default, (b) Specify Length 50
Figure 16 - Magnitude Response Estimate - Frequency scale: a) linear, b) log
Figure 17 - Magnitude Response and Round-off Noise Power Spectrum - Frequency scale: a) linear, b) log
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points As shown in Figure 41 a shape can be represented by a
diagram showing the points or as a n _ 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively In
this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-
ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of
any ASM as it stays the same even if
it is scaled rotated or translated The lines connecting the points are not part of the shape but
they are shown to make the shape and order of the points more clear [24]
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them Equa-
tion 41 gives the formula for Euclidean distance between two points (x1 y1) and
x2 y2 The distance between two shapes can be de_ned as the distance between
their corresponding points [24] There are other ways of de_ning distances between
two points like the Procrustes distance but in this thesis the distance means the
Euclidean distance
radic (y2 - y1)2 + (x2 - x1)2
The centroid x of a shape x can be de_ned as the mean of the point positions
[24] The centroid can be useful while aligning shapes or _nding an automatic
initialization technique (discussed in 44) The size of the shape is the root mean
distance between the points and the centroid This can be used in measuring the
size of the test image which will help with the automatic initialization (discussed in
44)
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
42 Active Shape Models
The ASM has to be trained using training images In this project the tibia bone
was separated from a full-body X-ray (as shown in 12) and then those images were
re-sized to the same dimensions This ensured uniformity in the quality of data
being used The training on the images was done by manually selecting landmarks
Landmarks were placed at approximately equal intervals and were distributed uni-
formly over the bone boundary Such images are called hand annotated or manually
landmarked training images
Figure 43 shows the original image and the manually landmarked image for training
While performing tests using different number of landmark points a subset of these
landmarks points is chosen
After the training images have been landmarked the ASM produces two types of
sub-models [24] These are the profile model and the shape model
1 The profile model analyzes the landmark points and stores the behaviour of the
image around the landmark points So during training the algorithm learns
the characteristics of the area around the landmark points and builds a profile
model for each landmark point accordingly When searching for the shape in
the test image the area near the tentative landmarks is examined and the model moves the
shape to an area that fits closely to the profile model The
tentative location of the landmarks is obtained from the suggested shape
2 The shape model defines the permissible relative positions of landmarks This
introduces a constraint on the shape So as the profile model tries to find the
area in the test image that tries to fit the model the shape model ensures that
the mean shape is not changed The profile model acts on individual landmarks
whereas the shape acts globally on the image So both the models try to correct
each other until no further improvements in matching are possible
421 The ASM Model
The aim of the model is to try to convert the shape proposed by the individual
profiles into an allowable shape So it tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant
The shape is learnt from manually landmarked training images These images are
aligned and a mean shape is formulated with the permissible variations in it [24]
^x = x + ₵b where
^x is the generated shape vector by the model
x is the mean shape the average of the aligned training shapes
xi
422 Generating shapes from the model
As seen in Equation 43 different shapes can be generated by changing the value of
b The model is varied in height and width finding optimum values for landmarks
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image The points that are perpendicular to the model are called _whiskers and they help the
profile model in analyzing the area around the landmark points
The shape created by the landmark points are used for the shape model and the
whisker profiles around the landmark points are used for the profile model A profile
and a covariance matrix is built for each landmark It is assumed that the profiles
are distributed as a multivariate Gaussian and so they can be described by their
mean pro_le g and the covariance matrix Sg
423 Searching the test image
After the training is over the shape is searched in the test image The mean shape
calculated from the training images is imposed on the image and the profiles around
the landmark points are search and examined The profiles are offset 3 pixels
along the whisker which is perpendicular to the shape to get the accurate area
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
model confirms that the shape is the same as the mean shape The shape model
assures that the pro_le model has not changed the shape If the shape model were
not employed the pro_le model may give the best pro_le results but the resulting
shape may be completely di_erent So as mentioned before the two models restrict
each other A multi-resolution search is done to make the model more robust This
enables the model to be more accurate as it can lock on to the shape from further
away So the model searches over a series of di_erent resolutions of the same image
called an image pyramid The resolutions of the images can be set and changed
in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of
the images are given relative to the _rst image A general picture and not a bone
32
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)
The rsquorsquomeshgridrsquo function is extremely useful for computing a function of two Cartesian coordinates It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c)
Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
3-D Surface Plot with Contour (Surfc) displays a matrix as a surface with contour plot below rsquorsquoLightingrsquorsquo is the technique of illuminating an object with a directional light source This technique can make subtle differences in surface shape easier to see rsquorsquoLightingrsquorsquo processing can also be used to add realism to three-dimensional graphs This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d)
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values lightening
The rsquorsquoImagersquorsquo creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figures colormap or directly as RGB values depending on the data specified (Figure 5a)The rsquorsquoImagersquorsquo with Colormap Scaling (rsquorsquoimagescrsquorsquo function) displays an X-ray CT image and scale to use full colormap MATLAB supports a number of colormaps A colormap is an m-by-3 matrix of real numbers between 00 and 10 Each row is an RGB vector that defines one color Jetrsquorsquo ranges from blue to red and passes through the colors cyan yellow and orange It is a variation of the hsv (hue saturation value) colormap (Figure 5b)
Contour Plot is useful for delineating organ boundaries in images It displays isolines of a surface represented by a matrix (Figure 6) For example Figure
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) Pole-Zero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns hold the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].
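The n × 2 array and 2n-vector representations described above can be sketched as follows (a minimal sketch, assuming numpy; the function names are illustrative, not from the thesis code):

```python
import numpy as np

def to_shape_vector(points):
    """Flatten an n-by-2 point array into the 2n-vector
    [x1, ..., xn, y1, ..., yn] used in this chapter
    (x co-ordinates first, then y co-ordinates)."""
    return np.concatenate([points[:, 0], points[:, 1]])

def to_points(vec):
    """Inverse: recover the n-by-2 point array from a 2n shape vector."""
    n = len(vec) // 2
    return np.column_stack([vec[:n], vec[n:]])

pts = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
vec = to_shape_vector(pts)  # [1, 2, 3, 4, 5, 6]
```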
Figure 4.1 - Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis "distance" means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
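Equation 4.1 and the between-shapes distance it induces can be sketched in a few lines (the shape distance here sums the corresponding-point distances, one reasonable reading of the definition in the text):

```python
import numpy as np

def point_distance(p, q):
    """Euclidean distance between two points (Equation 4.1)."""
    return np.hypot(q[0] - p[0], q[1] - p[1])

def shape_distance(s1, s2):
    """Distance between two shapes, taken here as the sum of the
    Euclidean distances between corresponding points [24]."""
    return sum(point_distance(p, q) for p, q in zip(s1, s2))
```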
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root-mean-square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).
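The centroid and size definitions above can be sketched directly (assuming numpy and an n-by-2 point array):

```python
import numpy as np

def centroid(points):
    """Centroid of a shape: the mean of the point positions [24]."""
    return points.mean(axis=0)

def shape_size(points):
    """Size of a shape: the root-mean-square distance of the points
    from the centroid."""
    d = points - centroid(points)
    return np.sqrt((d ** 2).sum(axis=1).mean())
```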
Algorithm 1: Aligning shapes
Input: a set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: the set of aligned shapes and the mean shape
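Algorithm 1 can be sketched as follows. This is a minimal reading of the steps above, assuming numpy and n-by-2 shape arrays; the rotation in each alignment step is found with the standard orthogonal Procrustes solution (not spelled out in the text), and reflections are not explicitly excluded:

```python
import numpy as np

def center(shape):
    """Step 2: translate a shape so it is centered on the origin."""
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    """Scale a (centered) shape to unit Frobenius norm."""
    return shape / np.linalg.norm(shape)

def align_to(shape, ref):
    """Rotate a centered shape to best match `ref` in a least-squares
    sense (orthogonal Procrustes: R = U V^T from SVD of shape^T ref)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    return shape @ (u @ vt)

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1: iteratively align a set of shapes to
    their evolving mean, constraining the mean each round."""
    shapes = [center(s) for s in shapes]           # step 2
    mean = to_unit_size(shapes[0])                  # steps 1 and 3
    x0 = mean.copy()
    for _ in range(iters):                          # step 4 (fixed count
        shapes = [align_to(s, mean) for s in shapes]  # instead of a
        mean = np.mean(shapes, axis=0)                # convergence test)
        mean = to_unit_size(align_to(center(mean), x0))  # step 4(c)
    return shapes, mean
```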
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 1.2) and the resulting images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training was done by manually selecting landmarks on the images: landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM builds two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in a test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits its profiles, the shape model ensures that the overall shape does not stray from the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image, so the two models correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant.
The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is formulated [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e. the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is a vector of shape parameters weighting those eigenvectors.
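The linear shape model above (a mean shape plus weighted eigenvector modes) can be sketched with a small eigendecomposition. This is a hedged sketch, not the thesis code; it assumes aligned training shapes stacked as 2n-row vectors:

```python
import numpy as np

def build_shape_model(shapes):
    """Build the linear shape model x_hat = x_bar + Phi b from a stack
    of aligned training shape vectors (one 2n-vector per row). Phi holds
    the eigenvectors of the covariance matrix of the training shapes,
    ordered by decreasing eigenvalue."""
    x = np.asarray(shapes, dtype=float)
    x_bar = x.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(x, rowvar=False))
    order = np.argsort(evals)[::-1]          # largest modes first
    return x_bar, evecs[:, order], evals[order]

def generate_shape(x_bar, phi, b):
    """Generate a shape for parameter vector b, using the first
    len(b) modes of variation."""
    return x_bar + phi[:, :len(b)] @ b
```

Setting b = 0 recovers the mean shape; varying each entry of b moves the shape along one mode of variation.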
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at each landmark are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
4.2.3 Searching the test image
After training is complete, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance:

d = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
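The Mahalanobis profile distance can be sketched in a few lines (assuming numpy; solving the linear system rather than inverting Sg explicitly is a standard numerical choice, not something the text specifies):

```python
import numpy as np

def mahalanobis_sq(g, g_bar, S_g):
    """Squared Mahalanobis distance (g - g_bar)^T S_g^{-1} (g - g_bar)
    between a sampled test profile g and the mean profile g_bar, with
    profile covariance matrix S_g."""
    d = np.asarray(g, dtype=float) - np.asarray(g_bar, dtype=float)
    # Solve S_g x = d instead of forming the inverse of S_g.
    return float(d @ np.linalg.solve(S_g, d))
```

With an identity covariance this reduces to the squared Euclidean distance, which is a quick sanity check.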
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone X-ray, is shown).
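The image pyramid used in the multi-resolution search can be sketched by repeated 2× downsampling. This is a crude stand-in (block averaging instead of the smoothing-and-subsampling an ASM implementation would typically use), assuming numpy and a 2-D grayscale array:

```python
import numpy as np

def halve(img):
    """Downsample an image by a factor of 2, averaging 2x2 blocks
    (a crude stand-in for proper smoothing and subsampling)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]  # drop an odd trailing row/column if present
    return (img[0::2, 0::2] + img[0::2, 1::2] +
            img[1::2, 0::2] + img[1::2, 1::2]) / 4.0

def image_pyramid(img, levels=3):
    """Build a list of successively halved images: level 0 is the
    original, and each later level is half the size of the previous."""
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyr.append(halve(pyr[-1]))
    return pyr
```

The search then starts at the coarsest level, where the shape can be found from further away, and refines the fit at each finer level.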
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, since profile models must be trained and created for each image; however, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts off wherever the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
3-D Surface Plot with Contour (Surfc) displays a matrix as a surface with contour plot below rsquorsquoLightingrsquorsquo is the technique of illuminating an object with a directional light source This technique can make subtle differences in surface shape easier to see rsquorsquoLightingrsquorsquo processing can also be used to add realism to three-dimensional graphs This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d)
Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values lightening
The rsquorsquoImagersquorsquo creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figures colormap or directly as RGB values depending on the data specified (Figure 5a)The rsquorsquoImagersquorsquo with Colormap Scaling (rsquorsquoimagescrsquorsquo function) displays an X-ray CT image and scale to use full colormap MATLAB supports a number of colormaps A colormap is an m-by-3 matrix of real numbers between 00 and 10 Each row is an RGB vector that defines one color Jetrsquorsquo ranges from blue to red and passes through the colors cyan yellow and orange It is a variation of the hsv (hue saturation value) colormap (Figure 5b)
Contour Plot is useful for delineating organ boundaries in images It displays isolines of a surface represented by a matrix (Figure 6) For example Figure
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points As shown in Figure 41 a shape can be represented by a
diagram showing the points or as a n _ 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively In
this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-
ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of
any ASM as it stays the same even if
it is scaled rotated or translated The lines connecting the points are not part of the shape but
they are shown to make the shape and order of the points more clear [24]
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them Equa-
tion 41 gives the formula for Euclidean distance between two points (x1 y1) and
x2 y2 The distance between two shapes can be de_ned as the distance between
their corresponding points [24] There are other ways of de_ning distances between
two points like the Procrustes distance but in this thesis the distance means the
Euclidean distance
radic (y2 - y1)2 + (x2 - x1)2
The centroid x of a shape x can be de_ned as the mean of the point positions
[24] The centroid can be useful while aligning shapes or _nding an automatic
initialization technique (discussed in 44) The size of the shape is the root mean
distance between the points and the centroid This can be used in measuring the
size of the test image which will help with the automatic initialization (discussed in
44)
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
42 Active Shape Models
The ASM has to be trained using training images In this project the tibia bone
was separated from a full-body X-ray (as shown in 12) and then those images were
re-sized to the same dimensions This ensured uniformity in the quality of data
being used The training on the images was done by manually selecting landmarks
Landmarks were placed at approximately equal intervals and were distributed uni-
formly over the bone boundary Such images are called hand annotated or manually
landmarked training images
Figure 43 shows the original image and the manually landmarked image for training
While performing tests using different number of landmark points a subset of these
landmarks points is chosen
After the training images have been landmarked the ASM produces two types of
sub-models [24] These are the profile model and the shape model
1 The profile model analyzes the landmark points and stores the behaviour of the
image around the landmark points So during training the algorithm learns
the characteristics of the area around the landmark points and builds a profile
model for each landmark point accordingly When searching for the shape in
the test image the area near the tentative landmarks is examined and the model moves the
shape to an area that fits closely to the profile model The
tentative location of the landmarks is obtained from the suggested shape
2 The shape model defines the permissible relative positions of landmarks This
introduces a constraint on the shape So as the profile model tries to find the
area in the test image that tries to fit the model the shape model ensures that
the mean shape is not changed The profile model acts on individual landmarks
whereas the shape acts globally on the image So both the models try to correct
each other until no further improvements in matching are possible
421 The ASM Model
The aim of the model is to try to convert the shape proposed by the individual
profiles into an allowable shape So it tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant
The shape is learnt from manually landmarked training images These images are
aligned and a mean shape is formulated with the permissible variations in it [24]
^x = x + ₵b where
^x is the generated shape vector by the model
x is the mean shape the average of the aligned training shapes
xi
422 Generating shapes from the model
As seen in Equation 43 different shapes can be generated by changing the value of
b The model is varied in height and width finding optimum values for landmarks
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image The points that are perpendicular to the model are called _whiskers and they help the
profile model in analyzing the area around the landmark points
The shape created by the landmark points are used for the shape model and the
whisker profiles around the landmark points are used for the profile model A profile
and a covariance matrix is built for each landmark It is assumed that the profiles
are distributed as a multivariate Gaussian and so they can be described by their
mean pro_le g and the covariance matrix Sg
423 Searching the test image
After the training is over the shape is searched in the test image The mean shape
calculated from the training images is imposed on the image and the profiles around
the landmark points are search and examined The profiles are offset 3 pixels
along the whisker which is perpendicular to the shape to get the accurate area
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
model confirms that the shape is the same as the mean shape The shape model
assures that the pro_le model has not changed the shape If the shape model were
not employed the pro_le model may give the best pro_le results but the resulting
shape may be completely di_erent So as mentioned before the two models restrict
each other A multi-resolution search is done to make the model more robust This
enables the model to be more accurate as it can lock on to the shape from further
away So the model searches over a series of di_erent resolutions of the same image
called an image pyramid The resolutions of the images can be set and changed
in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of
the images are given relative to the _rst image A general picture and not a bone
32
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
Contour Plot is useful for delineating organ boundaries in images It displays isolines of a surface represented by a matrix (Figure 6) For example Figure
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)
Figure 11 – Magnitude and Phase Response; frequency scale: (a) linear, (b) log
Figure 12 – Group Delay Response; frequency scale: (a) linear, (b) log
Figure 13 – Phase Delay Response; frequency scale: (a) linear, (b) log
Figure 14 – (a) Impulse Response, (b) Pole-Zero Plot
Figure 15 – Step Response: (a) default, (b) specified length 50
Figure 16 – Magnitude Response Estimate; frequency scale: (a) linear, (b) log
Figure 17 – Magnitude Response and Round-off Noise Power Spectrum; frequency scale: (a) linear, (b) log
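FVTool itself is a MATLAB GUI, but the quantities it displays follow directly from the coefficients b and a. A minimal NumPy sketch (an analogue, not FVTool; the helper name `freq_response` is ours) evaluates the frequency response H(e^{jw}) = B(e^{jw})/A(e^{jw}), from which the magnitude and phase responses are derived:

```python
import numpy as np

def freq_response(b, a, n=512):
    """Evaluate H(e^{jw}) = B(e^{jw}) / A(e^{jw}) at n frequencies on [0, pi).

    b and a are the numerator and denominator coefficient sequences.
    """
    w = np.linspace(0.0, np.pi, n, endpoint=False)
    B = np.exp(-1j * np.outer(w, np.arange(len(b)))) @ np.asarray(b, dtype=float)
    A = np.exp(-1j * np.outer(w, np.arange(len(a)))) @ np.asarray(a, dtype=float)
    return w, B / A

# Example: a two-tap moving-average (lowpass) FIR filter.
w, h = freq_response([0.5, 0.5], [1.0])
mag_db = 20.0 * np.log10(np.abs(h) + 1e-12)  # magnitude response in dB
phase = np.unwrap(np.angle(h))               # phase response in radians
```

The group delay shown in Figure 12 is the negative derivative of this phase with respect to frequency.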
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM works in the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, which also describes the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array in which the n rows represent the points and the two columns hold the x and y co-ordinates respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the ordering of the points clearer [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((y2 − y1)² + (x2 − x1)²)    (4.1)
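Applying this point-wise gives the distance between two shapes with corresponding landmarks. A minimal NumPy sketch (the helper name `shape_distance` is ours, not from the thesis code):

```python
import numpy as np

def shape_distance(s1, s2):
    """Mean Euclidean distance between corresponding points of two shapes.

    Each shape is an (n, 2) array of (x, y) landmark co-ordinates.
    """
    return float(np.mean(np.linalg.norm(s1 - s2, axis=1)))

# Two two-point shapes offset by the (3, 4) vector: every point is 5 apart.
s1 = np.array([[0.0, 0.0], [1.0, 0.0]])
s2 = s1 + np.array([3.0, 4.0])
```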
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of a shape is the root-mean-square distance between its points and the centroid. This can be used to measure the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).
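These two quantities can be sketched directly from the definitions (helper names are ours, not from the thesis code):

```python
import numpy as np

def centroid(shape):
    """Centroid of an (n, 2) shape: the mean of the point positions."""
    return shape.mean(axis=0)

def shape_size(shape):
    """Root-mean-square distance of the points from the centroid."""
    d = np.linalg.norm(shape - centroid(shape), axis=1)
    return float(np.sqrt(np.mean(d ** 2)))

# A square centered on the origin: centroid (0, 0), size sqrt(2).
sq = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
```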
Algorithm 1: Aligning shapes
Input: a set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size; call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align it to x0 and scale it to unit size).
5. Until convergence (i.e., the mean shape no longer changes much).
Output: the set of aligned shapes and the mean shape
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia was separated from a full-body X-ray (as shown in Figure 1.2), and the resulting images were resized to the same dimensions. This ensured uniformity in the quality of the data being used. Training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to the area that best fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which constrains the shape. As the profile model searches for the area of the test image that best fits the model, the shape model ensures that the overall shape does not stray from the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image, so the two models correct each other until no further improvement in the match is possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area of the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations about it [24]:
x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e., the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance of the training shapes, and
b is the vector of shape parameters.
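A minimal NumPy sketch of this linear shape model, assuming the aligned training shapes are stacked as rows of a matrix (helper names are ours, not from the thesis code):

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """Build the mean shape and principal modes of variation.

    `shapes` is an (m, 2n) array of aligned training shape vectors,
    one row per training example.
    """
    x_bar = shapes.mean(axis=0)               # mean shape
    cov = np.cov(shapes, rowvar=False)        # covariance of the shapes
    vals, vecs = np.linalg.eigh(cov)          # eigen-decomposition (ascending)
    order = np.argsort(vals)[::-1][:n_modes]  # keep the largest modes
    return x_bar, vecs[:, order]

def generate_shape(x_bar, phi, b):
    """Generate a shape x_hat = x_bar + Phi @ b for a parameter vector b."""
    return x_bar + phi @ b

# Toy data: 5 training shapes of 3 points each (6-vectors).
rng = np.random.default_rng(0)
shapes = rng.normal(size=(5, 6))
x_bar, phi = build_shape_model(shapes, n_modes=2)
```

Setting b = 0 reproduces the mean shape; varying each component of b within a few standard deviations generates the permissible shapes.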
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles follow a multivariate Gaussian distribution, so they can be described by their mean profile ḡ and covariance matrix Sg.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
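This fit can be sketched in a few lines of NumPy (the helper name `profile_fit` is ours; solving the linear system avoids explicitly inverting Sg):

```python
import numpy as np

def profile_fit(g, g_bar, S_g):
    """Squared Mahalanobis distance of a sampled profile g from the
    mean profile g_bar with covariance S_g."""
    d = g - g_bar
    return float(d @ np.linalg.solve(S_g, d))

# With identity covariance this reduces to the squared Euclidean distance.
fit = profile_fit(np.array([3.0, 4.0]), np.zeros(2), np.eye(2))
```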
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape; it ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. As mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid, with the sizes of the images given relative to the first image (a general picture, rather than a bone X-ray, is shown).
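An image pyramid can be sketched as repeated downsampling. The version below uses plain 2×2 block averaging for brevity (a real implementation would typically smooth before subsampling; the helper name is ours):

```python
import numpy as np

def image_pyramid(image, levels=3):
    """Build a simple image pyramid by repeated 2x2 block averaging."""
    pyramid = [image]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape
        # Crop to even dimensions, then average each 2x2 block.
        coarse = pyramid[-1][: h - h % 2, : w - w % 2]
        coarse = coarse.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(coarse)
    return pyramid

# An 8x8 image yields levels of size 8x8, 4x4, and 2x2.
pyr = image_pyramid(np.ones((8, 8)), levels=3)
```

The search starts at the coarsest level, where the shape can be locked on from far away, and the result is refined at each finer level.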
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized, i.e., started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40–50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone, the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8(6):679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412–425, 2000.
Figure 6 - Contour Plot of X-ray CT brain scan
The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)
Figure 7 ndash Surfc on X-ray CT brain scan
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points As shown in Figure 41 a shape can be represented by a
diagram showing the points or as a n _ 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively In
this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-
ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of
any ASM as it stays the same even if
it is scaled rotated or translated The lines connecting the points are not part of the shape but
they are shown to make the shape and order of the points more clear [24]
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them Equa-
tion 41 gives the formula for Euclidean distance between two points (x1 y1) and
x2 y2 The distance between two shapes can be de_ned as the distance between
their corresponding points [24] There are other ways of de_ning distances between
two points like the Procrustes distance but in this thesis the distance means the
Euclidean distance
radic (y2 - y1)2 + (x2 - x1)2
The centroid x of a shape x can be de_ned as the mean of the point positions
[24] The centroid can be useful while aligning shapes or _nding an automatic
initialization technique (discussed in 44) The size of the shape is the root mean
distance between the points and the centroid This can be used in measuring the
size of the test image which will help with the automatic initialization (discussed in
44)
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
42 Active Shape Models
The ASM has to be trained using training images In this project the tibia bone
was separated from a full-body X-ray (as shown in 12) and then those images were
re-sized to the same dimensions This ensured uniformity in the quality of data
being used The training on the images was done by manually selecting landmarks
Landmarks were placed at approximately equal intervals and were distributed uni-
formly over the bone boundary Such images are called hand annotated or manually
landmarked training images
Figure 43 shows the original image and the manually landmarked image for training
While performing tests using different number of landmark points a subset of these
landmarks points is chosen
After the training images have been landmarked the ASM produces two types of
sub-models [24] These are the profile model and the shape model
1 The profile model analyzes the landmark points and stores the behaviour of the
image around the landmark points So during training the algorithm learns
the characteristics of the area around the landmark points and builds a profile
model for each landmark point accordingly When searching for the shape in
the test image the area near the tentative landmarks is examined and the model moves the
shape to an area that fits closely to the profile model The
tentative location of the landmarks is obtained from the suggested shape
2 The shape model defines the permissible relative positions of landmarks This
introduces a constraint on the shape So as the profile model tries to find the
area in the test image that tries to fit the model the shape model ensures that
the mean shape is not changed The profile model acts on individual landmarks
whereas the shape acts globally on the image So both the models try to correct
each other until no further improvements in matching are possible
421 The ASM Model
The aim of the model is to try to convert the shape proposed by the individual
profiles into an allowable shape So it tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant
The shape is learnt from manually landmarked training images These images are
aligned and a mean shape is formulated with the permissible variations in it [24]
^x = x + ₵b where
^x is the generated shape vector by the model
x is the mean shape the average of the aligned training shapes
xi
422 Generating shapes from the model
As seen in Equation 43 different shapes can be generated by changing the value of
b The model is varied in height and width finding optimum values for landmarks
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image The points that are perpendicular to the model are called _whiskers and they help the
profile model in analyzing the area around the landmark points
The shape created by the landmark points are used for the shape model and the
whisker profiles around the landmark points are used for the profile model A profile
and a covariance matrix is built for each landmark It is assumed that the profiles
are distributed as a multivariate Gaussian and so they can be described by their
mean pro_le g and the covariance matrix Sg
423 Searching the test image
After the training is over the shape is searched in the test image The mean shape
calculated from the training images is imposed on the image and the profiles around
the landmark points are search and examined The profiles are offset 3 pixels
along the whisker which is perpendicular to the shape to get the accurate area
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
model confirms that the shape is the same as the mean shape The shape model
assures that the pro_le model has not changed the shape If the shape model were
not employed the pro_le model may give the best pro_le results but the resulting
shape may be completely di_erent So as mentioned before the two models restrict
each other A multi-resolution search is done to make the model more robust This
enables the model to be more accurate as it can lock on to the shape from further
away So the model searches over a series of di_erent resolutions of the same image
called an image pyramid The resolutions of the images can be set and changed
in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of
the images are given relative to the _rst image A general picture and not a bone
32
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)
Figure 8 ndash Contour3 on X-ray CT brain scan
3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)
Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns hold the x and y co-ordinates respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
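The two representations are easy to convert between. A minimal Python sketch (the helper names to_vector and to_points are illustrative, not from the thesis code):

```python
def to_vector(points):
    """Flatten [(x1, y1), ..., (xn, yn)] into the 2n-vector [x1..xn, y1..yn]."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs + ys

def to_points(vec):
    """Inverse: rebuild the list of (x, y) pairs from a 2n-vector."""
    n = len(vec) // 2
    return list(zip(vec[:n], vec[n:]))

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
v = to_vector(triangle)        # [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
assert to_points(v) == triangle
```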
The distance between two points is the Euclidean distance between them; Equation 4.1 gives the formula for two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining the distance between two points, such as the Procrustes distance, but in this thesis "distance" means the Euclidean distance.

d = √((x2 - x1)² + (y2 - y1)²)    (4.1)
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to estimate the size of the shape in the test image, which helps with automatic initialization (discussed in Section 4.4).
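Both quantities follow directly from their definitions. A small illustrative Python sketch (the function names are my own):

```python
import math

def centroid(points):
    """Mean of the landmark positions."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return (cx, cy)

def shape_size(points):
    """Root mean square distance from the landmarks to the centroid."""
    cx, cy = centroid(points)
    return math.sqrt(sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2
                         for p in points) / len(points))

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
centroid(square)    # (1.0, 1.0)
shape_size(square)  # sqrt(2), about 1.414
```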
Algorithm 1: Aligning shapes
Input: a set of unaligned shapes.
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size; call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align it to x0 and scale it to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: the set of aligned shapes and the mean shape.
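Algorithm 1 can be sketched as follows. This is an illustrative Python implementation that assumes 2-D similarity alignment (translation, scale, rotation) in the least-squares sense and a fixed iteration count in place of a convergence test; all helper names are my own:

```python
import math

def center(pts):
    """Step 2: translate a shape so its centroid is at the origin."""
    cx = sum(p[0] for p in pts) / len(pts)
    cy = sum(p[1] for p in pts) / len(pts)
    return [(x - cx, y - cy) for x, y in pts]

def scale_to_unit(pts):
    """Scale a centred shape so its norm is 1."""
    s = math.sqrt(sum(x * x + y * y for x, y in pts))
    return [(x / s, y / s) for x, y in pts]

def align(pts, ref):
    """Least-squares rotate/scale a centred shape onto a centred reference."""
    dot = sum(x * u + y * v for (x, y), (u, v) in zip(pts, ref))
    crs = sum(x * v - y * u for (x, y), (u, v) in zip(pts, ref))
    nrm = sum(x * x + y * y for x, y in pts)
    a, b = dot / nrm, crs / nrm      # a = s*cos(theta), b = s*sin(theta)
    return [(a * x - b * y, b * x + a * y) for x, y in pts]

def mean_shape(shapes, iters=10):
    """Algorithm 1: iteratively align shapes and re-estimate the mean."""
    shapes = [center(s) for s in shapes]               # step 2
    mean = scale_to_unit(shapes[0])                    # steps 1 and 3
    for _ in range(iters):                             # step 4
        aligned = [align(s, mean) for s in shapes]     # (a)
        n = len(aligned)
        mean = [(sum(s[i][0] for s in aligned) / n,    # (b)
                 sum(s[i][1] for s in aligned) / n) for i in range(len(mean))]
        mean = scale_to_unit(center(mean))             # (c) constrain the mean
    return mean, aligned
```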
4.2 Active Shape Models
The ASM has to be trained on training images. In this project the tibia was separated from a full-body X-ray (as shown in Figure 1.2), and the resulting images were resized to the same dimensions. This ensured uniformity in the quality of the data being used. Training was done by manually selecting landmarks: landmarks were placed at approximately equal intervals and distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM builds two sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that landmark accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model searches for the area of the test image that best fits its profiles, the shape model ensures that the shape does not deviate too far from the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models correct each other until no further improvement in the match is possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area of the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is computed, together with the permissible variations around it [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e. the average of the aligned training shapes xi,
Φ is the matrix whose columns are the eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters.
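The model equation x̂ = x̄ + Φb is just a matrix-vector product added to the mean. A toy Python sketch, assuming Φ has already been obtained from a PCA of the aligned training shapes (here a single hand-picked mode, not real data):

```python
def generate_shape(mean, phi, b):
    """x_hat = x_bar + Phi * b, everything in 2n-vector (x-then-y) form.

    mean : 2n-vector, the mean shape x_bar
    phi  : 2n-by-t matrix whose columns are eigenvectors of the shape covariance
    b    : t-vector of shape parameters
    """
    t = len(b)
    return [m + sum(phi[i][j] * b[j] for j in range(t))
            for i, m in enumerate(mean)]

# Toy example: one mode that stretches the triangle along x.
mean = [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]                    # x-then-y layout
phi  = [[-1.0], [1.0], [0.0], [0.0], [0.0], [0.0]]       # single column
generate_shape(mean, phi, [0.5])   # [-0.5, 1.5, 0.0, 0.0, 0.0, 1.0]
```

In practice each parameter b_j is usually limited to about ±3 standard deviations of its mode (±3√λ_j, with λ_j the corresponding eigenvalue) so that only plausible shapes are generated.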
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at the landmark points are called "whiskers", and they help the profile model analyze the area around the landmark points.
The shape formed by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
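The mean profile ḡ and covariance Sg for a landmark can be estimated from the profiles sampled in the training images. An illustrative pure-Python sketch (sample covariance with n − 1 normalization; the function name is my own):

```python
def profile_stats(samples):
    """Mean profile g_bar and covariance S_g from sampled gray-level profiles.

    samples : list of equal-length profiles, one per training image
    """
    n = len(samples)
    k = len(samples[0])
    g_bar = [sum(s[i] for s in samples) / n for i in range(k)]
    S = [[sum((s[i] - g_bar[i]) * (s[j] - g_bar[j]) for s in samples) / (n - 1)
          for j in range(k)] for i in range(k)]
    return g_bar, S

g_bar, S = profile_stats([[1.0, 2.0], [3.0, 4.0]])
# g_bar == [2.0, 3.0]; S == [[2.0, 2.0], [2.0, 2.0]]
```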
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the model profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g - ḡ)ᵀ Sg⁻¹ (g - ḡ).

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the resulting shape is still consistent with the mean shape; the shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile matches but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is performed to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid, with the sizes of the images given relative to the first image (a general picture, rather than a bone X-ray, is used for illustration).
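The profile search above can be sketched as follows: the Mahalanobis distance is evaluated for the profile sampled at each candidate offset along the whisker, and the offset with the lowest distance wins. Sg⁻¹ is assumed to be precomputed, and the helper names are illustrative:

```python
def mahalanobis(g, g_bar, S_inv):
    """Squared Mahalanobis distance (g - g_bar)^T S_inv (g - g_bar)."""
    d = [gi - mi for gi, mi in zip(g, g_bar)]
    k = len(d)
    return sum(d[i] * sum(S_inv[i][j] * d[j] for j in range(k))
               for i in range(k))

def best_offset(candidate_profiles, g_bar, S_inv):
    """Index of the whisker offset whose profile best matches the model."""
    return min(range(len(candidate_profiles)),
               key=lambda k: mahalanobis(candidate_profiles[k], g_bar, S_inv))

I = [[1.0, 0.0], [0.0, 1.0]]          # toy S_inv: identity covariance
best_offset([[2.0, 2.0], [0.0, 1.0], [5.0, 5.0]], [0.0, 0.0], I)  # 1
```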
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The dataset in this thesis has 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts off wherever the mean shape is located, which may not be near the bone in a test image; the model therefore needs to be initialized, i.e. started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is searching a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.
Figure 9: 3-D Lit Surface Plot of X-ray CT brain scan
The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10).
Figure 10: The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISUALIZATION TOOL (FVTool)
The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator coefficients b and denominator coefficients a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).
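FVTool itself is a MATLAB tool, but the quantity behind its magnitude and phase plots, H(e^jw) = B(e^jw)/A(e^jw), is easy to sketch. An illustrative pure-Python evaluation on a coarse frequency grid (this is not FVTool's API; the function name is my own):

```python
import cmath
import math

def freq_response(b, a, n_points=8):
    """Evaluate H(e^{jw}) = B(e^{jw}) / A(e^{jw}) at n_points frequencies in [0, pi].

    b, a are the filter's numerator and denominator coefficients in powers
    of z^-1, as in FVTool. Returns (w, magnitude, phase) triples.
    """
    out = []
    for k in range(n_points):
        w = math.pi * k / (n_points - 1)
        z_inv = cmath.exp(-1j * w)          # e^{-jw}
        num = sum(bi * z_inv ** i for i, bi in enumerate(b))
        den = sum(ai * z_inv ** i for i, ai in enumerate(a))
        h = num / den
        out.append((w, abs(h), cmath.phase(h)))
    return out

# Two-tap moving average: unity gain at DC, a null at the Nyquist frequency.
resp = freq_response([0.5, 0.5], [1.0])
```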
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points As shown in Figure 41 a shape can be represented by a
diagram showing the points or as a n _ 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively In
this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-
ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of
any ASM as it stays the same even if
it is scaled rotated or translated The lines connecting the points are not part of the shape but
they are shown to make the shape and order of the points more clear [24]
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them Equa-
tion 41 gives the formula for Euclidean distance between two points (x1 y1) and
x2 y2 The distance between two shapes can be de_ned as the distance between
their corresponding points [24] There are other ways of de_ning distances between
two points like the Procrustes distance but in this thesis the distance means the
Euclidean distance
radic (y2 - y1)2 + (x2 - x1)2
The centroid x of a shape x can be de_ned as the mean of the point positions
[24] The centroid can be useful while aligning shapes or _nding an automatic
initialization technique (discussed in 44) The size of the shape is the root mean
distance between the points and the centroid This can be used in measuring the
size of the test image which will help with the automatic initialization (discussed in
44)
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
42 Active Shape Models
The ASM has to be trained using training images In this project the tibia bone
was separated from a full-body X-ray (as shown in 12) and then those images were
re-sized to the same dimensions This ensured uniformity in the quality of data
being used The training on the images was done by manually selecting landmarks
Landmarks were placed at approximately equal intervals and were distributed uni-
formly over the bone boundary Such images are called hand annotated or manually
landmarked training images
Figure 43 shows the original image and the manually landmarked image for training
While performing tests using different number of landmark points a subset of these
landmarks points is chosen
After the training images have been landmarked the ASM produces two types of
sub-models [24] These are the profile model and the shape model
1 The profile model analyzes the landmark points and stores the behaviour of the
image around the landmark points So during training the algorithm learns
the characteristics of the area around the landmark points and builds a profile
model for each landmark point accordingly When searching for the shape in
the test image the area near the tentative landmarks is examined and the model moves the
shape to an area that fits closely to the profile model The
tentative location of the landmarks is obtained from the suggested shape
2 The shape model defines the permissible relative positions of landmarks This
introduces a constraint on the shape So as the profile model tries to find the
area in the test image that tries to fit the model the shape model ensures that
the mean shape is not changed The profile model acts on individual landmarks
whereas the shape acts globally on the image So both the models try to correct
each other until no further improvements in matching are possible
421 The ASM Model
The aim of the model is to try to convert the shape proposed by the individual
profiles into an allowable shape So it tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant
The shape is learnt from manually landmarked training images These images are
aligned and a mean shape is formulated with the permissible variations in it [24]
^x = x + ₵b where
^x is the generated shape vector by the model
x is the mean shape the average of the aligned training shapes
xi
422 Generating shapes from the model
As seen in Equation 43 different shapes can be generated by changing the value of
b The model is varied in height and width finding optimum values for landmarks
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image The points that are perpendicular to the model are called _whiskers and they help the
profile model in analyzing the area around the landmark points
The shape created by the landmark points are used for the shape model and the
whisker profiles around the landmark points are used for the profile model A profile
and a covariance matrix is built for each landmark It is assumed that the profiles
are distributed as a multivariate Gaussian and so they can be described by their
mean pro_le g and the covariance matrix Sg
423 Searching the test image
After the training is over the shape is searched in the test image The mean shape
calculated from the training images is imposed on the image and the profiles around
the landmark points are search and examined The profiles are offset 3 pixels
along the whisker which is perpendicular to the shape to get the accurate area
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
model confirms that the shape is the same as the mean shape The shape model
assures that the pro_le model has not changed the shape If the shape model were
not employed the pro_le model may give the best pro_le results but the resulting
shape may be completely di_erent So as mentioned before the two models restrict
each other A multi-resolution search is done to make the model more robust This
enables the model to be more accurate as it can lock on to the shape from further
away So the model searches over a series of di_erent resolutions of the same image
called an image pyramid The resolutions of the images can be set and changed
in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of
the images are given relative to the _rst image A general picture and not a bone
32
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)
Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points As shown in Figure 41 a shape can be represented by a
diagram showing the points or as a n _ 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively In
this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-
ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of
any ASM as it stays the same even if
it is scaled rotated or translated The lines connecting the points are not part of the shape but
they are shown to make the shape and order of the points more clear [24]
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them Equa-
tion 41 gives the formula for Euclidean distance between two points (x1 y1) and
x2 y2 The distance between two shapes can be de_ned as the distance between
their corresponding points [24] There are other ways of de_ning distances between
two points like the Procrustes distance but in this thesis the distance means the
Euclidean distance
radic (y2 - y1)2 + (x2 - x1)2
The centroid x of a shape x can be de_ned as the mean of the point positions
[24] The centroid can be useful while aligning shapes or _nding an automatic
initialization technique (discussed in 44) The size of the shape is the root mean
distance between the points and the centroid This can be used in measuring the
size of the test image which will help with the automatic initialization (discussed in
44)
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
42 Active Shape Models
The ASM has to be trained using training images In this project the tibia bone
was separated from a full-body X-ray (as shown in 12) and then those images were
re-sized to the same dimensions This ensured uniformity in the quality of data
being used The training on the images was done by manually selecting landmarks
Landmarks were placed at approximately equal intervals and were distributed uni-
formly over the bone boundary Such images are called hand annotated or manually
landmarked training images
Figure 43 shows the original image and the manually landmarked image for training
While performing tests using different number of landmark points a subset of these
landmarks points is chosen
After the training images have been landmarked, the ASM builds two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in a test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model searches the test image for areas that fit its profiles, the shape model ensures that the result remains a plausible deformation of the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models correct each other until no further improvement in the match is possible.
4.2.1 The ASM Model
The aim of this sub-model is to convert the shape proposed by the individual profiles into an allowable shape: the ASM tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape plausible.
The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape together with its permissible variations is formulated [24]:
x̂ = x̄ + Φb   (4.3)
where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e. the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance of the aligned shapes, and
b is a vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
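The shape model of Equation 4.3 can be sketched as a small PCA over the aligned training shapes. This is an illustrative fragment under stated assumptions (shapes stored as 2n-dimensional row vectors; the function names are hypothetical, not from the thesis code):

```python
import numpy as np

def build_shape_model(aligned, var_kept=0.98):
    # aligned: (m, 2n) array, one row per aligned training shape.
    x_bar = aligned.mean(axis=0)                        # mean shape
    cov = np.cov(aligned, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)              # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # descending
    # keep enough modes to explain var_kept of the total variance
    t = np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), var_kept) + 1
    return x_bar, eigvecs[:, :t], eigvals[:t]

def generate_shape(x_bar, phi, b):
    # Equation 4.3: x_hat = x_bar + Phi b
    return x_bar + phi @ b

# b is typically limited to about +/- 3*sqrt(eigenvalue) per mode, so
# generated shapes stay within the variation seen in the training set.
```

Setting b to the zero vector reproduces the mean shape; varying one component of b sweeps the shape along one mode of variation.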
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape formed by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. The profiles are assumed to follow a multivariate Gaussian distribution, so they can be described by their mean profile ḡ and covariance matrix Sg.
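Building a profile model per landmark can be sketched as follows. This is a minimal sketch, assuming grey-level profiles sampled at integer pixel positions along the whisker (the whisker is assumed to stay inside the image; sampling and normalization details vary between ASM implementations):

```python
import numpy as np

def sample_profile(image, point, normal, k=6):
    # Sample 2k+1 pixels along the whisker (unit normal) at a landmark,
    # using nearest-pixel lookup for simplicity.
    offs = np.arange(-k, k + 1)
    pts = np.rint(point + offs[:, None] * normal).astype(int)
    vals = image[pts[:, 1], pts[:, 0]].astype(float)   # (y, x) indexing
    g = np.gradient(vals)                  # derivative (edge) profile
    return g / (np.abs(g).sum() + 1e-9)    # normalize the profile

def build_profile_model(profiles):
    # profiles: (m, 2k+1) samples of one landmark across training images;
    # the Gaussian assumption means the model is just (g_bar, S_g).
    g_bar = profiles.mean(axis=0)
    S_g = np.cov(profiles, rowvar=False)
    return g_bar, S_g
```

One (ḡ, Sg) pair is stored per landmark and later compared against test profiles during search.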
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the position that most closely matches the trained profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by
f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
If the model is initialized correctly (discussed in Section 4.4), one of the candidate profiles will have the lowest distance. This procedure is performed for every landmark point, and the shape model then verifies that the suggested shape is still a plausible variant of the mean shape. The shape model thus ensures that the profile model has not distorted the shape: without it, the profile model might return the best-matching profiles while the resulting shape is completely different. As mentioned before, the two models restrict each other.
A multi-resolution search is performed to make the model more robust. It also makes the model more accurate, as the model can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid; the resolutions can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid, with image sizes given relative to the first image (a general picture, rather than a bone X-ray, is used for illustration).
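The per-landmark Mahalanobis search and the image pyramid can be sketched as follows. This is an illustrative fragment, not the thesis code: the candidate profiles and the trained ḡ and Sg⁻¹ are assumed given, and the pyramid uses simple 2x2 averaging.

```python
import numpy as np

def best_offset(profiles, g_bar, S_g_inv, max_off=3):
    # profiles: candidate test profiles sampled at whisker offsets
    # -max_off..max_off; pick the offset with the lowest Mahalanobis
    # distance f(g) = (g - g_bar)^T S_g^{-1} (g - g_bar).
    dists = [float((g - g_bar) @ S_g_inv @ (g - g_bar)) for g in profiles]
    return int(np.argmin(dists)) - max_off     # best offset in pixels

def image_pyramid(image, levels=3):
    # Coarse-to-fine search: halve the resolution at each level by
    # 2x2 averaging (sizes relative to the first image: 1, 1/2, 1/4, ...).
    pyr = [image]
    for _ in range(levels - 1):
        im = pyr[-1]
        h, w = (im.shape[0] // 2) * 2, (im.shape[1] // 2) * 2
        im = im[:h, :w]
        pyr.append((im[0::2, 0::2] + im[1::2, 0::2]
                    + im[0::2, 1::2] + im[1::2, 1::2]) / 4.0)
    return pyr
```

The search runs on the coarsest level first and the resulting shape seeds the search at the next finer level, which is what lets the model lock on from further away.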
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct the experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to build a profile model from each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The data set in this thesis consists of 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images when fitting the test image. It creates the mean shape and profiles from all the training images using the landmark points. However, the ASM starts wherever the mean shape is placed, which may not be near the bone in a test image, so the model needs to be initialized, that is, started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is searching in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412–425, 2000.
Figure 1.1 Magnitude and Phase Response - frequency scale: (a) linear, (b) log
Figure 1.2 Group Delay Response - frequency scale: (a) linear, (b) log
Figure 1.3 Phase Delay Response - frequency scale: (a) linear, (b) log
Figure 1.4 (a) Impulse Response, (b) Pole-Zero Plot
Figure 1.5 Step Response: (a) default, (b) specified length 50
Figure 1.6 Magnitude Response Estimate - frequency scale: (a) linear, (b) log
Figure 1.7 Magnitude Response and Round-off Noise Power Spectrum - frequency scale: (a) linear, (b) log
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function, according to which the performance of the ASM on bone X-rays will be judged.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array in which the n rows represent the points and the two columns hold their x and y co-ordinates respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the ordering of the points clearer [24].
Figure 4.1 Example of a shape
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2):
d = √((x2 − x1)² + (y2 − y1)²)   (4.1)
The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining the distance between two shapes, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.
The centroid x̄ of a shape x is the mean of the point positions [24]. The centroid is useful when aligning shapes or devising an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid; this can be used to measure the size of the shape in the test image, which helps with automatic initialization (also discussed in Section 4.4).
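The definitions above can be sketched directly in code. A minimal illustration, assuming shapes are given as (n, 2) arrays of landmark points (the function names are illustrative, not from the thesis code):

```python
import numpy as np

def to_vector(points):
    # (n, 2) array of points -> 2n-element shape vector:
    # all x co-ordinates first, then all y co-ordinates.
    return np.concatenate([points[:, 0], points[:, 1]])

def centroid(points):
    # Centroid: the mean of the point positions.
    return points.mean(axis=0)

def shape_size(points):
    # Size: root mean square distance of the points from the centroid.
    d = points - centroid(points)
    return np.sqrt((d ** 2).sum(axis=1).mean())
```

Note that `shape_size` is invariant to translation and rotation but not to scaling, which is exactly why it is useful for sizing the model to a test image.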
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
(a) Align all shapes to the mean shape.
(b) Recalculate the mean shape from the aligned shapes.
(c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e. the mean shape does not change much).
Output: set of aligned shapes and mean shape.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
Figure 12 Group Delay Response - Frequency scale a) linear b) log
Figure 13 Phase Delay Response - Frequency scale a) linear b) log
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points As shown in Figure 41 a shape can be represented by a
diagram showing the points or as a n _ 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively In
this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-
ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of
any ASM as it stays the same even if
it is scaled rotated or translated The lines connecting the points are not part of the shape but
they are shown to make the shape and order of the points more clear [24]
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them Equa-
tion 41 gives the formula for Euclidean distance between two points (x1 y1) and
x2 y2 The distance between two shapes can be de_ned as the distance between
their corresponding points [24] There are other ways of de_ning distances between
two points like the Procrustes distance but in this thesis the distance means the
Euclidean distance
radic (y2 - y1)2 + (x2 - x1)2
The centroid x of a shape x can be de_ned as the mean of the point positions
[24] The centroid can be useful while aligning shapes or _nding an automatic
initialization technique (discussed in 44) The size of the shape is the root mean
distance between the points and the centroid This can be used in measuring the
size of the test image which will help with the automatic initialization (discussed in
44)
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
42 Active Shape Models
The ASM has to be trained using training images In this project the tibia bone
was separated from a full-body X-ray (as shown in 12) and then those images were
re-sized to the same dimensions This ensured uniformity in the quality of data
being used The training on the images was done by manually selecting landmarks
Landmarks were placed at approximately equal intervals and were distributed uni-
formly over the bone boundary Such images are called hand annotated or manually
landmarked training images
Figure 43 shows the original image and the manually landmarked image for training
While performing tests using different number of landmark points a subset of these
landmarks points is chosen
After the training images have been landmarked the ASM produces two types of
sub-models [24] These are the profile model and the shape model
1 The profile model analyzes the landmark points and stores the behaviour of the
image around the landmark points So during training the algorithm learns
the characteristics of the area around the landmark points and builds a profile
model for each landmark point accordingly When searching for the shape in
the test image the area near the tentative landmarks is examined and the model moves the
shape to an area that fits closely to the profile model The
tentative location of the landmarks is obtained from the suggested shape
2 The shape model defines the permissible relative positions of landmarks This
introduces a constraint on the shape So as the profile model tries to find the
area in the test image that tries to fit the model the shape model ensures that
the mean shape is not changed The profile model acts on individual landmarks
whereas the shape acts globally on the image So both the models try to correct
each other until no further improvements in matching are possible
421 The ASM Model
The aim of the model is to try to convert the shape proposed by the individual
profiles into an allowable shape So it tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant
The shape is learnt from manually landmarked training images These images are
aligned and a mean shape is formulated with the permissible variations in it [24]
^x = x + ₵b where
^x is the generated shape vector by the model
x is the mean shape the average of the aligned training shapes
xi
422 Generating shapes from the model
As seen in Equation 43 different shapes can be generated by changing the value of
b The model is varied in height and width finding optimum values for landmarks
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image The points that are perpendicular to the model are called _whiskers and they help the
profile model in analyzing the area around the landmark points
The shape created by the landmark points are used for the shape model and the
whisker profiles around the landmark points are used for the profile model A profile
and a covariance matrix is built for each landmark It is assumed that the profiles
are distributed as a multivariate Gaussian and so they can be described by their
mean pro_le g and the covariance matrix Sg
423 Searching the test image
After the training is over the shape is searched in the test image The mean shape
calculated from the training images is imposed on the image and the profiles around
the landmark points are search and examined The profiles are offset 3 pixels
along the whisker which is perpendicular to the shape to get the accurate area
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
Chapter 4
This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and defines an error function; the performance of the ASM on bone X-rays will be judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows correspond to the points and the two columns hold the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].
Figure 4.1: Example of a shape
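The two representations described above can be sketched as follows (Python with NumPy; the helper names are illustrative and not taken from the thesis code):

```python
import numpy as np

def to_shape_vector(points):
    """Flatten an n x 2 array of (x, y) points into a 2n shape vector,
    x co-ordinates first, then y co-ordinates (the thesis convention)."""
    points = np.asarray(points, dtype=float)
    return np.concatenate([points[:, 0], points[:, 1]])

def to_points(shape_vec):
    """Inverse: recover the n x 2 point array from a 2n shape vector."""
    n = shape_vec.size // 2
    return np.stack([shape_vec[:n], shape_vec[n:]], axis=1)
```

For three points, `to_shape_vector([[1, 2], [3, 4], [5, 6]])` yields the vector `[1, 3, 5, 2, 4, 6]`.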
The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between shapes, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
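A minimal sketch of Equation 4.1, with the shape distance taken as the mean distance between corresponding points (Python/NumPy; function names are illustrative):

```python
import numpy as np

def point_distance(p, q):
    """Euclidean distance between two points (Equation 4.1)."""
    return float(np.hypot(q[0] - p[0], q[1] - p[1]))

def shape_distance(shape_a, shape_b):
    """Distance between two shapes with corresponding landmarks,
    here the mean Euclidean distance between corresponding points."""
    return float(np.mean(np.linalg.norm(shape_a - shape_b, axis=1)))
```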
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of a shape is the root-mean-square distance between its points and the centroid. This can be used in measuring the size of the shape in the test image, which will help with the automatic initialization (discussed in Section 4.4).
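The centroid and the root-mean-square size defined above can be sketched as (Python/NumPy, illustrative helpers):

```python
import numpy as np

def centroid(points):
    """Centroid of a shape: the mean of the point positions."""
    return np.asarray(points, dtype=float).mean(axis=0)

def shape_size(points):
    """Size of a shape: root-mean-square distance of the points
    from their centroid."""
    pts = np.asarray(points, dtype=float)
    d = np.linalg.norm(pts - pts.mean(axis=0), axis=1)
    return float(np.sqrt(np.mean(d ** 2)))
```

For the unit square with corners (0,0), (2,0), (2,2), (0,2), the centroid is (1,1) and the size is √2.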
Algorithm 1: Aligning shapes
Input: a set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align it to x̄0 and scale it to unit size).
5. Until convergence (i.e. the mean shape no longer changes much).
Output: the set of aligned shapes and the mean shape
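Algorithm 1 can be sketched as below (Python/NumPy). The thesis does not specify how step 4(a) aligns a shape to the mean; this sketch uses an orthogonal-Procrustes rotation after the translation and scaling of steps 2–3, which is one common choice:

```python
import numpy as np

def normalize(shape):
    """Steps 2-3: center a shape on the origin and scale it to unit size."""
    s = shape - shape.mean(axis=0)
    return s / np.linalg.norm(s)

def align_to(shape, ref):
    """Rotate a normalized shape to best match ref (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    return shape @ (u @ vt)

def align_shapes(shapes, iters=50, tol=1e-10):
    """Iterative alignment of Algorithm 1; shapes are n x 2 arrays
    with corresponding landmarks."""
    shapes = [normalize(np.asarray(s, dtype=float)) for s in shapes]
    mean = shapes[0]                                   # step 1: reference shape
    for _ in range(iters):
        shapes = [align_to(s, mean) for s in shapes]   # (a) align to mean
        new_mean = normalize(np.mean(shapes, axis=0))  # (b) + (c) recompute, constrain
        if np.linalg.norm(new_mean - mean) < tol:      # step 5: convergence
            break
        mean = new_mean
    return shapes, mean
```

Aligning a shape with a rotated copy of itself brings the two into agreement, which is the behaviour the iteration relies on.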
4.2 Active Shape Models
The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the resulting images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training images were prepared by manually selecting landmarks: landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When tests use a different number of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that landmark accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the shape. The two models thus correct each other until no further improvements in matching are possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape together with its permissible variations is formulated [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the shape covariance, and
b is a vector of shape parameters.
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
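Building the model of Equation 4.3 and generating shapes from it can be sketched as follows (Python/NumPy). The 98% variance threshold is an assumption, a common default; the thesis does not state how many modes are kept:

```python
import numpy as np

def build_shape_model(aligned_shape_vectors, var_keep=0.98):
    """Build the linear shape model x_hat = x_bar + Phi b from aligned
    training shape vectors (one 2n vector per training image)."""
    X = np.asarray(aligned_shape_vectors, dtype=float)
    x_bar = X.mean(axis=0)
    # PCA on the shape covariance: eigenvectors are the modes of variation
    cov = np.cov(X - x_bar, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]                 # largest eigenvalues first
    evals, evecs = evals[order], evecs[:, order]
    # keep enough modes to explain var_keep of the total variance
    k = int(np.searchsorted(np.cumsum(evals) / evals.sum(), var_keep)) + 1
    return x_bar, evecs[:, :k], evals[:k]

def generate_shape(x_bar, Phi, b):
    """Equation 4.3: generate a new shape from the mean shape and parameters b."""
    return x_bar + Phi @ b
```

Setting b = 0 reproduces the mean shape; varying one component of b sweeps the shape along one mode of variation.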
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points. The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles follow a multivariate Gaussian distribution, so they can be described by their mean profile ḡ and covariance matrix Sg.
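Estimating the per-landmark mean profile ḡ and covariance Sg can be sketched as (Python/NumPy, illustrative helper):

```python
import numpy as np

def build_profile_model(profiles):
    """Mean profile g_bar and covariance S_g for one landmark, estimated
    from the profiles sampled in every training image (one 1-D profile
    per image, all of equal length)."""
    P = np.asarray(profiles, dtype=float)
    g_bar = P.mean(axis=0)          # mean profile across training images
    S_g = np.cov(P, rowvar=False)   # sample covariance of the profiles
    return g_bar, S_g
```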
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
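This profile search can be sketched as below (Python/NumPy, illustrative helpers). The ±3-pixel offset range follows the text; the squared-Mahalanobis form is the standard one, and the pseudo-inverse is an assumption to guard against a singular covariance:

```python
import numpy as np

def mahalanobis(g, g_bar, S_g_inv):
    """Squared Mahalanobis distance between a sampled test profile g
    and the mean profile g_bar."""
    d = g - g_bar
    return float(d @ S_g_inv @ d)

def best_offset(candidate_profiles, g_bar, S_g, offsets=range(-3, 4)):
    """Pick the whisker offset whose sampled profile is closest to the
    mean profile; candidate_profiles[i] was sampled at offsets[i]."""
    S_inv = np.linalg.pinv(S_g)  # pseudo-inverse handles singular S_g
    dists = [mahalanobis(g, g_bar, S_inv) for g in candidate_profiles]
    return list(offsets)[int(np.argmin(dists))]
```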
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is repeated for every landmark point, and then the shape model confirms that the suggested shape is still consistent with the mean shape. The shape model thus ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other.
A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone, is used for illustration); the sizes of the images are given relative to the first image.
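An image pyramid of the kind used in the multi-resolution search can be sketched as follows (Python/NumPy; 2 × 2 block averaging is an assumption for simplicity, and real implementations usually smooth before downsampling):

```python
import numpy as np

def image_pyramid(image, levels=4):
    """Build a simple image pyramid by halving the resolution at each
    level via 2 x 2 block averaging."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        img = img[:h, :w]  # crop odd rows/columns so blocks tile evenly
        pyramid.append(img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyramid
```

The search then starts at the coarsest level, where the shape can be found from further away, and refines the fit at each finer level.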
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error is expected to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The dataset in this thesis has 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks the shape learnt from the training images on to the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, that is, started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in areas away from the bone; it cannot find the bone as it is searching a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and since it starts away from the bone, the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.
Figure 14 (a) Impulse Response (b) PoleZero Plot
Figure 15 Step Response (a) Default (b) Specify Length 50
Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log
Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
Chapter 4
This chapter describes the workings of a typical ASM Although there are many ex- tensions
and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]
gives a complete description of the classical ASM Section 41 in- troduces shapes and shape
models in general Section 42 describes the workings and the components of the ASM The
parameters and variations that affect the performance of the ASM are explained in Section
43 The experiments that are performed in this thesis to improve the performance of the
model are also described
in this section The problem of initialization of the model in a test image is tackled in Section
44 Section 45 elaborates on the training of the ASM and the definition of an error function
The performance of the ASM on bone X-rays will be judged according to this error function
41 Shape Models
A shape is a collection of points As shown in Figure 41 a shape can be represented by a
diagram showing the points or as a n _ 2 array where the n rows represent the number of
points and the two columns represent the x and y co-ordinates of the points respectively In
this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-
ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of
any ASM as it stays the same even if
it is scaled rotated or translated The lines connecting the points are not part of the shape but
they are shown to make the shape and order of the points more clear [24]
Figure 41 Example of a shape
The distance between two points is the Euclidean distance between them Equa-
tion 41 gives the formula for Euclidean distance between two points (x1 y1) and
x2 y2 The distance between two shapes can be de_ned as the distance between
their corresponding points [24] There are other ways of de_ning distances between
two points like the Procrustes distance but in this thesis the distance means the
Euclidean distance
radic (y2 - y1)2 + (x2 - x1)2
The centroid x of a shape x can be de_ned as the mean of the point positions
[24] The centroid can be useful while aligning shapes or _nding an automatic
initialization technique (discussed in 44) The size of the shape is the root mean
distance between the points and the centroid This can be used in measuring the
size of the test image which will help with the automatic initialization (discussed in
44)
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
42 Active Shape Models
The ASM has to be trained using training images In this project the tibia bone
was separated from a full-body X-ray (as shown in 12) and then those images were
re-sized to the same dimensions This ensured uniformity in the quality of data
being used The training on the images was done by manually selecting landmarks
Landmarks were placed at approximately equal intervals and were distributed uni-
formly over the bone boundary Such images are called hand annotated or manually
landmarked training images
Figure 43 shows the original image and the manually landmarked image for training
While performing tests using different number of landmark points a subset of these
landmarks points is chosen
After the training images have been landmarked the ASM produces two types of
sub-models [24] These are the profile model and the shape model
1 The profile model analyzes the landmark points and stores the behaviour of the
image around the landmark points So during training the algorithm learns
the characteristics of the area around the landmark points and builds a profile
model for each landmark point accordingly When searching for the shape in
the test image the area near the tentative landmarks is examined and the model moves the
shape to an area that fits closely to the profile model The
tentative location of the landmarks is obtained from the suggested shape
2 The shape model defines the permissible relative positions of landmarks This
introduces a constraint on the shape So as the profile model tries to find the
area in the test image that tries to fit the model the shape model ensures that
the mean shape is not changed The profile model acts on individual landmarks
whereas the shape acts globally on the image So both the models try to correct
each other until no further improvements in matching are possible
421 The ASM Model
The aim of the model is to try to convert the shape proposed by the individual
profiles into an allowable shape So it tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant
The shape is learnt from manually landmarked training images These images are
aligned and a mean shape is formulated with the permissible variations in it [24]
^x = x + ₵b where
^x is the generated shape vector by the model
x is the mean shape the average of the aligned training shapes
xi
422 Generating shapes from the model
As seen in Equation 43 different shapes can be generated by changing the value of
b The model is varied in height and width finding optimum values for landmarks
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image The points that are perpendicular to the model are called _whiskers and they help the
profile model in analyzing the area around the landmark points
The shape created by the landmark points are used for the shape model and the
whisker profiles around the landmark points are used for the profile model A profile
and a covariance matrix is built for each landmark It is assumed that the profiles
are distributed as a multivariate Gaussian and so they can be described by their
mean pro_le g and the covariance matrix Sg
423 Searching the test image
After the training is over the shape is searched in the test image The mean shape
calculated from the training images is imposed on the image and the profiles around
the landmark points are search and examined The profiles are offset 3 pixels
along the whisker which is perpendicular to the shape to get the accurate area
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
model confirms that the shape is the same as the mean shape The shape model
assures that the pro_le model has not changed the shape If the shape model were
not employed the pro_le model may give the best pro_le results but the resulting
shape may be completely di_erent So as mentioned before the two models restrict
each other A multi-resolution search is done to make the model more robust This
enables the model to be more accurate as it can lock on to the shape from further
away So the model searches over a series of di_erent resolutions of the same image
called an image pyramid The resolutions of the images can be set and changed
in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of
the images are given relative to the _rst image A general picture and not a bone
32
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
Chapter 4
This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM works in the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, which also describes the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.
4.1 Shape Models
A shape is a collection of points. As shown in Figure 4.1, a shape can be represented either by a diagram showing the points or as an n × 2 array, where the n rows correspond to the points and the two columns hold their x and y co-ordinates respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are drawn only to make the shape and the ordering of the points clearer [24].
Figure 4.1: Example of a shape
The distance between two points is the Euclidean distance between them; Equation 4.1 gives the formula for two points (x_1, y_1) and (x_2, y_2). The distance between two shapes can then be defined as the distance between their corresponding points [24]. There are other ways of defining the distance between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}    (4.1)
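As a small illustration, the point and shape distances above might be computed as follows; the thesis does not specify how the per-point distances are aggregated over a shape, so the sum used here is an assumption.

```python
import numpy as np

def point_distance(p, q):
    """Euclidean distance between two points (Equation 4.1)."""
    return float(np.hypot(q[0] - p[0], q[1] - p[1]))

def shape_distance(a, b):
    """Distance between two shapes with corresponding points, taken here
    as the sum of the Euclidean distances between corresponding points
    (the aggregation is an assumption, not fixed by the text)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(np.linalg.norm(a - b, axis=1).sum())

print(point_distance((0, 0), (3, 4)))                      # 5.0
print(shape_distance([[0, 0], [1, 0]], [[0, 3], [1, 4]]))  # 3.0 + 4.0 = 7.0
```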
The centroid \bar{x} of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when devising an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root-mean-square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).
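For illustration, the centroid and root-mean-square size of a shape stored as a 2n × 1 vector (x co-ordinates first, then y) might be computed as follows; the function names are illustrative, not taken from the thesis code.

```python
import numpy as np

def centroid(shape):
    """Mean point position of a shape stored as a 2n-by-1 vector
    (all x co-ordinates, then all y co-ordinates)."""
    n = len(shape) // 2
    return np.array([shape[:n].mean(), shape[n:].mean()])

def rms_size(shape):
    """Root-mean-square distance of the points from the centroid."""
    n = len(shape) // 2
    pts = np.column_stack([shape[:n], shape[n:]])
    return np.sqrt(((pts - centroid(shape)) ** 2).sum(axis=1).mean())

# A unit square: centroid (0.5, 0.5), every corner sqrt(0.5) from it
square = np.array([0.0, 1.0, 1.0, 0.0,   # x co-ordinates
                   0.0, 0.0, 1.0, 1.0])  # y co-ordinates
print(centroid(square))   # [0.5 0.5]
print(rms_size(square))   # 0.7071...
```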
Algorithm 1: Aligning shapes
Input: a set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centred on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align it to x0 and scale it to unit size).
5. Until convergence (i.e. the mean shape no longer changes significantly).
Output: the set of aligned shapes and the mean shape
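A minimal sketch of Algorithm 1 in Python/NumPy, with shapes as (n, 2) point arrays. The rotation-and-scale fit uses the standard orthogonal Procrustes solution, which the thesis does not spell out, so that part is an assumption about the intended alignment step.

```python
import numpy as np

def center(shape):
    """Step 2: translate a (n, 2) point array so its centroid is the origin."""
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    """Step 3: scale so the RMS distance of the points from the origin is 1."""
    return shape / np.sqrt((shape ** 2).sum(axis=1).mean())

def align_to(shape, target):
    """Least-squares similarity alignment (rotation + scale) of one centred
    shape onto another, via the orthogonal Procrustes solution."""
    u, sv, vt = np.linalg.svd(shape.T @ target)
    r = u @ vt                          # optimal rotation
    s = sv.sum() / (shape ** 2).sum()   # optimal scale
    return s * shape @ r

def align_shapes(shapes, iters=10):
    """Steps 1-5: iteratively align all shapes and re-estimate the mean."""
    shapes = [center(s) for s in shapes]
    mean = to_unit_size(shapes[0])            # initial mean shape x0
    for _ in range(iters):                    # repeat until (near) convergence
        shapes = [align_to(s, mean) for s in shapes]
        mean = to_unit_size(np.mean(shapes, axis=0))
    return shapes, mean
```

After alignment, a rotated, scaled, and translated copy of a shape coincides with the original, which is exactly the invariance Section 4.1 asks of a shape.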
4.2 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and the resulting images were re-sized to the same dimensions; this ensured uniformity in the quality of the data being used. Training was done by manually selecting landmarks, which were placed at approximately equal intervals and distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM builds two sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that landmark accordingly. When searching for the shape in the test image, the area near each tentative landmark is examined and the model moves the shape to the area that most closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which imposes a constraint on the shape. So while the profile model tries to find the area of the test image that best fits each profile, the shape model ensures that the overall shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models thus correct each other until no further improvement in the match is possible.
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area of the image that most closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is formulated [24]:

\hat{x} = \bar{x} + \Phi b    (4.3)

where
\hat{x} is the shape vector generated by the model,
\bar{x} is the mean shape, i.e. the average of the aligned training shapes x_i,
\Phi is the matrix whose columns are the eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters (the weights of the eigenvectors).
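Equation 4.3 is the standard principal-component shape model, so building it can be sketched as a PCA of the aligned shapes. This is an illustrative reconstruction: the function names and the 98% retained-variance threshold are assumptions, not taken from the thesis code.

```python
import numpy as np

def build_shape_model(aligned, var_kept=0.98):
    """Build a point-distribution model from aligned shapes.

    aligned: (N, 2n) array, one aligned shape vector per row.
    Returns the mean shape x_bar, the eigenvector matrix Phi (2n, t),
    and the eigenvalues of the t retained variation modes.
    """
    mean = aligned.mean(axis=0)
    cov = np.cov(aligned, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)           # ascending eigenvalues
    evals, evecs = evals[::-1], evecs[:, ::-1]   # largest modes first
    # keep enough modes to explain var_kept of the total variance
    t = int(np.searchsorted(np.cumsum(evals) / evals.sum(), var_kept)) + 1
    return mean, evecs[:, :t], evals[:t]

def generate_shape(mean, phi, b):
    """x_hat = x_bar + Phi b  (Equation 4.3)."""
    return mean + phi @ b
```

Setting b = 0 reproduces the mean shape, and varying the components of b sweeps the model through its learnt modes of variation, which is how Section 4.2.2 generates different shapes.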
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b; the model is varied in height and width while finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at the landmark points are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. The profiles are assumed to be distributed as a multivariate Gaussian, so they can be described by their mean profile \bar{g} and covariance matrix S_g.
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset by 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile \bar{g} is calculated using the Mahalanobis distance, given by

d = (g - \bar{g})^T S_g^{-1} (g - \bar{g})

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the shape is still consistent with the mean shape; in other words, the shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile matches but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other.
A multi-resolution search is done to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (of a general picture, not a bone); the sizes of the images are given relative to the first image.
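The profile search at one landmark can be sketched as follows. Here `sample_profile` is a hypothetical callback that samples the grey-level profile at a given offset along the whisker, and the ±3-pixel range mirrors the offsets described above.

```python
import numpy as np

def mahalanobis(g, g_mean, s_inv):
    """Squared Mahalanobis distance between a sampled profile g and the
    landmark's mean profile, using the inverse covariance S_g^{-1}."""
    d = np.asarray(g, float) - np.asarray(g_mean, float)
    return float(d @ s_inv @ d)

def best_offset(sample_profile, g_mean, s_inv, max_offset=3):
    """Return the whisker offset whose sampled profile is closest (in
    Mahalanobis distance) to the mean profile of this landmark.

    sample_profile(k) is a hypothetical callback returning the profile
    sampled k pixels along the whisker from the tentative landmark."""
    return min(range(-max_offset, max_offset + 1),
               key=lambda k: mahalanobis(sample_profile(k), g_mean, s_inv))
```

In a full search this is repeated for every landmark, after which the shape model pulls the proposed points back to an allowable shape.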
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. The images are landmarked with 60 points, and subsets of these points are chosen to conduct the experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, since it takes time to train and create profile models for each image; however, with more training images the mean profiles and the model perform better, so the error is expected to decrease. The data set in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using the landmark points, but the search starts wherever the mean shape is placed, which may not be near the bone in a test image. So the model needs to be initialized, that is, started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone the result is poor tracking of the bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
model confirms that the shape is the same as the mean shape The shape model
assures that the pro_le model has not changed the shape If the shape model were
not employed the pro_le model may give the best pro_le results but the resulting
shape may be completely di_erent So as mentioned before the two models restrict
each other A multi-resolution search is done to make the model more robust This
enables the model to be more accurate as it can lock on to the shape from further
away So the model searches over a series of di_erent resolutions of the same image
called an image pyramid The resolutions of the images can be set and changed
in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of
the images are given relative to the _rst image A general picture and not a bone
32
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
41 Shape Models
A shape is a collection of points. As shown in Figure 41, a shape can be represented by a
diagram showing the points, or as an n × 2 array where the n rows represent the points and
the two columns hold the x and y co-ordinates respectively. In this thesis, and in the code
used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the
x co-ordinates, as shown in Figure 41c. A shape is the basic building block of any ASM, as
it stays the same even if it is scaled, rotated or translated. The lines connecting the points
are not part of the shape; they are shown only to make the shape and the order of the points
clearer [24]
Figure 41 Example of a shape
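The representation above can be sketched in Python; the function names are illustrative, not taken from the thesis code:

```python
# A shape with n landmark points is stored as a 2n-element vector
# [x1..xn, y1..yn], as described in the text.

def points_to_vector(points):
    """Flatten an n x 2 list of (x, y) points into a 2n vector,
    x co-ordinates first, then y co-ordinates."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs + ys

def vector_to_points(vec):
    """Inverse: rebuild the n x 2 point list from a 2n vector."""
    n = len(vec) // 2
    return list(zip(vec[:n], vec[n:]))

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
v = points_to_vector(triangle)   # [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
```

Either form carries the same information; the 2n × 1 layout is simply more convenient for the matrix algebra used later.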
The distance between two points is the Euclidean distance between them. Equation 41 gives
the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance
between two shapes can be defined as the distance between their corresponding points [24].
There are other ways of defining distances between two points, such as the Procrustes
distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)   (41)
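The point and shape distances just defined can be sketched as follows; treating the shape distance as the sum over corresponding points is one common convention, consistent with the text:

```python
import math

def point_distance(p, q):
    """Euclidean distance between two points (Equation 41)."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def shape_distance(shape_a, shape_b):
    """Distance between two shapes: total distance between
    corresponding points."""
    return sum(point_distance(p, q) for p, q in zip(shape_a, shape_b))

d = point_distance((0.0, 0.0), (3.0, 4.0))   # 5.0
```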
The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The
centroid can be useful when aligning shapes or finding an automatic initialization technique
(discussed in 44). The size of the shape is the root mean square distance between the points
and the centroid. This can be used in measuring the size of the shape in the test image,
which helps with the automatic initialization (discussed in 44).
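A minimal sketch of the centroid and the RMS size measure described above:

```python
import math

def centroid(points):
    """Mean of the point positions."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def shape_size(points):
    """Root-mean-square distance of the points from the centroid."""
    cx, cy = centroid(points)
    return math.sqrt(sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2
                         for p in points) / len(points))

square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
c = centroid(square)     # (1.0, 1.0)
s = shape_size(square)   # sqrt(2): every corner is sqrt(2) from the center
```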
Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1 Choose a reference shape (usually the first shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size; call this shape x̄0, the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x̄0, scale to unit size)
5 until convergence (i.e., the mean shape does not change much)
Output: set of aligned shapes and mean shape
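The steps above can be sketched as follows. This is a simplified version that aligns by translation and scaling only; a full implementation would also solve for the optimal rotation in step (a):

```python
import math

def center(shape):
    """Translate a shape so its centroid is at the origin."""
    n = len(shape)
    cx = sum(x for x, _ in shape) / n
    cy = sum(y for _, y in shape) / n
    return [(x - cx, y - cy) for x, y in shape]

def to_unit(shape):
    """Scale a centered shape to unit size (RMS distance = 1)."""
    s = math.sqrt(sum(x * x + y * y for x, y in shape) / len(shape))
    return [(x / s, y / s) for x, y in shape]

def mean_shape(shapes):
    """Point-wise average of a set of shapes."""
    k, n = len(shapes), len(shapes[0])
    return [(sum(s[i][0] for s in shapes) / k,
             sum(s[i][1] for s in shapes) / k) for i in range(n)]

def align(shapes, iterations=10):
    """Algorithm 1 with scaling only (rotation omitted)."""
    shapes = [center(s) for s in shapes]       # steps 1-2
    mean = to_unit(shapes[0])                  # step 3: reference shape
    for _ in range(iterations):                # step 4
        shapes = [to_unit(s) for s in shapes]  # (a) align to the mean
        mean = to_unit(mean_shape(shapes))     # (b) + (c) constrain
    return shapes, mean

aligned, mean = align([[(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)],
                       [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]])
```

After alignment the two similar triangles coincide and the mean shape has unit size, as the constraint in step (c) requires.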
42 Active Shape Models
The ASM has to be trained using training images. In this project the tibia bone was
separated from a full-body X-ray (as shown in 12), and those images were then re-sized to
the same dimensions. This ensured uniformity in the quality of the data being used. The
training on the images was done by manually selecting landmarks. Landmarks were placed at
approximately equal intervals and were distributed uniformly over the bone boundary. Such
images are called hand-annotated or manually landmarked training images.
Figure 43 shows the original image and the manually landmarked image used for training.
When performing tests with different numbers of landmark points, a subset of these landmark
points is chosen.
After the training images have been landmarked, the ASM produces two types of sub-models
[24]: the profile model and the shape model.
1 The profile model analyzes the landmark points and stores the behaviour of the image
around them. During training, the algorithm learns the characteristics of the area around
each landmark point and builds a profile model for it accordingly. When searching for the
shape in a test image, the area near the tentative landmarks is examined and the model
moves the shape to an area that closely fits the profile model. The tentative locations of
the landmarks are obtained from the suggested shape.
2 The shape model defines the permissible relative positions of the landmarks, which
places a constraint on the shape. As the profile model searches for the area of the test
image that best fits its profiles, the shape model ensures that the result remains a
plausible variation of the mean shape. The profile model acts on individual landmarks,
whereas the shape model acts globally on the shape. The two models correct each other
until no further improvement in the match is possible.
421 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an
allowable shape: it tries to find the area in the image that closely matches the profiles of
the individual landmarks while keeping the overall shape consistent.
The shape is learnt from the manually landmarked training images. These images are aligned
and a mean shape is formulated together with the permissible variations around it [24]:

x̂ = x̄ + Φb   (43)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is a vector of shape parameters.
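The model equation can be sketched as below. Here Φ is a single hand-crafted variation mode (one column) rather than eigenvectors learnt from real training data, so the numbers are purely illustrative:

```python
# x_hat = x_bar + Phi * b for a 2n mean vector and a 2n x m mode matrix.

x_bar = [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]            # mean shape, 2n vector (n = 3)
phi = [[0.0], [1.0], [0.0], [0.0], [0.0], [0.0]]  # one made-up mode: stretch one x

def generate_shape(x_bar, phi, b):
    """x_hat = x_bar + Phi b, with phi as a 2n x m list of lists."""
    return [xi + sum(phi[i][j] * b[j] for j in range(len(b)))
            for i, xi in enumerate(x_bar)]

mean_again = generate_shape(x_bar, phi, [0.0])  # b = 0 reproduces the mean
wide = generate_shape(x_bar, phi, [0.5])        # x2 grows from 1.0 to 1.5
```

Setting b = 0 recovers the mean shape exactly; varying the entries of b moves the shape along the learnt modes of variation, which is how Section 422 generates different shapes.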
422 Generating shapes from the model
As seen in Equation 43, different shapes can be generated by changing the value of b. The
model is varied in height and width to find optimum values for the landmarks.
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image. The lines perpendicular to the model are called whiskers, and they help the profile
model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker
profiles around the landmark points are used for the profile model. A profile and a
covariance matrix are built for each landmark. It is assumed that the profiles follow a
multivariate Gaussian distribution, so they can be described by their mean profile ḡ and
covariance matrix Sg.
423 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from
the training images is imposed on the image, and the profiles around the landmark points are
sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is
perpendicular to the shape, to find the area that most closely resembles the mean profile
[24]. The distance between a test profile g and the mean profile ḡ is calculated using the
Mahalanobis distance, given by

d = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
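A hand-rolled sketch of this distance for a two-sample profile, inverting the 2×2 covariance matrix directly; the profile values and covariance are made up for illustration:

```python
# Mahalanobis distance d = (g - g_bar)^T * inv(S_g) * (g - g_bar)
# for length-2 profiles with a 2x2 covariance matrix S.

def mahalanobis2(g, g_bar, S):
    """Mahalanobis distance between profile g and mean profile g_bar."""
    a, b = S[0]
    c, d = S[1]
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # 2x2 inverse of S
    diff = [g[0] - g_bar[0], g[1] - g_bar[1]]
    tmp = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
           inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return diff[0] * tmp[0] + diff[1] * tmp[1]

d = mahalanobis2([2.0, 0.0], [0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]])  # 1.0
```

Because the covariance divides out, a deviation along a direction where training profiles vary a lot (here the first component, variance 4) is penalized less than the same deviation along a stable direction; this is why the search uses the Mahalanobis rather than the plain Euclidean distance.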
If the model is initialized correctly (discussed in 44), one of the profiles will have the
lowest distance. This procedure is carried out for every landmark point, and the shape model
then confirms that the shape remains close to the mean shape. The shape model ensures that
the profile model has not distorted the shape: if the shape model were not employed, the
profile model might give the best profile matches, but the resulting shape could be
completely different. So, as mentioned before, the two models restrict each other. A
multi-resolution search is performed to make the model more robust. This makes the model
more accurate, as it can lock on to the shape from further away. The model searches over a
series of different resolutions of the same image, called an image pyramid. The resolutions
of the images can be set and changed in the algorithm [17, 24]. Figure 45 shows a sample
image pyramid; the sizes of the images are given relative to the first image. A general
picture, rather than a bone image, is used for illustration.
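An image pyramid of the kind used in the multi-resolution search can be sketched by repeated 2× downsampling; the image here is a tiny synthetic array, not X-ray data:

```python
def downsample(image):
    """Halve an image (list of rows of floats) by averaging 2x2 blocks."""
    h, w = len(image) // 2 * 2, len(image[0]) // 2 * 2
    return [[(image[r][c] + image[r][c + 1] +
              image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

def image_pyramid(image, levels):
    """Return [full, 1/2, 1/4, ...] resolutions, coarsest last."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

img = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pyr = image_pyramid(img, 3)   # sizes 4x4, 2x2, 1x1
```

The search starts on the coarsest level, where the shape can be located from further away, and the result is refined level by level on the finer images.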
43 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The
number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile
model of the ASM works with these landmark points to create profiles, so the position of the
landmark points is as important as their number. In the training images, landmark points are
equally spaced along the boundary of the bone. Images are landmarked with 60 points, and
subsets of these points are chosen to conduct the experiments. The impact of the number of
landmark points on computing time and on the mean error (defined in Section 45) is tested by
running the algorithm with different numbers of landmarks. As the number of landmark points
increases, the computing time is expected to increase and the error to decrease. The results
are explained in Chapter 5.
A training set of images is used to train the ASM. As the number of training images
increases, the model becomes more robust. The computing time is expected to increase, since
it takes time to train and create profile models for each image. However, as the number of
training images increases, the mean profile improves and the model performs better, so the
error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train
the ASM and 1 is used as the test image. Figure 46 gives an overview of the ASM: Figure 46a
shows the unaligned shapes learnt from the training images, and Figure 46b displays the
aligned shapes.
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images in the test
image. It creates a mean shape and profiles from all the training images using the landmark
points, but the ASM starts off wherever the mean shape is located, which may not be near the
bone in a test image. The model therefore needs to be initialized, or started, somewhere
close to the bone boundary in the test image. Experiments were conducted to see the effect
of initialization on the error and on the tracking of the shape. It was observed that if the
initialization is poor, meaning that the mean shape starts away from the bone in the test
X-ray, the model does not lock on to the bone. The shape and profile models fail to perform,
as the profile model looks for regions similar to those of the training images in areas away
from the bone; it is unable to find the bone because it is looking in a different region
altogether. The error increases considerably if the mean shape starts 40-50 pixels away from
the bone in the test image. Figure 47a shows such an initialization: the pink contour is the
mean shape, and because it starts away from the bone, the result is poor tracking of the
bone.
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. Susan – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
The distance between two points is the Euclidean distance between them Equa-
tion 41 gives the formula for Euclidean distance between two points (x1 y1) and
x2 y2 The distance between two shapes can be de_ned as the distance between
their corresponding points [24] There are other ways of de_ning distances between
two points like the Procrustes distance but in this thesis the distance means the
Euclidean distance
radic (y2 - y1)2 + (x2 - x1)2
The centroid x of a shape x can be de_ned as the mean of the point positions
[24] The centroid can be useful while aligning shapes or _nding an automatic
initialization technique (discussed in 44) The size of the shape is the root mean
distance between the points and the centroid This can be used in measuring the
size of the test image which will help with the automatic initialization (discussed in
44)
Algorithm 1 Aligning shapes
Input set of unaligned shapes
1 Choose a reference shape (usually the 1st shape)
2 Translate each shape so that it is centered on the origin
3 Scale the reference shape to unit size Call this shape x0 the initial mean
shape
4 repeat
(a) Align all shapes to the mean shape
(b) Recalculate the mean shape from the aligned shapes
(c) Constrain the current mean shape (align to x0 scale to unit size)
5 until convergence (ie mean shape does not change much)
output set of aligned shapes and mean shape
42 Active Shape Models
The ASM has to be trained using training images In this project the tibia bone
was separated from a full-body X-ray (as shown in 12) and then those images were
re-sized to the same dimensions This ensured uniformity in the quality of data
being used The training on the images was done by manually selecting landmarks
Landmarks were placed at approximately equal intervals and were distributed uni-
formly over the bone boundary Such images are called hand annotated or manually
landmarked training images
Figure 43 shows the original image and the manually landmarked image for training
While performing tests using different number of landmark points a subset of these
landmarks points is chosen
After the training images have been landmarked the ASM produces two types of
sub-models [24] These are the profile model and the shape model
1 The profile model analyzes the landmark points and stores the behaviour of the
image around the landmark points So during training the algorithm learns
the characteristics of the area around the landmark points and builds a profile
model for each landmark point accordingly When searching for the shape in
the test image the area near the tentative landmarks is examined and the model moves the
shape to an area that fits closely to the profile model The
tentative location of the landmarks is obtained from the suggested shape
2 The shape model defines the permissible relative positions of landmarks This
introduces a constraint on the shape So as the profile model tries to find the
area in the test image that tries to fit the model the shape model ensures that
the mean shape is not changed The profile model acts on individual landmarks
whereas the shape acts globally on the image So both the models try to correct
each other until no further improvements in matching are possible
421 The ASM Model
The aim of the model is to try to convert the shape proposed by the individual
profiles into an allowable shape So it tries to find the area in the image that closely
matches the profiles of the individual landmarks while keeping the overall shape
constant
The shape is learnt from manually landmarked training images These images are
aligned and a mean shape is formulated with the permissible variations in it [24]
^x = x + ₵b where
^x is the generated shape vector by the model
x is the mean shape the average of the aligned training shapes
xi
422 Generating shapes from the model
As seen in Equation 43 different shapes can be generated by changing the value of
b The model is varied in height and width finding optimum values for landmarks
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image The points that are perpendicular to the model are called _whiskers and they help the
profile model in analyzing the area around the landmark points
The shape created by the landmark points are used for the shape model and the
whisker profiles around the landmark points are used for the profile model A profile
and a covariance matrix is built for each landmark It is assumed that the profiles
are distributed as a multivariate Gaussian and so they can be described by their
mean pro_le g and the covariance matrix Sg
423 Searching the test image
After the training is over the shape is searched in the test image The mean shape
calculated from the training images is imposed on the image and the profiles around
the landmark points are search and examined The profiles are offset 3 pixels
along the whisker which is perpendicular to the shape to get the accurate area
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
model confirms that the shape is the same as the mean shape The shape model
assures that the pro_le model has not changed the shape If the shape model were
not employed the pro_le model may give the best pro_le results but the resulting
shape may be completely di_erent So as mentioned before the two models restrict
each other A multi-resolution search is done to make the model more robust This
enables the model to be more accurate as it can lock on to the shape from further
away So the model searches over a series of di_erent resolutions of the same image
called an image pyramid The resolutions of the images can be set and changed
in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of
the images are given relative to the _rst image A general picture and not a bone
32
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.
Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.
After the training images have been landmarked, the ASM builds two types of sub-models [24]: the profile model and the shape model.
1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near each tentative landmark is examined, and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.
2. The shape model defines the permissible relative positions of the landmarks, which places a constraint on the shape. As the profile model searches for the area in the test image that best fits its profiles, the shape model ensures that the result stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models therefore correct each other until no further improvement in the match is possible.
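The alternation between the two sub-models described above can be sketched in Python (an illustrative toy, not the thesis implementation: `profile_suggest`, `shape_constrain`, and all of the data are hypothetical stand-ins):

```python
import numpy as np

def profile_suggest(points, best_offsets):
    # Profile model: move each tentative landmark to the position along
    # its whisker that best matches the trained grey-level profile.
    return points + best_offsets

def shape_constrain(points, mean_shape, P, b_limits):
    # Shape model: project the suggested shape onto the learnt shape
    # space and clamp the parameters b so the result stays allowable.
    b = P.T @ (points.ravel() - mean_shape)
    b = np.clip(b, -b_limits, b_limits)
    return (mean_shape + P @ b).reshape(points.shape)

# Toy data: 4 landmarks in 2-D and a single (made-up) shape mode.
mean_shape = np.array([0., 0., 1., 0., 1., 1., 0., 1.])
P = np.eye(8)[:, :1]
b_limits = np.array([0.5])

points = mean_shape.reshape(4, 2)
suggested = profile_suggest(points, best_offsets=np.full((4, 2), 0.2))
corrected = shape_constrain(suggested, mean_shape, P, b_limits)
```

In a full ASM search these two steps would repeat until the landmarks stop moving.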
4.2.1 The ASM Model
The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.
The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is computed [24]:

x̂ = x̄ + Φb                    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
Φ is the matrix of eigenvectors of the covariance of the aligned training shapes, and
b is the vector of shape parameters.
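Equation 4.3 can be sketched numerically (a minimal NumPy illustration with made-up training shapes; the thesis code is not shown, and a single shape mode is retained here):

```python
import numpy as np

# Three hypothetical aligned training shapes, one per row,
# each a flattened vector of (x, y) landmark coordinates.
shapes = np.array([[0., 0., 2.0, 0., 2.0, 1.0],
                   [0., 0., 2.2, 0., 2.2, 1.1],
                   [0., 0., 1.8, 0., 1.8, 0.9]])

x_bar = shapes.mean(axis=0)                      # mean shape
# Eigenvectors of the shape covariance via SVD of the centred data.
U, s, Vt = np.linalg.svd(shapes - x_bar, full_matrices=False)
Phi = Vt.T[:, :1]                                # leading mode only

b = np.array([0.1])                              # shape parameter
x_hat = x_bar + Phi @ b                          # Equation 4.3
```

Varying b within its learnt limits generates the family of allowable shapes referred to in Section 4.2.2.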
4.2.2 Generating shapes from the model
As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.
Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model at each landmark are called whiskers, and they help the profile model analyze the area around the landmark points.
The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. The profiles are assumed to follow a multivariate Gaussian distribution, so they can be described by their mean profile ḡ and covariance matrix S_g.
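Building ḡ and S_g for one landmark amounts to averaging the sampled profiles and taking their covariance. A sketch with invented grey-level values (one profile per training image):

```python
import numpy as np

# Hypothetical whisker profiles for a single landmark, sampled from
# three training images (5 grey-level samples along each whisker).
profiles = np.array([[10., 12., 30., 12., 10.],
                     [11., 13., 29., 13., 11.],
                     [ 9., 11., 31., 11.,  9.]])

g_bar = profiles.mean(axis=0)          # mean profile ḡ
S_g = np.cov(profiles, rowvar=False)   # covariance matrix S_g
```

One such (ḡ, S_g) pair is stored per landmark; together they define the multivariate Gaussian assumed by the profile model.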
4.2.3 Searching the test image
After training, the shape is searched for in the test image. The mean shape calculated from the training images is placed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)
If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is repeated for every landmark point, after which the shape model confirms that the resulting shape is still consistent with the mean shape. The shape model ensures that the profile model has not distorted the shape: without it, the profile model might return the best individual profile matches, yet the resulting shape could be completely different. As mentioned before, the two models restrict each other.
A multi-resolution search is performed to make the model more robust. It allows the model to be more accurate, since it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture rather than a bone image); the sizes of the images are given relative to the first image.
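The image pyramid used by the multi-resolution search can be sketched as repeated 2x downsampling (an illustrative version: simple 2x2 block averaging stands in for proper smoothing-and-subsampling, and the image is a blank placeholder):

```python
import numpy as np

def build_pyramid(image, levels):
    # Each level halves the resolution of the previous one.
    pyramid = [image]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2   # even crop
        img = img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(img)
    return pyramid

pyramid = build_pyramid(np.zeros((64, 64)), levels=3)
sizes = [p.shape for p in pyramid]   # coarse-to-fine search starts at the smallest
```

The search runs on the coarsest level first and the result seeds the next finer level, which is what lets the model lock on from further away.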
4.3 Parameters and Variations
The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.
The number of landmark points is an important variable that affects the ASM. The profile model works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct the experiments. The impact of the number of landmark points on computing time and mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are presented in Chapter 5.
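Choosing an equally spaced subset of the 60 annotated landmarks, as done for these experiments, can be sketched as (illustrative only; the landmark coordinates are stand-ins):

```python
import numpy as np

def landmark_subset(landmarks, n):
    # Pick n indices spread evenly over the full set of landmarks,
    # always keeping the first and last points.
    idx = np.round(np.linspace(0, len(landmarks) - 1, n)).astype(int)
    return landmarks[idx]

# Stand-in for 60 annotated boundary points (here 1-D "coordinates").
landmarks = np.arange(60).reshape(60, 1).astype(float)
subset = landmark_subset(landmarks, 30)   # e.g. a 30-point experiment
```

Rerunning the ASM with subsets of different sizes gives the computing-time and error curves discussed above.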
A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, since it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6 gives an overview of the ASM: Figure 4.6(a) shows the unaligned shapes learnt from the training images, and Figure 4.6(b) displays the aligned shapes.
4.4 Initialization Problem
The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, i.e. started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone since it is searching in a different region altogether. The error increases considerably if the mean shape is 40–50 pixels away from the bone in the test image. Figure 4.7(a) shows the initialization: the pink contour is the mean shape, and since it starts away from the bone, the result is poor tracking of the bone.
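The initialization experiment can be sketched as shifting the mean shape away from the true contour and measuring the resulting mean landmark error (a toy illustration: the contour, the offsets, and the error measure are stand-ins, and the ASM search itself is omitted, so the error here simply equals the offset):

```python
import numpy as np

# Hypothetical "true" bone contour (4 landmarks in 2-D).
true_contour = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])

def mean_error(mean_shape, offset):
    # Place the mean shape at the given offset from the bone and
    # report the mean point-to-point distance to the true contour.
    init = mean_shape + offset
    return float(np.linalg.norm(init - true_contour, axis=1).mean())

# Error for a perfect start and for 10- and 50-pixel offsets.
errors = [mean_error(true_contour, np.array([dx, 0.]))
          for dx in (0., 10., 50.)]
```

With the real model, the error stays small for modest offsets and grows sharply beyond roughly 40–50 pixels, since the profile search never reaches the bone.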
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.
x is the mean shape the average of the aligned training shapes
xi
422 Generating shapes from the model
As seen in Equation 43 different shapes can be generated by changing the value of
b The model is varied in height and width finding optimum values for landmarks
Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray
image The points that are perpendicular to the model are called _whiskers and they help the
profile model in analyzing the area around the landmark points
The shape created by the landmark points are used for the shape model and the
whisker profiles around the landmark points are used for the profile model A profile
and a covariance matrix is built for each landmark It is assumed that the profiles
are distributed as a multivariate Gaussian and so they can be described by their
mean pro_le g and the covariance matrix Sg
423 Searching the test image
After the training is over the shape is searched in the test image The mean shape
calculated from the training images is imposed on the image and the profiles around
the landmark points are search and examined The profiles are offset 3 pixels
along the whisker which is perpendicular to the shape to get the accurate area
that closely resembles the mean shape [24] The distance between the test profile g
and the mean profile g is calculated using the Mahalanobis distance given by
If the model is initialized correctly (discussed in 44) one of the profiles will have the
lowest distance This procedure is done for every landmark point and then the shape
model confirms that the shape is the same as the mean shape The shape model
assures that the pro_le model has not changed the shape If the shape model were
not employed the pro_le model may give the best pro_le results but the resulting
shape may be completely di_erent So as mentioned before the two models restrict
each other A multi-resolution search is done to make the model more robust This
enables the model to be more accurate as it can lock on to the shape from further
away So the model searches over a series of di_erent resolutions of the same image
called an image pyramid The resolutions of the images can be set and changed
in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of
the images are given relative to the _rst image A general picture and not a bone
32
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
43 Parameters and Variations
The performance of the ASM can be enhanced using optimizing the parameters
that it depends on Number of landmark points and number of training images are
investigated in this thesis
The number of landmark points is an important variable that a_ects the ASM The
pro_le model of the ASM works with these landmark points to create pro_les So
the position of landmark points is as important as the number of landmark points
In the training images landmark points are equally spaced along the boundary of
the bone Images are landmarked with 60 points and subsets of these points are
chosen to conduct experiments The impact of the number of landmark points on computing
time and the mean error (defined in Section 45) is tested by running the algorithm with a
different number of landmarks As the number of landmark points is increased it is expected
that the computing time increases and the error decreases The results are explained in
chapter5 A training set of images is used to train the ASM As the number of training images
increases the model becomes more robust and intelligent The computing time is expected to
increase as it will take time to train and create profile models for each image However as the
number of training images increases the mean profile and the model performs better so the
error is expected to decrease The model in this thesis has 12 images 11 are used to train the
ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the
unaligned shape learnt from the training images displays the aligned shapes
44 Initialization Problem
The Active Shape Model locks on to the shape learnt from the training images into the test
image It creates a mean shape pro_le from all the training images using landmark points But
the ASM starts of where the mean shape is located but it may not be near the bone on a test
image So the model needs to be initialized or started somewhere close to the bone boundary
in the test image Experiments were conducted to see the effect of initialization on the error
and the tracking of the shape It was observed that if the initialization is poor which means
that the mean shape starts away from the bone in test X-ray the model does not lock on to the
bone The shape and profile models fail to perform as the profile model looks for regions
similar to those of the training images in the regions away from the bone So it is unable to
find the bone as it is looking in a different region altogether The error increases considerably
if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
the initialization The pink contour is the mean shape and it starts away from the bone so the
result is a poor tracking of the bone
Chapter 5
OUTPUT SCREENS
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
REFERENCES
[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2
[2] A Baumberg Reliable feature matching across widely separated views CVPR pages
774ndash781 2000 2
[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2
[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2
[5] R Deriche and G Giraudon A computational approach for corner and vertex detection
IJCV 10(2)101ndash124 1992 2
[6] T G Dietterich Approximate statistical tests for comparing supervised classification
learning algorithms Neural Computation 10(7)1895ndash1924 1998 6
[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf
pages 147ndash151 1988 2 3
[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object
categories CVPR 290ndash96 2004 1 2
[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001
2
[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-
2)161ndash205 2005 6 7
[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116
1998 2
[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues
from affine deformations of local 2-d brightness structure Image and Vision Computing
pages 415ndash434 1997 2
[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash
110 2004 2 3
[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features
ECCV pages 508ndash521 2006 7
[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from
maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2
[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-
splines CVGIP 39267ndash278 1987 2
[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV
1(1)128ndash142 2002 2
[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV
60(1)63ndash86 2004 2 3
[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T
Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5
[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale
space PAMI 20(12)1376ndash 1381 1998 2
[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on
Artificial Intelligence page 584 1977 2
[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for
generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7
[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence
CVPR pages 976ndash981 1997
[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV
23(1)45ndash78 1997
[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1
3
[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely
invariant regions BMVC pages 412ndash 425 2000 2
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412–425, 2000.