COMBINING SHAPE, TEXTURE AND INTENSITY FEATURES FOR CELL NUCLEI EXTRACTION IN PAP SMEAR IMAGES

Marina E. Plissiti 1, Christophoros Nikou 1 and Antonia Charchanti 2
1 Department of Computer Science, University of Ioannina, Ioannina, Greece.
2 Department of Anatomy-Histology and Embryology, Medical School, University of Ioannina, Ioannina, Greece.

Abstract — In this work, we present an automated method for the detection and boundary determination of cell nuclei in conventional Pap stained cervical smear images. The detection of the candidate nuclei areas is based on a morphological image reconstruction process, and the segmentation of the nuclei boundaries is accomplished by applying the watershed transform to the morphological color gradient image, using the nuclei markers extracted in the detection step. For the elimination of false positive findings, salient features characterizing the shape, the texture and the image intensity are extracted from the candidate nuclei regions, and a classification step is performed to determine the true nuclei. We have examined the performance of two unsupervised (K-means, spectral clustering) and one supervised (Support Vector Machines, SVM) classification technique, employing discriminative features selected with a feature selection scheme based on the minimal-Redundancy Maximal-Relevance (mRMR) criterion. The proposed method was evaluated on a data set of 90 Pap smear images containing 10248 recognized cell nuclei. Comparisons with the segmentation results of a gradient vector flow (GVF) deformable model and a region-based active contour model (ACM) indicate that the proposed method produces more accurate nuclei boundaries that are closer to the ground truth.

Keywords: Cell nuclei segmentation, Pap smear images, morphological reconstruction, watersheds, feature selection, clustering.

I. INTRODUCTION

For over 30 years, the most effective and widespread screening test for cervical cancer has been the Papanicolaou (Pap) test [1]. This technique provides a staining procedure for cervical cells, which enables the identification of abnormalities in the cervix. The cervical cells are sampled and smeared onto a glass slide, and the characterization of the slide (as normal or abnormal) is accomplished through careful microscopical examination of the slide by an expert cytopathologist. Nowadays in developed
REFERENCES

[1] G. N. Papanicolaou, A new procedure for staining vaginal smears, Science, 95 (2469), 1942, 438-439.
[2] P. Bamford, B. Lovell,
[3] H. S. Wu, J. Barba, J. Gil, A parametric fitting algorithm for segmentation of cell images, IEEE Trans. Biomed. Eng., 45 (3), 1998, 400-407.
[4] C. H. Lin, Y. K. Chan, C. C. Chen, Detec
[5] M. H. Tsai, Y. K. Chan, Z. Z. Lin, S. F. Yang-Mao, P. C. Huang, Nucleus and cytoplast contour detector of cervical smear image, Pattern Recognition Letters, 2008, 1441-1453.
[6] S. F. Yang-Mao, Y. K. Chan, Y. P. Chu, Edge enhancement nucleus and cytoplast contour detector of cervical smear images, IEEE Trans. Syst. Man Cybern. Part B Cybern., 38 (2), 2008, 353-366.
[7] A. Garrido, N. Perez de la Blanca, Applying deformable templates
[8] N. Lassouaoui, L. Hamami, Genetic algorithms and multifractal segmentation of cervical cell images, Proceedings of 7th International Symposium on Signal Processing and its Applications, 2, 2003, 1-4.
[9] N. A. Mat Isa, Automated edge detection technique for Pap smear images using moving K-means clustering algorithm, Int. J. Comput. Internet Manage., 13 (3), 2005, 45-59.
[10] E. B, Conference of the IEEE Engineering in Medicine and Biology, 1, 2004, 1802-1805.
[11] P. T. Jackway, Gradient watersheds in morphological scale-space, IEEE Trans. Image Process., 5 (6), 1996, 913-921.
[12] P. Bamford, B. Lovell, A water immersion algorithm for cytological image segmentation, Proceedings of APRS Image Segmentation Workshop, 1996, 75-79.
[13] O. Lezoray, H. Cardot, Cooperation of color pixel classification schemes and color watershed: A study for microscopic images, IEEE Trans. Image Process., 11 (7), 2002, 783-789.
[14] M. E. Plissiti, E. Tripoliti, A. Charchanti, O. Krikoni, D. Fotiadis, Automated detection of cell nuclei in Pap stained smear images using fuzzy clustering, Proceedings of 4th European Congress for Medical and Biomedical Engineering, 2008, 637-641.
[15] Y. Marinakis, M. Marinaki, G. Dounias, Particle swarm optimization for pap-smear diagnosis, Expert Systems with Applications, 35, 2008, 1645-1656.
[16] L. Nanni, A. Lumini, S. Brahnam, Local binary patterns variants as texture descriptors for medical image analysis, Artificial Intelligence in Medicine, 49 (2), 2010, 117-125.
[17] H. Peng, F. Long, C. Ding, Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell., 27 (8), 2005, 1226-1238.
[18] C. Xu and J. Prince, Snakes, shapes and gradient vector flow, IEEE Trans. Image Process., 7 (3), 1998, 359-369.
[19] K. H. Zhang, L. Zhang, H. H. Song, W. Zhou, Active contours with selective local or global segmentation: a new formulation and level set method, Image and Vision Computing, 28 (4), 2010, 668-676.
[20] K. Zu
[21] N. Otsu, A threshold selection method from gray-level histograms, IEEE Trans. Syst. Man Cybern., 9 (1), 1979, 62-66.
[22] P. Soille, Morphological Image Analysis: Principles and Applications, Springer-Verlag, New York, 1999.
[23] L. Vincent, Morphological grayscale reconstruction in image analysis: Applications and efficient algorithms, IEEE Trans. Image Process., 2 (2), 1993, 176-201.
[24] E. J. Breen, R. Jones, Attribute openings, thinnings, and granulometries, Comput. Vision Image Understanding, 64 (3), 1996, 377-389.
[25] A. N. Evans, Morphological gradient operators for colour images, Proceedings of the IEEE International Conference on Image Processing (ICIP04), 5, 2004, 3089-3092.
[26] R. C. Gonzalez, R. E. Woods, Digital Image Processing, second ed., Prentice Hall, 2002.
[27] T. Ojala, M. Pietikainen, T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24 (7), 2002, 971-987.
[28] C. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[29] A. Y. Ng, M. I. Jordan, Y. Weiss, On spectral clustering: Analysis and an algorithm, in Advances in Neural Information Processing Systems, 14, 2002, 849-856.
[30] N. Cristianini, J. Shawe-Taylor, Support Vector Machines and Other Kernel-based Methods, Cambridge University Press, 2000.
[31] M. E. Plissiti, C. Nikou, A. Charchanti, Accurate localization of cell nuclei in Pap smear images using gradient vector flow deformable models, Proceedings of 3rd International Conference on Bio-inspired Signals and Systems (BIOSIGNALS), 2010, 284-289.
Table I
Shape Features

Minor Axis Length (1):  K = sqrt( 2 (u_{20} + u_{02} - Δ) / u_{00} )
Major Axis Length (1):  L = sqrt( 2 (u_{20} + u_{02} + Δ) / u_{00} )
Eccentricity:  E = sqrt( (L/2)^2 - (K/2)^2 ) / (L/2)
Equivalent Diameter:  ED = sqrt( 4 × Area / π )
Perimeter:  P = number of boundary points
Circularity:  C = 4π × Area / P^2

(1) The quantity Δ and the central moments u_{pq} of order p+q of the region s(x, y) are defined as:
Δ = sqrt( 4 u_{11}^2 + (u_{20} - u_{02})^2 ),
u_{pq} = Σ_x Σ_y (x - x̄)^p (y - ȳ)^q, where x̄ and ȳ are the coordinates of the centroid of the region, and u_{00} equals the area of the region.
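Under these definitions, the shape features can be computed directly from the pixel coordinates of a binary region. The sketch below follows the formulas of Table I; the normalisation of the moments by the area u_00 in the axis-length formulas is an assumption (chosen so that K and L come out in pixel units), since the table's typography is ambiguous on this point:

```python
import numpy as np

def shape_features(mask):
    """Shape features of a binary region, following Table I."""
    ys, xs = np.nonzero(mask)
    xbar, ybar = xs.mean(), ys.mean()
    # central moments u_pq = sum (x - xbar)^p (y - ybar)^q; u00 is the area
    u = lambda p, q: np.sum((xs - xbar) ** p * (ys - ybar) ** q)
    u20, u02, u11, u00 = u(2, 0), u(0, 2), u(1, 1), float(xs.size)
    delta = np.sqrt(4 * u11 ** 2 + (u20 - u02) ** 2)
    K = np.sqrt(2 * (u20 + u02 - delta) / u00)           # minor axis length
    L = np.sqrt(2 * (u20 + u02 + delta) / u00)           # major axis length
    E = np.sqrt((L / 2) ** 2 - (K / 2) ** 2) / (L / 2)   # eccentricity
    ED = np.sqrt(4 * u00 / np.pi)                        # equivalent diameter
    return K, L, E, ED

# sanity check on a discrete disk of radius 10: E should vanish and
# ED should be close to the diameter (20 pixels)
yy, xx = np.mgrid[:41, :41]
K, L, E, ED = shape_features((yy - 20) ** 2 + (xx - 20) ** 2 < 100)
```

For a perfectly symmetric region, u_11 = 0 and u_20 = u_02, so Δ = 0, the two axes coincide and the eccentricity is zero, as expected for a circle.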
Table II
Texture Features

Third Moment (2):  μ_3 = Σ_{i=0}^{L-1} (z_i - m)^3 p(z_i)
Uniformity:  U = Σ_{i=0}^{L-1} p(z_i)^2
Entropy:  e = -Σ_{i=0}^{L-1} p(z_i) log_2 p(z_i)
Smoothness:  R = 1 - 1 / (1 + σ^2), where σ^2 = Σ_{i=0}^{L-1} (z_i - m)^2 p(z_i)
Mean Histogram LBP^{riu2}_{circle}:  See Appendix A
Std Histogram LBP^{riu2}_{circle}:  See Appendix A
Mean Histogram LBP^{riu2}_{hyperbola}:  See Appendix A
Std Histogram LBP^{riu2}_{hyperbola}:  See Appendix A

(2) Given that z_i is the intensity value and p(z_i) is the histogram of the intensity levels in a region with L possible intensity levels, the average intensity of the region is calculated as m = Σ_{i=0}^{L-1} z_i p(z_i).
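The first-order statistics above come straight from the normalised intensity histogram of a region. A minimal sketch (256 intensity levels assumed; the flat test region is illustrative):

```python
import numpy as np

def texture_features(region, levels=256):
    """First-order texture statistics of an intensity region (Table II)."""
    hist, _ = np.histogram(region, bins=levels, range=(0, levels))
    p = hist / hist.sum()                     # normalised histogram p(z_i)
    z = np.arange(levels)
    m = np.sum(z * p)                         # mean intensity
    mu3 = np.sum((z - m) ** 3 * p)            # third moment
    U = np.sum(p ** 2)                        # uniformity
    pz = p[p > 0]
    e = -np.sum(pz * np.log2(pz))             # entropy
    var = np.sum((z - m) ** 2 * p)
    R = 1.0 - 1.0 / (1.0 + var)               # smoothness
    return m, mu3, U, e, R

# a perfectly flat region has maximal uniformity and zero entropy,
# smoothness and third moment
m, mu3, U, e, R = texture_features(np.full((8, 8), 5))
```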
Table III
Intensity Disparity Features

Foreground-Background contrast in red (3):  dR = m^{Ngh}_{RED} - m^{A}_{RED}
Foreground-Background contrast in green (3):  dG = m^{Ngh}_{GREEN} - m^{A}_{GREEN}
Foreground-Background contrast in blue (3):  dB = m^{Ngh}_{BLUE} - m^{A}_{BLUE}

(3) m^{region}_{color} is the average intensity value of an image region in a specific color component. The RGB color space is used in our experiments, and the regions of the image that are considered are the enclosed boundary area A and its neighborhood Ngh = A^c ∩ B, where B is the bounding box of the area A.
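These contrast features only require the mean intensity over the nucleus area A and over its neighborhood Ngh = A^c ∩ B. A sketch for a single colour channel (the toy region and intensity values are illustrative):

```python
import numpy as np

def intensity_disparity(channel, A):
    """Contrast d = m(Ngh) - m(A) for one colour channel, where
    Ngh = A^c ∩ B and B is the bounding box of the region A."""
    ys, xs = np.nonzero(A)
    B = np.zeros_like(A, dtype=bool)
    B[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = True
    Ngh = B & ~A                         # bounding box minus the nucleus area
    return channel[Ngh].mean() - channel[A].mean()

# toy example: dark "nucleus" (50) on a bright background (200)
channel = np.full((10, 10), 200.0)
A = np.zeros((10, 10), dtype=bool)
A[3:7, 3:7] = True
A[3, 3] = False                          # irregular shape so Ngh is non-empty
channel[A] = 50.0
dR = intensity_disparity(channel, A)
```

For a nucleus darker than its surroundings the contrast is positive, which is the typical situation in Pap stained images.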
Table IV
mRMR rank of the 16 most discriminative features for the watershed, the GVF and the ACM segmentation (4)

Rank | Watersheds | GVF | ACM
1 | Entropy of B in green | Foreground-Background contrast in green | Foreground-Background contrast in red
2 | Perimeter | Minor Axis Length | Minor Axis Length
3 | Foreground-Background contrast in red | Third moment of A in blue | Uniformity of Ngh in green
4 | Std Histogram LBP^{riu2}_{hyperbola} in green | Std Histogram LBP^{riu2}_{hyperbola} in red | Std Histogram LBP^{riu2}_{hyperbola} in red
5 | Circularity | Entropy of Ngh in red | Smoothness of Ngh in green
6 | Foreground-Background contrast in green | Mean Histogram LBP^{riu2}_{circle} in green | Eccentricity
7 | Mean Histogram LBP^{riu2}_{circle} in blue | Foreground-Background contrast in blue | Foreground-Background contrast in green
8 | Entropy of B in red | Eccentricity | Mean Histogram LBP^{riu2}_{circle} in blue
9 | Mean Histogram LBP^{riu2}_{hyperbola} in blue | Mean Histogram LBP^{riu2}_{hyperbola} in blue | Mean Histogram LBP^{riu2}_{hyperbola} in blue
10 | Smoothness of B in green | Uniformity of B in green | Third moment of A in red
11 | Std Histogram LBP^{riu2}_{circle} in red | Foreground-Background contrast in red | Circularity
12 | Entropy of A in green | Std Histogram LBP^{riu2}_{circle} in red | Foreground-Background contrast in blue
13 | Foreground-Background contrast in blue | Std Histogram LBP^{riu2}_{hyperbola} in green | Std Histogram LBP^{riu2}_{hyperbola} in green
14 | Std Histogram LBP^{riu2}_{hyperbola} in red | Circularity | Entropy of Ngh in red
15 | Smoothness of A in red | Entropy of B in green | Mean Histogram LBP^{riu2}_{circle} in red
16 | Third moment of Ngh in blue | Third moment of Ngh in red | Third moment of A in blue

(4) A is the enclosed detected boundary area, B is the bounding box of A calculated as the maximum rectangle that contains the detected region, and the neighborhood Ngh is defined as Ngh = A^c ∩ B (see Fig. 5 and text for a detailed description). The features common to the three segmentation techniques appear in the first 14 positions.
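The mRMR ranking of Table IV greedily selects, at each step, the feature with the largest relevance to the class label minus the mean redundancy with the features already selected [17]. A toy sketch of this greedy scheme on discretised features (the difference form of the criterion is assumed; this is not the implementation used in the paper):

```python
import numpy as np

def mutual_info(x, y):
    """Mutual information (natural log) between two discrete vectors."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

def mrmr(X, y, k):
    """Greedy mRMR: pick the feature maximising relevance to the class
    minus mean redundancy with the features already selected."""
    relevance = [mutual_info(X[:, j], y) for j in range(X.shape[1])]
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k and remaining:
        def score(j):
            red = (np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
                   if selected else 0.0)
            return relevance[j] - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# toy data: feature 0 is a copy of the class label, feature 1 is constant,
# so the ranking must place feature 0 first
y = np.array([0, 0, 1, 1, 0, 1])
X = np.column_stack([y, np.zeros(6, dtype=int)])
```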
Fig. 1: (a) Initial image of overlapped cells, (b) the detected nuclei markers, (c) the corresponding color morphological gradient
image, (d) the watershed segmentation.
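The marker extraction and watershed steps illustrated in Fig. 1 can be sketched with scikit-image: candidate nuclei are detected as deep regional minima through morphological reconstruction by erosion (an h-minima transform), and the resulting markers seed the watershed transform of a gradient image. The synthetic image, the depth h and the plain Sobel gradient (standing in for the morphological color gradient of the paper) are illustrative assumptions:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import sobel
from skimage.morphology import reconstruction
from skimage.segmentation import watershed

# --- toy image: two dark "nuclei" on a bright background ---------------
img = np.full((80, 120), 0.8)
yy, xx = np.mgrid[:80, :120]
img[(yy - 40) ** 2 + (xx - 40) ** 2 < 150] = 0.2
img[(yy - 40) ** 2 + (xx - 85) ** 2 < 150] = 0.2

# --- marker detection: reconstruction by erosion of (img + h) over img
# fills in minima shallower than h; pixels whose value was raised belong
# to minima deeper than h and become the nucleus markers
h = 0.3
rec = reconstruction(img + h, img, method='erosion')
markers, n_markers = ndi.label(rec - img > h / 2)

# --- marker-driven watershed on a gradient image -----------------------
grad = sobel(img)
labels = watershed(grad, markers)
```

With only nucleus markers every pixel is flooded from one of the nuclei; in the paper the watershed lines on the color gradient delineate the nucleus boundaries.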
Fig. 2: (a) Initial image of overlapped cells and (b) the corresponding grayscale image, in which we apply the Canny edge
detector. Using a small threshold results in (c) an image with many undesired edges, while using a high threshold results in (d) an
image with several significant edges missing.
Fig. 3: The detected centroids of the regional minima in the image. The true nuclei locations are represented by a yellow cross
and the false positive findings are represented by a black circle.
Fig. 4: (a)-(b) The result of the watershed transform in parts of two different cell images. The regions R1 and R2 that are
detected in both images with the watershed transform are joined with a line for better visualization purposes. In (a) the detected
areas R1 and R2 correspond to the areas of true nuclei, while in (b) the detected area R1 corresponds to a nucleus and the area R2
corresponds to a cytoplasm overlapping area. The variation of the average color image intensity value along the line which joins
the areas R1 and R2 is depicted in (c). Notice that for the area R1 we observe a sharp reduction of the intensity value in both images. For the area R2, although the average intensity value is similar in both images, a sharper intensity reduction (in relation to its neighboring pixels) occurs only for the true nucleus in image (a). This indicates that using the neighborhood of each detected area contributes to the recognition of the true nuclei.
Fig. 5: The selected areas for the construction of the feature set. (a) A cell from the initial image, (b) the detected nucleus boundary with the watershed transform and the enclosed area A, (c) the area B of the bounding box of the detected boundary, (d) the area of the neighborhood Ngh (A^c ∩ B) of the detected nucleus.
Fig. 6: The topology of the neighborhood used for the calculation of the LBP: (a) circle, (b) hyperbola.
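The circular LBP^{riu2} histograms used in Tables II and IV can be computed with scikit-image, whose 'uniform' method implements the rotation-invariant uniform codes of [27]. The hyperbolic sampling topology of panel (b) has no off-the-shelf counterpart, so only the circular variant is sketched here; P, R and the random test image are illustrative:

```python
import numpy as np
from skimage.feature import local_binary_pattern

# rotation-invariant uniform LBP over a circular neighbourhood of P
# samples at radius R; the codes take values 0 .. P+1
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
P, R = 8, 1
codes = local_binary_pattern(img, P, R, method='uniform')

# the feature is the normalised histogram of the codes; its mean and
# standard deviation give the "Mean/Std Histogram" entries of Table II
hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2))
hist = hist / hist.sum()
```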
Fig. 7: Representative histograms of some features of the watershed and the GVF segmentation. Notice that their distribution
consists of a single blob and this allows their discretization into three states at the positions μ ± σ.
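This three-state discretization at μ ± σ, applied before the mutual information computations, can be sketched as follows (a hypothetical helper, not the authors' code):

```python
import numpy as np

def discretize_three_state(x):
    """Map a feature vector to states {0, 1, 2} with cut points at
    mu - sigma and mu + sigma, as described for Fig. 7."""
    mu, sigma = x.mean(), x.std()
    return np.digitize(x, [mu - sigma, mu + sigma])

# values well below, around, and well above the mean land in states 0/1/2
states = discretize_three_state(np.array([0.0, 10.0, 10.0, 10.0, 20.0]))
```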
Fig. 8: The leave-one-out and global mRMR feature rank for the watershed, ACM and GVF segmentation algorithms. For the
leave-one-out mRMR feature rank the standard deviation is also depicted with error bars.
Fig. 9: Results in terms of the HM measure for the K-means clustering for ACM, GVF and watershed segmentation for both (a)
global and (b) leave-one-out mRMR rank. The vertical line indicates the number of features where the HM measure takes its
maximum value for the three segmentation methods. These values of HM are contained in Table V.
Fig. 10: Results in terms of the HM measure for spectral clustering for ACM, GVF and watershed segmentation for both global
(a) and leave-one-out (b) mRMR rank. The vertical line indicates the number of features where the HM measure takes its
maximum value for the three segmentation methods. These values of HM are contained in Table V.
Fig. 11: Results in terms of the HM measure for the SVM classification for ACM, GVF and watershed segmentation for both
global (a) and leave-one-out (b) mRMR rank. For comparison purposes, the indicative values for HM measure were evaluated
using the first 16 features. These features are described in Table IV, while the values of HM are contained in Table V.
GVF | ACM | Watersheds | Ground Truth
Fig. 12: (a)-(c) Segmentation results for several detected nuclei.
GVF | ACM | Watersheds | Ground Truth
Fig. 13: Representative cases of failure for ACM and GVF segmentation in images with (a) a weak gradient at the nucleus boundary, (b) inhomogeneities of the nucleus intensity and (c) the existence of a high gradient value in the neighborhood of