Computational Biology, Part 23
Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns
Robert F. Murphy
Copyright 1996, 1999, 2000-2009. All rights reserved.
• Registration: not critical if only using DNA or membrane references
• Intensity scaling (constant scale or contrast stretched for each cell)
Feature levels and granularity
• Single object: object features
• Single cell: cell features
• Single field: field features
Granularity: 2D, 3D, 2Dt, 3Dt
Aggregate/average operator
Cell Segmentation
Single cell segmentation approaches
• Voronoi
• Watershed
• Seeded Watershed
• Level Set Methods
• Graphical Models
Voronoi diagram
[Figure: Voronoi diagram with a seed, an edge, and a vertex labeled]
Given a set of seeds, draw vertices and edges such that each seed is enclosed in a single polygon where each edge is equidistant from the seeds on either side.
Voronoi Segmentation Process
• Threshold DNA image (downsample?)
• Find the objects in the image
• Find the centers of the objects
• Use as seeds to generate Voronoi diagram
• Create a mask for each region in the Voronoi diagram
• Remove regions whose object does not have the intensity/size/shape of a nucleus
Original DNA image
After thresholding and removing small objects
After triangulation
After removing edge cells and filtering
Final regions masked onto original image
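The process above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the code used for the slides: the function names `label_objects` and `voronoi_segment`, the `threshold` and `min_size` parameters, and the choice to filter objects by size only (before building the regions) are all assumptions. Each pixel is assigned to its nearest seed, which gives the same regions as the polygons of a Voronoi diagram.

```python
import numpy as np

def label_objects(mask):
    """4-connected component labeling by flood fill (labels start at 1)."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for r, c in zip(*np.nonzero(mask)):
        if labels[r, c]:
            continue
        current += 1
        labels[r, c] = current
        stack = [(r, c)]
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    stack.append((ny, nx))
    return labels, current

def voronoi_segment(dna, threshold, min_size=1):
    """Return (region label image, seed coordinates) for a DNA image.

    Assumes at least one object survives the size filter.
    """
    labels, n = label_objects(dna > threshold)
    seeds = []
    for k in range(1, n + 1):
        ys, xs = np.nonzero(labels == k)
        if ys.size >= min_size:  # drop objects too small to be nuclei
            seeds.append((ys.mean(), xs.mean()))  # object center
    seeds = np.array(seeds)
    yy, xx = np.indices(dna.shape)
    # Squared distance from every pixel to every seed; the nearest seed wins.
    d2 = (yy[..., None] - seeds[:, 0]) ** 2 + (xx[..., None] - seeds[:, 1]) ** 2
    return d2.argmin(axis=-1) + 1, seeds
```

A mask for one cell is then `regions == k`, where `regions` is the first returned array. Filtering by intensity or shape would follow the same pattern as the size test.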
Watershed Segmentation
• Intensity of an image ~ elevation in a landscape
• Flood from minima
• Prevent merging of “catchment basins”
• Watershed borders built at contacts between basins
• If starting image has intensity centered on the cells (e.g., DNA) that you want to segment, invert image so that bright objects are the sources
• If starting image has intensity centered on the boundary between the cells (e.g., plasma membrane protein), don’t invert so that boundary runs along high intensity
• Drawback is that the number of regions may not correspond to the number of cells
• Seeded watershed allows water to rise only from predefined sources (seeds)
• If DNA image available, can use same approach to generate these seeds as for Voronoi segmentation
• Can use seeds from DNA image but use total protein image for watershed segmentation
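The flooding idea can be illustrated with a small priority-queue sketch. It is a simplified seeded watershed written for this transcript, not a library routine: the name `seeded_watershed` is hypothetical, 4-connectivity is assumed, and no explicit border pixels are drawn (each pixel simply takes the label of the basin that reaches it first, so borders are where labels change).

```python
import heapq
import numpy as np

def seeded_watershed(elevation, markers):
    """Flood `elevation` from labeled seeds and return a label image.

    `markers` is an integer array: 0 for unlabeled pixels, a positive
    label for each seed pixel.
    """
    labels = markers.copy()
    rows, cols = elevation.shape
    heap = []
    counter = 0  # tie-breaker: equal elevations are flooded in insertion order
    for y, x in zip(*np.nonzero(markers)):
        heapq.heappush(heap, (float(elevation[y, x]), counter, int(y), int(x)))
        counter += 1
    while heap:
        # Always continue flooding from the lowest pixel on any basin's rim.
        _, _, y, x = heapq.heappop(heap)
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < rows and 0 <= nx < cols and labels[ny, nx] == 0:
                labels[ny, nx] = labels[y, x]
                heapq.heappush(heap, (float(elevation[ny, nx]), counter, ny, nx))
                counter += 1
    return labels
```

With a membrane-like image the elevation is used as is, so basins meet along the bright boundary. With a DNA-like image the elevation would be the inverted image, and the seeds could come from the thresholded nuclei.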
Zernike Moment Features
• Shape similarity of protein image to Zernike polynomials Z(n,l)
• 49 polynomials and 49 features
Haralick Texture Features
• Correlations of adjacent pixels in gray level images
• Start by calculating co-occurrence matrix P:
  • N by N matrix, N = number of gray levels
  • Element P(i,j) is the probability of pixels with value i being adjacent to pixels with value j
• Four directions in which a pixel can be adjacent
Example image (gray levels 1 to 4):
4 2 2 2 4
1 2 4 1 1
3 4 4 4 2
2 2 3 3 2
3 3 3 2 4
[Figure: Co-occurrence Matrix for the example image]
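As a worked illustration of the definition of P, the sketch below builds a co-occurrence matrix for a 5 by 5 image with gray levels 1 to 4 (the values that survive in this transcript). Several choices here are assumptions rather than the slide's method: the function name `cooccurrence`, counting each pixel pair in both orders so that P is symmetric, and normalizing counts to probabilities. The contrast statistic at the end is one standard Haralick feature, shown only as an example of what is computed from P.

```python
import numpy as np

def cooccurrence(image, offset, levels):
    """Symmetric, normalized co-occurrence matrix for one (dy, dx) offset.

    Gray values are assumed to be integers in 1..levels.
    """
    dy, dx = offset
    counts = np.zeros((levels, levels))
    rows, cols = image.shape
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                i, j = image[y, x] - 1, image[ny, nx] - 1
                counts[i, j] += 1
                counts[j, i] += 1  # count the pair in both orders
    return counts / counts.sum()

image = np.array([[4, 2, 2, 2, 4],
                  [1, 2, 4, 1, 1],
                  [3, 4, 4, 4, 2],
                  [2, 2, 3, 3, 2],
                  [3, 3, 3, 2, 4]])

# The four adjacency directions, as (row step, column step).
directions = {"horizontal": (0, 1), "vertical": (1, 0),
              "diagonal": (1, 1), "antidiagonal": (1, -1)}
matrices = {name: cooccurrence(image, off, 4) for name, off in directions.items()}

P = matrices["horizontal"]
levels = np.arange(1, 5)
# Contrast: sum over i, j of (i - j)^2 * P(i, j).
contrast = ((levels[:, None] - levels[None, :]) ** 2 * P).sum()
```

Features are often averaged over the four directions so that they do not depend on how the cell happens to be oriented.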
Pixel Resolution and Gray Levels
• Texture features are influenced by the number of gray levels and pixel resolution of the image
• Optimization for each image dataset required
• Alternatively, features can be calculated for many resolutions
Fourier features
Frequency representation
• Any signal may be represented as the sum of many sinusoids.
• As more sinusoids are added to the sum, the representation of the original signal becomes more and more accurate.
Frequency representation
• On the left below is a square wave.
• On the right is a single sinusoid with a DC offset which begins to approximate the original data.
Frequency representation
• Now, a second sinusoid is added to the first to create a better approximation.
• The summation may be seen by noting how the first sinusoid is raised and lowered depending on whether the second is positive or negative.
Frequency representation
• Adding still another sinusoid further improves the approximation.
Frequency representation
• Any discrete distribution can be represented in a completely reversible manner (to numerical accuracy) by as many sinusoids as there are points in the distribution
• Amount of signal at various spatial frequencies can be used as image features
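The square-wave example can be reproduced numerically with NumPy's FFT. This is a sketch under assumptions: the 64-point signal, the helper names `partial_sum` and `rms_error`, and the use of amplitude per frequency as the "amount of signal" are illustrative choices, and the example is one-dimensional where image features would use the 2D transform.

```python
import numpy as np

n = 64
t = np.arange(n)
square = np.where(t < n // 2, 1.0, -1.0)  # one period of a square wave

spectrum = np.fft.fft(square)

def partial_sum(spectrum, n_freqs):
    """Rebuild the signal from the DC term and the n_freqs lowest frequencies."""
    # Integer frequency index of every FFT bin (negative bins folded by abs).
    freqs = np.abs(np.fft.fftfreq(spectrum.size, d=1.0 / spectrum.size))
    kept = np.where(freqs <= n_freqs, spectrum, 0)
    return np.fft.ifft(kept).real

def rms_error(n_freqs):
    return np.sqrt(np.mean((square - partial_sum(spectrum, n_freqs)) ** 2))

# The error shrinks as sinusoids are added; with all of them it vanishes.
errors = [rms_error(k) for k in (1, 3, 5, n // 2)]

# Using every coefficient, the transform is reversible to numerical accuracy.
roundtrip = np.fft.ifft(spectrum).real

# Amplitude at each non-negative frequency, usable as a feature vector.
features = np.abs(spectrum[: n // 2 + 1]) / n
```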
• Background subtraction and thresholding
• Translation and rotation
Wavelet transformation
• The Daubechies 4 wavelet
• 10 level decomposition
• Use the average energy of the three high-frequency components at each level as features
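The structure of such features can be sketched with NumPy alone. Note the substitution: this example uses the Haar wavelet, which is simple enough to write out by hand, and not the Daubechies 4 wavelet named above. It also assumes that "average energy" means the mean squared coefficient of each of the three detail bands, giving three values per level. The function names are hypothetical, and the image sides must be divisible by 2 raised to the number of levels.

```python
import numpy as np

def haar_step(a):
    """One level of a 2D Haar transform: (approximation, three detail bands)."""
    s = np.sqrt(2.0)
    # Pair adjacent rows: scaled sums are low-pass, differences are high-pass.
    lo = (a[0::2, :] + a[1::2, :]) / s
    hi = (a[0::2, :] - a[1::2, :]) / s
    # Repeat along columns to get the four sub-bands.
    ll = (lo[:, 0::2] + lo[:, 1::2]) / s
    lh = (lo[:, 0::2] - lo[:, 1::2]) / s
    hl = (hi[:, 0::2] + hi[:, 1::2]) / s
    hh = (hi[:, 0::2] - hi[:, 1::2]) / s
    return ll, (lh, hl, hh)

def wavelet_energy_features(image, levels):
    """Mean energy of each detail band at each level, shape (levels, 3)."""
    approx = np.asarray(image, dtype=float)
    features = []
    for _ in range(levels):
        # Decompose the current approximation; keep only detail energies.
        approx, details = haar_step(approx)
        features.append([np.mean(band ** 2) for band in details])
    return np.array(features)
```

A 10 level decomposition as on the slide would need an image of at least 1024 by 1024 pixels with this scheme, for example after padding.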
Wavelets
• Many wavelet basis functions (filters):
  • Haar
  • Daubechies (-4, -6, -20)
  • Gabor
  • ...
Slide courtesy of Christos Faloutsos
Feature selection
• Having too many features can confuse a classifier
• Can use comparison of feature distributions between classes to choose a subset of features that gets rid of uninformative or redundant features
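One simple way to compare feature distributions between two classes is a Fisher-style score (squared difference of class means over the sum of class variances), combined with a correlation check to skip redundant features. The sketch below is a generic filter written for illustration; the names `fisher_scores` and `select_features` and the `max_corr` cutoff are assumptions, and the classes are assumed to be labeled 0 and 1.

```python
import numpy as np

def fisher_scores(X, y):
    """Separation of the two class distributions for each feature (column of X)."""
    a, b = X[y == 0], X[y == 1]
    return (a.mean(0) - b.mean(0)) ** 2 / (a.var(0) + b.var(0) + 1e-12)

def select_features(X, y, n_keep, max_corr=0.95):
    """Greedily keep the best-separated features, skipping near-duplicates."""
    order = np.argsort(fisher_scores(X, y))[::-1]  # most informative first
    corr = np.abs(np.corrcoef(X, rowvar=False))    # feature-feature correlation
    kept = []
    for f in order:
        # A feature is redundant if it nearly duplicates one already kept.
        if all(corr[f, k] < max_corr for k in kept):
            kept.append(int(f))
        if len(kept) == n_keep:
            break
    return kept
```

Uninformative features end up at the bottom of the ranking, and a feature that is almost a copy of an already selected one is passed over even when its own score is high.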
Feature Selection Methods
• Principal Components Analysis
• Non-Linear Principal Components Analysis
• Independent Components Analysis
• Information Gain
• Stepwise Discriminant Analysis
• Genetic Algorithms
Matlab demonstrations
• Example data files: EndoAndLysoImages.tgz (use tar xzf EndoAndLysoImages.tgz)
• 20 images of a lysosomal protein (LAMP2, stained with antibody h4b4)
• 20 images of an endosomal protein (TfR)