19/01/2015
Region Space Analysis
School of Computer Science and Electronics Engineering
Arief Setyanto
Dr. John C Wood, Prof. Mohammed Ghanbari
SMPTE
London 15 January 2014
Outline
Why segment: we do it
Salient object isolation: feasible?
Low level segmentation: practical?
Metadata from regions: possible?
General Result
[Figure: (a) original video; (b) simplest segmentation result (under-segmented)]
Introduction
A digital image contains a very large number of dots (pixels)
Pre-segmentation reduces the amount of data to be processed
Regions, and even objects, can be found using descriptors
Boundaries and salient content are preserved
Branches of the tree often represent semantic information
All of this can be achieved without the need for thresholds
What We Do
Diagram layers, top to bottom:
Metadata Based Region Query
Region Metadata Generation
Evolution Analysis: Reluctant Merging Detection; Region Relative Surround Saliency
Region Merging for Hierarchical Segmentation: Colour Based; Colour and Motion Direction
Pre-Segmentation: watershed; SLIC
Object
Frames/Images; Sequence of Frames (Video)
Frame/Image
A single frame contains roughly a million pixels or more: for example, 720 lines of
1280 pixels each (1280 x 720 = 921,600 pixels), or 1920 x 1080 (2,073,600 pixels).
Representation: 2D matrices
To reduce computation, neighbouring pixels that carry similar information are
grouped together into a new unit called a region.
Video/Sequence of Frames
Generally, immediately consecutive frames share similar information.
In a three-dimensional representation, video has two spatial dimensions, the
horizontal (x) and vertical (y) axes, plus a temporal axis (t).
Every picture element in this 3D space is called a voxel.
Representation: 3D matrices
Instead of having only spatial neighbours, every voxel has both spatial and
temporal neighbours.
According to a similarity criterion, the voxels in the 3D space can be grouped
to form a volumetric partition, or supervoxel.
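As a minimal sketch of the 3D representation described above, the snippet below lists the spatial and temporal neighbours of one voxel in a video indexed as (t, y, x). The function name `voxel_neighbours` and the choice of 6-connectivity (one step along each axis) are assumptions made for illustration; the slides do not state which connectivity is used.

```python
def voxel_neighbours(t, y, x, shape):
    """Return the 6-connected neighbours of voxel (t, y, x) that lie inside `shape`.

    `shape` is (frames, rows, columns). 6-connectivity is an assumption.
    """
    n_t, n_y, n_x = shape
    offsets = [
        (-1, 0, 0), (1, 0, 0),   # temporal neighbours (previous / next frame)
        (0, -1, 0), (0, 1, 0),   # vertical spatial neighbours
        (0, 0, -1), (0, 0, 1),   # horizontal spatial neighbours
    ]
    result = []
    for dt, dy, dx in offsets:
        tt, yy, xx = t + dt, y + dy, x + dx
        if 0 <= tt < n_t and 0 <= yy < n_y and 0 <= xx < n_x:
            result.append((tt, yy, xx))
    return result


shape = (3, 2, 4)  # 3 frames of 2 rows by 4 columns
corner = voxel_neighbours(0, 0, 0, shape)
middle = voxel_neighbours(1, 1, 1, (3, 3, 3))
```

A corner voxel of the first frame has only three neighbours, while an interior voxel has two temporal and four spatial ones.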
Pre-Segmentation (Spatial)
[Figure: original frame, 288 x 352 pixels; pre-segmentation by (a) watershed, 2675 regions; (b) SLIC, 207 regions]
Region/Superpixel
Algorithms that produce regions
Watershed
Mean shift
SLIC
Video analysis using region temporal correlation
Region descriptor
Colour descriptor
Colour and neighbourhood descriptor
Advantages
Only a single frame is needed per execution
Does not need a large amount of memory
Disadvantage
Needs a separate task to correlate regions across frames
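The region descriptor and the region correlating task can be sketched as follows. This is an illustration only: the descriptor (mean grey level, centroid, size), the cost function and its weights, and the function names are all assumptions, and the nearest-descriptor match shown here is a simple stand-in rather than the correlation method used by the authors.

```python
def region_descriptors(labels, image):
    """Mean grey level, centroid (y, x) and size for every label in a 2D label map."""
    sums = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            s = sums.setdefault(lab, [0.0, 0.0, 0.0, 0])
            s[0] += image[y][x]
            s[1] += y
            s[2] += x
            s[3] += 1
    return {
        lab: {"colour": s[0] / s[3], "centroid": (s[1] / s[3], s[2] / s[3]), "size": s[3]}
        for lab, s in sums.items()
    }


def correlate(desc_a, desc_b, colour_weight=1.0, position_weight=1.0):
    """Map each region of frame A to the lowest-cost region of frame B.

    The cost (hypothetical) is a weighted sum of colour difference and centroid distance.
    Several regions of A may map to the same region of B.
    """
    matches = {}
    for lab_a, a in desc_a.items():
        def cost(lab_b, a=a):
            b = desc_b[lab_b]
            dy = a["centroid"][0] - b["centroid"][0]
            dx = a["centroid"][1] - b["centroid"][1]
            distance = (dy * dy + dx * dx) ** 0.5
            return colour_weight * abs(a["colour"] - b["colour"]) + position_weight * distance
        matches[lab_a] = min(desc_b, key=cost)
    return matches


# Two tiny frames whose regions carry different labels in each frame.
labels_a = [[1, 1, 2, 2], [1, 1, 2, 2]]
image_a = [[10, 10, 200, 200], [10, 10, 200, 200]]
labels_b = [[7, 7, 7, 8], [7, 7, 7, 8]]
image_b = [[12, 12, 12, 198], [12, 12, 12, 198]]

matches = correlate(region_descriptors(labels_a, image_a),
                    region_descriptors(labels_b, image_b))
```

Here the dark region 1 is matched to region 7 and the bright region 2 to region 8, even though the labels differ between frames.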
[Figure: original video and its pre-segmentation]
Pre-Segmentation (Spatio-Temporal)
Volume/Supervoxel of Video
Method:
3D watershed
Advantages
No need to compute region correlation across frames
Provides an approximation of motion direction for each frame as early as the pre-segmentation stage
Disadvantages
Needs a number of frames as input (usually the frames between two cuts)
Needs a large amount of memory in the pre-processing stage
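One way a labelled volume can yield an approximate motion direction is to follow the centroid of a supervoxel from frame to frame. The sketch below assumes the volume is already labelled and indexed as `volume[t][y][x]`; the function names and the use of centroid displacement are illustrative assumptions, not a description of the authors' implementation.

```python
def centroid_track(volume, label):
    """Centroid (y, x) of `label` in each frame of volume[t][y][x]; None if absent."""
    track = []
    for frame in volume:
        sum_y = sum_x = count = 0
        for y, row in enumerate(frame):
            for x, lab in enumerate(row):
                if lab == label:
                    sum_y += y
                    sum_x += x
                    count += 1
        track.append((sum_y / count, sum_x / count) if count else None)
    return track


def motion_vectors(track):
    """Frame-to-frame centroid displacement (dy, dx); None where a centroid is missing."""
    vectors = []
    for prev, cur in zip(track, track[1:]):
        if prev is None or cur is None:
            vectors.append(None)
        else:
            vectors.append((cur[0] - prev[0], cur[1] - prev[1]))
    return vectors


# Supervoxel 1 slides one pixel to the right in every frame.
volume = [
    [[1, 1, 0, 0], [1, 1, 0, 0]],
    [[0, 1, 1, 0], [0, 1, 1, 0]],
    [[0, 0, 1, 1], [0, 0, 1, 1]],
]
track = centroid_track(volume, 1)
vectors = motion_vectors(track)
```

No cross-frame matching is needed because the same label already identifies the volume in every frame.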
Hierarchical Segmentation
In the real world there is no single interpretation of a scene.
One person may interpret a face as a single entity, while another perceives it
as a compound object consisting of eyes, nose, lips, etc.
The idea of hierarchical segmentation is to keep the detailed
information at the lower levels while providing generalisation at the
higher levels.
Our algorithm records every merge in a tree. Because each
iteration merges the pair of partitions with the
smallest distance (for example, colour distance), the result is a binary
partition tree (BPT).
BPT as Hierarchical Segmentation
Pre-Segmentation
Partition Labelling
Region/Volume Adjacency Graph (VAG)
Merging
Record Merging History
Binary Partition Tree
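RAGs for Image
Region Adjacency Graph (RAG)
[Figure: an image partitioned into regions 1 to 5 and the corresponding region adjacency graph]
Volume Adjacency Graph (VAG)
[Figure: volumes 1 to 4 and the corresponding volume adjacency graph]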
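A region adjacency graph can be derived directly from a label map. The following is a minimal sketch assuming 4-connectivity and a dictionary-of-sets representation; the function name `build_rag` and the example label map are hypothetical. A volume adjacency graph would be built the same way with one extra comparison against the voxel in the next frame.

```python
def build_rag(labels):
    """Region adjacency graph of a 2D label map, using 4-connectivity (an assumption)."""
    rag = {}
    height, width = len(labels), len(labels[0])
    for y in range(height):
        for x in range(width):
            a = labels[y][x]
            rag.setdefault(a, set())
            # Looking right and down is enough to visit every neighbouring pair once.
            for yy, xx in ((y, x + 1), (y + 1, x)):
                if yy < height and xx < width:
                    b = labels[yy][xx]
                    if a != b:
                        rag[a].add(b)
                        rag.setdefault(b, set()).add(a)
    return rag


labels = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [4, 4, 3, 5],
]
rag = build_rag(labels)
```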
Merging
Generic segmentation algorithms generally produce an over-segmented
result.
Thousands of tiny partitions carry little meaning on their own.
That is why tiny partitions must be merged in order to obtain the
expected object candidates.
Problems:
Which pair of regions to merge, and in what order
When the merging must stop
Solution: identify salient nodes and propose them as salient candidates
Merging
(1) Merge the most similar pair
(2) Issue a new parent node
(3) Update the volume/region
adjacency graph
(4) If the VAG/RAG is not empty, repeat from (1)
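The four merging steps can be sketched as a small greedy loop that records the merging history as a binary partition tree. This is an illustrative sketch under stated assumptions: regions are described by a single grey level and a size, the distance is the absolute colour difference, the parent colour is the size-weighted mean, and the names `build_bpt`, `parent` and `children` are hypothetical.

```python
def build_bpt(colours, sizes, rag):
    """Greedy pairwise merging; returns (parent, children) maps describing the tree.

    colours, sizes: dicts keyed by leaf label; rag: dict label -> set of neighbour labels.
    """
    colours, sizes = dict(colours), dict(sizes)
    rag = {k: set(v) for k, v in rag.items()}
    parent, children = {}, {}
    next_id = max(rag) + 1
    while any(rag.values()):  # (4) continue while adjacent pairs remain
        # (1) pick the adjacent pair with the smallest colour distance
        a, b = min(
            ((a, b) for a in rag for b in rag[a] if a < b),
            key=lambda p: (abs(colours[p[0]] - colours[p[1]]), p),
        )
        # (2) issue a new parent node with the size-weighted mean colour
        new = next_id
        next_id += 1
        sizes[new] = sizes[a] + sizes[b]
        colours[new] = (colours[a] * sizes[a] + colours[b] * sizes[b]) / sizes[new]
        parent[a] = parent[b] = new
        children[new] = (a, b)
        # (3) update the adjacency graph: the parent inherits both neighbourhoods
        neighbours = (rag.pop(a) | rag.pop(b)) - {a, b}
        for n in neighbours:
            rag[n] -= {a, b}
            rag[n].add(new)
        rag[new] = neighbours
    return parent, children


colours = {1: 10, 2: 12, 3: 200, 4: 205}
sizes = {1: 4, 2: 4, 3: 4, 4: 4}
rag = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
parent, children = build_bpt(colours, sizes, rag)
```

In this toy case the two dark regions merge first, then the two bright ones, and the root joins the two results. No stopping threshold is used: the loop runs until no adjacent pair is left.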
Hierarchical Segmentation on BPT (Image)
Pre-segmentation: Watershed
Initial nodes: 2675
Total nodes: 11828
Levels: 65
Hierarchical Segmentation on BPT (Image)
Pre-segmentation: SLIC
Initial nodes: 206
Total nodes: 363
Levels: 20
BPT for video (example)
[Figure: plot of level 2]
Simplification
Using the word ‘simplification’ avoids committing to ‘segmentation’
A range of tree densities provides a hierarchy for the user to operate in
Tree densities are controlled by examining the gradients of the graph arcs
This can be applied to colour, size, centroid, etc., or to a combination
Application of Hindsight
The tree documents the merging process and is generated without using thresholds
Each individual path from a leaf to the root is unique
The path can be subjected to statistical analysis
The Rule
On an upward path through the tree, a region that grows consistently is a ‘happy’ region
If a discontinuity occurs, a reluctance to merge is evident
A reluctant event should inhibit further merging
The same event observed in colour, size and centroid etc. reinforces
BPT Evolution Analysis
Define all possible paths from the leaves to the root of the BPT:
Path = {Path_1, Path_2, ..., Path_l}
where l is the number of original partitions produced by the watershed.
Each path consists of n nodes, ordered from the leaf to the root:
Path_i = {Node_1, Node_2, ..., Node_n}
where n is the number of nodes on path i from its leaf node to the root; n can vary between paths.
Observe the volume evolution along the path and identify each unhappy merge, which
happens when two neighbouring regions are obliged to merge although they
differ greatly. They may belong to different objects.
Repeat for all possible paths.
In our experiment, we choose the first, second and last peaks from the graph.
The first peak leaves the result over-segmented, the second gives a simpler result, and the last
peak tends to be under-segmented.
For simplicity, all nodes below the peak node are pruned.
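The path analysis above can be sketched as follows. The sketch assumes that the quantity plotted along a path is the relative growth in region size at each merge and that a peak is a strict local maximum of that series; both are assumptions made for illustration, as are the function names and the example sizes.

```python
def path_to_root(leaf, parent):
    """Nodes visited when climbing from `leaf` to the root via the `parent` map."""
    path = [leaf]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def size_jumps(path, sizes):
    """Relative growth in size at every merge step along the path (assumed measure)."""
    return [(sizes[b] - sizes[a]) / sizes[a] for a, b in zip(path, path[1:])]


def peaks(values):
    """Indices of strict local maxima, interior points only."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


def choose_cuts(path, sizes):
    """Nodes just before the first, second and last reluctant merges on the path."""
    idx = peaks(size_jumps(path, sizes))
    if not idx:
        return {}
    picks = {"first": idx[0], "last": idx[-1]}
    if len(idx) > 1:
        picks["second"] = idx[1]
    return {name: path[i] for name, i in picks.items()}


# One hypothetical path: leaf 1 climbs through nodes 10..15 to the root.
parent = {1: 10, 10: 11, 11: 12, 12: 13, 13: 14, 14: 15}
sizes = {1: 10, 10: 12, 11: 60, 12: 66, 13: 70, 14: 280, 15: 300}

path = path_to_root(1, parent)
jump_peaks = peaks(size_jumps(path, sizes))
cuts = choose_cuts(path, sizes)
```

Each returned node is the region as it stood just before a reluctant merge; keeping that node and pruning everything beneath it gives one level of simplification. With only two peaks on this path, the second and last choices coincide.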
Evolution Analysis
Choose a specific leaf node from the BPT, form its path to the root, and
identify the reluctant merges at each merging step.
Plot the evolution in the BPT
Detected reluctant merging
Simplification Result (carphone) – 1st Peak
Carphone – second peak
Carphone – last peak
Soccer – original
Result – (Soccer) – first peak
Soccer – second peak
Soccer – last peak
Test Result
The table below shows how far the algorithm simplifies the segmentation.
Simplification Rate
Metadata
Image processing often involves complex pixel-level processing.
One aim of our work is to shift part of image processing into the database
processing domain.
Regions produced by segmentation are translated into textual database
records.
Task
Saliency identification
Shape Identification for salient region
Metadata Recording
Retrieval from Metadata
The metadata can be queried by either a machine or a human using an SQL-like syntax, extended with special keywords that perform spatial logic operations such as “next to” and “on the left of”.
Region to Textual Metadata
Region to textual features:
RegionID  Level  Parent  Left  Right  Shape     Colour text
2132      0      0       2128  2131   -         Gray
2128      1      2132    2122  2086   Face      Silver
2131      1      2132    2127  2130   -         Blue
2122      2      2128    1994  2115   Triangle  Silver
2086      2      2128    2024  2057   -         Yellow
Region neighbourhood:
RegionID  Neighbour  Angle    Position  Text
2745      2748       23.1707  1         Up Right
2745      2752       58.2401  1         Up Right
2746      2748       271.924  4         Bottom
2746      2752       32.1977  1         Up Right
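Once regions are stored as textual records, ordinary database queries can replace pixel processing. The sketch below loads the five region rows shown above into an in-memory SQLite database and runs plain SQL over them. The table name `region` and the column names (for example `left_child` and `right_child`, chosen because `LEFT` and `RIGHT` are SQL keywords) are assumptions, and a missing shape ("-") is stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE region (
           region_id   INTEGER PRIMARY KEY,
           level       INTEGER,
           parent      INTEGER,
           left_child  INTEGER,
           right_child INTEGER,
           shape       TEXT,     -- NULL when no shape was identified
           colour      TEXT
       )"""
)
rows = [
    (2132, 0, 0, 2128, 2131, None, "Gray"),
    (2128, 1, 2132, 2122, 2086, "Face", "Silver"),
    (2131, 1, 2132, 2127, 2130, None, "Blue"),
    (2122, 2, 2128, 1994, 2115, "Triangle", "Silver"),
    (2086, 2, 2128, 2024, 2057, None, "Yellow"),
]
conn.executemany("INSERT INTO region VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Which regions were identified as a face?
faces = [r[0] for r in conn.execute(
    "SELECT region_id FROM region WHERE shape = 'Face'")]

# Which regions are the direct children of region 2128 in the tree?
children_of_face = [r[0] for r in conn.execute(
    "SELECT region_id FROM region WHERE parent = ? ORDER BY region_id", (2128,))]

# Silver-coloured regions whose parent is a face (self-join on the tree).
silver_in_face = [r[0] for r in conn.execute(
    """SELECT c.region_id
       FROM region AS c JOIN region AS p ON c.parent = p.region_id
       WHERE p.shape = 'Face' AND c.colour = 'Silver'""")]
```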
Metadata Content
1. Region identity
2. Level in the pyramidal tree (BPT)
3. Region statistics (colour, size, centroid)
4. Region relative to its parent and children in the tree
5. Region relative to its neighbours in the spatial domain
6. Region in the temporal domain (how long it is alive, how far it moves)
7. Region shape, for object candidates only, with a certain measure (according
to 3 and 4)
Extended SQL for textual metadata
Extended Query Language
Near to
Next to
South Neighbour to
South East Neighbour to
East Neighbour to
…
Inside
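One possible way to prototype such spatial keywords is to register them as user-defined functions in SQLite, so that they can appear inside an otherwise standard query. The sketch below is an assumption-laden illustration, not the authors' query language: the `direction` and `near_to` functions, the eight 45-degree compass sectors, the centroid columns `cy` and `cx`, and the sample regions are all hypothetical, and the adjacency test implied by "neighbour" is omitted for brevity.

```python
import math
import sqlite3

COMPASS = ["East", "North East", "North", "North West",
           "West", "South West", "South", "South East"]


def direction(y1, x1, y2, x2):
    """Compass direction from centroid 1 to centroid 2 (image y grows downward)."""
    angle = math.degrees(math.atan2(-(y2 - y1), x2 - x1)) % 360
    return COMPASS[int(((angle + 22.5) % 360) // 45)]


def near_to(y1, x1, y2, x2, radius):
    """1 if the two centroids are at most `radius` pixels apart, else 0."""
    return int(math.hypot(y1 - y2, x1 - x2) <= radius)


conn = sqlite3.connect(":memory:")
conn.create_function("direction", 4, direction)
conn.create_function("near_to", 5, near_to)
conn.execute("CREATE TABLE region (id INTEGER PRIMARY KEY, cy REAL, cx REAL)")
conn.executemany("INSERT INTO region VALUES (?, ?, ?)",
                 [(1, 50, 50), (2, 50, 80), (3, 90, 50), (4, 80, 80)])

# "South East Neighbour to" region 1, expressed with the registered function.
south_east = [r[0] for r in conn.execute(
    """SELECT b.id FROM region AS a JOIN region AS b ON a.id != b.id
       WHERE a.id = 1 AND direction(a.cy, a.cx, b.cy, b.cx) = 'South East'""")]

# "Near to" region 1 within 35 pixels.
near = [r[0] for r in conn.execute(
    """SELECT b.id FROM region AS a JOIN region AS b ON a.id != b.id
       WHERE a.id = 1 AND near_to(a.cy, a.cx, b.cy, b.cx, 35) = 1""")]
```

Registering predicates this way keeps the query text close to SQL; dedicated keywords such as "Next to" would need a thin translation layer on top.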