Autonomous Segmentation of Near-Symmetric Objects through Vision
and Robotic Nudging
Wai Ho Li and Lindsay Kleeman
Abstract— This paper details a robust and accurate segmentation method for near-symmetric objects placed on a table of known geometry. Here we define visual segmentation as the problem of isolating all portions of an image that belong to a physically coherent object. The term Near-Symmetric is used as our method can segment objects with some non-symmetric parts, such as a coffee mug and its handle. Using bilateral symmetry, this problem is solved autonomously and robustly through the aid of physical action provided by a robot manipulator. Our proposed approach does not require prior models of target objects and assumes no previously collected background statistics. Instead, our approach relies on a precise robotic nudge to generate the necessary object motion to perform segmentation. Experiments performed on ten objects show that our model-free approach can autonomously and accurately segment a variety of objects. These experiments also indicate that our segmentation approach is not adversely affected when operating in cluttered scenes and can segment multi-coloured and transparent objects in a robust manner.
I. INTRODUCTION
A. Motivation
Object segmentation is an important sensory process for
robots using vision. It allows a robot to build accurate
internal models of its surroundings by isolating regions of
images that correspond to objects in the real world. Multi-
scale computer vision object recognition methods, such as
SIFT [1] and Haar boosted cascades [2], can imbue a robot
with the ability to robustly detect and classify modeled
objects. However, training such schemes to recognize objects
requires many hand-labeled and well-segmented images of
positive and negative examples. Precious human resources
are required to obtain this kind of training data. For very large
object sets, the amount of time and effort required can be
prohibitive. The autonomous process described in this paper
attempts to address this problem by obtaining accurate object
segmentations robustly without the need for human aid or
intervention.
Another motivating factor is to provide a segmentation
process that is highly autonomous. By limiting target objects
to those with bilateral symmetry, a model-free approach
can be applied, which allows us to abandon the a priori
assumptions and offline training demanded by other seg-
mentation approaches. For example, our method can operate
on transparent objects as we do not assume any temporal
constancy or colour uniformity in an object’s appearance.
Wai Ho Li and Lindsay Kleeman are with the Department of Electrical and Computer Systems Engineering, Monash University, Clayton Campus, Melbourne, Australia. [email protected], [email protected]
This work is intended for use in domestic robotics ap-
plications as there are many objects with symmetry in
most households. However, the sensing parts of the process,
namely locating points of interest using symmetry triangu-
lation and object segmentation by folded frame difference,
are applicable to other robotic tasks. The overall aim is to
provide robots with general methods of dealing with common
household objects such as cups, bottles and cans, without the
burden of mandatory offline training for every new object.
As our approach assumes nothing about the appearance of
the robot manipulator, the actuation of target objects can be
provided by any manipulator capable of performing a robotic
nudge as described in Section III, including a human hand.
B. Contributions
Segmentation using robotic action has been explored in
the past, most recently by Fitzpatrick et al [3], [4]. Their
approach uses a poking action, which sweeps the end effector
across the workspace. The presence of an object is detected
when visual motion increases due to contact with the moving
effector. Their segmentation method uses frames just before
and after this point of contact. No planning is performed
prior to robotic action. Assuming the target object is not
deformed by the poking action, objects of any shape can be
segmented.

The main contributions of our work are as follows. Firstly,
by limiting our scope to near-symmetric objects, locations of
interest are found prior to the application of robotic action.
This is achieved by clustering the intersections between
stereo triangulated symmetry axes and a table plane. By
avoiding dense stereo approaches, we can also localize
transparent objects with bilateral symmetry. Details of our
stereo triangulation approach, including a comparison of
results against dense stereo, can be found in [5].
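The line-plane intersection underlying this localization step can be sketched as follows. This is a minimal illustration only: the function and variable names and the plane parameterization are our own, and the subsequent clustering of intersection points across candidate axes is omitted.

```python
import numpy as np

def axis_table_intersection(p, d, n, offset):
    """Intersect a triangulated symmetry axis, given as a 3D point p
    and direction d, with the table plane {x : n.x = offset}.
    Solving n.(p + t*d) = offset for t gives the intersection point."""
    t = (offset - n @ p) / (n @ d)
    return p + t * d

# A vertical axis above a table plane z = 0 intersects it directly below:
p = np.array([0.0, 0.0, 2.0])   # point on the symmetry axis
d = np.array([0.0, 0.0, -1.0])  # axis direction (pointing down)
n = np.array([0.0, 0.0, 1.0])   # table plane normal
hit = axis_table_intersection(p, d, n, 0.0)
```

In the paper, such intersections are computed for stereo-triangulated symmetry axes and then clustered to find locations of interest on the table.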
Limited by the use of elastic actuators in their manipulator,
the approach of Fitzpatrick et al applies an imprecise
poking action to objects. In contrast, our method uses a
short, accurate robotic nudge, applied only to locations of interest. In experiments, we show that our method does not
tip over tall objects such as empty bottles and does not
damage fragile objects such as ceramic mugs. This level
of gentleness in object manipulation is not demonstrated in
the work of Fitzpatrick et al. While neither method addresses
the problem of end effector obstacle avoidance, the small
workspace footprint of the robotic nudge should make path
planning easier.
Finally, while appearing similar at a glance, our approach
to visual segmentation is very different to that of Fitzpatrick et al.
When the gripper begins its descent at P0, the right
camera image is monitored for motion. Motion detection
is performed at a coarse resolution using 8x8 pixel cells.
Cells with two times the motion of the global average are
labeled as moving. This block motion algorithm is the same
as the one used in our symmetry tracking paper [10]. To
prevent ego motion of the robot manipulator from being
interpreted as object motion, the object's symmetry line is
used as a visual barrier. As the robot gripper never crosses
the symmetry line, motion detection is only performed on
the green region in Figure 3.
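The coarse block motion step described above might be sketched as follows. The 8x8 cell size, absolute frame-difference input, and two-times-global-average threshold follow the text; the function and variable names are ours, and the symmetry-line barrier masking is omitted for brevity.

```python
import numpy as np

def moving_cells(frame_a, frame_b, cell=8, factor=2.0):
    """Label coarse cells whose motion exceeds factor times the
    global average. frame_a, frame_b are greyscale 2D uint8 arrays.
    Returns a boolean grid with one entry per cell."""
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    h, w = diff.shape
    # Sum the absolute frame difference inside each cell x cell block
    cells = diff[:h - h % cell, :w - w % cell] \
        .reshape(h // cell, cell, w // cell, cell).sum(axis=(1, 3))
    return cells > factor * cells.mean()
```

In the paper, motion detection is further restricted to the side of the object's symmetry line away from the gripper, so that the approaching end effector cannot trigger a false detection.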
Once motion has been detected, the robot begins stereo
tracking on the target object’s symmetry line. A Kalman
filter is used to track the polar parameters of the target
symmetry line. The tracking system is identical to the one
described in our previous work on real time monocular
symmetry tracking [10]. The monocular tracker is replicated
twice to perform stereo tracking. Visual segmentation will
only take place if tracking converges to a symmetry axis
roughly perpendicular to the table plane. This prevents poor
segmentation caused by insufficient object motion.
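As an illustration of this tracking step, a minimal constant-position Kalman filter on the line's polar parameters might look like the following. The class name, noise settings, and identity dynamics are our assumptions; the actual filter used in the paper is described in [10].

```python
import numpy as np

class LineKalman:
    """Kalman filter on a line's polar parameters x = [rho, theta],
    with identity (constant-position) dynamics. Illustrative only."""
    def __init__(self, rho, theta, q=1e-3, r=1e-2):
        self.x = np.array([rho, theta], float)
        self.P = np.eye(2)       # state covariance
        self.Q = q * np.eye(2)   # process noise
        self.R = r * np.eye(2)   # measurement noise

    def update(self, rho_meas, theta_meas):
        # Predict: identity dynamics, covariance grows by Q
        P = self.P + self.Q
        # Correct with the newly detected symmetry line (H = I)
        K = P @ np.linalg.inv(P + self.R)
        z = np.array([rho_meas, theta_meas], float)
        self.x = self.x + K @ (z - self.x)
        self.P = (np.eye(2) - K) @ P
        return self.x
```

In the paper's stereo configuration, two such monocular trackers run in parallel, one per camera.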
Videos of the robotic nudge and stereo tracking can be
downloaded from:
www.ecse.monash.edu.au/centres/irrc/li_iro08.php
IV. OBJECT SEGMENTATION
A. Object Segmentation by Folded Frame Difference
(a) Before Nudge (b) After Nudge (c) Frame Difference
(d) Folded Difference (e) Symmetry Filled (f) Segmentation Result
Fig. 6. Segmentation by Folded Frame Difference. Note that the Folded Difference and Symmetry Filled images are rotated such that the object's symmetry line is vertical.
Segmentation is performed using the object motion gen-
erated by the robotic nudge. Figure 6 illustrates the major
steps of segmentation. Figure 6(a) and Figure 6(b) are
images taken by the right camera before and after the nudge.
The absolute frame difference between the before and after
images is shown in Figure 6(c). The green lines are the
object’s symmetry lines before and after the nudge, found
using our symmetry detector. Note that thresholding the raw
frame difference will produce a mask that includes many
background pixels. The mask will also have a large gap
at the center of low-texture objects, such as the clear cup
in the example. Using the object’s symmetry lines, we can
overcome these problems.
Figure 6(d) shows the folded frame difference of the
object. This image is produced by removing the frame dif-
ference pixels between the two symmetry lines. This process
folds the frame difference image together as if it is printed on
a piece of paper, pressing the creases at the symmetry lines
together. Changes in the orientation of the object’s symmetry
lines before and after the nudge are removed prior to folding.
This folding process removes the excess area of the motion
mask autonomously and reduces the size of the motion gap
at the center of the moved object’s frame difference.
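The folding operation can be sketched as follows. This is a simplified version that assumes the before and after symmetry lines have already been merged and rotated to a single vertical axis at column axis_col; the paper additionally removes the frame-difference pixels between the two lines before folding. Names are ours.

```python
import numpy as np

def fold_difference(diff, axis_col):
    """Fold an absolute frame-difference image about a vertical
    symmetry axis at column axis_col, mirroring one half onto the
    other and keeping the stronger response per pixel."""
    h, w = diff.shape
    half = min(axis_col, w - axis_col - 1)
    left = diff[:, axis_col - half:axis_col]
    right = diff[:, axis_col + 1:axis_col + 1 + half]
    # Mirror the right half and combine it with the left half
    return np.maximum(left, right[:, ::-1])
```

Because motion on either side of the axis is merged, a moving edge detected on only one side of a low-texture object still contributes to both sides of the folded mask.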
After folding, a small gap still remains in the frame
difference. This can be seen in Figure 6(d) as a dark vertical
section inside the cup-like shape. To remedy this, we again
exploit object symmetry to our advantage. Recall that the
folding step merges the symmetry lines of the object in the
before and after frames. Using this newly merged symmetry
line as a mirror, we search for motion on either side of it.
A pixel is considered moving if its frame difference value
is above a threshold. The folded difference image is rotated
so that the merged symmetry line is vertical. The widest
pair of moving pixels bisected by the object’s symmetry line
are recorded for each row of the image. This produces a
symmetric contour of the object. By filling the interior of this
contour, we produce the image in Figure 6(e). Note that this
filling approach retains the non-symmetric parts of objects.
The final segmentation result in Figure 6(f) is obtained by
thresholding the symmetry filled difference image.
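The per-row search and fill described above could be sketched as follows, assuming a folded difference image already rotated so the merged symmetry line is the vertical column axis_col. The threshold value and all names are ours.

```python
import numpy as np

def symmetry_fill(folded, axis_col, thresh=20):
    """For each row, find the widest pair of moving pixels bisected
    by the vertical symmetry line at axis_col, then fill between
    them to produce a solid object mask."""
    moving = folded > thresh
    mask = np.zeros(folded.shape, bool)
    for r, row in enumerate(moving):
        left = np.flatnonzero(row[:axis_col])
        right = np.flatnonzero(row[axis_col:])
        if left.size and right.size:
            # Widest extent straddling the symmetry line in this row
            mask[r, left.min():axis_col + right.max() + 1] = True
    return mask
```

Because each row is filled between its outermost moving pixels rather than mirrored, pixels belonging to non-symmetric parts such as a mug handle are retained in the mask, as the text notes.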
V. SEGMENTATION EXPERIMENT RESULTS
Segmentation experiments were carried out on ten ob-
jects of different size, shape, texture and colour. Trans-
parent, multi-coloured and partially symmetric objects are
also included. Objects are set against different backgrounds,
ranging from plain to cluttered. All segmentation results are
obtained autonomously by our robot without any human aid.
Objects in our scenes cast many shadows due to four bright
fluorescent ceiling light sources illuminating the table. For
safety reasons, a flashing warning beacon is active during
robot motion, periodically casting red light on the table when
the robot manipulator is powered.

Due to space constraints, some segmentation results have
been left out. They can be found at:
www.ecse.monash.edu.au/centres/irrc/li_iro08.php
A. Cups without Handles
The white cup in Figure 7 poses a challenge to our
segmentation process not because of its imperfect symmetry,
but because of its shape. Due to its narrow stem-like bottom
half, the nudge produces very small shifts in the object’s
against background clutter. Finally, Figure 15 contains two
segmentation results for a transparent bottle. Note the accu-
rate segmentation obtained for the transparent bottle, which
produces a very weak motion signature when nudged.
VI. CONCLUSION
Our segmentation approach performs robustly and accu-
rately on near-symmetric objects in cluttered environments.
By using the robotic nudge, the entire segmentation process
is carried out autonomously. Multi-coloured and transpar-
ent objects, as well as objects with non-symmetric parts,
are handled in a robust manner. We have shown that our
approach can segment objects of varying visual appearance
autonomously, shifting the burden of training data collection
from the user to the robot.
End effector obstacle avoidance and path planning, espe-
cially in situations where non-symmetric objects are present
in the nudge path, are left to future work. As our symmetry
detection method uses edge pixels as input, our segmentation
approach is visually orthogonal to those that use pixel
information, such as colour and image gradient. In situations
where the target object is non-symmetric, approaches relying
on other features can be applied synergetically.
Our objection to stereo optical flow and graph cuts is their
reliance on object surface information, which is completely
unreliable for transparent and reflective objects. However,
if the opaqueness of an object has been confirmed, these
approaches can be used with our robotic nudge. As the
geometry of our table plane is known, a stereo approach to
segmentation can further improve segmentation by removing
the object shadow which is present in some of the results.
Fig. 15. Transparent Bottle
VII. ACKNOWLEDGMENTS
Thanks go to Steve Armstrong for his help with repairing
the PUMA 260 manipulator and the anonymous reviewers
for their insightful comments.
REFERENCES
[1] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," IJCV, vol. 60, no. 2, pp. 91–110, November 2004.
[2] P. Viola and M. J. Jones, "Rapid object detection using a boosted cascade of simple features," in IEEE CVPR, 2001.
[3] P. Fitzpatrick, "First contact: an active vision approach to segmentation," in Proceedings of Intelligent Robots and Systems (IROS 2003), vol. 3. IEEE, October 2003, pp. 2161–2166.
[4] P. Fitzpatrick and G. Metta, "Grounding vision through experimental manipulation," in Philosophical Transactions of the Royal Society: Mathematical, Physical, and Engineering Sciences, 2003, pp. 2165–2185.
[5] W. H. Li and L. Kleeman, "Fast stereo triangulation using symmetry," in Australasian Conference on Robotics and Automation, 2006.
[6] W. H. Li, A. M. Zhang, and L. Kleeman, "Fast global reflectional symmetry detection for robotic grasping and visual tracking," in Australasian Conference on Robotics and Automation, 2005.
[7] J.-Y. Bouguet, "Camera calibration toolbox for Matlab," Online, July 2006, http://www.vision.caltech.edu/bouguetj/calib_doc/.
[8] K. S. Arun, T. S. Huang, and S. D. Blostein, "Least-squares fitting of two 3-D point sets," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 698–700, 1987.
[9] L. J. Heyer, S. Kruglyak, and S. Yooseph, "Exploring expression data: Identification and analysis of coexpressed genes," Genome Research, vol. 9, pp. 1106–1115, 1999.
[10] W. H. Li and L. Kleeman, "Real time object tracking using reflectional symmetry and motion," in IEEE/RSJ Conference on Intelligent Robots and Systems, 2006, pp. 2798–2803.