Human Body Pose Estimation Using Silhouette Shape Analysis
Abstract
We describe a system for human body pose estimation from multiple
views that is fast and not dependent on a rigid 3D model. We make
use of recent work in decomposition of a silhouette into 2D parts.
These 2D part primitives are matched across views to build
assemblies in 3D. In order to search for the best assembly, we use
a likelihood function that integrates information available from
multiple views about body part locations. Occlusion is modeled
into the likelihood function so that the algorithm is able to work
in a crowded scene even when only part of the person is visible
in each view. The algorithm has potential applications in
surveillance and promising results have been obtained.
1 Introduction
Determining the pose of humans is an important problem in vision
and has many applications. In this paper, we target multi-camera
surveillance applications where one wants to recognize the
activities of people in a scene in the presence of occlusions and
partial occlusions. One cannot assume that a person is visible in
isolation or in full in either one or all of the views. Nor can one
assume that we have a model of the person, or that the initial body
pose is known. Such a system should also be reasonably fast.
However, very accurate body pose values are typically not required,
and an answer close to the actual body pose might be adequate. We
describe an algorithm that can form the basis of such a
surveillance system.
Our system estimates the 3D pose of a human body from multiple
views. We make use of recent work in decomposition of a silhouette
into 2D parts. These 2D part primitives are matched across views to
build primitives in 3D which are then assembled to form a human
figure. In order to search for the best assembly, we use a
likelihood function that integrates information available from
multiple views about body part locations. Greedy search strategies
are employed so as to find the best assembly fast.
1.1 Related Work
Human body pose estimation has received considerable interest in
the past few years and several approaches have been tried for
different applications.
There are many methods for incremental model-based body part
tracking where a model of an articulated structure (person) is
specified up front [5, 20, 1, 21, 6]. Delamarre and Faugeras [5]
try to align the projection of an articulated structure with the
silhouettes of a person obtained in multiple views by calculating
forces that need to be applied to the structure. Drummond and
Cipolla [20] use Lie algebra to incrementally track articulated
structures. Bregler and Malik [1] use twists and exponential maps
to specify relationships between parts and to track an articulated
structure incrementally. Sidenbladh [17] and Choo [4] use Monte
Carlo particle filtering to incrementally update the posterior
probabilities of pose parameters. These methods need both a 3D
model of the human structure and a good initialization, and have
potential applications in motion capture [6].
Another class of algorithms [12, 8, 18, 15] tries to detect body
parts in 2D using template matching and then tries to find the best
assembly using some criteria. Some other methods learn models of
human motion. These models can be based on optical flow [7],
exemplars [14, 19], feature vectors [18], support vector machines
[15], or statistical mappings (SMA) [16]. These models can then be
used to detect and estimate the pose of a human in an observed
image.
Our work is most closely related to the work of Kakadiaris and
Metaxas [11], who try to acquire 3D body part information from
silhouettes extracted in orthogonal views. They employ a deformable
human model so that any size of the human can be recognized. The
distinguishing feature of our work is that it is able to work in a
crowded scene so that in all of the views, the person might be
fully or partially occluded. This is accomplished by explicitly
modeling occlusion and developing prior models for person shapes
from the scene. This helps us to decouple the problems of pose
estimation for multiple people so that the degrees of freedom of
the problem are decreased substantially.
The paper is organized as follows. Section 2 describes the method
of extraction of silhouettes in a crowded scene. Section 3
describes shape analysis of silhouettes and matching parts across
views to obtain 3D part primitives. Section 4 describes the
likelihood function used for assembly evaluation. Section 5
describes the algorithm used to find the best assembly. We conclude
with some preliminary results in Section 6.
2 Extracting Multiple Silhouettes in a Cluttered Scene
We use the method developed by Mittal and Davis [13] in their
system M2Tracker for extracting silhouettes of people in a
cluttered scene. The method is able to segment regions belonging to
different people even when they are not visually isolated. Here, we
provide a brief review of the method.
M2Tracker develops two types of models for each person.
2.1 Appearance Models
2.1.1 Color Models
A probabilistic model for the color distribution at different
heights of the person is developed using the method of
non-parametric Gaussian kernel estimation.
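As a rough illustration of non-parametric Gaussian kernel estimation, the sketch below estimates the density of a single color value from training samples gathered at one height slice. The function name, the one-dimensional color value, and the bandwidth are assumptions made for this example; the color model in [13] may be organized differently.

```python
import math


def kernel_density(value, samples, bandwidth=10.0):
    """Gaussian kernel density estimate of `value` from 1D training samples.

    `bandwidth` is the kernel standard deviation (a hypothetical choice).
    """
    if not samples:
        raise ValueError("need at least one training sample")
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(
        norm * math.exp(-0.5 * ((value - s) / bandwidth) ** 2)
        for s in samples
    )
    return total / len(samples)


# Hypothetical color samples observed at one height slice of a person.
slice_samples = [200.0, 205.0, 198.0, 210.0]
p_near = kernel_density(203.0, slice_samples)  # color close to the samples
p_far = kernel_density(50.0, slice_samples)    # color far from the samples
```

A color close to the training samples receives a much higher density than a distant one, which is what lets the model discriminate between people wearing different colors at the same height.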
2.1.2 “Presence” Probabilities
The other attribute modeled is the “presence” probability (denoted
by $L(h, w)$), defined as the probability that a person is present
(i.e., occupies space) at height $h$ and distance $w$ from the
vertical line passing through the person’s center.
Figure 1: Sample presence probabilities of people.
These models are developed automatically from the scene and are
used to segment images in the following way.
2.2 Pixel Classification
Bayesian classification is used to classify each pixel as belonging
to a particular person, or the background. The a posteriori
probability that an observation $I(x)$ at pixel $x$ originated from
person $j$ (or the background) is

$$P_{\text{posterior}}(j \mid I(x)) \propto P_{\text{prior}}(j)\, \Pr(I(x) \mid j) \qquad (1)$$

The pixel is then classified as

$$\text{Most likely class} = \arg\max_j P_{\text{posterior}}(j \mid I(x)) \qquad (2)$$

$\Pr(I(x) \mid j)$ is given by the color model of the person at
height $h$. For the background, a background model of the scene is
used.

Figure 2: Some results from M2Tracker. The first two images show
detection and tracking results and the last two show segmentation
results.

The priors include occlusion information and are determined using
the following method. For each pixel $x$, a ray is projected in
space passing through the optical center of the camera. Minimum
distances $w_j$ of this ray are calculated from the vertical lines
passing through the currently estimated centers of the people. Also
calculated are the heights $h_j$ of the shortest line segments
connecting these lines. Then, the prior probability that a pixel
$x$ is the image of person $j$ is set as

$$P_{\text{prior}}(j) = L_j(h_j, w_j) \prod_{k \text{ occludes } j} \bigl(1 - L_k(h_k, w_k)\bigr), \qquad P_{\text{prior}}(\text{background}) = \prod_{\text{all } j} \bigl(1 - L_j(h_j, w_j)\bigr) \qquad (3)$$

where $L_j(h_j, w_j)$ is the “presence” probability described
earlier. A person “$k$ occludes $j$” if the distance of $k$ to the
optical center of the camera is less than the distance of $j$ to
the center. The classification procedure helps to incorporate both
the color profile of the people and the occlusion information
available.
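The occlusion-aware priors of Eq. (3) and the classification rule of Eqs. (1) and (2) can be sketched for a single pixel as follows. This is a minimal illustration, not the M2Tracker implementation: the function names, the list-based inputs, and the example numbers are all assumptions.

```python
def occlusion_priors(presence, depth):
    """Priors of Eq. (3) at one pixel.

    presence[j]: presence probability of person j along this pixel's ray.
    depth[j]:    distance of person j from the camera's optical center.
    Returns (list of person priors, background prior).
    """
    priors = []
    for j, l_j in enumerate(presence):
        p = l_j
        for k, l_k in enumerate(presence):
            # Person k occludes person j when k is closer to the camera.
            if k != j and depth[j] > depth[k]:
                p *= 1.0 - l_k
        priors.append(p)
    background = 1.0
    for l_j in presence:
        background *= 1.0 - l_j
    return priors, background


def classify_pixel(presence, depth, color_likelihood, background_likelihood):
    """Most likely class of Eq. (2): a person index or "background"."""
    priors, bg_prior = occlusion_priors(presence, depth)
    scores = {j: priors[j] * color_likelihood[j] for j in range(len(priors))}
    scores["background"] = bg_prior * background_likelihood
    return max(scores, key=scores.get)


# Hypothetical pixel: person 0 stands in front of person 1.
presence = [0.9, 0.8]
depth = [2.0, 5.0]
priors, bg_prior = occlusion_priors(presence, depth)
label = classify_pixel(presence, depth, [0.5, 0.5], 0.5)
```

With equal color likelihoods the front person wins, because the prior of the person behind is scaled down by the chance that the front person occupies the ray.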
The segmentation algorithm assumes knowledge of approximate person
locations. These locations are obtained using a region-based stereo
algorithm.
2.3 Obtaining Multiple Segmentations
There are several parameters in the segmentation algorithm.
Accurate extraction of different parts of the person requires
different parameters. Therefore, it is essential to vary the
parameters so as to obtain multiple segmentations. The parameters
that we vary are (1) the relative weight given to the background
model, (2) the relative weight given to different foreground
objects so that different objects are highlighted, and (3) the
threshold for determining whether a pixel is left unclassified. The
silhouettes thus obtained are segmented using the method described
in the next section.
Figure 3: Multiple segmentations obtained for the image shown in
the first image
3 Computing Body-part Primitives
3.1 2D Silhouette Shape Analysis
In order to recover the pose of a person, we break the silhouette
of the person into parts. According to human intuition about parts,
a segmentation into parts occurs at negative minima of curvature so
that the decomposed parts are convex regions. Singh et al. noted
that when boundary points can be joined in more than one way to
decompose a silhouette, human vision prefers the partitioning
scheme which uses the shortest cuts (a cut is the boundary between
a part and the rest of the silhouette). They further restrict a cut
to cross a symmetry axis in order to avoid short but undesirable
cuts. However, most symmetry axes are very sensitive to noise and
are expensive to compute. In contrast, we use the constraint on the
salience of a part to avoid short but undesirable cuts. According
to Hoffman and Singh’s [10] study, there are three factors that
affect the salience of a part: the size of the part relative to the
whole object, the degree to which the part protrudes, and the
strength of its boundaries. Among these three factors, the
computation of a part’s protrusion (the ratio of the perimeter of
the part, excluding the cut, to the length of the cut) is more
efficient and robust to noise and partial occlusion of the object.
Thus, we employ the protrusion of a part to evaluate its salience;
the salience of a part increases as its protrusion increases.
Figure 4: Silhouette decomposition
In summary, we combine the short-cut rule and the salience
requirement to constrain the other end of a cut. For example, in
Figure 5, let $S$ be a silhouette, $C$ be the boundary of $S$, $P$
be a point on $C$ with a negative minimum of curvature, and $P_m$
be a point on $C$ so that $P$ and $P_m$ divide the boundary $C$
into two curves $C_l$, $C_r$ of equal arc length. Then two cuts
$PP_l$, $PP_r$ are formed passing through point $P$ such that
points $P_l$ and $P_r$ lie on $C_l$ and $C_r$, respectively. The
ends $P_l$ and $P_r$ of the two cuts are located as follows:

$$P_l = \arg\min_{P'} |PP'| \quad \text{s.t.} \quad \frac{|\widehat{PP'}|}{|PP'|} > T_p,\; P' \in C_l,\; PP' \subset S \qquad (4)$$

$$P_r = \arg\min_{P'} |PP'| \quad \text{s.t.} \quad \frac{|\widehat{PP'}|}{|PP'|} > T_p,\; P' \in C_r,\; PP' \subset S \qquad (5)$$

where $\widehat{PP'}$ is the smaller part of boundary $C$ between
$P$ and $P'$, $|\widehat{PP'}|$ is the arc length of
$\widehat{PP'}$, and $|\widehat{PP'}| / |PP'|$ is the salience of
the part bounded by curve $\widehat{PP'}$ and cut $PP'$.

Figure 5: Computing the cuts passing through point $P$ (labeled
points $P$, $P_l$, $P_r$, $P_m$ and curves $C_l$, $C_r$)

Eq. (4) means that point $P_l$ is located so that the cut $PP_l$ is
the shortest one among all cuts sharing the same end $P$, lying
within the silhouette with the other end lying on contour $C_l$,
and resulting in a significant part whose salience is above a
threshold $T_p$. The other point $P_r$ is located in the same way
using Eq. (5).
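The cut selection of Eqs. (4) and (5) can be sketched on a polygonal boundary as below. This is an illustrative sketch under stated assumptions: the boundary is a closed list of vertices, candidate endpoints are given as vertex indices, and the test that a cut lies inside the silhouette is delegated to a caller-supplied predicate `inside` (hypothetical, defaulting to accepting every cut).

```python
import math


def arc_length(boundary, i, j):
    """Length of the closed boundary walked forward from vertex i to vertex j."""
    n = len(boundary)
    total = 0.0
    k = i
    while k != j:
        nxt = (k + 1) % n
        total += math.dist(boundary[k], boundary[nxt])
        k = nxt
    return total


def best_cut(boundary, p, candidates, threshold, inside=lambda a, b: True):
    """Index of the candidate giving the shortest valid cut from p, or None.

    A cut is valid when its salience (smaller arc length divided by cut
    length) exceeds `threshold` and `inside` accepts the segment.
    """
    best, best_len = None, math.inf
    for q in candidates:
        cut = math.dist(boundary[p], boundary[q])
        if q == p or cut == 0.0:
            continue
        arc = min(arc_length(boundary, p, q), arc_length(boundary, q, p))
        salient = arc / cut > threshold
        if salient and inside(boundary[p], boundary[q]) and best_len > cut:
            best, best_len = q, cut
    return best


# Hypothetical silhouette: a 4x5 body with a thin arm sticking out to the right.
boundary = [(0, 0), (4, 0), (4, 2), (8, 2), (8, 3), (4, 3), (4, 5), (0, 5)]
p = 2  # concave corner where the arm meets the body
end = best_cut(boundary, p, [4, 5, 6, 7], threshold=3.0)
```

In this toy shape the selected cut joins the two concave corners at the base of the arm, which is the short cut that separates a strongly protruding part.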
Since negative minima of curvature are obtained by local
computation, their computation is not robust in real digital
images. We take several computationally efficient strategies to
reduce the effects of noise. First, a B-spline approximation is
used to moderately smooth the boundary of a silhouette, since the
B-spline representation is stable and easy to manipulate locally
without affecting the rest of the silhouette. Second, the negative
minima of curvature with small magnitude of curvature are removed
to avoid parts due to noise or small local deformations. However,
curvature is not scale invariant (e.g., its value doubles if the
silhouette shrinks by half). One way to transform curvature into a
scale-invariant quantity is to first find the chord joining the two
closest inflections which bound the point, then multiply the
curvature at the point by the length of this chord. The resulting
normalized curvature does not change with scale: if the silhouette
shrinks to half size, the curvature doubles but the chord halves,
so their product is constant.
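The scale invariance of the normalized curvature can be checked with a small numeric sketch. The function name and the example values are assumptions; the curvature and the two bounding inflection points are taken as already computed.

```python
import math


def normalized_curvature(curvature, inflection_a, inflection_b):
    """Curvature multiplied by the length of the chord joining the two
    closest inflection points that bound the point of interest."""
    return curvature * math.dist(inflection_a, inflection_b)


def scale_point(point, s):
    """Scale a 2D point about the origin by factor s."""
    return (point[0] * s, point[1] * s)


# Hypothetical negative minimum of curvature and its bounding inflections.
k = -0.5
a, b = (0.0, 0.0), (4.0, 3.0)
full = normalized_curvature(k, a, b)

# Shrinking the silhouette by half doubles the curvature and halves the chord.
s = 0.5
half = normalized_curvature(k / s, scale_point(a, s), scale_point(b, s))
```

Both calls yield the same value, so a single threshold on the normalized curvature can be applied to silhouettes of different sizes.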
This analysis yields 2D body parts for a person in a single view.
The torso is not found directly by this method, as the body part
segmentation can only find protruding parts reliably. Since these
protruding parts can overlap, there are a large number of torsos
that can be formed from the remaining part of the silhouette.
Therefore, we do not attempt to find the torso directly and simply
infer it from the other body parts.
Zhao [22] has used a similar method to develop a system for body
part identification from a single view. However, body part
identification from a single view is very difficult and labelings
are often incorrect, especially in the case of partial occlusions
and self-occlusions where some body parts are not visible. The
problem is also underconstrained since depth information is not
available. Another difficulty is that their system requires
extraction of good silhouettes, which are not easy to obtain in a
dense scene. As opposed to Zhao’s work, we use multiple cameras and
identify body pose in 3D using a global analysis.
Figure 6: Multiple body parts obtained using the segmentations
shown in Fig. 3
3.2 Computing Body-part Primitives in 3D
2D part primitives obtained using silhouette analysis are used to
obtain part primitives in 3D. First, parts that are relatively
close to each other are combined with each other. Second, the
decomposed parts are matched across views using epipolar geometry
to yield 3D body parts. The two endpoints of a part in one view are
matched to the corresponding endpoints in the other view. The
matching is based simply on lying on the corresponding epipolar
line. An additional constraint that can be used is the color
profile of the body parts. The disadvantage is that if the
viewpoints are substantially different, the color profiles can vary
significantly. Also, the color profiles for different body parts
can be very similar (e.g., the two legs can have very similar color
profiles).
Once matching is done, a certain number of body parts are selected
based on their matching score and their endpoints are projected in
space to yield 3D body parts.
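The endpoint matching described above can be sketched as a distance-to-epipolar-line score. This is an assumption-laden illustration rather than the scoring used in the system: it assumes a known fundamental matrix `F` with the convention x2^T F x1 = 0, and it uses the summed endpoint distances as a hypothetical matching score (lower is better).

```python
import math


def epipolar_distance(F, pt1, pt2):
    """Distance from pt2 (view 2) to the epipolar line of pt1 (view 1).

    F is a 3x3 fundamental matrix given as nested lists. The line must be
    non-degenerate (its first two coefficients cannot both be zero).
    """
    x = (pt1[0], pt1[1], 1.0)
    a, b, c = (sum(F[r][k] * x[k] for k in range(3)) for r in range(3))
    return abs(a * pt2[0] + b * pt2[1] + c) / math.hypot(a, b)


def part_match_score(F, part1, part2):
    """Sum of epipolar distances over the two endpoints of a part."""
    return sum(epipolar_distance(F, p, q) for p, q in zip(part1, part2))


# Fundamental matrix of an idealized rectified pair (pure horizontal shift),
# for which epipolar lines are image rows. Chosen only for the example.
F_rect = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
part_view1 = ((10.0, 20.0), (30.0, 40.0))
good_view2 = ((5.0, 20.0), (25.0, 41.0))
bad_view2 = ((5.0, 60.0), (25.0, 90.0))
good_score = part_match_score(F_rect, part_view1, good_view2)
bad_score = part_match_score(F_rect, part_view1, bad_view2)
```

Parts whose endpoints lie near the corresponding epipolar lines get a low score and would be kept as 3D candidates; a color-profile term could be added to the score where the viewpoints are similar enough.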
4 Assembly Evaluation using the Observation Likelihood
Labelings are assigned to these 3D parts by building an assembly
that has the maximum likelihood according to an appropriate
likelihood function. From the set of 3D body parts, we form sets of
possible heads, hands and legs based on size constraints.
Additional knowledge, if available, can be used. Such information
might consist of the knowledge that the legs are close to the floor
or that the person is standing (a constraint on head and hand
positions). Then, the problem reduces to finding a head, two hands
(or a single or no hands, if not found) and two legs (or 0 or 1
legs), such that the assembly has the highest likelihood. The
likelihood function we use is described in the next section.
4.1 Observation Likelihood
In order to evaluate a particular assembly $A$, we determine the
observation likelihood $\Pr(I_1, I_2, \ldots, I_n \mid A)$, which
is the likelihood of observing images $I_1, I_2, \ldots, I_n$ given
the particular assembly $A$. Assuming that assemblies have equal
priors, the assembly having the highest likelihood is also the
assembly with the highest posterior. Since we do not know the body
pose of other people in the scene, the observation likelihood
cannot be determined unless the problems of body pose determination
of different people are coupled with one another. This leads to an
exponential increase in the complexity of the algorithm.
We can decouple the problem, however, if we make some simplifying
assumptions. Specifically, we can use the method developed in
M2Tracker [13] to determine priors using presence probabilities.
Then, the general formula for the observation probability at a
particular pixel $x$ can be written as:

$$\Pr(I(x)) = \sum_j P_{\text{prior}}(j)\, \Pr(I(x) \mid j) \qquad (6)$$

where the summation is done over all persons $j$ and the
background, and $I(x)$ is the observation at pixel $x$. If the
location of the assembly is given, the function $L_j(x)$ (the
presence probability defined in Section 2.1.2) for the person under
consideration changes from a probabilistic to a fixed function so
that:

$$L_j(x) = \begin{cases} 1 & \text{if the assembly projects to pixel } x \\ 0 & \text{if it does not project to pixel } x \end{cases} \qquad (7)$$

Using this definition, one can redetermine the priors for all
people using equation (3) and calculate the observation probability
using equation (6). This would be the conditional probability
$\Pr(I(x) \mid A)$. Assuming that observations at different pixels
are independent, the overall observation probability is then simply
the product of the observation probabilities at each pixel in each
view:

$$\Pr(I_1, I_2, \ldots, I_n \mid A) = \prod_{i=1}^{n} \prod_{\text{all pixels } x} \Pr(I_i(x) \mid A) \qquad (8)$$

Figure 7: Determining the projection of an assembly

In order to determine the projection of the assembly on an image,
we model the hands and legs as cylinders with approximate widths
and the head as a sphere and determine their projections onto a
view (Figure 7). The torso is built by filling in the polygon
formed by taking the joint locations of the (five) parts as the
vertices. A more accurate projection can be formed by building a 3D
structure based on the joint locations and finding its projection
onto the views. That will, however, add to the running time of the
algorithm.
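Equations (6) and (8) can be sketched as follows. Summing logarithms instead of multiplying probabilities is a substitution made here for numerical stability and is not stated in the text; the function names and the nested-list layout of the inputs are likewise assumptions.

```python
import math


def pixel_observation_prob(priors, likelihoods):
    """Eq. (6): sum over classes j of prior(j) * Pr(I(x) | j)."""
    return sum(p * l for p, l in zip(priors, likelihoods))


def assembly_log_likelihood(views):
    """Logarithm of Eq. (8).

    views: one list per camera view; each entry holds, for one pixel, the
    pair (priors, likelihoods) over all persons and the background, with
    the priors already recomputed for the assembly under evaluation.
    """
    total = 0.0
    for pixels in views:
        for priors, likelihoods in pixels:
            prob = pixel_observation_prob(priors, likelihoods)
            if prob <= 0.0:
                return -math.inf  # an impossible observation rules out the assembly
            total += math.log(prob)
    return total


# Hypothetical data: two views with one pixel each and two classes.
views = [
    [([0.9, 0.1], [0.5, 0.2])],
    [([0.5, 0.5], [0.4, 0.4])],
]
log_lik = assembly_log_likelihood(views)
```

Because the logarithm is monotonic, ranking assemblies by this sum gives the same ordering as ranking them by the product in Eq. (8).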
M2Tracker determines the probability $\Pr(I(x) \mid j)$ used in
equation (6) using color models at different height slices. This
puts only occupancy constraints on the likelihood. However, apart
from the hypothesis that the given assembly projects to a
particular pixel, we also have information as to which part of the
assembly projects to the pixel. Using this information, we can
improve results by including in the likelihood function information
available from the views about possible body part locations. For
example, we might be able to find the head using a face detector.
If we have a skin detector, we might want to exclude the torso from
the set of body parts that can give rise to it. In the present
work, we include an additional term in the likelihood
$\Pr(I(x) \mid j)$.
First, we determine the probability that a particular body part
$bp$ has a particular aspect ratio $ar$, $\Pr(ar \mid bp)$. This
probability is modeled as a 1D Gaussian, with its mean and standard
deviation learnt using training data. Now, we consider body parts
detected from the silhouettes extracted for the person and find
their aspect ratios. Finding the value of the function
$\Pr(ar \mid bp)$, we assign this value to all pixels belonging to
the part in the silhouette. Since the torso is not observed
directly, we cannot determine this probability for pixels belonging
to it, and hence they are assigned a constant value. Since we have
multiple silhouettes and hence multiple probability estimates for
the aspect ratio at a given pixel, we average them to yield a
single result. For pixels lying outside any silhouette, the
probability is zero. This will yield the function
$\Pr(ar \mid bp)$ for each pixel and each body part $bp$. During
evaluation of an assembly, we can compute the value of this
function since we know the projections of the body parts onto the
image. This probability value can be multiplied with the color
likelihood to yield the likelihood function $\Pr(I(x) \mid j)$
used in equation (6).
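The aspect-ratio term can be sketched as below. The Gaussian parameters are hypothetical stand-ins for values learnt from training data, and the per-pixel averaging rule (average over the silhouettes that provide an estimate, zero when none does) is one reading of the description above.

```python
import math


def aspect_ratio_prob(ratio, mean, std):
    """1D Gaussian density of a body part's aspect ratio."""
    z = (ratio - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def pixel_part_prob(estimates):
    """Average the estimates a pixel receives from multiple silhouettes.

    A pixel outside every silhouette has no estimates and gets zero.
    """
    return sum(estimates) / len(estimates) if estimates else 0.0


# Hypothetical leg model: mean aspect ratio 4.0 with standard deviation 1.0.
leg_mean, leg_std = 4.0, 1.0
typical = aspect_ratio_prob(4.0, leg_mean, leg_std)
unusual = aspect_ratio_prob(1.0, leg_mean, leg_std)

# A pixel covered by parts in two silhouettes, and a pixel covered by none.
covered = pixel_part_prob([0.2, 0.4])
uncovered = pixel_part_prob([])
```

The averaged value would then be multiplied with the color likelihood of the same pixel to form the term used in equation (6).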
5 Searching for the Optimal Assembly
We believe that the best assembly can only be found by an
exhaustive search in $O(n^5)$ time (where $n \approx O(10)$ is the
number of possible primitives for each part). However, in practice,
we have found that the same result can be obtained in $O(n)$ time
if we have a good initial estimate of the body part positions, and
in $O(n^2)$ time during the initialization phase. We first describe
the incremental scheme.
Figure 8: Schematic for the initialization procedure
5.1 Incremental Algorithm
If we have a sufficiently good estimate of the current body part
locations, we use a greedy approach. The idea is to first try to
replace each part with candidate parts. If the assembly with the
original part has a higher likelihood than the ones with any of the
new primitives, we keep the original one. This is repeated for
different parts. We have found that, apart from being very fast,
this method yields the best results (better than the initialization
method), since it is often the case that some body parts have no
good candidates at a particular time step, in which case we can
keep the old estimate.
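The greedy update can be sketched as follows. The dictionary representation of an assembly, the function names, and the toy likelihood are assumptions made for the example; in the system the likelihood would be the observation likelihood of Section 4.

```python
def incremental_update(assembly, candidates, likelihood):
    """Greedy per-part replacement.

    assembly:   dict mapping part name to its current primitive.
    candidates: dict mapping part name to a list of new primitives.
    likelihood: function scoring a complete assembly (higher is better).
    Returns the updated assembly and its score.
    """
    current = dict(assembly)
    best_score = likelihood(current)
    for part in list(assembly):
        for cand in candidates.get(part, []):
            trial = dict(current)
            trial[part] = cand
            score = likelihood(trial)
            if score > best_score:  # keep the old part unless a candidate is better
                current, best_score = trial, score
    return current, best_score


# Toy example: primitives are numbers and the likelihood rewards closeness
# to hypothetical true positions.
truth = {"head": 10, "left_leg": 1, "right_leg": 2}


def toy_likelihood(a):
    return -sum(abs(a[part] - truth[part]) for part in truth)


previous = {"head": 10, "left_leg": 0, "right_leg": 3}
new_candidates = {"head": [], "left_leg": [1, 5], "right_leg": [7]}
updated, score = incremental_update(previous, new_candidates, toy_likelihood)
```

Each candidate is evaluated once, so the number of likelihood evaluations grows linearly with the number of candidates per part. The right leg, which has no good candidate in this frame, simply keeps its old estimate.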
5.2 Initialization
In order to find an initial solution, or to reinitialize the method
if the incremental method fails, we use the following approach.
First, we try to find good leg pairs. We find the $K$ best pairs
(in $O(n^2)$ time) based on the likelihood function by building an
assembly of just the two legs. Similarly, we find the $K$ best
pairs of hands. Next, we find the $K$ best assemblies consisting of
two hands and two legs using the hand and leg pairs found earlier
(Figure 8). For this step, we construct the torso using the four
joint locations. Finally, the head is added and the best assembly
is found. Although this method does not find the optimal assembly,
we have found that it is extremely effective in practice and, with
the right choice of $K$, yields results very close to an exhaustive
search.
Figure 9: Results of the algorithm for a person at a particular
time instant from multiple perspectives. Note how the person’s body
parts are correctly detected even though he is partially occluded
from some views.
If the computational budget allows, we can find the result using
both algorithms, taking the assembly with the higher likelihood as
the answer.
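The staged initialization can be sketched as a small beam-style search. The names, the dictionary layout, and the toy likelihood are assumptions; the sketch also ignores assemblies with missing hands or legs, which the method allows, and it expects at least two leg candidates, two hand candidates, and one head candidate.

```python
from itertools import combinations, product


def top_k(items, score, k):
    """The k highest-scoring items."""
    return sorted(items, key=score, reverse=True)[:k]


def initialize(legs, hands, heads, likelihood, k=3):
    """Leg pairs, then hand pairs, then limb sets, then the head."""
    leg_pairs = top_k(combinations(legs, 2),
                      lambda p: likelihood({"legs": p}), k)
    hand_pairs = top_k(combinations(hands, 2),
                       lambda p: likelihood({"hands": p}), k)
    limb_sets = top_k(product(leg_pairs, hand_pairs),
                      lambda lh: likelihood({"legs": lh[0], "hands": lh[1]}), k)
    full = [{"legs": l, "hands": h, "head": hd}
            for (l, h) in limb_sets for hd in heads]
    return max(full, key=likelihood)


# Toy example: primitives are numbers and the likelihood is their sum.
def toy_likelihood(assembly):
    total = 0
    for value in assembly.values():
        total += sum(value) if isinstance(value, tuple) else value
    return total


best = initialize(legs=[1, 5, 3], hands=[2, 4, 0], heads=[7, 9],
                  likelihood=toy_likelihood, k=2)
```

Only the pair-building stages enumerate all pairs of candidates; later stages work on at most K survivors each, which is why the cost stays far below that of an exhaustive search over all five parts.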
6 Results
We have obtained promising results for the algorithm. We tested our
algorithm on a 5-perspective sequence with two people partially
occluding each other in several views. We were able to correctly
identify the body parts of the people when they were extended from
the body. When the parts were close to the body, the algorithm
labeled the part as missing and correctly identified the other
parts. Figure 9 shows the result obtained at a particular time
instant for the sequence. Figure 10 shows the results over time
from a particular view. No initialization was done, nor was any
exact 3D model of the person specified. The algorithm took about
10 s/frame on a dual 933 MHz Pentium III processor, where most of
the time was spent in evaluating different assemblies.
7 Summary and Conclusions
We have presented an algorithm for body pose estimation that does
not require any initializations or models to be specified up front
and is able to work in a crowded scene where occlusions, both full
and partial, are present. These features make it especially useful
for many surveillance applications. In the future, we wish to
investigate more cues for body parts in an image (other than the
silhouettes), like edge maps and texture regions, which might help
us to reduce the number of cameras required to obtain a certain
quality of results.
Figure 10: Results for five frames of the sequence.
References
[1] C. Bregler and J. Malik. Tracking people with twists and exponential maps. In IEEE Conference on Computer Vision and Pattern Recognition, 1998.
[2] Q. Cai and J.K. Aggarwal. Tracking human motion in structured environments using a distributed-camera system. Pattern Analysis and Machine Intelligence, 21(11):1241–1247, November 1999.
[3] G.K.M. Cheung, T. Kanade, J.Y. Bouguet, and M. Holler. A real-time system for robust 3d voxel reconstruction of human motions. In CVPR, pages 714–720, 2000.
[4] Kiam Choo and David J. Fleet. People tracking using hybrid monte carlo filtering. In International Conference on Computer Vision, 2001.
[5] Quentin Delamarre and Olivier D. Faugeras. 3d articulated models and multi-view tracking with silhouettes. In ICCV (2), pages 716–721, 1999.
[6] D. DiFranco, T.-J. Cham, and J.M. Rehg. Reconstruction of 3-d figure motion from 2-d correspondences. In IEEE Computer Vision and Pattern Recognition, Kauai, Hawaii, December 2001.
[7] R. Fablet and M.J. Black. Automatic detection and tracking of human motion with a view-based representation. In ECCV02, page I: 476 ff., 2002.
[8] P. Felzenszwalb and D. Huttenlocher. Efficient matching of pictorial structures. In Computer Vision and Pattern Recognition, pages 66–75, 2000.
[9] G. Gavrila and L. Davis. 3d model-based tracking of humans in action: a multi-view approach. In CVPR, pages 73–80, 1996.
[10] Donald D. Hoffman and Manish Singh. Salience of visual parts. Cognition, 63:29–78, 1997.
[11] I.A. Kakadiaris and D. Metaxas. Three-dimensional human body model acquisition from multiple views. International Journal of Computer Vision, 30(3):191–218, 1998.
[12] Sergey Ioffe and David Forsyth. Human tracking with mixtures of trees. In International Conference on Computer Vision, pages 690–695, 2001.
[13] A. Mittal and L.S. Davis. M2Tracker: A multi-view approach to segmenting and tracking people in a cluttered scene using region-based stereo. In European Conference on Computer Vision, page I: 18 ff., 2002.
[14] G. Mori and J. Malik. Estimating human body configurations using shape context matching. In ECCV02, page III: 666 ff., 2002.
[15] R. Ronfard, C. Schmid, and B. Triggs. Learning to parse pictures of people. In ECCV02, page IV: 700 ff., 2002.
[16] R. Rosales, M. Siddiqui, J. Alon, and S. Sclaroff. Estimating 3d body pose using uncalibrated cameras. In IEEE International Conference on Computer Vision and Pattern Recognition, 2001.
[17] Hedvig Sidenbladh, Michael J. Black, and David J. Fleet. Stochastic tracking of 3d human figures using 2d image motion. In ECCV (2), pages 702–718, 2000.
[18] Yang Song, Xiaolin Feng, and Pietro Perona. Towards detection of human motion. In CVPR, pages 810–817, 2000.
[19] J. Sullivan and S. Carlsson. Recognizing and tracking human action. In ECCV02, page I: 629 ff., 2002.
[20] T. Drummond and Roberto Cipolla. Real-time tracking of the multiple articulated structures in multiple views. In European Conference on Computer Vision, 2000.
[21] M. Yamamoto, A. Sato, S. Kawada, T. Kondo, and Y. Osaki. Incremental tracking of human actions from multiple views. In IEEE International Conference on Computer Vision and Pattern Recognition, 1998.
[22] Liang Zhao. Dressed Human Modeling, Detection, and Parts Localization. PhD thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, July 2001.