
Human Body Pose Estimation Using Silhouette Shape Analysis

Abstract

We describe a system for human body pose estimation from multiple views that is fast and does not depend on a rigid 3D model. We make use of recent work in decomposition of a silhouette into 2D parts. These 2D part primitives are matched across views to build assemblies in 3D. In order to search for the best assembly, we use a likelihood function that integrates information available from multiple views about body part locations. Occlusion is modeled into the likelihood function so that the algorithm is able to work in a crowded scene even when only part of the person is visible in each view. The algorithm has potential applications in surveillance and promising results have been obtained.

1 Introduction

Determining the pose of humans is an important problem in vision and has many applications. In this paper, we target multi-camera surveillance applications where one wants to recognize the activities of people in a scene in the presence of occlusions and partial occlusions. One cannot assume that a person is visible in isolation or in full in either one or all of the views. Nor can one assume that we have a model of the person, or that the initial body pose is known. Such a system should also be reasonably fast. However, very accurate body pose values are typically not required, and an answer close to the actual body pose might be adequate. We describe an algorithm that can form the basis of such a surveillance system.

Our system estimates the 3D pose of a human body from multiple views. We make use of recent work in decomposition of a silhouette into 2D parts. These 2D part primitives are matched across views to build primitives in 3D, which are then assembled to form a human figure. In order to search for the best assembly, we use a likelihood function that integrates information available from multiple views about body part locations. Greedy search strategies are employed so as to find the best assembly fast.

1.1 Related Work

Human body pose estimation has received considerable interest in the past few years and several approaches have been tried for different applications.

There are many methods for incremental model-based body part tracking where a model of an articulated structure (person) is specified up front [5, 20, 1, 21, 6]. Delamarre and Faugeras [5] try to align the projection of an articulated structure with the silhouettes of a person obtained in multiple views by calculating forces that need to be applied to the structure. Drummond and Cipolla [20] use Lie algebra to incrementally track articulated structures. Bregler and Malik [1] use twists and exponential maps to specify relationships between parts and to track an articulated structure incrementally. Sidenbladh [17] and Choo [4] use Monte Carlo particle filtering to incrementally update the posterior probabilities of pose parameters. These methods need both a 3D model of the human structure and a good initialization, and have potential applications in motion capture [6].

Another class of algorithms [12, 8, 18, 15] try to detect body parts in 2D using template matching and then try to find the best assembly using some criteria. Some other methods learn models of human motion. These models can be based on optical flow [7], exemplars [14, 19], feature vectors [18], support vector machines [15], or statistical mappings (SMA) [16]. These models can then be used to detect and estimate the pose of a human in an observed image.

Our work is most closely related to the work of Kakadiaris and Metaxas [11], who try to acquire 3D body part information from silhouettes extracted in orthogonal views. They employ a deformable human model so that a human of any size can be recognized. The distinguishing feature of our work is that it is able to work in a crowded scene, so that in all of the views the person might be fully or partially occluded. This is accomplished by explicitly modeling occlusion and developing prior models for person shapes from the scene. This helps us to decouple the problems of pose estimation for multiple people so that the degrees of freedom of the problem are decreased substantially.

The paper is organized as follows. Section 2 describes the method of extraction of silhouettes in a crowded scene. Section 3 describes shape analysis of silhouettes and matching parts across views to obtain 3D part primitives. Section 4 describes the likelihood function used for assembly evaluation. Section 5 describes the algorithm used to find the best assembly. We conclude with some preliminary results in Section 6.

2 Extracting Multiple Silhouettes in a Cluttered Scene

We use the method developed by Mittal and Davis [13] in their system M2Tracker for extracting silhouettes of people in a cluttered scene. The method is able to segment regions belonging to different people even when they are not visually isolated. Here, we provide a brief review of the method.

M2Tracker develops two types of models for each person.

2.1 Appearance Models

2.1.1 Color Models

A probabilistic model for the color distribution at different heights of the person is developed using the method of non-parametric Gaussian kernel estimation.
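The kernel estimate for a single height slice can be illustrated with a minimal sketch. It assumes RGB color samples and a fixed isotropic bandwidth; the function name, the bandwidth value and the choice of color space are illustrative assumptions, not details taken from M2Tracker.

```python
import numpy as np

def kernel_color_likelihood(color, samples, bandwidth=10.0):
    """Gaussian kernel density estimate of P(color) from training samples.

    color:     length-3 array (e.g. RGB) observed at a pixel
    samples:   (N, 3) array of colors collected at one height slice
    bandwidth: assumed isotropic kernel width, in color units
    """
    color = np.asarray(color, dtype=float)
    samples = np.asarray(samples, dtype=float)
    d = samples.shape[1]
    # Squared distance from the query color to every stored sample.
    sq_dist = np.sum((samples - color) ** 2, axis=1)
    # Normalizing constant of a d-dimensional isotropic Gaussian.
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return float(np.mean(np.exp(-0.5 * sq_dist / bandwidth ** 2)) / norm)
```

One such estimator per height slice would give the height-dependent color likelihood; how the slices are chosen is left open here.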

2.1.2 “Presence” Probabilities

The other attribute modeled is the “Presence” Probability (denoted by $L(h, d)$), defined as the probability that a person is present (i.e. occupies space) at height $h$ and distance $d$ from the vertical line passing through the person’s center.

Figure 1: Sample Presence Probabilities of people.

These models are developed automatically from the scene and are used to segment images in the following way.

2.2 Pixel Classification

Bayesian classification is used to classify each pixel as belonging to a particular person, or the background. The a posteriori probability that an observation $I(x)$ at pixel $x$ originated from person $j$ (or the background) is

$$P_{posterior}(j \mid I(x)) \propto P_{prior}(j)\, P(I(x) \mid j) \tag{1}$$

The pixel is then classified as

$$\text{Most likely class} = \arg\max_j P_{posterior}(j \mid I(x)) \tag{2}$$

$P(I(x) \mid j)$ is given by the color model of the person at height $h$. For the background, a background model of the scene is used.

The priors include occlusion information and are determined using the following method. For each pixel $x$, a ray is projected in space passing through the optical center of the camera. Minimum distances $d_j$ of this ray are calculated from the vertical lines passing through the currently estimated centers of the people. Also calculated are the heights $h_j$ of the shortest line segments connecting these lines. Then, the prior probability that a pixel $x$ is the image of person $j$ is set as

$$P_{prior}(j) = L_j(h_j, d_j) \prod_{k \text{ occludes } j} \bigl(1 - L_k(h_k, d_k)\bigr), \qquad P_{prior}(\text{background}) = \prod_{\text{all } j} \bigl(1 - L_j(h_j, d_j)\bigr) \tag{3}$$

where $L_j(h_j, d_j)$ is the “presence” probability described earlier. A person “$k$ occludes $j$” if the distance of $k$ to the optical center of the camera is less than the distance of $j$ to the center. The classification procedure helps to incorporate both the color profile of the people and the occlusion information available.

Figure 2: Some results from M2Tracker. The first two images show detection and tracking results and the last two show segmentation results.
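The classification rule of equations (1) to (3) can be sketched for a single pixel as follows. This is a minimal illustration, assuming the presence probabilities, camera distances and color likelihoods have already been computed for the pixel's ray; the function names and the use of -1 as the background label are assumptions made for the example.

```python
import numpy as np

def occlusion_priors(presence, depth):
    """Priors of equation (3) for one pixel.

    presence: presence[j] = L_j(h_j, d_j) for person j along the pixel's ray
    depth:    depth[j] = distance of person j from the camera's optical center
    Returns (person_priors, background_prior).
    """
    presence = np.asarray(presence, dtype=float)
    depth = np.asarray(depth, dtype=float)
    person_priors = np.empty_like(presence)
    for j in range(len(presence)):
        occluders = depth < depth[j]          # persons k that occlude j
        person_priors[j] = presence[j] * np.prod(1.0 - presence[occluders])
    background_prior = np.prod(1.0 - presence)
    return person_priors, background_prior

def classify_pixel(presence, depth, color_lik, background_lik):
    """Equations (1)-(2): index of the most likely person, or -1 for background."""
    person_priors, background_prior = occlusion_priors(presence, depth)
    posteriors = np.append(person_priors * np.asarray(color_lik, dtype=float),
                           background_prior * background_lik)
    best = int(np.argmax(posteriors))
    return -1 if best == len(presence) else best
```

When all people are at distinct distances, the priors built this way sum to one over the people and the background, which is a quick sanity check on an implementation.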

The segmentation algorithm assumes knowledge of approximate person locations. These locations are obtained using a region-based stereo algorithm.

2.3 Obtaining Multiple Segmentations

There are several parameters in the segmentation algorithm. Accurate extraction of different parts of the person requires different parameters. Therefore, it is essential to vary the parameters so as to obtain multiple segmentations. The parameters that we vary are
(1) the relative weight given to the background model,
(2) the relative weight given to different foreground objects so that different objects are highlighted, and
(3) the threshold for determining whether a pixel is left unclassified.

The silhouettes thus obtained are segmented using the method described in the next section.

Figure 3: Multiple segmentations obtained for the image shown in the first image

3 Computing Body-part Primitives

3.1 2D Silhouette Shape Analysis

In order to recover the pose of a person, we break the silhouette of the person into parts. According to human intuition about parts, a segmentation into parts occurs at negative minima of curvature so that the decomposed parts are convex regions. Singh et al. noted that when boundary points can be joined in more than one way to decompose a silhouette, human vision prefers the partitioning scheme which uses the shortest cuts (a cut is the boundary between a part and the rest of the silhouette). They further restrict a cut to cross a symmetry axis in order to avoid short but undesirable cuts. However, most symmetry axes are very sensitive to noise and are expensive to compute. In contrast, we use a constraint on the salience of a part to avoid short but undesirable cuts. According to Hoffman and Singh’s [10] study, there are three factors that affect the salience of a part: the size of the part relative to the whole object, the degree to which the part protrudes, and the strength of its boundaries. Among these three factors, the computation of a part’s protrusion (the ratio of the perimeter of the part, excluding the cut, to the length of the cut) is more efficient and robust to noise and partial occlusion of the object. Thus, we employ the protrusion of a part to evaluate its salience; the salience of a part increases as its protrusion increases.

Figure 4: Silhouette Decomposition

In summary, we combine the short-cut rule and the salience requirement to constrain the other end of a cut. For example, in Figure 5, let $S$ be a silhouette, $C$ be the boundary of $S$, $P$ be a point on $C$ at a negative minimum of curvature, and $P_m$ be a point on $C$ so that $P$ and $P_m$ divide the boundary $C$ into two curves $C_l$, $C_r$ of equal arc length. Then two cuts $PP_l$, $PP_r$ are formed passing through point $P$ such that points $P_l$ and $P_r$ lie on $C_l$ and $C_r$, respectively. The ends $P_l$ and $P_r$ of the two cuts are located as follows:

$$P_l = \arg\min_{P'} \|PP'\| \quad \text{s.t.} \quad \frac{|\widehat{PP'}|}{\|PP'\|} > T_p,\; P' \in C_l,\; PP' \subset S \tag{4}$$

$$P_r = \arg\min_{P'} \|PP'\| \quad \text{s.t.} \quad \frac{|\widehat{PP'}|}{\|PP'\|} > T_p,\; P' \in C_r,\; PP' \subset S \tag{5}$$

where $\widehat{PP'}$ is the smaller part of boundary $C$ between $P$ and $P'$, $|\widehat{PP'}|$ is the arc length of $\widehat{PP'}$, and $|\widehat{PP_l}| / \|PP_l\|$ is the salience of the part bounded by curve $\widehat{PP_l}$ and cut $PP_l$.

Eq. (4) means that point $P_l$ is located so that the cut $PP_l$ is the shortest one among all cuts sharing the same end $P$, lying within the silhouette with the other end lying on contour $C_l$, and resulting in a significant part whose salience is above a threshold $T_p$. The other point $P_r$ is located in the same way using Eq. (5).

Figure 5: Computing the cuts passing through point P
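The search for the far end of a cut can be sketched on a polygonal contour. This is a simplified illustration under stated assumptions: the boundary is an ordered list of points on a closed contour, candidate end points are restricted to those sampled points, and the test that the cut lies inside the silhouette is delegated to an optional caller-supplied predicate (hypothetical, and assumed true when omitted). The name `find_cut_end` is illustrative.

```python
import numpy as np

def find_cut_end(boundary, i, direction, threshold, inside=None):
    """Locate the far end of a cut starting at boundary[i] (equations (4)/(5)).

    boundary:  (N, 2) array of points ordered along the closed contour
    i:         index of P, a negative curvature minimum
    direction: +1 or -1, selecting which half-contour to search
    threshold: minimum salience (arc length / cut length)
    inside:    optional predicate inside(p, q) -> bool that tests whether
               segment pq lies within the silhouette (assumed True if omitted)
    Returns the index of the chosen end point, or None if no cut qualifies.
    """
    boundary = np.asarray(boundary, dtype=float)
    n = len(boundary)
    # edge_len[k] is the length of the edge from point k to point k + 1.
    edge_len = np.linalg.norm(np.roll(boundary, -1, axis=0) - boundary, axis=1)
    half_perimeter = edge_len.sum() / 2.0

    best_index, best_cut = None, np.inf
    arc, k = 0.0, i
    for _ in range(n - 1):
        # Length of the edge traversed when stepping from k in `direction`.
        arc += edge_len[k] if direction > 0 else edge_len[(k - 1) % n]
        k = (k + direction) % n
        if arc > half_perimeter:          # passed the midpoint: left this half
            break
        cut = np.linalg.norm(boundary[k] - boundary[i])
        if cut == 0.0:
            continue
        if inside is not None and not inside(boundary[i], boundary[k]):
            continue
        # Keep the shortest cut whose protrusion exceeds the threshold.
        if arc / cut > threshold and cut < best_cut:
            best_index, best_cut = k, cut
    return best_index
```

On a densely sampled contour, calling the function once with each direction gives the two candidate cuts through a curvature minimum.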

Since negative minima of curvature are obtained by local computation, their computation is not robust in real digital images. We take several computationally efficient strategies to reduce the effects of noise. First, a B-spline approximation is used to moderately smooth the boundary of a silhouette, since the B-spline representation is stable and easy to manipulate locally without affecting the rest of the silhouette. Second, the negative minima of curvature with small magnitude of curvature are removed to avoid parts due to noise or small local deformations. However, curvature is not scale invariant (e.g. its value doubles if the silhouette shrinks by half). One way to transform curvature into a scale-invariant quantity is to first find the chord joining the two closest inflections which bound the point, then multiply the curvature at the point by the length of this chord. The resulting normalized curvature does not change with scale: if the silhouette shrinks to half size, the curvature doubles but the chord halves, so their product is constant.
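The scale invariance of the normalized curvature can be checked numerically with a small sketch. It uses the three-point (Menger) estimate of unsigned curvature as a stand-in for whatever curvature estimator is applied to the B-spline; that choice, the function names and the assumption that the bounding inflections are already known are all illustrative.

```python
import numpy as np

def menger_curvature(p_prev, p, p_next):
    """Unsigned discrete curvature: 1 / radius of the circle through three points."""
    p_prev, p, p_next = (np.asarray(q, dtype=float) for q in (p_prev, p, p_next))
    a = np.linalg.norm(p - p_prev)
    b = np.linalg.norm(p_next - p)
    c = np.linalg.norm(p_next - p_prev)
    u, v = p - p_prev, p_next - p_prev
    twice_area = abs(u[0] * v[1] - u[1] * v[0])   # 2 * triangle area
    return 2.0 * twice_area / (a * b * c)         # 4 * area / (a * b * c)

def normalized_curvature(p_prev, p, p_next, inflection_a, inflection_b):
    """Curvature at p times the chord joining the two bounding inflections."""
    chord = np.linalg.norm(np.asarray(inflection_b, dtype=float)
                           - np.asarray(inflection_a, dtype=float))
    return menger_curvature(p_prev, p, p_next) * chord
```

Scaling every input point by one half doubles the value returned by `menger_curvature` and leaves `normalized_curvature` unchanged, which mirrors the argument in the text.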

This analysis yields 2D body parts for a person in a single view. The torso is not found directly by this method, as the body part segmentation can only find protruded parts reliably. Since these protruded parts can overlap, there are a large number of torsos that can be formed from the remaining part of the silhouette. Therefore, we do not attempt to find the torso directly and simply infer it from the other body parts.

Zhao [22] has used a similar method to develop a system for body part identification from a single view. However, body part identification from a single view is very difficult and labelings are often incorrect, especially in the case of partial occlusions and self-occlusions where some body parts are not visible. The problem is also underconstrained since depth information is not available. Another difficulty is that their system requires extraction of good silhouettes, which are not easy to obtain in a dense scene. As opposed to Zhao’s work, we use multiple cameras and identify body pose in 3D using a global analysis.

Figure 6: Multiple body parts obtained using the segmentations shown in Fig. 3

3.2 Computing Body-part Primitives in 3D

2D part primitives obtained using silhouette analysis are used to obtain part primitives in 3D. First, parts that are relatively close to each other are combined with each other. Second, the decomposed parts are matched across views using epipolar geometry to yield 3D body parts. The two endpoints of a part in one view are matched to the corresponding endpoints in the other view. The matching is based simply on lying on the corresponding epipolar line. An additional constraint that can be used is the color profile of the body parts. The disadvantage is that if the viewpoints are substantially different, the color profiles can vary significantly. Also, the color profiles for different body parts can be very similar (e.g. the two legs can have very similar color profiles).
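The epipolar test for a pair of 2D parts can be sketched as follows. The sketch assumes a known fundamental matrix `F` between the two views (with the convention that a point in view 2 lies on the line `F @ x1`), endpoints listed in corresponding order, and a pixel tolerance `tol`; these names and the tolerance value are assumptions for illustration.

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance (pixels) from x2 in view 2 to the epipolar line of x1 from view 1."""
    line = F @ np.array([x1[0], x1[1], 1.0])      # line a*x + b*y + c = 0 in view 2
    return abs(line @ np.array([x2[0], x2[1], 1.0])) / np.hypot(line[0], line[1])

def parts_match(F, part1, part2, tol=2.0):
    """True if both endpoints of part2 lie near the epipolar lines of part1's endpoints.

    part1, part2: pairs of (x, y) endpoints; tol is an assumed pixel tolerance.
    """
    return all(epipolar_distance(F, p, q) <= tol for p, q in zip(part1, part2))
```

For a rectified stereo pair the epipolar lines are horizontal, so the distance reduces to the difference in row coordinates, which makes the functions easy to sanity-check.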

Once matching is done, a certain number of body parts are selected based on their matching score and their endpoints are projected in space to yield 3D body parts.

4 Assembly Evaluation using the Observation Likelihood

Labelings are assigned to these 3D parts by building an assembly that has the maximum likelihood according to an appropriate likelihood function. From the set of 3D body parts, we form sets of possible heads, hands and legs based on size constraints. Additional knowledge, if available, can be used. Such information might consist of the knowledge that the legs are close to the floor or that the person is standing (a constraint on head and hand positions). Then, the problem reduces to finding a head, two hands (or a single hand or no hands, if not found) and two legs (or zero or one leg), such that the assembly has the highest likelihood. The likelihood function we use is described in the next section.

4.1 Observation Likelihood

In order to evaluate a particular assembly $A$, we determine the observation likelihood $P(I_1, I_2, \ldots, I_n \mid A)$, which is the likelihood of observing images $I_1, I_2, \ldots, I_n$ given the particular assembly $A$. Assuming that assemblies have equal priors, the assembly having the highest likelihood is also the assembly with the highest posterior. Since we do not know the body pose of other people in the scene, the observation likelihood cannot be determined unless the problems of body pose determination of different people are coupled with one another. This leads to an exponential increase in the complexity of the algorithm.

We can decouple the problem, however, if we make some simplifying assumptions. Specifically, we can use the method developed in M2Tracker [13] to determine priors using presence probabilities. Then, the general formula for the observation probability at a particular pixel $x$ can be written as:

$$P(I(x)) = \sum_j P_{prior}(j)\, P(I(x) \mid j) \tag{6}$$

where the summation is done over all persons $j$ and the background, and $I(x)$ is the observation at pixel $x$. If the location of the assembly is given, the function $L_j(x)$ (the Presence Probability defined in Section 2.1.2) for the person under consideration changes from a probabilistic to a fixed function so that:

$$L_j(x) = \begin{cases} 1 & \text{if the assembly projects to pixel } x \\ 0 & \text{if it does not project to pixel } x \end{cases} \tag{7}$$

Figure 7: Determining the Projection of an Assembly

Using this definition, one can redetermine the priors for all people using equation (3) and calculate the observation probability using equation (6). This would be the conditional probability $P(I(x) \mid A)$. Assuming that observations at different pixels are independent, the overall observation probability is then simply the product of the observation probabilities at each pixel in each view:

$$P(I_1, I_2, \ldots, I_n \mid A) = \prod_{v=1}^{n} \; \prod_{\text{all pixels } x} P(I_v(x) \mid A) \tag{8}$$

In order to determine the projection of the assembly on an image, we model the hands and legs as cylinders with approximate widths and the head as a sphere and determine their projections onto a view (Figure 7). The torso is built by filling in the polygon formed by taking the joint locations of the (five) parts as the vertices. A more accurate projection can be formed by building a 3D structure based on the joint locations and finding its projection onto the views. That will, however, add to the running time of the algorithm.
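Equations (6) and (8) can be combined into a short sketch that scores one assembly. It assumes that the per-pixel priors have already been recomputed for the assembly under test and that the per-class likelihoods are available as arrays; summing logarithms instead of multiplying probabilities, and the small floor value used to avoid taking the log of zero, are implementation choices made for this example rather than details from the text.

```python
import numpy as np

def pixel_observation_prob(priors, likelihoods):
    """Equation (6): sum over persons and background of prior * likelihood."""
    return float(np.dot(priors, likelihoods))

def assembly_log_likelihood(views):
    """Log of equation (8), summing log-probabilities to avoid underflow.

    views: list (one entry per camera) of (priors, likelihoods) pairs, each an
           array of shape (num_pixels, num_classes); the priors are assumed to
           have been recomputed for the assembly under test via (7) and (3).
    """
    total = 0.0
    for priors, likelihoods in views:
        per_pixel = np.sum(np.asarray(priors) * np.asarray(likelihoods), axis=1)
        # Floor the probability so that log(0) never occurs.
        total += np.sum(np.log(np.maximum(per_pixel, 1e-300)))
    return float(total)
```

Because the logarithm is monotonic, ranking assemblies by this value gives the same ordering as ranking them by the product in equation (8).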

M2Tracker determines the probability $P(I(x) \mid j)$ used in equation (6) using color models at different height slices. This puts only occupancy constraints on the likelihood. However, apart from the hypothesis that the given assembly projects to a particular pixel, we also have information as to which part of the assembly projects to the pixel. Using this information, we can improve results by including in the likelihood function information available from the views about possible body part locations. For example, we might be able to find the head using a face detector. If we have a skin detector, we might want to exclude the torso from the set of body parts that can give rise to it. In the present work, we include an additional term in the likelihood $P(I(x) \mid j)$.

First, we determine the probability $P(ar \mid bp)$ that a particular body part $bp$ has a particular aspect ratio $ar$. This probability is modeled as a 1D Gaussian, its mean and standard deviation learnt using training data. Now, we consider body parts detected from the silhouettes extracted for the person and find their aspect ratios. Finding the value of the function $P(ar \mid bp)$, we assign this value to all pixels belonging to the part in the silhouette. Since the torso is not observed directly, we cannot determine this probability for pixels belonging to it and hence they are assigned a constant value. Since we have multiple silhouettes and hence multiple probability estimates for the aspect ratio at a given pixel, we average them to yield a single result. For pixels lying outside any silhouette, the probability is zero. This will yield the function $P(ar \mid bp)$ for each pixel and each body part $bp$. During evaluation of an assembly, we can compute the value of this function since we know the projections of the body parts onto the image. This probability value can be multiplied with the color likelihood to yield the likelihood function $P(I(x) \mid j)$ used in equation (6).
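The aspect-ratio term can be sketched in a few lines. The Gaussian parameters would come from training data; the constant assigned to torso pixels (`torso_constant`) and the function names are placeholders chosen for this illustration.

```python
import math

def aspect_ratio_prob(aspect_ratio, mean, std):
    """1D Gaussian density of a part's aspect ratio for one body-part type."""
    z = (aspect_ratio - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def pixel_part_prob(estimates, in_torso=False, torso_constant=0.1):
    """Combine per-silhouette estimates at a pixel for one body-part type.

    estimates: aspect-ratio probabilities from each silhouette covering the pixel
    Torso pixels get an assumed constant; pixels in no silhouette get zero.
    """
    if in_torso:
        return torso_constant
    if not estimates:
        return 0.0
    # Average the estimates obtained from the multiple segmentations.
    return sum(estimates) / len(estimates)
```

The value returned for a pixel would then be multiplied with the color likelihood of the part projecting to that pixel.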

5 Searching for the Optimal Assembly

We believe that the best assembly can only be found by an exhaustive search in $O(n^5)$ time (where $n \approx O(10)$ is the number of possible primitives for each part). However, in practice, we have found that the same result can be obtained in $O(n)$ time if we have a good initial estimate of the body part positions, and in $O(n^2)$ time during the initialization phase. We first describe the incremental scheme.


Figure 8: Schematic for the Initialization procedure

5.1 Incremental Algorithm

If we have a sufficiently good estimate of the current body part locations, we use a greedy approach. The idea is to first try to replace each part with candidate parts. If the assembly with the original part has a higher likelihood than the ones with any of the new primitives, we keep the original one. This is repeated for different parts. We have found that, apart from being very fast, this method yields the best results (better than the initialization method), since it is often the case that some body parts have no good candidates at a particular time step, in which case we can keep the old estimate.
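The greedy refinement can be sketched as a single pass over the parts. The dictionary representation of an assembly, the part names and the `likelihood` callable are assumptions made for this illustration; any scoring function where higher is better could be plugged in.

```python
def incremental_update(assembly, candidates, likelihood):
    """Greedy per-part refinement of an assembly.

    assembly:   dict mapping part name -> current primitive
    candidates: dict mapping part name -> list of candidate primitives
    likelihood: callable scoring a full assembly dict (higher is better)
    Returns the refined assembly and its likelihood.
    """
    best = dict(assembly)
    best_score = likelihood(best)
    for part, options in candidates.items():
        for option in options:
            trial = dict(best)
            trial[part] = option
            score = likelihood(trial)
            if score > best_score:       # keep the old part unless strictly better
                best, best_score = trial, score
    return best, best_score
```

With a fixed number of parts and about n candidates per part, the pass makes a number of likelihood evaluations that is linear in n, and a part with no good candidate simply keeps its previous estimate.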

5.2 Initialization

In order to find an initial solution, or to reinitialize the method if the incremental method fails, we use the following approach. First, we try to find good leg pairs. We find the K best pairs (in $O(n^2)$ time) based on the likelihood function by building an assembly of just the two legs. Similarly, we find the K best pairs of hands. Next, we find the K best assemblies consisting of two hands and two legs using the hand and leg pairs found earlier (Figure 8). For this step, we construct the torso using the four joint locations. Finally, the head is added and the best assembly is found. Although this method does not find the optimal assembly, we have found that it is extremely effective in practice and, with the right choice of $K$, yields results very close to an exhaustive search.

If the computational budget allows, we can find the result using both algorithms, taking the assembly with the higher likelihood as the answer.

Figure 9: Results of the algorithm for a person at a particular time instant from multiple perspectives. Note how the person’s body parts are correctly detected even though he is partially occluded from some views.
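The staged initialization can be sketched as a small beam search. The sketch assumes at least two leg candidates, two hand candidates and one head candidate, and a `likelihood` callable that can score partial assemblies; it does not handle missing limbs. The dictionary keys and function name are illustrative.

```python
from itertools import combinations, product
import heapq

def initialize_assembly(legs, hands, heads, likelihood, k=3):
    """Staged search: K best leg pairs, K best hand pairs, then limbs, then head.

    legs, hands, heads: lists of candidate 3D primitives
    likelihood: callable scoring a dict with optional keys
                'legs', 'hands', 'head' (higher is better)
    k: number of partial assemblies kept at each stage
    """
    # Stage 1: score every pair of legs and of hands, keep the K best of each.
    leg_pairs = heapq.nlargest(k, combinations(legs, 2),
                               key=lambda p: likelihood({'legs': p}))
    hand_pairs = heapq.nlargest(k, combinations(hands, 2),
                                key=lambda p: likelihood({'hands': p}))
    # Stage 2: combine the surviving pairs and keep the K best limb sets.
    limb_sets = heapq.nlargest(
        k, product(leg_pairs, hand_pairs),
        key=lambda lh: likelihood({'legs': lh[0], 'hands': lh[1]}))
    # Stage 3: add each head candidate and return the best full assembly.
    full = ({'legs': l, 'hands': h, 'head': head}
            for (l, h), head in product(limb_sets, heads))
    return max(full, key=likelihood)
```

The pair-scoring stage dominates the cost at roughly n squared evaluations for n candidates per part, while the later stages depend only on k and the number of head candidates.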

6 Results

We have obtained promising results for the algorithm. We tested our algorithm on a 5-perspective sequence with two people partially occluding each other in several views. We were able to correctly identify the body parts of the people when they were extended from the body. When the parts were close to the body, the algorithm labeled the part as missing and correctly identified the other parts. Figure 9 shows the result obtained at a particular time instant for the sequence. Figure 10 shows the results over time from a particular view. No initialization was done, nor was any exact 3D model of the person specified. The algorithm took about 10 s/frame on a dual 933 MHz Pentium III processor, where most of the time was spent in evaluating different assemblies.

7 Summary and Conclusions

We have presented an algorithm for body pose estimation that does not require any initializations or models to be specified up front and is able to work in a crowded scene where occlusions, both full and partial, are present. These features make it especially useful for many surveillance applications. In the future, we wish to investigate more cues for body parts in an image (other than the silhouettes), like edge maps and texture regions, which might help us to reduce the number of cameras required to obtain a certain quality of results.

Figure 10: Results for five frames of the sequence.

References

[1] C. Bregler and J. Malik. Tracking people with twists and exponential maps. In IEEE Conference on Computer Vision and Pattern Recognition, 1998.

[2] Q. Cai and J.K. Aggarwal. Tracking human motion in structured environments using a distributed-camera system. Pattern Analysis and Machine Intelligence, 21(11):1241–1247, November 1999.

[3] G.K.M. Cheung, T. Kanade, J.Y. Bouguet, and M. Holler. A real-time system for robust 3d voxel reconstruction of human motions. In CVPR, pages 714–720, 2000.

[4] Kiam Choo and David J. Fleet. People tracking using hybrid monte carlo filtering. In International Conference on Computer Vision, 2001.

[5] Quentin Delamarre and Olivier D. Faugeras. 3d articulated models and multi-view tracking with silhouettes. In ICCV (2), pages 716–721, 1999.

[6] D. DiFranco, T.-J. Cham, and J.M. Rehg. Reconstruction of 3-d figure motion from 2-d correspondences. In IEEE Computer Vision and Pattern Recognition, Kauai, Hawaii, December 2001.

[7] R. Fablet and M.J. Black. Automatic detection and tracking of human motion with a view-based representation. In ECCV02, page I: 476 ff., 2002.

[8] P. Felzenszwalb and D. Huttenlocher. Efficient matching of pictorial structures. In Computer Vision and Pattern Recognition, pages 66–75, 2000.

[9] G. Gavrila and L. Davis. 3d model-based tracking of humans in action: a multi-view approach. In CVPR, pages 73–80, 1996.

[10] Donald D. Hoffman and Manish Singh. Salience of visual parts. Cognition, 63:29–78, 1997.

[11] I.A. Kakadiaris and D. Metaxas. Three-dimensional human body model acquisition from multiple views. International Journal of Computer Vision, 30(3):191–218, 1998.

[12] Sergey Ioffe and David Forsyth. Human tracking with mixtures of trees. In International Conference on Computer Vision, pages 690–695, 2001.

[13] A. Mittal and L.S. Davis. M2Tracker: A multi-view approach to segmenting and tracking people in a cluttered scene using region-based stereo. In European Conference on Computer Vision, page I: 18 ff., 2002.

[14] G. Mori and J. Malik. Estimating human body configurations using shape context matching. In ECCV02, page III: 666 ff., 2002.

[15] R. Ronfard, C. Schmid, and B. Triggs. Learning to parse pictures of people. In ECCV02, page IV: 700 ff., 2002.

[16] R. Rosales, M. Siddiqui, J. Alon, and S. Sclaroff. Estimating 3d body pose using uncalibrated cameras. In IEEE International Conference on Computer Vision and Pattern Recognition, 2001.

[17] Hedvig Sidenbladh, Michael J. Black, and David J. Fleet. Stochastic tracking of 3d human figures using 2d image motion. In ECCV (2), pages 702–718, 2000.

[18] Yang Song, Xiaolin Feng, and Pietro Perona. Towards detection of human motion. In CVPR, pages 810–817, 2000.

[19] J. Sullivan and S. Carlsson. Recognizing and tracking human action. In ECCV02, page I: 629 ff., 2002.

[20] T. Drummond and Roberto Cipolla. Real-time tracking of the multiple articulated structures in multiple views. In European Conference on Computer Vision, 2000.

[21] M. Yamamoto, A. Sato, S. Kawada, T. Kondo, and Y. Osaki. Incremental tracking of human actions from multiple views. In IEEE International Conference on Computer Vision and Pattern Recognition, 1998.

[22] Liang Zhao. Dressed Human Modeling, Detection, and Parts Localization. PhD thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, July 2001.
