A Scalable Active Framework for Region Annotation in 3D Shape Collections
Supplemental material
Li Yi1 Vladimir G. Kim1,2 Duygu Ceylan2 I-Chao Shen3 Mengyan Yan1
Hao Su1 Cewu Lu1 Qixing Huang4,5 Alla Sheffer3 Leonidas Guibas1
1Stanford University 2Adobe Research 3University of British Columbia 4TTI Chicago 5UT Austin
This document provides a list of supplemental materials that accompany this paper.
• Labeling vs segmentation - We demonstrate the performance of our method on a segmentation benchmark relative to existing alternatives.
• Exploring ShapeNetCore - We demonstrate examples of leveraging our part labels for exploring collections of shapes (see exploration.mp4 file and Section 2).
• Optimality of Verification Set - We prove the optimality of our verification set selection procedure in Section 3 of this document.
• User Interfaces Video - We submit a video demonstrating key features of the annotation and verification user interfaces (see interface.mp4 file).
• ShapeNetCore Analysis - We provide the full set of analysis results for each category of ShapeNetCore, showing FMF vs. the fraction of data that has been verified (Figure 4), and the number of verified models vs. human time (Figure 5).
1 Labeling vs segmentation
Although the focus of our work is labeling, not segmentation, we found that our simple method for generating segmentations from the labels gives reasonable boundaries. We evaluate our method on the PSB benchmark [Chen et al. 2009] using the leave-one-out setup proposed in Kalogerakis et al. [2010] and ELM-OPT [Xie et al. 2014]. We tested our method on 7 categories of rigid objects, excluding articulated shapes such as humans, since our method is not designed for those. One could use intrinsic mapping algorithms to extend our method to non-rigidly deforming shapes. We report the results in Table 1.
           Training  Kalo   ELM-OPT  Ours
Cup        9.8       9.9    10.3     10.8
Airplane   7.4       7.9    8.9      8.6
Chair      5.2       5.4    7.1      5.9
Table      5.9       6.2    5.9      7.5
Mech       8.5       10.0   15.9     14.8
Bearing    6.8       9.7    15.4     9.6
Vase       10.5      16.0   15.6     20.9

Table 1: This table shows the quality of the segmentations our method produces by comparing the Rand index scores achieved on several categories of PSB by our method and the prior segmentation algorithms Kalo (Kalogerakis et al.) and ELM-OPT (Xie et al.). Lower Rand index scores indicate better performance.
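For reference, a Rand index between two labelings of the same mesh faces counts the fraction of face pairs on which the labelings agree (grouped together in both, or apart in both); the PSB benchmark reports a dissimilarity variant where lower is better. The sketch below, with illustrative integer labels and uniform (rather than face-area) weighting as a simplifying assumption, computes the standard agreement form via a contingency table:

```python
from collections import Counter
from math import comb

def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two partitions agree
    (grouped together in both labelings, or apart in both)."""
    assert len(labels_a) == len(labels_b)
    total = comb(len(labels_a), 2)
    # Pairs grouped together by both labelings at once.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together by each labeling individually.
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Agreements = (together in both) + (apart in both).
    return (total + 2 * same_both - same_a - same_b) / total

score = rand_index([0, 0, 1, 1, 2], [0, 0, 1, 1, 1])  # high agreement
```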
In the paper, we compare our method with supervised labeling approaches based on the F1 measure, and we evaluate different design choices of our pipeline using the same measure. For a better understanding of the labeling quality in these experiments, we also provide precision and recall for the final labeling results in Tables 2 and 3, respectively. In addition, we use precision and recall as evaluation metrics to compare against supervised labeling approaches in Figure 1. Notice that, unlike traditional retrieval tasks, our experiment does not involve any parameter that trades off precision against recall, so we instead show how the average precision and recall change under different training data percentages.

Figure 1: This comparison corresponds to the one shown in Figure 12 of the paper, but with precision and recall as evaluation metrics.
Precision/Recall   Wu et al. [2014]   Kalogerakis et al. [2010]   Ours
Lamp               0.823/0.786        0.873/0.801                 0.907/0.827
Chair              0.824/0.818        0.924/0.905                 0.935/0.922

Table 2: This table gives the average per-part precision and recall corresponding to the comparison conducted in Figure 12 of the paper. The numbers reported here are generated when the training data percentage is 5%.
Precision/Recall         Chair-400     Vase-300
with all components      0.950/0.953   0.909/0.889
no active selection      0.928/0.946   0.873/0.857
no verification step     0.889/0.896   0.825/0.830
no ensemble learning     0.917/0.933   0.867/0.883
no correspondence term   0.899/0.915   0.821/0.815
no feature-based term    0.913/0.927   0.859/0.878
no learning of weights   0.940/0.948   0.887/0.877
no smoothness term       0.949/0.951   0.902/0.886

Table 3: This table gives the average per-part labeling precision and recall corresponding to the experiment conducted in Figure 13 of the paper. Different variants of our method were tested, each with one component removed. The numbers reported here correspond to the end point of each curve in Figure 13 of the paper.
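Per-part averages like those in Tables 2 and 3 can be reproduced from predicted and ground-truth face labels by computing precision and recall for each part label and macro-averaging. Uniform (rather than area-weighted) face weighting and the chair labels below are illustrative assumptions of this sketch:

```python
def per_part_precision_recall(pred, truth, labels):
    """Macro-averaged precision and recall over the given part labels."""
    precisions, recalls = [], []
    for lbl in labels:
        tp = sum(p == lbl and t == lbl for p, t in zip(pred, truth))
        fp = sum(p == lbl and t != lbl for p, t in zip(pred, truth))
        fn = sum(p != lbl and t == lbl for p, t in zip(pred, truth))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(labels), sum(recalls) / len(labels)

# Hypothetical per-face labels for a single chair.
p, r = per_part_precision_recall(['back', 'back', 'leg', 'leg'],
                                 ['back', 'leg', 'leg', 'leg'],
                                 labels=['back', 'leg'])
```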
2 Exploring ShapeNetCore
We can leverage the obtained annotations to explore and gain insights about the data in ShapeNetCore. Instead of considering only global shape similarities, understanding salient regions enables us to organize data by comparing features in specific regions only. For example, we can embed the models based on their part-to-part similarity as shown in Figure 2 (where part similarity is computed with lightfield descriptors). Note how this quickly allows the user to see different regions due to variations in a particular part. We also enable faceted search in the spirit of Kim et al. [2013], allowing users to select parts of interest and search for shapes that have the desired arrangement of parts (see Figure 3 and video).

Figure 2: We demonstrate that our annotations can facilitate exploration of large collections of 3D models. We show embeddings of chairs based on the shape of their backs and legs (see video for an interactive example).

Figure 3: Part annotations can facilitate shape retrieval. In this example for ShapeNetCore chairs, we show models retrieved based only on the shape of the base (a), the shape of the back (b), and by searching for chairs that combine both of these criteria (c).
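An embedding like the one in Figure 2 can be sketched with classical multidimensional scaling (MDS) applied to a matrix of pairwise part distances. The random point cloud below stands in for actual lightfield-descriptor distances between chair backs, which are not reproduced in this supplement:

```python
import numpy as np

def mds_embed(dist, dim=2):
    """Classical MDS: place items in `dim` dimensions so that Euclidean
    distances approximate the given pairwise distance matrix."""
    n = dist.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    b = -0.5 * j @ (dist ** 2) @ j           # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(b)           # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:dim]       # keep the top `dim` components
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Stand-in for lightfield-descriptor distances between 30 chair backs.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(30, 5))
dist = np.linalg.norm(descriptors[:, None] - descriptors[None, :], axis=-1)
coords = mds_embed(dist)  # 30 x 2 layout for an exploration scatter plot
```

The resulting 2D coordinates can be scattered to give the part-based layouts shown in the video.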
3 Optimality of verification set

We demonstrate the optimality of our method for selecting the verification set $V^m$ (see Section 4 of our paper). Recall that at this stage our method has already selected the annotation set $A^m$, and it will greedily add models to $V^m = \{v_k^m\}$ based on the annotation confidences $C_{ver}^m$ until the utility function $E_U^m = N_{good}^m / T^m$ stops increasing. The input to our algorithm is the sorted confidence values

$$1 \geq C_{ver}^m[1] \geq C_{ver}^m[2] \geq \dots \geq C_{ver}^m[N] \geq 0$$

(we assume that sorting does not change shape indexing, since this does not affect our derivations). We want to determine $v_k^m \in \{0, 1\}$ so that

$$E(V^m) = \frac{N_{good}^{m-1} + \sum_{k=1}^{N} v_k^m C_{ver}^m[k]}{T^{m-1} + \sum_{k=1}^{N} v_k^m \left(\tau_{ident} + (1 - C_{ver}^m[k])\,\tau_{click}\right)}$$

is maximized. Notice that terms that depend on the annotation set $A$ are absorbed into $T^{m-1}$ and can be kept constant at this stage without affecting our derivation.

The optimality of our greedy approach is guaranteed by two observations. First, given a constant number of models to be verified $n$ (i.e., $\sum_k v_k^m = n$), the optimal values are $v_k^m = 1$ for all $k \leq n$ and $0$ everywhere else, since this choice minimizes the denominator and maximizes the numerator. Thus, our algorithm is optimal if the number of models to be verified is fixed to $n$. Let us denote this optimal energy for $n$ models by:

$$f(n) = \max_{V^m \ \text{s.t.}\ \sum_k v_k^m = n} E_U^m(V^m) = \frac{N_{good}^{m-1} + \sum_{k=1}^{n} C_{ver}^m[k]}{T^{m-1} + n\tau_{ident} + n\tau_{click} - \sum_{k=1}^{n} C_{ver}^m[k]\,\tau_{click}} \qquad (1)$$

We denote the numerator in the equation above by $f_{num}$ and the denominator by $f_{denom}$. We now prove that our utility function is monotonically decreasing after we greedily pick the optimal $n$: $f(n+1) \leq f(n) \Rightarrow f(n+i+1) \leq f(n+i)\ \forall i > 0$.

Suppose $f(n+1) \leq f(n)$ is true. This is equivalent to:

$$\frac{f_{num}(n) + C_{ver}^m[n+1]}{f_{denom}(n) + \tau_{ident} + \tau_{click} - C_{ver}^m[n+1]\,\tau_{click}} \leq \frac{f_{num}(n)}{f_{denom}(n)} \;\Leftrightarrow\; \frac{C_{ver}^m[n+1]}{\tau_{ident} + \tau_{click} - C_{ver}^m[n+1]\,\tau_{click}} \leq \frac{f_{num}(n) + C_{ver}^m[n+1]}{f_{denom}(n) + \tau_{ident} + \tau_{click} - C_{ver}^m[n+1]\,\tau_{click}} \qquad (2)$$

by utilizing a simple algebraic equivalence: $\frac{a+c}{b+d} \leq \frac{a}{b} \Leftrightarrow \frac{c}{d} \leq \frac{a+c}{b+d}$ (note that all our terms are positive). Also, since $C_{ver}^m[n+2] \leq C_{ver}^m[n+1]$:

$$\frac{C_{ver}^m[n+2]}{\tau_{ident} + \tau_{click} - C_{ver}^m[n+2]\,\tau_{click}} \leq \frac{C_{ver}^m[n+1]}{\tau_{ident} + \tau_{click} - C_{ver}^m[n+1]\,\tau_{click}} \qquad (3)$$

Now combining Equations 2 and 3 yields:

$$\frac{C_{ver}^m[n+2]}{\tau_{ident} + \tau_{click} - C_{ver}^m[n+2]\,\tau_{click}} \leq \frac{f_{num}(n) + C_{ver}^m[n+1]}{f_{denom}(n) + \tau_{ident} + \tau_{click} - C_{ver}^m[n+1]\,\tau_{click}} \qquad (4)$$

By definition of $f_{num}(n+1)$ and $f_{denom}(n+1)$:

$$\Leftrightarrow\; \frac{C_{ver}^m[n+2]}{\tau_{ident} + \tau_{click} - C_{ver}^m[n+2]\,\tau_{click}} \leq \frac{f_{num}(n+1)}{f_{denom}(n+1)} \qquad (5)$$

and applying $\frac{c}{d} \leq \frac{a}{b} \Leftrightarrow \frac{a+c}{b+d} \leq \frac{a}{b}$:

$$\Leftrightarrow\; \frac{f_{num}(n+1) + C_{ver}^m[n+2]}{f_{denom}(n+1) + \tau_{ident} + \tau_{click} - C_{ver}^m[n+2]\,\tau_{click}} \leq \frac{f_{num}(n+1)}{f_{denom}(n+1)} \qquad (6)$$

Or equivalently, $f(n+2) \leq f(n+1)$. Thus, by induction, $f(n+i+1) \leq f(n+i)$ for $i > 0$. $\square$
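The greedy selection described above can be sketched as follows. Here `tau_ident` and `tau_click` stand for the per-model identification and click times, and the confidence list, `n_good_prev` ($N_{good}^{m-1}$), and `t_prev` ($T^{m-1}$) values in the example are illustrative placeholders:

```python
def select_verification_set(conf, n_good_prev, t_prev, tau_ident, tau_click):
    """Greedily grow the verification set in decreasing-confidence order,
    stopping as soon as the utility E_U = N_good / T stops increasing."""
    conf = sorted(conf, reverse=True)  # enforce the sorted-input assumption
    num, den = n_good_prev, t_prev     # f(0): verify nothing
    best_utility = num / den
    n = 0
    for c in conf:
        # Model n+1 adds c to the expected number of good models, and
        # tau_ident + (1 - c) * tau_click to the expected human time.
        num_next = num + c
        den_next = den + tau_ident + (1.0 - c) * tau_click
        if num_next / den_next <= best_utility:
            break  # utility only decreases from here on (proof above)
        num, den, n = num_next, den_next, n + 1
        best_utility = num / den
    return n, best_utility

# Illustrative numbers: only the most confident model is worth verifying.
n, utility = select_verification_set([0.9, 0.8, 0.2], n_good_prev=10,
                                     t_prev=100.0, tau_ident=5.0,
                                     tau_click=20.0)
```

The early break is justified precisely by the monotonicity result: once $f(n+1) \leq f(n)$, no larger verification set can improve the utility.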
References
CHEN, X., GOLOVINSKIY, A., AND FUNKHOUSER, T. 2009. A benchmark for 3D mesh segmentation. In ACM SIGGRAPH, SIGGRAPH '09, 73:1–73:12.
KALOGERAKIS, E., HERTZMANN, A., AND SINGH, K. 2010. Learning 3D mesh segmentation and labeling. In ACM SIGGRAPH, 102:1–102:12.
WU, Z., SHOU, R., WANG, Y., AND LIU, X. 2014. Interactive shape co-segmentation via label propagation. CAD/Graphics 38, 2, 248–254.
XIE, Z., XU, K., LIU, L., AND XIONG, Y. 2014. 3D shape segmentation and labeling via extreme learning machine. SGP.
Figure 4: These plots depict how FMF relates to the fraction of labeled data for different labels in different categories of ShapeNetCore. See our paper (Section 8) for more details.
[Figure 5 shows one plot per ShapeNetCore category, each plotting N (# of models) against human time (sec), with one curve per part label and a naive-labeling baseline: Airplane (body, wing, engine, tail), Bag (handle, body), Cap (peak, panel), Car (roof, wheel, hood), Chair (back, leg, seat, arm), Earphone (earphone, headband), Guitar (body, head, neck), Knife (blade, handle), Lamp (base, canopy, lampshade), Laptop (keyboard), Motorbike (wheel, seat, gas-tank, handle, light), Mug (handle), Pistol (barrel, handle, trigger-and-guard), Rocket (body, fin, nose), Skateboard (wheel, deck), Table (leg, top).]
Figure 5: These plots depict how the number of positively-verified models relates to total human work time for different labels in different categories of ShapeNetCore. See our paper (Section 8) for more details.