
Visual measurement of jackets by structured prediction

J. Serrat, Oriol Ramos

Master in CV, module M3, course 2015-16

Computer Vision Center, {oriolrt | joans}@cvc.uab.es

Index

1 Problem

2 Image capture

3 Samples

4 Method

5 What to do

6 Solution


Problem

A company sells jackets custom tailored to the sizes specified by their clients through its website.


Problem

Sometimes customers return items complaining they don’t fit well. Possible reasons are

- customers don’t measure themselves well with a tailor's tape

- they enter wrong measurements on the web page

- a tailor had a bad day

There’s a need, at pre-shipment time, to compare the actual measurements of an item with those provided by the customer.

Goal

Perform automatic measurements on each garment at pre-shipment time for quality control, by means of computer vision techniques.


Problem

What to measure



Image capture



Image capture

- The chosen vision system: Raspberry Pi + Raspicam

www.raspberrypi.org


Image capture

Why?

- Raspberry Pi is a Linux mini computer

- we managed to install the powerful Python image processing and machine learning libraries we needed

- Raspicam is a 5 MP color camera, fully controllable

- inexpensive, ∼ €100

- slow but can be optimized

- learning to measure can be done on a desktop PC, and only the measuring itself runs here


Samples

We are provided with 23 jackets as working samples, of which

- 16 are of normal color, style and size

- 7 are less frequent


Normal samples

[Figure: photographs of normal samples 1 to 16]


Less frequent samples

[Figure: photographs of less frequent samples 17 to 23]


Method

1. Segmentation

2. Extract contour



Method

3. Compute contour curvature and its extrema

yellow = maxima / convex

red = minima / concave
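The slides do not say how the curvature is computed, so the following is only a minimal sketch of one possible estimate: the signed turning angle at each contour point, followed by a search for its local extrema. The function names `turning_angles` and `local_extrema`, the neighbour offset `k` and the toy L-shaped contour are all assumptions made for illustration.

```python
import math

def turning_angles(contour, k=1):
    """Signed turning angle at each point of a closed contour.

    contour: list of (x, y) points in order; k: neighbour offset (assumed).
    For a counter-clockwise contour in x-right / y-up axes, positive values
    are convex corners and negative values are concave ones.
    """
    n = len(contour)
    angles = []
    for i in range(n):
        x0, y0 = contour[(i - k) % n]
        x1, y1 = contour[i]
        x2, y2 = contour[(i + k) % n]
        ax, ay = x1 - x0, y1 - y0          # incoming direction
        bx, by = x2 - x1, y2 - y1          # outgoing direction
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angles.append(math.atan2(cross, dot))
    return angles

def local_extrema(values):
    """Indices of strict local maxima and minima of a circular sequence."""
    n = len(values)
    maxima, minima = [], []
    for i in range(n):
        prev, cur, nxt = values[i - 1], values[i], values[(i + 1) % n]
        if cur > prev and cur > nxt:
            maxima.append(i)
        elif cur < prev and cur < nxt:
            minima.append(i)
    return maxima, minima

# Toy L-shaped contour, counter-clockwise; the only concave corner is (1, 1).
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
angles = turning_angles(l_shape)
maxima, minima = local_extrema(angles)
```

On a real, densely sampled contour the angles would normally be smoothed first, and in image coordinates (y axis pointing down) the sign convention is reversed.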



Method

4. contour + curvature = extrema points −→ detect keypoints

    labels_keypoints = {
        0: 'left neck',
        1: 'left shoulder',
        2: 'left upper wrist',
        3: 'left lower wrist',
        4: 'left armpit',
        5: 'left waist',
        6: 'right waist',
        7: 'right armpit',
        8: 'right lower wrist',
        9: 'right upper wrist',
        10: 'right shoulder',
        11: 'right neck',
    }
    # see groundtruth.py


Method

or

4. contour + curvature = extrema points −→ segments between points −→ detect segment types

    labels_segments = {
        0: 'neck',
        1: 'left shoulder',
        2: 'outer left sleeve',
        3: 'left wrist',
        4: 'inner left sleeve',
        5: 'left chest',
        6: 'waist',
        7: 'right chest',
        8: 'inner right sleeve',
        9: 'right wrist',
        10: 'outer right sleeve',
        11: 'right shoulder',
    }
    # see groundtruth.py


Method

5. keypoints / segments −→ measures
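As a rough illustration of this last step, a measure can be taken as the distance between two detected keypoints, converted from pixels to millimetres. This is a sketch under assumptions: the pixel coordinates, the calibration constant `MM_PER_PIXEL` and the two measures chosen (shoulder and waist width) are hypothetical and not taken from the slides.

```python
import math

def distance_mm(p, q, mm_per_pixel):
    """Euclidean distance between two image points, converted to millimetres."""
    return math.hypot(q[0] - p[0], q[1] - p[1]) * mm_per_pixel

# Hypothetical detected keypoints in pixel coordinates, keyed by names from
# labels_keypoints; the values are made up for illustration.
keypoints = {
    'left shoulder': (100.0, 200.0),
    'right shoulder': (400.0, 200.0),
    'left waist': (130.0, 600.0),
    'right waist': (370.0, 600.0),
}
MM_PER_PIXEL = 1.5  # assumed camera calibration constant

shoulder_width = distance_mm(keypoints['left shoulder'],
                             keypoints['right shoulder'], MM_PER_PIXEL)
waist_width = distance_mm(keypoints['left waist'],
                          keypoints['right waist'], MM_PER_PIXEL)
```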



What to do

Given pre-processed data

- contours

- curvature

- points of extrema of curvature

- segments between extrema points

- groundtruth: which points are which keypoints, and which type of segment each segment is

plus source code to plot all this, just detect keypoints / segment types


What to do

1. design a graphical model to detect either keypoints or the type of each segment, making the most of the structured nature of the result

2. train it = learn its parameters, trying a few different inference and SSVM learning algorithms

3. perform detection as inference (structured prediction)

4. evaluate with 5-fold cross-validation on the 23 jackets (in each fold, about 19 for learning and 4 for testing)

5. perform detection with standard classification, like a linear SVM

6. compare performance

7. try to explain the difference


Tools

- Python + some IDE (Windows: Python(x, y), Linux: Python + Scipy + Spyder)

- scikit-learn

- Pystruct + dependencies (CVXOPT, qpbo), maybe you’ll need pip for that

Study the Pystruct OCR letter recognition example.


Solution

Graphical model: a chain

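[Figure: chain of hidden nodes y1, y2, ..., yn, each linked to its observation x1, x2, ..., xn by wunary and to the next hidden node by wpairwise]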

- hidden nodes are segments between two curvature extrema

- labels yi are segment types: ’neck’, ’left wrist’, ’left chest’ . . .

- observations xi are feature vectors:

    - segment orientation = angle with x axis

    - the x and y coordinates of endpoints rescaled to [0, 1]

    - segment length . . .
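The feature vector of one segment could be built along these lines. This is a sketch, not the course's code: the function name `segment_features`, the use of the contour bounding box for rescaling, and the order of the features are assumptions.

```python
import math

def segment_features(p0, p1, bbox):
    """Feature vector of the segment p0 -> p1.

    bbox = (xmin, ymin, xmax, ymax) of the whole contour, used here to
    rescale coordinates to [0, 1] (the rescaling rule is an assumption).
    """
    xmin, ymin, xmax, ymax = bbox
    width, height = xmax - xmin, ymax - ymin
    x0, y0 = (p0[0] - xmin) / width, (p0[1] - ymin) / height
    x1, y1 = (p1[0] - xmin) / width, (p1[1] - ymin) / height
    # Orientation as the angle with the x axis, in (-pi, pi].
    angle = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
    length = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
    # Dividing the angle by pi brings it into (-1, 1].
    return [x0, y0, x1, y1, angle / math.pi, length]

# A vertical segment of 10 pixels on the left edge of a 20 x 10 bounding box.
features = segment_features((0, 0), (0, 10), (0, 0, 20, 10))
```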



Solution

What if

- hidden nodes are points of extrema of curvature

- labels yi are types of keypoints: ’left wrist’, ’armpit’ . . .

- observations xi are measures on extrema points (curvature, relative position . . . )

Then the compatibility coefficients wpairwise are not informative: after a keypoint, most often comes a “no keypoint” label, for those extrema which are not keypoints.

Therefore we can rely only on the unary coefficients: this is not so much structured prediction as standard classification of each extremum independently (though it may perform well here).
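The argument can be made concrete by counting label transitions. In the sketch below both label sequences are invented for illustration (the label `'none'` stands for "no keypoint"), and `transition_counts` is a hypothetical helper, not part of the provided source code.

```python
from collections import Counter

def transition_counts(sequences):
    """Count how often label a is immediately followed by label b."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts

# Segment labelling: each label has one typical successor along the contour.
segment_seq = ['neck', 'left shoulder', 'outer left sleeve', 'left wrist']

# Keypoint labelling of curvature extrema (made-up example): keypoints are
# separated by a varying number of extrema that are not keypoints.
keypoint_seq = ['left neck', 'none', 'left shoulder', 'none', 'none',
                'left upper wrist', 'none']

segment_transitions = transition_counts([segment_seq])
keypoint_transitions = transition_counts([keypoint_seq])

# Labels that follow a real keypoint in the toy keypoint sequence.
after_keypoint = {b for (a, b) in keypoint_transitions if a != 'none'}
```

In this toy sequence every keypoint is followed by `'none'`, and `'none'` is followed by several different labels, so the pairwise table says little about which keypoint comes next.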



Solution

Graphical model: a chain

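[Figure: chain of hidden nodes y1, y2, ..., yn, each linked to its observation x1, x2, ..., xn by wunary and to the next hidden node by wpairwise]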

- wunary are the weights of each feature x·j for each possible label y

- wpairwise are the compatibilities between every label pair (yi, yi+1) of two successive segments
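How these two sets of weights combine can be sketched in plain Python: the score of a labelling is the sum of unary terms (weights times features) plus pairwise terms between successive labels, and the best labelling of a chain is found by dynamic programming (Viterbi, i.e. max-product). This is an illustrative sketch only; Pystruct's internal implementation and weight layout may differ, and the tiny weights and observations are made up.

```python
def chain_score(x, y, w_unary, w_pairwise):
    """Score of labelling y for observations x under a chain model.

    x: list of feature vectors; y: list of label indices;
    w_unary[label][feature]; w_pairwise[label][next_label].
    """
    score = sum(sum(w * f for w, f in zip(w_unary[yi], xi))
                for xi, yi in zip(x, y))
    score += sum(w_pairwise[a][b] for a, b in zip(y, y[1:]))
    return score

def viterbi(x, w_unary, w_pairwise):
    """Labelling with the highest chain_score (max-product on a chain)."""
    n_labels = len(w_unary)
    unary = [[sum(w * f for w, f in zip(w_unary[k], xi))
              for k in range(n_labels)] for xi in x]
    best = [unary[0][:]]   # best[t][k]: best score of a prefix ending in k
    back = []              # back[t-1][k]: best previous label for label k
    for t in range(1, len(x)):
        row, ptr = [], []
        for k in range(n_labels):
            prev = max(range(n_labels),
                       key=lambda j: best[-1][j] + w_pairwise[j][k])
            row.append(best[-1][prev] + w_pairwise[prev][k] + unary[t][k])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    label = max(range(n_labels), key=lambda k: best[-1][k])
    path = [label]
    for ptr in reversed(back):
        label = ptr[label]
        path.append(label)
    return path[::-1]

# Two labels, one feature; all numbers are made up.
w_u = [[1.0], [-1.0]]
obs = [[1.0], [0.1], [-1.0]]
no_pairwise = viterbi(obs, w_u, [[0.0, 0.0], [0.0, 0.0]])
with_pairwise = viterbi(obs, w_u, [[2.0, -2.0], [-2.0, 2.0]])
```

With zero pairwise weights each node is labelled independently; with pairwise weights that reward keeping the same label, the last node is pulled to the label of its neighbours.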



Solution

[Figure: magnitudes of the learned weights, |wunary| and |wpairwise|]


Solution

Results:

- evaluation with 5-fold cross-validation of the training set: 5 rounds of about 4 jackets for testing, 19 for training −→ about 760 segments train, 160 test

- features: angle/π, normalized coordinates of the 2 endpoints

- CRF inference with max-product

- FrankWolfeSSVM optimization, C = 5.e+4, tol = 0.001

- compared to

    LinearSVC(multi_class='ovr', dual=False, C=1.e+4)

Method       Scores                               Mean

CRF          1.0    1.0    1.0    1.0    0.99375  0.99875

linear SVM   0.98   0.995  0.98   0.975  0.96875  0.97975

CRF: just 1 segment wrong; SVM: 18 segments wrong; 920 segments in total
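The error counts can be recovered from the per-fold scores with a little arithmetic. The sketch below assumes 40 segments per jacket (920 / 23) and that the 23 jackets are split into three test folds of 5 jackets and two of 4, which is how scikit-learn's KFold distributes 23 samples over 5 folds; neither figure is stated explicitly on the slides.

```python
SEGMENTS_PER_JACKET = 920 // 23      # 40, derived from the totals above
FOLD_JACKETS = [5, 5, 5, 4, 4]       # assumed test-fold sizes for 23 jackets

scores = {
    'CRF': [1.0, 1.0, 1.0, 1.0, 0.99375],
    'linear SVM': [0.98, 0.995, 0.98, 0.975, 0.96875],
}

def wrong_segments(fold_scores):
    """Number of misclassified segments implied by per-fold accuracies."""
    return sum(round((1.0 - s) * n * SEGMENTS_PER_JACKET)
               for s, n in zip(fold_scores, FOLD_JACKETS))

errors = {name: wrong_segments(s) for name, s in scores.items()}
means = {name: sum(s) / len(s) for name, s in scores.items()}
```

Under these assumptions the counts come out as 1 wrong segment for the CRF and 18 for the linear SVM, matching the figures quoted above.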



Solution

[Figure: detection results for jackets 1 to 23]

The only error: right wrist, jacket 22


Solution

SSVM learning + structured prediction beats plain SVM in each fold. But just 2% better!?

My interpretation: the problem is easier than expected because

- the selected features are “too” good

- there are too few samples and a lack of difficult cases

Anyway, we get an interpretation of the classifier: unary coefficients tell us which features are important, pairwise coefficients how often a label follows another label.


Solution

Features: rescaled coordinates of the two endpoints of segment



Solution

Features: angle/π of segment with vertical axis



Solution

Using only the angle, a poor feature, makes a difference

Features     Method  Scores                              Mean

2 points     CRF     1.0    1.0    1.0    1.0    0.994   0.998

and angle    SVM     0.98   0.995  0.98   0.975  0.969   0.979

just angle   CRF     0.74   0.78   0.72   0.725  0.706   0.736

             SVM     0.345  0.455  0.36   0.356  0.437   0.391


Solution

Point coordinate features are in the range [0, 1] and the angle in [−1, 1].

Adding Gaussian noise with σ = 0.2, that is, 20% of the coordinate range and 10% of the angle range,

Features     Method  Scores                              Mean

2 points     CRF     0.90   0.875  0.85   0.906  0.9     0.88

and angle    SVM     0.60   0.685  0.675  0.625  0.969   0.64
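Such a perturbation could be sketched as follows. The helper name `add_gaussian_noise` and the fixed seed are assumptions, and the slides do not say whether the noise was added to the training data, the test data or both.

```python
import random

def add_gaussian_noise(features, sigma=0.2, seed=0):
    """Return a noisy copy of a list of feature vectors.

    Zero-mean Gaussian noise with standard deviation sigma is added
    independently to every feature value; the input is left unchanged.
    """
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in features]

# 1000 made-up segments with 5 features each, all zero, so that the noisy
# values are exactly the noise samples.
clean = [[0.0] * 5 for _ in range(1000)]
noisy = add_gaussian_noise(clean, sigma=0.2)

samples = [v for row in noisy for v in row]
mean = sum(samples) / len(samples)
std = (sum((v - mean) ** 2 for v in samples) / len(samples)) ** 0.5
```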



Solution

Key code lines:

    import numpy as np
    from math import pi
    from pystruct.models import ChainCRF, MultiClassClf
    from pystruct.learners import OneSlackSSVM, NSlackSSVM, FrankWolfeSSVM
    # scikit-learn >= 0.18 provides KFold in sklearn.model_selection instead
    from sklearn.cross_validation import KFold
    from sklearn.svm import LinearSVC

    # compute segment features: one row of num_features values per segment
    X = np.zeros((num_jackets, num_segments, num_features))
    for i, jacket_segments in enumerate(segments):
        for j, s in enumerate(jacket_segments):
            X[i, j, 0:num_features] = s.x0norm, s.y0norm, \
                                      s.x1norm, s.y1norm, \
                                      s.angle/pi


Solution

    model = ChainCRF(n_states=num_labels, n_features=num_features,
                     directed=True, inference_method='max-product')
    # inference_method can be 'ad3', 'qpbo', 'max-product', 'lp'
    ssvm = FrankWolfeSSVM(model=model, C=5.e+4, tol=0.001)
    # the learner can also be OneSlackSSVM or NSlackSSVM
    svm = LinearSVC(multi_class='ovr', dual=False, C=1.e+4)


Solution

    # 5-fold: about 4 jackets for testing, 19 for training
    # (with sklearn.model_selection: KFold(n_splits=5).split(X))
    kf = KFold(num_jackets, n_folds=5)
    for train_index, test_index in kf:
        X_train, Y_train = X[train_index], Y[train_index]
        X_test, Y_test = X[test_index], Y[test_index]
        # CRF: one sequence of segments per jacket
        ssvm.fit(X_train, Y_train)
        print("Test score with chain CRF: %f" %
              ssvm.score(X_test, Y_test))
        # linear SVM: all segments stacked, classified independently
        svm.fit(np.vstack(X_train), np.hstack(Y_train))
        print("Test score with linear SVM: %f" %
              svm.score(np.vstack(X_test), np.hstack(Y_test)))
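The two classifiers see the same data in different shapes: the chain CRF gets one sequence of segments per jacket, while the linear SVM gets all segments stacked into a single table. A small sketch of what `np.vstack` and `np.hstack` do here, with illustrative sizes (4 jackets, 40 segments, 5 features) that are assumptions rather than values read from the course code:

```python
import numpy as np

num_jackets, num_segments, num_features = 4, 40, 5   # illustrative sizes

# Structured layout, as used by the chain CRF: one sequence per jacket.
X = np.zeros((num_jackets, num_segments, num_features))
Y = np.zeros((num_jackets, num_segments), dtype=int)

# Flat layout, as used by the linear SVM: one row / one label per segment.
X_flat = np.vstack(X)   # stacks the per-jacket (40, 5) arrays row-wise
Y_flat = np.hstack(Y)   # concatenates the per-jacket label vectors
```

After flattening, the order of segments along the contour no longer plays any role, which is exactly the information the pairwise weights of the CRF exploit.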
