Deep Face Recognition Challenges and Tips for Real-life Deployment [email protected]
Deep Face Recognition - NVIDIA · HERTA Deep Face Recognition GPU-powered face recognition Offices in Barcelona, Madrid, London, Los Angeles Crowds, unconstrained Deep Face Recognition

Oct 13, 2020

Page 1

Deep Face Recognition Challenges and Tips for Real-life Deployment

[email protected]

Page 2

1 Deep Face Recognition

2 Public DBs

3 Public models

4 Managing imbalance

5 Embeddings

6 Conclusions

Page 3

HERTA

www.hertasecurity.com

Deep Face Recognition

GPU-powered face recognition

Offices in Barcelona, Madrid, London, Los Angeles

Crowds, unconstrained

Deep Face Recognition

Large training DBs, >100K images, >1K subjects (Public DBs)

Public models (Inception, VGG, ResNet, SENet…), close to state-of-the-art

Typically, the embedding layer (which yields the facial descriptor) feeds a classification layer trained against one-hot identity labels

Unconstrained (in-the-wild) environments
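
A minimal sketch of the "embedding layer feeds one-hot encoding" arrangement mentioned above, in plain Python. The layer sizes, the random weights, and the helper names (`linear`, `softmax`, `cross_entropy`) are illustrative assumptions, not taken from the slides; a real model would place a trained CNN in front of the embedding layer.

```python
import math
import random


def linear(x, weights):
    """Multiply vector x by a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, subject):
    """Loss against a one-hot target: only the true subject's term survives."""
    return -math.log(probs[subject])


rng = random.Random(0)
FEATURES, EMBEDDING, SUBJECTS = 8, 4, 3  # toy sizes (assumption)
w_embed = [[rng.uniform(-1, 1) for _ in range(FEATURES)] for _ in range(EMBEDDING)]
w_class = [[rng.uniform(-1, 1) for _ in range(EMBEDDING)] for _ in range(SUBJECTS)]

cnn_features = [rng.uniform(0, 1) for _ in range(FEATURES)]  # stand-in for CNN output
descriptor = linear(cnn_features, w_embed)    # facial descriptor (embedding)
probs = softmax(linear(descriptor, w_class))  # one probability per training subject
loss = cross_entropy(probs, subject=1)        # training signal for subject index 1
```

After training, the classification layer is usually discarded and only `descriptor` is kept for comparing faces.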

Page 4

Public DBs

DB           Subjects
CWF          10.6K
LFW          5.7K
VGG Face     2.6K
VGG Face 2   9.1K
IJBB         1.8K

• Mostly celebrities: subjects overlap

Page 5

Public DBs

• Highly imbalanced

[Figure: images / subject per demographic group; panels: LFW, CWF]

Page 6

Public models

• Public models trained on public DBs (DIY):

FaceNet (2015): CWF / MS-1MC
VGGFace (2015): VGG
SphereFace (2017): CWF
VGGFace2 (2017): MS-1MC + VGG2

• Validate with demographically-balanced DB (50% same ID, 50% different ID):

Asian female: 1M pairs
Asian male: 1M pairs
Black female: 1M pairs
Black male: 1M pairs
White female: 1M pairs
White male: 1M pairs
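
Validation on balanced pairs could be scored per demographic group roughly as follows. This is a sketch under assumptions: cosine similarity with a fixed threshold is one common way to decide "same ID", but the slides do not state the metric used, and the descriptors and the helper name `accuracy_by_group` are made up for illustration.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two descriptors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def accuracy_by_group(pairs, threshold):
    """pairs: iterable of (group, descriptor_a, descriptor_b, same_id)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, a, b, same_id in pairs:
        predicted_same = cosine(a, b) >= threshold
        hits[group] += int(predicted_same == same_id)
        totals[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Toy 2-d descriptors (hypothetical); each group has one same-ID and one different-ID pair.
pairs = [
    ("Asian female", [1.0, 0.0], [0.9, 0.1], True),
    ("Asian female", [1.0, 0.0], [0.0, 1.0], False),
    ("White male", [1.0, 0.0], [0.6, 0.8], True),   # similarity 0.6: a false negative
    ("White male", [1.0, 0.0], [0.1, 0.9], False),
]
per_group = accuracy_by_group(pairs, threshold=0.8)
```

Reporting accuracy per group rather than one global figure is what exposes the kind of bias discussed in these slides.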

Page 7

Public models: examples of failures

[Figure: example image pairs; panels: False positives, False negatives]

Page 8

Public models: evaluation

[Figure: evaluation results of FaceNet (2015), SphereFace (2017), VGGFace (2015) and VGGFace2 (2017), labeled by training DB (MS-1MC, CWF, VGG, VGG2); groups shown: White male, Black male, Asian female]

Page 9

Managing imbalance

SAMPLING (DATA-ORIENTED)
• Undersampling
• Oversampling

TRAINING LOSS (MODEL-ORIENTED)
• Cost-sensitive learning

Multi-task learning: id, gender, ethnicity

“Features get better at understanding faces, improving performances of individual tasks”

R. Ranjan, V. M. Patel, R. Chellappa. “HyperFace: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition.” TPAMI 2017
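
As an illustration of the cost-sensitive (model-oriented) option, the sketch below weights the cross-entropy loss by inverse class frequency. The weighting formula and the names `class_weights` and `weighted_cross_entropy` are assumptions chosen for the example; the slides do not specify which cost scheme was used.

```python
import math
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights; a perfectly balanced set gets 1.0 for every class."""
    counts = Counter(labels)
    return {c: len(labels) / (len(counts) * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])


labels = ["a"] * 8 + ["b"] * 2           # toy imbalanced label set
weights = class_weights(labels)          # rare class "b" gets the larger weight
loss_b = weighted_cross_entropy({"a": 0.5, "b": 0.5}, "b", weights)
loss_a = weighted_cross_entropy({"a": 0.5, "b": 0.5}, "a", weights)
```

With these weights, an error on the rare class costs four times as much as the same error on the frequent class, which pushes the model away from simply favoring the majority.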

Page 10

Managing imbalance – Data augmentation

• Data augmentation: makes imbalance mitigation much more effective

Database → Oversampled DB → Stochastic data augmentation → DNN

I. Masi et al. "Do we really need to collect millions of faces for effective face recognition?" ECCV 2016.
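
The reason augmentation helps oversampling is that replicated images stop being exact copies. A minimal sketch, assuming a grayscale image stored as rows of floats in [0, 1] and two simple transforms (horizontal flip, brightness gain); the transforms and their ranges are illustrative choices, not the ones used by the authors.

```python
import random


def augment(image, rng):
    """Randomly flip and brightness-jitter an image without modifying the input."""
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]   # horizontal flip
    gain = rng.uniform(0.8, 1.2)               # brightness gain (assumed range)
    return [[min(1.0, pixel * gain) for pixel in row] for row in image]


rng = random.Random(0)
face = [[0.1, 0.5], [0.3, 0.9]]   # toy 2x2 "image"
oversampled = [face] * 4          # a minority-class image replicated by oversampling
batch = [augment(image, rng) for image in oversampled]
```

Because the augmentation is applied stochastically at batch time, each replica of `face` reaches the DNN as a slightly different sample.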

Page 11

Managing imbalance – Proposal

Traditional imbalance:

$$\frac{\max(X)}{\min(X)}$$

Proposal: IDR (robust to outliers)

$$\mathrm{IDR}(X) = \frac{D_9(X)}{D_1(X)}$$

Iterative multi-label oversampling:

1. Find most imbalanced label L
2. Find most imbalanced category C within L
3. Draw random sample from C, replicate

[Figure: two plots, max(X)/min(X) and D9(X)/D1(X), each against #samples added]
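
The three steps above can be sketched as follows. Several details are assumptions, since the slides do not spell them out: X is taken to be the list of per-category sample counts of a label, D1 and D9 are its first and ninth deciles, and the "most imbalanced category" is read as the least-represented one. The function names and the toy data are hypothetical.

```python
import random
import statistics
from collections import Counter


def idr(counts):
    """Interdecile ratio D9/D1 of per-category sample counts."""
    if len(counts) < 2:
        return 1.0
    deciles = statistics.quantiles(counts, n=10, method="inclusive")
    return deciles[-1] / deciles[0]


def oversample(samples, labels, steps, seed=0):
    """Replicate one sample per step, always targeting the worst label."""
    rng = random.Random(seed)
    samples = list(samples)
    for _ in range(steps):
        counts = {l: Counter(s[l] for s in samples) for l in labels}
        # 1. Most imbalanced label L (largest IDR).
        worst = max(labels, key=lambda l: idr(list(counts[l].values())))
        # 2. Least-represented category C within L (assumed reading).
        category = min(counts[worst], key=counts[worst].get)
        # 3. Draw a random sample from C and replicate it.
        pool = [s for s in samples if s[worst] == category]
        samples.append(dict(rng.choice(pool)))
    return samples


faces = (
    [{"gender": "male", "group": "A"}] * 5
    + [{"gender": "female", "group": "A"}]
    + [{"gender": "male", "group": "B"}] * 2
    + [{"gender": "female", "group": "B"}]
    + [{"gender": "male", "group": "C"}]
)
balanced = oversample(faces, ["gender", "group"], steps=20)
```

Because every sample carries several labels at once, replicating a sample to fix one label also shifts the others, which is why the label to fix is re-selected at every step.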

Page 12

Managing imbalance – Sample training batch

Before oversampling… …and after

Page 13

Managing imbalance

• Results with ResNet 20 (tiny network, for comparison only)
• Better with almost 6X fewer subjects and about 1.7X fewer images!

10.6K subjects, 494K images

1.8K subjects, 295K images

Page 14

Sparse embedding

• Typically, in deep face recognition: image → CNN → embedding layer → one-hot encoding
• What about ReLU + embedding + one-hot encoding? (e.g. VGGFace)
• Why more dimensions, if 90% zero?
• Larger representation subspace, at expense of computational efficiency
• But can gain it back! ~200M comp/s

Embeddings compared: Sparse 4096-d, Dense 512-d, Dict + Dense 256-d
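
One way the efficiency of a mostly-zero descriptor can be recovered is to store and compare only its non-zero entries. The sketch below illustrates that general idea with a Python dictionary of index/value pairs; it is an assumption made for illustration and is not necessarily the "Dict + Dense" scheme named on the slide.

```python
def to_sparse(vector):
    """Keep only the non-zero entries as an {index: value} mapping."""
    return {i: v for i, v in enumerate(vector) if v != 0.0}


def sparse_dot(a, b):
    """Dot product that iterates only over the smaller set of non-zero entries."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)


# Toy 10-d descriptors that are 70% zero, as a ReLU output might be.
dense_a = [0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 3.0]
dense_b = [0.0, 4.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]
sparse_a, sparse_b = to_sparse(dense_a), to_sparse(dense_b)

similarity = sparse_dot(sparse_a, sparse_b)
dense_similarity = sum(x * y for x, y in zip(dense_a, dense_b))
```

The work per comparison scales with the number of non-zero entries rather than with the nominal dimension, so a 4096-d descriptor that is 90% zero costs about as much to compare as a dense vector of a few hundred dimensions.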

Page 15

Conclusions

• Public training / validation DBs: heavily biased at multiple levels
• Without balancing, trained models will be biased, too!
• Prefer “better data” over “more data”
• Machine Learning vs Machine Teaching

Machine Learning: designing algorithms to passively train models

Machine Teaching: choosing which examples to show a learner

Explainable ML

Zhu, Xiaojin, et al. "An Overview of Machine Teaching." arXiv preprint arXiv:1801.05927 (2018).

Page 16

Questions?

[email protected]