CUbRIK research presented at SSMS 2012

CONTEXT-BASED PEOPLE RECOGNITION IN CONSUMER PHOTO COLLECTIONS

Markus Brenner, Ebroul Izquierdo

MMV Research Group, School of Electronic Engineering and Computer Science, Queen Mary University of London, UK

{markus.brenner, ebroul.izquierdo}@eecs.qmul.ac.uk

Face Detection and Basic Recognition

Initial steps: Image preprocessing, face detection and face normalization

Descriptor-based: Local Binary Pattern (LBP) texture histograms

Similarity metric: Chi-Square Statistics

Basic face recognition: k-Nearest-Neighbor
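
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the 3x3 LBP neighbourhood, the single whole-image histogram (instead of per-block histograms), the function names and the default k=3 are all hypothetical choices.

```python
from collections import Counter

# Clockwise 8-neighbourhood around a pixel (assumed 3x3 LBP variant).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c) in a grayscale image (list of lists)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms of equal length."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))

def knn_predict(query, gallery, k=3):
    """gallery: list of (histogram, label); majority label of the k nearest."""
    ranked = sorted(gallery, key=lambda item: chi_square(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In this sketch a smaller chi-square value means more similar faces, so the k smallest distances vote on the identity.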

Graph-based Recognition

Model: pairwise Markov Network (graph nodes represent faces)

Unary Potentials: likelihood of faces belonging to particular people

Pairwise Potentials: encourage spatial smoothness, encode exclusivity constraint and temporal domain

Topology: only the most similar faces are connected with edges

Inference: maximum a posteriori (MAP) solution of Loopy Belief Propagation (LBP)
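
To make the inference step concrete, here is a minimal sketch of max-product loopy belief propagation on a pairwise Markov network. It is an illustration only: the function name, the dictionary-based data layout, the fixed iteration count and the symmetric `pairwise` callback are assumptions, not details taken from the poster.

```python
def loopy_max_product(unary, edges, pairwise, iters=20):
    """Approximate MAP labelling of a pairwise Markov network.

    unary:    {node: [non-negative score per label]}
    edges:    list of (node_a, node_b) pairs
    pairwise: function (src, dst, label_src, label_dst) -> compatibility >= 0
    """
    n_labels = len(next(iter(unary.values())))
    neighbors = {n: [] for n in unary}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # msgs[(src, dst)][l] is the message from src to dst about dst's label l.
    msgs = {(a, b): [1.0] * n_labels for a in unary for b in neighbors[a]}

    for _ in range(iters):
        new_msgs = {}
        for (src, dst) in msgs:
            out = []
            for l_dst in range(n_labels):
                best = 0.0
                for l_src in range(n_labels):
                    val = unary[src][l_src] * pairwise(src, dst, l_src, l_dst)
                    for other in neighbors[src]:
                        if other != dst:
                            val *= msgs[(other, src)][l_src]
                    best = max(best, val)
                out.append(best)
            z = sum(out) or 1.0  # normalize to avoid numeric underflow
            new_msgs[(src, dst)] = [v / z for v in out]
        msgs = new_msgs

    labels = {}
    for n in unary:
        belief = list(unary[n])
        for other in neighbors[n]:
            belief = [b * m for b, m in zip(belief, msgs[(other, n)])]
        labels[n] = max(range(n_labels), key=belief.__getitem__)
    return labels
```

On a graph without cycles this procedure yields the exact MAP labelling; on graphs with loops it is only an approximation and is not guaranteed to converge.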

Social Semantics

Individual appearance for a more effective graph topology (used to regularize the number of edges)

Unique People Constraint models exclusivity: a person cannot appear more than once in a photo

Pairwise co-appearance: people appearing together bear a higher likelihood of appearing together again

Groups of people: use data mining to discover frequently appearing social patterns
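
The co-appearance and group statistics can be illustrated with simple counting over already labelled photos. This is a hedged sketch: the function names, the list-of-names input format and the brute-force enumeration of groups are assumptions, and the poster does not state which data mining method is actually used.

```python
from collections import Counter
from itertools import combinations

def co_appearance(photos):
    """Count how often each unordered pair of people shares a photo.

    photos: list of lists of person names, one inner list per photo.
    """
    counts = Counter()
    for people in photos:
        # set() applies the unique-people assumption: one entry per person.
        for pair in combinations(sorted(set(people)), 2):
            counts[pair] += 1
    return counts

def frequent_groups(photos, size, min_support):
    """Groups of `size` people that appear together in >= min_support photos."""
    counts = Counter()
    for people in photos:
        for group in combinations(sorted(set(people)), size):
            counts[group] += 1
    return {g: c for g, c in counts.items() if c >= min_support}
```

Enumerating every group per photo is only practical for the small group sizes typical of consumer photos; a dedicated frequent-itemset miner would scale better.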

Body Detection and Recognition

… when faces are obscured or invisible

Detect upper and lower body parts

Bipartite matching of faces and bodies

Graph-based fusion of faces and clothing
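
As an illustration of the bipartite matching step, the sketch below assigns each detected face to one detected body by minimizing the total distance between their centre points. The cost function (Euclidean distance between centres), the brute-force search and all names are assumptions made for this example; the poster does not specify them.

```python
import math
from itertools import permutations

def match_faces_to_bodies(faces, bodies):
    """Return, for each face, the index of its matched body.

    faces, bodies: lists of (x, y) centre points in image coordinates.
    Tries every one-to-one assignment, so it only suits a handful of people.
    """
    if len(faces) > len(bodies):
        raise ValueError("expects at least as many bodies as faces")
    best_cost, best_assignment = math.inf, []
    for perm in permutations(range(len(bodies)), len(faces)):
        cost = sum(math.dist(faces[f], bodies[b]) for f, b in enumerate(perm))
        if cost < best_cost:
            best_cost, best_assignment = cost, list(perm)
    return best_assignment
```

Bodies left unmatched in this sketch correspond to people whose faces were not detected, for example because they are obscured.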

[Figure: example graph with face nodes f1, f2 and f3, showing unary and pairwise potentials.]

Aim

Resolve identities of people primarily by their faces

Incorporate rich contextual cues of personal photo collections, in which a few individuals frequently appear together

Perform recognition by considering all contextual information at the same time (unlike traditional approaches, which usually train a classifier and then predict identities independently)

Unary potential:

$$u(w_n) = \frac{1}{Z} f_f(w_n)$$
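
One way to read the unary potential is as face-similarity scores divided by a normalization constant Z. The sketch below follows that reading; treating Z as the sum over all candidate people, the dictionary input and the function name are assumptions, not statements from the poster.

```python
def unary_potential(face_scores):
    """Normalize non-negative face-similarity scores so they sum to 1.

    face_scores: {person: similarity of the face to that person}.
    Z is assumed to be the sum of the scores over all candidate people.
    """
    z = sum(face_scores.values())
    if z == 0:
        # No evidence for anyone: fall back to a uniform distribution.
        return {p: 1.0 / len(face_scores) for p in face_scores}
    return {p: s / z for p, s in face_scores.items()}
```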

Experiments

Public Gallagher Dataset:

~600 photos, ~800 faces, 32 distinct people

Our dataset:

~3300 photos, ~5000 faces, 106 distinct people

All photos shot with a typical consumer camera

Considering only correctly detected faces (87%)

[Figure: three recognition setups over training (Tr) and test (Te) samples. (1) Face similarity only: all samples are independent. (2) Graph in which the unary potential of every node is based on face similarities. (3) Graph in which the unary potential of every node combines face, upper body and lower body similarity.]

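Pairwise potential:

$$
p(w_n, w_m) =
\begin{cases}
\tau, & \text{if } w_n = w_m \wedge i_n \neq i_m \\
0, & \text{if } w_n = w_m \wedge i_n = i_m \\
\mathrm{co}(w_n, w_m), & \text{otherwise}
\end{cases}
$$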
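
The three cases of the pairwise potential translate directly into a small function. This sketch assumes that w_n and w_m are the identity labels of two faces, that i_n and i_m identify the photos containing them, and that `co` is a caller-supplied co-appearance score; these readings and the parameter names are assumptions for illustration.

```python
def pairwise_potential(w_n, w_m, i_n, i_m, tau, co):
    """Pairwise potential between two face nodes.

    w_n, w_m: identity labels assigned to the two faces
    i_n, i_m: photos in which the two faces appear (assumed meaning)
    tau:      smoothness reward for the same label across different photos
    co:       function (label_a, label_b) -> co-appearance score
    """
    if w_n == w_m:
        # The same person twice in one photo violates the uniqueness constraint.
        return tau if i_n != i_m else 0.0
    return co(w_n, w_m)
```

Returning 0 for the same label within one photo makes that joint assignment impossible under a product of potentials, which is how the exclusivity constraint takes effect.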

[Figure: bar chart of the gain at 3% training (axis from 0% to 25%) for + Graph. Model, + Social Semantics and + Body parts.]

[Figure: LBP descriptor extraction, "… for each block …".]
