
Graph Regularised Hashing

Sean Moran and Victor Lavrenko

Institute of Language, Cognition and Computation, School of Informatics

University of Edinburgh

ECIR’15 Vienna, March 2015

Graph Regularised Hashing (GRH)

Overview

GRH

Evaluation

Conclusion


Locality Sensitive Hashing

[Diagram sequence: each DATABASE item is passed through a hash function H to produce a binary code (e.g. 110101, 010111, 010101, 111101, ...) that indexes it into a bucket of the HASH TABLE. A QUERY is hashed with the same function H; similarity is computed only against the items that fall in the colliding bucket, and those items are returned as the NEAREST NEIGHBOURS.]

Applications: Content Based IR, Location Recognition, Near duplicate detection
(Images: Imense Ltd; Doersch et al.; Xu et al.)
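The bucketed lookup above can be sketched with random-hyperplane LSH. The hash function, toy 2-D feature vectors and bucket layout below are illustrative assumptions, not the implementation used in the talk:

```python
# Sketch of random-hyperplane LSH: bit k is the sign of the dot product
# with a random hyperplane. Data and dimensions are made-up toys.
import random

random.seed(0)

def make_hash(dim, n_bits):
    """Sample n_bits random hyperplanes; each contributes one bit."""
    return [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(planes, x):
    """Binary code for x: one bit per hyperplane."""
    return tuple(1 if sum(w * xi for w, xi in zip(h, x)) >= 0 else 0
                 for h in planes)

planes = make_hash(dim=2, n_bits=6)

# Index the database: items with the same code share a hash-table bucket.
database = {"a": (0.9, 1.1), "b": (1.0, 0.9), "c": (-1.0, -1.2)}
table = {}
for name, x in database.items():
    table.setdefault(hash_code(planes, x), []).append(name)

# Query: hash it, then compute similarity only within the colliding bucket.
query = (0.95, 1.0)
candidates = table.get(hash_code(planes, query), [])
```

Nearby points (small angle between vectors) are likely to share a code, so the bucket acts as a cheap candidate filter before exact similarity is computed.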

Previous work

- Data-independent: Locality Sensitive Hashing (LSH) [Indyk '98]

- Data-dependent (unsupervised): Anchor Graph Hashing (AGH) [Liu et al. '11], Spectral Hashing (SH) [Weiss '08]

- Data-dependent (supervised): Self Taught Hashing (STH) [Zhang '10], Supervised Hashing with Kernels (KSH) [Liu et al. '12], ITQ + CCA [Gong and Lazebnik '11], Binary Reconstructive Embedding (BRE) [Kulis and Darrell '09]

Previous work

Method    Data-Dependent  Supervised  Scalable  Effectiveness
LSH       -               -           X         Low
SH        X               -           -         Low
STH       X               X           -         Medium
BRE       X               X           -         Medium
ITQ+CCA   X               X           -         Medium
KSH       X               X           -         High
GRH       X               X           X         High


Graph Regularised Hashing (GRH)

- Two step iterative hashing model:

  - Step A: Graph Regularisation

    L_m ← sgn(α S D⁻¹ L_{m−1} + (1−α) L_0)

  - Step B: Data-Space Partitioning

    for k = 1…K:  min ||h_k||² + C Σ_{i=1…N} ξ_{ik}
    s.t. L_{ik}(h_kᵀ x_i + b_k) ≥ 1 − ξ_{ik}  for i = 1…N

- Repeat for a set number of iterations (M)

Graph Regularised Hashing (GRH)

- Step A: Graph Regularisation [Diaz '07][1]

  L_m ← sgn(α S D⁻¹ L_{m−1} + (1−α) L_0)

  - S: affinity (adjacency) matrix
  - D: diagonal degree matrix
  - L: binary bits at the specified iteration
  - α: interpolation parameter (0 ≤ α ≤ 1)

[1] Diaz, F.: Regularizing query-based retrieval scores. In: Information Retrieval (2007)


Graph Regularised Hashing (GRH)

[Diagram: three nodes a, b, c; a and b are neighbours, as are b and c. Initial codes: a = (−1 −1 −1), b = (−1 1 1), c = (1 1 1).]

S     a    b    c
a     1    1    0
b     1    1    1
c     0    1    1

D⁻¹   a    b    c
a     0.5  0    0
b     0    0.33 0
c     0    0    0.5

L_0   b1   b2   b3
a     −1   −1   −1
b     −1   1    1
c     1    1    1

Graph Regularised Hashing (GRH)

[Diagram: the same three nodes, still carrying codes a = (−1 −1 −1), b = (−1 1 1), c = (1 1 1).]

L_1 = sgn ( [ −1      0     0
              −0.33   0.33  0.33
               0      1     1    ] )

Graph Regularised Hashing (GRH)

[Diagram: after regularisation, a = (−1 1 1), b = (−1 1 1), c = (1 1 1).]

L_1   b1   b2   b3
a     −1   1    1
b     −1   1    1
c     1    1    1
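The worked example can be checked numerically. A minimal plain-Python sketch, assuming α = 1 and sgn(0) = +1, and normalising each node's row by its own degree (which reproduces the numbers on this slide):

```python
# Step A of GRH on the three-node example: each node's bits are replaced by
# the degree-normalised average of its neighbours' bits, then re-binarised.
# alpha = 1 and sgn(0) = +1 are assumed here to match the slide's numbers.
S = [[1, 1, 0],      # affinity matrix: a-b and b-c are neighbours (self-loops kept)
     [1, 1, 1],
     [0, 1, 1]]
L0 = [[-1, -1, -1],  # initial 3-bit codes for nodes a, b, c
      [-1,  1,  1],
      [ 1,  1,  1]]

deg = [sum(row) for row in S]   # degrees: 2, 3, 2

def regularise(S, L, alpha, L0):
    n, k = len(L), len(L[0])
    out = []
    for i in range(n):
        row = []
        for b in range(k):
            # degree-normalised average of neighbours' bit b
            avg = sum(S[i][j] * L[j][b] for j in range(n)) / deg[i]
            val = alpha * avg + (1 - alpha) * L0[i][b]
            row.append(1 if val >= 0 else -1)   # sgn with sgn(0) = +1
        out.append(row)
    return out

L1 = regularise(S, L0, alpha=1.0, L0=L0)
# L1 == [[-1, 1, 1], [-1, 1, 1], [1, 1, 1]]: node a's second and third bits
# flip to agree with its neighbour b, exactly as in the slide.
```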

Graph Regularised Hashing (GRH)

- Step B: Data-Space Partitioning

  for k = 1…K:  min ||h_k||² + C Σ_{i=1…N} ξ_{ik}
  s.t. L_{ik}(h_kᵀ x_i + b_k) ≥ 1 − ξ_{ik}  for i = 1…N

  - h_k: hyperplane k
  - b_k: bias of hyperplane k
  - x_i: data-point i
  - L_{ik}: bit k of data-point i
  - ξ_{ik}: slack variable for data-point i, bit k
  - K: # bits
  - N: # data-points

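Step B amounts to training one small soft-margin SVM per bit, with the regularised bits as labels. The subgradient solver, toy points and hyper-parameters below are illustrative stand-ins for a proper SVM package such as LIBLINEAR:

```python
# Step B of GRH, sketched: fit one linear hyperplane h_k per bit by
# minimising 0.5*||h||^2 + C * sum_i hinge(1 - y_i (h.x_i + b)) with
# subgradient descent. Toy data and learning rate are made up.

def fit_bit(X, bits, C=1.0, lr=0.01, epochs=200):
    """Return (h, b) encouraging bits[i] * (h . x_i + b) >= 1."""
    d = len(X[0])
    h, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for x, y in zip(X, bits):      # y is the target bit, +1 or -1
            margin = y * (sum(w * xi for w, xi in zip(h, x)) + b)
            for j in range(d):
                # shrinkage from ||h||^2, plus hinge push when margin < 1
                g = h[j] - (C * y * x[j] if margin < 1 else 0.0)
                h[j] -= lr * g
            if margin < 1:
                b += lr * C * y        # bias is not regularised
    return h, b

def predict_bit(h, b, x):
    return 1 if sum(w * xi for w, xi in zip(h, x)) + b >= 0 else -1

# Toy data: two well-separated clusters; one bit column from Step A.
X = [(1.0, 1.0), (1.2, 0.8), (-1.0, -1.0), (-0.9, -1.2)]
L_bit1 = [1, 1, -1, -1]
h1, b1 = fit_bit(X, L_bit1)
codes = [predict_bit(h1, b1, x) for x in X]   # should reproduce L_bit1
```

Out-of-sample points are then hashed by evaluating the K learned hyperplanes, which is what makes the scheme usable for unseen queries.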

Graph Regularised Hashing (GRH)

[Figure sequence: eight data-points a–h in the data space, each labelled with a 3-bit code. Graph regularisation flips bits so that neighbouring points agree: one point has its first bit flipped, another its second bit flipped. Data-space partitioning then fits a hyperplane per bit: h1 (h1 · x − b1 = 0) splits the space into a negative (−1) and a positive (+1) half-space for the first bit, and h2 (h2 · x − b2 = 0) does the same for the second bit.]

Evaluation


Datasets/Features

- Standard evaluation datasets [Liu et al. '12], [Gong and Lazebnik '11]:

  - CIFAR-10: 60K images, GIST descriptors, 10 classes [1]
  - MNIST: 70K images, grayscale pixels, 10 classes [2]
  - NUSWIDE: 270K images, GIST descriptors, 21 classes [3]

- True NNs: images that share at least one class in common [Liu et al. '12]

[1] http://www.cs.toronto.edu/~kriz/cifar.html
[2] http://yann.lecun.com/exdb/mnist/
[3] http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm

Evaluation Metrics

- Hamming ranking evaluation paradigm [Liu et al. '12], [Gong and Lazebnik '11]

- Standard evaluation metrics [Liu et al. '12], [Gong and Lazebnik '11]:

  - Mean average precision (mAP)
  - Precision at Hamming radius 2 (P@R2)
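The Hamming ranking paradigm can be sketched as follows; the codes, labels and relevant set below are made-up toys rather than CIFAR-10 data:

```python
# Hamming-ranking evaluation, sketched: rank database items by Hamming
# distance to the query's code, and score precision against ground-truth
# neighbours (items sharing a class label with the query).

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def precision_at_radius(query_code, db_codes, relevant, radius=2):
    """Precision over database items within the given Hamming radius."""
    hits = [i for i, c in enumerate(db_codes)
            if hamming(query_code, c) <= radius]
    if not hits:
        return 0.0
    return sum(i in relevant for i in hits) / len(hits)

query = (1, 0, 1, 1, 0, 1)
db = [(1, 0, 1, 1, 0, 1),   # distance 0, relevant
      (1, 0, 1, 0, 0, 1),   # distance 1, relevant
      (0, 1, 0, 0, 1, 0),   # distance 6, not relevant
      (1, 1, 0, 1, 0, 1)]   # distance 2, not relevant
relevant = {0, 1}           # true NNs: share a class with the query

ranked = sorted(range(len(db)), key=lambda i: hamming(query, db[i]))
p_at_r2 = precision_at_radius(query, db, relevant)   # 2 of 3 hits -> 2/3
```

mAP is computed from the full ranking in the same spirit, averaging precision at each rank where a relevant item appears.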

GRH vs Literature (CIFAR-10 @ 32 bits)

[Bar chart: mAP (0.10–0.35) for LSH, BRE, STH, KSH, GRH (Linear) and GRH (RBF). Annotation: GRH's straightforward objective outperforms more complex objectives.]

GRH vs Literature (CIFAR-10)

[Line chart: mAP (0.10–0.40) against # bits (16–64) for LSH, BRE, KSH and GRH. Annotations: "Small amount of supervision required"; "+25-30%".]

GRH vs. Initialisation Strategy (CIFAR-10 @ 32 bits)

[Bar chart: mAP (0.00–0.30) for GRH (Linear) and GRH (RBF), each initialised with LSH or ITQ+CCA. Annotation: eigendecomposition is not necessary, saving O(d³).]

GRH vs # Supervisory Data-Points (CIFAR-10)

[Bar chart: mAP (0.00–0.40) for Linear and RBF GRH with T=1K and T=2K supervisory data-points. Annotation: "Small amount of supervision required".]

GRH Timing (CIFAR-10 @ 32 bits)

Timings (s)
Method    Train    Test     Total
GRH       42.68    0.613    43.29
KSH [1]   81.17    0.103    82.27
BRE [2]   231.1    0.370    231.4

[1] Liu, W.: Supervised Hashing with Kernels. In: CVPR (2012)
[2] Kulis, B.: Binary Reconstructive Embedding. In: NIPS (2009)

Conclusion


Conclusions and Future Work

- Supervised hashing model that is both accurate and easily scalable

- Take-home messages:

  - Regularising bits over a graph is effective (and efficient) for hashcode learning
  - An intermediate eigendecomposition step is not necessary
  - Hyperplanes (linear hypersurfaces) can achieve very good retrieval accuracy

- Future work: extend to the cross-modal hashing scenario (e.g. Image ↔ Text, English ↔ Spanish)

Thank you for your attention

Sean Moran
sean.moran@ed.ac.uk

Code and datasets available at:
www.seanjmoran.com
