Changeable and Privacy Preserving Face Recognition

by

Yongjin Wang

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of The Edward S. Rogers Sr. Department of Electrical and Computer Engineering

University of Toronto

© Copyright by Yongjin Wang, 2010

Abstract

Traditional methods of identity recognition are based on knowledge of a password or a PIN, or on possession factors such as tokens and ID cards. Such strategies usually afford a low level of security and cannot meet the requirements of applications with high security demands. Biometrics refers to the technology of recognizing or validating the identity of an individual based on his/her physiological and/or behavioral characteristics. It is superior to conventional methods in both security and convenience, since biometric traits cannot be lost, forgotten, or stolen as easily, and biometric systems are relatively difficult to circumvent. However, although biometrics based solutions provide various advantages, the technology raises some inherent concerns. First, biometrics cannot be easily changed or reissued if compromised, due to the limited number of biometric traits that humans possess. Second, since biometric data reflect the user’s physiological or behavioral characteristics, privacy issues arise if the stored biometric templates are obtained by an adversary. Changeability and privacy protection of biometric templates are therefore two important issues that need to be addressed for widespread deployment of biometric technology.

This dissertation systematically investigates random transformation based methods for addressing the challenging problems of changeability and privacy protection in biometrics enabled recognition systems. A random projection based approach is first introduced. We present a detailed mathematical analysis of the similarity and privacy preserving properties of random projection, and introduce a vector translation technique to achieve strong changeability. To further enhance privacy protection and improve recognition accuracy, a sorted index number (SIN) approach is proposed such that only the index numbers of the sorted feature vectors are stored as templates. The SIN framework is then evaluated in conjunction with random additive transform, random multiplicative transform, and random projection for producing reissuable and privacy preserving biometric templates. The feasibility of the introduced solutions is supported by detailed theoretical analyses. Extensive experimentation on a face based biometric recognition problem demonstrates the effectiveness of the proposed methods.

Dedication

To my parents, my wife, and my daughter

Acknowledgements

First, I would like to thank my supervisor, Prof. Dimitrios Hatzinakos, for his continuous

guidance, support, and encouragement throughout the course of my research work. This

work would not have been possible without his patience and kindness. It is a great honor and

pleasure to work under his supervision, and I truly appreciate his help.

I would like to thank Prof. Raviraj Adve, Prof. Shahrokh Valaee, and Prof. Jason

H. Anderson for their insightful comments and suggestions on my thesis work. I would

like to express my sincere gratitude to Prof. A. Enis Cetin for his invaluable time to

serve as an external examiner. I would also like to thank Prof. Kostas Plataniotis for

his guidance particularly at the early stage of this work. I would like to acknowledge

the Department of Electrical and Computer Engineering, University of Toronto, the

Ontario Graduate Scholarship program, and the Natural Sciences and Engineering Research

Council of Canada for providing me financial support throughout my research work.

I am grateful to the members of the multimedia lab for providing such a warm and

friendly environment: Dr. Jie Wang, Dr. Haiping Lu, Dr. Francis Bui, Tahir Amin,

Foteini Agrafioti, Petros Spachos, Tejas Ganapathi, Hoda Mohammadzade, and many

others. Thank you all for creating such a pleasant and productive atmosphere. I would

also like to express my gratitude to my friends: Dr. Shuguang Wang, Dr. Ivan Lee, Dr.

Matt Kyan, Dr. Yifeng He, Dr. Yunfeng Lin, Dr. Yangyang Li, Yupeng Li, Ming Du,

Xiaoming Fan, Tony Wang, Ning Zhang, Rui Zhang, Nan Dong, Yun Tie, . . . , thank you

all for your continued friendship and help.

Special thanks to my friend and Master’s thesis advisor, Dr. Ling Guan. Thank you

for introducing me to the area of multimedia processing and pattern recognition. Your

consistent support, encouragement, and help are deeply appreciated.

I would like to express my special thanks to my parents. Thank you for your love

and encouragement. You are always the peaceful harbor whenever I feel frustrated and

tired. To my sisters and their families, thank you for your consistent support. Finally, I

want to give my sincere thanks to my beloved wife Mei, and my daughter Gloria. Thank

you for your love, caring, patience, and understanding. I share with them every piece of

my achievement.

Contents

Abstract

1 Introduction
1.1 Biometrics and Privacy
1.2 Biometric Systems
1.3 Contributions
1.4 Organization

2 Literature Review
2.1 Biometric Crypto-system
2.2 Cancelable Biometrics
2.3 Summary

3 Random Projection Based Face Verification
3.1 Introduction
3.2 Method Overview
3.3 Accuracy Analysis
3.4 Changeability Analysis
3.5 Privacy Analysis
3.6 Experimental Results
3.6.1 Database Description
3.6.2 RP vs PCA
3.6.3 RP vs PCARP
3.6.4 Discussion
3.7 Summary
3.8 Appendix
3.8.1 Appendix 3-I
3.8.2 Appendix 3-II

4 Sorted Index Numbers for Face Recognition
4.1 Introduction
4.2 Method Overview
4.3 SIN Method
4.5.1 Face Identification
4.5.2 Face Verification
4.6 Summary

5 Random Transformations for Changeable Biometrics
5.1 Introduction
5.2 Method Overview
5.3 Changeability Analysis
5.3.1 Random Additive Transform
5.3.2 Random Multiplicative Transform
5.3.3 Random Projection
5.4.1 RAT-SIN
5.4.2 RMT-SIN
5.4.3 RP-SIN
5.5.1 RAT-SIN
5.5.2 RMT-SIN
5.5.3 RP-SIN
5.5.4 Discussion
5.6 Summary
5.7 Appendix
5.7.1 Appendix 5-I
5.7.2 Appendix 5-II
5.7.3 Appendix 5-III
5.7.4 Appendix 5-IV

6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work

Bibliography

List of Tables

1.1 Comparison of biometrics (L: Low, M: Medium, H: High, D: Difficult, E: Easy, U: Unknown).
3.1 Identification data set configuration.
3.2 Verification data set configuration.
3.3 Experimental results (EER, in %) of PCA, RP, and PCARP at different dimensionalities.
4.1 Face identification results (CRR in %).
4.2 Face verification results (EER in %).
5.1 Experimental results (EER in %) of RAT-SIN method on PCA and KDDA features at selected σ² values.
5.2 Experimental results (EER in %) of RMT-SIN method on PCA and KDDA features at different d values (σ² = 0.02).
5.3 Comparison of BH with RP-SIN (EER in %).
5.4 Comparison of BH with RP-SIN with translation (EER in %).
5.5 Comparison of different approaches (EER in %).

List of Figures

1.1 Block diagram of biometric recognition systems.
1.2 Biometric system errors: (a) intra-class and inter-class distributions, (b) ROC curve.
2.1 General block diagram of a biometric crypto-system.
2.2 General block diagram of a cancelable biometric system.
3.1 Probability of preserving distance between two vectors as a function of M and ε.
3.2 Probability of preserving distance for all n points as a function of M and ε (n = 100).
3.3 Probability of preserving distance for all n points as a function of M and n (ε = 0.3).
3.4 Comparison of lower bound of M with reference work.
3.5 Demonstration of computing probability of error in 2-D space.
3.6 Gaussian approximation of estimation error.
3.7 Image examples from the generic data set.
3.8 Procedures for image preprocessing.
3.9 EER obtained by using PCA and RP as feature extractors.
3.10 ROC curve of RP and original image vectors.
3.11 EER obtained in the user-independent scenario.
3.12 ROC curve of RP and PCARP in the user-independent scenario.
3.13 EER obtained in the user-dependent scenario.
3.14 ROC curve of RP in the user-dependent scenario.
3.15 ROC curve of PCARP in the user-dependent scenario.
3.16 Experimental results for changeability: RP (left) and PCARP (right).
4.1 3-D demonstration of SIN method.
4.2 Diagram of Pairwise Relational Discretization (PRD) method.
4.3 Comparison of intra-class and inter-class distribution using Euclidean and Hamming distances.
4.4 SIN approximation of Euclidean distance.
4.5 Variance σ²_{j:N} as function of dimensionality N.
4.6 Privacy measures of SIN as functions of dimensionality.
4.7 CRR in face identification scenario.
4.8 EER in face verification scenario.
5.1 RAT: Distribution of SED (a) Gaussian approximation (σ² = 0.005); (b) at different σ² values.
5.2 RAT: Distribution of NSD (a) Gaussian approximation (σ² = 0.005); (b) at different σ² values.
5.3 RMT: Gaussian approximation of the distribution of SED (σ² = 0.005).
5.4 RMT: Distribution of SED (a) at different σ² values; (b) at different d values (σ² = 0.01).
5.5 RMT: Gaussian approximation of the distribution of NSD (σ² = 0.005).
5.6 RMT: Distribution of NSD (a) at different σ² values; (b) at different d values (σ² = 0.01).
5.7 RP: Distribution of SED (a) Gaussian approximation (M = 80); (b) at different projected dimensionalities.
5.8 RP: Distribution of SED (a) Gaussian approximation (M = 80); (b) at different projected dimensionalities.
5.9 RP: Distribution of (a) SED, and (b) NSD, at different vector translation values.
5.10 Variance σ²_{j:N} as function of the variance σ²_x of x.
5.11 Privacy measures of RAT method as functions of variance σ²_r.
5.12 Privacy measures of RMT-SIN as functions of variance σ²_r and translation value d: (a) α, (b) β.
5.13 Privacy measures of RP-SIN as functions of projected dimensionality M and translation value d: (a) α, (b) β.
5.14 Obtained EER of RAT-SIN method for PCA and KDDA as functions of the variance of additive vector.
5.15 Obtained EER of RMT-SIN method: (a) PCA UI, (b) PCA UD, (c) KDDA UI, (d) KDDA UD.
5.16 Obtained EER of RP-SIN method for PCA and KDDA.
5.17 Comparison of RP-SIN and BH.
5.18 Intra-class and inter-class distributions of RP-SIN and BH, using PCA and KDDA feature extractors, in both user-independent and user-dependent scenarios.
5.19 Obtained EER of RP-SIN and BH as functions of translation value.
5.20 ROC curve of different methods.

List of Acronyms

• BH: BioHashing method

• CRR: Correct Recognition Rate

• DK: Different Key

• EER: Equal Error Rate

• FAR: False Accept Rate

• FRR: False Reject Rate

• GAR: Genuine Acceptance Rate

• i.i.d.: Independently and Identically Distributed

• KDDA: Kernel Direct Discriminant Analysis

• LDA: Linear Discriminant Analysis

• NSD: Normalized SIN Distance

• PCA: Principal Component Analysis

• PRD: Pairwise Relational Discretization

• RAT: Random Additive Transform

• RMT: Random Multiplicative Transform

• ROC: Receiver Operating Characteristic

• RP: Random Projection

• SED: Squared Euclidean Distance

• SIN: Sorted Index Numbers

• SK: Same Key

• UD: User Dependent

• UI: User Independent

Chapter 1

Introduction

1.1 Biometrics and Privacy

Applications that involve human authentication are everywhere in our daily lives. For instance, a bank client needs a bank card and a PIN to perform any financial transaction; an end user needs a password to access a computer network; a person needs a passport or other ID card to travel by air. A reliable authentication system can make our lives secure, efficient, and convenient. Conventional methods of identity verification are based on something you remember (e.g., password, PIN), or something you possess (e.g., token, ID card). However, such methods usually offer low-level security, since passwords and PINs can be forgotten or acquired by covert observation, while tokens and ID cards can be lost, stolen, or easily forged. According to a report from the Identity Theft Resource Center (ITRC), USA, there are millions of victims of identity theft each year, with the types of identity theft ranging from financial and criminal to government and benefit fraud [1]. Identity fraud threatens our security, creates financial loss, invades people’s privacy, and has a negative emotional impact on victims. There is a very strong need to develop methods for reliable human authentication to protect our private property and safety.

Instead of utilizing something a person remembers or possesses, biometrics determine or confirm an individual’s identity based on something he/she naturally possesses, either physiologically or behaviorally. Biometrics based systems enable true user authentication since they require a person to be present at the time and location of authentication, and therefore provide a direct link between the service and the actual user. With biometrics, there is nothing to lose or forget, and the system is relatively difficult to circumvent. Furthermore, biometrics offer an attractive method of non-repudiation, by which an authenticated user cannot falsely deny his/her malicious behavior due to the difficulty in copying and stealing someone’s biometrics [2].

The early use of biometrics dates back more than one thousand years, when potters in East Asia used their fingerprints as brand identity for their wares. One of the first commercial applications of modern biometric systems was deployed in the early 1970s to provide time keeping and monitoring using a finger measurement device [3]. Since then, biometrics have improved tremendously in terms of ease of use and diversity of application. Over the past twenty years, advances in technology have allowed the utilization of more sophisticated signals collected from the human body to serve as biometrics, so as to achieve higher authentication performance.

A wide range of biometric traits have been investigated in the past. Examples include physiological traits such as fingerprint and face, and behavioral characteristics such as gait and keystroke. To qualify as a biometric, a physiological or behavioral trait should be universal to all members of the population, distinctive between individuals, permanent over a period of time, and quantitatively collectable [2]. From a practical application point of view, a biometric system should be low cost, easy to use, willingly accepted by the users, and capable of producing good performance in terms of both accuracy and processing speed. Table 1.1 provides a comparison of some biometrics, based on the information in [2, 3] and our own perception. The comparison is based on six aspects: Universality (how universally the biometric trait is possessed by the population), Accuracy (distinctiveness of the biometric), Stability (how stable the biometric is over a period of time), User acceptability (the extent to which people are willing to accept this biometric in their daily lives), Cost (expense of deploying a biometric), and Circumvention (how resistant the biometric is to attacks). The last column lists possible factors of authentication errors. In general, each biometric has its strengths and weaknesses. There is no single biometric that can be used in all applications. The selection of a particular biometric depends on the properties of the biometric and the requirements of the envisioned application.

While biometric technology provides various advantages over conventional authentication strategies, there exist some problems primarily related to changeability and privacy. In the first place, biometrics are irrevocable. In traditional password based authentication, an individual can use different passwords for different applications. If the password in one application is compromised, a new one can be generated, and the other applications are not affected. The user can also change the password as often as he/she wishes, to achieve better security. Similarly, for applications that use tokens or ID cards, a new one can easily be reissued if the old one is lost or stolen. However, this is not the case for biometrics. When the biometric data of an individual are captured or the stored template is compromised, the biometric cannot be easily changed or reissued due to the limited number of biometric traits, e.g., humans have only one face and ten fingers. This limits the number of possible applications. For biometric technology to be deployed in a wide variety of applications, biometrics based authentication systems should be revocable. Ideally, just as with a password or an ID card, a user should be able to use a different biometric representation (or template) for different applications, and the biometric representation can be changed as often as the user wishes. When the biometric template in one application is compromised, the biometric signal itself should not be lost forever, and a new biometric template can be reissued.


Biometric trait    Universality  Accuracy  Stability  Acceptability  Cost  Circumvention  Error factors
Face               H             L         M          H              L     E              lighting, age, hair
Facial Thermogram  H             M         L          L              H     D              heat emanating surfaces
Fingerprint        M             H         H          H              L     M              dryness, dirt, age
Hand Geometry      M             M         M          M              M     M              hand injury, age
Hand Vein          M             M         H          M              H     D              exercise, health
Voice              M             L         L          H              L     E              noise, colds, age
Iris               H             H         H          L              H     D              poor lighting
Retina             H             H         M          L              H     D              glasses
Signature          L             L         L          H              L     E              changing signatures
Keystroke          L             L         L          H              L     M              hand injury, tiredness
DNA                H             H         H          L              H     D
Ear                M             M         H          H              L     M              piercing, age
Gait               M             L         L          H              L     M              weight, injury, inebriety
Odor               H             L         H          M              M     D              diet, emotion
Palmprint          M             H         H          M              M     M              injury, age
Hand Grip          M             U         M          M              H     M              temperature, health
Sweat Pores        H             U         M          H              H     M              dirt, health
Fingernail Bed     M             U         H          H              L     M              injury
ECG                H             H         M          L              L     D              emotion, health

Table 1.1: Comparison of biometrics (L: Low, M: Medium, H: High, D: Difficult, E: Easy, U: Unknown).


Moreover, since biometric data reflect the user’s physiological or behavioral characteristics, privacy concerns arise. In the password based scheme, a hashed version of the password is usually stored as the template for matching. The hash function is one-way and non-invertible, making it computationally infeasible to recover the original password even when the template is obtained by an adversary. Authentication succeeds if and only if the hashed version of the probe matches the stored template exactly. However, due to the noisy nature of biometric signals, the hash values of the same biometric signal obtained at different times will be totally different. Therefore, a hashing method that requires exact matching is not feasible for biometrics based authentication. In biometric recognition systems, the stored templates are usually the original signals (e.g., face or fingerprint images) or the extracted features. Storing the original signal as a template will potentially lead to immediate disclosure of the user’s personal information when the template is compromised. The extracted features are usually a set of low-dimensional values, either continuous or discrete, that represent the essentials of the original signal for recognition purposes, such as positions of minutiae points for fingerprints, eigen-faces, and IrisCodes. Although these features do not directly disclose the original signal when compromised, it is possible to adequately reconstruct the original signal using the stored information. For example, Uludag and Jain show that a synthetic fingerprint can be generated using the minutiae information [4], and Adler shows that face images can also be regenerated from stored face templates [5].
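The incompatibility between exact-match hashing and noisy biometric measurements described above can be illustrated with a minimal sketch. The feature values, the rounding to three decimals, and the helper name `sha256_hex` are assumptions made purely for illustration; they are not taken from the thesis.

```python
import hashlib


def sha256_hex(values):
    """Hash a feature vector after formatting each value to 3 decimals
    (the formatting rule is an illustrative assumption)."""
    payload = ",".join(f"{v:.3f}" for v in values).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Two hypothetical measurements of the same trait; one component differs by 0.001.
enrolled = [0.512, -1.204, 0.033, 2.118]
probe = [0.512, -1.204, 0.034, 2.118]

h_enrolled = sha256_hex(enrolled)
h_probe = sha256_hex(probe)

# The feature vectors are almost identical in Euclidean terms...
squared_distance = sum((a - b) ** 2 for a, b in zip(enrolled, probe))
# ...but exact hash matching fails, and the digests are unrelated.
exact_match = h_enrolled == h_probe
differing_hex_digits = sum(a != b for a, b in zip(h_enrolled, h_probe))

print("squared distance:", squared_distance)
print("hashes match:", exact_match)
print("differing hex digits (of 64):", differing_hex_digits)
```

A password typed twice produces the same bytes and therefore the same digest, whereas two acquisitions of a biometric rarely do, which is why matching has to tolerate small distances rather than demand equality.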

The possible disclosure of the original biometric signal raises significant privacy concerns. Privacy is a general term and usually arises as an assertion against other individuals or organizations to prevent interference with the individual’s autonomy [6]. One intuitive type of privacy is physical privacy, in which people desire their physical space to be free of interruption, intrusion, embarrassment, or accountability [7]. In biometrics, privacy refers to information privacy, i.e., an individual’s personal control over the collection, use, and disclosure of recorded information about them, as well as an organization’s responsibility for data protection and the safeguarding of personally identifiable information in its custody or control [8].

Biometric data are unique to an individual. In addition to the information that can be used for recognition purposes, biometric data also contain rich personal private information. In particular, they may reveal an individual’s sensitive information such as ethnicity, gender, and health and medical condition. It is shown in [9] that certain types of chromosomal disorders such as Down syndrome and Turner’s syndrome might be associated with unusual fingerprint patterns. Certain nonchromosomal disorders such as breast cancer and leukemia may also be inferred from fingerprint patterns. For iris and retina, afflictions like diabetes, arteriosclerosis, and hypertension may be determined by an expert, in addition to the diseases of the iris and retina themselves [10]. DNA contains all genetic information including gender, race, physical disorder, and mental illness, and certain patterns in palm lines are also associated with mental disorders such as Down syndrome and schizophrenia [11]. If such sensitive information is identified and used for discrimination in health insurance, employment, or any other activity, it would create a serious privacy breach.

The public is also increasingly concerned about the possible sharing of biometric information and the misuse of private information for unintended purposes, such as in commercial or law enforcement applications. For instance, human face images may be used for unintended applications such as advertisement, unauthorized disclosure on the Internet, or even committing crimes. It is also possible to track an individual’s activity, such as what he/she did and where he/she went, since every time an individual uses his/her biometric for authentication, the transaction will be recorded. The government may also monitor the actions and behaviors of people for security or forensic purposes. Without the knowledge and consent of the users, these unintended functions threaten the user’s privacy. Gradually, it is possible that the biometric is used not only for authentication but also for additional purposes, and individuals will totally lose control of their biometric information. The lack of privacy protection will have a negative impact on user confidence, trust, and the usage of a given information technology, specific application or deployment, or even an entire industry [8].

The changeability and privacy protection issues in biometrics have drawn intensive attention in both the privacy and research communities [12]. Cavoukian [13] introduced the term Untraceable Biometrics (UB), which comprises the following features, required to satisfy fair information practices and the time-honored privacy principles of user control, data minimization, and a high degree of data security:

• No storage of a biometric image or conventional biometric template;

• The original biometric image/template cannot be recreated from the stored information, thereby rendering it untraceable;

• A large number of untraceable templates for the same biometric may be created

for different applications;

• Untraceable templates from multiple applications cannot be linked together;

• An untraceable template may, however, be revoked or canceled.

The first two points are essentially the privacy issues, and the last three state the

changeability problem.

One straightforward method to address the changeability and privacy issues is to use

encryption keys to encrypt the biometric data during enrollment, and decrypt at the

time of authentication. However, this method provides limited privacy protection since

the original biometric template will be exactly recovered if the key is stolen.

To provide strong privacy protection, some tentative solutions have been introduced in the literature, which can be roughly categorized into two groups: biometric crypto-systems [14] and cancelable biometrics [15]. In a biometric crypto-system, a randomly generated cryptographic key is bound with the biometric features in a secure way such that neither the key nor the biometric features can be revealed if the template is compromised. The cryptographic key can be retrieved if sufficiently similar biometric features are presented. Error correction algorithms are usually employed to tolerate variability. Due to the binary nature of cryptographic keys, such systems usually require a discrete representation of biometric data, such as minutiae points for fingerprints, and iris codes. The major difficulty in the design of a biometric crypto-system is to bridge the gap between the exactness requirement of the cryptographic key and the noisy nature of biometric signals. Most of the existing systems are computationally expensive and usually degrade the recognition performance. Furthermore, the security levels of such methods still need to be further investigated [14,16,17].
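To make the key-binding idea concrete, the toy sketch below follows the general pattern of a fuzzy commitment style construction: a key is expanded with an error correcting code, XORed with binary biometric features, and only the XOR result and a hash of the key are stored. This is an illustrative assumption rather than a scheme taken from this thesis; the 5-fold repetition code, the bit lengths, and all function names are hypothetical, and a repetition code is far too weak for real use.

```python
import hashlib
import random

REP = 5  # each key bit is repeated REP times (toy error correcting code)


def encode(key_bits):
    """Repetition-code encoder."""
    return [b for b in key_bits for _ in range(REP)]


def decode(code_bits):
    """Majority-vote decoder; corrects up to 2 bit errors per block of 5."""
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def key_hash(key_bits):
    return hashlib.sha256(bytes(key_bits)).hexdigest()


def enroll(bio_bits, key_bits):
    """Store only the helper data (codeword XOR biometric) and a hash of the key."""
    return xor(encode(key_bits), bio_bits), key_hash(key_bits)


def release(helper, stored_hash, probe_bits):
    """Return the key if the probe is close enough to the enrolled bits, else None."""
    candidate = decode(xor(helper, probe_bits))
    return candidate if key_hash(candidate) == stored_hash else None


key = [1, 0, 1, 1, 0, 0, 1, 0]
rng = random.Random(1)
bio = [rng.randint(0, 1) for _ in range(len(key) * REP)]   # hypothetical binary features
helper, stored = enroll(bio, key)

noisy = list(bio)
for i in (0, 7, 13, 22, 38):       # one flipped bit in each of five different blocks
    noisy[i] ^= 1
impostor = [b ^ 1 for b in bio]    # every bit differs from the enrolled features

recovered = release(helper, stored, noisy)
rejected = release(helper, stored, impostor)
print("genuine probe recovers key:", recovered == key)
print("impostor probe rejected:", rejected is None)
```

The sketch also shows why such systems need a discrete representation: the XOR binding and the error correcting code both operate on bits, so real-valued features would first have to be quantized.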

An alternative and effective solution is the cancelable biometrics method, which ap-

plies repeatable and non-invertible transformations on the biometric data or features [15].

With this method, every enrollment (or application) can use a different transform. When

a biometric template is compromised, a new one can be generated using a new trans-

form. In mathematical language, the verification problem can be formulated as: Given a

biometric feature vector u, the biometric template x is generated through a generation

function x = Gen(u,k). Different templates can be generated by varying the control

factor k. During verification, the same transformation is applied to the authentication

feature vector v, y = Gen(v,k), and matching is based on similarity measure in the trans-

formed domain, i.e. S(x,y). The major challenge here lies in the difficulty of preserving

the similarity measure in the transformed domain, i.e. S(x,y) ≈ S(u,v). To further

ensure the property of privacy protection, the generation function Gen(u,k) should be

non-invertible such that û = Rec(x,k) ≠ u, where Rec(x,k) denotes the reconstruction

function when both the template x and control factor k are known. The selection of a

reconstruction function is dependent on the generation function to provide an estima-

tion of the original signal. The reconstructed signal should not be able to be used for

successful authentication.
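As a minimal sketch of the formulation above, the following Python fragment uses a key-seeded Gaussian random projection as one possible choice of Gen(u,k) and the Euclidean distance as S. The function names, the dimensions, and the use of an integer key as a random seed are illustrative assumptions, not a prescribed implementation.

```python
import math
import random

def projection_matrix(key, m, n):
    """Build an m x n matrix of i.i.d. Gaussian entries, seeded by the control factor `key`."""
    rng = random.Random(key)
    scale = 1.0 / math.sqrt(m)  # keeps distances roughly comparable after projection
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]

def gen(u, key, m):
    """Gen(u, k): project the feature vector u onto m dimensions (m < len(u))."""
    R = projection_matrix(key, m, len(u))
    return [sum(r * ui for r, ui in zip(row, u)) for row in R]

def distance(a, b):
    """S(., .): Euclidean distance between two vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical data: u is the enrolled feature vector, v a noisy sample of the same person.
data_rng = random.Random(0)
u = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
v = [ui + data_rng.gauss(0.0, 0.05) for ui in u]

x = gen(u, key=1234, m=50)   # enrolled template
y = gen(v, key=1234, m=50)   # same control factor: should stay close to x
z = gen(v, key=9999, m=50)   # different control factor: should be far from x

print(distance(u, v), distance(x, y), distance(x, z))
```

Because m is smaller than the original dimension, the linear system x = Ru is underdetermined, so knowing x and the key does not pin down u uniquely; how well u can still be estimated depends on the reconstruction function considered.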


The objective of this research is to investigate methods for changeable and privacy

preserving biometric template generation. The generated biometric templates should

possess the properties of (1) Changeability : a new template can be reissued if the old one

is compromised; (2) Privacy : the biometric information can not be retrieved when the

template is compromised; and (3) Performance: the recognition accuracy is not sacrificed.

Specifically, we follow the line of cancelable biometrics and explore different repeatable

and non-invertible random transformations. The effectiveness of the introduced solutions

is supported by detailed theoretical analyses on the changeability and privacy protecting

properties, as well as extensive experimentation on a face recognition problem.

1.2 Biometric Systems

Biometric recognition is essentially a pattern recognition problem that consists of four

modules: sensor module, feature extraction module, matching module, and system database

module. The sensor module performs data acquisition from an individual. The feature

extraction module processes and identifies discriminatory features from the collected

biometric data. In the matching module, the extracted features are compared with the

templates to generate a match score, where the templates are enrolled and stored in the

system database module [18].

Depending on the application context, a biometric system can operate in either a

verification mode or an identification mode [19]. Figure 1.1 shows the block diagram of

biometric systems in both modes. Verification is a one-to-one match that determines

whether the claim of an identity is true. Identification is a one-to-many comparison to

find an individual’s identity. Commercial applications of biometrics may work either

in verification or identification mode. Most government and forensic applications

operate in identification mode only. Examples of verification applications include computer

network login, electronic data security, ATM, physical access control, medical records


Figure 1.1: Block diagram of biometric recognition systems.

management, and distance learning. Identification applications include national ID, driver’s

license, social security, border control, passport control, criminal investigation, terrorist

identification, and missing children.

During enrolment, a feature vector ui, i = 1, 2, ..., n, where n is the total number

of participants, is extracted from the biometric data of each individual and stored in

the system database as a template. In the verification mode, a feature vector v is

extracted from the biometric signal of the authenticating individual Iv, and compared

with the stored template uj of the claimed identity Ij, through a similarity function

S. The evaluation of a verification system can be performed in terms of hypothesis

testing [20]: H0: Iv = Ij, the claimed identity is correct; H1: Iv ≠ Ij, the claimed

identity is not correct. The decision is made based on the system threshold t: H0 is

decided if S(v,uj) ≤ t and H1 is decided if S(v,uj) > t. A verification system makes

two types of errors:

Type I: False acceptance of an illegitimate individual,

Type II: False rejection of a legitimate individual.


The performance of a biometric verification system is therefore characterized by false

accept rate (FAR), and false reject rate (FRR). FAR is the probability of Type I error

P(H0|H1), i.e., the probability of deciding H0 when H1 is true. FRR is the probability

of Type II error P(H1|H0), i.e., the probability of deciding H1 when H0 is true. In

experimental evaluation, the FAR can be computed as $FAR = \frac{1}{n} \sum_{j=1}^{n} e_j^{FAR}$, where n is the

total number of human individuals in the evaluation, and $e_j^{FAR}$ is the FAR for individual

$I_j$. Let $N_j^s$ denote the number of successful imposter attempts against individual $I_j$, and

$N_j^a$ denote the total number of imposter attempts against individual $I_j$; then $e_j^{FAR} = N_j^s / N_j^a$.

Similarly, the FRR can be computed as $FRR = \frac{1}{n} \sum_{j=1}^{n} e_j^{FRR}$, with $e_j^{FRR} = M_j^r / M_j^a$, where $M_j^r$

denotes the number of rejected genuine attempts for individual $I_j$, and $M_j^a$ denotes the

total number of genuine attempts of individual $I_j$.
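The per-individual averaging in the FAR and FRR definitions above can be sketched as follows. The dictionary layout and the toy distance scores are hypothetical; an attempt is accepted when its distance score does not exceed the threshold t, consistent with the decision rule stated earlier.

```python
def far_frr(genuine, imposter, t):
    """Average per-individual FAR and FRR at threshold t.

    genuine[j]  : distance scores of genuine attempts of individual j
    imposter[j] : distance scores of imposter attempts against individual j
    An attempt is accepted when its score is <= t.
    """
    # e_j^FAR = (successful imposter attempts) / (all imposter attempts) for individual j
    e_far = [sum(s <= t for s in scores) / len(scores) for scores in imposter.values()]
    # e_j^FRR = (rejected genuine attempts) / (all genuine attempts) for individual j
    e_frr = [sum(s > t for s in scores) / len(scores) for scores in genuine.values()]
    return sum(e_far) / len(e_far), sum(e_frr) / len(e_frr)

# Hypothetical scores for two enrolled individuals.
genuine = {"A": [0.2, 0.4, 0.9], "B": [0.1, 0.3]}
imposter = {"A": [0.8, 1.2, 0.5, 1.5], "B": [1.0, 1.1]}

far, frr = far_frr(genuine, imposter, t=0.6)
print(far, frr)  # FAR = (1/4 + 0) / 2, FRR = (1/3 + 0) / 2
```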

In every biometric system, there is a tradeoff between FAR and FRR. The FAR

and FRR are closely related functions of the system decision threshold t. Figure 1.2(a)

depicts the intra-class and inter-class distributions. The overlapping area signifies the

error distribution. Assuming t is a threshold on a distance measure, if t is increased to make

the system more tolerant, the FRR decreases while the FAR increases. Often the interplay

of the FAR and FRR is represented by plotting FAR against FRR with the decision

threshold t as the free variable. This plot is called the receiver operating characteristic

(ROC) curve, as shown in Figure 1.2(b). By manipulating the system threshold t, the

system performance can be adjusted to match the requirement of the application.

Another evaluation measure of biometric verification systems is the equal error rate

(EER), which is defined as the operating point where FAR and FRR are equal. The

lower the EER, the better the system performance. In particular, EER provides a valu-

able measure for changeability in a two-factor scenario where a second control factor is

involved for producing changeable templates. For example, if a distance measure is used

to evaluate the similarity between data vectors, a larger threshold value will produce

lower FRR. The threshold value t that corresponds to FRR=0 represents the largest


within-class variation. Let x = Gen(u,k1) and y = Gen(v,k1) denote two biometric

templates generated from biometric representations u and v respectively, using the same

control factor k1, and the distance S(x,y) < t. Let z = Gen(v,k2) represent a tem-

plate generated from v using a different control factor. Assuming u and v are biometric

features from the same individual, if S(x, z) ≥ t, then the biometric templates that are

generated using different control factors can not be used to authenticate each other, i.e.,

it provides changeability for u and v. This can be generalized for any two biometric

feature vectors that are originally used to authenticate each other. If the authentication

is not successful after applying different control factors, then the associated controlling

operation provides changeability. Therefore, the FAR in the two-factor scenario provides

a measure of the changeability, and the smaller the FAR, the better the changeability.

If EER=0 can be obtained, it implies that zero FAR can be achieved even at the largest

threshold value (FRR=0), therefore indicates strong changeability.
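To make the role of the threshold concrete, the following sketch sweeps t over the observed distance scores and reports the point where FAR and FRR are closest, which approximates the EER. Pooled score lists are used for brevity instead of per-individual averaging, and all scores are made up for illustration.

```python
def error_rates(genuine, imposter, t):
    """Pooled FAR and FRR for distance scores, accepting when score <= t."""
    far = sum(s <= t for s in imposter) / len(imposter)
    frr = sum(s > t for s in genuine) / len(genuine)
    return far, frr

def approximate_eer(genuine, imposter):
    """Return (threshold, EER estimate) at the candidate threshold minimizing |FAR - FRR|."""
    best = None
    for t in sorted(set(genuine) | set(imposter)):
        far, frr = error_rates(genuine, imposter, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]

# Overlapping distributions: some error is unavoidable.
overlapping = approximate_eer([0.2, 0.3, 0.4, 0.7], [0.5, 0.9, 1.0, 1.2])

# Well separated distributions, as expected when imposter templates are
# generated with a different control factor: zero FAR even at FRR = 0.
separated = approximate_eer([0.1, 0.2, 0.3], [5.0, 6.0, 7.0])

print(overlapping, separated)
```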

Figure 1.2: Biometric system errors: (a) intra-class and inter-class distributions, (b) ROC

curve.

In identification mode, given an input feature vector v, if the identity of v, Iv, is

known to be in the system database, i.e., Iv ∈ {I1, I2, ..., In}, then Iv can be determined

by Iv = Ij*, with j* = arg min_j S(v,uj), j = 1, 2, ..., n, where S denotes the similarity measure. The

performance of a biometric identification system can be evaluated by the percentage of

correctly recognized attempts, i.e., correct recognition rate (CRR).
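A minimal sketch of closed-set identification and the CRR computation is given below, assuming S is the Euclidean distance so that the best match is the template at minimum distance. The identities and feature vectors are hypothetical.

```python
import math

def identify(v, templates):
    """Return the identity whose stored template is closest to the probe v."""
    return min(templates, key=lambda j: math.dist(v, templates[j]))

def correct_recognition_rate(probes, templates):
    """Fraction of probes assigned to their true identity.

    probes: list of (true_identity, feature_vector) pairs
    """
    hits = sum(identify(v, templates) == true_id for true_id, v in probes)
    return hits / len(probes)

templates = {"I1": [0.0, 0.0], "I2": [1.0, 1.0], "I3": [3.0, 0.0]}
probes = [
    ("I1", [0.1, -0.1]),
    ("I2", [0.9, 1.2]),
    ("I3", [1.4, 0.8]),   # lies closer to I2, so it is misidentified
    ("I3", [2.8, 0.1]),
]

crr = correct_recognition_rate(probes, templates)
print(crr)
```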


1.3 Contributions

This dissertation systematically studies random transformation based approaches for

changeable and privacy preserving face recognition. A set of contributions that have

been made are summarized as follows:

• We introduce a random projection based method for changeable and privacy preserving

face verification. A detailed mathematical analysis of the distance preserving

properties of random projection using i.i.d. Gaussian entries is presented. Our

analysis provides a method for computing the probability of preserving the distance

when projected onto an arbitrary lower dimension, and demonstrates a lower projection

bound than those presented in the literature. A geometric based approach is

presented to analyze the changeability of the proposed method, and a vector trans-

lation method is presented for achieving strong changeability of the generated bio-

metric template. Two application scenarios, user-independent and user-dependent

random projections are discussed. In both cases, the biometric template can be

regenerated by simply varying the random projection matrix generation key. We

analyze the privacy preserving property by studying how efficiently an attacker

can reconstruct the original signal, even in the worst case that the template and

the projection matrix are both known. Our analysis shows that there is a tradeoff

between accuracy and privacy, and it is possible to provide privacy protection with

slightly degraded performance. Furthermore, unlike other dimensionality reduction

tools which usually require the collection of a large number of images for training,

the proposed method uses random projection as both dimensionality reduction and

privacy preserving tools. It is data-independent and easy to implement.

• We propose a novel approach for face recognition based on biometric features in

the continuous domain. Instead of using the original features as templates and

for biometric matching, the proposed method only stores a set of sorted index


numbers (SIN), which is obtained by sorting the original features, and storing the

corresponding indices. A new algorithm is introduced to measure the similarity

between SIN vectors. We analyze the privacy preserving property of the proposed

method, and introduce two privacy measures to evaluate the level of privacy pro-

tection for both the individual attributes and global characteristics of the feature

vectors. Our experimental results suggest that the proposed method may improve

the recognition accuracy.

• We further present the application of the SIN method in conjunction with three

types of random transformations, namely random additive transform, random mul-

tiplicative transform, and random projection, for achieving both changeability and

privacy protection. The statistical properties of each random transformation in

both same key and different key scenarios are analyzed to provide insight into how

strong changeability can be obtained. The element and vector level privacy pro-

tecting characteristics of each random transformation in combination with SIN are

analyzed and demonstrated. Our analysis and experimental results demonstrate

that the proposed methods are capable of producing privacy preserving biomet-

ric templates with strong changeability, while maintaining and even improving the

recognition accuracy of the original features.
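To illustrate the sorted index numbers (SIN) representation mentioned in the second contribution, the sketch below sorts a feature vector and keeps only the resulting indices. The ascending sort order and the toy values are assumptions made for illustration; the similarity measure between SIN vectors is not shown here.

```python
def sorted_index_numbers(features):
    """Return the indices of `features` ordered by ascending feature value.

    Only the ordering is kept; the feature magnitudes themselves are discarded.
    """
    return sorted(range(len(features)), key=lambda i: features[i])

features = [0.7, -1.2, 3.4, 0.1]
sin_vector = sorted_index_numbers(features)
print(sin_vector)

# Any order-preserving change of the values (here a positive scaling)
# leaves the stored indices unchanged.
scaled = [2.0 * f for f in features]
print(sorted_index_numbers(scaled) == sin_vector)
```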

1.4 Organization

The organization of this dissertation is as follows:

Chapter 1 presents the general background of biometrics, changeability and privacy

issues, the contribution of this research, and the organization of this dissertation.

Chapter 2 provides a detailed review of the related works. This includes both the

biometrics based cryptographic systems and the repeatable and non-invertible transfor-

mation based cancelable biometrics methods.


Chapter 3 presents the proposed random projection based method for changeable

and privacy preserving face verification. The distance preserving property of random pro-

jection, and the changeability and privacy analysis are presented in detail. Experimental

evaluation is performed on a complex generic data set and the results are presented.

Chapter 4 introduces the proposed sorted index number (SIN) approach for face

recognition. The rationale of the proposed SIN method as well as the privacy protecting

property are discussed. The experimental results on both identification and verification

scenarios are reported.

Chapter 5 introduces a framework for the integration of random transformations

with the SIN approach, to achieve reissuable and privacy protecting biometric template

generation. The changeability and privacy preserving properties of three types of random

transformations are analyzed and discussed. The effectiveness of the proposed methods

is demonstrated through extensive experimentation.

Chapter 6 summarizes the work presented in this dissertation and outlines the di-

rections for future research.

The technical contents of Chapters 3, 4, and 5 have been submitted to or have appeared in

the following refereed journal and conference publications:

• Y. Wang, D. Hatzinakos, On random transformations for changeable face verifica-

tion, submitted to IEEE Transactions on Systems, Man and Cybernetics, Part B,

January 2010.

• Y. Wang, D. Hatzinakos, Sorted index numbers for privacy preserving face recog-

nition, EURASIP Journal on Advances in Signal Processing, vol. 2009, Article ID

260148, 16 pages, 2009. doi: 10.1155/2009/260148.

• Y. Wang, K. N. Plataniotis, An analysis of random projection for changeable and

privacy preserving biometric verification, IEEE Transactions on Systems, Man and

Cybernetics, Part B, In Press. Reprint permission granted by IEEE.


• Y. Wang, D. Hatzinakos, Cancelable face recognition using random multiplicative

transform, accepted by the 20th International Conference on Pattern Recognition

(ICPR), Istanbul, Turkey, August 23-26, 2010.

• Y. Wang, D. Hatzinakos, Random translational transformation for changeable face

verification, Proceedings of IEEE 16th International Conference on Digital Signal

Processing (DSP), Page(s): 1-6, Santorini, Greece, July 5-7, 2009.

• Y. Wang, D. Hatzinakos, Face verification with changeable templates, Proceed-

ings of IEEE 22nd Canadian Conference on Electrical and Computer Engineering

(CCECE), pp. 31-36, St. John’s, Newfoundland and Labrador, Canada, May 3-6,

2009.

• Y. Wang, D. Hatzinakos, Face recognition with enhanced privacy protection, Pro-

ceedings of IEEE International Conference on Acoustics, Speech, and Signal Pro-

cessing (ICASSP), pp. 885-888, Taipei, Taiwan, April 19-24, 2009.

• Y. Wang, K. N. Plataniotis, Face based biometric authentication with changeable

and privacy preservable templates, Proceedings of IEEE Biometrics Symposium

(BSYM), Page(s): 1-6, Baltimore, USA, September 11-13, 2007.

Chapter 2

Literature Review

The advances in biometric technology have significantly improved the recognition perfor-

mance of various biometric modalities over the past two decades. To support the potential

deployment of biometrics in a broad spectrum of government and civilian applications,

it is important to offer changeability and privacy protection to the biometric templates.

Secure biometric recognition has drawn extensive attention in the research community

in the past few years, with a large number of solutions being proposed for various biometric

traits. The design of a privacy preserving biometric system is critically dependent

on the characteristics of the biometric data and features. Existing works on privacy

preserving biometric recognition can be roughly grouped into two categories, namely

biometric crypto-systems which combine biometrics with cryptographic technology, and

cancelable biometrics that employ repeatable and non-invertible transformations. This

chapter provides a review of the related works.

2.1 Biometric Crypto-system

Cryptography is an important technique in information security and related applications,

particularly in encryption, authentication, and access control. Different cryptographic

algorithms have been introduced in the past. In general, data are secured using symmetric


cipher systems, while public-key systems are used for digital signatures and key exchange

among users. The level of security relies on the secrecy of the secret key for symmetric

systems, and of the private key for public-key systems. Due to the large size of the cryptographic

key, a short password is usually used to encrypt the cryptographic key. The users only

need to remember the short password to retrieve their cryptographic key. When following

this strategy, the cryptographic key has the same level of security as that of a password,

which can be forgotten or stolen.

Biometrics and cryptography are two complementary security technologies. Biomet-

rics utilize the unique characteristics of an individual and hence provide true user authen-

tication. By combining them, high level security can be expected. The major problem in

the combination of biometrics with cryptographic systems is primarily due to the drastic

variation of biometric representation and the imperfect nature of biometric feature extraction

and matching algorithms. Unlike password-based cryptographic systems, where exact

key generation and matching can be obtained, the biometric information of the same

person presented at different times and locations may suffer significant variation. Since it

is difficult to produce exactly the same representation, the matching of a biometric sys-

tem is usually fuzzy. Among the pioneering works of biometrics based cryptographic key

generation, Bodo [21] first proposed to use the data derived from the biometric template

as the cryptographic key in a German patent. Obviously, the noisy nature of biometrics

makes it a questionable choice when used as a cryptographic key directly. Also, in this

case, if the key is ever compromised, then this biometric is irrevocably lost.

A biometric crypto-system is essentially a process of secure binding of a cryptographic

key with biometric data, and retrieval of the key based on a new biometric presentation

and the stored template. Figure 2.1 shows a general block diagram of a biometric crypto-

system. Due to the binary nature of cryptographic keys, the biometric information is

usually required to be represented in discrete domain. During enrollment, a discrete

feature vector is extracted from the original biometric data, and a randomly generated


key is combined with the biometric features through a binding algorithm. The resulting

representation is stored as a template, and the biometric features and the key are both

discarded. Binding should be performed in a secure way such that neither the key nor the

biometric information can be retrieved, even when the stored template is compromised.

During authentication, the original cryptographic key can be retrieved, if the presented

biometric data are sufficiently similar to the enrolled data. The system will either

generate a key that can be used to decrypt a certain application, or a yes/no decision if

the hash value of the original key is also stored and an exact match of the hash values of

the retrieved key and original key is successful. The major challenge in the design of the

binding and retrieval algorithms is to bridge the gap between the fuzziness of biometrics

and the exactitude of cryptography. The security level of a biometric crypto-system can

be evaluated by the entropy of the generated key, which measures the cryptographic

strength of the key and is usually expressed in bits. For example, for a cryptographic key of

length N, if all the bits in the key are random and independent of each other, then the

key has an entropy of N bits. The larger the entropy, the greater the uncertainty of the

key, and the higher the security level. For symmetric cipher systems, a key length of 90

bits is generally considered the minimum for strong security, and the most commonly

used are 128-bit keys [22].
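The relation between key length and entropy can be checked numerically. The sketch below assumes independent key bits, each equal to 1 with probability p; the bias value used in the second call is hypothetical and only shows that biased bits yield fewer than N bits of entropy.

```python
import math

def bit_entropy(p_one):
    """Shannon entropy (in bits) of a single bit that equals 1 with probability p_one."""
    if p_one in (0.0, 1.0):
        return 0.0
    return -(p_one * math.log2(p_one) + (1 - p_one) * math.log2(1 - p_one))

def key_entropy_bits(n_bits, p_one=0.5):
    """Entropy of a key made of n_bits independent, identically distributed bits."""
    return n_bits * bit_entropy(p_one)

uniform = key_entropy_bits(128)        # unbiased, independent bits: full N bits
biased = key_entropy_bits(128, 0.9)    # biased bits: noticeably less than N bits
print(uniform, biased)
```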

Among the earliest efforts, Soutar et al. [23] introduced a fingerprint based system,

the Bioscrypt, which is also the first that has been commercialized into a product. The

proposed biometric encryption technique extracts phase information from the fingerprint

images using Fourier Transform. The extracted features are combined with a randomly

generated phase array to create two output arrays, a filter and a correlation function.

The filter function is stored in the Bioscrypt, while the correlation function is used to

link with a predefined randomly generated key, and then create an identification code.

During verification, a new image is combined with the filter function in the Bioscrypt

to produce a new output pattern, which is used to retrieve the key and compute the


Figure 2.1: General block diagram of a biometric crypto-system.

identification code. The newly generated identification code is compared with the one

stored in the Bioscrypt. This paper assumes a constrained image acquisition system, and

all the fingerprint images are completely aligned. No performance results are reported

in this work. However, Adler [16] shows that the biometric encryption technique is

vulnerable to a hill-climbing attack, where an estimate of the enrolled image can be

obtained to decrypt the code.

Monrose et al. [24] proposed to enhance the security of passwords by combining them with

keystroke biometrics. Keystroke duration and latency features are extracted and each

feature is discretized into a single bit. A short string is formed by concatenating the

bits. The short bit string is used in conjunction with a randomized number r to gener-

ate a lookup table via Shamir secret sharing [25]. The lookup table essentially contains

instructions on how to generate a hardened password using the keystroke biometric, the

password, and the random number r. The produced hardened password is then used


to encrypt a history file. During authentication, the system uses the random number r, the lookup

table, the newly presented keystroke biometric, and the authentication password to

compute a hardened password. This newly generated hardened password is then used to

decrypt the history file. If the decryption is successful, then the authentication is success-

ful. However, their system can only provide around 12 bits of entropy to the password

and approximately 51.6% success rate for legitimate logins. Later, they apply a similar

scheme on voice based cryptographic key generation [26], which is more distinctive than

the keystroke biometric. The biometric key entropy can be increased to 46 bits, and FRR

can be decreased to below 20%.

Davida et al. [27] proposed an iris based cryptographic signature verification system,

which carries a storage device with user-specific error correction parameters stored. These

parameters are used to decode and rectify the offset of biometric data, and a one-way

hash is used for verification. Furthermore, they proposed a scheme to hash the biometric

with the user’s password if the desired entropy can not be provided. Their method

provides rigorous resolution of biometric uncertainty through Hamming error correction

codes, and the recovery of iris data is protected by complexity theory. However, the stored

error correction parameters, if compromised, can be used to reveal information about a

user’s biometric. Also, no experimental results are reported.

Juels and Wattenberg [28] introduced a fuzzy commitment scheme, which generalized

and improved Davida’s method using error correction algorithms. During enrolment, a

secret message c is selected randomly from a set of vectors of error correction codes C,

which can correct up to t errors, and combined with biometric features x by computing

a difference vector e = x − c. For example, if the biometric features are binary, then

an XOR operation can be applied to bind the key and the features. A hashed version

of c, denoted as h(c), and the vector e are stored as templates. During authentication,

a newly presented biometric sample with features y can be used to retrieve the original

key c by computing c’ = y − e. If c’ and c are within a Hamming


distance t, then the error correction capability of C makes it possible to reconstruct c

such that h(c’) = h(c).
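The binary case of the fuzzy commitment scheme can be sketched as follows. A simple repetition code stands in for the error correction code C, and SHA-256 stands in for the hash h; both are illustrative assumptions chosen for brevity rather than the codes used in [28].

```python
import hashlib
import random

REP = 5  # each key bit is repeated REP times; majority voting corrects up to 2 flips per block

def encode(key_bits):
    """Map key bits to a codeword c of the repetition code."""
    return [b for b in key_bits for _ in range(REP)]

def decode(code_bits):
    """Recover key bits by majority vote over each block of REP bits."""
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]

def digest(bits):
    """h(.): hash of a bit list."""
    return hashlib.sha256(bytes(bits)).hexdigest()

def enroll(x, key_bits):
    """Bind the codeword to the binary features x; store (e, h(c)) only."""
    c = encode(key_bits)
    e = [xi ^ ci for xi, ci in zip(x, c)]  # XOR plays the role of x - c
    return e, digest(c)

def verify(y, e, h_c):
    """Unbind with the probe y, correct errors, and compare hashes."""
    c_noisy = [yi ^ ei for yi, ei in zip(y, e)]
    c_hat = encode(decode(c_noisy))
    return digest(c_hat) == h_c

rng = random.Random(7)
key_bits = [rng.randint(0, 1) for _ in range(16)]
x = [rng.randint(0, 1) for _ in range(16 * REP)]   # hypothetical enrolled features
e, h_c = enroll(x, key_bits)

y_genuine = list(x)
for pos in (0, 17, 43):          # a few bit errors, each in a different block
    y_genuine[pos] ^= 1
y_other = [1 - xi for xi in x]   # a maximally different sample

print(verify(y_genuine, e, h_c), verify(y_other, e, h_c))
```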

Feng et al. [29] presented an iris based system which stores a string of error correction

data. A two-layer error correction technique that involves a combination of Hadamard

and Reed-Solomon codes is devised to cope with the error patterns in iris codes. The key

is generated from a user’s iris image with the aid of auxiliary error-correction data, and

can be stored in a token. The reproduction of the key will be dependent on the biometric

and token. They further show that their system can be easily extended to incorporate

other factors such as passwords. Their paper claims that up to 140 bits of biometric key

can be generated and the FRR can be improved to below 0.5%.

Draper et al. [30] introduced a fingerprint based system using distributed source

coding. A statistical model is developed to model the movement, deletions and insertions

of minutiae points. The extracted enrolment biometric information x is then compressed

and scrambled to produce a "syndrome" s using a graphical code. During authentication,

a newly presented probe biometric representation y is used to estimate x from s. The

proposed method stores s, a cryptographic hash of x and the joint distribution of x, s, and

y. The authentication will be successful if the hash of the estimated x matches exactly

with the stored hash value of the original enrolment biometric. The security property of

this method is analyzed using information theory and random codes [31]. Experimental

results on a fingerprint data set show that the employed low-density parity-check (LDPC)

code is not strong enough to obtain information theoretic security yet.

Juels and Sudan [32] proposed a fuzzy vault scheme that enables unordered biometric

representation. The hardness of their scheme is based on the polynomial reconstruction

problem. During enrollment, a user selects a polynomial p(x) and encodes his crypto-

graphic key c into the polynomial’s coefficients, where the encoding can be achieved by

dividing c into non-overlapping chunks and mapping to the coefficients. The polynomial

p(x) is then evaluated at each element x of the biometric feature set, and all the pairs {x, p(x)} are stored as


the genuine set G. The user then generates a random set of pairs Q which are merged

with the G set to generate the final vault. Within the final vault, it is not known whether

the points belong to the G or the Q set. At verification, only when the biometric rep-

resentation of the authenticator has substantial overlap with the enrolled user, will the

pairs lying on the polynomial be identified and the key be reconstructed. This scheme is

expected to tolerate more variation in the biometric representation.
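The locking and unlocking steps of a fuzzy vault can be sketched as follows. This is a simplified illustration under several assumptions: arithmetic is done modulo the prime 65537, features are plain integers, and a stored SHA-256 hash of the key is used to recognize the correct polynomial by trying combinations of candidate points, in place of the Reed-Solomon decoding of the original scheme.

```python
import hashlib
import itertools
import random

P = 65537  # prime modulus of the finite field (illustrative choice)

def evaluate(coeffs, x):
    """Evaluate the polynomial with coefficients (lowest degree first) at x, mod P."""
    return sum(c * pow(x, d, P) for d, c in enumerate(coeffs)) % P

def interpolate(points):
    """Lagrange interpolation mod P; returns coefficients, lowest degree first."""
    k = len(points)
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                # multiply the basis polynomial by (X - xj)
                basis = [(a - xj * b) % P for a, b in zip([0] + basis, basis + [0])]
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P
        coeffs = [(c + scale * b) % P for c, b in zip(coeffs, basis)]
    return coeffs

def key_hash(coeffs):
    return hashlib.sha256(repr(list(coeffs)).encode()).hexdigest()

def lock(features, key_chunks, n_chaff, rng):
    """Build the vault: genuine pairs on the polynomial plus chaff pairs off it."""
    vault = [(x, evaluate(key_chunks, x)) for x in features]
    used = set(features)
    while len(vault) < len(features) + n_chaff:
        x, y = rng.randrange(P), rng.randrange(P)
        if x not in used and y != evaluate(key_chunks, x):
            used.add(x)
            vault.append((x, y))
    rng.shuffle(vault)  # hide which points are genuine
    return vault, key_hash(key_chunks)

def unlock(vault, check, query, degree):
    """Try to rebuild the key from vault points whose abscissa appears in the query."""
    wanted = set(query)
    candidates = [pt for pt in vault if pt[0] in wanted]
    for subset in itertools.combinations(candidates, degree + 1):
        coeffs = interpolate(subset)
        if key_hash(coeffs) == check:
            return coeffs
    return None

rng = random.Random(42)
key_chunks = [4660, 22136, 39612]             # key split into three coefficients
features = [11, 202, 3003, 404, 5050, 606]    # hypothetical enrolled feature set
vault, check = lock(features, key_chunks, n_chaff=20, rng=rng)

recovered = unlock(vault, check, [11, 3003, 5050, 606, 777], degree=2)
rejected = unlock(vault, check, [1, 2, 3, 4, 5, 6], degree=2)
print(recovered, rejected)
```

The genuine query shares four elements with the enrolled set, which is more than the three points needed for a degree-2 polynomial, so the key is recovered; the second query shares none and fails.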

Clancy et al. [33] first applied the fuzzy vault scheme in a fingerprint application.

They used a bounded nearest neighbor algorithm to find canonical minutiae positions

from a set of five fingerprints of a user. The fingerprint images in the training set are used

to estimate the variance of features. To build the vault, the maximal number of chaff

points are added with a preset minimal distance to the features. Their implementation

assumes pre-aligned fingerprints, and a 69-bit biometric key is derived with 30% FRR.

To address the alignment problem, Yang et al. proposed an automatic feature extraction

method [34, 35]. Their system achieves 83% successful unlocking rate. They also show

that the chaff points must have a minimal distance to the lock set points that is at least

twice as large as the acceptable distance of a minutiae position between different scans.

Uludag et al. [36] reported another implementation of fuzzy vault using fingerprints.

A 128-bit secret c in combination with the 16-bit cyclic redundancy check bits are divided

into non-overlapping 16-bit segments and mapped to a real number as the coefficients

of the constructed polynomial. Manually marked coordinates of minutiae points are

used as the biometric features. To tolerate slight variation, a simple quantization of

the minutiae data is performed. The fuzzy vault is formed by evaluating the minutiae

data on the polynomial, and adding chaff point pairs that do not lie on the polynomial.

During decoding, the probing minutiae data are compared with the vault, and the k exact

matches are identified. All possible combinations of m+1 points from the k points are

tried to reconstruct the degree-m polynomial. Therefore, if the query features

overlap with the template features in at least m+1 points, for some combination, the


secret can be decoded. Their system achieves 21% FRR and zero FAR on a database of

100 fingerprint image pairs. One disadvantage of their approach is the high computational

complexity due to the evaluation of multiple point combinations during decoding. In a

later paper [37], they extended the fingerprint fuzzy vault scheme in combination with

"helper" data to provide information for minutiae alignment.

Following a scheme similar to that in [36], a handwritten signature based cryptographic

key generation is presented in [38], and a face based method is introduced in [39]. Lee et

al. [40] presented a fuzzy vault based private key generation system using iris features.

To produce an unordered set of features for vault encoding and decoding, multiple iris

features are extracted from several local iris patches, and the exact values of the set are

generated through the k-means clustering method. Nandakumar and Jain introduced

a fuzzy vault based multi-biometric scheme using fingerprint and iris [41]. The fuzzy

vault scheme has also been implemented in a multiple people secret sharing problem [42].

However, although the fuzzy vault scheme is shown to be secure in an information-

theoretic sense, it is generally computationally complex, and also vulnerable to cross

matching attacks when multiple templates of the same person are compromised [17].

Dodis et al. [43] presented a theoretical work for generating keys from noisy data,

where error correction codes are applied to the input followed by a hash function. In

their paper, they proposed two primitives, termed secure sketch and fuzzy extractor re-

spectively. The secure sketch only addresses the problem of error tolerance, while the

fuzzy extractor addresses both error tolerance and the nonuniformity of the input. They

showed that the fuzzy extractor can be constructed from the secure sketch using a ran-

domness extractor. Different constructions for three distance metrics, Hamming distance,

set difference, and edit distance are introduced. They also suggested a modification of the

fuzzy vault scheme by using a higher order polynomial to replace the chaff points. How-

ever, Boyen [44] showed that the proposed fuzzy extractor is not secure for the multiple

uses of the same secret.


The secure sketch scheme assumes discrete representation of biometric information.

Li et al. [45] extended the sketch to the continuous domain by first quantizing the contin-

uous features, followed by a known sketch in the discrete domain. They also introduced

the usage of relative entropy loss to measure the quality of a given quantization strat-

egy. Sutcu et al. [46] presented an implementation of [45] on a face verification problem.

Singular Value Decomposition (SVD) is first performed on the face images to extract

features. The resulting features are then randomized through a user-specific random

mapping, and a scalar quantizer is used to map the coefficients to discrete values. During

training, the midpoint and range of each dimension are recorded. The sketch is

constructed by simply computing the difference between the midpoint and the closest

codeword. At authentication, feature extraction and random mapping are performed on

the probe image. The decoder then computes the difference between the closest codeword

of the probe and the stored sketch. The original biometric can be reconstructed if the dif-

ference is smaller than a predefined user-specific range value. Their experimental results

show that it is possible to produce better recognition accuracy compared to the original

features. However, it is not clear why user-specific random mapping is employed, and

what the performance would be if the same random mapping is applied to all the users.

A similar scheme has also been applied in a multi-modality scenario using fingerprint and

face [47], where feature fusion is performed by a simple concatenation.

Hao and Chan [48] proposed a handwritten signature based system for key genera-

tion. Dynamic information such as velocity, pressure, altitude, and azimuth are extracted,

quantized, and encoded into bits. A binary string is generated by concatenating the fea-

ture derived bits. The private key is generated by a standard digital signature algorithm

proposed by NIST. Their system achieves around 40 bits entropy with 28% FRR and

1.2% FAR.

A helper data system was introduced in [49,50], where the enrollment biometric data

x and a selected matrix V are used to generate helper data W together with a key


c through an encoding function. In each dimension, x is quantized at a step size of

q, and the helper data W is obtained by adding a small value to c depending on the

corresponding c value. The helper data and a hashed version of c, h(c), are stored in the

database. During authentication, the input biometric data y is combined with the helper

data W to reconstruct a key c’, and h(c’) is then compared with h(c). It is assumed that

the variation in each dimension is relatively small compared to the quantization size q.

In [51], the above mentioned scheme is applied in an acoustic ear identification problem,

and an FRR of 3.89% can be achieved with a 100-bit key length.

Kevenaar et al. [52] proposed a method to produce a binary feature vector from face

images. A set of six key objects is first identified from the human face as fiducial points,

and Gabor filters are applied to extract texture features x from a small patch centered

around every fiducial point. The mean vector of a user is compared against the mean

vector u of all the users to determine a binary feature vector q. A reliability measure is

then defined based on the normal distribution assumption of the features and only those

with high reliability are selected, denoted as b, and the indices are arranged in a vector

W1. Their experiments demonstrate that the performance of the binary feature vectors

only degrades slightly compared to the original features. To produce a renewable and

privacy preserving template, a binary string c is generated randomly, and encoded using

error correction codes. The binary vector b and the error correction encoded

random string e are combined through an XOR operation to produce helper data W2.

The mean vector u, index vector W1, helper data W2, and a one-way cryptographic

hash of c, h(c), are stored. During verification, a binary vector b’ is generated using the

authenticator’s features, u, and W1, and b’ is then combined with W2 through an XOR

operation to produce e’. Finally, c’ is obtained by decoding e’ and h(c’) is compared

with h(c). However, their approach results in an unacceptable FRR of 35% and can only

tolerate small within class variation. Furthermore, the performance of their system is

critically dependent on the accurate localization of the fiducial facial points.
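
The enrollment and verification steps described above bind a random string to the binary feature vector through XOR. The following is a minimal sketch of that idea only, not the implementation of [52]: it assumes a toy 5-fold repetition code in place of the unspecified error correction code, SHA-256 as the one-way hash, and hypothetical function names (`enroll`, `verify`).

```python
import hashlib
import secrets

REP = 5  # repetition factor of the toy error correction code (an assumption)


def ecc_encode(bits):
    """Encode by repeating every bit REP times."""
    return [b for b in bits for _ in range(REP)]


def ecc_decode(bits):
    """Decode by majority vote over each group of REP bits."""
    return [int(sum(bits[i:i + REP]) > REP // 2)
            for i in range(0, len(bits), REP)]


def xor(a, b):
    """Element-wise XOR of two equal-length bit lists."""
    return [p ^ q for p, q in zip(a, b)]


def digest(bits):
    """One-way hash h(.) of a bit list (SHA-256 chosen for illustration)."""
    return hashlib.sha256(bytes(bits)).hexdigest()


def enroll(b):
    """Return (W2, h(c)) for a binary vector b whose length is a multiple of REP."""
    c = [secrets.randbelow(2) for _ in range(len(b) // REP)]  # random string c
    e = ecc_encode(c)                                         # encoded string e
    return xor(b, e), digest(c)                               # helper data W2, h(c)


def verify(b_probe, w2, h_c):
    """Recover c' from the probe vector b' and compare h(c') with h(c)."""
    e_probe = xor(b_probe, w2)
    c_probe = ecc_decode(e_probe)
    return digest(c_probe) == h_c
```

With this toy code, up to two bit errors per group of five are corrected, which illustrates why such a scheme tolerates only small within-class variation: any group with three or more errors changes the decoded string and hence the hash.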


2.2 Cancelable Biometrics

To deal with the non-revocable and privacy problems in biometric systems, Ratha et

al. [15] introduced the concept of Cancelable Biometrics, which is defined as an intentional

and repeatable distortion of a biometric signal, through a chosen transform. Figure 2.2

depicts the general block diagram of a cancelable biometric system. With this approach,

every instance of enrollment can use a different transform. If one variant of the biometric

template is compromised, then a new variant can be created by simply changing the

transform control, e.g., a seed or key associated with a random number generator. In

general, the repeatable transform should be selected to be non-invertible such that even if

the exact transform and the resulting transformed biometrics are both known, the original

biometrics can not be recovered. The distortion can be implemented either in the signal

domain, where a morphed version of the biometric signal is enrolled, or in the feature

domain, where the distortion is performed on the processed biometric signal. They also

proposed to transform the face images using a morphing method [15,20]. However, their

method requires an alignment before the transformation. Moreover, the face image may

be revealed by an adversary if the morphing function is invertible.

Following the line of cancelable biometrics, Ratha et al. [53] introduced a framework of

generating cancelable fingerprint templates. A few different methods including Cartesian,

polar and surface folding transformations of the minutiae positions are discussed analyt-

ically and empirically. Their paper demonstrates the revocability and non-invertibility

of the proposed transformations, and anticipates that the feature level cancelable bio-

metric construction can be applied in large biometric deployments. However, this work

focuses on fingerprints whose features are usually a set of unordered minutia positions,

and the number of which varies. It is not clear how such methods can be applied to other

biometrics such as face and iris, whose features are usually of fixed length and order.

Figure 2.2: General block diagram of a cancelable biometric system.

Jeong et al. [54] proposed a method to scramble and add the normalized principal

component analysis (PCA) and independent component analysis (ICA) coefficient vectors

that are extracted from face images together to produce a new feature vector as a

template. Since the transformed template is generated by the addition of two vectors, the

original coefficients can not be recovered. When a template is compromised, a new scram-

ble rule may be applied to generate a new template. Although their experiments show

that the performance does not degrade significantly compared to the original features,

no further analysis is given in the case where the scramble key is stolen.

Savvides et al. [55] proposed an approach for cancelable biometric authentication in

the encrypted domain. The training face images are first convolved with a random kernel,

and the transformed images are used to synthesize a single minimum average correlation

energy filter. At the point of verification, a query face image is convolved with the

same random kernel, and then correlated with the stored filter to check similarity. If

the storage card is ever attacked, a new random kernel may be applied. They show that

the performance is not affected by the random kernel. However, it is not clear how the

system preserves privacy if the random kernel is obtained by an adversary. The original

biometrics may be retrieved through de-convolution if the random kernel is known.

Boult [56] introduced a method for face based revocable biometrics based on robust


distance measures. In this scheme, the face features are first transformed through scal-

ing and translation, and the resulting values are partitioned into the integer and the

fractional part. The integer part is encrypted using Public-Key algorithms, while the

fractional part is retained for local approximation. A user-specific pass-code is included

to address the revocation problem. In a subsequent paper [57], a similar scheme is applied

on a fingerprint problem, and a detailed security analysis is provided. Their methods

demonstrate both improvement in accuracy and privacy. However, it is assumed that the

private key can not be obtained by an imposter. In the case of known private key and

transformation parameters, the biometrics features can be successfully recovered.

Lee et al. [58] introduced a two-factor method for generating cancelable fingerprint

templates using local minutiae information. A feature vector that contains the orientation

information of the neighboring area of each minutia point is first extracted. A rotation

and translation invariant value is then estimated by computing the inner product between

a user-specific random vector and the feature vector of each minutia point. The

resulting invariant value for each minutia point is used as input to two user-specific changing

functions, and the transformed features are stored as templates for authentication. The

biometric templates can be changed by simply replacing the changing functions, which

are associated with a PIN number for random number generators. However, it is shown

that the proposed method has a tradeoff between performance and changeability, i.e.,

the weaker the changeability, the higher the accuracy, and vice versa. The reported ex-

perimental results are also based on the ideal case, i.e., the user-specific PIN number is

always legitimate. No detailed analysis about the stolen PIN scenario is provided.

Teoh et al. [59] introduced a two factor authenticator, BioHashing, for cancelable

human identity recognition based on biometrics and a tokenized random number. The

base BioHashing method is composed of two steps. In the first step, a feature vector $x \in \mathbb{R}^N$ is extracted from the biometric characteristic. The second step involves discretization,

where the extracted feature vector is reduced down to a bit vector $b \in \{0, 1\}^M$, with $M$

the length of the bit string, $M \le N$, by using the pseudo-random numbers generated by

the given Hash key. The procedure of creating the BioHash code b is as follows:

1. Use Hash key $k$ to generate a set of pseudo-random vectors $r_i \in \mathbb{R}^N$, $i = 1, \ldots, M$.

2. Apply the Gram-Schmidt orthonormalization method to transform the basis $r_i$ into an orthonormal set of vectors $or_i$, $i = 1, \ldots, M$.

3. Compute the inner product between the biometric feature vector $x$ and $or_i$, $i = 1, \ldots, M$, denoted as $\langle x | or_i \rangle$.

4. Compute the $M$-bit BioHash code $b_i$, $i = 1, \ldots, M$, according to:

\[
b_i = \begin{cases} 0, & \text{if } \langle x | or_i \rangle \le \tau \\ 1, & \text{if } \langle x | or_i \rangle > \tau \end{cases} \tag{2.1}
\]

where $\tau$ is a preset threshold.
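
The four steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the reference implementation of [59]: the pseudo-random vectors are assumed to be drawn from a standard Gaussian by a generator seeded with the Hash key, and the names `biohash` and `hamming` are hypothetical.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def biohash(x, key, m, tau=0.0):
    """Return the M-bit BioHash code of feature vector x under Hash key `key`."""
    n = len(x)
    if not 0 < m <= n:
        raise ValueError("require 0 < M <= N")
    rng = random.Random(key)  # step 1: key-seeded pseudo-random generator
    basis = []
    while len(basis) < m:
        # Gaussian entries are an assumption; the text only says "pseudo random".
        r = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for o in basis:  # step 2: Gram-Schmidt against the vectors found so far
            c = dot(r, o)
            r = [p - c * q for p, q in zip(r, o)]
        norm = dot(r, r) ** 0.5
        if norm > 1e-12:  # skip a (practically impossible) degenerate draw
            basis.append([p / norm for p in r])
    # steps 3 and 4: inner products followed by thresholding, as in (2.1)
    return [1 if dot(x, o) > tau else 0 for o in basis]


def hamming(a, b):
    """Hamming distance between two bit lists, used for matching."""
    return sum(p != q for p, q in zip(a, b))
```

Calling `biohash(x, key, m)` twice with the same key reproduces the same code, while a new key yields a new orthonormal basis and therefore a new code for the same feature vector.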

By using this method, a unique compact code is generated for each individual. The

Hash key is given to the user during the enrollment, and is different among different

users and different applications. The generated BioHash codes are compared for sim-

ilarity matching using the Hamming distance. An individual needs to have both the

correct biometrics and Hash key to pass the authentication. The BioHashing technique

is applied to a fingerprint problem, and it is claimed that zero equal error rate (EER)

can be achieved, i.e., reducing the FAR without increasing the FRR. Later, the same

method is applied to other biometric systems, such as palmprint [60], and face [61,62], in

combination with different feature extraction and matching techniques. In [63], a similar

procedure with a different thresholding method was introduced for better error tolerance.

In this work, instead of a one-step thresholding for bit extraction, the bit is determined

immediately when the inner product $\langle x | or_i \rangle$ is greater than $u + \sigma$ or smaller than $u - \sigma$,

where $u$ and $\sigma$ are experimental parameters. When $\langle x | or_i \rangle \in [u - \sigma, u + \sigma]$, the inner

product is recomputed as $\langle x | or_j \rangle$, where $or_j$ is an unused orthonormal random vector.


The base BioHashing method is based on the assumption that the Hash key will

not be stolen. The main drawback of the method is the low performance when an

imposter B steals the Hash key of A and attempts to be authenticated as A. When this

problem occurs, the performance of BioHashing can be lower than using the biometric

data only [64]. Kong et al. [65] investigate the applicability of the BioHashing method on a

face recognition problem and conclude that the claim of having achieved near zero EER is

based on an impractical assumption that the Hash key can not be stolen. They also point

out that if the assumption holds, there would be no need for biometrics to be combined

with the user-specific random numbers, since the latter can itself serve as a password.

Their experiments show that if an imposter has the Hash key, the BioHashing method is

even worse than the biometric method alone. However, no solutions are suggested.

To improve the performance of BioHashing in the case of stolen key, a multi-matcher

fusion method was proposed in [66]. The biometric signature is divided into a number

of training sets, half of which are used to train the classifiers, while the other half are

combined with pseudo random numbers. Fusion is performed using a sum rule of the

similarity scores of the two ensembles, and a max rule is used to select the final score for

each of them. In [64], a multi-modal method is proposed to combine the scores of selected

fingerprint matchers with the scores obtained by a face authenticator where the facial

features are combined with pseudo-random numbers. Fusion is performed by treating

the similarity matching scores of each system as new features, and using a linear support

vector machine for the final classification. In [67], a random subspace based method is

further proposed to combine the similarity matching scores and enhance the performance

of the system.

In addition, Lumini et al. [68] proposed another approach to enhance the BioHashing

technique. Their experimental results show that the performance of the base BioHashing

method relies on the selection of parameter M (number of bits), and τ (discretization

threshold value). To deal with these problems, an improved version of BioHashing in-


cludes: 1) Normalization of the biometric vectors before BioHashing. 2) Instead of using

a fixed value for τ , several values of τ may be used and combined according to the sum

rule. 3) Use more projection spaces to generate more BioHash codes per user. This

can be achieved by performing the BioHashing method iteratively k times on the same

biometric vector to obtain k projection spaces. Verification is carried out by combining

the classification scores of each BioHash code. 4) Another way to generate more Bio-

Hash codes is to use several permutations of the biometric features during the projection

calculation. Experiments were performed on face, fingerprint, and signature biometric

data. The experimental results demonstrate performance improvement in the stolen key

case, and the fusion of biometric data and space augmented BioHash codes is expected

to achieve a good compromise between the best case of non-stealing and the worst case

of being always stolen.

The BioHashing technique has also been used as the first step for cryptographic key

generation. In [69], eigenprojections are extracted from face images as features, each of

which is then hashed with pseudorandom numbers to extract a single bit. A bit string

is formed by concatenating the bits, and this bit string is further securely reduced to a

single cryptographic key via Shamir secret sharing. This paper reports 80-bit entropy

with 0.93% FRR. In [70], the same bit string and key generation technique as in [69]

is used, while the face images are represented with a wavelet Fourier-Mellin transform.

In [71], a Reed-Solomon error code is incorporated as an error correction step to correct

the bit disparity between the gallery and probe sample of BioHash. Their methods are

tested on three different fingerprint databases and on average, an FRR of below 1% can

be achieved.

Theoretical analysis of the BioHashing technique is presented in [72] using random

projection theory. However, random projection theory addresses the distance preserving

property in the domain of real numbers, and it is not clear how this is preserved in the

quantized domain. Moreover, it should be noted that for a certain system threshold value,


the FRR is not affected by the employment of a user-specific key. Therefore, the system

threshold value that is selected for near zero EER will produce a large FAR in the stolen-key

scenario. Furthermore, for an $M$-bit BioHash code $b$, assume that each bit in $b$ is independent, and

let $t$ be the system threshold value in terms of Hamming distance. Then, when different

keys are applied on the biometric features of the same user, the probability of false

acceptance is

\[
\frac{\sum_{i=0}^{t} \binom{M}{i}}{2^M}.
\]

This probability depends on two factors, the system threshold $t$

and the dimensionality M , which reflect the separability and characteristics of the data

and the feature extractors. Therefore, the changeability (as well as the performance in

the user-specific key scenario) of BioHashing is highly dependent on the characteristics

of the extracted features.
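
To make the dependence on the threshold t and the code length M concrete, the false acceptance probability stated above (the sum of the binomial coefficients of M over i, for i from 0 to t, divided by 2 to the power M) can be evaluated directly. The sketch below assumes independent, unbiased bits, as in the text; the function name is hypothetical.

```python
from math import comb


def false_accept_probability(m, t):
    """Probability that two independent, uniformly random M-bit codes
    differ in at most t positions (Hamming distance <= t)."""
    if not 0 <= t <= m:
        raise ValueError("require 0 <= t <= M")
    return sum(comb(m, i) for i in range(t + 1)) / 2 ** m
```

For a fixed M the probability grows with t, and for a fixed ratio t/M below one half it shrinks as M grows, which is why the achievable changeability is tied to how small a threshold the extracted features permit.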

More recently, Teoh et al. [73] proposed a Multispace Random Projection (MRP)

method, which applies user-specific random projection on reduced low-dimensional fea-

ture vectors without the quantization procedure of BioHashing. The distance preserving

property of MRP is analyzed based on normalized inner product, and near zero EER is

achieved in the user-specific MRP scenario. However, their papers lack rigorous privacy

and changeability analysis. As shown in this thesis, the privacy protecting property

of their method is subject to certain attacks. Similar to the BioHashing technique, the

near zero EER in the user-specific key scenario will produce high FAR in the stolen-key

scenario. Even in the both-legitimate scenario, the performance of BioHashing and MRP

techniques is highly dependent on the characteristics of the extracted features; therefore,

their methods do not provide strong changeability.

2.3 Summary

In summary, existing works either can not provide robust privacy protection, or sacrifice

verification accuracy for privacy preservation. The biometric crypto-system integrates

biometrics with cryptographic techniques to provide strong security of the biometric


templates. Due to the binary nature of traditional cryptographic keys, a discrete rep-

resentation of the biometric signal is generally required. However, due to the enormous

variation of biometric signal, it is difficult to extract a discriminatory discrete represen-

tation which will allow the error correction algorithms to efficiently correct the errors.

This will lead to significant performance degradation. This is particularly challenging for

biometric traits such as face, whose features are usually represented in the continuous

domain. Most of the existing biometric crypto-systems are computationally complex,

and usually suffer performance degradation. Furthermore, the majority of these works

only produce a repeatable cryptographic key, while the biometric itself is not changeable.

Biometrics are not secret. Humans leave their fingerprints everywhere, and human faces

can be easily captured by a camera. As such, if the biometric information is obtained

by an adversary, then all the biometric systems that use the same biometric trait are

compromised.

Motivated by the cancelable biometrics framework in [15], this thesis focuses on the

development of computationally efficient repeatable and non-invertible transforms for ad-

dressing both the changeability and privacy protection problems. Independent from [73],

we present a random projection based technique for secure biometric template generation.

To improve the recognition performance, a distance and privacy preserving mechanism

in a low-dimensional space, termed sorted index numbers (SIN) approach, is introduced.

The SIN method is then combined with different random transformations to obtain strong

changeability and enhanced privacy protection. This thesis presents detailed analysis on

the changeability and privacy preserving properties of the proposed methods. The Bio-

Hashing [61, 62] and MRP [73] methods are adopted for comparison. The effectiveness

of the introduced solutions is supported by extensive experimentation.

Chapter 3

Random Projection Based Face

Verification

3.1 Introduction

This chapter presents a systematic analysis of random projection (RP) as an intentional,

repeatable, and non-invertible transformation for changeable and privacy preserving bio-

metric template generation. RP is a technique to project a set of high-dimensional

data points to a randomly selected low-dimensional subspace, with the pairwise distance

between the data points approximately preserved. It is fundamentally based on the

Johnson-Lindenstrauss (J-L) lemma [74]. RP has been used as a dimensionality reduc-

tion or a privacy preserving tool in many different application contexts. Applications

of RP for dimensionality reduction include nearest neighbor search [75], face

recognition [76], image and text data processing [77] and clustering [78], and learning of mixtures

of Gaussians [79]. For privacy protection, RP has been applied for data mining [80], data

clustering [81], and biometric applications [62,73].

In this chapter, we elaborate its application in biometric verification as both di-

mensionality reduction and privacy preserving tools. The proposed method transforms


biometric data using a random matrix with each entry an independently and identically

distributed (i.i.d.) Gaussian random variable. This chapter contributes comprehensive

and detailed mathematical analysis on the similarity preserving and privacy protecting

properties of the generated biometric template. Our analysis introduces a precise method

of computing the probability of preserving the distance at an arbitrarily projected

dimensionality, and achieves a better projection lower bound than the best known in existing

works. Detailed privacy protection analysis is presented by studying the statistical prop-

erties of the reconstructed signal. The changeability of the biometric information in the

transformed domain is analyzed in detail using a geometric based approach and a vector

translation method is introduced to generate biometric templates with strong changeabil-

ity. Specifically, RP on both high-dimensional image vectors and dimensionality reduced

feature vectors are discussed and compared. Two different application scenarios, user-

independent (UI) and user-dependent (UD) RP are presented. The UI scenario utilizes

the same projection matrix for all the users, while the UD scenario is a two factor scheme

that applies user-specific RP. In both scenarios, the biometric template can be regener-

ated by simply varying the projection matrices. The proposed method is capable of

producing zero EER in UD scenario when both the biometric data and projection matrix

are legitimate. This also indicates strong changeability of the generated biometric tem-

plate. This is supported by both the probabilistic analysis and extensive experimentation

on a face verification problem.

The remainder of this chapter is organized as follows: Section 3.2 provides an overview

of the proposed solution. Section 3.3 analyzes the similarity preserving property of RP.

Section 3.4 presents the changeability analysis, and introduces the vector translation

technique for obtaining strong changeability. Section 3.5 presents an analysis of the

privacy protecting property of the proposed method. Section 3.6 reports the detailed

experimental results on a face verification problem, and Section 3.7 summarizes this

chapter.


3.2 Method Overview

The proposed method is based on random projection of face image vectors. An input

image is first preprocessed by detecting the face region. The preprocessed face image is

converted to a vector of size N × 1 by concatenating each row. The resulting vector, z,

is regarded as the input vector for feature extraction. The procedure of producing the

changeable and privacy preserving biometric template is as follows:

1. Preprocess and obtain an image vector $z \in \mathbb{R}^N$ from the input face image.

2. Use a key $k$ to generate an $N \times M$ ($M < N$) random matrix $R$. Each entry of $R$ is i.i.d. according to a Gaussian distribution with mean zero and variance $\frac{1}{N}$, i.e., $r_{ij} \sim \mathcal{N}(0, \frac{1}{N})$, $i = 1, \ldots, N$, $j = 1, \ldots, M$.

3. Compute $x = \sqrt{\frac{N}{M}} R^T z$, where the superscript $T$ denotes the transpose.

The extracted feature vector $x \in \mathbb{R}^M$ is stored as the template for verification.
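
The template generation procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the key is used to seed Python's pseudo-random generator, the matrix R is generated column by column, preprocessing (step 1) is assumed to have been done already, and the function names are hypothetical.

```python
import math
import random


def rp_template(z, key, m):
    """Project image vector z (length N) to an M-dimensional template
    x = sqrt(N/M) * R^T z, where R has i.i.d. N(0, 1/N) entries."""
    n = len(z)
    if not 0 < m < n:
        raise ValueError("require 0 < M < N")
    rng = random.Random(key)        # step 2: key-driven random matrix
    sigma = 1.0 / math.sqrt(n)      # standard deviation for variance 1/N
    scale = math.sqrt(n / m)        # step 3: scaling factor sqrt(N/M)
    x = []
    for _ in range(m):
        column = [rng.gauss(0.0, sigma) for _ in range(n)]  # one column of R
        x.append(scale * sum(r * zi for r, zi in zip(column, z)))
    return x


def squared_distance(a, b):
    """Squared Euclidean distance between two templates."""
    return sum((p - q) ** 2 for p, q in zip(a, b))
```

Re-issuing a template only requires a new key: the same image vector then maps to a different point in the M-dimensional space, while the same key always regenerates the same matrix and hence a comparable template.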

3.3 Accuracy Analysis

This section provides a detailed mathematical analysis of the similarity preserving prop-

erty of RP. RP is motivated by the J-L lemma [74]:

Lemma 3.1 (J-L lemma): For any $0 < \varepsilon < 1$ and an integer $n$, let $M$ be a positive integer such that $M \ge M_0 = O(\varepsilon^{-2} \log n)$. For any set $B$ of $n$ points in $\mathbb{R}^N$, there exists a mapping $f: \mathbb{R}^N \to \mathbb{R}^M$ such that for all $u, v \in B$,

\[
(1 - \varepsilon) \|u - v\|^2 \le \|f(u) - f(v)\|^2 \le (1 + \varepsilon) \|u - v\|^2. \tag{3.1}
\]

This lemma states that the pairwise distance between any two vectors in the Eu-

clidean space can be preserved up to a factor of ε, when projected onto a random

M -dimensional subspace. The original paper used heavy mathematical machinery to


prove that such mapping can be achieved by using a random matrix with orthonormal

columns. Frankl and Maehara [82] simplified the proof and introduced a bound of

$M_0 = \lceil 9 (\varepsilon^2 - 2\varepsilon^3/3)^{-1} \log n \rceil + 1$. Independently, simplified versions of this proof were

provided by Indyk and Motwani [83] and Dasgupta and Gupta [84]. In addition,

Arriaga and Vempala [85], Achlioptas [86], and Li et al. [87] showed that it is possible

to achieve such embedding through much simpler random matrices for fast operation.

Achlioptas [86] provided a sharper lower bound of $M_0 = \lceil (4 + 2\gamma)(\varepsilon^2/2 - \varepsilon^3/3)^{-1} \log n \rceil$, such that with probability of at least $1 - n^{-\gamma}$, where $\gamma$ controls the probability of success,

the pairwise distance between all n points can be preserved. Vempala [88] also intro-

duced a random projection method for mapping high-dimensional binary vectors into

low-dimensional ones, with the Hamming distance between the binary vectors approxi-

mately preserved.
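
To give a sense of scale, the two closed-form bounds quoted above can be evaluated numerically. The sketch below assumes that log denotes the natural logarithm and uses hypothetical function names; it simply transcribes the expressions attributed to [82] and [86].

```python
import math


def m0_frankl_maehara(eps, n):
    """Bound of [82]: ceil(9 * (eps^2 - 2*eps^3/3)^-1 * log n) + 1."""
    return math.ceil(9.0 * math.log(n) / (eps ** 2 - 2.0 * eps ** 3 / 3.0)) + 1


def m0_achlioptas(eps, n, gamma):
    """Bound of [86]: ceil((4 + 2*gamma) * (eps^2/2 - eps^3/3)^-1 * log n),
    holding with probability at least 1 - n^(-gamma)."""
    return math.ceil((4.0 + 2.0 * gamma) * math.log(n)
                     / (eps ** 2 / 2.0 - eps ** 3 / 3.0))
```

Even for a moderate distortion such as ε = 0.5 and n = 1000 points, both expressions call for several hundred dimensions, which motivates asking how well distances are preserved at lower dimensionality.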

As illustrated in [84] and [86], the key issue in producing such distance preserving

mapping is to show that the squared length (norm) of a random vector is sharply concentrated

around its mean when projected onto a random M -dimensional subspace, i.e., the Re-

stricted Isometry Property (RIP) [89]. Then, the assertion of the J-L lemma can be

proved by applying a union bound on all $\binom{n}{2}$ pairs such that none of the pairwise

distances is distorted by more than a factor of $(1 \pm \varepsilon)$. Most of the existing works utilize inequality

properties to provide a bound for the probability of preserving distance between two

points, and then extend to n points to compute the lower bound M0. However, exper-

imental results in [76] and [77] suggest that the lower bound M0 is not tight, and it is

possible to produce good results in a lower dimensionality. Therefore, we are interested

in finding the extent to which the distance between two vectors can be approximately

preserved, when projected onto a lower dimensional subspace. This is particularly impor-

tant for applications that have a high demand in storage or computational complexity.

In [85] and [86], it is suggested that RP can be achieved by using a random matrix

with i.i.d. Gaussian entries. Such methods do not need to conduct the computationally


expensive Gram-Schmidt procedure for orthonormalization, and therefore are more ap-

propriate for practical applications. Following this line, this chapter introduces a precise

method for computing the probability of preserving the Euclidean distance between two

vectors when projected onto an arbitrary M-dimensional subspace. The probability lower

bound of preserving the pairwise distances for all n points, with respect to an arbitrary

M is further analyzed. As demonstrated later in this chapter, for the same probability of

preserving distance for all n points, we can obtain a better lower bound M0 than that shown

in [86]. To begin with, we first look into the properties of a random matrix with i.i.d.

Gaussian entries. Throughout this thesis, we use E[·] and Var[·] to denote expectation

and variance respectively.

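Lemma 3.2: Let $R$ be an $N \times M$ ($M < N$) matrix. Each entry of $R$ is an i.i.d. Gaussian random variable with mean zero and variance $\frac{1}{N}$, $r_{ij} \sim \mathcal{N}(0, \frac{1}{N})$, $i = 1, \ldots, N$, $j = 1, \ldots, M$. Let $W = R^T R$ and $W' = R R^T$, then:

\[
E[w_{i,j}] = \begin{cases} 1, & i = j; \\ 0, & i \ne j; \end{cases} \tag{3.2}
\]

\[
\mathrm{Var}[w_{i,j}] = \begin{cases} \frac{2}{N}, & i = j; \\ \frac{1}{N}, & i \ne j; \end{cases} \tag{3.3}
\]

\[
E[w'_{i,j}] = \begin{cases} \frac{M}{N}, & i = j; \\ 0, & i \ne j; \end{cases} \tag{3.4}
\]

\[
\mathrm{Var}[w'_{i,j}] = \begin{cases} \frac{2M}{N^2}, & i = j; \\ \frac{M}{N^2}, & i \ne j; \end{cases} \tag{3.5}
\]

where $w_{i,j}$ and $w'_{i,j}$ are elements of $W$ and $W'$ respectively.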

Please see Appendix 3-I for the proofs.

The results in Lemma 3.2 show that $E[R^T R] = I$, where $I$ denotes the identity matrix.

When $N$ is large, the elements of $R^T R$ are sharply concentrated around their mean with

a very small variance, i.e., $R^T R \approx I$. This suggests that in a high-dimensional space,


when the entries of a random matrix R are i.i.d. Gaussian random variables, the columns

in R are almost orthogonal. The higher the dimensionality, the better the approximation

of orthogonality. Intuitively, the results show that in a high-dimensional space, vectors

with random directions are very likely to be close to orthogonal [90]. In particular, it

is straightforward to verify that when $r_{ij} \sim \mathcal{N}(0, \frac{1}{N})$, $E[\|r_j\|^2] = E[\sum_{i=1}^{N} r_{ij}^2] = 1$, and

$\mathrm{Var}[\|r_j\|^2] = \mathrm{Var}[\sum_{i=1}^{N} r_{ij}^2] = \frac{2}{N}$, where $r_j$ denotes each column of $R$. This suggests that

the length of each column vector in $R$ is strongly concentrated around 1, and subsequently

the vectors in $R$ are close to orthonormal. These properties of a random matrix with

i.i.d. Gaussian entries imply that it is possible to relax the enforced orthogonality and

normality as in the original J-L lemma. Similarly, it can be shown that $E[R R^T] = \frac{M}{N} I$.

When $R$ is scaled by $\sqrt{\frac{N}{M}}$, and with large $M$, we have $\sqrt{\frac{N}{M}} R \sqrt{\frac{N}{M}} R^T \approx I$.
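
The near-orthonormality of the columns of R can be checked empirically. The following sketch draws one matrix with i.i.d. Gaussian entries of variance 1/N and returns the M × M matrix of inner products between its columns; the seed, the sizes, and the function name are illustrative assumptions.

```python
import random


def gram_matrix(n, m, seed=0):
    """Draw an N x M matrix R with i.i.d. N(0, 1/N) entries and return
    W = R^T R as a list of M rows."""
    rng = random.Random(seed)
    sigma = (1.0 / n) ** 0.5
    r = [[rng.gauss(0.0, sigma) for _ in range(m)] for _ in range(n)]
    return [[sum(r[k][i] * r[k][j] for k in range(n)) for j in range(m)]
            for i in range(m)]
```

For N in the thousands, the diagonal entries of the returned matrix lie close to 1 and the off-diagonal entries close to 0, with deviations on the order of the standard deviations implied by (3.3).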

Lemma 3.3: Let $u$ be an arbitrary vector in $N$-dimensional Euclidean space, $u \in \mathbb{R}^N$. Let $R$ be an $N \times M$ ($M < N$) matrix. Each entry of $R$ is an i.i.d. Gaussian random variable with mean zero and variance $\frac{1}{N}$, $r_{ij} \sim \mathcal{N}(0, \frac{1}{N})$, $i = 1, \ldots, N$, $j = 1, \ldots, M$. Let $x = \sqrt{\frac{N}{M}} R^T u$, then:

\[
E[\|x\|^2] = \|u\|^2, \tag{3.6}
\]

\[
\mathrm{Var}[\|x\|^2] = \frac{2}{M} \|u\|^4. \tag{3.7}
\]

Please see Appendix 3-II for the proofs.

Lemma 3.3 shows that, up to a scaling factor $\sqrt{\frac{N}{M}}$, the squared length of an arbitrary

vector is concentrated about its original one when the vector is projected onto a random

M-dimensional subspace. This explains the key issue in producing a distance preserving

mapping as illustrated in [84] and [86]. The variation of the squared length is inversely

proportional to the dimensionality of the projected subspace. As the dimensionality M

increases, the degree of concentration becomes sharper. Lemma 3.3 can be naturally

extended to the following lemma:


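Lemma 3.4: Let $u$ and $v$ be two arbitrary vectors in an $N$-dimensional Euclidean space, $u \in \mathbb{R}^N$ and $v \in \mathbb{R}^N$. Let $R$ be an $N \times M$ ($M < N$) matrix. Each entry of $R$ is an i.i.d. Gaussian random variable with mean zero and variance $\frac{1}{N}$, $r_{ij} \sim \mathcal{N}(0, \frac{1}{N})$, $i = 1, \ldots, N$, $j = 1, \ldots, M$. Let $x = \sqrt{\frac{N}{M}} R^T u$, $y = \sqrt{\frac{N}{M}} R^T v$, then:

\[
E[\|x - y\|^2] = \|u - v\|^2, \tag{3.8}
\]

\[
\mathrm{Var}[\|x - y\|^2] = \frac{2}{M} \|u - v\|^4. \tag{3.9}
\]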

Proof: Replace x by x− y, and u by u− v in Lemma 3.3.
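The two moments in Lemma 3.4 can be checked by simulation. This is a small Monte Carlo sketch, assuming NumPy; the function name `projected_sed_samples` and the sizes ($N = 200$, $M = 50$, 2000 trials) are illustrative assumptions only.

```python
import numpy as np


def projected_sed_samples(u, v, m, trials=2000, seed=0):
    """Sample ||x - y||^2 with x = sqrt(N/M) R^T u, y = sqrt(N/M) R^T v.

    A fresh N x M matrix R with i.i.d. N(0, 1/N) entries is drawn per trial.
    """
    rng = np.random.default_rng(seed)
    n = u.shape[0]
    diff = u - v
    samples = np.empty(trials)
    for k in range(trials):
        R = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, m))
        proj = np.sqrt(n / m) * (R.T @ diff)   # equals x - y by linearity
        samples[k] = proj @ proj
    return samples


rng = np.random.default_rng(1)
u = rng.normal(size=200)
v = rng.normal(size=200)
d2 = float(np.sum((u - v) ** 2))               # original squared distance

samples = projected_sed_samples(u, v, m=50)
print("empirical mean / ||u-v||^2        :", samples.mean() / d2)
print("empirical var  / (2/M)||u-v||^4   :", samples.var() / (2.0 / 50 * d2**2))
```

Both printed ratios should be close to 1 if Eqns. (3.8) and (3.9) hold.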

Lemma 3.4 shows that the expectation of the squared Euclidean distance (SED)

between two randomly projected vectors is the SED between the two original vectors,

and accordingly the variance is inversely proportional to the projected dimensionality.

The higher the projected dimensionality, the smaller the variance, and hence the better the SED between two vectors is preserved in the transformed domain. It should be noted that, since the entries of the projection matrix $R$ are i.i.d. Gaussian random variables, for a fixed vector $u$, all elements in the projected vector $x = R^T u$ are also independent Gaussian random variables. This is due to the 2-stability of the Gaussian distribution [86]: for any real numbers $a_1, a_2, ..., a_k$, if $\{q_i\}_{i=1}^{k}$ is a family of independent Gaussian random variables with zero mean and unit variance, and $X = \sum_{i=1}^{k} a_i q_i$, then $X \sim cN(0, 1)$, where $c = (a_1^2 + ... + a_k^2)^{1/2}$. Similarly, for a vector $u - v$, the elements of $R^T u - R^T v = R^T (u - v)$ are independent Gaussian random variables.

Lemma 3.5: For any $\varepsilon > 0$, and an integer $M$, let $u$ and $v$ be two arbitrary vectors in $N$-dimensional Euclidean space, $u \in \mathbb{R}^N$ and $v \in \mathbb{R}^N$. Let $R$ be an $N \times M$ ($M < N$) matrix. Each entry of $R$ is an i.i.d. Gaussian random variable with mean zero and variance $\frac{1}{N}$, $r_{ij} \sim N(0, \frac{1}{N})$, $i = 1, ..., N$, $j = 1, ..., M$. Let $x = \sqrt{\frac{N}{M}} R^T u$, $y = \sqrt{\frac{N}{M}} R^T v$, then we have the probability of:

$$P\left((1-\varepsilon)\|u-v\|^2 \le \|x-y\|^2 \le (1+\varepsilon)\|u-v\|^2\right) = G\left(\frac{M}{2}, \frac{(1+\varepsilon)M}{2}\right) - G\left(\frac{M}{2}, \frac{(1-\varepsilon)M}{2}\right), \tag{3.10}$$

where $G(a, x)$ is the regularized Gamma function, $G(a, x) = \frac{1}{\Gamma(a)} \int_0^x e^{-t} t^{a-1}\,dt$, and $\Gamma$ denotes the Gamma function [91].

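Proof: Let $x_j$ and $u_i$ denote the elements of vectors $x$ and $u$ respectively. We have:

$$E[x_j] = E\left[\sum_{i=1}^{N} \sqrt{\frac{N}{M}}\, r_{ij} u_i\right] = \sqrt{\frac{N}{M}} \sum_{i=1}^{N} E[r_{ij}]\, u_i = 0,$$

$$\begin{aligned}
\mathrm{Var}[x_j] &= \mathrm{Var}\left[\sum_{i=1}^{N} \sqrt{\frac{N}{M}}\, r_{ij} u_i\right] \\
&= \frac{N}{M} \sum_{i=1}^{N} \mathrm{Var}[r_{ij} u_i] \\
&= \frac{N}{M} \sum_{i=1}^{N} \left(E[r_{ij}^2 u_i^2] - E[r_{ij} u_i]^2\right) \\
&= \frac{N}{M} \sum_{i=1}^{N} E[r_{ij}^2 u_i^2] \\
&= \frac{N}{M} \sum_{i=1}^{N} \frac{1}{N} u_i^2 \\
&= \frac{1}{M} \|u\|^2.
\end{aligned}$$

Therefore $\sqrt{\frac{M}{\|u\|^2}}\, x_j \sim N(0, 1)$. Since the elements of $x$ are independent, let $Z = \frac{M\|x\|^2}{\|u\|^2}$; then the random variable $Z$ is distributed according to a Chi-square distribution. Replacing $x$ and $u$ by $x - y$ and $u - v$ respectively, $Z = \frac{M\|x-y\|^2}{\|u-v\|^2}$ also follows a Chi-square distribution with $M$ degrees of freedom. We have:

$$P\left(\|x-y\|^2 \le (1+\varepsilon)\|u-v\|^2\right) = P\left(Z \le (1+\varepsilon)M\right) = G\left(\frac{M}{2}, \frac{(1+\varepsilon)M}{2}\right),$$

$$P\left(\|x-y\|^2 \le (1-\varepsilon)\|u-v\|^2\right) = P\left(Z \le (1-\varepsilon)M\right) = G\left(\frac{M}{2}, \frac{(1-\varepsilon)M}{2}\right).$$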


Figure 3.1: Probability of preserving distance between two vectors as a function of M

and ε.

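Hence:

$$P\left((1-\varepsilon)\|u-v\|^2 \le \|x-y\|^2 \le (1+\varepsilon)\|u-v\|^2\right) = G\left(\frac{M}{2}, \frac{(1+\varepsilon)M}{2}\right) - G\left(\frac{M}{2}, \frac{(1-\varepsilon)M}{2}\right).$$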

Eqn. (3.10) provides a precise method for computing the probability of preserving the

SED between two vectors in the projected subspace. Figure 3.1 plots the probability as

a function of dimensionality M and error ε. It can be observed that for any fixed error ε,

the probability of preserving the distance between two vectors increases as the projected

dimensionality increases. On the other hand, for any fixed projected dimensionality, the

larger the error factor, the higher the probability of preserving the distance. For example,

even when projected to a low dimensionality of M = 200, with probability of 99.68%,

the SED between two vectors can be preserved up to an error factor of ε = 0.3.
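Eqn. (3.10) can be evaluated with the standard library alone. The sketch below is one possible implementation, not the thesis code: `reg_lower_gamma` is a hypothetical helper that computes the regularized Gamma function $G(a, x)$ with a textbook power series for $x < a + 1$ and a continued fraction for the upper tail otherwise, and `preservation_probability` is an illustrative name.

```python
import math


def reg_lower_gamma(a, x, tol=1e-14, max_iter=100000):
    """Regularized lower incomplete gamma G(a, x) for a > 0 (hypothetical helper)."""
    if x <= 0.0:
        return 0.0
    scale = math.exp(a * math.log(x) - x - math.lgamma(a))
    if x < a + 1.0:
        # Power series for G(a, x); every term is positive.
        term = total = 1.0 / a
        denom = a
        for _ in range(max_iter):
            denom += 1.0
            term *= x / denom
            total += term
            if term < total * tol:
                break
        return total * scale
    # Continued fraction (modified Lentz) for the upper tail 1 - G(a, x).
    tiny = 1e-300
    b = x + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < tol:
            break
    return 1.0 - h * scale


def preservation_probability(m, eps):
    """Eqn. (3.10): probability that the SED is preserved within (1 +/- eps)."""
    a = m / 2.0
    return reg_lower_gamma(a, (1.0 + eps) * a) - reg_lower_gamma(a, (1.0 - eps) * a)


# Setting quoted in the text: M = 200, eps = 0.3 (reported there as 99.68%).
p = preservation_probability(200, 0.3)
print(f"P(M=200, eps=0.3) = {p:.4f}")
```

Calling `preservation_probability` over a grid of `m` and `eps` values gives the kind of surface plotted in Figure 3.1.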

Having obtained the probability of preserving the distance between two fixed points, we can now apply the union bound to analyze the probability of preserving the pairwise distances for all $n$ points. Let $\lambda$ denote the probability in Eqn. (3.10); then for each of the $\binom{n}{2}$ pairs, the probability that the distortion falls outside the range $(1 \pm \varepsilon)$ is $1 - \lambda$. For all the $\binom{n}{2}$ pairs, the chance that some pair does not preserve the distance is at most $\binom{n}{2} \times (1 - \lambda)$. Hence the probability of preserving the pairwise distance for all the pairs simultaneously is at least $1 - \binom{n}{2} \times (1 - \lambda)$. This proves the following lemma:

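Lemma 3.6: For any $\varepsilon > 0$, and an integer $M$, let any set $B$ of $n$ points in $\mathbb{R}^N$ be represented as a matrix $D$ of size $N \times n$. Let $R$ be an $N \times M$ ($M < N$) matrix. Each entry of $R$ is an i.i.d. Gaussian random variable with mean zero and variance $\frac{1}{N}$, $r_{ij} \sim N(0, \frac{1}{N})$, $i = 1, ..., N$, $j = 1, ..., M$. Let $A = \sqrt{\frac{N}{M}} R^T D$, and let $f$ denote the map $\mathbb{R}^N \to \mathbb{R}^M$ from the $i$th column of $D$ to the $i$th column of $A$. Then with probability of at least $1 - \binom{n}{2} \times (1 - \lambda)$, for all $u, v \in B$,

$$(1-\varepsilon)\|u-v\|^2 \le \|f(u) - f(v)\|^2 \le (1+\varepsilon)\|u-v\|^2, \tag{3.11}$$

where $\lambda = G\left(\frac{M}{2}, \frac{(1+\varepsilon)M}{2}\right) - G\left(\frac{M}{2}, \frac{(1-\varepsilon)M}{2}\right)$.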

Lemma 3.6 offers a probability lower bound of preserving the distance for all the n

points when projected onto an arbitrary M -dimensional subspace. It can be seen that

the similarity preserving property is determined by three factors, i.e., the cardinality n,

the error factor ε, and the projected dimensionality M . In a pattern recognition problem,

the error factor ε depends on the discriminant power of data vectors, and the cardinality

n is the number of classes. Figure 3.2 plots the probability lower bound as a function of

M and ε with fixed n=100. It can be observed that for an n-class problem, if the original

data vectors are well separated, i.e., they can tolerate large error, even with a lower

dimensionality, the pairwise distances of all the points can be well preserved. Figure 3.3

depicts the relation of M and n with fixed ε=0.3. It can be observed that when n is

getting larger, the requirement of increasing the corresponding M becomes less stringent,

since still with high probability, the distances can be well preserved. Therefore, with an

appropriately selected M , the projection does not need to be altered when n increases

insignificantly. This is important for applications such as biometric systems, since we

may not want to change the projection whenever a new user is added to the system.


Figure 3.2: Probability of preserving distance for all n points as a function of M and ε

(n=100).

Figure 3.3: Probability of preserving distance for all n points as a function of M and n

(ε=0.3).

Different from existing work, which uses inequality properties to analyze the distance preserving probability between two points [86], this chapter offers a method to compute the exact probability of preserving pairwise distance. A direct gain of this is the possibility of lowering the lower bound of the required projection dimensionality $M_0$. To verify this, Figure 3.4 compares the lowest required projection dimensionality $M$ according to Lemma 3.6 with the lower bound $M_0$ provided in [86], which, to our knowledge, is the best known bound. In [86], it was shown that with probability of at least $1 - n^{-\gamma}$, where $\gamma$ controls the probability of success, the pairwise distance between all $n$ points can be preserved when projected onto a subspace whose dimensionality is at least $M_0 = \lceil (4 + 2\gamma)(\varepsilon^2/2 - \varepsilon^3/3)^{-1} \log n \rceil$. In the plot, the probability lower bound is set to $1 - \frac{1}{n}$ (corresponding to $\gamma = 1$ in [86]). It can be seen that our analysis provides a better lower bound $M_0$ than that illustrated in [86].

Figure 3.4: Comparison of lower bound of M with reference work.
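The comparison can be reproduced for a single setting with a short script. This is a hedged sketch: `reg_lower_gamma` is a hypothetical standard-library helper for the regularized Gamma function, the function names are illustrative, the linear search is only one simple way to find the smallest $M$, and the logarithm in the bound of [86] is assumed to be the natural logarithm.

```python
import math


def reg_lower_gamma(a, x, tol=1e-14, max_iter=100000):
    """Regularized lower incomplete gamma G(a, x) for a > 0 (hypothetical helper)."""
    if x <= 0.0:
        return 0.0
    scale = math.exp(a * math.log(x) - x - math.lgamma(a))
    if x < a + 1.0:
        term = total = 1.0 / a              # power series, all terms positive
        denom = a
        for _ in range(max_iter):
            denom += 1.0
            term *= x / denom
            total += term
            if term < total * tol:
                break
        return total * scale
    tiny = 1e-300                           # continued fraction for 1 - G(a, x)
    b = x + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < tol:
            break
    return 1.0 - h * scale


def all_pairs_lower_bound(n, m, eps):
    """Lemma 3.6: lower bound 1 - C(n, 2) * (1 - lambda) for n points."""
    a = m / 2.0
    lam = reg_lower_gamma(a, (1.0 + eps) * a) - reg_lower_gamma(a, (1.0 - eps) * a)
    return 1.0 - math.comb(n, 2) * (1.0 - lam)


def smallest_m_lemma_3_6(n, eps, target):
    """Smallest M whose Lemma 3.6 bound reaches the target probability."""
    m = 1
    while all_pairs_lower_bound(n, m, eps) < target:
        m += 1
    return m


def m0_reference(n, eps, gamma=1.0):
    """Bound quoted from [86]; natural log is an assumption made here."""
    return math.ceil((4.0 + 2.0 * gamma) / (eps**2 / 2.0 - eps**3 / 3.0) * math.log(n))


n, eps = 100, 0.3
m_exact = smallest_m_lemma_3_6(n, eps, target=1.0 - 1.0 / n)
m_ref = m0_reference(n, eps, gamma=1.0)
print(f"Lemma 3.6: M = {m_exact}, reference bound: M0 = {m_ref}")
```

For this setting the value from Lemma 3.6 should come out below the reference bound, in line with the trend shown in Figure 3.4.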

3.4 Changeability Analysis

In the proposed method, the biometric templates can be changed by simply varying the

RP matrix. To ensure strong changeability, the biometric templates that are generated

from the same user, using different RP matrices, should not be able to authenticate each

other. Considering a scenario where an imposter compromises the template of a user,

the user cancels the old template and generates a new one using a different RP matrix.


The imposter then tries to authenticate as the true user using the old template. In

this section, we use the subscripts p and g to represent the probing template and the

newly generated template of the claimed identity respectively. Since different projection matrices are used, $R_p \neq R_g$. To quantify the probability of error and illustrate the importance of translating the biometric data, we first consider a case where RP is applied on the biometric data directly, i.e., $x = \sqrt{\frac{N}{M}} R^T z$.

Assume $\sqrt{\frac{N}{M}} R_p = U Q_p$ and $\sqrt{\frac{N}{M}} R_g = U Q_g$, where $U$ is an $N \times M$ matrix with each entry an i.i.d. Gaussian random variable with mean zero and variance $\frac{1}{N}$, and $Q_p$ and $Q_g$ are two matrices of size $M \times M$. From Lemma 3.2, $U^T U \approx I$, we have $Q_p = \sqrt{\frac{N}{M}} U^T R_p$ and $Q_g = \sqrt{\frac{N}{M}} U^T R_g$. Due to the 2-stability of the Gaussian distribution, for fixed $R_p$ and $R_g$, the elements of $Q_p$ and $Q_g$ are also Gaussian random variables with zero mean and variance $1/M$, and the columns are almost orthonormal. Therefore, the problem can be formulated as $x_p = \sqrt{\frac{N}{M}} R_p^T z_p = (U Q_p)^T z_p = Q_p^T (U^T z_p)$ and $x_g = \sqrt{\frac{N}{M}} R_g^T z_g = (U Q_g)^T z_g = Q_g^T (U^T z_g)$. It is equivalent to first projecting the biometric data with the same projection matrix $U$, and then transforming the projected feature vector using different orthonormal matrices $Q_p$ and $Q_g$. When the same projection matrix is applied on the biometric data, the Euclidean distance between $z_p$ and $z_g$ is preserved as shown in the previous section.

For changeable biometrics, we are concerned with the probability of false accept when

different transformations are applied on the biometric data of the same user, denoted as

Pf in this chapter. Accordingly, the changeability, which is the probability of a template

to be changeable, can be defined as Pc = 1 − Pf . The higher the Pc, the better the

changeability. Since the transformation is random and almost orthogonal, it corresponds

to the rotation of a point in the hyper-sphere whose radius is specified by the length of

the point, i.e., the Euclidean distance between the point and the origin. We have:

$$P_f = P\left(l_{x_g} - t \le l_{x_p} \le l_{x_g} + t,\; S(x_g, x_p) \le t\right), \tag{3.12}$$

where $l$ denotes the length of the vector given in its subscript, $t$ is the system threshold, and $S$ represents the similarity function, i.e., Euclidean distance in this chapter. Since $x_p$ is a point chosen uniformly at random from the surface of the $M$-dimensional sphere with radius $l_{x_g}$, the computation of Eqn. (3.12) needs to be split into two cases: $l_{x_g} \le t$ and $l_{x_g} > t$, as shown in Figure 3.5.

Figure 3.5: Demonstration of computing probability of error in 2-D space.

In a 2-dimensional space, $P(S(x_p, x_g) \le t \mid l_{x_g} - t \le l_{x_p} \le l_{x_g} + t) = \frac{\pi t^2}{\pi (l_{x_g} + t)^2}$ when $l_{x_g} \le t$, and $P(S(x_p, x_g) \le t \mid l_{x_g} - t \le l_{x_p} \le l_{x_g} + t) = \frac{\pi t^2}{\pi (l_{x_g} + t)^2 - \pi (l_{x_g} - t)^2}$ when $l_{x_g} > t$. This can be easily extended to an $M$-dimensional space, where the volume of an $M$-dimensional hypersphere with radius $t$ is defined as [92]: $V_M = \frac{S_M t^M}{M}$, where $S_M$ is the hyper-surface area of an $M$-sphere of unit radius. In an $M$-dimensional space, we have:

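$$\begin{aligned}
P_1 &= P\left(S(x_p, x_g) \le t \mid l_{x_g} - t \le l_{x_p} \le l_{x_g} + t,\; l_{x_g} \le t\right) \\
&= \frac{S_M t^M / M}{S_M (l_{x_g} + t)^M / M} = \frac{t^M}{(l_{x_g} + t)^M},
\end{aligned} \tag{3.13}$$

$$\begin{aligned}
P_2 &= P\left(S(x_p, x_g) \le t \mid l_{x_g} - t \le l_{x_p} \le l_{x_g} + t,\; l_{x_g} > t\right) \\
&= \frac{S_M t^M / M}{S_M (l_{x_g} + t)^M / M - S_M (l_{x_g} - t)^M / M} = \frac{t^M}{(l_{x_g} + t)^M - (l_{x_g} - t)^M},
\end{aligned} \tag{3.14}$$

$$\begin{aligned}
P_f &= P(l_{x_g} \le t)\, P(l_{x_p} \le l_{x_g} + t \mid l_{x_g} \le t)\, P_1 \\
&\quad + P(l_{x_g} > t)\, P(l_{x_g} - t \le l_{x_p} \le l_{x_g} + t \mid l_{x_g} > t)\, P_2.
\end{aligned} \tag{3.15}$$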

From Eqn. (3.15), it is clear that the probability of error depends on the characteristics of the features, and the dimensionality $M$. In general, zero $P_f$ can not be achieved by applying RP on the biometric data directly. However, since $P(l_{x_p} \le l_{x_g} + t \mid l_{x_g} \le t)\, P_1 \le 1$, and $P(l_{x_g} > t)\, P(l_{x_g} - t \le l_{x_p} \le l_{x_g} + t \mid l_{x_g} > t) \le 1$, Eqn. (3.15) can be simplified as:

$$P_f \le P(l_{x_g} \le t) + \frac{t^M}{(l_{x_g} + t)^M - (l_{x_g} - t)^M}. \tag{3.16}$$

This probability can be minimized by adding an extra vector $d \in \mathbb{R}^N$, $d_i \gg t$, to the biometric data, $z' = z + d$, such that after RP, $P(l_{x_g} < t) = 0$. We have:

$$P_f \le \frac{t^M}{(l_{x_g} + t)^M - (l_{x_g} - t)^M}, \tag{3.17}$$

and

$$\lim_{\frac{t}{l_{x_g}} \to 0,\, \forall M} P_c = \lim_{\frac{t}{l_{x_g}} \to 0,\, \forall M} (1 - P_f) = 1. \tag{3.18}$$

It should be noted that the addition of vector $d$ does not change the similarity between two vectors since $\|R^T (u + d) - R^T (v + d)\|^2 = \|R^T u - R^T v\|^2$. The preceding analysis shows that with appropriate vector translation, the proposed method can produce biometric templates with changeability 1, by applying different RPs on the biometric data of the same user. The system threshold $t$ determines the choice of vector $d$. The elements of vector $d$ should satisfy $d_i \gg t$ such that $l_{x_g} \gg t$ and $P_c = 1$. This indicates the strong changeability of the proposed method.
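The behaviour of the bound in Eqn. (3.17) as the template length grows relative to the threshold can be illustrated numerically. The following is a minimal sketch; the function name `false_accept_bound` is illustrative, and the expression is evaluated in log space (dividing numerator and denominator by $l_{x_g}^M$) purely as an implementation choice to avoid overflow for large $M$.

```python
import math


def false_accept_bound(t, length, m):
    """Right-hand side of Eqn. (3.17): t^M / ((l + t)^M - (l - t)^M), for l > t > 0."""
    if not 0.0 < t < length:
        raise ValueError("the bound is used here only for 0 < t < length")
    r = t / length
    # (1 + r)^M - (1 - r)^M = (1 + r)^M * (1 - q), with q = ((1 - r) / (1 + r))^M < 1
    q = ((1.0 - r) / (1.0 + r)) ** m
    log_den = m * math.log1p(r) + math.log1p(-q)
    return math.exp(m * math.log(r) - log_den)


# Worked 2-D case: t = 1, l = 10 gives 0.01 / (1.21 - 0.81) = 0.025.
print(false_accept_bound(1.0, 10.0, 2))

# Increasing the length (e.g. via the translation vector d) or the
# dimensionality M drives the bound towards zero.
for length in (10.0, 100.0, 1000.0):
    print(length, false_accept_bound(1.0, length, 2), false_accept_bound(1.0, length, 200))
```

The loop shows the limit in Eqn. (3.18): as $t / l_{x_g}$ shrinks, the bound on $P_f$ vanishes and $P_c$ approaches 1.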

3.5 Privacy Analysis

To preserve the privacy of the users, it is expected that no information should be disclosed

if the stored biometric template is compromised. The proposed method utilizes RP for


biometric template generation. Due to the randomness of projection matrix, the user’s

privacy information can not be compromised if only the template is obtained by an

adversary. However, it is possible that an attacker can acquire more knowledge and

estimate the original signal.

Assuming the worst case that both the template and the projection matrix are compromised, then an adversary can estimate the original biometric data. For a robust privacy preserving mechanism, the estimated individual elements in the data vector should

not be exactly the same as the original ones. Furthermore, the global characteristics of

the estimated data vector should be far apart from the genuine data vector up to some

similarity functions.

Consider a projection function $x = R^T z$, $R \in \mathbb{R}^{N \times M}$, where the entries of $R$ are i.i.d. Gaussian random variables, and suppose an adversary tries to estimate the values of $z$. Since $M < N$, this is an under-determined system, where there are more unknowns than linear equations. There are infinitely many solutions that satisfy $x = R^T z$. To solve this problem, one classical approach is to find the minimum norm solution, using $\hat{z} = R (R^T R)^{-1} x$, where $R (R^T R)^{-1}$ is essentially the pseudo-inverse of $R^T$. Since $R^T R \approx I$, the above estimation function can be simplified as $\hat{z} = R x$.
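A small numerical illustration of the minimum norm estimate is given below. It is a sketch under stated assumptions: NumPy is assumed, `min_norm_estimate` is an illustrative name, the entries of $R$ are drawn with variance $1/N$ as elsewhere in the chapter, and the sizes are arbitrary.

```python
import numpy as np


def min_norm_estimate(R, x):
    """Minimum-norm solution of R^T z = x, i.e. z_hat = R (R^T R)^{-1} x."""
    return R @ np.linalg.solve(R.T @ R, x)


rng = np.random.default_rng(0)
N, M = 400, 100                                  # illustrative sizes, M < N
R = rng.normal(0.0, np.sqrt(1.0 / N), size=(N, M))
z = rng.normal(size=N)                           # "original" data vector
x = R.T @ z                                      # observed projection

z_hat = min_norm_estimate(R, x)

# The estimate is consistent with the observation, but it is only one of the
# infinitely many solutions and generally differs from the true z.
print("reproduces x     :", np.allclose(R.T @ z_hat, x))
print("||z_hat|| / ||z||:", np.linalg.norm(z_hat) / np.linalg.norm(z))
print("relative error   :", np.linalg.norm(z - z_hat) / np.linalg.norm(z))
```

The estimate satisfies the linear system exactly while remaining far from $z$, which is the starting point for the privacy analysis that follows.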

However, although the estimation involves an under-determined system, and hence there are infinitely many solutions, it is possible that an adversary can estimate part of the real values, and therefore reveal part of the user's information. If as many linearly

independent equations as the unknown elements can be found, then some elements may

be completely identified. To solve this problem, Du et al. [93] introduced the concept of

k-secure. For a matrix Q = RT of size M × N(M < N), if the remaining sub-matrix

after removing k columns of Q is still of full row rank, the matrix Q is called k-secure,

which guarantees that it is impossible to generate an equation (except the trivial zero

combination) that contains less than k+1 variables [93]. It is further shown in [80] and [93]

that for a matrix Φ of size (k+1)×N , where each row of Φ is a nonzero linear combination


of row vectors in Q, if Q is k-secure, the linear system of equations y = Φx involves at

least 2k+1 unknown variables. This property illustrates that if $Q$ is k-secure, any linear combination of the equations contains at least k+1 variables. Therefore, to solve the problem of identifying a few of the elements, the projected dimensionality should satisfy $M \le \frac{N}{2}$, such that each unknown variable is disguised by at least $M$ other variables [94]. Since it is impossible to find $M$ linearly independent equations that involve these $M$ variables, there are infinitely many solutions for each unknown variable, and therefore it is impossible to find the exact value of any element in the original data vector.

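Recall that the projection model in this chapter is $x = \sqrt{\frac{N}{M}} R^T z$; we can estimate $z$ using $\hat{z} = \sqrt{\frac{N}{M}} R x$ [80]. Since $x = \sqrt{\frac{N}{M}} R^T z$, we have $\hat{z} = \sqrt{\frac{N}{M}} R \sqrt{\frac{N}{M}} R^T z = \frac{N}{M} R R^T z$.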

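To analyze the statistical properties of the estimated individual element, let $\hat{z}_i$ be the $i$th element of the estimated data vector, and let $w'_{i,j}$ denote the elements of $W' = R R^T$. Using the results in Lemma 3.2, it is straightforward to derive that:

$$E[\hat{z}_i] = E\left[\sum_{j=1}^{N} \frac{N}{M} w'_{i,j} z_j\right] = z_i, \tag{3.19}$$

$$\begin{aligned}
\mathrm{Var}[\hat{z}_i] &= \mathrm{Var}\left[\sum_{j=1}^{N} \frac{N}{M} w'_{i,j} z_j\right] \\
&= E\left[\left(\sum_{j=1}^{N} \frac{N}{M} w'_{i,j} z_j\right)^2\right] - E\left[\sum_{j=1}^{N} \frac{N}{M} w'_{i,j} z_j\right]^2 \\
&= \frac{N^2}{M^2} E\left[\sum_{j=1}^{N} (w'_{i,j})^2 z_j^2 + 2 \sum_{j \neq k} w'_{i,j} z_j w'_{i,k} z_k\right] - z_i^2 \\
&= \frac{N^2}{M^2} E\left[\sum_{j=1}^{N} (w'_{i,j})^2 z_j^2\right] - z_i^2 \\
&= \left(\frac{2}{M} + 1\right) z_i^2 + \frac{1}{M} \sum_{j \neq i} z_j^2 - z_i^2 \\
&= \frac{1}{M} \sum_{j \neq i} z_j^2 + \frac{2}{M} z_i^2 \\
&= \frac{1}{M} \left(\|z\|^2 + z_i^2\right).
\end{aligned} \tag{3.20}$$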


It can be seen that the expected value of each estimated element is equal to the true value. Since no single element can be exactly recovered when $M \le \frac{N}{2}$, the variance of $\hat{z}_i$ can be considered as a measure of privacy.

Although the individual element in the original data vector can not be correctly

estimated, it is possible that the characteristics of the whole estimated data vector are

still close to the original data vector up to some similarity function. In this case, the

privacy of the user still can not be protected. To solve this problem, we should make sure

that the estimated data vector has a large distance to the original one, i.e., $\|\hat{z} - z\|^2 > \phi$,

where ϕ is a privacy threshold. For a biometric verification problem, the privacy threshold

value ϕ represents the natural variance of face images, and should be set to a value that is

larger than the largest possible distance between data vectors of the same human subject.

To quantify the probability of preserving privacy, we first note that the estimation error of individual elements $\hat{z}_i - z_i$ approximates a Gaussian distribution with zero mean and variance $\frac{\|z\|^2 + z_i^2}{M}$. This is due to the fact that the elements of $R$ are i.i.d. Gaussian random variables, and according to the Central Limit Theorem (CLT) [95], the elements of $W' = R R^T$ are also almost Gaussian. To validate this, we generate a random vector

of size 10000× 1 and normalize it to unity length. This vector is considered as the data

vector. A matrix of size 10000×500 is then generated randomly with each entry an i.i.d.

Gaussian random variable. The data vector is then projected onto a low-dimensional

space using the generated random matrix, followed by a reconstruction procedure as

described above. This process is repeated 1000 times on the same data vector using

different random matrices. Figure 3.6 plots the estimation error of the first element of

the data vector. It can be seen that experimental error distribution fits well with the

statistics shown in Eqn. (3.19) and Eqn. (3.20).
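The validation experiment can be imitated at a smaller scale. The sketch below is not the original experiment code: it assumes NumPy, uses $N = 500$ and $M = 50$ instead of $10000$ and $500$ to keep the run short, assumes entries of variance $1/N$ as in the projection model of this chapter, and the function name `first_element_errors` is illustrative.

```python
import numpy as np


def first_element_errors(z, m, trials=1000, seed=0):
    """Estimation error of the first element over independent random matrices.

    Each trial projects z with x = sqrt(N/M) R^T z and reconstructs with
    z_hat = sqrt(N/M) R x; only the first element of z_hat is needed.
    """
    rng = np.random.default_rng(seed)
    n = z.shape[0]
    errors = np.empty(trials)
    for k in range(trials):
        R = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, m))
        x = np.sqrt(n / m) * (R.T @ z)          # projected template
        z_hat_0 = np.sqrt(n / m) * (R[0] @ x)   # first element of sqrt(N/M) R x
        errors[k] = z_hat_0 - z[0]
    return errors


rng = np.random.default_rng(42)
z = rng.normal(size=500)
z /= np.linalg.norm(z)                          # unit-length data vector
M = 50

errors = first_element_errors(z, M)
predicted_var = (1.0 + z[0] ** 2) / M           # Eqn. (3.20) with ||z||^2 = 1
print("empirical mean of error:", errors.mean(), "(theory: 0)")
print("empirical variance     :", errors.var())
print("predicted variance     :", predicted_var)
```

A histogram of `errors` against a zero-mean Gaussian with the predicted variance gives a picture analogous to Figure 3.6.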

Figure 3.6: Gaussian approximation of estimation error.

For real applications, such as face recognition, the dimensionality of a face image vector $N$ is usually large, and $|z_i|^2 \ll \|z\|^2$. The expected value and variance of $\hat{z}_i - z_i$ are $E[\hat{z}_i - z_i] = 0$ and $\mathrm{Var}[\hat{z}_i - z_i] \approx \frac{\|z\|^2}{M}$. Due to $\hat{z}_i - z_i \sim N(0, \frac{\|z\|^2}{M})$, we have $\sqrt{\frac{M}{\|z\|^2}} (\hat{z}_i - z_i) \sim N(0, 1)$, and therefore $\frac{M}{\|z\|^2} \|\hat{z} - z\|^2 = \frac{M}{\|z\|^2} \sum_{i=1}^{N} (\hat{z}_i - z_i)^2$ follows a Chi-square distribution with $N$ degrees of freedom. Then the probability of $\|\hat{z} - z\|^2 > \phi$ can be computed as:

$$P\left(\|\hat{z} - z\|^2 > \phi\right) = P\left(\frac{M}{\|z\|^2} \|\hat{z} - z\|^2 > \frac{M\phi}{\|z\|^2}\right) = 1 - G\left(\frac{N}{2}, \frac{M\phi}{2\|z\|^2}\right), \tag{3.21}$$

where $G$ denotes the regularized Gamma function.
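Eqn. (3.21) can be evaluated directly once the regularized Gamma function is available. The sketch below uses only the standard library; `reg_lower_gamma` is a hypothetical helper (power series plus continued fraction), `privacy_probability` is an illustrative name, and the numbers ($N = 1000$, $\|z\|^2 = 1$, $\phi = 4$) are arbitrary values chosen to make the dependence on $M$ visible.

```python
import math


def reg_lower_gamma(a, x, tol=1e-14, max_iter=100000):
    """Regularized lower incomplete gamma G(a, x) for a > 0 (hypothetical helper)."""
    if x <= 0.0:
        return 0.0
    scale = math.exp(a * math.log(x) - x - math.lgamma(a))
    if x < a + 1.0:
        term = total = 1.0 / a              # power series, all terms positive
        denom = a
        for _ in range(max_iter):
            denom += 1.0
            term *= x / denom
            total += term
            if term < total * tol:
                break
        return total * scale
    tiny = 1e-300                           # continued fraction for 1 - G(a, x)
    b = x + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < tol:
            break
    return 1.0 - h * scale


def privacy_probability(n, m, phi, z_sq_norm):
    """Eqn. (3.21): P(||z_hat - z||^2 > phi) = 1 - G(N/2, M*phi / (2*||z||^2))."""
    return 1.0 - reg_lower_gamma(n / 2.0, m * phi / (2.0 * z_sq_norm))


# Illustrative values (assumptions): N = 1000, ||z||^2 = 1, phi = 4.
for m in (100, 250, 500):
    print(m, privacy_probability(1000, m, phi=4.0, z_sq_norm=1.0))
```

With $N$ and $\|z\|^2$ fixed, the printed probability should fall as $M$ grows, matching the monotonic behaviour described in the text.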

It can be seen that the probability of preserving privacy with respect to $\phi$ is associated with the dimensionality $N$, the squared length of the data vector $\|z\|^2$, and the projected dimensionality $M$. When $N$ and $\|z\|^2$ are fixed, the probability value monotonically increases as $M$ decreases. The above analysis is based on the minimum $l_2$ norm reconstruction model. Recent advances in the theory of Compressive Sensing (CS) [96], an area closely related to random projection, demonstrate that for a $K$-sparse signal of dimensionality $N$, the minimum $l_0$ norm reconstruction model can recover the original signal through exhaustive enumeration of all $\binom{N}{K}$ possible combinations, and the minimum $l_1$ norm reconstruction model can exactly reconstruct $K$-sparse vectors and stably approximate compressible vectors with high probability in polynomial time when $M \ge M_0 = cK\log(N/K)$. Therefore, to ensure privacy, the projected dimensionality $M$ should be set to a smaller value than $M_0$. However, as shown in the previous section, the $M$ value is also associated with the similarity preserving property. This demonstrates that the RP based method has a tradeoff between the privacy level and verification accuracy: a higher projected dimensionality gives better accuracy but possibly a lower privacy level, and vice versa.

Recall that the variance of the estimated individual element (Eqn. (3.20)) and the

probability of privacy preserving (Eqn. (3.21)) will both increase as the squared length

of the data vector ‖z‖2 increases. Therefore, the translation vector d, which is used to

enhance the changeability, can enlarge the vector length and be used as a complementary

approach to enhance privacy. It should be noted that when the vector d is also obtained

by the adversary, the privacy level is not improved but remains the same as without

translation. In real applications, the d vector is not associated with the user’s key, and

can be kept secret by a central controller.

3.6 Experimental Results

To evaluate and compare the performance of the introduced method for face based biometric verification, we conduct experiments on a generic database that consists of face images from several well-known face databases [97]. In this section, we first give a description of the employed database, followed by the experimental results along with detailed discussion.

3.6.1 Database Description

In real life face recognition application scenarios, it is common that the user’s face images

are not available for training. As such, the intrinsic properties of the human subjects


are usually trained from subjects that are not those to be recognized. Moreover, other conditions such as illumination, resolution, lighting, facial expression, and pose, may vary from time to time, such that the image conditions for training and testing are different. To simulate such situations, a generic database was organized in [97]. The generic

database originally contains 5676 images of 1020 subjects from 5 well-known databases,

FERET [98, 99], PIE [100], AR [101], Aging [102], and BioID [103]. In the FERET

database, 3817 images of 1194 subjects are officially provided with eye coordinates. In

addition, 1016 more images have had the eye coordinates manually determined by the

author of [97]. Therefore, altogether 3881 images of 750 subjects with at least 3 images

per subject are collected from the FERET database to form the data set. For the PIE

database [100], 816 images of 68 subjects are selected. In detail, 7 different poses and 5

different lighting conditions are included. Following the PIE’s naming rule, pose group

[27,37,05,29,11,07,09] is selected, which contains both horizontal and vertical rotations.

Images with pose variations are under normal lighting conditions and with neutral expressions. For illumination variations, 5 frontal face images with neutral expressions are randomly selected from all 21 different illumination conditions with room lighting on. For the AR database [101], all ground-truthed images (480 images of 120 subjects) are included. Although the images in the FG-NET Aging database [102] are all ground-truthed

images, some of the low-quality or extremely difficult to recognize images are discarded

(e.g., baby images and old adult images are not selected at the same time for a specific

subject). Finally 276 images of 63 subjects are included. For the BioID database [103],

227 images of 20 subjects are selected to form the data set. A detailed configuration of

the whole data set is illustrated in the Table 3.1, with some example images shown in

Figure 3.7.

For face verification, we exclude image samples with large pose variation ($> 15^\circ$), and select 4666 images for our experiments. The detailed configuration of the verification data set is given in Table 3.2.


Figure 3.7: Image examples from the generic data set.

All the selected color images are first transformed to gray scale images by taking the

luminance component in YCbCr color space. All images are preprocessed according to

the recommendation of the FERET protocol, which includes: (1) images are rotated and

scaled so that the centers of the eyes are placed on specific pixels and the image size is

150 × 130; (2) a standard mask is applied to remove non-face portions; (3) histogram

equalized and image normalized to have zero mean and unit standard deviation. The

three steps for image preprocessing are illustrated in Figure 3.8. Finally, each image is

represented as a vector of dimensionality 17154.

In our experiments, we randomly select samples from 520 subjects as the training set, and use the samples of the remaining 500 subjects as the testing set. The training set includes 2388 images, and the testing set contains 2278 images. There is no overlap between the training and the testing subjects. To simulate a real application, we perform evaluation on an exhaustive basis, where every single image is used as a template once, and the rest of the images as the probe set. All the elements in the translation vector, $d_i$, $i = 1, 2, ..., N$, are set to 100, and the same $d$ is applied to all users. To minimize the effect of randomness, all the experiments were performed 5 times, and the average of the results is reported.


Database    No. of subjects    No. of images per subject    No. of images
FERET       750                ≥ 3                          3881
AR          119                4                            476
Aging       63                 ≥ 3                          276
BioID       20                 ≥ 6                          227
PIE         68                 12                           816
Total       1020               ≥ 3                          5676

Table 3.1: Identification data set configuration.

Figure 3.8: Procedures for image preprocessing.

3.6.2 RP vs PCA

For the purpose of comparative study, we first need to compare the performance of RP with other dimensionality reduction tools. Principal Component Analysis (PCA) [104] and Linear Discriminant Analysis (LDA) [105] are two of the most popular methods for dimensionality reduction, and have been used extensively in the literature as powerful tools for face recognition applications. LDA is a supervised learning technique that provides a class specific solution. It produces the optimal feature subspace in such a way that the ratio of the between- and within-class scatters is maximized. Although LDA based algorithms are superior to PCA based methods in some cases, it is shown in [106] that PCA outperforms LDA when the training sample size is small and the training images are less representative of the testing subjects. It is confirmed in [97] that PCA performs much better than LDA in a generic learning scenario, where the image samples of the human subjects are not available for training. Since the small sample size (SSS) problem and the unavailability of training images are common in real life applications, and PCA provides more reliable performance, we adopt the PCA algorithms for comparison in this chapter.

Database    No. of subjects    No. of images per subject    No. of images
FERET       750                ≥ 2                          3029
AR          119                4                            476
Aging       63                 ≥ 3                          276
BioID       20                 ≥ 6                          227
PIE         68                 ≥ 8                          658
Total       1020               ≥ 2                          4666

Table 3.2: Verification data set configuration.

PCA is an unsupervised learning technique which provides an optimal, in the least mean square error sense, representation of the input in a lower dimensional space. In the eigenfaces method [104], given a training set $Z = \{Z_i\}_{i=1}^{C}$, containing $C$ classes with each class $Z_i = \{z_{ij}\}_{j=1}^{C_i}$ consisting of a number of face images $z_{ij}$, a total of $K = \sum_{i=1}^{C} C_i$ images, the PCA is applied to the training set $Z$ to find the $K$ eigenvectors of the covariance matrix,

$$S_{cov} = \frac{1}{K} \sum_{i=1}^{C} \sum_{j=1}^{C_i} (z_{ij} - \bar{z})(z_{ij} - \bar{z})^T, \tag{3.22}$$

where $\bar{z} = \frac{1}{K} \sum_{i=1}^{C} \sum_{j=1}^{C_i} z_{ij}$ is the average of the ensemble. The eigenfaces are the first $M$ ($\le K$) eigenvectors corresponding to the largest eigenvalues, denoted as $\Psi$. The original image is transformed to the $M$-dimensional face space by a linear mapping:

$$x_{ij} = \Psi^T (z_{ij} - \bar{z}).$$
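The eigenfaces mapping can be sketched in a few lines. This is an illustrative implementation, assuming NumPy, and not necessarily how the experiments were coded: it obtains the leading eigenvectors of $S_{cov}$ from a singular value decomposition of the centered data, a common shortcut when the number of images $K$ is much smaller than the image dimensionality $N$. The function names and the synthetic data are assumptions.

```python
import numpy as np


def fit_eigenfaces(Z, m):
    """Z: (K, N) array with one training image vector per row.

    Returns the mean image z_bar (N,) and Psi (N, m), whose columns are the
    m eigenvectors of S_cov with the largest eigenvalues.
    """
    z_bar = Z.mean(axis=0)
    centered = Z - z_bar
    # Right singular vectors of the centered data are eigenvectors of
    # S_cov = centered^T centered / K, ordered by decreasing eigenvalue.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return z_bar, vt[:m].T


def project(z, z_bar, psi):
    """Linear mapping x = Psi^T (z - z_bar)."""
    return psi.T @ (z - z_bar)


# Synthetic stand-in for image vectors (assumption): K = 40 samples, N = 300.
rng = np.random.default_rng(0)
Z = rng.normal(size=(40, 300)) * np.linspace(3.0, 0.1, 300)
z_bar, psi = fit_eigenfaces(Z, m=10)

features = (Z - z_bar) @ psi                     # rows are x_ij for the training set
print("Psi shape:", psi.shape)
print("feature variances (decreasing):", features.var(axis=0).round(2))
```

A test image is mapped with `project(z, z_bar, psi)`, using the mean and basis learned from the training subjects only.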

The PCA transformation matrix $\Psi$ and mean image $\bar{z}$ are obtained based on the

images in the training set, and the images in the testing set are used for evaluation.

There is no overlap between the training and testing human subjects. Since RP does not

need a training process, to produce comparable results, we perform evaluation on the

same set of testing images as PCA. In the report of the experimental results, RP denotes

the application of random projection on the high-dimensional image vectors directly.

Figure 3.9 compares the obtained EER at different dimensionalities when PCA and RP

are applied as feature extractors respectively. It can be seen that PCA provides better

EER than RP, and the verification accuracy of RP improves at higher dimensionality.

This is because PCA projects the image vectors to directions with highest variance, while

RP projects to random directions. As shown in Lemma 3.6, as the dimensionality M

increases, with higher probability the Euclidean distance can be preserved up to a smaller

error factor, hence the performance improves.

Another observation is that the verification accuracy of both methods levels off after

certain dimensions, 100 for PCA (EER=17.54%), and 200 for RP (EER=18.68%) in our

experiments. For PCA, the projected features after a certain dimension will have very

small variance, therefore contribute little to the classification. For RP, the verification

accuracy is associated with both the dimensionality of the projected features, and the

discriminant power of the image vectors. When M exceeds a certain dimension, with

probability 1, the Euclidean distance can be preserved up to a very small error factor,

and therefore the verification accuracy depends on the separability of the original image

vectors. To illustrate this, we performed experiments on the non-projected original image

vectors, where Euclidean distance is used as the dissimilarity measure. This produces an


Figure 3.9: EER obtained by using PCA and RP as feature extractors.

EER of 18.19%. Figure 3.10 plots the Receiver Operating Characteristic (ROC) curves of

RP (M = 200), and the verification results of the original image vectors. The ROC curves

are plotted by Genuine Acceptance Rate (GAR, complement of FRR) against FAR. It

can be observed that RP and original images have almost overlapping ROC curves. This

demonstrates that the Euclidean distance of the original images can be approximately

preserved. Generally, in a face recognition problem, PCA provides more discriminatory

representation than the original noisy face images. This explains why PCA outperforms

RP in our experimentation.

3.6.3 RP vs PCARP

Although the PCA algorithm performs better than RP in general, it provides neither privacy protection nor revocability. To solve these problems, a possible solution is to apply RP on dimensionality reduced PCA feature vectors, as in [73]. In this chapter, this method is denoted as PCARP. Due to the fact that the original image can be approximately reconstructed from its PCA coefficients, the revealing of these PCA coefficients can be considered as a privacy breach. To protect the PCA coefficients, the PCARP projected features should satisfy $M \le \frac{J}{2}$, where $J$ is the dimensionality of PCA feature vectors.

Figure 3.10: ROC curve of RP and original image vectors.

Depending on the application context, the proposed changeable biometric system can

be implemented in two scenarios: user-independent (UI) and user-dependent (UD). In the

UI scenario, all the users use the same projection, i.e., same key to generate the same set

of random matrices. The randomness generation key can be controlled by the application

provider, and therefore the users do not need to carry the key for authentication. The

UD scenario is a two-factor scheme that requires user-specific projection, i.e., each user uses a different key to generate a different random matrix. In both cases, the biometric

template can be regenerated by simply changing the key.
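The key-driven template generation described above can be sketched as follows. This is a hypothetical illustration, assuming NumPy: the text does not specify how a key is turned into a random matrix, so seeding `numpy.random.default_rng` with an integer key is an assumption made here for demonstration, and this generator is not intended as a cryptographically secure one. The function names are illustrative.

```python
import numpy as np


def rp_matrix_from_key(key, n, m):
    """Derive an N x M matrix with i.i.d. N(0, 1/N) entries from an integer key."""
    rng = np.random.default_rng(key)     # assumption: the key acts as the seed
    return rng.normal(0.0, np.sqrt(1.0 / n), size=(n, m))


def make_template(z, key, m, d=None):
    """Template x = sqrt(N/M) R^T (z + d), with R determined by the key."""
    n = z.shape[0]
    R = rp_matrix_from_key(key, n, m)
    shifted = z if d is None else z + d  # optional translation vector d
    return np.sqrt(n / m) * (R.T @ shifted)


z = np.random.default_rng(7).normal(size=300)    # stand-in feature vector

t_old = make_template(z, key=1234, m=60)
t_same = make_template(z, key=1234, m=60)        # same key: same template
t_new = make_template(z, key=9999, m=60)         # new key: reissued template

print("same key reproduces template:", np.array_equal(t_old, t_same))
print("distance old vs new template:", np.linalg.norm(t_old - t_new))
```

In the UI scenario a single key would be shared by all users, whereas in the UD scenario each user would hold a distinct key; in both cases changing the key yields a new template from the same biometric data.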

Since the UD scenario is a two factor scheme, there exist some situations that need to

be considered: both-legitimate, stolen-key, and stolen-biometrics. In the both-legitimate

case, different users utilize distinct keys for RP, and it is assumed that the key and bio-

metric data are not stolen. As we discussed in Chapter 2, the UD projection does not

change the FRR since the same projection is still applied to the biometric representation

of the same user. If the data points that are originally within a distance of t to a

vector u are rejected by using a different projection, then it provides changeability for u.

Therefore, the generated FAR in the both-legitimate case provides a measure of change-

ability, and the smaller the FAR, the better the changeability. The stolen-biometrics

case is essentially the changeability problem, and it has the same performance as the

both-legitimate case. In the stolen-key case, the same random matrix is applied to the

biometric features of both the genuine and imposter users. This is equivalent to the UI

scenario. Therefore, the performance of the changeable face verification system can be

evaluated through experiments on UI and UD (both-legitimate) scenarios.

User-independent Scenario

In the UI scenario, all the users utilize the same projection matrix. Since the same vector

d is used for all users, the translation procedure does not affect the similarity between

vectors. Figure 3.11 compares the obtained EER of PCARP and RP at different M ,

with the dimensionality of PCA vectors J = 2 × M . Table 3.3 lists the experimental

results of PCA, RP, and PCARP. Overall, RP and PCARP achieve similar performance,

and both produce lower recognition accuracy compared with the original PCA method.

Due to the fact that PCA features provide better discriminant power than the original

image vectors, the PCARP method requires lower dimensionality than the RP method

to achieve the same accuracy. Figure 3.12 shows the ROC curve of RP and PCARP at

M = 200. It can be observed that RP and PCARP have almost overlapping ROC curves.

User-dependent Scenario

The UD scenario is a two-factor scheme where each user utilizes a distinct projection

matrix. Figure 3.13 depicts the obtained EER as a function of dimensionality. It can be

seen that when RP is applied directly on image or PCA feature vectors, zero EER can not


Figure 3.11: EER obtained in the user-independent scenario.

Figure 3.12: ROC curve of RP and PCARP in the user-independent scenario.


Method   Projected dimensionality
         20      60      100     120     160     200     220     260     300
PCA      19.25   17.79   17.54   17.46   17.48   17.44   17.49   17.46   17.46
RP       23.68   20.66   19.67   19.36   19.01   18.68   18.69   18.62   18.63
PCARP    21.25   19.24   18.43   18.5    18.4    18.37   18.27   18.42   18.49

Table 3.3: Experimental results (EER, in %) of PCA, RP, and PCARP at different dimensionalities.

Figure 3.13: EER obtained in the user-dependent scenario.

be obtained. The EER decreases as the dimensionality increases. This is consistent with

our analysis in Eqn. (3.15), that the probability of error depends on the characteristics of

the vectors and the dimensionality M , and the probability of error decreases at higher M .

However, as shown in Eqn. (3.18), by proper translation, the ratio of system threshold

and the length of vector approaches zero, and zero error rate can be obtained. This is

confirmed in our experiments that zero EER is obtained at all dimensionalities.

The above experimental results demonstrate that it is possible to produce zero EER

when the biometric data and the projection matrix generation key are both legitimate.


Figure 3.14: ROC curve of RP in the user-dependent scenario.

Figure 3.15: ROC curve of PCARP in the user-dependent scenario.


Figures 3.14 and 3.15 depict the ROC curves of RP and PCARP at M = 200, respectively,

with the threshold values selected based on stolen-key case. It can be observed that

in the both-legitimate case, without vector translation, the FAR is dependent on the

system threshold value, and hence can not provide strong changeability. On the other

hand, with proper vector translation, zero FAR is obtained for all selections of threshold

values. This demonstrates that the biometric template is strongly changeable, and the FAR is

zero even if the biometric data are stolen. In the stolen-key case, the performance is evaluated

by using the same projection matrix for all the users. For fixed system threshold, the

FRR is the same for the both-legitimate case and the stolen-key case. A smaller FRR

will produce higher FAR when the key is stolen, and vice versa. The selection of the

system threshold is dependent on the requirement of the applications.

Changeability

As discussed before, the performance in the UD scenario actually implies the changeabil-

ity of the proposed method. The smaller the FAR, the stronger the changeability. Since a

zero FRR corresponds to the largest threshold value t, zero EER indicates strong change-

ability. To confirm this point, we also demonstrate the changeability of the proposed

method through experiments on RP and PCARP projected features. The image samples

from the same user are projected using different RP matrices and matched against each

other. Each individual image is also matched against itself by using different projection

matrices. The experiment consists of a total number of 13922 verification attempts. The

experimental results are shown in Figure 3.16, where the FAR is plotted as a function of

the system threshold t, and CH denotes changeability experiments. The t is normalized

such that 0 represents the lowest value, and 1 is the highest value. The obtained FAR

in the UD scenario is also depicted for comparison purposes. Since Euclidean distance

is applied as the dissimilarity measure, a smaller t means lower FAR and higher FRR,

and vice versa. It can be observed that without vector translation, the changeability is


Figure 3.16: Experimental results for changeability: RP (left) and PCARP (right).

dependent on t, and hence can not produce strong changeability. On the other hand,

with proper translation, it is capable of producing zero FAR for all selections of system

threshold values, i.e., for any system. The FAR plots of the changeability experiments and of

the UD scenario almost overlap; this confirms that the performance in the

UD scenario indicates changeability of the system. In the remainder of the thesis, we

will use the performance in UD scenario to demonstrate the changeability.
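The relation between the threshold t, the FAR, and the FRR under a distance-based decision rule can be illustrated with a short sketch. The distance values below are made-up toy numbers, not thesis results, and the function names are hypothetical; a claim is assumed to be accepted when its distance does not exceed t.

```python
def far_frr(genuine, impostor, t):
    """Return (FAR, FRR) when a claim is accepted if its distance is <= t."""
    far = sum(d <= t for d in impostor) / len(impostor)
    frr = sum(d > t for d in genuine) / len(genuine)
    return far, frr


def approximate_eer(genuine, impostor):
    """Scan observed distances as thresholds; return (t, FAR, FRR) where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        if best is None or abs(far - frr) < abs(best[1] - best[2]):
            best = (t, far, frr)
    return best


# Toy dissimilarity scores (illustrative only).
genuine = [0.10, 0.20, 0.30, 0.60]
impostor = [0.50, 0.70, 0.80, 0.90]

for t in (0.2, 0.4, 0.6):
    print(t, far_frr(genuine, impostor, t))  # smaller t: lower FAR, higher FRR
print(approximate_eer(genuine, impostor))
```

A plot of 1 - FRR (the GAR) against FAR over all thresholds gives an ROC curve of the kind shown in the figures.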

3.6.4 Discussion

The experimental results indicate that RP incurs a slight degradation in verification accuracy

compared with the PCA based method. However, the RP method preserves the user’s

privacy if the stored template is compromised. The RP based privacy preserving solu-

tion can be applied on either high-dimensional image vectors or dimensionality reduced

feature vectors. As shown in our experiments, PCARP and RP methods produce similar

performance in UI scenario, and both are capable of producing zero EER in UD scenario

with proper vector translation. In the UD scheme, if the key is stolen, the performance

will be the same as in the UI scenario. If only the biometric data is stolen, the FAR will

be zero. This also explains the changeability, which means that two biometric vectors


that are generated from the same biometric using different projection matrices can not

be used to authenticate each other successfully.

An advantage of the PCARP method is that it can produce similar performance at

a lower dimensionality. However, the PCA based method requires a training process,

which usually involves a large number of training images, and hence it has much higher

computational requirements. Also, the collection of these training images poses a privacy

problem. On the other hand, the RP method is data independent, does not require

training, and is much easier to implement. More importantly, the PCARP method is

vulnerable to cross-matching attack. For example, given a PCA vector of dimensionality

J = 200, to produce privacy preserving template, and also highest possible accuracy, we

can project the PCA features to a vector of size M = 100 using RP. However, if the tem-

plates of two applications that use the same set of PCA coefficients are revealed, and the

RP matrices for these two applications are different and are also obtained, then an adversary

can form a set of J linear equations with J unknowns, and the PCA coefficients can be

exactly reconstructed. By using RP directly on image vectors, since the dimensionality

of such vectors is usually very high (e.g. N = 17154 in the generic data set), and the

projected dimensionality is low (e.g. M = 200), an adversary will need to compromise

$\lceil N/M \rceil = 86$ templates from one user to recover the original image. Although it is possible

to produce better verification accuracy using advanced feature extraction method, the

vulnerability to cross-matching attack is essentially a weakness of applying RP to such

low-dimensional feature vectors. Considering all these aspects, RP on high-dimensional

image vectors is a more appropriate solution for privacy preserving biometric verification.
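The counting argument behind the cross-matching attack can be checked with a few lines of arithmetic. The sketch assumes that each compromised template, together with its known RP matrix, contributes M linear equations, and that exact recovery needs at least as many equations as unknowns; the function name is hypothetical.

```python
def templates_needed(num_unknowns, m):
    """Smallest number of M-dimensional templates giving >= num_unknowns equations."""
    return -(-num_unknowns // m)  # integer ceiling division


# PCARP: J = 200 PCA coefficients projected to M = 100.
print(templates_needed(200, 100))

# RP on image vectors: N = 17154 pixels projected to M = 200.
print(templates_needed(17154, 200))
```

The large gap between the two counts is what makes RP on high-dimensional image vectors harder to attack through cross-matching.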

3.7 Summary

This chapter has presented a systematic analysis of random projection based method for

addressing the challenging problem of template changeability and privacy protection in


biometrics enabled verification systems. Two different scenarios, user-independent and

user-dependent random projection have been discussed. Detailed mathematical analysis

shows that the similarity between two vectors can be approximately preserved when

projected onto a random subspace with appropriate dimensionality. We have introduced

a precise method for computing the probability of preserving distance between two points

with respect to the error factor and projected dimensionality, and provided a probability

lower bound of preserving the pairwise distance for all the points. Our method achieves

better dimensionality lower bound than existing works. The user-dependent scenario is a

two-factor scheme that utilizes user-specific matrix for random projection. We have used

a geometric-based approach to approximate the probability of error, and introduced an

effective method of vector translation to improve the changeability.

The proposed method produces changeable biometric templates, which can be achieved

by simply varying the RP matrix. To explore the privacy preserving characteristics of

such method, we have provided detailed analysis in both the estimation of individual

element and the whole vector. For the purpose of comparative study, we have performed

computer simulations by using RP on both image vectors and PCA reduced feature

vectors. Experimental results show that these two methods have similar verification

accuracy in user-independent scenario, and are both capable of producing zero EER in

user-dependent scenario with vector translation. It is pointed out that better privacy pro-

tection can be obtained by applying RP on high-dimensional image vectors directly, which

is also data-independent, computationally economical, and easy to implement. However,

due to the noisy representation of the original image vectors, and the requirement of

using lower projected dimensionality to ensure privacy, the recognition performance is

usually degraded when RP is applied on high-dimensional image vectors. To improve the

recognition performance while maintaining privacy protection, a sorted index number

approach is introduced to preserve privacy for discriminant low-dimensional feature

vectors, which will be presented in the following chapters.


3.8 Appendix

3.8.1 Appendix 3-I

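Proof of Lemma 3.2: Let $W = R^T R$, $R \in \mathbb{R}^{N \times M}$, where the entries of $R$ are i.i.d. Gaussian random variables, $r_{ij} \sim N(0, \frac{1}{N})$. Let $w_{ij}$ denote the elements of $W$.

If $i = j$, we have:

\[
E[w_{ij}] = E\left[\sum_{k=1}^{N} r_{kj}^2\right] = \sum_{k=1}^{N} E[r_{kj}^2] = N \times \frac{1}{N} = 1.
\]

Since the entries of $R$, $r_{ij}$, are i.i.d. Gaussian random variables with mean zero and variance $\frac{1}{N}$, the random variable $Z = N \sum_{k=1}^{N} r_{kj}^2$ follows a Chi-square distribution with $N$ degrees of freedom:

\[
\mathrm{Var}\left[N \sum_{k=1}^{N} r_{kj}^2\right] = N^2 \, \mathrm{Var}\left[\sum_{k=1}^{N} r_{kj}^2\right] = 2N.
\]

Hence:

\[
\mathrm{Var}[w_{ij}] = \mathrm{Var}\left[\sum_{k=1}^{N} r_{kj}^2\right] = \frac{2}{N}.
\]

If $i \neq j$, we have:

\[
E[w_{ij}] = E\left[\sum_{k=1}^{N} r_{ki} r_{kj}\right] = \sum_{k=1}^{N} E[r_{ki} r_{kj}] = \sum_{k=1}^{N} E[r_{ki}] E[r_{kj}] = 0,
\]

\[
\begin{aligned}
\mathrm{Var}[w_{ij}] &= E[w_{ij}^2] - E[w_{ij}]^2 = E\left[\left(\sum_{k=1}^{N} r_{ki} r_{kj}\right)^2\right] \\
&= E\left[\sum_{k=1}^{N} r_{ki}^2 r_{kj}^2 + \sum_{l \neq k} r_{li} r_{ki} r_{lj} r_{kj}\right] \\
&= E\left[\sum_{k=1}^{N} r_{ki}^2 r_{kj}^2\right] = \sum_{k=1}^{N} E[r_{ki}^2 r_{kj}^2] = \frac{1}{N}.
\end{aligned}
\]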

Similarly, Eqn. (3.4) and Eqn. (3.5) can be proved.
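The moments derived in this proof can be checked numerically. The following Monte Carlo sketch is only a sanity check under assumed values (N = 50, 4000 trials, a fixed seed); it samples the first two columns of R and estimates the mean and variance of one diagonal and one off-diagonal entry of W.

```python
import math
import random


def sample_w(n, rng):
    """Return (w_11, w_12) of W = R^T R from two fresh columns of R."""
    sigma = 1.0 / math.sqrt(n)  # entries r_ij ~ N(0, 1/n)
    c1 = [rng.gauss(0.0, sigma) for _ in range(n)]
    c2 = [rng.gauss(0.0, sigma) for _ in range(n)]
    diagonal = sum(a * a for a in c1)
    off_diagonal = sum(a * b for a, b in zip(c1, c2))
    return diagonal, off_diagonal


def mean_var(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


rng = random.Random(0)
N, TRIALS = 50, 4000
samples = [sample_w(N, rng) for _ in range(TRIALS)]
diag_mean, diag_var = mean_var([s[0] for s in samples])
off_mean, off_var = mean_var([s[1] for s in samples])

print(diag_mean, diag_var, 2.0 / N)  # compare with 1 and 2/N
print(off_mean, off_var, 1.0 / N)    # compare with 0 and 1/N
```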


3.8.2 Appendix 3-II

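Proof of Lemma 3.3: Let $x = \sqrt{\frac{N}{M}} R^T u$, where $u \in \mathbb{R}^N$, $R \in \mathbb{R}^{N \times M}$, and the entries of $R$ are i.i.d. Gaussian random variables, $r_{ij} \sim N(0, \frac{1}{N})$. Let $u_i$ denote the elements of $u$; we have:

\[
\begin{aligned}
E[\|x\|^2] &= E\left[\sum_{j=1}^{M} \left(\sum_{i=1}^{N} \sqrt{\frac{N}{M}}\, r_{ij} u_i\right)^2\right]
= \frac{N}{M} \sum_{j=1}^{M} E\left[\left(\sum_{i=1}^{N} r_{ij} u_i\right)^2\right] \\
&= \frac{N}{M} \sum_{j=1}^{M} E\left[\sum_{i=1}^{N} r_{ij}^2 u_i^2 + 2 \sum_{l < k} r_{lj} u_l r_{kj} u_k\right] \\
&= \frac{N}{M} \sum_{j=1}^{M} E\left[\sum_{i=1}^{N} r_{ij}^2 u_i^2\right]
= \frac{N}{M} \sum_{j=1}^{M} \frac{1}{N} \|u\|^2
= \|u\|^2.
\end{aligned}
\]

To compute $\mathrm{Var}[\|x\|^2]$, we first define $\alpha_j = \left(\sum_{i=1}^{N} r_{ij} u_i\right)^2$; we have:

\[
\begin{aligned}
E[\alpha_j] &= E\left[\left(\sum_{i=1}^{N} r_{ij} u_i\right)^2\right]
= E\left[\sum_{i=1}^{N} r_{ij}^2 u_i^2 + 2 \sum_{l < k} r_{lj} u_l r_{kj} u_k\right] \\
&= E\left[\sum_{i=1}^{N} r_{ij}^2 u_i^2\right]
= \sum_{i=1}^{N} \frac{1}{N} u_i^2
= \frac{1}{N} \|u\|^2.
\end{aligned}
\]

Since $r_{ij} \sim N(0, \frac{1}{N})$, $E[r_{ij}^4] = \frac{3}{N^2}$. Terms that contain an odd power of any $r_{ij}$ have zero expectation and are omitted, so that:

\[
\begin{aligned}
E[\alpha_j^2] &= E\left[\left(\sum_{i=1}^{N} r_{ij} u_i\right)^4\right]
= E\left[\sum_{i=1}^{N} r_{ij}^4 u_i^4 + 6 \sum_{l < k} r_{lj}^2 u_l^2 r_{kj}^2 u_k^2\right] \\
&= \frac{3}{N^2} \sum_{i=1}^{N} u_i^4 + \frac{6}{N^2} \sum_{l < k} u_l^2 u_k^2
= \frac{3}{N^2} \left(\sum_{i=1}^{N} u_i^4 + 2 \sum_{l < k} u_l^2 u_k^2\right) \\
&= \frac{3}{N^2} \left(\sum_{i=1}^{N} u_i^2\right)^2
= \frac{3}{N^2} \|u\|^4.
\end{aligned}
\]

We have:

\[
\begin{aligned}
E[\|x\|^4] &= E\left[\left(\sum_{j=1}^{M} \left(\sum_{i=1}^{N} \sqrt{\frac{N}{M}}\, r_{ij} u_i\right)^2\right)^2\right]
= \frac{N^2}{M^2} E\left[\left(\sum_{j=1}^{M} \alpha_j\right)^2\right] \\
&= \frac{N^2}{M^2} E\left[\sum_{j=1}^{M} \alpha_j^2 + 2 \sum_{l < k} \alpha_l \alpha_k\right]
= \frac{N^2}{M^2} \left(\sum_{j=1}^{M} E[\alpha_j^2] + 2 \sum_{l < k} E[\alpha_l] E[\alpha_k]\right) \\
&= \frac{N^2}{M^2} \left(\frac{3M}{N^2} \|u\|^4 + 2 \cdot \frac{M(M-1)}{2} \cdot \frac{\|u\|^2}{N} \cdot \frac{\|u\|^2}{N}\right)
= \left(1 + \frac{2}{M}\right) \|u\|^4,
\end{aligned}
\]

and the variance of $\|x\|^2$ can be computed as:

\[
\mathrm{Var}[\|x\|^2] = E[\|x\|^4] - E[\|x\|^2]^2 = \frac{2}{M} \|u\|^4.
\]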
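The two moments of the projected squared norm can likewise be checked by simulation. The sketch below is a sanity check under assumed settings (N = 20, M = 10, 4000 trials, a fixed seed, and a randomly chosen unit-length u); none of these values come from the thesis experiments.

```python
import math
import random


def projected_sq_norm(u, m, rng):
    """Return ||x||^2 for x = sqrt(n/m) * R^T u with fresh r_ij ~ N(0, 1/n)."""
    n = len(u)
    sigma = 1.0 / math.sqrt(n)
    total = 0.0
    for _ in range(m):  # each column of R yields one coordinate of x
        dot = sum(rng.gauss(0.0, sigma) * ui for ui in u)
        total += dot * dot
    return (n / m) * total


rng = random.Random(0)
N, M, TRIALS = 20, 10, 4000
raw = [rng.uniform(-1.0, 1.0) for _ in range(N)]
length = math.sqrt(sum(v * v for v in raw))
u = [v / length for v in raw]  # unit length, so ||u||^2 = ||u||^4 = 1

samples = [projected_sq_norm(u, M, rng) for _ in range(TRIALS)]
mean = sum(samples) / TRIALS
var = sum((s - mean) ** 2 for s in samples) / (TRIALS - 1)

print(mean)          # compare with ||u||^2 = 1
print(var, 2.0 / M)  # compare with (2/M) * ||u||^4
```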

Chapter 4

Sorted Index Numbers for Face

Recognition

4.1 Introduction

The recognition accuracy is of fundamental importance in biometrics based recognition

systems. Many face recognition (FR) techniques have been proposed in the literature,

and the state-of-the-art in the area can be found in a series of surveys [107–109]. In

general, geometrical local feature based approach and holistic template matching based

approach are considered to be two of the major FR methodologies. In a geometrical

feature based FR system, some local facial features such as eyes, nose, and mouth are

identified, and their location or geometry characteristics are used for face representation.

Examples of geometrical approaches include the Hidden Markov Model (HMM) based

method [110,111], and the Elastic Bunch Graph Matching (EBGM) method [112]. How-

ever, the performance of such methods usually relies heavily on the exact localization of

facial features, which is a difficult task in many application scenarios [113]. Appearance

based approaches, which treat the human face as a holistic pattern, are among the most

successful methods [108,114]. In an appearance based FR system, the face image is con-

verted to a high-dimensional vector that consists of the pixel values in the image, and

dimensionality reduction techniques are applied to obtain a lower-dimensional representa-

tion. The extracted features are usually a set of real numbers in the continuous domain,

and the similarity between images is evaluated by distance measures. Representative

techniques include PCA and LDA and their variants.

In Chapter 3, a random projection (RP) based method is introduced for privacy

preserving face recognition. Due to the noisy nature of the original images, RP on high-

dimensional image vectors produces slightly lower performance than PCA. Many other

advanced appearance based techniques may provide more discriminant representation

than PCA [97]. It is highly probable that RP on these discriminant features will produce

better recognition performance. However, when RP is applied on dimensionality reduced

feature vectors, then the biometric template is vulnerable to cross matching attack.

This chapter presents a novel approach for privacy preserving face recognition using

appearance based continuous features. Unlike traditional appearance based FR systems,

where the original features are usually stored as templates for matching, the proposed

method stores the sorted index numbers (SIN) of the extracted features as template.

Since it is impossible to recover any of the exact values of the original features, the

transformation from original features to the SIN vectors is non-invertible. A matching

algorithm is introduced to measure the similarity between two SIN vectors. Extensive

experimentation demonstrates that the proposed solution may improve the recognition

accuracy in both identification and verification scenarios.

The remainder of this chapter is organized as follows: Section 4.2 provides an overview

of the proposed SIN method. Detailed analysis on the SIN method is given in Section

4.3. Section 4.4 introduces two privacy measures that evaluate the privacy protection at

individual attribute and global vector levels, and presents a privacy analysis of the SIN

method. Detailed experimental results in both identification and verification scenarios

are presented in Section 4.5, and a summary of this chapter is given in Section 4.6.


4.2 Method Overview

This section presents an overview of the proposed solution for privacy preserving face

recognition. The proposed method assumes that the extracted features of a biometric

signal can be represented by a vector of continuous numbers, and the similarity of the

vectors can be evaluated by some (e.g., Euclidean) distance measures. The procedure of

creating the proposed SIN feature vector is as follows:

1. Extract feature vector w ∈ RN from the input face image.

2. Compute $u = w - \bar{w}$, where $\bar{w}$ is the mean feature vector calculated from the

training data.

3. Sort the feature vector u in descending order, and store the corresponding index

numbers in a new vector g.

4. The generated vector g ∈ ZN that contains the sorted index numbers is stored as

template for recognition.

For example, given u = {u1, u2, u3, u4, u5, u6}, if sorting in descending order

yields {u4, u6, u2, u1, u3, u5}, then the template is g = {4, 6, 2, 1, 3, 5}.

The method for computing the similarity between two SIN vectors, denoted as the

SIN distance in this thesis, is as follows:

1. Given two SIN feature vectors $g \in \mathbb{Z}^N$ and $p \in \mathbb{Z}^N$, where g denotes the template vector and p denotes the probe vector, start from the first element $g_1$ of g.

2. Search for the corresponding element in p, i.e., $p_j = g_1$. Record $\xi_1 = j - 1$, where j is the index number in p.

3. Eliminate the obtained $p_j$ from p, and obtain $p^1 = \{p_1, p_2, ..., p_{j-1}, p_{j+1}, ..., p_N\}$.

4. Repeat steps 2 and 3 on the subsequent elements of g until $g_{N-1}$, each time searching the reduced probe vector from the previous iteration. Record $\xi_2, \xi_3, ..., \xi_{N-1}$.

5. The similarity measure of g and p is computed as $S(g, p) = \sum_{i=1}^{N-1} \xi_i$.

Illustration example:

1. For two SIN feature vectors g = {4, 6, 2, 1, 3, 5} and p = {2, 5, 3, 6, 1, 4}, we first search the 1st element $g_1 = 4$, and find that $p_6 = 4$. Therefore $\xi_1 = 6 - 1 = 5$. Eliminate $p_6$ from p and form a new vector $p^1 = \{2, 5, 3, 6, 1\}$.

2. Search the 2nd element $g_2 = 6$, and find that $p^1_4 = 6$. Therefore $\xi_2 = 4 - 1 = 3$. Eliminate $p^1_4$ from $p^1$ and form a new vector $p^2 = \{2, 5, 3, 1\}$.

3. Search the 3rd element $g_3 = 2$, and find that $p^2_1 = 2$. Therefore $\xi_3 = 1 - 1 = 0$. Eliminate $p^2_1$ from $p^2$ and form a new vector $p^3 = \{5, 3, 1\}$.

4. Search the 4th element $g_4 = 1$, and find that $p^3_3 = 1$. Therefore $\xi_4 = 3 - 1 = 2$. Eliminate $p^3_3$ from $p^3$ and form a new vector $p^4 = \{5, 3\}$.

5. Search the 5th element $g_5 = 3$, and find that $p^4_2 = 3$. Therefore $\xi_5 = 2 - 1 = 1$.

6. Compute $S(g, p) = \sum_{i=1}^{5} \xi_i = 5 + 3 + 0 + 2 + 1 = 11$.
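The template creation and matching procedures above can be sketched in a few lines. This is an illustrative implementation rather than the code used in the thesis; the function names and the toy feature values are assumptions, and index numbers are kept 1-based to match the text.

```python
def sin_template(w, mean):
    """Return the 1-based index numbers of u = w - mean sorted in descending order."""
    u = [wi - mi for wi, mi in zip(w, mean)]
    order = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    return [i + 1 for i in order]


def sin_distance(g, p):
    """SIN distance: sum of the positions (j - 1) found in the shrinking probe vector."""
    remaining = list(p)
    total = 0
    for value in g[:-1]:              # the last element always contributes 0
        j = remaining.index(value)    # 0-based position, i.e. j - 1 in the text
        total += j
        del remaining[j]              # eliminate the matched element
    return total


# Toy features chosen so that u4 > u6 > u2 > u1 > u3 > u5 (mean assumed zero).
w = [0.1, 0.3, -0.2, 0.9, -0.5, 0.4]
g = sin_template(w, [0.0] * 6)
p = [2, 5, 3, 6, 1, 4]

print(g)                   # [4, 6, 2, 1, 3, 5]
print(sin_distance(g, p))  # 11, as in the illustration example
```

The list-based search costs O(N^2) per comparison; this is kept deliberately simple for clarity.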

4.3 SIN Method

The idea of SIN originates from the pairwise relation of any two elements in a vector.

Xiang et al. [115] utilized the relative relation of groups of two bins to represent the

shape of a histogram. In the proposed method, the pairwise relative relation of vector

elements is used for distance approximation. To understand the underlying rationale of

the proposed algorithm, we first look into an alternative presentation of the method,

named Pairwise Relational Discretization (PRD). The procedure of producing the PRD

feature vector is as follows:




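1. Extract feature vector $w \in \mathbb{R}^N$ from the input face image.

2. Compute $u = w - \bar{w}$, where $\bar{w}$ is the mean feature vector calculated from the

training data.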

3. Compute the binary representation of u by comparing the pairwise relation of all the elements in u according to:

\[
b_{ij} =
\begin{cases}
1, & u_i \geq u_j; \\
0, & u_i < u_j;
\end{cases}
\tag{4.1}
\]

4. Concatenate all the bits into one vector $b = \{b_{12}, ..., b_{1N}, b_{23}, ..., b_{2N}, b_{34}, ..., b_{N-1,N}\}$.

Store the binary vector b as template for recognition.

The similarity measure of the PRD method is based on Hamming distance.

Unlike the traditional discretization method, which quantizes individual elements

based on some predefined quantization levels, the proposed method takes the global

characteristics of the feature vectors into consideration. This is achieved by comparing

the pairwise relation of all groups of two elements in the vector. From a geometric point

of view, the PRD method is equivalent to partitioning an N -dimensional space into N !

cells, where N ! is the total number of possible outputs of the PRD vector. An original

vector is mapped onto the corresponding cell, and the Euclidean distance between two

vectors is approximated by the spatial distance of the cells, i.e., the Hamming distance of

the corresponding PRD vectors. More precisely, since the pairwise relation is invariant

to the norm of the vectors, the Hamming distance of the PRD vectors approximates the

Euclidean distance between vectors that are normalized to the same length. Figure 4.1

provides a graphic view of the partition of a 3-D sphere, assuming all the vectors have

unit length. It can be observed that the 3-D surface is partitioned into 3! = 6 cells, and

the distance of the cells can be measured by the Hamming distance of the corresponding

binary PRD vectors.

Alternatively, the PRD method interprets an N -dimensional space as combinations

of 2-D planes. In an N -dimensional subspace, when the similarity of two vectors is


Figure 4.1: 3-D demonstration of SIN method.

evaluated by Euclidean distance, the vector elements are treated as coordinates in the

corresponding basis {h1,h2, ...,hN}, and the similarity is based on the spatial closeness.

The elements are essentially the projection coefficients of the vector onto each basis

(i.e., lines). Here, instead of projecting onto lines, we explore the projection onto 2-

D planes. Figure 4.2 offers a diagrammatic illustration of the PRD method. For two

points in an N -dimensional subspace, if they are spatially close to each other, then in

a large number of 2-D planes, their projection location should be close to each other,

i.e., small Hamming distance, and vice versa. Therefore, the Euclidean distance between

two vectors can be approximated by the Hamming distance between the corresponding

PRD vectors. The mean subtraction step ensures zero mean of each dimension. It

moderates the significance of each element such that no single dimension will overpower

others. The discretization step partitions a 2-D plane into two regions by comparing the

pairwise relation. It reduces the sensitivity of the variation of individual elements, and

therefore can potentially provide better error tolerance. Figure 4.3 shows the intra-class

and inter-class distributions of the first 100 PCA coefficients based on 1000 randomly


Figure 4.2: Diagram of Pairwise Relational Discretization (PRD) method.

selected images from the experimental data set. The PCA vectors are normalized to

unit length, and Euclidean distance and Hamming distance are used as dissimilarity

measure. Note that the size of the overlapping area of the intra-class and inter-class

distributions indicates the recognition error. It can be observed that the PRD method

produces smaller error than the original features, therefore will possibly provide better

recognition performance.

A major drawback of the PRD method is the high dimensionality of the generated

binary PRD vector. For an N -dimensional vector, the generated binary vector b will

have a size of $N(N-1)/2$. For example, for a feature vector with N = 100, the PRD

vector will have a size of 4950. This problem introduces high storage and computational

requirements. This is particularly important for applications with high processing speed

demands. To improve this, we note that the PRD method is based on pairwise relation


Figure 4.3: Comparison of intra-class and inter-class distribution using Euclidean and

Hamming distances.

of all the vector elements, and the same information can be exactly preserved from the

sorted index numbers, i.e., any single bit in b can be derived from the SIN vector.

Let g and p denote the SIN vectors of template and probe images respectively, and let $b_g$ and $b_p$ represent the corresponding PRD vectors; then we have:

\[
H(b_g, b_p) = S(g, p) = \sum_{i=1}^{N-1} \xi_i, \tag{4.2}
\]

where $H(b_g, b_p)$ and $S(g, p)$ denote the Hamming distance and SIN distance respectively, and $\xi_i$, $i = 1, ..., N-1$, represents the Hamming distance associated with the ith element in g.

Proof of Eqn. (4.2): Since g and $b_g$ are derived from the same feature vector, there are N − 1 bits in $b_g$ that are associated with the first element of g, $g_1$. If $p_j = g_1$, where j is the index number of the corresponding element in p, then the j − 1 index numbers to the left of $p_j$ precede $g_1$ in p but follow it in g, so the corresponding bits in $b_p$ differ from those in $b_g$, i.e., $\xi_1 = j - 1$. It should be noted that since the Hamming distance for all the bits associated with $p_j = g_1$ has been computed, the $p_j$ element should be removed before the next iteration. After the Hamming distances for all the elements in g and p are computed, their sum corresponds to the Hamming distance of $b_g$ and $b_p$, i.e., $H(b_g, b_p) = S(g, p) = \sum_{i=1}^{N-1} \xi_i$.
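The equality between the PRD Hamming distance and the SIN distance can be verified empirically. The sketch below builds both representations for two random vectors and compares the distances; the function names, the seed, and the Gaussian test vectors are assumptions, and the check presumes there are no tied feature values.

```python
import random


def prd_bits(u):
    """Pairwise relational bits b_ij for all i < j, following Eqn. (4.1)."""
    n = len(u)
    return [1 if u[i] >= u[j] else 0
            for i in range(n) for j in range(i + 1, n)]


def hamming(a, b):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def sin_vector(u):
    """1-based index numbers of u sorted in descending order."""
    order = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    return [i + 1 for i in order]


def sin_distance(g, p):
    """Sum of positions found in the shrinking probe vector."""
    remaining = list(p)
    total = 0
    for value in g[:-1]:
        j = remaining.index(value)
        total += j
        del remaining[j]
    return total


rng = random.Random(7)
u = [rng.gauss(0.0, 1.0) for _ in range(100)]
v = [rng.gauss(0.0, 1.0) for _ in range(100)]

h = hamming(prd_bits(u), prd_bits(v))
s = sin_distance(sin_vector(u), sin_vector(v))
print(len(prd_bits(u)))  # N(N-1)/2 bits for N = 100
print(h == s)            # the two distances coincide
```

Both quantities count the pairs of elements whose relative order differs between the two vectors, which is why the SIN vector of length N carries the same information as the much longer PRD vector.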

Eqn. (4.2) shows that the proposed SIN and PRD methods produce exactly the

same results. The equivalence of PRD and SIN methods also indicates that in an N -

dimensional space, the total number of possible outputs is N !. Because the events associ-

ated with each permutation are mutually exclusive to each other with equal probability,

the N ! cells on the surface of an N -dimensional sphere have the same volume. To test the

efficiency of SIN relative to PRD in terms of computation, we performed experiments

on a computer with an Intel Core 2 CPU at 2.66 GHz. With an original feature vector of

dimensionality 100, the average time for PRD feature extraction and matching is 26.2

ms, while the SIN method takes less than 0.9 ms.

The approximation of the Euclidean distance of original vectors by the Hamming

distance of the corresponding SIN/PRD vectors is demonstrated in Figure 4.4 at different

dimensionalities. A total of 1000 vectors are generated randomly and normalized

to unit length. Taking the first vector as a reference vector, the Euclidean distance with

all the other vectors are computed, sorted, and plotted as the red curve. The Hamming

distances between the corresponding SIN/PRD vectors are then computed and plotted

in blue. It can be seen that the Euclidean distance can be approximately preserved by

the Hamming distance of the PRD vectors, and the higher the dimensionality, the better

the distance approximation.


Since the SIN method only stores the index numbers of the sorted feature vector u, the

transformation from u to the corresponding SIN vector g is non-invertible. No effective

reconstruction can recover any of the exact values of u from g.


Figure 4.4: SIN approximation of Euclidean distance.

However, an adversary may be able to estimate the distribution of the original features,

generate a set of random numbers according to the known distribution, and rearrange

the random numbers based on the SIN vector. As such, it is possible to provide an

approximate estimation of the original features. For simplicity, we assume the features

are i.i.d. in this chapter.
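The estimation strategy described above can be sketched as follows. The sketch assumes a zero-mean Gaussian feature distribution with known standard deviation, and the function names are hypothetical; it only illustrates how an adversary could arrange random draws according to a compromised SIN vector.

```python
import random


def estimate_from_sin(g, rng, sigma=1.0):
    """Draw len(g) Gaussian values and place them so that their ordering matches g."""
    n = len(g)
    draws = sorted((rng.gauss(0.0, sigma) for _ in range(n)), reverse=True)
    estimate = [0.0] * n
    for rank, index in enumerate(g):
        estimate[index - 1] = draws[rank]  # g holds 1-based index numbers
    return estimate


def sin_vector(u):
    """1-based index numbers of u sorted in descending order."""
    order = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    return [i + 1 for i in order]


g = [4, 6, 2, 1, 3, 5]
u_hat = estimate_from_sin(g, random.Random(3))
print(u_hat)
print(sin_vector(u_hat) == g)  # the estimate reproduces the same SIN vector
```

The estimate shares the ordering of the original features but not their values, which is why the analysis that follows measures privacy through the variance of the estimation error.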

Let $\rho_1, \rho_2, ..., \rho_N$ denote N i.i.d. random variables, and $\rho_{1:N}, \rho_{2:N}, ..., \rho_{N:N}$ denote the ordered variates; then the mean and variance of the jth order statistic are [116]:

\[
m_{j:N} = \int_{-\infty}^{+\infty} s f_{j:N}(s)\, ds \tag{4.3}
\]

\[
\sigma^2_{j:N} = \int_{-\infty}^{+\infty} (s - m_{j:N})^2 f_{j:N}(s)\, ds \tag{4.4}
\]

where $f_{j:N}(s)$ is the probability density function (pdf) of $\rho_{j:N}$, and

\[
f_{j:N}(s) = \frac{N!}{(j-1)!\,(N-j)!} F^{j-1}(s) [1 - F(s)]^{N-j} f(s) \tag{4.5}
\]

where $F(s)$ is the cumulative distribution function (cdf) of $\rho$ and $f(s)$ is the corresponding pdf.

Let $\hat{\rho}_{j:N}$ denote the estimation of $\rho_{j:N}$; then $E[\rho_{j:N} - \hat{\rho}_{j:N}] = m_{j:N} - \hat{m}_{j:N}$, and $\mathrm{Var}[\rho_{j:N} - \hat{\rho}_{j:N}] = \sigma^2_{j:N} + \hat{\sigma}^2_{j:N}$. When the distribution of $\rho$ is unknown, the expected value of the estimation error is not zero since $m_{j:N} \neq \hat{m}_{j:N}$. In this case, the estimation will be less accurate and the user’s privacy can be protected. However, it is possible that the attacker may estimate the distribution of the original features. Considering the worst case that the exact distribution is known, then we have:

\[
E[\rho_{j:N} - \hat{\rho}_{j:N}] = m_{j:N} - \hat{m}_{j:N} = 0, \tag{4.6}
\]

\[
\mathrm{Var}[\rho_{j:N} - \hat{\rho}_{j:N}] = 2\sigma^2_{j:N}. \tag{4.7}
\]

Therefore, the expected value of $\rho_{j:N} - \hat{\rho}_{j:N}$ will be zero. Since the exact value of any element in the original feature vector can not be recovered, the variance of $\rho_{j:N} - \hat{\rho}_{j:N}$ can be considered as a privacy measure. The larger the variance, the better the individual elements are protected.

Figure 4.5 plots the variances of the order statistics as functions of vector dimensionality

N, where $\rho$ and $\hat{\rho}$ are assumed to be i.i.d. Gaussian random variables with zero mean

and unit variance. It can be seen that with higher dimensionality, the variances become

smaller. This suggests that the SIN method provides better privacy protection at lower

dimensionality.
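The dependence of the order-statistic variances on N can be reproduced approximately by simulation instead of evaluating the integrals in Eqns. (4.3) to (4.5). The sketch below is a Monte Carlo approximation for standard Gaussian variables; the trial count, seed, and chosen dimensionalities are assumptions.

```python
import random


def order_stat_variances(n, trials, rng):
    """Estimate the variance of each of the n order statistics of N(0, 1) samples."""
    sums = [0.0] * n
    sums_sq = [0.0] * n
    for _ in range(trials):
        ordered = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        for j, value in enumerate(ordered):
            sums[j] += value
            sums_sq[j] += value * value
    return [sums_sq[j] / trials - (sums[j] / trials) ** 2 for j in range(n)]


rng = random.Random(1)
for n in (5, 20, 80):
    variances = order_stat_variances(n, 4000, rng)
    # The average variance shrinks as the dimensionality grows.
    print(n, sum(variances) / n)
```

Doubling each estimated variance gives the worst-case error variance of Eqn. (4.7) for the corresponding element.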

Eqn. (4.7) provides a privacy measure of individual element. To evaluate the degree

of privacy protection for all the individual elements in a vector, as well as the privacy

preserving property of the global characteristics of the features, we define the following

privacy measures:

Definition 1: A feature vector $u \in \mathbb{R}^N$ is called privacy protected at element-wise level $\alpha$, where $\alpha$ is computed as:

\[
\alpha = \frac{1}{N} \sum_{i=1}^{N} \left[ 1 - (1 - \eta_i)\, h(1 - \eta_i) \right], \qquad \eta_i = \frac{\mathrm{Var}[u_i - \hat{u}_i]}{\mathrm{Var}[u_i]}, \tag{4.8}
\]

where $\hat{u}_i$ denotes the estimated value of element $u_i$, and $h(x)$ is the unit step function, i.e., $h(x) = 1$ if $x \geq 0$ and $h(x) = 0$ otherwise. The function $h(x)$ is utilized to regulate the significance of all the elements, such that the variance ratio of any single dimension is at most 1.

The variance ratio of the estimation difference and the original variate has been used as a privacy measure for individual attributes in data mining [117]. Here we take the average of the variance ratio as a measure of the privacy protection for the individual elements. When the variance ratio of any attribute is greater than or equal to 1, i.e., $\mathrm{Var}[u_i - \hat{u}_i] \geq \mathrm{Var}[u_i]$, then the estimation of that attribute essentially provides no useful information, and the attribute is strongly protected. The element-wise privacy level $\alpha$ measures the average privacy protection of individual elements. The greater the $\alpha$ value, the better the privacy protection. For the SIN method, assuming the elements in u follow a distribution of mean zero and variance $\sigma_u^2$, then for the estimation of the jth order element, we have $\eta_j = 2\sigma^2_{j:N} / \sigma_u^2$, and $\alpha = \frac{1}{N} \sum_{j=1}^{N} \left[ 1 - (1 - \eta_j)\, h(1 - \eta_j) \right]$.

Besides measuring the privacy protection of the individual elements, it is also impor-

tant to measure the global characteristics of the feature vector such that the estimated

vector is not close to the original one up to certain similarity functions. In [118], it is

shown that any arbitrary distance functions can be approximately mapped to Euclidean

distance domain through certain algorithms. In this chapter, we consider the squared Eu-

clidean distance (SED) between the estimated and original feature vectors as a measure

of privacy:

Definition 2: A feature vector $\mathbf{u} \in \mathbb{R}^N$ is called privacy protected at vector-wise level $\beta$, where $\beta$ is computed as:

$$\beta = \frac{E[\|\mathbf{u} - \hat{\mathbf{u}}\|^2]}{E[\|\mathbf{r} - \mathbf{u}\|^2]}, \tag{4.9}$$

where $\mathbf{r}$ denotes a random vector in the estimation feature space, with the same distribution as $\mathbf{u}$. If the average distance between the estimated and original vectors approaches the average distance between a random vector and the original vector, then the estimated vector essentially exhibits randomness and therefore does not disclose information about $\mathbf{u}$; that is, the larger the $\beta$, the better the privacy. Considering the worst case, in which the distribution of the elements in $\mathbf{u}$ is known to have zero mean and a variance of $\sigma_u^2$, we have:

$$\beta = \frac{E\left[\sum_{i=1}^{N}(u_{i:N} - \hat{u}_{i:N})^2\right]}{E\left[\sum_{i=1}^{N}(r_i - u_i)^2\right]} = \frac{\sum_{i=1}^{N} 2\sigma_{i:N}^2}{2N\sigma_u^2} = \frac{\sum_{i=1}^{N}\sigma_{i:N}^2}{N\sigma_u^2}. \tag{4.10}$$

Figure 4.6 depicts the privacy measures as functions of N using 1000 randomly selected

PCA feature vectors. The PCA vectors are normalized to have mean zero and variance

1/N . In both cases, the estimation is based on Gaussian distribution and it is assumed

that the mean and variance values are known by the adversary. It can be observed that

the SIN method provides better privacy level at lower dimensionality.
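The two privacy measures can be illustrated with a minimal Python sketch. It assumes the order-statistic variances $\sigma_{j:N}^2$ are already available; the numbers used below are hypothetical and are not taken from Figure 4.5 or Figure 4.6. The function names are likewise illustrative.

```python
from statistics import fmean


def sin_variance_ratios(order_stat_vars, var_u):
    """eta_j = 2 * sigma_{j:N}^2 / sigma_u^2 for the SIN method."""
    return [2.0 * s / var_u for s in order_stat_vars]


def element_wise_privacy(etas):
    """Eq. (4.8): mean of the variance ratios, each capped at 1."""
    # 1 - (1 - eta) * h(1 - eta) equals eta when eta <= 1, and 1 otherwise.
    return fmean([min(eta, 1.0) for eta in etas])


def vector_wise_privacy(order_stat_vars, var_u):
    """Eq. (4.10): sum of order-statistic variances over N * sigma_u^2."""
    return sum(order_stat_vars) / (len(order_stat_vars) * var_u)


# Hypothetical order-statistic variances for N = 4 (illustrative values only).
order_stat_vars = [0.6, 0.3, 0.3, 0.6]
var_u = 1.0

alpha = element_wise_privacy(sin_variance_ratios(order_stat_vars, var_u))
beta = vector_wise_privacy(order_stat_vars, var_u)
print(alpha, beta)
```

In this toy case the two extreme order statistics have a variance ratio above 1 and are capped, which is the role the unit step function plays in Eqn. (4.8).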

Figure 4.5: Variance $\sigma_{j:N}^2$ as a function of dimensionality $N$.


Figure 4.6: Privacy measures of SIN as functions of dimensionality.


The performance of the proposed method is evaluated on the same generic data set

as described in Chapter 3. To study the effects of different feature extractors on the

performance of proposed methods, we compare Principal Component Analysis (PCA) and

Kernel Direct Discriminant Analysis (KDDA). PCA has been introduced in Chapter 3.

PCA produces the most expressive subspace for face representation, but is not necessarily

the most discriminant one. This is due to the fact that the underlying class structure

of the data is not considered in the PCA technique. It was shown in [97] that KDDA

outperforms other techniques in most of the cases. Therefore we also adopt KDDA for

comparison in this chapter.

KDDA was proposed by Lu et al. [119] to address the nonlinearities in complex face patterns. A kernel based solution finds a nonlinear transform from the original image space $\mathbb{R}^J$ to a high-dimensional feature space $F$ using a nonlinear function $\phi(\cdot)$. In the transformed high-dimensional feature space $F$, the convexity of the distribution is expected to be retained so that traditional linear methodologies such as PCA and LDA can be applied. The optimal nonlinear discriminant feature representation of $\mathbf{z}$ can be obtained by:

$$\mathbf{y} = \Theta \cdot \nu(\phi(\mathbf{z})) \tag{4.11}$$

where $\Theta$ is a matrix representing the found kernel discriminant subspace, and $\nu(\phi(\mathbf{z}))$ is the kernel vector of the input $\mathbf{z}$. The detailed implementation algorithm of KDDA can be found in [119].

4.5.1 Face Identification

For face identification, we use all 5676 images in the generic data set for the experiments. A set of 2836 images from 520 human subjects is randomly selected for training, and the remaining 2840 images from 500 subjects are used for testing. There is no overlap between the training and testing subjects and images. The test is performed on an exhaustive basis: each time, one image is taken from the test set as the probe image, while the rest of the images in the test set serve as gallery images. This is repeated until every image in the test set has been used as the probe once. The classification is based on the nearest neighbor classifier.
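The exhaustive protocol above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the features, labels, and the function name `exhaustive_crr` are hypothetical, and the Euclidean distance stands in for whichever distance measure is being evaluated.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exhaustive_crr(features, labels, distance=euclidean):
    """Leave-one-out nearest-neighbor identification; returns the CRR in %."""
    correct = 0
    for p, probe in enumerate(features):
        best_label, best_dist = None, math.inf
        for g, gallery in enumerate(features):
            if g == p:  # the probe itself is excluded from the gallery
                continue
            dist = distance(probe, gallery)
            if dist < best_dist:
                best_dist, best_label = dist, labels[g]
        correct += best_label == labels[p]
    return 100.0 * correct / len(features)


# Toy two-dimensional features (hypothetical); the last sample is mislabeled
# on purpose so that its nearest neighbor belongs to another subject.
features = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 0.9), (0.45, 0.5)]
labels = ["a", "a", "b", "b", "b"]
print(exhaustive_crr(features, labels))
```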

Table 4.1 shows the correct recognition rate (CRR) of the SIN method and of the Euclidean and Cosine distance measures at different dimensionalities, and a graphical comparison is depicted in Figure 4.7. It can be observed that at higher dimensionalities, the SIN method may improve the recognition accuracy of PCA significantly, while maintaining the good performance of the stronger feature extractor KDDA. The PCA method projects images onto the directions with the highest variance, which are not necessarily the discriminant ones. This becomes more severe under large image variations due to illumination, expression, pose and aging. When computing the similarity between two PCA vectors, the distance measure is sensitive to the variation of individual elements, particularly those in directions corresponding to noise. The SIN method, on the other hand, reduces this sensitivity by comparing only the relative relation of the projections, and therefore possibly provides better error tolerance. In the case of strong extractors such as KDDA, the SIN method approximates the distance between two vectors, and hence preserves the recognition accuracy.

        PCA                       KDDA
Dim.    Euc.    Cos.    SIN       Euc.    Cos.    SIN
20      56.30   56.31   52.32     40.04   41.09   34.86
40      60.09   61.09   61.94     61.44   65.28   61.94
60      63.52   62.96   66.06     71.73   74.86   74.68
80      64.37   64.44   68.84     81.76   83.27   81.76
100     65.14   65.18   71.27     79.05   80.42   80.07

Table 4.1: Face identification results (CRR in %).

Figure 4.7: CRR in face identification scenario.

4.5.2 Face Verification

For face verification, the experiments are performed on the generic verification data set, where 2388 images from 520 subjects are randomly selected as the training set, and 2278 images of the remaining 500 subjects form the testing set. There is no overlap between the training and the testing subjects and images. The evaluation is also performed on an exhaustive basis, where every single image is used as a template once, and the rest of the images in the test set are used as the probe images.
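A minimal sketch of how an equal error rate (EER) can be obtained from such an exhaustive template/probe pairing is given below. It is an illustration only: the threshold sweep over observed distances is one common approximation of the EER, not necessarily the procedure used for the reported results, and all names and toy values are hypothetical.

```python
def score_pairs(features, labels, distance):
    """Use every sample once as template and all others as probes."""
    genuine, impostor = [], []
    for t, template in enumerate(features):
        for p, probe in enumerate(features):
            if p == t:
                continue
            bucket = genuine if labels[p] == labels[t] else impostor
            bucket.append(distance(template, probe))
    return genuine, impostor


def equal_error_rate(genuine, impostor):
    """Approximate EER (in %) by sweeping the threshold over observed distances."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(d > t for d in genuine) / len(genuine)      # genuine rejected
        far = sum(d <= t for d in impostor) / len(impostor)   # impostor accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, 100.0 * (far + frr) / 2.0
    return eer


# Toy distance scores (hypothetical): one genuine pair is farther apart
# than the closest impostor pair, so the error rates cross at 25%.
print(equal_error_rate([0.1, 0.2, 0.3, 0.6], [0.5, 0.7, 0.8, 0.9]))
```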

Table 4.2 details the obtained EER of SIN and of the Euclidean and Cosine distance measures at different dimensionalities when PCA and KDDA are used as feature extractors, and a graphical comparison is presented in Figure 4.8. In general, the Cosine distance measure outperforms the Euclidean distance, and the proposed SIN method improves the verification accuracy of both PCA and KDDA at almost all dimensionalities. This further demonstrates that the SIN approach offers better error tolerance and provides a more discriminant representation.

        PCA                       KDDA
Dim.    Euc.    Cos.    SIN       Euc.    Cos.    SIN
20      20.05   19.23   13.78     25.22   20.42   20.97
40      19.09   17.81   11.46     21.49   16.22   14.54
60      18.52   17.42   10.28     18.80   13.41   10.97
80      18.50   17.15   9.72      10.96   9.90    7.19
100     18.20   16.94   9.46      10.41   8.84    6.52

Table 4.2: Face verification results (EER in %).

Figure 4.8: EER in face verification scenario.

4.6 Summary

This chapter has introduced a novel approach for face recognition based on feature vectors in the continuous domain. The proposed method stores the sorted index numbers of dimensionality reduced feature vectors as biometric templates for recognition. The SIN method originates from the pairwise relation of any two elements in a vector, and it is shown that it is capable of approximating the Euclidean distance between two vectors. A new distance measure has been presented for evaluating the similarity between SIN vectors. Since it is impossible to recover the exact value of any of the original features, the transformation from the original features to the SIN vector is non-invertible. To study the privacy protecting property of the SIN method, two privacy measures that evaluate the protection at both element and vector levels are introduced. It has been shown that the SIN method may provide better privacy protection at lower dimensionality. Experimental results on both face identification and verification demonstrate that the proposed method may improve the recognition performance. Such characteristics make the SIN method a candidate for being applied in conjunction with random transformations to obtain changeability and enhanced privacy protection.

Chapter 5

Random Transformations for

Changeable Biometrics

5.1 Introduction

To support the deployment of biometrics in a wide range of applications, it should be possible to use the same biometric trait in different applications. For example, a user should be able to register his face images for access to different bank accounts, or for computer/network logins. For security purposes, the biometric templates that are generated for different applications should not be able to authenticate each other. In Chapter 4, a privacy preserving scheme that utilizes the sorted index numbers of the extracted features was proposed. The SIN method provides a certain level of privacy protection, in that the original features can not be exactly recovered. However, the SIN method itself does not address the changeability problem. In other words, two SIN vectors of the same biometric can be used to authenticate each other. To solve this problem, a repeatable transform needs to be applied prior to the SIN operation. This can be achieved by introducing randomness into the biometric features.

In this chapter, we present methods for changeable face verification using random


transformations. The proposed method applies random transformations on the origi-

nal features first. The randomized vector is then sorted in descending order, and the

corresponding index numbers are recorded and stored as template for future matching.

Random transformations have been used extensively as data perturbation techniques for

privacy preserving data mining, which include additive data perturbation [120,121], mul-

tiplicative data perturbation [122, 123], and random projection based approach [80]. In

this chapter, we explore their capability of producing changeable biometric templates, as

well as the privacy preserving properties when applied in conjunction with the irreversible

SIN technique. Random projection has shown its capability of obtaining changeability in

Chapter 3. Therefore, it is possible to apply random projection before the sorting opera-

tion. In addition, two other random transformations, namely random additive transform

and random multiplicative transform, are discussed and compared. It is shown that

since it is impossible to retrieve the original features from the sorted index numbers

of the randomized vector, the combination of random transformations and SIN com-

prises repeatable and non-invertible transformations, hence the generated templates are

changeable and privacy preserving.

The remainder of this chapter is organized as follows: Section 5.2 presents an overview of the introduced solution. Detailed changeability and privacy analyses are presented in Sections 5.3 and 5.4 respectively. Section 5.5 presents the experimental results, and a concluding summary is provided in Section 5.6.

5.2 Method Overview

The proposed methods assume that a biometric signal is represented by a vector in

the continuous domain, and the similarity of the vectors can be evaluated by distance

measures such as Euclidean distance. The procedure of generating a biometric template

is as follows:




training data.

3. Use a key $k$ as a control factor for randomness generation. Transform the vector $\mathbf{u}$ by $\mathbf{x} = f_k(\mathbf{u})$, $\mathbf{x} \in \mathbb{R}^M$, where $f_k(\cdot)$ is a random transformation function associated with the key $k$.

4. Sort the vector $\mathbf{x}$ in descending order, and store the corresponding index numbers in a new vector $\mathbf{g}$.

5. The generated vector $\mathbf{g} \in \mathbb{Z}^M$, which contains the sorted index numbers, is stored as the template.

For example, given $\mathbf{x} = \{x_1, x_2, x_3, x_4\}$, if the sorted vector in descending order is $\{x_4, x_2, x_3, x_1\}$, then the template is $\mathbf{g} = \{4, 2, 3, 1\}$. The similarity matching of the SIN vectors is based on the SIN distance, which has been introduced in Chapter 4.
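Steps 3 to 5 can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the key is assumed to be an integer that seeds a pseudo-random generator, the random additive transform of Section 5.3.1 is used as one possible $f_k$, and the function names and the default `sigma` are hypothetical.

```python
import random


def sorted_index_numbers(x):
    """Return the 1-based indices of x ordered by descending value."""
    order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)
    return [i + 1 for i in order]


def rat_transform(u, key, sigma):
    """Hypothetical f_k: add key-seeded zero-mean Gaussian noise to u."""
    rng = random.Random(key)  # the same key reproduces the same random vector
    return [ui + rng.gauss(0.0, sigma) for ui in u]


def make_template(u, key, sigma=0.1):
    """Steps 3-5: randomize with the key, sort, keep only the index numbers."""
    return sorted_index_numbers(rat_transform(u, key, sigma))


# Reproduces the example in the text: x4 > x2 > x3 > x1.
print(sorted_index_numbers([0.1, 0.7, 0.4, 0.9]))  # [4, 2, 3, 1]
```

Seeding the generator with the key is what makes the transform repeatable: the same user with the same key always obtains the same random vector, while a new key yields a new template.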

5.3 Changeability Analysis

The proposed methods utilize randomness to address the changeability problem, and combine it with the SIN method to achieve privacy protection. A feature vector $\mathbf{u} \in \mathbb{R}^N$ is first transformed by $\mathbf{x} = f_k(\mathbf{u})$, and the resulting SIN vector of $\mathbf{x}$ is stored as a biometric template. In this section, we study three types of random transformations: random additive transform (RAT), random multiplicative transform (RMT) and random projection (RP). To illustrate the changeability of the proposed methods, the statistical properties of the random transformations are analyzed in detail.


5.3.1 Random Additive Transform

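The RAT performs element-wise addition of a random vector to the original biometric feature vector. Let $\mathbf{u}$ and $\mathbf{v}$ be two biometric feature vectors in an $N$-dimensional Euclidean space, $\mathbf{u} \in \mathbb{R}^N$ and $\mathbf{v} \in \mathbb{R}^N$. Let $\mathbf{r} \in \mathbb{R}^N$ and $\mathbf{s} \in \mathbb{R}^N$ be two $N$-dimensional random vectors. Each entry of $\mathbf{r}$ and $\mathbf{s}$ follows an i.i.d. Gaussian distribution of mean zero and variance $\sigma^2$, $r_i \sim \mathcal{N}(0, \sigma^2)$, $s_i \sim \mathcal{N}(0, \sigma^2)$, $i = 1, \ldots, N$. Let $\mathbf{x} = \mathbf{u} + \mathbf{r}$, $\mathbf{y} = \mathbf{v} + \mathbf{s}$. If the same key (SK) is applied, i.e., $\mathbf{r} = \mathbf{s}$, then we have:

$$\|\mathbf{x} - \mathbf{y}\|^2 = \|\mathbf{u} + \mathbf{r} - \mathbf{v} - \mathbf{s}\|^2 = \|\mathbf{u} - \mathbf{v}\|^2. \tag{5.1}$$

Therefore, when the same RAT is applied, the squared Euclidean distance (SED) between any two vectors is exactly preserved. If different keys (DK) are applied to $\mathbf{u}$ and $\mathbf{v}$, i.e., $\mathbf{r} \neq \mathbf{s}$, and $\mathbf{r}$ and $\mathbf{s}$ are independent of each other, we have:

$$E[\|\mathbf{x} - \mathbf{y}\|^2] = \|\mathbf{u} - \mathbf{v}\|^2 + 2N\sigma^2, \tag{5.2}$$

$$\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] = 8\|\mathbf{u} - \mathbf{v}\|^2\sigma^2 + 8N\sigma^4. \tag{5.3}$$

Please see Appendix 5-I for the proofs.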
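Eqns. (5.2) and (5.3) can be checked numerically with a small Monte Carlo sketch. The vectors, the seed, and the function names below are hypothetical choices for illustration; they are not the PCA features used for Figure 5.1.

```python
import random
from statistics import fmean, pvariance


def sed(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rat_dk_theory(u, v, sigma2):
    """Mean and variance of the SED under different keys, Eqns. (5.2)-(5.3)."""
    n, d2 = len(u), sed(u, v)
    return d2 + 2 * n * sigma2, 8 * d2 * sigma2 + 8 * n * sigma2 ** 2


def rat_dk_simulate(u, v, sigma2, trials, seed=0):
    """Empirical mean and variance of the SED with independent additive vectors."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    samples = []
    for _ in range(trials):
        x = [ui + rng.gauss(0.0, sigma) for ui in u]
        y = [vi + rng.gauss(0.0, sigma) for vi in v]
        samples.append(sed(x, y))
    return fmean(samples), pvariance(samples)


# Toy vectors (hypothetical) with ||u - v||^2 = 0.025 and N = 10.
u = [0.1 * i for i in range(10)]
v = [0.1 * i + 0.05 for i in range(10)]

print(rat_dk_theory(u, v, 0.005))
print(rat_dk_simulate(u, v, 0.005, trials=20000))
```

With these toy values the theoretical mean is 0.125 and the theoretical variance is 0.003; the simulated values should land close to both.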

Eqn. (5.2) shows that when DK are applied, for any two vectors with fixed dimensionality, the mean of the SED increases as $\sigma^2$ increases. To facilitate demonstration, we assume that the distribution of $\|\mathbf{x} - \mathbf{y}\|^2$ is Gaussian. This is validated in Figure 5.1(a), where we randomly select two PCA feature vectors from our experimental data set, perform the DK scenario 2000 times, and plot the SED. The PCA feature vectors are normalized to unit length, and $\sigma^2$ is set to 0.005. It can be observed that the experimental values of mean and variance fit well with our theoretical results in Eqn. (5.2) and Eqn. (5.3), and that the distribution of the obtained SED can be well approximated by a Gaussian distribution. Assuming $\mathbf{u}$ and $\mathbf{v}$ are biometric feature vectors from the same human subject, to obtain changeability we want the transformed biometric representations generated with different keys to be unable to authenticate each other, i.e., their distance should be larger than the system threshold $t$. Figure 5.1(b) depicts the distribution of $\|\mathbf{x} - \mathbf{y}\|^2$ at different $\sigma^2$ values. As $\sigma^2$ increases, the probability of getting $\|\mathbf{x} - \mathbf{y}\|^2 < \|\mathbf{u} - \mathbf{v}\|^2$ decreases toward zero. By setting a sufficiently large $\sigma^2$ value, we can produce changeable biometric templates with probability approaching 1, i.e., $P(\|\mathbf{x} - \mathbf{y}\|^2 > t) \to 1$.

Figure 5.1: RAT: Distribution of SED (a) Gaussian approximation (σ2=0.005); (b) at

different σ2 values.

The above analysis demonstrates the changeability of RAT using the SED. Since the SIN method also approximates the Euclidean distance between two vectors, it is expected that a similar property can be preserved by applying the SIN method on RAT transformed vectors, denoted RAT-SIN in this chapter. Figure 5.2(a) plots the distribution of the normalized SIN distance (NSD) in both SK and DK scenarios with $\sigma^2 = 0.005$. The SIN distance is normalized by dividing by the largest possible value, $N(N-1)/2$. It can be seen that the distributions of both scenarios can be well approximated by Gaussian distributions. Figure 5.2(b) plots the distributions as functions of the variance of the additive vector $\sigma^2$. It can be observed that by increasing $\sigma^2$, the SK and DK distributions become well separated, and strong changeability can be obtained. The DK distribution moves to the right toward 0.5, which implies stronger randomization. The SK distribution, on the other hand, shifts to the left. This implies that with larger $\sigma^2$, the randomness of the additive vector may overpower the distribution of the original features, and hence produce a larger deviation from the original characteristics of the features. Therefore, the larger the $\sigma^2$, the better the changeability, but possibly the lower the recognition accuracy.

Figure 5.2: RAT: Distribution of NSD (a) Gaussian approximation (σ2=0.005); (b) at

different σ2 values.

5.3.2 Random Multiplicative Transform

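The RMT performs element-wise multiplication between a randomly generated vector and the original feature vector. Let $\mathbf{u}$ and $\mathbf{v}$ be two biometric feature vectors in an $N$-dimensional Euclidean space, $\mathbf{u} \in \mathbb{R}^N$ and $\mathbf{v} \in \mathbb{R}^N$. Let $\mathbf{r} \in \mathbb{R}^N$ and $\mathbf{s} \in \mathbb{R}^N$ be two $N$-dimensional random vectors. Each entry of $\mathbf{r}$ and $\mathbf{s}$ follows an i.i.d. Gaussian distribution of mean one and variance $\sigma^2$, $r_i \sim \mathcal{N}(1, \sigma^2)$, $s_i \sim \mathcal{N}(1, \sigma^2)$, $i = 1, \ldots, N$. Let $\mathbf{x} = \mathbf{u} \mathbin{.*} \mathbf{r}$, $\mathbf{y} = \mathbf{v} \mathbin{.*} \mathbf{s}$, where $.*$ denotes multiplication by elements. In the SK scenario, i.e., $\mathbf{r} = \mathbf{s}$, we have:

$$E[\|\mathbf{x} - \mathbf{y}\|^2] = (\sigma^2 + 1)\|\mathbf{u} - \mathbf{v}\|^2, \tag{5.4}$$

$$\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] = (2\sigma^4 + 4\sigma^2)\sum_{i=1}^{N}(u_i - v_i)^4. \tag{5.5}$$

Please see Appendix 5-II for the proofs.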


Eqn. (5.4) and Eqn. (5.5) show that the RMT preserves the mean of the SED between two vectors in the transformed domain up to a scaling factor $\sigma^2 + 1$, and that the variance increases with $\sigma^2$: the larger the $\sigma^2$, the bigger the variance. In the DK case, where $\mathbf{r} \neq \mathbf{s}$ and they are independent of each other, we can derive the statistical properties of the SED in the transformed domain:

$$E[\|\mathbf{x} - \mathbf{y}\|^2] = \sigma^2(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2) + \|\mathbf{u} - \mathbf{v}\|^2, \tag{5.6}$$

$$\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] = 2\sigma^4\sum_{i=1}^{N}(u_i^2 + v_i^2)^2 + 4\sigma^2\sum_{i=1}^{N}(u_i - v_i)^2(u_i^2 + v_i^2). \tag{5.7}$$

Please see Appendix 5-III for the proofs.
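The expected values in Eqns. (5.4) and (5.6) can be compared against a simulation with the following sketch. The toy vectors, the seed, and the function names are assumptions made for illustration only.

```python
import random
from statistics import fmean


def rmt_theory_means(u, v, sigma2):
    """Expected SED for same keys, Eqn. (5.4), and different keys, Eqn. (5.6)."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    norms = sum(a * a for a in u) + sum(b * b for b in v)
    return (sigma2 + 1.0) * d2, sigma2 * norms + d2


def rmt_simulate_means(u, v, sigma2, trials, seed=0):
    """Empirical SK and DK means with multiplicative vectors drawn from N(1, sigma2)."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    sk, dk = [], []
    for _ in range(trials):
        r = [rng.gauss(1.0, sigma) for _ in u]
        s = [rng.gauss(1.0, sigma) for _ in u]
        sk.append(sum((ri * a - ri * b) ** 2 for ri, a, b in zip(r, u, v)))
        dk.append(sum((ri * a - si * b) ** 2 for ri, si, a, b in zip(r, s, u, v)))
    return fmean(sk), fmean(dk)


# Toy vectors (hypothetical), N = 10.
u = [0.1 * i for i in range(10)]
v = [0.1 * i + 0.05 for i in range(10)]

print(rmt_theory_means(u, v, 0.01))
print(rmt_simulate_means(u, v, 0.01, trials=20000))
```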

The statistical properties of RMT are validated using two randomly selected unit-length PCA feature vectors of dimensionality 100 from our experimental data set, with each experiment repeated for 2000 trials. Figure 5.3 demonstrates that the theoretical SED distributions in Eqn. (5.4) - (5.7) fit well with the experimental results in both SK and DK scenarios, and the distributions are approximately Gaussian.

Figure 5.3: RMT: Gaussian approximation of the distribution of SED (σ2 = 0.005).

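To obtain changeability, we expect the distributions of the SK and DK cases to be well separated. Eqn. (5.4) and Eqn. (5.6) can be rewritten as:

$$E\left[\frac{\|\mathbf{x} - \mathbf{y}\|^2}{\sigma^2 + 1}\right] = \|\mathbf{u} - \mathbf{v}\|^2, \tag{5.8}$$

$$E\left[\frac{\|\mathbf{x} - \mathbf{y}\|^2}{\sigma^2 + 1}\right] = \frac{\sigma^2(\|\mathbf{u} - \mathbf{v}\|^2 + 2\mathbf{u}^T\mathbf{v}) + \|\mathbf{u} - \mathbf{v}\|^2}{\sigma^2 + 1} = \|\mathbf{u} - \mathbf{v}\|^2 + \frac{2\sigma^2\mathbf{u}^T\mathbf{v}}{\sigma^2 + 1}. \tag{5.9}$$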

Eqn. (5.8) and Eqn. (5.9) show that the separation of the distributions depends on $\sigma^2$ and on the inner product of the vectors. Since there is no guarantee that $\mathbf{u}^T\mathbf{v} > 0$, it is possible that the SED in the DK transformed domain is even smaller than the original SED, i.e., weak changeability. This is confirmed in Figure 5.4(a), where the distribution of SED is plotted at different $\sigma^2$ values. It can be observed that the SK and DK distributions are not well separated and overlap significantly.

Figure 5.4: RMT: Distribution of SED (a) at different σ2 values; (b) at different d values

(σ2 = 0.01).

To solve this problem, we note that $\frac{2\sigma^2}{\sigma^2 + 1} > 0$, so the SED in the DK case can be enlarged by increasing $\mathbf{u}^T\mathbf{v}$. This can be achieved by adding a translation vector $\mathbf{d}$ to $\mathbf{u}$ and $\mathbf{v}$, such that $\mathbf{u}^T\mathbf{v}$ is augmented while the SED of the original vectors $\|\mathbf{u} - \mathbf{v}\|^2$ is unaltered. As such, the distributions of the SK and DK cases can be well separated, and strong changeability can be obtained. This is shown in Figure 5.4(b), where the distributions of SED at different translation values are plotted. For simplicity, all the elements in $\mathbf{d}$ are set to the same value $d$. Since the addition of $\mathbf{d}$ does not change the SED when the same key is applied, the distribution of the SK case is the same for different $d$ values. On the other hand, it can be seen that by adding an appropriate translation value, the mean of the DK distribution shifts to the right, away from the SK distribution. The clear separation of the SK and DK distributions indicates the possibility of producing strong changeability.
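The effect of the translation on the expected SED can be illustrated by evaluating Eqn. (5.4) and Eqn. (5.6) on the translated vectors. The sketch below is only an illustration; the toy vectors and the helper names are hypothetical.

```python
def squared_norm(a):
    return sum(x * x for x in a)


def expected_sed(u, v, sigma2, d=0.0):
    """Return (SK mean, DK mean) of the SED after translating u and v by d.

    SK follows Eqn. (5.4) and DK follows Eqn. (5.6), applied to u + d and v + d.
    """
    ut = [a + d for a in u]
    vt = [b + d for b in v]
    dist2 = sum((a - b) ** 2 for a, b in zip(ut, vt))
    sk = (sigma2 + 1.0) * dist2
    dk = sigma2 * (squared_norm(ut) + squared_norm(vt)) + dist2
    return sk, dk


# Hypothetical zero-mean feature vectors of the same subject (toy values).
u = [0.5, -0.5, 0.3, -0.3]
v = [0.45, -0.55, 0.35, -0.25]

for d in (0.0, 0.5, 1.0):
    sk, dk = expected_sed(u, v, sigma2=0.01, d=d)
    print(f"d={d}: SK mean={sk:.4f}, DK mean={dk:.4f}, gap={dk - sk:.4f}")
```

The gap between the two means equals $2\sigma^2(\mathbf{u}+\mathbf{d})^T(\mathbf{v}+\mathbf{d})$, so for zero-mean vectors it grows by $2\sigma^2 N d^2$ while the SK mean stays fixed.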

Figure 5.5: RMT: Gaussian approximation of the distribution of NSD (σ2 = 0.005).

Due to the distance approximation property of SIN, similar properties may be obtained when the SIN method is applied in the RMT transformed domain. Figure 5.5 demonstrates that in both SK and DK cases, the distributions of NSD can also be approximated by Gaussian distributions. The NSD distributions in the RMT transformed domain are depicted in Figure 5.6(a) at different $\sigma^2$ values. Similar to the SED distributions, the significant distribution overlap indicates weak changeability. As shown in Figure 5.6(b), by adding a translation value, it is possible to produce a clear separation of the SK and DK distributions. Note that, unlike in Figure 5.4(b), where the addition of $\mathbf{d}$ does not change the distribution of SED in the SK case, the mean of the NSD shifts to the left as the translation value increases. As with the RAT-SIN method, the larger the $d$ value, the better the changeability, but possibly the lower the verification performance.


Figure 5.6: RMT: Distribution of NSD (a) at different σ2 values; (b) at different d values

(σ2 = 0.01).

5.3.3 Random Projection

The changeability of RP has been analyzed in Chapter 3 using a geometric based approach. In this section, we provide an alternative analysis by studying the statistical properties of the features in the projected domain. Let $\mathbf{u}$ and $\mathbf{v}$ denote two vectors in an $N$-dimensional Euclidean space, $\mathbf{u} \in \mathbb{R}^N$ and $\mathbf{v} \in \mathbb{R}^N$. Let $R$ be an $N \times M$ ($M \le N$) matrix whose entries $r_{ij}$, $i = 1, \ldots, N$, $j = 1, \ldots, M$, follow an i.i.d. Gaussian distribution, $r_{ij} \sim \mathcal{N}(0, \frac{1}{N})$. Let $\mathbf{x} = \sqrt{\frac{N}{M}} R^T\mathbf{u}$ and $\mathbf{y} = \sqrt{\frac{N}{M}} R^T\mathbf{v}$; then it is shown in Chapter 3, Lemma 3.4 that:

$$E[\|\mathbf{x} - \mathbf{y}\|^2] = \|\mathbf{u} - \mathbf{v}\|^2, \tag{5.10}$$

$$\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] = \frac{2}{M}\|\mathbf{u} - \mathbf{v}\|^4. \tag{5.11}$$

This shows that when SK is applied, the mean of the SED in the projected domain equals the SED of the original vectors, and the variance is inversely proportional to the projected dimensionality $M$. Therefore, the higher the projected dimensionality, the better the distance is preserved in the transformed domain.

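To provide changeability, the biometric templates that are generated using DK should not be able to authenticate each other. For two vectors $\mathbf{u} \in \mathbb{R}^N$ and $\mathbf{v} \in \mathbb{R}^N$, let $R$ and $S$ be two independent $N \times M$ ($M \le N$) matrices with each entry of $R$ and $S$ an i.i.d. Gaussian random variable, i.e., $r_{ij} \sim \mathcal{N}(0, \frac{1}{N})$, $s_{ij} \sim \mathcal{N}(0, \frac{1}{N})$, $i = 1, \ldots, N$, $j = 1, \ldots, M$. Let $\mathbf{x} = \sqrt{\frac{N}{M}} R^T\mathbf{u}$ and $\mathbf{y} = \sqrt{\frac{N}{M}} S^T\mathbf{v}$; then we have:

$$E[\|\mathbf{x} - \mathbf{y}\|^2] = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2, \tag{5.12}$$

$$\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] = \frac{2}{M}(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)^2. \tag{5.13}$$

Please see Appendix 5-IV for the proofs.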
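A small Monte Carlo sketch can be used to check the expected values in Eqns. (5.10) and (5.12). It uses plain Python lists rather than a linear algebra library, and the toy unit-length vectors, dimensions, seed, and function names are assumptions for illustration.

```python
import random
from statistics import fmean


def sed(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_matrix(n, m, rng):
    """N x M matrix with i.i.d. N(0, 1/N) entries."""
    std = (1.0 / n) ** 0.5
    return [[rng.gauss(0.0, std) for _ in range(m)] for _ in range(n)]


def project(R, u):
    """Compute sqrt(N/M) * R^T u."""
    n, m = len(R), len(R[0])
    scale = (n / m) ** 0.5
    return [scale * sum(R[i][j] * u[i] for i in range(n)) for j in range(m)]


def rp_simulate_means(u, v, m, trials, seed=0):
    """Empirical mean SED with the same matrix (SK) and independent matrices (DK)."""
    rng = random.Random(seed)
    n = len(u)
    sk, dk = [], []
    for _ in range(trials):
        R, S = random_matrix(n, m, rng), random_matrix(n, m, rng)
        x = project(R, u)
        sk.append(sed(x, project(R, v)))
        dk.append(sed(x, project(S, v)))
    return fmean(sk), fmean(dk)


def unit(a):
    norm = sum(x * x for x in a) ** 0.5
    return [x / norm for x in a]


# Toy unit-length vectors (hypothetical), N = 8, projected to M = 6.
u = unit([1, 2, 3, 4, 5, 6, 7, 8])
v = unit([1, 2, 3, 4, 5, 6, 8, 7])
print(sed(u, v), rp_simulate_means(u, v, m=6, trials=5000))
```

For unit-length vectors the DK mean is expected to be close to 2, regardless of how similar the two original vectors are.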

Eqn. (5.12) and Eqn. (5.13) show that when different RP matrices are applied, the mean of the SED equals the sum of the squared vector lengths, and the variance is inversely proportional to the projected dimensionality $M$. Figure 5.7(a) shows the distribution of the SED between two feature vectors in the SK and DK scenarios. We randomly selected two PCA feature vectors ($N = 100$) of the same human subject from the employed data set, normalized them to unit length, and repeated the RP for 2000 trials. It can be seen that the theoretical results in Eqn. (5.10) - (5.13) fit very well with the experimental results ($M = 80$). The distributions of SED are approximately Gaussian in both SK and DK scenarios.

Note that although the DK distribution has a mean that is larger than that of the

SK distribution, the separation of the SK and DK distributions is dependent on the

characteristics of the features. For example, let u and v denote two vectors from the

same subject, if ‖u− v‖2 is large, i.e., large within-class variation, then the SK and DK

distributions will possibly have overlap, hence clear separation of distribution can not

be obtained, and strong changeability can not be achieved. Figure 5.7(b) depicts the

SK and DK distributions at different projected dimensionalities. The relation between

the projected dimensionality and the variance of distance distribution can be easily ob-

served. The lower the M , the higher the variance. The degree of distribution overlap

also increases as the projected dimensionality decreases.

The Euclidean distance approximation property of the SIN method suggests a possibly similar changeability characteristic when the SIN method is applied after RP (RP-SIN). This is confirmed in Figure 5.8(a) and Figure 5.8(b): the NSD also approximates Gaussian distributions in both SK and DK scenarios, and the variance increases as $M$ decreases. Similarly, the SK and DK distributions overlap, and a clear separation of the distributions can not be obtained.

Figure 5.7: RP: Distribution of SED (a) Gaussian approximation (M=80); (b) at different

projected dimensionalities.

Figure 5.8: RP: Distribution of NSD (a) Gaussian approximation (M=80); (b) at different projected dimensionalities.

Note that in Eqn. (5.12), the expected SED is equal to the sum of the squared lengths of the two vectors. By increasing the length of the vectors, the SED in the DK case will be further enlarged. To achieve this, we can apply a vector translation, i.e., $\mathbf{x} = \sqrt{\frac{N}{M}} R^T(\mathbf{u} + \mathbf{d})$. Figure 5.9 shows the impact of vector translation with different translation values. All the elements in $\mathbf{d}$ are set to the same value $d$. Since

$$\left\|\sqrt{\tfrac{N}{M}} R^T(\mathbf{u} + \mathbf{d}) - \sqrt{\tfrac{N}{M}} R^T(\mathbf{v} + \mathbf{d})\right\|^2 = \left\|\sqrt{\tfrac{N}{M}} R^T\mathbf{u} - \sqrt{\tfrac{N}{M}} R^T\mathbf{v}\right\|^2,$$

the vector translation by $\mathbf{d}$ does not change the SED between two vectors using the same key; therefore, in Figure 5.9(a), the distribution of the SK case does not change with $d$. However, as $d$ increases, the distribution of the DK case shifts to the right, and a clear separation of the SK and DK distributions can be obtained. For the SIN distance in Figure 5.9(b), due to the randomness of the projection, the DK distribution is always centered around 0.5. The vector translation operation shifts the SK distribution to the left, and the distributions can be well separated. The clear separation of the distributions indicates strong changeability.

Figure 5.9: RP: Distribution of (a) SED, and (b) NSD, at different vector translation

values.


In this section, the privacy preserving properties of the random transformations in combi-

nation with the SIN method are discussed in detail using the previously defined element-

wise and vector-wise privacy measures in Chapter 4.


5.4.1 RAT-SIN

The RAT method itself (without applying the SIN operation) is capable of producing

changeability for the generated templates. Without any knowledge of the RAT additive

vector, it is impossible for an attacker to recover the values of the original features.

However, such a method only offers limited privacy protection since the exact value of the

original biometric features will be computed by a simple element-by-element subtraction

if the RAT additive vector is known. To solve this problem, we introduce the combination

of RAT with SIN for achieving enhanced privacy protection.

In the RAT based SIN framework (RAT-SIN), a random vector $\mathbf{r} \in \mathbb{R}^N$ with each element $r_i$ an i.i.d. random variable of mean zero and variance $\sigma_r^2$ is added to the biometric feature vector $\mathbf{u} \in \mathbb{R}^N$, and the SIN vector of the resulting vector $\mathbf{x} = \mathbf{u} + \mathbf{r}$ is stored as the template. Since the biometric features are mean centralized, $\mathbf{u}$ has zero mean. Let $\sigma_u^2$ denote the variance of the elements in $\mathbf{u}$; then the variance of the elements in $\mathbf{x}$ is $\sigma_x^2 = \sigma_u^2 + \sigma_r^2$. Due to the randomness of $\mathbf{r}$, it is impossible for an adversary to accurately estimate $\mathbf{x}$ without knowing $\mathbf{r}$. Assuming the worst case, in which an attacker knows the distribution of $\mathbf{u}$ and also obtains $\mathbf{r}$, he can estimate the variance of $\mathbf{r}$, generate a set of random numbers of mean zero and variance $\sigma_u^2 + \sigma_r^2$, estimate $\mathbf{x}$ according to the SIN vector $\mathbf{g}$, and then subtract $\mathbf{r}$ to get $\hat{\mathbf{u}} = \hat{\mathbf{x}} - \mathbf{r}$.

For fixed dimensionality $N$, Figure 5.10 plots the variance of the order statistics $\sigma_{j:N}^2$ as a function of the variance $\sigma_x^2$, assuming a Gaussian distribution. It can be seen that $\sigma_{j:N}^2$ is proportional to the variance $\sigma_x^2$: the larger the $\sigma_x^2$, the greater the $\sigma_{j:N}^2$. Since the element-wise privacy $\alpha$ and the vector-wise privacy $\beta$ both grow with $\sigma_{j:N}^2$, the larger the $\sigma_r^2$, the greater the $\sigma_x^2$, and the better the privacy protection. This is confirmed in Figure 5.11, where the $\alpha$ and $\beta$ values are plotted as functions of $\sigma_r^2$, with the dimensionality set to 100 and the PCA vectors normalized to zero mean and a variance of 0.01. It can be seen that better privacy protection can be obtained by increasing the variance of the additive vector $\mathbf{r}$.
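The worst-case estimation described above can be sketched as follows. This is a hedged illustration of the attack model only: the Gaussian draws, the function names, and the random generator handling are assumptions, and the sketch does not compute the privacy measures plotted in Figure 5.11.

```python
import random


def sorted_index_numbers(x):
    """1-based indices of x ordered by descending value (the SIN vector)."""
    order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)
    return [i + 1 for i in order]


def estimate_from_sin(g, mean, variance, rng):
    """Attacker's guess of x: draw len(g) Gaussian values, sort them in
    descending order, and place the k-th largest value at position g[k]."""
    draws = sorted((rng.gauss(mean, variance ** 0.5) for _ in g), reverse=True)
    x_hat = [0.0] * len(g)
    for rank, index in enumerate(g):
        x_hat[index - 1] = draws[rank]
    return x_hat


def attack_rat_sin(g, r, var_u, var_r, rng):
    """Estimate u from the SIN vector g when the additive vector r is known."""
    x_hat = estimate_from_sin(g, 0.0, var_u + var_r, rng)
    return [xh - ri for xh, ri in zip(x_hat, r)]


rng = random.Random(7)
u = [rng.gauss(0.0, 0.1) for _ in range(20)]   # hypothetical features, variance 0.01
r = [rng.gauss(0.0, 0.2) for _ in range(20)]   # additive vector, variance 0.04
g = sorted_index_numbers([a + b for a, b in zip(u, r)])
u_hat = attack_rat_sin(g, r, var_u=0.01, var_r=0.04, rng=rng)
```

The estimate reproduces the ordering stored in the template but not the values: the larger the variance of the draws, the further each guessed order statistic can fall from the true one.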


Figure 5.10: Variance $\sigma_{j:N}^2$ as a function of the variance of $\mathbf{x}$, $\sigma_x^2$.

Figure 5.11: Privacy measures of RAT method as functions of variance σ2r .

5.4.2 RMT-SIN

In the RMT based framework RMT-SIN, the SIN vector $\mathbf{g}$ of the RMT transformed vector $\mathbf{x}$ is stored as the template, with each element of $\mathbf{x}$ obtained by $x_i = r_i(u_i + d)$, $i = 1, 2, \ldots, N$, where $r_i \sim \mathcal{N}(1, \sigma_r^2)$, $u_i$ is the $i$th element of the feature vector $\mathbf{u}$ of mean zero and variance $\sigma_u^2$, and $d$ is a translation value. It is straightforward to derive that $E[x_i] = E[r_i(u_i + d)] = d$, $E[x_i^2] = E[r_i^2(u_i + d)^2] = (\sigma_r^2 + 1)(\sigma_u^2 + d^2)$, and the variance of $x_i$ is $\sigma_x^2 = E[x_i^2] - E[x_i]^2 = \sigma_u^2(\sigma_r^2 + 1) + d^2\sigma_r^2$.
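The mean and variance derived above can be checked with a short simulation. The sketch assumes, for illustration only, that $u_i$ is Gaussian; the derivation itself only requires zero mean, variance $\sigma_u^2$, and independence from $r_i$. The parameter values and function names are hypothetical.

```python
import random
from statistics import fmean, pvariance


def rmt_element_variance(var_u, var_r, d):
    """Closed form: sigma_x^2 = sigma_u^2 (sigma_r^2 + 1) + d^2 sigma_r^2."""
    return var_u * (var_r + 1.0) + d * d * var_r


def simulate_rmt_element(var_u, var_r, d, samples, seed=0):
    """Draw x = r * (u + d) with r ~ N(1, var_r) and u ~ N(0, var_u)."""
    rng = random.Random(seed)
    xs = [rng.gauss(1.0, var_r ** 0.5) * (rng.gauss(0.0, var_u ** 0.5) + d)
          for _ in range(samples)]
    return fmean(xs), pvariance(xs)


print(rmt_element_variance(0.01, 0.02, 0.5))
print(simulate_rmt_element(0.01, 0.02, 0.5, samples=50000))
```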

Assuming the worst case, in which an attacker knows the distribution of $\mathbf{u}$ and obtains the values of $d$ and $\mathbf{r}$, he can generate a set of $N$ random numbers of mean $d$ and variance $\sigma_x^2$, estimate $\mathbf{x}$ by mapping the numbers according to the SIN vector $\mathbf{g}$, and perform element-wise division followed by subtraction of $d$ to obtain an estimate of $u_i$ as $\hat{u}_i = \hat{x}_i / r_i - d$. As shown in Figure 5.10, the variance of the order statistics $\sigma_{j:N}^2$ increases as $\sigma_x^2$ increases. In the RMT-SIN method, $\sigma_x^2$ increases with $\sigma_r^2$ and $d$. Since both privacy measures $\alpha$ and $\beta$ grow with $\sigma_{j:N}^2$, the larger the $\sigma_r^2$ and $d$, the greater the $\sigma_{j:N}^2$, and hence the better the privacy. Figure 5.12 shows the element-wise privacy $\alpha$ and vector-wise privacy $\beta$ as functions of the variance of the multiplicative vector $\sigma_r^2$ and the translation value $d$ respectively, using PCA feature vectors. It can be seen that the privacy protection level improves at higher $\sigma_r^2$ and $d$ values.

Figure 5.12: Privacy measures of RMT-SIN as functions of variance σ2r and translation

value d: (a) α, (b) β.


5.4.3 RP-SIN

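In the RP based framework RP-SIN, the SIN vector $\mathbf{g}$ of $\mathbf{x} = \sqrt{\frac{N}{M}} R^T(\mathbf{u} + \mathbf{d})$ is stored as the template, where $R \in \mathbb{R}^{N \times M}$ with each entry $r_{ij} \sim \mathcal{N}(0, \frac{1}{N})$, $\mathbf{u}$ is the feature vector of mean zero and variance $\sigma_u^2$, and $\mathbf{d}$ is the translation vector. Assuming all the elements in $\mathbf{d}$ have the same value $d$, for each element in $\mathbf{x}$, we have:

$$E[x_i] = E\left[\sqrt{\frac{N}{M}}\sum_{j=1}^{N} r_{ji}(u_j + d)\right] = 0, \tag{5.14}$$

$$\begin{aligned}
\sigma_x^2 = \mathrm{Var}[x_i] &= E[x_i^2] - E[x_i]^2 \\
&= E\left[\left(\sqrt{\frac{N}{M}}\sum_{j=1}^{N} r_{ji}(u_j + d)\right)^2\right] \\
&= \frac{N}{M} E\left[\left(\sum_{j=1}^{N} r_{ji}(u_j + d)\right)^2\right] \\
&= \frac{N}{M} E\left[\sum_{j=1}^{N} r_{ji}^2(u_j + d)^2\right] \\
&= \frac{N}{M}\sum_{j=1}^{N} E[r_{ji}^2(u_j + d)^2] \\
&= \frac{N}{M}\sum_{j=1}^{N} \frac{1}{N}(\sigma_u^2 + d^2) \\
&= \frac{N}{M}(\sigma_u^2 + d^2).
\end{aligned} \tag{5.15}$$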

Similar to our previous analysis, we assume the worst case where $\mathbf{g}$, $R$, and $\mathbf{d}$ are all compromised by an attacker. For a projection function $\mathbf{x} = R^T(\mathbf{u} + \mathbf{d})$, the most an attacker can do is to generate a set of $M$ random numbers with the mean and variance shown in Eqn. (5.14) and Eqn. (5.15), map them to $\hat{\mathbf{x}}$ according to $\mathbf{g}$, and then estimate $\mathbf{u}$ by $R(R^T R)^{-1}\hat{\mathbf{x}} - \mathbf{d}$, where $R(R^T R)^{-1}$ is essentially the pseudo-inverse of $R^T$. In Eqn. (5.15), it is shown that $\sigma_x^2$ increases as $M$ decreases and $d$ increases. According to Figure 5.10, the greater the $\sigma_x^2$, the larger the variance of the order statistics $\sigma_{j:N}^2$, and the larger the privacy measures $\alpha$ and $\beta$. Therefore, in the RP-SIN method, better privacy can be achieved at a lower projected dimensionality $M$ and a greater translation value $d$. This is demonstrated in Figure 5.13 using PCA feature vectors, where $\alpha$ and $\beta$ are plotted as functions of $M$ and $d$ respectively.

Figure 5.13: Privacy measures of RP-SIN as functions of projected dimensionality M

and translation value d: (a) α, (b) β.


To evaluate the effectiveness of the proposed method, we conduct experiments on the generic verification data set with the same experimental setup as in Chapters 3 and 4. The training set contains 2388 images from 520 subjects, and 2278 images from 500 human subjects are used for testing. All the experiments are performed 5 times, and the averages of the results are reported. PCA and KDDA are selected as feature extractors, and the original dimensionality is set to 100 for both of them. The detailed experimental results are presented in this section.


5.5.1 RAT-SIN

Table 5.1 shows the obtained EER of applying the RAT-SIN method on PCA and KDDA feature vectors at different values of the additive vector variance $\sigma^2$, and a graphical presentation is provided in Figure 5.14. Note that when the variance is zero, the method is equivalent to applying SIN on the original vectors. It can be observed that as the variance increases, the EER in the UD scenario decreases gradually to zero, while that of the UI scenario increases slightly. This is consistent with our analysis in Section 5.3.1. As shown in Figure 5.2, when the variance of the random additive vector increases, clearer separation of the SK and DK distributions can be obtained, hence better changeability can be achieved. However, it can also be observed that the distribution of the NSD in the SK scenario shifts to the left as the variance increases. This indicates larger deviation from the original characteristics of the features, which may degrade the performance in the UI scenario.
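The RAT-SIN template generation discussed above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments. The SIN vector is taken here to be the list of element indices in order of increasing value (the exact convention is defined in Chapter 4), the key is assumed to seed the random number generator, and the names `sorted_index_numbers` and `rat_sin_template` are hypothetical.

```python
import random


def sorted_index_numbers(x):
    """Indices of x in order of increasing element value (assumed SIN convention)."""
    return sorted(range(len(x)), key=lambda i: x[i])


def rat_sin_template(u, key, sigma2):
    """Add a key-seeded vector r with r_i ~ N(0, sigma2) to u, then keep only the order."""
    rng = random.Random(key)  # assumption: the key seeds the generator
    sigma = sigma2 ** 0.5
    x = [ui + rng.gauss(0.0, sigma) for ui in u]
    return sorted_index_numbers(x)


if __name__ == "__main__":
    u = [0.01 * i for i in range(20)]  # toy feature vector, not real data
    print(rat_sin_template(u, key=7, sigma2=0.0))   # plain SIN of u
    print(rat_sin_template(u, key=7, sigma2=1.0))   # template under key 7
    print(rat_sin_template(u, key=8, sigma2=1.0))   # reissued under key 8
```

With zero variance the template reduces to the SIN vector of the original features, and changing the key yields a different template from the same features, which is the behaviour the UD results rely on.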

Features  Scenario  σ²=0.0005  0.001   0.002   0.003   0.004   0.005   0.008   0.01    0.015
PCA       UI        10.23      11.35   11.95   12.67   13.12   12.86   13.97   14.07   14.16
PCA       UD        5.25       3.04    1.21    0.54    0.27    0.1     0.02    0.004   0
KDDA      UI        6.69       6.86    7.29    7.35    7.68    7.66    7.99    7.87    8.06
KDDA      UD        4.68       3.48    1.98    1.14    0.64    0.35    0.06    0.01    0

Table 5.1: Experimental results (EER in %) of RAT-SIN method on PCA and KDDA features at selected σ² values.

Figure 5.14: Obtained EER of RAT-SIN method for PCA and KDDA as functions of the variance of additive vector.

5.5.2 RMT-SIN

Figure 5.15 depicts the obtained EER of the RMT-SIN method as functions of the variance of the multiplicative vector $\sigma^2$ and the translation value $d$. The detailed EER values at $\sigma^2 = 0.02$ are shown in Table 5.2. It can be observed that without translation, zero EER cannot be obtained in the UD scenario, which implies weak changeability. This is consistent with our analysis in Section 5.3.2 and the plot in Figure 5.6(a): without translation, clear separation of the SK and DK distributions cannot be obtained. In fact, without translation, the performance in both the UI and UD scenarios remains almost constant regardless of the multiplicative vector variance. As shown in Figure 5.6(b), with proper translation, the DK distribution shifts to the right towards 0.5, and a clear separation of the SK and DK distributions can be achieved. This is confirmed in Figures 5.15(b) and 5.15(d), where zero EER is obtained in the UD scenario with proper translation. On the other hand, since the SK distribution in Figure 5.6(b) also shifts to the left as the translation value increases, which indicates deviation from the characteristics of the original features, the performance in the UI scenario may degrade as $d$ increases, as shown in Figures 5.15(a) and 5.15(c).

Figure 5.15: Obtained EER of RMT-SIN method: (a) PCA UI, (b) PCA UD, (c) KDDA UI, (d) KDDA UD.

Features  Scenario  d=0     0.2     0.4     0.6     0.8     1
PCA       UI        9.42    10.75   12.66   13.72   14.19   14.71
PCA       UD        9.51    3.76    0.48    0.04    0       0
KDDA      UI        6.45    7.04    7.96    8.36    8.49    8.71
KDDA      UD        6.42    4.05    0.93    0.1     0.01    0

Table 5.2: Experimental results (EER in %) of RMT-SIN method on PCA and KDDA features at different d values (σ²=0.02).

5.5.3 RP-SIN

In the evaluation of the RP-SIN method, the original feature vectors have dimensionality $N = 100$, and the projected dimensionality $M$ is varied from 20 to 100. Figure 5.16 depicts the obtained EER as functions of $M$, as well as the translation value $d$. When $d = 0$, it can be seen that the recognition accuracy decreases as $M$ decreases, and strong changeability cannot be obtained at any projected dimensionality since the EER in the UD scenario is not zero. This agrees with our analysis in Section 5.3.3 that the variance of the distance preservation increases as the projected dimensionality decreases, and that the changeability depends on the dimensionality and characteristics of the features. Better recognition accuracy and changeability can be obtained at a higher dimensionality. On the other hand, it can be observed that with proper translation, strong changeability (zero EER) can be achieved at all projected dimensionalities, and the translation operation only introduces slight performance degradation in the UI scenario.

Figure 5.16: Obtained EER of RP-SIN method for PCA and KDDA.

The proposed RP-SIN method is similar to the BioHashing (BH) technique in that both utilize RP prior to discretization. For the purpose of comparative study, we compare the performance of RP-SIN with that of the BH method. For the BH technique, as illustrated in [61,62], each bit of the generated BH code should have a probability of 50% to be 1 or 0. To achieve this, we centralize all the feature vectors by subtracting the mean, and then compare with the threshold value 0. We first compare the results when vector translation is not applied. Table 5.3 reports the obtained EER, with a graphical comparison shown in Figure 5.17. It can be seen that the RP-SIN method outperforms the BH method in both UI and UD scenarios, at all projected dimensionalities. Without vector translation, neither method can produce zero EER in the UD scenario, which demonstrates weak changeability. Although previous works on BH demonstrate near zero EER in both-legitimate cases, its performance relies on the characteristics of the feature extractors. Figure 5.18 shows the intra-class and inter-class distributions of the generic data set. It can be observed that the SIN method provides better distribution separation than the BH method, in both the UI and UD scenarios, with both PCA and KDDA feature extractors. This demonstrates that the proposed SIN method provides a more discriminatory representation than the simple thresholding method in BioHashing.

        PCA                                KDDA
        UD              UI                 UD              UI
Dim.    BH      RP-SIN  BH      RP-SIN     BH      RP-SIN  BH      RP-SIN
20      22.13   16.92   25.25   20.82      18.77   12.96   18.63   13.58
40      17.80   13.44   21.43   18.69      13.03   7.70    13.96   9.23
60      15.54   11.76   19.24   17.63      9.85    5.68    10.92   7.38
80      14.38   10.76   18.34   17.18      7.97    4.54    9.37    6.64
100     12.98   9.89    17.79   16.83      6.84    3.83    8.63    6.05

Table 5.3: Comparison of BH with RP-SIN (EER in %).

Figure 5.17: Comparison of RP-SIN and BH.

Figure 5.18: Intra-class and inter-class distributions of RP-SIN and BH, using PCA and KDDA feature extractors, in both user-independent and user-dependent scenarios.

Since the discretization methods in both RP-SIN and BH are supposed to preserve the distance between two vectors, the vector translation method should be able to produce stronger changeability for the BH technique as well. Figure 5.19 compares the performance of RP-SIN and BH as functions of the translation value $d$, with the original and projected dimensionality both set to 100, and the detailed results are shown in Table 5.4. It can be seen that with proper translation, both methods are capable of producing zero EER in the UD scenario. However, the BH technique introduces greater degradation in the UI scenario than the RP-SIN method. This is because the BH method discretizes each individual element by comparing it with a threshold value. After vector translation, the probability of getting 1 or 0 in each dimension is no longer 50%, and the performance therefore degrades significantly. On the other hand, the pairwise relation of the original vector elements does not change when the same value is added to all the elements, and the RP operation introduces only slight degradation to the performance.
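The contrast drawn above between thresholding and sorting can be seen on a toy vector. The Python sketch below only illustrates the order-invariance property cited in the text. It applies the shift directly to the vector and omits the projection step of the full RP-SIN and BH pipelines. The vector values, the SIN convention (indices in increasing order of value), and the function names are assumptions made for illustration.

```python
def sorted_index_numbers(x):
    """Indices of x in order of increasing element value (assumed SIN convention)."""
    return sorted(range(len(x)), key=lambda i: x[i])


def threshold_bits(x, threshold=0.0):
    """BioHashing-style discretization: one bit per element against a threshold."""
    return [1 if xi > threshold else 0 for xi in x]


# Toy zero-centred vector (hypothetical values) and a common translation d.
u = [-0.30, 0.12, -0.05, 0.41, -0.22, 0.04]
d = 0.25
shifted = [ui + d for ui in u]

if __name__ == "__main__":
    # The sort order is unchanged by adding the same value to every element.
    print(sorted_index_numbers(u), sorted_index_numbers(shifted))
    # The thresholded bits are no longer balanced between 0 and 1 after the shift.
    print(threshold_bits(u), threshold_bits(shifted))
```

In this example half of the bits are 1 before the shift and five of six are 1 afterwards, while the sorted index numbers are identical in both cases.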

        PCA                                KDDA
        UD              UI                 UD              UI
d       BH      RP-SIN  BH      RP-SIN     BH      RP-SIN  BH      RP-SIN
0       12.85   10.08   17.64   16.75      6.84    3.76    8.74    5.96
0.02    11.41   8.82    17.51   16.62      6.48    3.64    8.86    6.36
0.04    9.10    6.08    18.07   16.74      4.56    2.19    9.49    6.78
0.06    6.17    3.67    18.27   16.86      2.68    0.71    10.45   7.35
0.08    3.21    1.79    18.91   16.84      1.25    0.1     11.90   7.57
0.10    1.68    0.62    19.77   16.93      0.53    0.02    12.86   7.83
0.12    0.76    0.05    19.83   16.98      0.22    0.01    13.54   7.87
0.14    0.34    0.02    20.12   17.13      0.08    0       14.87   7.77
0.16    0.14    0       20.68   17.15      0.03    0       15.80   7.79
0.18    0.05    0       21.27   17.05      0.01    0       16.34   7.70
0.20    0.01    0       22.02   17.08      0       0       16.48   7.60

Table 5.4: Comparison of BH with RP-SIN with translation (EER in %).


Figure 5.19: Obtained EER of RP-SIN and BH as functions of translation value.

5.5.4 Discussion

The experimental results demonstrate that when the SIN method is applied in conjunction with any of the three types of random transformations, zero EER can be obtained in the UD scenario with appropriate parameters. To provide a comparison of the proposed approaches, as well as the performance of the original features, the obtained EER of the different approaches is shown in Table 5.5, and the ROC curves are plotted in Figure 5.20. For the random transformation based SIN methods, the parameters are selected such that the vector-wise privacy $\beta = 0.5$ (RAT-SIN: $\sigma_d^2 = 0.18$; RMT-SIN: $\sigma_r^2 = 0.03$ and $d = 2$; RP-SIN: $M = 80$ and $d = 0.15$), and the threshold values are selected based on the UI scenario. For the original features, the Euclidean and the Cosine distances are used as similarity measures. It can be seen that in the UD scenario, FAR=0 is achieved at all selections of system threshold values. This demonstrates the strong changeability of the proposed methods. In the UI scenario, the RAT-SIN method provides performance comparable to RMT-SIN, and both outperform the original features. The RP-SIN method obtains performance similar to that of the original features when the Euclidean distance is applied, and slightly worse than that of the Cosine distance measure. Moreover, the RAT based method has a computational advantage over the RMT and RP based approaches. For a feature vector of dimensionality $N$, the RAT method requires $N$ additions only, while the RMT method requires $N$ multiplications and $N$ additions, and the RP method needs $N \times M$ multiplications and $M \times (N - 1)$ additions when the projected dimensionality is $M$. Overall, the RAT-SIN method provides better performance and lower computational cost than RMT-SIN and RP-SIN.
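The operation counts stated above can be tabulated for concrete dimensionalities with a short Python sketch. The function name `transform_cost` is hypothetical, and the counts simply restate the figures given in the text. They do not account for the subsequent sorting step.

```python
def transform_cost(method, N, M=None):
    """Multiplications and additions of each transform, as counted in the text."""
    if method == "RAT":
        return {"mult": 0, "add": N}
    if method == "RMT":
        return {"mult": N, "add": N}
    if method == "RP":
        if M is None:
            raise ValueError("RP requires the projected dimensionality M")
        return {"mult": N * M, "add": M * (N - 1)}
    raise ValueError("unknown method: " + method)


if __name__ == "__main__":
    # N = 100 and M = 80 match the settings used for Table 5.5.
    for method in ("RAT", "RMT", "RP"):
        print(method, transform_cost(method, N=100, M=80))
```

For $N = 100$ and $M = 80$ the RP transform needs 8000 multiplications and 7920 additions, against 100 additions for RAT.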

          Original features   RAT-SIN        RMT-SIN        RP-SIN
Method    Euc.     Cos.       UD     UI      UD     UI      UD     UI
PCA       18.20    16.90      0      14.23   0      14.39   0      18.45
KDDA      10.41    8.84       0      7.28    0      7.44    0      10.94

Table 5.5: Comparison of different approaches (EER in %).

Figure 5.20: ROC curve of different methods.

5.6 Summary

This chapter has presented a systematic analysis of random transformation based meth-

ods for addressing the challenging problem of template changeability and privacy protec-

tion in biometrics enabled verification systems. The introduced solutions are based on

random transformations in conjunction with a sorted index number approach. Three


types of transformations, namely random additive transform, random multiplicative

transform, and random projection, are investigated. Detailed statistical analyses have

been performed on the same key and different key scenarios to study the changeability of

the proposed methods. It is shown that strong changeability can be obtained by selecting

appropriate random transformation parameters. The privacy preserving properties of the

presented approaches have been analyzed and demonstrated through both element-wise

and vector-wise privacy measures.

The proposed methods are capable of producing reissuable biometric templates by simply varying the randomness generation key. Two different application scenarios, user-independent and user-dependent random transformations, are discussed. The user-independent scenario applies the same random transform for all the users, while the user-dependent scenario utilizes a user-specific transform. Extensive experimentation demonstrates that the proposed methods can produce zero EER in the user-dependent scenario, which indicates their strong changeability. Comparisons with the original feature vectors and the BioHashing technique are also presented. Overall, the RAT-SIN method achieves better performance and computational efficiency than the RMT-SIN and RP-SIN methods, improves the recognition accuracy of the original features, and outperforms existing works.

5.7 Appendix

5.7.1 Appendix 5-I

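Proofs of Eqn. (5.2) and Eqn. (5.3): Let $\mathbf{x} = \mathbf{u} + \mathbf{r}$, $\mathbf{y} = \mathbf{v} + \mathbf{s}$, where $\mathbf{u} \in \mathbb{R}^N$, $\mathbf{v} \in \mathbb{R}^N$, $\mathbf{r} \in \mathbb{R}^N$, $\mathbf{s} \in \mathbb{R}^N$, and they are independent of each other. The entries of $\mathbf{r}$ and $\mathbf{s}$ are i.i.d. Gaussian random variables with mean zero and variance $\sigma^2$, $r_i \sim N(0, \sigma^2)$, $s_i \sim N(0, \sigma^2)$, $i = 1, ..., N$. Let $u_i$, $v_i$, $r_i$, $s_i$ denote the elements of $\mathbf{u}$, $\mathbf{v}$, $\mathbf{r}$, $\mathbf{s}$ respectively. Define $\mathbf{w} = \mathbf{u} - \mathbf{v}$ and $\mathbf{t} = \mathbf{r} - \mathbf{s}$; then the elements $t_i$ of $\mathbf{t}$ follow a Gaussian distribution of mean zero and variance $2\sigma^2$, $t_i \sim N(0, 2\sigma^2)$, $i = 1, ..., N$. With $E[t_i] = 0$, $E[t_i^2] = 2\sigma^2$ and $E[t_i^4] = 3(2\sigma^2)^2 = 12\sigma^4$, we have:

\[
\begin{aligned}
E[\|\mathbf{x} - \mathbf{y}\|^2] &= E[\|\mathbf{u} + \mathbf{r} - \mathbf{v} - \mathbf{s}\|^2] \\
&= E\left[\sum_{i=1}^{N} (u_i + r_i - v_i - s_i)^2\right] \\
&= E\left[\sum_{i=1}^{N} (w_i + t_i)^2\right] \\
&= E\left[\sum_{i=1}^{N} (w_i^2 + 2 w_i t_i + t_i^2)\right] \\
&= \sum_{i=1}^{N} w_i^2 + \sum_{i=1}^{N} E[t_i^2] \\
&= \|\mathbf{w}\|^2 + 2N\sigma^2 \\
&= \|\mathbf{u} - \mathbf{v}\|^2 + 2N\sigma^2
\end{aligned}
\]

and

\[
\begin{aligned}
E[\|\mathbf{x} - \mathbf{y}\|^4] &= E[\|\mathbf{u} + \mathbf{r} - \mathbf{v} - \mathbf{s}\|^4] \\
&= E\left[\left(\sum_{i=1}^{N} (u_i + r_i - v_i - s_i)^2\right)^2\right] \\
&= E\left[\left(\sum_{i=1}^{N} (w_i + t_i)^2\right)^2\right] \\
&= E\left[\left(\sum_{i=1}^{N} (w_i^2 + 2 w_i t_i + t_i^2)\right)^2\right] \\
&= E\left[\sum_{i=1}^{N}\sum_{j=1}^{N} w_i^2 w_j^2 + 4\sum_{i=1}^{N} w_i^2 t_i^2 + 2\sum_{i=1}^{N}\sum_{j=1}^{N} w_i^2 t_j^2 + \sum_{i=1}^{N}\sum_{j=1}^{N} t_i^2 t_j^2\right] \\
&= \|\mathbf{w}\|^4 + 8\sigma^2\|\mathbf{w}\|^2 + 4N\sigma^2\|\mathbf{w}\|^2 + 12N\sigma^4 + (N^2 - N)(2\sigma^2)^2 \\
&= \|\mathbf{w}\|^4 + (8 + 4N)\sigma^2(N\sigma^2 + \|\mathbf{w}\|^2)
\end{aligned}
\]

The variance can be computed as:

\[
\begin{aligned}
\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] &= E[\|\mathbf{x} - \mathbf{y}\|^4] - E[\|\mathbf{x} - \mathbf{y}\|^2]^2 \\
&= \|\mathbf{w}\|^4 + (8 + 4N)\sigma^2(N\sigma^2 + \|\mathbf{w}\|^2) - 4N^2\sigma^4 - 4N\sigma^2\|\mathbf{w}\|^2 - \|\mathbf{w}\|^4 \\
&= 8\|\mathbf{u} - \mathbf{v}\|^2\sigma^2 + 8N\sigma^4
\end{aligned}
\]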
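The two closed forms derived above can be checked numerically. The following Python sketch is an illustration only, with hypothetical function names and toy vectors. It draws independent additive vectors for $\mathbf{u}$ and $\mathbf{v}$ and compares the sample mean and variance of $\|\mathbf{x} - \mathbf{y}\|^2$ with $\|\mathbf{u} - \mathbf{v}\|^2 + 2N\sigma^2$ and $8\|\mathbf{u} - \mathbf{v}\|^2\sigma^2 + 8N\sigma^4$.

```python
import random


def additive_distance_moments(u, v, sigma2):
    """Closed-form mean and variance of ||x - y||^2 for x = u + r, y = v + s."""
    N = len(u)
    w2 = sum((a - b) ** 2 for a, b in zip(u, v))
    mean = w2 + 2 * N * sigma2
    var = 8 * w2 * sigma2 + 8 * N * sigma2 ** 2
    return mean, var


def simulate_additive(u, v, sigma2, trials=20000, seed=0):
    """Monte Carlo estimate of the same two moments."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    values = []
    for _ in range(trials):
        dist2 = 0.0
        for a, b in zip(u, v):
            x = a + rng.gauss(0.0, sigma)  # r_i ~ N(0, sigma^2)
            y = b + rng.gauss(0.0, sigma)  # s_i ~ N(0, sigma^2), independent of r_i
            dist2 += (x - y) ** 2
        values.append(dist2)
    mean = sum(values) / trials
    var = sum((t - mean) ** 2 for t in values) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    u = [0.1 * i for i in range(10)]   # toy vectors, not real features
    v = [ui + 0.2 for ui in u]
    print(additive_distance_moments(u, v, 0.5))
    print(simulate_additive(u, v, 0.5))
```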

5.7.2 Appendix 5-II

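Proofs of Eqn. (5.4) and Eqn. (5.5): Let $x_i$, $y_i$, $u_i$, $v_i$, $r_i$, $i = 1, 2, ..., N$, denote the elements of vectors $\mathbf{x}$, $\mathbf{y}$, $\mathbf{u}$, $\mathbf{v}$, $\mathbf{r}$ respectively. The entries of $\mathbf{r}$ are i.i.d. Gaussian random variables with mean one and variance $\sigma^2$, $r_i \sim N(1, \sigma^2)$, $i = 1, ..., N$. Let $x_i = r_i u_i$ and $y_i = r_i v_i$. Since $E[r_i^2] = \sigma^2 + 1$ and $E[r_i^4] = 3\sigma^4 + 6\sigma^2 + 1$, we have:

\[
\begin{aligned}
E[\|\mathbf{x} - \mathbf{y}\|^2] &= E\left[\sum_{i=1}^{N} (r_i u_i - r_i v_i)^2\right] \\
&= E\left[\sum_{i=1}^{N} r_i^2 (u_i - v_i)^2\right] \\
&= (\sigma^2 + 1)\|\mathbf{u} - \mathbf{v}\|^2
\end{aligned}
\]

Let $w_i = u_i - v_i$, we have:

\[
\begin{aligned}
E[\|\mathbf{x} - \mathbf{y}\|^4] &= E\left[\left(\sum_{i=1}^{N} r_i^2 (u_i - v_i)^2\right)^2\right] \\
&= E\left[\left(\sum_{i=1}^{N} r_i^2 w_i^2\right)^2\right] \\
&= E\left[\sum_{i=1}^{N}\sum_{j=1}^{N} r_i^2 w_i^2 r_j^2 w_j^2\right] \\
&= \sum_{i=1}^{N} w_i^4 E[r_i^4] + \sum_{i \neq j} w_i^2 w_j^2 E[r_i^2] E[r_j^2] \\
&= (3\sigma^4 + 6\sigma^2 + 1)\sum_{i=1}^{N} w_i^4 + (\sigma^2 + 1)^2 \sum_{i \neq j} w_i^2 w_j^2 \\
&= [2\sigma^4 + 4\sigma^2 + (\sigma^2 + 1)^2]\sum_{i=1}^{N} w_i^4 + (\sigma^2 + 1)^2 \sum_{i \neq j} w_i^2 w_j^2 \\
&= (2\sigma^4 + 4\sigma^2)\sum_{i=1}^{N} (u_i - v_i)^4 + (\sigma^2 + 1)^2 \|\mathbf{u} - \mathbf{v}\|^4
\end{aligned}
\]

and

\[
\begin{aligned}
\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] &= E[\|\mathbf{x} - \mathbf{y}\|^4] - E[\|\mathbf{x} - \mathbf{y}\|^2]^2 \\
&= (2\sigma^4 + 4\sigma^2)\sum_{i=1}^{N} (u_i - v_i)^4
\end{aligned}
\]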

5.7.3 Appendix 5-III

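Proofs of Eqn. (5.6) and Eqn. (5.7): Let $x_i$, $y_i$, $u_i$, $v_i$, $r_i$, $s_i$, $i = 1, 2, ..., N$, denote the elements of vectors $\mathbf{x}$, $\mathbf{y}$, $\mathbf{u}$, $\mathbf{v}$, $\mathbf{r}$, $\mathbf{s}$ respectively. The entries of $\mathbf{r}$ and $\mathbf{s}$ are i.i.d. Gaussian random variables with mean one and variance $\sigma^2$, $r_i \sim N(1, \sigma^2)$, $s_i \sim N(1, \sigma^2)$, $i = 1, ..., N$, and $\mathbf{r}$, $\mathbf{s}$ are independent of each other. Let $x_i = r_i u_i$ and $y_i = s_i v_i$, we have:

\[
\begin{aligned}
E[\|\mathbf{x} - \mathbf{y}\|^2] &= E\left[\sum_{i=1}^{N} (r_i u_i - s_i v_i)^2\right] \\
&= E\left[\sum_{i=1}^{N} (r_i^2 u_i^2 - 2 r_i u_i s_i v_i + s_i^2 v_i^2)\right] \\
&= \sum_{i=1}^{N} \left((\sigma^2 + 1)u_i^2 - 2 u_i v_i + (\sigma^2 + 1)v_i^2\right) \\
&= \sum_{i=1}^{N} \left(\sigma^2 u_i^2 + \sigma^2 v_i^2 + (u_i - v_i)^2\right) \\
&= \sigma^2(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2) + \|\mathbf{u} - \mathbf{v}\|^2
\end{aligned}
\]

Let $w_i = r_i u_i - s_i v_i$. It is straightforward to verify that $E[w_i] = u_i - v_i$, $E[w_i^2] = (u_i^2 + v_i^2)\sigma^2 + (u_i - v_i)^2$, and $E[w_i^4] = (u_i - v_i)^4 + 6\sigma^2(u_i - v_i)^2(u_i^2 + v_i^2) + 3\sigma^4(u_i^2 + v_i^2)^2$. Then:

\[
\begin{aligned}
E[\|\mathbf{x} - \mathbf{y}\|^4] &= E\left[\sum_{i=1}^{N}\sum_{j=1}^{N} w_i^2 w_j^2\right] \\
&= \sum_{i=1}^{N} E[w_i^4] + \sum_{i \neq j} E[w_i^2] E[w_j^2] \\
&= \sum_{i=1}^{N} \left[(u_i - v_i)^4 + 6\sigma^2 (u_i - v_i)^2 (u_i^2 + v_i^2) + 3\sigma^4 (u_i^2 + v_i^2)^2\right] \\
&\quad + \sum_{i \neq j} \left[(u_i^2 + v_i^2)\sigma^2 + (u_i - v_i)^2\right]\left[(u_j^2 + v_j^2)\sigma^2 + (u_j - v_j)^2\right] \\
&= \sum_{i=1}^{N} (u_i - v_i)^4 + \sum_{i \neq j} (u_i - v_i)^2 (u_j - v_j)^2 \\
&\quad + \sigma^4 \left[3\sum_{i=1}^{N} (u_i^2 + v_i^2)^2 + \sum_{i \neq j} (u_i^2 + v_i^2)(u_j^2 + v_j^2)\right] \\
&\quad + 6\sigma^2 \left[\sum_{i=1}^{N} (u_i - v_i)^2 (u_i^2 + v_i^2)\right] \\
&\quad + \sigma^2 \sum_{i \neq j} \left[(u_i^2 + v_i^2)(u_j - v_j)^2 + (u_j^2 + v_j^2)(u_i - v_i)^2\right] \\
&= \|\mathbf{u} - \mathbf{v}\|^4 + \sigma^4 \left[2\sum_{i=1}^{N} (u_i^2 + v_i^2)^2 + (\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)^2\right] \\
&\quad + \sigma^2 \left[4\sum_{i=1}^{N} (u_i - v_i)^2 (u_i^2 + v_i^2) + 2(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)\|\mathbf{u} - \mathbf{v}\|^2\right]
\end{aligned}
\]

Simplifying the above equation, we have:

\[
\begin{aligned}
\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] &= E[\|\mathbf{x} - \mathbf{y}\|^4] - E[\|\mathbf{x} - \mathbf{y}\|^2]^2 \\
&= 2\sigma^4 \sum_{i=1}^{N} (u_i^2 + v_i^2)^2 + 4\sigma^2 \sum_{i=1}^{N} (u_i - v_i)^2 (u_i^2 + v_i^2)
\end{aligned}
\]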
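To make the difference between the same key and different key results for the multiplicative transform concrete, the Python sketch below evaluates the closed forms of Appendix 5-II and Appendix 5-III and compares them with simulation. It is an illustration under assumed toy vectors, and all function names are hypothetical.

```python
import math
import random


def rmt_same_key_moments(u, v, sigma2):
    """Mean and variance of ||x - y||^2 when x_i = r_i u_i and y_i = r_i v_i."""
    w2 = sum((a - b) ** 2 for a, b in zip(u, v))
    w4 = sum((a - b) ** 4 for a, b in zip(u, v))
    return (sigma2 + 1) * w2, (2 * sigma2 ** 2 + 4 * sigma2) * w4


def rmt_diff_key_moments(u, v, sigma2):
    """Mean and variance of ||x - y||^2 when x_i = r_i u_i and y_i = s_i v_i."""
    mean = sigma2 * (sum(a * a for a in u) + sum(b * b for b in v)) \
        + sum((a - b) ** 2 for a, b in zip(u, v))
    var = sum(2 * sigma2 ** 2 * (a * a + b * b) ** 2
              + 4 * sigma2 * (a - b) ** 2 * (a * a + b * b)
              for a, b in zip(u, v))
    return mean, var


def simulate_rmt(u, v, sigma2, same_key, trials=20000, seed=0):
    """Monte Carlo estimate of the mean and variance of ||x - y||^2."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    values = []
    for _ in range(trials):
        dist2 = 0.0
        for a, b in zip(u, v):
            r = rng.gauss(1.0, sigma)                       # r_i ~ N(1, sigma^2)
            s = r if same_key else rng.gauss(1.0, sigma)    # independent s_i for DK
            dist2 += (r * a - s * b) ** 2
        values.append(dist2)
    mean = sum(values) / trials
    var = sum((t - mean) ** 2 for t in values) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    u = [0.5, -0.3, 0.8, 0.1, -0.6, 0.4, 0.9, -0.2]   # toy vectors
    v = [0.4, -0.1, 0.6, 0.3, -0.5, 0.1, 0.7, -0.4]
    print(rmt_same_key_moments(u, v, 0.25), simulate_rmt(u, v, 0.25, True))
    print(rmt_diff_key_moments(u, v, 0.25), simulate_rmt(u, v, 0.25, False))
```

For $\mathbf{u} = \mathbf{v}$ the same key expression gives zero expected distance, whereas the different key expression gives $2\sigma^2\|\mathbf{u}\|^2$.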

5.7.4 Appendix 5-IV

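Proofs of Eqn. (5.12) and Eqn. (5.13): Let $\mathbf{x} = \sqrt{\frac{N}{M}}\,R^T\mathbf{u}$, $\mathbf{y} = \sqrt{\frac{N}{M}}\,S^T\mathbf{v}$, where $\mathbf{u} \in \mathbb{R}^N$, $\mathbf{v} \in \mathbb{R}^N$, $R \in \mathbb{R}^{N \times M}$, $S \in \mathbb{R}^{N \times M}$, and the entries of $R$ and $S$ are i.i.d. Gaussian random variables, $r_{ij} \sim N(0, \frac{1}{N})$, $s_{ij} \sim N(0, \frac{1}{N})$. Let $x_i$, $y_i$, $u_i$, $v_i$ denote the elements of $\mathbf{x}$, $\mathbf{y}$, $\mathbf{u}$, $\mathbf{v}$ respectively. Sums written over $l < h$ run over unordered pairs of distinct indices. We have:

\[
\begin{aligned}
E[\|\mathbf{x} - \mathbf{y}\|^2] &= E\left[\sum_{j=1}^{M} (x_j - y_j)^2\right] \\
&= E\left[\sum_{j=1}^{M} \left(\sum_{i=1}^{N} \sqrt{\frac{N}{M}}\, r_{ij} u_i - \sum_{k=1}^{N} \sqrt{\frac{N}{M}}\, s_{kj} v_k\right)^2\right] \\
&= \frac{N}{M} \sum_{j=1}^{M} E\left[\left(\sum_{i=1}^{N} r_{ij} u_i - \sum_{k=1}^{N} s_{kj} v_k\right)^2\right] \\
&= \frac{N}{M} \sum_{j=1}^{M} E\left[\left(\sum_{i=1}^{N} r_{ij} u_i\right)^2 - 2\sum_{i=1}^{N} r_{ij} u_i \sum_{k=1}^{N} s_{kj} v_k + \left(\sum_{k=1}^{N} s_{kj} v_k\right)^2\right] \\
&= \frac{N}{M} \sum_{j=1}^{M} E\left[\left(\sum_{i=1}^{N} r_{ij} u_i\right)^2 + \left(\sum_{k=1}^{N} s_{kj} v_k\right)^2\right] \\
&= \frac{N}{M} \sum_{j=1}^{M} \left\{E\left[\left(\sum_{i=1}^{N} r_{ij} u_i\right)^2\right] + E\left[\left(\sum_{k=1}^{N} s_{kj} v_k\right)^2\right]\right\} \\
&= \frac{N}{M} \sum_{j=1}^{M} E\left[\sum_{i=1}^{N} r_{ij}^2 u_i^2 + 2\sum_{l < h} r_{lj} u_l r_{hj} u_h\right] + \frac{N}{M} \sum_{j=1}^{M} E\left[\sum_{k=1}^{N} s_{kj}^2 v_k^2 + 2\sum_{l < h} s_{lj} v_l s_{hj} v_h\right] \\
&= \frac{N}{M} \sum_{j=1}^{M} \left(E\left[\sum_{i=1}^{N} r_{ij}^2 u_i^2\right] + E\left[\sum_{k=1}^{N} s_{kj}^2 v_k^2\right]\right) \\
&= \frac{N}{M} \sum_{j=1}^{M} \left(\frac{1}{N}\|\mathbf{u}\|^2 + \frac{1}{N}\|\mathbf{v}\|^2\right) \\
&= \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 .
\end{aligned}
\]

To compute $\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2]$, we first define $\alpha_j = \sum_{i=1}^{N} r_{ij} u_i$ and $\beta_j = \sum_{k=1}^{N} s_{kj} v_k$. We have:

\[
\begin{aligned}
E[\alpha_j^2] &= E\left[\left(\sum_{i=1}^{N} r_{ij} u_i\right)^2\right] \\
&= E\left[\sum_{i=1}^{N} r_{ij}^2 u_i^2 + 2\sum_{l < h} r_{lj} u_l r_{hj} u_h\right] \\
&= E\left[\sum_{i=1}^{N} r_{ij}^2 u_i^2\right] \\
&= \sum_{i=1}^{N} \frac{1}{N} u_i^2 \\
&= \frac{1}{N}\|\mathbf{u}\|^2 .
\end{aligned}
\]

Since $r_{ij} \sim N(0, \frac{1}{N})$, $E[r_{ij}^4] = \frac{3}{N^2}$, then:

\[
\begin{aligned}
E[\alpha_j^4] &= E\left[\left(\sum_{i=1}^{N} r_{ij} u_i\right)^4\right] \\
&= E\left[\sum_{i=1}^{N} r_{ij}^4 u_i^4 + 6\sum_{l < h} r_{lj}^2 u_l^2 r_{hj}^2 u_h^2\right] \\
&= \frac{3}{N^2} \sum_{i=1}^{N} u_i^4 + \frac{6}{N^2} \sum_{l < h} u_l^2 u_h^2 \\
&= \frac{3}{N^2} \left(\sum_{i=1}^{N} u_i^4 + 2\sum_{l < h} u_l^2 u_h^2\right) \\
&= \frac{3}{N^2} \left(\sum_{i=1}^{N} u_i^2\right)^2 \\
&= \frac{3}{N^2}\|\mathbf{u}\|^4 .
\end{aligned}
\]

Similarly, we have $E[\beta_j^2] = \frac{1}{N}\|\mathbf{v}\|^2$, and $E[\beta_j^4] = \frac{3}{N^2}\|\mathbf{v}\|^4$.

Let $\gamma_j = (\alpha_j - \beta_j)^2$, we have:

\[
\begin{aligned}
E[\gamma_j] &= E[(\alpha_j - \beta_j)^2] \\
&= E[\alpha_j^2 - 2\alpha_j\beta_j + \beta_j^2] \\
&= \frac{1}{N}(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)
\end{aligned}
\]

and

\[
\begin{aligned}
E[\gamma_j^2] &= E[(\alpha_j - \beta_j)^4] \\
&= E[\alpha_j^4 - 4\alpha_j^3\beta_j - 4\alpha_j\beta_j^3 + 6\alpha_j^2\beta_j^2 + \beta_j^4] \\
&= E[\alpha_j^4] + 6E[\alpha_j^2\beta_j^2] + E[\beta_j^4] \\
&= \frac{3}{N^2}\|\mathbf{u}\|^4 + 6 \times \frac{1}{N}\|\mathbf{u}\|^2\,\frac{1}{N}\|\mathbf{v}\|^2 + \frac{3}{N^2}\|\mathbf{v}\|^4 \\
&= \frac{3}{N^2}(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)^2
\end{aligned}
\]

where we use the fact that $E[\alpha_j\beta_j] = 0$, $E[\alpha_j^3\beta_j] = 0$, and $E[\alpha_j\beta_j^3] = 0$, because $r_{ij}$ and $s_{ij}$ are independent random variables with zero mean. We can compute:

\[
\begin{aligned}
E[\|\mathbf{x} - \mathbf{y}\|^4] &= E\left[\left(\sum_{j=1}^{M} \left(\sum_{i=1}^{N} \sqrt{\frac{N}{M}}\, r_{ij} u_i - \sum_{k=1}^{N} \sqrt{\frac{N}{M}}\, s_{kj} v_k\right)^2\right)^2\right] \\
&= \frac{N^2}{M^2}\, E\left[\left(\sum_{j=1}^{M} \gamma_j\right)^2\right] \\
&= \frac{N^2}{M^2}\, E\left[\sum_{j=1}^{M} \gamma_j^2 + 2\sum_{l < h} \gamma_l \gamma_h\right] \\
&= \frac{N^2}{M^2} \left(\sum_{j=1}^{M} E[\gamma_j^2] + 2\sum_{l < h} E[\gamma_l] E[\gamma_h]\right) \\
&= \frac{N^2}{M^2} \left(\frac{3M}{N^2}(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)^2 + 2\,\frac{M(M - 1)}{2}\,\frac{\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2}{N}\,\frac{\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2}{N}\right) \\
&= \left(1 + \frac{2}{M}\right)(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)^2,
\end{aligned}
\]

and the variance of $\|\mathbf{x} - \mathbf{y}\|^2$ can be computed as:

\[
\begin{aligned}
\mathrm{Var}[\|\mathbf{x} - \mathbf{y}\|^2] &= E[\|\mathbf{x} - \mathbf{y}\|^4] - E[\|\mathbf{x} - \mathbf{y}\|^2]^2 \\
&= \frac{2}{M}(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)^2 .
\end{aligned}
\]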
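The result above states that, under different projection matrices, the squared distance has mean $\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$ and a variance that shrinks as $2/M$. A small Python simulation, given here as a sketch with hypothetical function names and toy vectors, can be used to check both expressions for a chosen $M$.

```python
import math
import random


def rp_diff_key_moments(u, v, M):
    """Closed-form mean and variance of ||x - y||^2 under different keys."""
    total = sum(a * a for a in u) + sum(b * b for b in v)
    return total, (2.0 / M) * total ** 2


def simulate_rp(u, v, M, trials=5000, seed=0):
    """Monte Carlo estimate using entries r_ij, s_ij ~ N(0, 1/N)."""
    rng = random.Random(seed)
    N = len(u)
    std = math.sqrt(1.0 / N)
    values = []
    for _ in range(trials):
        dist2 = 0.0
        for _ in range(M):
            alpha = sum(rng.gauss(0.0, std) * a for a in u)  # column j of R applied to u
            beta = sum(rng.gauss(0.0, std) * b for b in v)   # column j of S applied to v
            dist2 += (alpha - beta) ** 2
        values.append((N / M) * dist2)  # scaling sqrt(N/M) enters squared
    mean = sum(values) / trials
    var = sum((t - mean) ** 2 for t in values) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    u = [0.1 * (i % 5) - 0.2 for i in range(20)]   # toy vectors, N = 20
    v = [0.05 * i - 0.5 for i in range(20)]
    for M in (5, 10):
        print(M, rp_diff_key_moments(u, v, M), simulate_rp(u, v, M))
```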

Chapter 6

Conclusion and Future Work

6.1 Conclusion

High-level security and convenience of use have made biometrics based authentication systems a preferred option for human identity recognition. The need for deployment of biometrics is particularly strong in anti-terrorism, homeland security, and other applications with high security demands. Advances in biometric technology have significantly improved the recognition performance of various biometric modalities in the past two decades, and many biometric recognition systems have been used in government and civilian applications. To support the deployment of biometric technology in a wide variety of applications, it is important to offer changeability to the biometric templates such that a single biometric trait can be employed in different scenarios. When the biometric template in one application is compromised, the biometric signal itself is not lost forever, and a new template can be generated. In addition, the biometric templates that are generated from the same biometric signal should not be able to authenticate each other.

Privacy is also becoming an issue of increasing concern because of the uniqueness of biometric traits. The public is concerned with the protection of their biometric information, which may be used to derive sensitive personal information, or be applied in unintended functions against the users, such as tracking the behavior of a person. This is particularly important for physiological traits such as face and fingerprint, which are more stable and cannot be changed. Privacy is an intrinsically complex issue which involves the positions of multiple sectors including legislation, government, industry, individuals, culture, and ethics. In fact, the definition of privacy in terms of biometrics is still open to debate. It is difficult, if not impossible, to have a universal solution that addresses the concerns and benefits of all the stakeholders. The objective is to encourage privacy protection while employing biometric systems to benefit the majority of the public. From a technical point of view, it is expected that the original biometric signal or features cannot be recovered when the stored biometric template is compromised.

In this dissertation, effort has been placed on the generation of changeable and privacy

preserving biometric templates. Specifically, we focus on the design and development of

randomness based repeatable and non-invertible transformations on a face based biomet-

ric recognition problem. First, we contribute a systematic analysis of a random projection

(RP) based method. Comprehensive mathematical analysis has been provided to study

the distance preserving property of RP. A precise method for computing the probabil-

ity of distance preserving when projected onto an arbitrary dimensionality is derived.

It is shown that with high probability, the Euclidean distance between a set of data

points in a high-dimensional Euclidean space can be preserved up to an error factor even

when projected onto a relatively low-dimensional subspace. Consequently, our analysis

achieves a smaller projection lower-bound than the best known in existing works. To

study the changeability of RP, a geometric based analysis is presented to compute the

probability of false acceptance when different projections are applied on the same bio-

metric signal, and a vector translation approach is introduced to produce templates with

strong changeability. Detailed privacy analysis has been conducted by investigating the

individual attribute as well as the global characteristics of the reconstructed signal. The proposed method can be applied on either high-dimensional image vectors or dimensionality reduced feature vectors; the former provides better privacy protection but lower recognition performance, and the latter the reverse. Experimental results show that when the RP method is applied on high-dimensional image vectors, the recognition accuracy is slightly lower than that of the benchmark PCA technique, but the method provides privacy protection and strong changeability while being data-independent and easy to implement.

To improve the recognition performance, we further proposed a novel approach that

can be applied on low-dimensional feature vectors. The proposed technique stores the

sorted index numbers (SIN) of the original biometric feature vector as a template, and

a new distance measure is introduced to evaluate the similarity between SIN vectors.

Experimentation demonstrates that the SIN method is capable of achieving improved

recognition accuracy compared to the original features. To provide a quantitative evalu-

ation, we introduce two privacy measures to evaluate the extent of privacy protection at

both element and vector levels. Empirical results demonstrate that the SIN method can

provide better privacy protection at a lower feature dimensionality.

The SIN method by itself does not provide changeability, and the recognition per-

formance may be sacrificed when low-dimensional feature vectors are used for stronger

privacy protection. To address these issues, we have proposed a general framework in

which random transforms are applied prior to the sorting operation. Three common ran-

dom transformations, namely random additive transform (RAT), random multiplicative

transform (RMT), and RP are discussed and compared. Extensive statistical analysis

is performed to study the changeability of the transforms, and the privacy protecting

properties of each transform are demonstrated using the previously defined element-wise

and vector-wise measures. The effectiveness of the proposed methods is supported by

extensive experimentation. It is shown that all the transforms are capable of producing

changeable and privacy preserving templates. Overall, the RAT based SIN method is

computationally efficient, and outperforms the original features as well as existing works.


Although this dissertation focuses on a face recognition problem, the proposed methods

can be generalized to all features in the continuous domain, and it is expected that such

methods can also be applied to other biometrics and other privacy preserving applications

such as data mining.

6.2 Future Work

As an extension of this dissertation, we propose the following possible directions for future

research.

• Random projection on discriminant high-dimensional data. The distance

and privacy preserving properties of random projection make it a strong candidate

as a repeatable and non-invertible transform. Distinct biometric templates can be

generated by simply varying a random matrix generation key. The performance of

random projection is critically dependent on the discriminant capability of the orig-

inal high-dimensional vectors in the Euclidean space. In this dissertation, we have

shown that the RP method produces lower performance than the PCA method,

due to the noisy representation of the original images. To obtain stronger privacy

as well as better recognition performance, the extracted feature vectors prior to

the projection operation should be high-dimensional and discriminant. Possible

solutions include advanced image processing techniques such that the image distance would provide a discriminant representation, or integration with kernel-like methods that map the original image vectors to a high-dimensional but more separable space, followed by RP to reduce the dimensionality. As such, it

is expected that a better recognition accuracy can be achieved. In addition, RP

on 2-dimensional images (or even higher order tensor data) directly, without the

time consuming cascading procedure for vectorization, may also be investigated to

obtain better computational efficiency.


• Combination with cryptographic techniques. It has been shown that the

proposed SIN method is capable of maintaining and even improving the recogni-

tion accuracy of the original features. The extracted SIN vector is essentially a

discrete representation of the original face patterns, and it can be combined with biometric cryptographic constructions based on error correction algorithms for key generation. The repeatable transform based data perturbation methods can efficiently distort the biometric data and produce changeable biometric templates, but with possibly lower privacy guarantees. On the other hand, cryptographic systems may offer stronger privacy protection, but with high computational complexity. The ideal case would be an effective combination of the two to achieve both privacy and efficiency.

Bibliography

[1] “Identity theft: The aftermath 2008,” Identity Theft Resource Center. [Online].

Available: http://www.bioid.com/downloads/facedb/index.php

[2] A. K. Jain, A. Ross, and S. Prabhakar, “An introduction to biometric recognition,”

IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 9,

pp. 1222–1228, 2004.

[3] J. D. Woodward, N. M. Orlans, and P. T. Higgins, Biometrics - Identity Assurance

in the Information Age. McGraw-Hill/Osborne, Berkeley, 2003.

[4] U. Uludag and A. K. Jain, “Attacks on biometric systems: a case study in fingerprints,” in Proceedings of SPIE, Security, Steganography and Watermarking of Multimedia Contents VI, vol. 5306, 2004, pp. 622–633.

[5] A. Adler, “Can images be regenerated from biometric templates?” in Proceedings

of Biometrics Consortium Conference, 2003.

[6] “Privacy & biometrics: building a conceptual foundation,” Na-

tional Science and Technology Council, 2006. [Online]. Available:

http://www.biometrics.gov/docs/privacy.pdf

[7] R. E. Smith, Ben Franklin’s Web Site: Privacy and Curiosity from Plymouth Rock

to the Internet. Providence RI: Privacy Journal, 2000.


[8] A. Cavoukian, A. Stoianov, and F. Carter, “Biometric encryption: technology for

strong authentication, security and privacy,” in Policies and Research in Iden-

tity Management, ser. IFIP International Federation for Information Processing,

E. de Leeuw, S. Rischer-Hubner, J. Tseng, and J. Borking, Eds. Boston, ISBN

978-0-387-77995-9: Springer, 2008, pp. 57–77.

[9] H. Chen, Medical Genetics Handbook. W.H. Green, 1988.

[10] J. D. Woodward, “Biometrics: privacy’s foe or privacy’s friend?” Proceedings of

the IEEE, vol. 85, no. 9, pp. 1480–1492, 1997.

[11] S. K. Panigrahy, D. Jena, S. B. Korra, and S. K. Jena, “On the privacy protection

of biometric traits: palmprint, face, and signature,” in Contemporary Computing,

ser. Communications in Computer and Information Science, S. R. et al, Ed. Berlin

Heidelberg: Springer, 2009, pp. 182–193.

[12] A. Cavoukian and A. Stoianov, “Biometric encryption: a positive sum technology

that achieves strong authentication, security and privacy,” White paper, Office of

the Information and Privacy Commissioner of Ontario, 2007.

[13] A. Cavoukian and M. Snijder, “the relevance of untraceable biometrics and biomet-

ric encryption: a discussion of biometrics for authentication purposes,” Discussion

paper, Office of the Information and Privacy Commissioner of Ontario and Euro-

pean Biometrics Group, 2009.

[14] U. Uludag, S. Pankanti, S. Prabhakar, and A. K. Jain, “Biometric cryptosystems: issues and challenges,” Proceedings of the IEEE, vol. 92, no. 6, pp. 948–960, 2004.

[15] N. K. Ratha, J. H. Connel, and R. M. Bolle, “Enhancing security and privacy in

biometrics-based authentication systems,” IBM System Journal, no. 3, pp. 614–634,

2001.


[16] A. Adler, “Vulnerabilities in biometric encryption systems,” in Proceedings of 5th

International Conference on Audio and Video based Biometric Person Authentica-

tion, 2005, pp. 1100–1109.

[17] W. J. Scheirer and T. E. Boult, “Cracking fuzzy vault and biometric encryption,”

in Proceedings of Biometrics Symposium, 2007, pp. 1100–1109.

[18] A. K. Jain, “Biometric recognition: how do i know who you are,” in Proceedings

of the IEEE 12th Signal Processing and Communications Applications Conference,

2004, pp. 3–5.

[19] A. K. Jain, A. Ross, and S. Prabhakar, “Biometrics: a tool for information secu-

rity,” IEEE Transactions on Information Forensics and Security, vol. 1, no. 2, pp.

125–143, 2006.

[20] R. M. Bolle, J. H. Connel, and N. K. Ratha, “Biometric perils and patches,” Pattern

Recognition, vol. 35, pp. 2727–2738, 2002.

[21] A. Bodo, “Method for producing a digital signature with aid of a biometric feature,”

German patent DE 42 43 908 A1, 1994.

[22] M. Blaze, W. Diffie, R. L. Rivest, B. Schneier, T. Shimomura, E. Thompson, and

M. Wiener, “Minimal key lengths for symmetric ciphers to provide adequate com-

mercial security,” A Report by an Ad Hoc Group of Cryptographers and Computer

Scientists, Tech. Rep., 1996.

[23] C. Soutar, D. Roberge, A. Stoianov, R. Gilroy, and B. V. K. V. Kumar, “Biometric

encryption,” in ICSA Guide to Cryptography, R. K. Nichols, Ed. McGraw-Hill,

1998, pp. 649–676.


[24] F. Monrose, M. K. Reiter, and S. Wetzel, “Password hardening based on keystroke

dynamics,” in Proceedings of the sixth ACM Conference on Computer and Com-

munications Security, 1999, pp. 73–82.

[25] A. Shamir, “How to share a secret,” Communications of the ACM, vol. 22, no. 11, pp. 612–

613, 1979.

[26] F. Monrose, M. K. Reiter, Q. Li, and S. Wetzel, “Cryptographic key generation

from voice,” in Proceedings of IEEE Symposium on Security and Privacy, 2001,

pp. 202–213.

[27] G. I. Davida, Y. Frankel, and B. J. Matt, “On enabling secure applications through

off-line biometric identification,” in Proceedings of IEEE Symposium on Security

and Privacy, 1998, pp. 148–157.

[28] A. Juels and M. Wattenberg, “A fuzzy commitment scheme,” in Proceedings of

the sixth ACM Conference on Computer and Communications Security, 1999, pp.

28–36.

[29] F. Hao, R. Anderson, and J. Daugman, “Combining crypto with biometrics effectively,”

IEEE Transactions on Computers, vol. 55, no. 9, pp. 1081–1088, 2006.

[30] S. C. Draper, A. Khisti, E. Martinian, A. Vetro, and J. S. Yedidia, “Using dis-

tributed source coding to secure fingerprint biometrics,” in Proceedings of the IEEE

International Conference on Acoustics, Speech and Signal Processing, 2007, pp. 129–

132.

[31] ——, “Secure storage of fingerprint biometrics using Slepian-Wolf codes,” in In-

formation Theory and Applications Workshop, 2007.

[32] A. Juels and M. Sudan, “A fuzzy vault scheme,” in Proceedings of IEEE Interna-

tional Symposium on Information Theory, 2002, p. 408.

[33] T. C. Clancy, N. Kiyavash, and D. J. Lin, “Secure smart card based fingerprint

authentication,” in Proceedings of ACM SIGMM Workshop on Biometrics Methods

and Applications, 2003, pp. 45–52.

[34] S. Yang and I. Verbauwhede, “Automatic secure fingerprint verification system

based on fuzzy vault scheme,” in Proceedings of the IEEE International Conference

on Acoustics, Speech and Signal Processing, 2005, pp. 609–612.

[35] ——, “Secure fuzzy vault based fingerprint verification system,” in Proceedings of

38th Asilomar Conference on Signals, Systems, and Computers, vol. 1, 2004, pp.

577–581.

[36] U. Uludag, S. Pankanti, and A. Jain, “Fuzzy vault for fingerprints,” in Proceedings

of International Conference on Audio and Video based Biometric Person Authen-

tication, 2005, pp. 310–319.

[37] U. Uludag and A. Jain, “Securing fingerprint template: fuzzy vault with helper

data,” in Proceedings of IEEE Workshop on Privacy Research In Vision, 2006, p.

163.

[38] M. Freire-Santos, J. Fierrez-Aguilar, and J. Ortega-Garcia, “Cryptographic key

generation using handwritten signature,” in Proceedings of SPIE, Defense and

Security Symposium, Biometric Technologies for Human Identification, 2006, pp.

225–231.

[39] Y. Wang and K. N. Plataniotis, “Fuzzy vault for face based cryptographic key

generation,” in Proceedings of Biometrics Symposium, 2007, pp. 1–6.

[40] Y. J. Lee, K. R. Park, S. J. Lee, K. Bae, and J. Kim, “A new method for generating

an invariant iris private key based on the fuzzy vault system,” IEEE Transactions

on Systems, Man and Cybernetics, Part B, no. 5, pp. 1302–1313, 2008.

[41] K. Nandakumar and A. K. Jain, “Multibiometric template security using fuzzy

vault,” in Proceedings of 2nd IEEE International Conference on Biometrics: The-

ory, Applications and Systems, 2008, pp. 1–6.

[42] A. Kholmatov, B. Yanikoglu, E. Savas, and A. Levi, “Secret sharing using biomet-

ric traits,” in Proceedings of SPIE, Defense and Security Symposium, Biometric

Technologies for Human Identification, 2006, pp. 1–9.

[43] Y. Dodis, L. Reyzin, and A. Smith, “Fuzzy extractors: how to generate strong keys

from biometrics and other noisy data,” in Advances in Cryptology - EUROCRYPT 2004. Springer-Verlag, 2004, pp. 523–540.

[44] X. Boyen, “Reusable cryptographic fuzzy extractors,” in Proceedings of 11th ACM

Conference on Computer and Communications Security, 2004, pp. 82–91.

[45] Q. Li, Y. Sutcu, and N. Memon, “Secure sketch for biometric templates,” in

Asiacrypt. Springer-Verlag, 2006, pp. 99–113.

[46] Y. Sutcu, Q. Li, and N. Memon, “Protecting biometric templates with sketch:

theory and practice,” IEEE Transactions on Information Forensics and Security,

no. 3, pp. 503–512, 2007.

[47] ——, “Secure biometric templates from fingerprint-face features,” in Proceedings

of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–6.

[48] F. Hao and C. W. Chan, “Private key generation from on-line handwritten

signatures,” Information Management and Computer Security, no. 2, pp. 159–164, 2002.

[49] J.-P. Linnartz and P. Tuyls, “New shielding functions to enhance privacy and

prevent misuse of biometric templates,” in AVBPA 2003, 2003, pp. 393–402.

[50] E. Verbitskiy, P. Tuyls, D. Denteneer, and J. P. Linnartz, “Reliable biometric

authentication with privacy protection,” in 24th Benelux Symp. on Info. Theory,

2003, pp. 125–132.

[51] P. Tuyls, E. Verbitskiy, T. Ignatenko, D. Schobben, and T. H. Akkermans, “Privacy

protected biometric templates: acoustic ear identification,” in Proceedings of SPIE,

2004, pp. 176–182.

[52] T. A. M. Kevenaar, G. J. Schrijen, M. van der Veen, A. H. M. Akkermans, and F. Zuo,

“Face recognition with renewable and privacy preserving binary templates,” in

Fourth IEEE Workshop on Automatic Identification Advanced Technologies. IEEE

Computer Society, 2005, pp. 21–26.

[53] N. K. Ratha, S. Chikkerur, J. H. Connell, and R. M. Bolle, “Generating cance-

lable fingerprint templates,” IEEE Transactions on Pattern Analysis and Machine

Intelligence, vol. 29, no. 4, pp. 561–572, 2007.

[54] M. Jeong, C. Lee, J. Kim, J. Choi, K. Toh, and J. Kim, “Changeable biometrics

for appearance based face recognition,” in Proceedings of Biometric Consortium,

2006, pp. 1–5.

[55] M. Savvides, B. V. K. V. Kumar, and P. K. Khosla, “Cancelable biometric filters for

face recognition,” in Proceedings of the 17th International Conference on Pattern

Recognition, 2004, pp. 922–925.

[56] T. E. Boult, “Robust distance measures for face recognition supporting revocable

biometric tokens,” in Proceedings of 7th IEEE International Conference on Auto-

matic Face and Gesture Recognition, 2006, pp. 560–566.

[57] T. Boult, W. Scheirer, and R. Woodworth, “Revocable fingerprint biotokens:

accuracy and security analysis,” in Proceedings of IEEE Conference on Computer Vision

and Pattern Recognition, 2007, pp. 1–8.

[58] C. Lee, J.-Y. Choi, K.-A. Toh, S. Lee, and J. Kim, “Alignment-free cancelable

fingerprint templates based on local minutiae information,” IEEE Transactions on

Systems, Man and Cybernetics, Part B, vol. 37, no. 4, pp. 980–992, 2007.

[59] A. Teoh, D. Ngo, and A. Goh, “Biohashing: two factor authentication featuring

fingerprint data and tokenised random number,” Pattern Recognition, vol. 37, pp.

2245–2255, 2004.

[60] T. Connie, A. Teoh, M. Goh, and D. Ngo, “Palmhashing: a novel approach for

dual-factor authentication,” Pattern Analysis and Applications, vol. 7, no. 3, pp.

255–268, 2004.

[61] D. C. L. Ngo, A. B. J. Teoh, and A. Goh, “Biometric hash: high-confidence face

recognition,” IEEE Transactions on Circuits and Systems for Video Technology,

vol. 16, no. 6, pp. 771–775, 2006.

[62] A. B. J. Teoh, A. Goh, and D. C. L. Ngo, “Random multispace quantization as an

analytic mechanism for biohashing of biometric and random identity inputs,” IEEE

Transactions on Pattern Analysis and Machine Intelligence, vol. 28, pp. 1892–1901,

2006.

[63] A. Teoh, D. Ngo, and A. Goh, “An integrated dual factor authenticator based

on the face data and tokenised random number,” in Proceedings of International

Conference on Biometric Authentication, 2004, pp. 117–123.

[64] D. Maio and L. Nanni, “Multihashing, human authentication featuring

biometrics data and tokenised random number: a case study FVC2004,” Neurocomputing,

vol. 69, no. 1-3, pp. 242–249, 2006.

[65] B. Kong, K. Cheung, D. Zhang, M. Kamel, and J. You, “An analysis of biohashing

and its variants,” Pattern Recognition, vol. 39, no. 7, pp. 1359–1368, 2006.

[66] L. Nanni and A. Lumini, “Human authentication featuring signatures and tokenised

random number,” Neurocomputing, vol. 69, no. 7-9, pp. 858–861, 2006.

[67] ——, “An advanced multi-modal method for human authentication featuring

biometrics data and tokenised random numbers,” Neurocomputing, vol. 69, no. 13-15,

pp. 1706–1710, 2006.

[68] A. Lumini and L. Nanni, “An improved biohashing for human authentication,”

Pattern Recognition, vol. 40, no. 3, pp. 1057–1065, 2007.

[69] A. Goh and D. C. Ngo, “Computation of cryptographic keys from face biometrics,”

in Communications and Multimedia Security, ser. Lecture Notes in Computer Sci-

ence. Springer Berlin / Heidelberg, 2003, pp. 1–13.

[70] A. Teoh, D. Ngo, and A. Goh, “Personalised cryptographic key generation based

on facehashing,” Computers and Security Journal, no. 7, pp. 606–614, 2004.

[71] O. T. Song, A. B. J. Teoh, and D. C. L. Ngo, “A novel key release scheme from bio-

metrics,” in Intelligence and Security Informatics, ser. Lecture Notes in Computer

Science. Springer Berlin / Heidelberg, 2006, pp. 764–765.

[72] A. B. J. Teoh, T. Connie, D. Ngo, and C. Ling, “Remarks on biohash and its

mathematical foundation,” Information Processing Letters, no. 4, pp. 145–150, 2006.

[73] A. B. J. Teoh and C. T. Yuang, “Cancelable biometrics realization with multispace

random projections,” IEEE Transactions on Systems, Man, and Cybernetics, Part

B - Special Issue on Recent Advances in Biometrics Systems, no. 5, pp. 1096–1106,

2007.

[74] W. B. Johnson and J. Lindenstrauss, “Extensions of Lipschitz mappings into a Hilbert

space,” Contemporary Mathematics, vol. 26, pp. 189–206, 1984.

[75] A. Andoni and P. Indyk, “Near-optimal hashing algorithms for approximate nearest

neighbor in high dimensions,” in Proceedings of 47th Annual IEEE Symposium on

Foundations of Computer Science, 2006, pp. 459–468.

[76] N. Goel and G. Bebis, “Face recognition experiments with random projection,” in

Proceedings of SPIE Defense and Security Symposium, 2005, pp. 426–437.

[77] E. Bingham and H. Mannila, “Random projection in dimensionality reduction:

applications to image and text data,” in Proceedings of International Conference

on Knowledge Discovery and Data Mining, 2001, pp. 245–250.

[78] S. Kaski, “Dimensionality reduction by random mapping: fast similarity compu-

tation for clustering,” in Proceedings of International Joint Conference on Neural

Networks, 1998, pp. 413–418.

[79] S. Dasgupta, “Experiments with random projection,” in Proceedings of the 16th

Conference on Uncertainty in Artificial Intelligence, 2000, pp. 143–151.

[80] K. Liu, H. Kargupta, and J. Ryan, “Random projection based multiplicative data

perturbation for privacy preserving distributed data mining,” IEEE Transactions

on Knowledge and Data Engineering, no. 1, pp. 92–106, 2006.

[81] S. T. M. Oliveira and O. R. Zaiane, “Privacy-preserving clustering by object

similarity-based representation and dimensionality reduction transformation,” in

Proceedings of the Workshop on Privacy and Security Aspects of Data Mining in

conjunction with the Fourth IEEE International Conference on Data Mining, 2004,

pp. 21–30.

[82] P. Frankl and H. Maehara, “The Johnson-Lindenstrauss lemma and the sphericity

of some graphs,” Journal of Combinatorial Theory Series A, no. 3, pp. 355–362,

1987.

[83] P. Indyk and R. Motwani, “Approximate nearest neighbors: towards removing the

curse of dimensionality,” in Proceedings of the 30th Annual ACM Symposium on

Theory of Computing, 1998, pp. 604–613.

[84] S. Dasgupta and A. Gupta, “An elementary proof of the Johnson-Lindenstrauss

lemma,” UC Berkeley, Tech. Rep., 1999.

[85] R. I. Arriaga and S. Vempala, “An algorithmic theory of learning: robust con-

cepts and random projection,” in Proceedings of the 40th Annual Symposium on

Foundations of Computer Science, 1999, pp. 616–623.

[86] D. Achlioptas, “Database-friendly random projections,” in Proceedings of the 20th

Annual Symposium on Principles of Database Systems, 2001, pp. 274–281.

[87] P. Li, T. J. Hastie, and K. W. Church, “Very sparse random projections,” in

Proceedings of the 12th ACM International Conference on Knowledge Discovery

and Data Mining, 2006, pp. 287–296.

[88] S. Vempala, The random projection method, ser. DIMACS series in Discrete Mathe-

matics and Theoretical Computer Science. American Mathematical Society, 2004.

[89] E. Candes, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal

reconstruction from highly incomplete frequency information,” IEEE Transactions

on Information Theory, vol. 52, no. 2, pp. 489–509, 2006.

[90] R. Hecht-Nielsen, “Context vectors: general purpose approximate meaning rep-

resentations self-organized from raw data,” Computational Intelligence: Imitating

Life, pp. 43–56, 1994.

[91] E. W. Weisstein, “Regularized gamma function,” From MathWorld–A Wolfram

Web Resource. http://mathworld.wolfram.com/RegularizedGammaFunction.html.

[92] ——, “Hypersphere,” From MathWorld–A Wolfram Web Resource.

http://mathworld.wolfram.com/Hypersphere.html.

[93] W. Du, S. Chen, and Y. S. Han, “Privacy-preserving multivariate statistical anal-

ysis: linear regression and classification,” in Proceedings of the 4th SIAM Interna-

tional Conference on Data Mining, 2004, pp. 222–233.

[94] K. Liu, “Multiplicative data perturbation for privacy preserving data mining,”

Ph.D. dissertation, University of Maryland, Baltimore County, 2007.

[95] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic

Processes, Fourth Edition. McGraw-Hill Education (India) Pvt Ltd, 2002.

[96] R. Baraniuk, “Compressive sensing,” IEEE Signal Processing Magazine, vol. 24,

no. 4, pp. 118–121, 2007.

[97] J. Wang, K. N. Plataniotis, J. Lu, and A. N. Venetsanopoulos, “On solving the face

recognition problem with one training sample per subject,” Pattern Recognition,

vol. 39, no. 9, pp. 1746–1762, 2006.

[98] P. J. Phillips, H. Wechsler, J. Huang, and P. Rauss, “The FERET database and

evaluation procedure for face recognition algorithms,” Image and Vision Computing

Journal, vol. 16, no. 5, pp. 295–306, 1998.

[99] P. J. Phillips, H. Moon, S. A. Rizvi, and P. Rauss, “The FERET evaluation method for

face recognition algorithms,” IEEE Transactions on Pattern Analysis and Machine

Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000.

[100] T. Sim, S. Baker, and M. Bsat, “The CMU pose, illumination and expression

database,” IEEE Transactions on Pattern Analysis and Machine Intelligence,

vol. 25, no. 12, pp. 1615–1618, 2003.

[101] A. M. Martinez and R. Benavente, “The AR face database,” Computer Vision Center (CVC),

Tech. Rep. 24, 1998. [Online]. Available:

http://cobweb.ecn.purdue.edu/~aleix/aleix_face_DB.html

[102] “The FG-NET aging database.” [Online]. Available: http://www.fgnet.rsunit.com/

[103] “The BioID face database.” [Online]. Available: http://www.bioid.com/downloads/facedb/index.php

[104] M. A. Turk and A. P. Pentland, “Eigenfaces for recognition,” Journal of Cognitive

Neuroscience, vol. 3, no. 1, pp. 71–86, 1991.

[105] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. Fisherfaces:

recognition using class specific linear projection,” IEEE Transactions on Pattern

Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720, 1997.

[106] A. M. Martinez and A. C. Kak, “PCA versus LDA,” IEEE Transactions on Pattern

Analysis and Machine Intelligence, vol. 23, no. 2, pp. 228–233, 2001.

[107] R. Chellappa, C. Wilson, and S. Sirohey, “Human and machine recognition of faces:

a survey,” Proceedings of the IEEE, vol. 83, pp. 705–740, 1995.

[108] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, “Face recognition: a literature survey,”

ACM Computing Surveys, vol. 35, no. 4, pp. 399–458, 2003.

[109] A. Tolba, A. El-Baz, and A. El-Harby, “Face recognition: a literature review,”

International Journal of Signal Processing, vol. 2, no. 2, pp. 88–103, 2006.

[110] F. Samaria, “Face recognition using hidden Markov models,” Ph.D. dissertation,

University of Cambridge, U.K., 1994.

[111] F. Samaria and S. Young, “HMM based architecture for face identification,” Image

and Vision Computing, vol. 12, pp. 537–543, 1994.

[112] L. Wiskott, J.-M. Fellous, N. Krüger, and C. von der Malsburg, Face recognition by

Elastic Bunch Graph Matching. CRC Press, 1999.

[113] I. J. Cox, J. Ghosn, and P. N. Yianilos, “Feature-based face recognition using

mixture-distance,” in Proceedings of IEEE Conference on Computer Vision and

Pattern Recognition (CVPR), 1996, pp. 209–216.

[114] G. Shakhnarovich and B. Moghaddam, “Face recognition in subspaces,” in

Handbook of Face Recognition, S. Z. Li and A. K. Jain, Eds.

Springer, 2004, pp. 141–168.

[115] S. Xiang, H. Kim, and J. Huang, “Histogram-based image hashing scheme robust

against geometric deformations,” in Proceedings of the 9th Workshop on Multimedia

and Security, 2007, pp. 121–128.

[116] H. A. David and H. N. Nagaraja, Order Statistics, Third Edition. John Wiley and

Sons, 2003.

[117] K. Muralidhar, R. Parsa, and R. Sarathy, “A general additive data perturbation

method for database security,” Management Science, vol. 45, no. 10, pp. 1399–1415,

1999.

[118] J. T. Wang, X. Wang, K. I. Lin, D. Shasha, B. A. Shapiro, and K. Zhang, “Eval-

uating a class of distance-mapping algorithms for data mining and clustering,” in

Proceedings of the ACM SIGKDD International Conference on Knowledge Discov-

ery and Data Mining, 1999, pp. 307–311.

[119] J. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, “Face recognition using kernel

direct discriminant analysis algorithms,” IEEE Transactions on Neural Networks,

vol. 14, no. 1, pp. 117–126, 2003.

[120] K. Muralidhar, R. Parsa, and R. Sarathy, “A general additive data perturbation

method for database security,” Management Science, vol. 45, no. 10, pp. 1399–1415,

1999.

[121] R. Agrawal and R. Srikant, “Privacy-preserving data mining,” in Proceedings of

the ACM SIGMOD Conference on Management of Data. ACM Press, 2000, pp.

439–450.

[122] K. Chen and L. Liu, “A survey of multiplicative perturbation for privacy-preserving

data mining,” in Privacy-Preserving Data Mining, ser. Advances in Database Sys-

tems, C. C. Aggarwal and P. S. Yu, Eds. Springer US, 2008, pp. 157–181.

[123] J. J. Kim and W. E. Winkler, “Multiplicative noise for

masking continuous data,” Statistical Research Division, US Bureau of the Census,

Washington D.C, Tech. Rep., 2003.
