Evaluation of Biometric Identification Systems
Dr. Bill Barrett, CISE department and
US National Biometric Test Center
San Jose State University
email: [email protected]
Dec 24, 2015
The Biometric Test Center
• Funded by several federal agencies
• Centered in a disinterested university setting
• Provide objective evaluations of commercial biometric instruments
• Provide consulting services to sponsors regarding the most effective application of biometric instruments
• No funds accepted from vendors
• No independent competing research
Summary of Presentation
• Three Basic Biometric Operations
• Measures of effectiveness - the ROC curve
• Comparison Rate Measures
• Collection Variables
• Evaluation strategies
• Testing issues
• Some results
• Conclusion
The Three Biometric Operations
• Enrollment: first time in.
• Verification: does this credit card belong to this person?
• Identification: who is this person, anyway?
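A minimal Python sketch of the three operations, assuming a stand-in encoder and a Euclidean distance; real devices substitute proprietary algorithms for both:

    import numpy as np

    def encode(sample):
        # Stand-in encoder: real devices run their own, often proprietary, algorithms.
        return np.asarray(sample, dtype=float)

    def distance(candidate, template):
        # Euclidean distance between codes; each device defines its own measure.
        return float(np.linalg.norm(candidate - template))

    gallery = {}  # identity -> enrolled template

    def enroll(identity, sample):
        # Enrollment: first time in. Store a template under an identity.
        gallery[identity] = encode(sample)

    def verify(identity, sample, threshold=0.5):
        # Verification (one-to-one): does the sample match the claimed identity?
        return distance(encode(sample), gallery[identity]) <= threshold

    def identify(sample):
        # Identification (one-to-many): return the closest enrolled identity.
        probe = encode(sample)
        return min(gallery, key=lambda ident: distance(probe, gallery[ident]))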
COMPARE operation
• Yields a DISTANCE measure between candidate C and template T
• d = distance(C, T)
• d LARGE: C probably is NOT T
• d SMALL: C probably IS T
• NOTE: the sense is reversed for fingerprints, whose matchers return a similarity score (large means C probably IS T)
Variations on the basic measurement plan
• 3 strikes and you’re out
• Multiple templates of same person
• Template replacement over time
• Template averaging
• Binning
Binning
• Find some way to segment the templates, e.g.
  • male/female
  • particular finger
  • loop vs. whorl vs. arch
• May have to include the same template in different bins
• Improves search performance, may reduce search accuracy (more false non-matches)
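A sketch of bin assignment, assuming a hypothetical classifier pattern_classes() that may return several candidate classes for an ambiguous template (all names here are illustrative):

    from collections import defaultdict

    def pattern_classes(template):
        # Hypothetical classifier: returns every ridge-pattern class an
        # ambiguous print might belong to, e.g. ["left_loop", "unknown"].
        return template["classes"]

    bins = defaultdict(list)

    def file_template(template):
        # The same template may have to be filed in several bins.
        for cls in pattern_classes(template):
            bins[cls].append(template)

    def candidates_for(probe):
        # Search only the probe's bins exhaustively, not the whole file.
        for cls in pattern_classes(probe):
            yield from bins[cls]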
Cross-over Threshold
• tc = cross-over threshold
• the threshold at which the probability of a false match (I) equals the probability of a false rejection (A)
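A sketch of locating tc by sweeping the device threshold over authentic and impostor distance scores; the two normal distributions here are synthetic stand-ins, not measured data:

    import numpy as np

    rng = np.random.default_rng(0)
    authentic_d = rng.normal(1.0, 0.5, 10_000)  # distances when C IS T
    impostor_d = rng.normal(3.0, 0.5, 10_000)   # distances when C is NOT T

    def error_rates(t):
        I = float(np.mean(impostor_d <= t))   # false match: impostor accepted
        A = float(np.mean(authentic_d > t))   # false rejection: authentic rejected
        return I, A

    # The cross-over is the threshold where |I - A| is smallest.
    ts = np.linspace(0.0, 4.0, 2001)
    tc = min(ts, key=lambda t: abs(np.subtract(*error_rates(t))))
    print(tc, error_rates(tc))  # tc near 2.0, with I and A roughly equal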
Changing the Device Threshold
• td > tc : reduces false rejection (A), increases false match (I) - the bank ATM choice
• td < tc : increases false rejection (A), reduces false match (I) - the prison guard choice
The d-prime Measure
$d' = \frac{|\mu_1 - \mu_2|}{\sqrt{(\sigma_1^2 + \sigma_2^2)/2}}$

where $\mu_i$, $\sigma_i$ are the mean and standard deviation of the authentic and impostor distance distributions.
• Measures the overall quality of a biometric instrument.
• d' usually in the range of 2 to 10; logarithmic, like the Richter Scale.
• Assumes normal distributions.
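A toy d' computation on the same kind of synthetic authentic/impostor scores, following the formula above:

    import numpy as np

    rng = np.random.default_rng(0)
    authentic_d = rng.normal(1.0, 0.5, 10_000)
    impostor_d = rng.normal(3.0, 0.5, 10_000)

    def d_prime(a, b):
        # |mu1 - mu2| / sqrt((sigma1^2 + sigma2^2) / 2)
        return abs(a.mean() - b.mean()) / np.sqrt((a.var() + b.var()) / 2)

    print(d_prime(authentic_d, impostor_d))  # ~4.0 for these synthetic scores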
Penetration Rate
• Percentage of templates that must be individually compared to a candidate, given some binning.
• Search problem: usually exhaustive search, with some comparison algorithm, no reliable tree or hash classification.
• Low penetration rate implies faster searching
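A toy penetration-rate calculation under an assumed five-way fingerprint binning; the bin sizes are invented for illustration:

    bins = {"left_loop": 320, "right_loop": 310, "whorl": 250,
            "arch": 60, "unknown": 60}

    def penetration_rate(bins, candidate_bins):
        # Fraction of the file that must be searched for a candidate
        # whose print may fall in any of candidate_bins.
        total = sum(bins.values())
        return sum(bins[b] for b in candidate_bins) / total

    print(penetration_rate(bins, ["left_loop"]))             # 0.32
    print(penetration_rate(bins, ["left_loop", "unknown"]))  # 0.38, ambiguous print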
Example: fingerprints
AFIS (the FBI's Automated Fingerprint Identification System) classifies by:
• Left loop / right loop
• Arch / whorl
• Unknown
Then:
• Exhaustive search of the subset of prints
Jain, Hong, Pankanti & Bolle, "An Identity-Authentication System Using Fingerprints," Proc. IEEE, vol. 85, no. 9, Sept. 1997.
Bin Error Rate
• Probability that a search for a matching template will fail owing to an incorrect bin placement
• Related to confidence in the binning strategy
• AFIS bin error typically < 1%
Collection Variables
• Physical variations during biometric collection that may change the measurement
• Translation/scaling/rotation usually compensated in software
• Tend to increase the width of the authentics distribution, and thus...
• ...make it easier to get a false rejection
• ...cause a smaller d'
Liveness Issue
Can the device detect that the subject is live?
• Fake face recognition with a photograph?
• ...or a rubber print image (fingerprint)?
• ...or a glass eye (iris encoding)?
Collection Variables -- Fingerprints
• Pressure
• Angle of contact
• Stray fluids, film buildup
• Liveness
Collection Variables - Hand Geometry
• Finger positioning (usually constrained by pins)
• Rings
• Aging
• Liveness
Collection Variables - Iris Identification
• Lateral angle of head
• Focus quality
• Some people have very dark irises; hard to distinguish from pupil
• Outer diameter of iris difficult to establish
• Eyelids, lashes may interfere
• NO sunglasses
• Liveness can be established from live video
Collection Variables - Face Recognition
• 3D angles
• Lighting
• Background
• Expression
• Hairline
• Artifacts (beard, glasses)
• Aging
• Liveness: smiling, blinking
Collection Variables - Voice Recognition
• Speed of delivery
• Articulation
• Nervousness
• Aging
• Laryngitis
• Liveness: choose speech segments for the user to repeat, e.g. "Say 8. Say Q. Say X"
Example - Miros Face Recognition System
• Lighting is specified
• Static background, subtracted from candidate image to segment face
• Camera mounted to a wall - standing candidate
• Height of eyes above floor used as an auxiliary measure
• Verification only recommended
• Liveness - can be fooled with a color photograph
Example - Faceit™ Face Recognition System
• No particular lighting specified; it expects similar lighting & expression of candidate and template
• Face segmented from background using live video
• Face lateral angles not well tolerated
• Liveness: blinking, smiling test
Common Factors
• Biometric capture: easy to capture the full image
• Encoding algorithm: often proprietary
• Encoded template: usually proprietary
• Database distance measure: may be proprietary
Convenience Factors
• Many are concerned about intrusiveness
• Some are concerned about touching
• What is the candidate’s learning curve?
...device may require some training
Collecting a Template Database for Testing
• Precise identity: code registration
• Getting plenty of variety: gender, age, race
• Getting many images of same identity
• Getting many different images
• Significant time frame
Practical Databases
• Many large template databases with unique identities & single images available
• Many large databases with inaccurate identity correlation
• Many databases with limited diversity
• Difficult to collect data over time
Hand Geometry for INSPASS
• INSPASS: INS Passenger Accelerated Service System
• Collected 3,000 raw transaction records
• Unique individuals in database (separate magnetic identity card)
• ...from three international airports
• Statistical modelling is suspect for this data
• Experimental d' is 2.1; equal error rate ~2.5%
Faceit™ General Comments
• Supported by a flexible Software Development Kit (SDK), using Microsoft Visual C++™
• Several example applications
• Well documented
• Can use any video camera
• Segments a face with motion video
• Liveness: smile or eye blink
Faceit™ Face Recognition
• No lighting conditions specified
• No background conditions specified
• Multiple faces can be segmented
• The database includes full images, with a default limit of 100 templates
• Image conversion and code comparison are separated, and therefore testable
Faceit™ Face Recognition
[Figure: Faceit performance. Acceptance vs. confidence level (33-97) for impostor and authentic populations.]
Faceit™ Study Summary
• Done by senior computer engineering students
• Not a fully diversified, controlled experiment
• 50 different persons, 10 images each
• Overall time frame ~2 months
• Equal error rate crossover point ~5.5%
Miros™ Face Recognition
• Designed for verification
• Lighting conditions specified
• Static background - system takes snapshot of background, uses it to segment a face
• Keyboard code plus two images
• Double image helps liveness
• Software Development Kit similar to Faceit
• Equal crossover error rate ~5%
Iriscan™ Recognition System
• Based on a patent by John Daugman, US 5,291,560, Mar. 1, 1994
• Uses iris patterns laid down a few months after birth. Claims no significant aging over lifespan.
• Claims high d’, yielding cross-over error rate < 1 in 1.2 million
• Claims a high rate of code comparison via Hamming distance: ~100,000 IrisCodes/second on a PC (sketched below)
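A sketch of IrisCode-style comparison: codes are fixed-length bit strings and the score is the fractional Hamming distance. The 2048-bit length follows Daugman's published design; the occlusion masks (eyelids, lashes) used by the real system are omitted:

    import numpy as np

    rng = np.random.default_rng(0)
    code_a = rng.integers(0, 2, 2048, dtype=np.uint8)
    code_b = rng.integers(0, 2, 2048, dtype=np.uint8)

    def hamming(a, b):
        # Fraction of disagreeing bits: ~0.5 for codes of different eyes,
        # well below 0.5 for two codes of the same eye.
        return np.count_nonzero(a != b) / a.size

    print(hamming(code_a, code_b))  # ~0.5 for independent random codes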
Iriscan™ Observations
• Capture equipment more expensive: zoom / telephoto / swivel robotics / autofocus
• Question of conversion and code standardization: most of the system is proprietary
• Liveness
• Promises to have the highest discrimination of all
Conclusions
• Need for independent evaluation of biometric devices is clear
• Adequate testing usually requires a special version of the software
• Acquiring a suitable database is difficult
• Proprietary software means black-box testing, therefore less conclusive