1. Introduction

Real-time acquisition of depth and colour images using structured light and its application to 3D face recognition

Filareti Tsalakanidou, Frank Forster, Sotiris Malassiotis, Michael G. Strintzis

• Introduction of a new way of acquiring real-time images of moving objects in arbitrary scenes using a low-cost 3D sensor.

• This technique is then applied to face recognition.

• Use 3D and 2D images for personal identification based on the discriminatory shape of the face, which is not affected by changing light or by facial pigment.

2.1. Previous work: Real-time acquisition of range images

• Principle behind the technique: projection of a static coded light pattern onto the scene and measurement of its deformation on object surfaces.

• Spatial coding of the static projection pattern: the light rays are coded by spatial markings (sub-patterns) within the pattern.

• Problem: reflectivity of the scene. Theoretical solution: rainbow pattern.

• New approach: colour-coded light via spatial encoding with a single colour projection pattern.

• Get depth map from a single snapshot of the scene illuminated by the pattern.
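
Once a camera pixel has been matched to a known projected stripe, depth follows from triangulation: the camera ray through the pixel is intersected with the plane of light emitted for that stripe. The sketch below is a minimal illustration under stated assumptions, not the sensor's actual implementation. It assumes an idealised pinhole camera at the origin looking along +z and a light plane given by a point and a normal. The function name, the focal length and the plane parameters are hypothetical.

```python
def intersect_ray_with_light_plane(pixel_xy, focal_px, plane_point, plane_normal):
    """Return the 3D point where the camera ray through pixel_xy meets the light plane.

    Assumes a pinhole camera at the origin looking along +z, with pixel
    coordinates measured from the principal point (hypothetical set-up).
    """
    # Direction of the viewing ray through the pixel.
    ray = (pixel_xy[0] / focal_px, pixel_xy[1] / focal_px, 1.0)
    denom = sum(n * r for n, r in zip(plane_normal, ray))
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the light plane")
    # Solve n . (s * ray - p) = 0 for the scale s along the ray.
    s = sum(n * p for n, p in zip(plane_normal, plane_point)) / denom
    return tuple(s * r for r in ray)
```

For example, with the projector 0.3 units to the side of the camera, a stripe plane through (0.3, 0, 0) with normal (1, 0, 0.3) crosses the optical axis at depth 1, so the central pixel maps to the point (0, 0, 1).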

2.2. Previous work: 3D Face recognition

• Earliest approaches were feature based.

• Here an appearance-based approach is used, which simplifies 3D data processing by using 2D depth images.

• These depth images are used along with prior knowledge of face symmetry/geometry for face detection and localisation.

• Main problem: accurate alignment is required between the sensor and the object being imaged.

• Use pose and illumination compensation techniques before classification.

3.1. Range image acquisition method - The projection pattern

• Key to depth acquisition technique.

• During transmission, some parts of the pattern are irreversibly lost and ghost symbols are introduced.

• Errors are potentially frequent in coded light sensors and must be taken into account during coding/decoding.

• Colour pattern used here is composed of parallel coloured stripes, where n adjacent colour edges form a codeword.

• Colours used form the 8 corners of RGB cube

• Adjacent stripes must differ in 2+ colour channels, so 20 distinct edges possible.

• Edges are about 4 pixels apart.
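
The coding rules above can be illustrated with a short sketch. It represents each stripe colour as a corner of the RGB cube, describes an edge by the per-channel change between neighbouring stripes, and forms codewords from n adjacent edges. The function names are hypothetical, and the requirement that every codeword occur only once in the pattern is an assumption made here so that a decoded word identifies its position; the slides do not state how the actual pattern was designed.

```python
def edge(left, right):
    """Describe a colour edge by the per-channel change (-1, 0 or +1)."""
    return tuple(r - l for l, r in zip(left, right))


def is_valid_edge(left, right):
    """Adjacent stripes must differ in at least two colour channels."""
    return sum(l != r for l, r in zip(left, right)) >= 2


def codewords(stripes, n):
    """Slide a window over the stripe sequence; n adjacent edges form one codeword."""
    edges = [edge(a, b) for a, b in zip(stripes, stripes[1:])]
    return [tuple(edges[i:i + n]) for i in range(len(edges) - n + 1)]


def is_decodable(stripes, n):
    """True if every edge is valid and every codeword occurs only once (assumed rule)."""
    if not all(is_valid_edge(a, b) for a, b in zip(stripes, stripes[1:])):
        return False
    words = codewords(stripes, n)
    return len(words) == len(set(words))


# Stripes are (R, G, B) tuples of 0/1: red, green, blue, yellow, black.
example = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (0, 0, 0)]
print(is_decodable(example, n=2))
```

A pattern that alternates between only two colours fails the check, because the same pair of edges repeats and a decoded word would be ambiguous.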

3.2. Data processing – Edge pixel detection

• Edge pixel detection pipeline:

- Split the colour image into R, G and B channel images

- Filter each channel with a 3 x 3 Gaussian

- Establish the local orientation of the pattern stripes

- Compute the 1D derivative orthogonal to the stripe orientation

- Perform non-extrema suppression on the 1D derivative

- Form multi-channel extrema

Detection of projected colour edges: (a) colour image, (b) red single-channel extrema, (c) green single-channel extrema, (d) blue single-channel extrema and (e) traced ridges of correctly identified edge pixels.
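
The edge pixel detection steps can be sketched on a single scanline. This is a simplified illustration, not the authors' implementation: it assumes the stripes are vertical, so the scanline already runs orthogonal to them and the orientation step is skipped. It uses the 1D kernel [1, 2, 1] / 4 as a stand-in for the 3 x 3 Gaussian. The function names and the threshold value are hypothetical.

```python
def smooth(row):
    """Smooth one scanline with the 1D kernel [1, 2, 1] / 4 (borders replicated)."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [(padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4 for i in range(len(row))]


def derivative(row):
    """Central-difference derivative along the scanline (zero at the borders)."""
    d = [0.0] * len(row)
    for i in range(1, len(row) - 1):
        d[i] = (row[i + 1] - row[i - 1]) / 2
    return d


def local_extrema(d, threshold):
    """Non-extrema suppression: keep the sign only at local peaks of |d| above threshold."""
    out = [0] * len(d)
    for i in range(1, len(d) - 1):
        m = abs(d[i])
        if m >= threshold and m > abs(d[i - 1]) and m >= abs(d[i + 1]):
            out[i] = 1 if d[i] > 0 else -1
    return out


def multi_channel_extrema(rgb_row, threshold=0.1):
    """Combine per-channel extrema into (position, (dR, dG, dB)) tuples."""
    channels = [[px[c] for px in rgb_row] for c in range(3)]
    signs = [local_extrema(derivative(smooth(ch)), threshold) for ch in channels]
    return [(i, (signs[0][i], signs[1][i], signs[2][i]))
            for i in range(len(rgb_row)) if any(s[i] for s in signs)]


# A red stripe followed by a green stripe, each 4 pixels wide.
scanline = [(1, 0, 0)] * 4 + [(0, 1, 0)] * 4
print(multi_channel_extrema(scanline))
```

Each reported extremum carries the sign of the change in every channel, which is the information needed to classify the edge before codewords are decoded.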

Edge segment detection

• Spatially adjacent pixels of the same class are traced to obtain edge segments.

• Determine sequences of n multi-channel extrema that lie along the direction orthogonal to the stripes.

• Decode the resulting words; edge pixels that form part of a valid codeword are interpreted as the location of a projected edge.

• A pixel ridge must reach a minimum length to be used in further processing.

• In this way the algorithm identifies the colour edges that originate from the projected pattern.
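
The tracing and minimum-length steps can be sketched as a connected-component search. This is a minimal illustration under assumptions: 8-connectivity, a dictionary mapping pixel coordinates to a class label, and the names `trace_ridges` and `min_length` are all hypothetical choices that the slides do not specify.

```python
def trace_ridges(edge_pixels, min_length):
    """Group 8-connected edge pixels of the same class and drop short ridges.

    edge_pixels maps (x, y) -> class label (for example a decoded edge id).
    """
    unvisited = set(edge_pixels)
    ridges = []
    while unvisited:
        seed = unvisited.pop()
        label = edge_pixels[seed]
        ridge, stack = [seed], [seed]
        while stack:
            x, y = stack.pop()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (x + dx, y + dy)
                    if nb in unvisited and edge_pixels[nb] == label:
                        unvisited.remove(nb)
                        ridge.append(nb)
                        stack.append(nb)
        # Keep only ridges that are long enough for further processing.
        if len(ridge) >= min_length:
            ridges.append((label, sorted(ridge)))
    return ridges
```

Short isolated responses, which are more likely to be noise than part of a projected edge, are discarded by the length test.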

3.3. A novel range sensor based on the method

• Colour images acquired with Basler 302fc single-chip Bayer-Pattern CCD RGB camera with resolution of 780 x 580 pixels.

• Projection of pattern with a Panasonic LPT multimedia projector with resolution of 800 x 600 pixels.

• The projector is rotated to give convergence angle of about 20° towards the camera.

• 3D sensor to obtain depth information used in face recognition is shown below.

• Switching rapidly between the coloured pattern and white light also yields a colour image that is synchronised with the depth image.

4.1. 3D Face authentication system – Face detection, localisation and normalisation

• Face detection and localisation are based on 3D data and are unaffected by illumination or facial features.

• Pose compensation:

- segmentation of head from body

- accurate detection of tip of nose

- align 3D local coordinate system centred on nose

- warping of the depth image to align local coordinate system to a reference one.

• Illumination compensation:

- recovery of scene illumination from pair of depth and colour images

- assume single light source, and from artificially generated images, get non-linear relationship between image brightness and light source direction

- compensate image by multiplying with ratio image.
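
The final illumination compensation step, multiplying by a ratio image, can be sketched as follows. This is an illustration under explicit assumptions, not the method as implemented by the authors. It assumes Lambertian shading with a small ambient term, surface normals already derived from the depth image, and a ratio defined as shading under a frontal reference light divided by shading under the estimated light. All names and default values are hypothetical.

```python
def lambert(normal, light, ambient=0.2):
    """Lambertian shading of a unit normal under a unit light direction, plus ambient."""
    return ambient + max(0.0, sum(n * l for n, l in zip(normal, light)))


def compensate(pixels, normals, estimated_light, reference_light=(0.0, 0.0, 1.0)):
    """Relight pixel intensities by multiplying each one with its ratio-image value."""
    out = []
    for value, normal in zip(pixels, normals):
        # Ratio image: reference shading divided by shading under the estimated light.
        ratio = lambert(normal, reference_light) / lambert(normal, estimated_light)
        out.append(value * ratio)
    return out
```

Under this shading model, a pixel observed under the estimated light is mapped to the intensity it would have under the frontal reference light, because the surface reflectance cancels in the ratio.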

4.2. Face authentication

• Multimodal classification using 2D and 3D images of the face.

• Two independent classifiers used: one for colour images, the other for depth images.

• Probabilistic Matching (PM) algorithm for face recognition.

• Normalised colour and depth images generated after pose and illumination compensation are used as the inputs of the classifiers.
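
One simple way to combine two independent classifiers is a weighted sum of their scores. The slides do not say which fusion rule was used, so the sketch below is only an assumed example: it presumes each classifier returns a similarity score in [0, 1], and the weight and threshold values are hypothetical.

```python
def fuse_scores(colour_score, depth_score, depth_weight=0.5):
    """Weighted sum of two similarity scores in [0, 1] (assumed fusion rule)."""
    if not 0.0 <= depth_weight <= 1.0:
        raise ValueError("depth_weight must lie in [0, 1]")
    return (1.0 - depth_weight) * colour_score + depth_weight * depth_score


def authenticate(colour_score, depth_score, threshold=0.5, depth_weight=0.5):
    """Accept the identity claim if the fused score reaches the threshold."""
    return fuse_scores(colour_score, depth_score, depth_weight) >= threshold
```

With equal weights, a weak colour match of 0.4 and a strong depth match of 0.8 fuse to about 0.6 and the claim is accepted at a threshold of 0.5.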

5.1. Experimental results

• 3D sensor accuracy:

- Focus on depth error which is mainly due to localisation error

- Statistical depth error found by acquiring several depth maps of a scene and computing the standard deviation (0.01–0.04 mm)

- Other experiments involving planar objects

- Depth accuracy of the 3D sensor: ~0.1–0.3 mm.

• 3D sensor data rate:

- Frame rate not fixed; depends on size of scene in image, exposure time of camera and other factors.

Sequence    FPS (2.4 GHz)    FPS (3.2 GHz)    Avg. number of range values
Head        12.3             18.3             192 000
Fan         14.5             21.3             142 000
Gesture     16.0             23.7             83 000

Intensity and depth images of fan and head.
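
The repeatability measurement described under "3D sensor accuracy" (several depth maps of the same scene, then the standard deviation) can be sketched as below. The data layout is an assumption: each depth map is a 2D list and `None` marks pixels without a range value. Averaging the per-pixel deviations into one figure is also an assumed choice, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def depth_repeatability(depth_maps):
    """Mean per-pixel standard deviation over repeated depth maps of a static scene.

    depth_maps is a list of equally sized 2D lists; None marks missing range values.
    """
    rows, cols = len(depth_maps[0]), len(depth_maps[0][0])
    deviations = []
    for r in range(rows):
        for c in range(cols):
            samples = [m[r][c] for m in depth_maps]
            # Use only pixels that were measured in every acquisition.
            if all(s is not None for s in samples):
                deviations.append(pstdev(samples))
    return mean(deviations)
```

Pixels that are missing in any acquisition are left out, so occasional decoding failures do not distort the estimate.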

5.2. Evaluation of the proposed 3D face authentication scheme

• Database of several appearance variations for 73 volunteers.

• Training of PM algorithm using 4 images per person, others used for testing.

• Authentication errors with the proposed scheme were shown to be lower than those achieved manually.

• Significant improvement when depth and colour information are combined.

• Total run-time ~3 seconds.
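
Authentication errors are commonly reported as a false acceptance rate and a false rejection rate at a given decision threshold. The slides do not name the metrics used, so the sketch below is an assumed illustration with hypothetical names: scores at or above the threshold are treated as accepted claims.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false acceptance rate, false rejection rate) at a score threshold."""
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))
```

Sweeping the threshold trades one error type against the other, which is how different classifier configurations are usually compared.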

6. Conclusions

• Through the exploitation of 3D data, robust face authentication under heterogeneous conditions is achieved, using only low-cost devices.

• Unique property of the proposed system: real-time acquisition of moving scenes.

• One current possible application of the technique is human-machine interaction.

• A version of the 3D sensor based on an IR illumination source is currently under investigation and is expected to overcome the obtrusiveness of the current approach.
