SCM-1
Faces and Caricatures:
3D Caricature Generator
by
Chris Johnson
Project Supervisor:
Steve Maddock
Module Code:
COM3010
01/05/2007
Declaration
All sentences or passages quoted in this report from other people's work have
been specifically acknowledged by clear cross-referencing to author, work and
page(s). Any illustrations which are not the work of the author of this report have
been used with the explicit permission of the originator and are specifically
acknowledged. I understand that failure to do this, amounts to plagiarism and will be
considered grounds for failure in this project and the degree examination as a whole.
Name : ………………………..
Signature: ………………………..
Date: ………………………..
Abstract
A caricature is a portrait which exaggerates the features of the subject that make them recognisable, in order to capture their character. Caricatures were first recognised as an individual art form in the 17th century and became very popular in the early 20th century, with many newspapers and magazines preferring caricatures of people to real photographs. In the 1970s people began drawing and modelling faces with computers, and it was not long until the first caricature generator was created by S. Brennan.
The aim of this project is to create an autonomous 3-dimensional caricature generator. This takes portrait photographs of a person, creates a 3D model representation of their face, and caricaturises it to a level selected by the user. This was achieved by using polygon meshes to model the subject's face, then manipulating the vertices of the mesh according to an algorithm which exaggerates the differences between the subject face and an average face to create a caricature.
The results of this project are quite pleasing; the program is able to generate good caricatures of symmetric faces by exaggeration. In addition, individual feature exaggeration has also been attempted and yields reasonable results, along with basic solutions to the problems encountered with this feature.
Contents
Title Page ... i
Declaration ... ii
Abstract ... iii
Contents ... iv
List of Figures ... v
Chapter 1: Introduction ... 1
Chapter 2: Research & Literature Review ... 2
 2.1 Introduction ... 2
 2.2 3D Modelling & Face Generation ... 2
 2.2.1 3D Modelling with Computers ... 2
 2.2.2 Modelling Faces with FaceGen ... 3
 2.2.3 Generating a face model from real life pictures ... 4
 2.3 Caricature Creation / Exaggeration Methods ... 5
 2.3.1 How do humans draw caricatures? ... 5
 2.3.2 Exaggerating the difference from the mean ... 6
 2.3.3 Improved EDFM by Feature Normalization and Exaggeration ... 6
 2.3.4 Example-Based Caricature Generation with Exaggeration and Neural Networks ... 7
 2.3.5 Interactive Caricature Generation using Morphing ... 9
 2.4 Summary ... 10
Chapter 3: Requirements and Analysis ... 11
 3.1 Introduction ... 11
 3.2 Specification ... 11
 3.2.1 Modelling the Face ... 11
 3.2.2 Caricaturing the Face ... 11
 3.3 Modelling the Face ... 12
 3.3.1 Polygon Meshes ... 12
 3.3.2 Importing Face Models ... 12
 3.4 Model Manipulation / Exaggeration ... 12
 3.5 Formal Requirements ... 14
 3.6 System Evaluation ... 15
Chapter 4: System Design ... 16
 4.1 Introduction ... 16
 4.2 Design Technique ... 16
 4.3 Method Overview ... 16
 4.4 Data Structures ... 18
 4.5 Class Implementation ... 19
 4.6 User Interface ... 20
Chapter 5: Implementation and Testing ... 21
 5.1 Programming Languages ... 21
 5.2 Face Generation ... 21
 5.3 Face Exaggeration / Caricature Generation ... 25
 5.4 Feature Exaggeration ... 27
 5.5 User Interface & Scene Display ... 28
 5.6 Testing ... 30
 5.6.1 Face Generation ... 31
 5.6.2 Face Exaggeration ... 31
 5.6.3 Feature Exaggeration ... 32
 5.6.4 System Testing ... 33
Chapter 6: Results and Discussion ... 35
 6.1 Caricature Results ... 35
 6.2 Feature Caricature Results ... 36
 6.3 Caricature Quality ... 37
 6.4 Further Work ... 40
Chapter 7: Conclusions ... 41
References ... 43
Appendix A ... 44
Appendix B ... 46
Appendix C ... 48
Appendix D ... 49
Appendix E ... 52
List of Figures
Figure 1.1 ... 1
Figure 2.1 ... 2
Figure 2.2 ... 3
Figure 2.3 ... 3
Figure 2.4 ... 4
Figure 2.5 ... 4
Figure 2.6 ... 7
Figure 2.7 ... 8
Figure 2.8 ... 9
Figure 2.9 ... 9
Figure 2.10 ... 10
Figure 3.1 ... 12
Figure 3.2 ... 13
Figure 3.3 ... 14
Figure 3.4 ... 15
Figure 4.1 ... 17
Figure 4.2 ... 18
Figure 4.3 ... 19
Figure 4.4 ... 20
Figure 4.5 ... 20
Figure 4.6 ... 20
Figure 5.1 ... 22
Figure 5.2 ... 23
Figure 5.3 ... 24
Figure 5.4 ... 25
Figure 5.5 ... 26
Figure 5.6 ... 26
Figure 5.7 ... 27
Figure 5.8 ... 27
Figure 5.9 ... 28
Figure 5.10 ... 29
Figure 5.11 ... 29
Figure 5.12 ... 30
Figure 5.13 ... 31
Figure 5.14 ... 32
Figure 5.15 ... 34
Figure 6.1 ... 36
Figure 6.2 ... 37
Figure 6.3 ... 38
Figure 6.4 ... 39
Chapter 1: Introduction
For the past few decades artists and programmers have been trying to define how a caricature is created and how the process of turning a regular portrait into a caricature could be automated using a computer program. In 1985, S. Brennan [Bren,85] made the first computer program to draw caricatures, called "Caricature Generator", using a widely accepted method of creating caricatures called "exaggerating the difference from the mean" (EDFM). In EDFM you create and map an average or "normal" face, then map the face you wish to caricature and compare it to the "normal" face; the differences between the two faces are then exaggerated to make the caricature. Brennan's Caricature Generator only worked with 2D portraits. Since then there have been other programs using the same methods to create 2D caricatures, and more recently some for 3D caricatures, some more successful than others.
The aim of this project is to write a fully automated, working program which will generate a 3D caricature of any face. A caricature is a cartoon version of a person or face where certain features have been exaggerated and other features have been toned down. A good caricature is instantly recognisable as a representation of the original face. In Figure 1.1 there are 3 different caricatures, drawn by 3 different artists, but all 3 have exaggerated the same features and all have a distinct likeness to the real person. The problem when trying to generate caricatures with computers is knowing which features to exaggerate and how to exaggerate them, since this varies from person to person.
This report is split into six remaining chapters. Chapter 2 is a literature review covering methods and technology useful for this project. Chapter 3 sets out the requirements and analysis, specifying the aims and the methods chosen to achieve them. Chapter 4 describes the design of the system, Chapter 5 covers its implementation and testing, Chapter 6 presents and discusses the results, and Chapter 7 draws conclusions.
Figure 1.1 – Arnold Schwarzenegger caricatures. A: [Dean,04], B: [Jone,05], C: [Prch,99]
Chapter 2: Research & Literature Review
2.1 Introduction
This chapter will explore work done by various artists, caricaturists and computer scientists which will be useful for developing an automatic caricature generator. Different methods of modelling 3D faces and creating caricatures will be researched and compared. First of all, Section 2.2 will cover how the faces could be modelled in 3D. Section 2.3 will then go on to describe methods of creating caricatures, starting with how humans draw caricatures in Section 2.3.1, then moving on to autonomous computer-generated caricatures using exaggeration (Sections 2.3.2 and 2.3.3), example-based learning with neural networks (Section 2.3.4) and morphing (Section 2.3.5). The chapter will finish with a summary of the findings so far.
2.2 3D Modelling & Face Generation
2.2.1 3D Modelling with Computers
There are various 3D computer graphics APIs and languages, but all of them create objects in the same way. A 3D object is built up of a set of vertices, joined together to make polygons, which in turn are joined together to make polygon meshes. Each vertex has an X, Y and Z co-ordinate which denotes where that point is in the displayed space (as if there were a set of axes or a grid over the whole space). These vertices are then joined together to form polygons (2D shapes with 3 or more sides), usually either triangles or 4-sided polygons, though they can have any number of sides.
There are rules about creating polygons when modelling, to maintain a certain standard. One is that all polygons should be convex, meaning that if you draw a straight line between any of the vertices making up the polygon, that line should be contained within the polygon. This means that triangular polygons are the best shape to use, because you cannot have a concave triangle, and any other polygon (concave or convex) can be split up into 2 or more triangles. The polygons can then be joined together to make polygon meshes: planes of 2D polygons joined together at shared vertices. Using polygons it is possible to model any complex object to any level of detail (Fig 2.1 shows an example of a polygon mesh made of 3- and 4-sided polygons used to model a human head).
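As a small concrete example (in Java, the implementation language used later in this report; the array names are illustrative), a mesh can be stored as a list of shared vertices plus triangles that index into it:

    // A tiny polygon mesh: four shared vertices and two triangles that
    // index into them; together the triangles form one flat quad.
    float[][] vertices = {
        {0.0f, 0.0f, 0.0f},   // vertex 0: (x, y, z)
        {1.0f, 0.0f, 0.0f},   // vertex 1
        {1.0f, 1.0f, 0.0f},   // vertex 2
        {0.0f, 1.0f, 0.0f}    // vertex 3
    };
    int[][] triangles = {
        {0, 1, 2},            // first triangle
        {0, 2, 3}             // second triangle, sharing the edge 0-2
    };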
Once the object is modelled, it needs to be given a colour, which is done by mapping a texture onto the model. There are 2 ways of drawing textures: 2D mapping or 3D mapping. 2D mapping is where you take a 2D image and map it onto the surface of the 3D model, essentially wrapping the 2D image around the 3D model like wrapping paper. Each polygon vertex is given a (u,v) co-ordinate; the same set of (u,v) co-ordinates is assigned to the 2D image you want to use as a texture, and the image is then stretched over the surface of the 3D model by matching up the (u,v) co-ordinates
(Fig 2.2 shows this process). 3D mapping is where you have a 3D texture block, and
you place your model inside it, and it colours your model with the colour at the same place in the 3D texture, like carving your model out of marble or wood.

Fig 2.2 – 2D Texture Mapping
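To make this concrete, the fragment below sketches how a textured triangle might be drawn with JOGL (the rendering library chosen in Chapter 5): each vertex is given its (u,v) texture co-ordinate immediately before its (x,y,z) position. The method name is illustrative.

    import javax.media.opengl.GL;

    // Draw one textured triangle: glTexCoord2f sets the (u,v) co-ordinate
    // into the texture image that the following vertex maps to.
    void drawTexturedTriangle(GL gl) {
        gl.glBegin(GL.GL_TRIANGLES);
        gl.glTexCoord2f(0.0f, 0.0f); gl.glVertex3f(-1.0f, -1.0f, 0.0f);
        gl.glTexCoord2f(1.0f, 0.0f); gl.glVertex3f( 1.0f, -1.0f, 0.0f);
        gl.glTexCoord2f(0.5f, 1.0f); gl.glVertex3f( 0.0f,  1.0f, 0.0f);
        gl.glEnd();
    }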
2.2.2 Modelling Faces with FaceGen
FaceGen is a program written by Singular Inversions to create 3D models of human faces/heads (Fig 2.3 shows the FaceGen interface and an example model). The program can randomly generate models of people of all ages, races and sexes, as well as give them various different facial expressions. You can also create your own custom faces to your specifications, choosing from a range of different attributes to change the appearance of the face and its features. FaceGen creates its models using both 3-sided and 4-sided polygons to make a polygon mesh of the face, then saves the model as a Wavefront (.obj) file, which is a text file consisting of lists of data with information about the vertices, faces/polygons and textures. The level of detail can be altered by changing how many polygons are used to model the face, and the realistic look of the
faces is very good. The only downside to FaceGen is that its modelling of hair is quite poor, because hair is extremely difficult to model. FaceGen also has a built-in caricature feature, where you move a slider up and down and it will caricaturise the face according to the amount you set. The caricatures are fairly basic and seem to exaggerate all features equally, with no discrimination between individual features, similar to the Caricature Generator by [Bren,85]. The caricatures still bear a slight resemblance to the original face, but they do not look like proper caricatures; they are more like general distortions of the face, because parts of the face are exaggerated which should not be. A caricature should emphasise the facial features, not distort the entire face.
2.2.3 Generating a face model from real life pictures
There are multiple ways to generate 3D face models of real people, most of which depend on having various photographs of the subject, then forming the 3D model from the information in the photographs. FaceGen has a feature where it can create 3D models of faces from real life 2D photographs in this way. It requires at least a front portrait picture of the subject's face; side pictures are optional but will improve the likeness of the model if used. It is important that the pictures are clear, with nothing obstructing the face such as hair. The person also needs a relatively neutral expression, because if you input a picture of someone with a big smile, FaceGen does not know they are smiling, so it will give them an oversized mouth. To make the model, FaceGen asks you to perform some procedures, such as marking up the face with points to denote features, as demonstrated in Fig 2.5. Using these points FaceGen can work out the shape of the head, the size of the facial features and the distances between them. In general all faces have the same structure: we all have 2 eyes, a nose, a mouth, a chin and so on, and what makes us look different is the feature size, shape and orientation within the face. Knowing this, FaceGen creates an average face model and uses the information about the features taken from the picture to stretch and transform the polygons in the model to mould it into the face you input.
Fig 2.4 – FaceGen caricatures (normal faces and their caricatured versions)
Fig 2.5 – Marking up photographs in FaceGen
Another way to model real life objects is manual digitisation. This uses a stationary mechanical arm connected to software on a computer. The computer is able to map any point you can position the arm at, so if you point the arm at specific points on the object and record those as vertices on the computer, you can generate a 3D model that is an exact representation of the real object, of the same size and proportions. The problem with this method is that it takes quite a long time, and if the object being modelled is moved at all, the model will have errors. It is therefore not a suitable method for modelling human faces, since we are unable to stay perfectly still for long periods of time.
Real life objects can also be modelled by laser scanning. The object is passed through a digitizer machine which contains a laser with a camera mounted on it. When the laser hits the object, the camera records the location of the laser dot and draws that position as a vertex in the computer's 3D model. This method also gives an exact representation of the object, with the exact same size and proportions. There are also larger digitizers which are big enough to model a whole human; these work in the same way, but instead of the object being passed through the machine, the object sits in the centre while the laser rotates around it. Both of these methods usually require the resulting model to be "cleaned up" because errors occur in the scanning process (e.g. the laser beam being reflected or dispersed, unregistered concave areas in the surface such as the nose, and the reduced number of points at polar regions). Problems also occur if the subject moves at all, which can be a real problem when modelling humans and can require a lot of "cleaning up" afterwards to make the model look correct.
2.3 Caricature Creation / Exaggeration Methods
2.3.1 How do humans draw caricatures?
[Bren,85] goes into great detail about how to make caricatures look like the subject, and how people recognise and remember faces. When people memorise a face, they do not memorise lines or shapes; they remember features. It is possible to have a poorly drawn, non-lifelike portrait of a person but still be able to recognise who the picture represents, because it has good information about their features. The book "How to draw caricatures" by Lenn Redman [Redm,84] covers in great detail the skills of drawing caricatures and is referenced by almost all papers in the area of generating caricatures. [Redm,84] says that to draw caricatures you should try to visualise a completely average face in your mind, which has no irregular features, then look for differences between your subject's face and the average face in your mind, and make those features more distinguished. If your subject has a large nose, make it larger; if your subject's eyes are close together, move them even closer. To support this, [Bren,85] studied various artists and caricaturists, and says that all of them worked from their idea of an average face, or range of faces, compared the subject's face to these, tried to visualise the differences and exaggerations in their mind, and then drew the caricature in one stage without focusing on lines or specifics, but rather on the general visualisation of the face as a whole. One important rule from [Redm,84] is that we want to exaggerate, not distort. The idea of a caricature is to exaggerate the subject's features to make it resemble the subject more than a proper portrait and be instantly recognisable. If you
distort the face rather than exaggerate, you will lose all resemblance to the original
face. Looking back at Fig 1.1, you instantly recognise who the subject of the caricatures is, and you will notice that all 3 caricatures look similar and resemble each other. This is because the caricaturists have not actually changed any part of his face; fundamentally they are still the same face, and all they have done is exaggerate the outstanding features.
2.3.2 Exaggerating the difference from the mean
EDFM compares the subject face to a purely “normal” or average face with no
imperfections or outstanding features. It then exaggerates each point by changing the
X, Y and Z co-ordinates by some function of its DFM (difference from the mean),
thus changing the shape of the face and amplifying its outstanding features. This
method tries to mimic the human way of drawing caricatures by exaggerating all
points and all features in parallel at one time, as if it were one visualisation. Each
point is exaggerated relative to its DFM and no points are disregarded.
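For example, if the exaggeration function is a simple linear scaling, a co-ordinate p on the subject face with corresponding mean co-ordinate m would be mapped to p' = m + k(p − m), where the factor k sets the exaggeration level: k = 1 reproduces the subject face, k > 1 exaggerates it, and 0 ≤ k < 1 moves the face towards the average.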
In 1985, [Bren,85] created a program to automatically create caricatures, called the "Caricature Generator". The Caricature Generator is based around the generally accepted idea that caricatures are exaggerations of the face, and its caricatures are produced using EDFM. This program was the first autonomous caricature generator, and set a precedent for future programs using this method. In 1999 a program called PICASSO was written by [Kosh,99] which worked in the same way to produce 2D caricatures, and in 2001 a 3D version followed, PICASSO 3D [Fuji,01]. PICASSO 3D creates a 3D model of the face using many 2D pictures taken from cameras surrounding the subject; it then compares the model with the average 3D face model and exaggerates each point using the functions from the 2D EDFM method, the only difference being that there is now a third Z co-ordinate to compare, as opposed to just X and Y.
[Bren,85] states that "features" are what humans visualise in a face, and that lines in a drawing only become a feature when they represent something that makes the face distinguishable from other faces. It is therefore difficult to make a computer know what a feature is and which features to exaggerate (which could pose a problem for creating good caricatures that are still obviously recognisable), so instead the program exaggerates them all. The problem with this is that when human caricaturists draw a caricature, they can choose to ignore certain features if they do not stand out: they can leave them as they are, make them less noticeable or even leave them out completely. The Caricature Generator program, by contrast, exaggerates all features; it just doesn't exaggerate the more "normal" features as much. This makes the caricatures produced less recognisable, and they appear more "noisy". It also means the pictures can lack artistic style.
2.3.3 Improved EDFM by Feature Normalization and Exaggeration
There are certain drawbacks to using the EDFM method to create caricatures, as discussed by [Mo,03]. In this short paper [Mo,03] points out some problems that need to be considered in order to improve the EDFM method. It is noted that the EDFM method has no awareness or understanding of the features. It only takes into account the measurable difference of each point from the mean and then exaggerates each point by an amount that is linearly proportional to its DFM. Therefore if a large feature (e.g. the mouth) has the same DFM as a small feature (e.g. the eyes), it will only be exaggerated by the same amount; this means the exaggeration of the large feature will not be as noticeable and will not give the proper
effect. Also, the EDFM method used by [Bren,85] treats all points equally when determining which points to exaggerate; it works purely off their DFMs regardless of which facial feature they represent. The example given in [Mo,03] is head size. Head size varies greatly from person to person but is relatively unnoticeable, so even though it may have a large DFM compared to other, smaller features, its DFM relative to its size is not that great and should not be exaggerated. In Figure 2.6, taken from [Mo,03], the normal face is on the left, the middle face produced by the EDFM method has incorrectly over-exaggerated the face size, and the right face shows how the caricature should look without the over-exaggerated face size.
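As an illustration of the normalisation idea (a sketch only, not [Mo,03]'s exact formulation), the exaggeration applied to a point can be made to grow with its DFM relative to the size of the feature it belongs to, so a given absolute DFM counts for more on a small feature such as the eyes than on a large one such as the head outline:

    // Illustrative sketch: exaggeration grows with the DFM relative to the
    // feature's size on the average face, rather than with the raw DFM alone.
    float exaggerated(float subject, float mean, float featureSize, float k) {
        float dfm = subject - mean;              // absolute difference from mean
        float relative = dfm / featureSize;      // size-normalised difference
        return mean + dfm * (1.0f + k * Math.abs(relative));
    }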
2.3.4 Example-Based Caricature Generation with Exaggeration and Neural
Networks
Example-based caricature generation with exaggeration provides an alternative to the EDFM method, and claims to be able to copy an artistic style or even create its own. This is managed by teaching the program certain styles using training data. The method, described by [Lian,03], is as follows: the program has a library of many faces and of caricatures of those faces drawn by artists; it maps both the original face and the caricature face with co-ordinate points describing each feature. The program also has an average, mean face shape saved to refer to. It aims to create a caricature by working out an algorithm describing the relationship between the difference of the mean face shape from the subject face shape, ∆S, and the difference of the caricature face shape (drawn by the artist) from the subject face shape, ∆S`. The idea is that if you work out ∆S by comparing the subject face to the mean face, you should then be able to work out how much to exaggerate each feature using values of ∆S` from your training data. By using this method you can change how features are exaggerated by changing your training data. If you only use portraits and caricatures drawn by one artist as your training data, the caricatures the program creates should be in a similar style to that artist. This also means that if you use caricatures by various different artists as your training data, the caricatures created by the program will have their own style, which is a combination of all the training artists' styles.
This method can be implemented using Cascade Correlation Neural Networks (CCNNs), as shown by [Rupe,05], who created an automatic 2D caricature program using CCNNs. CCNNs are made up of many single neurons which hold specific information; a neural network works by learning or "training" from data and storing the learnt information in these neurons. Each time the network is trained, it creates a lot of hidden neurons (neurons that have been trained to the point where there is no more significant reduction of error from further training); it then selects the neuron with the least error and installs it into the network. The rest of the neurons are discarded and the process begins again for the next feature. Each time this is done, the margin of error produced by the network is reduced.
Fig 2.6 – Normal face, EDFM caricature with over-exaggerated head size, and corrected caricature [Mo,03]
Fig 2.7 shows the input and output of the neural network used to caricaturise facial features. The first step is to split the original face, mean face and caricature face into separate features. These features are then input to the neural network as training data, where it compares the 3 images of each feature in a similar way to [Lian,03], except that this time each feature is compared individually rather than the face as a whole (Fig 2.8 shows how each feature is compared). The CCNN learns from this how to create the caricature feature as the output, using the original and mean features as input. Once the CCNN has been trained, it should be able to output a caricaturised version of any feature that you input.
Since the CCNN has learnt how to caricaturise from samples, any output it gives will be in the same style as the samples it was trained on; this gives the output caricatures an artistic style and makes them seem as if they were drawn by humans. Another advantage is that by using neural networks to caricaturise single features, you can swap styles for each feature by giving it different training data. For example, there may be 2 or 3 different ways to exaggerate the chin or the nose, and you are not confined to using the same style for each feature.
Fig 2.7 – Input and output of the feature-caricaturing neural network [Rupe,05]
Fig 2.8 shows how each feature is compared. The 3 images are laid over each other by lining up their centre points; cross sections are then taken, and the X,Y points where each image crosses the line are noted. The CCNN can then learn the relationship between the differences of each image, in order to learn how to create the caricature image as a function of the difference between each point on the original image and each point on the mean image. The more cross-sectional lines that are used, the more points are recorded and compared, and therefore the more accurate the output caricature will be.

Fig 2.8 – Feature comparison using cross sections [Rupe,05]
2.3.5 Interactive Caricature Generation using Morphing
To edit pictures using morphing you first need to define a template, a series of shapes or lines which represent the image. You can then perform various operations on the template to alter the appearance of the image. In 1997 Ergun Akleman wrote a procedure for using morphing tools which allowed the interactive manipulation of portrait images to create caricatures. [Ergu,97] says that for a morphing tool to work well with faces the template should be very simple. People often try to use complicated templates because they assume this will give them more control; however, with complicated templates it is easy to distort the face rather than exaggerate it. If you distort the face then it loses its resemblance to the original, and fails to be a caricature. [Ergu,97] suggests using simple templates that just outline the most basic layout of the face, like the position of the eyes, nose and mouth (Fig 2.9 shows an example of such a template). This way, when you move one line in the template, it stretches a whole area of the face with it, which gives the effect of exaggerating or stretching rather than moving that part independently of everything else and distorting the image. When altering the template to make a caricature, [Ergu,97] says that you should restrict yourself to editing only the size and position of the facial features, which corresponds to scalings and translations of the template. This should also help to avoid distorting the face. The input image must also face the camera straight on, otherwise you will get perspective deformations.

Fig 2.9 – A simple morphing template [Ergu,97]
Creating caricatures with morphing at the moment requires human interaction.
A human must create the morphing template, and a human controls the morphing
process, moving the template and morphing the image manually. All the machine
does is the actual morphing calculations and the stretching or moving of the picture as you instruct it. The idea is that every human can recognise faces, and recognise a likeness between two faces, which is all that is needed to create a caricature. The morphing tool allows you to play with the face, moving and stretching the features to try to create a caricature by trial and error. You move or stretch a feature in one direction; if it gives a good likeness to the original face then you keep it, and if it doesn't then you move it in the opposite direction. If that does not give a good likeness to the original face either, then you return it to its normal position and size, and try a different feature. This method can give surprisingly good results; however, trying to create a fully automated version of it would be very difficult, because all the decision making about which features to edit and whether the result looks good or bad is done by a human from their own imagination. Figure 2.10 shows examples of 2D morphing, taken from [Ergu,97].
2.4 Summary
In this chapter we have covered methods that can be used to create computer-generated caricatures which look realistic and keep a likeness to the original face. We have looked at how humans draw caricatures and how this compares to computer-generated methods, as well as methods for giving the program an artistic style or copying another artist's style. Research has been done into computer drawing methods using polygons and meshes, which can be used for the initial stage of this project (modelling the face), and we have covered a range of exaggeration methods for creating the caricature, one or more of which can be implemented for the second stage of the project (creating the caricature).
Fig 2.10 – Examples of Morphing [Ergu,97].
Chapter 3: Requirements and Analysis
3.1 Introduction
The aim of this project is to create a program which can autonomously generate a 3D caricature from a set of photographs while maintaining a reasonable likeness to the subject. Given that the EDFM method will be used to accomplish this goal, the project can be split into smaller, more specific aims:
1. 3D Face modelling from input images: Creating a 3D model of the face from pictures is essential; without this there is no face for the program to work with in order to make the caricature.
2. Create an average face: A 3D model of an average face is needed as a mean value to compare against, in order to calculate the face exaggeration.
3. Exaggerating the face to create a caricature: The method of exaggerating the face will be the main focus of the project; it is the transformation of the model from the original face to a new caricaturised face.
4. Variable levels of exaggeration: The user should be able to control the extent to which the face is exaggerated, from normal face to caricature to unrecognisable.
A simple user interface will also be needed; however, it does not need to be artistic or oriented towards a specific user group. The focus of the project is on generating a caricature, so there is no need to spend unnecessary time on an extravagant user interface. The user interface is only needed to act as the front end of the program, to display the program's caricature-generating capabilities.
3.2 Specification
3.2.1 Modelling the Face
The ability to model a human face is obviously a fundamental requirement of this project; without it no caricature can be created. The system should be able to model the face in 3D to a reasonable level of realism, although certain features may be overlooked, such as the hair. The aim of the system is to create a caricature of the face, and hair is not generally a feature exaggerated in caricatures unless the subject has a very strange hair style, so for the purpose of simplicity the hair can be ignored. The face also does not have to have an expression; although this can improve the caricature, it is not the main focus of this project, but could be considered in the future. Because the face has no expression, there is no need to model the teeth or tongue, as they will not be visible. The face models must also be textured and lit correctly to give them a realistic look. A further stage of the project is to model and caricaturise faces generated from photographs, rather than just computer-modelled faces.
3.2.2 Caricaturing the Face
As the focus of this project, caricaturising the face is the most important section; however, it relies on a good face modelling system to begin with. An exaggeration method must be chosen or developed. Given that the EDFM method (explained in full in Section 2.3.2) will be used, there are various development ideas for creating a caricature, some of which are optional. The most basic and mandatory requirement is to exaggerate the whole face, that is, to exaggerate each and every
vertex by a single factor to give a generalised caricature. Following on from this, other caricature methods may be implemented, such as feature exaggeration. Feature exaggeration is desirable because it gives the user much more control over the exaggeration process and over what they want the caricature to look like. Feature exaggeration should allow certain features to be exaggerated by independent factors, with the option of not exaggerating certain features at all. The ability to change the exaggeration factor and/or the average face model used for comparison could also be implemented, so that users can create more abstract caricatures.
3.3 Modelling the Face
3.3.1 Polygon Meshes
The most commonly used method of modelling 3D shapes and surfaces is with polygon meshes. The reason they are used is that irregularly shaped objects can be modelled with them. Polygon meshes consist of many 2D polygons joined together at their vertices to create a mesh (see Section 2.2.1 for more detail). Although the mesh can never be truly smooth or round, with enough polygons it can still give a realistic representation of curved and irregular shapes such as a face. Using smoothing and an appropriate lighting algorithm you can also create the illusion of a curved surface by changing the normals, even though the surface itself is still flat.
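One common way to achieve this illusion (a sketch using the array-based structures described in Chapter 4; names illustrative) is to set each vertex normal to the normalised average of the surface normals of the triangles sharing that vertex, so that smooth shading interpolates the lighting across the flat polygons:

    // Average the surface normals of all triangles touching each vertex,
    // then normalise; smooth shading then hides the flat facets.
    void computeVertexNormals(float[][] triangleNormals, int[][] triangles,
                              float[][] vertexNormals) {
        for (float[] n : vertexNormals) { n[0] = n[1] = n[2] = 0.0f; }
        for (int t = 0; t < triangles.length; t++) {
            for (int v : triangles[t]) {
                for (int i = 0; i < 3; i++) {
                    vertexNormals[v][i] += triangleNormals[t][i];
                }
            }
        }
        for (float[] n : vertexNormals) {
            float len = (float) Math.sqrt(n[0]*n[0] + n[1]*n[1] + n[2]*n[2]);
            if (len > 0.0f) { n[0] /= len; n[1] /= len; n[2] /= len; }
        }
    }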
3.3.2 Importing Face Models
The face models will be generated in a separate program and saved as an object file. This file will contain all the vertex, polygon and texture mapping information needed to render the face. The caricature program will be written using Java and JOGL (explained in Chapter 5) and will need to load these faces into a data structure (see Section 4.4) ready to be modified to create the caricature and then displayed. The polygon meshes used in the original face models will consist of 3-sided and 4-sided polygons; however, to simplify the coding of the loader class, all the polygons will be converted to 3-sided polygons only. This is simply done by splitting each 4-sided polygon diagonally in half to create two 3-sided polygons, as shown in Fig 3.1.
Once the polygons have all been split to be 3 sided, the new polygon mesh can be loaded into a data structure ready for use.

Fig 3.1 – A 4-sided polygon split into two 3-sided polygons
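The splitting step itself is trivial; a sketch (method name illustrative): a quad with vertex indices (a, b, c, d) becomes the triangles (a, b, c) and (a, c, d):

    // Split one 4-sided polygon, given as four vertex indices, into two
    // triangles by cutting along the diagonal a-c.
    int[][] splitQuad(int[] quad) {
        int a = quad[0], b = quad[1], c = quad[2], d = quad[3];
        return new int[][] { {a, b, c}, {a, c, d} };
    }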
3.4 Model Manipulation / Exaggeration
The exaggeration process for creating the caricature is the main focus of this project. The aim is to make the caricature as recognisable as possible. This will be attempted using a variation of the EDFM method described in Chapter 2. There are other methods available, such as the neural network and the morphing methods,
however, both have their drawbacks. Neural networks would require a large database of 3D faces drawn by caricaturists to use as the training set. I do not know any talented caricaturists, and we do not have a program which would allow them to model good caricatures; collecting this database would be very time consuming and not a realistic task to complete during this project, so this method is not feasible. Creating a caricature using morphing would require user input to move the template and create the caricature manually; this defeats the aim of the project, which is to create an autonomous system that does not require such specific user input, so this method is not suited to the task.
Having chosen the EDFM exaggeration method, the first stage will be to exaggerate the entire face with respect to its difference from the average face. A single function will be used to exaggerate all the vertices of the face. Each x, y, z co-ordinate of each vertex in the subject face will be compared to the corresponding vertex in the average face. The exaggeration function will then be applied to each x, y, z value, giving a new set of co-ordinates which will be drawn to create the exaggerated face (the EDFM method is explained in more detail in Section 2.3.2; see Fig 3.2).
This will provide a good starting place for generating a caricature, although the caricature will look rather crude. The caricature produced by this method alone should have a reasonable likeness to the original face; however, it will not look like a proper caricature, because the face as a whole will have been exaggerated: no points will have been ignored, they will just have been exaggerated less the closer they are to the average. This makes the caricature look "noisy", obviously computer generated and less artistic.
Subject face F1 and average face F2, each with vertex v1 = (x1, y1, z1).

DFM of x1: ∂x1 = F1(x1) − F2(x1)
DFM of y1: ∂y1 = F1(y1) − F2(y1)
DFM of z1: ∂z1 = F1(z1) − F2(z1)

Caricature co-ordinate cx1 = f(∂x1) + F2(x1)
Caricature co-ordinate cy1 = f(∂y1) + F2(y1)
Caricature co-ordinate cz1 = f(∂z1) + F2(z1)

where f() is the exaggeration function.

Caricature vertex v1 = (cx1, cy1, cz1)

Fig 3.2 – Creating a caricature vertex
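The calculation in Fig 3.2 translates directly into code. The sketch below (method name illustrative) assumes the simplest choice of exaggeration function, a linear scaling f(∂) = k·∂:

    // Exaggerate one vertex: c = f(F1 - F2) + F2 per co-ordinate, with f(d) = k*d.
    // k = 1 reproduces the subject face; k > 1 exaggerates it.
    float[] caricatureVertex(float[] subject, float[] average, float k) {
        float[] c = new float[3];
        for (int i = 0; i < 3; i++) {              // i = 0, 1, 2 for x, y, z
            float dfm = subject[i] - average[i];   // difference from the mean
            c[i] = k * dfm + average[i];           // f(dfm) + F2
        }
        return c;
    }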
The second stage of exaggeration will be aimed at refining the look of the caricature, reducing the "noise" and trying to pick out individual features. Each 3D face model will be made up of the exact same mesh with the same number of polygons, the only difference being the positions of the vertices. It is therefore possible to mark a certain area of the mesh (a certain set of polygons) as being a certain feature (see Fig 3.3). This enables us to manipulate or exaggerate different features with different functions, and also allows us to stop exaggeration from happening at all on certain areas of the face where it might be inappropriate, e.g. the head size itself (see Section 2.3.3 for more information about feature exaggeration). To do this we will simply change the function used in the formula to calculate the exaggeration of the vertices (as described in Fig 3.2); a sketch of this per-feature selection is given after the figure caption below. For example, one function f() could be applied to the eyes (labelled red in Fig 3.3), another function g() applied to the nose (labelled yellow), and another function h() applied to the mouth (labelled green). This stage will require a lot of fine tuning and could cause problems when it comes to drawing the whole face. If each feature is exaggerated differently and independently of other features, then some vertices may overlap or be moved far apart, causing the face to look distorted, and the polygons may not join together properly on the borders between the marked-out features.

Fig 3.3 – Marking features on the face mesh (eyes, nose and mouth regions)
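One way this per-feature selection might look in code (a sketch only; the feature names and the map are hypothetical): each vertex carries a feature label, which selects the exaggeration factor used in the calculation of Fig 3.2.

    import java.util.Map;

    // Illustrative sketch: choose a per-feature linear exaggeration factor,
    // e.g. one for the eyes, another for the nose, another for the mouth.
    // Vertices whose feature is not listed (e.g. the overall head) keep
    // factor 1.0 and are therefore left unexaggerated.
    float factorFor(String feature, Map<String, Float> featureFactors) {
        return featureFactors.getOrDefault(feature, 1.0f);
    }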
3.5 Formal Requirements
To summarise the specification, Figure 3.4 shows each requirement. Each requirement is classed as functional (F) or non-functional (NF), and is also given a requirement level showing its priority: M (Mandatory), D (Desirable) or O (Optional).
Requirement (F/NF, Level)

Face Modelling
1.1 An average face model can be generated (F, M)
1.2 A face mesh of the subject can be generated from photographs (F, M)
1.3 Face meshes are converted to triangular polygons only (F, M)
1.4 The face is able to be rendered (F, M)
1.5 The face texture can be rendered (F, M)

Face Exaggeration
2.1 Compare subject face to average face to acquire the vertices' DFM (F, M)
2.2 Design an exaggeration function (NF, M)
2.3 Apply the exaggeration function to the DFM and construct a new face mesh (F, M)
2.4 Select vertices to define as facial features (NF, D)
2.5 Design different functions for different facial features (NF, D)
2.6 Apply the different exaggeration functions to the appropriate facial feature and construct a new face mesh (F, D)
2.7 Address the problem of overlapping or broken-up polygons due to feature exaggeration (F, D)
2.8 User controlled exaggeration level (F, D)
2.9 Ability to change the model used for the average face (F, O)

User Interface
3.1 A window to display the caricature face and optionally the original face (F, M)
3.2 A file menu to load faces and caricatures (F, D)
3.3 A file menu to save faces and caricatures (F, O)
3.4 An option to control the level of exaggeration (F, D)

Figure 3.4 – Specification Table
3.6 System Evaluation
Evaluation will be important throughout the project to produce the best results. There is no automated way to evaluate the program using other programs or mathematical formulas, because there is no way for a computer to recognise a face or recognise a caricature; instead, all the evaluation will be done by testing the results on a group of people. I propose two tests: the first will be to decide to what level the face should be exaggerated, and the second will test whether the caricature has a good likeness to the original face.
If the face is not exaggerated enough, the result will still look like a normal face and people will not recognise it as a caricature; however, if the face is exaggerated too much, it can get to the point where the face is not recognisable even as being human, and has just been deformed into meaningless shapes. I will therefore test which level of exaggeration gives the best looking caricatures. To do this, I will generate a set of caricatures, exaggerate them to various levels and take screenshots of each caricature at each level. I will then present the caricatures to a group of people and ask them to rank which image they think is the best looking caricature. I will not give them the picture of the original face; this is to ensure that they are marking the caricatures purely on which looks the best in terms of exaggeration, not in terms of likeness to the original picture. Everyone knows what a caricature looks like, and the aim of this test is to select a small range of exaggeration levels to work with which make the face look most like a caricature. This will give the best quality caricature to start the second phase of testing.
The second part of the evaluation will determine whether the face is being exaggerated properly and has a good likeness to the original. A caricature is only a caricature if it represents someone. There are various ways to test this. One way would be to give a group of people a caricature and ask them to pick who they think it looks like from a set of pictures of people; however, this may not yield very reliable results, because it is easy for people to recognise faces, and given a group to pick from, even with a poor caricature it is likely they will almost all pick the right person. Instead, if the test group is defined as people on my course and people within my department, then a caricature can be created of someone within the department whom they will all know and recognise, without a set of people to select from. Given only the caricature, they could then be asked to name who they think the caricature is based on, and to give the caricature a mark for how recognisable it is. Ideally a caricature could be created of a well known celebrity; it would then be possible to expand the test group further and get a larger set of results, since everyone should be able to recognise the subject if the caricature is good. However, getting the photographs needed to create the caricature, and getting the subject's consent to use
the pictures, may be quite difficult, so the test group may have to stay confined to the
university department.
By carrying out these 2 tests, I should be able to evaluate the success of the
program effectively, and be able to identify the areas which could be improved with
future development.
Chapter 4: System Design
4.1 Introduction
This chapter will show how the system is designed and which parts of the system complete each task. It will start by explaining the technique used to design the system, then give an overview of the system's design, explaining each part in more detail. The chosen methods of modelling the face and caricaturising it will be explained in full, as well as the loading and display methods. Finally, we will look at the data structures required and how they are used, then finish with the design for the user interface.
4.2 Design Technique
The system is written in Java, an object oriented language. Due to the structure of object oriented languages we can split the system into classes. We can split each function and data structure into separate classes, allowing us to design the structure of the program using class-structure diagrams. Using an object oriented language also means that the system can be programmed using the Extreme Programming (XP) method. This allows the programmer to progressively program and test at the same time, by starting out with a simple aim, programming that first stage and testing it before moving on and setting a new aim for the next stage.
4.3 Method Overview
To try and simplify the project, the system is broken down into 5 main functions: face creation, face triangulation, face generation, face comparison & exaggeration, and caricature display. These functions are used by the system to perform the tasks needed to generate and display the caricature. They are not necessarily split into separate classes; they just represent the various stages of the program from input to output.
Face Creation
The first stage is to create a face; to do this FaceGen is used (see Section 2.2.2 for information about FaceGen). The ideal final outcome of the system is to generate faces from photographs using the tool built in to FaceGen; however, for the simpler versions of the system and for testing purposes, computer generated faces can be used instead. FaceGen creates the face using a polygon mesh and outputs the model as a Wavefront .obj file.
Face Triangulation
The face model file output by FaceGen consists of polygons with different numbers of sides. Due to the chosen programming language and the implementation technique used to load the model into a data structure, all the polygons need to have the same number of sides. Therefore the next stage is to parse the model file, convert all the polygons to uniform 3-sided polygons and resave the file ready to be loaded. This must be applied both to the subject face to be caricatured and to the average face used for comparison.
Face Generator
This part of the system handles the reading in of the new face model and the storing of the vertices, polygons and meshes in data structures, for both the user generated face and the average comparison face. It also splits the face model up into separate mesh objects for the eyes, teeth, hair, tongue and skin, so that we can choose not to exaggerate or display the hidden models such as the teeth, and so that we can remove the hair. Another reason it is important to split the model into separate objects is to apply the textures, which are also handled by this part of the system.
Face Comparison & Exaggeration
This part of the system is responsible for manipulating the model into a caricature. It performs all the functions to compare the subject face model to the average face model and applies the exaggeration algorithm to produce the new caricatured model. Although the other parts of the system are fundamentally important for the program to work, this is the most crucial part of the project; it is important that this part of the system works well in order to generate good caricatures.
Caricature Display
The final part of the program displays the generated caricature in a 3D environment with appropriate lighting and textures to give a realistic look and feel.
These stages and their order of interaction with the system are shown in Figure 4.1.
Fig 4.1 – System layout: photographs are used by Face Creation (FaceGen) to produce the subject face object and texture files; these, together with the average face object and texture files, pass through Triangulation and Face Generation to Exaggeration (with a user controlled exaggeration factor) and finally Display, producing the caricature.
4.4 Data Structures
The data structures for this program are quite simple; a face is represented as a set of meshes, one for each separate part of the face. As noted earlier, we are only modelling the eyes and skin; there is no need to model the teeth or tongue. Therefore there will be 3 meshes: 1 for each eye and 1 for the skin. Although these 3 meshes make up a single face, they are considered separate data structures. The data structure design is shown in Fig 4.2.
Mesh
A Mesh is composed of multiple vertices which are joined together to create polygons. The meshes used to model our face use only triangular polygons, so the data structure for a Mesh consists of 2 sub-structures: a list of vertices and a list of triangles.
Vertex
Each vertex in a 3D mesh has an x, y and z co-ordinate to denote its position in the 3D space; these are stored in a list xyz, where xyz[0] = x co-ordinate, xyz[1] = y co-ordinate and xyz[2] = z co-ordinate. The normal vector is used to work out the correct light intensity for that vertex and is stored in a list: normal[] contains the x, y and z directions of the vector in the same way that xyz[] contains the x, y and z co-ordinates of the vertex. Each vertex also has a texture co-ordinate associated with it. The texture images are assigned a u-v co-ordinate system from 0-1 across the 2 axes of the image (see Section 2.2.1); the appropriate u-v co-ordinate for the vertex is stored in a list called texCoord[], where u = texCoord[0] and v = texCoord[1].
Triangle
Each Triangle is constructed by joining 3 vertices; these vertex numbers are stored in the vertexindices[] list and are indexed with reference to the vertices[] list in the Mesh data structure. For example, if the list contained vertex numbers 0, 1, 2, then these vertices would be those stored at positions vertices[0], vertices[1] and vertices[2]. Each triangle also holds its own surface normal, which is the normal at the centre of the polygon and is stored as a list normal[] in the same way as the normal is stored for a vertex.
Mesh
Name       Type
vertices   Vertex[]
triangles  Triangle[]

Vertex (one Mesh contains many Vertices)
Name       Type
xyz        float[]
normal     float[]
texCoord   float[]

Triangle (one Mesh contains many Triangles)
Name           Type
vertexindices  int[]
normal         float[]

Fig 4.2 – Data structure design
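These structures translate almost directly into Java; a minimal sketch (fields only, matching the names in Fig 4.2):

    // Data structures matching Fig 4.2.
    class Vertex {
        float[] xyz      = new float[3];  // x, y, z position
        float[] normal   = new float[3];  // normal direction, used for lighting
        float[] texCoord = new float[2];  // (u, v) into the texture image
    }

    class Triangle {
        int[]   vertexindices = new int[3];  // indices into Mesh.vertices
        float[] normal        = new float[3]; // surface normal at polygon centre
    }

    class Mesh {
        Vertex[]   vertices;   // shared vertices of the mesh
        Triangle[] triangles;  // triangles joining them
    }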
4.5 Class Implementation
This section covers the class structure of the system; Figure 4.3 shows the classes, their hierarchy and their interaction. The program classes have been implemented with respect to the system designed in Figure 4.1, with the display and exaggeration modules being compiled into the single Caricature class. Each class is coloured to represent its function: green = input files, orange = data structure storage, blue = file handling, yellow = display classes, red = exaggeration class.
Pre-Existing Classes
Many of the classes were taken from a simple 3D mesh rendering program called JOGL Meshes, written by Steve Maddock and Michael Meredith at Sheffield University. This program was used as the starting point, and was edited and built upon to create the Caricature program. Fig 4.4 shows a table of the pre-existing classes and their functions.
Class Name – Description

Caricature – The Caricature class was originally called MeshTexTester and is the main class for the program, containing all the user interface and GL display methods. Although this class already existed, it has been heavily modified and the only real similarities to the original are the method names and layout.

MeshLoadedOBJ – This class is responsible for the loading of the Wavefront model files. It is mostly unaltered; however, it is only able to load 3-sided polygons, hence the use of the Triangulator class.

FPSAnimator – Real time rendering/animation class which allows the scene to be redrawn continuously so that the display of the scene can be altered. (Written by Sun Microsystems.)

Texture2D – Loads and applies the texture images to a mesh.

Mesh, Material, BoundingBox, Triangle, Vertex – These classes construct the data structures for storing the face model information, such as the mesh, polygons, vertices and material properties.

Figure 4.4 – Pre-existing classes
Figure 4.3 – Class structure: the Caricature class uses MeshLoadedOBJ, FPSAnimator, Texture2D and Triangulator; face model files (.obj) are converted by Triangulator into new face model files (.obj), which MeshLoadedOBJ loads, along with the texture images, into the Mesh, Vertex, Triangle, Material and BoundingBox data structures.
New/Modified Classes
To build the Caricature system some pre-existing classes had to be altered and some new classes added. Fig 4.5 shows the classes implemented by the programmer on top of the pre-existing program.
Class Name – Description

Caricature – The Caricature class was modified to contain all the face model exaggeration and display functions, as well as the front end of the program such as the user interface.

Triangulator – To load the face model using the MeshLoadedOBJ class, the model meshes must consist of only triangles (as explained in Section 3.3.2). This class performs the task of converting all the polygons to triangles and outputting the new object file ready to be loaded by the MeshLoadedOBJ class.

Figure 4.5 – New classes
4.6 User Interface
The user interface does not need to be complicated; it should serve its function and, as specified in the requirements, should take only a small portion of the project time. Simplicity will be key: little time needs to be put into presentation or aesthetics, as long as the interface allows the user to perform the necessary tasks to run the program and set any parameters such as the exaggeration factor. Fig 4.6 shows a proposed design for the main window of the program, which will be opened when the program is run.
[Mock-up: a File menu (Load Face, Quit), an Exaggeration Scale slider, a Show Textures checkbox, and side-by-side Normal Face and Caricature display panes]
Fig 4.6 – Proposed User Interface
Chapter 5: Implementation and Testing
This chapter shows in more detail the implementation of the program and discusses the problems that occurred during the programming and how they were solved. We start by discussing the programming languages used and the reasons for choosing them, followed by an explanation of the full class structure for the program. We then look at the three main parts of the program: the loading of the face model and its polygon triangulation, the exaggeration of the face, and the display/user interface. The chapter finishes with a description of how the program was tested during its implementation.
5.1 Programming Languages
There is an extremely wide range of programming languages available today, in many different styles: logic languages, functional languages, object oriented languages and so on. Each language has its own advantages and disadvantages which should be considered, but it is also important to choose a language that not only suits the program you wish to design but also the programmer who will be coding it. The Automatic Caricature system we have designed will need two different languages: a primary programming language such as Java or C++, and a 3D graphics rendering API such as Direct3D or OpenGL.
For a Windows program the most commonly used general purpose programming languages are C++, Java and Visual Basic. Both C++ and Java are commonly used in connection with 3D rendering APIs and have good support for the programming of 3D scenes, animations and games. Object oriented languages are the obvious choice for an object oriented design such as ours, and also work well with the extreme programming style, allowing for easy sub-division and testing of the program as it is being coded. Although C++ and Java are both object oriented languages and therefore inherently similar, the programmer has more experience with Java from their studies at university. Therefore Java will be the programming language of choice for this project.
A 3D rendering API must now be chosen. There are many to choose from which are supported by Java; the most commonly used are Direct3D and OpenGL, which are usually accessed through an intermediate API such as Java3D or JOGL. Both have the functions needed for this project; however, the programmer has more experience with OpenGL and JOGL from their studies at university, therefore OpenGL was chosen.
Both Java and OpenGL can be installed on any computer and are available for free on the internet. Although JOGL requires some extra files, these will be included in the package with the program and do not require any special installation; they will run from the main program folder.
5.2 Face Generation
As stated in the system design chapter, the initial face generation is done in a separate program, FaceGen. Once the face model has been saved as a Wavefront (.obj) file, the polygons and meshes are converted so that they are ready to be loaded by the MeshLoadedOBJ class. This stage is performed by the Triangulator class; Fig 5.1 shows a section of the code which performs this task.
Fig 5.1 – Triangulator Code
Splitting the Polygons
The Triangulator class reads the Wavefront model file one line at a time, tokenizing each line on white space and matching on the first token. The first letter(s) of each line in a Wavefront file denote what information the line contains; the part we are interested in is the polygon or "face" information, denoted by an f. Each f line contains 3 or 4 pairs of numbers; each pair is a vertex index and a vertex texture index, and the 3 or 4 pairs are the 3 or 4 corners of the polygon (see Appendix A for a full description of the Wavefront file format).
When a face line is read, it is tokenized, the vertex and vertex texture indices are altered (this is because the file is split up, and will be explained later) and the first 3 pairs are output in the same order as a new line. If there is a fourth pair, then the 1st, 3rd and 4th vertex pairs are output as a second new line. This makes two triangular polygons out of a single quadrilateral polygon.
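A minimal sketch of this splitting logic is given below. The real code is the Triangulator extract in Fig 5.1; the class and method names here, and the string handling, are illustrative only:

import java.util.ArrayList;
import java.util.List;

public class FaceSplitter {
    // Splits one tokenized "f" line into one or two triangle lines.
    // tokens[0] is "f"; tokens[1..n] are the "vertex/texture" index pairs.
    static List<String> splitFace(String[] tokens) {
        List<String> out = new ArrayList<String>();
        // First triangle: pairs 1, 2 and 3 in their original order.
        out.add("f " + tokens[1] + " " + tokens[2] + " " + tokens[3]);
        // A fourth pair means the polygon is a quad, so emit a second
        // triangle from the 1st, 3rd and 4th pairs.
        if (tokens.length > 4) {
            out.add("f " + tokens[1] + " " + tokens[3] + " " + tokens[4]);
        }
        return out;
    }
}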
Separating the Meshes
The Wavefront file output by FaceGen contains all the meshes for each part of the face (eyes, tongue, skin, teeth etc.) in one file. In order to apply separate textures to these meshes they have to be loaded separately; therefore the Triangulator class also contains a section of code to write each mesh to a separate file, to be loaded separately by the MeshLoadedOBJ class (shown in Fig 5.2).
For each mesh in the original file, the name of that mesh is read and used as the file name of the output file for that mesh (e.g. the skin mesh would be named skin.obj). The Triangulator class has to deal with both the average and normal face models, so to stop them overwriting each other they are stored in separate directories. Each mesh/file name is then stored in an array which can be accessed later by the main caricature program to load the files.
Fig 5.2 – Mesh Separation Code
The last important thing to note is that each time a vertex line is read, a count of how many vertices have been read is incremented. This is because the meshes in the original file all reference vertex numbers within that same file, so when one mesh is removed, the vertices for that mesh are also removed, and the vertex numbers used by the remaining meshes will all be out by that many vertices. By keeping a count of the number of vertices so far, the program can subtract that count from each vertex number when a new mesh is started. The code for this is shown in Fig 5.1, where finalvertexcount and finalvertextexcount are the numbers of vertices and vertex texture coordinates missing due to the removal of the previous meshes. These are subtracted from the original vertex numbers before the face lines are written to the new file.
Figure 5.3 shows what would happen if the face vertex numbers were not altered: the face line stays the same as it was previously, so its indices now refer to vertices which do not exist in the file, the mesh cannot be displayed and the system is liable to crash.

Mesh 1
1 : v 0.0 0.0 0.0
2 : v 1.0 0.0 0.0
3 : v 0.0 1.0 0.0
f 1 2 3

Mesh 2
4 : v 1.0 1.0 1.0
5 : v 2.0 1.0 1.0
6 : v 1.0 2.0 1.0
f 4 5 6

If Mesh 2 is saved separately, the new file would become:

1 : v 1.0 1.0 1.0
2 : v 2.0 1.0 1.0
3 : v 1.0 2.0 1.0
f 4 5 6

Figure 5.3 – The face line still references vertices 4-6, which no longer exist
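The renumbering step might look roughly like this (the names finalvertexcount and finalvertextexcount come from Fig 5.1; the parsing details are illustrative, and vertex/texture index pairs are assumed):

// Rewrites one "f" line for a mesh that is being written to its own file,
// subtracting the counts of vertices and texture co-ordinates that were
// removed along with the preceding meshes.
static String renumberFaceLine(String faceLine, int finalvertexcount, int finalvertextexcount) {
    StringBuilder out = new StringBuilder("f");
    for (String pair : faceLine.substring(1).trim().split("\\s+")) {
        String[] vi = pair.split("/");   // "vertex/texture" index pair
        int v  = Integer.parseInt(vi[0]) - finalvertexcount;
        int vt = Integer.parseInt(vi[1]) - finalvertextexcount;
        out.append(' ').append(v).append('/').append(vt);
    }
    return out.toString();
}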
Loading Meshes
The MeshLoadedOBJ class is laid out similarly to the Triangulator class: to load the meshes it reads each line of the Wavefront file one at a time, and the lines are then tokenized to separate each value. Each line starts with a different tag (v = vertices, vt = vertex textures, f = faces/polygons); these tags are used to form "if" statements for each case, and depending on which tag is found the data from that line is stored in the appropriate data structure.
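In outline the loading loop looks something like the following (a much-simplified sketch: addVertex(), addTexCoord() and addTriangle() are hypothetical helpers standing in for the class's own storage code, and the real class also handles normals, materials and error cases):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

static void load(String filename) throws IOException {
    BufferedReader in = new BufferedReader(new FileReader(filename));
    String line;
    while ((line = in.readLine()) != null) {
        if (line.trim().isEmpty()) continue;
        String[] t = line.trim().split("\\s+");   // tokenize on white space
        if (t[0].equals("v")) {                   // geometric vertex
            addVertex(Float.parseFloat(t[1]), Float.parseFloat(t[2]), Float.parseFloat(t[3]));
        } else if (t[0].equals("vt")) {           // texture co-ordinate
            addTexCoord(Float.parseFloat(t[1]), Float.parseFloat(t[2]));
        } else if (t[0].equals("f")) {            // face (already triangulated)
            addTriangle(t[1], t[2], t[3]);
        }
    }
    in.close();
}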
Within the Caricature class there is a method createObjects(GL gl) which handles the generation of all the face models used in the program. The first part of this method is shown in Figure 5.4. The first two lines call the Triangulator class to split the face's polygons and meshes from the input file. Each mesh is then loaded and assigned as a separate mesh: eyeL, eyeR and skin. The getfilename(int) method is used to retrieve the object file names from the Triangulator class; if in a later addition to this project the programmer decided to add the teeth, tongue or hair, they could be loaded by adding further lines here with integers 3-7 as the parameter for getfilename(). At this point the textures are assigned to the meshes by calling the setTextureID() method from the pre-existing Texture2D class, which applies a specified image file as a texture to the mesh.
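The shape of this part of createObjects() is roughly as follows (a sketch only: the constructor arguments, the index-to-mesh mapping and the loading calls are assumptions, not the program's exact API):

// Split the subject and average models into triangulated, per-mesh files.
Triangulator subject = new Triangulator("subject.obj");   // hypothetical arguments
Triangulator average = new Triangulator("average.obj");

// Load each mesh separately; indices 0-2 are assumed to be eyeL, eyeR, skin.
Mesh eyeL = new MeshLoadedOBJ(subject.getfilename(0)).getMesh();
Mesh eyeR = new MeshLoadedOBJ(subject.getfilename(1)).getMesh();
Mesh skin = new MeshLoadedOBJ(subject.getfilename(2)).getMesh();
// Teeth, tongue or hair could be added here with getfilename(3) to getfilename(7).

// Textures are then assigned to each mesh with setTextureID(), using the
// pre-existing Texture2D class (exact call omitted here).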
5.3 Face Exaggeration / Caricature Generation
Two methods are called to create the caricature. The first is the createCaricature() method, which is called in the second part of the createObjects() method (Figure 5.7). Figure 5.5 shows the createCaricature() method in full. First, the vertices and triangles of each of the normal and average face meshes are retrieved from their data structures and stored in lists so that they can be used in the exaggeration calculations. Another face is then created by cloning the normal face's vertices, triangles and meshes using the pre-existing functions built into the Vertex, Triangle and Mesh classes; this new face will be transformed into the caricature. For each mesh of the new face, a for loop iterates through the list of vertices and applies the exaggeration function to each one by calling the exaggerate() method. Once the exaggeration function has been applied, the vertices and triangles can be set on the mesh, and the caricature is ready to be displayed.
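In outline, createCaricature() does something like the following for each mesh (a sketch; the accessor and clone method names are assumed):

void createCaricature() {
    Vertex[] in = skinMesh.getVertices();         // subject face vertices
    Vertex[] av = averageSkinMesh.getVertices();  // average face vertices
    Mesh caricMesh = skinMesh.clone();            // cloned face to become the caricature
    Vertex[] c = caricMesh.getVertices();
    for (int i = 0; i < c.length; i++) {
        // index i must be the same point on all three faces
        exaggerate(in[i], av[i], c[i]);
    }
    // repeat for the other meshes (eyes etc.); the vertices and triangles
    // are then set on the mesh and the caricature is ready to display
}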
Fig 5.4 – createObjects() Method
The exaggerate() method is the implementation of the algorithm described in section 4.5. As parameters the method takes 3 vertices, one from each of the 3 faces we have (normal/subject face "in", average face "av" and caricature face "c"). These vertices must all relate to the same part of the face in order for the algorithm to work; i.e. the 3 vertices must all be from the same point on the nose. You cannot compare a vertex on the nose with a vertex on the ear; if you did, the result would be completely scrambled and would not resemble a face at all. The exaggeration algorithm implemented in Figure 5.6 is a simple difference algorithm: the average face vertex is subtracted from the normal face vertex, and the difference is then added to the normal face vertex with a multiplier specified by the user (efactor) to create the caricature face. There is no subtract-vertex function, so each vertex's xyz co-ordinates must be retrieved and the differences calculated separately, then combined back together afterwards to create the caricature vertex.
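A sketch of the calculation (accessor names are assumed; efactor is the user-selected multiplier):

void exaggerate(Vertex in, Vertex av, Vertex c) {
    float[] i = in.getXYZ();   // subject face vertex
    float[] a = av.getXYZ();   // average face vertex
    float[] out = new float[3];
    for (int k = 0; k < 3; k++) {
        float diff = i[k] - a[k];         // difference from the mean
        out[k] = i[k] + efactor * diff;   // exaggerate the difference
    }
    c.setXYZ(out);                        // caricature face vertex
}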
All of the methods so far return nothing (void). This is because all the meshes are initialised at the start of the GL space, outside of any methods; this makes them global, so they can be accessed from within any method, and any changes made to them within those methods affect all other methods using those meshes. This is the simplest way to store the meshes, as it avoids having to pass and return the meshes as parameters for every method.
Figure 5.5 – createCaricature()
Figure 5.6 – exaggerate(Vertex, Vertex, Vertex)
Once the caricature face meshes have been created and exaggerated, the textures must be applied in the same way they were applied to the average and normal faces. The last step is then to recalculate the normals for each of the meshes so that the lighting and shadows are rendered correctly. The function to do this is called calcSmoothNormals() and is built into the pre-existing Mesh class. This is the final operation of the createObjects() method; the meshes are now ready to be displayed.
5.4 Feature Exaggeration
To implement feature exaggeration the only changes that were made were to the createCaricature() and exaggerate() methods. The first stage was to define the areas of exaggeration for each feature. This is done quite simply using if statements with multiple conditions specifying the max/min x, y and z co-ordinate values of the vertices to be exaggerated. Originally the max/min values were constants, so the same values were used for every face model; however, during testing it was found that this sometimes did not have the desired effect, so the values were changed to be a range around a centre vertex for that feature. Figure 5.8 shows the code for the nose feature.
Figure 5.7 – createObjects() (Part 2)
Figure 5.8 – Nose exaggeration
Once the feature areas have been defined, the exaggeration of the selected vertices can be performed. To do this an extra parameter has been added to the exaggeration method which defines which feature is being exaggerated. By doing this we can apply different exaggeration scales, or even different exaggeration algorithms, to each feature, which can be specified by the user through the interface.
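A feature-area test of the kind described above might look like this (vertex 930 is the nose centre in the real code; the bounds and accessor names here are purely illustrative):

// True if vertex v lies within an (illustrative) box around the
// feature's centre vertex.
boolean inNoseArea(Vertex v, Vertex centre) {
    float dx = Math.abs(v.getX() - centre.getX());
    float dy = Math.abs(v.getY() - centre.getY());
    float dz = Math.abs(v.getZ() - centre.getZ());
    return dx < 0.3f && dy < 0.25f && dz < 0.4f;   // illustrative bounds only
}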
When exaggerating only a subset of the total vertices, there is the problem of overlapping polygons and distortion of the face, as we will discuss in more detail in the testing section of this chapter. To attempt to resolve this problem the exaggeration algorithm has been modified to gradually decrease the amount of exaggeration applied as the vertices get further away from the centre of the feature, using a range factor. This effectively smooths the exaggeration between the feature and the rest of the face. The algorithm has been modified to:
Original Vertex + (Exaggeration Factor * Difference * Range Factor)
which is implemented as shown in Figure 5.9.
The range factor is calculated by working out the distance of the current vertex from the centre vertex of the feature, as a fraction of the distance from the centre of the feature to its edge. This gives a value between 0 and 1, which is then subtracted from 1 so that the factor is 1 at the centre of the feature and 0 at the edge:
Range = 1 - (vertex distance from centre / distance from centre to edge of the feature)
In the code of Figure 5.9, vertex 930 is the centre of the feature and "in" is the current vertex being exaggerated.
Figure 5.9 – Nose Exaggeration Algorithm
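Putting the two formulas together, the feature version of the exaggeration might be sketched as follows (distance() and the accessors are assumed helpers; featureRadius is the centre-to-edge distance for the feature):

void exaggerateFeature(Vertex in, Vertex av, Vertex c, Vertex centre, float featureRadius) {
    // Linear falloff: 1 at the feature centre, 0 at its edge.
    float range = 1.0f - distance(in, centre) / featureRadius;
    if (range < 0.0f) range = 0.0f;   // outside the feature: leave unchanged
    float[] i = in.getXYZ();
    float[] a = av.getXYZ();
    float[] out = new float[3];
    for (int k = 0; k < 3; k++) {
        out[k] = i[k] + efactor * (i[k] - a[k]) * range;
    }
    c.setXYZ(out);
}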
5.5 User Interface & Scene Display
To display the scene, the OpenGL space must first be initialised; this is performed by the init() method, which sets up the scene's rendering settings such as culling, lighting and the shading method. Figure 5.10 shows the OpenGL settings applied.
OpenGL Command                                       Description
gl.glEnable(GL.GL_DEPTH_TEST)                        Enables depth testing so that hidden surfaces are removed correctly
gl.glEnable(GL.GL_CULL_FACE)
gl.glCullFace(GL.GL_BACK)                            These two stop the back-facing polygons from being rendered
gl.glShadeModel(GL.GL_SMOOTH)                        Sets the shading model to smooth, using Gouraud shading
gl.glPolygonMode(GL.GL_FRONT_AND_BACK, GL.GL_FILL)   Shades both the front and back sides of the polygons
gl.glEnable(GL.GL_NORMALIZE)                         Rescales the vertex normals when the object is scaled
gl.glEnable(GL.GL_LIGHTING)                          Enables lighting within the scene
Figure 5.10 – OpenGL settings
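In a JOGL 1.x GLEventListener these settings would be applied roughly as below (a sketch; the real init() also performs other set-up):

import javax.media.opengl.GL;
import javax.media.opengl.GLAutoDrawable;

public void init(GLAutoDrawable drawable) {
    GL gl = drawable.getGL();
    gl.glEnable(GL.GL_DEPTH_TEST);                       // depth testing for correct occlusion
    gl.glEnable(GL.GL_CULL_FACE);
    gl.glCullFace(GL.GL_BACK);                           // do not render back-facing polygons
    gl.glShadeModel(GL.GL_SMOOTH);                       // Gouraud shading
    gl.glPolygonMode(GL.GL_FRONT_AND_BACK, GL.GL_FILL);  // filled polygons, front and back
    gl.glEnable(GL.GL_NORMALIZE);                        // renormalise scaled vertex normals
    gl.glEnable(GL.GL_LIGHTING);                         // enable lighting
}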
A display() method (Figure 5.11) has also been implemented; this method renders the scene and is called every time a frame is to be rendered. Within the display() method the layout of the scene is defined. Each mesh can be treated as a single object and rendered using the mesh.display() function (implemented in the pre-existing Mesh class, this iterates through the mesh's list of triangles, rendering them one at a time) and positioned using the glTranslatef() function. Textures are enabled using the glTexEnvf() function; this is inside an if statement with the boolean "textureOn" as its condition, which is set by a checkbox on the user interface so that textures may be turned on and off as the user wishes. The display() method also sets up the user's viewpoint using gluLookAt(), and implements a light by calling light0(), which creates a white light at position (0.0, 1.0, 1.0).
Figure 5.11 – display() Method
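The overall shape of display() is sketched below (light0() and mesh.display() are named in the text; the viewpoint, translation values and mesh fields are illustrative assumptions):

public void display(GLAutoDrawable drawable) {
    GL gl = drawable.getGL();
    gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT);
    gl.glLoadIdentity();
    glu.gluLookAt(0.0, 0.0, 5.0,  0.0, 0.0, 0.0,  0.0, 1.0, 0.0); // viewpoint (values illustrative)
    light0(gl);                                   // white light at (0.0, 1.0, 1.0)
    if (textureOn) {                              // set from the UI checkbox
        gl.glTexEnvf(GL.GL_TEXTURE_ENV, GL.GL_TEXTURE_ENV_MODE, GL.GL_MODULATE);
    }
    gl.glPushMatrix();
    gl.glTranslatef(-1.5f, 0.0f, 0.0f);           // position the normal face
    normalFace.display(gl);                       // mesh.display() (argument assumed)
    gl.glPopMatrix();
    gl.glPushMatrix();
    gl.glTranslatef(1.5f, 0.0f, 0.0f);            // position the caricature
    caricatureFace.display(gl);
    gl.glPopMatrix();
}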
The user interface is fairly simple; it was implemented using standard functions of the Java Swing API. A GLCanvas was created to show the graphics window, and the user controls are set in a grid layout in the area below and to the right. The "Open file" menu, exaggeration scale sliders, feature exaggeration checkboxes and "texture on" checkbox are all implemented with listeners which call methods in the glspace section of the class to change the appropriate variables and perform the action. Figure 5.12 shows examples of action listeners for the general face exaggeration and the methods they call.
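By way of illustration, an exaggeration-slider listener might be wired up as follows (a sketch: the component name, the setExaggerationFactor() method and the slider range are all assumptions, and a slider uses a ChangeListener rather than an ActionListener):

import javax.swing.JSlider;
import javax.swing.event.ChangeEvent;
import javax.swing.event.ChangeListener;

final JSlider exagSlider = new JSlider(5, 40, 20);   // efactor * 10 (range assumed)
exagSlider.addChangeListener(new ChangeListener() {
    public void stateChanged(ChangeEvent e) {
        // call into the glspace section of the class to update the variable
        setExaggerationFactor(exagSlider.getValue() / 10.0f);
    }
});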
5.6 Testing
This section describes the problems found during the testing of the important parts of the program. The testing process was carried out as the program was being written, so that problems could be solved progressively.
Figure 5.12 – Action listeners for the general face exaggeration and the glSpace methods they call
5.6.1 Face Generation
The actual face creation was all done using FaceGen, so there were no problems with the initial modelling of the face. Importing the face from FaceGen into the Caricature Generator program was made more complicated by the polygon triangulation required to load the face and the mesh splitting required to load the textures; however, these problems were identified and resolved early in the project. Once solutions were found, the coding of the Triangulator class was quite straightforward and no problems were found during testing.
5.6.2 Face Exaggeration
The general face exaggeration algorithm itself is quite simple and was not difficult to code into the program; however, during testing it was discovered that not all faces could be exaggerated. The first stage of testing was to load various different faces and attempt to create caricatures of them. To start with, the tests were done with simple symmetric faces generated by FaceGen (where the left half is an exact mirror image of the right half). The results of this test were good and the algorithm appeared to be working correctly, so we proceeded with further testing.
Happy that the algorithm was working correctly, we tested the program using irregularly shaped faces that were uneven (non-symmetric). Unfortunately the results for non-symmetric faces were not as good as for symmetric faces. The caricatures
produced using non-symmetric faces were distorted and unrealistic.
Figure 5.13 – Face Distortion (original face, left; caricature, right)
As you can see from Figure 5.13, the person's left cheek is sticking out from the rest of the face while the right cheek has been pushed slightly into the face. The jaw is also jarred, the neck has become bloated and the overall head has become thinner. The nose has been pushed up and bent, and there is an indentation in the person's right eyebrow. The problem with non-symmetric models is that certain parts of the
subject face do not line up with the same parts of the average face (the input face's features are not necessarily aligned to the centre of the face). This means that the jaw, for example, may be slightly jarred to one side rather than centre-aligned, which would cause the twisting seen in the distorted caricature. Unfortunately no solution to this could be found in the time allocated for this project, so the system can currently only caricature symmetric faces. With more time and further investigation there may be a solution that allows the caricaturing of non-symmetric faces with this system.
5.6.3 Feature Exaggeration
To test the feature exaggeration we needed to check that the correct area of the face was being exaggerated and that the exaggeration algorithm was working correctly. Various tests were run with many different symmetric faces; the results showed that the feature selection was working as intended and the exaggeration was only applied to the specified area. The algorithm was also working correctly; however, there were display issues on the boundaries between the exaggerated feature and the rest of the face. As you can see from Figure 5.14, there has been tearing of the mesh, most noticeably between the bottom of the nose and the top of the upper lip. The nose has been exaggerated outwards from the face, but it has also been expanded downwards, so it now covers the top part of the lip, distorting the face. You can clearly see a gap between the bottom of the nose and the upper lip where the exaggeration has completely separated the nose from the rest of the face.
Figure 5.14 – Nose Exaggeration (original face; nose exaggeration; exaggeration with range factor)
This problem is due to the algorithm used: the exaggeration is applied to a specific area
with no smoothing or gradual change: either the polygons are exaggerated or they are not. This can result in an obvious break between the exaggerated area and the rest of the face, as this example has shown. To solve this problem the algorithm was changed to include a range factor which scales the exaggeration so that it is applied gradually as the vertices converge on the centre of the feature. The algorithm was modified to:
Original Vertex + (Exaggeration Factor * Difference * Range Factor)
where:
Range = 1 - (vertex distance from centre / distance from centre to edge of the feature)
With this new algorithm there is no longer a sharp change from the normal face vertices to the exaggerated vertices; instead the exaggeration is applied gradually, so the splitting of the vertices is reduced. The range factor implemented in the program is just a linear change, with 1 being the centre of the feature, 0 being the edge and values evenly spaced between. This gave reasonably good results: as you can see in Figure 5.14, the mesh splitting has been solved. However, the vertices can still sometimes overlap each other, especially when exaggerating two adjacent features using different exaggeration factors. This is something that could be looked into in the future if more time were available; a non-linear range factor could also be experimented with, to see whether the exaggeration change could be improved.
5.6.4 System Testing
With the final version of the program complete we will now analyse the
system as a whole with respect to the specification and requirements laid out in
Chapter 3. We will explore to what extent each requirement was implemented and
how well they perform as a full working system. Figure 5.15 shows the requirements
table from chapter 3 with an additional column showing whether the requirement was
met. There are 3 implementation states: N/A – Not attempted, P – Partially
implemented/Attempted, C – Completed and passed testing.
Requirement                                                                    F/NF  Req  State
Face Modelling
1.1 An average face model can be generated                                     F     M    C
1.2 A face mesh of the subject can be generated from photographs               F     M    C
1.3 Face meshes are converted to triangular polygons only                      F     M    C
1.4 The face is able to be rendered                                            F     M    C
1.5 The face texture can be rendered                                           F     M    C
Face Exaggeration
2.1 Compare subject face to average face to acquire the vertices' DFM          F     M    C
2.2 Design an exaggeration function                                            NF    M    C
2.3 Apply the exaggeration function to the DFM and construct a new face mesh   F     M    C
2.4 Select vertices to define as facial features                               NF    D    C
2.5 Design different functions for different facial features                   NF    D    P
2.6 Apply the different exaggeration functions to the appropriate facial feature and construct a new face mesh   F   D   P
2.7 Address the problem of overlapping polygons or broken-up polygons due to feature exaggeration                F   D   P
2.8 User controlled exaggeration level                                         F     D    C
2.9 Ability to change the model used for the average face                      F     O    P
User Interface
3.1 A window to display the caricature face and optionally the original face   F     M    C
3.2 A file menu to load faces and caricatures                                  F     D    C
3.3 A file menu to save faces and caricatures                                  F     O    N/A
3.4 An option to control the level of exaggeration                             F     D    C
Figure 5.15 – Specification Testing
This table shows that on the whole the project was successful: all of the mandatory requirements have been fully implemented and passed testing, and all of the desired requirements have been implemented to at least some extent, though there is room for additional work in the future. The program can successfully load faces, then create and display the caricatures. A simple interface has been implemented which allows user control of the exaggeration. Feature-based exaggeration has been implemented; there is still room for improvement and a lot to experiment with in this area, but what has been implemented is a good starting point for further work. Other options, such as changing the exaggeration algorithm or the average face model, are easily done: the program has been coded such that they can be changed just by replacing the current exaggeration formula or pointing the average face file name at a different model. These are single-line alterations to the code; however, there is currently no way for the user to make these changes through the user interface. The user can, however, apply different exaggeration factors to different features and to the face as a whole.
Chapter 6: Results and Discussion
6.1 Caricature Results
To demonstrate the final outcome of the project, various caricatures were generated using the Caricature Program. These caricatures were created by exaggerating the whole face using the linear algorithm described in Chapter 5. The results are shown in Figure 6.1, with the original face on the left of each pair and the caricature on the right.
Figure 6.1 – Caricature Generator Results
6.2 Feature Caricature Results
The ability to caricaturise different features independently of each other was one of the optional additions to the program. This feature was only implemented in a basic form; however, we have looked at and partially dealt with many of the problems associated with it, which are discussed in Chapter 5. The features selected for separate exaggeration are the nose, eyes & eyebrows, lips and chin. Figure 6.2 shows some of the results produced by our Caricature Generator, with the original face on the left of each pair and the caricature on the right.
Figure 6.2 – Feature Exaggeration Results (nose; eyes & eyebrows; lips/mouth; chin)
6.3 Caricature Quality
In order to determine the quality and practical success of the program, two questionnaires were devised, as described in the evaluation section of Chapter 3. To see how well the generated caricatures represent their original subjects, a questionnaire was formed asking people to match a caricatured face to its regular face from a list of regular faces. In order to do this, an appropriate exaggeration level on which to model the caricatures first had to be decided. The program itself has user-controlled
exaggeration levels; however, for the questionnaire we needed a single exaggeration level to use for all the faces. Therefore a preliminary survey had to be completed first, where participants were shown a face caricatured to various levels of exaggeration and asked a series of questions to determine which exaggeration level looked most like a caricature. This questionnaire is included in Appendix B and the results are shown in Appendix D.
The caricatures used in the questionnaire are labelled in order of exaggeration factor from A to H, with exaggeration factors of 0.5 to 4.0 respectively, at intervals of 0.5. Due to the tight schedule of the project the questionnaire was only completed by 20 people; however, even with such a small sample there are some obvious trends which should be noted. From the results it is possible to calculate the average exaggeration factor chosen, shown in Figure 6.3.

Question   Average Exaggeration Factor
1a         2.6
1b         1.7
1c         3.55
2a         2.0
2b         1.6
2c         2.85
Figure 6.3 – Average exaggeration factors chosen
This table shows the average exaggeration factor chosen for the best looking caricature of the two faces (1a, 2a), the point at which people thought the faces became caricatures (1b, 2b) and the point at which the faces no longer resemble a reasonable face and have become distorted (1c, 2c). Comparing the results of question 1 with question 2, it is quite obvious that the face used in question 1 required more exaggeration to generate a good caricature than the face in question 2, and that the face in question 2 became distorted and inhuman earlier on the exaggeration scale than the face in question 1. Looking at the original faces, this is likely to be because the face in question 1 is a lot plainer than the face in question 2: although it is obviously different from the average face, the difference is fairly small, so in order to create a caricature the exaggeration factor must be higher. The opposite is true for the face in question 2, which has quite distinct features such as a small mouth and a thin jaw/chin. Due to the large difference from the average, the exaggeration appears a lot larger than that of question 1 at the same exaggeration factor; this is shown in the results by the exaggeration factor being a lot lower when people thought the face became distorted. To work out the best values for exaggeration and achieve the best results a much larger survey would need to be carried out; however, using the results we have, we can still work out a good average exaggeration factor to use for the second questionnaire. This first set of results is also useful for setting the min/max exaggeration factors for the sliders in the final system, to hopefully stop the user generating a caricature which is either under- or over-exaggerated.
The second questionnaire is included in Appendix C; it tests the quality of the caricatures generated by the system by asking people to match caricatures to their original faces. The three caricatures were chosen because they cover a wide range of faces and feature differences: the first face has large lips but is otherwise quite average, the second is fairly average in general and was therefore predicted to be more difficult to identify, and the third has a slightly thinner head than average and an elongated chin. The exaggeration factor used for each caricature was chosen depending on the subject's face. As we saw from the results of the first questionnaire, the more obvious the difference from the average, the lower the exaggeration factor should be to generate a good caricature; the factors were therefore chosen by comparing the results of question one against the distinctiveness of the subject faces. Faces 1 and 3 have fairly obvious features, similar to the face of question 2 in the first questionnaire; therefore an exaggeration factor of 2.0 was used, because 2.0 was the average exaggeration factor chosen as the "best caricature" for that
face. Face 2 of the second questionnaire is a lot plainer, similar to the face used in question 1 of the first questionnaire; on average the "best caricature" for this more average face was achieved using a factor of 2.6, so this was chosen as the factor for face 2 of the second questionnaire. The survey was carried out by asking 20 random people of various ages, genders and professions; none of the people surveyed had any experience of creating caricatures or professional artwork.
    A B C D E F G H I J
Q1: 1 16 1 1 1
Q2: 9 1 1 5 3 1
Q3: 1 4 14 1
Figure 6.4 – Questionnaire 2 Results
The results for the second questionnaire are shown in Figure 6.4 above, with the question numbers down the left column and the answer choices along the top row; the correct answers were C, A and H. For graphical representations of these results see Appendix E. The results show that 80% of participants correctly identified face 1, 45% correctly identified face 2 and 70% correctly identified face 3. Again this is only a very small set of results, so we have to assume that the test was fair and representative for these results to hold any meaning; more extensive testing would be required to form properly reliable conclusions.
Looking at the results we can see that, as predicted, face 2 was the most difficult to identify, with only 45% of people correctly matching the caricature to the face. This is likely to be because the original face is plainer than the others; the face therefore had to be exaggerated more to create a good caricature, which makes it less recognisable against the original, as explained earlier. If we look at the spread of incorrect answers for this question, the majority selected either face G or face H. When we compare the caricature to these two faces, we can see that the caricature has been stretched vertically, which is an obvious characteristic of faces G and H; this could draw people's attention as an obvious common feature and explain why people who answered incorrectly tended towards these two faces.
Faces 1 and 3 have fairly obvious features for caricaturing, which is reflected in their results. Face 1 got the best results, with 80% of people correctly identifying it, while face 3 managed 70%. So why is face 3 10% lower than face 1? Two things could be affecting this result. Firstly, as with question 2, the caricature is quite tall and thin, which is an obvious characteristic of face G as well, and the majority of incorrect answers chose face G. Secondly, each original face obviously corresponds to only one caricature, so anyone who chose face H as the answer to question 1 or 2 will have eliminated it as a choice for question 3. We can see that 3 people chose H as their answer for question 2, meaning that they could no longer answer question 3 correctly.
Although the test sets were small, the results have still allowed us to draw some interesting conclusions and identify areas for improvement. Overall the results were quite pleasing and showed that the Caricature Generator system is moderately successful. There is obviously a much greater difficulty in recognising more average-looking faces, which could be improved upon; however, this is just as much a problem for human-drawn caricatures as it is for computer-generated ones. With the test set being as small as it is, poor human judgement also has a greater effect on the results: the people who took part in the survey were chosen randomly, so no information is known about their personal recognition skills. Therefore we cannot be sure that all of the incorrect answers are due to poor caricatures generated by the system; some may be purely down to the participants' facial recognition abilities.
6.4 Further Work
Throughout this project we have strived to complete all the requirements set for this program, and we have managed to implement, fully or at least partially, everything required. There is large scope for future work on this project, with the Caricature Generator we have created as a basis for that work. Various areas could be experimented with further and improved upon in future projects:
Exaggeration Algorithms:
The exaggeration algorithm is the single most important factor determining the outcome of the caricature; it is solely responsible for the manipulation of the vertices, so it is important to find the algorithm that achieves the best results. In this project we have only implemented a simple linear exaggeration algorithm; there is obviously a lot of room for experimentation and testing with other, more complicated algorithms, possibly starting with polynomial algorithms.
Feature Exaggeration:
One of the desirable requirements for the system was the ability to exaggerate features individually rather than just exaggerating the whole face. This presented problems with mesh splitting, overlapping vertices and distortion of the face. We attempted a simple solution of smoothing the exaggeration towards the edge of the feature area; however, this is by no means a perfect solution and is certainly an area for improvement. Further work could also be done on how to define feature areas: obviously each face is different, and an area of vertices in one face may not completely cover the same feature in another face.
Face Models:
Two main problems were discovered when testing different face models. Firstly, a face which did not have exactly the same number of vertices, indexed in the same order as the average face, could not be compared or caricatured properly. Secondly, non-symmetric faces produced irregular exaggeration and distortion of the face. These two problems were not solved during this project and considerably restrict the faces that can be used as input. This could be looked at in the future to improve the program's robustness and range of use; if this program is to be used to accurately generate caricatures of real people (as was the original intention), it will certainly need to handle non-symmetric faces, since nobody's face is perfectly symmetric in the real world.
Additional/Optional Requirements:
The ability to save caricatures was an optional requirement which was not attempted in this project. If the project were to be continued, and possibly used for commercial applications rather than just research, the ability to save caricatures would be useful. It should not be difficult to implement the actual saving of the face; the difficulty would come when trying to load the saved caricature. At the moment there is no way to load a caricature; you can only load a normal face from which the caricature is generated, so a new operation would need to be implemented to load caricatures and their original faces.
Another optional feature which was only partially implemented was the ability to change the face used as the average. In the current system it is quite possible to simply replace the .obj file with the face you wish to use; however, this is not a very
user-friendly operation; it would be much better to have a simple load-file menu, as there is for loading the subject faces.
Chapter 7: Conclusions
The objective of this project was to create a computer-aided, autonomous caricature generator which would allow the user to create a 3D caricature of a real face without needing any artistic or caricaturing skills.
As the author of this paper I had no previous experience of modelling 3D faces or drawing caricatures, and only basic knowledge of 3D modelling in general, so research was carried out to explore the possible methods to use for the project. The research focussed on the two areas most important to this project: firstly the ability to create and display a realistic model of a human face, and secondly the methods for transforming a face into a caricature. It quickly became very clear that polygon meshes would be the best choice for modelling, because of their ability to model irregularly shaped objects such as a face. Polygon meshes also allow very easy manipulation of the model's vertices, which was helpful when creating the caricature. The only drawback to this method is that it cannot model perfectly smooth or round objects; however, with enough vertices and smooth shading a mesh can appear round to the human eye.
Various caricaturing methods were then explored, including mathematical EDFM, face morphing and EDFM using neural networks. Morphing required user control to caricature the faces, manually manipulating the features; this demands some skill from the user in being able to draw a caricature, or at least pick out the features they want to manipulate. The basic idea of this project was to create an autonomous caricature generator which did not require any artistic skill from the user, so this method was not suitable. EDFM using neural networks looked like a very promising idea, with its ability to mimic human artistic styles and generate its own style, giving very realistic caricatures. However, this method requires a large set of human-generated 3D caricatures to use as a training set; with no such caricatures in existence, generating them manually would take a large amount of time and would also require someone with good artistic/caricaturing skills. Also, with the author having no previous experience with neural networks, this was an extremely big task that did not fit the time scale or scope of this project. Therefore mathematical EDFM was the chosen method, because it best suited our needs for autonomous caricature generation (no user input needed) and because of its relative simplicity compared with the other methods.
With the methods of implementation decided, coding the program could begin. It was decided to use a pre-existing program, FaceGen, to create the original face models, which would then be loaded by the caricature generator. This presented the first problem, of non-uniform polygons in the model meshes, which was quickly solved with the design of a pre-loading class to convert all the model's polygons to triangles. Once at the stage of generating caricatures, the results for symmetric faces were quite pleasing: the faces were being exaggerated properly and bore a good resemblance to their originals. However, during testing it was found that the results for non-symmetric faces were of a much lower quality, many of them becoming distorted during the caricaturing process. A solution to this was not found, but with the success of caricatures using symmetric faces it was decided to push forward with the project and attempt to reach the optional/desirable requirements. The system is also unable to
create caricatures of faces which do not have the same number and order of vertices/polygons as the average face; however, the specification states that the program should take faces created by FaceGen as input, in which case this problem does not arise.
The ability to caricaturise different features independently of each other was one of the optional additions to the program's main requirements. The idea behind this feature was to give the user some control over the look of their caricature while still not needing any artistic or caricaturing skills. This addition aimed to allow the user to pick features to be exaggerated more or less at their own discretion, while not actually altering the physical shape of the face or manipulating vertices directly; they can only change the scale of the exaggeration calculated by the algorithm. The ability to independently caricaturise different features was only partially implemented, in quite a basic form: it only allows different exaggeration levels to be applied to each feature, with no difference in the exaggeration algorithms used for each facial feature. However, we have explored and partially dealt with many of the problems associated with this option, such as overlapping vertices and polygon distortion. This problem was solved by applying the exaggeration gradually, as a gradient towards the centre of the feature; this reduced the sudden difference between the exaggerated vertices and the rest of the face, resulting in much smoother-looking exaggeration.
In general, the final system should be considered a success. We have achieved the aims we set out to complete, met the majority of the requirements to a good standard of quality, and given reasonable solutions to the problems encountered during implementation. The final program has no real issues with robustness; it does not crash unless unexpected files are used as input (i.e. files which are not Wavefront object files). There is plenty of extra work and experimentation for the future with this project: we have only provided basic solutions to the problems encountered, and as a result the caricatures are not perfect and faces can still display deformation when the exaggeration scale is high. There were also a few optional requirements which were not implemented in this project, such as saving caricatures and a user-defined average face; however, the main improvements to the system concern experimentation with better exaggeration algorithms, feature selection and feature exaggeration smoothing.
References
[Dean,04] Deano, Caricature of Arnold Schwarzenegger, 2004.
[Jone,05] Court Jones, Caricature of Arnold Schwarzenegger, 2005.
[Prch,99] Josef Prchal, Caricature of Arnold Schwarzenegger, 1999.
[Bren,85] S.E. Brennan, The Caricature Generator. Leonardo, 18(3), pp. 170-178, 1985.
[Redm,84] L. Redman, How To Draw Caricatures, 1984.
[Kosh,99] H. Koshimizu, M. Tominaga, T. Fujiwara, K. Murakami, Image Processing for Computerized Facial Caricaturing System PICASSO, 1999.
[Fuji,01] T. Fujiwara, H. Koshimizu, K. Fujimura, G. Fujita, Y. Noguchi, N. Ishikawa, 3D Modeling System of Human Face and Full 3D Facial Caricaturing, 2001.
[Lian,02] L. Liang, H. Chen, Y. Xu, H. Shum, Example-Based Caricature Generation with Exaggeration. Pacific Conference on Computer Graphics and Applications 2002, pp. 386-393.
[Mo,03] Z. Mo, J.P. Lewis, U. Neumann, Improved Automatic Caricature by Feature Normalization and Exaggeration, 2003.
[Ergu,97] E. Akleman, Making Caricatures with Morphing, SIGGRAPH '97, 1997.
Appendix A: The Wavefront File Format
Wavefront files are a way of saving 3D models or objects. The Wavefront file format (.obj) is a text file comprised of lines of data about the object's vertices, polygons, normals and textures. Each line starts with a tag (which denotes what data the line contains) followed by the relevant data. This list shows all the data that can be stored in a Wavefront file, and the respective tags:
Vertex data:
v Geometric vertices
vt Texture vertices
vn Vertex normals
vp Parameter space vertices
Free-form curve/surface attributes:
deg Degree
bmat Basis matrix
step Step size
cstype Curve or surface type
Elements:
p Point
l Line
f Face
curv Curve
curv2 2D curve
surf Surface
Free-form curve/surface body statements:
parm Parameter values
trim Outer trimming loop
hole Inner trimming loop
scrv Special curve
sp Special point
end End statement
Connectivity between free-form surfaces:
con Connect
Grouping:
g Group name
s Smoothing group
mg Merging group
o Object name
Display/render attributes:
bevel Bevel interpolation
c_interp Color interpolation
d_interp Dissolve interpolation
lod Level of detail
usemtl Material name
mtllib Material library
shadow_obj Shadow casting
trace_obj Ray tracing
ctech Curve approximation technique
stech Surface approximation technique
This example of a simple Wavefront file defines a 2D triangle on the x, y plane, with vertices (0,0,0), (0,1,0) and (1,0,0):
v 0.0 0.0 0.0
v 0.0 1.0 0.0
v 1.0 0.0 0.0
f 1 2 3
Appendix B: Questionnaire 1
The following questions are to determine which level of exaggeration results in the best caricatures. Please answer by choosing one face for each question:
1.a) Which face do you think is the best caricature?
1.b) At which point has the face become a caricature?
1.c) At which point does the face become distorted or inhuman and cease to be a caricature?
Original Face
A B C D
E F G H
2.a) Which face do you think is the best caricature?
2.b) At which point has the face become a caricature?
2.c) At which point does the face become distorted or inhuman and cease
to be a caricature?
Original Face
A B C D
E F G H
Appendix C: Questionnaire 2
The following 3 questions test how recognisable the caricatures generated by the program are:
1) Which of the faces above do you think this image is a
caricature of?
2) Which of the faces above do you think this image is a
caricature of?
3) Which of the faces above do you think this image is a
caricature of?
A B C D E
F G H I J
Appendix D: Questionnaire 1 Results
      A (0.5)  B (1.0)  C (1.5)  D (2.0)  E (2.5)  F (3.0)  G (3.5)  H (4.0)
1a             1        1        2        9        3        4
1b    1        2        7        8        2
1c                                        1        3        9        7
2a                      6        9        4        1
2b             3        10       7
2c                               2        6        8        4
[Bar charts of answer count against exaggeration factor (A = 0.5 to H = 4.0) for Questions 1a, 1b, 1c, 2a, 2b and 2c]
Appendix E: Questionnaire 2 Results
    A B C D E F G H I J
Q1: 1 16 1 1 1
Q2: 9 1 1 5 3 1
Q3: 1 4 14 1
[Bar charts of answer count against answer choice (A to J) for Questions 1, 2 and 3]