Capturing aesthetic intention during interactive evolution

Zhenyu Gu *, Ming Xi Tang, John Hamilton Frazer

Design Technology Research Centre, School of Design, Hong Kong Polytechnic University, Hung Hom Kowloon, KLN, Hong Kong, Hong Kong, China

Received 19 April 2005; received in revised form 15 August 2005; accepted 15 October 2005

Abstract

Interactive Evolutionary Systems (IES) are capable of generating and evolving large numbers of alternative designs. When using such systems,

users are continuously required to interact with the system by making evaluations and selections of the designs that are being generated and

evolved. The evolutionary process is therefore led by the visual aesthetic intentions of the user. However, due to the limited size of the computer

screen and fuzzy nature of aesthetic evaluations, evolution is usually a mutation-driven and divergent process. The convergent mechanisms

typically found in standard Evolutionary Algorithms are more difficult to achieve with IES.

To address this problem, this paper presents a computational framework that creates an IES with a higher level of convergence without

requiring additional actions from the user. This can be achieved by incorporating a Neural Network based learning mechanism, called a General

Regression Neural Network (GRNN), into an IES. GRNN analyses the user’s aesthetic evaluations during the interactive evolutionary process and

is thereby able to approximate their implicit aesthetic intentions. The approximation is a regression of aesthetic appeals conditioned on the

corresponding designs. This learning mechanism allows the framework to infer which designs the users may find desirable. For the users, this

reduces the tedious work of evaluating and selecting designs.

Experiments have been conducted using the framework to support the process of parametric tuning of facial characters. In this paper we analyze

the performance of our approach and discuss the issues that we believe are essential for improving the usability and efficiency of IES.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: Interactive evolutionary systems; Genetic algorithm; Aesthetic intention; Artificial neural networks

1. Introduction

Evolutionary algorithms are inspired by the adaptive

evolutionary process in nature. In an evolutionary algorithm,

the ideas of natural selection, mutation and recombination are

applied to a computational task in order to find an optimal

configuration for a specific system. A population of configurations is evolved by a ‘natural selection’ mechanism that uses

an objective function to evaluate configurations. The function

is called the fitness function and is specific to the problem

being solved.

When evolutionary algorithms are used to evolve designs

involving artistic and aesthetic problems, they have to use

artificial selection. Such systems are called Interactive

Evolutionary Systems (IES).

Artificial selection refers to the selective breeding carried

out by humans to produce a desired evolutionary response.

0010-4485/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.

doi:10.1016/j.cad.2005.10.008

* Corresponding author. Tel.: +852 27667852; fax: +852 27745067.

E-mail address: [email protected] (Z. Gu).

In the biological world, humans have used artificial selection to

breed specific types of dogs, pigeons, crops, and flowers. In a

population of those species, the individuals that exhibit the

desirable traits are selected and bred to create offspring with

similar inherited features.

A number of IES have been implemented in graphic, media

art and industrial design in order to generate desirable images

or forms [1,2]. The IES approach was first demonstrated by

Dawkins [3], when he created a program for evolving artworks

called Biomorphs. Following on from this work, many people

devised their own evolutionary systems using artificial

selection. Examples of such systems include Evolutionary

Art by Todd and Latham [4], and Evolved Computer Graphics

by Sims [5]. Most of these systems had similar interactive

interfaces, where the current population is displayed in a grid

window with users selecting preferred designs based on their

aesthetic visual appeal [1]. A comprehensive survey of IES has

been given by Takagi [6].

1.1. Problem statement

For IES, the key problem is that the convergent mechanisms

of standard evolutionary algorithms are difficult to achieve.

Computer-Aided Design 38 (2006) 224–237

www.elsevier.com/locate/cad


Such convergent mechanisms are based on the existence of well-defined objective fitness functions. The most common

type of evolutionary algorithm is the Genetic Algorithm. These

algorithms use selection pressure (an index of the unevenness

of selection) and population size in order to control the

diversity in a population. Interbreeding in a small population or

under a high selection pressure immediately leads to premature

convergence [7]. Genetic Algorithms use a number of

specialised natural selection methods to reduce the selection

pressure. These include proportionate selection [8], tournament

selection [7], and truncation selection [9]. Such selection

methods usually require well-defined objective fitness

functions.
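As a rough illustration of the three selection schemes named above, the following Python sketch implements each one in a few lines. The function names, the `size` and `fraction` parameters and the list-based population are illustrative assumptions, not details taken from the cited works. All three functions need a numeric fitness value per individual, which is exactly what an IES lacks.

```python
import random

def proportionate_selection(population, fitness, k, rng=random):
    """Roulette-wheel selection: draw k individuals with probability
    proportional to their (non-negative) fitness."""
    return rng.choices(population, weights=fitness, k=k)

def tournament_selection(population, fitness, k, size=2, rng=random):
    """Run k tournaments; each draws `size` distinct entrants at random
    and keeps the fittest of them."""
    winners = []
    for _ in range(k):
        entrants = rng.sample(range(len(population)), size)
        best = max(entrants, key=lambda i: fitness[i])
        winners.append(population[best])
    return winners

def truncation_selection(population, fitness, fraction=0.5):
    """Keep the best `fraction` of the population, ranked by fitness."""
    ranked = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    keep = max(1, int(len(population) * fraction))
    return [population[i] for i in ranked[:keep]]
```

In this sketch a larger tournament `size` or a smaller truncation `fraction` raises the selection pressure.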

The evolutionary process in IES is guided by artificial

selection, which involves the user selecting preferred designs

from the population. This restricts the population size, due to

the limited number of individuals that can be displayed on the

computer screen [2]. In addition, the sentimental and fuzzy

nature of aesthetic judgments makes it difficult for the user to

maintain consistency in the choices over a certain period of

time. As a result, IES usually have divergent evolutionary

processes using asexual breeding and are mutation-driven.

The IES approach is typically used as a continuous novelty

generator, but not as an optimizer [1,10]. An IES may arrive at

an ideal solution by chance. It is difficult for the evolutionary

process to converge on optimum solutions in a subtly

controlled manner.

Mutation-driven evolution is also commonly used in

another kind of convergent evolutionary algorithm, called

Evolutionary Strategies. These algorithms achieve convergence by adjusting the mutation rate based on the rate of fitness

improvement. The approach is similar to hill-climbing search.

However, it is difficult for IES to use this approach. Usually, a

fixed mutation rate is adopted. If the mutation rate is small and the search space is large, the search process may lose its global view of the search space, and as a result the system may be misled. The

process is likely to become very time-consuming and tedious

for the user. Conversely, if the mutation rate is high, then the

inheritance of evolution may be broken.
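One classical way for an Evolutionary Strategy to adapt its mutation step from the rate of fitness improvement is Rechenberg's 1/5 success rule. The sketch below is a minimal (1+1)-ES under that assumption; the rule itself, the factor 0.85 and the window length are illustrative choices that this paper does not specify. It also shows why the scheme is hard to transfer to an IES: it needs a numeric fitness for every child.

```python
import random

def one_plus_one_es(fitness, x0, sigma=1.0, iters=200, window=20, seed=0):
    """Maximise `fitness` with a (1+1)-ES whose step size follows the 1/5 rule."""
    rng = random.Random(seed)
    x, fx = list(x0), fitness(x0)
    successes = 0
    for step in range(1, iters + 1):
        # Mutation-only reproduction: Gaussian perturbation of the parent.
        child = [xi + rng.gauss(0.0, sigma) for xi in x]
        fc = fitness(child)
        if fc > fx:  # keep the child only if it improves on the parent
            x, fx = child, fc
            successes += 1
        if step % window == 0:
            rate = successes / window
            if rate > 0.2:      # improving often: take larger steps
                sigma /= 0.85
            elif rate < 0.2:    # improving rarely: take smaller steps
                sigma *= 0.85
            successes = 0
    return x, fx, sigma
```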

In order to solve this dilemma, an IES approach needs to be

developed that combines a divergent evolutionary process with

certain convergent properties when subtle optimization is

required. Our approach to solving this problem is to

incorporate a Neural Network based learning mechanism,

called a General Regression Neural Network (GRNN), into an

IES. The GRNN analyses the user’s aesthetic evaluations

during the interactive evolutionary process and is thereby able

to approximate their implicit aesthetic intentions. The

approximation is a regression of aesthetic appeals conditioned

on the corresponding designs. This learning mechanism allows

the framework to infer which designs the users may find

desirable. For the users, this reduces the tedious work of

evaluating and selecting designs.

Section 2 gives an overview of research on evolutionary

algorithms that use Neural Network based fitness approximation mechanisms. Section 3 focuses on the process of capturing the user’s implicit aesthetic intentions using a GRNN.

Section 4 describes the overall system framework for an IES

incorporating a GRNN learning mechanism. Section 5

discusses two demonstrations, one involving the evolution of

facial characters and another on the evolution of color tones.

Finally, Section 6 draws some general conclusions and briefly

suggests areas of future work.

2. Related work

Sims [5] has suggested: “Large amounts of information of

all the human selection choices of many evolutions could be

saved and analyzed. A difficult challenge would be to create a

system that could generalize and ‘understand’ what makes an

image visually successful, and even generate other images that

meet these learned criteria.”

Researchers have developed a number of approaches aimed

at reducing the number of fitness calculations by discovering

additional heuristic information from the past generations and

evaluations. If the fitness calculations are computationally very

expensive and slow, then such approaches can improve the

overall evolutionary process by reducing the blindness of the

search.

2.1. Fitness approximation

Fitness approximation has been reported in evolutionary art

and design [11]. Biles’ [12] Genjam was a Genetic Algorithm

for generating jazz solos with a three-layer multilayer perceptron (MLP) Neural Network called Elualuator. Baluja [13] used a similar multilayer Artificial Neural Network (ANN)

to formulate the human aesthetic judgments on the images

produced by an IES. In his system, images were classified into

several ranked categories. ANN was then used to approximate

unknown continuous real functions with polynomials. Using

ANN to formulate human evaluation would seem to be a very

promising approach. However, the difficulty with such

approaches was that the input spaces were very large (even

with a short phrase of melody or small patch of image),

whereas the granularities of clusters and the distributions of

subspaces with different ranks were indistinct, complicated and

overlapped with unavoidable errors and contradictions of

human judgments. Therefore, it was hard to decide the learning

capacity (hidden layer scale) and the minimum size of training

set for a normal MLP. Such a system might be either over-

trained or over-generalized.

Ohsaki and Takagi [14] also tried to use a feed forward

Neural Network for predicting human evaluation of newborns

and displaying them in decreasing arrangement. They reported

that their experiments showed that the predictions given by the

Neural Network were less precise than the other simple method

that used reciprocals of Euclidean distances as predicted fitness

of newborn individuals. They did not explain why their Neural

Network approach did not work well and what kind of structure

of the Neural Network was adopted. The use of Euclidean

distances is the simplest approach, but it is not suitable in

multimodal and nonlinear situations, where different categories

are intertwined, for instance, the high order logic like XOR

problems. Actually, ANNs are usually nonlinear transforms of

the Euclidean or some other distance measures. As general-purpose approximation methods, they can perform classifications or regressions on almost any complicated distributions.

In classification, artificial Neural Networks are used to classify

the inputs by approximating the boundary between different

classes. In regression, the problem is concerned with explicitly

representing the mapping between the random input data and

the desired responses (outputs). Most Neural Networks trained

for fitness approximations, from a statistical perspective, can

be considered as regressions of the fitness conditioned on the

corresponding solutions.

Fitness approximation in evolutionary computation has also

received attention in Engineering areas, where explicit fitness

functions are either hard to formulate or computationally

expensive, such as structure optimization, aerodynamic design

optimization, and protein structure prediction. Yaochu [11]

concluded that several feed-forward Neural Network models

have been used for approximating real models in optimization:

MLP, radial basis function (RBF) networks and support vector

machines (SVM). The design and optimization of aerodynamic

shapes using Genetic Algorithms suffer from high fitness

computing cost. Giotis and Giannakoglou [15–17] have tried

using surrogate model learning from evaluated past samples to

forecast potential better candidate solutions before calling the

real fitness evaluation function. The authors preferred the RBF network for its strong local approximation properties. An RBF

network uses a weighted sum of a finite set of nonlinear basis

functions to approximate an unknown function.

2.2. Guiding reproduction

A number of researchers using conventional Genetic

Algorithms have developed ways of guiding the reproduction

process using simple probabilistic models (such as Bayesian

Networks), which reveal distributions of solution vectors

conditioned on fitness values. The algorithms were named

Estimation of Distribution Algorithms (EDA) by Muhlenbein and Paaß [18]. In an EDA approach, newborns were generated

according to the estimation of the probability distribution of the

selected set of solutions in the previous generation [19]. The

purpose of the algorithm was to improve mixing and breeding

efficiency [20]. They argued that the blind recombination and

mutation operators have great difficulty in discovering the

complex correlations (referred to as Epistasis) between

parameters (or genes), and in some cases they may even disrupt

such correlations.

Univariate Marginal Distribution Algorithm (UMDA) was

proposed by Muhlenbein and Paaß [18] for multivariable

optimizations using Genetic Algorithms. They managed to

independently reproduce each variable for a newborn based on

the estimated marginal distribution of the variable. However, it

is more reasonable that reproduction should follow the joint

probability of all the variables. The Mutual-Information-

Maximizing Input Clustering (MIMIC) algorithm proposed

by De Bonet [21] and the Factorized Distribution Algorithm

(FDA) proposed by Muhlenbein [22] managed to use some

simplified methods to replace the overall joint probability

calculation. A full comprehensive review has been given by

Pelikan and Goldberg [19].
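To make the idea of reproducing each variable from its own marginal distribution concrete, here is a minimal UMDA-style sketch. It assumes binary genes, truncation selection and clamped marginals; these choices, and all names and default values, are illustrative assumptions rather than the exact algorithm of [18].

```python
import random

def umda(fitness, n_bits, pop_size=60, n_select=20, generations=30, seed=0):
    """Univariate EDA: sample every bit independently from its own marginal."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # P(bit_j = 1), one independent marginal per gene
    best = None
    for _ in range(generations):
        # Newborns are sampled from the model, not bred by crossover or mutation.
        pop = [[1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        if best is None or fitness(pop[0]) > fitness(best):
            best = pop[0]
        selected = pop[:n_select]  # truncation selection
        # Re-estimate each marginal from the selected set; clamp to avoid fixation.
        probs = [min(0.95, max(0.05, sum(ind[j] for ind in selected) / n_select))
                 for j in range(n_bits)]
    return best, probs
```

Because every gene is sampled on its own, this model cannot represent correlations between genes, which is the limitation that MIMIC and FDA were designed to address.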

The advantage of EDA was that it improved fine tuning

efficiencies at the final stage of the evolutionary process. Local

fine-tuning is also a problem that exists in numeric

optimization using conventional Genetic Algorithms [23,24].

Usually the fitness improvements that can be achieved are more

significant at the beginning. Fitness improvement usually

decreases exponentially over time. Genetic Algorithms display

inherent difficulties in performing local search for numerical

applications due to the randomness of crossover and mutation.

3. Capturing implicit aesthetic intentions

A computational framework is proposed that combines an

IES with a Neural Network based learning mechanism capable

of approximating the user’s implicit aesthetic intentions. The

aim of incorporating a Neural Network differs from the

approaches discussed above. In our case, the aim is primarily to

minimize user fatigue by reducing the number of evaluations

that need to be performed by the user. In addition, the

approximation also results in a smoother, less erratic evaluation

function.

In this paper, we focus on multivariable optimization

problems using a Genetic Algorithm. Each design is

represented as a real number vector in multidimensional

Euclidean solution space. Sets of designs are generated and displayed in a gridded window, where the user can then select

one or more favourite designs. We assume that such user

interactions reflect certain user preferences, or aesthetic

intentions.

The probability density of a specific class of designs in an

unexplored solution space is presumed to be even. Through

artificial selections, certain points in the space will be

deselected, while others will be selected. Some areas may

then emerge as containing a higher density of favourite designs.

When the equilibrium of the space is disturbed, the probability

density of the class becomes uneven. Such a process may

therefore reveal the user’s aesthetic intentions.

In order to identify high performance regions containing

favourite designs, the history of the user’s responses must be

saved, analysed, and formulated as a fitness landscape. The

problem is then one of explicitly approximating the implicit

mapping between evolved designs and the user’s responses.

Such a mapping can be produced using Neural Network

regression. Two possible approaches are parametric regression

using weights, or non-parametric regression using a regression

function.

3.1. Parametric versus non-parametric regression

Neural Networks or Bayesian Networks for regression can

be regarded as an explicit representation of the implicit

deterministic input–response relationship in the measured data

that may be corrupted by additive zero-mean noises.


We assume that the network input is a random variable x, and

the desired response t is also a random variable.

An important interpretation is the following: a network learnt from the measured data has an output t(x; w) that approximates the conditional mean of the desired response data t, that is, the regression of t conditioned on x:

$$ t(x; w) = E[t \mid x] \tag{1} $$

where w are the optimal weights (parameters), and t(·) is the optimal network structure, scale and type (such as MLP, RBF or Bayesian network). E[t|x] refers to the conditional mean (expectation) of t, defined by:

$$ E[t \mid x] = \int t\, P(t \mid x)\, dt = \frac{\int t\, P(t, x)\, dt}{\int P(t, x)\, dt} \tag{2} $$

All of the networks that have been discussed so far are thus

nonlinear regressors based on the adaptation of weights w.

A drawback with this approach is the training of the weights

and the choice of network structure, which are both complex

nontrivial issues.

For the training of the weights, complex back-propagated

learning methods are required that are computationally

expensive, such as minimizing the mean square error (MSE)

or other methods. Bayesian networks also need a priori

information about the probability models fitting the structures

of the data.

For the network structure, intelligent choices of the

functional forms generally require data structure analysis. In

many circumstances, it is inappropriate to assume any

functional form for a class density function. Neural Networks

need structure estimations and adaptations by choosing the

number of layers, the type of transfer functions and the number

of nodes. The degree of freedom in the Neural Network

topology has to be decided experimentally.

In order to avoid such drawbacks, an alternative nonparametric regression approach can be used that does not involve

any weights. The model we employ is a probabilistic

Neural Network called the General Regression Neural Network

[25,26].

3.2. General regression neural network (GRNN)

For a GRNN, formulas (1) and (2) can be simply rewritten as [26]:

$$ t(x) = \frac{\int t\, P(t, x)\, dt}{\int P(t, x)\, dt} \tag{3} $$

The regression of t on x can be written as a function of the

joint Probability Density Function (PDF) which needs to be

directly estimated from available data. That is, given a number

of the input–response pairs (x_i, t_i), we seek a nonparametric

estimator of the joint PDF P(x, t) of random input vector x and

random scalar response t.

Estimating the PDF from the measured data is a common

approach in statistics [27], and is also used in the area of

Bayesian Statistics. A commonly used technique for achieving

this is the Parzen Kernel method which is a nonparametric

estimator following the idea of kernel approximation. It

provides an estimate that approaches the true density under

fairly general conditions. The estimator was proposed by

Parzen [27] and developed further to multidimensional cases

by Cacoullos [28]. The formula is as follows:

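$$ P(x, t) = \frac{1}{(2\pi)^{(n+1)/2}\, \sigma^{(n+1)}} \cdot \frac{1}{m} \sum_{i=1}^{m} \exp\!\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) \exp\!\left(-\frac{\|t - t_i\|^2}{2\sigma^2}\right) \tag{4} $$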

where m is the number of measured input–response pairs and n

is dimension of x.

In using the Parzen method, the underlying PDF is assumed

continuous. The method starts by normalizing the measured

data and choosing a type of symmetric kernel function such as the Gaussian. The kernels are located at each (x_i, t_i), and added

up for estimating the overall joint PDF. A cluster of kernels

closely packed together usually indicates an area of high

probability density.
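The joint density estimate of formula (4) can be sketched directly in Python. The function name and the representation of the measured cases as a list of `(x_i, t_i)` pairs are assumptions made for illustration.

```python
import math

def parzen_joint_pdf(x, t, cases, sigma):
    """Parzen estimate of P(x, t): the average of Gaussian kernels of width
    sigma centred on each measured pair (x_i, t_i)."""
    n = len(x)  # dimension of the input vector x
    norm = 1.0 / ((2.0 * math.pi) ** ((n + 1) / 2.0) * sigma ** (n + 1))
    total = 0.0
    for xi, ti in cases:
        dx2 = sum((a - b) ** 2 for a, b in zip(x, xi))  # ||x - x_i||^2
        dt2 = (t - ti) ** 2                             # ||t - t_i||^2, t is scalar
        total += math.exp(-dx2 / (2.0 * sigma ** 2)) * math.exp(-dt2 / (2.0 * sigma ** 2))
    return norm * total / len(cases)
```

With a single case and sigma = 1, the estimate at the kernel centre equals the peak of a two-dimensional unit Gaussian, 1/(2π).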

After combining (3) and (4) and performing the integrations, we get:

$$ t(x) = \frac{\sum_i t_i \exp\left(-\|x - x_i\|^2 / 2\sigma^2\right)}{\sum_i \exp\left(-\|x - x_i\|^2 / 2\sigma^2\right)} \tag{5} $$
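The integration step can be spelled out as follows. This is a sketch of the standard derivation, writing C for the constant factor in front of the sum in (4) and treating t as a scalar; the symbol C is introduced here only for brevity.

```latex
\begin{aligned}
C &= \frac{1}{(2\pi)^{(n+1)/2}\,\sigma^{(n+1)}\,m}, \\
\int P(x,t)\,dt
  &= C \sum_{i=1}^{m} \exp\!\left(-\frac{\|x-x_i\|^2}{2\sigma^2}\right)
     \int \exp\!\left(-\frac{(t-t_i)^2}{2\sigma^2}\right) dt
   = C\,\sqrt{2\pi}\,\sigma \sum_{i=1}^{m} \exp\!\left(-\frac{\|x-x_i\|^2}{2\sigma^2}\right), \\
\int t\,P(x,t)\,dt
  &= C \sum_{i=1}^{m} \exp\!\left(-\frac{\|x-x_i\|^2}{2\sigma^2}\right)
     \int t \exp\!\left(-\frac{(t-t_i)^2}{2\sigma^2}\right) dt
   = C\,\sqrt{2\pi}\,\sigma \sum_{i=1}^{m} t_i \exp\!\left(-\frac{\|x-x_i\|^2}{2\sigma^2}\right).
\end{aligned}
```

Dividing the second line by the first cancels the common factor C√(2π)σ and leaves the ratio of sums in (5).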

The following is a simplified form of function (5):

$$ t(x) = \frac{\sum_i t_i \exp(-Z_i/\sigma)}{\sum_i \exp(-Z_i/\sigma)} \tag{6} $$

Z_i can be some type of distance metric between any new input x and the ith measured case x_i. In real applications, there are some alternatives to Euclidean distances that reduce the computational costs, such as the dot product distance:

$$ Z_i = x^T x_i $$

or the City Block distance:

$$ Z_i = \sum_{j=1}^{n} |x_j - x_{ij}| \tag{7} $$

Formula (5) can be realized with the multilayer feed-forward network structure shown in Fig. 1.

The Gaussian kernels are centred at each measured case x_i, and t_i is the observed user response for the case x_i. The output t can be visualized as a weighted average of all of the values of t_i. The weight decays exponentially with the squared distance between the input x and the kernel centre x_i.
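Formula (5) translates almost line for line into code. The sketch below is a minimal pure-Python version; the function name, the `(x_i, t_i)` list format and the zero fallback for numerical underflow are assumptions made for illustration.

```python
import math

def grnn_predict(x, cases, sigma):
    """GRNN output for input x: the kernel-weighted average of the observed
    responses t_i, with one Gaussian kernel per measured case."""
    num = 0.0
    den = 0.0
    for xi, ti in cases:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))  # squared Euclidean distance
        w = math.exp(-d2 / (2.0 * sigma ** 2))         # kernel weight of this case
        num += ti * w
        den += w
    # If every weight underflows to zero, fall back to 0 (an assumed convention).
    return num / den if den > 0.0 else 0.0
```

Adding a new measured case only means appending a pair to `cases`; there are no weights to retrain.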

It may appear that the GRNN is very similar to an RBF network structure. However, the key point of the GRNN is that the number of nodes in the hidden layer is exactly the same as the number of measured cases. The network grows when new measured cases are added, unlike RBF networks, which have a fixed number of kernels and need training processes to adapt the centres and widths of the kernels. With a GRNN, only the unified width σ (the radial deviation) of the Gaussians has to be experimentally determined.

Fig. 1. A GRNN structure and workflow.

Fig. 2a. Twelve one-dimensional kernels on [0,100] with σ = 5. The orange solid lines represent selected cases.

Fig. 2b. GRNN regressions on the twelve cases. The regressions vary from sharp to flat with σ = 2, 5, 10, 15, 20 and 35 in turn.

3.3. The network input

The vector x represents the features of an evolved design

which aroused a certain user response. x should be a real-valued vector in Euclidean space, as a GRNN makes density estimates in a continuous Euclidean space. The simplest approach is to adopt a real-number phenotypic representation.

An important presumption of the fitness approximation is

that the real joint PDF of (x, t) is continuous. This means that a

slight change of x should not result in a sharp change in

aesthetic appeal. Fortunately, this assumption seems valid in

our cases.

The other requirement for x is normalization. As a pre-

processing step, it is usually necessary to scale all

components of vector x such that they have approximately

the same ranges or deviations. The necessity for this stems

from the fact that the underlying PDF is to be estimated with

symmetric kernels that have the same width σ in each

dimension [26].
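One simple way to meet this requirement is min-max scaling of every component to [0, 1]. The sketch below assumes the ranges are taken from the observed vectors; in a real system the known parameter bounds of the design model could be used instead. The function name is hypothetical.

```python
def minmax_scale(vectors):
    """Scale each component of the design vectors to the range [0, 1]."""
    dims = range(len(vectors[0]))
    lo = [min(v[j] for v in vectors) for j in dims]
    hi = [max(v[j] for v in vectors) for j in dims]
    return [
        # A component that never varies carries no information; map it to 0.
        [(v[j] - lo[j]) / (hi[j] - lo[j]) if hi[j] > lo[j] else 0.0 for j in dims]
        for v in vectors
    ]
```

After scaling, a single kernel width acts equally on a parameter measured in degrees and one measured in millimetres.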

A design may have various representations serving different

purposes. For instance, some creative (generative) systems

devised to solve combinatorial problems use binary or

alphabetical codes or tree structures. In such cases, a

reasonable way to measure the similarities and differences

among the designs is Hamming distance or Edit distance rather

than Euclidean distance or City Block distance. Some other

density estimation methods may then be required based on

these other distance measurements.
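For reference, the two discrete distance measures mentioned above can be computed as in the following sketch. Edit distance is implemented here as the usual Levenshtein distance with unit costs, which is an assumption, since the text does not say which variant is meant.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length codes")
    return sum(ca != cb for ca, cb in zip(a, b))

def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-symbol insertions,
    deletions and substitutions that turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]
```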

3.4. The user response

For the response t_i, one option is to let the user input a scale value corresponding to the design x_i, which may be quite

informative. However, it is likely to be very tedious for the user

and the values obtained may not be reliable.

Our experimental system simply sets t_i as a Boolean value. That is, we put all of the cases into two categories: selected and unselected. The user’s response t_i to the visual output of case x_i only requires one click (selected), or no click (unselected) on it. From the usability point of view, this is satisfactory. Referring to Fig. 1, if x_1 and x_2 are selected, the desired outputs t_1 and t_2 will be set to 1, while the rest (t_3 to t_m) are set to zero. Therefore, only x_1 and x_2 contribute to the node WS (weighted summation). The final output is the estimated probability that the user will select the currently applied test vector.

3.5. The regression function

The most important parameter is σ, which determines the smoothness of the regression function. The greater the σ, the smoother the function is. This implies that the GRNN has enough flexibility to capture both the global and, gradually, the local properties of the input data.

Suppose there are 12 randomly generated sample cases in a one-dimensional solution space and four of them are selected. Fig. 2a depicts the sample cases with 12 Gaussians (σ = 5) placed on x. The four Gaussians drawn with solid orange lines are the selected ones. The corresponding regression of t on x can then be estimated (see the solid red line with label σ = 5 in Fig. 2b).

In Fig. 2b, a bundle of curves depicts the estimated t on x using different σ values. The peaks of those curves are estimated to be the potential desirables (the user’s intention). A very small σ may cause the search to fail to converge, as the peaks are replaced by flat-top plateaus (see the dotted line with σ = 2).

A traditional method of selecting a suitable σ is the holdout method [26]. For a particular value of σ, the holdout method consists of removing one measured case at a time and constructing a network based on all of the other cases. Then the network is used to estimate t for the removed case. By repeating this process for each case and storing each estimate, the mean squared error can be measured between the actual measured values t and the estimates. The value of σ giving the smallest error should be used in the final network. The holdout method is useful in the offline state when learning has been completed.
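The holdout procedure described above can be sketched as follows. The helper `grnn_predict` repeats the weighted average of formula (5) so that the block is self-contained; the function names and the idea of passing an explicit list of candidate σ values are assumptions for illustration.

```python
import math

def grnn_predict(x, cases, sigma):
    """Weighted average of the responses t_i, as in formula (5)."""
    num = den = 0.0
    for xi, ti in cases:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        w = math.exp(-d2 / (2.0 * sigma ** 2))
        num += ti * w
        den += w
    return num / den if den > 0.0 else 0.0  # assumed fallback on underflow

def holdout_error(cases, sigma):
    """Mean squared error when each case is predicted from all the others."""
    err = 0.0
    for k, (xk, tk) in enumerate(cases):
        rest = cases[:k] + cases[k + 1:]  # network built without case k
        err += (grnn_predict(xk, rest, sigma) - tk) ** 2
    return err / len(cases)

def select_sigma(cases, candidates):
    """Return the candidate width with the smallest holdout error."""
    return min(candidates, key=lambda s: holdout_error(cases, s))
```

Each evaluation of `holdout_error` costs on the order of m² kernel evaluations for m cases, which is one reason the method suits the offline state better than interactive use.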

In our case of dynamic online learning, we propose a simple alternative method. A user initially uses a range of different σ values to make several separate estimates. The user picks the most desirable estimated result. Then, its σ value will be the mean value of the range of σ for the next try. This strategy makes the estimation of the user’s preference flexible. Typically, a change of σ shifts the high peaks significantly when learning cases are few and far between. As learning cases accumulate, the shifting of the peaks will be reduced.
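One possible reading of this recentring step is sketched below: the next set of candidate widths is spaced evenly around the σ the user preferred, so that their mean equals that value. The even spacing, the `half_width` parameter and the safeguard that keeps every candidate positive are assumptions; the text only states that the chosen σ becomes the mean of the next range.

```python
def next_sigma_range(chosen_sigma, half_width, count=5):
    """Evenly spaced candidate widths whose mean is the width the user chose."""
    if count < 2:
        return [chosen_sigma]
    # Shrink the range if needed so that every candidate stays positive.
    half_width = min(half_width, 0.9 * chosen_sigma)
    step = 2.0 * half_width / (count - 1)
    return [chosen_sigma - half_width + i * step for i in range(count)]
```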

Fig. 3. Intention inferences using a GRNN with different values of σ (0.3 and 0.5). There are a total of 7 observed cases, of which two are selected (the circles with asterisks).

3.6. Interpolation and extrapolation

A two-dimensional example (see Fig. 3) of a GRNN approximation process may help explain the effect of σ, and the basic mechanism of how the possible user preferences are inferred from the input data.

There are a total of seven sampled points scattered in a 2D solution space so far. Two of them are selected (the circles with asterisks). Two possible resulting approximated fitness landscapes are depicted by contour lines.

The shift of the peaks highlights the two abilities of the

GRNN: interpolation and extrapolation. Interpolation refers to

the fact that the peaks tend to match the common clustering

centres of the selected samples. Extrapolation refers to the fact

that the unselected samples counteract the peaks, making them

deviate off the clustering centres and extrapolating them

towards unexplored potential areas.

The intertwining of two classes of samples (the selected and unselected) implies a bimodal XOR problem. A moderate σ captures this feature with two separate peaks (see the upper plot in Fig. 3). When σ becomes larger than a certain threshold, the two peaks are merged into a single one (see the lower plot in Fig. 3). The unselected point in the middle is effectively suppressed as an error (but it still has an effect on the peak). With a suitable σ, the GRNN is robust enough to tolerate the uncertainty and conflict of selections.
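The merging effect can be reproduced numerically with a small hypothetical layout of seven points, two of them selected. The coordinates below are invented for illustration and are not the data behind Fig. 3. The sketch compares the GRNN response at the two selected points with the response at the unselected point lying between them, as a crude proxy for whether two separate peaks survive.

```python
import math

def grnn_predict(x, cases, sigma):
    """Weighted average of the responses t_i, as in formula (5)."""
    num = den = 0.0
    for xi, ti in cases:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        w = math.exp(-d2 / (2.0 * sigma ** 2))
        num += ti * w
        den += w
    return num / den if den > 0.0 else 0.0  # assumed fallback on underflow

# Hypothetical layout: two selected designs with an unselected one between them
# and four unselected designs at the corners of the unit square.
selected = [(0.2, 0.5), (0.8, 0.5)]
unselected = [(0.5, 0.5), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
cases = [(p, 1.0) for p in selected] + [(p, 0.0) for p in unselected]

def selected_beat_middle(sigma):
    """True if both selected points score higher than the unselected middle point."""
    middle = grnn_predict((0.5, 0.5), cases, sigma)
    return all(grnn_predict(p, cases, sigma) > middle for p in selected)
```

For this layout, `selected_beat_middle(0.15)` holds, whereas with σ = 0.5 the middle point scores slightly higher than either selected point, so the unselected case there is smoothed over.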

With different σ, the inferences cautiously cover most possible situations at this stage. A reasonable conjecture from this example is that with subsequent user selections, as well as increased samples, the shifts of the peaks will be confined and ambiguities will be gradually eliminated.

4. System framework and implementations

In order to validate the idea, we developed a system

framework that combines a GRNN for capturing users’

aesthetic intentions with an IES. Fig. 4 illustrates the overall

structure of our framework.

4.1. System framework

The framework switches between two basic states. The first

state consists of an interactive evolutionary process guided by

artificial selection. At the beginning, a user starts with a

random population of designs and then, generation by

generation, interactively evolves this population. The user

simply picks one or several individuals on the screen.

Then, the system will asexually breed and mutate the selected

designs to derive a new generation. The mutation rate is always

set high in order to ensure that there is enough diversity. All the

selected and unselected designs are saved in the GRNN

memory. After several generations of interactive evolution, the

system is switched to the second state.
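A minimal sketch of one step of this first state is given below. It assumes designs are vectors of genes normalized to [0, 1], that children are assigned to the selected parents in turn, and that mutation is Gaussian with a fairly large step; the function names and these details are illustrative assumptions, not the implementation used in our system.

```python
import random

def record_generation(memory, population, selected_indices):
    """Store every displayed design with its Boolean response (1 = selected)."""
    for i, design in enumerate(population):
        memory.append((list(design), 1.0 if i in selected_indices else 0.0))

def breed_next_generation(selected, pop_size, mutation_sigma=0.2, rng=random):
    """Asexual breeding: each child is a mutated copy of one selected parent."""
    children = []
    for k in range(pop_size):
        parent = selected[k % len(selected)]  # cycle through the selected designs
        # Gaussian mutation, clipped so genes stay inside the normalized range.
        child = [min(1.0, max(0.0, g + rng.gauss(0.0, mutation_sigma))) for g in parent]
        children.append(child)
    return children
```

Note that unselected designs are recorded as well, since they shape the regression just as the selected ones do.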

In the second state, the GRNN module is used to capture the user’s intentions by analyzing the choices that the user made in the previous state. A normal Genetic Algorithm is then used to automatically converge on the interesting designs to which the GRNN gives high responses.

Fig. 4. The framework has two states. One is normal interactive evolution guided by artificial selections by the user. The other is an automatic evolution guided by a GRNN fitness approximation mechanism that captures the user’s aesthetic intentions.

Fig. 5. Cartoon style faces using primitives and their combinations.

Fig. 6. An example of parameterization of the facial model. The points labeled C are some of the control points of a NURBS surface. The nose, mouth, ears, eyes and eyebrows are created from primitives. The points with label A are surface points on the NURBS for constraining these facial features (the deviations are labeled D). Each facial feature has local coordinates for rotation and scaling (labeled with S and R).

It is likely that the GRNN will result in a solution space with many peaks. As a result, the Genetic Algorithm is thought to be the most appropriate search method since it is more robust than analytical approaches such as the Newton gradient method. In this state, the Genetic Algorithm can be set with a relatively large population size and low selection pressure in order to ensure a successful convergence. Different deviations σ may be tested to provide several different results.
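The second state can be sketched as an ordinary real-coded Genetic Algorithm whose fitness function is the GRNN output. The operators below (binary tournament, blend crossover, small Gaussian mutation, one elite) and all default values are illustrative assumptions, not the settings of our system.

```python
import math
import random

def grnn_predict(x, cases, sigma):
    """Weighted average of the responses t_i, as in formula (5)."""
    num = den = 0.0
    for xi, ti in cases:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        w = math.exp(-d2 / (2.0 * sigma ** 2))
        num += ti * w
        den += w
    return num / den if den > 0.0 else 0.0  # assumed fallback on underflow

def evolve_on_surrogate(cases, sigma, n_genes, pop_size=40, generations=30, seed=0):
    """Search the GRNN fitness landscape with a simple real-coded GA."""
    rng = random.Random(seed)
    score = lambda ind: grnn_predict(ind, cases, sigma)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        nxt = [max(pop, key=score)]                 # elitism: keep the best so far
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=score)  # binary tournament: low pressure
            b = max(rng.sample(pop, 2), key=score)
            mix = rng.random()                      # blend crossover of two parents
            child = [mix * ga + (1.0 - mix) * gb for ga, gb in zip(a, b)]
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.02))) for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)
```

Because the surrogate is cheap to evaluate, the population here can be far larger than the handful of designs that fit on a screen in the first state.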

If the results are not satisfactory, then the user may switch

back to the first state. The user may therefore work in an

iterative manner, switching back and forth between states.

Gradually, the evolved designs will get closer and closer to the

real intention of the user.

Our framework aims to provide a heuristic and intuitive

working mode for visual aesthetic problems existing in

industrial, graphic and multimedia design domains.

4.2. Facial character design

Facial character design is chosen as an application for

testing the framework proposed in this research. The computer

is used to evolve amusing or attractive animated characters

totally based on the visual impression of particular character

traits.

The first implementation of the framework is a simple

system called the Genetic Face Creator (GFC), which uses a

Genetic Algorithm to evolve 3D facial models defined by a

number of parameters.

Fig. 5 illustrates face models in GFC. A NURBS surface is

constructed to be the basis of the facial shape. (Fig. 6

illustrates the main components of the model and several examples

of the parameterization of the model.) Some facial features,

such as the nose, ears, eyes and mouth, are then added onto the

surface. The facial features are made of primitive shapes, such

as spheres, cylinders, tori, cones, and their combinations.

Each facial feature’s position depends on the NURBS surface

by referencing one of the surface points on it (called anchor

points, see the cross points with label A in Fig. 6). The surface

points’ XYZ coordinates are determined by specifying their UV

parametric coordinates on the surface.

A large number of alternative facial models can be produced

by modifying the parameters, which include the following:

• The XYZ coordinates of the control points of the NURBS surface (with label C)

• The U and V coordinates of the anchor points on the NURBS surface (with label A)

• The scaling and rotation of the local coordinates of each facial feature (with labels S and R)

• The constraints (deviations on the XYZ axes, with label D) from the pivots of the facial features to the anchor points

These parameters are selected to define the phenotypes in

the Genetic Algorithm. Therefore, each phenotype is a real

number vector. Each component of the vector varies in an

Fig. 7. The experimental system has several depositories called pools to

separately record interactivities with different aesthetic intentions. The slide

bars allow the user to define weights for each aesthetic intention. The numeric

fields display the numbers of selected and unselected cases in the pools so far.


interval with specific upper and lower bounds defined by the user.

The bounds of the intervals are usually loosely set with safe

margins (Fig. 7).

As the intervals of the parameters may have different scales,

all the components of a phenotype are normalized. Normalization

is performed by first subtracting the middle values of

their intervals, and then mapping them to the same scale of

[−256.00, 256.00]. (The normalization and scaling of the

phenotype is also a requirement for the input x to the GRNN.) This

allows them to be further uniformly encoded as 16-bit binary

strings (one bit for the sign, eight bits for the integer part),

which are regarded as the genes of the individual. During

evolution, every newborn genotype must be decoded from a

binary string to a phenotypic real number vector.
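The normalization and encoding described above can be sketched as follows. This is a minimal illustration rather than the GFC implementation: the text specifies only the sign bit and the eight integer bits, so the sketch assumes a sign-magnitude layout in which the remaining seven bits hold the fractional part. The function names and the constants `FRAC_BITS` and `MAX_MAG` are hypothetical.

```python
SCALE = 256.0             # half-range of the normalized scale given in the text
FRAC_BITS = 7             # assumption: the 7 bits not described in the text hold the fraction
MAX_MAG = (1 << 15) - 1   # largest magnitude representable in 15 bits


def normalize(value, lo, hi):
    """Map a raw parameter in [lo, hi] to [-256, 256], centred on the interval midpoint."""
    mid = (lo + hi) / 2.0
    half = (hi - lo) / 2.0
    return (value - mid) / half * SCALE


def denormalize(x, lo, hi):
    """Inverse of normalize: map a value in [-256, 256] back to [lo, hi]."""
    mid = (lo + hi) / 2.0
    half = (hi - lo) / 2.0
    return x / SCALE * half + mid


def encode(x):
    """Encode a normalized value as a 16-character bit string (sign bit + 15-bit magnitude)."""
    sign = '1' if x < 0 else '0'
    # Magnitudes that do not fit in 15 bits (e.g. exactly 256.0) are clipped.
    mag = min(int(round(abs(x) * (1 << FRAC_BITS))), MAX_MAG)
    return sign + format(mag, '015b')


def decode(bits):
    """Decode a 16-character bit string back to a normalized real number."""
    mag = int(bits[1:], 2) / (1 << FRAC_BITS)
    return -mag if bits[0] == '1' else mag
```

Under these assumptions the resolution is 1/128 of a normalized unit, and a gene decoded with `decode` can be mapped back to its original interval with `denormalize`.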

5. Demonstrations

It has been said that IES has no reliable convergent mechanism.

With our IES, we have observed that it is usually impossible for

a user to achieve identical facial shapes in a number of

separated interactive evolutions. This is due to the fuzzy nature

of aesthetic judgments (the lack of a clearly formulated fitness

function) and the stochastic nature of asexual, mutation-driven

evolution. The appearance of better designs relies on

randomly mutating the designs selected by the user in the previous

generation. With a small population, it is quite possible that an

interactive evolution stops at a near-optimum or misses the

user’s intention.

However, the inductive nature of the GRNN makes

intention capturing possible. If the user’s intention is consistent

and the user’s preference does not change significantly, the

seemingly erratic user selections may show a certain consistency.

In the following experiments, we aimed to demonstrate the

feasibility of achieving convergence within an IES. The GRNN

was used to reveal implicit correlation between evolving

designs and the user responses. The evolutionary system was

then allowed to run and to evolve a set of designs that fall

within the dominant hump of the user’s preference.

5.1. Evolution of facial characters

The GFC may be used to capture aesthetic intentions for

specific facial characteristics. For example, a user may evolve

‘girl’ faces or ‘smart’ faces. The user initially evolves a set of

faces by artificial selection. Each generation consists of 12

individuals, which are all displayed to the user. The user will

usually select one or more individuals as the parents for asexual

breeding. (It is also possible to select none, in which case the

system will keep breeding the parents from the previous

generation until acceptable offspring appear.) The user’s

selections are saved in a depository, referred to as a pool.

Separate pools would be created for each aesthetic intention.

For example, one pool would be created for the ‘smart’ faces,

and another for the ‘girl’ faces.

The system is then switched to the second state. First, the

GRNN is used to capture the user’s aesthetic intentions, and

then the canonical Genetic Algorithm is used to automatically

evolve faces that reflect the user’s aesthetic intentions.

Using the GRNN also enables GFC to find compromises

between different aesthetic intentions. This can be achieved

without requiring the user to perform any further artificial

selections. For example, by activating the pools for both

‘smart’ and ‘girl’, the GFC can be used to evolve ‘smart girl’

faces. Such combinations are compromises that are achieved

by linear combinations (Fig. 7).
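One way to read the linear combination of pools is as a weighted sum of per-pool fitness approximations, with the weights taken from the slide bars. The sketch below illustrates that reading only. Treating each pool as a separate predictor function and normalizing by the total weight are assumptions, and `combine_pools` is a hypothetical name.

```python
def combine_pools(predictors, weights):
    """Return a fitness function that linearly combines per-pool predictors.

    predictors: one callable per pool, each mapping a design vector to a score.
    weights:    one non-negative weight per pool (e.g. slide-bar settings).
    """
    if len(predictors) != len(weights):
        raise ValueError("one weight is needed for each pool")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("at least one pool must have a positive weight")

    def fitness(x):
        # Weighted average; dividing by the total keeps the score on the
        # same scale as the individual predictors (an assumption).
        return sum(w * p(x) for p, w in zip(predictors, weights)) / total

    return fitness
```

With equal weights for the ‘smart’ and ‘girl’ pools, a design scores highly only where both predictors respond, which matches the notion of a compromise between intentions.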

A simple experiment with GFC intuitively demonstrates this

point. In this experiment, we aimed to evolve the facial

character of a young lady. The experiment had three steps.

First, we initialized the GFC system, and ran it interactively

for 8 generations. 7 designs were selected out of a total of 96.

All the selections were recorded in the pool. The GFC system

was then switched to the second state. The GRNN was used to

analyze the pool and capture the user’s aesthetic intentions.

The Genetic Algorithm was then used to automatically evolve

a set of designs. The population size was enlarged to 40, and

single point crossover with a low mutation rate (0.01) was

used. Selection pressure was kept low, and the proportionate

selection method [8] was used. The evolution became stable

after 200 generations. The result of the convergence is shown

on the left side of Fig. 8 with label A.
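The automatic evolution in the second state can be sketched as a plain generational Genetic Algorithm using the settings reported above (population 40, single point crossover, mutation rate 0.01, proportionate selection, 200 generations). This is an illustrative sketch, not the GFC code: the per-bit interpretation of the mutation rate, the pairing scheme and all function names are assumptions, and `fitness` stands for any non-negative score such as a GRNN response.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Fitness-proportionate selection; fitness values must be non-negative."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if running >= pick:
            return individual
    return population[-1]


def single_point_crossover(a, b, rng):
    """Swap the tails of two equal-length bit strings at a random cut point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate` (assumed per-bit rate)."""
    return ''.join(
        ('1' if c == '0' else '0') if rng.random() < rate else c
        for c in bits
    )


def evolve(fitness, genome_bits, pop_size=40, generations=200,
           mutation_rate=0.01, seed=0):
    """Run a generational GA over bit strings and return the final population."""
    rng = random.Random(seed)
    population = [''.join(rng.choice('01') for _ in range(genome_bits))
                  for _ in range(pop_size)]
    for _ in range(generations):
        fitnesses = [fitness(ind) for ind in population]
        children = []
        while len(children) < pop_size:
            a = roulette_select(population, fitnesses, rng)
            b = roulette_select(population, fitnesses, rng)
            c1, c2 = single_point_crossover(a, b, rng)
            children.append(mutate(c1, mutation_rate, rng))
            children.append(mutate(c2, mutation_rate, rng))
        population = children[:pop_size]
    return population
```

For a facial model, `genome_bits` would be 16 times the number of parameters, and `fitness` would decode the bit string into a phenotype vector before scoring it.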

The second step was similar to the first step. The GFC

system was initialized and run interactively. This time, the

system was run for 4 generations, with 4 designs being selected

out of a total of 48. These selections were recorded in a second

pool. The GFC was then switched to the second state. In this

case, the GRNN was used to analyze the second pool, thereby

capturing a different set of aesthetic intentions. The Genetic

Algorithm was used to automatically evolve a second set of

designs. The result of the convergence is shown on the right

side of Fig. 8 with label B.

Finally, in the third step, no interactive artificial selection

took place. Instead, the GRNN was used to analyze both pools

from the first and second steps. This resulted in the GRNN

capturing a set of aesthetic intentions from both the first and

second steps. The Genetic Algorithm was then used to evolve a

third set of designs. The result of the convergence is shown in

the lower part of Fig. 8 labelled C.

The primitive and caricatured models in GFC allow

differences and similarities between facial features to be easily

identified. The result in C could therefore be seen to be a

compromise of the results in A and in B. The eyes and

eyebrows in C look like the results in A, whereas the mouth and

Fig. 8. A selection of faces evolved using GFC. Two users have interactively evolved a number of faces, thereby creating two pools of data (shown as boxes) reflecting

their aesthetic intentions. The faces labelled A show the result of running the GRNN and Genetic Algorithm using pool A only. The faces labelled B represent the

result of using pool B only. Finally, the faces labelled C represent the result of using both pools A and B.


nose are like the results in B. The face silhouette looks more

elegant than in the results in A and B because some appearance

flaws have been moderated.

The observation of the experiment verifies that the GFC

using GRNN fitness approximation is able to converge to a

certain stable point. Furthermore, it is suggested that when the

GRNN uses multiple pools, the final evolved designs will be a

compromise of the different aesthetic intentions for each of the

pools.

The GRNN in this demonstration used the city block distance

metric to estimate the probability density (see formulas 6

and 7). The deviation σ was fixed at 1500. The intention

capturing mechanism allows the user to define a series of σ

values, resulting in a batch of inferences. Fig. 9 shows that the

result of C may vary slightly as different deviations are used.

The larger the deviation value, the less significant the changes

are.
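A GRNN response with the city block metric can be sketched as a kernel-weighted average of the recorded evaluations. The form below follows the standard city block variant of Specht's GRNN [26] and is assumed, not confirmed, to match the formulas referred to above. Coding selected designs as 1 and unselected designs as 0 is also an assumption, and `sigma` denotes the deviation.

```python
import math


def city_block(a, b):
    """City block (L1) distance between two equal-length vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))


def grnn_predict(x, samples, targets, sigma):
    """Estimate the response at x as a weighted average of the targets.

    Each recorded sample contributes with weight exp(-distance / sigma),
    so nearby samples dominate and sigma controls the smoothing.
    """
    dists = [city_block(x, s) for s in samples]
    d_min = min(dists)  # shift for numerical stability; it cancels in the ratio
    weights = [math.exp(-(d - d_min) / sigma) for d in dists]
    return sum(w * t for w, t in zip(weights, targets)) / sum(weights)
```

A small deviation makes the response follow individual samples closely, whereas a large deviation flattens the response towards the mean of all targets, which is consistent with the diminishing changes observed for larger deviations.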

5.2. Evolving colour tones

As aesthetic intentions of design are usually fuzzy and

difficult to specify clearly, it is hard to objectively measure

whether the convergent results actually reflect the user’s

aesthetic intentions. A simpler experiment was therefore

performed that allowed objective measurements to be

taken. The experiment involved the search and fine-tuning of

colours.

The user was asked to select shaded balls in a grid window

(see Fig. 10). By repeatedly making selections, the user was

able to evolve the colour tone and converge towards a desired

colour. In this experiment, the desired colour was actually

predefined, thereby allowing objective measurements of

convergence to be taken. A warm grey colour tone was

displayed in a smaller window and was taken to be the

desired target colour in the user’s mind (the small window was

always shown to the user as a reference during the search). The

experiment aimed to discover whether and how the system

could help the user get closer to the ‘intention’, as the

evolutionary process progressed.

The colour tone was controlled by the three parameters of

the HSV (hue, saturation and value) colour model. (We chose

the HSV model as it is more intuitive for colour comparison than

other models.) The three parameters were real numbers varying

in the interval [0.000, 1.000].

During interactive evolution, each generation consisted of

12 individuals. Each individual had a decimal real number

phenotype and a binary genotype. The parameters in each

phenotype were decoded from 16-bit binary strings (genes) in

Fig. 9. The results may change slightly with different deviations σ: 250, 500,

1000, 2000 and 4000, from top to bottom.


the corresponding genotype. Each newborn individual in this

generation was produced by mutating the genotype of a parent,

which was one of the selected individuals in the last generation.

A mutation randomly flipped one or two bits in each gene of the

parent.
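The asexual breeding step described above can be sketched as follows. The text states only that one or two bits are flipped in each gene. Choosing between one and two flips with equal probability, choosing bit positions uniformly, and picking the parent uniformly from the selected individuals are all assumptions, and the function names are hypothetical.

```python
import random


def mutate_gene(gene, rng):
    """Flip one or two distinct, randomly chosen bits of a binary gene string."""
    n_flips = rng.choice((1, 2))
    positions = rng.sample(range(len(gene)), n_flips)
    bits = list(gene)
    for pos in positions:
        bits[pos] = '1' if bits[pos] == '0' else '0'
    return ''.join(bits)


def mutate_genotype(genes, rng):
    """Mutate every gene of a genotype (one gene per parameter, e.g. H, S and V)."""
    return [mutate_gene(g, rng) for g in genes]


def breed(parents, n_children, rng):
    """Produce a new generation by mutating randomly chosen selected parents."""
    return [mutate_genotype(rng.choice(parents), rng) for _ in range(n_children)]
```

Calling `breed(selected, 12, random.Random())` would yield one generation of the size used in the experiment.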

Fig. 11 visualizes the first five generations of interactive

evolution colour tuning in HSV coordinates. (Only the portion

Fig. 10. A user kept picking out the colour balls which seemed close to the

target colour tone in the smaller window during an interactive evolution.

around the target is displayed. The target is the centre of the

cross with HSV [0.083, 0.25, 0.80].) Each row shows two

views of the HSV search space. The circles scattered in the

HSV space represent the individuals generated during the

process and the asterisks represent selected individuals. Fig. 12

represents the results of the intention capture process, that is,

the automatic evolution process using the fitness approximation.

The fitness approximation used the GRNN, and was

performed with four different deviation values σ, which are

visualized in Fig. 13.

Fig. 12 shows that, with suitable deviation values (σ = 6 or

9), the intention capture process converges towards the target.

Fig. 13 shows the fitness approximations using GRNN. The

centres of the regressions are seen to drift gradually as

the value of the deviation σ increases. This may be due to the

extrapolating effect of the unselected points scattered in the lower

part.

The deviation σ was initially tested with several integer

values forming an arithmetic progression. A suitable

value of the deviation σ must be determined empirically by the

user over several trials. A general strategy has been proposed in

Section 3.5.

5.3. Some empirical guidelines

From our experimentation, we have discovered that the

value of the deviation σ of the GRNN is sensitive to a number

of different factors. The deviation σ appears to be highly

related to the interval span of the components of the input

vector x (parameters) and to the complexity of the problem

(multimodal or unimodal). In addition, with different

distance metrics, the same value of σ gives significantly

different results. We have therefore formulated the following

empirical guidelines.

When the city block distance metric is used to estimate the

probability density, the step of the progression of σ

should usually be two or three times the interval span of the

GRNN inputs. In the colour ball case, for instance, the interval

was [0.0, 1.0]; the trials of the deviations could be {3, 6, 9, 12}.

In the facial shape design case, the interval was [−256.0, 256.0];

the trials adopted the deviations {1000, 2000, 3000, 4000} (see

Fig. 9). This empirical guideline seemed to work quite well in

our implemented system.
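The city block guideline can be written as a small helper that generates trial deviations. It is a sketch of one reading of the guideline: taking the step as a multiple of the interval span (upper bound minus lower bound) is an interpretation, and the function name, the default factor and the default number of trials are hypothetical.

```python
def sigma_trials(lo, hi, factor=3.0, count=4):
    """Return an arithmetic progression of trial deviations.

    The step is `factor` times the span of the input interval [lo, hi].
    """
    step = factor * (hi - lo)
    return [step * k for k in range(1, count + 1)]


colour_trials = sigma_trials(0.0, 1.0, factor=3.0)      # [3.0, 6.0, 9.0, 12.0]
face_trials = sigma_trials(-256.0, 256.0, factor=2.0)   # [1024.0, 2048.0, 3072.0, 4096.0]
```

For the colour case this reproduces the trial set quoted above. For the facial case a factor of two gives multiples of 1024, close to the rounded values of 1000 to 4000 used in the text.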

When the Euclidean distance metric (see formula 5) is

adopted, however, the value of σ is usually a fraction of the

interval. For instance, in Fig. 3, the GRNN’s deviations are 0.3

and 0.5 as its interval is [−0.2, 1.2]; in Fig. 2a,b, the seemingly

reasonable deviations are {10, 15, 20} as the interval is [0.0,

100].

Of course, in order to use a more precise and problem-specific

value for the deviation σ, a number of further trials may

be needed.

Another unexpected phenomenon observed after tracking

a number of interactive evolution processes in our system is that

the individuals bred are not distributed evenly in all directions

around the parents. The distribution of the breeding seems to be

Fig. 11. Visualization of the interactive evolution process within the HSV search space. Each row shows two views of the HSV search space. The crosses represent

the desired target colour (H = 1/12, S = 0.25 and V = 0.8). Only the search space around the target colour is visualized. The circles represent individuals in the

population and the asterisks represent selected individuals. The shaded areas are missing gaps due to uneven mutations.


skewed, with the newborns bred from one parent appearing

frequently in several grid areas with gaps in between.

Generally, the symptoms can be explained as the local fine-tuning

problem of Genetic Algorithms for numeric optimization

[24]. A simple explanation is that the gaps result from the

Hamming cliffs between the binary genotypes of contiguous

individuals in a solution space. Thus, two close decimal

numbers may have a very large Hamming distance. (For instance, the

binary forms of 7 and 8 are 0111 and 1000.) On the

other hand, two relatively distant decimal numbers may have

a very small Hamming distance. (For instance, the binary forms of

15 and 7 are 1111 and 0111.) As the

Fig. 12. Visualization of the results of the intention capture in the HSV search space. The two clusters of solid dots represent the evolved individuals (search results)

from two automatic evolutions using GRNN fitness approximations with σ = 6 and 9 (the approximations are illustrated in Fig. 13).


newborns are produced by flipping randomly selected bits in

the parent genes, the mutations may seem jumpy.

The skewed breeding is due to the fact that the proportions of 1s and 0s in a binary string are usually not equal. (For instance, if one bit is flipped, the number 7 (0111) has a 3/4 probability of becoming a smaller number and a 1/4 probability of becoming the larger number 15 (1111).)

Fig. 13. The 3D slices illustrate the approximations (regressions) based on all measured cases from the interactive evolution process. The regressions use the city block distance metric in the HSV space. As the deviation value σ increases, the density centres of the regressions can be seen to drift gradually across the target HSV [0.083, 0.25, 0.80].

In this demonstration, the individuals bred in generations 2–5 were separated and confined within two apparent clusters. Unfortunately, the desired target fell in the gap between the two main clusters. As a result, evolution stagnated with

many redundant reproductions. Although a mutation might

eventually fill this gap with an increasing number of tries, it was

considered to be problematic. One solution may be to use

special coding schemes such as Gray coding or real number

coding.
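The Hamming cliff, the skewed single-bit mutations and the effect of reflected binary (Gray) coding can be checked with a few lines of code. This is an illustrative sketch using the 4-bit examples from the text, with hypothetical function names. It was not part of the experimental system.

```python
def hamming(a, b):
    """Number of bit positions in which two non-negative integers differ."""
    return bin(a ^ b).count('1')


def to_gray(n):
    """Convert a non-negative integer to its reflected binary (Gray) code."""
    return n ^ (n >> 1)


def from_gray(g):
    """Convert a Gray code back to the ordinary binary integer."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def single_flip_neighbours(n, width):
    """All values reachable from n by flipping exactly one of `width` bits."""
    return [n ^ (1 << k) for k in range(width)]


# Plain binary: 7 (0111) and 8 (1000) are adjacent values but differ in all four bits.
cliff = hamming(7, 8)                        # 4
# Gray coding: adjacent values always differ in exactly one bit.
gray_step = hamming(to_gray(7), to_gray(8))  # 1
# Skew: three of the four single-bit mutants of 7 are smaller, one (15) is larger.
mutants_of_7 = single_flip_neighbours(7, 4)  # [6, 5, 3, 15]
```

Gray coding removes the cliff between adjacent values, although a single bit flip can still produce a large jump, so it reduces the gaps without guaranteeing small mutation steps.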

Nevertheless, this problem was not seen to be fatal to the

performance of the GRNN. The interactive evolution process is

only used as a pre-processor to perform initial search. The

GRNN is then used to create a fitness approximation based on

the selections during interactive evolution. Figs. 11 and 12

show that, with a suitable deviation value σ, interpolation and

extrapolation by the GRNN are able to fill in the missing gaps

based on neighbouring individuals.

6. Conclusions and further work

This paper presents a framework for evolving designs that

combines an interactive evolutionary system for selecting

preferred designs, a Neural Network (GRNN) for fitness

approximation, and a canonical Genetic Algorithm for

automatically evolving designs.

The interactive evolutionary system allows its users to

express their aesthetic intentions by repeatedly selecting

preferred designs. The Neural Network then captures these

aesthetic intentions by analyzing the selections made by the

user. The Neural Network only requires the user to have made a

modest number of selections, thereby minimizing the work

required by the user. It results in a smoother, less erratic

evaluation function than the direct result of a user’s individual

choices. The Genetic Algorithm can then be run for a longer

time and with a much larger population, thereby allowing

designs to be evolved until they fall into the dominant hump of

the user’s aesthetic intentions.

The main reason for having used GRNN as a model for

fitness approximation is its non-parametric nature. Such a

model is very suitable for dynamic on-line learning. In

addition, as a variant of the RBF network, it has good local

approximation properties that allow for local fine tuning of

designs. The interpolation and extrapolation abilities enable

GRNN to predict high fitness areas even when sample cases are

incomplete.

Further large-scale experiments and analysis with different

problems will be conducted to clarify the limitations and

potential of GRNN for evolutionary algorithms. Multi-user

interfaces and Internet access for mass data acquisition will

also be investigated so that these techniques may be used to

predict trends in areas such as fashion colour or car styling.

Acknowledgements

This project is supported by a PhD scholarship from the Hong Kong

Polytechnic University. We would like to thank Dr. Patrick

Janssen, a postdoctoral researcher and former fellow PhD

student in our centre, for his proofreading of the editorial

composition of the draft of this paper.

References

[1] Bentley PJ. An introduction to evolutionary design by computers,

evolutionary design by computer. San Francisco, California: Morgan

Kaufmann; 1999. p. 1–73.

[2] Renner G, Ekart A. Genetic algorithms in computer aided design.

Comput-Aided Des 2003;35(8):709–26.

[3] Dawkins R. The evolution of evolvability, artificial life. In: Langton CG,

editor. The proceedings of an interdisciplinary workshop on the synthesis

and simulation of living systems, September, 1987, vol. VI. Los Alamos,

New Mexico: Addison-Wesley; 1989. p. 201–20.

[4] Todd S, Latham W. Evolutionary art and computers. London: Academic

Press; 1992.

[5] Sims K. Artificial evolution for computer graphics. Comput Graph 1991;

25(4):319–28 [ACM SIGGRAPH’91 Conference Proceedings, Las

Vegas, Nevada, 1991].

[6] Takagi H. Interactive evolutionary computation: fusion of the capacities

of EC optimization and human evaluation. Proc IEEE 2001;89(9):

1275–96.

[7] Whitley D. The GENITOR algorithm and selection pressure: why rank-

based allocation of reproductive trials is best. In: Proceedings of the third

international conference on genetic algorithms. United States: George

Mason University; 1989. p. 116–21.

[8] De Jong KA. An analysis of the behavior of a class of genetic adaptive

systems. PhD Dissertation. University of Michigan; 1975.

[9] Goldberg DE. Genetic algorithms in search, optimization, and machine

learning. Reading, MA: Addison-Wesley; 1989.

[10] Bentley PJ. Aspects of evolutionary design by computers. In: Proceedings

of the 3rd on-line world conference on soft computing in engineering

design and manufacturing (WSC3); 1998.

[11] Yaochu J. A comprehensive survey of fitness approximation in

evolutionary computation. Soft Comput 2005;9(1):3–12.

[12] Biles JA. Genjam: a genetic algorithm for generating jazz solos.

Proceeding of international computer music conference; 1994. p.

131–7.

[13] Baluja S, Pomerleau DA, Jochem T. Towards automated artificial

evolution for computer generated images. Connect Sci 1994;3(2&3):

325–54.

[14] Ohsaki M, Takagi H. Improvement of presenting interface by predicting

the evaluation order to reduce the burden of human interactive EC

operators. In: IEEE international conference on system, man, and

cybernetics (SMC’98), San Diego, California, USA; 1998. p. 1284–9.

[15] Giotis AP, Giannakoglou KC. Single- and multi-objective airfoil design

using genetic algorithms and artificial intelligence. In: Jyvaskyla MK

et al., editors. EUROGEN 99, evolutionary algorithms in engineering and

computer science. Finland: Wiley; 1999.

[16] Giannakoglou KC, Giotis AP, Karakasis MK. Low-cost genetic

optimization based on inexact pre-evaluations and the sensitivity analysis

of design parameters. J Inverse Prob Eng 2001;9:389–412.

[17] Giannakoglou KC. Design of optimal aerodynamic shapes using

stochastic optimization methods and computational intelligence. Prog

Aerospace Sci 2002;38:43–76.

[18] Muhlenbein H, Paaß G. From recombination of genes to the estimation of

distributions I. Binary parameters. In: Eiben A, Back T, Shoenauer M,

Schwefel H, editors. Parallel problem solving from nature—PPSN IV.

Berlin: Springer; 1996. p. 178–87.

[19] Pelikan M, Goldberg DE. A survey of optimization by building and using

probabilistic models. Comput Optim Appl J 2002;21(1):5–20.

[20] Thierens D. Analysis and design of genetic algorithms. PhD Thesis.

Katholieke Universiteit Leuven, Leuven, Belgium; 1995.

[21] De Bonet JS, Isbell CL, Viola P. MIMIC: Finding optima by estimating

probability densities. In: Mozer MC, Jordan MI, Petsche T, editors.

Advances in neural information processing systems, vol. 9. Cambridge:

MIT Press; 1997. p. 424.

[22] Muhlenbein H. Convergence theory and applications of the factorized

distribution algorithm. J Comput Inf Technol 1998;7(1):19–32.

[23] Holland JH. Adaptation in natural and artificial systems. Ann Arbor:

University of Michigan Press; 1975.

[24] Michalewicz Z. Genetic algorithms + data structures = evolution

programs. 3rd ed. New York: Springer; 1996.

[25] Specht DF. Probabilistic neural networks. Neural Networks 1990;3:

109–18.

[26] Specht DF. A general regression neural network. IEEE Trans Neural

Networks 1991;2:568–76.

[27] Parzen E. On estimation of a probability density function and mode. Ann

Math Stat 1962;33:1065–76.

[28] Cacoullos T. Estimation of a multivariate density. Ann Inst Stat Math

1965;18:179–89.

Zhenyu Gu received BEng in Industrial Design in

1994 and MA in Art Design in 1999 from Southern

Yangtze University. From 1994 to 1999, he

worked as a lecturer teaching design sketch,

rendering and Computer Aided Industrial Design

in the same University, during which period he

obtained several design patents and won the first

prize in an international bicycle design competition.

From 2000 to 2001, he worked as a research

assistant in the Design Technology Research

Center and was involved in the center’s research

initiative in using AI in Art and Architectural Design. In 2001 he obtained a

PhD scholarship from the Hong Kong Polytechnic University and he is now in

the final stage of completing his PhD project. His research interests are in computer

graphics and soft computing techniques, especially their applications in

design and media art.

Ming Xi Tang is an Associate Professor, and the

Associate Director of the Design Technology

Research Centre in the School of Design of the

Hong Kong Polytechnic University. He graduated

in 1981 with a B.Sc and in 1984 with an M.Sc from

Huazhong University of Science and Technology in

The People’s Republic of China. In 1996 he

obtained his PhD from the Department of Artificial

Intelligence in the University of Edinburgh. From

1988 to 1997 he worked in University of Edinburgh,

University College London and University of

Cambridge as a researcher in the area of developing integrated and intelligent

design supporting methods and systems. In 1997 he joined the School of Design

of the Hong Kong Polytechnic University. His research interests include

Artificial Intelligence, generative and evolutionary design, complex form

generation and visualization, and collaborative design technologies. He is the

executive editor of a recently established Journal of Designing in China.

Professor John Hamilton Frazer is the international

research coordinator for the Digital

Project Eco-Systems in Gehry Technologies, and

former Swire Chair Professor of Design, Head of

School of Design, Director of the Design

Technology Research Centre in The Hong Kong

Polytechnic University. His research interests

include evolutionary architecture, tangible interface,

computational methods and techniques for

design. He pioneered the research in evolutionary

architecture and has more than 30 years of

experience in research of computer aided design techniques and processes in

both generic and specific design areas and in a post digital context.
