
The University of York Department of Electronics

Immune Inspired Homeostasis for Long-term Autonomy in Collective Robotic Systems

Lachlan Murray

16th December 2010

Transfer Report

Supervisors – Professor Jon Timmis and Professor Andy Tyrrell

Abstract

For future applications of collective robotic systems, such as search and rescue and the clean-up of hazardous waste, the ability of the system to survive autonomously for long periods of time is considered to be essential. To provide an artificial system with the property of long-term autonomy, the biological process of homeostasis, realised through the combined efforts of the nervous, immune and endocrine systems, is looked to as a source of inspiration. In this report, the immune system in particular provides the source of inspiration. Beginning with an introduction to the necessary background information, this report goes on to introduce the topics of biological homeostasis and natural immune systems whilst at the same time reviewing some of the various artificial systems that they have inspired. Later sections present some early proposals on how such systems may help to ensure the long-term autonomy of a collective robotic system, and finally, this report concludes by outlining some initial work that has been conducted towards this goal.

Contents

1 Introduction

2 Background
2.1 Symbiotic Evolutionary Robot Organisms
2.1.1 Symbricator Platform
2.1.2 Grand Challenges
2.1.3 Summary
2.2 Robot Reliability
2.2.1 Robot Failure Modes
2.2.2 Summary
2.3 Fault Tolerance
2.3.1 Definitions and Terminology
2.3.2 Fault Detection and Diagnosis
2.3.3 Fault Recovery
2.3.4 Summary

3 Literature Review
3.1 Homeostasis
3.1.1 History
3.1.2 General Homeostatic Properties
3.1.3 Engineering Artificial Homeostatic Systems
3.2 Artificial Immune Systems
3.2.1 Biology
3.2.2 Developing Artificial Immune Systems
3.2.3 Algorithms and Applications
3.3 Artificial Immune System Framework for SYMBRION
3.3.1 AIS Framework
3.3.2 Current Progress

4 Research Topic
4.1 Homeostasis and Long-term Autonomy
4.1.1 Survival
4.1.2 Operation
4.1.3 Adaptation
4.1.4 Designing Autonomy
4.1.5 Fault Tolerance
4.1.6 Artificial Energy Homeostasis
4.1.7 Summary
4.2 Artificial Robotic Organism Energy Homeostasis
4.2.1 Symbricator Power Management System
4.2.2 Power Bus Homeostasis
4.2.3 Approaches to Power Bus Homeostasis
4.2.4 Summary
4.3 Long-Term Aims
4.3.1 Adaptable Power Management in Organism Mode
4.3.2 Fault Tolerance in Organism Mode
4.3.3 Fault Tolerance in Individual and Swarm Modes
4.3.4 Summary

5 Preliminary Work
5.1 Simulation Tools
5.1.1 Basic Energy Sharing Within the Stage Simulator
5.1.2 Power Management System
5.2 Fault Tolerance in Individual and Swarm Modes
5.2.1 Optimising the mDCA
5.3 Summary

6 Summary

List of Figures

1.1 Various forms of collective robotic system.

2.1 A mock-up of an artificial robotic organism, produced during the early stages of the project.

2.2 Prototypes of the three different Symbricator robots. Figure (a), from left to right, shows prototypes of the backbone, active wheel and scout robots. Figure (b) shows how the three types of robot may be connected to form a simple artificial robotic organism.

2.3 Images from the Robot3D simulator built specifically for the SYMBRION and REPLICATOR projects and now released as open source software. Figure (a) shows 26 backbone robots forming an artificial robotic organism. Figure (b) shows an accurate model of the scout robot (foreground) and an older, less accurate, model of a backbone robot (background); the robots are not correctly scaled, as in real life they are the same size.

2.4 A sketch of the type of environment that will be used for the Grand Challenges. The pink cubes are robotic modules and the yellow, green and blue squares are power sockets.

2.5 A taxonomy of robot failures from Carlson et al. (2004) and Carlson and Murphy (2005). Of most relevance to this report are physical failures and the impact they may have on a system.

3.1 Diagrammatic representation of Ashby’s ultrastable system, incorporating the four components: control system, parameters, essential variables and environment. Adapted from Ashby (1960).

3.2 A drawing of Ashby’s Homeostat is shown in figure (a). Figure (b) shows a diagrammatic representation of the connections between the four different units of the homeostat.

3.3 A simplified diagram of a single homeostat unit ‘A’. The diagram shows how the inputs from the other units (solid red lines), and feedback from unit A itself (dashed blue line), act through the components chosen by the uniselectors, altering the current through the coils and ultimately determining the position of the magnet.

3.4 Diagram showing the many different sub-types of leukocyte cell. Row c shows the five most important categories: eosinophils, neutrophils, basophils, lymphocytes and macrophages. Dendritic cells, though not technically leukocytes, are included in this diagram because they are functionally very similar to macrophages. The cell representations in this diagram were adapted from Murphy et al. (2008).

3.5 A diagram showing the different states of maturity of dendritic cells. The locations where the different cell types are most prevalent, the transitions between them and the effects they have on helper T-cells are also shown.

3.6 A layered framework for the development of artificial immune systems. From de Castro and Timmis (2002).

3.7 An inter-disciplinary, conceptual framework for designing artificial immune systems. From Stepney et al. (2005).

3.8 Diagram showing a potential weakness of the negative selection approach. If the size of “non-self” is much larger than the size of “self” then it may be hard to build up a set of detectors that sufficiently cover the “non-self” space. Without full coverage, any anomalies that fall within the white space will not be detected.

3.9 An artificial immune system framework for the SYMBRION project.

4.1 A hierarchy of three of the most important requirements of a long-term autonomous robotic system: basic survival, normal operation, and adaptation to a changing environment.

4.2 A modified view of the three requirements of a long-term autonomous system.

4.3 An example power bus configuration for an artificial robotic organism.

4.4 Block diagram of the Symbricator power management system of a single Symbricator robot.

4.5 Two examples of how neighbouring modules in an artificial robotic organism may share power. The dashed yellow lines show the directions in which current may flow. Due to the configuration of their docking switches, in (a) current is only able to flow from module A to module B and not from module B to module A. In (b) current is able to flow bi-directionally between modules D and E.

5.1 Figure (a) is a screenshot from the extended version of the Stage simulator that shows two robots sharing energy, whilst another robot recharges itself at a power socket. Figure (b) shows the experimental setup, a simple environment with a single power socket, eight robots and basic obstacles.

5.2 A state transition diagram of the behavioural controller developed to demonstrate the capabilities of the extensions to the Stage simulator.

5.3 Graph showing the change in the average state of charge of eight robots, repeated over five runs and utilising two different strategies. The first strategy (continuous red line) permits robots to share energy with each other. In the second strategy (dashed blue line and dotted grey line), robots cannot share energy. The dashed blue line omits the result of one anomalous run, described in text. The dotted black line includes the anomalous result.

5.4 The output of the mDCA algorithm. Figure (a) shows the output when using the baseline set of hand-chosen parameters, whilst figure (b) shows the output when using an evolved set of parameters. The red rectangles show the times at which faults were injected into the corresponding sensors.

5.5 Expanded output of the mDCA algorithm for ‘sensor 0’, during the same runs shown in figure 5.4 and for the same baseline (a) and evolved (b) sets of parameters.

1 Introduction

The term ‘collective robotic system’ is extremely broad, covering a wide range of approaches, from the well developed field of swarm robotics, to more complicated systems such as self-reconfigurable robotics, and more conceptual technologies such as nanorobotics. A swarm robotic system, as shown in figure 1.1a, typically consists of a large number of relatively simple and often homogeneous units. Often taking inspiration from biology, in particular the activity of social insects, the behaviour of a swarm robotic system emerges purely from the local interactions of the individual robots and their environment. Nanorobotics considers the task of creating and controlling robots on a microscopic scale. Due to the large engineering hurdles that are faced in constructing nanorobots, few, if any, real-world examples of such systems currently exist. Self-reconfigurable robotic systems, as depicted in figure 1.1c, are composed of many individual units, physically joined to form large robotic structures which are far more capable than the sum of their individual parts. Self-reconfigurable robotic systems are also highly flexible, allowing the individual modules involved to alter their connectivity with their neighbours and form different robot morphologies that are better suited to the current task or environment.

A general collective robotic system may take the form of a swarm, nano, or self-reconfigurable robotic system, or it may combine a number of different properties from each of these individual systems. In this report, a combined view is taken that unites the approaches of swarm and self-reconfigurable robotic systems. A collective robotic system is considered to be one in which multiple (potentially heterogeneous) robots are, at any moment in time, operating in one of three different states. Firstly, robots may operate as individuals, serving only themselves. Secondly, in line with the swarm robotics approach, robots may act as members of a larger group. Finally, robots may possess the ability to physically join with each other to form larger robotic structures that may re-organise themselves in a manner similar to that observed in self-reconfigurable robotic systems. This classification is depicted in figure 1.1d. It should be noted that, due to its infancy and the vastly different scale upon which it operates, no consideration is made in this report for the field of nanorobotics.

Future applications of collective robotic systems are expected to include search and rescue operations, space exploration, surveying hostile environments and the clean-up of hazardous waste, to name but a few. The unifying feature of these applications is that they involve robots operating in environments that may be considered too dangerous for humans to enter. Another problem with the environments of these applications is that within them, it may be difficult or impossible to communicate with the robots, especially in the case of search and rescue and space exploration tasks. For a collective robotic system to succeed at an application which involves the robots operating within a potentially hostile environment, with little or no human interaction, the robots will need to exhibit the property of long-term autonomy.

For the development of a collective robotic system that is capable of surviving autonomously for long periods of time, two properties in particular may be singled out as essential to the system’s success: fault tolerance and power management. A robot that experiences a fault in a collective robotic system and is not able to tolerate it will not only be damaging to itself, but will potentially have drastic system-wide consequences, detrimentally affecting the behaviour of other robots in the collective. Similarly, a robot that does not efficiently manage its usage of energy will greatly reduce both the longevity of the system and the system’s ability to complete its task, not to mention the burden that it may impose on other robots within the system. In a system that permits the formation of self-reconfiguring robots, individual units are far more dependent on one another than when operating as a swarm or as individuals; consequently, the problems associated with a robot’s inability to tolerate faults, or its inefficient use of energy, are amplified.

It is suggested in this report that one way of providing these properties of fault tolerance and efficient power management is to take inspiration from the biological concept of homeostasis. The term homeostasis describes the ability of an organism to maintain a stable internal environment, even in the presence of considerable ambient variation. In the human body, homeostasis is generally considered to be maintained through the combined efforts of the nervous, endocrine and immune systems. The immune system provides the majority of the focus of this report, but other approaches to maintaining homeostasis are not yet ruled out and it is anticipated that the best method of ensuring fault tolerance and efficient power management in a collective robotic system will take inspiration from a variety of different sources.

This report is structured as follows. Chapter 2 introduces some of the necessary background information, beginning with an introduction to the SYMBRION project and details of its associated robotic platform, ‘Symbricator’, upon which future work will be implemented. Chapter 2 also covers a basic introduction to the topics of robot reliability and general fault tolerance. Chapter 3 then reviews some relevant previous work, beginning with a general review of homeostasis and artificial homeostatic systems, before moving on to a more focused review of the field of artificial immune systems. Chapter 4 outlines the research aims going forward and finally, chapter 5 describes the existing progress that has been made towards these goals.

Figure 1.1: Various forms of collective robotic system: (a) swarm robotic system, (b) nanorobotics system, (c) re-configurable robotic system, (d) collective robotic system.

2 Background

2.1 Symbiotic Evolutionary Robot Organisms

Symbiotic Evolutionary Robot Organisms, the SYMBRION¹ project, is a large-scale EU-funded project involving ten different partners from various research institutions across Europe. The aim of the project is to investigate a variety of novel approaches for the control of a custom built collective robotic system. The majority of the approaches involved take some form of inspiration from biology; the research areas of evolutionary computation, artificial development and artificial immune systems, in particular, provide a large amount of the focus.

The SYMBRION project is partnered with another EU project known as REPLICATOR² (Robotic Evolutionary Self-Programming and Self-Assembling Organisms). The REPLICATOR project also involves ten different partners, three of which are shared with SYMBRION, leading to a total of 17 partners across the consortium. The two projects, though differing in their approaches and focus, share a common hardware platform known as Symbricator (SYMBRIon and repliCATOR).

The Symbricator platform is unique in that the robots possess the ability to not only operate as individuals or as members of a swarm but also to physically aggregate with one another, forming ‘artificial robotic organisms’ in a manner not seen in other collective robotic systems. Further flexibility is added to the platform by the fact that, when docked to one another, robots may symbiotically share energy and computational resources. A mock-up of how the system was originally expected to look and operate is provided in figure 2.1. Such is the adaptability of the envisaged system that in figure 2.1 a robotic organism is shown completing a task, in this case climbing over an obstacle, that may be impossible for an individual robot or an organism with a different morphology. A central challenge of the project is how to design controllers that exploit the capabilities of this highly flexible platform and demonstrate the kind of adaptability depicted in figure 2.1.

Utilising this unique platform, the SYMBRION and REPLICATOR projects are driven by two ‘Grand Challenges’. Towards the end of the projects, using the various approaches developed throughout, these two grand challenges will be used to demonstrate the adaptation and evolution of artificial robotic organisms. Through their situation in a dynamic environment, it is envisaged that to survive and prosper, individual robots and larger robotic organisms will need to exhibit the properties of self-organisation, self-reconfiguration and self-healing; co-evolving and cooperating with each other to ensure the continued survival of the collective.

This section begins by introducing the Symbricator platform, before going on to describe the two grand challenges that are currently driving the research of the SYMBRION and REPLICATOR projects.

2.1.1 Symbricator Platform

The Symbricator platform may be described as a type of heterogeneous collective robotic system. As shown in figure 2.2, the platform consists of three different types of robot, each of which has different specialities and, accordingly, is equipped with different components. The three types of robot are similar in size and united in so far as they share a common form of docking connector, allowing them to share energy and computational resources with each other and form robotic organisms that consist of multiple types of robot. A 3D robotic simulator is also under development. Originally named Symbricator3D, a reduced version of the software has now been released as open source under the title Robot3D³. Each of the three types of robot is now introduced in turn, including basic details of their specialities and the components they possess, following which a brief description of the Robot3D simulator is provided.

Backbone Robot

The backbone robot is the leftmost of the prototype robots shown in figure 2.2a. More so than the other types of robot, the backbone robot is specialised to operate as a module within an artificial robotic organism. To suit its purpose the robot possesses four docking connectors, one on each of its vertical faces (i.e. excluding the top and bottom), and the entire body of the robot can rotate like a hinge. It is easy to imagine how these functionalities might be useful in the formation of a snake-like organism, or as a joint in the leg of a hexapod-style robot. Further suited to its task, the backbone robot has a very strong and stable structure, with actuators that are powerful enough to lift several other docked units. As an individual, the backbone robot has a rather unique form of locomotion. As can be seen in figure 2.2b, the robot possesses two ‘screw drives’ on its underside, which allow the robot to move omnidirectionally, depending on the direction and rotational speed applied to each screw. The choice of screw drives as the robot’s method of planar locomotion was also guided by the main purpose of the robot. By allowing the robot to move omnidirectionally, the presence of the screw drives makes the task of aligning and docking with other modules, on any of the robot’s four docking sides, much easier.

¹ http://www.symbrion.eu [accessed 10 November 2010]
² http://www.replicators.eu [accessed 10 November 2010]
³ https://launchpad.net/robot3d [accessed 10 November 2010]

Figure 2.1: A mock-up of an artificial robotic organism, produced during the early stages of the project.

Figure 2.2: Prototypes of the three different Symbricator robots. Figure (a), from left to right, shows prototypes of the backbone, active wheel and scout robots. Figure (b) shows how the three types of robot may be connected to form a simple artificial robotic organism.

Figure 2.3: Images from the Robot3D simulator built specifically for the SYMBRION and REPLICATOR projects and now released as open source software. Figure (a) shows 26 backbone robots forming an artificial robotic organism. Figure (b) shows an accurate model of the scout robot (foreground) and an older, less accurate, model of a backbone robot (background); the robots are not correctly scaled, as in real life they are the same size.

Scout Robot

The scout robot, another form of cubic robot, is the rightmost of the robots in figure 2.2a. Appropriately, for scouting and surveillance tasks, the robot is much more agile than the backbone robot and far better suited to operating in individual and swarm modes. Two caterpillar tracks, controlled by a differential drive system, provide the scout robot with a more standard method of locomotion. The scout robot is both faster and lighter than the backbone robot, and although the current revision of the hardware is not quite as flexible as the early prototype depicted in figure 2.2a, due to the position of its tracks, the robot should be able to operate with ease whilst oriented on at least two of its six sides. The scout robot does not have a hinge mechanism like the backbone robot, but it does possess a moving arm which, although much weaker than the lifting mechanism of the backbone robot, may still be able to lift or drag single modules. The ability of the scout robot to help move single modules highlights another important potential task of the robot, that being the fast recruitment of backbone robots to a site at which a robotic organism may be formed.
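As an illustration of what a differential drive system implies for the scout robot's motion, the following is a minimal sketch of the standard differential-drive (unicycle) kinematic update. It is not taken from the Symbricator controller: the function name, the track separation and the speed values are all hypothetical, and track slip is ignored.

```python
import math


def differential_drive_step(x, y, heading, v_left, v_right, track_separation, dt):
    """Advance a differential-drive pose by one Euler step.

    All parameters are illustrative assumptions; the report does not
    specify the scout robot's dimensions or control interface.
    """
    # Forward speed of the body centre is the mean of the two track speeds.
    v = (v_left + v_right) / 2.0
    # Turn rate (rad/s) comes from the speed difference across the tracks.
    omega = (v_right - v_left) / track_separation
    x += v * math.cos(heading) * dt
    y += v * math.sin(heading) * dt
    heading += omega * dt
    return x, y, heading


# Equal track speeds drive straight; opposite speeds turn on the spot.
straight = differential_drive_step(0.0, 0.0, 0.0, 1.0, 1.0, 0.1, 1.0)
spin = differential_drive_step(0.0, 0.0, 0.0, -0.05, 0.05, 0.1, 1.0)
```

Driving the tracks at equal speed moves the robot in a straight line, whilst equal and opposite speeds rotate it about its centre, which is the property that makes this drive type a common choice for agile individual robots.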

Active Wheel Module

Completing the trio is a type of robot referred to as the active wheel module, which is shown in the centre of figure 2.2a. The active wheel has a very different shape to the other two modules, with its entire body capable of acting as a hinge. A potential use case of the active wheel’s hinge as a lifting mechanism can be seen in both parts of figure 2.2: in part (a) the robot is shown lying flat and in part (b), the other two modules having docked with it, the active wheel is shown bent, lifting the other modules off the ground. Like the backbone robot, the active wheel has strong actuators and also possesses a form of omnidirectional drive. The active wheel, however, is faster and more mobile than the backbone robot. This fact, combined with the robot’s ability to pick up other modules, may allow the active wheel to be used to transport other modules to different areas of the environment without them having to expend much energy. Another use of the active wheel is as an energy store, from which the other robots may recharge themselves. Since the active wheel is larger than the other robots it is capable of storing more energy. Furthermore, unlike the other robots, whose only method of obtaining energy is through their docking connectors, the active wheel may possess solar panels which allow it to passively gain energy throughout normal operation.

Robot3D

The Robot3D simulator, as shown in figure 2.3, allows members of the consortium to develop controllers without the need for access to the robotic hardware itself, which is still very much under development. Based upon the Delta3D¹ game engine and various other open source technologies, the simulator contains accurate physical models for both of the Symbricator cuboid robots but, as yet, not the active wheel. Simulating environments with accurate physics, and accurate models of the robots, allows members of the consortium to run evolutionary computation experiments that, due to time constraints and the risk of damaging the hardware, would simply not be possible on board the real robots. Although it would be naive to expect that any controllers developed for the simulator will transfer directly, glitch free, to the hardware, whilst the hardware is still under development the simulator provides an essential intermediate for the design of algorithms.

¹ http://www.delta3d.org/ [accessed 10 November 2010]

2.1.2 Grand Challenges

As mentioned in chapter 1, the future practical applications for swarm and self-reconfigurable robotic systems are numerous; popular examples include search and rescue operations, space exploration and the clean-up of hazardous waste. In these situations, where the environment may be inaccessible or simply too dangerous for humans, the deployment of a collective robotic system such as Symbricator could increase the chances of success whilst at the same time reducing the risk to human life. In order for robotic systems to perform well in these situations they must be able to tackle a central problem in robotic research: long-term autonomy. To solve this problem, robotic systems must exhibit a high degree of adaptivity, both to changes in their external environment (weather or lighting conditions) and to changes in their physical self (broken sensors or motors) or internal state (battery life). To address these issues the consortium have proposed two unique but complementary “Grand Challenges”, both of which are discussed shortly. A typical environment in which the challenges may take place is depicted in figure 2.4. Beyond the scope of the SYMBRION and REPLICATOR projects these challenges can also serve as a benchmark for the fields of swarm, self-reconfigurable and evolutionary robotics.

Figure 2.4: A sketch of the type of environment that will be used for the Grand Challenges. The pink cubes are robotic modules and the yellow, green and blue squares are power sockets.

GC1: 100 Robots 100 Days

The first of the Grand Challenges is heavily focused on the long-term autonomy of collective robotic systems. As its name suggests, the challenge will involve placing 100 robots in a room for 100 days and then observing whether or not they are able to survive autonomously without any human intervention or maintenance. The environment in which the robots will be deployed will contain several power sources placed in locations of varying accessibility. Some power sources will be accessible to individual robots whilst others will not; for example, they may be placed too high for a single robot to reach, or enclosed or blocked by structural gaps and obstacles (see figure 2.4). To reach sources inaccessible to individuals the robots will be required to cooperate, self-organise and re-configure themselves into artificial organisms. The quantity and quality of the energetic resources will change over time. In the beginning, by taking advantage of power sources that are available to individual non-cooperating robots, the environment will be relatively easy for the robots to survive in, but over time these sources will become more scarce and the robots will need to adapt to the increasingly difficult environment if they are to survive. Further to the decreasing quantity of energy sources, some sources will decrease in quality by intermittently turning themselves on and off; organisms which are able to predict and track these changes will stand a better chance of survival.
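To make the effect of an intermittent power source concrete, the following toy sketch models a single docked robot whose charge drains at a constant rate and is topped up only while a periodically switching socket is on. It is an illustrative assumption throughout: the challenge specification gives no switching schedule, charge rates or capacities, so every name and number here is hypothetical.

```python
def source_is_on(t, period, on_fraction):
    """Hypothetical intermittent socket: on for the first part of each period."""
    return (t % period) < on_fraction * period


def survives(initial_charge, capacity, drain, charge_rate, period, on_fraction, steps):
    """Return True if the robot's charge stays above zero for `steps` time steps.

    Each step the robot spends `drain` units of energy, then gains
    `charge_rate` units (capped at `capacity`) if the socket is on.
    """
    charge = initial_charge
    for t in range(steps):
        charge -= drain
        if source_is_on(t, period, on_fraction):
            charge = min(capacity, charge + charge_rate)
        if charge <= 0:
            return False
    return True


# Same robot and socket period; only the fraction of time the socket is on differs.
reliable_socket = survives(10, 10, 1, 3, 10, 0.5, 100)
degraded_socket = survives(10, 10, 1, 3, 10, 0.1, 100)
```

In this sketch the robot survives when the socket is on for half of each period but runs out of charge when the socket is on for only a tenth of it, which is the kind of decline in source quality that the robots would need to anticipate.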

GC2: The Origin of Species

The second Grand Challenge focuses more on adaptivity as a product of evolution. An open-ended and unbounded approach to evolution will be taken, with no explicit fitness function or stopping criteria. This differs from traditional evolutionary computing techniques, in which selection and variation are managed by some pre-determined problem-specific objective function. In an open-ended approach, the environment (including the other robots) acts as an implicit fitness function. The experimental setup will be similar to that of the first Grand Challenge but the on-going evolution of the robots will lead to a much more dynamic environment. The robots will be required not only to adapt to the changing energy resources and structural features of the environment but also to the behaviours of their fellow robots as they cooperate and compete for survival. The challenge will start with the evolution of simple robotic behaviours and then move on to a more difficult problem, the co-evolution of different robotic species. The question of where to draw the line between one robotic species and another is a difficult one; even in biology there is no consensus on the categorisation of species. Some measure of similarity between genetic information could be used to define a species, or a species could be defined as the grouping of all organisms that are able to successfully reproduce. Whatever scheme is used to define a robotic species, it is likely to require variation at the genetic level, and so in order to allow for the origin of new species the genome structure must not be restricted, as is often the case in traditional evolutionary computation. Through the open-ended approach to evolution this challenge hopes to address two specific problems: the origin of new robotic species and the emergence of self-regulation and homeostasis. Other issues that can be addressed by this challenge include: the evolution of individual and collective strategies; the evolution of different organism morphologies; and the appearance and diversity of multi-cellularity in robot organisms.

[Figure 2.5. Labels shown in the figure: Failures: Physical (Effector, Sensor, Control System, Power, Communications) and Human (Design, Interaction: Mistakes, Slips); Repairability: field-repairable, non-field-repairable; Impact: terminal, non-terminal.]

Figure 2.5: A taxonomy of robot failures from Carlson et al. (2004) and Carlson and Murphy (2005). Of most relevance to this report are physical failures and the impact they may have on a system.

2.1.3 Summary

The SYMBRION project is concerned with the development of novel bio-inspired approaches to the control of collective robotic systems. The unique platform used by the project, known as Symbricator, allows three different types of robotic module (the backbone, scout and active wheel) to operate in three different modes: alone, as part of a swarm or as part of an artificial robotic organism. Though the three modules are heterogeneous, they share a common docking interface, ensuring that docked robots may share power and computational resources. Though all three robots may operate in any of the three modes, each has its own specialities. The backbone robot is well suited to act as a member of a large robotic organism, whilst the scout and active wheel modules are better suited to acting alone or as a member of a swarm, performing duties such as surveillance, in the case of the scout robot, and transport or servicing, in the case of the active wheel.

Two “Grand Challenges” have been proposed for the SYMBRION project. The first challenge, ‘100 robots 100 days’, is concerned primarily with the long-term autonomy of large-scale collective robotic systems, whilst the second, ‘The Origin of Species’, is more concerned with robotic evolution. The two challenges combined will serve as the driving force for the remainder of the project.

2.2 Robot Reliability

In chapter 1, two properties in particular were identified as essential to the long-term survival of a collective robotic system: fault tolerance and power management. These properties are important because they can both help reduce the chances of robotic failures. Before attempting to provide a system with fault tolerance and efficient power management, however, it is important to understand when, how and why robots fail. This section begins by discussing the reliability of individual robots, before moving on to the reliability of collective robotic systems, and discussing whether or not, through redundancy, such systems are inherently reliable.

2.2.1 Robot Failure Modes

To help determine the type and frequency of failures that can be expected within mobile robots, Carlson and Murphy (2003) and Carlson et al. (2004) analysed, in total, three years’ worth of usage logs at the University of South Florida’s Center for Robot-Assisted Search and Rescue (CRASAR). Data from 13 robots was analysed, gathered under both laboratory conditions and field conditions where the robots carried out “urban search and rescue” and “military operations in urban terrain” style tasks. The standard industry measures of Mean Time Between Failure (MTBF) and Availability (the probability that a system will be error free at some given point in time (Carlson et al., 2004)) were used to assess the robots’ reliability.
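As a minimal sketch of how these two measures might be computed from a usage log, the following Python assumes a hypothetical log format (a list of periods, each marked as operating or failed) and uses the common estimates of MTBF as operating time divided by the number of failures, and of availability as operating time divided by total time. Neither the log format nor the numbers are taken from the CRASAR data, and the original studies may have computed the measures differently.

```python
from dataclasses import dataclass


@dataclass
class Period:
    """One entry of a hypothetical usage log."""
    hours: float     # duration of the period
    operating: bool  # True while the robot runs, False while it is failed


def mtbf(log):
    """Estimate Mean Time Between Failure as operating hours per failure.

    Each non-operating period is counted as one failure (an assumption).
    """
    uptime = sum(p.hours for p in log if p.operating)
    failures = sum(1 for p in log if not p.operating)
    if failures == 0:
        return float("inf")  # no failure observed in this log
    return uptime / failures


def availability(log):
    """Estimate availability as the fraction of logged time spent operating."""
    total = sum(p.hours for p in log)
    uptime = sum(p.hours for p in log if p.operating)
    return uptime / total if total > 0 else 0.0


# Illustrative data only: three operating periods, each ended by a failure.
example_log = [
    Period(10, True), Period(6, False),
    Period(14, True), Period(8, False),
    Period(12, True), Period(10, False),
]

print(mtbf(example_log))          # 36 operating hours over 3 failures
print(availability(example_log))  # 36 operating hours out of 60 logged
```

With these invented figures the robot operates for 36 of 60 logged hours and fails three times, so the estimates are an MTBF of 12 hours and an availability of 0.6.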

Failures were classified according to the taxonomy introduced in Carlson et al. (2004) and later Carlson and Murphy (2005), reproduced in figure 2.5. Since the focus of this report is on autonomous robotic systems, only the five types of physical failure in figure 2.5 are of relevance, those being failures of: effector, power, control system, sensor and communication functions. Repairability, which in Carlson et al.’s taxonomy refers to repairs that may be carried out by a human operator, is also not relevant in this context.

The analysis of Carlson and Murphy (2003) and Carlson et al. (2004) showed the MTBF to lie somewhere between 8 and 24 hours, with an availability of around 54%; for both measures robots were seen to be much less reliable under field conditions. In terms of the types of failures observed, Carlson and Murphy (2003) found failures of effector functions to be the most common, with examples including tracks slipping and wheels warping due to heat. In Carlson et al. (2004), however, which extended the previous study with an extra year of data, failures of the control system were found to be the most common. In a meta-study of ten different studies into the reliability of field-operating Unmanned Ground Vehicles (UGVs) (including data from the original CRASAR studies), effectors and control systems were again found to be the least reliable components, and tracked robots, it was observed, failed more often than wheeled robots (Carlson and Murphy, 2005). The least likely components to fail were found to be sensors and power systems. It is suggested by Carlson and Murphy that this is due to the fact that, in contrast to control systems and effector functions, which may be specific to their associated platform, power systems are relatively simple, and sensors are often well established and mass produced. It should be noted that the power system of the Symbricator platform, introduced in section 4.2.1, is highly specific to the platform and anything but simple; hence Carlson and Murphy’s observations cannot be assumed to hold true for this platform.

The reliability analysis of the original CRASAR studies and the meta-study of Carlson and Murphy (2005) only considered individual robots, operating alone or semi-autonomously with human interaction. More relevant to the SYMBRION project is the reliability of collective robotic systems. At least since early advocates such as Brooks and Flynn (1989), and perhaps even longer, it has been claimed that due to their inherent redundancy multi-robot systems are (on the whole) more reliable than single robot systems. As highlighted by Stancliff and Dolan (2006), however, there is little quantitative evidence to support this argument. To help provide some evidence for the claim that multi-robot systems are inherently reliable, Stancliff and Dolan model failures in multi-robot teams carrying out a planetary exploration mission, and assess the trade-off between the length of operation, the reliability of components and the cost of the system. Various experiments were run using Stancliff and Dolan’s models, all of which supported the claim that, at an equal or reduced cost, larger multi-robot teams of unreliable individuals may be more efficient and more reliable than smaller robot teams with more reliable individuals. Specifically, Stancliff and Dolan showed that with greater numbers of robots there is an increased chance of mission completion, but with diminishing returns: after a certain point adding more robots does not lead to a big advantage, unless the duration of the mission is increased. Furthermore, larger robot teams with significantly less reliable components have a greater chance of completing the mission, and do so more quickly, than smaller robot teams with more reliable components; however, as the length of the mission is increased the gap is reduced. A small criticism of Stancliff and Dolan’s models is that they do not take into account what would happen if a failure occurred mid-task and, furthermore, that they model failures as all-or-nothing: if a single subsystem fails, the whole robot fails. These simplifications are not realistic and, what is more, they do not allow the model to take into account the potentially drastic effect that a partial failure or a partially completed task may have on the other robots within the system.
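The redundancy argument can be illustrated with a simple calculation. The sketch below is not Stancliff and Dolan’s model; it assumes, purely for illustration, that robots fail independently, that each robot either survives the whole mission or fails completely (the same all-or-nothing simplification criticised above), and that the mission succeeds if at least a given number of robots survive. The function name and all figures are hypothetical.

```python
from math import comb


def mission_success_probability(n_robots, robot_reliability, robots_needed):
    """Probability that at least `robots_needed` of `n_robots` survive.

    Assumes independent, all-or-nothing robot failures, each robot surviving
    the mission with probability `robot_reliability` (binomial model).
    """
    p = robot_reliability
    return sum(
        comb(n_robots, k) * p**k * (1 - p) ** (n_robots - k)
        for k in range(robots_needed, n_robots + 1)
    )


# A small team of reliable robots versus a larger team of less reliable ones,
# where the task needs two working robots (illustrative numbers only).
small_team = mission_success_probability(2, 0.9, 2)
large_team = mission_success_probability(4, 0.8, 2)
print(small_team, large_team)

# Diminishing returns: each extra robot adds less when one survivor suffices.
gains = [
    mission_success_probability(n + 1, 0.8, 1)
    - mission_success_probability(n, 0.8, 1)
    for n in range(1, 5)
]
print(gains)
```

Under these assumptions the larger, less reliable team succeeds with probability of about 0.97 against 0.81 for the smaller team, and the benefit of each additional robot shrinks as the team grows. Partial failures, which the binomial model cannot represent, are exactly the case in which this optimistic picture may break down.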

With a very different approach to Stancliff and Dolan, Winfield and Nembrini (2006) do take into account the effects of partial failures on collective robotic systems. Building upon their notion of a ‘dependable swarm’ (Winfield et al., 2005) and utilising the Failure Mode Effect Analysis (FMEA) methodology, Winfield and Nembrini (2006) analyse the effects that various failures have on the behaviour of a robotic swarm carrying out a simple containment task. Six types of failure in particular are considered, classified as: motor, communications, avoidance sensor, beacon sensor, control system and complete system failures. Following analysis, Winfield and Nembrini (2006) conclude that robotic swarms do exhibit the property of robustness, allowing them to be thought of as reliable systems; however, whilst they are very tolerant to complete failures they are far less tolerant to partial failures. A motor failure in the containment task, for example, may lead to an anchoring of the swarm, since the failed robot will still be able to communicate with its neighbours but not move with them. An important point that may be taken from the work of Winfield and Nembrini (2006) is that although multi-robot systems may be considered reliable, in designing a swarm or collective robotic system it cannot simply be a case of adding more robots with the hope of increasing reliability. The effects that potential failures may have on the system must be taken into account. In particular, partial failures may be the most damaging to a swarm robotic system. To determine the effects of these failures Winfield and Nembrini advocate the use of FMEA, but this only tells the designer what might go wrong; to ensure that it does not go wrong, behaviours must be incorporated that detect and account for any potentially damaging failures.

2.2.2 Summary

The CRASAR studies into the reliability of mobile robots (Carlson and Murphy, 2003; Carlson et al., 2004), combined with the larger collated study of Carlson and Murphy (2005), suggest that platform-specific components, such as the control system and effectors of a robot, are those most likely to fail in a mobile robotic system. Simpler components such as mass-manufactured sensors were observed by Carlson and Murphy (2005) to be less likely to fail. In modelling the reliability of multi-robot teams, Stancliff and Dolan (2006) highlight the advantages of collective robotic systems in terms of reliability, cost and efficiency but, importantly, do not take into account the effects of partial failures. Winfield et al. (2005), utilising failure mode effect analysis, do take into account partial failures, and conclude that they may be far more damaging to the reliability of a collective robotic system than complete failures. Consequently, in the design of a collective robotic system, to ensure the long-term survival of the system, the reliability of the individual components and by extension the individual robots themselves must be carefully considered. Where failures can be detected, attempts must be made to reduce their effects.

2.3 Fault Tolerance

Fault tolerance can be thought of as the ability of a system to continue operating in the presence of faults, albeit with a potentially reduced level of performance. By this definition, swarm and collective robotic systems in general may be described as fault tolerant, since the failure of a few individuals does not necessarily imply the failure of the system as a whole. More specifically, this type of fault tolerance may be described as implicit fault tolerance (Christensen et al., 2007), due to the fact that the control of the system is not altered when a fault occurs; fault tolerance is merely an emergent property of the adaptive nature of the controller. As highlighted by Winfield et al. (2005), in a collective robotic system certain partial failures may have drastic system-wide effects, and hence implicit fault tolerance may not always be sufficient to ensure the reliability of a collective robotic system. In cases where implicit fault tolerance is not sufficient, further levels of control may be necessary to first detect and then minimise the effects of faults. This more explicit form of fault tolerance is the main subject of this section. Before introducing fault tolerance in more detail it is important to provide some general definitions and terminology that will be used throughout this section.

2.3.1 Definitions and Terminology

Whilst the general definition of fault tolerance is widely accepted in the literature, there is sometimes confusion in the definition of the terms: fault, failure, error and anomaly. In this section, a classification based upon that of Laprie (1995) and Carlson and Murphy (2005) is used. In this classification, a failure is an event, an error (considered equivalent to an anomaly) is a state and a fault is a cause. More specifically, a failure is considered to be an event observable as the system doing something that it is not supposed to be doing, caused by an error or anomaly in a particular component or subsystem. An error may be thought of as a deviation in the state of a particular component from normal, and finally, a fault is anything that may cause an error or anomaly. To reiterate with an example, consider a small mobile robot with an array of IR range sensors performing simple obstacle avoidance. If a fault occurs in one of the IR range sensors, it may cause that sensor to return a value which is biased by a constant amount from its true value; this is an error. If the control system then reads the value of the faulty sensor it will no longer be able to effectively perform obstacle avoidance, and hence the fault has manifested itself as a failure in the control system. Whilst the three terms (four including ‘anomaly’) appear tightly linked, it is important to note that although, by definition, a fault will always lead to an error, an error will only result in a failure if it becomes ‘activated’. If, for example, the control system of the imagined robot never reads the value of the faulty sensor, the failure will not occur. According to the terminology of Laprie (1995) an error that is not yet activated is referred to as a latent error. Since there is often a direct mapping between these terms it is easy to understand how confusion arises. In this section these definitions will be strictly followed; however, in later sections this restriction is lifted.

It is hard to remove a fault without replacing the faulty component, and an error will persist as long as the fault is present; however, if the fault can be isolated or the value of the error can be corrected, then it is entirely possible to reduce or completely remove a failure. By following a three-step process incorporating detection, diagnosis and recovery, the reduction or removal of failures is precisely what explicit fault tolerance attempts to achieve. Though detection and diagnosis are almost always closely linked and in the literature often seem to take precedence over recovery, the relative importance of the three steps may vary from system to system. In the remainder of this section, each of the three steps of fault tolerance is considered in turn and some examples of the various possible approaches are highlighted. Of great importance to all three steps are the concepts of redundancy and degeneracy: redundancy being the presence of multiple identical units which perform the same function, and degeneracy being the presence of multiple structurally dissimilar units which perform the same function (Edelman and Gally, 2001).

2.3.2 Fault Detection and Diagnosis

In the development of fault-tolerant control systems such as those used in aviation and other safety-critical systems, detection and diagnosis are often referred to as part of the same Fault Detection and Diagnosis (FDD) process (Zhang and Jiang, 2008). There are a number of different approaches to fault detection and diagnosis. Zhang and Jiang (2008) provide an extensive classification that generally categorises methods as either model-based or data-based; the term model-free is also sometimes used to refer to non-model-based approaches (Heredia et al., 2008). Model-based approaches require the construction of a model that describes the normal behaviour of the system; at run-time the difference between the actual system output and that of the model can then be used to detect the presence of faults. In Zhang and Jiang’s area of interest, fault-tolerant safety-critical control systems, the majority of approaches to FDD are model-based, for which Zhang and Jiang provide many examples. Of more interest here are the methods applied to mobile, and in particular collective, robotic systems. Zhuo-hua et al. (2005) survey various methods of fault detection and diagnosis for wheeled mobile robots; again model-based approaches are well represented. In particular, three areas of interest are singled out by Zhuo-hua et al., these being: multiple model, particle filtering and sensor fusion based techniques.
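The model-based idea described above, comparing the actual system output with that of a model of normal behaviour, can be sketched in a few lines of Python. The example is a deliberately simple illustration rather than any of the surveyed methods: the linear wheel-speed model, the threshold and the data are all hypothetical.

```python
def detect_faults(model_outputs, measured_outputs, threshold):
    """Return the time steps at which the residual exceeds the threshold.

    The residual is the absolute difference between the output predicted by
    the model of normal behaviour and the output actually measured.
    """
    return [
        t
        for t, (expected, actual) in enumerate(zip(model_outputs, measured_outputs))
        if abs(actual - expected) > threshold
    ]


def wheel_speed_model(command):
    """Hypothetical model of normal behaviour: speed proportional to command."""
    return 0.5 * command


# Illustrative data: the wheel stops responding properly from step 3 onwards.
commands = [1.0, 2.0, 2.0, 3.0, 3.0]
measured = [0.52, 0.98, 1.01, 0.40, 0.35]

predicted = [wheel_speed_model(c) for c in commands]
faulty_steps = detect_faults(predicted, measured, threshold=0.2)
print(faulty_steps)
```

The choice of threshold trades false alarms against missed detections; in practice it would need to reflect the sensor noise and model error of the particular platform.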

Various authors have utilised neural networks and supervised learning techniques to perform fault detection and diagnosis. In the novel exogenous fault detection approach of Christensen et al. (2007), a ‘leader’ robot was able to detect faults in a ‘follower’ robot using a time-delay artificial neural network. In Christensen et al. (2008) the approach is applied to two further tasks: docking with another robot and finding and following a perimeter. A similar neural network based approach is described by Wang et al. (2009), in which a recurrent neural network is trained to model the sensors of an underwater robot, and detection is performed at run-time by comparing the output of the network with the sensor outputs. Fault detection on board another novel platform was investigated by Heredia et al. (2008), who, using a model-based approach, detected actuator and sensor faults on board small autonomous helicopters.

Hashimoto et al. (2008) describe an approach to fault detection that is based on voting between redundant and degenerate sensors. Specifically, three laser range sensors and one dead reckoning module estimate the current velocity of a robot and vote accordingly; any sensor that is not in agreement with the others is considered to be faulty. Hashimoto et al.’s approach was also extended to a multi-robot system (Hashimoto et al., 2009) in which both velocity and position within the group were estimated.
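A voting scheme of this general kind might be sketched as follows. This is not Hashimoto et al.’s decision rule, which is not detailed here; as an assumption, the consensus is taken to be the median of the velocity estimates and a sensor is flagged when it deviates from that consensus by more than a tolerance. Sensor names, values and the tolerance are hypothetical.

```python
from statistics import median


def vote_on_faulty_sensors(estimates, tolerance):
    """Flag sensors whose velocity estimate disagrees with the consensus.

    `estimates` maps a sensor name to its velocity estimate. The consensus is
    the median of all estimates (an assumed rule, chosen for its robustness
    to a single outlier).
    """
    consensus = median(estimates.values())
    return sorted(
        name
        for name, value in estimates.items()
        if abs(value - consensus) > tolerance
    )


# Three redundant laser sensors plus one degenerate dead reckoning module.
velocity_estimates = {
    "laser_1": 0.50,
    "laser_2": 0.52,
    "laser_3": 0.91,  # disagrees with the other three
    "dead_reckoning": 0.49,
}

suspects = vote_on_faulty_sensors(velocity_estimates, tolerance=0.1)
print(suspects)
```

A median-based vote of this kind can only isolate a faulty sensor while the faulty sensors remain a minority, which is one reason why both redundancy and degeneracy are useful.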

2.3.3 Fault Recovery

Fault recovery is generally a simpler process than fault detection and diagnosis, especially when degeneracy or redundancy is employed. If a system includes several redundant or degenerate components, then if one component fails a suitable form of recovery may be simply to isolate and ignore that component and rely on the output of other functionally similar components. If more information is known about the fault, it may be possible to perform error correction. If, for example, a faulty sensor exhibits an error that causes it to return a value that is a constant amount away from what it should be, the error can be corrected simply by accounting for this bias.
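Both forms of recovery, isolation and error correction, can be sketched together. The following Python is an illustrative assumption rather than a method from the literature reviewed: readings from functionally similar sensors are averaged, a sensor diagnosed as faulty is ignored unless its constant bias is known, in which case the bias is subtracted. All names and values are hypothetical.

```python
def recover_reading(readings, faulty, known_bias=None):
    """Fuse sensor readings while recovering from diagnosed faults.

    readings:   maps sensor name to its raw value
    faulty:     set of sensor names diagnosed as faulty
    known_bias: optional map from faulty sensor name to its constant bias
    """
    known_bias = known_bias or {}
    usable = []
    for name, value in readings.items():
        if name not in faulty:
            usable.append(value)                     # healthy sensor
        elif name in known_bias:
            usable.append(value - known_bias[name])  # error correction
        # otherwise the faulty sensor is isolated and ignored
    if not usable:
        raise ValueError("no usable sensors remain")
    return sum(usable) / len(usable)


# Illustrative IR range readings in metres; ir_right is diagnosed as faulty.
ir_readings = {"ir_left": 0.30, "ir_centre": 0.32, "ir_right": 0.81}

isolated = recover_reading(ir_readings, faulty={"ir_right"})
corrected = recover_reading(
    ir_readings, faulty={"ir_right"}, known_bias={"ir_right": 0.5}
)
print(isolated, corrected)
```

Isolation needs only the identity of the faulty component, whereas correction also needs a diagnosis of the size of the error, which illustrates why recovery depends on the quality of the preceding detection and diagnosis steps.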

2.3.4 Summary

Approaches to fault tolerance may be classified as either implicit or explicit. With implicit approaches, fault tolerance is a consequence of the adaptiveness of the controller rather than any specific procedure that acts to remove or minimise the effects of faults. Explicit fault tolerance, on the other hand, can be thought of as a three-step process, incorporating detection, diagnosis and recovery. In discussing fault tolerance, the definitions of the terms fault, failure, error and anomaly must be clarified. Faults cause errors or anomalies in system components. An error is a deviation of a component from its normal state and, until it is activated, is described as latent. When activated, an error results in a failure, which is the observed event of the system doing something that it is not supposed to. Redundancy, the presence of multiple identical units which perform the same function, and degeneracy, the presence of multiple structurally dissimilar units which perform the same function, are important across all three steps of fault tolerance. There are a variety of approaches to fault detection and diagnosis, which may generally be classed as model-based or model-free. There are fewer approaches to fault recovery, since the task is conceptually easier, often simply requiring the isolation or removal of the faulty component.

3 Literature Review

3.1 Homeostasis

One of the most remarkable features of complex biological organisms is their ability to maintain a relatively stable internal environment, even in the presence of considerable ambient variation. This internal stability, achieved primarily through the interactions of the nervous, immune and endocrine systems, is referred to as homeostasis. One of the best examples of homeostasis is provided by the processes of thermoregulation. Put simply, thermoregulation is the ability of an organism to regulate and maintain its internal temperature within certain specific boundaries, regardless of the external temperature. In the human body, a number of physiological processes, such as sweating and shivering, contribute to the regulation of body temperature, ensuring that in spite of potentially considerable environmental changes, the internal temperature never strays far from 37 °C. Later in this section, after briefly outlining the history of the term ‘homeostasis’, the example of thermoregulation is revisited. Some of the main processes of thermoregulation (within the human body) are introduced in more detail and used to describe the general properties of homeostatic systems. Towards the end of this section the problem of engineering an artificial homeostatic system is considered and some previous homeostatically inspired work is reviewed.

3.1.1 History

Cannon

The term homeostasis was first coined by the American physiologist Walter Bradford Cannon in his 1926 paper entitled “Physiological regulation of normal states: some tentative postulates concerning biological homeostatics” (Cannon, 1926). The concept was built upon in Cannon (1929), a paper which later formed the main features of Cannon’s 1932 book “The Wisdom of the Body” (Cannon, 1932). The following definition of homeostasis comes from the opening chapter of “The Wisdom of the Body”:

“The coördinated physiological processes which maintain most of the steady states in the organism are so complex and so peculiar to living beings —involving, as they may, the brain and nerves, the heart, lungs, kidneys and spleen, all working coöperatively— that I have suggested a special designation for these states, homeostasis. The word does not imply something set and immobile, a stagnation. It means a condition—a condition which may vary, but which is relatively constant.”

As Cannon emphasises here, words like ‘steady’ and ‘stable’ do not imply that the internal state of an organism never changes. The internal environment of an organism may be constantly changing; however, these changes are often only relatively small, normally occurring within strict limits and tending always to return towards some set of fixed points. Cannon’s use of the words ‘coordinated’ and ‘cooperatively’ highlights another important aspect of homeostasis, specifically, how it is controlled. Homeostasis is achieved not by a single central control system but instead through the cooperation of many physically and functionally distinct units acting in a coordinated, distributed and self-organising manner.

Bernard

Although Cannon was responsible for devising the term ‘homeostasis’, he was not the first to consider the underlying properties of the concept. Claude Bernard, a French physiologist, was one of the first to discuss what is now referred to as homeostasis, frequently talking about the concept in terms of the stability of the internal environment (or le milieu intérieur). In Bernard (1878), as translated by Fulton (1966), Bernard states:

“stability of environment implies an organism so perfect that it can continually compensate for and counterbalance external variations. Consequently, far from the higher animals being indifferent to their surroundings, they are on the contrary in close and intimate relation to them, so that their equilibrium is the result of compensation established as continually and as exactly as if by a very sensitive balance.”

From the excerpt above it is clear that Bernard believed that the intimate relationship between an organism and its surroundings was of great importance to the maintenance of homeostasis. Bernard hints, without explicitly stating it, that there exist feedback mechanisms between the organism and its environment, with organisms continually compensating for and adapting to external changes. Take the example of thermoregulation: as the external temperature changes, the organism adapts in such a way as to counter these changes, by sweating or shivering for example. Though he was able to describe the general properties of homeostasis accurately, Bernard was not correct in his opinions on how homeostasis is maintained. As Cooper (2008) points out, Bernard believed that homeostasis was maintained in a centralised manner, with the nervous system alone responsible for turning on and off the various functions of the human body that balance the internal environment. As mentioned previously, homeostasis is now known to be maintained by a number of cooperating systems, acting in a decentralised manner. Bernard can be forgiven for his error though, for at the time he lived and wrote, many people viewed the nervous system as the highly centralised controller of body functions, and the endocrine system and hormones were yet to be discovered.

A more complete history of the term ‘homeostasis’ is given by Cooper (2008), including further details of the full contributions of both Cannon and Bernard. A good introduction to homeostasis in general can be found in the opening chapter of “Vander’s Human Physiology: The Mechanisms of Body Function” (Widmaier et al., 2006). Widmaier et al.’s use of homeostasis as the introduction to a book on physiology stresses the importance of the concept.

Summary

Cannon, in 1926, was the first to coin the term homeostasis, but was not the first to discuss the concept in general. Claude Bernard was one of the first to properly consider the stability of the internal environment of organisms. Though Bernard’s understanding of the control of homeostasis may have been wrong, his knowledge of the properties of the concept was largely correct. From Cannon and Bernard’s early descriptions of homeostasis to Widmaier et al.’s more recent accounts, one of the most important properties of homeostasis is found to be stability. Not to be confused with fixity, stability implies a system in which variables may change, but generally only do so within a fixed range and, when perturbed, always return towards some fixed point. This is homeostasis.

3.1.2 General Homeostatic Properties

Having outlined the history of homeostasis and introduced some of its basic properties, a more extensive introduction to the general properties of homeostatic systems is now provided. The running example of thermoregulation is used throughout and so this section begins with a more detailed description of some of the processes involved.

Thermoregulation of the human body

The following description of thermoregulation, adapted from Widmaier et al. (2006), outlines some of the physiological processes involved in maintaining the homeostasis of the human body’s internal temperature.

Consider a ‘lightly clad’ man situated in a room with a temperature of approximately 20 °C (so-called ‘room temperature’). The subject will maintain an internal temperature of around 37 °C because heat produced by the chemical reactions occurring within the cells of his body will offset any heat lost to the environment due to its lower temperature. The amount of heat being produced by the body is equal to the amount of heat being lost; there is no net gain or loss of heat. Imagine now that the temperature of the room is decreased to 5 °C. In this situation the rate of heat loss through the skin will increase, and if the rate of heat production remains the same then the body temperature will start to drop. As a result of the drop in internal temperature a number of homeostatic responses will come into effect. First, the blood vessels will constrict, reducing the amount of warm blood flowing through the skin and in turn reducing the amount of heat lost to the environment. Next, in a conscious attempt to reduce heat loss, the subject may assume a curled-up position or put on extra clothes; these are known as “voluntary” behavioural responses. If the balance between heat loss and heat production cannot be restored solely by reducing heat loss then mechanisms that increase heat production will be necessary. The body’s best automatic heat-producing response is shivering, but further to this the subject may also initiate voluntary heat-producing responses such as exercising. The combination of a decrease in heat loss and an increase in heat production that these homeostatic responses provide eventually leads to the return of body temperature to its original value. Once again, even though the external temperature is much colder, there is no net gain or loss of heat and homeostasis has been restored. It is also possible to imagine the opposite situation, when the temperature of the room is increased, leading to an increase in body temperature. To maintain homeostasis in this situation a combination of an increase in the rate of heat loss and a decrease in the rate of heat production is required. To increase heat loss, physiological processes such as sweating or the dilation of blood vessels come into effect. To reduce heat production, muscles are relaxed and less strenuous activities are voluntarily sought out; for example, in countries with hot climates a short midday nap or siesta is often taken.
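The balance between heat production and heat loss described above can be illustrated with a toy negative feedback simulation. The sketch below is not a physiological model: the units are arbitrary, the constants are invented so that production balances loss at a 20 °C room temperature, and the responses (a change in skin conductance standing in for vasoconstriction or dilation, and a change in heat production standing in for shivering or relaxation) are assumed to be simply proportional to the temperature error.

```python
SET_POINT = 37.0         # target body temperature in degrees Celsius
BASE_PRODUCTION = 17.0   # arbitrary units; balances heat loss at 20 degrees
BASE_CONDUCTANCE = 1.0   # arbitrary units of heat loss per degree difference


def simulate(ambient, steps=2000, dt=0.01, regulated=True):
    """Return the body temperature reached after `steps` Euler steps."""
    body = SET_POINT
    for _ in range(steps):
        error = SET_POINT - body  # positive when the body is too cold
        conductance = BASE_CONDUCTANCE
        production = BASE_PRODUCTION
        if regulated:
            # Too cold: constrict vessels (lower conductance) and shiver
            # (higher production). Too warm: the opposite responses.
            conductance = min(2.0, max(0.5, BASE_CONDUCTANCE - 0.5 * error))
            production = max(0.0, BASE_PRODUCTION + 40.0 * error)
        loss = conductance * (body - ambient)
        body += dt * (production - loss)
    return body


for room in (20.0, 5.0, 35.0):
    print(room, round(simulate(room), 2), round(simulate(room, regulated=False), 2))
```

With the responses enabled the simulated temperature stays within a fraction of a degree of 37 °C in both the cold and the warm room, whereas without them it drifts far from the set point. Because the assumed responses are purely proportional, the temperature settles close to, rather than exactly at, its original value.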

General properties

Making use of the same terminology as Widmaier et al. (2006), some of the general properties of a homeostatic control system are now introduced. To emphasise important points the example of thermoregulation is returned to throughout.

The most basic components of the homeostatic systems considered here are homeostatic or physiological variables, one example of which, body temperature, has already been introduced. There are a number of other physiological variables in the human body, including for example blood pressure and the levels of oxygen and glucose in the blood. If all the physiological variables of an organism are stable then the internal environment of the organism as a whole can be said to be stable. A variable can be considered stable if it is maintained within ‘acceptable limits’, regardless of external conditions. These acceptable limits can be thought of as marking the healthy range of a variable; in other words, values of a variable outside of its acceptable limits are those which could lead to (or be indicative of) damage to the organism. For example, the acceptable limits of human body temperature are considered to be approximately 35 °C to 40 °C: an internal temperature of below 35 °C may lead to hypothermia, whereas an individual with a temperature above 40 °C may experience heat stroke.

In introducing homeostasis and physiological variables Widmaier et al. emphasise the difference between stability and rigid fixity. Variables such as the internal temperature do not remain constant with respect to time; instead they may fluctuate within a set range. However, when they are perturbed from this range they will eventually be restored towards a baseline value. Human body temperature does not remain constantly fixed at 37 °C. Throughout the course of the day body temperature may fluctuate by around 1 °C; it is found to be hottest when a person is awake and coolest whilst they sleep. Other factors such as exercise and eating may also cause fluctuations, but over the course of the day the temperature will average out to approximately 37 °C. The gradual change of a variable such as body temperature over the course of a set period of time is known as a biological rhythm. The difference between the temperature of the body when awake and when asleep forms part of the circadian rhythm, which cycles over a 24 hr period and affects a large number of physiological variables.

The acceptable limits of body temperature are quite restrictive, allowing only for deviations of around 2-3 °C from the optimum temperature of 37 °C. In contrast to the restrictiveness of body temperature, it is perfectly normal for the blood glucose level measured after a meal to be double what it was beforehand (Widmaier et al., 2006). The different scales upon which physiological variables act and the fluctuations of single variables throughout the course of the day make it difficult to determine when any given physiological variable is within range and therefore acting homeostatically. To solve this problem Widmaier et al. propose the use of time-averaged means. Taking the example of blood glucose levels, because of its large fluctuations it may not be possible to predict the value of this variable at any given point in time; however, if repeated measurements are taken throughout the course of a day, the average of these values will be much more predictable.
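The idea of a time-averaged mean can be illustrated with a short sketch. The sinusoidal model of body temperature below is purely an assumption made for illustration (the function `body_temperature`, its 0.5 °C amplitude and its phase are hypothetical and are not taken from Widmaier et al.); it simply produces a value that swings by about 1 °C over 24 hours, peaking while the subject is awake.

```python
import math

def time_averaged_mean(samples):
    """Arithmetic mean of equally spaced measurements."""
    return sum(samples) / len(samples)

def body_temperature(hour):
    """Hypothetical circadian model: 37 degC baseline, +/-0.5 degC swing.

    The peak is placed at 16:00 and the trough at 04:00 (an assumption).
    """
    return 37.0 + 0.5 * math.sin(2 * math.pi * (hour - 10) / 24)

# One measurement per hour over a full day.
hourly = [body_temperature(h) for h in range(24)]
daily_mean = time_averaged_mean(hourly)
daily_swing = max(hourly) - min(hourly)
```

Any single reading may lie anywhere within the daily swing, whereas the mean over a whole cycle stays close to the baseline, which is why the averaged value is the more useful indicator of whether the variable is within range.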

The responsibility of a homeostatic control system is to keep all its physiological variables within their acceptable limits, in other words, to prevent the loss of homeostasis. To maintain homeostasis, homeostatic control systems make use of a number of regulatory responses that act so as to compensate for any changes in physiological variables that move those variables away from their acceptable ranges. In doing so, these regulatory mechanisms pull variables back towards what Widmaier et al. call set points. The set point of human body temperature, for example, is around 37 °C, and the regulatory mechanisms that pull the variable back towards this value when homeostasis is at risk of being lost are those such as sweating and shivering. It should be noted that set points are not fixed for the lifetime of an organism and it is possible for them to be re-set (even multiple times). When an individual has a fever the set point of body temperature is raised to reduce the proliferation of pathogens.

A common feature of many homeostatic control systems is feedback. More often than not regulatory responses will be guided by negative feedback, where an increase or decrease in the value of a particular variable leads to a response that tends to move the value of the variable in the opposite direction. Thermoregulation provides a good example of negative feedback: a decrease in temperature leads to shivering, a response that tends to increase the body temperature. Positive feedback mechanisms exist as well, which in response to an increase or decrease in the value of a particular variable lead to a response that tends to move the value of the variable in the same direction. Because of their inherent instability, positive feedback mechanisms are less common in nature than negative feedback mechanisms. It should be noted that the link between a change in a variable and the response which acts to negate (or multiply in the case of positive feedback) that change does not need to be direct; that is to say, a change in one variable may lead to a change in, possibly multiple, intermediate variables before bringing about the regulating response that restores the original variable.

Another common feature of homeostatic control systems, often seen together with feedback mechanisms, is feedforward regulation. Feedforward regulation acts like other regulatory responses so as to compensate for changes in the value of physiological variables. The important difference between feedforward regulation and other regulatory responses is that it acts before a change in the variable it is regulating occurs; in some sense it predicts the change and acts to counter it before the change actually occurs. By acting before the change occurs, feedforward regulation mechanisms further reduce the deviation from normal and increase the stability of the organism. Instead of acting upon changes to the variable itself, a feedforward mechanism acts upon cues which may signal an impending change to the variable. For example, the smell of food prepares the stomach for digestion and minimises the fluctuations in the internal environment that occur when food is digested.

The final property of homeostatic control systems mentioned here is acclimatisation. Acclimatisation is the improvement of an already existing homeostatic system due to prolonged exposure to certain environmental conditions. The thermoregulatory system is subject to acclimatisation; an example provided by Widmaier et al. describes the acclimatisation of the sweating mechanism in response to exercising in a hot environment. Widmaier et al. state that, if an individual is made to exercise in a hot environment for 1-2 hrs per day over the course of a week, at the end of the week they will start to sweat earlier and do so more profusely, and consequently their internal temperature will not depart as far from its set point. Acclimatisation is normally reversible, unless it occurs very early in life during the critical period of development, in which case it may be fixed.

Summary

The purpose of a homeostatic control system is to maintain the stability of an organism’s internal environment. The state of the internal environment of an organism can be described by a number of physiological variables; if all of these variables are stable, the internal environment of the organism can be said to be stable. For a variable to be stable it must be maintained within acceptable limits by the regulatory responses of the homeostatic control system. Environmental or internal changes may significantly alter the value of a variable. When this occurs regulatory responses come into effect; acting according to negative feedback, they compensate for the change in such a way as to bring the value of the variable back towards a certain set point. Over long periods of time physiological variables may fluctuate according to biological rhythms. This makes it harder to determine when a variable is acting homeostatically, but through the use of time-averaged means and knowledge of these biological rhythms the problem is lessened. Homeostatic systems are not fixed from birth but are subject to acclimatisation and can be adapted to suit environmental conditions; however, for some homeostatic systems, if these adaptations occur during the critical period of development they could be irreversible. Table 3.1 provides a summary of all the properties introduced in this section.

3.1.3 Engineering Artificial Homeostatic Systems

The remainder of this section reviews some of the existing work into engineering artificial homeostatic systems. The term ‘artificial homeostatic system’ used here encompasses both systems that are inspired by the biological components responsible for homeostasis, such as the nervous, immune and endocrine systems, and engineered systems that are based more abstractly on the general properties of homeostasis and homeostatic control systems. In keeping with the focus of this report, there is a special emphasis on work within the field of robotics.

Property                  Description
Physiological variables   any property of the internal environment of an organism
Biological rhythm         the normal variation of physiological variables over a set period of time, in a cyclic manner
Time-averaged means       average value of a physiological variable over a certain period of time
Regulatory responses      response which acts to maintain the homeostasis of a physiological variable
Set-points                point towards which physiological variables are drawn by regulatory responses
Negative feedback         tendency of a response to act so as to reduce the magnitude of a change to a variable
Positive feedback         tendency of a response to act so as to increase the magnitude of a change to a variable
Feedforward regulation    regulation of a variable in anticipation of a change, before that change occurs
Acclimatisation           ability of a homeostatic system to change itself to suit its environment
Critical period           period during which some homeostatic systems may become fixed

Table 3.1: Some of the basic properties of biological homeostasis.

Ashby’s Homeostat

Ashby’s Homeostat is a self-adapting electromechanical device, designed and built by William Ashby to demonstrate the interesting properties of a concept Ashby refers to as ultrastability. The Homeostat is described in detail by Ashby in his 1960 book: “Design for a Brain: The Origin of Adaptive Behavior” (Ashby, 1960). In his book, Ashby attempts to answer the question: “How does the brain produce adaptive behaviour?”. As an initial step towards this goal Ashby introduces the concept of ultrastability and then describes the Homeostat as a working physical example of an ultrastable system.

Throughout “Design for a Brain...” Ashby refers to the example of how a kitten behaves towards fire and describes how, over time, this behaviour may be called adaptive. A kitten, Ashby suggests, has no inbuilt behaviour which prevents it from acting inappropriately towards a threat such as fire. When a kitten first experiences a fire its behaviour is unpredictable: it may approach the fire without fear and attempt to touch or lick it, or alternatively it may cautiously ‘stalk’ the fire. In contrast, the behaviour of a fully grown cat towards fire is far more predictable. It will approach the fire and position itself a distance away such that it is not too hot and not too cold; if the fire burns down it will move closer, and if the fire burns hotter it will move further away. Its behaviour can be said to be adaptive. In essence the problem Ashby attempts to tackle in his book is to determine how learning occurs and why it leads to organisms that are more capable of adapting to their environment.

Before introducing ultrastability, Ashby first defines what he means by stability. Ashby’s views on stability are similar to those introduced in section 3.1.1 with reference to homeostasis. Stability, as described by Ashby, “in no way implies fixity or rigidity”, and is seen by Ashby as a property of the system as a whole and not a feature of any part in particular, often necessarily implying the coordination of multiple sub-units. Ashby also emphasises the importance of feedback, using Watt’s governor (Denny, 2002) and the thermostat as examples of stable systems controlled by feedback mechanisms.

Ashby goes on to discuss stability in terms of adaptation and how these two terms relate to Cannon’s notion of homeostasis. Ashby’s definition of an adaptive behaviour is simply one which maintains essential variables within physiological limits. Ashby’s “essential variables” are simply a subset of what have previously been referred to as “physiological variables”, the subset containing only the variables that are most pertinent to the survival of the organism. The similarity between essential and physiological variables means that Ashby’s definition of an adaptive behaviour looks remarkably similar to the definitions of homeostasis given earlier in this section. This link between adaptive behaviour and homeostasis is precisely the connection that Ashby was trying to make. In the same way that the homeostatic mechanisms of the human body, in maintaining physiological variables within their acceptable limits, can be described as adaptive, so too can the behaviours of the cat that regulates its body temperature in accordance with how hot the fire is burning. Earlier in this section the link was already made when describing voluntary behavioural responses such as putting on extra clothes to reduce heat loss.

Ashby’s definition of an ultrastable system then follows. An ultrastable system consists of four main components: the system controlling the organism, a set of adjustable parameters that define how the system functions, the environment in which the organism operates, and a number of essential variables with pre-determined physiological limits. The four components interact as shown in figure 3.1. The control system and the environment interact bi-directionally through sensory and motor channels, the environment affects the values of the essential variables, and the essential variables feed back to the system parameters, which in turn determine the behaviour of the control system. There are in effect two feedback loops: one acting directly between the organism and the environment through the sensory and motor channels, and one that acts on a longer timescale via the essential variables and the control parameters.

Considering the first feedback loop in isolation, that which acts directly between the control system and the environment, is enough to describe a stable system such as a thermostat. The purpose of a thermostat is to maintain the temperature of an object or area (call this its environment) at a certain set point. The thermostat works via a simple feedback loop, sensing the temperature of its environment and then acting upon it (heating or cooling) in such a way as to bring the temperature closer to the set point. For the specific task it was designed to do, a stable system such as a thermostat works well over a wide but well defined range of conditions. However, a single feedback loop of the form described above is not sufficient to account for all the interesting properties of an ultrastable system.
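The single feedback loop of a thermostat can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a model taken from Ashby: the proportional control law, the `gain` and `leak` constants and the function names are all hypothetical choices.

```python
def thermostat_step(temperature, set_point, ambient, gain=0.5, leak=0.1):
    """Advance the room temperature by one time step.

    control: heating or cooling proportional to the error (negative feedback).
    loss:    passive drift of the room towards the ambient temperature.
    """
    control = gain * (set_point - temperature)
    loss = leak * (ambient - temperature)
    return temperature + control + loss

def run(set_point=20.0, ambient=5.0, start=5.0, steps=50, gain=0.5):
    """Iterate the loop and return the final temperature."""
    temperature = start
    for _ in range(steps):
        temperature = thermostat_step(temperature, set_point, ambient, gain)
    return temperature

settled = run()             # negative feedback: converges near the set point
runaway = run(gain=-0.5)    # sign of the response reversed: diverges
```

With these assumed constants the loop settles at 17.5 °C rather than exactly 20 °C, because a purely proportional response leaves a residual error whenever heat is continually being lost. Reversing the sign of the gain turns the same loop into positive feedback, and the temperature runs away from the set point instead of approaching it.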

From Ashby’s point of view, what separates an ultrastable system from a stable system is the ability to physically re-configure itself. Endowing a system with the ability to dynamically re-configure itself greatly increases the range and type of problems that the system can solve, including the ability to tolerate physical alterations and degradation to the organism or control system itself. A stable system, when physically altered or when operating outside of its specific predetermined range, is unlikely to remain stable for very long. One can easily imagine that a thermostat with a broken sensor, or with an internal component tampered with, could produce behaviour that, instead of bringing the environmental temperature towards the set point, would actively push it away. Clearly such behaviour is not stable and furthermore it is certainly not adaptive. This is where Ashby’s second feedback loop comes into play. As can be seen in figure 3.1 the environment has an effect on the essential variables of the system; when a situation arises that leads to one or more essential variables breaching their limits then the system is no longer stable. A lack of stability, Ashby suggests, should lead to a re-configuration of the control system so that stability can be restored. As the second feedback loop in figure 3.1 shows, Ashby’s method of re-configuration is relatively simple. If one or more essential variables are sent outside of their limits, then the parameters of the system are updated, re-configuring the control system and changing the system behaviour. If after re-configuration the system is still unstable the system will re-configure itself again, and continue to do so until a stable configuration is reached. Using what Ashby calls step-mechanisms, parameters are updated simply by changing them to a new random value. Through this approach Ashby’s method of learning is one of random trial and error; the validity of such an approach is discussed shortly.

[Figure 3.1 diagram labels: Parameters; Control system; Essential variables; Organism; Environment; Sensory and motor channels]

Figure 3.1: Diagrammatic representation of Ashby’s ultrastable system, incorporating the four components: control system, parameters, essential variables and environment. Adapted from Ashby (1960).

The reason Ashby built the Homeostat device was to closely examine the properties of ultrastability in a practical situation. The device itself shall now be described, along with some of the experiments that Ashby carried out and the observations he made. The Homeostat consists of four identical units, each with a single output that in turn acts as an input to the other three units. On top of each unit is a pivoted magnet attached to a piece of wire, one end of which is submerged in a trough of water. Electrodes at either end of the trough allow for a D.C. output to be generated from each unit proportional to the deviation of the magnet from its centre position. The position of the magnet is determined by feedback from the unit’s own output and the inputs from the other three units. Each of the three inputs, after passing through a commutator and a potentiometer, acts upon one of three coils. These coils, combined with a fourth coil for the unit’s own output, exert a combined force on the magnet, determining the position of the wire within the trough and in turn affecting the output of the unit. The only re-configurable parts of the individual units are the commutators and the potentiometers that each input passes through; the values of these components are determined by the position of a uniselector. With each uniselector capable of selecting one of twenty-five different outputs, and four uniselectors in total (one for each unit), there are 25^4 = 390,625 different possible circuit configurations. Figure 3.2 (a) shows a pictorial representation of Ashby’s Homeostat, with the connections between the individual units shown in figure 3.2 (b). Figure 3.3 shows a simplified representation of the internals of a single unit.

Using the terminology introduced earlier, the essential variables of the Homeostat are the positions of the four magnets, the parameters of the system are the values of the re-configurable components, and the step-mechanisms that update these parameters are the uniselectors. Depending on the set of parameters in use, the Homeostat can be said to exist in one of two states, stable or unstable. When stable, the four magnets move to their central positions and any attempt to displace them is met with a response that tends to bring them back towards the centre. In contrast, when the system is unstable the magnets depart from their centre positions with increasing severity until they hit the ends of the troughs. The Homeostat responds to an unstable state, when one or more of the magnets approaches the end of its trough, or in other words, when one or more of the essential variables goes beyond its limits, by updating the system parameters. The parameters are updated simply by moving one or more of the uniselectors on to their next position. If after a short settling period the newly configured system is still unstable then the process is repeated until a configuration is found in which none of the essential variables are sent beyond their limits. The different values of the components that the uniselectors choose were determined by a table of random numbers, so the search for a stable configuration is in effect random.
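The search procedure described above can be sketched as a small simulation. This is only an illustrative abstraction and not a model of Ashby’s circuit: the linear dynamics, the fixed negative self-feedback of -1.0, the weight range of [-1, 1], the time step and the trial length are all assumptions, and the function names are hypothetical. What the sketch keeps from the text is the structure: four units, twenty-five uniselector positions per unit, component values drawn from a table of random numbers, and a uniselector that steps on whenever its unit’s magnet reaches its limit.

```python
import random

UNITS = 4        # number of Homeostat units (from the text)
POSITIONS = 25   # uniselector positions per unit (from the text)

def make_table(rng, units=UNITS, positions=POSITIONS):
    """Pre-draw random input weights: table[unit][position] is one row of weights."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(units)]
             for _ in range(positions)]
            for _ in range(units)]

def weights(table, selectors, self_feedback=-1.0):
    """Build the coupling matrix selected by the current uniselector positions."""
    rows = []
    for unit, position in enumerate(selectors):
        row = list(table[unit][position])
        row[unit] = self_feedback  # assumed fixed, non-configurable self-feedback
        rows.append(row)
    return rows

def trial(w, limit=1.0, start=0.1, dt=0.1, steps=200):
    """Displace every magnet slightly; return the units that reach the limit."""
    x = [start] * len(w)
    for _ in range(steps):
        dx = [sum(wij * xj for wij, xj in zip(row, x)) for row in w]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        breached = {i for i, xi in enumerate(x) if abs(xi) >= limit}
        if breached:
            return breached
    return set()

def adapt(table, selectors, max_trials=1000):
    """Step the uniselectors of breaching units until a trial passes.

    Returns (selectors, failed_trials), or (None, max_trials) if no stable
    configuration was found within the trial budget.
    """
    selectors = list(selectors)
    for failed_trials in range(max_trials):
        breached = trial(weights(table, selectors))
        if not breached:
            return selectors, failed_trials
        for unit in breached:
            selectors[unit] = (selectors[unit] + 1) % POSITIONS
    return None, max_trials

rng = random.Random(0)   # fixed seed so that the sketch is repeatable
table = make_table(rng)
found, failed_trials = adapt(table, [0] * UNITS)
```

Note that nothing learned in a failed trial informs the next one: a breaching unit simply moves on to whatever values the random table holds at its next position.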

After describing how the Homeostat works, Ashby provides details of a number of experiments that he carried out using the device. Each of the experiments was based upon, or inspired by, a real world experiment or situation, covering a wide variety of problems from the training of animal behaviour and environmental acclimatisation to the ability of the nervous system to adapt to severe surgical alterations. For brevity, only two of Ashby’s experiments are recounted here.

[Figure 3.2 panels: (a) drawing of the Homeostat; (b) connection diagram of units A, B, C and D]

Figure 3.2: A drawing of Ashby’s Homeostat is shown in figure (a). Figure (b) shows a diagrammatic representation of the connections between the four different units of the Homeostat.

[Figure 3.3 diagram labels: uniselectors; coils; components; units A, B, C and D]

Figure 3.3: A simplified diagram of a single Homeostat unit ‘A’. The diagram shows how the inputs from the other units (solid red lines), and feedback from unit A itself (dashed blue line), act through the components chosen by the uniselectors, altering the current through the coils and ultimately determining the position of the magnet.

The first was inspired by a rather gruesome experiment in which the nerves of the flexor and extensor muscles in the arm of a monkey were severed and re-attached in a crossed position. Following the surgery, the movements of the animal’s arm were very uncoordinated; however, after a while they improved until eventually normal movement was returned. Ashby was able to replicate this experiment using two of the Homeostat’s individual units. The two units were initially set up in a stable configuration such that a small manual displacement of one of the units’ magnets was quickly compensated for by the other, and with the displacement removed both magnets returned to their centre positions. To simulate the surgical alteration, Ashby reversed the commutator of one of the units, which led to the system becoming unstable. As a result of the instability the uniselector of the altered unit changed position and arrived immediately at a stable configuration. The commutator was then switched back, resulting once again in an instability that was this time corrected after three uniselector switches.

The second experiment was intended to simulate the process of training. For this experiment three units were used and Ashby himself assumed the role of the trainer. The behaviour that Ashby was attempting to train was that any forced movement of unit 1 would result in an opposite movement in unit 2. To train the Homeostat Ashby ‘punished’ the device whenever the desired behaviour was not observed, the device’s punishment being the forced movement of unit 3’s magnet to an extreme position. Starting from a stable configuration, the device was tested by inducing a small movement in unit 1, which resulted in a similar movement in unit 2. This was not the desired behaviour and consequently the Homeostat was punished. The punishment led to an instability, forcing the device to re-configure. A second test once again produced undesirable behaviour, leading to a punishment and another re-configuration. When tested for a third time, a small movement of unit 1 led to a similar but opposing movement in unit 2. It can be said that the Homeostat had been trained to behave in the desired way.

To emphasise the difference between a stable and an ultrastable system Ashby considers the example of an airplane autopilot. An autopilot, amongst other things, ensures that even in turbulent air an airplane continues to fly horizontally. However, if the mechanisms that control the roll of the airplane were wired in reverse then instead of correcting disturbances the autopilot would actively increase the plane’s deviation from horizontal, with disastrous consequences. With the necessary sensors the Homeostat could also act as an autopilot. What sets ultrastable systems such as the Homeostat apart from stable systems such as a conventional autopilot is that if the wirings were reversed in a Homeostat based autopilot, then after a period of instability, the ultrastable system would be able to settle upon a stable configuration and continue to function correctly. This is precisely what Ashby showed in his experiments with the Homeostat.

In Ashby’s experiments the Homeostat undoubtedly showed some interesting properties and as a concept Ashby’s notion of ultrastability certainly sounds appealing. However, both the Homeostat and ultrastability in general are easy targets for criticism, most of which stems from the fact that any form of learning or adaptation that this type of system is able to show comes from an approach based largely upon random trial and error. The main problem with the trial and error approach is that it does not scale well. Though the approach worked for the Homeostat, the task in this situation was a relatively simple one, requiring the stabilisation of at most four essential variables. To scale up the Homeostat by adding additional units, hence increasing the number of essential variables, leads to an exponential increase in the number of trials required to reach stability (Terry and Capehart, 1968). Clearly, with each trial of the Homeostat taking approximately three seconds (Ashby, 1960), the task of finding a stable configuration through trial and error (within any reasonable amount of time) soon becomes insurmountable.
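A rough feel for the scaling problem can be had from the size of the configuration space alone. The sketch below assumes that every added unit brings one more twenty-five position uniselector, and it computes how long a single pass through every configuration would take at roughly three seconds per trial. This is not the expected number of trials reported by Terry and Capehart, which depends on how many configurations happen to be stable; it is only an indication of how quickly the search space grows. The function names are hypothetical.

```python
POSITIONS = 25          # uniselector positions per unit (from the text)
SECONDS_PER_TRIAL = 3   # approximate trial duration reported by Ashby
SECONDS_PER_DAY = 86_400

def configurations(units):
    """Number of distinct uniselector settings for a given number of units."""
    return POSITIONS ** units

def sweep_time_days(units):
    """Days needed to try every configuration exactly once."""
    return configurations(units) * SECONDS_PER_TRIAL / SECONDS_PER_DAY

for units in (4, 6, 8, 10):
    print(units, configurations(units), round(sweep_time_days(units), 1))
```

For the original four units the full sweep already takes about 13.6 days, and each additional unit multiplies that figure by twenty-five.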

The reasons that the trial and error approach does not scale are twofold. Firstly, as Ashby himself puts it, “partial successes go for nothing”; that is to say, the search for a stable configuration is not guided in any way by previous trials. There is no guarantee that following an unsuccessful trial, any subsequent trials will be an ‘improvement’; it is simply a matter of chance as to when the Homeostat will stumble upon a stable configuration. Secondly, there is no accumulation of adaptations. Once the Homeostat has adapted to one environment, if the environment were to change then in adapting to the new environment the Homeostat is likely to lose its ability to cope with the first. For the Homeostat, returning to a previous environment is no different to experiencing a completely new one. Another problem is that repeated exposure to the same environment does not improve or refine the Homeostat’s behaviour; the Homeostat’s behaviour can only be changed when the system becomes unstable.

To solve the apparent flaws with the Homeostat, Terry and Capehart (1968) proposed extending the system to include extra memory and learning components. In Capehart and Terry (1968) two simulated Homeostats are described: the first is equivalent to Ashby’s but for its allowance of up to ten units rather than four, and the second is an extension of the first that incorporates Terry and Capehart’s suggested modifications. Specifically, the modifications that Terry and Capehart made were to allow the Homeostat to store up to one hundred previous configurations, classified according to the (generalised) input pattern that led to their creation and ordered in terms of “success” (how often they have been used). When the system became unstable, instead of resorting immediately to trial and error, the memory was first checked for any suitable configurations. Terry and Capehart’s experiments showed that by allowing the Homeostat access to its previous configurations they were able to reduce the time to adaptation. More recently, Herrmann et al. (2004) describe a model of the Homeostat in which parameters are updated incrementally instead of at random, inspired by the modifications of neural network synapses. The authors show that their model shares similar properties with Ashby’s Homeostat but, since their intention was purely exploratory, they make no direct comparisons to previous models in terms of the time taken to converge to stability. Herrmann et al. see implications for their model in the control of artificial agents but have yet to investigate any specific applications.
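The memory extension can be sketched as a bounded store that is consulted before any random search. The sketch follows only the outline given above (up to one hundred stored configurations, grouped by input pattern and ordered by how often they have been used); everything else is an assumption, including the class and function names, the representation of patterns and configurations as hashable values, and the policy of evicting the least used entry when the store is full.

```python
from collections import defaultdict

class ConfigurationMemory:
    """Bounded store of previously successful configurations."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.entries = defaultdict(dict)   # pattern -> {configuration: use count}

    def __len__(self):
        return sum(len(stored) for stored in self.entries.values())

    def candidates(self, pattern):
        """Stored configurations for this pattern, most used first."""
        stored = self.entries.get(pattern, {})
        return sorted(stored, key=stored.get, reverse=True)

    def record_success(self, pattern, config):
        """Count one more successful use, evicting the least used entry if full."""
        stored = self.entries[pattern]
        if config not in stored and len(self) >= self.capacity:
            self._evict_least_used()       # assumed policy, not from the source
        stored[config] = stored.get(config, 0) + 1

    def _evict_least_used(self):
        pattern, config = min(
            ((p, c) for p, stored in self.entries.items() for c in stored),
            key=lambda pc: self.entries[pc[0]][pc[1]],
        )
        del self.entries[pattern][config]

def adapt(pattern, memory, is_stable, random_config, max_trials=1000):
    """Try remembered configurations first, then fall back to random search."""
    for config in memory.candidates(pattern):
        if is_stable(config):
            memory.record_success(pattern, config)
            return config, "memory"
    for _ in range(max_trials):
        config = random_config()
        if is_stable(config):
            memory.record_success(pattern, config)
            return config, "search"
    return None, "failed"
```

Here `is_stable` and `random_config` are placeholders supplied by the caller: the first would run a trial of the simulated device, and the second would draw a new random setting of the uniselectors.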

Ashby was well aware of the Homeostat’s obvious inefficiencies and spends the whole of the second half of “Design for a Brain...” considering theoretical explanations; unfortunately he was not able to offer any practical implementation as he had done with the Homeostat. To avoid diverging further into a theoretical discussion on the brain, and away from the central topic of homeostasis, details of Ashby’s discussions in the second half of his book are omitted. Suffice to say that, after several attempts, as reported by Cariani (2009), Ashby was never able to successfully scale up the system.

Ashby’s inability to extend the Homeostat, and the lack of much practical work on the subject following the initial construction of the device, should raise warning bells as to whether it is an idea worth pursuing. However, historically, the Homeostat has always attracted interest from a philosophical point of view (Campbell, 1956; Cariani, 2009; Pickering, 2002; Warwick, 2009) and, as discussed shortly, recent work indicates a resurgence of interest from the engineering community (Barandiaran and Di Paolo, 2010; Bird et al., 2007; Di Paolo, 2000, 2003; Eldridge, 2002; Iizuka and Di Paolo, 2008; Williams, 2004).

It may be that Ashby was simply ahead of his time, as is suggested by Warwick (2009) and Cariani (2009), the latter highlighting Ashby’s position as an early critic of symbolic artificial intelligence, and the former even going so far as to compare his ideas with the science-fiction film ‘The Matrix’. As Warwick mentions, Ashby’s mechanistic view of the brain may not be popular even now, but nonetheless, with the emergence of more flexible hardware such as FPGAs and the vastly greater computational power available today, it is interesting to speculate what further progress (if any) Ashby would have made towards his ultimate aim, had he the same technologies we now possess. Further evidence of Ashby’s forward thinking is seen in his 1952 paper entitled “Can a Mechanical Chess-Player Outplay Its Designer?” (Ashby, 1952). Ashby’s answer to the title of his paper, and the more general question of: “to what extent can a computer outperform its designer?”, was that just as it is possible for a father to teach his son how to play chess and for his son to eventually become better than him: “It is quite possible for a mechanical chess-player to outplay the man who designed it.”. The main requirement of such a system, Ashby believed, is the ability to make use of an external source of information. This information does not necessarily need to be provided in an organised fashion, as shown by Ashby’s comparison of the problem with evolution, where extra information can be added to the system through the random mutation of genes. Another noteworthy point of Ashby’s paper is his use of Shannon’s information theory (Shannon, 1948) and his prediction of the future use of the theory in a number of applications far beyond the scope of communication engineering that it was originally devised for. Although Ashby may not have been satisfied with the brute force (although highly optimised) approach of systems such as IBM’s Deep Blue¹, the first computer to defeat a reigning world chess champion in a match, his prediction that it was possible, and that of the widespread uptake of Shannon’s information theory, both turned out to be correct.

Ashby Inspired Homeostasis

Over recent years, Ashby’s concept of ultrastability, introduced and demonstrated using his Homeostat device, has inspired the creation of several artificial homeostatic systems. The applications of these Ashby-inspired homeostatic systems range from the control of mobile robots to the automatic generation of music, some examples of which are now reviewed.

As motivation for the creation of an ultrastable homeostatic control system, Di Paolo (2003) argues that there is a fundamental problem with existing autonomous robots, this being that, unlike animals, they do not possess the property of intentional agency. A robot does not care if it fails or succeeds; the goal is that of the designer and not the robot. Di Paolo uses the example of a light-following Braitenberg Vehicle (Braitenberg, 1986), in which two light sensors are attached directly to two wheels on opposing sides of the robot, their values driving the wheels such that the robot will always turn towards light. Because robots do not care whether they succeed or fail, light, Di Paolo suggests, is meaningless to a Braitenberg Vehicle. If the positions of a light-following robot’s sensors are swapped, with no risk of harm, a Braitenberg Vehicle will immediately switch its behaviour to perform light avoiding. A light following robot may be described by a human observer as “brave” and a light avoiding robot as “fearful”, but these labels are arbitrary: if the positions of the robot’s sensors are switched then the robot will immediately change its “emotion”. As Di Paolo highlights, this is not the case with real animals; though an animal may be trained, it can never be made to treat punishment as reward. Di Paolo suggests that though many roboticists are instinctively aware of this problem, they see it as a result of the complexity gap between living organisms and robots, and assume it to be solvable simply by reducing this gap. Whilst acknowledging that the complexity of robots and their control systems must increase, Di Paolo does not believe that increased complexity alone will lead to intentionality. Di Paolo suggests that, to solve the problem, inspiration may be taken from the ultrastable systems of Ashby (1960), where survival is “the mother of all values” and adaptation, in helping to ensure the survival of a system, is a fundamental requirement.
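The arbitrariness of the “brave” and “fearful” labels is easy to see in a sketch of the vehicle’s wiring. The code below is a minimal illustration with hypothetical function names and sensor values; it assumes that wheel speed is simply equal to the reading of the sensor that drives it.

```python
def wheel_speeds(left_sensor, right_sensor, crossed=True):
    """Return (left_wheel, right_wheel) speeds for a two-sensor vehicle.

    crossed=True:  each sensor drives the wheel on the opposite side.
    crossed=False: the sensors have been swapped, so each drives its own side.
    """
    if crossed:
        return right_sensor, left_sensor
    return left_sensor, right_sensor

def turn_direction(left_wheel, right_wheel):
    """A differential-drive vehicle turns away from its faster wheel."""
    if right_wheel > left_wheel:
        return "left"
    if left_wheel > right_wheel:
        return "right"
    return "straight"

# Light source on the left: the left sensor reads the higher value.
towards = turn_direction(*wheel_speeds(0.9, 0.2, crossed=True))   # "left"
away = turn_direction(*wheel_speeds(0.9, 0.2, crossed=False))     # "right"
```

Nothing in the controller changes between the two cases except which sensor is connected to which wheel, yet the observed behaviour flips from approaching the light to avoiding it.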

Based upon Ashby’s concept of an ultrastable system, Di Paolo (2000) describes an artificial neural network based homeostatic control system for a mobile robot that was evolved to perform stable phototactic behaviour. Inspired by a series of visual distortion experiments carried out by Taylor (1962), Di Paolo observed the effects of inverting the visual fields of his robots by switching the positions of the robots’ light sensors. In some, though not all, cases Di Paolo observed that the robots were able to adapt to their new configuration and continue to perform phototaxis. In Di Paolo’s experiments the robots were controlled by plastic artificial neural networks, with a variety of different genetically encoded update rules. Akin to the updating of parameters in the Homeostat when essential variables went beyond their limits, the rules of individual neurons in the robot’s network were activated locally whenever a neuron became unstable, that is, whenever its firing activity was too high or too low. The fitness function of the genetic algorithm selected for individuals that were both capable of performing phototaxis and exhibited long-term stable behaviour, characterised by the firing rates of their neurons being neither too high nor too low, i.e. behaving homeostatically.
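The idea of plasticity that is switched on locally only when a neuron leaves its stable range can be sketched as follows. This is a minimal illustration under stated assumptions: the band limits, the learning rate and the particular weight update are hypothetical and are not the genetically encoded rules used by Di Paolo (2000).

```python
def plasticity_gate(firing_rate, low=0.2, high=0.8):
    """Return 0 inside the homeostatic band [low, high].

    Outside the band the gate grows linearly: positive when the neuron is
    too quiet, negative when it is too active. Firing rates are assumed to
    lie in [0, 1]; the band limits are illustrative assumptions.
    """
    if firing_rate < low:
        return (low - firing_rate) / low
    if firing_rate > high:
        return -(firing_rate - high) / (1.0 - high)
    return 0.0


def update_weight(weight, pre, post, rate=0.1):
    """Change a synaptic weight only when the post-synaptic neuron is unstable.

    The pre * post product is a hypothetical Hebbian-style term; the sign of
    the gate decides whether the connection is strengthened or weakened.
    """
    return weight + rate * plasticity_gate(post) * pre * post


stable = update_weight(1.0, pre=1.0, post=0.5)      # inside the band: no change
too_active = update_weight(1.0, pre=1.0, post=0.9)  # weakened
too_quiet = update_weight(1.0, pre=1.0, post=0.1)   # strengthened
```

While every neuron stays inside its band the network is fixed, so any behaviour that keeps the firing rates stable leaves the controller unchanged.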

In experiments with human subjects, after putting on special goggles that invert the field of vision in either an up/down or left/right direction, subjects act clumsily with very poor spatial awareness, walking into walls and unsuccessfully grasping for objects in midair. Interestingly though, after a short period, the subjects adapt to the situation and are once again able to carry out complicated activities such as driving. What’s more, they even report that to them their new distorted view “looks normal” (Di Paolo, 2003). In Di Paolo’s experiments with mobile robots he observed similar behaviours to those of the human subjects. Di Paolo took robots that had evolved to successfully perform stable phototaxis and inverted their visual field by switching the positions of their sensors. Upon doing so, the phototactic behaviour was broken and the stability of the network lost, causing the robot to move erratically. As a consequence of the loss of stability the plasticity rules of individual neurons were activated and the robot’s behaviour adapted; in about 50% of cases this eventually led to the robot returning to perfect phototactic behaviour. Other disturbances were also investigated by Di Paolo (2000), such as breaking the symmetry of the robot by increasing the gain of the sensors or motors on one side of the robot. As with the inversion of the visual field, many of the evolved controllers were able to adapt to these physical disruptions.

1 http://en.wikipedia.org/wiki/Deep_Blue_(chess_computer) [accessed 6 November 2010]

Di Paolo (2000) concluded his study with a number of open questions about the approach. Firstly, why does it work? And how can the process of adaptation be sped up during the period immediately after a sensorimotor disruption? Perhaps most importantly, Di Paolo highlights the fact that there is no guarantee that after adaptation the robot will still perform phototaxis. Since phototaxis is only selected for during evolution and not during the lifetime of the controller, after adaptation any behaviour is acceptable, provided it leads to the stability of the controlling network. Di Paolo also questions the choice of plastic update rules used and suggests ways of refining them. In Di Paolo (2003), the problems with the update rules and the fact that the adapted behaviour may not be phototactic are addressed. Originally Di Paolo believed that selecting for both internal stability and phototaxis would lead to controllers in which the two conditions required each other; this was seen not to be the case. As a solution Di Paolo (2003) introduced another “essential variable”, the state of charge of the robot’s battery, which was increased when the robot moved near the light sources in the arena. With this addition, performing phototaxis became the only stable condition. In this extension Di Paolo used a simplified method of control with no evolution and no neural network. Unfortunately this approach lacked some of the interesting dynamics of the original model and highlighted some further problems with the general technique. Most notably, like Ashby’s Homeostat, the time to adaptation was slow and previously acquired adaptations were not retained. In Iizuka and Di Paolo (2008), the authors returned to the neural model, with evolution. In selecting for controllers that behaved homeostatically when performing the desired behaviour and non-homeostatically otherwise, Iizuka and Di Paolo were able to produce controllers that greatly outperformed their older models in terms of their adaptivity to morphological disruptions. In a similar piece of work, influenced by Di Paolo (2000), Williams (2004) constructed a homeostatic neural network controller with more biologically plausible plasticity rules than Di Paolo (2000) and demonstrated similar phototactic behaviour. Further related work from Di Paolo has looked at the formation of habits in robotic systems, and the task of distinguishing between profitable and poisonous ‘food’ (Barandiaran and Di Paolo, 2010).

Based upon the same neural network model of Ashby’s Homeostat, Bird et al. (2007) and Eldridge (2002) demonstrate two radically different applications of the principles of ultrastability. Eldridge (2002) describes an adaptive, generative music system, in which melodies are created based on the dynamics of a neural network model of Ashby’s Homeostat. Bird et al. (2007) describe an interactive art work called Net Work in which floating buoys, each of which may be considered a homeostatic unit, adapt their colour based on their interactions with their neighbours and their environment (waves, and lights shone by human participants). This departure from the main theme of this report is intended only to exemplify the widespread desire for the properties of homeostasis. Since their approaches are not directly relevant to robotics, no more is said about the models of Eldridge and Bird et al.

Homeokinesis

Inspired by Cannon’s homeostasis, Der et al. (1999) introduce the concept of homeokinesis, an unspecific method of producing behaviours in mobile robotic systems that, to the observer, appear specific and goal-oriented. The approach of Der et al. is based on the creation of a predictive internal “self-model” by the agent and the adaptation of this model alongside the adaptation of the agent’s motion controller. Der et al. argue that existing approaches to the development of robotic behaviours involving evolution and learning are let down by their critical reliance on the construction of a fitness or objective function that drives the development. As Der et al. highlight, fitness functions are generally created by hand and require the designer to have a specific target behaviour in mind. This process is rarely easy, and what’s more, the observed behaviour of an evolved control system is often very different to the desired or expected behaviour. In Der et al.’s approach, very little about the desired behaviour of the robot is assumed, only that it should lie somewhere in the stable region between the extremes of chaotic random movements and trivially simple motion, such as simply moving forward.

In Der et al.’s method, each robot is provided with an adaptable controller, which maps sensor inputs to actuator outputs, and an adaptive self-model, which given sensor inputs predicts future sensor values. The error of the model, that is the difference between the predicted and actual sensor values, is used as a learning signal to update the parameters (through gradient descent) of both the robot’s motion controller and the self-model itself. As the error between the predicted and actual sensor values decreases, this simple method leads to the emergence of behaviours that are smooth and stable for the current robot within the current environment.
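The joint adaptation of controller and self-model from a single prediction error can be sketched for a robot with one sensor and one motor. This is a deliberately simplified illustration: the toy environment, the parameter names and the learning rate are all assumptions, and the learning rule published by Der et al. is more involved and is not reproduced here.

```python
import math


def run_self_model_loop(steps=300, lr=0.1, world_gain=0.8, world_bias=0.3):
    """Adapt a one-sensor, one-motor controller and its self-model together.

    All names and the toy 'world' are hypothetical, for illustration only.
    Returns the list of prediction errors, one per time step.
    """
    x = 0.5          # current sensor value
    c = 1.0          # controller parameter: y = tanh(c * x)
    a, b = 0.0, 0.0  # self-model parameters: predicted next x = a * y + b
    errors = []
    for _ in range(steps):
        y = math.tanh(c * x)                  # motor command from the controller
        predicted = a * y + b                 # self-model's prediction
        x_next = world_gain * y + world_bias  # toy environment, unknown to the robot
        err = x_next - predicted              # prediction error: the only learning signal
        errors.append(err)
        # Gradient descent on err**2, treating the observed x_next as a constant.
        grad_c = -2.0 * err * a * (1.0 - y * y) * x
        a += lr * 2.0 * err * y               # self-model update
        b += lr * 2.0 * err                   # self-model update
        c -= lr * grad_c                      # controller update
        x = x_next
    return errors


errors = run_self_model_loop()
```

No target behaviour appears anywhere in the loop; the controller and the model change only because the prediction is wrong, and both stop changing once the error has vanished.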

In early experiments involving Braitenberg Vehicles and real Khepera robots, several interesting behaviours were observed. In Der et al. (1999) a simple two-wheeled Braitenberg Vehicle exhibited stable light-following behaviour. The vehicle was equipped with two light sensors, one on each side, which were connected to wheels on the opposing sides. With the simplest possible model, predicting no change in sensor values, the vehicle soon adapted to perform stable light following. Interestingly, the approach was able to handle structural asynchronies in the vehicle, implying that the method provides some inbuilt fault tolerance. In Der and Pantzer (1999) and Der (2000) experiments with real Khepera robots, equipped with IR sensors, are reported. Placed in a walled environment, the robots would initially simply crash into the sides; however, after a short period of time they would fall into one of two behaviours, either avoiding the walls or actively following them. Interestingly, when performing wall following, the controller would sometimes adapt subtly so as to follow the wall at a different distance, a distance at which the control was more stable. The robots also showed the ability to adapt their behaviour according to their environment. In one experiment, robots that were first adapted to perform “ball balancing” were placed next to a wall and, after a short period in which they attempted to “balance” the wall (which resulted in the robot crashing), they adapted their behaviour to perform wall following. Tolerance to the presence of simulated faults was also observed, though not extensively reported upon.

Over the last ten years the principles of homeokinesis have been applied to the control of a variety of different robots, both real and simulated. These range from the early experiments with wheeled robots (Der, 2000, 2001; Der and Pantzer, 1999; Der et al., 1999) to conceptual spherical robots that move by changing the relative positions of three internal masses (Der and Martius, 2006; Der et al., 2006b). Simulated snake-like robots and chains of independent wheeled robots have also provided the platform for several experiments (Der, 2008; Der et al., 2006a, 2008). Of particular interest in the experiments with the chained robots was the emergence of natural looking serpentine locomotion, resulting from the synchronous movements of completely independent sub-units, with no direct communication. A strange mechanical contraption called the rocking stumper (Der et al., 2006a) and a plethora of other simulated creatures, including dogs that, seemingly, attempt to escape their environment and ‘wrestling’ humanoids (Der, 2008), have also been studied, several videos of which can be found online1. In all of the aforementioned experiments, nontrivial behaviours emerged as a consequence of the interaction of the agents and their environments. Unlike conventional symbolic AI approaches, the robot and environment are strongly paired and behaviours are said to be situated (Der, 2001): rather than acting upon a static predetermined symbolic representation of the world, the robots maximise the use of their sensors to continuously update their internal model. Der et al.’s approach is appealing because it assumes very little about the sensors and actuators of the robot. In conventional approaches, without an adaptive model of self, ‘knowledge’ of the robot’s sensors and actuators is built into the robot’s controller by the designer, based on the standard operating properties of the isolated components. In the homeokinetic approach, due to the presence of the adaptive self-model, no predefined knowledge is needed. The robot discovers for itself the properties of its components through its interaction with the environment, continuously updating its self-model over time. The presence of an adaptive self-model is one of the reasons why the approach is successful on such a wide variety of different platforms, regardless of the hardware involved.

Though seemingly not something that the authors have ever explicitly investigated, their approach is inherently robust to both changes in the external environment and changes to the components which sense or act upon the environment. The strong coupling between the robot and its environment ensures that, provided there is some correlation between the environment and the components sensing or acting upon it, regardless of whether the correlation matches that described by the components’ datasheets, the system will continue to operate in a stable manner. More recent work involving homeokinetic style approaches to self-organising robot behaviour (Ay et al., 2010; Zahedi et al., 2010) looks to information theory to improve the system further and observes that maximising the mutual information between successive time steps leads to more desirable behaviour. A description of these experiments is omitted, since it adds nothing to the properties of emergence and self-organisation that are the focus here.

Der et al.’s approach is not without criticism: the behaviours observed, though nontrivial, are relatively primitive. More interesting than the actual behaviours, however, is the method through which they emerge and the inherent robustness of the system. To provide more goal-oriented behaviours Der and Martius (2006) suggest the use of reinforcement learning, though a more desirable approach, rather than being explicitly driven by external rewards, would come from the agent’s interaction with the environment alone. Thus far, the most complex robots that the approach has been applied to are simulated humanoids. The striking feature of the videos of these experiments is the jerkiness of the robot’s movements, suggesting that the observed behaviour includes artefacts of the simulation. It would be interesting to see how the approach fared with real robots that have many degrees of freedom; although a humanoid may be beyond the current implementation, a more stable six or eight legged system may prove interesting. The only downside of more complex robots is the need for a more complex model to predict their behaviour. Although through gradient descent the search for a stable configuration is more directed than that of Ashby (1960), for example, homeokinesis still lacks long-term memory and the ability to recognise and immediately adapt to previously encountered situations. Though this was never the authors’ original aim, the problem is addressed somewhat in Der et al. (2005).

Neuro-endocrine control

At a lower level than the previously mentioned approaches, though still significantly abstracted from the biology itself, Neal and Timmis (2003) introduce a homeostatic control system that takes inspiration from the functions of both the nervous and endocrine systems. The approach of Neal and Timmis combines a standard Artificial Neural Network (ANN) with an Artificial Endocrine System (AES). The AES models the concentrations of multiple hormones, which are released by artificial glands in response to internal or external stimuli, such as the robot’s current state of charge or the values of its sensors, and over time are subject to gradual decay. The concentrations of the various hormones have a local effect on the neurons of, possibly multiple, artificial neural networks, continuously stimulating or inhibiting their activity. In effect, the hormones are utilised as a switching mechanism, altering the amount of control that neurons, or whole networks, have on the output of the system, in response to certain environmental conditions.
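The interplay between hormone release, decay and neural modulation can be sketched as below. The class, the parameter values and in particular the multiplicative form of the modulation are assumptions made for illustration; they are not the equations of Neal and Timmis (2003).

```python
import math


class Gland:
    """Hypothetical artificial gland: releases hormone in proportion to a
    stimulus, while the existing concentration decays geometrically."""

    def __init__(self, release_rate, decay):
        self.release_rate = release_rate
        self.decay = decay            # fraction of hormone retained per step
        self.concentration = 0.0

    def step(self, stimulus):
        self.concentration = (self.decay * self.concentration
                              + self.release_rate * stimulus)
        return self.concentration


def modulated_neuron(inputs, weights, concentration, sensitivity):
    """Neuron whose input drive is scaled by a hormone concentration.

    A positive sensitivity stimulates the neuron, a negative one inhibits it.
    The multiplicative form is an illustrative assumption.
    """
    drive = sum(w * x for w, x in zip(weights, inputs))
    return math.tanh(drive * (1.0 + sensitivity * concentration))


gland = Gland(release_rate=0.5, decay=0.9)
after_stimulus = gland.step(1.0)   # hormone released in response to a stimulus
after_rest = gland.step(0.0)       # no stimulus: concentration decays

unmodulated = modulated_neuron([1.0], [1.0], concentration=0.0, sensitivity=-1.0)
silenced = modulated_neuron([1.0], [1.0], concentration=1.0, sensitivity=-1.0)
```

With a negative sensitivity, a sufficiently high concentration switches the neuron off entirely, which is one way to picture the switching role described above.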

Since the method of control was first proposed by Neal and Timmis (2003) it has been applied to various different situations and has spawned many variants on its original basic concepts. The initial work involved a few simple behaviours, namely wandering and obstacle avoidance. Work soon moved on to include seeking behaviour (Neal and Timmis, 2005). The same work also started to show the action selection capabilities of the approach, with experiments involving a robot switching between the desire to move towards a white placard and a black placard.

Vargas et al. (2005) introduced the concept of feedback mechanisms to the AES, increasing the control over hormone release, as well as making the architecture more biologically plausible. Vargas et al. also highlight the adaptability of the architecture; for a collective robotic system with multiple agents constantly affecting their surroundings, the ability to adapt to a dynamic environment is clearly a desirable property. Timmis et al. (2009) take adaptability a step further and add to the architecture the capacity for “on-line” learning. On-line learning provides robots with the ability to learn how to react to the environment throughout their normal operation, rather than exclusively before being deployed, thus further increasing the adaptability of the architecture over long periods of time.

1 http://robot.informatik.uni-leipzig.de/videos/ [accessed 16 October 2010]

More recently, Timmis et al. (2010b) have investigated the control of multiple robots carrying out a swarm foraging task that requires the ability to switch between a large number of behaviours; the approach was shown to be amenable to this task. Meanwhile, Sauze and Neal (2010) have focused on the task of power management within the context of solar-powered long-term autonomous sailing boats. Once again, the combined neuro-endocrine approach was shown to be a feasible method of control.

Artificial Homeostatic Hormone System

Another homeostatic control system which takes inspiration from the biological processes of the endocrine system is the Artificial Homeostatic Hormone System (AHHS) of Schmickl and Crailsheim (2009). Over recent years the AHHS has been investigated extensively within the scope of the SYMBRION and REPLICATOR projects (Schmickl et al., 2010), some examples from which are discussed shortly.

The Artificial Homeostatic Hormone System, as its name suggests, takes its inspiration from the interactions of hormones within the endocrine system and other similar physiological networks. The approach has been applied to several applications, from the locomotion of single mobile robots, both real (Stradner et al., 2009) and simulated (Hamann et al., 2010b), to more complex tasks such as balancing an inverted pendulum (Hamann et al., 2010a) and the coordinated motion of several connected modules in a robotic organism (Hamann et al., 2010a,b). The basic principles of the paradigm are as follows. Within the system there exist several different hormones which are secreted into virtual compartments at pre-determined rates. The secretion of hormones may also be affected (in either a stimulatory or inhibitory fashion) by the values of particular sensors, or the concentrations of other hormones within the same compartment. Hormones are not restricted to acting within the compartment they were secreted in and may diffuse over time into neighbouring compartments. To complete the sensorimotor loop, when concentrations rise above a certain threshold, hormones may also have an effect on any motors or actuators within their local environment. Finally, all hormones are subject to decay and so an unstimulated system will eventually return to a stable equilibrium point. Within the SYMBRION and REPLICATOR projects the need for virtual compartments is less critical, since when an artificial organism is formed, local communication allows for the transfer of hormones between connected modules, each of which may be considered as a separate compartment.
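The principles listed above (secretion, sensor influence, diffusion, thresholded actuation and decay) can be sketched for a single hormone in a chain of compartments. The function names, the parameter values and the discrete update order are illustrative assumptions and do not reproduce the formulation of Schmickl and Crailsheim (2009).

```python
def ahhs_step(conc, sensors, base_rate=0.05, sensor_gain=0.5,
              diffusion=0.1, decay=0.1):
    """Advance one hormone by one time step in a chain of compartments.

    conc and sensors hold one value per compartment. All parameters are
    hypothetical. Order of operations: secretion, diffusion, decay.
    """
    n = len(conc)
    # Secretion: a fixed base rate plus a sensor-dependent (stimulatory) term.
    secreted = [c + base_rate + sensor_gain * s for c, s in zip(conc, sensors)]
    # Diffusion between neighbouring compartments; the ends of the chain
    # exchange hormone with their single neighbour only.
    diffused = []
    for i in range(n):
        left = secreted[i - 1] if i > 0 else secreted[i]
        right = secreted[i + 1] if i < n - 1 else secreted[i]
        diffused.append(secreted[i] + diffusion * (left + right - 2 * secreted[i]))
    # Decay: every compartment loses a fixed fraction of its hormone.
    return [(1.0 - decay) * c for c in diffused]


def actuator_active(concentration, threshold=0.5):
    """The hormone influences an actuator only above a threshold."""
    return concentration > threshold


# One step from a single loaded compartment, with no secretion at all.
one_step = ahhs_step([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], base_rate=0.0)

# An unstimulated system with no base secretion settles back to rest.
state = [1.0, 0.0, 0.0]
for _ in range(200):
    state = ahhs_step(state, [0.0, 0.0, 0.0], base_rate=0.0)
```

If each compartment is read as one robot module, the diffusion term plays the part of the local hormone transfer between connected modules.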

Early work with the AHHS involved the demonstration of simple obstacle avoidance controllers (Schmickl and Crailsheim, 2009; Stradner et al., 2009); systems with multiple hormones were seen to perform better than systems with only a single hormone. In particular, the common problem of a robot failing to avoid an obstacle when approaching from head on was overcome using multiple hormones, due to the faster reaction time of the robot and the ability of the controller to reverse the direction of the robot’s wheels. In Hamann et al. (2010a) more complex tasks were introduced and controller configurations were evolved, rather than designed by hand. To aid in the evolutionary process, an extension of the original system, AHHS2, was introduced by the authors. In contrast to the original AHHS, where a single hormone can only have an effect on one other part of the system, for example a particular actuator or another hormone, in AHHS2 hormones have weighted effects on possibly multiple parts of the system. Since the effect of a hormone can no longer jump from one part of the system to another in a single mutation, this addition leads to the creation of a smoother fitness landscape, and hence should lead to better evolvability. In Hamann et al. (2010a), and later Hamann et al. (2010b), controllers were evolved to perform locomotion in simulated Symbricator robotic organisms. Controllers for three and five robot topologies were evolved and showed several interesting forms of locomotion such as caterpillar crawling, walking, and jumping. An important and interesting point is that the control of these robotic organisms is entirely decentralised. Each module, regardless of its position within the organism, runs an identical controller and no communication other than the diffusion of hormones between neighbouring units occurs. Based on their interactions with the environment (through IR sensors) and other robots (through hormones) each module must discover its own role in a self-organising manner. The work described above is only a small part of a larger picture; the authors hope to extend their system and apply it to more complicated tasks such as dynamic body formation and reconfiguration (Hamann et al., 2010b; Schmickl et al., 2010).

The AHHS produces a very interesting form of homeostatic behaviour, where perturbations to essential variables (fluctuating hormone concentrations) are corrected by behaviours that bring these concentrations back to normal values. However, the system is not homeostatic in the sense described by Ashby (1960): although the behaviour may be described as adaptive at the individual level, the structure of the control system itself cannot. For example, the simple obstacle avoidance behaviours described by Schmickl and Crailsheim (2009) and Stradner et al. (2009) would be unlikely to be robust to the same sensorimotor disruptions demonstrated by Di Paolo (2000). Furthermore, the system has not been applied to any problem which could not be solved by an alternative such as artificial neural networks, and whether the system will be able to scale up to handle more complex tasks remains to be seen. One area where the system may come into its own, however, is in the decentralised control of multi-robot organisms.

Immune Inspired Homeostasis

As well as the nervous and endocrine systems, the immune system, in close interaction with the other two, plays a very important role in providing homeostasis to the human body. Immune inspired approaches to homeostasis are reviewed thoroughly in section 3.2, but it is worthwhile highlighting here, as identified by Owens et al. (2007), some of the advantages that immune inspiration can provide to an artificial system. In describing a general architecture for artificial homeostasis, Owens et al. identify three desirable properties for a homeostatic control system:

• Prediction

• Innate and Adaptive Response

• Acclimatisation

Expanding upon these properties, the authors note that there are clear links between their architecture for a homeostatic control system and a number of immunological theories. It is proposed that through a principled approach, following the guidelines of Stepney et al. (2005), the immune system may provide a good source of inspiration for the design of homeostatic control systems.

Summary

In his 1960 book “Design for a Brain: The Origin of Adaptive Behavior”, William Ross Ashby described an electro-mechanical device, known as the Homeostat, that demonstrated the highly desirable ability to maintain a stable configuration in a variety of environmental conditions. When the stability of the Homeostat was threatened by an external force, the device was almost always able to adapt its configuration to find a suitable stable alternative, even showing the ability to adapt to internal physical alterations. Essentially, the Homeostat was demonstrating Ashby’s notion of ultrastability, as a method of producing adaptive behaviour and maintaining the homeostasis of a system. Central to Ashby’s approach is stability: it is not the change in the environment or the physical alteration itself which causes the Homeostat to adapt, rather it is the effect that this change has on the internal stability of the system. Though Ashby’s method of learning is too simplistic to be applicable to real-world problems, the general properties of ultrastability remain an interesting topic for study.

Inspired by Ashby, authors such as Di Paolo (2000) have developed homeostatic control systems for mobile robots. Borrowing ideas from Ashby’s notion of ultrastability, Di Paolo produced an adaptable, fault tolerant form of control, whereby, much like Ashby’s early experiments, the system may even adapt to physical alteration. Another homeostatically inspired form of robot control was demonstrated by Der et al. (1999). Based upon the concept of homeokinesis, Der et al. showed how, by learning its own ‘self-model’ alongside its controller, a robot may produce adaptable and inherently robust forms of behaviour. Furthermore, Der et al.’s homeokinetic approach, in requiring very little predefined knowledge, was shown to be applicable to a wide variety of robotic platforms.

Taking inspiration more directly from biological systems rather than general homeostatic properties, the neuro-endocrine control system of Neal and Timmis (2003) and the artificial homeostatic hormone system of Schmickl and Crailsheim (2009) have both proved to be effective forms of robot control. The former has found applications in both individual and swarm robotic tasks, as well as, more recently, power management on board autonomous sailing boats. The latter, whilst being shown to be applicable to the control of individual robots, is perhaps better suited to the decentralised control of large-scale artificial robotic organisms. Beyond the nervous and endocrine systems, it is believed that the natural immune system may also serve as a suitable source of inspiration for the design of homeostatic control systems; consequently, it is to this subject that the following section is dedicated.

3.2 Artificial Immune Systems

The main purpose of an immune system is to protect the organism that it resides in from infectious agents known as pathogens. In doing so the immune system helps to maintain homeostasis within the organism. From a high level, protection of the organism involves the isolation or removal of pathogenic material (or the products thereof) that, if left alone, may lead to disease. As is shown in later sections, the processes through which this is accomplished are complex and numerous, involving at times a large number of interacting cellular agents and signalling pathways.

Artificial Immune Systems (AIS) are systems which take inspiration from the processes and functions of natural immune systems and use them to construct algorithms and techniques capable of solving a wide variety of computational problems. As defined by de Castro and Timmis (2002):

“Artificial immune systems (AIS) are adaptive systems, inspired by theoretical immunology and observed immune functions, principles and models, which are applied to problem solving”

In line with the purpose of natural immune systems, the most obvious problem that an artificial immune system would appear suited for is that of protection, or security. In an artificial system such as a computer network, protection may involve (amongst other tasks) the prevention of denial-of-service attacks, code or data modification, and eavesdropping. Whilst AIS have been successfully applied to security problems on a number of occasions, this type of application is far from the limits of what AIS can achieve. As identified by de Castro and Timmis (2002) there are a number of immune properties that would be highly desirable in a variety of computational systems, including for example: pattern recognition, learning and memory, autonomy, anomaly detection, distributivity, noise tolerance, fault tolerance, and self-organisation. By studying and observing natural immune systems, it is possible to abstract some of the key concepts and processes of immunology and use them to construct artificial immune systems that exhibit one or more of the desirable properties listed above. Through such techniques AIS have seen use in a number of applications, from network intrusion and virus detection to robotics and control systems.

Though relatively young (and consequently small) in comparison to other non-standard computational techniques such as genetic and evolutionary computation, artificial immune systems remain a very successful and diverse area of interdisciplinary research. This section begins with a broad introduction to the biology of natural immune systems (3.2.1). In 3.2.2 some of the general techniques employed when designing and constructing artificial immune systems are discussed. Finally, in 3.2.3 some of the most popular algorithms to emerge from the field are described and some of the specific applications to which AIS have been applied are reviewed.

3.2.1 Biology

To reiterate, the main purpose of the immune system is the protection and maintenance of the organism in which it resides. To this end, after the detection of an invading pathogen, the immune system mounts an immune response. The immune response results in the isolation or removal of the pathogenic material, as well as any of the organism’s own cells that have become infected or damaged as a consequence of the pathogen’s presence. The generality of the immune system is such that it is able to recognise and respond to virtually any chemical structure, organic or inorganic. However, the main threat to the organism is biological pathogens, which are generally considered to belong to one of four categories: viruses, bacteria, fungi and parasites. The immune system possesses a variety of strategies for detecting and dealing with these pathogens.

The vast majority of research involving immune systems, both natural and artificial, has focused on the immune systems of vertebrate animals, that is, animals which possess a backbone or spinal column. The vertebrate immune system is made up of two sub-systems, known individually as the innate and adaptive immune systems.

After physical barriers such as the skin, the innate immune system is the organism’s first line of defence1. The mechanisms of the innate immune system are described as non-specific, so named because, generally, they do not recognise or distinguish between the different types of foreign matter that they protect against. Instead, the cells of the innate immune system are capable of recognising the general properties of multiple types of pathogen. Other notable features of the innate immune system include its relatively quick response time and apparent lack of adaptation. The innate immune system operates during the period of 0 to 96 hours after infection (Murphy et al., 2008) and the cells and mechanisms through which it protects the organism undergo very little change throughout the lifetime of the organism. Importantly, in vertebrate animals, the innate immune system is also responsible (at least in part) for regulating the response of the adaptive immune system; the mechanisms through which this is achieved are described later.

In contrast with the innate response, the response of the adaptive immune system (also known as the “acquired” immune system) is described as specific, meaning simply that specific cells of the adaptive immune system are capable of recognising specific pathogens. More accurately, the receptors of specific cells are capable of recognising specific molecules associated with the pathogen. These molecules, referred to as antigens, may exist on the surface of the pathogen itself or on normal cells which have been adversely affected by the presence of the pathogen. Generally, an antigen is said to be anything that can induce a specific immune response. Individual antigens are generally indicative of a particular pathogen and consequently the term ‘antigen’ is often used interchangeably with the pathogen that it represents. The fact that specific receptors are capable of recognising specific antigens is not a built-in feature of the adaptive immune system, rather it is “acquired” over the lifetime of the organism. When an organism first encounters a particular pathogen it may not possess any cells that are able to recognise and respond to it. However, after prolonged exposure the cell population adapts and new cells will be created with receptors that are capable of recognising antigens of the pathogen. Furthermore, once the immune system has adapted to recognise a specific pathogen, the next time it encounters this pathogen it will detect and respond to it much faster than it did the previous time. This process is continuous: the more times and the longer the organism is exposed to a particular pathogen, the better it becomes at detecting and removing it. In this sense, the adaptive immune system exhibits properties of memory and learning. The adaptive immune system possesses a number of mechanisms for both creating the population of cells that recognise specific antigens and for removing or isolating the negative effects of the associated pathogens; later sections describe these mechanisms in detail.

The presence of an adaptive immune system in vertebrate animals distinguishes them from invertebrates, whose immune system consists solely of an innate part. The study of the vertebrate immune system by immunologists is largely driven by the desire to gain a better understanding of the human immune system and the obvious health benefits that this leads to. Consequently, the wealth of knowledge about the vertebrate immune system makes it a natural source of inspiration for engineers and computer scientists wishing to construct artificial systems that exhibit the interesting properties of natural immune systems. The additional complexity of the adaptive immune system and the desirable computational properties it exhibits, such as memory and learning, may also contribute to the increased interest that the vertebrate immune system receives from both immunologists and engineers.

In recent years it has been discovered that invertebrate immune systems exhibit more interesting properties than they were originally thought to. Properties such as immune memory and the diversification and specificity of immune cells, which were originally thought to be exclusive attributes of the adaptive immune system, have been observed (in a limited form) in the immune systems of simple invertebrates such as sponges and fruit flies (Litman et al., 2005; Loker et al., 2004; Rowley and Powell, 2007). Despite these discoveries, and calls from the AIS community for increased investigation into simpler innate-only approaches (Twycross and Aickelin, 2007), there currently exist very few algorithms that take inspiration from the invertebrate immune system. For this reason, throughout the remainder of this report, unless explicitly stated otherwise, the term “immune system” will be used solely to describe the vertebrate immune system. Furthermore, although some properties of invertebrate immune systems are shared with the innate immune systems of vertebrates, there will be no further explicit discussion of invertebrate immunity within this section.

This section continues by first describing some of the main cells and organs of the immune system, including a structural overview of the system in its entirety. Further details of the functioning of the individual innate and adaptive immune systems are then provided, detailing how the individual agents of the immune system communicate and self-organise to protect and maintain their host organism. Finally, some alternative immune theories are described, which cannot easily be classified as part of the innate or adaptive systems exclusively.

The approach taken here, and the level of biological detail provided, most closely resembles that of de Castro and Timmis (2002): the aim is to provide the level of detail necessary for the reader to understand the main functions of the immune system, and how they may be applied to artificial systems, without overloading the reader with complicated and unnecessary details. In later sections, when reviewing artificial immune systems, further biological details are provided where necessary. For a system-level view of the immune system the reader is directed to Widmaier et al. (2006) (chapter 18), and for a more complete cellular-level view further details can be found in Murphy et al. (2008).

¹ It should be noted that some people consider the physical barriers and their mechanisms, such as the release of sweat and mucus, to be part of the innate immune system. However, since the physical barriers offer few interesting properties that can be abstracted to artificial systems (aside from the physical analogies that would exist with or without immune inspiration), that is not the view taken here.

[Figure 3.4 rows: (a) Leukocytes; (b) Granulocytes, Agranulocytes; (c) Eosinophil, Neutrophil, Basophil, Lymphocyte, Macrophage; (d) B-cell, T-cell, NK-cell; (e) Dendritic cell]

Figure 3.4: Diagram showing the many different sub-types of leukocyte cell. Row c shows the five most important categories: eosinophils, neutrophils, basophils, lymphocytes and macrophages. Dendritic cells, though not technically leukocytes, are included in this diagram because they are functionally very similar to macrophages. The cell representations in this diagram were adapted from Murphy et al. (2008).

Immune cells and agents

The most important cells of the immune system are the leukocytes, or “white blood cells”. There are five main types of leukocyte: eosinophils, neutrophils, basophils, lymphocytes and macrophages (see figure 3.4 part c). The eosinophils, neutrophils and basophils are described as granulocytes because they contain densely staining granules in their cytoplasm. The other types of white blood cell, the lymphocytes and macrophages, are classed as agranulocytes because they lack the granules found in granulocytes¹.

Another important type of immune cell is the dendritic cell (pictured in figure 3.4 part e). Though not technically white blood cells themselves, dendritic cells share many important functions with the type of leukocyte known as the macrophage; for this reason they are sometimes described as macrophage-like cells (Widmaier et al., 2006).

In immunology, since many cells perform some of the same (or similar) duties, it is not uncommon for cells to be referred to by their function instead of their actual name. For example, a number of immune cells perform the task of phagocytosis, the capture and breaking down of pathogenic material. The group of cells capable of phagocytosis are collectively referred to as the phagocytes; this includes, amongst others, the neutrophils, macrophages and dendritic cells.

Another class of cells that are often referred to by their function are antigen presenting cells (APCs). As their name suggests, the purpose of these cells is to present antigen upon their surface, allowing other immune cells to interact with them and, in certain situations, initiating an immune response. To present antigen, APCs must first engulf and break down antigens into small fragments, or peptides; these peptides are combined with major histocompatibility complex (MHC) molecules and presented on the surface of the cell as peptide-MHC pairings (pMHC). Dendritic cells, macrophages, and to a lesser extent B-lymphocytes (B-cells, figure 3.4 part d), are all types of antigen presenting cell. For simplicity, the terms “antigen presenting cell” and “phagocyte” will be used more frequently in this report than the specific cell names.

¹ One particular type of lymphocyte, the natural killer (NK) cell, does possess granules in its cytoplasm; however, for the most part lymphocytes are considered to be agranulocytes. For the purpose of this report, the distinction between granulocytes and agranulocytes described above and in figure 3.4 is sufficient.

Many of the immune processes described in the following sections rely heavily on the interactions of different cell types. These interactions are facilitated by “chemical messengers” known as cytokines. Cytokines are small proteins, similar in function to the hormones of the endocrine system. They can operate as autocrine agents (acting upon the cells that released them) or paracrine agents (acting upon other cells), stimulating or inhibiting the effects or the proliferation of different cells. For example, the cytokine interleukin 1 (IL-1) helps to activate T-lymphocytes (T-cells, figure 3.4 part d), a step that is crucial to the response of the adaptive immune system. Furthermore, some cytokines may stimulate the release of other cytokines: IL-1, for example, stimulates the release of interleukin 2 (IL-2) which, acting as an autocrine agent in this instance, leads to the proliferation of those same IL-2 releasing cells.

Grouped (where appropriate) by their primary function, the five main types of leukocyte (figure 3.4 part c) and the closely related dendritic cells are now described in more detail.

Phagocytes. Of the three aforementioned phagocytes (neutrophils, macrophages and dendritic cells), the neutrophils are by far the most abundant. Numerous, short-lived and fast-acting, the neutrophils are generally considered to be the most important cells of the innate immune response. The macrophages and macrophage-like cells are longer lasting than the neutrophils but share one of the same main purposes: phagocytosis, the ingestion and breaking down of pathogenic material. Phagocytosis is a three-step process. First, the phagocytes must recognise their target; they do so by recognising the general properties of multiple pathogens rather than the specifics of individuals. Once the target is recognised, the phagocytes bind to it and ingest it, enclosing it in a vesicle known as a phagosome. Finally, the contents of the phagosome are broken down through acidification and the release of enzymes and other toxic products such as nitric oxide. Following phagocytosis, macrophages and macrophage-like cells also help to initiate the inflammatory response (described shortly) through the release of ‘pro-inflammatory’ cytokines.

Antigen Presenting Cells. Though almost all cells in the body are capable of presenting antigen, only a select few of them, including macrophages, dendritic cells and B-cells, are described as professional antigen presenting cells. Generally, all cells present antigen using the same method, first breaking down antigens into smaller fragments and then presenting them to T-cells as pMHC. What distinguishes professional antigen presenting cells from other cells is the type of MHC molecule that they use to present antigen. There are two types of MHC molecule: MHC class I, which is found on all cells, and MHC class II, which is only found on professional antigen presenting cells. The presence of MHC class II molecules allows professional APCs to present antigen to a particular type of T-cell known as the helper T-cell. In the absence of MHC class II, cells may only present antigen to cytotoxic T-cells, another specialised form of T-cell. It should be noted that although B-cells are professional APCs by definition, this function is very much secondary to the cell’s main purpose, that being the production of antibodies. Furthermore, B-cells are limited to presenting only those antigens that match the particular antibodies that they produce.
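The presentation rule described above can be summarised in a few lines of Python. This is a minimal illustrative sketch only; the names `TCellType` and `can_present_to` are hypothetical and are not taken from any of the cited works.

```python
from enum import Enum


class TCellType(Enum):
    HELPER = "helper"
    CYTOTOXIC = "cytotoxic"


def can_present_to(has_mhc_class_ii: bool) -> set:
    """Return the T-cell types to which a cell can present antigen.

    Assumption: every cell carries MHC class I, as stated in the text,
    so only the presence of MHC class II needs to be supplied.
    """
    targets = {TCellType.CYTOTOXIC}      # presentation via MHC class I
    if has_mhc_class_ii:                 # professional APCs only
        targets.add(TCellType.HELPER)    # presentation via MHC class II
    return targets
```

Under this simplification, an ordinary body cell maps to cytotoxic T-cells only, whereas a professional APC such as a dendritic cell maps to both helper and cytotoxic T-cells.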

Lymphocytes. The lymphocytes are the most important cells of the adaptive immune system. The two main types of lymphocyte are B-lymphocytes (B-cells) and T-lymphocytes (T-cells); there is also a third type, natural killer cells (NK-cells), about which much less is known.

B-cells and T-cells are responsible, both directly and indirectly, for the isolation and removal of many types of pathogen. The pathogens that each cell targets and the methods they use to remove them differ, but the cells do not work in isolation. The interactions between B- and T-cells (and other agents of the immune system) are just as important as the specific functions of the individual cells themselves. The emphasis here is on the cell specifics, that is, the types of pathogen that each cell targets, the way in which they recognise their targets, and the methods through which they remove them. Later sections look at the adaptive immune system from a higher level, describing the interactions between the different cell types and tying together processes of the innate and adaptive immune systems.

There are several subclasses of B-cells and T-cells, each of which plays a different role in the adaptive immune response. The two main subclasses of B-cell are the plasma and memory B-cells. When the right conditions arise these subclasses differentiate from naive B-cells to carry out their specific task. Plasma cells differentiate and proliferate in great numbers from B-cells when an antigen has been detected and, through the production of antibodies, are the most active pathogen-removing B-cells. Memory B-cells are also produced when an antigen has been detected, but in far smaller numbers. The purpose of memory B-cells is to “remember” the encountered antigen, allowing the immune system to respond more effectively the next time that same antigen is seen. There are also two main types of T-cell, namely the cytotoxic and helper T-cells. Cytotoxic T-cells are similar in function to plasma B-cells in that they are directly responsible for the removal of pathogens. Helper T-cells, on the other hand, do not directly attack pathogens; instead, through the release of certain cytokines, they regulate the proliferation of other cells, such as plasma B-cells and cytotoxic T-cells.

The ability of B-cells and T-cells to recognise, or bind to, specific antigens is provided by their receptors. Each cell displays multiple receptors on its surface, all of which are identical in structure and, consequently, all of which are able to recognise and bind to the same molecular patterns or antigens. Because B-cells and T-cells are each responsible for the removal of different types of pathogen, their receptors are structurally very different from each other and have different methods of binding with antigens. T-cells are mainly responsible for the removal of virus-infected or cancerous cells; for this reason the antigens that their receptors recognise are first processed and presented to them by other cells. One of the main differences between helper and cytotoxic T-cells is the type of cell that is able to present antigen to them. Due to the presence and absence of certain proteins, cytotoxic T-cells can only recognise antigen presented to them by cells possessing MHC class I molecules; as it happens, this is most cells in the human body. Helper T-cells, on the other hand, can only recognise antigen presented to them by cells possessing MHC class II molecules, found only on professional antigen presenting cells.


B-cells are primarily responsible for the removal of bacteria and viruses found in the extracellular fluid; consequently, unlike T-cells, they do not require antigen to be presented to them by other cells. B-cell receptors are able to recognise a wide variety of molecular patterns ‘floating’ free in the extracellular fluid. Another way in which B-cell receptors differ from T-cell receptors (TCR) is that they can be secreted by the cell; a secreted B-cell receptor is known as an antibody.

Further to the physical binding of receptors and antigens, ‘recognition’ often requires the presence of certain cytokines and “co-stimulatory” molecules. As described shortly, these may be provided by the involved lymphocytes themselves, neighbouring lymphocytes, or other immune cells such as macrophages and dendritic cells.

In terms of how lymphocytes remove pathogens, B-cells and T-cells employ very different strategies. Plasma cells, the subclass of B-cells responsible for the removal of antigens, do not attack antigens themselves, but instead secrete large numbers of antibodies in what is known as the antibody mediated response. For B-cells, antibodies act as both the receptors and effectors, first detecting the presence of antigens and then initiating the processes that lead to their removal. Antibodies can nullify the effects of pathogens such as bacteria in one of three ways:

• Neutralisation - antibodies bind with antigens on the bacteria, covering their surface and inhibiting any toxic effects

• Opsonisation - from the Greek for ‘prepare for eating’, antibodies cover the pathogen’s surface and make it easier for phagocytes, which recognise part of the antibodies, to ingest it

• Complement activation - in binding to an antigen, antibodies may initiate the complement system, a process described shortly which eventually leads to the death of the associated pathogen

The class of T-cells that are most active in the removal of pathogens are the cytotoxic T-cells. As a result of adaptation or damage, the targets of cytotoxic T-cells (cancerous and virus-infected cells) produce different proteins than they normally would; these proteins are presented on the surface of the infected or damaged cell by MHC class I molecules. Cytotoxic T-cells are able to bind with the pMHC and recognise that the cell is infected; combined with the necessary cytokine signals from other cells, this causes the T-cell to become activated. Unlike B-cells, T-cells are able to directly attack the cells whose antigens they recognise. They do so by releasing special proteins that form channels in the infected cell’s membrane, causing the cell to leak and eventually die.

NK-cells have so far been neglected. As was mentioned, the purpose of NK-cells is less well understood than that of the B-cells and T-cells; however, it is known that they do play an important role in at least one aspect of the immune response. In a similar manner to cytotoxic T-cells, NK-cells directly attack virus-infected cells; however, unlike T-cells, they do so without first recognising any antigens. Though they act non-specifically, they can be seen as part of the adaptive immune system because their behaviour is enhanced by the release of cytokines from helper T-cells.

Eosinophils and Basophils. The eosinophils and basophils are the least well understood of all the leukocytes. It is known that eosinophils in particular are important in the defence against parasites and that they are responsible, at least in part, for the adverse effects of an allergic inflammatory response. Beyond this, however, little is known, and for this reason there shall be no further mention of them in this report.

Immune organs

In the human body there are several organs that are considered to be part of the immune system, though many have other functions as well. Collectively, the organs of the immune system are referred to as the lymphoid organs and are split into two distinct classes, the primary and secondary lymphoid organs. The primary organs, consisting of the thymus and the bone marrow, are responsible for the production and maturation of lymphocytes. The secondary lymphoid organs, including the lymph nodes, spleen and tonsils, contain only mature lymphocytes and are the locations in which lymphocytes interact with antigens; consequently, they are also the locations in which the adaptive immune response is initiated.

Also considered part of the endocrine system, the first of the primary lymphoid organs, the thymus, is situated in the upper chest area and is the site in which T-lymphocytes mature. The second primary lymphoid organ, the bone marrow, is found inside many of the bones in the human body. Bone marrow is responsible for the production of several types of immune cell and is the location at which B-cells mature.

Though the lymphoid organs are not physically connected (except through the cardiovascular system), lymphocytes are constantly travelling between the secondary lymphoid organs to increase the chance that an antigen will be met by a cell with a matching receptor. The secondary lymphoid organs are spread all around the body, allowing the lymphocytes to meet antigens from various different sources. Lymph nodes, found throughout the body, are scattered along the lymphatic vessels with which they form the lymphatic system. As well as containing lymphocytes, the lymph nodes are also home to macrophages and macrophage-like cells and are the main site at which the adaptive response takes place. The spleen is found below the diaphragm, on the left-hand side of the body. Because the spleen is connected directly to the bloodstream, it is the main location at which lymphocytes fight against pathogens that have entered the blood. The tonsils are found in the pharynx; once again due to their location, they mainly serve to protect against ingested or inspired pathogens, providing important protection for the respiratory system.

The immune response

Having introduced the basic concepts of the immune system, the main agents involved and the basic physiology of the system, it is now possible to describe how these parts interact with one another to form the various processes of an immune response. The discussion is split into two parts, the first describing the innate immune response, and the second describing the adaptive immune response. It is important to note that the innate and adaptive systems are heavily inter-linked; the adaptive system in particular is highly dependent on certain cells and processes of the innate system, and these links are emphasised in the following descriptions. Over time, such is the nature of experimental science, several immunological theories have been proposed, accepted, contested and disproved. Not all the immunological theories discussed here are still accepted by every immunologist, but they all describe interesting computational properties and have all inspired work in the field of artificial immune systems; it is primarily for this reason that they are included.

Innate immune responses

The innate immune response is the body’s first line of defence; it is initiated quickly and operates over a relatively short period of time. The agents involved in the innate immune response include the neutrophils, macrophages, basophils, eosinophils and dendritic cells as well as several cytokines and other specialised proteins. Two of the most important processes in the innate immune response are inflammation and the complement cascade.

Inflammation is the body’s first response to infection or injury. Typified by localised redness, swelling, heat and pain at the site of infection, the basic inflammatory response involves the transport of immune cells to the injured area, followed by the removal of the cause of the infection and finally the repair of the damaged tissue. Widmaier et al. (2006) describe the inflammatory response to a bacterial infection in six steps:

1. Bacteria entering the tissue are met by macrophages which, in response to recognising the pathogen, become activated and release various different cytokines.

2. Vasodilation then occurs in the localised area, increasing the flow of blood to the infection site.

3. The permeability of nearby capillaries is also increased in the inflamed area, making it easier for the various proteins involved in the response to reach the infection.

4. Chemotaxis, the movement of leukocytes, most commonly neutrophils, into the infected area is then observed. This is mediated by the various proteins and adhesion molecules released by damaged cells, activated macrophages, and the cells lining the blood vessels near the infected area. The release of such agents causes leukocytes in the blood to slow down and stick to the vessel wall when passing the site of infection, eventually drawing them in and towards the damaged area.

5. The leukocytes then begin to destroy the invading pathogens. Phagocytosis is the most common method through which the bacteria are removed, but phagocytes may also release substances such as nitric oxide directly into the extracellular fluid, destroying the pathogens without the need for phagocytosis. On encountering pathogens, the phagocytes release cytokines which, through positive feedback, attract further phagocytes to the area.

6. The final step involves the repair of the damaged tissue.

The complement cascade, closely linked with the inflammatory response, is another important aspect of the innate immune system. The ‘complement’ proteins are a family of proteins capable of destroying extracellular pathogens without the need for phagocytosis. Once activated, certain complement proteins will activate others, which in turn will activate further proteins, hence forming a cascade. One of the methods through which the complement proteins can destroy pathogens is to form a membrane attack complex (MAC) comprised of five different types of protein. The MAC attacks its targets by embedding itself in their membrane, thus creating channels which destroy the integrity of the pathogen, causing it to ‘leak’ and eventually leading to its destruction. Certain molecules of the cascade also help to regulate the inflammatory response. Either directly or through the stimulation of other substances, the complement cascade aids the inflammatory response in the following three ways:

1. Vasodilation - increasing the flow of blood to the site of infection

2. Chemotaxis - attracting leukocytes to the site of infection

3. Opsonisation - making it easier for phagocytes to ingest pathogens

The complement cascade, described here as part of the innate immune response, can also be initiated by processes of the adaptive response. When initiated by the adaptive immune response the complement cascade is said to take the classical pathway. When initiated by the innate immune response the cascade is said to take the alternative pathway.

The innate immune response is also partially responsible for initiating and regulating the adaptive immune response. One immunological theory of how the innate immune system accomplishes this is provided by danger theory. First proposed by Matzinger (1994), danger theory suggests that for an antigen to initiate an adaptive immune response, not only must that antigen be recognised by immune cells, but it must also be perceived to be dangerous. That is to say, to fully activate an immune cell and initiate an immune response, both the threat posed by an antigen (the danger) and the context in which the antigen was encountered must be taken into account. The threat, or danger, is determined by the presence or absence of a number of ‘signals’. These signals are simply proteins or cytokines released by pathogens or cells of the body under certain conditions. A good indicator of danger is the process through which a cell dies. Normal cell death, known as apoptosis, is a very orderly process: the cell is broken down gradually and signals characteristic of this behaviour are released. Abnormal cell death, or necrosis, on the other hand, which is caused by stress or some pathogenic presence, is highly chaotic: the cell swells up and bursts, releasing its contents into the surrounding fluid, and this also produces characteristic signals. Danger theory is mostly concerned with endogenous signals, those released by the body’s own cells, but exogenous signals, such as pathogen associated molecular patterns (PAMPs), may also have a role.

[Figure 3.5 labels: states: immature (collect antigen, receive signals), semi-mature (present antigen, inhibit T-cells), mature (present antigen, stimulate T-cells); transition labels: safe signals; danger signals, PAMPs; location labels: extracellular fluid, lymph nodes]

Figure 3.5: A diagram showing the different states of maturity of dendritic cells. The locations where the different cell types are most prevalent, the transitions between them and the effects they have on helper T-cells are also shown.

The main agents of danger theory are dendritic cells. Acting in their role as antigen presenting cells, they sample antigens from the environment and process signals before presenting their antigens to helper T-cells. Depending on the signals they receive, dendritic cells may activate or suppress the effects of helper T-cells, resulting in the activation or suppression of the adaptive immune response. Dendritic cells can exist in several states of maturity, and it is the state they are in when they present their antigens to T-cells which determines whether their effect is excitatory or inhibitory. Initially dendritic cells are said to be immature; after sampling their environment they differentiate into mature or semi-mature states. Semi-mature dendritic cells arise when an immature dendritic cell is exposed to large amounts of endogenous signals that are characteristic of apoptosis, and mature dendritic cells arise when an immature dendritic cell is exposed to endogenous or exogenous signals associated with pathogens or indicative of necrosis. The presentation of antigen to a T-cell by a semi-mature dendritic cell results in the suppression of that T-cell, whereas the presentation of antigen to a T-cell by a mature dendritic cell results in the activation of that T-cell, leading to the initiation of the adaptive immune response. Figure 3.5 shows a simplified diagram of the development of dendritic cells and their interactions with T-cells.
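The maturation behaviour described above can be sketched as a small state machine. The following Python is a toy illustration under stated assumptions, not an implementation of any published algorithm: the class name `DendriticCell`, the scalar signal values, and in particular the rule that a cell matures fully when its accumulated danger signal exceeds its accumulated safe signal are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class DendriticCell:
    safe: float = 0.0      # accumulated apoptosis-associated ("safe") signal
    danger: float = 0.0    # accumulated necrosis- or PAMP-associated signal
    antigens: list = field(default_factory=list)
    state: str = "immature"

    def sample(self, antigen, safe_signal=0.0, danger_signal=0.0):
        """Collect an antigen together with the signals seen alongside it."""
        if self.state != "immature":
            raise RuntimeError("only immature cells sample the environment")
        self.antigens.append(antigen)
        self.safe += safe_signal
        self.danger += danger_signal

    def differentiate(self):
        """Leave the immature state (assumed rule: compare signal totals)."""
        self.state = "mature" if self.danger > self.safe else "semi-mature"
        return self.state

    def present(self):
        """Pair each collected antigen with the effect on helper T-cells."""
        if self.state == "immature":
            raise RuntimeError("immature cells do not present antigen")
        effect = "activate" if self.state == "mature" else "suppress"
        return [(antigen, effect) for antigen in self.antigens]
```

A cell that samples an antigen in a context dominated by danger signals would therefore present it with an activating effect, whereas the same antigen sampled in a predominantly safe context would be presented with a suppressive effect.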

Adaptive immune responses

Due to the ability of the cells involved to recognise very specific antigens, the adaptive immune response is also referred to as the ‘specific’ immune response. The response is adaptive in the sense that over time the population of cells changes based upon the experience of the organism and the different antigens that it has encountered. After encountering a particular pathogen or antigen, the population of immune cells adapts to better protect the organism from that pathogen; the next time that the organism encounters the pathogen it will respond with greater speed and severity. According to Widmaier et al. (2006) the specific immune response is a three-step process, incorporating:

• encounter and recognition of antigens

• activation of lymphocytes

• attack of antigens (or pathogens)

Each of these steps is now addressed in turn.

Recognition. The recognition of an antigen by lymphocyte receptors is often spoken of in terms of self/non-self discrimination, where ‘self’ are molecules of the body’s own cells and ‘non-self’ are molecules belonging to foreign cells. The human body is able to create millions of different lymphocyte receptors, each able to recognise different antigens; it seems paradoxical, then, that the particular proteins that make up these receptors are accounted for by only roughly 200 genes (Widmaier et al., 2006). Such diversity is possible because, during the development of lymphocytes, the DNA of the genes that make up the receptors is randomly cut and recombined. Furthermore, the genes may mutate at a high rate through a process known as somatic hypermutation, which further increases the potential diversity. Because of genetic recombination and somatic hypermutation, receptors can be produced that recognise virtually any molecular pattern, including self molecules. Lymphocytes, however, do not (normally) recognise self molecules; when they do the consequences are disastrous, with the immune system mounting an autoimmune response against the body itself. The fact that receptors which recognise self are not produced is described as self-tolerance. Self-tolerance is thought to be due to negative selection, the process during the development of lymphocytes in the thymus and bone marrow through which cells whose receptors recognise self molecules are selectively removed or have their receptors edited or replaced. Furthermore, lymphocytes that survive development may later simply become inactive or ineffective due to the lack of co-stimulation they receive from other cells in the immune system. It should be noted that the recognition of an antigen by a receptor is not a binary relationship; some receptors may form stronger binds with antigens than others. The strength of a receptor-antigen bind is described as the affinity of the receptor to that antigen, and receptors which bind strongly with an antigen are said to have a high affinity for that antigen.
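The ideas of random receptor generation, negative selection and affinity can be illustrated with a short sketch. The Python below is a toy model under loudly stated assumptions: receptors and molecules are represented as equal-length bit strings, affinity is taken to be the fraction of matching positions, and "recognition" means affinity at or above a threshold. None of these representational choices come from the text, and the function names are hypothetical.

```python
import random


def affinity(receptor: str, molecule: str) -> float:
    """Graded (non-binary) match: fraction of positions that agree."""
    matches = sum(r == m for r, m in zip(receptor, molecule))
    return matches / len(receptor)


def negative_selection(self_set, n_receptors, length, threshold, rng,
                       max_tries=10_000):
    """Generate random receptors, discarding any that recognise self."""
    repertoire = []
    for _ in range(max_tries):
        if len(repertoire) == n_receptors:
            break
        # Random generation stands in for genetic recombination.
        candidate = "".join(rng.choice("01") for _ in range(length))
        # Remove candidates whose affinity for any self molecule is too high.
        if all(affinity(candidate, s) < threshold for s in self_set):
            repertoire.append(candidate)
    if len(repertoire) < n_receptors:
        raise RuntimeError("could not build a self-tolerant repertoire")
    return repertoire
```

Every receptor that survives this process is, by construction, tolerant of the supplied self set, while remaining free to match any other pattern with some graded affinity.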

Activation. The binding of a lymphocyte and antigen alone is not sufficient to initiate the adaptive immune response; that is to say, it is not sufficient to orchestrate an attack upon the antigen with which the lymphocyte is bound. To fully initiate an immune response the lymphocytes must become activated. Activation involves the interaction of multiple immune cells, cytokines and other signalling molecules and, crucially, leads to the proliferation of cells which recognise the antigen that initiated the response. The proliferation of lymphocytes is an extremely important step in the adaptive immune response, since only when they exist in sufficient numbers will the cells be able to fully combat the effects of their pathogenic targets. The activation of helper T-cells was earlier described in terms of danger theory. Alternatively, helper T-cell activation may be described simply as a sequence of events incorporating: the binding of pMHC-II on an APC with the receptors of helper T-cells; the co-stimulation of the T-cells, provided by ‘nonantigenic’ matching proteins on the surfaces of both cells; and the release of certain cytokines by the APC (Widmaier et al., 2006).

Another theory of T-cell activation is that of the complex tuneable signalling pathways described by Stefanová et al. (2003). The bind between TCR and pMHC is much weaker than the bind between antigens and antibodies. The signalling pathways of T-cell receptors help to explain why, even with a weak physical bind and when the antigens are greatly outnumbered by self-molecules, T-cells may still become activated by antigens. According to the theory, whether or not a T-cell is activated is largely determined by the duration of the TCR-MHC bind: if this bind exists long enough for a process known as kinetic proofreading to complete, then the T-cell will become activated. However, during kinetic proofreading, negative feedback is in effect which can reverse the process. Importantly though, if a TCR completes proofreading, it can spread signals which protect other TCRs from the effects of negative feedback.

Once activated, helper T-cells then provide the necessary stimulation, through the release of cytokines, for the activation of B-cells and cytotoxic T-cells. The activated lymphocytes, capable of recognising the antigen that initiated the response, then proliferate in great numbers through a process known as clonal expansion. As its name suggests, clonal expansion involves the creation, through cloning, of multiple copies of the original activated lymphocytes, producing new cells which may themselves go on to produce clones. T-cells and B-cells both undergo clonal expansion. However, due to the presence of somatic hypermutation, the expansion of B-cells is more interesting than that of T-cells and leads to the production of cells that are not only capable of recognising the antigen that initiated the response but that are actually better at doing so than the original cells, that is to say, they have a higher affinity for the antigen. For this reason the focus here is on the clonal expansion of B-cells only.

The clonal expansion of B-cells is similar to Darwin’s theory of natural selection (Darwin, 1859). The process begins when B-cells bind with antigen; with the necessary stimulation from T-helper cells, the B-cells become activated and produce clones with a high mutation rate. The presence of genetic mutation leads to a diverse set of progeny, some of which will no longer match the antigen and some of which will form a better match with a higher affinity. Those cells which do not match the antigen will not become activated and will proliferate no further; those which do match the antigen will produce clones. Cells with a higher affinity to the antigen are more likely to become activated and hence more likely to produce similar clones, and so, over time, the population of B-cells becomes better at matching the antigen.

Attack After proliferation, in order to attack the invading pathogen, B-cells must differentiate into plasma cells, the antibody-producing cells responsible for initiating the antibody-mediated response. The first time that the immune system encounters an antigen, known as the primary response, there will be very few B-cells that match the antigen and the response will be slow. However, after a short period (the lag phase), cells which match the antigen will proliferate and differentiate into plasma cells, eventually leading to the removal of the invading pathogen. The second time that the immune system encounters the same antigen (the secondary response), the lag phase is much shorter and the response is much quicker, with antibodies being produced in larger numbers at a faster rate. The reason for this is that during the primary response, some B-cells also differentiate into memory cells. Memory cells are long-living B-cells which possess receptors for the antigen that led to their creation but secrete few (if any) antibodies. During the secondary response, they act as a starting point and can quickly differentiate into plasma cells, attacking and removing the pathogen much more quickly than during the primary response.

Not only can the immune system respond faster to previously seen antigens, it can also respond faster to antigens which are structurally similar to previously seen antigens, a phenomenon known as cross-reactivity which is easily accounted for by the clonal expansion theory.

There are two further immunological theories that do not easily or exclusively fit under the headings of ‘innate-’ or ‘adaptive immune response’, but deserve attention nonetheless. The first, immune network theory, pioneered by Jerne (1974), is no longer widely accepted by immunologists, but has inspired a large amount of work in the field of AIS. The second, immune cognition, a phrase made famous by Cohen (2000), discusses the immune system from a less biological and more philosophical point of view, describing the functions of the immune system in terms of cognition and self-organisation.

Immune network theory

The immune, or idiotypic, network theory of Jerne (1974) proposes that the immune system is made up of a large network of interacting lymphocytes and molecules, each capable of recognising each other, even in the absence of antigen. Recognition occurs through the matching of paratopes on antibodies to epitopes on antigens. The immune network theory states that antibodies recognise not just foreign antigens but also other antibodies. Analogous to epitopes on antigens, the regions on antibodies that can be recognised are referred to as idiotopes, and the set of all idiotopes displayed by a set of antibodies is referred to as an idiotype. Upon recognition of an epitope or idiotope, antibodies may have a positive or negative effect on the corresponding cell or molecule, resulting in, for example, the proliferation or suppression of that cell or molecule. These simple network dynamics allowed Jerne to explain several interesting properties of the immune system, such as self-tolerance and immune learning and memory. Though the theory may still be useful in an artificial context, it is no longer accepted by most immunologists.

Immune cognition

The immune system, in the opinion of Cohen (2000), should be considered as a complex cognitive system. According to Cohen, far from simply being responsible for the removal of pathogens, the immune system should be viewed as a general maintainer of the body, acting both in the presence and absence of pathogens. The immune system does not simply stop functioning when the body is free of foreign antigen; it is constantly involved in the processes of repair and cell death, growth and differentiation. This maintenance, it is argued by Cohen, is achieved via a three-step process, incorporating:

1. Recognition - determining what is right and what is wrong

2. Cognition - interpreting signs, evaluating them, and making decisions

3. Action - carrying out the decisions

Summary

The natural immune system is a large and complex system, consisting of several immune agents, interacting in a distributed and self-organising manner to ensure, through various forms of protection and maintenance, the continued survival of the organism in which they are situated. The immune system can be viewed as a two-part system. The first part, the innate immune system, is the organism’s first line of defence, protecting the organism over a short period of time in a “non-specific” manner. The second part, the adaptive immune system, acts over a much longer timescale, protecting the organism from specific threats that the immune system must ‘learn’ to recognise. The main threats to an organism are biological pathogens, of which there can be considered at least four different types: viruses, bacteria, fungi and parasites. The main actors responsible for removing or reducing the effects of these pathogenic threats are the white blood cells, or leukocytes. There are five main types of leukocyte: eosinophils, neutrophils, basophils, lymphocytes and macrophages. Of these, lymphocytes may be considered the most important cells of the adaptive immune system, and neutrophils and macrophages, acting in their roles as phagocytes and antigen presenting cells, the most important agents of the innate immune system.

Manufactured and resident in the primary and secondary lymphoid organs respectively, the role of the lymphocytes in the adaptive immune response has been explained by a number of different immunological theories. Amongst them, the clonal expansion theory explains how, from a population of cells in which very few are capable of recognising a particular foreign antigen, a population may emerge in which there are a large number of cells capable of recognising the antigen. What is more, these cells may recognise the antigen with a much higher affinity than their predecessors. With the inclusion of memory cells, clonal expansion also accounts for the fact that secondary and tertiary responses to a particular antigen will happen with greater speed and severity. Another theory, negative selection, describes the removal of lymphocytes that match with self molecules and in doing so helps to explain the concept of self-tolerance, the fact that immune cells do not (usually) mount a response against the organism’s own cells.

The roles of the neutrophils and macrophages of the innate immune system are most evident in the processes of inflammation and the alternative complement cascade. Further to these direct forms of coping with foreign invaders, the innate immune system is also responsible for regulating the adaptive immune response. One theory in particular, danger theory, attempts to explain this link between the innate and adaptive immune systems.

Other theories, such as immune network theory, which proposes a view of the immune system as a massive network of interacting cells and molecules, and Cohen’s view of the immune system as a cognitive system, can also explain several of the interesting properties of the natural immune system that one may wish to abstract into an artificial context.

3.2.2 Developing Artificial Immune Systems

Over the years, immunologists, mathematicians, computer scientists and engineers have all attempted, in some form or another, to model the processes of the natural immune system. Across these various fields, two extreme approaches to artificially encapsulating immune properties may be identified. At one end of the spectrum are engineers, who take inspiration from immunology to create algorithms based upon abstractions of biological processes and apply these algorithms to specific problems. At the other end of the scale are computational immunologists, who use mathematics and other modelling tools to accurately simulate biological processes within a computer, with the hope of better understanding how a particular process or function works and aiding in the study of the causes, effects and cures of infectious diseases. The original definition of an artificial immune system from de Castro and Timmis (2002) (introduced at the start of this section) more closely resembles the first approach, as does de Castro and Timmis’s application-driven framework for engineering artificial immune systems (depicted in figure 3.6), upon which a large number of immune-inspired systems are based. Recent years, however, have seen a change in the focus of AIS, with more people moving into the middle ground that lies between the two extreme approaches.

One of the early proponents of this shift was Stepney et al. (2005), who, noticing the often naive approach of many in abstracting biological principles into computational algorithms, particularly within the field of AIS, proposed a conceptual framework that invokes the use of more sophisticated biological models. Stepney et al.’s framework is depicted in figure 3.7. The framework is iterative throughout but always begins with ‘probing’ the biological, or in this case immunological, system of interest. From observations of the biological system, abstract models are constructed and from these, analytical computational frameworks are built. Through the continued interaction between engineers, biologists, and mathematicians, these models may be refined and validated, possibly resulting in further probing of the biological system. When enough is learnt about the system, bio-inspired algorithms may be constructed, which even then may feed back into the modelling stages.

[Figure labels: Representation; Immune Algorithms; Affinity Measures; AIS; Solution; Application Domain]

Figure 3.6: A layered framework for the development of artificial immune systems. From de Castro and Timmis (2002).

[Figure: an outline conceptual framework for a bio-inspired computational domain, reproduced from Stepney et al. (2005).]

Figure 3.7: An inter-disciplinary, conceptual framework for designing artificial immune systems. From Stepney et al. (2005).

In response to a perceived stagnation in the field, and in agreement with Stepney et al. (2005) over the need for more biologically plausible approaches to AIS, Timmis (2007) set four challenges for the community:

1. To develop novel and accurate metaphors and be a benefit to immunology,

2. to consider the application of AIS,

3. to develop a theoretical basis for AIS,

4. and to consider the integration of immune and other systems.

Following on from these challenges, Timmis et al. (2008b) define the concept of immuno-engineering, a principled, inter-disciplinary approach to designing artificial immune systems, involving closer interaction between immunologists and engineers. Immuno-engineering does not contradict de Castro and Timmis’s original definition or framework; it merely augments it. Based upon the conceptual framework of Stepney et al. (2005), without losing sight of what, for many, is the most important part of an artificial immune system, the application, the immuno-engineering approach aims to benefit engineers and immunologists alike.

Due to the relative novelty of the approach, not all of the systems reviewed in the following section follow the principles of immuno-engineering strictly. However, to ensure the continued success of the field, it is hoped that most future AIS work will follow the guidelines of Stepney et al. (2005) and Timmis et al. (2008b).

3.2.3 Algorithms and Applications

Artificial Immune Systems have been applied to a wide variety of applications, including computer and network security, anomaly detection and diagnosis, robot control, optimisation problems, pattern recognition, and machine learning, to name but a few (de Castro and Timmis, 2002). An exhaustive review of all applications would not be appropriate or necessary (for such a review see de Castro and Timmis (2002), Dasgupta et al. (2003), Hart and Timmis (2005) and more recently Zheng et al. (2010)). Instead, in this section only a subset of the existing applications are reviewed, specifically those that relate closely to the main themes of this report: robotics and fault tolerance. It should be noted that several algorithms, though initially designed for a specific problem, could have wider-reaching applications. Consequently, state-of-the-art and seminal work outside of the aforementioned applications is also reviewed, but where possible, attempts are made to address these approaches within the context of fault tolerant, collective robotic systems.

[Figure key: normal "self" elements; anomalous "non-self"; "non-self" detectors]

Figure 3.8: Diagram showing a potential weakness of the negative selection approach. If the size of “non-self” is much larger than the size of “self” then it may be hard to build up a set of detectors that sufficiently cover the “non-self” space. Without full coverage, any anomalies that fall within the white space will not be detected.

According to the framework of de Castro and Timmis (2002), at the heart of an artificial immune system is an algorithm. Over the years, several different types of algorithm have become popular, each based on different cells, processes and principles of biological immune systems. In this section, work is categorised into one of several classes, according to the type of algorithm that it employs. The three main classes reviewed here are: negative selection, clonal selection and danger theory (exemplified by the dendritic cell algorithm). A further class is reserved for newer, novel approaches which cannot easily be placed within any of the other categories. For each of the three main classes, the basic algorithm is introduced first, followed by a review of some of its variations and the problems to which it has been applied.

Negative Selection

Some of the earliest approaches to AIS and fault tolerance centre around the concept of negative selection. The first use of the negative selection principle as a form of artificial immune system was that of Forrest et al. (1994), who applied their technique to the problem of detecting computer viruses. Another pioneering use of the approach, perhaps more pertinent to the topic of this report, was the detection of anomalies in time-series data by Dasgupta and Forrest (1995).

The algorithm itself is conceptually very simple, as demonstrated by the standard example presented in algorithm 1. During a training phase, detectors are created at random and their affinity with a set of elements that describe the normal state of the system is calculated. If the affinity between a detector and a member of the training set is greater than a predetermined threshold then that detector is discarded, otherwise it is added to the set of generated detectors. The output of the training phase is a set of detectors that will only recognise abnormal behaviour. When new data is presented to the set of detectors, only anomalous items will have a high affinity, and hence, be classified as such.

Algorithm 1 The Standard Negative Selection Algorithm

input : S = set of elements describing normal behaviour
output : D = set of detectors to be generated

loop
    randomly generate a new detector d
    calculate the affinity between d and every member of S
    if affinity between d and at least one member of S greater than some threshold then
        d is rejected
    else
        d is added to D
    end if
end loop
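
As an illustration only, algorithm 1 might be sketched in Python as follows. The binary-string representation, the Hamming-style affinity measure, and the parameters n_detectors and max_tries (needed to terminate the otherwise open-ended loop) are all assumptions made for this sketch and are not prescribed by the algorithm itself.

```python
import random


def hamming_affinity(a, b):
    """Assumed affinity measure: number of positions at which a and b agree."""
    return sum(x == y for x, y in zip(a, b))


def generate_detectors(self_set, n_detectors, length, threshold, rng,
                       max_tries=100_000):
    """Training phase: keep random detectors that match no self element."""
    detectors = []
    for _ in range(max_tries):  # hypothetical cap on the open-ended loop
        if len(detectors) == n_detectors:
            break
        d = tuple(rng.randint(0, 1) for _ in range(length))
        # Reject d if its affinity with any self element exceeds the threshold.
        if any(hamming_affinity(d, s) > threshold for s in self_set):
            continue
        detectors.append(d)
    return detectors


def is_anomalous(item, detectors, threshold):
    """Detection phase: an item is anomalous if any detector matches it."""
    return any(hamming_affinity(item, d) > threshold for d in detectors)
```

By construction, no element of the training set can be flagged by the detectors that survive training; whether a given anomalous item is flagged depends on how well the detectors cover the remaining space.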

There is one obvious drawback to the negative selection approach, depicted in figure 3.8, where the central orange circle represents normal states of the system and the surrounding space represents abnormal or anomalous states. During training, detectors, represented by the red circles in figure 3.8, will be created at random locations in this space; the radius of the circles represents the affinity threshold. Detectors which have an affinity to the normal states greater than the threshold, that is, red circles which overlap with the central orange circle, will be thrown away, leaving something like the picture shown in figure 3.8. The problem with this approach is that, as shown in figure 3.8, it may take a long time to create a set of detectors which covers the entire space of abnormality, and if this space is not covered, any new data that falls within the white region will be wrongly classified as normal. Despite this potential flaw, a number of authors have shown negative selection inspired approaches to be an effective form of fault tolerance, employing various techniques to get around this initial problem.
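
The coverage problem can be made concrete with a small Monte Carlo estimate. The sketch below assumes a two-dimensional unit square as the state space, a circular self region and circular detectors of fixed radius, loosely mirroring figure 3.8; the function names and all numerical values are hypothetical.

```python
import math
import random


def random_detectors(n, radius, self_centre, self_radius, rng,
                     max_tries=100_000):
    """Place detectors at random, discarding any that overlap the self region."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n:
            break
        d = (rng.random(), rng.random())
        if math.dist(d, self_centre) >= self_radius + radius:
            detectors.append(d)
    return detectors


def covered_fraction(detectors, radius, self_centre, self_radius, rng,
                     samples=20_000):
    """Estimate the fraction of non-self space lying inside some detector."""
    hits = total = 0
    for _ in range(samples):
        p = (rng.random(), rng.random())
        if math.dist(p, self_centre) <= self_radius:
            continue  # a self point, so not part of the non-self space
        total += 1
        if any(math.dist(p, d) <= radius for d in detectors):
            hits += 1
    return hits / total if total else 0.0
```

Calling covered_fraction with a growing detector set and a fixed sampling seed shows how slowly the uncovered (white) region shrinks as detectors are added at random.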


A notable early example is the immunotronics approach of Bradley and Tyrrell (2000), one of the first applications of AIS techniques to hardware-based fault tolerance. Exemplified by Bradley and Tyrrell (2001) and Canham et al. (2003), the immunotronics approach shows how immunological concepts can be mapped onto a hardware platform such as an FPGA and provide effective fault tolerance based on the abstraction of immunological processes such as negative selection. To add further reliability to such systems, in Bradley et al. (2000), the immunotronics approach is combined with “embryonic arrays”, another hardware-based bio-inspired technique that takes its inspiration from the development of single cells into multi-cellular organisms.

Clonal Selection

Clonal selection algorithms, as their name suggests, are based on the principles of clonal expansion and selection that were introduced in section 3.2.1. Adapted from Timmis et al. (2008a), algorithm 2 shows the standard clonal selection algorithm for the task of pattern matching. Starting with an initially random population of antibodies, the affinity between a single input pattern and every antibody in the population is calculated. The antibodies which have the highest affinity are cloned, the number of clones being proportional to the affinity of the antibody with the input pattern. The clones are then mutated, producing some antibodies with higher and some with lower affinities than their parent; the antibodies with the highest affinity are stored in the memory set. To ensure diversity, analogous to receptor editing in biology, the antibodies with the lowest affinity to the pattern are removed and replaced with new randomly generated antibodies. The process is then repeated for a predetermined number of generations, or until an acceptable memory set is formed.

Algorithm 2 The Standard Clonal Selection Algorithm

input : S = set of patterns to be recognised, n = the number of worst elements to select for removal
output : M = set of memory detectors capable of classifying unseen patterns

create an initial random set of antibodies: A
loop
    for pattern in S do
        1) calculate and store the affinity between the input pattern and every antibody in A
        2) clone the antibodies with the highest affinity to the input pattern
           (the number of clones produced is proportional to the antibody affinity)
        3) mutate the clones
        4) add the antibodies with the highest affinity to the memory set M
        5) replace the n lowest affinity antibodies with new randomly generated antibodies
    end for
end loop
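
A minimal Python sketch of algorithm 2 is given below, for illustration only. Several details are assumptions that the listing leaves open: antibodies and patterns are binary tuples, affinity counts matching bits, one memory cell is kept per pattern, clones re-enter the population before it is truncated back to its original size, and the parameters n_select, clone_factor and mutation_rate are hypothetical.

```python
import random


def affinity(antibody, pattern):
    """Assumed affinity measure: number of matching bit positions."""
    return sum(a == p for a, p in zip(antibody, pattern))


def mutate(antibody, rate, rng):
    """Flip each bit independently with probability rate."""
    return tuple(1 - bit if rng.random() < rate else bit for bit in antibody)


def clonal_selection(patterns, pop_size, n_replace, generations, rng,
                     n_select=5, clone_factor=0.5, mutation_rate=0.1):
    length = len(patterns[0])

    def random_antibody():
        return tuple(rng.randint(0, 1) for _ in range(length))

    population = [random_antibody() for _ in range(pop_size)]
    memory = {}  # assumption: one memory cell per input pattern
    for _ in range(generations):
        for pattern in patterns:
            def score(ab):
                return affinity(ab, pattern)
            # 1) rank antibodies by affinity to the input pattern
            population.sort(key=score, reverse=True)
            # 2) clone the best, in proportion to affinity; 3) mutate clones
            clones = []
            for ab in population[:n_select]:
                n_clones = max(1, round(clone_factor * score(ab)))
                clones.extend(mutate(ab, mutation_rate, rng)
                              for _ in range(n_clones))
            # 4) remember the best candidate seen so far for this pattern
            best = max(population[:n_select] + clones, key=score)
            if pattern not in memory or score(best) > score(memory[pattern]):
                memory[pattern] = best
            # assumption: clones compete with parents for a place in A
            population = sorted(population + clones, key=score,
                                reverse=True)[:pop_size]
            # 5) replace the n lowest-affinity antibodies with random ones
            if n_replace:
                population[-n_replace:] = [random_antibody()
                                           for _ in range(n_replace)]
    return memory
```

Because the memory cell for a pattern is only ever replaced by a better match, its affinity cannot decrease from one generation to the next.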

Three of the most influential AIS algorithms of recent years that are based upon the clonal selection principle are: CLONALG (de Castro and Von Zuben, 2002), AIRS (Watkins et al., 2004) and the B-cell Algorithm (BCA) (Kelsey and Timmis, 2003). Of the three, CLONALG most closely resembles the generic algorithm 2; it was developed primarily to perform machine-learning and pattern-recognition tasks but has also been applied to other types of problem. AIRS is a supervised learning algorithm in which memory cells are created based on the exposure of B-cells to training data (antigens), the standard clonal expansion with mutation process, and the competition for resources amongst B-cells. Once training has finished, memory cells classify input data using a k-nearest neighbour style approach, the k most stimulated memory cells voting for the class they are associated with. The first application of the BCA was to the problem of function optimisation. The algorithm itself is similar to CLONALG and the generic algorithm shown above, the stand-out difference being the novel mutation operator used, which the authors refer to as contiguous somatic hypermutation. This operator differs from standard techniques because, rather than mutations happening uniformly along the antibody vector, they are focused within a particular region, a process which is believed to be more biologically plausible.
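
The difference between uniform mutation and a region-focused operator can be sketched as follows. This is one plausible reading of a contiguous mutation operator rather than a reproduction of the BCA: the choice of a random start position and region length, the non-wrapping region and the flip_prob parameter are all assumptions.

```python
import random


def contiguous_hypermutation(antibody, rng, flip_prob=1.0):
    """Mutate bits only inside one randomly chosen contiguous region.

    Returns the mutated antibody and the (start, length) of the region.
    """
    n = len(antibody)
    start = rng.randrange(n)              # assumed random start of the region
    length = rng.randint(1, n - start)    # region does not wrap around
    mutated = list(antibody)
    for i in range(start, start + length):
        if rng.random() < flip_prob:
            mutated[i] = 1 - mutated[i]
    return mutated, (start, length)
```

Bits outside the selected region are never altered, in contrast to the uniform per-bit mutation typical of the generic algorithm.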

One example of a fault tolerant robotic system that takes its inspiration (loosely) from the principles of clonal selection is provided by Jakimovski et al. (2009). Jakimovski et al. describe a robot anomaly detection system (RADE) to detect faults in the legs of a hexapod robot named OSCAR (Organic Self-configuring and Adaptive Robot). When a fault is detected in one of the robot’s legs, a unique self-amputation mechanism known as R-LEGAM (Robot LEG AMputation) is used to remove the leg, and a novel strategy for re-configuring the orientations and control of the legs, known as SIRR (Swarm Intelligence for Robot Re-configuration) (Jakimovski et al., 2008), is initiated. It should be emphasised that the authors are not trying to model the biological process of autotomy or self-amputation, which is more commonly the result of a distraction defence mechanism than a problem with the functionality of the appendage. The authors are merely using an immune inspired algorithm as an anomaly detection system and, following detection, self-amputation as a method of fault recovery.

The RADE system was originally introduced by Jakimovski and Meyer (2008). The approach involves the use of fuzzy logic and pre-defined rules such as:

IF behaviour IS walking AND scurrent IS verybig THEN anomaly IS present WEIGHT 0.1

Here, attributes such as scurrent are fuzzy sets with hand-chosen membership rules. The above rule describes anomalous behaviour; the only other type of rule possible is one which describes normal behaviour. If the conditions of a rule are met then the rule is said to fire and, analogous to clonal proliferation (without mutation), the authors argue, the weight of the firing rule is increased whilst the weights of all rules of the opposite type are decreased. Each leg of the robot is monitored in isolation and the decision as to whether or not a leg is faulty is determined by a weighted sum of the rules, outputting a value between 0 and 1, where 0 indicates normal behaviour and 1 indicates anomalous behaviour. The approach of Jakimovski and Meyer is rather simplistic and does not make full use of all the benefits of the clonal selection principle (it does not include mutation, for example), although there are future plans to incorporate a learning mechanism for automatically generating rules.
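
The weight-update idea described above can be illustrated with a deliberately simplified sketch. It is not the RADE implementation: the Rule class, the fixed step size, the clipping of weights to the range 0 to 1 and the normalised weighted sum used as the anomaly score are all assumptions, and fuzzy membership is reduced to a rule either firing or not.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    kind: str      # "anomaly" or "normal"
    weight: float


def update_weights(rules, fired, step=0.05):
    """Reinforce each fired rule and weaken all rules of the opposite type."""
    for rule in fired:
        rule.weight = min(1.0, rule.weight + step)
        for other in rules:
            if other.kind != rule.kind:
                other.weight = max(0.0, other.weight - step)


def anomaly_score(fired):
    """Assumed decision value in [0, 1]: the share of fired weight that
    belongs to anomaly rules (0 = normal, 1 = anomalous)."""
    total = sum(r.weight for r in fired)
    if total == 0:
        return 0.0
    return sum(r.weight for r in fired if r.kind == "anomaly") / total
```

Under these assumptions, a rule that fires repeatedly gradually dominates the score for its leg, which is the proliferation-like effect the authors describe.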

Jakimovski and Meyer compare their approach to a version of the algorithm without the dynamic updating of the weights, and it was seen to respond faster and more effectively. However, no wider comparisons were made, so it is hard to objectively assess the quality of the algorithm. The question of who monitors the monitor is also not addressed. Provided the robot had enough fully functioning legs, a false positive resulting in the unnecessary amputation of a leg would not be disastrous; if the release mechanism itself failed, however, then the control of a robot with more legs than it believes it has may prove problematic. These criticisms aside, the ability of the system of Jakimovski et al. (2009) to detect a fault, remove the faulty component and then re-configure in order to compensate for the fault makes for an interesting study and is highly relevant to SYMBRION and other self-reconfiguring modular robotic systems.

Another good example of a system inspired by the clonal selection principle is de Lemos et al.’s adaptable error detection framework, designed to detect errors in automated teller machines (ATM) (de Lemos et al., 2007). de Lemos et al. present an adapted version of the AISEC algorithm (Secker et al., 2003), a system that was originally developed to classify emails as either ‘interesting’ or ‘non-interesting’. de Lemos et al.’s version is shown to be suitable for detecting errors in ATMs. What is more, the system is able to detect errors before they occur, sometimes as much as 12 hours in advance.

Throughout their operation ATMs produce log files which, amongst other things, contain a sequence of codes that represent the state of the system over time. The job of the anomaly detection system, then, is to detect sequences of codes that eventually lead to fatal states. During a training phase the system is presented with old ATM log files, from which it creates detectors based upon those sequences that lead to fatal states; to add diversity and generality, some detectors are also subject to cloning and mutation. After training, the system enters the on-line phase, in which it continuously classifies incoming sequences of states as either normal or indicative of a forthcoming failure. During the on-line phase the system also adapts its set of detectors, through clonal selection, to meet the unique operational profile of the individual ATM (which may change over time).

Although de Lemos et al.’s system is not directly related to robotics, some of the requirements of ATMs and collective robotic systems are similar. Furthermore, some of the properties exhibited by de Lemos et al.’s system would be highly desirable in a fault tolerant robotic system. In terms of their requirements, both ATMs and robotic systems must be able to operate autonomously for long periods of time and be able to adapt to different conditions. In a robotic system this may mean adapting to a new environment, and in the case of ATMs adapting to different usage patterns; in both cases faults may present themselves differently. To meet these requirements de Lemos et al.’s system exhibited a number of interesting properties, most notably the ability to detect errors before they occur. Such predictive ability would be highly desirable in a collective robotic system, allowing faulty robots to be removed or isolated before they have a detrimental effect on the operation of the overall system. de Lemos et al. also describe plans to implement a network-wide version of their system, allowing individual ATMs to share generic detectors with spatially disparate neighbours. This ability to collectively share acquired information would also be useful in a collective robotic system.

Danger Theory

The most prominent AIS algorithm to take inspiration from danger theory is Julie Greensmith’s Dendritic Cell Algorithm (DCA) (Greensmith et al., 2005, 2006a). Over recent years, the DCA has been applied to a large number of anomaly detection tasks, with applications ranging from intrusion detection in computer networks (Al-Hammadi et al., 2008; Greensmith et al., 2006b) and misbehaviour detection in wireless sensor networks (Kim et al., 2006), to the detection of multipath errors in GPS data (Ogundipe et al., 2010) and numerous robot-related applications (Bi et al., 2010; Humza et al., 2009; Mokhtar et al., 2009; Oates et al., 2007).

In taking inspiration from danger theory, the algorithm abstracts the functionality of dendritic cells (DC) by maintaining a population of DCs and using at least three types of signal, derived from the input data, to control their maturation. The relative concentration of mature and semi-mature DCs determines the presence or absence of an anomaly. The three types of signal most commonly used are referred to as safe, danger and PAMP (Pathogen Associated Molecular Patterns). Following their biological counterparts, it is advised that safe signals correspond to patterns in the data that are known to be normal, danger signals correspond to patterns which may indicate abnormality, and PAMP signals to situations that are definitely indicative of an anomaly. The mapping of data to signal, however, is entirely at the discretion of the designer and must be constructed by hand; there is no good reason why the suggested mapping should be strictly adhered to.

Algorithm 3 shows the standard dendritic cell algorithm. The algorithm begins with a pool of dendritic cells and a stream of data items to be classified as normal or anomalous. For each data item, a subset of the dendritic cells is selected at random to sample that item. When sampling a piece of data, the unique ID of the item is added to the sampling DC’s list of collected antigens and the current values of the safe, danger and PAMP signals are calculated. Based upon the current signal values, the concentrations of three output ‘cytokines’ (mature, semi-mature and csm, or co-stimulatory molecules) are calculated and added to the values stored within the sampling DC. If the total concentration of output cytokines exceeds a predetermined threshold, randomly chosen and specific to the current DC, then that cell is said to migrate: it will no longer sample antigen and is removed from the population, with a new cell added in its place. The state of maturity of a migrated cell is determined by the relative concentrations of mature and semi-mature cytokines: if the concentration of semi-mature cytokines is greater than the concentration of mature cytokines then the cell is said to be semi-mature, otherwise it is said to be mature. The classification of a data item as normal or anomalous is based on the number of DCs which sampled it and then went on to become mature, and the number that went on to become semi-mature: if the number of semi-mature cells is larger then the data item is said to be normal, otherwise it is said to be anomalous.

Algorithm 3 The Standard Dendritic Cell Algorithm

input : S = stream of data

create a pool of DCs, D
create an empty pool of migrated DCs, M
for all data in S do
    select a subset of DCs at random from D, place in a new pool E
    for all dc in E do
        add data to list of antigens collected by dc
        calculate safe, danger and PAMP signal values
        update concentration of output cytokines in dc
        if concentration of output cytokines above threshold of dc then
            move cell to M
            add new DC to D
        end if
    end for
end for

for all dc in M do
    if semi-mature cytokine concentration greater than mature cytokine concentration then
        set dc to semi-mature
    else
        set dc to mature
    end if
end for

for all data in S do
    if number of cells in M that present data as mature is greater than number that present as semi-mature then
        data labelled as abnormal
    else
        data labelled as normal
    end if
end for
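The steps of Algorithm 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation of Greensmith et al.: the signal weights, pool size, sample size and threshold range are arbitrary values chosen for the example, and the `signals` argument is a hypothetical stand-in for the hand-built mapping from a data item to (safe, danger, PAMP) values.

```python
import random
from collections import defaultdict

# Assumed, illustrative weights: contribution of each input signal
# (safe, danger, pamp) to each output cytokine.
WEIGHTS = {
    "csm": (1.0, 1.0, 2.0),
    "semi_mature": (1.0, 0.0, 0.0),
    "mature": (0.0, 1.0, 2.0),
}


class DendriticCell:
    def __init__(self, rng):
        # Migration threshold: chosen at random, specific to this cell
        # (the range 5..15 is an assumption).
        self.threshold = rng.uniform(5.0, 15.0)
        self.antigens = []
        self.cytokines = {name: 0.0 for name in WEIGHTS}

    def sample(self, antigen_id, safe, danger, pamp):
        # Record the antigen and accumulate the output cytokines.
        self.antigens.append(antigen_id)
        for name, (ws, wd, wp) in WEIGHTS.items():
            self.cytokines[name] += ws * safe + wd * danger + wp * pamp

    def should_migrate(self):
        # Total output cytokine concentration compared with the threshold.
        return sum(self.cytokines.values()) > self.threshold

    def is_mature(self):
        # Semi-mature only if semi-mature exceeds mature; otherwise mature.
        return self.cytokines["semi_mature"] <= self.cytokines["mature"]


def dendritic_cell_algorithm(stream, signals, pool_size=10, sample_size=3, seed=0):
    """stream: list of (antigen_id, item); signals(item) -> (safe, danger, pamp)."""
    rng = random.Random(seed)
    pool = [DendriticCell(rng) for _ in range(pool_size)]
    migrated = []

    for antigen_id, item in stream:
        safe, danger, pamp = signals(item)
        for i in rng.sample(range(pool_size), sample_size):
            cell = pool[i]
            cell.sample(antigen_id, safe, danger, pamp)
            if cell.should_migrate():
                migrated.append(cell)
                pool[i] = DendriticCell(rng)  # replace the migrated cell

    # Count, per antigen, how many migrated cells present it in each state.
    mature = defaultdict(int)
    semi = defaultdict(int)
    for cell in migrated:
        counts = mature if cell.is_mature() else semi
        for antigen_id in set(cell.antigens):
            counts[antigen_id] += 1

    # Antigens never presented by a migrated cell default to "normal".
    return {
        antigen_id: "abnormal" if mature[antigen_id] > semi[antigen_id] else "normal"
        for antigen_id, _ in stream
    }
```

With a signal mapping that only ever reports safe signals, every migrated cell is semi-mature and every item is labelled normal; with a mapping that only reports danger, the items presented by migrated cells are labelled abnormal.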

The applications of the DCA most pertinent to this report are those within the field of robotics, some examples of which are now reviewed. As part of the SYMBRION project, Mokhtar et al. (2009) present a simplified version of the DCA, referred to as the modified Dendritic Cell Algorithm (mDCA), that is the first step towards the integrated AIS framework described in section 3.3 and published in Mokhtar et al. (2008) and Timmis et al. (2010a). To allow the DCA to operate in an on-line fashion on a resource-limited micro-controller, Mokhtar et al. (2009) remove the population aspect of the algorithm and reduce it to a matter of simply calculating and aggregating signal values at each time step. The mDCA has been applied to three different anomaly detection tasks on-board Symbricator style robots: the detection of faulty IR range sensors, the detection of faults within a robot’s actuation and, in collaboration with Humza et al. (2009), the detection of faults within a robot’s power management system.

Further to its traditional use as an anomaly detector, Bi et al. (2010) adapt the DCA to perform another important task within the hierarchy of fault tolerance: fault diagnosis. The main test case of Bi et al.’s Diagnostic-DCA (D-DCA)¹ is the diagnosis of ‘stuck-at-fault’ scenarios in the IR range sensors of small mobile robots. Bi et al. assume the presence of an anomaly detection system that, without specifying the cause of any anomaly and with a reasonably high level of accuracy, classifies the current state of the robot as normal or anomalous. Given this contextual information, and the current values of the robot’s eight IR sensors, the D-DCA determines which of the sensors is responsible for the fault.

Another interesting example of the DCA’s application to a robotic problem is Oates et al.’s use of the DCA as a method of detecting anomalous objects or activity as part of a mobile robotic sentry or security guard system (Oates et al., 2009, 2007).

Novel approaches

There are several novel AIS algorithms that do not easily fit into any of the aforementioned categories. Recent work, inspired by the tuneable activation thresholds of T-cells, falls into this category (Guzella et al., 2007; Owens et al., 2009).

Guzella et al. (2007), in taking inspiration from the tuneable activation threshold (TAT) hypothesis of Grossman and Paul (1992, 2001), present one of the first AIS approaches in this area, aimed at solving anomaly detection problems in which there is a strict temporal ordering upon the input data. Guzella et al.’s approach uses more biologically plausible models of T-cells than are commonly used in AIS, involving multiple receptors and sub-populations of T-cells that recognise specific features of an antigen rather than the entire antigen as a whole.

¹ Not to be confused with the deterministic DCA (dDCA) (Greensmith and Aickelin, 2008).

Owens et al. (2009) present a second anomaly detection algorithm inspired by the TAT hypothesis. Employing an incremental, interdisciplinary approach to designing artificial immune systems, as advocated by Stepney et al. (2005) and Timmis et al. (2008a), the algorithm of Owens et al. (2009) was based upon previously created models of T-cell signalling pathways by Altan-Bonnet and Germain (2005) and Feinerman et al. (2008), analysed and extended by Owens et al. (2008, 2010). The behaviours of T-cells captured in the models of Owens et al. were abstracted and mapped onto the statistical technique of kernel density estimation, which forms the basis of their algorithm. The algorithm itself classifies data as normal or abnormal according to whether or not it is seen to belong to an estimated probability distribution. By imposing a sliding window upon the input data, Owens et al. are able to detect anomalies that differ significantly from previous readings, whilst at the same time taking into account the gradual shifts in sensor values that can be expected as a result of varying environmental factors such as the ambient light conditions. Relevant future applications of this work, as yet unpublished, include a robotic sniffer dog for detecting the chemical signatures of various substances and the detection of anomalous behaviour in swarms of foraging robots.
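The general idea of combining kernel density estimation with a sliding window can be illustrated with a short Python sketch. This is not the algorithm of Owens et al. (2009): the Gaussian kernel, the window length, the bandwidth and the density threshold are all assumptions chosen for the example, and the T-cell inspired aspects of their work are not modelled.

```python
import math
from collections import deque


def gaussian_kde(x, samples, bandwidth):
    """Estimate the probability density at x from samples using Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)


def detect_anomalies(readings, window=20, bandwidth=1.0, min_density=0.01):
    """Label each reading by its estimated density under the recent window."""
    recent = deque(maxlen=window)  # sliding window of the most recent readings
    labels = []
    for x in readings:
        if len(recent) < window:
            # Assumption: readings are treated as normal while the window fills.
            labels.append("normal")
        else:
            density = gaussian_kde(x, recent, bandwidth)
            labels.append("normal" if density >= min_density else "abnormal")
        # Every reading enters the window, so the estimate tracks gradual drift.
        recent.append(x)
    return labels
```

Because the window slides with the data, a slowly drifting sensor value remains within the estimated distribution, whereas a sudden jump far from the recent readings falls below the density threshold.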

For a mobile robotic platform such as Symbricator, gradual temporal changes, relating to both the external environment and the internal state of the system, are commonplace. It is easy to see how the algorithms of both Guzella et al. (2007) and Owens et al. (2009) are applicable in this context, allowing robots to continue operating in a gradually changing environment whilst retaining the sensitivity necessary to detect anomalies that are the result of faulty components.

In a completely different vein of work, Schreckling and Marktscheffel (2010) present an artificial immune system for an artificial chemistry known as Fraglets (Tschudin, 2003). Originally developed for applications within the field of computer networking, the Fraglets chemistry is based entirely on the interactions of symbolic strings, or ‘fraglets’. Individual fraglets, serving as both the code and the data of the system, are placed within a ‘reactor’ in which, at each time step and in accordance with a number of production rules, they probabilistically collide and interact with each other to produce a new set of fraglets. An example of one such rule, known as match, which takes two strings and concatenates their tails if a certain condition is met, is provided below:

[match S TAIL1] + [S TAIL2] → [TAIL1 TAIL2]
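As a rough illustration of the match rule, the following Python sketch treats a fraglet as a list of symbols. The function name `apply_match` and the list representation are assumptions made for this example; the sketch does not model the probabilistic collisions of the reactor or any other production rule.

```python
def apply_match(active, passive):
    """Apply [match S TAIL1] + [S TAIL2] -> [TAIL1 TAIL2].

    Both fraglets are lists of symbols. Returns the product fraglet,
    or None if the rule does not apply to this pair.
    """
    if len(active) < 2 or active[0] != "match":
        return None
    if not passive or passive[0] != active[1]:
        return None  # the head of the second fraglet must equal S
    return active[2:] + passive[1:]
```

For example, `apply_match(["match", "x", "a", "b"], ["x", "c"])` yields `["a", "b", "c"]`, while a second fraglet whose head is not `x` leaves the pair unchanged.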

What results from this approach is a highly flexible system, inherently parallel and distributed, which is extremely well suited to the development of software exhibiting self-organising and emergent properties. However, there is a downside to the highly flexible nature of Fraglets, and artificial chemistries in general. The allowance of programs to change structure whilst retaining their basic function, and the inbuilt support for behaviours such as self-replication, introduce brand new forms of attack that are undetectable using classical security techniques. To counter this problem Schreckling and Marktscheffel extend the Fraglets system with the introduction of a signalling component that allows special system responses, such as the prevention of code execution, to be instigated whenever certain conditions, indicative of malicious behaviour, are observed.

In this early work the malicious behaviours that Schreckling and Marktscheffel are attempting to counter are fraglets that attempt to access data or remove other fraglets in an unauthorised way. The approach, necessarily, involves the mapping of antigens, antibodies, B-cells and helper T-cells into fraglets. Antigens are considered to be the fraglets responsible for the malicious behaviour, or the products thereof; B-cells are self-replicating fraglets which produce antibodies; and helper T-cells mediate the replication of the B-cells. When a malicious fraglet is inserted into the system, the signals that it produces are matched against pre-defined patterns, leading to the creation of B-cells which, mediated by helper T-cells, proliferate through self-replication and produce antibodies which bind with antigens and execute rules which lead to that antigen’s removal.

Though Schreckling and Marktscheffel’s early experiments in protecting against certain specific attacks were successful, due to the already emphasised flexibility of artificial chemistries, future conceivable attacks are limited only by the ingenuity of the attackers. A daunting task then is faced by anyone wishing to eliminate all possible forms of attack, and care must be taken if extending the system so as not to lose the flexibility that made the approach so appealing in the first place. A suitable balance between flexibility and security must be found, which so far Schreckling and Marktscheffel have managed.

The highly specialised nature of Schreckling and Marktscheffel’s approach, and its tight coupling to the Fraglets system, appears to suggest very little relevance to the task of ensuring the long-term autonomy of a collective robotic system. Regardless of the lack of any direct applicability, what is appealing about Schreckling and Marktscheffel’s approach is the decentralised method of control, and the seeming ease with which such a system could be distributed throughout, for example, a robotic organism. Though it may be beyond the scope of this project, one can imagine a realisation of an artificial chemistry within a robotic organism where fraglets, or any other chosen molecular representation of the chemistry, may diffuse between different modules in a manner similar to that of the hormones in the AHHS controllers of Schmickl et al. (2010), endowing the system with a highly flexible and, with the addition of an immune inspired component, robust form of control.

Another decentralised form of AIS, which may be seen as more relevant to the task at hand, is the Modular RADAR (Robust Adaptive Decentralised search with Automated Response) architecture presented by Banerjee and Moses (2010). The authors do not suggest a new type of immune algorithm, but rather a general approach in which, based upon empirical biological data, the size and number of artificial ‘lymph nodes’ in a system increase with the overall size of the system. Banerjee and Moses’s approach results in a highly scaleable system that balances the tradeoff between local antigen detection and global antibody production, and is applied to multi-robot control and to search within peer-to-peer (P2P) networks.

By analysing biological data on the West Nile Virus (WNV) in a variety of different animals, Banerjee and Moses observed that regardless of the size of the animal, the time taken by the immune system to detect and respond to the virus was the same (around 3 days). From their observations, Banerjee and Moses hypothesise that the total lymph node volume (number of lymph nodes multiplied by their size) is linearly proportional to the mass of the animal. Furthermore, Banerjee and Moses suggest that there exists a tradeoff between the size and number of lymph nodes, that minimises both the time taken to detect an antigen and the time taken to recruit B-cells from other lymph nodes in order to produce the required amount of antibodies. If the size of lymph nodes was fixed and the number simply increased in larger animals then the time taken to detect an antigen would scale linearly with the size of the animal, but due to the costs of global communication between lymph nodes, the time for sufficient antibody production would not. Conversely, with a fixed number of lymph nodes, increasing in volume with the size of the system, the communication costs would be minimal but, due to the larger area serviced by each lymph node, the migration times of antigens to the lymph nodes would be prohibitively slow. Hence, a tradeoff is required that balances antigen detection (local communication) and antibody production (global communication).
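The hypothesis and the two extremes discussed above can be written compactly. The symbols are introduced here purely for illustration and are not taken from Banerjee and Moses (2010): $N$ is the number of lymph nodes, $V$ the size (volume) of each node, $M$ the mass of the animal and $c$ an assumed constant of proportionality.

```latex
\[
  N \, V = c \, M
\]
\[
  \text{fixed } V:\quad N \propto M,
  \qquad\qquad
  \text{fixed } N:\quad V \propto M .
\]
```

The first extreme corresponds to adding more lymph nodes of a fixed size, the second to enlarging a fixed number of lymph nodes; the tradeoff suggested by Banerjee and Moses lies between the two.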

Neither of the applications to which Banerjee and Moses applied their approach are directly relevant to the kind of collective robot system used within the SYMBRION project, but an obvious mapping exists. Imagine a distributed AIS within an artificial organism, whereby each module may be considered analogous to a lymph node and cells may migrate between different modules, similar to the architecture described by Mokhtar et al. (2008). It is not hard to see that, in accordance with Banerjee and Moses’s hypothesis, for the system to be scaleable, when new modules are added to an artificial organism, the one-to-one mapping between robot and lymph node may not be efficient. In considering each module as an artificial lymph node the approach would equate to having a fixed sized lymph node and simply increasing the number of nodes as a method of scaling, which Banerjee and Moses observed was not the case in biology. Whether or not this matters is unclear, but it is certainly something that would need to be carefully considered.

Ismail and Timmis (2009) describe a swarm aggregation algorithm, inspired by granuloma formation, and developed following the immuno-engineering approach of Timmis et al. (2008a). Granuloma formation is an immunological process where cells can isolate and destroy pathogens by physically surrounding them. The main actors in granuloma formation are macrophages, T-cells and cytokines. Granuloma formation starts when macrophages engulf pathogenic material such as bacteria; the bacteria then multiply within the macrophages and the cells are said to be infected. Infected macrophages signal to other cells that they are infected and are eventually broken down through cell lysis. The signals sent by infected macrophages attract other immune cells to the site of infection, which form a barrier around the infected macrophages, isolating both the infected cells and the debris produced by lysis. Ultimately this leads to one of three outcomes: the formation of a chronic granuloma (one that persists over time and is not harmful to the host); a healing scenario where the infection is removed from the host; or, in the worst case, a granuloma that continues to expand and harms the host, potentially leading to death. T-cells and cytokines serve as regulators and signal carriers throughout the process.

Ismail and Timmis envisage a situation in which a single member of a robotic swarm suffers a partial failure such that it is no longer able to move but can still communicate with the other members of the swarm (e.g. a motor failure). In line with the immuno-engineering approach, following sufficient modelling of the processes of granuloma formation, Ismail and Timmis are constructing a distributed control algorithm, inspired by granuloma formation, in which functional robots will encapsulate and contain faulty robots (much like healthy immune cells contain infected cells) in order to minimise or remove any negative effects that the faulty robot may have on the swarm as a whole.

A particular application of this approach that is highly relevant to the SYMBRION project is the situation in which the fault is simply that the robot has run out of energy. In such a situation, after isolating the faulty robot, Symbricator style robots would, in a distributed manner, be able to share power with the faulty robot, allowing it to continue normal operation.

Summary

Of the types of algorithm reviewed in this section, the most dominant in the wider field of AIS are those based upon clonal selection, negative selection and danger theory. Classic clonal selection algorithms include CLONALG (de Castro and Von Zuben, 2002), AIRS (Watkins et al., 2004) and the B-cell Algorithm (BCA) (Kelsey and Timmis, 2003). More recent, and more relevant, however, is the clonal selection inspired predictive error detection system of de Lemos et al. (2007). Notable approaches based upon the negative selection principle include the immunotronics concept of Bradley and Tyrrell (2000) for hardware based fault tolerance. The primary algorithm to arise from danger theory is the Dendritic Cell Algorithm of Greensmith et al. (2005, 2006a), which has spawned many variants and been used in a number of different applications.

Beyond clonal selection, negative selection and danger theory, more recently, a number of novel AIS approaches have arisen that break the traditional mould of the standard approaches. Owens et al. (2009) and Guzella et al. (2007) are two good examples that are based upon theory that is well grounded in immunology. Schreckling and Marktscheffel (2010) solve an interesting problem in attempting to bring security to the inherently flexible, and hence exploitable, architecture of an artificial chemistry. Finally, Banerjee and Moses (2010), based upon the observation of biological data, suggest an approach which may benefit any population based AIS.

3.3 Artificial Immune System Framework for SYMBRION

As first outlined by Mokhtar et al. (2008) and later elucidated in Timmis et al. (2010a), a proposed artificial immune system framework for the SYMBRION project is depicted in figure 3.9. Following this framework, various immune inspired algorithms have been implemented, many of which were introduced in section 3.2. This section begins by briefly describing the structure of the AIS framework and its situation within the SYMBRION project as a whole. Following this, the relevant work reviewed in section 3.2 is positioned within the framework.

3.3.1 AIS Framework

Figure 3.9: An artificial immune system framework for the SYMBRION project.

In line with the natural immune system, the framework is split into two layers: innate and adaptive. The innate layer performs basic anomaly detection to identify anomalies of which there is well known a priori knowledge. Examples of anomalies that may be detected by the innate layer include: components operating outside of their recommended temperature range and sensors that return values beyond their supposed limits or do not correlate with the values of neighbouring sensors. The adaptive layer incorporates the properties of learning and memory, allowing for the detection of more subtle faults, such as sensor degradation, where sensors may be operating within their normal limits, but are not returning values that correspond appropriately to the environment they are sensing. Fault identification will also take place in the adaptive layer, allowing the framework to communicate detected faults to other layers of the robot’s controller, recommending, for example, that a certain sensor or actuator should not be trusted to work as normal. Further integration with the rest of the SYMBRION project is provided through the shared use of evolutionary information and plans to encode the immune algorithms within the genome of individual robots. With the immune algorithms encoded in the genome, over time, the AIS itself may evolve and adapt alongside the evolution of the other on-board controllers. Some initial work towards this goal is described in chapter 5.
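The kind of a priori check attributed to the innate layer above can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the SYMBRION framework: the function name `innate_check`, the numeric limits and the assumption that the sensors are arranged in a ring (so that each has two neighbours) are all hypothetical.

```python
def innate_check(temperature, sensor_values,
                 temp_range=(0.0, 70.0),       # assumed safe operating range
                 sensor_limits=(0.0, 1023.0),  # assumed valid sensor output range
                 max_neighbour_diff=200.0):    # assumed tolerance between neighbours
    """Return a list of fault descriptions found by simple fixed-rule checks."""
    faults = []

    low, high = temp_range
    if not low <= temperature <= high:
        faults.append("temperature out of range")

    s_low, s_high = sensor_limits
    n = len(sensor_values)
    for i, value in enumerate(sensor_values):
        if not s_low <= value <= s_high:
            faults.append(f"sensor {i} beyond limits")
            continue
        # Sensors are assumed to form a ring; compare with both neighbours.
        left = sensor_values[(i - 1) % n]
        right = sensor_values[(i + 1) % n]
        if (abs(value - left) > max_neighbour_diff
                and abs(value - right) > max_neighbour_diff):
            faults.append(f"sensor {i} disagrees with neighbours")

    return faults
```

Fixed rules such as these cannot detect a degraded sensor that still reports plausible values, which is the kind of fault left to the adaptive layer.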

Another important aspect of the framework, not depicted in figure 3.9, is the ability for robots to share immunological information. Every robot may be considered analogous to a lymph node in the natural immune system. If a particular action of the instantiated framework is seen to be beneficial to the robot, the robot may choose to share the information that led to this action with other robots. The sharing of immunological information may be achieved either by directly communicating with other members of a robotic organism or by broadcasting to the rest of the collective.

3.3.2 Current Progress

In terms of where the existing work fits into the framework, in the innate layer, the job of anomaly detection is carried out by the mDCA algorithm of Mokhtar et al. (2009). In the adaptive layer, an instance based B-cell algorithm, not reviewed in section 3.2 but described briefly by Timmis et al. (2010a), is combined with the mDCA algorithm and, in unpublished work, is shown to improve the “health” of Symbricator style robots. The T-cell inspired algorithm of Owens et al. (2009) also fits into the adaptive anomaly detection unit of the adaptive layer. Finally, completing the current progress within the adaptive layer are the swarm aggregation algorithm of Ismail and Timmis (2009) and the fault identification algorithm of Bi et al. (2010).


4 Research Topic

Over the next two years my research will focus on homeostasis and long-term autonomy in collective robotic systems. More specifically, it will address long-term autonomy as a consequence of homeostasis, with a special emphasis on the provision of homeostasis through immune inspired approaches. The Symbricator robotic platform and the associated simulation tools, introduced in section 2.1.1, will serve as the main test bed for any algorithms developed, and the chosen applications will be guided by the SYMBRION and REPLICATOR “Grand Challenges” described in section 2.1.2.

Driven by three basic requirements of a long-term autonomous system, this chapter begins by discussing homeostasis and long-term autonomy in general. Two concepts in particular that can help provide long-term autonomy to a collective robotic system are identified. The first, general fault tolerance, is discussed with relation to the various different types of fault that a collective robotic system may experience. The second, artificial energy homeostasis, is an umbrella term that encompasses a number of approaches for efficiently managing the delivery, distribution and usage of power to and amongst a collective robotic system. The various approaches involved in artificial energy homeostasis, including some aspects of fault tolerance, are discussed within the context of the Symbricator platform and with relation to the three distinct modes of operation: ‘individual’, ‘swarm’ and ‘organism’. Finally, before outlining the main aims of my research, the initial application within the SYMBRION project, the dynamic re-configuration of an organism’s power bus, is described in detail.

4.1 Homeostasis and Long-term Autonomy

As the examples provided in chapter 1 show, long-term autonomy is an essential attribute of both current and future collective robotic systems. For tasks such as search and rescue and space exploration, where direct interaction, or even sustained long-distance communication with the robots may be difficult or impossible, the ability of the robots to operate autonomously without human interaction will be invaluable. This section begins by introducing a three point hierarchy of some of the basic requirements of a long-term autonomous system. Each of these three points will be introduced in turn before going on to discuss how, in taking inspiration from the processes of biological homeostasis, these requirements may be met. The three point hierarchy is depicted in figure 4.1 and incorporates the following:

1. Survival

2. Operation

3. Adaptation

Starting at the lowest level, survival, the basic needs of individual robots are detailed. The discussion then progresses to operation, highlighting the need for goal-oriented behaviours, and finally to the highest level, adaptation: the requirement to endow the robots with enough flexibility to survive in a continuously changing environment. It should be stressed that although the hierarchy implicitly imposes an order upon the requirements, this is not necessarily an order of importance. What is more, as is seen shortly, the distinction between the individual levels is far from clear cut. The separation and ordering of requirements is intended primarily for illustrative purposes.

4.1.1 Survival

At the lowest level of the hierarchy is survival, the requirement to keep the robots powered and functioning, but not necessarily working towards their goal. Survival entails the bare minimum set of functions that a robot needs to operate: power management, basic behaviours and the ability to obtain energy from the environment.

Power management, at its most basic, simply involves the ability to recharge a robot’s battery. A more complex power management system will also ensure that the robot’s use of power is as efficient as possible, thus increasing the overall lifetime of the robot, reducing the time wasted on recouping expended energy, and maximising the time spent working towards the robot’s main goals. In an individual robot, the efficient use of power may involve switching off redundant or unnecessary sensors or actuators. At the level of the swarm, it may not be necessary for all robots to be operating at full power all of the time; a more efficient division of labour may involve some robots ‘resting’ whilst others ‘work’. At the artificial organism level, especially with platforms such as Symbricator, in which connected modules can share power, the configuration which leads to the most efficient use of power must be found. Finding the best configuration of the system at a particular moment in time in a dynamic environment is a non-trivial problem, and one which must be solved in order to ensure the long-term survival of the collective. This problem is discussed extensively in later sections.

Basic behaviours are those which are essential to the short term survival of individual robots, but through their continued utilisation also help to ensure the long term survival of the overall system. Akin to instinctual or reflex behaviours in nature, basic behaviours can generally be thought of as unlearned or inbuilt. A good example of an essential basic behaviour is ‘obstacle avoidance’, which helps prevent robots from getting stuck in their environment.

The ability of a robot to locate and obtain energy from a source within its environment, otherwise known as energy foraging, may be thought of as a basic behaviour, or as a supplement to power management, but so essential is it to the survival of the system that it warrants its own mention. Regardless of whether the action is passive, for example obtaining energy through solar panels, or active, for example locating and engaging with power sockets, without the ability to forage for energy a robotic system stands no chance of survival.

Figure 4.1: A hierarchy of three of the most important requirements of a long-term autonomous robotic system: basic survival, normal operation, and adaptation to a changing environment.
[Figure labels: Long Term Autonomy; Survival; Operation; Adaptation; power management; energy foraging; basic behaviours; goal oriented behaviours; fault tolerance; environmental adaptation.]

The other two levels of the hierarchy are critically dependent on the basic survival of the system. Without the ability to forage for and manage power efficiently a robot will soon run out of energy, thus rendering the system useless and the other two levels of the hierarchy irrelevant. With similar consequences, if basic behaviours such as obstacle avoidance are not provided then a robot may easily become stuck or lost in its environment.

4.1.2 Operation

The successful operation of a system, where success is measured by the extent to which the system serves its purpose, forms the second requirement of a long-term autonomous system. This level encompasses all the goal oriented behaviours of the system, that is, those behaviours which directly help to satisfy the purpose of the system. In a search and rescue task, for example, this might include searching for signs of life or excavation behaviours.

The goal or purpose for which the system operates provides the driving force of the system. Although a perfectly valid and interesting system that ‘just survives’ would not require any goal oriented behaviours beyond those mentioned in section 4.1.1, in a more complicated example, the behaviours provided by the survival level alone would not be sufficient. For a real world application, the successful operation of a potentially large number of goal oriented behaviours is an essential requirement of the system; without this ability the system would be considered useless.

It should be noted that, as both the operation and survival layers will need to make use of the system’s sensors and actuators, conflicts are likely to exist between the behaviours of the two layers. When designing a long term autonomous system, the links between the operation and survival layers must be carefully considered and any conflicts safely resolved.

4.1.3 Adaptation

The final, and arguably most important, requirement of an autonomous system is adaptation. In order for a system to operate in a wide variety of situations, as the environment around it changes the system must be able to adapt to new (possibly unseen) conditions in a continuous and seamless manner. Before continuing it is important to define what is meant by the system’s “environment”. Although the distinction between a system and its environment may be clear at a physical level, functionally, if examined a little closer, it becomes less obvious where to draw the line. With regards to a biological system and the environment in which it operates, Ashby (1960) provides a good example of why it is difficult to separate the two:

“if a mechanic with an artificial arm is trying to repair an engine, then the arm may be regarded as part of the organism that is struggling with the engine, or as part of the machinery with which the man is struggling.”

Though Ashby is referring to a biological organism, it is not hard to imagine a similar situation with the human agent replaced by an artificial system such as a robot. In fact, in this example, due to the likely physical similarities between the artificial arm and the robot, replacing the mechanic with a robot further blurs the line between organism and environment. This example can be taken further still, as Ashby does:

“The chisel in a sculptor’s hand can be regarded either as a part of the complex biophysical mechanism that is shaping the marble, or it can be regarded as a part of the material which the nervous system is attempting to control. The bones in the sculptor’s arms can similarly be regarded either as part of the organism or as part of the ‘environment’ of the nervous system.”

Figure 4.2: A modified view of the three requirements of a long term autonomous system.

By analogy, the same can be said of a robot and its environment, raising the question: are the sensors and actuators of the robot part of the control system, or simply tools in the environment with which the software interacts? Because the line between the organism (or robot) and its environment is blurred, Ashby advocates the consideration of an organism and its environment as a single inseparable system. The same view is taken here: the robot and its environment, though physically separate, are considered functionally paired, continuously influencing one another through multiple feedback loops. The variable environmental conditions in which an artificial autonomous system must be able to survive now include not just the implicit external conditions such as the terrain and weather, but also properties of the internal components such as the state of charge of a robot’s battery or the values (reliable or not) of its various sensors and actuators, as well as their physical condition.

For an autonomous system to survive it must be able to adapt to changes in any part of the overall system, be that the encounter of a new type of external terrain or a faulty internal component. Interestingly, when considering the external and internal environments as part of the same system, the response to a failure in either of these parts may not even need to distinguish between the two. The response to climatic conditions which result in a slippery terrain (excessive rain or snow for example) may be found to be similar to the response to a hardware fault which causes the axle of a wheeled robot to slip unpredictably (reduce speed and avoid turning), since from the point of view of the control system the two are equivalent. As with the examples of sensorimotor disruption from Di Paolo (2000, 2003) in section 3.1, the response, rather than attempting to directly ‘fix’ the problem (which, if it were a climatic change, would be impossible), simply adapts the control system so as to maintain stability.

Adaptation is strongly linked to both survival and operation. Without adaptation, in a dynamic real-world environment a robot is unlikely to be able to survive. One can imagine that the inability to adapt to a broken sensor will quickly lead to undesirable behaviour, threatening the survival of both the individual and the collective. Likewise, the inability to operate in unexpected external environmental conditions will lead to the same consequence. Furthermore, beyond simply surviving, a robot that is able to adapt to changes in its environment will be far better equipped to complete its goal. A more accurate view of long term autonomy, then, may be seen in figure 4.2, where adaptation is pervasive across all levels of the system.

4.1.4 Designing Autonomy

One obvious, but naive, approach to designing an autonomous system would involve designing each of the three parts (survival, operation and adaptation) independently. Using such an approach the designer may begin by assuming a perfect robot, operating in a well understood environment. Upon these assumptions the designer will easily be able to develop both the basic behaviours necessary for the survival of the robot and the goal oriented behaviours necessary for the system to be considered a success. To handle unforeseen situations the designer may later add on modules with names like ‘anomaly detection unit’ which account for changes in the environment. Whilst such a modularised approach sounds appealing from a software engineering perspective, a system designed in such a way is restricted by the interfaces of its modules and may lack the plasticity required to operate in a real-world dynamic environment. Furthermore, the links between the different layers, already stressed in the previous section, will make the task even harder. For these reasons, where possible, a more united approach is advocated, considering all three layers together in a manner more similar to figure 4.2 than figure 4.1. It is proposed that the development of such an autonomous system, capable of operating in a variety of situations by adapting its behaviour at all levels, will be made possible by taking inspiration from the principles of biological homeostasis, with a special emphasis on the operation of immune systems.

The development of a fully homeostatic system, capable of adapting as seamlessly to changes in both its internal and external environments as the human body does, is far beyond the realms of existing hardware and software paradigms. The gap between the physiology of biological organisms and robotic hardware, both in terms of plasticity and complexity, remains large. In biology, assuming such a distinction can be made, both ‘hardware’ and ‘software’ show large amounts of structural and functional plasticity. Biological hardware (the cells, tissues and organs of an organism) is constantly changing; as cells die and are replaced and as organs are damaged and repaired, the physical make-up of an organism differs from one day to the next. The hardware of an individual robot, on the other hand, is generally fixed throughout its ‘lifetime’. Biological software (the processes which govern the ‘hardware’ and the various associated proteins and molecules, most notably the processes of the nervous system, and most pertinently those of the immune system) is also constantly changing as the organism grows, learns and adapts. Whilst many attempts have been made to model the processes of biological organisms, even the most advanced models are only adaptable in a limited sense, and are often restricted by constraints of the hardware upon which they are deployed.

A natural benefit of Symbricator and other modular robotic platforms which may help in the development of homeostatic systems is that, at the organism level, structural plasticity comes for free. Although with a much larger granularity, the ability of an artificial organism to grow and repair itself through the addition of extra modules or the replacement of existing ones resembles the processes of biological organisms much more closely than previously possible. The potential for an artificial organism to be formed at one moment in time by a completely different set of modules than at a previous time, whilst, through the incremental nature of growth and the distributed control, still retaining the experience gained by now absent modules, provides a highly flexible platform upon which to construct an artificial homeostatic system. It should be emphasised that, because of the vast difference in terms of scale, care must be taken if, somewhat naively, individual modules in an artificial organism are considered analogous to individual cells in a biological organism.

In the future, both the complexity and plasticity of robotic hardware will need to increase, with solutions perhaps lying in modular robotics or, if this proves not to be sufficiently plastic, in hybrid bio-electro-mechanical systems such as those of Adamatzky and Jones (2008), Sato et al. (2008) and Tsuda et al. (2009). Regardless of the underlying hardware, the software of a truly homeostatic system must also be highly adaptable, to which end it must be designed in close harmony with, and not restricted by, the hardware upon which it will run and the environment with which it will interact. Ultimately, the system must be considered as a whole; however, it is conceded that, as a step towards the eventual goal, some fragmentation may be necessary.

Naturally, my research will not attempt to address all the aspects of long term autonomy. Primarily, I will focus on the requirements of adaptation and survival, concentrating on two applications in particular: fault tolerance and artificial energy homeostasis. Whilst I will not be directly involved in the development of specific robotic behaviours, in order to ensure the long term survival of the system, fault tolerance should be an integral part of these behaviours and communication between the power management system and the behavioural controllers will be essential. For example, during the formation of an artificial organism, the power management system will need to assess whether the addition of a new module to a particular location in the organism will help or hinder the homeostasis of the overall system. Based upon the assessment of the power management system, recommendations as to whether or not the module should join the organism at that location can then be made. The following two sections will discuss in general these two main foci of my research: fault tolerance and artificial energy homeostasis.

4.1.5 Fault Tolerance

To improve the long-term autonomy of a collective robotic system, both the reliability of the individual units and the reliability of the interactions between different units (reliability of the collective) must be improved. As section 2.2 showed, there are a number of reasons why robots fail, from hardware and software failures to simple human error. An autonomous system, by its very nature, should not be affected by human error, but even if the strictest engineering practices are adhered to it will almost certainly be susceptible to failures of hardware and software.

Software failures at the system level, for example complete system crashes or freezes, are ignored for now and the software of the control system alone provides the focus. Commonly, failures of the control system, i.e. the robot doing something that it wasn’t supposed to, are a result of a change in the robot’s environment that it was not designed to account for. The development of the behavioural control system itself is beyond the scope of this report; however, indirectly, through the provision of hardware fault tolerance, the reliability of the control system may be improved.

Hardware failures are much more likely with new technologies, and so for the Symbricator platform, a novel and complex robotic system still very much under development, the chances of faults are high. Accepting that faults will occur and that the reliability of robots will be low, in order to ensure the long term autonomy of a collective robotic system, faults must be detected and corrected quickly when they do occur. Complete correction of faults may not always be possible; however, in some cases, simply tolerating the presence of faults may be sufficient.

The conventional approach to hardware fault tolerance, as described in section 2.3, begins with fault, or more specifically, anomaly detection. This is usually combined with some form of diagnosis, which is then followed by a response that acts so as to remove or reduce the effects of the fault. In order to provide hardware fault tolerance using this method, some understanding of the system in question, and the various ways in which it may fail, is required. This section begins by discussing various types of hardware fault, their effects at the individual, swarm and collective levels, and some suggestions of how they may be tolerated. Later in this section the immune inspired approach to anomaly detection is briefly discussed, followed by an alternative approach to fault tolerance that departs from the standard process of detect and correct.

Actuator failures

As mentioned in section 2.2, one of the commonest robotic failures is that of an effector function, such as the tracks or wheels of a robot. If such a component were to fail, for example if a track were to slip from its wheels, then at the individual level, with current technology, very little could be done to rectify this; that is to say, the robot cannot replace its own track. However, in such a situation, a truly homeostatic system may be able to make use of the properties of degeneracy or redundancy, adapting its original mode of operation in the face of an environmental change, in this case the slipping of a track from its wheels. Assuming a differential drive system, common in many small robotic platforms, an example of redundancy is observed in the fact that the robot is likely to still have one fully functioning track. It may be possible, then, for the robot to move using this track alone, albeit less effectively, and requiring the control system to adapt to take into account the new method of locomotion. Alternatively, if the tread was fully removed (a much easier task for the robot than replacing it), the robot may still be able to move using the drive wheel of the track (assuming the drive wheel was in contact with the ground). Once again, however, the effectiveness of the movement would be reduced and the control system would need to adapt to the new configuration; furthermore, all the advantages of having tracks in the first place would be lost.

Though highly prevalent in nature (Edelman and Gally, 2001), physical degeneracy is uncommon in mobile robots, with redundancy of hardware commonly seen as the preferred method of adding robustness to a system. The Symbricator ‘backbone’ robots do provide at least one good example of how degeneracy may help tolerate a fault in the robot’s locomotion. As an individual, the principal method of locomotion for the backbone robots is two ‘screw drives’ on their underside. However, the robots contain a second moving part: their entire body can rotate like a hinge, the main purpose of which is not the planar locomotion of the individual, but to facilitate the collective movement of an organism. In theory, however, if one or both of the robot’s screw drives were to fail, the control system may be able to tolerate the fault by using the movement of the hinge to provide a very basic shuffling motion.

At the collective level, it is still very unlikely that one robot would be able to help re-position the slipped track of another, but there are several other things that can be done to tolerate faults of effector function such as this. Depending on whether the fault was detected by the faulty robot itself or by another member of the collective, the response may be slightly different but the outcome will essentially be the same. If it is just an effector function that has failed then there are still several tasks that the robot will be able to carry out, including acting as an energy resource for other robots or acting as a static sensing node in the environment. Furthermore, if the platform in question is as flexible as that of the SYMBRION and REPLICATOR projects, the robot may still possess enough functionality to act as a module in a robotic organism. Different locations within an organism require different functionalities, so even a robot which has completely lost its locomotive ability may still be able to act as a purely sensing module.

Sensor failures

As the investigations in section 2.2 showed, because common sensors are well established and mass manufactured they do not fail very often, but when they do, if the failure is not detected or not properly accounted for, the consequences can be disastrous. If the failure is detected, the robots may still be able to continue operating at a reduced capacity. At the individual level, if, as is normally the case, there are redundant sensors of the same type, or degenerate sensors of different types (for example an infra-red and a laser range finder, both capable of measuring the distance to nearby objects), the robot may be able to adapt its control system to make use of these alternative sensors and ignore the faulty sensors. At the collective level, in a similar manner to the response to a loss of effector function described earlier, a robot with poor sensing capabilities may still be able to transport other robots or act as a mobile, but non-sensing, part of an artificial organism.

Power System Failures

Another type of failure that must be considered is that of the power management module. Rechargeable batteries do not last forever and are only useable for a limited number of charge-discharge cycles, their maximum state of charge decreasing as the number of cycles increases. A battery that has reached the end of its life will obviously have drastic system wide effects, preventing the robot from functioning in even a limited capacity. A faulty battery with an unusually low maximum state of charge or an unusually high rate of discharge will cause similar problems, greatly reducing the period of time in which the robot may carry out its task. For an individual robot, there is not much that can be done to counter the effects of a dead or damaged battery. Within the Symbricator platform, however, in a collective robotic organism that allows for energy sharing between modules, there is one solution. As described shortly (section 4.2), the Symbricator power management system is flexible enough that robots may recharge the batteries of other modules or, if the other robot’s battery is damaged, directly power its components, meaning that it is theoretically possible for a fully functioning organism to contain some modules with dead or damaged batteries.

Other problems within the power management system might result from the inaccurate measurement of the current state of charge (SOC). The process of measuring a battery’s current SOC is never 100% accurate, but if sufficiently inaccurate, for example if a battery with 20% of its maximum capacity remaining is reported as having 80% remaining, it may lead to problems such as overcharging the battery, complications with cell balancing, or a robot running out of energy before it is able to recharge or notify others of its needs.

The complexity of the Symbricator hardware may lead to further problems for the power management system. For example, the transfer of energy between individual robots relies on the formation of a secure connection between modules and, despite the best efforts of the hardware designers to minimise the problem, the transfer of energy between different robots is accepted to be a lossy process. With faulty docking hardware the problem of inefficient energy transfer will be amplified, and so where possible it would be desirable that modules use their ‘best’ docking connectors for transferring energy. Other problems with the docking hardware that the robots must tolerate include a docking connector being continuously set (or un-set) to transfer energy, either with, or more destructively, without, the knowledge of the module.

Anomaly Detection

All of the above examples have assumed that when a fault occurs, it is possible for the robot, or other robots of the collective, to detect the fault; however, nothing has been explicitly said about how hardware faults or anomalies may be detected. This process of detection is far from trivial and deserves special attention. Due to the, perhaps superficial, analogy between an unhealthy biological organism and a faulty robot, artificial immune systems (AIS) present themselves as an appealing candidate for the task of anomaly detection. The extensive history of AIS within the field of anomaly detection, as evidenced by the number of immune-inspired anomaly detection systems reviewed in section 3.2, adds further weight to their potential utilisation.


There are two features in particular, prediction and distributed detection, that have been observed in artificial immune systems and provide yet further motivation for their application within the context of a collective robotic system. Prediction, as observed in the work of de Lemos et al. (2007), is an especially appealing property with relation to the formation of artificial organisms because it may help prevent robots that are likely to fail from ever joining an organism. Furthermore, within an artificial organism an individual that detects a potential power failure will be able to communicate this information to its neighbours, allowing the robot to be removed before it causes damage to the organism. Distributed detection is also desirable because, amongst other things, it will allow robots to collectively detect faults in individuals which may themselves be incapable of such detection. Hence, by reducing the number of false negatives (robots which are not aware that they are faulty), distributed detection may lead to a vastly improved anomaly detection system.

Beyond Anomaly Detection

Earlier, it was suggested that the analogy between an unhealthy biological organism and a faulty robot may be only superficial. The reason for this is that, on closer examination, the presence of a pathogen within a biological organism may not be comparable to the degradation of a sensor or the loss of movement in an actuator. The types of faults that can be expected in an artificial collective robotic system may more closely resemble more drastic biological malfunctions such as the partial loss of sight or the loss of feeling or movement in a limb. This observation alone is not sufficient to reject the AIS based anomaly detection approach, and that is not the intention. Artificial immune systems, as general tools that may be used to detect anomalies in data streams, regardless of where that data comes from and irrespective of the similarity to biological ‘anomalies’, are very good at what they do. The purpose of this observation is merely to highlight the fact that there may be an alternative approach in which the biological inspiration is more closely aligned to the task at hand and which, with some advantages, is able to tolerate the presence of faults without the need for an explicit anomaly detection step. Such an approach is now described.

Based upon Ashby’s notion of ultrastability, the proposed method involves monitoring some form of internal stability and using this as an indicator of the current state of the system. A system that is internally stable can be considered to be operating normally and a system without internal stability can be considered to be operating abnormally. Any significant departure from internal stability, that is, any condition, such as a faulty sensor, that leads to a loss of homeostasis, will trigger an adaptation in the control system that allows the system to regain stability and hence tolerate the change. This approach is not analogous to the detection and removal of pathogens by the immune system, but instead to the adaptation of the nervous system in response to environmental changes. Regardless of whether the changes are drastic internal sensorimotor disruptions or more subtle changes to the external environment, the response is essentially the same. This approach, shown to be a successful method of fault tolerance by both Di Paolo (2000) and Der et al. (1999), has several advantages over standard anomaly detection based techniques, which are now discussed.
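To make the mechanism concrete, the following is a minimal sketch of an ultrastable loop, and not a description of the systems of Ashby, Di Paolo or Der et al. Everything in it is an assumption made for illustration: a single essential variable `x`, a one-parameter controller (`gain`), a hypothetical set of four gain settings, a constant disturbance, and a fault modelled as an inverted sensor reading. Whenever `x` leaves its limits and is still diverging, the controller takes a random step to a new setting; nothing in the loop identifies what went wrong.

```python
import random


def simulate(steps=400, fault_at=200, limit=1.0, dt=0.1, seed=1):
    """Toy ultrastable loop: returns (final x, final gain, reconfigurations)."""
    rng = random.Random(seed)
    gains = [-1.0, -0.5, 0.5, 1.0]   # hypothetical set of controller settings
    gain = -0.5                      # initially stabilising
    polarity = 1.0                   # sensor polarity; the fault inverts it
    disturbance = 0.1                # constant push from the environment
    x = 0.5                          # essential variable, kept within +/- limit
    reconfigurations = 0

    for t in range(steps):
        if t == fault_at:
            polarity = -1.0          # injected fault: sensor reading inverted
        previous = abs(x)
        # The controller acts on the (possibly faulty) sensed value of x.
        x += dt * (gain * polarity * x + disturbance)
        # Essential variable out of bounds and still diverging:
        # take a random step to a new controller setting.
        if abs(x) > limit and abs(x) >= previous:
            gain = rng.choice(gains)
            reconfigurations += 1
    return x, gain, reconfigurations


x, gain, reconfigurations = simulate()
print(f"final x = {x:.3f}, gain = {gain}, reconfigurations = {reconfigurations}")
```

Under these assumed values the essential variable returns within its limits after the fault, purely as a side effect of regaining stability. The random choice of a new setting is the part that would need replacing in anything larger than a toy.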

The first major advantage has already been alluded to: through this approach, adaptive behaviour and fault tolerance come hand in hand; to all intents and purposes they are the same thing. In keeping with the consideration of the robot and its environment as part of the same inseparable system, as described in section 4.1.3, a change in the external environment and a faulty internal component have the same consequence: each leads to a re-configuration that acts so as to maintain homeostasis. This property is highlighted by the fact that Der et al. (1999) were not even looking to provide fault tolerance in their system; it came simply as a consequence of the adaptability of their approach.

Conventionally, the response to an anomaly detected using a standard immune-inspired approach will not itself be bio-inspired. In the case of a faulty sensor, the response may simply involve ignoring the output of that sensor and relying on the output of other redundant or degenerate sensors. However, even a faulty sensor may still be able to provide useful information about the environment and it would be highly desirable to make use of this information. For example, were the fault some form of sensor-bias, where the sensor returns an incorrect value that is a constant distance away from what it should be, the sensor would still be useful if it were possible to account for the bias. In order to determine by what amount to offset the bias, a system based upon explicit anomaly detection would need to be paired with some form of fault diagnosis module. Another advantage of the Ashbyian approach is that it does not assume any prior knowledge about the types of fault it may experience, hence it does not require specialised modules to diagnose faults. For a faulty sensor to remain useful with this approach, the only requirement is that there is at least some correlation between the environment and the values returned by the sensor. As the correlation between the values of a faulty sensor and the environment decreases to randomness, the ability of the system to adapt and produce a stable configuration that can tolerate the fault is lost.
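As an illustration of the kind of diagnosis step that an explicit detect-and-correct system would need for the sensor-bias example, the sketch below estimates a constant offset by comparing a suspect sensor against a second sensor that is assumed to be trustworthy. The function names, the use of a simple mean difference and the example readings are all hypothetical; the point is only that this route requires knowing in advance that the fault is a constant bias, which the Ashbyian approach does not.

```python
from statistics import mean


def estimate_bias(suspect_readings, reference_readings):
    """Estimate a constant sensor bias as the mean difference between a
    suspect sensor and a trusted reference measuring the same quantity."""
    differences = [s - r for s, r in zip(suspect_readings, reference_readings)]
    return mean(differences)


def correct(reading, bias):
    """Remove the estimated bias from a new reading of the suspect sensor."""
    return reading - bias


# Hypothetical distance readings (cm): the suspect sensor reads 3 cm too far.
reference = [10.0, 12.0, 15.0]
suspect = [13.0, 15.0, 18.0]

bias = estimate_bias(suspect, reference)
print(bias, correct(20.0, bias))
```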

This approach is, of course, not without some disadvantages. Such a system clearly favours the united approach to design advocated in section 4.1.4, and whilst this approach is desirable it is not always practical. Our position within the Symbricator consortium does not allow us much influence over the development of the behavioural control system itself, so it would be hard to integrate an approach such as this, which is so interlinked with the behaviour of the robot, into pre-existing behavioural controllers. At the level of power management, however, where we have much more control over the system, an approach such as this may find a place. The second, and quite major, disadvantage of this approach lies within the process of re-configuration. Addressing the question of how to find a new stable configuration is non-trivial. It has already been discussed that a purely random approach such as Ashby’s will not scale, but something more directed may not be sufficiently flexible and capable of finding desirable behaviours. A happy medium is required, but finding this will not be easy.

4.1.6 Artificial Energy Homeostasis

Artificial energy homeostasis is a term first mentioned in Humza et al. (2009) and discussed with the first author, Humza Raja of the Fraunhofer Institute for Biomedical Engineering, during his visit to York from July 5 to July 9 2010. Taking its name from the biological concept of homeostasis, generally speaking, artificial energy homeostasis can be thought of as the dynamic regulation of a system’s power in response to changes in both its external and internal environments, acting in such a way as to ensure the most efficient delivery, distribution and usage of energy by the system. The term encompasses a number of approaches at all three levels of robot interaction, from the energy foraging behaviours of individual robots to the distributed adaptive power management of an artificial robotic organism. The concept of artificial energy homeostasis has clear implications for both the survival and success of an autonomous collective robotic system. Without the ability to obtain energy from its environment, as the individual robots run out of energy, the system will soon become useless. Furthermore, without efficient use of power, not only will the time spent working towards the goal of the system be reduced, but the overall lifetime of the system, as determined by the maximum battery cycle life of the robots, will be decreased. In this section, some of the various approaches to artificial energy homeostasis are discussed with relation to the three different modes of operation: ‘individual’, ‘swarm’, and ‘organism’.

Individual

At the individual level, artificial energy homeostasis is achieved by the ability of individual robots to forage for energy in their environment, and once obtained, to make as efficient use of this energy as possible. The ability to make as efficient use of energy as possible has already been covered, incorporating, at the individual level, the ability of robots to turn off redundant sensors when they are not in use.

Swarm

At the level of the swarm, artificial energy homeostasis is seen in the ability of robots to not only look after their own energy requirements, but also those of the other members of the swarm. Through the process of energy trophallaxis, inspired by the ability of ants and other social insects to transfer food from their own stomachs to those of their neighbours, robots may transfer energy to other members of the swarm. Furthermore, robots may preserve the energy of the combined swarm, the so-called social stomach, by hibernating when there are no immediate benefits for their current continued operation, effectively working and sleeping in shifts.

Organism

In a robotic organism, artificial energy homeostasis incorporates all of the behaviours observed in the other levels of robot interaction, alongside some more specialised requirements imposed by the structural and functional uniqueness of an artificial robotic organism.

At the organism level, energy foraging remains a necessary requirement of artificial energy homeostasis, but the process through which it is achieved is very different to that observed at the individual level, requiring the co-ordinated movement of the entire organism in order to reach the power source, and once located, the distribution of energy amongst the entire organism, through an approach similar to energy trophallaxis.

Throughout normal operation, the organism must also dynamically manage the distribution and usage of energy within the organism, adapting to the different energy requirements of the individual modules, imposed by, amongst other things, their role in the organism, their current internal state and the environment in which they are operating. This topic, incorporating the dynamic reconfiguration of an organism’s “power bus”, is discussed extensively in the following section.

4.1.7 Summary

Three basic requirements of a long-term autonomous system may be identified: survival, operation and adaptation. Survival describes the basic ability of individual robots to continue operating, through the utilisation of basic behaviours, efficient power management and simple energy foraging. Operation incorporates all the goal-oriented behaviours of a robot; without goal-oriented behaviours a system may be seen to have no purpose. Adaptation, pervasive across all levels and perhaps the most important requirement of a long-term autonomous system, allows robots to survive and operate in a continuously changing environment. The environment in this context, it is important to emphasise, incorporates not just conditions external to the system such as the climate or terrain, but also internal conditions such as the state of charge of a robot’s battery or the flow of current in an artificial organism.

The modularity and flexibility of the Symbricator hardware provides an interesting platform upon which to construct an autonomous system. When designing an autonomous control system for the Symbricator platform, a united approach that takes into account all of the aforementioned requirements at the same time, as well as the hardware upon which it will be deployed, is advisable.

Two particular concepts that can be identified as potential methods of providing long-term autonomy to a collective robotic system are fault tolerance and artificial energy homeostasis. The necessity for fault tolerance comes from the desire to increase the reliability of robotic systems, both at the level of the individual and at the level of the collective. Various actuator, sensor and power system failures may be identified and at least two methods to tolerate them imagined: the first based upon explicit immune inspired anomaly detection, and the second based upon Ashby’s principles of adaptation and ultrastability. Artificial energy homeostasis involves a number of different behaviours, from the basic energy foraging and power management of individuals, to energy trophallaxis in swarms and adaptable power management in organisms. At all three levels of robot interaction, these processes of artificial energy homeostasis, combined with fault tolerance, are essential to the survival and continued operation of a long-term autonomous system.

Figure 4.3: An example power bus configuration for an artificial robotic organism. (Diagram: modules A to H arranged on two power buses, Bus 1 and Bus 2. Key: self powered, powering others, externally powered, recharging, energy sharing ON, energy sharing OFF, power bus.)

4.2 Artificial Robotic Organism Energy Homeostasis

Within the context of SYMBRION, as a first step towards ensuring system wide energy homeostasis, my research will initially focus only on the energy homeostasis of a robotic organism. Specifically, the task I will look at is the dynamic re-configuration of an organism’s power bus in response to changes in environmental conditions, both internal and external to the organism. This section begins by explaining the basic hardware architecture of the Symbricator power management system and, in the process, identifying the constraints of the hardware and some of the problems that must be overcome when designing an adaptable power management system. The task of dynamically re-configuring the power bus of an artificial organism is then described in more detail, highlighting some of the general requirements of the task. Finally, some suggestions are made as to how, in taking inspiration from biology, the task of re-configuring an organism’s power bus may be achieved in a manner that is both energy efficient and fault tolerant.

4.2.1 Symbricator Power Management System

The flexibility of the Symbricator power management system is such that when an artificial organism is formed, the modules involved are able to create “power buses”, through which they may share power with any other modules connected to the same bus. There are essentially two forms of energy sharing that modules may take part in. Firstly, individual modules are able to recharge others by directly transferring energy from their own batteries to the batteries of other modules. Secondly, modules are able to directly power the components of another module, without the need to recharge that module’s battery. Whether or not two neighbouring modules are able to share power, that is to say, whether or not two modules are connected to the same power bus, is determined by the individuals: each module is able to decide which of its four connectors will allow for the transfer of energy. For a power bus link to be created between two modules, both modules must allow for the transfer of energy from the appropriate connectors.

All of the basic properties of the power management system, as described above, are shown in figure 4.3. In the simple organism of figure 4.3 there are two power buses: “Bus 1”, connecting modules C, B, D and E, and “Bus 2”, connecting modules F, G and H. Even though one of module A’s connectors is set to allow for energy sharing, the connector of module B that it is linked to is not, and so module A is not connected to any other modules through a power bus. Bus 1 shows how a single module, C, acting as a source, may power the components of several other modules on the same bus, and Bus 2 shows similarly how a single module may recharge other modules. In this example each bus contains only one source, which is either powering or recharging the connected modules; in larger organisms, however, it is possible for a bus to contain multiple sources and to perform a mixture of recharging and externally powering.
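The bus membership described for figure 4.3 can be reproduced with a short sketch. It is illustrative only: the physical links between modules are assumed (the text fixes only which modules share a bus), and the names `LINKS`, `SHARING_ON` and `power_buses` are hypothetical. The sketch uses the simplified rule that a link joins a bus only when both modules enable sharing on the facing connectors.

```python
from collections import defaultdict

# Assumed physical links for the organism of figure 4.3 (hypothetical layout).
LINKS = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E"),
         ("E", "G"), ("F", "G"), ("G", "H")]

# (m, n) is present when module m enables energy sharing on the
# connector facing module n.
SHARING_ON = {("A", "B"),
              ("B", "C"), ("C", "B"), ("B", "D"), ("D", "B"),
              ("D", "E"), ("E", "D"),
              ("F", "G"), ("G", "F"), ("G", "H"), ("H", "G")}


def power_buses(links, sharing_on):
    """Group modules into buses; a link counts only if both ends enable it."""
    neighbours = defaultdict(set)
    for a, b in links:
        if (a, b) in sharing_on and (b, a) in sharing_on:
            neighbours[a].add(b)
            neighbours[b].add(a)
    buses, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        bus, stack = set(), [start]
        while stack:  # depth-first walk over mutually enabled links
            module = stack.pop()
            if module not in bus:
                bus.add(module)
                stack.extend(neighbours[module] - bus)
        seen |= bus
        buses.append(sorted(bus))
    return buses


print(power_buses(LINKS, SHARING_ON))
```

Module A appears on no bus because module B does not enable the facing connector, and modules E and G stay on separate buses because neither enables the link between them.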

A more detailed overview of the power management system of a single Symbricator robot can be seen in figure 4.4. The most important components are the battery pack, the battery management module, the energy sharing module, the system power manager and the four docking interfaces. The battery pack consists of six lithium polymer (LiPo) cells providing approximately 1400 mAh of charge capacity and a nominal voltage of 22.2 volts. The battery management module (BMM) is responsible for managing the battery pack; its roles include monitoring the current state of charge, cell balancing and controlling the charging and discharging of the battery pack. The energy sharing module controls the robot’s ability to share power with modules belonging to the same power bus. The system power manager is responsible for a number of power related functions, including providing power to the high voltage components. Finally, the docking interfaces allow for bidirectional current flow between physically connected modules.

Figure 4.4: Block diagram of the power management system of a single Symbricator robot. [Diagram not reproduced. Blocks: docking interfaces A to D, battery pack, battery management module (BMM), battery charger, energy sharing module, system power manager, 3V3 regulator, core components and sensors, high voltage peripherals.]

One important detail, relating to the flow of current between connected modules, was omitted in the simplified description given above. It was said that for two modules to share energy they must both allow for the transfer of energy from the appropriate connectors. Whilst this remains true, what was omitted is the fact that modules can only determine whether or not current flows out of their interfaces, and not whether it flows in. For example, in figure 4.3 module B’s docking interface, on the side at which it is connected to module A, is set to prevent current from flowing out; conversely, the associated docking interface of module A is set to allow current to flow out. The resulting behaviour is shown by the dashed yellow line in figure 4.5 (a): current is able to flow from module A to module B but not in reverse from module B to module A. In a similar manner, figure 4.5 (b) shows how current can flow bi-directionally between modules D and E, since they both have their docking interfaces set to allow current to flow. There is a third situation that is not pictured in figure 4.5, exemplified by the connection between modules E and G in figure 4.3: since both modules are set to prevent the flow of current out of their interfaces, no current can flow in any direction between the two modules.

This feature of the hardware design adds even more flexibility to the system, allowing robots to power others without the need for any form of communication or hand-shaking. One particular benefit of this flexibility is that it potentially allows robots to revive other modules which have completely run out of energy. Of course, with this added flexibility comes added complexity and, with it, a higher risk of undesirable behaviour, which will need to be considered during the design of the adaptive power management software.
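The out-flow rule can be made concrete with a minimal sketch. The function name `flow_directions` and the set-of-pairs encoding of the docking switches are assumptions made for illustration; they are not part of the Symbricator software.

```python
def flow_directions(a, b, out_enabled):
    """Return the directions in which current may flow across the link a-b.

    out_enabled holds (module, neighbour) pairs for which `module` lets
    current flow out of the connector facing `neighbour`. A module has no
    way of blocking current that flows in.
    """
    directions = []
    if (a, b) in out_enabled:
        directions.append((a, b))
    if (b, a) in out_enabled:
        directions.append((b, a))
    return directions


# Switch settings for the three cases discussed for figures 4.3 and 4.5.
out_enabled = {("A", "B"), ("D", "E"), ("E", "D")}

print(flow_directions("A", "B", out_enabled))  # one way, from A to B
print(flow_directions("D", "E", out_enabled))  # both ways
print(flow_directions("E", "G", out_enabled))  # no flow at all
```

Because the receiving module takes no part in the decision, a module with a flat battery can still be supplied, which is the revival case mentioned above.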

There are certain hardware constraints that must also be taken into account during the development of the power management system. For example, the chosen docking connectors limit the flow of current through an organism’s power bus to 8 A. Assuming that each robot consumes around 1 A of current when recharging, the theoretical maximum number of modules on a single bus that can recharge simultaneously is eight. Other features that must be considered include measurement inaccuracies resulting from the conversion of analogue signals to digital. In particular, these inaccuracies are observed in the estimation of the state of charge by the battery management module and in the estimation of branch current by the power manager.
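The current budget above is simple arithmetic, sketched here so that the effect of measurement error can be seen. The 8 A and 1 A figures come from the text; the `measurement_error_a` headroom parameter is a hypothetical addition used only to illustrate why inaccurate current estimates matter.

```python
BUS_CURRENT_LIMIT_A = 8.0  # limit imposed by the docking connectors
RECHARGE_CURRENT_A = 1.0   # approximate draw of one recharging module


def max_recharging_modules(limit_a=BUS_CURRENT_LIMIT_A,
                           per_module_a=RECHARGE_CURRENT_A):
    """Theoretical number of modules that can recharge on one bus at once."""
    return int(limit_a // per_module_a)


def within_budget(n_recharging, measurement_error_a=0.0):
    """Check a planned load, reserving headroom for measurement error."""
    load_a = n_recharging * RECHARGE_CURRENT_A
    return load_a + measurement_error_a <= BUS_CURRENT_LIMIT_A


print(max_recharging_modules())
print(within_budget(8), within_budget(8, measurement_error_a=0.5))
```

With any non-zero uncertainty in the branch current estimate, a bus loaded to the theoretical maximum can no longer be shown to respect the connector limit.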

4.2.2 Power Bus Homeostasis

Within an artificial robotic organism, individual modules can exist in one of a number of different states: self powered, externally powered, powering others or recharging. The potential states of all the individual modules in the organism, combined with the fact that, in terms of energy sharing, modules may be uni- or bi-directionally connected to anywhere between zero and all of their physically connected neighbours, mean that, even for a relatively simple organism topology, there exists an extremely large number of possible power bus configurations. The task of the power management system, then, is to find from the many possibilities a suitable configuration that, at any one moment in time, leads to the most efficient usage and distribution of power within the organism, whilst at the same time tolerating the presence of faults and adhering to the physical constraints of the hardware. This section addresses some of the issues involved in the dynamic re-configuration of an organism’s power bus. Beginning with some thoughts on the general methods of controlling the re-configuration, this section then presents some suggestions as to which situations or conditions should lead to a re-configuration of the power bus. Finally, the necessity of finding configurations in which the usage and distribution of power is both efficient and fault tolerant is discussed.
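To give a feel for the size of the configuration space, the following sketch counts switch settings. Each physical link has four settings (no flow, one way, the other way, both ways). Multiplying by four states per module gives only a loose upper bound, since module states and switch settings are not independent. The seven-link, eight-module example is an assumed layout, not one taken from the report.

```python
def link_configurations(n_links):
    """Each link allows no flow, flow one way, the other way, or both."""
    return 4 ** n_links


def configuration_upper_bound(n_modules, n_links, n_states=4):
    """Loose upper bound that treats module states as independent of links."""
    return n_states ** n_modules * link_configurations(n_links)


print(link_configurations(7))           # switch settings alone
print(configuration_upper_bound(8, 7))  # including module states
```

Even this small organism has over sixteen thousand switch settings, and the count grows by a factor of four with every additional link.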

Figure 4.5: Two examples of how neighbouring modules in an artificial robotic organism may share power. The dashed yellow lines show the directions in which current may flow. Due to the configuration of their docking switches, in (a) current is only able to flow from module A to module B and not from module B to module A. In (b) current is able to flow bi-directionally between modules D and E. [Diagram not reproduced.]

Controlling re-configuration

Of all the conceivable methods of controlling the re-configuration of an organism’s power bus, the majority can be described as either centralised or decentralised forms of control. Centralised control requires a single module to take full responsibility for the actions of the entire organism. Although data and processing may be distributed throughout all of the modules, ultimately the decision of which configuration the power bus should take is the responsibility of a single module alone. Decentralised control, on the other hand, is entirely distributed. Acting independently, modules communicate directly with only their immediate neighbours and indirectly, through the flow of current on the power bus, with more distant modules. With the decentralised approach, the configuration of the power bus is not the decision of a single module; rather, it is an emergent property of the local interactions of every member of the organism. Hybrid approaches that lie somewhere between centralised and decentralised control may exist; however, for simplicity, only the two extreme cases are considered here. The relative merits and faults of these two opposing forms of control are now discussed.

Designing an emergent system is a notoriously difficult task. Obtaining the correct system level behaviours purely from the interactions of low level agents is a non-trivial problem, made harder by the fact that the correlation between the two levels is often difficult to observe. Due to these difficulties, the design of a centralised controller may be seen as a conceptually simpler task than the design of a decentralised controller. There are, however, several disadvantages to the centralised approach which must be taken into account. With the centralised approach, the main decision making module will require global knowledge of the entire organism, including the current shape of the organism, the current power bus configuration and the internal states of every module involved. This requirement for global knowledge has several drawbacks. Most notably, it requires every module in the organism to periodically communicate its current state to the decision making module, a costly process which does not scale well as the size of the organism increases. Another disadvantage of the centralised approach is that it introduces a single point of failure to the system, namely the single decision maker. Whilst the module responsible for decision making may be changed in the event of a failure, any failure must first be detected and then communicated to all of the other modules in the organism. With the decentralised approach, no single module requires global knowledge of the overall state of the organism and modules need only communicate with their immediate neighbours, making the approach far more scalable and computationally efficient than the centralised form of control. Furthermore, for the purposes of power management, modules do not even need to know the shape of the organism that they are situated in. This fact, combined with the absence of a single point of failure, means that the decentralised approach is also inherently far more robust to the presence of faults or changes to the organism’s morphology.

When to re-configure?

The question of when to re-configure the power bus has been touched upon throughout this chapter. Ideally, the situations that lead to a re-configuration of the power bus should not be describable in an “if-then” manner. Especially if a decentralised approach to control is employed, the process should be continuous, self-organising and emergent. There are, however, certain events which may be correlated to a change in the power bus configuration and it remains useful to identify them.

Re-configuration of the power bus will be necessary whenever the stability of the current configuration is lost. To borrow the biological terms used to describe homeostasis in section 3.1.2, this is whenever essential variables are at risk of breaching their physiological limits. The essential variables in this context are as yet undefined; however, the state of charge of each module and the measured power bus currents will almost certainly have a central role. The values of the essential variables will change according to the current state of the system (which includes both the internal and external environments), and so any change to the system, internal or external, may lead to a re-configuration. External changes include, for example, the organism encountering a new type of terrain, in which the energy requirements of the individual modules may be different, and which may hence lead to the emergence of a new configuration. Similarly, and with the same consequences, a change in the organism’s task may affect the energy requirements of the modules and result in the formation of a new configuration. Internal changes include faults in the individual modules or the gradual decrease over time in the total amount of energy stored in the organism, with different levels requiring different distribution strategies, and hence different configurations.
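As an aid to identifying such events, and not as a proposed controller, the idea of essential variables leaving their limits can be sketched as follows. The bands `SOC_BAND` and `BUS_CURRENT_BAND_A` are hypothetical values chosen for illustration, since the essential variables and their limits are as yet undefined.

```python
# Hypothetical safe bands for two candidate essential variables.
SOC_BAND = (0.15, 1.00)          # state of charge, as a fraction
BUS_CURRENT_BAND_A = (0.0, 7.5)  # amps, kept below the 8 A connector limit


def outside(band, value):
    """True when value lies outside the closed interval given by band."""
    low, high = band
    return not low <= value <= high


def variables_at_risk(soc_by_module, current_by_bus):
    """List the modules and buses whose readings have left their band."""
    modules = sorted(m for m, soc in soc_by_module.items()
                     if outside(SOC_BAND, soc))
    buses = sorted(b for b, amps in current_by_bus.items()
                   if outside(BUS_CURRENT_BAND_A, amps))
    return modules, buses


readings = variables_at_risk({"C": 0.80, "D": 0.12, "E": 0.40},
                             {"Bus 1": 7.8, "Bus 2": 2.0})
print(readings)
```

In an emergent scheme these quantities would drive the local behaviour of each module continuously, without an explicit trigger of this kind.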

Finding suitable configurations

Whether or not a particular configuration is suitable is determined by the current conditions of both the internal environment of the organism itself and the external environment in which it is operating, both of which must be taken into account when re-configuring the system. External conditions which must be considered include the availability of energy resources in the environment and the type of terrain in which the organism is operating. Internal conditions, of which there are several, include the current configuration of the system, the current state of charge of each of the involved modules, the power bus currents, the role of each module within the organism and their associated energy requirements, as well as the physical condition of each of the modules and of their multiple on-board sensors and actuators.

In a “good” configuration, the distribution of energy amongst modules will be as efficient as possible. For distribution to be efficient, configurations must be found which allow energy to be transported more readily to the modules which require it the most, whilst, with a lower priority, still servicing the modules with lower energy requirements. Furthermore, to stabilise the flow of current, the disparity between the states of charge of neighbouring modules must be reduced.
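One way to picture these two requirements is a purely local rule in which a module offers current only to neighbours that are noticeably more depleted than itself, most depleted first. This is a hedged sketch, not a method proposed in the report: the function `outflow_targets`, the `threshold` parameter and the example values are all hypothetical.

```python
def outflow_targets(module, soc, neighbours, threshold=0.1):
    """Neighbours to which `module` would let current flow out.

    A neighbour qualifies when its state of charge is lower than the
    module's by more than `threshold`; the result is ordered so that the
    most depleted neighbour is served first.
    """
    needy = [n for n in neighbours[module]
             if soc[module] - soc[n] > threshold]
    return sorted(needy, key=lambda n: soc[n])


soc = {"A": 0.20, "B": 0.90, "C": 0.85, "D": 0.50}
neighbours = {"B": ["A", "C", "D"]}

print(outflow_targets("B", soc, neighbours))
```

The rule needs only the states of charge of immediate neighbours, which is consistent with the decentralised form of control discussed earlier.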

In terms of the efficient usage of energy (another indicator of a good configuration), individual modules within an organism must monitor their own usage to make sure they are not wasting energy unnecessarily. This may involve turning off redundant or unused sensors which are rendered useless by their position in the organism; for example, in figure 4.3 module D is unlikely to need any of the sensors positioned on the sides with which it is connected to modules B and E. The role of a module within the organism should also affect which components it has turned on. There would be no reason, for example, for a dedicated sensing module to have its locomotion actuators turned on. Conversely, an actuator module would not need to provide power to any of its sensing components, and a dedicated energy reserve could power down almost all peripheral components.

As previously mentioned, the chance of a hardware fault occurring in the Symbricator hardware is high, and so when faults do occur the system must be able to tolerate them. Since faults at any level of organisation will have a large effect on the power consumption of the organism, the power management system must be able to tolerate a large variety of faults, from sensor and actuator faults, to batteries with unusually high dissipation rates and loose docking connectors. Ideally, power bus configurations would be inherently robust to the introduction of faults, requiring the minimum amount of re-organisation when faults do occur. Such configurations may incorporate redundant channels, eliminating single points of failure, such that if a module on one channel fails, power can still be routed via an alternate channel. This approach will not always be possible and sometimes a more drastic re-configuration will be needed; in such situations it may be necessary to determine the exact location and cause of the fault. To reduce the detrimental effects of a fault, the time taken to detect and account for it should be minimised. Ideally, in a predictive manner, a faulty module should be detected and accounted for before it fails. Once a faulty module has been detected there are a number of ways it may be dealt with, depending on its location, its role within the organism and the type and severity of the fault in question. A faulty docking connector, which may lead to an excessive loss of power in transfer or the complete prevention of energy sharing on that side of the module, should result in that unit being fully or partially isolated from the rest of the organism, i.e. the whole module, or just the faulty connector, should be prevented from joining a power bus. Any sensor or actuator fault will almost certainly have a detrimental effect on the behaviour of the organism and will consequently lead to an excessive use of power. The appropriate response to a sensor or actuator fault very much depends on the role of the module within the organism. Some possible responses, such as changing the position or role of the module to match its abilities, have already been mentioned. Alternatively, the faulty module may be completely removed from the organism and replaced with a fully functioning one or, without replacement, the organism shape may adapt to a configuration involving one less module. Perhaps of most relevance to the task of energy homeostasis are faults within the power management system, such as the failure of a battery, or the unreliable reading of a battery’s current status. Such a fault may be tolerable simply by re-organising the power bus, or it may require some of the more drastic re-structuring methods mentioned above.
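The notion of a configuration with no single point of failure can be checked by brute force: remove each module in turn and test whether the remaining bus members are still connected. The sketch below is an illustration under assumptions; the link lists are hypothetical and the function names are not taken from any Symbricator software.

```python
def is_connected(nodes, links):
    """True when every node can be reached from every other over links."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        module = stack.pop()
        for a, b in links:
            if a == module and b in nodes and b not in seen:
                seen.add(b)
                stack.append(b)
            elif b == module and a in nodes and a not in seen:
                seen.add(a)
                stack.append(a)
    return seen == nodes


def single_points_of_failure(nodes, links):
    """Modules whose loss would split the rest of the bus."""
    return sorted(m for m in nodes
                  if not is_connected(set(nodes) - {m}, links))


bus = ["B", "C", "D", "E"]
chain = [("B", "C"), ("B", "D"), ("D", "E")]  # assumed layout of Bus 1
ring = chain + [("C", "E")]                   # hypothetical redundant link

print(single_points_of_failure(bus, chain))
print(single_points_of_failure(bus, ring))
```

In the assumed chain, the loss of module B or D isolates part of the bus; adding one redundant link removes both weaknesses, at the cost of an extra connector carrying current.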

4.2.3 Approaches to Power Bus Homeostasis

Now that the task of maintaining the energy homeostasis of an artificial organism has been introduced, and the basic details of the Symbricator power management system have been outlined, it is possible to begin to answer the important question of how to meet the general requirements for artificial energy homeostasis described above. As a starting point, my research will focus on the dynamic re-configuration of the power bus alone, assuming a static organism shape with no physical re-configuration of the modules taking place. This focus is reflected in the content of this section; however, because the physical structure of the organisms and the configuration of the power bus are so inter-linked, the topic will not be ignored completely. This section provides some early thoughts on how the problem of re-configuration may be tackled. Four different types of bio-inspired approach are discussed, the advantages and disadvantages of each are outlined, and some methods through which they may be combined are described. First, however, a classical engineering solution is proposed, which may be used as a benchmark for the assessment and comparison of the bio-inspired approaches.


Classical engineering

The task of dynamically re-configuring the power bus of an artificial robotic organism has, to the author’s knowledge, never been considered before, largely due to the lack of any hardware system with the capabilities that would make it possible. The flexibility of the Symbricator platform, in allowing for such dynamic re-configuration, is unique. Such uniqueness, though interesting and exciting from a research standpoint, is fraught with difficulties from an engineering perspective. Thankfully, the task of designing and building the hardware itself is, to a large extent, complete. However, from a software point of view, the lack of a proven method of tackling the problem leaves little in the way of a ‘starting point’ and makes the assessment or comparison of any developed solutions difficult. That said, there are a number of parallels between this problem and the extremely well developed field of computer networking, which may allow the development of some baseline techniques for comparison. In computer networking, data must be passed between different nodes within a network; typically this is achieved through the use of routing algorithms which determine the path along which the data will travel. The task at hand will also require the construction of routes, connecting different modules within an organism, and transmitting not data¹, but energy. A particular type of computer network that shares many similarities with the Symbricator platform is the Wireless Sensor Network (WSN). Like the Symbricator platform, a WSN consists of several, possibly heterogeneous and almost certainly autonomous, units or nodes, which communicate and cooperate in order to achieve a common goal; most often this is some form of monitoring task. Unlike a Symbricator organism, the units in a WSN are spatially distributed and although some may be mobile, for the majority any form of movement is passive, occurring only when the environment in which they are situated changes. The nodes in a WSN form a wireless ad-hoc network, the structure of which, like the power bus of a Symbricator organism, must dynamically adapt to changes in the environment and does so, for the most part, in a decentralised manner. WSNs are also subject to many of the same challenges as an artificial organism: they have limited energy resources, they must survive autonomously for long periods of time and they must be robust to and tolerant of the presence of faults. To meet these and other challenges, many people have turned to biology for inspiration (Dressler and Akan, 2010). Such approaches may help guide the development of the bio-inspired techniques described in the following sections; however, the main reason for investigating computer networking, and sensor networks in particular, is to find well accepted, state of the art techniques (which may well include some bio-inspired approaches) with which the novel algorithms developed for the Symbricator platform may be compared. At a later stage it may even be possible to give back to the field of computer networking, by adapting the algorithms developed here to suit the needs of wireless sensor networks.
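As an indication of the kind of baseline that the networking analogy suggests, the sketch below finds the route with the fewest hops between an energy source and a sink using a breadth-first search. Hop count is used here only as a stand-in metric; the report does not name a particular routing algorithm, and the link list is an assumed layout of Bus 1.

```python
from collections import deque


def shortest_route(links, source, sink):
    """Fewest-hop route from source to sink, or None if none exists."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    parent = {source: None}
    queue = deque([source])
    while queue:
        module = queue.popleft()
        if module == sink:
            route = []
            while module is not None:  # walk back to the source
                route.append(module)
                module = parent[module]
            return route[::-1]
        for n in neighbours.get(module, []):
            if n not in parent:
                parent[n] = module
                queue.append(n)
    return None


bus_1 = [("B", "C"), ("B", "D"), ("D", "E")]  # assumed links
print(shortest_route(bus_1, "C", "E"))
print(shortest_route(bus_1, "C", "H"))
```

A more realistic baseline would weight each hop by transfer loss and by the 8 A connector limit, which is where established network routing techniques may offer guidance.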

Artificial development

The cells, tissues, skeletal structure, organs and limbs of the human body all originate from a single fertilised egg; in multicellular organisms, this remarkable ability of complex structures and patterns to emerge from such simplicity is accounted for by the biological processes of growth and development. These processes continue throughout the lifetime of the organism, helping to provide the high levels of adaptivity and robustness that are typically found in multicellular organisms. Adaptivity is observed in the ability of plants to adapt their structure in response to changes in their external environment; for example, a plant whose light source is blocked off will grow away from or around the obstructing object in order to overcome the problem. Robustness is ensured by the high levels of redundancy and degeneracy across all levels of multicellular development, allowing organisms to survive in potentially dangerous environments. The ability to grow new cells, and in extreme cases whole limbs, in response to physical damage is one example of the type of robustness found in multicellular organisms. These two properties, along with scalability, another inherent characteristic of development, would be highly desirable in artificial systems. The field of artificial development arose, at least partly, in response to the desire to provide artificial systems with such properties. The study of artificial systems inspired by biological development has been referred to by a variety of different names (Chavoya, 2009): artificial embryology, morphogenesis, artificial ontogeny, computational embryology, computational development and artificial embryogeny. The term ‘artificial development’ used here is broad enough that it covers all of these approaches.

There is a wide variety of computational systems that can fall under the term artificial development, from Lindenmayer systems (L-systems) and random boolean networks to reaction-diffusion systems and artificial gene regulatory networks (Chavoya, 2009). All artificial development systems, though, can be placed into one of two classes: macro- or micro-model developmental systems (Kuyucu et al., 2010). Macro-model systems, including for example L-systems, are inspired by biological development from a high level of abstraction; individual cells and chemicals are not modelled and, instead, only the general characteristics of development are captured. Micro-model systems examine biology from a much lower level, focusing on the interactions of subcellular components such as DNA and proteins. Though they still make use of abstractions, micro-model development systems, such as artificial gene regulatory networks, model biology much more accurately than macro-model systems.
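To illustrate the level of abstraction at which a macro-model system operates, the sketch below implements a minimal L-system: every symbol of a string is rewritten in parallel according to a fixed set of rules. The example rules are Lindenmayer’s well known algae system and are not specific to power bus re-configuration.

```python
def l_system(axiom, rules, generations):
    """Apply the rewriting rules to every symbol in parallel, repeatedly."""
    state = axiom
    for _ in range(generations):
        # symbols without a rule are copied unchanged
        state = "".join(rules.get(symbol, symbol) for symbol in state)
    return state


ALGAE_RULES = {"A": "AB", "B": "A"}

for generation in range(5):
    print(generation, l_system("A", ALGAE_RULES, generation))
```

No cells or chemicals are modelled, yet a structured pattern grows from a single symbol, which is the property that makes such systems cheap to run on the limited hardware of a module.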

To understand how an artificial developmental system might be used to help re-configure the power bus of a robotic organism, some basic knowledge of the processes of biological development is needed. Among the most important components of biological development are gene regulatory networks (GRNs). Through the interactions of multiple genes and proteins, GRNs help to explain how the complex structures of multicellular organisms are formed from simple genetic instructions. Each of the genes in an organism holds the code for a specific type of protein, but only when these genes are expressed can the associated proteins be created. Whether or not a gene is expressed is determined by the state of the cell, i.e. the concentration of certain chemicals and proteins, and by the activity of other regulatory genes, which may themselves depend on the expression of yet further genes. For a gene to be expressed, special proteins known as transcription factors must bind with specific regions on the gene known as cis-regulatory elements. Not all transcription factors will lead to the expression of the associated gene and some may actively inhibit it, but if the right factors do bind, the gene will be expressed and as a result will produce proteins of its own. These proteins may be transcription factors that influence the activity of other genes, or proteins which alter the structure or function of the cell. A transcription factor may affect multiple genes, and the gene that produces it may in turn require the presence of multiple factors. In this sense, a self-regulating network is formed that is influenced by the concentrations of chemicals and proteins within the environment and leads to the creation of multiple proteins, some of which may feed back into the network and some of which may lead to cell growth and differentiation.

¹ Since a dynamic communication bus is also created when an organism is formed, the transmission of data between the modules in an organism must also be ensured; however, that is beyond the scope of this work.

Because of the inherent adaptivity, robustness and scalability that they provide, as well as their ability to operate in highly dynamic environments, artificial development systems seem very well suited to the task of re-configuring the power bus of an artificial robotic organism. What is more, to add further robustness to the system, artificial development could be combined with an immune inspired approach, as some authors are already attempting (Liu et al., 2010). If a micro-model system using an artificial GRN was employed then the obvious strategy would be to consider each module of the organism as a cell, or a position in which a cell could potentially appear. Starting with a single cell, a micro-model artificial developmental system could ‘grow’ the initial structure of the power bus. Influenced by the current state of the environment, individual cells could divide, creating new cells in neighbouring modules, and differentiate into particular cell types that would define the current role of that module: self powered, powering others, externally powered or recharging, for example. Any two neighbouring modules which both contained live cells could automatically be considered to belong to the same power bus. For the system to be adaptive, the concentrations of the various chemicals that affect the regulatory network would need to correspond to the current state of the system; for example, in one module there might be a protein whose concentration is related to the amount of current flowing at any one moment in time, or to the state of charge (SOC) of that module. The problem of which genes produce which proteins and which proteins affect which genes, in other words, how to determine the dynamics of the GRN, is usually solved by artificial evolution, a prospect that fits well with the global focus of the SYMBRION project.
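The idea of a cell differentiating into a module role can be pictured with a toy boolean regulatory network. Everything in this sketch is an assumption made for illustration: the two genes, their hand-chosen weights and thresholds, and the three environmental signals. In practice the regulatory dynamics would be evolved and not written by hand.

```python
# Each gene lists its regulating factors with a weight (positive values
# activate, negative values inhibit) and an expression threshold.
GENES = {
    "donor": ({"soc_high": 1, "neighbour_needy": 1, "receiver": -2}, 2),
    "receiver": ({"soc_low": 1, "donor": -2}, 1),
}


def step(state, signals):
    """Synchronously update every gene from signals and gene products."""
    factors = {**signals, **state}
    return {gene: sum(w * factors.get(f, 0) for f, w in weights.items())
            >= threshold
            for gene, (weights, threshold) in GENES.items()}


def develop(signals, steps=5):
    """Run the network from an unexpressed state and read off a role."""
    state = {gene: False for gene in GENES}
    for _ in range(steps):
        state = step(state, signals)
    if state["donor"]:
        return "powering others"
    if state["receiver"]:
        return "recharging"
    return "self powered"


print(develop({"soc_high": True, "soc_low": False, "neighbour_needy": True}))
print(develop({"soc_high": False, "soc_low": True, "neighbour_needy": False}))
```

The two genes inhibit one another, so a cell settles into at most one of the two sharing roles, and a module with a healthy battery and no needy neighbour remains self powered.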

The use of a micro-model approach, especially when combined with artificial evolution, does have some potential problems in this context. Firstly there is the problem of scale. In biology the number of cells involved in development is extremely large, and even in artificial development, systems tend to simulate around 100-500 cells. In the Symbricator platform it is envisaged that an organism will consist of anything from two to two hundred modules, but is likely to contain no more than 10 or 20 in early implementations. With such a small number of potential cells it is not clear how well the micro-model approach will perform. One solution would be to simulate multiple cells within each module, but this amplifies the second problem with this approach: the fact that it is very computationally expensive, especially when combined with artificial evolution. A solution to both of these problems might be to use a macro-model developmental system instead, the scale of which would be much better suited to the task at hand and the computational overheads of which would be greatly reduced.

Evolutionary computation

Another bio-inspired paradigm that fits nicely within the scope of the SYMBRION project is the well established field of evolutionary computation. Utilised in either an online or offline manner, artificial evolution could be used to optimise the current configuration of an organism’s power bus or to adapt it more drastically in response to changing environmental conditions.

The most natural use of artificial evolution would seem to be in the evolution of the layout of the power bus, in which case the problem reduces almost to one of evolving graph topologies. There are several problems with this approach, however. To allow the organism to adapt to changing environmental conditions, evolution would need to be performed online, in which case any poor solutions, of which there can be expected to be many, may have disastrous consequences for the organism. Evolving topologies offline in simulation would remove the problem of undesirable configurations but at the expense of online adaptability. Other disadvantages of this approach are the high computational requirements of evolution and the fact that what would essentially be a centralised form of control brings with it all the problems mentioned in section 4.2.2, such as the lack of scalability and single points of failure.
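A minimal sketch of what evolving a bus layout could look like offline is given below, using a (1+1) evolutionary algorithm over a bit string with one bit per docking switch. The fitness function is a placeholder that simply counts switches matching a hypothetical target; a real fitness measure would have to come from simulation of the organism, and none is defined in this report.

```python
import random

# Hypothetical target switch settings, used only to give the search a goal.
TARGET = [True, False, True, True, False, False, True, False]


def matching_switches(genome):
    """Placeholder fitness: number of switches that agree with TARGET."""
    return sum(g == t for g, t in zip(genome, TARGET))


def evolve(fitness, n_bits, generations=200, seed=0):
    """(1+1) EA: mutate one parent, keep the child if it is no worse."""
    rng = random.Random(seed)
    parent = [rng.random() < 0.5 for _ in range(n_bits)]
    best = fitness(parent)
    history = [best]
    for _ in range(generations):
        # flip each bit independently with probability 1 / n_bits
        child = [(not bit) if rng.random() < 1 / n_bits else bit
                 for bit in parent]
        score = fitness(child)
        if score >= best:
            parent, best = child, score
        history.append(best)
    return parent, history


genome, history = evolve(matching_switches, n_bits=len(TARGET))
print(history[0], history[-1])
```

Every rejected child in such a run corresponds to a configuration that was tried and found wanting, which is harmless in simulation but is exactly the risk identified above for online evolution on real hardware.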

Rather than relying on the process of evolution to provide adaptability, a better use of artificial evolution would be to evolve an adaptable system itself. As was suggested in the previous section, this might involve, but is not limited to, evolving a developmental system such as a gene regulatory network, or a simpler macro-model developmental system.

Immune inspiration

The use of immune inspired approaches alone may not be sufficient to control the re-configuration of an organism’s power bus, and so they will almost certainly need to be combined with other approaches. However, as has been discussed throughout, immune inspiration can provide some very effective methods of providing a collective robotic system with fault tolerance, which is essential for the task of re-configuring an organism’s power bus. The most obvious application will be in the detection of anomalies in the power management system, which ideally would be achieved in a predictive manner at both the individual level and, through a distributed approach, at the organism level.

Self-organisation

Although the fields of artificial development, evolutionary computation and artificial immune systems each exhibit the property of self-organisation in their own way, there are a number of other bio-inspired techniques which can be classified under this term and may provide yet further useful methods for the task of re-configuring an organism’s power bus. Ant colony optimisation (Dorigo et al., 1996) is one such example that stands out due to its successful application to a number of computer networking problems (Caro et al., 2005), the similarities of which to the task of re-configuring an organism’s power bus have already been identified. There are other approaches too, such as firefly synchronisation, that have previously been used in a robotic context (Christensen et al., 2009) and may be useful here. The homeostatic systems of Di Paolo (2000) and others may also be described as self-organising and have already been identified as well suited to the task of dynamic power management. Artificial chemistries, such as the fraglets system for which Schreckling and Marktscheffel (2010) constructed an artificial immune system, may also be well suited to the task. Much as the system of Di Paolo (2000) is driven by the desire to retain stability, Tschudin and Meyer (2009) introduce a reactive computing paradigm they refer to as ‘programming by equilibria’ that utilises the fraglets system and shows how a desired solution to a problem may emerge simply from the system’s desire to reach equilibrium. Many of these self-organising approaches may also be combined with other techniques such as artificial evolution and artificial immune systems to provide highly flexible systems, capable of adapting to changes in environmental conditions, both external and internal, in an efficient and fault tolerant manner.

4.2.4 Summary

The Symbricator power management system was purposefully designed to be as flexible as possible. Within an artificial robotic organism, modules may share energy uni- or bi-directionally with any of their connected neighbours, in doing so forming power buses that may be used to distribute energy throughout the organism to the places that need it most. These power buses are not fixed and at any moment in time may be adapted or re-configured to meet the current needs of the organism and its environment.

To handle this flexibility, a strategy for re-configuring the power bus of an organism is required that ensures both the most efficient distribution of power amongst, and the most efficient usage of energy within, an artificial robotic organism. In the design of such a strategy a number of things must be considered, including: the choice between a centralised or decentralised form of control, the identification of situations in which re-configuration would seem to be desirable, and for a particular moment in time, the identification of configurations that may be considered suitable or unsuitable.

Several possible approaches to the task of dynamically re-configuring an organism’s power bus may be identified, including: artificial development, evolutionary computation, artificial immune systems and other self-organising approaches. These approaches may be usable in isolation, but the best solution to ensuring the homeostasis of a robotic organism is likely to combine various aspects of multiple approaches.

4.3 Long-Term Aims

The main purpose of this section is to focus the discussions of the previous section. Based upon the topics of long-term autonomy, fault tolerance and artificial energy homeostasis, three long term aims of my research are outlined. Details of current progress towards these aims are included and, where relevant, any intermediate short term goals are documented. The three main aims are briefly outlined below and are elaborated upon in the following sections:

1. Adaptable Power Management in Organism Mode - The development of a method of dynamically controlling the configuration of an artificial robotic organism’s power bus. By taking into account the current state of the environment, the method should continuously ensure the most efficient usage and distribution of energy by the organism.

2. Fault Tolerance in Organism Mode - The development of a distributed method of fault tolerance that is heavily integrated with aim (1). The method should help ensure the continued operation of the organism, despite the presence of a variety of potential faults, from sensor and actuator faults to faults within the power management system.

3. Fault Tolerance in Individual and Swarm Modes - In two parts: firstly, to optimise the previously developed mDCA and B-cell algorithms and apply them to the task of anomaly detection within individual and swarm modes; secondly, to implement a version of the RDA for anomaly detection and to compare this algorithm with the previously developed mDCA and B-cell algorithms.

4.3.1 Adaptable Power Management in Organism Mode

The problem of adaptable power management in organism mode is exemplified by the task of dynamically re-configuring the power bus of an artificial robotic organism, as described in section 4.2. Some proposed solutions to this task were outlined in section 4.2.3, but before investigating these approaches further, a better understanding of the underlying dynamics of the Symbricator power management system must be gained. To better understand the system, further communication with Hamza Raja from Fraunhofer, the designer of the Symbricator power management hardware, will be necessary. Furthermore, the completion of ongoing work into the development of a realistic simulation of the power management system will be crucial; this forms the initial short-term goal of this task.

The simulation of the power management system will include realistic battery charge and discharge curves and be capable of simulating the flow of current between multiple connected modules. High level controls, such as the ability to turn energy sharing on and off at each of an individual module’s four docking connectors, will be implemented, allowing configurations such as that shown in figure 4.3 to be simulated. It will also be possible to apply different loads to different modules and to observe the effects this has on the flow of current through the system. Furthermore, to allow for the creation of more realistic data, the power management simulation may be integrated with the existing Robot3D simulator. In this simulator, the movement of a whole robotic organism may be simulated and, depending on the role of individual modules within the organism, different loads may be applied to the simulation of the power management system.
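The high level control described above, switching energy sharing on and off at each docking connector, can be illustrated with a small graph sketch. This is only an illustration under stated assumptions and not the proposed simulator: the function name `power_buses`, the connector indices 0 to 3, and the rule that a link carries energy only when sharing is enabled at both of its connectors are all hypothetical. Battery curves, current flow and uni-directional sharing are deliberately left out.

```python
from collections import defaultdict


def power_buses(modules, links, sharing_on):
    """Group modules into power buses (connected components).

    modules    -- iterable of module names
    links      -- list of ((module_a, connector_a), (module_b, connector_b))
                  pairs describing physically docked connectors
    sharing_on -- set of (module, connector) pairs with sharing enabled

    Assumption: a docked link carries energy only if sharing is
    enabled at both ends of the link.
    """
    adjacency = defaultdict(set)
    for end_a, end_b in links:
        if end_a in sharing_on and end_b in sharing_on:
            adjacency[end_a[0]].add(end_b[0])
            adjacency[end_b[0]].add(end_a[0])

    buses, seen = [], set()
    for start in modules:
        if start in seen:
            continue
        # Depth-first search collects every module reachable from start.
        component, stack = set(), [start]
        while stack:
            module = stack.pop()
            if module in component:
                continue
            component.add(module)
            stack.extend(adjacency[module] - component)
        seen |= component
        buses.append(sorted(component))
    return buses


# Three modules docked in a line; module C has sharing switched off.
modules = ["A", "B", "C"]
links = [(("A", 0), ("B", 2)), (("B", 0), ("C", 2))]
sharing_on = {("A", 0), ("B", 2), ("B", 0)}
print(power_buses(modules, links, sharing_on))  # → [['A', 'B'], ['C']]
```

Representing the bus as a set of enabled connectors also makes the link to the graph topology view of section 4.2.3 explicit: re-configuring the power bus amounts to changing which connectors appear in `sharing_on`.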


4.3.2 Fault Tolerance in Organism Mode

By ensuring that the process of re-configuring an organism’s power bus, and the distribution and usage of energy within an organism, is not only efficient but also fault tolerant, this aim will be integral to the development of an adaptable power management system. Beyond the task of power management, however, tolerance to the failure of sensors or actuators which may affect higher level behavioural controllers must also be provided. This is likely to take the form of a distributed, immune-inspired anomaly detection system, but a suitable source of inspiration is yet to be settled upon and further investigation is needed. It is known that, as was highlighted in the previous section, properties such as prediction and distributed detection will be highly desirable in such an anomaly detection system.

4.3.3 Fault Tolerance in Individual and Swarm Modes

Existing work towards this goal is documented in section 5.2. This aim will begin with the optimisation of two existing AIS algorithms: the modified Dendritic Cell Algorithm (mDCA) of Mokhtar et al. (2009) and an instance based B-cell algorithm that extends the mDCA into a combined innate and adaptive inspired system. These algorithms will be applied to the task of anomaly detection in individual and swarm modes, detecting, for example, faults within a robot’s infra-red sensors. In addition to the mDCA and B-cell algorithms, an implementation of the T-cell inspired anomaly detection algorithm of Owens et al. (2009), now known as the Receptor Density Algorithm (RDA), will also be constructed. Because the RDA performs a very similar function to the mDCA and B-cell algorithms, it may be possible to make comparisons between the various approaches. Furthermore, in line with the AIS framework presented in section 3.3, it may be possible to integrate the approaches into a single system.

4.3.4 Summary

Three long term aims of my research have been identified: (1) the development of an adaptable power management system for artificial robotic organisms; (2) the development of a distributed immune-inspired anomaly detection system; and (3) the continued development of the existing immune-inspired anomaly detection algorithms for operations in individual and swarm modes. Aims (1) and (2) will, necessarily, be heavily integrated with each other, and all three aims will need to be integrated with the other parts of the SYMBRION project.


5 Preliminary Work

In this section, some of the preliminary work that has been carried out during the past year, and is relevant to the three main aims identified in section 4.3, is documented. This work ranges from the development of simulation tools to the extension and adaptation of algorithms developed by Maizura Mokhtar, a previous member of the SYMBRION project. Much of this work is incomplete and ongoing; consequently, where appropriate, future plans are also outlined.

5.1 Simulation Tools

5.1.1 Basic Energy Sharing Within the Stage Simulator

To address all three of the aims introduced in section 4.3, an accurate model of a Symbricator style power management system will be essential. Due to the lack of such a model within the Robot3D simulator, the decision was made to extend the already existing power model of the Stage¹ robotic simulator. The extension provides robots with the ability to not only obtain energy from an external source, but also to share energy with each other, as is possible with the Symbricator platform. The purpose of this work was mainly to allow for the future development of any algorithms that require the energy sharing capabilities of Symbricator robots; however, to demonstrate the capabilities of the platform, a simple strategy for maintaining the energy homeostasis of a robotic swarm was also implemented.

Stage models of Symbricator style robots created by Wenguo Liu of the Bristol Robotic Lab at the University of the West of England² were used as a starting point. The models include accurate representations of: IR range finders, IR communication devices, docking devices, docking LEDs and light detectors. Using these models Wenguo implemented a distributed strategy for self-assembly, as a first step towards allowing robots to transition between swarm and organism modes. Wenguo’s models and strategy were adapted and extended to allow robots to dock not only with each other but also with external power sockets. When connected to a power socket, robots may recharge their own batteries, and when connected to another module, robots may provide energy for, or receive energy from, that module.

Figure 5.1a shows a screenshot of the extended Stage simulator; the blue cubes are extended versions of Wenguo’s Symbricator robot models and the orange box is a power socket. In the foreground two modules can be seen docked together sharing energy, and in the middle-ground a robot can be seen recharging itself at a power socket.

Energy Sharing Behavioural Controller

To demonstrate the new capabilities of the robots, a simple behavioural controller was developed. Specifically, the controller demonstrates the ability of robots to:

• dock with a power socket and recharge their own batteries

• dock with another robot and recharge that robot’s batteries

Figure 5.2 shows a state transition diagram for the controller. Robots perform simple obstacle avoidance with wandering and energy foraging. When an individual is in need of energy it may recharge itself at a power socket by following the states:

wander → approach → align → dock → recharge → undock → recover → wander

Alternatively, a robot in need of energy may stop moving, entering the wait state, and broadcast messages to the rest of the swarm notifying them of its needs. If a robot has enough energy and detects the presence of another robot in need, it may dock with that robot and recharge its batteries by following the states:

wander → approach → align → dock → provide → undock → recover → wander
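The two state sequences above can be sketched as a small transition function. This is a minimal illustration and not the controller implemented in Stage: the names `State`, `next_state` and `cycle`, and the boolean flag `docked_to_socket` used to choose between recharging and providing, are assumptions made for the example. The transitions out of the wait state are not given in the text, so they are left out.

```python
from enum import Enum


class State(Enum):
    WANDER = "wander"
    APPROACH = "approach"
    ALIGN = "align"
    DOCK = "dock"
    RECHARGE = "recharge"
    PROVIDE = "provide"
    UNDOCK = "undock"
    RECOVER = "recover"
    WAIT = "wait"  # listed for completeness; its transitions are not modelled


# Successors that do not depend on what the robot has docked with.
_LINEAR = {
    State.WANDER: State.APPROACH,
    State.APPROACH: State.ALIGN,
    State.ALIGN: State.DOCK,
    State.RECHARGE: State.UNDOCK,
    State.PROVIDE: State.UNDOCK,
    State.UNDOCK: State.RECOVER,
    State.RECOVER: State.WANDER,
}


def next_state(state, docked_to_socket):
    """Return the next state on the docking cycle.

    After DOCK the robot recharges itself if it docked with a power
    socket, and provides energy if it docked with another robot.
    Raises KeyError for WAIT, whose exits are not specified here.
    """
    if state is State.DOCK:
        return State.RECHARGE if docked_to_socket else State.PROVIDE
    return _LINEAR[state]


def cycle(docked_to_socket):
    """Walk one full cycle that starts and ends in WANDER."""
    path = [State.WANDER]
    while True:
        path.append(next_state(path[-1], docked_to_socket))
        if path[-1] is State.WANDER:
            return [s.value for s in path]


print(" -> ".join(cycle(docked_to_socket=True)))
print(" -> ".join(cycle(docked_to_socket=False)))
```

In a real controller each transition would be guarded by sensor conditions (for example, alignment achieved or battery full); the sketch only records the order of the states.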

Preliminary Experiments

To further demonstrate the capabilities of the system and assess the survivability of the approach, some informal preliminary experiments were carried out, varying the energy foraging strategy employed by the individual robots. Since they are only preliminary, no conclusions are drawn from these experiments; to do so would require repeated runs and more experimental rigour.

The experimental setup is shown in figure 5.1b. The environment includes a single power socket and eight individual robots. Each experimental run lasted one simulated hour and at each time step the state of charge of each robot was recorded. To investigate the effect of the robots’ strategy on the survivability of the system, the performance of an altered version of the controller described above was compared with the original controller. The altered controller is identical to the one described above except that robots will never wait to be recharged by their neighbours and will always attempt to fend for themselves.

¹ http://playerstage.sourceforge.net/index.php?src=stage [accessed 13 November 2010]
² http://www.brl.ac.uk/ [accessed 13 November 2010]


Figure 5.1: Figure (a) is a screenshot from the extended version of the Stage simulator that shows two robots sharing energy, whilst another robot recharges itself at a power socket. Figure (b) shows the experimental setup, a simple environment with a single power socket, eight robots and basic obstacles.

Figure 5.2: A state transition diagram of the behavioural controller developed to demonstrate the capabilities of the extensions to the Stage simulator.


[Plot: Average Energy Stored (0 to 1) against Time (0 to 60 minutes)]

Figure 5.3: Graph showing the change in the average state of charge of eight robots, repeated over five runs and utilising two different strategies. The first strategy (continuous red line) permits robots to share energy with each other. In the second strategy (dashed blue line and dotted black line), robots cannot share energy. The dashed blue line omits the result of one anomalous run, described in the text. The dotted black line includes the anomalous result.

Five separate runs were conducted for each of the original and altered controllers. The average state of charge of the robots (as a fraction of the maximum), over the five runs of both the original and altered controllers, is plotted in figure 5.3. The result of the original controller is represented by the continuous red line, whilst the result of the altered controller is represented by the dashed blue line. Early on in one of the runs of the altered controller, a robot ran out of power directly in front of the charger. With the other robots unable to recharge it, access to the power socket was blocked for the remainder of the experiment, leading to significantly worse performance for that run. The calculation of the average state of charge that is represented by the dashed blue line excludes the run in which a robot blocked the charger. With the anomalous result included, the average state of charge over all five runs that used the altered controller is shown by the dotted black line.

Though no formal conclusions can be drawn, some initial comments can be made based on observations. From figure 5.3 it is clear that the original controller, with energy sharing, maintains a higher average state of charge than the altered controller. Though not represented in figure 5.3, the main reason for the better result of the original controller is that when using the altered controller, in all runs, at least two robots completely ran out of energy. The original controller also appears to maintain a steadier average state of charge, whilst with the altered controller the average state of charge appears to decrease with time. Extrapolating figure 5.3 to longer time periods, if the same trends continue, it might be expected that all the robots using the altered controller would run out of energy and the average state of charge would decrease to zero. Meanwhile, the average state of charge of the robots that utilised energy sharing might be expected to remain constant. To confirm both these predictions, further experiments would be necessary. In general, what the results of this experiment do show is (a) the need for efficient power management strategies and (b) the need for fault tolerance so that the continued utilisation of energy sharing strategies can be ensured.

5.1.2 Power Management System

The model of power used in the Stage simulator is sufficient for experiments in swarm mode; however, for organism mode experiments, especially when considering aim (1), the task of dynamic re-configuration of an organism’s power bus, a more sophisticated model will be needed. The main reasons that the Stage model is not sufficient are that it only allows the transfer of energy between pairs of modules and that the only power-related information that can be obtained about the internal state of a robot is the state of charge of its battery. For experiments in organism mode, the ability to read the flow of current at various locations within the organism, as well as the state of charge of the individual modules, will be essential. The model will also need to take into account the current configuration of the power bus and the different loads being applied to individual modules at different moments in time.


The implementation of this model is in the very early stages of development; it is proposed that the circuit simulator software ngspice¹ may be used, and some initial work has shown this to be plausible. After a suitable model has been created it may be possible to integrate it with the Robot3D or Stage simulators so that more realistic data on the dynamics of the flow of current through the power bus may be gathered.

5.2 Fault Tolerance in Individual and Swarm Modes

5.2.1 Optimising the mDCA

Some work has begun on optimising the parameters of the mDCA algorithm using evolutionary computation techniques. Before optimisation began, however, some alterations were made to the original algorithm to fix some obvious deficiencies, improving the baseline performance of the algorithm in the process. These alterations are described first, followed by a brief description of some of the experiments carried out so far, with an example of one of the fittest individuals evolved so far. Finally, the plans for the future are outlined.

Improvements to the Baseline mDCA

The improvements to the mDCA concern the way in which the various signals (danger, safe and PAMP) that represent whether the system is in a normal or abnormal state are calculated, specifically for the case of detecting faults in IR range sensors. Originally, both the danger and safe signals were calculated in the same way, by taking the difference between past and present sensor values. For safe signals, which should represent normal behaviour, the assumption is that as a robot wanders in its environment it will encounter new obstacles and over time its sensor values will change. For danger signals, which represent abnormal behaviour, the assumption is that current sensor values should be similar to past sensor values, therefore any change is considered to be bad. Whilst these assumptions may be true, the method used to capture them is not sufficient. The sensor values of a robot that is wandering in an obstacle filled environment, but is not currently within sensor range of any obstacles, will remain the same, but using the original mDCA this is not considered a normal mode of operation. Due to the limited range of the sensors of Symbricator style robots, even in a densely populated environment, the period of time for which a robot is not in contact with any obstacles may be large. To solve this problem, the new safe signal is calculated by taking the difference between the current sensor value and the average of the sensor’s history, only when the sensor is within range of an obstacle. The assumption that the danger signal calculation was based upon is essentially that there should be a correlation between past and present sensor values; to better capture this assumption, the danger signal calculation was adapted so that it is now based upon the variance of current and previous sensor values, a large variance signalling abnormal behaviour.

An alteration was also made to the calculation of the PAMP signal. Another indicator of abnormal behaviour, the PAMP signal was previously calculated by comparing the value of a sensor with that of its neighbour, based on the assumption that neighbouring sensors should return similar values. This assumption is true, but the method used does not differentiate between a difference that results from a fault in the sensor itself and one that results from a fault in its neighbour. This led to a large number of false positives in the original algorithm, with sensors believing themselves to be faulty when there was actually a fault present in a neighbouring sensor. To solve this problem, in the new calculation of the PAMP signal a comparison is made with both of the sensor’s neighbours, significantly reducing the chance of false positives at the cost of an extra comparison.
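The revised signal calculations described above might be sketched as follows. This is a minimal illustration rather than the mDCA implementation: the function names, the use of the absolute difference and the population variance, the `tolerance` parameter, the arrangement of the sensors in a ring, and the choice to return `None` when a sensor is out of range are all assumptions made for the example.

```python
from statistics import mean, pvariance


def safe_signal(current, history, in_range):
    """Difference between the current value and the mean of the history.

    Only computed while the sensor is within range of an obstacle;
    returning None otherwise is an assumption of this sketch.
    """
    if not in_range or not history:
        return None
    return abs(current - mean(history))


def danger_signal(current, history):
    """Variance of the previous and current values.

    A large variance is taken to signal abnormal behaviour.
    """
    return pvariance(list(history) + [current])


def pamp_signal(values, index, tolerance):
    """True only if the sensor disagrees with BOTH of its neighbours.

    values    -- one reading per sensor, assumed to form a ring
    tolerance -- hypothetical threshold on the allowed difference
    """
    n = len(values)
    left = values[(index - 1) % n]
    right = values[(index + 1) % n]
    differs_left = abs(values[index] - left) > tolerance
    differs_right = abs(values[index] - right) > tolerance
    return differs_left and differs_right


# Sensor 2 is faulty; its neighbours each disagree with only one neighbour.
readings = [0.5, 0.5, 0.9, 0.5, 0.5, 0.5, 0.5, 0.5]
print([pamp_signal(readings, i, tolerance=0.2) for i in range(8)])
```

Comparing against both neighbours means that sensors 1 and 3 in the example are not flagged, which is the false positive case the alteration was intended to remove.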

Preliminary Experiments and Results

Some preliminary experiments have been carried out to further improve the performance of the mDCA algorithm. The experiments involve using a Genetic Algorithm (GA) to optimise the parameters of the mDCA algorithm in an off-line manner. The Stage simulator models introduced in section 5.1.1 were used with an environmental setup similar to that of figure 5.1b, but housing only a single robot. The behavioural controller introduced in section 5.1.1 was also used, but the runtime of each experiment was sufficiently short that the robot had enough energy to survive without having to recharge, essentially performing simple wandering with obstacle avoidance. The controller was extended to include the mDCA algorithm for monitoring the state of the robot’s eight IR sensor values; for each sensor, the algorithm outputs a value between 0 and 1, where ‘0’ indicates that an anomaly may be present and ‘1’ indicates no anomaly. In the experiments, when an anomaly was detected in an IR sensor, the response was to ignore that sensor and take the value of its nearest neighbour instead.

To optimise the parameters of the mDCA, a standard genetic algorithm based upon the GAlib² library was used. The GA was run for 100 generations with a population size of 50 and mutation and crossover rates of 0.3 and 0.5 respectively. To assess the fitness of the parameters, the robot was left to wander in the environment for 20 simulated minutes. For every run, after a period of 250 seconds in which no faults were simulated, three faults were injected at random times into one of the eight sensors, with a minimum duration of 50 seconds and a maximum duration of 500 seconds. Two different types of fault were simulated: one which randomly changed the value of the sensor at each timestep and one which set the value of the sensor to a fixed value for the duration of the fault. Fitness was calculated as the sum of the distance between the actual and ideal outputs of the algorithm at each timestep; the lower this value, the better.
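The two fault types and the fitness measure described above can be sketched as follows. This is an illustrative reading and not the code used in the experiments: the function names, the representation of faults as half-open `(start, end)` timestep windows, the assumption that sensor values lie in the range 0 to 1, and the assumption that the ideal output is 0 while a fault is active and 1 otherwise are all choices made for the example. GAlib itself is not used here.

```python
import random


def apply_fault(value, kind, rng, stuck_value=0.0):
    """Return the sensor reading after applying a simulated fault.

    kind == "random": replace the reading with a fresh random value
    kind == "stuck":  replace the reading with a fixed value
    any other kind:   leave the reading unchanged (no fault)
    """
    if kind == "random":
        return rng.random()
    if kind == "stuck":
        return stuck_value
    return value


def ideal_output(timestep, fault_windows):
    """Assumed ideal detector output: 0 during a fault, 1 otherwise."""
    faulty = any(start <= timestep < end for start, end in fault_windows)
    return 0.0 if faulty else 1.0


def fitness(actual_outputs, fault_windows):
    """Sum over timesteps of |actual - ideal|; lower is better."""
    return sum(
        abs(actual - ideal_output(t, fault_windows))
        for t, actual in enumerate(actual_outputs)
    )


# A detector that reacts one timestep late to a fault in [2, 4).
late = [1.0, 1.0, 1.0, 0.0, 1.0]
print(fitness(late, [(2, 4)]))  # → 1.0
```

A fitness of zero would correspond to a detector whose output matches the ideal output at every timestep, which is why slower reactions to a fault are penalised.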

¹ http://ngspice.sourceforge.net/ [accessed 13 November 2010]
² http://lancet.mit.edu/ga/ [accessed 14 November 2010]

Figure 5.4: The output of the mDCA algorithm. Figure (a) shows the output when using the baseline set of hand chosen parameters, whilst figure (b) shows the output when using an evolved set of parameters. The red rectangles show the times at which faults were injected into the corresponding sensors.

Figure 5.4 shows a comparison between the output of one of the fittest individuals evolved (5.4b) and the output produced by a robot using a baseline set of parameters chosen by hand (5.4a). As represented by the red areas in figure 5.4, in this example four faults were injected in total: two into ‘sensor 0’, one into ‘sensor 1’ and one into ‘sensor 7’. Both sets of parameters can be seen to correctly detect all four faults, outputting zero for the majority of the time in which the fault is active. The evolved solution can be seen to react faster to the introduction of the fault than the baseline; this can be explained by the fact that in all of the fittest individuals, one of the parameters, ‘history’, which determines over how many timesteps an intermediate output will be averaged to produce the actual output, was observed to be 1. Setting the ‘history’ parameter to 1 reduces the output of the algorithm to a binary decision and allows it to react faster than previously possible. For sensor 0 only, figure 5.5 shows the sensor values and the various signal outputs over the duration of the run; ‘output signal’ is the intermediate output of the algorithm that is thresholded and averaged to give ‘DangerSig’, the designated name for the final output of the algorithm.
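The effect of the ‘history’ parameter described above can be illustrated with a short sketch of the threshold-and-average step. This is an assumption-laden illustration and not the mDCA code: the function name `final_outputs`, the `threshold` parameter, and the convention that an intermediate value at or above the threshold counts as normal (1) are hypothetical.

```python
from collections import deque


def final_outputs(intermediate, threshold, history):
    """Threshold each intermediate value, then average the last
    `history` thresholded values to give the final output.

    Assumption: values at or above `threshold` are treated as
    normal (1.0) and values below it as anomalous (0.0).
    """
    window = deque(maxlen=history)  # keeps only the most recent values
    outputs = []
    for value in intermediate:
        window.append(1.0 if value >= threshold else 0.0)
        outputs.append(sum(window) / len(window))
    return outputs


signal = [0.9, 0.9, 0.1, 0.1]
print(final_outputs(signal, threshold=0.5, history=1))  # → [1.0, 1.0, 0.0, 0.0]
print(final_outputs(signal, threshold=0.5, history=2))  # → [1.0, 1.0, 0.5, 0.0]
```

With a history of 1 the final output is simply the thresholded value, a binary decision that switches as soon as the intermediate output crosses the threshold. With a longer history the output passes through intermediate values and so reaches zero later.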

Future Experiments

Currently, the calculation of the danger signal is tailored towards the detection of faults that cause a sensor to return random values, and the calculation of the safe signal is tailored towards faults which cause the sensor to remain stuck at a particular value. It will be interesting to see how, in future experiments, the algorithm handles different types of fault, for example, faults that introduce a bias to the sensor. Other future work will focus on on-line evolution and the use of multiple robots completing a harder task, as well as the application of the algorithm to faults within different components. The envisaged future task will include the requirement for robots to recharge both themselves and their neighbours; for this purpose, the setup pictured in figure 5.1b may provide a starting point. Using a different task opens up the possibility of trying different fitness functions that are based on the success of the system at completing the task, rather than the effectiveness of the algorithm itself; such an approach will be essential when moving towards on-line evolution. To further aid on-line evolution and to better integrate the algorithm with the rest of the SYMBRION project, a genetic framework developed by one of the other project partners shall also be utilised.

5.3 Summary

Extending the work of Wenguo Liu, models of Symbricator style robots have been developed within the Stage robotic simulator that allow robots to transfer energy between each other and to re-charge themselves at power sockets. A basic behavioural controller has been developed to demonstrate the capabilities of the Stage simulator, which may be used as a test-bed for the development of future algorithms required to operate in individual and swarm modes. A more realistic model of the Symbricator power management system is also under development which, when completed, may be integrated with the Robot3D or Stage simulators to accurately simulate the flow of current amongst modules of an artificial robotic organism.

Improvements to the existing mDCA algorithm have been made and initial experiments towards optimising the parameters of the algorithm have begun. The preliminary experiments show an improvement in performance over hand chosen parameters, but further investigation is required. In particular, future work will focus on on-line evolution, the use of multiple robots, the investigation of different types of faults, and integration with the existing SYMBRION genetic framework.


Figure 5.5: Expanded output of the mDCA algorithm for ‘sensor 0’, during the same runs shown in figure 5.4 and for the same baseline (a) and evolved (b) sets of parameters.


6 Summary

The Symbricator robotic platform is an example of a collective robotic system that combines aspects of both swarm and self-reconfigurable modular robotic systems. The three heterogeneous forms of Symbricator robot may operate alone as individuals, collaboratively as members of a swarm or collectively and in close harmony as members of an artificial robotic organism. Though each of the three different types of robot has different specialities, they all share a common docking interface which allows them to combine to form robotic organisms and, in doing so, share power and computational resources.

The future potential applications of collective robotic systems such as Symbricator include: search and rescue operations, space exploration, surveying hostile environments and the clean up of hazardous waste. All of these proposed applications will require collective robotic systems to exhibit the property of long-term autonomy, operating for long periods of time in potentially hostile environments, with little or no human interaction. Two properties are identified as essential to the provision of long-term autonomy: fault tolerance and power management. To provide fault tolerance and efficient power management it is suggested that inspiration may be taken from the principles of biological homeostasis and, by association, the processes of the natural immune system that help to maintain homeostasis.

Towards the aim of providing long-term autonomy to a collective robotic system, in chapter 2 this report began by introducing some of the necessary background information on the topics of robot reliability, fault tolerance and the Symbricator platform itself. In chapter 3 an extensive review of homeostasis and the natural immune system was provided, alongside details of previous work that has been inspired either at a low level by specific biological processes or at a higher level by general homeostatic or immune properties. Current progress towards an artificial immune system framework for the SYMBRION project was also recounted at the end of chapter 3. In chapter 4, some early thoughts on how immune- or homeostatically-inspired long-term autonomy may be provided to a collective robotic system were outlined. The specific task of dynamically re-configuring the power bus of an artificial robotic organism was identified as a particularly important method of ensuring the long-term autonomy of a collective robotic system such as Symbricator. At the end of chapter 4 three long term aims of my research were identified:

1. Adaptable Power Management in Organism Mode

2. Fault Tolerance in Organism Mode

3. Fault Tolerance in Individual and Swarm Modes

Finally, in chapter 5 some preliminary work that has been carried out, including the development of various simulation tools and initial experiments in optimising the parameters of a previous immune inspired algorithm, was documented.

The three aims identified in chapter 4 and the two grand challenges of the SYMBRION and REPLICATOR projects introduced in chapter 2 will serve as the main driving force going forward. Short term goals include the development of a more realistic model of the Symbricator organism level power management system and, through the use of standard evolutionary computation techniques, the continued optimisation of existing parts of the SYMBRION AIS framework. Longer term goals, focusing specifically on aims (1) and (2), will look towards the development of a distributed AIS for the provision of fault tolerance in organism mode, heavily interlinked with the provision of artificial energy homeostasis, that is, the efficient usage and distribution of energy within an artificial robotic organism.


