
Representation Learning for Mobile Robots in Dynamic Environments

Olivia Michael

Supervised by A/Prof. Oliver Obst

Western Sydney University

Vacation Research Scholarships are funded jointly by the Department of Education and Training and the Australian Mathematical Sciences Institute.


1 Abstract

This project uses neural networks and explores different network architectures with data from an abstract robot simulation, the RoboCup simulation league, in which two teams of 11 robots play soccer against each other. In this simulation, robots are part of a dynamic environment where they "see" a part of the soccer field, with a number of landmarks, and possibly the ball and other robots. For close-by robots, this also includes information about team and uniform number; for robots that are further away, this information is missing. We investigate different neural network architectures, with the goal of finding out how to best learn a representation that allows for an accurate reconstruction of the current state of the environment.


Contents

1 Abstract
2 Introduction
3 Neural Network
4 RoboCup
5 Architecture of the Neural Networks
6 Results
7 Conclusion and Future Work
8 Acknowledgments


2 Introduction

Mobile robots that move around in a known environment together with a team of other robots have two very fundamental problems to solve: (1) Where am I?, and (2) Where is everyone else? The problem of a robot estimating its own location and orientation, using data from sensors like cameras, laser range finders, or sonars, is called "self-localisation". Known positions of fixed objects that are detected by the robot's sensors can provide the necessary information to calculate the current location and orientation, provided enough of these landmarks are in the robot's current "field of view" and the sensor data are accurate enough. In situations with limited information, however, it may not be possible for the robot to correctly localise just from its current view of the world. In this case, statistical approaches can be used to estimate and filter out the most likely current position in the environment.

Robots that cooperate in a team face the additional problem of keeping track of everyone else in the team, in particular with only partial perception of the environment and unreliable information. An accurate model of the current state of the world is beneficial both for planning actions and for predicting what the state of the world might look like in the near future. Approaches for self-localisation and maintaining a world model usually require a choice of representation for objects in the environment, a sensor model, or a selection of features that are deemed relevant for the problem, and as a result they depend on the quality of that representation, sensor model, and feature selection.

In this project, we instead investigate learning the best representation of current and past sensor data in order to maintain an up-to-date world model, using a neural network ("deep learning") based approach [7]. The project uses data from an abstract robot simulation, the RoboCup simulation league [3], where two teams of 11 robots play soccer against each other (see also Fig. 4). In this simulation, robots are part of a dynamic environment where they "see" a part of the soccer field, with a number of landmarks, and possibly the ball and other robots. For close-by robots, this also includes information about team and uniform number; for robots that are further away, this information is missing. We investigate different neural network architectures, with the goal of finding out how to best learn a representation that allows for an accurate reconstruction of the current state of the environment.

3 Neural Network

Humans store learned information in a highly interconnected network of neurons in the brain. Given a set of inputs and outputs, these neurons "learn" the relationship between them through the weights assigned to each input and the firing thresholds of the synapses [1]. Artificial neural networks are inspired by these operations of the brain. In our application, the network is a two-stage regression model (see Fig. 1) [5]; neural networks can equally be used for classification tasks.

\[ Z_m = \sigma(\alpha_{0m} + \alpha_m^{T} X), \qquad m = 1, \dots, M \tag{1} \]

\[ Y_k = \beta_{0k} + \beta_k^{T} Z, \qquad k = 1, \dots, K \tag{2} \]

where $Z = (Z_1, Z_2, \dots, Z_M)$ are the units in the hidden layer, $X$ is the vector of inputs, and $Y = (Y_1, Y_2, \dots, Y_K)$ is the output. Non-linear activation functions are an important component of neural networks, and the sigmoid function $\sigma(x)$ is a commonly used example of such a function (see Fig. 2 for its graph).


Multiple linear layers in a neural network can be reduced to one linear layer, so multiple layers only make sense when there are nonlinearities like $\sigma(x)$.
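To illustrate this point, the following short Python/numpy check (added here as an illustration; it is not part of the original experiments) shows that two stacked linear layers, with no activation function in between, collapse into a single linear map. Bias terms are omitted for brevity.

import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=3)        # an arbitrary input vector
W1 = rng.normal(size=(4, 3))  # first linear layer
W2 = rng.normal(size=(2, 4))  # second linear layer

two_layers = W2 @ (W1 @ x)    # two purely linear layers applied in sequence
one_layer = (W2 @ W1) @ x     # a single linear layer with the combined weight matrix

# Without a nonlinearity in between, both give the same result.
print(np.allclose(two_layers, one_layer))  # True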

The sigmoid function is defined as
\[ \sigma(x) = \frac{1}{1 + e^{-x}} \tag{3} \]

The $\alpha$ are the weights in the first layer and the $\beta$ are the weights in the second layer. They are initially chosen at random and then adjusted to minimise the error using the back-propagation method [4]. The root mean square error is used to measure the error over the $n$ training examples, comparing each computed output $\hat{Y}_i$ with the corresponding correct output $Y_i$:

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2} \tag{4} \]

Figure 1: Training a Neural Network [2]

The number of hidden layers, and the number of nodes in each hidden layer, must be specified in advance. In principle, adding more hidden layers and nodes lets the network fit the training data more accurately, but this increases the training time as well as the risk of overfitting: the network "memorises" the outputs rather than learning to compute them, which results in a low training error but a high testing error.
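As a concrete illustration of equations (1)-(3) and the error measure (4), the following Python sketch implements the forward pass of such a two-stage network together with the RMSE. The weights, layer sizes, and input values are arbitrary placeholders; in practice the weights would be adjusted by back-propagation, e.g. using a library.

import numpy as np

def sigmoid(x):
    """Sigmoid activation, equation (3)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, alpha0, alpha, beta0, beta):
    """Two-stage model of equations (1) and (2).

    X      : input vector of length p
    alpha0 : hidden-layer biases, length M
    alpha  : hidden-layer weights, shape (M, p)
    beta0  : output biases, length K
    beta   : output weights, shape (K, M)
    """
    Z = sigmoid(alpha0 + alpha @ X)   # hidden units Z_1..Z_M, equation (1)
    Y = beta0 + beta @ Z              # outputs Y_1..Y_K, equation (2)
    return Y

def rmse(Y_true, Y_pred):
    """Root mean square error, equation (4)."""
    return np.sqrt(np.mean((Y_true - Y_pred) ** 2))

# Tiny example with random weights (illustrative sizes only).
rng = np.random.default_rng(1)
p, M, K = 4, 5, 1                      # inputs, hidden units, outputs
X = rng.normal(size=p)
Y_pred = forward(X,
                 rng.normal(size=M), rng.normal(size=(M, p)),
                 rng.normal(size=K), rng.normal(size=(K, M)))
print(rmse(np.array([0.5]), Y_pred))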


Figure 2: Sigmoid Function [8]

4 RoboCup

RoboCup is an international competition whose goal is, by the year 2050, to develop a team of fully autonomous humanoid robots that can win against the human world soccer champion team. There are many leagues in RoboCup, including the 2D simulation league, which is the most practical one for evaluating various strategies, theories, or algorithms [9]. In the simulation, two teams of 11 virtual players compete at soccer. Each player is a separate computer program with its own environmental information that it can "see". The field contains stationary landmarks: 55 flags and 4 lines (Fig. 3). Each player has a camera sensor that relays what is in its field of view once every step cycle, i.e. once every 150 milliseconds, which the player can use to determine its location. Fig. 4 shows an example of the information relayed to the player: it contains the flag name, and the distance and angle of the flag from the player itself. We created code and tools to extract this data for the flags, together with the players' exact locations from the simulator, and convert it into the input format required by the neural networks [3].
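The sketch below illustrates this data-preparation step in Python. The exact message and log formats used in the project are not reproduced in this report, so the record layout, the flag names, and their coordinates are assumptions for illustration only.

# A minimal sketch of the data-preparation step described above. The flag
# names and coordinates are placeholders, not the real flag table from the
# soccer server.
import csv

FLAG_POSITIONS = {
    "f c":   (0.0, 0.0),
    "f l t": (-52.5, -34.0),
    "f r b": (52.5, 34.0),
}

def to_training_row(flag_name, distance, angle, true_x, true_y):
    """Build one (input, target) pair in the 4-input format of Section 5:
    flag x, flag y, distance and angle seen by the player, with the player's
    ground-truth position from the simulator as the target."""
    flag_x, flag_y = FLAG_POSITIONS[flag_name]
    return [flag_x, flag_y, distance, angle], [true_x, true_y]

# Example: the player sees flag "f c" at distance 14.2 and angle 20 degrees
# while the simulator reports its true position as (-10.3, 5.1).
inputs, target = to_training_row("f c", 14.2, 20.0, -10.3, 5.1)

with open("training_data.csv", "w", newline="") as f:
    csv.writer(f).writerow(inputs + target)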

5 Architecture of the Neural Networks

After trying many different architectures for the neural network, the best results were achieved using a network with 4 inputs: the X and Y coordinates of the flag location, and the angle and distance of the flag from the player. Two independent neural networks were created for the X- and Y-coordinates of the player (see Fig. 5 and Fig. 6, respectively). With this approach, all weights can be trained specifically for the purpose of estimating X or Y. Using these networks, the player is now aware of its location, so new neural networks were created that use these estimates to take advantage of a neural network's ability to make predictions.


Figure 3: Landmarks on the soccer field

The new networks have two inputs each: the previous and the current X coordinate of the player, used to predict its next X location (Fig. 7), with a similar network for the Y location (Fig. 8).
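The following sketch shows how the two kinds of networks described above could be set up. The report does not state which software framework was used, so the Keras-style code and the hidden-layer size of 8 are illustrative assumptions only.

# Illustrative setup of the two network types; framework choice and layer
# sizes are assumptions, not taken from the report.
import numpy as np
from tensorflow import keras

def make_localisation_net():
    """4 inputs (flag x, flag y, distance, angle) -> 1 output (player X or Y)."""
    return keras.Sequential([
        keras.Input(shape=(4,)),
        keras.layers.Dense(8, activation="sigmoid"),
        keras.layers.Dense(1),
    ])

def make_prediction_net():
    """2 inputs (previous and current coordinate) -> 1 output (next coordinate)."""
    return keras.Sequential([
        keras.Input(shape=(2,)),
        keras.layers.Dense(8, activation="sigmoid"),
        keras.layers.Dense(1),
    ])

x_net = make_localisation_net()          # estimates the player's X coordinate
x_net.compile(optimizer="sgd", loss="mse")

# inputs: rows of [flag_x, flag_y, distance, angle]; targets: true player X.
inputs = np.array([[0.0, 0.0, 14.2, 20.0]])
targets = np.array([[-10.3]])
x_net.fit(inputs, targets, epochs=1, verbose=0)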

6 Results

Fig. 9 shows an example of the results: the yellow circle represents the actual location, while the other circles represent the predictions based on individual flags, with the purple circle being the worst prediction. By averaging these per-flag results we obtained a much more accurate estimate, as shown in Fig. 10.
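The following small Python example illustrates this averaging step; the position estimates are made-up numbers, not values from the experiments reported here.

import numpy as np

true_xy = np.array([-10.3, 5.1])          # player's actual position (made up)

# Hypothetical per-flag position estimates produced by the X- and Y-networks,
# one row per visible flag.
per_flag_estimates = np.array([
    [-8.9, 6.0],
    [-11.8, 4.0],
    [-9.7, 5.9],
    [-12.1, 4.6],
])

averaged = per_flag_estimates.mean(axis=0)

def rmse(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

print("worst single-flag error:", max(rmse(e, true_xy) for e in per_flag_estimates))
print("averaged-estimate error:", rmse(averaged, true_xy))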

The neural networks had a large RMSE for the X and Y locations, as shown in the results table below. Averaging reduced the error, but it is still large compared to the results reported in [6] using the triangulation method, which had an average error of 0.13. The prediction error using the estimated locations was 0.31 (Fig. 11). To estimate a lower bound for the prediction error of our method, we also used the ground truth of previous player locations as an input (Fig. 13). This resulted in a very low error of 0.10 for predicting the next position of the player (Fig. 12). This error is lower than the triangulation error for estimating the current position. It is also lower than the error obtained by simply assuming that the player remains stationary, which confirms that the neural network makes a genuinely useful prediction.


Figure 4: Data received according to field of view

Results               Error
X NN                  2.37
Y NN                  2.27
Average               0.85
Prediction Est NN     0.31
Prediction Act NN     0.10
Triangulation [6]     0.13

7 Conclusion and Future Work

Even though the neural networks were not able to determine the current location of the player more accurately than triangulation, we were able to predict the next position of the robot with a small error. There is a large scope for future work within RoboCup, and for applications in robotics in general: for example, we could explore additional inputs for the neural network that might improve localisation, such as speed and energy. We could also predict the locations of other players or the ball, or use the neural network to predict more than one future move (Fig. 14) and integrate this information into the player's game strategy. Outside RoboCup, neural networks are becoming very popular because of their ability to make predictions, and they are being integrated into many areas today. In this project the neural networks were able to predict the next position of the robot; the same method can be applied in other areas, e.g., to predict the location of aircraft [1] or in an autonomous car.

8 Acknowledgments

I would like to thank AMSI for this opportunity and the funding, as well as my supervisor Oliver Obst for his guidance, time, and effort.


Figure 5: X Neural Network

Figure 6: Y Neural Network


Figure 7: X Prediction Neural Network

Figure 8: Y Prediction Neural Network


Figure 9: Prediction vs. Actual

Figure 10: Prediction and Average vs. Actual


Figure 11: Prediction A vs. Actual

Figure 12: Prediction E vs. Actual


Figure 13: Prediction E vs. Prediction A vs. Actual

References

[1] A. Doshi. Aircraft Position Prediction Using Neural Networks. Massachusetts Institute of Technology, 2005.

[2] A. Oustimov and V. Vu. Artificial neural networks in the cancer genomics frontier. Translational Cancer Research, 3, 2014.

[3] M. Chen, E. Foroughi, F. Heintz, Z. Huang, S. Kapetanakis, K. Kostiadis, J. Kummeneje, I. Noda, O. Obst, P. Riley, T. Steffens, Y. Wang, and X. Yin. RoboCup Soccer Server, July 2002.

[4] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.

[5] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. 2008.

[6] J. Bach and M. Gollin. Self-localisation revisited. In RoboCup 2001: Robot Soccer World Cup V, pages 251-256, 2002.

[7] J. Schmidhuber. Deep learning in neural networks: An overview. CoRR, 2014.

[8] V. Morello, E. D. Barr, M. Bailes, C. M. Flynn, E. F. Keane, and W. van Straten. SPINN: a straightforward machine learning solution to the pulsar candidate selection problem. Monthly Notices of the Royal Astronomical Society, 443(2), 2014.


Figure 14: Predicting the next locations

[9] D. Nardi, I. Noda, F. Ribeiro, P. Stone, O. von Stryk, and M. Veloso. RoboCup soccer leagues.
