F. Sandoval et al. (Eds.): IWANN 2007, LNCS 4507, pp. 170–177, 2007. © Springer-Verlag Berlin Heidelberg 2007

Surface Modelling with Radial Basis Functions Neural Networks Using Virtual Environments

Miguel Ángel López 1, Héctor Pomares 1, Miguel Damas 1, Antonio Díaz-Estrella 2, Alberto Prieto 1, Francisco Pelayo 1, and Eva María de la Plaza Hernández 1

1 Department of Computer Architecture and Computer Technology, University of Granada {malopez,hpomares,mdamas}@atc.ugr.es

2 Department of Electronic Technology, University of Málaga [email protected]

Abstract. The modelling capabilities of Radial Basis Function Neural Networks (RBFNNs) depend strongly on four main factors: the number of neurons, the central location of each neuron, their associated weights and their widths (radii). In order to model surfaces defined, for example, as y = f(x,z), it is common to use three-dimensional Gaussian functions with centres in the (X,Z) domain. In this scenario, it is very useful to have visual environments where the user can interact with every radial basis function, modifying, inserting and removing them, thus visually attaining an initial configuration as similar as possible to the surface to be approximated. In this way, the user (the novice researcher) can learn how each factor affects the approximation capability of the network, thus gaining important knowledge about how the algorithms proposed in the literature tend to improve the approximation accuracy. This paper presents a didactic tool we have developed to facilitate the understanding of surface modelling concepts with ANNs in general and with RBFNNs in particular, with the aid of a virtual environment.

1 Introduction

Artificial Neural Networks (ANNs) and, in particular, Radial Basis Function Neural Networks (RBFNNs) have been extensively used in the literature for both function approximation (regression) and classification problems. As is well known, the optimization of an RBFNN requires determining the number and location of every RBF comprising the net. Since this is a non-linear optimization problem, one common method consists of assigning an RBF to every data sample and then removing the less important RBFs using orthogonal least squares (OLS) [1]; another consists of fixing the number of RBFs a priori and initializing their centres with a clustering algorithm [2,3]. The output weights are normally given random initial values since the output of the network depends linearly on them, so they are usually optimized with a linear learning algorithm (their initial value is, in principle, of no importance). As for the radii of the RBFs, the final performance of the net depends strongly on their initial values, since a local minimization procedure is usually conducted from this initial point in order to find the nearest local minimum.

Since input/output mappings of the type y = f(x,z) can be represented in three-dimensional space, and since RBF neurons have a tangible physical nature, the use of virtual environments to visualize and interact with the network and its output is very valuable for gaining insight into the advantages of a suitable initial configuration of the RBFNN parameters. The Internet offers standard tools such as VRML [4] that are very appropriate for building such virtual environments.

The system developed in this work offers the user a tool intended to facilitate the teaching of concepts related to RBFNNs by means of virtual reality environments. In particular, this tool allows the user to define the four key factors in the design of an RBFNN, and lets him or her interact with the output surface and experience the consequences of a suitable parameter adjustment.

This paper is organized as follows: Section 2 presents the four main factors in the specification of an RBFNN, explains the difficulties that arise when giving them initial values and justifies the use of virtual environments. Section 3 describes the system we have implemented for RBFNN teaching using virtual environments. Next, in Section 4, we show some real application examples of the proposed tool. Finally, some conclusions are drawn in Section 5, together with some future work related to this paper.

2 Factors Involved in RBFNN Design

In this work we will use a Cartesian coordinate system XYZ in which the XZ plane is on the paper plane and the Y axis grows upwards. Therefore, our surfaces to be modelled will be of the form y=f(x,z). When referring to the output surfaces, we will use the following abbreviations: SOb will be the surface we want to model (our Objective); SNet the output surface generated by the RBFNN; and SErr the Error Surface, computed for every data point using Eq. (1).

SErr = SNet - SOb (1)
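
To make Eq. (1) concrete, the following sketch, written in Java (the language of the applet described in Sect. 3), evaluates the output surface of a Gaussian RBFNN on a regular grid of the XZ domain and subtracts the objective surface to obtain SErr. All names are hypothetical and the Gaussian parameterisation exp(−d²/(2r²)) is an assumption of ours; this is not code from the actual tool.

  // Hypothetical helper illustrating Eq. (1); not code from the actual applet.
  public final class RbfSurface {

      // Output of a Gaussian RBFNN at point (x, z), assuming the common
      // parameterisation y = sum_i w[i] * exp(-((x-cx[i])^2+(z-cz[i])^2)/(2*r[i]^2)).
      public static double net(double x, double z,
                               double[] cx, double[] cz, double[] w, double[] r) {
          double y = 0.0;
          for (int i = 0; i < w.length; i++) {
              double dx = x - cx[i];
              double dz = z - cz[i];
              y += w[i] * Math.exp(-(dx * dx + dz * dz) / (2.0 * r[i] * r[i]));
          }
          return y;
      }

      // Error surface of Eq. (1): SErr = SNet - SOb, sampled on an n x n grid
      // whose points coincide with integer (x, z) coordinates.
      public static double[][] errorSurface(double[][] sOb,
                                            double[] cx, double[] cz,
                                            double[] w, double[] r) {
          int n = sOb.length;
          double[][] sErr = new double[n][n];
          for (int ix = 0; ix < n; ix++)
              for (int iz = 0; iz < n; iz++)
                  sErr[ix][iz] = net(ix, iz, cx, cz, w, r) - sOb[ix][iz];
          return sErr;
      }
  }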

This section reviews some existing methods to adjust the four main factors of an RBFNN together with some examples that justify the importance of a good initial estimation for them. In every case, we assess the possibility of making the adjustment using virtual environments.

2.1 The Number of Neurons in the Net

In the literature, the number of neurons of the RBFNN has been given varying degrees of importance. In some cases, this number is fixed a priori, generally by trial and error, and is not included in the presented methodologies, probably because of the high computational cost. In other cases, the number of neurons is increased until a given approximation error limit is reached. Generally speaking, fixing the number of neurons without taking into account the SOb information, i.e. the surface to be approximated, may be inefficient. For instance, Fig. 1 shows a log-like surface defined on the XZ domain which could easily be approximated by a single neuron with a visual tool. Nevertheless, some methods based on clustering algorithms would propose a number of neurons proportional to the area of the domain of the function, without taking into account the interpolation properties of the net. The analytic expression of the visually tuned output surface (SNet) using just one neuron is given in Eq. (2). In this case, a proper choice of the net to be used can be visually estimated by observing the error surface (SErr).

Fig. 1. Surfaces represented using matrices of size 64×64. (a) Left: SErr; right: SOb; back: SNet. (b) Left: SNet; right: SErr; back: SOb.

The selection of a minimal set of neurons not only compresses the information of the surface to be modelled into a minimal number of parameters, in this case representing a 64×64 data matrix by a vector of just four parameters: height (200), width (−6.75·10⁻⁴) and location (60, 60), but also minimizes the risk of over-fitting.

y = f(x, z) = 200 · e^(−6.75·10⁻⁴ · ((x − 60)² + (z − 60)²)) .  (2)
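
Written as code, Eq. (2) simply samples one Gaussian over the 64×64 grid. The fragment below (hypothetical, intended to live inside a method, using the same Java conventions as the sketch in Sect. 2) builds that SNet matrix.

  // Sample the single-neuron surface of Eq. (2) on a 64 x 64 grid.
  double[][] sNet = new double[64][64];
  for (int x = 0; x < 64; x++) {
      for (int z = 0; z < 64; z++) {
          double d2 = (x - 60) * (x - 60) + (z - 60) * (z - 60);
          sNet[x][z] = 200.0 * Math.exp(-6.75e-4 * d2);   // y = f(x,z) of Eq. (2)
      }
  }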

2.2 Radii of the Chosen Radial Basis Functions

The computation of optimal values for the radii or widths of the RBFs is a problem that has not yet been given a definitive answer. One interesting approach, given in [5], consists of using the same radius for all the RBFs in the network. This approach has the advantage of dramatically reducing the number of non-linear and difficult-to-initialize parameters of the net, but it obviously also reduces the approximation capability of every RBF. In this case, [5] proposes that this "shared" radius be given the value:

σ = dmax / √(2m) .  (3)

where dmax is the maximum distance between RBF centres in the net, and m is the total number of RBFs. With Eq. (3), we can be sure that no neuron is either too narrow or too wide; nevertheless, this value can be far from a suitable one, since it does not take into account the function to be modelled.
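
Eq. (3) can be computed directly from the centre coordinates. The sketch below (a hypothetical helper, same Java conventions as before) does so for centres placed in the XZ plane.

  // Shared radius of Eq. (3): sigma = dMax / sqrt(2 * m), where dMax is the
  // largest distance between any two centres and m is their number.
  static double sharedRadius(double[] cx, double[] cz) {
      int m = cx.length;
      double dMax = 0.0;
      for (int i = 0; i < m; i++) {
          for (int j = i + 1; j < m; j++) {
              double dx = cx[i] - cx[j];
              double dz = cz[i] - cz[j];
              dMax = Math.max(dMax, Math.sqrt(dx * dx + dz * dz));
          }
      }
      return dMax / Math.sqrt(2.0 * m);
  }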

The other main option consists of using an independent radius for every radial basis function. In this case the number of parameters is larger and the net is much more difficult to optimize; in return, fewer RBFs are expected to be necessary for a specific approximation problem. As an example of this approach, Moody and Darken [11] suggested an initialization of the RBF widths given by the k-nearest-neighbour heuristic. Other approaches can also be found in [10], [12].
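
One common reading of the heuristic of Moody and Darken sets each width to the mean distance from a centre to its k nearest neighbouring centres; the exact formulation varies among authors, so the sketch below is only one possible interpretation, with hypothetical names.

  // Width of each RBF = mean distance to its k nearest neighbouring centres
  // (one common formulation of the k-nearest-neighbour heuristic).
  static double[] knnWidths(double[] cx, double[] cz, int k) {
      int m = cx.length;
      double[] radii = new double[m];
      for (int i = 0; i < m; i++) {
          double[] dist = new double[m - 1];
          int idx = 0;
          for (int j = 0; j < m; j++) {
              if (j == i) continue;
              double dx = cx[i] - cx[j];
              double dz = cz[i] - cz[j];
              dist[idx++] = Math.sqrt(dx * dx + dz * dz);
          }
          java.util.Arrays.sort(dist);
          int kk = Math.min(k, dist.length);
          if (kk == 0) { radii[i] = 1.0; continue; }   // single-neuron net: no neighbours
          double sum = 0.0;
          for (int n = 0; n < kk; n++) sum += dist[n];
          radii[i] = sum / kk;
      }
      return radii;
  }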

Although the adjustment of the radii of the RBFs may seem very difficult for the researcher, the use of virtual environments to experiment with different methodologies facilitates the understanding of the problem and the interpretation of the results. In this respect, the exercise of adjusting the RBF widths based on visual criteria and then checking whether those initial values agree with the expressions given by other authors has great pedagogical value.

2.3 Initial Choice of the RBF Weights

In ANNs in general, one common method for the initial choice of parameters is to use random numbers. In the case of the weights associated with every RBF, one can easily give an initial guess using the average of the outputs of the data points belonging to the RBF, i.e. those data points which activate the RBF above a threshold value. When using local minimization procedures or more sophisticated methods such as genetic algorithms, these initial guesses are important since they make the convergence process faster. Nevertheless, as is well known, the output of the RBFNN has a linear dependency on the output weights, so a simple linear equation-solving procedure is sufficient to find the (unique) optimal weights. In the RBFNN literature, the Cholesky algorithm is commonly used when the number of data points is enough to ensure the non-singularity of the activation matrix. Also very common are the singular value decomposition (SVD) method and the orthogonal least squares (OLS) technique, which are capable of giving the correct answer even in the case of singular matrices.
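
Because the output depends linearly on the weights, fitting them reduces to a linear least-squares problem. As a toy illustration (not the robust SVD or OLS routines mentioned above, and with hypothetical names), the sketch below forms the normal equations from the activation matrix and solves them with a Cholesky factorisation; it assumes the activation matrix has full column rank.

  // Solve for the output weights w in Phi * w ~ y (least squares) via the
  // normal equations (Phi^T Phi) w = Phi^T y and a Cholesky factorisation.
  static double[] solveWeights(double[][] phi, double[] y) {
      int n = phi.length;      // number of data points
      int m = phi[0].length;   // number of RBFs
      double[][] a = new double[m][m];
      double[] b = new double[m];
      for (int i = 0; i < m; i++) {
          for (int j = 0; j < m; j++)
              for (int p = 0; p < n; p++) a[i][j] += phi[p][i] * phi[p][j];
          for (int p = 0; p < n; p++) b[i] += phi[p][i] * y[p];
      }
      // Cholesky factorisation a = L * L^T.
      double[][] l = new double[m][m];
      for (int i = 0; i < m; i++) {
          for (int j = 0; j <= i; j++) {
              double s = a[i][j];
              for (int p = 0; p < j; p++) s -= l[i][p] * l[j][p];
              l[i][j] = (i == j) ? Math.sqrt(s) : s / l[j][j];
          }
      }
      // Forward substitution L z = b, then back substitution L^T w = z.
      double[] z = new double[m];
      for (int i = 0; i < m; i++) {
          double s = b[i];
          for (int p = 0; p < i; p++) s -= l[i][p] * z[p];
          z[i] = s / l[i][i];
      }
      double[] w = new double[m];
      for (int i = m - 1; i >= 0; i--) {
          double s = z[i];
          for (int p = i + 1; p < m; p++) s -= l[p][i] * w[p];
          w[i] = s / l[i][i];
      }
      return w;
  }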

Even in these cases, the use of an interactive virtual environment can help the future researcher (and now doctoral student) to understand how different values of the RBF weights may affect the output surface, and allow him/her to experiment with them.

Fig. 2 shows a SINC-type objective surface (SOb). With the aid of a virtual environment and by mere imitation, the user can approximate the SOb using, for example, 9 neurons, 4 of which have equal positive weights, one an outstanding positive value, and 4 equal negative weights. The user can check the approximation accuracy by inspecting the SErr (the flatter the better). This analysis offers the opportunity to exercise the concepts of centre location, initial weights and the representation of a surface as a linear combination of radial basis functions.

2.4 Selection of the Centres of the Radial Basis Functions

The initial location of the centre of every RBF is fundamental for the final performance of the RBFNN as a modelling tool. There exist different techniques for this problem in the literature. The easiest way is to choose a random subset of the training data set [6], but this should be done only when these data points are distributed in a way that is representative of the given problem. A more interesting approach is the one proposed by Chen in 1991 [1]: starting from the complete training set as candidates, the OLS algorithm advances by selecting, one by one, those RBF centres that contribute most to the reduction of SErr. When this error falls below some threshold value, the algorithm stops, and thus the number and location of the RBFs comprising the network are estimated automatically.
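
The orthogonalisation details of OLS [1] do not fit in a short sketch, but the selection idea can be illustrated with a plain greedy loop: at each step, try each remaining candidate centre, refit the weights and keep the centre that most reduces the error, stopping when the error falls below a threshold. The code below is a simplified stand-in for OLS, not the algorithm of [1] itself; it assumes a shared radius and reuses the hypothetical solveWeights routine sketched in Sect. 2.3.

  // Simplified greedy forward selection of RBF centres: an illustrative
  // stand-in for the OLS procedure of [1], not the algorithm itself.
  // Candidate centres are the training inputs; solveWeights(...) is the
  // hypothetical least-squares helper sketched in Sect. 2.3.
  static java.util.List<Integer> selectCentres(double[] x, double[] z, double[] y,
                                               double radius, double errThreshold) {
      java.util.List<Integer> chosen = new java.util.ArrayList<>();
      java.util.List<Integer> candidates = new java.util.ArrayList<>();
      for (int i = 0; i < x.length; i++) candidates.add(i);
      double err = Double.MAX_VALUE;
      while (err > errThreshold && !candidates.isEmpty()) {
          int best = -1;
          double bestErr = err;
          for (int c : candidates) {                 // try every remaining candidate
              chosen.add(c);
              double[][] phi = buildPhi(x, z, chosen, radius);
              double[] w = solveWeights(phi, y);     // refit the output weights
              double e = sumSquaredError(phi, w, y);
              if (e < bestErr) { bestErr = e; best = c; }
              chosen.remove(chosen.size() - 1);
          }
          if (best < 0) break;                       // no candidate improves the error
          chosen.add(best);
          candidates.remove(Integer.valueOf(best));
          err = bestErr;
      }
      return chosen;
  }

  // Activation matrix: phi[p][i] = response of the i-th chosen RBF at data point p.
  static double[][] buildPhi(double[] x, double[] z,
                             java.util.List<Integer> centres, double r) {
      double[][] phi = new double[x.length][centres.size()];
      for (int p = 0; p < x.length; p++)
          for (int i = 0; i < centres.size(); i++) {
              double dx = x[p] - x[centres.get(i)];
              double dz = z[p] - z[centres.get(i)];
              phi[p][i] = Math.exp(-(dx * dx + dz * dz) / (2.0 * r * r));
          }
      return phi;
  }

  // Sum of squared residuals of the current model over the training data.
  static double sumSquaredError(double[][] phi, double[] w, double[] y) {
      double e = 0.0;
      for (int p = 0; p < phi.length; p++) {
          double out = 0.0;
          for (int i = 0; i < w.length; i++) out += phi[p][i] * w[i];
          e += (out - y[p]) * (out - y[p]);
      }
      return e;
  }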

When RBFNNs are used in classification problems, it is common to use clustering algorithms such as k-means or fuzzy c-means (FCM) [13]. These methods basically consist of finding a set of cluster centres such that the sum of some distance measure between the data points and the cluster centres they belong to (they can even belong to several clusters with different degrees of membership in the case of FCM) is minimal. Recently, some authors have adapted clustering techniques to function approximation problems using RBFNNs [7],[8]. These methods, which are based on the classic clustering algorithms for classification problems, try to extract information from the continuous output variable in order to locate the clusters. Some examples of these methods are CFA, ACE and CFC; comparisons among them can be found in [9].
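
As a point of reference for these clustering-based initializations, a minimal k-means (Lloyd) sketch over the (x, z) inputs could look as follows; names are hypothetical, and note that it ignores the output variable, which is precisely the limitation that CFA, ACE and CFC try to overcome.

  // Minimal k-means (Lloyd) initialisation of RBF centres on the (x, z) inputs.
  static double[][] kMeansCentres(double[] x, double[] z, int k, int iterations) {
      int n = x.length;
      double[][] c = new double[k][2];
      java.util.Random rnd = new java.util.Random(0);
      for (int j = 0; j < k; j++) {            // initialise on random data points
          int p = rnd.nextInt(n);
          c[j][0] = x[p];
          c[j][1] = z[p];
      }
      int[] assign = new int[n];
      for (int it = 0; it < iterations; it++) {
          for (int p = 0; p < n; p++) {        // assignment step
              double best = Double.MAX_VALUE;
              for (int j = 0; j < k; j++) {
                  double dx = x[p] - c[j][0];
                  double dz = z[p] - c[j][1];
                  double d = dx * dx + dz * dz;
                  if (d < best) { best = d; assign[p] = j; }
              }
          }
          double[][] sum = new double[k][2];
          int[] count = new int[k];
          for (int p = 0; p < n; p++) {        // update step
              sum[assign[p]][0] += x[p];
              sum[assign[p]][1] += z[p];
              count[assign[p]]++;
          }
          for (int j = 0; j < k; j++) {
              if (count[j] > 0) {
                  c[j][0] = sum[j][0] / count[j];
                  c[j][1] = sum[j][1] / count[j];
              }
          }
      }
      return c;
  }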

Again, all these techniques can be better understood using a virtual environment which lets the user execute them on-line and see how they evolve within a virtual world. In this way, the user can see how these techniques really work and gain the necessary expertise to refine and improve them.

Fig. 2. (a) Left: SOb; right: SNet; back: SErr. (b) Left: SOb; right: SNet; back: SErr. All surfaces are represented using 32×32 matrices.

3 The Virtual Environment

In order to work with the virtual environment, the following elements are needed: a web browser, an X3D world plug-in [4] and a Java applet developed for this system. The virtual environment is defined in a simple web page by means of the insertion of two tags:

<embed src="VRMLPFC.wrl" border=0 height="290" width="750">
<applet code="AppletPFC.class" name="AppletPC" height=900 width=750 mayscript>
</applet>

Fig. 3. Virtual environment, made up of the virtual world (top) and the control applet (bottom) from which we can interact with the virtual world

The first tag loads the virtual world, which is interpreted by a specific plug-in for the X3D standard; the virtual world is contained in a VRML-type file called VRMLPFC.wrl. The second tag loads the Java class AppletPFC.class, an applet for the control of and interaction with the virtual world.

The standard used to define the virtual world is VRML 2.0 (Virtual Reality Modelling Language), although it is now integrated into the X3D standard itself. The virtual world is made up of three nodes equal in size and shifted along the horizontal XZ plane (see Fig. 3). Each node is composed of a plane domain on which the surfaces SOb, SNet and SErr are represented.

The interaction with the virtual world is accomplished through an applet loaded in the web page. This applet captures all the events generated in the virtual world and executes the corresponding Java method. In addition, it can read from and write to the different nodes in order to extract numerical information from the virtual world. It also performs all the necessary learning computations and updates the virtual world. For all this, we have used a toolkit from the Web3D Consortium called Xj3D.
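
The computation the applet performs on each interaction can be summarised as follows: recompute the SNet heights over the grid and hand the flattened array back to the surface node in the virtual world. The sketch below (hypothetical names) shows only the grid recomputation for an ElevationGrid-style height field; the actual write into the node through the Xj3D interface is omitted.

  // Recompute the SNet surface after a parameter change and flatten it row by
  // row, the layout expected by an ElevationGrid-style height field.  Writing
  // the array back into the node through the Xj3D interface is not shown.
  static float[] recomputeHeights(int n, double[] cx, double[] cz,
                                  double[] w, double[] r) {
      float[] heights = new float[n * n];
      for (int iz = 0; iz < n; iz++) {
          for (int ix = 0; ix < n; ix++) {
              double y = 0.0;
              for (int i = 0; i < w.length; i++) {
                  double dx = ix - cx[i];
                  double dz = iz - cz[i];
                  y += w[i] * Math.exp(-(dx * dx + dz * dz) / (2.0 * r[i] * r[i]));
              }
              heights[iz * n + ix] = (float) y;
          }
      }
      return heights;
  }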

4 Some Experimental Results

The experiments carried out consist of showing a real surface to be modelled and letting the user make the initial configuration of the RBFNN, taking advantage of the facilities provided by the virtual environment; the flatter the error surface (SErr), the better. To this end, the user may apply one of the algorithms he or she is learning, or simply rely on visual intuition. With this process, the researcher gains practice and can understand the fundamental factors in the design of RBFNNs. The web browser used is Internet Explorer 6.0 and the plug-in is Cortona 5.1, developed by ParallelGraphics.

Two surface modelling tests have been carried out; in both cases we are dealing with real surfaces. The first experiment (see Fig. 4.a) uses a mountainous surface spanning 25 km² of the Sierra Nevada National Park, in Spain. The user, after analysing the SOb and interacting with the system, generated the SNet using 22 radial basis functions. In the second experiment (see Fig. 4.b) the SOb is the 3D representation of a grey-scale image with integer values in the range [0, 255], proportional to the luminosity. In this case, the final number of neurons configured by the user was 68. Comparing the results with the JPEG standard, we can observe that a JPEG image is not capable of an adequate representation of the pupil of the eye, due to the high luminosity contrast with the iris. For an RBFNN, however, this problem is very easy to solve by placing a neuron right at the centre of the iris.

Fig. 4. (a) Left: surface to be modelled (SOb); right: output of the net (SNet) with 22 neurons. (b) Top left: SOb; top right: JPEG with 50% compression; bottom left: JPEG with 75% compression; bottom right: SNet with 68 neurons.

5 Conclusions

In this work we have presented a didactic tool based on virtual reality environments which allows the user (the potential researcher) to adjust the four main factors of an RBFNN: the number of neurons, their central positions, the radii of the RBFs and the initial weights.

This tool is based on the virtual world modelling standard VRML and has been implemented with the help of the Xj3D toolkit provided by the Web3D Consortium. The system developed has been applied both to function approximation problems of the type y = f(x,z) and to grey-scale image modelling. The results obtained have been compared against those obtained with the JPEG graphic standard.

As future work we plan to develop new Java classes that implement the most important methods existing in the literature for the initial configuration of RBFNNs. For instance, we are planning to implement clustering algorithms such as CFA and to execute them on-line while the user observes and understands their evolution until final convergence.

References

1. Chen, S.: Orthogonal Least Squares Learning for Radial Basis Function Networks. IEEE Transactions on Neural Networks 2(2) (1991)

2. Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. Wiley, New York (1973)

3. González, J., Rojas, I., Pomares, H., Ortega, J., Prieto, A.: A New Clustering Technique for Function Approximation. IEEE Transactions on Neural Networks 13(1) (2002)

4. www.web3d.org

5. Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice Hall, Englewood Cliffs

6. Lowe, D.: Adaptive Radial Basis Function Nonlinearities, and the Problem of Generalisation. In: First IEEE International Conference on Artificial Neural Networks, London, pp. 171–175 (1989)

7. Karayiannis, N.B., Mi, G.W.: Growing Radial Basis Neural Networks: Merging Supervised and Unsupervised Learning with Network Growth Techniques. IEEE Trans. Neural Networks 8(6), 1492–1506 (1997)

8. Sutanto, E.L., Masson, J.D., Warwick, K.: Mean-Tracking Clustering Algorithm for Radial Basis Function Center Selection. Int. J. Contr. 67(6), 961–977 (1997)

9. González, J., Rojas, I., Pomares, H., Ortega, J.: Studying the Convergence of the CFA Algorithm. In: IWANN 2003. LNCS, vol. 2686, pp. 550–557 (2003)

10. Benoudjit, N., Archambeau, C., Lendasse, A., Lee, J., Verleysen, M.: Width optimization of the Gaussian kernels in Radial Basis Function Networks. In: Proceedings of ESANN, pp. 425–432 (April 2002)

11. Moody, J., Darken, C.J.: Fast Learning in Networks of Locally-Tuned Processing Units. Neural Computation 1, 281–294 (1989)

12. Rojas, I., Pomares, H., Bernier, J.L., Ortega, J., Pelayo, F., Prieto, A.: Time series analysis using normalized PG-RBF network with regression weights. Neurocomputing 42, 267–285 (2002)

13. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum, New York (1981)