Page 1: Random Walk Survey

Melih Sözdinler 2009800075

Survey: Random Walk

Melih Sözdinler

Computer Engineering, Bogazici University

Bebek, Istanbul 34342, Turkey

Abstract. Random walk is a concept used in many areas of computer science, and it is likely to contribute much more in the future. In this survey paper we cover the concept of the random walk, its application areas, and its possible future contributions. We also give some experimental details to show the nature of the random walk.

1 Introduction

In computer science, the basic approach to the random walk is one-dimensional (1D). In this approach, someone flips a coin or draws a random number R. In the coin example, if the coin comes up heads we do one thing, such as moving right; otherwise we do something else, such as moving left. Equivalently, given a constant C, we move right if R ≥ C and left otherwise. That is the concept of the random walk in 1D. As the dimension increases, the number of moving options grows as well: 2d on a simple d-dimensional lattice, and exponentially in d if diagonal moves are allowed.
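The threshold rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `random_walk_1d`, the unit step size, and the choice of drawing R uniformly from [0, 1) are ours and are not taken from the experiments in this paper.

```python
import random

def random_walk_1d(steps, threshold=0.5, start=0, seed=None):
    """Return the list of positions visited by a 1D random walk.

    At each step a random number R in [0, 1) is drawn (an assumption);
    the walker moves right (+1) if R >= threshold (the constant C in
    the text) and left (-1) otherwise.
    """
    rng = random.Random(seed)
    position = start
    path = [position]
    for _ in range(steps):
        r = rng.random()
        position += 1 if r >= threshold else -1
        path.append(position)
    return path

if __name__ == "__main__":
    # Ten steps of a fair walk (C = 0.5) starting from 0.
    print(random_walk_1d(10, seed=1))
```

With `threshold=0.5` the walk is a fair coin flip; other values of the threshold bias the walker to one side.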

In the literature, the random walk is used in many areas, from bioinformatics to complex systems. It looks basic, but its contributions are countless. The motivation behind the random walk is to create stochastic movements, or step-by-step hops. By its nature it is well suited to modelling decision problems, and because of the randomness one cannot say "I found a deterministic function for the random walk". Modelling the random walk is also well studied. The basic version depends on the probability distribution of a coin flip, simply heads or tails in 1D, but the walk can also take place in n dimensions with Gaussian-distributed steps.

In 1905 Pearson introduced the term "random walk". He was interested in describing the spatial and temporal evolution of mosquito populations invading cleared jungle regions, but his first findings implied that the problem is too complex to solve deterministically. Rayleigh had worked on the problem before Pearson and formulated the 1D random walk with a Gaussian distribution whose spread is σ = √n. Bachelier gave the probability distribution of walkers after n steps (discrete) or after time t (continuous) as

$$P_{n+1}(x) = \int_{-\infty}^{+\infty} P_1(x - x')\, P_n(x')\, dx' \quad \text{or} \quad P(x, t + \gamma) = \int P(x - x', \gamma)\, P(x', t)\, dx'.$$
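As a small numerical check of the discrete recursion, the sketch below replaces the integral by a sum over lattice points and applies the one-step distribution n times. It assumes a fair ±1 step, and the helper names (`convolve`, `distribution_after`, `std_dev`) are ours. Under that assumption, the spread after n steps should come out as √n.

```python
def step_distribution():
    # One fair coin-flip step: -1 or +1 with probability 1/2 each (assumption).
    return {-1: 0.5, 1: 0.5}

def convolve(p_n, p_1):
    """Discrete form of the recursion: P_{n+1}(x) = sum_{x'} P_1(x - x') P_n(x')."""
    p_next = {}
    for x_prime, prob in p_n.items():
        for step, step_prob in p_1.items():
            x = x_prime + step
            p_next[x] = p_next.get(x, 0.0) + step_prob * prob
    return p_next

def distribution_after(n):
    """Distribution of the walker's position after n steps, starting at 0."""
    p_1 = step_distribution()
    p = {0: 1.0}
    for _ in range(n):
        p = convolve(p, p_1)
    return p

def std_dev(p):
    """Standard deviation of a distribution given as {position: probability}."""
    mean = sum(x * q for x, q in p.items())
    variance = sum((x - mean) ** 2 * q for x, q in p.items())
    return variance ** 0.5

if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, std_dev(distribution_after(n)))
```

Each application of `convolve` advances the distribution by one step, so the loop in `distribution_after` is the recursion applied n times to a point mass at 0.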

Einstein (1905) gave a physical description of Brownian motion, which was later verified experimentally by Perrin (1909). This work provided evidence for the atomic nature of matter, and Perrin received the 1926 Nobel Prize in Physics for it.

Einstein’s assumptions (similar to Bachelier):

1. Many independent random walkers.
2. Each takes steps that, after some very small time interval γ, can be considered independent in time.
3. He used a continuum approach rather than a discrete one.

With these contributions, the basic idea of the random walk emerged. We now give some application areas. Random walks arise in the motion of particles under collision (such as Brownian motion), in gambling problems (the fortune of a perhaps unfortunate gambler), and in mathematical models in finance (such as the pricing of options). In economics they are used to model share prices and other factors. In physics, random walks serve as simplified models of physical Brownian motion and of the random movement of molecules in liquids and gases; see, for example, diffusion-limited aggregation. Random walks and some self-interacting walks also play a role in quantum field theory.

Fig. 1. Route of each agent in 2D Plot for Zero Start Agent

Furthermore, in complex systems the random walk is used to model agent-like systems, such as crawling the web to obtain statistical results like page ranking or an estimate of the size of the web. In bioinformatics, random walks are used in research on biclustering and protein-protein interactions. In wireless networking, the random walk is used to model node movement, and agent-based systems such as event handling and reporting can be built on the random walk concept. The random walk is also used to model gambling.

In this paper we give some experimental results on the behaviour of the random walk in one dimension, and we also cover two specific applications: the random walk in modelling gambling and in biclustering.

2 Experiments

We designed a set of experiments to understand the random walk in 1D. Each setting uses a single agent whose first walk starts from the initial point 0. In most cases we run the simulation for 100 movements of the agent. The implementation uses the LEDA C++ graph library [1]. We construct a graph for the random walk and extend it during the walk whenever necessary. We investigate two questions: "what happens within 100 movements?" and "which node is visited most during the random walk?". This section is divided into three parts. In the first part, the agent always starts from the initial point p = 0. In the second part, each 100-movement random walk continues from where the previous one ended. In the third part, the agent chooses a random node of the existing random walk graph and starts from that node. For each of these setups, a run consists of 100 executions of 100 movements each, without violating the settings.

2.1 Zero Start Agent

First of all, Figures 1 and 2 show how the agent makes its moves. The plots show the current position on the line (vertical axis) versus the time step (horizontal axis). From the non-linear plots we see that no exact function gives the position of the agent at time t, so we can only describe statistical behaviour. In Figure 3 we see a normal-like peak around 0, since every walk starts at point 0 and the agent gradually drifts to a point where p−(x) < 0 or p+(x) ≥ 0. Interestingly, although the agent starts its route at point 0, the distribution is not centered at 0: its highest peak is at point 7.
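The zero start setting can be approximated with the following sketch. It is not the LEDA implementation used for the figures: it assumes fair ±1 steps, counts a visit for the starting point of every execution, and the function name `zero_start_visits` is ours.

```python
import random
from collections import Counter

def zero_start_visits(executions=100, moves=100, seed=None):
    """Sum the visits per node over several walks that all start at 0."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(executions):
        position = 0  # every execution restarts from point 0
        visits[position] += 1
        for _ in range(moves):
            position += rng.choice((-1, 1))  # fair step (assumption)
            visits[position] += 1
    return visits

if __name__ == "__main__":
    visits = zero_start_visits(seed=42)
    node, count = visits.most_common(1)[0]
    print("most visited node:", node, "visits:", count)
```

Running it with different seeds gives a feel for how much the location of the highest peak can vary from run to run.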

Fig. 2. Route of each agent in 3D Plot for Zero Start Agent

2.2 Continuous Agent

In addition, we tested the behaviour of the continuous agent. Interestingly, perhaps because of the length of our runs, the number of visits to specific nodes fluctuates when we look at the summed visit counts. We made 5 different runs and plotted them in Figure 4. There are two peaks, around p1 ≈ −100 and p2 ≈ −15. These two peaks might merge at the point pm ≈ −60; the distinction may be due to the small number of runs, since we have only 5 continuous random walks. More interestingly, on average the agent spends more time at negative nodes. This holds for all 5 simulations, and the random number generator and its distribution may have a tendency to make the agent spend more time at negative positions.
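A minimal sketch of the continuous setting is given below, again assuming fair ±1 steps; the name `continuous_visits` is ours. Because the agent never returns to 0 by construction, one run is effectively a single walk of 100 × 100 = 10,000 steps, whose typical spread is about √10000 = 100, so visit peaks far from 0 in a handful of runs are plausible even with an unbiased generator.

```python
import random
from collections import Counter

def continuous_visits(executions=100, moves=100, seed=None):
    """Count visits per node when each walk continues where the last ended."""
    rng = random.Random(seed)
    visits = Counter()
    position = 0  # only the very first execution starts at 0
    visits[position] += 1
    for _ in range(executions):
        for _ in range(moves):
            position += rng.choice((-1, 1))  # fair step (assumption)
            visits[position] += 1
    return visits, position

if __name__ == "__main__":
    for run in range(5):
        visits, final_position = continuous_visits(seed=run)
        negative = sum(c for node, c in visits.items() if node < 0)
        share = negative / sum(visits.values())
        print(f"run {run}: final position {final_position}, "
              f"share of visits at negative nodes {share:.2f}")
```

The printed share of visits at negative nodes is one simple way to probe the side bias discussed above across several seeds.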

2.3 Random Restart Agent

Finally, we tested an agent with random restart. Let S = {p_i, p_{i+1}, p_{i+2}, ..., p_j} be the set of points, or nodes, visited by the agent. We randomly choose a point p_k with i ≤ k ≤ j, and the agent continues its route in 1D from p_k. The resulting plot is given in Figure 5. This time we found a normal distribution with µ ≈ 0. By its nature, a random walk moves within some interval. In this experiment the agent begins at 0 and each following walk continues from a randomly chosen point p_k, so these intervals tend to be collected around 0. In Table 2.3 we give the intervals for five different runs of 100 executions each. The intervals are close to each other, and the agent cannot go anywhere else, since the random restart acts as a bounding force on the agent. Namely, the agent cannot easily leave its interval: whenever it tries to break the bounds, the method does not allow it to go further.

Table 2.3. Random Walk Intervals for Random Restart Agent

Interval 1    -82 ... 0 ... 71
Interval 2    -67 ... 0 ... 71
Interval 3    -57 ... 0 ... 73
Interval 4    -66 ... 0 ... 66
Interval 5    -64 ... 0 ... 69
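The random restart setting can be sketched as follows. We assume that the restart point is drawn uniformly from all distinct nodes visited so far and that steps are fair ±1 moves; both are our reading of the setup, and the name `random_restart_interval` is hypothetical. The function reports the interval of visited nodes, which is the quantity listed in the table.

```python
import random

def random_restart_interval(executions=100, moves=100, seed=None):
    """Return (lowest, highest) node visited by a random restart agent."""
    rng = random.Random(seed)
    visited = {0}
    position = 0
    for execution in range(executions):
        if execution > 0:
            # Restart from a uniformly chosen visited node (assumption).
            position = rng.choice(sorted(visited))
        for _ in range(moves):
            position += rng.choice((-1, 1))  # fair step (assumption)
            visited.add(position)
    return min(visited), max(visited)

if __name__ == "__main__":
    for run in range(5):
        low, high = random_restart_interval(seed=run)
        print(f"Interval {run + 1}: {low} ... 0 ... {high}")
```

Since the walk only moves to neighbouring nodes, the visited set is always a contiguous interval containing 0, and it can only grow when a walk happens to restart near one of its ends.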

Fig. 3. Sum of the instances at point i over all routing Zero Start Agents

3 Random Walk Applications

There is a variety of cases that use random walks. We do not consider all of them and concentrate on two: the random walk in biclustering and the random walk in gambling.

Biclustering was introduced by [5], and the first heuristic method was given by [3]. The problem is to find maximum submatrices that do not violate given constraints. The constraints define a bicluster type: the submatrices may be all constant, have constant rows, have constant columns, or be coherent. The problem is proven to be NP-hard. Interestingly, in [2] the biclustering problem is solved using a random walk. The method is simple: on a 2D matrix, the agent makes random walks and tries to extend the locally found submatrices by looking at their neighbours while computing a similarity score at each extension.

In gambling, random walks appear in the "Gambler's Ruin" problem [6]. Suppose you flip a coin, start with a dollars, and repeatedly bet one dollar until you either reach 0 dollars (i.e. go broke) or reach c dollars (i.e. get rich), and then you stop. Suppose you have probability 1/2 of winning (or losing) each bet. We ask: "What is the probability q of reaching c before 0?" Since on average we break even, the amounts of money we hold form a martingale, i.e. a random sequence which stays the same on average. It follows that our average amount at the end should equal the amount we started with. That is, q(c) + (1 − q)(0) = a, so that q = a/c. Clearly, the higher the target c relative to the starting amount a, the more likely people are to lose their money. The casino has the advantage that it can pay out as much as the gambler wins, and although the gambler may have earned c − a > 0 at time t, one cannot say that the money still remains at time t + k. This is strongly related to the results we obtained in the experimental section. Conversely, once the gambler has a = 0 at time t, the casino has ruined the gambler and all of the money belongs to the casino.
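The result q = a/c can be checked empirically with a short Monte Carlo sketch. The function names and the number of trials are our own choices; the only modelling assumption is the fair one-dollar bet described above.

```python
import random

def reaches_target(a, c, rng):
    """Play fair one-dollar bets from a dollars until hitting 0 or c."""
    money = a
    while 0 < money < c:
        money += 1 if rng.random() < 0.5 else -1
    return money == c

def estimate_win_probability(a, c, trials=20000, seed=None):
    """Estimate q, the probability of reaching c before 0."""
    rng = random.Random(seed)
    wins = sum(reaches_target(a, c, rng) for _ in range(trials))
    return wins / trials

if __name__ == "__main__":
    a, c = 3, 10
    print("simulated q:", estimate_win_probability(a, c, seed=7))
    print("martingale prediction a/c:", a / c)
```

The estimate should approach a/c as the number of trials grows; raising c while keeping a fixed shows how quickly the chance of reaching the target falls.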

4 Conclusion

In this paper we presented some aspects of the "random walk" phenomenon, together with some experimental results and two basic applications. We believe that biology [4] and complex systems still offer many more attractive problems that could be simulated using random walks. Because of its nature and its quick adaptability to many systems, we think the random walk will remain attractive in the future.

Fig. 4. Route of each agent in 2D Plot for Continuous Agent

Fig. 5. Route of each agent in 2D Plot for Random Restart Agent

References

1. LEDA C++ algorithm library. http://www.algorithmic-solutions.com/.

2. F. Angiulli, E. Cesario, and C. Pizzuti. Random walk biclustering for microarray data. Inf. Sci., 178(6):1479–1497, 2008.

3. Y. Cheng and G. M. Church. Biclustering of expression data. In R. Altman, L. Bailey, Timothy, P. Bourne, M. Gribskov, T. Lengauer, and I. N. Shindyalov, editors, Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB-00), pages 93–103, Menlo Park, CA, Aug. 16–23 2000. AAAI Press.

4. E. A. Codling, M. J. Plank, and S. Benhamou. Random walk models in biology. Journal of The Royal Society Interface, 5(25):813–834, 2008.

5. J. A. Hartigan. Direct clustering of a data matrix. Journal of the American Statistical Association, 67(337):123–129, 1972.

6. R. W. Shonkwiler and F. Mendivil. Random walks. pages 165–212. 2009.