Lectures on Probability and Statistical Models
Phil Pollett, Professor of Mathematics, The University of Queensland
© These materials can be used for any educational purpose provided they are not altered.
13 Markov chains

Imprecise (intuitive) definition. A Markov process is a random process that "forgets its past", in the following sense: given the present state of the process, its past and its future are independent. If the set of states S is discrete, then the process is called a Markov chain.
Remark. At first sight this definition might appear to cover only trivial examples, but note that the current state could be complicated and could include a record of the recent past.
Andrei Andreyevich Markov (Born: 14/06/1856, Ryazan, Russia; Died: 20/07/1922, St Petersburg, Russia)

Markov is famous for his pioneering work on Markov chains, which launched the theory of stochastic processes. His early work was in number theory, analysis, continued fractions, limits of integrals, approximation theory and convergence of series.
Example. There are two rooms, labelled A and B. There is a spider, initially in Room A, hunting a fly that is initially in Room B. They move from room to room independently: every minute each changes rooms (with probability p for the spider and q for the fly) or stays put, with the complementary probabilities. Once in the same room, the spider eats the fly and the hunt ceases.
The hunt can be represented as a Markov chain with three states: (0) the spider and the fly are in the same room (the hunt has ended), (1) the spider is in Room A and the fly is in Room B, and (2) the spider is in Room B and the fly is in Room A.
Eventually we will be able to answer questions like "What is the probability that the hunt lasts more than two minutes?"
Let Xn be the state of the process at time n (that is, after n minutes). Then Xn ∈ S = {0, 1, 2}. The set S is called the state space. The initial state is X0 = 1. State 0 is called an absorbing state, because the process remains there once it is reached.
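The hunt can also be simulated directly. A minimal sketch (the function name is mine, and the parameter values p = 1/4 and q = 1/2 are the ones used later in these notes) estimates the probability that the hunt lasts more than two minutes by Monte Carlo:

```python
import random

def simulate_hunt(p, q, rng):
    """Simulate one hunt; return the number of minutes until absorption.

    State 1: spider in A, fly in B; state 2: spider in B, fly in A;
    state 0: both in the same room (absorbing)."""
    state, minutes = 1, 0
    while state != 0:
        spider_moves = rng.random() < p
        fly_moves = rng.random() < q
        if spider_moves != fly_moves:      # exactly one moves: same room, hunt ends
            state = 0
        elif spider_moves and fly_moves:   # both move: the rooms swap
            state = 2 if state == 1 else 1
        minutes += 1                       # neither moves: state unchanged
    return minutes

rng = random.Random(42)
n_trials = 100_000
long_hunts = sum(simulate_hunt(0.25, 0.5, rng) > 2 for _ in range(n_trials))
print(long_hunts / n_trials)   # Monte Carlo estimate of Pr(hunt > 2 minutes)
```

With p = 1/4 and q = 1/2, the hunt ends at each step with probability 1/2 from either transient state, so the estimate should be close to (1/2)² = 1/4.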
Definition. A sequence {Xn, n = 0, 1, . . . } of random variables is called a discrete-time stochastic process; Xn usually represents the state of the process at time n. If {Xn} takes values in a discrete state space S, then it is called a Markov chain if

Pr(Xm+1 = j | Xm = i, Xm−1 = im−1, . . . , X0 = i0)
    = Pr(Xm+1 = j | Xm = i),    (1)

for all time points m and all states i0, . . . , im−1, i, j ∈ S. If the right-hand side of (1) is the same for all m, then the Markov chain is said to be time homogeneous.
Remarks. (1) Matrices like this (with non-negative entries and all row sums equal to 1) are called stochastic matrices. Writing 1 = (1, 1, . . . )T (where T denotes transpose), we see that P1 = 1. Hence P (and indeed any stochastic matrix) has an eigenvector 1 corresponding to an eigenvalue λ = 1.
(2) We may usefully set P(0) = I, where, as usual, I denotes the identity matrix.
Example. Returning to the hunt, the three states were: (0) the spider and the fly are in the same room, (1) the spider is in Room A and the fly is in Room B, and (2) the spider is in Room B and the fly is in Room A. Since the spider changes rooms with probability p and the fly changes rooms with probability q (independently), the hunt ends in one step precisely when exactly one of them moves, and the rooms swap when both move, so

        1                      0                   0
P =     p(1 − q) + q(1 − p)    (1 − p)(1 − q)      pq
        p(1 − q) + q(1 − p)    pq                  (1 − p)(1 − q)

where rows and columns are indexed by the states 0, 1, 2.
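The one-step probabilities out of state 1 follow from the room dynamics: the hunt ends if exactly one creature moves, the rooms swap if both move, and nothing changes if neither moves; state 2 is symmetric. A sketch of the resulting matrix (the function name is mine):

```python
def hunt_matrix(p, q):
    """Transition matrix for the hunt, states 0 (absorbed), 1, 2."""
    end = p * (1 - q) + q * (1 - p)   # exactly one creature moves: same room
    stay = (1 - p) * (1 - q)          # neither moves
    swap = p * q                      # both move: rooms swap
    return [
        [1.0, 0.0, 0.0],              # state 0 is absorbing
        [end, stay, swap],
        [end, swap, stay],
    ]

P = hunt_matrix(0.25, 0.5)
# Each row sums to 1, as it must for a stochastic matrix:
assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)
```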
or, equivalently, in terms of transition matrices, P(n+m) = P(n)P(m). Thus, in particular, we have P(n) = P(n−1)P (remembering that P := P(1)). Therefore,

P(n) = P^n, n ≥ 1.

Note that since P(0) = I = P^0, this expression is valid for all n ≥ 0.
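The relation P(n) = P^n makes the earlier question about the hunt exact: starting in state 1, Pr(hunt lasts more than two minutes) = Pr(X2 ≠ 0) = 1 − (P²)₁₀. A pure-Python sketch (helper names are mine; the hunt matrix below uses p = 1/4, q = 1/2, with entries worked out from the room dynamics):

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(P, n):
    """n-step transition matrix P(n) = P**n (n >= 0)."""
    result = [[float(i == j) for j in range(len(P))] for i in range(len(P))]  # P(0) = I
    for _ in range(n):
        result = matmul(result, P)
    return result

# Hunt with p = 1/4, q = 1/2:
P = [[1.0, 0.0,   0.0],
     [0.5, 0.375, 0.125],
     [0.5, 0.125, 0.375]]
P2 = matpow(P, 2)
print(1 - P2[1][0])   # Pr(hunt lasts more than two minutes) from state 1 → 0.25
```

This agrees with the direct argument that, from either transient state, the hunt ends at each step with probability 1/2.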
Example. Returning to the hunt with p = 1/4 and q = 1/2, suppose that, at the beginning of the hunt, each creature is equally likely to be in either room, so that the initial distribution is π(0) = (1/2, 1/4, 1/4) (they start in the same room with probability 1/2, and each of states 1 and 2 has probability 1/4).
Notice also that |r| < 1, since p, q ∈ (0, 1). Therefore, π is also a limiting distribution, because

lim_{n→∞} Pr(Xn = 0) = q/(p + q),    lim_{n→∞} Pr(Xn = 1) = p/(p + q).
Remark. If, for a general Markov chain, a limiting distribution π exists, then it is a stationary distribution, that is, πP = π (π is a left eigenvector corresponding to the eigenvalue 1).
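For the two-state chain with Pr(0 → 1) = p and Pr(1 → 0) = q, both claims can be checked numerically: iterating the distribution converges to (q/(p+q), p/(p+q)), and that limit is fixed by the chain. A sketch (function name and parameter values are mine):

```python
def two_state_limit(p, q, n=200):
    """Iterate the two-state chain's distribution for n steps, starting at (1, 0)."""
    pi0, pi1 = 1.0, 0.0
    for _ in range(n):
        # One step of pi -> pi P for the 2x2 transition matrix [[1-p, p], [q, 1-q]].
        pi0, pi1 = pi0 * (1 - p) + pi1 * q, pi0 * p + pi1 * (1 - q)
    return pi0, pi1

p, q = 0.25, 0.5
pi = two_state_limit(p, q)
# The limit agrees with (q/(p+q), p/(p+q)) and is stationary: pi P = pi.
assert abs(pi[0] - q / (p + q)) < 1e-12
assert abs(pi[0] * (1 - p) + pi[1] * q - pi[0]) < 1e-12
```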
For details (and the converse), you will need a moreadvanced course on Stochastic Processes.
Example. Max (a dog) is subjected to a series of trials, in each of which he is given a choice of going to a dish to his left, containing tasty food, or a dish to his right, containing food with an unpleasant taste.
Suppose that if, on any given occasion, Max goes to the left, then he will return there on the next occasion with probability 0.99, while if he goes to the right, he will do so on the next occasion with probability 0.1 (Max is smart, but he is not infallible).
Let Xn be 0 or 1 according as Max chooses the dish to the left or the dish to the right on trial n. Then {Xn} is a two-state Markov chain with p = 0.01 and q = 0.9, and hence r = 0.09. Therefore, if the first dish is chosen at random (at time n = 1), then Max chooses the tasty food on the n-th trial with probability

Pr(Xn = 0) = q/(p + q) + (1/2 − q/(p + q)) r^(n−1) = 90/91 − (89/182)(0.09)^(n−1).
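This probability can also be computed by iterating the chain directly, without the closed form. A minimal sketch (function name is mine):

```python
def max_prob_left(n, p=0.01, q=0.9):
    """Pr(Max chooses the tasty left dish on trial n), with the first dish
    chosen at random on trial 1.  Iterates the two-state chain directly."""
    prob_left = 0.5                  # trial 1: dish chosen at random
    for _ in range(n - 1):
        # pi_0 -> pi_0 (1-p) + pi_1 q, with pi_1 = 1 - pi_0.
        prob_left = prob_left * (1 - p) + (1 - prob_left) * q
    return prob_left

# Approaches the limit q/(p+q) = 90/91 ≈ 0.989 at geometric rate r = 0.09:
print(max_prob_left(1), max_prob_left(5), max_prob_left(50))
```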
Birth-death chains. Their state space S is either the integers, the non-negative integers, or {0, 1, . . . , N}, and jumps of size greater than 1 are not permitted; their transition probabilities are therefore of the form pi,i+1 = ai, pi,i−1 = bi and pii = 1 − ai − bi, with pij = 0 otherwise.

The birth probabilities (ai) and the death probabilities (bi) are strictly positive and satisfy ai + bi ≤ 1, except perhaps at the boundaries of S, where they could be 0. If ai = a and bi = b, the chain is called a random walk.
Gambler's ruin. A gambler successively wagers a single unit in an even-money game. Xn is his capital after n bets and S = {0, 1, . . . , N}. If his capital reaches N he stops and leaves happy, while state 0 corresponds to "bust". Here ai = bi = 1/2, except at the boundaries (0 and N are absorbing states). It is easy to show that the player goes bust with probability 1 − i/N if his initial capital is i.
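The ruin probability 1 − i/N can be checked by simulation; a sketch (function name and parameters are mine):

```python
import random

def ruin_probability(i, N, n_trials=20_000, rng=None):
    """Estimate by simulation the probability that a fair gambler
    starting with capital i goes bust before reaching N."""
    rng = rng or random.Random(0)
    busts = 0
    for _ in range(n_trials):
        capital = i
        while 0 < capital < N:                       # play until absorption
            capital += 1 if rng.random() < 0.5 else -1
        busts += capital == 0
    return busts / n_trials

# Theory: ruin probability is 1 - i/N for a fair game.
print(ruin_probability(3, 10))   # should be close to 1 - 3/10 = 0.7
```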
The Ehrenfest diffusion model. N particles are allowed to pass through a small aperture between two chambers A and B. We assume that at each time epoch n, a single particle, chosen uniformly at random from the N, passes through the aperture.

Let Xn be the number in chamber A at time n. Then S = {0, 1, . . . , N} and, for i ∈ S, ai = 1 − i/N and bi = i/N. In this model, 0 and N are reflecting barriers. It is easy to show that the stationary distribution is binomial B(N, 1/2).
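The binomial stationary distribution can be verified directly from πP = π, since in a birth-death chain probability flows into state j only from j − 1 and j + 1 (here ai + bi = 1, so there is no holding probability). A sketch (function name is mine):

```python
from math import comb

def ehrenfest_check(N):
    """Verify that Binomial(N, 1/2) is stationary for the Ehrenfest chain."""
    pi = [comb(N, i) / 2**N for i in range(N + 1)]
    for j in range(N + 1):
        # (pi P)_j = pi_{j-1} a_{j-1} + pi_{j+1} b_{j+1}
        flow = 0.0
        if j >= 1:
            flow += pi[j - 1] * (1 - (j - 1) / N)   # a_{j-1}: a particle enters A
        if j <= N - 1:
            flow += pi[j + 1] * (j + 1) / N         # b_{j+1}: a particle leaves A
        assert abs(flow - pi[j]) < 1e-12
    return True

print(ehrenfest_check(20))   # → True
```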
Population models. Here Xn is the size of the population at time n (for example, at the end of the n-th breeding cycle, or at the time of the n-th census). S = {0, 1, . . . }, or S = {0, 1, . . . , N} when there is an upper limit N on the population size (frequently interpreted as the carrying capacity). Usually 0 is an absorbing state, corresponding to population extinction, and N is reflecting.
Example. Take S = {0, 1, . . . } with a0 = 0 and, for i ≥ 1, ai = a > 0 and bi = b > 0, where a + b = 1. It can be shown that extinction occurs with probability 1 when a ≤ b, and with probability (b/a)^i when a > b, where i is the initial population size. This is a good simple model for a population of cells: a = λ/(λ + µ) and b = µ/(λ + µ), where µ and λ are, respectively, the death and the cell division rates.
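The stated extinction probability can be checked against the first-step equations: conditioning on the first jump gives u_i = a u_{i+1} + b u_{i−1} for i ≥ 1, with u_0 = 1, and (b/a)^i satisfies these exactly since a + b = 1. A sketch with illustrative values of my choosing:

```python
def extinction_prob(i, a, b):
    """Claimed extinction probability from initial population size i (case a > b)."""
    return (b / a) ** i

a, b = 0.6, 0.4   # illustrative division/death probabilities, a + b = 1, a > b
u = [extinction_prob(i, a, b) for i in range(12)]
assert u[0] == 1.0
for i in range(1, 11):
    # First-step analysis: u_i = a*u_{i+1} + b*u_{i-1}
    assert abs(u[i] - (a * u[i + 1] + b * u[i - 1])) < 1e-12
```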
The logistic model. This has S = {0, . . . , N}, with 0 absorbing and N reflecting, and, for i = 1, . . . , N − 1,

ai = λ(1 − i/N) / (µ + λ(1 − i/N)),    bi = µ / (µ + λ(1 − i/N)).
Here λ and µ are birth and death rates. Notice that the birth and the death probabilities depend on i only through i/N, a quantity which is proportional to the population density: i/N = (i/Area)/(N/Area). Models with this property are called density dependent.
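The logistic transition probabilities are easy to tabulate; a sketch (function name and the illustrative rates are mine):

```python
def logistic_probs(i, N, lam, mu):
    """Birth/death probabilities (a_i, b_i) of the logistic chain,
    for interior states i = 1, ..., N - 1."""
    birth_rate = lam * (1 - i / N)    # density-dependent birth rate
    denom = mu + birth_rate
    return birth_rate / denom, mu / denom

# Illustrative rates: births outpace deaths at low density.
N, lam, mu = 100, 2.0, 1.0
a, b = logistic_probs(10, N, lam, mu)
# For interior states the chain jumps at every step: a_i + b_i = 1.
assert abs(a + b - 1.0) < 1e-12
```

Note that as i approaches N the birth probability ai falls toward 0, which is exactly the density-dependent crowding effect described above.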
Telecommunications. (1) A communications link in a telephone network has N circuits. One circuit is held by each call for its duration. Calls arrive at rate λ > 0 and are completed at rate µ > 0. Let Xn be the number of calls in progress at the n-th time epoch (when an arrival or a departure occurs). Then S = {0, . . . , N}, with 0 and N both reflecting barriers, and, for i = 1, . . . , N − 1,
(2) At a node in a packet-switching network, data packets are stored in a buffer of size N. They arrive at rate λ > 0 and are transmitted one at a time (in the order in which they arrive) at rate µ > 0. Let Xn be the number of packets yet to be transmitted just after the n-th time epoch (an arrival or a departure). Then S = {0, . . . , N}, with 0 and N both reflecting barriers, and, for i = 1, . . . , N − 1,
Genetic models. The simplest of these is the Wright-Fisher model. There are N individuals, each of two genetic types, A-type and a-type. Mutation (if any) occurs at birth. We assume that A-types are selectively superior in that the relative survival rate of A-type over a-type individuals in successive generations is γ > 1. Let Xn be the number of A-type individuals, so that N − Xn is the number of a-type.
Wright and Fisher postulated that the composition of the next generation is determined by N Bernoulli trials, where the probability pi of producing an A-type offspring is given by

pi = γ[i(1 − α) + (N − i)β] / ( γ[i(1 − α) + (N − i)β] + [iα + (N − i)(1 − β)] ),
where α and β are the respective mutation probabilities. We have S = {0, . . . , N} and, since the next generation is formed by N Bernoulli trials with success probability pi,

pij = Pr(Xn+1 = j | Xn = i) = (N choose j) pi^j (1 − pi)^(N−j), j ∈ S.
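Since each generation is just N Bernoulli trials with success probability pi, the transition probabilities are binomial. A sketch (function names and the illustrative parameter values are mine):

```python
from math import comb

def wf_p(i, N, gamma, alpha, beta):
    """Probability that one offspring is A-type, given i A-type parents."""
    fit_A = gamma * (i * (1 - alpha) + (N - i) * beta)   # weighted A-type output
    fit_a = i * alpha + (N - i) * (1 - beta)             # a-type output
    return fit_A / (fit_A + fit_a)

def wf_transition(i, j, N, gamma, alpha, beta):
    """Pr(X_{n+1} = j | X_n = i): N Bernoulli trials with success prob p_i."""
    p = wf_p(i, N, gamma, alpha, beta)
    return comb(N, j) * p**j * (1 - p)**(N - j)

# Illustrative parameters: slight selective advantage, small mutation rates.
N, gamma, alpha, beta = 50, 1.1, 0.01, 0.01
row_sum = sum(wf_transition(5, j, N, gamma, alpha, beta) for j in range(N + 1))
assert abs(row_sum - 1.0) < 1e-12   # each row of the transition matrix sums to 1
```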