Abstract This review contains a brief description of the results in random walk. We explain how the random walk Chapter II Literature Review theory has evolved, then define Central Limit Theorem and how random walk is related with Central Limit Theorem. We define simple random walk, Brownian motion and give an alternative method to show Brownian motion as an approximation of simple random walk. 5 Random Walks The definition of random walk may be thought of as the mathematical theory of the aimless wanderings of an idealized drunkard, yet the applications of the basic formalism to physical problems are numerous and significant and many mathematical results beautiful in their own right may be derived. Fourier transform methods are used to find where will be the drunkard after a given number of steps. For the invention of the term random walk we are indebted to the statistician Karl Pearson,1 who wrote the following brief letter (Pearson 1905a) to the English journal ‘Nature’ which appeared in the issue of 27 July 1905. An answer to Pearson’s question was not long in coming in the form of a letter from Lord Rayleigh, published in the same journal one week later. The arguments of Rayleigh’s 1880 paper are elegantly simple. Translating his approach into random walk terminology, he commences with the observation that if all steps are of unit length and take place parallel to the x-axis, the probability that the walker lies between x and x + δx after n steps is given for large n and small δx by −1 (2πn ) 2  − 1 n2  1 x δx . exp 2   6 This result, which he calls Bernoulli’s theorem, is a very old precursor of the modern Central Limit Theorem, some what informally stated. He extends this result first to the case in which x-axis and 1 steps are parallel to the 2n 1 steps are parallel to the y-axis, and then to the case in which 2n the number of steps along each axis varies. Finally he removes the restriction that the steps be parallel to the axis, and produces the desired result. Most of the analyses of Rayleigh and Pearson problems are based on the Fourier transform. Let the position of the walker after n steps be denoted by Rn. For a walk commencing at the origin of co-ordinates, we have n Rn = ∑ Y j (2.1) j =1 where the random vector Yj represents the displacement on the jth step. If the mean displacement on the jth step is m j we find on expectations of both sides of (2.1) (with expectations denoted by angular brackets) that the mean position after n steps is given by n Rn = ∑ m j j =1 (2.2) 7 If σ 2j = Y j − m j 2 (2.3) is the variance of the jth displacement, it follows from the independence of the displacements that the variance of the position is Var ( Rn ) = Rn − Rn 2 n = ∑ σ 2j (2.4) j =1 In the most commonly encountered case, in which the individual steps have the same probability density function, and so the mean m and variance σ2, (2.2) and (2.4) reduces to Rn =nm (2.5) Var (Rn ) = nσ 2 (2.6) and A simple random walk of length ‘n’ in Z d is an ordered sequence of points (ω (0), ω (1),......., ω (n )) in Z d with ω (0) = 0 and ω (i ) − ω (i − 1) = 1, 0 < i ≤ n. Let C ns (d ) denote the cardinality of the set of walks and Rns (d ) denote the sum of the squares of the end-to-end distance in d-dimensional Euclidean space, and where s=0 refers to simple random walks and s=1 refers to random walks with exclusion of immediate reversals. 8 Lemma2 C ns (d ) and Rns (d ) for s=0 and s=1 can be written as n C ns (d ) = ∑ Ans,i [2d ]i ; 2 , i =1 n Rns (d ) = ∑ Rns,i [2d ]i ; 2 , i =1 where Ans,i and R sn,i are independent of d and depend only on n, s and i. Proof: The basic idea is to partition walks according to the number of axis directions in which steps are made. The partitioning is to be made according to the direction chosen (positive or negative) when each axis direction is sampled for the first time. Thus if ‘i’ axis are sampled (with i ≤ d ), we have [2d ]i , 2 classes of walk each of which produces the same contribution to the cardinality of the set of walks (simple and with exclusion of immediate reversals) with the following attributes, the first direction sampled is the “1 direction”, that is, (1,0,0,….,0), and so on, i.e. if + U j represents a+1 along axis ‘j’ and - U j represents a-1 along axis ‘j’ then the first ‘k’ steps, where ‘k’ varies from 1 to n-i+1, are + U 1 the (k + 1)th step is + U 2 and the next ‘r’ steps, where ‘k’ varies from 1 to ‘k+2’ to ‘n-i+2’, are any of ± U 1 or ± U 2 and so on until ‘i’ directions are sampled. The Lemma follows from simple symmetry considerations. 9 Normal Distribution and Central Limit Theorem3 For a normal curve, the mean, median and mode are the same. To define a particular normal probability distribution, wee need only two parameters the mean µ and the standard deviation σ. Central Limit Theorem (C.L.T.) is a rule assuring that the sampling distribution of the mean approaches normal as the sample size increases, regardless of the shape of the population distribution from which the sample is selected. 4 The C.L.T. says that the probability distribution function of the sum of a large number of random variables approaches a Gaussian. Although the theorem is known to apply to some cases of statistically dependent random variables, most applications, and the largest body of knowledge, are directed towards statistically independent random variables. The C.L.T. guarantees only that the distribution of the sum of random variables becomes Gaussian. It does not follow that the probability density is always Gaussian.5 The practical usefulness of the C.L.T. does not reside so much in the exactness of the Gaussian distribution for N → ∞ because the variance of 1/N becomes infinite. Usefulness derives more from the fact that 1/N for finite N may have a distribution that is closely approximated 10 as Gaussian. The approximation may be quite accurate, even for relatively small values of N, in the central region of the Gaussian curve near the mean, even for large values of N. The approximation is made more accurate by increasing N. Central Limit Theorem6,7 If X1,X2,……..Xn constitute a random sample from an infinite population with the mean µ, the variance σ2, and the moment generating function MX(t), then the limiting distribution of Z = X −ω σ n as n → ∞ is the standard normal distribution Consider the particle performs a random walk of unit steps by X1, X2,…….,Xn . The position of the particle be denoted by S1 , S 2 , ......, S n . Then Pr { X i = 1} = p; . Pr {X i = −1} = q; p + q = 1; 0 < p < 1 11 The displacements X i ; i = 1,2,......, n in the n steps are mutually independent. Then the total displacement S n is the sum of n independent and identically distributed ( i.i.d.) random variables X i S n = X 1 + X 2 + ....... + X n E [X i ] = 1 × p + (− 1) × q = p − q [ ] E X i2 = p + q [ ] (2.7) ∴Var ( X i ) = E X i2 − E [ X i ] = ( p + q ) − ( p − q ) = 4 pq 2 2 ∴ E [S n ] = n( p − q ) Thus Var (S n ) = n4 pq But p = q = ∴ E [S n ] = 0 1 2 (2.8) Var (S n ) = n (2.9) ∴ σ = Var (S n ) = n Now since Xi S n = X 1 + X 2 + .... + X n are i.i.d. random variables, then the sum for large n is asymptotically normal with mean µt and variance σ 2 t (by virtue of the C.L.T. for equal components).8 Here t represents the length of the interval of time during which the displacement, which take place is equal to the increment S n − S 0 . We thus find that for o<i<n, {S n − S i } is normally distributed with mean µ ( n − i ) and variance σ 2 (n − i ). 12 Further the increments S i − S 0 and S n − S i are mutually independent; this implies that {S i } is a Markov process. Simple Random Walks9 Let X1, X2,……..be independent and identically distributed random variables defined on a probability space (Ω, F , Ρ ) taking values in the integer lattice Zd with P{X j = e} = 1 ; e =1 2d A simple random walk starting at x ∈ Z d is a stochastic process Sn, indexed by the non-negative integers, with S0=x and Sn=x+X1+X2+------Xn. The probability distribution of Sn is denoted by p n ( x, y ) = P x {S n = y} That is the simple random walk starting at x and reaches the point y after n steps. Consider p n (x ) for large n. Assume S0=0. ie ,we consider x=0. ∴ Sn = X1 + X 2 + − − − − − + X n 13 is the sum of independent random variables with mean 0 and covariance The C.L.T. states that Sn n 1 I d converges in distribution to a normal random 1 I d variable in Rd with mean 0 and covariance If we consider A ⊂ R d is an open ball, then we localized C.L.T. to A. ∴ By multivariate normal distribution d lim n→∞ S   d  2 − P n ∈ A  = ∫   e  n  A  2π  d x 2 2 (2.10) dx1 dx 2 ......dx d = The random variable S n only takes on values in Z d . If n is even, then S n has even parity while S n has odd parity for n odd. A typical open ball A ⊂ R d contains about n 2 m( A) points in the d lattice n −1 2 Z d , where m denote Lebesgue measure. About half of these points will have even parity. ∴ If n is even and the random walk spreads itself evenly as possible among the possible lattice points then d S 2  d  2 x  P n =  e ≈ d  n  n 2  2π   n −d x 2n 2 (2.11) 14 Since the problem of random walk is mathematically the problem of addition of independent (and usually identically distributed) random variables, the ultimate behavior of random walkers can be deduced from the C.L.T. of probability theory. For example, if the individual steps in one dimensional walk have displacement µ and variance σ 2 then the asymptotic probability density function Pn for the position X n after n steps is the normal density with mean n µ and variance n σ 2 Theorem If F has mean µ and finite variance σ 2 >0, then S n − mn σ n → φ in distribution where φ is the normal distribution with mean 0 and variance 1. Proof: We may suppose m=0, by considering the random variables X i − m , whose second moment is σ 2 . As in the preceding proof, we have   S  E exp it n  =   σ n   t n  = f  σ n   i 2σ 2 1 + 2   t2 t2 = 1 − + o n  2n n 2  t   t    + o   σ σ n n     −t (it )   → e 2 = e 2 .  2 2    n 2 (2.12) The limit being the characteristic function of φ , the proof is ended. 15 Local Central Limit Theorem6 Consider the simple random walk. If Y is any random variable taking values in Zd, the characteristic function φ (θ ) = φY (θ ), θ = (θ1 , ......θ d ), given by φ (θ ) = E (e iY ⋅θ ) = ∑ y∈Z P{Y = y}e iy⋅θ d (2.13) has period 2π in each component. We can therefore think of φ as a function on [− π , π ]d with periodic boundary conditions. The inversion formula for the characteristic function is P {Y = y} = 1 (2π ) d ∫e − iy ⋅θ φ (θ )dθ . (2.14) [−π ,π ]d This can be derived from (2.13) by multiplying both sides by e −iy⋅θ , integrating, and noting that ∫e ix⋅θ dθ = (2π ) δx. d [−π ,π ]d The characteristic function easily, φ n (θ ) = [φ (θ )]n , where φ (θ ) = 1 d ∑ cosθ j . d j =1 φn for Sn can be computed 16 We will now prove a very strong version of the local central limit theorem. Let p 0 (x ) = δ (x ) and for n>0, d  d  2 − p n ( x ) = p(n, x ) = 2  e  2πn  d x 2n 2 . (2.15) We write n ↔ x if n and x have the same parity, if n = x1 + x 2 + ......... + x d is even. Similarly we will write x ↔ y and n ↔ m .We define the error E (n, x ) by  p (n, x ) − p (n, x ) if n ↔ x E (n, x ) =  if n ↔ x 0 If f: Z d → R is any function and y ∈ Z d, we let ∇ y f and ∇ 2y f be the first and second differences in the direction y defined by ∇ y f ( x ) = f ( x + y ) − f ( x ), ∇ 2y f ( x ) = f ( x + y ) + f ( x − y ) − 2 f ( x ) Theorem: If E(n,x) is defined as above, then ( E (n, x ) ≤ O n −(d + 2 ) ( E (n, x ) ≤ x − 2 O n − d 2 ), 2 ). Moreover, if y ↔ 0, there exists a cy<∞ such that 17 ( ∇ y E (n, x ) ≤ c y O n −(d +3 ) ∇ y E (n, x ) ≤ c y x −2 ( ( −2 ), O n −(d +1) ∇ 2y E (n, x ) ≤ c y O n −(d + 4 ) ∇ 2y E (n, x ) ≤ c y x 2 ( 2 2 ), 2 ). ), O n −(d + 2 ) Proof: (Greogory F.Lawler. Intersection of Random walks. Birkhauser, Boston Basel, Berlin, Probability and its applications –see pages 14-16) Markov Chains10 In the theory of Markov Chains we consider the simplest generalization which consists in permitting the outcome of any trial to depend on the outcome of the directly preceding trial and only on it. Markov Chains illustrate well the connection between probability and measure. Definition Let S be a finite or countable set. Suppose that to each pair i and j in S there is assigned a nonnegative number Pij and that these numbers satisfy the constraint ∑P j ∈S ij = 1, i ∈ S (2.16) Let X0,X1,……. be a sequence of random variables whose ranges are contained in S. The sequence is a Markov Process if 18 P[X n +1 = j X 0 = i0 ,..... X n = in ] = P[X n +1 = j X n = in ] = Pi , j (2.17) for every n and every sequence io,i1,……in in S for which P[ X 0 = i0 ,..... X n = in ] > 0 The set S is the state space or phase space of the process, and the Pij are the transition probabilities. A Markov Process having a finite or denumerable state space is called a Markov Chain. Part of the defining condition is that the transition probability P[X n +1 = j X n = i ] = Pij (2.18) does not vary with n. The elements of S are thought of as the possible states of a system, X n representing the state at time n. The process X 0 , X 1 , X 2 ,..... then represents the history of the system which evolves in accordance with the probability law (2.17). The conditional distribution of the next state X n +1 given the state X n must not further depend on the past X 0 ,...... X n −1 . This is what (2.17) requires. The initial probabilities are π i = P[ X 0 = i ] The π i are non-negative and add to (2.16), but the definition of Markov Chain places no further restriction on them. 19 The properties of the Markov Chain are entirely determined by the transition and initial probabilities. The chain rule for conditional probabilities is P [ X o = i0 , X 1 = i1 , X 2 = i2 ] = P [ X 0 = i0 ] P [ X 1 = i1 X 0 = i0 ] P [ X 2 = i2 X 0 = i0 , X 1 = i1 ] = π i0 Pi0i1 Pi1i2 Similarly P[ X t = it ,0 ≤ t ≤ m] = π i0 Pi0i1 .....Pim −1im for any sequence i0, i1.…., im of states. Further, P[ X m +t = jt ,1 ≤ t ≤ n X s = i s ,0 ≤ s ≤ m] = Pim j1 Pj1 j2 ......... Pjn −1 jn as follows by expressing the conditional probability as a ratio and applying to numerator and denominator. The properties of the Markov Chain are entirely determined by the transition and initial probabilities. Chapman-Kolmogorov equation11 Pjk denotes the probability of unit step transition from state j to the state k at the next trial. The m-step transition probability is denoted by Pr {X m + n = k X n = j} = Pjk (m ) 20 Pjk(m) gives the probability that from the state j at the nth trial, the state k is reached at (m+n)th trial in m steps that is the probability of transition from state j to the state k in exactly m steps. Using the Markovian property of the process we may also derive the Chapman-Kolmogorov equation. ∞ Pij (t + s ) = ∑ Pik (t )Pkj (s ) (2.19) k =0 This equation states that in order to move from state i to state j in time t+s, X(t) moves to some state k in time t and then from k to j in the remaining time s. Adding out the intermediate states now gives the formula [ ] ∑P Pij(n ) = P X m + n = j X m = j = k1 .....k n −1 ik1 Pk1k 2 ....... Pk n−1 j (2.20) for the nth order transition probabilities . A Markov process for which all realization or sample functions {Xt, t ∈ [0,∞)} are continuous function is called diffusion process. The Poission process is a continuous time Markov chain and Brownian motion is a diffusion process. Brownian motion12 In 1827 the botanist R. Brown noticed that minute particles suspended in a liquid moved on highly irregular trails. This and a similar 21 phenomenon for smoke particles in air, was explained as resulting from molecular bombardment of the particles. Einstein published a mathematical study of this motion, which led to the calculation of Avogadro’s number. In 1923 Wiener proposed a rigorous mathematical model that exhibited random behaviour similar to that observed in Brownian motion. The paths described by this Wiener process in 3-dimensional space are so irregular as to have Hausdorff dimension equal to 2. We first define Brownian motion in one dimension, and then extend the definition to the higher dimensional cases. Let as consider a particle performing a random walk on the real line. Suppose at small intervals τ the particle jumps a small distance δ , randomly to the left or right. (This might be a reasonable description of a particle undergoing molecular bombardment in one dimension.) Let X τ (t ) denote the position of the particle at time t. Then, given the position X τ (kτ ) at time kτ , X τ (k + 1)τ is equally likely to be X τ (kτ ) + δ or X τ (kτ ) - δ . Assuming that the particle starts at the origin at time 0, then for t>0, the position at time t is described by the random variable ( X τ (t ) = δ Y1 + ...... + Yt τ  ) 22 where Y1 , Y2 ,....... are independent random variables, each having probability 1 1 of equaling 1 and probability of equaling -1. The Central Limit 2 2 Theorem tells as that, for fixed t, if τ is small then the distribution of the random variable X τ (t ) is approximately normal with mean 0 and variance t, since the Yi has mean 0 and variance 1. In the same way, if t and h are fixed, and τ is sufficiently small, then X τ (t + h ) − X τ (t ) is approximately normal with mean 0 and variance h. If 0 ≤ t1 ≤ t 2 ≤ ...... ≤ t 2 m , then the increments X τ (t 2 ) − X τ (t1 ), X τ (t 4 ) − X τ (t 3 ),........ X τ (t 2 m ) − X τ (t 2 m −1 ) are independent random variables. We define Brownian motion with the limit of the random walk X τ (t ) as t → 0. Definition Brownian motion or the Wiener process to be a random process X such that: (i) with probability 1, X(0)=0 (i.e. the process starts at the origin) and X(t)is a continuous function of t; (ii) for any t ≥ 0 and h>0 the increment X (t + h ) − X (t ) is normally distributed with mean 0 and variance h, thus ( P ( X (t + h ) − X (t ) ≤ x ) = 2πh ) −1 2  − u2 exp ∫  2h −∞ x  du;  (2.21) 23 (iii) if 0 ≤ t1 ≤ t 2 ≤ ..... ≤ t 2 m , the increments X (t 2 ) − X (t1 ),....... X (t 2 m ) − X (t 2 m −1 ) are independent. From (i) and (ii) we obtain that X(t) is itself normally distributed with mean 0 and variance h for each t. Also X (t + h ) − X (t ) has distribution independent of t. We can extend the definition of Brownian motion from R to R n ; define Brownian motion on R n so that the coordinate components are independent1-dimensional Brownian motions. Thus X : [0, ∞ ) → R n given by X (t ) = ( X 1 (t ),......, X n (t )) is an n-dimensional Brownian motion on some probability space if the random process X i (t ) is a 1-dimensional Brownian motion for each i, and X 1 (t1 ),......, X n (t n ) are independent for all sets of times t1 ,......., t n . By definition, the projection of X(t) onto each of the coordinate axes is a 1-dimensional Brownian motion. The n-dimensional Brownian motion is isotropic, that is, it has the same characteristic in every direction. Let Xt denote the displacement at time t of a Brownian particle.8 The displacement X t − X s over the time interval (s, t ) can be regarded as the sum of a large number of small displacements. The C.L.T. is essentially 24 applicable and it seems reasonable to assert that X t − X s is normally distributed. Similarly it seems reasonable to assume that the distribution of X t − X s and that of X t + h − X s + h are the same, for any h>0, if we suppose the medium to be in equilibrium. Finally it is intuitively clear that the displacement X t − X s should depend only on the length t-s and not on the time we begin observation. Alternative method to show Brownian motion as an approximation of simple13 random walk Suppose Brownian particle starts at the origin and jumps every ∇t seconds, moving to the right a small distance ∇x with probability 1 or 2 to the left a small distance ∇x with probability 1 2 . If X n (t ) is the position of the particle at time t= n∇t random variables Y1, Y2,…..Yn where P(Yi = ±∇x ) = 1 for all i ≥ 1. 2 Now Var ( X n (t )) = n(∇x )2 and hence X n (t ) = Y1 + Y2 + ....... + Yn n∇x where E ( Z n ) = 0 , Var ( Z n ) = 1 n∇x = Z n n∇x 25 n = t ∇t and Now (∇x )2 n ∇x = t ∇x ∇t . If we assume that ∇t → σ 2 ≥ 0 as ∇t → 0, the C.L.T. for identically and independent random variables implies X n (t ) → N (0,σ 2t ) , that is X n (t ) → W (t ) . Now we show that all the finite dimensional distribution of {X n (t )} converges to that of {W (t ), t ≥ 0}. If 0 ≤ t1 ≤ t2 ≤ ...... ≤ tn , then X n ( t j ) − X n ( t j −1 ) → W ( t j ) − W ( t j −1 ) for each fixed j ≥ 1 Since the random walk and the Brownian motion {W (t )} have independent increments, the joint characteristic functions of X n (t1 ),........ X n (tk ) is   k    j =1  φn ( u1 ,.......uk ) = E exp i ∑ u j X n ( t j )     k     k  = E  exp i  ∑ u j  X n ( t1 )  E exp i ∑ u j ( X n ( t2 ) − X n ( t1 ) )  ..........  j =2    j =1     { } E exp iuk ( X n ( tk ) − X n ( tk −1 ) )    → φ ( u1 ,......, uk ) , the joint characteristic function of W ( t1 ) ,......,W ( tk ) Hence by continuity theorem for characteristic function the random vectors ( X n (t1 ),....., X n (t k )) → (W (t1 ),....,W (t k )) 26 References 1 B.D.Hughes., 1995. Random walks and Random environments. Random Walks, Vol 1 Oxford Science Publications, Oxford University Press, New York, U.S.A. 2 A. S. Padmanabhan, Random walks strictly confined to a subspace (2001)Stat. Prob.Lett., 53, 209-303. 3 William Feller. An introduction to probability theory and its applications Volume 1, 2. 4 Richard I Levin and David s Rubin 1996 Statistics for Management. Prentice-Hall of India 5 Peyton Z. Peebles, Jr Probability, Random variables and Random Signal Principles Tata Mc Graw-Hill Edition 2002 6 John. E. Freund. 1992 Mathematical Statistics. Prentice-Hall International 7 Thomas H. Wonnacot and Ronald J. Wonnacot 1990 Introductory Statistics John Wiley, Chichester, New York, Brisbane, Toronto, Singapore. 8 Robert V Hogg and Elliot A Tanis 1997 Probability and Statistical Inference. Prentice-Hall International. 27 9 Greogory F. Lawler. Intersection of Random walks.Birkhauser, Boston Basel Berlin Probability and its applications 10 Patrick Billingley, 1991 Probality and Measure, John Wiley, Singapore. 11 Samuel Karlin and Howad M Tailor, A first course in stochastic process, Academic, San Diego, U.S.A(V0l 1 0f 2 Volumes). 12 Falconer K. J., 1995 Fractal Geometry, John Wiley, Chichester, New York, Brisbane, Toronto, Singapore. 13 A K Basu, 1987 Introduction to Stochastic process Narosa Publishing House.