MS&E 223 Simulation Peter J. Haas Lecture Notes #5 Generation Of Non-Uniform Random Numbers Spring Quarter 2005-06 Generation of Non-Uniform Random Numbers Refs: L&P, Chs. 6 and 7, L&K, Ch. 7, and Devroye, Non-Uniform Random Variate Generation (watch for typos) Problem: Given a uniform random variable U, generate a random variable X having a prescribed distribution function FX (⋅) . We previously discussed the inversion method. While inversion is a very general method, it may be computationally expensive. In particular, computing FX−1 ( ⋅ ) may have to be implemented via a numerical root-finding method in many cases. Therefore, we will now describe other methods for non-uniform random number generation. 1. Acceptance-Rejection Method We now discuss a method that is well-suited to generating random variates with an easily-calculated density. This method is called acceptance-rejection. Goal: Generate a random variate X having given probability density fX(x), where fX(x) is positive only on the interval [a, b] (where - ∞ < a < b < ∞ ). Enclose the density in a rectangle having base (b−a) and height m, where m = sup fX (x) . a≤ x≤ b fX (x) m x b a Suppose we throw down points uniformly in the rectangle R (denote these points by the symbol x). Throw away (or reject) the points above the density fX(x). Claim: The x-coordinate of each accepted point (denoted by the symbol ⊗ ) has density fX(x). Proof: Let (Z1 , Z2 ) be the (x, y) coordinates of a random point distributed uniformly in R. Then, for a ≤ x ≤ b, P(X ≤ x) = P(Z1 ≤ x | acceptance) = P(Z1 ≤ x , Z2 ≤ f X (Z1 )) . P(Z2 ≤ f X (Z1 )) Page 1 of 8 MS&E 223 Simulation Peter J. Haas Lecture Notes #5 Generation Of Non-Uniform Random Numbers Spring Quarter 2005-06 But P(Z1 ≤ x , Z 2 ≤ f X (Z1 )) is just the probability that (Z1 , Z2 ) falls into the shaded region below: fX(t) x a This probability is just P(X ≤ x) = ∫ x a ∫ x a f X (t) dt area of R b t -1 . Putting x = b, we conclude that P(Z 2 ≤ f X (Z1 )) = (area of R) . So, f X (t) dt . as desired. Hence, acceptance-rejection works. A more precise specification of the algorithm is: 1. 2. 3. Generate U1 , U 2 as two independent U(0,1) random variates. Set Z1 = a + (b − a)U1 and Z 2 = mU 2 [We are essentially using the inversion method here] If Z 2 ≤ f X (Z1 ) , return X = Z1 . Else, return to 1. Note that the number of pairs N of uniform random variates (U1 , U 2 ) required to generate X is geometrically distributed with mass function P(N = k) = p(1 − p)k −1 for k ≥ 1 , -1 where p = (area of R) . The mean number of pairs that would be required is then E(N) = 1 = (b−a)m . p Hence, a good measure of how fast such an acceptance-rejection algorithm will be is the closeness of the quantity (b−a)m to 1. (So we want m to be as small as possible.) 2. Generalized Acceptance-Rejection There are many random variates having densities that are positive over infinite intervals, or such that (b−a)m is very large. In such cases, the following generalization of acceptance-rejection may be useful. Goal: Generate a random variate X having given probability density f X ( ⋅ ) . Page 2 of 8 MS&E 223 Simulation Peter J. Haas Lecture Notes #5 Generation Of Non-Uniform Random Numbers Spring Quarter 2005-06 We assume that there exists a random variable Y, having known density g(⋅) , such that: 1. We have a means of efficiently generating replicates of Y (e.g. Y is exponential). 2. there exits a constant c > 0 such that fX(x) ≤ c g(x) for all x. x x x x fX(x) x x x c g(x) x x x x x x x A random variable Y having the above properties is called a majorizing random variable and the density g(⋅) is called a majorizing density. Let c = sup x f X (x) g(x) be the smallest such constant. Generalized acceptance-rejection works as follows: 1. 2. 3. Generate the random variable Y having density g(⋅) . Generate a uniform(0,1) random variable U, independent of Y. If Ucg(Y) ≤ fX(Y), return X = Y. Else, return to 1. An argument similar to that used in the case of acceptance-rejection shows that the above algorithm produces variates with the required density fX (⋅) . In this case, we have, using the law of total expectation and the independence of U and Y, P(Y ≤ x,acceptance) = P ( Y ≤ x, U ≤ f X (Y) / cg(Y) ) = ∫ P ( U ≤ f X (Y) / cg(Y) | Y = y ) g(y) dy x −∞ = ∫ P ( U ≤ f X (y) / cg(y) ) g(y) dy = ∫ x −∞ ( f X (y) / cg(y) ) g(y) dy = (1/ c) ∫−∞ f X (y) dy. −∞ x x Taking x = ∞ in the above calculation yields P(acceptance) = 1/ c , so that x P(Y ≤ x, acceptance) (1/ c) ∫-∞ ( f X (y) / c ) dy P(X ≤ x | acceptance)= = ∫ f X (y) dy . = -∞ P ( acceptance ) 1/ c x Page 3 of 8 MS&E 223 Simulation Peter J. Haas Lecture Notes #5 Generation Of Non-Uniform Random Numbers Spring Quarter 2005-06 The following figure provides some intuition; it was obtained by scaling the previous figure by a factor of 1/c. The majorizing density curve plays the role of the bounding rectangle in the simple A/R method. g(x) fX (x)/c x Note that the number N of random variates Y required to generate a single replicate of X is geometric with mean c. Hence, c is a measure of the efficiency of the algorithm--the closer c is to 1, the better. 3. Convolution Method To generate a random variable X that can be represented as X = Y1 + + Ym , where the Yi 's are independent random variables, an obvious strategy is to generate the Yi 's independently, and sum them to obtain X. This is known as the convolution method because the distribution function of X can be computed analytically as the convolution of the distributions of the Yi 's . Example: Suppose X is binomially distributed with parameters n and p (0 ≤ p ≤ 1). Then, it is well known that D X = Y1 + + Ym , where the Yi 's are i.i.d. Bernoulli (p) random variables having mass function P(Yi = 0) = 1 − p and P(Yi = 1) = p If n isn't too large, the convolution method can be a reasonably efficient way of generating X. Otherwise, one can turn to more sophisticated algorithms. See Devroye. 4. Composition Method Suppose that either the distribution FX (⋅) or the probability density fX (⋅) can be represented either of the following two forms: (1) FX (x) = p1FY1 (x) + p 2 FY2 (x) + ... + p m FYm (x) Page 4 of 8 MS&E 223 Simulation Peter J. Haas (2) Lecture Notes #5 Generation Of Non-Uniform Random Numbers Spring Quarter 2005-06 f X (x) = p1f Y1 (x) + p 2 f Y2 (x) + ... + p m f Ym (x) where p1 ,… , p m are non-negative and sum to one (so that they form a probability mass function). Then, assuming that the Yi's are relatively easily to generate, we can generate X as follows: 1. Generate a discrete random variable I on {1, ..., m}, where P(I = j) = p j for 1 ≤ j ≤ m. 2. Generate YI from FYI (⋅) (or fYI (⋅) ). 3. Return X = YI . This approach can sometimes lead to very fast algorithms for non-uniform variate generation. To see why, suppose we approximate the density fX (⋅) (of the random variable X to be generated) from below by a piecewise-linear function. R3 fX(x) R1 a R2 b c The area underneath the piecewise-linear function consists of two regions R 1 and R 2 ; the area above the approximation and below fX (⋅) is R 3 . If we let p i denote the area of R i , then it is clear that p1 , p 2 , and p 3 form a probability mass function. Furthermore, fX(x) = p1f Y1 (x) + p 2 f Y2 (x) + p 3 f Y3 (x) where ⎧ 2(x − a) , a≤x≤b ⎪ f Y1 (x) = ⎨ (b − a) 2 ⎪ 0 , otherwise ⎩ ⎧ 2(c − x) , b≤x≤c ⎪ f Y2 (x) = ⎨ (c − b) 2 ⎪ 0 , otherwise ⎩ f Y3 (x) = 1 (f X (x) − p1f Y1 (x) − p 2 f Y2 (x)) . p3 Page 5 of 8 MS&E 223 Simulation Peter J. Haas Lecture Notes #5 Generation Of Non-Uniform Random Numbers Spring Quarter 2005-06 To see this, set g i (x) = pi f Yi (x) for i=1,2,3 and observe that g 2 (x) = f Y2 (x) = 0 for x ∈ [a, b] . Now pick a value z ∈ [a, b] and observe that ( ) P(X ≤ z) = ∫ f X (x) dx = ∫ ⎡⎣g1 (x) + ( f X (x) − g1 (x) ) ⎤⎦ dx = ∫ p1 f Y1 (x) + p3 f Y3 (x) dx a a a z z ( z z ) = ∫ p1 f Y1 (x) + p 2 f Y2 (x) + p3 f Y3 (x) dx, a so that the desired result follows after taking derivatives. A similar argument establishes the result when x ∈ [b,c] . To generate random variables from triangular densities is easy and very fast. Inversion can be used, or one can take advantage of the following: Exercise: Let Z1, Z2 be two independent uniform random variables on [a, b]. Prove that the random variable max(Z1, Z2) has density fY1 (⋅) . (If inversion is used, a square root needs to be computed; this avoids the square root, at the cost of generating an additional random variable). Exercise: Let Z1, Z2 be two independent uniform random variables on [b, c]. Prove that the random variable min(Z1, Z2) has density fY2 (⋅) . Hence, so long as p3 is small, this composition approach to generating the random variable X will be efficient, because whatever routine is used to generate Y3 will only be used infrequently (and hence we don’t care much if the routine is relatively inefficient). Very complex densities fX (⋅) may be approximated by combinations of "triangles" (i.e. triangular densities of the above form) and "rectangles" (i.e. uniform random variables). One then needs some (possibly inefficient) algorithm to generate variates from what is "left over" after the approximation. But, as in the above example, this will only be called infrequently. This approach, while time-consuming to code, can lead to very fast variate generation schemes. 5. Generating Discrete Random Variables: The Alias Method We return now to the question of how to efficiently generate discrete random variables with a given probability mass function. Specifically, suppose our goal is to generate a random variable X for which P(X = x i ) = p i for 1 ≤ i ≤ n. The alias method relies upon the idea that if the p i 's are uniform (so that p i = 1/n for 1 ≤ i ≤ n), then X can be rapidly generated. Conceptually, we can think of dropping points randomly in a rectangle that is 1/n units tall and n units wide. The rectangle is divided into n vertical regions, each 1 unit wide, numbered 1,2,..., n from left to right. E.g., for n = 4 we have: Page 6 of 8 MS&E 223 Simulation Peter J. Haas Lecture Notes #5 Generation Of Non-Uniform Random Numbers Spring Quarter 2005-06 1/4 1 2 3 4 0 4 If the generated point lands in region i, then we return the value x i . In this simple case we don’t need to actually generate the y-coordinate of our random point, and can implement our technique efficiently by generating a uniform(0,1) random variable U and setting X = x ⎡nU ⎤ , where ⎡⎢ z ⎤⎥ denotes the smallest integer greater than or equal to z. (We don’t generate an explicit x value, we merely determine in which region our x value falls.) For non-uniform p i 's , the idea now is to “shift” probability mass around so as to make a rectangular region---where the vertical regions may now be horizontally partitioned into subregions---and use essentially the above idea. In shifting mass to make the rectangle, it turns out that the ith vertical region never needs to be chopped up into more than two parts: one part is associated with the original x i value, and the other part is associated with an alternate value x j for some j ≠ i . The alternate value is called the alias value. (See figures.) The x-coordinate of the random point is handled as before---i.e., we don’t generate explicit x-values but merely choose which region we fall in. Similarly, we don’t generate explicit y-values, but rather choose which of the two parts of the region that the point falls in. The point falls in the lower part of vertical region i with probability ri = (height of lower part of region i)/(height of region i). In general, to use the alias method, we first need to execute a set-up algorithm to generate the alias values and probability table. For 1 ≤ i ≤ n, this yields a set of a i 's (the alias values) and ri 's (the aliasing probabilities). We can then employ the following algorithm: Generate two independent uniform (0,1) random variables U1 and U 2 . 1. Set I = nU1 . 2. If U 2 ≤ rI , then return x I . Otherwise, return a I . 3. To set up the alias table requires running an algorithm that executes in O(n) time. (In other words, the algorithm runs in time which is essentially linear in n.) See Appendix 8B in L&K for details. When n is large, some care has to be taken when programming to avoid numerical round-off errors. Page 7 of 8 MS&E 223 Simulation Peter J. Haas Lecture Notes #5 Generation Of Non-Uniform Random Numbers Spring Quarter 2005-06 Example 1: 1/2 1/2 1/3 1/3 x2 1/6 1/6 x2 x3 x1 x1 x2 x3 i xi pi ai ri 1 2 3 x1 x2 x3 1/6 1/2 1/3 x2 x2 x3 0.5 1 1 Example 2: 3/7 7/21 2/7 x1 5/21 x2 x1 1/7 3/21 x2 x3 x1 x2 x3 i xi pi ai ri 1 x1 3/7 x1 1 2 x2 3/7 x1 5/7 3 x3 1/7 x2 3/7 Page 8 of 8
© Copyright 2025 ExpyDoc