CS 369 Tutorial 5
4 April 2014

1. (a) 48% of all teenagers own a skateboard and 39% of all teenagers own a skateboard and roller blades. What is the probability that a teenager owns roller blades given that the teenager owns a skateboard?
Answer: The conditional probability is given by the formula
$$P(A \mid B) = \frac{P(A, B)}{P(B)}.$$
We have $P(\text{skate}) = 0.48$ and $P(\text{skate}, \text{roller}) = 0.39$. Then
$$P(\text{roller} \mid \text{skate}) = \frac{P(\text{skate}, \text{roller})}{P(\text{skate})} = \frac{0.39}{0.48} = 0.8125.$$

(b) 88% of all households have a television and 51% of all households have a television and a DVD. What is the probability that a household has a DVD given that it has a television?
Answer:
$$P(\text{dvd} \mid \text{tv}) = \frac{P(\text{dvd}, \text{tv})}{P(\text{tv})} = \frac{0.51}{0.88} = 0.58.$$

(c) 84% of the houses have a garage and 65% of the houses have a garage and a garden. What is the probability that a house has a garden given that it has a garage?
Answer:
$$P(\text{garden} \mid \text{garage}) = \frac{0.65}{0.84} = 0.77.$$

2. The survival probabilities for men are as follows:
i) probability that a man lives at least 70 years: 80%
ii) probability that a man lives at least 80 years: 50%.
What is the probability that a man lives at least 80 years given that he has just turned 70?
Answer: Note that if a man lives at least 80 years, then he also lives at least 70 years. So $P(70, 80) = P(80)$. Using the definition of conditional probability we find
$$P(80 \mid 70) = \frac{P(70, 80)}{P(70)} = \frac{0.5}{0.8} = 0.625.$$

3. There is a new diagnostic test for a disease that occurs in about 0.05% of the population. The test is not perfect but will detect a person with the disease 99% of the time. It will, however, say that a person without the disease has the disease about 3% of the time. A person is selected at random from the population and the test indicates that this person has the disease. What are the conditional probabilities that
(a) the person has the disease?
(b) the person does not have the disease?
Answer: Let $D$ be the event that the person has the disease and $T$ be the event that the test gives a positive result. We need to find $P(D \mid T)$ and $P(D^c \mid T)$, where $D^c$ is the event that the person does not have the disease. We are given that $P(D) = 0.0005$, $P(T \mid D) = 0.99$ and $P(T \mid D^c) = 0.03$. We can also find $P(D^c) = 1 - P(D) = 0.9995$.
The outcome space $\mathcal{X}$ for the random experiment where we assess whether a person has the disease or not consists of only two outcomes, ‘the person has the disease’ and ‘the person does not have the disease’. So to find the marginal probability $P(T)$ we need to sum joint probabilities over all (two in this case) possible outcomes:
$$P(y) = \sum_{x \in \mathcal{X}} P(x, y) = \sum_{x \in \mathcal{X}} P(y \mid x) P(x) = P(y \mid D) P(D) + P(y \mid D^c) P(D^c)$$
and
$$P(T) = P(T \mid D) P(D) + P(T \mid D^c) P(D^c) = 0.99 \times 0.0005 + 0.03 \times 0.9995 = 0.03048.$$
Applying Bayes’ theorem we find
$$P(D \mid T) = \frac{P(T \mid D) P(D)}{P(T)} = \frac{0.99 \times 0.0005}{0.03048} = 0.0162,$$
$$P(D^c \mid T) = \frac{P(T \mid D^c) P(D^c)}{P(T)} = \frac{0.03 \times 0.9995}{0.03048} = 0.9838.$$

4. An urn contains 7 red and 11 white balls. Draw one ball at random from the urn. Let $X = 1$ if a red ball is drawn, and let $X = 0$ if a white ball is drawn. Give the probability mass function, mean and variance of $X$.
Answer: The state space for the variable $X$ is $\mathcal{X} = \{0, 1\}$. The probability mass function is
$$f(0) = \frac{11}{18} \quad \text{and} \quad f(1) = \frac{7}{18}.$$
The mean and variance are
$$E(X) = \sum_{x \in \mathcal{X}} x f(x) = 0 \times \frac{11}{18} + 1 \times \frac{7}{18} = \frac{7}{18},$$
$$E(X^2) = \sum_{x \in \mathcal{X}} x^2 f(x) = 0^2 \times \frac{11}{18} + 1^2 \times \frac{7}{18} = \frac{7}{18},$$
$$V(X) = E(X^2) - (E(X))^2 = \frac{7}{18} - \left(\frac{7}{18}\right)^2 = \frac{77}{18^2}.$$

5. The joint probability density function of random variables $X$ and $Y$ on the interval $(0, \infty)$ is
$$p_{XY}(x, y) = \frac{1}{30} e^{-\frac{x}{5} - \frac{y}{6}}.$$
(a) Find the marginal distribution for $X$ and for $Y$. (You can calculate the integral using www.wolframalpha.com if you need to. The syntax to calculate $\int_a^b f(x, y)\,dx$ is “integrate f(x, y) from a to b with respect to x”.)
Answer: Let $p_1(x) = \frac{1}{5} e^{-\frac{x}{5}}$ and $p_2(y) = \frac{1}{6} e^{-\frac{y}{6}}$.
Note that $p_{XY}(x, y) = p_1(x) p_2(y)$. Also note that $p_1$ and $p_2$ are probability density functions for exponential distributions with parameters $\frac{1}{5}$ and $\frac{1}{6}$. We could calculate the integrals to find the marginal distributions, but because the joint probability density function is a product of two one-dimensional probability density functions, one depending only on $x$ and the other only on $y$, these functions are the probability density functions of $X$ and $Y$. The marginal distributions are therefore exponential distributions with parameters $\frac{1}{5}$ and $\frac{1}{6}$.

(b) Are $X$ and $Y$ independent?
Answer: As we can see from part (a), the variables are independent because their joint probability density function is the product of their marginal probability density functions.

6. Which distribution of the ones discussed in lectures would you use to model each of the following random variables?

(a) A woman is trying to hit the 20 on a dart board. Let $X_1$ = the number of throws before she hits the 20.
Answer: Geometric, since each throw is a Bernoulli trial with outcomes hit (success) and miss (failure), and $X_1$ counts the misses before the first hit.

(b) $X_2$ = the time between eruptions of a volcano.
Answer: Exponential. Volcano eruptions can be modelled as a Poisson process, i.e., a process where events happen through time at some rate. The time between events in a Poisson process is exponentially distributed.

(c) $X_3$ = the number of eruptions of a volcano in 1000 years.
Answer: Poisson. The number of events of a Poisson process in an interval of given length is Poisson distributed.

(d) $X_4$ = the time between 3 eruptions of a volcano.
Answer: Gamma. The sum of independent exponentially distributed variables with a common rate is Gamma distributed.

(e) A shop is open between 11 and 3 and customers arrive at any time at an apparently constant rate. $X_5$ = the arrival time of any particular customer.
Answer: Uniform. The times when customers arrive are just points thrown at random on the interval between 11 and 3.

(f) A woman is trying to measure the length of a deck with a tape measure.
The deck is 20 m long. $X_6$ = the measurements obtained by the woman.
Answer: Normal. The measurements should be concentrated around the true value.

(g) A woman likes only some of the songs in her mp3 library that she is listening to on random play. $X_7$ = whether or not she likes the next song.
Answer: Bernoulli. The woman either likes or does not like a randomly chosen song, so assessing whether the woman likes the song is a Bernoulli trial.

(h) A student sits 25 tests in a year and needs to get beyond a certain threshold in each test to pass. $X_8$ = the number of tests in which the threshold is achieved.
Answer: Binomial. Getting beyond the threshold in a test is a Bernoulli trial, and the number of successes in a fixed number of such trials follows the Binomial distribution.

7. There are 10 questions in a test. Students find that some questions are easy to answer and some are not. A survey was taken of 12 students where each student was asked how many questions they found easy to answer. The results of the survey were $D = (2, 5, 8, 9, 4, 5, 8, 1, 10, 0, 4, 5)$.

(a) For a single data point, what is an appropriate model to use? That is, which distribution is a single data point drawn from?
Answer: The event that a student finds a question easy to answer is similar to tossing a coin. Having 10 questions is equivalent to tossing a coin 10 times. So the appropriate distribution is the Binomial distribution.

(b) What are the parameters of this model, what do they represent in the problem and are they known or unknown?
Answer: The Binomial distribution has two parameters, $n$ and $p$. The parameter $n$ represents the number of questions in the test, which we already know to be 10. The parameter $p$ is the probability that a student finds a question easy to answer and is unknown; it is the parameter we need to estimate.

(c) By using the pdf for this distribution and assuming independence of the data points, write down the likelihood function for the model.
Answer: If the $i$th student finds $D_i$ questions easy then
$$L(D \mid p) = \prod_{i=1}^{12} \binom{10}{D_i} p^{D_i} (1 - p)^{(10 - D_i)} = p^{\sum_{i=1}^{12} D_i} (1 - p)^{\sum_{i=1}^{12} (10 - D_i)} \prod_{i=1}^{12} \binom{10}{D_i}.$$

(d) Propose a way to estimate any unknown parameters of your model.
Answer: One good way to estimate a parameter when the likelihood function is available is to find the maximum of this function with respect to the unknown parameter. The point at which the maximum is reached can be used as an estimator (the maximum likelihood estimator) of the parameter.
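
The tutorial stops at proposing maximum likelihood; as a minimal sketch of how that proposal could be carried out, assume the Binomial model from 7(a) with $n = 10$. Setting the derivative of the log-likelihood to zero gives the closed form $\hat{p} = \sum_i D_i / (12 \times 10) = 61/120$. The Python below computes this and cross-checks it with a coarse grid search over $p$. The function names (`log_likelihood`, `mle_closed_form`, `mle_grid`) and the grid resolution are illustrative choices, not part of the tutorial.

```python
import math

# Survey data from question 7: number of questions each student found easy.
D = [2, 5, 8, 9, 4, 5, 8, 1, 10, 0, 4, 5]
N_QUESTIONS = 10  # known parameter n of the Binomial model


def log_likelihood(p, data=D, n=N_QUESTIONS):
    """Binomial log-likelihood of the data for success probability 0 < p < 1."""
    return sum(
        math.log(math.comb(n, d)) + d * math.log(p) + (n - d) * math.log(1 - p)
        for d in data
    )


def mle_closed_form(data=D, n=N_QUESTIONS):
    """Closed-form maximiser: total successes divided by total trials."""
    return sum(data) / (n * len(data))


def mle_grid(data=D, n=N_QUESTIONS, steps=9999):
    """Numerical check: maximise the log-likelihood over a grid inside (0, 1)."""
    grid = [k / (steps + 1) for k in range(1, steps + 1)]
    return max(grid, key=lambda p: log_likelihood(p, data, n))


if __name__ == "__main__":
    print("closed form:", mle_closed_form())
    print("grid search:", mle_grid())
```

The grid search is only a sanity check on the closed form; it works on the log-likelihood rather than the likelihood itself because the product of 12 small terms is awkward to handle numerically, and both have their maximum at the same $p$.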