CS 369 Tutorial 5

4 April 2014
1. (a) 48% of all teenagers own a skateboard and 39% of all teenagers own both a
skateboard and roller blades. What is the probability that a teenager owns
roller blades given that the teenager owns a skateboard?
Answer: The conditional probability is given by the formula
\[ P(A \mid B) = \frac{P(A, B)}{P(B)}. \]
We have P(skate) = 0.48 and P(skate, roller) = 0.39. Then
\[ P(\text{roller} \mid \text{skate}) = \frac{P(\text{skate}, \text{roller})}{P(\text{skate})} = \frac{0.39}{0.48} = 0.8125. \]
(b) 88% of all households have a television and 51% of all households have both a
television and a DVD. What is the probability that a household has a DVD
given that it has a television?
Answer:
\[ P(\text{dvd} \mid \text{tv}) = \frac{P(\text{dvd}, \text{tv})}{P(\text{tv})} = \frac{0.51}{0.88} \approx 0.58. \]
(c) 84% of the houses have a garage and 65% of the houses have both a garage and
a garden. What is the probability that a house has a garden given that it has a
garage?
Answer:
\[ P(\text{garden} \mid \text{garage}) = \frac{0.65}{0.84} \approx 0.77. \]
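The three parts of Question 1 apply the same formula, so they can be checked with one small helper. This is a minimal sketch; the function name `conditional` and the variable names are chosen here for illustration and do not come from the tutorial.

```python
def conditional(p_joint: float, p_given: float) -> float:
    """Return P(A | B) = P(A, B) / P(B)."""
    if not 0 < p_given <= 1:
        raise ValueError("P(B) must be in (0, 1]")
    if not 0 <= p_joint <= p_given:
        raise ValueError("P(A, B) must lie between 0 and P(B)")
    return p_joint / p_given


# Inputs are the percentages given in parts (a), (b) and (c).
p_roller_given_skate = conditional(0.39, 0.48)
p_dvd_given_tv = conditional(0.51, 0.88)
p_garden_given_garage = conditional(0.65, 0.84)

print(round(p_roller_given_skate, 4))   # 0.8125
print(round(p_dvd_given_tv, 2))         # 0.58
print(round(p_garden_given_garage, 2))  # 0.77
```

The range checks reflect that a joint probability can never exceed the probability of the conditioning event.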
2. The survival probabilities for men are as follows:
i) probability that a man lives at least 70 years: 80%
ii) probability that a man lives at least 80 years: 50%.
What is the probability that a man lives at least 80 years given that he has just
turned 70?
Answer: Note that if a man lives at least 80 years, then he also lives
at least 70 years. So P(70, 80) = P(80). Using the definition of conditional
probability we find
\[ P(80 \mid 70) = \frac{P(70, 80)}{P(70)} = \frac{0.5}{0.8} = 0.625. \]
3. There is a new diagnostic test for a disease that occurs in about 0.05% of the
population. The test is not perfect but will detect a person with the disease 99%
of the time. It will, however, say that a person without the disease has the disease
about 3% of the time. A person is selected at random from the population
and the test indicates that this person has the disease. What are the conditional
probabilities that
(a) the person has the disease?
(b) the person does not have the disease?
Answer: Let D be the event that the person has the disease and T be the event
that the test gives a positive result. We need to find $P(D \mid T)$ and
$P(D^c \mid T)$, where $D^c$ is the event that the person does not have the
disease. We are given that $P(D) = 0.0005$, $P(T \mid D) = 0.99$ and
$P(T \mid D^c) = 0.03$. We can also find $P(D^c) = 1 - P(D) = 0.9995$.
The outcome space $\mathcal{X}$ for the random experiment where we assess whether a
person has the disease or not consists of only two outcomes, ‘the person has the
disease’ and ‘the person does not have the disease’. So to find the marginal
distribution P(T) we need to sum joint probabilities over all (two in this case)
possible outcomes:
\[ P(y) = \sum_{x \in \mathcal{X}} P(x, y) = \sum_{x \in \mathcal{X}} P(y \mid x) P(x) = P(y \mid D) P(D) + P(y \mid D^c) P(D^c) \]
and
\[ P(T) = P(T \mid D) P(D) + P(T \mid D^c) P(D^c) = 0.99 \times 0.0005 + 0.03 \times 0.9995 = 0.03048. \]
Applying Bayes’ theorem we find
\[ P(D \mid T) = \frac{P(T \mid D) P(D)}{P(T)} = \frac{0.99 \times 0.0005}{0.03048} \approx 0.0162, \]
\[ P(D^c \mid T) = \frac{P(T \mid D^c) P(D^c)}{P(T)} = \frac{0.03 \times 0.9995}{0.03048} \approx 0.9838. \]
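The calculation above (law of total probability followed by Bayes’ theorem) can be reproduced in a few lines. This is a sketch only; the function name `posterior` and its parameter names are illustrative assumptions, not notation from the tutorial.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float):
    """Return (P(T), P(D|T), P(D^c|T)) for a positive test result T.

    prior               = P(D)
    sensitivity         = P(T | D)
    false_positive_rate = P(T | D^c)
    """
    p_not_d = 1.0 - prior
    # Law of total probability: P(T) = P(T|D)P(D) + P(T|D^c)P(D^c)
    p_t = sensitivity * prior + false_positive_rate * p_not_d
    # Bayes' theorem for each of the two outcomes
    p_d_given_t = sensitivity * prior / p_t
    p_not_d_given_t = false_positive_rate * p_not_d / p_t
    return p_t, p_d_given_t, p_not_d_given_t


p_t, p_d_given_t, p_not_d_given_t = posterior(0.0005, 0.99, 0.03)
print(round(p_t, 5))              # 0.03048
print(round(p_d_given_t, 4))      # 0.0162
print(round(p_not_d_given_t, 4))  # 0.9838
```

The two posterior probabilities sum to 1, which is a quick sanity check on the arithmetic.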
4. An urn contains 7 red and 11 white balls. Draw one ball at random from the urn.
Let X = 1 if a red ball is drawn, and let X = 0 if a white ball is drawn. Give the
probability mass function, mean and variance of X.
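Answer: The state space for the variable X is $\mathcal{X} = \{0, 1\}$. The
probability mass function is
\[ f(0) = \frac{11}{18} \quad \text{and} \quad f(1) = \frac{7}{18}. \]
The mean, second moment and variance are
\[ E(X) = \sum_{x \in \mathcal{X}} x f(x) = 0 \times \frac{11}{18} + 1 \times \frac{7}{18} = \frac{7}{18}, \]
\[ E(X^2) = \sum_{x \in \mathcal{X}} x^2 f(x) = 0^2 \times \frac{11}{18} + 1^2 \times \frac{7}{18} = \frac{7}{18}, \]
\[ V(X) = E(X^2) - (E(X))^2 = \frac{7}{18} - \left(\frac{7}{18}\right)^2 = \frac{77}{18^2}. \]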
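The mean and variance can be verified with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# Probability mass function of X: 11 white balls (X = 0), 7 red balls (X = 1).
pmf = {0: Fraction(11, 18), 1: Fraction(7, 18)}
assert sum(pmf.values()) == 1  # a pmf must sum to one

mean = sum(x * p for x, p in pmf.items())              # E(X)
second_moment = sum(x**2 * p for x, p in pmf.items())  # E(X^2)
variance = second_moment - mean**2                     # V(X) = E(X^2) - (E(X))^2

print(mean)      # 7/18
print(variance)  # 77/324
```

Here 77/324 is the same value as 77/18², since 18² = 324.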
5. The joint probability density function of random variables X and Y on the interval
(0, ∞) is
\[ p_{XY}(x, y) = \frac{1}{30} e^{-\frac{x}{5} - \frac{y}{6}}. \]
(a) Find the marginal distribution for X and for Y. (You can calculate the
integral using www.wolframalpha.com if you need to. The syntax to calculate
$\int_a^b f(x, y)\,dx$ is “integrate f(x, y) from a to b with respect to x”.)
Answer: Let $p_1(x) = \frac{1}{5} e^{-\frac{x}{5}}$ and $p_2(y) = \frac{1}{6} e^{-\frac{y}{6}}$. Note that
\[ p_{XY}(x, y) = p_1(x) p_2(y). \]
Also note that $p_1$ and $p_2$ are probability density functions for exponential
distributions with parameters 1/5 and 1/6. We could calculate the integrals to find
the marginal distributions, but because the joint probability density function
is a product of two 1-dimensional probability density functions, one depending
only on x and the other only on y, these functions are the probability density
functions for variables X and Y, and the marginal distributions are just
exponential distributions with parameters 1/5 and 1/6.
(b) Are X and Y independent?
Answer: As we can see from question (a), the variables are independent
because their joint probability density function is a product of their marginal
probability density functions.
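As a numerical cross-check of parts (a) and (b), one can integrate the joint density over y and compare the result with $p_1(x)$. The sketch below uses a plain midpoint rule from the standard library rather than WolframAlpha; the truncation point `upper`, the number of `steps` and the test point `x = 2.0` are arbitrary choices made for this illustration.

```python
import math


def p_xy(x: float, y: float) -> float:
    """Joint density (1/30) * exp(-x/5 - y/6) on x > 0, y > 0."""
    return math.exp(-x / 5 - y / 6) / 30


def marginal_x(x: float, upper: float = 200.0, steps: int = 100_000) -> float:
    """Approximate the integral of p_xy(x, y) over y in (0, upper) by the midpoint rule.

    The infinite upper limit is truncated at `upper`; the neglected tail is of
    order exp(-upper / 6), which is negligible for upper = 200.
    """
    h = upper / steps
    return h * sum(p_xy(x, (k + 0.5) * h) for k in range(steps))


x = 2.0
approx = marginal_x(x)
exact = math.exp(-x / 5) / 5  # p_1(x), the Exponential density with parameter 1/5
print(abs(approx - exact) < 1e-6)  # True

# Independence: the joint density factorises as p_1(x) * p_2(y) at any point.
y = 3.0
p2 = math.exp(-y / 6) / 6
factorises = math.isclose(p_xy(x, y), exact * p2)
print(factorises)  # True
```

Agreement at a few points does not prove the factorisation, but it is a useful check on the algebra.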
6. Which distribution of the ones discussed in lectures would you use to model each
of the following random variables?
(a) A woman is trying to hit the 20 on a dart board. Let X1 = the number of
throws before she hits the 20.
Answer: Geometric, since each throw is a Bernoulli variable with outcomes
hit (success) and miss (failure).
(b) X2 = the time between eruptions of a volcano.
Answer: Exponential. Volcano eruptions can be modelled as a Poisson process, i.e.,
a process where events happen through time at some constant rate. The time
between events in a Poisson process is exponentially distributed.
(c) X3 = the number of eruptions of a volcano in 1000 years.
Answer: Poisson. The number of events of a Poisson process in an
interval of given length is Poisson distributed.
(d) X4 = the time between 3 eruptions of a volcano.
Answer: Gamma. The sum of independent exponentially distributed variables with
the same rate is Gamma distributed.
(e) A shop is open between 11 and 3 and customers arrive at any time at an
apparently constant rate. X5 = the arrival time of any particular customer.
Answer: Uniform. The times when customers arrive are just points randomly thrown on the interval between 11 and 3.
(f) A woman is trying to measure the length of a deck with a tape measure. The
deck is 20m long. X6 = the measurements obtained by the woman.
Answer: Normal. The measurements should be concentrated around the
true value.
(g) A woman likes only some of the songs in her mp3 library that she is listening
to on random play. X7 = whether or not she likes the next song.
Answer: Bernoulli. The woman either likes or does not like a randomly
chosen song. So assessing whether the woman likes the song or not is a Bernoulli
trial.
(h) A student sits 25 tests in a year and needs to get beyond a certain threshold
in each test to pass. X8 = Number of tests in which the threshold is achieved.
Answer: Binomial. Getting beyond the threshold is a Bernoulli trial. And
the number of successes follows the Binomial distribution.
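The link between the Exponential and Poisson answers in parts (b) and (c) can be illustrated by simulation: drawing exponential gaps between events and counting how many events fall in a fixed window should give counts whose mean and variance are both close to rate × window length, as expected for a Poisson distribution. The rate of 0.01 eruptions per year, the seed and the number of repetitions below are hypothetical values chosen only for this sketch.

```python
import random
import statistics


def count_events(rate: float, horizon: float, rng: random.Random) -> int:
    """Count events in (0, horizon) when gaps between events are Exponential(rate)."""
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(rate)  # time until the next event
        if t > horizon:
            return n
        n += 1


rng = random.Random(369)        # fixed seed so the run is repeatable
rate, horizon = 0.01, 1000.0    # hypothetical: one eruption per 100 years on average
counts = [count_events(rate, horizon, rng) for _ in range(5000)]

expected = rate * horizon       # Poisson mean (and variance) for the window
print(expected, statistics.mean(counts), statistics.variance(counts))
```

Both sample statistics should land near 10 for these settings, up to simulation noise.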
7. There are 10 questions in a test. Students find that some questions are easy to
answer and some are not. A survey was taken of 12 students where each student
was asked how many questions they found easy to answer. The results of the
survey were D = (2, 5, 8, 9, 4, 5, 8, 1, 10, 0, 4, 5).
(a) For a single data point, what is an appropriate model to use? That is, which
distribution is a single data point drawn from?
Answer: The event that a student finds a question easy to answer is similar
to tossing a coin. Having 10 questions is equivalent to tossing a coin 10 times.
So the appropriate distribution is the Binomial distribution.
(b) What are the parameters of this model, what do they represent in the problem
and are they known or unknown?
Answer: The binomial distribution has two parameters, n and p. n represents the
number of questions in the test, which we already know to be 10.
The parameter p is the probability that a student finds a question easy to
answer and is unknown; it is the parameter we need to estimate.
(c) By using the pdf for this distribution and assuming independence of the data
points, write down the likelihood function for the model.
Answer: If the ith student finds $D_i$ questions easy, then
\[ L(D \mid p) = \prod_{i=1}^{12} \binom{10}{D_i} p^{D_i} (1 - p)^{10 - D_i} = p^{\sum_{i=1}^{12} D_i} (1 - p)^{\sum_{i=1}^{12} (10 - D_i)} \prod_{i=1}^{12} \binom{10}{D_i}. \]
(d) Propose a way to estimate any unknown parameters of your model.
Answer: One good way to estimate a parameter when the likelihood function is
available is to maximise this function with respect to the unknown parameter.
The point at which the maximum is reached can be
used as an estimator (the maximum likelihood estimator) of the parameter.
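A minimal sketch of this idea for the survey data: evaluate the binomial log-likelihood on a grid of candidate values of p and pick the largest. The grid resolution of 0.001 is an arbitrary choice for this illustration. The closed-form value used for comparison, the total number of easy questions divided by the total number of questions asked, is the standard binomial maximum likelihood estimate and is not derived in the tutorial itself.

```python
import math

D = [2, 5, 8, 9, 4, 5, 8, 1, 10, 0, 4, 5]  # survey results from the question
n = 10                                     # questions per test


def log_likelihood(p: float) -> float:
    """Log of L(D | p) for independent Binomial(n, p) observations, 0 < p < 1."""
    return sum(
        math.log(math.comb(n, d)) + d * math.log(p) + (n - d) * math.log(1 - p)
        for d in D
    )


# Grid search over p in (0, 1); the endpoints are excluded because log(0) is undefined.
grid = [k / 1000 for k in range(1, 1000)]
p_grid = max(grid, key=log_likelihood)

# Standard closed-form maximiser for comparison: sum(D) / (n * number of students).
p_closed = sum(D) / (n * len(D))

print(p_grid)               # grid point nearest the maximum
print(round(p_closed, 4))   # 0.5083
```

Working with the log-likelihood rather than the likelihood avoids multiplying many small numbers and does not change where the maximum occurs.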