
確率統計特論 (Probability & Statistics)
May 14, 2014
Lesson 4
Expectation, Variance, Moment…
Today's topics
• expectation
• Markov’s inequality
• variance, covariance, moment
• Chebyshev’s inequality
来嶋 秀治 (Shuji Kijima)
Dept. Informatics,
Graduate School of ISEE
Today’s topic 1
Expectation
Expectation
• Expectation (期待値) of a discrete random variable X is defined by
    E[X] = Σ_{x∈Ω} x·f(x),
only when the right-hand side converges absolutely (絶対収束),
i.e., Σ_{x∈Ω} |x|·f(x) < ∞ holds.
If this is not the case, we say "the expectation does not exist."
• Expectation (期待値) of a continuous random variable X is defined by
    E[X] = ∫_{−∞}^{+∞} x·f(x) dx.
Definitions
• k-th moment (k次の積率) of X:  E[X^k]
• variance (分散) of X:  Var[X] := E[(X − E[X])²]
• standard deviation (標準偏差) of X:  σ[X] := √Var[X]
• covariance (共分散) of X, Y:  Cov[X,Y] := E[(X − E[X])(Y − E[Y])]
Compute expectations of distributions
*Ex 2.
Discrete
(*i) Bernoulli distribution B(1,p).
(*ii) Binomial distribution B(n,p).
(iii) Geometric distribution Ge(p).
(iv) Poisson distribution Po(λ).
Continuous
(v) Exponential distribution Ex(λ).
(vi) Normal distribution N(μ, σ²).
Ex. Expectation of Geom. distr.
Thm.
The expectation of XGe(p) is (1-p)/p
proof
E[X] = 0p + 1(1-p)p + 2(1-p)2p + 3(1-p)3p +…
(1-p)E[X] =
pE[X] =
(1-p)p + 1(1-p)2p + 2(1-p)3p +…
(1-p)p + (1-p)2p + (1-p)3p +…
= (1-p)p/(1-(1-p)) = 1-p.
Thus, E[X] = (1-p)/p.
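As a sanity check, the claim E[X] = (1−p)/p for this failures-before-first-success version of Ge(p) can be verified by simulation (a sketch; the sampler is written directly from the definition, not from any library's geometric convention, and p and the trial count are arbitrary):

```python
import random

def sample_geometric(p, rng):
    """Count failures before the first success (support 0, 1, 2, ...)."""
    k = 0
    while rng.random() >= p:  # each trial fails with probability 1-p
        k += 1
    return k

rng = random.Random(0)
p = 0.25
n = 200_000
mean = sum(sample_geometric(p, rng) for _ in range(n)) / n
# mean should be close to (1-p)/p = 3.0
```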
Properties of Expectations
Thm.
For an arbitrary constant c,
E[c] = c
E[cX] = cE[X]
E[X+c] = E[X]+c
Linearity of expectations
Thm. (linearity of expectation; 期待値の線形性)
    E[Σ_{i=1}^n X_i] = Σ_{i=1}^n E[X_i]
proof. (for two variables; the general case follows by induction)
    E[X+Y] = Σ_x Σ_y (x+y)·Pr[X=x ∩ Y=y]
           = Σ_x Σ_y (x+y)·f(x,y)
           = Σ_x Σ_y x·f(x,y) + Σ_x Σ_y y·f(x,y)
           = Σ_x x·(Σ_y f(x,y)) + Σ_y y·(Σ_x f(x,y))
           = Σ_x x·f(x) + Σ_y y·f(y)
           = E[X] + E[Y]
Linearity of expectations
Thm. (linearity of expectation; 期待値の線形性)
    E[Σ_{i=1}^n X_i] = Σ_{i=1}^n E[X_i]
proof. (continuous case)
    E[X+Y] = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x+y)·f(x,y) dx dy
           = ∫∫ x·f(x,y) dx dy + ∫∫ y·f(x,y) dx dy
           = ∫_{−∞}^{+∞} x·(∫_{−∞}^{+∞} f(x,y) dy) dx + ∫_{−∞}^{+∞} y·(∫_{−∞}^{+∞} f(x,y) dx) dy
           = ∫_{−∞}^{+∞} x·f(x) dx + ∫_{−∞}^{+∞} y·f(y) dy
           = E[X] + E[Y]
Application of linearity of expectation
Thm.
The expectation of X ~ B(n,p) is np.
proof
Suppose X1,…,Xn are i.i.d. B(1,p);
then X = X1+…+Xn follows B(n,p).
    E[Xi] = 1·p + 0·(1−p) = p
    E[X] = E[Σ_i Xi] = Σ_i E[Xi] = Σ_i p = np
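The linearity argument above can be mirrored in a short simulation (a sketch; n and p are arbitrary choices): the empirical mean of a sum of n Bernoulli(p) variables should approach np.

```python
import random

rng = random.Random(1)
n, p, trials = 20, 0.3, 100_000

# X = X_1 + ... + X_n with X_i i.i.d. Bernoulli(p); E[X] = np by linearity
total = 0
for _ in range(trials):
    total += sum(1 for _ in range(n) if rng.random() < p)
mean = total / trials
# mean should be close to n * p = 6.0
```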
Ex. Coupon collector
• Bikkuriman stickers (ビックリマンシール)
• Pokémon cards (ポケモンカード)
 There are n kinds of coupons.
 How many coupons do you need to draw, in expectation,
before having drawn each coupon at least once?
Suppose you have already drawn k kinds of coupons.
• The probability that the next coupon is a new kind is (n−k)/n.
• The expected number of draws to get a new kind is thus n/(n−k),
  since this waiting time is geometric with success probability (n−k)/n.
Thm.
The expected number of draws is
    E[X] = Σ_{k=0}^{n−1} n/(n−k) = n·Σ_{i=1}^n 1/i = n·H_n,
where H_n := Σ_{i=1}^n 1/i is the n-th harmonic number (≈ ln n).
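The theorem can be checked numerically (a sketch; n = 50 and the trial count are arbitrary): the empirical average number of draws should be close to n·H_n.

```python
import random

def coupon_draws(n, rng):
    """Draw uniformly among n coupon kinds until every kind has appeared."""
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

n = 50
H_n = sum(1 / i for i in range(1, n + 1))
expected = n * H_n  # the theorem's value, about 225 for n = 50

rng = random.Random(2)
trials = 20_000
mean = sum(coupon_draws(n, rng) for _ in range(trials)) / trials
# mean should be close to expected
```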
 What is the probability of completion after m trials?
Today's topic 2
Markov’s inequality
Markov’s inequality
Thm. Markov’s inequality
Let X be a nonnegative random variable; then
    Pr[X ≥ a] ≤ E[X]/a
holds for any a > 0.
Proof.
    E[X] = ∫_0^∞ x·f(x) dx = ∫_0^a x·f(x) dx + ∫_a^∞ x·f(x) dx
         ≥ ∫_a^∞ x·f(x) dx
         ≥ ∫_a^∞ a·f(x) dx = a·Pr[X ≥ a].
Thus,
    Pr[X ≥ a] ≤ E[X]/a.
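The bound can be observed empirically (a sketch; the exponential distribution is an arbitrary choice of nonnegative r.v., and the empirical mean stands in for E[X]):

```python
import random

rng = random.Random(3)
samples = [rng.expovariate(1.0) for _ in range(100_000)]  # nonnegative, E[X] = 1
mean = sum(samples) / len(samples)

# For each a, the tail Pr[X >= a] must sit below E[X]/a (Markov's inequality).
tails = {a: sum(1 for x in samples if x >= a) / len(samples)
         for a in (0.5, 1.0, 2.0, 5.0)}
```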
 What is the probability of completion after m trials?
Using Markov's inequality,
    Pr[X ≥ m] ≤ E[X]/m = n·H_n/m.
e.g., n=108, m=1000:
    Pr[Completion] ≥ 1 − Pr[X ≥ 1000] ≥ 1 − (108·H_108)/1000 ≈ 0.4
too loose?
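Plugging in the slide's numbers (a sketch; H_n is computed directly from its definition):

```python
n, m = 108, 1000
H_n = sum(1 / i for i in range(1, n + 1))
E_X = n * H_n            # expected draws to complete, about 568.5
markov_bound = E_X / m   # Pr[X >= 1000] <= E[X]/1000
completion_lb = 1 - markov_bound  # lower bound on Pr[Completion], about 0.43
```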
Today's topic 3
Moment & Variance
Motivation
Consider the following three distributions.
Distr. 1.
• Pr[X=0] = 1/3
• Pr[X=1] = 1/3
• Pr[X=2] = 1/3
Distr. 2.
• Pr[X=k] = 2^{−(k+1)} for k=0,1,2,…
Distr. 3.
• Pr[X=0] = 2/3
• Pr[X=1] = 0
• Pr[X=2^k] = 2^{−2k} for k=1,2,…
All three have the same expectation E[X] = 1, but very different tails:
    Distr. 1:  Pr[X>1] = 1/3,  Pr[X>2] = 0,     Pr[X>10^6] = 0
    Distr. 2:  Pr[X>1] = 1/4,  Pr[X>2] = 1/8,   Pr[X>1000] = 1/512
    Distr. 3:  Pr[X>1] = 1/3,  Pr[X>2] = 1/12,  Pr[X>1000] ≈ 1/192
Properties of Var and Cov
Thm.
Var[X] = E[X²] − (E[X])²
Cov[X,Y] = E[XY] − E[X]E[Y]
Var[X+Y] = Var[X] + Var[Y] + 2Cov[X,Y]
    E[(X−E[X])²] = E[X² − 2X·E[X] + (E[X])²]
                 = E[X²] − 2E[X]·E[X] + (E[X])²
                 = E[X²] − (E[X])²
    Cov[X,Y] = E[(X−E[X])(Y−E[Y])]
             = E[XY − X·E[Y] − Y·E[X] + E[X]E[Y]]
             = E[XY] − 2E[X]E[Y] + E[X]E[Y]
             = E[XY] − E[X]E[Y]
    Var[X+Y] = E[(X+Y)²] − (E[X+Y])²
             = E[X² + 2XY + Y²] − (E[X] + E[Y])²
             = E[X²] − (E[X])² + E[Y²] − (E[Y])² + 2E[XY] − 2E[X]E[Y]
             = Var[X] + Var[Y] + 2Cov[X,Y]
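These identities hold exactly for empirical (population-style) moments as well, so they can be checked mechanically (a sketch on arbitrarily correlated samples):

```python
import random

rng = random.Random(4)
xs = [rng.gauss(0, 1) for _ in range(50_000)]
ys = [x + rng.gauss(0, 1) for x in xs]  # Y deliberately correlated with X

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return mean([(t - m) ** 2 for t in v])

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return mean([(a - mu) * (b - mv) for a, b in zip(u, v)])

# Var[X+Y] = Var[X] + Var[Y] + 2 Cov[X,Y], term by term
lhs = var([x + y for x, y in zip(xs, ys)])
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
# lhs and rhs agree up to floating-point error
```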
Properties of Var and Cov
Thm. Suppose X and Y are independent
E[XY] = E[X]E[Y]
Cov[X,Y] = 0
Var[X+Y] = Var[X] + Var[Y]
E[XY] = x y xy Pr[X=xY=y]
= x y xy Pr[X=x] Pr[Y=y]
= (x x Pr[X=x])(y y Pr[Y=y])
= E[X]E[Y]
Cov[X,Y] = E[XY] –E[X]E[Y]
=0
26
Properties of Var and Cov
Thm. Suppose Xi (i=1,…,n) are mutually independent
Var[X1+X2+…+Xn] = Var[X1] + Var[X2] +… +Var[Xn]
Today's topic 4
Chebyshev’s inequality
Chebyshev’s inequality
Thm. Chebyshev's inequality
For any a > 0,
    Pr[|X − E[X]| ≥ a] ≤ Var[X]/a².
proof.
Remark that |X − E[X]| ≥ a holds if and only if (X − E[X])² ≥ a².
Using Markov's inequality on the nonnegative r.v. (X − E[X])²,
    Pr[|X − E[X]| ≥ a] = Pr[(X − E[X])² ≥ a²] ≤ E[(X − E[X])²]/a² = Var[X]/a².
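A quick empirical check of the bound (a sketch; the uniform distribution and the thresholds are arbitrary choices):

```python
import random

rng = random.Random(5)
samples = [rng.uniform(0, 1) for _ in range(100_000)]
m = sum(samples) / len(samples)
v = sum((x - m) ** 2 for x in samples) / len(samples)  # should be near 1/12

# Each deviation probability must sit below Var[X]/a^2 (Chebyshev's inequality).
bounds_ok = all(
    sum(1 for x in samples if abs(x - m) >= a) / len(samples) <= v / a ** 2
    for a in (0.1, 0.2, 0.3, 0.4)
)
```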
Cor. Chebyshev’s inequality
For any t > 0,
    Pr[|X − E[X]| ≥ t·σ[X]] ≤ 1/t².
proof.
Set a = t·σ[X] in Chebyshev's inequality; then Var[X]/a² = σ[X]²/(t²·σ[X]²) = 1/t².
Using Chebyshev's inequality,
e.g., n=108, m=1000:
    Pr[Completion] ≥ 1 − Pr[X ≥ 1000] = ??
Ex. 2. Write X = X₁ + … + Xₙ, where Xᵢ is the number of draws needed to get the i-th new kind (geometric with success probability pᵢ = (n−i+1)/n). Since the Xᵢ are independent,
    Var[X] = Σ_{i=1}^n Var[Xᵢ] = Σ_{i=1}^n (1−pᵢ)/pᵢ²
           ≤ Σ_{i=1}^n 1/pᵢ²
           = Σ_{i=1}^n (n/(n−i+1))²
           = n²·Σ_{i=1}^n 1/i²
           ≤ n²·π²/6.
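Combining this variance bound with Chebyshev's inequality gives a concrete number (a sketch; whether one uses the exact variance sum or the n²π²/6 bound, and how E[X] is rounded, shifts the final constant slightly, which is presumably why the following slide reports a somewhat different value):

```python
import math

n, m = 108, 1000
E_X = n * sum(1 / i for i in range(1, n + 1))   # about 568.5

# Exact sum for Var[X] and the slide's closed-form bound n^2 * pi^2 / 6.
var_exact = sum((1 - p) / p ** 2
                for p in ((n - i + 1) / n for i in range(1, n + 1)))
var_bound = n ** 2 * math.pi ** 2 / 6

a = m - E_X                # deviation needed to miss completion after m draws
cheb = var_exact / a ** 2  # Pr[X >= m] <= Pr[|X - E[X]| >= a] <= this
completion_lb = 1 - cheb   # roughly 0.9
```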
Using Chebyshev's inequality,
e.g., n=108, m=1000:
    Pr[Completion] ≥ 1 − Pr[X ≥ 1000] ≥ 0.92
still loose?
 → Chernoff's bound
Law of large numbers
Law of large numbers (大数の法則)
Def.
A sequence {Xn} converges to X in probability (Xに確率収束する) if
    ∀ε > 0,  lim_{n→∞} Pr[|Xn − X| ≤ ε] = 1.
i.i.d. = independent and identically distributed (独立同一分布)
Thm. (law of large numbers; 大数の法則)
Let r.v. X1,…,Xn be i.i.d. with expectation μ and variance σ²;
then (X1+…+Xn)/n converges to μ in probability, i.e.,
    ∀ε > 0,  lim_{n→∞} Pr[|(X1+…+Xn)/n − μ| ≤ ε] = 1.
proof.
Using Chebyshev's inequality (recall Lesson 3): since E[(X1+…+Xn)/n] = μ and Var[(X1+…+Xn)/n] = σ²/n,
    Pr[|(X1+…+Xn)/n − μ| ≥ ε] ≤ σ²/(n·ε²) → 0  as n → ∞.
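The statement can be visualized by simulation (a sketch; uniform(0,1) variables with μ = 0.5 and an arbitrary ε): the probability that the sample mean lands within ε of μ should rise toward 1 as n grows.

```python
import random

rng = random.Random(6)

def sample_mean(n):
    """Mean of n i.i.d. uniform(0,1) draws (mu = 0.5)."""
    return sum(rng.random() for _ in range(n)) / n

# Estimate Pr[|mean - mu| <= eps] for growing n.
eps = 0.05
freqs = {}
for n in (10, 100, 1000):
    hits = sum(1 for _ in range(2000) if abs(sample_mean(n) - 0.5) <= eps)
    freqs[n] = hits / 2000
# freqs should increase with n and approach 1
```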