Solution to Assignment 1
Hong Chang, Institute of Computing Technology, Chinese Academy of Sciences
Machine Learning Methods (Spring 2014)

Exercise 3.11

We have seen that, as the size of a data set increases, the uncertainty associated with the posterior distribution over model parameters decreases. Make use of the matrix identity

  (M + v v^T)^{-1} = M^{-1} - (M^{-1} v)(v^T M^{-1}) / (1 + v^T M^{-1} v)

to show that the uncertainty σ²_N(x) associated with the linear regression function given by (3.59) satisfies

  σ²_{N+1}(x) ≤ σ²_N(x).

(3.59): σ²_N(x) = 1/β + φ(x)^T S_N φ(x)

Here β is the precision (inverse variance) of the noise. The first term represents the noise on the data; the second term reflects the uncertainty associated with the parameters.

Solution to Exercise 3.11

From (3.51),

  S_N^{-1} = S_0^{-1} + β Φ^T Φ = S_0^{-1} + β Σ_{n=1}^{N} φ(x_n) φ(x_n)^T.   (1)

So

  S_{N+1}^{-1} = S_0^{-1} + β Σ_{n=1}^{N+1} φ(x_n) φ(x_n)^T = S_N^{-1} + β φ(x_{N+1}) φ(x_{N+1})^T.

From (3.59), we can express σ²_{N+1}(x) as

  σ²_{N+1}(x) = 1/β + φ(x)^T S_{N+1} φ(x) = σ²_N(x) + φ(x)^T (S_{N+1} - S_N) φ(x).   (2)

Applying the matrix identity to the update for S_{N+1}^{-1} above, with M = S_N^{-1} and v = √β φ(x_{N+1}):

  S_{N+1} = S_N - β S_N φ(x_{N+1}) φ(x_{N+1})^T S_N / (1 + β φ(x_{N+1})^T S_N φ(x_{N+1})).   (3)

Substituting Eqn (3) into Eqn (2), we get

  σ²_{N+1}(x) = σ²_N(x) - β φ(x)^T S_N φ(x_{N+1}) φ(x_{N+1})^T S_N φ(x) / (1 + β φ(x_{N+1})^T S_N φ(x_{N+1})).

Since S_N is positive definite, S_N φ(x_{N+1}) φ(x_{N+1})^T S_N is positive semidefinite and the denominator is positive, so the subtracted term is nonnegative. Hence σ²_{N+1}(x) ≤ σ²_N(x).
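The derivation above can be checked numerically. The following NumPy sketch (the basis functions, prior, and data are arbitrary illustrative choices, not part of the assignment) verifies both the rank-one update (3) and the monotonicity σ²_{N+1}(x) ≤ σ²_N(x):

```python
import numpy as np

# Numerical check of Exercise 3.11; all concrete choices below are assumptions.
rng = np.random.default_rng(0)
beta = 2.0                                   # noise precision β
M = 4                                        # number of basis functions
S0_inv = np.eye(M)                           # prior precision S_0^{-1}

def phi(x):
    return np.array([x**j for j in range(M)])    # simple polynomial basis

xs = rng.uniform(-1, 1, size=10)
Phi = np.stack([phi(x) for x in xs])             # N x M design matrix

SN = np.linalg.inv(S0_inv + beta * Phi.T @ Phi)              # Eq. (1)
v = phi(0.3)                                                 # new point x_{N+1}
SN1 = np.linalg.inv(S0_inv + beta * (Phi.T @ Phi + np.outer(v, v)))

# Eq. (3): the rank-one (Sherman-Morrison) update gives the same S_{N+1}
SN1_upd = SN - beta * (SN @ np.outer(v, v) @ SN) / (1 + beta * v @ SN @ v)
print(np.allclose(SN1, SN1_upd))             # True

x = phi(0.7)                                 # query point
sig2_N  = 1/beta + x @ SN  @ x
sig2_N1 = 1/beta + x @ SN1 @ x
print(sig2_N1 <= sig2_N)                     # True: uncertainty never grows
```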
Understanding by Illustration

[Figure: four panels showing the predictive distribution for Bayesian linear regression with a linear combination of Gaussian basis functions; the mean (red curve) and standard deviation (red shaded region) are shown.]

VC Dimension

1. h(x) = sign(x1² + a): VC dimension = 1.
2. A Gaussian Bayes classifier with equal covariances: VC dimension = 4. The learner gives a linear decision plane in 3D space!
3. Decision boundaries that are circles centered at the origin, of radius a, where the class value predicted inside the circle is specified by the parameter b: VC dimension = 2.

Joint Bayes Classifier (1)

We can compute p(y | x1, x2) by estimating p(x1, x2 | y) and p(y):

  p(y = 1) = 8/16 = 1/2

  p(x1, x2 | y = 0) = [1, 1, 3, 3]/8 = [1/8, 1/8, 3/8, 3/8]
  p(x1, x2 | y = 1) = [3, 3, 0, 2]/8 = [3/8, 3/8, 0, 1/4]

where the probabilities are listed for (x1, x2) = (0, 0), (0, 1), (1, 0), (1, 1) in that order. Then

  p(y = 1 | x1, x2) = p(x1, x2 | y = 1) p(y = 1) / [ p(x1, x2 | y = 1) p(y = 1) + p(x1, x2 | y = 0) p(y = 0) ].

Joint Bayes Classifier (2)

For the test data points, we have

  p(y = 1 | 0, 1) = 3/4  ⇒ predict 1
  p(y = 1 | 1, 0) = 0    ⇒ predict 0
  p(y = 1 | 1, 1) = 2/5  ⇒ predict 0

The error rate on this test set is 2/3.
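The posteriors above follow directly from Bayes' rule applied to the estimated tables; a small exact-arithmetic sketch (the helper name is ours, not from the assignment):

```python
from fractions import Fraction as F

# Joint Bayes posteriors; states are ordered (0,0), (0,1), (1,0), (1,1).
p_y1 = F(1, 2)
lik0 = [F(1, 8), F(1, 8), F(3, 8), F(3, 8)]    # p(x1, x2 | y = 0)
lik1 = [F(3, 8), F(3, 8), F(0, 8), F(2, 8)]    # p(x1, x2 | y = 1)

def posterior_y1(i):
    num = lik1[i] * p_y1
    return num / (num + lik0[i] * (1 - p_y1))

# test points (0,1), (1,0), (1,1) correspond to indices 1, 2, 3
for i in (1, 2, 3):
    print(posterior_y1(i))     # posteriors 3/4, 0, 2/5
```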
Naive Bayes Classifier (1)

Learn an individual distribution for each feature independently:

  p(x1 | y = 0) = [2, 6]/8 = [1/4, 3/4]
  p(x2 | y = 0) = [4, 4]/8 = [1/2, 1/2]
  p(x1 | y = 1) = [6, 2]/8 = [3/4, 1/4]
  p(x2 | y = 1) = [3, 5]/8 = [3/8, 5/8]

where the probabilities are listed for feature values 0 and 1 in that order.

Naive Bayes Classifier (2)

Predict y as before, now assuming p(x1, x2 | y) = p(x1 | y) p(x2 | y):

  p(y = 1 | 0, 1) = 0.7895 ⇒ predict 1
  p(y = 1 | 1, 0) = 0.2    ⇒ predict 0
  p(y = 1 | 1, 1) = 0.2941 ⇒ predict 0

The error rate on this test set is 2/3.

Gaussian Bayes Classifier (1)

Estimate the mean and covariance of each class, then plot the class-wise Gaussians.

Gaussian Bayes Classifier (2)

Matlab code:

  equalCov = false;  % or true
  learner = gaussBayesClassify(Xtr, Ytr, equalCov);
  class2DPlot(learner, Xtr, Ytr);

[Figure: Bayes classifier boundary with Gaussian class-conditional distributions.]

Gaussian Bayes Classifier (3)

Matlab code:

  useConstant = false;
  equalCov = false;  % or true
  for degree = 1:4
      learner = polyClassify(degree, useConstant, gaussBayesClassify());
      learner = train(learner, Xtr, Ytr, equalCov);
      class2DPlot(learner, Xtr, Ytr);
  end

Gaussian Bayes Classifier (4)

[Figure: polynomial features, arbitrary covariances.]

Gaussian Bayes Classifier (5)

[Figure: polynomial features, equal covariances.]
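The Matlab snippets above rely on course helper functions (gaussBayesClassify, class2DPlot, polyClassify) that are not shown. A self-contained NumPy sketch of the underlying idea, assuming toy two-class data of our own: fit one Gaussian per class, optionally pool the covariances (the equalCov case, which yields a linear boundary), and classify by the larger log posterior.

```python
import numpy as np

# Toy data: two well-separated 2-D Gaussian clusters (illustrative assumption).
rng = np.random.default_rng(0)
X0 = rng.normal([0, 0], 0.5, size=(50, 2))   # class 0 samples
X1 = rng.normal([2, 1], 0.5, size=(50, 2))   # class 1 samples
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

def fit(X, y, equal_cov=False):
    """Estimate prior, mean, covariance per class; pool covariances if asked."""
    params = {}
    for c in (0, 1):
        Xc = X[y == c]
        params[c] = (len(Xc) / len(X), Xc.mean(axis=0), np.cov(Xc.T))
    if equal_cov:  # prior-weighted pooled covariance -> linear decision boundary
        S = sum(p * C for (p, _, C) in params.values())
        params = {c: (p, m, S) for c, (p, m, _) in params.items()}
    return params

def log_post(x, prior, mean, cov):
    # log p(y) + log N(x | mean, cov), dropping class-independent constants
    d = x - mean
    return (np.log(prior) - 0.5 * np.log(np.linalg.det(cov))
            - 0.5 * d @ np.linalg.solve(cov, d))

params = fit(X, y, equal_cov=True)
pred = lambda x: max((0, 1), key=lambda c: log_post(x, *params[c]))
print(pred(np.array([0.1, 0.0])), pred(np.array([1.9, 1.0])))   # 0 1
```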