Applied Econometrics
Martin Huber
Chair of Applied Econometrics - Evaluation of Public Policies
University of Fribourg

Contents of this lecture

1 Digression: consistency (Wooldridge 5.1, Appendix C.3)
  - Definition
2 Heteroskedasticity (Wooldridge 8.1-8.4)
  - Definition
  - Heteroskedasticity robust inference
  - Testing heteroskedasticity
  - Weighted least squares regression
3 Non-random samples (Wooldridge 9.4)
  - Sample selection based on the independent variables
  - Sample selection based on the dependent variable
  - Missing values
  - Outliers
  - Conclusion

Consistency (Wooldridge 5.1, Appendix C.3)

- β̂_n ... estimator of parameter β based on sample size n
- β̂_n is a consistent estimator if, for every ε > 0,

    Pr(|β̂_n − β| > ε) → 0 as n → ∞  ⇔  plim(β̂_n) = β    (1)

- With increasing sample size, n → ∞, the distribution of β̂_n becomes more and more concentrated around the true value β: large deviations from the true value become less and less likely.
- Note: The property 'unbiasedness' (E(β̂_n) = β) refers to a given sample size. Consistency refers to the distribution of the estimator when the sample size becomes very (infinitely) large.
- Property (for any continuous function g): g(plim(β̂_n)) = plim(g(β̂_n))

Example: two estimators of the variance of the error term in the model y_i = β₀ + β₁x_i + u_i with u_i ~ N(0, σ²):

    σ̂² = (1/(n − 2)) Σ_{i=1}^n û_i²    (2)

    σ̃² = (1/n) Σ_{i=1}^n û_i²    (3)

Since E(Σ_{i=1}^n û_i²) = (n − 2)σ² (Wooldridge 2.5, equation 2.61), it follows that

    E(σ̂²) = σ² and plim(σ̂²) = σ²

    E(σ̃²) = ((n − 2)/n) σ² ≠ σ²    (4)

    plim(σ̃²) = plim(((n − 2)/n) σ̂²) = 1 · σ² = σ²    (5)

- σ̂² is unbiased and consistent
- σ̃² is biased (as the degrees of freedom adjustment is omitted), but still consistent

[Figure: sampling distribution of an estimator concentrating around the true parameter value as n grows]

Heteroskedasticity (Wooldridge 8.1)

    Var(u_i | x_1, ..., x_k) = σ_i² ≠ σ²    (6)

- The error variance is not the same for all values of the regressors: MLR.5 is violated
- Assumptions MLR.1-MLR.4 are maintained: OLS remains unbiased and consistent
- Due to the violation of MLR.5, the standard errors have a different form than under homoskedasticity; therefore the 'standard' variance estimator is (generally) biased and inconsistent
- Without correcting for heteroskedasticity, test statistics (t-tests, F-tests etc.) have a different distribution, so that inference (p-values, confidence intervals) is incorrect
- Under heteroskedasticity, OLS is no longer efficient

Heteroskedasticity robust inference (Wooldridge 8.2)

Simple regression model:

    β̂_1 = β_1 + Σ_{i=1}^n (x_i − x̄) u_i / SST_x    (7)

    Var(β̂_1) = Σ_{i=1}^n (x_i − x̄)² σ_i² / SST_x²    (8)

    V̂ar(β̂_1) = Σ_{i=1}^n (x_i − x̄)² û_i² / SST_x²    (9)

Multiple regression model:

    V̂ar(β̂_j) = Σ_{i=1}^n r̂_ij² û_i² / SSR_j²    (10)

where r̂_ij is the i-th residual in a regression of x_j on all other independent variables, and SSR_j is the sum of squared residuals in this regression.

- V̂ar(β̂_j) is not unbiased, but consistent
- Alternative variance estimators include a degrees of freedom correction n/(n − k − 1) (but asymptotically all of these variance estimators are equivalent)
- The square root of V̂ar(β̂_j) is the heteroskedasticity robust estimator of the standard error
- Tests have the same (asymptotic) distribution as before, after replacing the homoskedastic standard error by the heteroskedasticity robust (estimator of the) standard error
- See Eicker (1967), Huber (1967), White (1980)
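To illustrate equation (9), here is a minimal Python/numpy sketch (simulated data; variable names and the data generating process are illustrative assumptions, not from the lecture) comparing the conventional and the heteroskedasticity robust standard error of the slope in a simple regression:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(1, 10, n)
u = rng.normal(0, 0.5 * x)            # error variance grows with x: heteroskedasticity
y = 2.0 + 1.5 * x + u

# OLS estimates and residuals
x_bar = x.mean()
sst_x = np.sum((x - x_bar) ** 2)
beta1 = np.sum((x - x_bar) * y) / sst_x
beta0 = y.mean() - beta1 * x_bar
u_hat = y - beta0 - beta1 * x

# Conventional variance estimator (valid only under homoskedasticity)
sigma2_hat = np.sum(u_hat ** 2) / (n - 2)
se_conv = np.sqrt(sigma2_hat / sst_x)

# Heteroskedasticity robust estimator, equation (9)
se_robust = np.sqrt(np.sum((x - x_bar) ** 2 * u_hat ** 2) / sst_x ** 2)

print(f"conventional SE: {se_conv:.4f}, robust SE: {se_robust:.4f}")
```

In applied work one would typically rely on a canned implementation, e.g. statsmodels' OLS fitted with cov_type="HC0" (equation (9)) or "HC1" (which adds the degrees of freedom correction mentioned above).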
Testing heteroskedasticity (Wooldridge 8.3)

Model and null hypothesis:

    y = β₀ + β₁x₁ + ... + β_k x_k + u    (11)

    H₀: E(u² | x₁, ..., x_k) = σ²    (12)

Breusch-Pagan test for heteroskedasticity:
1 Estimate (11) and compute the residuals û_i
2 Estimate û_i² = δ₀ + δ₁x₁ + ... + δ_k x_k + v
3 Test H₀: δ₁ = ... = δ_k = 0 by means of an F-test for joint significance of all coefficients

The Breusch-Pagan test assumes a linear association between the regressors and the variance of the error term.

White test for heteroskedasticity:
- The model for û_i² additionally includes the squares and cross products of all regressors
- The method tests for those forms of heteroskedasticity which invalidate the conventional OLS standard errors
- Problem: the tests may reject the null hypothesis if one of the assumptions MLR.1-MLR.4 is violated, even if MLR.5 is satisfied
- Therefore, the tests may also be regarded as general specification tests
- Recommendation: when in doubt, use heteroskedasticity robust standard errors!

Weighted least squares regression (Wooldridge 8.4)

- In contrast to OLS, weighted least squares (WLS) minimizes a weighted sum of squared residuals
- Less weight is given to observations with a higher error variance to correct for heteroskedasticity
- In contrast, OLS gives each observation the same weight (which is best when the error variance is always the same)
- → WLS is more precise (smaller standard errors) than OLS in the case of heteroskedasticity (but both estimators are consistent)

Concretely, WLS proceeds as follows:
1 Divide y_i and 1, x_i1, ..., x_ik (for any observation i in the sample) by the heteroskedastic standard error √(σ_i²) to obtain normalized (and thus homoskedastic) values of the initial observations:

    y_i* = y_i/√(σ_i²),  x_i0* = 1/√(σ_i²),  x_i1* = x_i1/√(σ_i²),  ...,  x_ik* = x_ik/√(σ_i²)

2 Run an OLS regression of y_i* on x_i0*, x_i1*, ..., x_ik* (using all observations i in the sample)

In practice, the heteroskedastic errors need to be estimated, so that √(û_i²) rather than the true √(σ_i²) is used in the weighted regression (= feasible GLS).
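Below is a minimal numpy sketch of this feasible GLS procedure. Since the true σ_i² are unknown, I model them here with the common exponential specification (regress log(û_i²) on the regressors and exponentiate the fitted values, which keeps the variance estimates positive) rather than using the raw û_i² directly; the data generating process is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(1, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, x)     # Var(u|x) = x^2: heteroskedasticity

X = np.column_stack([np.ones(n), x])

# Initial OLS fit to obtain residuals
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_ols

# Estimate the variance function: log(u_hat^2) = delta0 + delta1*x + e,
# then sigma2_hat_i = exp(fitted value), positive by construction
g, *_ = np.linalg.lstsq(X, np.log(u_hat ** 2), rcond=None)
sigma2_hat = np.exp(X @ g)

# Steps 1-2: divide every observation by the estimated standard deviation
# and run OLS on the transformed data (= weighted least squares)
w = 1.0 / np.sqrt(sigma2_hat)
beta_wls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)

print("OLS:     ", beta_ols)
print("FGLS/WLS:", beta_wls)
```

Both estimates are consistent for (2.0, 1.5); the FGLS coefficients simply come with smaller standard errors when the variance model is roughly correct.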
Non-random samples (Wooldridge 9.4)

- Violation of MLR.2: the sample is only drawn from a subgroup of the population of interest, or parts of the random sample are not/cannot be used
- Is the sample still random and representative?
- Can the coefficients still be estimated consistently?

Sample selection based on the independent variables:
- The subsample is representative for a part of the population
- The true model also applies to this subsample
- If MLR.1-MLR.4 hold, then the coefficients are estimated consistently in the subsample
- Problem: the true model is unknown. We cannot check whether the coefficients differ between the observed subsample and the unobserved part of the population (e.g. through dummies for various parts of the population and interaction terms)
- Due to such potential effect heterogeneities (different coefficients for different parts of the population), the estimates generally apply only to the observed subgroup (internal validity), but not necessarily to the entire population (external validity)

True model (in which MLR.1-MLR.4 hold):

    y = β₀ + δ₀·1(x₁ > x̄₁) + β₁x₁ + δ₁·1(x₁ > x̄₁)·x₁ + u    (13)

In the unobserved group with x₁ ≤ x̄₁:

    y = β₀ + β₁x₁ + u    (14)

In the observed group with x₁ > x̄₁:

    y = (β₀ + δ₀) + (β₁ + δ₁)x₁ + u = α₀ + α₁x₁ + u    (15)

It is neither known nor estimable whether δ₀ ≠ 0 or δ₁ ≠ 0; therefore (α̂₀, α̂₁) may only be correct for the observed group with x₁ > x̄₁.

Sample selection based on the dependent variable:
- Systematic censoring of the values of the dependent variable generally restricts the location of the regression line and introduces bias
- The reason is that censoring systematically excludes residuals with particular(ly high or low) values
- If the regressors affect the dependent variable, this entails endogeneity: if x_j and y are positively correlated and the sample is restricted to small y, then large x_j must go together with small residuals
- Therefore, Corr(x_j, u | y < y_max) ≠ 0 even if Corr(x_j, u) = 0
- Never select your sample based on y or some function of y!

[Figure: regression line under censoring of the dependent variable]

Example: true model hwage = 10 + 10·education + u
- Person 1: hourly wage 200, years of education 16: u₁ = 30
- Person 2: hourly wage 170, years of education 16: u₂ = 0
- Person 3: hourly wage 140, years of education 16: u₃ = −30

If the sample is restricted to observations with hwage < 170, then among those with education = 16 only individuals with u < 0 are selected: person 3.

With u ~ U(−70, 70), the conditional mean of the error term in the selected sample (hwage ≤ 170) declines with education:

    education                        9    10    11    12    13    14    15    16    17    18
    u_min                          -70   -70   -70   -70   -70   -70   -70   -70   -70   -70
    u_max                           70    60    50    40    30    20    10     0   -10   -20
    E(u | education, hwage ≤ 170)    0    -5   -10   -15   -20   -25   -30   -35   -40   -45
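A short simulation of this example (a sketch; as on the slides, hwage = 10 + 10·education + u with u ~ U(−70, 70)), showing how restricting the sample to hwage ≤ 170 flattens the estimated slope:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
educ = rng.integers(9, 19, n)            # years of education, 9 to 18
u = rng.uniform(-70, 70, n)
hwage = 10 + 10 * educ + u               # true model from the slide

X = np.column_stack([np.ones(n), educ])

def ols_slope(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

keep = hwage <= 170                      # selection on the dependent variable
print("full sample slope:    ", ols_slope(X, hwage))              # close to 10
print("selected sample slope:", ols_slope(X[keep], hwage[keep]))  # well below 10
```

The downward bias in the selected sample reflects exactly the declining E(u | education, hwage ≤ 170) shown in the table above.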
Missing values in variables:
- Observations with missing values are automatically excluded by many statistical software packages
- Why do missing values occur? Are missing values likely correlated with the dependent variable?
- Missing values in one of the regressors: are the values of the dependent variable and of the other regressors systematically different across observations with and without missing values? Instead of dropping observations with missing values, it seems preferable to set the missing variable value to zero and include a dummy for missingness as an additional regressor
- Missing values in the dependent variable: are the values of the regressors systematically different across observations with and without missing values in the dependent variable? These observations cannot be used

Missing values in a regressor:
- Set the missing value to zero and estimate

    y = β₀ + δ₀·1(x₁ missing) + β₁x₁ + u

- If the missing dummy coefficient δ₀ ≠ 0, then it cannot be excluded that the sample is selective w.r.t. the dependent variable y if observations with missing values in the regressor are dropped
- The missing dummy in the regression is thus a test of whether the missing values are problematic (see the sketch at the end of this lecture)
- If it is clear that the missing values are exclusively related to the independent variables, then estimate the model also with the observations without missing values only, for comparison
- Results for this group are in any case correct (internal validity), but for the total sample the interaction term 1(x₁ missing)·x₁ could be missing

Outliers:
- ...have a strong impact on the location of the regression line
- Coding errors vs. 'real' outliers (why do they occur?)
- If outliers are dropped, is the sample still random and representative? Are outliers correlated with the dependent variable?
- Present results with and without outliers
- It might be worth considering median regression rather than mean regression (both estimate the same parameters under a symmetric distribution of the error term):

    E(y | x) = β₀ + β₁x,  Median(y | x) = β₀ + β₁x

[Figure: example data and scatter plot illustrating the impact of an outlier on the regression line]

Conclusion:
- Sample selection based on the dependent variable entails the inconsistency of all regression coefficients
- Under sample selection based on the regressor(s), the results might only be (internally) valid for this part of the population (due to effect heterogeneity)
- Missing values and outliers are not problematic if they occur at random and can then be discarded (which corresponds to randomly drawing a somewhat smaller sample), but they are very often not random!
- → Detailed information about sample selection and data generation is required!
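Finally, the missing-value dummy approach from the slide on missing values in a regressor, as a minimal numpy sketch (simulated data; note that x₁ is drawn with mean zero here so that δ₀ ≈ 0 under purely random missingness — with a nonzero regressor mean, the dummy would pick up β₁·E(x₁) even in the harmless case):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
x1 = rng.normal(0, 2, n)                 # mean zero (see the note above)
y = 1.0 + 2.0 * x1 + rng.normal(0, 1, n)

def missing_dummy_fit(miss):
    """Set missing x1 to zero, add a missingness dummy, run OLS;
    return delta0 and its conventional t-statistic."""
    X = np.column_stack([np.ones(n), miss.astype(float), np.where(miss, 0.0, x1)])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u_hat = y - X @ beta
    sigma2 = u_hat @ u_hat / (n - X.shape[1])
    se_d0 = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1], beta[1] / se_d0

# Missingness unrelated to the model: delta0 insignificant
print(missing_dummy_fit(rng.random(n) < 0.2))

# Missingness selective w.r.t. y: delta0 clearly differs from zero
print(missing_dummy_fit((y > 3) & (rng.random(n) < 0.5)))
```

In the second case the significant dummy flags that dropping the incomplete observations would make the sample selective with respect to y, which is exactly the diagnostic use described above.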