EC352 Econometric Methods: Week 07

Gordon Kemp

Contents

1 Heteroskedasticity in Time Series Regressions
  1.1 Failure of Standard Conditional Homoskedasticity
  1.2 ARCH

2 Advanced Time Series Topics
  2.1 Highly Persistent Time Series
  2.2 Cointegration and Related Issues

3 Serial Correlation and OLS under Weak Dependence
  3.1 Testing for Serial Correlation
  3.2 Remedies for Serial Correlation

1 Heteroskedasticity in Time Series Regressions

1.1 Failure of Standard Conditional Homoskedasticity

Effects on OLS

• Suppose Var(ε_t | x_t) varies with x_t, so that the standard conditional homoskedasticity assumption fails.
• The consequences are similar to those of serial correlation and to those of conditional heteroskedasticity in cross-section regressions:
  – OLS estimates of the coefficients remain consistent.
  – The usual OLS standard errors are invalid.
  – The usual t-tests and F-tests are invalid.

Dealing with Conditional Heteroskedasticity

• As in cross-section regressions, we can go one of two routes:
  1. Compute corrected standard errors and implement corrected t-tests and F-tests. Note that we can also do this for serial correlation provided that exogeneity still holds.
  2. Compute weighted least squares estimates.
• The usual tests for conditional heteroskedasticity can be applied in time-series regressions but are typically only valid in the absence of serial correlation.

1.2 ARCH

Autoregressive Conditional Heteroskedasticity

• A second type of heteroskedasticity can arise specifically in time-series regression models.
• This type of heteroskedasticity is conditional on the past error terms in the regression.
• The first model of this type was the Autoregressive Conditional Heteroskedasticity (ARCH) model.
• The first-order ARCH model has:

  Var(u_t | u_{t−1}, u_{t−2}, …) = α_0 + α_1 u_{t−1}²

• Extensions of this model are widely used in finance for data that exhibit volatility, e.g., stock returns data.

[Figure 1: NYSE Returns (Weekly, Jan 1976 to Mar 1989)]

Example 12.9 (following Example 11.4)

• Here we regress current NYSE composite index returns on their lagged values and then examine the residuals for ARCH effects (the data are from NYSE.dta).
• The first regression gives:

  RETURN̂_t = 0.180 + 0.059 RETURN_{t−1}
             (0.081)  (0.038)
  n = 689, R² = 0.0035, adjusted R² = 0.0020

• The regression of the squared residuals on their first lag gives:

  û_t² = 2.95 + 0.337 û_{t−1}² + error
         (0.44)  (0.036)
  n = 688, R² = 0.114
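Below is a minimal sketch of how this check can be reproduced in Python with statsmodels. A simulated series stands in for the NYSE data (with the real nyse.dta you would load its returns column instead), so the data and variable names here are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_arch

rng = np.random.default_rng(0)
ret = pd.Series(rng.standard_normal(690), name="ret")  # stand-in for NYSE returns

# Step 1: regress returns on their first lag (the regression of Example 11.4).
d = pd.DataFrame({"ret": ret, "ret_lag": ret.shift(1)}).dropna()
ar1 = sm.OLS(d["ret"], sm.add_constant(d["ret_lag"])).fit()

# Step 2: regress the squared residuals on their own first lag; a significant
# slope signals first-order ARCH effects.
u2 = ar1.resid ** 2
d2 = pd.DataFrame({"u2": u2, "u2_lag": u2.shift(1)}).dropna()
arch = sm.OLS(d2["u2"], sm.add_constant(d2["u2_lag"])).fit()
print(arch.params, arch.tvalues, arch.rsquared)

# Packaged alternative: an LM test for ARCH effects up to a chosen lag order.
lm_stat, lm_pval, f_stat, f_pval = het_arch(ar1.resid, nlags=1)
print(lm_stat, lm_pval)
```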
2 Advanced Time Series Topics

2.1 Highly Persistent Time Series

A Note on Deterministic Trends

• If a variable has a linear trend, then it is necessarily non-stationary.
• However, it can still be weakly dependent.
• Example: y_t = β_0 + β_1 t + e_t where e_t is iid.
• Clearly E(y_t) depends on time, so the process is non-stationary.
• However, {y_t} is weakly dependent since Cov(y_t, y_{t+h}) = 0 for all h = 1, 2, ….

Stochastic Trends

• Not all trends are deterministic: some are stochastic.
• The simplest example of a stochastic trend is a random walk:
  – The process {y_t; t = 1, 2, …} follows a pure random walk if:

    y_t = y_{t−1} + e_t

    where e_t is iid with mean 0 and variance σ².
• Random walks are highly persistent:

  y_t = y_{t−1} + e_t = y_{t−2} + e_{t−1} + e_t = … = y_0 + (e_1 + ⋯ + e_{t−1} + e_t)

Moments

• If y_0 is fixed then:

  E(y_t) = y_0,  Var(y_t) = t·σ²

  so the variance depends on time.

Importance of Unit Roots

• When there is high persistence, a shock can have very long-lasting effects.
• Note: we should not confuse trending behaviour with highly persistent behaviour.
• However, some highly persistent processes can also have a trend:
  – For example, a random walk with drift: y_t = α_0 + y_{t−1} + e_t, which implies:

    y_t = y_0 + t·α_0 + (e_1 + ⋯ + e_{t−1} + e_t)

    so:

    E(y_t) = y_0 + t·α_0,  Var(y_t) = t·σ²

Transformations on Highly Persistent Time Series

• If there is high persistence, then weak dependence fails and the usual arguments for the consistency of OLS break down.
• What can we then do?
• If a process has a unit root, that is, is integrated of order 1 (I(1) for short), then the first difference of the process is weakly dependent:

  Δy_t = y_t − y_{t−1} = u_t,  t = 1, 2, …,

  where u_t is weakly dependent.
• However, deciding whether a time series has a unit root requires techniques such as the Dickey-Fuller test.

Unit Root Tests

• The simplest test of whether y_t has a unit root is the Dickey-Fuller (DF) test:
  – Regress Δy_t on an intercept term and y_{t−1}, and reject if the t-statistic is below a critical value from the Dickey-Fuller tables; use Table 18.2 from Wooldridge.
• We can generalize by putting in a time trend, i.e., including t as an additional regressor; use Table 18.3 from Wooldridge.
• We can also generalize to allow for short-run dynamics by including Δy_{t−1}, …, Δy_{t−p} as additional regressors for suitably chosen p (use Table 18.2 if no time trend and Table 18.3 if a time trend is included): this leads to the Augmented Dickey-Fuller (ADF) test.
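As an illustration of the Dickey-Fuller test just described, the following minimal sketch simulates a pure random walk and applies statsmodels' adfuller function, first to the level and then to the first difference; the simulated data and option choices are assumptions for illustration.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(500))  # pure random walk: y_t = y_{t-1} + e_t

# regression="c" puts only an intercept in the test regression;
# regression="ct" would add the time trend discussed above.
stat, pval, usedlag, nobs, crit, _ = adfuller(y, regression="c", autolag="AIC")
print(stat, pval, crit)        # typically fails to reject the unit root

# The first difference should be weakly dependent, and the test agrees.
stat_d, pval_d, *rest = adfuller(np.diff(y), regression="c", autolag="AIC")
print(stat_d, pval_d)          # strongly rejects: the difference is just e_t
```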
2.2 Cointegration and Related Issues

Problems for OLS

• If the process is not weakly dependent, that is, if there is strong persistence in the process, then our sample violates the standard randomness assumption.
• In the case of a random walk, the observations clearly all depend on the initial value.
• This causes problems for OLS.

Problems for OLS (continued)

• Suppose that both x_t and y_t are non-stationary and are not weakly dependent.
• For example, suppose that:

  x_t = x_{t−1} + u_t
  y_t = y_{t−1} + e_t

  where u_t and e_t are iid and jointly independent of one another: thus y_t and x_t follow independent random walks.
• If we estimate a linear regression model of the form:

  y_t = β_0 + β_1 x_t + η_t

  then we run into spurious regression problems.

Spurious Regression with I(1) Processes

• Risk of spurious regressions:
  – We will tend to get a high R-squared and significant t-statistics, leading us to conclude that there is a strong relationship between the two variables when in fact they are completely unrelated.
• Including a deterministic trend in the regression model will not fix the spurious regression problem.

Spurious Regression with I(1) Processes (continued)

• What can we do?
  – Tests (unit root tests, cointegration tests).
  – First-difference all the variables in the model: indeed, if they are I(1), then the first differences are stationary (and weakly dependent).
  – Cointegrating regressions; error correction models.

Cointegration Tests

• Suppose we have identified y_t and x_t as both being I(1); here x_t can be a vector.
• We now wish to test whether they are cointegrated (CI) or not: is there a linear combination of them which is I(0)?
• For this we can use the Engle-Granger test:
  1. Regress y_t on an intercept term and x_t by OLS and let the residuals be û_t.
  2. Regress Δû_t on an intercept term and û_{t−1} (together with lags of Δû_t if required) and reject if the t-statistic is below a critical value from the Engle-Granger tables; use Table 18.4 from Wooldridge.

Cointegration Tests (continued)

• This two-step procedure is similar in structure to the Breusch-Godfrey test for serial correlation, but it uses quite different critical values.
• As with the Dickey-Fuller test, we can also modify the test by putting in a time trend; use Table 18.5 from Wooldridge.
  – We then need to include a time trend in the initial regression as well as in the second regression.

Cointegration Tests (continued)

• A different method of testing is the Johansen procedure.
• This treats y and x symmetrically.
• Whereas the Engle-Granger procedure can only test whether y is cointegrated with x or not, the Johansen procedure can be used to assess how many distinct cointegrating relationships (if any) there are among the variables.

Estimation when Cointegration is Absent

• If y_t and x_t are I(1) but are not CI, then we can run a regression in first differences:

  Δy_t = α_0 + γ_0 Δx_t + u_t    (1)

  where γ_0 is the partial effect of a change in the growth rate of x_t on the growth rate of y_t.
• Here Δy_t and Δx_t will be I(0).
• Provided that Δy_t and Δx_t satisfy the usual conditions, including the condition that Δx_t and u_t are uncorrelated, we will get consistent, asymptotically normal estimates of γ_0.

Estimation when Cointegration is Present

• If y_t and x_t are I(1) and are CI, then we could run the regression in levels:

  y_t = α + β x_t + u_t    (2)

  (we could include a time trend as well).
• This would give a consistent estimate of β (the effect of a change in the level of x_t on the level of y_t) even when endogeneity is present.
• The residuals are:

  û_t = y_t − α̂ − β̂ x_t = u_t − (α̂ − α) − (β̂ − β) x_t,

  so the residual sum of squares will tend to be minimized when β̂ is close to β, since otherwise (β̂ − β) x_t would make a contribution that would tend to explode.

Leads and Lags Estimator

• However, the usual standard errors and asymptotic distribution are not in general valid, because typically there is correlation between u_t and Δx_s for some (or all) s and t.
• One way to handle the possible correlation between Δx_s and u_t is to include leads and lags of Δx_t in Equation (2).
• So, for example, we could include second leads and lags:

  y_t = α_0 + β x_t + φ_0 Δx_t + φ_1 Δx_{t−1} + φ_2 Δx_{t−2} + γ_1 Δx_{t+1} + γ_2 Δx_{t+2} + e_t,

  and estimate this by OLS.
• Doing this will mop up the correlation between u_t and (Δx_{t−2}, …, Δx_{t+2}).

Error Correction Model

• Take the first-difference regression of Equation (1) and include the lagged value of (y − β x) as an additional regressor:

  Δy_t = α_0 + γ_0 Δx_t + δ (y_{t−1} − β x_{t−1}) + u_t

  (we would expect δ < 0).
• This enables us to study the short-run dynamics.
• Engle-Granger Two-Step Procedure: in practice we do not know β, so we include y_{t−1} − β̂ x_{t−1} as a regressor, where β̂ is a consistent estimator of β. A sketch of this procedure follows below.
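Below is a minimal sketch of the Engle-Granger two-step procedure and the error correction model just described, on simulated cointegrated data (the data-generating process and parameter values are assumptions). Note that a plain adfuller on the residuals reports ordinary Dickey-Fuller p-values; because û_t is estimated, the Engle-Granger critical values (Wooldridge Table 18.4) apply instead, which is what statsmodels' coint function uses.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, coint

rng = np.random.default_rng(2)
n = 500
x = np.cumsum(rng.standard_normal(n))        # x_t is a random walk, so I(1)
y = 1.0 + 2.0 * x + rng.standard_normal(n)   # y_t - 1 - 2 x_t is I(0): cointegrated

# Step 1: static levels regression, as in Equation (2).
lev = sm.OLS(y, sm.add_constant(x)).fit()
uhat = lev.resid

# Step 2: unit-root test on the residuals; only the t-statistic is usable here,
# since the reported p-values are ordinary Dickey-Fuller ones. coint() applies
# Engle-Granger style critical values directly.
print(adfuller(uhat, regression="c")[0])     # t-statistic on the lagged residual
print(coint(y, x, trend="c"))                # (t-statistic, p-value, critical values)

# Engle-Granger two-step ECM: regress dy_t on dx_t and the lagged residual.
d = pd.DataFrame({"dy": np.diff(y), "dx": np.diff(x), "ec": uhat[:-1]})
ecm = sm.OLS(d["dy"], sm.add_constant(d[["dx", "ec"]])).fit()
print(ecm.params)                            # expect the 'ec' coefficient < 0
```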
3 Serial Correlation and OLS under Weak Dependence

Serial Correlation and OLS

• OLS will still be consistent provided that Time-Series Assumptions 1-3 are satisfied.
• However, OLS will be inefficient because it does not use the information in the serial correlation, even when the other Gauss-Markov assumptions are satisfied (just as OLS is inefficient under heteroskedasticity).
• OLS fails to take account of how the value of the disturbance for observation (t − 1) provides information on the likely value of the disturbance for observation t, and so on:
  – If the disturbances and the regressors both show positive serial correlation (or both show negative serial correlation), then the usual OLS formula for the variance of the OLS estimator under-states the true variance.
  – Hence, standard errors, t-tests and F-tests are invalid.

3.1 Testing for Serial Correlation

Breusch-Godfrey Test for Serial Correlation

• Suppose:

  y_t = β_0 + β_1 x_t + u_t
  u_t = ρ_1 u_{t−1} + e_t

  where {e_t} is iid (and independent of {x_t}) with mean zero and variance σ², so that the disturbances u_t follow an AR(1) process.
• The null hypothesis is H_0: ρ_1 = 0 (i.e., the disturbances are iid, so there is no serial correlation).
• Procedure (a code sketch follows at the end of Section 3.2):
  1. Run an OLS regression of y_t on x_t to generate OLS residuals û_t = y_t − β̂_0 − β̂_1 x_t.
  2. Run an OLS regression of û_t on x_t and û_{t−1}.
  3. Test the significance of û_{t−1} in the second OLS regression.

3.2 Remedies for Serial Correlation

Remedies

1. Use Feasible Generalized Least Squares.
  • For example, the Cochrane-Orcutt procedure when the errors are AR(1):
    – Run an OLS regression of y_t on x_t to generate OLS residuals û_t = y_t − β̂_0 − β̂_1 x_t.
    – Run an OLS regression of û_t on x_t and û_{t−1} to generate ρ̂_1.
    – Run an OLS regression on the transformed model:

      (y_t − ρ̂_1 y_{t−1}) = β_0 (1 − ρ̂_1) + β_1 (x_t − ρ̂_1 x_{t−1}) + e_t

      to get an efficient estimate of β_1.
2. Another approach is to run OLS and then compute autocorrelation-robust standard errors.
  • The newey command in Stata will do this (but requires selecting a "lag length" parameter).
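Below is a minimal sketch of the Breusch-Godfrey procedure of Section 3.1, run both by hand (steps 1 to 3 above) and via the packaged statsmodels test; the simulated data with AR(1) disturbances are an assumption for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(3)
n = 300
x = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):                        # AR(1) disturbances with rho = 0.6
    u[t] = 0.6 * u[t - 1] + rng.standard_normal()
y = 1.0 + 0.5 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()    # step 1: OLS residuals

# Steps 2-3 by hand: regress the residuals on x and their first lag, then
# look at the t-statistic on the lagged residual.
uhat = pd.Series(res.resid)
d = pd.DataFrame({"u": uhat, "u_lag": uhat.shift(1), "x": x}).dropna()
aux = sm.OLS(d["u"], sm.add_constant(d[["x", "u_lag"]])).fit()
print(aux.tvalues["u_lag"])

# Packaged version: returns (LM statistic, LM p-value, F statistic, F p-value).
print(acorr_breusch_godfrey(res, nlags=1))
```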
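And a minimal sketch of the two remedies of Section 3.2, on similar simulated data (again an assumed setup): a single Cochrane-Orcutt style FGLS step, followed by Newey-West (HAC) standard errors of the kind that Stata's newey command reports.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 300
x = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):                        # AR(1) disturbances with rho = 0.6
    u[t] = 0.6 * u[t - 1] + rng.standard_normal()
y = 1.0 + 0.5 * x + u

ols = sm.OLS(y, sm.add_constant(x)).fit()

# Remedy 1: feasible GLS. Estimate rho from the OLS residuals (here simply by
# regressing u_hat_t on u_hat_{t-1}; the notes obtain it from the auxiliary
# regression on x_t and the lagged residual), then run OLS on the
# quasi-differenced model.
uhat = ols.resid
rho = sm.OLS(uhat[1:], uhat[:-1]).fit().params[0]
fgls = sm.OLS(y[1:] - rho * y[:-1],
              sm.add_constant(x[1:] - rho * x[:-1])).fit()
print(fgls.params)   # intercept estimates beta_0*(1 - rho); slope estimates beta_1

# Remedy 2: keep the OLS point estimates but use Newey-West (HAC) standard
# errors; maxlags plays the role of newey's lag-length parameter.
hac = sm.OLS(y, sm.add_constant(x)).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print(hac.bse)
```

In practice the Cochrane-Orcutt procedure iterates the ρ estimate and the quasi-differencing to convergence; the single step above keeps the sketch short.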