Panel data
Applied Time Series Econometrics
Spring 2014

Advantages of panel data
• More observations
• Time dimension allows taking into account dynamics
• More variation in data (over time, over cross sections)
• Unobserved variables that may be correlated with the variables in the model can be eliminated
Handling panel data
• Some concepts:
• Panel data / longitudinal data / pooled cross section – time series data:
– Same observation units followed over time
• Balanced panel:
– Information available on all observation units for all time periods
• Unbalanced panel:
– For some units, not all time periods available
• Unbalanced panel caused by:
– Some survey respondents stop participation
  • sample attrition
– New individuals taken to sample (to replace exited ones)
– Rotation of sample
– Demographic events
  • firm data: entry of new firms, exit of firms, mergers
  • in individual data: deaths, births
– Errors in data
• Pseudo panel: separate cross sections are aggregated to cells (e.g. based on age cohort, sex etc.) that are followed over time
– Sometimes the term pseudo panel is also used for aggregate panel data, e.g. industry panels or country panels (in contrast to "true" panels that cover micro units)
• The data set can be ordered in alternative ways:
– Stacked by cross section: list all years for the 1st cross-section unit, then for the 2nd, etc.
– Stacked by year: list all observations for the 1st year, then for the 2nd year, etc.
– In either case, variables are stored as columns
– Both are OK, as long as you have a time identifier and a cross-section identifier as variables (columns)
– The data can be unbalanced (no need to have "empty" rows)
• The data can also be in columns by cross-section unit ("wide form") and converted to stacked data ("long form") in Stata
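The conversion from wide form to long form can be sketched with Stata's reshape command. This is a minimal illustration, not the course's own procedure: the file name wide_wages and the column names lwage1980 to lwage1987 are hypothetical.

```stata
* Hypothetical wide-form file: one row per person,
* with columns nr, lwage1980, lwage1981, ..., lwage1987
use wide_wages, clear            // hypothetical file name

* i() names the cross-section identifier, j() the new time variable;
* the stub "lwage" is matched against the year suffix of each column
reshape long lwage, i(nr) j(year)

* Result: one row per person-year, with columns nr, year, lwage
list nr year lwage in 1/8
```

The result is stacked by cross-section unit, with the time and cross-section identifiers as ordinary columns.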
• Data set stacked by cross-section unit or by date
– Columns: date, unit, variable $y_{it}$, variables $x_{it}'$ (1×K vector)

Stacked by cross-section unit:
$$
\begin{bmatrix}
1 & 1 & y_{11} & x_{11}' \\
\vdots & \vdots & \vdots & \vdots \\
T & 1 & y_{1T} & x_{1T}' \\
1 & 2 & y_{21} & x_{21}' \\
\vdots & \vdots & \vdots & \vdots \\
T & 2 & y_{2T} & x_{2T}' \\
\vdots & \vdots & \vdots & \vdots \\
1 & N & y_{N1} & x_{N1}' \\
\vdots & \vdots & \vdots & \vdots \\
T & N & y_{NT} & x_{NT}'
\end{bmatrix}
$$

Stacked by date:
$$
\begin{bmatrix}
1 & 1 & y_{11} & x_{11}' \\
1 & 2 & y_{21} & x_{21}' \\
\vdots & \vdots & \vdots & \vdots \\
1 & N & y_{N1} & x_{N1}' \\
\vdots & \vdots & \vdots & \vdots \\
T & 1 & y_{1T} & x_{1T}' \\
T & 2 & y_{2T} & x_{2T}' \\
\vdots & \vdots & \vdots & \vdots \\
T & N & y_{NT} & x_{NT}'
\end{bmatrix}
$$

• A panel need not have time and cross section as the dimensions
• You could have for example region and industry as the dimensions
– In this case, the regions have to have code numbers; treat them as if they were "time periods"
• Stata data set wagepanel.dta in the course homepage
• Number of cross-section units: 545
• Number of time periods: 8
• Number of obs.: 8x545 = 4360
• Variables:

No.      Variable        Description
1.       nr              person number (cross-section identifier)
2.       year            1980 to 1987 (time identifier)
3.       black           =1 if black
4.       exper           labor market experience in years
5.       hisp            =1 if Hispanic
6.       hours           annual hours worked
7.       married         =1 if married
8.-16.   occ1 to occ9    occ1=1 if occupation = 1 etc.
17.      educ            years of schooling
18.      union           =1 if union member
19.      lwage           log(wage)
20.-26.  d81 to d87      d81=1 if year = 1981 etc.
27.      expersq         exper^2

• When you have imported a panel data set to Stata, you have to tell the program that it is a panel
• Statistics → Time series → Setup and utilities → Declare dataset to be time series
– Then give both the time variable and the panel ID variable
• Command xtset nr year (also tsset nr year works)
– Where year and nr are the names of the variables in the wagepanel.dta data set that indicate time and cross-section
– Commands that start with "xt" are panel commands
• Time series operators L., F., D. work also with panel data
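Declaring the panel and using the time series operators might look as follows. This is a sketch that assumes wagepanel.dta is in the working directory; the names lwage_lag and lwage_diff are made up for the illustration.

```stata
use wagepanel, clear     // assumes wagepanel.dta is in the working directory

* Declare the panel: cross-section identifier first, then the time identifier
xtset nr year

* Summarize the panel structure (units, periods, participation patterns)
xtdescribe

* Time series operators respect panel boundaries: the lag of a person's
* first observation is missing, not the previous person's last value
generate lwage_lag  = L.lwage    // lwage in the previous year
generate lwage_diff = D.lwage    // lwage minus its previous-year value
```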
Pooled estimation
• If we use OLS directly for the panel, this is usually called pooled estimation
$$y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it}$$
• i=1,...,N (cross sections); t=1,...,T (time)
• Use normal regression (regress)
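As a sketch, pooled OLS on the wage panel is an ordinary regress call. The choice of regressors below is only illustrative and is not taken from the course material.

```stata
use wagepanel, clear

* Pooled OLS: all 4360 person-year observations are treated as one
* cross section, ignoring which person each observation belongs to
regress lwage educ exper expersq black hisp married union
```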
• Estimation methods that take into account the panel nature of the data are fixed effects (FE) and random effects (RE) estimation
$$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$$
• $\alpha_i$ = fixed effect or random effect
– depending on the application, also called individual effect, firm effect etc.; other terms: unobserved effect, idiosyncratic effect
– The terms fixed effects and random effects are misleading, since both allow $\alpha_i$ to be random!
Fixed effects (FE) estimation
• A) least squares dummy variables estimation (LSDV)
– Include a dummy variable for all cross section units (except leave out one) and estimate the model with OLS
– The coefficients of the dummies estimate the unobserved effects
– Not useful if there are many cross section units; in the wage data N=545
– In principle it is possible to include the dummies in the form i.nr, e.g. reg lwage educ i.nr, but there may be problems with matrix size
• Incidental parameters problem
– When the number of observations increases, we can use asymptotic properties of the estimators, e.g. consistency
– In time series data T→∞; in cross-section data N→∞; in panel data T→∞ and/or N→∞
– If in panel data T is fixed and the model has cross section dummies, then when N→∞ the number of parameters to be estimated also increases and there is no gain from more observations ⇒ the parameters of the "fixed effects" are not consistently estimated
• The idea in the other fixed effects methods is to get rid of the unobserved effect $\alpha_i$
• B) take differences over time
$$\Delta y_{it} = \Delta x_{it}'\beta + \Delta\varepsilon_{it}$$
• $\Delta\alpha_i = 0$, so the "fixed" effect disappears
• The first observation for each cross section is lost
• OLS is used for the differenced model; use regress and the D. operator for the variables
• This is called the first difference transformation
• Sometimes long differences (e.g. 3-year differences) are used, but this requires more data
• Variables that are constant over time drop out
[Stata output: First-difference estimation (note: constant dropped)]
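The first-difference approach described above can be sketched in Stata as follows. The regressors are chosen only for illustration, and the noconstant option is an assumption suggested by the "constant dropped" note on the output slide.

```stata
use wagepanel, clear
xtset nr year                    // D. needs the panel and time identifiers

* OLS on first differences: D.x is x minus its previous-year value
* within each person, so the first year of each person is lost.
* noconstant drops the intercept, since differencing the original
* model removes both the constant and the unobserved effect.
regress D.lwage D.hours D.married D.union, noconstant
```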
• C) take differences from cross-section means (demeaning)
• Estimate the model
$$y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i), \qquad \bar{y}_i = \sum_t y_{it}/T, \text{ etc.}$$
• Again, the unobserved effect disappears, because the mean of $\alpha_i$ over time is $\alpha_i$ itself (so $\alpha_i - \alpha_i = 0$)
• OLS is used for the demeaned model
• This is called the within transformation
• If T=2, first-difference and within approaches give the same results
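The T=2 equivalence can be checked directly. The following worked step is a sketch that assumes a balanced panel with periods t = 1, 2, writes $\Delta y_i = y_{i2} - y_{i1}$ (a shorthand introduced here), and takes the first-difference regression without a constant:

```latex
\begin{align*}
\bar{y}_i &= \tfrac{1}{2}(y_{i1} + y_{i2}), \\
y_{i2} - \bar{y}_i &= \tfrac{1}{2}\,\Delta y_i, \qquad
y_{i1} - \bar{y}_i = -\tfrac{1}{2}\,\Delta y_i
\quad \text{(and likewise for } x_{it}\text{)}, \\
\hat{\beta}_{\text{within}}
&= \Big[\sum_i \sum_{t=1}^{2} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)'\Big]^{-1}
   \sum_i \sum_{t=1}^{2} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i) \\
&= \Big[\tfrac{1}{2}\sum_i \Delta x_i \Delta x_i'\Big]^{-1}
   \tfrac{1}{2}\sum_i \Delta x_i \Delta y_i
 = \hat{\beta}_{\text{FD}}.
\end{align*}
```

Each demeaned observation is plus or minus one half of the differenced observation, so the factors of one half cancel and the two estimators coincide.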
[Figure: true regression lines for cross-section units 1 and 2, the regression line estimated with FE, and the regression line estimated with pooled OLS; axes x and y, with the means of x and y marked]
• Examples:
– $y_{it}$ = log(wage); $\alpha_i$ = unobserved ability; $x_{it}$ includes education, which is correlated with ability (high ability → more education)
– $y_{it}$ = log(output); $\alpha_i$ = unobserved managerial ability; $x_{it}$ includes logs of inputs (labor and capital), which are correlated with managerial ability (good management → firm grows → uses more labor and capital)
• In both cases the OLS estimates are inconsistent
• The unobservable can be eliminated in FE estimation
• Estimation of FE (within) model in Stata
• Statistics → Longitudinal/Panel data → Linear models → Linear regression
• In the estimation window, specify the model in the normal way and choose model type
• Command for example
xtreg lwage exper hours, fe
– Options fe=fixed effects, re=random effects, be=between, pa=population averaged (rarely used in econometrics)
[Stata output: FE (within) estimation]
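A short sketch of the within estimation, together with one way to see its link to the dummy variable approach. The areg comparison is an addition for illustration and is not part of the course example.

```stata
use wagepanel, clear
xtset nr year

* Within (FE) estimator, as in the example command above
xtreg lwage exper hours, fe

* Same slope estimates via LSDV, with the person dummies absorbed
* rather than reported, so no explicit dummy variables are created
areg lwage exper hours, absorb(nr)
```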
• Some notes on the output
• Stata output shows a constant even for the fixed effects model
– This is an "average" constant
  • In demeaning, the overall averages are added back to the variables
• Output gives 3 different R² values
– Within: for the demeaned model
  • R² in dummy variable OLS would be much higher
– Between: for the time-averaged model
– Overall: for the pooled data
  • Between and overall R² are not quite equal to the "traditional" R²
• What has to be assumed in FE estimation?
• Errors uncorrelated with the explanatory variables:
$$E[(x_{it} - \bar{x}_i)'(\varepsilon_{it} - \bar{\varepsilon}_i)] = 0$$
• This implies that $x_{it}$ are uncorrelated with all (past and future) errors, since they are part of the average error
– This is called the strong exogeneity assumption (also known as strict exogeneity)
Dynamic panel data with FE
• Strong exogeneity fails (at least) when there are lagged values of the dependent variable:
$$y_{it} = \alpha_i + x_{it}'\beta + \gamma\, y_{i,t-1} + \varepsilon_{it}$$
$$y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + \gamma\,(y_{i,t-1} - \bar{y}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$
• The transformed lagged dependent variable is correlated with the transformed error
• Estimation by GMM (generalized method of moments): a combination of GLS and instrumental variables, with lagged values of variables as instruments
• Statistics → Longitudinal/panel data → Dynamic panel data (DPD)
• xtabond (Arellano-Bond)
• xtdpdsys (Blundell-Bond)
• xtdpd (both of above)
• Fairly complicated, require many choices (lag lengths, instruments, endogeneity of variables)
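A minimal sketch of a dynamic panel estimation with xtabond. The specification is only illustrative: the regressors and the single lag are assumptions, and a real application would need the choices about lags, instruments and endogeneity to be argued for.

```stata
use wagepanel, clear
xtset nr year

* Arellano-Bond difference GMM: one lag of lwage as a regressor,
* instrumented with deeper lags of lwage. The other regressors are
* treated as strictly exogenous here (an illustrative assumption).
xtabond lwage hours married union, lags(1)

* Test for serial correlation in the first-differenced errors
estat abond
```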
Clustering
• Transformed errors for cross section unit i (individual, firm etc.) tend to be correlated with each other (they all include the same average error)
– Typical approach: correct standard errors for "clustering" (cluster = cross section unit)
– Command for example xtreg lwage exper, fe vce(cluster nr) or specify the standard error option from the panel data menu (SE/Robust)
– nr is the name of the cross-section identifier
[Stata output: FE with standard error corrected for clustering]
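To see what the clustering correction changes, the two sets of results can be put side by side. This is a sketch; the stored-estimate names fe_default and fe_cluster are arbitrary.

```stata
use wagepanel, clear
xtset nr year

* Conventional standard errors
xtreg lwage exper, fe
estimates store fe_default

* Standard errors clustered on the cross-section identifier nr
xtreg lwage exper, fe vce(cluster nr)
estimates store fe_cluster

* Coefficients are identical; only the standard errors differ
estimates table fe_default fe_cluster, se
```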
Several fixed effects
• It is also possible to specify fixed time effects (effects that vary over time, but not across cross sections)
– "two-way" model
– Dummy variables for time periods (in the example data, dummies d81 to d87; leave out one of them!)
• Sometimes three-way models
– for example if the employer of individuals is known, there could be individual effects, firm effects, and time effects
– Complicated, if large data sets (e.g. 1 million individuals, 10000 firms)
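A two-way model can be sketched by adding year dummies to the FE regression. The example uses factor-variable notation instead of the d81 to d87 dummies, and the regressors are chosen only for illustration.

```stata
use wagepanel, clear
xtset nr year

* Individual effects via the fe option, time effects via year dummies;
* i.year creates the dummies and omits a base year automatically
xtreg lwage hours married union i.year, fe

* Joint test that all time effects are zero
testparm i.year
```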
Random effects (RE) estimation
• Include the unobserved effect in the error term:
$$y_{it} = x_{it}'\beta + u_{it}, \qquad u_{it} = \alpha_i + \varepsilon_{it}, \quad E(\alpha_i) = 0, \quad Var(\alpha_i) = \sigma_\alpha^2$$
– Note: Now x includes a constant
• Use generalized least squares (GLS) or maximum likelihood (ML) to estimate the model, taking into account the error structure (all errors for unit i are correlated with each other, since they include the same random $\alpha_i$)
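The error structure can be written out explicitly. The following is a sketch under additional assumptions not stated on the slide: $\varepsilon_{it}$ has constant variance $\sigma_\varepsilon^2$ (a symbol introduced here), is uncorrelated over time and with $\alpha_i$, and different cross-section units are independent.

```latex
\begin{align*}
\operatorname{Var}(u_{it}) &= \sigma_\alpha^2 + \sigma_\varepsilon^2, \\
\operatorname{Cov}(u_{it}, u_{is}) &= \sigma_\alpha^2 \quad (t \neq s), \\
\operatorname{Cov}(u_{it}, u_{js}) &= 0 \quad (i \neq j), \\
\operatorname{Corr}(u_{it}, u_{is}) &= \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_\varepsilon^2} \quad (t \neq s).
\end{align*}
```

Under these assumptions the correlation between any two errors of the same unit is the share of the total error variance that is due to $\alpha_i$.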
Issues in choosing between FE and RE
• Traditional view: In FE the "effects" are time-invariant parameters to be estimated, in RE they are random terms
• When the data set is a random sample of a large population, RE may be appropriate
• If data cover certain individuals / firms / etc., FE is appropriate
– For example, data on all the biggest firms in Finland, data on OECD countries, etc.
• Contemporary view:
• The "effects" are random in both approaches; the main issue is whether they are correlated with the x's (allowed in FE) or not (assumed in RE)
• In FE, inferences are conditional on the effects; in RE, inferences are unconditional
• Out-of-sample projections of $y_{it}$ are possible with RE, but not with FE, since $\alpha_i$ is not known for out-of-sample observations
• In FE, variables that do not change over
time (for a cross section unit), cannot be
used
– Their mean is constant, so difference from
mean is zero (the variable is “wiped out” in the
within transformation)
– E.g. education cannot be included, if nobody’s
educational level changes over time!!
– Other examples:
• Female dummy in wage equation
• Many country characteristics in country panels
• FE is based on variation within cross section units
• Average the data for each cross section unit and use the averages in the estimation of the model
$$\bar{y}_i = \bar{x}_i'\beta + \bar{\varepsilon}_i$$
– ⇒ we get the between estimator (BE), which is based on variation between cross section units
• It can be shown that the RE estimator can be written as a combination of the FE and BE estimators
– RE is a "compromise" between FE and BE
[Stata output: Between estimation]
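One common way to see the "compromise" is the quasi-demeaning form of the RE estimator. The sketch below assumes a balanced panel with T periods and an idiosyncratic error variance $\sigma_\varepsilon^2$; the symbols $\theta$ and $\sigma_\varepsilon^2$ are introduced here and do not appear on the slides. RE is then OLS on partially demeaned data:

```latex
\begin{align*}
y_{it} - \theta\,\bar{y}_i &= (x_{it} - \theta\,\bar{x}_i)'\beta + (u_{it} - \theta\,\bar{u}_i), \\
\theta &= 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\,\sigma_\alpha^2}}.
\end{align*}
```

With $\sigma_\alpha^2 = 0$ we get $\theta = 0$ and the regression is pooled OLS; as $T\sigma_\alpha^2$ grows relative to $\sigma_\varepsilon^2$, $\theta$ approaches 1 and the regression approaches the within (FE) regression.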
• Breusch-Pagan LM (Lagrange multiplier) test: tests the hypothesis that $Var(\alpha_i) = 0$
– If the hypothesis is not rejected, pooled OLS can be used (instead of RE)
– After xtreg estimation with option re, use command xttest0
– Or Statistics → Longitudinal/Panel data → Linear models → Lagrange multiplier test for random effects
– The test output gives a chi-square test statistic and p-value
– A high value for the test statistic (and a small p-value) would indicate rejection of the hypothesis $Var(\alpha_i) = 0$, i.e. rejection of the hypothesis of no random effects
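The Breusch-Pagan test steps can be sketched as follows, reusing the regressors from the earlier xtreg example (an illustrative choice):

```stata
use wagepanel, clear
xtset nr year

* xttest0 is a postestimation command and must follow xtreg with option re
xtreg lwage exper hours, re

* Breusch-Pagan LM test of Var(alpha_i) = 0
xttest0
```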
[Stata output: Breusch-Pagan test for random effects]
• Testing RE vs. FE
• Hausman test
– Tests whether the coefficients in FE and RE estimations are equal
– If $x_i$ is correlated with $\alpha_i$, FE is consistent but RE is not, so the estimates should be different
– If equality of the FE and RE estimates is not rejected, both are consistent and we conclude that $x_i$ is not correlated with $\alpha_i$ ⇒ RE can be used
– If equality of the estimates is rejected, we conclude that $x_i$ is correlated with $\alpha_i$ ⇒ FE should be used
• Hausman test in Stata
– Statistics → Postestimation → Tests → Hausman specification test
– With commands: After xtreg estimation with option fe, store the estimates with estimates store fe_e
  • Here fe_e is an arbitrary name given for the stored estimates
– Then estimate the same model (i.e. same variables) with xtreg and option re, and store the estimates: estimates store re_e
– Finally, give command hausman fe_e re_e
– A high value for the test statistic (and a small p-value) would indicate rejection of the hypothesis that the FE and RE estimates are equal; if rejected, FE should be used
– The test sometimes does not work (it involves inverting a matrix that may not be invertible)
[Stata output: Hausman test]
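The full Hausman sequence can be sketched as one short do-file, using the regressors from the earlier xtreg example (an illustrative choice):

```stata
use wagepanel, clear
xtset nr year

* FE first: consistent whether or not the effects are correlated with x
xtreg lwage exper hours, fe
estimates store fe_e

* RE with the same variables: efficient if the effects are uncorrelated with x
xtreg lwage exper hours, re
estimates store re_e

* Consistent estimator first, efficient estimator second
hausman fe_e re_e
```

If the test fails because the difference of the two covariance matrices is not positive definite, hausman's sigmamore option, which bases both covariance matrices on a common estimate of the error variance, may help.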