Estimation and Statistical Tests for Difference-in-Differences Models with Binary Response Data
Taeyong Park
Department of Political Science
E[Y0|T = 1, G = 1, X] = α + β + Xθ,
E[Y1|T = 1, G = 1, X] = α + β + γ + Xθ,
τDID = γ.
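The significance test of $\hat{\tau}_{DID}$ is a t-test.
$\hat{\Omega}_b$ comes from the variance-covariance matrix.
For $\partial\hat{\tau}_{DID}/\partial b'$ and $\partial\hat{\tau}_{DID}/\partial b$, take the partial derivative of $\hat{\tau}_{DID}$ with respect to each element of the vector of coefficients. For example, the derivative of $\hat{\tau}_{DID}$ with respect to the coefficient $\hat{\beta}$ is
$$\frac{\partial\hat{\tau}_{DID}}{\partial\hat{\beta}} = \frac{\partial}{\partial\hat{\beta}}\left\{\exp(\bullet)[1+\exp(\bullet)]^{-1} - \exp(\oplus)[1+\exp(\oplus)]^{-1}\right\}$$
$$= \exp(\bullet)\left\{[1+\exp(\bullet)]^{-1} - \exp(\bullet)[1+\exp(\bullet)]^{-2}\right\} - \exp(\oplus)\left\{[1+\exp(\oplus)]^{-1} - \exp(\oplus)[1+\exp(\oplus)]^{-2}\right\},$$
where $\exp(\bullet) = \exp(\hat{\alpha}+\hat{\beta}+\hat{\gamma}+X\hat{\theta})$ and $\exp(\oplus) = \exp(\hat{\alpha}+\hat{\beta}+X\hat{\theta})$.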
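The gradient and the resulting Delta-method standard error for the logit case can be sketched in Python with NumPy. This is a minimal illustration, not the poster's own code: the function names, the coefficient ordering b = [α, β, γ, θ], and all numerical values are assumptions. It uses the fact that exp(u)[1 + exp(u)]^−1 has derivative p(1 − p), where p = logit^−1(u), which is the same quantity as each braced term in the derivative above.

```python
import numpy as np

def inv_logit(u):
    """Logistic CDF: exp(u) / (1 + exp(u))."""
    return 1.0 / (1.0 + np.exp(-u))

def did_effect(b, x):
    """tau_hat for b = [alpha, beta, gamma, theta...] at covariate values x."""
    base = b[0] + b[1] + x @ b[3:]           # alpha + beta + X theta
    return inv_logit(base + b[2]) - inv_logit(base)

def did_gradient(b, x):
    """Analytic gradient of tau_hat with respect to each element of b."""
    base = b[0] + b[1] + x @ b[3:]
    p1 = inv_logit(base + b[2])              # probability with treatment
    p0 = inv_logit(base)                     # counterfactual probability
    d1, d0 = p1 * (1 - p1), p0 * (1 - p0)    # derivative of the logistic CDF
    # alpha and beta enter both terms, gamma only the first, theta both (scaled by x)
    return np.concatenate(([d1 - d0, d1 - d0, d1], (d1 - d0) * x))

def delta_se(b, cov, x):
    """Delta-method standard error: sqrt(g' Omega g)."""
    g = did_gradient(b, x)
    return float(np.sqrt(g @ cov @ g))

# Hypothetical estimates and covariance matrix, for illustration only.
b_hat = np.array([-0.5, 0.3, 0.8, 0.2])
omega_hat = 0.04 * np.eye(4)
x0 = np.array([1.5])

tau_hat = did_effect(b_hat, x0)
se = delta_se(b_hat, omega_hat, x0)
t_stat = tau_hat / se                        # statistic for the t-test
```

In practice `omega_hat` would be the estimated variance-covariance matrix reported by whatever routine fits the logit model.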
Problem: The treatment effect cannot be constant across the treated population in nonlinear DID models because the expectation of the outcome variable is bounded.
A solution: Apply the DID assumption of a constant difference between groups across time to the unobserved latent linear index.
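The boundedness problem can be seen numerically. The sketch below is an illustration only: the function names, the shift γ = 0.5, and the baseline index values are assumptions. It applies the same shift on the latent index at several baseline values and shows that the implied change in probability differs, for both a probit and a logit link.

```python
from math import erf, exp, sqrt

def probit_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def logit_cdf(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + exp(-z))

def effect_on_probability(cdf, base_index, gamma):
    """Change in Prob(Y = 1) when the latent index moves from base to base + gamma."""
    return cdf(base_index + gamma) - cdf(base_index)

gamma = 0.5  # the same latent-index shift everywhere (hypothetical value)
effects = {
    (name, base): effect_on_probability(cdf, base, gamma)
    for name, cdf in (("probit", probit_cdf), ("logit", logit_cdf))
    for base in (-3.0, 0.0, 3.0)
}
for key, value in effects.items():
    print(key, round(value, 4))
```

The effect on the probability scale is largest near a baseline index of zero and shrinks toward the tails, even though the shift on the latent index is constant.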
1) Treat the regressors $X_1, \ldots, X_k$ as random variables and resample each observation's response value and regressors together as one set.
2) Draw $R$ samples from $Z$ with replacement.
The resulting bootstrap samples $Z_1 = \{Z_{11}, Z_{12}, \ldots, Z_{1n}\}$, $Z_2 = \{Z_{21}, Z_{22}, \ldots, Z_{2n}\}$, ..., $Z_R = \{Z_{R1}, Z_{R2}, \ldots, Z_{Rn}\}$ produce $R$ sets of coefficients $b_r = [\alpha_r, \beta_r, \gamma_r, \theta_r]'$, $r = 1, \ldots, R$.
3) Plug these bootstrapped coefficients into
$$\hat{\tau}_{DID} = \mathrm{logit}^{-1}(\hat{\alpha}+\hat{\beta}+\hat{\gamma}+X\hat{\theta}) - \mathrm{logit}^{-1}(\hat{\alpha}+\hat{\beta}+X\hat{\theta})$$
to obtain $R$ values of $\hat{\tau}_{DID}$.
The mean of the $R$ values of $\hat{\tau}_{DID}$ can be an estimate of $\tau_{DID}$, and their standard deviation is the standard error of $\hat{\tau}_{DID}$.
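The three steps can be sketched in Python with NumPy. Everything here is an assumption made for illustration: the simulated data, the design-column ordering [T, G, T·G, X], the covariate profile at which the effect is evaluated, and the small Newton-Raphson routine, which stands in for any logit estimation routine.

```python
import numpy as np

def inv_logit(u):
    return 1.0 / (1.0 + np.exp(-np.clip(u, -30, 30)))

def fit_logit(D, y, iters=25):
    """Logit maximum likelihood by Newton-Raphson (stand-in for any logit routine)."""
    b = np.zeros(D.shape[1])
    for _ in range(iters):
        p = inv_logit(D @ b)
        hess = D.T @ (D * (p * (1 - p))[:, None]) + 1e-8 * np.eye(D.shape[1])
        b = b + np.linalg.solve(hess, D.T @ (y - p))
    return b

def did_effect(b, x):
    """b = [alpha, beta, gamma, theta...], matching design columns [T, G, T*G, X...]."""
    base = b[0] + b[1] + x @ b[3:]
    return inv_logit(base + b[2]) - inv_logit(base)

def bootstrap_did(D, y, x, R=200, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    taus = np.empty(R)
    for r in range(R):
        idx = rng.integers(0, n, size=n)   # steps 1-2: resample whole rows with replacement
        taus[r] = did_effect(fit_logit(D[idx], y[idx]), x)   # step 3
    return taus.mean(), taus.std(ddof=1)

# Simulated data, for illustration only.
rng = np.random.default_rng(1)
n = 400
T = rng.integers(0, 2, n)
G = rng.integers(0, 2, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # constant plus one covariate
D = np.column_stack([T, G, T * G, X])
b_true = np.array([0.2, -0.3, 0.8, -0.1, 0.5])
y = (rng.random(n) < inv_logit(D @ b_true)).astype(float)

tau_boot, se_boot = bootstrap_did(D, y, x=np.array([1.0, 0.0]))
```

Resampling the index vector `idx` and applying it to both `D` and `y` keeps each observation's response and regressors together, which is the pairs-style resampling the steps describe.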
5. Bayesian Approach
1) Generate a sample of the posterior distribution of each of the parameters of interest.
2) Generate a sample of the posterior distribution of $\tau_{DID}$ by plugging the parameter outputs from step 1) into the equation
$$\hat{\tau}_{DID} = \mathrm{logit}^{-1}(\hat{\alpha}+\hat{\beta}+\hat{\gamma}+X\hat{\theta}) - \mathrm{logit}^{-1}(\hat{\alpha}+\hat{\beta}+X\hat{\theta}).$$
3) The posterior mean can be an estimate of $\tau_{DID}$, and the standard deviation of the posterior distribution of $\tau_{DID}$ is the standard error of $\hat{\tau}_{DID}$.
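The three steps can be sketched in Python with NumPy. This is not the MCMCpack implementation: a hand-written random-walk Metropolis sampler is swapped in for step 1, and the N(0, 10²) priors, the step size, the simulated data, and all function names are assumptions made for illustration.

```python
import numpy as np

def inv_logit(u):
    return 1.0 / (1.0 + np.exp(-u))

def log_posterior(b, D, y, prior_sd=10.0):
    """Logit log-likelihood plus independent N(0, prior_sd^2) priors (assumed prior)."""
    u = D @ b
    return np.sum(y * u - np.logaddexp(0.0, u)) - 0.5 * np.sum((b / prior_sd) ** 2)

def sample_posterior(D, y, draws=4000, burn=1000, step=0.08, seed=0):
    """Step 1: random-walk Metropolis draws of b = [alpha, beta, gamma, theta...]."""
    rng = np.random.default_rng(seed)
    b = np.zeros(D.shape[1])
    lp = log_posterior(b, D, y)
    out = np.empty((draws, b.size))
    for i in range(burn + draws):
        prop = b + step * rng.normal(size=b.size)
        lp_prop = log_posterior(prop, D, y)
        if np.log(rng.random()) < lp_prop - lp:   # Metropolis acceptance rule
            b, lp = prop, lp_prop
        if i >= burn:
            out[i - burn] = b
    return out

def did_draws(B, x):
    """Step 2: evaluate tau_DID at every posterior draw (each row of B)."""
    base = B[:, 0] + B[:, 1] + B[:, 3:] @ x
    return inv_logit(base + B[:, 2]) - inv_logit(base)

# Simulated data, for illustration only; design columns are [T, G, T*G, 1, covariate].
rng = np.random.default_rng(1)
n = 400
T = rng.integers(0, 2, n)
G = rng.integers(0, 2, n)
D = np.column_stack([T, G, T * G, np.ones(n), rng.normal(size=n)])
y = (rng.random(n) < inv_logit(D @ np.array([0.2, -0.3, 0.8, -0.1, 0.5]))).astype(float)

B = sample_posterior(D, y)
tau = did_draws(B, np.array([1.0, 0.0]))
tau_mean, tau_sd = tau.mean(), tau.std(ddof=1)    # step 3
```

Because the effect is computed draw by draw, the posterior of the treatment effect comes directly from the parameter draws without any further approximation; a real analysis would also check convergence and mixing of the chain.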
Figure : Distribution of Estimated Standard Errors (density plots; x-axis: estimated standard errors; legend: Mean, SE)
8. Empirical Application
The effect of personal experience of employment on egotropic/sociotropic evaluations, to examine the microfoundation of decision making in the economic voting process
Data: The American Panel Study (TAPS), Nov 2011 and Nov 2012
Dependent variable: “getting better” = 1; “getting worse” = 0
Treatments: Unemployed → Employed / Employed → Unemployed
Control: No change
                                Getting a job                   Losing a job
Group             Estimator     Mean    S.D.   Lower   Upper    Mean    S.D.   Lower   Upper
Lib. Democrat     Delta        0.319   0.126   0.072   0.567  -0.160   0.121  -0.397   0.078
                  Boot         0.258   0.140   0.035   0.560  -0.144   0.137  -0.469   0.061
                  Bayesian     0.304   0.113   0.106   0.511  -0.153   0.118  -0.403   0.085
Mod. Independent  Delta        0.241   0.104   0.037   0.446  -0.108   0.086  -0.276   0.061
                  Boot         0.204   0.133   0.018   0.510  -0.103   0.110  -0.393   0.042
                  Bayesian     0.227   0.095   0.073   0.433  -0.101   0.082  -0.303   0.065
Con. Republican   Delta        0.222   0.099   0.027   0.416  -0.097   0.079  -0.251   0.057
                  Boot         0.190   0.129   0.018   0.496  -0.092   0.099  -0.347   0.036
                  Bayesian     0.214   0.092   0.060   0.418  -0.095   0.078  -0.272   0.048
Table : Estimated Treatment Effects on Egotropic Evaluations (three estimators of the DID treatment effect)
PolMeth 2014 Poster Session
Figure : Different Ranges of the Latent Values
For a logit DID model, the function MCMClogit() from the R package MCMCpack can be used to generate a sample of the posterior distribution.
Figure panels: Simulation study 2 (medium latent values); Simulation study 3 (large latent values); x-axis: estimated standard errors; legend: Mean, SE
As a result, an observation that can be resampled is $Z_i = [Y_i, X_{i1}, X_{i2}, \ldots, X_{ik}]$.
That is, $Z = [Z_1, Z_2, \ldots, Z_n]$ can be resampled, where $n$ is the number of observations in the original data set, and $Z_1 = [Y_1, X_{11}, X_{12}, \ldots, X_{1k}]$, ..., $Z_n = [Y_n, X_{n1}, X_{n2}, \ldots, X_{nk}]$.
For more complicated models, for instance with high-dimensional data or complicated combinations of variables, simulation-based approaches (bootstrapping and Bayesian) should be considered.
Figure panel: Simulation study 1 (small latent values); y-axis: density
Figure : A logistic distribution and the latent values (annotated region: moderate latent values)
E[Y0|T = 1, G = 1, X] = Φ(α + β + Xθ),
E[Y1|T = 1, G = 1, X] = Φ(α + β + γ + Xθ),
where Φ(·) denotes the cumulative distribution function of the standard normal distribution for the probit case. Then, the treatment effect in this DID probit model is estimated by
τDID = E[Y1|T = 1, G = 1, X] − E[Y0|T = 1, G = 1, X]
= Φ(α + β + γ + Xθ) − Φ(α + β + Xθ).
6. The Bootstrapping Approach
Table : Estimation Coverage
Significance test of the estimated DID treatment effect: t-test.
3. Review: Puhani’s Nonlinear DID
2) $\mathrm{Var}(\hat{\tau}_{DID})$ can be computed analytically.
τDID = E[Y1|T = 1, G = 1, X] − E[Y0|T = 1, G = 1, X].
Participation in treatment occurs only where T = 1 and G = 1. Therefore, E[Y0|T, G, X] = αT + βG + Xθ, and E[Y1|T, G, X] = αT + βG + γTG + Xθ.
As a result, τDID = γ.
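In the linear case the treatment effect is the coefficient on the interaction T·G, so it can be recovered by least squares. The sketch below uses simulated data; the sample size, the true coefficient values, and the variable names are assumptions made for illustration.

```python
import numpy as np

# Simulated data following E[Y | T, G, X] = alpha*T + beta*G + gamma*T*G + X theta.
rng = np.random.default_rng(0)
n = 2000
T = rng.integers(0, 2, n)                                # time indicator
G = rng.integers(0, 2, n)                                # group indicator
X = np.column_stack([np.ones(n), rng.normal(size=n)])    # constant plus one covariate
alpha, beta, gamma = 0.5, -0.2, 1.0                      # hypothetical true values
theta = np.array([0.3, 0.7])
y = alpha * T + beta * G + gamma * T * G + X @ theta + rng.normal(size=n)

# Least-squares fit; the design columns are ordered [T, G, T*G, X].
D = np.column_stack([T, G, T * G, X])
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
gamma_hat = coef[2]                                      # estimate of the DID effect
```

Because the model is linear, `gamma_hat` does not depend on the covariate values at which the effect is evaluated, which is exactly what fails in the nonlinear case.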
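1) Apply the Delta method to calculate the asymptotic variance of $\hat{\tau}_{DID}$.
The Delta method implies that $\hat{\tau}_{DID}$ asymptotically follows a normal distribution:
$$\hat{\tau}_{DID} \sim N\left(\tau_{DID},\ \frac{\partial\tau_{DID}}{\partial b'}\,\Omega_b\,\frac{\partial\tau_{DID}}{\partial b}\right),$$
where $b$ denotes a vector of all coefficients.
The asymptotic variance of $\hat{\tau}_{DID}$ is estimated consistently by
$$\widehat{\mathrm{Var}}(\hat{\tau}_{DID}) = \frac{\partial\hat{\tau}_{DID}}{\partial b'}\,\hat{\Omega}_b\,\frac{\partial\hat{\tau}_{DID}}{\partial b},$$
where $\hat{\Omega}_b$ is a consistent covariance estimator of $b$.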
                                    Latent values (quantiles)     Estimation coverage at the 95% level
Simulation study  Latent values     25%      50%      75%         Approximation  Bootstrapping  Bayesian
1                 Small            -5.464   -3.650   -1.950       325/1000       1000/1000      1000/1000
2                 Moderate         -1.476   -0.052    1.365       997/1000       1000/1000      1000/1000
3                 Large             1.685    3.532    5.361       197/1000       1000/1000      1000/1000
2. Review: Puhani’s Linear DID
T: time; G: group; X: covariates; Y1 and Y0: potential outcomes with and without treatment; Y: outcome
The DID treatment effect is the difference between the expected potential outcome with treatment and the expected counterfactual outcome without treatment, both for the treated group in the treatment period (T = 1, G = 1).
Undertake simulation studies to examine the relative performance of the three approaches with regard to the
coverage property of the true treatment effect.
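Coverage can be tallied as the number of replications whose 95% interval contains the true treatment effect. The sketch below is a minimal illustration; the function names and the toy intervals are assumptions, and the normal-approximation interval is only one possible way to form an interval from an estimate and its standard error.

```python
import numpy as np

def wald_interval(estimate, se, z=1.96):
    """Normal-approximation 95% interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se

def coverage(lower, upper, truth):
    """Count the replications whose interval [lower, upper] contains the true value."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    covered = (lower <= truth) & (truth <= upper)
    return int(covered.sum()), len(covered)

# Toy intervals from four hypothetical replications; the true effect is set to 0.20.
lower = [0.05, 0.12, -0.02, 0.21]
upper = [0.30, 0.40, 0.18, 0.45]
hits, total = coverage(lower, upper, truth=0.20)
print(f"{hits}/{total}")
```

For the bootstrap and Bayesian approaches, the interval endpoints could instead be taken as quantiles of the bootstrap or posterior draws before being passed to `coverage`.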
Consider a logit DID model:
$$\hat{\tau}_{DID} = \mathrm{logit}^{-1}(\hat{\alpha}+\hat{\beta}+\hat{\gamma}+X\hat{\theta}) - \mathrm{logit}^{-1}(\hat{\alpha}+\hat{\beta}+X\hat{\theta}),$$
based on Puhani’s nonlinear DID estimator.
For the standard DID model with a continuous dependent variable, it is straightforward to estimate the DID treatment effect and conduct statistical tests (but see Athey and Imbens (2006)).
What if we want to deal with a limited dependent variable, such as a binary or ordered categorical outcome?
Puhani’s 2012 Economics Letters paper derives a nonlinear DID estimator.
However, one limitation is that Puhani (2012) does not discuss how to estimate the DID treatment effect in practice or how to perform statistical tests.
The present paper explores three approaches to deriving the variance of Puhani’s nonlinear DID estimator and applies these approaches to the study of economic voting.
7. Simulation Studies
5. Approximation-based Delta Method Approach
1. Motivation
                                Getting a job                   Losing a job
Group             Estimator     Mean    S.D.   Lower   Upper    Mean    S.D.   Lower   Upper
Lib. Democrat     Delta        0.199   0.154  -0.104   0.501   0.006   0.111  -0.211   0.222
                  Boot         0.302   0.253  -0.070   0.930   0.135   0.341  -0.461   0.830
                  Bayesian     0.190   0.155  -0.137   0.500  -0.006   0.152  -0.346   0.248
Mod. Independent  Delta        0.136   0.110  -0.079   0.351   0.003   0.068  -0.130   0.137
                  Boot         0.306   0.236  -0.076   0.884   0.105   0.300  -0.437   0.756
                  Bayesian     0.136   0.118  -0.089   0.347  -0.009   0.112  -0.260   0.153
Con. Republican   Delta        0.066   0.059  -0.050   0.182   0.001   0.030  -0.057   0.060
                  Boot         0.279   0.207  -0.078   0.736   0.062   0.223  -0.378   0.571
                  Bayesian     0.068   0.062  -0.033   0.211  -0.005   0.057  -0.160   0.083
Table : Estimated Treatment Effects on Sociotropic Evaluations (three estimators of the DID treatment effect)