Smooth mixed models for balanced longitudinal data

Smooth mixed models for balanced
longitudinal data
Iain D. Currie1
1
Department of Actuarial Mathematics and Statistics, and the Maxwell Institute
for Mathematical Sciences, Heriot-Watt University, Edinburgh, UK
E-mail for correspondence: [email protected]
Abstract: Centring is a familiar device much used in the analysis of fixed effects
models, but rarely seen in the context of mixed models, since the distribution of
the random effects usually brings about the required centring. However there are
mixed models, notably in the analysis of longitudinal data, where the distribution
of the random effects does not bring about the necessary centring. In such models
we use a conditional argument to centre the distribution of the random effects
directly: the resulting estimates are of necessity unbiased. In general, we define
a new class of mixed models, centred mixed models. We give some examples of
models in this class and discuss efficient estimation of the fixed and random
effects, and the variance parameters. We illustrate our method with the analysis
of some Canadian weather data.
Keywords: Bias; Centring; Longitudinal data; Mixed models; Smoothing.
1
Introduction
We consider balanced longitudinal data Y = [y 1 : . . . : y n ] = (yi,j ), m × n,
with m observations on each of n subjects at times t = (t1 , . . . , tm )T . One
such data set is illustrated in the left panel of Figure 1, which shows the
daily average temperature (taken over the period 1960-1994) in 35 Canadian cities; the mean temperature over the 35 cities is also shown. These
data are available in CanadianWeather in the library fda in R (R Core
Team, 2013). These data are balanced, and simple data analysis shows
that both population mean and subject (city) effects are curved. Ruppert
et al. (2003, ch 9) described these curves with truncated lines as follows:
yi,j = β0 +β1 ti +
K
X
k=1
uk (ti −τk )+ +b0,j +b1,j ti +
˘
K
X
vk,j (ti − τ˘k )+ +i,j . (1)
k=1
Here x+ = max {0, x}, and τ = {τ1 , . . . , τK } and τ˘ = {˘
τ1 , . . . , τ˘K˘ } are
˘ equally spaced internal knots in t; usually, we take K
˘ <
sets of K and K
K. Ruppert et al. (2003, p192) expressed the model as a mixed model,
2
Smooth mixed models
but did not “explore the use of standard software for fitting”. Durban
et al. (2005) showed that the model could be fitted in a straightforward
fashion with the lme function in R, thus making the model widely available.
However, Djeundje and Currie (2010) and Heckman et al. (2013) showed, by
considering balanced data, that (1) could lead to seriously biased estimates
of both the population and subject curves. In this paper, we propose a
modification to the distribution of the random effects of the original mixed
model of Ruppert et al. (2003) which corrects for the bias.
Temperature in 35 Canadian cities
Observed and fitted population mean
Observed mean
Biased estimate
Centred estimate
20
10
Temperature (celcius)
Temperature (celcius)
10
0
−10
0
−20
−10
−30
0
100
200
Time (days)
300
0
100
200
300
Time (days)
FIGURE 1. Left: temperature in 35 Canadian cities (grey), mean (black); right:
observed, biased and centred estimates (forward bases) of population mean with
˜ = 19.
K = 39 and K
The plan of the paper is as follows: in section 2 we describe our main idea,
centred random effects, with reference to three examples, and present the
results of the analysis of the Canadian weather data; in section 3 we define
a new class of mixed models, centred mixed models, and describe efficient
estimation in this class; in section 4 we draw some conclusions and indicate
the direction of future work.
2
2.1
Examples
One-way layout
The usual model for the balanced one-way layout with m observations on
each of n subjects is
yi,j = µ + uj + i,j , i = 1, . . . , m, j = 1, . . . , n, i,j ∼ N (0, σe2 ).
(2)
The parameters in (2) are not identifiable so some condition is required to
ensure that µ estimates the population mean. In the fixed effects model the
Currie
3
P
constraint
uj = 0 ensures that µ
ˆ does indeed estimate the population
mean. In the mixed model we suppose the uj ∼ N (0, P
σu2 ) and, as is well
known, this implies that the BLUP of the uj satisfies
u
˜j = 0; in other
words, correct identification of population mean and subject effects is implicit in the distribution of the random effects, and no explicit constraints
are required.
We can write model (2) in matrix/vector form as
y | u = µ1N + (I n ⊗ 1m )u + , u ∼ N (0, σu2 I n ), ∼ N (0, σe2 I N ) (3)
where I s is an identity matrix of size s, 1s is a vector of 1’s of length
s, N = mn and ⊗ denotes the Kronecker product. We ask the following
question: what is the appropriatePway to place explicit constraints on the
random effects u to ensure that
u
˜j = 0? We observe that
P
2
(4)
u | ( uj = 0) ∼ N 0, σu (I n − n1 1n 1Tn ) = N 0, σu2 H n ,
say, and propose that the distribution of u in (3) be replaced by the conditional distribution in (4). The matrix H n is a familiar device and is known
as the centring matrix, since H n w returns (w1 − w,
¯ . . . , wn − w)
¯ T for any
T
vector w = (w1 , . . . , wn ) .
The two models, (3) and its centred version, give exactly the same estimates of the fixed effect µ and the random effects u, and the same residual
likelihood and hence the same estimates of the variance components σe2 and
σu2 ; ie, in this example where the constraints on u are implicit in the original mixed model (3), explicit centring of the random effects is redundant
as far as parameter estimates are concerned.
2.2
Linear population and subject effects
We consider the mixed model where population mean and subject effects
are linear:
yi,j = β0 + β1 ti + u0,j + u1,j ti + i,j , i = 1, . . . , m, j = 1, . . . , n,
(5)
where uj = (u0,j , u1,j )T ∼ N (0, Σ0 ), and Σ0 , 2 × 2, is a positive definite symmetric matrix. Let β = (β0 , β1 )T be the vector of fixed effects,
u = (uT1 , . . . , uTn )T be the vector of random effects, and T 0 = [1m : t].
Corresponding to (3) we have
y | u = [1n ⊗T 0 ]β +[I n ⊗T 0 ]u+, u ∼ N (0, I n ⊗Σ0 ), ∼ N (0, σe2 I N ).
(6)
The model
is
readily
fitted
with
lme
and
the
BLUP
of
the
random
effects
P
P
satisfies
u
˜0,j =
u
˜1,j = 0. As in the one-way layout, the distribution
of u ensures that the estimates are correctly centred. The centred mixed
model corresponding to (6) replaces u ∼ N (0, I n ⊗ Σ0 ) with
P
P
u | ( u0,j = u1,j = 0) ∼ N (0, H n ⊗ Σ0 ) .
(7)
4
Smooth mixed models
Again, we find that model (6) and its corresponding centred version give
the same estimates of β, u, σe2 and Σ0 .
2.3
Curved population and subject effects
We can write model (1) compactly as
y | (u, b, v) = [1n ⊗ T 0 ]β + [1n ⊗ T ]u + [I n ⊗ T 0 ]b + [I n ⊗ T˘ ]v + , (8)
˘ are regression matrices of truncated lines.
where T , m × K, and T˘ , m × K,
The distribution of the random effects is defined as follows:
u = (u1 , . . . , uK )T ∼ N (0, σu2 I K ),
T
T T
(9)
T
b = (b1 , . . . , bn ) , bj = (b0,j , b1,j ) , b ∼ N (0, I n ⊗ Σ0 ),
T
T T
T
v = (v 1 , . . . , v n ) , v j = (v1,j , . . . , vK,j
˘ ) , v ∼N
0, σv2 I nK˘
(10)
. (11)
Finally, ∼ N (0, σe2 I N ). This is the model proposed by Ruppert et al. (2003)
and fitted with the lme function in R by Durban et al. (2005). We fit the
ˆ + Tu
˜ = 19. If we want T 0 β
˜ to estimate the
model with K = 39 and K
population mean then we must check P
that the subject
random
effects are
˜b0,j = P ˜b1,j = 0, so the linear
appropriately centred. We find that
P
˜
component is correctly centred but
v˜k,j 6= 0 for each k = 1, . . . , K.
The right panel of Figure 1 shows that the estimate of the population
mean is indeed biased.
We centre the model by replacing the distribution
v ∼ N 0, σv2 I nK˘ with its centred distribution
P
˜ ∼ N 0, σv2 H n ⊗ I ˘ .
v | ( j vk,j = 0, k = 1, . . . , K)
(12)
K
The right panel of Figure 1 shows that the bias of the estimate of the
population mean has been removed: the centred estimate now tracks the
observed mean very well.
The problem with the original formulation is that the subject effects are
not centred. Let Sj (t) denote the j th subject effect and S˜j (t) be the BLUP
of Sj (t) in the centred model. Then
!
X
X
X
˜b0,j + ˜b1,j t +
S˜j (t) =
v˜k,j (t − τ˘k )+ = 0, ∀ t.
(13)
j
j
k
In other words, centring the random effects centres the entire curve; we
call
P ˜this perfect centring. Table 1 shows values of the mean bias per city,
j Sj (t)/n, for selected t. Truncated lines come in two versions: the forward
basis with truncated lines of slope +1 and the backward basis with truncated
lines of slope −1. Fixed effects models with forward and backward bases
are parameterizations of each other, but in a mixed model the two models
are different. The forward basis is more usual but results for both bases are
shown in Table 1. The bias with the forward basis is substantial.
Currie
TABLE 1. Mean bias
Day
Centred
Forward
Backward
3
5
P˜
Sj (t)/n, t = 1, 100, 200, 300, 365
1
100
200
300
365
0
0
0.1
0
−0.3
0.3
0
−0.7
0.2
0
−1.3
0.1
0
−1.6
0
Centred mixed models
Equations (4), (7) and (12) show a common pattern and suggest a general
class of mixed models. A convenient definition is:
Definition A centred mixed model is a mixed model y = Xβ + Zu + for
data Y , m × n, y = vec(Y ), such that X and V = var(y) have the form
X = 1n ⊗ X 0 ,
V = I n ⊗ V 0 − n1 1n 1Tn ⊗ ψ 0 .
(14)
All our earlier examples, including, perhaps surprisingly, the mixed model (8),
are examples of centred mixed models. For model (8) we have
T
V 0 = σe2 I m + T 0 Σ0 T T0 + σv2 T˘ T˘ ,
ψ 0 = −nσu2 T T T .
(15)
Thus, even although model (8) is not centred in the usual sense of the
word, the model structure is that of a centred mixed model. For the centred
T
version of model (8) V 0 remains the same and ψ 0 = σv2 T˘ T˘ − nσu2 T T T .
The original mixed models for the one-way layout and the linear by linear
model are trivial examples of centred mixed models with ψ 0 = 0.
Computation is an issue with centred mixed models. As far as we are aware,
centred mixed models cannot be fitted with lme since the required variance
structure is not available. However, the structure (14) does admit an efficient estimation scheme. The key result is that a variance function V of
the form (14) admits a closed form inverse
T
−1
1
V −1 = I n ⊗ V −1
ψ 0 V −1
0 + n 1n 1n ⊗ (V 0 − ψ 0 )
0 .
(16)
This result enables us to obtain all the necessary quantities in the estimating equations with computations of size m instead of size N = mn. For
example, the variance components are chosen by maximising the residual
likelihood
− 21 log |V |− 12 log |X T V −1 X|− 12 y T (V −1 −V −1 X(X T V −1 X)−1 X T V −1 )y
(17)
and we can show that
|V | = |V 0 |n−1 |V 0 − ψ 0 |
T
X V
−1
X
X T V −1 y
T
−1
= nX 0 (V 0 − ψ 0 )
X0
= X T0 (V 0 − ψ 0 )−1 Y 1n .
(18)
(19)
(20)
6
Smooth mixed models
The estimate of the fixed effect β follows from (19) and (20):
−1
ˆ = 1 X T (Vˆ 0 − ψ
ˆ )−1 X 0
ˆ )−1 Y 1n .
β
X T0 (Vˆ 0 − ψ
0 0
0
0
n
(21)
Similar formulae exist for the remaining term in the residual likelihood
and the estimates of the random effects, but are omitted in this paper. The
computational formulae (18) through (21) simplify in the trivial case when
ψ 0 = 0 and are of interest in their own right.
4
Conclusions
We have (a) defined a new class of mixed models, centred mixed models,
(b) given a number of examples, (c) indicated efficient computation and
(d) used such a model to remove bias in an important example in the
analysis of longitudinal data. Future work includes the calculation of confidence intervals and the application of the ideas to B-spline bases and to
unbalanced data.
Acknowledgments: I am grateful to Maria Durban and the Spanish Ministry of Science and Innovation (project MTM2011-28285-C02-02) for financial support, and to Janet Heffernan, Viani Djeundje, Paul Eilers and
Maria Durban for useful comments on early drafts of this paper.
References
Durban, M., Harezlak, J., Wand, M.P. and Carroll, R.J. (2005). Simple fitting of subject-specific curves for longitudinal data. Statistics in Medicine,
24, 1153 – 1167.
Djeundje, V.A.B. and Currie, I.D. (2010). Appropriate covariance-specification
via penalties for penalized splines in mixed models for longitudinal
data. Electronic Journal of Statistics, 4, 1202 – 1224.
Heckman, N., Lockhart, R. and Nielsen, J.D (2013). Penalized regression,
mixed effects models and appropriate modelling. Electronic Journal
of Statistics, 7, 1517 – 1552.
R Core Team. (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
URL http://www.R-project.org/.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003). Semiparametric Regression. Cambridge University Press.