
Mixed-frequency large-scale factor models
E. Andreou∗, P. Gagliardini†, E. Ghysels‡, M. Rubin §
First version(1) : June 3, 2014
Preliminary draft.
Please do not circulate without authors’ permission.
∗ University of Cyprus ([email protected]).
† Università della Svizzera Italiana and Swiss Finance Institute ([email protected]).
‡ University of North Carolina - Chapel Hill ([email protected]).
§ Università della Svizzera Italiana and Swiss Finance Institute ([email protected]).
(1) We thank M. Deistler, D. Giannone and participants at the 2013 (EC)2 Conference on “The Econometric Analysis of Mixed Frequency Data” in Nicosia for useful comments.
1 Introduction
Empirical research generally avoids the direct use of mixed frequency data by either first aggregating
higher frequency series and then performing estimation and testing at the low frequency common
across the series, or neglecting the low frequency data and working only on the high frequency series.
The literature on large scale factor models is no exception to this practice, see e.g. Forni and Reichlin
(1998), Stock and Watson (2002) and Stock and Watson (2010).
A number of mixed frequency factor models have been proposed in the literature, although they
exclusively rely on small cross-sections. See for example, Mariano and Murasawa (2003), Nunes
(2005), Aruoba, Diebold and Scotti (2009), Frale and Monteforte (2010), Marcellino and Schumacher
(2010) and Banbura and Rünstler (2011), among others.
The purpose of this paper is to propose large scale mixed frequency factor models in the spirit of
Bai and Ng (2002), Bai (2003), Bai and Ng (2006). We rely on the recent work on mixed frequency
VAR models, in particular Ghysels (2012) to formulate such a model and its associated estimators. To
study the large sample properties of a principal component estimation procedure, we first discuss the
conditions which allow us to identify low and high frequency factors separately. The identification
conditions complement those of Anderson et al. (2012) who study the identifiability of an underlying
high frequency multivariate AR system from mixed frequency observations. Identifiability guarantees that the model parameters can be estimated consistently from mixed frequency data. We extend
this analysis to mixed frequency factor models. Under suitable regularity conditions, the factors and
loadings can be estimated via an iterative procedure which alternates between extracting principal components from the cross-section of high frequency data and extracting principal components from the panel of low frequency series, after projecting out the high frequency factors.
An empirical application revisits the analysis of Foerster, Sarte, and Watson (2011) who use factor analytic methods to decompose industrial production (IP) into components arising from aggregate
shocks and idiosyncratic sector-specific shocks. Foerster, Sarte, and Watson (2011) focus exclusively
on the industrial production sectors of the US economy. Yet, IP has seen a steady decline as a share of US output over the past 30 years, as the US economy has become more of a service sector economy. Contrary to IP, monthly or quarterly data on the cross-section of US output across non-IP sectors are not available, but annual data are. The US Bureau of Economic Analysis provides GDP by industry - not only IP sectors - annually. We identify two factors in a mixed frequency approximate factor model, with one being a low frequency factor pertaining to non-IP sectors. We re-examine
whether the common factors reflect sectoral shocks that have propagated by way of input-output linkages between service sectors and manufacturing. Hence, our analysis completes an important part
missing in the original study as it omitted a major ingredient of US economic activity. A structural
factor analysis indicates that both low and high frequency aggregate shocks continue to be the dominant source of variation in the US economy. The propagation mechanisms are very different, however, from those identified by Foerster, Sarte, and Watson (2011).
2 The model

2.1 Mixed frequency factor structure
Let t = 1, 2, ..., T be the low frequency (LF) time units. Each period (t − 1, t] is divided into m subperiods with high frequency (HF) dates t − 1 + j/m, with j = 1, ..., m. For expository purposes, we present the model and the estimators in a simplified framework in which the low frequency periods are divided into two high frequency subperiods, i.e. we set m = 2 (the model with m = 4 high-frequency subperiods, used in the empirical application, is detailed in Appendix D).
Let x1,i,t and x2,i,t , for i =
1, ..., NH , be the consecutive high-frequency observations at t − 1/2 and t, respectively, and yi,t ,
with i = 1, ..., NL , the low-frequency observations at t. These observations are gathered into the
NH -dimensional vectors x1,t , x2,t , and the NL -dimensional vector yt , respectively. We assume the
following linear factor structure for the stacked vector of observations:

\[
\begin{pmatrix} x_{1,t} \\ x_{2,t} \\ y_t \end{pmatrix}
=
\begin{pmatrix} \Lambda & 0 & \Delta_1 \\ 0 & \Lambda & \Delta_2 \\ \Omega_1 & \Omega_2 & B \end{pmatrix}
\begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ u_t \end{pmatrix}.
\tag{1}
\]
The factor structure involves two types of unobservable factors with different speeds. The first factor
evolves at high frequency, and the values for subperiods 1 and 2 are denoted by f1,t and f2,t , respectively. The slow factor gt evolves at low frequency. Both types of factors can be multidimensional: the
unobservable factor vectors f1,t, f2,t have dimension KH, and the unobservable factor vector gt has dimension KL. In equation (1), the high frequency observations load on the high frequency factor of the
same half-period via loading matrix Λ, and on the low frequency factor via loading matrices ∆1 and
∆2 . The low frequency observations load on the high and low frequency factors via loading matrices
Ω1 , Ω2 and B, respectively. The loadings matrix Λ can feature a block structure to accommodate high
frequency factors that are specific to subsets of the high frequency series. A schematic representation
of the factor model is provided in Figure 1.
We assume that the loadings matrices are such that Λ′Λ/NH → ΣΛ as NH → ∞ and B′B/NL → ΣB as NL → ∞, where ΣΛ and ΣB are positive definite matrices. Moreover, the idiosyncratic shock vectors ε1,t, ε2,t and ut satisfy weak cross-sectional and serial dependence assumptions, and
are assumed to be weakly correlated with the latent factors.
When KH = 0, i.e. there is no high frequency factor, the specification in equation (1) reduces to a low frequency factor model with vector of observables (x1,t′, x2,t′, yt′)′ and factor gt, for t = 1, 2, ..., T. When KL = 0, i.e. there is no low frequency factor, and Ω1 = Ω2 = 0, the specification in equation (1) reduces to a pure HF factor model, with observations xτ and factor fτ for τ = 1/2, 1, ..., T, where xτ = x1,t and fτ = f1,t for τ = t − 1/2, and xτ = x2,t and fτ = f2,t for τ = t and t = 1, 2, ..., T.
Such factor specifications are considered in e.g. Stock and Watson (2002), Bai and Ng (2002) and
Bai (2003) without explicit modeling of the factor dynamics, or in Forni, Hallin, Lippi, and Reichlin
(2000) with explicit modeling of the factor dynamics.
As usual in latent factor models, the distribution of the factors can be normalized. First, we can
assume orthogonality between (f1,t , f2,t ) and gt . Indeed, if orthogonality does not apply in a given representation of the model, factor f1,t can be written as the orthogonal projection on gt plus a projection
residual, i.e. f1,t = C1 gt + f̃1,t, where C1 = Cov(f1,t, gt) V(gt)−1, and similarly f2,t = C2 gt + f̃2,t. Then, by plugging these equations into the model, the structure is maintained if we use f̃1,t, f̃2,t and gt
as the new factors. Second, factors f1,t , f2,t and gt can be assumed to be zero-mean and standardized.
Thus:
\[
V\begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
=
\begin{pmatrix} I_{K_H} & \Phi & 0 \\ \Phi' & I_{K_H} & 0 \\ 0 & 0 & I_{K_L} \end{pmatrix},
\tag{2}
\]
where Φ is the covariance between f1,t and f2,t.
2.2 Factor dynamics
We complete the model specification by assuming a mixed frequency stationary Vector Autoregressive
(VAR) model for the stacked vector of factors (see Ghysels (2012)). The factor dynamics is given by
the following stationary structural VAR(1) model:

\[
\begin{pmatrix} I_{K_H} & 0 & 0 \\ -R_H & I_{K_H} & 0 \\ 0 & 0 & I_{K_L} \end{pmatrix}
\begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
=
\begin{pmatrix} 0 & R_H & A_1 \\ 0 & 0 & A_2 \\ M_1 & M_2 & R_L \end{pmatrix}
\begin{pmatrix} f_{1,t-1} \\ f_{2,t-1} \\ g_{t-1} \end{pmatrix}
+
\begin{pmatrix} v_{1,t} \\ v_{2,t} \\ w_t \end{pmatrix},
\tag{3}
\]
where (v1,t′, v2,t′, wt′)′ is a multivariate white noise process with mean 0 and variance-covariance matrix:
\[
\Sigma =
\begin{pmatrix} \Sigma_H & 0 & \Sigma_{HL,1} \\ 0 & \Sigma_H & \Sigma_{HL,2} \\ \Sigma_{HL,1}' & \Sigma_{HL,2}' & \Sigma_L \end{pmatrix}.
\tag{4}
\]
The model accommodates coupled autoregressive dynamics for the factors at different frequencies.
This coupling is induced by the sub-blocks of coefficients A1 , A2 , M1 , M2 in the structural autoregressive matrix, and the contemporaneous correlation of factor innovations at different frequencies
ΣHL,1 and ΣHL,2 . When either KH = 0 or KL = 0, equation (3) implies that the latent factor follows a VAR(1) model in low or high frequency, respectively. On the other hand, if A1 , A2 , M1 , M2
and ΣHL,1 , ΣHL,2 are zero matrices, the high frequency and low frequency factors follow uncorrelated
VAR(1) processes.
The parameters in the factor dynamics are constrained such that the sub-blocks restrictions on the
unconditional variance-covariance matrix in equation (2) hold. These restrictions, derived in Appendix
A, imply that each of the non-zero elements of the variance-covariance matrix Σ of the innovations,
and the autocovariance matrix Φ of the high frequency factor, can be expressed in terms of parameter
matrices RH , RL , A1 , A2 , M1 and M2 in the structural VAR(1) model (see Equations (A.9)-(A.14) in
Appendix A). These restrictions also imply that parameters RH , A1 and A2 must satisfy the following
matrix equation:
\[
A_1 A_1' - R_H A_1 A_2' - A_1 A_2' R_H' - A_2 A_2' = 0.
\tag{5}
\]
In Appendix A we also derive the stationarity conditions for the factor process.
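To fix ideas, the following minimal Python sketch simulates data from the factor structure (1) combined with the reduced-form dynamics in equation (13), for m = 2 and scalar factors. All numerical values and variable names are our own illustrative assumptions, and the normalization restrictions of Appendix A are not imposed here.

```python
# Illustrative simulation of the mixed frequency factor model (m = 2, KH = KL = 1).
# Parameter values are arbitrary assumptions; Appendix A restrictions are not imposed.
import numpy as np

rng = np.random.default_rng(0)
T, NH, NL = 200, 100, 40
rH, rL, a1, a2, m1, m2 = 0.5, 0.3, 0.2, 0.1, 0.1, 0.1

# Reduced-form autoregressive matrix of equation (13): z_t = C z_{t-1} + zeta_t,
# with z_t = (f_{1,t}, f_{2,t}, g_t)'.
C = np.array([[0.0, rH,    a1],
              [0.0, rH**2, rH * a1 + a2],
              [m1,  m2,    rL]])

z = np.zeros((T, 3))
for t in range(1, T):
    z[t] = C @ z[t - 1] + rng.normal(scale=0.5, size=3)   # white-noise innovations
f1, f2, g = z[:, 0], z[:, 1], z[:, 2]

# Loadings and observed panels, as in equation (1).
Lam, D1, D2 = rng.normal(size=NH), rng.normal(size=NH), rng.normal(size=NH)
Om1, Om2, B = rng.normal(size=NL), rng.normal(size=NL), rng.normal(size=NL)
x1 = np.outer(f1, Lam) + np.outer(g, D1) + rng.normal(size=(T, NH))   # HF sub-panel, subperiod 1
x2 = np.outer(f2, Lam) + np.outer(g, D2) + rng.normal(size=(T, NH))   # HF sub-panel, subperiod 2
y  = np.outer(f1, Om1) + np.outer(f2, Om2) + np.outer(g, B) + rng.normal(size=(T, NL))  # LF panel
```

The simulated panels x1, x2 and y have the dimensions (T × NH), (T × NH) and (T × NL) used throughout the paper.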
3 Identification
In standard linear latent factor models, the normalization induced by an identity factor variance-covariance matrix identifies the factor process up to a rotation (and change of signs). Let us now
show that, under suitable identification conditions, the rotation invariance of model (1) - (2) allows
only for separate rotations among the components of f1,t , among those of f2,t , and among those of
gt . Moreover, the rotations of f1,t and f2,t are the same. Thus, the rotation invariance of model (1) (2) maintains the interpretation of high frequency and low frequency factors, and the fact that f1,t and
f2,t are consecutive observations of the same process. More formally, let us consider the following
transformation of the stacked factor process:

\[
\begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
=
\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}
\begin{pmatrix} \tilde f_{1,t} \\ \tilde f_{2,t} \\ \tilde g_t \end{pmatrix},
\tag{6}
\]
where (f̃1,t′, f̃2,t′, g̃t′)′ is the transformed stacked factor vector, and the block matrix A = (Aij) is non-singular.
Definition 1. The model is identifiable if:
the data x1,t, x2,t and yt satisfy a factor model of the same type as (1) and (2) with (f1,t′, f2,t′, gt′)′ replaced by (f̃1,t′, f̃2,t′, g̃t′)′ if, and only if, matrix A is a block-diagonal orthogonal matrix, with A11 = A22.
For the proof of identification, we distinguish two situations regarding the full-rank nature of the
loading matrices.
3.1 Identification under full-rank conditions
Proposition 1. Assume that matrix Λ is full column rank and that

either matrix [Λ ⋮ ∆1], or matrix [Λ ⋮ ∆2], is full column rank (for NH large enough).   (7)

Then, the model is identifiable.
The proof of Proposition 1 is given in Appendix B. The full-rank condition for the loadings matrix
is a standard assumption in linear factor models (see e.g. Assumption B in Bai and Ng (2002) and Bai
(2003)). In Proposition 1, it is enough that the full-rank condition applies to at least one of the high
frequency panels.
3.2 Identification with reduced-rank loading matrices
When the loading matrices [Λ ⋮ ∆1] and [Λ ⋮ ∆2] in the DGP are both reduced-rank, we cannot apply Proposition 1 to show identification. This situation applies for instance when the high frequency data do not load on the low frequency factors. We maintain the hypothesis that matrix Λ is full-rank (for NH large enough), and focus on the case of a single low frequency factor, i.e. KL = 1. Then, a reduced-rank problem occurs if both vectors ∆1 and ∆2 are spanned by the columns of matrix Λ, that is

∆1 = Λ d1, and ∆2 = Λ d2,   (8)

for some vectors d1 and d2. Then, using the transformation in Equation (6), the model can be written as:

\[
\begin{pmatrix} x_{1,t} \\ x_{2,t} \\ y_t \end{pmatrix}
=
\begin{pmatrix} \tilde\Lambda & 0 & 0 \\ 0 & \tilde\Lambda & 0 \\ \tilde\Omega_1 & \tilde\Omega_2 & \tilde B \end{pmatrix}
\begin{pmatrix} \tilde f_{1,t} \\ \tilde f_{2,t} \\ \tilde g_t \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ u_t \end{pmatrix},
\tag{9}
\]
where the transformed factors
\[
\begin{cases}
\tilde f_{1,t} = (I_{K_H} + d_1 d_1')^{-1/2}\,(f_{1,t} + d_1 g_t) \\
\tilde f_{2,t} = (I_{K_H} + d_2 d_2')^{-1/2}\,(f_{2,t} + d_2 g_t) \\
\tilde g_t = (1 + d_1' d_1 + d_2' d_2 + 2 d_1' \Phi d_2)^{-1/2}\,(g_t - d_1' f_{1,t} - d_2' f_{2,t})
\end{cases}
\tag{10}
\]
satisfy the normalization restriction (2) with a transformed autocovariance matrix Φ̃, and Λ̃, B̃, Ω̃1 and Ω̃2 are transformed matrices of loadings. Thus, the model can be rewritten as a model without the effect of the low frequency factor on the high frequency observations, i.e. ∆̃1 = ∆̃2 = 0, by suitably redefining the high and low frequency factors. To eliminate this multiplicity of representations, we introduce the following restriction:
Assumption 1. Let KL = 1. If vectors ∆1 and ∆2 are spanned by Λ, then ∆1 = ∆2 = 0.
The next Proposition shows that this identification condition is sufficient to identify the model.
Proposition 2. Let KL = 1 and ∆1 = ∆2 = 0 in the DGP. Then, the model is identifiable.
3.3 Normalization of factor loadings
When the model is identifiable in the sense of Definition 1, we can eliminate the rotation invariance
of high frequency and low frequency factors as in standard latent factor models (see, e.g., Bai and Ng
(2013) for a thorough discussion of identification in latent factor models). In this paper we impose the
diagonality of the variance-covariance matrices of the loadings:
ΣΛ = diag(σ²λ,k),   ΣB = diag(σ²b,k).
Then, the high frequency and low frequency factor processes are identifiable up to a change of signs.
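In practice, once a loading/factor pair has been estimated, the diagonality restriction can be imposed by an orthogonal rotation. The following sketch is our own illustration of one such rotation (it is not part of the paper's procedure); as in the text, signs remain arbitrary.

```python
# Impose a diagonal cross-product on the loadings by an orthogonal rotation (illustrative).
import numpy as np

def impose_diagonal_normalization(Lam_hat, F_hat):
    """Lam_hat: N x K loadings, F_hat: T x K factors. Returns a rotated pair such that
    (Lam Q)'(Lam Q)/N is diagonal, leaving the common component Lam_hat @ F_hat.T unchanged."""
    N = Lam_hat.shape[0]
    _, Q = np.linalg.eigh(Lam_hat.T @ Lam_hat / N)   # orthonormal eigenvectors of Lam'Lam/N
    Q = Q[:, ::-1]                                   # order by decreasing eigenvalue
    # Since Lam f_t = (Lam Q)(Q' f_t), rotating the loadings requires rotating the factors too.
    return Lam_hat @ Q, F_hat @ Q
```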
4 Estimation

4.1 The estimators of the factor values
The estimates of the factor values are obtained by an iterative estimation procedure. At each iteration
the HF and LF factors are estimated in two separate steps by Principal Component Analysis (PCA)
applied to suitable matrices of HF and LF residuals. The main idea is that from the model in equation
(1) residuals xj,t − ∆j gt satisfy a factor model with factor fj,t in high frequency, and residuals yt −
Ω1 f1,t − Ω2 f2,t satisfy a factor model with factor gt in low frequency.
Iteration p consists of the following two steps:
1. Define Ĝ(p−1) = [g̃1, ..., g̃T]′ as the (T × KL) matrix of estimated LF factors obtained in the previous iteration. Regress each sub-panel of the HF observations on Ĝ(p−1) to obtain the estimated loadings matrices ∆̂1 and ∆̂2, and the residuals:

ξ̂j,t = xj,t − ∆̂j g̃t,   j = 1, 2.

Collecting the residuals in the (2T × NH) matrix

Ξ̂ = [ξ̂1,1, ξ̂2,1, ..., ξ̂1,T, ξ̂2,T]′,

the (2T × KH) matrix F̂(p) = [f̂1,1, f̂2,1, ..., f̂1,T, f̂2,T]′ of estimated HF factor values is obtained by PCA:

(1/(2 NH T)) Ξ̂ Ξ̂′ F̂(p) = F̂(p) V̂F,   (11)

where V̂F is the diagonal matrix of the eigenvalues. The estimated HF loadings matrix Λ̂ is obtained from the high frequency least squares regression of xτ on factor f̂τ for τ = 1/2, 1, ..., T, where xτ = x1,t and f̂τ = f̂1,t for τ = t − 1/2, and xτ = x2,t and f̂τ = f̂2,t for τ = t and t = 1, 2, ..., T.
2. Define
\[
\hat F^{*(p)} = \Big[\hat F_1^{(p)} \;\vdots\; \hat F_2^{(p)}\Big]
= \begin{pmatrix} \hat f_{1,1}' & \hat f_{2,1}' \\ \vdots & \vdots \\ \hat f_{1,T}' & \hat f_{2,T}' \end{pmatrix},
\]
as the (T × 2KH) matrix of estimated HF factors obtained in the previous step, where the factor values of the two subperiods are stacked horizontally. Regress the LF observations yt on F̂*(p) to obtain the (T × NL) matrix of residuals

Ψ̂ = [ψ̂1, ..., ψ̂T]′,

where

ψ̂t = yt − Ω̂1 f̂1,t − Ω̂2 f̂2,t,   t = 1, ..., T,

with Ω̂1 and Ω̂2 being the matrices of estimated loadings. The estimated LF factors Ĝ(p) = [ĝ1, ..., ĝT]′ are obtained performing PCA:

(1/(NL T)) Ψ̂ Ψ̂′ Ĝ(p) = Ĝ(p) V̂G,   (12)

where V̂G is the diagonal matrix of the eigenvalues. The estimated LF loadings matrix B̂ is obtained from the low frequency least squares regression of yt on ĝt. By construction, the estimated factors ĝt are orthogonal to (f̂1,t′, f̂2,t′)′.
The procedure is iterated replacing Ĝ(p−1) with Ĝ(p) in step 1, and can be initialized performing the PCA in step 1 with ξ̂j,t = xj,t, i.e. with Ĝ(0) = 0.
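The following Python sketch illustrates the two-step iteration above for KH = KL = 1. It is a schematic implementation under our own simplifying assumptions (numpy only, balanced panels x1, x2, y such as those simulated earlier), and not the authors' code.

```python
# Schematic implementation of the two-step iterative PCA estimator of Section 4.1 (m = 2).
import numpy as np

def top_pcs(R, k):
    """First k principal components of the (rows x N) residual matrix R, computed from
    R R' / (N * rows) as in equations (11)-(12), scaled so that F'F = rows * I_k."""
    rows, N = R.shape
    _, eigvec = np.linalg.eigh(R @ R.T / (N * rows))
    return eigvec[:, ::-1][:, :k] * np.sqrt(rows)

def estimate_factors(x1, x2, y, KH=1, KL=1, n_iter=20):
    """Iterate step 1 (HF factors by PCA on HF residuals) and step 2 (LF factors by PCA on
    LF residuals), starting from G_hat^(0) = 0 as described in the text."""
    T, NH = x1.shape
    G = np.zeros((T, KL))
    for _ in range(n_iter):
        # Step 1: purge the LF factor from each HF sub-panel, then PCA on the stacked residuals.
        resids = []
        for xj in (x1, x2):
            Delta_j = np.linalg.lstsq(G, xj, rcond=None)[0] if G.any() else np.zeros((KL, NH))
            resids.append(xj - G @ Delta_j)
        Xi = np.empty((2 * T, NH))            # rows ordered xi_{1,1}, xi_{2,1}, ..., xi_{1,T}, xi_{2,T}
        Xi[0::2], Xi[1::2] = resids[0], resids[1]
        F = top_pcs(Xi, KH)                   # (2T x KH) estimated HF factors, equation (11)
        F1, F2 = F[0::2], F[1::2]
        # Step 2: purge the HF factors from the LF panel, then PCA on the residuals.
        Fstar = np.hstack([F1, F2])           # (T x 2 KH), subperiods stacked horizontally
        Omega = np.linalg.lstsq(Fstar, y, rcond=None)[0]
        G = top_pcs(y - Fstar @ Omega, KL)    # (T x KL) estimated LF factors, equation (12)
    return F1, F2, G
```

The initialization Ĝ(0) = 0 makes the first pass of step 1 a plain PCA on the raw high frequency panel, as stated in the text.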
4.2 Estimation of the factor dynamics
The free parameters of the factor dynamics can be estimated by using the reduced form of the VAR(1)
model in equation (3) and replacing the unobservable factor values with their estimates obtained in
Section 4.1. The reduced form of the VAR(1) model in equation (3) is given by (see Ghysels (2012)):

\[
\begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
=
\begin{pmatrix} 0 & R_H & A_1 \\ 0 & R_H^2 & R_H A_1 + A_2 \\ M_1 & M_2 & R_L \end{pmatrix}
\begin{pmatrix} f_{1,t-1} \\ f_{2,t-1} \\ g_{t-1} \end{pmatrix}
+ \zeta_t,
\tag{13}
\]
where ζt is a zero-mean white noise process with variance-covariance matrix Σζ given in Equation
(A.17) in Appendix A.2. Let us denote by θ ∈ Rp , say, the parameter vector collecting the elements
in the matrices A1 , A2 , RH , RL , M1 and M2 . By using the normalization restrictions on the factor
process given in Equations (A.9)-(A.14), matrix Σζ in Equation (A.17) can be written in terms of
vector θ, i.e. Σζ = Σζ (θ). Then, the reduced-form factor dynamics in Equation (13) becomes:
\[
z_t = C(\theta)\, z_{t-1} + \zeta_t,
\tag{14}
\]
where zt = [f1,t′, f2,t′, gt′]′ is the vector of stacked factors, matrix C(θ) is the autoregressive matrix in Equation (13) written as a function of θ, and V(ζt) = Σζ(θ). The parameter θ is subject to the constraint θ ∈ Θ, where Θ ⊂ Rp is the set of parameter values that satisfy matrix equation (5). We estimate parameter θ by constrained Gaussian Pseudo Maximum Likelihood (PML), replacing the unobserved factor values f1,t, f2,t and gt with their estimates f̂1,t, f̂2,t and ĝt for all t = 1, ..., T. The estimator of parameter θ is:
\[
\hat\theta = \arg\max_{\theta \in \Theta} \hat Q_T(\theta),
\tag{15}
\]
where the criterion Q̂T(θ) is the Gaussian log-likelihood function:
\[
\hat Q_T(\theta) = -\frac{1}{2} \log \big|\Sigma_\zeta(\theta)\big| - \frac{1}{2T} \sum_{t=2}^{T} \big[\hat z_t - C(\theta)\hat z_{t-1}\big]' \,\Sigma_\zeta(\theta)^{-1} \big[\hat z_t - C(\theta)\hat z_{t-1}\big],
\tag{16}
\]
and involves the factor estimates.
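For concreteness, here is a minimal sketch of the criterion (16). The mappings θ ↦ C(θ) and θ ↦ Σζ(θ), and the handling of the constraint θ ∈ Θ, are assumed to be supplied by the user; nothing below is specific to the authors' implementation.

```python
# Gaussian pseudo log-likelihood of equation (16), evaluated at the estimated factors.
import numpy as np

def pml_criterion(theta, z_hat, C_of_theta, Sigma_of_theta):
    """z_hat is a T x (2*KH + KL) array with rows (f1_hat_t, f2_hat_t, g_hat_t)."""
    C = C_of_theta(theta)
    Sigma = Sigma_of_theta(theta)
    _, logdet = np.linalg.slogdet(Sigma)
    Sigma_inv = np.linalg.inv(Sigma)
    T = z_hat.shape[0]
    resid = z_hat[1:] - z_hat[:-1] @ C.T                      # VAR(1) residuals for t = 2, ..., T
    quad = np.einsum('ti,ij,tj->', resid, Sigma_inv, resid)   # sum of quadratic forms
    return -0.5 * logdet - quad / (2 * T)
```

The estimator (15) maximizes this criterion over Θ; in practice the restriction (5) can be enforced through a constrained numerical optimizer.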
5 Large sample properties of the estimators
Let us assume that the HF and LF factors are one-dimensional, i.e. KH = KL = 1. The next Proposition provides the linearization of the iterative estimators defined by equations (11) and (12) around the
true factor values.
Proposition 3. For large NH, NL and T, the estimators F̂(p) and Ĝ(p) satisfy the linearized iteration step:
\[
\hat F^{(p)} \hat h_F^{-1} - F = \eta_F + L_F \big(\hat G^{(p-1)} \hat h_G^{-1} - G\big),
\qquad
\hat G^{(p)} \hat h_G^{-1} - G = \eta_G + L_G \big(\hat G^{(p-1)} \hat h_G^{-1} - G\big),
\]
for some random positive scalars ĥF and ĥG, where the random vectors ηF and ηG are such that ‖ηF‖/√T = Op(T−1/2) and ‖ηG‖/√T = Op(T−1/2), and F = (F1′, F2′)′ and G are the (2T × 1) and (T × 1) vectors of the true values of the HF and LF factors. The (T × T) matrix LG has (asymptotically) the eigenvalues:

• 0, associated with the eigenvector G,

• 1, with multiplicity 2, associated with the eigenspace spanned by F1 + 2(w1 + φ w2)G and F2 + 2(w1 φ + w2)G,

• w1 d1 + w2 d2, with multiplicity T − 3, associated with the eigenspace that is the orthogonal complement of the linear space spanned by F1, F2 and G.
The constants w1, w2, d1 and d2 are defined as:
\[
w_j = \lim_{N_L \to \infty} \Big(\frac{B'B}{N_L}\Big)^{-1} \frac{B'\Omega_j}{N_L},
\qquad
d_j = \lim_{N_H \to \infty} \Big(\frac{\Lambda'\Lambda}{N_H}\Big)^{-1} \frac{\Lambda'\Delta_j}{N_H},
\qquad j = 1, 2,
\]
and φ = Cov(f1,t, f2,t) is the stationary autocorrelation of the HF factor.
The proof of Proposition 3 is given in Appendix C.
Proposition 4 provides the consistency of the factor value estimates at rate √T. We use the root mean squared error criterion to assess convergence of the factor estimates at different dates.

Proposition 4. Assuming NH, NL, T → ∞, s.t. NH NL ≥ T and other regularity conditions:
\[
T^{-1/2}\, \big\| \hat F \hat h_F^{-1} - F \big\| + T^{-1/2}\, \big\| \hat G \hat h_G^{-1} - G \big\| = O_p\Big(\frac{1}{\sqrt{T}}\Big),
\]
where F and G are the vectors of the true factor values.
The proof of Proposition 4 is given in Appendix C.
Proposition 5. Assuming NH, NL, T → ∞, s.t. NH NL ≥ T and other regularity conditions:
\[
\| \hat\theta - \theta \| = O_p\Big(\frac{1}{\sqrt{T}}\Big).
\]
The proof of Proposition 5 is given in Appendix C.
6 Monte Carlo analysis

[...]
7 Empirical application

[...]

8 Conclusions

[...]
References

Anderson, B., M. Deistler, E. Felsenstein, B. Funovits, P. Zadrozny, M. Eichler, W. Chen, and M. Zamani (2012): “Identifiability of regular and singular multivariate autoregressive models from mixed frequency data,” in Decision and Control (CDC), 2012 IEEE 51st Annual Conference on, pp. 184–189. IEEE.

Aruoba, S. B., F. X. Diebold, and C. Scotti (2009): “Real-time Measurement of Business Conditions,” Journal of Business and Economic Statistics, 27(4), 417–427.

Bai, J. (2003): “Inferential Theory for Factor Models of Large Dimensions,” Econometrica, 71, 135–171.

Bai, J., and S. Ng (2002): “Determining the Number of Factors in Approximate Factor Models,” Econometrica, 70(1), 191–221.

Bai, J., and S. Ng (2006): “Confidence Intervals for Diffusion Index Forecasts and Inference for Factor-Augmented Regressions,” Econometrica, 74(4), 1133–1150.

Bai, J., and S. Ng (2013): “Principal Components Estimation and Identification of Static Factors,” Journal of Econometrics, 176(1), 18–29.

Banbura, M., and G. Rünstler (2011): “A Look into the Factor Model Black Box: Publication Lags and the Role of Hard and Soft Data in Forecasting GDP,” International Journal of Forecasting, 27, 333–346.

Foerster, A. T., P.-D. G. Sarte, and M. W. Watson (2011): “Sectoral versus Aggregate Shocks: A Structural Factor Analysis of Industrial Production,” Journal of Political Economy, 119(1), 1–38.

Forni, M., M. Hallin, M. Lippi, and L. Reichlin (2000): “The generalized dynamic-factor model: Identification and estimation,” Review of Economics and Statistics, 82, 540–554.

Forni, M., and L. Reichlin (1998): “Let’s Get Real: A Factor Analytical Approach to Disaggregated Business Cycle Dynamics,” The Review of Economic Studies, 65, 453–473.

Frale, C., and L. Monteforte (2010): “FaMIDAS: A Mixed Frequency Factor Model with MIDAS Structure,” Government of the Italian Republic, Ministry of Economy and Finance, Department of the Treasury Working Paper, 3.

Ghysels, E. (2012): “Macroeconomics and the Reality of Mixed Frequency Data,” Unpublished Manuscript.

Gourieroux, C., and A. Monfort (1995): Statistics and Econometric Models. Cambridge University Press.

Horn, R. A., and C. R. Johnson (2013): Matrix Analysis. Cambridge University Press.

Magnus, J. R., and H. Neudecker (2007): Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley and Sons: Chichester/New York.

Marcellino, M., and C. Schumacher (2010): “Factor MIDAS for Nowcasting and Forecasting with Ragged-Edge Data: A Model Comparison for German GDP,” Oxford Bulletin of Economics and Statistics, 72(4), 518–550.

Mariano, R. S., and Y. Murasawa (2003): “A New Coincident Index of Business Cycles Based on Monthly and Quarterly Series,” Journal of Applied Econometrics, 18(4), 427–443.

Nunes, L. C. (2005): “Nowcasting Quarterly GDP Growth in a Monthly Coincident Indicator Model,” Journal of Forecasting, 24(8), 575–592.

Stock, J. H., and M. W. Watson (2002): “Macroeconomic Forecasting Using Diffusion Indexes,” Journal of Business and Economic Statistics, 20, 147–162.

Stock, J. H., and M. W. Watson (2010): “Dynamic Factor Models,” in Oxford Handbook of Economic Forecasting, ed. by M. P. Clements and D. F. Hendry, pp. 87–115. Oxford University Press.
TABLES
Table 1: Estimated number of factors (HF data: IP indexes, LF: non-IP real value added GDP)

Growth rates of indexes:

        [Y X1]  [Y X2]  [Y X3]  [Y X4]  [Y X1:4]  [Y XLF]  [XLF]  [XHF]  [Y]
ICp1       1       2       1       2        3        2       2      2     1
ICp2       1       2       1       1        3        1       2      1     1

Innovations to sectoral productivity (εt in Foerster, Sarte, and Watson (2011)):

        [εY εX,1]  [εY εX,2]  [εY εX,3]  [εY εX,4]  [εY εX,1:4]  [εY εX,LF]  [εX,LF]  [εX,HF]  [εY]
ICp1        1          2          1          1           1           2          3        2      1
ICp2        1          1          1          1           1           1          2        1      1
In the table we display the estimated number of latent factors for different panels of mixed frequency data, using the
information criteria ICp1 and ICp2 proposed by Bai and Ng (2002). In the first 2 lines, the two panels of observable
variables have the following dimensions: NH = 117, NL = 42, T = 35. The notation [Y Xi ] indicates that panel Y
and panel Xi are stacked together in a unique panel, and the number of latent factors is determined in this new panel. Y
denotes the panel of LF (yearly) observations of growth rates of real value added GDP for the sample period 1977-2011,
for the following 42 sectors: 35 services, Construction, Farms, Forestry-Fishing and related activities, General government
(federal), Government enterprises (federal), General government (states and local) and Government enterprises (states and
local). Xi denotes the panel of HF (quarterly) observations of growth rates for the sample period 1977.Q1-2011.Q4, for the
117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), for quarter i, for i = 1, 2, 3, 4. XHF
denotes the 4T × NH panel of HF observations for all quarters in the sample. XLF denotes the panel of HF observations
of growth rates aggregated as a panel of LF observations (the aggregation is performed by taking the mean of the quarterly
observations). X1:4 denotes the T × 4NH panel of HF observations for all quarters in the sample, with observations
of different quarters stacked along the columns. In our model, the number of factors is KL + KH for panels [Y Xi ],
i = 1, 2, 3, 4, KL + 4KH for panel [Y X1:4 ] and KL + KH for panel [Y XLF ].
In the third and fourth line we perform the same type of analysis as in the first two lines, but on the panels of
sectoral productivity shocks (εt in Foerster, Sarte, and Watson (2011)). εX denotes the panel of productivity shocks
for the 117 IP sectors, and εY denotes the panel of productivity shocks for the panel of 38 non-manufacturing sectors
(corresponding to the 42 considered before, excluding the 4 Government related sectors, as capital flows data are
not available for these sectors).
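For reference, the information criteria used in Table 1 can be computed along the following lines. This is a sketch of the ICp1 and ICp2 formulas of Bai and Ng (2002) under our own implementation choices (a standardized T × N panel, with principal components extracted as in Section 4.1).

```python
# Bai and Ng (2002) information criteria ICp1 and ICp2 for a T x N panel X (illustrative sketch).
import numpy as np

def bai_ng_ic(X, k_max=8):
    T, N = X.shape
    eigval, eigvec = np.linalg.eigh(X @ X.T / (N * T))
    F_all = eigvec[:, ::-1] * np.sqrt(T)             # factors normalized so that F'F/T = I
    ic1, ic2 = [], []
    for k in range(1, k_max + 1):
        F = F_all[:, :k]
        Lam = X.T @ F / T                            # OLS loadings given the factors
        V = np.mean((X - F @ Lam.T) ** 2)            # average squared idiosyncratic residual
        penalty = k * (N + T) / (N * T)
        ic1.append(np.log(V) + penalty * np.log(N * T / (N + T)))   # ICp1(k)
        ic2.append(np.log(V) + penalty * np.log(min(N, T)))         # ICp2(k)
    return np.argmin(ic1) + 1, np.argmin(ic2) + 1    # estimated numbers of factors
```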
TABLE 1 BIS: Estimated number of factors (HF data: IP indexes, LF: non-IP real GROSS OUTPUT)

Growth rates of indexes:

        [Y X1]  [Y X2]  [Y X3]  [Y X4]  [Y X1:4]  [Y XLF]  [XLF]  [XHF]  [Y]
ICp1       1       1       2       2        2        2       2      1    15
ICp2       1       1       2       2        2        2       1      1     2

Innovations to sectoral productivity (εt in Foerster, Sarte, and Watson (2011)):

        [εY εX,1]  [εY εX,2]  [εY εX,3]  [εY εX,4]  [εY εX,1:4]  [εY εX,LF]  [εX,LF]  [εX,HF]  [εY]
ICp1        1          1          1          1           1           2          3        1     15
ICp2        1          1          1          1           1           1          1        1      1
In the table we display the estimated number of latent factors for different panels of mixed frequency data, using the
information criteria ICp1 and ICp2 proposed by Bai and Ng (2002). In the first 2 lines, the two panels of observable
variables have the following dimensions: NH = 117, NL = 38, T = 24. The notation [Y Xi ] indicates that panel Y and
panel Xi are stacked together in a unique panel, and the number of latent factors is determined in this new panel. Y denotes
the panel of LF (yearly) observations of growth rates of real GROSS OUTPUT for the sample period 1988-2011, for the
following 38 sectors: 35 services, Construction, Farms, Forestry-Fishing and related activities. Xi denotes the panel of HF
(quarterly) observations of growth rates for the sample period 1988.Q1-2011.Q4, for the 117 industrial production indexes
considered by Foerster, Sarte, and Watson (2011), for quarter i, for i = 1, 2, 3, 4. XHF denotes the 4T × NH panel of
HF observations for all quarters in the sample. XLF denotes the panel of HF observations of growth rates aggregated as a
panel of LF observations (the aggregation is performed by taking the mean of the quarterly observations). X1:4 denotes the
T × 4NH panel of HF observations for all quarters in the sample, with observations of different quarters stacked along the
columns. In our model, the number of factors is KL + KH for panels [Y Xi ], i = 1, 2, 3, 4, KL + 4KH for panel [Y X1:4 ]
and KL + KH for panel [Y XLF ].
In the third and fourth line we perform the same type of analysis as in the first two lines, but on the panels of sectoral
productivity shocks (εt in Foerster, Sarte, and Watson (2011)). εX denotes the panel of productivity shocks for the
117 IP sectors, and εY denotes the panel of productivity shocks for the panel of 38 non-manufacturing sectors. For
productivity innovations, T = 23, as the innovation for the first year in the sample cannot be computed.
Table 2: Regressions of HF and LF observables on 1 HF and 1 LF factors: quantiles of adjusted R² (HF data: IP indexes, LF: non-IP real value added GDP).

a) R̄² Quantile. OBSERVABLES: growth rates of indexes. FACTORS: extracted from original data.

Obs.  Factors    10%    25%    50%    75%    90%
Y     LF         -3.0   -2.3    4.4   11.2   22.0
Y     LF, HF      0.5    8.7   24.5   46.0   56.1
Y     HF         -4.0    3.9   15.8   33.2   45.2
X     LF         -2.5   -2.0   -1.2    0.4    2.2
X     HF, LF      0.7    5.9   24.8   38.2   57.1
X     HF         -0.0    5.3   25.8   38.3   57.0

b) R̄² Quantile. OBSERVABLES: εX and εY, i.e. sectoral productivity innovations (εt in Foerster, Sarte, and Watson (2011)). FACTORS: extracted from sectoral productivity innovations.

Obs.  Factors    10%    25%    50%    75%    90%
εY    LF         -2.8   -1.6    5.9   11.9   26.7
εY    LF, HF     -0.5    7.4   12.5   38.3   49.0
εY    HF         -3.9    1.0    4.7   18.9   35.8
εX    LF         -2.0   -1.4   -0.6    1.6    3.3
εX    HF, LF     -0.9    2.3    9.9   22.5   40.8
εX    HF         -0.6    1.0    8.2   20.1   40.6

c) R̄² Quantile. OBSERVABLES: growth rates of indexes. FACTORS: extracted from sectoral productivity innovations.

Obs.  Factors    10%    25%    50%    75%    90%
Y     LF         -2.7   -0.4    5.3   17.0   29.8
Y     LF, HF      3.4    6.4   16.2   48.5   60.2
Y     HF         -2.3    2.7    9.5   23.6   38.2
X     LF         -2.1   -1.2    0.6    2.2    5.0
X     HF, LF      0.9    4.2   21.3   35.8   52.4
X     HF         -0.1    3.0   20.2   32.2   49.2

d) R̄² Quantile. OBSERVABLES: growth rates of indexes. FACTORS: extracted from sectoral productivity innovations and their lagged values (only lag 1 is considered).

Obs.  Factors    10%    25%    50%    75%    90%
Y     LF         -2.6    0.7   10.0   22.4   36.3
Y     LF, HF      2.7   10.2   23.3   54.1   63.1
Y     HF         -4.0    0.1   12.1   28.7   49.7
X     LF         -3.7   -2.7   -0.4    2.4    5.3
X     HF, LF     -0.3    5.7   23.8   40.8   60.0
X     HF          0.2    4.4   21.7   37.7   56.7
TABLE 2: description of dataset and methodology (HF data: IP indexes, LF: non-IP real value added
GDP)
In the table we display the quantiles of the empirical distributions of the adjusted R², denoted R̄², for different sets of time series regressions.

Panel a)
The regressions in the first three lines involve the real GDP growth rates of the 42 sectors (35 services, Construction, Farms, Forestry-Fishing and related activities + 4 Government sectors) as dependent variables, while the regressions in the last three lines involve the growth of the 117 industrial production indexes as dependent variables. The factors used as explanatory variables are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model (MFFM) with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4. In lines 1 and 4 we report the quantiles of R̄² of the regressions using as explanatory variable the estimated LF factor only. In lines 2 and 5 we report the quantiles of R̄² of the regressions using as explanatory variables the estimated LF and HF factors. In lines 3 and 6 we report the quantiles of R̄² of the regressions using as explanatory variable the estimated HF factor only. The regressions in lines 2 and 3 are unrestricted MIDAS regressions. The regressions in lines 4 and 5 allow the estimated coefficients of the LF factor to be different at each quarter.
Panel b)
The regressions in the first three lines involve the productivity innovations of the 38 non-IP sectors (35 services, Construction, Farms, Forestry-Fishing and related activities) as dependent variables, while the regressions in the last three lines involve the productivity innovations of the 117 industrial production indexes as dependent variables. The factors used as explanatory variables are estimated from the panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model (MFFM) with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1978.Q1-2011.Q4, because the productivity shocks cannot be computed for the first year of the sample (see Foerster, Sarte, and Watson (2011), especially their equation (B38) on page 10 of their Appendix B).
Panel c)
The regressions in the first three lines involve the real GDP growth of the 38 non-IP sectors (35 services, Construction,
Farms, Forestry-Fishing and related activities) as dependent variables, while the regressions in the last three lines involve
the growth of the 117 industrial production indexes as dependent variables. The factors used as explanatory variables are
estimated from the panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson (2011), using a
mixed frequency factor model (MFFM) with KH = KL = 1. The sample period for the estimation of both the factor
model and the regressions is 1978.Q1-2011.Q4.
Panel d)
The regressions in the first three lines involve the real GDP growth of 38 non-IP sectors (35 services, Construction, Farms, Forestry-Fishing and related activities) as dependent variables, while the regressions in the last three lines involve the growth of the 117 industrial production indexes as dependent variables. The factors used as explanatory variables are estimated from the panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model (MFFM) with KH = KL = 1. Both the contemporaneous and lagged values (only lag 1 is included) of the factors are used as explanatory variables. The choice of including the lags of the factors as regressors is justified by equations (10) and (12) in Foerster, Sarte, and Watson (2011). The sample period for the estimation of both the factor model and the regressions is 1979.Q1-2011.Q4.
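As an illustration of the regressions summarized in Table 2, the sketch below builds the unrestricted MIDAS design (the four quarterly values of the HF factor within each year, next to the LF factor) and computes the adjusted R̄². Variable names such as f_hat and g_hat are hypothetical placeholders for the estimated factors; the empirical application uses m = 4.

```python
# Unrestricted MIDAS regression of a yearly series on the quarterly HF factor and the LF factor,
# with adjusted R-bar^2 (illustrative sketch).
import numpy as np

def adj_r2(y, X):
    """Adjusted R^2 of an OLS regression of y on X (a constant is added)."""
    T = len(y)
    Z = np.column_stack([np.ones(T), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    r2 = 1.0 - resid.var() / y.var()
    k = Z.shape[1] - 1                               # number of slope coefficients
    return 1.0 - (1.0 - r2) * (T - 1) / (T - k - 1)

def midas_regressors(f_quarterly, g_yearly):
    """Stack the four quarterly HF factor values of each year next to the yearly LF factor."""
    F = f_quarterly.reshape(-1, 4)                   # one row per year: quarters 1..4
    return np.column_stack([F, g_yearly])

# Example usage: adj_r2(y, midas_regressors(f_hat, g_hat)) corresponds to the "LF, HF" rows of Table 2.
```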
TABLE 2 BIS: Regressions of HF and LF observables on 1 HF and 1 LF factors: quantiles of adjusted R². (HF data: IP indexes, LF: non-IP real GROSS OUTPUT).

a) R̄² Quantile. OBSERVABLES: indexes growth rates (Y are Gross Output growth rates). FACTORS: extracted from original data.

Obs.  Factors    10%    25%    50%    75%    90%
Y     LF         -3.6   -0.3    4.8   23.6   31.0
Y     LF, HF     -2.4   28.9   45.3   66.0   80.2
Y     HF         -3.0    7.0   28.0   44.0   63.0
X     LF         -3.6   -3.2   -2.2   -0.6    1.9
X     HF, LF     -1.4    5.7   22.6   40.1   63.2
X     HF          0.4    5.1   21.8   41.2   63.2

b) R̄² Quantile. OBSERVABLES: εX and εY, i.e. sectoral productivity innovations (εt in Foerster, Sarte, and Watson (2011)). FACTORS: extracted from sectoral productivity innovations.

Obs.  Factors    10%    25%    50%    75%    90%
Y     LF         -4.4   -3.7   -1.3   13.0   34.3
Y     LF, HF    -13.1   14.4   32.9   53.4   62.7
Y     HF         -9.2   -0.8   20.7   38.1   52.8
X     LF         -3.9   -3.1   -1.5    0.7    4.0
X     HF, LF     -2.9    0.3    7.8   19.3   34.6
X     HF         -1.0    0.6    4.8   19.6   35.9

c) R̄² Quantile. OBSERVABLES: indexes growth rates (Y are Gross Output growth rates). FACTORS: extracted from sectoral productivity innovations.

Obs.  Factors    10%    25%    50%    75%    90%
Y     LF         -4.2   -0.2    4.7   21.0   43.8
Y     LF, HF      2.2   20.2   44.7   66.4   81.1
Y     HF         -3.1    5.8   20.7   42.9   65.4
X     LF         -4.0   -3.1   -1.7   -0.2    3.5
X     HF, LF     -2.1    2.4   19.7   38.0   54.6
X     HF         -0.3    2.8   19.4   35.8   53.0

d) R̄² Quantile. OBSERVABLES: indexes growth rates (Y are Gross Output growth rates). FACTORS: extracted from sectoral productivity innovations and their lagged values (only lag 1 is considered).

Obs.  Factors    10%    25%    50%    75%    90%
Y     LF         -8.1    1.1    7.9   25.6   52.3
Y     LF, HF     -8.7   24.4   50.2   74.3   84.4
Y     HF         -7.2    0.2   26.4   53.2   70.7
X     LF         -6.9   -5.4   -2.2    0.9    4.5
X     HF, LF     -2.2    6.1   21.3   40.8   56.5
X     HF         -0.1    4.1   21.5   39.3   54.6
TABLE 2 BIS: description of dataset and methodology. (HF data: IP indexes, LF: non-IP real GROSS
OUTPUT)
In the table we display the quantiles of the empirical distributions of the adjusted R², denoted R̄², for different sets of time series regressions.

Panel a)
The regressions in the first three lines involve the GROSS OUTPUT growth rates of the 38 non-IP sectors (35 services, Construction, Farms, Forestry-Fishing and related activities) as dependent variables, while the regressions in the last three lines involve the growth of the 117 industrial production indexes as dependent variables. The factors used as explanatory variables are estimated from the panel of 38 non-IP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model (MFFM) with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1988.Q1-2011.Q4. In lines 1 and 4 we report the quantiles of R̄² of the regressions using as explanatory variable the estimated LF factor only. In lines 2 and 5 we report the quantiles of R̄² of the regressions using as explanatory variables the estimated LF and HF factors. In lines 3 and 6 we report the quantiles of R̄² of the regressions using as explanatory variable the estimated HF factor only. The regressions in lines 2 and 3 are unrestricted MIDAS regressions. The regressions in lines 4 and 5 allow the estimated coefficients of the LF factor to be different at each quarter.
Panel b)
The regressions in the first three lines involve the productivity innovations of the 38 non-IP sectors (35 services, Construction, Farms, Forestry-Fishing and related activities) as dependent variables, while the regressions in the last three lines involve the productivity innovations of the 117 industrial production indexes as dependent variables. Note that productivity innovations are computed using the panel of GROSS OUTPUT GROWTH RATES for the LF observables. The factors used as explanatory variables are estimated from the panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model (MFFM) with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1988.Q1-2011.Q4, because the productivity shocks cannot be computed for the first year of the sample (see Foerster, Sarte, and Watson (2011), especially their equation (B38) on page 10 of their Appendix B).
Panel c)
The regressions in the first three lines involve the GROSS OUTPUT growth of the 38 non-IP sectors (35 services,
Construction, Farms, Forestry-Fishing and related activities) as dependent variables, while the regressions in the last three
lines involve the growth of the 117 industrial production indexes as dependent variables. The factors used as explanatory
variables are estimated from the panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson
(2011), using a mixed frequency factor model (MFFM) with KH = KL = 1. The sample period for the estimation of
both the factor model and the regressions is 1989.Q1-2011.Q4. Note that productivity innovations are computed using
the panel of GROSS OUTPUT GROWTH RATES for the LF observables.
Panel d)
The regressions in the first three lines involve the GROSS OUTPUT growth of 38 non-IP sectors (35 services, Construction,
Farms, Forestry-Fishing and related activities) as dependent variables, while the regressions in the last three lines involve
the growth of the 117 industrial production indexes as dependent variables. The factors used as explanatory variables are
estimated from the panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson (2011), using
a mixed frequency factor model (MFFM) with KH = KL = 1. Both the contemporaneous and lagged values (only lag 1
is included) of the factors are used as explanatory variables. The choice of including the lags of the factors as regressors,
is justified by equations (10) and (12) in Foerster, Sarte, and Watson (2011). The sample period for the estimation of both
the factor model and the regressions is 1990.Q1-2011.Q4. Note that productivity innovations are computed using the
panel of GROSS OUTPUT GROWTH RATES for the LF observables.
Table 3: Adjusted R² of the regression of yearly sectoral GDP growth on the HF factor.

Ten sectors with largest R̄²                                      R̄²
Accommodation                                                   72.06
Truck transportation                                            60.90
Administrative and support services                             56.71
Other transportation and support activities                     47.82
Construction                                                    44.03
Other services, except government                               42.28
Warehousing and storage                                         40.99
Misc. professional, scientific, and technical services          40.23
Funds, trusts, and other financial vehicles                     38.70
Government enterprises (STATES AND LOCAL)                       35.15

Ten sectors with smallest R̄²                                     R̄²
Insurance carriers and related activities                        3.06
Farms                                                            0.35
Forestry, fishing, and related activities                       -1.12
General government (STATES AND LOCAL)                           -1.12
Federal Reserve banks, credit interm., and rel. activities      -1.54
Water transportation                                            -3.99
Ambulatory health care services                                 -4.07
Management of companies and enterprises                         -4.24
Hospitals and nursing and residential care facilities           -7.15
Information and data processing services                        -9.05

Table 4: Adjusted R² of the regression of yearly sectoral GDP growth on the HF and LF factors.

Ten sectors with largest R̄²                                      R̄²
Construction                                                    73.85
Accommodation                                                   72.62
Administrative and support services                             70.69
Truck transportation                                            60.01
Misc. professional, scientific, and technical services          54.39
Wholesale trade                                                 52.75
Retail trade                                                    52.46
Other services, except government                               51.23
Government enterprises (FEDERAL)                                50.09
Computer systems design and related services                    48.84

Ten sectors with smallest R̄²                                     R̄²
Broadcasting and telecommunications                              6.71
Forestry, fishing, and related activities                        6.57
Insurance carriers and related activities                        6.15
Securities, commodity contracts, and investments                 5.54
Motion picture and sound recording industries                    1.14
Information and data processing services                         1.01
Ambulatory health care services                                 -0.65
Federal Reserve banks, credit interm., and rel. activities      -4.65
Water transportation                                            -7.51
Hospitals and nursing and residential care facilities          -10.82

In the table we display the adjusted R², denoted R̄², for the time series regressions of the growth rates of 42 GDP sectoral indexes on the estimated factors. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The regressions in Table 3 involve a LF explained variable and the estimated HF factor. The regressions in Table 4 involve a LF explained variable and both the HF and LF estimated factors. The regressions in both tables are unrestricted MIDAS regressions.
Table 5: Change in adjusted R² of the regression of yearly sectoral GDP growth on the HF and LF factors vs. the regression on the HF factor only.

Ten sectors with largest change in R̄²                             change in R̄²
Social assistance                                                     38.89
Computer systems design and related services                          37.30
General government (STATES AND LOCAL)                                 30.67
Construction                                                          29.82
Government enterprises (FEDERAL)                                      24.52
Rental and leasing services and lessors of intangible assets          23.84
Wholesale trade                                                       22.71
Retail trade                                                          19.41
Management of companies and enterprises                               17.10
Real estate                                                           16.34

Ten sectors with smallest change in R̄²                            change in R̄²
Securities, commodity contracts, and investments                      -2.20
Pipeline transportation                                               -2.24
Air transportation                                                    -2.31
Publishing industries (includes software)                             -2.67
Broadcasting and telecommunications                                   -2.97
Waste management and remediation services                             -2.97
Federal Reserve banks, credit intermediation, and related activities  -3.11
Motion picture and sound recording industries                         -3.22
Water transportation                                                  -3.52
Hospitals and nursing and residential care facilities                 -3.68

In the table we display the difference in the adjusted R², denoted R̄², from the regressions of the growth rates of each sectoral GDP index on the HF and LF estimated factors and on the HF factor only. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both factor model and regressions is 1977.Q1-2011.Q4. These regressions are unrestricted MIDAS regressions.
Table 6: Adjusted R² of the regression of quarterly industrial production growth on the HF factor.

Ten sectors with largest R̄²                                       R̄²
Plastics product                                                 73.22
Household and institutional furniture and kitchen cabinet        69.69
Forging and stamping                                             67.38
Foundries                                                        65.96
Other fabricated metal product                                   65.87
Coating, engraving, heat treating, and allied activities         65.53
Rubber products ex. Tires                                        63.24
Machine shops, turned product, and screw, nut, and bolt          61.33
Other Miscellaneous Manufacturing                                60.14
Other electrical equipment                                       58.64

Ten sectors with smallest R̄²                                      R̄²
Natural gas distribution                                         -0.26
Animal slaughtering and processing                               -0.27
Nonferrous metal (except aluminum) smelting and refining         -0.39
Other Food Except Coffee and Tea                                 -0.43
Aerospace product and parts                                      -0.60
Grain and oilseed milling                                        -0.60
Wineries and Distilleries                                        -0.67
Dairy product (except frozen)                                    -0.69
Fruit and vegetable preserving and specialty food                -0.72
Oil and gas extraction                                           -0.72

Table 7: Adjusted R² of the regression of quarterly industrial production growth on the HF and LF factors.

Ten sectors with largest R̄²                                       R̄²
Plastics product                                                 73.64
Household and institutional furniture and kitchen cabinet        69.40
Forging and stamping                                             66.72
Coating, engraving, heat treating, and allied activities         66.10
Other fabricated metal product                                   65.62
Foundries                                                        65.06
Machine shops, turned product, and screw, nut, and bolt          62.26
Rubber products ex. Tires                                        62.17
Other Miscellaneous Manufacturing                                60.94
Other electrical equipment                                       59.92

Ten sectors with smallest R̄²                                      R̄²
Wineries and Distilleries                                        -0.23
Mining and oil and gas field machinery                           -0.27
Sugar and confectionery product                                  -0.42
Coffee and tea                                                   -0.68
Fruit and vegetable preserving and specialty food                -0.88
Other Food Except Coffee and Tea                                 -1.13
Animal slaughtering and processing                               -1.61
Oil and gas extraction                                           -1.78
Nonferrous metal (except aluminum) smelting and refining         -2.05
Breweries                                                        -2.24

In the table we display the adjusted R², denoted R̄², for the time series regressions of the growth rates of the 117 industrial production indexes on the estimated factors. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The regressions in Table 6 involve a HF explained variable and the estimated HF factor. The regressions in Table 7 involve a HF explained variable and both the HF and LF estimated factors. As the explanatory variables are observable at high frequency, in order to increase the fit of the model we allow the coefficient of the LF factor to be different in each quarter of the same year.
Table 8: Change in adjusted R² of the regression of quarterly industrial production growth on the HF and LF factors vs. the regression on the HF factor only.

Ten sectors with largest change in R̄²                          change in R̄²
Computer and peripheral equipment                                  11.12
Communications equipment                                            6.57
Grain and oilseed milling                                           6.50
Newspaper publishers                                                4.46
Electric power generation, transmission, and distribution           3.95
Railroad rolling stock                                              3.87
Coal mining                                                         3.50
Periodical, book, and other publishers                              3.38
Synthetic dye and pigment                                           3.01
Dairy product (except frozen)                                       2.71

Ten sectors with smallest change in R̄²                         change in R̄²
Industrial machinery                                               -1.73
Coffee and tea                                                     -1.81
Agricultural implement                                             -1.87
Apparel                                                            -1.88
Pulp mills                                                         -1.88
Engine, turbine, and power transmission equipment                  -1.91
Audio and video equipment                                          -2.19
Petroleum refineries                                               -2.42
Mining and oil and gas field machinery                             -2.60
Breweries                                                          -2.90

In the table we display the difference in the adjusted R², denoted R̄², from the regressions of the growth rates of the 117 industrial production indexes on the HF and LF estimated factors and on the HF factor only. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both factor model and regressions is 1977.Q1-2011.Q4. As the explanatory variables are observable at high frequency, in order to increase the fit of the model we allow the coefficient of the LF factor to be different in each quarter of the same year.
Table 9: Adjusted R² of selected indexes on the estimated 1 HF and 1 LF factors (HF data: IP indexes, LF: non-IP real value added GDP).

PANEL a) REGRESSORS: factors extracted from sectoral output growth (X and Y)

Sector                                                            (1) R̄²(HF)  (2) R̄²(LF)  (3) R̄²(HF+LF)  (3)-(1)
HF observations
Industrial Production                                                  89.46       -0.08        90.03        0.57
LF observations
GDP                                                                    60.39       20.22        85.48       25.09
GDP - Manufacturing                                                    74.20       -0.76        75.89        1.69
GDP - Agriculture, forestry, fishing, and hunting                      -0.61        4.85         4.88        5.49
GDP - Construction                                                     44.03       24.88        73.85       29.82
GDP - Wholesale trade                                                  30.04       19.06        52.75       22.71
GDP - Retail trade                                                     33.05       16.06        52.46       19.41
GDP - Transportation and warehousing                                   54.55       -1.43        54.81        0.26
GDP - Information                                                      18.12       -2.54        15.85       -2.26
GDP - Finance, insurance, real estate, rental, and leasing              6.65       21.82        31.69       25.04
GDP - Professional and business services                               47.29       19.17        70.74       23.45
GDP - Educational services, health care, and social assistance        -10.52       -2.96       -14.25       -3.74
GDP - Arts, entert., recreation, accommod., and food services          63.10        1.14        66.57        3.47
GDP - Government                                                       -2.03       12.38        11.98       14.00

PANEL b) REGRESSORS: contemporaneous values of factors extracted from innovations to sectoral productivity (εt in Foerster, Sarte, and Watson (2011)).

Sector                                                            (1) R̄²(HF)  (2) R̄²(LF)  (3) R̄²(HF+LF)  (3)-(1)
HF observations
Industrial Production                                                  69.30        6.08        75.95        6.65
LF observations
GDP                                                                    29.24       31.63        66.45       37.21
GDP - Manufacturing                                                    53.31       10.41        67.13       13.82
GDP - Agriculture, forestry, fishing, and hunting                       3.99       -1.48         2.47       -1.52
GDP - Construction                                                     20.22       30.65        56.00       35.78
GDP - Wholesale trade                                                  15.79       36.68        58.44       42.65
GDP - Retail trade                                                     13.55       54.86        76.81       63.26
GDP - Transportation and warehousing                                   42.09       -0.34        43.18        1.09
GDP - Information                                                      13.60        1.05        15.24        1.64
GDP - Finance, insurance, real estate, rental, and leasing              7.83        4.63        13.37        5.54
GDP - Professional and business services                               31.82       24.78        61.16       29.35
GDP - Educational services, health care, and social assistance         -4.95       10.15         6.55       11.50
GDP - Arts, entert., recreation, accommod., and food services          36.21       30.30        72.27       36.06
GDP - Government                                                        4.23        1.02         5.56        1.34
TABLE 9: Adjusted R² of selected indexes on the estimated 1 HF and 1 LF factors, and their lagged values (HF data: IP indexes, LF: non-IP real value added GDP).

PANEL c) REGRESSORS: factors extracted from innovations to sectoral productivity (εt in Foerster, Sarte, and Watson (2011)), both contemporaneous and lagged values (only first lag).

Sector                                                            (1) R̄²(HF)  (2) R̄²(LF)  (3) R̄²(HF+LF)  (3)-(1)
HF observations
Industrial Production                                                  76.77        2.10        82.91        6.15
LF observations
GDP                                                                    38.30       32.43        70.56       32.26
GDP - Manufacturing                                                    62.49        6.75        69.26        6.77
GDP - Agriculture, forestry, fishing, and hunting                      23.81       -3.63        18.68       -5.13
GDP - Construction                                                     28.67       38.78        63.32       34.64
GDP - Wholesale trade                                                  16.53       37.27        55.03       38.50
GDP - Retail trade                                                     16.14       55.02        73.00       56.86
GDP - Transportation and warehousing                                   54.82       -3.94        53.32       -1.50
GDP - Information                                                      34.39       13.36        35.75        1.36
GDP - Finance, insurance, real estate, rental, and leasing             -0.97        9.11         1.43        2.40
GDP - Professional and business services                               33.43       41.52        68.75       35.32
GDP - Educational services, health care, and social assistance         -4.43       26.08        10.60       15.03
GDP - Arts, entert., recreation, accommod., and food services          35.48       25.02        74.47       38.99
GDP - Government                                                        4.33        1.25        13.85        9.53
TABLE 9: description of dataset and methodology (HF data: IP indexes, LF: non-IP real value added GDP)

In the table we display the adjusted R², denoted R̄², of the regression of growth rates of selected HF and LF indexes on the HF factor (column R̄²(HF)), the LF factor (column R̄²(LF)) and both the HF and LF factors (column R̄²(HF+LF)). The last column displays the difference of the values in column R̄²(HF+LF) and column R̄²(HF), i.e. the increment in the adjusted R² when the LF factor is added as a regressor to the HF factor.

PANEL a)
The GDP indexes used in this table are aggregates of the indexes used to estimate the factors. The factors are estimated from the panel of 42 non-IP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4.

PANEL b)
The GDP indexes are the same as in Panel a). The factors used as explanatory variables are estimated from the panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model (MFFM) with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1978.Q1-2011.Q4, because the productivity shocks cannot be computed for the first year of the sample (see Foerster, Sarte, and Watson (2011), especially their equation (B38) on page 10 of their Appendix B).

PANEL c)
The GDP indexes are the same as in Panel a). The factors used as explanatory variables are estimated from the panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model (MFFM) with KH = KL = 1. Both the contemporaneous and lagged values (only lag 1 is included) of the factors are used as explanatory variables. The choice of including the lags of the factors as regressors is justified by equations (10) and (12) in Foerster, Sarte, and Watson (2011). The sample period for the estimation of both the factor model and the regressions is 1979.Q1-2011.Q4.
TABLE 9 BIS: Adjusted R2 of selected indexes on the estimated 1 HF and 1 LF factors. (HF data: IP
indexes, LF: non-IP real GROSS OUTPUT)
(1)
¯ 2 (HF )
R
Sector
(2)
¯ 2 (LF )
R
(3)
¯ 2 (HF + LF )
R
(3) - (1)
PANEL a)
REGRESSORS: factors extracted from sectoral output growth (X and Y)
HF observations
Industrial Production
89.28
-0.22
90.01
0.73
LF observations
GO (all sectors)
GO - Manifacturing
GO - Agriculture, forestry, fishing, and hunting
GO - Construction
GO - Wholesale trade
GO - Retail trade
GO - Transportation and warehousing
GO - Information
GO - Finance, insurance, real estate, rental, and leasing
GO - Professional and business services
GO - Educational services, health care, and social assistance
GO - Arts, entertainment, recreation, accommodation, and food services
GO - Government
70.75
90.22
-8.85
28.13
85.18
81.68
75.42
25.46
17.72
45.12
3.05
71.14
12.24
16.73
-0.38
0.12
29.35
-3.56
-2.75
3.54
28.87
22.89
33.85
3.28
0.38
-0.52
94.89
94.68
-9.13
65.29
85.52
82.83
83.81
61.91
46.49
88.60
7.15
75.42
12.32
24.14
4.47
-0.28
37.16
0.34
1.15
8.39
36.46
28.78
43.48
4.10
4.28
0.08
PANEL b)
REGRESSORS: contemporaneous values of factors extracted from innovations to sectoral productivity
(εt in Foerster, Sarte, and Watson (2011)).

HF observations
Industrial Production                                                      70.52         9.18         80.60         10.08

LF observations
GO (all sectors)                                                           48.57        27.61         85.09         36.51
GO - Manufacturing                                                         70.00        16.02         93.50         23.50
GO - Agriculture, forestry, fishing, and hunting                          -11.65        -3.17        -16.24         -4.58
GO - Construction                                                          19.12        20.95         45.86         26.74
GO - Wholesale trade                                                       78.27         6.95         91.45         13.18
GO - Retail trade                                                          73.40         4.97         83.86         10.46
GO - Transportation and warehousing                                        68.17         7.51         81.14         12.97
GO - Information                                                            4.57        59.81         78.09         73.51
GO - Finance, insurance, real estate, rental, and leasing                   6.60        13.04         22.88         16.28
GO - Professional and business services                                    37.74        36.51         84.23         46.49
GO - Educational services, health care, and social assistance              12.33         2.55         16.10          3.76
GO - Arts, entert., recreation, accommod., and food services               66.27        -0.54         69.36          3.09
GO - Government                                                            13.12        -1.39         11.95         -1.16
TABLE 9 BIS: Adjusted R2 of selected indexes on the estimated 1 HF and 1 LF factors, and their
lagged values. (HF data: IP indexes, LF: non-IP real GROSS OUTPUT)
Sector                                                                  (1) R̄²(HF)   (2) R̄²(LF)   (3) R̄²(HF+LF)   (3)-(1)

PANEL c)
REGRESSORS: factors extracted from innovations to sectoral productivity
(εt in Foerster, Sarte, and Watson (2011)), both contemporaneous and lagged values (only first lag).

HF observations
Industrial Production                                                      71.41         5.77         80.70          9.29

LF observations
GO (all sectors)                                                           53.15        24.71         84.74         31.59
GO - Manufacturing                                                         74.63        13.09         92.69         18.06
GO - Agriculture, forestry, fishing, and hunting                          -40.11        -7.36        -50.94        -10.83
GO - Construction                                                           6.96        17.98         26.30         19.34
GO - Wholesale trade                                                       81.76         2.47         91.74          9.97
GO - Retail trade                                                          76.44         2.86         82.66          6.22
GO - Transportation and warehousing                                        84.37        15.69         88.58          4.20
GO - Information                                                            7.93        64.94         95.04         87.11
GO - Finance, insurance, real estate, rental, and leasing                  14.64        13.83         28.03         13.39
GO - Professional and business services                                    41.01        41.96         82.76         41.75
GO - Educational services, health care, and social assistance              -4.03         3.42         -0.76          3.27
GO - Arts, entert., recreation, accommod., and food services               74.56        -3.75         71.97         -2.59
GO - Government                                                            75.69        14.80         78.36          2.66
TABLE 9 BIS: description of dataset and methodology. (HF data: IP indexes, LF: non-IP real GROSS
OUTPUT)
In the table we display the adjusted R², denoted R̄², of the regression of growth rates of selected HF and LF indexes on the HF factor (column R̄²(HF)), the LF factor (column R̄²(LF)), and both the HF and LF factors (column R̄²(HF+LF)). The last column displays the difference between the values in column R̄²(HF+LF) and column R̄²(HF), i.e. the increment in the adjusted R² when the LF factor is added as a regressor alongside the HF factor.
PANEL a)
The GDP indexes used in this table are aggregates of the indexes used to estimate the factors. The factors are estimated
from the panel of 38 non-IP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson
(2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the
factor model and the regressions is 1988.Q1-2011.Q4.
PANEL b)
The GROSS OUTPUT indexes are the same as in Panel a). The factors used as explanatory variables are estimated from the
panel of productivity innovations computed as proposed by Foerster, Sarte, and Watson (2011), using a mixed frequency
factor model (MFFM) with KH = KL = 1. The sample period for the estimation of both the factor model and the
regressions is 1989.Q1-2011.Q4, because the productivity shocks cannot be computed for the first year of the sample. Productivity innovations are computed using the panel of GROSS OUTPUT GROWTH RATES for the LF observables.
PANEL c)
Corresponds to PANEL c) in Table 9. The sample period for the estimation of both the factor model and the regressions
is 1990.Q1-2011.Q4. Note that productivity innovations are computed using the panel of GROSS OUTPUT
GROWTH RATES for the LF observables.
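For concreteness, the three adjusted-R² columns and the increment reported in the last column can be obtained as in the following minimal Python sketch. The data here are synthetic placeholders, the variable names y, f_hat and g_hat are ours, and one simple choice (used for illustration only) is to let the four within-year values of the quarterly HF factor enter as separate regressors for an LF index.

```python
import numpy as np
import statsmodels.api as sm

def adj_r2(y, X):
    """Adjusted R-squared of the OLS regression of y on X (a constant is added)."""
    return sm.OLS(y, sm.add_constant(X)).fit().rsquared_adj

rng = np.random.default_rng(0)
T_L = 35                                   # number of LF (annual) observations
g_hat = rng.standard_normal(T_L)           # estimated LF factor (placeholder values)
f_hat = rng.standard_normal((T_L, 4))      # estimated HF factor, 4 within-year quarters per row
y = rng.standard_normal(T_L)               # growth rate of one LF index (placeholder values)

r2_hf = adj_r2(y, f_hat)                                  # column (1): HF factor only
r2_lf = adj_r2(y, g_hat)                                  # column (2): LF factor only
r2_both = adj_r2(y, np.column_stack([f_hat, g_hat]))      # column (3): HF and LF factors
increment = r2_both - r2_hf                               # last column: (3) - (1)
```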
Table 10: Correlation matrix of the estimated HF and LF factors.
          f̂1,t     f̂2,t     f̂3,t     f̂4,t     ĝt
f̂1,t     1.000    0.663    0.254    0.141    0.000
f̂2,t     0.663    1.000    0.668    0.148    0.000
f̂3,t     0.254    0.668    1.000    0.639    0.000
f̂4,t     0.141    0.148    0.639    1.000    0.000
ĝt       0.000    0.000    0.000    0.000    1.000
In the table we display the correlation matrix of the stacked vector of estimated factors (fˆ1,t , fˆ2,t , fˆ3,t , fˆ4,t , gˆt ). The factors
are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and
Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both
the factor model and the regressions is 1977.Q1-2011.Q4.
Table 11: Regressions of observed HF and LF observables on estimated factors: quantiles of λ̂i and b̂i.

Quantile        10%        25%        50%        75%        90%
λ̂i           0.0670     0.2428     0.5116     0.6200     0.7546
b̂i          -0.2474     0.0176     0.2058     0.3664     0.4856

In the table we display the quantiles of the empirical distributions of the estimated loadings λ̂i and b̂i of the HF and LF factors, i.e. the elements of the estimated vectors Λ̂ and B̂, respectively. The factors and the loadings are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4.
Table 12: Estimates of the unconstrained reduced-form VAR (1) model for the factor process.
We estimate the following unconstrained reduced-form VAR(1) on the factor process:
$$
(f_{1,t},\, f_{2,t},\, f_{3,t},\, f_{4,t},\, g_t)' = a + A\,(f_{1,t-1},\, f_{2,t-1},\, f_{3,t-1},\, f_{4,t-1},\, g_{t-1})' + \zeta_t,
\qquad \zeta_t \sim N(0, \Sigma_\zeta). \tag{T.1}
$$
The estimates are given by (standard errors in parentheses; rows correspond to the equations for f1,t, ..., f4,t, gt, columns to the lagged regressors f1,t-1, ..., f4,t-1, gt-1):

Â:
f1,t    -0.45 (0.16)    0.35 (0.23)   -0.06 (0.39)    0.82 (0.17)   -0.09 (0.11)
f2,t    -0.36 (0.23)   -0.03 (0.34)    0.47 (0.57)    0.27 (0.25)   -0.30 (0.16)
f3,t    -0.17 (0.17)   -0.03 (0.25)    0.22 (0.42)   -0.06 (0.19)   -0.10 (0.12)
f4,t    -0.30 (0.28)    0.33 (0.41)    0.14 (0.68)   -0.09 (0.30)    0.08 (0.20)
gt       0.16 (0.23)    0.29 (0.33)   -0.18 (0.55)    0.22 (0.24)    0.36 (0.16)

Σ̂ζ (symmetric; standard errors reported on and below the diagonal only):
 0.3444 (0.0591)
 0.2492 (0.0680)   0.7319 (0.1255)
 0.0986 (0.0465)   0.3615 (0.0788)   0.3981 (0.0683)
 0.0796 (0.0741)   0.1043 (0.1078)   0.4386 (0.0952)   1.0657 (0.1828)
-0.1096 (0.0604)   0.1545 (0.0880)   0.1116 (0.0648)  -0.0434 (0.1039)   0.6865 (0.1177)

The correlation matrix corresponding to the estimated variance-covariance matrix Σ̂ζ is:
 1.0000   0.4964   0.2664   0.1315  -0.2254
 0.4964   1.0000   0.6697   0.1181   0.2180
 0.2664   0.6697   1.0000   0.6733   0.2135
 0.1315   0.1181   0.6733   1.0000  -0.0507
-0.2254   0.2180   0.2135  -0.0507   1.0000

The estimated values of the constant vector â are not reported because they are not significantly different from zero at the 5% level. The VAR(1) model is estimated by OLS equation by equation. Significant estimates at the 5% level are displayed in bold and standard errors are reported in parentheses. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the VAR(1) model is 1977.Q1-2011.Q4.
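As a sketch of the estimation step described in the note (OLS equation by equation on the stacked factor process), the following Python function (name ours) takes a T x 5 array of estimated factors and returns the intercepts, slope matrix, innovation covariance and its correlation matrix; it is a minimal illustration, not the authors' code.

```python
import numpy as np

def var1_ols(F):
    """Equation-by-equation OLS for F_t = a + A F_{t-1} + zeta_t, with F a (T x k) array."""
    Y, X = F[1:], F[:-1]
    X1 = np.column_stack([np.ones(len(X)), X])        # prepend the constant
    B = np.linalg.lstsq(X1, Y, rcond=None)[0]         # (k+1) x k stacked coefficients
    a_hat, A_hat = B[0], B[1:].T                      # A_hat[i, j]: effect of factor j at t-1 on factor i at t
    resid = Y - X1 @ B
    Sigma_hat = resid.T @ resid / (len(Y) - X1.shape[1])
    corr_hat = Sigma_hat / np.sqrt(np.outer(np.diag(Sigma_hat), np.diag(Sigma_hat)))
    return a_hat, A_hat, Sigma_hat, corr_hat
```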
Table 13: Estimates of the constrained reduced-form VAR (1) model for the factor process.
We estimate the following constrained reduced-form VAR(1) on the factor process:
$$
(f_{1,t},\, f_{2,t},\, f_{3,t},\, f_{4,t},\, g_t)' = A\,(f_{1,t-1},\, f_{2,t-1},\, f_{3,t-1},\, f_{4,t-1},\, g_{t-1})' + \zeta_t,
\qquad \zeta_t \sim N(0, \Sigma_\zeta), \tag{T.2}
$$
where:
$$
A = \begin{pmatrix}
0 & 0 & 0 & r_H & a \\
0 & 0 & 0 & r_H^2 & a(1+r_H) \\
0 & 0 & 0 & r_H^3 & a(1+r_H+r_H^2) \\
0 & 0 & 0 & r_H^4 & a(1+r_H+r_H^2+r_H^3) \\
m_1 & m_2 & m_3 & m_4 & r_L
\end{pmatrix}, \tag{T.3}
$$
and, displaying only the lower triangle of the symmetric matrix,
$$
\Sigma_\zeta = \begin{pmatrix}
\sigma_H^2 & & & & \\
\sigma_H^2 r_H & \sigma_H^2(1+r_H^2) & & & \\
\sigma_H^2 r_H^2 & \sigma_H^2 r_H(1+r_H^2) & \sigma_H^2(1+r_H^2+r_H^4) & & \\
\sigma_H^2 r_H^3 & \sigma_H^2 r_H^2(1+r_H^2) & \sigma_H^2 r_H(1+r_H^2+r_H^4) & \sigma_H^2(1+r_H^2+r_H^4+r_H^6) & \\
\rho_{HL}\sigma_H\sigma_L & \rho_{HL}\sigma_H\sigma_L(1+r_H) & \rho_{HL}\sigma_H\sigma_L(1+r_H+r_H^2) & \rho_{HL}\sigma_H\sigma_L(1+r_H+r_H^2+r_H^3) & \sigma_L^2
\end{pmatrix}. \tag{T.4}
$$
The estimates are given by:

Â:
 0.0000   0.0000   0.0000   0.6542  -0.0268
 0.0000   0.0000   0.0000   0.4280  -0.0443
 0.0000   0.0000   0.0000   0.2800  -0.0557
 0.0000   0.0000   0.0000   0.1832  -0.0632
 0.1677   0.2821  -0.1756   0.2207   0.3643

Σ̂ζ:
 0.5623   0.3678   0.2406   0.1574   0.0033
 0.3678   0.8029   0.5252   0.3436   0.0055
 0.2406   0.5252   0.9059   0.5926   0.0070
 0.1574   0.3436   0.5926   0.9499   0.0079
 0.0033   0.0055   0.0070   0.0079   0.6664

Coefficient     Estimate     St. Error
rH               0.6542       0.0651
a               -0.0268       0.0665
m1               0.1677       0.2134
m2               0.2821       0.3008
m3              -0.1756       0.4968
m4               0.2207       0.2233
φ                0.3643       0.1438
σH               0.7498       0.1283
σL               0.8163       0.2743
ρHL              0.0055       0.0962
The VAR (1) model is estimated by Maximum Likelihood. The factors are estimated from the panel of 42 GDP sectors
and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor
model with KH = KL = 1. The sample period for the estimation of both the factor model and the VAR (1) model is
1977.Q1-2011.Q4.
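The mapping from the structural parameters to the constrained matrices in (T.2)-(T.4) can be coded directly; the Python sketch below (function name and signature are ours) only assembles A(θ) and Σζ(θ), while the maximum-likelihood step that produced the estimates above is not reproduced. Evaluating it at the point estimates of Table 13 reproduces Â and Σ̂ζ above up to rounding.

```python
import numpy as np

def constrained_var_matrices(r_H, a, m, r_L, sigma_H, sigma_L, rho_HL):
    """Assemble A and Sigma_zeta of the constrained VAR(1) in (T.2)-(T.4); m = (m1, ..., m4)."""
    powers = r_H ** np.arange(4)                       # 1, r_H, r_H^2, r_H^3
    A = np.zeros((5, 5))
    A[:4, 3] = r_H * powers                            # r_H, r_H^2, r_H^3, r_H^4
    A[:4, 4] = a * np.cumsum(powers)                   # a, a(1+r_H), a(1+r_H+r_H^2), ...
    A[4, :4], A[4, 4] = m, r_L
    S = np.zeros((5, 5))
    for i in range(4):
        for j in range(i, 4):
            # HF block: sigma_H^2 * r_H^(j-i) * (1 + r_H^2 + ... + r_H^(2i))
            S[i, j] = S[j, i] = sigma_H ** 2 * r_H ** (j - i) * np.sum(r_H ** (2 * np.arange(i + 1)))
        # cross block: rho_HL * sigma_H * sigma_L * (1 + r_H + ... + r_H^i)
        S[i, 4] = S[4, i] = rho_HL * sigma_H * sigma_L * np.sum(powers[: i + 1])
    S[4, 4] = sigma_L ** 2
    return A, S
```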
Table 14: Adjusted R2 of the regression of yearly sectoral GDP growth on the HF factor.
Sector                                                                     R̄²
Accommodation                                                            72.06
Truck transportation                                                     60.90
Administrative and support services                                      56.71
Other transportation and support activities                              47.82
Construction                                                             44.03
Other services, except government                                        42.28
Warehousing and storage                                                  40.99
Miscellaneous professional, scientific, and technical services           40.23
Funds, trusts, and other financial vehicles                              38.70
Government enterprises (STATES AND LOCAL)                                35.15
Legal services                                                           33.25
Retail trade                                                             33.05
Wholesale trade                                                          30.04
Air transportation                                                       27.25
Food services and drinking places                                        27.13
Government enterprises (FEDERAL)                                         25.57
Performing arts, spectator sports, museums, and related activities       22.43
Publishing industries (includes software)                                21.69
Amusements, gambling, and recreation industries                          19.53
Real estate                                                              19.38
Rail transportation                                                      18.90
Waste management and remediation services                                12.73
Pipeline transportation                                                  11.90
Computer systems design and related services                             11.54
Educational services                                                     10.49
Broadcasting and telecommunications                                       9.68
Securities, commodity contracts, and investments                          7.74
Social assistance                                                         6.33
Rental and leasing services and lessors of intangible assets              6.16
Motion picture and sound recording industries                             4.35
Transit and ground passenger transportation                               4.02
General government (FEDERAL)                                              3.94
Insurance carriers and related activities                                 3.06
Farms                                                                     0.35
Forestry, fishing, and related activities                                -1.12
General government (STATES AND LOCAL)                                    -1.12
Federal Reserve banks, credit intermediation, and related activities     -1.54
Water transportation                                                     -3.99
Ambulatory health care services                                          -4.07
Management of companies and enterprises                                  -4.24
Hospitals and nursing and residential care facilities                    -7.15
Information and data processing services                                 -9.05

In the table we display the adjusted R², denoted R̄², for the time series regressions of each of the 42 GDP sectors on the estimated HF factor. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4. The regressions in this table are unrestricted MIDAS regressions.
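The note refers to unrestricted MIDAS regressions: each yearly growth rate is regressed on the four within-year values of the quarterly HF factor, each with its own free coefficient (no lag-polynomial restriction). A minimal Python sketch, with the function name ours and the quarterly observations assumed to be in chronological order:

```python
import numpy as np
import statsmodels.api as sm

def umidas_adj_r2(y_annual, f_quarterly):
    """Unrestricted MIDAS: regress the (T_L,) yearly series on the 4 within-year values
    of the (4*T_L,) quarterly HF factor, one coefficient per quarter."""
    X = np.asarray(f_quarterly).reshape(-1, 4)         # row t holds the 4 quarters of year t
    return sm.OLS(y_annual, sm.add_constant(X)).fit().rsquared_adj
```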
Table 15: Adjusted R2 of the regression of yearly sectoral GDP growth on the HF and LF factors.
Sector                                                                     R̄²
Construction                                                             73.85
Accommodation                                                            72.62
Administrative and support services                                      70.69
Truck transportation                                                     60.01
Miscellaneous professional, scientific, and technical services           54.39
Wholesale trade                                                          52.75
Retail trade                                                             52.46
Other services, except government                                        51.23
Government enterprises (FEDERAL)                                         50.09
Computer systems design and related services                             48.84
Other transportation and support activities                              46.02
Social assistance                                                        45.21
Warehousing and storage                                                  44.90
Funds, trusts, and other financial vehicles                              44.86
Legal services                                                           44.49
Government enterprises (STATES AND LOCAL)                                41.52
Real estate                                                              35.72
Food services and drinking places                                        35.51
Rental and leasing services and lessors of intangible assets             30.00
General government (STATES AND LOCAL)                                    29.55
Air transportation                                                       24.94
Performing arts, spectator sports, museums, and related activities       24.11
Rail transportation                                                      20.19
Publishing industries (includes software)                                19.02
Amusements, gambling, and recreation industries                          18.23
Educational services                                                     13.71
Transit and ground passenger transportation                              13.04
Management of companies and enterprises                                  12.87
General government (FEDERAL)                                             11.74
Waste management and remediation services                                 9.76
Pipeline transportation                                                   9.66
Farms                                                                     8.70
Broadcasting and telecommunications                                       6.71
Forestry, fishing, and related activities                                 6.57
Insurance carriers and related activities                                 6.15
Securities, commodity contracts, and investments                          5.54
Motion picture and sound recording industries                             1.14
Information and data processing services                                  1.01
Ambulatory health care services                                          -0.65
Federal Reserve banks, credit intermediation, and related activities     -4.65
Water transportation                                                     -7.51
Hospitals and nursing and residential care facilities                   -10.82

In the table we display the adjusted R², denoted R̄², for the time series regressions of each of the 42 GDP sectors on the estimated HF and LF factors. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4. The regressions in this table are unrestricted MIDAS regressions.
Table 16: Change in adjusted R² of the regression of yearly sectoral GDP growth on the HF and LF factors vs. the regression on the HF factor only.

Sector                                                                  change in R̄²      B̂
Social assistance                                                            38.89        0.59
Computer systems design and related services                                 37.30        0.58
General government (STATES AND LOCAL)                                        30.67        0.53
Construction                                                                 29.82        0.51
Government enterprises (FEDERAL)                                             24.52        0.47
Rental and leasing services and lessors of intangible assets                 23.84        0.47
Wholesale trade                                                              22.71        0.46
Retail trade                                                                 19.41        0.42
Management of companies and enterprises                                      17.10        0.41
Real estate                                                                  16.34        0.40
Miscellaneous professional, scientific, and technical services               14.15        0.37
Administrative and support services                                          13.97        0.36
Legal services                                                               11.25        0.34
Information and data processing services                                     10.06       -0.34
Transit and ground passenger transportation                                   9.02        0.32
Other services, except government                                             8.95        0.30
Food services and drinking places                                             8.38        0.30
Farms                                                                          8.35        0.31
General government (FEDERAL)                                                   7.80       -0.30
Forestry, fishing, and related activities                                      7.69       -0.30
Government enterprises (STATES AND LOCAL)                                      6.37        0.27
Funds, trusts, and other financial vehicles                                    6.16       -0.26
Warehousing and storage                                                        3.90        0.22
Ambulatory health care services                                                3.42       -0.24
Educational services                                                           3.21        0.23
Insurance carriers and related activities                                      3.09        0.23
Performing arts, spectator sports, museums, and related activities             1.68        0.19
Rail transportation                                                            1.29        0.18
Accommodation                                                                  0.56       -0.11
Truck transportation                                                          -0.89        0.06
Amusements, gambling, and recreation industries                               -1.30        0.11
Other transportation and support activities                                   -1.80       -0.00
Securities, commodity contracts, and investments                              -2.20        0.09
Pipeline transportation                                                       -2.24       -0.08
Air transportation                                                            -2.31        0.04
Publishing industries (includes software)                                     -2.67        0.02
Broadcasting and telecommunications                                           -2.97        0.03
Waste management and remediation services                                     -2.97        0.02
Federal Reserve banks, credit intermediation, and related activities          -3.11        0.06
Motion picture and sound recording industries                                 -3.22        0.03
Water transportation                                                          -3.52        0.02
Hospitals and nursing and residential care facilities                         -3.68       -0.01

In the table we display the difference in the adjusted R² (R̄²) between the regressions of each sectoral GDP growth rate on the HF and LF estimated factors and on the HF factor only. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4. The regressions in this table are unrestricted MIDAS regressions.
Table 17: Simulation results for DGP with 1 HF and 1 LF factors and different loading ∆j .
                                                      R² quantiles
Factor     NH      NL      TH      TL        5%     25%     50%     75%     95%

DESIGN 1: ∆j = ∆̂j
HF        117      45     140      35        95      96      97      97      98
HF        498     180     560     140        99      99      99      99      99
LF        117      45     140      35        34      55      65      74      82
LF        498     180     560     140        86      89      92      93      95

DESIGN 2: ∆j = 2 · ∆̂j
HF        117      45     140      35        95      96      97      97      98
HF        498     180     560     140        99      99      99      99      99
LF        117      45     140      35        31      52      63      72      81
LF        498     180     560     140        83      87      90      92      94

DESIGN 3: ∆j = 5 · ∆̂j
HF        117      45     140      35        59      84      91      95      97
HF        498     180     560     140        90      95      96      98      99
LF        117      45     140      35         3      16      31      46      65
LF        498     180     560     140        41      57      68      77      87

We consider three simulation designs for the mixed frequency factor model in equation (1), in the case of 4 HF subperiods, and equations (T.2)-(T.4) in Table 13, and we assume that the numbers of factors are KLF = KHF = 1 both for simulation and in estimation. The number of simulations for each design is 5000. The mixed frequency panels of observations are simulated using the values of the parameters reported in the following table:

Param.    value       Param.    mean       std.dev.      Param.    mean      std.dev.
rH        0.6542      ∆1       -0.0021     0.1610        σε        0.8909    0.1389
a         0.0000      ∆2        0.0197     0.1557        σu        0.7726    0.1343
m1        0.0000      ∆3       -0.0012     0.1450
m2        0.0000      ∆4        0.0040     0.1463
m3        0.0000      B        -0.1735     0.2547
m4        0.0000      Ω1        0.1986     0.2506
φ         0.3643      Ω2        0.1311     0.2874
σH        0.7498      Ω3       -0.1798     0.4990
σL        0.8163      Ω4        0.1302     0.2258
ρHL       0.0000      Λ         0.4378     0.2528

All the simulated loadings, with the exception of ∆j, are drawn from independent normal distributions, with mean and variance equal to the corresponding sample moments of the estimated loadings from our macro dataset, reported in the previous table. Design 1 maintains the same distributions as in our macro dataset to simulate the loadings ∆j, while Design 2 (resp. Design 3) is such that the simulated values of the ∆j loadings are 2 (resp. 5) times larger than in our macro dataset. The variance-covariance matrices of the simulated innovations are diagonal, and their diagonal elements are bootstrapped from the values in the diagonals of the estimated variance-covariance matrices in our macro dataset. The averages and standard deviations of the square roots of the diagonal elements of these estimated matrices are reported in the table, on the lines named σε and σu, respectively. For each simulation design we report one table displaying:
• Line 1: the quantiles of the R² of the regression of the true HF factor on the HF factor estimated from simulated panels with the same TS and CS dimensions as in our macro dataset;
• Line 2: the quantiles of the R² of the regression of the true HF factor on the HF factor estimated from simulated panels such that both the CS and TS dimensions are four times larger than in our macro dataset;
• Line 3: the quantiles of the R² of the regression of the true LF factor on the LF factor estimated from simulated panels with the same TS and CS dimensions as in our macro dataset;
• Line 4: the quantiles of the R² of the regression of the true LF factor on the LF factor estimated from simulated panels such that both the CS and TS dimensions are four times larger than in our macro dataset.
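A compressed sketch of one replication of this simulation design is given below. All names are ours, the innovation standard deviations are taken as scalars for brevity (the paper bootstraps them series by series), and the estimation step and the R² comparison against the true factors are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_mixed_freq_panel(A, Sigma_zeta, Lam, Delta, B, Omega, sig_eps, sig_u, T_L):
    """One replication: factors from the constrained VAR (T.2), then the HF and LF panels.
    Lam: (N_H,), Delta: (N_H, 4), B: (N_L,), Omega: (N_L, 4)."""
    C = np.linalg.cholesky(Sigma_zeta)
    fac = np.zeros((T_L, 5))                           # stacked (f_1, ..., f_4, g) per LF period
    for t in range(1, T_L):
        fac[t] = A @ fac[t - 1] + C @ rng.standard_normal(5)
    f, g = fac[:, :4], fac[:, 4]
    N_H = len(Lam)
    X = np.empty((4 * T_L, N_H))                       # HF panel in chronological (quarterly) order
    for j in range(4):
        X[j::4] = np.outer(f[:, j], Lam) + np.outer(g, Delta[:, j])
    X += sig_eps * rng.standard_normal(X.shape)
    Y = f @ Omega.T + np.outer(g, B) + sig_u * rng.standard_normal((T_L, len(B)))
    return X, Y
```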
FIGURES
Figure 1: The model structure in the case of two high frequency subperiods.
[Schematic diagram: the HF data x1,t and x2,t load on the HF factors f1,t and f2,t through Λ and on the LF factor gt through ∆1 and ∆2; the LF data yt load on the HF factors through Ω1 and Ω2 and on gt through B. The time axis marks Subperiod 1 and Subperiod 2 within Period t.]
The Figure displays the schematic representation of the mixed-frequency factor model described in Section 2.1.
Figure 2: Evolution of sectoral decomposition of US nominal GDP.
[Stacked area chart; vertical axis: Share of nominal GDP (%), 0-100; horizontal axis: Year, 1977-2011; areas labelled Industrial Production, Services, Government and Construction.]
The Figure displays the evolution from 1977 to 2011 of the sectoral decomposition of US nominal GDP. We aggregate the
shares of different sectors available from the website of the US Bureau of Economic Analysis, according to their NAICS
codes, in 5 different macro sectors: Industrial Production (yellow), Services (red), Government (green), Construction
(white), Others (grey).
Figure 3: Adjusted R2 of the regression of yearly sectoral GDP growth on estimated factors.
[Two histograms; vertical axis: Percentage (0-30); horizontal axis: R̄² (-20 to 100).]
(a) Adjusted R² of the regression of yearly sectoral GDP growth on the HF factor (regression: yt on f̂1,t, ..., f̂4,t).
(b) Adjusted R² of the regression of yearly sectoral GDP growth on the HF and LF factors (regression: yt on ĝt, f̂1,t, ..., f̂4,t).
In Panel (a) we show the histogram of the adjusted R², denoted R̄², of the regressions of the yearly growth rates of sectoral GDP indexes on the estimated HF factor. In Panel (b) we show the histogram of the adjusted R² of the regressions of the same growth rates on the estimated HF and LF factors. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4.
Figure 4: Adjusted R2 of the regression of quarterly industrial production growth on estimated factors.
[Two histograms; vertical axis: Percentage (0-30); horizontal axis: R̄² (-20 to 100).]
(a) Adjusted R² of the regression of quarterly industrial production growth on the HF factor.
(b) Adjusted R² of the regression of quarterly industrial production growth on the HF and LF factors.
In Panel (a) we show the histogram of the adjusted R², denoted R̄², of the regressions of the quarterly growth rates of the industrial production indexes on the estimated HF factor. In Panel (b) we show the histogram of the adjusted R² of the regressions of the same growth rates on the estimated HF and LF factors. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4.
Figure 5: Regression of LF and HF indexes on estimated factors.
[Four time series panels, each over 1977-2011; horizontal axis: Date.]
(a) HF Index: Industrial Production Index growth. (b) LF Index: Aggregate GDP Index growth. (c) LF Index: GDP-Construction Index growth. (d) LF Index: GDP-Manufacturing Index growth.
Each panel displays the time series of the growth rate of a certain HF or LF index (solid line), its fitted value obtained from a regression of the index on the HF factor (dotted line), and its fitted value obtained from a regression of the index on both the HF and LF factors (dashed line). The first three indexes reported in the panels are aggregates of the indexes used to estimate the factors. The fourth index (GDP-Manufacturing) is constructed from sub-indexes not used for the estimation of the factors. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of both the factor model and the regressions is 1977.Q1-2011.Q4.
Figure 6: Trajectories and autocorrelation functions of HF and LF factors.
[Four panels: (a)-(b) time series over 1977-2011 (horizontal axis: Date); (c)-(d) sample autocorrelations for lags 0-20 with asymptotic 95% confidence bands.]
(a) HF factor: estimated values. (b) LF factor: estimated values. (c) HF factor: autocorrelation function. (d) LF factor: autocorrelation function.
Panel (a) displays the time series of estimated values of the HF factor. Panel (b) displays the time series of estimated values
of the LF factor. Panel (c) displays the empirical autocorrelation function of the estimated values of the HF factor. Panel (d)
displays the empirical autocorrelation function of the estimated values of the LF factor. The horizontal lines are asymptotic
95% confidence bands. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes
considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample
period for the estimation of the factor model is 1977.Q1-2011.Q4.
Figure 7: Trajectories of HF and LF factors.
[Time series plot over 1977-2011; horizontal axis: Date.]
The Figure displays the time series of estimated values of the HF factor (blue circles) and of the LF factor (red squares). For each year we represent the LF factor as 4 squares corresponding to the 4 quarters, all taking the same value. The factors are estimated from the panel of 42 GDP sectors and 117 industrial production indexes considered by Foerster, Sarte, and Watson (2011), using a mixed frequency factor model with KH = KL = 1. The sample period for the estimation of the factor model is 1977.Q1-2011.Q4.
APPENDIX A: Restrictions on the factor dynamics
In this Appendix we derive restrictions on the structural VAR parameters of the factor dynamics implied by i) the factor normalization and ii) stationarity.
A.1 Implied restrictions from factor normalization
The unconditional variance-covariance matrix of the vector of stacked factors $(f_{1,t}', f_{2,t}', g_t')'$ is (see equation (2)):
$$
V = V\begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
= \begin{pmatrix} I_{K_H} & \Phi & 0 \\ \Phi' & I_{K_H} & 0 \\ 0 & 0 & I_{K_L} \end{pmatrix},
$$
where Φ is the covariance between $f_{1,t}$ and $f_{2,t}$. Moreover, the factor dynamics is given by the structural VAR(1) model (see equation (3)):
$$
\Gamma \begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
= R \begin{pmatrix} f_{1,t-1} \\ f_{2,t-1} \\ g_{t-1} \end{pmatrix}
+ \begin{pmatrix} v_{1,t} \\ v_{2,t} \\ w_t \end{pmatrix}, \tag{A.1}
$$
where:
$$
\Gamma = \begin{pmatrix} I_{K_H} & 0 & 0 \\ -R_H & I_{K_H} & 0 \\ 0 & 0 & I_{K_L} \end{pmatrix},
\qquad
R = \begin{pmatrix} 0 & R_H & A_1 \\ 0 & 0 & A_2 \\ M_1 & M_2 & R_L \end{pmatrix},
$$
and $(v_{1,t}', v_{2,t}', w_t')'$ is a multivariate white noise process with mean 0 and variance-covariance matrix (see equation (4)):
$$
\Sigma = V\begin{pmatrix} v_{1,t} \\ v_{2,t} \\ w_t \end{pmatrix}
= \begin{pmatrix} \Sigma_H & 0 & \Sigma_{HL,1} \\ 0 & \Sigma_H & \Sigma_{HL,2} \\ \Sigma_{HL,1}' & \Sigma_{HL,2}' & \Sigma_L \end{pmatrix}.
$$
By computing the variance on both sides of equation (A.1) we get:
$$
\Gamma V \Gamma' = R V R' + \Sigma. \tag{A.2}
$$
By matrix multiplication:
$$
\Gamma V \Gamma' = \begin{pmatrix}
I_{K_H} & \Phi - R_H' & 0 \\
\Phi' - R_H & I_{K_H} - R_H\Phi - \Phi' R_H' + R_H R_H' & 0 \\
0 & 0 & I_{K_L}
\end{pmatrix},
$$
and:
$$
R V R' = \begin{pmatrix}
R_H R_H' + A_1 A_1' & A_1 A_2' & R_H(\Phi' M_1' + M_2') + A_1 R_L' \\
A_2 A_1' & A_2 A_2' & A_2 R_L' \\
(M_1\Phi + M_2) R_H' + R_L A_1' & R_L A_2' & M_1 M_1' + M_2 M_2' + M_1 \Phi M_2' + M_2 \Phi' M_1' + R_L R_L'
\end{pmatrix}.
$$
Hence from equation (A.2) we get the following system of equations:
$$
\begin{aligned}
I_{K_H} &= R_H R_H' + A_1 A_1' + \Sigma_H, &\text{(A.3)}\\
I_{K_H} - R_H\Phi - \Phi' R_H' + R_H R_H' &= A_2 A_2' + \Sigma_H, &\text{(A.4)}\\
I_{K_L} &= M_1 M_1' + M_2 M_2' + M_1 \Phi M_2' + M_2 \Phi' M_1' + R_L R_L' + \Sigma_L, &\text{(A.5)}\\
\Phi - R_H' &= A_1 A_2', &\text{(A.6)}\\
0 &= R_H(\Phi' M_1' + M_2') + A_1 R_L' + \Sigma_{HL,1}, &\text{(A.7)}\\
0 &= A_2 R_L' + \Sigma_{HL,2}. &\text{(A.8)}
\end{aligned}
$$
These equations imply:
$$
\begin{aligned}
\Sigma_H &= I_{K_H} - R_H R_H' - A_1 A_1', &\text{(A.9)}\\
\Sigma_H &= I_{K_H} - R_H\Phi - \Phi' R_H' + R_H R_H' - A_2 A_2', &\text{(A.10)}\\
\Sigma_L &= I_{K_L} - M_1 M_1' - M_1 \Phi M_2' - M_2 \Phi' M_1' - M_2 M_2' - R_L R_L', &\text{(A.11)}\\
\Phi &= R_H' + A_1 A_2', &\text{(A.12)}\\
\Sigma_{HL,1} &= -R_H(\Phi' M_1' + M_2') - A_1 R_L', &\text{(A.13)}\\
\Sigma_{HL,2} &= -A_2 R_L'. &\text{(A.14)}
\end{aligned}
$$
Let θ be the vector containing the elements of matrices R_H, R_L, A_1, A_2, M_1 and M_2 in the structural VAR(1) model. Equation (A.12) expresses the stationary autocovariance matrix Φ of the HF factor as a function of θ (more precisely, of R_H, A_1 and A_2). Equations (A.9), (A.11), (A.13) and (A.14) express the variance-covariance matrix Σ of the factor innovations as a function of θ. Finally, equation (A.10), together with equations (A.9) and (A.12), implies a restriction on vector θ:
$$
A_1 A_1' - R_H A_1 A_2' - A_2 A_1' R_H' - A_2 A_2' = 0. \tag{A.15}
$$
Thus, the factor dynamics is characterized by the parameter matrices R_H, R_L, A_1, A_2, M_1 and M_2, which are subject to restriction (A.15).
Let us now discuss restriction (A.15) in the case of single HF and LF factors, i.e. K_H = K_L = 1. Equation (A.15) becomes:
$$
A_1^2 - 2 R_H A_1 A_2 - A_2^2 = 0, \tag{A.16}
$$
where A_1, A_2 and R_H are scalars. This equation yields two solutions for A_1 as a function of A_2 and R_H:
$$
A_1 = A_2\left(R_H \pm \sqrt{1 + R_H^2}\right).
$$
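A quick numerical check of the scalar restriction (A.16) and of the two solutions for A_1 can be coded as follows (Python; illustrative parameter values, not estimates from the paper, and the function name is ours):

```python
import numpy as np

def a1_solutions(a2, r_h):
    """The two values of A1 solving the scalar restriction (A.16), given A2 and R_H."""
    root = np.sqrt(1.0 + r_h ** 2)
    return a2 * (r_h + root), a2 * (r_h - root)

a2, r_h = 0.4, 0.65                                    # illustrative values only
for a1 in a1_solutions(a2, r_h):
    print(a1, a1 ** 2 - 2 * r_h * a1 * a2 - a2 ** 2)   # the second number is zero up to rounding
```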
A.2 Stationarity conditions
The stationarity conditions are deduced from the reduced form of the VAR(1) dynamics in (A.1), that is:
$$
\begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
= \Gamma^{-1} R \begin{pmatrix} f_{1,t-1} \\ f_{2,t-1} \\ g_{t-1} \end{pmatrix} + \zeta_t,
$$
where:
$$
\Gamma^{-1} R = \begin{pmatrix} 0 & R_H & A_1 \\ 0 & R_H^2 & R_H A_1 + A_2 \\ M_1 & M_2 & R_L \end{pmatrix},
$$
and $\zeta_t = \Gamma^{-1}(v_{1,t}', v_{2,t}', w_t')'$ is a zero-mean white noise process with variance-covariance matrix
$$
\Sigma_\zeta = \Gamma^{-1}\Sigma(\Gamma^{-1})' = \begin{pmatrix}
\Sigma_H & \Sigma_H R_H' & \Sigma_{HL,1} \\
R_H \Sigma_H & R_H \Sigma_H R_H' + \Sigma_H & \Sigma_{HL,2} + R_H \Sigma_{HL,1} \\
\Sigma_{HL,1}' & \Sigma_{HL,2}' + \Sigma_{HL,1}' R_H' & \Sigma_L
\end{pmatrix}. \tag{A.17}
$$
The stationarity condition is: the eigenvalues of matrix Γ−1 R are smaller than 1 in modulus. When
either M1 = M2 = 0, or A1 = A2 = 0, the stationarity condition becomes: the eigenvalues of matrices
RH and RL are smaller than 1 in modulus.
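The stationarity condition can be checked numerically by assembling Γ⁻¹R from the parameter matrices and inspecting its eigenvalues. A minimal Python sketch (function name ours; scalars are accepted for the K_H = K_L = 1 case):

```python
import numpy as np

def is_stationary(R_H, A_1, A_2, M_1, M_2, R_L):
    """True if all eigenvalues of Gamma^{-1} R are smaller than 1 in modulus (Appendix A.2)."""
    R_H, A_1, A_2, M_1, M_2, R_L = map(np.atleast_2d, (R_H, A_1, A_2, M_1, M_2, R_L))
    Z = np.zeros_like(R_H)
    Ginv_R = np.block([
        [Z,   R_H,        A_1],
        [Z,   R_H @ R_H,  R_H @ A_1 + A_2],
        [M_1, M_2,        R_L],
    ])
    return bool(np.all(np.abs(np.linalg.eigvals(Ginv_R)) < 1))

print(is_stationary(0.65, 0.30, 0.20, 0.10, 0.20, 0.35))   # scalar example
```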
APPENDIX B: Identification
B.1 Proof of Proposition 1
By replacing equation (6) into model (1), we get
$$
\begin{pmatrix} x_{1,t} \\ x_{2,t} \\ y_t \end{pmatrix}
= \begin{pmatrix}
\Lambda A_{11} + \Delta_1 A_{31} & \Lambda A_{12} + \Delta_1 A_{32} & \Lambda A_{13} + \Delta_1 A_{33} \\
\Lambda A_{21} + \Delta_2 A_{31} & \Lambda A_{22} + \Delta_2 A_{32} & \Lambda A_{23} + \Delta_2 A_{33} \\
\Omega_1 A_{11} + \Omega_2 A_{21} + B A_{31} & \Omega_1 A_{12} + \Omega_2 A_{22} + B A_{32} & \Omega_1 A_{13} + \Omega_2 A_{23} + B A_{33}
\end{pmatrix}
\begin{pmatrix} \tilde{f}_{1,t} \\ \tilde{f}_{2,t} \\ \tilde{g}_t \end{pmatrix}
+ \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ u_t \end{pmatrix}. \tag{B.1}
$$
This factor model satisfies the restrictions on the loading matrix displayed in equation (1) if, and only if,
$$
\begin{aligned}
\Lambda A_{12} + \Delta_1 A_{32} &= 0, &\text{(B.2)}\\
\Lambda A_{21} + \Delta_2 A_{31} &= 0, &\text{(B.3)}\\
\Lambda A_{11} + \Delta_1 A_{31} &= \Lambda A_{22} + \Delta_2 A_{32}. &\text{(B.4)}
\end{aligned}
$$
Let us assume that $[\Lambda \,\vdots\, \Delta_1]$ is full column rank for $N_H$ sufficiently large (the argument for the case in which $[\Lambda \,\vdots\, \Delta_2]$ is full column rank is similar). Equation (B.2) can be written as a linear homogeneous system of equations for the elements of matrices $A_{12}$ and $A_{32}$:
$$
[\Lambda \,\vdots\, \Delta_1]\begin{pmatrix} A_{12} \\ A_{32} \end{pmatrix} = 0.
$$
Since $[\Lambda \,\vdots\, \Delta_1]$ is full column rank, it follows that
$$
A_{12} = 0 \quad \text{and} \quad A_{32} = 0. \tag{B.5}
$$
Then, equation (B.4) becomes $\Lambda(A_{11} - A_{22}) + \Delta_1 A_{31} = 0$, that is:
$$
[\Lambda \,\vdots\, \Delta_1]\begin{pmatrix} A_{11} - A_{22} \\ A_{31} \end{pmatrix} = 0. \tag{B.6}
$$
Since $[\Lambda \,\vdots\, \Delta_1]$ is full column rank it follows that:
$$
A_{11} = A_{22}, \qquad \text{(B.7)} \qquad\qquad A_{31} = 0. \qquad \text{(B.8)}
$$
Replacing the last equation in (B.3), and using that matrix Λ is full rank, we get:
$$
A_{21} = 0. \tag{B.9}
$$
Thus, the transformation of the factors that is compatible with the restrictions on the loading matrix in equation (1) is:
$$
\begin{pmatrix} f_{1,t} \\ f_{2,t} \\ g_t \end{pmatrix}
= \begin{pmatrix} A_{11} & 0 & A_{13} \\ 0 & A_{22} & A_{23} \\ 0 & 0 & A_{33} \end{pmatrix}
\begin{pmatrix} \tilde{f}_{1,t} \\ \tilde{f}_{2,t} \\ \tilde{g}_t \end{pmatrix},
\qquad A_{11} = A_{22}.
$$
We can invert this transformation and write:
$$
\begin{aligned}
\tilde{f}_{1,t} &= A_{11}^{-1} f_{1,t} - A_{11}^{-1} A_{13} A_{33}^{-1} g_t, \\
\tilde{f}_{2,t} &= A_{22}^{-1} f_{2,t} - A_{22}^{-1} A_{23} A_{33}^{-1} g_t, \\
\tilde{g}_t &= A_{33}^{-1} g_t.
\end{aligned}
$$
The transformed factors satisfy the normalization restrictions in (2) if, and only if,
$$
\begin{aligned}
\mathrm{Cov}(\tilde{f}_{1,t}, \tilde{g}_t) &= -A_{11}^{-1} A_{13} A_{33}^{-1}(A_{33}^{-1})' = 0, &\text{(B.10)}\\
\mathrm{Cov}(\tilde{f}_{2,t}, \tilde{g}_t) &= -A_{22}^{-1} A_{23} A_{33}^{-1}(A_{33}^{-1})' = 0, &\text{(B.11)}\\
V(\tilde{f}_{1,t}) &= A_{11}^{-1}(A_{11}^{-1})' + A_{11}^{-1} A_{13} A_{33}^{-1}(A_{33}^{-1})' A_{13}'(A_{11}^{-1})' = I_{K_H}, &\text{(B.12)}\\
V(\tilde{f}_{2,t}) &= A_{22}^{-1}(A_{22}^{-1})' + A_{22}^{-1} A_{23} A_{33}^{-1}(A_{33}^{-1})' A_{23}'(A_{22}^{-1})' = I_{K_H}, &\text{(B.13)}\\
V(\tilde{g}_t) &= A_{33}^{-1}(A_{33}^{-1})' = I_{K_L}. &\text{(B.14)}
\end{aligned}
$$
Since the matrices $A_{11} = A_{22}$ and $A_{33}$ are nonsingular, equations (B.10) and (B.11) imply
$$
A_{13} = A_{23} = 0. \tag{B.15}
$$
Then from equations (B.12) - (B.15), we get that the matrices $A_{11} = A_{22}$ and $A_{33}$ are orthogonal.
Q.E.D.
B.2 Proof of Proposition 2
If $\Delta_1 = \Delta_2 = 0$ in the DGP, from (B.1) we get:
$$
\begin{pmatrix} x_{1,t} \\ x_{2,t} \\ y_t \end{pmatrix}
= \begin{pmatrix}
\Lambda A_{11} & \Lambda A_{12} & \Lambda A_{13} \\
\Lambda A_{21} & \Lambda A_{22} & \Lambda A_{23} \\
\Omega_1 A_{11} + \Omega_2 A_{21} + B A_{31} & \Omega_1 A_{12} + \Omega_2 A_{22} + B A_{32} & \Omega_1 A_{13} + \Omega_2 A_{23} + B A_{33}
\end{pmatrix}
\begin{pmatrix} \tilde{f}_{1,t} \\ \tilde{f}_{2,t} \\ \tilde{g}_t \end{pmatrix}
+ \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ u_t \end{pmatrix}. \tag{B.16}
$$
The restrictions on the loading matrices imply:
$$
\Lambda A_{12} = 0, \qquad \Lambda A_{21} = 0, \qquad \Lambda A_{11} = \Lambda A_{22}.
$$
Since Λ is full column rank, it follows $A_{12} = 0$, $A_{21} = 0$ and $A_{11} = A_{22}$. In the transformed model (B.16), the loadings of the LF factor on the HF data are:
$$
\tilde{\Delta}_1 = \Lambda A_{13}, \qquad \tilde{\Delta}_2 = \Lambda A_{23}, \tag{B.17}
$$
that are spanned by Λ. By Assumption 1, it follows $\tilde{\Delta}_1 = 0$ and $\tilde{\Delta}_2 = 0$, and hence $A_{13} = 0$ and $A_{23} = 0$. Then from (6):
$$
\begin{cases}
\tilde{f}_{1,t} = A_{11}^{-1} f_{1,t}, \\
\tilde{f}_{2,t} = A_{22}^{-1} f_{2,t}, \\
\tilde{g}_t = A_{33}^{-1}\left(g_t - A_{31} A_{11}^{-1} f_{1,t} - A_{32} A_{22}^{-1} f_{2,t}\right).
\end{cases} \tag{B.18}
$$
Then:
$$
\begin{aligned}
0 &= \mathrm{Cov}(\tilde{g}_t, \tilde{f}_{1,t}) = -A_{33}^{-1}\left(A_{31} A_{11}^{-1} + A_{32} A_{22}^{-1}\Phi'\right)(A_{11}^{-1})', \\
0 &= \mathrm{Cov}(\tilde{g}_t, \tilde{f}_{2,t}) = -A_{33}^{-1}\left(A_{31} A_{11}^{-1}\Phi + A_{32} A_{22}^{-1}\right)(A_{22}^{-1})'.
\end{aligned}
$$
Thus, we get:
$$
A_{31} A_{11}^{-1} + A_{32} A_{22}^{-1}\Phi' = 0, \qquad A_{31} A_{11}^{-1}\Phi + A_{32} A_{22}^{-1} = 0, \tag{B.19}
$$
which implies:
$$
A_{32} A_{22}^{-1}\left[I_{K_H} - \Phi'\Phi\right] = 0.
$$
Since the variance-covariance matrix of the factors in (2) is positive definite, the matrix $I_{K_H} - \Phi'\Phi$ is invertible. Then, we get $A_{32} = 0$. From (B.19) it follows $A_{31} = 0$.
Q.E.D.
APPENDIX C: Large sample properties
C.1 Proof of Proposition 3
Let us introduce a new notation for the matrices of HF and LF observations, factors, and errors, respectively:
$$
X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}, \quad
F = \begin{pmatrix} F_1 \\ F_2 \end{pmatrix}, \quad
\hat{F} = \begin{pmatrix} \hat{F}_1 \\ \hat{F}_2 \end{pmatrix}, \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \end{pmatrix},
$$
and the residuals matrices:
$$
\Xi = \begin{pmatrix} \Xi_1 \\ \Xi_2 \end{pmatrix} = \begin{pmatrix} M_G X_1 \\ M_G X_2 \end{pmatrix}, \qquad
\tilde{\Xi} = \begin{pmatrix} \tilde{\Xi}_1 \\ \tilde{\Xi}_2 \end{pmatrix} = \begin{pmatrix} M_{\tilde{G}} X_1 \\ M_{\tilde{G}} X_2 \end{pmatrix},
$$
where $M_G = I - P_G$, with $P_G = G(G'G)^{-1}G'$, and $M_{\tilde{G}} = I - P_{\tilde{G}}$, with $P_{\tilde{G}} = \tilde{G}(\tilde{G}'\tilde{G})^{-1}\tilde{G}'$. We define:
$$
F^* = [F_1 \ F_2], \qquad \hat{F}^* = [\hat{F}_1 \ \hat{F}_2], \qquad
\Delta = [\Delta_1 \ \Delta_2], \qquad \Omega = [\Omega_1 \ \Omega_2],
$$
$$
G^* = \begin{pmatrix} G & 0 \\ 0 & G \end{pmatrix} = I_2 \otimes G, \qquad
\hat{G}^* = \begin{pmatrix} \hat{G} & 0 \\ 0 & \hat{G} \end{pmatrix} = I_2 \otimes \hat{G},
\qquad
P_{G^*} = \begin{pmatrix} P_G & 0 \\ 0 & P_G \end{pmatrix}, \qquad
M_{G^*} = \begin{pmatrix} M_G & 0 \\ 0 & M_G \end{pmatrix}.
$$
The hat and tilde refer to the estimates in the current and previous iterations, respectively, in the iterative estimation procedure.
The model (1) can be written as:
$$
X_1 = F_1\Lambda' + G\Delta_1' + \varepsilon_1, \qquad
X_2 = F_2\Lambda' + G\Delta_2' + \varepsilon_2, \qquad
Y = F_1\Omega_1' + F_2\Omega_2' + GB' + u,
$$
and, more compactly, as:
$$
\begin{aligned}
X &= F\Lambda' + G^*\Delta' + \varepsilon, &\text{(C.1)}\\
Y &= F^*\Omega' + GB' + u. &\text{(C.2)}
\end{aligned}
$$
C.1.1 The exact recursive equation in step 1
The first step of the iterative estimation procedure consists in the estimation by PCA of the HF factor from the HF data, given the estimated LF factor from the previous iteration. By reordering of the data, from equation (11) we have:
$$
\frac{1}{2 N_H T}\,\tilde{\Xi}\tilde{\Xi}'\hat{F} = \hat{F}\hat{V}_F. \tag{C.3}
$$
The matrix $\tilde{\Xi}$ can be decomposed as:
$$
\tilde{\Xi} = M_{\tilde{G}^*} X = M_{G^*} X + (M_{\tilde{G}^*} - M_{G^*}) X
= M_{G^*}(F\Lambda' + \varepsilon) - (P_{\tilde{G}^*} - P_{G^*}) X
= F\Lambda' + e - (P_{\tilde{G}^*} - P_{G^*}) X,
$$
where:
$$
e = \varepsilon - P_{G^*}(F\Lambda' + \varepsilon). \tag{C.4}
$$
Therefore the matrix $\tilde{\Xi}\tilde{\Xi}'$ can be expressed as:
$$
\begin{aligned}
\tilde{\Xi}\tilde{\Xi}' = F\Lambda'\Lambda F' + \Big\{ & e e' + (P_{\tilde{G}^*} - P_{G^*}) X X'(P_{\tilde{G}^*} - P_{G^*})
+ F\Lambda' e' + e\Lambda F' \\
& - F\Lambda' X'(P_{\tilde{G}^*} - P_{G^*}) - (P_{\tilde{G}^*} - P_{G^*}) X\Lambda F'
- e X'(P_{\tilde{G}^*} - P_{G^*}) - (P_{\tilde{G}^*} - P_{G^*}) X e' \Big\}. 
\end{aligned}
\tag{C.5}
$$
Equation (C.3) can be written as:
$$
\hat{F}\hat{V}_F - F\,\frac{\Lambda'\Lambda}{N_H}\,\frac{F'\hat{F}}{2T}
= \frac{1}{2 N_H T}\Big\{\dots\Big\}\hat{F}, \tag{C.6}
$$
where the terms in the curly brackets are the same as in equation (C.5). Since $F'\hat{F}/(2T)$ is invertible w.p.a. 1 from Lemma S.1, then:
$$
\hat{F}\hat{V}_F\left(\frac{F'\hat{F}}{2T}\right)^{-1}\left(\frac{\Lambda'\Lambda}{N_H}\right)^{-1} - F
= \frac{1}{2 N_H T}\Big\{\dots\Big\}\hat{F}\left(\frac{F'\hat{F}}{2T}\right)^{-1}\left(\frac{\Lambda'\Lambda}{N_H}\right)^{-1}.
$$
Since $\hat{V}_F$ is invertible w.p.a. 1 from Lemma S.2 we can define:
$$
\hat{H}_F = \frac{\Lambda'\Lambda}{N_H}\,\frac{F'\hat{F}}{2T}\,\hat{V}_F^{-1}.
$$
Then $\hat{H}_F$ is invertible w.p.a. 1 and:
$$
\hat{H}_F^{-1} = \hat{V}_F\left(\frac{F'\hat{F}}{2T}\right)^{-1}\left(\frac{\Lambda'\Lambda}{N_H}\right)^{-1}.
$$
We get:
$$
\begin{aligned}
\hat{F}\hat{H}_F^{-1} - F = \frac{1}{2 N_H T}\Big\{ & e e'\hat{F} + F\Lambda' e'\hat{F} + e\Lambda F'\hat{F}
- e X'(P_{\tilde{G}^*} - P_{G^*})\hat{F} - (P_{\tilde{G}^*} - P_{G^*}) X e'\hat{F} \\
& - F\Lambda' X'(P_{\tilde{G}^*} - P_{G^*})\hat{F} - (P_{\tilde{G}^*} - P_{G^*}) X\Lambda F'\hat{F}
+ (P_{\tilde{G}^*} - P_{G^*}) X X'(P_{\tilde{G}^*} - P_{G^*})\hat{F} \Big\}
\left(\frac{F'\hat{F}}{2T}\right)^{-1}\left(\frac{\Lambda'\Lambda}{N_H}\right)^{-1}. 
\end{aligned}
\tag{C.7}
$$
C.1.2 The exact recursive equation in step 2
The second step of the iterative estimation procedure consists in the estimation by PCA of the LF factor from the LF data, given the estimated HF factor from the first step (see equation (12)):
$$
\frac{1}{N_L T}\,\hat{\Psi}\hat{\Psi}'\hat{G} = \hat{G}\hat{V}_G. \tag{C.8}
$$
The matrix $\hat{\Psi}$ can be decomposed as:
$$
\hat{\Psi} = M_{\hat{F}^*} Y = (I - P_{F^*}) Y - (P_{\hat{F}^*} - P_{F^*}) Y
= G B' + v - (P_{\hat{F}^*} - P_{F^*}) Y,
$$
where:
$$
v = u - P_{F^*}(G B' + u). \tag{C.9}
$$
Therefore the matrix $\hat{\Psi}\hat{\Psi}'$ can be expressed as:
$$
\begin{aligned}
\hat{\Psi}\hat{\Psi}' = G B' B G' + \Big\{ & v v' + (P_{\hat{F}^*} - P_{F^*}) Y Y'(P_{\hat{F}^*} - P_{F^*})
+ G B' v' + v B G' \\
& - G B' Y'(P_{\hat{F}^*} - P_{F^*}) - (P_{\hat{F}^*} - P_{F^*}) Y B G'
- v Y'(P_{\hat{F}^*} - P_{F^*}) - (P_{\hat{F}^*} - P_{F^*}) Y v' \Big\}.
\end{aligned}
$$
Equation (C.8) can be written as:
$$
\hat{G}\hat{V}_G - G\,\frac{B'B}{N_L}\,\frac{G'\hat{G}}{T}
= \frac{1}{N_L T}\Big\{\dots\Big\}\hat{G}.
$$
Since $G'\hat{G}/T$ is invertible from Lemma S.1, then:
$$
\hat{G}\hat{V}_G\left(\frac{G'\hat{G}}{T}\right)^{-1}\left(\frac{B'B}{N_L}\right)^{-1} - G
= \frac{1}{N_L T}\Big\{\dots\Big\}\hat{G}\left(\frac{G'\hat{G}}{T}\right)^{-1}\left(\frac{B'B}{N_L}\right)^{-1}.
$$
Since $\hat{V}_G$ is invertible w.p.a. 1 from Lemma S.2 we can define:
$$
\hat{H}_G = \frac{B'B}{N_L}\,\frac{G'\hat{G}}{T}\,\hat{V}_G^{-1}. \tag{C.10}
$$
Then $\hat{H}_G$ is invertible w.p.a. 1 and:
$$
\hat{H}_G^{-1} = \hat{V}_G\left(\frac{G'\hat{G}}{T}\right)^{-1}\left(\frac{B'B}{N_L}\right)^{-1}.
$$
We get:
$$
\begin{aligned}
\hat{G}\hat{H}_G^{-1} - G = \frac{1}{N_L T}\Big\{ & v v'\hat{G} + G B' v'\hat{G} + v B G'\hat{G}
- v Y'(P_{\hat{F}^*} - P_{F^*})\hat{G} - (P_{\hat{F}^*} - P_{F^*}) Y v'\hat{G} \\
& - G B' Y'(P_{\hat{F}^*} - P_{F^*})\hat{G} - (P_{\hat{F}^*} - P_{F^*}) Y B G'\hat{G}
+ (P_{\hat{F}^*} - P_{F^*}) Y Y'(P_{\hat{F}^*} - P_{F^*})\hat{G} \Big\}
\left(\frac{G'\hat{G}}{T}\right)^{-1}\left(\frac{B'B}{N_L}\right)^{-1}. 
\end{aligned}
\tag{C.11}
$$
Equations (C.7) and (C.11) are a system of nonlinear implicit equations, which define the new estimates $\hat{F}$ and $\hat{G}$ in terms of the old estimate $\tilde{G}$. In the next two subsections we linearize these equations around the true factor values F and G.
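A compact Python sketch of the two-step iteration behind (C.3) and (C.8), namely PCA on the HF panel residualized on the current LF factor estimate, followed by PCA on the LF panel residualized on the updated HF factors, is given below. The initialization, the number of iterations and the convergence check are simplified relative to the paper's procedure, and all function and variable names are ours.

```python
import numpy as np

def pca_factors(Z, k):
    """First k principal-component factors of the (T_z x N) panel Z, normalized so F'F/T_z = I."""
    T_z = Z.shape[0]
    eigval, eigvec = np.linalg.eigh(Z @ Z.T / (T_z * Z.shape[1]))
    return np.sqrt(T_z) * eigvec[:, -k:][:, ::-1]          # columns ordered by decreasing eigenvalue

def iterate_mixed_freq(X1, X2, Y, K_H=1, K_L=1, n_iter=50):
    """Two-step iteration behind (C.3) and (C.8). X1, X2: (T x N_H) HF subperiod panels; Y: (T x N_L)."""
    T = Y.shape[0]
    G = np.zeros((T, K_L))                                 # crude initialization: no LF factor
    for _ in range(n_iter):
        M_G = np.eye(T) - G @ np.linalg.pinv(G)            # project off the current LF factor
        F = pca_factors(np.vstack([M_G @ X1, M_G @ X2]), K_H)
        F1, F2 = F[:T], F[T:]
        F_star = np.hstack([F1, F2])
        M_F = np.eye(T) - F_star @ np.linalg.pinv(F_star)  # project off the updated HF factors
        G = pca_factors(M_F @ Y, K_L)
    return F1, F2, G
```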
C.1.3 The linearized equation in step 1
Let us define the matrix $\tilde{H}_{G^*}$ as:
$$
\tilde{H}_{G^*} = \begin{pmatrix} \tilde{H}_G & 0 \\ 0 & \tilde{H}_G \end{pmatrix},
$$
where $\tilde{H}_G = \dfrac{B'B}{N_L}\,\dfrac{G'\tilde{G}}{T}\,\tilde{V}_G^{-1}$ and $\tilde{V}_G$ is the matrix of eigenvalues in the PCA problem defining $\tilde{G}$.

Lemma C.1. We have:
$$
\begin{aligned}
\hat{F}\hat{H}_F^{-1} - F = \eta_F &- M_{G^*}(\tilde{G}^*\tilde{H}_{G^*}^{-1} - G^*) D'
- G^*(G^{*\prime}G^*)^{-1}(\tilde{G}^*\tilde{H}_{G^*}^{-1} - G^*)' F \\
&- F\,\frac{\Lambda'\Lambda}{N_H}\, D\,\frac{1}{2T}(\tilde{G}^*\tilde{H}_{G^*}^{-1} - G^*)' F
\left(\frac{F'F}{2T}\right)^{-1}\left(\frac{\Lambda'\Lambda}{N_H}\right)^{-1}
+ R_F(\hat{F}, \tilde{G}),
\end{aligned}
\tag{C.12}
$$
where
$$
D = \lim_{N_H \to \infty}\left(\frac{\Lambda'\Lambda}{N_H}\right)^{-1}\frac{\Lambda'\Delta}{N_H} = [D_1 \ D_2],
$$
the term
$$
\eta_F = \frac{1}{2 N_H T}\left( e e'\hat{F} + F\Lambda' e'\hat{F} \right)
\left(\frac{F'\hat{F}}{2T}\right)^{-1}\left(\frac{\Lambda'\Lambda}{N_H}\right)^{-1}
+ \frac{1}{N_H}\, e\Lambda\left(\frac{\Lambda'\Lambda}{N_H}\right)^{-1} \tag{C.13}
$$
is such that
$$
\|\eta_F\|/\sqrt{2T} = O_p\!\left(\frac{1}{\sqrt{\min(N_H, 2T)}}\right), \tag{C.14}
$$
and the remainder term $R_F(\hat{F},\tilde{G})$ is such that
$$
\begin{aligned}
\|R_F(\hat{F},\tilde{G})\|/\sqrt{2T} = O_p\!\left(\frac{1}{\sqrt{\min(N_H, T)}}\right)
&\left( \|\tilde{G}^*\tilde{H}_{G^*}^{-1} - G^*\|/\sqrt{2T}
+ \big(\|\tilde{G}^*\tilde{H}_{G^*}^{-1} - G^*\|/\sqrt{2T}\big)^2 \right) \\
&+ O_p\!\left( \frac{1}{2T}\,\|\tilde{G}^*\tilde{H}_{G^*}^{-1} - G^*\|\,\|\hat{F}\hat{H}_F^{-1} - F\| \right).
\end{aligned}
\tag{C.15}
$$
To prove Lemma C.1, we need the following two lemmas, which are proved in the supplementary
material:
Lemma C.2. We have:
$$
\begin{aligned}
&\text{(a)}\quad \frac{1}{N_H T}\|\varepsilon\varepsilon'\| = O_p\!\left(\frac{1}{\sqrt{\min(N_H, T)}}\right), \qquad
&&\text{(b)}\quad \frac{1}{N_H T}\|F\Lambda'\varepsilon'\| = O_p\!\left(\frac{1}{\sqrt{N_H}}\right), \\
&\text{(c)}\quad \frac{1}{N_H}\|\varepsilon\Lambda\| = O_p\!\left(\sqrt{\frac{T}{N_H}}\right), \qquad
&&\text{(d)}\quad \frac{1}{N_H T}\|\varepsilon\Delta G^{*\prime}\| = O_p\!\left(\frac{1}{\sqrt{N_H}}\right).
\end{aligned}
$$

Lemma C.3. We have:
$$
\begin{aligned}
&\text{(a)}\quad \frac{1}{N_H T}\|e e'\| = O_p\!\left(\frac{1}{\sqrt{\min(N_H, T)}}\right), \qquad
&&\text{(b)}\quad \frac{1}{N_H T}\|F\Lambda' e'\| = O_p\!\left(\frac{1}{\sqrt{\min(N_H, T)}}\right), \\
&\text{(c)}\quad \frac{1}{N_H}\|e\Lambda\| = O_p\!\left(\sqrt{\frac{T}{N_H}}\right), \qquad
&&\text{(d)}\quad \frac{1}{N_H T}\|e\Delta G^{*\prime}\| = O_p\!\left(\frac{1}{\sqrt{\min(N_H, T)}}\right), \\
&\text{(e)}\quad \frac{1}{N_H T}\|e\varepsilon'\| = O_p\!\left(\frac{1}{\sqrt{\min(N_H, T)}}\right). &&
\end{aligned}
$$
Proof of Lemma C.1: i) Let us first show the decomposition in equation (C.12). By rearranging the
terms in the RHS of equation (C.7) we get:
0
0 ˆ −1 0 −1
XΛ
FF
ΛΛ
ˆ
F
(PG˜ ∗ − PG∗ )F
NH
2T
NH
0 −1
ΛΛ
XΛ
˜
+ R1,F (Fˆ , G),
− PG ∗ )
NH
NH
ˆ −1 − F = ηF − 1
Fˆ H
F
2T
−(PG˜ ∗
(C.16)
where ηF is defined in (C.13), and
1
˜ = −
R1,F (Fˆ , G)
eX 0 (PG˜ ∗ − PG∗ ) − (PG˜ ∗ − PG∗ )Xe0
2NH T
0 ˆ −1 0 −1
FF
ΛΛ
0
+(PG˜ ∗ − PG∗ )XX (PG˜ ∗ − PG∗ ) Fˆ
.
2T
NH
(C.17)
Let us now consider the matrix XΛ/NH in the RHS of equation (C.16). By using the model of X in
equation (C.1):
0 0 ΛΛ
εΛ
XΛ
∗ ∆Λ
+G
+
.
= F
NH
NH
NH
NH
(C.18)
Thus:
XΛ
NH
Λ0 Λ
NH
−1
0 −1 0 −1
∆0 Λ
ΛΛ
εΛ
ΛΛ
= F +G
+
NH
NH
NH
NH
0 0 −1
0 −1
∆Λ
ΛΛ
εΛ
ΛΛ
∗
0
= Z +G
−D +
,
NH
NH
NH
NH
∗
58
(C.19)
where:
Z = F + G∗ D0 .
(C.20)
Then the second and third terms in the RHS of equation (C.16) become:
0
0 ˆ −1 0 −1
0 −1
FF
XΛ
XΛ
ΛΛ
ΛΛ
ˆ
F
(PG˜ ∗ − PG∗ )F
+ (PG˜ ∗ − PG∗ )
NH
2T
NH
NH
NH
0 0 ˆ −1 0 −1
ΛΛ
FF
ΛΛ
1
˜
F
Z 0 (PG˜ ∗ − PG∗ )Fˆ
+ (PG˜ ∗ − PG∗ )Z + R2,F (Fˆ , G),
=
2T
NH
2T
NH
(C.21)
1
2T
where:
0 0 −1 0 0 ˆ −1 0 −1
FF
1
Λ
Λ
Λ
Λ
Λ
∆
ΛΛ
∗0
˜ =
R2,F (Fˆ , G)
F
− D G (PG˜ ∗ − PG∗ )Fˆ
2T
NH
NH
NH
2T
NH
0 0
0 ˆ −1 0 −1
Λε
1
FF
ΛΛ
+ F
(PG˜ ∗ − PG∗ )Fˆ
2T
NH
2T
NH
0 0 −1
0 −1 ∆Λ
ΛΛ
ΛΛ
εΛ
∗
0
+(PG˜ ∗ − PG∗ ) G
−D +
.
(C.22)
NH
NH
NH
NH
Let us now consider the first two terms in the RHS of equation (C.21). In order to linearize the term
PG˜ ∗ −PG∗ , we need the following Lemma C.4. We use the operator norm k·kop , which, for the generic
(m × n) matrix A, is defined as (see, e.g., Horn and Johnson (2013)):
kAkop = sup kAxk .
(C.23)
kxk=1
ˆ Then, if
Lemma C.4. Let Aˆ and A be two m × n matrices, where A is full column rank and A.
ˆ
k(A0 A)−1 k1/2
op kA − Akop <
p
1 + % − 1,
for some % ∈ (0, 1), we have:
ˆ A),
PAˆ − PA = MA (Aˆ − A)(A0 A)−1 A0 + A(A0 A)−1 (Aˆ − A)0 MA + RP (A,
59
ˆ A) is such that:
where PA = A(A0 A)−1 A0 and MA = Im − PA , and the reminder term RP (A,
ˆ A)kop ≤ C k(A0 A)−1 kop + k(A0 A)−1 k2op kAˆ − Ak2op ,
kRP (A,
ˆ but may depend on %.
with constant C < ∞ is independent of A and A,
The proof of Lemma C.4 is given in the supplementary material. By using PG˜ ∗ H˜ −1∗ = PG˜ ∗ and
G
˜ ∗H
˜ −1∗ and A = G∗ , we have:
applying Lemma C.4 with Aˆ = G
G
˜ ∗H
˜ −1∗ − G∗ )(G∗0 G∗ )−1 G∗0
PG˜ ∗ − PG∗ = MG∗ (G
G
˜ ∗H
˜ −1∗ − G∗ )0 MG∗ + RP (G
˜ ∗ , G∗ ),
+G∗ (G∗0 G∗ )−1 (G
G
(C.24)
where
˜ ∗ , G∗ )kop = O(kG
˜ ∗H
˜ −1∗ − G∗ k2 k(G∗0 G∗ )−1 G∗0 k2 ).
kRP (G
op
op
G
(C.25)
Then:
˜ ∗H
˜ −1∗ − G∗ )D0
(PG˜ ∗ − PG∗ )Z = MG∗ (G
G
˜ ∗H
˜ −1∗ − G∗ )0 F + R3,F (G),
˜
+G∗ (G∗0 G∗ )−1 (G
G
(C.26)
where:
˜ = MG∗ (G
˜ ∗H
˜ −1∗ − G∗ )(G∗0 G∗ )−1 G∗0 F
R3,F (G)
G
˜ ∗H
˜ −1∗ − G∗ )0 PG∗ F + RP (G
˜ ∗ , G∗ )Z,
−G∗ (G∗0 G∗ )−1 (G
G
(C.27)
and:
ˆF
˜ ∗H
˜ −1∗ − G∗ )0 F H
Z 0 (PG˜ ∗ − PG∗ )Fˆ = D(G
G
∗
∗0 ∗ −1 ∗0 ˆ
0
0 ˜ ∗ ˜ −1
˜ Fˆ
+ F (G HG∗ − G )(G G ) G F + R3,F (G)
ˆ F − PG∗ Fˆ ).
˜ ∗H
˜ −1∗ − G∗ )0 (Fˆ − F H
+D(G
G
60
(C.28)
ˆ F − PG∗ Fˆ as:
Rewriting Fˆ − F H
ˆ F − PG∗ Fˆ = Fˆ − F H
ˆ F − PG∗ (Fˆ − F H
ˆ F ) − PG ∗ F H
ˆF
Fˆ − F H
ˆ −1 − F ) − PG∗ F ] H
ˆF ,
= [MG∗ (Fˆ H
F
we get:
˜ ∗H
˜ −1∗ − G∗ )0 F H
ˆF
Z 0 (PG˜ ∗ − PG∗ )Fˆ = D(G
G
0 ˜ ∗ ˜ −1
∗
∗0 ∗ −1 ∗0 ˆ
0ˆ
˜
+ F (G HG∗ − G )(G G ) G F + R3,F (G) F
˜ ∗H
˜ −1∗ − G∗ )0 (MG∗ (Fˆ H
ˆ −1 − F ) − PG∗ F ) H
ˆF .
+D(G
F
G
(C.29)
By plugging equations (C.21), (C.26) and (C.29) into the RHS of equation (C.16), we get:
ˆ −1 − F = ηF − MG∗ (G
˜ ∗H
˜ −1∗ − G∗ )D0 − G∗ (G∗0 G∗ )−1 (G
˜ ∗H
˜ −1∗ − G∗ )0 F
Fˆ H
F
G
G
0 0 ˆ −1 0 −1
ΛΛ
1 ˜ ∗ ˜ −1
ΛΛ
ˆF F F
−F
D
(G HG∗ − G∗ )0 F H
NH
2T
2T
NH
˜ − R2,F (G)
˜ − R3,F (G)
˜ + R4,F (Fˆ , G),
˜
+R1,F (Fˆ , G)
(C.30)
where:
0
˜ = −F Λ Λ
R4,F (Fˆ , G)
NH
1
1 0 ˜ ∗ ˜ −1
˜ 0 Fˆ
F (G HG∗ − G∗ )(G∗0 G∗ )−1 G∗0 Fˆ +
R3,F (G)
2T
2T
0 ˆ −1 0 −1
1
ΛΛ
FF
−1
∗ ˜ −1
∗ 0
˜
ˆ
ˆ
ˆ
+
D(G HG∗ − G ) (MG∗ (F HF − F ) − PG∗ F ) HF
.
2T
2T
NH
(C.31)
ˆ −1 − F )]H,
ˆ we
Let us expand the matrix (F 0 Fˆ /2T )−1 in equation (C.30). By using Fˆ = [F + (Fˆ H
F
have:
F 0 Fˆ
T
−1
i−1
ˆF
ˆ −1 − F )/T H
(F 0 F/T ) IKH + (F 0 F/T )−1 F 0 (Fˆ H
F
−1
ˆ −1 IK + A (F, Fˆ )
= H
(F 0 F/T )−1 ,
H
F
=
h
61
(C.32)
where A (F, Fˆ ) =
F 0F
2T
−1
ˆ −1 − F )
F 0 (Fˆ H
F
. Equation (C.32) allows us to rewrite the RHS of
T
equation (C.30) as:
ˆ −1 − F = ηF − MG∗ (G
˜ ∗H
˜ −1∗ − G∗ )D0 − G∗ (G∗0 G∗ )−1 (G
˜ ∗H
˜ −1∗ − G∗ )0 F
Fˆ H
F
G
G
0 0 −1 0 −1
ΛΛ
1 ˜ ∗ ˜ −1
F
F
ΛΛ
˜
−F
D
(G HG∗ − G∗ )0 F
+ RF (Fˆ , G),
NH
2T
2T
NH
(C.33)
where:
˜ = R1,F (Fˆ , G)
˜ − R2,F (G)
˜ − R3,F (G)
˜ + R4,F (Fˆ , G)
˜ − R5,F (Fˆ , G),
˜
RF (Fˆ , G)
(C.34)
with:
0
Λ
Λ
1
−1
∗
∗
0
˜ H
˜ ∗ −G )F
˜ = F
(G
D
R5,F (Fˆ , G)
G
NH
2T
0 −1 0 −1
−1
ΛΛ
FF
ˆ
.
× IKH + A (F, F )
− IKH
2T
NH
(C.35)
(ii) Let us now show the upper bound on kηF k given in equation (C.14). From equation (C.13),
Lemma S.1 and Assumption H.2, we have:
1
1
0ˆ
0 0ˆ
kηF k ≤
kee F k + kF Λ e F k Op (1) + Op
keΛk
2NH T
NH
√
1
1
0
0 0
≤
kee k + kF Λ e k Op ( T ) + Op
keΛk ,
2NH T
NH
(C.36)
where the last inequality follows from Fˆ 0 Fˆ /(2T ) = IKF , as Fˆ is estimated
by PCA. Using inequality
!
1
(C.36) and Lemma C.3 we get T −1/2 kηF k = Op p
.
min(NH , T )
(iii)
˜ given in equation (C.15). We bound
Finally, we prove the upper bound on kRF (Fˆ , G)k
separately the norm of each term in the RHS of equation (C.34). We use the following result linking
the operator norm and the Frobenius norm of a generic (m × n) matrix A (see, e.g. Horn and Johnson
62
(2013)):
kAkop ≤ kAk ≤
p
min(m, n) kAkop .
(C.37)
Using KH ≤ T and inequality (C.37), from equation (C.17) we get:
p
1
1
˜
˜ op
√ kR1,F (Fˆ , G)k
≤
KH √ kR1,F (Fˆ , G)k
T
T
p
1
1
0
0
≤
keX kop + kPG˜ ∗ − PG∗ kop
kXX kop
KH 2
2NH T
2NH T
Fˆ F 0 Fˆ −1 Λ0 Λ −1 .
×kPG˜ ∗ − PG∗ kop √ 2T
N
T
H
op
op
op
Using the result in (C.37) and Lemma S.1 we have:
0 −1 0 −1 ˆ
F Fˆ
≤ F F
= Op (1),
2T
2T
op
0 −1 0 −1 BB
≤ BB
= O(1),
NH
NH
op
ˆ Fˆ √ ≤ √F = Op (1).
T
T
(C.38)
(C.39)
(C.40)
op
This allows us to write:
1
˜
√ kR1,F (Fˆ , G)k
= Op
T
1
keX 0 kop kPG˜ ∗ − PG∗ kop
2NH T
+ Op
1
0
2
kXX kop kPG˜ ∗ − PG∗ kop .
2NH T
(C.41)
Let us bound each term in the RHS of equation (C.41). Using the expression for PG˜ ∗ − PG∗ in equation
(C.24) and the triangular inequality, we have:
˜ −1∗ − G∗ kop k(G∗0 G∗ )−1 G∗0 kop
˜ ∗H
kPG˜ ∗ − PG∗ kop ≤ 2kMG∗ kop kG
G
˜ ∗H
˜ −1∗ − G∗ k2op k(G∗0 G∗ )−1 G∗0 k2op ).
+Op (kG
G
63
Moreover we have:
∗0
∗ −1
∗0
k(G G ) G kop
∗0 ∗ −1 ∗0 G 1 G G
√ ≤ √ T T
T
1
= Op √
.
T
This result and kMG∗ kop = 1 allow us to conclude:
kPG˜ ∗ − PG∗ kop
1 ˜ ∗ ˜ −1
1 ˜ ∗ ˜ −1
= Op √ kG
HG∗ − G∗ k2op
HG∗ − G∗ kop + kG
T
T
1 ˜ ∗ ˜ −1
= Op √ kG
HG∗ − G∗ k .
T
(C.42)
Using the definition of X in equation (C.1) and Lemma C.3, we can bound the term keX 0 kop in the
RHS of equation (C.41) as:
1
1
keX 0 kop ≤
keX 0 k
2NH T
2NH T
1
1
1
≤
keΛF 0 k +
ke∆G∗0 k +
keε0 k
2NH T
2NH T
2NH T
1
= Op p
.
min(NH , T )
(C.43)
From the definition of X in equation (C.1) and Lemma C.2, we can bound the term kXX 0 kop in
equation (C.41) as:
1
1
kXX 0 kop ≤
kXX 0 k
2NH T
2NH T
1
1
1
kF Λ0 ΛF 0 k +
kF Λ0 ∆G∗0 k +
kF Λ0 ε0 k
≤
NH T
NH T
NH T
1
1
1
+
kG∗ ∆0 ΛF 0 k +
kG∗ ∆0 ∆G∗0 k +
kG∗ ∆0 ε0 k
NH T
NH T
NH T
1
1
1
+
kεΛF 0 k +
kε∆G∗0 k +
kεε0 k
NH T
NH T
NH T
1
1
1
=
kF Λ0 ΛF 0 k +
kG∗ ∆0 ∆G∗0 k +
kεε0 k
NH T
NH T
NH T
1
1
0
∗0
+
kF Λ ∆G k + Op p
.
(C.44)
NH T
min(NH , T )
64
The first term in the RHS of the last equation can be bounded as:
F Λ0 Λ F kF Λ ΛF k ≤ √T NH √T NH T
= Op (1).
1
0
0
(C.45)
The second term in the RHS of equation (C.44) can be bounded as:
∗ 0 ∗0 G ∆ ∆ G kG ∆ ∆G k ≤ √T NH √T NH T
= Op (1).
1
∗
0
∗0
(C.46)
Analogous arguments allow us to bound the remaining terms in the RHS of equation (C.44), and to
conclude that:
1
kXX 0 kop = Op (1).
2NH T
(C.47)
Collecting results (C.41), (C.42), (C.43) and (C.47) we get:
1
1
1 ˜ ∗ ˜ −1
∗
ˆ
˜
√ kR1,F (F , G)kop = Op p
√ kG HG∗ − G k
T
T
min(NH , T )
2 1 ˜ ∗ ˜ −1
∗
√ kG HG∗ − G k
+ Op
.
T
(C.48)
˜ in equation (C.22). Using KH < T and inequalities in (C.37)
Let us now bound the term R2,F (Fˆ , G)
65
we get:
p
1
1
˜
˜ op
√ kR2,F (Fˆ , G)k
≤
KH √ kR2,F (Fˆ , G)k
T
T
p
F Λ0 Λ Λ0 Λ −1 Λ0 ∆
≤
KH √ − D
NH
NH
T op NH op
op
∗0 0 −1 0 −1 G Fˆ F Fˆ
ΛΛ
×
√T kPG˜ ∗ − PG∗ kop √T 2T
NH
op
op
op
op
0 0 p
ˆ F Λε kP ˜ ∗ − PG∗ kop √F √
√
+ KH G
T
T N
H T op
op
op
0 −1 0 −1 F Fˆ
ΛΛ
×
2T
NH
op
op
p
+ KH kPG˜ ∗ − PG∗ kop
0 −1 ∗ 0 0 −1
G ∆Λ
ΛΛ
εΛ
Λ
Λ
0
.
√
√
× −
D
+
T NH
N
NH
N
T
H
H
op
op
op
op
!
0 −1 0 ΛΛ
Λ∆
− D
≤ Op kPG˜ ∗ − PG∗ kop
NH
NH
op
!
0 0 Λε +Op N √T kPG˜ ∗ − PG∗ kop .
H
op
Using assumptions H.1 and H.2, and Lemmas S.1, S.2 and C.2, and equation (C.42) we conclude that:
1
1
˜
√ kR2,F (Fˆ , G)k
= √
Op
NH
T
1 ˜ ∗ ˜ −1
∗
√ kG HG∗ − G k .
T
˜ from equation (C.27) as:
Analogous arguments allow to bound the term R3,F (G)
p
1
1
˜
˜ op
√ kR3,F (G)k
≤
KH √ kR3,F (G)k
T
T
∗0 ∗ −1 ∗0 p
G F 1
−1
∗
∗
˜ H
˜ ∗ − G kop G G
≤
KH √ kMG∗ kop kG
G
T T
T
op
op
∗ ∗0 ∗ −1 p
∗F 1 G G G
P
G
˜ ∗H
˜ −1∗ − G∗ kop √ kG
+ KH √ √ G
T T
T
T
op
op
p
1
˜ ∗ , G∗ )Zkop
+ KH √ kRP (G
T
1 1
1 ˜ ∗ ˜ −1
∗
∗
∗
˜
≤ √ Op √ kG HG∗ − G k + Op √ Z kRP (G , G )k ,
T
T
T
66
op
since:
PG ∗ F √ T op
PG∗ F G∗ G∗0 G∗ −1 G∗0 F 1
≤ √ ≤ √ T = Op √T .
T
T
T
From the definition of Z in equation (C.20) and Assumption H.1 it follows that:
1 √ Z = Op (1).
T ˜ ∗ , G∗ ) in equation (C.25), we get:
From the bound of RP (G
˜∗
∗
kRP (G , G )kop
∗0 ∗ −1 2 ∗0 2 G 1 ˜ ∗ ˜ −1
∗ 2 G G
√ kG HG∗ − G k ≤ Op
T
T
T
1 ˜ ∗ ˜ −1
= Op
kG HG∗ − G∗ k2 ,
T
which allows to conclude that:
1
˜
√ kR3,F (G)k
= Op
T
1
√
T
1 ˜ ∗ ˜ −1
1 ˜ ∗ ˜ −1
∗ 2
∗
√ kG HG∗ − G k
kG HG∗ − G k .
+ Op
T
T
˜ can be bounded as:
The term R4,F (Fˆ , G)
p
1
1
˜
˜ op
√ kR4,F (Fˆ , G)k
KH √ kR4,F (Fˆ , G)k
≤
T
T
(
∗0 ∗ −1 ∗0 F G Fˆ F Λ0 Λ 1
˜ ∗H
˜ −1∗ − G∗ kop G G
√ √ kG
√
≤ G
T NH T 2 T
T T
op
op
op
op
ˆ 1
˜ op √F + √ kR3,F (G)k
T
2 T
op
1 ˜ ∗ ˜ −1
ˆ −1 − F kop kH
ˆ F kop
+kDkop kG
HG∗ − G∗ kop kMG∗ kop kFˆ H
F
2T
) F 0 Fˆ −1 Λ0 Λ −1 ∗F P
1
G
∗
∗ ˜ −1
˜
ˆ
.
+ √ kG HG∗ − G kop √ kHF kop × NH
2T
2 T
T op
op
op
1 ˜ ∗ ˜ −1
1
1 ˜ ∗ ˜ −1
ˆ −1 − F k
√ kG
+ Op
= Op √
HG∗ − G∗ k
kG HG∗ − G∗ kkFˆ H
F
T
T
T
1 ˜ ∗ ˜ −1
kG HG∗ − G∗ k2 .
+Op
T
67
˜ in equation (C.35) can be bounded as:
Finally, the term R5,F (Fˆ , G)
F Λ0 Λ F 1
1
−1
∗
∗
˜
˜ ˜
√ kR5,F (Fˆ , G)k
≤ √2T NH kDk √2T (G HG∗ − G ) √2T 2T
0 −1 0 −1 FF
Λ Λ
−1
×k(IKH + A (F, Fˆ )) − IKH k
2T
NH
1
˜ ∗H
˜ −1∗ − G∗ k k(IK + A (F, Fˆ ))−1 − IK k.
≤ O p √ kG
H
H
G
2T
(C.49)
Let us bound the term (I2KH + A (F, Fˆ ))−1 − I2KH . Assuming that
0 −1
FF
1 0 ˆ ˆ −1
ˆ
F (F HF − F )
kA (F, F )k = ≤ ρ,
2T
2T
for any constant ρ < 1, the series representation of the matrix inversion mapping, we have
k(IKH
+ A (F, Fˆ ))−1 − IKH k ≤
∞
X
kA (F, Fˆ )kj ≤
j=1
1
kA (F, Fˆ )k.
1−ρ
Therefore, we have:
k(IKH
+ A (F, Fˆ ))−1 − IKH k = Op (kA (F, Fˆ )k) = Op
1
−1
ˆ
ˆ
√ kF HF − F k ,
2T
which, together with equation (C.49) property (C.37) of the operator norm implies:
1
1
−1
−1
∗
∗
˜
˜ H
˜ ∗ − G kkFˆ H
ˆ − Fk .
√ kR5,F (Fˆ , G)k
= Op
kG
G
F
2T
2T
Q.E.D.
68
C.1.4
The linearized equation in step 2
Lemma C.5. We have:
ˆH
ˆ −1 − G = η ∗ − MF ∗ (Fˆ ∗ H
ˆ −1∗ − F ∗ )W 0 − F ∗ (F ∗0 F ∗ )−1 (Fˆ ∗ H
ˆ −1∗ − F ∗ )0 G
G
G
G
F
F
0 0 −1 0 −1
BB
1 ˆ ∗ ˆ −1
GG
BB
−G
W
(F HF ∗ − F ∗ )0 G
NL
T
T
NL
∗ ˆ ˆ
+RG (F , G),
(C.50)
where
W =
lim
NL →∞
B0B
NL
−1 B0Ω
NL
= [W1 W2 ],
the term
∗
ηG
0 ˆ −1 0 −1
0 −1
1
1
GG
BB
BB
0ˆ
0 0ˆ
=
+
vv G + GB v G
vB
NL T
T
NL
NL
NL
(C.51)
is such that
√
∗
kηG
k/ T = Op
1
p
min(NL , T )
!
,
(C.52)
ˆ is such that
and the reminder term RG∗ (Fˆ , G)
√
ˆ
kRG∗ (Fˆ , G)k/
1
√
√
ˆ −1∗ − F ∗ k/ T + (kFˆ ∗ H
ˆ −1∗ − F ∗ k/ T )2
kFˆ ∗ H
F
F
T = Op p
min(NL , T )
1 ˆ ∗ ˆ −1
−1
∗
ˆH
ˆ − Gk
+Op
kF HF ∗ − F kkG
G
T
(C.53)
Proof: The proof is analogous to the proof of Lemma C.1, and is detailed in the supplementary material.
C.1.5
Writing the linearized equations by components
˜ ∗ of the estimated LF factor in the
The recursive equation (C.12) involves the “compound” form G
RHS. Similarly, the recursive equation (C.50) involves the form Fˆ ∗ of the estimated HF factor in the
69
˜ and Fˆ , respectively.
RHS. Let us now rewrite those equations such that their RHS involve estimates G
This simplifies the combined use of the two equations later on.
By using the definition of F , G∗ , MG∗ and their estimates, equation (C.12) can be written in
components as:


ˆ −1 − F1
Fˆ1 H
F
ˆ −1 − F2
Fˆ2 H
F


 = 
ηF,1
ηF,2

−


MG 0
−
0

0
−1
MG

G(G G)
0
0
G(G0 G)−1


˜H
˜ −1 − G 0
G
G
˜H
˜ −1 − G
G
G
0

˜H
˜ −1 − G)0 F1
(G
G


˜H
˜ −1 − G)0 F2
(G
G


D10


D20



0 −1 0 −1
−1
0
˜
˜
F1
(GHG − G) F1
Λ
Λ
1
ΛΛ

 FF

−
[D1 D2 ]
NH
2T (G
2T
NH
˜H
˜ −1 − G)0 F2
F2
G


ˆ
˜
RF,1 (F , G)

+
ˆ
˜
RF,2 (F , G)

= 
ηF,1
ηF,2



−
0
˜H
˜ −1 − G)D0
MG (G
1
G
˜H
˜ −1 − G)D0
MG (G
2
G


−
F1 (Λ0 Λ/NH )D1 F1 (Λ0 Λ/NH )D2
˜H
˜ −1 − G)0 F1
G(G0 G)−1 (G
G

˜H
˜ −1 − G)0 F2
G(G G) (G
G

0
−1

1 

2T F2 (Λ0 Λ/NH )D1 F2 (Λ0 Λ/NH )D2




0 −1 0 −1
−1
0
˜
˜
ˆ
˜
(GHG − G) F1
RF,1 (F , G)
ΛΛ
 FF
.
×
+
2T
NH
˜H
˜ −1 − G)0 F2
ˆ
˜
(G
R
(
F
,
G)
F,2
G
−
70
Therefore we have:
˜H
˜ −1 − G)0 F1
ˆ −1 − F1 = ηF,1 − MG (G
˜H
˜ −1 − G)D10 − G(G0 G)−1 (G
Fˆ1 H
G
F
G
0 0 −1 0 −1
1
ΛΛ
ΛΛ
−1
−1
0
0
˜H
˜ − G) F1 + D2 (G
˜H
˜ − G) F2 F F
D1 (G
− F1
G
G
2T
NH
2T
NH
˜
+RF,1 (Fˆ , G),
(C.54)
and:
ˆ −1 − F2 = ηF,2 − MG (G
˜H
˜ −1 − G)0 F2
˜H
˜ −1 − G)D20 − G(G0 G)−1 (G
Fˆ2 H
G
F
G
0 0 −1 0 −1
ΛΛ
ΛΛ
1
−1
−1
0
0
˜H
˜ − G) F1 + D2 (G
˜H
˜ − G) F2 F F
D1 (G
− F2
G
G
2T
NH
2T
NH
ˆ
˜
+RF,2 (F , G).
(C.55)
Similarly, by using the definition of F ∗ , MF ∗ and their estimates, equation (C.50) can be written as:
∗
ˆH
ˆ −1 − G = ηG
ˆ −1 − F1 )W10 + (Fˆ2 H
ˆ −1 − F2 )W20 ]
G
− MF ∗ [(Fˆ1 H
G
F
F
ˆ −1 − F1 )0 G
(Fˆ1 H
F

−F ∗ (F ∗0 F ∗ )−1 
ˆ
ˆ
(F2 HF−1 − F2 )0 G
0 0 −1 0 −1
BB
BB
1
−1
−1
0
0
ˆ − F1 ) G + W2 (Fˆ2 H
ˆ − F2 ) G] G G
[W1 (Fˆ1 H
− G
F
F
T
NL
T
NL
∗ ˆ ˜
+RG (F , G).
(C.56)
C.1.6
The system of linearized equations when KH = KL = 1
Let us now focus on the case with one-dimensional HF and LF factors, i.e., KH = KL = 1. Then,
Λ0 Λ
B0B
F 0 F G0 G ˆ
ˆ
,
, HF , HG ,
and
are scalars. Moreover, matrices D and W become (1 × 2)
2T
T
NH
NL
71
matrices:
−1 0 Λ0 Λ
Λ ∆j
dj = lim
,
NH →∞
NH
NH
0 −1 0 BB
B Ωj
wj = lim
,
NL →∞
NL
NL
D = [d1 d2 ],
(1×2)
W = [w1 w2 ],
(1×2)
j = 1, 2,
j = 1, 2.
ˆ −1 − F1
We rename hH and hG the scalars HF and HG . This allows to re-write the equation for Fˆ1 H
F
in (C.54) as:
ˆ −1 − F1 = ηF,1 − d1 MG (G
˜ −1 − G) − G(G0 G)−1 (G
˜ −1 − G)0 F1
˜h
˜h
Fˆ1 h
F
G
G
0 −1
0 −1
1
1
F
F
−1
−1
0
0
˜ − G) F1
˜ − G) F2 F F
˜h
˜h
−
d2 F 1 ( G
− d1 F 1 ( G
G
G
2T
2T
2T
2T
˜
+RF,1 (Fˆ , G),
0
−1
= ηF,1 − d1 MG + G(G G)
F10
+
F 0F
2T
˜ −1 − G) + RF,1 (Fˆ , G).
˜h
˜
×(G
G
−1
d1
F1 F10 +
2T
F 0F
2T
−1
d2
F1 F20
2T
(C.57)
ˆ −1 − F2 in (C.55) becomes:
Similarly, the equation for Fˆ2 H
F
0 −1
0 −1
FF
d1
d2
FF
−1
0
0
0
−1 0
ˆ
ˆ
F2 F1 +
F2 F2
F2 hF − F2 = ηF,2 − d2 MG + G(G G) F2 +
2T
2T
2T
2T
˜ −1 − G) + RF,2 (Fˆ , G).
˜h
˜
×(G
(C.58)
G
72
Let us now consider the equation for the LF factor. From equation (C.56) we have:
ˆ −1 − F1 ) + w2 (Fˆ2 h
ˆ −1 − F2 )]
ˆ −1 − G = η ∗ − MF ∗ [w1 (Fˆ1 h
ˆh
G
G
F
G
F

ˆ −1 − F1 )0 G
(Fˆ1 h
F
∗
∗0 ∗ −1 

−F (F F )
−1
0
ˆ
ˆ
(F2 hF − F2 ) G
0 −1
1
GG
−1
−1
0
0
ˆ
ˆ
ˆ
ˆ
ˆ
− G [w1 (F1 hF − F1 ) G + w2 (F2 hF − F2 ) G]
+ RG∗ (Fˆ , G)
T
T
=
∗
ηG
0 −1
GG
1
∗
∗0 ∗ −1
0
0
ˆ −1 − F1 )
− w1 MF ∗ +
GG + F (F F ) e1 G (Fˆ1 h
F
T
T
0 −1
GG
1
0
∗
∗0 ∗ −1
0
ˆ −1 − F2 ) + R ∗ (Fˆ , G),
ˆ
− w2 MF ∗ +
GG + F (F F ) e2 G (Fˆ2 h
G
F
T
T
(C.59)
where e1 = (1, 0)0 and e2 = (0, 1)0 . The last equation can be written as:
ˆ −1 − F1 ) − LG,F (Fˆ2 h
ˆ −1 − F2 ) + R ∗ (G,
ˆ −1 − G = η ∗ − LG,F (Fˆ1 h
ˆh
ˆ Fˆ ),
G
2
1
G
G
G
F
F
(C.60)
where:
LG,F1
LG,F2
0 −1
GG
1
0
= w1 MF ∗ +
GG + F ∗ (F ∗0 F ∗ )−1 e1 G0 ,
T
T
0 −1
GG
1
0
GG + F ∗ (F ∗0 F ∗ )−1 e2 G0 ,
= w2 MF ∗ +
T
T
ˆ Fˆ ) is as in equation (C.53). On the other hand, equations (C.57) and (C.58)
and the reminder RG∗ (G,
can be expressed as:
ˆ −1 − F1 = ηF − LF ,G (G
˜ −1 − G) + RF,1 (Fˆ , G),
˜
˜h
Fˆ1 h
1
1
G
F
(C.61)
˜ −1 − G) + RF,2 (Fˆ , G),
ˆ −1 − F2 = ηF − LF ,G (G
˜h
˜
Fˆ2 h
2
2
F
G
(C.62)
73
where:
LF1 ,G = d1 MG + G(G G)
−1
F10
LF2 ,G = d2 MG + G(G G)
−1
F20
0
0
F 0F
2T
−1
d1
F1 F10 +
2T
F 0F
2T
−1
d2
F1 F20 ,
2T
F 0F
2T
−1
d1
F2 F10 +
2T
F 0F
2T
−1
d2
F2 F20 .
2T
+
+
Substituting equations (C.61) and (C.62) in equation (C.60), we get:
˜ −1 − G) + RG (G,
ˆ −1 − G = ηG + LG (G
˜h
ˆ G,
˜ Fˆ ),
Gh
G
G
(C.63)
∗
− LG,F1 ηF1 − LG,F2 ηF2 ,
ηG = ηG
(C.64)
LG = LG,F1 LF1 ,G + LG,F2 LF2 ,G ,
(C.65)
ˆ G,
˜ Fˆ ) = RG∗ (Fˆ , G)
ˆ − LG,F1 RF,1 (Fˆ , G)
˜ − LG,F2 RF,2 (Fˆ , G).
˜
RG (G,
(C.66)
where:
and the reminder term is:
We can bound matrix LG,F1 as:
kLG,F1 kop
0 −1 1
GG
0
GG ≤ |w1 | kMF ∗ kop + T
op T
op
∗ ∗0 ∗ −1 F F F
ke1 kop √G +
√T T
T op
op
op
= Op (1).
(C.67)
Analogous arguments allow to prove that kLG,F2 kop = Op (1), kLF1 ,G kop = Op (1) and kLF2 ,G kop =
Op (1). These results, together with Lemmas C.1 and C.5, allow to bound the term ηG as:
kηG kop
1
1
1
∗
√
≤ √ kηG
kop − kLG,F1 kop √ kηF1 kop − kLG,F2 kop √ kηF2 kop
T
T
T
T
1
= Op p
,
min(NL , NH , T )
74
and hence:
kηG k
1
√
= Op p
.
T
min(NL , NH , T )
(C.68)
ˆ G,
˜ Fˆ ) in equation (C.66)
Using the results in equations (C.15) and (C.53), the reminder term RG (G,
can be bounded as:
ˆ G,
˜ Fˆ )kop
ˆ Fˆ )kop
˜ op
˜ op
kRG (G,
kRG∗ (G,
kRF,1 (Fˆ , G)k
kRF,2 (Fˆ , G)k
√
√
√
√
≤
+ kLG,F1 kop
+ kLG,F2 kop
T
T
T
T
√
√
1
ˆ −1∗ − F ∗ k/ T + (kFˆ ∗ H
ˆ −1∗ − F ∗ k/ T )2
= Op p
kFˆ ∗ H
F
F
min(NL , T )
1 ˆ ∗ ˆ −1
ˆH
ˆ −1 − Gk
+Op
kF HF ∗ − F ∗ kkG
G
T
√
√
1
∗ ˜ −1
∗
∗ ˜ −1
∗
2
˜
˜
+Op p
kG HG∗ − G k/ 2T + (kG HG∗ − G k/ 2T )
min(NH , T )
1 ˜ ∗ ˜ −1
−1
∗
ˆ − Fk .
+Op
kG HG∗ − G kkFˆ H
(C.69)
F
2T
Equations (C.60), (C.61) and (C.62) can be stacked together in the following way:



I
0
0
0 0 −LF1 ,G
 T





 0
IT
0  υˆ = ηυ +  0 0 −LF2 ,G



LG,F1 LG,F2 IT
0 0 0



υ , υ˜)
 υ˜ + Rυ (ˆ

(C.70)
where:

ˆ −1 − F1
Fˆ1 h
F

 ˆ −1
υˆ =  Fˆ2 h
F − F2

ˆ −1 − G
ˆh
G
G


η
 F1 


ηυ =  ηF2  ,


∗
ηG


˜ −1 − F1
F˜1 h
F



 ˜ ˜ −1

υ˜ =  F2 hF − F2  ,


˜ −1 − G
˜h
G
G


ˆ
˜
R (F , G)
 F,1



ˆ
˜
Rυ (ˆ
υ , υ˜) =  RF,2 (F , G)  .


∗ ˆ ˆ
RG (G, F )


,

75
(C.71)
From equations (C.14), (C.15), (C.52), (C.53) we get:
kηυ k
1
√
= Op p
,
T
min(NL , NH , T )
2 2 kRυ (ˆ
υ , υ˜)k
k˜
υk
1
kˆ
υk
√
+ √
= Op p
.
+ √
T
T
T
min(NL , NH , T )
(C.72)
(C.73)
Moreover, as

IT
0
0
−1




 0
IT
0 


LG,F1 LG,F2 IT

IT
0
0





=  0
IT
0 ,


−LG,F1 −LG,F2 IT
√
√
˜ −1∗ − F ∗ k/ T ≤ C w.p.a. 1, for
˜ ∗H
˜ −1∗ − G∗ k/ T ≤ C and kFˆ ∗ H
the system (C.70) and using kG
G
F
some C, can be rewritten as:
υˆ = ηυ? + Lυ υ˜ + Rυ? (ˆ
υ , υ˜)
(C.74)
where:
\[
\eta_\upsilon^\star = \begin{pmatrix} I_T & 0 & 0 \\ 0 & I_T & 0 \\ -L_{G,F_1} & -L_{G,F_2} & I_T \end{pmatrix}\eta_\upsilon, \qquad
R_\upsilon^\star = \begin{pmatrix} I_T & 0 & 0 \\ 0 & I_T & 0 \\ -L_{G,F_1} & -L_{G,F_2} & I_T \end{pmatrix}R_\upsilon, \qquad (C.75)
\]
and
\[
L_\upsilon = \begin{pmatrix} I_T & 0 & 0 \\ 0 & I_T & 0 \\ -L_{G,F_1} & -L_{G,F_2} & I_T \end{pmatrix}
\begin{pmatrix} 0 & 0 & -L_{F_1,G} \\ 0 & 0 & -L_{F_2,G} \\ 0 & 0 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & -L_{F_1,G} \\ 0 & 0 & -L_{F_2,G} \\ 0 & 0 & L_G \end{pmatrix}, \qquad (C.76)
\]
with $L_G$ defined in equation (C.65). Using result (C.72), we can bound $\eta_\upsilon^\star$ as:
\[
\frac{\|\eta_\upsilon^\star\|_{op}}{\sqrt T} \le \left\|\begin{pmatrix} I_T & 0 & 0 \\ 0 & I_T & 0 \\ -L_{G,F_1} & -L_{G,F_2} & I_T \end{pmatrix}\right\|_{op}\frac{\|\eta_\upsilon\|_{op}}{\sqrt T}
= O_p\!\left(\frac{1}{\sqrt{\min(N_L,N_H,T)}}\right), \qquad (C.77)
\]
which implies:
\[
\frac{\|\eta_\upsilon^\star\|}{\sqrt T} = O_p\!\left(\frac{1}{\sqrt{\min(N_L,N_H,T)}}\right). \qquad (C.78)
\]
Using analogous arguments, we can bound $R_\upsilon^\star(\hat\upsilon,\tilde\upsilon)$ as:
\[
\frac{\|R_\upsilon^\star(\hat\upsilon,\tilde\upsilon)\|}{\sqrt T} = O_p\!\left(\frac{1}{\sqrt{\min(N_L,N_H,T)}}\right)\left[1 + \left(\frac{\|\hat\upsilon\|}{\sqrt T}\right)^{2} + \left(\frac{\|\tilde\upsilon\|}{\sqrt T}\right)^{2}\right].
\]
[...]
Let us now compute matrix LG . We have:
\[
L_G = \Big[w_1\Big(M_{F^*} + \frac{1}{T}G\Big(\frac{G'G}{T}\Big)^{-1}G'\Big) + F^*(F^{*\prime}F^*)^{-1}e_1G'\Big]
\Big[d_1M_G + G(G'G)^{-1}F_1' + \Big(\frac{F'F}{2T}\Big)^{-1}\frac{d_1}{2T}F_1F_1' + \Big(\frac{F'F}{2T}\Big)^{-1}\frac{d_2}{2T}F_1F_2'\Big]
\]
\[
\quad + \Big[w_2\Big(M_{F^*} + \frac{1}{T}G\Big(\frac{G'G}{T}\Big)^{-1}G'\Big) + F^*(F^{*\prime}F^*)^{-1}e_2G'\Big]
\Big[d_2M_G + G(G'G)^{-1}F_2' + \Big(\frac{F'F}{2T}\Big)^{-1}\frac{d_1}{2T}F_2F_1' + \Big(\frac{F'F}{2T}\Big)^{-1}\frac{d_2}{2T}F_2F_2'\Big]
\]
\[
= w_1d_1\,M_{F^*}M_G + 2w_1\,G(G'G)^{-1}F_1' + F^*(F^{*\prime}F^*)^{-1}e_1F_1'
+ w_2d_2\,M_{F^*}M_G + 2w_2\,G(G'G)^{-1}F_2' + F^*(F^{*\prime}F^*)^{-1}e_2F_2' + R_L, \qquad (C.79)
\]
where the remainder term is:
\[
R_L = -\,w_1\,P_{F^*}G(G'G)^{-1}F_1'
+ w_1\,\frac{1}{T}G\Big(\frac{G'G}{T}\Big)^{-1}G'F_1\Big(\frac{F'F}{2T}\Big)^{-1}\Big[\frac{d_1}{2T}F_1' + \frac{d_2}{2T}F_2'\Big]
+ F^*\Big(\frac{F^{*\prime}F^*}{T}\Big)^{-1}\frac{G'F_1}{T}\Big(\frac{F'F}{2T}\Big)^{-1}\Big[\frac{d_1}{2T}e_1F_1' + \frac{d_2}{2T}e_1F_2'\Big]
\]
\[
\quad -\,w_2\,P_{F^*}G(G'G)^{-1}F_2'
+ w_2\,\frac{1}{T}G\Big(\frac{G'G}{T}\Big)^{-1}G'F_2\Big(\frac{F'F}{2T}\Big)^{-1}\Big[\frac{d_1}{2T}F_1' + \frac{d_2}{2T}F_2'\Big]
+ F^*\Big(\frac{F^{*\prime}F^*}{T}\Big)^{-1}\frac{G'F_2}{T}\Big(\frac{F'F}{2T}\Big)^{-1}\Big[\frac{d_1}{2T}e_2F_1' + \frac{d_2}{2T}e_2F_2'\Big],
\]
where we use MF ∗ Fj = 0 for j = 1, 2 and MG G = 0. The term RL (F, G) can be bounded as:
\[
\|R_L\| = O_p(T^{-1/2}),
\]
since $\|F_j'G/T\| = O_p(T^{-1/2})$ for $j = 1, 2$. This allows us to write:
\[
L_G = (w_1d_1 + w_2d_2)\,M_{F^*}M_G + 2w_1\,G(G'G)^{-1}F_1' + 2w_2\,G(G'G)^{-1}F_2' + P_{F^*} + O_p(T^{-1/2}), \qquad (C.80)
\]
where $O_p(T^{-1/2})$ denotes a $(T \times T)$ matrix whose norm is $O_p(T^{-1/2})$.
[...]
Since $\|\tilde\upsilon\|/\sqrt T \le c$, w.p.a. 1, for some constant $c > 0$, the $O_p(T^{-1/2})$ term in the RHS of equation
(C.80) can be absorbed into the residual term of equation (C.63). Moreover, by replacing $F_1$ and $F_2$
with their residuals in the projection onto $G$, we modify matrix $L_G$ by a term of order $O_p(T^{-1/2})$.
Hence, we can analyze matrix $L_G$ as if $(F_1, F_2)$ and $G$ were orthogonal.
[...]
C.1.7 Eigenvalues, eigenvectors and Jordan decomposition of matrix $L_G$
i) Spectral decomposition of matrix LG
Let us now compute the eigenvalues and the associated eigenvectors of matrix LG defined by:
\[
L_G = P_{F^*} + (w_1d_1 + w_2d_2)\,M_{F^*}M_G + 2w_1\,G(G'G)^{-1}F_1' + 2w_2\,G(G'G)^{-1}F_2'.
\]
Since the vectors F1 and F2 are orthogonal (asymptotically) to vector G, the matrix MF ∗ MG is the
orthogonal projection onto the orthogonal complement of the linear subspace generated by vectors F1 ,
F2 and G. Moreover the matrix
\[
\mathcal{A} = P_{F^*} + 2w_1\,G(G'G)^{-1}F_1' + 2w_2\,G(G'G)^{-1}F_2'
\]
is (asymptotically) idempotent, with $(T-2)$-dimensional null space equal to the orthogonal complement of the span of vectors $F_1$ and $F_2$. Hence, matrix $\mathcal{A}$ admits the eigenvalue 1 with multiplicity 2, and the eigenvalue 0 with multiplicity $T - 2$. Moreover, matrix $\mathcal{A}$ maps the subspace
$E_1 = \mathrm{span}\{F_1, F_2, G\}$ spanned by vectors $F_1$, $F_2$ and $G$ into itself. Matrix $\mathcal{A}$ is an oblique projection
onto a bi-dimensional subspace of $E_1$.
We deduce that matrix LG admits two invariant subspaces, namely E1 and its orthogonal complement E2 , of dimensions 3 and T −3, respectively. On subspace E2 , the linear operator corresponding to
matrix LG is diagonal and equal to w1 d1 + w2 d2 . On subspace E1 , the linear operators corresponding
to matrices LG and A are equal. We conclude that matrix LG admits the eigenvalue 0, associated to
the eigenvector G, the eigenvalue w1 d1 + w2 d2 , with multiplicity T − 3, associated to the eigenspace
E2 , and the eigenvalue 1 with multiplicity 2.
To conclude, let us derive the bi-dimensional eigenspace of matrix LG associated to eigenvalue 1.
Since this eigenspace is also the eigenspace of matrix A associated to eigenvalue 1, and matrix A is
idempotent, it is enough to find two linearly independent vectors in the image space of A . Two such
vectors are:
\[
\mathcal{A}F_1 = F_1 + 2(w_1 + w_2\phi)G, \qquad \mathcal{A}F_2 = F_2 + 2(w_1\phi + w_2)G.
\]
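To make the spectral structure above concrete, here is a small numerical sketch (an illustration under assumed values, not the authors' code): with $F_1$, $F_2$, $G$ taken orthonormal and arbitrary weights $w_1, w_2, d_1, d_2$, the matrix $L_G$ built from the expression above has eigenvalue 0 (eigenvector $G$), eigenvalue $w_1d_1 + w_2d_2$ with multiplicity $T-3$, and eigenvalue 1 with multiplicity 2; since $\phi = F_1'F_2 = 0$ in this setup, $\mathcal{A}F_1$ reduces to $F_1 + 2w_1G$.

```python
import numpy as np

# Numerical illustration (assumed values, not from the paper): build L_G from the
# expression above with orthonormal F1, F2, G and check its spectrum.
rng = np.random.default_rng(1)
T, w1, w2, d1, d2 = 50, 0.3, 0.4, 0.5, 0.6           # illustrative values
B, _ = np.linalg.qr(rng.normal(size=(T, 3)))          # orthonormal columns
F1, F2, G = B[:, [0]], B[:, [1]], B[:, [2]]
Fstar = np.hstack([F1, F2])

P_F = Fstar @ np.linalg.solve(Fstar.T @ Fstar, Fstar.T)   # projection on span{F1, F2}
M_F = np.eye(T) - P_F
M_G = np.eye(T) - G @ np.linalg.solve(G.T @ G, G.T)
lam = w1 * d1 + w2 * d2
L_G = (P_F + lam * (M_F @ M_G)
       + 2 * w1 * G @ np.linalg.solve(G.T @ G, F1.T)
       + 2 * w2 * G @ np.linalg.solve(G.T @ G, F2.T))

eig = np.sort(np.linalg.eigvals(L_G).real)
assert np.isclose(eig[0], 0.0, atol=1e-8)             # eigenvalue 0 (eigenvector G)
assert np.allclose(eig[1:T - 2], lam)                 # eigenvalue w1*d1 + w2*d2, multiplicity T - 3
assert np.allclose(eig[T - 2:], 1.0)                  # eigenvalue 1, multiplicity 2
assert np.allclose(L_G @ F1, F1 + 2 * w1 * G)         # A F1 with phi = F1'F2 = 0 here
```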
ii) Jordan decomposition of matrix LG
The Jordan decomposition theorem (see Theorem 14 in Magnus and Neudecker (2007), p. 18) ensures the existence of a non-singular $T \times T$ matrix $Q$ and an
upper-triangular matrix $\bar L_G$ whose diagonal elements are the eigenvalues of $L_G$, such that:
\[
Q\,L_G\,Q^{-1} = \bar L_G, \qquad (C.81)
\]
where
\[
\bar L_G = \begin{pmatrix} L_{G,I} & 0 \\ 0 & L_{G,II} \end{pmatrix}, \qquad (C.82)
\]
with $L_{G,I}$ an upper-triangular $(T-2)\times(T-2)$ block whose diagonal elements are $\lambda^*,\dots,\lambda^*,0$, $L_{G,II}$ an upper-triangular $2\times 2$ block whose diagonal elements are both equal to 1, and $\lambda^* = w_1d_1 + w_2d_2$. The norms of the two matrices $L_{G,I}$ and $L_{G,II}$ are $\|L_{G,I}\|_{op} = w_1d_1 + w_2d_2 < 1$ and $\|L_{G,II}\|_{op} = 1$.
iii) Another decomposition of matrix LG
We showed that $L_G$ admits two invariant subspaces, namely $E_1$ and its orthogonal complement $E_2$, of
dimensions 3 and $T - 3$, respectively. Let $[v_1, v_2, v_3]$ be an orthonormal basis for $E_1$, and $[w_1, \ldots, w_{T-3}]$ be
an orthonormal basis for $E_2$. Then the matrix defined as:
\[
Q = [w_1, \ldots, w_{T-3}, v_1, v_2, v_3] \qquad (C.83)
\]
is orthogonal:
\[
Q'Q = QQ' = I_T, \quad \text{so that} \quad Q' = Q^{-1}. \qquad (C.84)
\]
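As a brief illustration (assumed setup, not part of the proof), the orthogonal matrix $Q$ in (C.83) can be built numerically by completing an orthonormal basis of $E_1 = \mathrm{span}\{F_1, F_2, G\}$ to a basis of $\mathbb{R}^T$:

```python
import numpy as np

# Illustrative construction of Q in (C.83): complete an orthonormal basis of
# E1 = span{F1, F2, G} to an orthonormal basis of R^T (random factor paths assumed).
rng = np.random.default_rng(2)
T = 20
E1 = rng.normal(size=(T, 3))                  # columns play the role of F1, F2, G
Qfull, _ = np.linalg.qr(E1, mode='complete')  # first 3 columns span E1, the rest span E2
V, W = Qfull[:, :3], Qfull[:, 3:]             # [v1, v2, v3] and [w_1, ..., w_{T-3}]
Q = np.column_stack([W, V])                   # Q = [w_1, ..., w_{T-3}, v1, v2, v3]
assert np.allclose(Q.T @ Q, np.eye(T))        # Q'Q = QQ' = I_T, so Q' = Q^{-1}
```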
C.2 Proof of Proposition 4
[...]
C.3 Proof of Proposition 5
Let $z_t = [f_{1,t}', f_{2,t}', g_t']'$ be the vector of stacked factors at time $t$, as defined in Section 4.2, and let
$\hat z_t = [\hat f_{1,t}', \hat f_{2,t}', \hat g_t']'$. From Proposition 4 we have:
\[
\frac{1}{T}\sum_{t=1}^{T}\big\|\hat z_t - \hat H'z_t\big\|^2 = O_p\Big(\frac{1}{T}\Big), \qquad (C.85)
\]
\[
\big\|\hat H - H\big\| = O_p\Big(\frac{1}{\sqrt T}\Big), \qquad H = I_{2K_H + K_L}. \qquad (C.86)
\]

C.3.1 Consistency
We recall that the reduced-form factor dynamics is:
zt = C(θ)zt−1 + ζt ,
where matrix $C(\theta)$ is the autoregressive matrix in Equation (13) written as a function of $\theta$, and $V(\zeta_t) =
\Sigma_\zeta(\theta)$. The parameter $\theta$ is subject to the constraint $\theta \in \Theta$, where $\Theta \subset \mathbb{R}^p$ is the compact set of
parameter values that satisfy matrix equation (5). Parameter $\theta$ is estimated by constrained Gaussian
Pseudo Maximum Likelihood (PML), and is the solution of the following maximization problem:
\[
\hat\theta = \arg\max_{\theta\in\Theta}\ \hat Q_T(\theta), \qquad (C.87)
\]
where the criterion $\hat Q_T(\theta)$ is defined as:
\[
\hat Q_T(\theta) = -\frac{1}{2}\log|\Sigma_\zeta(\theta)| - \frac{1}{2T}\sum_{t=2}^{T}\,[\hat z_t - C(\theta)\hat z_{t-1}]'\,\Sigma_\zeta(\theta)^{-1}\,[\hat z_t - C(\theta)\hat z_{t-1}]. \qquad (C.88)
\]
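For concreteness, a minimal sketch of the criterion (C.88) is given below (an illustration under assumed inputs, not the authors' implementation); the function name and argument layout are our own.

```python
import numpy as np

# Minimal sketch of the Gaussian PML criterion in (C.88): z_hat stacks the estimated
# factors by rows, while C and Sigma_zeta are the model matrices evaluated at a
# candidate parameter value theta.
def pml_criterion(z_hat, C, Sigma_zeta):
    """z_hat: (T, K) array; C: (K, K) autoregressive matrix; Sigma_zeta: (K, K)."""
    T = z_hat.shape[0]
    resid = z_hat[1:] - z_hat[:-1] @ C.T                    # z_t - C z_{t-1}, t = 2, ..., T
    quad = np.einsum('ti,ij,tj->', resid, np.linalg.inv(Sigma_zeta), resid)
    _, logdet = np.linalg.slogdet(Sigma_zeta)
    return -0.5 * logdet - quad / (2.0 * T)
```

In practice $\hat\theta$ would be obtained by maximizing this criterion over the constrained set $\Theta$, for instance by passing the negative of the criterion, composed with a mapping $\theta \mapsto (C(\theta), \Sigma_\zeta(\theta))$ that encodes the model restrictions, to a generic numerical optimizer.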
Note that, if the factor values were observable, parameter $\theta$ would be estimated by constrained Gaussian PML by maximizing the following criterion:
\[
Q_T(\theta) = -\frac{1}{2}\log|\Sigma_\zeta(\theta)| - \frac{1}{2T}\sum_{t=2}^{T}\,[z_t - C(\theta)z_{t-1}]'\,\Sigma_\zeta(\theta)^{-1}\,[z_t - C(\theta)z_{t-1}], \qquad (C.89)
\]
w.r.t. θ ∈ Θ. Let us rewrite the stacked factor estimate zˆt as:
\[
\hat z_t = z_t + (\hat z_t - \hat H'z_t) + (\hat H - H)'z_t. \qquad (C.90)
\]
Substituting equation (C.90) in the criterion (C.88), using the bounds (C.85) and (C.86) and the uniform boundedness of matrices C(θ) and Σζ (θ)−1 , we get the next Lemma, which is proved in the
supplementary material.
Lemma C.6.
\[
\hat Q_T(\theta) = Q_T(\theta) + o_p(1), \qquad (C.91)
\]
uniformly w.r.t. $\theta \in \Theta$.
From standard PML theory (see, for instance, Gourieroux and Monfort (1995)) we have:
\[
\sup_{\theta\in\Theta}\,|Q_T(\theta) - Q_\infty(\theta)| = o_p(1), \qquad (C.92)
\]
where the limit criterion
\[
Q_\infty(\theta) = -\frac{1}{2}\log|\Sigma_\zeta(\theta)| - \frac{1}{2}E_0\big\{[z_t - C(\theta)z_{t-1}]'\,\Sigma_\zeta(\theta)^{-1}\,[z_t - C(\theta)z_{t-1}]\big\}, \qquad (C.93)
\]
is maximized uniquely at the true value of parameter $\theta$. Finally, equation (C.92) and Lemma C.6 allow
us to conclude that:
\[
\sup_{\theta\in\Theta}\,|\hat Q_T(\theta) - Q_\infty(\theta)| = o_p(1). \qquad (C.94)
\]
Then, by standard results on extremum estimators, we conclude that $\hat\theta = \theta + o_p(1)$, i.e. $\hat\theta$ is a consistent
estimator.
C.3.2 Rate of convergence
The first order conditions (F.O.C.) of the maximization problem (C.87) are:
\[
\frac{\partial}{\partial\theta}\hat Q_T(\hat\theta_T) = 0. \qquad (C.95)
\]
Applying the mean-value theorem to the F.O.C. in the last equation, we have:
\[
\sqrt{T}\,\frac{\partial}{\partial\theta}\hat Q_T(\theta_0) + \frac{\partial^2}{\partial\theta\partial\theta'}\hat Q_T(\bar\theta)\,\sqrt{T}\big(\hat\theta_T - \theta_0\big) = 0, \qquad (C.96)
\]
where $\bar\theta$ is between $\theta_0$ and $\hat\theta_T$ componentwise, and $\theta_0$ denotes the true parameter value. By similar
arguments as in Lemma C.6 and equation (C.92) we have the following Lemma, which is proved in
the supplementary material:
Lemma C.7.
\[
\sup_{\theta\in\Theta}\left\|\frac{\partial^2}{\partial\theta\partial\theta'}\hat Q_T(\theta) - \frac{\partial^2}{\partial\theta\partial\theta'}Q_T(\theta)\right\| = o_p(1), \qquad (C.97)
\]
\[
\sup_{\theta\in\Theta}\left\|\frac{\partial^2}{\partial\theta\partial\theta'}Q_T(\theta) - \frac{\partial^2}{\partial\theta\partial\theta'}Q_\infty(\theta)\right\| = o_p(1). \qquad (C.98)
\]
Moreover, since $\hat\theta_T$ is consistent, Lemma C.7 implies:
\[
\frac{\partial^2}{\partial\theta\partial\theta'}\hat Q_T(\bar\theta) = \frac{\partial^2}{\partial\theta\partial\theta'}Q_\infty(\theta_0) + o_p(1), \qquad (C.99)
\]
where $\frac{\partial^2}{\partial\theta\partial\theta'}Q_\infty(\theta_0)$ is nonsingular. Rearranging equation (C.96) we have:
\[
\sqrt{T}\big(\hat\theta_T - \theta_0\big) = -\left[\frac{\partial^2}{\partial\theta\partial\theta'}Q_\infty(\theta_0) + o_p(1)\right]^{-1}\sqrt{T}\,\frac{\partial}{\partial\theta}\hat Q_T(\theta_0). \qquad (C.100)
\]
The term $\sqrt{T}\,\frac{\partial}{\partial\theta}\hat Q_T(\theta_0)$ in the RHS of equation (C.100) can be rewritten as:
\[
\sqrt{T}\,\frac{\partial}{\partial\theta}\hat Q_T(\theta_0) = \sqrt{T}\,\frac{\partial}{\partial\theta}Q_T(\theta_0) + \sqrt{T}\left[\frac{\partial}{\partial\theta}\hat Q_T(\theta_0) - \frac{\partial}{\partial\theta}Q_T(\theta_0)\right]. \qquad (C.101)
\]
The first term in the RHS of equation (C.101) can be bounded as:
\[
\sqrt{T}\,\frac{\partial}{\partial\theta}Q_T(\theta_0) = O_p(1), \qquad (C.102)
\]
applying a CLT for serially dependent data. Results (C.85) and (C.86) allow us to bound the second term
in the RHS of equation (C.101) as in the next Lemma, which is proved in the supplementary material:
Lemma C.8.
\[
\sqrt{T}\left\|\frac{\partial}{\partial\theta}\hat Q_T(\theta_0) - \frac{\partial}{\partial\theta}Q_T(\theta_0)\right\| = O_p(1). \qquad (C.103)
\]
The bounds in equations (C.103) and (C.102) allow us to conclude that:
\[
\sqrt{T}\,\|\hat\theta_T - \theta_0\| = O_p(1).
\]
Q.E.D.
APPENDIX D: Factor dynamics with yearly-quarterly mixed frequencies
In this Appendix we consider the setting with yearly (LF) - quarterly (HF) data, one HF factor and one
LF factor (i.e., KH = KL = 1) as in the empirical section. The model of Section 2 is extended to
accommodate m = 4 HF subperiods. With scalar factors, the model parameters in the factor dynamics
are scalar, and denoted by lower-case letters.
D.1 Structural VAR representation
The dynamics of the stacked factor vector $z_t = [f_{1,t}, f_{2,t}, f_{3,t}, f_{4,t}, g_t]'$ is given by the structural VAR(1)
model (Ghysels (2012)):
\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 0\\
-r_H & 1 & 0 & 0 & 0\\
0 & -r_H & 1 & 0 & 0\\
0 & 0 & -r_H & 1 & 0\\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} f_{1,t}\\ f_{2,t}\\ f_{3,t}\\ f_{4,t}\\ g_t \end{pmatrix}
=
\begin{pmatrix}
0 & 0 & 0 & r_H & a_1\\
0 & 0 & 0 & 0 & a_2\\
0 & 0 & 0 & 0 & a_3\\
0 & 0 & 0 & 0 & a_4\\
m_1 & m_2 & m_3 & m_4 & r_L
\end{pmatrix}
\begin{pmatrix} f_{1,t-1}\\ f_{2,t-1}\\ f_{3,t-1}\\ f_{4,t-1}\\ g_{t-1} \end{pmatrix}
+
\begin{pmatrix} v_{1,t}\\ v_{2,t}\\ v_{3,t}\\ v_{4,t}\\ w_t \end{pmatrix},
\qquad (D.1)
\]
that is
\[
\Gamma z_t = R z_{t-1} + \eta_t, \qquad (D.2)
\]
where $\eta_t = (v_{1,t}, v_{2,t}, v_{3,t}, v_{4,t}, w_t)'$ is a multivariate white noise process with mean 0 and variance-covariance matrix:
\[
\Sigma = \begin{pmatrix}
\sigma_H^2 & 0 & 0 & 0 & \sigma_{HL,1}\\
0 & \sigma_H^2 & 0 & 0 & \sigma_{HL,2}\\
0 & 0 & \sigma_H^2 & 0 & \sigma_{HL,3}\\
0 & 0 & 0 & \sigma_H^2 & \sigma_{HL,4}\\
\sigma_{HL,1} & \sigma_{HL,2} & \sigma_{HL,3} & \sigma_{HL,4} & \sigma_L^2
\end{pmatrix}. \qquad (D.3)
\]
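As an illustration of the model's layout (the parameter values below are arbitrary assumptions, not estimates), the matrices $\Gamma$, $R$ and $\Sigma$ of (D.1)-(D.3) can be assembled as follows.

```python
import numpy as np

# Illustrative sketch (assumed parameter values): build the structural VAR(1)
# matrices Gamma, R and Sigma of (D.1)-(D.3) for the yearly (LF) - quarterly (HF)
# case with m = 4 subperiods.
def structural_var_matrices(r_H, r_L, a, m, sig2_H, sig2_L, sig_HL):
    """a, m, sig_HL: length-4 arrays (a_1..a_4, m_1..m_4, sigma_HL,1..sigma_HL,4)."""
    Gamma = np.eye(5)
    Gamma[1, 0] = Gamma[2, 1] = Gamma[3, 2] = -r_H
    R = np.zeros((5, 5))
    R[0, 3] = r_H                     # f_{1,t} loads on f_{4,t-1}
    R[:4, 4] = a                      # loadings of f_{i,t} on g_{t-1}
    R[4, :4] = m                      # loadings of g_t on f_{i,t-1}
    R[4, 4] = r_L
    Sigma = np.diag([sig2_H] * 4 + [sig2_L]).astype(float)
    Sigma[:4, 4] = Sigma[4, :4] = sig_HL
    return Gamma, R, Sigma

# Example call with arbitrary illustrative numbers
Gamma, R, Sigma = structural_var_matrices(
    r_H=0.5, r_L=0.6, a=np.zeros(4), m=np.array([0.1, 0.1, 0.1, 0.1]),
    sig2_H=0.75, sig2_L=0.9, sig_HL=np.zeros(4))
```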
D.2 Restrictions implied by the factor normalization
The factor normalization is:
\[
V(z_t) = V\begin{pmatrix} f_{1,t}\\ f_{2,t}\\ f_{3,t}\\ f_{4,t}\\ g_t \end{pmatrix}
= \begin{pmatrix}
1 & \phi_1 & \phi_2 & \phi_3 & 0\\
\phi_1 & 1 & \phi_1 & \phi_2 & 0\\
\phi_2 & \phi_1 & 1 & \phi_1 & 0\\
\phi_3 & \phi_2 & \phi_1 & 1 & 0\\
0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\]
In particular, under stationarity we have:
φ1 = Cov(f1,t , f2,t ) = Cov(f2,t , f3,t ) = Cov(f3,t , f4,t ),
since f1,t , f2,t , f3,t and f4,t are consecutive realizations of the HF factor process. Similarly:
φ2 = Cov(f1,t , f3,t ) = Cov(f2,t , f4,t ).
By computing the variance on both sides of equation (D.2) we get:
\[
\Gamma V\Gamma' = RVR' + \Sigma. \qquad (D.4)
\]
By matrix multiplication (the matrices below are symmetric, so only their upper triangles are displayed):
\[
\Gamma V\Gamma' = \begin{pmatrix}
1 & \phi_1 - r_H & \phi_2 - r_H\phi_1 & \phi_3 - r_H\phi_2 & 0\\
 & r_H^2 - 2r_H\phi_1 + 1 & r_H^2\phi_1 - r_H(1+\phi_2) + \phi_1 & r_H^2\phi_2 - r_H(\phi_1+\phi_3) + \phi_2 & 0\\
 & & r_H^2 - 2r_H\phi_1 + 1 & r_H^2\phi_1 - r_H(1+\phi_2) + \phi_1 & 0\\
 & & & r_H^2 - 2r_H\phi_1 + 1 & 0\\
 & & & & 1
\end{pmatrix},
\]
and:
\[
RVR' = \begin{pmatrix}
r_H^2 + a_1^2 & a_1a_2 & a_1a_3 & a_1a_4 & A_{15}^*\\
 & a_2^2 & a_2a_3 & a_2a_4 & a_2r_L\\
 & & a_3^2 & a_3a_4 & a_3r_L\\
 & & & a_4^2 & a_4r_L\\
 & & & & A_{55}^*
\end{pmatrix},
\]
where:
\[
A_{15}^* = r_H(\phi_3 m_1 + \phi_2 m_2 + \phi_1 m_3 + m_4) + a_1 r_L,
\]
\[
A_{55}^* = m_1^2 + m_2^2 + m_3^2 + m_4^2 + 2\phi_1(m_1m_2 + m_2m_3 + m_3m_4) + 2\phi_2(m_1m_3 + m_2m_4) + 2\phi_3 m_1m_4 + r_L^2.
\]
Hence from (D.4) we get the following equations:
n.   Position   Equation
1    (1,1)      $1 = r_H^2 + a_1^2 + \sigma_H^2$
2    (2,2)      $r_H^2 - 2r_H\phi_1 + 1 = a_2^2 + \sigma_H^2$
3    (3,3)      $r_H^2 - 2r_H\phi_1 + 1 = a_3^2 + \sigma_H^2$
4    (4,4)      $r_H^2 - 2r_H\phi_1 + 1 = a_4^2 + \sigma_H^2$
5    (5,5)      $1 = A_{55}^* + \sigma_L^2$
6    (1,2)      $-r_H + \phi_1 = a_1a_2$
7    (1,3)      $-r_H\phi_1 + \phi_2 = a_1a_3$
8    (1,4)      $-r_H\phi_2 + \phi_3 = a_1a_4$
9    (1,5)      $0 = A_{15}^* + \sigma_{HL,1}$
10   (2,3)      $\phi_1(r_H^2 + 1) - \phi_2 r_H - r_H = a_2a_3$
11   (2,4)      $\phi_2(r_H^2 + 1) - \phi_3 r_H - \phi_1 r_H = a_2a_4$
12   (2,5)      $0 = a_2 r_L + \sigma_{HL,2}$
13   (3,4)      $\phi_1(r_H^2 + 1) - \phi_2 r_H - r_H = a_3a_4$
14   (3,5)      $0 = a_3 r_L + \sigma_{HL,3}$
15   (4,5)      $0 = a_4 r_L + \sigma_{HL,4}$
These equations imply:
(1) $\sigma_H^2 = 1 - r_H^2 - a_1^2$,
(2) $\sigma_H^2 = r_H^2 - 2r_H\phi_1 + 1 - a_2^2$,
(3) $\sigma_H^2 = r_H^2 - 2r_H\phi_1 + 1 - a_3^2$,
(4) $\sigma_H^2 = r_H^2 - 2r_H\phi_1 + 1 - a_4^2$,
(5) $\sigma_L^2 = 1 - A_{55}^*$,
(6) $\phi_1 = r_H + a_1a_2$,
(7) $\phi_2 = r_H^2 + r_H a_1a_2 + a_1a_3$,
(8) $\phi_3 = r_H^3 + r_H^2 a_1a_2 + r_H a_1a_3 + a_1a_4$,
(9) $\sigma_{HL,1} = -A_{15}^*$,
(10) $\phi_1(r_H^2 + 1) - \phi_2 r_H - r_H = a_2a_3$,
(11) $\phi_2(r_H^2 + 1) - \phi_3 r_H - \phi_1 r_H = a_2a_4$,
(12) $\sigma_{HL,2} = -a_2 r_L$,
(13) $\phi_1(r_H^2 + 1) - \phi_2 r_H - r_H = a_3a_4$,
(14) $\sigma_{HL,3} = -a_3 r_L$,
(15) $\sigma_{HL,4} = -a_4 r_L$.
Let θ denote the vector containing rH , rL , ai and mi for all i = 1, 2, 3, 4. Equations (6), (7), (8)
express φ1 , φ2 , φ3 in terms of θ:
\[
\phi_1 = r_H + a_1a_2, \qquad (D.5)
\]
\[
\phi_2 = r_H^2 + r_H a_1a_2 + a_1a_3, \qquad (D.6)
\]
\[
\phi_3 = r_H^3 + r_H^2 a_1a_2 + r_H a_1a_3 + a_1a_4. \qquad (D.7)
\]
Equations (1), (5), (9), (12), (14) and (15) express the elements of the variance-covariance matrix $\Sigma$ in
terms of $\theta$:
\[
\sigma_H^2 = 1 - r_H^2 - a_1^2, \qquad (D.8)
\]
\[
\sigma_L^2 = 1 - A_{55}^*, \qquad (D.9)
\]
\[
\sigma_{HL,1} = -A_{15}^*, \qquad (D.10)
\]
\[
\sigma_{HL,2} = -a_2 r_L, \qquad (D.11)
\]
\[
\sigma_{HL,3} = -a_3 r_L, \qquad (D.12)
\]
\[
\sigma_{HL,4} = -a_4 r_L. \qquad (D.13)
\]
Finally, the remaining equations (2), (3), (4), (10), (11) and (13) provide restrictions on the elements
of θ:
\[
r_H^2 - 2r_H\phi_1 + 1 - a_2^2 = 1 - r_H^2 - a_1^2, \qquad (D.14)
\]
\[
a_2^2 = a_3^2 = a_4^2, \qquad (D.15)
\]
\[
\phi_1(r_H^2 + 1) - \phi_2 r_H - r_H = a_2a_3, \qquad (D.16)
\]
\[
\phi_2(r_H^2 + 1) - \phi_3 r_H - \phi_1 r_H = a_2a_4, \qquad (D.17)
\]
\[
a_2a_3 = a_3a_4. \qquad (D.18)
\]
By using (D.5), (D.6) and (D.7), the equations (D.14) - (D.18) can be written as:
\[
a_2^2 = a_3^2 = a_4^2, \qquad (D.19)
\]
\[
a_2a_3 = a_3a_4, \qquad (D.20)
\]
\[
a_1^2 - 2r_H a_2a_1 - a_2^2 = 0, \qquad (D.21)
\]
\[
a_1a_2 - r_H a_1a_3 - a_2a_3 = 0, \qquad (D.22)
\]
\[
a_1a_3 - r_H a_1a_4 - a_2a_4 = 0. \qquad (D.23)
\]
The system of equations admits three sets of alternative solutions:
• a1 = a2 = a3 = a4 = 0, rH ∈ R,
• rH = 0, a1 = a2 = a3 = a4 ∈ R,
• rH = 0, a1 = −a2 = a3 = −a4 ∈ R.
We focus on the first set of solutions, and impose a1 = a2 = a3 = a4 = 0.
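The substitution step above can also be double-checked symbolically. The short sketch below (an illustration, not part of the paper) verifies that plugging (D.5)-(D.7) into (D.14), (D.16) and (D.17) reproduces the left-hand sides of (D.21), (D.22) and (D.23); the symbol names mirror the paper's notation.

```python
from sympy import symbols, expand

# Symbolic check: substituting (D.5)-(D.7) into (D.14), (D.16), (D.17)
# yields the restrictions (D.21), (D.22), (D.23).
r_H, a1, a2, a3, a4 = symbols('r_H a1 a2 a3 a4')
phi1 = r_H + a1 * a2                                          # (D.5)
phi2 = r_H**2 + r_H * a1 * a2 + a1 * a3                       # (D.6)
phi3 = r_H**3 + r_H**2 * a1 * a2 + r_H * a1 * a3 + a1 * a4    # (D.7)

# (D.14), written as LHS - RHS, should equal a1^2 - 2 r_H a1 a2 - a2^2, i.e. (D.21)
lhs_D14 = (r_H**2 - 2 * r_H * phi1 + 1 - a2**2) - (1 - r_H**2 - a1**2)
assert expand(lhs_D14 - (a1**2 - 2 * r_H * a1 * a2 - a2**2)) == 0

# (D.16) becomes (D.22), and (D.17) becomes (D.23)
lhs_D16 = phi1 * (r_H**2 + 1) - phi2 * r_H - r_H - a2 * a3
assert expand(lhs_D16 - (a1 * a2 - r_H * a1 * a3 - a2 * a3)) == 0
lhs_D17 = phi2 * (r_H**2 + 1) - phi3 * r_H - phi1 * r_H - a2 * a4
assert expand(lhs_D17 - (a1 * a3 - r_H * a1 * a4 - a2 * a4)) == 0
```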
D.3 Reduced form representation
By inverting the matrix on the LHS of equation (D.1):
\[
\Gamma^{-1} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0\\
r_H & 1 & 0 & 0 & 0\\
r_H^2 & r_H & 1 & 0 & 0\\
r_H^3 & r_H^2 & r_H & 1 & 0\\
0 & 0 & 0 & 0 & 1
\end{pmatrix},
\]
the reduced form of the structural VAR(1) model in equation (D.1) is given by (see Ghysels (2012)):
\[
\begin{pmatrix} f_{1,t}\\ f_{2,t}\\ f_{3,t}\\ f_{4,t}\\ g_t \end{pmatrix}
=
\begin{pmatrix}
0 & 0 & 0 & r_H & a_1\\
0 & 0 & 0 & r_H^2 & r_Ha_1 + a_2\\
0 & 0 & 0 & r_H^3 & r_H^2a_1 + r_Ha_2 + a_3\\
0 & 0 & 0 & r_H^4 & r_H^3a_1 + r_H^2a_2 + r_Ha_3 + a_4\\
m_1 & m_2 & m_3 & m_4 & r_L
\end{pmatrix}
\begin{pmatrix} f_{1,t-1}\\ f_{2,t-1}\\ f_{3,t-1}\\ f_{4,t-1}\\ g_{t-1} \end{pmatrix}
+ \zeta_t,
\]
where the zero-mean innovation vector $\zeta_t = \Gamma^{-1}\eta_t$ has the variance-covariance matrix
\[
V(\zeta_t) = \begin{pmatrix}
\sigma_H^2 & r_H\sigma_H^2 & r_H^2\sigma_H^2 & r_H^3\sigma_H^2 & \sigma_{HL,1}\\
 & (1+r_H^2)\sigma_H^2 & r_H(1+r_H^2)\sigma_H^2 & r_H^2(1+r_H^2)\sigma_H^2 & r_H\sigma_{HL,1} + \sigma_{HL,2}\\
 & & (1+r_H^2+r_H^4)\sigma_H^2 & r_H(1+r_H^2+r_H^4)\sigma_H^2 & r_H^2\sigma_{HL,1} + r_H\sigma_{HL,2} + \sigma_{HL,3}\\
 & & & (1+r_H^2+r_H^4+r_H^6)\sigma_H^2 & r_H^3\sigma_{HL,1} + r_H^2\sigma_{HL,2} + r_H\sigma_{HL,3} + \sigma_{HL,4}\\
 & & & & \sigma_L^2
\end{pmatrix},
\]
where the omitted lower-triangular entries follow by symmetry.
Let us now impose the restrictions from factor normalization derived in Section D.2. Using a1 = a2 =
a3 = a4 = 0, from equations (D.8)-(D.13), the parameters of the variance-covariance matrix of the
innovations are:
\[
\sigma_H^2 = 1 - r_H^2, \qquad (D.24)
\]
\[
\sigma_L^2 = 1 - \big[m_1^2 + m_2^2 + m_3^2 + m_4^2 + 2r_H(m_1m_2 + m_2m_3 + m_3m_4) + 2r_H^2(m_1m_3 + m_2m_4) + 2r_H^3 m_1m_4 + r_L^2\big], \qquad (D.25)
\]
\[
\sigma_{HL,1} = -r_H\big(r_H^3 m_1 + r_H^2 m_2 + r_H m_3 + m_4\big), \qquad (D.26)
\]
\[
\sigma_{HL,2} = \sigma_{HL,3} = \sigma_{HL,4} = 0. \qquad (D.27)
\]

D.4 Stationarity conditions
The stationarity condition for the VAR(1) model in equation (D.2) is that the eigenvalues of the matrix
\[
\Gamma^{-1}R = \begin{pmatrix}
0 & 0 & 0 & r_H & a_1\\
0 & 0 & 0 & r_H^2 & r_Ha_1 + a_2\\
0 & 0 & 0 & r_H^3 & r_H^2a_1 + r_Ha_2 + a_3\\
0 & 0 & 0 & r_H^4 & r_H^3a_1 + r_H^2a_2 + r_Ha_3 + a_4\\
m_1 & m_2 & m_3 & m_4 & r_L
\end{pmatrix}
\]
are smaller than one in modulus. If either $a_i = 0$ for all $i$, or $m_i = 0$ for all $i$, the stationarity condition
becomes $|r_H| < 1$ and $|r_L| < 1$.
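In practice the condition can be checked directly by computing the eigenvalues of $\Gamma^{-1}R$; the sketch below uses arbitrary illustrative values (an assumption-laden example, not calibrated to the paper).

```python
import numpy as np

# Check stationarity of the reduced-form VAR(1): all eigenvalues of Gamma^{-1} R
# must lie strictly inside the unit circle. With a_i = 0, the text notes this
# reduces to |r_H| < 1 and |r_L| < 1.
r_H, r_L = 0.5, 0.6
a = np.zeros(4)
m = np.array([0.1, 0.2, 0.1, 0.2])
Gamma = np.eye(5)
Gamma[1, 0] = Gamma[2, 1] = Gamma[3, 2] = -r_H
R = np.zeros((5, 5))
R[0, 3] = r_H
R[:4, 4] = a
R[4, :4] = m
R[4, 4] = r_L

C = np.linalg.solve(Gamma, R)          # reduced-form autoregressive matrix Gamma^{-1} R
eigs = np.linalg.eigvals(C)
print(np.abs(eigs))                    # moduli of the eigenvalues
assert np.all(np.abs(eigs) < 1)        # stationarity holds for these values
```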