ARCHIVES - Population Studies Center

Causal Analysis Project
Interim Report No. 1
December, 1968

CORRELATIONAL PROPERTIES OF SIMULATED PANEL DATA
WITH CAUSAL CONNECTIONS BETWEEN TWO VARIABLES

Donald C. Pelz
assisted by Spyros Magliveras,
with technical appendices by Robert A. Lew

SURVEY RESEARCH CENTER
The University of Michigan
Ann Arbor, Michigan
Causal Analysis 1
TABLE OF CONTENTS, WITH ABSTRACTS*

                                                                  Page

A. Introduction ................................................... 1
     Two questions ................................................ 5
     Intuitive expectations (Figures 1.1 to 1.3) .................. 6
     Multiple models for the same empirical data (Figure 1.4) ..... 10

     Abstract: If one has cross-sectional survey data in which a correlation appears between two variables x and y, it is seldom possible to infer the direction of causal influence (if any) between them. However, with panel data containing measurements on the same individuals at two or more times, a difference in the "cross-lagged correlations" has been suggested as a means of inferring causal priority. This report will explore the question: given known causal connections between two simulated variables, will this cross-lagged differential appear? The reverse question is more difficult: if a cross-lagged differential is observed, can one infer causal connections? Intuitive conjectures are presented on how the presence of causal influence in simulated data will affect the correlations between two variables as the measurement lag increases.

B. General Characteristics of Two-Variable Model ................. 12
     Two sources of consistency (Figures 1.5a to d) ............... 14
     Types of variables (Figure 1.5e) ............................. 15

     Abstract: To observe what correlational properties will follow from known connections, simulated variables x and y have been generated. At successive cycles of the operation, an individual's score on either variable may be influenced by an earlier score on the other variable. Each variable may have differing degrees of short-term consistency (the score is dependent on the immediately prior score), or long-term consistency (the score is also dependent on an enduring individual constant). The program computes and graphs autocorrelations and cross-correlations as a function of measurement lag.

C. Technical Details of Two-Variable Model ....................... 18
     Short-term consistency ....................................... 18
     Long-term consistency ........................................ 21
     Influence of one variable on the other ....................... 24
     Distributed influence ........................................ 25

     Abstract: Recursion equations are given for simulating x and y variables. Short-term consistency is controlled by rho coefficients (ρ_x and ρ_y respectively) multiplying the prior value of the same variable. Long-term consistency is introduced by individual zeta constants (ζ_ix and ζ_iy respectively). Causal influence of either variable on the other may be introduced through a coefficient c multiplying a prior value of the other variable. The causal influence may be exerted within a single causal interval or distributed over several intervals.

D. Autocorrelations .............................................. 26
     Effects of short-term consistency (Figure 1.6) ............... 26
     Effects of long-term consistency (Figure 1.7) ................ 28
     Effects of causal influence by x on y (Figure 1.8) ........... 31
     Effect of x influence on y autocorrelation given long-term
       consistency (Figure 1.9) ................................... 33

     Abstract: The simulated autocorrelations corresponded closely to the theoretical expectations. In the absence of long-term consistency, autocorrelations declined slowly or rapidly toward the zero asymptote depending on the magnitude of the rho coefficient. When long-term consistency was introduced, autocorrelations declined as expected toward a non-zero asymptote. When x was allowed to influence y, the consistency of the latter was increased (autocorrelograms were higher). These observed results corresponded to mathematical derivations described in technical appendices.

E. Cross-Correlations ............................................ 35
     Effects of variation in short-term consistency
       (Figures 1.10 and 1.11) .................................... 35
     Implications for cross-lagged differential ................... 38
     Delay in maximum (Figure 1.12) ............................... 40
     Effects of varying the amount of causal influence by x on y
       (Figure 1.13) .............................................. 42
     Effects of introducing long-term consistency
       (Figures 1.14 and 1.15) .................................... 45
     Effect of marked inequality in short-term consistency of x
       and y (Figure 1.16) ........................................ 48
     Effect of marked inequality in long-term consistency of x
       and y (Figure 1.17) ........................................ 50
     Effect of introducing correlation between zetas for x and y
       (Figures 1.18 and 1.19) .................................... 52
     Effects of distributing the causal influence (Figure 1.20) ... 56

     Abstract: Data were shown on cross-correlations between simulated variables x and y, where x was given a unidirectional influence on y. The greater the short-term consistency in x and y, the higher the correlogram, and the stronger the cross-lagged differential. Other effects were greater span of the correlogram, and delay in occurrence of its maximum value. As the causal influence of x on y was made stronger, the correlogram became higher, but neither its span nor its delay in maximum was affected. As long-term consistency was increased the correlogram became higher but flatter, so that the cross-lagged differential was obscured. Succeeding sections investigated the effects of various modifications. Generally it was observed that in the presence of high short-term consistency the cross-lagged differential appeared under various conditions, but it was weakened by the presence of long-term consistency, especially in the causal variable.

F. Reciprocal Influence .......................................... 59
     Effect of reciprocal influence on autocorrelations
       (Figure 1.21) .............................................. 59
     Effect of reciprocal influence on cross-correlations
       (Figure 1.22) .............................................. 61
     Differing magnitude of influence (Figure 1.23) ............... 63
     Effect of long-term consistency (Figures 1.24 and 1.25) ...... 64

     Abstract: In the preceding sections, x was allowed a unidirectional influence on y. In the present section reciprocal or bidirectional influences were introduced; caution was needed since high short-term consistency resulted in unstable variances. Given reciprocal congruent influences or positive feedback (x →(+) y →(+) x), the cross-correlogram showed two positive peaks as expected, one on either side of zero lag. Given a positive and a negative influence or a negative feedback loop (x →(+) y →(-) x), the correlogram showed one positive and one negative peak as expected. When long-term consistency was introduced, these shapes were preserved but became much flatter.

G. In Conclusion ................................................. 69

     Abstract: In reply to the first large question (whether known causal connections will give rise to a cross-lagged differential), the answer thus far with simulated data has been affirmative, provided that each variable has at least moderate short-term consistency, and that long-term consistency is low in at least the causal variable. Still unsettled, and the subject for future investigation, is the reverse question of whether the observation of difference in empirical cross-lagged correlations will permit inferences about underlying causal connections.

APPENDICES ....................................................... 73

*For the meaning of special terms, see Glossary, pp. vi ff.
GLOSSARY OF TERMS AND SYMBOLS

Term:    Score
Symbol:  x_it, y_it
Meaning: Values on simulated variables x and y respectively, for individual i at time t.

Term:    Cycle
Meaning: Single operation of computer program creating a set of scores for N individuals at a given time. Successive cycles are equivalent to successive time units.

Term:    Lag
Meaning: Interval of time between successive measurements of any variable, expressed in cycles; also called measurement interval.

Term:    Autocorrelation
Symbol:  r(x_t, x_{t+k}), r(y_t, y_{t+k})
Meaning: Correlation between a set of scores for N individuals on one variable at time t and scores for the same individuals on the same variable at a later time t+k, where k = 1, 2, 3, ....

Term:    Autocorrelogram
Meaning: Graph of the autocorrelation, plotted as function of k.

Term:    Cross-correlation
Symbol:  r(x_t, y_{t+k})
Meaning: Correlation between a set of scores for N individuals on one variable at time t and those on another variable at time t+k, where k = ..., -3, -2, -1, 0, 1, 2, 3, .... A negative k indicates that y is measured before x; positive k that x is measured before y.

Term:    Cross-correlogram
Meaning: Graph of the cross-correlation, plotted as a function of k.

Term:    Cross-lagged correlation
Meaning: Any cross-correlation between two variables measured at different times, i.e., k ≠ 0.

Term:    Cross-lagged differential (= difference in cross-lagged correlations)
Symbol:  r(x_t, y_{t+k}) - r(y_t, x_{t+k})
Meaning: Difference between two cross-correlations, in one of which x precedes y by k time units, and in the other y precedes x by the same lag. In a cross-correlogram, the differential appears as a difference in height at equal distances on either side of zero lag, k = 0.

Term:    Short-term consistency
Meaning: Characteristic of a variable such that scores at time t+1 are dependent on immediately prior scores, the degree of dependence being governed by a rho coefficient ρ_x or ρ_y.

Term:    Rho: coefficient of short-term consistency
Symbol:  ρ_x, ρ_y
Meaning: See above.

Term:    Long-term consistency
Meaning: Characteristic of a variable such that scores for each individual fluctuate through time around an individual zeta constant, ζ_ix or ζ_iy.

Term:    Zeta: constant of long-term consistency
Symbol:  ζ_ix, ζ_iy
Meaning: See above.

Term:    Causal influence
Symbol:  x → y, y → x
Meaning: One variable is said to have a causal influence on another if scores on the latter at any time t are dependent on scores on the former at a prior time t-g, where the interval g is called the causal interval.

Term:    Causal interval
Symbol:  g
Meaning: See above. Causal influence can be distributed over several intervals, i.e., y_t can depend not only on x_{t-g} but also on x_{t-g+1}, x_{t-g+2}, ..., x_{t-1}.

Term:    Coefficient of causal influence
Symbol:  c_xy, c_yx
Meaning: Coefficient governing the extent to which one variable is influenced by the prior scores on the other variable.

Term:    Congruent influence
Symbol:  x →(+) y, y →(+) x
Meaning: As one variable increases, the other increases.

Term:    Incongruent influence
Symbol:  x →(-) y, y →(-) x
Meaning: As one variable increases, the other decreases.

Term:    Causal connection
Meaning: Any causal influence between x and y in either direction, whether congruent or incongruent.

Term:    Span
Meaning: In a cross-correlogram, the distance between that point on the k axis at which the correlogram begins to rise from an essentially horizontal slope, to that point at which the correlogram again becomes essentially horizontal. Where periodic fluctuations occur, the span may be indeterminate.

Term:    Delay in maximum
Meaning: The distance, in k units, between that point where k = g and that point where the cross-correlogram reaches its maximum height.

Term:    Asymptote
Meaning: Height of an autocorrelogram or a cross-correlogram after its slope has become essentially horizontal.

Term:    Causal priority
Meaning: An inference that the hypothesis x → y is more tenable than the hypothesis y → x. The question of whether such inferences can be drawn lies beyond the scope of the present report.
SURVEY RESEARCH CENTER
The University of Michigan
Ann Arbor, Michigan

Causal Analysis Project
Interim Report No. 1
December, 1968

CORRELATIONAL PROPERTIES OF SIMULATED PANEL DATA
WITH CAUSAL CONNECTIONS BETWEEN TWO VARIABLES*

Donald C. Pelz
assisted by Spyros Magliveras,
with technical appendices by Robert A. Lew

A. Introduction
If one has cross-section survey data (a variety of measures obtained on a population of individuals at one point in time) and observes a correlation between two variables x and y, it is seldom possible to distinguish by statistical analysis between two plausible hypotheses: "x is causally prior to y" (x → y), and "y is causally prior to x" (y → x).*
*Conducted under a grant from the National Science Foundation, GS-1873, with supplementary aid from the National Broadcasting Company. Discussions began early in 1967 involving the author, Spyros Magliveras as programmer, and Graham Kalton, visiting lecturer in sociology and sampling statistics from the London School of Economics, who gave fruitful guidance in structuring the simulated model and expressing some of its properties from time series theory. Robert Lew has pushed forward the derivation of mathematical properties of the model. George Gluski assisted Lew in revising and expanding the computer program. Cyrus Ulberg developed a program for examining lagged correlations for actual panel data.
However, if one has panel data (measurements on the same individuals at two or more times), a statistical procedure called "differential in cross-lagged correlations" has been proposed as a means of deciding which of the two hypotheses is more tenable (Pelz and Andrews, 1964; Campbell and Stanley, 1963; Campbell, 1964).

Imagine a population of persons characterized by variables x and y, where the x and y scores for each individual change somewhat from one time to the next, but do not change radically or erratically. (We shall call such variables "moderately consistent through time.") With such variables the autocorrelation between x at one time and x at a later time will be high for adjacent measurements, and will decline as the time between measurements increases.
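The decline in autocorrelation with increasing lag can be illustrated with a minimal Python sketch. It assumes, as a stand-in for the report's own recursion equations (given later, in Section C), that each score is a coefficient `rho` times the individual's prior score plus fresh normal noise. The names `pearson` and `simulate_consistent`, and the values N = 2000 and rho = 0.8, are illustrative assumptions and are not taken from the report's program.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def simulate_consistent(n_people=2000, n_cycles=12, rho=0.8, seed=1):
    """Return scores[t][i]; each score is rho * prior score + fresh noise.

    Assumed recursion, chosen only to make a variable "moderately
    consistent through time"; it is not the report's exact model.
    """
    rng = random.Random(seed)
    scores = [[rng.gauss(0, 1) for _ in range(n_people)]]
    noise_sd = (1 - rho ** 2) ** 0.5  # keeps the variance close to 1
    for _ in range(n_cycles - 1):
        prev = scores[-1]
        scores.append([rho * v + rng.gauss(0, noise_sd) for v in prev])
    return scores


x = simulate_consistent()
# Autocorrelation r(x_0, x_k) for lags k = 1 .. 5
autocorr = [pearson(x[0], x[k]) for k in range(1, 6)]
for k, r in enumerate(autocorr, start=1):
    print(f"lag {k}: r = {r:.2f}")
```

Under this assumed recursion the printed values should be high at lag 1 and fall off steadily at longer lags, which is the pattern the paragraph above describes.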
*Rozelle and Campbell (1969) have pointed out that one must consider at least four rival hypotheses, two asserting that one variable influences the other in a congruent fashion (as one increases, the other increases), and two based on incongruent influences (as one variable increases, the other decreases). Representing the two types of influence by "+" and "-" respectively we must consider the hypotheses: x →(+) y, x →(-) y, y →(+) x, y →(-) x. One may also hypothesize reciprocal connections, such as x → y → x, or common influence by a third variable w (w → x and w → y), etc. In the introductory discussion above, for the sake of simplicity let us assume only unidirectional congruent influences: x →(+) y and y →(+) x.
Imagine also that the value of x at time t tends to produce, in a linear fashion, a corresponding (congruent) value in variable y after a certain number of time units which we shall call the "causal interval," designated by the symbol g.* In such a system, x and y are measured for all individuals at time t and again at time t+k (k = time lag between measurements). From the four sets of scores, six correlation coefficients (or other measures of association) can be obtained as indicated by the following lines:

[Diagram: the four sets of scores x_t, y_t, x_{t+k}, y_{t+k}, with a line joining each of the six pairs.]
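As a small illustration of why four sets of scores yield six coefficients (every pair among four sets), the following Python sketch enumerates the pairs. The function names and the five-person data are hypothetical and made up purely for illustration; they are not from the report.

```python
from itertools import combinations


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def six_correlations(x_t, y_t, x_tk, y_tk):
    """Correlate every pair among the four score sets (4 choose 2 = 6)."""
    sets = {"x_t": x_t, "y_t": y_t, "x_t+k": x_tk, "y_t+k": y_tk}
    return {(a, b): pearson(sets[a], sets[b])
            for a, b in combinations(sets, 2)}


# Hypothetical scores for five individuals (illustration only).
x_t = [1.0, 2.0, 3.0, 4.0, 5.0]
y_t = [2.0, 1.0, 4.0, 3.0, 5.0]
x_tk = [1.5, 2.5, 2.0, 4.5, 4.0]
y_tk = [2.0, 4.0, 6.0, 8.0, 10.0]  # exactly 2 * x_t, so r(x_t, y_t+k) = 1

r = six_correlations(x_t, y_t, x_tk, y_tk)
for pair, value in r.items():
    print(pair, round(value, 2))
```

Of the six, two are autocorrelations (x_t with x_t+k, y_t with y_t+k), two are simultaneous correlations (x_t with y_t, x_t+k with y_t+k), and two are the cross-lagged correlations (x_t with y_t+k, y_t with x_t+k).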
*It is usually assumed that causation is instantaneous: the causal factor must be present in time and space when the effect occurs. In actual data, of course, the value of x may change some time before a corresponding change is noted in y. For example, motivation to act may arise weeks before action occurs. In this investigation we shall not review philosophical debates on the meaning of "causation." In this paper, causal influence will be used in the sense of functional dependence, as illustrated in the recursion equations for y as a function of prior values of x or vice versa; see below pages 24-25.
Suppose that we have chosen a measurement lag k which is close to the causal interval g needed for x to influence y. The following observations seem intuitively plausible. (a) A relatively strong correlation should be observed between x_t and y_{t+k}, since the causal influence is strongest precisely over this time interval. (b) We may still observe positive correlations between simultaneous scores: r(x_t, y_t) and r(x_{t+k}, y_{t+k}).* Since x and y are both reasonably consistent through time (that is, x_t will be positively correlated with x_{t+k}, and the same for y_t and y_{t+k}), the strong diagonal correlation will be reflected moderately in the correlations when x and y are simultaneous. (c) The weakest correlation should appear along the opposite diagonal, y_t and x_{t+k}, since the corresponding x and y values are each remote in time from those x and y values which are causally linked.
Campbell has independently suggested the same expectation: if variable X could be said to cause variable O, then "the 'effect' should correlate higher with a prior 'cause' than with a subsequent 'cause,' i.e., r(X_1, O_2) > r(X_2, O_1)." (Subscripts 1 and 2 here refer to first and second measurements respectively of X and O. See Campbell and Stanley, 1963, p. 238.)
*This notation for correlation coefficients will be used to avoid double subscripts. It is consistent with the notation adopted by Lew in the appendices.

The method has generated plausible outcomes with real data. Pelz and Andrews (1964) applied it to data on height and weight of growing boys and obtained results generally consistent with what had been expected.* The method also produced a consistent unidirectional ordering among 12 measures of consumer behavior and expectations from a survey panel. It proved possible to arrange these in a directed network of causal priorities in which only one inconsistency appeared out of 35 comparisons.
Two questions
However, the validity of the procedure as a basis for choosing among causal hypotheses is far from established. Two entirely separate questions must be considered.

I. Suppose we know in fact that variable x influences variable y in a congruent direction (x →(+) y); i.e., a change in the state of x for each individual is followed (with some margin of indeterminacy) by a corresponding change in the state of y. If in fact x →(+) y, will it hold true that a significant difference in the cross-lagged correlations will appear? Will r(x_t, y_{t+k}) > r(y_t, x_{t+k})?
Evidence to date suggests that such a result will often appear, provided certain conditions hold, such as that x and y are reasonably consistent through time (neither markedly stable nor unstable), that the causal influence is not instantaneous, that the interval of measurement is reasonably close to the interval of causation, that the influence of x on y is linear, etc.

*Because of the extreme consistency of the variables, these effects appeared only when partial correlations were used. That is, where H_1, W_1, H_2, W_2 represent height and weight on first and second measurements respectively, the partial correlation r(H_1, W_2 · W_1) exceeded r(W_1, H_2 · H_1) in three of six trials.
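Question I can be tried out in a minimal Python sketch. The recursion used here is an assumption modeled on the table-of-contents abstract for Section C (a rho coefficient multiplying the prior value of the same variable, plus a coefficient c multiplying the value of x taken g cycles earlier); the function names, the noise level, and the values rho = 0.8, c = 0.5, g = 5 are illustrative and are not the report's settings.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def simulate_xy(n_people=2000, n_cycles=40, rho=0.8, c=0.5, g=5, seed=7):
    """Return (x, y) as lists indexed [t][i]; y_t is pushed by x_(t-g).

    Assumed model: x_t = rho*x_(t-1) + e,  y_t = rho*y_(t-1) + c*x_(t-g) + u.
    """
    rng = random.Random(seed)
    x = [[rng.gauss(0, 1) for _ in range(n_people)]]
    y = [[rng.gauss(0, 1) for _ in range(n_people)]]
    for t in range(1, n_cycles):
        x.append([rho * v + rng.gauss(0, 0.6) for v in x[t - 1]])
        # No causal push until g cycles of x exist.
        push = x[t - g] if t >= g else [0.0] * n_people
        y.append([rho * v + c * p + rng.gauss(0, 0.6)
                  for v, p in zip(y[t - 1], push)])
    return x, y


x, y = simulate_xy()
t, k = 30, 5                     # measure after a settling-in period; lag k = g
r_xy = pearson(x[t], y[t + k])   # x measured before y
r_yx = pearson(y[t], x[t + k])   # y measured before x
differential = r_xy - r_yx
print(f"r(x_t, y_t+k) = {r_xy:.2f}, r(y_t, x_t+k) = {r_yx:.2f}")
```

With these assumed settings the first correlation should clearly exceed the second. Changing `rho`, `c`, or the measurement lag `k` away from `g` gives a rough feel for the conditions listed in the paragraph above.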
II. Even if the answer to the above is affirmative, we must ask the second question: can we reason in the reverse direction? That is, if we observe empirically a substantial difference in the cross-lagged correlations, i.e., if r(x_t, y_{t+k}) > r(y_t, x_{t+k}), can we infer that x →(+) y?

Answer: by no means. There may be several other models (perhaps an infinite number) of causal connection between these variables which might equally well generate the observed differential in correlations. One hypothetical illustration is given below, pp. 10-12. How we are to distinguish among these alternatives remains a major problem.

Answering these two questions will not be easy. This report is confined to question I. Given known causal connections among certain variables, what differences in cross-lagged correlations (and what other correlational properties) can be expected?
Intuitive expectations

Suppose we have generated simulated variables x_t and y_t (the subscript t standing for successive time units), where each variable is moderately consistent through time, and we have let y be influenced congruently by an earlier value of x, say x_{t-5}. Now, correlations are obtained between x_t and y_{t+k}, for each k = 0, 1, 2, .... What should we observe? It seemed reasonable to expect a cross-correlogram somewhat like that pictured in the upper half of Figure 1.1.
(The illustrations given below were initial intuitive conjectures; actual results will be given later.)

[Figure 1.1: cross-correlation r(x_t, y_{t+k}), on a scale from -1.00 to +1.00, plotted against k (interval of measurement or lag).]

Figure 1.1. Expected shape of cross-correlograms between x and y where interval of measurement varies and causal interval = 5. The upper and lower curves should be produced by congruent and incongruent influences respectively of x on y.
Thus, if we measure y long before x (k = -25), or we measure x long before y (k = +25), we should observe approximately a zero cross-correlation. The measures are too remote in time for them to reflect the influence of x on y within five time units. However, if we measure x just five units before y (k = +5 in Figure 1.1), a clearly positive xy correlation should appear. As the measurement interval k departs from this causal interval, the correlation between x and y should become smaller and smaller. It will not drop immediately to zero, however, because x and y are each moderately autocorrelated. Hence a correlation between x at one time and y five units later (say) will also generate smaller ones between x and y measured at neighboring times, etc.
What if x influences y in a negative or incongruent fashion: the higher the x, the lower the y? The cross-correlogram should be mirrored upside down, as in the lower curve of Figure 1.1. As the measurement interval between x and y gets closer to the causal interval of 5, an increasingly negative correlation between x and y should appear, reaching a maximum at k = +5.

In the same way, if y is allowed to influence x with a causal interval of (say) 5 time units, then we should observe a maximum xy correlation when the measurement of y precedes that of x by 5 units, i.e., when k = -5.
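The conjectured shape of the cross-correlogram can be checked against a minimal Python sketch. It assumes the same simple recursion suggested by the table-of-contents abstract for Section C (own prior value times rho, plus c times the x score g cycles earlier); all names and parameter values are illustrative assumptions, not the report's program. Note that under this assumed recursion the maximum need not fall exactly at k = g: because y carries its own short-term consistency, the peak may land a few units later, which matches the "delay in maximum" mentioned in the abstract for Section E.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def simulate_xy(n_people=2000, n_cycles=60, rho=0.8, c=0.5, g=5, seed=7):
    """Assumed model: x_t = rho*x_(t-1) + e,  y_t = rho*y_(t-1) + c*x_(t-g) + u."""
    rng = random.Random(seed)
    x = [[rng.gauss(0, 1) for _ in range(n_people)]]
    y = [[rng.gauss(0, 1) for _ in range(n_people)]]
    for t in range(1, n_cycles):
        x.append([rho * v + rng.gauss(0, 0.6) for v in x[t - 1]])
        push = x[t - g] if t >= g else [0.0] * n_people
        y.append([rho * v + c * p + rng.gauss(0, 0.6)
                  for v, p in zip(y[t - 1], push)])
    return x, y


def cross_correlogram(x, y, t, max_lag):
    """r(x_t, y_(t+k)) for k = -max_lag .. +max_lag at one reference time t."""
    return {k: pearson(x[t], y[t + k]) for k in range(-max_lag, max_lag + 1)}


x, y = simulate_xy()
gram = cross_correlogram(x, y, t=40, max_lag=15)
peak_k = max(gram, key=gram.get)   # lag at which the correlogram is highest
for k in sorted(gram):
    print(f"k = {k:+3d}  r = {gram[k]:+.2f}")
print("peak at k =", peak_k)
```

Setting `c` to a negative value in `simulate_xy` gives the incongruent case, for which the printed curve should be roughly mirrored upside down.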
What about a bidirectional or reciprocal influence of x and y on each other? Suppose x →(+) y with a causal interval of 5 time periods, and y →(+) x with a causal interval of 8 units. We should then observe, as shown in the solid curve in Figure 1.2, a double-humped curve with one peak at k = -8 (reflecting the y →(+) x influence), and another peak at k = +5 (reflecting the x →(+) y influence).
[Figure 1.2: cross-correlation r(x_t, y_{t+k}), on a scale from -1.00 to +1.00, plotted against k (lag) from -15 to +15; solid and dashed curves.]

Figure 1.2. Expected shape of cross-correlograms given reciprocal influence of x on y (with causal interval of 5) and y on x (with causal interval of 8).
If one of the influences is congruent and the other incongruent (say, x →(+) y but y →(-) x), a cross-correlogram such as the dashed curve in Figure 1.2 is to be expected.

In both of these figures we expected the correlations to approach zero at either extreme, i.e., when x and y are measured far apart. There seems no reason to expect any remote influence of x on y. This value at either extreme may be called the "no-cause asymptote."

Might the no-cause asymptote be other than zero? Of course. If x and y were both influenced by a third variable which generated a prevailing positive or negative correlation between them, we would expect a definite xy correlation to appear even between remote measurements. Two possible correlograms with non-zero asymptotes are illustrated in Figure 1.3. Note in the lower curve that a negative influence of y on x (y →(-) x) should produce a minimum correlation between the two measurements when y precedes x by the causal interval, although all correlations could remain positive.
[Figure 1.3: cross-correlation r(x_t, y_{t+k}) plotted against k (lag) from -15 to +15.]

Figure 1.3. Expected correlograms when, because of common third cause w, a prevailing xy correlation is generated; no-cause asymptote departs from zero.
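A non-zero no-cause asymptote can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the third variable w is an enduring constant for each individual and that x and y are each w plus independent noise at every cycle, with no influence of x on y at all. The function names and parameter values are hypothetical.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def simulate_common_cause(n_people=2000, n_cycles=30, seed=3):
    """x and y share only an enduring third variable w (assumed constant per person)."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(n_people)]
    x = [[wi + rng.gauss(0, 1) for wi in w] for _ in range(n_cycles)]
    y = [[wi + rng.gauss(0, 1) for wi in w] for _ in range(n_cycles)]
    return x, y


x, y = simulate_common_cause()
r_near = pearson(x[0], y[1])    # measurements one cycle apart
r_far = pearson(x[0], y[25])    # measurements 25 cycles apart
print(f"near: {r_near:.2f}   far: {r_far:.2f}")
```

In this sketch the xy correlation stays at about the same positive level however far apart the measurements are, so the correlogram is flat at a non-zero height rather than falling to zero at the extremes.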
Multiple models for the same empirical data

Suppose, in some empirical panel data with x and y measured 5 time units apart, we observed that the cross-lagged correlation was low when y preceded x (k = -5), high when x preceded y (k = +5), and intermediate when the two were measured simultaneously (k = 0). These results correspond to the three circles in Figure 1.4. Could we then infer that x →(+) y? By no means. Three hypothetical models which equally well fit these observed correlations are illustrated in Figure 1.4.
[Figure 1.4: cross-correlation r(x_t, y_{t+k}) plotted against k (lag) from -15 to +15; three circles mark the empirical correlations, with curves labeled A, B, and C.]

Figure 1.4. Empirical correlations represented by the three circles might fit several different models, properties of which are described in text.
Correlogram A might arise if x affected y positively (x →(+) y) with a causal lag of 4. Curve B could arise if y →(-) x with a causal lag of 4, and the presence of third factors introduced a positive no-cause asymptote. Still another possible pattern is illustrated by curve C in which x and y both influence the other positively, but with a different causal interval. The empirical data, based on two measurements only, might equally well fit three different models.

To be sure, additional measurements at three or more points of time would help to distinguish among these models. Thus, if we also measured at k = +10, we might be able to prefer one of these three models over the other two. (Even so we might not be able to discriminate between minor variations of that model differing, e.g., in causal interval.)

The reader is reminded that the above discussion is based purely on conjecture. It seemed reasonable to expect such cross-correlograms, given the causal influences indicated. As will be seen later, results with simulated data have generally corroborated these intuitions, but some unexpected departures have also appeared. It is hoped that the latter can be accounted for in terms of mathematical properties of the systems generated (see technical appendices by Robert Lew).
B. General Characteristics of Two-Variable Model

In order to explore the first of the two large questions (what correlational properties will follow from known causal connections), my colleagues and I have begun generating simulated data by means of a computer (currently the IBM 360/67).

For a population of N hypothetical individuals, normally distributed* variables x_it and y_it are created as described below, and are allowed to operate through successive periods of "time," each time unit represented by a cycle or single operation of the computer program. Each variable is given a specified consistency over time, by means of introducing a small to large normally distributed random error term (with mean = 0) at successive steps.

*More precisely, normal distributions are created at time = 0; since subsequent "error" terms are also normal, the resultant distributions probably remain normal. The property of normality is convenient but not necessary. It nowhere appears in the mathematical derivations in the appendices.
After the two variables have gone through several cycles, a unidirectional influence of x on y is introduced with a specified causal interval, such that an individual's y score at a particular time ($y_{it}$) is influenced by his x score at a specified prior time ($x_{i,t-g}$). This influence can either be positive (congruent) or negative. Bidirectional or reciprocal influences can also be created, but for now only the unidirectional situation will be discussed.
After allowing the causal system to become established during an initial period (such as 20 cycles), we allow the system to operate through 50 more cycles during which the program computes the correlational properties of the resulting data. The program computes, first, the autocorrelation of each variable, e.g., $r(x_t, x_{t+k})$, for each t and each lag from k = 1, 2, ... 25 cycles. The program prints a correlogram showing how the average* of these autocorrelations varies as the lag increases.

The program also computes cross-correlations, i.e., $r(x_t, y_{t+k})$, where the interval between the two measurements can vary from k = -25 (y is measured 25 cycles before x) to k = +25 (x is measured 25 cycles before y). Again, an average* of these lagged cross-correlations is computed, and a graph of the resulting cross-lagged correlogram is printed.

*In the course of 50 cycles there are 49 possibilities for an autocorrelation with lag of 1 ($x_t$ and $x_{t+1}$), 48 instances for autocorrelation with lag of 2, etc. Appendix D describes how the program selects a subset of 25 examples of each lag in order to obtain an average correlation for each lag from k = 1 to k = 25.

Two sources of consistency
The next sections will describe some properties of the simulated variables which are created by the computer program. One property is that each individual has some degree of consistency in x and y over time. Two distinct sources of consistency can be recognized.

One will be called short-term or cycle-to-cycle consistency, reflecting the fact that a person is not likely to change sharply from one time to the next. The autocorrelation between adjacent measurements will be high. In the recursion equation for generating x, we let x at time t ($x_{it}$) depend linearly upon the individual's immediately prior value ($x_{i,t-1}$) and a random error which can either be small (high cycle-to-cycle consistency) or large (low cycle-to-cycle consistency).
Under such a system, the autocorrelation will drop closer and closer to zero as the interval between measurements increases. An individual who started high on x could after 50 time units end up at the opposite extreme, and vice versa.

But in real life this does not often happen; there is usually some long-term consistency due to stable personality factors or the social environment. Thus in panel studies of political behavior and attitudes, it is not uncommon to observe that the autocorrelation over a long interval is almost the same as the autocorrelation over a short interval.

*See Appendix D for method of selecting 25 examples of each lag for the purpose of averaging.
To achieve such long-term consistency we have assigned each individual a stable underlying tendency or constant (we call it a zeta value) such that the mean zeta among individuals is zero, and each individual's scores vary from cycle to cycle around his zeta value.

Figures 1.5a through d show how x scores of two hypothetical individuals might vary over time, under differing combinations of short-term and long-term consistency.

Figures 1.5a through d here

The greater the differences among individual zeta values (i.e., the larger the zeta variance relative to total variance), the more long-term consistency is introduced. In our program, the distribution of zetas is made normal, and the variance of zetas can be altered from large (for high long-term consistency) to zero (for no long-term consistency).

Both short-term and long-term components for x can be varied independently, of course, from those for y.
Types of variables

The two sources of consistency described above have been used, in various combinations, to generate many types of variables. Some of these types are illustrated in Figure 1.5e.

Figure 1.5e here
Figure 1.5a to d. Schematic representation of x scores for two individuals over time. Panels: (a) high short-term consistency, no long-term consistency; (b) high short-term consistency, high long-term consistency; (c) low short-term consistency, no long-term consistency; (d) low short-term consistency, high long-term consistency. Horizontal axis in each panel: time. Dashed line corresponds to zeta constant for each individual (both of these = 0 in Figures a and c).
Figure 1.5e. Depending on magnitude of short-term and long-term consistency in a variable, its autocorrelation as measurement interval (k) increases can be made to vary in shape. Curves I - VI are described in text. (Vertical axis: autocorrelation $r(x_t, x_{t+k})$, from .00 to 1.00; horizontal axis: lag k.)
Curves I and II represent differing degrees of short-term or cycle-to-cycle consistency, with no long-term effects. Given low short-term consistency as in curve I, the autocorrelation is visible only over a few time units. Even when the short-term consistency is high (as in curve II), the autocorrelation eventually decays to zero if a sufficiently long interval between measurements is allowed.

The remaining curves illustrate autocorrelation when the long-term consistency is moderate (III and IV) or high (V and VI). Even with a long time between successive measurements, the autocorrelation remains positive. Broken and solid curves in each set show what happens when short-term consistency is low or high respectively.
How will variation in these characteristics affect the cross-correlograms between x and y? Intuitively it seemed likely that given low consistency from either source (as in curve I) the cross-correlograms suggested in Figures 1.1 to 1.4 above would rise and fall sharply. Given high consistency, either short-term (as in curve III) or long-term (curve V), the cross-correlograms should rise and fall much more gradually. Some empirical results given below (pp. 35-47) generally supported these expectations, but also revealed important differences in the effect of short-term and long-term consistency.
C. Technical Details of Two-Variable Model

In this section only, it will be desirable to use a notation somewhat more complex than that used previously. It will be consistent with the notation in the technical appendices.
Short-term consistency

Consider first the situation in which there is no long-term consistency, and no causal influence of either variable on the other. The discussion will be in terms of the x variable; it will apply equally to y.

For a population of N individuals, a score $x_{i0}$ for each individual i at time t = 0 is assigned by random selection from a normally distributed population of x values with mean = 0 and variance $\sigma_x^2$ specified as desired. Each individual's expected value thus is 0. (The assumption of normality is convenient but is not necessary for the mathematical derivations in the appendices.)
At each successive time t = 1, 2, 3, ..., a value $x_{it}$ is generated by the following recursion equation:

$$x_{it} = \rho_x\, x_{i,t-1} + e_{xt} \qquad \ldots (1)$$

where:

$\rho_x$ is a number between 0 and 1, constant over t and i;

$e_{xt}$ is a random normally distributed error term with mean = 0 and variance (of specified magnitude) constant over t and i; values of $e_{xt}$ are independent across individuals and across time.
The x's generated in this way represent one of the simplest autoregressive time series, the Markov series (see Kendall and Stuart, The Advanced Theory of Statistics, 1966, Vol. 3, pp. 405 ff.). The individual's score $x_{it}$ at any time is a linear combination of a certain fraction ($\rho_x$) of his immediately prior x score $x_{i,t-1}$, and a random error term $e_{xt}$ which can be interpreted as the effect of "unknown other variables." It is assumed that $\rho_x$ is the same for all individuals (although one can imagine a situation in which some individuals are linked more closely from one time to the next than are other individuals).
For our purposes it is desirable that the mean and variance of x be independent of time; they should not change radically or systematically with time. (In real data one is often confronted with means and variances that systematically rise, fall, or fluctuate over time. For our present model such effects would be inconvenient, although possibly they could be incorporated in future versions.)

*Technically $\rho_x$ could exceed 1, but then the variance of $x_{it}$ cannot be controlled; it increases without bound as t increases.
If the variance of $x_t$ is to remain independent of time, and if the variance of $e_{xt}$ is assumed fixed over time, then from expression (1) it must hold true for any specified time that:

$$\mathrm{Var}(x_t) = \rho_x^2\,\mathrm{Var}(x_t) + \mathrm{Var}(e_{xt}) \qquad \ldots (2)$$

Hence the values of $\rho_x$, $\mathrm{Var}(x_t)$, and $\mathrm{Var}(e_{xt})$ are interdependent:

$$\rho_x = \sqrt{1 - \frac{\mathrm{Var}(e_{xt})}{\mathrm{Var}(x_t)}} \qquad \ldots (3)$$

and

$$\mathrm{Var}(e_{xt}) = (1 - \rho_x^2)\,\mathrm{Var}(x_t) \qquad \ldots (4)$$

Once the values of $\rho_x$ and of $\mathrm{Var}(x_t)$ are specified, the value of $\mathrm{Var}(e_{xt})$ is fixed.
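As a rough illustration of expressions (1) and (4), the following Python sketch generates a small panel and checks that the variance of x stays near its specified value. It is not the original IBM 360/67 program; the function names, the seed, and the choices N = 2000, 30 cycles, rho = .70 and Var(x) = 20 are assumptions made only for this example.

```python
import random

def simulate_markov_panel(n_individuals, n_cycles, rho, var_x, seed=0):
    """Return panel[i][t] generated by x_it = rho * x_i,t-1 + e_xt (expression 1)."""
    rng = random.Random(seed)
    sd_x = var_x ** 0.5
    # Expression (4): the error variance that keeps Var(x) constant over time.
    sd_e = ((1.0 - rho ** 2) * var_x) ** 0.5
    panel = []
    for _ in range(n_individuals):
        x = rng.gauss(0.0, sd_x)              # score at t = 0
        series = [x]
        for _ in range(n_cycles):
            x = rho * x + rng.gauss(0.0, sd_e)
            series.append(x)
        panel.append(series)
    return panel

def variance_at(panel, t):
    """Sample variance across individuals at time t."""
    values = [row[t] for row in panel]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

panel = simulate_markov_panel(n_individuals=2000, n_cycles=30, rho=0.70, var_x=20.0)
print(round(variance_at(panel, 0), 1), round(variance_at(panel, 30), 1))
```

Both printed variances should lie close to 20, apart from sampling error; choosing an error variance that does not satisfy (4) would make the variance drift toward a different level.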
The reader will note that expression (3) is one form of the expression for a correlation coefficient. And in fact, $\rho_x$ is the theoretical autocorrelation between adjacent values of the Markov series $x_t$. This rho coefficient governs the short-term consistency of the x variable. The larger it is, the more closely succeeding values of $x_{it}$ are governed by the immediately prior value.
Theoretical expectation for autocorrelation. As the lag k between successive sets of $x_t$ increases, it is known (see Kendall and Stuart, op. cit., p. 405) that the theoretical autocorrelation between $x_t$ and $x_{t+k}$ will be simply $\rho_x$ raised to the k-th power:

$$\rho(x_t, x_{t+k}) = \rho_x^{k} \qquad \ldots (5)$$

(In the left-hand term $\rho$ is used instead of the usual r to represent the theoretical rather than empirical autocorrelation. For derivation, see Appendix A.1 (7) and proof of (7).)
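A brief sketch of why the lag-k autocorrelation takes this form may be helpful. It uses only expression (1) and the constant-variance condition; it is the standard textbook argument and is not offered as a reproduction of Appendix A.1. Applying (1) repeatedly k times, and noting that every error term occurring after time t is independent of $x_{it}$:

```latex
\begin{align*}
x_{i,t+k} &= \rho_x^{k}\, x_{it} + \sum_{j=0}^{k-1} \rho_x^{j}\, e_{x,t+k-j} \\
\mathrm{Cov}(x_t, x_{t+k}) &= \rho_x^{k}\, \mathrm{Var}(x_t) \\
\rho(x_t, x_{t+k}) &= \frac{\rho_x^{k}\, \mathrm{Var}(x_t)}
                          {\sqrt{\mathrm{Var}(x_t)\,\mathrm{Var}(x_{t+k})}} = \rho_x^{k}
\end{align*}
```

The last step relies on $\mathrm{Var}(x_{t+k}) = \mathrm{Var}(x_t)$, which holds when the error variance is chosen to keep the variance of x constant over time.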
In Figure 1.6 below the reader will be able to compare theoretical autocorrelations with those obtained from simulated data.

Long-term consistency

To introduce long-term consistency, each individual is assigned an expected value not of zero but of an individual constant zeta ($\zeta_{ix}$). That is, his scores over time can be conceived as deviating around his individual zeta. We shall designate the new series of x values for each individual as $x'_{it}$, where the expected value $E(x'_{it}) = \zeta_{ix}$. Similar statements can be made for the y variable.
In such a time series, each individual i at time t = 0 is assigned two randomly selected values: an individual constant $\zeta_{ix}$ from a normal* distribution of zetas with $E(\zeta_x) = 0$ and $\mathrm{Var}(\zeta_x)$ specified; and an initial deviation value $x_{i0}$ from a normal distribution having $E(x_0) = 0$ and $\mathrm{Var}(x_0)$ specified. His initial score is the sum of these:

$$x'_{i0} = x_{i0} + \zeta_{ix} \qquad \ldots (6)$$

Successive values of $x'_{it}$ are then generated by the recursion equation:

$$x'_{it} = \rho_x\, x'_{i,t-1} + \tau_x\, \zeta_{ix} + e_{xt} \qquad \ldots (7)$$

Also:

$$y'_{it} = \rho_y\, y'_{i,t-1} + \tau_y\, \zeta_{iy} + e_{yt}$$

(The tau coefficient $\tau_x$ or $\tau_y$ will be discussed shortly.)

*Again the assumption of normality is convenient but not necessary for mathematical derivation.
The variance of $\zeta_x$ and of $x_0$ can be set independently at any desired values. For convenience we have allowed the sum of these two to equal an arbitrary total (initial) variance.* In the simulated data shown later, we have set $\mathrm{Var}(\zeta_x) + \mathrm{Var}(x_0) = 20$.

*In the later sections, this total initial variance is represented by a simple notation: Var(x). In the notation of the present section it would be: $\mathrm{Var}(x'_0)$.

This procedure assumes that an individual's zeta is fixed throughout time. In real life, of course, individuals are not so stable. Their long-term consistencies (due to personality, sociological conditions, etc.) might show a mild variation through time, as well as upward or downward trends. A multivariate model now being constructed will permit the first of these effects. The individual's x score can be influenced by some very stable (but not completely constant) third variable, and his y score similarly can be influenced by a very stable (but not constant) fourth variable. When long-term consistency is introduced through other variables, the $\zeta_{ix}$ and $\zeta_{iy}$ constants are not needed for this purpose and can be set at zero for all individuals.

Other complexities, however, such as systematically rising or falling means or variances, must be ignored for the present.

Where we generate non-zero variances of $\zeta_x$ and $\zeta_y$, it is necessary to know whether any correlation exists between the set of $\zeta$'s for x and y respectively. If a substantial correlation does exist, one would expect that even in the absence of causal connection between x and y, a prevailing correlation between them would appear (i.e., non-zero asymptote in the xy correlogram). In the computer program, therefore, one may specify what correlation between $\zeta_x$ and $\zeta_y$ is desired.
A note on the tau coefficient ($\tau_x$ or $\tau_y$). Although $\mathrm{Var}(\zeta_x)$ can be set independently of $\mathrm{Var}(x_0)$, there is an interdependency between moments of $\zeta_x$ and $x'_{it}$. In expression (7) let us specify what expected value is desired for each term, that is, its average over many time periods. Since $x'_{it}$ is intended to deviate around the individual constant $\zeta_{ix}$, the expected value desired is $E(x'_{it}) = \zeta_{ix}$. The expected value of $x'_{i,t-1}$ is of course identical. The expected value for the error term is by definition zero, and for each constant it is simply that constant. Substituting these expected values for the corresponding terms in expression (7) we have:

$$\zeta_{ix} = \rho_x\, \zeta_{ix} + \tau_x\, \zeta_{ix} + 0 \qquad \ldots (8)$$

Solving, we find that

$$\tau_x = (1 - \rho_x) \qquad \ldots (9)$$

Thus if the expected value for each individual $E(x'_{it})$ and hence the expected value for all individuals $E(x'_t)$ are to remain constant through time, the tau coefficient $\tau_x$ in (7) must be set = $1 - \rho_x$. Without this, the mean of $x'_t$ would not remain independent of time.
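The role of the tau coefficient can be checked with a few lines of Python that iterate only the expected values in expression (7), leaving out the random error (whose mean is zero). This is an illustrative sketch; the function name and the values rho = .90 and zeta = 3 are assumptions chosen for the example.

```python
def expected_path(zeta, rho, tau, n_cycles):
    """Iterate E(x'_t) = rho * E(x'_{t-1}) + tau * zeta, starting at E(x'_0) = zeta."""
    mean = zeta
    path = [mean]
    for _ in range(n_cycles):
        mean = rho * mean + tau * zeta
        path.append(mean)
    return path

rho, zeta = 0.90, 3.0
stable = expected_path(zeta, rho, tau=1.0 - rho, n_cycles=50)   # tau from expression (9)
drifting = expected_path(zeta, rho, tau=1.0, n_cycles=50)       # tau set arbitrarily to 1
print(stable[-1], drifting[-1])
```

With tau = 1 - rho the expected value stays at the individual's zeta, whereas with tau = 1 it climbs toward zeta / (1 - rho), so the mean would not be independent of time.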
Theoretical expectation for autocorrelation. Given a non-zero zeta variance, a theoretical expression for the autocorrelation as lag k increases can be shown to be the following (see Appendix B.1):

$$\rho(x'_t, x'_{t+k}) = \frac{\rho_x^{k}\,\mathrm{Var}(x_0) + \mathrm{Var}(\zeta_x)}{\mathrm{Var}(x_0) + \mathrm{Var}(\zeta_x)} \qquad \ldots (10)$$

Note that if $\mathrm{Var}(\zeta_x)$ is set at zero this expression reduces to (5). Also, as k increases, the theoretical autocorrelation approaches an asymptote in k as follows:

$$\rho(x'_t, x'_{t+k}) \rightarrow \frac{\mathrm{Var}(\zeta_x)}{\mathrm{Var}(x_0) + \mathrm{Var}(\zeta_x)} \qquad \ldots (11)$$

Thus with large measurement intervals the autocorrelation approaches an asymptote not of zero, as in the case where there is no zeta effect, but rather of the ratio between the zeta variance and total variance. This effect may be seen in Figure 1.7 below.
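Expression (10) is simple enough to tabulate directly. The short Python sketch below does so; the function name is hypothetical, and the split of a total variance of 20 into a deviation variance of 14 and a zeta variance of 6 (a ratio of .30) is chosen only as an example.

```python
def theoretical_autocorrelation(k, rho, var_deviation, var_zeta):
    """Expression (10): theoretical autocorrelation at lag k with a zeta component."""
    total = var_deviation + var_zeta
    return (rho ** k * var_deviation + var_zeta) / total

# Total variance 20, with zeta variance at .30 of the total.
for k in (0, 1, 5, 25):
    print(k, round(theoretical_autocorrelation(k, 0.90, 14.0, 6.0), 3))
```

Setting `var_zeta` to zero gives back rho to the k-th power, as in expression (5), and for large k the value approaches the ratio 6 / 20 = .30 given by expression (11).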
Influence of one variable on the other

After the $x'$ and $y'$ series have been created by expression (7) and allowed to operate for several cycles, we now allow $y'_{it}$ for each individual to be influenced by his x' at a specified earlier time ($x'_{i,t-g}$, where g is called the causal interval). Essentially we create another time series* which may be designated $Y'_{it}$.**

*It is possible, of course, to create another time series $X'_t$ in which y influences x, and thus to generate reciprocal influences. Some simulated data of this type are shown at the end of the report. Mathematical properties of such systems are formidable, however, and are not covered in the appendices.

**In subsequent sections, for convenience, the two variables however generated will simply be designated $x_t$ and $y_t$ respectively.
The relative weight exerted by $x'_{i,t-g}$ is governed by a coefficient $c_{xy}$ (read: causal influence of x on y), which may be set between 0 and ±1,* the sign determining whether the influence is to be congruent (if positive) or incongruent (if negative). The recursion equation for Y' then becomes:

$$Y'_{it} = \rho_y\, Y'_{i,t-1} + \tau_y\, \zeta_{iy} + c_{xy}\, x'_{i,t-g} + e_{yt} \qquad \ldots (12)$$

The size of the tau coefficient must be determined. The expected value of the x' term is zero so that in terms of expected values (12) reduces to (8), although other properties of the Y' series are not as simple. Pending further investigation we shall continue to set $\tau_y = (1 - \rho_y)$.

*There is no inherent necessity that $|c_{xy}|$ be < 1, but intuitively it does not make sense to say that one variable can influence another by an amount greater than itself.

Theoretical expectation for cross-correlation. For the condition of unidirectional influence, the theoretically expected cross-correlation is discussed in Appendix B.2.
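The recursion in expression (12) can be sketched in Python as follows. This is an illustration under stated assumptions and not the original program: the function names, the seed, and the rule that the causal term is switched on as soon as x at time t - g exists are all choices made for this example, and the coefficient c = .50 is larger than the values used in the report so that the effect is easy to see in a single run.

```python
import random

def pearson(a, b):
    """Product-moment correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def simulate_causal_panel(n, n_cycles, rho_x, rho_y, c_xy, g,
                          var_dev=20.0, var_zeta=0.0, seed=0):
    """Generate x' by expression (7) and Y' by expression (12) for n individuals."""
    rng = random.Random(seed)
    sd_dev, sd_zeta = var_dev ** 0.5, var_zeta ** 0.5
    sd_ex = ((1 - rho_x ** 2) * var_dev) ** 0.5
    sd_ey = ((1 - rho_y ** 2) * var_dev) ** 0.5
    xs, ys = [], []
    for _ in range(n):
        zeta_x, zeta_y = rng.gauss(0.0, sd_zeta), rng.gauss(0.0, sd_zeta)
        x = [zeta_x + rng.gauss(0.0, sd_dev)]
        y = [zeta_y + rng.gauss(0.0, sd_dev)]
        for t in range(1, n_cycles + 1):
            x.append(rho_x * x[t - 1] + (1 - rho_x) * zeta_x + rng.gauss(0.0, sd_ex))
            # Assumption: the causal term starts once x at time t - g is available.
            cause = c_xy * x[t - g] if t >= g else 0.0
            y.append(rho_y * y[t - 1] + (1 - rho_y) * zeta_y + cause
                     + rng.gauss(0.0, sd_ey))
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = simulate_causal_panel(n=1000, n_cycles=40, rho_x=0.70, rho_y=0.70,
                               c_xy=0.50, g=4)
t = 30
forward = pearson([r[t] for r in xs], [r[t + 4] for r in ys])    # x measured before y
backward = pearson([r[t] for r in ys], [r[t + 4] for r in xs])   # y measured before x
print(round(forward, 2), round(backward, 2))
```

In a run of this kind the correlation with x measured g cycles before y should be clearly larger than the correlation with y measured before x.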
Distributed influence

In recursion equation (12) it is assumed that the causal influence of x' on Y' occurs after precisely g time units. Another pattern is possible, and is incorporated in the computer program, namely that $Y'_{it}$ may be influenced not only by $x'_{i,t-g}$ but also by $x'_{i,t-1}, x'_{i,t-2}, \ldots, x'_{i,t-g+1}$.

Most of the output shown below assumes a causal influence over a fixed interval g, but toward the end of the paper some examples are shown in which the influence of x on y is distributed through several intervals.
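One way to picture a distributed influence is as a weighted sum over the g most recent prior values of x. The text does not state how the program weights the separate lags, so the `weights` argument in the sketch below is purely hypothetical, as is the function name.

```python
def distributed_influence(x_series, t, weights):
    """Sum of weights[j-1] * x at time t-j for j = 1..g, where g = len(weights)."""
    g = len(weights)
    if t < g:
        raise ValueError("need at least g prior values of x")
    return sum(w * x_series[t - j] for j, w in enumerate(weights, start=1))

x_series = [1.0, 2.0, 3.0, 4.0, 5.0]
spread = distributed_influence(x_series, t=4, weights=[0.1, 0.1, 0.1, 0.1])
fixed = distributed_influence(x_series, t=4, weights=[0.0, 0.0, 0.0, 0.2])
print(spread, fixed)
```

The fixed-interval case of expression (12) is recovered by placing the whole weight on lag g, as in the second call.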
D. Autocorrelations

This section and the one following will present a variety of correlograms produced by simulated variables possessing different properties of short-term and long-term consistency, where a unidirectional influence x → y was established. Section D will describe the autocorrelational results, and section E the cross-correlational.

Effects of short-term consistency

Let us start with looking at effects of varying the short-term consistency in variable x, with no long-term component.

To plot these curves, as in Figures 1.6 ff., only positive values of the time lag k need be shown, since the autocorrelograms are by definition symmetrical. For maximum use of space in the following charts, therefore, the k scale in the left half is the mirror image of that in the right half, permitting two separate sets of data to be shown on the same chart.

Figure 1.6 here
To generate the curves shown in Figure 1.6, the rho coefficients governing short-term consistency for variable x were set successively from a moderate value of $\rho_x = .70$ to an extremely high value of $\rho_x = .99$.

In the left side of the chart are plotted the theoretical values for the autocorrelation of x as lag k increases, according to expression (5), p. 21. With a moderate rho of .70, the theoretical autocorrelation declined to an asymptote of zero after 15 time intervals. The higher the rho the more slowly the autocorrelations dropped, but all of them were directed toward a zero asymptote if a sufficiently long interval of remeasurement were allowed.

Figure 1.6. Theoretical and simulated values of autocorrelation for x as rho (short-term consistency) increased from .70 to .99. Scale of time lag (k) for theoretical values at left is mirror image of that for simulated values at right. The latter on the average corresponded well to theoretical expectation except for extremely high rho.
The right side of the chart shows results from 2-3 simulated runs with each set of parameters. (For use in referring back to the original data, each run has been given an arbitrary number shown at the right of the chart.) From run to run, the simulated curves deviated somewhat from the theoretical. The average of two or more curves approximated the theoretical curves, although for very high values of rho the simulated curves seemed to fall slightly below the theoretical. The reason for the deviation is not clear; perhaps it is due simply to sampling error.

The reader will note the similarity between the autocorrelograms in Figure 1.6, and curves I and II sketched intuitively in Figure 1.5e indicating low to high short-term consistencies.

For the simulated curves we have examined, once they begin to deviate from theoretical they will continue this way, because of the interdependence between successive states. For more accurate estimates of the theoretical expectation, several separate runs could be generated and averaged. At this stage of rough exploration, however, the need for precise estimation was not strong enough to justify the additional step.
Effects of long-term consistency

To generate long-term consistency we assign each individual $\zeta_{ix}$ and $\zeta_{iy}$ constants around which his scores on x and y respectively are allowed to deviate. By making the variance of $\zeta_x$ or $\zeta_y$ small or large compared to the total variance of either variable (see p. 24), we are able to create a small to large component of long-term consistency in either variable. Effects of long-term consistency on autocorrelations of x are shown in Figure 1.7. (Effects for y will be similar.) For additional discussion see Appendix B.3.

Figure 1.7 here
Four curves are shown. The broken and solid curves differed in short-term consistency ($\rho_x$ = moderate and high respectively). The members of each pair differed in amount of long-term consistency; the ratio of zeta variance to total variance was .30 at the bottom and .70 at the top.

The theoretical curves in the left half of the chart were derived from expression (10), p. 24; they had corresponding asymptotes of .30 and .70. The four simulated curves shown in the right half of the chart corresponded rather closely to the theoretical expectations; all declined to their respective asymptotes governed by the proportion of zeta variance to total variance.

To see what the curves would look like given the same rho coefficients and no long-term consistency, the reader may look back at the bottom two pairs of curves in Figure 1.6.

The higher the rho coefficient, the more slowly the autocorrelograms declined toward their respective asymptotes.

In Figure 1.7 may be seen variables of types III to VI sketched in Figure 1.5e.
Figure 1.7. Autocorrelations with different combinations of long-term and short-term consistency. In the lower curves the long-term consistency (governed by ratio of zeta variance to total variance) was moderate, and in the upper curves it was high. These asymptotes were reached slowly (a) or rapidly (b), depending on short-term consistency (governed by rho).
Effects of causal influence by x on y

Thus far we have shown results for the x variable only. If we were to look at results for the y variable without influence by x, results would be analogous, since (except for the influence of x) y is generated in the same way.

What, then, will be the effect on autocorrelations of y when this is influenced by the individual's prior x value at g time units earlier,* where g is the causal interval?

Figure 1.8 shows results when x and y had short-term consistency only (rho's ranging from .40 to .95), and the influence of x on y was moderate (causal coefficient $c_{xy}$ was set at +.20).

Figure 1.8 here

For comparison, autocorrelations for x in the same runs are shown at the left and those for y at the right. If there were no causal influence between the two the autocorrelations for each pair should be similar, since both had the same rho's and no zeta's. Yet the reader will note that when rho's were relatively high, the autocorrelation for y dropped more slowly than did the corresponding curve for x. In other words, the actual short-term consistency of y became greater. It appeared that some part of the consistency in x was being added to the existing consistency in y.

*Unless otherwise specified, x and y variables were generated with $\rho_x = \rho_y$.
Figure 1.8. Pairs of x and y variables were created with identical short-term coefficients ranging from $\rho$ = .40 to $\rho$ = .95, and no long-term consistency; x was given a moderate causal influence on y ($c_{xy}$ = +.20). As rho's increased, the short-term consistency of y appeared to rise faster than for the corresponding x. Above $\rho$ = .95, the variance of y became too unstable to justify plotting the autocorrelation. (For discussion, see Appendix B.3.) Left panel: simulated curves for x; right panel: simulated curves for y; horizontal axis: k (lag).
The reader may wonder whether the increase in y's consistency was due to giving x a positive influence; would giving x a negative influence (making the coefficient $c_{xy}$ negative) reduce rather than raise the consistency of y? Empirical tests gave the same result either way: the consistency of y was increased. Mathematical derivation yielded the same result; the increase depended on $c_{xy}^2$ (see Appendix B.3).
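A short sketch may indicate why only the square of the causal coefficient matters. It is a simplified argument worked out under assumptions that go beyond the text (no zeta component, both series stationary, causal interval g of at least 1), and it is not a reproduction of Appendix B.3. Writing $\sigma_x^2$ for Var(x) and $\sigma_e^2$ for the variance of the error term in y, taking the variance of both sides of the recursion for Y gives:

```latex
\begin{align*}
Y_t &= \rho_y Y_{t-1} + c_{xy}\, x_{t-g} + e_{yt} \\
\mathrm{Cov}(Y_{t-1}, x_{t-g})
   &= c_{xy}\,\sigma_x^2 \sum_{m=0}^{\infty} \rho_y^{m}\rho_x^{m+1}
    = \frac{c_{xy}\,\rho_x\,\sigma_x^2}{1-\rho_x\rho_y} \\
(1-\rho_y^2)\,\mathrm{Var}(Y)
   &= \sigma_e^2 + c_{xy}^2\,\sigma_x^2 + 2\rho_y c_{xy}\,\mathrm{Cov}(Y_{t-1}, x_{t-g}) \\
   &= \sigma_e^2 + c_{xy}^2\,\sigma_x^2\,\frac{1+\rho_x\rho_y}{1-\rho_x\rho_y}
\end{align*}
```

The coefficient enters only as $c_{xy}^2$, so a negative influence adds the same x-derived component to y as a positive one. The same reasoning applies to covariances between Y values at different times: every term contributed by x carries two factors of $c_{xy}$. The factor $(1+\rho_x\rho_y)/(1-\rho_x\rho_y)$ also grows rapidly as the rho's approach 1.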
Effect of x influence on y autocorrelation given long-term consistency

We saw previously (Figure 1.7) that in the presence of long-term consistency created by variance among the zeta's, the autocorrelations declined to a non-zero asymptote equivalent to the ratio of zeta variance to total variance. How will this picture be affected, if y is allowed to be influenced by a prior value of x? Some results are shown in Figure 1.9.

Figure 1.9 here

In all cases, the level of the y curves was raised. From the simulated results one cannot tell whether each y curve was heading toward a higher asymptote than its corresponding x curve, or whether it was simply dropping more slowly. The mathematical derivations, though, indicate that the asymptotic y autocorrelations were indeed higher, and that this effect depended on the magnitude of $\mathrm{Var}(\zeta_x)$ but not on $\mathrm{Var}(\zeta_y)$. See Appendix B.3.

Let us leave these somewhat technical questions and move on to the topic of cross-correlations, which is more central to our concern with cross-lagged differentials.

Figure 1.9. With short-term consistency fixed at a high value ($\rho_x = \rho_y = .90$) and x having a moderate effect on y ($c_{xy}$ = +.20), long-term consistency was varied by setting variance of each zeta = .30, .50, and .70 of total variance. The autocorrelations for y were raised distinctly above those for x.
E. Cross-correlations

Effects of variation in short-term consistency

As under the discussion of autocorrelations, we shall start with simpler examples and proceed to more complex. First let us allow the short-term consistency of both x and y to rise, with no long-term consistency, and study the effect on the cross-correlations.* In all the examples in Figure 1.10, x was given a small causal influence on y ($c_{xy}$ = +.10), and the rho's were allowed to vary from .70 to .95. In Figure 1.11 the same set of rho's was used, but the causal influence was made negative ($c_{xy}$ = -.10).

Figure 1.10 and 11 here

The intuitive expectations sketched in Figure 1.1 were generally borne out. When x was given a positive influence on y (the causal interval in all cases was set at g = 4), the correlation between x and y became increasingly positive as the lag k approached the causal interval g (Figure 1.10). When x was given a negative influence on y, the correlation between them became increasingly negative as lag k approached causal interval g (Figure 1.11). See Appendix B.3.

*Unless otherwise specified, we always set $\rho_x = \rho_y$ and $\mathrm{Var}(\zeta_x) = \mathrm{Var}(\zeta_y)$.
Figure 1.10. Effect on the cross-correlations of variation in short-term consistency, with x exerting a small positive influence on y ($c_{xy}$ = +.10). As $\rho_x = \rho_y$ increased from .70 to .95, the cross-correlograms (a) became higher, (b) increased in span, (c) reached a maximum height after increasing delay beyond the causal interval.

Figure 1.11. Parameters here were the same as in Figure 1.10, but this time the influence of x on y was mildly negative ($c_{xy}$ = -.10). As short-term consistency increased (from $\rho$ = .70 to $\rho$ = .90), negative correlograms of increasing height, span, and delay were generated.
Other interesting features appeared as the rho's increased. (a) The correlograms became successively higher (either more positive or negative, depending on the sign of $c_{xy}$). Note that the magnitude of causal influence ($c_{xy}$) was unchanged; only the short-term consistency varied. (b) The span of the cross-correlogram became wider; that is, it started to rise sooner and declined to zero later. (c) More surprising, the point of maximum height did not occur at causal interval g but was increasingly delayed. The latter effect has been derived mathematically; see pp. 40-42, and Appendices A.6 and B.3.
Implications for cross-lagged differential. Let us return for a moment to what started this investigation: the question of whether causal connections might be inferred from a "differential in cross-lagged correlations." The latter quantity, as the reader will recall from pages 4 and 5 (see also Glossary, p. vi), is the difference between two xy correlations, in one of which x is measured k time units before y, and in the other y is measured k time units before x. In cross-correlograms such as Figures 1.10 and 1.11, this differential appears as a difference in the height of the correlogram at equal distances on either side of k = 0.
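The definition just given amounts to a one-line computation. The sketch below assumes the correlogram is held as a mapping from lag k to correlation; the function name and the two correlation values in the usage note are hypothetical, chosen only for illustration.

```python
def cross_lagged_differential(correlogram, k):
    """Difference between the two cross-lagged correlations at lag k.

    correlogram maps each lag to a correlation: a positive lag means
    x was measured before y, a negative lag means y was measured
    before x. The result is r(+k) - r(-k).
    """
    return correlogram[k] - correlogram[-k]
```

For instance, with made-up values `{4: 0.19, -4: 0.02}`, the differential at k = 4 is about +.17; a positive value is what a causal influence of x on y would be expected to produce.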
Now if x does exert a causal influence on y, under what conditions will a differential in the cross-correlations become visible? Its visibility will be affected by two characteristics of the correlogram. (a) One is the height. If the differential is computed at that k where the correlogram is maximum (height at this point being compared with height at negative k of same size), the magnitude of the differential increases as the rho's increase.
(b) A second characteristic is the span of the correlogram: roughly the distance between its rise from zero at the left extreme and its return to zero at the right extreme (for another definition see Glossary, p. vii). Given the short span generated by rho's of .70, the cross-lagged differential was visible for k's in a limited range: from about k = +3 to about +10. (Exact definition of the range will depend upon specification of sampling variability, which we have not attempted to pursue.) Given the broader span generated by rho's of .90, the differential was visible over a much greater range: from k = about +3 to +25 or more. And for the still broader span for rho's of .95, it is clear (had the k axis been extended) that the differential would remain even over a range up to k = +50 or more. Thus even when the lag in measurement was much longer than the causal interval, the differential persisted when short-term consistency was high.
Now since both the height and span are affected by rho, it tentatively appears that the greater the short-term consistency of x and y (magnitude of causal influence remaining constant), the more clearly a causal influence of x on y will be visible in the differential in cross-lagged correlations, even when the measurement lag is much longer than the causal interval.
One might suppose that span is merely a function of height, but the data in Figure 1.13 below will show that height can increase without an increase in span.
(Several questions must be explored before this statement can be affirmed, such as whether the conclusion depends equally on the short-term consistencies of x and y, or whether one matters more. Some simulated data when the two rho's differ sharply will be shown below in Figure 1.16, page 49.)
Another important aspect of Figures 1.10 and 1.11 is the third characteristic: (c) the fact that the maximum height of the correlogram occurred later than the causal interval, increasingly so with higher rho's. Given high short-term consistencies, the researcher would be advised to select a measurement interval which was probably "too long" (i.e., longer than the causal interval) rather than "too short."
Delay in maximum. An expression has been obtained relating the amount of delay in the maximum of the correlogram to the size of ρ_x and ρ_y respectively.* The relationship is plotted in Figure 1.12.
*The expression is simplified when the time dimension t is allowed to become very large compared to the measurement lags k; technically, when the expression becomes "asymptotic in t." Under this condition, the delay in maximum of the correlogram depends negligibly on g and t, and almost entirely on ρ_x and ρ_y. Figure 1.12 plots this t-asymptotic relationship, described in Appendix A.6 and justified in Appendix C.
Figure 1.12 here
[Figure 1.12: regions labeled Delay = 0, Delay = 1, Delay = 2, and so on; axis scales run from .00 to 1.00.]
Figure 1.12. Theoretical expectation for amount by which maximum of xy cross-correlogram will be delayed beyond the causal interval, depending on the short-term consistency in the two variables. E.g., where ρ_x = ρ_y = .90 a delay of about 4 time periods can be expected.
For example, if ρ_x and ρ_y both = .50, there will be no delay in the maximum; the latter will fall at the causal interval. But if ρ_x and ρ_y both = .95, then a delay of 10 time periods will be observed. This theoretical expectation agrees reasonably well with the empirical curves in Figures 1.10 and 1.11. (Because of the flatness of the higher curves and rounding of the correlations to two decimal points, the exact maximum in some curves is ambiguous.)
Note from Figure 1.12 that ρ_y is more critical in determining this maximum than ρ_x. Even though the latter is extremely high, there can still be zero delay over ρ_y values up to .50. But if ρ_y is very high, delay will be introduced even for small values of ρ_x.
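The dependence of the delay on the two rho's can be explored numerically. The sketch below is not the expression of Appendix A.6, which is not reproduced here. It assumes a first-order recursion in which x is autoregressive with coefficient ρ_x and y is autoregressive with coefficient ρ_y plus c_xy times the value of x from g periods earlier. Under that assumption the cross-covariance at lag g + d is proportional to the sum over j of ρ_y^j ρ_x^|d - j|, and the delay is the d at which this sum is largest. Function names and truncation limits are hypothetical.

```python
def cross_covariance_shape(d, rho_x, rho_y, terms=2000):
    """Relative size of the xy cross-covariance at lag g + d.

    Assumes x[t] = rho_x*x[t-1] + noise and
    y[t] = rho_y*y[t-1] + c_xy*x[t-g] + noise (an assumed form).
    The constant factor c_xy * Var(x) is omitted because it does
    not affect where the maximum falls. The infinite sum is
    truncated after `terms` terms.
    """
    return sum(rho_y ** j * rho_x ** abs(d - j) for j in range(terms))


def delay_in_maximum(rho_x, rho_y, max_delay=200):
    """Lag beyond the causal interval at which the shape is largest."""
    return max(range(max_delay + 1),
               key=lambda d: cross_covariance_shape(d, rho_x, rho_y))
```

Under these assumptions the sketch gives no delay when both rho's are .50 and a delay of 4 when both are .90, in line with the text and with Figure 1.12; at .95 it gives 9, close to but not identical with the 10 periods quoted above, which may reflect differences between this simplified form and the report's own expression.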
Effect of varying the amount of
causal influence by x on y
The previous chart (Figure 1.12) implies that the amount of delay in the maximum point of the cross-correlogram is affected only by the rho values of x and y respectively (short-term consistency), and not by the presence of long-term consistency (non-zero zeta variance) or the amount of causal influence of x on y. These expectations are borne out empirically by curves shown in the next few charts.
Figure 1.13, for example, shows what happens when short-term consistencies of x and y were fixed respectively at a moderate level (ρ_x = ρ_y = .70 in the upper chart), and at a high level (ρ_x = ρ_y = .90 in the lower chart). The three curves in each chart are generated by a successively stronger degree of influence by x on y (arbitrary levels of the causal coefficient c_xy were selected for upper and lower charts respectively for plotting convenience).
*Lemma 2 in Appendix C shows that (asymptotically in t) the boundary line between the region where the delay is zero and the delay is one has the equation ρ_y(1 + ρ_x) = 1. I.e., there is zero delay if and only if ρ_y(1 + ρ_x) < 1.
Figure 1.13 here
So long as rho remained fixed, increasing the causal influence of x on y had a single effect: the correlogram became higher, and the cross-lagged differential became more marked, at least for lags k reasonably close to the causal interval g.
But increasing the causal influence did not increase the span of the correlogram, nor did it affect the amount of delay in the maximum point.
In other words, if there exists in fact a causal influence of x on y, then the stronger this causal influence, the more clearly it will show up in a differential in cross-lagged correlations, provided the short-term consistency of both variables is at least moderate, and the measurement interval k is reasonably close to the causal interval g.
The lower part of Figure 1.13 shows that given larger rho's, all three curves possessed a wider span and a later maximum than was true in the upper chart. Again, increasing the causal coefficient merely increased the height of the correlograms without affecting their span or delay in maximum.
[Figure 1.13: upper and lower charts, each with three curves labeled by causal influence of x on y; horizontal axis k (lag), causal interval marked.]
Figure 1.13. Effect of increasing the influence of x on y. Short-term consistencies of x and y were ρ = .70 (upper chart) and .90 (lower chart). As causal influence increased, the height of cross-correlograms increased, but span and location of maximum did not.
Effects of introducing long-term consistency
We saw in Figures 1.7 and 1.9 that introducing long-term consistency altered the asymptote of the autocorrelations. Instead of declining to zero for very long intervals of measurement, the autocorrelations stabilized (as expected) at some positive value.
What effect would this property have on the cross-correlations? In the early section where intuitive expectations were given (pp. 6-12 and Figures 1.1 to 1.4), no conjectures were offered on this aspect. Over short intervals (k = 5 or 10) high autocorrelations can be generated by either short-term or long-term consistency. It seemed likely, perhaps, that moderate consistency from either long-term or short-term sources would permit the cross-lagged differential to appear, but that extremely high consistency from either source might cause the cross-correlogram to rise and fall very slowly and thus obscure the cross-lagged differential. (The actual effect of short-term consistency was different, as reported in pp. 35-41.)
Figures 1.14 and 1.15 illustrate the actual effects of introducing long-term consistency. In Figure 1.14 the short-term component was moderate (rho = .70), and in Figure 1.15 it was high (rho = .90).
Figures 1.14 and 1.15 here
The results were distinctly surprising. As long-term consistency increased, the curves did become flatter as expected, but instead of rising gradually from an asymptote of zero, they rose from a non-zero asymptote!
[Figure 1.14: cross-correlograms labeled by ratio of zeta variance to total variance; horizontal axis k (lag), causal interval marked.]
Figure 1.14. Effect on the cross-correlogram of increasing long-term consistency. Here short-term consistency was set moderate (ρ_x = ρ_y = .70), and causal influence was moderate (c_xy = +.20). As long-term consistency increased (variance of ζ_x and ζ_y increasing from .00 to .70 of total variance), the curves became higher but flatter.
[Figure 1.15: cross-correlograms labeled by ratio of zeta variance to total variance; horizontal axis k (lag), causal interval marked.]
Figure 1.15. Another example of effect on cross-correlogram of increasing long-term consistency. Here short-term consistency was set high (ρ_x = ρ_y = .90); causal influence was again moderate (c_xy = +.20). As long-term consistency increased, the curves became markedly flatter.
(This property has been confirmed mathematically; see Appendix B.3.)
One would not be surprised to find a non-zero asymptote in the presence of some correlation between ζ_x and ζ_y, but in these charts the ζ_x ζ_y correlation was approximately zero. Given a condition of long-term consistency in both variables, and given some causal influence of x on y, a permanent positive correlation was generated between them, even for measurement intervals remote from the causal interval.
(Let us reserve for later discussion the question of whether this effect was generated by the long-term consistencies in both x and y, or whether one of them was mainly responsible.)
We now could refine our initial conjecture about the effects of consistency on the cross-lagged differential. The higher the short-term consistency (in our model, the higher the rho's), the more strongly the presence of a causal connection was revealed by a difference in cross-lagged correlations. But the higher the long-term consistency (in our model, the larger the zeta variances), the more a causal influence of x on y was obscured.
To summarize, we may say that as long-term consistency increased: (a) the span was unaffected (span being defined as the distance from the point of rise from a horizontal base, whether zero or non-zero, to the point of return to the horizontal); (b) amount of delay in maximum was unaffected; (c) the height of the cross-correlogram was raised somewhat, but (d) this advantage was more than offset by increasing flatness, so that the cross-lagged differential was reduced rather than increased.
Effect of marked inequality in short-term consistency of x and y
Thus far variables x and y have been generated with identical parameters for the rho's and zeta variances. What will happen if these parameters are markedly different? Let us first consider the short-term consistencies. Will the cross-lagged differential be obscured, or conceivably reversed, if x is very much more consistent than y, or vice versa? Some simulated results are shown in Figure 1.16.
Figure 1.16 here
Four cross-correlograms are plotted, in which different degrees of short-term consistency for x and y were combined. As consistency of x declined from high (ρ_x = .95) to low (ρ_x = .40), the left-hand portion of the curve shrank in span; that is, it remained flat longer, and rose more abruptly as it approached the causal interval. Also, when these changes were accompanied by rising consistency of y from ρ_y = [illegible] to .95, the span of the right-hand portion increased, although the highest ρ_x value (.95) seemed to offset the effect of the smallest ρ_y.
Hence the consistency of x seemed to affect mainly the left side of the curve, and the consistency of y to affect mainly the right side.
However, in all curves the cross-lagged differential was maintained. If one takes equal distances on either side of the middle (k = 0), then the curve at the right was generally higher than that at the left.
[Figure 1.16: four cross-correlograms labeled by pairs of rho values; horizontal axis k (lag).]
Figure 1.16. Effect of inequality in short-term consistency of x and y. (In all curves, x was given a strong influence on y: c_xy = +.40.) Decreasing ρ_x caused the left portion of the correlogram to shorten in span. Increasing ρ_y generally caused the right portion to increase in span. But in all curves the cross-lagged differential remained for k's up to +15 or more.
Thus, marked inequality in the short-term consistency of x and y did not obscure the emergence of the cross-lagged differential from a causal connection.
Effect of marked inequality in long-term consistency of x and y
Next, how will the cross-correlograms be influenced if the long-term consistency in x is much larger than that for y, or vice versa? Six curves showing different combinations of these quantities are plotted in Figure 1.17.
Figure 1.17 here
[Figure 1.17: six cross-correlograms labeled by ratio of zeta variance to total variance for x and y; horizontal axis k (lag).]
Figure 1.17. Effect on the cross-correlogram of differences in the long-term consistency of x and y. In each curve, short-term consistencies were made high (ρ_x = ρ_y = .90), and causal influence slight (c_xy = +.10). When y alone had long-term consistency, the correlogram was almost unchanged; but when x alone or both x and y had long-term consistency, the curves became much flatter.
A base of comparison is the dotted curve at the bottom, where neither x nor y had any long-term consistency (ratio of zeta variance to total variance = .00 for both variables).
It is remarkable to note, first, that when only y was given long-term consistency (ratio of ζ_y variance to total variance = .70, in the next-to-bottom curve) there was almost no change from the comparison curve! The cross-correlogram rose sharply, with cross-lagged differentials almost as distinct as for the comparison curve.
When moderate to high long-term consistency was introduced into x, however (ratio of ζ_x variance to total variance = .50 or .70), the cross-correlograms immediately flattened at a high level, regardless of the long-term component in y. The implication for cross-lagged differentials seems to be: long-term consistency in the "cause" will obscure the differential, whereas long-term consistency in the "effect" may not. (In Appendix B.3 it is shown that the effect of ζ_y disappears and only that of ζ_x remains.)
When both variables had high long-term components (ζ_x = ζ_y = .70), the cross-correlogram was even flatter. This seems plausible. As each variable approaches complete long-term consistency, there is no longer room for causal influence to affect the cross-correlation, except by generating a high, flat asymptote.
Effect of introducing correlation between zeta's for x and y
In all of the correlograms thus far where x and y possessed some long-term consistency (variances of ζ_x and ζ_y were non-zero), the computer established a correlation between ζ_x and ζ_y close to zero. What will happen, now, if a positive or negative correlation is introduced between ζ_x and ζ_y?
It seemed likely, first, that a prevailing non-zero cross-correlation between x and y will be generated, even if x has no causal influence on y. (We saw above that the mere existence of a ζ component will also generate a non-zero asymptote if x does influence y.) Second, it seemed likely that the more positive the ζ_x ζ_y correlation, the more positive will be the asymptotic xy cross-correlogram.
*For this and other tentative conclusions, the reader is reminded that they apply to the one hypothetical model we have developed.
The effects of a ζ_x ζ_y correlation, we thought, would resemble the effects of a third variable which influences both x and y (see the conjecture in Figure 1.3).
Figure 1.4 sketched how such a non-zero asymptote might give rise to an ambiguity. Where it occurs as in curve B, and y exerts a negative or incongruent influence on x, the cross-correlogram should dip toward the zero line when y precedes x by a suitable interval. The cross-lagged differential (the difference between the first and last circles in Figure 1.4) could be the same for curve B as for curve A, where x exerts a positive effect on y.
We have tried to create curve B by introducing correlation between ζ_x and ζ_y in our model, but so far have been unsuccessful. Some results thus far are plotted in Figures 1.18 and 1.19.
Figures 1.18 and 1.19 here
[Figure 1.18: five cross-correlograms labeled by ζ_x ζ_y correlation; horizontal axis k (lag), causal interval marked.]
Figure 1.18. Effect on cross-correlogram of correlation between long-term constants. In all curves, short-term consistency was moderate (ρ_x = ρ_y = .70), x had a moderate positive influence on y (c_xy = +.20), and variance of zeta's was set at .50 of total variance. When correlation between ζ_x and ζ_y was varied from +.60 to -.60, the asymptote was pulled down, but not past zero.
[Figure 1.19: five cross-correlograms labeled by ζ_x ζ_y correlation; horizontal axis k (lag), causal interval marked.]
Figure 1.19. Effects of correlation between long-term constants (cont'd). Parameters were the same as in the preceding chart, except that here the influence of x on y was negative (c_xy = -.20), creating a negative asymptote. When the correlation between ζ_x and ζ_y was varied from -.60 to +.60, the asymptote was pulled up but not past zero.
Thus in Figure 1.18, all of the short-term consistencies were moderate (ρ_x = ρ_y = .70), and causal influence of x on y was mildly positive (c_xy = +.20). A positive asymptote was created, as noted earlier. Now, in the five curves, the ζ_x ζ_y correlation was varied from strongly positive to strongly negative (+.60 to -.60). But even a strong negative correlation between ζ_x and ζ_y failed to pull the asymptotic cross-correlation below zero! See Appendix B.3.
Figure 1.19 shows the corresponding picture when x exerted a moderate negative influence on y (c_xy = -.20). Now a prevailing (asymptotic) negative correlogram was generated. But again, even a powerful positive correlation between ζ_x and ζ_y failed to raise this asymptote above the zero line.
In short: in experiments thus far we have been unable to generate a cross-correlogram which was moved from a prevailing positive asymptote toward zero by the negative causal influence of x on y. That is, we have been unable to mask a congruent influence x → y by creating a negative ζ_x ζ_y correlation, or vice versa.
Effects of distributing the causal influence
In all the foregoing examples the causal influence of x on y was exerted at one specific interval g (called the causal interval). The actual recursion equation for y, however, may be extended to include more than one causal interval. The y variable at time t can be simultaneously influenced not only by x_{t-g}, but also by other prior values of x.
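A single step of such an extended recursion might look like the sketch below. The exact recursion equation is not restated in this section, so the form used here (previous y times ρ_y, plus a weighted sum of earlier x values, plus noise) is an assumption, and the function name and the three weight sets are hypothetical illustrations of a concentrated, an even, and a pyramid-shaped distribution with the same total.

```python
def next_y(y_prev, x_history, rho_y, c_by_lag, noise=0.0):
    """One step of y with causal influence spread over several lags.

    x_history holds x up to time t-1, so x_history[-lag] is x at
    time t-lag. c_by_lag maps each causal lag to its coefficient.
    Assumed form: y[t] = rho_y*y[t-1] + sum(c[lag]*x[t-lag]) + noise.
    """
    causal = 0.0
    for lag, coeff in c_by_lag.items():
        if lag <= len(x_history):  # skip lags reaching before the start
            causal += coeff * x_history[-lag]
    return rho_y * y_prev + causal + noise


# Illustrative weight sets, each summing to the same total influence.
CONCENTRATED = {3: 0.30}
EVEN = {2: 0.10, 3: 0.10, 4: 0.10}
PYRAMID = {1: 0.03, 2: 0.07, 3: 0.10, 4: 0.07, 5: 0.03}
```

Keeping the coefficients in a lag-to-weight mapping makes it easy to hold the total influence fixed while changing only how it is spread through time, which is the comparison the following paragraphs describe.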
What will happen if the influence of x on y is distributed over several time intervals instead of concentrating at a single one? We suspected that the cross-correlogram would increase in span. Some results of one trial are shown in Figure 1.20.
Figure 1.20 here
*The mathematical model corresponding to distributed causal influence has not yet been developed.
[Figure 1.20: four cross-correlograms (a) to (d), each labeled with the causal coefficients at each causal interval g; horizontal axis k (lag).]
Figure 1.20. Effects of concentrating vs. distributing the causal influence c_xy. Short-term consistency was moderate (ρ_x = ρ_y = .70). In all curves the total influence of x on y was the same, but it was distributed differently through time. In curve (a) c_xy was concentrated at one causal interval; in (b) and (c) it was spread symmetrically over three or five intervals; in curve (d) the causal influence declined exponentially. See text for discussion.
The four curves all used about the same total influence of x on y. But in curve (a) this was concentrated at interval 3; in curve (b) it was evenly distributed over intervals 2, 3, and 4; in curve (c) it was distributed in a symmetrical pyramid between intervals 1 and 5.
The cross-correlograms were surprisingly similar. The third curve was a little broader in span than the others, but the random variation in these correlograms is such that one cannot be assured of a genuine difference.
In the fourth curve (d) the causal coefficients decreased exponentially, being greatest at interval 1 and successively lower at intervals 2 through 5. The resulting span was about the same as before although the maximum (as one might expect) was closer to k = 0.
The last distribution of c_xy is intuitively appealing. It seems plausible that the "cause" should influence the "effect" most strongly in the period immediately following, and exert less and less influence as the "effect" variable becomes more and more remote. In future trials, a much broader distribution of causal influence will be used to see at what point the correlogram is noticeably flattened.
For the time being, one may tentatively conclude that whether the causal influence is concentrated or distributed slightly through time has little effect on the cross-correlogram. In further tests of correlational properties of simulated data, we shall continue to use a single causal interval.
F. Reciprocal Influence
Thus far simulated data have been shown in which variable x exerted a unidirectional influence on variable y. The basic model, however, permits bidirectional or reciprocal influences of x and y on each other, and several results with this feature will be presented in the present section.*
The approach here must be cautious. In certain preliminary runs, when x and y had high short-term consistency (rho's = .90 or higher), strange things happened to the distributions of x and y. Means and variances were no longer stable; in particular, over the customary 50 cycles, some variances increased explosively (by a factor of 1,000 or more).
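One rough way to see why high rho's combined with reciprocal influence can be explosive is to run the recursion without noise and watch what happens to a single unit disturbance. The sketch below assumes a first-order recursion with one causal lag in each direction; the report notes that the mathematical model for reciprocal influence has not yet been developed, so this form, the function name, and the default lags are assumptions for illustration only.

```python
def impulse_growth(rho_x, rho_y, c_xy, c_yx, g_xy=3, g_yx=5, cycles=50):
    """Size of x and y after `cycles` steps following a unit shock to x.

    Assumed noiseless recursion:
        x[t] = rho_x * x[t-1] + c_yx * y[t - g_yx]
        y[t] = rho_y * y[t-1] + c_xy * x[t - g_xy]
    A result well above 1 suggests an explosive system; a result
    well below 1 suggests the disturbance dies away.
    """
    lag = max(g_xy, g_yx)
    x = [0.0] * lag + [1.0]  # unit shock to x after a quiet start
    y = [0.0] * (lag + 1)
    for _ in range(cycles):
        t = len(x)
        x_new = rho_x * x[t - 1] + c_yx * y[t - g_yx]
        y_new = rho_y * y[t - 1] + c_xy * x[t - g_xy]
        x.append(x_new)
        y.append(y_new)
    return max(abs(x[-1]), abs(y[-1]))
```

Under these assumptions, rho's of .90 with reciprocal coefficients of +.40 make the disturbance grow many times over within 50 cycles, while rho's of .50 with the same coefficients let it die away, which is consistent with the decision below to start with low short-term consistency.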
Let us start, then, with variables having low short-term consistency (ρ_x = ρ_y = .50), and no long-term consistency.
Effect of reciprocal influence on autocorrelations
In Figure 1.21 are shown results with two runs, one containing a positive feedback loop (x → y → x, both influences positive) and one a negative feedback loop (x → y positive, y → x negative). In order to keep the two influences somewhat out of phase, the causal lags were made different: g = 3 and 5 respectively. The coefficient c_yx denotes the influence on x of the prior value of y.
Figure 1.21 here
*The mathematical model for reciprocal influence has not yet been developed.
[Figure 1.21: autocorrelations of variable x and variable y for two runs, with secondary effects marked; horizontal axis k (lag).]
Figure 1.21. Effect of reciprocal influence on autocorrelations. Short-term consistencies were weak (ρ_x = ρ_y = .50). In the upper curve, x and y were made to influence each other positively (c_xy = c_yx = +.40), with causal lags of g = 3 and g = 5 respectively. In the lower curve, the x → y influence was positive, while the y → x influence was the same in size but negative; causal lags as before. Secondary effects due to the feedback loops appeared.
In place of the previous autocorrelations which declined steadily either to zero or to some positive asymptote, certain periodic fluctuations appeared at intervals of about 9 to 19 respectively. (Note that the sum of the two causal intervals was 8.) The positive feedback loop appeared to generate a major and a minor secondary peak; the negative feedback loop produced a major valley followed by a minor peak.
One thus faces the fact that a variable can be more strongly correlated with itself over some moderate interval (such as k = 9) than over an intermediate interval (such as k = 5). Furthermore, even when a variable is reasonably self-consistent (rho is positive), the presence of negative feedback loops can generate negative autocorrelations over certain intervals.
Effect of reciprocal influence on cross-correlations
The reader will recall from Figure 1.2 the intuitive expectation that if each variable influenced the other, two peaks or valleys should be observed, one on either side of k = 0. The cross-correlograms generated by the variables just discussed are plotted in Figure 1.22.
Figure 1.22 here
[Figure 1.22: cross-correlograms for two runs, with main effects and secondary effects marked; horizontal axis k (lag).]
Figure 1.22. Effect of reciprocal influence on cross-correlations. Same parameters as in previous figure. The x → y influence produced a positive peak when x was measured before y (k = g = +3); the positive and negative y → x influences produced positive and negative peaks respectively when y was measured before x (k = g = -5). Secondary effects again appeared.
The expected effects did indeed appear. The influence of x → y generated a peak at k = +3, identical with the causal interval of g = 3; the effect appeared almost identically in two different runs.
The effect of y → x likewise generated a peak at k = -5, corresponding again to the causal interval g = 5. Correspondingly, the negative influence of y → x produced a "negative peak" or valley of equal magnitude in the opposite direction, as predicted.
I n addition to these main effects c e r t a i n secondary peaks and valleys
appeared, a f t e r an i n t e r v a l of about k =• 4*13 and -15.
Although the cor-
relograms did not extend over a long enough time to be sure, we suspect
that these secondary e f f e c t s would be periodic but smaller and smaller,
and the cross-correlograms would eventually reach an asymptote of zero ( i n
the
absence of long-term consistency).
What are the implications f o r the emergence of cross-lagged d i f f e r e n -
t i a l s from causal influences? Clearly, i f both
and y — t x ,
and the
causal intervals are nearly the same, then any measurement at two times
only may completely obscure the pattern.
Only i f one has several measure-
ments, so that cross-correlations over d i f f e r i n g i n t e r v a l s k can be obtained, w i l l the bimodal pattern be clear.
Of course i f a negative feedback loop e x i s t s , a strong cross-lagged
d i f f e r e n t i a l w i l l appear provided one's measurement I n t e r v a l i s reasonably
close to both causal i n t e r v a l s .
However, the d i f f e r e n t i a l i s produced
by two e n t i r e l y d i f f e r e n t influences;
x--ty and y-^-Vx. I f both lagged
correlations are equally strong and i n opposite d i r e c t i o n s , one might
w e l l suspect a negative feedback loop.
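The peak-and-valley pattern described above can be illustrated with a minimal Python sketch. It is not the program used for the report: the additive recursion with unit-variance errors, the function names, and every parameter value (short-term consistency 0.2, coefficients +.50 and -.50, 2000 simulated persons) are assumptions chosen only for illustration. The causal intervals g = 3 for x → y and g = 5 for y → x follow the text.

```python
import random
from math import sqrt

def pearson(a, b):
    """Plain product-moment correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)

def simulate(n_people, n_times, rho, c_xy, g_xy, c_yx, g_yx, seed=0):
    """Hypothetical two-variable panel: each score is rho times the prior
    score, plus a lagged causal term from the other variable, plus error."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_people):
        x, y = [], []
        for t in range(n_times):
            xv = rng.gauss(0, 1)
            yv = rng.gauss(0, 1)
            if t >= 1:
                xv += rho * x[t - 1]
                yv += rho * y[t - 1]
            if t >= g_xy:
                yv += c_xy * x[t - g_xy]   # x influences y after g_xy cycles
            if t >= g_yx:
                xv += c_yx * y[t - g_yx]   # y influences x after g_yx cycles
            x.append(xv)
            y.append(yv)
        xs.append(x)
        ys.append(y)
    return xs, ys

def cross_correlogram(xs, ys, base, lags):
    """Correlate x at time `base` with y at time `base + k`, across persons."""
    x0 = [x[base] for x in xs]
    return {k: pearson(x0, [y[base + k] for y in ys]) for k in lags}

# Negative feedback loop: x raises y after 3 cycles, y lowers x after 5.
xs, ys = simulate(2000, 60, rho=0.2, c_xy=0.5, g_xy=3, c_yx=-0.5, g_yx=5)
cc = cross_correlogram(xs, ys, base=30, lags=range(-12, 13))
peak = max(cc, key=cc.get)
valley = min(cc, key=cc.get)
print("peak at k =", peak, " valley at k =", valley)
```

Under these assumed settings the largest positive cross-correlation falls at k = +3 and the largest negative one at k = -5, so a single two-wave measurement near either interval would capture only part of the pattern.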
Differing magnitude of influence

In the two previous runs, the causal influence of each variable on the other was made equal in magnitude (c_xy = c_yx). What will happen if the two influences are made unequal in magnitude? Some results are given in Figure 1.23.

Figure 1.23 here

Again low short-term consistency was used. The effect of x on y was made more than twice as strong in both curves (c_xy = +.50) as the effect of y on x (c_yx = +.20 and -.20 respectively).

One would expect the peak due to x → y to be higher than the peak due to y → x. This difference in fact appeared, although the difference in height of the respective peaks was less than the difference in causal coefficients.

If one has cross-correlograms on a set of real data where measurements are taken at several time intervals, so that two clear peaks or a peak-and-valley can be discerned, the relative height of the two peaks or valleys may indicate the relative magnitude of the two causal influences.
Effect of long-term consistency

What will happen, given reciprocal influence of x and y on each other, when there is also long-term consistency in the two variables? And what will happen, furthermore, if the zeta constants used to create long-term consistency are either uncorrelated between x and y, or correlated? For clarity each of these conditions ought to be introduced separately, but both were used in generating the data in the next two figures. First, we see autocorrelations, in Figure 1.24.
[Figure 1.23: plot not reproduced. Horizontal axis: k (lag).]
Figure 1.23. In these cross-correlograms, x was allowed to influence y more strongly (c_xy = +.50) than y influenced x (c_yx = +.20 and -.20 in upper and lower curves respectively). Other parameters remained the same as in the previous chart. Peaks appeared with the same lags and direction as before, but those due to y → x were smaller than those due to x → y.
Figure 1.24 here

Autocorrelations are shown for a pair of x and y variables having moderate long-term consistency (zeta variance set at half of total variance), with a strong correlation introduced between the zetas (this was specified to be +.60, but because of random factors the actual correlations between zetas were .68 and .62 for the two runs). Here the causal influence of x on y was made stronger (c_xy = +.50) than the influence of y on x (c_yx = +.20 and -.20 respectively for the two curves).

Under positive feedback (x → y → x, both influences positive), the autocorrelations of both x and y became extremely high and practically flat. The combination of moderate long-term consistency and correlation between the long-term constants had the same effect as introducing extremely high long-term consistency. Under these circumstances, one is not hopeful of finding a cross-lagged differential (as the next chart indeed shows).

For the other pattern of negative feedback (x → y positive, y → x negative), however, the effect was less severe. Both autocorrelations shifted above the zero line (note the contrast with a previous autocorrelation in Figure 1.21), although this shift was more marked for the y variable than for the x.
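To see why long-term consistency by itself lifts the autocorrelations above the zero line, the following Python sketch isolates that one feature. It is an illustration under assumptions, not the report's program: a single variable is built as an individual constant (zeta) plus a first-order autocorrelated fluctuation, with no causal influence and no correlation between zetas. The function names and the values rho = 0.5 and 2000 persons are hypothetical; the zeta share of .50 follows the text.

```python
import random
from math import sqrt

def pearson(a, b):
    """Plain product-moment correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)

def series_with_zeta(n_people, n_times, rho, zeta_share, seed=0):
    """Scores = individual constant zeta + fluctuation u_t with unit variance.
    zeta_share is Var(zeta) as a fraction of total variance."""
    rng = random.Random(seed)
    innov_sd = sqrt(1 - rho ** 2)                    # keeps Var(u_t) = 1
    zeta_sd = sqrt(zeta_share / (1 - zeta_share))    # Var(zeta)/(Var(zeta)+1) = share
    rows = []
    for _ in range(n_people):
        zeta = rng.gauss(0, zeta_sd)
        u = rng.gauss(0, 1)
        row = [zeta + u]
        for _ in range(1, n_times):
            u = rho * u + rng.gauss(0, innov_sd)
            row.append(zeta + u)
        rows.append(row)
    return rows

def autocorr(rows, base, k):
    """Correlation across persons between scores at times base and base + k."""
    return pearson([r[base] for r in rows], [r[base + k] for r in rows])

without_zeta = series_with_zeta(2000, 40, rho=0.5, zeta_share=0.0)
with_zeta = series_with_zeta(2000, 40, rho=0.5, zeta_share=0.5)
r_without = autocorr(without_zeta, 5, 20)
r_with = autocorr(with_zeta, 5, 20)
print(round(r_without, 2), round(r_with, 2))
```

With short-term consistency only, the lag-20 autocorrelation is close to zero; with half the variance carried by the individual constants it stays near .50, the zeta share, however long the lag.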
The cross-correlograms generated by the above variables are shown in Figure 1.25.

Figure 1.25 here
[Figure 1.24: plot not reproduced. Horizontal axis: k (lag), 0 to 25 on either side for Variable y and Variable x; vertical axis: autocorrelation, .00 to 1.00.]
Figure 1.24. Autocorrelations when long-term consistency was introduced into each variable (zeta variance = .50 of total variance), with a strong correlation between the zetas (about +.65). Causal influence of x → y was made stronger (c_xy = +.50, g = 3) than the influence of y → x (c_yx = +.20 and -.20 for upper and lower curves, g = 5). In contrast with Figure 1.21 the autocorrelations all became positive, especially for variable y. With positive feedback (upper curve) they became practically flat.
[Figure 1.25: plot not reproduced. Horizontal axis: k (lag), -25 to +25; vertical axis: cross-correlation, .00 to 1.00.]
Figure 1.25. Cross-correlations for same data as in previous chart, given long-term consistency with a strong correlation between the zetas. The same peaks and valleys appeared as in previous cross-correlograms (Figures 1.22 and 1.23), but they were almost obliterated by positive feedback (upper curve).
The presence of long-term consistency, coupled with strong correlation between the long-term constants, raised all cross-correlations clearly above zero (in comparison with the pattern shown in Figures 1.22 and 1.23). However, under positive feedback, each variable had become so stable that there was almost no possibility for variation in the cross-correlogram. Slight peaks appeared in the same places as before, but they were almost obliterated. Under negative feedback, some cross-lagged differential remained. But unless one had measurements at several intervals, it would be hard to disentangle the effect of x → y from that of y → x.
G. Conclusion

Simulated panel data containing two variables x and y have been generated by computer, to investigate how lagged correlations between the variables will be affected by the presence of known causal connections.

[Footnote: More recently there has been created a ten-variable simulation which will permit more complex causal connections such as multiple influences on the same variable, causal chains, interaction effects, etc. The variables are created in the same fashion as in the two-variable model. In addition to these "true" values a corresponding set of "measured" values is also created by the addition of a specified component of measurement unreliability.]

When x was allowed to influence y (but not the reverse), and both variables had at least moderate short-term consistency (i.e., each value of an individual's x and y score is a linear combination of his immediately prior value on that variable and a random error term), a clear difference in the "cross-lagged correlation" appeared. That is, the correlation between x values at a given time and subsequent y values was greater than the correlation between y values at a given time and subsequent x values. The difference was stronger the greater the short-term consistency, and appeared despite marked inequality in this characteristic for x and y.
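The cross-lagged differential summarized above can be computed directly from two waves of a simulated panel. The Python sketch below is illustrative only: the recursion, the function names, and the settings (short-term consistency 0.7, coefficient +.50, causal interval g = 3, 2000 persons) are assumptions, not the parameters of any run in the report.

```python
import random
from math import sqrt

def pearson(a, b):
    """Plain product-moment correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)

def simulate_one_way(n_people, n_times, rho, c_xy, g, seed=0):
    """Hypothetical panel in which x influences y after g cycles, not the reverse."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_people):
        x, y = [], []
        for t in range(n_times):
            xv = rng.gauss(0, 1) + (rho * x[t - 1] if t >= 1 else 0.0)
            yv = rng.gauss(0, 1) + (rho * y[t - 1] if t >= 1 else 0.0)
            if t >= g:
                yv += c_xy * x[t - g]
            x.append(xv)
            y.append(yv)
        xs.append(x)
        ys.append(y)
    return xs, ys

def cross_lagged_differential(xs, ys, t1, t2):
    """r(x at t1, y at t2) minus r(y at t1, x at t2), for two measurement waves."""
    x1 = [x[t1] for x in xs]
    x2 = [x[t2] for x in xs]
    y1 = [y[t1] for y in ys]
    y2 = [y[t2] for y in ys]
    return pearson(x1, y2) - pearson(y1, x2)

xs, ys = simulate_one_way(2000, 45, rho=0.7, c_xy=0.5, g=3)
at_causal_interval = cross_lagged_differential(xs, ys, 30, 33)   # lag = g
off_causal_interval = cross_lagged_differential(xs, ys, 30, 36)  # lag = 2g
print(round(at_causal_interval, 2), round(off_causal_interval, 2))
```

In this sketch the differential is clearly positive both when the measurement lag equals the causal interval and when it is twice as long, in line with the statement that the difference survives a departure of the lag from the interval of causation.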
The cross-lagged differential was obscured, however, by the presence of long-term consistency as this was created in the model by introducing individual constants for each variable around which the individual's x and y scores fluctuated. Long-term consistency appeared to matter mainly in the causal variable.

Hence in the presence of certain conditions which are often approximated in real data (moderate consistency in each variable over time, linear relationships among variables, etc.) the introduction of known causal connections in fact generated clear-cut differences in the cross-lagged correlations, even when the lag (measurement interval) departed from the interval of causation.

There still remains the important question of whether, given observed cross-lagged correlations in empirical data, one can reason in the reverse direction and infer causal connections. Some next steps will be to examine sets of actual panel data. An effort will be made to fit to these data different simulated models which may vary on such features as level of short-term consistency, presence or absence of long-term consistency, unidirectional or reciprocal influence, positive or negative influence, correlation among the long-term constants, etc. In Figure 1.4, for example, were shown three curves differing sharply in such properties.

Perhaps it will prove possible to eliminate some types of models as essentially incompatible with the empirical results, and thus narrow the range of models among which causal interpretations can be sought.
APPENDICES: Table of Contents

Comments on mathematical model for causal analysis
Summary of Notation in Appendices A and B
Introduction to Appendix A
Contents of Appendices A, B, and C
Appendix A. Moments of {x_t}, {y_t} and {Y_t}
    A.1. The common model for {x_t} and {y_t}
    A.2. The time series {Y_t}
    A.3. Asymptotic formulas
    A.4. Variance of Y_t
    A.5. Covariance, asymptotic covariance, asymptotic correlation
    A.6. Covariance and correlation between x_s and Y_t
Appendix B. Moments of {x_it}, {y_it} and {Y_it}
    B.1. Exact results for {Y_it}
    B.2. Asymptotic results for {Y_it}
    B.3. Interpretation of B.2. Relevance of results
Introduction to Appendix C
Appendix C. Justification of Figure 1 in A.6
    Lemma 1
    Lemma 2
    Lemma 3
    Lemma 4
    Lemma 5
    Lemma 6
    Lemma 7
    Numerical accuracy of Figure 1 in A.6
Appendix D. Computation of Correlations for Simulated Data
Comments on Mathematical Model for Causal Analysis

Among possible examples of data with correlation structures which may contain insight into causation patterns in that data, panel data serves as a useful realization for describing the mathematical model developed here. For simplicity define a panel to be a fixed group of persons who report periodically on their behavior. Suppose the i-th person in a panel of size N reports at times t = 0, 1, 2, ... two numbers, x'_it and Y'_it, where i = 1, 2, ..., N. The measurements x'_it and Y'_it vary then according to both the individual i and the time t.

For a given time t and individual i, x'_it and Y'_it are random variables with finite means (expectations) E(x'_it) and E(Y'_it) and finite positive variances Var(x'_it) and Var(Y'_it). To develop formulas for means, variances and correlations for x'_it and Y'_it as functions of i and t, we fix on the i-th individual, suppress the subscript i in the notation so that x'_t and Y'_t replace x'_it and Y'_it, and study the parallel sequences of random variables

    x'_0, x'_1, x'_2, ..., x'_t, ...
    Y'_0, Y'_1, Y'_2, ..., Y'_t, ...

In statistical terminology the above sequences are called time series and written {x'_t} and {Y'_t} where t = 0, 1, 2, .... Finally, we extend these results for the individual to the group.

To express the relationship between x'_t and Y'_t requires introducing a third sequence of random variables y'_t, t = 0, 1, 2, .... Roughly speaking, {x'_t} and {y'_t} are sequences of dependent variables which develop independently as sequences, and combine to form the sequence {Y'_t}. That is, x'_t may depend on previous (in time) values of x' but not on any y' value or values. Similarly, y'_t depends only on y' values.
Summary of Notation in Appendices A and B

t, s - Non-negative integer valued subscripts representing time.
j, k, n - Integer valued subscripts used to denote values of t.
N - Size of population of individuals.
i - Non-negative integer valued subscript corresponding to an individual in the population.
x'_it, x'_t - x measurement (random variable) associated with the i-th individual in population at time t. When discussion focuses on one individual the subscript i is suppressed.
y'_it, y'_t - y measurement (analogous to x measurement above), independent of x measurement.
ζ_ix, ζ_iy - Means (expectations) of x'_it, y'_it, which are constant over t.
σ_x², σ_y² - Variances of x'_it, y'_it, constant with respect to both time t and individual i. Thus Var(x) = Var(x_t) = σ_x² = Var(x'_it | i) given fixed i, and Var(y) = Var(y_t) = σ_y² = Var(y'_it | i) given fixed i.
ρ_x, ρ_y - Autocorrelation of x'_{i,t+1} with x'_it (of y'_{i,t+1} with y'_it), constant over t and i.
Y'_it, Y'_t - Measurement formed from linear combination of terms from the {x'_it} and {y'_it} series. Latter notation used when i subscript can be suppressed.
T - Positive integer value of t where terms from {x'_it} and {y'_it} first combine to give {Y'_it}. For t < T, Y'_it is defined equal to y'_it.
g - Causal interval, i.e. number of cycles before time t ≥ T at which x'_{i,t-g} "influences" Y'_it. In recursive definition, Y'_it is given as depending linearly on Y'_{i,t-1}, x'_{i,t-g}, and an error term.
x_t, y_t, Y_t - The variables x'_t, y'_t, Y'_t translated to have mean zero.
{e_xt}, {e_yt} - The sequences of independent random variables into which the dependent sequences of variables {x_t}, {y_t} decompose.
[...] - See A.2 (11.1), (11.2).
@ - Asymptotic. See A.3.
E - Expectation.
Var - Variance.
Cov(x_s, Y_t) - Covariance of x_s with Y_t.
ρ(x_s, Y_t) - Correlation coefficient between x_s and Y_t.
c_xy - Coefficient of x-term in recursion for Y_t.
[...] - Coefficient of [...] in recursion for Y_t.
[...] - See B.2 (4).
Introduction to Appendix A

Formulas are derived for the correlation coefficients between pairs of random variables drawn from within and between three time series: two parallel time series {x_t} and {y_t}, and {Y_t} formed from them (t = 0, 1, 2, ...). The series Y_t is defined to equal y_t for t = 0, 1, ..., T-1, where the integer T > 1 is called the generating period. At time T, Y_T is set equal to a linear combination of Y_{T-1}, x_{T-g}, and an error term. The non-negative integer g is called the (size of the) causal interval. Thereafter, for t > T, Y_t is defined to be the same linear combination of Y_{t-1}, x_{t-g}, and an error term.

Because correlation coefficients are invariant under translation, it proves convenient to replace x'_t and y'_t by the random variables

    x_t = x'_t - E(x'_t)
    y_t = y'_t - E(y'_t)        (t = 0, 1, 2, 3, ...)

(provided the expected values E(x'_t) and E(y'_t) exist). Hence, the superscript " ' " serves only to indicate random variables which have not been translated so as to have zero mean.

We assume that {x_t} and {y_t} have the basic characteristics of autocorrelated (Markov) time series. (See Kendall and Stuart, The Advanced Theory of Statistics, Vol. 3, pp. 405, 418.) The full force of the stationarity assumption is not needed, hence not assumed here. Instead, we require only that x_t and y_t have finite positive variances σ_x² and σ_y² constant over t.
Contents of Appendices A, B and C

In Appendix A moments of the three time series {x_t}, {y_t} and {Y_t} are developed. By moments we mean the statistical parameters expectation, variance, covariance and correlation coefficient. The formulas in Appendix A apply only to the individual but form the basis from which the results for the total population or group are derived.

Appendix B contains a summary, discussion, and abridged derivations of formulas for moments of the population. In particular, for the model set forth in (...), page 25 of the main report, and B.2 of the appendix, B.2 (10) gives the asymptotic (in s and t) autocorrelation between Y_is and Y_it (where s and t are times) and B.2 (11) the asymptotic cross-correlation between x_is and Y_it. Sections D and E of the main report focus on these two correlations. The interpretation of the formulas in B.2 and their relevance to Sections D and E are treated in B.3.

The variables {x'_it} and {y'_it} are the variables under study in the main report. The variables {x'_it} combine with {y'_it} to form {Y'_it}, i = 1, 2, ..., N; t = 1, 2, ....

Define

    ζ_ix = E(x'_it)                                                    (1.3)
    ζ_iy = E(y'_it)                                                    (1.4)
    σ_x² = Var(x) = Var(x'_it | i), the variance of x'_it given i      (1.5)
    σ_y² = Var(y) = Var(y'_it | i), the variance of y'_it given i      (1.6)

The means ζ_ix and ζ_iy depend on the individual i but are constant over time t. The variance of the random variables x'_it or y'_it for any fixed individual i does not depend on the individual or the time. We indicate this by suppressing subscripts i and t.
page 79
These properties enable us t o w r i t e
That i s , a l l the variables x[
( y j ) &re i d e n t i c a l as variables
t
t
except f o r t h e i r means J T ^ t ^ y ) * Formulas (1.3) through (1.8) use the
n o t a t i o n o f the main report and l i n k Appendices A and B.
For each i n d i v i d u a l ( i ) the model*set f o r t h represents the measurement Y
made a t time t by a l i n e a r combination of Y. .
i,t-i
it
(the Y< measure'
3-
0
ment one u n i t before) ^ . g
(the x^, measurement g units before) and
x
v
an e r r o r term.
iables ^ i |
x
o n
s
I n t h i s sense £ _g i s the primary influence of the varx
t
Y
s
f
t
it°
^ ^
1
< it-g i t
x
Y
}
s
*
e a d s
n a t u r a l l y t o the conjecture
- P < is' i t >
x
Y
f o r
f i x e d
** g
t e
e r
t w g, g + l
s - 0
8
p
S
g+2, .
O B
1, 2, ...
The seven lemmas comprising :Apj>«*<J»* C provide an answer t o t h i s
question In the case S and t are very l a r g e
i n Figure 1
Ap^ndix-
e
The r e s u l t s are summarized
A.6 and Figure 1-12 page 40.
Appendix A. Moments of {x_t}, {y_t} and {Y_t}

A.1. The Common Model for {x_t} and {y_t}

The series {x_t} and {y_t} differ only in the values of the parameters σ_x, σ_y, ρ_x and ρ_y. Thus, only the model for {x_t} is developed.

We express the dependent random variables x_t in terms of independent ones e_xt to obtain simpler derivations. Analogously y_t is expressed in terms of e_yt.

Model. Let {x_t} (t = 0, 1, 2, ...) be a sequence of random variables with common mean zero and common variance σ_x². Let {e_xt} (t = 0, 1, 2, ...) be a sequence of independent random variables. Let ρ_x be a real number such that |ρ_x| < 1. Define

    x_0 = e_x0                                              (2)
    x_t = ρ_x x_{t-1} + e_xt        (t = 1, 2, 3, ...)      (3)
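A short Python sketch may help make the model concrete. It is an illustration under stated assumptions rather than the appendix's derivation: the innovations are taken to be normal, with Var(e_x0) = σ_x² and Var(e_xt) = (1 - ρ_x²) σ_x² for t ≥ 1, which is the choice that keeps Var(x_t) equal to σ_x² at every t. The function names and the values ρ_x = 0.6, σ_x = 2.0 are hypothetical.

```python
import random
from math import sqrt

def pearson(a, b):
    """Plain product-moment correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)

def markov_series(n_times, rho, sigma, rng):
    """x_0 = e_0, then x_t = rho * x_{t-1} + e_t with independent e_t."""
    x = [rng.gauss(0, sigma)]                  # Var(x_0) = sigma**2
    innov_sd = sigma * sqrt(1 - rho ** 2)      # keeps Var(x_t) = sigma**2
    for _ in range(1, n_times):
        x.append(rho * x[-1] + rng.gauss(0, innov_sd))
    return x

rng = random.Random(0)
runs = [markov_series(12, rho=0.6, sigma=2.0, rng=rng) for _ in range(4000)]

def lag_corr(n, base=4):
    """Correlation between x at time base and x at time base + n, across runs."""
    return pearson([x[base] for x in runs], [x[base + n] for x in runs])

for n in (1, 2, 3):
    print(n, round(lag_corr(n), 3), "expected about", round(0.6 ** n, 3))
```

Under these assumptions the correlation between x_t and x_{t+n} is ρ_x raised to the power n, so it depends on the lag n only and not on t.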
[Pages 80 (remainder) to 83: handwritten derivations completing A.1 and giving A.2, The time series {Y_t}; illegible in source.]
A.3. Asymptotic formulas.

From the equation for Y_t given by (10) we derive expressions for the variance of Y_t, and the covariance and correlation between Y_s and Y_t and between x_s and Y_t. Also simpler "asymptotic" formulas for these expressions are given.

Conventions throughout the remainder of Section A:

    s > T and t > T
    n = |s - t|, the absolute difference
    The prefix "@" stands for "asymptotic", e.g. "@Cov" means asymptotic covariance.

Definition of asymptotic

By asymptotic we mean that the integers g, T, and n = |s - t| are minute compared to s and t, so that expressions of the form ρ_x^s, ρ_x^t, ρ_y^s, and ρ_y^t (but not of the form ρ_x^(s-t)) may be regarded as zero.

Intuitively, taking the example of time, if t represents the present, then s is the near future or past and g and T are the remote past. In other words the values of x and y at times before T have negligible influence on the values x_t and y_t as compared to the influence of x_s on x_t (and y_s on y_t) if s is less than t, or vice versa if t is less than s.
[Pages 85 to 87: handwritten derivations for A.4, Variance of Y_t, and A.5, Covariance, asymptotic covariance, asymptotic correlation; illegible in source.]
A.6. Covariance and Correlation between x_s and Y_t

Consistent with the notation in A.5, we define the correlation coefficient between x_s and Y_t to be

    ρ(x_s, Y_t) = Cov(x_s, Y_t) / (σ_x sqrt(Var(Y_t))).

Asymptotically Var(Y_t) does not depend on t (see (15)), so that ρ(x_s, Y_t) varies in s and t as Cov(x_s, Y_t) varies. In (9.1), the definition of Y_t, x_{t-g} enters as "the influence" of the x series on Y_t. Intuitively one expects therefore that Cov(x_s, Y_t) and ρ(x_s, Y_t) attain their maximum values when s = t-g.

Considering only non-negative values of ρ_x and ρ_y, the maximum asymptotic value occurs at s = t-g if ρ_y (1 + ρ_x) ≤ 1, and at some time s before t-g if ρ_y (1 + ρ_x) > 1. If we think of time s as the present, then we expect maximum correlation g time units in the future. If the maximum occurs at time t after s+g, then we say the maximum is delayed t-g-s time units. Thus, we define the delay d as follows: d = t-g-s where s ≤ t-g, so that d assumes only non-negative integer values.
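The delay can be explored numerically with a small Python sketch. It rests on an assumption that is not legible in this copy: that the recursion is Y_t = ρ_y Y_{t-1} + c_xy x_{t-g} + error, with Cov(x_s, x_u) = σ_x² ρ_x^|s-u|. Under that assumption the asymptotic covariance of x_s with Y_t is proportional to the sum over j ≥ 0 of ρ_y^j ρ_x^|d - j|, where d = t-g-s. The function names `lag_profile` and `delay`, and the truncation at 2000 terms, are hypothetical choices.

```python
def lag_profile(rho_x, rho_y, d, terms=2000):
    """Sum over j of rho_y**j * rho_x**|d - j|: proportional to the asymptotic
    covariance of x_s with Y_t when s = t - g - d (assumed recursion above)."""
    return sum(rho_y ** j * rho_x ** abs(d - j) for j in range(terms))

def delay(rho_x, rho_y, max_d=50):
    """Smallest non-negative integer d at which the profile is largest."""
    values = [lag_profile(rho_x, rho_y, d) for d in range(max_d + 1)]
    return values.index(max(values))

# rho_y * (1 + rho_x) = 0.75 <= 1: maximum at s = t - g, no delay.
print(delay(0.5, 0.5))
# rho_y * (1 + rho_x) = 1.05 > 1: maximum occurs before t - g.
print(delay(0.5, 0.7))
# Both autocorrelations near one: a longer delay.
print(delay(0.9, 0.9))
```

Comparing the profile at d = 0 and d = 1 under the assumed recursion gives exactly the stated condition: the value at d = 1 exceeds the value at d = 0 when ρ_y (1 + ρ_x) > 1.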
Figure 1 on the following page shows the regions of the unit square (with abscissa ρ_x and ordinate ρ_y) on which the delay d = 0, 1, 2, ..., 10. The table below Figure 1 gives the areas of these regions. The logical justifications of the table, Figure 1, and the statements above comprise Section C. At this point we conclude Section A by obtaining expressions for Cov(x_s, Y_t), @Cov(x_s, Y_t), and @ρ(x_s, Y_t).
[Figure 1 (A.6): plot not reproduced. Regions of the unit square labeled Delay = 0, Delay = 1, Delay = 2, and so on; both axes run from .00 to 1.00. The accompanying table of areas is illegible in source.]
[Pages 90 to 93: handwritten derivations concluding A.6 and giving Appendix B, sections B.1 (Exact results for {Y_it}) and B.2 (Asymptotic results for {Y_it}); illegible in source.]
B.3. Interpretation of B.2. Relevance of results.

The autocorrelation formula (10) and cross-correlation formula (11) illuminate the simulated results in Sections D and E of the main report. By virtue of the definition of asymptotic in s and t in A.3, only Cov(Y_is, Y_it) and Cov(x_is, Y_it) explicitly depend on s and t. Thus @Var(Y_it) is constant over t and so are the denominators in (10) and (11).

As the lag |s-t| increases (that is, as k or -k increase in magnitude, where k = |s-t| in main report), Cov(Y_s, Y_t) and Cov(x_s, Y_t) approach zero by (...), (...) in A.5 and (...) in A.6. From (10), as the lag increases, @ρ(Y_is, Y_it) approaches [...]. These expressions give the asymptotes referred to on page [...].

It follows from (...) when [...], that is, when [...] does not depend on [...], that [...] depends negligibly on ζ_iy. It follows that the moments of [...] depend on [...] as observed on pages 33 and 57 of main report. More precisely, when [...] ≈ 0, [...] is small compared to c_xy, and the terms with c_xy dominate the [...] terms.

The last point helps explain why on page [...] the cross-correlation fails to reach zero despite the strong negative correlation between ζ_ix and ζ_iy. In (11), Var(ζ_ix) + [...] Cov(ζ_ix, ζ_iy) decreases slowly as Cov(ζ_ix, ζ_iy) grows more negative, because [...] is small.

When ζ_ix and ζ_iy are independent, and hence Cov(ζ_ix, ζ_iy) = 0 in (...), the variance Var(Y_it) is monotone increasing in [...]², Var(ζ_iy), c_xy², and Var(ζ_ix). As noted on page [...], the autocorrelation (10) is monotone increasing in Var(ζ_ix) and Var(ζ_iy) since, as the denominator in (...) increases, @ρ(Y_is, Y_it) approaches one.

Finally, with regard to comments on pages 33 and [...], we discuss the effect of varying c_xy in (10) and (11). From (...) in Section A.6, @Cov(x_s, Y_t) is linear in c_xy. If [...] is close to zero, then the numerator in (11) becomes essentially linear in c_xy and the denominator in (11) depends essentially on c_xy only through c_xy². Thus, if a positive c_xy is replaced by a c_xy of the same magnitude but negative, the numerator reverses sign while the denominator of (11) is unchanged since (-c_xy)² = c_xy². This verifies that @ρ(x_is, Y_it) merely changes sign when c_xy changes sign, as observed on page [...].

On the other hand when (...) holds or [...] is close to zero, the autocorrelation (10) depends essentially on c_xy through c_xy², with no linear terms in c_xy. Thus, @ρ(Y_is, Y_it) remains fixed when c_xy changes sign, as observed on page 33.
Introduction to Section C

When the correlations ρ_x and ρ_y are near one, the maximum asymptotic (in s and t) correlation between x_is and Y_it does not occur at s = t-g but is delayed. That is, Y_it has maximum correlation with some x_is where s is more than g time units before t. As might be expected, the delay increases continuously as ρ_x and ρ_y increase together. The lemmas in Section C consider only 0 ≤ ρ_x < 1 and 0 ≤ ρ_y < 1. They treat the delay d as a non-negative real variable, except in Lemma 7, which restricts d to non-negative integer values. Thus, in Figure 1 of A.6, the integer value of delay gives the maximum correlation in the sense that

    ρ(x_{i,t-g-d}, Y_it) ≥ ρ(x_is, Y_it)    for all non-negative integer values of s.

However, if non-integer values of s are allowed, Figure 1 will look similar but will have different boundary curves slanting diagonally toward the upper left corner of the unit square.

The notation in Section C is only slightly related to previous notation. For simplicity x replaces ρ_x and y replaces ρ_y. For strictly positive x and y, k is defined equal to y/x. This k is unrelated to any k used previously in the report.
[Pages 97 to 106: handwritten proofs for Appendix C (Lemmas 1 through 7) and the note on numerical accuracy of Figure 1 in A.6; illegible in source.]
Appendix D. Computation of Correlations for Simulated Data
As idealized, the parallel time series {x_t}, {y_t} have fixed means and
variances over time t, so that the correlation between, say, x_t and x_{t+k}
depends only on k. In simulation, however, the means and variances, while
extremely stable, vary due to random selection of values and the fact that in
finite time Y_t cannot reach its limiting distribution.
Autocorrelations and cross-correlations are computed for lags k = 0, ±1, ±2,
..., ±25. Two methods are used to compute the correlation coefficient for
lag k; together they provide a quick and easy, albeit crude, means of judging
the stability of the time series.
Method 1.
From 50 cycles of the simulation producing 50 arrays of values, say
x_{i,1}, x_{i,2}, ..., x_{i,50} for i = 1, 2, ..., 100, 25 "central" values
of t are chosen, 25 separate correlations r(x_t, x_{t+k}) are computed, and
the average of the 25 values r(x_t, x_{t+k}) is used to estimate the
correlation for lag k.
Method 2.
As above, 25 central values of t are chosen. However, instead of separate
correlations, the x values are pooled into two classes, namely t-values and
t+k-values, and one overall correlation for lag k is computed.
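
The two methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for the report: the names pearson, method1, method2 and simulate are hypothetical, the arrays are held as a dictionary keyed by t, and the stand-in data come from a simple carry-over process chosen for the example rather than from the report's model.

```python
import math
import random


def pearson(a, b):
    """Product-moment correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def method1(arrays, t_values, k):
    """Method 1: average the separate correlations r(x_t, x_{t+k})."""
    rs = [pearson(arrays[t], arrays[t + k]) for t in t_values]
    return sum(rs) / len(rs)


def method2(arrays, t_values, k):
    """Method 2: pool the t-values and the t+k-values, then correlate once."""
    pooled_t, pooled_tk = [], []
    for t in t_values:
        pooled_t.extend(arrays[t])
        pooled_tk.extend(arrays[t + k])
    return pearson(pooled_t, pooled_tk)


def simulate(n_cases=100, n_times=50, carry=0.8, seed=1):
    """Hypothetical stand-in data: x_t = carry * x_{t-1} + noise, per case."""
    rng = random.Random(seed)
    arrays = {}
    prev = [rng.gauss(0, 1) for _ in range(n_cases)]
    for t in range(1, n_times + 1):
        prev = [carry * v + rng.gauss(0, 1) for v in prev]
        arrays[t] = prev  # arrays[t] holds the values of all cases at time t
    return arrays


arrays = simulate()
t_values = range(8, 33)  # 25 consecutive values of t, chosen for lag k = 10
print(method1(arrays, t_values, 10), method2(arrays, t_values, 10))
```

Method 1 removes the mean and variance of each array before averaging, whereas Method 2 uses one pooled mean and variance per class, so comparing the two printed estimates gives a rough check on how stable the series is over t.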
Central Values of t
For t = 1, 2, ..., 50, to select 25 values of t for lag k = 25 involves
all 50 values of t in the computation, since t = 1 goes with t = 26, t = 2
with t = 27, ..., and t = 25 with t = 50. In general, for lag k, 25 + k
computing values of t are needed. We require these computing values to be
"central" in the sense that they be consecutive and the minimum t used in
computation be as near to zero as the maximum t is to 50. For example, for
k = 10, 35 computing values are needed, namely t = 8 through t = 42. Note
8 is eight from zero and 42 eight from 50. The "central" values of t are
then t = 8 through t = 32.
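
The selection rule can be written out as a small Python sketch. The function names central_window and central_t_values are hypothetical, and two points go beyond what the text states and are assumptions of this sketch: when the two margins cannot be made exactly equal, the first computing value is rounded up (which keeps t at 1 or above for k = 25), and for a negative lag the 25 central values are taken to be the later members of each pair.

```python
def central_window(k, n_times=50, n_pairs=25):
    """Return (first, last), the consecutive computing values of t for lag k."""
    span = n_pairs + abs(k)  # 25 + k computing values are needed
    if span > n_times:
        raise ValueError("lag too large for the available times")
    # Balance the margins: first - 0 should equal n_times - last.
    # Assumption: if they cannot be exactly equal, round `first` up.
    first = (n_times - span + 2) // 2
    last = first + span - 1
    return first, last


def central_t_values(k, n_times=50, n_pairs=25):
    """Return the 25 central values of t, each paired with t + k."""
    first, _ = central_window(k, n_times, n_pairs)
    # Assumption: for negative k the central values are the later times,
    # so that t + k still falls inside the computing window.
    start = first if k >= 0 else first - k
    return range(start, start + n_pairs)


for k in (0, 10, 25):
    window = central_window(k)
    central = central_t_values(k)
    print(k, window, central[0], central[-1])
```

For k = 10 this reproduces the example in the text: computing values t = 8 through 42, central values t = 8 through 32.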
The diagram below illustrates which 25 t values correspond to particular
lags. The axes represent times t = 1, 2, ..., 50. Each square therefore
corresponds to a pair of times t and t+k. The squares inside the serrated
diamond shape labelled with 0 are the twenty-five pairs of times used for
lag k = 0. Those labelled 5 correspond to lag = 5, those labelled -5 to
lag = -5, etc.
[Diagram of t values; both axes: Time t.]