A brief introduction to the inverse Ising problem and some

A brief introduction to the
inverse Ising problem and
some algorithms to solve it
Federico Ricci-Tersenghi
Physics Department
Sapienza University, Roma
original results in collaboration with
Jack Raymond and Aurelien Decelle
Simplifications for this talk
I.
er to
•
At most pairwise (two body) interactions.
•
keep the presentation simple, I prefer to deal
THE MODEL AND THE MEAN-FIELD APPROXIMATIONS
Ising variables si = ±1
The most general Hamiltonian is
only with binary va
= ±1 and Hamiltonian containing
Xup to two-body
X interactions, i.e. extern
H(s|J , h) =
Jij si sj +
hi s i
couplings. Thus, the most general model I want to study is defined by the fo
ty distributionwith
over
N Ising variables
corresponding
measure
i
⌥
↵
↵
1
P (s1 , . . . , sN ) =
exp
Jij si sj +
hi s i ⌦ ,
Z(J , h)
i=j
i
e partition function Z(J , h) is a normalizing constant, that depends on all
} and the external fields h = {hi }. Please notice that the temperature p
The direct problem
(main pb. in stat. mech.)
•
Given the Hamiltonian, compute the free energy
F (J , h) = log Z(J , h) = log
{si }
and average values
hOi =
X
X
s
exp
X
Jij si sj +
O(s)P (s)
The sum is over exponentially many terms
X
i
hi s i
!
I.
THE MODEL AND THE MEAN-FIELD APPROXIMATIONS
) si = ±1 and Hamiltonian containing up to two-body interactions, i.e.
In order to keep the presentation simple, I prefer to deal only with binary variabl
The inverse Ising problem
wise couplings. Thus, the most general model I want to study is defined by
spins) si = ±1 and Hamiltonian containing up to two-body interactions, i.e. external fi
ability distribution over N Ising variables
pairwise couplings. Thus, the most general model I want to study is defined by the follow
•
probability distribution over N Ising variables
1
Given data generated according to
⌥
P (s1 , . . . , sN ) =
P (s1 , . . . , sN ) =
Z(J
1 , h)
Z(J , h)
exp
exp
⌥
↵
i=j
↵
Jij si sj +
Ji=j
ij si sj +
↵
i
↵
hi si ⌦i ,
hi s i ⌦ ,
e the partition
function
which
may beZ(J , h) is a normalizing constant, that depends o
where the partition function Z(J , h) is a normalizing constant, that depends on all the c
{Ji,j
} and the either
external
fields h = {h
Please notice that the temperat
M configurations
ofi }.
spins
J = {Ji,j } and the external fields h = {hi }.
Please notice that the temperature param
•
absorbed
in •theineither
definition
ofofexternal
fields
and
couplings.
the required
been absorbed
the definition
external
fields
and
couplings.
All theAll
required
informati
magnetizations
and
correlations
the is
model
is encoded
in hs
the
free-energy
model
encoded
the
free-energy
Cij = hsi sj i
mini =
i
i
•
mi mj
F (J , h) = ln Z(J , h) .
GOAL: estimate couplings
F (J ,and
h) fields
= ln Z(J , h) .
In the rest of this Section I summarize the most common MFA to the free-energy: I am par
e rest
of thisinSection
I summarize
theequations
most common
MFA to thethat
free-energy
interested
deriving the
self-consistency
for the magnetizations
are used in
II for obtaining 2-point correlations.
The “exact” solution
Maximize the
log-likelihood
=
M
Y
1
(k)
L(J , h|s) =
log
P (s |J , h) =
M
k=1
X
X
Jij hsi sj i log Z(J , h)
=
hi hsi i +
X
i
hi mi +
i
X
ij
Jij (Cij + mi mj ) + F (J , h)
ij
Taking the derivatives
mi + @hi F (J , h) = 0 =) mi (DATA) = mi (J , h)
Cij + mi mj + @Jij F (J , h) = 0 =) Cij (DATA) = Cij (J , h)
Input data:
Magnetizations and Correlations
•
•
Less information than having configurations
If only
mi = hsi i
Cij = hsi sj i
mi mj
are given then maximum entropy principle implies
Hamiltonian contains only single-body and two-body
interactions
H(s|J , h) =
X
Jij si sj +
X
i
hi s i
Brute force solution by Monte Carlo
Monte Carlo -> unbiased solution ...but it is slow!
Initial guess
for J(0),h(0)
Given J(t),h(t)
MC compute m(t),C(t)
No
Return
J(t),h(t)
dm<eps?
Yes
dC<eps?
Update J(t+1),h(t+1)
according to
dm(t)=m(t)-m
dC(t)=C(t)-C
Mean field approximations (MFA)
M
Y
1
(k)
Log-likelihood L(J , h|s) =
log
P (s |J , h) =
M
k=1
X
X
Jij (Cij + mi mj ) log Z(J , h)
=
hi mi +
i
ij
FMFA (J , h)
mMFA
=
i
@hi FMFA (J , h)
MFA
mi
(J , h) = mi (DATA)
MFA
Cij (J , h) = Cij (DATA)
e rest of this Section I summarize the most common MFA to the free-energy: I am part
in (2)
deriving the
selfF (Jin, h)
= ln Z(J
h) .
nterested
deriving
the, self-consistency
equationsinterested
for the magnetizations
that
are
ested in deriving the self-consistency equations for the magnetizations that are used in
II for obtaining 2-point correl
I
for
obtaining
2-point
correlations.
the most
common
MFA to the free-energy: I am particularly
rmarize
obtaining
2-point
correlations.
Theapproximates
simplest MFA,
also kn
The
simplest
MFA,
also
known
as
naive
MF
(nMF),
the
model
onsistency equations for the magnetizations that are used in Section
MFA to the free-energy
he simplest MFA, also known as naive MF (nMF), approximates the model in terms
magnetizations
m
=
⌥s
,
wh
magnetizations
m
=
⌥s
,
where
the
angular
brackets
represent
the
average
w.r.t
i
i
i
i
tions.
netizations mi = ⌥si , where the angular brackets represent the average w.r.t. the me
Q
naive
mean-field
(nMF)
P
(s) free-energy
=
(sislocal
Eq.(1).
The
corresponding
approximation
the
Eq.(1).
iof
i )corresponding ap
own
as naive
MF
(nMF),
approximates
theto
model
in terms
1). The
corresponding
approximation
to the
free-energy
isi PThe
⇧ ⇤ ⌅
⌅⇤
⇤ ⌅⌃
⌅⌃ ↵
⇧ ⇤
⇧
⇤
↵
↵
↵
↵
↵
1i +the
miaverage
1 m
i measure↵
re the angular brackets
represent
w.r.t.
the
in
1
+
m
1
m
i
F
=
H
+
H
+
h
im
im
nMF
FnMF =
H
+H
+
h i mi + F
Ji ij+m=
i mjJ,ij mH
nMF
2
2
2
2
i
i i=j
i=j i
i
roximation to the ifree-energy
is
⌅
⇤
⌅⌃ ↵
↵
where
andmthe
mi be
must
beaccording
fixed where
according
to the
self-consistency
H(x)
and
fixed
to the
self-consistency
equatio
i must
1e +
mH(x)
1x ln(x)
mi the
i ⇤ x⇤ln(x)
H(x)
⇤
x
ln(x)
and th
+H
+
h i mi +
Jij mi mj ,
(3)
⌥
⌥
2
2
i
i=j
↵ ↵
↵
↵
⇤FnMF
⇤FnMF
↵
⌦ .J
= =Jij mjJ+
h
atanh(m
)
=
0
⌅
m
=
tanh
h
+
J
m
⇤F
i
i
i
i
ij
j
m
+
h
atanh(m
)
=
0
⌅
m
=
tanh
h
+
nMF
ij j
i
i
i
i
i
⇤m
=
J
m
i
e mi must ⇤m
be fixed
according
to
the
self-consistency
equations
ij j +
i j
j
j
j
⇤mi
⌥
j
better
MFAMFA
can be
by considering
also
Onsager
reactionreaction
term [26],
lea
↵ thealso
A better
canobtained
be obtained
by considering
the
Onsager
term
hi atanh(mi ) = 0 ⌅ mi = tanh hi +
Jij mjA⌦ better
.
(4) can be obta
MFA
ollowing
TAP
approximated
free-energy
and
self-consistency
equations
j
he following TAP approximated free-energy and
self-consistency equations
⌅
⇤
⌅⌃ following TAP approxima
the
↵ ⇧ ⇤⇧
⌅1 m
⇤i
⌅⌃
1+⇤
mi
↵
ned by considering
reaction
term 1[26],
to
1+
mleading
FTAP = also the
H Onsager
+mHi
+
i
FTAP =
H2
+H
+
2
↵
i
2
⇤2
F⌅ =
•
FnMF =
i
H(x) ⇤
H
2
+2H
2
i
2
i
+
h i mi +
i
i=j
Jij mi mj ,
i=j
x ln(x) and the mi must be fixed according to the self-consistency equations
here H(x) ⇤ x ln(x) and the mi must be fixed according to
⌥ the self-consistency e
⌥
↵
↵
⇤FnMF
=nMF Jij↵
mj + hi atanh(mi ) = 0 ⌅ mi = tanh hi +
Jij mj ⌦↵
.
⇤F
⇤mi
=
Jij mj + hi atanh(mi ) = 0 ⌅ mi = tanh hi +
Jij m
MFA to the free-energy
⇤mij
•
j
j
j
etter MFA can nMF
be obtained
by considering
also the
Onsager reaction term [26], leading
+ Onsager
reaction term
(TAP)
A better MFA can be obtained by considering also the Onsager reaction term [2
owing TAP approximated free-energy and self-consistency equations
⌅
⇤ and ⌅⌃
he following TAP approximated
self-consistency equations
↵ ⇧ ⇤ 1 + mfree-energy
1 mi
i
⇤ +
⌅⌃
FTAP =
H↵ ⇧ ⇤ + H ⌅
1 + mi
1 mi
2
2
i
FTAP =
H ⇤
+H
+
⌅
2
↵
↵ 2
1 2
i
2
2
+
hi mi +
Jij mi m
+
J
(1
m
)(1
m
) ,
⇤
⌅
j
ij
i
j
↵
↵
2
1 2
2
2
i
i=j
+
h
m
+
J
m
m
+
J
(1
m
)(1
m
,
i i
ij i j
⌥
ij
i
j)
2⇥
i↵
i=j
mi = tanh hi + ⌥ Jij mj Jij (1 m2j )mi ⌦ .
j
mi = tanh hi +
↵
j
Jij mj
Jij (1
⇥
m2j )mi ⌦ .
reaction term
izations
⌥si , where
the angular
represent
the
w.r.t. theI measure
e rest of m
this
I summarize
the brackets
most common
MFA
toaverage
the free-energy:
am part
i =Section
The corresponding
to equations
the free-energy
is magnetizations that are used in
ested
in deriving theapproximation
self-consistency
for the
⇧ ⇤
⌅
⇤
⌅⌃ ↵
↵
↵
1 + mi
1 mi
r obtaining 2-point correlations.
FnMF =
H
+H
+
h i mi +
Jij mi mj ,
(
2
2
i
i
i=j the model in terms
he simplest MFA,
also known as naive MF (nMF), approximates
MFA to the free-energy
H(x)
⇤ x ln(x)
mi must
be fixedbrackets
accordingrepresent
to the self-consistency
equations
netizations
mi =and
⌥si ,the
where
the angular
the average w.r.t.
the me
Plefka expansion in small J
⌥
1).⇤F
The
corresponding
approximation to the free-energy is
↵
↵
nMF
=
Jij mj ⇧+ h⇤
atanh(m
0 ⌅ ⌅⌃
mi = tanh hi +
Jij mj ⌦ .
(
i
i ) =⇤
⌅
⇤mi
↵
↵
↵
1 + mi
1 mi
j
j
FnMF =
H
+H
+
h i mi +
Jij mi mj ,
2
2
i
i reactioni=j
etter MFA can be obtained
by considering also the Onsager
term [26], leading
•
e H(x)TAP
⇤ approximated
x ln(x) and the
mi must be
according toequations
the self-consistency equatio
owing
free-energy
andfixed
self-consistency
⌥
⇧
⇤
⌅
⇤
⌅⌃
↵
1 + mi
1 mi
↵
↵
⇤FnMF
FTAP
+ H ) = 0 ⌅ + m = tanh h +
= = Jij mH
+
h
atanh(m
Jij mj ⌦ .
j
i 2
i
i
i
2
⇤mi
i
j
⌅ j
↵
↵⇤
1 2
+
hi mi +
Jij mi mj + Jij (1 m2i )(1 m2j ) ,
(
better MFA can be obtained by considering also
2 the Onsager reaction term [26], lea
i
i=j
⌥
ollowing TAP approximated free-energy
and self-consistency
equations
⇥
↵
2 ⌅⌃ ⌦
⇧
⇤
⌅
⇤
mi = tanh
h
+
J
m
J
(1
m
.
(
i
ij
j
ij
j )mi
↵
1 + mi
1 mi
j
FTAP =
H
+H
+
2
2
MFA to the free-energy
•
Bethe approximation (BA)
•
•
•
tries to include any correlations between n.n. spins
in principle no small J required
(but beware to phase transitions)
factorization over links
P (s) =
•
exact on trees
Q
i
Pi (si )
Q
Pij (si ,sj )
ij Pi (si )Pj (sj )
ij
⇤ (1 m⌅i )(1 +⇤mj ) cij
⌅⌃ ↵
↵
1 mi
+ H 1 + mi
+
joint
th(
+ that
(1 correspond
di ) H
+H
hiprobability
mi +to theJijreaction
(cdistribution
latter 4
terms,
to considering
recursively +
the
reaction
between
spins
ij + m
i mj ) , of
4
2
⌅
⇤ i
⌅⌃ ↵ 2
i
i=j
msi i and sj , can
1 be
miresummed and lead↵
BA.
+H
+
hi mi + to the
Jij (c
(7)
ij + mi mj ) ,
P (s
2degree of spin si , i.e. the number of its neighboring spins. In Eq.(7) the
1 , .t
where
d
is
the
last
i
i
i=j
The BA gives a description of the model in terms of magnetizations m and connected correla-
+ mi )(1
⇧
m↵
cij
j)
MFA to the free-energy
i
terms
correspond
tomthe
average
value
of the spins
energy
at spins
givenconnected
magnetizations
and neighbouri
tions
c
=
⌃s
s
⌥
m
between
neighboring
(i.e.
by
a
non-zero
coupling
ij
i
j
i
j
i.e. the number of its neighboring spins. In Eq.(7) the last two
where
the
product
runs over
correlations,
while
first two
correspond
to the
ofconsists
thefirst
Bethe
approximation
Jij ). The BA
can the
be derived
in terms
two equivalent
ways.
Theentropy
first way
in
finding
values
ofto
mt
•
value of the energy at given magnetizations and neighbouring
Bethe
approximation
joint
distribution
of free-energy
the N spin(BA)
variables, marginal probabilities are given re
and probability
c minimizing
the following
rms correspond to the entropy of the Bethe approximation to the
⌅
⇤
⌅
↵ ⇧ ⇤ (1 + mi )(1 +BA
(1i )(1
+ mm
mj ) + cpijij (si , spji)(s(1
+ cij The conditi
i ) =m
i sji))/2.
pi (si ) ,
(
FBA =
HP (s1 , . . . , sN ) ⇤
+H
+
he N spin variables,
pi (si )pj (sj )
4
4
⌥
i
i=j
(ij)
⇤
⌅⌃
(1
pij (si , sj ) ⇤ (1 + mi )(1 mj ) cij ⌅
BA
(1 mi )(1 + mj ) cij 1
. . . , sNthe
) ⇤first product+runs
pi (s
,
(8) the two-spins
H over
+ H spins and
+lnsingle-sp
i ) pair
where
all
of
neighboring
and
J
=
ij
pi (si )pj (sj )
4
4
4
i
(ij)
⇧
⇤
⌅
⇤
⌅⌃
(1
↵
↵
↵
marginal probabilities are given
respectively
piji (si , sj ) = [(1 + mi si )(1 + mj sj ) + cij si sj ]/4 a
1+m
1by m
i
+
(1 di ) H
+H
+
hi mi +
Jij (cij + mi mj ) ,
(7)
1
2 and the two-spins
2
rp all
pair
of
neighboring
spins
and
single-spin
i
i
i=j
analytically
and lead to
i (si ) = (1 + mi si )/2. The conditions ⌅FBA /⌅cij = 0 can be solved
cij (m
1+t
i , mj , tij ) =
2tij
⇥
⇥
espectively
by
p
(s
,
s
)
=
[(1
+
m
s
)(1
+
m
s
)
+
c
s
s
]/4
and
⌥
ij i jof spin s , i.e.i the
i
j jof its ijneighboring
i j
where di is the degree
number
spins. In Eq.(7) the last two
i
(1
+
m
)(1
+
m
)
+
c
(1
m
)(1
mj ) + cij
i
j
ij
i
1
⌦neighbouring
tij = tanh(Jand
⇥ where
⇥).
tions
⌅Fcorrespond
0=can
be solved
analytically
and at
lead
to magnetizations
,Please no(
ij
BA /⌅cij J=
ijto
terms
the ln
average
value
of the energy
given
4
(1 + mi )(1 mj )⇥ cij (1 mi )(1 + mj ) cij
⇥
further
confirmation
that
resumm
correlations, while the first two terms correspond to the entropy
of the
Bethe approximation
to the
⇥
1 + mi )(1 + mj ) + cij (1
mi )(1
mj ) + c2ij 2
1
2
(mi , mj , tijdistribution
) =⇥
1of+the
tij N spin
(1 variables,
tij )⇥ ⌦ 4t
tij(9)
mj )(mj tij mi )
mi mj . (1
ij (mi
,
jointcijprobability
2tij
1 + mi )(1
+
mj )
cij
(1
mi )(1 + mj )
cij
pij
(sidentical
BA
i , sj )
⇥
where tij = tanh(Jij ). Please
note
that
Eq.(9)
is
to
Eq.(26)
in Ref. 16 and this(8)is
P
(s
,
.
.
.
,
s
)
⇤
p
(s
)
,
1
i
i
N
2
2 2
p (s )p (s )
magnetizations, from which the
equations
for
the magnetizations
cijself-consistency
= (j) (i)
m
m
.
i
j
(j)
(i)
tij + m1i +m
ji tij mj
m
cij requires
=
mi mj . algebra and I prefer to (13)
However this derivation
a(j)rather(i)complicated
1 +the
mi self-consistency
tij mj
avity magnetizations must satisfy
equations
quations in a much simpler alternative way.
⇥
gnetizations must satisfy the self-consistency equations
⇧
(j)
⇥m(i) )⌅ . correlations cij
d Cavity Method [2] localm magnetizations
neighbouring
⇤hi + mi and
=
tanh
atanh(t
ik
i
k
⇧
(j)
(j)
(i) ⌅
k(=j)
⇤
erms of some auxiliary
magnetizations
mi =variables,
tanh hi +the cavity
atanh(t
. mi (i.e. the mean (14)
ik mk )
MFA to the free-energy
•
k(=j)
hese
equations
are
often
solved
by
an
iterative
algorithm
as Belief Propagation (BP
Bethe
approximation
(BA) and
cavityknown
method
absence of a neighboring spin s ):
j
ations
areconvergence,
often solvedthe
byfixed
an iterative
known as
Belief
Propagation
(BP)
[28]
case of
point (i)
ofalgorithm
BP gives
directly
the
Bethe
free-energy
that
adm
(j)
(j)
mi + tij mj
mi : magnetization of i
convergence,
the
point
of
BP
gives
directly
the
Bethe
free-energy
that
admits
an
mifixed
,
(11)
xpression
in terms
of=cavity
magnetizations
only
[2].
(j)
(i)
in absence of j
1 + mi tij mj
in
[2].
Interms
order of
to cavity
obtainmagnetizations
a closed set of only
self-consistency
equations in the magnetizations m, l
(j)
(i)
tij mi + mj
r toeqs.(11-12)
obtain a closed
of self-consistency
equations
let me
olve
for
cavity
magnetizations
and
i find j in the magnetizations m,
mj the
=set
,
(12)
(j)
(i)
1 + mi tij mj
ij
11-12) for the cavity magnetizations
and find t(i)
(j)
mi = f(j)
mj , tij )
mj = f (mj , mi , tij ) ,
i , (i)
tij + mi(mm
j
cij(j)=
m
(13)
(i)i mj .
(j)
(i)
mi = f1(m
)
mj = f (mj , mi , tij ) ,
(15)
i , mj ,ttijm
+
m
ij
here
i
j
⌃
2
1 t
(1 t2 )2 4t(m1 m2 t)(m2 m1 t)
tions must satisfy the self-consistency
equations
f (m1 , m2 , t) =
.
⇥
⌃
2t(m2 m1 t)
2
2
2
1 t
(1 t )
4t(m1 m2 t)(m2 m1 t)
⇧
(j)
(i) ⌅
f
(m
,
m
,
t)
=
. = 0 as(14)
(16)
1
2
⇤
=
tanh
h
+
atanh(t
m
)
.
he sign in m
front
of
the
square
root
has
been
chosen
such
that
f
(0,
0,
t)
it
shoul
i
2t(mik2 km1 t)
i
k(=j)
onsistency check can be made by substituting expressions (15) in Eq.(13) to obtain again the
xpression in terms of cavity magnetizations only [2].
In order to obtain a closed set of self-consistency equations in the magnetizations m, l
MFA to the free-energy
q.(10).
Finally,for
combining
and Eq.(14),
it is possible to obtain the self consist
olve
eqs.(11-12)
the cavityEq.(11)
magnetizations
and find
ation for the magnetizations
(j) under the BA:
(i)
mi = f (mi , mj , tij )
mj = f (mj , mi , tij ) ,
⇤
⌅
⇥
Bethe approximation⌥
(BA) and cavity method
mi = tanh ⇧hi +
atanh tij f (mj , mi , tij ) ⌃ .
here
⌃
j
10). Finally, combining Eq.(11)
and
Eq.(14),
it is possible to obtain the self
2
2
2
1 t
(1 t )
4t(m1 m2 t)(m2 m1 t)
f (m1 , m2 , t) =
.
comment
that the use
of this
formula
finding
magnetizations is not a g
2t(m
m1Bethe
t)
nfair
fortothe
magnetizations
under
the
BA: for
2
•
⇤Eq.(17)
⌅=
:heindeed
iterative
solution
is typically
BP0 solving
Eq.
sign inanfront
of the
squareofroot
has been
chosen more
such unstable
that f (0,than
0, t)
as it shoul
⇥
⌥
⇧involves
⌃(not
interest incheck
this formula
is that
physical
magnetizations
onsistency
can be
by it
substituting
expressions
Eq.(13)
obtain
againones)
the
mimade
=
tanh
hi + only
atanh
tij f(15)
(m
,m
. cavity
jin
i , tij ) to
j
be used to obtain correlations (see Section
II) and to solve in a fast way the inverse I
blem
Sectionthat
V). the use of this formula for finding Bethe magnetizations is
r to (see
comment
A series expansion of the exponent in Eq.(17) for small couplings gives
ndeed an
iterative
solution gives
of Eq.(17)
typically
Small
J expansion
nMF,isTAP,
... more unstable than BP solv
⌥
⇥
⌥
⇥
2
2
erest in
that
it
involves
only
physical
magnetizations
hi this
+ formula
atanh tijisf (m
,
m
,
t
)
⇤
h
+
J
m
J
(1
m
. , cavi
j
i ij
i
ij j
ij
j )mi + . .(not
j
j
used to obtain correlations (see Section II) and to solve in a fast way the
Computing correlations
by linear response
•
Correlations are trivial in MFA
Cij = 0 in nMF, TAP and BA (between distant spins)
•
Non trivial correlations can be obtained by using the
linear response (Kappen Rodriguez, 1998)
he inverse correlation matrices C
1
for the three
discussed
above
given by the
@mMFA
@hare
i
i
1
ij
xpressions:
11
naive MF ((CnMF
nMF)ij =
TAP
Bethe
11
(C
( TAP
ij
TAP))ij
(CBA1 )ij
=
=
ij
1
⇤
⇤
m2i
=
@hj
)ij =
@mj
Jij ,
⇧
1
2
+
J
ik (1
2
1 mi
k
1
1 m2
(
⇧
m2k )
⌅
ij
Jij +
tik f2 (mk , mi , tik )
1 t2 f (mk , mi , tik )2
⌅
2Jij2 mi mj
ij
⇥
,
tij f1 (mj , mi , tij )
1 t2 f (mj , mi , tij
Computing correlations
by
linear
response
in
BA
ij
erse correlation matrices C
ons:
1
MF (CnMF
)ij =
1
⇤
m2i
1
for the three MFA discussed above are given by the followi
Jij ,
(2
⌅
⇧
⇥
1
2
2
2
AP
=
+
Jik (1for m
Jij + 2Jij min
(2
ij
i mBA
j ,
k ) linear
2
Analytic
expression
the
responses
1 mi
k
⇤
⌅
⇧ tik f2 (mk , mi , tik )
tij f1 (mj , mi , tij )
1
11
ethe ((CBA ))ijij =
, (2
ij
2
2
2
2
2
BA
1 mi
1 tik f (mk , mi , tik )
1 tij f (mj , mi , tij )
•
1
(CTAP
)ij
k
•
⇥ ⌅f (m1 , m2 , t)/⌅m1 and f2 (m1 , m2 , t) ⇥ ⌅f (m1 , m2 , t)/⌅m2 . From the
Coincide with the fixed point of
ons one can obtain
directly anyPropagation
correlation by simply computing the inverse of a matrix
Susceptibility
se note that Eq.(22) gives exactly the same solution found by the SuscProp iterative
No need to run any algorithm!
[9], which is presently considered one among the best inference algorithms. The ma
1 (m1 , m2 , t)
•
ge of Eq.(22) is that it always provides the correlation matrix, even in those cases whe
op does not converge to the fixed point. Moreover inverting a matrix takes roughly the sam
Solving the inverse problem by MFA
•
Match measured magnetizations and correlations with
MF approximated magnetizations and linear responses
mMFA
(J , h) = mi (DATA)
i
MFA
(J , h)
ij
•
= Cij (DATA)
Under the Bethe approx. one could use either
cBA
or
ij
BA
ij
for n.n. correlations. Which one is better?
2.1. Estimating correlations in the case of null magnetizations
A preliminary ranking of MFA can be done on the basis of how good their estim
correlations are, given the couplings. Indeed I expect that the better this estimat
better the solution to the inverse problem will be.
For simplicity I concentrate on models with no external fields and the coupl
multiplied by a parameter
(the inverse temperature) such that the di⇤culty
inference problem
If allincreases
field arewith
zero,. then magnetizations are null by
In the case of null magnetizations, the expressions for the inverse correlation m
symmetry, and expressions simplify to
simplify a lot:
Zero field case is simpler
•
naive MF
TAP
Bethe
1
(CnMF
)ij = ⇥ij Jij ,
"
#
X
1
2
(CTAP
)ij = 1 +
Jik
⇥ij
(CBA1 )ij
"
= 1+
k
X
k
Jij ,
#
t2ik
⇥ij
2
1 tik
tij
,
2
1 tij
since f1 (0, 0, t) = 1/(1 t2 ) and f2 (0, 0, t) = t/(1 t2 ).
Given that for mi = 0, the expressions for the correlation matrices are much
I report also those that can be obtained from the Plefka expansion at the third and
order:
2
3
Exactly solvable case for the
inverse Ising problem ?
•
0.30
0.25
ij
Curie-Weiss model, fully connected
nMF approximation
Jij = /(N
1)
N=10,20,40
exact correlations
with i 6= j
0.20
1.5
0.15
1.4
0.10
ii
1.3
0.05
0.2
0.4
0.6
0.8
1.0
1.2
1.1
0.2
0.4
0.6
0.8
1.0
Exactly solvable case for the
inverse Ising problem ?
•
Curie-Weiss model, fully connected
more MFA (N=20)
Jij = /(N
TAP ~ BA
0.5
0.4
1)
ij
with i 6= j
1.10
0.3
nMF
ii
3rd
1.05
0.2
0.1
0.0
1.00
0.2
0.4
0.6
0.8
4th
1.0
0.95
0.2
0.4
0.6
0.8
1.0
Exactly solvable case for the
inverse Ising problem ?
•
•
Bethe approximation on trees is ok
•
How much the paramagnetic properties of a model on a
finite size random graph are different from those of
the same model defined on a tree?
Bethe approximation on random graph is ok only far from
the critical point (as nMF for the Curie-Weiss model)
•
see recent works on finite size corrections for models
defined on random graphs (Lucibello and Morone)
Shown are typical samples of size N = 5 (the qualitative behavior does not chan
The
discrepancy between
true correlations
C and those inferred C is d
Numerical
results
on estimating
⇥
1e+02
1
correlations
2 .
(C
C
)
ij
C ⇤
ij
N2
i,j
1e+00
1e+02
1e+02
In Figure 1 and 2 I report the typical behavior of the error
C
betwee
correlation
for 5 di⇥erent MFA. Figure 1 shows
results for models
1e-02
1e+00 matrices
1e+00
2D spin glass
)
∆C
lattice, while Figure 2 refers to FC and 3D topologies. In order to comp
1e-02
1e-02
1e-04
nMF
with
the
exact
correlation
matrices
I
am
studying
here
small systems, but
et
does 1e-04
not change for larger
sizes.
2D
1e-06 ferromagnet
Although
1e-06 the quantitative
dilutedbehavior
(p=0.7) of
C
TAP
1e-04
rd
3 th
4on the specific sam
depends1e-06
BA
ments can be made:
1e-08
1e-08 1
1e-08 0.8
0.8
0
0.2
0.4
0.6
1
• naive 0MF is 0.2
typically
many
0.4the worst
0.6 MFA
0.8and shows
1
0 spurious
0.2 sin
for each peak
in
1e+02
1e+02
C );
1e+02
Matching data and MF predictions
0
B
@
•
1
Cij
..
Cij
.
1
1
0
11
C?B
A=@
ij
..
.
ij
Usually only off-diagonal elements are used
Jij =
(C
1
)ij
and diagonal elements are ignored...
NN
1
C
A
ng the MFA which can be derived from the Plefka expansion, I consider only TAP and
The corresponding expressions for the inferred couplings can be obtain
use are those performing better in the direct problem of estimating correlations (see Section
corresponding expressions for the inferred couplings2can be obtained
1 by solving the equa
2mthree
(C above
)ij = 0are ⌅(i
⇤= by
j) t
i mj JMFA
ij + problem
ij + Jdiscussed
for the
given
MFA for the 1inverse
Ising
he inverse correlation matrices C
2mi mj Jij2 + Jij + (C
pressions:
for TAP and the equation
TAP
equation
1
naiveand
MFthe(C
nMF )ij =
C
ij
)ij =
)ij = 0 ⌅(i ⇤= j)
nMF
Jij
= (C
Jjij
, i =)
t
f
(m
,
m
,
t
)
2
ij
1
ij
↵
(C )ij = ⇤1 m
=
i
2
tij f1 (mj , mi , t1ij ) tij f (mj , mi , tij )2
t⌅(1
2 )2
ij
t
⇧
ij
1= ↵
2
1
1
1
1(m , m , t )2
1
t
f
TAP (CijTAP )ijj =i ij
1
m2i
+
(1
22
2
2
J
(1
m
tij )ik 4tij (m
k )i
1
)ij
tij
4tij (mi ⌅(i
tij m
)(m
jj)
=
⇤
⇥
2
Jij j+ 2J
mi )i mj
tijij mj )(m
tijijm
,
k
for the BA, thus
leading
to
⇤
⌅
+
he BA, thus leading to
⇧
tij m=0
f1 (mj , mi , t
1
tik f2 (mk , mi , tik ) equal for
⌦
1
Bethe⌦ (CBA )ij =
1 )2
ij
2
2 f (m , m ,
1
8m
m
(C
1
2
i
j
ij
1
m
1
t
f
(m
,
m
,
t
)
1
t
TAP
1
i
j
i
k
ik
i
ij
ik
1 J8m
m
(C
)
1
=
,
i
j
ij
k
TAP
ij
Jij
=
, 4mi mj
4mi mj
⇤ ⇥ ⌅f (m1 , m2 ,⇤t)/⌅m1 and
here f1 (m1 , m2 , t)
f2 (m1 , m2 , t) ⇥ ⌅f (m1 , m2 , t)/⌅m2 .
↵
↵
1
BA
1
2 )(1
2 )(C 1 )2
BA
2
2
2
1
J
=
atanh
1
+
4(1
m
m
m
i
J
=
atanh
1
+
4(1
m
)(1
m
)(C
)
m
m
ij
i
j
ij
i
j
1
ij
i
j
ij
1
pressions one can2(C
obtain
by simply computing the inverse o
2(C correlation
)ij
)ijdirectly any
⌅
↵
↵ the same solution found by⇥2the SuscProp
Please note 1that Eq.(22)
gives
1 exactly
2
2
1 2 2
1 2
2 1
2
1 + 4(1
mi )(1
2mi mjm
(C )(C
)ij 1 ) 4(C2m)i ij
.
1m+
4(1 )ijmi )(1
m
(C
j )(C
j
j
ij
1)
2(C
2(C 1considered
)ij
rithm [9], whichij is presently
one among the best inference algorithms
dvantage
of Eq.(22) isI that
it always provides
correlation
even in those
fourth approximation
am considering
has been the
obtained
from a matrix,
small correlation
expan
The fourth approximation I am considering has been obtained from a
1
t2ij f (mj , mi , tij )2
(1
t2ij )2
4tij (mi
tij mj )(mj
tij mi )
More MFA for the
inverse Ising problem
, thus leading to
IV. METHODS FOR THE INVERSE ISING PROBLEM
⌦
1 8mi mj (C 1 )ij 1
=
,
I consider 4 4m
di⇥erent
i mj approximations for solving the inverse Ising problem. The simples
⇤
↵ approximation, already discussed in Section I and recalled he
the independent-pair
1 (IP)
2 )(1
2 )(C 1 )2
= atanh
1
+
4(1
m
m
mi mj
i
j
ij
1
2(C )ij
Independent
pair (IP) approximation
onvenience
⌅
⇥
⇥
⇧
⌃
↵
⇥2
1
(1 + m2i )(1 + mj )2 + Cij 1 2(1 mi )(1 mj )1 + Cij
1 )2
1
1
+
4(1
m
)(1
m
)(C
)
2m
m
(C
)
4(C
IP
i j
ij
⌥
i
j
ij
ij
⇥
⇥ .
J
=
ln
1
ij
2(C )ij
4
•
(1 + mi )(1
mj )
Cij
(1
mi )(1 + mj )
Cij
hmong
approximation
I am considering has been obtained from a small correlation expa
the MFA which can be derived from the Plefka expansion, I consider only TAP and
Sessak-Monasson (SM) small correlation expansion
and Monasson
[16] and has been further simplified in Ref. 27 to the following expr
•
ecause are those performing better in the direct problem of estimating correlations (see Sectio
he corresponding expressions for the inferred couplingsCcan
ij be obtained by solving the equ
SM
1
IP
Jij =
(C
)ij + Jij
(1
m2i )(1
1
2mi mj Jij2 + Jij + (C
m2j )
(Cij
)ij = 0 ⌅(i ⇤= j)
)2
.
h approximation, I measure the error in inferred couplings Jij⇥ with respect to th
r TAP and the equation
y the following expression
Numerical results for the inverse
Ising problem
10
2D ferromagnet N=72
diluted (p=0.7)
1
0.1
J
0.01
IP
TAP
SM
BA
BA norm
0.001
0.0001
0
0.2
0.4
0.6
0.8
1
1.2
Improving correlations by a
normalization trick
Mean
Field and
BP
for
SGBethe
on a 2D lattice
Characterizing and improving generalized belief propagation algorithms on the 2D Edwards–Anderson
•
In ferromagnetic models with loops,
linear response correlations in BA are
too strong because of loops, which are
“unexpected” in SuscProp
•
•
Leading to
ii
Trick: enforce3
4
>1
⎧
which is unphysical
⎪
PBethe ({si }) =
bi,j (si , sj )
⎪
⎪
⎪
⎨
i,j
1 by a normalization
ii =
Bethe
Approximation:
- Exact on Trees
⎪
⎪
⎪
- Good on sparse systems
⎪
⎩
- Message Passing (BP)
bij = p
bi (si )−c
i
ij
ii jj
Constrained minimization: b(si ) =
sj
b(si , sj )
unive
pure ferro
Alejandro Lage-Castellanos,
et.al. (1)of convergence
CVM on 2D
model
2012
Figure
4. Probability
ofEABP
and GBP La
onSapienza,
a 2DFebruary,
EA model,
random bimodal interactions, as a function of the inverse temperature β =
Make MFA & LR consistent
“Consistency is more important than truth” (S. Ting)
Add Lagrange multipliers to your referred MF free-energy
FMFA ({mi }, {Cij }, . . .)
to enforce consistency with linear response estimates
ii
=1
m2i
free energy
minimum
curvature
ij
= Cij
free energy
minimum
location
General framework (MFA + LR)
F = FMFA ({mi }, {Cij }, . . .) +
Your preferred MFA
X
i
2
i mi
+
X
ij Cij
i<j
can be set to zero to
recover known approx.
or used to satisfy
2
=
1
m
ii
ij = Cij
i
Nearest-neighbor correlation
(2D square lattice)
NN correlations
0.8
0.6
Bethe
approx.
LR
exact
ME+LR
0.4
ME
0.2
0.0
0.1
0.2
0.3
0.4
β(E[Jij(CMC)-Jij
) ])
exact 2 1/2
Off-diagonal
constraint
only
PHYSICAL REVIEW E 87, 052111 (2013)
0
exact
0.01
NMF/TAP (λ=0)
Bethe (λ=0)
Bethe (λ=0) [KR]
Bethe
Plaquette
J
0.001
0.15
0.2
0.25
0.3
0.35
β
0.4
0.45
0.5
0.55
FIG.Bethe
7. (Color+
online)
Error in inferring
couplings J for=a diluted
off-diagonal
constraints
SM
2D square ferromagnet, from statistics of 106 independent samples.
[KR] employs the Kappen-Rodriguez normalization [10].
Random field Ising model
2D square lattice
2D RFIM <h> = 0.0 σh = 0.2 β = 0.25
0.005
0.004
minfer - mexact
0.003
Bethe, no constraint
Bethe, diag. constraint
Bethe, both constraints
0.002
0.001
0
-0.001
-0.002
-0.1
-0.05
0
mexact
0.05
0.1
Random field Ising model
2D square lattice
2D RFIM <h> = 0.0 σh = 0.2 β = 0.25
1e-02
|minfer - mexact|
1e-03
1e-04
1e-05
1e-06
-0.1
Bethe, no constraint
Bethe, diag. constraint
Bethe, both constraints
-0.05
0
mexact
0.05
0.1
Random field Ising model
2D square lattice
2D RFIM <h> = 0.0 σh = 0.2 β = 0.25
1e-01
1e-02
self correlations
|Cinfer - Cexact|
1e-03
1e-04
1e-05
1e-06
1e-07
0.99
Bethe, no constraint (LR)
Bethe, no constraint (ME)
Bethe, diag. constraint (LR)
Bethe, both constraints (ME=LR)
0.992
0.994
0.996
Cexact
0.998
1
Random field Ising model
2D square lattice
2D RFIM <h> = 0.0 σh = 0.2 β = 0.25
1e-01
NNN correlations
NN correlations
|Cinfer - Cexact|
1e-02
1e-03
1e-04
Bethe, no constraint (LR)
Bethe, diag. constraint (LR)
Bethe, both constraints (ME=LR)
1e-05
0
0.05
0.1
0.15
Cexact
0.2
0.25
0.3
Input data: Configurations
•
More information than knowing only
mi = hsi i
Cij = hsi sj i
mi mj
•
In principle one can access to all higher order
correlations (but these are much more noisy)
•
Many different inference algorithms. Among these:
•
•
Adaptive cluster expansion
cluster configurations & apply MFA within any state
Pseudo-likelihood method (PLM)
•
•
For each variable define a conditional probability
P
exp[si (hi + j Jij sj )]
P
Pi (si |s\i ) =
2 cosh(hi + j Jij sj )
Maximize the local log-likelihood Li = hlog Pi (si |s\i )i =
hi mi +
X
j
Jij (Cij + mi mj )
hlog 2 cosh(hi +
to estimate hi and Jij
•
•
X
Jij sj )i
j
Note that for each coupling Jij PLM returns 2 estimates
Better maximizing P L(h, J ) =
X
i
Li
PLM vs. MFA
10
2D ferromagnet N=72=49
diluted (p=0.7)
M=5000 samples
1
J
0.1
0.01
IP
TAP
SM
BA
PLM
BA norm
0
0.2
0.4
0.6
0.8
1
Inferring topology in sparse models
•
•
For simplicity let’s assume
•
•
|Jij | 2 {0, }
non-zero couplings are sparse
Maximize L1-regularized pseudo-likelihood
P L (h, J ) =
X
i
•
Li
X
ij
|Jij |
L1-regularization gives a bias to the estimates!
field
s the
arse
work
d by
the
ably
oved [2]
ow a
s we
icult
perated,
not
es is
n the
and J 1 is impossible (central panel of Fig. 1). For sparse
models, the use of the l1 regularization may induce a new
gap (see right panel of Fig. 1), but this l1 -induced gap does
Couplings
inferred
PLM and the
not always
separates correctly
J1 from Jby
0 couplings
β=0.5
1e-4
β=0.9
1e-2
1
1e-4
1e-2
β=0.9
L1 regularized
1
1e-12 1e-8 1e-4
1
log(J inferred)
Isingonline).
modelHistograms
(30% dilution)
M=4500
FIG. 2D
1 (color
of couplings
inferred by
PLM in a typical sample of 2D ferromagnetic Ising model with
30% dilution (M ¼ 4500). Left: at β ¼ 0.5 a clear gap identifies
Decimation procedure
•
•
•
No L1-regularizer -> no bias
•
Maximize again P L(h, J ) only on couplings still not set
to zero (this is impossible within a MFA)
•
Iterate until maximum of P L(h, J ) starts decreasing
“sensibly”
Maximize P L(h, J )
Set to zero a constant fraction of couplings
(those inferred to be the smallest)
o
y
d
n
f
s
0.6
tPLF
Error
1
0.8
0.4
0.6
0.58
0.56
0.54
0.52
0.5
0.6
0.4
0.2
40
60
Error
e
e
e
1.2
tPLF
EVIEW
week ending
PLM
+
decimation
LETTERS
21 FEBRUARY 2014
80 100 120
0.2
0
0
200
400
600
800
Number of non-decimated couplings
1000
be very well compared to the standard PLM with l1
regularization and δ thresholding by plotting the corresponding ROC PLM
curves +
(see
Fig. 3). Each ROC curve is
decimation
1
0.95
True Positive Rate
ihe
n
st
se
ll
he
re
le
of
on
is
we
on
m
to
F
re
0.9
0.85
0.8
0.75
0.7
PLM+L1 δ=1e-8
PLM+L1 δ=1e-6
PLM+L1 δ=1e-4
PLM+L1 δ=1e-2
PLM+L1 δ=1e-1
PLM+Decimation
STOP point
0.65
0.6
0.95
0.96
0.97
0.98
0.99
1
True Negative Rate
FIG. 3 (color online).
ROC curves for a typical sample of the
Some conclusions
•
•
Mean field approximations
•
•
•
Inverse problem harder than direct problem
Requires (at least) improvement in the direct problem
Fundamental problem of going beyond Bethe and trees...
Pseudo-likelihood method
•
•
Better performances in general
Specially well suited for inferring topology in sparse
models via L1-regularization, thresholding or decimation

Download Report