スライド 1 - Watanabe Lab.

菊池自由エネルギー
に対するCCCPアルゴリズ
ムの拡張
東京工業大学総合理工学研究科
知能システム科学専攻 渡辺研究室
西山 悠, 渡辺澄夫
Outline
High-dimensional distribution
marginals
O( m d )
Functional
Bethe free energy

Kikuchi free energy
Optimization techniques
BO(M. Welling, et. al., 2001)
CCCP(A. L. Yuille, 2002)
SEQ(Tonosaki, et. al., 2007)
(Approximate) marginals

New CCCP (NCCCP)
Performance of CCCP
CCCP

guarantees to monotonically decrease Kikuchi
free energy unlike belief propagation.

requires huge computational cost compared to
belief propagation.

does not always converge for synchronous
update of inner loop.
Contribution
Purpose
We extend CCCP algorithm for Kikuchi free energy
and present a new CCCP (NCCCP) algorithm.
NCCCP algorithm

includes conventional CCCP

guarantees to monotonically decrease Kikuchi free
energy.

is more stable even for synchronous inner loop.

can reduce huge computational cost underlying
CCCP.
Key Idea

NCCCP is based on the generalization of concave
and convex decomposition in Kikuchi free energy.
convex
concave
Kikuchi free energy =
+
=
+
=
+
Parameters
change the
decomposition.
Background
G  (V , E )
High-dimensional distribution
1
p( x1 ,, xd )  Wij ( xi , x j )
Z {ij}B
Marginals
x2
x1
x6
W23
W34
x4
1
p (x )   Wij ( xi , x j )
x \ x Z {ij}B
W12
x3
W35
W56
x5
(  V )
Direct calculation is intractable.
e.g.
Maximization of the Posterior Marginals (MPM) estimation
Free Energy
Minimization of Kullback-Leibler distance from a trial
distribution r (x) to the target distribution p (x) is equivalent to
that of the following functional with respect to r (x) :
F (r )   S (r ) 

{ij}B
Eij
rij
  log Z .
(1)
Helmholtz Free Energy
Entropy S (r ) is
S (r )   r (x) log r (x)
x
Energy Eij is
Eij   log rij ( xi , x j )
Kikuchi Free Energy
Kikuchi free energy, which is defined on set of regions R , is
given by as follows:
FKikuchi ({r }R )    c S (r )   c E
 R
Entropy term consists
of weighted sum of
entropies S in R
Overcounting number
c  1 
c
c ,



sup( )
 R
r
Energy term in (1) is exactly
rewritten as this form.
is given by
where
c  1 for basic clusters  .
Bethe Free Energy
When the set of regions R is given by R  {V , E}
FBethe ({ri }iV ,{rij }{ij}E ) 

S
{ij}B
ij
  (Ni  1) Si   c E
iV
Bethe free energy
R
r
Example (1)
1
2
3
Basic clusters
C  {1245, 2356, 4578, 5689}
4
5
6
Set of regions R
7
8
1245 2356 4578
25
R  {C, 25, 45, 56, 58, 5 }
9
45
56
5
Region Graph
5689
58
Sub (5689)  {56, 58}
Sup (56)  {2356, 5689} Sub (56)  {5}
Sup (5)  {25, 45, 56, 58, C}
Overcounting number
1245 2356 4578
25
45
56
5689
58
c1245  c2356  c4578  c5689  1
c25  c45  c56  c58  1
( c  1 
c5  1
( c5  1 
5
c  1  2)



sup( )


c  1  4  4)
sup( 5)
Kikuchi free energy
FKikuchi ({r }R )  S1245  S2356  S4578  S5689  S25  S45  S56  S58  S5
  c E
R
r
Example (2)
Basic clusters
1
2
4
C  {12, 23, 14
, 25, 36, 45, 56}
3
5
Set of regions R
6
R  {C, 1
, 2, 3, 4, 5, 6 }
12 23 14 25 36 45 56
1
2
3
4
Region Graph
5
6
Overcounting number
c  1 (  C )
c1  c3  c4  c6  1
c2  c5  2
Bethe free energy
FBethe ({r }R )  S12  S23  S14  S25  S36  S45  S56
 S1  2S2  S3  S 4  2S5  S6   c E
 R
r
Concave Convex Procedure (CCCP)
CCCP algorithm guarantees to monotonically decrease the
function(al) F (r)  Fvex (r)  Fcave (r) , where Fvex (r) is convex
and Fcave (r) is concave, by the update Fvex (r (t 1) )  Fcave (r (t ) ).
Kikuchi free energy consists of such functions.
Conventional CCCP
FK ,vex ({r }R )    cmaxS (r )   cmax E
R
R
r
FK ,cave ({r }R )    (c  cmax )S (r )   (c  cmax ) E
R
R
r
(cmax  max c )
 R
Double loop algorithm = Outer loop + Inner loop
Main Results
Trivial pair creation
0

f vex (r)
 f vex (r)
New CCCP (NCCCP)
FK ,vex (r)  FKikuchi (r)  fvex (r)
FK ,cave (r)   f vex (r)
Particularly, let convex funtional f vex (r) be of the form
f vex (r)    uS , S (r )   uE , E
R
R
r
.
Free parameters
Parameters {uS , }, {uE , } satisfy
uS ,  max{0,c },

uE,  R
(  R).
FKikuchi (r)  FK ,vex (r, u)  FK ,cave (r, u)



A high-dimensional vector
u  {uS , }
changes the decomposition.


CCCP
U

0
U  {u  R|R| | u  max{0,c },  R}
NCCCP Free Energy
The update rule Fvex (r (t 1) )  Fcave (r (t ) ) is obtained by
differentiating NCCCP free energy subject to normalization
and consistency properties of marginals.
NCCCP free energy
FK ,CCCP (r(t 1) | r(t ) , u)  Fvex (r(t 1) , u)  r(t 1)  Fcave (r(t ) , u)
Lagrange functional
LK ,CCCP (r
(t 1)
| r , u)  FK ,CCCP
(t )

  v ( r(t 1)  1)

R
x
( t 1)
( t 1)

(
x
)(
r

r
)
  ,   

 R  sub d ( ) x 
x \ 
NCCCP Algorithm for Kikuchi (1/2)
Theorem (Outer Loop of NCCCP)
Outer loop of NCCCP algorithm for minimizing Kikuchi free
energy is given by as follows:
( t 1)
k
r
uk
( t ) ck  u k
k
 {r }
exp[ v~k 

 sub d ( k )
 k ,  (x  )
ck  u k


 sup d ( k )
 , k ( x k )
ck E k

],
ck  u k ck  u k
k  R.
Here
{v~k }, { , (x  )} :Lagrange multipliers
{ck }
{uk }
R
:overcounting numbers
:free parameters
:set of regions
NCCCP Algorithm for Kikuchi (2/2)
Theorem (Inner Loop of NCCCP)
Inner loop of NCCCP is given by as follows:
uk
( t ) ck  u k
k
exp( v~k( 1) )   {r }
exp[ 

 k(, ) (x  )
 sub d ( k )
xk
ck  u k


(,k) (x k )
ck E k

],
ck  u k ck  u k
 sup d ( k )
k  R,
1
1
exp{(

)k(,l1) (xl )} 
ck  uk cl  ul

x k \l
uk
( t ) ck  u k
k
{r }
ul
( t ) cl  ul
{rl }
exp[v~
( )
k


 k(,) (x  )
 sub d ( k ) \ l
exp[v~l( ) 


sub d ( l )
ck  u k
( )
l ,
 (x  )
cl  ul


 sup d ( k )


 sup d ( l ) \ k
(,k) (x k )
ck  u k
( )
 ,l
 (xl )
cl  ul


ck E k
]
ck  u k
,
cl El
]
cl  ul
k  R, l  subd (k ).
Remark
NCCCP algorithm guarantees to monotonically decrease
Kikuchi free energy even if free vector u is changed within U
(t )
u
at every outer loop (i.e,
).
NCCCP algorithm (u U )
U
(t )
Outer loop u
u
u  cmax  c
u ( 2)
(1)
u (t )
(1)
u (1) Inner loop
Outer loop u ( 2 )
u ( 2)
Inner loop
( 2)
u
Outer loop
Approximate marginals
u (1)
CCCP
u (1)
u ( 2)
u (t )
Update Manner of Inner Loop
Asynchronous update
time 
time   1
time   2
CCCP guarantees to converge.
Synchronous update
time 
time   1
time   2
CCCP does not always converge.
NCCCP converges.
Role of the free vector
When | u | is small
When | u | is large
Outer loop
Outer loop
Outer loop
u
slow
or
convergence
fast
Outer loop
fast
Outer loop
slow
Outer loop
or
convergence
fast
Outer loop
fast
Outer loop
Approximate marginals
Approximate marginals
There exists an optimal series of vector u

.
Example: Gaussian Distributions
We consider the special case where
Kikuchi free energy is Bethe free energy (i.e., R  {V , E}).
the target distribution is Gaussian distributions.
| S |1 2
1 T
exp{ x Sx}
Target distribution: N (0, S ) 
d
2
( 2 )
Marginals and Lagrange multipliers:
ri ( xi )  N (0, i )
~
rij ( xi , x j )  N (0, Sij )
exp{ ( x )}  N (0,~ )
ij ,i
i
i {1,, d}
ij ,i
exp{ij,i ( xi )}  N (0,~ij,i )
{ij} B
NCCCP for Bethe free energy
Theorem (Outer loop in Gaussian distributions)
ui
1
(it ) 
i {1,, d },
ij,i ,
ui  | Ni | 1
ui  | Ni | 1 jNi
si ,i  ~ij,i
si , j 
~
{ij}  B.
 
~ ,
~
s
s


j ,i
j, j
ij , j 

(it 1) 
~
Sij(t 1)
Theorem (Inner loop in Gaussian distributions)


( 1)
ij ,i
( 1)
ij , j
Here
si2, j
ui  | Ni | 1
ui
1
1

{


(it ) 
ik(,i) },

1
ui  | Ni | 2 | Ni |
ui  | Ni | 1 kNi \{ j}
 ij(,i ) ui  | Ni | 1
| Nj |
u j  | N j | 1
si2, j
uj
1
1

{


(tj ) 
 jk(,)j },

1
u j  | N j | 2 | N j |
u j  | N j | 1 kN j \{i}
 ij(,j) u j  | N j | 1
| Ni |
ui
satisfies ui | Ni | 1.
{ij}  B.
{ij}  B.
Abbreviated Case
We consider the special case where inverse covariance
matrix S satisfies
Ni  N ,
i {1,, d },
si, j  s ,
{ij}  B.
Corollary (Outer and Inner loops which satisfy (2))
Outer loop
(t 1) 
u
|N|
(t ) 

u  | N | 1
u  | N | 1
Inner loop

( 1)
s2
u  | N | 1
1
u
| N | 1 ( )

{


(t ) 
 }
1
u  | N | 2 | N |
u

|
N
|

1
u

|
N
|

1
( )

|N|
Here u | N | 1.
( 2)
Stability
Expansion of inner loop around the fixed-point 

( 1)
*
s2
| N | 1
u  | N | 1
 {

} ( )  O(( ( ) ) 2 )
u  | N | 2
u  | N | { 1   *}2
|N|
Necessary condition for the convergence of inner loop: u  2 | N | 3
Conventional CCCP (u | N |) : | N | 3
The condition is very tight.
CCCP does not converge for the dense graph.
NCCCP u  2 | N | 3
The condition can always be satisfied for arbitrary
dense graphs by making parameter u large.
Numerical Results (Synchronous)
CCCP
Convergence
S (10, 0.1, 4)
S (10, 0.1, 5)
S (10, 2, 5)
S (10, 0.1, 10
)
Numerical Results (Asynchronous)
CCCP
S (10, 0.1, 5)
S (10, 2, 5)
S (10, 0.1, 10
)
S (10, 0.1, 20)
Conclusion
We presented a new CCCP (NCCCP) algorithm for
Kikuchi free energy.
NCCCP algorithm

includes conventional CCCP

guarantees to monotonically decrease Kikuchi free
energy.

is more stable even for synchronous inner loop.

can reduce huge computational cost underlying
CCCP.
Future works

To design efficient NCCCP algorithm based on
the optimality of free parameters.

To apply NCCCP algorithm to practical problems
such as CDMA multi-user detection problems or
decoding algorithm for LDPC codes.
Reference
M. Welling and Y. W. Teh proceeding of UAI (2001)
A. L. Yuille Neural Computation 14 1691 (2002)
Y. Tonosaki and Y. Kabashima Interdisciplinary Information Sciences 13 57 (2007)
T. Shibuya et. al. IEICE Trans. On Fundamentals E88-A5 1346 (2005)