Sampling Techniques for Boolean Satisfiability

Sampling Techniques for Boolean Satisfiability
Supratik Chakraborty
1
2
1
2
Kuldeep S. Meel
Moshe Y. Vardi
Indian Institute of Technology Bombay, India
Department of Computer Science, Rice University
Two Classical Problems
Experimental Results
Uniformity comparison for case110
ApproxMC performance comparison ( = 0.75, δ = 0.9)
500
US
UniGen
400
50000
Time
350
300
40000
30000
250
20000
200
0
0
100
0
160
180
guarantees but poor performance
• Heuristic approaches (e.g., XORSample0[Gomes 2006], MiniCount[Kroc 2008]):
strong performance but weak guarantees
200
220
240
260
280
300
320
{0, 1}m.
• For each α1, . . . αr ∈ {0, 1}m and for all distinct y1, . . . yr ∈ {0, 1}n, we have
R
Pr [∧ri=1h(yi) = αi : h ←
−
H(n, m, r) = 2−mr .

distinguishable in practice
• Indicates the conservative nature of theoretical guarantees
Work supported in part by NSF grants CNS 1049862 and CCF-1139011, by NSF Expeditions in Computing project
“ExCAPE: Expeditions in Computer Augmented Program Engineering,” by BSF grant 9800096, by a gift from Intel, by
a grant from Board of Research in Nuclear Sciences, India, and by the Shared University Grid at Rice funded by NSF
under Grant EIA-0216467, and a partnership between Rice University, Sun Microsystems, and Sigma Solutions, Inc
ApproxMC
Cachet/1.75
Cachet*1.75
1.0e+03
1.0e+02
0
5
10 15 20 25
Benchmarks
30
35
Quality of counts computed by
ApproxMC
Theoretical Results
Use of r-wise independent hash functions to randomly partition the space
of all solutions into “small” cells
RF : Solution space
Uniformity: If ε > 6.84, then for every y ∈ RF , we have
1
1
≤ Pr [UniGen(F, ε) = y] ≤ (1 + ε)
.
(1 + ε)(|RF | − 1)
|RF | − 1
Small :
ApproxMC(CNF:F, tolerance:ε, confidence:δ)
n-independent
Hashing
3-independent
Hashing
Random
Suppose ApproxMC(F, ε, δ) returns c Then
Pr (1 + ε)

Publications and Acknowledgements
[3] S. Chakraborty, K.S. Meel, and M.Y. Vardi. A Scalable Approximate Model
Counter. In Proceedings of CP’13.
1.0e+04
theoretical guarantee of 75%
• Large class of problems that lie beyond the exact counters but can be
computed by ApproxMC

[2] S. Chakraborty, K.S. Meel, and M.Y. Vardi. Balancing Scalability and
Uniformity in SAT-Witness generator. In Proceedings of DAC’14.
1.0e+05
10 15 20 25 30 35 40
Benchmarks
Performance comparison between
ApproxMC and Cachet
Central Idea


[1] S. Chakraborty, D.J. Fremont, K. S. Meel, S.A. Seshia, and M. Y. Vardi.
Distribution-Aware Sampling and Weighted Model Counting for SAT. 2014.
1.0e+06
• Approximately 4% error in practice – much smaller than the
• F : CNF formula with n variables, RF : Set of all solutions of F
• H(n, m, r): Family of r-wise independent hash functions mapping {0, 1}n to
1.0e+07
Sampling by UniGen
Scalable Nearly Uniform Generator (UniGen) and Approximate Model Counter
(ApproxMC) with provable guarantees.
Mathematical Formulation
5
1.0e+08
Count
• Distributions from UniGen and perfect sampler () are hardly
Our Contributions
Cachet
ApproxMC
10000
50
• Theoretical approaches (e.g., BGP [Bellare 1998], Cachet [Sang 2004]): strong
1.0e+09
60000
150
Prior Work
1.0e+10
70000
# of Solutions
450
# of Solutions
Given: A CNF-SAT formula F
• Uniform Generation: Sample solutions uniformly without enumerating all
the solutions
I Applications in directed random testing
• Approximate Counting: Approximate the number of solutions without
enumerating all the solutions
I Applications in Probabilistic Inference, Markov logic networks etc.
2
−1
· |RF | ≤ c ≤ (1 + ε) · |RF | ≥ δ


Our
Approach
BGP
Approach
Partitioned space
All cells are “small”
Partitioned space
A random cell is “small”
• Employs XOR-based hash functions instead of computationally
infeasible algebraic hash functions.
• Uses off-the-shelf SAT solver CryptoMiniSAT
• Requires only a randomly chosen cell to be “small” instead of all
cells to be “small”
• Reduces “small” cell size from 2n2 to 2n1/k , k ≥ 1 - scales to large
instances
Conclusion and Future Work
• Nearly-Uniform generator UniGen scales to SAT problems with tens of
thousands of variables
• ApproxMC is the first scalable (scales to tens of thousands of variables)
approximate counter for CNF formulae
• Possibly stronger guarantees for uniform generation by combining
counting and hashing
• Extension of hashing technique to weighted model counting and uniform
generation in SMT domain