Yakir-Vizel-Lecture1-Intro_to_SMT
Download
Report
Transcript Yakir-Vizel-Lecture1-Intro_to_SMT
Introduction to
Satisfiability Modulo Theories
Yakir Vizel
SMT Seminar
Spring 2014
Outline
• Background
– Propositional logic
– DPLL-style SAT solvers
• The SMT World
– First order logic
– Theories
• SMT Solving
– Eager approach
– Lazy approach
– DPLL(T)
Motivation
• Are these two programs equal?
int power3(int in) {
int i, out_a;
out_a = in;
for (i=0; i<2; i++)
out_a = out_a * in;
return out_a; }
int power3_new(int in) {
int out_b;
out_b = (in * in) * in;
return out_b; }
• In general, this is an undecidable
problem
– No sound and complete method
Motivation (2)
• Only bounded loops in our example
out0_a = in ⋀
out1_a = out0_a * in ⋀
out2_a = out1_a * in
out0_b = (in * in) * in
Need to prove that ja ⋀ jb out2_a = out0_b is
a valid formula
Basic Definitions
• Assignment to a formula j is a mapping from
the variables of j to a domain D
– (a ⋁ b) c: a -> 1, b -> 0, c -> 1
• Given a formula j
– it is satisfiable if there exists an assignments s.t. j
is evaluated to TRUE
– it is contradiction if such an assignment does not
exist
– it is a tautology (or valid) if under every
assignment it evaluates to TRUE
Decision Procedure
• The Decision Problem for a given formula j is
to determine whether j is valid
• So clearly, a Decision Procedure…
–We want it to be Sound and Complete
• Sound = returns “Valid” when j is valid
• Complete = terminates, and when j is valid, it returns
“Valid”
• A Theory is decidable if there is a decision
procedure for it
Formal Reasoning
• We want to reason about the validity (or
satisfiability) of a formula
• Model Theoretic approach
– Enumerate possible solutions
– Need a finite number of candidates
• Proof Theoretic approach
– Deductive mechanism
– Based on axioms and inference rules
• Forms an inference system
Formal Reasoning
• Three statements
– If x is a prime number greater than 2, then x is odd
– It it not the case that x is not a prime number greater
than 2
– x is not odd
• Prime number greater than 2 – A
• Odd number - B
AB
A
B
Inference Rules
j1 j2
j1
j2
(M.P)
¬¬j j
j
¬j
FALSE
(CONTR)
(DOUBLE NEG)
Proof Theoretic
AB
A
B
1. A B (premise)
2. A (premise)
3. A (2, DOUBLE NEG)
4. B (premise)
5. B (1+3, M.P)
6. FALSE (4+5, CONTR)
¬¬j j
j1 j2
(DOUBLE NEG)
j1
j2
(M.P)
Model Theoretic
AB
A
B
A
B
AB
(AB) ⋀ A
(AB) ⋀ A ⋀ ¬B
0
0
1
0
0
0
1
1
0
0
1
0
0
0
0
1
1
1
1
0
Propositional Logic
• Logic:
– A (formal) language for making statements about
objects and reasoning about properties of these
objects
• The alphabet:
– Countable set of proposition symbols: a, b, c, …
– Logical connectives: (or), ⋀ (and), ¬
(not/negation), ≣ (equivalence), (implication)
• Connectives can be expressed using other connectives
• (a b) is the same as (¬a ⋁ b)
Propositional Logic (2)
• Formula is defined using the following
grammar:
– formula: formula ⋀ formula | ¬formula | (formula) |
atom
– atom: Boolean variable | TRUE | FALSE
• An example to a data structure that uses this
grammar: And-Inverter Graph (AIG)
⋀
⋀
a
b
c
Normal Forms
• Negation Normal Form
– Use only or, and and not
– ¬ appears only on proposition symbols (variables)
• Conjunctive Normal Form
– literal: a or a
– clause: disjunction of literals
– CNF: conjunction of clauses
DPLL-style SAT solvers
GRASP, CHAFF, MiniSAT, Glucose
• Objective:
– Check satisfiability of a CNF formula
• Given a CNF formula, is there a satisfying assignment?
• Approach:
– Branch: make arbitrary decisions
– Propagate implication graph
– Use conflicts to guide inference steps
SAT solvers can also generate refutation proofs!
Pseudo Code
DPLL(j)
AddClauses(cnf(j));
if (BCP() == “confl”) return UnSAT;
while (true) do
if (!Decide()) return SAT;
while(BCP() == “confl”) do
bl = Analyze();
if (bl < 0) return UnSAT;
else BackTrack(bl);
The Implication Graph (BCP)
(a b) (b c d)
a
c
b
d
Decisions
Assignment: a b c d
Propositional Resolution
a b c
a c d
b c d
When a conflict occurs, the implication graph is
used to guide the resolution of clauses, so that the
same conflict will not occur again.
Conflict Clauses
(a c) (a b) (b c d) (b d)
Decision
a
c
resolve
(b c )
b
Conflict!
Conflict!
(a c)
d
resolve
Conflict!
Assignment: a b c d
Conflict Clauses
c (a c) (a b) (b c d) (b d)
a
c
(b c )
b
(a c)
d
(c)
()
Generating refutations
• Refutation = a proof of the null clause
– Record a DAG containing all resolution steps
performed during conflict clause generation.
– When null clause is generated, we can extract a
proof of the null clause as a resolution DAG.
Original clauses
Derived clauses
Null clause
Formal Reasoning – Which?
• Model Theoretic
– The SAT solver tries to find an assignment
• In a way enumerating assignments
– Makes arbitrary decisions for variables
• Proof Theoretic
– The learning scheme when hitting a conflict
– Uses Resolution as an inference rule
• The only inference rule
v⋁A
• SO… Combination of the two…
¬v ⋁ B
A⋁B
(RES)
Circuit to SAT
Can the circuit output be 1?
(a g) (b g)
(a b g)
a
b
c
input
variables
g
p
output
variable
CNF(p)
(g p) (c p)
(g c p)
p is satisfiable when the
formula CNF(p) p
is satisfiable
Questions?
• Is a SAT Solver a Decision Procedure?
• Yes!
– But it looks for an assignment…?
– Yes, but can prove the lack of one
• So, in order to prove the validity of j
– Prove ¬j is a contradiction
First Order Logic
• The alphabet:
– Countable set of symbols: a, b, c, …
– Logical connectives: (or), ⋀ (and), ¬
(not/negation), ≣ (equivalence), (implication),
quantifiers (∀,∃),
– Non-logical symbols
• functions (F(x1,…,xn))
• Constants (1,2,3,…)
• Predicate symbols (x > 0)
First Order Logic
• The Decision Problem for FOL is undecidable
– There is no sound and complete decision procedure
for it
• Several approaches
– Limit the syntax
• Only FOL formulas without quantifiers or function symbols
– Limit the possible models (assignments)
• Not always easier
– Show validity for a given Theory
• Propositional logic is a “subset” – or formally – a Theory
Equality and Uninterpreted Functions
• Or in short EUF
• Based on
– (x1 = y1 … xn = yn) f(x1,…,xn) = f(y1,….,yn)
• NP-Complete problem
– Meaning Decidable
– Polynomial reduction to propositional problem
Equality and Uninterpreted Functions
• Recall the motivating example
out0_a = in ⋀
out1_a = out0_a * in ⋀
out2_a = out1_a * in
out0_b = (in * in) * in
out0_a = in ⋀
out1_a = F(out0_a, in) ⋀
out2_a = F(out1_a, in)
out0_b = F(F(in, in), in)
• Suitable for any function F
Linear Arithemetic
• Or LIA
• Formulas of the form
– 3x + 2y ≤ 5z ⋀ 2x -2y = 0
– Equality is a fragment of LIA
• Usage
– Code optimizations performed by compilers
Arrays
• Operations on array data-structure
– A map from an index to an element
• Basic operations
– Read an element from an index i
– Writing an element to index i
Satisfiability Modulo Theories
• Objective:
– Given a theory T check satisfiability of a T-formula
• Approach:
– Eager encoding
• Reduce to a SAT problem
– Lazy encoding
• Intro to DPLL(T)
– DPLL(T)
Reduction to SAT
• SMT Solving is based on the observation that
a T-formula can be reduced to SAT
• (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))
– Use Boolean variables for each conjunct
•
•
•
•
a – (x-y≤0)
b – (y-z≤0)
c – (z-x≤-1)
d – (z-x≤-2)
– a ⋀ b ⋀ (c ⋁ d)
Eager Encoding
• Reduction to SAT clearly does not capture the
semantics
• Eagerly encode the semantics into the formula
• In our example, we have four T-atoms: a,b,c,d
– Need to capture the relation between the atoms:
•
•
•
•
•
Are a and b consistent?
Are a and ¬b consistent?
…
Are a and b and c consistent?
…
– Can lead to 2n relations – Exponential!
Eager Encoding - Example
• Back to our examples
– (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))
– a ⋀ b ⋀ (c ⋁ d)
• Clearly (x-y≤0), (y-z≤0) and (z-x≤1-) cannot be
all TRUE
– Thus we can add ¬(a ⋀ b ⋀ c)
• In a similar manner we add
– ¬(¬a ⋀ ¬b ⋀ ¬c)
– ¬(a ⋀ b ⋀ d)
– ¬(¬a ⋀ ¬b ⋀ ¬d)
Eager Encoding – Example (2)
• The resulting formula
– (a ⋀ b ⋀ (c ⋁ d)) ⋀ ¬(a ⋀ b ⋀ c) ⋀ ¬(¬a ⋀ ¬b ⋀ ¬c) ⋀
¬(a ⋀ b ⋀ d) ⋀ ¬(¬a ⋀ ¬b ⋀ ¬d)
• The formula is then passed to a SAT solver
– If it is satisfiable, then so is the original formula
– Same goes for unsatisfiability
• Redundancy?
– ¬(¬a ⋀ ¬b ⋀ ¬d)
– ¬(¬a ⋀ ¬b ⋀ ¬c)
Schematics
SMT formula
CNF formula
T-Solver
CNF + Relations
Encoder
SAT-Solver
SAT/UnSAT
Pseudo Code
Eager(j)
R = T-Solver(lit(j));
B = e(j) ⋀ e(R);
while (true) do
<a, res> = SAT(B);
if (res == “UnSAT”) return UnSAT
else return SAT;
Eager Encoding
• Pros
– Easy to implement
• SAT solver is used as a black-box
– Works well for bit-vectors theory
• Cons
– Encoding may get too big to handle
– Search starts only after entire problem is
translated
Lazy Encoding
• Recall: reduction to SAT does not capture the
semantics
• Lazily encode the semantics into the formula
– How?
• Less redundancies
Lazy Encoding – How it works
• Our example
– (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))
– a ⋀ b ⋀ (c ⋁ d)
• Start by trying to solve the propositional
formula
– If UnSAT – done
– Otherwise, get the assignment
• a -> TRUE, b -> TRUE, c -> TRUE, d -> FALSE
– Check if the assignment hold in the theory
• If yes, a true assignment
• Otherwise, block the assignment: ¬(a ⋀ b ⋀ c ⋀ ¬d)
– Solve again
Schematics
SMT formula
Encoder
CNF formula
blocking clause
T-Solver
SAT
assignment
SAT-Solver
UnSAT
Pseudo Code
Lazy(j)
B = e(j);
while (true) do
<a, res> = SAT(B);
if (res == “UnSAT”) return UnSAT
else
<t,res> = T-Solver(T(a))
if (res == “SAT”) return SAT;
B = B ⋀ e(t)
Lazy Encoding
• Pros
– Easy to implement
• SAT solver is used as a black-box
• Need to implement T-Solver
• Cons
– SAT Solver must finish entirely before Tinconsistency is checked
– SAT Solver restarts (at the beginning) if last
assignment failed
How Can We Improve?
• The SAT Solver and T-Solver are separate
– Each works as a black-box
• Try to bring them closer together
– One should help the other
Incrementality
• Problem
– SAT Solver restarts (at the beginning) if last
assignment failed
• Keep the state of the SAT solver, and simply
resume
– After a full assignment is found, check with TSolver
– If not a real assignment, add a clause and BCP
Pseudo Code
Lazy-DPLL(j)
AddClauses(cnf(e(j)));
if (BCP() == “confl”) return UnSAT;
while (true) do
if (!Decide())
<t,res> = T-Solver(T(a))
if (res == “SAT”) return SAT;
AddClause(e(t));
while(BCP() == “confl”) do
bl = Analyze();
if (bl < 0) return UnSAT;
else BackTrack(bl);
else
while(BCP() == “confl”) do
bl = Analyze();
if (bl < 0) return UnSAT;
else BackTrack(bl);
Use Partial Assignment
• Problem
– SAT Solver must finish entirely before Tinconsistency is checked
• Try to find a T-inconsistency using only a
partial assignment
– Send partial assignment to T-Solver after k
decisions
• May be hard to find the optimal k
Use Partial Assignment
• Consider a formula that has (10 ≤ x) and (x < 0)
(represented by v and u respectively)
– v and u are both assigned to TRUE
– Every call to the T-Solver results in UnSAT
• May also be due to other reasons
• Thus, when these two are assigned TRUE we
want to block all possible assignments
– By adding ¬(v ⋀ u)
Theory Propagation
• How do we improve more?
• Deeper integration
• T-Solver uses current partial assignment to
deduce values for other literals
– Similar to BCP, but using the Theory
• Improves performance dramatically
• But makes conflict analysis hard
Theory Propagation
• Consider a formula that has (10 ≤ x), (x < 0),
(x ≤ y) and (5 ≤ y)
• By knowing that (10 ≤ x) and (x ≤ y) are
assigned to TRUE, the T-Solver can deduce
facts
(10 ≤ x)
¬(x < 0)
(x ≤ y)
(5 ≤ y)
Decisions
Pseudo Code
DPLL_T(j)
AddClauses(cnf(e(j)));
if (BCP() == “confl”) return UnSAT;
while (true) do
if (!Decide()) return SAT;
repeat
while(BCP() == “confl”) do
bl = Analyze();
if (bl < 0) return UnSAT;
else BackTrack(bl);
<t,res> = T-Solver(T(a))
AddClause(e(t));
until (t == true)
Schematics
SMT formula
Encoder
CNF formula
T-Solver
SAT-Solver
UnSAT/SAT
Combining Theories
• We have seen how to solve a specific theory
• But what if we have
– 3x + 2y ≤ f(z) ⋀ f(x) -2y = 0
– LIA_UF
• Assuming we have a decision procedure for
each theory
– There is a procedure, called Nelson-Oppen
– Other conditions as well
Advanced Topics
• Proof generation
– We want something similar to a Resolution Proof in
the propositional case
• Interpolants generation
– Specific methods for different theories
– Based on deduction rules
• Specific usage in model checking
Questions?