
Artificial Intelligence
Inference in first-order logic
Fall 2008
Professor: Luigi Ceccaroni
Role of first-order logic (FOL)
• Knowledge representation (KR) is an
interdisciplinary subject that applies
theories and techniques from three
fields:
– Logic provides the formal structure and
rules of inference.
– Ontology defines the kinds of things that
exist in the application domain.
– Computation supports the applications that
distinguish KR from pure philosophy.
Syntax of FOL
Effective procedures for answering
questions posed in FOL
• More or less anything can be stated in FOL.
• It is important to have algorithms that can
answer any answerable question stated in
FOL.
• Three major families of first-order inference
algorithms:
– forward chaining and its applications to
deductive databases and production
systems
– backward chaining and logic programming
systems
– resolution-based theorem-proving systems
Forward chaining
• Forward chaining can be applied to first-order definite clauses.
• First-order definite clauses are
disjunctions of literals of which exactly one
is positive.
• Definite clauses such as Situation 1 ∧
Situation 2 ⇒ Response are especially
useful for systems that make inferences in
response to newly arrived information.
Forward chaining
• The idea:
– start with the atomic sentences in the knowledge
base
– apply Modus Ponens (A, A⇒B |- B) in the
forward direction, adding new atomic sentences,
until no further inferences can be made
• Reasoning with forward chaining can be
much more efficient than resolution theorem
proving.
• This idea needs to be applied efficiently to
first-order definite clauses.
First-order definite clauses
• Consider the following problem:
– The law says that it is a crime for a Spaniard to
sell weapons to other nations. The country
Israel has some missiles and all of its missiles
were sold to it by Rodríguez Zapatero, who is
Spanish.
• To prove that Rodríguez Zapatero is a
criminal, first, we have to represent the
facts as first-order definite clauses.
• Then we can solve the problem with
forward-chaining.
First-order definite clauses
• “... it is a crime for a Spaniard to sell weapons
to other nations”:
Spanish(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧
NotInSpain(z) ⇒ Criminal(x).
• “...Israel has some missiles” is transformed
into two definite clauses by Existential
Elimination, introducing a new constant
AGM12:
Owns(Israel, AGM12)
Missile(AGM12)
First-order definite clauses
• “All of its missiles were sold to it by Rodríguez
Zapatero”:
Missile(x) ∧ Owns(Israel, x) ⇒ Sells(ZP, x, Israel).
• We need to know that missiles are weapons:
Missile(x) ⇒ Weapon(x)
• We also need to know that Israel is not in
Spain:
NotInSpain(Israel)
• “Rodríguez Zapatero, who is Spanish...”:
Spanish(ZP)
A simple forward-chaining
algorithm
function FOL-FC-ASK(KB, α) returns a substitution or false
  local variables: new, the new sentences inferred on each iteration
  repeat until new is empty
    new ← { }
    for each rule r in KB do
      (p1 ∧ … ∧ pn ⇒ q) ← STANDARDIZE-APART(r)
      for each θ such that SUBST(θ, p1 ∧ … ∧ pn) = SUBST(θ, p1′ ∧ … ∧ pn′)
          for some p1′, …, pn′ in KB
        q′ ← SUBST(θ, q)
        if q′ is not a renaming of a sentence already in KB or new then
          add q′ to new
          φ ← UNIFY(q′, α)
          if φ is not fail then return φ
    add new to KB
  return false
A simple forward-chaining
algorithm
• Starting from the known facts,
– it triggers all the rules whose premises are
satisfied,
– adding their conclusions to the known facts.
• The process repeats until
– the query is answered (assuming that just
one answer is required)
– or no new facts are added.
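The loop described above can be sketched in Python. This is a minimal Datalog-style sketch, assuming ground facts (so unification reduces to one-way pattern matching); the tuple encoding and the '?'-variable convention are illustrative choices, not from the slides.

```python
# A minimal Datalog-style sketch of forward chaining: trigger every rule
# whose premises are satisfied, add the conclusions, repeat to fixed point.
# Facts are ground tuples; variables are strings beginning with '?'.

def match(pattern, fact, theta):
    """Extend substitution theta so that pattern matches the ground fact."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if isinstance(p, str) and p.startswith('?'):
            if theta.get(p, f) != f:
                return None            # conflicting binding
            theta[p] = f
        elif p != f:
            return None
    return theta

def subst(theta, pattern):
    """Apply a substitution to a pattern, leaving constants untouched."""
    return tuple(theta.get(t, t) for t in pattern)

def forward_chain(facts, rules):
    """Trigger all rules whose premises are satisfied until a fixed point."""
    kb = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            thetas = [{}]
            for prem in premises:      # join the premises left to right
                thetas = [t2 for t in thetas for f in kb
                          for t2 in [match(prem, f, t)] if t2 is not None]
            for theta in thetas:
                q = subst(theta, conclusion)
                if q not in kb:
                    new.add(q)
        if not new:
            return kb                  # fixed point reached
        kb |= new

# The crime example from the slides:
rules = [
    ([('Spanish', '?x'), ('Weapon', '?y'), ('Sells', '?x', '?y', '?z'),
      ('NotInSpain', '?z')], ('Criminal', '?x')),
    ([('Missile', '?x'), ('Owns', 'Israel', '?x')],
     ('Sells', 'ZP', '?x', 'Israel')),
    ([('Missile', '?x')], ('Weapon', '?x')),
]
facts = [('Owns', 'Israel', 'AGM12'), ('Missile', 'AGM12'),
         ('NotInSpain', 'Israel'), ('Spanish', 'ZP')]
kb = forward_chain(facts, rules)
```

Running this on the crime knowledge base derives Sells(ZP, AGM12, Israel) and Weapon(AGM12) on the first iteration and Criminal(ZP) on the second, matching the trace on the next slides.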
A simple forward-chaining
algorithm
• Notice that a fact is not “new” if it is just a
renaming of a known fact.
• One sentence is a renaming of another if
they are identical except for the names of
the variables.
• Likes(x, IceCream) and Likes(y,
IceCream) are renamings of each other
– their meanings are identical:
• everyone likes ice cream
A simple forward-chaining
algorithm
• In the application of forward chaining to the
crime problem (with three rules), two iterations
are required:
– Iteration 1
“Spanish(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧
NotInSpain(z) ⇒ Criminal(x).” has unsatisfied
premises.
“Missile(x) ∧ Owns(Israel, x) ⇒ Sells(ZP, x, Israel).” is
satisfied with {x/AGM12} and “Sells(ZP, AGM12,
Israel).” is added.
“Missile(x) ⇒ Weapon(x).” is satisfied with {x/AGM12}
and “Weapon(AGM12)” is added.
A simple forward-chaining
algorithm
– Iteration 2
“Spanish(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧
NotInSpain(z) ⇒ Criminal(x).” is satisfied with {x/ZP,
y/AGM12, z/Israel} and Criminal(ZP) is added.
• Fixed point: no new inferences are
possible at this point because every
sentence that could be concluded by
forward chaining is already contained
explicitly in the KB.
• The algorithm is not designed for efficiency of
operation.
Efficient forward chaining
• There are three possible sources of
complexity:
– The “inner loop” involves finding all possible
unifiers such that the premise of a rule unifies
with a suitable set of facts in the KB.
• This pattern matching can be very expensive.
– The algorithm rechecks every rule on every
iteration, even if very few additions are made
to the KB on each iteration.
– The algorithm might generate many facts that
are irrelevant to the goal.
Matching rules against known
facts
• Consider the two rules:
Missile(x) ∧ Owns(Israel, x) ⇒ Sells(ZP, x, Israel).
Missile(x) ⇒ Weapon(x).
• If the KB contains many objects owned by
Israel and very few missiles,
– it would be better to find all the missiles first
– then check whether they are owned by Israel
• This is the conjunct ordering problem:
– find an ordering to solve the conjuncts of the rule
premise so that the total cost is minimized.
Matching rules against known
facts
• Finding the optimal ordering is NP-hard,
but good heuristics are available:
– for example, the most constrained variable
heuristic
• The connection between pattern matching
and constraint satisfaction is very close.
• Most rules in real-world KBs are small and
simple rather than large and complex.
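The conjunct ordering heuristic can be sketched with a simple candidate count, solving the cheapest (most constrained) conjunct first. The encoding (tuples, '?'-variables) and the cost estimate are illustrative assumptions.

```python
# A sketch of the "most constrained conjunct first" heuristic: estimate the
# cost of each premise by counting candidate facts with the same predicate
# and arity, then reorder the premises cheapest-first.

def candidate_count(premise, facts):
    """How many facts could match this premise's predicate and arity."""
    return sum(1 for f in facts
               if f[0] == premise[0] and len(f) == len(premise))

def order_conjuncts(premises, facts):
    """Reorder premises so the most constrained conjunct is tried first."""
    return sorted(premises, key=lambda p: candidate_count(p, facts))

# Many owned objects, one missile: check Missile(x) before Owns(Israel, x).
facts = [('Owns', 'Israel', 'Object%d' % i) for i in range(1000)]
facts.append(('Missile', 'AGM12'))
premises = [('Owns', 'Israel', '?x'), ('Missile', '?x')]
ordered = order_conjuncts(premises, facts)
```

With this ordering, the join over Owns is attempted only for the single binding produced by Missile, instead of enumerating a thousand owned objects first.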
Incremental forward chaining
• To avoid redundant rule matching:
– Every new fact inferred on iteration t must be
derived from at least one new fact inferred on
iteration t-1.
• This leads to an incremental forward
chaining algorithm where, at iteration t, a
rule is checked only if its premise includes
a conjunct that unifies with a fact newly
inferred at iteration t-1.
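A propositional sketch of the incremental idea, with a counter to make the savings visible. The rule and fact names are illustrative, not from the slides.

```python
# A minimal propositional sketch of incremental forward chaining: a rule is
# re-checked on iteration t only if one of its premises was newly inferred
# on iteration t-1.

def incremental_fc(facts, rules):
    """rules: list of (premises_tuple, conclusion). Returns (kb, checks)."""
    kb = set(facts)
    last_new = set(facts)      # facts inferred on the previous iteration
    checks = 0                 # how many rule checks were actually performed
    while last_new:
        new = set()
        for premises, conclusion in rules:
            # skip rules untouched by the newly inferred facts
            if not last_new & set(premises):
                continue
            checks += 1
            if set(premises) <= kb and conclusion not in kb:
                new.add(conclusion)
        kb |= new
        last_new = new
    return kb, checks

rules = [(('a',), 'b'), (('b',), 'c'), (('c',), 'd'), (('x',), 'y')]
kb, checks = incremental_fc({'a'}, rules)
# only 3 rule checks in total, versus 4 rules x 4 iterations for the
# naive algorithm that rechecks every rule on every iteration
```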
Rete algorithm
• Only a small fraction of the rules in a knowledge base
are triggered by the addition of new facts.
• A great deal of redundant work is done in constructing
partial matches repeatedly that have some unsatisfied
premises.
• It would be better to retain and gradually complete the
partial matches as new facts arrive, rather than
discarding them.
• The Rete algorithm was the first to address this problem
seriously.
• The algorithm preprocesses the set of rules in the
knowledge base to construct a sort of dataflow network
in which each node is a literal from a rule premise.
• Variable bindings flow through the network and are
filtered out when they fail to match a literal.
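The core idea can be sketched for a single rule: each premise gets a memory of partial matches (variable bindings), which is retained between fact arrivals and extended incrementally instead of being recomputed. This is a heavily simplified illustration of the Rete idea, not the full network; the class and encoding are assumptions of this sketch.

```python
# A single-rule sketch of the Rete idea: variable bindings flow through a
# chain of premise nodes and are retained in per-premise memories, so a new
# fact only extends existing partial matches. '?'-strings are variables.

def match(pattern, fact, theta):
    """Extend substitution theta so that pattern matches the ground fact."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if isinstance(p, str) and p.startswith('?'):
            if theta.get(p, f) != f:
                return None
            theta[p] = f
        elif p != f:
            return None
    return theta

class ReteRule:
    def __init__(self, premises, conclusion):
        self.premises = premises
        self.conclusion = conclusion
        # memory[i] holds substitutions satisfying premises[0..i]
        self.memory = [[] for _ in premises]
        self.facts = []

    def add_fact(self, fact):
        """Flow the new fact through the node chain; return fired conclusions."""
        self.facts.append(fact)
        fired = []
        for i, prem in enumerate(self.premises):
            sources = [{}] if i == 0 else self.memory[i - 1]
            for theta in list(sources):
                t2 = match(prem, fact, theta)
                if t2 is not None:
                    self._extend(i, t2, fired)
        return fired

    def _extend(self, i, theta, fired):
        self.memory[i].append(theta)
        if i + 1 == len(self.premises):       # all premises satisfied
            fired.append(tuple(theta.get(t, t) for t in self.conclusion))
            return
        # try to complete the new partial match with already-seen facts
        for fact in self.facts:
            t2 = match(self.premises[i + 1], fact, theta)
            if t2 is not None:
                self._extend(i + 1, t2, fired)

rule = ReteRule([('Missile', '?x'), ('Owns', 'Israel', '?x')],
                ('Sells', 'ZP', '?x', 'Israel'))
first = rule.add_fact(('Missile', 'AGM12'))       # partial match retained
second = rule.add_fact(('Owns', 'Israel', 'AGM12'))  # completes it, rule fires
```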
Production systems
• Rete networks have been a key
component of so-called production
systems, which were among the earliest
forward chaining systems in widespread
use.
• The word “production” denotes a
condition-action rule.
• The steps to solve a problem are
described as a chain of deductions.
Production systems
• The representation is based on two elements:
– facts: definite clauses with no negative literals, which
simply assert a given proposition; they can include
variables
– rules: conditional formulas where the consequent is
exactly one literal
• A problem is defined by:
– fact base: atomic sentences describing the specific
problem
– rule base: rules describing the reasoning
mechanisms
– inference engine: algorithm executing the rules in a
reasoning chain
Irrelevant facts
• Forward chaining makes all allowable
inferences, even if they are irrelevant to
the goal at hand.
• One solution is to restrict forward chaining
to a selected subset of rules.
• Another solution is to use backward
chaining.
Backward chaining
• This algorithm works backward from the
goal, chaining through rules to find known
facts that support the proof.
• It is called with a list of goals containing a
single element, the original query.
• It returns the set of all substitutions
satisfying the query.
• If all of them can be satisfied, then the
current branch of the proof succeeds.
Backward chaining
• Set of sentences:
– S1: for each x1,y1 child(x1,y1) ⇒ parent(y1,x1)
– S2: for each x2,y2 parent(x2,y2) ∧ female(x2) ⇒ mother(x2,y2)
– S3: child(Lisa,Homer)
– S4: child(Lisa,Marge)
– S5: female(Marge)
• Goal: mother(x0,Lisa)
• Note: variables have already been standardized apart
using subscripts
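The family example can be sketched as a recursive backward chainer with unification and standardizing apart. The tuple encoding and '?'-variable convention are illustrative assumptions.

```python
# A sketch of backward chaining on the family example above. Rules are
# (premises, conclusion); facts are rules with no premises. Variables are
# '?'-strings; rename() standardizes variables apart per rule use.

import itertools

counter = itertools.count()

def walk(t, theta):
    """Follow variable bindings to their current value."""
    while isinstance(t, str) and t.startswith('?') and t in theta:
        t = theta[t]
    return t

def unify(a, b, theta):
    """Unify two terms under theta; return an extended theta or None."""
    if theta is None:
        return None
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if isinstance(a, str) and a.startswith('?'):
        return {**theta, a: b}
    if isinstance(b, str) and b.startswith('?'):
        return {**theta, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            theta = unify(x, y, theta)
        return theta
    return None

def rename(term, n):
    """Standardize apart: give every rule variable a unique suffix."""
    if isinstance(term, tuple):
        return tuple(rename(t, n) for t in term)
    if isinstance(term, str) and term.startswith('?'):
        return term + '_%d' % n
    return term

def bc(goals, theta, kb):
    """Yield every substitution satisfying all goals (depth-first)."""
    if not goals:
        yield theta
        return
    goal, rest = goals[0], goals[1:]
    for premises, conclusion in kb:
        n = next(counter)
        t2 = unify(rename(conclusion, n), goal, theta)
        if t2 is not None:
            yield from bc([rename(p, n) for p in premises] + rest, t2, kb)

kb = [
    ([('child', '?x1', '?y1')], ('parent', '?y1', '?x1')),                   # S1
    ([('parent', '?x2', '?y2'), ('female', '?x2')], ('mother', '?x2', '?y2')),  # S2
    ([], ('child', 'Lisa', 'Homer')),                                        # S3
    ([], ('child', 'Lisa', 'Marge')),                                        # S4
    ([], ('female', 'Marge')),                                               # S5
]
answers = [walk('?x0', t) for t in bc([('mother', '?x0', 'Lisa')], {}, kb)]
```

The Homer branch fails at the female(x) subgoal and is abandoned; only the Marge branch survives, so the query returns the single substitution {x0/Marge}.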
Backward chaining
• Based on the inductive method:
– The objective/hypothesis is validated by
reconstructing the reasoning chain backwards
• Each step implies new sub-objectives:
– New hypotheses to be validated
Backward chaining
• Algorithm
– The FB is initialized with a set of initial facts.
– The hypothesis set (HS) is initialized with the
objectives.
– While there are hypotheses in HS, one is
chosen and validated:
• Facts and right-hand sides of rules are compared to
the hypotheses
• If a hypothesis is in the FB, then eliminate it from
HS.
• Else, look for rules with the hypothesis as
conclusion. Select one and add its non-satisfied
premises to HS.
Backward chaining
• Advantages:
– Only what is necessary for the resolution of
the problem is considered.
• The resolution process consists of a tree
exploration
• Most representative language: Prolog
– W. F. Clocksin and C. S. Mellish. Programming
in Prolog: Using the ISO Standard. Springer,
2003 (first edition: 1981).
Hybrid chaining
• Parts of the reasoning chain from facts to
objectives are deductively constructed.
• Other parts are inductively constructed.
• Bi-directional exploration
• The combinatorial explosion of
deductive reasoning is avoided.
Hybrid chaining
• Meta-rules decide about strategy change
as a function of:
– Number of initial and final states
• Better from smaller to larger
– Branching factor
• Better in the direction of lower factor
– Necessity of justifying the reasoning process
• Better what reflects typical user reasoning
Inference engine
• The inference engine or control
mechanism is made of two elements:
– Rule interpreter or inference mechanism
• It determines which rules can be applied
– Control strategy or conflict-resolution
strategy
• The function of the inference engine is to
execute actions to solve the problem
(objective) from an initial set of facts,
possibly via some interaction with the
user.
Inference engine
• Phases of a basic engine cycle:
1. Detection (filter): relevant rules
• Obtaining from the KB the set of executable rules for a
given situation (state) of the FB (rule interpreter)
• Creation of the conflict set
2. Selection: which rule?
• Conflict resolution: selection of a rule to be executed
(control strategy)
3. Execution
• Execution of the rule with an instantiation of the
variables: modification of the working memory
4. Return to 1, or stop if the problem is solved
• If a solution has not been found and no rule is
executable: failure
Detection
• Creation of the set of executable rules
• The rule interpreter obtains possible
instantiations of the LHS of the rules via
matching.
• A rule can be instantiated more than
once with different values matching the
variables.
Selection
• Rules are selected according to the control
strategy, which can be:
– Fixed
– Dynamic with criteria decided in advance
– Guided by dynamic meta-rules
Selection
• Strategies for the selection of the “best”
rule/instantiation:
– First rule according to KB ordering
– Least/most used rule
– Most specific (with more literals) / most general rule
– Rule with the highest certainty value
– The instantiation satisfying:
• Facts with highest priority
• Oldest facts
• Most recent facts
– Most recently used rule
– Dynamic meta-rules
– Combination of different criteria
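One engine cycle (detection, selection, execution) can be sketched propositionally, using the "most specific rule" strategy from the list above. The fact and rule names are illustrative assumptions.

```python
# A sketch of one propositional engine cycle: detection builds the conflict
# set, selection picks the most specific rule (most premises), execution
# modifies the fact base (the working memory).

def engine_cycle(fact_base, rule_base):
    """rule_base: list of (premises_tuple, conclusion). Returns fired rule."""
    # 1. Detection: rules whose premises hold and whose conclusion is new
    conflict_set = [r for r in rule_base
                    if set(r[0]) <= fact_base and r[1] not in fact_base]
    if not conflict_set:
        return None                   # no executable rule: stop (failure?)
    # 2. Selection: "most specific rule" conflict-resolution strategy
    rule = max(conflict_set, key=lambda r: len(r[0]))
    # 3. Execution: modify the working memory
    fact_base.add(rule[1])
    return rule

facts = {'small', 'feathered'}
rules = [(('small',), 'pet'), (('small', 'feathered'), 'bird')]
fired = engine_cycle(facts, rules)   # the two-premise rule wins
```

Calling engine_cycle repeatedly until it returns None implements phase 4 (return to detection, or stop).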
Execution
• The execution of a rule can cause:
– A modification of the FB (in forward chaining
only)
– New calculations, new actions, questions to
the user
– New sub-objectives (in backward chaining)
• Instantiations are propagated
• Certainty degrees are propagated
End of deduction/induction
process
• The conclusion (objective) is
found/demonstrated: success
• No rule is executable: success? / failure?
Prolog: logic programming
language
• A Prolog program:
– is a set of logical assertions
– each assertion is a Horn clause
– the matching process is unification
– the order of the assertions (rules) is significant
• Example:
– gato (bufa).
– animaldomestico (X) :- gato (X).
– animaldomestico (X) :- pequeño (X), conplumas (X).
• p → q ⇔ q :- p
• Queries: ?- predicado.
• Negation is represented by the absence of the
corresponding assertion: closed-world assumption.
Horn clause
• From Wikipedia:
Horn clauses (executable PROLOG instructions) look like this:
hija (*A, *B) :- mujer (*A), padre (*B, *A). which could be read as: “A is a daughter of B if A is
a woman and B is the father of A”.
Note that, in PROLOG, the symbol :- separates the conclusion from the conditions. In
PROLOG, variables are written starting with an asterisk. All the conditions must hold
simultaneously for the conclusion to be valid. Therefore, the comma separating the
conditions is equivalent to a logical conjunction (in some versions of PROLOG the comma is
replaced by the symbol &). Disjunction, in contrast, is not represented by special symbols,
but by defining new rules, such as the following:
hija (*A, *B) :- mujer (*A), madre (*B, *A). which could be read as: “A is a daughter of B if A is
a woman and B is the mother of A”.
Horn clauses are clauses with at most one positive literal.