Logic Programming and Prolog

Download Report

Transcript Logic Programming and Prolog

DCP 1172
Introduction to Artificial Intelligence
Lecture notes for representing Knowledge
using Rules [AIMA]
Chang-Sheng Chen
1
Logical reasoning systems
• Theorem provers and logic programming languages
• Provers: use resolution to prove sentences in full FOL.
• Languages: use backward chaining on restricted set of FOL
constructs.
• Production systems – based on implications, with
consequents interpreted as action (e.g., insertion & deletion in
KB).
• Based on forward chaining + conflict resolution if several possible
actions.
• Frame systems and semantic networks – objects as nodes
in a graph, nodes organized as taxonomy, links represent
binary relations.
• Description logic systems – evolved from semantic nets.
Reason with object classes & relations among them.
DCP1172, Ch.8,9
2
Theorem provers
• Theorem proving refers to being able to enter (usually
simple) queries against a database of facts and axioms
(i.e., if A is true, then B is true). and being given an
answer, derivable from the facts and axioms (as well as
the basic laws of logic).
• It should be noted that "theorem proving" does not (in
general) refer to anything as grandiose as proving
Fermat’s Last Theorem.
• In a type inference system (or any typechecker,
really), a typical query might be "is A a subtype of B"?
DCP1172, Ch.8,9
3
Basic tasks
• Add a new fact to KB – TELL
• Given KB and new fact, derive facts implied by conjunction
of KB and new fact.
 In forward chaining: part of TELL
• Decide if query entailed by KB – ASK
• Decide if query explicitly stored in KB – restricted ASK
• Remove sentence from KB: distinguish between correcting
false sentence, forgetting useless sentence, or updating KB
re. change in the world.
DCP1172, Ch.8,9
4
Logic programming
- Computation as inference on logical KBs
• Logic programming is a programming language
paradigm based on logic (more accurately, the Predicate
Calculus), in which logic assertions are viewed as
programs.
• A program is represented by a set of facts
(statements/relationships which are held to be true),
and a set of axioms (i.e. if A is true, then B is true).
The axioms and clauses may have arguments.
DCP1172, Ch.8,9
5
Logic Programming
• Declarative style
• State what to do, system takes care of rest
• State what is known and what question is to be
solved
• “something for nothing”
• If you are prepared to wait for an infinite amount
of time
• Augmented by control
• Algorithm = Logic + control
DCP1172, Ch.8,9
6
Logic Programming (cont.)
- “What-to-do” Paradigm vs. “How-to-do-it” Paradigm
Logic Programming
1. Identify problem
2. Assemble information
3. Tea Break
4. Encode information in KB
5. Encode problem instance
as facts
6. Ask queries
7. Find false facts
Ordinary programming
• Identify problem
• Assemble information
• Figure out solution
• Program solution
• Encode problem instance
as data
• Apply program to data
• Debug procedural errors
Should be easier to debug Capital(Tokyo, Japan) than X:=X+2
DCP1172, Ch.8,9
7
Logic Programming (cont)
• For example, one could define the relation child(A,B),
meaning that A is a child of B.
• One then could establish a set of facts (stored in a
database--see below):
child(Pebbles,Fred)
child(Pebbles,Wilma)
child(Wilma,Freds-mother-in-law) (what's her name?)
child(Bam-bam,Barney)
child(Bam-bam,Betty)
• One can query the database:
child? (Pebbles,Fred) -> True
child? (Pebbles,Barney) -> False (at least Fred hopes not!)
DCP1172, Ch.8,9
8
Logic Programming (cont)
• In a Logic Programming system (or a Relational
Database, or a Constraint Programming system, etc);
one builds a collection of facts (things held to be true)
and axioms (rules for deriving additional facts from the
set of base facts).
• If a theorem can be shown to be true based on the facts
and axioms; then it is true.
• How do you handle a query that cannot be shown from
the above (in other words, isn't derivable?)
• Two possibilities, the Closed World Assumption and the Open
World Assumption.
DCP1172, Ch.8,9
9
Open World Assumption
• In the Open World Assumption ; any proposition or
theorem which cannot be derived from the facts and
axioms present in the system is held to be unknown.
• Things which are known to be true or false must be
stated explicitly, or else inferrable from facts and
axioms.
• The Open World Assumption more clearly models
reality...however it severely complicates things.
• The two boolean values (true and false) are
inadequate, and we have to use Three Valued Logic
(i.e, 'True', 'False' and 'Unknown' . ). Universal
quantifiers are much harder to prove.
DCP1172, Ch.8,9
10
Closed World Assumption
• The Closed World Assumption holds that anything
that cannot be shown to be true is false; no explicit
declaration of falsehood is needed.
• Consequently, any query (which terminates) will either return
true or false; there is no possibility of "unknown".
• Most programming systems which use logic or predicate
calculus use the Closed World Assumption ,
including:
• Prolog Language
• Relational Databases (these can be viewed as predicate
systems)
DCP1172, Ch.8,9
11
Production systems
• Rule-based systems or production systems are
computer systems that use rules to provide
recommendations or diagnoses, or to determine a
course of action in a particular situation or to solve a
particular problem.
• A rule-based system consists of a number of
components:
• A database of rules ( also called a knowledge base)
• A database of facts
• An interpreter, or inference engine
DCP1172, Ch.8,9
12
Production System
User
Knowledge
Base
Fact
Working
Memory
User
Expertise Interface
Inference
Engine
•Knowledge-Base System
DCP1172, Ch.8,9
13
Forward-chaining process
- data-driven process
• When applying forward-chaining, the 1st step is to take
the facts in the facts database and see if any
combination of these matches all the antecedents of one
of the rules in the rule databases.
• If it is the case, the rule is triggered.
• Usually, when a rule is triggered, it is then fired,
which means that the conclusion is added to the facts
database.
DCP1172, Ch.8,9
14
Forward Chaining vs. Backward Chaining
• Rules:
R1: A ^B  C
R2: A  D
R3: C ^ D  E
R4: B ^ E ^ F  G
R5: A ^ E  H
R6: D ^ E ^ H  I
• Facts:
Fact 1 A
Fact 2 B
Fact 3 F
• Our Goal is to proveDCP1172,
H
Ch.8,9
15
Forward-chaining process
- data-driven process
Facts
Rules triggered
Rules fired
A,B,F
1,2
1
A,B,C,F
2
2
A,B,C,D,F
3
3
4,5
4
A,B,C,D,E,F,G
5
5
A,B,C,D,E,F,G,H
6
STOP
A,B,C,D,E,F
DCP1172, Ch.8,9
16
Conflict resolution phase
• In a situation where more than one conclusion can be
deduced from a set of facts, there are a number of
possible ways to decide which rule to fire.
If it is cold
THEN wear a coat.
If it is cold
THEN stay at home.
If it is cold
THEN turn on the heat.
• In some cases, it might be fine to follow all the
conclusions, but in many cases, the conclusions are
incompatible (for example, when prescribing medicines
to patients.)
DCP1172, Ch.8,9
17
Conflict resolution phase (cont)
• one strategy: execute all actions for all satisfied rules
• or, treat them as suggestions and use conflict
resolution to pick one action.
• Strategies:
- no duplication (do not execute twice same rule on
same arguments)
- regency (prefer rules involving recently created WM
elements)
- specificity (prefer more specific rules, e.g., longestmatching strategy)
- operation priority (rank actions by priority and pick
highest)
DCP1172, Ch.8,9
18
Backward chaining process
- Goal-driven reasoning
Facts
Goals
Matching Rules
A,B,F
H
5
A,B,F
E
3
A,B,F
C, D
1
D
2
A,B,C,F
A,B,C,D,F
STOP
DCP1172, Ch.8,9
19
Forward-chaining production systems
• Prolog & other programming languages: rely on
backward-chaining
(I.e., given a query, find substitutions that satisfy it)
• Forward-chaining systems: infer everything that can be
inferred from KB each time new sentence is TELL’ed
• Appropriate for agent design: as new percepts come in,
forward-chaining returns best action
DCP1172, Ch.8,9
20
Implementation
• One possible approach: use a theorem prover, using resolution
to forward-chain over KB
• More restricted systems can be more efficient.
• Typical components:
- KB called “working memory” (positive literals, no variables)
- rule memory (set of inference rules in form
p1  p2  …  act1  act2  …
- at each cycle: find rules whose premises satisfied
by working memory (match phase)
- decide which should be executed (conflict resolution phase)
- execute actions of chosen rule (act phase)
DCP1172, Ch.8,9
21
Match phase
• Unification can do it, but inefficient
• Rete algorithm (used in OPS-5 system): example
rule memory:
A(x)  B(x)  C(y)  add D(x)
A(x)  B(y)  D(x)  add E(x)
A(x)  B(x)  E(x)  delete A(x)
working memory:
{A(1), A(2), B(2), B(3), B(4), C(5)}
• Build Rete network from rule memory, then pass working
memory through it
DCP1172, Ch.8,9
22
Inside the Rule Engine
The Rete Network
 A Rete network is a graph through which data flows.
 Rule conditions are compiled into a network
Objects
Ruleset
Compiler
Rete
Network
Rule instances
 Minimizes the number of evaluations
 Tests are shared between rules
 Changes are propagated incrementally
DCP1172, Ch.8,9
23
Rete algoritm
• The Rete algorithm (from the Latin 'rete' for net, or
network) provides the basis for a more efficient
implementation of an expert system.
• A Rete-based expert system builds a network of nodes,
where each node (except the root) corresponds to a
pattern occurring in the left-hand-side of a rule.
• The path from the root node to a leaf node defines a
complete rule left-hand-side.
• Each node has a memory of facts which satisfy that
pattern.
DCP1172, Ch.8,9
24
Rete network
A
B
A(1),
A(2)
B(2),
B(3),
B(4)
D
A=D
add E
A=B
C
add D
A(2),
B(2)
C(5)
E
D(2)
delete A
Circular nodes: fetches to WM; rectangular nodes: unifications
A(x)  B(x)  C(y)  add D(x)
A(x)  B(y)  D(x)  add E(x)
A(x)  B(x)  E(x)  delete A(x)
{A(1), A(2), B(2), B(3), B(4), C(5)}
DCP1172, Ch.8,9
25
A(x)  B(x)  C(y)  add D(x)
A(x)  B(y)  D(x)  add E(x)
A(x)  B(x)  E(x)  delete A(x)
Rete match
D
A=D
A(2)
D(2)
x/2
D(2)
A
A(1), A(2)
B
B(2),B(3),B(4)
Add E
A=B
E(2)
C
Add D
D(2)
C(5)
y/5
A(2)
B(2)
X/2
E
A=E
E(2)
{ A(1), A(2), B(2), B(3), B(4), C(5), D(2), E(2) }
DCP1172, Ch.8,9
Delete A
A(2)
E(2)
x/2
Delete A(2)
26
Advantages of Rete networks
• Share common parts of rules
• Eliminate duplication over time (since for most
production systems only a few rules change at each
time step)
DCP1172, Ch.8,9
27
Introduction to Prolog
• Prolog is a programming language based on logic.
• Prolog => Programming in logic
• A programming language for symbolic, non-numerical
computation
• Solving problems that involve objects and relations
between objects
• Used in
• Problem solving and heuristic search
• Expert systems
• Game playing
DCP1172, Ch.8,9
28
Prolog -programming in logic
•
•
•
•
•
•
1970s
Theory - Robert Kowalski (Edinburgh)
Implementation - Alain Colmerauer (Marseilles)
Efficient implementation - David Warren (Edinburgh)
Widely used for AI in Europe
Japan’s Fifth Generation Plan
DCP1172, Ch.8,9
29
The Prolog System
• Prolog consists of a uniform database ( as opposed to
a set of instructions to work through ) and a search
mechanism.
• Remember this.
• Understanding how the Prolog search
mechanism works is vital in understanding Prolog.
DCP1172, Ch.8,9
30
Uniform Database
• Manipulate database of assertions
• Assertion
• Fact
• State relation between objects
• (likes Kim Robin)
• Rule
• State contingent(附隨的, 或許會或許不會) facts
DCP1172, Ch.8,9
31
A Brief Introduction
• Prolog is declarative.
• This means that programmer attempts to describe
what the problem is.
• This is different from procedural programming
languages where the programmer describes how
the problem is solved.
• Rather then running a program to obtain a solution, the
user asks a question.
• When asked a question, the run time system
searches through the data base of facts and rules to
determine (by logical deduction) the answer.
DCP1172, Ch.8,9
32
Rule interpretation
• Rule: (<- (likes sandy ?x) (likes ?x cats)
• Declarative
• “for any x, sandy likes x if x likes cats
• Procedural
• “to show sandy likes some x, show x likes cats”
• Backward-chaining
• Reasons backward from goal to the premise
• Forward-chaining interpretation
DCP1172, Ch.8,9
33
The Structure of Prolog Programs
• Program = sequence of sentences (implicitly
conjoined)
• A Prolog program consists of a database of facts
and rules, and queries (questions).
• Fact: ... .
• Rule: ... :- ... .
• Query: ?- ... .
• Variables: must begin with an upper case letter.
• Constants: numbers, begin with lowercase letter,
or enclosed in single quotes.
DCP1172, Ch.8,9
34
Basic syntax of facts, rules and queries
<fact>
<rule>
<query>
<term>
::=
::=
::=
::=
|
<terms> ::=
<term> .
<term> :- <term> .
<term> .
<number> | <atom> | <variable>
<atom> (<terms>)
<term> | <term>, <terms>
DCP1172, Ch.8,9
35
Introduction to Prolog
• All variables implicitly universally quantified
• Variables in different sentences considered distinct
• Horn clause sentences only (= atomic sentences or sentences
with no negated antecedent and atomic consequent)
• Terms = constant symbols, variables or functional terms
• Queries = conjunctions, disjunctions, variables, functional terms
• Instead of negated antecedents, use negation as failure operator:
goal NOT P considered proved if system fails to prove P (Closed
world assumption)
• Syntactically distinct objects refer to distinct objects
• Many built-in predicates (arithmetic, I/O, etc)
DCP1172, Ch.8,9
36
Prolog systems
DCP1172, Ch.8,9
37
Clauses
• Assertions
• Fact - single clause
• Rule - multiple clauses (contingent facts)
• (<- (likes sandy ?x) (likes ?x cats)
• Clause has 2 parts
 Head: leftmost expression
 Body: remaining expressions
• Head is true only if all the goals in the body are true
DCP1172, Ch.8,9
38
Horn Clause and Prolog
• A Horn clause. Is a disjunction of literals of which at most
one is positive.
• For example, A v ⌐B v ⌐ C v ⌐ D v ⌐ E…
Where A, B, C and so on are positive literals
• This Horn clause can also be written as an implication:
B^C^D^EA
• In PROLOG, this would be written in reverse, with the
conclusion on the left.
A :- B, C, D, E
DCP1172, Ch.8,9
39
HORN clauses in Prolog
• A program in PROLOG consists of a set of HORN
clauses. PROLOG applies unification and resolution to
attempt to derive conclusions from a set of rules that
are defined in this language.
• Horn clause can take three forms:
• Form 1 - Rule relation:
A :- B, C, D, E
• Form 2 – fact:
A:• Form 3 – goal or headless clause.
:- B, C, D, E
DCP1172, Ch.8,9
40
Introduction to Prolog
• Among the features of Prolog are :
• `logical variables' meaning that they behave like
mathematical variables,
• a powerful pattern-matching facility (unification),
• a backtracking strategy to search for proofs,
• uniform data structures,
• input and output are interchangeable.
DCP1172, Ch.8,9
41
Main Design Features of Prolog
• Automatic backtracking
• Search relations in the database that satisfy a query
 City with pop > 500000 & capital
 Find all pop > 500000, check if capital
 If so print
 Else backtrack in pop relation
• Free programmer from details of database
organization
• How stored
• How searched
• Sometimes too inefficient
• Programmer must restate problem
DCP1172, Ch.8,9
42
Main Design Features of Prolog
• Use of a uniform database
• Relational rather than functional
• Relations are more flexible
• Logic variables
• Bound by unification
• Never change
• Constrain the problem (no order of evaluation)
DCP1172, Ch.8,9
43
Relations vs. Functions
• Relations are two way functions
(<- (likes Kim Robin)
(<- (likes Sandy Lee)
(<- (likes Sandy Kim)
(<- (likes Robin cats)
• Multiple functions by posing different queries (same
relations)
• (likes sandy ?who)
• (likes ?who Lee)
DCP1172, Ch.8,9
44
Closed world assumption
• Prolog has what is known as a "closed world
assumption".
• If something is not in its database, it is false. Even
though in reality it may be true.
• But you must remember we are not querying the system
about reality we merely querying it about the contents of
its database.
DCP1172, Ch.8,9
45
Backtracking in Prolog
• Often there will be more than one way to deduce the
answer or there will be more than one solution,
• in such cases the run time system may be asked to
find other solutions, backtracking to generate
alternative solutions.
• Prolog is a weakly typed language with dynamic type
checking and static scope rules.
DCP1172, Ch.8,9
46
Startup Screen of GNU Prolog in interactive mode
DCP1172, Ch.8,9
47
A Simple usage of Prolog command
DCP1172, Ch.8,9
48
How to leave GNU prolog.
DCP1172, Ch.8,9
49
A PROLOG Program
• A PROLOG program is a set of facts and rules.
• A simple program with just facts :
parent(alice,
parent(jim,
parent(jim,
parent(jim,
parent(tim,
parent(tim,

jim).
tim).
dave).
sharon).
james).
thomas).
the relation parent (A, B) means that A is a parent of B.
DCP1172, Ch.8,9
50
A PROLOG Program
• c.f. a table in a relational database.
• Each line is a fact (a.k.a. a tuple or a row).
• Each line states that some person X is a parent of some
(other) person Y.
• In GNU PROLOG the program is kept in an ASCII file.
DCP1172, Ch.8,9
51
Relation: Parent(PersonX, PersonY)
•Parent(PersonX, PersonY)  PersonX is a parent of PersonY.
Item
1.
PersonX
Alice
2.
Jim
Dave
3.
Jim
Sharon
4.
Jim
Tim
5.
Tim
James
6.
Tim
Thomas
DCP1172, Ch.8,9
PersonY
Jim
52
Semantic Net
parent
Alice
parent
Sharon
Jim
Dave
parent
parent
parent
Tim
James
parent
Thomas
DCP1172, Ch.8,9
53
A PROLOG Query
• Now we can ask PROLOG questions :
| ?- parent(alice, jim).
yes
| ?- parent(jim, herbert).
no
| ?-
DCP1172, Ch.8,9
54
A PROLOG Query
• Not very exciting. But what about this :
| ?- parent(alice, Who).
Who = jim
yes
| ?• Who is called a logical variable.
• PROLOG will set a logical variable to any value
which makes the query succeed.
DCP1172, Ch.8,9
55
A PROLOG Query II
• Sometimes there is more than one correct answer to a
query.
• PROLOG gives the answers one at a time. To get the
next answer type ;.
| ?- parent(jim, Who).
Who = tim ? ;
Who = dave ? ;
Who = sharon ? ;
yes
| ?-
NB : The ;
do not
actually
appear on
the screen.
• After finding that jim was a parent of sharon GNU
PROLOG detects that there are no more alternatives for
parent and ends the search.
DCP1172, Ch.8,9
56
Append
• append([], L, L)
• append([H| L1], L2, [H| L3]) :- append(L1,
L2, L3)
• Example join [a, b, c] with [d, e].
• [a, b, c] has the recursive structure [a| [b, c] ].
• Then the rule says:
• IF [b,c] appends with [d, e] to form [b, c, d, e] THEN
[a|[b, c]] appends with [d,e] to form [a|[b, c, d, e]]
• i.e. [a, b, c]
[a, b, c, d, e]
DCP1172, Ch.8,9
57
Expanding Prolog
• Parallelization:
OR-parallelism: goal may unify with many different literals and
implications in KB
AND-parallelism: solve each conjunct in body of an implication
in parallel
• Compilation: generate built-in theorem prover for different
predicates in KB
• Optimization: for example through re-ordering
e.g., “what is the income of the spouse of the president?”
Income(s, i)  Married(s, p)  Occupation(p, President)
faster if re-ordered as:
Occupation(p, President)  Married(s, p)  Income(s, i)
DCP1172, Ch.8,9
58
KBS & Ontologies
• Domain Knowledge = Domain Ontology +
Domain Models
• In essence, each knowledge base is an extension of
some application domain ontology
• The ontology provides a roadmap for the class of the
concepts that will comprise the knowledge base
DCP1172, Ch.8,9
59
Rule-based Expert System
• An expert system is one designed to model the
behavior of an expert in some field, such as medicine
or geology.
• Rule-based expert systems are designed to be able to
use the same rules that the expert would use to draw
conclusions from a set of facts that are presented to
the system.
DCP1172, Ch.8,9
60
The people involved in an Expert System
• The design, development and use of expert system involves a
number of people.
• End-user: the person who has the needs for the system.
• Doctors that use medical diagnosis system
• Technicians that use car diagnosis system
• Knowledge Engineer (KE): the person who design the
rules for the system, based on either observing the expert at
work or by asking the expert questions about how he/she
works.
• Domain Expert (DE)
• Very important to the design of an expert system
• The expert needs to be able to explain to the KE how
he/she does the job
DCP1172, Ch.8,9
61