The Logic Theory Machine
Download
Report
Transcript The Logic Theory Machine
Empirical Explorations with
The Logical Theory Machine:
A Case Study in Heuristics
by Allen Newell, J. C. Shaw, & H. A. Simon
The Logic Theory Machine (LT)
: a system for finding proofs of theorems in
elementary symbolic logic
: devised to understand the complex problemsolving process
: the first heuristic program (1956)
The Logic Theory Machine in Operation
- LT’s Task is to prove theorems in the
sentential calculus ( in Principia Mathematica
)
- The proof is the sequence of expressions that leads
from the axioms and known theorems to the desired
expression e.g.)
- Why are such problems difficult ?
- What features of LT account for its successes
and failures ?
Problems, Algorithms, and Heuristics
problem
: if he is given some process for generating the
elements of a set of possible solutions in some
order, and a test for verifying whether a given
element of this set is in fact a solution to his
problem.
problem solving
: if the costs are not too large in relation to the time
and computing power available for solution
Problems, Algorithms, and Heuristics
algorithm
: the process that guarantee the solution of a given
problem. ex) opening a combination safe
heuristic
: the process that may solve a given problem but
offers no guarantees of doing so.
ex) playing chess
The Problem of Proving Theorems
Logic
- the set of all possible sequences of logic expression
: E ( given implicitly by rules for generating)
- the set of sequences that are proofs : P
- the expression to be proved : X
- the set of sequences that have X as their final expression :
Tx
E : all sequences of
logic expressions
P :proof
sequence
Tx : sequences
ending in X
Proofs of X
in
The difficulty of proving theorems depends on
1. The scarcity of elements in the intersection of P and
Tx , relative to the number of elements in E
the cost and speed of the available generators,
the cost and speed of making tests (whether P or Tx)
2. Whether generators can be found that guarantee that
any element they produce automatically satisfies
some of the conditions.
3. What heuristics can be found to guide the selection.
The
British Museum Algorithm
: reveal the basic nature of theorem proving
the algorithm constructs all possible proofs in
systematic manner, checking each time (1) to
eliminate duplicates, and (2) to see if the final
theorem in the proof coincides with the expression
to be proved
the set of n-step proofs : obtained from the set of
(n-1)-step proofs by making all the permissible
substitutions and replacements, detachments.
Figure 2 . Number of proofs generated by first few
steps of British Museum algorithm
Number of proofs
: only use the simple substitution
and the replacements
200
: If specializations are included
11-steps, 1000 theorems
100
: among the first 246 theorems,
only 5 theorem ( in chap. 2 of
0
principia), 1000 theorems, one
0
1
2
3
4
5
6
7
8
Proof steps
more
how many theorem to prove all theorems (60-odd)
: a hundred million
how many times : hundreds of thousands of years
The Logic Theory Machine
Methods
: the major type of heuristic that LT uses
The Substitution Method
: by finding an axiom or proved theorem that can be
transformed, by a series of substitutions for variables and
replacements of connectives
The Detachment Method
: if problem B , search for an axiom or theorem “ A implies
B ” , A is set up as a new subproblem
Methods (continued)
The Chaining Method
: use the tansitivity of the relation of implication to create a
new subproblem
- forward chaining method
: “ a implies c ”, search an axiom or theorem “ a implies b ”,
set up “ b implies c ” as new subproblem
- backward chaining method
: search “ b implies c ”, set up “ a implies b ” as subproblem
Methods (continued)
no guarantee
: a target axiom or a theorem can be found
: the generated subproblem is part of the desired
proof sequence, or part of any proof sequence
: the beginning of the sequence can be completed
with axioms or proved theorems
: the combination of four methods comprise a
sufficient set of methods to prove all theorems
ex) “ p or not-not-not-p ” (2.13)
Methods (continued)
Why do the methods transform LT into an
effective problem-solver ?
1. unitizing effect
: the methods organize the sequences of individual
processing steps into larger units
2. backward approach
: only one theorem to be proved, a number of known true
theorems
The
Executive Routine
1. For a theorem list, the substitution
method is tried
2. The detachment method is tried,
following the substitution, if fail,
the subproblem is added to the
subproblem list
3. Forward chaining, backward
chaining.
4. For the subproblem list, repeat.
The Executive Routine (continued)
This process continues until (1) a proof is found,
(2) the time allotted for finding a proof is used up,
(3) there is no more available memory space,
(4) no untried problems remain on the subproblem
list
Examples
The Executive Routine (continued)
not (p or q) implies not-p
(A implies B) implies (not-B
implies not-A)
subprblems
(theorem 2.16)
[ p implies (p or q)]
by substitution method
A implies (A or B)
The Matching Process
What allows the efficient search ?
the required times : BM algorithm vs LT ( 5~100:1)
ex) “ p implies (q implies p) ” (158 sec : 10 sec)
p implies (q or p) (axiom 1.3)
1
3
2
7
6
4
5
11
10
8
9
12
p implies (not-q or p)
p imlies (q implies p)
The
Matching Process (continued)
Matching in the Substitution Methods
: componentwise matching - a feedback of the
results of a substitution or replacement
: Of the 52 theorems of Principia, 38 proofs
substitution alone - 17 proof
Matching in Detachment and Chaining
: matching the subexpressions of the problem or
theorems.
Similarity Tests and Descriptions
Screening Process : reject any theorem for
matching that has low likelihood of success
The Similarity Test : 1) the maximum numbers of
levels , 2) the number of distinct variables, 3) the
number of variable places
Percent
Percent
table 1.
similar of
matched
Method
Theorems
considered
Substitution
11,298
Detachment
1,591
Chain. forward
869
Chain. backward
673
Theorems
similar
Theorems
matched
993
406
200
146
37
210
120
63
theorems
considered
8.8
25.5
23.0
21.7
of theorems
similar
3.7
51.7
60.0
43.2
Similarity Tests and Descriptions (cont.)
similarity test의 문제점
: type II error ex) p implies (p or p) p implies (q or p)
: type I error
: the cost of similarity test
Effort in LT
measuring efforts
: the total number of primitives executed
computing effort and performance
: substitution vs detach. vs chain. ( 1: 2 : 3)
Similarity Tests and Descriptions (cont.)
- precomputed, recomputed description
- the effort required is proportional to the number of
theorems considered.
- the number of theorems considered is an effort measure for
evaluating a heuristic
Evaluation of the Similarity Test
: the full similarity vs the modified similarity vs no similarity
: when precomputed, full : no (10 to 60 perc. 3534/5206)
when recomputed, (26,739/22,914)
: modified similarity test (18,035/22,914)
Subproblems
figure 6. Distribution of LT’s proofs by effort
the limitation of LT
: a “plateau” require quite different heuristics
The Subproblem Tree
The Subproblem Tree
theorem 2.17 : the proof that cost LT 89,000 primitives.
(not-q implies not-p) implies (p implies q)
(theorem 2.17)
1. A implies not-not-A
2. p implies not-not-p
3. (A implies B) implies [(B implies C) implies (A implies C)] (theorem
2.06)
4. (p implies not-not-p) implies [(not-not-p implies q) implies (p implies q)
5. (not-not-p implies q) implies (p implies q)
(det. 4 from 3)
6. (not-A implies B) implies (not-B implies A)
7. (not-q implies not-p) implies (not-not-p implies q)
8. (not-q implies not-p) implies (p implies q)
(chain 7 and 5)
The Subproblem Tree
step 3
step 4
: the difference in total effort can be attributed largely to the
difference in number of subproblems generated
: an algorithmic procedure to govern its generation of
subproblem, an algorithm to determine the order to try
Modification of the Logic Theory Machine
1. The unit cost of processing subproblems can be reduced.
2. LT can be modified to screen subproblems before they are
put on the subproblem list.
3. Another way is to reduce selectively the number of
subproblems generated, by limiting the lists for theorems
available for generating subproblems.
Cf) Fig. 7. A list fo 20 theorems, fig. 8. 5 theorems
ex) theorem (2.48)
two condition: all theorems vs axiom + one theorem
the longer list : in two steps, 51,450 primitives of effort
the shorter list : in tree step, 18,558 primitives
Conclusion
- Heuristics give the program power to solve
problems in reasonable computing time
- The limitations of the present program of LT
- Some of directions that improvement would have
to take to extend it powers to problems at new
levels of difficulty
The Definitions of the Sentential Calculus
variables p, q, r, … A, B, C …
connectives not, or, implies
expression “not-p”, “p or q”, “p implies q”
axiom the universally true expressions
theorem
the true expressions that are derived from axioms
by means of various rules of inference
- rules of inference formalization of logical
operation ( e.g. deduction)
-
the system of axioms
( p or p ) implies p
(1.2)
p implies ( q or p )
( p or q ) implies ( q or p )
[ p or ( q or r ) ] implies [ q or ( p or r ) ]
( p implies q ) implies [ ( r or p ) implies ( r or q ) ]
(1.3)
(1.4)
(1.5)
(1.6)
three rules of inference
1. Substitution : “ A” for “ p ” , “ p or q ” for “ p ”
2. Replacement : “ p implies q ” with “ not-p or q ”
3. Detachment : if “ A ” and “ A implies B ”, then “ B”
(
p implies not-p) implies not-p
1. ( A or A ) implies A
2. ( not-A or not-A ) implies not-A
3. ( A implies not-A ) implies not-A
4. ( p implies not-p ) implies not-p
: LT works for about 10 seconds
(theorem 2.01)
(axiom 1.2)
not ( p or q ) implies not-p
(theorem 2.45)
1. A implies (A or B )
(theorem 2.2)
2. p implies ( p or q )
3. (A implies B ) implies (not-B implies not-A) (theorem 2.16)
4. [ p implies ( p or q )] implies [ not (p or q) implies not-p)
5. not ( p or q ) implies not-p
works for about 12 minutes and success
[
p or ( q or r ) ] implies [ ( p or q ) or r ] (theorem 2.31)
works for about 23 minutes and fails to prove