Empirical Explorations with The Logical Theory Machine: A Case Study in Heuristics by Allen Newell, J. C. Shaw, & H. A. Simon The Logic Theory Machine (LT) : a system for finding proofs of theorems in elementary symbolic logic : devised to understand the complex problemsolving process : the first heuristic program (1956) The Logic Theory Machine in Operation - LT’s Task is to prove theorems in the sentential calculus ( in Principia Mathematica ) - The proof is the sequence of expressions that leads from the axioms and known theorems to the desired expression e.g.) - Why are such problems difficult ? - What features of LT account for its successes and failures ? Problems, Algorithms, and Heuristics problem : if he is given some process for generating the elements of a set of possible solutions in some order, and a test for verifying whether a given element of this set is in fact a solution to his problem. problem solving : if the costs are not too large in relation to the time and computing power available for solution Problems, Algorithms, and Heuristics algorithm : the process that guarantee the solution of a given problem. ex) opening a combination safe heuristic : the process that may solve a given problem but offers no guarantees of doing so. ex) playing chess The Problem of Proving Theorems Logic - the set of all possible sequences of logic expression : E ( given implicitly by rules for generating) - the set of sequences that are proofs : P - the expression to be proved : X - the set of sequences that have X as their final expression : Tx E : all sequences of logic expressions P :proof sequence Tx : sequences ending in X Proofs of X in The difficulty of proving theorems depends on 1. The scarcity of elements in the intersection of P and Tx , relative to the number of elements in E the cost and speed of the available generators, the cost and speed of making tests (whether P or Tx) 2. Whether generators can be found that guarantee that any element they produce automatically satisfies some of the conditions. 3. What heuristics can be found to guide the selection. The British Museum Algorithm : reveal the basic nature of theorem proving the algorithm constructs all possible proofs in systematic manner, checking each time (1) to eliminate duplicates, and (2) to see if the final theorem in the proof coincides with the expression to be proved the set of n-step proofs : obtained from the set of (n-1)-step proofs by making all the permissible substitutions and replacements, detachments. Figure 2 . Number of proofs generated by first few steps of British Museum algorithm Number of proofs : only use the simple substitution and the replacements 200 : If specializations are included 11-steps, 1000 theorems 100 : among the first 246 theorems, only 5 theorem ( in chap. 2 of 0 principia), 1000 theorems, one 0 1 2 3 4 5 6 7 8 Proof steps more how many theorem to prove all theorems (60-odd) : a hundred million how many times : hundreds of thousands of years The Logic Theory Machine Methods : the major type of heuristic that LT uses The Substitution Method : by finding an axiom or proved theorem that can be transformed, by a series of substitutions for variables and replacements of connectives The Detachment Method : if problem B , search for an axiom or theorem “ A implies B ” , A is set up as a new subproblem Methods (continued) The Chaining Method : use the tansitivity of the relation of implication to create a new subproblem - forward chaining method : “ a implies c ”, search an axiom or theorem “ a implies b ”, set up “ b implies c ” as new subproblem - backward chaining method : search “ b implies c ”, set up “ a implies b ” as subproblem Methods (continued) no guarantee : a target axiom or a theorem can be found : the generated subproblem is part of the desired proof sequence, or part of any proof sequence : the beginning of the sequence can be completed with axioms or proved theorems : the combination of four methods comprise a sufficient set of methods to prove all theorems ex) “ p or not-not-not-p ” (2.13) Methods (continued) Why do the methods transform LT into an effective problem-solver ? 1. unitizing effect : the methods organize the sequences of individual processing steps into larger units 2. backward approach : only one theorem to be proved, a number of known true theorems The Executive Routine 1. For a theorem list, the substitution method is tried 2. The detachment method is tried, following the substitution, if fail, the subproblem is added to the subproblem list 3. Forward chaining, backward chaining. 4. For the subproblem list, repeat. The Executive Routine (continued) This process continues until (1) a proof is found, (2) the time allotted for finding a proof is used up, (3) there is no more available memory space, (4) no untried problems remain on the subproblem list Examples The Executive Routine (continued) not (p or q) implies not-p (A implies B) implies (not-B implies not-A) subprblems (theorem 2.16) [ p implies (p or q)] by substitution method A implies (A or B) The Matching Process What allows the efficient search ? the required times : BM algorithm vs LT ( 5~100:1) ex) “ p implies (q implies p) ” (158 sec : 10 sec) p implies (q or p) (axiom 1.3) 1 3 2 7 6 4 5 11 10 8 9 12 p implies (not-q or p) p imlies (q implies p) The Matching Process (continued) Matching in the Substitution Methods : componentwise matching - a feedback of the results of a substitution or replacement : Of the 52 theorems of Principia, 38 proofs substitution alone - 17 proof Matching in Detachment and Chaining : matching the subexpressions of the problem or theorems. Similarity Tests and Descriptions Screening Process : reject any theorem for matching that has low likelihood of success The Similarity Test : 1) the maximum numbers of levels , 2) the number of distinct variables, 3) the number of variable places Percent Percent table 1. similar of matched Method Theorems considered Substitution 11,298 Detachment 1,591 Chain. forward 869 Chain. backward 673 Theorems similar Theorems matched 993 406 200 146 37 210 120 63 theorems considered 8.8 25.5 23.0 21.7 of theorems similar 3.7 51.7 60.0 43.2 Similarity Tests and Descriptions (cont.) similarity test의 문제점 : type II error ex) p implies (p or p) p implies (q or p) : type I error : the cost of similarity test Effort in LT measuring efforts : the total number of primitives executed computing effort and performance : substitution vs detach. vs chain. ( 1: 2 : 3) Similarity Tests and Descriptions (cont.) - precomputed, recomputed description - the effort required is proportional to the number of theorems considered. - the number of theorems considered is an effort measure for evaluating a heuristic Evaluation of the Similarity Test : the full similarity vs the modified similarity vs no similarity : when precomputed, full : no (10 to 60 perc. 3534/5206) when recomputed, (26,739/22,914) : modified similarity test (18,035/22,914) Subproblems figure 6. Distribution of LT’s proofs by effort the limitation of LT : a “plateau” require quite different heuristics The Subproblem Tree The Subproblem Tree theorem 2.17 : the proof that cost LT 89,000 primitives. (not-q implies not-p) implies (p implies q) (theorem 2.17) 1. A implies not-not-A 2. p implies not-not-p 3. (A implies B) implies [(B implies C) implies (A implies C)] (theorem 2.06) 4. (p implies not-not-p) implies [(not-not-p implies q) implies (p implies q) 5. (not-not-p implies q) implies (p implies q) (det. 4 from 3) 6. (not-A implies B) implies (not-B implies A) 7. (not-q implies not-p) implies (not-not-p implies q) 8. (not-q implies not-p) implies (p implies q) (chain 7 and 5) The Subproblem Tree step 3 step 4 : the difference in total effort can be attributed largely to the difference in number of subproblems generated : an algorithmic procedure to govern its generation of subproblem, an algorithm to determine the order to try Modification of the Logic Theory Machine 1. The unit cost of processing subproblems can be reduced. 2. LT can be modified to screen subproblems before they are put on the subproblem list. 3. Another way is to reduce selectively the number of subproblems generated, by limiting the lists for theorems available for generating subproblems. Cf) Fig. 7. A list fo 20 theorems, fig. 8. 5 theorems ex) theorem (2.48) two condition: all theorems vs axiom + one theorem the longer list : in two steps, 51,450 primitives of effort the shorter list : in tree step, 18,558 primitives Conclusion - Heuristics give the program power to solve problems in reasonable computing time - The limitations of the present program of LT - Some of directions that improvement would have to take to extend it powers to problems at new levels of difficulty The Definitions of the Sentential Calculus variables p, q, r, … A, B, C … connectives not, or, implies expression “not-p”, “p or q”, “p implies q” axiom the universally true expressions theorem the true expressions that are derived from axioms by means of various rules of inference - rules of inference formalization of logical operation ( e.g. deduction) - the system of axioms ( p or p ) implies p (1.2) p implies ( q or p ) ( p or q ) implies ( q or p ) [ p or ( q or r ) ] implies [ q or ( p or r ) ] ( p implies q ) implies [ ( r or p ) implies ( r or q ) ] (1.3) (1.4) (1.5) (1.6) three rules of inference 1. Substitution : “ A” for “ p ” , “ p or q ” for “ p ” 2. Replacement : “ p implies q ” with “ not-p or q ” 3. Detachment : if “ A ” and “ A implies B ”, then “ B” ( p implies not-p) implies not-p 1. ( A or A ) implies A 2. ( not-A or not-A ) implies not-A 3. ( A implies not-A ) implies not-A 4. ( p implies not-p ) implies not-p : LT works for about 10 seconds (theorem 2.01) (axiom 1.2) not ( p or q ) implies not-p (theorem 2.45) 1. A implies (A or B ) (theorem 2.2) 2. p implies ( p or q ) 3. (A implies B ) implies (not-B implies not-A) (theorem 2.16) 4. [ p implies ( p or q )] implies [ not (p or q) implies not-p) 5. not ( p or q ) implies not-p works for about 12 minutes and success [ p or ( q or r ) ] implies [ ( p or q ) or r ] (theorem 2.31) works for about 23 minutes and fails to prove