Chapter 5 Information Processing and Utilization Section 3 Theorem Proving.

Chapter 5 Information Processing and Utilization

Section 3 Theorem Proving

1. Terminology

1) Atom: A proposition or predicate that cannot be decomposed into other propositions or predicates is an atom.
2) Literal: An atom or a negated atom is called a literal.
3) Clause: A number of literals connected only by disjunction symbols is called a clause.
4) Term: Constants, variables, and functions are called terms.
5) Well-formed formula (wff): Any legal expression or formula is called a wff.
6) An interpretation of a formula is an assignment of a truth value to every atom of the formula. A formula containing n distinct atoms has 2^n distinct interpretations. Under each interpretation, a formula evaluates to either true or false.
7) An interpretation is said to satisfy a formula iff it makes the formula true.
8) A formula is valid iff it is true under all its interpretations.
9) A formula is inconsistent iff it is false under all its interpretations.
10) A formula is consistent iff it is not inconsistent. A consistent formula is true under at least one interpretation.
11) Formula G is said to be a logical consequence of formulas F1, …, Fn iff every interpretation that satisfies (F1 ∧ F2 ∧ ... ∧ Fn) also satisfies G.
12) Rules of inference are operations in the logic that can be applied to certain wffs and sets of wffs to produce new wffs.
    Modus Ponens: from P(x) and P(x) → Q(x), infer Q(x).
    Universal Specialization: from (∀x)P(x) and a constant A, infer P(A).
13) Theorems: wffs derived as logical consequences of given wffs by applying inference rules.
14) Proof of a theorem: the sequence of inference-rule applications used to derive the new wff.
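The definitions of interpretation, validity, inconsistency, and logical consequence can be checked mechanically for small propositional formulas by enumerating all 2^n interpretations. The following is a minimal sketch, not part of the original notes; the function names (interpretations, classify, is_consequence, implies) are illustrative assumptions, and formulas are represented as Python functions of an interpretation.

```python
from itertools import product

def interpretations(atoms):
    """Yield every assignment of truth values to the atoms (2**n in total)."""
    for values in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def classify(formula, atoms):
    """Classify a formula as 'valid', 'inconsistent', or 'consistent'.

    'consistent' here means consistent but not valid; a valid formula
    is of course also consistent.
    """
    results = [formula(i) for i in interpretations(atoms)]
    if all(results):
        return "valid"
    if not any(results):
        return "inconsistent"
    return "consistent"

def is_consequence(premises, goal, atoms):
    """G is a logical consequence of F1..Fn iff every interpretation
    that satisfies all the premises also satisfies G."""
    return all(goal(i) for i in interpretations(atoms)
               if all(p(i) for p in premises))

def implies(a, b):
    """Material implication a -> b."""
    return (not a) or b

# Modus Ponens as a logical consequence: from p and p -> q, q follows.
atoms = ["p", "q"]
premises = [lambda i: i["p"], lambda i: implies(i["p"], i["q"])]
goal = lambda i: i["q"]

print(len(list(interpretations(atoms))))                  # 4
print(classify(lambda i: i["p"] or not i["p"], ["p"]))    # valid
print(is_consequence(premises, goal, atoms))              # True
```

Enumeration is exponential in the number of atoms, which is one motivation for the resolution procedure described later in this section.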

2. Preliminary Knowledge

(a) Unification: the process of finding a substitution s, a set of term-for-variable pairs {t/v}, that makes expressions identical.

A substitution instance of an expression E is obtained by substituting terms for variables in that expression, and is denoted by Es.

A set of expressions {Ei} is said to be unifiable if there exists a substitution s such that E1s = E2s = … = Ens; s is then said to be a unifier of {Ei}.

The most general (simplest) unifier of {Ei} is denoted by mgu.
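The unification process can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the algorithm as given in the course: variables are lowercase strings (x, y, z), constants are uppercase strings (A, B), and a compound expression such as P(x, f(y)) is the tuple ("P", "x", ("f", "y")). The helper names (is_var, walk, occurs, unify, substitute) are hypothetical.

```python
def is_var(t):
    """Assumption of this sketch: variables are lowercase strings."""
    return isinstance(t, str) and t[:1].islower()

def walk(t, s):
    """Follow bindings in s until t is a non-variable or an unbound variable."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    """Occurs check: does variable v appear inside term t under s?"""
    t = walk(t, s)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, s) for a in t[1:])
    return False

def unify(a, b, s=None):
    """Return a most general unifier of a and b (a dict var -> term), or None."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None  # distinct constants or mismatched function symbols

def substitute(t, s):
    """Return the substitution instance ts."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, s) for a in t[1:])
    return t

e1 = ("P", "x", ("f", "y"), "B")
e2 = ("P", "A", ("f", "z"), "w")
s = unify(e1, e2)
print(substitute(e1, s) == substitute(e2, s))  # True: E1s = E2s
```

Here the unifier binds x to A, y to z, and w to B, so both expressions become P(A, f(z), B). Unifying x with f(x) fails because of the occurs check.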

(b) Process for Conversion of Wff to Clause Form

(1) Eliminate implication symbols
(2) Reduce scopes of negation symbols
(3) Standardize variables
(4) Eliminate existential quantifiers
(5) Convert to prenex form
(6) Put matrix in conjunctive normal form
(7) Eliminate universal quantifiers
(8) Eliminate conjunction symbols
(9) Rename variables

3. Resolution Principle (RP)

(a) Concept: RP is a procedure that produces proofs by refutation. To prove a statement, RP attempts to show that the negation of the statement produces a contradiction with the known statements.

(b) RP in Propositional Logic

Given premises S and a conclusion G to be proved.
(1) Convert all the propositions of S to clause form.
(2) Negate G and convert the result to clause form. Add it to the set of clauses obtained in (1).
(3) Repeat until either a contradiction is found or no progress can be made:
    (a) Select two clauses c_i and c_j, the parent clauses.
    (b) Resolve c_i and c_j. The resulting clause, called the resolvent r_ij, is the disjunction of all the literals of both parent clauses, with the following exception: if there is a pair of literals L and ¬L such that one of the parent clauses contains L and the other contains ¬L, then delete L and ¬L from the resolvent.
    (c) If the resolvent is the empty clause, NIL, then a contradiction has been found. If it is not, add it to the set of clauses available to the procedure.

Example 1:

Parent Clauses                        Resolvent
p ∨ l ∨ m ∨ …, ¬p ∨ n ∨ o ∨ …         l ∨ m ∨ n ∨ o ∨ …
p, ¬p ∨ q                             q (Modus Ponens)
p ∨ q, ¬p ∨ q                         q ∨ q = q (Merge)
p ∨ q, ¬p ∨ ¬q                        q ∨ ¬q, or p ∨ ¬p (Tautology)
p, ¬p                                 NIL (Empty)
¬p ∨ q, ¬q ∨ r                        ¬p ∨ r (Chaining)

Example 2:

Given Premises          Clause Form
p                       p                  (1)
(p ∧ q) → r             ¬p ∨ ¬q ∨ r        (2)
(s ∨ t) → q             ¬s ∨ q             (3)
                        ¬t ∨ q             (4)
t                       t                  (5)
Conclusion: r
Negated Goal            ¬r                 (6)

RP:
(2, 6)    ¬p ∨ ¬q    (7)
(1, 7)    ¬q         (8)
(4, 8)    ¬t         (9)
(5, 9)    NIL        (10)

NIL signals a contradiction among the premises and the negated conclusion.

The premises are known to be true. Therefore the false component must be the negated conclusion. In other words, the conclusion is a logical consequence of the premises.
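The refutation loop for propositional logic can be sketched as follows and run on the clauses of Example 2. This is a simple saturation strategy chosen for illustration, not a claim about how the course selects parent clauses; literals are strings with "~" marking negation, clauses are frozensets, and the names negate, resolve, and refute are assumptions of the sketch.

```python
from itertools import combinations

def negate(lit):
    """Return the complementary literal; '~' marks negation."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(ci, cj):
    """Return all resolvents of two parent clauses (frozensets of literals)."""
    resolvents = []
    for lit in ci:
        if negate(lit) in cj:
            # Disjunction of both parents with the complementary pair deleted.
            resolvents.append(frozenset((ci - {lit}) | (cj - {negate(lit)})))
    return resolvents

def refute(clauses):
    """Return True if NIL (the empty clause) is derivable, False if the
    clause set saturates first (no progress can be made)."""
    known = set(clauses)
    while True:
        new = set()
        for ci, cj in combinations(known, 2):
            for r in resolve(ci, cj):
                if not r:          # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= known:           # nothing new: no progress possible
            return False
        known |= new

# Clauses (1)-(5) of Example 2, followed by the negated goal (6).
premises = [
    frozenset({"p"}),
    frozenset({"~p", "~q", "r"}),
    frozenset({"~s", "q"}),
    frozenset({"~t", "q"}),
    frozenset({"t"}),
]
negated_goal = frozenset({"~r"})

print(refute(premises + [negated_goal]))  # True: r follows from the premises
print(refute(premises))                   # False: the premises alone are consistent
```

Because a propositional clause set has finitely many literals, the set of derivable clauses is finite and the loop always terminates. In predicate logic this guarantee is lost, which is why the predicate-logic procedure also stops after a predetermined amount of effort.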

4. RP in Predicate Logic

Given: a set of premises S and a conclusion G to be proved.
(1) Convert S to clause form.
(2) Negate G and convert the result to clause form. Add it to the set of clauses obtained in (1).
(3) Repeat until a contradiction is found, no progress can be made, or a predetermined amount of effort has been expended:
    (a) Select two clauses c_i(x) and c_j(x), the parent clauses.
    (b) Resolve c_i(x) and c_j(x):
        (i) If there is P(x) in c_i(x) and ¬P(x) in c_j(x), the resolvent is the disjunction of c_i(x) and c_j(x) with P(x) and ¬P(x) removed.
        (ii) If there is a pair of literals L_i(x) and ¬L_j(x) such that one of the parent clauses contains L_i(x) and the other contains ¬L_j(x), and if L_i(x) and L_j(x) are unifiable, then make them identical by unification; the resolvent is the disjunction of c_i(x) and c_j(x) with the appropriate substitution performed and with L_i(x) and ¬L_j(x) removed.
    (c) If the resolvent is NIL, a contradiction has been found. If not, add it to the set of clauses available to the procedure.

Example 1

S: Whoever can read is literate.          (∀x)(R(x) → L(x))
   Dolphins are not literate.             (∀x)(D(x) → ¬L(x))
   Some dolphins are intelligent.         (∃x)(D(x) ∧ I(x))
G: Some who are intelligent cannot read.  (∃x)(I(x) ∧ ¬R(x))

Proof:
S: 1.  ¬R(x) ∨ L(x)             (Premise)
   2.  ¬D(y) ∨ ¬L(y)            (Premise)
   3a. D(A)                     (Premise)
   3b. I(A)                     (Premise)
G: 4.  ¬I(z) ∨ R(z)             (Negated Conclusion)
   5.  R(A)         {A/z}       (3b, 4; R)
   6.  L(A)         {A/x}       (5, 1; R)
   7.  ¬D(A)        {A/y}       (6, 2; R)
   8.  NIL                      (7, 3a; R)

Example 2

S: Men are mortal.        (∀x)(MAN(x) → MORTAL(x))
   Socrates is a man.     MAN(SOCRATES)
G: Socrates is mortal.    MORTAL(SOCRATES)

Proof (S abbreviates SOCRATES):
S: 1. ¬MAN(x) ∨ MORTAL(x)       (Premise)
   2. MAN(S)                    (Premise)
G: 3. ¬MORTAL(S)                (Negated Conclusion)
   4. ¬MAN(S)       {S/x}       (1, 3; R)
   5. NIL                       (2, 4; R)

Refutation Tree:
   ¬MORTAL(S) and ¬MAN(x) ∨ MORTAL(x) resolve under {S/x} to ¬MAN(S).
   ¬MAN(S) and MAN(S) resolve to NIL.

Example 3

Theorem: The inner alternate angles of a trapezoid are equal.

Symbols:
1. T(x,y,u,v) denotes a trapezoid xyuv
2. P(x,y,u,v): xy // uv
3. E(x,y,v,u,v,y): ∠xyv = ∠uvy

[Figure: trapezoid with vertices a (x), b (y), c (u), d (v)]

S: Premises
   1. (∀x)(∀y)(∀u)(∀v)(T(x,y,u,v) → P(x,y,u,v))
   2. (∀x)(∀y)(∀u)(∀v)(P(x,y,u,v) → E(x,y,v,u,v,y))
   3. T(a,b,c,d)
G: Theorem
   4. E(a,b,d,c,d,b)

Proof:
1. ¬T(x,y,u,v) ∨ P(x,y,u,v)          (Premise)
2. ¬P(x,y,u,v) ∨ E(x,y,v,u,v,y)      (Premise)
3. T(a,b,c,d)                        (Premise)
4. ¬E(a,b,d,c,d,b)                   (Negated Conclusion)
5. ¬P(a,b,c,d)    {a/x, b/y, c/u, d/v}    (2, 4)
6. ¬T(a,b,c,d)    {a/x, b/y, c/u, d/v}    (1, 5)
7. NIL                                    (3, 6)

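6. Answer Extraction System -- A Modified Version

Example:
S: Fido goes wherever John goes:   (∀x)(AT(J, x) → AT(F, x))
   John is at school:              AT(J, S)
G: Where is Fido?                  (∃x)AT(F, x)

A Refutation Tree Approach:
   ¬AT(F, x) (the negated goal, ¬G) and ¬AT(J, y) ∨ AT(F, y) resolve under {x/y} to ¬AT(J, x).
   ¬AT(J, x) and AT(J, S) resolve under {S/x} to NIL.

A Proof Tree Approach: replace the root ¬G by the tautology ¬G ∨ G and repeat the same resolutions.
   ¬AT(F, x) ∨ AT(F, x) and ¬AT(J, y) ∨ AT(F, y) resolve under {x/y} to ¬AT(J, x) ∨ AT(F, x).
   ¬AT(J, x) ∨ AT(F, x) and AT(J, S) resolve under {S/x} to AT(F, S).

The Answer: AT(F, S), i.e., Fido is at school.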

Chapter 5 Information Processing & Utilization

Section 4 Rule-Based Deduction Systems

1. Introduction

Rule-based Deduction Systems do not convert wffs to clause form, because clause form loses the information carried by the direction of an implication. A single clause corresponds to several different implications:

A ∨ B ∨ C = (¬A ∧ ¬B) → C = (¬A ∧ ¬C) → B = …

Wffs representing assertion knowledge are separated into two categories: (1) The rules expressed in implication form; (2) The facts expressed in AND/OR form.

The task of the production system here is to prove a goal from these facts and rules.

2. A Forward Deduction System

(1) Obtaining AND/OR Form from Arbitrary Forms
-- Eliminate implication symbols;
-- Minimize the scope of negation symbols;
-- Skolemize;
   * Variables within the scopes of universal quantifiers are standardized by renaming: variables in different conjuncts have different names;
   * Existentially quantified variables are replaced by Skolem functions;
   * The universal quantifiers are dropped;
   * Any remaining variables are assumed to have universal quantification.

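Example: Given the wff below:
   (∃u)(∀v){Q(v,u) ∧ ¬[[R(v) ∨ P(v)] ∧ S(u,v)]}
Reduce the scope of negation:
   (∃u)(∀v){Q(v,u) ∧ [[¬R(v) ∧ ¬P(v)] ∨ ¬S(u,v)]}
Skolemize u by the constant A:
   (∀v){Q(v,A) ∧ [[¬R(v) ∧ ¬P(v)] ∨ ¬S(A,v)]}
Drop the universal quantifier:
   Q(v,A) ∧ [[¬R(v) ∧ ¬P(v)] ∨ ¬S(A,v)]
Fact form (variables renamed apart in the main conjuncts):
   Q(w,A) ∧ [[¬R(v) ∧ ¬P(v)] ∨ ¬S(A,v)]

AND/OR graph of the fact (indentation shows descendants):

Q(w,A) ∧ {[¬R(v) ∧ ¬P(v)] ∨ ¬S(A,v)}          (Root)
   Q(w,A)                                      (Leaf)
   [¬R(v) ∧ ¬P(v)] ∨ ¬S(A,v)
      ¬R(v) ∧ ¬P(v)
         ¬R(v)                                 (Leaf)
         ¬P(v)                                 (Leaf)
      ¬S(A,v)                                  (Leaf)

The two disjuncts ¬R(v) ∧ ¬P(v) and ¬S(A,v) are joined to their parent by a single 2-connector; each conjunct is attached by its own 1-connector.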

An interesting property of the AND/OR graph representation of a wff is that the set of clauses (into which that wff could have been converted) can be read out as the set of solution graphs (terminating in leaf nodes) of the AND/OR graph.

Thus the clauses that result from the fact wff above are:

Q(w,A),   ¬S(A,v) ∨ ¬R(v),   ¬S(A,v) ∨ ¬P(v)

Each clause is obtained as the disjunction of the literals at the leaf nodes of one of the solution graphs.
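Reading the clauses off an AND/OR expression can be sketched recursively: a conjunction contributes the clauses of each conjunct separately, while a disjunction combines one clause from every disjunct. The representation below (nested tuples tagged "and" or "or", literals as strings with "~" for negation) and the function name clauses are assumptions made for this illustration.

```python
from itertools import product

def clauses(expr):
    """Return the set of clauses (frozensets of literals) represented by an
    AND/OR expression. expr is a literal string, ("and", e1, e2, ...) or
    ("or", e1, e2, ...)."""
    if isinstance(expr, str):
        return {frozenset([expr])}
    op, *parts = expr
    sub = [clauses(p) for p in parts]
    if op == "and":
        # Each conjunct can be followed on its own: pool the clauses.
        return set().union(*sub)
    if op == "or":
        # A solution graph must include every disjunct: pick one clause
        # from each and merge their literals.
        return {frozenset().union(*choice) for choice in product(*sub)}
    raise ValueError(f"unknown connective: {op}")

# The fact Q(w,A) AND {[~R(v) AND ~P(v)] OR ~S(A,v)} from the example above.
fact = ("and", "Q(w,A)",
        ("or", ("and", "~R(v)", "~P(v)"), "~S(A,v)"))

for c in sorted(clauses(fact), key=sorted):
    print(" v ".join(sorted(c)))
```

For this fact the function returns three clauses: Q(w,A), ¬S(A,v) ∨ ¬R(v), and ¬S(A,v) ∨ ¬P(v), one per solution graph.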

(2) The Rule Expressions and Rule Application

(a) General Remarks on Rule Expressions
-- Rules are based on the implication wffs that represent general assertion knowledge about a problem domain. They are applied to the global database (an AND/OR graph structure) to produce a new database.
-- The simplest form of rule is L → W, where L is a single literal and W is an arbitrary wff in AND/OR form.

A rule (L1 ∨ L2) → W can be expressed as the two rules L1 → W and L2 → W, since
(L1 ∨ L2) → W = ¬(L1 ∨ L2) ∨ W
              = (¬L1 ∧ ¬L2) ∨ W
              = (¬L1 ∨ W) ∧ (¬L2 ∨ W)
              = (L1 → W) ∧ (L2 → W)

Variables in the implication are assumed to have universal quantification over the entire implication; existentially quantified variables have been Skolemized.

Variables in facts and rules are standardized apart so that no variable occurs in more than one rule and so that the rule variables are different from the fact variables.

Any implication with a single-literal antecedent, regardless of its quantification, can be put in a form in which the scope of quantification is the entire implication by a process as follows:
-- Replace L → W by ¬L ∨ W;
-- Skolemize all existential variables.

E.g., (∀x){[(∃y)(∀z)P(x,y,z)] → (∀u)Q(x,u)} can be transformed through the following steps:

(i) Eliminate (temporarily) the implication symbol:
    (∀x){¬[(∃y)(∀z)P(x,y,z)] ∨ (∀u)Q(x,u)}
(ii) Reduce the scope of the negation symbol:
    (∀x){(∀y)(∃z)[¬P(x,y,z)] ∨ (∀u)Q(x,u)}
(iii) Skolemize:
    (∀x){(∀y)[¬P(x,y,f(x,y))] ∨ (∀u)Q(x,u)}
(iv) Move all ∀s to the front and drop them:
    ¬P(x,y,f(x,y)) ∨ Q(x,u)
(v) Restore the implication:
    P(x,y,f(x,y)) → Q(x,u)

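(b) Rule Application in Propositional Logic

A rule of the form L → W can be applied to an AND/OR graph that has a leaf node, n, labeled by the literal L. The result is a new AND/OR graph in which node n has an outgoing 1-connector (a match arc) to a descendant node, also labeled by L, which is the root node of the AND/OR graph structure representing W.

E.g., the rule S → (X ∧ Y) ∨ Z is applied at the leaf node S of the graph for the fact S ∧ (T ∨ U).

[Figure: AND/OR graph of the fact S ∧ (T ∨ U) before and after the rule is applied. A match arc joins leaf node S to a second node S, the root of the graph for (X ∧ Y) ∨ Z, whose leaves are X, Y, and Z. In the general case a match arc joins leaf L to the root of the graph for W.]

From the rule L → W and the fact expression F(L), the expression F(W) can be derived by replacing all the occurrences of L in F by W. Applying the rule thus produces a new graph containing a representation of F(W).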

After a rule is applied at a leaf node, the node is no longer a leaf of the graph, but it is still labeled by a single literal and may continue to have rules applied to it. Any node labeled by a single literal is called a literal node.

The set of clauses represented by an AND/OR graph is the set that corresponds to the set of solution graphs terminating in literal nodes of the graph.

Termination of Rule Applications

-- The object of the forward production system is to prove some goal wff from fact wffs and a set of rules. Hence, whenever the goal wff is reached, the system can be terminated.
-- The forward system is limited to proving goal wffs whose form is a disjunction of literals.
-- When one of the goal literals matches a literal node, n, of the AND/OR graph, we add a new descendant of node n, labeled by the matching goal literal, to the graph. This descendant is called a goal node.

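E.g.,

Fact: A ∨ B

Rules: A → C ∧ D
       B → E ∧ G

Goal: C ∨ G

[Figure: AND/OR graph of the fact. Rule matching extends leaf A to C ∧ D and leaf B to E ∧ G; goal matching then attaches goal nodes C and G to the literal nodes C and G, giving a solution graph whose terminal nodes are all goal nodes.]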

(c) Rule Application in Predicate Logic

Fact and rule expressions are the same as above:
-- Variables are universally quantified;
-- Any existentially quantified variables are already Skolemized.

Goal wffs, however, are treated in the dual way:
-- Universally quantified variables are replaced by Skolem functions;
-- Existential quantifiers in the Skolemized goal wff can then be dropped;
-- Variables remaining in goal wffs are renamed so that the same variable does not occur in more than one disjunct of the goal wff.

Rule Application
-- A rule L → W is applicable if the AND/OR graph contains a literal node L’ that unifies with L by mgu, u;
-- Application of this rule then extends the graph by creating a match arc directed from the node L’ to a new descendant node L;
-- This descendant node L is the root node of the graph representation of Wu;
-- Label the match arc by the mgu, u.

[Figure: fact literal node L’ joined by a match arc labeled u to the rule node L, below which hangs the graph for Wu; goal matches are drawn the same way.]

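E.g.,

Fact wff: P(x, y) ∨ [Q(x,A) ∧ R(B,y)]

Rule wff: P(A,B) → S(A) ∨ X(B)

Rule match: the leaf node P(x, y) unifies with P(A,B) under the mgu {A/x, B/y}.

[Figure: AND/OR graph of the fact, with a match arc labeled {A/x, B/y} from leaf P(x, y) to node P(A,B), whose descendants are S(A) and X(B).]

After rule application, the clauses corresponding to the solution graphs are:
   S(A) ∨ X(B) ∨ Q(A,A)     (A/x)
   S(A) ∨ X(B) ∨ R(B,B)     (B/y)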

Important Remarks:
-- Any solution graph (terminating in literal nodes) can have more than one match arc.

-- In computing the set of clauses represented by an AND/OR graph containing several match arcs, only those solution graphs terminating in literal nodes that have consistent match arc substitutions are counted.

-- The clause represented by a consistent solution graph is obtained by applying a special substitution, called the unifying composition, to the disjunction of the literals labeling its terminal nodes.

-- The definition of consistent substitutions: Given a set of substitutions {u_1, …, u_i, …, u_n}, where u_i = {t_i1/v_i1, …, t_im(i)/v_im(i)}, i = 1, …, n, let

U_1 = (v_11, …, v_1m(1), …, v_n1, …, v_nm(n))
U_2 = (t_11, …, t_1m(1), …, t_n1, …, t_nm(n))

The substitutions (u_1, …, u_n) are called consistent iff U_1 and U_2 are unifiable.

The unifying composition, u, of (u_1, …, u_n) is the mgu of U_1 and U_2.

Examples of unifying compositions of substitutions:

u_1                      u_2                u
{A/x}                    {B/x}              inconsistent
{x/y}                    {y/z}              {x/y, x/z}
{f(z)/x}                 {f(A)/x}           {f(A)/x, A/z}
{x/y, x/z}               {A/z}              {A/x, A/y, A/z}
{s}                      { }                {s}
{g(y)/x}                 {f(x)/y}           inconsistent
{f(g(x))/z, f(y)/w}      {w/z, g(x)/y}      {f(g(x))/z, f(g(x))/w, g(x)/y}
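The consistency test and the unifying composition can be sketched directly from the definition: collect all variables into U_1 and all terms into U_2, then unify the two tuples. The code below is an illustrative sketch, self-contained for convenience; it assumes lowercase strings for variables, uppercase strings for constants, tuples such as ("f", "z") for f(z), and a substitution {t/v} written as the dict {v: t}. All function names are hypothetical.

```python
def is_var(t):
    """Assumption of this sketch: variables are lowercase strings."""
    return isinstance(t, str) and t[:1].islower()

def walk(t, s):
    """Follow variable bindings in s as far as possible."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    """Occurs check for variable v inside term t under s."""
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s=None):
    """Return an mgu of a and b as a dict, or None if not unifiable."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def substitute(t, s):
    """Apply s to t, resolving chains of bindings."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, s) for a in t[1:])
    return t

def unifying_composition(substitutions):
    """Return the unifying composition of a list of substitutions
    (dicts variable -> term), or None if they are inconsistent."""
    variables, terms = [], []
    for u in substitutions:
        for v, t in u.items():
            variables.append(v)   # builds U_1
            terms.append(t)       # builds U_2
    s = unify(("U",) + tuple(variables), ("U",) + tuple(terms))
    if s is None:
        return None
    return {v: substitute(v, s) for v in s}

print(unifying_composition([{"x": "A"}, {"x": "B"}]))   # None (inconsistent)
print(unifying_composition([{"y": "x"}, {"z": "y"}]))   # {'y': 'x', 'z': 'x'}
```

The second call reproduces the table row {x/y}, {y/z} with result {x/y, x/z}; the remaining rows can be checked the same way.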

A solution graph must have a set of consistent match arc substitutions in order for its corresponding clauses to be ones that can be inferred from the original fact and rules.

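Fact: P(x) ∨ Q(x)

R1: P(A) → R(A)

R2: Q(B) → R(B)

[Figure: AND/OR graph of the fact P(x) ∨ Q(x). Leaf P(x) is matched to P(A) under {A/x} and R1 yields R(A); leaf Q(x) is matched to Q(B) under {B/x} and R2 yields R(B).]

The substitutions {A/x} and {B/x} employed in the graph are inconsistent, so the clause R(A) ∨ R(B) is not one of those that can be inferred from the graph.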

If the same rule is applied more than once, then each application must use renamed variables.

When a goal literal, L, unifies with a literal L’ labeling a literal node, n, of a graph, we can add a match arc (labeled by the mgu) directed from node n to a new descendant goal node labeled by L.

The same goal literal can also be used a number of times, but each use must employ renamed variables.

Termination

The process of extending the AND/OR graph by applying rules or by using goal literals can successfully terminate when a consistent solution graph is produced having goal nodes for all of its terminal nodes.

The process has then proved the goal (sub)disjunction obtained by applying the unifying composition of the final solution graph to the disjunction of the literals labeling the goal nodes in the solution graph.

Example

Facts: Fido barks and bites, or Fido is not a dog.
Rules: All terriers are dogs. Anyone who barks is noisy.
Conclusion: There exists someone who is not a terrier or who is noisy.

Facts:  [BARKS(F) ∧ BITES(F)] ∨ ¬DOG(F)
Rule 1: (∀x)(TERRIER(x) → DOG(x))
        (∀x)(¬DOG(x) → ¬TERRIER(x))      (contrapositive)
        ¬DOG(x) → ¬TERRIER(x)
Rule 2: (∀y)(BARKS(y) → NOISY(y))
        BARKS(y) → NOISY(y)
G:      (∃z)(¬TERRIER(z) ∨ NOISY(z))
        ¬TERRIER(z) ∨ NOISY(z)            (by the dual process for goals)

Solution graph (match arcs are labeled by their mgu):

[BARKS(F) ∧ BITES(F)] ∨ ¬DOG(F)
   BARKS(F) ∧ BITES(F)
      BARKS(F)  --{F/y}-->  BARKS(y)  --R2-->  NOISY(F)  --{F/z}-->  goal NOISY(z)
      BITES(F)
   ¬DOG(F)  --{F/x}-->  ¬DOG(x)  --R1-->  ¬TERRIER(F)  --{F/z}-->  goal ¬TERRIER(z)

Note that {F/x, F/y, F/z} is the unifying composition of the substitutions employed in the graph.

Applying this composition to the goal literals used in the solution graph yields

¬TERRIER(F) ∨ NOISY(F)

which is the instance of the goal wff that the system has proved.

3. Backward Systems (skipped over)

4. A Combination: F-B Systems (skipped over)

Chapter 6

Expert Systems

1. Introduction: What Is An Expert System?

An expert system belongs to a special category of computer systems able to perform sophisticated functions that otherwise only human experts can perform.

Expert systems differ substantially from conventional computer programs in several important respects:
-- Their tasks have no algorithmic solutions.
-- They often must draw conclusions from uncertain or incomplete information.
-- In a conventional computer program, knowledge pertaining to the problem and the methods for using this knowledge are intertwined, so it is very difficult to change the program. In an expert system, there is a clear separation of general knowledge about the problem (the knowledge base) from information about the current problem (the input data) and from the methods (the inference engine) for applying the general knowledge to the problem. So an expert system can be changed by modifying only the knowledge base.
-- Symbolic processing is emphasized in expert systems, instead of data processing as in conventional programs.
-- Processing in expert systems is highly interactive.
-- Mid-run explanations are easily provided by expert systems.

General configuration of expert systems

[Figure: block diagram with the components Knowledge Acquisition, Knowledge Processing (Inference Engine), Knowledge Representation (Bases), and Interpretation & Inquiry, together with the two human roles Expert and User.]

2. Representative Example of Expert Systems

DENDRAL, E. A. Feigenbaum et al., Stanford U, 1968: Spectroscopic Analysis -- Molecular Structure
MACSYMA, C. Engelman et al., MIT, 1974: Mathematical Calculus
MYCIN, E. H. Shortliffe et al., Stanford U, 1974: Bacterial Infection Diagnosis
AM, D. B. Lenat et al., Stanford: Inductive Inference
PROSPECTOR, R. O. Duda et al., Stanford: Geology
HEARSAY, Carnegie-Mellon U, 1960s: NLP
XCON, DEC & Carnegie-Mellon, 1980: Computer Configuration
…

3. Example: MYCIN

First large expert system to perform at human expert level, first expert system to solve real-world problems instead of “toy problems”, and passed Turing Test.

(1) General Description
-- Functions:
   (1) Report whether the user is a bacterial infection patient;
   (2) Give the infection hypothesis;
   (3) Recommend antibiotic therapy;
   (4) Give the proper prescription.
-- Knowledge Acquisition:
   (1) Built in by knowledge engineers;
   (2) Learned during man-machine interaction.
-- Knowledge Representation: Production System

(2) Knowledge Bases
   (1) Static knowledge base, which includes
       (a) a dictionary base of 800 English words, used for dialog between MYCIN and the doctor or patient;
       (b) an index base that simplifies access to the stored knowledge;
       (c) knowledge categories.
   (2) Dynamic knowledge base, which includes
       (a) new data given by the patient;
       (b) intermediate conclusions;
       (c) some records kept during operation.

(3) Rule Base

The rule base includes 200 production rules, e.g.,

If (i) the stain of the organism is gram-positive, and
   (ii) the morphology of the organism is coccus, and
   (iii) the growth conformation of the organism is clumps
Then there is suggestive evidence (.7) that the identity of the organism is staphylococcus.

(4) Knowledge Processing

MYCIN reasons backward from its top-level goal (determining whether there are disease-causing organisms that must be treated) to clinical observations.

To solve the top-level diagnostic goal, it looks for rules whose right sides suggest disease. It then uses the left sides of those rules to set up subgoals whose success would enable the rules to be invoked. These subgoals are again matched against rules, and their left sides are used to set up additional subgoals. Whenever a left side describes a specific piece of clinical evidence, MYCIN uses that evidence if it already has access to it. Otherwise, it asks the user to provide the information.
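The backward-chaining control strategy described above can be sketched as follows. This is a deliberately simplified illustration, not MYCIN's actual implementation: it ignores certainty factors, treats every assertion as plain true or false, and assumes the rules contain no cycles. The names prove, rules, known, and ask are hypothetical, and the single rule is the staphylococcus rule quoted in the Rule Base paragraph.

```python
def prove(goal, rules, known, ask):
    """Try to establish goal by reasoning backward.

    rules: list of (premises, conclusion) pairs (left side, right side).
    known: dict of assertions already established, updated in place.
    ask:   callback used to ask the user about a piece of evidence.
    """
    if goal in known:                       # evidence already available
        return known[goal]
    concluding = [r for r in rules if r[1] == goal]
    if not concluding:
        # No rule concludes this goal, so it is treated as clinical
        # evidence and the user is asked for it.
        known[goal] = ask(goal)
        return known[goal]
    # Otherwise the left sides of the matching rules become subgoals.
    result = any(all(prove(p, rules, known, ask) for p in premises)
                 for premises, _ in concluding)
    known[goal] = result
    return result

rules = [
    (("stain is gram-positive",
      "morphology is coccus",
      "growth conformation is clumps"),
     "identity is staphylococcus"),
]

# Simulated user: answers come from a table, and each question is recorded.
answers = {"morphology is coccus": True, "growth conformation is clumps": True}
asked = []

def ask(question):
    asked.append(question)
    return answers[question]

known = {"stain is gram-positive": True}    # already on record
print(prove("identity is staphylococcus", rules, known, ask))  # True
print(asked)  # only the two facts that were not already known
```

The stain is never asked about because it is already in the database, which mirrors the behaviour described in the paragraph above.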

(5) Uncertain Reasoning

How does MYCIN combine the estimates of certainty in each of the rules to produce a final estimate of the certainty of its conclusion?

All assertions being considered have two numbers associated with them: MB (measure of belief) and MD (measure of disbelief) of a hypothesis h given evidence e:

MB(h, e) = 1                                                    if p(h) = 1
MB(h, e) = [max(p(h|e), p(h)) - p(h)] / [max(1, 0) - p(h)]      otherwise

MD(h, e) = 1                                                    if p(h) = 0
MD(h, e) = [min(p(h|e), p(h)) - p(h)] / [min(1, 0) - p(h)]      otherwise

The overall estimate of the confidence of the system in its belief about the hypothesis, the certainty factor (CF), is given by

CF(h, e) = MB(h, e) - MD(h, e)

The measures of belief and disbelief of a hypothesis given two observations s1 and s2 are computed by

MB(h, s1&s2) = 0                                        if MD(h, s1&s2) = 1
MB(h, s1&s2) = MB(h,s1) + MB(h,s2) [1 - MB(h,s1)]       otherwise

MD(h, s1&s2) = 0                                        if MB(h, s1&s2) = 1
MD(h, s1&s2) = MD(h,s1) + MD(h,s2) [1 - MD(h,s1)]       otherwise

Certainty factor of a combination of hypotheses:

MB(h1 & h2, e) = min(MB(h1,e), MB(h2,e))
MB(h1 or h2, e) = max(MB(h1,e), MB(h2,e))
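The combination formulas can be tried out numerically with a short sketch. The function names (accumulate, combine, mb_and, mb_or) are illustrative assumptions, and one detail is a choice made here rather than something the notes specify: when both combined measures reach 1, the disbelief case is checked first.

```python
def accumulate(m1, m2):
    """Incremental combination m1 + m2 * (1 - m1), used for both MB and MD."""
    return m1 + m2 * (1.0 - m1)

def combine(mb1, md1, mb2, md2):
    """Return (MB, MD, CF) of h given two observations s1 and s2."""
    mb = accumulate(mb1, mb2)
    md = accumulate(md1, md2)
    # Special cases from the definitions above. Assumption of this
    # sketch: if both reach 1, the disbelief case takes precedence.
    if md == 1.0:
        mb = 0.0
    elif mb == 1.0:
        md = 0.0
    return mb, md, mb - md

def mb_and(mb_h1, mb_h2):
    """MB of a conjunction of hypotheses: the minimum."""
    return min(mb_h1, mb_h2)

def mb_or(mb_h1, mb_h2):
    """MB of a disjunction of hypotheses: the maximum."""
    return max(mb_h1, mb_h2)

# Two observations that each support h, with MB 0.3 and 0.2:
mb, md, cf = combine(0.3, 0.0, 0.2, 0.0)
print(round(mb, 2))   # 0.44

# One rule supports the hypothesis (MB 0.8), another opposes it (MD 0.2):
mb, md, cf = combine(0.8, 0.0, 0.0, 0.2)
print(round(cf, 2))   # 0.6
```

The second call uses the figures .8 and .2 from the MYCIN interaction transcript in this chapter, where the cumulative CF for "facultative" is reported as .6. Note that the incremental formula never lets MB exceed 1, and a second confirming observation always moves MB closer to 1.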

(6) Interaction

Is the patient’s illness with ORGANISM-1 a hospital-acquired infection?

Why? (Doctor asked)

[2.0] This will aid in determining the category of ORGANISM-1. It has already been established that
   [2.1] the gram stain of ORGANISM-1 is gramneg, and
   [2.2] the morphology of ORGANISM-1 is rod, and
   [2.3] the aerobicity of ORGANISM-1 is facultative;
therefore, if
   [2.4] the infection with ORGANISM-1 was not acquired while the patient was hospitalized,
then there is strongly suggestive evidence (.8) that the category of ORGANISM-1 is enterobacteriaceae
[also: there is weakly suggestive evidence (.1) that the identity of ORGANISM-1 is pseudomonas-aeruginosa]
[Rule 037]

How [2.3]

The following were used:
   [3.1] Rule 027 indicated there is strongly suggestive evidence (.8) that the aerobicity of ORGANISM-1 is facultative.
   [3.2] Rule 033 indicated there is weakly suggestive evidence (.2) that the aerobicity of ORGANISM-1 is not facultative.
Since this gives a cumulative CF of (.6) for facultative, the evidence indicates that the aerobicity of ORGANISM-1 is facultative.
