# Resolution and First-Order Logic Inference
**Video Category:** Computer Science / Artificial Intelligence
## 0. Video Metadata
**Video Title:** CS221: Artificial Intelligence: Principles and Techniques | Autumn 2019 | Lecture 17 - Logic II
**YouTube Channel:** Stanford
**Publication Date:** Autumn 2019 (Specific date not shown in video)
**Video Duration:** ~1 hour 35 minutes
## 1. Core Summary (TL;DR)
This lecture explores the transition from Propositional Logic to First-Order Logic to enable more expressive and compact knowledge representation. It establishes the Resolution rule as a sound and complete inference procedure for propositional logic, overcoming the incompleteness of Modus Ponens, which works only on knowledge bases restricted to Horn clauses. Finally, it details the mechanical pipeline required to perform inference in First-Order Logic: conversion to Conjunctive Normal Form (CNF) using Skolemization, followed by Unification to match abstract variables with concrete terms.
## 2. Core Concepts & Frameworks
* **Propositional Logic vs. First-Order Logic (FOL):** -> **Meaning:** Propositional logic uses opaque symbols (e.g., $P, Q$) to represent entire facts, making it clunky for generalizing rules. FOL introduces internal structure using Objects (constants like `alice`), Variables (`x`), and Predicates (`Knows(x, arithmetic)`) combined with Quantifiers ($\forall, \exists$) to express relationships compactly. -> **Application:** Translating "All students know arithmetic" from natural language into a single programmatic rule rather than enumerating every student.
* **Entailment vs. Derivation:** -> **Meaning:** Entailment ($KB \models f$) is a semantic property meaning that in every possible world (model) where the Knowledge Base (KB) is true, the formula $f$ is also true. Derivation ($KB \vdash f$) is a syntactic property meaning $f$ can be generated from the KB through pure symbol manipulation using inference rules. -> **Application:** Designing inference algorithms; an algorithm is *sound* if derivation guarantees entailment, and *complete* if entailment guarantees derivability.
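Entailment can be checked directly from its definition by enumerating models. A minimal Python sketch (the `entails` helper and the Rain/Wet symbols are illustrative, not from the lecture):

```python
from itertools import product

def entails(kb, f, symbols):
    """Check KB |= f by brute-force model enumeration:
    f must be true in every model in which KB is true."""
    for values in product([False, True], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if kb(model) and not f(model):
            return False  # a model of KB where f fails: no entailment
    return True

# KB = {Rain -> Wet, Rain}; does KB entail Wet?
kb = lambda m: (not m["Rain"] or m["Wet"]) and m["Rain"]
print(entails(kb, lambda m: m["Wet"], ["Rain", "Wet"]))  # True
```

Enumeration is exponential in the number of symbols, which is precisely why syntactic derivation procedures such as resolution are of interest.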
* **Resolution Rule (Robinson, 1965):** -> **Meaning:** A single, complete inference rule for propositional logic that operates on clauses (disjunctions of literals). If you have two clauses $f_1 \lor \dots \lor f_n \lor p$ and $\neg p \lor g_1 \lor \dots \lor g_m$, you can cancel out the contradictory literals ($p$ and $\neg p$) to derive a new valid clause $f_1 \lor \dots \lor f_n \lor g_1 \lor \dots \lor g_m$. -> **Application:** Used as the engine for automated theorem provers and SAT solvers by repeatedly applying the rule to find contradictions.
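The rule itself is mechanical enough to fit in a few lines. A minimal sketch (the set-of-string-literals clause encoding is an assumption of this example, not the lecture's notation):

```python
def resolve(c1, c2):
    """Return all resolvents of two clauses. A clause is a
    frozenset of literals; a literal is a string such as
    'Snow' or its negation '~Snow'."""
    resolvents = []
    for lit in c1:
        neg = lit[1:] if lit.startswith("~") else "~" + lit
        if neg in c2:  # complementary pair: cancel it, merge the rest
            resolvents.append((c1 - {lit}) | (c2 - {neg}))
    return resolvents

# (Rain v Snow) and (~Snow v Traffic) resolve to (Rain v Traffic)
print(resolve(frozenset({"Rain", "Snow"}),
              frozenset({"~Snow", "Traffic"})))
```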
* **Conjunctive Normal Form (CNF):** -> **Meaning:** A standardized logical format where a formula is expressed strictly as an "AND of ORs" (a conjunction of disjunctive clauses). Every propositional formula can be mechanically converted into an equivalent CNF formula; in first-order logic, the Skolemized CNF preserves satisfiability rather than strict equivalence. -> **Application:** Required as a preprocessing step because the Resolution algorithm can only operate on data structured as isolated clauses.
* **Unification and Substitution:** -> **Meaning:** Substitution ($\theta$) is a mapping that replaces variables with concrete terms. Unification is the algorithmic process of finding the most general substitution that makes two different logical expressions syntactically identical. -> **Application:** Used in First-Order Logic inference to apply abstract rules to specific facts (e.g., unifying the rule `Loves(x, y)` with the fact `Loves(alice, bob)` by generating the substitution $\{x/\text{alice}, y/\text{bob}\}$).
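The classic recursive unification algorithm is also short. A minimal sketch (the tuple-based term encoding and the lowercase-variable convention are assumptions for illustration; the occurs-check is omitted):

```python
def is_var(t):
    """Convention: lowercase strings are variables ('x'), capitalized
    strings are constants ('Alice'), tuples are compound terms."""
    return isinstance(t, str) and t[0].islower()

def unify(x, y, theta):
    """Return the most general unifier extending theta, or None."""
    if theta is None:
        return None
    if x == y:
        return theta
    if is_var(x):
        return unify_var(x, y, theta)
    if is_var(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            theta = unify(xi, yi, theta)
        return theta
    return None

def unify_var(v, t, theta):
    if v in theta:
        return unify(theta[v], t, theta)
    return {**theta, v: t}

# Unify Loves(x, y) with Loves(Alice, Bob):
print(unify(('Loves', 'x', 'y'), ('Loves', 'Alice', 'Bob'), {}))
# -> {'x': 'Alice', 'y': 'Bob'}
```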
## 3. Evidence & Examples (Hyper-Specific Details)
* **CodaLab Worksheets Demonstration:** The speaker demonstrated `worksheets.codalab.org`, a platform for reproducible cloud research.
* Created a worksheet named `cs221-final-project`.
* Uploaded a bundle including `polarity.train` (408k), `polarity.test` (407k), and `textclass.py` (4.5k).
* Executed a job inside a Docker container: `python textclass.py --train polarity.train --test polarity.test`.
* Created a custom schema to parse the output logs, using `add testError /stats.json:testError` to extract the metric and display it formatted to 3 decimal places (e.g., `0.277`).
* Launched parallel experiments modifying the step size parameter (`--eta 0.1` vs `--eta 0.2`) and tracked the resulting `testError` differences side-by-side.
* Demonstrated environmental debugging by executing `free -df` to confirm the container had ~16GB of available memory.
* **Soundness of Resolution (Visual Proof):** Demonstrated using Venn diagrams representing the space of all possible models. The premise $Rain \lor Snow$ covers all areas except the "neither" region. The premise $\neg Snow \lor Traffic$ covers all areas outside the specific "Snow but no Traffic" region. Intersecting these two model spaces leaves a region that is strictly a subset of the conclusion $Rain \lor Traffic$, visually proving that the conclusion holds in all worlds where the premises hold.
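The same soundness claim can be confirmed by brute force over all eight truth assignments, a mechanical companion to the Venn-diagram argument (a sketch, not shown in the lecture):

```python
from itertools import product

# In every model where (Rain v Snow) and (~Snow v Traffic) both hold,
# verify that (Rain v Traffic) also holds.
sound = all(
    (r or t)                      # conclusion: Rain v Traffic
    for r, s, t in product([False, True], repeat=3)
    if (r or s) and (not s or t)  # both premises hold
)
print(sound)  # True
```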
* **CNF Conversion Example (Propositional):** The formula $(Summer \rightarrow Snow) \rightarrow Bizarre$ was manually converted (see the verification sketch after these steps):
1. Eliminate implication: $\neg(\neg Summer \lor Snow) \lor Bizarre$
2. Push negation inwards (De Morgan's laws): $(Summer \land \neg Snow) \lor Bizarre$
3. Distribute OR over AND: $(Summer \lor Bizarre) \land (\neg Snow \lor Bizarre)$
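The result can be double-checked with SymPy's `to_cnf` (a verification sketch; SymPy is not part of the lecture):

```python
from sympy import symbols
from sympy.logic.boolalg import Implies, to_cnf

Summer, Snow, Bizarre = symbols('Summer Snow Bizarre')
print(to_cnf(Implies(Implies(Summer, Snow), Bizarre)))
# e.g. (Bizarre | Summer) & (Bizarre | ~Snow)
```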
* **First-Order Logic Representation Examples:**
* Natural Language: "Every even integer greater than 2 is the sum of two primes."
* FOL Translation (Goldbach's Conjecture): $\forall x \, (EvenInt(x) \land Greater(x, 2) \rightarrow \exists y \, \exists z \, (Equals(x, Sum(y, z)) \land Prime(y) \land Prime(z)))$
* **First-Order Logic CNF Conversion Example:** "Everyone who loves all animals is loved by someone." -> $\forall x \, (\forall y \, Animal(y) \rightarrow Loves(x,y)) \rightarrow \exists y \, Loves(y,x)$
1. Eliminate implication: $\forall x \, \neg (\forall y \, \neg Animal(y) \lor Loves(x,y)) \lor \exists y \, Loves(y,x)$
2. Push negation inwards: $\forall x \, (\exists y \, Animal(y) \land \neg Loves(x,y)) \lor \exists y \, Loves(y,x)$
3. Standardize variables (rename to avoid collisions): $\forall x \, (\exists y \, Animal(y) \land \neg Loves(x,y)) \lor \exists z \, Loves(z,x)$
4. Skolemize (replace existential variables with functions of the universal variables they depend on): replace $y$ with $Y(x)$ and $z$ with $Z(x)$, giving $\forall x \, (Animal(Y(x)) \land \neg Loves(x,Y(x))) \lor Loves(Z(x),x)$
5. Distribute OR over AND: $\forall x \, (Animal(Y(x)) \lor Loves(Z(x),x)) \land (\neg Loves(x,Y(x)) \lor Loves(Z(x),x))$
6. Drop universal quantifiers to output the final clauses.
* **First-Order Resolution Application:** Resolving the derived clause $Animal(Y(x)) \lor Loves(Z(x), x)$ against a new premise $\neg Loves(u, v) \lor Feeds(u, v)$. Unifying $Loves(Z(x), x)$ with $Loves(u, v)$ yields the substitution $\theta = \{u/Z(x), v/x\}$. Applying $\theta$ produces the resolvent $Animal(Y(x)) \lor Feeds(Z(x), x)$.
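Under the tuple encoding used in the unification sketch above, applying $\theta$ is a small recursive walk (illustrative encoding, not the lecture's):

```python
def substitute(term, theta):
    """Apply substitution theta to a term (string or nested tuple)."""
    if isinstance(term, str):
        return substitute(theta[term], theta) if term in theta else term
    return tuple(substitute(t, theta) for t in term)

# theta = {u/Z(x), v/x} from unifying Loves(Z(x), x) with Loves(u, v);
# the surviving literal Feeds(u, v) becomes Feeds(Z(x), x).
theta = {'u': ('Z', 'x'), 'v': 'x'}
print(substitute(('Feeds', 'u', 'v'), theta))  # ('Feeds', ('Z', 'x'), 'x')
```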
## 4. Actionable Takeaways (Implementation Rules)
* **Rule 1: Pre-process Knowledge Bases into Conjunctive Normal Form (CNF)** -> **Convert all statements into independent disjunctive clauses** -> **Enables the application of the Resolution rule.** Resolution cannot operate on arbitrarily nested formulas; breaking the knowledge base into standardized `A OR B OR C` clauses is mandatory for automated reasoning.
* **Rule 2: Execute Proof by Contradiction for Inference** -> **To prove $KB \models f$, add $\neg f$ to the knowledge base and run resolution until an empty clause (`false`) is derived** -> **Guarantees completeness.** Because resolution is refutation-complete, finding a contradiction definitively proves that the original formula is entailed by the knowledge base.
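Rules 1 and 2 combine into a small refutation loop. A naive saturation sketch (same string-literal clause encoding as the earlier resolution snippet; no subsumption or other optimizations):

```python
def refutes(clauses):
    """Return True iff the empty clause is derivable by resolution,
    i.e. the clause set (KB plus the negated query) is unsatisfiable."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1 in clauses:
            for c2 in clauses:
                for lit in c1:
                    neg = lit[1:] if lit.startswith("~") else "~" + lit
                    if neg in c2:
                        resolvent = (c1 - {lit}) | (c2 - {neg})
                        if not resolvent:
                            return True  # derived the empty clause
                        new.add(resolvent)
        if new <= clauses:
            return False  # saturated with no contradiction: not entailed
        clauses |= new

# Prove KB = {~Rain v Wet, Rain} entails Wet: add ~Wet and refute.
kb = {frozenset({"~Rain", "Wet"}), frozenset({"Rain"})}
print(refutes(kb | {frozenset({"~Wet"})}))  # True
```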
* **Rule 3: Match Natural Language Quantifiers to Specific Connectives** -> **Pair "For All" ($\forall$) with Implications ($\rightarrow$), and pair "Exists" ($\exists$) with Conjunctions ($\land$)** -> **Prevents semantic mapping errors.** Using $\land$ with $\forall$ makes an overly strong claim about the whole universe, while using $\rightarrow$ with $\exists$ makes statements vacuously true.
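As a worked contrast, using the arithmetic example from Section 2 (a standard illustration, not a verbatim lecture slide):

```latex
% "Every student knows arithmetic" vs. "Some student knows arithmetic"
\forall x \, (Student(x) \rightarrow Knows(x, arithmetic))  % correct
\forall x \, (Student(x) \land Knows(x, arithmetic))
    % wrong: claims everything in the universe is a student
\exists x \, (Student(x) \land Knows(x, arithmetic))        % correct
\exists x \, (Student(x) \rightarrow Knows(x, arithmetic))
    % wrong: vacuously true as soon as any non-student exists
```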
* **Rule 4: Apply Skolemization to Remove Existential Variables** -> **Replace $\exists y$ with a generated function $Y(x)$ mapped to any universally quantified variables in scope** -> **Removes existential ambiguity.** This creates concrete, callable "names" for unknown objects based on their dependencies, allowing algorithmic resolution to proceed without abstract existentials.
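A one-line illustration of the dependency structure Skolemization must capture (a standard textbook example rather than the lecture's):

```latex
% y may depend on x, so it becomes a Skolem function Y(x):
\forall x \, \exists y \, Loves(x, y) \;\Longrightarrow\; \forall x \, Loves(x, Y(x))
% with the quantifiers swapped, y depends on nothing, so it becomes
% a Skolem constant C:
\exists y \, \forall x \, Loves(x, y) \;\Longrightarrow\; \forall x \, Loves(x, C)
```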
## 5. Pitfalls & Limitations (Anti-Patterns)
* **Pitfall:** Relying solely on Modus Ponens for general propositional logic inference. -> **Why it fails:** Modus Ponens is incomplete unless the knowledge base is strictly restricted to Horn clauses (clauses with at most one positive literal). It cannot natively handle disjunctive facts (e.g., $A \lor B$) that fall outside Horn form. -> **Warning sign:** The algorithm terminates without deriving a conclusion that is clearly entailed by the premises.
* **Pitfall:** Confusing local variable scope during First-Order Logic conversions. -> **Why it fails:** If a formula reuses the same variable name (e.g., $y$) across different quantifiers (e.g., $\forall y$ and $\exists y$), pushing negations inward or applying Skolemization will cause variable collisions and incorrect dependencies. -> **Warning sign:** Skolem functions map to the wrong dependencies, or resolution unifies variables that represent distinct entities. (Fix: apply the "standardize variables" step to rename them uniquely before Skolemizing.)
* **Pitfall:** Assuming First-Order Logic inference will eventually return a definitive "False" if a statement is not entailed. -> **Why it fails:** First-Order Logic is only *semi-decidable*. Because function symbols can generate an infinite number of terms (e.g., $F(a), F(F(a)), F(F(F(a)))$), the resolution search space can be infinite. -> **Warning sign:** The inference algorithm runs indefinitely (infinite loop) without deriving the target formula or returning a failure state.
## 6. Key Quote / Core Insight
"You should think about knowledge as a restriction. The fewer models (possible situations in the world) that remain valid, the more actual knowledge you possess about the state of the world."
## 7. Additional Resources & References
* **Resource:** CodaLab - **Type:** Web Platform (`worksheets.codalab.org`) - **Relevance:** Shown as a tool for executing reproducible machine learning jobs, tracking metrics via schemas, and managing Docker environments for cloud compute.
* **Resource:** "A Machine-Oriented Logic Based on the Resolution Principle" by J.A. Robinson (1965) - **Type:** Historical Paper - **Relevance:** Cited as the foundational text that introduced the Resolution rule, unifying logical inference into a single, mechanizable algorithm.