Artificial Intelligence

1. Agent Concept
- 1.1. Agent function v.s. Agent program
2. State changing and implementation
- 2.1. hanoon jugs
- 2.2. 野人与传教士
3. Search Algorithm
- 3.1. branching factor, BFS, DFS
4. Formulation
5. Propositional Logic
6. Adversarial Search
- 6.1. Minimax Algorithm
  - 6.1.1. The optimal strategies
- 6.2. Alpha-Beta pruning
7. Constraint Satisfaction Problems
- 7.1. Advantage
- 7.2. Variations
8. Logic
9. First order logic
10. Bayesian

1 Agent Concept

An agent that senses only partial information about the state cannot be perfectly rational.
- False. The vacuum-cleaning agent is perfectly rational, but it senses only partial information, i.e. it doesn't observe the state of the square that is adjacent to it.
There exist task environment in which no pure reflex agent can behave rationally.
- True. Purely reflective behavior does not take the percept history into account, only the most recent percepts.
There exists a task environment in which every agent is rational.
- True. A task environment in which there are no decisions to be made: all actions will receive the same output.
The input to an agent program is the same as the input to the agent function
- False.
  - The input to agent program: current percept
  - the input to agent function: entire precept history
Every agent function is implementable by some program/machine combination
- False. As the textbook said: "No machine can tell in general whether a given program will return an answer on a given input or run forever."
suppose an agent selects its action uniformly at random from the set of possible actions. There exists a deterministic task environment in which this agent is rational.
- True. In the environment that all actions will produce same output, it is rational. Actually all agents are rational in such environment.
It is possible for a given agent to be perfectly rational in two distinct task environments.
- True. There's recently a kickstarter project that produces dice with more than 6 faces. If an agent is rational in a N face dice bet game, it will perform equally well in a 6-face dice or a 8-face dice.
Every agent is rational in an un-observable environment.
- False. A vacuum agent that can move will be rational, but the one that does not move is not.
A perfectly rational poker-playing agent never loses.
- False. Two such perfectly rational agent play against each other will give one lose.

1.1 Agent function v.s. Agent program

Can there be more than one agent program that implements a given agent function? Give an example, or show why one is not possible.
- There are infinite agent programs that implement a given agent function. If an agent function acts only depend on previous \(n\) percepts. Than, the agent implementations that have n or more memory will always produce the same action.
Are there agent functions that cannot be implemented by any agent program?
- Yes. As the textbook said: "No machine can tell in general whether a given program will return an answer on a given input or run forever."
Given a fixed machine architecture, does each agent program implement exactly one agent function?
- Yes. A program implements a mapping from percepts to actions. The same percept will only result in one action.
Given an architecture with n bits of storage, how many different possible agent programs are there?
- There would be \(a^{2^n}\) possible programs; \(2^n\) possible states and \(a\) choices for each state.
Suppose we keep the agent program fixed but speed up the machine by a factor of two. Does that change the agent function?
- No. The speed does not have influence on the produced action.

2 State changing and implementation

2.1 hanoon jugs

Give a complete problem formulation for each of the following. Choose a formulation that is precise enough to be implemented. d. You have three jugs, measuring 12 gallons, 8 gallons, and 3 gallons, and a water faucet. You can fill the jugs up or empty them out from one to another or onto the ground. You need to measure out exactly one gallon.

Define a 3-tuple (x,y,z) where x,y,z are the amount of water in the three jugs.

Initial state: (0,0,0).
Action:
- FILL: given values (x,y,z) , generate
  - (12,y,z), (x,8,z), (x,y,4)
- EMPTY: given values (x,y,z) , generate
  - (0,y,z), (x,0,z), (x,y,0)
- POUR: Given value (x,y), let t = min(x+y, cap(y)), pour x into y, generate:
  - (x-(t-y), t)
Cost function: Number of actions.

2.2 野人与传教士

三个野人，三个传教士，一艘船。如何过河？

2.2.1 a. Formulate

a. Formulate the problem precisely, making only those distinctions necessary to ensure a valid solution. Draw a diagram of the complete state space.

2.2.1.1 state

(M1,C1,B1,M2,C2,B2) where M1,C1,B1 is the number of missionaries, cannibals, boats on the left side, (M2,C2,B2) is the corresponding number on the right side.

The start state is (3,3,1,0,0,0). The goal: (0,0,0,3,3,1)

2.2.1.2 action

Action: (m,c,b) on left side: where b means the change of boat, m and c means the change of missionaries and cannibals. The action allows the boat number B1 or B2 to change from 1 to 0, along with M and C on the side move to the other side by one or two.

(-1 0 -1)
(0 -1 -1)
(-2 0 -1)
(0 -2 -1)
(-1 -1 -1)

(1 0 1)
(0 1 1)
(2 0 1)
(0 2 1)
(1 1 1)

2.2.1.3 The complete state space

In the table below, the striped items are those that cannibals will eat missionaries. The state that is not reachable (e.g. (3,3,0,0,0,1)) is not striped out.

`(3 3 1 0 0 0)`	~~`(3 2 1 0 1 0)`~~	~~`(3 1 1 0 2 0)`~~	~~`(3 0 1 0 3 0)`~~
~~`(2 3 1 1 0 0)`~~	`(2 2 1 1 1 0)`	~~`(2 1 1 1 2 0)`~~	~~`(2 0 1 1 3 0)`~~
~~`(1 3 1 2 0 0)`~~	~~`(1 2 1 2 1 0)`~~	`(1 1 1 2 2 0)`	~~`(1 0 1 2 3 0)`~~
~~`(0 3 1 3 0 0)`~~	~~`(0 2 1 3 1 0)`~~	~~`(0 1 1 3 2 0)`~~	`(0 0 1 3 3 0)`
`(3 3 0 0 0 1)`	~~`(3 2 0 0 1 1)`~~	~~`(3 1 0 0 2 1)`~~	~~`(3 0 0 0 3 1)`~~
~~`(2 3 0 1 0 1)`~~	`(2 2 0 1 1 1)`	~~`(2 1 0 1 2 1)`~~	~~`(2 0 0 1 3 1)`~~
~~`(1 3 0 2 0 1)`~~	~~`(1 2 0 2 1 1)`~~	`(1 1 0 2 2 1)`	~~`(1 0 0 2 3 1)`~~
~~`(0 3 0 3 0 1)`~~	~~`(0 2 0 3 1 1)`~~	~~`(0 1 0 3 2 1)`~~	`(0 0 0 3 3 1)`

2.2.2 b. Solve

Implement and solve the problem optimally using an appropriate search algorithm. Is it a good idea to check for repeated states?

The solution:

(3,3,1,0,0,0)
-> (2,2,0,1,1,1)
-> (3,2,1,0,1,0)
-> (3,0,0,0,3,1)
-> (3,1,1,0,2,0)
-> (1,1,0,2,2,1)
-> (2,2,1,1,1,0)
-> (0,2,0,3,1,1)
-> (0,3,1,3,0,0)
-> (0,1,0,3,2,1)
-> (0,2,1,3,1,0)
-> (0,0,0,3,1,0)

Yes, we should check repeated states to avoid infinite recursion.

2.2.3 c. discussion

c. Why do you think people have a hard time solving this puzzle, given that the state space is so simple?

It is hard to manually work it out.
the repeat states need to be removed, which increase difficulty for manual solving.

3 Search Algorithm

3.1 branching factor, BFS, DFS

3.26 Consider the unbounded version of regular 2D grid shown in Figure 39. The start state is at the origin, (0,0), and the goal state is at (x,y).

What is the branching factor b in the state space?

How many distict states are there at depth k (for k > 0)?

What is the maximum number of nodes expanded by breadth-first tree search?

What is the maximum number of nodes expanded by breadth-first graph search?

Is h= |u-x| + |v-y| an admissible heuristic for a state at (u,v)? explain.

How many nodes are expanded by A* graph search using h?

Does h remain admissible if some links are removed?

Does h remain admissible if some links are added between nonadjacent states?

What is the branching factor b in the state space?
- Since it is a 2D grid, there're 4 directions for each node. The branching factor is 4.
How many distict states are there at depth k (for k > 0)?
- For depth 1, there're 1+4 states;
- For depth 2, there're 1+4+8 states;
- For depth 3, there're 1+4+8+12 states;
- For depth k, there're 1 + 4 + 8 + .. + 4k = 2k² + 2k + 1
What is the maximum number of nodes expanded by breadth-first tree search?
- The depth of the goal is |x|+|y|, and if we allow the loopy states on the search tree, we will have 4 branches for each node. Thus the maximum total nodes to be expanded: \(1 + 4^1 + 4^2 + .. + 4^(|x|+|y|)\)
What is the maximum number of nodes expanded by breadth-first graph search?
- For graph search, we only expand nodes that are not in exploded set. The expanded nodes will be the distinct state of depth |x|+|y|: 1 + 4 + 8 + .. + 4k
Is h= |u-x| + |v-y| an admissible heuristic for a state at (u,v)? explain.
- Yes. Because it never overestimates the cost: it is the optimal path from (u,v) to (x,y) in a 2D grid given that all links cost 1.
f. How many nodes are expanded by A* graph search using h?
- It is |x|*|y|. Because all the paths in the rectangle are optimal paths.
g. Does h remain admissible if some links are removed?
- Yes. Removing links can only make the best path longer if possible, so h remains an underestimate.
h. Does h remain admissible if some links are added between nonadjacent states?
- No. We could add some links that makes the optimal solution shorter. Thus h would overestimate the cost.

4 Formulation

4.1 Floor planning

6.4 Given the precise formulations for each of the following as a Constraint Satisfaction Problems:

Rectilinear floor-planning: find non-overlapping places in a large rectangle for a number of smaller rectangles.

Variables:
- WIDTH and HEIGHT for the large rectangle.
- R_i for each rectangles, R_i.w and R_i.h for the width and height of R_i respectively.
- the position P_i for top-left corner of the rectangle R_i (P_i.x and P_i.y for the co-ordinates)
Domains: {P_i.x ∈ [0, WIDTH]}
Constraints: the four corners of R_i should not be inside the area of R_j, for each \(i \neq j\)
- for each \(i \neq j\)
- for each corner \((x,y)\) in \(\{(P_{i}.x, P_{i}.y)\), \((P_i.x + R_i.w, P_i.y)\), \((P_i.x, P_i.y + R_i.h)\), \((P_i.x+R_i.w, P_i.y + R_i.h)\}\)
- \(\neg (P_{j}.x < x < P_{j}.x + R_{j}.w \cap P_{j}.y < y < P_{j}.y + R_{j}.h)\)

4.2 class scheduling

Class scheduling: There is a fixed number of professors and classrooms, a list of classes to be offered, and a list of possible time slots for classes. Each professor has a set of classes that he or she can teach.

Variables:
- P_i for each professor, with \(P_i.classes\) be the set of classes he or she can teach
- R_i for each room
- C_i for each class
- T_i for each time slot
- Assignment A_i, the i-th assignment, is a 4-tuple \((A_i.prof, A_i.room, A_i.class, A_i.time)\).
Domain
- \(A_i.prof\) ∈ {P_j}
- \(A_i.room\) ∈ {R_j}
- \(A_i.class\) ∈ {C_j}
- \(A_i.time\) ∈ {T_j}
Constraint
- \(A_i.class \in A_i.prof.classes\) for all i
- \(\neg (A_i.time = A_j.time \cap A_i.prof = A_j.prof)\) for all \(i \neq j\)
- \(\neg (A_i.time = A_j.time \cap A_i.room = A_j.room)\) for all \(i \neq j\)

4.3 linving in 5 houses

Consider the following logic puzzle: In five houses, each with a different color, live five persons of different nationalities, each of whom prefers a different brand of candy, a different drink, and a different pet. Given the following facts, the questions to answer are "where does the zebra live, and in which house do they drink water?" Discuss different representations of this problem as a CSP. Why would one prefer one representation over another?

The Englishman lives in the red house.
The Spaniard owns the dog.
The Norwegian lives in the first house on the left.
The green house is immediately to the right of the ivory house.
The man who eats Hershey bars lives in the house next to the man with the fox.
Kit Kats are eaten in the yellow house.
The Norwegian lives next to the blue house.
The Smarties eater owns snails.
The Snickers eater drinks orange juice.
The Ukrainian drinks tea.
The Japanese eats Milky Ways.
Kit Kats are eaten in a house next to the house where the horse is kept.
Coffee is drunk in the green house.
Milk is drunk in the middle house.

4.3.1 First representation

This representation is based on the house.

Variables and domains:
- H for the houses
  - .n: nationality
  - .b: hold brand of candy
  - .lb: the one lived in this house like the brand of candy
  - .c: color
  - .d: hold drink
  - .ld: the man lived in this house like the drink
  - .pet: hold pet
  - .index: index of the house from left to right
Domains
- h.n ∈ Englishman, Spaniard, etc.
- h.c ∈ red, green, etc.
- h.b ∈ ivory, smarties, etc.
- h.d ∈ water, tea, etc.
- h.p ∈ dog, fox, snail, etc.
- h.index ∈ [1,5]
Constraints
- The Englishman lives in the red house.
  ¬ h.c=red ∪ h.n=Englishman
- The Spaniard owns the dog.
  ¬ h.n=Spanisard ∪ h.pet=dog
- The Norwegian lives in the first house on the left.
  ¬ h.n = Norwegian ∪ h.index = 1
- The green house is immediately to the right of the ivory house.
  ¬ (h1.c = green ∩ h2.b = ivory) ∪ h1.index = h2.index + 1
- The man who eats Hershey bars lives in the house next to the man with the fox.
  ¬ (h1.lb=Hersheybar ∩ ∩ h2.pet = fox) ∪ | h1.index-h2.index | = 1
- Kit Kats are eaten in the yellow house.
  ¬ h.c=yellow ∪ h.lb = KitKats
- The Norwegian lives next to the blue house.
  ¬ (h1.n=Norwegian ∩ h2.c = blue) ∪ | h1.index - h2.index |=1
- The Smarties eater owns snails.
  ¬ h.lb=smarties ∪ h.p = snails
- The Snickers eater drinks orange juice.
  ¬ h.lb=snickers ∪ h.ld = OrangeJuice
- The Ukrainian drinks tea. ¬ h.n=Ukrainian ∪ h.ld=tea
- The Japanese eats Milky Ways. ¬ h.n=Japanese ∪ h.lb=MilkyWays
- Kit Kats are eaten in a house next to the house where the horse is kept. ¬ (h1.b=Kats ∩ h2.pet=horse) ∪ |h1.index-h2.index|=1
- Coffee is drunk in the green house. ¬ h.d=coffee ∪ h.c=green
- Milk is drunk in the middle house. ¬ h.d=milk ∪ h.index=3

4.3.2 Second Representation

This representation is based on the Person.

Variables:
- P for person
  - .n : nationality
  - .b : in his house, the candy that holds
  - .lb : the brand of candy he likes
  - .c : the color of his house
  - .d: the drink held in his house
  - .ld: the drink he likes
  - .pet: his pet
  - .index: the index of his house

The domains and constraints are similar.

4.3.3 Comparison

Similarly, we can also derive the representation based on other variables, e.g. drink, candy, etc. One would prefer one to the other based on the query type he want. For example, if the query is based on the house, e.g. which house hold some drink, he would prefer the house-based representation. Similarly if the query is based on person, e.g. what pet does the Englishman keeps, he would prefer the person-based representation. But essentially they are the same.

5 Propositional Logic

5.1 Simple

Given the following, can you prove that the unicorn is mythical? How about magical? Horned?

If the unicorn is mythical, then it is immortal, but if it is not mythical, then it is a mortal mammal. If the unicorn is either immortal or a mammal, then it is horned. The unicorn is magical if it is horned.

The formula writes:

\(\neg mythical \vee immortal\)
\(mythical \vee mammal\)
\(\neg (immortal \vee mammal) \vee horned\)
\(\neg horned \vee magical\)

We have the enumerated truth table:

mythical	immortal	mammal	horned	magical
T	T	T	T	T
T	T	F	T	T
F	T	T	T	T
F	F	T	T	T

Based on these, we can derive:

We cannot prove unicorn is mythical.
We can prove it is magical.
We can prove it is horned.

5.2 Party resolution

7.18 Consider the following sentence: \(((Food \Rightarrow Party) \vee (Drinks \Rightarrow Party)) \Rightarrow ((food \wedge drinks) \Rightarrow Party)\)

Determine, use enumeration, whether the sentence is valid, satisfiable (but neg valid), or unsatisfiable.

Convert the left-hand and right-hand sides of the main implication into CNF, Showing each step, and explain how the results confirm your answer to (a)

Prove your answer to (a) using resolution.

5.3 1

It is Valid. The enumeration:

F	D	P	left hand	right hand	left ⇒ right
T	T	T	T	T	T
T	F	T	T	T	T
F	T	T	T	T	T
F	F	T	T	T	T
T	T	F	F	F	T
T	F	F	T	T	T
F	T	F	T	T	T
F	F	F	T	T	T

5.4 2

left-hand CNF:

((F ⇒ P) ∨ (D ⇒ P)

♠ (¬ F ∨ P) ∨ (¬ D ∨ P)

♠ (¬ F ∨ P ∨ ¬ D ∨ P)

♠ (¬ F ∨ ¬ D ∨ P)

right-hand CNF:

(F ∧ D) ⇒ P

♠ ¬ (F ∧ D) ∨ P

♠ (¬ F ∨ ¬ D) ∨ P

♠ ¬ F ∨ ¬ D ∨ P

As we can see, they are exactly the same. Thus the production is valid.

5.5 3

We can prove it by proving the negation is unsatisfiable.

\(\neg ((F \Rightarrow P) \vee (D \Rightarrow P) \Rightarrow ((F \vee D) \Rightarrow P))\)

♠ ¬ (((F ⇒ P) ∨ (D ⇒ P)) ⇒ ((F ∧ D) ⇒ P))

♠ ¬ ( ¬ ((F ⇒ P) ∨ (D ⇒ P)) ∨ ((F ∧ D) ⇒ P))

♠ ((F ⇒ P) ∨ (D ⇒ P)) ∧ (¬ ((F ∧ D) ⇒ P))

♠ ((¬ F ∨ P) ∨ (¬ D ∨ P)) ∧ (¬ (¬ (F ∧ D) ∨ P))

♠ (¬ F ∨ ¬ D ∨ P) ∧ (F ∧ D) ∧ ¬ P

♠ (¬ F ∨ ¬ D ∨ P) ∧ F ∧ D ∧ ¬ P

This resolves to empty clause, thus the original sentence is valid.

6 Adversarial Search

This is multiple agents, also known as game.

6.1 Minimax Algorithm

There're two players, Min and Max, each takes turn to execute. Max moves first.

6.1.1 The optimal strategies

\begin{equation*} MINIMAX-VALUE(n) = \left\{ \begin{array}{r1} Utility(n) & \text {if n is terminal},\\ max_{s \in succ(n)} MINIMAX-VALUE(s) & \text{if n is a max node},\\ min_{s \in succ(n)} MINIMAX-VALUE(s) & \text{if n is a min node}. \end{array} \right . \end{equation*}

Basically it recursively solve the problem. The Utility function is the payoff. It actually list the tree of state space, and it is optimal.

6.2 Alpha-Beta pruning

The problem of minimax algorithm is its node grow exponential. This algorithm is used to prune the subtree that does not affect the result.

This is similar for MiniMax algorithm

α is the value of the best choise so far, for max, init from -∞
β is the best value for min, init from +∞

There're two procedures:

Alpha-Beta-Search(state): returns an action. state is the current state.
Max-Value(state, \alpha, \beta): returns a utility value
Min-Value(state, \alpha, \beta): returns a utility value

Alpha-Beta-Search(state) {
  v = Max-Value(state, -999, +999);
  return action in ACTIONS with value v;
}

Max-Value(state, alpha, beta) {
  v = INT_MIN;
  for each a in ACTIONS(state) do
    v = Max(v, Min-Value(result(s,a), alpha, beta));
    if v >= beta then return v;
    alpha = MAX(alpha,v);
  return v;
}

7 Constraint Satisfaction Problems

It seems to formulate the search problems in a uniformed representation:

X: a set of variables
D: each has a domain of values
C: a set of constraints for each of the variable

The goal is to find the assignment of values to the variables, that satisfies the constraints.

7.1 Advantage

it uses general purpose heuristic rather than problem-specific ones.

7.2 Variations

continuous or discrete domain
finite or infinite domain
linear or non-linear constraint
unary or binary or high order constraint

8 Logic

8.1 entailment

Entailment: β \models α, reads: the sentence β entails the sentence α if and only if α is true in all worlds where β is true.

8.2 Propositional logic

a.k.a. boolean logic.

logical equivalence;

a	b
α ∧ β	β ∧ α
(α ∧ β) ∧ γ	α ∧ (β ∧ γ)
α ⇒ β	¬ β ⇒ ¬ α
α ⇒ β	¬ α ∨ β
α ⇔ β	(α ⇒ β) ∧ (β ⇒ α)

A sentence is valid if it is true in all models. Deduction theorem: KB \models α iff KB ⇒ α is valid
A sentence is satisfiable if it is true in some models. KB \models α iff KB ∧ ¬ α is unsatisfiable.

Proof method:

inference rules: transform the sentences to a normal form
model checking: truth table

A clause is a disjunction of literals. Factoring: the result clause keeps only one copy of each literal.

Conjunction: ∧ Disjunction: ∨ CNF: conjunctive normal form. Conjunction of (disjunctions of literals).

Resolution algorithm: proof by contradiction. I.e. to show KB \models α, we show KB ∧ ¬ α is unsatisfiable. The naming resolution is because, the pair of complementary literals is resolved.

Definite clause: disjunction of literals with exactly one positive literal.

8.3 First Order Logic

∧ is the natural connective with ∃. Using ⇒ as the main connective with ∃ often causes errors:

∃ x At(x,ISU) ⇒ Smart(x)

is true if there's anyone who is not at ISU, which may not be what you want.

Properties:

a	b	result
∀ x ∀ y	∀ y ∀ x
∃ x ∃ y	∃ y ∃ x
∃ x ∀ y	∀ y ∃ x	not equal
∀ x	¬ ∃ x ¬
∃ x	¬ ∀ x ¬

9 First order logic

9.1 Some problems

9.1.1 Student problems

Some students took French in spring 2001.

Every student who takes French passes it.

Only one student took Greek in spring 2001.

The best score in Greek is always higher than the best score in French.

student(x): x is student
f,g: French and German courses
take(x,c,s): student x takes course c in semester s
pass(x,c): student x passes course c
score(x,c,s): the score of student x in course c in semester s.
x>y: x is greater than y

\(\exists x student(x) \wedge take(x,f,spring2001)\)

\(\forall x,s student(s) \wedge take(x,f,s) \Rightarrow pass(x,f,s)\)

\(\exists x student(x) \wedge take(x,g,sprint2001) \wedge \forall \: y y \ne x \Rightarrow \neg take(y,g,sprint2001)\)

\(\forall s \exists x \forall y score(x,g,s) > score(y,f,s)\)

9.1.2 pollicy problems

Every person who buys a policy is smart.

No person buys an expensive policy.

There is an agent who sells policies only to people who are not insured.

person(x): x is person
expensive(x): x is expensive
agent(x): x is agent
insured(x): x is insured
smart(x): x is smart
buy(x,y,z): x buys y from z
sell(x,y,z): x sells y to z

∀ person(x) ∧ (∃ y,z policy(y) ∧ buy(x,y,z)) ⇒ smart(x)

∀ x,y,z person(x) ∧ policy(y) ∧ expensive(y) ⇒ ¬ buy(x,y,z)

∃ x agent(x) ∧ ∀ y,z policy(y) ∧ sell(x,y,z) ⇒ (person(z) ∧ ¬ insured(z))

9.1.3 barber

There is a barber who shaves all men in town who do not shave themselves.

man(x): x is man
barber(x): x is a barber
shaves(x,y): x shaves y

∃ x ∀ y barber(x) ∧ man(y) ∧ ¬ shaves(y,y) ⇒ shaves(x,y)

9.1.4 citizen

A person born in the UK, each of whose parents is a UK citizen or a UK resident, is a UK citizen by birth.

A person born outside the UK, one of whose parents is a UK citizen by birth, is a UK citizen by descent.

person(x): x is person
parent(x,y): x is parent of y
citizen(x,c): x is a citizen of country c
citizen(x,c,r): x is a citizen of country c, for reason r
resident(x,c): x is a resident of country c
born(x,c): x was born in country c

∀ x person(x) ∧ born(x,UK) ∧ (∀ y parent(y,x) ⇒ ((∃ r citizen(y,UK,r)) ∨ resident(y,UK))) ⇒ citizen(x,UK,BIRTH)

∀ x person(x) ∧ ¬ born(x,UK) ∧ (∃ y parent(y,x) ∧ citizen(y,UK,BIRTH)) ⇒ citizen(x,UK,DESCENT)

9.1.5 other

Politicians can fool some of the people all of the time, and they can fool all of the people some of the time, but they can’t fool all of the people all of the time.

All Greeks speak the same language. (Use Speaks(x,l) to mean that person x speaks language l.)

person(x): x is person
politician(x): x is politician
fool(x,y,t): x fools y at time t
german(x): x is German.

∀ x politician(x) ⇒ (∃ y ∀ t person(y) ∧ fool(x,y,t)) ∧ (∃ t ∀ y person(y) ⇒ fool(x,y,t)) ∧ ¬ (∀ t ∀ y person(y) ⇒ fool(x,y,t))

∀ x,y,l german(x) ∧ german(y) ∧ Speaks(x,l) ⇒ Speaks(y,l)

9.2 Unify

For each pair of atomic sentences, give the most general unifier if it exists:

P(A,B,B), P(x,y,z)

Q(y,G(A,B)),Q(G(x,x),y)

Older(Father(y),y),Older(Father(x),John)

Knows(Father(y),y), Knows(x,x)

9.2.1 {x/A, y/B, z/B}

P(A,B,B), P(x,y,z)

⇒ P(A,B,B), P(A,y,z) : {x/A}

⇒ P(A,B,B), P(A,B,z) : {x/A, y/B}

⇒ P(A,B,B), P(A,B,B) : {x/A, y/B, z/B}

9.2.2 Cannot unify.

To see why:

Q(y,G(A,B)),Q(G(x,x),y) : {y/G(x,x)}

⇒ Q(G(x,x), G(A,B)), Q(G(x,x), G(x,x)) : {y/G(x,x), x/A}

⇒ Q(G(A,A), G(A,B)), Q(G(A,A), G(A,A))

In the last formula, A cannot be unified to B.

9.2.3 {x/John, y/John}

Older(Father(y),y),Older(Father(x),John)

⇒ Older(Father(John),John),Older(Father(x),John) : {y/John}

⇒ Older(Father(John),John),Older(Father(John),John) : {y/John, x/John}

9.2.4 cannot unify.

Knows(Father(y),y), Knows(x,x) : {x/Father(y)}

⇒ Knows(Father(y),y), Knows(Father(y),Father(y))

In the last formula, y cannot be unified to Father(y).

9.3 Barber

9.20 Let L be the first-order language with a single predicate S(p, q), meaning “p shaves q.” Assume a domain of people.

Consider the sentence “There exists a person P who shaves every one who does not shave themselves, and only people that do not shave themselves.” Express this in L.

Convert the sentence in (a) to clausal form.

Construct a resolution proof to show that the clauses in (b) are inherently inconsistent. (Note: you do not need any additional axioms.)

9.3.1 1

∃ p ∀ q person(p) ∧ person(q) ∧ (¬ S(q,q) ⇔ S(p,q))

9.3.2 2

1st order formula:

\begin{equation} \exists p \: \forall q \: person(p) \wedge person(q) \wedge (\neg S(q,q) \Leftrightarrow S(p,q)) \end{equation} \begin{equation} \exists p \: \forall q \: person(p) \wedge person(q) \wedge (\neg S(q,q) \Rightarrow S(p,q)) \wedge (S(p,q) \Rightarrow \neg S(q,q)) \end{equation}

remove implication:

\begin{equation} \exists p \: \forall q \: person(p) \wedge person(q) \wedge (S(q,q) \vee S(p,q)) \wedge (\neg S(p,q) \vee \neg S(q,q)) \end{equation}

skolemize off the existence:

\begin{equation} \forall q \: person(P) \wedge person(q) \wedge (S(q,q) \vee S(P,q)) \wedge (\neg S(P,q) \vee \neg S(q,q)) : \{p={P}\} \end{equation}

drop universal qualifier:

\begin{equation} person(P) \wedge person(q) \wedge (S(q,q) \vee S(P,q)) \wedge (\neg S(P,q) \vee \neg S(q,q)) \end{equation}

9.3.3 3

The CNF above resolves to empty clause, which is false, meaning the logic is not satisfiable.

10 Bayesian

Node X is conditionally independent of all other nodes in the network, given its markov blanket. (parents, children, and children's parents).
Node X is conditionally independent of its non-descendants given its parent.

sample space: The set of all possible worlds
Ω: the sample space
ω: an element of the space. Each element has a probability P(ω), and sum up to one.
prior: also called unconditional probabilities or prior probabilities.
posterior: also called conditional probability or posterior probability.
evidence: the results observed
product rule: A different form of the definition of conditional probability \(P(a\vedge b) = P(a|b) P(b)\)
random variables: begin with uppercase letter.
domain: the set of possible values the random variable can take
probability distribution: for discrete random variables
probability density function: for continuous random variables, because the vector is infinite.
joint probability distribution: the P for two variables with some interaction
full joint probability distribution: all variables
inclusion-exclusion principle: P(a∨ b) = P(a) + P(b) - P(a\vedge b).
probabilistic inference: computation of posterior probabilities given evidence
marginalization: also called summing out, because it sums the conditional probabilities. The process of computing the unconditional probability, aka marginal probability
normalization: when calculating a conditional probability, there's a constant. It is typically not important, so replace it with α. The alpha is to set the probabilities to sum to 1. P(X|e) = α P(X,e)
conditioning rule: P(Y) = ∑_z P(Y|z)P(z)
(absolute) independence: P(X|Y)=P(X)
Bayes rule: P(b|a) = \frac{P(a|b)P(b)}{P(a)}
conditional independence: P(X,Y|Z) = P(X|Z) P(Y|Z)
naive Bayes: a single cause directly influences a number of effects, all of which conditionally independent, given the cause. It is also called Bayesian classifier, idiot Bayes. P(Cause,E1,E2,…,En) = P(Cause) ∏_i P(Ei|Cause)
conditional probability table (CPT): each row shows a conditional probability
(no term): each variable is conditionally independent of its non-descendants, given its parents.
Markov blanket: parents, children, and children's parents

For continuous variables, the Bayes needs to do something. Of course we can do discretization, but the precision is lost. One common solution is to define standard families of probability density functions, with a finite number of parameters. Another solution is non-parameter one. Now I list some parameter models.

Gausion (normal) distribution: N(μ, σ²)(x) has the mean μ and the variance σ².

A network with both discrete and continuous variables is called hybrid Bayesian network.

linear Gaussian
conditional Gaussian
probit distribution
logit distribution
logistic function

Exact Inference

query variables
event: some assignments to a set of evidence variables
evidence variables
hidden variables: non-query non-evidence variables