Review Notes for AI Part I: Search

The textbook for this course is Artificial Intelligence: A Modern Approach. These notes reuse the book's definitions of variables and terms directly, without re-explaining them here.

Lecture 1

  1. Agents are systems that perceive and act in some environment, including humans, robots, softbots, thermostats, etc.
  2. The environment is the world in which the agent operates.
  3. Cognitive Cycle: Perception, Memory Access, Decision, Learning, Action.
  4. The agent function is a mathematical relationship that maps percept sequences to actions in the environment.
  5. The agent function is computed by an agent program; the agent program runs on the physical architecture to implement the function.
  6. A rational agent chooses whichever action maximizes the expected value of the performance measure given the percept sequence to date and the prior environment knowledge.
  7. PEAS description of the task environment:
    • Performance: Measure for success/progress/quality;
    • Environment: The world in which the agent operates;
    • Actuators: How the agent affects the environment;
    • Sensors: How the agent perceives the environment.
  8. Environment Types: fully vs. partially observable; deterministic vs. stochastic; episodic vs. sequential; static vs. dynamic; discrete vs. continuous; single vs. multi-agent.
  9. Four basic kinds of agent programs:
    • Simple reflex agents: Select an action on the basis of only the current percept (see the sketch after this list);
    • Model-based reflex agents: To tackle partially observable environments; maintain internal state representing best estimate of current world situation; over time update state using world knowledge;
    • Goal-based agents: Goals describe what agent wants; by changing goals, can change what agent does in same situation;
    • Utility-based agents: Some solutions may be “better” – have higher utility; utility function maps a (sequence of) state(s) onto a real number which can help in optimization or in arbitration among goals or solutions;
    • All of these can be turned into learning agents; a more complex variation is the hybrid agent.
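
As a concrete illustration of the first kind, here is a minimal Python sketch of the book's reflex vacuum agent for the two-square vacuum world; the percept format and action names follow AIMA's example, and everything else is illustrative.

```python
def reflex_vacuum_agent(percept):
    """Simple reflex agent: selects an action from the current percept
    only, with no internal state (AIMA's two-square vacuum world)."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:  # location == "B"
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))  # Suck
print(reflex_vacuum_agent(("A", "Clean")))  # Right
```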

Lecture 2-4

  1. Blind/uninformed search; heuristic/informed search
  2. Uninformed search strategies (a BFS sketch follows this list):

     |           | BFS | Uniform cost   | DFS | Depth-limited | Iterative deepening | Bidirectional |
     |-----------|-----|----------------|-----|---------------|---------------------|---------------|
     | Complete? | Y   | Y              | N   | N             | Y                   | Y             |
     | Optimal?  | Y   | Y              | N   | N             | Y                   | Y             |
     | Time      | b^d | b^(1 + ⌊C*/ε⌋) | b^m | b^l           | b^d                 | b^(d/2)       |
     | Space     | b^d | b^(1 + ⌊C*/ε⌋) | b*m | b*l           | b*d                 | b^(d/2)       |
  3. Why do people have a hard time solving missionaries and cannibals? There are a number of constraints on solutions to this problem: for example, the boat only holds one or two people, and you can never have more cannibals than missionaries on either side of the river. Constraints are generally thought of as making a problem harder to solve.
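
To make the table concrete, here is a minimal Python sketch of breadth-first graph search; the `successors` signature and the toy graph are made up for illustration, and unit step costs are assumed (the setting in which BFS is optimal).

```python
from collections import deque

def bfs(start, goal_test, successors):
    """Breadth-first graph search; optimal when all step costs are equal
    (cf. the table above). Returns a state path from start to a goal."""
    frontier = deque([start])
    parent = {start: None}                 # also serves as the visited set
    while frontier:
        state = frontier.popleft()
        if goal_test(state):
            path = []
            while state is not None:       # walk parents back to the start
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:          # graph search: skip visited states
                parent[nxt] = state
                frontier.append(nxt)
    return None

# Toy usage on a made-up graph.
graph = {"S": ["A", "B"], "A": ["G"], "B": ["C"], "C": ["G"], "G": []}
print(bfs("S", lambda s: s == "G", lambda s: graph[s]))  # ['S', 'A', 'G']
```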

Lecture 5

  1. Informed search strategies (an A* sketch follows this list):

     |           | BestFS | A*                                                              | RBFS                 | (S)MA*                         |
     |-----------|--------|-----------------------------------------------------------------|----------------------|--------------------------------|
     | Complete? | N      | Y                                                               | Y (same as A*)       | Y (if a solution is reachable) |
     | Optimal?  | N      | Y (if h admissible (tree search) or consistent (graph search))  | Y (same as A*)       | Y (if a solution is reachable) |
     | Time      | b^m    | expands all nodes with f(n) < C* (C* = cost of the optimal path), some nodes with f(n) = C*, and no nodes with f(n) > C* | hard to characterize | — |
     | Space     | b^m    | exponential                                                     | b*d                  | constant                       |
  2. A heuristic is admissible if it never overestimates the cost to reach the goal.
  3. A heuristic is consistent if for every node n and every successor n' of n generated by any action a: h(n) ≤ c(n, a, n') + h(n'). A consistent heuristic is also admissible. It is rare for an admissible heuristic to be inconsistent.
  4. Memory-Bounded Heuristic Search:
    • Iterative-deepening A* (IDA*): Cutoff information is the f-cost (g + h) instead of the depth (see the IDA* sketch after this list);
    • Recursive best-first search (RBFS): Search depth-first, only keep current path and branches from it in memory (saves memory over keeping entire level); Keep track of f value (f-limit) of best sibling of path currently exploring; a bit more efficient than IDA*;
    • (Simple) Memory-bounded A* ((S)MA*): Drop the worst leaf node when memory is full.
  5. The closer the effective branching factor b* is to 1, the better the heuristic (b* → 1 ⇒ heuristic quality ↑).
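
Below is a minimal Python sketch of A* graph search, consistent with the table's conditions; the `successors` signature, toy graph, and heuristic values are assumptions for illustration (the heuristic shown is consistent, as item 3 requires for graph-search optimality).

```python
import heapq
from itertools import count

def a_star(start, goal_test, successors, h):
    """A* graph search sketch. `successors(s)` yields (next_state, step_cost)
    pairs; optimality needs a consistent h for graph search (item 3)."""
    tie = count()                                  # tie-breaker: never compare states
    frontier = [(h(start), next(tie), 0, start)]   # entries are (f = g + h, tie, g, state)
    best_g = {start: 0}
    while frontier:
        f, _, g, state = heapq.heappop(frontier)
        if goal_test(state):
            return g                               # cost of an optimal solution
        if g > best_g.get(state, float("inf")):
            continue                               # stale queue entry, skip
        for nxt, cost in successors(state):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), next(tie), g2, nxt))
    return None

# Toy usage: made-up graph with step costs and a consistent heuristic.
graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 3)], "B": [("G", 1)], "G": []}
h = {"S": 2, "A": 2, "B": 1, "G": 0}
print(a_star("S", lambda s: s == "G", lambda s: graph[s], lambda s: h[s]))  # 4
```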
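
A sketch of IDA* from item 4: plain iterative deepening, except the cutoff (`bound`) is an f-cost rather than a depth. The toy graph and heuristic are the same illustrative values as in the A* sketch.

```python
def ida_star(start, goal_test, successors, h):
    """IDA* sketch: depth-first contour search whose cutoff is the f-cost
    (g + h) instead of the depth. `successors(s)` yields (state, cost)."""
    bound = h(start)
    path = [start]

    def search(g, bound):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return f                       # report the smallest exceeded f-cost
        if goal_test(state):
            return True
        minimum = float("inf")
        for nxt, cost in successors(state):
            if nxt not in path:            # avoid cycles on the current path
                path.append(nxt)
                result = search(g + cost, bound)
                if result is True:
                    return True
                minimum = min(minimum, result)
                path.pop()
        return minimum

    while True:
        result = search(0, bound)
        if result is True:
            return path                    # a cheapest solution path
        if result == float("inf"):
            return None                    # no solution exists
        bound = result                     # raise the cutoff to the next f-cost

graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 3)], "B": [("G", 1)], "G": []}
h = {"S": 2, "A": 2, "B": 1, "G": 0}
print(ida_star("S", lambda s: s == "G", lambda s: graph[s], lambda s: h[s]))
# ['S', 'A', 'G']
```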

Lecture 6

  1. The success of hill climbing depends very much on the shape of the state-space landscape: if there are few local maxima and plateaux, random-restart hill climbing will find a good solution very quickly. NP-hard problems typically have an exponential number of local maxima to get stuck on. Despite this, a reasonably good local maximum can often be found after a small number of restarts.
  2. Local beam search: Start with k randomly generated nodes; generate all successors of each of the k; keep the best k of them; repeat until a goal or stop condition is reached.
  3. Simulated annealing: for a bad move (ΔE < 0), T ↓ ⇒ ΔE / T → −∞ ⇒ e^(ΔE / T) → 0 ⇒ probability of accepting bad moves ↓ (see the sketch below).
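
A minimal Python sketch of simulated annealing under the acceptance rule above; the cooling schedule, objective, and neighbor function are made up for illustration.

```python
import math
import random

def simulated_annealing(start, value, neighbor, schedule, max_steps=10_000):
    """Simulated annealing sketch (maximizing `value`). Bad moves
    (delta_e < 0) are accepted with probability e^(delta_e / T),
    which shrinks toward 0 as the temperature T decreases."""
    current = start
    for t in range(1, max_steps):
        T = schedule(t)
        if T < 1e-9:                       # effectively frozen: stop
            break
        nxt = neighbor(current)
        delta_e = value(nxt) - value(current)
        if delta_e > 0 or random.random() < math.exp(delta_e / T):
            current = nxt  # uphill always; downhill with prob. e^(delta_e/T)
    return current

# Toy usage: maximize a made-up objective f(x) = -(x - 3)^2 over the integers.
best = simulated_annealing(
    start=0,
    value=lambda x: -(x - 3) ** 2,
    neighbor=lambda x: x + random.choice([-1, 1]),
    schedule=lambda t: 100 * 0.95 ** t,    # illustrative geometric cooling
)
print(best)  # typically 3 or close to it
```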

Lecture 7

  1. Classifying game environments: checkers/draughts is deterministic with perfect information, like chess; Scrabble is stochastic with imperfect information, like poker.
  2. Minimax: Complete? Y (if the game tree is finite); Optimal? Y (against an optimal opponent); Time? O(b^m); Space? O(b * m), or O(m) if successors are generated one at a time.
  3. With “perfect ordering”, alpha-beta pruning reduces the time complexity to O(b^(m/2)), effectively doubling the searchable depth (see the sketch below).
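
A standard alpha-beta sketch (not lecture code): the cutoffs below are what bring the node count toward O(b^(m/2)) under good move ordering. The toy tree and helper lambdas are illustrative.

```python
def alphabeta(state, depth, alpha, beta, maximizing, successors, evaluate):
    """Minimax with alpha-beta pruning. Prunes branches that cannot
    influence the final decision; with perfect move ordering the number
    of nodes examined drops from O(b^m) to about O(b^(m/2))."""
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in children:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, successors, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: MIN already has a better option
        return value
    else:
        value = float("inf")
        for child in children:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, successors, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break  # alpha cutoff: MAX already has a better option
        return value

# Toy usage on a fixed 2-ply tree: inner nodes are lists, leaves are numbers.
tree = [[3, 5], [2, 9]]
succ = lambda s: s if isinstance(s, list) else []
print(alphabeta(tree, 4, float("-inf"), float("inf"), True, succ, lambda s: s))  # 3
```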

Lecture 8

  1. Methods for improving backtracking efficiency in CSPs:
    • Most constrained variable (minimum remaining values (MRV) heuristic): Choose the variable with the fewest legal values;
    • Most constraining variable: Choose the variable with the most constraints on remaining variables;
    • Least constraining value: Given a variable, choose the value that rules out the fewest values in the remaining variables.
  2. Forward checking: Keep track of remaining legal values for unassigned variables; terminate search when any variable has no legal values.
  3. Arc consistency: The simplest form of propagation makes each arc consistent: X → Y is consistent iff for every value x of X there is some allowed value y of Y. Time complexity: O(n^2 * d^3) with n variables and d values, since there are at most n^2 arcs, each arc can be queued at most d times, and checking one arc is O(d^2) (see the AC-3 sketch after this list).
  4. Iterative min-conflicts is often surprisingly effective at solving CSPs (see the sketch after this list).
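
A sketch of AC-3, the propagation algorithm behind item 3; the `constraint` callback signature and the toy X < Y problem are assumptions for illustration. Each call to `revise` is the O(d^2) arc check from the complexity argument above.

```python
from collections import deque

def revise(domains, x, y, constraint):
    """Remove values of x with no supporting value in y: the O(d^2) arc check."""
    removed = False
    for vx in set(domains[x]):
        if not any(constraint(x, vx, y, vy) for vy in domains[y]):
            domains[x].discard(vx)
            removed = True
    return removed

def ac3(variables, domains, neighbors, constraint):
    """AC-3 sketch: O(n^2 d^3) overall, as in item 3. `domains[v]` is a set;
    `constraint(x, vx, y, vy)` is True iff that pair of values is allowed.
    Returns False if some domain is wiped out (inconsistency detected)."""
    queue = deque((x, y) for x in variables for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y, constraint):
            if not domains[x]:
                return False
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))  # values were removed: re-check arcs into x
    return True

# Toy usage: constraint X < Y over domains {1, 2, 3}.
doms = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
nbrs = {"X": ["Y"], "Y": ["X"]}
lt = lambda a, va, b, vb: va < vb if (a, b) == ("X", "Y") else vb < va
print(ac3(["X", "Y"], doms, nbrs, lt), doms)  # True {'X': {1, 2}, 'Y': {2, 3}}
```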
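
And a minimal sketch of iterative min-conflicts from item 4, demonstrated on 4-queens; the `conflicts` callback signature is an illustrative choice.

```python
import random

def min_conflicts(variables, domains, conflicts, max_steps=10_000):
    """Iterative min-conflicts sketch. `conflicts(var, value, assignment)`
    counts the constraints violated if var=value under the assignment."""
    # Start from a complete random assignment, then repair it locally.
    assignment = {v: random.choice(list(domains[v])) for v in variables}
    for _ in range(max_steps):
        conflicted = [v for v in variables
                      if conflicts(v, assignment[v], assignment) > 0]
        if not conflicted:
            return assignment              # no violated constraints: solved
        var = random.choice(conflicted)
        # Move var to a value that minimizes conflicts (ties broken arbitrarily).
        assignment[var] = min(domains[var],
                              key=lambda val: conflicts(var, val, assignment))
    return None                            # no solution within max_steps

# Toy usage: 4-queens with one queen per column; a value is a row index.
n = 4
cols = list(range(n))
doms = {c: range(n) for c in cols}

def queen_conflicts(col, row, asg):
    return sum(1 for c in cols if c != col and
               (asg[c] == row or abs(asg[c] - row) == abs(c - col)))

print(min_conflicts(cols, doms, queen_conflicts))  # e.g. {0: 1, 1: 3, 2: 0, 3: 2}
```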