Intelligent Systems

State space search

How do we find solutions to the problems described earlier?

state: representation of a physical configuration

node: data structure in the search tree. it contains state, parent node, actions, etc.

Uninformed search strategies

a strategy is defined by the order in which nodes are expanded

uninformed methods: only use problem definition

informed methods: use heuristic function to estimate cost of solution

evaluating performance:

  • completeness: does it always find a solution if one exists?
  • time complexity: number of nodes generated
  • space complexity: maximum number of nodes held in memory
  • optimality: does it always find a least-cost solution?

Summary of uninformed search algorithms

Informed search (heuristics)

Heuristic function: “educated guess that reduces/limits search for solutions”

informedness property of heuristics: if $h_2(n) \ge h_1(n)$ for all $n$ (both admissible), then $h_2$ is more informed than (dominates) $h_1$ and will expand no more nodes

Best-first search

A search: the best-known form of best-first search

avoid expanding paths that are already expensive

evaluation function: $f(n) = g(n) + h(n)$, where $g(n)$ is the cost so far to reach $n$ and $h(n)$ is the estimated cost from $n$ to the goal
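
as a concrete illustration (not from the notes), a minimal Python sketch of best-first search driven by $f(n) = g(n) + h(n)$ with a priority queue; the tiny graph and heuristic values below are made up:

    import heapq

    def best_first_search(start, goal, neighbors, h):
        """Expand the frontier node with the lowest f(n) = g(n) + h(n).

        neighbors(state) yields (successor, step_cost) pairs;
        h(state) estimates the remaining cost to the goal.
        """
        frontier = [(h(start), 0, start, [start])]  # (f, g, state, path)
        best_g = {start: 0}                         # cheapest known g per state
        while frontier:
            f, g, state, path = heapq.heappop(frontier)
            if state == goal:
                return path, g
            for succ, cost in neighbors(state):
                g2 = g + cost
                if g2 < best_g.get(succ, float("inf")):
                    best_g[succ] = g2
                    heapq.heappush(frontier, (g2 + h(succ), g2, succ, path + [succ]))
        return None, float("inf")

    # made-up 4-node graph; the heuristic never overestimates
    graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
             "C": [("D", 1)], "D": []}
    h = {"A": 2, "B": 2, "C": 1, "D": 0}
    print(best_first_search("A", "D", lambda s: graph[s], h.get))
    # (['A', 'B', 'C', 'D'], 3)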

A* Search

A search, but with an admissible heuristic: $h(n)$ never overestimates the true cost to the goal, i.e. $h(n) \le h^*(n)$ for all $n$
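
for intuition, one classic admissible heuristic (an illustration, not from the notes) is Manhattan distance on a 4-connected grid:

    def manhattan(pos, goal):
        """Admissible for 4-connected grid movement: every move changes
        x or y by exactly 1, so the true cost is at least this value."""
        (x1, y1), (x2, y2) = pos, goal
        return abs(x1 - x2) + abs(y1 - y2)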

evaluation:

  • complete and optimal when $h$ is admissible
  • time complexity: exponential in the worst case
  • space complexity: keeps all generated nodes in memory

search has no adversary; the solution is a (heuristic) method for finding a goal

games have an adversary; the solution is a strategy. time limits force an approximate solution.

you need a function to evaluate the “goodness” of a game position

types of games:

|                       | deterministic                | chance                               |
| --------------------- | ---------------------------- | ------------------------------------ |
| perfect information   | chess, checkers, go, othello | backgammon, monopoly                 |
| imperfect information |                              | bridge, poker, scrabble, nuclear war |

Minimax

Setup

two players: MAX, MIN

MAX moves first; the players take turns until the game is over. the winner gets a reward, the loser a penalty.

how does this relate to search?

Optimal strategies

find contingent strategy for MAX assuming infallible MIN.

assume that both players play optimally.

given a game tree, the optimal strategy can be determined from the minimax value of each node:

    minimax(n) = utility(n)                                 if n is a terminal node
                 max of minimax(s) over all successors s    if n is a MAX node
                 min of minimax(s) over all successors s    if n is a MIN node
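
the recurrence translates almost directly into code; a minimal sketch, where `successors`, `utility`, and `is_terminal` are hypothetical game-specific callbacks:

    def minimax(n, is_max, successors, utility, is_terminal):
        """Minimax value of node n; MAX picks the max, MIN the min."""
        if is_terminal(n):
            return utility(n)
        values = [minimax(s, not is_max, successors, utility, is_terminal)
                  for s in successors(n)]
        return max(values) if is_max else min(values)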

Evaluation

complete if the game tree is finite, optimal against an optimal opponent; time complexity $O(b^m)$, space complexity $O(bm)$ (depth-first exploration)

Reducing problems of complexity with Minimax

instead of `if TERMINAL(state) then return UTILITY(state)` do `if CUTOFF-TEST(state, depth) then return EVAL(state)`

this introduces a fixed depth limit, and it also loses completeness!

utility: value based on quality of state

heuristics: value based on estimation of quality of state

heuristic EVAL: typically a weighted linear sum of features of the state, $EVAL(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)$ (e.g. material balance in chess)
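
a sketch of the substitution above, with the same hypothetical callbacks as before plus a heuristic `evaluate` and an assumed depth `limit`:

    def minimax_cutoff(n, depth, is_max, successors, evaluate, is_terminal, limit=4):
        """Minimax with CUTOFF-TEST(state, depth): stop at terminals or at
        a fixed depth limit and fall back to the heuristic EVAL."""
        if is_terminal(n) or depth >= limit:   # CUTOFF-TEST(state, depth)
            return evaluate(n)                 # EVAL instead of UTILITY
        values = [minimax_cutoff(s, depth + 1, not is_max, successors,
                                 evaluate, is_terminal, limit)
                  for s in successors(n)]
        return max(values) if is_max else min(values)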

Alpha-Beta pruning (efficient Minimax)

with minimax, the number of game states is exponential in the number of moves

so, don’t examine every node and prune the subtrees you don’t have to examine

Alpha: value of the best (highest-value) choice found so far for MAX

Beta: value of the best (lowest-value) choice found so far for MIN

at any node, you can prune the remaining successors as soon as $\beta \le \alpha$.

pruning doesn’t affect the final result; entire subtrees can be pruned

good move ordering improves effectiveness of pruning

with ‘perfect ordering’, time complexity is $O(b^{m/2})$, which effectively doubles the search depth
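
the pruning rule in code; again a sketch with hypothetical game-specific callbacks:

    def alphabeta(n, depth, alpha, beta, is_max, successors, evaluate, is_terminal):
        """Minimax with alpha-beta pruning; alpha/beta bound what MAX/MIN
        can already guarantee elsewhere in the tree."""
        if is_terminal(n) or depth == 0:
            return evaluate(n)
        if is_max:
            value = float("-inf")
            for s in successors(n):
                value = max(value, alphabeta(s, depth - 1, alpha, beta, False,
                                             successors, evaluate, is_terminal))
                alpha = max(alpha, value)
                if beta <= alpha:   # MIN above would never allow this branch
                    break           # prune the remaining siblings
            return value
        value = float("inf")
        for s in successors(n):
            value = min(value, alphabeta(s, depth - 1, alpha, beta, True,
                                         successors, evaluate, is_terminal))
            beta = min(beta, value)
            if beta <= alpha:       # MAX above already has something better
                break
        return value

    # initial call: alphabeta(root, limit, float("-inf"), float("inf"), True, ...)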

Search with no or partial information

problems:

  • no information: sensorless (conformant) problems
  • partial information: contingency problems
  • unknown state space: exploration problems

Perfect information Monte Carlo sampling (rdeep)

  1. rollout:
    • assume a belief state (fix the hidden information, so the game becomes perfect-information)
    • play a random game in that state.
  2. average the rollouts for each candidate move
  3. choose the move with the maximum average (see the sketch below)
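
a sketch of the procedure; `sample_world` and `random_rollout` are hypothetical stand-ins for the game-specific parts:

    def pimc_choose(moves, sample_world, random_rollout, n_samples=100):
        """Perfect Information Monte Carlo: for each candidate move, sample
        determinized worlds consistent with the belief state, play random
        games, and pick the move with the best average payoff."""
        averages = {}
        for move in moves:
            total = 0.0
            for _ in range(n_samples):
                world = sample_world()                # belief state -> perfect info
                total += random_rollout(world, move)  # one random game
            averages[move] = total / n_samples
        return max(averages, key=averages.get)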

Games with chance

chance (e.g. dice) is handled with chance nodes: their value is the probability-weighted average (expectation) of their children’s minimax values (expectiminimax)

Summary (Schnapsen)

Phase 2: minimax & alpha-beta pruning (perfect information)

Phase 1: PIMC sampling (imperfect information)

what next? give the agent information about the game

Search direction

Data-driven (forward): start from the initial state and search toward a goal (e.g. exploring a maze from its entrance)

Goal-driven (backward): start from the goal state and search back toward the initial state; prefer whichever search direction has the smaller branching factor