<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0049)http://www.ics.uci.edu/~eppstein/180a/970417.html -->
<HTML><HEAD><TITLE>ICS 180, April 17, 1997</TITLE>
<META content="text/html; charset=gb2312" http-equiv=Content-Type>
<META name=Owner value="eppstein">
<META name=Reply-To value="eppstein@ics.uci.edu">
<META content="MSHTML 5.00.2614.3500" name=GENERATOR></HEAD>
<BODY><IMG alt="" height=72 src="ICS 180, April 17, 1997.files/icslogo2.gif"
width=472>
<P><A href="http://www.ics.uci.edu/~eppstein/180a/index.html">
<H1>ICS 180A, Spring 1997:<BR>Strategy and board game programming</H1></A>
<H2>Lecture notes for April 17, 1997<BR>Minimax and negamax search</H2>
<H3>Game Trees</H3>
<P>For any game, we can define a rooted tree (the "game tree") in which the
nodes correspond to game positions, and the children of a node are the positions
that can be reached from it in one move. For instance, tic-tac-toe:
<P>
<CENTER><IMG src="ICS 180, April 17, 1997.files/tictactoetree.gif"></CENTER>
<P>(Actually, the root of this tree should have nine children, but I've left out
some symmetric cases. If the same board configuration arises from two different
sequences of moves, we create two separate nodes, so this really is a tree.
We'll see later when we talk about hashing how to take advantage of duplicated
nodes to speed up the search -- we only need to search one copy of a position,
and use the same search results everywhere else that position appears in the
tree. We also assume that the players take turns moving, with no multiple moves
or skipped turns; these complications can be dealt with by treating an entire
sequence of actions by a single player as forming a single move. Finally, we'll
assume this tree is finite, so that we're not dealing with a game that can go on
forever or has infinitely many choices for a single move.)
<P>There are three kinds of nodes in the tree:
<OL>
<LI><B>Internal nodes at even levels</B> correspond to positions in which the
first player is to move.
<LI><B>Internal nodes at odd levels</B> correspond to positions in which the
second player is to move.
<LI><B>Leaves</B> correspond to positions at which the game has ended. One
player or the other has won, or perhaps the game is drawn. </LI></OL>
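<P>To make this concrete, here is a minimal sketch (mine, not from the
original notes) of what a game-tree node might look like in C. The
position contents are game-specific, and real programs usually generate
children on the fly during search rather than storing an explicit tree:
<PRE>struct position { int squares[9]; /* game-specific state, e.g. a tic-tac-toe board */ };

/* One node of the game tree. num_children == 0 marks a leaf,
   a position at which the game has ended. */
struct node {
    struct position pos;     /* the game position at this node */
    int player;              /* 0 if the first player is to move, 1 if the second */
    int num_children;        /* number of legal moves; 0 at the leaves */
    struct node **children;  /* one child node per legal move */
};
</PRE>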
<H3>Game Tree Evaluation</H3>
<P>Suppose we have an internal node for which all children are leaves, so that
the game is certain to end in one more move. Then we can assume that the player
to move is going to pick the best move. If there is a leaf giving a won position
for him, he will move to it and win. If not, but some leaf gives a drawn
position for him, he will move to it and draw. But, if all leaves give won
positions for his opponent, he will lose no matter what happens.
<P>So, we know what the game outcome should be from nodes one level above the
leaves. Once we know that, we can apply the same analysis bottom up, to
determine the correct outcome from nodes two levels above the leaves, three
levels up, and so on, until we reach the root of the tree. At each node, the
position is won for the player on move if he can find a child of that node
giving him a won position; it is drawn if he can find a child giving him a draw;
and if neither of these holds then it is lost. This gives us an algorithm for
playing games perfectly if only we had enough computation time, but for any
reasonable game we won't have enough computation time. The trees are too big.
<P>This also tells us that a "correct" evaluation function needs only
three values: win, lose, and draw. The evaluations we use in computer game
programs have a wider range of real-number values only because they're
inaccurate. If we represent a first-player win as the value +1, a draw as the
value 0, and a second-player win as the value -1, then the value of each
internal node in the game tree is the maximum or minimum of its children's
values, depending on whether the first or second player is to move respectively.
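<P>As a sketch (again mine, using the node type above), this bottom-up
rule is just a recursion: the value of a leaf comes from the rules of
the game, and the value of an internal node is the maximum or minimum of
its children's values, depending on who is to move:
<PRE>/* Value of an ended game at a leaf: +1 first-player win, 0 draw,
   -1 second-player win. Assumed supplied by the rules of the game. */
int leaf_value(struct node *n);

/* Exact value of any node: maximize if the first player is to move,
   minimize if the second player is. */
int game_value(struct node *n)
{
    if (n->num_children == 0) return leaf_value(n);
    int best = game_value(n->children[0]);
    for (int i = 1; i < n->num_children; i++) {
        int v = game_value(n->children[i]);
        if (n->player == 0 && v > best) best = v;    /* first player: max */
        if (n->player == 1 && v < best) best = v;    /* second player: min */
    }
    return best;
}
</PRE>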
<H3>Partial Game Trees</H3>In practice, our search algorithms will work by only
expanding part of the game tree. We use some kind of <I>stopping rule</I> to
decide to stop expanding the tree at certain internal nodes, making them leaves;
for instance, we might stop after sequences of eight moves. Since the game won't
have ended at those leaves, we guess at how likely it is for one or the other
player to win, using the evaluation functions. Then, we make the assumption that
within the nodes we've expanded, one player will be trying to reach positions
with large values of the evaluation function, while the other player will be
trying to reach positions with small values.
<P>If both players really play this way, then we can determine the value of the
leaf they will reach by the same min-max procedure outlined above: compute the
value of each internal node as either the maximum or minimum of its children's
values, depending on whether the first or second player is to move respectively.
The path to this leaf is known as the <I>principal variation</I>. The basic
principle of minimax game search is to expand a partial game tree, find the
principal variation, and make the move forming the first step in this variation.
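<P>In code, the only change from the exact recursion sketched earlier is
the stopping rule: stop expanding after a fixed number of levels and
fall back on the evaluation function. This is again my sketch, with
eval() an assumed primitive; at a true leaf it is assumed to return the
exact win/draw/loss value. The negamax code in the next section folds
the max and min cases into one:
<PRE>/* Heuristic evaluation of a position where we stopped expanding:
   large values look good for the first player, small values for
   the second. Assumed given. */
double eval(struct node *n);

/* Depth-limited version of game_value: stop after 'depth' levels. */
double minimax(struct node *n, int depth)
{
    if (depth <= 0 || n->num_children == 0) return eval(n);
    double best = minimax(n->children[0], depth - 1);
    for (int i = 1; i < n->num_children; i++) {
        double v = minimax(n->children[i], depth - 1);
        if (n->player == 0 && v > best) best = v;    /* max level */
        if (n->player == 1 && v < best) best = v;    /* min level */
    }
    return best;
}
</PRE>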
<H3>Breadth First and Depth First Search; Negamax Code</H3>As described above,
the computation of game tree values is breadth first (we compute the values in
the tree bottom-up, a single level in the tree at a time). Instead, we can
perform a depth-first (post-order) recursive traversal of the tree that
evaluates a node by recursively evaluating its children and keeping track of the
values seen so far. This is much more space-efficient because it doesn't need to
store the whole game tree, only a single path (which would generally be quite
short, e.g. eight moves with the example stopping rule above). As we'll see next
time when I discuss alpha-beta search, depth-first traversal also has the
advantage that you can use the information you've found so far to help decide
not to visit certain irrelevant parts of the tree, saving a lot of time.
<P>It's convenient to modify the game tree values slightly, so that we only need
maximization operations rather than having to alternate between minima and
maxima. At odd levels of the tree (nodes in which the second player is to move),
negate the values defined above. Then at each node, these modified values can be
found by computing the maximum of the negations of the node's children's values.
Maybe this will make more sense if I write down some source code for game tree
search: <PRE>// search game tree to given depth, return evaluation of root node
// (eval(pos) must score the position from the viewpoint of the player
// to move, so that the negation at each level makes sense)
double negamax(int depth)
{
    if (depth <= 0 || game is over) return eval(pos);
    else {
        double e = -infty;
        for (each move m available in pos) {
            make move m;
            e = max(e, -negamax(depth - 1));
            unmake move m;
        }
        return e;
    }
}
</PRE>Note that this only finds the evaluation, but doesn't determine which move
to actually make. We only need to find an actual move at the root of the tree
(although many programs return an entire principal variation). To do this, we
slightly modify the search performed at the root: <PRE>// search game tree to given depth, return best move from root node
move rootsearch(int depth)
{
    double e = -infty;
    move mm;                            // best move found so far
    for (each move m available in pos) {
        make move m;
        double em = -negamax(depth - 1);
        if (e < em) {                   // new best move?
            e = em;
            mm = m;
        }
        unmake move m;
    }
    return mm;
}
</PRE>
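<P>To tie the pseudocode down, here is a complete C version for the
tic-tac-toe example from the start of these notes (my illustration, not
part of the original notes). The board fields and the move loop stand in
for the "make move" and "unmake move" primitives, and since tic-tac-toe
is small enough to search to the end, no depth cutoff or evaluation
function is needed:
<PRE>/* cell[i]: 0 empty, +1 the first player (X), -1 the second (O);
   to_move is +1 or -1. */
struct ttt { int cell[9]; int to_move; };

static const int line[8][3] = {
    {0,1,2}, {3,4,5}, {6,7,8},          /* rows */
    {0,3,6}, {1,4,7}, {2,5,8},          /* columns */
    {0,4,8}, {2,4,6}                    /* diagonals */
};

/* 1 if player p has three in a row, else 0. */
static int has_won(const struct ttt *b, int p)
{
    for (int i = 0; i < 8; i++)
        if (b->cell[line[i][0]] == p &&
            b->cell[line[i][1]] == p &&
            b->cell[line[i][2]] == p)
            return 1;
    return 0;
}

/* Negamax value for the player to move: +1 win, 0 draw, -1 loss. */
int ttt_negamax(struct ttt *b)
{
    if (has_won(b, -b->to_move))        /* opponent's last move won */
        return -1;
    int best = -2, any = 0;             /* -2 is below any real value */
    for (int i = 0; i < 9; i++) {
        if (b->cell[i] != 0) continue;
        any = 1;
        b->cell[i] = b->to_move;        /* make move */
        b->to_move = -b->to_move;
        int v = -ttt_negamax(b);
        b->to_move = -b->to_move;       /* unmake move */
        b->cell[i] = 0;
        if (v > best) best = v;
    }
    return any ? best : 0;              /* full board, no win: a draw */
}
</PRE>Starting from an empty board with to_move = +1, ttt_negamax
returns 0: perfect play by both sides draws the game.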
<H3>Analysis of negamax: branching factor, depth</H3>Traditionally one analyzes
game tree algorithms by making some simplifying assumptions about what the game
tree looks like. We assume that each internal node has the same number of
children; this number is known as the <I>branching factor</I>. We also assume
that we search the tree to some fixed <I>depth</I> (as does the algorithm above)
and that the game doesn't end early (before this depth is reached).
<P>With these assumptions, it's easy to write down a formula for the amount of
time the negamax program uses: it's just proportional to the number of tree
nodes expanded. (It may look like we should multiply by something since there is
a loop nested within each call to negamax, but the time spent in this loop can
be charged to the recursive calls made in it.) If the branching factor is b and
the depth is d, this number is
<P>
<CENTER>1 + b + b^2 + b^3 + ... + b^d = b^d (1 - 1/b^(d+1))/(1 - 1/b).</CENTER>
<P>The stuff in parentheses at the end of the formula is very close to one, so
the overall time is very close to b^d.
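<P>For example, with chess-like numbers b = 30 and d = 5, the exact
count (30^6 - 1)/29 is about 25.1 million nodes, while b^d = 30^5 is
about 24.3 million: the same to within a few percent.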
<P>If our game doesn't meet the strict assumptions above, we can work backwards
and define the <I>effective branching factor</I> to be whatever value of b works
to make the formula above describe our program's running time. Even less
formally, we'll use "branching factor" to describe the average number of moves
available from a "typical" position in a game.
<P>What can we say about this formula? First, it's exponential. This means that
we won't be able to search too many nodes; if we get a computer twice as fast as
the old one, we will only be able to increase d by some small number of levels.
Second, it depends very strongly on the branching factor b. In a game with a
small branching factor (like checkers, in which there may often be as few as
three moves to search) we can search much deeper than chess (which may have 30
or so moves in a position) or go (hundreds of moves in a position). So we'd like
b to be as small as possible, but unfortunately it's more a function of what
game we're working on and less a function of how well we can program. However,
the technique I'll talk about next time, alpha-beta pruning, acts to reduce the
effective branching factor considerably: if we're lucky, to the square root of
its value in unpruned game trees, which lets us search to twice the depth we
might without using alpha-beta.
<H3>Iterated Deepening</H3>One question remains with the negamax code above:
what do we give it for its depth argument? Primitive game programs just set it
to some fixed number, but this will result in a lot of variation in the amount
of time the program takes per move. Instead you'd probably like something that
chooses the search depth based on the amount of time the search will take.
Fortunately, the exponential nature of game search has one advantage: it makes
this sort of control easy through a technique known as "iterated deepening":
start searching very shallowly, then repeatedly increase the depth until you run
out of time: <PRE>depth = 0;
while (enough time left for another level of search) {
    depth++;
    m = rootsearch(depth);
}
make move m;
</PRE>It seems like we're wasting time, since all but the last search is thrown
away. But the same analysis as before shows that the amount of time wasted is
very small: the times for the different levels add together like 1 + b + b^2 + ...,
a sum we've already seen comes out very close to its last term b^d. So,
iterated deepening is cheap and provides good time-based control. It's also
helpful in other ways: we can use the results of shallower searches to help
choose what order to search the moves in deeper searches, and as we'll see in
alpha-beta searching this ordering is critical for fast searching.
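<P>As a sketch of what this control loop might look like in a real
program (my code, with assumed timing primitives): because each level
costs roughly b times the previous one, the time taken by the last
completed iteration predicts the cost of the next.
<PRE>typedef int move;               /* stand-in for the program's move type */

move rootsearch(int depth);     /* the root search defined above */
double seconds_left(void);      /* assumed: time remaining for this move */
double last_search_time;        /* assumed: set by rootsearch as it runs */

/* Iterated deepening with a simple time budget: stop when the next
   level, costing roughly b times the last, would no longer fit. */
move timed_search(double b)
{
    move m = rootsearch(1);     /* always complete at least depth 1 */
    for (int depth = 2; last_search_time * b < seconds_left(); depth++)
        m = rootsearch(depth);
    return m;
}
</PRE>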
<HR>
<A href="http://www.ics.uci.edu/~eppstein/">David Eppstein, <A
href="http://www.ics.uci.edu/">Dept. Information & Computer Science</A>, <A
href="http://www.uci.edu/">UC Irvine</A>, Wednesday, 16-Apr-1997 11:44:31 PDT.
</BODY></HTML>