<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0061)http://www.maths.nott.ac.uk/personal/anw/G13GT1/alphabet.html -->
<HTML><HEAD><TITLE>G13GAM -- Game Theory -- alpha-beta pruning</TITLE>
<META content="text/html; charset=utf-8" http-equiv=Content-Type></HEAD>
<BODY bgColor=#d8a070 text=black>
<H1>G13GAM -- Game Theory </H1><A name=top></A>
<H2>Alpha-beta pruning </H2>Alpha-beta pruning is a potentially hard concept
because, at one level, it's counter-intuitive. We can understand that we don't
want to look at <I>bad</I> moves, but why is it that we don't want to look at
<I>good</I> moves? The first partial answer is that this is the other side of
the same coin -- a move that is good for me is bad for you. The second partial
answer is that it isn't that we don't want to know about good moves, rather that
it's their existence not their details that matter. The third partial answer is
that we need not so much <I>good</I> moves as <I>good enough</I> moves; it is
because `good enough' moves are much easier to find than `good' moves that this
technique makes a huge improvement to any game-playing program.
<H3>What is alpha-beta pruning? </H3>[The name is essentially completely
arbitrary -- the two variables concerned in a computer program happened to be
called alpha and beta, and we might just as well have had fred-jim pruning, p-q
pruning or foo-bar pruning.]
<P><IMG align=right border=2
src="G13GAM -- Game Theory -- alpha-beta pruning.files/fig1.gif"> When we are
called upon to analyse a position, we will be given two `levels', alpha and
beta. These are the `floor' and `ceiling', respectively, of the range of values
our caller is interested in. Think of the beta level as representing a value
which is `good enough'; we'll worry later about what `good enough' really means,
and also about what the alpha level means. Let us suppose that a full analysis
would give us the picture shown to the right: we see that of the nine possible
moves, moves 1, 4, 5 and 9 are reasonably good, moves 2, 6 and 7 are so-so, move
3 is a disaster, and move 8 is the star move we would like to find. These values
could be found by static evaluation, or by dynamic [recursive] evaluations
looking deeper in the game tree. <IMG align=left border=2
src="G13GAM -- Game Theory -- alpha-beta pruning.files/fig2.gif">
<P>The second picture shows what could happen for various possible beta levels.
The top [dotted] line shows a high level. In this case, almost no gain is made;
none of the first seven moves is `good enough', but move 8 at last goes over the
line, and we can return its value. If the line had been a little higher, then
<I>no</I> move would have been over the line, nothing would have been gained,
and we would have had to do a full evaluation of all nine moves. The middle
[dashed] line shows a lower beta level. In this case, move 4 is already `good
enough'. <I>We do not need even to glance at moves 5 to 9. We can stop the
analysis immediately, and return the value of move 4.</I> The bottom [mixed]
line shows an even more favourable situation; here the very first move is `good
enough', and we can return its value without glancing at moves 2 to 9. This can
be a <I>huge</I> saving of time. We've saved 8/9 of the analysis already; but it
can be better than this in the context of dynamic analysis, for we can make
similar savings at every level of the tree, equivalent [in this example] to
having a computer which is potentially 9×9×9×... times faster, and the deeper we
look, the bigger the advantage. Note that the advantage is maximised by having
the beta level as low as possible, and by having `good enough' moves as early in
the list of moves as possible. <IMG align=right border=2
src="G13GAM -- Game Theory -- alpha-beta pruning.files/fig3.gif">
<P>Note however that once pruning has taken place, we are no longer returning
the true value of a position. The third picture shows this in the context of the
middle [dashed] beta level. The true value of this position is the value of the
best move, move 8. But we are returning the value of move 4, the first `good
enough' move; we don't know whether or not later moves will be even better, so
all we can say is that the true value of this position is <I>at least</I> the
value of move 4. This is shown by the arrow. The [potentially huge] savings made
by pruning the moves in the blue box are [slightly] offset by the loss of
information and the return of a lower bound rather than a true value. Very
often, the lower bound is all we need, and the pruning is pure gain. <IMG
align=left border=2
src="G13GAM -- Game Theory -- alpha-beta pruning.files/fig4.gif">
<P>This leaves two questions. Where do the alpha and beta levels come from? And
what is the alpha level for? They come from whoever asked us to analyse this
position; we in turn will pass them on to any subsidiary [`dynamic'] analysis.
Because positions are evaluated from the point of view of the player to move,
they will be `inverted' when they are passed on -- the caller's `ceiling' is the
callee's `floor', and vice versa. So the alpha level, the `floor', will become
the beta level, the ceiling, for the subsidiary analysis; the higher we can make
the floor, the lower will be the resulting ceiling, and so the more efficient
will be that analysis. The next picture, left, shows this in action. We have
inherited alpha and beta levels from our `parent'. After evaluating move number
1, we are no longer interested in moves that are worse than this, so we can
raise the floor to the corresponding value. Moves 2 and 3 now fall below this
floor; these values come from the `children' discovering moves above their beta
level, so all we can say is that these values are upper bounds on the true
values -- but even the upper bounds are below our floor, so we don't need to
know the true values, just that moves 2 and 3 are worse than move 1. This is
indicated by the arrows again; note that the tops of the arrows are higher than
the corresponding dots were in the previous pictures.
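<P>To make the bookkeeping concrete, here is a tiny numeric sketch of the
floor-raising and inversion just described; every number in it is invented
purely for illustration.
<PRE>
# Tiny numeric illustration of `raising the floor' (all numbers invented).
alpha, beta = -100, +100           # window inherited from our parent
value_move_1 = 20                  # exact value of move 1, from a full child search
alpha = max(alpha, value_move_1)   # the floor rises to 20

# Moves 2 and 3 are now searched with the tighter window, inverted for
# the child: the child's (alpha, beta) is (-beta, -alpha).
print((-beta, -alpha))             # (-100, -20)

# Suppose the child of move 2 finds a reply worth -15 to itself; that is
# above its ceiling of -20, so it cuts off and returns -15 as a lower
# bound.  Negated, move 2 is worth at most +15 to us: an upper bound
# already below our floor of 20, so its exact value is never needed.
child_result = -15
upper_bound_for_us = -child_result
print(upper_bound_for_us < alpha)  # True: move 2 is refuted
</PRE>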
<P>In summary, when we evaluate a move, one of three things will happen:
<I>(a)</I> The value lies between alpha and beta. In this case, this move is
potentially interesting -- it may be the best move in the position, and its
value may be the value of the position as a whole. We can raise the value of
alpha to match, for we are no longer interested in exact values of worse moves.
<I>(b)</I> The value is at least beta. This move is `good enough', and causes a
`beta cutoff'. We return its value as a lower bound for the true value without
bothering to evaluate the remaining moves. This value will, after inversion, be
below the floor for the `parent'. The move causing the cutoff is sufficient to
`refute' the move leading to this position. <I>(c)</I> The value is at most
alpha. This move has been refuted, possibly by a beta cutoff in the `child'.
This value is an upper bound, not the true value, which may be much worse, but
this will not affect the value of the current position.
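<P>Putting the three cases together, here is a minimal sketch in Python of the
search loop described above, written in the common `negamax' style in which
every position is scored for the player to move. The helpers evaluate(),
legal_moves() and play() are assumed rather than taken from this page.
<PRE>
INFINITY = 10**9

def alphabeta(position, depth, alpha, beta):
    """Return the value of the position for the player to move, or only a
    bound on it if the true value lies outside (alpha, beta)."""
    if depth == 0:
        return evaluate(position)          # static evaluation
    best = -INFINITY
    for move in legal_moves(position):
        # the caller's floor and ceiling become the child's ceiling and floor
        value = -alphabeta(play(position, move), depth - 1, -beta, -alpha)
        if value > best:
            best = value
        if value >= beta:                  # case (b): `good enough', beta cutoff;
            return value                   #   the remaining moves are never seen
        if value > alpha:                  # case (a): potentially the best move,
            alpha = value                  #   so raise the floor for later moves
        # case (c): value is at most alpha -- only an upper bound (the child
        #   may have cut off), and it cannot affect this position's value
    return best
</PRE>
<P>A top-level call would be alphabeta(start_position, depth, -INFINITY,
+INFINITY), which is the `correct' usage discussed under Refinements below.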
<H3>Value of alpha-beta </H3>Some [rather ignorant!] programmers treat
alpha-beta pruning as an `optional extra', a modest bonus that makes searching
slightly more efficient at the expense of sophisticated programming. Not so! Its
use typically <I>doubles</I> the depth to which the program can search. Since
the tree is growing exponentially with increasing depth, this is equivalent to
an exponential growth in the speed with which the program can search.
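<P>A back-of-envelope calculation shows why doubled depth is such a large gain:
with branching factor b, a full search to depth d examines roughly b^d
positions, while a perfectly ordered alpha-beta search examines only on the
order of b^(d/2), so for the same work it reaches about twice the depth. The
sketch below merely evaluates these expressions for an invented, chess-like
branching factor.
<PRE>
# Back-of-envelope check of the `doubled depth' claim; the branching
# factor and search budget below are invented for illustration.
import math

b = 30                                   # chess-like branching factor
budget = b ** 5                          # leaf evaluations we can afford
depth_plain = math.log(budget, b)        # full search: about 5 plies
depth_pruned = 2 * math.log(budget, b)   # perfectly ordered alpha-beta: about 10
print(round(depth_plain), round(depth_pruned))
</PRE>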
<P>However, the pruning is seen to full advantage only if <I>(a)</I> the beta
levels are low enough to be interesting, and <I>(b)</I> the first move tried is
often `good enough'. So, much effort has gone into ensuring that these two
conditions hold.
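<P>One simple way of working towards condition <I>(b)</I> is to sort the moves
by a cheap heuristic before the main loop, so that a `good enough' move tends to
come first. A sketch follows; the heuristic (captures first, biggest victim
first) and the helper capture_value() are invented for illustration, and real
programs use several further devices of this kind.
<PRE>
# Invented move-ordering sketch: try the most promising moves first, so
# that a beta cutoff, if there is one, happens as early as possible.
def ordered_moves(position):
    moves = legal_moves(position)     # assumed helper, as before
    # capture_value() is a hypothetical cheap scorer: the value of the
    # piece captured by the move, or 0 for a quiet move.
    return sorted(moves, key=lambda m: capture_value(position, m), reverse=True)
</PRE>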
<P>An insight into the help that alpha-beta pruning gives can be had by
imagining that Kasparov is sitting next to you suggesting good moves to try.
When you are doing well, and when his suggestions are good, you do not need to
consider anything beyond his ideas, and your analysis can be conducted much more
efficiently -- a beta cutoff! If his suggestions are bad, then you will have to
consider other ideas as well [this is point <I>(b)</I> above!], but even rotten
suggestions do not cost you anything compared with no suggestions at all. If your
<I>position</I> is bad, then his suggestions do not [indeed, cannot] help; all
the evaluations will be too low to cause a cutoff; but even in this case, his
suggestions will raise the floor as far as it can be raised [if he gives you the
best ideas first], so that deeper analysis will see a lower ceiling and easier
cutoffs [this is point <I>(a)</I> above]. In most lines, with best play [as
suggested by Kasparov!], good positions and bad positions will alternate [my
piece up is your piece down!]; alpha-beta pruning causes very quick cutoffs in
the good positions, which is why you can search twice as deeply. Of course, in
real life your computer does not have access to Kasparov; it has to do the best
it can with its own analysis.
<H3>Refinements </H3>The `correct' way to use alpha-beta pruning is to start
with alpha set to minus infinity, beta to plus infinity, and to rely on the
natural `raising of the floor', seen above, to make the bounds more useful.
However, this creates a tension with the observation above that the pruning
works best if the bounds are as tight as possible. So other strategies are
possible.
<P>Essentially, in all of these, we lie to our children about our proposed
alpha-beta bounds. Our children must, however, tell us the truth, as outlined
above: if the value is between alpha and beta, they must return the correct
value, otherwise they return an upper or lower bound. Our lies mean that we
place the bounds closer than they should have been, and therefore that the
search is more efficient. The penalty is that sometimes we are caught out. Even
so, we will quite often get away with it; if the returned value is outside the
true bounds, then we only need the upper or lower bound anyway. If however it is
inside the true bounds but outside the fraudulent bounds, then we have to ask
our children to look again, and to refine their evaluations. So, the search is
made sometimes more efficient and sometimes less. Where the balance lies is a
matter of pragmatism and experiment.
<P>Once this viewpoint is accepted, then describing this as `lying' carries
unwanted moral overtones. We should instead re-define the purpose of the
analysis as being to provide a value which is either `guaranteed' or else merely
a bound, as described above. Then we can ask our children questions with a clear
conscience!
<P>Effectively, we are asking of the analysis questions like `In this position,
is White between 1 and 2 pawns up?', and getting answers like `Yes, White is
1.376 pawns up' or `No, White is at least 2.3 pawns up', or `No, White is no
more than 0.52 pawns up'. [All of these questions and answers are relative to the
requested depth of analysis, of course.] These questions can be answered much
more efficiently than the blunt `How many pawns up is White?', and the answers
are usually, but not always, sufficient. When they are not sufficient, we have
to ask again, `OK, is White between 2.3 and 3 pawns up?', and balance these extra
questions against the extra efficiency.
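<P>A minimal sketch of this `ask, and ask again if necessary' pattern, sometimes
called an aspiration window: guess the value, ask a narrow question around the
guess, and fall back to the blunt wide-open question only if the answer turns
out to be merely a bound. The window width and re-search policy below are
invented; alphabeta() is the routine sketched earlier, with values in hundredths
of a pawn.
<PRE>
# Invented aspiration-window sketch; values in hundredths of a pawn.
def search_with_window(position, depth, guess, margin=50):
    alpha, beta = guess - margin, guess + margin   # the narrow question
    value = alphabeta(position, depth, alpha, beta)
    if value <= alpha or value >= beta:            # caught out: only a bound
        # ask again, this time with the honest, wide-open window
        value = alphabeta(position, depth, -INFINITY, +INFINITY)
    return value                                   # now a `guaranteed' value
</PRE>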
<P><A name=tail></A><A
href="http://www.maths.nott.ac.uk/personal/anw/G13GT1/alphabet.html#top">top of
page </A>
<HR>
<A href="mailto:anw@maths.nottingham.ac.uk"><IMG alt=border=0
src="G13GAM -- Game Theory -- alpha-beta pruning.files/mailme.gif"> E-mail
me</A>. <A href="http://www.maths.nottingham.ac.uk/personal/anw"><IMG
alt=border=0
src="G13GAM -- Game Theory -- alpha-beta pruning.files/home-btn-s.gif"> My home
page</A>. <A href="http://www.nottingham.ac.uk/"><IMG alt=border=0
src="G13GAM -- Game Theory -- alpha-beta pruning.files/logo.gif"> The University
of Nottingham</A>.
<HR>
Copyright © Dr A. N. Walker, 1997. <BR>This Web page may be freely used for
purposes related to the above module or for private study; requests for
permission to make other uses should be referred to the author. </BODY></HTML>