<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0061)http://www.maths.nott.ac.uk/personal/anw/G13GT1/alphabet.html -->
<HTML><HEAD><TITLE>G13GAM -- Game Theory -- alpha-beta pruning</TITLE>
<META content="text/html; charset=utf-8" http-equiv=Content-Type></HEAD>
<BODY bgColor=#d8a070 text=black>
<H1>G13GAM -- Game Theory </H1><A name=top></A>
<H2>Alpha-beta pruning </H2>Alpha-beta pruning is a potentially hard concept
because, at one level, it's counter-intuitive. We can understand that we don't
want to look at <I>bad</I> moves, but why is it that we don't want to look at
<I>good</I> moves? The first partial answer is that this is the other side of
the same coin -- a move that is good for me is bad for you. The second partial
answer is that it isn't that we don't want to know about good moves, rather that
it's their existence not their details that matter. The third partial answer is
that we need not so much <I>good</I> moves as <I>good enough</I> moves; it is
because `good enough' moves are much easier to find than `good' moves that this
technique makes a huge improvement to any game-playing program.
<H3>What is alpha-beta pruning? </H3>[The name is essentially completely
arbitrary -- the two variables concerned in a computer program happened to be
called alpha and beta, and we might just as well have had fred-jim pruning, p-q
pruning or foo-bar pruning.]
<P><IMG align=right border=2
src="G13GAM -- Game Theory -- alpha-beta pruning.files/fig1.gif"> When we are
called upon to analyse a position, we will be given two `levels', alpha and
beta. These are the `floor' and `ceiling', respectively, of the range of values
our caller is interested in. Think of the beta level as representing a value
which is `good enough'; we'll worry later about what `good enough' really means,
and also about what the alpha level means. Let us suppose that a full analysis
would give us the picture shown to the right: we see that of the nine possible
moves, moves 1, 4, 5 and 9 are reasonably good, moves 2, 6 and 7 are so-so, move
3 is a disaster, and move 8 is the star move we would like to find. These values
could be found by static evaluation, or by dynamic [recursive] evaluations
looking deeper in the game tree. <IMG align=left border=2
src="G13GAM -- Game Theory -- alpha-beta pruning.files/fig2.gif">
<P>The second picture shows what could happen for various possible beta levels.
The top [dotted] line shows a high level. In this case, almost no gain is made;
none of the first seven moves is `good enough', but move 8 at last goes over the
line, and we can return its value. If the line had been a little higher, then
<I>no</I> move would have been over the line, nothing would have been gained,
and we would have had to do a full evaluation of all nine moves. The middle
[dashed] line shows a lower beta level. In this case, move 4 is already `good
enough'. <I>We do not need even to glance at moves 5 to 9. We can stop the
analysis immediately, and return the value of move 4.</I> The bottom [mixed]
line shows an even more favourable situation; here the very first move is `good
enough', and we can return its value without glancing at moves 2 to 9. This can
be a <I>huge</I> saving of time. We've saved 8/9 of the analysis already; but it
can be better than this in the context of dynamic analysis, for we can make
similar savings at every level of the tree, equivalent [in this example] to
having a computer which is potentially 9×9×9×... times faster, and the deeper we
look, the bigger the advantage. Note that the advantage is maximised by having
the beta level as low as possible, and by having `good enough' moves as early in
the list of moves as possible. <IMG align=right border=2
src="G13GAM -- Game Theory -- alpha-beta pruning.files/fig3.gif">
<P>Note however that once pruning has taken place, we are no longer returning
the true value of a position. The third picture shows this in the context of the
middle [dashed] beta level. The true value of this position is the value of the
best move, move 8. But we are returning the value of move 4, the first `good
enough' move; we don't know whether or not later moves will be even better, so
all we can say is that the true value of this position is <I>at least</I> the
value of move 4. This is shown by the arrow. The [potentially huge] savings made
by pruning the moves in the blue box are [slightly] offset by the loss of
information and the return of a lower bound rather than a true value. Very
often, the lower bound is all we need, and the pruning is pure gain. <IMG
align=left border=2
src="G13GAM -- Game Theory -- alpha-beta pruning.files/fig4.gif">
<P>This leaves two questions. Where do the alpha and beta levels come from? And
what is the alpha level for? They come from whoever asked us to analyse this
position; we in turn will pass them on to any subsidiary [`dynamic'] analysis.
Because positions are evaluated from the point of view of the player to move,
they will be `inverted' when they are passed on -- the caller's `ceiling' is the
callee's `floor', and vice versa. So the alpha level, the `floor', will become
the beta level, the ceiling, for the subsidiary analysis; the higher we can make
the floor, the lower will be the resulting ceiling, and so the more efficient
will be that analysis. The next picture, left, shows this in action. We have
inherited alpha and beta levels from our `parent'. After evaluating move number
1, we are no longer interested in moves that are worse than this, so we can
raise the floor to the corresponding value. Moves 2 and 3 now fall below this
floor; these values come from the `children' discovering moves above their beta
level, so all we can say is that these values are upper bounds on the true
values -- but even the upper bounds are below our floor, so we don't need to
know the true values, just that moves 2 and 3 are worse than move 1. This is
indicated by the arrows again; note that the tops of the arrows are higher than
the corresponding dots were in the previous pictures.
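<P>To make the bookkeeping concrete, here is a tiny numeric sketch of the
floor-raising and inversion just described; every number in it is invented
purely for illustration.
<PRE>
# Tiny numeric illustration of `raising the floor' (all numbers invented).
alpha, beta = -100, +100           # window inherited from our parent
value_move_1 = 20                  # exact value of move 1, from a full child search
alpha = max(alpha, value_move_1)   # the floor rises to 20

# Moves 2 and 3 are now searched with the tighter window, inverted for
# the child: the child's (alpha, beta) is (-beta, -alpha).
print((-beta, -alpha))             # (-100, -20)

# Suppose the child of move 2 finds a reply worth -15 to itself; that is
# above its ceiling of -20, so it cuts off and returns -15 as a lower
# bound.  Negated, move 2 is worth at most +15 to us: an upper bound
# already below our floor of 20, so its exact value is never needed.
child_result = -15
upper_bound_for_us = -child_result
print(upper_bound_for_us < alpha)  # True: move 2 is refuted
</PRE>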
<P>In summary, when we evaluate a move, one of three things will happen:
<I>(a)</I> The value lies between alpha and beta. In this case, this move is
potentially interesting -- it may be the best move in the position, and its
value may be the value of the position as a whole. We can raise the value of
alpha to match, for we are no longer interested in exact values of worse moves.
<I>(b)</I> The value is at least beta. This move is `good enough', and causes a
`beta cutoff'. We return its value as a lower bound for the true value without
bothering to evaluate the remaining moves. This value will, after inversion, be
below the floor for the `parent'. The move causing the cutoff is sufficient to
`refute' the move leading to this position. <I>(c)</I> The value is at most
alpha. This move has been refuted, possibly by a beta cutoff in the `child'.
This value is an upper bound, not the true value, which may be much worse, but
this will not affect the value of the current position.
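<P>Putting the three cases together, here is a minimal sketch in Python of the
search loop described above, written in the common `negamax' style in which
every position is scored for the player to move. The helpers evaluate(),
legal_moves() and play() are assumed rather than taken from this page.
<PRE>
INFINITY = 10**9

def alphabeta(position, depth, alpha, beta):
    """Return the value of the position for the player to move, or only a
    bound on it if the true value lies outside (alpha, beta)."""
    if depth == 0:
        return evaluate(position)          # static evaluation
    best = -INFINITY
    for move in legal_moves(position):
        # the caller's floor and ceiling become the child's ceiling and floor
        value = -alphabeta(play(position, move), depth - 1, -beta, -alpha)
        if value > best:
            best = value
        if value >= beta:                  # case (b): `good enough', beta cutoff;
            return value                   #   the remaining moves are never seen
        if value > alpha:                  # case (a): potentially the best move,
            alpha = value                  #   so raise the floor for later moves
        # case (c): value is at most alpha -- only an upper bound (the child
        #   may have cut off), and it cannot affect this position's value
    return best
</PRE>
<P>A top-level call would be alphabeta(start_position, depth, -INFINITY,
+INFINITY), which is the `correct' usage discussed under Refinements below.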
<H3>Value of alpha-beta </H3>Some [rather ignorant!] programmers treat
alpha-beta pruning as an `optional extra', a modest bonus that makes searching
slightly more efficient at the expense of sophisticated programming. Not so! Its
use typically <I>doubles</I> the depth to which the program can search. Since
the tree is growing exponentially with increasing depth, this is equivalent to
an exponential growth in the speed with which the program can search.
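<P>A back-of-envelope calculation shows why doubled depth is such a large gain:
with branching factor b, a full search to depth d examines roughly b^d
positions, while a perfectly ordered alpha-beta search examines only on the
order of b^(d/2), so for the same work it reaches about twice the depth. The
sketch below merely evaluates these expressions for an invented, chess-like
branching factor.
<PRE>
# Back-of-envelope check of the `doubled depth' claim; the branching
# factor and search budget below are invented for illustration.
import math

b = 30                                   # chess-like branching factor
budget = b ** 5                          # leaf evaluations we can afford
depth_plain = math.log(budget, b)        # full search: about 5 plies
depth_pruned = 2 * math.log(budget, b)   # perfectly ordered alpha-beta: about 10
print(round(depth_plain), round(depth_pruned))
</PRE>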
<P>However, the pruning is seen to full advantage only if <I>(a)</I> the beta
levels are low enough to be interesting, and <I>(b)</I> the first move tried is
often `good enough'. So, much effort has gone into ensuring that these two
conditions hold.
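<P>One simple way of working towards condition <I>(b)</I> is to sort the moves
by a cheap heuristic before the main loop, so that a `good enough' move tends to
come first. A sketch follows; the heuristic (captures first, biggest victim
first) and the helper capture_value() are invented for illustration, and real
programs use several further devices of this kind.
<PRE>
# Invented move-ordering sketch: try the most promising moves first, so
# that a beta cutoff, if there is one, happens as early as possible.
def ordered_moves(position):
    moves = legal_moves(position)     # assumed helper, as before
    # capture_value() is a hypothetical cheap scorer: the value of the
    # piece captured by the move, or 0 for a quiet move.
    return sorted(moves, key=lambda m: capture_value(position, m), reverse=True)
</PRE>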
<P>An insight into the help that alpha-beta pruning gives can be had by
imagining that Kasparov is sitting next to you suggesting good moves to try.
When you are doing well, and when his suggestions are good, you do not need to
consider anything beyond his ideas, and your analysis can be conducted much more
efficiently -- a beta cutoff! If his suggestions are bad, then you will have to
consider other ideas as well [this is point <I>(b)</I> above!], but even rotten
suggestions do not cost you anything compared with no suggestions at all. If your
<I>position</I> is bad, then his suggestions do not [indeed, cannot] help; all
the evaluations will be too low to cause a cutoff; but even in this case, his
suggestions will raise the floor as far as it can be raised [if he gives you the
best ideas first], so that deeper analysis will see a lower ceiling and easier
cutoffs [this is point <I>(a)</I> above]. In most lines, with best play [as
suggested by Kasparov!], good positions and bad positions will alternate [my
piece up is your piece down!]; alpha-beta pruning causes very quick cutoffs in
the good positions, which is why you can search twice as deeply. Of course, in
real life your computer does not have access to Kasparov; it has to do the best
it can with its own analysis.
<H3>Refinements </H3>The `correct' way to use alpha-beta pruning is to start
with alpha set to minus infinity, beta to plus infinity, and to rely on the
natural `raising of the floor', seen above, to make the bounds more useful.
However, this creates a tension with the observation above that the pruning
works best if the bounds are as tight as possible. So other strategies are
possible.
<P>Essentially, in all of these, we lie to our children about our proposed
alpha-beta bounds. Our children must, however, tell us the truth, as outlined
above: if the value is between alpha and beta, they must return the correct
value, otherwise they return an upper or lower bound. Our lies mean that we
place the bounds closer than they should have been, and therefore that the
search is more efficient. The penalty is that sometimes we are caught out. Even
so, we will quite often get away with it; if the returned value is outside the
true bounds, then we only need the upper or lower bound anyway. If however it is
inside the true bounds but outside the fraudulent bounds, then we have to ask
our children to look again, and to refine their evaluations. So, the search is
made sometimes more efficient and sometimes less. Where the balance lies is a
matter of pragmatism and experiment.
<P>Once this viewpoint is accepted, then describing this as `lying' carries
unwanted moral overtones. We should instead re-define the purpose of the
analysis as being to provide a value which is either `guaranteed' or else merely
a bound, as described above. Then we can ask our children questions with a clear
conscience!
<P>Effectively, we are asking of the analysis questions like `In this position,
is White between 1 and 2 pawns up?', and getting answers like `Yes, White is
1.376 pawns up' or `No, White is at least 2.3 pawns up', or `No, White is no
more than 0.52 pawns up'. [All of these questions and answers are relative to the
requested depth of analysis, of course.] These questions can be answered much
more efficiently than the blunt `How many pawns up is White?', and the answers
are usually, but not always, sufficient. When they are not sufficient, we have
to ask again, `OK, is White between 2.3 and 3 pawns up?', and balance these extra
questions against the extra efficiency.
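<P>A minimal sketch of this `ask, and ask again if necessary' pattern, sometimes
called an aspiration window: guess the value, ask a narrow question around the
guess, and fall back to the blunt wide-open question only if the answer turns
out to be merely a bound. The window width and re-search policy below are
invented; alphabeta() is the routine sketched earlier, with values in hundredths
of a pawn.
<PRE>
# Invented aspiration-window sketch; values in hundredths of a pawn.
def search_with_window(position, depth, guess, margin=50):
    alpha, beta = guess - margin, guess + margin   # the narrow question
    value = alphabeta(position, depth, alpha, beta)
    if value <= alpha or value >= beta:            # caught out: only a bound
        # ask again, this time with the honest, wide-open window
        value = alphabeta(position, depth, -INFINITY, +INFINITY)
    return value                                   # now a `guaranteed' value
</PRE>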
<P><A name=tail></A><A
href="http://www.maths.nott.ac.uk/personal/anw/G13GT1/alphabet.html#top">top of
page </A>
<HR>
<A href="mailto:anw@maths.nottingham.ac.uk"><IMG alt=border=0
src="G13GAM -- Game Theory -- alpha-beta pruning.files/mailme.gif"> E-mail
me</A>. <A href="http://www.maths.nottingham.ac.uk/personal/anw"><IMG
alt=border=0
src="G13GAM -- Game Theory -- alpha-beta pruning.files/home-btn-s.gif"> My home
page</A>. <A href="http://www.nottingham.ac.uk/"><IMG alt=border=0
src="G13GAM -- Game Theory -- alpha-beta pruning.files/logo.gif"> The University
of Nottingham</A>.
<HR>
Copyright © Dr A. N. Walker, 1997. <BR>This Web page may be freely used for
purposes related to the above module or for private study; requests for
permission to make other uses should be referred to the author. </BODY></HTML>