📄 aske plaat mtd(f), a new chess algorithm.htm

📁 这是博弈论算法全集第四部分:剪枝算法,其它算法将陆续推出.以便与大家共享
💻 HTM
📖 第 1 页 / 共 2 页
字号:
12 下一页
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0035)http://www.cs.vu.nl/~aske/mtdf.html -->
<HTML><HEAD><TITLE>Aske Plaat: MTD(f), a new chess algorithm</TITLE>
<META content="text/html; charset=gb2312" http-equiv=Content-Type>
<META 
content="MTD(f) a new minimax algorithm for programs that play chess, checkers, and Othello" 
name=DESCRIPTION>
<META content="MSHTML 5.00.2614.3500" name=GENERATOR></HEAD>
<BODY bgColor=#fafabc text=#000000 vLink=#ca2222>
<CENTER>
<H1>MTD(<I>f</I>)</H1>
<P><!img  align=bottom src="alladdin.gif">
<H2>A Minimax Algorithm faster than NegaScout</H2></CENTER>
<P>MTD(<I>f</I>) is a new minimax search algorithm, simpler and more efficient 
than previous algorithms. In tests with a number of tournament game playing 
programs for chess, checkers and Othello it performed better, on average, than 
<A href="http://www.cs.vu.nl/~aske/mtdf.html#negascout">NegaScout</A>/PVS (the 
AlphaBeta variant used in practically all good chess, checkers, and Othello 
programs). One of the strongest chess programs of the moment, MIT's parallel 
chess program <A href="http://theory.lcs.mit.edu/~cilk/chess.html">Cilkchess</A> 
uses MTD(<I>f</I>) as its search algorithm, replacing NegaScout, which was used 
in StarSocrates, the previous version of the program. </P>
<P>MTD(<I>f</I>) is only ten lines of code? Here it is (my apologies for mixing 
C and Pascal; block structure is indicated by indentation only): </P>
<P><B>function</B> MTDF(<I>root</I> : node_type; <I>f</I> : integer; <I>d</I> : 
integer) : integer;<I>
<UL>g</I> := <I>f</I>;<BR><I>upperbound</I> := +INFINITY; <BR><I>lowerbound 
  </I>:= -INFINITY;<BR><B>repeat</B><BR>
  <UL><B>if</B> <I>g</I> == <I>lowerbound</I> <B>then</B> <I>beta</I> := 
    <I>g</I> + 1 <B>else</B><I> beta</I> := <I>g</I>;<BR><I>g</I> := <A 
    href="http://www.cs.vu.nl/~aske/mtdf.html#abmem">AlphaBetaWithMemory</A>(<I>root</I>, 
    <I>beta</I> - 1, <I>beta, d</I>);<BR><B>if</B> <I>g</I> &lt; <I>beta</I> 
    <B>then</B> <I>upperbound</I> := <I>g</I> <B>else</B> <I>lowerbound</I> := 
    <I>g</I>;<BR></UL><B>until</B> <I>lowerbound</I> &gt;= 
  <I>upperbound</I>;<BR><B>return</B> <I>g</I>;
  <P></P></UL>
<P></P>
<P>The algorithm works by calling <A 
href="http://www.cs.vu.nl/~aske/mtdf.html#abmem">AlphaBetaWithMemory</A> a 
number of times with a search window of zero size. The search works by zooming 
in on the minimax value. Each AlphaBeta call returns a bound on the minimax 
value. The bounds are stored in <I>upperbound</I> and <I>lowerbound</I>, forming 
an interval around the true minimax value for that search depth. Plus and minus 
INFINITY is shorthand for values outside the range of leaf values. When both the 
upper and the lower bound collide, the minimax value is found. </P>
<P>MTD(<I>f</I>) gets its efficiency from doing only zero-window alpha-beta 
searches, and using a "good" bound (variable <I>beta</I>)&nbsp;to do those 
zero-window searches. Conventionally AlphaBeta is called with a wide search 
window, as in AlphaBeta(<I>root</I>, -INFINITY, +INFINITY, <I>depth</I>), making 
sure that the return value lies between the value of alpha and beta. In 
MTD(<I>f</I>) a window of zero size is used, so that on each call AlphaBeta will 
either fail high or fail low, returning a lower bound or an upper bound on the 
minimax value, respectively. Zero window calls cause more cutoffs, but return 
less information - only a bound on the minimax value. To nevertheless find it, 
MTD(<I>f</I>) has to call AlphaBeta a number of times, converging towards it. 
The overhead of re-exploring parts of the search tree in repeated calls to 
AlphaBeta disappears when using a version of AlphaBeta that stores and retrieves 
the nodes it sees in memory.</P>
<P>In order to work, MTD(<I>f</I>) needs a "first guess" as to where the minimax 
value will turn out to be. The better than first guess is, the more efficient 
the algorithm will be, on average, since the better it is, the less passes the 
repeat-until loop will have to do to converge on the minimax value. If you feed 
MTD(<I>f</I>) the minimax value to start with, it will only do two passes, the 
bare minimum: one to find an upper bound of value <I>x</I>, and one to find a 
lower bound of the same value.</P>
<P>Typically, one would call MTD(<I>f</I>) in an iterative deepening framework. 
A natural choice for a first guess is to use the value of the previous 
iteration, like this:</P>
<P><B>function</B> iterative_deepening(<I>root</I>&nbsp;:&nbsp;node_type) : 
integer;<BR><I>
<UL>firstguess</I> := 0;<BR><B>for </B><I>d</I> = 1 <B>to</B> MAX_SEARCH_DEPTH 
  <B>do</B><I> 
  <UL>firstguess</I> := MTDF(<I>root</I>, <I>firstguess</I>, 
    <I>d</I>);<BR><B>if</B> times_up() <B>then </B>break;<BR></UL><B>return</B> 
  <I>firstguess</I>;
  <P></P></UL>
<P></P>
<P>In a real program you're not only interested in the value of the minimax 
tree, but also in the best move that goes with it. In the interest of brevity 
that is not shown in the above pseudo code.</P>
<P>In case your program has a strong oscillation in the values it finds for odd 
and even search depths, you might be better off by feeding MTD(<I>f</I>) its 
return value of two plies ago, not one, as the above code does. MTD(<I>f</I>) 
works best with a stable Principal Variation (doesn't every program?) Although 
the transposition table greatly reduces the cost of doing a re-search, it is 
still a good idea to not re-search excessively. As a rule, in the deeper 
iterations of quiet positions in good programs MTD(<I>f</I>) typically performs 
between 5 and 15 passes before it finds the minimax value.</P>
<HR>

<H2>Some Background</H2>
<P>The name of the algorithm is short for MTD(<I>n, f</I>), which stands for 
something like Memory-enhanced Test Driver with node <I>n</I> and value 
<I>f</I>. MTD is the name of a group of driver-algorithms that search minimax 
trees using zero window AlphaBetaWithMemory calls. <A 
href="http://singapore.cs.ucla.edu/jp_home.html">Judea Pearl</A> has named zero 
window AlphaBeta calls "Test", in his seminal papers on the Scout algorithm (the 
basis for Reinefeld's <A 
href="http://www.cs.vu.nl/~aske/mtdf.html#negascout">NegaScout</A>). Adding 
memory to Test makes it possible to use it in re-searches, creating a group of 
simple yet efficient algorithms.</P>
<P>MTD(<I>f</I>) is simple in that it only does zero window AlphaBeta calls, 
making reasoning about the parts of the tree that get traversed easier than with 
algorithms that use wide window calls, such as NegaScout and the standard 
AlphaBeta. Actually, the difficulty in analyzing ordinary AlphaBeta was 
precisely the reason why Pearl introduced his Test in the first place. The 
AlphaBeta versions shown on this page can be simplified to use a single input 
bound, instead of both alpha and beta, since alpha is always one less than beta. 
</P>
<P>Especially in a parallel setting the simplicity of MTD(<I>f</I>) compared to 
NegaScout is valuable. Designing and debugging a parallel search routine is a 
complex affair. MTD(<I>f</I>) only needs a zero window search, a Test. Instead 
of two bounds, MTD(<I>f</I>) needs one. In NegaScout, when new values for the 
search window become available they have to be communicated asynchronously to 
the child processes; in MTD(<I>f</I>) you simply abort an entire subtree when a 
cutoff happens. Furthermore, the recursive search code does not spawn 
re-searches anymore. All re-searching is done at the root, where things are 
simpler than down in the parallel tree. The large body of research on 
parallelizing AlphaBeta and NegaScout is directly applicable to MTD instances, 
since they use zero-window AlphaBeta calls to do the tree searching. See for 
example <A href="http://arch.cs.yale.edu/~bradley/">Bradley Kuszmaul</A>'s <A 
href="ftp://theory.lcs.mit.edu/pub/people/bradley/startech.ps">Jamboree 
search</A> or <A 
href="http://www.uni-paderborn.de/fachbereich/AG/monien/PERSONAL/obelix.html">Rainer 
Feldmann</A>'s <A 
href="http://www.uni-paderborn.de/fachbereich/AG/monien/PUBLICATIONS/ABSTRACTS/FMM_Studying_Overheads_1994.html">Young 
Brothers Wait Concept</A>. If your AlphaBeta is parallel, then your 
MTD(<I>f</I>) is parallel. </P>
<P>Incidentally, one of the MTD instances is equivalent to SSS*, <A 
href="http://web.cps.msu.edu/~stockman/">George Stockman</A>'s best-first 
minimax algorithm that promised to be more efficient than AlphaBeta. (By 
equivalent I mean that the two algorithms look different, but search the same 
nodes.) This SSS*-MTD made the first practical tests of SSS* in full-fledged 
game playing programs feasible, shedding new and unexpected light, after more 
than 15 years, on the questions posed in Stockman's 1979 article. See our <A 
href="http://www.cs.vu.nl/~aske/Papers/aij-final.ps.gz">1996 article in 
Artificial Intelligence</A> for more on this subject. (And yes, MTD(<I>f</I>) is 
also better than SSS*, in case you wondered.)</P>
<P>Another instance of the MTD framework is equivalent to the K. Coplan's C* 
algorithm. <A href="http://www.neci.nj.nec.com/homepages/jcw/">Jean-Christophe 
Weill</A> has published a number of <A 
href="http://www.neci.nj.nec.com/homepages/jcw/publications.html">papers</A> on 
experiments with a negamax version of C*. In MTD terms the idea of C* is to 
bisect the interval formed by the upper and lower bounds, reducing the number of 
AlphaBetaWithMemory calls. On the down side, bisection yields a value for the 
search window, <I>beta</I>, that turns out to be not as efficient as 
MTD(<I>f</I>)'s choice. But still, Weill's work indicates that it is worthwhile 
to experiment with variants on MTD(<I>f</I>)'s choice of pivot value; leaving 
ample room for more research. </P>
<HR>

<H2><A name=abmem>AlphaBetaWithMemory</A></H2>Note that the MTD(<I>f</I>) code 
calls an AlphaBeta version that stores its nodes in memory as it has determined 
their value, and retrieving their values in a re-search. If AlphaBeta wouldn't 
do that, then each pass of MTD(<I>f</I>) would re-explore most of those nodes. 
In order for MTD(<I>f</I>) to be efficient your AlphaBeta has to store the nodes 
it has searched. An ordinary tranposition table of reasonable size suffices, as 
our experiments showed (see <A 
href="http://www.cs.vu.nl/~aske/mtdf.html#further">further reading</A>). 
<P>To be sure, here's a minimax version of the pseudo code of 
AlphaBetaWithMemory. The transposition table access code is the same as what is 
used in most tournament chess, checkers, and Othello programs. 
</P><B>function</B> AlphaBetaWithMemory(<I>n </I>: node_type; <I>alpha</I> , 
<I>beta</I> , <I>d </I>: integer) : integer;<B>
<UL>if</B> retrieve(<I>n</I>) == OK <B>then</B> /* Transposition table lookup 
  */<BR>
  <UL><B>if</B> <I>n.lowerbound</I> &gt;= <I>beta</I> <B>then return 
    </B><I>n.lowerbound</I>;<BR><B>if</B> <I>n.upperbound</I> &lt;= <I>alpha</I> 
    <B>then return </B><I>n.upperbound</I>;<BR><I>alpha</I> := max(<I>alpha, 
    n.lowerbound</I>);<BR><I>beta</I> := min(<I>beta, 
  n.upperbound</I>);<BR></UL><B>if</B> <I>d</I> == 0 <B>then </B><I>g</I> := 
  evaluate(<I>n</I>); /* leaf node */<BR><B>else if </B><I>n</I> == MAXNODE 
  <B>then</B><BR>
  <UL><I>g</I> := -INFINITY; <I>a</I> := <I>alpha</I>; /* save original alpha 
    value */<BR><I>c </I>:= firstchild(<I>n</I>);<BR><B>while</B> (<I>g</I> &lt; 
    <I>beta</I>) <B>and </B>(<I>c</I> != NOCHILD) <B>do</B><BR>
    <UL><I>g </I>:= max(<I>g, </I>AlphaBetaWithMemory(<I>c, a, beta, d</I> - 
      1));<BR><I>a</I> := max(<I>a, g</I>);<BR><I>c </I>:= 
      nextbrother(<I>c</I>);<BR></UL></UL><B>else </B>/*&nbsp;n is a 
  MINNODE&nbsp;*/<BR>
  <UL><I>g</I> := +INFINITY; <I>b</I> := <I>beta</I>; /* save original beta 
    value */<BR><I>c </I>:= firstchild(<I>n</I>);<BR><B>while</B> (<I>g</I> &gt; 
    <I>alpha</I>) <B>and </B>(<I>c</I> != NOCHILD) <B>do</B><BR>
    <UL><I>g </I>:= min(<I>g, </I>AlphaBetaWithMemory(<I>c, alpha, b, d</I> - 
      1));<BR><I>b</I> := min(<I>b, g</I>);<BR><I>c </I>:= 
      nextbrother(<I>c</I>);<BR></UL></UL>/* Traditional transposition table storing 
  of bounds */ <BR>/* Fail low result implies an upper bound */ <BR><B>if 
  </B><I>g</I> &lt;= <I>alpha</I> <B>then</B><I> n.upperbound</I> := <I>g</I>; 
  store <I>n.upperbound</I>;<BR>/* Found an accurate minimax value - will not 
  occur if called with zero window */ <BR><B>if </B><I>g</I> &gt;&nbsp; 
  <I>alpha</I> <B>and</B> <I>g</I> &lt;&nbsp;<I>beta</I> <B>then</B><BR>
  <UL><I>n.lowerbound</I> := <I>g</I>; <I>n.upperbound</I> := <I>g</I>; store 
    <I>n.lowerbound</I>, <I>n.upperbound</I>;<BR></UL>/* Fail high result implies 
  a lower bound */ <BR><B>if </B><I>g</I> &gt;= <I>beta</I> <B>then</B><I>
12 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -