<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>ICS 180, April 29, 1997</TITLE>
<META content="text/html; charset=gb2312" http-equiv=Content-Type>
<META name=Owner value="eppstein">
<META name=Reply-To value="eppstein@ics.uci.edu">
<META content="MSHTML 5.00.2614.3500" name=GENERATOR></HEAD>
<BODY><IMG alt="" height=72 src="ICS 180, April 29, 1997.files/icslogo2.gif" 
width=472>
<P><A href="http://www.ics.uci.edu/~eppstein/180a/index.html">
<H1>ICS 180A, Spring 1997:<BR>Strategy and board game programming</H1></A>
<H2>Lecture notes for April 29, 1997<BR>Which nodes to search? Full-width vs. 
selective search</H2>Alpha-beta tells us how to search, but we still need to 
know when to expand a node (search its children) and when to just stop and call 
the evaluation function. 
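<P>To make that decision concrete, here is a minimal fixed-depth alpha-beta 
skeleton in Python. This is my own sketch, not code from the course, and 
evaluate(), generate_moves(), and make_move() are hypothetical stand-ins for a 
real game interface. Its only rule for the expand-vs-evaluate decision is the 
depth counter described in the next section. 
<PRE>
# Minimal fixed-depth, full-width alpha-beta search (negamax form).
# evaluate(), generate_moves(), and make_move() are placeholders
# assumed to be supplied by the game implementation.

INFINITY = 10**9

def alphabeta(position, depth, alpha, beta):
    if depth == 0:
        # Horizon reached: stop expanding and call the evaluation.
        return evaluate(position)
    best = -INFINITY   # (positions with no legal moves are ignored here)
    for move in generate_moves(position):
        child = make_move(position, move)
        score = -alphabeta(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break      # safe alpha-beta cutoff
    return best
</PRE>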
<H3>Brute Force and Selectivity</H3>
<P>Shannon's original paper on computer chess listed two possible strategies. 
<OL>
  <LI>The most obvious is what the pseudo-code I've shown you so far does: a 
  full-width, brute force search to a fixed depth. Just pass in a "depth" 
  parameter to your program, decrement it by one for each level of search, and 
  stop when it hits zero. This has the advantage of seeing even weird-looking 
  lines of play, as long as they remain within the search horizon. But the high 
  branching factor means that it doesn't search any line very deeply (bachelor's 
  degree: knows nothing about everything). 
  <P>Even worse, it falls prey to what's known as the <I>horizon effect</I>. 
  Suppose, in chess, we have a program searching seven levels deep, and that it 
  has trapped its knight in the corner of the board (say in exchange for a rook) 
  in a way that, if it played well, would end up with the knight captured within 
  the next seven moves. But, by sacrificing a pawn elsewhere on the board, maybe 
  the program could delay that capture by another move. That delay would then 
  push the capture past the program's search horizon, so what it sees along that 
  line of play would just be the loss of a pawn instead of the loss of a knight. 
  The knight would still end up being captured, but far enough from the current 
  position that the program doesn't see it. So, it sacrifices its pawn, thinking 
  that will save the knight. The very next move, the same situation occurs, so 
  it sacrifices another pawn, and another, until very soon it's lost a lot more 
  material than the knight was ever worth, and will still end up losing the 
  knight anyway. This sort of tailspin, in which a sequence of moderately bad 
  moves is used to delay some worse consequence, is known as the "horizon 
  effect" and it used to be an easy way to win games against computers (until 
  better algorithms let them avoid this problem). 
  <P></P>
  <LI>The other method suggested by Shannon was selective pruning: again search 
  to some fixed depth, but to keep the branching factor down only search some of 
  the children of each node (avoiding the "obviously bad" moves). So, it can 
  search much more deeply, but there are lines it doesn't see at all (Ph.D.: 
  knows everything about nothing). Shannon thought this was a good idea because 
  it's closer to how humans think. Turing used a variant of this idea, only 
  searching capturing moves. More typically, one might evaluate the children and 
  only expand the <I>k</I> best of them, where <I>k</I> is some parameter less 
  than the true branching factor; a sketch of this appears below. 
  <P>Unfortunately, "obviously bad" moves are often not bad at all, but are 
  brilliant sacrifices that win the game. If you don't find one you should have 
  made, you'll have to work harder and find some other way to win. Worse, if you 
  don't see that your opponent is about to spring some such move sequence on 
  you, you'll fall into the trap and lose. </P></LI></OL>Nowadays, neither of 
these ideas is used in its pure form. Instead, we use a synthesis of both: 
selective extension. We search all lines to some fixed depth, but then extend 
some lines deeper than that horizon. Sometimes we'll also do some pruning 
(beyond the safe pruning done by alpha-beta), but this is usually extremely 
conservative: it's too hard to pick out only the good moves, though we can 
sometimes pick out and ignore really bad ones. For games other than chess, with 
higher branching factors, it may be necessary to use more aggressive pruning 
techniques. 
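<P>For comparison, here is a minimal sketch of Shannon's selective strategy, 
promised in the list above, using the same hypothetical game interface: rank 
the children by a quick static evaluation and expand only the <I>k</I> best. 
<PRE>
# Sketch of selective pruning: statically evaluate the children and
# expand only the k best of them.  evaluate(), generate_moves(), and
# make_move() are hypothetical placeholders, as before.

INFINITY = 10**9

def selective_search(position, depth, k):
    if depth == 0:
        return evaluate(position)
    children = [make_move(position, m) for m in generate_moves(position)]
    # evaluate() scores a position for its side to move, so the best
    # children for us are those that look worst for the opponent.
    children.sort(key=evaluate)
    best = -INFINITY
    for child in children[:k]:
        best = max(best, -selective_search(child, depth - 1, k))
    return best
</PRE>
The danger described above is visible in the code: any move that the static 
evaluation ranks outside the top <I>k</I>, brilliant sacrifice or not, is never 
searched at all. 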
<H3>When to extend?</H3>What is the point of extending? To get better (more 
accurate) evaluations. So, we should extend 
<OL>
  <LI>when the current evaluation is likely to be inaccurate, or 
  <LI>when the current line of play is a particularly important part of the 
  overall game tree search </LI></OL>(or some combination of both). 
<P>How do we know when the eval is likely inaccurate? 
<OL>
  <LI>In chess or other games in which there are both capturing and 
  non-capturing moves (checkers, go, fanorona), if there are captures to be 
  made, the evaluation will change greatly with each capture. 
  <P><I>Quiescence search</I> is the idea of, after reaching the main search 
  horizon, running a Turing-like search in which we only expand capturing 
  (or sometimes, capturing and checking) moves. For games other than chess, the 
  main idea would be to only include moves which make large changes to the 
  evaluation. Such a search must also include "pass" moves in which we decide to 
  stop capturing. Once both players pass, we stop expanding. That way, the 
  evaluation function is only called on "quiescent" nodes at which it isn't 
  about to change by making a capture (see the sketch after this list). 
  <P></P>
  <LI>If the position has been active in the recent past, maybe we guess that it 
  should still be active. So we extend the search depth if the search passes 
  through an "interesting" move, e.g. a capture. In the alpha-beta pseudocode, this 
  would be accomplished by replacing the depth-1 parameter to the recursive call 
  to the search routine by the value depth-1+extension. You have to be careful 
  not to do this too often, though, or you could end up with a hugely expanded 
  (even possibly infinite!) search tree. 
  <P>One trick helps make sure this extension idea terminates: only extend by a 
  fraction of a level. Specifically, make the "depth" counter record some 
  multiple of the number of levels you really want to search, say 
  depth=levels*24. Then, in recursive calls to alpha-beta search, pass a value 
  of depth-24+extension. If the extension is always strictly less than 24, the 
  method is guaranteed to terminate, and you can choose which situations result 
  in larger or smaller extensions (the sketch after this list uses this scheme). 
  <P></P>
  <LI>I haven't seen this third technique done but maybe it should be. Include a 
  separate evaluation function to determine how complicated a position is. For 
  instance, a position is probably complicated if it has lots of contradictory 
  evaluation terms, so compute the "complication evaluation" by taking the 
  normal evaluation function and replacing each term in it by the absolute value 
  of that term. </LI></OL>
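<P>Ideas 1 and 2 combine naturally. The following is a sketch of my own (not 
code from the notes) of a search that counts depth in 24ths of a level, extends 
"interesting" moves by a fraction of a level, and switches to quiescence search 
at the horizon. Here evaluate(), generate_moves(), make_move(), 
capture_moves(), and extension() are all hypothetical placeholders. 
<PRE>
# Depth is counted in 24ths of a level, so a nominal n-level search
# starts with depth = n * ONE_PLY.  extension() must return a value
# strictly less than ONE_PLY, so the depth counter always shrinks
# and the search is guaranteed to terminate.

ONE_PLY = 24
INFINITY = 10**9

def search(position, depth, alpha, beta):
    if depth <= 0:
        # Horizon reached: fall through to quiescence search
        # rather than calling evaluate() directly.
        return quiesce(position, alpha, beta)
    best = -INFINITY
    for move in generate_moves(position):
        child = make_move(position, move)
        ext = extension(position, move)  # 0..23; e.g. larger for captures
        score = -search(child, depth - ONE_PLY + ext, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best

def quiesce(position, alpha, beta):
    # The static evaluation acts as the "pass" move: the side to move
    # may always decline further captures, so evaluate() is a lower
    # bound on the value of the node.
    best = evaluate(position)
    alpha = max(alpha, best)
    if alpha >= beta:
        return best
    for move in capture_moves(position):  # only the "loud" moves
        child = make_move(position, move)
        score = -quiesce(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best
</PRE>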
<H3>How to combine accuracy with importance?</H3>So far, we've just looked at 
trying to find the points at which the evaluation may be inaccurate. But maybe 
we don't care if it's inaccurate for unimportant parts of the tree, but we 
really do care for nodes on the principal variation. How do we take importance 
into account when performing selective extensions? 
<OL>
  <LI>Don't: let alpha-beta sort out importance, and just extend based on 
  accuracy. 
  <P></P>
  <LI>Extend lines that are part of (or near) the principal variation (e.g. 
  <I>singular extensions</I>, used in Deep Blue and/or its predecessors: if there 
  is one move much better than the others in a position, extend the search on 
  that move; a rough sketch appears below). 
  <P></P>
  <LI>Moving away from alpha-beta... conspiracy number search: what is the 
  minimum number of positions whose values would have to change to force the 
  program to make a different move? Search those positions deeper. </LI></OL>
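<P>As a rough illustration of the second idea above (a toy version of my own; 
the actual singular-extension test used in Deep Blue's line of programs is 
considerably more involved), one can compare shallow scores for the moves in a 
position and extend only when a single move stands out. shallow_score() and 
MARGIN are hypothetical. 
<PRE>
# Toy singular-extension test: if one move's shallow-search score
# beats every sibling by a wide margin, the move is "singular" and
# is searched one level deeper than the rest.

MARGIN = 50        # evaluation units; e.g. roughly half a pawn

def singular_extension(position, moves, depth):
    # Score each move with a reduced-depth search (hypothetical).
    scores = sorted((shallow_score(position, m, depth // 2) for m in moves),
                    reverse=True)
    if len(scores) > 1 and scores[0] - scores[1] > MARGIN:
        return 1   # extend the singular move by one extra level
    return 0
</PRE>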
<P>
<HR>
<A href="http://www.ics.uci.edu/~eppstein/">David Eppstein, <A 
href="http://www.ics.uci.edu/">Dept. Information &amp; Computer Science</A>, <A 
href="http://www.uci.edu/">UC Irvine</A>, Monday, 01-Feb-1999 16:58:05 PST. 
</BODY></HTML>
