ics 180, january 14, 1999.htm

来自「这是博弈论算法全集第六部分:局面描述,其它算法将陆续推出.以便与大家共享」· HTM 代码 · 共 229 行
HTM
229 行
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0049)http://www.ics.uci.edu/~eppstein/180a/990114.html -->
<HTML><HEAD><TITLE>ICS 180, January 14, 1999</TITLE>
<META content="text/html; charset=gb2312" http-equiv=Content-Type>
<META name=Owner value="eppstein">
<META name=Reply-To value="eppstein@ics.uci.edu">
<META content="MSHTML 5.00.2614.3500" name=GENERATOR></HEAD>
<BODY><IMG alt="" height=72 src="ICS 180, January 14, 1999.files/icslogo2.gif" 
width=472>
<P><A href="http://www.ics.uci.edu/~eppstein/180a/index.html">
<H1>ICS 180A, Spring 1997:<BR>Strategy and board game programming</H1></A>
<H2>Lecture notes for April 10, 1997<BR>Evaluation Functions</H2>
<H3>General considerations</H3>
<P>The evaluation function is where most of the game-specific knowledge goes 
into your program. We start off with two basic assumptions: 
<P>
<OL>
  <LI>We can represent the quality of a position as a number. For instance, this 
  number might be our estimate of the probability that we can win the game; but 
  most programs don't try to make the number mean anything so specific, it's 
  just a number. 
  <P></P>
  <LI>The quality we measure is or should be the same as the quality our 
  opponent measures (so if we think we're in a good position, our opponent 
  thinks he's in a bad position and vice versa). This is unlikely to be really 
  true, but it's needed to make our search algorithms work well, and in practice 
  it comes pretty close to the truth. </LI></OL>
<P>The evaluation can be more or less complicated, depending on how much 
knowledge you build in to it. The more complicated it is, and the more knowledge 
it encodes, the slower it is likely to be. Typically, the performance of a 
program (how well it plays) has been estimated as behaving like the product of 
the knowledge and speed: 
<P>
<CENTER><IMG src="ICS 180, January 14, 1999.files/perfcurves.gif"></CENTER>
<P>So, if you have a fast dumb program you can often make it better by adding 
more knowledge and slowing it down a little. But that same additional knowledge 
and slowdown might actually make a smart slow program worse; there is a 
diminishing rate of return of performance to knowledge. Similarly once you speed 
your program up past a certain point, there is a diminishing improvement for 
adding more speed, and you would be better off balancing speed and knowledge 
somewhere closer to the middle of the chart. This balance point varies somewhat 
depending on what kind of opponent you expect to face; speed works better for 
defeating other computers, while human opponents are very good at exploiting 
holes in your knowledge and are more easily defeated by knowledge-based 
programs. 
<H3>Implementation methods</H3>
<P>There are two major types of evaluation function method. The first, 
"end-point evaluation", is simply to evaluate each position independently of 
each other position, using your favorite evaluation algorithm. This can give 
good results, but is slow, so some programmers have resorted to the following 
trick, known as pre-computation, first order evaluation, or piece-square tables. 

<P>Before we begin a search for the best move from a position, we examine 
carefully the position itself, and compute values to store in an array 
T[square,piece type]. The evaluation of any position found in the search will 
then be simply the sum of the array values for the pieces in the position. We 
don't have to compute the sum from scratch at each step; instead when moving a 
piece from one square to another update the score using the formula 
<CENTER><PRE>score += T[new square,piece] - T[old square,piece]
</PRE></CENTER>
<P>Examples of piece-square table values in chess: when a king is castled into 
the corner of the board, the pawns in front of it are very useful in defending 
against attacks. Their ability to defend becomes less as they move forward. So, 
if the king is in the corner in the starting position of the search, we might 
build piece-square tables for the pawns having the values <PRE>    ... 1   1   1   1
    ... 1   1.1 1.1 1.1
    ... 1   1.2 1.2 1.2
</PRE>on the three rows in front of the king, to encourage the pawns to stay 
close to the king by giving them a greater value than their usual one point when 
they are nearby. 
<P>Unfortunately while piece-square tables are blindingly fast, and you can 
incorporate some interesting kinds of knowledge this way, piece-square tables 
are pretty stupid in general. They can't take into account interactions between 
several moving pieces; those interactions have to be approximated by looking at 
where the pieces were when the piece-square table was computed. So, for 
instance, if we search through a long sequence of moves, in which the king goes 
to a different part of the board, the piece-square table values above would be 
inaccurate because they would be making the pawns defend the place the king used 
to be, rather than defending the king itself. 
<P>Programs that use piece-square tables often combine them with some amount of 
end-point evaluation. Another strategy for making piece-square table methods 
more accurate is to delay building the tables until later in the search; e.g. if 
you are searching 9-move sequences, build the tables after sequences of 5 moves 
and use them for the remaining 4-move search. If you do that, though, you should 
be careful to make the tables resulting from one 5-move sequence be consistent 
with those from other sequences, so that the overall evaluation scores can be 
compared meaningfully. In class Dave O. suggested another possible improvement: 
make incremental modifications to the piece-square tables, e.g. move the bonuses 
for pawns in front of kings when the kings move; this seems like a good idea but 
I don't know whether it's been implemented or if so how well it worked. 
<H3>How to combine evaluation terms</H3>Typically, like the first-order 
evaluations above, an evaluation function is a sum of several terms, where each 
term is the result of a function that looks for certain specific information in 
a position. Why sums? It's a relatively simple way of combining information that 
works ok in practice. 
<P>My own feeling is that game programmers really should try more carefully to 
model their evaluation functions on probabilities: combine terms to determine 
probabilities of winning soon (by carrying out some kind of attack), in a 
moderate number of moves, or in an endgame (say by taking advantage of a passed 
pawn in chess), and combine the probabilities appropriately. If the probability 
of winning soon for black is bs and for white is ws, if the probability of 
winning in a moderate number of moves (assuming no sooner win) is bm or wm, and 
if the probability of winning in an endgame is be or we, then the overall 
probability of winning is 
<CENTER><PRE>bs + (1 - bs - ws) bm + (1 - bs - ws - bm - wm) be
</PRE></CENTER>or 
<CENTER><PRE>ws + (1 - bs - ws) wm + (1 - bs - ws - bm - wm) we.
</PRE></CENTER>I think it might be a useful idea for an evaluation function to 
compute terms estimating these individual probabilities, and combine them with 
formulas like the ones above. How well each probability is estimated could be 
tested by comparing the program's estimates against the actual results in 
databases of games, and this would give a program the ability to do some 
rudimentary planning (judging whether to go for a certain attack based on how 
likely it is to work). But this is purely speculation, it hasn't been tested in 
a real program, and you won't go far wrong just using sums. 
<H3>What kinds of information go into evaluation functions?</H3>Evaluation 
functions typically combine terms encoding knowledge of different types: 
<UL>
  <P>
  <LI><B>Material</B>. The sum of point values in chess, the number of pieces of 
  each player on the board in e.g. go or othello. This is often useful, but 
  othello provides an interesting counterexample: the game is based at the end 
  on the material count, but for middle-game positions it is a pretty bad idea 
  to base the evaluation on material, since often the player with the better 
  position will actually have fewer pieces. For some other games such as go-moku 
  material is irrelevant since it a function only of what move it is and not of 
  how good the board position is. 
  <P></P>
  <LI><B>Space</B>. For some games, one can partition the board into regions 
  controlled by one player, regions controlled by the other player, and regions 
  still in dispute. For instance, this is the main idea of go. But it also comes 
  up in games including chess, in which one player's region consists of the 
  squares attacked or protected by his own pieces and not attacked or protected 
  by the opponent's. In Othello, if one player has a connected group of pieces 
  surrounding a corner, these pieces can never be taken and form part of that 
  player's territory. The space evaluation is then simply the sizes of these 
  regions, or less simply the total importance of these regions if there's some 
  way of saying that one square is more important than another. 
  <P></P>
  <LI><B>Mobility</B>. How many different moves does each player have available? 
  The idea is that if you have more choices of move, it's that much more likely 
  that at least one of them will lead to a good position. This works very well 
  in othello. It's not so useful in chess (it's been used, but some chess 
  programmers have taken it out of their programs because it doesn't seem to 
  help the quality of the overall evaluation). 
  <P></P>
  <LI><B>Tempo</B>. This is closely related to mobility, and comes up in games 
  like Othello and Connect-4 (and in certain chess endgames) where it can often 
  be a disadvantage to be forced to move. But unlike mobility terms, often the 
  parity of the number of available moves (whether it is odd or even) matters 
  more than the total number. 
  <P>For instance, consider the connect-4 position below: 
  <P>
  <CENTER><IMG height=171 src="ICS 180, January 14, 1999.files/990114.gif" 
  width=200></CENTER>Columns 1, 3, 4, and 7 are filled up, so any move must be 
  to columns 2, 5, or 6. Moves to column 6 are neutral -- either player can make 
  them without winning or losing. Black controls column 2 -- red can not play 
  safely there, because that would let black win by getting four in a row. 
  Neither player can move safely in column 4 because the other player would then 
  immediately win. If red plays next, then after three moves to column 6, black 
  will be forced move in column 2, giving up his control of that column, and 
  three moves later black will have to move again in column 5 and red will win. 
  But if black is the one to play next, then three moves later red will be 
  forced to make a losing move. 
  <P>In connect-4 endgames such as this, the columns with an even number of 
  spaces left are very unimportant. The important quantity to measure is the 
  number of odd columns in which only one player can move. If one player 
  controls more odd columns, he or she is likely to win. If the number of odd 
  columns controlled by each player is equal, as in the board shown (red 
  controls none of the columns, and black only controls an even column) then the 
  important quantity is the number of odd neutral columns -- if this is odd then 
  the next player to play will win, while if it's even the next player will 
  lose. Of course, this simple analysis needs to be made more sophisticated to 
  handle positions in which one player controls positions higher up in the 
  columns. 
  <P>In some other games such as Go, such strict parity rules are not important, 
  but it still matters who has the <I>initiative</I> -- the player who can 
  choose where to play, while the other player is forced to respond in the same 
  area of the board. Often it is a good idea to make a sequence of moves that 
  each take a small amount of space but force your opponent to respond, before 
  making a larger move that takes more space but allows your opponent to take 
  the initiative. 
  <P></P>
  <LI><B>Threats</B>. Is the opponent about to do something bad? Are you about 
  to do something good? E.g. in chess or go, are some pieces likely to be 
  captured? In go-moku or connect-4, does either player have some number of 
  pieces lined up? In chess or checkers is some pawn about to be queened or 
  kinged? In othello, is one player about to take a corner? This sort of term 
  should vary according to the immediacy and strength of the threat. 
  <P></P>
  <LI><B>Shape</B>. In go, connected groups of pieces are safe from capture if 
  they surround two separate regions of territory (called "eyes"). In chess, 
  side-by-side pawns are generally much stronger than pawns stacked on the same 
  column. Shape-based terms are especially important because they measure 
  long-term qualities of the position that do not change much in a few moves and 
  that won't be covered by the search. (Searching finds short-range tactics in 
  an attempt to improve the overall evaluation, so the evaluation itself needs 
  to include any more long-term features that the search is trying to make 
  happen.) 
  <P></P>
  <LI><B>Motifs</B>. Some particular patterns of pieces are common enough that 
  it's worth including special case terms to cover them. In chess, for example, 
  a bishop can often capture a pawn on the outside column, only to be trapped by 
  moving another pawn forward. Once the bishop is trapped, it may still take 
  many moves for the opponent to maneuver a piece into position to capture it, 
  so the fact that it's trapped may not be obvious in the computer's search 
  routine. Some programs have included special evaluation terms to warn the 
  computer that taking that pawn might be a mistake. In othello, it's sometimes 
  useful to "sacrifice" one corner by placing a stone next to the corner, so 
  that when the opponent takes the corner itself one can put another stone next 
  to the corner in a way that won't be taken, and that leads to the win of a 
  different corner: 
  <P>
  <CENTER><PRE>    . . O @ @ @ @ .
    . . @ @ @ @ O .
    . @ @ @ @ @ . .
</PRE>White has sacrificed the bottom-left corner.<BR>Once black plays 
  bottom-left, white will play next to it and win the bottom-right corner. 
  </CENTER>
  <P>It might be worthwhile to have special evaluation code to examine these 
  sacrifices and determine to what extent they're worthwhile to make, or to 
  include in the measure of quality of a position the vulnerability of 
  edge-pieces to such a sacrifice. </P></LI></UL>
<HR>
<A href="http://www.ics.uci.edu/~eppstein/">David Eppstein, <A 
href="http://www.ics.uci.edu/">Dept. Information &amp; Computer Science</A>, <A 
href="http://www.uci.edu/">UC Irvine</A>, Monday, 22-Feb-1999 18:11:25 PST. 
</BODY></HTML>
ics 180, january 14, 1999.htm - 源码说明

本页面展示了「这是博弈论算法全集第六部分:局面描述,其它算法将陆续推出.以便与大家共享」中的 ics 180, january 14, 1999.htm 源码文件，采用 HTM 编程语言编写，共 229 行代码。您可以在线阅读完整代码内容，也可以返回资源详情页下载完整源码包进行本地学习和开发。
虫虫下载站收录了大量与算法相关的技术资源，包括源代码、技术文档、电路图等，是电子工程师和嵌入式开发者的专业学习平台。
⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?