⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 328-330.html

📁 遗传算法经典书籍-英文原版 是研究遗传算法的很好的资料
💻 HTML
字号:
<HTML>
<HEAD>
<META name=vsisbn content="0849398010">
<META name=vstitle content="Industrial Applications of Genetic Algorithms">
<META name=vsauthor content="Charles Karr; L. Michael Freeman">
<META name=vsimprint content="CRC Press">
<META name=vspublisher content="CRC Press LLC">
<META name=vspubdate content="12/01/98">
<META name=vscategory content="Web and Software Development: Artificial Intelligence: Other">




<TITLE>Industrial Applications of Genetic Algorithms:Genetic Algorithms for Game Playing</TITLE>

<!-- HEADER -->

<STYLE type="text/css"> 
 <!--
 A:hover  {
 	color : Red;
 }
 -->
</STYLE>

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

<!--ISBN=0849398010//-->
<!--TITLE=Industrial Applications of Genetic Algorithms//-->
<!--AUTHOR=Charles Karr//-->
<!--AUTHOR=L. Michael Freeman//-->
<!--PUBLISHER=CRC Press LLC//-->
<!--IMPRINT=CRC Press//-->
<!--CHAPTER=17//-->
<!--PAGES=328-330//-->
<!--UNASSIGNED1//-->
<!--UNASSIGNED2//-->

<CENTER>
<TABLE BORDER>
<TR>
<TD><A HREF="325-328.html">Previous</A></TD>
<TD><A HREF="../ewtoc.html">Table of Contents</A></TD>
<TD><A HREF="331-337.html">Next</A></TD>
</TR>
</TABLE>
</CENTER>
<P><BR></P>
<P><FONT SIZE="+1"><B><I>Fitness Function</I></B></FONT></P>
<P>Two ways of coding the fitness function are considered here. First, the &#147;X&#148; and the &#147;O&#148; players could be pitted against one another in a series of games. Or, alternatively, the game-playing program could be matched against a fixed algorithm. If the two players were forced to play one another, then two population pools would be generated initially. One pool would contain the &#147;X&#148; player strategy, the other pool would represent the &#147;O&#148; player strategy. In this approach, if 30 strings were generated in each pool, and each pool played five games, there would be 30*30*5 = 4500 games played total in each generation. The fitness function value of each string is derived from the win-loss-tie record of the strings.
</P>
<P>On the other hand, if the strings are pitted against fixed algorithms, each strategy would play against its opponent a fixed number of games. Then, the fitness value would be determined in the same way as above; according to the win-loss-tie record. In this approach, if there are 100 strings in the pool, and each string is forced to play 50 games, there would be total of 5,000 games played during each generation.</P>
<P>The following scoring methods are used:</P>
<TABLE WIDTH="100%" BORDER><TR>
<TD WIDTH="25%">Player
<TD WIDTH="25%">Win
<TD WIDTH="25%">Tie
<TD WIDTH="25%">Loss
<TR>
<TD>Player X
<TD>&#43;4
<TD>&#43;1
<TD>0
<TR>
<TD>Player O
<TD>&#43;5
<TD>&#43;2
<TD>0
</TABLE>
<P>The difference between the &#147;X&#148; player and the &#147;O&#148; player is that for the &#147;X&#148; player, its objective is to win, because it goes first. It receives a smaller score for a tie because it has the advantage of moving first. But, as for the &#147;O&#148; player, because it goes second, its objective is to prevent the &#147;X&#148; player from winning. Thus, the &#147;O&#148; player is considered successful (to some extent) if it ties the game, and an even bigger accomplishment if it wins the game.
</P>
<P>Using the above table, for example, if the &#147;X&#148; player has a record of 11-5-4 after 30 games, then it will have a fitness value of 49. If the &#147;O&#148; player has the same record, then it will have a fitness value of 65.</P>
<P><FONT SIZE="+1"><B><I>Crossover &#38; Mutation</I></B></FONT></P>
<P>After the random generation and evaluation of the first generation, there exist 30 strings with associated score or fitness values. Next, roulette wheel selection is implemented to generate a mating pool of 30 strings. For example, if the initial generation was represented by the strings in the table below, then string number one would have the probability of 34/430 or 8% chance of making it to the next generation. Over the 30 selections, it would have a probability of 91% to have one or more copies of itself in the next generation. Likewise, string two would have a probability of 0.9% over one selection and a probability of 24% over all 30 selection.
</P>
<TABLE WIDTH="100%" BORDER><TR>
<TD WIDTH="30%">String
<TD WIDTH="10%">1
<TD WIDTH="10%">2
<TD WIDTH="10%">3
<TD WIDTH="10%">4
<TD WIDTH="10%">5
<TD WIDTH="10%">&#133;
<TD WIDTH="10%">Total
<TR>
<TD>Score
<TD>34
<TD>4
<TD>23
<TD>12
<TD>2
<TD>&#133;
<TD>430
</TABLE>
<P>The idea is, of course, to give the strings with a high fitness values a greater chance of making it to the next generation. And by using roulette wheel selection, it will add some variations into the next generation.
</P>
<P>For this application, a crossover probability of 1.0 was used. This means that crossover will happen to all strings in the mating pool. A mutation probability of 0.001 was used.</P>
<P><FONT SIZE="+1"><B><I>Selecting GA Parameters</I></B></FONT></P>
<P>Initially, the GA was used to train the &#147;X&#148; player. A ternary alphabet was used consisting of 0, 1, and 2, so each move had two bits which combine to produce a number between 0 and 8. The GA strategy was developed by pitting it against a computer opponent that played with a very simple, perhaps &#147;minimal&#148; strategy (to be described below).
</P>
<P>First, the board is initialized and place in a file so that the GA did not have to compute the boards every time it evaluated the fitness of a string. The board initialization program goes through the entire board configuration and saves the valid configurations while discarding the invalid ones. It also discards the boards which the &#147;X&#148; player can win during the next move and the ones &#147;X&#148; player must block in order to stay in the game. This is done because this domain knowledge was pre-programmed. For each valid configuration, the code applied each of the six variations and compared these variations of the current decision to the stored configuration. If the configuration is a variant of a previous one, it stores the current configuration, the original configuration, and the variation number. All the information is then saved to a data file.</P>
<P>After these preliminaries are completed, the GA program can now be executed. Each individual in the 100 string population is forced to play a computer opponent. This opponent uses following strategy:</P>
<DL>
<DD><B>1.</B>&nbsp;&nbsp;Win the game: go through all the blank squares and, if it can win by placing its piece in one of them, go ahead and make that move.
<DD><B>2.</B>&nbsp;&nbsp;Block opponent: go through all the blanks again and, if the opponent can win by placing its piece in one of them, then return that move.
<DD><B>3.</B>&nbsp;&nbsp;Else, make a random move.
</DL>
<P>This, obviously, is not a very sophisticated strategy. However, this player does know when to block and when to win the game. This is enough to train the GA player with. Here, the goal is to train the GA player so that it can beat this opponent virtually every time. If this is achieved, then the strategy employed by the GA is deemed to be effective for playing the game of &#147;Tic-Tac-Toe&#148;.
</P>
<P>Each string or individual plays 50 games against its simplistic computer opponent. Using the scoring method described above, a maximum of 200 points can be scored.</P><P><BR></P>
<CENTER>
<TABLE BORDER>
<TR>
<TD><A HREF="325-328.html">Previous</A></TD>
<TD><A HREF="../ewtoc.html">Table of Contents</A></TD>
<TD><A HREF="331-337.html">Next</A></TD>
</TR>
</TABLE>
</CENTER>

<hr width="90%" size="1" noshade>
<div align="center">
<font face="Verdana,sans-serif" size="1">Copyright &copy; <a href="/reference/crc00001.html">CRC Press LLC</a></font>
</div>
<!-- all of the reference materials (books) have the footer and subfoot reveresed -->
<!-- reference_subfoot = footer -->
<!-- reference_footer = subfoot -->

</BODY>
</HTML>

<!-- END FOOTER -->

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -