<HTML>
<HEAD>
<META name=vsisbn content="0849398010">
<META name=vstitle content="Industrial Applications of Genetic Algorithms">
<META name=vsauthor content="Charles Karr; L. Michael Freeman">
<META name=vsimprint content="CRC Press">
<META name=vspublisher content="CRC Press LLC">
<META name=vspubdate content="12/01/98">
<META name=vscategory content="Web and Software Development: Artificial Intelligence: Other">
<TITLE>Industrial Applications of Genetic Algorithms:Data Mining Using Genetic Algorithms</TITLE>
<!-- HEADER -->
<STYLE type="text/css">
<!--
A:hover {
color : Red;
}
-->
</STYLE>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<!--ISBN=0849398010//-->
<!--TITLE=Industrial Applications of Genetic Algorithms//-->
<!--AUTHOR=Charles Karr//-->
<!--AUTHOR=L. Michael Freeman//-->
<!--PUBLISHER=CRC Press LLC//-->
<!--IMPRINT=CRC Press//-->
<!--CHAPTER=9//-->
<!--PAGES=161-163//-->
<!--UNASSIGNED1//-->
<!--UNASSIGNED2//-->
</HEAD>
<BODY>
<CENTER>
<TABLE BORDER>
<TR>
<TD><A HREF="159-161.html">Previous</A></TD>
<TD><A HREF="../ewtoc.html">Table of Contents</A></TD>
<TD><A HREF="163-165.html">Next</A></TD>
</TR>
</TABLE>
</CENTER>
<P><BR></P>
<P><FONT SIZE="+1"><B>REVIEW OF CURRENT DATA MINING SEARCH TECHNIQUES</B></FONT></P>
<P>To appreciate the relevance of the genetic algorithm search approach proposed in this chapter, and to understand how search techniques relate to data mining, this section reviews several issues related to searching data. Specifically, it discusses the importance of search in data mining and the current approaches to search in data mining.
</P>
<P><FONT SIZE="+1"><B><I>Importance of Search in Data Mining</I></B></FONT></P>
<P>Data mining is a field still in its formative stages, so the meaning of the term “data mining” is open to interpretation. Some consider data mining to be one component of a larger “knowledge discovery” process, while others consider data mining and knowledge discovery to be one and the same. Because the field is in its infancy, its techniques are also still being developed and investigated. However one chooses to define data mining, efficient and effective search mechanisms are an essential component of discovering the relationships that may exist within large data sets.
</P>
<P>As previously mentioned, the explosive growth in our ability to collect electronic information has far outpaced our ability to interpret and make use of that information. This, in turn, has created a need for a new generation of tools and techniques for database analysis. One such technique, data mining, employs methodologies to sift through huge databases in search of frequently occurring patterns to detect trends and produce generalizations about the data content.</P>
<P>As one might suspect, finding patterns, detecting trends, or producing generalizations about data content requires extensive searching, not only of the actual data but of various aspects of the data. In data mining, search is the process of seeking a solution by examining alternatives. In the problem investigated in this chapter, many alternative item combinations must be examined to find one that meets a given constraint; in our example, the goal is to determine the four most frequently occurring items that appear within the transactions of a database. Because the number of combinations and alternatives that must be investigated in data mining problem domains is typically huge, effective and efficient search mechanisms are extremely important. Since genetic algorithms are well suited to searching large spaces of alternatives, their application in data mining seems “natural.”</P>
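<P>To make the size of this search concrete, the following Python sketch examines every four-item combination exhaustively. It is an illustration only, not the method proposed in this chapter: the toy transactions, the function names <I>support</I> and <I>best_combination</I>, and the reading of the goal as "the four items that most often appear together" are all assumptions made for the example.</P>

```python
from itertools import combinations
from math import comb

# Hypothetical toy transactions; each set holds the items in one transaction.
TRANSACTIONS = [
    {"bread", "milk", "eggs", "butter", "jam"},
    {"bread", "milk", "eggs", "butter"},
    {"bread", "milk", "eggs", "butter", "tea"},
    {"milk", "tea", "jam"},
    {"bread", "eggs", "tea"},
]


def support(itemset, transactions):
    """Count the transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset.issubset(t))


def best_combination(transactions, k=4):
    """Examine every k-item combination and return the most frequent one."""
    items = sorted(set().union(*transactions))
    best, best_count = None, -1
    for candidate in combinations(items, k):
        count = support(set(candidate), transactions)
        if count > best_count:
            best, best_count = candidate, count
    return best, best_count


best, count = best_combination(TRANSACTIONS)
print(best, count)

# Number of four-item alternatives if the database held 1,000 distinct items.
print(comb(1000, 4))
```

<P>With only six distinct items there are 15 combinations to check, and the sketch reports bread, butter, eggs, and milk, which occur together in 3 transactions. With 1,000 distinct items the number of four-item alternatives is already 41,417,124,750, which is why exhaustive examination quickly becomes impractical.</P>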
<P><FONT SIZE="+1"><B><I>Current Approaches to Search in Data Mining</I></B></FONT></P>
<P>When considering search mechanisms in a data mining environment, different types of search must be distinguished. For the purposes of this discussion, the distinction is between a “low level” search of the database (which is not the focus of this chapter) and a “high level” search of alternatives.
</P>
<P>In a data mining environment, different levels of search are involved in extracting information from a database. At the lowest level, when specific files or records need to be accessed, access structures called indexes are used to speed up the retrieval of records in response to certain search conditions. The idea behind an index access structure is similar to that behind the index of a textbook. The index contains a key term along with a page number, or list of page numbers, where the key term can be found. We can search the index to find an address (a page number in a textbook) or list of addresses and then use the address(es) to locate the term in the database (the textbook). The (unacceptable) alternative is to search the whole database (the textbook) to find the term we are interested in [4].</P>
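<P>The textbook analogy can be sketched in a few lines of Python. This is a minimal illustration that assumes an in-memory list of hypothetical records; real database indexes are typically disk-based structures, and the names <I>full_scan</I>, <I>build_index</I>, and <I>indexed_lookup</I> are invented for the example.</P>

```python
from collections import defaultdict

# Hypothetical records: (record address, key term stored in that record).
RECORDS = [(0, "milk"), (1, "bread"), (2, "milk"), (3, "eggs"), (4, "bread")]


def full_scan(records, term):
    """Examine every record: the 'read the whole textbook' alternative."""
    return [addr for addr, key in records if key == term]


def build_index(records):
    """Map each key term to the list of addresses where it occurs."""
    index = defaultdict(list)
    for addr, key in records:
        index[key].append(addr)
    return dict(index)


def indexed_lookup(index, term):
    """Find the term in the index and return its addresses (empty if absent)."""
    return index.get(term, [])


INDEX = build_index(RECORDS)
print(indexed_lookup(INDEX, "milk"))
print(full_scan(RECORDS, "milk"))
```

<P>Both calls return the same addresses, but the indexed lookup consults a single index entry, whereas the full scan touches every record. The index is built once and can then be reused for many lookups.</P>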
<P>Because an actual database is not used in this project (as discussed later in the section “Problem Statement”), the lowest level of database search becomes trivial and the indexing concept described above is not needed. In a “real” database environment, however, indexing mechanisms would be present.</P>
<P>A “high level” search, on the other hand, is a search mechanism that operates above the lower level searching mechanism just described. A high level search, rather than performing the actual search through a database, determines <I>what</I> the lower level search should look for. The focus of this discussion is on the “high level” search aspect of a data mining environment, and it is the genetic algorithm that can be used for such a search. Figure 9.1 shows how these search levels fit into the overall data mining environment.</P>
<P><A NAME="Fig1"></A><A HREF="javascript:displayWindow('images/09-01.jpg',496,321)"><IMG SRC="images/09-01t.jpg"></A>
<BR><A HREF="javascript:displayWindow('images/09-01.jpg',496,321)"><FONT COLOR="#000077"><B>Figure 9.1</B></FONT></A> Search levels in the data mining environment.<P><BR></P>
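<P>The division of labor between the two levels can be sketched as follows. This is a hedged illustration, not the chapter's genetic algorithm: the high level here proposes candidates by plain random sampling as a stand-in, and the data and the names <I>low_level_count</I> and <I>high_level_search</I> are hypothetical.</P>

```python
import random

# Hypothetical toy transactions standing in for the database.
TRANSACTIONS = [
    {"bread", "milk", "eggs", "butter", "jam"},
    {"bread", "milk", "eggs", "butter"},
    {"bread", "milk", "eggs", "butter", "tea"},
    {"milk", "tea", "jam"},
    {"bread", "eggs", "tea"},
]


def low_level_count(itemset, transactions):
    """Low level: scan the data for the pattern it was asked to find."""
    return sum(1 for t in transactions if itemset.issubset(t))


def high_level_search(transactions, k=4, trials=50, seed=0):
    """High level: decide which item combinations the low level looks for.

    Random sampling is only a placeholder for a smarter strategy such as
    a genetic algorithm.
    """
    rng = random.Random(seed)
    items = sorted(set().union(*transactions))
    best, best_count = None, -1
    for _ in range(trials):
        candidate = frozenset(rng.sample(items, k))
        count = low_level_count(candidate, transactions)
        if count > best_count:
            best, best_count = candidate, count
    return best, best_count


best, count = high_level_search(TRANSACTIONS)
print(sorted(best), count)
```

<P>The high-level routine never reads the transactions itself; it only chooses what to ask for and compares the answers. Replacing the sampling step with a different proposal strategy would leave the low-level routine unchanged.</P>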
<hr width="90%" size="1" noshade>
<div align="center">
<font face="Verdana,sans-serif" size="1">Copyright © <a href="/reference/crc00001.html">CRC Press LLC</a></font>
</div>
<!-- all of the reference materials (books) have the footer and subfoot reveresed -->
<!-- reference_subfoot = footer -->
<!-- reference_footer = subfoot -->
</BODY>
</HTML>
<!-- END FOOTER -->