How to Mine the Content of Your Web Site (described by the source as the only book in its field)
Page 100

The ID3 algorithm works by computing the entropy associated with each attribute. The calculation is made as follows:

[Image C0112-01.gif: the entropy calculation]

Rather than work you through the math, suffice it to say that ID3 measures entropy, or noise, in order to gauge how much information an attribute contains. The overall entropy of the classification is the expected amount of information that will be gained when the class is specified. For example, say you are trying to determine which attribute contains the most information in the classification of fruit:

"What is it, a banana, an apple or an orange?"

| Data         | Attribute | Information Gain |
| ------------ | --------- | ---------------- |
| It's 7.8 oz. | Weight    | Little           |
| It's round.  | Shape     | Some             |
| It's red.    | Color     | Most             |

Figure 3-9 is an example of an ID3 decision tree. Notice that ID3 forms a branch for unique clusters with similar income categories. ID3 is the precursor to C4.5 and C5.0, which were also developed by Quinlan and use a criterion known as *information gain* to compare and generate potential splits within a data set. C5.0, the newest algorithm, evaluates a proposed split by the ratio of the total information gain due to the split to the information gain attributable solely to the number of subsets created. The ID3 and C4.5 algorithms are based on concept learning, where the number of branches equals the number of categories of the predictor outputs.

**The Rules**

Machine-learning algorithms, which are the core technology of most decision tree and rule-generating data mining tools, take a "divide and conquer" approach. In any given data set, such as your log files or forms database, the algorithms look at the attributes (domain,

(continued on page 101)
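To make the entropy and information-gain ideas concrete, here is a minimal sketch of the calculation ID3 performs when choosing an attribute to split on. The fruit records below are a hypothetical toy data set invented for illustration (the 7.8 oz. example in the table is not reused); the function names are likewise assumptions, not from the book.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, attr, target):
    """Expected reduction in entropy from splitting `rows` on `attr`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += (len(subset) / len(rows)) * entropy(subset)
    return base - remainder

# Hypothetical fruit records: ID3 would split on the attribute with the
# highest information gain.
fruit = [
    {"color": "red",    "shape": "round", "weight": "medium", "fruit": "apple"},
    {"color": "orange", "shape": "round", "weight": "medium", "fruit": "orange"},
    {"color": "yellow", "shape": "long",  "weight": "medium", "fruit": "banana"},
    {"color": "red",    "shape": "round", "weight": "light",  "fruit": "apple"},
    {"color": "orange", "shape": "round", "weight": "light",  "fruit": "orange"},
    {"color": "yellow", "shape": "long",  "weight": "light",  "fruit": "banana"},
]

for attr in ("weight", "shape", "color"):
    print(attr, round(information_gain(fruit, attr, "fruit"), 3))
```

On this toy data the gains come out in the same order as the table above: weight carries little information (its gain is 0, since both weight classes contain all three fruits), shape carries some (it isolates bananas), and color carries the most (it determines the fruit completely), so ID3 would branch on color first.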
