<HTML> <HEAD> <!--SCRIPT LANGUAGE="JavaScript" SRC="http://a1835.g.akamai.net/f/1835/276/3h/www.netlibrary.com/include/js/dictionary_library.js"></SCRIPT> <SCRIPT LANGUAGE="JavaScript"> if (!opener){document.onkeyup=parent.turnBookPage;} </SCRIPT!--> <META HTTP-EQUIV="Cache-Control" CONTENT="no-cache"> <META HTTP-EQUIV="Pragma" CONTENT="no-cache"> <META HTTP-EQUIV="Expires" CONTENT="-1"><META http-equiv="Content-Type" content="text/html; charset=windows-1252"><SCRIPT>var PrevPage="Page_102";var NextPage="Page_104";var CurPage="Page_103";var PageOrder="114";</SCRIPT> <TITLE>Document</TITLE> </HEAD> <BODY BGCOLOR="#FFFFFF"><CENTER><TABLE BORDER=0 WIDTH=100% CELLPADDING=0><TR><TD ALIGN=CENTER> <TABLE BORDER=0 CELLPADDING=2 CELLSPACING=0 WIDTH=100%> <TR> <TD ALIGN=LEFT><A HREF='Page_102.html'>Previous</A></TD> <TD ALIGN=RIGHT><A HREF='Page_104.html'>Next</A></TD> </TR> </TABLE></TD></TR><TR><TD ALIGN=LEFT><P><A NAME='JUMPDEST_Page_103'/><A NAME='{386}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0 WIDTH='100%'><TR><TD ALIGN=RIGHT><FONT FACE='Times New Roman, Times, Serif' SIZE=2 COLOR=#FF0000>Page 103</FONT></TD></TR></TABLE><A NAME='{387}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>CART splits a data set on the basis of diversity, and it looks at all the variables in a database to determine which ones make the best separators or splitters. The best separators are those data fields that do the best job of splitting a database into groups where a single class is dominant. 
Its inputs, or independent predictor variables, can be nominal or ordinal, and continuous predictors are also supported.</FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{388}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>Then there is the AID family of algorithms: AID, THAID, CHAID, MAID, XAID, FIRM, TREEDISC, etc. These algorithms, the most popular being CHAID, are based on the concept of detecting complex statistical relationships. They generate decision trees where the number of branches varies from two to the number of categories of the predictor. A key difference between CHAID and CART or C4.5 is that CHAID is highly conservative; it stops growing trees early in an effort to avoid overfitting. CHAID also works only on categorical variables. All of these algorithms segment data sets on the basis of statistical significance tests, which they also use to determine the size of their tree. AID, MAID, and XAID were designed for quantitative responses, while THAID, CHAID, and TREEDISC are for nominal responses. Some data mining tools combine two or more algorithms, such as KnowledgeSEEKER, which uses CART, CHAID, and ID3, or Clementine, which uses ID3 and C5.0. The algorithm C5.0 is available from Quinlan's own company, RuleQuest.</FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{389}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=17></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3><B>The Trees vs. 
The Rules</B></FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{38A}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>Decision trees are popular in many areas of marketing and can also be used to analyze web-based data. Their advantages lie in their ability to generate understandable business rules in a decision-support environment and to model nonlinear relationships with logical rules. ID3 and its successors C4.5 and C5.0, along with CART, CHAID, and other variations of machine-learning algorithms, perform much the same process on a database: <I>they split it into classes that differ as much as possible in their relation to a selected output.</I> In other words, a database is split into subsets according to the results of statistical tests conducted on an output by the algorithm, <I>not</I> the user. CART, developed by L. Breiman and colleagues in 1984, builds binary trees that classify records on the basis of the attributes that are the best splitters and separators. CHAID, which was developed nearly a decade before by J. S. 
Hartigan in 1975, was designed to detect statistical relationships in data and is restricted to categorical variables.</FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{38B}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>In general, rules are more complex than necessary, and rules derived from trees are usually pruned to remove redundant tests.</FONT><FONT FACE='Times New Roman, Times, Serif' SIZE=3 COLOR=#FFFF00><!-- continue --></FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{38C}'/></FORM></P></TD></TR></TABLE><P><FONT SIZE=0 COLOR=WHITE></CENTER><A NAME="bottom"> </A><!-- netLibrary.com Copyright Notice --> </BODY></HTML>