How to mine your website's content. The only book in this field.
<HTML>  <HEAD>    <!--SCRIPT LANGUAGE="JavaScript" SRC="http://a1835.g.akamai.net/f/1835/276/3h/www.netlibrary.com/include/js/dictionary_library.js"></SCRIPT>    <SCRIPT LANGUAGE="JavaScript">      if (!opener){document.onkeyup=parent.turnBookPage;}    </SCRIPT!-->    <META HTTP-EQUIV="Cache-Control" CONTENT="no-cache">    <META HTTP-EQUIV="Pragma" CONTENT="no-cache">    <META HTTP-EQUIV="Expires" CONTENT="-1"><META http-equiv="Content-Type" content="text/html; charset=windows-1252"><SCRIPT>var PrevPage="Page_49";var NextPage="Page_51";var CurPage="Page_50";var PageOrder="62";</SCRIPT>  <TITLE>Document</TITLE>  </HEAD>  <BODY BGCOLOR="#FFFFFF"><CENTER><TABLE BORDER=0 WIDTH=100% CELLPADDING=0><TR><TD ALIGN=CENTER>  <TABLE BORDER=0 CELLPADDING=2 CELLSPACING=0 WIDTH=100%>  <TR>  <TD ALIGN=LEFT><A HREF='Page_49.html'>Previous</A></TD>  <TD ALIGN=RIGHT><A HREF='Page_51.html'>Next</A></TD>  </TR>  </TABLE></TD></TR><TR><TD ALIGN=LEFT><P><A NAME='JUMPDEST_Page_50'/><A NAME='{1D7}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0 WIDTH='100%'><TR><TD ALIGN=RIGHT><FONT FACE='Times New Roman, Times, Serif' SIZE=2 COLOR=#FF0000>Page 50</FONT></TD></TR></TABLE><A NAME='{1D8}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR>  <TD ROWSPAN=5></TD>  <TD COLSPAN=3 HEIGHT=17></TD>  <TD ROWSPAN=5></TD></TR><TR>  <TD COLSPAN=3></TD></TR><TR><TD></TD>  <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3><B>Data Mining vs. 
Statistics</B></FONT></TD><TD></TD></TR><TR>  <TD COLSPAN=3></TD></TR><TR>  <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{1D9}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR>  <TD ROWSPAN=5></TD>  <TD COLSPAN=3 HEIGHT=12></TD>  <TD ROWSPAN=5></TD></TR><TR>  <TD COLSPAN=3></TD></TR><TR><TD></TD>  <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>The deciding distinction between statistics and data mining is the direction of the query: in data mining, the interrogation of the data is done by the machine-learning algorithm or neural network, rather than by the statistician or business analyst. In other words, data mining is data-driven, rather than user-driven or verification-driven, as most statistical analyses are. In statistics, factorial and multivariate analyses of variance may be performed manually with such tools as SPSS or SAS in order to identify the relationships among the factors influencing the outcome of product sales. Pearson's correlation may be generated for every field in a database to measure the strength and direction of its relationship to some dependent variable, like total sales.</FONT></TD><TD></TD></TR><TR>  <TD COLSPAN=3></TD></TR><TR>  <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{1DA}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR>  <TD ROWSPAN=5></TD>  <TD COLSPAN=3 HEIGHT=12></TD>  <TD ROWSPAN=5></TD></TR><TR>  <TD COLSPAN=3></TD></TR><TR><TD></TD>  <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>A skilled SAS statistician conversant with that system's PROC syntax can perform that type of analysis rather quickly. However, one of the problems with this approach, aside from the fact that it is very resource-intensive, is that the techniques tend to focus on tasks in which all the attributes have continuous or ordinal values. Many of them are also parametric; for instance, a linear classifier assumes that the class can be expressed as a linear combination of the attribute values. 
Statistical methodology often assumes a bell-shaped normal distribution of data, which rarely exists in the real world of business and Internet databases, and departures from it are costly to accommodate. The statistical tool vendors are well aware of these shortcomings, however: both SPSS and SAS are now making available new data mining modules and add-ons to their main products.</FONT></TD><TD></TD></TR><TR>  <TD COLSPAN=3></TD></TR><TR>  <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{1DB}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR>  <TD ROWSPAN=5></TD>  <TD COLSPAN=3 HEIGHT=12></TD>  <TD ROWSPAN=5></TD></TR><TR>  <TD COLSPAN=3></TD></TR><TR><TD></TD>  <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>Data mining also has major advantages over statistics when databases increase in size, simply because manual approaches to data analysis are rendered impractical. For example, suppose there are 100 attributes in a database to choose from, and you don't know which of them are significant. Even with this small problem there are 100 &times; 99 = 9,900 ordered pairs of attributes to consider. If there are three classes, such as high, medium, and low, there are now 100 &times; 99 &times; 98 = 970,200 possible ordered combinations. If there are 800 attributes, as in our large website bookseller customer database&nbsp;.&nbsp;.&nbsp;.&nbsp;well, you get the picture. Consider analyzing millions of transactions on a daily basis, as is the case with a large electronic retailing site, and it quickly becomes apparent that the manual approach to pattern recognition simply does not scale to the task. 
Data mining, rather than hindering the traditional statistical approach to data analysis and knowledge discovery, extends it by allowing the automated examination of large numbers of hypotheses and the segmentation of very large databases.</FONT><FONT FACE='Times New Roman, Times, Serif' SIZE=3 COLOR=#FFFF00><!-- break --></FONT></TD><TD></TD></TR><TR>  <TD COLSPAN=3></TD></TR><TR>  <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{1DC}'/></FORM></P></TD></TR></TABLE><P><FONT SIZE=0 COLOR=WHITE></CENTER><A NAME="bottom">&nbsp;</A><!-- netLibrary.com Copyright Notice -->  </BODY></HTML>
