<HTML> <HEAD> <!--SCRIPT LANGUAGE="JavaScript" SRC="http://a1835.g.akamai.net/f/1835/276/3h/www.netlibrary.com/include/js/dictionary_library.js"></SCRIPT> <SCRIPT LANGUAGE="JavaScript"> if (!opener){document.onkeyup=parent.turnBookPage;} </SCRIPT!--> <META HTTP-EQUIV="Cache-Control" CONTENT="no-cache"> <META HTTP-EQUIV="Pragma" CONTENT="no-cache"> <META HTTP-EQUIV="Expires" CONTENT="-1"><META http-equiv="Content-Type" content="text/html; charset=windows-1252"><SCRIPT>var PrevPage="Page_140";var NextPage="Page_142";var CurPage="Page_141";var PageOrder="151";</SCRIPT> <TITLE>Document</TITLE> </HEAD> <BODY BGCOLOR="#FFFFFF"><CENTER><TABLE BORDER=0 WIDTH=100% CELLPADDING=0><TR><TD ALIGN=CENTER> <TABLE BORDER=0 CELLPADDING=2 CELLSPACING=0 WIDTH=100%> <TR> <TD ALIGN=LEFT><A HREF='Page_140.html'>Previous</A></TD> <TD ALIGN=RIGHT><A HREF='Page_142.html'>Next</A></TD> </TR> </TABLE></TD></TR><TR><TD ALIGN=LEFT><P><A NAME='JUMPDEST_Page_141'/><A NAME='{4D7}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0 WIDTH='100%'><TR><TD ALIGN=RIGHT><FONT FACE='Times New Roman, Times, Serif' SIZE=2 COLOR=#FF0000>Page 141</FONT></TD></TR></TABLE><A NAME='{4D8}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>mining and knowledge discovery process. Related to connectivity is the feature of importing data: does the tool support multiple formats such as ASCII, Access, Excel, comma or tab delimited, SAS, SPSS, and other specific DBMS, etc.? What conversion does the tool make with the original data it imports and at what ratio? Does the tool allow for the exporting of code, syntax, rules, etc.? Many database products (including traditional query, reporting, graphics, and visualization tools) can assist in the understanding of the data before and after the data mining process. 
The tool should make it easy to export its results in a format that can be visually enhanced for presentation to management or others.</FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{4D9}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3><B><I>Memory Management</I></B></FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{4DA}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>Normally a data mining tool's memory requirement grows only linearly with the size of the data set. The time complexity of the learning phase, by contrast, is a very limiting factor for many of today's data mining tools. If, for instance, the running time of the algorithm grows exponentially with the size of the training set, the maximum practical size of the training set will be quite limited. As with time usage, only the complexity matters when considering the memory usage of a data mining tool. Still, the memory usage can give an indication of what kind of system is necessary to handle "normal" amounts of data, in terms of the number of records and fields.</FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{4DB}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>On the other hand, whether a complex system takes one hour or two is usually of less importance. 
For a tool that iterates toward better and better accuracy, the time used in the evaluation should be the time needed to reach a given level of accuracy. In the evaluation of a tool, consideration should be given to this time/complexity factor. Know in advance that certain tools, such as those based on an SOM network or a genetic algorithm, are very computationally intensive, meaning results may not be available for several hours.</FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{4DC}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3><B><I>Performance</I></B></FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{4DD}'/><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0><TR> <TD ROWSPAN=5></TD> <TD COLSPAN=3 HEIGHT=12></TD> <TD ROWSPAN=5></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR><TD></TD> <TD><FONT FACE='Times New Roman, Times, Serif' SIZE=3>Speed and accuracy both contribute to the evaluation of a data mining tool's overall performance. Speed is measured by how fast a model is built, as well as how fast a deployed predictive model can evaluate new data. Given the tool's underlying network design or algorithm, does it process the data in a single pass or in multiple passes? Another factor impacting performance is cost, that is, the cost of providing a learning data set for the development of a model. This cost includes the number of examples required and the cost of ensuring the needed accuracy in the model's learning set. 
In most</FONT><FONT FACE='Times New Roman, Times, Serif' SIZE=3 COLOR=#FFFF00><!-- continue --></FONT></TD><TD></TD></TR><TR> <TD COLSPAN=3></TD></TR><TR> <TD COLSPAN=3 HEIGHT=1></TD></TR></TABLE><A NAME='{4DE}'/></FORM></P></TD></TR></TABLE><P><FONT SIZE=0 COLOR=WHITE></CENTER><A NAME="bottom"> </A><!-- netLibrary.com Copyright Notice --> </BODY></HTML>