⭐ 虫虫下载站

📄 Data Warehousing and Data Mining: MATLAB implementations of selected data mining algorithms, c4_5.htm

📁 [Data mining] MATLAB implementations of selected data mining algorithms: C4_5, a fairly classic piece of code
                  discrete_dim(dims), Uc);<BR>&nbsp;&nbsp; in = 
                  indices(find(features(dim, indices) 
                  &gt;&nbsp;&nbsp;tree.split_loc));<BR>&nbsp;&nbsp; targets = 
                  targets + use_tree(features(dims, :), in, tree.child(2), 
                  discrete_dim(dims), Uc);<BR>else<BR>&nbsp;&nbsp; %Discrete 
                  feature<BR>&nbsp;&nbsp; Uf = unique(features(dim,:));<BR>for i 
                  = 1:length(Uf),<BR>&nbsp;&nbsp; in&nbsp;&nbsp; &nbsp;&nbsp; = 
                  indices(find(features(dim, indices) == 
                  Uf(i)));<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;targets = 
                  targets + use_tree(features(dims, :), in, tree.child(i), 
                  discrete_dim(dims), Uc);<BR>&nbsp;&nbsp; 
                  end<BR>end<BR>&nbsp;&nbsp;&nbsp;&nbsp;<BR>%END use_tree 
                  <BR><BR>function tree = make_tree(features, targets, inc_node, 
                  discrete_dim, maxNbin, base)<BR>%Build a tree 
                  recursively<BR><BR>[Ni, L]&nbsp;&nbsp;&nbsp;&nbsp; = 
                  size(features);<BR>Uc&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
                  = unique(targets);<BR>tree.dim = 0;<BR>%tree.child(1:maxNbin) 
                  = zeros(1,maxNbin);<BR>tree.split_loc = inf;<BR><BR>if 
                  isempty(features),<BR>&nbsp;&nbsp; return<BR>end<BR><BR>%When 
                  to stop: If the dimension is one or the number of examples is 
                  small<BR>if ((inc_node &gt; L) | (L == 1) | (length(Uc) == 
                  1)),<BR>&nbsp;&nbsp; H = hist(targets, 
                  length(Uc));<BR>&nbsp;&nbsp; [m, largest] = 
                  max(H);<BR>&nbsp;&nbsp; tree.child = 
                  Uc(largest);<BR>&nbsp;&nbsp; return<BR>end<BR><BR>%Compute the 
                  node's entropy I<BR>for i = 
                  1:length(Uc),<BR>&nbsp;&nbsp;&nbsp;&nbsp;Pnode(i) = 
                  length(find(targets == Uc(i))) / L;<BR>end<BR>Inode = 
                  -sum(Pnode.*log(Pnode)/log(2));<BR><BR>%For each dimension, 
                  compute the gain ratio impurity<BR>%This is done separately 
                  for discrete and continuous 
                  features<BR>delta_Ib&nbsp;&nbsp;&nbsp;&nbsp;= zeros(1, 
                  Ni);<BR>split_loc = ones(1, Ni)*inf;<BR><BR>for i = 
                  1:Ni,<BR>&nbsp;&nbsp; data = features(i,:);<BR>&nbsp;&nbsp; 
                  Ub = unique(data);<BR>&nbsp;&nbsp; Nbins = length(Ub);<BR>&nbsp;&nbsp; if 
                  (discrete_dim(i)),<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%This 
                  is a discrete feature; count against the values actually present<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;P = zeros(length(Uc), 
                  Nbins);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for j = 
                  1:length(Uc),<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
                  for k = 
                  1:Nbins,<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;indices 
                  = find((targets == Uc(j)) &amp; (data == 
                  Ub(k)));<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;P(j,k) 
                  = 
                  length(indices);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
                  end<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;end<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Pk&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 
                  sum(P);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;P&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
                  = 
                  P/L;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Pk&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 
                  Pk/sum(Pk);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;info&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 
                  sum(-P.*log(eps+P)/log(2));<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;delta_Ib(i) 
                  = 
                  (Inode-sum(Pk.*info))/-sum(Pk.*log(eps+Pk)/log(2));<BR>&nbsp;&nbsp; 
                  else<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%This is a 
                  continuous feature<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;P = 
                  zeros(length(Uc), 
                  2);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%Sort 
                  the 
                  features<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[sorted_data, 
                  indices] = 
                  sort(data);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sorted_targets 
                  = 
                  targets(indices);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%Calculate 
                  the information for each possible 
                  split<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;I = zeros(1, 
                  L-1);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for j = 
                  1:L-1,<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for 
                  k 
                  =1:length(Uc),<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;P(k,1) 
                  = length(find(sorted_targets(1:j) == 
                  Uc(k)));<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;P(k,2) 
                  = length(find(sorted_targets(j+1:end) == 
                  Uc(k)));<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
                  end<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Ps = 
                  sum(P)/L;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
                  P = P/L;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
                  info = 
                  sum(-P.*log(eps+P)/log(2));<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
                  I(j) = Inode - sum(info.*Ps);&nbsp;&nbsp; 
                  <BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;end<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[delta_Ib(i), 
                  s] = max(I);<BR>split_loc(i) = 
                  sorted_data(s);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<BR>&nbsp;&nbsp; 
                  end<BR>end<BR><BR>%Find the dimension maximizing delta_Ib 
                  <BR>[m, dim] = max(delta_Ib);<BR>dims = 1:Ni;<BR>tree.dim = 
                  dim;<BR><BR>%Split along the 'dim' dimension<BR>Nf = 
                  unique(features(dim,:));<BR>Nbins = length(Nf);<BR>if 
                  (discrete_dim(dim)),<BR>&nbsp;&nbsp; %Discrete 
                  feature<BR>&nbsp;&nbsp; for i = 
                  1:Nbins,<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;indices&nbsp;&nbsp;&nbsp;&nbsp; 
                  = find(features(dim, :) == 
                  Nf(i));<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tree.child(i) = 
                  make_tree(features(dims, indices), targets(indices), inc_node, 
                  discrete_dim(dims), maxNbin, base);<BR>&nbsp;&nbsp; 
                  end<BR>else<BR>&nbsp;&nbsp; %Continuous 
                  feature<BR>&nbsp;&nbsp; tree.split_loc = 
                  split_loc(dim);<BR>&nbsp;&nbsp; indices1 &nbsp;&nbsp; = 
                  find(features(dim,:) &lt;= split_loc(dim));<BR>&nbsp;&nbsp; 
                  indices2 &nbsp;&nbsp; = find(features(dim,:) &gt; 
                  split_loc(dim));<BR>&nbsp;&nbsp; tree.child(1) = 
                  make_tree(features(dims, indices1), targets(indices1), 
                  inc_node, discrete_dim(dims), maxNbin, base);<BR>&nbsp;&nbsp; 
                  tree.child(2) = make_tree(features(dims, indices2), 
                  targets(indices2), inc_node, discrete_dim(dims), 
                  maxNbin, base);<BR>end</TD></TR></TBODY></TABLE><BR>
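The discrete branch of make_tree scores a feature by gain ratio: the drop in entropy divided by the entropy of the split itself. The sketch below works one small case by hand using the textbook definition, in which class probabilities are normalized within each branch. The listing above divides counts by the node size L instead, so its numbers can differ. The toy data and the names entropy_bits, Ichild and split_info are illustrative assumptions, not part of the original code.

```matlab
% Toy data (hypothetical): one discrete feature and two classes.
targets = [1 1 1 2 2 2];
feature = [1 1 2 2 3 3];

% Entropy in bits of a probability vector; zero entries contribute nothing.
entropy_bits = @(p) -sum(p(p > 0) .* log2(p(p > 0)));

Uc = unique(targets);
Uf = unique(feature);
L  = numel(targets);

% Entropy of the node before splitting.
Pnode = arrayfun(@(c) sum(targets == c), Uc) / L;
Inode = entropy_bits(Pnode);

% Entropy of each child, one child per value of the feature.
Pk     = zeros(1, numel(Uf));
Ichild = zeros(1, numel(Uf));
for k = 1:numel(Uf)
    in        = (feature == Uf(k));
    Pk(k)     = sum(in) / L;
    Pc        = arrayfun(@(c) sum(targets(in) == c), Uc) / sum(in);
    Ichild(k) = entropy_bits(Pc);
end

gain       = Inode - sum(Pk .* Ichild);   % information gain
split_info = entropy_bits(Pk);            % entropy of the split itself
gain_ratio = gain / split_info;
fprintf('gain = %.4f, split info = %.4f, gain ratio = %.4f\n', ...
        gain, split_info, gain_ratio);
% prints: gain = 0.6667, split info = 1.5850, gain ratio = 0.4206
```

Dividing by the split information penalizes features with many distinct values, which would otherwise look attractive simply because they cut the data into many small pure groups.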
            </TD></TR></TBODY></TABLE><BR><BR>
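For a continuous feature, make_tree sorts the samples, tries a cut after every sorted position, and keeps the cut with the largest entropy drop. A minimal sketch of that scan follows, again with class probabilities normalized within each side (the textbook form) and with hypothetical toy data. The helper names entropy_bits and class_probs are assumptions introduced only for this example.

```matlab
% Toy data (hypothetical): one continuous feature and two classes.
data    = [2.0 3.5 1.0 4.0 3.0];
targets = [1   2   1   2   1  ];

entropy_bits = @(p) -sum(p(p > 0) .* log2(p(p > 0)));
Uc = unique(targets);
L  = numel(targets);
% Class probabilities of a subset of targets.
class_probs = @(t) arrayfun(@(c) sum(t == c), Uc) / numel(t);

% Sort the samples by feature value and carry the targets along.
[sorted_data, order] = sort(data);
sorted_targets = targets(order);
Inode = entropy_bits(class_probs(targets));

% Try a cut after each sorted position j: left = 1..j, right = j+1..L.
I = zeros(1, L-1);
for j = 1:L-1
    left  = sorted_targets(1:j);
    right = sorted_targets(j+1:end);
    I(j)  = Inode - (j/L)     * entropy_bits(class_probs(left)) ...
                  - ((L-j)/L) * entropy_bits(class_probs(right));
end

[best_gain, s] = max(I);
split_loc = sorted_data(s);   % samples with value <= split_loc go to child 1
fprintf('split_loc = %.1f, gain = %.4f\n', split_loc, best_gain);
% prints: split_loc = 3.0, gain = 0.9710
```

As in the listing, the threshold is the feature value of the last sample on the left side, so the split test is `<= split_loc` for child 1 and `> split_loc` for child 2.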
      <STYLE type=text/css>A.categorylink:link {
	COLOR: #999999
}
A.categorylink:visited {
	COLOR: #999999
}
A.categorylink:active {
	COLOR: #999999
}
A.categorylink:hover {
	COLOR: #ff9900
}
</STYLE>

      <TABLE style="TABLE-LAYOUT: fixed; WORD-BREAK: break-all" cellSpacing=1 
      cellPadding=3 width="98%" bgColor=#cccccc border=0>
        <TBODY>
        <TR bgColor=#f8f8f8>
          <TD>
            <P><FONT size=4><STRONG>Re: MATLAB implementations of selected data mining algorithms&nbsp;C4_5<A 
            name=47724></A></STRONG></FONT><BR><A class=categorylink 
            href="http://blogger.org.cn/blog/list.asp?classid=46" 
            target=_blank>Online resources</A>,&nbsp;&nbsp;<A class=categorylink 
            href="http://blogger.org.cn/blog/list.asp?classid=4" 
            target=_blank>Notes</A></P>
            <P>111 (guest) commented on 2007-3-25 14:55:53 </P></TD></TR>
        <TR bgColor=#ffffff>
          <TD height=0>
            <TABLE cellSpacing=0 cellPadding=0 width="100%" border=0>
              <TBODY>
              <TR>
                <TD>Yes, what does high_histogram do? Could you post it? I need it urgently, thanks.</TD></TR></TBODY></TABLE><BR>
            </TD></TR></TBODY></TABLE><BR><BR>

      <TABLE style="TABLE-LAYOUT: fixed; WORD-BREAK: break-all" cellSpacing=1 
      cellPadding=3 width="98%" bgColor=#cccccc border=0>
        <TBODY>
        <TR bgColor=#f8f8f8>
          <TD>
            <P><FONT size=4><STRONG>Re: MATLAB implementations of selected data mining algorithms&nbsp;C4_5<A 
            name=23715></A></STRONG></FONT><BR><A class=categorylink 
            href="http://blogger.org.cn/blog/list.asp?classid=46" 
            target=_blank>Online resources</A>,&nbsp;&nbsp;<A class=categorylink 
            href="http://blogger.org.cn/blog/list.asp?classid=4" 
            target=_blank>Notes</A></P>
            <P>tt (guest) commented on 2006-5-12 11:50:37 </P></TD></TR>
        <TR bgColor=#ffffff>
          <TD height=0>
            <TABLE cellSpacing=0 cellPadding=0 width="100%" border=0>
              <TBODY>
              <TR>
                <TD>
                  <P>I have PCA, but I don't know how to use it when actually running the code. For example, how should inc_node and region be set in C4_5?</P>
                  <P><A 
                href="mailto:dyj_115@sina.com">dyj_115@sina.com</A></P></TD></TR></TBODY></TABLE><BR>
            </TD></TR></TBODY></TABLE><BR><BR>
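Going only by the make_tree listing on this page, inc_node is compared with the number of samples L that reach a node. The node becomes a leaf when inc_node exceeds L, when a single sample is left, or when only one class remains. How the top-level C4_5 function scales inc_node before calling make_tree, and what region means, is not visible in this part of the code. A minimal sketch of that stopping rule, with is_leaf as a hypothetical helper name and made-up data:

```matlab
% is_leaf is a hypothetical helper that mirrors the stopping test in make_tree:
% stop when inc_node exceeds the node size, one sample is left, or the node is pure.
is_leaf = @(targets, inc_node) (inc_node > numel(targets)) || ...
                               (numel(targets) == 1) || ...
                               (numel(unique(targets)) == 1);

small_node = [1 2 2 1];             % 4 samples, mixed classes
pure_node  = [2 2 2 2 2 2];         % 6 samples, a single class
big_node   = [1 2 1 2 1 2 1 2];     % 8 samples, mixed classes

fprintf('%d %d %d\n', is_leaf(small_node, 5), ...
                      is_leaf(pure_node, 5), ...
                      is_leaf(big_node, 5));
% prints: 1 1 0
```

Read this way, a larger inc_node stops the recursion earlier and gives a shallower tree, while a smaller value lets the tree grow until the nodes are pure.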
      <STYLE type=text/css>A.categorylink:link {
	COLOR: #999999
}
A.categorylink:visited {
	COLOR: #999999
}
A.categorylink:active {
	COLOR: #999999
}
A.categorylink:hover {
