⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 euclideanclustermodel.java

📁 dragontoolkit用于机器学习
💻 JAVA
字号:
package dragon.ir.clustering.clustermodel;
import dragon.matrix.*;
import dragon.ir.clustering.*;
import dragon.ir.index.*;

/**
 * <p>Cluster model which computes Euclidean distance between a document and a cluster</p>
 * <p></p>
 * <p>Copyright: Copyright (c) 2005</p>
 * <p>Company: IST, Drexel University</p>
 * @author Davis Zhou
 * @version 1.0
 */

public class EuclideanClusterModel extends AbstractClusterModel{
    private double[][] arrClusterVector;
    private double [] arrClusterVectorLen;
    private SparseMatrix matrix;

    public EuclideanClusterModel(int clusterNum, SparseMatrix docTerm) {
        super(clusterNum);
        arrClusterVectorLen=new double[clusterNum];
        this.matrix=docTerm;
    }

    public void setClusterNum(int clusterNum) {
        this.clusterNum = clusterNum;
        arrClusterVectorLen=new double[clusterNum];
    }

    public void setDocCluster(DocCluster cluster) {
        int i, j, featureNum, newIndex, clusterID, indexList[];
        double freqList[];

        if(featureFilter==null)
            featureNum=matrix.columns();
        else
            featureNum=featureFilter.getSelectedFeatureNum();
        if(arrClusterVector==null || arrClusterVector.length!=clusterNum || arrClusterVector[0].length!=featureNum)
            arrClusterVector = new double[clusterNum][featureNum];
        clusterID = cluster.getClusterID();

        for (i = 0; i < cluster.getDocNum(); i++) {
            indexList = matrix.getNonZeroColumnsInRow(cluster.getDoc(i).getIndex());
            freqList = matrix.getNonZeroDoubleScoresInRow(cluster.getDoc(i).getIndex());
            for (j = 0; j < indexList.length; j++){
                newIndex = indexList[j];
                if (featureFilter != null)
                    newIndex = featureFilter.map(newIndex);
                if(newIndex>=0)
                    arrClusterVector[clusterID][newIndex] += freqList[j];
            }
        }

        arrClusterVectorLen[clusterID]=0;
        for (i = 0; i < arrClusterVector[clusterID].length; i++) {
            if (arrClusterVector[clusterID][i] == 0)
                continue;
            arrClusterVector[clusterID][i] = arrClusterVector[clusterID][i] / cluster.getDocNum();
            arrClusterVectorLen[clusterID] += arrClusterVector[clusterID][i] * arrClusterVector[clusterID][i];
        }
    }

    public double getDistance(IRDoc doc, int clusterID) {
      int i,newIndex, indexList[];
      double product, freqList[];

      product=arrClusterVectorLen[clusterID];
      indexList = matrix.getNonZeroColumnsInRow(doc.getIndex());
      freqList = matrix.getNonZeroDoubleScoresInRow(doc.getIndex());

      for (i = 0; i < indexList.length; i++) {
          newIndex=indexList[i];
          if(featureFilter!=null)
              newIndex=featureFilter.map(newIndex);
          if(newIndex>=0){
              product += freqList[i]*(freqList[i]-2*arrClusterVector[clusterID][newIndex]);
          }
      }
      return Math.sqrt(product);
  }
}

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -