// sequentialalgorithm.java
// Complete Java source code of the data mining software ALPHAMINERR.

/*
 *    This program is free software; you can redistribute it and/or modify
 *    it under the terms of the GNU General Public License as published by
 *    the Free Software Foundation; either version 2 of the License, or
 *    (at your option) any later version.
 *
 *    This program is distributed in the hope that it will be useful,
 *    but WITHOUT ANY WARRANTY; without even the implied warranty of
 *    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *    GNU General Public License for more details.
 *
 *    You should have received a copy of the GNU General Public License
 *    along with this program; if not, write to the Free Software
 *    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

/**
 * Title: XELOPES Data Mining Library
 * Description: The XELOPES library is an open platform-independent and data-source-independent library for Embedded Data Mining.
 * Copyright: Copyright (c) 2002 Prudential Systems Software GmbH
 * Company: ZSoft (www.zsoft.ru), Prudsys (www.prudsys.com)
 * @author Victor Borichev
 * @author Valentine Stepanenko (valentine.stepanenko@zsoft.ru)
 * @version 1.0
 */

package com.prudsys.pdm.Models.Sequential;

import java.util.Date;
import java.util.Hashtable;
import java.util.Vector;

import com.prudsys.pdm.Core.CategoricalAttribute;
import com.prudsys.pdm.Core.MiningAlgorithm;
import com.prudsys.pdm.Core.MiningException;
import com.prudsys.pdm.Core.MiningModel;
import com.prudsys.pdm.Core.MiningSettings;
import com.prudsys.pdm.Core.NumericAttribute;
import com.prudsys.pdm.Models.Sequential.Event.CreationModelEndMessageSequential;

/**
 * Base class for sequential algorithms.
 */
public abstract class SequentialAlgorithm extends MiningAlgorithm
{
    // -----------------------------------------------------------------------
    //  Variables declarations
    // -----------------------------------------------------------------------
    /** Item ID attribute. */
    protected CategoricalAttribute itemId;

    /** Transaction ID attribute. */
    protected CategoricalAttribute transactionId;

    /** Item index attribute. */
    protected NumericAttribute itemIndex;

    /** Minimum support. */
    protected double minimumSupport;

    /** Minimum confidence. */
    protected double minimumConfidence;

    /** Generate rules from sequences. */
    protected boolean generateRules = false;

    /** Export all transaction IDs into PMML. */
    protected boolean exportTransactIds = true;

    /** PMML export type for the names of item ID, transaction ID, and item index. */
    protected int exportTransactItemNames = SequentialMiningModel.EXPORT_PMML_NAME_TYPE_XELOPES;

    // -----------------------------------------------------------------------
    //  Constructor
    // -----------------------------------------------------------------------
    /**
     * Empty constructor.
     */
    public SequentialAlgorithm()
    {
    }

    // -----------------------------------------------------------------------
    //  Getter and setter methods
    // -----------------------------------------------------------------------
    /**
     * Returns whether all transaction IDs are written into PMML (default: true).
     *
     * @return true if all transaction IDs are written into PMML, otherwise false
     */
    public boolean isExportTransactIds()
    {
        return exportTransactIds;
    }

    /**
     * Sets whether all transaction IDs are exported into PMML (default: true).
     *
     * @param exportTransactIds true to export, otherwise false
     */
    public void setExportTransactIds(boolean exportTransactIds)
    {
        this.exportTransactIds = exportTransactIds;
    }

    /**
     * Returns the type of how item, transaction, and position IDs are handled
     * in PMML.
     *
     * @return PMML export type of transaction, item, and position IDs
     */
    public int getExportTransactItemNames()
    {
        return exportTransactItemNames;
    }

    /**
     * Sets the type of how item, transaction, and position IDs are handled in
     * PMML. This is needed because of an incompleteness in PMML 2.0:
     * transaction, item, and position IDs are not specially denoted in the
     * mining schema. As a result, PMML 2.0 sequence models cannot really be
     * applied to new data (unless agreed names are used for the IDs). <p>
     *
     * There are two ways to handle this problem:
     * 1. Do nothing: conforms with PMML 2.0 but loses functionality.
     * 2. Use the XELOPES PMML extension: three new attributes
     * 'itemIdName' (itemId), 'transactIdName' (transactionId), and
     * 'positionIdName' (itemIndex) are added to SequenceModel.
     *
     * @param exportTransactItemNames PMML export type of item, transaction,
     * and position IDs
     */
    public void setExportTransactItemNames(int exportTransactItemNames)
    {
        this.exportTransactItemNames = exportTransactItemNames;
    }

    /**
     * Creates an instance of the sequential settings class that is required
     * to run the algorithm. The mining settings are assigned through the
     * setMiningSettings method.
     *
     * @return new instance of the sequential settings class of the algorithm
     */
    public MiningSettings createMiningSettings()
    {
        return new SequentialSettings();
    }

    /**
     * Sets sequential settings.
     *
     * @param miningSettings new sequential settings
     * @exception IllegalArgumentException if the mining settings are not sequential settings
     */
    public void setMiningSettings( MiningSettings miningSettings ) throws IllegalArgumentException
    {
        if( miningSettings instanceof SequentialSettings )
        {
            super.setMiningSettings( miningSettings );
            SequentialSettings sequentialSettings = (SequentialSettings)miningSettings;
            this.itemId = (CategoricalAttribute)sequentialSettings.getItemId();
            this.transactionId = (CategoricalAttribute)sequentialSettings.getTransactionId();
            this.itemIndex = (NumericAttribute)sequentialSettings.getItemIndex();
            this.minimumSupport = sequentialSettings.getMinimumSupport();
            this.minimumConfidence = sequentialSettings.getMinimumConfidence();
            this.generateRules = sequentialSettings.isGenerateRules();
        }
        else
        {
            throw new IllegalArgumentException( "MiningSettings have to be instance of SequentialSettings." );
        }
    }

    /**
     * Returns the large sequences found by the algorithm.
     *
     * @return large sequences
     */
    protected abstract Vector getSequentialRules();

    /**
     * Returns sequence rules. This is done via calculating
     * the rules from the large sequences.
     *
     * @return sequence rules
     * @exception MiningException cannot generate rules
     */
    protected Vector getSequenceRules() throws MiningException {

      if (!generateRules)
        throw new MiningException("Rule generation is disabled (generateRules is false).");

      // Fetch the large sequences once instead of on every loop iteration:
      Vector sequences = getSequentialRules();
      int num          = sequences.size();
      int nTransact    = getNumberOfTransactions();

      // getNumberOfTransactions() returns -1 if unknown; dividing by it
      // would produce negative supports:
      if (nTransact <= 0)
        throw new MiningException("Number of transactions is unknown; cannot calculate supports.");

      // Construct hashtable of all large sequences with their relative supports:
      Hashtable seqs = new Hashtable();
      for (int i = 0; i < num; i++) {
        ItemSetSeq iss = (ItemSetSeq) sequences.elementAt(i);
        if (seqs.get(iss) == null) {
          double supp = (double) iss.getSupportCount() / (double) nTransact;
          seqs.put(iss, Double.valueOf(supp));
        }
      }

      // Find all rules satisfying minimum confidence condition:
      Vector sequenceRules = new Vector();
      for (int i = 0; i < num; i++) {
        ItemSetSeq iss = (ItemSetSeq) sequences.elementAt(i);
        if (iss.getSize() == 1) continue;

        // Support of the whole sequence is the same for every split:
        Double SuppAUB = (Double) seqs.get(iss);
        if (SuppAUB == null) continue;

        // Get all rules for itemset by splitting it at each inner position:
        for (int j = 1; j < iss.getSize(); j++) {
          // New rule: first j items form the premise, the rest the conclusion
          ItemSetSeq prem = new ItemSetSeq();
          ItemSetSeq conc = new ItemSetSeq();
          for (int k = 0; k < j; k++) prem.addItem( iss.getItemAt(k) );
          for (int k = j; k < iss.getSize(); k++) conc.addItem( iss.getItemAt(k) );

          // Check confidence condition of new rule:
          Double SuppA = (Double) seqs.get(prem);
          if (SuppA == null || SuppA.doubleValue() == 0) continue;

          double conf = SuppAUB.doubleValue() / SuppA.doubleValue();
          if (conf < minimumConfidence) continue;

          // Add new rule to list:
          RuleSetSeq rss = new RuleSetSeq(prem, conc, SuppAUB.doubleValue(), conf);
          sequenceRules.addElement(rss);
        }
      }

      return sequenceRules;
    }

    /**
     * Returns number of transactions. The standard implementation uses the
     * number of categories of the transaction ID attribute. Algorithms that
     * can also handle transaction ID attributes which do not store all
     * categories (e.g. AssociationRulesDecompAlgorithm) should override
     * this method.
     *
     * @return number of transactions, -1 if unknown
     */
    public int getNumberOfTransactions()
    {
      int nTransact = -1;
      if ( transactionId != null && !transactionId.isUnstoredCategories() )
         nTransact = transactionId.getCategoriesNumber();

      return nTransact;
    }

    // -----------------------------------------------------------------------
    //  Run sequential algorithm and build mining model
    // -----------------------------------------------------------------------
    /**
     * Runs sequential algorithm.
     *
     * @exception MiningException cannot run algorithm
     */
    protected abstract void runAlgorithm() throws MiningException;

    /**
     * Builds mining model by running the sequential algorithm internally.
     *
     * @return sequential mining model generated by the algorithm
     * @exception MiningException cannot build model
     */
    public MiningModel buildModel() throws MiningException
    {
        long start = ( new Date() ).getTime();

        runAlgorithm();

        SequentialMiningModel model = new SequentialMiningModel();
        model.setMiningSettings( miningSettings );
        model.setInputSpec( applicationInputSpecification );
        model.setSequentialRules( getSequentialRules() );
        if (generateRules) model.setSequenceRules( getSequenceRules() );
        model.setItemIdName( itemId.getName() );
        model.setTransactIdName( transactionId.getName() );
        model.setExportTransactIds(exportTransactIds);
        model.setExportTransactItemNames(exportTransactItemNames);
        if (getNumberOfTransactions() >= 0) model.setNumberOfTransactions( getNumberOfTransactions() );
        this.miningModel = model;

        long end = ( new Date() ).getTime();
        timeSpentToBuildModel = ( end - start ) / 1000.0;
        
        int nRules = model.getSequenceRules()!=null ? model.getSequenceRules().size() : 0;
        fireMiningEvent(new CreationModelEndMessageSequential(nRules, model.getSequentialRules().size(), getAlgorithmLevel()));
        
        return model;
    }
}
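
/*
 * Illustration only, not part of the XELOPES API: the rule derivation in
 * getSequenceRules() above can be sketched in isolation. Each large sequence
 * is split at every inner position into a premise (the first j items) and a
 * conclusion (the rest), and the split is kept if
 * confidence = support(sequence) / support(premise) reaches the minimum.
 * The names PrefixRuleSketch, Rule, and deriveRules are hypothetical, and
 * plain string lists are assumed as a stand-in for ItemSetSeq.
 *
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PrefixRuleSketch {

    // Hypothetical stand-in for a sequence rule: premise => conclusion.
    public static final class Rule {
        public final List<String> premise;
        public final List<String> conclusion;
        public final double support;
        public final double confidence;

        Rule(List<String> premise, List<String> conclusion,
             double support, double confidence) {
            this.premise = premise;
            this.conclusion = conclusion;
            this.support = support;
            this.confidence = confidence;
        }
    }

    // Splits every sequence at each inner position and keeps the splits
    // whose confidence is at least minConfidence.
    public static List<Rule> deriveRules(Map<List<String>, Integer> supportCounts,
                                         int nTransactions, double minConfidence) {
        if (nTransactions <= 0) {
            throw new IllegalArgumentException("nTransactions must be positive");
        }
        List<Rule> rules = new ArrayList<>();
        for (Map.Entry<List<String>, Integer> entry : supportCounts.entrySet()) {
            List<String> seq = entry.getKey();
            double suppSeq = entry.getValue() / (double) nTransactions;
            for (int j = 1; j < seq.size(); j++) {
                List<String> premise = new ArrayList<>(seq.subList(0, j));
                List<String> conclusion = new ArrayList<>(seq.subList(j, seq.size()));
                Integer premiseCount = supportCounts.get(premise);
                if (premiseCount == null || premiseCount == 0) {
                    continue; // premise is not a known large sequence
                }
                double suppPremise = premiseCount / (double) nTransactions;
                double confidence = suppSeq / suppPremise;
                if (confidence >= minConfidence) {
                    rules.add(new Rule(premise, conclusion, suppSeq, confidence));
                }
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        // Made-up support counts over 8 transactions.
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        counts.put(Arrays.asList("A"), 8);
        counts.put(Arrays.asList("A", "B"), 4);
        counts.put(Arrays.asList("A", "B", "C"), 1);

        for (Rule r : deriveRules(counts, 8, 0.4)) {
            System.out.println(r.premise + " => " + r.conclusion
                + " support=" + r.support + " confidence=" + r.confidence);
        }
    }
}
```
 *
 * With the made-up counts in main, only the split [A] => [B] reaches the
 * minimum confidence of 0.4; the two splits of [A, B, C] fall below it.
 */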
