 prefixspan --- An Implementation of PrefixSpan

 Author: Taku Kudo <taku-ku@is.aist-nara.ac.jp>
         Nara Institute of Science and Technology,
         Graduate School of Information Science,
         Computational Linguistics Laboratory

 License: GPL2 (GNU General Public License Version 2)

Reference:

  J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu,
  PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth
  Proc. 2001 Int. Conf. on Data Engineering (ICDE'01), Heidelberg, Germany, April 2001.
  http://www.cs.sfu.ca/~peijian/personal/publications/span.pdf

Requirements:

  C++ compiler with STL (Standard Template Library).

Install:

  % make

Usage:

    ./prefixspan [options] < data

    option:
         -m NUM:   set minimum support        (default: 1)
         -M NUM:   set minimum pattern length (default: 1)
         -L NUM:   set maximum pattern length (default: 0xffffffff)
         -t TYPE:  set item type, choose from [string|int|short|char]
                                              (default: string)
         -a:       print ALL patterns (default: no, print longest pattern only)
         -w:       print the list of transaction IDs
                   where the pattern occurs (default: no)
         -d STR:   use STR as delimiter between item and freq. (default: "/")
         -v:       set verbose mode (print the size of transactions first)

Format of input data:

  foo bar do
  foo foo bar
  i you he
  she me

  Each line corresponds to one transaction, which is a list of items
  separated by a single space. For example, the first transaction has
  3 items (foo, bar, do).

  If you do not care about the sequential order of items, sort the
  items of each transaction in dictionary order, like:

  bar do foo
  bar foo foo
  he i you
  me she

Format of results:

  item/freq. item/freq. ...
  item/freq. item/freq. ...
  ..

Here is an example:

  bar/187 foo/113
  do/170 bar/134
  she/100
  i/501 by/232 the/108

  This result means:

  SEQUENTIAL PATTERN    : FREQUENCY
  bar                   : 187
  bar -> foo            : 113
  do                    : 170
  do -> bar             : 134
  she                   : 100
  i -> by -> the        : 108
  i -> by               : 232
  i                     : 501

  Each line represents the longest sequential pattern
  whose frequency is larger than minsup (-m option).
  The -M NUM1 and -L NUM2 options restrict the size of the patterns extracted.
  With the -d option, the delimiter between item and freq. can be changed
  (default is "/").

  Note that every prefix of a longest pattern is also a sequential pattern.
  By default only the longest patterns are printed; with the -a option, you
  can obtain ALL patterns, including every prefix of the longest patterns.
  Here is an example:

  187 bar
  113 bar foo
  170 do
  134 do bar
  100 she
  501 i
  232 i by
  108 i by the

  With the -w option, the list of transaction IDs where each pattern
  occurs can be obtained. Here is an example:

* without -a option

<pattern>
<what>bar/187 foo/113</what>
<where>54 141 218 264 295 472 768 839 900 931</where>
</pattern>

* with -a option

<pattern>
<freq>187</freq>
<what>bar</what>
<where>54 141 218 264 295 472 768 839 900 931 .... </where>
</pattern>
<pattern>
<freq>113</freq>
<what>bar foo</what>
<where>54 141 218 264 295 472 768 839 900 931 .... </where>
</pattern>

  Each result is surrounded by a "<pattern>" tag.
  The pattern is in the "<what>" tag, and the transaction IDs are listed
  in the "<where>" tag.
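
Reading the default output:

  The default result format packs a longest pattern and all of its
  prefixes into one line. The Python sketch below is not part of
  prefixspan; it only illustrates how such a line could be expanded
  into the "SEQUENTIAL PATTERN : FREQUENCY" view shown above. The
  function name expand_line is hypothetical, and the sketch assumes
  the default output (no -a or -w option), with items that contain no
  whitespace.

```python
def expand_line(line, delimiter="/"):
    """Expand one result line into (prefix pattern, frequency) pairs.

    Illustrative helper, not part of prefixspan. `delimiter` should
    match the value passed to the -d option (default "/").
    """
    pattern = []
    expanded = []
    for token in line.split():
        # Split on the last delimiter so an item that itself contains
        # the delimiter still parses.
        item, freq = token.rsplit(delimiter, 1)
        pattern.append(item)
        # The frequency attached to an item is the frequency of the
        # whole prefix that ends at that item.
        expanded.append((tuple(pattern), int(freq)))
    return expanded


if __name__ == "__main__":
    for line in ["bar/187 foo/113", "i/501 by/232 the/108"]:
        for pattern, freq in expand_line(line):
            print(" -> ".join(pattern), ":", freq)
```

  For the line "do/170 bar/134", the helper yields the pattern (do)
  with frequency 170 and the pattern (do, bar) with frequency 134.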
