Description of the files generated by the IBM data generator
(1) Associations and Sequential Patterns:
Code:
assoc.gen.tar.Z (26,286 bytes)
Usage:
gen lit|tax|seq [options]
gen lit|tax|seq -help    (prints a more detailed list of options)
lit: large (frequent) itemsets without taxonomies
tax: large (frequent) itemsets with taxonomies
seq: sequential patterns
Output Format:
There are two possible output formats for the data file, depending on whether the "-ascii" option is specified.
Binary
Consists of records of the form <CustID, TransID, NumItems, List-Of-Items>. Each of these values (including each entry in List-Of-Items) is a 4-byte integer.
Ascii
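Each line contains a CustID, TransID, and Item. Each of these takes up 10 bytes, for a total of 33 bytes per line (the 3 bytes beyond the three 10-byte fields are presumably the field separators and the line terminator).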
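The two layouts above can be read back with a short script. The following is only a sketch under stated assumptions, not part of the generator: the function names read_binary and read_ascii are hypothetical, the binary integers are assumed to be signed and little-endian (the generator writes raw integers, so the actual byte order depends on the machine it ran on), and the ASCII fields are assumed to be whitespace-separated decimal numbers.

```python
import io
import struct
from typing import BinaryIO, Iterable, Iterator, List, Tuple


def read_binary(stream: BinaryIO, byteorder: str = "<") -> Iterator[Tuple[int, int, List[int]]]:
    """Yield (cust_id, trans_id, items) records from the binary format.

    Assumption: 4-byte signed integers; byteorder "<" is little-endian,
    pass ">" if the file was written on a big-endian machine.
    """
    header = struct.Struct(byteorder + "3i")  # CustID, TransID, NumItems
    while True:
        raw = stream.read(header.size)
        if not raw:
            return  # clean end of file
        if len(raw) < header.size:
            raise ValueError("truncated record header")
        cust_id, trans_id, num_items = header.unpack(raw)
        if num_items < 0:
            raise ValueError("negative NumItems; wrong byte order?")
        body = stream.read(4 * num_items)
        if len(body) < 4 * num_items:
            raise ValueError("truncated item list")
        items = list(struct.unpack(f"{byteorder}{num_items}i", body))
        yield cust_id, trans_id, items


def read_ascii(lines: Iterable[str]) -> Iterator[Tuple[int, int, int]]:
    """Yield (cust_id, trans_id, item) triples, one per non-empty line.

    Assumption: the three fields are separated by whitespace.
    """
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}")
        cust_id, trans_id, item = (int(f) for f in fields)
        yield cust_id, trans_id, item


# Small in-memory example: two transactions of customer 1.
sample = struct.pack("<5i", 1, 1, 2, 10, 20) + struct.pack("<4i", 1, 2, 1, 30)
records = list(read_binary(io.BytesIO(sample)))
```

For a real data file, open it with open(path, "rb") for the binary format or open(path) for the ASCII format and pass the file object to the matching function. Note that the binary format stores one record per transaction, while the ASCII format stores one line per item.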
Apart from the data file, this program also generates a pattern file. The pattern file has three parts:
A description of the data.
A list of items with high weights. (Recall that the weight corresponds to the probability that the item will appear in an itemset.) Each line has the item number, followed by the weight.
A list of the itemsets/sequential patterns with high weight. (Recall that the weight corresponds to the probability that the itemset will appear in a transaction.) Each line has the weight, the expected confidence for rules generated from this itemset, and the itemset.