disc.py

From "orange source code, data mining techniques" · Python code · 23 lines

# Description: Entropy-based discretization compared to equal-frequency discretization
#              (the same number of instances in each interval)
# Category:    preprocessing
# Uses:        iris.tab
# Classes:     Preprocessor_discretize, EntropyDiscretization
# Referenced:  o_categorization.htm

import orange

def show_values(data, heading):
    # Print the heading, then each attribute's name followed by its interval labels.
    print heading
    for a in data.domain.attributes:
        print "%s: %s" % (a.name, ", ".join(list(a.values)))
        
data = orange.ExampleTable("iris")

data_ent = orange.Preprocessor_discretize(data, method=orange.EntropyDiscretization())
show_values(data_ent, "Entropy based discretization")
print

data_n = orange.Preprocessor_discretize(data, method=orange.EquiNDiscretization(numberOfIntervals=3))
show_values(data_n, "Equal-frequency intervals")
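
The equal-frequency method used above can be illustrated without Orange. The following is only a minimal sketch of the general idea, not Orange's implementation of `EquiNDiscretization`: the function names `equal_frequency_cut_points` and `assign_interval` are hypothetical, and placing each cut halfway between neighbouring values is an assumption made for the example.

```python
import bisect


def equal_frequency_cut_points(values, n_intervals):
    """Return at most n_intervals - 1 cut points that split the sorted
    values into intervals holding roughly the same number of instances."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for k in range(1, n_intervals):
        idx = k * n // n_intervals  # index of the first value of interval k
        if idx <= 0 or idx >= n:
            continue
        lo, hi = ordered[idx - 1], ordered[idx]
        if lo == hi:
            continue  # equal values cannot be separated by a cut point
        cut = (lo + hi) / 2.0  # assumption: cut halfway between neighbours
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts


def assign_interval(value, cuts):
    """Return the zero-based index of the interval that value falls into."""
    return bisect.bisect_right(cuts, value)


sample = [9, 1, 8, 2, 7, 3, 6, 4, 5]
cuts = equal_frequency_cut_points(sample, 3)  # [3.5, 6.5]
intervals = [assign_interval(v, cuts) for v in sample]
```

With nine distinct values and three intervals, each interval receives three instances. When many values are tied, this sketch simply drops the cut, so fewer intervals than requested may result.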
