Posted by: ashun (阿顺), Board: DataMining
Subject: A Brief Introduction to Data Mining Terms (4)
Posted from: Nanjing University Lily BBS (Thu Aug 30 12:10:14 2001)
data
Values collected through record keeping or by polling, observing, or
measuring, typically organized for analysis or decision making. More simply,
data is facts, transactions and figures.
data format
Data items can exist in many formats such as text, integer and floating-point
decimal. Data format refers to the form of the data in the database.
data mining
An information extraction activity whose goal is to discover hidden facts
contained in databases. Using a combination of machine learning, statistical
analysis, modeling techniques and database technology, data mining finds
patterns and subtle relationships in data and infers rules that allow the
prediction of future results. Typical applications include market
segmentation, customer profiling, fraud detection, evaluation of retail
promotions, and credit risk analysis.
data mining method
Procedures and algorithms designed to analyze the data in databases.
DBMS
Database management system: software for storing, organizing, and retrieving
the data in databases.
decision tree
A tree-like way of representing a collection of hierarchical rules that lead
to a class or value.
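As a minimal sketch of this idea, the hierarchical rules of a very small
decision tree can be written as nested tests. The attribute names (income,
has_default), the threshold, and the class labels are hypothetical and chosen
only for illustration; they are not learned from any real data.

```python
def classify_credit_risk(income, has_default):
    """Walk a tiny hand-written decision tree and return a class label.

    The attributes, threshold, and labels are illustrative assumptions.
    """
    if has_default:        # root test: is there a prior default on record?
        return "high risk"
    if income < 30000:     # second-level test on income
        return "medium risk"
    return "low risk"      # leaf reached when both tests fail
```

Each path from the root test to a return statement corresponds to one rule,
for example "no default and income of at least 30000 leads to low risk".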
deduction
Deduction infers information that is a logical consequence of the data.
degree of fit
A measure of how closely the model fits the training data. A common measure is
R-squared (the coefficient of determination).
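One way to compute this measure is sketched below, using the usual definition
of one minus the ratio of the residual sum of squares to the total sum of
squares. The function and variable names are assumptions made for this
example.

```python
def r_squared(actual, predicted):
    """Return R-squared for paired lists of actual and predicted values.

    Assumes the actual values are not all identical; otherwise the total
    sum of squares is zero and the ratio is undefined.
    """
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left over after applying the model.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: variation of the actual values around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A model that reproduces the training values exactly scores 1, while a model
that always predicts the mean of the training values scores 0.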
dependent variable
The dependent variables (outputs or responses) of a model are the variables
predicted by the equation or rules of the model using the independent
variables (inputs or predictors).
deployment
After the model is trained and validated, it is used to analyze new data and
make predictions. This use of the model is called deployment.
dimension
Each attribute of a case or occurrence in the data being mined. Stored as a
field in a flat file record or a column of a relational database table.
discrete
A data item that has a finite set of values. Discrete is the opposite of
continuous.
discriminant analysis
A statistical method based on maximum likelihood for determining boundaries
that separate the data into categories.
entropy
A way to measure variability other than the variance statistic. Some decision
trees split the data into groups based on minimum entropy.
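A minimal sketch of how entropy can be computed and used to compare candidate
splits follows. It assumes Shannon entropy in bits over class labels, with a
split scored by the size-weighted average entropy of its groups; the function
names are chosen for this example only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def split_entropy(groups):
    """Size-weighted average entropy of the groups produced by a split."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * entropy(g) for g in groups)
```

A split that separates the classes cleanly, such as
split_entropy([["a", "a"], ["b", "b"]]), scores lower than one that leaves
each group mixed, so a tree builder following this approach would prefer it.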
exploratory analysis
Looking at data to discover relationships not previously detected. Exploratory
analysis tools typically assist the user in creating tables and graphical
displays.
external data
Data not collected by the organization, such as data available from a
reference book, a government source or a proprietary database.
feed-forward
A neural net in which the signals only flow in one direction, from the inputs
to the outputs.
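The one-directional flow can be sketched as a loop that passes a signal
through successive layers without ever feeding it back. The sigmoid
activation, the list-based layer layout, and all names below are assumptions
for illustration rather than a prescribed design.

```python
import math


def sigmoid(x):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, biases):
    """Output of one layer: one weight row and one bias per neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def feed_forward(inputs, layers):
    """Pass the signal through each (weights, biases) layer in order."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_output(signal, weights, biases)
    return signal
```

For example, feed_forward([0.0, 0.0], [([[1.0, 1.0]], [0.0])]) sends two
inputs through a single one-neuron layer; each layer's output becomes the
next layer's input and nothing flows backward.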
fuzzy logic
Fuzzy logic is applied to fuzzy sets, where membership in a set is a degree
between 0 and 1 rather than strictly 0 or 1. Non-fuzzy logic manipulates
outcomes that are either true or false. Fuzzy logic needs to be able to
manipulate degrees of "maybe" in addition to true and false.
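A minimal sketch of manipulating such degrees is given below. It assumes the
common min/max/complement operators, which are one choice among several and
are not named in the text above; degrees are represented as floats from 0 to
1, and the function names are hypothetical.

```python
def fuzzy_and(a, b):
    """Degree to which both statements hold (minimum operator)."""
    return min(a, b)


def fuzzy_or(a, b):
    """Degree to which at least one statement holds (maximum operator)."""
    return max(a, b)


def fuzzy_not(a):
    """Degree to which a statement does not hold (complement)."""
    return 1.0 - a
```

With degrees of exactly 0 and 1 these operators reduce to ordinary true/false
logic, while a value such as 0.7 expresses a "maybe" that leans toward true.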
--
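Mastery comes from diligence and is lost to idleness; success comes from
careful thought and is ruined by carelessness. -- Han Yu
Better to go home and weave a net than to stand by the water coveting the
fish. -- Ban Gu
Do not do evil because it seems small; do not neglect good because it seems
small. -- Liu Bei
※ Source: Nanjing University Lily BBS http://bbs.nju.edu.cn [FROM: 202.119.80.20]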