Posted by: ashun (阿顺), Board: DataMining
Subject: Glossary of Data Mining Terms (final part)
Posted from: Nanjing University Xiaobaihe (Lily) BBS (Thu Aug 30 12:16:02 2001)
test data
A data set independent of the training data set, used to fine-tune the
estimates of the model parameters (i.e., weights).
test error
The estimate of error based on the difference between the predictions of a
model on a test data set and the observed values in the test data set when
the test data set was not used to train the model.
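A minimal sketch of computing a test error. The glossary does not say which
error measure is meant, so mean squared error is an assumption here, and
hypothetical_model and the numbers are made up for illustration.

```python
def mean_squared_error(predictions, observed):
    """Average squared difference between predictions and observed values."""
    if len(predictions) != len(observed):
        raise ValueError("predictions and observed must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predictions, observed)) / len(observed)


def hypothetical_model(x):
    # Stand-in for an already trained model (assumption): predicts y = 2x.
    return 2.0 * x


# Held-out test data (made-up values, never used to train the model).
test_inputs = [1.0, 2.0, 3.0]
test_observed = [2.0, 4.5, 5.0]

test_predictions = [hypothetical_model(x) for x in test_inputs]
test_error = mean_squared_error(test_predictions, test_observed)
print(test_error)
```

The point is only that the error is measured on data the model did not see
during training; any other error measure could be swapped in.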
time series
A series of measurements taken at consecutive points in time. Data mining
products that handle time series incorporate time-related operators such as
a moving average. (Also see windowing.)
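As an illustration of a time-related operator, here is a sketch of a simple
(unweighted) moving average. The function name and the sample prices are
assumptions for the example, not taken from any particular product.

```python
def moving_average(values, window):
    """Simple moving average over each run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


weekly_prices = [10.0, 12.0, 11.0, 13.0, 15.0]  # made-up data
smoothed = moving_average(weekly_prices, window=3)
print(smoothed)
```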
time series model
A model that forecasts future values of a time series based on past values.
The model form and training of the model usually take into consideration
the correlation between values as a function of their separation in time.
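One common way to look at "correlation as a function of separation in time"
is the sample autocorrelation at each lag. The sketch below assumes the usual
textbook estimator (lagged products divided by the total sum of squares); the
glossary itself does not name a specific estimator, and the series is made up.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Products of deviations for pairs of points that are `lag` steps apart.
    numer = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom


series = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0]  # made-up data
acf = [autocorrelation(series, lag) for lag in range(4)]
print(acf)
```

For this series the correlation falls off as the lag grows, which is the kind
of structure a time series model would try to exploit.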
topology
For a neural net, topology refers to the number of layers and the number of
nodes in each layer.
training
Another term for estimating a model's parameters based on the data set at
hand.
training data
A data set used to estimate or train a model.
transformation
A re-expression of the data such as aggregating it, normalizing it, changing
its unit of measure, or taking the logarithm of each data item.
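The four kinds of transformation listed above can be sketched as follows.
Min-max scaling is just one way to normalize and is an assumption here, as
are the helper names and the sample figures.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant list")
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Natural logarithm of each item; all items must be positive."""
    return [math.log(v) for v in values]


def aggregate_sum(values, group_size):
    """Sum consecutive groups of `group_size` items (last group may be short)."""
    return [
        sum(values[i:i + group_size]) for i in range(0, len(values), group_size)
    ]


daily_sales = [100.0, 200.0, 400.0, 300.0]  # made-up data, in dollars

normalized = min_max_normalize(daily_sales)
logged = log_transform(daily_sales)
in_thousands = [v / 1000.0 for v in daily_sales]  # change of unit
two_day_totals = aggregate_sum(daily_sales, 2)    # aggregation
print(normalized, two_day_totals)
```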
unsupervised learning
This term refers to the collection of techniques where groupings of the data
are defined without the use of a dependent variable. Cluster analysis is an
example.
validation
The process of testing the models with a data set different from the training
data set.
variance
The most commonly used statistical measure of dispersion. The first step is
to square the deviation of each data item from the average value. Then the
average of the squared deviations is calculated to obtain an overall measure
of variability.
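The two steps described above, written out as a sketch. Dividing by the
number of items gives what is usually called the population variance; the
sample variance divides by one less than that. The data values are made up.

```python
import statistics


def variance(values):
    """Average of the squared deviations from the mean (population variance)."""
    if not values:
        raise ValueError("variance requires at least one value")
    mean = sum(values) / len(values)
    # Step 1: square the deviation of each item from the mean.
    squared_deviations = [(v - mean) ** 2 for v in values]
    # Step 2: average the squared deviations.
    return sum(squared_deviations) / len(values)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
result = variance(data)
print(result, statistics.pvariance(data))
```

Python's standard library already provides statistics.pvariance for the same
calculation; the hand-written version is only there to show the steps.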
visualization
Visualization tools graphically display data to facilitate better
understanding of its meaning. Graphical capabilities range from simple
scatter plots to complex multi-dimensional representations.
windowing
Used when training a model with time series data. A window is the period of
time used for each training case. For example, if we have weekly stock price
data that covers fifty weeks, and we set the window to five weeks, then the
first training case uses weeks one through five and compares its prediction
to week six. The second case uses weeks two through six to predict week
seven, and so on.
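The fifty-week example can be sketched like this. The function name
make_windows is hypothetical, and the week numbers stand in for the actual
prices so that it is easy to see which weeks land in each training case.

```python
def make_windows(series, window):
    """Split a series into (inputs, target) training cases.

    Each case uses `window` consecutive values as inputs and the value that
    immediately follows them as the target to predict.
    """
    if window <= 0 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


weeks = list(range(1, 51))  # week numbers 1..50 standing in for prices
cases = make_windows(weeks, window=5)

print(cases[0])    # inputs: weeks 1-5, target: week 6
print(cases[1])    # inputs: weeks 2-6, target: week 7
print(len(cases))  # number of training cases
```

With fifty weeks and a five-week window this gives forty-five training cases,
the last one using weeks 45 through 49 to predict week 50.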
--
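Mastery comes from diligence and is lost through idleness; deeds succeed
through thought and fail through carelessness. -- Han Yu (韩愈)
Better to go home and weave a net than to stand by the pond longing for
fish. -- Ban Gu (班固)
Do not do evil because it is small; do not neglect good because it is
small. -- Liu Bei (刘备)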
※ Source: Nanjing University Xiaobaihe (Lily) BBS http://bbs.nju.edu.cn [FROM: 202.119.80.20]