From: yaomc (白头翁&山东大汉), Board: DataMining
Subject: We can't wait---my opinion about data--jiaqi.
Site: Nanjing University Lily BBS (Tue Apr 2 11:00:51 2002), local post
Hi,
I have something to say about the data that can be used in research.
There are actually lots of benchmark datasets online. You can search
through Google; or, if you are particularly interested in an algorithm,
you can even ask the authors of the relevant paper for their dataset.
Generally they are happy to provide the dataset they used, so that you
can use it for further study, experiments and research.
After getting the datasets, I am sure some people will argue that this
kind of data is not large-scale, or lacks some other characteristic.
Others will argue that they don't know the domain behind the data, so
how can they do research with it? It is very common to think this way;
I went through this stage myself. Sooner or later, you will find that
whether the data is perfect is not important; instead, the key point is
to play with the data in a systematic way. You can play with data at a
very abstract level, say, knowing nothing about the background; you can
change the dataset a little bit and see how the performance of the
algorithm changes; you can apply the same algorithm, say Apriori, to
market basket data, or you can apply it to website log file data...
These are just a few examples.
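To make "playing with the data" a bit more concrete, here is a rough
sketch in Python. Everything in it is an assumption for illustration:
the toy baskets are made up, frequent_itemsets and drop_items are
hypothetical helper names, and the miner is a simplified Apriori-style
level-wise search rather than a tuned implementation. It mines frequent
itemsets, then randomly drops items from the baskets and counts how
many frequent itemsets survive at each noise level.

```python
from itertools import combinations
import random


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets whose support >= min_support.

    Simplified Apriori-style search: grow candidates one item at a time
    and keep only those whose smaller subsets were all frequent.
    """
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs anywhere.
    candidates = {frozenset([item]) for t in sets for item in t}
    result = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in sets if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items()
                 if cnt / n >= min_support}
        result.update(level)
        k += 1
        # Join step: unions of two frequent sets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
    return result


def drop_items(transactions, rate, seed=0):
    """Perturb the data: drop each item independently with probability rate."""
    rng = random.Random(seed)
    return [[i for i in t if rng.random() >= rate] for t in transactions]


# Made-up market basket data, purely for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]

baseline = frequent_itemsets(baskets, min_support=0.6)

# Change the data a little bit and see how the result changes.
for rate in (0.0, 0.1, 0.3):
    noisy = drop_items(baskets, rate)
    found = frequent_itemsets(noisy, min_support=0.6)
    print(f"drop rate {rate}: {len(found)} frequent itemsets")
```

The same function could be pointed at website log data by treating the
pages visited in one session as one "basket"; nothing in the code
depends on the items being groceries.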
For real-world applications, things are different. I don't want to go
further into that world.
My suggestion for those who are doing research in this fantastic field
is: don't wait. Start now: search for the data, ask for the data, read
papers, learn how others play with data, and play with it yourself.
Jiaqi
--
Welcome to http://datamining.bbs.lilybbs.net.
※ Source: Nanjing University Lily BBS bbs.nju.edu.cn [FROM: 202.204.36.15]