Poster: zrs (tita), Board: DataMining
Subject: some of my experience using SAS in DM
Posted at: Nanjing University Lily BBS (Wed Mar 6 15:57:58 2002)
Written by JESICA8 on SASOR; I thought it was quite good.
Let me share some of my experience using SAS in DM. By jesica8
1. The most exciting part of DM is generating ideas, hypotheses, or discoveries.
I worked in the banking industry in the USA for many years, especially in the
credit card business. The largest DB I have had access to was about 220 TB; the
smallest was 500 KB. Databases are often large, but they do not have to be at
the TB level. OLAP may help, but it is not required.
2. You do need to be very good at handling large data sets using Base SAS and
SAS Macro. That means more than being very good at coding: sometimes you need
to understand the hardware too. For example, if you are joining different
tables in an MS SQL DB, whether the computer has a SCSI, EIDE, or UMA HD will
largely decide what kind of SAS code you write to pass the SQL through to the
database.
3. DM is often considered the interface between computer science and
statistics. You need to be very good at both to excel at DM. If you are finding
something that others can find, you are "worthless". Yes, it is not easy to be
the best at both CS and statistics, but if it were easy, why would you be
entering DM? So what I mean by DM is knowledge discovery, rather than the
visual end or spreadsheet-centric delivery of results.
4. You need to be very good at project management. Yes, you can try SAS/EM to
get a taste of how to manage a project, but please don't spend too much time on
Enterprise Miner if you have little experience using SAS. The bad thing about
EM is that it gives you the impression that you can master DM without going
through systematic training and arduous trial and error. It hides the earnest
effort that most so-called data miners try to skip, but that is required to
become a true, higher-level miner. A very good miner is one who masters his or
her business domain, has proper training in CS or statistics or both, has very
good detective abilities and, most of all, is among the hardest workers. There
is art involved in DM.
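As a rough illustration of point 2, here is a minimal sketch of the trade-off between letting the database do a join (what SQL pass-through gives you) and pulling every table to the client and joining there. It is not SAS code and not the author's code: Python's built-in sqlite3 module and an in-memory SQLite database stand in for SAS and the MS SQL server, and the table names, column names, and rows are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, segment TEXT);
        CREATE TABLE accounts (acct_id INTEGER PRIMARY KEY,
                               cust_id INTEGER, balance REAL);
        INSERT INTO customers VALUES (1, 'prime'), (2, 'subprime'), (3, 'prime');
        INSERT INTO accounts VALUES (10, 1, 500.0), (11, 1, 250.0), (12, 2, 75.0);
        """
    )
    return conn


def join_in_database(conn):
    """Pass-through style: the database joins and aggregates,
    so only the small result set travels to the client."""
    sql = """
        SELECT c.cust_id, c.segment, SUM(a.balance) AS total_balance
        FROM customers c
        JOIN accounts a ON a.cust_id = c.cust_id
        GROUP BY c.cust_id, c.segment
        ORDER BY c.cust_id
    """
    return conn.execute(sql).fetchall()


def join_on_client(conn):
    """Client-side style: fetch both tables in full, then join locally."""
    segments = dict(conn.execute("SELECT cust_id, segment FROM customers"))
    totals = {}
    for cust_id, balance in conn.execute("SELECT cust_id, balance FROM accounts"):
        if cust_id in segments:  # inner-join semantics
            totals[cust_id] = totals.get(cust_id, 0.0) + balance
    return [(cid, segments[cid], totals[cid]) for cid in sorted(totals)]


if __name__ == "__main__":
    conn = build_demo_db()
    print(join_in_database(conn))
    print(join_on_client(conn))
```

Both functions return the same rows. What differs is where the work happens and how much data moves, and which one wins on real tables depends on the server's disks, the network, and the client machine, which is the hardware point made above.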
I can talk on and on. Back to SAS: I was "forced" to become a good coder
because I have always wanted to build models (estimation or prediction) the way
I wanted. Don't count on others to prepare data or tools. The worst enemy in
becoming a good data miner is teamwork. Be versatile. I just came back from
vacation and got a new project asking me to build a Bayesian averaging scoring
system for segmentation purposes in a refinance market. I am reading the
customer data myself and I will present the final results myself.
There are many papers or talks that are useless (I have not read any Chinese
papers on DM yet, so I am talking about the English literature). Say, in the
field of cluster analysis, 80%+ of the papers that have been published and
classified under that name turn out to be "worthless" after all. How many SAS
users know something about proc modeclus, fastclus, or aceclus? I failed many
times with them, so I do.
--
※ Source: Nanjing University Lily BBS http://bbs.nju.edu.cn [FROM: 61.174.148.62]