From: iamgufeng (古风), Board: DataMining
Subject: Re: What to read next
Posted at: 南京大学小百合站 (Nanjing University Lily BBS) (Sun Dec 23 17:16:04 2001), local post
After reading some papers on data preparation, frequent traversal paths,
etc. (some perhaps written by roamingo :)), many ideas about web usage mining
have gradually become clearer. However, the methods mentioned previously are
better suited to static web sites. For my company's site, which is built on
php+asp/sql, there are at least the following obstacles:
1. My registered users are not the users recorded in the log. How can I
extract these registered users' sessions correctly?
2. A dynamic page covers many information categories, so the URL alone cannot
identify the detailed content. Classifying by the query string parameters
after "?" may be an alternative, but what about requests made with the "POST"
method?
3. Much of the users' dynamic operation information cannot be logged fully.
4. ...
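
A rough sketch of the query string idea in item 2, written in Python only for
illustration (the site itself is php+asp). The parameter names "cat" and "id"
and the example URL are made up; which parameters really decide the content
category depends on the site. This does nothing for "POST" requests, whose
parameters never reach the URL in the server log.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical: the query parameters assumed to decide the content category.
CATEGORY_PARAMS = ("cat", "id")

def page_key(url, params=CATEGORY_PARAMS):
    """Return the path plus only the selected query parameters.

    Parameters not listed in `params` (session ids, tracking values, ...)
    are dropped, so requests for the same content map to the same key.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))  # last value wins on duplicates
    kept = [(name, query[name]) for name in params if name in query]
    suffix = "&".join(f"{name}={value}" for name, value in kept)
    return f"{parts.path}?{suffix}" if suffix else parts.path

key = page_key("/news.php?id=42&sid=abc&cat=sports")
```

Here `key` keeps "cat" and "id" and drops "sid", so two visitors with
different session ids are counted as viewing the same page.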
Perhaps we need to integrate mining considerations into the site design. We
could write some code that generates the log info we need whenever events
occur. In conjunction with the original server log, valuable data could then
be extracted for subsequent mining.
But how would this affect performance?
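
To make the idea concrete, here is a minimal sketch of such an application
level event log, again in Python for illustration only. The function name
`log_event`, the field names, and the one-JSON-record-per-line format are all
my own assumptions, not an existing API. The point is that the application
knows the registered user id and the "POST" parameters, which the raw server
log does not.

```python
import io
import json
import time

def log_event(stream, user_id, session_id, event, **details):
    """Append one application-level event as a single JSON line.

    `stream` is any writable text file object. The field names are
    illustrative assumptions, not a standard log format.
    """
    record = {
        "ts": time.time(),         # event time, seconds since the epoch
        "user_id": user_id,        # registered user id known to the app
        "session_id": session_id,  # application session id
        "event": event,            # e.g. "search", "add_to_cart"
        "details": details,        # e.g. POST parameters of interest
    }
    stream.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record

# Demo with an in-memory buffer; a real site would append to a log file.
buf = io.StringIO()
log_event(buf, "u1001", "s42", "search", query="data mining", method="POST")
```

One appended line per event should be cheap compared with rendering a dynamic
page, but I have not measured it, so the performance question stays open.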
【 roamingo (漫步鸥) wrote: 】
: I happened to be running Apache under Linux, and it is also very easy to
: customize the log file format under Apache. I have put the cookie
: field in it, and it makes unique user identification very accurate.
: Of course, you still have to use the timestamp field to perform session
: identification. However, the cookie field will always change for those users
: who turn the cookie acceptance option off in their browsers. Those
: log entries have to be discarded or treated differently.
: If no cookie information is available, the IP field with optional
: information such as "referer" and "user agent" can be used to do
: session identification, which is more adaptive but not so accurate. Mobasher
: and Cooley's work around 1998 has a detailed discussion of this topic.
: This way is also useful for carrying out experiments on some external
: weblog data, such as the freely available http log from Berkeley CS server.
: http://www.cs.berkeley.edu/logs/
: Ronny Kohavi has proposed some new insights on the level of information
: at which the web usage mining algorithm should be carried out. He thinks it
: is better to do it at the E-Commerce application level, not the raw log file
: level.
: // sorry for my absence. I have just finished my thesis draft yesterday.
: 【 iamgufeng (古风) wrote: 】
: (quoted text omitted ... ...)
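
The session identification described in the quote can be sketched roughly as
follows. The entry fields ("cookie", "ip", "user_agent", "ts") and the 30
minute inactivity timeout are assumptions for illustration; the quote only
says that the timestamp field is used, not which threshold. Entries from users
whose cookie changes on every request are not treated specially here.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed threshold, not from the quote

def user_key(entry):
    """Identify the user by cookie when present, else by IP + user agent."""
    cookie = entry.get("cookie")
    if cookie:
        return ("cookie", cookie)
    return ("ip_ua", entry["ip"], entry.get("user_agent", ""))

def sessionize(entries, timeout=SESSION_TIMEOUT):
    """Split each user's requests into sessions at gaps longer than `timeout`.

    `entries` is a list of dicts with a numeric "ts" field (seconds).
    Returns a list of (user_key, [entries]) pairs.
    """
    by_user = defaultdict(list)
    for entry in entries:
        by_user[user_key(entry)].append(entry)

    sessions = []
    for key, hits in by_user.items():
        hits.sort(key=lambda e: e["ts"])
        current = [hits[0]]
        for hit in hits[1:]:
            if hit["ts"] - current[-1]["ts"] > timeout:
                sessions.append((key, current))  # gap too long: close session
                current = [hit]
            else:
                current.append(hit)
        sessions.append((key, current))
    return sessions
```

The cookie key is preferred because, as the quote says, IP plus user agent is
less accurate: several users behind one proxy would be merged into one.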
--
※ Modified: .iamgufeng edited this post on Dec 23 17:17:20. [FROM: 210.83.133.26]
※ Source: .南京大学小百合站 (Nanjing University Lily BBS) bbs.nju.edu.cn. [FROM: 210.83.133.26]