Posted by: ccipt (北方的狼), Board: DataMining
Subject: Data Mining on the Web
Posted from: Nanjing University Lily BBS (Mon Aug 27 10:55:29 2001)
Data Mining on the Web
There's Gold in that Mountain of Data
By Dan R. Greening
When visitors interact with your site, they provide information about
themselves and how they respond to your content: which links visitors click,
where they spend most of their time, which search terms they use, and when
they browse. Some visitors may even fill out a lifestyle survey or provide
names and addresses. Complex content also contains important information,
such as words in articles, job descriptions and resumes, and features of
competitive or complementary products. All this information is often stored
in a database.
As a result, you have a lot of information on your Web visitors and content,
but you probably aren't making the best use of it. Data warehouse reporting
systems, such as those provided by traffic analyzers, aggregate and report
facts over different dimensions. (See my article titled "Tracking Users," Web
Techniques, July 1999.)
These warehouse reporting systems are commonly called online analytical
processing (OLAP) systems. OLAP systems can report only on directly observed
and easily correlated information. They rely on you to discover patterns and
decide what to do with them. OLAP systems won't tell you that people
frequently buy potato chips, onion soup mix, and sour cream at the same time,
and they won't discover that some people love any movie that contains an
explosion. The information is simply too complex for humans to discover these
patterns using an OLAP system.
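As a rough illustration of the kind of pattern an OLAP report will not surface
on its own, the sketch below counts how often pairs of products appear in the
same checkout basket. The function name, the support threshold, and the basket
data are all invented for this example; real market-basket tools use more
scalable algorithms than this brute-force count.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each unordered pair counts once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical checkout baskets, invented for illustration.
baskets = [
    ["potato chips", "onion soup mix", "sour cream"],
    ["potato chips", "sour cream", "soda"],
    ["onion soup mix", "sour cream", "potato chips"],
    ["bread", "milk"],
]

pairs = frequent_pairs(baskets, min_support=2)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

With this toy data, only the three chips, soup mix, and sour cream pairs reach
the threshold of two baskets; every other pair occurs once and is dropped.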
To solve this problem, marketers and business analysts use data-mining
techniques. These are machine learning algorithms that find buried patterns in
databases, and report or act on those findings. There are many data-mining
techniques, and it's difficult for one person to understand the entire field.
The best we can do in one article is provide an introduction to the problems
that data-mining techniques can solve, mention the techniques usually applied
to those problems, and give some insight into vendors offering solutions.
Know Your Visitor
To use data mining on your Web site, you have to establish and record visitor
and item characteristics, and visitor interactions.
Visitor characteristics include demographics, psychographics, and
technographics. Demographics are tangible attributes such as home address,
income, purchasing responsibility, or recreational equipment ownership.
Psychographics are personality types that might be revealed in a psychological
survey, such as highly protective feelings toward children (commonly called
"gatekeeper moms"), impulse-buying tendencies, early technology interest, and
so on. Technographics are attributes of the visitor's system, such as
operating system, browser, domain, and modem speed. If you have a phone number
or address, you can sometimes obtain household demographic or psychographic
information through direct marketing service providers, such as Webcraft or
Acxiom. Business demographics are available through Dun & Bradstreet.
Item characteristics include Web content information -- media type, content
category, URL -- as well as product information -- SKU (stock-keeping unit,
basically a product number), product category, color, size, price, margin,
available quantities, promotion level, and so on.
Visitor statistics accumulate when visitors interact with items, the Web site,
or the company. Visitor-item interactions include purchase history,
advertising history, and preference information. Purchase history is a list of
products and purchase dates. Advertising history indicates which items were
shown to a visitor. Preference information refers to item ratings provided by
a visitor. Click-stream information is a history of hyperlinks that a visitor
has clicked on. Link opportunities are hyperlinks that have been presented to
a visitor.
Visitor-site statistics are typically per-session characteristics, such as
total time, pages viewed, revenue, and profit per session with a visitor.
Visitor-company information might include total number of customer referrals
from a visitor, total profit, total page views, number of visits per month,
last visit, and so on. Visitor-company information can include brand
measurements. Brand associations, for example, are lists of positive or
negative concepts a visitor associates with the brand, which can be measured
by surveying visitors periodically. Permissions are attributes that a visitor
provides indicating how marketing information contributed by the visitor can
be used, such as permission to send email, to share information with marketing
partners, and so on.
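One possible way to organize the records described above is sketched below.
The class and field names are assumptions made for this example, not a schema
from any particular product; the point is only that visitor characteristics,
item characteristics, and interactions are kept as separate, linkable records,
and that permissions are checked before marketing data is used.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Visitor:
    visitor_id: str
    demographics: dict = field(default_factory=dict)    # e.g. income, region
    psychographics: dict = field(default_factory=dict)  # e.g. impulse buyer
    technographics: dict = field(default_factory=dict)  # e.g. browser, OS
    permissions: set = field(default_factory=set)       # e.g. {"email"}


@dataclass
class Item:
    item_id: str      # a URL for content, or a SKU for a product
    category: str
    price: float = 0.0


@dataclass
class Interaction:
    visitor_id: str
    item_id: str
    kind: str          # hypothetical values: "shown", "click", "purchase", "rating"
    timestamp: datetime
    value: float = 0.0  # a rating or a purchase amount, depending on kind


def may_email(visitor):
    """True only if the visitor has granted permission to receive email."""
    return "email" in visitor.permissions


# Hypothetical records, invented for illustration.
alice = Visitor("v1", technographics={"browser": "Netscape"}, permissions={"email"})
banner = Item("sku-100", category="software", price=49.0)
shown = Interaction(alice.visitor_id, banner.item_id, "shown", datetime(2001, 8, 27))
print(may_email(alice))
```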
If you do nothing else in response to this article, I urge you to do two
things: First, decide how you might use information recorded about your site's
visitors, write a privacy statement, and make that statement available on your
Web site. See www.truste.org for assistance. Think about privacy from the
visitor's point of view. Visitors prefer to view products and pages that
interest them, so they usually share information for that purpose. However,
they typically want you to ask for permission before sending them marketing
email, or sharing their contact information with partner companies. If you
provide a privacy statement documenting your intended uses, and give visitors
an email address for comments, your visitors will let you know whether the
policy is acceptable.
Second, record the data now, even if you do not have a data-mining process in
place. You will find most data-mining tool vendors allow for an initialization
step in which they incorporate historical data into your data-mining system.
List Your Goals
The great advantage of Web marketing is that you can measure visitor
interactions more effectively than in brick-and-mortar stores or direct mail.
Data mining works best when you have clear, measurable goals. The following
are some goals you might consider:
Increase average page views per session;
Increase average profit per checkout;
Decrease products returned;
Increase number of referred customers;
Increase brand awareness;
Increase retention rate (such as number of visitors that have returned within 30 days);
Reduce clicks-to-close (average page views to accomplish a purchase or obtain desired information);
Increase conversion rate (checkouts per visit).
If you've instrumented your site to record the visitor, content, and
interaction characteristics, and you've determined a set of measurable
marketing goals, congratulations! You are farther along than most marketers.
Now you can gain value from data mining.
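Several of the goals listed above reduce to simple arithmetic over
per-session records. The sketch below is one way to compute them; the record
layout (page_views, checked_out, profit) and the sample numbers are
assumptions made for this example.

```python
def session_metrics(sessions):
    """Compute a few of the goal metrics from a list of per-session records."""
    if not sessions:
        raise ValueError("need at least one session")
    visits = len(sessions)
    checkouts = [s for s in sessions if s["checked_out"]]
    metrics = {
        # Average page views per session, over all sessions.
        "avg_page_views_per_session": sum(s["page_views"] for s in sessions) / visits,
        # Conversion rate: checkouts per visit.
        "conversion_rate": len(checkouts) / visits,
    }
    if checkouts:
        # Average profit per checkout, over sessions that ended in a checkout.
        metrics["avg_profit_per_checkout"] = (
            sum(s["profit"] for s in checkouts) / len(checkouts)
        )
        # Clicks-to-close: average page views in sessions that ended in a checkout.
        metrics["clicks_to_close"] = (
            sum(s["page_views"] for s in checkouts) / len(checkouts)
        )
    return metrics


# Hypothetical session log, invented for illustration.
sessions = [
    {"page_views": 4, "checked_out": False, "profit": 0.0},
    {"page_views": 10, "checked_out": True, "profit": 12.0},
    {"page_views": 6, "checked_out": False, "profit": 0.0},
    {"page_views": 8, "checked_out": True, "profit": 8.0},
]

metrics = session_metrics(sessions)
print(metrics)
```

Tracking these numbers before and after a change gives you the measurable
baseline that the data-mining techniques discussed later try to improve.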
Understand Your Problem
The first step to solving a problem is articulating the problem clearly.
Common problems Web marketers want to solve are how to target advertisements,
personalize Web pages, create Web pages that show products often bought
together, classify articles automatically, characterize groups of similar
visitors, estimate missing data, and predict future behavior. All involve
discovering and leveraging different kinds of hidden patterns.
Targeting. Marketers use targeting to select the people receiving a fixed
advertisement, to increase profit, brand recognition, or other measurable
outcome. Targeting on the Web must account for differing ad space costs. Web
sites with valuable visitors typically charge more for ad space.
On sites where visitors register, advertisers can target on the basis of
demographics. For example, people living in different parts of the country or
visiting different Web sites may have differing propensities to purchase
sports-team-branded apparel, gay travel tours, or discount car parts.
Therefore, if you target the people most likely to purchase your product, you
can reduce your cost for an ad campaign and increase the total profit.
Some sites let you target ads on the basis of IP address, under the theory
that DNS registration information or surveys provide the physical location of
the IP address. However, because national dial-up ISPs often share a pool of
IP addresses, this is not a reliable method. As we say in the business, "Half
the U.S. population lives in Vienna, Virginia" (AOL's corporate address).
Data mining can help you select the targeting criteria for an ad campaign. Web
publications have a set of variables by which they can target advertisements.
By performing a test ad using "run-of-site" (that is, untargeted) ad space,
you can associate demographic variables with conversion. People "convert" when
they accomplish the marketing goal, such as performing a click-through,
purchase, registration, and so on. Data mining can identify the combination of
criteria that maximizes the profit. For example, data mining might discover
that targeting based on the logical expression
(java-consultant) or
(software-engineer and
purchasing-authority < 10,000)
will increase the click-through on a JavaBean banner ad.
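To make the idea concrete, the sketch below writes the expression above as a
predicate and compares click-through rates on a run-of-site test. It only
evaluates a rule that has already been found; discovering the rule is the job
of the data-mining tool. The field names and the impression data are invented
for this example.

```python
def matches_target(visitor):
    """The example targeting expression from the text, written as a predicate."""
    return visitor["occupation"] == "java-consultant" or (
        visitor["occupation"] == "software-engineer"
        and visitor["purchasing_authority"] < 10_000
    )


def click_through_rate(impressions):
    """Fraction of impressions that were clicked; 0.0 for an empty list."""
    if not impressions:
        return 0.0
    return sum(1 for i in impressions if i["clicked"]) / len(impressions)


# Hypothetical run-of-site test impressions, invented for illustration.
impressions = [
    {"occupation": "java-consultant", "purchasing_authority": 50_000, "clicked": True},
    {"occupation": "software-engineer", "purchasing_authority": 5_000, "clicked": True},
    {"occupation": "software-engineer", "purchasing_authority": 5_000, "clicked": False},
    {"occupation": "software-engineer", "purchasing_authority": 20_000, "clicked": False},
    {"occupation": "accountant", "purchasing_authority": 1_000, "clicked": False},
    {"occupation": "accountant", "purchasing_authority": 1_000, "clicked": True},
    {"occupation": "teacher", "purchasing_authority": 0, "clicked": False},
    {"occupation": "teacher", "purchasing_authority": 0, "clicked": False},
]

targeted = [i for i in impressions if matches_target(i)]
untargeted_ctr = click_through_rate(impressions)
targeted_ctr = click_through_rate(targeted)
print(f"run-of-site: {untargeted_ctr:.3f}, targeted: {targeted_ctr:.3f}")
```

In this toy data the rule selects three of the eight impressions, and two of
those three were clicked, so the targeted rate is higher than the run-of-site
rate while paying for fewer impressions.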
There is a huge variety of data-mining tools that support targeting, because
targeting is extensively used in direct mail marketing.
Personalization. Marketers use personalization to select the advertisements to
send to a person, to maximize some measurable outcome. Here we use
"advertisement" loosely to refer to any recommendation or item offered by a
site. Even a simple hyperlink in a menu or an article could be considered an
advertisement.
Personalization is the converse of targeting. Targeting optimizes the types of
people that will see an advertisement, reducing cost by showing the
advertisement to fewer people than a broader campaign would. It is most useful
for prospecting -- finding people who haven't visited your site yet -- because
there's a cost to advertising on outside Web sites. But targeting is pointless
on your own site, where advertisements are free. Why would you not show your
products to a person visiting your own site?
In contrast, personalization optimizes the advertisements that a person sees,
raising revenue because the person sees more interesting stuff.
Personalization can be used for external advertising, but you're more likely
to use it on your own site. External sites don't usually give you enough
information about individual visitors to do good personalization.
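The contrast with targeting can be sketched in a few lines: targeting fixes
the advertisement and chooses the people, while personalization fixes the
person and chooses the advertisement. The example below picks, for a given
visitor segment, the advertisement with the highest estimated response rate.
The segment names, advertisement names, and rates are all invented, and a real
system would estimate the rates from recorded interactions.

```python
def best_ad_for(segment, response_rates, default_ad):
    """Pick the ad with the highest estimated response rate for a segment."""
    rates = response_rates.get(segment)
    if not rates:
        # Nothing is known about this segment, so fall back to a default ad.
        return default_ad
    return max(rates, key=rates.get)


# Hypothetical estimated response rates per segment, invented for illustration.
response_rates = {
    "java-consultant": {"javabean-banner": 0.06, "car-parts": 0.01},
    "sports-fan": {"javabean-banner": 0.005, "team-apparel": 0.04},
}

print(best_ad_for("java-consultant", response_rates, "house-ad"))
print(best_ad_for("sports-fan", response_rates, "house-ad"))
print(best_ad_for("unknown", response_rates, "house-ad"))
```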
Some personalization systems, such as Broadvision One-to-One, rely on the
marketer to write rules for tailoring advertisements to visitors. These are
"rules-based personalization systems." If you have historical information, you
can buy data-mining tools from a third party to generate the rules.
Rules-based personalization systems are usually deployed in situations where
there are limite