From: ccipt (北方的狼), Board: DataMining
Subject: Search Engines Comparison 2001
Posted from: Nanjing University Lily BBS (Tue Sep  4 09:59:54 2001)

Search Engines Comparison 2001

By Diana Botluk

Diana Botluk is a lawyering skills instructor at the Catholic University of
America School of Law in Washington, D.C., and is the author of The Legal
List: Research on the Internet. She teaches legal research at CAPCON, Catholic
University Law School, and the University of Maryland.

Published August 1, 2001

 

At first glance, using a general search engine to locate information on the
web seems easy. But getting a search engine to work with precision is another
story. General search engines come packed with features that are often
underutilized, but can be helpful in increasing search precision. The features
differ from engine to engine, and skilled researchers will adjust their search
strategy to take advantage of these differences depending on the type of
results sought. This article will explain the differences in some of the
available features, then examine a few major search engines in light of these
features.

Searching Features

Alternative/Inclusive Default

When you type two words into a search engine box without any connectors, how
does the engine put them together? Will it find only those pages where both
words appear, or will it find pages where either word appears? Search engines
with an inclusive default treat two separately typed words as if there were an
AND between the words, while search engines with an alternative default treat
the same two words as if there were an OR between the words. Thus, the results
for the same search typed into two different search engines can be enormously
different because one is inclusive, and the other alternative.

Inclusive Default Search Engines

Google	HotBot	Lycos

Alternative Default Search Engines

AltaVista	Excite

Many search engines allow a researcher to designate alternative or inclusive
through the use of the connectors OR and AND. Inclusion can also be designated
using a plus sign as a word modifier:

apple OR blueberry

apple AND blueberry

+apple +blueberry
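
The difference between the two defaults can be sketched in a few lines of
Python. This is only a toy model: the PAGES collection and the search function
are invented for illustration and do not reflect how any engine named above is
actually implemented.

```python
# Toy collection: page name -> set of words on the page (illustrative data only).
PAGES = {
    "page1": {"apple", "pie", "recipe"},
    "page2": {"blueberry", "muffin"},
    "page3": {"apple", "blueberry", "tart"},
}

def search(terms, default="AND"):
    """Return sorted page names matching the terms under the given default."""
    # An inclusive default requires every term; an alternative default needs one.
    combine = all if default == "AND" else any
    return sorted(name for name, words in PAGES.items()
                  if combine(t in words for t in terms))

inclusive = search(["apple", "blueberry"], default="AND")
alternative = search(["apple", "blueberry"], default="OR")
print(inclusive)    # pages containing both words
print(alternative)  # pages containing either word
```

The same two typed words return one page under the inclusive default and all
three under the alternative default, which is why identical queries can behave
so differently from engine to engine.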

Keyword/Concept Default

Some search engines use automatic concept searching as a default. Many
advanced online researchers are accustomed to keyword searching, where the
exact string of characters typed in is searched. Thus, an advanced researcher
who unwittingly uses a search engine with a concept searching default can
become frustrated. Concept searching occurs when the engine not only searches
for the exact character string, but also for word forms, and even synonyms and
other words that statistically appear with the typed word.

Keyword Search Default Search Engines

AltaVista	Google	HotBot	Lycos

Concept Search Default Search Engines

AltaVista (for some searches)	Excite

Exclusion

Most search engines allow exclusion of search results that contain certain
terms. In many engines this is done by placing a minus sign or the word NOT in
front of the term to be excluded. This feature should be used sparingly to
avoid eliminating relevant results that might have a casual mention of the
excluded term. Note that a minus sign modifies a single word, while NOT is a
connector between words:

pie -apple

pie NOT apple
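
The caution about casual mentions can be illustrated with a small Python
sketch. The three page texts and the search_excluding function are
hypothetical, made up only to show the effect.

```python
# Illustrative page texts; page "b" is about cherry pie but mentions apple once.
PAGES = {
    "a": "classic apple pie with cinnamon",
    "b": "cherry pie (an apple pie fan favorite too)",
    "c": "pecan pie for the holidays",
}

def search_excluding(include, exclude):
    """Return pages that contain `include` but not `exclude`, like pie -apple."""
    hits = []
    for name, text in PAGES.items():
        # Strip the parentheses so "(an" and "too)" become plain words.
        words = set(text.lower().replace("(", " ").replace(")", " ").split())
        if include in words and exclude not in words:
            hits.append(name)
    return sorted(hits)

result = search_excluding("pie", "apple")
print(result)
```

Page "b" is dropped along with page "a", even though it is mainly about cherry
pie, because it happens to mention the excluded word.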

Truncation

When using keyword, or exact match, searching, it can be helpful to command
the search engine to locate pages where there are various forms of the word
being sought. Typing the root of a word and adding a truncation symbol on the
end can accomplish this. Most search engines recognize an asterisk as a
truncation symbol. For example, if I wanted to find pages with various forms
of the word independence, I would type independen* and the results would
include pages that contain independence, independent, and independently.
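
One way to picture truncation is as a prefix match on whole words. The sketch
below uses Python's standard re module; the function name truncation_to_regex
and the sample sentence are my own, and real engines may handle the asterisk
differently.

```python
import re

def truncation_to_regex(term):
    """Turn a term such as 'independen*' into a whole-word regular expression."""
    if term.endswith("*"):
        # Match the root at a word boundary, followed by any further word characters.
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*", re.IGNORECASE)
    # Without the asterisk, require an exact whole-word match.
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)

text = "An independent nation declared its independence and acted independently."
matches = truncation_to_regex("independen*").findall(text)
exact = truncation_to_regex("independence").findall(text)
print(matches)
print(exact)
```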

Search Restrictors

Search restrictors in web search engines are similar to search fields in
Westlaw. They allow a search for terms or values contained only in certain
portions of a page, rather than anywhere in the entire page. A simple example
is a search restricted to a type of domain, like .com or .edu. If a domain
restriction is used, the search engine seeks results only where the URL
matches the designated domain type. Search restrictions are accomplished in
different ways on different search engines, usually showing up in an engine's
advanced searching option. Serious researchers have long applauded HotBot's
search form, which makes restricted searching easy.
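
A domain restriction amounts to filtering results by the ending of the host
name in each URL. Here is a minimal sketch using Python's urllib.parse; the
URLS list and the restrict_to_domain function are assumptions for
illustration, not any engine's interface.

```python
from urllib.parse import urlsplit

# Hypothetical result URLs.
URLS = [
    "http://www.example.com/pie.html",
    "http://law.example.edu/research/",
    "http://www.example.org/about",
]

def restrict_to_domain(urls, suffix):
    """Keep only URLs whose host name ends with the given suffix, e.g. '.edu'."""
    return [u for u in urls if (urlsplit(u).hostname or "").endswith(suffix)]

edu_only = restrict_to_domain(URLS, ".edu")
print(edu_only)
```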

Title restrictions are often available. Use these with caution, perhaps as a
first step to see what pops up. A title restriction reflects the title of the
web page, designated by the web author. It may not necessarily correspond to
the title of the document appearing on the page. For example, I might be
looking for a copy of the Declaration of Independence. That document may
appear on a web page entitled Historic Documents by the web author. If I
restrict my search for "declaration of independence" to the title portion of
pages, I will miss this page because it is actually called Historic Documents.
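
The Historic Documents example can be reproduced with a toy field-restricted
search. The Page class, the two sample pages and the search function are
hypothetical and exist only to show why the title restriction misses the page.

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    title: str   # the page title chosen by the web author
    body: str    # the text appearing on the page

PAGES = [
    Page("http://example.edu/docs", "Historic Documents",
         "Full text of the Declaration of Independence ..."),
    Page("http://example.com/decl", "Declaration of Independence",
         "A short essay about the signers."),
]

def search(phrase, field="anywhere"):
    """Return URLs whose title (or title plus body) contains the phrase."""
    phrase = phrase.lower()
    hits = []
    for p in PAGES:
        haystack = p.title if field == "title" else p.title + " " + p.body
        if phrase in haystack.lower():
            hits.append(p.url)
    return hits

title_hits = search("declaration of independence", field="title")
all_hits = search("declaration of independence")
print(title_hits)
print(all_hits)
```

The title-restricted search finds only the essay, while the unrestricted search
also finds the page that actually carries the document.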

Date Searching

Searches can often be restricted by date. Additionally, dates often appear on
the list of search results. However, like page titles, page dates can be
somewhat misleading. The dates that are searched or reflected in results lists
are the dates of the web page, and not necessarily the date of the document on
the page. A search with a date restriction of July 4, 1776, will yield no
results since no web pages were created or changed on that date. Thus, if I am
searching for the Declaration of Independence, it won't help me to try to
place a date restriction in my search query. However, date restrictions can be
useful to locate newly created or recently updated web pages, weeding out
older results.
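
The point about page dates versus document dates can be shown with a short
Python sketch. The page records and their modification dates are invented for
illustration.

```python
from datetime import date

# Hypothetical pages with the date each web page was last modified.
PAGES = [
    {"title": "Historic Documents", "modified": date(2001, 6, 15)},
    {"title": "Search Engine News", "modified": date(2001, 8, 1)},
    {"title": "Old Home Page", "modified": date(1998, 3, 2)},
]

def modified_between(pages, start, end):
    """Return titles of pages whose modification date falls within the range."""
    return [p["title"] for p in pages if start <= p["modified"] <= end]

# The document on the first page dates from 1776, but the page itself does not.
on_july_4_1776 = modified_between(PAGES, date(1776, 7, 4), date(1776, 7, 4))
recent = modified_between(PAGES, date(2001, 1, 1), date(2001, 12, 31))
print(on_july_4_1776)
print(recent)
```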

Phrase Searching

Most search engines recognize quotation marks around two or more terms as the
designation of a phrase. Additionally, this can sometimes be accomplished by
placing the Boolean connector ADJ between the terms. Thus, "apple pie" or
apple ADJ pie will search for the phrase apple pie, and not search the two
terms separately.
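
The difference between a phrase search and a search for the two terms
separately can be sketched as follows. The helper names and the sample text are
assumptions of mine; the sketch simply treats a phrase as words that appear
consecutively and in order.

```python
def tokens(text):
    """Split text into lowercase words."""
    return text.lower().split()

def contains_phrase(text, phrase):
    """True if the words of `phrase` appear consecutively, in order, in `text`."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def contains_all(text, phrase):
    """True if every word of `phrase` appears somewhere in `text`."""
    words = set(tokens(text))
    return all(t in words for t in tokens(phrase))

page = "an apple a day and a slice of pie"
print(contains_phrase(page, "apple pie"))  # the words are not adjacent
print(contains_all(page, "apple pie"))     # both words are present
```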

Nesting

Many search engines support the use of parentheses to nest various parts of a
search query. For example, a search for apple or blueberry pie can be
accomplished by nesting:

(apple OR blueberry) ADJ pie
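
A rough sketch of how such a nested query might be evaluated, compared with
spelling out each phrase separately, is given here in Python. The adj and
phrase helpers and the sample page texts are hypothetical; ADJ is modeled
simply as "immediately followed by".

```python
def adj(left_terms, right, words):
    """True if any term in left_terms is immediately followed by `right`."""
    return any(a in left_terms and b == right for a, b in zip(words, words[1:]))

def phrase(p, words):
    """True if the words of phrase `p` appear consecutively in `words`."""
    target = p.split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

PAGES = ["fresh blueberry pie", "apple and blueberry jam",
         "apple pie contest", "pie with apple"]

# (apple OR blueberry) ADJ pie
nested = [p for p in PAGES if adj({"apple", "blueberry"}, "pie", p.split())]
# Each phrase written out in full and joined with OR
spelled_out = [p for p in PAGES
               if phrase("apple pie", p.split()) or phrase("blueberry pie", p.split())]
print(nested)
print(spelled_out)
```

Both forms select the same pages in this toy collection.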

It can also be accomplished by searching two alternative phrases:

"apple pie" OR "blueberry pie"

Search Levels

It is often useful to perform a multi-level search, first casting a wide net,
then narrowing by searching only within that set of results. This feature is
offered by AltaVista, Google, HotBot and Lycos.

Results Features

When comparing search engines, search language is only half the story. Search
results are also important. Search engines use various mathematical formulas
to match terms from the search query to web pages containing those terms.
These formulas take various factors into consideration to present lists of
results often ranked by relevancy, or at least relevancy according to the
formulas used. Some of the factors that go into the determination of relevancy
are how closely together the terms appear, how many times they appear on the
page, how close to the top of the page they are, and how unique they are.
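
To make these factors concrete, here is a toy scoring function in Python. The
weighting is invented purely for illustration and is not the formula of any
search engine; it covers frequency, position and closeness, and leaves out
uniqueness, which would require statistics about a whole collection of pages.

```python
def score(page_words, query_terms):
    """Toy relevance score; the weighting is invented for illustration."""
    # Record where each query term occurs on the page.
    positions = {t: [i for i, w in enumerate(page_words) if w == t]
                 for t in query_terms}
    if any(not p for p in positions.values()):
        return 0.0  # at least one term is missing entirely
    frequency = sum(len(p) for p in positions.values())   # how many times
    firsts = [p[0] for p in positions.values()]
    earliness = 1.0 / (1 + min(firsts))                   # how near the top
    closeness = 1.0 / (1 + max(firsts) - min(firsts))     # how close together
    return frequency + earliness + closeness

page_a = "apple pie recipe with apple filling".split()
page_b = ("a long story about an orchard where one apple "
          "fell far from any pie").split()
query = ["apple", "pie"]
print(score(page_a, query), score(page_b, query))
```

Page A, where the terms appear early, often and side by side, scores higher
than page B, where they appear once each and far apart.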

Beyond pure relevancy rankings, however, many options are available to achieve
a variety of results. Search engines present results quite differently, often
without clearly explaining how the results are calculated or displayed. A
serious researcher will seek to understand these differences and use them to
her advantage.

Directory Results

Several years ago, before sophisticated portal sites were developed, there
were two major ways to search for information on the web: directories and
search engines. A directory is a collection of links to web sites which is
classified into subject categories and subcategories.

As directories and search engines developed into overall portals, directories
incorporated search engines and search engines incorporated directories.
Portals have attempted to make these two entities appear seamless; however,
they are two distinct finding tools. Understanding this concept allows the
researcher to take more control over her searching.

Consider, for example, the classic directory, Yahoo! In a search for the
Declaration of Independence, I can click through subject categories to locate
it, or I can type "declaration of independence" in the search box. When
searched, Yahoo! first searches its classified directory for subcategories
entitled Declaration of Independence. If none are present, it then searches
the directory for listed web sites entitled Declaration of Independence. If
there are none, Yahoo! then uses the search engine Google to search for web
sites which contain the phrase Declaration of Independence. Yahoo! presents
the first set of results it can, even if that happens to be the third step,
web page results from Google. I do not have to prompt Yahoo! to move through
to the next step if the first step found nothing; it happens automatically.
This is why different searches on Yahoo! may produce results pages that look
quite distinct.
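
The three-step fallback described above can be sketched as a simple cascade.
This is a hypothetical model in Python with made-up data; it is not Yahoo!'s
actual code, only an illustration of "present the first set of results found".

```python
def cascade_search(query, sources):
    """Try each (label, search_function) in order; return the first non-empty result."""
    for label, search in sources:
        results = search(query)
        if results:
            return label, results
    return None, []

# Illustrative stand-ins for the three steps.
CATEGORIES = ["History > U.S. History", "Government > Law"]
LISTED_SITES = ["Historic Documents Archive", "Legal Research Guide"]
WEB_PAGES = ["Text of the Declaration of Independence", "Apple pie recipes"]

def contains(items):
    """Build a search function doing a case-insensitive substring match."""
    return lambda q: [i for i in items if q.lower() in i.lower()]

SOURCES = [
    ("directory categories", contains(CATEGORIES)),
    ("directory site listings", contains(LISTED_SITES)),
    ("web page search", contains(WEB_PAGES)),
]

label, results = cascade_search("declaration of independence", SOURCES)
print(label, results)
```

Because the step that answers depends on the query, two searches can come back
from entirely different sources, which matches the observation that results
pages may look quite distinct.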

Besides Yahoo!, there are two other major subject directories that have linked
themselves with major search engines. The Open Directory Project provides
directory results to Google, HotBot and Lycos, while LookSmart provides
directory results to AltaVista and Excite.

Most Popular Results

As researchers began to realize that mathematical relevancy ranking didn't
always equal researchers' intuitive relevancy ranking, tools were developed to
put a more human factor back into relevancy determinations. Search engines can
now measure what the most popular sites are, given certain search terms, and
list the popular sites as results options. This is the driving force behind
Direct Hit, which is used at HotBot and Lycos. Google and AltaVista include
popularity as a factor in their formulas to determine relevancy rankings.

Customized Results

Most search engines allow the look of the results page to be changed,
especially with regard to the number of hits per page. Additionally, they may
offer the option of listing only titles or sorting by date or site, rather
than relevancy.

Clustered/Compressed Results

Some searches produce many individual page hits from the same overall web
site, making it seem like the results all come from the same place. When a
search engine uses results compression, or clustering, it shows only one page
per web site, while offering an option to view the other results from that
site. This feature can be found at AltaVista, Excite, Google and HotBot.
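
Results compression can be pictured as grouping hits by web site and showing
only the first hit from each group. The sketch below uses Python's
urllib.parse; the RESULTS list and the cluster_by_site function are
assumptions for illustration.

```python
from urllib.parse import urlsplit

def cluster_by_site(urls):
    """Group ranked result URLs by host, preserving the order of first appearance."""
    clusters = {}
    for url in urls:
        host = urlsplit(url).hostname
        clusters.setdefault(host, []).append(url)
    return clusters

# Hypothetical ranked results, three of them from the same site.
RESULTS = [
    "http://www.example.com/pie/apple.html",
    "http://www.example.com/pie/blueberry.html",
    "http://recipes.example.org/pie.html",
    "http://www.example.com/pie/index.html",
]

clusters = cluster_by_site(RESULTS)
# One line per site: the top hit plus a count of further results from that site.
compressed = [(pages[0], len(pages) - 1) for pages in clusters.values()]
print(compressed)
```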

Suggested Searches

Suggestions for further searching based on the initial search are provided by
many search engines. These suggestions can be simple, such as synonyms or
alternative search terms. They can be more sophisticated, such as suggestions
for searching in different, specialized databases. Ask Jeeves is built
entirely around suggested searches. If I type a question into Ask Jeeves'
search box, it returns a list of suggested specialized databases that might
contain the answer to that question.

For example, I asked Jeeves "Where can I find the Declaration of
Independence?" Jeeves returned several suggested sources for the text of the
Declaration of Independence, as well as historical background on it.

Suggested searches can also be found at AltaVista, Excite, HotBot and Lycos.


Similar Searches

If I locate a web page that is highly relevant to my research issue, I might
be interested in finding more pages that are very similar. Some search engines
will perform a search for other similar pages at the click of a button. I
simply choose a page from my results list and ask the engine to perform a
second search to find similar pages. This feature can be found at Google
(Similar Page
