Web Search Engines "Made Simple"

Jian Liu jiliu at script.lib.indiana.edu
Thu Nov 6 18:43:55 EST 1997


In other words, there is no way to search for "date rape"?

Jian

> 
> 
> >>This illustrates the basic problem; each engine operates under its own
> >>semi-concealed rules; the rules have to be semi-concealed to prevent
> >>spammers from hijacking the engine.
> >>  I actually *did* get an answer from Hotbot a few months ago, to a very
> >>similar query "Roman sites". The concealed rule is that one of the words is
> >>reserved: in my case "sites", in yours almost certainly "date" (I should
> >>hope!).
> >
> >"date" is indeed a stopword in HotBot.  the way to test this when you get
> >squirrely hits is to type in the suspect term by itself.  if it is a
> >stopword, it will yield no results.  a subsequent search on "date rape" as
> >an exact phrase yielded the same number of hits as "rape" by itself
> >(stopwords are wildcarded in an exact phrase).  if you take a look at the
> >breakdown of individual pagecounts, you'll notice that "date" occurs over
> >11 million times in our database, which definitely makes it a stopword,
> >since searching for it would significantly slow down retrieval time.
> >
> >while we do not have a printed list of stopwords (it is dynamic and changes
> >with each crawl), we do have in our FAQ an explanation of how we index and
> >retrieve pages:
> >
> >http://help.hotbot.com/faq/score.html
> >
> >hope this clears up some of the mystery!
> >
> >- judy
> 
> 
> ___________________________________
> 
> j. y.  chen | hotbot  tutor | WIRED  d i g i t a l
> (v) 415. 276 .8464  | (f) 415. 276. 8499
> 	http://www.hotbot.com
> 
> The beatings will continue until morale improves!
> 
> 
> 
> 
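P.S. To make sure I follow Judy's explanation, here is a toy model of it in Python. It is only a sketch under my own assumptions, not HotBot's actual implementation: the 50% cutoff, the sample pages, and the names build_index, search_term and search_phrase are all made up for illustration. I am also reading "wildcarded" as "matches any one word in that position", which is a guess.

```python
from collections import defaultdict

# Hypothetical cutoff: a term found on more than this fraction of pages
# is treated as a stopword. HotBot's real rule is not published.
STOPWORD_FRACTION = 0.5


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_index(pages):
    """Return (index, stopwords); index maps each term to a set of page ids."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in tokenize(text):
            index[term].add(page_id)
    limit = STOPWORD_FRACTION * len(pages)
    # The stopword list is derived from the data, so it changes per "crawl".
    stopwords = {term for term, ids in index.items() if len(ids) > limit}
    return index, stopwords


def search_term(term, index, stopwords):
    """Single-term search; a stopword on its own yields no results."""
    term = term.lower()
    if term in stopwords:
        return set()
    return set(index.get(term, set()))


def search_phrase(phrase, pages, stopwords):
    """Exact-phrase search in which a stopword matches any single word."""
    wanted = tokenize(phrase)
    hits = set()
    for page_id, text in pages.items():
        words = tokenize(text)
        for start in range(len(words) - len(wanted) + 1):
            window = words[start:start + len(wanted)]
            if all(w in stopwords or w == got
                   for w, got in zip(wanted, window)):
                hits.add(page_id)
                break
    return hits


# Made-up sample "crawl" of four pages.
pages = {
    1: "date of the conference",
    2: "date rape statistics",
    3: "reports of rape on campus",
    4: "last update date for this page",
}
index, stopwords = build_index(pages)

print(stopwords)                                     # terms over the cutoff
print(search_term("date", index, stopwords))         # the suspect term alone
print(search_term("rape", index, stopwords))
print(search_phrase("date rape", pages, stopwords))
```

In this toy crawl "date" is on three of the four pages, so it crosses the cutoff: searching for it alone returns nothing, and the phrase "date rape" returns the same pages as "rape" by itself, which is the behavior described above.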



More information about the Web4lib mailing list