[WEB4LIB] FW: Searching for AIDS: Google strikes again [Hardin MD

Richard Wiggins wiggins at mail.com
Wed Jul 26 12:14:41 EDT 2000


Google is a big win over other engines for certain kinds of searches, but
other spiders have their own strengths.  Northern Light's custom search
folders offer a highly sophisticated form of disambiguation.  Try searching
on "genetic" and see how nicely it sorts things into relevant folders
(e.g. genetic algorithms versus genetic engineering).  It does just as
good a job on AIDS.  The custom search folders feature is an excellent example
of blending computer science and librarian skills.  The linear hit list in
Northern Light is not the best place to click in most cases; by reporting
that Northern Light has 59 "false hits," Rumsey's article puts it in an
unfair, er, "light."

And this example happens to work well with AltaVista.  If you type AIDS in
all caps, as is (or was) the common usage for this acronym, you find zero
"false hits" in the top ten.

Although the single example of the search for AIDS is interesting, it really
doesn't show that Google has any special advantage in terms of
disambiguation.  It just shows that Google does a good job of ranking sites
by popularity.  The link weighting may also find sites that are more
permanent and therefore in some way more substantive; snapping a hyperlink
to a page is in a sense an endorsement.  But if you search for "genetic" on
Google you won't see any special distinction made between the biological and
the computer science uses of the term.
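
To make the popularity point concrete, here is a minimal sketch of ranking
by raw in-link counts.  It is a simplification for illustration and not
Google's actual algorithm; the link graph and the .example host names are
invented.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "hub.example": ["aids-info.example", "cdc.example"],
    "blog.example": ["aids-info.example"],
    "news.example": ["aids-info.example", "hearing-aids.example"],
    "shop.example": ["hearing-aids.example"],
    "school.example": ["cdc.example", "aids-info.example"],
}

def inlink_counts(links):
    """Count how many distinct pages link to each target."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):      # a linking page counts once
            if target != source:         # ignore self-links
                counts[target] += 1
    return counts

def rank(candidates, counts):
    """Order candidates by in-link count, highest first; ties by name."""
    return sorted(candidates, key=lambda page: (-counts[page], page))

counts = inlink_counts(links)
hits = ["hearing-aids.example", "aids-info.example", "cdc.example"]
print(rank(hits, counts))
```

The ranking reorders one mixed hit list.  It never separates the hits by
sense, which is what the folders do.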

Finally, Google has some really maddening design features.  They insist on
tossing stop words, even in phrase searches.  The classic "to be or not to
be" search underscores the folly of this approach.  Try it, and observe the
resulting usage notes.
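
A small sketch of why this bites, assuming a simple filter that drops every
stop word before matching.  The stop word list below is illustrative only;
it is not Google's list.

```python
# Illustrative stop word list (an assumption, not any engine's real list).
STOP_WORDS = {"to", "be", "or", "not", "the", "a", "of", "and"}

def strip_stop_words(phrase):
    """Return the words of the phrase that survive stop word removal."""
    return [word for word in phrase.lower().split() if word not in STOP_WORDS]

print(strip_stop_words("to be or not to be"))      # nothing survives
print(strip_stop_words("the origin of species"))   # two content words survive
```

Under that assumption the Hamlet phrase reduces to an empty query, so there
is nothing left to match as a phrase.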

And Google refuses to support any kind of Boolean logic.  If you want to
search for pages with "Richard Cheney" or "Dick Cheney" you are forced to do
two searches.  If you don't know that, and you type both phrases in one
search, you get only pages with BOTH forms -- a fraction of your target
pages.
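
The difference is easy to see in a toy sketch.  The pages below are invented,
and search_all and search_any are hypothetical helper names, not any
engine's API: search_all mimics the implicit AND described above, and
search_any is the OR a Boolean engine would offer.

```python
# Invented sample pages for illustration.
pages = {
    "p1": "Richard Cheney was named to the ticket",
    "p2": "Dick Cheney spoke on Tuesday",
    "p3": "Richard Cheney, better known as Dick Cheney",
    "p4": "A page about something else",
}

def has_phrase(text, phrase):
    """Case-insensitive exact-phrase test."""
    return phrase.lower() in text.lower()

def search_all(pages, phrases):
    """Implicit AND: a page must contain every phrase."""
    return sorted(pid for pid, text in pages.items()
                  if all(has_phrase(text, p) for p in phrases))

def search_any(pages, phrases):
    """Boolean OR: a page may contain any one of the phrases."""
    return sorted(pid for pid, text in pages.items()
                  if any(has_phrase(text, p) for p in phrases))

phrases = ["Richard Cheney", "Dick Cheney"]
print(search_all(pages, phrases))   # only the page with both forms
print(search_any(pages, phrases))   # every page with either form
```

Running the two phrases as separate searches and merging the results by
hand gives the same set as search_any, which is the workaround the user is
left with.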

Google is, like Yahoo, a great tool for the masses.  It is not unambiguously
superior to its competitors in all ways.

/rich



------Original Message------
From: "Gimon, Charles A" <CAGimon at mpls.lib.mn.us>
To: Multiple recipients of list <web4lib at webjunction.org>
Sent: July 26, 2000 2:22:23 PM GMT
Subject: [WEB4LIB] FW: Searching for AIDS: Google strikes again [Hardin MD


Nice article...but on the other hand, I'm not sure I'd always want a search
engine to be making assumptions about what I want. I hope raw text searching
remains an option alongside what you describe here.

(As an aside, in my life away from work, I host a website on Indonesian
history. I would be perturbed if search engines suddenly decided that
searches that mention "Java" were automatically about the programming
language.)

--Charles Gimon
Web Coordinator
Minneapolis Public Library

-----Original Message-----
From: Eric Rumsey [mailto:rumsey at blue.weeg.uiowa.edu]
Sent: Wednesday, July 26, 2000 9:07 AM
To: Multiple recipients of list
Subject: [WEB4LIB] Searching for AIDS: Google strikes again [Hardin MD
Notes]


The "dumb computer" behind most search engines is nowhere more in evidence
than when doing a search for an ambiguous word such as AIDS. Any reasonably
aware human being realizes that a search for "aids" is likely looking for
the disease AIDS. But to a search engine, all occurrences of the word are
given equal weight. So can a search engine be smart enough to "know" that a
search for "aids" is almost certain to be looking for the disease? Until
recently, probably not. But new developments are bringing improvements.

For more see:
Searching for AIDS: Google strikes again
http://www.lib.uiowa.edu/hardin/md/notes6.html


*	*	*	*	*	*	*	*	*
Eric Rumsey, Hardin Library for the Health Sciences
University of Iowa, Iowa City IA 52242
<eric-rumsey at uiowa.edu>
319-335-9875 (voice), 319-335-9897 (fax)
Hardin Meta Directory of Internet Health Sources - Kudos -
http://www.lib.uiowa.edu/hardin/md/news.html

Richard Wiggins
Consulting, Writing & Training on Internet Topics
www.netfact.com/rww         wiggins at mail.com
517-349-6919 (home office)  517-353-4955 (work)  