[Web4lib] Google Print

Richard Wiggins richard.wiggins at gmail.com
Sat Nov 5 19:06:16 EST 2005


Norma,

I think you've pointed out an important difference here, something
that I alluded to when Roy and I discussed Google Print at our talk at
Internet Librarian.

Google's PageRank is well-suited to ranking Web pages.  Usually, a
single URL that Google indexes is fairly short -- loosely speaking,
several Page Downs in your Web browser -- certainly not the equivalent
of a paper book, which tends to run to a few hundred pages.

Therefore, if you're indexing an object as large as a book, the noise
is likely to be much greater.  A book that has a few mentions of
"dressage" may, or may not, be a book in which dressage is an
important part of the text.

For searching books -- if someone seeks a book that's relevant on a
particular topic -- a provider that searches really good metadata may
deliver a much more useful service than one that indexes the full text
of the book.

You could imagine Amazon understanding this better than Google. 
Amazon understands books.  Google tends to trust robots to index Web
pages, or news stories, or books. The size of Google's atoms thus
varies by orders of magnitude.   The question is whether Google
BookRank (my TM, not Google's) will work well if it's built on a
strict analogy of PageRank.  I don't think it will.  You could imagine
an Amazon hit list broken into useful segments:

-- We found 123 books about dressage:
...

-- We found 1234 books that mention dressage:
...

-- We found 12345 books that mention the word "dressage":
...
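
To make the segmented hit list a bit more concrete, here is a rough
sketch in Python.  Everything in it is my own assumption, not anything
Amazon or Google actually does: the Book record, the segment_hits
function, the 5% page-share threshold, and the sample titles are all
hypothetical.  It treats a subject-heading match as "about", a term
that shows up on a fair share of the pages as "mentions the topic", and
anything else with at least one hit as "mentions the word".

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    subjects: list         # catalog subject headings (the metadata)
    total_pages: int       # assumed to be greater than zero
    pages_with_term: int   # pages on which the search term appears


def segment_hits(books, term, min_share=0.05):
    """Split hits into three buckets, strongest evidence first.

    min_share is an arbitrary, made-up cutoff: the fraction of pages
    that must contain the term before a book counts as more than a
    passing mention.
    """
    term = term.lower()
    about, substantial, passing = [], [], []
    for book in books:
        in_metadata = any(term in s.lower() for s in book.subjects)
        if in_metadata:
            about.append(book.title)          # metadata says it's the topic
        elif book.pages_with_term == 0:
            continue                          # not a hit at all
        elif book.pages_with_term / book.total_pages >= min_share:
            substantial.append(book.title)    # term is spread through the text
        else:
            passing.append(book.title)        # a stray mention or two
    return about, substantial, passing


# Hypothetical sample data, invented purely for illustration.
books = [
    Book("A training manual", ["Dressage"], 250, 240),
    Book("A general riding book", ["Horsemanship"], 200, 30),
    Book("A travel guide", ["Travel"], 400, 1),
    Book("A cookbook", ["Cooking"], 300, 0),
]

about, substantial, passing = segment_hits(books, "dressage")
print("Books about dressage:", about)
print("Books that mention dressage:", substantial)
print('Books that mention the word "dressage":', passing)
```

The point of the sketch is only that the first bucket never looks at
the full text at all; the raw page counts matter only for deciding
between the second and third buckets.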

/rich

On 11/5/05, Norma Hewlett <hewlett at usfca.edu> wrote:
> The GooglePrint beta (print.google.com) doesn't seem to be a very
> useful search tool in its current incarnation. In fact, if my quick-and-
> dirty test today is a good sample, it appears to be exactly the kind of
> mish-mash some people have predicted.
>
> Today I ran a search on the GooglePrint beta for "dressage". (This is a
> style of horseback riding that requires intense training, and there are
> many books about the techniques.) Based on the results of this search,
> it appears that GooglePrint is retrieving any book where the search
> word appears anywhere in the text. There doesn't appear to be any
> weighting, not even for words in the title.
>
> In the first 50 books listed, there were no dressage manuals and
> nothing that would be of use to a serious dressage rider. There were
> general books such as The Encyclopedia of the Horse (1 page on
> dressage), a number of books on other styles of riding that briefly
> mention dressage, several young people's novels from the Thoroughbred
> series (think Baby Sitters Club on horseback), a Lonely Planet travel
> guide to Slovenia, a biography of Christopher Reeve, a Breyer model
> horse guide--and on and on. The only book that looked as if it might be
> a serious guide to dressage training was in German.
>
> In contrast, when I ran a basic book search for "dressage" on Amazon,
> 49 of the first 50 entries were for books about the techniques of
> dressage training. (Number 36 was a detective story about a murdered
> dressage rider titled "Death By Dressage.")
>
> Jean Hewlett
>
> Regional Librarian
> University of San Francisco
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/
>

