[Web4lib] Google Books a tease, not a useful tool,
for serious research
Richard Wiggins
richard.wiggins at gmail.com
Fri Jul 6 08:22:04 EDT 2007
I think we've plumbed these troubled waters before, but my experience over
the last two days has me shaking my head, wondering if Google really
considers Google Book Search a serious research tool.
To me, to be useful, a research tool needs these features:
-- You must be able to cite what you find. You must be able to provide a
reference that others can follow in order to retrieve exactly what you
retrieved.
-- You must be able to quote it. That is, you must be able to copy text
from it and paste that text into an article, an e-mail, whatever.
-- You must be able to reproduce the search that found the item.
-- You must be able to search within the full text.
-- Others must be able to do all of these things.
As a matter of sport, over the last couple of days I've been trying to chase
down a matter of historical fact: is the proper name of a thoroughfare in East
Lansing "Harrison Road" or "Harrison Avenue"? This has been a fun
research project worthy of History Detectives (except the subject matter is
a lot more boring than their tales).
Google Book Search offered some tantalizing evidence from the Michigan
public laws of 1907. What was especially cool was that the book was
digitized by the University of California just this past May.
Here's what's not cool:
-- My first search revealed the tantalizing tidbit re the founding of East
Lansing, when Harrison Avenue was a boundary of the town.
-- For some reason, subsequent searches did not pull up that tidbit, but
rather metadata about the volume.
-- And now, unless I'm losing my mind, repeats of the same searches don't
even find that volume.
-- I was able to find the URL in my browser cache, in this bizarre form (not
even sure it will paste) http://books.google.com/books?id=_VUyAAAAIAAJ
.... (In my browser address bar the upper case AAAs are crossed out.)
-- If you manage to locate the PDF and download it, of course you cannot
search it, because it is a PDF with no text layer; the pages are
images, and not searchable. This is a volume with 1,200 pages. Eyeball
scanning for the text that Google Book Search once coughed up on screen is a
waste of time and an insult.
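
One way to make the "you must be able to cite what you find" requirement
concrete is to record the volume identifier from the URL together with the
retrieval date. What follows is only a minimal sketch in Python. It assumes
the `id` query parameter is what identifies the volume (the URL form suggests
this, but nothing here confirms that Google keeps it stable), and
`volume_id` and `citation_line` are hypothetical helper names, not part of
any Google API.

```python
from urllib.parse import urlparse, parse_qs


def volume_id(url: str) -> str:
    """Return the value of the `id` query parameter of a book URL.

    Assumption: the `id` parameter identifies the volume. Raises
    ValueError if the URL does not carry exactly one non-empty id.
    """
    query = parse_qs(urlparse(url).query)
    ids = query.get("id", [])
    if len(ids) != 1:
        raise ValueError(f"expected exactly one volume id in {url!r}")
    return ids[0]


def citation_line(url: str, retrieved: str) -> str:
    """Build a plain-text note recording what was retrieved and when.

    `retrieved` is a date string supplied by the caller; it is not
    validated here.
    """
    return (
        f"Google Book Search, volume id {volume_id(url)}, "
        f"<{url}>, retrieved {retrieved}"
    )


if __name__ == "__main__":
    print(citation_line("http://books.google.com/books?id=_VUyAAAAIAAJ",
                        "2007-07-06"))
```

A note like this does not make the search itself reproducible; it only
preserves enough to ask someone else to try the same volume.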
Again, I know we've covered some of this turf, but doesn't the combination
of these facts destroy the value of Google Book Search as a serious research
tool?
Google seems to be paranoid about others mining its data. Does it
actually change search behavior to limit the number of searches for a book?
If so, it's obviously preventing reproducibility of research and even opening
the door to denial of service.
/rich