[Web4lib] "hacking" Google Book Search to get whole contents

Jeremy Dunck jdunck at gmail.com
Mon Nov 28 16:13:11 EST 2005


This (juvenile) forum post shows a pretty simple way to get the whole
contents of a book by progressively searching for terms on subsequent
pages.

http://www.techenclave.com/forums/read-any-book-google-techenclave-exclusive-6234.html

Summary:
Find a term within a book you'd like to read.
Execute a search on that.  Book Search will let you see the preceding
and subsequent 3 pages.
From the first/last page in that context view, take a, uh, statistically
improbable search phrase, and repeat your search.  This will basically
shift your window into the book to the earlier/later pages.
===
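
The window-shifting idea in the summary can be sketched as a toy,
in-memory simulation.  Nothing here talks to Book Search: the "book" is
just a list of page strings, and search(), rarest_phrase() and
walk_forward() are hypothetical helpers made up for illustration.  It
also cheats by counting phrases over the whole book, which a real reader
of the context view could not do.

```python
from collections import Counter

WINDOW = 3  # pages visible on each side of a hit, per the summary above


def search(pages, phrase):
    """Toy stand-in for a book search: (first, last) visible pages around the first hit."""
    for i, text in enumerate(pages):
        if phrase in text:
            return max(0, i - WINDOW), min(len(pages) - 1, i + WINDOW)
    return None


def rarest_phrase(pages, page_index, n=2):
    """Pick the n-word phrase on a page that is rarest across the toy book.

    Assumes every page has at least n words.
    """
    def ngrams(text):
        words = text.split()
        return [" ".join(words[j:j + n]) for j in range(len(words) - n + 1)]

    counts = Counter(g for text in pages for g in ngrams(text))
    return min(ngrams(pages[page_index]), key=lambda g: counts[g])


def walk_forward(pages, start_phrase):
    """Repeat the search from the last visible page until the window stops moving."""
    windows = []
    window = search(pages, start_phrase)
    while window is not None and window not in windows:
        windows.append(window)
        _, last = window
        window = search(pages, rarest_phrase(pages, last))
    return windows


# Hypothetical 20-page book with a unique phrase on every page.
book = [f"intro{i:02d} body{i:02d} end{i:02d}" for i in range(20)]
for first, last in walk_forward(book, "body00"):
    print(f"visible pages {first}..{last}")
```

Each pass searches for a phrase taken from the last visible page, so the
window slides later by up to 3 pages until it hits the end of the book.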

Since the text for a book is shown as an image, it's not entirely
trivial to automate this.  You'd have to OCR the resultant images to
get the phrases to shift the window.

Still... this won't last. Publishers will be screaming for blood, and
Google will have to remove the previous/next window or otherwise
block this (pretty obvious) method.

