[Web4lib] "hacking" Google Book Search to get whole contents

Walt.Crawford at rlg.org Walt.Crawford at rlg.org
Mon Nov 28 16:30:40 EST 2005


It's also worth noting that *this just won't work* for in-copyright books
from the Google Library Project, since you get three snippets, not three
pages. Quite apart from the likelihood that Google will figure out that
people are gaming the system and use cookies to prevent it...

"Book Search will let you see the preceding and subsequent 3 pages" isn't
even true on a general basis for Google Library Program books; the number
of pages shown is up to the publisher, as I've found in trying a few.
Sometimes you get one page; sometimes no pages at all.


Walt Crawford
wcc at rlg.org, 650-691-2227
-------------------------------------
Typically reachable:
Monday & Wednesday 7 a.m.-3 p.m.
Tuesday & Thursday 7 a.m.-2 p.m.
Friday 7-11 a.m.
--------------------------------------

web4lib-bounces at webjunction.org wrote on 11/28/2005 01:22:18 PM:

> Jeremy Dunck wrote:
> > This (juvenile) forum post shows a pretty simple way to get the whole
> > contents of a book by progressively searching for terms on subsequent
> > pages.
> >
> > http://www.techenclave.com/forums/read-any-book-google-techenclave-exclusive-6234.html
> >
> > Summary:
> > Find a term within a book you'd like to read.
> > Execute a search on that.  Book Search will let you see the preceding
> > and subsequent 3 pages.
> > From the first/last page in that context view, take a, uh, statistically
> > improbable search phrase, and repeat your search.  This will basically
> > shift your window into the book to the earlier/later pages.
>
> Sheesh, it'd be a whole lot easier to borrow the thing from a library
> and type it out.  If I were going to bootleg publications on a massive
> scale, that's how I'd go about it.
>
> LEO
>
> -- -------------
> Leo Robert Klein
> www.leoklein.com
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/


