[Web4lib] Google Allows Downloads of out-of-copyright Books

Karen Coyle kcoyle at kcoyle.net
Sat Sep 2 11:59:47 EDT 2006


Thank you. And I am SO glad that Michigan shows the underlying text
(which Google doesn't -- at least not currently). Seeing the text, which
is the input to the index, will help librarians and power users better
understand search results and formulate strategies for searching. OCR
has some quirks, and seeing them can only help.

Another thought: any chance that Michigan (or any of the other Google
libraries) will take on the task of correcting the OCR? (Assuming they
have the right to do so.)

kc

Perry Willett wrote:
> Just to clear this up, we're getting both image and OCR files from 
> Google for each page. You'll see this specified in our agreement with 
> Google on p. 4:
> <http://www.lib.umich.edu/mdp/um-google-cooperative-agreement.pdf>
>
> Perry Willett
> Head, Digital Library Production Service
> 300 Hatcher North
> University of Michigan
> Ann Arbor MI 48109-1205
> Ph: 734-764-8074
> Fax: 734-647-6897
> Email: pwillett at umich.edu
>
>> ------------------------------
>> Date: Thu, 31 Aug 2006 14:07:43 -0700
>> From: Karen Coyle <kcoyle at kcoyle.net>
>> Subject: Re: [Web4lib] Google Allows Downloads of out-of-copyright
>> Books
>>
>> Interesting example. If you go to page 1 you get a message saying "This
>> page does not contain any text recoverable by the OCR engine." Is it
>> possible that Michigan is providing OCR "on the fly?"
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/
>
>

-- 
-----------------------------------
Karen Coyle / Digital Library Consultant
kcoyle at kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------



