[Web4lib] Google Allows Downloads of out-of-copyright Books

Mon Sep 4 15:09:47 EDT 2006

Perhaps take a folksonomy approach -- have a system by which patrons can 
report or recommend correction of errors they discover. A Wikipedia model, 
perhaps. Just brainstorming, but it could take the burden of correction 
off the local coders.

  -- Patricia Anderson, pfa at umich.edu

On Mon, 4 Sep 2006, Perry Willett wrote:

> We've been concentrating on releasing our access system first, so we haven't 
> thought much about it. I don't think there's any question about whether our 
> agreement with Google allows it; I think it's something we are allowed to 
> do. The sheer volume of the task is daunting, however.
>
> Perry Willett
> Head, Digital Library Production Service
> 300 Hatcher North
> University of Michigan
> Ann Arbor MI 48109-1205
> Ph: 734-764-8074
> Fax: 734-647-6897
> Email: pwillett at umich.edu
>
>
> On Sat, 2 Sep 2006, Karen Coyle wrote:
>
>> Thank you. And I am SO glad that Michigan shows the underlying text (which 
>> Google doesn't -- at least not currently). Seeing the text, which is the 
>> input to the index, will help librarians and power users better understand 
>> search results and formulate strategies for searching. OCR has some 
>> quirks, and seeing them can only help.
>> 
>> Another thought: any chance that Michigan (or any other Google libraries) 
>> will take on the task of correcting the OCR? (Assuming they have the right 
>> to do so.)
>> 
>> kc
>> 
>> Perry Willett wrote:
>>> Just to clear this up, we're getting both image and OCR files from Google 
>>> for each page. You'll see this specified in our agreement with Google on 
>>> p. 4:
>>> <http://www.lib.umich.edu/mdp/um-google-cooperative-agreement.pdf>
>>> 
>>> Perry Willett
>>> Head, Digital Library Production Service
>>> 300 Hatcher North
>>> University of Michigan
>>> Ann Arbor MI 48109-1205
>>> Ph: 734-764-8074
>>> Fax: 734-647-6897
>>> Email: pwillett at umich.edu
>>> 
>>>> ------------------------------
>>>> Date: Thu, 31 Aug 2006 14:07:43 -0700
>>>> From: Karen Coyle <kcoyle at kcoyle.net>
>>>> Subject: Re: [Web4lib] Google Allows Downloads of out-of-copyright Books
>>>> 
>>>> Interesting example. If you go to page 1 you get a message saying "This
>>>> page does not contain any text recoverable by the OCR engine." Is it
>>>> possible that Michigan is providing OCR "on the fly?"
>>> _______________________________________________
>>> Web4lib mailing list
>>> Web4lib at webjunction.org
>>> http://lists.webjunction.org/web4lib/
>>> 
>>> 
>> 
>> -- 
>> -----------------------------------
>> Karen Coyle / Digital Library Consultant
>> kcoyle at kcoyle.net http://www.kcoyle.net
>> ph.: 510-540-7596
>> fx.: 510-848-3913
>> mo.: 510-435-8234
>> ------------------------------------