[Web4lib] Google Allows Downloads of out-of-copyright Books
Patricia F Anderson
pfa at umich.edu
Mon Sep 4 15:09:47 EDT 2006
Perhaps take a folksonomy approach -- have a system by which patrons can
report or recommend correction of errors they discover. A wikipedia model,
perhaps. Just brainstorming, but it could take the burden of correction
off the local coders.
-- Patricia Anderson, pfa at umich.edu
On Mon, 4 Sep 2006, Perry Willett wrote:
> We've been concentrating on releasing our access system first, so we haven't
> thought much about it. I don't think there's any issue about whether our
> agreement with Google will allow us--I think it's something we are allowed to
> do. The sheer volume of the task is daunting, however.
>
> Perry Willett
> Head, Digital Library Production Service
> 300 Hatcher North
> University of Michigan
> Ann Arbor MI 48109-1205
> Ph: 734-764-8074
> Fax: 734-647-6897
> Email: pwillett at umich.edu
>
>
> On Sat, 2 Sep 2006, Karen Coyle wrote:
>
>> Thank you. And I am SO glad that Michigan shows the underlying text (which
>> Google doesn't -- at least not currently). Seeing the text, which is the
>> input to the index, will help librarians and power users better understand
>> search results and formulate strategies for searching. OCR has some
>> quirks, and seeing them can only help.
>>
>> Another thought: any chance that Michigan (or any other Google libraries)
>> will take on the task of correcting the OCR? (Assuming they have the right
>> to do so.)
>>
>> kc
>>
>> Perry Willett wrote:
>>> Just to clear this up, we're getting both image and OCR files from Google
>>> for each page. You'll see this specified in our agreement with Google on
>>> p. 4:
>>> <http://www.lib.umich.edu/mdp/um-google-cooperative-agreement.pdf>
>>>
>>> Perry Willett
>>> Head, Digital Library Production Service
>>> 300 Hatcher North
>>> University of Michigan
>>> Ann Arbor MI 48109-1205
>>> Ph: 734-764-8074
>>> Fax: 734-647-6897
>>> Email: pwillett at umich.edu
>>>
>>>> ------------------------------
>>>> Date: Thu, 31 Aug 2006 14:07:43 -0700
>>>> From: Karen Coyle <kcoyle at kcoyle.net>
>>>> Subject: Re: [Web4lib] Google Allows Downloads of out-of-copyright Books
>>>>
>>>> Interesting example. If you go to page 1 you get a message saying "This
>>>> page does not contain any text recoverable by the OCR engine." Is it
>>>> possible that Michigan is providing OCR "on the fly?"
>>> _______________________________________________
>>> Web4lib mailing list
>>> Web4lib at webjunction.org
>>> http://lists.webjunction.org/web4lib/
>>>
>>>
>>
>> --
>> -----------------------------------
>> Karen Coyle / Digital Library Consultant
>> kcoyle at kcoyle.net http://www.kcoyle.net
>> ph.: 510-540-7596
>> fx.: 510-848-3913
>> mo.: 510-435-8234
>> ------------------------------------