[Web4lib] One consequence of the digitization
programs
Jonathan Gorman
jtgorman at uiuc.edu
Tue Nov 6 08:09:15 EST 2007
---- Original message ----
>Date: Tue, 6 Nov 2007 13:26:26 +0100
>From: "Anders Ericson" <anders.ericson at norskbibliotekforening.no>
>Subject: [Web4lib] One consequence of the digitization programs
>To: <web4lib at webjunction.org>
>
>Libraries and others do a lot of digitization these days. But one of the
>(unintended) consequences is an increasing amount of very easily available
>texts in Google - however old and often unreliable and false information.
>(Not unlike the new, but you get my point?)
>
>I'm looking for digitization efforts that include some "consumer's
>information" on the quality of digitized documents. Like links to Wikipedia
>articles or librarians' input.
>
I guess I'm a little confused about what you mean by
quality. It seems you're asking whether anyone has reviewed
or vetted the material. That's an interesting question;
after all, bad books have always been published, and maybe
the author was considered a quack. I'm not aware of any
projects attempting anything like that. The interesting thing
I've found as I've played with these works, though, is that
I keep finding references to the books I'm looking for in
other books. At some point we might be able to data-mine
those connections, but it's a tricky issue.
Of course, I might be misreading you. There are usually
quality metrics associated with scanning books, such as the
number of OCR errors per page or the average run of correctly
converted words. If you're asking whether anyone has done
sampling or displays estimated error rates, I don't think so.
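To make the sampling idea concrete, here is a minimal sketch of how one
might estimate a character error rate from a sample of pages. Everything
here is hypothetical: the page texts are made up, and no digitization
project publishes exactly this; it just illustrates comparing OCR output
against hand-corrected "ground truth" text.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def char_error_rate(ocr_text: str, truth_text: str) -> float:
    """Character error rate: edit distance normalized by ground-truth length."""
    if not truth_text:
        return 0.0
    return levenshtein(ocr_text, truth_text) / len(truth_text)

# Hypothetical sample: (OCR output, manually verified transcription) per page.
sample = [
    ("Tbe quick brown fox", "The quick brown fox"),
    ("jumps over the lazy d0g", "jumps over the lazy dog"),
]
rates = [char_error_rate(ocr, truth) for ocr, truth in sample]
print(f"mean CER over sample: {sum(rates) / len(rates):.3f}")
```

A project could run something like this over a random sample of pages per
volume and display the resulting estimate alongside the digitized text.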
Sorry, no answers, but good questions ;). Good luck finding
more information.
Jon Gorman
>
>Anders Ericson,
>Web editor, Norwegian Libr. Assoc.
>
>_______________________________________________
>Web4lib mailing list
>Web4lib at webjunction.org
>http://lists.webjunction.org/web4lib/