[Web4lib] Google Allows Downloads of out-of-copyright Books

Tyson Tate tysontate at gmail.com
Thu Aug 31 15:49:21 EDT 2006


99.9%, minimum. Close to 100% if you can get good scans.

If you have good-quality scans, you can get near-100% accuracy. Back
when I published a literary zine, I used OmniScan to scan in authors'
work submitted on paper, and I almost never had any OCR errors in my
scans. The only issues I encountered were with odd formatting and
minor confusion between similar characters, such as en and em dashes.

However, when the quality of your scan is low, the OCR will struggle.

-Tyson

On 8/31/06, Karen Coyle <kcoyle at kcoyle.net> wrote:

> OCR companies claim that they can get a 98-99.9% accuracy rate directly
> out of their software. (One of the main companies is Abbyy:
> http://www.abbyy.com) They also claim to be able to OCR 177 languages.
> It's pretty impressive, but remember that 99.9% means that there is,
> on average, one bad character for every 1000 characters, which means
> about one "typo" per page. I don't know how this compares to the
> readers used by the blind (is Kurzweil still the main one?).
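
The arithmetic in the quoted message can be checked with a rough sketch like the one below. The function name expected_errors and the CHARS_PER_PAGE value are made up for illustration; 1000 characters per page is simply the figure implied by "one typo per page", and a dense printed page may hold two or three times that, which would scale the error count accordingly.

```python
def expected_errors(char_accuracy: float, chars: int) -> float:
    """Expected number of misrecognized characters among `chars` characters,
    given a per-character accuracy rate between 0 and 1."""
    return (1.0 - char_accuracy) * chars


# Assumption: 1000 characters per page, as implied by the quoted message.
CHARS_PER_PAGE = 1000

# The accuracy range the OCR vendors are said to claim.
for accuracy in (0.98, 0.99, 0.999):
    errors = expected_errors(accuracy, CHARS_PER_PAGE)
    print(f"{accuracy:.1%} accuracy: about {errors:.1f} bad characters per page")
```

At the low end of the claimed range (98%), the same page would carry about 20 bad characters, so the gap between 98% and 99.9% is a lot bigger than it sounds.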

