[Web4lib] Re: Web4lib Digest, Vol 44, Issue 23

Stewart Baker stewart.c.baker at gmail.com
Wed Nov 26 13:40:11 EST 2008


Hi John,

Image editing/enhancing:

   - Adobe Photoshop
   (http://tryit.adobe.com/us/cs4/photoshopextended/index.html?sdid=DOPF)
   This is what most people use, and it is excellent for image
   editing/enhancing.
   - Photoshop Elements (http://www.adobe.com/products/photoshopelwin/)
   A considerably cheaper version of Photoshop, and also a little more
   user-friendly for first-time users.
   - GIMP (http://www.gimp.org/)
   An open-source and completely free photo-editing suite that has much the
   same functionality as Photoshop.

As far as OCRing old newspapers goes, there are several programs that
should work fine.

   - ABBYY FineReader - http://finereader.abbyy.com/
   This is the one that my friends who digitise books swear by.
   - OmniPage Pro - http://www.nuance.com/omnipage/professional/
   I personally used this one to turn scanned TIFF files of US Civil War-era
   newspapers into full-text-searchable PDFs
   (http://www.sc.edu/library/digital/collections/newsouth.html).

If the originals (or microfilms) are in particularly poor condition, though,
you can expect to do a whole lot of typing to correct OCR errors, whichever
software you use.  We scanned our originals at about 600 DPI, so it wasn't
too awful, but it still took us rather a long time to OCR all 60 of the
newspaper's issues.  Depending on the specific page, it took anywhere from
30 minutes (rare) to 2 or 3 hours to fix all the errors.
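
If it helps with planning, here is a rough back-of-the-envelope sketch in
Python for turning that per-page range into a total.  The 30-minute and
3-hour bounds come from the figures above; the function name and the
4-pages-per-issue figure are placeholders made up for illustration, so plug
in your own page count.

```python
def correction_hours(pages, min_minutes=30, max_minutes=180):
    """Return (low, high) estimates, in hours, for hand-correcting OCR errors.

    The defaults reflect the per-page range mentioned above
    (30 minutes at best, up to about 3 hours at worst).
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    low = pages * min_minutes / 60
    high = pages * max_minutes / 60
    return low, high


# Hypothetical example: 60 issues at 4 pages each.
issues = 60
pages_per_issue = 4  # assumption for illustration only, not a real figure
total_pages = issues * pages_per_issue
low, high = correction_hours(total_pages)
print(f"{total_pages} pages: roughly {low:.0f} to {high:.0f} hours")
```

Since the 30-minute pages were rare, the real total is likely to sit much
closer to the high end of the range than the low end.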

Hope this helps, and feel free to shoot me an e-mail if you have any other
questions.  I'd be glad to help out.

-- 
Stewart Baker
California State University Dominguez Hills
Library Webmaster
sbaker at csudh.edu


> ---------- Forwarded message ----------
> From: John Fitzgibbon <jfitzgibbon at Galwaylibrary.ie>
> To: "web4lib at webjunction.org" <web4lib at webjunction.org>
> Date: Wed, 26 Nov 2008 09:59:06 +0000
> Subject: [Web4lib] sharpening images and OCR
> Hi,
>
> We have just captured photographs from old newspapers that are on
> microfilm. We are about to put these images on the Web. What software is
> best for enhancing these images? Sometimes, there are black lines going down
> through the image.
>
> A second related question is this: we are hoping to OCR some articles from
> a TIFF image thus created. What OCR package costing less than 2,000 dollars
> might be best for this task? I suspect that none of them will be good enough
> because the original newspaper was in such a poor state when microfilmed,
> but I thought I would investigate it anyway.
>
> Any advice would be much appreciated.
>
>
> Regards John
>
> John Fitzgibbon
>
> w: www.galwaylibrary.ie
> e: info at galwaylibrary.ie
> p: 00 353 91 562471
> f: 00 353 91 565039
>

