[Web4lib] Thumbnails for online books

Binkley, Peter Peter.Binkley at ualberta.ca
Fri Sep 1 14:49:59 EDT 2006


Assuming these PDFs contain only images, not text, and assuming you're
willing to do some scripting, the pdfimages component of xpdf
(http://www.foolabs.com/xpdf/) could be used to extract the first image
from each PDF. Further scripting with an image processing package
could then reduce that image to a thumbnail. Xpdf was written for *nix
systems but there are Windows versions of the batch processing
components, including pdfimages.
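
As a rough illustration of the pdfimages approach, here is a minimal
Python sketch. It assumes pdfimages is on the PATH and accepts the -f,
-l and -j options, and it picks Pillow as the image processing package
(my choice for the example, nothing above requires it). The function
names and the 150x200 thumbnail size are made up for illustration, and
the output file naming differs between pdfimages versions, so the
script just takes the first file it finds.

```python
import glob
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, out_root):
    """Build a pdfimages call restricted to page 1 of the PDF."""
    # -f / -l set the first and last page to scan;
    # -j writes JPEG (DCT) images as .jpg instead of converting them to PPM.
    return ["pdfimages", "-f", "1", "-l", "1", "-j", str(pdf_path), str(out_root)]


def extract_first_image(pdf_path, work_dir):
    """Run pdfimages and return the first file it wrote, or None if none."""
    out_root = Path(work_dir) / Path(pdf_path).stem
    subprocess.run(pdfimages_command(pdf_path, out_root), check=True)
    # Output names look like <root>-000.ppm or <root>-0000.jpg depending on
    # the pdfimages version, so match on the root prefix only.
    matches = sorted(glob.glob(glob.escape(str(out_root)) + "-*"))
    return matches[0] if matches else None


def make_thumbnail(image_path, thumb_path, max_size=(150, 200)):
    """Shrink an extracted image to fit inside max_size, keeping proportions."""
    from PIL import Image  # Pillow: an assumed choice of imaging package

    with Image.open(image_path) as img:
        rgb = img.convert("RGB")  # bitonal scans scale better after conversion
        rgb.thumbnail(max_size)   # resizes in place, preserving aspect ratio
        rgb.save(thumb_path, "JPEG")
```

For a batch run you would loop over the PDFs in a collection, call
extract_first_image() on each, and pass the result to make_thumbnail(),
skipping any file for which no image was found.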

Ghostscript also allows you to write scripts that "print" a pdf to a
graphics file
(http://www.cs.wisc.edu/~ghost/doc/cvs/Use.htm#File_output), and you can
specify the desired page(s). Again, you'd have to write a script to
generate the image of the first page and then process it into a
thumbnail. This would work for text pdfs as well, though I gather
Ghostscript is sometimes finicky about formats.
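
The Ghostscript route might look something like the sketch below. It
assumes the executable is called gs (on Windows it is usually
gswin32c), and it renders page 1 at a very low resolution so that the
output is already roughly thumbnail sized. The function names and the
default of 20 dpi are only illustrative. Rendering at a higher
resolution and scaling down afterwards would probably look better.

```python
import subprocess


def ghostscript_command(pdf_path, image_path, page=1, dpi=20, gs_exe="gs"):
    """Build a Ghostscript call that "prints" one page to a JPEG file."""
    return [
        gs_exe,
        "-dBATCH", "-dNOPAUSE",  # run without prompting, exit when finished
        "-dSAFER", "-q",         # restrict file access, suppress the banner
        "-sDEVICE=jpeg",         # output device: JPEG raster image
        f"-dFirstPage={page}",   # first page to render
        f"-dLastPage={page}",    # last page to render (same page)
        f"-r{dpi}",              # resolution in dots per inch
        f"-sOutputFile={image_path}",
        str(pdf_path),
    ]


def render_first_page(pdf_path, image_path, **options):
    """Run Ghostscript; raises CalledProcessError if it rejects the file."""
    subprocess.run(ghostscript_command(pdf_path, image_path, **options), check=True)
```

At 20 dpi a letter-size page (8.5 x 11 inches) comes out at about
170 x 220 pixels. Catching CalledProcessError in the batch loop would
let the script log the PDFs Ghostscript cannot handle and carry on.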

Hope this helps,

Peter

Peter Binkley
Digital Initiatives Technology Librarian
Information Technology Services
4-30 Cameron Library
University of Alberta Libraries
Edmonton, Alberta
Canada T6G 2J8
Phone: (780) 492-3743
Fax: (780) 492-9243
e-mail: peter.binkley at ualberta.ca


-----Original Message-----
From: web4lib-bounces at webjunction.org
[mailto:web4lib-bounces at webjunction.org] On Behalf Of Danielle Plumer
Sent: Friday, September 01, 2006 10:36 AM
To: web4lib at webjunction.org
Subject: [Web4lib] Thumbnails for online books

Now that the furor over Google Book Search has died down a bit...

One thing I like about their search is the way they're displaying
thumbnails with the cover and a couple of images to make the "books"
seem three-dimensional. Do any other systems have a display like this?

I've been looking for freeware (preferably open-source) to batch-create
thumbnails from the first pages of PDF documents for a couple of large
collections. I don't know if Google is doing that on the fly or if
they've created the thumbnails as part of their process, but the overall
effect is quite nice.

Danielle Cunniff Plumer, Coordinator
Texas Heritage Digitization Initiative
Texas State Library and Archives Commission
512.463.5852 (phone) / 512.936.2306 (fax) dplumer at tsl.state.tx.us