Pdf to html -- jpeg to pdf??

Leo Robert Klein leo_klein at baruch.cuny.edu
Wed Feb 16 14:12:39 EST 2000


On Tue, 15 Feb 2000, "Sidey, Carolyne L" wrote:

>This topic reminded me of something I have been trying to do and usually
>giving up on.  I would like to create a single .pdf file from multiple
>scanned files.  I have a hp scanner with an automatic document handler.
>multi page documents are easy to scan, but each page is put in a separate
>file. I want to make these one pdf file.
>
>help
>
>
>I have adobe acrobat writer.
>

Hi Carolyne:

We've been using Acrobat on a pilot Electronic Reserves project.  The first
thing we had to get used to is that Acrobat comes from the world of print
and most of its presets and how it functions seem a universe away from
standard web applications.

Strangely for an Adobe product, the documentation leaves a lot to be
desired--particularly for direct scanning and production of PDF files.  You
get the impression that the Adobe people really think of their product as
something you use once you've finished creating a document, either in a
word processing app or in a DTP product.  This makes sense where you're the
original author of the piece or have access to the original, but for those
of us who may be getting the article either straight out of a journal or
from a photocopy, this is of no help whatsoever.

In any case, part of the challenge of developing a pilot project was
devising a workflow with the fewest possible steps, because the process is
to be handled by Access Services staff who have little or no scanning or
HTML experience.

The good news is that Acrobat is TWAIN-compliant.  This means you can scan
directly into the program (black and white at 300 dpi seems to work best).
If you do it this way, you can make your document as long as you want.  Try
it--it works.  All you have to do then is save the thing out as a PDF.
Unfortunately, there's yet another step since the file you've just created
is likely to be huge.

In essence, you have to run the thing through Distiller.  Running the thing
through Distiller is perhaps the strangest, least intuitive process you're
likely to meet.  In other applications, when you want to save out a
compressed version of anything--which is what you want Distiller to do--you
simply save the thing out.  Some applications require you to export the file.

In Acrobat, you have to "print" to Distiller.  Once you do this, a second,
slimmer file will be created--provided you've successfully set Distiller up
to do so beforehand.  Setting Distiller up beforehand is equally strange.
We've found that setting everything to 150 dpi with the compression on B/W
images set to CCITT Group 4 produces the most acceptable results.  Even
then, you're producing PDF files that may run to 775k for ten
pages--scary.
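
To put those numbers in perspective, here is a rough back-of-the-envelope
calculation in Python.  It is only a sketch: it assumes US Letter pages
(8.5 x 11 in), 1 bit per pixel, and that "775k" means 775 x 1024 bytes.
None of those assumptions come from our actual scans, and the function name
is made up for the illustration.

```python
# Rough size estimate for black-and-white (1 bit per pixel) page scans.
# Assumption: US Letter pages (8.5 x 11 in). Adjust for your own paper size.

def raw_bilevel_bytes(dpi, width_in=8.5, height_in=11.0):
    """Uncompressed size in bytes of one 1-bit page at the given resolution."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each pixel row is padded up to a whole number of bytes.
    row_bytes = (width_px + 7) // 8
    return row_bytes * height_px

pages = 10
reported_pdf_bytes = 775 * 1024  # "775k for ten pages", reading k as 1024 bytes

raw_300 = raw_bilevel_bytes(300) * pages
raw_150 = raw_bilevel_bytes(150) * pages

print(f"{pages} pages at 300 dpi, uncompressed: {raw_300 / 1024:.0f} KB")
print(f"{pages} pages at 150 dpi, uncompressed: {raw_150 / 1024:.0f} KB")
print(f"Implied compression at 150 dpi: {raw_150 / reported_pdf_bytes:.1f}:1")
```

Under those assumptions, halving the resolution cuts the raw pixel data to
roughly a quarter, and the 775k result works out to only about a 3:1
reduction of the 150 dpi data, which may be why the files still feel large.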

We're still playing around with the options and we may get the files scaled
down a bit further.  But there is little documentation for this kind of
process--simply taking the scanned image and putting it into a PDF shell,
which is what a library would most typically use the app for--so you have
to play around with it a lot.

LEO

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Leo Robert Klein                     17 Lexington Ave, Box H0520
Web Coordinator &                    New York, NY. 10010
Digital Resources Developer          tel: (212) 802-2373
Newman Library/Baruch College        fax: (212) 802-2360
http://newman.baruch.cuny.edu        email: Leo_Klein at baruch.cuny.edu
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

