PDF usability and best practices?

Thomas Edelblute TEdelblute at ANAHEIM.NET
Fri Mar 16 14:48:43 EDT 2012


As an experiment, I scanned a single 8 ½ x 11 inch page of text at three different resolutions, measured in dpi (dots per inch), and came up with the following file sizes.

200 dpi = 94 KB (note: text tends to be broken up at this resolution, but it is recommended for e-mailing due to the small file size)
300 dpi = 178 KB
400 dpi = 394 KB
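
For a rough sense of how these sizes relate to resolution, the Python sketch below computes the pixel count of an 8.5 x 11 inch page at each dpi and prints it next to the reported file size. The 1-bit (black-and-white) uncompressed size is an assumption for illustration only, since the color mode and compression used by the scanner are not stated, and the function names are hypothetical.

```python
# Page dimensions in inches (8.5 x 11, as in the scan described above).
WIDTH_IN = 8.5
HEIGHT_IN = 11.0

# File sizes reported in the message, in KB, keyed by scan resolution.
REPORTED_KB = {200: 94, 300: 178, 400: 394}


def pixel_count(dpi):
    """Total pixels on the page when scanned at the given dots per inch."""
    return round(WIDTH_IN * dpi) * round(HEIGHT_IN * dpi)


def raw_bilevel_bytes(dpi):
    """Uncompressed size in bytes, ASSUMING 1 bit per pixel (not stated in the source)."""
    return pixel_count(dpi) // 8


for dpi, kb in REPORTED_KB.items():
    raw_kb = raw_bilevel_bytes(dpi) / 1024
    print(f"{dpi} dpi: {pixel_count(dpi):,} pixels, "
          f"raw 1-bit about {raw_kb:.0f} KB, reported {kb} KB")
```

Pixel count grows with the square of the dpi, so a 400 dpi scan has four times the pixels of a 200 dpi scan. That is in line with the reported sizes, where 394 KB is roughly four times 94 KB.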

From: Web technologies in libraries [mailto:WEB4LIB at LISTSERV.ND.EDU] On Behalf Of Wilhelmina Randtke
Sent: Friday, March 16, 2012 11:09 AM
To: WEB4LIB at LISTSERV.ND.EDU
Subject: Re: [WEB4LIB] PDF usability and best practices?

I know that overall file size is key to whether people will be able to easily download and open a pdf...

... but are there any guidelines or recommendations about document size for PDFs?


My concerns are:
The access copy has to be the preservation copy, because it's highly unlikely someone will think to preserve it separately.  So, for scanned images, most of the flexibility in reducing file size will come from reducing resolution, and so permanently losing some of the representation of the original paper object.  Nevertheless, a larger file size will result in access issues: longer downloads, a higher chance of an error during the download process, and clunkier handling of the PDF in any viewer.
For a representation of a voluminous document, the tradeoff is that having multiple open documents is confusing, but having a single 5000-page document open is also confusing.

I would like to know whether anyone knows of studies on this tradeoff between image quality and file size.  I would also like examples of ways of presenting a very long text by breaking that text into segments - how are the segments arranged to allow meaningful retrieval of the desired part of the text, and how can the pdfs be presented in ways that aid navigating across the entire text?

-Wilhelmina

On Fri, Mar 16, 2012 at 7:58 AM, Bob Rasmussen <ras at anzio.com> wrote:
This is a partial answer, based on what I know of PDFs' internal
structures.

A key consideration is whether the PDF is "linearized". If it is, then the
browser does not need to download the entire PDF before some of it (the
first page) can be viewed.
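
One way to check a file for this is sketched below in Python. It relies on the fact that a linearized PDF carries a linearization parameter dictionary, containing a /Linearized key, within the first 1024 bytes of the file. This is only a heuristic under that assumption: it does not validate the hint tables or confirm that a later incremental save left the file properly linearized, and the function names are hypothetical.

```python
def head_is_linearized(head: bytes) -> bool:
    """Heuristic: True if the leading bytes look like a linearized PDF header."""
    head = head[:1024]  # the linearization dictionary must appear this early
    return b"%PDF-" in head and b"/Linearized" in head


def looks_linearized(path: str) -> bool:
    """Apply the heuristic to the first 1024 bytes of the file at `path`."""
    with open(path, "rb") as f:
        return head_is_linearized(f.read(1024))


# Small in-memory demonstration with a made-up header fragment.
sample = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 1234 >>\nendobj\n"
print(head_is_linearized(sample))
```

For a real file, calling looks_linearized("some.pdf") (a placeholder path) gives a quick first indication before uploading a large scan.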

Other factors to consider:

* Overall file size (as mentioned by someone else)
* Number and type of embedded fonts
* Whether it's "searchable". When pages of text are scanned, each is an
image. When it has had OCR done on it, it becomes searchable (and could
also be read aloud by screen reader software).

On Thu, 15 Mar 2012, Wilhelmina Randtke wrote:

> I am looking for recommendations or guidelines on best practices for
> displaying PDFs on the internet.
>
> What I don't want:  I am not looking for ADA compliance.  I am able to find
> that.
>
> What I do want:  I am looking for anything about ability to access the
> document - so:
>   -  How large a file size is acceptable?  (I anticipate U.S. visitors to
> the project this is for.)
>   -  How long in pages a document can be before it becomes overwhelming to
> a reader?
>   -  Any size constraints imposed by different PDF viewing devices and
> connection speeds?
>   -  Ways to represent a really long document - like a novel - and
> represent it in meaningful ways in PDF format, without doing a single giant
> PDF (and without becoming married to technology other than PDF-A and html)?
>
> Any pointers to guidelines or best practices on displaying voluminous PDFs
> would be appreciated.
>
> -Wilhelmina Randtke
>
> ============================
>
> To unsubscribe: http://bit.ly/web4lib
>
> Web4Lib Web Site: http://web4lib.org/
>
> 2012-03-15
>
Regards,
....Bob Rasmussen,   President,   Rasmussen Software, Inc.

personal e-mail: ras at anzio.com
 company e-mail: rsi at anzio.com
         voice: (US) 503-624-0360 (9:00-6:00 Pacific Time)
           fax: (US) 503-624-0760
           web: http://www.anzio.com
 street address: Rasmussen Software, Inc.
                10240 SW Nimbus, Suite L9
                Portland, OR  97223  USA

============================

To unsubscribe: http://bit.ly/web4lib

Web4Lib Web Site: http://web4lib.org/

2012-03-16
