Google as Acrobat reader (!)

gary price gprice at gwu.edu
Fri Nov 16 23:27:45 EST 2001


Rich was on target with this observation. I've just confirmed the change
with a contact at the GooglePlex.

Recently, Google changed from an option to view PDF content converted to text
("view as text") to a conversion to HTML ("view as html").

The converted pages look much better, in both formatting and display, than
they did with the "view as text" option.

Btw, the "view as html" option is also available with 4 of the 5 formats
Google has just started crawling and making available:
.doc
.ppt
.xls
.rtf

PostScript docs are converted to text instead.

cheers,
gary



-----Original Message-----
From: Richard Wiggins [mailto:rich at richardwiggins.com]
Sent: Thursday, November 15, 2001 1:35 PM
To: Multiple recipients of list
Subject: [WEB4LIB] Google as Acrobat reader (!)



I just discovered something. Google now offers a "View as HTML" link for
hit list entries corresponding to PDF files. To see examples, just go to
Google, search for "PDF", and scroll for a hit list item that includes the
"View as HTML" link.


Does anyone know if this is new behavior? I know they've been indexing PDFs
for some months now, but I don't recall this option.


I suppose this just falls out of the fact that they translate the PDF to
HTML and feed that to their parser and indexer, and that they also cache
those intermediate files, but I think it's really cool.


/rich


(PS -- there must be a library technology angle there somewhere. Suppose
your patrons use Web TV to read PDF files that your library hosts. There.
:-) )


Richard Wiggins
Writing, Speaking, and Consulting on Internet Topics
rich at richardwiggins.com www.richardwiggins.com





