PDF files, part II: How does IE know?

Christopher Handy charta at inetdirect.net
Sat Jul 4 01:29:32 EDT 1998


On 7/2/98, Rich.Harrington at co.hennepin.mn.us wrote:

|>Hi, more on the wackiness of linking to PDF files.
|>
|>   I told you all before how our machines with IE4 -- but only some of our
|>IE3.x machines -- were able to deal with a link to a PDF file even if the
|>target didn't have a .pdf extension (see <
|>http://supct.law.cornell.edu/supct/index.html >.  None of the PDF links
|>there have a .pdf extension).
|>

<snip>

|>
|>   Perhaps you are thinking that I am fixating a little too much on this
|>little problem, and perhaps you are right.  But I do hope to gain some
|>understanding of how the browser works, so if any of you have some insight,
|>I'd sure be interested to hear it.

The rules of the game are that any time a server sends a file -- of
whatever type -- to a client browser it also sends along response headers,
including a Content-Type header (e.g. application/pdf in the case of a
PDF file), so that the browser knows what to do with it. Consequently, while
the extensions that one ordinarily sees on files downloaded over the
Internet are very handy, I don't think they're theoretically necessary in
order for files to be correctly interpreted by a standards-compliant
browser. The browser should figure this out from the Content-Type header.
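
To make the idea concrete, here is a minimal sketch in Python of a client
choosing a handler from the Content-Type header alone, never looking at the
URL. The sample headers, the handler table, and the function name
choose_handler are all assumptions made up for illustration; this is not how
any particular browser is implemented.

```python
from email.parser import Parser

# Hypothetical response headers, as a server might send them for a PDF
# served from a URL that has no .pdf extension.
RAW_HEADERS = (
    "Content-Type: application/pdf\r\n"
    "Content-Length: 48213\r\n"
    "\r\n"
)

# Hypothetical handler table keyed by content-type, not by file extension.
HANDLERS = {
    "text/html": "render in browser window",
    "application/pdf": "hand off to PDF viewer",
}


def choose_handler(raw_headers: str) -> str:
    """Pick a handler using only the Content-Type header."""
    headers = Parser().parsestr(raw_headers, headersonly=True)
    # get_content_type() lowercases the value and drops parameters
    # such as "; charset=...".
    content_type = headers.get_content_type()
    return HANDLERS.get(content_type, "offer to save to disk")


print(choose_handler(RAW_HEADERS))  # prints "hand off to PDF viewer"
```

Note that the URL never enters into the decision, which is why a missing
.pdf extension should not matter to a browser that behaves this way.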

Probably the file extensions are more important on the server side, since
the server is responsible for determining what content-type designation to
assign to each file. But even on a server I don't believe that
content-types need to map rigidly to specific file extensions. I think the
mapping can be customized. In the case you mention you'll notice that the
HTML files are all in a directory called "/html/..." and the PDF files
all in a directory called "/pdf/..." (surprise). My guess is that the
server assigns the content-type by directory rather than by extension.
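
A rough sketch of what such a server-side rule could look like, in Python.
This is only a guess at the approach, not the actual configuration of the
Cornell server; the two lookup tables, the sample paths, and the function
name content_type_for are hypothetical.

```python
import posixpath

# Hypothetical rule: any file under a directory with one of these names
# gets the matching content-type, whatever its own name looks like.
DIRECTORY_TYPES = {
    "html": "text/html",
    "pdf": "application/pdf",
}

# Conventional fallback: map by file extension.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".pdf": "application/pdf",
}


def content_type_for(path: str) -> str:
    """Return the content-type a server might assign to a request path."""
    segments = path.strip("/").split("/")
    # Check the directory segments first (everything except the file name).
    for segment in segments[:-1]:
        if segment in DIRECTORY_TYPES:
            return DIRECTORY_TYPES[segment]
    # Otherwise fall back to the extension, then to a generic binary type.
    extension = posixpath.splitext(path)[1].lower()
    return EXTENSION_TYPES.get(extension, "application/octet-stream")


print(content_type_for("/supct/pdf/opinion-1"))  # made-up path, no extension
print(content_type_for("/docs/readme.html"))
```

With a rule like the first one, a file needs no extension at all for the
server to label it correctly.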

You'll recall that in Explorer's "file helper" settings each file type has
associated with it not only the content-type (MIME), but also a file
extension (suffix). I think the browser uses this latter information to
guess at files in cases where the content-type is ambiguous or unknown. It
may even be the case that Microsoft allows the extensions to override or
mask the content-types. One way or the other, the problems you describe are
probably related somehow to this sort of content-type/extension conflict.
Have you tried simply deleting the extension associated with the PDF file
type in the "file helper" settings of the misbehaving IE3? Maybe this would
force the browser to pay attention to the content-type. Might be worth a try.
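
The kind of content-type/extension conflict described above can be sketched
in Python as follows. This is purely illustrative and is not IE's actual
algorithm, which I don't know; the helper table, the set of "ambiguous"
types, the extension_overrides switch, and the example.org URLs are all
assumptions.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical "file helper" table: extension -> content-type.
HELPER_EXTENSIONS = {
    ".pdf": "application/pdf",
    ".html": "text/html",
}

# Content-types treated here as too vague to act on (an assumption).
AMBIGUOUS = {None, "", "application/octet-stream"}


def resolve_type(content_type, url, extension_overrides=False):
    """Decide which type to act on, given a header value and a URL."""
    extension = posixpath.splitext(urlsplit(url).path)[1].lower()
    guessed = HELPER_EXTENSIONS.get(extension)
    # Use the extension when the header is vague, or always when the
    # (hypothetical) override behaviour is switched on.
    if guessed and (extension_overrides or content_type in AMBIGUOUS):
        return guessed
    return content_type or "application/octet-stream"


# Header is clear and the URL has no extension: the header wins.
print(resolve_type("application/pdf", "http://example.org/pdf/opinion-1"))
# Header is vague: the extension is used as a fallback guess.
print(resolve_type("application/octet-stream", "http://example.org/a.pdf"))
```

If a browser behaved as though extension_overrides were switched on, removing
the extension entry from the helper table would leave only the header to go
on, which is the reasoning behind the suggestion above.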

Chris Handy
charta at inetdirect.net




