[WEB4LIB] Re: Subject: Re: In re includes

Avi Rappoport avirr at lanminds.com
Mon Nov 8 17:20:48 EST 1999


At 1:45 PM -0800 11/8/99, Chris Zagar wrote:
>There is a subtle difference between files that have undergone SSI
>processing and those that haven't.  When a web server sends a document,
>one of the headers it can include specifies the date when the document was
>last modified.  For an HTML file that has not undergone SSI processing,
>this is the date when the file was actually last modified.  If a document
>undergoes SSI processing, this header is normally omitted, since the
>document has become dynamic.

Good point!

This may also apply to search engine indexing robots -- if the web 
server doesn't send the original modified date (i.e., the file date, 
or even better, a DC.date field), most search engines assume that the 
page is new and store the date of indexing instead.  This is why it's 
so hard to do proper date-range searching in web search engines.
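
To make the fallback concrete, here is a rough Python sketch of how an 
indexer might pick a date for a page.  The function name 
pick_document_date, the source labels, and the order of preference 
(DC.date, then Last-Modified, then the time of indexing) are assumptions 
for illustration only, not how any particular search engine works.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser


class _DCDateFinder(HTMLParser):
    """Remembers the content of the first <meta name="DC.date"> tag."""

    def __init__(self):
        super().__init__()
        self.dc_date = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.dc_date is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "dc.date":
            self.dc_date = attr.get("content")


def pick_document_date(headers, html, indexed_at):
    """Return (date, source) for a fetched page (hypothetical helper).

    headers    -- dict of HTTP response headers
    html       -- page body as a string
    indexed_at -- timezone-aware datetime of the crawl
    """
    # 1. Prefer an author-supplied DC.date field (assumed ISO 8601 here).
    finder = _DCDateFinder()
    finder.feed(html)
    if finder.dc_date:
        try:
            date = datetime.fromisoformat(finder.dc_date.strip())
            if date.tzinfo is None:
                date = date.replace(tzinfo=timezone.utc)
            return date, "dc.date"
        except ValueError:
            pass  # unparseable value, fall through

    # 2. Otherwise use Last-Modified, which SSI pages often lack.
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("last-modified")
    if raw:
        try:
            date = parsedate_to_datetime(raw)
            if date.tzinfo is None:
                date = date.replace(tzinfo=timezone.utc)
            return date, "last-modified"
        except (TypeError, ValueError):
            pass  # malformed header, fall through

    # 3. Nothing usable: the page looks "new" as of indexing time.
    return indexed_at, "indexed"
```

With an SSI page that carries neither a Last-Modified header nor a 
DC.date field, the third branch is the one that fires, so the stored 
date is the date of indexing rather than the date of writing.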

I think some servers (maybe Apache?) have ways around the date 
problem, but it's a biggie for indexers.

Avi
________________________________________________________________
Avi Rappoport, Search Tools Maven: <mailto:avirr at lanminds.com>
Guide to Site Indexing and Local Search Engines: <http://www.searchtools.com>

