[WEB4LIB] Re: Subject: Re: In re includes
Avi Rappoport
avirr at lanminds.com
Mon Nov 8 17:20:48 EST 1999
At 1:45 PM -0800 11/8/99, Chris Zagar wrote:
>There is a subtle difference between files that have undergone SSI
>processing and those that haven't. When a web server sends a document,
>one of the headers it can include specifies the date when the document was
>last modified. For an HTML file that has not undergone SSI processing,
>this is the date when the file was actually last modified. If a document
>undergoes SSI processing, this header is normally omitted, since the
>document has become dynamic.
Good point!
This also affects search engine indexing robots: if the robot can't
get the original modification date (i.e., the file date from the web
server or, even better, a DC.date field in the page), most search
engines assume that the page is new and store the date of indexing
instead. This is why it's so hard to do proper date-range searching
in web search engines.
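Here is a rough sketch of that fallback, in Python. This is only an illustration, not how any particular search engine works: the function name pick_document_date, the lower-cased header dict, and the ISO-style DC.date value are all assumptions of mine.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser


class _DCDateFinder(HTMLParser):
    """Remembers the content of the first <meta name="DC.date"> tag."""

    def __init__(self):
        super().__init__()
        self.dc_date = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.dc_date is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "dc.date":
            self.dc_date = attr.get("content")


def _as_utc(dt):
    # Treat dates without a time zone as UTC so results are comparable.
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=timezone.utc)


def pick_document_date(headers, html, indexed_at):
    """Return (date, source) for a fetched page.

    headers    -- response headers, keys assumed to be lower-cased
    html       -- page body as text
    indexed_at -- when the robot fetched the page (the last resort)
    """
    # 1. Prefer an explicit DC.date field in the page itself.
    finder = _DCDateFinder()
    finder.feed(html)
    if finder.dc_date:
        try:
            # Assumes an ISO-style value such as 1999-11-08.
            return _as_utc(datetime.fromisoformat(finder.dc_date.strip())), "dc.date"
        except ValueError:
            pass

    # 2. Otherwise use the Last-Modified header, if the server sent one.
    last_modified = headers.get("last-modified")
    if last_modified:
        try:
            return _as_utc(parsedate_to_datetime(last_modified)), "last-modified"
        except (TypeError, ValueError):
            pass

    # 3. Nothing usable: the page looks "new" and gets the indexing date.
    return indexed_at, "indexed"
```

For an SSI-processed page with no Last-Modified header and no DC.date field, this falls straight through to step 3, which is exactly the case that breaks date-range searching.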
I think some servers (maybe Apache?) have ways around the date
problem, but it's a biggie for indexers.
Avi
________________________________________________________________
Avi Rappoport, Search Tools Maven: <mailto:avirr at lanminds.com>
Guide to Site Indexing and Local Search Engines: <http://www.searchtools.com>