[Web4lib] Another Google question
Richard Wiggins
richard.wiggins at gmail.com
Wed Jul 6 12:31:27 EDT 2005
Bernie,
A huge problem with that: most Web servers do not return reliable
date information in the HTTP dialog. Many servers don't return date
information at all; many others return the instant of the HTTP transaction
instead of the date that the page was created or changed. Database-driven
servers are especially culpable.
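
To make the failure modes concrete, here is a minimal Python sketch of the check a crawler might run on a response's headers. It assumes the headers are already in a plain dict. The function name, the labels, and the two-second tolerance are hypothetical choices for illustration, not anything Google is known to do. Note that "plausible" only means the value is not obviously the transaction time; it does not prove the date is right.

```python
from email.utils import parsedate_to_datetime


def classify_last_modified(headers, tolerance_seconds=2):
    """Return 'missing', 'transaction-time', or 'plausible' (hypothetical labels)."""
    # HTTP header names are case-insensitive, so normalize the keys first.
    h = {k.lower(): v for k, v in headers.items()}

    # Case 1: the server sent no usable Last-Modified value at all.
    try:
        modified = parsedate_to_datetime(h["last-modified"])
    except (KeyError, TypeError, ValueError):
        return "missing"

    # Without a parsable Date header there is nothing to compare against.
    try:
        served = parsedate_to_datetime(h["date"])
        delta = abs((served - modified).total_seconds())
    except (KeyError, TypeError, ValueError):
        return "plausible"

    # Case 2: Last-Modified is just the instant of the HTTP transaction.
    if delta <= tolerance_seconds:
        return "transaction-time"
    return "plausible"
```

A page generated from a database on each request will typically land in the "transaction-time" bucket, which tells the crawler nothing about when the content actually changed.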
I regard this as one of the major failures of the Web revolution.
With news media sources, Google has more to work with. I've seen some
evidence that Google is partnering with major sources to provide good
metadata as part of the feed, instead of just screen-scraping the Daily
Planet's Web site.
/rich
On 7/6/05, Sloan, Bernie <bernies at uillinois.edu> wrote:
>
> Re: finding brand new web pages. I wish the Google web search worked
> like the Google news search. With the news search, the results come back
> "sorted by relevance". But you can click on "sort by date" to get the
> results displayed in reverse chronological order, regardless of the
> relevance of the individual results.
>
> The Google advanced search page does allow you to limit by date (Return
> Web pages updated in the: past 3 months, past 6 months, and past year),
> but it's definitely not the same thing as the "sort by date" function in
> Google news.
>
> I'm not sure what the Google limit by date function is doing, exactly. I
> entered a search for "service perspectives for the digital library" and
> limited it to pages updated within the past three months. The first
> result was a web page that I put up in 1997 and have not updated since.
>