[WEB4LIB] Controlling Bloglines crawler?
Thomas Dowling
tdowling at ohiolink.edu
Wed Nov 10 16:05:13 EST 2004
Thomas Dowling wrote:
>Has anyone encountered a way to throttle (in either sense of the word!)
>the Bloglines RSS crawler? We have a server with a large number of
>feeds. While Bloglines doesn't fetch any individual feed more than once
>an hour, they tend to fire off requests for all of our feeds in rapid
>succession, creating bursts of heavy traffic that can do bad things if
>they come in during an already high load on the server. Any chance
>they'd understand a 304 Not Modified HTTP status?
>
Thanks to the people who responded with hints (more polite than "RTF
Standard!") about the <ttl> and <skipHours> elements in RSS. They do
exactly what I want (respectively, tell an RSS reader not to pull
updates more often than X minutes and tell a reader not to pull feeds
during certain hours of the day).
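
For anyone who wants to try the two elements, here is a minimal sketch in Python of a channel that carries both hints. The function name build_channel and the sample values are my own illustration, not anything from a particular feed generator. Per the RSS 2.0 spec, <ttl> is in minutes and each <hour> under <skipHours> is a number from 0 to 23 in GMT.

```python
import xml.etree.ElementTree as ET


def build_channel(title, link, ttl_minutes, skip_hours):
    """Return an RSS 2.0 document (as a string) with <ttl> and <skipHours>.

    build_channel is a hypothetical helper for illustration only.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title

    # <ttl>: number of minutes a reader may cache the feed before refetching.
    ET.SubElement(channel, "ttl").text = str(ttl_minutes)

    # <skipHours>: hours of the day (0-23, GMT) when readers should not fetch.
    skip = ET.SubElement(channel, "skipHours")
    for hour in sorted(set(skip_hours)):
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range: {hour}")
        ET.SubElement(skip, "hour").text = str(hour)

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Ask readers to wait two hours between fetches and to stay away 14-16 GMT.
    print(build_channel("Example feed", "http://example.org/", 120, [14, 15, 16]))
```

Both elements are only advisory, so they help only with readers that choose to honor them.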
Unfortunately, Bloglines disregards both elements; on inspection, I see
my copy of RSSReader does also. I'll experiment with the 304 HTTP status.
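
A rough sketch of the 304 logic I have in mind, in Python. The name conditional_status and the plain-dict headers are assumptions for illustration; a real server or framework would normally handle this itself. It also assumes the crawler sends If-Modified-Since or If-None-Match at all, which I have not confirmed for Bloglines, and it only does exact ETag matching (no weak-validator handling).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(request_headers, last_modified, etag=None):
    """Return 304 if the client's cached copy looks current, otherwise 200.

    request_headers: dict of request header names to values.
    last_modified:   timezone-aware datetime of the feed's last change.
    etag:            the feed's current ETag string, if the server has one.
    """
    # If-None-Match takes precedence over If-Modified-Since when usable.
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None and etag is not None:
        tags = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if (etag in tags or "*" in tags) else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparseable date: send the full feed
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304

    return 200


if __name__ == "__main__":
    changed = datetime(2004, 11, 10, 21, 5, 13, tzinfo=timezone.utc)
    headers = {"If-Modified-Since": "Wed, 10 Nov 2004 21:05:13 GMT"}
    print(conditional_status(headers, changed))
```

Note that a 304 would not reduce the number of requests in a burst; it would only make each one cheaper to answer, since no feed body is generated or sent.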
Thomas Dowling
tdowling at ohiolink.edu