[Web4lib] Measuring RSS activity in Apache log files

Thomas Dowling tdowling at ohiolink.edu
Wed Nov 1 09:56:38 EST 2006



On 11/1/2006 8:56 AM, Robert Menk wrote:

> Do RSS readers actually fetch the xml file repeatedly and thus show up
> in the log with an "artificially" high hit count?

That's often the case.  Most RSS readers and aggregators re-GET the RSS
URL every few hours.

On the other hand, web feed aggregators are likely to download your feed
just once for all of their users who subscribe to it, which will drive
down the hit numbers.  This is web stats in a microcosm: all you can
really be sure of is climbing or falling hits over time.
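
To watch that trend, the feed requests can be tallied straight from the
access log.  What follows is only a minimal Python sketch, assuming the
log uses Apache's default common or combined LogFormat.  The path
/feed.xml and the name feed_hits_by_day are hypothetical placeholders,
not anything Apache defines.  It counts requests per day and keeps the
status code in the key, so full downloads (200) can be told apart from
other responses.

```python
import re
from collections import Counter
from datetime import datetime

# Assumes the default common/combined LogFormat; adjust for a custom one.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def feed_hits_by_day(lines, feed_path="/feed.xml"):
    """Count requests for feed_path, keyed by (ISO date, status code)."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m.group("path") != feed_path:
            continue  # unparseable line, or a request for something else
        # Apache timestamps look like 01/Nov/2006:09:56:38 -0500
        stamp = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        counts[(stamp.date().isoformat(), m.group("status"))] += 1
    return counts


# Made-up sample lines, for illustration only.
sample_log = [
    '10.0.0.1 - - [01/Nov/2006:08:00:00 -0500] "GET /feed.xml HTTP/1.1" 200 5120',
    '10.0.0.1 - - [01/Nov/2006:11:00:00 -0500] "GET /feed.xml HTTP/1.1" 304 -',
    '10.0.0.2 - - [01/Nov/2006:11:05:00 -0500] "GET /index.html HTTP/1.1" 200 900',
    '10.0.0.3 - - [02/Nov/2006:07:30:00 -0500] "GET /feed.xml HTTP/1.1" 200 5120',
]
daily = feed_hits_by_day(sample_log)
```

Comparing these daily totals week over week is more meaningful than
reading any single day's number as a subscriber count.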

> Or do they just
> compare the date & time stamp they currently have and only fetch the
> file when they're out of synch?

I know that Bloglines sends an If-Modified-Since header, and they claim
to honor 304 Not Modified responses.  If your RSS feeds are static files,
your web server may already be handling that for you; some blog software
does it for you also.  Other readers and aggregators may honor elements
in the feed itself that control when or how often they hit your site:
skipHours and skipDays for times when they shouldn't check at all, and
ttl (time to live) for the maximum number of minutes a feed can reside
in a cache.
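
Both mechanisms can be sketched in a few lines of Python.  This is an
illustration under stated assumptions, not how Bloglines or any
particular reader actually does it: the function names not_modified and
cache_hints and the sample feed are hypothetical.  The first function
makes the decision a server faces when a request carries
If-Modified-Since; the second reads the ttl, skipHours, and skipDays
elements from an RSS 2.0 channel, the way a polite reader might before
scheduling its next fetch.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def not_modified(if_modified_since, last_modified):
    """Return True when answering 304 Not Modified would be appropriate."""
    if not if_modified_since:
        return False  # no conditional header: send the full feed
    try:
        since = parsedate_to_datetime(if_modified_since)
        return parsedate_to_datetime(last_modified) <= since
    except (TypeError, ValueError):
        return False  # unparseable or incomparable dates: send the feed


def cache_hints(rss_text):
    """Extract ttl (minutes), skipHours, and skipDays from an RSS 2.0 feed."""
    channel = ET.fromstring(rss_text).find("channel")
    ttl = channel.findtext("ttl")
    return {
        "ttl_minutes": int(ttl) if ttl else None,
        "skip_hours": [int(h.text) for h in channel.findall("skipHours/hour")],
        "skip_days": [d.text for d in channel.findall("skipDays/day")],
    }


# Made-up feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example feed</title>
<ttl>60</ttl>
<skipHours><hour>0</hour><hour>1</hour></skipHours>
<skipDays><day>Saturday</day><day>Sunday</day></skipDays>
</channel></rss>"""

hints = cache_hints(SAMPLE_FEED)
unchanged = not_modified("Wed, 01 Nov 2006 14:00:00 GMT",
                         "Wed, 01 Nov 2006 12:00:00 GMT")
```

In the log, a reader that plays along shows up as a run of 304 lines
with no body, so those hits cost you very little bandwidth even when
the count looks high.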


-- 
Thomas Dowling
tdowling at ohiolink.edu
