Counting www server hits

Albert Lunde Albert-Lunde at nwu.edu
Tue Sep 2 23:51:42 EDT 1997


> A couple of quick questions regarding interpretation of Apache logs
> for counting hits on our server.
>
> First, how do you all out there treat successful HEAD requests for
> the purpose of reporting numbers of hits?  From the rfcs it looks to
> me as if they probably aren't requests for documents, as it seems to
> me that GET's returning a 304 (not modified) are.

A HEAD request returns headers about an object (roughly the same as you'd
receive with a GET, usually including a Last-Modified date), but no body
containing the object. It's intended as a low-bandwidth way to check that
links are OK or to get some basic info about a file.  I assume, for
purposes of _my_ stats, that a HEAD request represents some kind of
link-checking robot or other mechanical query and don't count it as a
normal hit.

I _do_ count GET requests with 304 status as hits.

(These are produced by requests carrying the conditional header
If-Modified-Since, which means "send me this file if it's changed since
this date". They are usually the result of checking a cached file (either
in a browser or a proxy server) to see if it is fresh before returning it
to the user.)
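
As a rough illustration of the counting rule above (skip HEAD, count GET
including 304), here is a minimal Python sketch. It assumes the log is in
Common Log Format; the function name count_hits, the regular expression, and
the choice to count only GET requests with status 200 or 304 are my own
assumptions for the example, not anything Apache provides.

```python
import re
from collections import Counter

# Pulls the quoted request, the status code, and the size field out of a
# Common Log Format line (assumed layout: ... [date] "request" status size).
LOG_RE = re.compile(r'\[[^\]]*\] "([^"]*)" (\d{3}) (\S+)')

def count_hits(lines):
    """Return (hits, methods).

    hits counts GET requests answered with 200 or 304 (an assumed rule);
    methods tallies every request method seen, including "-" placeholders.
    """
    hits = 0
    methods = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m is None:
            continue  # not a line in the assumed format
        request, status = m.group(1), m.group(2)
        parts = request.split()
        method = parts[0] if parts else "-"
        methods[method] += 1
        # HEAD is tallied in methods but never counted as a hit.
        if method == "GET" and status in ("200", "304"):
            hits += 1
    return hits, methods
```

Feeding it the lines of access_log gives both a hit count and a per-method
tally like the weekly numbers below.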

For what it's worth, in a recent week our server saw 550338 GET requests,
1539 HEAD requests, 2 OPTIONS requests, and 13370 POST requests.

Here's an example of faking a HEAD request with Unix telnet:

>nuinfo % telnet nuinfo 80
>Trying...
>Connected to nuinfo.nwu.edu.
>Escape character is '^]'.
>HEAD / HTTP/1.0
>
>HTTP/1.1 200 OK
>Date: Wed, 03 Sep 1997 03:07:44 GMT
>Server: Apache/1.2.1
>Last-Modified: Thu, 12 Jun 1997 17:24:05 GMT
>ETag: "e04-1211-33a030b5"
>Content-Length: 4625
>Accept-Ranges: bytes
>Connection: close
>Content-Type: text/html
>
>Connection closed by foreign host.
>nuinfo %

I typed "HEAD / HTTP/1.0", then Return and Control-J (a linefeed) to send the
blank line that ends the request.
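
The same exchange can be scripted. This is only a sketch using Python's
standard socket module; the function names build_head_request and head are
made up for the example, and it does no error handling beyond a timeout.

```python
import socket

def build_head_request(path="/"):
    # An HTTP/1.0 request with no headers: the request line, then an
    # empty line to mark the end of the request.
    return f"HEAD {path} HTTP/1.0\r\n\r\n".encode("ascii")

def head(host, port=80, path="/", timeout=10):
    """Send a HEAD request and return the raw response text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_head_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break  # server closed the connection
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")
```

Calling head() with a host name should return a status line and headers
like those in the telnet session above, with no body.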

=======================================
> Second, I have a few lines in last month's access_log like this:
>
> 198.65.99.64 - - [25/Aug/1997:20:37:50 -0500] "-" 200 -
>
> where the part that normally specifies the URL and the method is
> just a "-".  I find reference to this notation in the rfc for HTTP
> 1.1, but I don't yet know enough about the protocol to really fit
> this in to understand what was requested (and, apparently,
> successfully provided).

I don't know what a "-" in the space for the method means in the Apache logs.

I'm not sure what you are referring to in RFC 2068.

I'm guessing this is an Apache convention rather than something really sent
across the net. A "-" is used in several places in the Apache 1.2.1
mod_log_config.c source as a placeholder for NULL or missing data, but I
don't find a usage that matches the case you describe.


---
    Albert Lunde                      Albert-Lunde at nwu.edu



