FW: [WEB4LIB] Web Log Analysis
Gimon, Charles A
CAGimon at mpls.lib.mn.us
Thu Jan 11 17:45:55 EST 2001
1. I produce and distribute a monthly report which includes:
--Unique Daily Visitors for the month
This is the number of unique IP addresses/hostnames logged for each day,
totaled for the month. This corresponds in a rough way with our gate counts.
--Total Page Hits overall
With subtotals for these three categories: staff, internal/public (our
public Internet workstations), and external. Staff is a separate directory
on the webserver; the other two categories are distinguished by IP address.
--Top Pages overall
The most popular pages in descending order by number of page hits. Also
lists of the most popular pages for staff, internal/public, and external.
--Top Categories in the LIST (our web directory)
Since these appear as the appended GET query in the URL, for example:
http://www.mpls.lib.mn.us/list.asp?subhead=Science+_and_+Technology:Astronomy
I can extract these and report on them.
--Top Subscription Databases by Clickthrough
--Top Links in the LIST by Clickthrough
We log clickthroughs on these links; this is done in a database, however,
not through web logs.
--Search Queries in the LIST, and for our entire site
Again, these are appended GET queries, extracted and reported on in a
separate document.
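As a rough illustration of pulling an appended GET query out of the logs, here is a minimal Python sketch (not the Perl actually used for these reports). It assumes Common Log Format-style lines in which the request appears as a quoted `"GET /path?query HTTP/1.0"` field; a server writing a different format, with the query string in its own column, would need different parsing. The function name and its defaults are made up for the example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumption: the request is logged as a quoted field, e.g. "GET /list.asp?... HTTP/1.0"
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+)')

def top_query_values(lines, page="/list.asp", param="subhead"):
    """Count the values of one query parameter on requests for one page."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a request line we recognize
        url = urlsplit(match.group(1))
        if url.path.lower() != page:
            continue
        # parse_qs decodes "+" and %XX escapes in the query values
        for value in parse_qs(url.query).get(param, []):
            counts[value] += 1
    return counts.most_common()  # (value, hits) pairs, most popular first
```

Calling the same function with a different `param` (whatever name the search form uses) would give a similar tally of search queries.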
I also produce specialized reports on request about usage of specific pages
or areas of the site.
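The first two items in the monthly report (Unique Daily Visitors and Total Page Hits by category) could be computed roughly as follows. This is only a sketch in Python under stated assumptions: Common Log Format lines, a staff area under `/staff/`, and public workstations on the network `192.0.2.0/24`. The directory name and the network are placeholders, not the library's real values, and the sketch counts every logged request rather than filtering down to pages.

```python
import re
from collections import Counter, defaultdict
from ipaddress import ip_address, ip_network

# Assumption: Common Log Format, e.g.
# host - - [11/Jan/2001:09:00:00 -0600] "GET /index.html HTTP/1.0" 200 100
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):[^\]]*\] "\S+ (\S+)')

PUBLIC_NET = ip_network("192.0.2.0/24")  # placeholder for the public workstations
STAFF_PREFIX = "/staff/"                 # placeholder for the staff directory

def categorize(host, path):
    """Staff by directory; the other two categories by client address."""
    if path.startswith(STAFF_PREFIX):
        return "staff"
    try:
        if ip_address(host) in PUBLIC_NET:
            return "internal/public"
    except ValueError:
        pass  # host was logged as a name rather than an IP address
    return "external"

def monthly_summary(lines):
    hosts_by_day = defaultdict(set)
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        host, day, path = match.groups()
        hosts_by_day[day].add(host)
        hits[categorize(host, path)] += 1
    # Unique hosts per day, totaled over all days in the input
    unique_daily = sum(len(hosts) for hosts in hosts_by_day.values())
    return unique_daily, dict(hits)
```

Note that a host seen on two different days counts twice in the total, which matches the "unique per day, totaled for the month" definition above.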
Some things that I've done on other jobs or for myself personally:
--Usage by Domain
Fun to do ("We got three hits from Estonia!"), but can lead to questions
about unresolved IP addresses and misconfigured hosts ("Where is .arpa,
anyway?").
--Usage by Browser/Platform
Not as straightforward to do from scratch (MSIE identifying itself as
Mozilla in logs, etc.) but can still give interesting info. Related to this is:
--Robot Activity
Who's indexing you? Often you can just pull the last several totals from a
Browser/Platform report to get this.
--Error logs
Can be your best friend in finding troublesome spots in your site that users
still haven't reported. Might be in a separate file from your regular
webserver logs, depending on your server and configuration.
--Referer logs
Can help you see who is linking to you. Also might be in a separate file.
All of the above come with all the usual caveats: that in many cases you're
counting machines, not people; that your pages could be cached elsewhere;
that info can be spoofed; and so on.
Also note that exactly which items your server logs is generally a
configurable option; be sure that your server is, in fact, logging the
referer info before promising anyone a report on it.
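To make the point about configuration concrete, here is a small Python sketch that tallies referers and user agents only from lines that actually carry them. It assumes the Combined Log Format, where two extra quoted fields (referer, then user agent) follow the status and byte count; lines without those fields are simply skipped. The robot hint list is an illustrative guess, not a real catalog of crawlers.

```python
import re
from collections import Counter

# Assumption: Combined Log Format, i.e. the line ends with
# "request" status bytes "referer" "user-agent"
COMBINED_RE = re.compile(r'"[^"]*" \d{3} \S+ "([^"]*)" "([^"]*)"\s*$')

# Illustrative substrings only; real robots are more varied than this
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")

def referer_and_agent_counts(lines):
    """Return (referers, agents) Counters; both stay empty if the
    server is not logging those fields."""
    referers, agents = Counter(), Counter()
    for line in lines:
        match = COMBINED_RE.search(line)
        if not match:
            continue  # plain Common Log Format or a malformed line
        referer, agent = match.groups()
        if referer and referer != "-":
            referers[referer] += 1
        agents[agent] += 1
    return referers, agents

def looks_like_robot(agent):
    """Crude guess at robot activity from the user-agent string."""
    lowered = agent.lower()
    return any(hint in lowered for hint in ROBOT_HINTS)
```

If both counters come back empty for a full month of logs, that is a good sign the server is not recording referer or user-agent info at all.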
2. I've always written this stuff from scratch in perl, and customized it
for my needs. (I can't stand overpriced, underfeatured pre-written software
for little tasks like this...) This sort of thing isn't rocket science; the
only thing even possibly off-putting about it is that the files you're
working with can get awfully large.
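On the file-size point, the usual remedy is to read the logs one line at a time rather than loading a whole file into memory. A minimal Python sketch of that idea follows; the function name is made up, and the handling of `.gz` files and the latin-1 decoding are assumptions about how the logs are stored.

```python
import gzip

def iter_log_lines(paths):
    """Yield log lines one at a time from plain or gzip-compressed files,
    so memory use stays flat no matter how large the logs are."""
    for path in paths:
        opener = gzip.open if str(path).endswith(".gz") else open
        # latin-1 never fails to decode, which suits logs of unknown encoding
        with opener(path, "rt", encoding="latin-1") as handle:
            for line in handle:
                yield line.rstrip("\n")
```

Any per-line tally can then consume this generator directly, keeping only its running counts in memory.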
--Charles Gimon
Web Coordinator
Minneapolis Public Library
> -----Original Message-----
> From: Maribeth Manoff [mailto:manoff at aztec.lib.utk.edu]
> Sent: Thursday, January 11, 2001 1:56 PM
> To: Multiple recipients of list
> Subject: [WEB4LIB] Web Log Analysis
>
>
> Hello All,
>
> I am working on doing something with the logs generated by our Web
> server (something other than deleting them, that is :) I found some
> good information in the list archives on Web log analysis software, as
> well as a good article in Online magazine on this topic. I downloaded a
> trial version of WebTrends Log Analysis software, and got it to work
> with our logs. What I don't have a good sense of, though, is what is
> the information that I really want or need? I plan to talk to other
> librarians here to get their input, and I would like to ask for your
> assistance also. If you have the time, could you reply to me (I will
> happily summarize for the list) with answers to the following
> questions:
>
> 1) What types of statistics are you collecting on your library Web site
> usage?
>
> 2) What software are you using to collect these statistics?
>
> Thanks very much,
> Maribeth
> --
> ----------------------------------------------------
> Maribeth Manoff
> Coordinator for Networked Service Integration
> 647 Hodges Library mmanoff at utk.edu
> The University of Tennessee voice: 865-974-2876
> Knoxville, TN 37996-1000 fax: 865-974-0626
>