[WEB4LIB] Analyzing OPAC logs

Karen Harker Karen.Harker at UTSouthwestern.edu
Wed May 29 14:52:09 EDT 2002


The only thing we use IP addresses for is to determine whether the user was on or off campus.  We do this by maintaining a table of campus IP address ranges, parsing the user's IP address, and comparing it with the ranges in our table.
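
A minimal sketch of that comparison in Python, for illustration only; it is not the code actually used at UT Southwestern. The ranges below are documentation-reserved address blocks standing in for a real campus table, and the names CAMPUS_RANGES and is_on_campus are hypothetical.

```python
import ipaddress

# Hypothetical campus table. These are documentation-reserved blocks,
# used here as placeholders for real campus address ranges.
CAMPUS_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_on_campus(remote_addr, ranges=CAMPUS_RANGES):
    """Return True if remote_addr falls inside any listed campus range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed or empty address: treat the request as off campus.
        return False
    return any(addr in net for net in ranges)
```

Calling is_on_campus() on the REMOTE_ADDR value at request time means only an on/off-campus flag needs to be written to the log, rather than the address itself.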

I guess you could use the IP addresses to determine the general domain of the users, but it would be very time-consuming and, I think, not worth the effort.  I don't think the structure of IP addresses reveals any other attributes of the user.  Storing them could also violate users' privacy, which is why we no longer store IP addresses in the logs of our "Search the Library" and other Web-based systems.


Karen R. Harker, MLS
UT Southwestern Medical Library
5323 Harry Hines Blvd.
Dallas, TX  75390-9049
214-648-1698
http://www.swmed.edu/library/

>>> Fernando Gómez <fgomez at criba.edu.ar> 5/29/02 1:40:51 PM >>>
Hello!

I am planning to start a detailed analysis of our logs, which have been
accumulating for more than six months since we launched a new Web PAC. The
purpose of this mail is to ask for some ideas about this task: guidelines,
pointers to interesting or innovative works in this area, or whatever you
would like to share. I thought it could be inspiring to know how others are
processing their own data (and if that processing leads to some kind of
improvement).

Our system has been designed in-house (using Bireme's WWWISIS technology),
so we have great freedom to decide what information to save in the logs. Up
to now, we are logging:

* date & time
* IP number (REMOTE_ADDR)
* query
* database (i.e. books, videos, periodicals, etc.)
* type of search (author, title, etc.)
* no. of hits
* page number requested (we show 10 records per page)
* source of the query (an expression typed in the search form, a link
appearing in a previously displayed result, etc.)

This seems enough to harvest a great deal of useful information regarding
user behavior, failed searches, shortcomings in the system, ...
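
As one possible starting point for the failed-search part, here is a rough Python sketch. It assumes a hypothetical layout in which each log line holds the eight fields listed above, separated by tabs and in that order; the real WWWISIS log format may differ, and the names LogEntry, parse_line and failed_search_rate are made up for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class LogEntry:
    timestamp: str
    remote_addr: str
    query: str
    database: str
    search_type: str
    hits: int
    page: int
    source: str


def parse_line(line):
    """Parse one tab-separated log line (assumed, hypothetical layout)."""
    ts, ip, query, db, stype, hits, page, source = line.rstrip("\n").split("\t")
    return LogEntry(ts, ip, query, db, stype, int(hits), int(page), source)


def failed_search_rate(entries):
    """Return the share of zero-hit searches for each type of search."""
    total, failed = Counter(), Counter()
    for entry in entries:
        if entry.page != 1:
            continue  # later result pages repeat a query already counted
        total[entry.search_type] += 1
        if entry.hits == 0:
            failed[entry.search_type] += 1
    return {stype: failed[stype] / total[stype] for stype in total}
```

Only first-page requests are counted, so that paging through a result set does not inflate the totals. Grouping by database or by source of the query instead of type of search would follow the same pattern.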

A specific point that puzzles me a bit (not exactly library-related): what
concrete information can you extract from the list of IP numbers (the
REMOTE_ADDR variable)? Does the count of different IP numbers mean anything?
Is there any geographical information you can extract from them?

Well, thanks in advance for all your help. :)


Fernando Gómez
Bahía Blanca, Argentina


