counting internet usage (disregard earlier)

Ernest Perez perez at opac.osl.state.or.us
Wed Sep 10 20:45:22 EDT 1997


[Oops, sorry, forgive mistaken previous send...]

I agree with Dr. David R. Newman of Queen's University about the
unreliability and questionable usefulness of counting hits as valid
statistical analysis. His comments were in response to the original
message from Gerry Rowland of the State Library of Iowa, earlier in this
message trail....

Newman wrote:
> This only tells you the time pattern of WWW usage in the library, and the 
> popularity of different WWW information sources among your library users. 
> It tells you nothing about how useful, relevant or interesting the 
> information is. Nor does it tell you why the site is popular, and whether 
> it is a result of technical factors (indexing, search results, time to 
> download, time to connect) or factors to do with the material and the 
> reader's interests.
>

About all you can actually expect from analyzing hits on a Web server is
a continuing comparison with itself, assuming that's of any use.  And
"about:global" results from the client/browser really tell you only what
an individual user happened to look at at a particular moment in time. 
      [For more on about:global, see "Hidden features of Netscape 
      Navigator ouuups Mozilla !"  
      at <http://wwwcn.cern.ch/~rigaut/about.html>

Gerry Rowland, State Library of Iowa, originally wrote:
> Internet use statistics are a high priority for libraries at the local,
> state and national levels.
> Counting hits by local users against remote Internet servers has been a goal
> of the FSCS, the national public library statistics project, for several years.

I really don't understand why it's such a high priority. Are we trying
to measure how hard computers and telecommunications networks are
working, for some reason? Admittedly, Rowland's recent _Public
Libraries_ article uses the example of high hits on the Netscape server
contrasted to much lower hits on library servers, and suggests that we
might be "losing the war." 
   Seems to me that this is comparing apples and oranges. Are libraries
going to suffer because the telco's 411 information number gets a lot
more calls than library telephone reference numbers? Or because TV Guide
and Dilbert get looked at a bunch more than the 800s section of our
collections?

Rowland writes:
> At today's meeting of the FSCS group, we learned that the command
> "about:global" in the Location: box of the Netscape browser returns a list
> of files downloaded and a count of total files.  I assume that Internet
> Explorer has a similar feature.

The deceptive ease of collecting log files is perhaps exactly the
problem. It all really depends on whether you're recording anything
worthwhile in the first place.  about:global shows you Netscape's
history.db file, which is simply a list of all the files you accessed in
browsing Web pages.
   For instance, let's say you looked at a "single" page made up of an
HTML file, a photograph, two pretty line graphics, three "NEW" icons, a
CGI call, and a couple of other icons or graphic devices. Okay, the hit
count is going to show that you accessed that server something like 12
or 15 times.  Is that information supposed to tell us something of ANY
significance? 
  And that webserver's log file would tally up 12 or 15 accesses. Well,
I s'pose you could maybe conclude that the particular site used a lot of
really complicated pages? No, the actual case might be that the
particular webserver uses really blah, plain, ho-hum pages that get used
a lot because of what's on them.
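   To make the arithmetic concrete, here's a quick sketch in Python of
how one page view turns into a dozen log hits. The file names are made
up for illustration only:

    # One hypothetical "page view": an HTML file, a photograph, two line
    # graphics, three "NEW" icons, a CGI call, and a few other devices.
    one_page_view = [
        "/index.html",
        "/photo.jpg",
        "/line1.gif", "/line2.gif",
        "/new1.gif", "/new2.gif", "/new3.gif",
        "/cgi-bin/counter",
        "/icon1.gif", "/icon2.gif", "/logo.gif", "/rule.gif",
    ]

    hits = len(one_page_view)   # what the access log (and about:global) tallies
    pages = sum(1 for f in one_page_view if f.endswith(".html"))

    print(hits)    # 12 "hits" for one reader looking at one page
    print(pages)   # 1 actual page, and still no hint of its usefulness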

Rowland writes: 
> It would appear that a count of hits could be tallied over a period of days
> or weeks, then multiplied to give an annual figure to provide a count of hits.
> 

Yeah. I drive an average of 14,000 miles each year.  So what? 
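   For what it's worth, the multiplication Rowland describes amounts to
about this much arithmetic (figures invented):

    sample_days = 14
    hits_in_sample = 4200               # hypothetical two-week tally

    annual_hits = hits_in_sample / sample_days * 365
    print(round(annual_hits))           # roughly 109,500 "hits" a year... so what?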

Rowland writes:
> Is this the way to generate Internet use statistics?  I think it just may be.

Not!

Newman goes on in his commentary:
> Think a bit about what sort of statistics would help you run a better 
> library. Most of the questions relate to the needs and perceived benefits 
> of library patrons, and how well the WWW service meets those needs. So 
> you actually need to ask them - be it in interviews, face-to-face or 
> on-line questionnaires, or in single questions automatically presented to 
> a small random sample of accesses (the technique Jacob Palme used to 
> research e-mail use and its benefits and costs to staff of the Swedish 
> Defence Organization).

On target. What are we trying to measure here, anyway? If it's patron
use or satisfaction, then, as Oregon State Librarian Jim Scheppke
says, "Why don't we ask them?"  Interviews, questionnaires, or focus
groups, combined with good sampling techniques, will give a much more
accurate picture of how well or how poorly a library is performing in
regard to its network information services.
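   The single-question-to-a-random-sample approach Newman mentions is
also cheap to try. Here's a minimal sketch in Python; the sampling rate,
the question, and the function names are my own invention, not Palme's
actual setup:

    import random

    SAMPLE_RATE = 0.02    # ask roughly 1 session in 50

    def ask_one_question():
        # Hypothetical placeholder: pop a one-question form and log the answer.
        print("Did you find what you were looking for today? (y/n)")

    def on_session_start():
        # Sample per patron session, not per hit, so the numbers reflect
        # people rather than image downloads.
        if random.random() < SAMPLE_RATE:
            ask_one_question()

    # Simulate a day's worth of 500 patron sessions:
    for _ in range(500):
        on_session_start()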
   Counting network hits is a throwback to 1) the library profession's
mania for counting useless numbers for no particular purpose (because we
always have), and 2) the old days when connect time _maybe_ meant
something. And that's a qualified maybe. Connect time might also reflect
lots of coffee breaks or interesting/attractive people at the next
workstation.

I agree with the intent...that it's useful to measure the value, quality,
and success of our information services. But I don't think the present
sort of computer hit odometer is the way to do it. Let's do think of some
valid measures....

Cheers,
-ernest

Ernest Perez, Ph.D.//Oregon State Library//perez at opac.state.or.us
-----------------------------------------------------------------
Library:  Like a software house, except the software's free.
It's not vaporware.  And if it breaks, they help you fix it.
Quickly.  Without a toll call.

