[Web4lib] Lies, damn lies, and usage statistics?

cpikas.14607360 at bloglines.com cpikas.14607360 at bloglines.com
Sun Mar 11 12:18:27 EST 2007


I think Stacy raises some really important points:
1) If you see some obviously bad data blithely included, then you should
   question the rest (and not trust it).
2) When you question your vendor, they should not treat you like Stacy
   was treated, but explain exactly what is happening so that you can make a
   decision whether to trust *any* of the data.
3) Hiding bad data isn't the same as explaining why it occurred.  It's
   dishonest (lie by statistics).
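
Point 1 is easy to check mechanically.  Here is a minimal sketch in Python,
assuming you can get the vendor report into a mapping of database name to
reported count and that you keep your own list of subscriptions.  The function
name, the data shapes, and the sample names are made up for illustration; they
are not any vendor's export format.

```python
def suspicious_entries(report, subscribed):
    """Split a usage report into two kinds of obviously questionable rows.

    report: dict mapping database name -> reported usage count (assumed shape)
    subscribed: set of database names the library actually subscribes to
    """
    # Usage reported for databases we have no access to.
    phantom = sorted(db for db, n in report.items()
                     if n > 0 and db not in subscribed)
    # Subscribed databases reported with no usage at all (or missing entirely).
    silent = sorted(db for db in subscribed if report.get(db, 0) == 0)
    return phantom, silent


# Hypothetical sample data.
report = {"Database A": 120, "Database B": 45, "Database C": 0}
phantom, silent = suspicious_entries(report, {"Database A", "Database C"})
print(phantom)  # ['Database B']
print(silent)   # ['Database C']
```

Neither list says anything about whether the remaining numbers are right; it
only tells you when to start asking questions.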


It's not for Stacy to find her own way of capturing the data; it's up
to Ebsco to prove, through a COUNTER audit or a similar audit, that their
statistics are meaningful and accurate.  Sure, it makes sense to capture your
own data if it will provide unique information, but I don't think we should
allow these vendors to blow smoke!
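
For anyone who does want a rough independent number, the EZproxy logs Stacy
mentions below can be tallied per database host.  This is only a sketch under
assumptions: it assumes the log lines are in the common log format with an
absolute URL in the request, it treats each host name as a stand-in for a
database, and it counts a "PDF download" as any successful request whose path
ends in .pdf.  It cannot see HTML full text as such or any use that does not
pass through the proxy, so it is a sanity check and not a replacement for
vendor figures.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed line shape (common log format, absolute URL in the request), e.g.
# 10.0.0.1 - - [11/Mar/2007:12:18:27 -0500] "GET http://host/path HTTP/1.1" 200 1234
LOG_LINE = re.compile(r'"(?:GET|POST) (\S+) [^"]*" (\d{3})\b')


def tally(lines):
    """Return (requests per host, successful .pdf requests per host)."""
    hits, pdfs = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not look like a logged request
        url, status = match.groups()
        parts = urlsplit(url)
        if parts.hostname is None:
            continue  # relative URL: no host to attribute the request to
        hits[parts.hostname] += 1
        if status == "200" and parts.path.lower().endswith(".pdf"):
            pdfs[parts.hostname] += 1
    return hits, pdfs
```

Typical use would be something like `hits, pdfs = tally(open("ezproxy.log"))`,
where the file name is whatever your own server writes, followed by a
month-by-month comparison against the vendor report.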

IMHO.

Christina Pikas

--- Stacy Pober <stacy.pober at manhattan.edu> wrote:
> I have been downloading annual database usage statistics for our
> library's electronic databases.
> 
> Looking at the statistics from one of our vendors (EBSCOhost), I
> noticed a peculiar thing.  Some of the databases in the report were ones
> for which we had no subscriptions and no access.  Yet the report showed
> usage for those databases, and it was for multiple months in several
> databases, so it was not a one-time computer foul-up.
> 
> When I contacted their technical support and reported this, they said that
> this was a "known issue" and explained that we could deselect those
> databases when generating the usage report.  But I didn't want simply
> to make the obviously bad data invisible; I wanted to know why it was
> there and whether the other figures, those that were not so obviously
> fictional, were accurate.
> 
> When pressed for more information on the exact nature of the problem, the
> helpful support person did not elaborate, but wrote:
> 
>    "I have filed a Service Issue (think of it as a work order) to have
>    your statistics "scrubbed" so that you will only be left with your
>    actual statistics."
> 
> When asked for the specific reason that we are seeing fictional
> usage statistics for several databases, he again assured me it was a
> "known issue".  (I don't know if he thought that this was a good
> substitute for a detailed explanation.  It is not.)  He sent no
> technical details and wrote:
> 
>    "please rest assured that this type of problem is rare, and that
>    the statistics gathered by the system are quite accurate."
> 
> Which seems to miss the point.  If we don't know what caused the
> problem, why would we assume any of the usage statistics are
> accurate?  The erasure of glaringly wrong figures isn't a reason
> to believe that the remaining information in a report is correct.
> 
> This isn't the only vendor that provided inaccurate usage data this year.
> Another vendor's statistics showed zero usage after our subscription
> started.  Since I had used it at the beginning of the subscription period,
> it was clear something was wrong.  When this anomaly was reported, the
> vendor never explained what the problem was, but sent us some completely
> different (and - surprise! - much higher) usage figures.
> 
> In the past, I never really thought about this issue, and just assumed that
> most of the database usage information provided by our vendors was
> reasonably accurate.  This was an inappropriately optimistic assumption.
> As far as I know, there is no way to validate most of the statistics
> provided to us by database vendors.
> 
> Some independent data can be obtained from our EZproxy logs, as
> they show the number of times users accessed particular databases.
> However, though the EZproxy server has some detailed information
> about off-campus use, our on-campus users don't interact with it past
> the initial database link selection.
> 
> Even if all of our usage was routed through the EZproxy server, those
> logs aren't kept for that purpose, and I don't think they show some of
> the most useful information, such as the number of abstracts and
> full-text documents accessed.  For the databases with full-text, the
> number of full-text articles or documents used is a significant figure.
> The EZproxy logs can be analyzed to show PDF downloads, but
> many of our databases offer much of the full-text as HTML.
> 
> Our openURL system offers some statistics on full-text retrievals,
> but that system only works with full-text access across different
> databases.  The openURL system won't come into play for those
> sessions where the search and the full-text are in the same database.
> 
> Aside from the limited nature of the independent usage statistics
> available, doing accuracy checks on the vendor-supplied statistics
> on a regular basis would be a major pain.
> 
> I'm just bringing this up as a concern.  I'm sure that I am not the only
> librarian who assumed in the past that the vendor-supplied usage data
> was correct.  Since we use that data as an important factor in our
> database acquisition and renewal decisions, it would be nice to have
> some independent assurance of the accuracy of the data we're getting
> from vendors.
> 
> I don't really think that our database providers are using Ouija boards
> to produce our usage reports.  The question is whether they are routinely
> checking the validity of the figures they collect and supply to us.
> Apparently some of them are not doing logic and accuracy testing of the
> software they use to produce the usage statistics.
> 
> Has anyone checked the accuracy of vendor-supplied database usage
> data?  If you have, how did you do it and what results did you find?
> 
> --
> Stacy Pober
> Information Alchemist
> Manhattan College
> O'Malley Library
> Riverdale, NY 10471
> stacy.pober at manhattan.edu
> 
> "If you want to inspire confidence, give plenty of statistics.
> It does not matter that they should be accurate, or even intelligible,
> as long as there is enough of them."  - Lewis Carroll
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/
> 

