PC-based web server

JQ Johnson jqj at darkwing.uoregon.edu
Sun Apr 18 10:00:35 EDT 1999


"Harrison, Roger" <rharrison at Exchange.FULLERTON.EDU> writes:
>I'd recommend a "real" server if your web site is mission-critical, as ours

This seems particularly true for NT-based servers, where Microsoft's own data
show a 10x difference in downtime between systems, largely depending on
whether the system was an "approved" hardware platform.  Of course, one
might speculate that people running such platforms tend to be more
conservative and competent in other areas of system admin, and that the
hardware platform wasn't really what made a difference...

My own experience with Linux and Unix is that even a low-end desktop in a
closet can achieve impressive uptime (99.5% and better). And most downtime
on such boxes tends to be software- or environment-related.  It may be that
this is "good enough" for you, even though you know you can do better with
a computer room, UPS, RAID, hotswappable hardware, offsite hot spare for
building-wide catastrophes, and a technician standing by on site 24x7.

I'm curious to know whether anyone has systematic uptime data comparing
library configurations, and comparing that uptime with availability of the
stacks.  If you do, would you share it with us?  In computing uptime, it's
particularly important to say how you calculated the denominator.  Was it
168 hrs/week, or was it "potentially available hours", 168 minus any
scheduled downtime?  I remember one network vendor who claimed 99.8% uptime
because they had network diagnostics that detected problems a few minutes
before the problems caused total connectivity loss, and "scheduled"
downtime for a few minutes later to fix the problem; they didn't count such
scheduled downtime in the calculation!
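
To make the denominator point concrete, here's a quick back-of-the-envelope
sketch (Python, with made-up numbers purely for illustration) of how far
apart the two definitions can land for the same week of outages:

    # Hypothetical week: 2 hours of unplanned outage plus 3 hours the vendor
    # declared "scheduled" downtime.  Figures are invented for illustration.
    TOTAL = 168.0            # wall-clock hours in a week
    unscheduled = 2.0
    scheduled = 3.0

    # Honest figure: all downtime counts, denominator is the full week.
    honest = 100 * (TOTAL - unscheduled - scheduled) / TOTAL

    # Vendor-friendly figure: "scheduled" downtime is excluded from both
    # the downtime and the denominator ("potentially available hours").
    friendly = 100 * (TOTAL - unscheduled - scheduled) / (TOTAL - scheduled)

    print("honest:   %.2f%%" % honest)     # 97.02%
    print("friendly: %.2f%%" % friendly)   # 98.79%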

I'm also curious to compare web site uptime data with similar data on
availability of the stacks or the reference desk.  Typical failures causing
stacks unavailability might include a fire alarm or a staff member arriving
5 min. late to open the building on Sunday morning.  Anybody have hard
numbers?

An aside on minimizing downtime:  hardware downtime does occur, usually at
the least convenient time.  If you care about uptime percentage, the single
most important thing to do as a manager is to make sure that you don't have
extended downtime after a failure.  If it takes your systems person from
1am until 8am to notice the failure and reboot to fix the problem
(sometimes that's all it takes), that's 7x worse than a 1-hour response time,
and could make the difference between 99% availability (OK but not great)
and 99.8% (excellent).  Note, though, that unscheduled system
unavailability at 5am doesn't make too much difference to average patron
satisfaction.
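
To put rough numbers on that (the scenario is hypothetical, but the
arithmetic is the point), assume one such overnight failure a month:

    # Suppose one hardware failure a month, at 1am on a Sunday (made-up
    # scenario, just to illustrate the arithmetic in the paragraph above).
    HOURS_PER_MONTH = 30 * 24.0    # 720

    def availability(hours_down, hours_total=HOURS_PER_MONTH):
        """Percent availability for the period, given total hours of downtime."""
        return 100 * (hours_total - hours_down) / hours_total

    # Nobody notices until staff arrive at 8am: 7 hours down.
    print("%.2f%%" % availability(7.0))    # 99.03%

    # Someone is paged and reboots within the hour: 1 hour down.
    print("%.2f%%" % availability(1.0))    # 99.86%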

The second most important thing to do to ensure hardware reliability is to
do a substantial burn-in to avoid infant mortality.  Maybe the reason Linux
servers do so well is that we often run Linux on old 486s or 1st generation
Pentiums we were about to surplus -- they've had the advantage of proving
themselves by running reliably for years.

JQ Johnson                      office: 115F Knight Library
Academic Education Coordinator  email: jqj at darkwing.uoregon.edu
1299 University of Oregon       phone: 1-541-346-1746  -3485 fax
Eugene, OR  97403-1299          http://darkwing.uoregon.edu/~jqj/


