Libweb milestone and Stats o' the Day revisited

Thomas Dowling tdowling at ohiolink.edu
Tue Jan 27 15:56:17 EST 1998


I am pleased to announce a milestone for the Libweb lists of library home
pages <URL:http://sunsite.berkeley.edu/Libweb/>.  The entry for the San
Francisco State University library <URL:http://www.library.sfsu.edu/> marks
the 2000th page listed.


I took the opportunity to run a spider over about 1400 North American
library home pages and follow up on a couple of reports I made to the list
last year regarding the validation of HTML markup found in library home
pages and a few other statistics.
<URL:http://www.lib.berkeley.edu/Web4Lib/archive/9702/0090.html> and
<URL:http://www.lib.berkeley.edu/Web4Lib/archive/9704/0528.html>


VALIDATION: I'll reiterate a couple of caveats from a year ago--these
figures are not meant to criticize or find fault, and validation problems do
not necessarily a bad page make (though they don't help).  They are intended
to illustrate the difficulties of creating good HTML that works on all
current browsers and is reliably "future-proofed" against later browser
versions, and to underscore the fact that as a community of web developers
we have a way to go.

I recently saw a claim that every major release of Netscape has broken
invalid documents that displayed correctly under the previous release.  With
the code for Netscape 5.0 going to developers, the odds are that the browser
environment will get more diverse than ever, leaving valid HTML as the only
guarantee of predictable presentation.

I validated most of these pages against HTML 4.0 Transitional.  I know that
generates a few validation errors in documents that pass in HTML 3.2, but
things more than even out, considering the number of documents that *claim*
to be written in HTML 2.0 or 3.0.  :-)

Last year's survey looked at 624 North American library home pages.  Here's
the rundown from back then:

2/7/97
    Average/Median number
     of validation errors:   20.4/13
    Pages that validate:     24 (3.8%)
    Pages with fewer than
     4 validation errors:    97 (15.5%)
    Pages with 80 or more
     validation errors:      14 (2.2%)


Yesterday's survey looked at 1390 North American library home pages:

1/26/98
    Average/Median number
     of validation errors:   22.5/13
    Pages that validate:     83 (6.0%)
    Pages with fewer than
     4 validation errors:   300 (21.6%)
    Pages with 80 or more
     validation errors:      73 (5.3%)


On a percentage basis, the number of pages with only a few problems has gone
up (though it's still a distinct minority), but so has the number of pages
with lots of problems.
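
For anyone who wants to produce a rundown like the two above from their own
validator output, here is a minimal Python sketch.  It is not the script
used for this survey.  The function name is hypothetical, and it assumes
that "fewer than 4 validation errors" also counts pages with zero errors.

```python
from statistics import mean, median

def summarize(error_counts):
    """Summarize a list of per-page validation error counts."""
    total = len(error_counts)

    def share(n):
        # Percentage of all surveyed pages, to one decimal place.
        return round(100.0 * n / total, 1)

    valid = sum(1 for c in error_counts if c == 0)
    few = sum(1 for c in error_counts if c < 4)    # assumption: includes valid pages
    many = sum(1 for c in error_counts if c >= 80)
    return {
        "pages": total,
        "average": round(mean(error_counts), 1),
        "median": median(error_counts),
        "valid": (valid, share(valid)),
        "fewer_than_4": (few, share(few)),
        "80_or_more": (many, share(many)),
    }
```

Calling summarize() on the list of error counts, one entry per page, gives
the count and percentage for each row of the table.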

My page of validation tools is still at
<URL:http://gold.ohiolink.edu/tdowling/validation.html>.  The newest addition
to it is the World Wide Web Consortium's own HTML 4.0 validator at
<URL:http://validator.w3.org/>.

Other tidbits:

    Pages with stylesheets:  11
    Pages with scripts:     119
    Pages with framesets:    70



HTML EDITORS: My apologies to those whose favorite editors don't leave a
visible trace of themselves in the pages they create.  From what I can see
in HTML source, these are the editors people are using, with the number of
home pages I found for each.

    Adobe PageMill          48
    AOLPress                 3
    Claris Home Page        15
    HomeSite                 2
    HotDog Pro               1
    Internet Asst. for Word  8
    MS FrontPad              1
    MS FrontPage           144
    MS Publisher             3
    MS Word                 11
    Netscape 2.x             2
    Netscape 3.x            86
    Netscape 4.x            68
    NetObjects Fusion        3
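
One common "visible trace" is a META tag named "generator" that many
editors write into the document head.  The Python sketch below looks for
that tag.  It is only an illustration under that assumption, not the method
used for the counts above; the function and pattern names are hypothetical,
and editors that sign their pages some other way (in a comment, say) would
be missed.

```python
import re

# Any <meta ...> tag, regardless of case.
META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)
# attribute=value pairs, with double-quoted, single-quoted, or bare values.
ATTR = re.compile(r"""([a-zA-Z-]+)\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""")

def find_generator(html):
    """Return the content of the first <meta name="generator"> tag, or None."""
    for tag in META_TAG.findall(html):
        attrs = {name.lower(): value.strip("\"'")
                 for name, value in ATTR.findall(tag)}
        if attrs.get("name", "").lower() == "generator":
            return attrs.get("content")
    return None
```

The returned string still has to be matched by hand or by prefix to an
editor name and version.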



SERVER BREAKDOWN: The library world is a little more conservative than the
Internet in general, running a lot of older servers and older versions (not
that that's a bad thing...).  On the other hand, you have to admire the
steely nerves of the 14 sites using Apache 1.3beta as their production
server.

    SERVER                 SITES      PCT
    Apache.................. 360    26.6%
        Pre 1.0        6
        1.0x          26
        1.1x          71
        1.2x         224
        1.3x          14
        Stronghold    19
    CERN....................  56     4.1%
    Domino/Go...............   2     0.1%
    IBM ICS.................   9     0.7%
    Innovative..............  12     0.9%
    IIS..................... 164    12.1%
        Pre 3.0       29
        3.0          107
        4.0           28
    NCSA.................... 202    14.9%
        Pre 1.5       60
        1.5x         142
    Netscape................ 337    24.9%
        Commerce      42
        Commun.       68
        E'prise 2.x  127
        E'prise 3.x   76
        F'Track 2.x   21
        F'Track 3.x    3
    Novell..................  14     1.0%
    OSU.....................  35     2.6%
    WebSTAR.................  58     4.3%
        MacHTTP        2
        1.x            8
        2.x           35
        Unident.      13
        WebSite       17
        Pro 1.x       21
        Pro 2.x        8

    OTHERS.................. 223    16.5%
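
A breakdown like this can be tallied from the "Server:" header that most
web servers send with each HTTP response.  The Python sketch below assumes
the header values have already been collected, one per site.  The prefix
table is illustrative and incomplete, the function names are hypothetical,
and this is not necessarily how the figures above were gathered.

```python
from collections import Counter

# Illustrative (prefix, family) pairs; a real survey would need many more.
FAMILY_PREFIXES = [
    ("apache", "Apache"),
    ("microsoft-iis", "IIS"),
    ("ncsa", "NCSA"),
    ("netscape", "Netscape"),
    ("cern", "CERN"),
]

def classify_server(header):
    """Map a raw Server header value (or None) to a coarse family name."""
    token = (header or "").strip().lower()
    for prefix, family in FAMILY_PREFIXES:
        if token.startswith(prefix):
            return family
    return "OTHERS"

def tally(headers):
    """Return {family: (sites, percent)} for a list of Server header values."""
    counts = Counter(classify_server(h) for h in headers)
    total = sum(counts.values())
    return {family: (n, round(100.0 * n / total, 1))
            for family, n in counts.items()}
```

Sites that send no Server header at all fall into OTHERS in this sketch.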


The 1/98 Netcraft survey at <URL:http://www.netcraft.com/survey/> shows
these figures for the net at large:

    Apache         45.12%
    Microsoft      21.51%
    Netscape       10.24%
    NCSA            3.77%
    O'Reilly        2.65%




Thomas Dowling
OhioLINK - Ohio Library and Information Network
tdowling at ohiolink.edu




