[WEB4LIB] Re: trivial question about size of largest web page

rich at richardwiggins.com
Thu Jul 12 10:23:46 EDT 2001


My first answer was the same as Tony's. What do you mean by "page" or "site"? How do you measure a mesh?  How do you weigh a URL? How big is a cloud?

Yet every day we do refer to "sites" as discrete objects.  "I found a cool Web site today."  "Visit nwa.com to buy your tickets online."   We know what "site" means most times we use it -- a collection of related content, usually within a single domain, organized by a site sponsor.

I think you CAN characterize the size of a site.  If it's primarily static pages, how many pages?  If it's database driven, what is the record count?  If it's a graphical database, how many images at what resolution?  If it's a movie archive, how many films, what average length, at what resolution? Etc.

The state of Michigan launched a new portal with great fanfare this week.  When a reporter asked for an appraisal, I used the Xenu link analyzer (a great tool I discovered on Web4lib; it finds dead links, but it also counts all links and produces a site map) and counted 6000 links, maybe half of them internal.  It's not a complete measure, since the site also delivers services and database-driven content, but it is one metric.
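
For anyone who wants to try the link-count metric by hand, here is a rough sketch in Python of splitting one page's links into internal and external. It is not how Xenu works, just an illustration using the standard library. The names LinkCollector and count_links are made up for this example, and "internal" is assumed to mean "same host as the page itself," which will miscount sites spread across several hostnames.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def count_links(html, page_url):
    """Return (internal, external) link counts for a single page.

    Assumption: a link is internal when its host matches the host
    of page_url. Non-HTTP links (mailto:, javascript:) are ignored.
    """
    collector = LinkCollector()
    collector.feed(html)
    site_host = urlparse(page_url).netloc.lower()

    internal = external = 0
    for href in collector.hrefs:
        # Resolve relative links against the page's own URL.
        parsed = urlparse(urljoin(page_url, href))
        if parsed.scheme not in ("http", "https"):
            continue
        if parsed.netloc.lower() == site_host:
            internal += 1
        else:
            external += 1
    return internal, external
```

Run over every page of a site and summed, this gives the same kind of rough number I quoted above, with the same caveat: it says nothing about services or database-driven content.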

Of course, the reference librarian gets to parry with Tony's questions during the reference interview, but there ARE ways to get at this.

It's easy to compare the Manhattan phone book with the Manhattan, Kansas phone book.  It's much harder to appraise a Web site announcement du jour.  A few years ago a virtual library of First Ladies was announced in a White House garden ceremony with Hillary Clinton proclaiming that this virtual library would change forever how scholars studied First Ladies.  All the site had was a photo and short bio for each First Lady, and a bibliography of print materials that were not in any way provided by the site.  It wasn't a "virtual library"; it was conceptually thinner than a single children's book.  

In fact, I claim that this sort of analysis is precisely the kind of inquiry librarians ought to be doing.  They've done it for years in choosing which online databases to subscribe to.

By the way, I nominate the terraserver.com satellite imagery site.

/rich




> The question has no sensible answer because the words used have no 
> clear meaning in the context of the internet. It's a hangover from 
> thinking of information as held in an artifact. What is the biggest 
> book is not all that clear either ;-]

_____________________________________________________

Richard Wiggins
Consulting, Writing, and Lecturing on Internet Topics
rich at richardwiggins.com  http://richardwiggins.com 

