[Web4lib] Another Google question
Lars Aronsson
lars at aronsson.se
Wed Jul 6 14:53:16 EDT 2005
Patricia F Anderson wrote:
> I teach a class on advanced Internet searching, and
> focus on the concept of "match the tool to the task".
Good for you, and your students. The problem is that web
searching isn't very advanced yet, and there is little point in it
even trying to be so, because the web isn't very advanced yet.
Not too long ago I saw the movie "The Aviator" about Howard Hughes
who founded Hughes Aircraft in 1932. I think air flight then is
comparable to where the Internet is today. It has a 30-year
history (ARPAnet 1969; Wright brothers 1903) and a century-long
prehistory (library science; balloon flight). It is promising and
has a lot of future in it, but it is just about to leave wood and
cloth behind it for all-metal airplanes. Here is a six-page
article from September 1931 on choosing the right kind of wood for
building aircraft: http://runeberg.org/tektid/1931a/0495.html
That could then be called "advanced aircraft material testing",
but a decade later wood was no longer an "aircraft material".
Coming back to the Internet, that article from 1931 is online
because I scanned years 1871, 1872 and 1931-1934 of that magazine.
But the 58 years in between are still missing. That is an "almost
indefensible" gap. Not to mention the still copyrighted years
1935-1994 that are missing, until the magazine itself appeared
online in 1995. So what do you get if you "find all web pages"?
You get my scanned six years plus the magazine's own ten years
online, out of the total 134 years that this magazine has existed.
That is almost 12 percent. If Google has indexed half
of what's online, then Google will find 6 percent of what's been
published in the magazine. At best Google could achieve 12
percent by tweaking its search engine. By promoting scanning
projects, Google could find all text from the first half of this
magazine's publishing history, but the recent 70 years might have
copyright problems.
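The percentages above can be checked with a rough sketch like the
following. It uses only the figures given in this message and makes
the simplifying assumption that every year of the magazine counts
equally; the variable and function names are made up for illustration.

```python
# Figures as stated in the message; names are illustrative only.
scanned_years = 6            # 1871, 1872 and 1931-1934
publisher_online_years = 10  # the magazine's own online years
total_years = 134            # lifetime of the magazine
indexed_share = 0.5          # supposition: Google has indexed half of what's online


def coverage(online_years, total_years, indexed_share=1.0):
    """Share of the magazine's run that a search engine can find.

    Assumes each year contributes the same amount of content.
    """
    return online_years / total_years * indexed_share


online_years = scanned_years + publisher_online_years
best_case = coverage(online_years, total_years)
half_indexed = coverage(online_years, total_years, indexed_share)

print(f"online at all:      {best_case:.1%}")     # 11.9%
print(f"found if half seen: {half_indexed:.1%}")  # 6.0%
```

No amount of search engine tuning raises the first number; only
putting more years online does.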
Suppose you can write a clever Google query for building aircraft
out of materials other than metal. Why would you go through 900
web hits, when it is so obvious that most of the knowledge is not
going to be available on the web at all? There are so many stages
of omission other than that of Google's hit list. In fact, of all
knowledge, most has never been published in print at all. Which
is why we need people to write blogs and contribute to Wikipedia
and similar projects, to get more knowledge online.
--
Lars Aronsson (lars at aronsson.se)
Aronsson Datateknik - http://aronsson.se