Serendipity, re: organizing schemes
Nick Arnett
listbot at mccmedia.com
Mon May 3 11:25:24 EDT 1999
At 07:25 AM 5/3/99 -0700, Minkel, Walter (Cahners -NYC) wrote:
>I have a devil's advocate question in response to the request below: why is
>it a good thing to sort Web collections by Dewey or LC? Dewey or LC are
>useful for items that cannot be searched by keywords, but why is it
>necessary, other than the fact that the books have call numbers, to sort Web
>sites that way? Yeah, I know that Web sites are being assigned call numbers
>in many library catalogs, but is it necessary if keywording & assigning
>subject headings is done carefully? In a public library situation (where my
>experience lies) call numbers were a way to arrange things together on the
>shelf & that was about it. But Web sites have no "shelf" to sit on.
While call numbers themselves may disappear, the idea of grouping related
materials makes more sense than ever, IMO. This is the only way that
people will find things serendipitously. With so much information on-line,
most searches on content have limited value, because a search on words will
return too many items most of the time. Categories provide context, which
complements content search in at least two ways. By returning categories
as the result of a search, an enormous list of results becomes manageable,
allowing broad searching. By searching within categories, the scope can be
narrowed to the point where the results of word searches return reasonable
lists of individual items. Digging through a huge collection of resources
becomes much easier when you can choose at any point to switch between
searching content (text search) and navigating context (categories). This
was the idea behind the Verity Knowledge Organizer (for which I was the
product manager) -- a hybrid of something like Yahoo and something like
AltaVista, but integrated much more than any other product or service that
I know of.
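
To make the two modes concrete, here is a minimal sketch in Python. It is only an illustration of the idea described above, not how the Verity Knowledge Organizer or any other product was implemented; the DOCS collection, the category names and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical toy collection: (title, category, text). Not from a real catalog.
DOCS = [
    ("Gutenberg and movable type", "History", "gutenberg printing press history"),
    ("Offset printing basics", "Technology", "offset printing press ink"),
    ("Laser printer repair", "Technology", "laser printing toner repair"),
    ("The Gutenberg Bible", "Religion", "gutenberg bible printing scripture"),
]


def matches(text, query):
    """Return True if every word of the query appears in the document text."""
    words = set(text.split())
    return all(term in words for term in query.lower().split())


def search_categories(query):
    """Broad search: return categories with hit counts instead of single items."""
    return Counter(cat for _, cat, text in DOCS if matches(text, query))


def search_within(category, query):
    """Narrow search: run the same word search inside one category only."""
    return [title for title, cat, text in DOCS
            if cat == category and matches(text, query)]


if __name__ == "__main__":
    print(search_categories("printing"))
    print(search_within("Technology", "printing press"))
```

A broad query such as "printing" comes back as a short list of categories with counts, and the user can then switch to a word search inside whichever category looks promising.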
There is a third benefit that is just beginning to emerge, though I suspect
in the long run it's the one that will really change information search and
retrieval. Unlike an organizing system implemented in the physical world,
a digital system can support multiple, overlapping classification
schemes. This allows the researcher to ask a new kind of question -- how
does a given resource or category look from other contexts, other points of
view? For example, one could compare how technologists, historians and
religious scholars categorize documents about printing. Library
classification systems allow some of this kind of exploration, but it is
very constrained by physical catalogs and shelves. I have come to believe
rather strongly that easy access to multiple points of view is a powerful
engine of creativity, helping us to see patterns and analyze at higher
levels. For example, it was only fairly recently that biologists and
physicists realized that they were using substantially similar mathematical
models to describe certain aspects of their fields. When they sat down to
figure out why, our understanding of complex systems took a leap
forward. This kind of discovery should accelerate as we realize that order
can emerge from the seeming chaos of a web of information.
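
The idea of multiple, overlapping schemes can be sketched in a few lines of Python. This is an assumption-laden illustration, not a description of any existing system: the SCHEMES data, the category labels and both function names are made up for the example.

```python
# Hypothetical schemes: each maps a resource to the category that one
# community assigns to it. A resource may be missing from a scheme.
SCHEMES = {
    "technologists": {"movable type": "Manufacturing",
                      "gutenberg bible": "Typography"},
    "historians": {"movable type": "Renaissance",
                   "gutenberg bible": "Renaissance"},
    "religious scholars": {"gutenberg bible": "Scripture"},
}


def views_of(resource):
    """Show how each scheme that knows the resource classifies it."""
    return {scheme: cats[resource]
            for scheme, cats in SCHEMES.items() if resource in cats}


def neighbors(resource, scheme):
    """List other resources that share a category with it in one scheme."""
    cats = SCHEMES[scheme]
    if resource not in cats:
        return []
    return sorted(r for r, c in cats.items()
                  if c == cats[resource] and r != resource)


if __name__ == "__main__":
    print(views_of("gutenberg bible"))
    print(neighbors("gutenberg bible", "historians"))
```

Because the schemes are just separate mappings over the same resources, adding another point of view costs nothing in "shelf space," which is the constraint a physical collection cannot escape.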
Librarians probably understand better than almost anyone that it will be a
long time, if ever, before computers can really figure out how our minds
categorize things -- that ability is clearly a key to language and
knowledge. In the meantime, by recording the correlations we invent in
categorization, digital information systems can be serendipity engines
without having to analyze causality (with little understanding of the
information they store, that is). Searching content is causal, based on
the logic of a priori knowledge such as "documents about the history of
printing are likely to mention Gutenberg." Navigating context is based only
on correlation: you can discover the fact that such documents mention
Gutenberg, even if you -- and more importantly, the computer -- didn't know it.
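
As a rough sketch of that last point, the Python below starts from documents that a person has already filed under one category and simply counts which words they share. The sample texts, the stopword list and the shared_terms function are hypothetical; the point is only that the program surfaces "gutenberg" without being told anything about printing.

```python
from collections import Counter

# Hypothetical texts that a human has filed under "history of printing".
CATEGORY_DOCS = [
    "gutenberg built a press in mainz",
    "movable type before gutenberg in korea",
    "the press of gutenberg and the bible",
]

# Small illustrative stopword list, an assumption for this example.
STOPWORDS = {"a", "in", "of", "the", "and", "before"}


def shared_terms(docs, min_share=1.0):
    """Return terms that appear in at least min_share of the documents."""
    counts = Counter()
    for doc in docs:
        # Count each term once per document, ignoring stopwords.
        counts.update(set(doc.split()) - STOPWORDS)
    return sorted(term for term, n in counts.items()
                  if n / len(docs) >= min_share)


if __name__ == "__main__":
    print(shared_terms(CATEGORY_DOCS))
    print(shared_terms(CATEGORY_DOCS, min_share=0.6))
```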
Nick