[Web4lib] Re: Future of libraries

Tim Spalding tim at librarything.com
Tue Jul 8 00:26:33 EDT 2008


>Hi Tim, somehow I've missed conversation on library tagging programs failing.

I'm going to respond at some length to this--much length. I apologize
to the 90% of you who are now hitting "next" on your email program. :)

As Karen noted, tagging has recently been "tacked on" to various OPACs
in this mechanical, thoughtless way. I, at least, have heard over and
over that the feature is virtually unused. If anyone can show otherwise,
I'm all ears.

To my knowledge, however, the largest, longest-running catalog tagging
project remains PennTags, UPenn's tagging project, started at the end
of 2005.

I just grabbed the "all tags" page on PennTags, ran it through Excel
and reached a total of 59,101 tags. Of these, it appears that only a
minority relate to catalog records; PennTags allows users to tag web
pages and other resources, which is a strength, certainly, but does
not speak to whether in-catalog tagging works.

This aside, even if all 59,000 were OPAC tags, these numbers compare
quite unfavorably with LibraryThing's 37.4 million tags. Indeed,
LibraryThing adds more tags every day (about 65,000) than PennTags has
ever added.
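
For a sense of scale, the arithmetic behind that comparison can be
checked in a few lines. This is only a back-of-the-envelope sketch
using the figures quoted above; the variable names are arbitrary.

```python
# Figures quoted in this message; nothing here is measured independently.
PENNTAGS_TOTAL = 59_101        # tags counted on the PennTags "all tags" page
LT_TOTAL = 37_400_000          # LibraryThing tags
LT_PER_DAY = 65_000            # approximate LibraryThing tags added per day

# How many times larger the LibraryThing tag pool is.
ratio = LT_TOTAL / PENNTAGS_TOTAL

# How long LibraryThing takes to add as many tags as PennTags holds in total.
days = PENNTAGS_TOTAL / LT_PER_DAY

print(f"LibraryThing has about {ratio:.0f}x as many tags")
print(f"It adds PennTags' whole total in about {days:.2f} days")
```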

I mention this not to bring people down, or to crow about how big
LibraryThing is--in fact, still quite small--but because when it comes
to tags, *numbers matter*: Everything that works about tagging works
better with numbers, and most of the problems are ameliorated by
numbers too.

When a book has one tag, or even ten, you know little for certain.
Simple idiocy or malicious intent aside, tagging is an uncertain,
personal thing. But when books map to hundreds or thousands of tags,
and tags map to hundreds or thousands of books, a pattern
emerges--called a "folksonomy"--that provides some of the power of
traditional classification and can significantly aid in discovery. You
can look at a book and see a wide range of tags, sized by how often
they are applied. You can browse a tag and see a list of books that
is comprehensive and sorted by relevance. The good stuff rises and the
bad stuff sinks. You can even combine tags with each other--e.g.,
LibraryThing can combine "wwii," "france" and "nonfiction" for a
decent list of nonfiction books about France during World War II--and
even map tags to formal ontologies, as LibraryThing does.
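
The tag-combining idea can be sketched roughly as follows. This is a
toy illustration, not LibraryThing's actual code: the book titles, the
counts, the `combine_tags` function and the "weakest tag" relevance
rule are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical toy data: book title -> how many times each tag was applied.
BOOK_TAGS = {
    "Book A": Counter({"wwii": 40, "france": 25, "nonfiction": 30}),
    "Book B": Counter({"wwii": 5, "france": 2, "nonfiction": 1, "fiction": 1}),
    "Book C": Counter({"wwii": 60, "nonfiction": 45}),
    "Book D": Counter({"france": 80, "nonfiction": 20, "cooking": 70}),
}

def combine_tags(book_tags, wanted):
    """Return books carrying every wanted tag, best matches first.

    Relevance (an assumed rule) is the smallest count among the wanted
    tags, so a book ranks highly only if all of the tags were applied often.
    """
    scored = []
    for title, tags in book_tags.items():
        # A Counter returns 0 for a tag that was never applied.
        if all(tags[t] > 0 for t in wanted):
            scored.append((min(tags[t] for t in wanted), title))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]

print(combine_tags(BOOK_TAGS, ["wwii", "france", "nonfiction"]))
```

With only a handful of tags per book, the counts are too small for any
such ranking to mean much, which is the point about numbers.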

I can hardly find an example that doesn't seem unfair, so I will choose
one that's typical. In the UPenn catalog, the _Great Gatsby_ has been
tagged twice by one user--as "great" and as "gatsby."

On LibraryThing, the same book has been tagged 15,891 times. Among
these, both "great" and "gatsby" appear--5 and 26 times respectively--
but they are dwarfed by others: "jazz age" (152), "lost generation"
(48), "long island" (52), "american literature" (392), "wealth" (53),
etc. The result, together with all the other books these tags connect
you to, is a rich tapestry of meaning. What are the most significant
jazz age books? What about jazz age New York? Etc.

Even when low-frequency tagging is dead-on, it still leaves you
alone. For example, Kuhn's _Structure of Scientific Revolutions_ has
been tagged twice on PennTags, with the very correct "historiography"
and "history_of_science." But both were applied by one user, who has also
applied each and every instance of those tags. The user--a heroic
individual responsible for fully 7% of PennTags--has surely put in a
lot of work, but do you really want one person's slice of these
topics, probably gathered in the course of a single research project?
Isn't formal classification--comprehensive, thorough and intent on
objective analysis--infinitely superior to one anonymous user's
happenstance tags?

On LibraryThing by contrast, you are never alone--historiography has
798 users, and history of science 1,145. Each unwraps a small world of
connections. Kuhn's masterwork alone has drawn "history of science"
fully 130 times--and it can also be found through appropriate tags
like "philosophy of science" (185), "paradigm" (40), "paradigm shift"
(12) and "Wissenschaftstheorie" (4). Even if the tags themselves don't
interest you, LibraryThing has done the math and discovered that
"history of science" is highly correlated with a set of LCSHs--e.g.,

	Science -- History (473)
	Knowledge, Theory of (223)
	Science -- Methodology (150)
	Science -- Social aspects (96)

--all of which have bearing on Kuhn's work, but only one of which is
listed among its LCSHs. Only numbers can give you these sorts of
correlations.
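
One simple way to surface that kind of tag-to-LCSH link is to count
which headings turn up on the books that carry a given tag. The sketch
below does only that. It is not LibraryThing's method: raw
co-occurrence counts stand in for whatever statistic is actually used,
and the records and the `headings_for_tag` function are invented for
illustration.

```python
from collections import Counter

# Hypothetical records: (tags applied to a book, LCSHs on its catalog record).
BOOKS = [
    ({"history of science", "paradigm"},
     {"Science -- History", "Knowledge, Theory of"}),
    ({"history of science"},
     {"Science -- History", "Science -- Methodology"}),
    ({"history of science", "physics"},
     {"Science -- History"}),
    ({"cooking"},
     {"Cooking, French"}),
]

def headings_for_tag(books, tag):
    """Count how often each subject heading appears on books carrying `tag`.

    Returns (heading, count) pairs, most frequent first.
    """
    counts = Counter()
    for tags, headings in books:
        if tag in tags:
            counts.update(headings)
    return counts.most_common()

for heading, count in headings_for_tag(BOOKS, "history of science"):
    print(f"{heading} ({count})")
```

On four records the counts are meaningless; the same tally only becomes
informative when a tag sits on hundreds or thousands of books.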

Most importantly, even with 37 million tags, I am fully aware that,
because of the long-tail effect, huge swaths of the bibliographic
universe are still unserved or poorly served by tagging on
LibraryThing. This will still be true at 370 million tags, but less
so.

To Karen's point, that tagging might succeed if put elsewhere in the
"workflow" of the library patron, for example at return not check-in,
I have some sympathy. At the least the "ignorance problem" is
resolved--the patron has probably read the book or seen the movie.

But return is not a natural "computer time," and the explanation still
fronts the idea of tagging as a "feature." Tagging and other social
activities are more than that. They are social phenomena, and like all
social software phenomena, a complex alchemy of personal benefit and
social reinforcement. Just as you can't put a basketball on the ground
and expect a basketball game to spring to life (similarly: a mattress,
orgy), folksonomy requires much more than a "tag" field to take off.

Tagging on LibraryThing took off because our members have a strong
desire to organize large collections of their own books, and because
participating in the tagging system, once it gathered steam, became
its own draw. Despite its success, tagging is still done by fewer than
50% of users, it tends to start in a big burst of activity, and it
generally begins when a user has more than 200 books in their personal
library. These observations should dim the light on library tagging
somewhat, since the impulse to organize returned items is less
powerful, and returns happen in dribs and drabs.

Personally, I think tagging in catalogs should be looked at primarily
as a discovery mechanism--and tags should be drawn from elsewhere, if
necessary, to achieve critical mass. That is, of course, what
LibraryThing sells, so you can discount this if you like. If libraries
want to bypass outside sources, however, and develop their own
folksonomies, they have to get serious about combining forces to
achieve mass, and about understanding social software as something
more than a set of "features."

Tim



