Tags, taxonomies, folksonomies

Written by David Tebbutt in May 2005

Clay Shirky, adjunct professor at New York University and internet expert, notes that categorisers have to be both mind readers and fortune tellers. What a great description of the difficulties they face. The pesky world keeps changing and at such a pace it's impossible to keep up. By the time the taxonomies have been carefully re-crafted, the world's already moved on again.

On mind reading, the user has to know, guess at or refer to the controlled vocabulary to find what they want. Or some kind of synonym mechanism needs to be in place. Either way, the categorisers are trying to second-guess the users while not polluting the taxonomy.

Fortune telling, in this context, is about predicting the future. It's impossible to create a taxonomic structure which can cope with all future change. Libraries are a great example. No doubt you've read about the preponderance of Christianity in the Religion section of the Dewey system. Or the prominence of the Balkan Peninsula in the History section of the Library of Congress. Both subjects had lots of books written about them, which gave them prominence in the classification system. That is hardly an objective reflection of the real world; it is more a reflection of the library's inner world.

Having said this, traditional systems have their place and it would be madness to even consider dismantling them without having something better to put in their place. But interesting things are happening on the internet, with users tagging information with their own terms. No consulting a controlled vocabulary, just an instinctive choice of appropriate words. Thomas Vander Wal coined the term 'folksonomy' to describe the approach. If enough people apply tags to a web-based entity, then it's easy to see which terms are the most popular. And the likelihood of a successful search is greatly increased.
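To make the popularity idea concrete, here is a minimal sketch in Python. The users, the tags and the function name popular_tags are all invented for illustration; it simply counts how often each term has been applied to one entity and is not a description of how any real service works.

```python
from collections import Counter

# Hypothetical data: each (user, tag) pair is one person tagging the same photo.
taggings = [
    ("ann", "dog"), ("bob", "Dog"), ("cy", "puppy"),
    ("dan", "dog"), ("eve", "puppy"), ("fay", "canine"),
]

def popular_tags(taggings, top=3):
    """Return the most frequently applied tags, ignoring case and stray spaces."""
    counts = Counter(tag.strip().lower() for _user, tag in taggings)
    return counts.most_common(top)

print(popular_tags(taggings))
```

With enough people tagging, the commonest terms float to the top without anyone having to agree a vocabulary in advance.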

An entity could be a photograph, a web page, a blog posting or any other web-based manifestation. Purists will moan that if you tap in 'canine' you'll miss out on masses of dog-related material. It's more likely that you'll be offered other popular terms and be invited to try again.
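One plausible way to offer those other popular terms is to look at which tags most often sit alongside the one you typed. The sketch below assumes a made-up collection of tagged items and a hypothetical function, related_tags; real services may well do something different.

```python
from collections import Counter

# Hypothetical tagged entities: item name -> set of tags people applied to it.
items = {
    "photo1": {"dog", "canine", "park"},
    "photo2": {"dog", "puppy"},
    "page1": {"canine", "dog", "vet"},
    "page2": {"cat"},
}

def related_tags(items, query, top=3):
    """Suggest the tags that most often appear on items carrying the query tag."""
    counts = Counter()
    for tags in items.values():
        if query in tags:
            counts.update(tags - {query})
    # Sort by descending count, then alphabetically, so the order is stable.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top]

print(related_tags(items, "canine"))
```

Here a search for 'canine' would suggest 'dog' first, because the two terms were applied together most often.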

This approach, of course, breaks all the rules of building taxonomies. The main breach is that an entity is no longer restricted to one particular place in the hierarchy. But does this matter? Wasn't that always a somewhat artificial restriction? The digital world has freed us from such concerns. The entity lives in one physical space (a server somewhere) but it can be linked to from just about anywhere.

Quite often, the tags are not stored with the object but with a service doing the pointing. The tags are attached to the links. Perhaps it's time for a few brief references.

Flickr stores photos with user tags and descriptions attached. Visitors can add their own tags.

Technorati watches weblogs and newsfeeds and delivers anything of interest according to your chosen keywords. Bloggers can embed special Technorati tags in their postings. A good example was the 'lesblogs' tag which was applied when people uploaded photos or made blog postings about the recent 'Les Blogs' event in Paris.

The del.icio.us service allows people to store URLs with tags. It was designed as a hosted bookmarking system but, of course, it is actually a social bookmarking system. You can access pages through tags, you can see who else has tagged the page, and you can see what else they found interesting.
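The social bookmarking idea, with tags attached to links rather than to the pages themselves, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the Bookmark record, the example.com URLs, the user names and the three functions. It is not del.icio.us's actual data model or API.

```python
from collections import namedtuple

# One record per user per link: the tags live with the link, not with the page.
Bookmark = namedtuple("Bookmark", "user url tags")

# Hypothetical bookmarks.
bookmarks = [
    Bookmark("ann", "http://example.com/a", {"tagging", "web"}),
    Bookmark("bob", "http://example.com/a", {"folksonomy"}),
    Bookmark("bob", "http://example.com/b", {"taxonomy"}),
    Bookmark("cy", "http://example.com/c", {"web"}),
]

def urls_for_tag(bookmarks, tag):
    """Access pages through a tag."""
    return sorted({b.url for b in bookmarks if tag in b.tags})

def users_for_url(bookmarks, url):
    """See who else has tagged the page."""
    return sorted({b.user for b in bookmarks if b.url == url})

def also_bookmarked(bookmarks, url):
    """See what else those people found interesting."""
    users = set(users_for_url(bookmarks, url))
    return sorted({b.url for b in bookmarks if b.user in users and b.url != url})

print(urls_for_tag(bookmarks, "web"))
print(users_for_url(bookmarks, "http://example.com/a"))
print(also_bookmarked(bookmarks, "http://example.com/a"))
```

Note that the same page carries different tags from different people, which is exactly what a single-place hierarchy forbids.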

It might sound mad if you've not looked. So take a look. Lurkers are allowed. And, who knows, you might be the person who finds innovative ways of combining the new ways with the old.