
Feature Article

Sunday, July 08, 2007

WordMap Makes Taxonomy Creation Simple

One of the challenges of the information age is helping people find things. There are many ways to do this, but they all boil down to improving the two main approaches to locating information: navigation and search. Taxonomies are essential for improving both of these approaches and contribute to the bottom-line goal of making information more findable.

Navigation
A taxonomy is an organizing principle or structure. Take the example of an e-commerce website: these sites usually start with a small number of top-level categories that branch out to reveal sub-categories and terms at varying levels of depth. Products are typically organized in hierarchical relationships. This process of browsing and drilling down through information-rich websites is an example of how a taxonomy can be used as an organizing principle for navigation.

Search
A taxonomy also helps to fine-tune search tools, as it allows efficient access to all content classified under the same term. Rather than simply relying on full-text keyword searches, taxonomies improve search by placing content in its organizational context, which helps to increase the relevance of retrieved information.

Whether you are dealing with a customer relationship database or a content management system, all technologies that deal with information require a basis in taxonomy. This is even more important when various systems must interact.

I have a taxonomy – now what?

Building and maintaining a taxonomy is not a walk in the park. Taxonomies are often complex structures involving hundreds to thousands of terms, synonyms, multi-language translations, non-hierarchical relationships, and more. The need for maintenance is ongoing: new terms need to be added, new relationships created, modifications to existing terms made. Without a specialized tool to manage all of these terms and relationships, the task of keeping your taxonomy useful and up-to-date is a challenge.

Yet, many information managers try to make do with a home-grown solution of spreadsheets and other non-specialized software – getting very frustrated along the way. We’ll take a look at some of the key considerations that can help you decide whether a taxonomy management software tool is right for your situation.

Size matters

Taxonomy management tools are geared to dealing with large-scale taxonomies. If your site’s navigational taxonomy has a few dozen, or maybe even a few hundred categories, you can probably live without them.

It’s when taxonomy terms number in the thousands that you need to look at management tools. It soon becomes onerous to manage a global taxonomy of thousands of terms in a spreadsheet. Product catalogs are an obvious example, as are sites that aggregate information. Complex employee portals can also have large and intricate navigation requirements that need to be managed efficiently and presented with simplicity.

So the first consideration you should take into account when thinking of acquiring and using a taxonomy management tool is:

Do you have a large number of taxonomy terms to manage?



Figure 1 – Even a fairly small taxonomy like the North American Industry Classification System (NAICS – 1,811 categories) is quite broad and deep

A tangled web

It’s not just about size, though. Complexity plays an important role. Many content management systems have taken their metaphors from the file folder hierarchy.

The simplicity of folders is their strength – but for complex use cases, it can be their weakness. Imagine you have a category that appears in more than one place (very common in hierarchies). If you’re using folders, that can’t happen. You have to choose a position.

Now, if you choose one position, and your users choose another, you will get some missed opportunities in navigation. When a user on an e-commerce site is searching for replacement batteries for laptop computers, they would be forgiven for looking for them under laptops, and not under accessories, where you have decided they should live.

If your folders constrain you to something that is a strict tree, you will lose users along the way.

So complexity can take the form of a dataset in which the terms have different relationships to each other. These are often non-hierarchical relationships, the kind that can be so useful in pointing people to other items that might interest them.
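To make the idea concrete, here is a minimal Python sketch of a taxonomy in which a category can appear in more than one place and can also carry non-hierarchical "see also" links. The class name `Taxonomy`, its methods, and the sample categories are hypothetical illustrations, not the data model of any particular tool, and the sketch assumes the hierarchy contains no cycles.

```python
from collections import defaultdict


class Taxonomy:
    """A tiny polyhierarchy: each term may sit under several parents."""

    def __init__(self):
        self.parents = defaultdict(set)   # term -> broader terms
        self.children = defaultdict(set)  # term -> narrower terms
        self.related = defaultdict(set)   # term -> non-hierarchical links

    def add_child(self, parent, child):
        """Place `child` under `parent`; a child may have many parents."""
        self.parents[child].add(parent)
        self.children[parent].add(child)

    def relate(self, a, b):
        """Record a symmetric, non-hierarchical relationship."""
        self.related[a].add(b)
        self.related[b].add(a)

    def paths_to(self, term):
        """Return every root-to-term path (assumes there are no cycles)."""
        if not self.parents.get(term):
            return [[term]]
        paths = []
        for parent in sorted(self.parents[term]):
            for path in self.paths_to(parent):
                paths.append(path + [term])
        return paths


# Hypothetical store: batteries live under both Laptops and Accessories.
shop = Taxonomy()
shop.add_child("Store", "Laptops")
shop.add_child("Store", "Accessories")
shop.add_child("Laptops", "Laptop batteries")
shop.add_child("Accessories", "Laptop batteries")
shop.relate("Laptop batteries", "Chargers")

for path in shop.paths_to("Laptop batteries"):
    print(" > ".join(path))
```

Because the battery category has two parents, a shopper can reach it from either branch, which is exactly what a strict folder tree cannot offer.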

So the second consideration you should take into account is:

Would you like to be able to associate content/information with more than one folder or category?



Figure 2 – This taxonomy makes use of complex relationships and shades closer to an ontology

A rose, by any other name

We can take the same example again, and point out that some users will call laptops notebooks, and vice versa. Batteries might be long life or extended life, replacement or upgrade. Requests for Information are RFIs and vice versa. Wherever search plays a role, synonyms matter.

Disparities between terms are often most striking when the culture and perspective of two groups differ. Many consumers (your customers) don’t know or care what terms the manufacturer uses to describe its product. They will speak in their own language, and if you don’t share it, they will find someone who does.

It’s not very difficult to use synonyms to ensure that there is a meeting of minds, or at least terms. But here again, our folder hierarchies let us down. A typical folder has a name. One name. Period. No synonyms allowed.
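As a rough illustration of how little machinery synonyms need, the following Python sketch maps any number of labels onto one preferred term. The `SynonymIndex` class and the sample labels are assumptions made for this example only; real taxonomy tools add language variants, scope notes, and more.

```python
class SynonymIndex:
    """Map any label (preferred or synonym) to one preferred term."""

    def __init__(self):
        self._preferred = {}  # normalized label -> preferred term

    @staticmethod
    def _normalize(label):
        # Ignore case and extra whitespace when matching labels.
        return " ".join(label.lower().split())

    def add_term(self, preferred, synonyms=()):
        """Register a preferred term together with its synonyms."""
        for label in (preferred, *synonyms):
            self._preferred[self._normalize(label)] = preferred

    def resolve(self, query):
        """Return the preferred term for a query, or None if unknown."""
        return self._preferred.get(self._normalize(query))


# Hypothetical labels drawn from the examples in the text.
index = SynonymIndex()
index.add_term("Laptops", ["Notebooks", "Notebook computers"])
index.add_term("Request for Information", ["RFI"])

print(index.resolve("notebooks"))
print(index.resolve("rfi"))
```

Content is tagged once with the preferred term, and every synonym a customer might type leads to the same place.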

So the third consideration would be:

Would you like users to be able to access the same content/information using multiple terms?



Figure 3 – Using synonyms ensures that searching for “car” will retrieve a page labelled “vehicle”

Featuring …

Of course, categories have more than one attribute. Consider a consumer electronics product. Besides its name, it may have many attributes that will be used by information seekers, including its price, geographical distribution, and manufacturer name. It may also have attributes that are used by systems, such as identifiers.

Now we are moving a long way from the folder hierarchy. Although you would not generally expect a taxonomy to play the full and complex role of a metadata management system, some metadata storage and publication is essential.
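A small Python sketch shows what carrying attributes alongside a term might look like. The `Term` dataclass, the `filter_terms` helper, and the sample products, prices, and manufacturers are all invented for illustration; they do not reflect any real catalog or product.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A taxonomy term with a system identifier and free-form attributes."""
    identifier: str                       # used by systems
    name: str                             # used by people
    attributes: dict = field(default_factory=dict)


def filter_terms(terms, **criteria):
    """Return the terms whose attributes match every given criterion."""
    return [
        term for term in terms
        if all(term.attributes.get(key) == value
               for key, value in criteria.items())
    ]


# Hypothetical catalog entries.
catalog = [
    Term("SKU-001", "Extended life laptop battery",
         {"manufacturer": "Acme", "price": 79.0, "region": "North America"}),
    Term("SKU-002", "Laptop charger",
         {"manufacturer": "Acme", "price": 39.0, "region": "Europe"}),
    Term("SKU-003", "Notebook sleeve",
         {"manufacturer": "Globex", "price": 25.0, "region": "Europe"}),
]

acme_names = [term.name for term in filter_terms(catalog, manufacturer="Acme")]
european_ids = [term.identifier for term in filter_terms(catalog, region="Europe")]
print(acme_names)
print(european_ids)
```

Keeping attributes in an open dictionary is one design choice among several; a fixed schema would trade flexibility for stricter validation.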

So our fourth consideration would be:

Would you like to be able to associate richer metadata with your organization’s content/information?



Figure 4 – Using metadata to model the attributes used in a taxonomy

The times, they are a …

Each year, categories change faster (in most industries). Imagine you are a cell phone manufacturer, a broadcaster, or a publisher. How much will this year’s catalog resemble the last? How often will it have been updated? The likelihood is that the pace of change is ever faster.

Managing that change brings a number of requirements to the fore. One is for efficiency. Clearly, you want an environment in which changes can be dealt with quickly and easily, and without much disruption. Another, though, is for governance. Change has to be tracked, audited and recorded. Our final consideration would then be:

Do you need a tool that can keep track of the decisions made about how your organization’s information/content is managed?
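The governance requirement can be sketched in a few lines of Python: every change to the term list is recorded with who made it, when, and why. The `AuditedTermList` class, its method names, and the editor names are hypothetical, and a real tool would also cover approvals, rollbacks, and persistent storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    """One immutable entry in the change history."""
    timestamp: datetime
    editor: str
    action: str
    term: str
    note: str


class AuditedTermList:
    """A set of terms whose every modification is logged."""

    def __init__(self):
        self.terms = set()
        self.history = []

    def _record(self, editor, action, term, note):
        stamp = datetime.now(timezone.utc)
        self.history.append(Change(stamp, editor, action, term, note))

    def add(self, term, editor, note=""):
        if term in self.terms:
            raise ValueError(f"term already exists: {term}")
        self.terms.add(term)
        self._record(editor, "add", term, note)

    def rename(self, old, new, editor, note=""):
        if old not in self.terms:
            raise KeyError(old)
        self.terms.remove(old)
        self.terms.add(new)
        self._record(editor, "rename", f"{old} -> {new}", note)


# Hypothetical editing session.
log = AuditedTermList()
log.add("Cell phones", editor="dana", note="new product line")
log.rename("Cell phones", "Mobile phones", editor="lee",
           note="match customer wording")

for change in log.history:
    print(change.editor, change.action, change.term, change.note)
```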

Home grown

I referred in the opening paragraphs to the typical home-grown taxonomy management solution, so let’s describe it in a bit more detail.

At its center is almost always a spreadsheet. Categories occupy one column. Columns may be used to express hierarchy, but this is generally quite awkward. What if one branch is two levels deep, and another branch is six levels deep?
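One way to see why the column-per-level layout is awkward is to convert it into plain parent and child pairs, which cope with any depth. The Python sketch below does that for ragged rows; the `rows_to_edges` function and the sample rows are illustrative assumptions, and the sketch treats each label as unique, which a real taxonomy may not guarantee.

```python
def rows_to_edges(rows):
    """Turn ragged level-per-column rows into (parent, child) pairs."""
    edges = set()
    for row in rows:
        # Drop empty cells so short branches do not need padding.
        levels = [cell.strip() for cell in row if cell and cell.strip()]
        for parent, child in zip(levels, levels[1:]):
            edges.add((parent, child))
    return edges


# Hypothetical spreadsheet: one branch two levels deep, another six.
rows = [
    ["Electronics", "Phones", "", "", "", ""],
    ["Electronics", "Computers", "Laptops", "Accessories",
     "Batteries", "Extended life"],
]

edges = rows_to_edges(rows)
for parent, child in sorted(edges):
    print(f"{parent} -> {child}")
```

Each pair stands on its own, so adding a seventh level means adding one more pair rather than one more column to every row.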

The spreadsheet is without doubt one of the most flexible tools on our desktop, but it does have its limitations in dealing with taxonomies.

Scanning across rows, we might find synonyms and other attributes in multiple columns. Scanning down the columns, we can see the breadth of the terms. When dealing with a faceted taxonomy, you may even have multiple spreadsheets. It is hard to see the taxonomy in its entirety.

Quite often, the spreadsheet is distributed to a small group of stakeholders, who may make modifications, then return it. As anyone who has had a passing relationship with financial applications knows, this is an inherently insecure and error-prone process. There is no easy way to track change history, notes, definitions, to ensure that users are ...


Filed under: Taxonomy

News & Notes
(updated daily. almost.)

Understanding XML: Making Models and Watching for Swans

Saturday, April 12, 2008

Kurt Cagle makes some interesting comments in his XML.com article, Understanding XML: Making Models and Watching for Swans. One that jumps out: “As more of the burden of modeling systems falls into the lap of XML specialists (and it definitely is), this is driving those same specialists to become experts not just in the mechanics of XML (such as validating XML against schemas or transforming it with XSLTs) but increasingly the semantics of modeling real-life processes - or of at least serving to train up those people that do.” That’s one reason why many content professionals would benefit from a deeper understanding of ontologies, taxonomies, and the politics of naming content components.

Apple Marketing Genius: iPhone Fastest Rising Search Term On The Net

Tuesday, December 18, 2007

If advertising campaign success can be measured by the popularity of a product name being used as a search term, then the folks doing the marketing for Apple are big winners. According to Google’s 2007 Year-End Zeitgeist, iPhone is the fastest-rising search term, not just in the US, but in the world.

