Auto-classification and Taxonomies – Think Before You Jump
The word taxonomies reminds me of childhood and going to the local library to select books, which was so long ago that I don’t even know why I remember it or make the association. Now, taxonomies constitute much of my work life. I look at the legal industry, which has decided that artificial intelligence and predictive analytics are the Holy Grail it has been seeking. Hey, taxonomies don’t look so bad after all. Unfortunately, taxonomies have the undeserved reputation of being an antiquated technology that requires massive human resources, lengthy deployment times, and constant management. Even today, many vendors offer exactly that under the guise of extraordinary software.
In our annual surveys, manual tagging is still the primary vehicle for describing content. It has inched down by only 1% according to this year’s survey, which indicates that 93% of organizations rely on manual tagging. I’m not going to go into my song and dance about why this practice persists, but even bad software is better than manual tagging and its associated vagaries.
Why don’t more organizations investigate other options? There are probably hundreds of predictions on the growth of unstructured content. Hey – is anyone awake out there? Sooner or later, organizations will have no choice but to move away from manual tagging. A taxonomy, auto-classification, and metadata depend on one another, and together they offer a robust solution. The structure of the taxonomy and the metadata are reciprocal elements that work together to create the information architecture for unstructured and semi-structured content. Taxonomies provide the visual organization and structure for content, which metadata alone does not. At the same time, metadata provides more descriptive information about the content, which improves access to it and use of it. When content is classified to one or more taxonomies, the semantic metadata generated along the way improves the workflows and business applications that rely on metadata.
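To make that relationship concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the Node class, the TAXONOMY tree, its terms, and the tag_document function are all hypothetical, and plain whole-phrase matching stands in for real auto-classification. The point is simply that the taxonomy supplies the structure and the classification step turns that structure into metadata on each document.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One taxonomy node: a label, trigger terms, and child nodes."""
    label: str
    terms: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def classify(text: str, node: Node, path: tuple[str, ...] = ()) -> list[str]:
    """Return the taxonomy path of every node whose terms occur in the text."""
    here = path + (node.label,)
    lowered = text.lower()
    hits = []
    # Whole-word, case-insensitive phrase match (a stand-in for a real classifier).
    if any(re.search(r"\b" + re.escape(t.lower()) + r"\b", lowered) for t in node.terms):
        hits.append(" > ".join(here))
    for child in node.children:
        hits.extend(classify(text, child, here))
    return hits


# Hypothetical two-level taxonomy for a legal department.
TAXONOMY = Node("Legal", children=[
    Node("Contracts", terms=["indemnification", "master services agreement"]),
    Node("Litigation", terms=["subpoena", "deposition"]),
])


def tag_document(doc_id: str, text: str) -> dict:
    """Build a metadata record; the matched taxonomy paths become the tags."""
    return {"id": doc_id, "taxonomy_paths": classify(text, TAXONOMY)}


record = tag_document("doc-1", "The deposition is scheduled after the subpoena was served.")
print(record)
```

A search index or workflow engine could then filter on taxonomy_paths instead of relying on whatever tags a person remembered to type in.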
The selection of a taxonomy tool requires separating the wheat from the chaff. On the surface, tools may appear to provide sufficient, if not equal, features, but the choice of a taxonomy tool is a decision that will impact the organization for years. Some taxonomy tools are designed for IT, require learning a separate language, and depend on long-term use of third-party consultants. Extensive rule building and testing can slow the organization’s ability to respond rapidly to business and terminology changes. Some are not scalable. Totally automated solutions can limit the organization’s ability to develop taxonomies that are aligned with its specific objectives and that use its unique terminology from its own corpus of information. Real-time interactive features are highly recommended, because they reduce the time spent on improving the taxonomies as well as on their ongoing management. Interactive features typically do not require re-indexing of content, so tuning can be done in one expedited iteration instead of taking hours.
Think before you jump.