Big Data and Text Analytics – Eliminating the Noise
Gaining insight from unstructured content such as social networks, online media, and surveys is a hot topic, but it is still not mainstream for many organizations. The big players have been doing it for years and have honed their skills to the point of seeming like magicians in ‘assuming’ they know what you are looking for. In some cases, they are right.
But for the typical organization, analysis of unstructured content can improve applications such as brand-reputation management, market research, competitive intelligence, and customer service and support. For these applications and others, text analytics brings automated, natural-language processing techniques to bear to identify and extract names, facts, relationships, sentiment, and other information in blogs, forums, news, social updates, e-mail, and a range of enterprise sources.
Without any control over unstructured content, the result is just noise, which gives company executives no handle on what exactly is happening in the marketplace or on how to solve their specific challenges. The first key to unlocking this information is better enterprise search, with the ability to automatically generate semantic metadata consisting of phrases as well as single keywords and acronyms. As they say, garbage in, garbage out. We are long past depending on the end user to consistently apply meaningful metadata to everything that is created. It just won’t happen. It never has and it never will. And what about content that is ingested with erroneous or non-existent metadata, which results in an even greater loss of control over content?
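To make the idea of automatically generated metadata concrete, here is a minimal Python sketch. It is only an illustration under simple assumptions, not how any particular enterprise search product works: the function name extract_metadata, the tiny stopword list, and the rule that a phrase must occur more than once are all hypothetical choices made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use far larger ones).
STOPWORDS = {"a", "an", "and", "are", "as", "at", "by", "for", "from", "in",
             "is", "it", "of", "on", "or", "our", "the", "to", "was", "with"}

def extract_metadata(text, top_n=5):
    """Return candidate metadata: acronyms, single keywords, and multi-word phrases."""
    # Acronyms: runs of two or more capital letters, e.g. "CRM".
    acronyms = sorted(set(re.findall(r"\b[A-Z]{2,}\b", text)))

    keywords = Counter()
    phrases = Counter()
    # Split into sentences so phrases never span a sentence boundary.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z][a-z'-]*", sentence)
        run = []  # current run of consecutive non-stopwords
        for word in words + [None]:  # None is a sentinel that flushes the last run
            if word is not None and word not in STOPWORDS:
                run.append(word)
                continue
            keywords.update(run)
            # Every 2- and 3-word window inside a run is a candidate phrase.
            for size in (2, 3):
                for i in range(len(run) - size + 1):
                    phrases[" ".join(run[i:i + size])] += 1
            run = []

    return {
        "acronyms": acronyms,
        "keywords": [w for w, _ in keywords.most_common(top_n)],
        # Keep only phrases seen more than once (a hypothetical noise filter).
        "phrases": [p for p, c in phrases.most_common(top_n) if c > 1],
    }
```

Run over a few sentences of customer feedback, a routine like this would surface a recurring phrase such as "customer service" alongside single keywords and acronyms, without anyone tagging the content by hand. Production tools rely on much richer linguistic analysis, but the principle of deriving metadata from the text itself is the same.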
The second criterion is a way to aggregate and manage unstructured content for business use so that it can be analyzed. For this, enter a taxonomy, or several taxonomies. The key here is the ability to rapidly generate taxonomies that capture the insight you are looking for. The tooling must be easy for knowledge professionals to use, and quick to implement and manage. The laborious, time-consuming effort of typical taxonomy building does not lend itself to quickly capturing a wide range of content that can be transformed into intellectual capital.
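As a rough illustration of how a taxonomy can aggregate content, the Python sketch below files documents under every taxonomy node whose terms they mention. Everything in it is assumed for the example: the TAXONOMY contents, the classify and aggregate function names, and the simple whole-word matching rule. A real taxonomy tool would offer far more sophisticated rules.

```python
import re

# Hypothetical two-level taxonomy: category -> subcategory -> trigger terms.
TAXONOMY = {
    "Customer Service": {
        "Complaints": ["refund", "complaint", "poor service"],
        "Praise": ["great support", "thank you"],
    },
    "Competitive Intelligence": {
        "Pricing": ["price cut", "discount"],
    },
}

def classify(text, taxonomy=TAXONOMY):
    """Return sorted 'Category/Subcategory' paths whose terms appear in the text."""
    lowered = text.lower()
    matches = set()
    for category, subcategories in taxonomy.items():
        for subcategory, terms in subcategories.items():
            for term in terms:
                # Whole-word (or whole-phrase) match, case-insensitive.
                if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                    matches.add(f"{category}/{subcategory}")
                    break  # one matching term is enough for this node
    return sorted(matches)

def aggregate(documents, taxonomy=TAXONOMY):
    """Group document ids under each taxonomy path they match."""
    grouped = {}
    for doc_id, text in documents.items():
        for path in classify(text, taxonomy):
            grouped.setdefault(path, []).append(doc_id)
    return grouped
```

Because the taxonomy here is plain data, a knowledge professional could add or change a node by editing a few terms rather than rebuilding anything, which is the kind of rapid turnaround this approach depends on.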
There are two approaches: feed the rich metadata to any application, or create a taxonomy that mirrors the information you are trying to find. Either way, it’s a win-win.