Text Analytics and Unlocking the Value of Unstructured Content
Text analytics is a subset of what is popularly called ‘Big Data’. Perhaps because we are in the business, we have seen many vendors jump on this bandwagon, regardless of whether it is their core competency. (Sort of makes you scratch your head.) That is not to say there aren’t highly competent vendors who focus quite successfully on text and sentiment analysis; I don’t want to lump everyone into the same basket. But back to the point. When we conducted our 2012 market survey of CIO priorities, text analytics did come up as a 2013 objective: 23% of our clients are using our solutions for text analytics, yet only 16% of the broad market felt that this was a priority.
When unstructured and semi-structured content is incorporated into Big Data, it has typically been pigeonholed into a database approach, which, to me, doesn’t seem logical. Data is machine-driven, whereas unstructured content is driven by people, which makes its nuances, insights, sentiment, knowledge capital, and the relationships between disparate content much more difficult to extract. Unstructured content is also in a continual state of flux.
What we have found is that two issues are consistent stumbling blocks that organizations typically do not know how to solve. The first is the end user’s inability to correctly tag content for reuse, together with the organization’s inability to enforce a policy that captures metadata from diverse content sources at the time of ingestion. That one is self-explanatory. The second is how to capture, in real time, the essence of content that can give the organization insight on a variety of topics. Solving it takes two components. At the fundamental level, it requires the analysis and extraction of highly correlated concepts from very large document collections, which enables organizations to attain an ecosystem of semantics that delivers understandable results. The second key component is a way to capture and extract concepts and sentiments from external sources and diverse applications. The insight gained can be used to identify competitive advantages, customer perception, regional trends, and the internal knowledge capital that exists but is rarely used because it cannot be found.
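To make "highly correlated concepts" a little more concrete, here is a minimal Python sketch. It is an illustration under loud assumptions, not how any particular product works: single words stand in for concepts, the stopword list is a tiny made-up one, the sample documents are invented, and correlation is approximated by the Jaccard overlap of the documents in which two terms appear. The names `extract_terms` and `correlated_terms` are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "for", "on", "with"}


def extract_terms(text):
    """Return the set of lowercase candidate terms found in one document."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}


def correlated_terms(documents, min_docs=2):
    """Score term pairs by the Jaccard overlap of the documents they occur in.

    Only terms and pairs seen in at least `min_docs` documents are kept.
    """
    term_sets = [extract_terms(d) for d in documents]
    doc_freq = Counter(t for terms in term_sets for t in terms)

    pair_freq = Counter()
    for terms in term_sets:
        frequent = sorted(t for t in terms if doc_freq[t] >= min_docs)
        pair_freq.update(combinations(frequent, 2))

    scores = {}
    for (a, b), both in pair_freq.items():
        if both >= min_docs:
            # Jaccard: documents containing both / documents containing either.
            scores[(a, b)] = both / (doc_freq[a] + doc_freq[b] - both)
    # Highest score first, ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented sample content, for illustration only.
docs = [
    "Customers praised the billing portal after the redesign.",
    "The billing portal outage frustrated customers in Europe.",
    "Regional sales in Europe grew after the pricing change.",
    "Customers asked about pricing for the billing portal.",
]

for pair, score in correlated_terms(docs):
    print(pair, round(score, 2))
```

On this toy collection the terms "billing", "customers", and "portal" always appear together, so those pairs score highest. A production approach would more likely work with multi-word phrases, a taxonomy, and far larger collections, but the idea of scoring concepts by how consistently they travel together is the same.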
Have you jumped on the bandwagon yet? Do you use text analytics and/or sentiment analysis tools? Any pitfalls or great benefits?