Big Data and the Elusive Butterfly

In our previous blogs we looked at Big Data, some of its definitions, and how it fits into the extraction of value from unstructured content. The adoption rate for ‘Big Data’ will continue to grow because of the large and rapidly growing volumes of data being captured by automated and digitized business processes. Unstructured information poses the same challenge: how do you turn it into usable information to improve business outcomes? It is rather confusing. Amid all the hype about Big Data and its expected benefits (revenue, cost savings, competitive advantage), organizations still fail to recognize the treasure trove of unstructured information that is not being used or, for that matter, can’t even be found. The projected benefits of putting that information to work are exactly the same as those for Big Data.

In a Forbes article, Rasmus Wegener, a partner in Bain’s IT practice, suggested that whether something counts as a Big Data problem depends entirely on the organization. The example he used was: “When you walk through the airport and they take pictures of everybody in the security line to match every face through facial recognition, they have to do that almost in real-time. That becomes a big data problem. If I am a bank and looking at a vast number of credit scores and histories, and I don’t need to provide an answer in five seconds but can do it next day, then that is not a big data problem.”

If we turn that around to focus only on unstructured content, it is still an elusive butterfly. There is enormous value in knowledge capital, yet most organizations still struggle to manage it, or ignore the problem altogether. Unstructured content is just there, growing, but not being harnessed to deliver business benefits. Those who are capitalizing on the aggregation and extraction of knowledge do so because they have realized that managing it actually does deliver the touted benefits.

We have many smaller clients that need to manage unstructured content primarily for regulatory purposes. We have another client with four petabytes of unstructured content that needs to be managed not only for the extraction of value, but also for compliance and data privacy. I’m not sure that unstructured content fits neatly into the Big Data pigeonhole, as most organizations are still struggling to manage what they already have; that is an organizational gap that needs to be overcome first. Still, it will be interesting to see how unstructured content unfolds within the framework of Big Data, specifically around sentiment analysis. More on that subject later…