Dark data, the new darling term of the analysts, could just be called junk, stuff, or garbage, but the term does have a rather romantic ring to it. Some organizations keep everything, just in case. Others delete everything they can. Some have a more rational approach and actually perform lifecycle management. Dark data can be a risk in litigation and regulatory investigations, and it may contain personally identifiable information, health information, credit card information, and the list goes on. From a practical point of view, dark data can also be expensive, depending on the amount, both in the server space it requires and in slower response times.
I was reading a recent article on best practices by Fred A. Pulzello, president of ARMA International, although I’m not sure there are any best practices yet. The first step was to identify the dark data, but there was no explanation of how this was to be accomplished. Although I am not certain, I would assume most dark data is user generated and simply forgotten about, which makes it more difficult to find. The second step was to do a cost-benefit analysis. That might be a good idea, but the typical organization may be dealing with thousands of pieces of content. Rather time consuming. The third step was to determine what to keep and what to delete. This is another human-intensive process, and a subjective one, as you and I might not assign the same value to a document. Mr. Pulzello went on to map out the remaining best practices.
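Since the article did not explain how to identify dark data, here is one rough way that first step might be started for files sitting on a shared drive. It is only a sketch under my own assumptions: the function name `find_stale_files`, the three-year threshold, and the idea that "not modified in a long time" is a reasonable proxy for "forgotten" are all mine, not Mr. Pulzello's.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, max_age_days=1095, now=None):
    """Return (path, size_bytes, age_days) for files not modified within max_age_days.

    The 1095-day (roughly three-year) default is an arbitrary assumption;
    a real retention schedule would set this per record type.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # unreadable or vanished file; skip it
            if info.st_mtime < cutoff:
                age_days = int((now - info.st_mtime) // 86400)
                stale.append((str(path), info.st_size, age_days))
    # Largest files first, since they cost the most to keep
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale
```

A list like this does not decide anything on its own, but sorting by size at least gives the cost-benefit step somewhere to begin.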
I don’t think his intent was to evaluate software and technology tools. Rather, his approach was from a business perspective, on how to start getting control over dark data, which he explained very well. As a software vendor, I would argue that the challenge of dark data requires metadata generation and auto-classification to find all the potential areas of risk and the just plain garbage. If an organization is in the process of a migration, this is also the ideal time to identify, evaluate, and get rid of dark data.
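To give a feel for what flagging risk automatically can mean at its very simplest, here is a toy sketch that tags text containing something that looks like a credit card number or a US Social Security number. Real auto-classification tools do far more than this. The labels, the `classify_text` function, and the two patterns are my own illustrative assumptions; the Luhn checksum is the standard check used to weed out digit strings that cannot be valid card numbers.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or hyphens
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")
# US Social Security number in the common 3-2-4 hyphenated layout (assumed format)
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify_text(text):
    """Return a set of hypothetical risk labels found in the text."""
    labels = set()
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            labels.add("possible_card_number")
    if SSN_PATTERN.search(text):
        labels.add("possible_us_ssn")
    return labels
```

Anything a check like this flags still needs a person or a proper classification engine to confirm it, but it shows why generated metadata makes the risky content easier to find than a manual review would.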
My question to you all: do you address dark data? Does your organization care? Does executive management consider it a risk issue? Just curious.