Migration – You Mean You Are Migrating 69% of Useless Information?
According to a survey by the 2012 Compliance, Governance and Oversight Council (CGOC) Summit, 69% of the corporate information that is saved can and should be deleted. That’s a pretty hefty percentage. According to IDC, 80% of enterprise information is unstructured. Let’s assume that these figures are correct. For the sake of discussion, what do you do when faced with a migration? Move it all? Why? Because it’s easier and you can ‘worry about it later’? Or do you tell your end users to clean up their messes and trust them to do it (and do it correctly)?
What are the problems? Migration of unstructured content can be a laborious and time-consuming project. The challenge is that the same document can exist in multiple places at once, different revisions of the same document exist, some documents should be deleted, and others should be archived. There may be records that were never declared, as well as confidential or private information that will not be identified when it is migrated. Moving content en masse is relatively straightforward. However, from an information governance perspective, moving everything simply recreates the same problem of mismanaged content in the new location.
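As a rough illustration of the "same document in multiple places" problem, exact copies can be found by hashing each document's normalized text. This is only a minimal sketch: the function name and the path-to-text dictionary are assumptions made for the example, and it catches identical copies only, not different revisions of a document.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(documents):
    """Group document paths whose normalized text is identical.

    `documents` is a hypothetical mapping of path -> extracted text.
    Returns a list of sorted path lists, one per set of duplicates.
    """
    by_digest = defaultdict(list)
    for path, text in documents.items():
        # Collapse whitespace and ignore case so trivial differences
        # do not hide an otherwise identical copy.
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        by_digest[digest].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


docs = {
    "/finance/q1_report.txt": "Quarterly results for Q1",
    "/shared/copy_of_q1_report.txt": "Quarterly   results for Q1 ",
    "/hr/handbook.txt": "Employee handbook",
}
duplicates = group_exact_duplicates(docs)
```

Here the two Q1 files land in the same group, while the handbook is left alone. Detecting near-duplicates and revisions would need a similarity measure rather than an exact hash.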
To migrate document collections effectively, the text content of each document needs to be searched to determine its value. This cannot be done manually: the volume is too high, and human review and decision making is both inconsistent and costly. If documents are processed manually, their security rights must also be reapplied as they are moved to their new location. General migration tools cannot safeguard document confidentiality because they do not make intelligent taxonomy workflow decisions based on the text content of the document, nor can they do so en masse.
We recommend auto-generating semantic metadata and a taxonomy before migration in order to cleanse and classify the content. This determines whether each document should be moved or archived, whether it contains sensitive or private data, whether it is a record that needs to be kept but was never declared, or whether it should simply be deleted because of redundancy or versioning.
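To make those decisions concrete, here is a minimal sketch of a pre-migration triage step. Everything in it is an assumption for illustration: the `Document` fields, the keyword list standing in for record detection, the US Social Security number pattern standing in for privacy detection, and the three-year archive threshold. Real semantic metadata generation would be far richer than keyword and pattern matching.

```python
import re
from dataclasses import dataclass
from datetime import date

# Hypothetical stand-ins for real classification rules.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
RECORD_TERMS = ("contract", "invoice", "agreement")


@dataclass
class Document:
    path: str
    text: str
    last_modified: date
    is_duplicate: bool = False  # e.g. set by an earlier de-duplication pass


def triage(doc, today, archive_after_days=3 * 365):
    """Return (action, tags) for one document.

    action is "delete", "archive", or "move"; tags flag content that
    needs special handling wherever the document ends up.
    """
    tags = []
    if SSN_PATTERN.search(doc.text):
        tags.append("sensitive")
    lowered = doc.text.lower()
    is_record = any(term in lowered for term in RECORD_TERMS)
    if is_record:
        tags.append("undeclared-record")

    if doc.is_duplicate:
        action = "delete"   # redundant copy
    elif is_record:
        action = "move"     # records are kept regardless of age
    elif (today - doc.last_modified).days > archive_after_days:
        action = "archive"  # stale, non-record content
    else:
        action = "move"
    return action, tags


today = date(2024, 6, 1)
result = triage(Document("/hr/emp.txt", "SSN 123-45-6789", date(2024, 1, 1)), today)
```

The point of the sketch is the ordering: content is classified from its text first, and only then is the move, archive, or delete decision made, so sensitive documents and undeclared records are flagged before anything is migrated.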
But what about you? Since at times I have tunnel vision, how do you address these types of migration issues with unstructured content? How do you resolve them, and what is your approach?