Delete Data? Why, just use search and create data lakes! Dive In!
I just read a very well-written article, entitled Information Governance v Search: The Battle Lines Are Redrawn, by Ralph Losey, a practicing attorney and shareholder in a national law firm with 50+ offices and over 800 lawyers, where he leads the firm’s Electronic Discovery practice group.
It is a very interesting viewpoint, and although the article is quite long, I would suggest reading it. Mr. Losey’s premise is that information should never be deleted, and that deletion should be replaced with Artificial Intelligence search. He does make several good points, but I guess I am still stuck in the old school on topics such as records management, information governance, and search. One of the points he makes is: who is to decide when data has lost its value? This is referred to as an old-school problem, as in the new world all information should be saved and data lakes created. According to Losey, “information can prove what really happened in the past and can help you to make the right decisions. With smart search, there can be great hidden value in too much information.” I do take exception to that. There is quite a bit of information that organizations keep that is actually useless. Business users still spend much of their time searching because they can’t find what they need. According to Losey, though, search will be so ‘smart’ that, I assume, the problems inherent in search engines will go away.
Losey concludes the article by stating, “that is the new reality of Big Data. It is a hard intellectual paradigm to jump, and seems counter-intuitive. It took me a long time to get it. The new ability to save and search everything cheaply and efficiently is what is driving the explosion of Big Data services and products. As the save everything, find anything way of thinking takes over, the classification and deletion aspects of IG will naturally dissipate. The records life-cycle will transform into virtual immortality. There is no reason to classify and delete, if you can save everything and find anything at low cost. The issues simplify; they change to how to save and search, although new issues of security and privacy grow in importance.” Where I see a problem is that organizations need to plan for the impact of collecting even more information, garbage or not, not only in terms of hardware but also in terms of keeping dark data.
For Information Governance, duplication and multiple sources of truth will be a problem. How can you be certain the information you are basing decisions on is relevant and accurate? Just trust the search engine?
Perhaps from a legal standpoint, the organization does need to be more careful about deleting versus keeping. But not all data or content retains value forever. I wonder, too: by keeping all data, eliminating records management, and depending only on search, do you affect the results of data mining? Does it become more complex to get to the information you are seeking, as you are now dealing with a tremendous data set where you don’t really know which end is up? I would tend to think so.
Anyway, a radically different perspective. He hasn’t convinced me. What about you?