Let’s Recreate the Wheel, Again and Again
Information creation is an interesting topic. According to IDC, employees spend 2.5 hours per day duplicating or recreating work that has already been done, at a cost of $5,000 per employee per year. From a different perspective, some estimates suggest that 20 to 35 percent of an organization’s operating revenue is wasted recovering from process failures and information rework. What do workers do when they can’t find information? They recreate it, use out-of-date content assets, interrupt co-workers for help finding it, start the task without the information they need, or simply don’t start at all. Well, that’s a bit scary, especially because these things happen frequently in most organizations.
What is thought provoking is that this is a hidden cost of doing business: most managers and executives are not even aware it is happening. Maybe the IT team is, but I doubt even IT knows the extent of the problem. One of our clients timed the retrieval of content during a proof of concept. In that test case, the content was a legal document, so it had to be found. Using its normal search procedures, the client took a day and a half to find the document. Seriously. Another client determined that its knowledge workers had been spending 40 percent of their time trying to find the ‘right’ information. This issue is a reality, not just analyst musings.
The problem, of course, is poor information retrieval. Content can’t be reused or repurposed simply because it can’t be found. Everything gets blamed on poor search, so let’s dig a little deeper. Now, I am going to ask you a straightforward question: how much garbage do you think you are storing? I’ll help you figure it out. An estimated 70 percent of content stored on file shares is redundant, obsolete, or trivial (ROT); 25 percent of content is duplicated; 10 percent has no business value; and a whopping 90 percent of documents are never accessed again after creation. Legal experts estimate that 69 percent of your data can, and should, be deleted. Do you think this garbage clogs up search results? No wonder no one can find anything.
All our clients perform what we call content optimization before implementing our technologies. It’s done in conjunction with migration. We also recommend that content optimization be performed on a quarterly basis. Why? Because it cleans up your mess.
Our solutions generate semantic, multi-term metadata and classify content against one or more taxonomies. If you tried to clean up your corpus of content manually, you would need to assess the contextual meaning of each document to determine its value. That can’t be achieved by hand: the volume of documents is almost certainly too high, and human review and decision making are inconsistent as well as costly. Most tools return fairly erroneous answers because they rely on poor metadata entered by end users and are unable to identify the context within the content. Content optimization identifies dark data, ROT, data of no value, duplicates, multiple versions, privacy and sensitive-information violations, and undeclared records. Basically, the process cleanses your corpus of content and data.
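To make one small piece of this concrete, here is a minimal sketch of how exact-duplicate detection could work: hash every file’s contents and group files that share a hash. This is an illustration only, not our product’s implementation, and the function name is hypothetical; real content optimization goes much further (near-duplicates, versions, context), but the idea of grouping by a content fingerprint is the same.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(root: str) -> dict[str, list[str]]:
    """Group files under `root` by the SHA-256 of their contents.

    Returns a dict mapping each hash to the paths sharing it,
    keeping only hashes with more than one path (exact duplicates).
    """
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(str(path))
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Running this over a file share immediately surfaces redundant copies that no one would ever find by browsing folders.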
Because our technologies automatically categorize content as it is created or ingested, extracting its contextual meaning as multi-term metadata and feeding that metadata to the search engine index, concept-based searching becomes a reality.
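A toy sketch of that idea, under the assumption of a hand-built mini-taxonomy (the concept names and terms below are hypothetical, and this is not our actual classification engine): documents are tagged with taxonomy concepts at ingest, and the tags feed an inverted index, so a search on a concept finds documents that never contain the query word verbatim.

```python
from collections import defaultdict

# Hypothetical mini-taxonomy: concept -> terms that signal it.
TAXONOMY = {
    "contract": {"agreement", "contract", "nda", "terms"},
    "invoice": {"invoice", "billing", "payment"},
}

def tag_document(text: str) -> set[str]:
    """Return the taxonomy concepts whose signal terms appear in the text."""
    words = set(text.lower().split())
    return {concept for concept, terms in TAXONOMY.items() if words & terms}

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Inverted index at ingest time: concept -> ids of documents tagged with it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in tag_document(text):
            index[concept].add(doc_id)
    return index
```

With this index, a query for “contract” retrieves a document whose text only says “signed nda with vendor”, which is the essence of concept-based rather than keyword-based search.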
Improved decision making, competitive advantage, and speed to market can sound like hype, benefits to which you can’t attach tangible ROI. But they are all valid. Do you think you have a problem with an overload of information of no value? Do you think it impacts search?
Our webinars also address the topics explored in our blogs. Access all our webinar recordings and presentation slides at any time, from our website, in the Recorded Webinars area, via the Resources tab.