Enterprise Search – Precision versus Recall – What is old becomes new again
During my research I often find little snippets of information that make me stop and think about how ideas, theories, and processes are repeated. I imagine a highway being built that stretches endlessly toward the horizon, only to bring us back to the starting point. It seems to be happening more often lately.
Even with technology, we still see history repeating itself. Enterprise search has been around for about 67 years, as described by J.E. Holmstrom in 1948. Machine Learning, or Artificial Intelligence, has been around for 61 years and is now becoming the newest buzzword and must-have technology. Precision and Recall was introduced in 1955, when a gentleman named Allen Kent joined Case Western Reserve University. That same year, Kent and his colleagues published a paper in American Documentation describing the precision and recall measures and detailing a proposed “framework” for evaluating an Information Retrieval system, which included statistical sampling methods for determining the number of relevant documents not retrieved.
Over three generations have passed, and what is ‘old’ is now ‘new’. Precision and Recall is back in the news, at least in the legal industry. What brought this to mind is an article I read in Legaltech News, written by Zach Warren. It’s a good read regardless of industry, as he hits the nail on the head on almost every point.
Years ago, the accuracy of search was measured by precision versus recall. In fact, we have several clients who use our tools to tweak and manage precision versus recall. Why? One is considered one of the top three global analyst firms, and they need precision and recall on their external client web site, where poor search results equal lost revenue. Another client has 170K global users and needs accurate search results. The image from Wikipedia illustrates Precision and Recall in an easy-to-understand graphic.
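For anyone who prefers numbers to pictures, the two measures reduce to simple ratios: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. Here is a minimal sketch in Python. The function name and the document IDs are made up for illustration and do not come from any particular search product.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: IDs of the documents the search engine returned
    relevant:  IDs of the documents a human judged relevant
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually returned

    # Guard against division by zero when either set is empty.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 5 documents truly relevant.
retrieved_docs = ["doc1", "doc2", "doc3", "doc4"]
relevant_docs = ["doc2", "doc4", "doc7", "doc8", "doc9"]

p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

In this made-up case, half of what came back was useful (precision 0.50), but the search found only two of the five relevant documents (recall 0.40). Returning more results usually raises recall at the cost of precision, which is the "versus" in precision versus recall.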
These days, a few of our clients aside, I don’t think it is used much. I also agree with the writer that most tools don’t let you easily manipulate precision versus recall. It seems to be a forgotten metric in search efficiency. Luckily, our tools are easy to use, and although precision and recall is a tough nut to crack, it’s not like it used to be. Nice to see it back around again, at least in the legal industry.