Natural Language Processing Equals Keywords? Whoa, Wait a Minute
An interesting discussion came up the other day on the use of the term natural language processing (NLP).
According to Wikipedia, “Natural language processing is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages. As such, NLP is related to the area of human–computer interaction. Many challenges in NLP involve: natural language understanding, enabling computers to derive meaning from human or natural language input; and others involve natural language generation.”
For those software scientists it ain’t easy. The gist of the conversation was that the use of keywords for auto-classification was equal to NLP for classification. Keywords are not, again I say not, NLP.
Think of a very rudimentary taxonomy, because that’s all it could be, with documents classified to it by keyword. Not even keywords in proximity to one another, just plain ole keywords. For example, take the phrase ‘triple heart bypass’. A keyword system would return all documents that contain the word ‘triple’ (baseball, three), all documents that contain the word ‘heart’ (core, center, love), and all documents that contain the word ‘bypass’ (highway, avoid).
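To make the problem concrete, here is a minimal Python sketch of plain keyword matching. The document names, the sample text, and the `keyword_match` helper are all invented for illustration; this is not how any particular product works, just the simplest version of "match on any word".

```python
import re

# Hypothetical mini-corpus, invented for illustration only.
DOCS = {
    "baseball": "He hit a triple in the ninth inning.",
    "romance": "She followed her heart.",
    "traffic": "The new highway bypass opens Monday.",
    "cardiology": "The patient recovered well after a triple heart bypass.",
}

def tokens(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def keyword_match(query, docs):
    """Return names of documents that share at least one word with the query."""
    query_words = tokens(query)
    return sorted(
        name for name, text in docs.items() if query_words & tokens(text)
    )

hits = keyword_match("triple heart bypass", DOCS)
print(hits)  # ['baseball', 'cardiology', 'romance', 'traffic']
```

Every document comes back, including the ones about baseball, romance, and road construction, because each shares a single word with the query.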
That is quite a difference from a system that knows the words together represent the concept ‘triple heart bypass’. In addition, a concept-aware system would suggest other relevant documents even though they don’t contain the exact search words, such as documents about ‘coronary artery surgery’. Dumb versus Smart.
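A rough sketch of the concept-based alternative, again in Python, might look like the following. It is a crude stand-in, assuming a hand-built concept entry with a list of related phrases. The `CONCEPTS` table, the sample documents, and the `concept_match` helper are hypothetical, and a real NLP system would do far more than look up phrases in a list.

```python
import re

# Hypothetical concept entry: a preferred term plus related phrases.
CONCEPTS = {
    "triple heart bypass": [
        "triple heart bypass",
        "coronary artery surgery",
    ],
}

# Hypothetical mini-corpus, invented for illustration only.
DOCS = {
    "baseball": "He hit a triple in the ninth inning.",
    "traffic": "The new highway bypass opens Monday.",
    "cardiology": "The patient recovered well after a triple heart bypass.",
    "surgery_notes": "Recovery guidelines following coronary artery surgery.",
}

def normalize(text):
    """Lowercase, drop punctuation, and join the words with single spaces."""
    return " ".join(re.findall(r"[a-z]+", text.lower()))

def concept_match(concept, docs, concepts):
    """Return names of documents containing any whole phrase tied to the concept."""
    phrases = [normalize(p) for p in concepts[concept]]
    matches = []
    for name, text in docs.items():
        padded = f" {normalize(text)} "  # padding avoids partial-word matches
        if any(f" {phrase} " in padded for phrase in phrases):
            matches.append(name)
    return sorted(matches)

concept_hits = concept_match("triple heart bypass", DOCS, CONCEPTS)
print(concept_hits)  # ['cardiology', 'surgery_notes']
```

The baseball and highway documents drop out, and the surgery document is found even though it contains none of the three original words.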
Why do people consider a keyword-based classification system? Because they just don’t know the difference. But they will when they start using it. Would you consider a keyword-based classification system?
Join us for our ‘The Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy’ webinar on Wednesday, February 8th. It examines how taxonomies, auto-classification, and multi-term metadata generation unburden the IT team, eliminate end user tagging, and empower business users. Gain an understanding of different technologies and best practices for evaluation and deployment.