Auto-classification – Content-based Classification
I don’t talk much about auto-classification, although it’s a key component in our technology. In content-based classification, which is what we do, the subjects in a document are weighted, and those weights determine the class to which the document is assigned. The weighting is driven by rules that specify which terms and descriptors are important and how much weight each should carry.
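To make the idea of weighted rules concrete, here is a minimal sketch in Python. It is not our product's implementation. The class names, terms, weights, and threshold are all hypothetical, and the matching is simple substring counting, which a real rule engine would replace with something more careful.

```python
# Hypothetical rules: each class maps descriptor terms to a weight.
RULES = {
    "contracts": {"agreement": 3.0, "indemnify": 2.0, "party": 1.0},
    "invoices": {"invoice": 3.0, "amount due": 2.0, "payment": 1.0},
}

# Hypothetical minimum score a class needs before it is assigned.
THRESHOLD = 3.0


def score_document(text, rules=RULES):
    """Return a score per class: the sum of weight * occurrences of each term."""
    lowered = text.lower()
    return {
        label: sum(weight * lowered.count(term) for term, weight in terms.items())
        for label, terms in rules.items()
    }


def classify(text, rules=RULES, threshold=THRESHOLD):
    """Assign the highest-scoring class, or None if no class reaches the threshold."""
    scores = score_document(text, rules)
    label, best = max(scores.items(), key=lambda item: item[1])
    return label if best >= threshold else None
```

Calling `classify("This agreement binds each party to indemnify the other.")` scores the document against every class and assigns the one with the highest weighted total, while a document that matches no rule strongly enough is left unclassified.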
Auto-classification, sometimes termed categorization, helps achieve information governance
Request-oriented classification, sometimes just called indexing, is classification where users determine how a document is classified. This can be useful to specific functional groups, but policy must also be applied to ensure the content is consistently and accurately classified.
Categorization, although the two terms are now often used interchangeably, groups documents into categories based on their shared properties. The result is that a document is recognized, differentiated, and understood. Usually, the intent is to group similar content for a specific purpose.
Auto-classification and categorization
Classification and categorization, to different degrees, are particularly useful in records management and data security applications, where predictable metadata or patterns can be used to manage the content life cycle: retention, disposition, and the protection of private data. For example, one SharePoint client has 72,000 site collections and 5,300 retention codes. Using auto-classification, documents of record, as well as previously unknown privacy data violations, are automatically identified and routed to the appropriate repositories. All manual tagging of content has been eliminated. The client had a very strong business case and has achieved its objectives, with direct and quantifiable business benefits that were realized quite rapidly.
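As a rough illustration of pattern-based routing, the sketch below checks a document for a privacy pattern first and then for record keywords. Everything in it is assumed for the example: the retention codes, repository names, keywords, and the identifier pattern are hypothetical and are not taken from the client described above.

```python
import re

# Hypothetical pattern: a US Social Security number style identifier.
PRIVACY_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical rules: (keyword, retention code, target repository).
RETENTION_RULES = [
    ("invoice", "FIN-007", "finance-records"),
    ("employment contract", "HR-012", "hr-records"),
]


def route(text):
    """Return the repository and retention code a document should be routed to."""
    # Privacy checks run first so exposed personal data is never filed as a normal record.
    if PRIVACY_PATTERN.search(text):
        return {"repository": "privacy-quarantine", "retention_code": None}
    lowered = text.lower()
    for keyword, code, repository in RETENTION_RULES:
        if keyword in lowered:
            return {"repository": repository, "retention_code": code}
    # Nothing matched: leave the document for review rather than guess.
    return {"repository": "unclassified", "retention_code": None}
```

The ordering is a design choice in this sketch: a document that contains both an invoice keyword and a possible personal identifier goes to quarantine, on the assumption that a privacy exposure matters more than the retention schedule.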
I am curious whether readers use any type of classification technology. I would assume it is typically request-oriented classification, although I could be wrong; it wouldn’t be the first time. I am interested in any feedback on classification in general and on your experiences.