“Information governance cannot be a one-off approach to managing information assets, but has to be driven by policies and procedures, and must require wholehearted adoption from top to bottom.”
Offering over 300 discrete services and holding over 200 terabytes of information, the county council needed to make better use of technology to automate many manual processes, ensure that assets were well managed, and deliver services successfully to 700,000 citizens.
A variety of Concept Searching products and applications were used to achieve this, enabling the council to develop a reusable enterprise metadata repository that addressed challenges such as search, records management, and eDiscovery. Matching its needs to the file plan in SharePoint, the organization implemented centralized management of taxonomies. Search and eDiscovery, provided through conceptSearch, delivered powerful concept-based searching with sophisticated refinement capabilities. Records were automatically identified and processed, eliminating potential areas of noncompliance.
- Implemented a semantic metadata repository
- Automatic generation of semantic, compound term metadata
- Easily used by subject-matter experts
- Able to retrieve and use legacy content
This council is one of the largest local authorities in the UK, employing 18,000 people, including those working in schools. The council provides over 300 cost-effective public services, ranging from social care to highways maintenance, to over 700,000 people.
This organization felt that effective information management policies and processes were in place but that policy adoption by staff was lacking. One of the most critical challenges was the management of unstructured and semi-structured content.
What were the issues faced?
- No tagging was used for unstructured information
- There was no auto-classification capability or file structure for saving content for future retrieval
- Reduction in staff numbers translated into a legacy of unmanaged, unsearchable content
- Repositories had grown through a decentralised IT function and a siloed approach to the delivery of services
The most significant outcome the council achieved was in the technology framework, which provided a strategic solution as well as an easy-to-use, tactical methodology that was readily deployed. Concept Searching search capabilities are built on a compound term processing engine that statistically generates semantic, multi-term metadata representing concepts, phrases, topics, and subjects. Since the organization was not using any tagging, it was able to automatically generate metadata for content residing in diverse repositories, both internal and external.
A key driver for the solution was the underlying compound term technology. Compound term processing is a new approach to an old problem. Instead of identifying single keywords, compound term processing identifies multi-word terms that form a complex entity and identifies them as a concept. By forming these compound terms and placing them in the search engine’s index, the search can be performed with a higher degree of accuracy because the ambiguity inherent in single words is no longer a problem. As a result, a search for ‘triple heart bypass’ will locate documents about this topic, even if this precise phrase is not contained in any document. A concept search using compound term processing can extract the key concepts, and use these concepts to select the most relevant documents. Content that shares the same concept will be retrieved, even if the search terms do not match.
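The idea can be illustrated with a minimal sketch. This is not the Concept Searching engine, whose internals are not described here; it assumes the simplest possible statistical rule, in which an adjacent word pair counts as a compound term if it occurs in at least `min_count` documents. The function names, the stopword list, and the sample documents are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "to", "in", "on", "after", "with"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def candidate_pairs(tokens):
    """Adjacent word pairs: the simplest multi-word term candidates."""
    return list(zip(tokens, tokens[1:]))

def build_index(docs, min_count=2):
    """Map each word pair found in at least min_count documents to those documents."""
    doc_frequency = Counter()
    for text in docs.values():
        doc_frequency.update(set(candidate_pairs(tokenize(text))))
    compounds = {pair for pair, n in doc_frequency.items() if n >= min_count}
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for pair in candidate_pairs(tokenize(text)):
            if pair in compounds:
                index[pair].add(doc_id)
    return index

def search(index, query):
    """Rank documents by how many of the query's compound terms they contain."""
    scores = Counter()
    for pair in candidate_pairs(tokenize(query)):
        for doc_id in index.get(pair, ()):
            scores[doc_id] += 1
    return [doc_id for doc_id, _ in scores.most_common()]

# Hypothetical sample content; none of it contains the exact query phrase.
docs = {
    "d1": "Recovery after a triple bypass: heart bypass surgery outcomes",
    "d2": "Heart bypass waiting times and triple bypass referrals",
    "d3": "Highways maintenance schedule for the bypass road",
}
index = build_index(docs)
results = search(index, "triple heart bypass")
print(sorted(results))
```

In this toy corpus the query matches the two medical documents through the shared compound term "heart bypass", although neither contains the phrase "triple heart bypass". The highways document is not returned, because the single ambiguous word "bypass" is never indexed on its own.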
Automatic and/or manual classification to one or more taxonomies prevents end users from making potentially erroneous classification decisions. Auto-classification of content against a defined structure gave this organization a hierarchy to facilitate the retrieval of relevant information when searching or during eDiscovery. The council had no filing structure, so it was dependent on the author of the information to be able to retrieve it. Because both automatic and manual classification are supported, subject-matter experts can use rich features in the conceptTaxonomyManager component, such as node weighting, the ability to see ‘concepts in context’, auto-clue suggestion for classification, and instant feedback on the impact of changes. The taxonomy provides the structure for grouping like documents, and serves as a more targeted, accurate, and efficient management and tuning tool, resulting in reduced costs and improved productivity.
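As a rough illustration of classifying content against weighted taxonomy nodes, the sketch below scores a document by summing the weights of the clue terms it contains and assigns it to every node that reaches a threshold. This is an assumption about how weighted clues might work in general, not a description of conceptTaxonomyManager; the taxonomy, clue terms, weights, and threshold are all hypothetical.

```python
# Hypothetical taxonomy: each node maps clue terms to weights set by a subject-matter expert.
TAXONOMY = {
    "Social Care": {"care plan": 3.0, "social worker": 2.0, "safeguarding": 2.0},
    "Highways Maintenance": {"road repair": 3.0, "pothole": 2.0, "resurfacing": 1.5},
}

def classify(text, taxonomy=TAXONOMY, threshold=2.0):
    """Return (node, score) pairs whose summed clue weights reach the threshold."""
    lowered = text.lower()
    matches = []
    for node, clues in taxonomy.items():
        # Simple substring matching; a real system would match extracted concepts.
        score = sum(weight for clue, weight in clues.items() if clue in lowered)
        if score >= threshold:
            matches.append((node, score))
    # Highest-scoring node first.
    return sorted(matches, key=lambda match: match[1], reverse=True)

highways = classify("Pothole reported on the high street; road repair scheduled")
unclassified = classify("Annual staff newsletter")
print(highways)
print(unclassified)
```

Raising or lowering a clue weight changes which documents reach the threshold, which is the kind of tuning that instant feedback on the impact of changes would let a subject-matter expert check.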
With litigation increasingly affecting all industries, the ability to locate content based on the concepts found within similar content, or content that shares the same themes, is extremely valuable. Traditional information retrieval systems use ‘keyword’ searches of text and metadata as a means of identifying and filtering documents during eDiscovery. These keyword searches can include the use of simple words or combinations of words, and often use Boolean operators to further refine information retrieval. Keyword search captures only 33 percent of relevant information, resulting in the retrieval of potentially large numbers of documents that are not weighted or ranked based upon their relevance. Each document is considered to have equal importance and equal probability of relevance, requiring each to be reviewed manually. Although Boolean operators are commonly used, these approaches are limited by their dependence on matching the specific language entered by knowledge professionals to retrieve items relating to the desired topic of interest. Boolean operators also require a high skill level and are cumbersome to use.
Finding appropriate, relevant documents is hampered by the limits of search specialists’ ability to think of every term that could be applicable. In eDiscovery, different parties may use different words, depending on their roles. It is estimated that legal professionals’ searches are only 20 to 25 percent accurate and complete when retrieving information from a heterogeneous set of documents.
Compound term processing and auto-classification overcome the problem of being unable to exhaust all possible search criteria. Designed for end users rather than expert search specialists, the search interface is intuitive and easy to use, and requires no training.
For the council, the solution reduced risk, ensured compliance, and enabled it to concentrate on operating effectively and providing services to citizens.
- Ability to access and retrieve legacy information that previously remained hidden
- Concept-based information retrieval, increasing the accuracy and relevance of results
- Ability to research and identify noncompliance risk
- Effectively identify content for eDiscovery
- Legal compliance and facility for legal hold
- Accessibility to information, maintaining end user security rights
- Flexibility of the technology as a strategic solution and facilitator of tactical tasks, leveraging SharePoint investment
- No end user training required
- Holistic view of information delivers improved decision-making capabilities