The Concept Searching Technology Platform is based on our Smart Content Framework™ for information governance, and incorporates best practices for developing an enterprise framework to mitigate risk, automate processes, manage information, protect privacy, and address compliance issues. Underlying the framework is the technology to:
- Automatically generate semantic metadata
- Auto-classify content from diverse repositories
- Easily develop, deploy, and manage taxonomies
The framework is used to build intelligent metadata enabled solutions that improve search, records management, enterprise metadata management, text analytics, migration, enterprise social networking, and data security.
Why Use the Concept Searching Technology Platform?
- You want to take advantage of the full product and feature set, including conceptSearch, conceptSQL, APIs, custom controls, and demonstration source code
- You need a powerful enterprise search solution that delivers highly precise results
- You run external web sites, where the platform is frequently used, particularly sites offering paid services to visitors
- You do not have a SharePoint environment but would like semantic metadata generation, auto-classification, and taxonomy management
- You do have a SharePoint environment but would like to use additional bundled products and features
- You need to implement one or more intelligent metadata enabled solutions, which may include:
  - Enterprise information governance
  - Content management
  - Content migration
  - Concept based searching
  - Sensitive information identification and protection
  - Automatic declaration of documents of record
  - Text analytics
  - Enterprise social networking
Why Use conceptSearch?
What are the key outcomes?
The combination of the Smart Content Framework™, the Concept Searching Technology Platform, and the deployment of intelligent metadata enabled solutions results in a comprehensive and complete approach to metadata management in an internal or external environment. Our clients are using the technologies to:
This table provides an overview of all Concept Searching Platforms and the components for each platform.
|Core Components||conceptClassifier for SharePoint Platform (SharePoint 2013)||conceptClassifier for SharePoint Platform (SharePoint 2010)||conceptClassifier for SharePoint Platform (SharePoint 2007)||conceptClassifier for Office 365 Platform||conceptClassifier Platform||Concept Searching Technology Platform|
|Compound Term Processing Engine – licensed for concept extraction only||yes||yes||yes||yes||yes||Full search functionality included|
|SharePoint Feature Set||yes||yes||yes||yes||no||yes|
|APIs, custom controls, demonstration source code||no||no||no||no||yes||yes|
|Proprietary controls for SharePoint 2007||no||no||yes||no||no||yes|
|Optional Components||conceptClassifier for SharePoint Platform (SharePoint 2013)||conceptClassifier for SharePoint Platform (SharePoint 2010)||conceptClassifier for SharePoint Platform (SharePoint 2007)||conceptClassifier for Office 365 Platform||conceptClassifier Platform||Concept Searching Technology Platform|
|conceptSearch||yes||yes||yes||yes||yes||Included in Base Product|
|conceptSQL||yes||yes||yes||yes||yes||Included in Base Product|
|Content Enrichment Service for SharePoint 2013||yes||no||no||no||no||yes|
|conceptClassifier for OneDrive for Business||yes||no||no||yes||no||no|
|FAST Pipeline Stage for SharePoint 2010||yes||yes||no||no||no||yes|
|Additional Classification Servers||yes||yes||yes||yes||yes||yes|
|Additional Front End Web Servers||yes||yes||yes||N/A||yes||yes|
The following table illustrates the functionality and features available in the Concept Searching Technology platform. All technology platforms share the same core functionality. The most significant difference in the Concept Searching Technology platform is the inclusion of our enterprise search engine, conceptSearch, and conceptSQL. Although the SharePoint Feature Set is available with the Concept Searching Technology platform, the platform is frequently used in non-SharePoint environments or by those who prefer our enterprise search engine for scalability, performance, and precision searching.
|Feature/Function||Concept Searching Technology Platform|
|Text mining to identify candidate terms||yes|
|Import, combine, organize and harmonize taxonomy models||yes|
|Distributed taxonomy management||yes|
|Instant feedback on taxonomy changes, dynamic screen updating||yes|
|Ability to automatically suggest classification clues for taxonomies||yes|
|Native integration with the SharePoint Term Store with no need to import/export||yes (index and classify)|
|Managed by Subject Matter Experts||yes|
|Automatic tagging of content from diverse repositories, web sites, content management systems||yes|
|Classification as content is created or ingested||yes|
|Classification to one or more nodes in one or more taxonomies||yes|
|Classifies all unstructured and semi-structured content from libraries, blogs, wikis, threads, and lists||yes|
|Rich Web indexing support||yes|
|Compound Term Indexing Engine embedded within conceptClassifier and conceptTaxonomyManager||yes|
|Automatic generation of compound terms||yes|
|Concept based searching||yes|
|Taxonomy and faceted navigation||yes|
|Text preview capability||yes|
|Related topics by taxonomy node||yes|
|Single integrated view of content||yes|
|Intelligent Metadata Enabled Solutions|
|Data privacy and information security||yes|
The Concept Searching Technology Platform is based on an open architecture with all APIs based on XML and Web Services. Transparent access to system internals, including the statistical profile of terms, is standard. The base platform is installed as a feature set and comprises the following components.
|Base Components in the Concept Searching Technology Framework
|Typical Recommended Base Configuration
Additional Front End Web Servers (Optional)
Additional Classification Servers (Optional)
“Due to Concept Searching’s straightforward design, integration into our environment was very easy and we were able to provide much more than a basic search results list; we were actually able to categorize the search results by content type – documents, forums, communities, people, action items and wikis.”
Douglas Book, President and CEO, Triune Group
Knowledge workers need to identify content in the context of what they are seeking. The fundamental problem with most enterprise search solutions, and all statistical search solutions, is that they are based on an index of single words. Yet most queries are expressed as short patterns of words, not as single words in isolation, which are highly ambiguous.
conceptSearch is an enterprise search solution that delivers precision search results. Platform agnostic, it is a highly scalable, high-performance search solution for organizations that require the ability to search on concepts and identify related concepts in query results. It is frequently used in public-facing web sites, where high performance, scalability, and precision searching are required.
The primary issue with search engines is the inability to access meaningful metadata. For most organizations, metadata tagging is either system generated or left to the end user, which means it is often ambiguous, subjective, absent, or wrong. A comprehensive approach requires more than syntactic metadata, and expecting end users to add rich metadata is haphazard and subjective at best. With our underlying compound term processing technology, conceptSearch is able to automatically identify ‘concepts in context’ and retrieve results that contain the keywords or phrases that were searched for and, in addition, retrieve information with similar concepts.
How Does It Work?
“With the integration of the Concept Searching intelligent search capability, Triune Group was able to provide us with a robust and scalable collaboration tool that delivers not only powerful advanced searching capabilities, but also a controlled and secure environment.”
Brian Follen, NASA Safety Center (NSC) KnowledgeNow Program Manager
conceptSearch is not restricted to keyword identification, and compound term metadata can be automatically generated either when the content is created or when ingested. The generation of metadata based on concepts extracts compound terms and keywords from a document, or a corpus of documents, that are highly correlated to a particular concept. By identifying the most significant patterns in any text, these compound terms can then be used to generate non-subjective metadata based on an understanding of conceptual meaning.
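To make the idea of compound terms concrete, the following is a minimal sketch and not Concept Searching's engine. It assumes a very simple stand-in technique: counting recurring two- and three-word sequences that contain no stopwords. The function name `candidate_compound_terms`, the stopword list, and the sample text are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for this illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "for",
             "is", "was", "after", "with", "on"}

def candidate_compound_terms(text, max_len=3, min_count=2):
    """Return multi-word sequences without stopwords that recur in the text.

    Sentence boundaries are ignored for brevity.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if not any(w in STOPWORDS for w in gram):
                counts[" ".join(gram)] += 1
    return {term: c for term, c in counts.items() if c >= min_count}

text = ("The patient recovered after a triple heart bypass. "
        "A triple heart bypass is a form of heart surgery. "
        "Heart surgery carries risk.")
terms = candidate_compound_terms(text)
```

In this sample, the recurring phrases ‘triple heart bypass’ and ‘heart surgery’ surface as candidates, while the single word ‘heart’ on its own says little about which topic a document covers.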
conceptSearch can isolate the key meaning that is normally expressed in proper nouns, noun phrases, and verb phrases. Linguistic products can do this, but their performance is highly variable depending upon the vocabulary and language in use. A statistically based, language-independent concept search can accept queries in natural language, with the user typing words, phrases, or whole sentences. The system then analyzes the natural language query to extract the keywords and phrases, identify the main concepts, and retrieve content that is highly relevant.
The core compound term processing technology can address many challenges facing large enterprises and provides many benefits, including:
- Identifying concepts within a large corpus of information
- Removing the ambiguity in search
- Eliminating inconsistent meta-tagging
- Simplifying taxonomy development and ongoing maintenance
Precision and Recall – Why It’s Important
“Our previous system restricted our access to the information by a factor of at least 50%. Something that would have taken weeks is now taking just a few days. Furthermore, the intelligence in the search has meant that sometimes the database will link papers that we wouldn’t have linked in a million years. I am confident now that we don’t skip or ignore important information.”
T Longland CVO OBE, Brigadier (Retd) DCDC
Precision and recall are the two key performance measures for information retrieval. Precision is the retrieval of only those items that are relevant to the query. Recall is the retrieval of all items that are relevant to the query. Yet most information retrieval technologies are less than 22% accurate for both precision and recall. The ideal goal is to have them balanced. Compound term processing has the ability to increase precision with no loss of recall.
High recall with low precision is easy to achieve and is the typical approach of Internet and enterprise search engines: many documents are retrieved, but they are not necessarily relevant. High precision with low recall returns only documents that are relevant, but some highly relevant documents are left out of the results.
The ideal result is to have both of these measurements balanced. conceptSearch was designed to achieve this objective.
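The two measures can be computed directly from a result list. The sketch below uses the standard definitions (precision is the share of retrieved items that are relevant, recall is the share of relevant items that were retrieved). The document IDs and the function name `precision_recall` are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Four documents are truly relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}

# A broad search returns everything relevant plus four irrelevant documents.
broad = precision_recall(
    ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"], relevant)

# A narrow search returns only relevant documents but misses half of them.
narrow = precision_recall(["d1", "d2"], relevant)
```

The broad search illustrates high recall with low precision, and the narrow search illustrates high precision with low recall.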
All the functions you need to start gaining control over your unstructured content are included in the base Concept Searching Technology Platform. Our clients have discovered the unique and varied uses of the technology to solve a wide variety of content management challenges. Below is a list of the base platform and optional products that are needed to solve your particular business process challenge and leverage your technology investment.
Why wait? Improve your business processes and positively impact your bottom line starting today.
Conceptual Search Platform
conceptSearch is Concept Searching’s enterprise-class search product and a key component in the Concept Searching Technology Platform. It is a unique, language-independent technology and is the first content retrieval solution to integrate relevance ranking based on the Bayesian Inference Probabilistic Model and concept identification based on Shannon’s Information Theory. Unlike other enterprise search engines that require significant customization with marginal results, conceptSearch is delivered with an out-of-the-box application that demonstrates a simple search interface and indexing facilities for internal content, web sites, file systems, and XML documents. Application developers experience a minimal learning curve and the organization can look forward to a rapid return on investment.
Because of the innovative technology, conceptSearch delivers both high precision and high recall, the two key performance measures for information retrieval. Compound term processing has the ability to increase precision with no loss of recall.
conceptSearch is particularly important for organizations that need sophisticated search and retrieval solutions. By weighting multi-word phrases, instead of single words or words in proximity, the retrieval experience is more accurate and relevant. The ability of the search engine to identify concepts enables organizations to improve the search experience for a variety of business requirements.
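The effect of weighting phrases rather than single words can be shown with a toy scorer. This is an illustrative sketch only, not conceptSearch's ranking model: the functions `word_score` and `phrase_score`, the `phrase_weight` bonus, and the sample documents are all assumptions made for this example.

```python
def tokens(text):
    """Split text into lowercase words."""
    return text.lower().split()

def word_score(query, doc):
    """Count query words found anywhere in the document (single-word matching)."""
    doc_words = set(tokens(doc))
    return sum(1 for w in tokens(query) if w in doc_words)

def phrase_score(query, doc, phrase_weight=3):
    """Single-word score plus a bonus when the query appears as one adjacent phrase."""
    q, d = tokens(query), tokens(doc)
    adjacent = any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))
    return word_score(query, doc) + (phrase_weight if adjacent else 0)

query = "heart bypass"
medical = "recovery after heart bypass"
traffic = "the bypass avoids the heart of the city"

# Single-word matching cannot tell the two documents apart.
word_scores = (word_score(query, medical), word_score(query, traffic))

# Phrase weighting ranks the medical document first.
phrase_scores = (phrase_score(query, medical), phrase_score(query, traffic))
```

Both documents contain the words ‘heart’ and ‘bypass’, so single-word matching ties them, while the phrase-aware score separates the document that is actually about a heart bypass.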
Required Products: Concept Searching Technology Platform
Search Engine Integration
Functionality is provided via the Concept Searching Technology platform to integrate with any search engine. The Concept Searching Technology platform can perform on-the-fly classification, with search engines calling the classify API. Search engine support includes SharePoint, the former FAST products, Office 365 Search, Solr, Google Search Appliance, Autonomy, and IBM Vivisimo. If the FAST Pipeline Stage is required, it is sold as a separate product.
Required Products: Concept Searching Technology Platform. FAST Search for SharePoint 2010 also requires the FAST Pipeline Stage, and SharePoint Search in SharePoint 2013 also requires the Content Enrichment Service.
Intelligent Document Classification
Functionality is provided via the Concept Searching Technology platform to classify documents based upon concepts and the multi-word terms that form a concept. Automatic and/or manual classification is included. Knowledge workers with the appropriate security rights can also classify content in real time. Content can be classified from diverse repositories including SharePoint, Office 365, file shares, Exchange public folders, and websites. All content can be classified on the fly and classified to one or more taxonomies.
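As a rough illustration of classifying one document to more than one taxonomy node, the sketch below scores a document against weighted clue phrases per node. It is not the conceptClassifier API. The `classify` function, the node names, the clue weights, and the threshold are all hypothetical.

```python
def classify(text, taxonomy, threshold=1.0):
    """Return {node: score} for every node whose clue score meets the threshold."""
    lowered = text.lower()
    matches = {}
    for node, clues in taxonomy.items():
        # Add the weight of every clue phrase that occurs in the document.
        score = sum(weight for clue, weight in clues.items() if clue in lowered)
        if score >= threshold:
            matches[node] = score
    return matches

# Hypothetical taxonomy: node -> {clue phrase: weight}.
taxonomy = {
    "Cardiology/Surgery": {"heart bypass": 1.0, "open heart surgery": 1.0,
                           "surgery": 0.25},
    "Compliance/Privacy": {"social security number": 1.0, "date of birth": 0.5,
                           "patient record": 0.5},
    "Finance/Invoices": {"invoice number": 1.0, "purchase order": 0.5},
}

doc = ("Patient record: date of birth on file. "
       "Scheduled for triple heart bypass surgery.")
result = classify(doc, taxonomy)
```

Here the same document lands in both a subject node and a privacy node, which is the kind of multi-node result the paragraph above describes.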
Required Product: Concept Searching Technology Platform
conceptTaxonomyManager is simple to use, has an intuitive user interface designed for Subject Matter Experts, and does not require IT or Information Scientist expertise to build, maintain, and validate taxonomies for the enterprise. conceptTaxonomyManager has the capability to automatically group unstructured content together based on an understanding of the concepts and ideas that share mutual attributes, while separating dissimilar concepts.
This approach is instrumental in delivering relevant information via the taxonomy structure as well as using the semantic metadata in enterprise search to reduce time spent finding information, increase relevancy and accuracy of the search results, and enable the re-use and re-purposing of content. Using one or more taxonomies, unstructured content can be leveraged to improve any application that uses metadata. This flexibility extends to records management, information security, migration, text analytics, and collaboration.
Required Product: Concept Searching Technology Platform
Using the Concept Searching Technology platform, an intelligent approach to migration can be achieved. As content is migrated, it is analyzed for organizationally defined descriptors and vocabularies. The content is then automatically classified to taxonomies or, in the SharePoint environment, to the SharePoint Term Store, and organizationally defined workflows are automatically applied to route the content to the appropriate repository for review and disposition.
Intelligent Records Management
The ability to intelligently identify, tag, and route documents of record to either a staging library and/or a records management solution is a key component in driving and managing an effective information governance strategy. Taxonomy management, automatic declaration of documents of record, auto-classification, and semantic metadata generation are provided via the Concept Searching Technology platform and conceptTaxonomyWorkflow.
Required Products: Concept Searching Technology Platform, conceptTaxonomyWorkflow
The solution is fully customizable to identify unique or industry standard descriptors. Content is automatically meta-tagged and classified to the appropriate node(s) in the taxonomy based upon the presence of the descriptors, phrases, or keywords within the content. Once tagged and classified, the content can be managed in accordance with regulatory or government guidelines. The identification of potential information security exposures includes the proactive identification and protection of unknown privacy exposures before they occur, as well as real-time monitoring of organizationally defined vocabulary and descriptors in content as it is created or ingested. Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform and conceptTaxonomyWorkflow.
Required Products: Concept Searching Technology Platform, conceptTaxonomyWorkflow
eDiscovery, Litigation Support, and FOIA Requests
Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform. This is highly useful when relevance, identification of related concepts, and vocabulary normalization are required to reduce time and improve the quality of search results.
Required Products: Concept Searching Technology platform, conceptTaxonomyWorkflow
Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform. A third-party business intelligence or reporting tool is required to view the data in the desired format. This is useful for cleansing the data sources before using text analytics: it removes content noise and irrelevant content, and identifies any unknown privacy exposures or records that were never processed.
Required Product: Concept Searching Technology Platform
Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform. Integration with social networking tools can be accomplished if the tools are available in .NET or via SharePoint functionality. This is useful to provide structure to social networking applications and provide significantly more granularity in relevant information being retrieved.
Required Product: Concept Searching Technology Platform
Business Process Workflow
conceptTaxonomyWorkflow serves as a strategic tool for managing migration activities and content type application across multiple SharePoint and non-SharePoint farms, and is platform agnostic. This add-on component delivers value specifically in migration, data privacy, and records management, or in any application or business process that requires workflow capabilities.
conceptTaxonomyWorkflow is required to apply an action to a document and, optionally, to automatically apply a content type and route the document to the appropriate repository for disposition.
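The apply-a-content-type-and-route step can be pictured as a small rule table. The following is a sketch under stated assumptions, not the conceptTaxonomyWorkflow configuration format: the `Action` type, the `RULES` table, the node names, and the repository names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Action:
    content_type: Optional[str]  # content type to apply, if any
    destination: str             # repository the document is routed to

# Hypothetical rules: taxonomy node -> action to take.
RULES = {
    "Compliance/Privacy": Action("Confidential Document", "secure-repository"),
    "Records/Contracts": Action("Record", "records-staging-library"),
}
DEFAULT = Action(None, "general-library")

def route(nodes):
    """Return the action for the first classified node that has a rule, else the default."""
    for node in nodes:
        if node in RULES:
            return RULES[node]
    return DEFAULT

# A document classified to a subject node and a privacy node.
privacy_action = route(["Cardiology/Surgery", "Compliance/Privacy"])

# A document with no matching rule falls through to the default.
default_action = route(["Cardiology/Surgery"])
```

Keeping the rules in a plain table, separate from the routing logic, mirrors the idea that workflows are organizationally defined rather than hard-coded.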
Concept Searching Technology Platform Benefits
With the exponential increase in unstructured information, enterprises are seeking new ways not only to improve the search and retrieval process but also to manage, capitalize on, and leverage their information assets to improve organizational performance. Moving beyond keyword metadata and traditional taxonomy approaches, the use of compound term processing, or identifying ‘concepts in context’, effectively addresses the issue of managing unstructured content and enables organizations to more effectively find, organize, and manage their information capital.
- Based on industry unique compound term processing
- Platform Agnostic
- Ability to auto-classify content from diverse internal and external repositories
- Powerful and still industry unique enterprise search engine, conceptSearch
- Ability to generate semantic metadata and surface it to any search engine to improve search results
- Ability to automatically tag content with vocabulary or retention codes for records management
- Ability to provide intelligent migration capabilities based on the semantic metadata within content, identify previously undeclared documents of record, unidentified privacy exposures, or information that should be archived or deleted
- Ability to cleanse data to be used in text analytics by identifying relevant, accurate information and identifying previously undeclared records or privacy data that should be exempt from the text analytics process
- Ability to provide granular and structured identification of people, content recommendations, and organizational knowledge assets
Leveraging Your Technology Investment
Concept Searching’s technologies still have not been replicated in the marketplace. The technologies are unique, language independent, and the first content retrieval solution to integrate relevance ranking based on Bayesian Inference Probabilistic Model and concept identification based on Shannon’s Information Theory. The key features include:
Leveraging Your Business Investment
The real value of your investment includes both technology and the demonstrable ROI that can be generated from improving business processes. The Concept Searching Technology platform has been deployed by clients to solve individual or multiple challenges including:
Concept Searching has a current Enterprise Authority to Operate (ATO) from the US Air Force and a current Enterprise Certificate of Networthiness (CoN) from the US Army, and has been deployed on the SIPR, NIPR, and DISA networks.
Technology and Business Differentiators
Compound Term Processing
Concept Searching’s industry unique compound term processing technology delivers outcomes that are not achieved by any other classification engine. Compound term processing means that Concept Searching’s statistical engine can understand, out-of-the-box, the incremental value of keywords, multi-word fragments, and compound terms. As a result, it can identify concepts resident within an organization’s own information repositories that are highly correlated to particular topics. With the identification of these highly correlated topics in the form of keywords, multi-word fragments and compound terms the result is automatically generated intelligent metadata that is unique to the organization. By using these compound terms in any application that requires metadata, the outcomes are highly accurate, because the ambiguity inherent in single words is no longer an issue.
A search for ‘triple heart bypass’ will locate documents about this topic even if this precise phrase is not contained in any document. A concept search using compound term processing can extract the key concept, in this case ‘triple heart bypass’, and use it to retrieve relevant documents containing concepts such as ‘heart surgery’, ‘coronary artery bypass’, or ‘open heart surgery’.
Industry Unique Taxonomy Management Features
conceptTaxonomyManager remains unique in the industry in features that provide the ability to rapidly and easily change the taxonomy as the organizational needs and requirements change. This is important as a taxonomy must remain fluid as opposed to static and must be managed in a way that facilitates change. The easy to use taxonomy and automatic classification tools create the framework to classify content based on concepts to one or more nodes in the taxonomy or multiple taxonomies. The conceptTaxonomyManager component is included in the base product in all platforms.
Ease of Use
Automatic Clue Suggestion
Document Movement Feedback
Distributed Taxonomy Management
Security and Rollback
conceptTaxonomyWorkflow is a powerful add-on product that automates manual business processes. It serves as a strategic tool for managing migration activities and content type application across multiple SharePoint and non-SharePoint farms, and is platform agnostic. This add-on component delivers value specifically in migration, data privacy, and records management, or in any application or business process that requires workflow capabilities. It is required to apply an action to a document and, optionally, to automatically apply a content type and route the document to the appropriate repository for disposition.
Intelligent Metadata Enabled Solutions
Concept Searching is the only available solution that addresses the challenges in managing unstructured and semi-structured data in SharePoint and non-SharePoint environments.
Our intelligent metadata enabled solutions address the following challenges with one set of technologies, leverage an enterprise’s investment in SharePoint, and reduce the resources needed to maintain and manage the solution.
Enterprise Search enables concept based searching by providing the search engine index with the compound terms and semantic metadata.
Records Identification is improved through the elimination of end user tagging, automatic declaration of records, and taxonomy workflow capabilities.
Data Privacy is improved because confidential data, as defined by the organization, is identified in real time and routed to a secure repository for disposition.
Intelligent Migration is accomplished through the identification and auto-classification of the unstructured or semi-structured data to one or more taxonomies.
Text Analytics can be performed to extract organizationally defined descriptors and concepts from diverse repositories.
Social Networking and collaboration applications gain structure and the ability to retrieve highly relevant and granular information.
eDiscovery and FOIA time and costs are reduced through identification of highly detailed information and conceptually similar information that typically would not be found.
The following resources will provide you with additional information about the Concept Searching Technology platform. Please visit our Knowledge Center for links to all product and industry information.
Collateral and White Papers
Smart Content Framework™
Intelligent Metadata Solutions