Concept Searching Technology Platform

Overview

The Concept Searching Technology Platform is based on our Smart Content Framework™ for information governance, and incorporates best practices for developing an enterprise framework to mitigate risk, automate processes, manage information, protect privacy, and address compliance issues. Underlying the framework is the technology to:

  • Automatically generate semantic metadata
  • Auto-classify content from diverse repositories
  • Easily develop, deploy, and manage taxonomies

The framework is used to deliver intelligent metadata-enabled solutions that improve search, records management, enterprise metadata management, text analytics, migration, enterprise social networking, and data security.

Why Use the Concept Searching Technology Platform?

  • You want the full product and feature set, including conceptSearch, conceptSQL, APIs, custom controls, and demonstration source code
  • You need a powerful enterprise search solution that delivers highly precise results
  • You run external web sites, particularly those offering paid services to site visitors, where the platform is frequently used
  • You do not have a SharePoint environment but would like semantic metadata generation, auto-classification, and taxonomy management
  • You have a SharePoint environment but would like to use additional bundled products and features
  • You need to implement one or more intelligent metadata-enabled solutions, which may include:
    • Enterprise information governance
    • Content management
    • Content migration
    • Concept based searching
    • Sensitive information identification and protection
    • Automatic declaration of documents of record
    • Text analytics
    • Enterprise social networking

Why Use conceptSearch?

  • Enterprise concept based search engine
  • Compound terms are extracted when content is indexed from internal or external content sources, enabling the delivery of greater precision of relevant content at the top of search results
  • Relevance ranking displays extracts from the documents based on the query
  • Search refinement delivers to the end user highly correlated concepts that may be used to refine the search
  • Taxonomy browse capabilities are standard
  • Documents can be classified into one or more taxonomy nodes, enhancing the precision of documents returned
  • In addition to static summaries, Dynamic Summarization, a modified weighting system, can be applied that will identify in real-time short extracts that are most relevant to the user’s query
  • Related Topics will return results based on the conceptual meaning of the search terms used, using the ability to generate compound terms in a search.  For example, ‘triple’ is a single word term but ‘triple heart bypass’ is a compound term that provides a more granular meaning
  • Based on previous queries, or on extracts retrieved, end users can use the text to perform additional searches to retrieve more granular results
  • The product is based on an open architecture with all APIs based on XML and Web Services. Transparent access to system internals, including the statistical profile of terms, is standard
  • Highly scalable
  • High performance, with classification occurring in real time
  • Easily customized to achieve your organization’s objectives
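
The idea behind query-dependent summaries in the list above can be illustrated with a minimal Python sketch. It is not the conceptSearch weighting system, which is not described here; it assumes a plain term-overlap score, and the names tokenize and best_extract are hypothetical.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def best_extract(document, query):
    """Return the sentence that shares the most distinct terms with the query.

    Assumption: plain term overlap stands in for the real weighting system.
    """
    query_terms = set(tokenize(query))
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    if not sentences:
        return ""
    # max() keeps the earliest sentence when scores tie.
    return max(sentences, key=lambda s: len(query_terms & set(tokenize(s))))


doc = (
    "The patient was admitted on Monday. "
    "Surgeons performed a triple heart bypass the next day. "
    "Recovery was uneventful."
)
print(best_extract(doc, "heart bypass surgery"))
# prints: Surgeons performed a triple heart bypass the next day.
```

A static summary would show the same extract for every query. Here the extract changes with the query, which is the behavior the Dynamic Summarization bullet describes.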

What are the key outcomes?

The combination of the Smart Content Framework™, the Concept Searching Technology Platform, and the deployment of intelligent metadata-enabled solutions results in a comprehensive approach to metadata management in an internal or external environment. Our clients are using the technologies to:

  • Eliminate manual tagging
  • Deliver a concept based enterprise search engine
  • Facilitate records management
  • Detect and automatically secure unknown privacy exposures
  • Intelligently migrate content
  • Enhance eDiscovery, litigation support, and FOIA requests
  • Enable text analytics
  • Provide structure to enterprise social networking

Platform Matrix

This table provides an overview of all Concept Searching Platforms and the components for each platform.

Columns: (1) conceptClassifier for SharePoint 2013, (2) conceptClassifier for SharePoint 2010, (3) conceptClassifier for SharePoint 2007, (4) conceptClassifier for Office 365 Platform, (5) conceptClassifier Platform, (6) Concept Searching Technology Platform. Columns (1) to (3) make up the conceptClassifier for SharePoint Platform.

Core Components | (1) | (2) | (3) | (4) | (5) | (6)
Compound Term Processing Engine – licensed for concept extraction only | yes | yes | yes | yes | yes | Full search functionality included
conceptClassifier | yes | yes | yes | yes | yes | yes
conceptTaxonomyManager | yes | yes | yes | yes | yes | yes
conceptSearch | no | no | no | no | no | yes
SharePoint Feature Set | yes | yes | yes | yes | no | yes
SharePoint Connector | yes | yes | yes | yes | no | yes
APIs, custom controls, demonstration source code | no | no | no | no | yes | yes
conceptSQL | no | no | no | no | no | yes
Proprietary controls for SharePoint 2007 | no | no | yes | no | no | yes

Optional Components | (1) | (2) | (3) | (4) | (5) | (6)
conceptTaxonomyWorkflow | yes | yes | no | yes | yes | yes
conceptSearch | yes | yes | yes | yes | yes | Included in Base Product
conceptSQL | yes | yes | yes | yes | yes | Included in Base Product
Content Enrichment Service for SharePoint 2013 | yes | no | no | no | no | yes
conceptClassifier for OneDrive for Business | yes | no | no | yes | no | no
FAST Pipeline Stage for SharePoint 2010 | yes | yes | no | no | no | yes
Additional Classification Servers | yes | yes | yes | yes | yes | yes
Additional Front End Web Servers | yes | yes | yes | N/A | yes | yes

Features

The following table lists the functionality and features available in the Concept Searching Technology Platform. All technology platforms share the same core functionality. The most significant difference in the Concept Searching Technology Platform is the inclusion of our enterprise search engine, conceptSearch, and conceptSQL. Although the SharePoint Feature Set is available with the Concept Searching Technology Platform, the platform is frequently used in non-SharePoint environments, or by those who prefer our enterprise search engine for scalability, performance, and precision searching.

Feature/Function Concept Searching Technology Platform
Taxonomy Functionality
SOA Compliant yes
Industry Standards yes
Taxonomy Rollback yes
Multi-language yes
Text mining to identify candidate terms yes
Relationship Definition yes
Concept Mapping yes
Taxonomy Navigation yes
Import, combine, organize and harmonize taxonomy models yes
Polyhierarchy Support yes
Folksonomy Support yes
Distributed taxonomy management yes
Synonym support yes
Instant feedback on taxonomy changes, dynamic screen updating yes
Controlled Vocabularies yes
Ability to automatically suggest classification clues for taxonomies yes
Security yes
Microsoft Integration yes
Native integration with the SharePoint Term Store with no need to import/export yes (index and classify)
Managed by Subject Matter Experts yes
Highly Scalable yes
Rapidly Deployed yes
Classification
Manual Tagging yes
Automatic tagging of content from diverse repositories, web sites, content management systems yes
Classification as content is created or ingested yes
Classification to one or more nodes in one or more taxonomies yes
Classifies all unstructured and semi-structured content from libraries, blogs, wikis, and threads lists yes
Rich Web indexing support yes
Compound Term Indexing Engine embedded within conceptClassifier and conceptTaxonomyManager yes
Automatic generation of compound terms yes
Search
Keyword Search yes
Concept based searching yes
Property/Entity Extraction yes
Relevance Ranking yes
Similar Results yes
Dynamic Summarization yes
Dynamic Clustering yes
Taxonomy and faceted navigation yes
Text preview capability yes
Related topics by taxonomy node yes
Single integrated view of content yes
Vocabulary normalization yes
Intelligent Metadata Enabled Solutions
Records Identification yes
Data privacy and information security yes
Taxonomy workflow yes
Intelligent migration yes

Technology Specifications

The Concept Searching Technology Platform is based on an open architecture with all APIs based on XML and Web Services. Transparent access to system internals, including the statistical profile of terms, is standard. The base platform is installed as a feature set and comprises the following components.

Base Components in the Concept Searching Technology Framework

Technology

  • conceptSearch
    conceptSearch is an enterprise search engine based on a unique, language independent technology. Unlike other enterprise search engines, which require significant customization with marginal results, conceptSearch is delivered as an out-of-the-box application that demonstrates a simple search interface and indexing facilities for internal content, web sites, file systems and XML documents. Application developers experience a minimal learning curve and the organization achieves a rapid return on investment.
  • conceptClassifier
    conceptClassifier is a leading-edge rules based categorization module providing our clients complete control of rules-based descriptors unique to their organizations. conceptClassifier delivers a categorization descriptor table, which is easy to implement and maintain, through which all rules and terms can be defined and managed. This approach eliminates the error-prone results of ‘training’ algorithms typically found in other text retrieval solutions and enables human intervention to effectively tune classification results.
  • conceptTaxonomyManager
    This is an advanced, enterprise-class, easy-to-use taxonomy development and management tool, still unique in the industry. Developed on the premise that a taxonomy solution should be used by business professionals, and not the IT team or librarians, the end result is a highly interactive and powerful tool that has been proven to reduce taxonomy development time by up to 80% (client source data).
  • conceptSQL
    This product provides the ability to define a document structure based on information held in Microsoft SQL Server. A document can include any number of text and metadata fields and can span multiple tables if required. conceptSQL supports SQL Server 2005, 2008, and 2012. A powerful but easy-to-use configuration tool is supplied, eliminating the need for any programming. Templates are provided for out-of-the-box support for Documentum, Hummingbird, and Worksite/Interwoven DMS.
  • SharePoint Feature Set
    The SharePoint Feature Set includes the following components: farm solution with feature sets, Term Store integration, taxonomy tree control for editing, refinement panel integration, event handlers for notification of changes, management of the classification status column, web service advanced functionality (implement system update or preserve GUIDs), and automated site column creation.
  • Proprietary Controls for SharePoint 2007
  • APIs, Custom Controls, Demonstration Source Code
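
The rules-based approach described for conceptClassifier above, a descriptor table of terms managed by people rather than a trained model, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the product's descriptor format: the table layout, weights, threshold, and the name classify are all hypothetical.

```python
# Hypothetical descriptor table: taxonomy node -> list of (term, weight) rules.
DESCRIPTORS = {
    "Cardiology": [("triple heart bypass", 5), ("heart", 1)],
    "Records/Contracts": [("service agreement", 4), ("contract", 2)],
}


def classify(text, descriptors=DESCRIPTORS, threshold=3):
    """Return {node: score} for every node whose matched rule weights reach the threshold.

    Simplification: terms are matched as plain substrings of the lowercased text.
    """
    text = text.lower()
    scores = {}
    for node, rules in descriptors.items():
        score = sum(weight for term, weight in rules if term in text)
        if score >= threshold:
            scores[node] = score
    return scores


print(classify("The contract includes a service agreement."))
# prints: {'Records/Contracts': 6}
```

Because every rule is visible in the table, a subject matter expert can tune a result by editing a term or a weight, which is the kind of human intervention the description refers to. A document can also score against several nodes at once.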

Typical Recommended Base Configuration

  • Windows 2008/2012 Server with IIS
  • Modern 64-bit CPUs (ideally at least 8 cores)
  • 8GB RAM (recommended)
  • .NET Framework v4.0 or v4.5
  • Access to SQL Server (2005 or later) or Oracle (10g R2 or later)
  • IIS 6 with MetaBase enabled
  • Microsoft Office 2010 64-bit iFilter pack
  • Adobe or Foxit PDF 64-bit iFilter
  • High speed disk, Raid Array or SAN

One Farm

  • 2 Front End Web Servers per License
  • 1 conceptClassifier Server

Additional Front End Web Servers (Optional)
Provides scalability to accommodate size of end user community.

Additional Classification Servers (Optional)
Scales classification to increase throughput, especially when classification on the fly is an important requirement.

Supports

  • Open Source
  • SharePoint 2007, 2010, 2013
  • SQL Server 2005, 2008, 2012
  • Oracle 10g R2 or later
  • Non SharePoint environments
  • On-premise, cloud, or hybrid environments

 

Architecture

Optional Products

conceptTaxonomyWorkflow

conceptTaxonomyWorkflow can perform an action on a document following a classification decision when certain criteria are met. The workflow source type works in SharePoint, as well as with all document types, including FILE and HTTP. This product is available in both SharePoint and non-SharePoint environments and has a plugin architecture that enables clients and integration partners to easily build plugins for both content sources and destinations.

FAST Pipeline Stage for SharePoint 2010

This product can be used with FAST Search to classify any document that is being indexed by this search engine. The product integrates with FAST via the pipeline and is designed to allow custom processing of documents as they are indexed. The resulting classifications are stored directly in the FAST index. This product does not build a conceptSearch index, so its disk usage is zero during classification operations.

Content Enrichment Service for SharePoint 2013

This product can be used with Microsoft Search for SharePoint 2013 to classify any document that is being indexed by this search engine. The product integrates with Microsoft Search via the web service callout, which is designed to allow custom processing of documents as they are indexed. The resulting classifications are stored directly in the Microsoft index and will be available in the SharePoint 2013 search refinement panel. This product does not build a conceptSearch index, so its disk usage is zero during classification operations.

conceptClassifier for OneDrive for Business

conceptClassifier for OneDrive for Business is an optional component that enables the full feature set of conceptClassifier for SharePoint and conceptClassifier for Office 365. For systems management, administrators can now apply policy across SharePoint, SharePoint Online/Office 365, and across OneDrive for Business. For the business users, OneDrive for Business provides them the ability to retrieve documents regardless of the device they are using (currently with some exceptions) and from any location.

conceptClassifier for OneDrive for Business provides governance, compliance, records management, and enterprise policy application, as well as collaboration and productivity enhancements. From an administrator perspective, the product provides management of all content regardless of where it resides.

Additional Front End Web Servers

Provides scalability to accommodate size of end user community.

Additional Classification Servers

Scales classification to increase throughput, especially when classification on the fly is an important requirement.

conceptSearch

“Due to Concept Searching’s straightforward design, integration into our environment was very easy and we were able to provide much more than a basic search results list; we were actually able to categorize the search results by content type – documents, forums, communities, people, action items and wikis.”

Douglas Book, President and CEO, Triune Group

Knowledge workers need to identify content in the context of what they are seeking. The fundamental problem with most enterprise search solutions, and all statistical search solutions, is that they are based on an index of single words. Yet most queries are expressed in short patterns of words, and not single words in isolation, which are highly ambiguous.

conceptSearch is an enterprise search solution that delivers precision search results. Platform agnostic, it is a highly scalable, high-performance search solution for organizations that require the ability to search on concepts and identify related concepts in query results. It is frequently used in public facing web sites, where high performance, scalability, and precision searching are required.

Metadata Matters

The primary issue with search engines is the inability to access meaningful metadata. For most organizations, metadata tagging is either system generated or left to the end user, which means it is often ambiguous, subjective, absent, or wrong. A comprehensive approach requires more than syntactic metadata, and expecting end users to add rich metadata is haphazard and subjective at best. With our underlying compound term processing technology, conceptSearch can automatically identify concepts in context and retrieve results that contain the keywords or phrases that were searched for and, in addition, retrieve information with similar concepts.

How Does It Work?

“With the integration of the Concept Searching intelligent search capability, Triune Group was able to provide us with a robust and scalable collaboration tool that delivers not only powerful advanced searching capabilities, but also a controlled and secure environment.”

Brian Follen, NASA Safety Center (NSC) KnowledgeNow Program Manager

 

conceptSearch is not restricted to keyword identification, and compound term metadata can be automatically generated either when the content is created or when ingested. The generation of metadata based on concepts extracts compound terms and keywords from a document, or a corpus of documents, that are highly correlated to a particular concept. By identifying the most significant patterns in any text, these compound terms can then be used to generate non-subjective metadata based on an understanding of conceptual meaning.

conceptSearch can isolate the key meaning that is normally expressed in proper nouns, noun phrases, and verb phrases. Linguistic products can do this, but their performance is highly variable depending upon the vocabulary and language in use. A statistically based, language-independent concept search can accept queries in natural language, with the user typing words, phrases, or whole sentences. The system then analyzes the natural language query to extract the keywords and phrases, identify the main concepts, and retrieve content that is highly relevant.

The core compound term processing technology can address many challenges facing large enterprises and provide many benefits including:

  • Identification of concepts within a large corpus of information
  • Removal of ambiguity in search
  • Elimination of inconsistent meta-tagging
  • Simpler taxonomy development and ongoing maintenance
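
To make the idea of statistically identified compound terms concrete, here is a minimal Python sketch. It is not Concept Searching's algorithm, which is not described here. It assumes a simple stand-in technique: count two-word and three-word sequences that recur across a corpus and that do not begin or end with a common function word. The names STOPWORDS and compound_terms are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a product word list).
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "was", "is", "for", "on", "after"}


def compound_terms(texts, max_len=3, min_count=2):
    """Return {term: count} for multi-word terms seen at least min_count times."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        for n in range(2, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                # Skip candidates that start or end with a function word.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return {term: count for term, count in counts.items() if count >= min_count}


corpus = [
    "The triple heart bypass was scheduled for Monday.",
    "Recovery after a triple heart bypass takes weeks.",
]
print(compound_terms(corpus))
# prints: {'triple heart': 2, 'heart bypass': 2, 'triple heart bypass': 2}
```

No dictionary or grammar is involved, only counts, which is why a statistical approach of this kind can be language independent. A production engine would need far better scoring than raw frequency.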

Precision and Recall – Why It’s Important

“Our previous system restricted our access to the information by a factor of at least 50%. Something that would have taken weeks is now taking just a few days. Furthermore, the intelligence in the search has meant that sometimes the database will link papers that we wouldn’t have linked in a million years. I am confident now that we don’t skip or ignore important information.”

T Longland CVO OBE, Brigadier (Retd) DCDC

Precision and recall are the two key performance measures for information retrieval. Precision is the fraction of retrieved items that are relevant to the query. Recall is the fraction of all relevant items that are retrieved. Yet most information retrieval technologies are less than 22% accurate for both precision and recall. The ideal goal is to have them balanced. Compound term processing has the ability to increase precision with no loss of recall.

High recall with low precision is easy to achieve and is the typical approach of Internet and enterprise search engines: many documents are retrieved, but they are not necessarily relevant. Tuning for high precision alone returns only documents that are relevant, but in that case some highly relevant documents will not be included in the results.

The ideal result is to have both of these measures balanced. conceptSearch was designed to achieve that balance.
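
The trade-off between the two measures can be checked with a short Python sketch. The document identifiers and result lists below are invented for illustration, and precision_recall is a hypothetical helper name.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: four documents are truly relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]  # returns everything plausible
narrow = ["d1", "d2"]                                      # returns only sure matches

print(precision_recall(broad, relevant))   # prints: (0.5, 1.0)
print(precision_recall(narrow, relevant))  # prints: (1.0, 0.5)
```

The broad result set finds every relevant document but half of what it returns is noise. The narrow set returns no noise but misses half of the relevant documents.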

 

 

 

Application Requirements

All the functions you need to start gaining control over your unstructured content are included in the base Concept Searching Technology Platform. Our clients have discovered the unique and varied uses of the technology to solve a wide variety of content management challenges. Below is a list of the base platform and optional products that are needed to solve your particular business process challenge and leverage your technology investment.

Why wait? Improve your business processes and positively impact your bottom line starting today.

Conceptual Search Platform

conceptSearch is Concept Searching’s enterprise class search product and a key component in the Concept Searching Technology Platform. It is a unique, language independent technology and is the first content retrieval solution to integrate relevance ranking based on the Bayesian Inference Probabilistic Model and concept identification based on Shannon’s Information Theory. Unlike other enterprise search engines that require significant customization with marginal results, conceptSearch is delivered with an out-of-the-box application that demonstrates a simple search interface and indexing facilities for internal content, web sites, file systems, and XML documents. Application developers experience a minimal learning curve and the organization can look forward to a rapid return on investment.

Because of the innovative technology, conceptSearch delivers both high precision and high recall. Precision and recall are the two key performance measures for information retrieval. Precision is the retrieval of only those items that are relevant to the query. Recall is the retrieval of all items that are relevant to the query. Yet most information retrieval technologies are less than 22% accurate for both precision and recall. The ideal goal is to have these measures balanced. Compound term processing has the ability to increase precision with no loss of recall.

conceptSearch is particularly important for organizations that need sophisticated search and retrieval solutions. By weighting multi-word phrases, instead of single words, or words in proximity, the retrieval experience is more accurate and relevant. The ability for the search engine to identify concepts enables organizations to improve the search experience for a variety of business requirements.

Required Products: Concept Searching Technology Platform

Search Engine Integration

Functionality is provided via the Concept Searching Technology platform to integrate with any search engine. The Concept Searching Technology platform can perform on the fly classification with search engines calling the classify API. Search engine support includes SharePoint, the former FAST products, Office 365 Search, Solr, Google Search Appliance, Autonomy, and IBM Vivisimo. If the FAST Pipeline Stage is required, this is sold as a separate product.

Required Products: Concept Searching Technology Platform. FAST Search for SharePoint 2010 also requires the FAST Pipeline Stage, and SharePoint Search in SharePoint 2013 also requires the Content Enrichment Service.

Intelligent Document Classification

Functionality is provided via the Concept Searching Technology platform, to classify documents based upon concepts and multi-word terms that form a concept. Automatic and/or manual classification is included. Knowledge workers with the appropriate security rights can also classify content in real time. Content can be classified from diverse repositories including SharePoint, Office 365, file shares, Exchange public folders, and websites. All content can be classified on the fly and classified to one or more taxonomies.

Required Product: Concept Searching Technology Platform

Taxonomy Management

conceptTaxonomyManager is simple to use, has an intuitive user interface designed for Subject Matter Experts, and does not require IT or Information Scientist expertise to build, maintain, and validate taxonomies for the enterprise. conceptTaxonomyManager has the capability to automatically group unstructured content together based on an understanding of the concepts and ideas that share mutual attributes while separating dissimilar concepts.

This approach is instrumental in delivering relevant information via the taxonomy structure as well as using the semantic metadata in enterprise search to reduce time spent finding information, increase relevancy and accuracy of the search results, and enable the re-use and re-purposing of content. Using one or more taxonomies, unstructured content can be leveraged to improve any application that uses metadata. This flexibility extends to records management, information security, migration, text analytics, and collaboration.

Required Product: Concept Searching Technology Platform

Intelligent Migration

Using the Concept Searching Technology Platform, an intelligent approach to migration can be achieved. As content is migrated, it is analyzed for organizationally defined descriptors and vocabularies. The content is then automatically classified to taxonomies or, in the SharePoint environment, to the SharePoint Term Store, and organizationally defined workflows are automatically applied to route the content to the appropriate repository for review and disposition.

Required Products:  Concept Searching Technology Platform, conceptTaxonomyWorkflow, conceptSQL if migrating from other SQL databases (core component)
Optional Products: SharePoint Feature Set

Intelligent Records Management

The ability to intelligently identify, tag, and route documents of record to either a staging library and/or a records management solution is a key component in driving and managing an effective information governance strategy. Taxonomy management, automatic declaration of documents of record, auto-classification, and semantic metadata generation are provided via the Concept Searching Technology platform and conceptTaxonomyWorkflow.

Required Products: Concept Searching Technology Platform, conceptTaxonomyWorkflow

Data Privacy

The platform is fully customizable to identify unique or industry standard descriptors. Content is automatically meta-tagged and classified to the appropriate node(s) in the taxonomy based upon the presence of the descriptors, phrases, or keywords within the content. Once tagged and classified, the content can be managed in accordance with regulatory or government guidelines. The identification of potential information security exposures includes the proactive identification and protection of unknown privacy exposures before they occur, as well as real-time monitoring of organizationally defined vocabulary and descriptors in content as it is created or ingested. Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology Platform and conceptTaxonomyWorkflow.

Required Products: Concept Searching Technology Platform, conceptTaxonomyWorkflow

eDiscovery, Litigation Support, and FOIA Requests

Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology Platform. This is highly useful when relevance, identification of related concepts, and vocabulary normalization are required to reduce time and improve the quality of search results.

Required Products: Concept Searching Technology platform, conceptTaxonomyWorkflow

Text Analytics

Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform. A third party business intelligence or reporting tool is required to view the data in the desired format. This is useful to cleanse the data sources before using text analytics to remove content noise, irrelevant content, and identify any unknown privacy exposures or records that were never processed.

Required Product: Concept Searching Technology Platform

Social Networking

Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform. Integration with social networking tools can be accomplished if the tools are available in .NET or via SharePoint functionality. This is useful to provide structure to social networking applications and provide significantly more granularity in relevant information being retrieved.

Required Product: Concept Searching Technology Platform

Business Process Workflow

conceptTaxonomyWorkflow serves as a strategic tool managing migration activities and content type application across multiple SharePoint and non-SharePoint farms and is platform agnostic. This add-on component delivers value specifically in migration, data privacy, and records management, or in any application or business process that requires workflow capabilities.

conceptTaxonomyWorkflow is required to apply an action to a document, optionally apply a content type automatically, and route the document to the appropriate repository for disposition.

 

Concept Searching Technology Platform Benefits

With the exponential increase in unstructured information, enterprises are seeking new ways to improve not only the search and retrieval process but also to identify tools to manage, capitalize on, and leverage their information assets to improve organizational performance. Moving beyond keyword metadata and traditional taxonomy approaches, the use of compound term processing, or identifying ‘concepts in context’, effectively addresses the issue of managing unstructured content and enables organizations to more effectively find, organize, and manage their information capital.

  • Based on industry unique compound term processing
  • Platform Agnostic
  • Ability to auto-classify content from diverse internal and external repositories
  • Powerful and still industry unique enterprise search engine, conceptSearch
  • Ability to generate semantic metadata and surface it to any search engine to improve search results
  • Ability to automatically tag content with vocabulary or retention codes for records management
  • Ability to provide intelligent migration capabilities based on the semantic metadata within content, identify previously undeclared documents of record, unidentified privacy exposures, or information that should be archived or deleted
  • Ability to cleanse data to be used in text analytics by identifying relevant, accurate information and identifying previously undeclared records or privacy data that should be exempt from the text analytics process
  • Ability to provide granular and structured identification of people, content recommendations, and organizational knowledge assets

 

Leveraging Your Technology Investment

Concept Searching’s technologies still have not been replicated in the marketplace. The technologies are unique, language independent, and the first content retrieval solution to integrate relevance ranking based on the Bayesian Inference Probabilistic Model and concept identification based on Shannon’s Information Theory. The key features include:

  • SOA compliant and delivered as web parts
  • Support of any platform such as Open Source
  • API based on Web Services and all information is exchanged in XML
  • Taxonomy formats are based on Web Ontology Language (OWL)
  • Since the server is stateless, it also works with all failover and load balancing hardware and software
  • Reduce IT Staff requirements to support diverse applications
  • Reduce costs associated with the purchase of multiple, stand-alone applications
  • Deploy once, utilize multiple times
  • Eliminate unproductive and manual end user tagging and the support required by business units and IT
  • Reduce hardware expansion costs due to scalability and performance features
  • Provides database support for SQL Server or Oracle
  • SharePoint Feature Set
  • Supports Office 365
  • Deployable as an on-premise, cloud, or hybrid solution

Leveraging Your Business Investment

The real value of your investment includes both technology and the demonstrable ROI that can be generated from improving business processes. The Concept Searching Technology platform has been deployed by clients to solve individual or multiple challenges including:

  • Provides a powerful, scalable, concept based search engine – unique in the industry
  • Enables concept based searching regardless of search engine
  • Reduces organizational costs associated with data exposures, remediation, litigation, fines and sanctions
  • Eliminates manual metadata tagging and the human inconsistencies that prevent accurate metadata generation
  • Prevents the portability and electronic transmission of secured assets
  • Assists in the migration of content by identifying records as well as content that should have been archived, contains sensitive information, or should be deleted
  • Protects record integrity throughout the individual document lifecycle
  • Creates virtual centralization through the ability to link disparate on-premise and off-premise content repositories
  • Ensures compliance with industry and government mandates, enabling rapid implementation to address regulatory changes

 

Concept Searching has a current Enterprise Authority to Operate (ATO) from the US Air Force and a current Enterprise Certificate of Networthiness (CoN) from the US Army, and has been deployed on the SIPR, NIPR, and DISA networks.

Technology and Business Differentiators


Compound Term Processing

Concept Searching’s industry-unique compound term processing technology delivers outcomes that are not achieved by any other classification engine. Compound term processing means that Concept Searching’s statistical engine can understand, out of the box, the incremental value of keywords, multi-word fragments, and compound terms. As a result, it can identify concepts within an organization’s own information repositories that are highly correlated to particular topics. Expressed as keywords, multi-word fragments, and compound terms, these concepts become automatically generated intelligent metadata that is unique to the organization. Any application that requires metadata can use these compound terms to produce highly accurate outcomes, because the ambiguity inherent in single words is no longer an issue.

A search for ‘triple heart bypass’ will locate documents about this topic even if this precise phrase is not contained in any document. A concept search using compound term processing can extract the key concepts, in this case ‘triple heart bypass’, and use them to retrieve relevant documents containing concepts such as ‘heart surgery’, ‘coronary artery bypass’, or ‘open heart surgery’.
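
As a rough illustration of how compound terms can be found statistically, the sketch below scores adjacent word pairs by pointwise mutual information (PMI), an information-theoretic measure of how much more often two words occur together than chance would predict. This is a stand-in technique chosen for illustration, not Concept Searching’s actual algorithm. The function names, the thresholds, and the sample sentences are all assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical sample corpus, invented for illustration only.
DOCS = [
    "The patient recovered after heart surgery.",
    "Open heart surgery carries risk; heart surgery recovery takes weeks.",
    "The surgery schedule and the heart monitor were updated.",
]


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def compound_terms(documents, min_count=2, min_pmi=1.0):
    """Return adjacent word pairs whose PMI suggests they act as one term."""
    unigrams, bigrams = Counter(), Counter()
    for doc in documents:
        tokens = tokenize(doc)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    total_words = sum(unigrams.values())
    total_pairs = sum(bigrams.values())
    scored = {}
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue  # too rare to judge
        p_pair = count / total_pairs
        p_first = unigrams[first] / total_words
        p_second = unigrams[second] / total_words
        pmi = math.log2(p_pair / (p_first * p_second))
        if pmi >= min_pmi:
            scored[f"{first} {second}"] = round(pmi, 2)
    return scored
```

On the three sample sentences, `compound_terms(DOCS)` keeps only ‘heart surgery’: it is the one pair that recurs, and its words appear together far more often than their individual frequencies would suggest. A single word such as ‘heart’ is ambiguous on its own, which is the problem the paragraph above describes.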

Industry Unique Taxonomy Management Features

conceptTaxonomyManager remains unique in the industry for features that let you change the taxonomy rapidly and easily as organizational needs and requirements change. This is important because a taxonomy must remain fluid rather than static and must be managed in a way that facilitates change. The easy-to-use taxonomy and automatic classification tools create the framework to classify content, based on concepts, to one or more nodes in one or more taxonomies. The conceptTaxonomyManager component is included in the base product on all platforms.

Ease of Use
conceptTaxonomyManager is a simple yet powerful tool with an intuitive user interface. It is designed so that Subject Matter Experts (SMEs) can build, maintain, and validate taxonomies for the enterprise without the need for IT, information scientists, or specific application skills. This feature has been shown to reduce taxonomy development time by up to 80% (client source data).

Automatic Clue Suggestion
Taxonomy nodes can be generated automatically from the compound terms found in the document corpus, eliminating complex Boolean rules and the need for training sets. These terms are called Clues. The Subject Matter Expert (SME) has full control of the terms to be used as well as the weighting of each term based on its relevancy. This enables a much more robust taxonomy, because the terms are suggested from the organization’s own content and can offer the SME new terms from relevant documents that may not otherwise have been identified. Each Clue can be assigned a score or weight, either positive or negative, to improve the classification. Clues can also be assigned a Type. Types include standard, case-sensitive, metadata, phonetic, and RegEx (Regular Expression).
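
The idea of weighted, typed Clues can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the product’s scoring method: the `Clue` class, the additive scoring rule, the threshold, and the sample clues are all hypothetical, and only three of the Types listed above (standard, case-sensitive, and RegEx) are modeled.

```python
import re
from dataclasses import dataclass


@dataclass
class Clue:
    text: str
    weight: float           # positive supports the node, negative counts against it
    kind: str = "standard"  # "standard", "case-sensitive", or "regex"

    def matches(self, document):
        if self.kind == "standard":
            return self.text.lower() in document.lower()
        if self.kind == "case-sensitive":
            return self.text in document
        if self.kind == "regex":
            return re.search(self.text, document) is not None
        raise ValueError(f"unsupported clue type: {self.kind}")


def node_score(document, clues):
    """Assumed scoring rule: sum the weights of every clue that matches."""
    return sum(clue.weight for clue in clues if clue.matches(document))


def classify(document, clues, threshold):
    """A document is classified to the node when its score meets the threshold."""
    return node_score(document, clues) >= threshold


# Hypothetical clues for a "Heart Surgery" node.
HEART_SURGERY_CLUES = [
    Clue("coronary artery bypass", 60),
    Clue("heart surgery", 50),
    Clue("CABG", 40, "case-sensitive"),
    Clue(r"\bveterinar(y|ian)\b", -80, "regex"),  # negative clue
]
```

With these clues and a threshold of 50, a document mentioning ‘coronary artery bypass (CABG)’ and ‘heart surgery’ is classified to the node, while a veterinary guide that also mentions ‘heart surgery’ is pushed below the threshold by the negative clue.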

Document Movement Feedback
Automatic document movement feedback enables the SME to see the cause and effect of changing the clue weightings for a node in the taxonomy. The user can also search within the refined node and bring back documents from the whole corpus that are now classified against the node. The system indicates whether the change has increased or reduced each score, and identifies both the documents that will no longer be classified and the new documents that will be classified.
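
The feedback described above amounts to comparing classification results before and after a change. The Python sketch below shows one way to compute that comparison from two sets of per-document scores. The function name, the report keys, the threshold, and the sample file names are hypothetical, and how the product computes or presents this information is not specified here.

```python
def movement_feedback(before, after, threshold):
    """Compare per-document node scores before and after a clue weight change."""
    classified_before = {doc for doc, score in before.items() if score >= threshold}
    classified_after = {doc for doc, score in after.items() if score >= threshold}
    all_docs = before.keys() | after.keys()
    return {
        # Documents that cross the threshold in either direction.
        "newly_classified": sorted(classified_after - classified_before),
        "no_longer_classified": sorted(classified_before - classified_after),
        # Documents whose score moved, whether or not they crossed the threshold.
        "score_increased": sorted(
            d for d in all_docs if after.get(d, 0) > before.get(d, 0)
        ),
        "score_reduced": sorted(
            d for d in all_docs if after.get(d, 0) < before.get(d, 0)
        ),
    }


# Hypothetical scores for one node, before and after reweighting a clue.
BEFORE = {"a.docx": 60, "b.docx": 45, "c.docx": 70}
AFTER = {"a.docx": 60, "b.docx": 55, "c.docx": 40}
```

With a threshold of 50, `movement_feedback(BEFORE, AFTER, 50)` reports `b.docx` as newly classified and `c.docx` as no longer classified, while `a.docx` is unaffected.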

Distributed Taxonomy Management
This feature is a requirement for organizations that have many taxonomy operators, extremely large collections of documents, and taxonomy management as a critical business process. It can be implemented on any number of servers, and several taxonomy managers can be assigned to a server to ensure the level of throughput needed. Real-time locking mechanisms make nodes of the taxonomy inaccessible to other taxonomy managers while a node is being edited. Taxonomy managers can see when a node is locked, who has locked it, and when it becomes available. The Distributed Taxonomy Management feature is totally transparent to the end user, and all locking and unlocking of nodes by the taxonomy managers is coordinated by the central server.
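
The locking behavior described above can be modeled as a central registry that grants each node to one editor at a time and reports who holds it. The Python sketch below is an in-process illustration only: the `NodeLockRegistry` class and its methods are hypothetical, and a real multi-server deployment would need a network-accessible coordinator rather than a single shared object.

```python
import threading


class NodeLockRegistry:
    """Central registry that grants each taxonomy node to one editor at a time."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the owner table itself
        self._owners = {}               # node id -> user currently editing it

    def acquire(self, node_id, user):
        """Return True if the user now holds the node, False if someone else does."""
        with self._guard:
            holder = self._owners.setdefault(node_id, user)
            return holder == user

    def release(self, node_id, user):
        """Unlock the node, but only for the user who locked it."""
        with self._guard:
            if self._owners.get(node_id) == user:
                del self._owners[node_id]
                return True
            return False

    def locked_by(self, node_id):
        """Return the current holder, or None if the node is available."""
        with self._guard:
            return self._owners.get(node_id)
```

A second taxonomy manager who calls `acquire` on a held node gets `False` and can use `locked_by` to see who is editing it. Once the holder calls `release`, the node becomes available again.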

Security and Rollback
The product provides a full security model that enables nodes, branches, and complete taxonomies to be locked down to particular users or groups of users. It also supports rollback to the previous state.

conceptTaxonomyWorkflow

conceptTaxonomyWorkflow is a powerful add-on product that automates manual business processes. It serves as a strategic tool for managing migration activities and content type application across multiple SharePoint and non-SharePoint farms, and it is platform agnostic. This add-on component delivers value specifically in migration, data privacy, records management, or any application or business process that requires workflow capabilities. It is required in order to apply an action to a document and, optionally, to automatically apply a content type and route the document to the appropriate repository for disposition.

Intelligent Metadata Enabled Solutions

Concept Searching is the only available solution that addresses the challenges in managing unstructured and semi-structured data in SharePoint and non-SharePoint environments.

Our intelligent metadata enabled solutions address the following challenges with one set of technologies, leverage an enterprise’s investment in SharePoint, and reduce the resources needed to maintain and manage the solution.


Enterprise Search enables concept based searching by providing the search engine index with the compound terms and semantic metadata.

Records Identification is improved by eliminating end user tagging, automatically declaring records, and applying taxonomy workflow capabilities.

Data Privacy is protected by identifying confidential data, as defined by the organization, in real time and routing it to a secure repository for disposition.

Intelligent Migration is accomplished through the identification and auto-classification of the unstructured or semi-structured data to one or more taxonomies.

Text Analytics can be performed to extract organizationally defined descriptors and concepts from diverse repositories.

Social Networking and collaboration applications gain structure and the ability to retrieve highly relevant and granular information.

eDiscovery and FOIA time and costs are reduced through identification of highly detailed information and conceptually similar information that typically would not be found.