The manufacturing industry is not immune to content overload. In fact, maximizing productivity, leveraging existing content assets, and remaining highly competitive are objectives that must be achieved to improve the bottom line. With diverse groups, such as engineering, sales, marketing, and finance, all seeking pertinent information, and often in unique vocabularies, the need for immediate access to relevant and precise information has never been greater.
Information Transparency and Information Retrieval
Enterprise search is a core infrastructure component. The business impact of poor search results reaches far beyond the retrieval of information.
At the most basic level, enterprise search has become inadequate. Bells and whistles abound but the problem still exists. Search cannot find and deliver relevant information in the right context, at the right time. Search is a key component and critical enabler for improving business outcomes.
Concept Searching’s search capability is a unique, language-independent technology. It automatically generates compound term metadata, eliminating human idiosyncrasies. This metadata is generated in the form of concepts, phrases, topics, or subjects, using multi-word terms. The content is auto-classified to taxonomies, where it can be maintained and managed by business professionals.
From a search perspective, the rich multi-term metadata is fed to the search engine index, enabling concept-based searching. The information retrieval process is significantly improved.
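As a rough illustration of what multi-word term metadata looks like, the sketch below counts repeated two- and three-word phrases in a piece of text. It is a simple frequency heuristic, not Concept Searching's proprietary method; the function name `multiword_terms`, the stopword list, and the sample text are all assumptions made for the example.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "are", "on", "with", "by"}

def multiword_terms(text, max_len=3, min_count=2):
    """Return phrases of 2..max_len words that occur at least min_count times."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            # Skip candidates that start or end with a stopword.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return {term: c for term, c in counts.items() if c >= min_count}

# Hypothetical sample content.
sample = (
    "The power generation unit failed inspection. "
    "Replace the fuel filter before restarting the power generation unit. "
    "A clogged fuel filter reduces output."
)
terms = multiword_terms(sample)
```

Phrases such as "fuel filter" and "power generation unit" surface as candidate metadata, while single keywords and stopword-led fragments are left out. Terms like these are what a search index would receive in addition to the individual keywords.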
Indexing facilities are provided for internal content, websites, file systems and XML documents. APIs and natively integrated connectors are also available.
The SharePoint Term Store can also be populated while maintaining the taxonomy. This feature works in real time, and changes made in either the Term Store or the taxonomies are synchronized in both directions. This lets organizations use the Term Store without investing the time and human resources needed to maintain it.
What Do You Do When You Have Poor Metadata or No Metadata, and Want to Improve Information Transparency and Retrieval?
This client is a world-renowned electrical engineering and electronics firm. In support of 8,000 business users in the US, it needed a way to improve search on its SharePoint intranet. After evaluating available solutions, it selected the conceptClassifier for SharePoint platform as the infrastructure to support its existing search solution.
What were the challenges?
- Little to no metadata
- Lack of auto-classification capabilities
- End user tagging, when done, was poor
- Search limited to keywords, and unusable content retrieved
The Concept Searching solution automatically generated metadata representing concepts and phrases, consisting of multi-word terms. This eliminated end user tagging, and solved the problem of poor or absent metadata. Content across the corpus was auto-classified, and the resulting metadata was consistent and meaningful.
Search results were significantly improved. Users were not required to enter metadata, improving their productivity. Search could be executed not only on keywords but also on multi-word terms. This retrieved more meaningful content, eliminating many irrelevant results.
What Do You Do with Content That Is No Longer Used? Getting Rid of It Is a Good Idea.
This company designs, manufactures, and distributes engines, filtration, and power generation products. It now uses SharePoint and Office 365, but its initial challenge was the migration to Office 365. It was aware of the pitfalls of migration. Its content store had grown to the point where managing content was no longer viable.
- Too much content to manage effectively, severely impacting search results
- Successful migration to Office 365
- Documents of record relating to security needed to be identified, tagged, and classified for compliance
The conceptClassifier platform and conceptTaxonomyWorkflow enabled the organization to achieve all its goals, specifically the initial step of content optimization.
Using content optimization, the company significantly reduced the size of the corpus of content, organized it for migration, and was able to easily manage the content through taxonomies. After the migration, it was also able to utilize concept-based searching, significantly improving performance.
Some refer to it as deletion. But it’s more than that. What about that one document that contains value? Don’t you want to keep it?
Content optimization is the process of removing information from active systems, through deletion or archiving, and identifying duplicates, near duplicates, undeclared records, and data privacy exposures. It eliminates obsolete and trivial content, which has been saved but is of no value.
Concept Searching recommends that content optimization be done on a quarterly basis. It is estimated that 69 percent of an organization’s content can, and should, be deleted.
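To make the idea of identifying duplicates and near duplicates concrete, here is a minimal sketch that uses a content hash for exact duplicates and Jaccard similarity over word shingles for near duplicates. These are standard techniques chosen for illustration, not a description of how the conceptClassifier platform works; the file names, sample text, and the 0.5 threshold are assumptions.

```python
import hashlib
import re
from itertools import combinations

def fingerprint(text):
    """SHA-256 of lowercased, whitespace-normalized text; equal values mean exact duplicates."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def shingles(text, k=3):
    """Set of overlapping k-word sequences from the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def find_duplicates(docs, threshold=0.5):
    """Compare every pair of documents; return (exact, near) lists of name pairs."""
    exact, near = [], []
    for (name_a, text_a), (name_b, text_b) in combinations(docs.items(), 2):
        if fingerprint(text_a) == fingerprint(text_b):
            exact.append((name_a, name_b))
        elif jaccard(shingles(text_a), shingles(text_b)) >= threshold:
            near.append((name_a, name_b))
    return exact, near

# Hypothetical documents.
docs = {
    "spec_v1.docx": "Torque the filter housing bolts to 25 Nm and inspect the seal for wear",
    "spec_copy.docx": "TORQUE the filter   housing bolts to 25 Nm and inspect the seal for wear",
    "spec_v2.docx": "Torque the filter housing bolts to 30 Nm and inspect the seal for wear",
    "invoice.docx": "Invoice for forty replacement filters shipped in March",
}
exact, near = find_duplicates(docs)
```

The pairwise comparison is fine for a small sample but grows quadratically with the number of documents. A corpus of millions of files would need an indexing scheme such as MinHash with locality-sensitive hashing, which is outside the scope of this sketch.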
Did you know?
- 70 percent of content on file shares is redundant, obsolete or trivial (ROT)
- 25 percent of content is duplicate
- 10 percent has no business value
- 90 percent of documents are never accessed after creation
- 65 percent of documents are accessed only once
And the risks are:
- PII, PHI, and PCI data breaches
- Uncontrolled intellectual property
- Unmanaged documents of record
- Unsecured confidential company information
Content optimization not only reduces storage but dramatically improves information retrieval, as it eliminates false positives and irrelevant information from retrieval results.
Migration projects typically have questionable success rates. You are migrating a vital portion of your business. Are you willing to accept failure?
Occurring after content optimization, intelligent migration identifies the content to be moved, and an administrator defines where it will be moved. Concept Searching technology generates concepts before the actual migration. Temporary taxonomies are used to refine the content by concept, or client, or topic, within specific taxonomies. This effectively groups like content, so when migrated it retains the likeness or similarity of related documents.
An intelligent approach to migration can be achieved. As content is migrated, it is analyzed for organizationally defined descriptors and vocabularies. These automatically classify the content to taxonomies, or optionally to the SharePoint Term Store, and automatically apply organizationally defined workflows that route the content to the appropriate repository for review and disposition.
This has an added benefit after migration: information retrieval improves, and the taxonomy hierarchy is available to end users as a navigational aid, offering like topics in the taxonomy for selection and exploration, so they can further refine their initial query.
Applying Information Governance Practices to Successfully Migrate Content to SharePoint Online.
This client provides services to a large cigarette and tobacco company. Those services include compliance, corporate affairs, finance, government affairs, human resources, information technology, regulatory affairs, and research and development.
The client services company wanted to accomplish the following:
- Thorough planning for migration from SharePoint on-premises to SharePoint Online
- Clean up the corpus of content and identify undeclared records, data privacy and confidential information exposures, and noncompliance instances
- Enrich remaining content with multi-term metadata
- Ensure the migrated corpus of content was organized and easily managed
Information governance best practices were applied to the migration of the company’s unstructured content. This approach enabled rapid document migration, as well as the ability to evaluate each document as it was migrated. The end result was a highly effective approach to cleansing irrelevant or unnecessary documents, and to identifying records that may not have been declared, or content that contained potential privacy and confidential information exposures.
With Potentially 161,000 End Users Creating Documents, Auto-classification Was a Necessity, Not an Option.
This leading global supplier of automotive parts needed to provide business tools to improve its enterprise-wide search and collaboration experience, through the tagging, classification, and reduction of discrete pieces of information that business users had to sort through daily.
The client had already performed content optimization before migration, and reduced the number of on-premises servers from 60 to four. The 20 million documents were all retained because of their value.
The goals were:
- Classify 20 million documents, on demand or as a scheduled activity, with no degradation of performance
- Classify content as it was created, ingested, or changed, in real time
- No search engine or application degradation
- Automatic application of multi-term metadata
The company selected Concept Searching’s technology to achieve all objectives. The key component was the conceptClassifier platform, the enterprise classification solution built on industry-standard open APIs. Through Concept Searching’s classification API and content enrichment service, content could be classified on the fly and injected into the SharePoint search index, delivering a quality end user search experience in a secure, collaborative environment.
Auto-classifying a base corpus of content consisting of 20 million documents, with no degradation in performance, and achieving significant improvements in search.
Concept Searching’s core technology underpinning its auto-classification capability is unique in the industry. Auto-classification clues are built by automatically generating multi-word concepts from a client’s own content. The classification engine then classifies the content to one or more nodes in taxonomies, inserting metadata into the managed metadata fields in SharePoint or, in the case of file share content, directly into the document properties.
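The idea of classifying content to one or more taxonomy nodes from multi-word clues can be sketched as follows. This is a simplified weighted-phrase scorer written for illustration only; the taxonomy, clue weights, threshold, and the `classify` function are hypothetical and do not represent the conceptClassifier engine or its API.

```python
import re

# Hypothetical taxonomy: node path -> multi-word clues and their weights.
TAXONOMY = {
    "Products/Filtration": {"fuel filter": 2.0, "filter housing": 1.5},
    "Products/Power Generation": {"power generation": 2.0, "generator set": 1.5},
    "Compliance/Records": {"retention schedule": 2.0, "document of record": 2.0},
}

def classify(text, taxonomy=TAXONOMY, threshold=2.0):
    """Score each node by its weighted clue occurrences; keep nodes at or above threshold."""
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {}
    for node, clues in taxonomy.items():
        score = sum(
            weight * len(re.findall(rf"\b{re.escape(clue)}\b", normalized))
            for clue, weight in clues.items()
        )
        if score >= threshold:
            scores[node] = score
    # Highest-scoring nodes first; a document may be assigned to several nodes.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

# Hypothetical document text.
doc = (
    "Replace the fuel filter on the generator set. "
    "The power generation log is a document of record."
)
tags = classify(doc)
```

In this sample the document is assigned to three nodes, with the power generation node ranked first because two of its clues match. The returned node paths stand in for the values that would be written to managed metadata fields or document properties.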
Eliminating complicated Boolean expressions, proximity, or scripting ensures the metadata is consistent and reflects an organization’s terminology, without the need to customize the output. Since the metadata is precise, the auto-classification engine identifies intelligent content in context that can be used in records management, identifying confidential data, migration, text analytics, content management, secure collaboration, compliance, and search.
Deployed in any environment, the technology is rapidly installed, with auto-classification available immediately. In addition to the visible impact of search improvement, end user tagging is eliminated, reducing both productivity drain and tagging errors, to safeguard information that should be protected, such as confidential information or records.
Data Privacy and Protection of Confidential Information
Protect confidential information, and reduce the 68 percent of breaches caused by your end users.
Maintaining security and limiting access to internal as well as external documents has become much more difficult with the rapid increase in security breaches from ransomware, malware, and internal negligence. The manufacturing industry is not immune to breaches, and must at least meet the requisite requirements to keep information secure and inaccessible to those without a need to know.
Included in taxonomy management components are standard descriptors, available in most security packages. However, confidential information exists that is unique to an organization.
Workflows that are easily defined by business professionals can be created, to automatically identify organizationally-defined confidential information, when content is ingested or created, and then route it to a secure repository, where download is prohibited. The appropriate administrator of the repository can evaluate the quarantined content, and handle its disposition.
When identified words appear during indexing, those documents are removed from access and await disposition. Workflows also operate in real time, immediately identifying when an unauthorized user is within a document where those terms are present. This ensures a robust redaction process, so content can be isolated immediately. This also takes place when users upload or create documents that have the potential to cause breaches, or that lack appropriate security.
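A workflow of this kind can be pictured as a set of rules, each pairing a descriptor with a destination repository. The sketch below is a minimal, hypothetical model of that routing step, not the conceptTaxonomyWorkflow product; the `Rule` class, the repository names, and the "Project Falcon" codename are invented for the example.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    name: str         # human-readable label for the descriptor
    pattern: str      # regular expression that detects it
    destination: str  # repository the content is routed to on a match

# Hypothetical rules: one standard descriptor, one organization-specific term.
RULES = [
    Rule("US Social Security number", r"\b\d{3}-\d{2}-\d{4}\b", "secure/privacy"),
    Rule("Project codename", r"\bproject\s+falcon\b", "secure/intellectual-property"),
]

def route(text, rules=RULES, default="standard"):
    """Return (destination, matched rule names); the first matching rule picks the destination."""
    matched = [r for r in rules if re.search(r.pattern, text, re.IGNORECASE)]
    destination = matched[0].destination if matched else default
    return destination, [r.name for r in matched]

privacy_hit = route("Employee SSN 123-45-6789 attached")
ip_hit = route("Meeting notes for Project Falcon tooling")
no_hit = route("Cafeteria lunch menu")
```

Running the check at upload or creation time, rather than in a periodic scan, is what lets content be quarantined before anyone else can open it. Pattern matching on its own will produce false positives, which is why the text above has an administrator review the quarantined content and decide its disposition.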
Protecting Intellectual Property. Think It Can’t Get into the Wrong Hands? Think Again.
With 68 percent of all data breaches caused internally, it is highly probable that confidential company information will be exposed, through negligence or deliberately.
With over 8,000 end users, this electrical engineering and electronics firm recognized that although it needed a way to protect data privacy and financials, its corporate intellectual property and confidential information also needed to be protected and accessed only by those with a need to know.
The client was seeking a solution that:
- Was flexible and able to identify any information the company considered confidential
- Operated in real time, not after the fact
- Protected documents containing vulnerabilities, by finding and moving them to a secure repository for disposition
- Prohibited download
- Contained workflow rules that were easy to define and use
Using the conceptClassifier platform and conceptTaxonomyWorkflow, business users and IT teams were able to define rules consisting of descriptors and text to identify company confidential information. This enabled the organization to address all its concerns about the protection of confidential information. It now feels confident that its confidential information is protected, and is inaccessible, except to those few with a need to know.
Without Highly Descriptive Metadata, Untagged and Undeclared Records Prove Costly in Noncompliance, and Result in Poor Information Governance.
The world’s largest manufacturer of gas appliances, and producer of 90 percent of all gas appliances sold in the US under the GE brand, could no longer accept compliance issues with undeclared records.
Relying on end users to accurately assign metadata to documents of record, or using system-assigned metadata, was resulting in too many noncompliance issues.
This client was facing the following noncompliance challenges:
- Documents were mistagged or not tagged at all
- Records managers were unproductive, spending their efforts continually monitoring undeclared records
- Compliance with global and international standards, managing regulatory risk
- Audit trails, due diligence, and evidence of effort of compliance
- Avoidance of fines and lawsuits
Records processing was rapidly automated using conceptTaxonomyWorkflow. The solution is easy to use and can be deployed in minutes by records managers.
The company was able to swiftly respond to new compliance mandates, and records managers were more productive, spending time on higher order tasks.
Automatically identify, tag, and classify documents of record, for compliance and information governance.
The biggest stumbling block in records management has been end user tagging. Organizations typically have their end users declare records, and often require reference to the file plan. Unfortunately, this process is subjective. Worse yet, records are sometimes not tagged at all, resulting in organizational noncompliance. Forcing end users to accurately assign records descriptors is not a sound approach and is bound to fail.
With complex file plans, and the fact that human tagging is haphazard at best, organizations often unwittingly fall into the trap of noncompliance. Compliance can be accomplished by developing a taxonomy that mirrors the file plan. The conceptClassifier platform is then used to automatically identify documents of record, and to assign descriptors and any associated text. In the SharePoint environment, a content type can be automatically changed to reflect both the type of record and the information within the content that caused it to be auto-classified under a specific content type.
Once declared, the record can be automatically sent to the records management application, or an administrator can be notified, to review.