Global Oil Field Services Organization

Case Study


Intelligent Migration as the Key to Content Management

“The ability to mass move content is relatively straightforward. However, from an information governance approach, mass moving content results in mismanaged and inaccessible information.”

Concept Searching
Customer Location:
United States
Energy and Utilities

This organization was implementing SharePoint, which required a migration. Instead of a forklift approach, which simply transfers risk from one system to another, the company’s objective, and challenge, was to expedite the migration, minimize cost overruns and scope creep, and improve the quality of content on SharePoint after the migration.


“The cleansing of the corpus of content, and easy-to-use migration tools, enabled a significant improvement in search, the identification and protection of privacy and confidential information, and the automation of records processing after the migration.”

This organization was able to achieve its objectives using intelligent migration. The process involved automatic semantic metadata generation and auto-classification to different taxonomies. The organization was able to quickly eliminate multiple revisions, obsolete content, and content of no value, significantly reducing the number of documents that needed to be migrated. During the process, previously unknown privacy and confidential data was identified, protected, and placed in a secure repository for disposition. Records that had never been declared were automatically assigned record descriptors and associated language, and routed to the records management application. The remaining documents were classified to taxonomies and migrated to SharePoint with appropriate metadata, where the semantic metadata could be used to resolve diverse application challenges.


  • Able to eliminate multiple revisions, duplicate content, content of no value, and obsolete content before migration
  • Identified records that had never been processed, initiating automatic tagging of descriptors and text
  • Identified and protected privacy and confidential information, preventing access and portability
  • Reduced server footprint
  • Significantly improved search after the migration, by providing semantic, concept-based searching as well as a taxonomy hierarchy to facilitate informational search techniques

Owned by General Electric, this organization is one of the world’s largest oil field services companies.

The company had made the decision to move to SharePoint, but needed to migrate information. As is the case with the majority of companies, staff were not trained in migration. The organization also wanted to avoid a lift and shift migration technique that does not clean up the corpus of content.

  • Migration, although a necessity, was secondary to the implementation of SharePoint, and needed to be done as quickly and efficiently as possible
  • Elimination of content of no value
  • Avoid the costs of specialized migration software and training of staff
  • Provide the ability to manage content in the SharePoint environment

Organizations are requiring more sophisticated techniques to ensure compliance objectives are met, and a typical loophole is the migration process. Simply moving documents from one repository to another is not enough, as content that was typically unmanaged will remain unmanaged, continuing to expose organizations to risk. Information cannot be successfully managed from inception to deletion without comprehensive metadata associated with the content, maintaining security and incorporating the multiple channels and origination points from which content was received.

Migration of unstructured content can be a laborious and time-consuming project. The challenge is that documents can exist in multiple places at the same time, different revisions of the same document exist, and some documents should be deleted and others should be archived. There may be records that were never declared, as well as confidential or privacy information, which will not be identified when migrated, exposing organizations to data breaches. The ability to mass move content is relatively straightforward. However, from an information governance approach, mass moving content results in the same problem posed by mismanaged content.
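As a rough illustration of the duplicate problem described above, the following minimal Python sketch groups documents that are byte-for-byte identical by hashing their content. It is not how the Concept Searching products work; the function name `find_exact_duplicates` and the sample paths are hypothetical, and the sketch only catches exact copies. Detecting different revisions of the same document would need content-aware comparison, which is outside what this example shows.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(documents):
    """Group document paths by the SHA-256 digest of their bytes.

    `documents` maps a path to its raw content (bytes). Returns a sorted
    list of groups, each a sorted list of paths with identical content.
    """
    by_digest = defaultdict(list)
    for path, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(path)
    # Only digests shared by more than one path indicate duplicates.
    return sorted(sorted(paths) for paths in by_digest.values() if len(paths) > 1)


# Hypothetical sample corpus, for illustration only.
sample = {
    "shareA/report.docx": b"Q3 drilling report",
    "shareB/copy of report.docx": b"Q3 drilling report",
    "shareA/notes.txt": b"meeting notes",
}
duplicate_groups = find_exact_duplicates(sample)
print(duplicate_groups)
```

In a real migration the content would be read from the source repository rather than held in memory, and one copy from each group would be kept while the rest are flagged for review.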

To migrate document collections effectively, the text content of each document needs to be searched to determine its value. This cannot be done manually, as the volume is too high, and human review and decision making is inconsistent as well as costly. If documents are processed manually, their security rights must also be applied by hand as they are moved to their new locations. General migration tools cannot safeguard document confidentiality, because they do not make intelligent taxonomy workflow decisions based on the text content of the individual document.

Using the conceptClassifier platform, conceptClassifier for SharePoint, and conceptTaxonomyWorkflow, an intelligent approach to migration can be taken. As content is migrated, it is analyzed for organizationally-defined descriptors and vocabularies, which will automatically classify the content to taxonomies, and optionally to the SharePoint Term Store. conceptTaxonomyWorkflow will then process the content to the appropriate repository, for review and disposition.

This approach includes indexing of content and supports migration from file shares to file shares, from file shares to SharePoint, or through any custom action from any other repository, such as .NET code and web services. The plug-in architecture provides the ability to develop custom content sources and destinations, automatically connecting to Concept Searching taxonomies or to the SharePoint Term Store. The system is easily trained to accurately classify content, using multi-word concepts, rules, and metadata clues automatically generated from within the content such as file properties, file path, keywords, and dates. These parameters can be used in the workflow rules, and when content is processed the system will automatically generate semantic metadata, auto-classify the content, and route it to the appropriate SharePoint site, library, or folder.

Migration to SharePoint presents various challenges at various levels, including bulk migration of well managed documents of value, bulk archival or disposition of old or extraneous documents, and finding a way to deal with the poorly classified middle ground. Collections occupying the middle ground contain a variety of documents – some are important, others are ‘nice to keep’ and many are irrelevant. Many large migration vendor solutions are unable to offer an intelligent approach to the middle ground – this is where conceptTaxonomyWorkflow delivers value.

The conceptTaxonomyWorkflow component delivers workflow capabilities that enable intelligent, automatic classification decisions, both during and after migration. These decisions enhance organizational performance and drive down costs but, more importantly, enforce corporate and legal compliance guidelines. For organizations with medium to large free text document collections, migration is no trivial matter and cannot be performed by human effort alone.

Migration must also consider the security of the documents as they are moved to their new locations. There are two imperatives here – to not only respect the existing security status and apply the same security in the new location, but also identify sensitive documents that may not currently be in a secure location. Assessing the security needs of these documents requires intelligent interrogation of their content, and then comparison with a number of relevant official taxonomies, such as PII, PHI, and ITAR. If a document is automatically classified against one or more of these taxonomies, it must be given the appropriate security profile.
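The two imperatives above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the conceptTaxonomyWorkflow implementation: the regular-expression clues, the profile names "internal" and "restricted", and the functions `classify` and `security_profile` are all assumptions made for the example. A real deployment would rely on concept-based classification against full taxonomies rather than a handful of keyword patterns.

```python
import re

# Hypothetical, heavily simplified taxonomies: each maps to a few regex clues.
SENSITIVE_TAXONOMIES = {
    "PII": [r"\b\d{3}-\d{2}-\d{4}\b", r"\bdate of birth\b"],
    "PHI": [r"\bdiagnosis\b", r"\bmedical record\b"],
    "ITAR": [r"\bexport[- ]controlled\b", r"\bITAR\b"],
}


def classify(text):
    """Return the sorted names of taxonomies whose clues appear in the text."""
    matched = []
    for name, patterns in SENSITIVE_TAXONOMIES.items():
        if any(re.search(p, text, re.IGNORECASE) for p in patterns):
            matched.append(name)
    return sorted(matched)


def security_profile(text, current_profile):
    """Keep the existing profile unless the content matches a sensitive taxonomy."""
    return "restricted" if classify(text) else current_profile


doc = "Employee date of birth and SSN 123-45-6789 are listed below."
print(classify(doc))
print(security_profile(doc, "internal"))
print(security_profile("Lunch menu", "internal"))
```

The design point is that the profile is derived from the document's text, not only from where the document currently sits, so a sensitive file found in an open location is still locked down.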

General migration tools cannot safeguard document confidentiality, because they do not make intelligent taxonomy workflow decisions based on the text content of the document. If this security profiling is not performed during the migration to SharePoint, then many of these documents may be surfaced using SharePoint search, breaching the relevant document security obligations. Using conceptTaxonomyWorkflow, these documents will be safely routed to the Records Center or any designated secure location, with the correct access rights, protecting and preserving documents during the migration process.

The products were developed for subject-matter experts. They are easy to use and highly interactive, and can be rapidly deployed. Taxonomies can be created quickly, using the technology to provide content clues to an administrator on recurring phrases, concepts, entities, and keywords. The products also identify related documents that contain similar concepts, which would not typically be found. Optionally, any existing hierarchy or taxonomy can be imported. conceptTaxonomyWorkflow workflows can be created and tested in minutes.

After migration, the taxonomies can be used to address diverse challenges, including those of enterprise search, data privacy, records management, knowledge management, secure collaboration, and text analytics.

Information governance best practices should be applied to the migration of unstructured content. This approach enables rapid document migration, as well as the evaluation of each document as it is migrated. The end result is highly effective, cleansing irrelevant or unnecessary documents, and identifying records that may not have been declared and content that contains the potential for privacy exposures.



The solution enabled the company to achieve its objectives. The Concept Searching tools facilitated the migration, making it easy for both subject-matter experts and the IT team, removing the need for specialized training and its associated costs. The organization was able to clean up its corpus of content, resulting in significantly improved search after the migration, and the opportunity to leverage the metadata to improve a variety of applications.

  • Secure and compliant document migration
  • Identification of duplicate content, revisions, and content that is obsolete or no longer of value, reducing the quantity of content to be migrated
  • Comprehensive integration with the SharePoint enterprise metadata management, writing directly to the Term Store locations and to the conceptTaxonomyManager component simultaneously
  • Automatic locking down or removal of sensitive documents, both during and after the migration process
  • Enforced governance and reduced costs associated with litigation, PII breaches, and noncompliance with federal guidelines
  • Automated identification of records that were never declared, with the solution assigning record descriptors and text, changing content types in SharePoint, and routing the records to the records management application
  • End-to-end solution to manage the entire lifecycle of digital content, and content from file stores in a single repository spanning on-premise and cloud
  • Consistent tagging of content, eliminating end user tagging
  • Ability to rapidly build business-specific taxonomies, and to import and customize existing hierarchies
  • Deployed automatic routing, based on the text content found within each document
  • Proven, high-performance architecture for throughput, and for multi-taxonomy and multi-site requirements
  • Ability to add value to customer relationship systems and to databases, by adding metadata
  • Complements general migration tools
  • Efficient, automated migration of large volume projects
  • Rapidly deployed, designed for subject-matter experts, and easily managed
