Archive | Search

Dysfunctional Search and the FBI, any of it sound familiar?

I think ‘dysfunctional search’ is a great name. Unfortunately, it appears that the FBI wins the prize, although I am sure many other organizations feel that their search is dysfunctional too. A Techdirt article, ‘How The FBI’s Dysfunctional Search Systems Keep Information Out Of FOIA Requesters’ Hands’, did provide a chuckle, simply because it is just too late to take the US government seriously anymore.

To make this short, Trentadue versus the FBI deals with a requested release of videotapes containing footage of the Oklahoma City bombing. During the first four days of testimony it was revealed that the FBI has ‘convenient’ information silos instead of a cohesive repository for search. The problem is that the person requesting the information must specify the correct records system for a comprehensive search to take place; the FBI typically searches only the main repository. In addition, the requester must explicitly ask for a ‘cross-reference’ check, which covers records that may mention the subject but are not stored in the main repository. Finally, the now beleaguered requester must also send a request to each field office involved, because the FBI’s ‘Records Information Dissemination’ has no cross-links to any field office other than the original one.

What about internal search at the FBI? The Central Records System (CRS), as it turns out, is not really a central repository. It accepts three different methods of search, which return three different sets of documents. One of the search methods, Automated Case Support (ACS), is used to search the CRS, but that search isn’t unified. To make matters worse, the ACS is itself split into three components. I think I’ll stop there, as it just gets worse and worse, really it does. Oh, one more tidbit: the FBI decides what keywords to use.

I would imagine, or at least sincerely hope, that most organizations do not have a search environment like the FBI’s. But enterprises do have silos of information, and many have no integrated way to search across multiple repositories, either through a software product that spans repositories or through federated search. This should be a basic function. According to an AIIM study, only 18% of organizations have cross-repository search capabilities. Maybe the FBI should provide training lessons.
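For readers who have not seen federated search in action, here is a minimal sketch of the idea: send the same query to every repository, then merge the hits into one ranked list. Everything in it is an assumption made for illustration, including the in-memory REPOSITORIES data, the repository names, and the simple term-overlap score. A real deployment would call each system's own search interface instead.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hit:
    repository: str
    doc_id: str
    score: float


# Hypothetical toy data: repository name -> {document id: text}.
REPOSITORIES: Dict[str, Dict[str, str]] = {
    "main": {"m1": "bombing investigation summary", "m2": "budget report"},
    "field_office": {"f1": "videotape log bombing evidence"},
    "cross_reference": {"x1": "memo mentions bombing videotape"},
}


def search_repository(name: str, query: str) -> List[Hit]:
    """Search one silo; score is the fraction of query terms found."""
    terms = query.lower().split()
    hits = []
    for doc_id, text in REPOSITORIES[name].items():
        words = text.lower().split()
        matched = sum(1 for term in terms if term in words)
        if matched:
            hits.append(Hit(name, doc_id, matched / len(terms)))
    return hits


def federated_search(query: str) -> List[Hit]:
    """Run the same query against every silo and merge the results."""
    merged: List[Hit] = []
    for name in REPOSITORIES:
        merged.extend(search_repository(name, query))
    # Best score first; ties broken by repository name, then document id.
    return sorted(merged, key=lambda h: (-h.score, h.repository, h.doc_id))


for hit in federated_search("bombing videotape"):
    print(hit.repository, hit.doc_id, hit.score)
```

Searching only the "main" repository in this toy data would return one partial match, while the federated query also surfaces the two documents that match both terms.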

Does your organization have cross-repository search capabilities or federated search?

Comments are closed

Office 365 Compliance Search for Email and Content - Good but not Good Enough

According to our third annual Microsoft Survey, the use of Exchange is almost a given. So is the rise of data breaches, which are most likely caused by your own employees. In Exchange, potential exposure can be identified through Compliance Search. This enables administrators to search for common strings such as social security numbers, credit card numbers, or account numbers. The searches can be saved and re-executed. Concept Searching adds value to the identification of private or confidential information, regardless of where it resides, because it is not limited to defined descriptors such as a social security number but can include any descriptor and verbiage that you want secured.
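To make the idea of searching for 'common strings' concrete, here is a rough sketch of pattern-based detection. It is not how Office 365 Compliance Search is implemented. The regular expressions, the function names, and the Luhn checksum filter for card numbers are all assumptions chosen for illustration.

```python
import re
from typing import Dict, List

# Illustrative patterns only; real products use richer definitions.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str) -> Dict[str, List[str]]:
    """Find SSN-like strings and card-like strings that pass Luhn."""
    ssns = SSN_PATTERN.findall(text)
    cards = [m for m in CARD_PATTERN.findall(text) if luhn_valid(m)]
    return {"ssn": ssns, "credit_card": cards}


sample = "SSN 123-45-6789, card 4111 1111 1111 1111, order 4111111111111112"
print(find_sensitive(sample))
```

The order number in the sample looks like a card number but fails the checksum, so it is not reported. This kind of matching only works when the sensitive content has a predictable shape.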

Most security products, including Office 365 Compliance Search, will identify the standard descriptors typically used by most organizations. That doesn’t always work. Content marked For Official Use Only (FOUO), new product information, competitive information, intellectual property, patents, or specific customer information may all be confidential, but identifying it is not easy because each subject may not have a common denominator to use as a rule. What to do then?

Concept Searching lets the organization quickly define rules that contain descriptors (such as a social security number) and/or associated verbiage. Since we generate multi-term metadata that forms a concept, the organization faces no limits or bottlenecks when trying to secure specific information. Once the content is found, Office 365 or SharePoint tools can redirect it to a secure repository, remove it from search, and prevent portability. Pretty cool. Rules are easily added, deleted when no longer necessary, and changed as the content the organization considers confidential changes. In SharePoint, taxonomies can be deployed, and when a document is found to contain a potential data breach, the content type is automatically changed and classified against the taxonomy. It works when content is created or ingested, and in real time. It works with diverse repositories: SharePoint, Office 365, you name it. You’re totally covered.
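As a rough illustration of rules that combine descriptors with associated verbiage, consider the sketch below. It does not represent the Concept Searching product or any SharePoint API. The Rule class, the rule names, the 'project falcon' phrase, and the routing targets are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    """A hypothetical rule: regex descriptors and/or plain phrases."""
    name: str
    patterns: List[str] = field(default_factory=list)
    phrases: List[str] = field(default_factory=list)

    def matches(self, text: str) -> bool:
        if any(re.search(p, text) for p in self.patterns):
            return True
        lowered = text.lower()
        return any(phrase.lower() in lowered for phrase in self.phrases)


# Rules can be added, removed, or edited as needs change.
RULES = [
    Rule("pii", patterns=[r"\b\d{3}-\d{2}-\d{4}\b"]),
    Rule("fouo", phrases=["for official use only"]),
    Rule("new_product", phrases=["project falcon", "unreleased product"]),
]


def classify(text: str, rules: List[Rule]) -> List[str]:
    """Return the names of all rules the text triggers."""
    return [rule.name for rule in rules if rule.matches(text)]


def route(text: str, rules: List[Rule]) -> str:
    """Send anything that triggers a rule to a secure location."""
    return "secure_repository" if classify(text, rules) else "default_library"


doc = "FOR OFFICIAL USE ONLY: Project Falcon specs"
print(classify(doc, RULES), route(doc, RULES))
```

Phrase rules like these cover content that has no fixed pattern, such as the FOUO marking or a product code name, which pattern-only search would miss.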


Precision versus Recall – What is old becomes new again

During my research I often find little snippets of information that make me stop and think about how ideas, theories, and processes are repeated. I imagine a highway being built that stretches endlessly toward the horizon, only to return us to the starting point. It seems to be happening more often lately.

Even with technology we are still seeing history repeated. Enterprise search has been around for about 67 years, as described by J.E. Holmstrom in 1948. Machine Learning, or Artificial Intelligence, has been around for 61 years and is now becoming the newest buzzword and must-have technology. Precision and Recall was introduced in 1955, when a gentleman named Allen Kent joined Case Western Reserve University. That same year, Kent and his colleagues published a paper in American Documentation describing the precision and recall measures and detailing a proposed “framework” for evaluating an Information Retrieval system, which included statistical sampling methods for determining the number of relevant documents not retrieved.

Over three generations have passed, and what is ‘old’ is now ‘new’. Precision and Recall is back in the news, at least in the legal industry. What brought this to mind is an article I read in Legaltech News, written by Zach Warren. It’s a good read regardless of industry, as he hits the nail on the head on almost all points.

Years ago, the accuracy of search was measured by precision versus recall. In fact, we have several clients who use our tools to tweak and manage precision versus recall. Why? One is considered one of the top three global analyst firms, and it needs precision and recall on its external client web site, where poor search results equal lost revenue. Another client has 170K global users and needs accurate search results. The image from Wikipedia illustrates Precision and Recall in an easy-to-understand graphic.
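For anyone who prefers numbers to graphics, the two measures can be computed in a few lines. Precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The document ids below are made up for illustration.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query's result set."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 5 documents actually relevant.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d5", "d6", "d7"}

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

Returning more documents tends to raise recall and lower precision, which is why the two are usually tuned against each other.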

These days, aside from some of our clients, I don’t think it is used much. I also agree with the writer that most tools don’t let you easily manipulate precision versus recall. It seems to be a forgotten metric in search efficiency. Luckily, our tools are easy to use, and although precision and recall is a tough nut to crack, it’s not like it used to be. Nice to see it back around again, at least in the legal industry.

 


What is Microsoft’s Search Strategy? Are they as confused as I am?

Microsoft’s search strategy is somewhat unclear, at least to me. Office Graph uses artificial intelligence and borrows from the FAST search technology. This is the basis for the Clutter feature in Outlook, which lets users remove low-priority emails. It is also the basis for Delve, which is a business social tool. Within Word, Excel, and PowerPoint, Bing is used to provide a tool called Insights with a ‘Tell Me’ search feature. Many organizations would find this confusing, and one wonders whether improving and managing the results would require additional support personnel for each search option. I have to believe organizations would prefer not to piece the search puzzle together themselves. Adding on-premises to the mix makes it even more complicated.

These factors can present challenges to Microsoft. Although organizations want accurate and relevant search, they don’t want to spend money or time on it; they would like a plug-and-play environment that takes the burden of finding information off the end user. Unfortunately, Office Graph, even combined with FAST, needs to learn the interests of each individual, which will delay the effectiveness of search across the organization and, ultimately, Office 365 adoption. The primary stumbling block is going to be end-user tagging, as Office Graph uses the metadata added automatically or by the individual. Delve is going to be very confused considering how poorly users tag content.


Big Brother really is watching you! Office 365 Delve

Under the name ‘Organizational Analytics’, the new version of Delve, available later this year, will include a dashboard view that tracks your own work performance and compares it to the company average. Although Microsoft sees this as a valuable tool, one has to question whether it is an effective management tool or will upset the proverbial end user apple cart. This actually bothers me a bit. I realize that there are diligent workers and then there are slackers. Now we will all be tracked on exactly what we are doing: ‘oh-oh, you went to too many meetings, you’re answering too many emails, the whole department is performing better than you’. I think you get the picture.

Delve has also added another new feature, termed a productivity tool: a new profile page where users can specify their contact information, whom they report to, and who reports to them, plus a personal blog page that lets the user embed videos, documents, and images. It also includes a Praise page where users can list personal accolades, customer sales, contracts, or whatever they wish to share with colleagues. Hmm, what will Organizational Analytics think of my time spent building my blog of ‘atta boys’?

The above ‘tools’ go hand-in-hand with Microsoft’s new infographic, which I thought was very tasteless. If you haven’t seen it yet, look up the ‘This terrifying Microsoft ad suggests you’re not working hard enough in the bathroom’ infographic, which has gone viral. I thought it was a huge marketing mistake, but I am now wondering whether it was really a mistake at all. What do you think?
