Why Federation will make a huge difference and lead us to better information governance practices.

Posted on March 16, 2010


A different approach to Records Management, eDiscovery and Freedom of Information Act (FOIA) requests.

As more and more data moves into the cloud, and as commercial organizations and governments use more and more different enterprise information management applications, it becomes increasingly complex to manage all this data. This is true for ongoing records management, but especially when eDiscovery and Freedom of Information Act (FOIA) requests are at stake. Searching and collecting information from many different repositories can be a horrendous task, let alone managing information that is physically and logically distributed and varied in nature.

In the past, one of the common approaches has been to move all enterprise data into one enterprise content or records management solution (or, if this is not possible, into as few as possible) and to search all data with one enterprise search application. Unfortunately, many of these enterprise projects fail because (i) there is simply too much data, and at some point it will no longer fit easily in the central enterprise solution, (ii) there are too many specialist content management applications that embed specialist functionality which is not available in, or cannot (economically) be rebuilt in, the enterprise solution, (iii) business users refuse to move from a familiar application to the enterprise one, and (iv) more and more data resides in hosted or other repositories in the cloud, which simply cannot be moved into the enterprise application.

As a result, I have seen increased interest from both our customers and industry analysts in a more federated approach based on so-called OpenSearch technology (which returns results in syndication formats such as Atom) and the many XML-based application programming interfaces (APIs) available for almost all repositories. These open standards and open interfaces make a different, often easier to implement, approach possible. Search federation is available not only in almost all in-house enterprise information and content management applications, but in almost all social networks, internet email and other typical cloud applications as well. Based on these open standards, it is now quite possible to search and even collect (identify and download) information through one federated interface. With a little more effort, it is even possible to manage the retention of such information in the cloud. These capabilities are called federated search, federated collection and federated records management.
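To make the open-standards point concrete, here is a minimal sketch of reading a search response in Atom format with OpenSearch extension elements. The sample feed, the `repository.example` URLs and the `parse_results` helper are invented for illustration; a real repository would return its own feed over HTTP.

```python
import xml.etree.ElementTree as ET

# Namespaces used by Atom feeds and the OpenSearch 1.1 response elements.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "os": "http://a9.com/-/spec/opensearch/1.1/",
}

# Hypothetical response; normally this would come back from the repository.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <title>Search results for retention</title>
  <opensearch:totalResults>2</opensearch:totalResults>
  <entry>
    <title>Retention policy 2010</title>
    <link href="https://repository.example/doc/1"/>
  </entry>
  <entry>
    <title>Email retention schedule</title>
    <link href="https://repository.example/doc/2"/>
  </entry>
</feed>
"""


def parse_results(feed_xml):
    """Return the reported total and a list of (title, url) hits."""
    root = ET.fromstring(feed_xml.strip())
    total = int(root.findtext("os:totalResults", default="0", namespaces=NS))
    hits = []
    for entry in root.findall("atom:entry", NS):
        title = entry.findtext("atom:title", default="", namespaces=NS)
        link = entry.find("atom:link", NS)
        url = link.get("href") if link is not None else None
        hits.append((title, url))
    return {"total": total, "hits": hits}


if __name__ == "__main__":
    print(parse_results(SAMPLE_FEED))
```

Because every repository that follows the same response format can be read by the same parser, the federating application only needs one piece of result-handling code per standard rather than one per vendor.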

Wikipedia defines federated search as the simultaneous search of multiple online databases or web resources, and describes it as an emerging feature of automated, web-based library and information retrieval systems. It is also often referred to as a portal or a federated search engine (http://en.wikipedia.org/wiki/Federated_search). A system can either federate one or more other systems, or it can itself be federated by another system.
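The "simultaneous search" in that definition can be sketched in a few lines. The example below is only an illustration under stated assumptions: the connectors are in-memory stand-ins built by a hypothetical `make_demo_connector` helper, the relevance scores are assumed to be comparable across repositories, and none of the names come from a real product.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str   # repository the hit came from
    title: str
    score: float  # relevance reported by the repository (assumed comparable)


# A connector takes a query string and returns that repository's hits.
Connector = Callable[[str], List[Hit]]


def make_demo_connector(source: str, documents: Dict[str, float]) -> Connector:
    """Build an in-memory stand-in for a real repository search API."""
    def search(query: str) -> List[Hit]:
        return [
            Hit(source, title, score)
            for title, score in documents.items()
            if query.lower() in title.lower()
        ]
    return search


def federated_search(query: str, connectors: Dict[str, Connector]) -> List[Hit]:
    """Send the query to every repository in parallel and merge the hits."""
    hits: List[Hit] = []
    with ThreadPoolExecutor(max_workers=max(1, len(connectors))) as pool:
        futures = [pool.submit(search, query) for search in connectors.values()]
        for future in futures:
            try:
                hits.extend(future.result())
            except Exception:
                # A failing repository should not sink the whole search.
                continue
    # Highest score first; ties keep the order in which connectors were given.
    return sorted(hits, key=lambda hit: hit.score, reverse=True)


if __name__ == "__main__":
    connectors = {
        "email": make_demo_connector("email", {"Contract renewal": 0.7}),
        "dms": make_demo_connector("dms", {"Contract template": 0.9}),
    }
    for hit in federated_search("contract", connectors):
        print(hit)
```

The data stays where it is: each repository answers the query with its own search engine, and only the result lists travel to the federating interface.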

This approach is easier to implement than rolling out a central enterprise solution; it also has a much higher probability of being implemented successfully, and users accept it faster.

The one major disadvantage from a search perspective is that not all repositories offer the advanced specialist search features that are available in an enterprise search solution. This problem can be overcome by providing native search for those repositories where this is essential or economically justifiable (often these are the legally risky archives). This can be done through XML data dumps or through in-line full-text indexing (which can optionally also integrate security and additional user-dependent information). Having said this, more and more repositories include more than decent search these days, so this will become less of an issue over time. Another challenge is to map meta information and result structures between the different applications, but this is less of a problem and can be done on a case-by-case basis. Most of the major vendors have already adopted the OpenSearch standard and map their results and meta information properly when they are being federated.
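The case-by-case mapping of meta information can be as simple as one lookup table per repository. The sketch below assumes two hypothetical repositories (`email_archive` and `document_store`) and invented field names; a real deployment would define its own tables.

```python
from typing import Any, Dict

# One mapping per repository: native field name -> common field name.
# Repository and field names are invented for this illustration.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "email_archive": {"subject": "title", "from": "author", "sent": "date"},
    "document_store": {"name": "title", "owner": "author", "modified": "date"},
}


def normalize(source: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Translate a repository-specific record into the common structure."""
    mapping = FIELD_MAPS[source]
    result: Dict[str, Any] = {"source": source}
    extra: Dict[str, Any] = {}
    for key, value in record.items():
        if key in mapping:
            result[mapping[key]] = value
        else:
            # Keep unmapped fields rather than silently dropping them.
            extra[key] = value
    result["extra"] = extra
    return result


if __name__ == "__main__":
    mail = {"subject": "Q1 report", "from": "alice", "sent": "2010-03-01", "size": 2048}
    print(normalize("email_archive", mail))
```

Keeping the unmapped fields in `extra` is a design choice for this sketch: in an eDiscovery or FOIA context it is usually safer to carry unknown metadata along than to lose it during federation.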

In the next few years, I expect federation to become increasingly popular, especially since hard deadlines from eDiscovery and FOIA requests in particular, and Information Governance requirements in general, force us to solve information access and management problems yesterday rather than next week. Easier deployment, faster roll-out, lower financial and technological risk, lower upfront investment, integration of both cloud and non-cloud repositories, minimal data migration and less user training are just a few of the benefits of this approach over the one-enterprise solution.

By developing federated search connectors for the most popular repositories, software providers will continue to stay ahead of the curve and offer their customers the most economical solution for their particular infrastructure. Custom connectors will also continue to be an option. The next step is federated collection for eDiscovery and FOIA. Last but not least will come the step towards federated records management and federated information governance, which will help us manage all our data in a federated model.