A question often asked by people learning about search technology is, "What's the difference between enterprise and federated search?" It is not the simplest question to answer, and in my travels I have found that the various definitions of these terms typically don't help explain the difference. That's because the difference is very subtle indeed.
At the core, both enterprise and federated search are about accessing, indexing, and querying diverse and dispersed repositories. Enterprise search is very specifically about searching within the enterprise, which may include an externally-facing web site or extranet(s).
An enterprise search tool will often come across repositories that already have their own search engine, which has already built its own index. Assume for a moment that you elect not to have your enterprise search engine re-index all that content itself, yet you still want to expose items from those repositories. In this scenario, when the enterprise search engine needs to return results from those already-indexed repositories, it can do one of two things:
- Directly query those indexes created by other search engines in real-time, or
- Trigger those search engines to provide results, and in turn aggregate what it collects into a single, enterprise-wide results set
That's federation: leveraging the indexes and potentially the query results of other search services within your enterprise.
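To make the second option concrete, here is a minimal Python sketch of an aggregator that asks each repository's own engine for results and combines them into one list. Everything in it is an assumption made for illustration: the engine names, the `make_engine` helper, and the result fields (`url`, `title`, `score`) are hypothetical stand-ins, not any vendor's API.

```python
from typing import Any, Callable, Dict, List

# Hypothetical shapes: a result is a plain dict, an engine is any callable
# that takes a query string and returns that repository's own hits.
Result = Dict[str, Any]
Engine = Callable[[str], List[Result]]


def make_engine(docs: List[Result]) -> Engine:
    """Build a toy stand-in for a repository's native search engine."""
    def search(query: str) -> List[Result]:
        # Naive title match; a real engine would consult its own index.
        return [d for d in docs if query.lower() in d["title"].lower()]
    return search


# Two repositories, each already searchable on its own (illustrative data).
ENGINES: Dict[str, Engine] = {
    "wiki": make_engine([
        {"url": "wiki/search-faq", "title": "Search FAQ", "score": 0.9},
        {"url": "wiki/holidays", "title": "Holiday policy", "score": 0.4},
    ]),
    "dms": make_engine([
        {"url": "dms/rollout", "title": "Enterprise search rollout plan", "score": 7.5},
        {"url": "dms/expenses", "title": "Expense report template", "score": 3.0},
    ]),
}


def federated_search(query: str, engines: Dict[str, Engine]) -> List[Result]:
    """Trigger every engine and aggregate the hits into one results set."""
    combined: List[Result] = []
    for name, engine in engines.items():
        for hit in engine(query):
            # Tag each hit with the repository it came from.
            combined.append({**hit, "source": name})
    return combined


results = federated_search("search", ENGINES)
for hit in results:
    print(hit["source"], hit["url"])
```

Note that the aggregator never crawls or indexes anything itself; it only relays the query and collects what comes back. The scores from the two toy engines are on different scales, and nothing here reconciles them.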
However, I've heard some vendors use the terms enterprise search and federated search interchangeably -- which creates confusion. A scenario where an enterprise search tool crawls other repositories itself and builds an index based on that content in all its unstructured glory is classic enterprise search, even if your vendor calls it "federated search."
Obviously federated search sounds quite attractive, especially in an environment where enterprises have licensed multiple search products or employ various information management systems that embed their own search facilities.
In practice, however, comprehensive federated search is quite difficult. Even more than with “regular” enterprise search, security becomes an important issue, particularly when multiple indexes or engines follow different security models. Also, with either of the two federation approaches described above, it can become extremely hard to de-dupe, merge, and provide comprehensive relevancy rankings in a reasonable timeframe for the searcher. A bottleneck in one repository can gum up the overall results.
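The merging problems can be illustrated with a rough Python sketch. It queries several engines in parallel, gives up on any engine that misses a deadline, rescales each engine's scores, and de-dupes by URL. The engine functions, field names, and the divide-by-top-score normalization are all assumptions chosen for brevity; real products use more careful relevancy merging, and this sketch ignores security entirely.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Any, Callable, Dict, List

Result = Dict[str, Any]
Engine = Callable[[str], List[Result]]


def normalize(hits: List[Result]) -> List[Result]:
    """Rescale one engine's scores to 0..1 (naive: divide by its top score)."""
    top = max((h["score"] for h in hits), default=0)
    return [{**h, "score": h["score"] / top if top else 0.0} for h in hits]


def federated_search_with_deadline(
    query: str, engines: Dict[str, Engine], timeout: float = 0.2
) -> List[Result]:
    pool = ThreadPoolExecutor(max_workers=max(1, len(engines)))
    futures = {pool.submit(engine, query): name for name, engine in engines.items()}
    done, _late = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not let a slow repository hold up the rest

    best: Dict[str, Result] = {}
    for future, name in futures.items():  # engine order keeps ties deterministic
        if future not in done or future.exception() is not None:
            continue  # skip engines that were too slow or failed
        for hit in normalize(future.result()):
            current = best.get(hit["url"])
            # De-dupe by URL, keeping the highest normalized score.
            if current is None or hit["score"] > current["score"]:
                best[hit["url"]] = {**hit, "source": name}
    return sorted(best.values(), key=lambda h: (-h["score"], h["url"]))


# Hypothetical engines with incompatible score scales and one bottleneck.
def intranet(query: str) -> List[Result]:
    return [{"url": "intranet/policy", "score": 12.0},
            {"url": "shared/handbook", "score": 6.0}]


def dms(query: str) -> List[Result]:
    return [{"url": "shared/handbook", "score": 80},
            {"url": "dms/contract", "score": 40}]


def slow_archive(query: str) -> List[Result]:
    time.sleep(0.5)  # simulates the repository that gums up the results
    return [{"url": "archive/old", "score": 1.0}]


merged = federated_search_with_deadline(
    "handbook", {"intranet": intranet, "dms": dms, "archive": slow_archive}
)
for hit in merged:
    print(hit["url"], hit["source"], hit["score"])
```

The trade-off is visible even in this toy version: dropping the slow archive keeps the response fast but makes the results incomplete, and normalizing by each engine's top score treats every engine's best hit as equally relevant, which is rarely true.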
When I hosted a search seminar with Martin White of Intranet Focus in London a couple of years ago, he drew an excellent representation of these federated search challenges. I have since converted it to PowerPoint and use it often in my presentations about search technologies.
Here at Real Story Group, we've added more detail to our search vendor evaluations regarding how vendors handle this highly complicated scenario. Despite the challenges of federated search, there is promise in these technologies, especially in approaches that rely on the native search engines to provide the core results while serving only as basic aggregators themselves.