Using Google for Lucene

  • 11-Apr-2011

As I've noted on this blog many times before, many of the Google Search Appliance's "features" are actually outside-of-the-box, rather than out-of-the-box. That can come as an unpleasant surprise to many Google customers. But ironically, it's also a great advantage to anyone who wants to use open source Lucene/Solr.

To index SharePoint and other sources, the Search Appliance (GSA) uses Google's "Enterprise connector framework." The framework formerly required an external server to run on. Only with version 6.2 of the GSA (released toward the end of 2009) was it actually deployed on the yellow boxes themselves, making indexing SharePoint a point-and-click UI affair.

But the nice thing about the Framework is that Google actually open sourced it (under an Apache license). This means it's easy to adapt it to other repositories (if you have the developer acumen to do so, and a server to run it on). More interesting, perhaps, is that it's also not too hard to use Google's Framework to feed SharePoint content to Solr instead. And this is exactly what Lucid Imagination has done in the latest 1.7 release of its LucidWorks distribution. Using the connector, it's now as easy to index SharePoint with Solr as it is with a Google Appliance.
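To make "feeding content to Solr" a bit more concrete, here is a minimal sketch of the Solr side only: building the XML `<add>` message that Solr's update handler accepts. It is not taken from Lucid's or Google's code. The function name `build_solr_add`, the sample document, and the field names `id` and `title` are all assumptions for illustration; real field names depend on your Solr schema.

```python
import xml.etree.ElementTree as ET


def build_solr_add(docs):
    """Build a Solr XML <add> message from a list of field dictionaries.

    Hypothetical helper: each dictionary maps a field name to a value,
    or to a list of values for a multi-valued field.
    """
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc, "field", name=name)
                field.text = str(v)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


# Made-up document, shaped like something a connector might hand over.
sample = [{"id": "sharepoint-doc-1", "title": "Q1 budget & forecast"}]
message = build_solr_add(sample)
print(message)
```

A message like this would then be POSTed as `text/xml` to the update handler of your Solr installation (the exact URL depends on how Solr is deployed), followed by a commit. The hard part, of course, is everything before this step: crawling the repository and handling security, which is what the connector is for.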

However, if you're looking for a discount search engine for SharePoint, LucidWorks Enterprise isn't necessarily it. It's not "free" as in "free beer": you'll need a $64K annual subscription with Lucid to run it in a production environment. If money matters, do the math.

More importantly, don't forget that while a cost comparison is difficult enough, it's only one variable in the mix. And with various search solutions mixing and matching their components, it becomes a matter of comparing Grapples to Orangelos. The good news is there are plenty of options; but then there's plenty of homework, too...


MD