Google Search Appliance v6: don't believe the hype

  • 2-Jun-2009

The release of a new major version of the Google Search Appliance usually creates lots of excitement, but that excitement fades quickly once people start to use the machines for real. Even many of the Google resellers I talk to admit to some disappointment: first highly anticipated features are finally introduced, and then they turn out to be rather DIY.

I don't want to sound like a broken record -- see "Google Search Appliance: small step in technology, giant leap in marketing" or "The Emperor's New Box". The point with many of the Google Appliance's new features is that typically, they're not actually on the box itself, and quite often, that means you start losing the convenience of having a plug-and-play appliance.

So pardon me for being skeptical of what the new version 6 of the yellow server will actually bring to the table. Try to line up the various lists of new features: the PowerPoint presentation I saw was slightly different from the "New!" list on the product page. The YouTube announcement mentions cross-language enterprise search, which isn't a feature but an experimental project. Actually, looking at the version 6 documentation, quite a few of the new features turn out to be still in beta. Many of the announcements mention "early binding" security as a new feature, but I haven't been able to find this in the documentation. The same goes for many of the other capabilities you'll hear buzzing about: they're very hard to trace back to actual features in the new version 6.

Perhaps the most hyped novelty is scaling, or as Mountain View likes to call it, (GSA)^n. As the Google Enterprise blog gushes, "When we tested it out, the product manager was pretty excited about all the new features and search power. He was used to hearing about the millions of docs we could handle – but this time we were going to push it to a new realm: billions." They've even made a video that shows the product manager being all excited. I can imagine him being excited -- like many other vendors, Google charges per indexed document. But a flat claim of "billions," with "less than five server racks!" means very little to me. Billions of what? 10-digit phone numbers or 40 MB PDFs? (I could go on for hours on why this is quite meaningless, since it all depends on the exact circumstances, but suffice it to say that I've already seen several vendors demonstrate billion-document indexes on a lot less than five server racks.)
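To put rough numbers on "billions of what?", here is a back-of-the-envelope sketch. It only multiplies a document count by an assumed size per document; the one-byte-per-digit phone number and the decimal 40 MB PDF are my own illustrative assumptions, and the result is raw content volume, not the size of any vendor's index.

```python
# Raw content volume for "a billion documents" under two assumed document sizes.
# Illustrative arithmetic only: not a measurement of any real appliance or index.
DOCS = 1_000_000_000


def raw_corpus_bytes(doc_count: int, bytes_per_doc: int) -> int:
    """Total raw bytes if every document has the same assumed size."""
    return doc_count * bytes_per_doc


phone_numbers = raw_corpus_bytes(DOCS, 10)        # 10 digits, one byte each (assumption)
large_pdfs = raw_corpus_bytes(DOCS, 40 * 10**6)   # 40 MB, decimal megabytes (assumption)
ratio = large_pdfs // phone_numbers

print(f"phone numbers: {phone_numbers / 10**9:.0f} GB")
print(f"40 MB PDFs:    {large_pdfs / 10**15:.0f} PB")
print(f"ratio:         {ratio:,}x")
```

Same document count, a factor of four million in raw content. That is why a bare "billions" tells you next to nothing.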

However, I was pleased to see that this is one of the features that's actually on the box itself in version 6. Well... "Dynamic Scalability" is, at any rate, which "enables multiple Google Search Appliances to work together to scale up to 30 million documents". So how do you get from 30 million to billions? I presume by using "Distributed Crawling", since this "greatly increases the number of documents that can be crawled". That's a beta feature, though, which can only lead me to conclude that the theme of this release should actually be (GSA)^beta (pronounced as "to the power of beta").

I tend to think that in every market, there's room for Rolls-Royces, Volkswagens, and everything in between. But it makes very little sense to take a Volkswagen and turn it into a Rolls-Royce. Thankfully, Volkswagen understands this. Maybe Google should, too.