Real Story Group. Make Better Technology Decisions.

Adriaan Bloem

GSA6: Google Billions, Revisited

11-Jun-2009

Tags: Enterprise Search, Marketplace at Large, Google Search Appliance

Last week, I posted a highly critical comment on Google's marketing of version 6 of the Google Search Appliance (GSA). My main qualm is that the hyperbole makes it very hard to understand what it is they're actually selling. What you get with a GSA is not exactly how it looks on YouTube (well, the box is, but not necessarily the internals).

Of course, in my quest to get you the real story, I'm not going to leave it at "press releases and documentation don't match up". The interesting bit is what the software is actually capable of; even more interesting is what customers are doing with it in reality.

For now, I'll zoom in on what made the headlines: the Appliance's new capability to index billions of documents, rather than the 30 million of the previous version. I noted two things about this:

  • The Dynamic Scalability feature, according to the version 6.0 documentation, "enables multiple Google Search Appliances to work together to scale up to 30 million documents and provide unified search results" (not billions);
  • Being able to index billions of documents is, in general (and this applies to all vendors), a rather meaningless statement, since it really depends on what you're indexing (I used the example of 10-digit phone numbers vs. 40 MB PDFs).

Google got in touch with me to explain this, and this led to two surprises.

First of all: Dynamic Scalability is, in fact, the feature that would enable indexing billions of documents, and this isn't a beta feature. So what about the documentation's reference to a 30 million document limit? As it turns out, this is an error in Google's documentation. (For now, the error is still in the "Guide to Software Release 6.0", but I've been told this will be corrected.) According to Google, there is no hardwired limit to the number of documents you can index using multiple machines (as long as you buy lots and lots of Appliances to do it on, of course).

Secondly, about the difference between indexing 10-digit phone numbers and 40 MB PDFs: I've been told that the Appliance's hardware is carefully over-spec'ed to handle the load Google claims it can deal with. (The Dell PowerEdge R710s the vendor ships would outperform many commodity servers.) My 40 MB comment was a bit of a jab: an Appliance won't index documents larger than 30 MB. But as Google explained, the limit has been set so they can guarantee that when they say a GB-7007 can index 10 million documents, it can actually index 10 million of those 30 MB PDFs when that's what you need to do. And to be fair, if large documents are an issue for you, you'll want to read our Search & Information Access Report product evaluations carefully, since most enterprise search products have similar limits.
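
To put the phone-numbers-versus-PDFs point in numbers, here is a rough back-of-the-envelope sketch. It only multiplies document count by document size; the function name, the decimal units, and the assumption that a 10-digit phone number takes 10 bytes are mine, not Google's, and it says nothing about how large the resulting index would be.

```python
# Back-of-the-envelope comparison of raw corpus sizes for the same document count.
# Assumptions (mine, for illustration): decimal units (1 MB = 10**6 bytes) and a
# 10-digit phone number stored as 10 bytes of plain text.

BYTES_PER_MB = 10**6
BYTES_PER_TB = 10**12


def corpus_bytes(doc_count: int, bytes_per_doc: int) -> int:
    """Raw size of a corpus in bytes, ignoring any index overhead."""
    return doc_count * bytes_per_doc


DOC_COUNT = 10_000_000  # the document count quoted for a GB-7007

phone_corpus = corpus_bytes(DOC_COUNT, 10)                # 10-digit phone numbers
pdf_corpus = corpus_bytes(DOC_COUNT, 30 * BYTES_PER_MB)   # PDFs at the 30 MB cap

print(f"Phone numbers: {phone_corpus / BYTES_PER_MB:.0f} MB")
print(f"30 MB PDFs:    {pdf_corpus / BYTES_PER_TB:.0f} TB")
print(f"Ratio:         {pdf_corpus // phone_corpus:,}x")
```

Same document count, wildly different amounts of raw content to crawl and process, which is why a bare "number of documents" figure tells you so little on its own.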

In the end, of course, the proof will be in the pudding: even if the software is capable of tying together 38 appliances to index a billion documents, this may not mean you'd actually want to. What are minor issues on a smaller corpus suddenly become major problems on that scale, and I'm looking forward to seeing how real enterprises are faring in deploying a cluster of GSAs for such high volumes.

And if anything: you still shouldn't believe the hype. Google's "billion document index" headline was syndicated across hundreds of news sources before even Google itself found out its documentation contradicted this. You'll want to be sure to get your information from a reliable source.

Copyright Real Story Group 2001 - 2012. All rights reserved.
