Thoughts on Google Monoculture and the Cloud

  • 31-Jan-2009

At 1430 hours UTC on Saturday, January 31, the unthinkable happened. Google malfunctioned.

Anybody who performed a Google search between 6:30 a.m. Pacific time (USA) and 7:25 a.m. saw the message "This site may harm your computer" as a link under each search result (even if you Googled "google"). Upon clicking any hit in the results list for any search, users were taken to a special Google warning page instead of the desired target page.

Ironically, the problem was the result of human error: a worker at a non-profit called StopBadware.org (which Google hires to look into every complaint about harmful sites sent to Google) mistakenly flagged "/" as a bad site. Because "/" appears in every URL, Google dutifully flagged all sites as harmful once the entry went into the system.
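
To see how one bad entry can flag everything, here is a minimal Python sketch. It assumes, purely for illustration, that flagging works by simple substring matching against a list of bad patterns; Google's actual matching logic may well differ. The names is_flagged, is_overbroad, and KNOWN_GOOD are hypothetical, as are the sample URLs.

```python
# Illustrative only: assumes a blocklist matched by plain substring search.
# This is NOT Google's real implementation; all names here are hypothetical.

# A small sample of URLs that should never be flagged (assumed for the sketch).
KNOWN_GOOD = [
    "http://www.google.com/",
    "http://example.org/index.html",
]


def is_flagged(url, bad_patterns):
    """Return True if any blocklist pattern occurs anywhere in the URL."""
    return any(pattern in url for pattern in bad_patterns)


def is_overbroad(pattern, known_good_urls=KNOWN_GOOD):
    """Return True if the pattern would flag a URL from the known-good sample."""
    return any(pattern in url for url in known_good_urls)


if __name__ == "__main__":
    # A sensible entry flags only the site it names.
    print(is_flagged("http://www.google.com/", ["badsite.example/malware"]))
    # The entry "/" matches every URL, because every URL contains a slash.
    print(is_flagged("http://www.google.com/", ["/"]))
    # A sanity check run before accepting the entry would have caught it.
    print(is_overbroad("/"))
```

A guard like is_overbroad, run before a new entry is accepted, is one inexpensive way to keep a single typo from reaching every search result.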

It took roughly an hour for the problem to be fixed. But what if it had taken a day? Two days? A week? Unlikely, to be sure. But so was this incident. And remember, there was no automatic failover mechanism; it took Google that full hour to get the system back to normal.

One wonders: If Google were to go down (or become essentially unusable -- same thing) for, say, 72 hours or more, how disruptive would it be to the economy? Would online retailers see a slowdown in business? Would job-seekers remain out of work longer? Would the productivity of information workers (who supposedly spend a couple of hours per day doing online searches) be seriously affected?

How would users of Google applications be affected by problems with the main search engine? Google offers over 70 services of various kinds. Does anyone even know what all the dependencies are?

Fortunately, there are alternatives to Google for basic search, and I like to think that in the event of a serious Google outage, people would fairly quickly (re)discover Mamma.com, Yahoo Search, Cuil, and other alternative search sites. But even so, a significant percentage of the world's online advertising happens via Google. And a lot of us use Gmail (and Google Docs, and other Google apps) for non-trivial purposes. To think that a Google outage wouldn't have significant repercussions for business would be unrealistic.

Perhaps the Saturday-morning #Googlemayharm incident (which is how it quickly became known on Twitter) should serve as a warning. It's a wakeup call for those of us who've gotten caught up in what some have dubbed the "Google monoculture." Becoming dependent on a single supplier is bad ... have we forgotten?

The Google incident should also remind us that graceful failover is good. (Absence of backup systems can be painful and costly.) If you're planning an Enterprise Search deployment, think about what a search outage would mean for your business. Estimate the risks and costs. Budget accordingly.
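
For readers planning a deployment, the idea of graceful failover can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended design: the backends are hypothetical stand-ins (plain functions that take a query and return a list of results), and a real system would also need timeouts, health checks, and logging.

```python
# Minimal failover sketch. All names (search_with_failover, primary,
# secondary) are hypothetical and exist only for this illustration.


def search_with_failover(query, backends):
    """Try each backend in order; return the first successful result list."""
    last_error = None
    for backend in backends:
        try:
            return backend(query)
        except Exception as err:  # a real system would catch narrower errors
            last_error = err
    # Every backend failed: surface the last error to the caller.
    raise RuntimeError("all search backends failed") from last_error


def primary(query):
    """Stand-in for a primary search service that is currently down."""
    raise ConnectionError("primary index unavailable")


def secondary(query):
    """Stand-in for a backup service that answers from a cached index."""
    return ["cached result for " + repr(query)]


if __name__ == "__main__":
    # The primary fails, so the caller silently gets the backup's answer.
    print(search_with_failover("enterprise search", [primary, secondary]))
```

Note that a scheme like this only helps when a backend fails loudly. In an incident like the one described here, where the system keeps answering but the answers are wrong, failover would also need a sanity check on the results themselves.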

Maybe there's a message here too for cloud-computing proponents. Sometimes even the most highly distributed, highly virtualized, "enterprise-hardened" infrastructure is no stronger than its weakest component. And quite often, the weakest component is human. That's never going to change -- cloud or no cloud.