Cloud computing is one of those buzzphrases that, like "redistribution of income," seems to make otherwise dispassionate people hyperventilate. Oracle founder Larry Ellison, speaking at the recent Oracle OpenWorld conference, raised quite a few eyebrows when he derided "cloud computing" as "complete gibberish" in an extended on-stage rant before an audience of financial analysts. A few days later, Free Software Foundation patriarch Richard Stallman (never one to mince words) called cloud computing "worse than stupidity" in a highly critical interview with The Guardian.
Don't be fooled, though. Cloud computing is not just a catchphrase. Like REST, it's a style of doing things that doesn't seem particularly profound at first glance, but has important implications for certain problem spaces. What the skeptics need, perhaps, are a few real-world case studies in cloud computing, to understand what the hubbub is about.
One such case study comes by way of a blog post by Derek Gottfrid of The New York Times. Gottfrid tells (in some detail) how he used cloud computing on a spot basis to solve a very specific (and quite daunting) content-management problem in a short period of time, at minimal cost.
The problem: Make all of NYT's public domain articles from 1851-1922 available online as PDFs. The source data? Four terabytes' worth of TIFF images.
Gottfrid decided to use the Amazon S3 service for temporary storage, and Amazon EC2 as a processing grid. The basic idea, Gottfrid says, was to "write some code that would run on numerous EC2 instances to read the source data, create PDFs, and store the results back into S3. S3 would then be used to serve the PDFs to the general public."
Gottfrid wrote some relatively simple code against the popular iText PDF library to achieve TIFF-to-PDF conversion. Doing the conversion with a single running instance of iText would have taken a very long time, obviously, so the main challenge was to distribute the processing across numerous EC2 virtual machines. To help with this, Gottfrid turned to Hadoop, the open source implementation of Google's MapReduce programming model for distributed computing. Using Hadoop, Gottfrid was able to spread the conversion process across 100 EC2 instances. As a result, some 11 million PDFs were generated in 24 hours. Mission accomplished.
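The shape of the job is easy to sketch. In MapReduce terms, each map task takes one article, pulls its TIFF pages from S3, builds a PDF, and writes the result back to S3; the framework's job is simply to spread those independent tasks across the worker pool. Here is a minimal Python sketch of that pattern. The function names (`fetch_tiffs`, `build_pdf`, `upload_pdf`) are hypothetical placeholders, not Gottfrid's actual code, which ran on Java-based Hadoop and used iText for the PDF step.

```python
def plan_batches(article_ids, num_workers):
    """Partition article IDs round-robin so each worker (EC2 instance)
    gets a roughly equal share of the conversion work."""
    batches = [[] for _ in range(num_workers)]
    for i, article_id in enumerate(article_ids):
        batches[i % num_workers].append(article_id)
    return batches

def map_convert(article_id, fetch_tiffs, build_pdf, upload_pdf):
    """One map task: S3 -> PDF -> S3. The three callables are
    hypothetical stand-ins for the S3 I/O and iText conversion steps."""
    tiff_pages = fetch_tiffs(article_id)      # read source TIFFs from S3
    pdf_bytes = build_pdf(tiff_pages)         # iText handled this step in the real job
    upload_pdf(article_id, pdf_bytes)         # store the finished PDF back in S3
    return article_id
```

Because every article converts independently, there is no reduce phase to speak of; the problem is embarrassingly parallel, which is exactly why 100 commodity instances could chew through it in a day.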
The cost? A total of $240.
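That figure is easy to sanity-check. Assuming EC2's on-demand rate of the era, roughly ten cents per small-instance-hour (an assumption on my part; the post doesn't itemize the bill), the arithmetic works out:

```python
# Back-of-envelope cost check. The 10-cents-per-instance-hour rate is an
# assumed approximation of EC2 small-instance on-demand pricing at the time.
instances = 100
hours = 24
rate_cents_per_instance_hour = 10
instance_hours = instances * hours  # 2,400 instance-hours
total_dollars = instance_hours * rate_cents_per_instance_hour / 100
print(total_dollars)  # 240.0
```

Two thousand four hundred instance-hours of compute for the price of a night out: that is the part the skeptics keep missing.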
NYT's use case is tactical in nature. A more recent example of a company using cloud computing to solve a strategic problem comes from Drop.io, the private file-sharing service. (Files stored on Drop.io are meant to be easy to share with friends, but tightly "locked down" otherwise, so that even Google can't expose your files, or the fact of their existence, to the world.) Yesterday, after a year of doing things the old-school data-center way, Drop.io announced that it has moved 100 percent of its infrastructure to Amazon's cloud. (Company co-founder Sam Lessin gives his reasoning in a blog post.)
So is cloud computing "complete gibberish"? Is it simply old-fashioned data-center computing with a bit of lip gloss? To me, it's about rethinking the role of infrastructure in a world where storage, bandwidth, and compute are astonishingly cheap and abundant. Think about it. If you could have all the storage, memory, bandwidth, and computing horsepower you could ever want, for the cost of a daily cup of coffee at Starbucks, what sorts of solutions would you build, and how would that change your life (and the lives of your customers)? That's the question -- not the answer -- implied by "cloud computing."