As someone who has been rather cynical on the subject of Big Data, I found it reassuring to see a full house at a recent Computer Weekly "CW500 Club" gathering dedicated to the subject. Nevertheless, I came away with more questions than answers about the value of this topic right now for business users.
As I mentioned in my recent blog post, the big guns were present and correct in logo form. Oracle, EMC, and IBM cropped up regularly with reference to infrastructure, alongside Quid and Connexcia building on top of Lucene. A great deal of time was spent discussing relational databases versus NoSQL schema-less data structures and Hadoop/MapReduce clusters.
So much so that you'd get the impression that the business case for Big Data was a done deal.
In the details of the discussion was a clue: Big Data was defined as "workloads that we just couldn't handle before." Big Data's three Vs of variety, volume and velocity certainly provide a framework around which the IT-literate can debate. However, it wasn't until the discussion turned to practical business scenarios in which this horsepower could be utilized that things got really interesting.
Here's a selection of some viewpoints expressed in the room:
- The longer and louder that we discuss Big Data, the more the public are aware of the volumes of data that are stored about them; therefore privacy becomes paramount
- Today, the easiest way to justify a business case for a Big Data project is to fold the effort into a storage consolidation program rather than tout the benefits of data analysis (even though that analysis is potentially hugely valuable for the business)
- Data quality and source management remain big open questions, especially with extra-organizational information. Pre-existing "data quality" tools can address structured data integrity, but when you bring in, say, external social media data, these tools may not suffice
Valuable as this debate was, especially to an audience of architects and information nerds (like me), one comment on the subject kept resonating in my head: that for many, at this stage in its lifespan, Big Data is "...a clever toolset desperately looking for a purpose..."
Whilst some organizations with well-considered information architectures may be able to jump right in and find real benefits right away, for many (or most) others, Big Data remains, at least at present, puzzlingly opaque.
So don't feel bad if you don't have a strategy. It's a work in progress for everyone.