The problem of dark matter in the information universe

It seems to me IDC may have missed (or at least skimmed over) some important conclusions in its newly released 2008 update of last year's widely cited The Expanding Digital Universe, which tries to outline the dimensions of the ongoing explosion of digital information.

Not surprisingly, the 2008 update finds that the 2007 estimate of the world's information content was too small. It turns out the 2007 digital universe was actually 281 billion gigabytes, about 10 percent bigger than IDC thought.

By 2011, IDC says in its new report, the digital universe will grow to 10 times its 2006 size. I suspect that when 2011 rolls around, this estimate will prove to be an underestimate as well.
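For a sense of what "10 times its 2006 size" by 2011 implies year over year, here is a back-of-the-envelope sketch. It assumes smooth compound growth over the five years, which is my simplification and not something IDC states; the function name is purely illustrative.

```python
def implied_annual_growth(multiple, years):
    """Constant yearly growth rate that yields `multiple` after `years` years."""
    return multiple ** (1.0 / years) - 1.0

# Tenfold growth between 2006 and 2011, assuming steady compounding.
rate = implied_annual_growth(10, 2011 - 2006)
print(f"Implied annual growth: {rate:.1%}")  # prints "Implied annual growth: 58.5%"
```

In other words, under that assumption the digital universe would have to grow by more than half every single year.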

The upshot of the report, of course, is that information continues to explode out of control. It is growing faster than we can store it (good news for EMC, which commissioned the report). In fact, IDC says that "by 2011, almost half of the digital universe will not have a permanent home."

What IDC has not gone on to point out (maybe because it is obvious?) is that "information" is being produced faster than human beings can possibly consume it, even with the aid of machines. From this, it follows that most information is destined to enter a zombie state, in which it is stored (and in many cases managed) but never consumed. One might call this the dark matter of the digital universe.

It seems there are two fundamental Laws of Information at work here:

  1. Information is vastly easier to create than to store
  2. Information is vastly easier to store than to dispose of

(And as Alan pointed out, vastly cheaper to store than parse.)

What does it mean for those of us in the content management business? Quite simply this: It is no longer enough just to manage content. The world needs new and better ways to make content consumable (i.e., process, transform, package, and ultimately serve it to audiences that can use it). This could explain the popularity of newly trendy document-purposing companies like Omtool, Optio, StreamServe, and Document Sciences (recently acquired by EMC). We can expect to see much more activity in this space.

Just as important as making existing information consumable is finding ways to identify and eradicate unneeded information. This is actually a much greater challenge, calling for new approaches and (quite possibly) new technologies. Zombie information is everywhere, and growing at an explosive rate. It will cost a fortune to store and manage going forward. We need to think about information-tenuring technology now, before we're overrun by the undead.
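To make "identifying unneeded information" a little more concrete, here is a deliberately naive sketch of a first-pass zombie detector: it walks a directory tree and flags files that have been neither read nor modified within some idle window. Everything here is an assumption for illustration: the function name, the one-year default, and above all the idea that file timestamps are a fair proxy for "consumed." Many systems mount volumes with access-time updates disabled, and real tenuring would also have to weigh retention policies and legal holds.

```python
import os
import time

def find_stale_files(root, max_idle_days=365, now=None):
    """Return (path, size_in_bytes) for files idle longer than max_idle_days.

    "Idle" means neither accessed nor modified since the cutoff, judged by
    filesystem timestamps (an assumption; atime is unreliable on some mounts).
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            last_used = max(info.st_atime, info.st_mtime)
            if last_used < cutoff:
                stale.append((path, info.st_size))
    return stale
```

Calling `find_stale_files("/some/share", max_idle_days=730)` (the path is a placeholder) yields candidates for review, not for automatic deletion. Deciding what is truly disposable is the hard part, and that is where new approaches are needed.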
