Real Story Group Enterprise Search blog posts
Copyright (c) 2012 RealStoryGroup.com, Inc. All Rights Reserved. http://www.realstorygroup.com/ www.realstorygroup.com : Blogs en-us 05/10/2012 00:00:00 60

Dark Data provides Lucid moment as Big Data turf war heats up #lucene #Cloud Thu, 10 May 2012 08:49 UTC http://www.realstorygroup.com/Blog/2354-Dark-Data-provides-Lucid-moment-as-Big-Data-turf-war-heats-up?source=RSS WARNING: If you're already getting "Big Data" buzzword sickness, look away now, as Lucid Imagination -- the professional open source distributor of the Apache Lucene/Solr search platform -- has announced a beta program for its formal entry into the "Big Data" turf war. It's called, predictably, LucidWorks Big Data.

Built on top of their LucidWorks Solr distribution and extended using a range of additional Apache projects (including the Mahout "Machine Learning" engine), this platform allows beta-approved customers to build Cloud-based sandboxes to test their data sources to a satisfactory level of accuracy, without building up the necessary architecture in-house.

Where this gets interesting is Lucid's reference to "Dark Data" -- their term for unstructured data -- and their acknowledgement that the vast majority of data retained within organizations is such dark matter. Much in the same way that IBM positioned the purchase of Vivisimo to perform a quality control or curation role as an entry point for unstructured data into a Big Data system, Lucid attempts the same with their LucidWorks Solr/Lucene distribution. Machine learning via Mahout adds the potential for some classification/categorization functionalities to be built into this curation process.  All caveats about machine learning still apply here.

As we mentioned last time, sandboxing is where many -- if not most -- customers are right now in their Big Data journey. Whether Lucid's beta program is suitable for these experiments will depend heavily upon your available skill sets to utilize the toolset.

Never Mind the Quality, Feel the Width - Big Data's emerging problem #ibm #Oracle Tue, 08 May 2012 11:51 UTC http://www.realstorygroup.com/Blog/2352-Never-Mind-the-Quality-Feel-the-Width-Big-Datas-emerging-problem?source=RSS Big Data may be a buzzword, yet it's certainly generating interesting discussions. Over the last month or two, I have been party to a number of really interesting sessions - such as the CW500 event I mentioned previously - and with recent acquisitions in this space, the question is becoming less about whether Big Data is possible, and more about how it can be applied in the enterprise.

The Problem of Data Quality for Unstructured Content

For me this raises the question of quality -- especially when dealing with unstructured data.

"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question." (Charles Babbage, Passages from the Life of a Philosopher).

Babbage's thoughts on the subject of data quality were neatly summarized by George Fuechsel a century or so later as "garbage in, garbage out," or "GIGO."

Understanding of data quality in the world of structured data (think ERP, CRM, BI) has reached a very high level of maturity. Unfortunately, the same cannot be said for the world of unstructured data, a.k.a. content. Ensuring that same level of quality for unstructured data, such that it doesn't skew subsequent analysis, is much harder.
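
To make "quality for unstructured data" a little more tangible, here is a minimal sketch of the kind of gate you might put in front of an analysis: it drops near-empty and duplicate documents before anything else sees them. The function names and the `min_words` threshold are hypothetical choices for illustration, not a recommendation or any vendor's method.

```python
import re
from typing import Iterable, List


def clean(text: str) -> str:
    """Collapse whitespace and lowercase so near-identical copies match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quality_gate(docs: Iterable[str], min_words: int = 5) -> List[str]:
    """Return the documents that pass basic checks, in original order.

    min_words is an assumed threshold; tune it for your own content.
    """
    seen = set()
    kept = []
    for doc in docs:
        key = clean(doc)
        if len(key.split()) < min_words:
            continue  # too short to analyze meaningfully
        if key in seen:
            continue  # duplicate of a document already kept
        seen.add(key)
        kept.append(doc)
    return kept


if __name__ == "__main__":
    sample = [
        "Great  product, would buy again today",
        "great product, would buy again today",
        "ok",
    ]
    print(quality_gate(sample))
```

Real content needs far more than this (language detection, spam filtering, source vetting), which is rather the point: the structured-data world already has mature tooling for the equivalent checks.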

Use Cases on Offer

Listening to Oracle discussing the possible outcomes from Big Data, you hear many references to use cases such as "smart meters" in domestic scenarios, or medical sensing equipment attached to patients. When scaled out, these examples can certainly produce vast quantities of data, and that data will almost certainly provide valuable insight once analyzed.

I would argue though that these are pretty limited use cases that simply extend existing applications.

They ignore the massive amount of true content: from short-form social, to long-form document text. Is this because such content is inherently not useful, or that the problem of quality makes it too hard to glean actionable results?

Reading IBM's own commentary on how they plan to apply their new Vivisimo technology to this problem suggests that they have at least recognized the issue exists. IBM envisions Vivisimo as a kind of content curation tool: federating sources and assembling data sets that have been filtered for quality and faceted together into logical collections. However, while this appears to be sensible in theory, it raises a question: why Vivisimo, rather than their pre-existing Content Analytics/Omnifind technologies? Might Enterprise Search find a new role across the board in this emerging area?

What You Should Do

Right now there is certainly a paucity of solid business cases for Big Data in the enterprise. There is no shortage of ideas and theories, but customers are still primarily sandboxing subsets of data, looking for indications that there are demonstrable returns on investment to come. As you look for suitable use cases, and your Big Data explorations turn more to unstructured data, remember GIGO and don't lose sight of data quality.

IBM and Vivisimo - Long live the federation? #ibm #search Wed, 25 Apr 2012 14:28 UTC http://www.realstorygroup.com/Blog/2344-IBM-and-Vivisimo-Long-live-the-federation?source=RSS Today IBM announced that it was acquiring search vendor Vivisimo for an undisclosed amount, purportedly to boost Big Blue's capabilities in Big Data.

Vivisimo's major strength historically was its ability to federate results from a variety of other search technologies and bundle them together into something useful. Given that IBM has something like half a dozen different search products at its disposal (including Content Analytics née Omnifind, iPhrase, WebSphere Portal Search, etc.) this might be an apposite moment for such a move.
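
For readers who haven't worked with federation, here is a minimal sketch of the idea, assuming each back-end engine returns (document id, score) pairs on its own scale. The engine names, documents, and scores are hypothetical, and this is a generic illustration rather than a description of how Vivisimo actually merges results.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, float]  # (document id, engine-specific relevance score)


def normalize(results: List[Result]) -> List[Result]:
    """Rescale one engine's scores to 0..1 so engines can be compared."""
    if not results:
        return []
    top = max(score for _, score in results)
    if top <= 0:
        return [(doc, 0.0) for doc, _ in results]
    return [(doc, score / top) for doc, score in results]


def federate(per_engine: Dict[str, List[Result]]) -> List[Result]:
    """Merge result lists from several engines into one ranked list."""
    best: Dict[str, float] = {}
    for results in per_engine.values():
        for doc, score in normalize(results):
            # Keep the highest normalized score when engines overlap.
            best[doc] = max(score, best.get(doc, 0.0))
    # Sort by score (descending), then by id for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical engines and scores, for illustration only.
    merged = federate({
        "portal_search": [("hr-policy", 12.0), ("travel-form", 6.0)],
        "content_analytics": [("travel-form", 0.9), ("expenses", 0.45)],
    })
    for doc, score in merged:
        print(f"{score:.2f}  {doc}")
```

The hard part in practice is not the merge itself but that each engine's scores mean something different; rescaling by the top score, as here, is only the crudest way of making them comparable.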

Interestingly, in the same press release, IBM also announced a partnership with Cloudera, an organization that has recently been partnering with Oracle on their new range of "Big Data" appliances. This brings access to Cloudera's Apache Hadoop distribution, which provides the muscle for much of the processing required for Big Data scenarios (along with allied Apache projects such as Hive).

Whether these two announcements are connected is not, at least at this stage, clear. What they do show is a measure of public intent from IBM not to let Oracle steal too much of a march on the enterprise market for Big Data and that, despite some comments to the contrary, federated search apparently ain't dead yet...

As always, existing IBM customers should not rush to buy into this latest offering just because it has a new owner.  For a detailed review of Vivisimo's capabilities, consult our Enterprise Search evaluations.

A portrait of the artist as a metadata manager #socialmedia #NABShow Tue, 24 Apr 2012 06:17 UTC http://www.realstorygroup.com/Blog/2340-A-portrait-of-the-artist-as-a-metadata-manager?source=RSS Art helps us to understand the world we live in. We can in fact think of art as metadata about the world (and artists as metadata experts of the human condition). Art is also ahead in showing us the path to the future, and digital art may provide some clues to the future in the content management world.

At the NAB show, futurist Marina Gorbis talked about the seismic shifts taking place in the world of content creation and used examples of installation art pieces to illustrate trends like algorithmic content and non-human content creation.

Exhibit 1 is what I call "real-time fashion". In 1998, artist Nancy Paterson created a stock market skirt. Economists say there is a relationship between the stock market and fashion fluctuations. If the market is doing well, the skirt length reduces, and if the markets are floundering, the length increases. The installation art consists of the mannequin Judy, a computer, several display monitors, and a mechanical system of motors, cables and pulleys. Some Perl scripts analyzed online stock quotes, and the skirt length adjusted accordingly.
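
The mapping from market to hemline is simple enough to sketch. The original used Perl; the Python below is purely illustrative, and every number in it (base length, centimetres per percentage point, limits) is a hypothetical value of mine, not taken from Paterson's installation.

```python
def hem_length_cm(
    pct_change: float,
    base_cm: float = 60.0,    # assumed length when the market is flat
    cm_per_pct: float = 5.0,  # assumed sensitivity to market moves
    min_cm: float = 30.0,     # assumed mechanical limits of the pulleys
    max_cm: float = 90.0,
) -> float:
    """Shorter skirt when the market rises, longer when it falls."""
    length = base_cm - pct_change * cm_per_pct
    # Clamp so that extreme market moves stay within the mechanism's range.
    return max(min_cm, min(max_cm, length))


if __name__ == "__main__":
    for change in (-2.0, 0.0, 2.0):
        print(f"{change:+.1f}% -> {hem_length_cm(change):.0f} cm")
```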

"Painting the town LED" is Exhibit 2. In Erik Krikortz's art project Emotional Cities, the aggregated responses to "How are you feeling today" are used to light up a city's skyscrapers and serve to illumine the city's zeitgeist. You'll recognize the more prosaic version of this as "sentiment analysis" which is increasingly being added to marketers' tool kit.

Exhibit 3, which Gorbis did not mention, is "Writing on the wall". The Think Exhibition at New York's Lincoln Center to commemorate IBM's centenary consisted of a 123-foot-long data visualization wall, which displayed dynamic patterns based on data feeds from the city's traffic, pollution, and sunshine indicators. The art on display was not a Da Vinci, but perhaps a Fibonacci.

Conventional notions of what constitutes art get challenged in these explorations. Similarly in an age of convergence, boundaries blur and traditional categorizations like platforms and channels can crumble.

Metadata so far has been operating in a linear, unidimensional fashion and has been helping make only basic connections and discovery for us. Are our information models, sense-making apparatus, and systems (and yes, even our own evaluations) ready for this brave beautiful world of multi-dimensional interconnectedness? What do we need to get ready?

Welcome your comments below...

Start Your Content Migration Planning Early #migration #pmot Mon, 23 Apr 2012 09:50 UTC http://www.realstorygroup.com/Blog/2336-Start-Your-Content-Migration-Planning-Early?source=RSS Many IT projects require some sort of content migration, but we often find that many customers give scant attention to migration planning. It's almost always considered an activity that can wait until the end.

We recommend that you make Content Migration a part of your overall project and start planning for it from the very outset. This will allow you to analyze your content and make improvements to overall content quality. You will also have a better chance of keeping your project on track and avoiding any major cost or schedule overruns.

In our recent advisory briefing, Start Your Content Migration Planning Early, we lay out what migration activities you need to plan or execute at each stage in a project, from team selection to piloting.

This advisory is for subscribers to our Collaboration, DAM, ECM, SharePoint, Search, CMS or Portals streams.

For more about content migration, you should also read our earlier advisory, A Guide to Successful Content Migration, in conjunction with this one.

Free Webinar - How Cloud, Mobile, and Social Will Change the World of Information Management #info360 #Cloud Fri, 20 Apr 2012 12:10 UTC http://www.realstorygroup.com/Blog/2335-Free-Webinar-How-Cloud-Mobile-and-Social-Will-Change-the-World-of-Information-Management?source=RSS Cloud, Mobile, and Social are three of the most common buzzwords in today's IT lexicon. The words are here to stay, but will the underlying concepts really bring about fundamental changes to the way we manage information? Or, are they more hype than substance?

On May 9, we'll be conducting a webinar that will answer those questions and shed light on:

  • How you should think about cloud options for your technology solutions
  • Creating a mobile strategy that actually improves, rather than hinders, the customer brand experience
  • Why implementing social tools without a proper business strategy can lead to disastrous results

You can register for this free webinar here

This webinar will be a preview of many of the topics that we will be discussing in depth at this year's info360 conference. In NYC on June 12-14, we'll be presenting a number of sessions including:

Social Workplace Market Overview 2012

Keynote: Consumerization of IT

Acronym Soup: ECM, WCM, CMS, WEM, CEM, DAM Dissected

The Right Way to Select Enterprise Collaboration Technology

DAM (Digital Asset Management) 101

Understanding the Marketing Technologist Toolkit

How to Negotiate the Right Price for Enterprise Software

 

The Advance Rates to the conference are available until May 4. We look forward to seeing many of you in New York!

2012 Enterprise Search Market Analysis - Do we now live in a Solr-system? #search #autonomy Wed, 11 Apr 2012 14:06 UTC http://www.realstorygroup.com/Blog/2327-2012-Enterprise-Search-Market-Analysis-Do-we-now-live-in-a-Solr-system?source=RSS We've recently published our "2012 Enterprise Search Market Analysis." After 2011's raft of acquisitions, this year promises to be another interesting one for search customers:

  • The fall-out from the HP/Autonomy and Oracle/Endeca deals in 2011 has created something of a vacuum in the upper tiers of the search market. To varying degrees, both of these deals will engender a degree of uncertainty amongst their existing and prospective customers alike until the future shape of the products becomes clearer.
     
  • In part because of all the industry M&A activity, Apache Solr has become almost a default option for many enterprises when approaching search-based projects. The question being increasingly asked is, "Can Solr do this?" Other players get considered only where Solr lacks functionality or maturity, or the cost of implementation/customisation becomes too onerous.
     
  • At the same time, longstanding players in this space - such as Coveo and Exalead - are continuing to focus less on Enterprise Search and more on producing abstraction tools for querying and visualising data. Although behind the scenes these platforms are still performing familiar search functions, the trend is following search buyers who have migrated from pure IT towards more business-focused tasks. It is those business users that are providing the contemporary scenarios for this technology.

These are just a few of the highlights from our 2012 Enterprise Search Market Overview, which is available for immediate download by search stream research subscribers, and for separate purchase by anyone else.

On top of an extended market and trends analysis, we highlight many of the vendors we cover and assess their current positioning and the potential risk they represent to you, the buyer and implementer of these technologies. As always, be sure to let us know if you have any questions.

Join us at Internet World, London #DAM #search Wed, 04 Apr 2012 07:31 UTC http://www.realstorygroup.com/Blog/2319-Join-us-at-Internet-World-London?source=RSS My UK colleague Matt Mullen and I will be hosting three sessions at the upcoming Internet World conference in London, April 24 - 26. Matt will be talking about the enterprise search market, and I'll be talking about mobile brand management, as well as how to select enterprise technology.

If you're a current RSG customer, we'd love to meet up with you (drop me a line). Or if you'd like to find out more about Real Story Group and our research, please drop into our sessions in London or at any of the many events where we'll be speaking.

Are you an Intelligent Customer? Lessons from HMRC #EntArch Tue, 03 Apr 2012 09:22 UTC http://www.realstorygroup.com/Blog/2315-Are-you-an-Intelligent-Customer-Lessons-from-HMRC?source=RSS This month's CW500 presentation in London was delivered by Phil Pavitt, CIO at HMRC (Her Majesty's Revenue and Customs, the UK tax man), who described a programme within his department to drive cost savings within IT. To provide some perspective on the IT estate that exists within HMRC: it is the largest in Europe and the 15th largest globally, controlling in excess of 1b transactions, totalling more than £1.2t in payments through the government banking system. (This is apparently larger than HSBC performs, globally.)

Historically, IT in UK government hasn't had the greatest of reputations for either efficiency or reliability -- something that a cynic could trace back to the UK's first IT project, Babbage's "Difference Engine" -- and it is against that background, plus the edict to cut 33% of back-office costs from HMRC's operations, that Pavitt and his team have been undertaking their project. Thus far, in a little under two years, the team has cut a projected £175m over the next 5 years in software license consolidation alone, whilst reducing the number of active platforms from in excess of 900 down to a current figure of around 150.

The latter is significant. People used to joke that HMRC had purchased a licence "for every commercial software product ever released." But how had this situation been allowed to grow out of control?

Pavitt put it plainly: HMRC was not "an intelligent customer." Its IT procurement had become divorced from any business strategy, and shadow IT -- where departments bought or commissioned their own systems without involving IT in the process -- became increasingly common. This tends to be symptomatic of a broader lack of satisfaction with IT, so along with reaching out to the greater organisation to produce a formal "business-led IT strategy," Pavitt's team moved to increase rigor when acquiring software and services.

Primarily, this rigor manifests itself in the use of what Pavitt calls "Tripartite and Benchmarking" (derived from a previous UK government "Resilience Benchmarking Project" focused on UK financial systems). When applied to HMRC, this boils down to creating a system of verification and validation of processes prior to requesting quotes and entering into a cost-matching exercise. At RSG we'd refer to this as "Scenario-based Selection" -- i.e., "What are we actually using the platform for?" This is something we have historically believed is essential to a successful purchase cycle, long before trawling for quotes from suppliers.

If you're not currently a subscriber and would like to learn more about our approach to research, please download a free sample of our work here. Current subscribers can rest assured that whilst large enterprises like HMRC are catching on to methodologies like ours, we're always working to make sure we stay a few steps ahead.

Digital workplace and enterprise architecture -- two sides to same coin #EntArch #intranet Mon, 19 Mar 2012 13:10 UTC http://www.realstorygroup.com/Blog/2311-Digital-workplace-and-enterprise-architecture-two-sides-to-same-coin?source=RSS You may have heard of the emerging concept of the "Digital Workplace:" where employees go to get work done digitally. Much of the current discussion has centered around what notions of a digital workplace mean for traditional intranets, emerging social collaboration spaces, and aging transactional systems.

Those are important topics, but I think an even bigger to-do for enterprises is to bring the right skill sets to bear. One key skill set to engage here is enterprise architecture.

If you examine the individual applications and platforms that employees access to complete work every day, you end up charting a myriad of different systems in the typical enterprise. Some of these may be loosely aggregated within a portal, while others may not. The digital workplace concept, though, helpfully turns the table around by looking at it from the standpoint of the employee, rather than the enterprise. There's clearly an opportunity to apply well-known user experience methodologies -- such as User Centered Design (UCD) -- to improve your colleagues' effectiveness here.

But as you dig deeper into the employee digital experience, you'll discover more than just clunky, freestanding applications. You'll find:

  • Information and process flows that span multiple systems
  • Siloed data and content
  • COTS vendors pushing their own separate mobile apps
  • Diverse security and information access needs

Cataloging and understanding the business value of all these systems is the job of an enterprise architect. If you are trying to transform your digital workplace, then you'd do well to engage your enterprise architecture team -- you have one, right? -- in reconstructing the pieces into a greater whole.

Just remember that the point is not to re-arrange boxes and arrows to work your way forward from back-end systems to employees, but rather to re-arrange what happens on your colleagues' screens by working your way backwards. Is it too simple to say, UCD + EA = New Digital Workplace?

If you're an enterprise architect tackling some of these issues, I'd love to hear from you.

Big Data: Does Variety, Volume, and Velocity really deliver Veracity? #EMC #Oracle Mon, 12 Mar 2012 07:50 UTC http://www.realstorygroup.com/Blog/2307-Big-Data:-Does-Variety-Volume-and-Velocity-really-deliver-Veracity?source=RSS Having been rather cynical on the subject of Big Data, it was reassuring to see a full house at a recent Computer Weekly "CW500 Club" gathering dedicated to the subject. Nevertheless, I came away with more questions than answers about the value of this topic right now for business users.

As I mentioned in my recent blog post, the big guns were present and correct in logo form. Oracle, EMC, and IBM cropped up regularly with reference to infrastructure, alongside Quid and Connexcia building on top of Lucene. A great deal of time was spent discussing relational databases versus NoSQL schema-less data structures and Hadoop / MapReduce clusters.

So much so, that you'd get the impression that the business case for Big Data was a done deal.

In the details of the discussion was a clue: Big Data was defined as "workloads that we just couldn't handle before." Big Data's three-Vs of variety, volume and velocity certainly provide a framework around which the IT-literate can debate. However, it wasn't until the discussion turned to practical business scenarios in which this horsepower could be utilized that things got really interesting.

Here's a selection of some viewpoints expressed in the room:

  • The longer and louder that we discuss Big Data, the more the public are aware of the volumes of data that are stored about them; therefore privacy becomes paramount
     
  • Today, the easiest way to justify a business case for a Big Data project is to fold the effort into a storage consolidation program, rather than tout the benefits of data analysis (even though that analysis is potentially hugely valuable for the business)
     
  • Data quality and source management remain big open questions, especially with extra-organizational information. Pre-existing "data quality" tools can address structured data integrity, but when you bring in, say, external social media data, these tools may not suffice

Valuable as this debate was -- especially to an audience of architects and information nerds (like me) -- it did make a  comment on the subject resonate in my head, that for many at this stage in its lifespan Big Data is "...a clever toolset desperately looking for a purpose..."

Whilst some organizations with well-considered information architectures may be able to jump right in and find real benefits right away, for many (or most) others, Big Data remains, at least at present, puzzlingly opaque.

So don't feel bad if you don't have a strategy.  It's a work in progress for everyone.

UK Government Cloudstore - not yet ready for prime time #Gov20 #Cloud Thu, 23 Feb 2012 09:14 UTC http://www.realstorygroup.com/Blog/2299-UK-Government-Cloudstore-not-yet-ready-for-prime-time?source=RSS The IT press here in the UK have heralded the remarkable launch of the government's G-Cloud Cloudstore. The timing was certainly impressive: In just four short weeks this online services directory and procurement application went from contract to launch.

Yet I remain a bit puzzled as to why the service has gone live at all, since it's clearly far from complete, usable, or reliable. The fact that it took four weeks to build what is essentially a simple web app is all well and good (even though it repeatedly delivered error pages in my test). But the real value in this sort of application should derive from the quality of the information it delivers. Currently the quality of that information is dismal.

From what I can gather 250 firms submitted information on a total of 1500 services they could deliver to the public sector, and all of them have gotten duly listed on the site.

That was the first red flag, and indeed further investigation reveals that as of now not one of those services has been tested or certified in any way at all; the claims have just been taken at face value. Even so, Cloudstore allows you to circumvent thorough tendering processes through the Official Journal of the European Community (OJEC), yet cannot guarantee in any way the quality or fit of the services it promotes.

I found:

  • Services offered that run on products that do not meet Government standards
  • A dominance of the usual major IT suppliers, claiming to offer almost anything (regardless of actual expertise)
  • Many well known and experienced government IT suppliers missing
  • Currently no details on what future accreditation will actually mean or demand of a supplier or service

This last point in particular concerns me deeply. Surely it seems fair to assume that "accredited" services and suppliers will have had their organization vetted, that they are viable and solvent, that they have experienced and reliable products and support, and that they meet technical standards.  Buyers will also want a solid understanding of these criteria and how they were met. Surely this is the basic due diligence that any buyer has to undertake? Yet I found no indication here of what accredited status will mean, how it will be administered, and how it will be checked and maintained.

I had high hopes for this system. Perhaps it was just a really bad idea to launch it now, long before the real work has been done. To my mind this store should not have gone live until all the services and suppliers had been thoroughly vetted and assessed. Expert reviews of the remaining suppliers and further research of the market to ensure that a wide array of viable options is actually represented (rather than just those that have volunteered themselves) also seem necessary.

Until that time, this has little more value than a search on Google or Craigslist, and comes with potentially more risk than those, since it intends to help you shortcut necessary selection steps.

I'm all for speeding up and removing unnecessary bureaucracy from the procurement process, which is actually one focus area of our business; but to repeat, the web application itself is not the important thing here. You can label it "cloud" and get more buzz, but at the end of the day it's the quality and veracity of the information delivered that matters most. That information is nowhere near ready for consumption right now.

Big Data plus Enterprise Search equals Big Enterprise Disappointment? #autonomy #EMC Mon, 20 Feb 2012 09:43 UTC http://www.realstorygroup.com/Blog/2297-Big-Data-plus-Enterprise-Search-equals-Big-Enterprise-Disappointment?source=RSS This week, whilst I sat on one of London's First Capital Connect's delightful 1950s railway carriages commuting to the RSG UK office, I imagined a conversation a decade or so hence...

"So, Uncle Matt, what do you remember most about 2012? Was it the London Olympics?"

"Well my impertinent nephew, 2012 was in fact the year we learned about 'Big Data'."

" 'Big' Data? Wasn't there much data around before 2012?"

"Oh there was loads of it. It was just that until then, we'd not been told to notice it. Not until an unholy alliance of analysts and software and storage companies like IBM and EMC began to point out that it was really exciting. And coincidentally enough, it turns out they had all sorts of solutions to manage all this data."

"Wow that was lucky. So, did they work?"

"No of course not. But then again they were the same tools we'd been using for ages, and they'd never worked then either."

This imagined interchange echoes some of the conversations we've been having within RSG about the Enterprise Search market. For much like "Big Data," Enterprise Search has been around a long while, but has never really fulfilled its promise. And Enterprise Search -- just like Big Data -- is due for a resurgence in the market, to get cast as something new and wonderful, disregarding the fact that it's not new and didn't work the first time around.

Consider the following:

  • The term "Search" itself provides a negative connotation as it tells you there is a problem, that stuff has been lost. Just as Big Data accurately describes a lot of "stuff" --  much of which we have no idea how it got there, and the bulk of which is likely junk.
     
  • A history of overpromising that Search will magically fix everything, when it patently won't. Just as we are told that slicing and dicing data at ever more granular points will somehow give our businesses ever more accurate insights.
     
  • The myth that search engines can extract gold from junk has long since been debunked, just as anyone who has ever tried running data warehouses full of dirty data can attest: it just doesn't work.


  • Bundled search with business applications is "good enough" for most people (particularly as expectations are low), just as most basic BI (Business Intelligence) reporting is more than enough for most business folk.
     
  • Web searching is a totally different challenge to enterprise searching, but few seem to understand that differentiation. Just as mining big data is a totally different paradigm from analyzing a specific local dataset.

From recent conversations with subscription customers, it's clear that the underlying business problems that led to so many big search projects are still there. Burned by their previous experiences with big search projects, however, organizations are learning to bite off ever smaller chunks of these issues. If anything the Enterprise Search market has stalled or even gone backwards these last few years.

Fortunately the crazy hype around Big Data has yet to infect the Enterprise Search market, but following Oracle's acquisition of Endeca and HP's crazy year in which they acquired Autonomy, there is undoubtedly a vacuum in the heart of the Enterprise Search market ready to be filled.

But is it going to be yet more hype and unfulfilled promises that flood into that gap, or will we move on and finally start to deliver on the promise?  What do you think?

Matt Mullen joins our Analyst Team #cms #search Mon, 13 Feb 2012 09:00 UTC http://www.realstorygroup.com/Blog/2293-Matt-Mullen-joins-our-Analyst-Team?source=RSS Today I am very happy to share that joining me in the UK office of RSG is new analyst Matt Mullen.  A self-confessed nerd, Matt has been working with online and information management technologies for almost 20 years, dating himself awkwardly by admitting he started out with text-based browsers in the early 1990s.

His key areas of focus at RSG will be Web Content Management and Search technologies, with a watching brief on other areas such as social and the semantic web.  Those who have seen Matt speak at conferences in the past will know that he explains highly complex technical issues in a down-to-earth, clear-cut, and humorous way.

With our new hires, we hope to be able to better support our growing ranks of advisory service customers in the US, Europe, and wherever else you may be located.

Three lessons across ten years of content technologies #trends #EntArch Wed, 08 Feb 2012 14:28 UTC http://www.realstorygroup.com/Blog/2290-Three-lessons-across-ten-years-of-content-technologies?source=RSS Late last year we marked our tenth year in business, which seemed like a good time to reflect on what has changed across the landscape of content, web, and collaboration technologies -- and what has not.

So I came up with three major lessons that seemed worth sharing:

  1. Your internal competencies may be the biggest factor in your success implementing technology
  2. Every prediction of significant marketplace "consolidation" has been wrong, although M&A activity has proven painful for customers
  3. What makes your enterprise unique (and therefore successful) also plays out in how you apply technology

As always, we welcome your thoughts in the comments below...

]]>
Alfresco Version 4 is Buzzword Compliant #Cloud #mobile Tue, 07 Feb 2012 12:13 UTC http://www.realstorygroup.com/Blog/2286-Alfresco-Version-4-is-Buzzword-Compliant?source=RSS Last week open source document management vendor Alfresco released Version 4 of its (commercially-supported) enterprise edition package. As we've come to expect from Alfresco, it's long on buzzwords and interesting new directions, but a bit short on functional niceties and architectural continuity.

The key features and implications for what Alfresco calls its "Cloud Connected Content Platform" are:

  • The ability to publish content to external channels, such as YouTube, Facebook, SlideShare, Twitter and LinkedIn. However, you can only publish assets from Alfresco's document library to these channels. This is really different from, say, publishing a simpler web post to Facebook (unless of course you manage that post in Document Library).

  • A new module to transfer files from Alfresco's repository to a file system.

  • Replacing its Lucene-based internal search with a Solr-based alternative. Granted, Solr is based on Lucene, but now all the plumbing required to make Lucene work gets done by Solr and not Alfresco. This also means you will have to recreate your indexes and many services (such as blog and discussions) that used Lucene queries will now employ database queries. This should not have any impact on your applications themselves, but you should still test it carefully. You also can't use Solr if you employ the old AVM-based WCM or use Alfresco in multi-tenant mode. The latter prohibition is rather surprising given Alfresco's upcoming cloud service.

  • Integration of Alfresco's in-house Activiti workflow engine in lieu of the incumbent JBPM. We have covered this before in our reports and it was only a matter of time before it happened. JBPM is still included in the base package for the time being but will remain disabled by default in new installations. I suspect it will slowly be deprecated over the next few versions. So this would be a good time to think of how you will migrate your workflows from JBPM to Activiti.

  • A new app to access content from mobile devices. For now, Alfresco seems to have focused only on Apple's iOS and mainly on the iPad. This is probably a sensible prioritization because tablets (and in particular iPads) have a dominant share within enterprises. Alfresco is also working on an integration with Dropbox, which would offer two key features: the ability to access Alfresco from all the devices that Dropbox supports, as well as critical synchronization capabilities for things like document check-out, offline work, and multi-device sync.

There are various other changes too, but as you can probably make out, the big story seems to be around cloud. Alfresco plans to launch its cloud-based offering later this year, based on Version 4. Much of this is really new and by that I don't mean just in terms of technology. For example, their proposed integration with Dropbox really tries to marry enterprise functionality with one that is consumer facing. We can't say how this will pan out in organizations but we'll keep watching.

Meanwhile, you should remain skeptical about anything that uses "Cloud" in any way and ask tough questions. Fortunately, there's help easily available. Check out this advisory paper: Are cloud-based file-sharing services too immature for the enterprise?

]]>
Updated Technology Vendor Map #trends #EntArch Mon, 30 Jan 2012 16:33 UTC http://www.realstorygroup.com/Blog/2283-Updated-Technology-Vendor-Map?source=RSS We've just updated our longstanding "subway map" for 2012.

Biggest changes have come around acquisitions (e.g., HP), and a fast-moving Red Line (a.k.a., Collaboration & Social), as well as a number of key entrants in the Digital & Media Asset Management segment.

[Image: Real Story Group Vendor Subway Map, 2011]

For higher-resolution JPG and nicely-printable PDF versions of the map, visit
http://www.realstorygroup.com/vendormap/

Hope you find it helpful!

]]>
Why Enterprise Search is not in the limelight #EntArch #EnSW Wed, 18 Jan 2012 18:33 UTC http://www.realstorygroup.com/Blog/2271-Why-Enterprise-Search-is-not-in-the-limelight?source=RSS In the enterprise search community there has been a lot of talk recently about the lack of deep coverage by major analyst firms such as Gartner and Forrester.  Many feel slighted and believe that their industry is a large and thriving one, one that is unjustly ignored.

The reality is that search remains an important element within the information management spectrum, yet big enterprise search projects are thin on the ground.  Most buyers default to whatever search engine is bundled with their enterprise licensing deals (IBM, Microsoft, Oracle, et al.) and few RFP/Tender documents are issued for specialist search vendors to bid upon.

Nevertheless enterprise search in its broadest sense is a multi-billion dollar industry, and pretty much every enterprise globally makes some use of search technology. So why the lack of visibility and so few major search projects?  Simply put, much of the money in enterprise search is in basic document filters.

Document filters are core components of search systems. They parse the many different electronic document types (Word/Excel/Pages/Keynote, etc.) used within a typical organization, thus making them indexable, searchable, and more broadly viewable. It is a crucial if basic job, and one that runs in the background -- as technology embedded within other technology. 
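To make the idea concrete, here is a minimal sketch of what a filter layer does, using only the Python standard library. It is an illustration under stated assumptions, not how Keyview, the INSO filters, or ISYS actually work: the `FILTERS` registry, the function names, and the choice of just two formats (plain text and HTML) are all hypothetical. Commercial filters earn their license fees by handling hundreds of far messier binary formats.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collects the character data of an HTML document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty runs of text (this naive version also keeps
        # script and style content, which a real filter would drop).
        if data.strip():
            self.chunks.append(data.strip())


def filter_plain_text(raw: bytes) -> str:
    """Filter for .txt files: decode the bytes, replacing anything undecodable."""
    return raw.decode("utf-8", errors="replace")


def filter_html(raw: bytes) -> str:
    """Filter for HTML files: strip the markup and keep the visible text."""
    collector = _TextCollector()
    collector.feed(raw.decode("utf-8", errors="replace"))
    collector.close()
    return " ".join(collector.chunks)


# Hypothetical registry: one filter function per file extension.
FILTERS = {".txt": filter_plain_text, ".html": filter_html, ".htm": filter_html}


def extract_text(filename: str, raw: bytes) -> str:
    """Pick a filter by file extension and return indexable plain text."""
    suffix = Path(filename).suffix.lower()
    try:
        return FILTERS[suffix](raw)
    except KeyError:
        raise ValueError(f"no filter registered for {suffix!r}") from None
```

Calling `extract_text("memo.html", raw_bytes)` hands the indexer plain text regardless of the source format. Every new or changed document type means another entry in that registry, which is the maintenance burden at issue here.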

Almost every major business software supplier makes use of such filters, and when you stop to consider the vast number of document types and how often they change, you can understand why most software suppliers don't try to build their own.  Rather they license well-maintained filters from Oracle, HP (Autonomy), or ISYS.

Indeed the licensing of filters is so pervasive and lucrative that some argued at the time of HP's $11B acquisition of Autonomy that much if not most of Autonomy's "IDOL" revenue actually came from their Keyview document filters -- rather than their flagship IDOL enterprise search systems. Oracle also retained and grew a very healthy license revenue stream when they acquired Stellent's INSO filters, and independent search specialist ISYS does a healthy trade in filters too. Taken together it is fair to say that filters make up the lion's share of enterprise search related spend. Yet frankly as a technology sector to watch and comment on there is little that is less interesting. What would you rather study and write about, the new Ferrari or the oil filters it uses?

So what ever happened to real enterprise search, those mega projects involving federated search across multiple silos of information, that normalize search sets from many different sources and types into a cohesive singularity?  Well those projects still do exist.  But they are rare, expensive, complex, and have a high failure rate.
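Part of what makes these projects hard is that each silo scores relevance on its own scale, so results cannot simply be concatenated. The sketch below shows one simple way to merge them: min-max normalization of each source's scores before producing a single ranking. The source names and scores are made up, and this is only one of several merging strategies, not a description of any vendor's product.

```python
def normalize(hits):
    """Rescale one source's raw scores to the 0..1 range (min-max normalization)."""
    if not hits:
        return []
    scores = [score for _, score in hits]
    low, high = min(scores), max(scores)
    span = high - low
    # If every hit has the same score, treat them all as top-ranked.
    return [(doc, 1.0 if span == 0 else (score - low) / span) for doc, score in hits]


def federate(results_by_source):
    """Merge per-source hit lists into one ranking, best normalized score first."""
    merged = []
    for source, hits in results_by_source.items():
        merged.extend((score, source, doc) for doc, score in normalize(hits))
    # Python's sort is stable, so ties keep the order in which sources were given.
    merged.sort(key=lambda item: item[0], reverse=True)
    return [(source, doc, score) for score, source, doc in merged]


# Made-up raw results: each silo scores on a different scale.
results = {
    "file_share": [("budget.xls", 12.0), ("memo.doc", 4.0)],
    "wiki": [("Budget page", 80.0), ("Home", 20.0), ("FAQ", 50.0)],
}
ranking = federate(results)
```

Even this toy version hides real problems: a source's top hit always normalizes to 1.0, however poor it is. That is one reason federated results so often disappoint.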

Enterprise search will always be with us, as a critical component of any information management strategy. But there's the rub: it's just a component, typically dwarfed by other elements and playing a supporting role at best.

The danger here of course is that buyers can underestimate the importance of getting the right search technology, and the burden of maintaining and managing the search environment. When technology is simply bundled into a deal, there's a temptation to ignore it.

Don't ignore search.  Search can be labor intensive, and with a lack of skilled resources can degrade to the point of uselessness over time. Search needs to be taken seriously, and it would be good if it got more of the limelight. But the reality in 2012 is that specialized search technology options such as dtSearch, Solr, Recommind, ISYS, or Funnelback can't and shouldn't occupy the limelight.  The limelight should be on the proper practices and resources needed to manage a search environment and the difficulty most organizations face in obtaining them. Most organizations today already have search technology; it was typically thrown into a deal without any further discussion. It is the ability to fully exploit the technology that they lack.

]]>
2012 Technology Predictions #sharepoint #DAM Tue, 06 Dec 2011 13:54 UTC http://www.realstorygroup.com/Blog/2260-2012-Technology-Predictions?source=RSS It's that time of year for our team of Real Story Group analysts to reveal our 2012 predictions, where we try to predict what the future holds in the technology world.

This is our sixth year in a row doing this humbling exercise. If you'd like to see how we've done previously, you can view past predictions here: 2011, 2010, 2009, 2008, and 2007.

Here are our 2012 technology predictions:

1. Big data meets web marketing
Digital marketing systems -- from analytics, to adaptive personalization, to social media monitoring platforms -- generate huge amounts of data. The ability to extract and leverage meaningful nuggets from these vast stores of information represents a persistent but increasingly important challenge for marketing specialists. 2012 will see specialist (typically SaaS...see below) vendors pull away from the pack of integrated WCM suites and other adjacent technologies that implement e-marketing functionality as a simple, add-on service.



2. Enterprise search marketplace opens up...again
The major vendors in this space are undergoing substantial transformation: FAST is getting sucked into the SharePoint vortex; Autonomy is facing an unclear future under HP; and Endeca remains fitful and distracted. Look for upstart vendors to fill the void as they did earlier this decade when the market was more open. In particular, look for specific applications based on the open source platform, Lucene.



3. Social services get called on the carpet in SharePoint
SharePoint has seen stratospheric, often viral growth in enterprises around the world. Licensees are beginning to discover, however, that its lack of contemporary social networking services and polished collaboration applications are limiting its effectiveness and driving business units to self-provision other tools. 2012 will see the rise of a variety of SharePoint-specific, supplementary offerings, from new and existing vendors alike.



4. CRM and CMS on a collision course
Customer Relationship Management (CRM) and Content Management Systems (CMS) have long been central pieces in the digital marketing toolkit; however the lines between these two systems will continue to blur in 2012. More and more, marketers want to set content and interaction experiences based on customer interaction, so CMS vendors continue to add CRM features, while CRM systems add more web publishing features. In the long run, we think integration is more promising than convergence; in the meantime, expect some messy collisions.

5. Death of the intranet as we know it
Intranet managers still have a key role to play in enterprise collaboration and information management, but employee expectations and the role of the intranet have changed dramatically over the past few years. Savvy companies will focus on the broader employee experience in a mobile, "digital workplace." 2012 will see a significant reallocation of resources from corporate communications to more business-oriented functionality.

6. BPM springs back to life
Process still matters, and workflow applications continue to dominate enterprise document and records management efforts. 2012 will see a renewed interest in good, old-fashioned BPM, as enterprises seek to orchestrate activities across organizational boundaries, including partner and supplier systems.

7. Rich media goes mainstream in the enterprise
Video is no longer an emerging technology for the enterprise. New social initiatives in particular will bring more media into internal systems. To be sure, a gulf remains in production quality (between professional and amateur), and employees will continue to look for increasingly sophisticated capabilities as both media producers and consumers. In 2012, enterprises will respond with specific, rich-media initiatives.

8. Big data blows into the cloud
More and more information management systems are generating or leveraging "big data." Yet, many enterprises don't have the resources, capacity, or expertise to properly store and mine this data. Fortunately, "big data" characteristics (such as unpredictable data inflow rates and the need for elastic processing capacity) make it a natural fit for the cloud. As a result, data-rich applications -- such as social media monitoring -- will increasingly go to market with SaaS-only delivery models.

9. Pervasive mobile-only apps
2011's mantra could have been "mobile first." 2012 will see "mobile first and last," as enterprises develop mobile-only interfaces to certain internal applications without focusing any effort on traditional, web-based (desktop/laptop) UIs. Many of these mobile apps will consist of specialized mashups among existing systems. A key driver here is the inexorable rise of tablets. We'll also see interesting examples where enterprises will tweak business processes to leverage tablets (e.g., in-store tablet catalogs).

10. New job titles emerge
Major technical and operational changes are driving new roles -- often informal, hybrid roles -- within the enterprise. 2012 will see the formalization of some of these roles into broadly recognized job titles. Samples include:

  • Marketing Technologist - to master the increasingly complex services around e-marketing at scale
  • Social Media Monitor - to interpret, understand, learn from, and respond to the fire hose of relevant activity on public social networks
  • Enterprise Community Facilitator - to support localized community managers and foster productive cross-silo interaction
  • Enterprise Media Producer - to produce or edit high volumes of video for internal and external consumption
  • Director or VP of Digital Assets / Digital Media Manager - formal DAM roles emerge to establish ownership -- not just of assets, but of the systems and metadata -- of DAM and MAM

11. Security fears rise: phones, tablets, portable drives, the cloud -- where is our content?
Nearly everyone is a mobile worker. The proliferation of smartphones and tablets means that employees are walking around with disk drives containing company information. A lost or stolen phone or tablet containing sensitive information will likely cause a backlash in enterprise security departments. We've already heard of some highly regulated enterprises banning enterprise access from employee phones. For many employees, 2012 will bring even more rules and regulations around how they can use their mobile devices and renewed enterprise interest in digital rights management applications.

12. Lines blur between commercial and open source technologies
In the WCM and portal marketplaces, major open source projects are "commercializing" fairly rapidly, while many (though certainly not all) commercial vendors are adopting more open development and support models. This means that in 2012, customers will see increasingly less distinction between commercial vendors and "commercial open source" suppliers. The bigger gulf -- though it remains largely one of licensing -- is emerging between commercially-oriented open source projects and community-oriented projects across the WCM and portal landscapes.

Here is RSG's Alan Pelz-Sharpe to shed some more light on our predictions:

]]>
How accurate were our 2011 predictions? #ecm #e20 Mon, 05 Dec 2011 13:58 UTC http://www.realstorygroup.com/Blog/2259-How-accurate-were-our-2011-predictions?source=RSS Like all analyst firms, every year-end we make predictions. I think we're fairly unique, though, in going back to see how our earlier predictions fared. Let's see whether our predictions for 2011 actually panned out. These were the twelve predictions we made in December, 2010:

1) "Bring Your Own Device" policies will push HTML5 adoption for mobile access to enterprise applications
This has definitely happened, although HTML5 adoption remains quite incomplete.

2) Content-rich customers will rebel against Web CMS marketing spins
The key phrase here is "content-rich." We've definitely seen some disappointment among high-volume publishers (e.g., media), who don't want e-marketing and other corollary services from their WCM vendor.

3) Microsoft will turn to partners to fix SharePoint shortcomings
Yep. It's an old story. As a SharePoint version ages, Redmond stops arguing that it's feature-complete and encourages customers to seek out supplementary tools from partners.

4) The top end of the Web CMS market will be redefined
This happened. OpenText/Vignette and IBM continue to fade. EMC gave up on Documentum Web Publisher, Oracle is effectively deprecating its UCM (née Stellent) WCM in favor of its new FatWire acquisition, and the future of Interwoven TeamSite/LiveSite has gotten even dicier with HP's acquisition of Autonomy. The big boys, long fading, have almost disappeared, getting replaced by a bevy of more focused alternatives.

5) Intranet community managers will adopt public social functionality
We've definitely seen more of this, though it's still not pervasive.

6) SaaS vendors will try to separate from "The Cloud"
Nope, we were wrong. If anything the opposite: SaaS vendors have uniformly embraced fluffy Cloud terminology to ride that wave of hype. This was the case even among those SaaS players that don't actually employ Cloud-based technology.

7) Buyers will have a greater acceptance of newer standards
In some cases (CMIS, HTML5) yes, but in other cases (activitystrea.ms, OpenSocial) not so much, yet.

8) Case Management will become the leading application from high-end ECM vendors
Absolutely. Case management actually constitutes a family of applications relevant to many different types of organizations. It's where much of the non-SharePoint ECM action lies today.

9) Digital Asset Management vendors will greatly expand video management capabilities
This happened, though perhaps not as heavily as we predicted. It remains unclear whether traditional DAM players have the chops to compete effectively with more video-oriented vendors.

10) E-mail will remain the world's de-facto enterprise document repository and workflow system
Alas, we were correct. Companies ditching e-mail remain very much the exception and not the rule.

11) Portal software will increasingly produce services for other portals
Hard to know exactly. Major enterprise software gets talked about more than implemented, but portals seem the opposite. Enterprises who have committed heavily to portal technology continue to invest in those platforms -- or perhaps more tellingly -- in the systems around those platforms. Yet more enterprises are getting comfortable with multiple portals, including "portal lite" applications.

12) Specialized talent around managing content will begin to migrate out of large corporations
This is an organic trend that's difficult to track, but we saw increasing evidence of it last year, as consultancies and integrators desperate for more talent continued to comb the ranks of enterprises for experienced specialists. There's no recession in information management technology right now...

By my count, we were about 9 out of 12, or 75%. Not bad, not awesome. About the same rate as previous years -- and a reasonable co-efficient to add to our 2012 predictions. Look for those new predictions from us in the next few days....

]]>
Storage Wars in the Cloud #storage #Cloud Thu, 17 Nov 2011 13:55 UTC http://www.realstorygroup.com/Blog/2252-Storage-Wars-in-the-Cloud?source=RSS OK, I'll admit it. One of my guilty pleasures in recent years has been the American television show, Storage Wars. If you are unfamiliar with the show, the premise is simple. When storage units are abandoned, they are put up for auction. The show follows a group of potential buyers who, after getting only 5 minutes, bid on the contents of the storage units. The highest bidder then takes ownership of the storage unit's contents, which they try to re-sell for more than they paid for that storage unit. As you might expect, the buyers end up with lots of trash, but occasionally they find a gem (literally and figuratively) that enables them to turn quite a profit.

The side of the story that's glossed over on the show, though, is why someone abandoned the storage units in the first place. Were they unable to pay their storage rental bills due to hardship? forgetfulness? death? Surely, the original owners never expected a stranger to buy their possessions (prized or otherwise), let alone for it to happen on a television show.

I couldn't help but think of the parallels between physical storage space and the seemingly limitless digital space available in "the cloud." I'm sure most of us think that we'll be the only ones who will ever have access to our photos on Facebook, our contacts in LinkedIn, our e-mails in Gmail, our files in Dropbox or Box.net or Office 365.

Much attention has been paid to the security of the cloud from hackers - and rightly so - but users of cloud storage providers should also get agreements in place that clearly state what happens if for some reason we stop paying our storage bills. Who owns the content? Is the content transferable to another owner? Is it transferable to another system? Does the content ever get destroyed?

The producers of Storage Wars are not likely to be thinking of a digital version of the show; but on TV or not, I doubt any of us would want our content to simply go to the highest bidder.

]]>
New evaluation of taxonomy management tools #KMers #search Wed, 16 Nov 2011 13:21 UTC http://www.realstorygroup.com/Blog/2251-New-evaluation-of-taxonomy-management-tools?source=RSS Taxonomy management tools are starting to gain traction in the business world, and as a result the vendor marketplace is evolving rapidly.

As corporate requirements for search and content management have intensified, so has the demand for tools that help organizations create, administer, and publish semantic structures.

You the customer can choose from among several fully-featured taxonomy management tools, yet each vendor has tackled the problem of managing vocabularies from a different angle. So how do you figure out which one is right for your context?

I've just authored an advisory paper that evaluates a collection of leading enterprise-level taxonomy management tools head to head:
·    Semaphore Ontology Manager (Smartlogic)
·    Synaptica
·    Data Harmony Thesaurus Master
·    TopBraid Enterprise Vocabulary Net (TopQuadrant)
·    Intelligent Topic Manager (Mondeca)

Learn about each tool's key strengths and weaknesses, add-on modules available, and SharePoint integration capabilities. Find out which vendors excel in multilingual vocabularies, ontologies, governance, and cross-mapping.

Subscribers to the RSG Enterprise Search stream have automatic access to the evaluation.  Others can purchase it online here.

]]>
Oracle-Endeca, HP-Autonomy, and Coveo follow the customer #search #autonomy Mon, 07 Nov 2011 12:18 UTC http://www.realstorygroup.com/Blog/2246-Oracle-Endeca-HP-Autonomy-and-Coveo-follow-the-customer?source=RSS Enterprise Search engines can be divided neatly into two categories: those optimized for website search and those optimized for searching across internal information silos. Today the gap between the two is opening ever wider. 

The reasons are not too difficult to understand. External websites that feature customer interaction are considered a priority, especially for ecommerce environments.  If your customers can't find what they're looking for, then that is bad for business. 

To be sure, searching and providing navigation to an external website is not usually cheap or easy either. But it has one major advantage over internal search: the data wants to be found, and is typically structured, stored, and tagged accordingly.  Most of us are familiar with the very granular and typically very accurate faceted search provided on today's large shopping sites. Search in this environment works well and there is a growing market for it.
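Faceted search works so well here precisely because the catalog is already structured. The following minimal sketch shows the mechanics with a made-up product list; the field names, brands, and helper functions are all hypothetical, and a production site would delegate this to its search engine rather than loop over records in memory.

```python
from collections import Counter

# Made-up catalog records; every product carries the same clean, tagged fields.
PRODUCTS = [
    {"name": "Fridge drawer", "brand": "Acme", "category": "Spare parts"},
    {"name": "Door shelf", "brand": "Acme", "category": "Spare parts"},
    {"name": "Water filter", "brand": "Borealis", "category": "Spare parts"},
    {"name": "Fridge freezer", "brand": "Acme", "category": "Appliances"},
]


def search(products, text="", **selected):
    """Return products whose name contains `text` and that match every selected facet."""
    text = text.lower()
    return [
        p for p in products
        if text in p["name"].lower()
        and all(p.get(field) == value for field, value in selected.items())
    ]


def facet_counts(results, field):
    """Count how many results fall under each value of one facet field."""
    return Counter(p[field] for p in results)


hits = search(PRODUCTS, "fridge")
by_category = facet_counts(hits, "category")
narrowed = search(PRODUCTS, "fridge", category="Spare parts")
```

The counts in `by_category` are what a shopping site shows beside each filter link, and clicking one simply reruns the search with that facet selected. None of this works unless every record carries consistent field values, which is exactly what most internal silos lack.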

Internally focused Enterprise Search remains, as the awful phrase goes, the poor stepchild. Searching multiple internal silos -- full of unmanaged and unstructured information -- is typically a hard, expensive, and disappointing task to undertake. 

So guess where all the Enterprise Search vendors want to focus their efforts these days?

You can't really blame them of course, not least because the needs of ecommerce and external websites extend far beyond Search.  Being able to find a replacement fridge drawer on the Samsung website (as my wife did today) is scratching the surface of what could be done. As Mike Davis of Ovum said at the recent European Enterprise Search Summit, "Firms like Tesco drive their business from the data on your loyalty card, but they want to know more about you than your transactions."  Ultimately it's all about context. As we have said many times before, the context for structured data is often found in an unassociated unstructured file.

And so the world of Search enters the world of true analytics and "Big Data."  What has long been the sole purview of Business Intelligence vendors is now slowly starting to be encroached upon by Search tools from IBM, Oracle (Endeca), HP (Autonomy) and a few independents such as Coveo. I imagine we'll still see two different takes on the same problem for a while, but now that Big Data organizations like IBM, HP, and Oracle are taking Search seriously for once, over time some kind of solid hybrid may just emerge. 

Customer interaction and commerce will grow ever more sophisticated, with predictive analytics taking the lead. No doubt we will see the split between Internal and External Search widen even further over the next couple of years, as some Search vendors at least have finally found a truly lucrative niche, and they are unlikely to turn back now.

]]>
Not coming clean about Enterprise Search #search #EntArch Wed, 26 Oct 2011 13:50 UTC http://www.realstorygroup.com/Blog/2245-Not-coming-clean-about-Enterprise-Search?source=RSS I have just spent two days at the inaugural European Enterprise Search Summit in London, and left with much to think and rant about.  For I listened to a series of consultants and vendors telling the audience that enterprise search was an imperative, and (this was brought up in some form or other by almost everyone) that according to IDC 20% of the working week is spent finding information, ergo reduce that 20% to 5% using Enterprise Search and then we will have World Peace. 

In short, the business case for Enterprise Search made at the event boiled down to a series of hackneyed statements that can be summarized as follows:

  • Any investment you make in Search technology will be quickly returned, as your workers will have so much more time on their hands and therefore be so much more productive
  • You will be miraculously saved from multi-million dollar lawsuits due to your ability to prove without a doubt how innocent you are of any such allegations

I'm sure you get the drift, and like most potential buyers of Enterprise Search technology, you don't believe a word of it, nor should you. It's naive nonsense. End users may indeed spend time looking for things, but what they might actually do with any time saved is as much your guess as mine. Moreover, and more importantly, I don't know about you but if I do spend 20% of my week looking for information, it is by using a Search Engine -- and has been for over a decade now. As for that pending lawsuit, the reality is most organizations would rather not know exactly what is lurking on their network, thank you very much.

The Search pitch is further weakened by the argument that there is no point in cleaning up your dirty data. The line goes like this: Why bother actually trying to manage information at all? It's only going to get into a mess again, and anyway a Search engine can work with whatever you throw at it! Again this is nonsense, as everyone in the Search industry knows full well that bad data equals bad search results, and that this simple fact will never change. Put this all together and it's not really all that surprising that Enterprise Search is stuck in a deep rut that it doesn't know how to get out of. The best it seems to be able to come up with right now is not to call it Enterprise Search at all, and instead fob it off as some kind of exotic analytical engine, or even as an "SBA" because nonsense acronyms will fool anyone.....

The bottom line is, in the words of conference Chairman and friend Martin White, "Fundamentally, we have an information management problem." Indeed we have; until organizations start to manage unstructured data with the same care they do structured data, we will continue to have a problem. The fact that we have a problem should hardly come as a surprise, for where an organization will employ a veritable army of Database Administrators to manage the 20% of their data that is structured, they will employ almost no-one to manage the remaining 80%. That amounts to a huge volume of bad data. 

Search has a role to play, and a very important one at that. But until the Search industry itself starts to come clean about just how difficult and expensive Search is to leverage, and how much it is dependent on other factors outside of its control, its role and value will always be minimized. Buyers are right to be dismissive of wildly optimistic claims. Best to come clean I say.

]]>