Real Story Group. Make Better Technology Decisions.

Adriaan Bloem

Copy-pasting from Word

23-Feb-2011

Tags: Web Content and Experience Management, Usability

I've been working with web content management systems for almost fifteen years now. Exasperatingly, I still see the same project problems recur again and again. Some of this comes from a lack of education -- the field seems to have grown much faster than the general level of knowledge about the basics of content management. But a lot of it is just the same old technical problems.

Exhibit A: copy/pasting from Microsoft Word.

Where does content commonly come from when it's repurposed for the Web? Microsoft Office, which is pretty much the standard for office productivity applications. In fact, it's quite usual for editors to send in their content as Word documents -- with webmasters or web managers diligently copying all the text, and pasting it into a rich text editor within a CMS.

Or rather, pasting it into Notepad, and then pasting it into the editor. Because what Word leaves on the clipboard is Microsoft's interpretation of what HTML should look like -- and that's quite a mess. Redmond's proprietary tags routinely break pages and standard layouts. And then there's the separate problem of content encoding -- Word's "smart" curly quotes often don't translate too well. In short, Word doesn't really separate content and design -- one of the basic tenets of content management.
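To make the encoding problem concrete, here is a minimal Python sketch. It assumes the classic failure mode where UTF-8 bytes get read back as Windows-1252, and the `SMART_PUNCTUATION` table and `normalize_punctuation` helper are hypothetical names for illustration, not part of any CMS or editor.

```python
# Hypothetical mapping from Word's typographic characters to plain ASCII.
SMART_PUNCTUATION = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2026": "...",  # ellipsis
}
_TABLE = str.maketrans(SMART_PUNCTUATION)


def normalize_punctuation(text: str) -> str:
    """Replace curly quotes and similar characters with ASCII equivalents."""
    return text.translate(_TABLE)


def misread_as_cp1252(text: str) -> str:
    """Simulate a page that stores UTF-8 but is decoded as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


pasted = "It\u2019s \u201cdone\u201d"
print(normalize_punctuation(pasted))       # plain ASCII quotes
print(misread_as_cp1252("It\u2019s"))      # the familiar garbled apostrophe
```

Flattening the punctuation sidesteps the problem rather than solving it; the more durable fix is to make sure every layer (editor, CMS, database, page template) agrees on one encoding.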

Most systems nowadays have some sort of solution to this. Popular rich text editors like CKEditor and TinyMCE have buttons to either paste plain text only (the equivalent of the Notepad intermediary) or "clean" the Word content. Alternatively, your CMS may offer filters that will try to scrub the HTML after it is saved.

Cleaning, however, never quite works. Either too much gets stripped, so tables or more complex document structures don't make it across; or too little, leaving us with a bunch of tags with unpredictable results. All of this is difficult to get right. (I know this all too well, having once tried my hand at writing an XSLT filter for the purpose. The horror!) Unrealistic expectations here can lead to many help-desk calls -- "the CMS screwed up my document" -- and the like.
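The trade-off is easy to see in a small allowlist-based scrubber. The sketch below uses only Python's standard `html.parser`; the `ALLOWED_TAGS` set and the `clean_word_html` function are assumptions made for illustration, not how CKEditor, TinyMCE, or any particular CMS filter actually works.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: the only tags this sketch lets through.
ALLOWED_TAGS = {"p", "b", "i", "strong", "em", "ul", "ol", "li",
                "table", "tr", "td", "th", "br"}
VOID_TAGS = {"br"}                  # tags that have no closing tag
SKIP_CONTENT = {"style", "script"}  # drop these together with their content


class WordHtmlCleaner(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            # All attributes (class="MsoNormal", style="mso-...") are dropped.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif (self.skip_depth == 0 and tag in ALLOWED_TAGS
              and tag not in VOID_TAGS):
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data, quote=False))


def clean_word_html(fragment: str) -> str:
    """Keep allowlisted tags without attributes; drop everything else."""
    cleaner = WordHtmlCleaner()
    cleaner.feed(fragment)
    cleaner.close()
    return "".join(cleaner.out)


word = ('<p class="MsoNormal" style="mso-margin-top-alt:auto"><b>Hello</b> '
        '<span style="color:red">world</span><o:p></o:p></p>')
print(clean_word_html(word))
```

Even this toy version shows both failure modes: dropping all attributes also throws away things like `colspan` on table cells, while widening the allowlist lets more of Word's markup leak through.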

The reality is that the only reliable way to get text from Office to the web editor is "text only" -- forget any formatting. That's what the Notepad route does; and it's what Google's Chrome browser now does with Ctrl+Shift+V.
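For comparison, the "text only" route is roughly what the following Python sketch does to an HTML fragment. It is an illustration under stated assumptions (the `BLOCK_TAGS` set and `to_plain_text` name are hypothetical), not a description of what Notepad or Chrome do internally.

```python
from html.parser import HTMLParser

# Hypothetical set of tags whose end should produce a line break.
BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextOnly(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self.skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("style", "script"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def to_plain_text(fragment: str) -> str:
    """Discard all markup, keeping one line of text per block element."""
    parser = TextOnly()
    parser.feed(fragment)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace (including non-breaking spaces) per line.
    lines = [" ".join(line.split()) for line in text.split("\n")]
    return "\n".join(line for line in lines if line)


word = ('<p class="MsoNormal"><b>Title</b></p>'
        '<p>Body&nbsp;text<o:p></o:p></p>')
print(to_plain_text(word))
```

Nothing proprietary survives this, which is why it is reliable; neither do headings, emphasis, or table structure, which then have to be reapplied by hand in the editor.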

It's fair to say only Microsoft could really fix this. How hard would it be to just paste minimal markup, instead of proprietary lingo? This isn't exactly rocket science, cold fusion, or teleportation. So, I asked the company.

The problem for Microsoft, of course, is that while pasting into web applications is common, pasting from one Office document to another is much, much more common. In those cases, you'll often want to preserve formatting, and according to Redmond, "the HTML clipboard format in Word is optimized for those scenarios." What's more, there are now the Office Web Apps -- so Microsoft enables pasting into those web versions of the Office suite with all formatting intact, too.

That's all fair, but what about the web editor and her tedious clean-up process? Well, according to Microsoft, "[Y]ou can save your documents as 'Web Page, Filtered' where the extra markup will be removed and you will be left with a simpler set of HTML markup." Alas, even filtered HTML is not entirely MS-free.

So, there's a glimmer of hope, yet we remain pretty much where we've been for the past decade on this problem. There is no single answer to something as simple as copying text from an Office document and pasting it into your CMS. Microsoft's solution is a bit cumbersome and incomplete, and Google's rips out tables and other content you may like to keep.

However, instead of blaming Microsoft for this, consider it a reminder. The trenches aren't glamorous, but they're where you're most likely to encounter hurdles. There are plenty more day-to-day obstacles to getting it right. And nobody's going to magically fix this for you any time soon.

Copyright Real Story Group 2001 - 2012. All rights reserved.