This post originally appeared in TechWhirl under the title: The Documentation Problem: Is Structured Content An Order of Magnitude Improvement?


Even a relatively small company or organization generates a lot of documents. In larger environments you’re dealing with thousands or even millions of pages of documentation, reports, marketing materials, regulatory compliance docs, presentations… a staggering amount of content. This content represents a huge investment of time and should be considered a significant, valuable company asset. Why? Because that content is your first line of communication with your customers, it keeps groups like HR and customer service running smoothly and efficiently, and it contains the combined intellectual property and experience of your organization or company. But all too often it resides in siloed, unstructured authoring environments like MS Word, which make it nearly impossible to manage these assets centrally.

Fortunately, there is a solution that is centralized, based on open standards, scalable and highly cost-effective. Built on the Darwin Information Typing Architecture (DITA) standard developed by IBM to solve their own documentation issues, this solution represents an order of magnitude improvement over the legacy content creation and management models still widely in use today. When DITA is combined with an authoring/management environment (Component Content Management System or CCMS), it provides a forward-looking solution to virtually all of your document management issues. The ten cases below illustrate the scalability, flexibility, and significant cost savings associated with moving to a DITA CCMS.

1 Version Control

In DITA, you do not keep multiple versions of a chunk of content. When you need to ‘copy’ that content for another use or to share it, you don’t actually copy the chunk. Instead, you point to the chunk in the database. Any changes, comments or edits made to the chunk are reflected in every instance of it. This eliminates the time-consuming copy/paste, email-attachment and track-changes processes which, in effect, create multiple unconnected versions of the content. When you have multiple versions of content, you face version control problems as changes are made to different versions. This is no longer an issue with DITA.
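The point-don’t-copy idea can be sketched in a few lines of Python. This is an illustration only: the `store` dictionary, the chunk ID and the `render` function are hypothetical names invented for this example, not part of DITA or of any CCMS product. In real DITA markup, this kind of pointer is expressed with the `conref` attribute.

```python
# Hypothetical in-memory "repository": chunk ID -> content.
store = {
    "safety-notice": "Unplug the device before servicing.",
}

# Two documents that reference the same chunk by ID instead of copying its text.
user_manual = ["safety-notice"]
service_guide = ["safety-notice"]


def render(document, repository):
    """Resolve each chunk ID to its current text at render time."""
    return "\n".join(repository[chunk_id] for chunk_id in document)


# Edit the chunk once in the repository...
store["safety-notice"] = "Unplug the device and wait 30 seconds before servicing."

# ...and every document that points at it picks up the change.
manual_text = render(user_manual, store)
guide_text = render(service_guide, store)
```

Because neither document holds its own copy of the text, there is nothing to drift out of sync.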

2 Translation

The impact of a DITA Component Content Management System (CCMS) on translation workflow and cost can be dramatic. This ROI can also be measured very precisely, because the cost of outside translation vendors before and after a DITA implementation is easily tracked via their invoices. DITA’s impact on translation and localization costs comes from its ability to streamline the translation workflow. Before DITA, when changes were made to a previously translated document, the typical process was to send the entire document to the translator. The translator would charge a per-word fee for the changes, plus a processing fee for searching through the document to find the changed content. In DITA, only topics with changes are sent into the translation workflow. In addition, with thousands of documents in multiple languages, duplicate translations of the same content were common and costly. Amber Swope explored the financial implications of DITA on translation workflow. She notes, as an example, that “a manufacturing company who adopted DITA after a global expansion that required translation to 14 new languages was able to save more than $225,000 within the first three months of their pilot project on a single documentation set” (emphasis mine).
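One way to picture “only topics with changes are sent” is to compare a fingerprint of each topic against the fingerprint recorded when it was last translated. The sketch below is a simplified assumption about how such a check could work, not a description of how any particular CCMS does it; the topic IDs, the sample text and the function names are all hypothetical.

```python
import hashlib


def fingerprint(text):
    """Return a stable hash of a topic's source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def topics_needing_translation(topics, translated_fingerprints):
    """Return IDs of topics that are new or changed since the last translation."""
    return sorted(
        topic_id
        for topic_id, text in topics.items()
        if translated_fingerprints.get(topic_id) != fingerprint(text)
    )


# Hypothetical topics as they stood at the last translation round.
previous = {
    "install": "Insert the battery.",
    "safety": "Keep away from water.",
}
translated = {topic_id: fingerprint(text) for topic_id, text in previous.items()}

# The current source: one topic edited, one topic added, one untouched.
current = dict(previous)
current["install"] = "Insert the battery, then close the cover."
current["reset"] = "Hold the power button for ten seconds."

to_send = topics_needing_translation(current, translated)
```

Here `to_send` contains only the edited and the new topic; the unchanged `safety` topic never reaches the translator, so it is never billed again.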

3 Publishing

Creating content in unstructured formats like MS Word limits your ability to publish content easily and simultaneously to the wide variety of media now used to consume information. Where even a few years ago it was enough to add to a knowledge base or publish to PDF and print, today people expect to consume information on a variety of devices, from mobile to desktop. Consider the Internet of Things (IoT), which opens up the potential to include digital documentation directly in a product or in an app associated with the product. One example is the Tesla automobile, which is managed by an internal software system that can be updated via the internet, adding product capabilities like hands-free driving. Documentation, including owner training, is included with that update and delivered to each car. These kinds of scenarios will be increasingly common in the future, and unstructured documentation workflows will not be able to handle these new needs as they emerge. Read about firmware updates and how Tesla delivers their documentation on their site. Owners can also download a mobile app to remotely monitor and control their car from their phone. That app is yet another example of a medium requiring real-time publishing capabilities.
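The reason structured content publishes so readily to many media is that the content is stored once, without formatting, and each output format is just a different rendering of it. The Python sketch below illustrates that principle under simplifying assumptions: a plain dictionary stands in for a DITA topic (real topics are XML), and `to_html` and `to_text` are hypothetical renderers written for this example.

```python
from html import escape

# A stand-in for one structured topic: content only, no formatting.
topic = {
    "title": "Replace the filter",
    "steps": ["Open the cover.", "Swap the filter & close the cover."],
}


def to_html(topic):
    """Render the topic as an HTML fragment, e.g. for a web help site."""
    items = "".join(f"<li>{escape(step)}</li>" for step in topic["steps"])
    return f"<h1>{escape(topic['title'])}</h1><ol>{items}</ol>"


def to_text(topic):
    """Render the same topic as plain text, e.g. for an in-device display."""
    lines = [topic["title"].upper()]
    lines += [f"{number}. {step}" for number, step in enumerate(topic["steps"], start=1)]
    return "\n".join(lines)


html_output = to_html(topic)
text_output = to_text(topic)
```

Adding a new delivery channel means adding one more renderer; the source content does not change.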

4 Searchability and Management

If you have hundreds or thousands of docs in multiple formats and you need to determine whether content exists on a subject, what do you do? You search through them. With DITA, they all reside in a central repository and are tagged with standardized metadata for things like author, date created, subject and topic type: virtually any category that applies. This means you can search in very specific ways, e.g. ‘all tasks associated with learning XYZ v.2.1 software written before 2012’. In most DITA applications this is done via a faceted search, where you select from lists of filters to refine your search. This also enables sophisticated content management, as you can organize and deliver content quickly and accurately using your search tools.
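The example query above, tasks about XYZ v.2.1 written before 2012, reduces to filtering on metadata fields. Here is a minimal Python sketch of a faceted filter; the topic records, field names and `faceted_search` function are hypothetical and only meant to show the idea, not how any specific DITA application implements search.

```python
from datetime import date

# Hypothetical metadata records for four topics in the repository.
topics = [
    {"id": "t1", "type": "task", "subject": "XYZ v2.1", "created": date(2011, 5, 3)},
    {"id": "t2", "type": "concept", "subject": "XYZ v2.1", "created": date(2010, 1, 15)},
    {"id": "t3", "type": "task", "subject": "XYZ v2.1", "created": date(2013, 8, 20)},
    {"id": "t4", "type": "task", "subject": "ABC v1.0", "created": date(2009, 11, 2)},
]


def faceted_search(topics, created_before=None, **facets):
    """Return IDs of topics that match every facet and the optional date limit."""
    results = []
    for topic in topics:
        # Skip the topic if any requested facet value does not match.
        if any(topic.get(field) != value for field, value in facets.items()):
            continue
        # Skip the topic if it was created on or after the cutoff date.
        if created_before is not None and topic["created"] >= created_before:
            continue
        results.append(topic["id"])
    return results


hits = faceted_search(
    topics,
    type="task",
    subject="XYZ v2.1",
    created_before=date(2012, 1, 1),
)
```

Only `t1` satisfies all three filters. The same query against a folder of untagged Word files would have nothing to filter on.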

5 Topic-based Authoring

Writing content in chunks known as topics is the underlying difference between DITA and unstructured authoring software like Word. When you break each document into topics and assign types and metadata to each piece, you’ve now turned your documentation into data that can be used in many valuable ways. There is a learning curve for writers unfamiliar with this methodology but once they ‘get it’ there is no going back. Sarah O’Keefe of Scriptorium Publishing explores the ROI of Topic-based Authoring in DITA (with numbers) in this SlideShare presentation.

6 Reuse

How much duplication of effort is there in creating documentation across your company? Are different departments creating documents that contain similar sections, like Safety Notices or task instructions shared by different models in a product line? This can get costly, but with legacy ways of creating information it is virtually impossible to track these duplications of effort. Topic-based authoring in DITA makes efficient reuse of content much easier, as each topic is tagged with a type and metadata. This, in turn, enables very powerful searchability and creates a library of content chunks (topics) to search through and select from for reuse. Each chunk can be shared, reviewed and edited individually, with any changes appearing in every place that chunk is reused. This updating capability means significant time savings while ensuring your documentation reflects any changes in real time, concurrently with product and service development. Reuse also figures into publishing, in that you can easily rearrange or compile content with maps, which ensure publishing is optimized for each media format. In a CIDM article, Beth Pollock and Andrea Rutherfoord of Citrix Systems estimate that content reuse with DITA saved their team 1,800 days, or the work of seven full-time equivalents (FTEs), over the first fifteen months after implementation.
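A map can be pictured as an ordered list of topic references, with each deliverable getting its own map over the same topic library. The Python sketch below models that under simplifying assumptions: real DITA maps are XML files, while here the `library` and `maps` dictionaries, the topic IDs and the `build` function are hypothetical stand-ins.

```python
from collections import Counter

# Hypothetical topic library: topic ID -> content.
library = {
    "safety": "Safety notice: unplug before servicing.",
    "install": "Install: insert the battery.",
    "troubleshoot": "Troubleshooting: hold the power button to reset.",
}

# Each "map" is an ordered list of topic IDs for one deliverable.
maps = {
    "quick_start": ["safety", "install"],
    "full_manual": ["safety", "install", "troubleshoot"],
}


def build(map_name):
    """Assemble a deliverable by resolving the map's topic IDs in order."""
    return "\n\n".join(library[topic_id] for topic_id in maps[map_name])


# Count how many maps use each topic to see where reuse is happening.
usage = Counter(topic_id for topic_ids in maps.values() for topic_id in topic_ids)
reused = sorted(topic_id for topic_id, count in usage.items() if count > 1)

quick_start = build("quick_start")
```

The `usage` count is the kind of reuse tracking that is virtually impossible when the same paragraphs are pasted into separate files.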

7 Updates

Need to change a part number across multiple user manuals, help desk, web site and print documentation? Change it once in DITA, and every place the content is reused is updated. No excuse for outdated documentation. See our Case Study on Skyward, who had a major need for constant content updates.

8 Scalability

Storing thousands of Word docs in various places simply doesn’t scale. Managing that amount of documentation is incredibly manual and prone to error. Searching for information in this haystack is virtually impossible. DITA was designed by IBM for its own documentation management, specifically to solve these problems on an enormous scale (60+ million docs, 40 languages). Realizing they had solved a very big problem that almost any business or organization was facing, they released DITA as an open standard in 2005. It was designed to go big. IBM provides extensive (and readable!) documentation and a history of DITA on their developerWorks site.

9 Workflow

In typical unstructured environments, even the authoring workflow is often somewhat ad hoc. Assign, write, review, edit, rewrite, edit, approve, publish–across functions and up and down hierarchies. DITA formalizes this workflow with built-in tools to keep the work flowing through without bottlenecks caused by delivery issues (‘I know that doc is somewhere in my email…’). Assign tasks, comment inline, review and approve edits, etc., all in the same application. Cloud-based DITA systems go even further, enabling anyone with Internet access and a secure login to jump into the workflow from anywhere. Managers can easily see into processes to determine status and resolve any issues.

10 Open-standard

Open standards must conform to certain requirements regardless of who is utilizing them. DITA is based on XML, which is itself an open standard. The DITA standard and its architecture are maintained and updated by committees of users who vote on changes. File formats move freely among applications that read XML, meaning your documentation is not locked into proprietary formats. Our White Paper on Open vs. Proprietary Formats shows how adopting open file formats like XML is a forward-looking strategy in content management.


If you are facing any of these situations with document creation and management in your world, the move to a DITA-based content creation and management solution can help you save time and money and make the best use of the valuable assets your company is creating every day.
