There are effectively three things that matter when you’re evaluating the overall quality of your content creation process and system: efficiency, flexibility, and scalability.
Everything else is details.
These three factors are fairly easy to assess.
- Efficiency = how easy is it to reuse content and how quickly does the system produce your required outputs
- Flexibility = how many output formats are required or how easy is it to collaborate
- Scalability = how many collaborators can the system support and how much output (or throughput) can it easily produce
Efficiency and consistency are simple to manage in small teams. When three or fewer people contribute to your content, you can control consistency through interpersonal communication and peer review, even though the process is manual. It’s when you start to scale that these factors need to be approached more strategically. Both also depend on reuse, which in most authoring systems can be difficult to achieve and manage.
Flexibility is size and scale agnostic. If you need PDF, eLearning, knowledge-base, and chatbot from your content, it doesn’t matter if your team is one person or one hundred people.
Scalability is critical when you are dealing with more than a few documents or a team of more than a few people. Once you have to support a large number of documents, you need to ask: How do you support multiple contributors? How much output can the system produce at a given time? How easily can you reuse information across documents without having to check whether the reused pieces stay in sync?
So, let’s look at some options for content creation systems and see how they stack up to these pillars.
It’s going to be no surprise that Word largely fails on all factors.
While Word is fairly good at producing single unstructured documents, writers end up spending 30-50% of their time on formatting. You have to deploy templates and train teams to use them religiously. Content developers can also spend their time tweaking the presentation rather than concentrating on the content itself, which results in less meaningful information. Content is reused through copy and paste, so copies often fall out of sync when changes occur. Keeping the reused content consistent requires outside management methods, which are error-prone and time-intensive.
While Word is easy to use for a casual writer, the output capabilities are limited and you cannot change the look and feel of a document to meet different needs. While the PDF and printed page may look the same, if you need to create web output or online help, you’ll need an entirely external process to support this, and oftentimes, that system will be based on people copying and pasting content into it.
Because Word is so prevalent in the marketplace, it is useful in terms of content portability should an organization be acquired or divested. However, rebranding the content, should the need arise, becomes challenging.
Because Word is designed for a single user, scaling to produce documentation for enterprise doc sets can be challenging. You need to develop external, elaborate processes and procedures if you want to have multiple collaborators and the application is deployed using a standard license. Using the cloud version could result in contributor collision that you’ll have to resolve.
If you need to create a large documentation set, updating the resulting document is onerous. Conversely, combining smaller documents into something bigger is also challenging.
Help Authoring Tools (HATs)
Some examples of HATs are MadCap Flare, Adobe FrameMaker, DocToHelp, and RoboHelp. These systems are designed to support the needs of small teams with low complexity requirements. Under the right conditions, each of these tools can do this job well, but as your team grows or your requirements expand, HATs aren’t going to keep up.
Similar to Word, most HATs use a paradigm that intertwines content and formatting. While this is familiar to most casual contributors, it causes the same problems: processes to implement templates and enforce styles, and time lost tweaking the presentation rather than developing content. The advantage of HATs is the ability to build a larger document from individual components, which can be a stepping stone to getting a team of contributors working in a structured, topic-based paradigm. Most HATs let you apply metadata to improve searchability but may not carry that information or capability through to the resulting output. Some HATs provide mechanisms for reusing content, which improves consistency.
HATs do provide some flexibility in the outputs they produce. While their main purpose is to produce HTML-based output, most modern HATs also produce other formats, such as PDF, as a side benefit.
Should an organization be acquired or divested, HATs will require one of the organizations to change its processes and acquire software licenses so that all authors use the same processes, which could become expensive. This could result in duplicate processes for a significant time during the transition or, worse yet, the continuation of dual processes. Rebranding the content, should the need arise, becomes challenging.
Scalability is where HATs tend to struggle. HATs use a desktop publishing paradigm for content creation. This becomes problematic as you increase the number of contributors because the software has to be installed on each machine, driving up the cost of scaling your publishing infrastructure. This model also tends to force your content creation processes to take a partitioned view of your corpus, which inhibits cross-collaboration: your writers tend to concentrate only on their assigned portions of the corpus and rarely contribute to other areas. This leads to unfamiliarity with other areas of the content, which makes it harder to cover content development when a team member is unavailable for an extended period and increases the risk associated with a team member leaving the organization.
Because access to the content requires the application to be installed on each machine, content reviews become cumbersome. Most organizations develop a separate, disconnected process for SMEs to review the content. With the comments divorced from the source, transcription errors can occur when the changes are incorporated into the source files.
Markdown is currently the cool kid on the block; it’s part of a cycle that mirrors the boom and bust of wikis during the 00s. Markdown can be very efficient from the writer’s perspective, and if your only output target is a relatively small, static documentation site, it can be a wholly functional solution.
From a writing perspective, Markdown appears to be extremely efficient. However, the markup language itself is not intuitive for contributors who are unfamiliar with it, which may actually slow down an otherwise fast process. Editing tools could overcome this problem, but if you need a WYSIWYG authoring tool to generate the markup, the benefit of using Markdown becomes questionable.
Content reuse relies on copy and paste which, as described previously, risks inconsistent content when changes occur. As with the other systems already mentioned, acquisitions and divestitures present challenges for merging processes and rebranding content.
Markdown implementations tend to be single-purpose, most often HTML output that is delivered with a product. Creating other output formats from Markdown could mean processing out to HTML first and transforming the HTML to the other formats, or developing transforms from Markdown to each target format.
A few things can affect consistency. The first is that there are a number of Markdown specifications and, depending on which one a contributor uses, the same source might not produce the same output. Structures that exist in one Markdown flavor might not exist in another, so the processed output can look significantly different.
Markdown could be useful for organizations that have a large number of casual contributors, but may not be efficient should an enterprise need to produce a large amount of technical documentation for public consumption.
The lack of strong reuse mechanisms makes scaling Markdown implementations difficult, especially when the content needs to support multiple products or audiences. Additionally, as content implementations grow, the number of outputs grows with them, which can be a significant problem for Markdown implementations.
Originally used to share knowledge within enterprises, somewhere along the line wikis became a way to also share knowledge with customers by making the wiki server publicly available. The web-based, simple, collaborative method of providing information makes it very popular.
Tools for wikis tend to be fairly intuitive, so contributors feel comfortable and effective using them. The big difference from the previous options is that wikis can be set up so that individual topics are organized into a structure that comprises a larger document. From a writing perspective, collaboration is easier than with the other options and, depending on the software used, the content can be enriched enough that it is easy to find. The metadata is applied on the topic itself, allowing you to find relevant topics using metadata rather than just plain text search.
Of course, as with Word, contributors can spend inordinate amounts of time adjusting the formatting of the content to make sure it looks right. The only guardrails against this behavior are the standards or templates implemented by an enterprise and, unless significant time is devoted to enforcing them, they can be bypassed relatively easily.
As easy as it is to collaborate, one of the bigger drawbacks of a wiki is that it is difficult to create output formats other than the presentation provided by the wiki server. Reuse is copy and paste, as with Word, which makes it hard to ensure that changes are propagated to every reused copy.
Here again, portability of the content and rebranding become problematic, especially because there’s no inherent way to reuse content without copy and paste.
Wikis can scale easily to allow more collaboration. However, as the number of collaborators increases, content quality becomes harder to manage because collaborators add information without validating it against existing content. Maintenance also becomes problematic because there is no way to make sure that existing content is up to date with new developments. This hurts overall usability because search can return conflicting or irrelevant information. Additionally, the lack of organizational tools means that finding content grows more complex as the quantity of content grows. As a result, the system becomes unreliable as a source of truth.
Providing different output formats is challenging in this environment, because most systems are not designed to do that; providing offline versions of the content might not be an option for most enterprises.
To be clear, we’re talking about the DITA XML standard and how most software in the market supports the standard. Besides being an XML application, DITA expresses an approach for authoring structured topic-based content. This article won’t go into all of the considerations involved when writing with DITA as those concepts have been described in books about DITA.
There have been many articles and blog posts that state that DITA is hard. In reality, the difficulty resides in the requirements of the content being expressed. The basic DITA unit of content is a topic. What appears to add complexity are the specializations of the specification that implement other domains, such as learning and training or technical content. You can create any documentation without using these domains and still conform to the specification, and all existing DITA software can process the content.
The biggest efficiency comes from removing the formatting from the content. Authors, whether casual or experienced, can focus on the actual information rather than the formatting, which is handled by the processing that produces the output from the source. Depending on the authoring tools used, the writing environment can look like Word, and the DITA elements can be hidden from casual contributors.
Here’s where using DITA starts to shine. You can reuse topics in different documents or even reuse substructures among different topics. You can compose different output formats from the same structure, depending on defined processes, with little effort on the part of the writers. In fact, because the topics are independent of structure, they can exist in different structures and be profiled so that each topic takes on the context of the structure in which it is referenced.
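To make profiling a little more concrete, here is a minimal Python sketch of the idea: one source topic, filtered into two audience-specific versions. It uses the `audience` attribute, which is one of DITA's profiling attributes, but everything else (the sample topic, the `filter_for_audience` helper) is a hypothetical illustration. Real DITA toolchains normally drive this kind of filtering from a DITAVAL file during publishing, which this sketch does not attempt to reproduce.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical single-source topic; two paragraphs are profiled by audience.
SOURCE = ET.fromstring(
    '<topic id="install">'
    '<title>Installing</title>'
    '<p>Download the installer.</p>'
    '<p audience="admin">Configure the license server.</p>'
    '<p audience="user">Ask your administrator for a license key.</p>'
    '</topic>'
)

def filter_for_audience(topic, audience):
    """Return a copy of the topic without elements profiled for other audiences."""
    result = copy.deepcopy(topic)
    # Snapshot the elements first so removals do not disturb the iteration.
    for parent in list(result.iter()):
        for child in list(parent):
            tagged = child.get("audience")
            # Unprofiled elements apply to everyone and are always kept.
            if tagged is not None and audience not in tagged.split():
                parent.remove(child)
    return result

admin_guide = filter_for_audience(SOURCE, "admin")
user_guide = filter_for_audience(SOURCE, "user")

for name, guide in (("admin", admin_guide), ("user", user_guide)):
    print(name, [p.text for p in guide.findall("p")])
```

The writer maintains one topic; the choice of audience happens at publishing time, not in the source.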
Metadata can be applied to the structure that binds the topics together, rather than on the topics themselves, so that the ability to find topics improves, even in a large corpus. Also, if the output enables processing, you can locate content on a more granular level, such as “show me all topics that talk about the xwidget user control”.
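As a rough illustration of that kind of query, the Python sketch below stores keywords on the map's topic references and then asks for every topic tagged with a given keyword. The map content, file names, and the `topics_about` helper are invented for the example; it only assumes the general DITA pattern of `topicref` elements carrying `topicmeta` with `keywords`, and is not how any particular CCMS implements search.

```python
import xml.etree.ElementTree as ET

# Hypothetical map: metadata lives on the references, not in the topic files.
DITAMAP = ET.fromstring("""
<map>
  <topicref href="xwidget_overview.dita">
    <topicmeta><keywords>
      <keyword>xwidget</keyword><keyword>user control</keyword>
    </keywords></topicmeta>
  </topicref>
  <topicref href="install.dita">
    <topicmeta><keywords>
      <keyword>installation</keyword>
    </keywords></topicmeta>
  </topicref>
  <topicref href="xwidget_events.dita">
    <topicmeta><keywords>
      <keyword>xwidget</keyword><keyword>events</keyword>
    </keywords></topicmeta>
  </topicref>
</map>
""")

def topics_about(ditamap, keyword):
    """Return the href of every topicref tagged with the keyword (case-insensitive)."""
    wanted = keyword.lower()
    hits = []
    for ref in ditamap.iter("topicref"):
        keywords = {
            k.text.strip().lower()
            for k in ref.iterfind("topicmeta/keywords/keyword")
            if k.text
        }
        if wanted in keywords:
            hits.append(ref.get("href"))
    return hits

print(topics_about(DITAMAP, "xwidget"))
```

Because the lookup works on structured metadata rather than body text, it does not return topics that merely mention the term in passing.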
DITA has built-in mechanisms for reuse. Rather than copying and pasting content from one topic to another, you can create reusable objects that are referenced by the topics that need them and the output processing creates a copy of the object upon output. This feature ensures consistency of formatted content and ensures that changes are applied throughout any delivered documentation. The reuse features in DITA range from simple references to include chunks of content to more advanced variable-style content. This “baked in” reuse capability makes it easy to rebrand should there be an acquisition or divestiture.
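The reference-instead-of-copy idea can be sketched in a few lines of Python. This is a simplified illustration, not the DITA processing model: the `conref` value here is shortened to `topic-id/element-id` (real DITA conref values also name the file), and the `LIBRARY` topic, the sample content, and the `resolve_conrefs` helper are all hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical library topic holding the single authoritative copy of a warning.
LIBRARY = ET.fromstring(
    '<topic id="shared">'
    '<note id="backup-warning">Back up your data before upgrading.</note>'
    '</topic>'
)

# A topic that references the warning instead of pasting its text.
TOPIC = ET.fromstring(
    '<topic id="upgrade">'
    '<title>Upgrading</title>'
    '<note conref="shared/backup-warning"/>'
    '</topic>'
)

def resolve_conrefs(topic, library):
    """Return a copy of the topic with each conref placeholder filled from the library."""
    resolved = copy.deepcopy(topic)
    targets = {el.get("id"): el for el in library.iter() if el.get("id")}
    # Snapshot the elements so appending children does not disturb the iteration.
    for el in list(resolved.iter()):
        ref = el.attrib.pop("conref", None)
        if ref is None:
            continue
        source = targets[ref.split("/")[-1]]  # simplified lookup by element id
        el.text = source.text
        for child in source:
            el.append(copy.deepcopy(child))
    return resolved

resolved = resolve_conrefs(TOPIC, LIBRARY)
print(ET.tostring(resolved, encoding="unicode"))
```

Editing the warning once in the library changes every topic that references it the next time output is generated, which is the property that copy and paste cannot give you.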
Because the formatting is independent of the source content, consistency of presentation is handled by the formatting processes and not by the writers, eliminating inconsistencies caused by writers ignoring templates or other style standards. Also, style changes can be implemented without author involvement, so you can pivot styles without disrupting authoring productivity.
Another major point for DITA is scalability, which is largely designed in. The reuse mechanisms mean you can scale the quantity of content without incurring a huge maintenance overhead from duplicate content. The separation of content and formatting means you can scale outputs without having to modify or duplicate the content. Strong semantic and linking structures mean that a large base of content can still be navigable and searchable. And since DITA is an open standard, you can change or add to your tools with far less cost than other formats.
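A small Python sketch shows what this separation looks like in practice: the source carries only structure, and each output format is a separate rendering function. The topic content and the two renderers (`to_html`, `to_text`) are hypothetical stand-ins for the stylesheets and transforms a real publishing pipeline would use.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source topic: structure only, no fonts, colors, or layout.
TOPIC = ET.fromstring(
    '<topic id="reset">'
    '<title>Resetting a password</title>'
    '<p>Open the account page.</p>'
    '<p>Select Reset password.</p>'
    '</topic>'
)

def to_html(topic):
    """Render the topic as a minimal HTML fragment."""
    parts = [f"<h1>{escape(topic.findtext('title'))}</h1>"]
    parts += [f"<p>{escape(p.text)}</p>" for p in topic.findall("p")]
    return "\n".join(parts)

def to_text(topic):
    """Render the same topic as plain text with an underlined, upper-case title."""
    title = topic.findtext("title")
    lines = [title.upper(), "=" * len(title)]
    lines += [p.text for p in topic.findall("p")]
    return "\n".join(lines)

print(to_html(TOPIC))
print(to_text(TOPIC))
```

Changing the house style means changing a renderer, not touching the topics, so authors are never asked to reformat anything.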
There are pluses and minuses to any content format. When it comes to technical content, these are the general conclusions for each of the discussed formats:
- Microsoft Word: Not really appropriate for technical content
- Help Authoring Tools: Can work well for small, isolated teams. Struggle with collaboration.
- Markdown: Great for developer groups with single output targets. Lack of strong structure and reuse mechanisms make scaling difficult.
- Wikis: Good for small, internal teams, or well defined collaborative spaces. Not well suited for enterprise scale technical content.
- DITA: Structure presents a learning curve for new authors, but enables long term maintainability and scalability.
DITA, proprietary content in HATs, and Markdown are all viable documentation formats; Microsoft Word and wikis are less so. Each of these options comes with tradeoffs. DITA is the best choice for larger or more complex documentation sets.
Ready to see DITA in action? Whether you are a small team or a large one, if you have notable publishing demands or complex documentation sets, easyDITA provides the ultimate tool for techcomm teams ready to take the next step towards intelligent, structured content. Try a free trial today.
Patrick is a software industry professional specializing in developing, productizing, and solving problems with product content software. He is a highly skilled developer, thoughtful manager, and passionate customer advocate.