Comparisons are tricky business.
They only work if there’s a certain level of overlap. For instance, if you are interested in buying a new vehicle, you would probably compare two or more vehicles within the same category.
If you were looking for an electric vehicle priced somewhere around $75k, you would compare a Tesla Model X to a Jaguar I-PACE. Just by initiating this comparison, you are acknowledging a substantial overlap.
It’s the same with content standards.
If someone is sitting down to compare two product documentation standards, they’ve probably already decided on a number of prerequisites.
Enter the DITA vs. DocBook Dialogue.
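For instance, if someone is looking at DITA versus DocBook for their content, they have likely already settled on priorities such as XML-based structured authoring, content kept separate from formatting, and an open standard.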
At this stage, a standard like Markdown or JSON wouldn’t make sense. It would be like the Tesla/Jaguar buyer looking at a Ford F-150. Not better or worse, just incomparable on a number of different metrics.
So, with that said, the DITA or DocBook question is a common refrain that we still hear from time to time (though less often these days).
In this article, we’ll compare the two by explaining:
- How the standards got their start
- How the timing impacted their respective trajectories
- What their evolutions looked like
- And how to evaluate the standards going forward
Start At The Beginning
Everything has to start somewhere.
DITA stands for Darwin Information Typing Architecture. It’s an XML-based, end-to-end architecture for authoring, producing, and delivering technical information as online help and product support portals on the Web.
🏗️ Created in 2001 by IBM
💼 In 2004, the OASIS DITA Technical Committee was formed
In 2005, the DITA 1.0 Standard was approved as an OASIS open standard
Core Principles at the Founding
Topic-Based Authoring
This was the hallmark of DITA from the outset. Rather than writing content in linear form like a book, topic-based authoring encouraged writing content as smaller chunks or blocks that were independent of the surrounding context.
Structured Content Reuse
Part of the appeal of the modular, topic-based authoring was the ability to reuse rather than copy-paste. By being independent of the surrounding context, theoretically, a topic could be reused wherever needed via references. This meant no duplication of existing content.
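To make reuse by reference a little more concrete, here is a minimal Python sketch of how a processor might resolve a DITA-style conref (content reference). The file names, ids, and the resolve_conrefs helper are invented for illustration, and an in-memory dictionary stands in for files on disk. Real DITA processors handle many more cases than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "files"; names and ids are invented for this sketch.
FILES = {
    "warnings.dita": """<topic id="warnings">
  <title>Shared warnings</title>
  <body>
    <note id="unplug">Unplug the device before servicing.</note>
  </body>
</topic>""",
    "install.dita": """<topic id="install">
  <title>Installing the battery</title>
  <body>
    <note conref="warnings.dita#warnings/unplug"/>
    <p>Slide the battery into the bay.</p>
  </body>
</topic>""",
}


def resolve_conrefs(filename):
    """Parse a topic and pull in the content of every conref target."""
    root = ET.fromstring(FILES[filename])
    for elem in list(root.iter()):
        target = elem.attrib.pop("conref", None)
        if target is None:
            continue
        # A conref looks like "file.dita#topicid/elementid".
        target_file, _, fragment = target.partition("#")
        topic_id, _, element_id = fragment.partition("/")
        source_root = ET.fromstring(FILES[target_file])
        if source_root.get("id") != topic_id:
            raise ValueError(f"topic id {topic_id!r} not found in {target_file}")
        source = source_root.find(f".//*[@id='{element_id}']")
        if source is None:
            raise ValueError(f"element id {element_id!r} not found in {target_file}")
        # Copy the referenced content into the referencing element.
        elem.text = source.text
        elem.extend(list(source))
    return root


resolved = resolve_conrefs("install.dita")
print(resolved.find("./body/note").text)
```

The warning text lives in exactly one place. Any topic that needs it points at it, so a correction made once shows up everywhere the reference is used.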
Content Distinct from Format
Just like DocBook, DITA content is authored without formatting. Rather, the formatting (layout, styling, etc.) is applied automatically upon publishing.
DocBook is an XML language developed specifically for the creation of books and papers about computer hardware and software. DocBook is a large and robust schema whose main structures correspond to the general notion of what constitutes a “book.”
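As a rough illustration of content kept distinct from format, the Python sketch below takes one unformatted topic and publishes it two ways. The sample topic and the to_html and to_text helpers are assumptions made up for this example; actual publishing pipelines are far more elaborate.

```python
import xml.etree.ElementTree as ET
from html import escape

# A made-up topic: structure only, no fonts, colors, or layout.
TOPIC = """<topic id="reset">
  <title>Resetting the router</title>
  <body>
    <p>Hold the reset button for ten seconds.</p>
    <p>Wait for the status light to turn green.</p>
  </body>
</topic>"""


def to_html(topic):
    """Apply web formatting at publish time."""
    title = escape(topic.findtext("title"))
    paragraphs = "".join(f"<p>{escape(p.text)}</p>" for p in topic.iter("p"))
    return f"<h1>{title}</h1>{paragraphs}"


def to_text(topic):
    """Apply plain-text formatting to the very same source."""
    title = topic.findtext("title")
    lines = [title, "=" * len(title)]
    lines += [p.text for p in topic.iter("p")]
    return "\n".join(lines)


topic = ET.fromstring(TOPIC)
print(to_html(topic))
print(to_text(topic))
```

The author never touches styling. Changing how the output looks means changing a publishing function, not the content.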
🏗️ Created in 1991 as a joint project of HAL Computer Systems and O’Reilly & Associates.
💼 In 1994, DocBook maintenance was taken on by the Davenport Group before being handed to SGML Open, which later became OASIS. Currently, it is maintained by the DocBook Technical Committee at OASIS.
Core Principles at the Founding
Book Based Authoring
DocBook changed the way that authors could write and think about a book. Rather than treating it as a fully intertwined and linear process, DocBook broke a book into its structural elements, such as chapters, sections, and appendices.
DocBook would then allow these elements to be created in a structured and hierarchical manner.
Content Distinct from Format
Like DITA, DocBook content is authored without formatting; rather, the formatting (layout, styling, etc.) is applied automatically upon publishing.
📅 All About Timing
So far, we’ve seen a lot of similarities between DITA and DocBook. Like two brothers born ten years apart, there’s a lot of overlap in the DNA.
However, just like those brothers, DITA and DocBook are each a product of their own respective eras or zeitgeists.
One particular shift occurred between the two standards’ beginnings.
Mass adoption of the Internet.
Remember, DITA was founded in 2001 specifically for a world in which users consumed product information online. On IBM’s website, in the article announcing the introduction of DITA, the very first paragraph states that:
This architecture consists of a set of design principles for creating “information-typed” modules at a topic level and for using that content in delivery modes such as online help and product support portals on the Web.
That just makes sense.
In 2001, IBM was on the front lines of accelerating innovation. They recognized that people consume content differently online as opposed to a physical book or document. If users want an answer, they type it into a search bar and jump to the section or topic that specifically addresses their query. That’s the world we live in and that’s the world for which DITA was designed.
DocBook was founded in 1991 by a physical book publisher. DocBook was perfect for addressing the needs of the time. Unfortunately, times were a-changing and so were the needs.
For reference, it was in 1993 that Marc Andreessen and his team introduced Mosaic, the first popular web browser that worked on Microsoft Windows.
When DocBook was founded, the Internet was still a largely unknown vector for user engagement. Developing a standard for print-first just made sense.
As a result, DocBook was not designed to be modular and topic-based. Rather, it was designed to be hierarchical.
This doesn’t mean that topic-based was impossible with DocBook. Section elements made it possible to construct content in a modular manner, but it required someone to actually work against the natural proclivity of the standard itself.
🔬 Evolution of the Standards
Ok, we’ve looked at the origin, the similarities, and some context. But the big question is how these two standards have changed over time.
Both standards made changes in subsequent versions.
Version Updates for DITA 🕊️
DITA, for instance, had a number of minor version updates after DITA 1.0:
- DITA 1.1
- DITA 1.2
- DITA 1.3
The current version is 1.3.
All of these minor versions were entirely backward compatible because they built on the capabilities already established in the initial version of the standard.
Version Updates for DocBook 🦆
DocBook is a different story.
DocBook has undergone a number of major version updates such as:
- DocBook 3
- DocBook 4
- DocBook 5
None of these major versions are backward compatible.
Additionally, DocBook also released multiple minor versions (4.1, 4.2, 4.3, 4.4, 4.5, etc.).
The current version is 5.1.
Over the same period of time, DITA underwent far fewer version updates while DocBook went the opposite way and produced multiple major releases.
Playing catch-up. As mentioned earlier, DocBook was developed before the Internet went mainstream, and that became a problem. In many ways, the DocBook major releases were attempts to reconcile the standard with the world as it had become. For instance, it wasn’t until DocBook 5.1 (the current version) that DocBook followed DITA’s lead by introducing the assembly and topic elements. These are the equivalents of DITA’s map and topic elements, which have been present since DITA 1.0.
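For readers who haven’t seen a map before, here is a small Python sketch of the idea on the DITA side: standalone topics are pulled into a deliverable by a separate map of references. The map contents, file names, and the outline helper are hypothetical, and the sketch only walks the hierarchy rather than publishing anything.

```python
import xml.etree.ElementTree as ET

# A made-up map: it holds no content, only references to topic files.
MAP = """<map>
  <title>Router Guide</title>
  <topicref href="overview.dita"/>
  <topicref href="setup.dita">
    <topicref href="reset.dita"/>
  </topicref>
</map>"""


def outline(node, depth=0):
    """Yield (depth, href) pairs in reading order."""
    for ref in node.findall("topicref"):
        yield depth, ref.get("href")
        yield from outline(ref, depth + 1)


for depth, href in outline(ET.fromstring(MAP)):
    print("  " * depth + href)
```

Because the hierarchy lives in the map and not in the topics, the same topic file can appear in several maps without being rewritten.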
🧭 Where are they now?
Ok, at this point you might be thinking:
“DITA might have led the way, but isn’t DocBook fully caught up now?”
Not really. There are a few key distinctions that still persist between the standards such as content reuse, linking methodology, and other in-the-weeds topics.
But we’re trying to stay big-picture here and one massive distinction towers above the rest.
Confidence in the Standard
Yes. Confidence in the standard.
That might seem a little intangible, like saying a football player is successful because of their “grit,” but bear with me while I paint the picture of where the confidence gap originated.
You see, it all started with DITA’s decision to include Specialization.
What is Specialization in DITA XML?
Specialization is the act of creating new designs based on existing designs. This allows new kinds of content to be processed using existing processing rules.
That sounds… innocuous?
Basically, specialization lets you create a new element type or attribute from the parts of existing element types and attributes.
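Here is one way to picture how existing processing rules keep working. In DITA, every element carries a class attribute listing its ancestry, so a processor that has never seen a specialized element can fall back to the base type it was derived from. The Python sketch below assumes a heavily simplified task fragment with the class values written out by hand (normally the DTD or schema supplies them as defaults), and the RULES table and render function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified fragment: <steps> and <step> are specializations of the
# base list elements. A real <step> also wraps its text in <cmd>.
SNIPPET = """<steps class="- topic/ol task/steps ">
  <step class="- topic/li task/step ">Open the cover.</step>
  <step class="- topic/li task/step ">Remove the battery.</step>
</steps>"""

# Hypothetical rules that only know about the base topic types.
RULES = {
    "topic/ol": lambda inner: f"<ol>{inner}</ol>",
    "topic/li": lambda inner: f"<li>{inner}</li>",
}


def render(element):
    """Render an element using the most specific rule available."""
    inner = (element.text or "").strip() + "".join(render(c) for c in element)
    # Drop the leading "-" marker, keeping tokens like "topic/li".
    tokens = element.get("class", "").split()[1:]
    # Try the most specialized type first, then fall back to its ancestors.
    for token in reversed(tokens):
        if token in RULES:
            return RULES[token](inner)
    return inner


print(render(ET.fromstring(SNIPPET)))
```

No rule exists for task/steps or task/step, yet the fragment still renders as an ordered list because each element declares the base type it was built from.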
That might still sound unremarkable, but the greatest impact of specialization is not how it’s used but what it communicates and what it prevents.
The Big Bucket Approach
DocBook does not use specialization and, for many users, that’s fine. Specialization is not an easy process, so it’s not something a smaller team is likely to undertake.
Instead, DocBook takes the Big Bucket Approach. By that, I mean that it just keeps adding aggressively to the number of elements and attributes in the standard.
This is a big reason for the number of updates to the DocBook standard through the years. Specialization allows users to extend the standard while still remaining within the standard. DocBook users, however, have two options. They can hope and lobby for their element to be added to the next version of DocBook or they can “tweak” their use of the standard.
To Tweak or Not To Tweak? 💀
It ain’t Hamlet or an existential crisis, but it is a big consideration for your content model.
Is it okay to “tweak” the open standard?
Whatever your answer, it’s clear that companies that use DocBook are less likely to stick to the standard.
And it’s clear to see why:
DITA XML has remained consistent throughout its existence. Founded on the principles of topic-based online consumption of content, the standard has remained consistent and backward compatible. For this reason, vendors, users, and integration partners know that the DITA standard is worth sticking to.
It earned the street cred as a trusted open standard.
Meh, maybe. DocBook has continuously shifted its standard, chasing the kind of topic-based content architecture that DITA established. As a result, the DocBook standard simply isn’t trusted in the same way. Most systems built on DocBook… aren’t really built on DocBook.
They might use DocBook as a jumping-off point, but they actually have to extend DocBook beyond its capabilities to get the results they want.
The outcome is an open standard in name only.
Do You Need an Open Standard?
Maybe not, but an important consideration is where you want to go in the future.
The power of an open standard is the interoperability between different systems and applications. DITA is recognized as an open standard, not just in name but also in practice. Since its inception, the DITA standard has never had a major overhaul that alienated the user base for the sake of playing catch-up. That consistency speaks volumes when it comes to attracting partners willing to invest.
If you want a system that does what you need today, there are a lot of options, and a DocBook-based system can be one of them.
However, if you want a system that is built on an open standard that you can continue to rely on, even after your current system no longer meets your needs, then you should take a look at DITA XML 🕊️
If you want to see what a DITA-based system looks like in action, register for a demo of easyDITA.
easyDITA is an authoring system, a Component Content Management System (CCMS), and a multi-channel publishing engine all in one. It turns your content into flexible, actionable, and manageable data.