As a co-founder of Jorsek LLC, the developer of easyDITA, a Darwin Information Typing Architecture Component Content Management System (DITA CCMS), I have been involved with the DITA standard since it was open-sourced by IBM in 2005 under the auspices of OASIS. After several years, with multiple CCMSs entering the market, it became apparent that there was a need for an international standard for a CCMS to define best practices, ensure compatibility among systems, and encourage adherence to accepted open information architecture standards. I became involved in the process of proposing the standard to ISO/IEC/IEEE (three of the largest standards developing organizations in the world) and eventually became one of three primary authors. This piece is an anecdotal description of that process and how it (and most standards development processes) could be greatly improved by actually writing and reviewing the standards in a DITA CCMS.
Since writing the standard, I have wanted to take some time to do a retrospective on the process and offer some ideas to other standards developers that could help make the process far more efficient and far less stressful.
First, A Bit of Irony
As you read on, you will find that many of my observations come from the fact that we weren’t using any Content Management System or topic-based authoring tools when developing the standard, other than at the very beginning of the process.
It’s ironic that we were authoring a standard on best practices for content management, without using any structured content management tools, beyond our initial planning stage…and throughout the process we were losing hundreds of hours to problems/bottlenecks that using a CCMS would have eliminated. However, this was intentional.
Let me clarify. Since easyDITA (my company’s product) is a CCMS, I felt it was very important that the standard's authors not use it during the process, to avoid unintentionally creating bias. Additionally, the use of CCMS and topic-based authoring tools is currently not part of ISO’s workflow, though, as you will find out, there are significant reasons why it should be.
1) Draft Authoring: We Started With DITA
The draft version of the standard was written collaboratively by Dr. JoAnn Hackos, Bob Boiko, and me. Dr. Hackos and I actually began writing the standard in DITA during the draft phase. Dr. Hackos and I were already familiar with DITA; Bob was not, but he was able to come up to speed in just a few days. By collaborating in DITA, we found the process was very streamlined:
The first step was designing the high level structure of the standard and dividing up work:
- We used DITA maps and stub topics (placeholders, content TBD) to create a high level Table of Contents (TOC) for the entire standard
- Work was easily divided between authors because all the sections were componentized. Individual topics were assigned to each author, and changes incorporated over time without any merging problems.
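The planning step above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling we actually used: the section IDs, titles, and author names are hypothetical, and the DOCTYPE declarations a real DITA toolchain expects are omitted for brevity. It builds a DITA map that serves as the table of contents, plus one placeholder topic per section with the assigned author recorded in the prolog.

```python
import xml.etree.ElementTree as ET

# Hypothetical section IDs, titles, and assignees, for illustration only.
SECTIONS = [
    ("scope", "Scope", "author-a"),
    ("terms", "Terms and definitions", "author-b"),
    ("requirements", "Requirements", "author-c"),
]


def build_stub_topic(topic_id: str, title: str, author: str) -> ET.Element:
    """Return a minimal DITA topic whose body is a placeholder."""
    topic = ET.Element("topic", id=topic_id)
    ET.SubElement(topic, "title").text = title
    prolog = ET.SubElement(topic, "prolog")
    ET.SubElement(prolog, "author").text = author
    body = ET.SubElement(topic, "body")
    ET.SubElement(body, "p").text = "Content TBD"
    return topic


def build_map(map_title: str, sections) -> ET.Element:
    """Return a DITA map with one topicref per stub topic, in TOC order."""
    ditamap = ET.Element("map")
    ET.SubElement(ditamap, "title").text = map_title
    for topic_id, _title, _author in sections:
        ET.SubElement(ditamap, "topicref", href=f"{topic_id}.dita")
    return ditamap


ditamap = build_map("Draft standard", SECTIONS)
topics = {tid: build_stub_topic(tid, title, author) for tid, title, author in SECTIONS}

# Print the map; each topic would be written to its own .dita file.
print(ET.tostring(ditamap, encoding="unicode"))
```

Because each section lives in its own file, two authors editing different topics never touch the same document, which is why we saw no merging problems at this stage.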
Using DITA meant tracking work was a simple process without version control issues:
- It was easier to track progress than when using MS Word because we could look at the state of each topic and the comments on them. The simple fact that we could look at the last time a specific topic was modified meant that we could see progress at a very granular level. If this process had been done in a DITA CCMS like easyDITA, we could have used its commenting, metadata, document statuses and Assignments system to even further improve this. It would have significantly streamlined the ability to see the total project status at a quick glance and collaborate on moving the content through its workflow.
DITA eliminates common issues with styling or versions of MS Word:
- Using DITA also meant that we didn’t have to worry about the styling of the content, or dealing with differences between versions of MS Word that authors were using (more on this later as it turned out to be a huge problem once we were forced to move into using MS Word)
- Besides differences in the voice used by different authors, the content was automatically created consistently because the structure had to adhere to the DITA spec. Because DITA is an XML-based standard, there was no styling or structure cleanup needed to ensure consistency even though there were three different authors. Over the course of the draft phase, this likely amounted to weeks’ worth of saved time between all the authors.
But Then We Had To Move To Using MS Word…
After the draft phase, we were forced to move into MS Word because this is what ISO used at the time. We easily converted our DITA content to an MS Word document with the push of a button. Going forward, we quickly hit some large bottlenecks because of the workflow around MS Word.
2) Working Draft Stage
An ISO standard under development goes through many rounds of review from members around the world. During this phase, Word documents were passed around via email to different groups for review and comments. Reviewers would leave feedback using MS Word track changes and commenting features.
The Painful Process of Receiving and Merging Draft Comments
There were roughly 10 reviewers during this phase and it lasted about 8 months. This process was excruciatingly painful, since each round of review meant getting 5+ new Word documents with comments that all had to be merged into a single document. Furthermore, there was no standardized version of MS Word in use. The differences between versions for users in different countries (Japan, Germany, US) were significant. Oftentimes, when comments came back, the styling of the documents was mangled. Then, someone on the team had to manually go through and merge all these documents. This also created confusion because the original author attribution of a comment was sometimes lost, making it difficult to determine whom to contact when reviewing specific comments.
In addition, duplicate comments were created because many reviewers were working in parallel and would often find the same issues, but could not see each other's comments or know whether they had already been resolved.
Based on these issues alone, I estimate that the process of sending out documents, getting review feedback and then manually merging them, cost many hundreds of hours throughout the project that could have been better utilized.
Time Was Lost Reviewing Styling, Rather Than Content
Since there was no consistent styling, and styling was often broken across different versions of MS Word being used, reviewers spent a lot of time discussing and commenting about style changes. Reviewers would complain about things like bullet styling, bullet numbering and other inconsistencies. Since not everyone saw these issues in their version, it was even more confusing when trying to address them.
ISO also has specific guidelines for how documents must be formatted; however, without DITA, nothing clarifies or enforces them. Authors would break the styling because they didn’t know the guidelines, and the formatting would then need to be fixed manually later on.
Endless Meetings Were Dedicated To Resolving These Issues
During weekly meetings to discuss comments, it was really hard to keep everyone on track since at least a quarter of the time was spent addressing confusion related to styling and versioning. The meetings would be filled with dialog like:
A: “What page are you on again? You said that was on page 25, but for me it is on page 26.”
B: “Mr. Z, you changed the numbering style on these bullets but ISO guidelines say they should be the way they were before.”
X: “The graphic is going off the page, we need to fix that.”
Y: “It looks fine in my version.”
With 10 or so people on a call, many of whom were not native English speakers, we frequently ran through a two hour meeting without really digging into anything significant. Some weeks we had 6 hours of meetings just to get through 20-50 comments.
What Version is Correct?
During this phase, most of the collaboration was done via email. Versions of the standard were emailed back and forth, merged, and then redistributed. Sometimes comments would not be incorporated correctly and no one would notice until several weeks later. At that point there would be more new versions, and the documents had changed significantly enough that determining where those comments belonged was very difficult. Doing a quick search through my inbox, I see that I have more than 200 emails with different versions of the standard which were sent back and forth during the development process. 200 different versions of a doc in my inbox alone. Now multiply that by ten to twelve other people and you can understand the problem.
3) Draft International Standard Phase: Moving To MS Excel
During this phase, the standard is circulated to ISO members, who vote and comment on it. Because there are many more reviewers during this phase, capturing comments and changes directly in Word was unsustainable. Instead of modifying the documents, reviewers recorded their comments in Excel spreadsheets and submitted them back to the working committee.
Where Does This Comment Belong?
The spreadsheet doesn’t match the Word doc…
Users were still reviewing the document in MS Word, but comments were aggregated in a spreadsheet with a reference to the section in Word where the comment applied. Because of differences in Word versions, the section numbers referenced in the spreadsheet didn’t always line up with what authors would see in their versions. In addition, in many instances reviewers were not even reviewing the right version of the document because they accidentally downloaded the wrong version, or used an old version they had in an email by mistake. Frequently, comments were not even relevant anymore, or couldn’t be tracked back to the correct part of the document. This resulted in many extra meetings just for the purpose of asking the original commenter to show us where the section was that they were referencing.
As an author, my job was to review comments and respond with recommendations for changes or explain why changes were not appropriate. Roughly 50% of the time I spent reviewing comments was actually spent reconciling the notes in the spreadsheet with the content they were supposed to reference in the standard.
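The mismatch described above is easy to reproduce. The sketch below is a simplified model, assuming a hypothetical four-section outline: it shows one of the failure modes we hit (a reviewer commenting against an older draft) and why a reference by position breaks while a reference by a stable topic ID, which is how topic-based content is addressed, does not.

```python
# Hypothetical outlines: (stable topic ID, title). Section numbers are derived
# from position, the way a word processor numbers headings.
outline_v1 = [("scope", "Scope"), ("terms", "Terms"), ("reqs", "Requirements")]

# A later draft inserts a new section near the top.
outline_v2 = [("scope", "Scope"), ("refs", "Normative references"),
              ("terms", "Terms"), ("reqs", "Requirements")]


def section_number(outline, topic_id):
    """Return the 1-based section number a topic has in a given draft."""
    return [tid for tid, _ in outline].index(topic_id) + 1


def title_at_section(outline, number):
    """Return the title found at a given section number in a draft."""
    return outline[number - 1][1]


# A reviewer reading draft 1 comments on "Terms", which is section 2 there.
comment_by_number = {"section": section_number(outline_v1, "terms"),
                     "text": "Define 'component'."}
comment_by_id = {"topic_id": "terms", "text": "Define 'component'."}

# Resolved against draft 2, the number-based reference lands on the wrong section,
# while the ID-based reference still resolves to the intended one.
wrong = title_at_section(outline_v2, comment_by_number["section"])
right = dict(outline_v2)[comment_by_id["topic_id"]]

print(wrong)  # Normative references
print(right)  # Terms
```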
Who’s Responsible for Addressing This Comment?
Since there was no way to track who wrote which parts of the standard, and comments were detached in a spreadsheet, we would often have meetings just to determine who was responsible for addressing a specific comment. I would often have to go through all the comments even though only a subset of them applied to the content I had developed. This was troublesome even with just a few authors; for larger standards with many authors I imagine this would be extremely difficult to handle.
4) Translation Stage: Stop Everything
I wasn’t part of the translation process, but during this process we had to stop all work while the documents were processed and were not allowed to make any significant changes afterwards, because the translation process was time-consuming and expensive. If a problem was found, and it was not critical, it wouldn’t be fixed.
This process could have been dramatically better if we used a DITA CCMS throughout the process
There are three important takeaways from this hypothesis:
- First, we would likely have developed a higher quality standard, produced in less time using fewer resources.
- Second, the time from initial proposal to final approved standard could have been greatly reduced. Using the current tools (Word, Excel) to create the standard and get it approved took about 3 years. This affected the relevance of the standard even as we completed it. We could have gotten the whole process done in 1-2 years and the standard would have had higher value to ISO members. In addition, updates could be made, reviewed and approved much faster, making the standards much more relevant in a rapidly changing world.
- Finally, standards are often written and reviewed by unpaid volunteers (sometimes subsidized by their employers) with limited time and bandwidth. Saving them time and eliminating frustrating processes would attract more talent to the worldwide standards development field.
Let’s look at the alternative to the process that we could have used.
Authoring Content In A DITA CCMS
A DITA CCMS is a structured content repository that consists of a single source database accessed remotely by writers, reviewers, editors and approvers. The content exists only in one central version and the application tracks all interactions with the content on multiple levels. The file formats are XML, which means formatting is not handled by the authors; it is set at a system level, ensuring consistent adherence to the requirements of the Standards Developing Organization (SDO).
As a result, authoring and reviewing in a DITA CCMS removes nearly all the pain points and bottlenecks described above:
- Stubbing out content and topics/sections during the planning process is very easy and creates a high level roadmap of what is to be developed that everyone can collaborate on
- Participants log into the system from any location globally, rather than sending emails back and forth with attachments
- Status can be assigned at the topic level so it’s very easy to see what content is still being drafted
- Content formatting is automatically made consistent because of rules imposed by DITA and separation of content and style/formatting. The ISO formatting guidelines would be enforced via DITA.
- One version (one source of the truth) of content completely removes all of the problems associated with emailing multiple versions of Word documents back and forth
- There is the ability to compare content with past versions, and rollback content if needed
- Comments are captured in a centralized system, instead of a spreadsheet, and are maintained directly in the context of the specific content they relate to:
  - Prevents losing comments and ensures the authorship of the comment is known
  - Allows replying to comments directly, instead of via email, and allows notifications to automatically be sent to users when new comments are added on their content or replies to their comments are created
  - Prevents duplicate comments being created by different reviewers (this was a big problem)
- Permissions can be applied to various users to prevent reviewers from changing content they shouldn’t be able to. Permissions can be applied to content in various stages of the lifecycle so that content which has already been approved is automatically frozen, and no users can modify it. This allows some portions of the document to be frozen while others are still being worked on vs. in MS Word, where it is all or nothing.
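Two of the points above, comments kept in context and approved content being frozen, can be illustrated with a small data model. This is a hedged sketch, not how easyDITA or any particular CCMS is implemented: the `Topic` and `Comment` classes, the three-state lifecycle, and the reviewer names are all hypothetical, and matching duplicate comments by their text is a crude stand-in for reviewers simply seeing each other's comments in place.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    text: str
    resolved: bool = False


@dataclass
class Topic:
    topic_id: str
    status: str = "draft"  # hypothetical lifecycle: draft -> review -> approved
    body: str = ""
    comments: list = field(default_factory=list)

    def add_comment(self, author: str, text: str) -> Comment:
        """Attach a comment to this topic, reusing an identical existing one."""
        for existing in self.comments:
            if existing.text.strip().lower() == text.strip().lower():
                return existing  # already raised; the original author is kept
        comment = Comment(author, text)
        self.comments.append(comment)
        return comment

    def edit(self, new_body: str) -> None:
        """Change the content unless the topic has been approved (frozen)."""
        if self.status == "approved":
            raise PermissionError(f"{self.topic_id} is approved and frozen")
        self.body = new_body


scope = Topic("scope", status="approved", body="Approved text")
terms = Topic("terms")

# Two reviewers raise the same issue; only one comment is stored.
first = terms.add_comment("reviewer-jp", "Bullet 3 is ambiguous.")
second = terms.add_comment("reviewer-de", "bullet 3 is ambiguous.")

# Draft topics stay editable while approved ones are locked.
terms.edit("Revised definition")
frozen = False
try:
    scope.edit("Late change")
except PermissionError:
    frozen = True
```

The key design choice is that status, comments, and permissions hang off the individual topic rather than the whole document, so one section can be locked while its neighbors are still under review.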
These are just a few of the advantages inherent in using a DITA CCMS to author and review standards. Finally, let’s look at the significant potential Return On Investment (ROI) for all the stakeholders in the standards development process.
In Conclusion: The Benefits Would Be Substantial
With multiple authors, reviewers and dozens of others spending considerable time and effort on standards development, often on a volunteer basis, any improvement in the technology used to create standards can mean significant and measurable savings and other benefits:
- Time Savings. A standard created in a DITA CCMS could be brought to completion in a much shorter timeframe. The resulting time savings imply a faster time to market and earlier realization of revenues for the SDO.
- Updates. Any required updates could be made at the topic level and the standard easily republished in days rather than years.
- Reuse and Translation Management. Structured content lends itself to flexible reuse of ‘chunks’ of content and enables translation to be managed at the updated topic-level.
- Publishing. Because the standard exists as XML, publishing to a variety of emerging media (web, mobile, tablet, wiki, PDF, Word, etc.) can be accomplished, with consistent formatting, with a few clicks.
- Monetization. Because standards authored in this model exist as data, they can easily be bundled into relevant sets and sold as subscriptions that include regular updates.
These benefits, and the substantial savings they represent, present a compelling case for SDOs the world over to consider reevaluating the way standards are created, managed, and distributed.
Casey Jordan is co-founder of Jorsek LLC, creators of the easyDITA DITA CCMS. He is a co-author of the ISO/IEC/IEEE 26531 Standard: Systems and Software Engineering - Content Management for Product Lifecycle, User and Service Management Documentation.
In 2005 Casey co-founded Jorsek LLC, with the goal of improving how the world created and exchanged content. Jorsek developed and brought to market a new component content management system focused around DITA, an open standard for authoring, organizing, and delivering high-value information.
Casey played a key role in developing the business model and architecture for Jorsek's flagship product, an advanced web-based XML content management system called easyDITA.