The software industry proved that continuous integration (the use of small, discrete components that are tested and integrated frequently) saves time and money. We believe the three principles of continuous integration will also make your documents better and less expensive. When we implemented these principles for ourselves, we were surprised by further benefits.
Documentation is the last phase before product release. The docs follow the product. Writers seem to accept this almost as a natural rule. After all, they ask, “How can I be confident about the way something works before it’s produced?” So, in software, the documents (the “How To” part of the product) are often not ready for publication until weeks or months after the “Can Do” part of the product. And even worse than the long, expensive wait, the documentation, unlike the software itself, is seldom measured against requirements. In fact, we seldom know whether the documents themselves actually “work”.
In software development, confidence comes from continuous integration and testing
To the question “How can you be confident …?” developers offer a different answer than writers do. Developers gain confidence in their new code in small steps. In a process they call “continuous integration”, they apply changes frequently and regularly, often several times in a day. They run automated tests on every change, and they circulate the results among team members. They follow this pattern through dozens or hundreds of iterations, gradually building a shared confidence and accumulated understanding of their product in all its complexity. A more reliable product is created in less time because quality control is respected and enforced through the entire process rather than being applied as a final step.
Modular development and testing in a rapid cycle — continuous integration — is the fundamental aspect of the software development revolution that began 20 years ago and is now all but complete.
Confidence in the documentation process is less certain
And what remains? The docs, of course. Documentation writers are still almost forced to wait patiently for the product so that they can learn it, then write about it, before finally publishing the manual as either:
1) a perfectly edited text that is accurate, comprehensive and an essential part of the customer’s satisfying experience with the product, or
2) a text that is sufficiently large and tidy to reassure the purchaser that there must be something useful among all those pages.
Could it be different? Could writers stand up and lean over the tops of their partitions to see the way developers are working? Could modular, test-oriented development methods be successfully applied to documentation? (Could we all just get along?)
“Ahhh … no,” many would answer. “Documents are not code, they’re text.”
We think that’s wrong, and recently, almost by accident, we realized that we can demonstrate it. We think that documentation, like the code itself, has to be developed in a process of continuous integration, with changes that are applied frequently and regularly and are then tested — and the test results circulate among the team members, especially the developers.
We’ll go even further. The documentation of the product is actually the foundation of software testing; without correct documentation, the software itself cannot be shown to actually work.
The “T” in DITA stands for “Typing”
If you’re a developer, typing is information about the kind of data your program specifies. Typing answers the question, “With this set of symbols, are you trying to represent a big number, a little number, text, yes/no, or what then?” Typing is the everyday stuff of industrial-strength development languages because it kills bugs and saves millions.
One of the bold steps taken by the Darwin Information Typing Architecture (DITA) is to offer typing (classification, not referring to keyboarding) to writers. Is this block of text a Concept, a Reference, a Task, or the all-encompassing Topic type? If writers specify and distinguish these types then automatic things begin to happen: Documentation continuously integrates with engineering and writers are no longer left at the end of the process trying to clean up after the party.
In fact, as we’ve discovered, writers can start contributing as early as the design phase of the project. Providing “documentation first” can offer clarity, precision and accuracy that lead to a better product, one that is more profitable to the business because it’s available more quickly.
Among the types Concept, Reference and Task, it’s the Task that is the most highly regulated and detailed. That’s why Tasks make it very easy to integrate documentation with development, and we’ll get to that in a minute. But before we explore the Task itself, let’s not ignore the value of topic modularity and typing as a more general principle.
By encouraging writers to differentiate types, DITA provides document managers the data they need to make decisions about the selective update of text. Unlike traditional (pre-DITA) documentation, modular, typed text can provide a single, unambiguous place to update information about a corresponding software component. The questions “What to change and test?”, and “What not to change or test?” both have easy answers. Further, in document sets that are large enough to have significant commercial value, finding what has to be changed, guaranteeing it, and making sure that internal inconsistencies are not accidentally created, is not only possible, it’s accessible to low-cost automation.
If you were part of a development team in the early 1990s, this will all sound familiar. For developers, the choice back then was clear:
- Create smaller, more independent, and testable code modules
- Apply and test changes as regularly as possible and fix issues as soon as they occur
- Automate and track the whole process
And to be fair, developers faced a production management problem that was simpler in significant ways than the problems faced today by writers. Natural language is subject to much more customer scrutiny than source code, and it is much harder to automatically parse and analyze. All the more reason, we think, to apply the three basic principles.
Principle 1: Create smaller, more independent, and testable modules that correspond to product components
If we learn to write small, discrete topics, and build relationships (linking) among them, we can keep conceptual, procedural and factual information separate. Then, if we align them, via metadata, with the product components, we get three huge effects. We know:
- What to update when the product is updated
- What to test when either the documentation changes or the product changes
- How the updates will affect other parts of the documentation
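The three effects above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Topic` model, the component names such as `editor.images`, and the topic ids are all hypothetical, and a real DITA repository would carry this metadata in the topics and maps themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """A small, typed documentation module (hypothetical model)."""
    topic_id: str
    topic_type: str                  # "concept", "reference", or "task"
    components: frozenset            # product components this topic documents
    links: frozenset = frozenset()   # ids of topics this topic links to


def topics_for_component(topics, component):
    """Return ids of topics whose metadata names the changed component."""
    return sorted(t.topic_id for t in topics if component in t.components)


def topics_linking_to(topics, changed_ids):
    """Return ids of other topics that link to any of the changed topics."""
    changed = set(changed_ids)
    return sorted(
        t.topic_id for t in topics
        if t.topic_id not in changed and t.links & changed
    )


# Hypothetical document set aligned with hypothetical product components.
TOPICS = [
    Topic("insert-image", "task", frozenset({"editor.images"})),
    Topic("image-formats", "reference", frozenset({"editor.images"})),
    Topic("about-maps", "concept", frozenset({"maps"}),
          frozenset({"insert-image"})),
]

to_update = topics_for_component(TOPICS, "editor.images")
to_recheck = topics_linking_to(TOPICS, to_update)
print("Update and test:", to_update)
print("Check for side effects:", to_recheck)
```

The first list answers "what to update and test" when a component changes; the second answers "what else might be affected" by following links back to the changed topics.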
Principle 2: Apply and test changes as regularly as possible and fix issues as soon as they occur
If we learn to write independent topics we don’t have to deal with large, unwieldy ranges of text all at once. This typically makes the work easier and more reliable for an individual writer — especially one whose attention is divided — but it’s critical where two or more writers operate on the same document set. High-quality updates are available at an earlier stage of production.
If we notify the team when topics change, they can more easily help us with review because they can work immediately in small, manageable increments. We can also respond to product changes in the same immediate, incremental way by reviewing and testing corresponding topics. User testing of the topics can reveal several kinds of failure. We can test whether the user completed the procedure that the topics describe and, if not, we can find out why in time to make the fix. This can be a test of concepts and references along with the related tasks.
Following Principle 2 yields at least four benefits:
- Team members get immediate feedback on the quality and direction of documentation, building confidence and unity of direction for the team
- All team members, writers, developers and testers, are constantly aware of on-going changes, and writers can take this into account while creating or changing content
- Issues are identified and fixed when they are small and manageable, before they become complex and inter-dependent
- Documentation can provide reliable data to development. Actively testing documentation against user reaction will identify bugs in the product sooner so the development team can correct them earlier
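Notifying the team when topics change first requires knowing which topics changed. One minimal way to do that, assuming topics are available as plain text keyed by a topic id, is to compare content fingerprints between two snapshots. The function names and sample topics below are hypothetical; in practice a version control system or CMS would usually supply this information.

```python
import hashlib


def fingerprint(text):
    """Return a stable hash of a topic's text; any edit changes the value."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def snapshot(topics):
    """Map each topic id to the fingerprint of its current text."""
    return {topic_id: fingerprint(text) for topic_id, text in topics.items()}


def changed_topics(old, new):
    """Return sorted ids of topics that were added, removed, or edited."""
    return sorted(
        topic_id for topic_id in set(old) | set(new)
        if old.get(topic_id) != new.get(topic_id)
    )


# Hypothetical topic texts before and after a writer's update.
before = {
    "insert-image": "Choose Insert, then Image.",
    "image-formats": "PNG and JPEG are supported.",
}
after = {
    "insert-image": "Choose Insert > Image.",
    "image-formats": "PNG and JPEG are supported.",
    "resize-image": "Drag a corner handle.",
}

for topic_id in changed_topics(snapshot(before), snapshot(after)):
    print(f"Topic changed, please review: {topic_id}")
```

Only the stored fingerprints need to be kept between runs, so the comparison stays cheap even for large document sets.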
Principle 3: Automate and track anything that can be automated and tracked. Anything.
Even though document automation is much more complex than software automation, there is compelling potential. For example, it’s now possible to automate regular documentation builds to multiple outputs. We can also automatically detect events on the development servers and use that information to trigger various work-flows.
Imagine the automated testing of interactive content in the API documentation: we can simulate inputs and check expected outputs before we have access to users. And, of course, we have the automation benefits that are supported by XML itself: the validation of content against content models, the verification of correct linking behaviors, and the capacity to apply grammatical, taxonomic and translation quality checks at the level of topics or even phrases and words.
With even partial application of Principle 3, we get:
- Deliverables that are constantly available for testing, demo or sample purposes, at any required version
- Validation, linking, grammatical, and structural issues that are identified and corrected before any formal review process
- Interactive systems that can be tested through simulation, decreasing the time needed for users to spend “in front of” the application
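As a rough illustration of the structural and linking checks mentioned above, the sketch below uses Python's standard `xml.etree.ElementTree` to look for a few common problems in a single Task. It is not a substitute for validating against the DITA DTDs or schemas; the function name, the sample Task, and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_task(xml_text, known_files):
    """Return a list of human-readable problems found in one DITA Task."""
    task = ET.fromstring(xml_text)
    if task.tag != "task":
        return [f"root element is <{task.tag}>, expected <task>"]

    problems = []
    title = task.find("title")
    if title is None or not "".join(title.itertext()).strip():
        problems.append("missing or empty <title>")

    steps = task.findall("./taskbody/steps/step")
    if not steps:
        problems.append("no <step> elements")
    for number, step in enumerate(steps, start=1):
        if step.find("cmd") is None:
            problems.append(f"step {number} has no <cmd>")

    # Check only local file links; external URLs are left alone.
    for xref in task.iter("xref"):
        target = xref.get("href", "").split("#")[0]
        if target and "://" not in target and target not in known_files:
            problems.append(f"broken link: {target}")
    return problems


# Hypothetical Task with two deliberate mistakes.
BROKEN_TASK = """\
<task id="insert-image">
  <title>Insert an image</title>
  <taskbody>
    <steps>
      <step><cmd>Choose Insert &gt; Image.</cmd></step>
      <step><info>See <xref href="missing.dita"/>.</info></step>
    </steps>
  </taskbody>
</task>
"""

problems = check_task(BROKEN_TASK, {"image-formats.dita"})
for problem in problems:
    print(problem)
```

Run on every change, a check like this surfaces structural and linking issues long before a formal review.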
DITA Tasks, under the hood, are a lot like code
Developers expect to see their test failures while running a “test script” or verifying a “use case”. Both of these are, ideally, documents that have a regular pattern including, at a minimum, a title, preconditions/prerequisites/context, steps, and an expected result.
DITA is very particular about Tasks. It adopts that pattern, augments it, and adds validation because tech writers have always known that the “how” is the prize their users seek. Documentation managers also know that Tasks, unlike the other DITA topic types, are subject to a high degree of variability, especially when a product changes rapidly, or documentation is shared by many products. A simple change to a product can demand several Task updates. That’s why quality assurance around Tasks is the shortest path to satisfied, successful users.
This is our task in the easyDITA author that shows how to insert an image:
But back to the developers. A typical DITA Task is so carefully structured and detailed that it represents the user interaction with the software. If the user interaction fails to reach the expected result at the end of a Task, you’re seeing a test failure. Put another way, the DITA Task contains both the use case and the test script.
Let’s look at a snippet of the raw XML for our DITA Task:
That’s a simple DITA Task, but it contains all of the kind of information we once wrote in test scripts. So, we don’t write test scripts anymore — we just generate them. That allows us to test the software, of course, but just as important, it allows us to test the documentation.
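To make the idea of generating a test script from a Task concrete, here is a minimal sketch. The Task XML embedded in it is a hypothetical example written for this illustration, not our actual topic, and the output is a neutral dictionary rather than a call to any particular test management API. The element names (`prereq`, `steps`, `step`, `cmd`, `stepresult`, `result`) are standard DITA Task elements.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical Task, written for this illustration only.
TASK_XML = """\
<task id="insert-image">
  <title>Insert an image</title>
  <taskbody>
    <prereq>A topic is open in the editor.</prereq>
    <steps>
      <step>
        <cmd>Place the cursor where the image should appear.</cmd>
      </step>
      <step>
        <cmd>Choose Insert &gt; Image.</cmd>
        <stepresult>The image dialog opens.</stepresult>
      </step>
    </steps>
    <result>The image appears in the topic.</result>
  </taskbody>
</task>
"""


def text_of(element):
    """Collapse an element's text, including inline children, to one line."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def task_to_test_case(xml_text):
    """Map a DITA Task onto the usual parts of a test script.

    Assumes the Task has a <taskbody>; a fuller version would validate first.
    """
    task = ET.fromstring(xml_text)
    body = task.find("taskbody")
    return {
        "source_topic": task.get("id"),
        "title": text_of(task.find("title")),
        "preconditions": text_of(body.find("prereq")),
        "steps": [
            {
                "action": text_of(step.find("cmd")),
                "expected": text_of(step.find("stepresult")),
            }
            for step in body.findall("./steps/step")
        ],
        "expected_result": text_of(body.find("result")),
    }


test_case = task_to_test_case(TASK_XML)
print(json.dumps(test_case, indent=2))
```

Keeping the topic id in `source_topic` is what lets a failed test point back to exactly one documentation topic.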
Once we generate the test cases, they appear in our test case management system, as individual tests:
A discrepancy is a bug. But the less obvious fact is that it’s a bug in a narrowly circumscribed and easily identified part of the software-documentation system. It can be corrected in either or both places immediately.
Opening a test case allows us to see the procedure in the same form as the documentation:
That’s what distinguishes continuous integration of software and documentation from the conventional method: we’re not forced to wait for weeks, and then to chase around our systems to re-align and verify the correspondence between a software component and its documentation topic with respect to version, release, language or any of the other several qualifiers that commercial software is prone to. In short, it’s right there in front of us.
We can also track our time, add comments and results:
And there’s no lack of statistics and reports:
All of our documentation is in easyDITA and the test system shown is TestRail; the APIs and platform that we needed for the generation and consumption of test scripts were close at hand. When test case creation is automated, we can have lots of them — and they’re reliable. It saved us hours and hours of one-off work.
Surprises: the good kind
But it also surprised us. It gave us two capabilities we hadn’t thought of until the system was done: usability testing, and user assessment and progress testing. We were able to demonstrate the suitability of the software to the user’s objectives, and to test the user’s experience and facility with the software.
Test systems like TestRail can track how long it takes a user to perform a test — which offers a measure of efficiency. That metric can suggest that a task is too complicated and, aggregated over time, can be used to find and eliminate user experience bottlenecks in the product as well as the documentation.
Efficiency bottlenecks should lead to interface re-designs, and this was our other surprise. Test script generation allows us to conduct A/B testing. We can easily test two or more versions of a procedure and its Task documentation comparatively, and report back on the failure/success rate, as well as time taken to complete the task. Both the software and the documentation are under scrutiny. We can discover flaws in the UI, but we can also determine when the documentation is too verbose, too sparse, or just inaccurate.
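The comparison itself is simple arithmetic. The sketch below summarizes two variants of a procedure by success rate and by mean completion time of the successful runs. The run data is invented purely for illustration, and the function and field names are hypothetical; with so few runs, any difference would only be a hint, not a conclusion.

```python
from statistics import mean


def summarize(runs):
    """Summarize test runs given as (passed, seconds) pairs."""
    if not runs:
        return {"runs": 0, "success_rate": None, "mean_seconds_passed": None}
    passed_times = [seconds for passed, seconds in runs if passed]
    return {
        "runs": len(runs),
        "success_rate": len(passed_times) / len(runs),
        "mean_seconds_passed": mean(passed_times) if passed_times else None,
    }


# Invented sample data: (did the user reach the expected result?, seconds taken)
variant_a = [(True, 40.0), (True, 60.0), (False, 90.0), (True, 50.0)]
variant_b = [(True, 30.0), (True, 34.0), (True, 32.0), (False, 75.0)]

report = {"A": summarize(variant_a), "B": summarize(variant_b)}
for name, summary in report.items():
    print(name, summary)
```

In this made-up sample both variants succeed equally often, but successful users finish variant B faster, which is the kind of signal that would prompt a closer look at the interface or the wording of variant A.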
Remember the three principles of continuous integration
Whether or not you take your document integration as far as we have, it’s critical to move in the direction of continuous integration. The three principles are:
- Writing discrete topics that map to discrete areas of the product
- Applying changes frequently and testing regularly
- Fixing, automating and tracking
To which you might add the “principle” of recursion: Do It Again.
Well-known industry studies have shown that, within the development team alone, finding and fixing issues early in the development process saves significant time and money. These methods improve communication and build momentum while eliminating costly bottlenecks in the process. These effects also apply to writing.
When both documentation and development teams are working in the same paradigm, effort and communication can be optimized between the two groups. Feedback from the development process is relayed into the documentation process, and feedback from the documentation process is relayed into the development process, improving quality and release management for both.
In 2005 Casey co-founded Jorsek LLC, with the goal of improving how the world creates and exchanges content. Jorsek developed and brought to market a new component content management system focused around DITA, an open standard for authoring, organizing, and delivering high-value information.
Casey played a key role in developing the business model and architecture for Jorsek's flagship product, an advanced web-based XML content management system called easyDITA.