Overview of DITA and XML

The Darwin Information Typing Architecture (DITA) is a framework for authoring and publishing documentation using Extensible Markup Language (XML). XML is a markup language in which tags describe content; DITA defines the specific tags used to create links to websites, bold or italicize text, or create lists. A DITA document is built from several types of files: a ditamap plus concept, task, and reference files, which together hold the code and content that produce the document. These files are then transformed, or published, to create output in formats such as XHTML or PDF.

A brief explanation of each file type:

File Type | Function
Ditamap | Includes the document’s title, builds the document’s structure, and contains links to the concept, task, and reference files that make up the complete document.
Concept | The document’s overview, purpose, or description.
Task | Contains the procedure for completing a task. A document can have one or several task files, and each task file should contain the steps for one task.
Reference | Usually contains the document’s revision history or supporting information such as a bibliography to cite sources.
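
To make the roles of these file types concrete, here is a minimal sketch of a ditamap that pulls one concept file, two task files, and one reference file into a single document. The title and every file name are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<!-- Hypothetical map for an illustrative manual; all file names are assumptions. -->
<map>
  <title>Example Editor User Guide</title>
  <!-- Concept file: the overview of the document -->
  <topicref href="overview.dita" type="concept"/>
  <!-- Task files: one procedure per file -->
  <topicref href="open_file.dita" type="task"/>
  <topicref href="save_file.dita" type="task"/>
  <!-- Reference file: the revision history -->
  <topicref href="revision_history.dita" type="reference"/>
</map>
```

The order of the topicref elements is the order in which the files appear in the published output.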

Single sourcing and content reuse are two advantages of DITA. With single sourcing, a writer creates the code and content once and then transforms it into XHTML, PDF, or a Linux man page. Content reuse saves development time because task files that are already written and edited can be reused by adding a link in another document’s ditamap. Another benefit is that when content is revised, it needs to be changed in only one place.

For example, a task file that contains a procedure about how to save a file is used in the ditamaps of several software manuals. The developer updates the software and changes how a file is saved. The technical writer updates the task file with the new procedure. The ditamap for each manual is then transformed again, and every manual contains the updated procedure.
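
A shared task file of this kind might look like the sketch below. The id, the file name save_file.dita, and the menu labels are assumptions made for illustration; they do not come from a real manual.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
<!-- Hypothetical shared task, stored once as save_file.dita -->
<task id="save_file">
  <title>Saving a file</title>
  <shortdesc>Save your changes to disk.</shortdesc>
  <taskbody>
    <steps>
      <!-- Each step holds one command for the user to carry out -->
      <step><cmd>Click <uicontrol>File</uicontrol>.</cmd></step>
      <step><cmd>Click <uicontrol>Save</uicontrol>.</cmd></step>
    </steps>
  </taskbody>
</task>
```

Each manual's ditamap would then point at the same file with a line such as <topicref href="save_file.dita" type="task"/>, so editing the steps here changes every manual the next time it is transformed.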

If you are familiar with the Linux command line or with text editors, you should feel comfortable using an application such as the Oxygen XML Editor. The XML files are text based, but the application is context sensitive. The advantage is that when you begin typing an XML tag, the application displays a menu from which you can select the tag, which inserts the tag’s structure into your file. If the tag cannot be used in the section of code you are working in, error messages and visual cues alert you to the problem.

Some tags produce output that looks exactly the same. For example, by default the tags <b> and <uicontrol> both output bold text. The difference is that <b> specifies a format, while <uicontrol> describes an interface object that a user controls, such as a button. The difference matters if your company decides to change <uicontrol> to output red italic text rather than bold. If you used <uicontrol> just to display bold text, then the next time your document is transformed, that text will be red italic instead of bold. To get the desired bold output, the code needs to be changed, and that could be a monumental task if hundreds of files in your repository use <uicontrol> to create bold text.
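
The fragment below sketches the distinction inside a single task step, using the DITA bold element <b> for formatting and <uicontrol> for an interface control. The button name and the wording are illustrative assumptions.

```xml
<!-- Fragment of a task file; the button name and wording are illustrative. -->
<steps>
  <step>
    <!-- Semantic markup: Save is a control in the user interface -->
    <cmd>Click <uicontrol>Save</uicontrol>.</cmd>
    <!-- Formatting markup: bold is used purely for emphasis -->
    <info>Saving <b>overwrites</b> the previous version of the file.</info>
  </step>
</steps>
```

If the stylesheet for <uicontrol> later changes to red italic, only the word Save changes appearance; the emphasized word stays bold because it was tagged for its format rather than its meaning.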

A search in a bookstore will yield a list of books on DITA. One book that I have read, and that does not run to 1,000 pages:

DITA Best Practices: A Roadmap for Writing, Editing, and Architecting in DITA
Authors: Laura Bellamy, Michelle Carey, Jenifer Schlotfeldt
ISBN: 9780132480529

DITA teaches writers to consider content reuse as they create the task files that make up a document. Using the correct tags is essential, even when multiple tags produce similar output. The goal with DITA is to build a repository of content that can be reused to create new documents, with robust code that does not require updates even when changes are implemented.