Initial DocBook Tools

Jeremy Buhler, Karen Schneider, Tina Ji

Creating valid DocBook XML files: authoring tools

Any text editor that can save plain text files can be used as a DocBook authoring tool. Dedicated XML editors go further: they help you write error-free XML, validate it against a DTD or a schema, and keep you within a valid XML structure.

There are numerous XML editors available. Unless otherwise noted, the editors below have the following features:

  • Add closing tags to your opening tags automatically
  • Force you to write valid XML
  • Color code or indent your XML syntax
  • Verify your XML against DocBook v.5
  • Provide a transformation function
  • Provide multiple views of the document



Editix (The DocBook v.5 schema is not embedded. You may download it and link to it in order to validate against it.)

epcEdit XML/SGML (The DocBook v.5 schema is not embedded. You may download it and link to it in order to validate against it.)

Syntext Serna Free (The DocBook v.5 schema is not embedded. You may download it and link to it in order to validate against it.)

XML Mind Personal Edition (Transformation function is not available.)

For more DocBook authoring tools, see the DocBook Wiki.
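
If your editor does not validate against DocBook v.5, a command-line check is one possible fallback. The following is a minimal sketch, assuming xmllint (part of libxml2) is on your PATH and that you have downloaded a RELAX NG schema yourself. The function name and both paths are illustrative, not part of any Evergreen tooling.

```shell
#!/bin/sh
# Sketch: validate a DocBook 5 file against a locally downloaded RELAX NG schema.
# Assumptions: xmllint (libxml2) is installed; the caller supplies both paths.

validate_docbook() {
    schema=$1   # path to the RELAX NG schema you downloaded
    doc=$2      # path to the DocBook source file

    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found; install the libxml2 command-line tools" >&2
        return 127
    fi

    # --noout suppresses the echoed document; --relaxng selects the schema.
    xmllint --noout --relaxng "$schema" "$doc"
}
```

A call such as `validate_docbook path/to/docbook.rng index.xml` returns a non-zero status when the document fails validation, so it can be used as a gate before running a transform.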

Converting Word documents to DocBook 5 XML

Many Evergreen community documents (documentation, guides, training manuals, etc.) are currently in Microsoft Word format. Some of these documents are being used as the basis of Evergreen "core" documentation, which will be produced in DocBook 5 XML.

There are three options for converting Word to DocBook 5: using brute force (rekeying the documents), using a commercial conversion tool, or using OpenOffice to do a crude, one-way, partial Word-to-DocBook conversion.

The DocBook wiki also maintains a list of conversion tools.

Brute Force Method

This is sometimes the best way to convert very small documents, especially document types that conversion tools do not support, such as glossaries.

Using commercial conversion tools

(Needs description)

Converting Word to DocBook with OpenOffice

The OpenOffice method has a clear cost advantage (where labor is free or low-cost), but the resulting files will be in the DocBook 4 format and will be limited to a few tags and document types. The files will require extensive cleanup and reformatting. But it's a start.

Step 1: Convert text and layout

Open the source file (the Word document) in OpenOffice. The file can now be re-saved in different formats by clicking File > Save As, then selecting the desired file type from the Save as type dropdown. Save the file as type DocBook (.xml).

This creates a DocBook XML file with the text and general layout of the source. Again, this produces a DocBook 4 file that will still need extensive editing/conversion to meet Evergreen project requirements (DocBook 5, to start with), but at least it's XML.

If you don't need to convert images from the source document, you're done; otherwise, proceed to Step 2.

Step 2: Extract embedded images

Much like HTML, DocBook XML files do not have embedded images. Instead, a DocBook file "points" to separate image files to display. To render the images in DocBook, you must first extract them from the source DOC file.

Use File > Save As to re-save the source file as type HTML Document. This will create several new files: one HTML file and an image file for each picture embedded in the source document.

The last step is to link the extracted images to the DocBook XML file created in Step 1. Delete the HTML file, rename the image files (optional), then open the DocBook file in an XML or text editor and edit the <imagedata> tags to point to the corresponding image files.
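
After editing the <imagedata> tags, it can help to check that every referenced file actually exists. The sketch below is one way to do that, assuming the references use double-quoted fileref attributes and paths relative to the DocBook file. Both function names are hypothetical, and `grep -o` is a common extension rather than strict POSIX.

```shell
#!/bin/sh
# Sketch: list and check the image paths referenced by fileref="..." attributes.
# Assumptions: attributes are double-quoted; paths are relative to the document.

# Print every fileref value found in the given DocBook file, one per line.
list_image_refs() {
    grep -o 'fileref="[^"]*"' "$1" | sed 's/^fileref="//; s/"$//'
}

# Print only those references whose target file does not exist.
missing_images() {
    doc=$1
    dir=$(dirname "$doc")
    list_image_refs "$doc" | while IFS= read -r ref; do
        [ -e "$dir/$ref" ] || printf '%s\n' "$ref"
    done
}
```

Running `missing_images mydoc.xml` (a placeholder file name) prints nothing when every image reference resolves.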

Transforming: using XSL Stylesheets

The DocBook website provides an excellent collection of default XSL stylesheets. In theory, these stylesheets can be further tweaked and customized. As the Evergreen documentation project gets under way, more guidance will be available, but for now, the best advice is to stick with the default stylesheets. It will be challenging enough to get these to work the first few times.

Formatting output: using CSS

Transformed with the standard XSL stylesheets, DocBook XHTML is, well, ugly: unstyled HTML. DocBook is a markup language, not a style language. Most DocBook sites style their HTML with CSS (Cascading Style Sheets). Usually, the CSS file is named during the XSL transformation so that the generated pages link to it.

The Evergreen project currently does not have a set of CSS for DocBook.
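
For illustration only: the DocBook XSL stylesheets accept an html.stylesheet parameter that adds a stylesheet link to the generated pages, and xsltproc can pass it with --stringparam. The sketch below wraps that call; the function name and all file names are placeholders, since no Evergreen CSS exists yet.

```shell
#!/bin/sh
# Sketch: transform to XHTML and link a CSS file from the generated page.
# html.stylesheet is a DocBook XSL parameter; every path here is a placeholder.

xhtml_with_css() {
    xsl_dir=$1   # directory holding the DocBook XSL stylesheets
    css=$2       # CSS file name or URL to reference from the HTML
    src=$3       # DocBook source file
    out=$4       # XHTML file to write

    xsltproc --stringparam html.stylesheet "$css" \
        -o "$out" "$xsl_dir/xhtml/docbook.xsl" "$src"
}
```

The CSS file itself is not copied anywhere by the transform; it only needs to be reachable at the given path when the HTML is viewed.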

Setting up DocBook Transforms

If you are using an XML editor that does not have a document transformation function, you can set up the transformation process yourself by following this guide.

On Linux

Every Linux distribution seems to ship with different tools for transforming DocBook. It is relatively simple to set up a set of transforms to XHTML and PDF using the standard XSLT stylesheets and FO tools. This guide will get you up and running quickly.


We require just one binary package included in your distribution. Every distribution makes the libxslt processor, xsltproc, available in some package. Look for a package named libxslt and install it.
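
A quick way to confirm the prerequisite is in place is to check your PATH before going further. This is a small sketch with a hypothetical helper name. Checking for java is an added assumption, based on Apache FOP being a Java program.

```shell
#!/bin/sh
# Sketch: print the names of any requested tools that are not on PATH.

missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call (commented out so that sourcing this file has no side effects):
# missing_tools xsltproc java
```

If the function prints nothing, every tool named on the command line was found.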

Preparing the build tools

In this phase, we download, extract, and create symbolic links to the build tools. You can probably use more recent versions of the tools as they become available.

mkdir doctools
cd doctools

# Install the DocBook RelaxNG schema

# Install the DocBook XSL stylesheets
tar xjf docbook-xsl-1.73.2.tar.bz2
ln -sf docbook-xsl-1.73.2 docbook

# Install Apache FOP
tar xzf fop-0.94-bin-jdk1.4.tar.gz
ln -sf fop-0.94 fop

# Install hyphenation support for Apache FOP
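
One detail matters when you later move to newer tool versions: if the link name already exists as a symlink to a directory, `ln -sf` creates the new link inside that directory instead of replacing the link. The sketch below uses the `-n` flag, available in GNU and BSD ln, to avoid this. The helper name is hypothetical.

```shell
#!/bin/sh
# Sketch: repoint a version-independent symlink at a newly extracted release.
# The directory names are whatever your downloaded archives unpack to.

relink() {
    target=$1   # versioned directory, e.g. docbook-xsl-1.73.2
    link=$2     # stable name, e.g. docbook

    [ -d "$target" ] || { echo "no such directory: $target" >&2; return 1; }

    # -n treats an existing symlink-to-directory as the link itself,
    # so it is replaced rather than a new link being created inside it.
    ln -sfn "$target" "$link"
}
```

With this in place, upgrading is a matter of extracting the new archive and calling `relink` with the new directory name; scripts that refer to the stable name keep working.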

Generating the XHTML and PDF from a DocBook source file

The following is a simple script for generating XHTML and PDF from a DocBook source file. It assumes that your tools are installed in a subdirectory called doctools within your home directory:


# Generate XHTML
xsltproc $XSL/xhtml/docbook.xsl $DOC/index.xml > $DOC/index.html

# Generate PDF via FO
xsltproc $XSL/fo/docbook.xsl $DOC/index.xml > $DOC/
$FOP/fop $DOC/ -pdf $DOC/index.pdf -c $FOP/fop.xconf
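
The script above relies on variables and an intermediate file name that are not shown. The following sketch fills them in under stated assumptions: the tools sit under $HOME/doctools behind the symlinks created earlier, the intermediate file is called index.fo, and the FOP configuration is at $FOP/fop.xconf. Treat these as guesses to adapt, not as the original script. It uses xsltproc's -o option instead of shell redirection so that a dry run can print the full commands.

```shell
#!/bin/sh
# Sketch of a build function. Assumed, not taken from the original script:
#   - XSL and FOP default to the symlinks under $HOME/doctools
#   - the intermediate XSL-FO file is named index.fo
XSL=${XSL:-$HOME/doctools/docbook}
FOP=${FOP:-$HOME/doctools/fop}

build_docs() {
    doc_dir=$1   # directory containing index.xml

    # With DRY_RUN set to a non-empty value, print the commands instead of running them.
    run=${DRY_RUN:+echo}

    # Generate XHTML
    $run xsltproc -o "$doc_dir/index.html" "$XSL/xhtml/docbook.xsl" "$doc_dir/index.xml" || return 1

    # Generate PDF via FO
    $run xsltproc -o "$doc_dir/index.fo" "$XSL/fo/docbook.xsl" "$doc_dir/index.xml" || return 1
    $run "$FOP/fop" -c "$FOP/fop.xconf" -fo "$doc_dir/index.fo" -pdf "$doc_dir/index.pdf"
}
```

Setting `DRY_RUN=1` before calling `build_docs` with your document directory shows the three commands without touching any files, which is a convenient way to check the paths first.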

On Windows

points to the most popular free XML tools for Windows transforms. XMLMind and Eclipse are two Windows XML editors frequently mentioned on discussion lists.

Some XML processors that run on Windows, such as oXygen ($), automate all or part of the following, and for substantial editorial work in a Windows environment, investing in a serious tool may be worth your while. But the following will get you going with a free XML editing toolkit.

Installing a free Windows-based XSL validator and processor


  1. The latest libxml2 package
  2. The latest libxslt package

Note: The above packages can be downloaded from or Besides these two packages, you may also need their dependencies - iconv and zlib - also available from these sources. They may also need to be placed within your PATH as described below.

Then on your computer:

  1. Create a new directory, C:\XMLTOOLS\ to contain the DocBook tools.
  2. Extract the contents of both the libxml2 and libxslt zip files to C:\XMLTOOLS\
  3. Rename the sub-directories to C:\XMLTOOLS\libxml2 and C:\XMLTOOLS\libxslt for convenience.
  4. Open a terminal session and add the bin subdirectories of C:\XMLTOOLS\libxml2 and C:\XMLTOOLS\libxslt to your PATH environment variable:
    set PATH=%PATH%;C:\XMLTOOLS\libxml2\bin;C:\XMLTOOLS\libxslt\bin

    You can set this permanently in your computer's environment variable settings if you like. (From My Computer's properties, look for the Advanced tab, then the Environment Variables button.)

  5. Test that XSLT processing tools work:
    xsltproc --help
    1. If you get a list of options for passing to xsltproc, then everything is set up correctly.
    2. If you get an error dialog saying "The procedure entry point xmlXPathCompiledEvalToBoolean could not be located in the dynamic link library libxml2.dll" you probably have an old version of the library installed somewhere on your system. Try:
      rename C:\WINDOWS\system32\libxml2.dll libxml2.old

      (although note that this might break some other tool on your Windows system…)

  6. Download the DocBook XML schema package from
  7. Extract the contents of the zip file to C:\XMLTOOLS\. This creates a new directory called C:\XMLTOOLS\docbook-5.0
  8. Download the DocBook XSL Stylesheet distribution (docbook-xsl) from - click the "download" link beside the docbook-xsl package, then download the ZIP file.
  9. Extract the docbook-xsl ZIP file to C:\XMLTOOLS\. Rename the newly created directory to C:\XMLTOOLS\docbook-xsl for convenience.
  10. Using the sample XML files you downloaded, test a basic transform:
    xsltproc C:\XMLTOOLS\docbook-xsl\xhtml\chunk.xsl C:\XMLTOOLS\docbook-xsl\tests\refentry.007.ns.xml

    This should produce three HTML files in your current working directory. If it does, then your DocBook processing toolchain is set up to successfully produce XHTML.

evergreen-docs/dig_toolset.txt · Last modified: 2009/08/12 16:44 by tji
