Transforming DocBook XML into HTML, PDF, etc.
Producing DocBook XML is the first step in the documentation process. Producing web pages, PDF, text files, and other formats from DocBook requires an XSL processor to transform the files.
DocBook XSL Stylesheets
The DocBook website provides an excellent collection of default XSL stylesheets. In theory, these stylesheets can be further tweaked and customized. As the Evergreen documentation project gets under way, more guidance will be available, but for now, the best advice is to stick with the default stylesheets. It will be challenging enough to get these to work the first few times.
DocBook and Validation Tools
To successfully transform into HTML, PDF, and so forth, DocBook XML must be valid and well-formed. With XML, a miss is as good as a mile; one missing angle bracket will produce bad output. Furthermore, writers new to DocBook may misunderstand how tags are used or nested.
There are a number of good validation tools for DocBook. A standard freebie is xmllint, which runs on many platforms. DocBook-friendly editors such as XMLMind will offer some validation assistance. Higher-end tools such as oXygen have built-in validators that all but ensure valid files.
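As a rough sketch of what a command-line check with xmllint might look like, the function below first tests that a file is well-formed and then validates it against a RelaxNG schema. The function name check_docbook and the file names in the usage note are illustrative assumptions, not part of any Evergreen tooling; --noout and --relaxng are standard xmllint options.

```shell
# Hypothetical helper: check_docbook SCHEMA.rng DOCUMENT.xml
# Returns non-zero if xmllint is missing or if either check fails.
check_docbook() {
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found; install the libxml2 tools first" >&2
        return 127
    fi
    # Step 1: well-formedness only (parse the file, print nothing on success)
    xmllint --noout "$2" || return 1
    # Step 2: validity against the RelaxNG schema
    xmllint --noout --relaxng "$1" "$2"
}
```

You might call it as check_docbook docbook.rng index.xml (both file names assumed). Running the well-formedness check first keeps a stray angle bracket from being buried under schema errors.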
CSS and DocBook Output
Transformed with the standard XSL stylesheets, DocBook XHTML is, well, ugly: unstyled HTML. DocBook is a markup language, not a style language. Most DocBook sites style their HTML with CSS (Cascading Style Sheets). Usually, the CSS files are referenced during the XSL transformation process.
The Evergreen project currently does not have a set of CSS for DocBook.
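Until a project stylesheet exists, you would have to supply your own CSS file. As a minimal sketch, the DocBook XSL stylesheets accept an html.stylesheet parameter that links the generated pages to a CSS file. The function name styled_xhtml and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper: styled_xhtml STYLESHEET.xsl SOURCE.xml OUTPUT.html CSS_HREF
styled_xhtml() {
    # html.stylesheet is the DocBook XSL parameter that makes the output
    # reference CSS_HREF; --output writes the result to OUTPUT.html.
    xsltproc --stringparam html.stylesheet "$4" --output "$3" "$1" "$2"
}
```

For example, styled_xhtml docbook/xhtml/docbook.xsl index.xml index.html docbook.css, where docbook.css is a file you provide next to the generated HTML.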
Setting up DocBook transforms on Linux
Every Linux distribution seems to ship with different tools for transforming DocBook. It is relatively simple to set up a set of transforms to XHTML and PDF using the standard XSLT stylesheets and FO tools. This guide will get you up and running quickly.
We require just one binary package from your distribution. Every distribution makes the libxslt processor, xsltproc, available in some package. Look for a package named libxslt and install it.
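Once the package is installed, a quick check that the processor is on your PATH can save confusion later. The helper below is only a sketch, and the name have_tool is made up for illustration.

```shell
# Hypothetical helper: have_tool NAME
# Prints whether NAME can be found on the PATH; returns 1 if it cannot.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}
```

Running have_tool xsltproc should report that the tool is found before you move on to the next phase.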
Preparing the build tools
In this phase, we download, extract, and create symbolic links to the build tools. You can probably use more recent versions of the tools as they become available.
    mkdir doctools
    cd doctools

    # Install the DocBook RelaxNG schema
    wget http://www.docbook.org/xml/5.0CR5/rng/docbook.rng

    # Install the DocBook XSL stylesheets
    wget http://downloads.sourceforge.net/docbook/docbook-xsl-1.73.2.tar.bz2
    tar xjf docbook-xsl-1.73.2.tar.bz2
    ln -sf docbook-xsl-1.73.2 docbook

    # Install Apache FOP
    wget http://apache.sunsite.ualberta.ca/xmlgraphics/fop/fop-0.94-bin-jdk1.4.tar.gz
    tar xzf fop-0.94-bin-jdk1.4.tar.gz
    ln -sf fop-0.94 fop

    # Install hyphenation support for Apache FOP
    wget http://downloads.sourceforge.net/offo/offo-hyphenation.zip
    unzip offo-hyphenation.zip
Generating the XHTML and PDF from a DocBook source file
The following is a simple script for generating XHTML and PDF from a DocBook source file. It assumes that your tools are installed in a subdirectory called doctools within your home directory:
    FOP=~/doctools/fop
    XSL=~/doctools/docbook
    DOC=~/eg_manual

    # Generate XHTML
    xsltproc $XSL/xhtml/docbook.xsl $DOC/index.xml > $DOC/index.html

    # Generate PDF via FO
    xsltproc $XSL/fo/docbook.xsl $DOC/index.xml > $DOC/index.fo
    $FOP/fop $DOC/index.fo -pdf $DOC/index.pdf -c $FOP/fop.xconf
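One possible refinement, offered as a sketch and not as the project's build script, is to stop at the first failing step so that FOP never runs on a stale or empty FO file. The function name build_manual is hypothetical. It uses xsltproc's -o option to write output files and otherwise follows the same directory layout as the script above.

```shell
# Hypothetical helper: build_manual XSL_DIR FOP_DIR DOC_DIR
# Runs the XHTML, FO and PDF steps in order and stops at the first failure.
build_manual() {
    # XHTML: -o writes the result to a file instead of standard output
    xsltproc -o "$3/index.html" "$1/xhtml/docbook.xsl" "$3/index.xml" || return 1
    # FO: reached only if the XHTML step succeeded
    xsltproc -o "$3/index.fo" "$1/fo/docbook.xsl" "$3/index.xml" || return 1
    # PDF: reached only if the FO step succeeded
    "$2/fop" "$3/index.fo" -pdf "$3/index.pdf" -c "$2/fop.xconf"
}
```

With the layout assumed earlier, the call would be build_manual ~/doctools/docbook ~/doctools/fop ~/eg_manual.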
Setting up DocBook transforms on Windows
Sagehill.net points to the most popular free XML tools for Windows transforms. XMLMind and Eclipse are two Windows XML editors frequently mentioned on discussion lists.
Some XML processors that run on Windows, such as oXygen ($), automate all or part of the following, and for substantial editorial work in a Windows environment, investing in a serious tool may be worth your while. But the following will get you going with a free XML editing toolkit.
Installing a free Windows-based XSL validator and processor
- The latest libxml2 package
- The latest libxslt package
Note: The above packages can be downloaded from http://xmlsoft.org/sources/win32/ or ftp://ftp.zlatkovic.com/pub/libxml/. Besides these two packages, you may also need their dependencies - iconv and zlib - also available from these sources. They may also need to be placed within your PATH as described below.
Then on your computer:
- Create a new directory, C:\XMLTOOLS\, to contain the DocBook tools.
- Extract the contents of both the libxml2 and libxslt zip files to
- Rename the sub-directories to C:\XMLTOOLS\libxml2 and C:\XMLTOOLS\libxslt for convenience.
- Open a terminal session and add C:\XMLTOOLS\libxml2 and C:\XMLTOOLS\libxslt to your PATH environment variable:
You can set this permanently in your computer's environment variable settings if you like. (From My Computer's properties, look for the Advanced tab, then the Environment Variables button.)
- Test that the XSLT processing tools work by running xsltproc with no arguments.
- If you get a list of options for passing to xsltproc, then everything is set up correctly.
- If you get an error dialog saying "The procedure entry point xmlXPathCompiledEvalToBoolean could not be located in the dynamic link library libxml2.dll" you probably have an old version of the library installed somewhere on your system. Try:
rename C:\WINDOWS\system32\libxml2.dll libxml2.old
(although note that this might break some other tool on your Windows system…)
- Download the DocBook XML schema package from http://www.docbook.org/xml/5.0/docbook-5.0.zip
- Extract the contents of the zip file to C:\XMLTOOLS\. This creates a new directory called C:\XMLTOOLS\docbook-5.0.
- Download the DocBook XSL Stylesheet distribution (docbook-xsl) from http://sourceforge.net/project/showfiles.php?group_id=21935&package_id=219178 - click the "download" link beside the docbook-xsl package, then download the ZIP file.
- Extract the docbook-xsl ZIP file to C:\XMLTOOLS\. Rename the newly created directory to C:\XMLTOOLS\docbook-xsl for convenience.
- Using the sample XML files you downloaded, test a basic transform:
xsltproc C:\XMLTOOLS\docbook-xsl\xhtml\chunk.xsl C:\XMLTOOLS\docbook-xsl\tests\refentry.007.ns.xml
This should produce three HTML files in your current working directory. If it does, then your DocBook processing toolchain is set up to successfully produce XHTML.