The XML C parser and toolkit of Gnome

Upgrading 1.x code

Incompatible changes:

Version 2 of libxml2 is the first version to introduce serious backward-incompatible changes. The main goals were:

  • A general cleanup. A number of mistakes inherited from the very early versions could not be changed because of compatibility constraints. An example is the "childs" element in the nodes.
  • Uniformization of the various nodes, at least for their header and link parts (doc, parent, children, prev, next). The goal is a simpler programming model and an easier task for DOM implementors.
  • Better conformance to the XML specification. For example, version 1.x had a heuristic to try to detect ignorable white space. As a result, the SAX events generated were ignorableWhitespace() while the spec requires characters() in that case. This also means that a number of DOM nodes containing blank text, which were not present before, may now populate the DOM tree.

How to fix libxml-1.x code:

Client code of libxml designed to run with version 1.x may therefore have to be changed to compile against version 2.x of libxml. Here is a list of changes that I have collected. They may not be sufficient, so if you find other changes that are required, drop me a mail:

  1. The package name has changed from libxml to libxml2, and the library name is now -lxml2. There is a new xml2-config script which should be used to select the right parameters for libxml2.
  2. The node childs field has been renamed children, so s/childs/children/g should be applied (the probability of having "childs" anywhere else is close to 0).
  3. The document no longer has a root element; it has been replaced by children, and usually you will get a list of elements here. For example, a Dtd element for the internal subset and its declaration may be found in that list, as well as processing instructions or comments found before or after the document root element. Use xmlDocGetRootElement(doc) to get the root element of a document. Alternatively, if you are sure not to reference DTDs nor have PIs or comments before or after the root element, s/->root/->children/g will probably do it.
  4. The white space issue. This one is more complex: except in the special case of validating parsing, the line breaks and spaces usually used for indenting and formatting the document content become significant. So they are reported by SAX, and if you are using the DOM tree, corresponding nodes are generated. Two approaches can be taken:
    1. The lazy one: use the compatibility call xmlKeepBlanksDefault(0), but be aware that you are relying on a special (and possibly broken) set of heuristics of libxml to detect ignorable blanks. Don't complain if it breaks or makes your application not 100% clean w.r.t. its input.
    2. The Right Way: change your code to accept possibly insignificant blank characters, or have your tree populated with weird blank text nodes. You can spot them using the commodity function xmlIsBlankNode(node), which returns 1 for such blank nodes.

    Note also that with the new default the output functions don't add any extra indentation when saving a tree, in order to be able to round-trip (read and save) without inflating the document with extra formatting chars.

  5. The include path has changed to $prefix/libxml/ and the includes themselves use this new prefix in their include instructions. If you are using (as expected) the
    xml2-config --cflags

    output to generate your compile commands, this will probably work out of the box.

  6. xmlDetectCharEncoding takes an extra argument indicating the length in bytes of the head of the document available for character detection.

Ensuring both libxml-1.x and libxml-2.x compatibility

Two new versions, libxml (1.8.11) and libxml2 (2.3.4), have been released to allow a smooth upgrade of existing libxml v1 code while retaining compatibility. They offer the following:

  1. similar include naming, one should use #include <libxml/...> in both cases.
  2. similar identifiers defined via macros for the child and root fields: respectively xmlChildrenNode and xmlRootNode
  3. a new macro LIBXML_TEST_VERSION which should be inserted once in the client code

So the roadmap to upgrade your existing libxml applications is the following:

  1. install the libxml-1.8.8 (and libxml-devel-1.8.8) packages
  2. find all occurrences where the xmlDoc root field is used and change it to xmlRootNode
  3. similarly find all occurrences where the xmlNode childs field is used and change it to xmlChildrenNode
  4. add a LIBXML_TEST_VERSION macro somewhere in your main() or in the library init entry point
  5. Recompile, check compatibility, it should still work
  6. Change your configure script to look first for xml2-config and fall back to xml-config. Use the --cflags and --libs output of the command as the include and linking parameters needed to use libxml.
  7. install libxml2-2.3.x and libxml2-devel-2.3.x (libxml-1.8.y and libxml-devel-1.8.y can be kept simultaneously)
  8. remove your config.cache, relaunch your configuration mechanism, and recompile. If steps 2 and 3 were done right, it should compile as-is
  9. Test that your application is still running correctly. If not, this may be due to extra empty nodes caused by formatting spaces being kept in libxml2, contrary to libxml1. In that case insert xmlKeepBlanksDefault(0) in your code before calling the parser (next to LIBXML_TEST_VERSION is a fine place).

Following those steps should work. It worked for some of my own code.

Let me put some emphasis on the fact that there are far more changes from libxml 1.x to 2.x than the ones you may have to patch for. The overall code has been considerably cleaned up and the conformance to the XML specification has been drastically improved too. Don't take those changes as an excuse not to upgrade; it may cost a lot in the long term ...

Daniel Veillard