of libxml to detect ignorable blanks. Don't complain if it breaks or makes your application not 100% clean w.r.t. its input.
  • the Right Way: change your code to accept possibly insignificant blank characters, or have your tree populated with weird blank text nodes. You can spot them using the convenience function xmlIsBlankNode(node), which returns 1 for such blank nodes.
  • Note also that with the new default the output functions don't add any extra indentation when saving a tree in order to be able to round trip (read and save) without inflating the document with extra formatting chars.

  • The include path has changed to $prefix/libxml/ and the includes themselves use this new prefix in their include instructions. If you are using (as expected) the
    xml2-config --cflags

    output to generate your compile commands, this will probably work out of the box.

  • xmlDetectCharEncoding takes an extra argument indicating the length in bytes of the head of the document available for character detection.
  • Ensuring both libxml-1.x and libxml-2.x compatibility

    Two new versions of libxml (1.8.11) and libxml2 (2.3.4) have been released to allow a smooth upgrade of existing libxml v1 code while retaining compatibility. They offer the following:

    1. similar include naming: one should use #include <libxml/...> in both cases.
    2. similar identifiers defined via macros for the child and root fields: respectively xmlChildrenNode and xmlRootNode
    3. a new macro LIBXML_TEST_VERSION which should be inserted once in the client code

    So the roadmap to upgrade your existing libxml applications is the following:

    1. install the libxml-1.8.8 (and libxml-devel-1.8.8) packages
    2. find all occurrences where the xmlDoc root field is used and change it to xmlRootNode
    3. similarly find all occurrences where the xmlNode childs field is used and change it to xmlChildrenNode
    4. add a LIBXML_TEST_VERSION macro somewhere in your main() or in the library init entry point
    5. Recompile and check compatibility; it should still work
    6. Change your configure script to look first for xml2-config and fall back to xml-config. Use the --cflags and --libs output of the command as the Include and Linking parameters needed to use libxml.
    7. install libxml2-2.3.x and libxml2-devel-2.3.x (libxml-1.8.y and libxml-devel-1.8.y can be kept simultaneously)
    8. remove your config.cache, relaunch your configuration mechanism, and recompile; if steps 2 and 3 were done right it should compile as-is
    9. Test that your application is still running correctly. If not, this may be due to extra empty nodes caused by formatting spaces being kept in libxml2, contrary to libxml1. In that case insert xmlKeepBlanksDefault(0) in your code before calling the parser (next to LIBXML_TEST_VERSION is a fine place).

    Following those steps should work. It worked for some of my own code.

    Let me put some emphasis on the fact that there are far more changes from libxml 1.x to 2.x than the ones you may have to patch for. The overall code has been considerably cleaned up and the conformance to the XML specification has been drastically improved too. Don't take those changes as an excuse to not upgrade; it may cost a lot in the long term ...

    Thread safety

    Starting with 2.4.7, libxml2 makes provisions to ensure that concurrent threads can safely work in parallel parsing different documents. There are, however, a couple of things to do to ensure it:

    Note that thread safety cannot be ensured for multiple threads sharing the same document; the locking must be done at the application level. libxml exports a basic mutex and reentrant mutex API in <libxml/threads.h>. The parts of the library checked for thread safety are:

    XPath is supposed to be thread safe now, but this wasn't tested seriously.

    DOM Principles

    DOM stands for the Document Object Model; this is an API for accessing XML or HTML structured documents. Native support for DOM in Gnome is on the way (module gnome-dom), and will be based on gnome-xml. This will be a far cleaner interface to manipulate XML files within Gnome since it won't expose the internal structure.

    The current DOM implementation on top of libxml2 is the gdome2 Gnome module. This is a full DOM interface, thanks to Paolo Casarini; check the Gdome2 homepage for more information.

    A real example

    Here is a real size example, where the actual content of the application data is not kept in the DOM tree but uses internal structures. It is based on a proposal to keep a database of jobs related to Gnome, with an XML based storage structure. Here is an XML encoded jobs base:

    <?xml version="1.0"?>
    <gjob:Helping xmlns:gjob="http://www.gnome.org/some-location">
      <gjob:Jobs>
    
        <gjob:Job>
          <gjob:Project ID="3"/>
          <gjob:Application>GBackup</gjob:Application>
          <gjob:Category>Development</gjob:Category>
    
          <gjob:Update>
            <gjob:Status>Open</gjob:Status>