The XML C parser and toolkit of Gnome

import libxml2, sys
+  LibxmlJ is
+    an effort to create a 100% JAXP-compatible Java wrapper for libxml2 and
+    libxslt as part of GNU ClasspathX project.
+  Patrick McPhee provides Rexx bindings for libxml2 and libxslt, look for
+    RexxXML.
+  Satimage
+    provides XMLLib
+    osax. This is an osax for Mac OS X with a set of commands to
+    implement in AppleScript the XML DOM, XPATH and XSLT. Also includes
+    commands for Property-lists (Apple's fast lookup table XML format.)
+  Francesco Montorsi developed wxXml2
+    wrappers that interface libxml2, allowing wxWidgets applications to
+    load/save/edit XML instances.
+The distribution includes a set of Python bindings, which are guaranteed
+to be maintained as part of the library in the future, though the Python
+interface has not yet reached the completeness of the C API.
Note that some Python purists dislike the default set of Python
+bindings; rather than complaining, I suggest they have a look at lxml, the more Pythonic bindings for libxml2
+and libxslt, and help Martijn
+Faassen complete those.
Stéphane Bidoul
+maintains a Windows port
+of the Python bindings.
Note to people interested in building bindings, the API is formalized as
+an XML API description file which makes it possible to
+automate a large part of the Python bindings; this includes function
+descriptions, enums, structures, typedefs, etc. The Python script used to
+build the bindings is python/generator.py in the source distribution.
To install the Python bindings there are 2 options:
If you use an RPM based distribution, simply install the libxml2-python
+    RPM (and if needed the libxslt-python
+    RPM).
+  Otherwise use the libxml2-python
+    module distribution corresponding to your installed version of
+    libxml2 and libxslt. Note that to install it you need both libxml2
+    and libxslt installed, and you must run "python setup.py build install" in the
+    module tree.
+
The distribution includes a set of examples and regression tests for the
+Python bindings in the python/tests directory. Here are some
+excerpts from those tests:
tst.py:
This is a basic test of the file interface and DOM navigation:
import libxml2, sys
 
 doc = libxml2.parseFile("tst.xml")
 if doc.name != "tst.xml":
@@ -80,25 +87,24 @@ child = root.children
 if child.name != "foo":
     print "child.name failed"
     sys.exit(1)
-doc.freeDoc()
The Python module is called libxml2; parseFile is the
-equivalentofxmlParseFile (most of the bindings are automatically generated,
-and thexmlprefix is removed and the casing convention are kept). All node
-seen atthebinding level share the same subset of accessors:
name: returns the node name
-  type: returns a string indicating the node type
-  content: returns the content of the node, it is
-    basedonxmlNodeGetContent() and hence is recursive.
-  parent,
-    children,last,next,
-    prev,doc,properties: pointing to
-    the associatedelement in the tree,those may return None in case no such
-    linkexists.
-
Also note the need to explicitly deallocate documents with
-freeDoc().Reference counting for libxml2 trees would need quite a lot of
-worktofunction properly, and rather than risk memory leaks if
-notimplementedcorrectly it sounds safer to have an explicit function to free
-atree. Thewrapper python objects like doc, root or child are
-themautomatically garbagecollected.
validate.py:
This test check the validation interfaces and redirection
-oferrormessages:
import libxml2
+doc.freeDoc()
The Python module is called libxml2; parseFile is the equivalent of
+xmlParseFile (most of the bindings are automatically generated; the xml
+prefix is removed and the casing convention is kept). All nodes seen at the
+binding level share the same subset of accessors:
name : returns the node name
+  type : returns a string indicating the node type
+  content : returns the content of the node, it is based on
+    xmlNodeGetContent() and hence is recursive.
+  parent, children, last,
+    next, prev, doc,
+    properties: point to the associated element in the tree;
+    these may return None if no such link exists.
+
Also note the need to explicitly deallocate documents with freeDoc().
+Reference counting for libxml2 trees would need quite a lot of work to
+function properly, and rather than risk memory leaks if not implemented
+correctly it sounds safer to have an explicit function to free a tree. The
+wrapper Python objects like doc, root or child are then automatically garbage
+collected.
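The explicit-free rule is easy to break when an exception interrupts the code between parsing and freeDoc(). Below is a minimal sketch of one way to guarantee the call, written in Python 3 syntax rather than the Python 2 of the excerpts. The helper name freeing and the stand-in class FakeDoc are hypothetical and not part of the bindings; only the freeDoc() method name comes from the text above.

```python
from contextlib import contextmanager


@contextmanager
def freeing(doc):
    """Yield doc, then call doc.freeDoc() even if the body raises."""
    try:
        yield doc
    finally:
        doc.freeDoc()


class FakeDoc:
    """Hypothetical stand-in for a parsed document, so the sketch runs
    without libxml2 installed. It only records whether it was freed."""

    def __init__(self, name):
        self.name = name
        self.freed = False

    def freeDoc(self):
        self.freed = True


fake = FakeDoc("tst.xml")
with freeing(fake) as doc:
    seen_name = doc.name  # use the tree only inside the with block
```

With the real bindings the same helper would presumably wrap the result of libxml2.parseFile("tst.xml"), so that the tree is released when the block exits.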
validate.py:
This test checks the validation interfaces and redirection of error
+messages:
import libxml2
 
 #deactivate error messages from the validation
 def noerr(ctx, str):
@@ -113,29 +119,27 @@ doc = ctxt.doc()
 valid = ctxt.isValid()
 doc.freeDoc()
 if valid != 0:
-    print "validity check failed"
The first thing to notice is the call to registerErrorHandler(),
-itdefinesa new error handler global to the library. It is used to avoid
-seeingtheerror messages when trying to validate the invalid document.
The main interest of that test is the creation of a parser
-contextwithcreateFileParserCtxt() and how the behaviour can be changed
-beforecallingparseDocument() . Similarly the informations resulting from
-theparsing phaseare also available using context methods.
Contexts like nodes are defined as class and the libxml2 wrappers mapstheC
-function interfaces in terms of objects method as much as possible.Thebest to
-get a complete view of what methods are supported is to look atthelibxml2.py
-module containing all the wrappers.
push.py:
This test show how to activate the push parser interface:
import libxml2
+    print "validity check failed"
The first thing to notice is the call to registerErrorHandler(); it
+defines a new error handler global to the library. It is used to avoid seeing
+the error messages when trying to validate the invalid document.
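The handler in the excerpt discards messages. A variant that records them instead might look like the sketch below (Python 3 syntax). It assumes registerErrorHandler() takes the handler plus a context value that is handed back as the handler's first argument, as the noerr(ctx, str) signature suggests. The names collect and errors are hypothetical, and the import is guarded so the sketch also runs where the bindings are not installed.

```python
try:
    import libxml2  # optional: the sketch still runs without the bindings
except ImportError:
    libxml2 = None

errors = []


def collect(ctx, msg):
    """Error handler: append each message to the list passed as ctx."""
    ctx.append(msg)


if libxml2 is not None:
    # Global to the library, like noerr in validate.py; the second
    # argument is assumed to come back as ctx on every call.
    libxml2.registerErrorHandler(collect, errors)

# Simulate one callback by hand to show the calling convention.
collect(errors, "example message\n")
```

Keeping the messages in a list lets a test assert on them after validation rather than silencing them entirely.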
The main interest of that test is the creation of a parser context with
+createFileParserCtxt() and how the behaviour can be changed before calling
+parseDocument(). Similarly, the information resulting from the parsing phase
+is also available using context methods.
Contexts, like nodes, are defined as classes, and the libxml2 wrappers map the
+C function interfaces to object methods as much as possible. The
+best way to get a complete view of which methods are supported is to look at the
+libxml2.py module containing all the wrappers.
push.py:
This test shows how to activate the push parser interface:
import libxml2
 
 ctxt = libxml2.createPushParser(None, "<foo", 4, "test.xml")
 ctxt.parseChunk("/>", 2, 1)
 doc = ctxt.doc()
 
-doc.freeDoc()
The context is created with a special call based
-onthexmlCreatePushParser() from the C library. The first argument is
-anoptionalSAX callback object, then the initial set of data, the length and
-thename ofthe resource in case URI-References need to be computed by
-theparser.
Then the data are pushed using the parseChunk() method, the
-lastcallsetting the third argument terminate to 1.
pushSAX.py:
this test show the use of the event based parsing interfaces. In
-thiscasethe parser does not build a document, but provides callback
-informationasthe parser makes progresses analyzing the data being
-provided:
import libxml2
+doc.freeDoc()
The context is created with a special call based on the
+xmlCreatePushParser() from the C library. The first argument is an optional
+SAX callback object, then the initial set of data, the length and the name of
+the resource in case URI-References need to be computed by the parser.
Then the data are pushed using the parseChunk() method, with the last call
+setting the third argument, terminate, to 1.
pushSAX.py:
This test shows the use of the event-based parsing interfaces. In this case
+the parser does not build a document, but provides callback information as
+the parser makes progress analyzing the data being provided:
import libxml2
 log = ""
 
 class callback:
@@ -183,16 +187,15 @@ reference = "startDocument:startElement foo {'url': 'tst'}:" + \
             "characters: bar:endElement foo:endDocument:"
 if log != reference:
     print "Error got: %s" % log
-    print "Expected: %s" % reference
The key object in that test is the handler, it provides a number
-ofentrypoints which can be called by the parser as it makes progresses
-toindicatethe information set obtained. The full set of callback is larger
-thanwhatthe callback class in that specific example implements (see
-theSAXdefinition for a complete list). The wrapper will only call those
-suppliedbythe object when activated. The startElement receives the names of
-theelementand a dictionary containing the attributes carried by this
-element.
Also note that the reference string generated from the callback
-showsasingle character call even though the string "bar" is passed to
-theparserfrom 2 different call to parseChunk()
xpath.py:
This is a basic test of XPath wrappers support
import libxml2
+    print "Expected: %s" % reference
The key object in that test is the handler; it provides a number of entry
+points which can be called by the parser as it makes progress to indicate
+the information set obtained. The full set of callbacks is larger than what
+the callback class in that specific example implements (see the SAX
+definition for a complete list). The wrapper will only call those supplied by
+the object when activated. The startElement callback receives the name of the element
+and a dictionary containing the attributes carried by this element.
Also note that the reference string generated from the callback shows a
+single character call even though the string "bar" is passed to the parser
+from 2 different calls to parseChunk().
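To make the shape of such a handler concrete, here is a minimal sketch in Python 3 syntax. The class name LoggingHandler is hypothetical, and it keeps the log on the instance instead of in a module-level variable. The method names are the ones visible in the reference string above. No parser is involved: the callbacks are driven by hand in the order a parser would be expected to issue them for an element foo with one attribute and the text "bar".

```python
class LoggingHandler:
    """Hypothetical handler: records SAX-style events in self.log."""

    def __init__(self):
        self.log = ""

    def startDocument(self):
        self.log += "startDocument:"

    def endDocument(self):
        self.log += "endDocument:"

    def startElement(self, tag, attrs):
        # attrs is a dictionary of the attributes carried by the element
        self.log += "startElement %s %s:" % (tag, attrs)

    def endElement(self, tag):
        self.log += "endElement %s:" % tag

    def characters(self, data):
        self.log += "characters: %s:" % data


# Drive the handler by hand; a real run would let the parser do this.
handler = LoggingHandler()
handler.startDocument()
handler.startElement("foo", {"url": "tst"})
handler.characters("bar")
handler.endElement("foo")
handler.endDocument()
```

With the bindings, an instance of such a class would be passed as the first argument of createPushParser() in place of None.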
xpath.py:
This is a basic test of XPath wrappers support
import libxml2
 
 doc = libxml2.parseFile("tst.xml")
 ctxt = doc.xpathNewContext()
@@ -204,15 +207,14 @@ if res[0].name != "doc" or res[1].name != "foo":
     print "xpath query: wrong node set value"
     sys.exit(1)
 doc.freeDoc()
-ctxt.xpathFreeContext()
This test parses a file, then create an XPath context to
-evaluateXPathexpression on it. The xpathEval() method execute an XPath query
-andreturnsthe result mapped in a Python way. String and numbers are
-nativelyconverted,and node sets are returned as a tuple of libxml2 Python
-nodeswrappers. Likethe document, the XPath context need to be freed
-explicitly,also not thatthe result of the XPath query may point back to the
-documenttree and hencethe document must be freed after the result of the
-query isused.
xpathext.py:
This test shows how to extend the XPath engine with functions
-writteninpython:
import libxml2
+ctxt.xpathFreeContext()
This test parses a file, then creates an XPath context to evaluate an XPath
+expression on it. The xpathEval() method executes an XPath query and returns
+the result mapped in a Python way. Strings and numbers are natively converted,
+and node sets are returned as a tuple of libxml2 Python node wrappers. Like
+the document, the XPath context needs to be freed explicitly. Also note that
+the result of the XPath query may point back to the document tree, and hence
+the document must be freed after the result of the query is used.
xpathext.py:
This test shows how to extend the XPath engine with functions written in
+Python:
import libxml2
 
 def foo(ctx, x):
     return x + 1
@@ -224,10 +226,9 @@ res = ctxt.xpathEval("foo(1)")
 if res != 2:
     print "xpath extension failure"
 doc.freeDoc()
-ctxt.xpathFreeContext()
Note how the extension function is registered with the context
-(butthatpart is not yet finalized, this may change slightly in the
-future).
tstxpath.py:
This test is similar to the previous one but shows how
-theextensionfunction can access the XPath evaluation context:
def foo(ctx, x):
+ctxt.xpathFreeContext()
Note how the extension function is registered with the context (but that
+part is not yet finalized, this may change slightly in the future).
tstxpath.py:
This test is similar to the previous one but shows how the extension
+function can access the XPath evaluation context:
def foo(ctx, x):
     global called
 
     #
@@ -236,16 +237,16 @@ theextensionfunction can access the XPath evaluation context:
def foo(ct
     pctxt = libxml2.xpathParserContext(_obj=ctx)
     ctxt = pctxt.context()
     called = ctxt.function()
-    return x + 1
All the interfaces around the XPath parser(or rather evaluation)contextare
-not finalized, but it should be sufficient to do contextual workat
-theevaluation point.
Memory debugging:
last but not least, all tests starts with the following prologue:
#memory debug specific
+    return x + 1
All the interfaces around the XPath parser (or rather evaluation) context
+are not finalized, but it should be sufficient to do contextual work at the
+evaluation point.
Memory debugging:
Last but not least, all tests start with the following prologue:
#memory debug specific
 libxml2.debugMemory(1)
and end with the following epilogue:
#memory debug specific
 libxml2.cleanupParser()
 if libxml2.debugMemory(1) == 0:
     print "OK"
 else:
     print "Memory leak %d bytes" % (libxml2.debugMemory(1))
-    libxml2.dumpMemory()
Those activate the memory debugging interface of libxml2 whereallallocated
-block in the library are tracked. The prologue then cleans upthelibrary state
-and checks that all allocated memory has been freed. If notitcalls
-dumpMemory() which saves that list in a .memdumpfile.
Daniel Veillard

Python and bindings
