The XML C parser and toolkit of Gnome

import libxml2, sys
+  LibxmlJ is an effort
+    to create a 100% JAXP-compatible Java wrapper for libxml2 and libxslt as
+    part of the GNU ClasspathX project.
+  Patrick McPhee provides Rexx bindings for libxml2 and libxslt,
+    look for RexxXML.
+  Satimage provides XMLLib osax. This
+    is an osax for Mac OS X with a set of commands to implement in AppleScript
+    the XML DOM, XPATH and XSLT. Also includes commands for Property-lists
+    (Apple's fast lookup table XML format.)
+  Francesco Montorsi developed wxXml2 wrappers that
+    interface libxml2, allowing wxWidgets applications to load/save/edit XML
+    instances.
+The distribution includes a set of Python bindings, which are
+guaranteed to be maintained as part of the library in the future, though
+the Python interface has not yet reached the completeness of the C API.
Note that some Python purists dislike the default set
+of Python bindings; rather than complaining, I suggest they have a look at lxml, the more pythonic bindings
+for libxml2 and libxslt, and help Martijn Faassen complete
+those.
Stéphane Bidoul maintains a Windows port of the Python
+bindings.
Note to people interested in building bindings: the API is formalized as an XML API description file, which allows
+automating a large part of the Python bindings. This includes
+function descriptions, enums, structures, typedefs, etc. The Python script
+used to build the bindings is python/generator.py in the source
+distribution.
To install the Python bindings there are 2 options:
If you use an RPM based distribution, simply install the libxml2-python RPM (and if
+    needed the libxslt-python RPM).
+  Otherwise use the libxml2-python module distribution corresponding
+    to your installed version of libxml2 and libxslt. Note that to install it
+    you will need both libxml2 and libxslt installed and run "python setup.py
+    build install" in the module tree.
+
The distribution includes a set of examples and regression tests
+for the Python bindings in the python/tests directory. Here
+are some excerpts from those tests:
tst.py:
This is a basic test of the file interface and DOM navigation:
import libxml2, sys
 
 doc = libxml2.parseFile("tst.xml")
 if doc.name != "tst.xml":
@@ -87,24 +80,25 @@ child = root.children
 if child.name != "foo":
     print "child.name failed"
     sys.exit(1)
-doc.freeDoc()
The Python module is called libxml2; parseFile is the equivalent of
-xmlParseFile (most of the bindings are automatically generated, and the xml
-prefix is removed and the casing convention is kept). All nodes seen at the
-binding level share the same subset of accessors:
name : returns the node name
-  type : returns a string indicating the node type
-  content : returns the content of the node, it is based on
-    xmlNodeGetContent() and hence is recursive.
-  parent , children, last,
-    next, prev, doc,
-    properties: pointing to the associated element in the tree,
-    those may return None in case no such link exists.
-
Also note the need to explicitly deallocate documents with freeDoc() .
-Reference counting for libxml2 trees would need quite a lot of work to
-function properly, and rather than risk memory leaks if not implemented
-correctly it sounds safer to have an explicit function to free a tree. The
-wrapper Python objects like doc, root or child are then automatically garbage
-collected.
validate.py:
This test checks the validation interfaces and redirection of error
-messages:
import libxml2
+doc.freeDoc()
The Python module is called libxml2; parseFile is the
+equivalent of xmlParseFile (most of the bindings are automatically generated,
+and the xml prefix is removed and the casing convention is kept). All nodes
+seen at the binding level share the same subset of accessors:
name: returns the node name
+  type: returns a string indicating the node type
+  content: returns the content of the node, it is
+    based on xmlNodeGetContent() and hence is recursive.
+  parent,
+    children, last, next,
+    prev, doc, properties: pointing to
+    the associated element in the tree, those may return None in case no such
+    link exists.
+
Also note the need to explicitly deallocate documents with
+freeDoc(). Reference counting for libxml2 trees would need quite a lot of
+work to function properly, and rather than risk memory leaks if
+not implemented correctly it sounds safer to have an explicit function to free
+a tree. The wrapper Python objects like doc, root or child are
+then automatically garbage collected.
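Because freeDoc() has to be called explicitly, one way to avoid forgetting it is to wrap the document in a context manager. The following is a minimal sketch in Python 3, not part of the bindings: the helper name freeing and the FakeDoc stand-in are hypothetical, and the only assumption about a real document object is that it exposes freeDoc() as described above.

```python
from contextlib import contextmanager


@contextmanager
def freeing(doc):
    """Yield doc, then call doc.freeDoc() even if the body raises."""
    try:
        yield doc
    finally:
        doc.freeDoc()


class FakeDoc:
    """Hypothetical stand-in for a parsed document; not part of libxml2."""

    def __init__(self, name):
        self.name = name
        self.freed = False

    def freeDoc(self):
        # A real document would release the underlying C tree here.
        self.freed = True


fake = FakeDoc("tst.xml")
with freeing(fake) as doc:
    seen_name = doc.name
```

With the real bindings the same helper would presumably be used as "with freeing(libxml2.parseFile(...)) as doc:", so that the tree is released on every exit path.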
validate.py:
This test checks the validation interfaces and redirection
+of error messages:
import libxml2
 
 #deactivate error messages from the validation
 def noerr(ctx, str):
@@ -119,27 +113,29 @@ doc = ctxt.doc()
 valid = ctxt.isValid()
 doc.freeDoc()
 if valid != 0:
-    print "validity check failed"
The first thing to notice is the call to registerErrorHandler(); it
-defines a new error handler global to the library. It is used to avoid seeing
-the error messages when trying to validate the invalid document.
The main interest of that test is the creation of a parser context with
-createFileParserCtxt() and how the behaviour can be changed before calling
-parseDocument(). Similarly the information resulting from the parsing phase
-is also available using context methods.
Contexts, like nodes, are defined as classes, and the libxml2 wrappers map the
-C function interfaces in terms of object methods as much as possible. The
-best way to get a complete view of what methods are supported is to look at the
-libxml2.py module containing all the wrappers.
push.py:
This test shows how to activate the push parser interface:
import libxml2
+    print "validity check failed"
The first thing to notice is the call to registerErrorHandler();
+it defines a new error handler global to the library. It is used to avoid
+seeing the error messages when trying to validate the invalid document.
The main interest of that test is the creation of a parser
+context with createFileParserCtxt() and how the behaviour can be changed
+before calling parseDocument(). Similarly the information resulting from
+the parsing phase is also available using context methods.
Contexts, like nodes, are defined as classes, and the libxml2 wrappers map the C
+function interfaces in terms of object methods as much as possible. The best way to
+get a complete view of what methods are supported is to look at the libxml2.py
+module containing all the wrappers.
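Since the wrappers are ordinary Python classes, their methods can also be listed by introspection instead of reading the module source. This is a small Python 3 sketch using only the standard inspect module; the helper public_methods and the DemoContext class are hypothetical illustrations, not part of libxml2.

```python
import inspect


def public_methods(cls):
    """Return the sorted names of the public methods defined on a class."""
    return sorted(
        name
        for name, member in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    )


class DemoContext:
    """Hypothetical wrapper class used only to exercise the helper."""

    def doc(self):
        return None

    def isValid(self):
        return 1

    def _internal(self):
        return None


methods = public_methods(DemoContext)
```

Assuming the bindings are installed, the same helper could be pointed at one of the classes in the libxml2 module to see which context or node methods are available.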
push.py:
This test shows how to activate the push parser interface:
import libxml2
 
 ctxt = libxml2.createPushParser(None, "<foo", 4, "test.xml")
 ctxt.parseChunk("/>", 2, 1)
 doc = ctxt.doc()
 
-doc.freeDoc()
The context is created with a special call based on the
-xmlCreatePushParser() from the C library. The first argument is an optional
-SAX callback object, then the initial set of data, the length and the name of
-the resource in case URI-References need to be computed by the parser.
Then the data are pushed using the parseChunk() method, the last call
-setting the third argument terminate to 1.
pushSAX.py:
This test shows the use of the event based parsing interfaces. In this case
-the parser does not build a document, but provides callback information as
-the parser makes progress analyzing the data being provided:
import libxml2
+doc.freeDoc()
The context is created with a special call based
+on the xmlCreatePushParser() from the C library. The first argument is
+an optional SAX callback object, then the initial set of data, the length and
+the name of the resource in case URI-References need to be computed by
+the parser.
Then the data are pushed using the parseChunk() method, the
+last call setting the third argument terminate to 1.
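For readers without the bindings installed, the same push pattern can be tried with the expat parser from the Python standard library. This is a stand-in technique, not the libxml2 API: expat's Parse(data, isfinal) plays roughly the role of parseChunk(data, length, terminate), and the started list is only there to observe the result.

```python
import xml.parsers.expat

# Names of the elements reported by the parser, in order.
started = []

parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = lambda name, attrs: started.append(name)

# Feed the document in two chunks, as push.py does with "<foo" and "/>".
parser.Parse("<foo", False)   # more data will follow
parser.Parse("/>", True)      # final chunk, comparable to terminate=1
```

The element is only reported once the second chunk completes the tag, which is the point of a push interface: the caller decides when data arrives and when the input ends.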
pushSAX.py:
This test shows the use of the event based parsing interfaces. In
+this case the parser does not build a document, but provides callback
+information as the parser makes progress analyzing the data being
+provided:
import libxml2
 log = ""
 
 class callback:
@@ -187,15 +183,16 @@ reference = "startDocument:startElement foo {'url': 'tst'}:" + \
             "characters: bar:endElement foo:endDocument:"
 if log != reference:
     print "Error got: %s" % log
-    print "Expected: %s" % reference
The key object in that test is the handler; it provides a number of entry
-points which can be called by the parser as it makes progress to indicate
-the information set obtained. The full set of callbacks is larger than what
-the callback class in that specific example implements (see the SAX
-definition for a complete list). The wrapper will only call those supplied by
-the object when activated. The startElement receives the name of the element
-and a dictionary containing the attributes carried by this element.
Also note that the reference string generated from the callback shows a
-single character call even though the string "bar" is passed to the parser
-from 2 different calls to parseChunk().
xpath.py:
This is a basic test of XPath wrappers support
import libxml2
+    print "Expected: %s" % reference
The key object in that test is the handler; it provides a number
+of entry points which can be called by the parser as it makes progress
+to indicate the information set obtained. The full set of callbacks is larger
+than what the callback class in that specific example implements (see
+the SAX definition for a complete list). The wrapper will only call those
+supplied by the object when activated. The startElement receives the name of
+the element and a dictionary containing the attributes carried by this
+element.
Also note that the reference string generated from the callback
+shows a single character call even though the string "bar" is passed to
+the parser from 2 different calls to parseChunk().
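The callback flow can be reproduced with the xml.sax module from the Python standard library, used here as a stand-in for the libxml2 SAX wrapper. This Python 3 sketch makes one assumption explicit: how many characters() calls a parser emits for "bar" is an implementation detail, so the Recorder class (a hypothetical name) collects the pieces and joins them instead of counting calls.

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Record structural events and collect character data separately."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.text = []

    def startDocument(self):
        self.events.append("startDocument")

    def endDocument(self):
        self.events.append("endDocument")

    def startElement(self, name, attrs):
        # Copy the attributes into a plain dictionary.
        attributes = {key: attrs.getValue(key) for key in attrs.getNames()}
        self.events.append(("startElement", name, attributes))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def characters(self, content):
        # May be called once or several times for the same text node.
        self.text.append(content)


handler = Recorder()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# The text "bar" is split across two chunks, as in pushSAX.py.
parser.feed("<foo url='tst'>b")
parser.feed("ar</foo>")
parser.close()

text = "".join(handler.text)
```

Only the callbacks defined on the handler record anything; the others fall back to the no-op defaults of ContentHandler, which is similar in spirit to the wrapper calling only the methods the object supplies.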
xpath.py:
This is a basic test of XPath wrappers support
import libxml2
 
 doc = libxml2.parseFile("tst.xml")
 ctxt = doc.xpathNewContext()
@@ -207,14 +204,15 @@ if res[0].name != "doc" or res[1].name != "foo":
     print "xpath query: wrong node set value"
     sys.exit(1)
 doc.freeDoc()
-ctxt.xpathFreeContext()
This test parses a file, then creates an XPath context to evaluate XPath
-expressions on it. The xpathEval() method executes an XPath query and returns
-the result mapped in a Python way. Strings and numbers are natively converted,
-and node sets are returned as a tuple of libxml2 Python node wrappers. Like
-the document, the XPath context needs to be freed explicitly. Also note that
-the result of the XPath query may point back to the document tree and hence
-the document must be freed after the result of the query is used.
xpathext.py:
This test shows how to extend the XPath engine with functions written in
-python:
import libxml2
+ctxt.xpathFreeContext()
This test parses a file, then creates an XPath context to
+evaluate XPath expressions on it. The xpathEval() method executes an XPath query
+and returns the result mapped in a Python way. Strings and numbers are
+natively converted, and node sets are returned as a tuple of libxml2 Python
+node wrappers. Like the document, the XPath context needs to be freed
+explicitly. Also note that the result of the XPath query may point back to the
+document tree and hence the document must be freed after the result of the
+query is used.
xpathext.py:
This test shows how to extend the XPath engine with functions
+written in Python:
import libxml2
 
 def foo(ctx, x):
     return x + 1
@@ -226,9 +224,10 @@ res = ctxt.xpathEval("foo(1)")
 if res != 2:
     print "xpath extension failure"
 doc.freeDoc()
-ctxt.xpathFreeContext()
Note how the extension function is registered with the context (but that
-part is not yet finalized, this may change slightly in the future).
tstxpath.py:
This test is similar to the previous one but shows how the extension
-function can access the XPath evaluation context:
def foo(ctx, x):
+ctxt.xpathFreeContext()
Note how the extension function is registered with the context
+(but that part is not yet finalized, this may change slightly in the
+future).
tstxpath.py:
This test is similar to the previous one but shows how
+the extension function can access the XPath evaluation context:
def foo(ctx, x):
     global called
 
     #
@@ -237,16 +236,16 @@ function can access the XPath evaluation context:
def foo(ctx, x):
     pctxt = libxml2.xpathParserContext(_obj=ctx)
     ctxt = pctxt.context()
     called = ctxt.function()
-    return x + 1
All the interfaces around the XPath parser (or rather evaluation) context
-are not finalized, but it should be sufficient to do contextual work at the
-evaluation point.
Memory debugging:
Last but not least, all tests start with the following prologue:
#memory debug specific
+    return x + 1
All the interfaces around the XPath parser (or rather evaluation) context are
+not finalized, but it should be sufficient to do contextual work at
+the evaluation point.
Memory debugging:
Last but not least, all tests start with the following prologue:
#memory debug specific
 libxml2.debugMemory(1)
and ends with the following epilogue:
#memory debug specific
 libxml2.cleanupParser()
 if libxml2.debugMemory(1) == 0:
     print "OK"
 else:
     print "Memory leak %d bytes" % (libxml2.debugMemory(1))
-    libxml2.dumpMemory()
Those activate the memory debugging interface of libxml2, where all
-allocated blocks in the library are tracked. The epilogue then cleans up the
-library state and checks that all allocated memory has been freed. If not, it
-calls dumpMemory(), which saves that list in a .memdump file.
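The prologue and epilogue can be bundled into one reusable wrapper so each test does not have to repeat them. The sketch below is Python 3 and is not part of the bindings: memory_checked and FakeLib are hypothetical names, and the only assumption is that the module passed in offers debugMemory(), cleanupParser() and dumpMemory() with the behaviour shown in the excerpts above.

```python
from contextlib import contextmanager


@contextmanager
def memory_checked(lib, report):
    """Run a block between the memory-debug prologue and epilogue."""
    lib.debugMemory(1)                  # prologue: start tracking
    try:
        yield
    finally:
        lib.cleanupParser()             # epilogue: release library state
        leaked = lib.debugMemory(1)     # bytes still allocated
        if leaked == 0:
            report("OK")
        else:
            report("Memory leak %d bytes" % leaked)
            lib.dumpMemory()


class FakeLib:
    """Hypothetical stand-in exposing the three calls used above."""

    def __init__(self, leaked):
        self.leaked = leaked
        self.dumped = False

    def debugMemory(self, activate):
        return self.leaked

    def cleanupParser(self):
        pass

    def dumpMemory(self):
        self.dumped = True


messages = []
clean = FakeLib(0)
with memory_checked(clean, messages.append):
    pass

leaky = FakeLib(16)
with memory_checked(leaky, messages.append):
    pass
```

With the bindings installed, the libxml2 module itself would presumably be passed as lib and print as report.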
Daniel Veillard

Python and bindings
