From 968041a8b2ec86c39b5074024ce97d136ecd9a95 Mon Sep 17 00:00:00 2001
From: Mike Hommey
Date: Thu, 26 Oct 2006 11:17:37 +0200
Subject: Load /tmp/libxml2-2.6.27 into libxml2/branches/upstream/current.

---
 doc/python.html | 247 ++++++++++++++++++++++++++++----------------------------
 1 file changed, 124 insertions(+), 123 deletions(-)

diff --git a/doc/python.html b/doc/python.html
index adb3d36..5910766 100644
--- a/doc/python.html
+++ b/doc/python.html
@@ -7,66 +7,73 @@ H1 {font-family: Verdana,Arial,Helvetica}
 H2 {font-family: Verdana,Arial,Helvetica}
 H3 {font-family: Verdana,Arial,Helvetica}
 A:link, A:visited, A:active { text-decoration: underline }
-Python and bindings
Action against software patentsGnome2 LogoW3C LogoRed Hat Logo
Made with Libxml2 Logo

The XML C parser and toolkit of Gnome

Python and bindings

Developer Menu
API Indexes
Related links

There are a number of language bindings and wrappers available -forlibxml2,the list below is not exhaustive. Please contact the xml-bindings@gnome.org(archives) inorder -toget updates to this list or to discuss the specific topic of -libxml2orlibxslt wrappers or bindings:

  • Libxml++seemsthemost - up-to-date C++ bindings for libxml2, check the documentationandthe - examples.
  • -
  • There is another C++wrapperbased on the gdome2 - bindingsmaintained by Tobias Peters.
  • +Python and bindings
    Action against software patentsGnome2 LogoW3C LogoRed Hat Logo
    Made with Libxml2 Logo

    The XML C parser and toolkit of Gnome

    Python and bindings

    Developer Menu
    API Indexes
    Related links

    There are a number of language bindings and wrappers available for +libxml2, the list below is not exhaustive. Please contact the xml-bindings@gnome.org +(archives) in +order to get updates to this list or to discuss the specific topic of libxml2 +or libxslt wrappers or bindings:

    The distribution includes a set of Python bindings, which are -guaranteedtobe maintained as part of the library in the future, though -thePythoninterface have not yet reached the completeness of the C API.

    Note that some of the Python purist dislike the default set -ofPythonbindings, rather than complaining I suggest they have a look at lxml the more pythonic bindings -forlibxml2and libxsltand helpMartijnFaassencomplete -those.

    StéphaneBidoulmaintains aWindows portof the Python -bindings.

    Note to people interested in building bindings, the API is formalized asan XML API description filewhich allows -toautomatea large part of the Python bindings, this includes -functiondescriptions,enums, structures, typedefs, etc... The Python script -used tobuild thebindings is python/generator.py in the source -distribution.

    To install the Python bindings there are 2 options:

    • If you use an RPM based distribution, simply install the libxml2-pythonRPM(andif - needed the libxslt-pythonRPM).
    • -
    • Otherwise use the libxml2-pythonmoduledistributioncorresponding - to your installed version oflibxml2 andlibxslt. Note that to install it - you will need both libxml2and libxsltinstalled and run "python setup.py - build install" in themodule tree.
    • -

    The distribution includes a set of examples and regression tests -forthepython bindings in the python/testsdirectory. Here -aresomeexcerpts from those tests:

    tst.py:

    This is a basic test of the file interface and DOM navigation:

    import libxml2, sys
    +  
  • LibxmlJ is + an effort to create a 100% JAXP-compatible Java wrapper for libxml2 and + libxslt as part of GNU ClasspathX project.
  • +
  • Patrick McPhee provides Rexx bindings for libxml2 and libxslt, look for + RexxXML.
  • +
  • Satimage + provides XMLLib + osax. This is an osax for Mac OS X with a set of commands to + implement in AppleScript the XML DOM, XPATH and XSLT. Also includes + commands for Property-lists (Apple's fast lookup table XML format.)
  • +
  • Francesco Montorsi developed wxXml2 + wrappers that interface with libxml2, allowing wxWidgets applications to + load/save/edit XML instances.
  • +

    The distribution includes a set of Python bindings, which are guaranteed +to be maintained as part of the library in the future, though the Python +interface has not yet reached the completeness of the C API.

    Note that some Python purists dislike the default set of Python +bindings; rather than complaining, I suggest they have a look at lxml, the more pythonic bindings for libxml2 +and libxslt, and help Martijn +Faassen complete those.

    Stéphane Bidoul +maintains a Windows port +of the Python bindings.

    Note to people interested in building bindings: the API is formalized as +an XML API description file, which allows +automating a large part of the Python bindings; this includes function +descriptions, enums, structures, typedefs, etc. The Python script used to +build the bindings is python/generator.py in the source distribution.

    To install the Python bindings there are 2 options:

    • If you use an RPM based distribution, simply install the libxml2-python + RPM (and if needed the libxslt-python + RPM).
    • +
    • Otherwise use the libxml2-python + module distribution corresponding to your installed version of + libxml2 and libxslt. Note that to install it you will need both libxml2 + and libxslt installed, and you must run "python setup.py build install" in the + module tree.
    • +

    The distribution includes a set of examples and regression tests for the +python bindings in the python/tests directory. Here are some +excerpts from those tests:

    tst.py:

    This is a basic test of the file interface and DOM navigation:

    import libxml2, sys
     
     doc = libxml2.parseFile("tst.xml")
     if doc.name != "tst.xml":
    @@ -80,25 +87,24 @@ child = root.children
     if child.name != "foo":
         print "child.name failed"
         sys.exit(1)
    -doc.freeDoc()

    The Python module is called libxml2; parseFile is the -equivalentofxmlParseFile (most of the bindings are automatically generated, -and thexmlprefix is removed and the casing convention are kept). All node -seen atthebinding level share the same subset of accessors:

    • name: returns the node name
    • -
    • type: returns a string indicating the node type
    • -
    • content: returns the content of the node, it is - basedonxmlNodeGetContent() and hence is recursive.
    • -
    • parent, - children,last,next, - prev,doc,properties: pointing to - the associatedelement in the tree,those may return None in case no such - linkexists.
    • -

    Also note the need to explicitly deallocate documents with -freeDoc().Reference counting for libxml2 trees would need quite a lot of -worktofunction properly, and rather than risk memory leaks if -notimplementedcorrectly it sounds safer to have an explicit function to free -atree. Thewrapper python objects like doc, root or child are -themautomatically garbagecollected.

    validate.py:

    This test check the validation interfaces and redirection -oferrormessages:

    import libxml2
    +doc.freeDoc()

    The Python module is called libxml2; parseFile is the equivalent of +xmlParseFile (most of the bindings are automatically generated: the xml +prefix is removed and the casing convention is kept). All nodes seen at the +binding level share the same subset of accessors:

    • name : returns the node name
    • +
    • type : returns a string indicating the node type
    • +
    • content : returns the content of the node, it is based on + xmlNodeGetContent() and hence is recursive.
    • +
    • parent , children, last, + next, prev, doc, + properties: pointing to the associated element in the tree, + those may return None in case no such link exists.
    • +
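The naming rule mentioned before the accessor list (drop the xml prefix, keep the rest of the casing) can be sketched in a few lines. This is only an illustration: the helper name python_name is hypothetical, and it is not the logic used by python/generator.py, which handles many more cases.

```python
def python_name(c_name):
    """Map a C function name to a binding-style name (simplified sketch).

    Assumption: the 'xml' prefix is dropped and the first remaining
    letter is lowercased, as in xmlParseFile -> parseFile.
    """
    prefix = "xml"
    # Leave names alone if they lack the prefix or consist of it entirely.
    if not c_name.startswith(prefix) or len(c_name) == len(prefix):
        return c_name
    rest = c_name[len(prefix):]
    return rest[0].lower() + rest[1:]


print(python_name("xmlParseFile"))
print(python_name("xmlCreateFileParserCtxt"))
```

Both names above appear in this document; other functions may be exposed differently (for example as methods on node or context classes), so libxml2.py remains the reference.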

    Also note the need to explicitly deallocate documents with freeDoc(). +Reference counting for libxml2 trees would need quite a lot of work to +function properly, and rather than risk memory leaks if not implemented +correctly it sounds safer to have an explicit function to free a tree. The +wrapper Python objects like doc, root or child are then automatically garbage +collected.

    validate.py:

    This test checks the validation interfaces and the redirection of error +messages:

    import libxml2
     
     #deactivate error messages from the validation
     def noerr(ctx, str):
    @@ -113,29 +119,27 @@ doc = ctxt.doc()
     valid = ctxt.isValid()
     doc.freeDoc()
     if valid != 0:
    -    print "validity check failed"

    The first thing to notice is the call to registerErrorHandler(), -itdefinesa new error handler global to the library. It is used to avoid -seeingtheerror messages when trying to validate the invalid document.

    The main interest of that test is the creation of a parser -contextwithcreateFileParserCtxt() and how the behaviour can be changed -beforecallingparseDocument() . Similarly the informations resulting from -theparsing phaseare also available using context methods.

    Contexts like nodes are defined as class and the libxml2 wrappers mapstheC -function interfaces in terms of objects method as much as possible.Thebest to -get a complete view of what methods are supported is to look atthelibxml2.py -module containing all the wrappers.

    push.py:

    This test show how to activate the push parser interface:

    import libxml2
    +    print "validity check failed"

    The first thing to notice is the call to registerErrorHandler(), it +defines a new error handler global to the library. It is used to avoid seeing +the error messages when trying to validate the invalid document.

    The main interest of that test is the creation of a parser context with +createFileParserCtxt() and how the behaviour can be changed before calling +parseDocument(). Similarly, the information resulting from the parsing phase +is also available using context methods.

    Contexts, like nodes, are defined as classes, and the libxml2 wrappers map the +C function interfaces to object methods as much as possible. The +best way to get a complete view of which methods are supported is to look at the +libxml2.py module containing all the wrappers.

    push.py:

    This test shows how to activate the push parser interface:

    import libxml2
     
     ctxt = libxml2.createPushParser(None, "<foo", 4, "test.xml")
     ctxt.parseChunk("/>", 2, 1)
     doc = ctxt.doc()
     
    -doc.freeDoc()

    The context is created with a special call based -onthexmlCreatePushParser() from the C library. The first argument is -anoptionalSAX callback object, then the initial set of data, the length and -thename ofthe resource in case URI-References need to be computed by -theparser.

    Then the data are pushed using the parseChunk() method, the -lastcallsetting the third argument terminate to 1.

    pushSAX.py:

    this test show the use of the event based parsing interfaces. In -thiscasethe parser does not build a document, but provides callback -informationasthe parser makes progresses analyzing the data being -provided:

    import libxml2
    +doc.freeDoc()

    The context is created with a special call based on the +xmlCreatePushParser() from the C library. The first argument is an optional +SAX callback object, then the initial set of data, the length and the name of +the resource in case URI-References need to be computed by the parser.

    Then the data are pushed using the parseChunk() method, with the last call +setting the third argument, terminate, to 1.
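For readers who want to try the chunked-feeding idea without libxml2 installed, here is a minimal sketch using the Python standard library's xml.etree.ElementTree.XMLPullParser. This is an analogy, not the libxml2 API: feed() plays roughly the role of parseChunk(), and close() stands in for the final call with terminate set to 1.

```python
from xml.etree.ElementTree import XMLPullParser

# Standard-library analogue of a push parser (not the libxml2 binding).
parser = XMLPullParser(events=("start", "end"))

# The document arrives in two pieces, split in the middle of the tag,
# like the "<foo" and "/>" chunks in push.py.
parser.feed("<foo")
parser.feed("/>")

# Tell the parser that no more data will come.
parser.close()

# Events that were queued while feeding can still be read after close().
events = [(event, elem.tag) for event, elem in parser.read_events()]
print(events)
```

The point carried over from push.py is that the parser accepts input that is split at arbitrary positions, including inside a tag.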

    pushSAX.py:

    This test shows the use of the event-based parsing interfaces. In this case +the parser does not build a document, but provides callback information as +it makes progress analyzing the data being provided:

    import libxml2
     log = ""
     
     class callback:
    @@ -183,16 +187,15 @@ reference = "startDocument:startElement foo {'url': 'tst'}:" + \
                 "characters: bar:endElement foo:endDocument:"
     if log != reference:
         print "Error got: %s" % log
    -    print "Expected: %s" % reference

    The key object in that test is the handler, it provides a number -ofentrypoints which can be called by the parser as it makes progresses -toindicatethe information set obtained. The full set of callback is larger -thanwhatthe callback class in that specific example implements (see -theSAXdefinition for a complete list). The wrapper will only call those -suppliedbythe object when activated. The startElement receives the names of -theelementand a dictionary containing the attributes carried by this -element.

    Also note that the reference string generated from the callback -showsasingle character call even though the string "bar" is passed to -theparserfrom 2 different call to parseChunk()

    xpath.py:

    This is a basic test of XPath wrappers support

    import libxml2
    +    print "Expected: %s" % reference

    The key object in that test is the handler, which provides a number of entry +points that can be called by the parser as it makes progress to indicate +the information set obtained. The full set of callbacks is larger than what +the callback class in that specific example implements (see the SAX +definition for a complete list). The wrapper will only call those supplied by +the object when activated. The startElement callback receives the name of the element +and a dictionary containing the attributes carried by this element.

    Also note that the reference string generated from the callback shows a +single characters call even though the string "bar" is passed to the parser +in 2 different calls to parseChunk().
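The handler idea can also be tried with the standard library's xml.sax module. This is a hedged sketch of the same pattern, not the libxml2 wrapper: the class name Callback and the log format are illustrative. The text "bar" is again split across two feeds. Since a SAX parser may deliver character data in one or several characters() calls, the sketch collects the pieces and joins them instead of counting calls.

```python
import xml.sax


class Callback(xml.sax.ContentHandler):
    """Records the events the parser reports (illustrative handler)."""

    def __init__(self):
        super().__init__()
        self.log = []
        self.text = []

    def startDocument(self):
        self.log.append("startDocument")

    def endDocument(self):
        self.log.append("endDocument")

    def startElement(self, name, attrs):
        # attrs is converted to a plain dict for logging.
        self.log.append("startElement %s %s" % (name, dict(attrs.items())))

    def endElement(self, name):
        self.log.append("endElement %s" % name)

    def characters(self, content):
        # May be called once or several times for the same text run.
        self.text.append(content)


handler = Callback()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# Feed the document in two chunks, splitting the text "bar".
parser.feed("<foo url='tst'>b")
parser.feed("ar</foo>")
parser.close()

print(":".join(handler.log))
print("".join(handler.text))
```

Only the callbacks defined on the handler do any work here; the others fall back to the no-op defaults of ContentHandler. This mirrors the remark that the wrapper only calls the entry points the object supplies.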

    xpath.py:

    This is a basic test of XPath wrappers support

    import libxml2
     
     doc = libxml2.parseFile("tst.xml")
     ctxt = doc.xpathNewContext()
    @@ -204,15 +207,14 @@ if res[0].name != "doc" or res[1].name != "foo":
         print "xpath query: wrong node set value"
         sys.exit(1)
     doc.freeDoc()
    -ctxt.xpathFreeContext()

    This test parses a file, then create an XPath context to -evaluateXPathexpression on it. The xpathEval() method execute an XPath query -andreturnsthe result mapped in a Python way. String and numbers are -nativelyconverted,and node sets are returned as a tuple of libxml2 Python -nodeswrappers. Likethe document, the XPath context need to be freed -explicitly,also not thatthe result of the XPath query may point back to the -documenttree and hencethe document must be freed after the result of the -query isused.

    xpathext.py:

    This test shows how to extend the XPath engine with functions -writteninpython:

    import libxml2
    +ctxt.xpathFreeContext()

    This test parses a file, then creates an XPath context to evaluate an XPath +expression on it. The xpathEval() method executes an XPath query and returns +the result mapped in a Python way. Strings and numbers are natively converted, +and node sets are returned as a tuple of libxml2 Python node wrappers. Like +the document, the XPath context needs to be freed explicitly. Also note that +the result of the XPath query may point back to the document tree, and hence +the document must be freed only after the result of the query has been used.

    xpathext.py:

    This test shows how to extend the XPath engine with functions written in +Python:

    import libxml2
     
     def foo(ctx, x):
         return x + 1
    @@ -224,10 +226,9 @@ res = ctxt.xpathEval("foo(1)")
     if res != 2:
         print "xpath extension failure"
     doc.freeDoc()
    -ctxt.xpathFreeContext()

    Note how the extension function is registered with the context -(butthatpart is not yet finalized, this may change slightly in the -future).

    tstxpath.py:

    This test is similar to the previous one but shows how -theextensionfunction can access the XPath evaluation context:

    def foo(ctx, x):
    +ctxt.xpathFreeContext()

    Note how the extension function is registered with the context (but that +part is not yet finalized, this may change slightly in the future).

    tstxpath.py:

    This test is similar to the previous one but shows how the extension +function can access the XPath evaluation context:

    def foo(ctx, x):
         global called
     
         #
    @@ -236,16 +237,16 @@ theextensionfunction can access the XPath evaluation context:

    def foo(ct
         pctxt = libxml2.xpathParserContext(_obj=ctx)
         ctxt = pctxt.context()
         called = ctxt.function()
    -    return x + 1

    All the interfaces around the XPath parser(or rather evaluation)contextare -not finalized, but it should be sufficient to do contextual workat -theevaluation point.

    Memory debugging:

    last but not least, all tests starts with the following prologue:

    #memory debug specific
    +    return x + 1

    The interfaces around the XPath parser (or rather evaluation) context +are not yet finalized, but they should be sufficient to do contextual work at the +evaluation point.

    Memory debugging:

    Last but not least, all tests start with the following prologue:

    #memory debug specific
     libxml2.debugMemory(1)

    and end with the following epilogue:

    #memory debug specific
     libxml2.cleanupParser()
     if libxml2.debugMemory(1) == 0:
         print "OK"
     else:
         print "Memory leak %d bytes" % (libxml2.debugMemory(1))
    -    libxml2.dumpMemory()

    Those activate the memory debugging interface of libxml2 whereallallocated -block in the library are tracked. The prologue then cleans upthelibrary state -and checks that all allocated memory has been freed. If notitcalls -dumpMemory() which saves that list in a .memdumpfile.

    Daniel Veillard

    + libxml2.dumpMemory()

    Those activate the memory debugging interface of libxml2, where all +allocated blocks in the library are tracked. The epilogue then cleans up the +library state and checks that all allocated memory has been freed. If not, it +calls dumpMemory(), which saves that list in a .memdump file.

    Daniel Veillard

-- cgit v1.2.3