<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>The parser interfaces</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000"><table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr><td width="120"><a href="http://swpat.ffii.org/"><img src="epatents.png" alt="Action against software patents" /></a></td><td width="180"><a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo" /></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo" /></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo" /></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo" /></a></div></td><td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center"><h1>The XML C parser and toolkit of Gnome</h1><h2>The parser interfaces</h2></td></tr></table></td></tr></table></td></tr></table><table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr><td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Developer Menu</b></center></td></tr><tr><td bgcolor="#fffacd"><form action="search.php" enctype="application/x-www-form-urlencoded" method="get"><input name="query" type="text" size="20" value="" /><input name="submit" type="submit" value="Search ..." 
/></form><ul><li><a href="index.html" style="font-weight:bold">Main Menu</a></li><li><a href="html/index.html" style="font-weight:bold">Reference Manual</a></li><li><a href="examples/index.html" style="font-weight:bold">Code Examples</a></li><li><a href="guidelines.html">XML Guidelines</a></li><li><a href="tutorial/index.html">Tutorial</a></li><li><a href="xmlreader.html">The Reader Interface</a></li><li><a href="ChangeLog.html">ChangeLog</a></li><li><a href="XSLT.html">XSLT</a></li><li><a href="python.html">Python and bindings</a></li><li><a href="architecture.html">libxml2 architecture</a></li><li><a href="tree.html">The tree output</a></li><li><a href="interface.html">The SAX interface</a></li><li><a href="xmlmem.html">Memory Management</a></li><li><a href="xmlio.html">I/O Interfaces</a></li><li><a href="library.html">The parser interfaces</a></li><li><a href="entities.html">Entities or no entities</a></li><li><a href="namespaces.html">Namespaces</a></li><li><a href="upgrade.html">Upgrading 1.x code</a></li><li><a href="threads.html">Thread safety</a></li><li><a href="DOM.html">DOM Principles</a></li><li><a href="example.html">A real example</a></li><li><a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>API Indexes</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="APIchunk0.html">Alphabetic</a></li><li><a href="APIconstructors.html">Constructors</a></li><li><a href="APIfunctions.html">Functions/Types</a></li><li><a href="APIfiles.html">Modules</a></li><li><a href="APIsymbols.html">Symbols</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="http://mail.gnome.org/archives/xml/">Mail 
archive</a></li><li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li><li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li><li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li><li><a href="ftp://xmlsoft.org/">FTP</a></li><li><a href="http://www.zlatkovic.com/projects/libxml/">Windows binaries</a></li><li><a href="http://www.blastwave.org/packages.php/libxml2">Solaris binaries</a></li><li><a href="http://www.explain.com.au/oss/libxml2xslt.html">MacOsX binaries</a></li><li><a href="http://libxmlplusplus.sourceforge.net/">C++ bindings</a></li><li><a href="http://www.zend.com/php5/articles/php5-xmlphp.php#Heading4">PHP bindings</a></li><li><a href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li><li><a href="http://libxml.rubyforge.org/">Ruby bindings</a></li><li><a href="http://tclxml.sourceforge.net/">Tcl bindings</a></li><li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Bug Tracker</a></li></ul></td></tr></table></td></tr></table></td><td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd"><p>This section is directly intended to help programmers
getting bootstrapped using the XML toolkit from the C language. It is not
intended to be extensive. I hope the automatically generated documents will
provide the completeness required, but as a separate set of documents. The
interfaces of the XML parser are by design low level. Those interested in a
higher-level API should <a href="#DOM">look at DOM</a>.</p><p>The <a href="html/libxml-parser.html">parser interfaces
for XML</a> are separated from the <a href="html/libxml-htmlparser.html">HTML parser interfaces</a>. Let's have a
look at how the XML parser can be called:</p><h3><a name="Invoking" id="Invoking">Invoking the parser: the pull method</a></h3><p>Usually, the first thing to do is to read an XML input. The
parser accepts documents either from in-memory strings or from files. The
functions are defined in "parser.h":</p><dl><dt><code>xmlDocPtr xmlParseMemory(char *buffer, int size);</code></dt>
    <dd><p>Parse an in-memory buffer of <code>size</code> bytes containing the document.</p>
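    <p>As a minimal sketch, parsing a document held in memory might look like the following. The document text and the variable names are illustrative assumptions, not part of the library:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

int main(void) {
    /* Hypothetical document, used only for illustration. */
    const char *text = "&lt;EXAMPLE&gt;&lt;head/&gt;&lt;/EXAMPLE&gt;";
    xmlDocPtr doc = xmlParseMemory(text, (int) strlen(text));

    if (doc == NULL) {
        fprintf(stderr, "parse failed\n");
        return 1;
    }
    xmlFreeDoc(doc);      /* release the tree */
    xmlCleanupParser();   /* release parser-global state at program exit */
    return 0;
}</pre>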
    </dd>
</dl><dl><dt><code>xmlDocPtr xmlParseFile(const char *filename);</code></dt>
    <dd><p>Parse an XML document contained in a (possibly compressed) file.</p>
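    <p>A hedged sketch of the file-based call, assuming the file name is passed as the first command-line argument (that convention is an assumption of this example). It checks for the NULL result on failure and prints the name of the root element:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

int main(int argc, char **argv) {
    xmlDocPtr doc;
    xmlNodePtr root;

    if (argc &lt; 2) {
        fprintf(stderr, "usage: %s file.xml\n", argv[0]);
        return 1;
    }
    doc = xmlParseFile(argv[1]);
    if (doc == NULL) {
        fprintf(stderr, "could not parse %s\n", argv[1]);
        return 1;
    }
    /* Skips any comments or PIs that precede the root element. */
    root = xmlDocGetRootElement(doc);
    if (root != NULL)
        printf("root element: %s\n", (const char *) root-&gt;name);
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}</pre>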
    </dd>
</dl><p>The parser returns a pointer to the document structure (or NULL in
case of failure).</p><h3 id="Invoking1">Invoking the parser: the push method</h3><p>In order for the application to keep control while the document
is being fetched (which is common for GUI-based programs), libxml2 provides
a push interface, too, as of version 1.8.3. Here are the interface functions:</p><pre>xmlParserCtxtPtr xmlCreatePushParserCtxt(xmlSAXHandlerPtr sax,
                                         void *user_data,
                                         const char *chunk,
                                         int size,
                                         const char *filename);
int              xmlParseChunk          (xmlParserCtxtPtr ctxt,
                                         const char *chunk,
                                         int size,
                                         int terminate);</pre><p>and here is a simple example showing how to use the interface:</p><pre>            FILE *f;

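            f = fopen(filename, "rb");
            if (f != NULL) {
                int res, size = 1024;
                char chars[1024];
                xmlParserCtxtPtr ctxt;

                /* The first 4 bytes let the parser detect the encoding. */
                res = fread(chars, 1, 4, f);
                if (res &gt; 0) {
                    ctxt = xmlCreatePushParserCtxt(NULL, NULL,
                                chars, res, filename);
                    if (ctxt != NULL) {
                        /* Feed the rest of the file chunk by chunk. */
                        while ((res = fread(chars, 1, size, f)) &gt; 0) {
                            xmlParseChunk(ctxt, chars, res, 0);
                        }
                        /* An empty chunk with terminate set ends the parse. */
                        xmlParseChunk(ctxt, chars, 0, 1);
                        doc = ctxt-&gt;myDoc;
                        xmlFreeParserCtxt(ctxt);
                    }
                }
                fclose(f);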
            }</pre><p>The HTML parser embedded into libxml2 also has a push
interface; the functions are just prefixed by "html" rather than "xml".</p><h3 id="Invoking2">Invoking the parser: the SAX interface</h3><p>The tree-building interface makes the parser memory-hungry,
first loading the document in memory and then building the tree itself. Reading
a document without building the tree is possible using the SAX interfaces
(see SAX.h and <a href="http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html">James Henstridge's documentation</a>).
Note also that the push interface can be limited to SAX: just use the first two
arguments of <code>xmlCreatePushParserCtxt()</code>.</p><h3><a name="Building" id="Building">Building a tree from scratch</a></h3><p>The other way to get an XML tree in memory is by building
it. There is a set of functions dedicated to building new
elements. (These are also described in &lt;libxml/tree.h&gt;.) For example,
here is a piece of code that produces the XML document used in the previous
examples:</p><pre>    #include &lt;libxml/tree.h&gt;
    xmlDocPtr doc;
    xmlNodePtr tree, subtree;

    doc = xmlNewDoc("1.0");
    doc-&gt;children = xmlNewDocNode(doc, NULL, "EXAMPLE", NULL);
    xmlSetProp(doc-&gt;children, "prop1", "gnome is great");
    xmlSetProp(doc-&gt;children, "prop2", "&amp; linux too");
    tree = xmlNewChild(doc-&gt;children, NULL, "head", NULL);
    subtree = xmlNewChild(tree, NULL, "title", "Welcome to Gnome");
    tree = xmlNewChild(doc-&gt;children, NULL, "chapter", NULL);
    subtree = xmlNewChild(tree, NULL, "title", "The Linux adventure");
    subtree = xmlNewChild(tree, NULL, "p", "bla bla bla ...");
    subtree = xmlNewChild(tree, NULL, "image", NULL);
    xmlSetProp(subtree, "href", "linus.gif");</pre><p>Not really rocket science ...</p><h3><a name="Traversing" id="Traversing">Traversing the tree</a></h3><p>Basically by <a href="html/libxml-tree.html">including "tree.h"</a> your code
has access to the internal structure of all the elements of the tree. The names
should be somewhat simple, like
<strong>parent</strong>, <strong>children</strong>,
<strong>next</strong>, <strong>prev</strong>, <strong>properties</strong>,
etc. For example, still with the previous example:</p><pre><code>doc-&gt;children-&gt;children-&gt;children</code></pre><p>points to the title element,</p><pre>doc-&gt;children-&gt;children-&gt;next-&gt;children-&gt;children</pre><p>points to the text node containing the chapter title
"The Linux adventure".</p><p><strong>NOTE</strong>: XML allows <em>PI</em>s and
<em>comments</em> to be present before the document root, so
<code>doc-&gt;children</code> may point to a node which is not the document
Root Element; a function <code>xmlDocGetRootElement()</code> was added for this
purpose.</p><h3><a name="Modifying" id="Modifying">Modifying the tree</a></h3><p>Functions are provided for reading and writing the document content. Here is
an excerpt from the <a href="html/libxml-tree.html">tree API</a>:</p><dl><dt><code>xmlAttrPtr xmlSetProp(xmlNodePtr node, const xmlChar
  *name, const xmlChar *value);</code></dt>
    <dd><p>This sets (or changes) an attribute carried by an ELEMENT
      node. The value can be NULL.</p>
    </dd>
</dl><dl><dt><code>xmlChar *xmlGetProp(xmlNodePtr node,
  const xmlChar *name);</code></dt>
    <dd><p>This function returns a pointer to a new copy of the property content.
      Note that the user must deallocate the result (with <code>xmlFree()</code>).</p>
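    <p>A short sketch of reading an attribute and releasing the copy afterwards. The sample document and the attribute name <code>prop1</code> are taken from the earlier example for illustration; the surrounding program is an assumption of this sketch:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

int main(void) {
    /* Hypothetical document, used only for illustration. */
    const char *text = "&lt;EXAMPLE prop1=\"gnome is great\"/&gt;";
    xmlDocPtr doc = xmlParseMemory(text, (int) strlen(text));
    xmlNodePtr root;
    xmlChar *value;

    if (doc == NULL)
        return 1;
    root = xmlDocGetRootElement(doc);
    value = xmlGetProp(root, BAD_CAST "prop1");
    if (value != NULL) {
        printf("prop1 = %s\n", (const char *) value);
        xmlFree(value);   /* the returned string is a copy owned by the caller */
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}</pre>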
    </dd>
</dl><p>Two functions are provided for reading and writing the text
associated with elements:</p><dl><dt><code>xmlNodePtr xmlStringGetNodeList(xmlDocPtr doc,
  const xmlChar *value);</code></dt>
    <dd><p>This function takes an "external" string and converts it to one text
      node or possibly to a list of entity and text nodes. All non-predefined
      entity references like &amp;Gnome; will be stored internally as entity
      nodes, hence the result of the function may not be a single node.</p>
    </dd>
</dl><dl><dt><code>xmlChar *xmlNodeListGetString(xmlDocPtr doc, xmlNodePtr
  list, int inLine);</code></dt>
    <dd><p>This function is the inverse of <code>xmlStringGetNodeList()</code>.
      It generates a new string containing the content of the text and entity
      nodes. Note the extra argument inLine. If this argument is set to 1, the
      function will expand entity references. For example, instead of
      returning the &amp;Gnome; XML encoding in the string, it will substitute
      it with its value (say, "GNU Network Object Model Environment").</p>
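    <p>As an illustrative sketch, the text of an element can be fetched by passing its child list to this function. The sample document below is an assumption made for the example:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

int main(void) {
    /* Hypothetical document, used only for illustration. */
    const char *text =
        "&lt;chapter&gt;&lt;title&gt;The Linux adventure&lt;/title&gt;&lt;/chapter&gt;";
    xmlDocPtr doc = xmlParseMemory(text, (int) strlen(text));
    xmlNodePtr root, title;
    xmlChar *content;

    if (doc == NULL)
        return 1;
    root = xmlDocGetRootElement(doc);   /* the chapter element */
    title = root-&gt;children;             /* the title element */
    /* title-&gt;children is the list of text (and entity) nodes. */
    content = xmlNodeListGetString(doc, title-&gt;children, 1);
    if (content != NULL) {
        printf("%s\n", (const char *) content);
        xmlFree(content);               /* the string is newly allocated */
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}</pre>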
    </dd>
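</dl><h3><a name="Saving" id="Saving">Saving a tree</a></h3><p>Three options are available:</p><dl><dt><code>void xmlDocDumpMemory(xmlDocPtr cur,
  xmlChar **mem, int *size);</code></dt>
    <dd><p>Stores in <code>*mem</code> a newly allocated buffer into which the document has been saved, and its length in <code>*size</code>. The caller must free the buffer with <code>xmlFree()</code>.</p>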
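    <p>A minimal sketch of serializing a small tree to memory. The one-element document is an assumption of this example, and the buffer is released with <code>xmlFree()</code> once written out:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

int main(void) {
    xmlDocPtr doc = xmlNewDoc(BAD_CAST "1.0");
    xmlNodePtr root;
    xmlChar *mem = NULL;
    int size = 0;

    if (doc == NULL)
        return 1;
    root = xmlNewDocNode(doc, NULL, BAD_CAST "EXAMPLE", NULL);
    xmlDocSetRootElement(doc, root);

    xmlDocDumpMemory(doc, &amp;mem, &amp;size);
    if (mem != NULL) {
        fwrite(mem, 1, (size_t) size, stdout);
        xmlFree(mem);     /* the buffer belongs to the caller */
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}</pre>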
    </dd>
</dl><dl><dt><code>extern void xmlDocDump(FILE *f, xmlDocPtr doc);</code></dt>
    <dd><p>Dumps a document to an open <code>FILE</code> stream.</p>
    </dd>
</dl><dl><dt><code>int xmlSaveFile(const char *filename, xmlDocPtr cur);</code></dt>
    <dd><p>Saves the document to a file. In this case,
      the compression interface is triggered if it has been turned on.</p>
    </dd>
</dl><h3><a name="Compressio" id="Compressio">Compression</a></h3><p>The library transparently handles compression when
doing file-based accesses. The level of compression on saves can be set
either globally or individually for one file:</p><dl><dt><code>int  xmlGetDocCompressMode (xmlDocPtr doc);</code></dt>
    <dd><p>Gets the document compression level (0-9).</p>
    </dd>
</dl><dl><dt><code>void xmlSetDocCompressMode (xmlDocPtr doc, int mode);</code></dt>
    <dd><p>Sets the document compression level (0-9).</p>
    </dd>
</dl><dl><dt><code>int  xmlGetCompressMode(void);</code></dt>
    <dd><p>Gets the default compression level.</p>
    </dd>
</dl><dl><dt><code>void xmlSetCompressMode(int mode);</code></dt>
    <dd><p>Sets the default compression level.</p>
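    <p>A hedged sketch of turning compression on for a single document before saving it. It assumes libxml2 was built with zlib support; the output file name <code>example.xml.gz</code> and the one-element document are hypothetical:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;libxml/parser.h&gt;
#include &lt;libxml/tree.h&gt;

int main(void) {
    xmlDocPtr doc = xmlNewDoc(BAD_CAST "1.0");
    xmlNodePtr root;
    int status = 0;

    if (doc == NULL)
        return 1;
    root = xmlNewDocNode(doc, NULL, BAD_CAST "EXAMPLE", NULL);
    xmlDocSetRootElement(doc, root);

    /* Per-document setting; xmlSetCompressMode() would set the default. */
    xmlSetDocCompressMode(doc, 9);
    if (xmlSaveFile("example.xml.gz", doc) &lt; 0) {
        fprintf(stderr, "save failed\n");
        status = 1;
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return status;
}</pre>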
    </dd>
</dl><p><a href="bugs.html">Daniel Veillard</a></p></td></tr></table></td></tr></table></td></tr></table></td></tr></table></td></tr></table></body></html>