<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>Validation &amp; DTDs</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000"><table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr><td width="120"><a href="http://swpat.ffii.org/"><img src="epatents.png" alt="Action against software patents" /></a></td><td width="180"><a href="http://www.gnome.org/"><img src="gnome2.png" alt="Gnome2 Logo" /></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo" /></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo" /></a><div align="left"><a href="http://xmlsoft.org/"><img src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo" /></a></div></td><td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center"><h1>The XML C parser and toolkit of Gnome</h1><h2>Validation &amp; DTDs</h2></td></tr></table></td></tr></table></td></tr></table><table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr><td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr><tr><td bgcolor="#fffacd"><form action="search.php" enctype="application/x-www-form-urlencoded" method="get"><input name="query" type="text" size="20" value="" /><input name="submit" type="submit" value="Search ..." 
/></form><ul><li><a href="index.html">Home</a></li><li><a href="html/index.html">Reference Manual</a></li><li><a href="intro.html">Introduction</a></li><li><a href="FAQ.html">FAQ</a></li><li><a href="docs.html" style="font-weight:bold">Developer Menu</a></li><li><a href="bugs.html">Reporting bugs and getting help</a></li><li><a href="help.html">How to help</a></li><li><a href="downloads.html">Downloads</a></li><li><a href="news.html">Releases</a></li><li><a href="XMLinfo.html">XML</a></li><li><a href="XSLT.html">XSLT</a></li><li><a href="xmldtd.html">Validation &amp; DTDs</a></li><li><a href="encoding.html">Encodings support</a></li><li><a href="catalog.html">Catalog support</a></li><li><a href="namespaces.html">Namespaces</a></li><li><a href="contribs.html">Contributions</a></li><li><a href="examples/index.html" style="font-weight:bold">Code Examples</a></li><li><a href="html/index.html" style="font-weight:bold">API Menu</a></li><li><a href="guidelines.html">XML Guidelines</a></li><li><a href="ChangeLog.html">Recent Changes</a></li></ul></td></tr></table><table width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr><tr><td bgcolor="#fffacd"><ul><li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li><li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li><li><a href="http://phd.cs.unibo.it/gdome2/">DOM gdome2</a></li><li><a href="http://www.aleksey.com/xmlsec/">XML-DSig xmlsec</a></li><li><a href="ftp://xmlsoft.org/">FTP</a></li><li><a href="http://www.zlatkovic.com/projects/libxml/">Windows binaries</a></li><li><a href="http://www.blastwave.org/packages.php/libxml2">Solaris binaries</a></li><li><a href="http://www.explain.com.au/oss/libxml2xslt.html">MacOsX binaries</a></li><li><a href="http://libxmlplusplus.sourceforge.net/">C++ bindings</a></li><li><a href="http://www.zend.com/php5/articles/php5-xmlphp.php#Heading4">PHP bindings</a></li><li><a 
href="http://sourceforge.net/projects/libxml2-pas/">Pascal bindings</a></li><li><a href="http://libxml.rubyforge.org/">Ruby bindings</a></li><li><a href="http://tclxml.sourceforge.net/">Tcl bindings</a></li><li><a href="http://bugzilla.gnome.org/buglist.cgi?product=libxml2">Bug Tracker</a></li></ul></td></tr></table></td></tr></table></td><td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd"><p>Table of Content:</p><ol><li><a href="#General5">General overview</a></li>
  <li><a href="#definition1">The definition</a></li>
  <li><a href="#Simple1">Simple rules</a>
    <ol><li><a href="#reference1">How to reference a DTD from a document</a></li>
      <li><a href="#Declaring2">Declaring elements</a></li>
      <li><a href="#Declaring1">Declaring attributes</a></li>
    </ol></li>
  <li><a href="#Some1">Some examples</a></li>
  <li><a href="#validate1">How to validate</a></li>
  <li><a href="#Other1">Other resources</a></li>
</ol><h3><a name="General5" id="General5">General overview</a></h3><p>Well, what is validation and what is a DTD?</p><p>DTD is the acronym for Document Type Definition. This is a
description of the content for a family of XML files. This is part of the
XML 1.0 specification, and allows one to describe and verify that a
given document instance conforms to the set of rules detailing its structure
and content.</p><p>Validation is the process of checking a document against a
DTD (more generally against a set of construction rules).</p><p>The validation process and building DTDs are the two most difficult parts of
the XML life cycle. Briefly, a DTD defines all the possible elements to be found
within your document and the formal shape of your document tree (by
defining the allowed content of an element: either text, a regular expression
for the allowed list of children, or mixed content, i.e. both text and
children). The DTD also defines the valid attributes for all elements and the
types of those attributes.</p><h3><a name="definition1" id="definition1">The definition</a></h3><p>The <a href="http://www.w3.org/TR/REC-xml">W3C XML Recommendation</a> (<a href="http://www.xml.com/axml/axml.html">Tim Bray's annotated
version of Rev1</a>):</p><ul><li><a href="http://www.w3.org/TR/REC-xml#elemdecls">Declaring elements</a></li>
  <li><a href="http://www.w3.org/TR/REC-xml#attdecls">Declaring attributes</a></li>
</ul><p>(Unfortunately) all this is inherited from the SGML world, and the
syntax is ancient...</p><h3><a name="Simple1" id="Simple1">Simple rules</a></h3><p>Writing DTDs can be done in many ways. The rules to build them if
you need something permanent or something which can evolve over time can
be radically different. Really complex DTDs like the DocBook ones are flexible
but much harder to design. I will just focus on DTDs for formats with a
fixed simple structure. It is just a set of basic rules, and definitely
not exhaustive nor usable for complex DTD design.</p><h4><a name="reference1" id="reference1">How to reference a DTD from a document</a>:</h4><p>Assuming the top element of the document is <code>spec</code> and the
DTD is placed in the file <code>mydtd</code> in the
subdirectory <code>dtds</code> of the directory from which the document was
loaded:</p><p><code>&lt;!DOCTYPE spec SYSTEM "dtds/mydtd"&gt;</code></p><p>Notes:</p><ul><li>The system string is actually a URI-Reference (as defined in <a href="http://www.ietf.org/rfc/rfc2396.txt">RFC 2396</a>) so you can
    use a full URL string indicating the location of your DTD on the Web. This
    is a really good thing to do if you want others to validate
  your document.</li>
  <li>It is also possible to associate a <code>PUBLIC</code> identifier (a magic
    string) so that the DTD is looked up in catalogs on the client side without
    having to locate it on the web.</li>
  <li>A DTD contains a set of element and attribute declarations,
    but they don't define what the root of the document should be. This
    is explicitly told to the parser/validator as the first element
    of the <code>DOCTYPE</code> declaration.</li>
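  <li>As a hedged sketch of the <code>PUBLIC</code> identifier note above,
    a <code>DOCTYPE</code> declaration can carry a public identifier followed
    by the system identifier, which XML still requires. The public identifier
    string below is a made-up placeholder, not one registered in any real
    catalog:
    <pre>&lt;!DOCTYPE spec PUBLIC "-//Example//DTD Spec Example//EN" "dtds/mydtd"&gt;</pre>
  </li>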
</ul><h4><a name="Declaring2" id="Declaring2">Declaring elements</a>:</h4><p>The following declares an element <code>spec</code>:</p><p><code>&lt;!ELEMENT spec (front, body, back?)&gt;</code></p><p>It also expresses that the spec element contains one <code>front</code>, one
<code>body</code> and one optional <code>back</code> child element, in this
order. The declaration of an element and of its content is done
in a single declaration. Similarly the following
declares <code>div1</code> elements:</p><p><code>&lt;!ELEMENT div1 (head, (p | list | note)*, div2?)&gt;</code></p><p>which means div1 contains one <code>head</code>, then a series
of optional <code>p</code>, <code>list</code> and <code>note</code> elements, and
then an optional <code>div2</code>. And last but not least an element
can contain text:</p><p><code>&lt;!ELEMENT b (#PCDATA)&gt;</code></p><p><code>b</code> contains text. An element can also be of mixed content (text and
elements in no particular order):</p><p><code>&lt;!ELEMENT p (#PCDATA|a|ul|b|i|em)*&gt;</code></p><p><code>p</code> can contain text or
<code>a</code>, <code>ul</code>, <code>b</code>, <code>i</code> or
<code>em</code> elements in no particular order.</p><h4><a name="Declaring1" id="Declaring1">Declaring attributes</a>:</h4><p>Again the attributes declaration includes their content definition:</p><p><code>&lt;!ATTLIST termdef name CDATA #IMPLIED&gt;</code></p><p>means that the element <code>termdef</code> can have
a <code>name</code> attribute containing text (<code>CDATA</code>) and which
is optional (<code>#IMPLIED</code>). The attribute value can also be
defined within a set:</p><p><code>&lt;!ATTLIST list
type (bullets|ordered|glossary) "ordered"&gt;</code></p><p>means the <code>list</code> element has a <code>type</code> attribute
with 3 allowed values, "bullets", "ordered" or "glossary", which
defaults to "ordered" if the attribute is not explicitly specified.</p><p>The content type of an attribute can be
text (<code>CDATA</code>), anchor/reference/references (<code>ID</code>/<code>IDREF</code>/<code>IDREFS</code>), entity(ies) (<code>ENTITY</code>/<code>ENTITIES</code>)
or name(s) (<code>NMTOKEN</code>/<code>NMTOKENS</code>). The following
defines that a <code>chapter</code> element can have an
optional <code>id</code> attribute of type <code>ID</code>, usable for reference
from attributes of type IDREF:</p><p><code>&lt;!ATTLIST chapter id ID #IMPLIED&gt;</code></p><p>The last value of an attribute definition can
be <code>#REQUIRED</code>, meaning that the attribute has to be
given, <code>#IMPLIED</code>, meaning that it is optional, or the default
value (possibly prefixed by <code>#FIXED</code> if it is the only value allowed).</p><p>Notes:</p><ul><li>Usually the attributes pertaining to a given element are declared
    in a single expression, but it is just a convention adopted by a lot
    of DTD writers:
    <pre>&lt;!ATTLIST termdef
          id      ID      #REQUIRED
          name    CDATA   #IMPLIED&gt;</pre>
    <p>The previous construct defines
    both <code>id</code> and <code>name</code> attributes for the
    element <code>termdef</code>.</p>
  </li>
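  <li>To tie element and attribute declarations together, here is a minimal
    sketch of a complete document whose DTD is embedded in the internal
    subset. The <code>termlist</code> root element, the content model chosen
    for <code>termdef</code> and the sample values are assumptions made up
    for illustration; only the <code>termdef</code> attribute list comes from
    the text above:
    <pre>&lt;!DOCTYPE termlist [
&lt;!ELEMENT termlist (termdef+)&gt;
&lt;!ELEMENT termdef (#PCDATA)&gt;
&lt;!ATTLIST termdef
      id      ID      #REQUIRED
      name    CDATA   #IMPLIED&gt;
]&gt;
&lt;termlist&gt;
  &lt;termdef id="t1" name="DTD"&gt;Document Type Definition&lt;/termdef&gt;
  &lt;termdef id="t2"&gt;Validation&lt;/termdef&gt;
&lt;/termlist&gt;</pre>
    <p>Under these declarations every <code>termdef</code> has to carry
    an <code>id</code> whose value is unique within the document,
    while <code>name</code> may be left out.</p>
  </li>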
</ul><h3><a name="Some1" id="Some1">Some examples</a></h3><p>The directory <code>test/valid/dtds/</code> in the
libxml2 distribution contains some complex DTD examples. The example in
the file <code>test/valid/dia.xml</code> shows an XML file where the simple
DTD is directly included within the document.</p><h3><a name="validate1" id="validate1">How to validate</a></h3><p>The simplest way is to use the xmllint program included with
libxml. The <code>--valid</code> option turns on validation of the files given
as input. For example the following validates a copy of the first revision of
the XML 1.0 specification:</p><p><code>xmllint --valid --noout test/valid/REC-xml-19980210.xml</code></p><p>The <code>--noout</code> option is used to disable output of the resulting tree.</p><p>The <code>--dtdvalid dtd</code> option allows validation of the
document(s) against a given DTD.</p><p>Libxml2 exports an API to handle DTDs and validation, check the <a href="http://xmlsoft.org/html/libxml-valid.html">associated description</a>.</p><h3><a name="Other1" id="Other1">Other resources</a></h3><p>DTDs are as old as SGML, so there may be a number of examples
on-line. I will just list one for now, other pointers welcome:</p><ul><li><a href="http://www.xml101.com:8081/dtd/">XML-101 DTD</a></li>
</ul><p>I suggest looking at the examples found under test/valid/dtds and any
of the large number of books available on XML. The dia example in
test/valid should be both simple and complete enough to allow you to build your
own.</p><p></p><p><a href="bugs.html">Daniel Veillard</a></p></td></tr></table></td></tr></table></td></tr></table></td></tr></table></td></tr></table></body></html>