From a7e9d3f37d5e9fba4b9acaa43e7c12b6d9a669ae Mon Sep 17 00:00:00 2001
From: Mike Hommey
Date: Thu, 8 Jun 2006 10:59:26 +0200
Subject: Load /tmp/libxml2-2.6.26 into libxml2/branches/upstream/current.

---
 doc/entities.html | 74 ++++++++++++++++++++++++++++-------------------------
 1 file changed, 38 insertions(+), 36 deletions(-)

(limited to 'doc/entities.html')

diff --git a/doc/entities.html b/doc/entities.html
index c234b41..732562a 100644
--- a/doc/entities.html
+++ b/doc/entities.html
@@ -7,34 +7,35 @@ H1 {font-family: Verdana,Arial,Helvetica}
 H2 {font-family: Verdana,Arial,Helvetica}
 H3 {font-family: Verdana,Arial,Helvetica}
 A:link, A:visited, A:active { text-decoration: underline }
-Entities or no entities
Action against software patentsGnome2 LogoW3C LogoRed Hat Logo
Made with Libxml2 Logo

The XML C parser and toolkit of Gnome

Entities or no entities

Developer Menu
API Indexes
Related links

Entities in principle are similar to simple C macros. An entity defines an
-abbreviation for a given string that you can reuse many times throughout the
-content of your document. Entities are especially useful when a given string
-may occur frequently within a document, or to confine the change needed to a
-document to a restricted area in the internal subset of the document (at the
-beginning). Example:

1 <?xml version="1.0"?>
+Entities or no entities
Action against software patentsGnome2 LogoW3C LogoRed Hat Logo
Made with Libxml2 Logo

The XML C parser and toolkit of Gnome

Entities or no entities

Developer Menu
API Indexes
Related links

Entities in principle are similar to simple C macros. An entity
+definesanabbreviation for a given string that you can reuse many times
+throughoutthecontent of your document. Entities are especially useful when a
+givenstringmay occur frequently within a document, or to confine the change
+neededto adocument to a restricted area in the internal subset of the
+document (atthebeginning). Example:

1 <?xml version="1.0"?>
 2 <!DOCTYPE EXAMPLE SYSTEM "example.dtd" [
 3 <!ENTITY xml "Extensible Markup Language">
 4 ]>
 5 <EXAMPLE>
 6    &xml;
-7 </EXAMPLE>

Line 3 declares the xml entity. Line 6 uses the xml entity, by prefixing
-its name with '&' and following it by ';' without any spaces added. There
-are 5 predefined entities in libxml2 allowing you to escape characters with
-predefined meaning in some parts of the xml document content:
-&lt; for the character '<', &gt;
-for the character '>', &apos; for the character ''',
-&quot; for the character '"', and
-&amp; for the character '&'.

One of the problems related to entities is that you may want the parser to
-substitute an entity's content so that you can see the replacement text in
-your application. Or you may prefer to keep entity references as such in the
-content to be able to save the document back without losing this usually
-precious information (if the user went through the pain of explicitly
-defining entities, he may have a a rather negative attitude if you blindly
-substitute them as saving time). The xmlSubstituteEntitiesDefault()
-function allows you to check and change the behaviour, which is to not
-substitute entities by default.

Here is the DOM tree built by libxml2 for the previous document in the
-default case:

/gnome/src/gnome-xml -> ./xmllint --debug test/ent1
+7 </EXAMPLE>

Line 3 declares the xml entity. Line 6 uses the xml entity, byprefixingits
+name with '&' and following it by ';' without any spacesadded. Thereare 5
+predefined entities in libxml2 allowing you to escapecharacters
+withpredefined meaning in some parts of the xml
+documentcontent:&lt;for the character
+'<',&gt;for the character
+'>',&apos;for the
+character''',&quot;for the character
+'"',and&amp;for the character '&'.

One of the problems related to entities is that you may want the
+parsertosubstitute an entity's content so that you can see the replacement
+textinyour application. Or you may prefer to keep entity references as such
+inthecontent to be able to save the document back without losing
+thisusuallyprecious information (if the user went through the pain
+ofexplicitlydefining entities, he may have a a rather negative attitude if
+youblindlysubstitute them as saving time). The xmlSubstituteEntitiesDefault()functionallows
+you to check and change the behaviour, which is to notsubstituteentities by
+default.

Here is the DOM tree built by libxml2 for the previous document
+inthedefault case:

/gnome/src/gnome-xml -> ./xmllint --debug test/ent1
 DOCUMENT
 version=1.0
    ELEMENT EXAMPLE
@@ -49,16 +50,17 @@ DOCUMENT
 version=1.0
    ELEMENT EXAMPLE
      TEXT
-     content=     Extensible Markup Language

So, entities or no entities? Basically, it depends on your use case. I
-suggest that you keep the non-substituting default behaviour and avoid using
-entities in your XML document or data if you are not willing to handle the
-entity references elements in the DOM tree.

Note that at save time libxml2 enforces the conversion of the predefined
-entities where necessary to prevent well-formedness problems, and will also
-transparently replace those with chars (i.e. it will not generate entity
-reference elements in the DOM tree or call the reference() SAX callback when
-finding them in the input).

WARNING: handling entities
-on top of the libxml2 SAX interface is difficult!!! If you plan to use
-non-predefined entities in your documents, then the learning curve to handle
-then using the SAX API may be long. If you plan to use complex documents, I
-strongly suggest you consider using the DOM interface instead and let libxml
-deal with the complexity rather than trying to do it yourself.

Daniel Veillard

+ content= Extensible Markup Language

So, entities or no entities? Basically, it depends on your use
+case.Isuggest that you keep the non-substituting default behaviour and
+avoidusingentities in your XML document or data if you are not willing to
+handletheentity references elements in the DOM tree.

Note that at save time libxml2 enforces the conversion of
+thepredefinedentities where necessary to prevent well-formedness problems,
+andwill alsotransparently replace those with chars (i.e. it will not
+generateentityreference elements in the DOM tree or call the reference() SAX
+callbackwhenfinding them in the input).

WARNING: handlingentitieson
+top of the libxml2 SAX interface is difficult!!! If you plan
+tousenon-predefined entities in your documents, then the learning curve
+tohandlethen using the SAX API may be long. If you plan to use
+complexdocuments, Istrongly suggest you consider using the DOM interface
+instead andlet libxmldeal with the complexity rather than trying to do it
+yourself.

Daniel Veillard

-- cgit v1.2.3