From 2ee13d9e464a1f5daccaff58f5d09d36b7c4f667 Mon Sep 17 00:00:00 2001
From: Aron Xu <aron@debian.org>
Date: Mon, 21 Sep 2015 22:58:06 +0800
Subject: Revert "Imported Upstream version 2.9.1+dfsg1"

This reverts commit 7300193becde71a344c8ac0973dc290fa24d800d.
---
 doc/encoding.html | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

(limited to 'doc/encoding.html')
diff --git a/doc/encoding.html b/doc/encoding.html
index 93de5bf..7c7953f 100644
--- a/doc/encoding.html
+++ b/doc/encoding.html
@@ -13,7 +13,8 @@ by Tim Bray on Unicode and why you should care about it.</p><p>If you don't unde
 without knowing what encoding it uses</b>, then as Joel Spolsky said <a href="http://www.joelonsoftware.com/articles/Unicode.html">please do not
 write another line of code until you finish reading that article.</a>. It is
 a prerequisite to understand this page, and avoid a lot of problems with
-libxml2, XML or text processing in general.</p><p>Table of Content:</p><ol><li><a href="encoding.html#What">What does internationalization support
+libxml2, XML or text processing in general.</p><p>Table of Content:</p><ol>
+  <li><a href="encoding.html#What">What does internationalization support
     mean ?</a></li>
   <li><a href="encoding.html#internal">The internal encoding, how and
   why</a></li>
@@ -33,7 +34,8 @@ allows the document to be encoded in other encodings at the condition that
 they are clearly labeled as such. For example the following is a wellformed
 XML document encoded in ISO-8859-1 and using accentuated letters that we
 French like for both markup and content:</p><pre>&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
-&lt;très&gt;là &lt;/très&gt;</pre><p>Having internationalization support in libxml2 means the following:</p><ul><li>the document is properly parsed</li>
+&lt;très&gt;là &lt;/très&gt;</pre><p>Having internationalization support in libxml2 means the following:</p><ul>
+  <li>the document is properly parsed</li>
   <li>information about it's encoding is saved</li>
   <li>it can be modified</li>
   <li>it can be saved in its original encoding</li>
@@ -54,7 +56,8 @@ an internationalized fashion by libxml2 too:</p><pre>&lt;!DOCTYPE HTML PUBLIC "-
 &lt;p&gt;W3C crée des standards pour le Web.&lt;/body&gt;
 &lt;/html&gt;</pre><h3><a name="internal" id="internal">The internal encoding, how and why</a></h3><p>One of the core decisions was to force all documents to be converted to a
 default internal encoding, and that encoding to be UTF-8, here are the
-rationales for those choices:</p><ul><li>keeping the native encoding in the internal form would force the libxml
+rationales for those choices:</p><ul>
+  <li>keeping the native encoding in the internal form would force the libxml
     users (or the code associated) to be fully aware of the encoding of the
     original document, for examples when adding a text node to a document,
     the content would have to be provided in the document encoding, i.e. the
@@ -67,7 +70,8 @@ rationales for those choices:</p><ul><li>keeping the native encoding in the inte
     considered an intelligent choice too since it's a direct Unicode mapping
     support. I selected UTF-8 on the basis of efficiency and compatibility
     with surrounding software:
-    <ul><li>UTF-8 while a bit more complex to convert from/to (i.e. slightly
+    <ul>
+      <li>UTF-8 while a bit more complex to convert from/to (i.e. slightly
         more costly to import and export CPU wise) is also far more compact
         than UTF-16 (and UCS-4) for a majority of the documents I see it used
         for right now (RPM RDF catalogs, advogato data, various configuration
@@ -86,8 +90,10 @@ rationales for those choices:</p><ul><li>keeping the native encoding in the inte
         upcoming Gnome text widget, and a lot of Unix code (yet another place
         where Unix programmer base takes a different approach from Microsoft
         - they are using UTF-16)</li>
-    </ul></li>
-</ul><p>What does this mean in practice for the libxml2 user:</p><ul><li>xmlChar, the libxml2 data type is a byte, those bytes must be assembled
+    </ul>
+  </li>
+</ul><p>What does this mean in practice for the libxml2 user:</p><ul>
+  <li>xmlChar, the libxml2 data type is a byte, those bytes must be assembled
     as UTF-8 valid strings. The proper way to terminate an xmlChar * string
     is simply to append 0 byte, as usual.</li>
   <li>One just need to make sure that when using chars outside the ASCII set,
@@ -95,7 +101,8 @@ rationales for those choices:</p><ul><li>keeping the native encoding in the inte
 </ul><h3><a name="implemente" id="implemente">How is it implemented ?</a></h3><p>Let's describe how all this works within libxml, basically the I18N
 (internationalization) support get triggered only during I/O operation, i.e.
 when reading a document or saving one. Let's look first at the reading
-sequence:</p><ol><li>when a document is processed, we usually don't know the encoding, a
+sequence:</p><ol>
+  <li>when a document is processed, we usually don't know the encoding, a
     simple heuristic allows to detect UTF-16 and UCS-4 from encodings where
     the ASCII range (0-0x7F) maps with ASCII</li>
   <li>the xml declaration if available is parsed, including the encoding
@@ -136,7 +143,8 @@ err2.xml:1: error: Unsupported encoding UnsupportedEnc
 collected/built an xmlDoc DOM like structure) ? It depends on the function
 called, xmlSaveFile() will just try to save in the original encoding, while
 xmlSaveFileTo() and xmlSaveFileEnc() can optionally save to a given
-encoding:</p><ol><li>if no encoding is given, libxml2 will look for an encoding value
+encoding:</p><ol>
+  <li>if no encoding is given, libxml2 will look for an encoding value
     associated to the document and if it exists will try to save to that
     encoding,
     <p>otherwise everything is written in the internal form, i.e. UTF-8</p>
@@ -175,7 +183,8 @@ so a couple of functions htmlGetMetaEncoding() and htmlSetMetaEncoding() have
 been provided. The parser also attempts to switch encoding on the fly when
 detecting such a tag on input. Except for that the processing is the same
 (and again reuses the same code).</p><h3><a name="Default" id="Default">Default supported encodings</a></h3><p>libxml2 has a set of default converters for the following encodings
-(located in encoding.c):</p><ol><li>UTF-8 is supported by default (null handlers)</li>
+(located in encoding.c):</p><ol>
+  <li>UTF-8 is supported by default (null handlers)</li>
   <li>UTF-16, both little and big endian</li>
   <li>ISO-Latin-1 (ISO-8859-1) covering most western languages</li>
   <li>ASCII, useful mostly for saving</li>
@@ -193,7 +202,8 @@ goal is to be able to parse document whose encoding is supported but where
 the name differs (for example from the default set of names accepted by
 iconv). The following functions allow to register and handle new aliases for
 existing encodings. Once registered libxml2 will automatically lookup the
-aliases when handling a document:</p><ul><li>int xmlAddEncodingAlias(const char *name, const char *alias);</li>
+aliases when handling a document:</p><ul>
+  <li>int xmlAddEncodingAlias(const char *name, const char *alias);</li>
   <li>int xmlDelEncodingAlias(const char *alias);</li>
   <li>const char * xmlGetEncodingAlias(const char *alias);</li>
   <li>void xmlCleanupEncodingAliases(void);</li>
-- 
cgit v1.2.3