<feed xmlns='http://www.w3.org/2005/Atom'>
<title>pkgsrc/textproc/py-html5lib, branch pkgsrc-2014Q4</title>
<subtitle>[no description]</subtitle>
<id>https://git.osdyson.ru/mirror/pkgsrc/atom?h=pkgsrc-2014Q4</id>
<link rel='self' href='https://git.osdyson.ru/mirror/pkgsrc/atom?h=pkgsrc-2014Q4'/>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/'/>
<updated>2014-05-10T15:46:52Z</updated>
<entry>
<title>Add missing six dependency. Bump revision.</title>
<updated>2014-05-10T15:46:52Z</updated>
<author>
<name>joerg</name>
<email>joerg@pkgsrc.org</email>
</author>
<published>2014-05-10T15:46:52Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=fe06e81883dd85f1c22f96372335eb8a3d32601d'/>
<id>urn:sha1:fe06e81883dd85f1c22f96372335eb8a3d32601d</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Update to 0.999:</title>
<updated>2014-01-17T23:32:02Z</updated>
<author>
<name>wiz</name>
<email>wiz@pkgsrc.org</email>
</author>
<published>2014-01-17T23:32:02Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=2f1637227c98f7c07d74102b22195b2c5f0e6a39'/>
<id>urn:sha1:2f1637227c98f7c07d74102b22195b2c5f0e6a39</id>
<content type='text'>
0.999
~~~~~

Released on December 23, 2013

* Fix #127: add work-around for CPython issue #20007: .read(0) on
  http.client.HTTPResponse drops the rest of the content.

* Fix #115: lxml treewalker can now deal with fragments containing, at
  their root level, text nodes with non-ASCII characters on Python 2.


0.99
~~~~

Released on September 10, 2013

* No library changes from 1.0b3; released as 0.99 because pip 1.4 and
  later no longer install pre-release versions by default, per
  PEP 440.


1.0b3
~~~~~

Released on July 24, 2013

* Removed ``RecursiveTreeWalker`` from ``treewalkers._base``. Any
  implementation using it should be moved to
  ``NonRecursiveTreeWalker``, as everything bundled with html5lib has
  for years.

* Fix #67 so that ``BufferedStream`` correctly returns a bytes
  object, thereby fixing any case where html5lib is passed a
  non-seekable RawIOBase-like object.


1.0b2
~~~~~

Released on June 27, 2013

* Removed reordering of attributes within the serializer. There is now
  an ``alphabetical_attributes`` option which preserves the previous
  behaviour through a new filter. This allows attribute order to be
  preserved through html5lib if the tree builder preserves order.

* Removed ``dom2sax`` from DOM treebuilders. It has been replaced by
  ``treeadapters.sax.to_sax`` which is generic and supports any
  treewalker; it also resolves all known bugs with ``dom2sax``.

* Fix treewalker assertions on hitting byte strings on
  Python 2. Prior to 1.0b1, treewalkers coped with mixed
  bytes/unicode data on Python 2; this release restores that
  behaviour on Python 2. Behaviour is unchanged on Python 3.


1.0b1
~~~~~

Released on May 17, 2013

* Implementation updated to follow the `HTML specification
  &lt;http://www.whatwg.org/specs/web-apps/current-work/&gt;`_ as of 5th May
  2013 (`SVN &lt;http://svn.whatwg.org/webapps/&gt;`_ revision r7867).

* Python 3.2+ supported in a single codebase using the ``six`` library.

* Removed support for Python 2.5 and older.

* Removed the deprecated Beautiful Soup 3 treebuilder.
  ``beautifulsoup4`` can use ``html5lib`` as a parser instead. Note that
  since it doesn't support namespaces, foreign content like SVG and
  MathML is parsed incorrectly.

* Removed ``simpletree`` from the package. The default tree builder is
  now ``etree`` (using the ``xml.etree.cElementTree`` implementation if
  available, and ``xml.etree.ElementTree`` otherwise).

* Removed the ``XHTMLSerializer`` as it never actually guaranteed its
  output was well-formed XML, and hence was of little use.

* Removed direct access to the default DOM treebuilder, so
  ``html5lib.treebuilders.dom`` is no longer supported. Use
  ``html5lib.treebuilders.getTreeBuilder("dom")`` instead; it returns
  the default DOM treebuilder, which uses ``xml.dom.minidom``.

* Optional heuristic character encoding detection now based on
  ``charade`` for Python 2.6 - 3.3 compatibility.

* Optional ``Genshi`` treewalker support fixed.

* Many bugfixes, including:

  * #33: null in attribute value breaks XML AttValue;

  * #4: nested, indirect descendant, &lt;button&gt; causes infinite loop;

  * `Google Code 215
    &lt;http://code.google.com/p/html5lib/issues/detail?id=215&gt;`_: Properly
    detect seekable streams;

  * `Google Code 206
    &lt;http://code.google.com/p/html5lib/issues/detail?id=206&gt;`_: add
    support for &lt;video preload=...&gt;, &lt;audio preload=...&gt;;

  * `Google Code 205
    &lt;http://code.google.com/p/html5lib/issues/detail?id=205&gt;`_: add
    support for &lt;video poster=...&gt;;

  * `Google Code 202
    &lt;http://code.google.com/p/html5lib/issues/detail?id=202&gt;`_: Unicode
    file breaks InputStream.

* Source code is now mostly PEP 8 compliant.

* Test harness has been improved and now depends on ``nose``.

* Documentation updated and moved to http://html5lib.readthedocs.org/.</content>
</entry>
<entry>
<title>Changes 0.95:</title>
<updated>2012-12-01T18:37:51Z</updated>
<author>
<name>adam</name>
<email>adam@pkgsrc.org</email>
</author>
<published>2012-12-01T18:37:51Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=0cd5fb86a7c58d23d5d5c9b7e03a7f82dabb6a06'/>
<id>urn:sha1:0cd5fb86a7c58d23d5d5c9b7e03a7f82dabb6a06</id>
<content type='text'>
* Parses valid and invalid HTML documents to a tree
* Support for minidom, ElementTree (including cElementTree and lxml.etree), BeautifulSoup (deprecated) and custom simpletree output formats
* DOM to SAX converter
* Reports parse errors
* Character encoding detection
* Filtering and serializing of trees
* HTML+CSS sanitizer
* Many unit tests</content>
</entry>
<entry>
<title>Drop superfluous PKG_DESTDIR_SUPPORT, "user-destdir" is default these days.</title>
<updated>2012-10-25T06:55:37Z</updated>
<author>
<name>asau</name>
<email>asau@pkgsrc.org</email>
</author>
<published>2012-10-25T06:55:37Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=4961a4ef35d266a26e4c92f763c5beb55aa5aa1d'/>
<id>urn:sha1:4961a4ef35d266a26e4c92f763c5beb55aa5aa1d</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Adjust HOMEPAGE</title>
<updated>2012-01-21T16:52:44Z</updated>
<author>
<name>gls</name>
<email>gls@pkgsrc.org</email>
</author>
<published>2012-01-21T16:52:44Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=4db336045ec703b21f424bf441aae9e3a9ef4130'/>
<id>urn:sha1:4db336045ec703b21f424bf441aae9e3a9ef4130</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Changes 0.90:</title>
<updated>2011-04-15T08:42:03Z</updated>
<author>
<name>adam</name>
<email>adam@pkgsrc.org</email>
</author>
<published>2011-04-15T08:42:03Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=c2f7be2c04168c5e370899e22d5c79c33e903e71'/>
<id>urn:sha1:c2f7be2c04168c5e370899e22d5c79c33e903e71</id>
<content type='text'>
* Parses valid and invalid HTML documents to a tree
* Support for minidom, ElementTree (including cElementTree and lxml.etree),
  BeautifulSoup (deprecated) and custom simpletree output formats
* DOM to SAX converter
* Reports parse errors
* Character encoding detection
* Filtering and serializing of trees
* HTML+CSS sanitizer
* Many unit tests</content>
</entry>
<entry>
<title>Update to html5lib-0.11.1. No detailed changes.</title>
<updated>2009-10-19T10:57:40Z</updated>
<author>
<name>joerg</name>
<email>joerg@pkgsrc.org</email>
</author>
<published>2009-10-19T10:57:40Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=f3e005ba5e8bd81515427feb3f4f98c9c9f0fd21'/>
<id>urn:sha1:f3e005ba5e8bd81515427feb3f4f98c9c9f0fd21</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Remove @dirrm entries from PLISTs</title>
<updated>2009-06-14T18:17:11Z</updated>
<author>
<name>joerg</name>
<email>joerg@pkgsrc.org</email>
</author>
<published>2009-06-14T18:17:11Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=39c828b6a6a7fd0175afcffca207a0a5e8d85f00'/>
<id>urn:sha1:39c828b6a6a7fd0175afcffca207a0a5e8d85f00</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Import py-html5lib-0.11:</title>
<updated>2009-01-27T17:27:07Z</updated>
<author>
<name>joerg</name>
<email>joerg@pkgsrc.org</email>
</author>
<published>2009-01-27T17:27:07Z</published>
<link rel='alternate' type='text/html' href='https://git.osdyson.ru/mirror/pkgsrc/commit/?id=74b18971741c5fc84ee852a4f39923bf868045bd'/>
<id>urn:sha1:74b18971741c5fc84ee852a4f39923bf868045bd</id>
<content type='text'>
html5lib is a pure-python library for parsing HTML. The parser is
designed to handle all flavours of HTML and parses invalid documents
using well-defined error handling rules compatible with the behaviour of
major desktop web browsers.

Output is to a tree structure; the current release supports output to
DOM, ElementTree, lxml and BeautifulSoup tree formats as well as a
simple custom format.</content>
</entry>
</feed>
