path: root/textproc
Age | Commit message | Author | Files | Lines
2015-08-29 | Let's assume that the second p5-Sub-Exporter location is just a typo... | joerg | 1 | -2/+2
2015-08-29 | Update to 3.3.7 | wen | 2 | -8/+8
Update DEPENDS Upstream changes: 3.3.7 2015-08-28 13:45:00+0900 - Fix for older Perl 5.8.8 or lower(#145) - Enable 5.8 tests again 3.3.6 2015-08-25 13:50:00+0900 - Fix issue 'include' makes stack pointer incorrect(#130) 3.3.5 2015-08-05 18:50:00+0900 - Update Mouse version for Perl 5.22 or higher
2015-08-28 | Update to 1.24 | mef | 2 | -7/+6
-------------- 2015-08-28 Sean M. Burke sburke@cpan.org * RELEASE 1.24. Fixing a little (BIG) bug that David Cusimano is a superstar for having noticed. Ah, what a difference a ";" vs a "," makes! [https://rt.cpan.org/Public/Bug/Display.html?id=105420] * I'M BACK. After nine months of semi-catastrophic system failures, and after Voyager-style flybys of a dozen project deadlines... and now I can somehow try to get back in the swing of things. * ANOTHER superstar is Mistah Brendan Byrd who said that there are [ https://rt.cpan.org/Public/Bug/Display.html?id=102357 ] many ports of Unidecode to other languages and that I should brag about that fact, and he is very extremely correct, so now the Pod in Unidecode.pm indeed does just that. * (I got my distro-building back up and running. WOLVERIIIINES!) * I'm thinking of having future Unidecode/*.pm data files contain the canonical Unicode character name for every character as a comment. Obviously, this would make the dist pretty big. But the lib/Unidecode/*.pm files is somewhere around a meg. What's a few megs more?... with the benefit of added clarity? Everyone's a winner!
2015-08-28 | + miller. | wiz | 1 | -1/+2
2015-08-28 | Import miller-2.0.0 as textproc/miller. | wiz | 4 | -0/+40
Miller is like sed, awk, cut, join, and sort for name-indexed data such as CSV. With Miller, you get to use named fields without needing to count positional indices. This is something the Unix toolkit always could have done, and arguably always should have done. It operates on key-value-pair data while the familiar Unix tools operate on integer-indexed fields: if the natural data structure for the latter is the array, then Miller's natural data structure is the insertion-ordered hash map. This encompasses a variety of data formats, including but not limited to the familiar CSV. (Miller can handle positionally-indexed data as a special case.)
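The difference between positional and name-indexed access described above can be sketched in Python. This is only an illustration of the idea, not Miller itself: the sample data and the helper names `by_position` and `by_name` are made up for this example.

```python
import csv
import io

# Hypothetical sample data; not taken from the Miller distribution.
SAMPLE = "name,qty,price\napple,3,0.50\npear,5,0.25\n"


def by_position(text):
    """awk/cut style: fields are addressed by integer index."""
    rows = list(csv.reader(io.StringIO(text)))
    # rows[0] is the header; the caller has to know that qty is column 1.
    return [(row[0], int(row[1])) for row in rows[1:]]


def by_name(text):
    """Name-indexed style: each record is an insertion-ordered key-value map."""
    return [dict(record) for record in csv.DictReader(io.StringIO(text))]


records = by_name(SAMPLE)
for record in records:
    # Fields are looked up by name, so reordering the columns would not break this.
    print(record["name"], record["qty"])
```

If the columns of the input were reordered, `by_position` would silently pick up the wrong field, while `by_name` would keep working.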
2015-08-26 | Update to 1.60. Changes: | shattered | 2 | -6/+6
+ add configure option --with-man2html + update configure macros + update config.guess, config.sub
2015-08-26 | Update to 5.2.5: | wiz | 7 | -34/+400
[ANNOUNCE] Link Grammar version 5.2.0 is now available. This is a major release of the parser, with many important changes in it. The internals of the parser have been re-organized, resulting in a speedup of 2x to 4x for typical English texts. Multiple multi-threading bugs were fixed, and there is now a simple multi-threading unit test. A memory leak was fixed, and a memory over-consumption bug was fixed. These changes were enabled by the final removal of the "fat link" code from the parser. Parser internals work continues apace: it is expected that a version 5.3.0 will follow shortly, featuring a completely re-designed tokenizer. This redesign should enable simpler and better morphology support. The ChangeLog notes other fixes as well: Version 5.2.0 (27 December 2014) * y'all, ain't, gonna, y'gotta: Beverly Hillbillies basilect. * Permanent removal of the fat-link code. * Remove deprecated constituent tree code. * Windows: add terminal screen resizing support. * Windows: a build fix. * reign, rule, run, leave, come: can take predicative adjective. * Rework costs for many verb-derived adjectives. * Handle (predicative) adjectival modifiers for assorted perfect verbs. * Fixes for various color names. * Fixes for various affirmative answers. * Add 100 missing verbs. * Add preliminary lxc-docker (docker.io) support. * Remove MSVC6 support. * Fix memleak introduced in version 5.1.0 * Speedup of 1.7x to 4x (depending on text) from linkage processing redesign. * Fix multi-threading safety bug. * Fix link-and-domain printing alignment (to handle utf8 char widths). * Windows: fixes for MSVC12 support. * Fix memory consumption bug (EMPTY_WORD) introduced in version 4.7.10. * Get rid of xrealloc, which clashes with libbfd symbol xrealloc. * Add multi-threaded parsing unit test. ================================================================= Link Grammar version 5.1.2 is now available.
Download from: http://www.abisource.com/downloads/link-grammar/5.1.2/link-grammar-5.1.2.tar.gz The most serious fix in this release is a build-break fix for Apple OSX Mavericks. Other fixes, from the ChangeLog: * Fix greeting: "How do you do?" * Fix indirect object in 'what' questions: 'To what do you owe your success?' * Fix assorted questions with verb "to be". * Compile fixes for Apple OSX version "Mavericks" ================================================================= [ANNOUNCE] link-grammar version 5.1.0 This version includes a number of important changes. One of these is that the connectors can now be given a direction (head and tail indicators), so that link-grammar dependencies can now be true, hierarchical dependency arrows. This is of marginal importance for English, where dependency directions are implicit, but is vital for free-word-order languages, where bi-directional links are not enough. Another important change is that costs can now be arbitrary floating point numbers. This is particularly useful for providing fine-grained parse ranking. The LG cost system assigns a "cost" to every connector, and the sum-total of costs for a sentence determines the parse ranking. Since costs are additive, they behave as entropies (log P -- the logarithm of a probability: probabilities are multiplicative, logarithms are additive). Under the covers, there's been some major work on the tokenization (splitting sentences into words) and morphology (splitting words into morphemes) code. This work is ongoing, and should eventually result in much better support for non-English languages. Other notable changes include an updated Russian dictionary, and an assortment of changes to the English dictionary. An intriguing step towards phonology: LG can now distinguish between the use of the determiners "a" and "an" preceding nouns that start with consonants or vowels. Whether fancier phonology support is possible is a curious question. 
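The remark above that additive costs behave like logarithms of probabilities can be checked with a few lines of Python. This is a sketch under stated assumptions: the announcement only says "log P", so the sign convention (cost = -log P, lower cost ranks first), the probabilities, and the function names are all chosen here for illustration and are not taken from the Link Grammar code.

```python
import math


def cost_from_probability(p):
    """Assumed convention: cost is the negative log of a probability."""
    return -math.log(p)


def total_cost(connector_costs):
    """A parse's cost is the plain sum of its connector costs."""
    return sum(connector_costs)


def joint_probability(probabilities):
    """Independent probabilities combine by multiplication."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


# Hypothetical per-connector probabilities for two candidate parses.
parse_a = [0.5, 0.25, 0.8]
parse_b = [0.9, 0.6, 0.7]

cost_a = total_cost(cost_from_probability(p) for p in parse_a)
cost_b = total_cost(cost_from_probability(p) for p in parse_b)

# Summing costs gives the same number as taking -log of the product.
print(cost_a, -math.log(joint_probability(parse_a)))
# Ranking by ascending cost equals ranking by descending joint probability.
print("preferred parse:", "b" if cost_b < cost_a else "a")
```

Because the costs are floating point, arbitrarily fine distinctions between parses can be expressed, which is what the announcement means by fine-grained parse ranking.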
The full changelog is below: * Updated Russian dictionaries from Sergei Protasov. * Added morphology-based unknown-word handling for Russian, from Sergei. * Fix up fat-linkage code, which was recently broken... * API cleanup: many command-line options never belonged in the API. * New emoticon support was clobbering certain dictionary words. * Fix: "Go to spot X", "It happens at time T." * Add a dozen missing verbs. * Minor work on greetings. * Add mechanism for denoting fractional costs in the file-backed dict. * Fix: broken handling of gerunds (due to bad verb-wall connectors) * Major redesign of morpheme splitting mechanism (from AmirP) * Minor extensions to support numeric formulas, e.g. 1 + 1 = 2. * Remove fat linkage support from the SAT solver. * Enable build of SAT solver by default. * Fix multiple bugs with unit stripping. * Add bounds-checking to the C API. * Fix the old disjunct-printing implementation. * Add support for easy-to-use link direction indicator. * Add random morphology generator tool. * Partial support for phonetic use of "a" vs. "an" for English. * Rework how coordination between conjunctions works: "either... or ...", etc. * Major redesign of tokenization mechanism (from AmirP) ================================================================= Version 5.0.0 of the Link Grammar Parser is now available. (Yes, it's April 1st. No, this is not a joke. Maybe I'll think of something snarky next year.) We are proud to announce a major new release of the Link Grammar Parser! It contains many important changes and new additions. One of the most significant changes is that the license has been changed from the BSD license to the LGPL. This was done to enable considerably more flexibility in accepting contributions to the project: it seems that few are particularly interested in contributing to a BSD-licensed project. This change has enabled folding in some new work: o Arabic and Persian dictionaries! These were previously maintained as separate add-ons.
Including them as part of the distribution should make it easier for interested users. o A new 'bindings' directory, containing code for Java, Python, Common Lisp, OCaML and AutoIt programming languages. The Python bindings are an updated version of the older pylinkgrammar-0.2.13 bindings. A SWIG interface file should make it easy to create other language bindings as well. o Improved morphology support. This will be invisible to most users, but it lays the groundwork for adding Hebrew support to the parser. o Expanded Lithuanian support. This remains a simplistic prototype, but it now performs a more sophisticated morphological analysis. o Experimental Turkish and Hebrew dictionaries. o A demo of the JSON parser server: it shows how to run the server, which will accept raw sentences on a socket, and returns the parsed forms. o Some slightly incompatible changes to the API: it was time for some housekeeping. o Misc minor updates to the English Language dictionaries. o Preliminary work for SQL-backed dynamic dictionaries. This should enable certain types of automated language learning. The full changelog is shown below. CHANGELOG: Version 5.0.0 (1 April 2014) * License upgrade to LGPLv2.1 * Arabic dictionaries, from Jon Dehdari * Persian dictionaries, from Jon Dehdari * Support for Hebrew tokenization, from Amir P. * Fix wild-card matching for user-supplied word lookup. * Prototype Turkish dictionary from Can Bruce. * Re-arrange programming language bindings directory. * Adopt the orphaned/unsupported pylinkgrammar Python bindings. * Deprecate the obsolete CNode interface. * Provide low-level perl bindings. * Adopt the orphaned/unsupported OCaML bindings. * Support affirmative replies: "Who did it?" "John's evil twin." * Expanded Lithuanian dictionary. * Minor disjunct printing fixes. * Fix: "Mary is too XXX to talk to." * Prototype Hebrew dictionary from Amir P. * Change !suffixes flag to !morphology.
* Introduce a bi-directional connector, for free-word-order languages. * Introduce a symmetric-AND operator, for free-word-order languages. * Add demo shell script for running the JSON parse server. * Bugfix: Java server failing when input sentence has commas in it! * New !test and !debug commands for selective debugging support. * Print post-processing rejection message, when !bad is enabled. * Remove some deprecated functions for C API. * Remove all deprecated functions from Java API. * Initial support for an SQL-backed dynamic dictionary. ================================================================= Version 4.8.5 of the Link Grammar Parser is now available. This is the third release in about a week; each prompted by a build-break in the previous version. Sorry! There's been assorted (minor) new work, and this has been enough to cause trouble for various people. Some notable changes in the last 6 weeks: * Improved Russian (UTF-8) support for MSWindows users. * Build files for MSVC12 * Several Java binding fixes * English dictionary: add a verb-wall connector for present participles. A full list of changes is given below. If none of these seem to affect you, there is no particular need to upgrade. CHANGELOG: Version 4.8.5 (5 January 2014) * Update memory usage accounting; fix accounting bugs. * Fix Java garbage collection bug. * Fix numerous compiler warnings in the SAT-solver code. * Fix build-break involving multiple declaration of 'Boolean'. Version 4.8.4 (30 December 2013) * Fix build break for Mac OSX. Version 4.8.3 (30 December 2013) * Create new msvc12 build files, restore old msvc9 files. * Revert location of the Windows mbrtowc declaration. * Add verb-wall connector for present participles. * Fix build-time include file directory paths. * Provide the 'any' language to enumerate all possible linkages. * Fix recognition of U+00A0, c2 a0, NO-BREAK SPACE as whitespace. * Improve parse-time performance of exceptionally long sentences. 
* Fix crash on certain sentences containing equals sign. Version 4.8.2 (25 November 2013) * More MSWindows UTF-8/multi-byte fixes (for Russian). * Add missing JSONUtils file. Version 4.8.1 (21 November 2013) * Ongoing work on viterbi. * Updated MSVC9 project files from Jand Hashemi (Lucky--) * Fix important bug in Java services: return top parses, not random ones. * Java: for the link-diagram string, do not limit to 80 char term width. * Windows: UTF-8 fixes so that Russian works in most MSWindows locales. ================================================================= Version 4.8.0 of the Link Grammar Parser is now available. This is the start of a new version series, containing an important change to the English language dictionary. Three new link types are introduced WV, CV and IV. These are used to connect the left-wall to the primary verb of the sentence (WV), to connect the ruling clause to the primary verb of a dependent clause (CV), and a similar link for certain infinitive verbs (IV). The goal of these links is to make it easier to locate verbs, and thus to provide a more direct mapping from the link-grammar formalism to a dependency parse (as dependency parses always put the verb at the root of a sentence). These are not the first links that explicitly indicate root verbs: several other link types already play this role: The AF, CP, Eq, COq and B links already play this role. The new WV, CV and IV links round out this capability and do so in a very general form. See http://www.abisource.com/projects/link-grammar/dict/section-WV.html for details. With this release, we expect that all (non-auxiliary) verbs in a sentence will be linked either to the wall, or to a controlling parent. We also expect there to be some additional fixes and tightening-up to occur in future releases, especially in regards to comparative sentences. This release also includes a variety of fixes to the Java API/server. In addition, some ancient, deprecated C code was removed. 
CHANGELOG: Version 4.8.0 (24 October 2013) * Fix "he answered yes" * Support bulleted, numbered lists. * New link types from Lian Ruiting, for identifying the head-verb. * Java: fix bug when totaling WordNet word-sense score. * Java: add info to README about using the JSON parse server. * Java: remove many deprecated functions. * C API: remove some deprecated functions. * Java: fix silent failure when library is not found. * Java: Add support for fetching the ASCII-art diagram string. * Java: Fix insane language selection initialization. * Fix: "The pig runs SLOWER than the cat." * Fix: conjoined superlatives: "... the longest and the farthest." * Fix: "inside" can be used with conjunction: "near or inside..." * Fix: conjoined question modifiers: "exactly when and precisely where..." * Fix: issue 59: crash/corruption when dictionary opened twice. * Fix: assorted exclamations! ================================================================= Version 4.7.12 of the Link Grammar Parser is now available. The biggest change in this version is a sharply updated Russian dictionary, which fixes a large number of bugs generated during the initial release. Thanks to Sergey Protasov who did almost all this work! The other notable change is that the fat-link code is no longer built by default. It will be permanently removed in some future version, "real soon now". A miscellany of other minor changes are listed below. The link-grammar homepage: http://www.abiword.org/projects/link-grammar/ Download: http://www.abiword.org/downloads/link-grammar/4.7.12/link-grammar-4.7.12.tar.gz WHAT IS LINK GRAMMAR? The Link Grammar Parser is a syntactic parser of English (and other languages as well), based on link grammar, an original theory of English syntax.
Given a sentence, the system assigns to it a syntactic structure, which consists of a set of labelled links connecting pairs of words. The parser also produces a "constituent" (Penn tree-bank style phrase tree) representation of a sentence (showing noun phrases, verb phrases, etc.). The RelEx extension provides dependency-parse output. CHANGELOG: Version 4.7.12 (25 May 2013) * Large fixes to the Russian dictionaries. * Windows: Explicitly fail if cygwin version is too old. * Tweak the lt dict to work again with the modern parser. * Make the fat linkages code be compile-time configurable. * Disable fat linkages by default; mark as deprecated. * Fix SAT-solver build; recent changes had broken it. * Export read-dict.h as a public API. * Ongoing development of the Viterbi prototype. * Windows: some UTF8/widechar refactoring. * Java bindings: add method to set the language. * CMake: add version checking to the CMakefile * Fix: failed handling of capitalized first word for Russian. * Fix: stemming failures in many cases (for Russian dictionaries) * Add flag to suppress stem-suffix printing. * Windows: Fixes to MSVC6 build files. * Fix: hash-table bug affecting Russian dictionaries
2015-08-23 | Remove two bl3.mk files that shouldn't be there. | wiz | 2 | -28/+0
No headers or libraries to link to.
2015-08-23 | Bump PKGREVISION for nettle shlib major bump. | wiz | 1 | -2/+2
2015-08-23 | Update to 1.43: | wiz | 2 | -6/+6
1.43 2015-08-21 NEILB - Got rid of the "Redundant argument in sprintf" warnings from Text::Diff::Table on Perl 5.021+. RT#100505 and RT#106602. - Metadata and doc now refer to NEILB's repo rather than OVID's.
2015-08-22 | Add DWB. | leot | 1 | -1/+2
2015-08-22 | Import textproc/DWB as DWB-20150517. From Carsten Kunze via pkgsrc-wip. | leot | 28 | -0/+1017
The Documenter's Workbench (DWB) Release 3.3 is AT&T's original software distribution of nroff and troff (ditroff), the preprocessors tbl, eqn, pic, and grap, and the macro packages man, ms, and mm.
2015-08-20 | Update to 1.42 | mef | 2 | -7/+6
-------------- 1.42 2015-08-20 NEILB - Fixed pod link that was referring to the wrong place. Thanks to KENTNL for RT#106150. - First non-developer release of the changes listed against 1.41_01.
2015-08-18 | Bump all packages that depend on curses.bui* or terminfo.bui* since they | wiz | 6 | -10/+12
might incur ncurses dependencies on some platforms, and ncurses just bumped its shlib. Some packages were bumped twice now, sorry for that.
2015-08-17 | Bump PKGREVISION for ncurses shlib bump. | wiz | 4 | -7/+8
2015-08-16 | Fix MASTER_SITES after PKGREVISION bump. | wiz | 1 | -2/+2
2015-08-15 | Bump PKGREVISION for librevenge boost fix. | wiz | 2 | -3/+4
2015-08-14 | Update to 3.60: | wiz | 2 | -6/+6
iso-codes 3.60 -------------- Dr. Tobias Quathamer <toddy@debian.org> Sun, 2 Aug 2015 [ ISO 4217 ] * Correct name for AOK (Angolan Kwanza), spotted by Anders Jonsson. Thanks! [ ISO 4217 translations ] * French by Christian Perrier * Thai by Theppitak Karoonboonyanan * Swedish by Anders Jonsson (TP) [ ISO 639-5 translations ] * Polish by Jakub Bogusz (TP) [ ISO 639-3 translations ] * Polish by Jakub Bogusz (TP) * Russian by Dmitry Sivachenko (TP) * Ukrainian by Yuri Chornoivan (TP) [ ISO 3166 translations ] * Ukrainian by Yuri Chornoivan (TP) * Esperanto by Edmund GRIMLEY EVANS (TP) * Hungarian by Balázs Úr (TP) * Italian by Milo Casagrande (TP) * Polish by Jakub Bogusz (TP) * Russian by Yuri Kozlov (TP) * Norwegian Bokmaal by Hans Fredrik Nordhaug (TP) [ ISO 639 translations ] * Ukrainian by Yuri Chornoivan (TP) * Catalan by Jordi Mas i Hernàndez (TP) [ ISO 3166-2 translations ] * Polish by Jakub Bogusz (TP)
2015-08-12 | Update to 0.1.4 | ryoon | 2 | -6/+6
Changelog: libodfgen 0.1.4 - drawing interface: do not forget to call startDocument/endDocument when writing in the manifest - metadata: added handler for 'template' metadata, unknown metadata are written in meta:user-defined elements, - defineSheetNumberingStyle: can now define styles for the whole document (and not only for the actual sheet) - update doxygen configuration file + add a make astyle command libodfgen 0.1.3 - Allow writing meta:creation-date metadata element for drawings and presentations too. - Improve handling of headings. Most importantly, write valid ODF. - Write meta:generator metadata element. - Add initial support for embedded fonts. It is currently limited to Flat ODF output. libodfgen 0.1.2 - Use text:h element for headings. Any paragraph with text:outline-level property is recognized as a heading. - Handle layers. - Improve handling of styles. Particularly, do not emit duplicate styles. - Slightly improve documentation. - Handle master pages. - Do not expect that integer properties are always in inches. - Fix misspelled style:paragraph-properties element in presentation notes. - Only export public symbols on Linux. - Fix bogus XML-escaping of metadata values. - And many other improvements and fixes.
2015-08-07 | Recursive revbump associated with lang/ocaml update. | jaapb | 5 | -10/+10
2015-08-07 | Update to 1.25 | wen | 2 | -7/+8
Upstream changes: 1.25 2015.05.25 - Rename test files to have number prefix. - Move test requirements to TEST_REQUIRES or BUILD_REQUIRES for older EUMM - Older versions of EU::MM require quotes around 2-dot versions (CHORNY) 1.24 2015.05.22 - Include the rc files in the distribution to use the proper Perl::Critic configuration. - use Test::Version to make sure we have the same version number in every module. - Configure Perl::Critic to be level 4. - Lots of other refactorings.
2015-08-06 | Update to 0.59: | wiz | 2 | -7/+6
0.59 Mon Jan 26 15:04:10 PST 2015 - PR/23 Better scalar dump heuristics - More closely match YAML.pm - Thanks Matthias Bethke 0.58 Tue Jan 20 21:01:49 PST 2015 - Add a VERSION statement to YAML::LibYAML (issue#8) 0.57 Thu Jan 15 23:05:15 EST 2015 - Applied fix for PR/21. nawglan++ 0.56 Thu Jan 15 22:21:47 EST 2015 - Update copyright year - Use Swim cpan-tail block functions in doc 0.55 Mon Dec 22 17:26:27 PST 2014 - Get YAML::XS using latest libyaml
2015-08-04 | CVE-2015-1283 heap based buffer overflow in expat. | tnn | 3 | -2/+82
Patch via Debian bug#793484 and Mozilla. Bump.
2015-08-02 | Update to 1.11 | szptvlfn | 3 | -10/+9
changelog: Changes in polyparse-1.11 A fix for the Applicative/Monad/Functor classes rearrangement in ghc-7.10.*. Changes in polyparse-1.10 A new basic text-accepting combinator "literal", accepts the given string, without lexing it into Haskell-like words ("isWord"). A more correct implementation of "manyFinally". A new combinator "satisfyMsg" which has a string describing the predicate, in order to produce more informative error messages.
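The two combinators mentioned for polyparse-1.10 can be illustrated with a rough Python analogue. This is not polyparse's Haskell API: the names `literal`, `satisfy_msg` and `ParseError`, and the (value, next offset) return convention, are assumptions made only for this sketch.

```python
class ParseError(Exception):
    """Raised when a parser cannot accept the input at the given offset."""


def literal(expected):
    """Accept exactly the given string, with no word-level lexing involved."""
    def parse(text, pos=0):
        if text.startswith(expected, pos):
            return expected, pos + len(expected)
        raise ParseError("expected %r at offset %d" % (expected, pos))
    return parse


def satisfy_msg(predicate, description):
    """Accept one character matching the predicate.

    The description is only used to make the failure message readable.
    """
    def parse(text, pos=0):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        found = repr(text[pos]) if pos < len(text) else "end of input"
        raise ParseError(
            "expected %s at offset %d, found %s" % (description, pos, found)
        )
    return parse


digit = satisfy_msg(str.isdigit, "a digit")
keyword, after_keyword = literal("let")("let 7")
```

Without the description, a failing predicate can only report that some unnamed test was not satisfied; carrying the string along lets the error say what was expected.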
2015-08-02 | Bump PKGREVISION for hs-text-1.2.1.3 | szptvlfn | 19 | -37/+38
2015-08-01 | Update to 1.12 | wen | 2 | -10/+8
Remove duplicate lines Add LICENSE Upstream changes: 1.12 2015-07-04 NEILB - Added [MetaJSON] to dist.ini, so release will include META.json. RT#105629 from ETHER++ 1.11 2014-06-09 - Set up the usual directory structure - Switched to Dist::Zilla - Added COPYRIGHT AND LICENSE section to pod - Added github repo to pod
2015-07-30 | sort | jnemeth | 1 | -2/+2
2015-07-30 | complete removal of package | jnemeth | 1 | -32/+0
2015-07-26 | + xfce4-dict | youri | 1 | -2/+2
- xfce4-dict-plugin
2015-07-26 | Remove dead plugin. | youri | 4 | -42/+0
2015-07-26 | Import xfce4-dict-0.7.1 as textproc/xfce4-dict. | youri | 4 | -0/+78
This program allows you to search different kinds of dictionary services for words or phrases and shows you the result. Currently you can query a Dict server(RFC 2229), any online dictionary service by opening a web browser or search for words using the aspell/ispell program.
2015-07-25 | Bump PKGREVISION for hs-unordered-containers-0.2.5.1 | szptvlfn | 5 | -10/+10
2015-07-25 | Add a package for Text::Table. From David Gutteridge in PR pkg/50054. | bsiegert | 4 | -1/+29
Text::Table can be used to render plaintext/ASCII-art/Unicode-art tables.
2015-07-23 | Update to 2.300: | wiz | 2 | -7/+6
2.300 (2015-06-17) - update to Unicode 8.0.0
2015-07-21 | Add new package for Text::Aligner. Package from David Gutteridge in | bsiegert | 4 | -1/+26
PR pkg/50053. From DESCR: Text::Aligner provides nicely formatted alignment of text strings.
2015-07-19 | Do not include $Mdocdate$ in the patch, some versions of cvs will change | joerg | 2 | -10/+4
it on checkout, breaking the patchsum.
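Why keyword expansion breaks a recorded patch checksum can be shown with a small Python sketch. It assumes a simplified expansion rule and uses SHA-1 purely as an example digest; the patch content, the date string, and the helper names are hypothetical and do not reproduce the actual pkgsrc patch or checksum tooling.

```python
import hashlib
import re

# Matches both the unexpanded form $Mdocdate$ and an expanded $Mdocdate: ... $.
KEYWORD = re.compile(r"\$Mdocdate(?::[^$]*)?\$")


def expand_keyword(text, value):
    """Mimic, in simplified form, a checkout rewriting the keyword in place."""
    return KEYWORD.sub(lambda match: "$Mdocdate: %s $" % value, text)


def digest(text):
    """Checksum of the patch file contents (SHA-1 chosen for illustration)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical patch body containing the keyword.
patch_as_committed = "+.Dd $Mdocdate$\n+.Dt EXAMPLE 1\n"
patch_after_checkout = expand_keyword(patch_as_committed, "July 19 2015")

recorded = digest(patch_as_committed)
actual = digest(patch_after_checkout)
print("checksum still matches:", recorded == actual)
```

Since the bytes of the file differ after checkout, any checksum recorded at commit time no longer matches, which is why the commit drops the keyword from the patch.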
2015-07-19 | Update to 2.6.2: | wiz | 3 | -17/+16
Python-Markdown 2.3 Release Notes
=================================

We are pleased to release Python-Markdown 2.3 which adds one new extension, removes a few old (obsolete) extensions, and now runs on both Python 2 and Python 3 without running the 2to3 conversion tool. See the list of changes below for details.

Python-Markdown supports Python versions 2.6, 2.7, 3.1, 3.2, and 3.3.

Backwards-incompatible Changes
------------------------------

* Support has been dropped for Python 2.5. No guarantees are made that the library will work in any version of Python lower than 2.6. As all supported Python versions include the ElementTree library, Python-Markdown will no longer try to import a third-party installation of ElementTree.

* All classes are now "new-style" classes. In other words, all classes subclass from 'object'. While this is not likely to affect most users, extension authors may need to make a few minor adjustments to their code.

* "safe_mode" has been further restricted. Markdown formatted links must be of a known white-listed scheme when in "safe_mode" or the URL is discarded. The white-listed schemes are: 'HTTP', 'HTTPS', 'FTP', 'FTPS', 'MAILTO', and 'news'. Schemeless URLs are also permitted, but are checked in other ways - as they have been for some time.

* The ids assigned to footnotes now contain a dash (`-`) rather than a colon (`:`) when `output_format` is set to `"html5"` or `"xhtml5"`. If you are making reference to those ids in your JavaScript or CSS and using the HTML5 output, you will need to update your code accordingly. No changes are necessary if you are outputting XHTML (the default) or HTML4.

* The `force_linenos` configuration setting of the CodeHilite extension has been marked as **Pending Deprecation** and a new setting `linenums` has been added to replace it. See documentation for the [CodeHilite Extension] for an explanation of the new `linenums` setting.
The new setting will honor the old `force_linenos` if it is set, but it will raise a `PendingDeprecationWarning` and will likely be removed in a future version of Python-Markdown.

[CodeHilite Extension]: extensions/codehilite.html

* The "RSS" extension has been removed and no longer ships with Python-Markdown. If you would like to continue using the extension (not recommended), it is archived on [GitHub](https://gist.github.com/waylan/4773365).

* The "HTML Tidy" Extension has been removed and no longer ships with Python-Markdown. If you would like to continue using the extension (not recommended), it is archived on [GitHub](https://gist.github.com/waylan/5152650). Note that the underlying library, uTidylib, is not Python 3 compatible. Instead, it is recommended that the newer [PyTidyLib] (version 0.2.2+ for Python 3 compatibility - install from GitHub not PyPI) be used. As the API for that library is rather simple, it is recommended that the output of Markdown be wrapped in a call to PyTidyLib rather than using an extension (for example: `tidylib.tidy_fragment(markdown.markdown(source), options={...})`).

[PyTidyLib]: http://countergram.com/open-source/pytidylib

What's New in Python-Markdown 2.3
---------------------------------

* The entire code base now universally runs in Python 2 and Python 3 without any need for running the 2to3 conversion tool. This not only simplifies testing, but by using unicode_literals, results in more consistent behavior across Python versions. Additionally, the relative imports (made possible in Python 2 via absolute_import) allows the entire library to more easily be embedded in a sub-directory of another project. The various files within the library will still import each other properly even though 'markdown' may not be in Python's root namespace.

* The [Admonition Extension] has been added, which implements [rST-style][rST] admonitions in the Markdown syntax.
However, be warned that this extension is experimental and the syntax and behavior are still subject to change. Please try it out and report bugs and/or improvements.

[Admonition Extension]: extensions/admonition.html
[rST]: http://docutils.sourceforge.net/docs/ref/rst/directives.html#specific-admonitions

* Various bug fixes have been made. See the [commit log](https://github.com/waylan/Python-Markdown/commits/master) for a complete history of the changes.

Python-Markdown 2.4 Release Notes
=================================

We are pleased to release Python-Markdown 2.4 which adds one new extension and fixes various bugs. See the list of changes below for details.

Python-Markdown supports Python versions 2.6, 2.7, 3.1, 3.2, and 3.3.

Backwards-incompatible Changes
------------------------------

* The `force_linenos` configuration setting of the CodeHilite extension has been marked as **Deprecated**. It had previously been marked as "Pending Deprecation" in version 2.3 when a new setting `linenums` was added to replace it. See documentation for the [CodeHilite Extension] for an explanation of the new `linenums` setting. The new setting will honor the old `force_linenos` if it is set, but `force_linenos` will raise a `DeprecationWarning` and will likely be removed in a future version of Python-Markdown.

[CodeHilite Extension]: extensions/code_hilite.html

* URLs are no longer percent-encoded. This improves compatibility with the original (written in Perl) Markdown implementation. Please percent-encode your URLs manually when needed.

What's New in Python-Markdown 2.4
---------------------------------

* Thanks to the hard work of [Dmitry Shachnev] the [Smarty Extension] has been added, which implements [SmartyPants] using Python-Markdown's Extension API. This offers a few benefits over a third party script. The HTML does not need to be "tokenized" twice, no hacks are required to combine SmartyPants and code highlighting, and we get markdown's escaping feature for free.
Please try it out and report bugs and/or improvements.

[Dmitry Shachnev]: https://github.com/mitya57
[Smarty Extension]: extensions/smarty.html
[SmartyPants]: http://daringfireball.net/projects/smartypants/

* The [Table of Contents Extension] now supports new `permalink` option for creating [Sphinx]-style anchor links.

[Table of Contents Extension]: extensions/toc.html
[Sphinx]: http://sphinx-doc.org/

* It is now possible to enable Markdown formatting inside HTML blocks by appending `markdown=1` to opening tag attributes. See [Markdown Inside HTML Blocks] section for details. Thanks to [ryneeverett] for implementing this feature.

[Markdown Inside HTML Blocks]: extensions/extra.html#nested-markdown-inside-html-blocks
[ryneeverett]: https://github.com/ryneeverett

* The code blocks now support emphasizing some of the code lines. To use this feature, specify `hl_lines` option after language name, for example (using the [Fenced Code Extension]):

    ```.python hl_lines="1 3"
    # This line will be emphasized.
    # This one won't.
    # This one will be also emphasized.
    ```

  Thanks to [A. Jesse Jiryu Davis] for implementing this feature.

[Fenced Code Extension]: extensions/fenced_code_blocks.html
[A. Jesse Jiryu Davis]: https://github.com/ajdavis

* Various bug fixes have been made. See the [commit log](https://github.com/waylan/Python-Markdown/commits/master) for a complete history of the changes.

Python-Markdown 2.5 Release Notes
=================================

We are pleased to release Python-Markdown 2.5 which adds a few new features and fixes various bugs. See the list of changes below for details.

Python-Markdown version 2.5 supports Python versions 2.7, 3.2, 3.3, and 3.4.

Backwards-incompatible Changes
------------------------------

* Python-Markdown no longer supports Python version 2.6. You must be using Python versions 2.7, 3.2, 3.3, or 3.4.
    [importlib]: https://pypi.python.org/pypi/importlib

* The `force_linenos` configuration key on the [CodeHilite Extension] has been **deprecated** and will raise a `KeyError` if provided. In the previous release (2.4), it was issuing a `DeprecationWarning`. The [`linenums`][linenums] keyword should be used instead, which provides more control of the output.

    [CodeHilite Extension]: extensions/code_hilite.html
    [linenums]: extensions/code_hilite.html#usage

* Both `safe_mode` and the associated `html_replacement_text` keywords will be deprecated in version 2.6 and will raise a **`PendingDeprecationWarning`** in 2.5. The so-called "safe mode" was never actually "safe" which has resulted in many people having a false sense of security when using it. As an alternative, the developers of Python-Markdown recommend that any untrusted content be passed through an HTML sanitizer (like [Bleach]) after being converted to HTML by markdown.

    If your code previously looked like this:

        html = markdown.markdown(text, safe_mode=True)

    Then it is recommended that you change your code to read something like this:

        import bleach
        html = bleach.clean(markdown.markdown(text))

    If you are not interested in sanitizing untrusted text, but simply desire to escape raw HTML, then that can be accomplished through an extension which removes HTML parsing:

        import markdown
        from markdown.extensions import Extension

        class EscapeHtml(Extension):
            def extendMarkdown(self, md, md_globals):
                # Remove the processors that recognize raw HTML.
                del md.preprocessors['html_block']
                del md.inlinePatterns['html']

        html = markdown.markdown(text, extensions=[EscapeHtml()])

    Because the HTML is not parsed with the above extension, the serializer will escape the raw HTML, which is exactly what happens now when `safe_mode="escape"`.

    [Bleach]: http://bleach.readthedocs.org/

* Positional arguments on the `markdown.Markdown()` class are pending deprecation as are all except the `text` argument on the `markdown.markdown()` wrapper function. Only keyword arguments should be used.
    For example, if your code previously looked like this:

        html = markdown.markdown(text, ['extra'])

    Then it is recommended that you change it to read something like this:

        html = markdown.markdown(text, extensions=['extra'])

    !!! Note
        This change is being made as a result of deprecating `"safe_mode"` as the `safe_mode` argument was one of the positional arguments. When that argument is removed, the two arguments following it will no longer be at the correct position. It is recommended that you always use keywords when they are supported for this reason.

* In previous versions of Python-Markdown, the built-in extensions received special status and did not require the full path to be provided. Additionally, third party extensions whose name started with `"mdx_"` received the same special treatment. This behavior will be deprecated in version 2.6 and will raise a **`PendingDeprecationWarning`** in 2.5. Ensure that you always use the full path to your extensions.

    For example, if you previously did the following:

        markdown.markdown(text, extensions=['extra'])

    You should change your code to the following:

        markdown.markdown(text, extensions=['markdown.extensions.extra'])

    The same applies to the command line:

        $ python -m markdown -x markdown.extensions.extra input.txt

    See the [documentation](reference.html#extensions) for a full explanation of the current behavior.

* The previously documented method of appending the extension configuration as a string to the extension name will be deprecated in Python-Markdown version 2.6 and will raise a **`PendingDeprecationWarning`** in 2.5. The [`extension_configs`](reference.html#extension_configs) keyword should be used instead. See the [documentation](reference.html#extension-configs) for a full explanation of the current behavior.
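The move from configuration embedded in the extension name to the `extension_configs` keyword can be sketched as a small migration helper. This is only a sketch, assuming the legacy form is `name(key=value,key2=value2)`; the names `split_legacy_name` and `migrate` are hypothetical and not part of Python-Markdown, and every value is left as a string.

```python
def split_legacy_name(spec):
    """Split 'name(key=value,...)' into (name, {key: value}).

    Hypothetical helper: it assumes the legacy string form described
    above and does no type conversion on the values.
    """
    name, paren, rest = spec.partition("(")
    if not paren:
        # No embedded configuration at all.
        return spec, {}
    if not rest.endswith(")"):
        raise ValueError("unbalanced parentheses in %r" % spec)
    configs = {}
    for pair in rest[:-1].split(","):
        if not pair.strip():
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = pair.partition("=")
        configs[key.strip()] = value.strip()
    return name.strip(), configs


def migrate(specs):
    """Return (extensions, extension_configs) for a list of legacy names."""
    extensions = []
    extension_configs = {}
    for spec in specs:
        name, configs = split_legacy_name(spec)
        extensions.append(name)
        if configs:
            extension_configs[name] = configs
    return extensions, extension_configs
```

The resulting pair could then be passed along as `markdown.markdown(text, extensions=extensions, extension_configs=extension_configs)`.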
What's New in Python-Markdown 2.5
---------------------------------

* The [Smarty Extension] has had a number of additional configuration settings added, which allows one to define their own substitutions to better support languages other than English. Thanks to [Martin Altmayer] for implementing this feature.

    [Smarty Extension]: extensions/smarty.html
    [Martin Altmayer]: https://github.com/MartinAltmayer

* Named Extensions (strings passed to the [`extensions`][ex] keyword of `markdown.Markdown`) can now point to any module and/or Class on your PYTHONPATH. While dot notation was previously supported, a module could not be at the root of your PYTHONPATH. The name had to contain at least one dot (requiring it to be a sub-module). This restriction no longer exists.

    Additionally, a Class may be specified in the name. The class must be at the end of the name (which uses dot notation from PYTHONPATH) and be separated by a colon from the module.

    Therefore, if you were to import the class like this:

        from path.to.module import SomeExtensionClass

    Then the named extension would comprise this string:

        "path.to.module:SomeExtensionClass"

    This allows multiple extensions to be implemented within the same module and still be accessible when the user is not able to import the extension directly (perhaps from a template filter or the command line).

    This also means that extension modules are no longer required to include the `makeExtension` function which returns an instance of the extension class. However, if the user does not specify the class name (she only provides `"path.to.module"`) the extension will fail to load without the `makeExtension` function included in the module. Extension authors will want to document carefully what is required to load their extensions.

    [ex]: reference.html#extensions

* The Extension Configuration code has been refactored to make it a little easier for extension authors to work with configuration settings.
As a result, the [`extension_configs`][ec] keyword now accepts a dictionary rather than requiring a list of tuples. A list of tuples is still supported so no one needs to change their existing code. This should also simplify the learning curve for new users.

    Extension authors are encouraged to review the new methods available on the `markdown.extensions.Extension` class for handling configuration and adjust their code going forward. The included extensions provide a model for best practices. See the [API] documentation for a full explanation.

    [ec]: reference.html#extension_configs
    [API]: extensions/api.html#configsettings

* The [Command Line Interface][cli] now accepts a `--extensions_config` (or `-c`) option which accepts a file name and passes the parsed content of a [YAML] or [JSON] file to the [`extension_configs`][ec] keyword of the `markdown.Markdown` class. The contents of the YAML or JSON must map to a Python Dictionary which matches the format required by the `extension_configs` keyword. Note that [PyYAML] is required to parse YAML files.

    [cli]: cli.html#using-extensions
    [YAML]: http://yaml.org/
    [JSON]: http://json.org/
    [PyYAML]: http://pyyaml.org/

* The [admonition extension][ae] is no longer considered "experimental."

    [ae]: extensions/admonition.html

* There have been various refactors of the testing framework. While those changes will not directly affect end users, the code is being better tested which will benefit everyone.

* Various bug fixes have been made. See the [commit log](https://github.com/waylan/Python-Markdown/commits/master) for a complete history of the changes.

Python-Markdown 2.6 Release Notes
=================================

We are pleased to release Python-Markdown 2.6 which adds a few new features and fixes various bugs. See the list of changes below for details.

Python-Markdown version 2.6 supports Python versions 2.7, 3.2, 3.3, and 3.4 as well as PyPy.
Backwards-incompatible Changes
------------------------------

### `safe_mode` Deprecated

Both `safe_mode` and the associated `html_replacement_text` keywords are deprecated in version 2.6 and will raise a **`DeprecationWarning`**. The `safe_mode` and `html_replacement_text` keywords will be ignored in version 2.7. The so-called "safe mode" was never actually "safe" which has resulted in many people having a false sense of security when using it. As an alternative, the developers of Python-Markdown recommend that any untrusted content be passed through an HTML sanitizer (like [Bleach]) after being converted to HTML by markdown.

If your code previously looked like this:

    html = markdown.markdown(text, safe_mode=True)

Then it is recommended that you change your code to read something like this:

    import bleach
    html = bleach.clean(markdown.markdown(text))

If you are not interested in sanitizing untrusted text, but simply desire to escape raw HTML, then that can be accomplished through an extension which removes HTML parsing:

    import markdown
    from markdown.extensions import Extension

    class EscapeHtml(Extension):
        def extendMarkdown(self, md, md_globals):
            # Remove the processors that recognize raw HTML.
            del md.preprocessors['html_block']
            del md.inlinePatterns['html']

    html = markdown.markdown(text, extensions=[EscapeHtml()])

Because the HTML is not parsed with the above extension, the serializer will escape the raw HTML, which is exactly what happens now when `safe_mode="escape"`.

[Bleach]: http://bleach.readthedocs.org/

### Positional Arguments Deprecated

Positional arguments on the `markdown.Markdown()` class are deprecated as are all except the `text` argument on the `markdown.markdown()` wrapper function. Using positional arguments will raise a **`DeprecationWarning`** in 2.6 and an error in version 2.7. Only keyword arguments should be used.
For example, if your code previously looked like this:

    html = markdown.markdown(text, [SomeExtension()])

Then it is recommended that you change it to read something like this:

    html = markdown.markdown(text, extensions=[SomeExtension()])

!!! Note
    This change is being made as a result of deprecating `"safe_mode"` as the `safe_mode` argument was one of the positional arguments. When that argument is removed, the two arguments following it will no longer be at the correct position. It is recommended that you always use keywords when they are supported for this reason.

### "Shortened" Extension Names Deprecated

In previous versions of Python-Markdown, the built-in extensions received special status and did not require the full path to be provided. Additionally, third party extensions whose name started with `"mdx_"` received the same special treatment. This behavior is deprecated and will raise a **`DeprecationWarning`** in version 2.6 and an error in 2.7. Ensure that you always use the full path to your extensions.

For example, if you previously did the following:

    markdown.markdown(text, extensions=['extra'])

You should change your code to the following:

    markdown.markdown(text, extensions=['markdown.extensions.extra'])

The same applies to the command line:

    $ python -m markdown -x markdown.extensions.extra input.txt

Similarly, if you have used a third party extension (for example `mdx_math`), previously you might have called it like this:

    markdown.markdown(text, extensions=['math'])

As the `"mdx_"` prefix will no longer be prepended, you will need to change your code as follows (assuming the file `mdx_math.py` is installed at the root of your PYTHONPATH):

    markdown.markdown(text, extensions=['mdx_math'])

Extension authors will want to update their documentation to reflect the new behavior. See the [documentation](reference.html#extensions) for a full explanation of the current behavior.
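The renaming rule above can be sketched as a helper that expands legacy short names before they are handed to `markdown.markdown`. This is a minimal sketch under stated assumptions: `full_extension_name` and `BUILTIN_EXTENSIONS` are hypothetical names, the set lists only a few built-in extensions for illustration, and any other short name is assumed to be a third party `mdx_` module.

```python
# Illustrative subset only (an assumption of this sketch); extend it with
# the built-in extensions your project actually uses.
BUILTIN_EXTENSIONS = frozenset(
    ["admonition", "codehilite", "extra", "smarty", "toc"]
)


def full_extension_name(name, builtin=BUILTIN_EXTENSIONS):
    """Return the full name for a legacy short extension name.

    Hypothetical helper, not part of Python-Markdown.
    """
    # Dotted paths, "module:Class" names and explicit mdx_ modules
    # are already in the new style and are returned unchanged.
    if "." in name or ":" in name or name.startswith("mdx_"):
        return name
    if name in builtin:
        return "markdown.extensions." + name
    # Assumption: any other short name referred to a third party
    # module installed as mdx_<name>.py.
    return "mdx_" + name
```

A list of old names could then be converted with `[full_extension_name(n) for n in names]` before being passed to the `extensions` keyword.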
### Extension Configuration as Part of Extension Name Deprecated

The previously documented method of appending the extension configuration options as a string to the extension name is deprecated and will raise a **`DeprecationWarning`** in version 2.6 and an error in 2.7. The [`extension_configs`](reference.html#extension_configs) keyword should be used instead. See the [documentation](reference.html#extension-configs) for a full explanation of the current behavior.

### HeaderId Extension Pending Deprecation

The [HeaderId][hid] Extension is pending deprecation and will raise a **`PendingDeprecationWarning`** in version 2.6. The extension will be deprecated in version 2.7 and raise an error in version 2.8. Use the [Table of Contents][TOC] Extension instead, which offers most of the features of the HeaderId Extension and more (support for meta data is missing).

Extension authors who have been using the `slugify` and `unique` functions defined in the HeaderId Extension should note that those functions are now defined in the Table of Contents extension and should adjust their import statements accordingly (`from markdown.extensions.toc import slugify, unique`).

[hid]: extensions/header_id.html

### The `configs` Keyword is Deprecated

Positional arguments and the `configs` keyword on the `markdown.extensions.Extension` class (and its subclasses) are deprecated. Each individual configuration option should be passed to the class as a keyword/value pair. For example, one might have previously instantiated an extension subclass like this:

    ext = SomeExtension(configs={'somekey': 'somevalue'})

That code should be updated to pass in the options directly:

    ext = SomeExtension(somekey='somevalue')

Extension authors will want to note that this affects the `makeExtension` function as well.
Previously it was common for the function to be defined as follows:

    def makeExtension(configs=None):
        return SomeExtension(configs=configs)

Extension authors will want to update their code to the following instead:

    def makeExtension(**kwargs):
        return SomeExtension(**kwargs)

Failing to do so will result in a **`DeprecationWarning`** and will raise an error in the next release. See the [Extension API][mext] documentation for more information.

In the event that a `markdown.extensions.Extension` subclass overrides the `__init__` method and implements its own configuration handling, then the above may not apply. However, it is recommended that the subclass still calls the parent `__init__` method to handle configuration options like so:

    import markdown

    class SomeExtension(markdown.extensions.Extension):
        def __init__(self, **kwargs):
            # Do pre-config stuff here
            # Set config defaults
            self.config = {
                'option1': ['value1', 'description1'],
                'option2': ['value2', 'description2']
            }
            # Set user defined configs
            super(SomeExtension, self).__init__(**kwargs)
            # Do post-config stuff here

Note the call to `super` to get the benefits of configuration handling from the parent class. See the [documentation][config] for more information.

[config]: extensions/api.html#configsettings
[mext]: extensions/api.html#makeextension

What's New in Python-Markdown 2.6
---------------------------------

### Official Support for PyPy

Official support for [PyPy] has been added. While Python-Markdown has most likely worked on PyPy for some time, it is now officially supported and tested on PyPy.

[PyPy]: http://pypy.org/

### YAML Style Meta-Data

The [Meta-Data] Extension now includes optional support for [YAML] style meta-data. By default, the YAML delimiters are recognized; however, the actual data is parsed as previously. This follows the syntax of [MultiMarkdown], which inspired this extension.
<del>Alternatively, if the `yaml` option is set, then the data is parsed as YAML.</del> <ins>As the `yaml` option was buggy, it was removed in 2.6.1. It is suggested that a third party extension be used if you want true YAML support. See [Issue #390][#390] for a full explanation.</ins>

[MultiMarkdown]: http://fletcherpenney.net/MultiMarkdown_Syntax_Guide#metadata
[Meta-Data]: extensions/meta_data.html
[YAML]: http://yaml.org/
[#390]: https://github.com/waylan/Python-Markdown/issues/390

### Table of Contents Extension Refactored

The [Table of Contents][TOC] Extension has been refactored and some new features have been added. See the documentation for a full explanation of each feature listed below:

* The extension now assigns the Table of Contents to the `toc` attribute of the Markdown class regardless of whether a "marker" was found in the document. Third party frameworks no longer need to insert a "marker," run the document through Markdown, then extract the Table of Contents from the document.

* The Table of Contents Extension is now a "registered extension." Therefore, when the `reset` method of the Markdown class is called, the `toc` attribute on the Markdown class is cleared (set to an empty string).

* When the `marker` configuration option is set to an empty string, the parser completely skips the process of searching the document for markers. This should save parsing time when the Table of Contents Extension is being used only to assign ids to headers.

* A `separator` configuration option has been added allowing users to override the separator character used by the slugify function.

* A `baselevel` configuration option has been added allowing users to set the base level of headers in their documents (h1-h6). This allows the header levels to be automatically adjusted to fit within the hierarchy of an HTML template.

[TOC]: extensions/toc.html

### Pygments can now be disabled

The [CodeHilite][ch] Extension has gained a new configuration option: `use_pygments`.
The option is `True` by default; however, it allows one to turn off Pygments code highlighting (set to `False`) while preserving the language detection features of the extension. Note that Pygments language guessing is not used as that would 'use Pygments'. If a language is defined for a code block, it will be assigned to the `<code>` tag as a class in the manner suggested by the [HTML5 spec][spec] (alternate output will not be entertained) and could potentially be used by a JavaScript library in the browser to highlight the code block.

[ch]: extensions/code_hilite.html
[spec]: http://www.w3.org/TR/html5/text-level-semantics.html#the-code-element

### Miscellaneous

Test coverage has been improved including running [flake8]. While those changes will not directly affect end users, the code is being better tested which will benefit everyone.

[flake8]: http://flake8.readthedocs.org/en/latest/

Various bug fixes have been made. See the [commit log](https://github.com/waylan/Python-Markdown/commits/master) for a complete history of the changes.
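The 2.6 options named above (`marker`, `separator`, `baselevel`, `use_pygments`) can be collected in one dictionary in the shape the `extension_configs` keyword expects, and serialized to JSON for the `-c` command line option described in the 2.5 notes. This is a hedged sketch: the function names are hypothetical and the option values are arbitrary examples, not recommendations.

```python
import json


def build_extension_configs():
    """Return example settings keyed by full extension name.

    The values are illustrative assumptions only.
    """
    return {
        "markdown.extensions.toc": {
            "marker": "",       # skip the search for a marker in the document
            "separator": "_",   # separator character used by slugify
            "baselevel": 2,     # start document headers at h2
        },
        "markdown.extensions.codehilite": {
            "use_pygments": False,  # keep language classes, skip Pygments
        },
    }


def to_json(configs=None):
    """Serialize the configuration to a JSON document."""
    if configs is None:
        configs = build_extension_configs()
    return json.dumps(configs, indent=2, sort_keys=True)
```

The dictionary can be passed directly as `extension_configs=build_extension_configs()`, or the output of `to_json()` can be saved to a file and supplied to the command line interface with `-c`.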
2015-07-15Bump PKGREVISION for poppler shlib major bump.wiz1-2/+2
2015-07-14Update to 2015.6.21:wiz2-6/+6
2015.6.21
=========
----

* Fix #31: HTML entities stay inside link.
* Fix #71: Coverage detects command line tests.
* Fix #39: Documentation update.
* Fix #61: Functionality added for optional use of automatic links.
* Feature #80: ``title`` attribute is preserved in both inline and reference links.
* Feature #82: More command line options. See docs.

2015.6.12
=========
----

* Feature #76: Making ``pre`` blocks clearer for further automatic formatting.
* Fix #71: Coverage detects tests carried out in ``subprocesses``

2015.6.6
========
----

* Fix #24: ``3.200.3`` vs ``2014.7.3`` output quirks.
* Fix #61. Malformed links in markdown output.
* Feature #62: Automatic version number.
* Fix #63: Nested code, anchor bug.
* Fix #64: Proper handling of anchors with content that starts with tags.
* Feature #67: Documentation all over the module.
* Feature #70: Adding tests for the module.
* Fix #73: Typo in config documentation.
2015-07-13+ py-pyphenkleink1-1/+2
2015-07-13Import Pyphen-0.9.1 as textproc/py-pyphen.kleink6-0/+141
Pyphen is a pure Python module to hyphenate text using existing Hunspell hyphenation dictionaries.
2015-07-12Comment out dependencies of the stylewiz35-77/+77
{perl>=5.16.6,p5-ExtUtils-ParseXS>=3.15}:../../devel/p5-ExtUtils-ParseXS since pkgsrc enforces the newest perl version anyway, so they should always pick perl, but sometimes (pkg_add) don't due to the design of the {,} syntax. No effective change for the above reason. Ok joerg
2015-07-12Fix MASTER_SITES.wiz1-3/+2
2015-07-12Escape braces in intltool-update. This is evident when using the --versionrodent2-1/+49
option. The programme emits deprecation warnings which break package builds which depend on that output being sane.
2015-07-11Update to 3.3.4wen2-7/+6
Upstream changes: 3.3.4 2015-03-24 23:21:57+0900 - Fix typos in document - Introduce $Text::Xslate::DEFAULT_CACHE_DIR
2015-07-09Update to 0.9.6:wiz2-6/+6
New in 0.9.6: * The data tables and line breaking algorithm have been updated to Unicode version 8.0.0.
2015-07-09Various fixes:jperkin11-11/+171
- Use nbcompat correctly.
- Support newer zlib API.
- Handle catpages correctly.

Fixes build on SunOS at least.
2015-07-09fix typo (thanks dholland@)richard1-2/+2
2015-07-08Add docbook-xml and docbook-xsl to avoid nonet load failures as well asrichard2-4/+10
add to xsltproc-nonet.mk a variable XSLTPROC_PATH allowing packages to specify where to find locally files such as dtds, avoiding warnings like 'warning: failed to load external entity'. At the same time add a BUILD_DEPENDS to libxslt for xsltproc-nonet.mk and bump PKGREVISION.
2015-07-05Update to 3.59:wiz2-6/+6
iso-codes 3.59
--------------
Dr. Tobias Quathamer <toddy@debian.org>
Wed, 1 Jul 2015

[ ISO 639 translations ]
* Turkish by Volkan Gezer (TP)
* Ukrainian by Yuri Chornoivan (TP)
* Russian by Dmitry Sivachenko (TP)

[ ISO 3166 translations ]
* French by Christian Perrier
* German by Dr. Tobias Quathamer
* Thai by Theppitak Karoonboonyanan
* Belarusian by Viktar Siarheichyk. Closes: #789278

[ ISO 639-3 translations ]
* Ukrainian by Yuri Chornoivan (TP)
* Dutch by Freek de Kruijf (TP)
* Russian by Dmitry Sivachenko (TP)
* Turkish by Volkan Gezer (TP)

[ ISO 639-5 translations ]
* Dutch by Freek de Kruijf (TP)

[ ISO 15924 translations ]
* Ukrainian by Yuri Chornoivan (TP)

[ ISO 4217 translations ]
* Russian by Dmitry Sivachenko (TP)