path: root/textproc
Age | Commit message | Author | Files | Lines
2021-02-21 | py-yaml: needs py-cython | adam | 1 | -1/+3
2021-02-21 | py-yaml: updated to 5.4.1 | adam | 3 | -11/+17
5.4.1 * Fix stub compat with older pyyaml versions that may unwittingly load it 5.4 * Build modernization, remove distutils, fix metadata, build wheels, CI to GHA * Fix for CVE-2020-14343, moves arbitrary python tags to UnsafeLoader * Fix memory leak in implicit resolver setup * Fix py2 copy support for timezone objects * Fix compatibility with Jython
2021-02-21 | lowdown: update to 0.8.2. | fcambus | 3 | -8/+9
Version 0.8.2, 2021-02-19 Fix tables as processed by the difference engine. Tables are now fully opaque, which means that any changes will result in the deletion and re-addition of the table. This isn't a good fix, but it does mean that any tables run through the difference engine will be sane for output. Fix metadata to also be properly handled by both the difference engine and conforming front-ends. This is a bit unusual since metadata is both processed during parse and also affects document output, such as in document title. For now use the same rule that front-ends with metadata differences affecting document layout (e.g., title) use the new form, if changed. Lastly, fix footnote reference. When they're emitted in the new document, the reference definitions are re-ordered in the correct way to allow -Tms and such to work properly. While here, make sure that all printed footnote numbers start at one and colours are properly represented in output. Split lowdown(1) into lowdown-diff(1) for easier reading. Properly render tables for -Tgemini as fixed-width displays. By default, render Gemini link labels using "Excel" format (hexavigesimal) with the option of using Roman numerals (--gemini-link-roman) or without labels at all (--gemini-link-noref). This choice of default may change in later versions, however.
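The "Excel" (hexavigesimal) link labels mentioned above can be read as bijective base-26 numbering, where 1 maps to a, 26 to z and 27 to aa. The following is a minimal sketch of that numbering only, not lowdown's implementation: the function name `hexavigesimal` and the use of lowercase letters are assumptions made for illustration.

```python
def hexavigesimal(n: int) -> str:
    """Return the spreadsheet-style label for a 1-based index (hypothetical helper)."""
    if n < 1:
        raise ValueError("labels start at 1")
    letters = []
    while n > 0:
        # Bijective base 26 has no zero digit, so shift by one before dividing.
        n, rem = divmod(n - 1, 26)
        letters.append(chr(ord("a") + rem))
    # Digits were produced least-significant first.
    return "".join(reversed(letters))


if __name__ == "__main__":
    for index in (1, 26, 27, 52, 703):
        print(index, hexavigesimal(index))
```

Unlike ordinary base 26, this scheme has no zero digit, which is why the label after z is aa rather than ba.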
2021-02-21 | textproc: +upmendex | markd | 1 | -1/+2
2021-02-21 | upmendex: Add version 0.54 | markd | 4 | -0/+42
From Atsushi Toyokura in pkgsrc-wip. upmendex is a multilingual index processor with the following features: * Mostly compatible with makeindex and upper compatible with mendex, based on mendex version 2.6f by ASCII media works. * Unicode for internal process and support UTF-8 encoding for input/output. Will work with upLaTeX, XeLaTeX and luaLaTeX. * Support Latin (including non-English), Greek, Cyrillic, Korean Hangul and Han (Hanzi ideographs) scripts. * Apply International Components for Unicode (ICU) for sorting process.
2021-02-20 | libpinyin: Update to 2.6.0 | ryoon | 6 | -48/+24
* Enable libzhuyin. Changelog: version 2.6.0 * bug fixes version 2.4.92 * update pinyin data * bug fixes version 2.4.91 * improve full pinyin auto correction * bug fixes version 2.3.0 * update pinyin data version 2.2.2 * minor fixes version 2.2.1 * fixes predicted candidates version 2.2.0 * bug fixes version 2.1.91 * fixes zhuyin parsers; version 2.1.0 * support sort option in pinyin_guess_candidates function; version 2.0.92 * reduce memory consumption after imported user dictionary; version 2.0.91 * merge libzhuyin code; version 2.0.0 * the first official release of 2.0.x; * fixes autoconf; version 1.9.92 * fixes crash in double pinyin; version 1.9.91 * multiple sentence candidates; version 1.7.0 * fixes build on FreeBSD; * update cmake files; version 1.6.91 * change license to GPLv3+; * import open-gram dictionary and remove pinyin tones; * add some checks when load data from file; version 1.6.0 * bug fixes. version 1.5.91 * change pinyin/phrase tables to use dbm. * enhance pinyin key representation and pinyin parsers. version 1.2.0 * bug fixes. version 1.1.91 * support Kyoto Cabinet as alternative to Berkeley DB. * improve multiple dictionaries support feature. version 1.1.0 * support to export user phrases.
2021-02-19 | gnome-doc-utils: resolve remaining pkglint warnings | nia | 1 | -2/+2
2021-02-19 | gnome-doc-utils: add Python 3 support, based on Fedora patches | nia | 10 | -10/+362
bump PKGREVISION
2021-02-18 | inih: update to 53 | nia | 2 | -7/+7
Updates to Meson config: meson: optionally depend on C++ meson: enable distro settings by default meson: add static compile args to inih_dep
2021-02-18 | (*/hs-*) BUILDLINK_API_DEPENDS.ghc <8.10, again | mef | 5 | -10/+10
2021-02-18 | libstemmer: update to 2.1.0. | wiz | 3 | -27/+14
Snowball 2.1.0 (2021-01-21) =========================== C/C++ ----- * Fix decoding of 4-byte UTF-8 sequences in `grouping` checks. This bug affected Unicode codepoints U+40000 to U+7FFFF and U+C0000 to U+FFFFF and doesn't affect any of the stemming algorithms we currently ship (#138, reported by Stephane Carrez). Python ------ * Fix snowballstemmer.algorithms() method (#132, reported by kkaiser). * Update code to generate trove language classifiers for PyPI. All the natural languages we previously had stemmers for have now been added to PyPI's list, but Armenian and Yiddish aren't on it. Patch from Dmitry Shachnev. Java ---- Code Quality Improvements ------------------------- * Suppress GCC warning in compiler code. * Use `const` pointers more in C runtime. * Only use spaces for indentation in javascript code. Change proposed by Emily Marigold Klassen in #123, and seems to be the modern Javascript norm. New Code Generators ------------------- * Add Ada generator from Stephane Carrez (#135). New Snowball Language Features ------------------------------ * `lenof` and `sizeof` can now be applied to a literal string, which can be useful if you want to do calculations on cursor values. This change actually simplifies the language a little, since you can now use a literal string in any read-only context which accepts a string variable. Code generation improvements ---------------------------- * General: + Fix bugs in the code generated to handle failure of `goto`, `gopast` or `try` inside `setlimit` or string-`$`. This affected all languages (though the issue with `try` wasn't present for C). These bugs don't affect any of the stemming algorithms we currently ship. Reported by Stefan Petkovic on snowball-discuss. + Change `hop` with a negative argument to work as documented. The manual says a negative argument to hop will raise signal f, but the implementation for all languages was actually to move the cursor in the opposite direction to `hop` with a positive argument. 
The implemented behaviour is problematic as it allows invalidating implicitly saved cursor values by modifying the string outside the current region, so we've decided it's best to fix the implementation to match the documentation. The only Snowball code we're aware of which relies on this was the original version of the new Yiddish stemming algorithm, which has been updated not to rely on this. The compiler now issues a warning for `hop` with a constant negative argument (internally now converted to `false`), and for `hop` with a constant zero argument (internally now converted to `true`). + Canonicalise `among` actions equivalent to `()` such as `(true)` which previously resulted in an extra case in the among, and for Python we'd generate invalid Python code (`if` or `elif` with an empty body). Bug revealed by Assaf Urieli's Yiddish stemmer in #137. + Eliminate variables whose values are never used - they no longer have corresponding member variables, etc, and no code is generated for any assignments to them. + Don't generate anything for an unused `grouping`. + Stop warning "grouping X defined but not used" for a `grouping` which is only used to define another `grouping`. * C/C++: + Store booleans in same array as integers. This means each boolean is stored as an int instead of an unsigned char which means 4 bytes instead of 1, but we save a pointer (4 or 8 bytes) in struct SN_env which is a win for all the current stemmers. For an algorithm which uses both integers and booleans, we also save the overhead of allocating a block on the heap, and potentially improve data locality. + Eliminate duplicate generated C comment for sliceto. * Pascal: + Avoid generating unused variables. The Pascal code generated for the stemmers we ship is now warning free (tested with fpc 3.2.0). * Python: + End `if`-chain with `else` where possible, avoiding a redundant test of the variable being switched on. This optimisation kicks in for an `among` where all cases have commands.
This change seems to speed up `make check_python_arabic` by a few percent. New stemming algorithms ----------------------- * Add Serbian stemmer from stef4np (#113). * Add Yiddish stemmer from Assaf Urieli (#137). * Add Armenian stemmer from Astghik Mkrtchyan. It's been on the website for over a decade, and included in Xapian for over 9 years without any negative feedback. Behavioural changes to existing algorithms ------------------------------------------ Optimisations to existing algorithms ------------------------------------ * kraaij_pohlmann: Use `$v = limit` instead of `do (tolimit setmark v)` since this generates simpler code, and also matches the code other algorithm implementations use. Probably for languages like C with optimising compilers the compiler will generate equivalent code anyway, but e.g. for Python this should be an improvement. Code clarity improvements to existing algorithms ------------------------------------------------ * hindi.sbl: Fix comment typo. Compiler -------- * Don't count `$x = x + 1` as initialising or using `x`, so it's now handled like `$x += 1` already is. * Comments are now only included in the generated code if command line option -comments is specified. The comments in the generated code are useful if you're trying to debug the compiler, and perhaps also if you are trying to debug your Snowball code, but for everyone else they just bloat the code which as the number of languages we support grows becomes more of an issue. * `-parentclassname` is not only for java and csharp so don't disable it if those backends are disabled. * `-syntax` now reports the value for each numeric literal. * Report location for excessive get nesting error. * Internally the compiler now represents negated literal numbers as a simple `c_number` rather than `c_neg` applied to a `c_number` with a positive value. This simplifies optimisations that want to check for a constant numeric expression.
Build system ------------ * Link binaries with LDFLAGS if it's set, which is needed for some platforms (e.g. OpenEmbedded). Patch from Andreas Müller (#120). * Add missing dependencies of algorithms.go rule. Testsuite --------- * C: Add stemtest for low-level regression tests. Documentation ------------- * Document a C99 compiler as a requirement for building the snowball compiler (but the C code it generates should still work with any ISO C compiler.) A few declarations mixed with code crept in some time ago (which nobody's complained about), so this is really just formally documenting a requirement which already existed. * README: Explain what Snowball is and what Stemming is (#131, reported by Sean Kelly). * CONTRIBUTING.rst: Expand section on adding a new generator. * For Python snowballstemmer module include global NEWS instead of Python-specific CHANGES.rst and use README.rst as the long description. Patch from Dmitry Shachnev (#119). * COPYING: Update and incorporate Python backend licensing information which was previously in a separate file.
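The C/C++ item in the Snowball 2.1.0 notes above concerns decoding 4-byte UTF-8 sequences for codepoints U+40000 to U+7FFFF and U+C0000 to U+FFFFF. As a rough illustration of what correct decoding looks like, here is a sketch in Python rather than the project's C runtime; the function name `decode_utf8_4byte` is hypothetical, and the sketch does not reproduce or diagnose the original bug.

```python
def decode_utf8_4byte(seq: bytes) -> int:
    """Decode one 4-byte UTF-8 sequence to a codepoint (illustrative helper).

    Overlong forms and values above U+10FFFF are not rejected here.
    """
    if len(seq) != 4 or (seq[0] & 0xF8) != 0xF0:
        raise ValueError("expected a 4-byte sequence with lead byte 11110xxx")
    if any((b & 0xC0) != 0x80 for b in seq[1:]):
        raise ValueError("continuation bytes must look like 10xxxxxx")
    # 3 payload bits from the lead byte, then 6 bits from each continuation byte.
    return (
        ((seq[0] & 0x07) << 18)
        | ((seq[1] & 0x3F) << 12)
        | ((seq[2] & 0x3F) << 6)
        | (seq[3] & 0x3F)
    )


if __name__ == "__main__":
    for cp in (0x40000, 0x7FFFF, 0xC0000, 0xFFFFF):
        encoded = chr(cp).encode("utf-8")
        print(f"U+{cp:05X} -> {encoded.hex()} -> U+{decode_utf8_4byte(encoded):05X}")
```

Both affected ranges have bit 18 of the codepoint set, which ends up in the lowest payload bit of the lead byte (lead bytes 0xF1 and 0xF3).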
2021-02-18 | asciidoc: update to 9.1.0. | wiz | 2 | -8/+9
Version 9.1.0 (2021-02-08) -------------------------- .Features - Can specify a line range when using the `include` macro. - Setting the `SGML_CATALOG_FILES` environment variable will set `--catalogs` on xmllint within a2x.
2021-02-16 | py-dominate: updated to 2.6.0 | adam | 2 | -9/+9
2.6.0: Add get_current() to return the current active element in a with context.
2021-02-15 | cldr-emoji-annotation: Add buildlink3.mk for pkg-config file | ryoon | 1 | -0/+14
This is required by upcoming inputmethod/fcitx5.
2021-02-15 | tex-xindy{,-doc}: update to 2.5.1.55330 | markd | 4 | -19/+16
changes unknown
2021-02-15 | tex-latexdiff{,-doc}: update to 1.3.1.1 | markd | 4 | -20/+16
changes unknown
2021-02-14 | py-phonenumbers: updated to 8.12.18 | adam | 2 | -7/+7
8.12.18: Unknown changes
2021-02-14 | py-xmlschema: updated to 1.5.1 | adam | 3 | -11/+10
v1.5.1 * Optimize NamespaceView read-only mapping * Add experimental XML data bindings with a DataBindingConverter * Add experimental PythonGenerator for static codegen with Jinja2
2021-02-14 | Mark these packages Ruby 3.0 incompatible | taca | 1 | -1/+3
2021-02-14 | textproc/ruby-ferret: build fixes for Ruby 3.0 and more | taca | 7 | -2/+85
* Fix build problem with Ruby 3.0. * Really compare two objects in two cases. Bump PKGREVISION.
2021-02-14 | textproc/Makefile: add and enable ruby-rexml | taca | 1 | -1/+2
2021-02-14 | textproc/ruby-rexml: re-add package version 3.2.4 | taca | 4 | -0/+91
ruby-rexml was bundled with the Ruby base package and removed in the past. Ruby 3.0 does not bundle the rexml library any more, so re-add its latest version now. REXML REXML was inspired by the Electric XML library for Java, which features an easy-to-use API, small size, and speed. Hopefully, REXML, designed with the same philosophy, has these same features. I've tried to keep the API as intuitive as possible, and have followed the Ruby methodology for method naming and code flow, rather than mirroring the Java API. REXML supports both tree and stream document parsing. Stream parsing is faster (about 1.5 times as fast). However, with stream parsing, you don't get access to features such as XPath.
2021-02-14 | textproc/Makefile: add and enable ruby-actiontext61 | taca | 1 | -1/+2
2021-02-14 | textproc/ruby-actiontext61: add package version 6.1.2.1 | taca | 4 | -0/+88
Action Text Action Text brings rich text content and editing to Rails. It includes the [Trix editor](https://trix-editor.org) that handles everything from formatting to links to quotes to lists to embedded images and galleries. The rich text content generated by the Trix editor is saved in its own RichText model that's associated with any existing Active Record model in the application. Any embedded images (or other attachments) are automatically stored using Active Storage and associated with the included RichText model. You can read more about Action Text in the [Action Text Overview](https://edgeguides.rubyonrails.org/action_text_overview.html) guide. This is for Ruby on Rails 6.1.
2021-02-14 | textproc/ruby-kramdown-rfc2629: update to 1.3.31 | taca | 2 | -9/+7
pkgsrc change: remove unnecessary OVERRIDE_GEMSPEC. 1.3.31 (2021-02-13) * Add temporary workaround for draft referencing regression 1.3.30 (2021-02-12) * Use rfc-editor/datatracker for RFC/I-D bibxml unless KRAMDOWN_USE_TOOLS_SERVER environment variable is set. 1.3.29 (2021-02-12) * Do not use server-side anchor setup for stand_alone: true 1.3.28 (2021-02-05) * SVG error handling fixes
2021-02-13 | tex-lwarp{,-doc}: update to 0.894 | markd | 5 | -17/+80
0.86 MathJax: Updated to v3. Fixed forward references. Improved equation numbering. Added support for starred macros, and starred macros for mathtools, nccmath, physics. Improved filename generation. Fixed labels in eqnarray and lateximage. Fixed nccmath, xcolor.
2021-02-13 | (*/hs-*) fix build, not adapted to ghc90 version | mef | 7 | -7/+14
2021-02-12 | Add lok | pin | 1 | -1/+2
2021-02-12 | textproc/lok: import package | pin | 5 | -0/+180
Command-line tool for quickly counting the lines of code in a project, broken down by programming language. Features: - Quickly calculates data - Supports multiple languages - Supports multiple output formats: ASCII, HTML, Markdown
2021-02-11 | www/ruby-rails60: update to 6.0.3.5 | taca | 1 | -5/+5
databases/ruby-activerecord60: ## Rails 6.0.3.5 (February 10, 2021) ## * Fix possible DoS vector in PostgreSQL money type Carefully crafted input can cause a DoS via the regular expressions used for validating the money format in the PostgreSQL adapter. This patch fixes the regexp. Thanks to @dee-see from Hackerone for this patch! [CVE-2021-22880] *Aaron Patterson* www/ruby-actionpack60 ## Rails 6.0.3.5 (February 10, 2021) ## * Prevent open redirect when allowed host starts with a dot [CVE-2021-22881] Thanks to @tktech (https://hackerone.com/tktech) for reporting this issue and the patch! *Aaron Patterson*
2021-02-11 | py-markups: update to 3.1.0 | gutteridge | 2 | -7/+7
Version 3.1.0, 2021-01-31 ========================= Incompatible changes: * Python versions older than 3.6 are no longer supported. Other changes: * Instead of ``pkg_resources``, ``importlib.metadata`` is now used. * For Markdown markup, ``markdown-extensions.yaml`` files are now supported in addition to ``markdown-extensions.txt`` files. * Type annotations were added for public API. * The reStructuredText markup no longer raises exceptions for invalid markup. * MathJax v3 is now supported in addition to v2. Also, the Arch Linux mathjax packages are now supported (issue #4). * Added Pygments CSS support for the ``pymdownx.highlight`` Markdown extension.
2021-02-11 | py-markdown-math: update to 0.8 | gutteridge | 2 | -7/+7
Version 0.8, 2020-11-03 ======================= * GitLab-style math blocks are now supported in nested environments such as lists. - Thanks to Ran Shaham for the contribution. * Tests now pass with Python-Markdown 3.3.
2021-02-10 | py-elementpath: updated to 2.1.4 | adam | 3 | -8/+9
v2.1.4 * Add tests and apply small fixes to TDOP parser * Fix wildcard selection of attributes
2021-02-09 | lowdown: update to 0.8.1. | fcambus | 2 | -7/+7
ChangeLog: Version 0.8.1, 2021-02-09 Add --term-nolinks to strip URLs out of terminal output (when alternative text is available). Then add --nroff-nolinks and --nroff-shortlinks, just like those for -Tterm, for use with -Tman or --nroff-no-groff. Fix long-standing kinda-bug where www autolinks were being reported as regular links instead of autolinks. Introduce -m and -M, which allow metadata to be provided on the command line. Metadata keys are first looked for in -m, overridden by what's in the document, and those overridden by what's in -M. Remove the deprecated -D, -d, -E, and -e, which were long ago replaced by long options. Inhibit printing of metadata in -Tgemini unless --gemini-metadata is given.
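The precedence described above for -m, document metadata and -M amounts to a three-layer merge in which later layers win. A minimal sketch of that rule, assuming plain string key/value pairs; the function name `resolve_metadata` and the sample keys are illustrative and not part of lowdown:

```python
from typing import Dict


def resolve_metadata(
    defaults: Dict[str, str],   # values that would come from -m
    document: Dict[str, str],   # values found in the document itself
    overrides: Dict[str, str],  # values that would come from -M
) -> Dict[str, str]:
    """Merge three metadata layers; later layers override earlier ones."""
    merged = dict(defaults)
    merged.update(document)
    merged.update(overrides)
    return merged


if __name__ == "__main__":
    result = resolve_metadata(
        defaults={"title": "Untitled", "author": "anonymous"},
        document={"title": "Release notes"},
        overrides={"author": "editor"},
    )
    print(result)
```

In this reading, -m supplies fallbacks for keys the document leaves out, while -M forces a value regardless of what the document says.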
2021-02-09 | py-snowballstemmer: updated to 2.1.0 | adam | 3 | -8/+17
2.1.0: * Fix snowballstemmer.algorithms() method. * Update code to generate trove language classifiers for PyPI. All the natural languages we previously had stemmers for have now been added to PyPI's list, but Armenian and Yiddish aren't on it.
2021-02-08 | py-pandocfilters: updated to 1.4.3 | adam | 2 | -8/+8
1.4.3: Unknown changes
2021-02-07 | py-parsimonious: added version 0.8.1 | adam | 5 | -1/+74
Parsimonious aims to be the fastest arbitrary-lookahead parser written in pure Python, and the most usable. It's based on parsing expression grammars (PEGs), which means you feed it a simplified sort of EBNF notation. Parsimonious was designed to undergird a MediaWiki parser that wouldn't take 5 seconds or a GB of RAM to do one page, but it's applicable to all sorts of languages.
2021-02-07 | *: Recursive revbump from audio/pulseaudio-14.2.nb1 | ryoon | 6 | -12/+12
2021-02-05 | py-jsbeautifier: updated to 1.13.5 | adam | 2 | -7/+7
1.13.5: Unknown changes
2021-02-05 | py-elementpath: updated to 2.1.3 | adam | 2 | -7/+7
v2.1.3: * Extend tests for XPath 2.0 with minor fixes * Fix fn:round-half-to-even
2021-02-05 | py-Unidecode: updated to 1.2.0 | adam | 3 | -8/+13
unidecode 1.2.0 * Add 'errors' argument that specifies how characters with unknown replacements are handled. Default is 'ignore' to replicate the behavior of older versions. * Many characters that were previously replaced with '[?]' are now correctly marked as unknown and will behave as specified in the new errors='...' argument. * Added some missing ligatures and quotation marks in U+1F6xx and U+27xx ranges. * Add PEP 561-style type information (thanks to Pascal Corpet) * Support for Python 2 and 3.5 to be removed in next release.
2021-02-05 | py-xmlschema: updated to 1.5.0 | adam | 3 | -14/+47
v1.5.0 * Add DataElement class for creating objects with schema bindings * Add DataElementConverter for decode to structured objects * Add an experimental abstract base class for building jinja2 based code generators (jinja2 as an optional dependency)
2021-02-05 | textproc/ruby-kramdown-rfc2629: update to 1.3.27 | taca | 2 | -8/+8
1.3.27 (2020-02-04) * Add links to SVG-generating tools in README * Add -i option to kdrfc, run idnits 1.3.26 (2020-02-04) * Depend on json_pure
2021-02-04 | ugrep: update to 3.1.7. | wiz | 2 | -7/+7
New --bool option to specify Boolean search query patterns (with Google search syntax or fzf-like when used with -F to search strings instead of regex patterns); new --and and --not options; new --dotall option; updated --format to support -v; other improvements. More coming soon!
2021-02-04 | asciidoc: update to 9.0.5. | wiz | 2 | -8/+7
Version 9.0.5 (2021-01-24) -------------------------- .Bug fixes - Use config newline setting in system attribute evaluation (thanks @hoadlck) .Testing - Update to deadsnakes/python@v2.0.2
2021-02-04 | lowdown: update to 0.8.0. | fcambus | 3 | -10/+9
ChangeLog: Version 0.8.0, 2021-01-31 Recognise the volume, source, and section metadata. These are currently only used by -Tman. Convert all internal functions to return an error code on memory allocation failure. Prior to this, these functions had a chance of exiting and printing failure to stderr. Now, this is left as the responsibility of the front-end. There's no significant API change except that all renderers return a value. Fix the difference engine in several subtle ways, improving the produced scripts, and also fix crashes where similar text would match multiple parts of the parse tree, resulting in assertions. Re-write the -Tms and -Tman generator to use a completely different internal algorithm. This algorithm, instead of formatting directly into output, converts the AST into an array of output blocks marked either as text, literal, macro, or font/colour change. An assembler for this array manages newlines and spacing between blocks. This fixes all known instances of unexpected line breaks and allows for significantly simplified handling of text interspersed with macros (e.g., links, etc.). An API result of this is that the tree passed to lowdown_nroff_rndr(3) is now const. Recognise non-block and block lists for -Tlatex output. Emit a UTF-8 preconv header to all -Tms and -Tman so that -Kutf8 need not be passed to the formatter. Remove the --nroff-hardwrap option, which needlessly complicates code without benefit.
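The rewritten -Tms/-Tman generator is described above as a two-stage design: the AST becomes an array of typed output blocks, and an assembler then decides where newlines go. The sketch below illustrates that general idea only. The `Block` type, the kind names and the `assemble` function are hypothetical and are not taken from lowdown's source; the one roff fact relied on is that macro requests must begin at the start of a line.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    kind: str  # one of "text", "literal", "macro", "font" (assumed names)
    data: str


def assemble(blocks: List[Block]) -> str:
    """Join output blocks, inserting newlines only where roff needs them."""
    out: List[str] = []
    at_line_start = True
    for block in blocks:
        if block.kind == "macro":
            # A macro request has to start on its own line.
            if not at_line_start:
                out.append("\n")
            out.append("." + block.data + "\n")
            at_line_start = True
        elif block.kind == "font":
            # Inline font escape such as \fB or \fR.
            out.append("\\f" + block.data)
            at_line_start = False
        elif block.data:
            # Text and literal blocks are emitted unchanged in this sketch.
            out.append(block.data)
            at_line_start = block.data.endswith("\n")
    if not at_line_start:
        out.append("\n")
    return "".join(out)


if __name__ == "__main__":
    blocks = [
        Block("text", "See"),
        Block("macro", "PP"),
        Block("font", "B"),
        Block("text", "bold"),
        Block("font", "R"),
        Block("text", " text"),
    ]
    print(assemble(blocks), end="")
```

Keeping the newline decision in one place, rather than in every node formatter, is what lets text interleaved with macros avoid stray line breaks.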
2021-02-03 | textproc/ruby-kramdown-rfc2629: update to 1.3.25 | taca | 4 | -14/+19
pkgsrc changes: * Add pkg_alternatives support. 1.3.25 (2021-02-02) * Work around ERB api deprecation.
2021-02-03 | add textproc/inih. | nia | 6 | -1/+62
inih (INI Not Invented Here) is a simple .INI file parser written in C. It's only a couple of pages of code, and it was designed to be small and simple, so it's good for embedded systems. It's also more or less compatible with Python's ConfigParser style of .INI files, including RFC 822-style multi-line syntax and name: value entries.
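The two ConfigParser conventions named above, `name: value` entries and RFC 822-style multi-line values, can be seen with Python's standard `configparser` module. This sketch shows the Python side only and does not call inih, which is a C library; the section and key names are made up for the example.

```python
import configparser

# Sample INI text: one "name: value" entry, one "name = value" entry,
# and one value continued on an indented following line.
SAMPLE = """\
[user]
name: Bob Smith
email = bob@example.com
bio = first line
    second line
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

name = parser["user"]["name"]
email = parser["user"]["email"]
bio_lines = parser["user"]["bio"].splitlines()

if __name__ == "__main__":
    print(name)
    print(email)
    print(bio_lines)
```

An indented line is treated as a continuation of the previous value, which is the multi-line behaviour the description refers to.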
2021-02-03 | aspell-en: Update to 2020.12.07.0 | nia | 2 | -7/+7
2020.12.07: Updates to SCOWL 2020.12.07 which added some new words. The update also fixed a number of variant problems and removed irregardless, froward (+ derivatives) and perpend.
2021-02-03 | py-Levenshtein: update to 0.12.2 | gutteridge | 2 | -7/+7
Change log: Incorrect checking code was left in one function