path: root/textproc
Age  Commit message  Author  Files  Lines
2022-12-17  py-cmudict: update to 1.0.12  gutteridge  3  -8/+16
(Change log entries absent for some releases. Mostly commits and subsequent reversions it seems. Also some dictionary updates.) v1.0.12 Bug Fixes restored returning a file-like object by _stream() (fa145ec) use backported importlib_resources for python 3.9 (1adee16) Continuous Integration removed dependabot-batcher (a517ca4) dependabot: updated dependabot prefixes to use conventional commits (570b36c) Tests test_cmudict: added test case for dict_stream() (08f2a08) ignore deprecation warning for is_binary (839cd75) v1.0.3 Maintenance: dependency updates migrated to poetry added bump, publish, release pipeline
2022-12-16  Update to 2.24  scole  2  -6/+6
all changes for pthai.el - use expand-file-name in a few places - fix pthai-audio-display-definition plumbing - use call-process* for pthai-mp3-play and pthai-split-command - rename pthai-splitter-swath-word-length to pthai-splitter-max-swath-word-length - restore pthai-thai-break-words splitter
2022-12-15  py-lxml: updated to 4.9.2  adam  3  -8/+35
4.9.2 (2022-12-13) ================== Bugs fixed ---------- * CVE-2022-2309: A Bug in libxml2 2.9.1[0-4] could let namespace declarations from a failed parser run leak into later parser runs. This bug was worked around in lxml and resolved in libxml2 2.10.0. https://gitlab.gnome.org/GNOME/libxml2/-/issues/378 Other changes ------------- * LP-1981760: ``Element.attrib`` now registers as ``collections.abc.MutableMapping``. * lxml now has a static build setup for macOS on ARM64 machines (not used for building wheels). Patch by Quentin Leffray.
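The MutableMapping entry above is about abstract base class registration. As a rough illustration using only the standard library, the sketch below shows how registering a class changes the result of isinstance(). The AttribLike class is a hypothetical stand-in and is not lxml's implementation.

```python
from collections.abc import MutableMapping


class AttribLike:
    """Hypothetical dict-like attribute store (not lxml's real class)."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


# Before registration the class is not recognised as a MutableMapping,
# even though it implements the required methods.
before = isinstance(AttribLike(), MutableMapping)

# Registering it as a virtual subclass changes the isinstance() result
# without altering the class itself.
MutableMapping.register(AttribLike)
after = isinstance(AttribLike(), MutableMapping)
```

Code that dispatches on isinstance(obj, MutableMapping) would start treating such an object as a mapping only after the registration.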
2022-12-15  py-nltk: updated to 3.8  adam  3  -16/+19
Version 3.8 2022-12-12 * Refactor dispersion plot * Provide type hints for LazyCorpusLoader variables * Throw warning when LanguageModel is initialized with incorrect vocabulary * Fix WordNet's all_synsets() function * Resolve TreebankWordDetokenizer inconsistency with end-of-string contractions * Support both iso639-3 codes and BCP-47 language tags * Avoid DeprecationWarning in Regexp tokenizer * Fix many doctests, add doctests to CI * Fix bool field not being read in VerbNet * Greatly improve time efficiency of SyllableTokenizer when tokenizing numbers * Fix encodings of Polish udhr corpus reader * Allow TweetTokenizer to tokenize emoji flag sequences * Prevent LazyModule from increasing the size of nltk.__dict__ * Fix CoreNLPServer non-default port issue * Add "acion" suffix to the Spanish SnowballStemmer * Allow loading WordNet without OMW * Use input() in nltk.chat.chatbot() for Jupyter support * Fix edit_distance_align() in distance.py * Tackle performance and accuracy regression of sentence tokenizer since NLTK 3.6.6 * Add the Iota operator to semantic logic * Resolve critical errors in WordNet app * Resolve critical error in CHILDES Corpus * Make WordNet information_content() accept adjective satellites * Add "strict=True" parameter to CoreNLP * Resolve issue with WordNet's synset_from_sense_key * Handle WordNet synsets that were lost in mapping * Resolve TypeError in Boxer * Add function to retrieve WordNet synonyms * Warn about nonexistent OMW offsets instead of raising an error * Fix missing ic argument in res, jcn and lin similarity functions of WordNet * Add support for the extended OMW * Fix LC cutoff policy of text tiling * Optimize ConditionalFreqDist.__add__ performance * Add Markdown corpus reader
2022-12-15  ruby-nokogumbo: Requires libxml2.  jperkin  1  -1/+2
2022-12-15  swath: Requires libiconv.  jperkin  1  -1/+2
2022-12-14  libfyaml: use proper distfile  wiz  4  -65/+64
avoids dependency on autotools, and fixes build since pkg-config m4 file was not depended on.
2022-12-13  FlightCrew: Work around NetBSD unzip for patched files.  jperkin  1  -1/+12
2022-12-13  Avoid extracting the vendored discount library. We don't use it at all  schmonz  1  -3/+8
(instead buildlinking textproc/discount), and it sometimes contains macOS xattrs that break extraction as root on other systems. Fixes "Cannot restore extended attributes: com.apple.quarantine com.apple.quarantine" seen with pkg_comp(8) on NetBSD/amd64 9.3.
2022-12-12  py-itemloaders: updated to 1.0.6  adam  2  -7/+6
1.0.6 Fixes a regression introduced in 1.0.5 that would cause the re parameter of ItemLoader.add_xpath and similar methods to be passed to lxml, which would trigger an exception when the value of re was a compiled pattern and not a string 1.0.5 Allow additional args to be passed when calling ItemLoader.add_xpath Fixed missing space in an exception message Updated company name in author and copyright sections Added official support for Python 3.9 and improved PyPy compatibility Added official support for Python 3.10
2022-12-12  py-itemadapter: updated to 0.7.0  adam  3  -8/+10
0.7.0 (2022-08-02) ItemAdapter.get_field_names_from_class 0.6.0 (2022-05-12) Slight performance improvement 0.5.0 (2022-03-18) Improve performance by removing imports inside functions 0.4.0 (2021-08-26) Added ItemAdapter.is_item_class and ItemAdapter.get_field_meta_from_class 0.3.0 (2021-07-15) Added built-in support for pydantic models
2022-12-12  py-black: updated to 22.12.0  adam  2  -6/+6
22.12.0 Preview style <!-- Changes that affect Black's preview style --> - Enforce empty lines before classes and functions with sticky leading comments - Reformat empty and whitespace-only files as either an empty file (if no newline is present) or as a single newline character (if a newline is present) - Implicitly concatenated strings used as function args are now wrapped inside parentheses - Correctly handle trailing commas that are inside a line's leading non-nested parens Configuration <!-- Changes to how Black can be configured --> - Fix incorrectly applied `.gitignore` rules by considering the `.gitignore` location and the relative path to the target file - Fix incorrectly ignoring `.gitignore` presence when more than one source directory is specified Parser <!-- Changes to the parser or to version autodetection --> - Parsing support has been added for walruses inside generator expression that are passed as function args (for example, `any(match := my_re.match(text) for text in texts)`) Integrations <!-- For example, Docker, GitHub Actions, pre-commit, editors --> - Vim plugin: Optionally allow using the system installation of Black via `let g:black_use_virtualenv = 0`
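The Parser entry above refers to an assignment expression used directly as the element of a generator expression that is the sole argument of a call. A small sketch of that construct follows, assuming a recent Python 3 interpreter; the pattern, the input list and the first_match helper are made up for illustration.

```python
import re

my_re = re.compile(r"\d+")                # hypothetical pattern
texts = ["abc", "42 items", "7 xyz"]      # hypothetical input


def first_match(pattern, candidates):
    # The unparenthesized walrus is the generator's element expression.
    # any() stops at the first truthy match, and `match` stays bound in
    # the enclosing function scope afterwards.
    if any(match := pattern.match(text) for text in candidates):
        return match.group(0)
    return None
```

Calling first_match(my_re, texts) returns the text of the first successful match, and None when nothing matches.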
2022-12-12  py-m2r: updated to 0.3.1  adam  2  -10/+9
Version 0.3.0 * Drop support for Python 2.7, 3.4, 3.5, and 3.6 * Add compatibility with docutils 0.19 * Sync up assertion with changes in argparse * Limit mistune dependency version range
2022-12-11  textproc/php-xapian: this package is not compatible with php82  taca  1  -2/+2
Currently, this package supports php56 and php74.
2022-12-09  py-dicttoxml: updated to 1.7.15  adam  2  -6/+6
Version 1.7.15 Fixed issue 43 Implemented issue 82 Small fixes to readme.md Version 1.7.14 Handle floating point keys as per issue 61 Option to return string instead of bytes as per issue 55 Reorganized readme.md to have a better flow. Version 1.7.13 Fixed issue 53, dicttoxml(None) and dicttoxml("name") break in 1.7 Fixed issue 96, update readme section on debugging Version 1.7.12 Fixed issue 95: changed project.toml to support Python 3.6+ and updated readme documentation.
2022-12-08  textproc: add libfyaml  khorben  1  -1/+2
2022-12-08  libfyaml: import version 0.7.12  khorben  16  -0/+289
libfyaml is a fancy 1.2 YAML and JSON parser/writer. Fully feature complete YAML parser and emitter, supporting the latest YAML spec and passing the full YAML testsuite. It is designed to be very efficient, avoiding copies of data, and has no artificial limits like the 1024 character limit for implicit keys. libfyaml is using https://github.com/yaml/yaml-test-suite as a core part of its testsuite.
2022-12-08  Update to 2.23  scole  2  -7/+6
all changes for pthai.el - move to end of word after *insert *complete-word functions - only toggle highlighting for 'pthai overlays - various bug fixes, cleanups, and simplifications - make end of word bounds detection more consistent - add 'pthai-soundfiles-counts to display downloaded word counts per letter - rename 'pthai-spell-string to pthai-spell-string-at-point, and 'pthai-spell-word to pthai-spell-word-at-point - rename pthai-overlay-off to pthai-hightlight-off
2022-12-08  Revbump all Go packages after go119 security update  bsiegert  8  -22/+35
2022-12-08  ruby-nokogiri: update to 1.13.10.  tsutsui  2  -7/+6
Upstream changes: https://github.com/sparklemotion/nokogiri/releases/tag/v1.13.10 1.13.10 / 2022-12-07 Security * [CRuby] Address CVE-2022-23476, unchecked return value from xmlTextReaderExpand. See GHSA-qv4q-mr5r-qprj for more information. Improvements * [CRuby] XML::Reader#attribute_hash now returns nil on parse errors. This restores the behavior of #attributes from v1.13.7 and earlier. [#2715]
2022-12-08  textproc/ruby-ferret: fix build problem on NetBSD  taca  1  -1/+2
Remove -D_XOPEN_SOURCE=500 on NetBSD. I don't know whether it is required on other operating systems. I built successfully without this change on NetBSD 9.3_STABLE on 9th Nov 2022 and I don't know what has changed since.
2022-12-06  py-ujson: updated to 5.6.0  adam  2  -7/+7
5.6.0 Added Update vendored double-conversion to 3.2.1 Fixed Fix len integer overflow issue
2022-12-05  textproc/csvlens: update to 0.1.10  pin  2  -7/+7
v0.1.10 - Handle irregular CSV when calculating column widths - Improved event loop handling - Improved memory usage when creating temporary file from stdin
2022-12-04  textproc/git-delta: update to 0.15.1  pin  2  -7/+7
- Explicitly request xz compression by @dandavison in #1249
2022-12-04  textproc/ruby-terminal-table: update to 3.0.2  taca  3  -11/+18
3.0.2 (2021-09-19)
* fix align_column for nil values and colspan

3.0.1 / 2021-05-10
* Support for unicode-display_width 2.0
* Fix issue where last row of an empty table changed format

3.0.0 / 2020-01-27
* Support for (optional) Unicode border styles on tables. In order to support decent looking Unicode borders, different types of intersections get different types of intersection characters. This has the side effect of subtle formatting differences even for the ASCII table border case due to removal of certain intersections near colspans. For example, previously the output of a table may be:

    +------+-----+
    |   Title    |
    +------+-----+
    | Char | Num |
    +------+-----+
    | a    | 1   |
    | b    | 2   |
    | c    | 3   |
    +------+-----+

  And now the `+` character above the word Title is removed, as it is no longer considered an intersection:

    +------------+
    |   Title    |
    +------+-----+
    | Char | Num |
    +------+-----+
    | a    | 1   |
    | b    | 2   |
    +------+-----+

* The default border remains an ASCII border for backwards compatibility, however multiple border classes are included / documented, and user defined border types can be applied as needed.

In support of this update, the following issues were addressed:
* colspan creates conflict with colorize (#95)
* Use nice UTF box-drawing characters by default (#99)
  - Note that `AsciiBorder` is still the default
* Border-left and border-right style (#100)
* Helper function to style as Markdown (#111)
  - Achieved using `MarkdownBorder`
2022-12-04  textproc/ruby-treetop: update to 1.6.12  taca  2  -6/+6
1.6.12 (2022-11-24) * Fix home URL * Migrate CI to GitHub Actions * Replace deprecated File.exists? with File.exist?
2022-12-04  textproc/ruby-temple: update to 0.9.1  taca  3  -33/+34
0.9.1 (2022-10-24) * Fix Slim's error in AttributeMerger due to 0.9.0's :capture_generator (#137) * Use specified :capture_generator for nested captures (#112) * Fix Temple::ERB::Engine's <%= to not escape and <%== to escape expressions 0.9.0 (2022-10-24) * Require Ruby 2.5+ (#131) * Change default :capture_generator to self (#113) * Improve compatibility with Rails 7.1 (#135) * Support Rails 6.1's annotate_rendered_view_with_filenames with Temple::Filters::Ambles (#134) * Fix a crash in StringSplitter filter (#138) * Fix a warning by Object#=~ since Ruby 2.6 (#129) * Fix deprecated Tilt template mime type (#108) * Stop using deprecated EscapeUtils from Temple::Utils (#136)
2022-12-04  textproc/ruby-review: update to 5.6.0  taca  2  -6/+7
5.6.0 (2022-10-28) New Features * IDGXMLBuilder: support imgmath math_format in //texequation and @<m> (#1829) * LATEXBuilder: use reviewicon macro instead of reviewincludegraphics in @<icon> (#1838) * trim spaces before/after characters in ruby text (#1839) Breaking Changes * LATEXBuilder: use MEMO, NOTICE, CAUTION or other headers instead of ■メモ. If you want to use older headers, add ■メモ in locale.yml. (#1856) Others * update documents format.md and format.ja.md (#1860)
2022-12-04  textproc/ruby-libxml: update to 3.2.4  taca  3  -8/+8
3.2.4 (2022-10-29) * Support libxml2 version 2.10.2 (Charlie Savage) * Reduce number of globally included C header files (Charlie Savage)
2022-12-04  textproc/ruby-jmespath: add ALTERNATIVES  taca  1  -0/+1
Add pkg_alternatives support.
2022-12-04  textproc/ruby-jmespath: update to 1.6.2  taca  3  -7/+11
1.6.2 (2022-11-25) * Issue - Allow comparison of Numeric types (includes Float). * Issue - Add jmespath.rb to gemspec executables.
2022-12-04  textproc/ruby-html-pipeline: update to 2.14.3  taca  2  -6/+6
2.14.3 (2022-10-14) Closed issues: * Allow vertical-align #366 * Since bump 2.14.2 builds are failing #363 Merged pull requests: * Replace EscapeUtils.escape_html with CGI.escape_html #365 (ramhoj)
2022-12-04  textproc/ruby-haml: update to 6.0.12  taca  3  -7/+8
6.0.12 (2022-11-26) * Fix a whitespace removal with > and an if-else statement #1114 6.0.11 (2022-11-25) * Fix a whitespace removal with > and an if statement #1114 6.0.10 (2022-11-09) * Evaluate :erb filter in the template context like Haml 5 6.0.9 (2022-11-07) * Support sass-embedded #1112 6.0.8 (2022-10-28) * Support interpolation in HTML comments, which has not been working since 6.0.0 #1107 6.0.7 (2022-10-13) * Haml::Engine and Haml::Template use StringBuffer instead of ArrayBuffer o It seems more performant in many cases with recent Ruby versions. o Haml::RailsTemplate is not affected.
2022-12-04  textproc/ruby-asciidoctor: update to 2.0.18  taca  2  -6/+6
2.0.18 (2022-10-15) Improvements * Propagate :to_dir option to document of AsciiDoc table cell (#4297) * Force encoding of attribute data passed via CLI to UTF-8 if transcoding fails (#4351) (@zkaip) * Add include role to link macro that replaces include directive when include is not enabled Bug Fixes * Change internal uriish? helper to only detect a URI pattern at start of a string; avoids misleading messages (#4357) * Prevent highlight.js warning when no language is set on source block; don't call highlightBlock if data-lang attribute is absent (#4263) * Don't raise error if Asciidoctor::Extensions.unregister is called before groups are initialized (#4270) * If path is included both partially and fully, store it with true value (included fully) in includes table of document catalog * Reset registry if activate is called on it again (#4256) * Format source location in exception message when extension code is malformed * Fix lineno on reader when skip-front-matter attribute is set but end of front matter is not found * Fix Asciidoctor::Cli::Invoker constructor when first argument is a hash * Update default stylesheet to honor marker on unordered list when marker is defined on ancestor unordered list (#4361)
2022-12-03  textproc/git-delta: update to 0.15.0  pin  3  -56/+8
What's Changed Thanks to all contributors for the changes in this release! One particularly exciting contribution is the tweaks to the highlighting algorithm made by @phillipwood in #1244. This is something that has remained more or less the same since delta was first created, but #1244 brings several improvements in the details of exactly which characters are highlighted. - Change Rust toolchain in 'Deploy Manual' CI task by @dandavison in #1183 - Switch bat to library mode by @tranzystorek-io in #1187 - Add sourcehut link parsing by @p00f in #1190 - Refactoring ansi/iterator by @zhiburt in #1191 - Add codeberg link parsing by @p00f in #1194 - Add terminal width fallback via stty if on windows/MSYS2 by @th1000s in #1030 - measure_text_width() without constructing a temporary string by @th1000s in #1216 - Remove Git 2.37 workaround from install docs by @adamchainz in #1228 - Fix clippy warnings by @clnoll in #1236 - Remove Provides in Debian package by @baryluk in #1217 - Handle quoted filenames in diff header by @th1000s in #1222 - ci: improve formatting by @MarcoIeni in #1238 - Highlighting improvements by @phillipwood in #1244 - ci: release apple arm binary by @MarcoIeni in #1239 - try fix bad alignment in unicode (#1144) by @SheldonNico in #1145
2022-11-30  py-html5-parser: not for Python 2.7  adam  1  -4/+4
2022-11-30  py-html-sanitizer: updated to 1.9.3  adam  2  -9/+10
1.9 (2020-01-20) Added Python 3.8 to the CI matrix. Be able to keep the <style> tag by adding it to tags. Added a style check to the CI matrix. 1.8 (2019-11-21) Actually added support for customizing lxml's autolinking behavior using a dictionary argument. Stopped removing explicitly allowed attributes. Removed id from allowed attributes of <a> tags to provide an additional layer of defense against DOM clobbering attacks. Added an element preprocessor which assigns the id value to the name attribute of anchors if name isn't set or empty. This should provide additional backwards compatibility making the id removal less of a problem when using named anchors. 1.7 (2019-02-19) Added a system check which validates sanitizer configurations early when using Django. Fixed an edge case where passing in an empty allowed tags list would unexpectedly and silently not remove any tags at all (because that's the way lxml's cleaner works). Changed the sanitizer tags, empty and separate options to also accept any iterable, not just sets. Changed the lru_cache import in the Django module to try functools first. Fixed the tag merging to also check tags in empty. This means that e.g. consecutive <hr> tags are also merged now when using the default settings. Made it possible to override the set of tags processed as whitespace. The default set is {"br"} which preserves the current behavior of stripping breaks from the beginning or end of tags' content.
2022-11-30  py-nltk: add ALTERNATIVES  adam  1  -0/+1
2022-11-30  py27-cssselect2: exclude pytest-flake8 and pytest-isort from testing  adam  4  -6/+38
2022-11-30  py27-tinycss2: exclude pytest-flake8 and pytest-isort from testing  adam  4  -8/+40
2022-11-29  py-nltk: updated to 3.7  adam  3  -13/+42
NLTK 3.7 release: February 2022: improve and update the NLTK team page on nltk.org drop support for Python 3.6 add support for Python 3.10
2022-11-29  py-pyphen: updated to 0.13.2  adam  3  -8/+9
Version 0.13.2 -------------- * Add Thai dictionary.
2022-11-29  py-openpyxl: put correct DEPENDS  adam  1  -3/+2
2022-11-29  py-tablib: updated to 3.2.1  adam  3  -13/+18
3.2.1 (2022-04-09) Bugfixes - Support solo CR in text input imports 3.2.0 (2022-01-27) Changes - Dropped Python 3.6 support Bugfixes - Corrected order of arguments to a regex call in `safe_xlsx_sheet_title` 3.1.0 (2021-10-26) Improvements - Add support for Python 3.10 - The csv, xls, and xlsx formats gained support for the `skip_lines` keyword argument for their `import_set()` method to be able to skip the nth first lines of a read file Bugfixes - Avoided mutable parameter defaults - Specify build backend for editable installs - Doubled sample size passed to `csv.Sniffer()` in `_csv.detect()` 3.0.0 (2020-12-05) Breaking changes - Dropped Python 3.5 support. - JSON-exported data is no longer forced to ASCII characters. - YAML-exported data is no longer forced to ASCII characters. Improvements - Added Python 3.9 support. - Added read_only option to xlsx file reader Bugfixes - Prevented crash in rst export with only-space strings
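The 3.1.0 bugfix "Avoided mutable parameter defaults" refers to a general Python pitfall. The sketch below illustrates the pitfall and the usual None-sentinel remedy; the function names are hypothetical and this is not tablib's code.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, at function definition time,
    # so every call without `bucket` mutates the same object.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Using None as a sentinel gives each call its own new list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = list(append_shared("a"))
shared_second = list(append_shared("b"))   # still carries "a" from the first call
fresh_first = append_fresh("a")
fresh_second = append_fresh("b")           # independent of the first call
```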
2022-11-29  py-markuppy: added version 1.14  adam  5  -1/+40
This is MarkupPy - a Python module that attempts to make it easier to generate HTML/XML from a Python program in an intuitive, lightweight, customizable and pythonic way.
2022-11-29  py-openpyxl: updated to 3.0.10  adam  4  -23/+22
3.0.10 (2021-05-13) Bugfixes * Image files not closed when workbooks are saved * Problem with missing scope attribute in Pivot Table formats * Excel unhappy when multiple sorts are defined 3.0.9 (2021-09-22) Bugfixes * Ignore blank ignored in existing Data Validations * Add support for cell protection for merged cell ranges * Timezone-aware datetimes raise an Exception * Improved normalisation of chart series * Catch OverflowError for out of range datetimes * Alignment.relativeIndent can be negative * Incorrect default value groupBy attribute 3.0.8 (brown bag) Deleted because it contained breaking changes from 3.1 3.0.7 (2021-03-09) Bugfixes * Problems with zero time values * Not possible to correctly convert excel dates to timedelta * Exception raised when merging cells which do not have borders all the way round. * Python 2 print statement in the tutorial Pull Requests * Add documentation on datetime handling * Drop dependency on jdcal * Datetime rounding * Unify handling of 1900 epoch * Add explicit support for reading datetime deltas * Millisecond precision for datetimes 3.0.6 (2021-01-14) Bugfixes * Borders in differential styles are incorrect * Error when opening some pivot tables * Resave breaks the border format in conditional formatting rules * Read-only workbook not closed properly if generator interrupted * Pandas.Multiindex.labels deprecated * Pandas.Multiinex not expanded correctly * Cannot read rows with exponents * numpy.float is deprecated * Cells without coordinate attributes not always correctly handled Pull Requests * Improved handling of borders for differential styles * Support subclasses of datetime objects * Improved handling of cells without coordinates 3.0.5 (2020-08-21) Bugfixes * Incorrectly consider currency format as datetime * Cannot copy worksheets with merged cells * Empty worksheets do not return generators when looping. 
* Hyperlinks duplicated on multiple saves * Incorrectly literal format as datetime * Links set to range of cells not preserved * Exception when opening workbook with chartsheets and tables 3.0.4 (2020-06-24) Bugfixes * Find tables by name * Worksheet protection missing in existing files * Exception when reading files with external images * Reading lots of merged cells is very slow. * Read support for Bubble Charts. * Preserve any indexed colours * Reading many thousand of merged cells is really slow. * Adding tables in write-only mode raises an exception. Pull Requests * Add support for finding tables by name or range. 3.0.3 (2020-01-20) Bugfixes * Exception when handling merged cells with hyperlinks * Problems when both lxml and defusedxml are installed * CFVO with incorrect values cannot be processed 3.0.2 (2019-11-25) Bug fixes * DeprecationError if both defusedxml and lxml are installed * ws._current_row is higher than ws.max_row * Border bottom style is not optional when it should be * Empty cells in read-only, values-only mode are sometimes returned as ReadOnlyCells * Cannot add page breaks to existing worksheets if none exist already Pull Requests * Improvements to the documentation 3.0.1 (2019-11-14) Bugfixes * Cannot read empty charts. Pull Requests * Fix for 1250 * TableStyleElement is a sequence 3.0.0 (2019-09-25) Python 3.6+ only release
2022-11-29  py-dicttoxml: updated to 1.7.11  adam  2  -6/+6
Version 1.7.11 Simplified solution to issue 94 Version 1.7.9 Fixed issue 94
2022-11-28  Fix _PYTHON_VERSION check to avoid error  abs  1  -2/+2
Replace .if ${_PYTHON_VERSION} < 38 with .if ${_PYTHON_VERSION} == 37, as the "<" comparison otherwise fails when ${_PYTHON_VERSION} is "none". Triggered by "make clean-depends" for a package with PYTHON_VERSIONS_ACCEPTED=27 which depends on textproc/py-pygments.
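The failure described above can be modelled loosely in Python. The sketch assumes that the "<" operator needs both operands to be numeric while "==" can compare plain strings; that is an assumption made for illustration, the function names are hypothetical, and none of this is make's actual implementation.

```python
def numeric_less_than(value: str, limit: int) -> bool:
    # Models `.if ${VAR} < 38`: the operand must parse as a number.
    return int(value) < limit  # raises ValueError for "none"


def string_equals(value: str, expected: str) -> bool:
    # Models `.if ${VAR} == 37`: a plain comparison that cannot fail.
    return value == expected


def check(value: str) -> str:
    # Report what the old and the new condition do for one value.
    try:
        old = numeric_less_than(value, 38)
    except ValueError:
        old = "error"
    return f"old={old} new={string_equals(value, '37')}"
```

For the value "none", check() reports an error for the old condition and a plain False for the new one.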
2022-11-28  py-dicttoxml: updated to 1.7.8  adam  2  -6/+6
Version 1.7.8 Fixed: Boolean values now export into XML in lowercase (true, false) instead of capitalized (True, False).
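To make the behaviour change concrete, here is a standard-library sketch of serialising booleans in lowercase. It is not dicttoxml's code or API; dict_to_xml and to_xml_text are hypothetical helpers written only for this illustration.

```python
import xml.etree.ElementTree as ET


def to_xml_text(value):
    # bool must be handled explicitly: str(True) would give "True".
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def dict_to_xml(data, root_tag="root"):
    # Flat dictionaries only; each key becomes one child element.
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = to_xml_text(value)
    return ET.tostring(root, encoding="unicode")


xml_text = dict_to_xml({"active": True, "count": 3})
```

Lowercase true and false match the lexical form that XML Schema uses for booleans, which is presumably why the capitalised Python spelling was considered a bug.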
2022-11-28  py-dicttoxml: updated to 1.7.7  adam  2  -6/+6
Version 1.7.7 Fixed: debug is turned off by default, and no longer prints "Debug mode is off" in the console.