path: root/textproc
Age  Commit message  Author  Files  Lines
2012-01-29textproc/ebview: Fix indirect linking error on DragonFlymarino1-1/+3
2012-01-27Add missing sysutils/file buildlinksbd1-1/+3
Bump PKGREVISION
2012-01-26Updated to 0.14rhaen2-8/+7
no changelog from upstream available
2012-01-26Updated to 0.615rhaen2-7/+6
ChangeLog:
0.615 Tue Jan 17 01:32:07 2012 +1100 <joe@kafsemo.org>
- Fix test skipping when Unicode is unsupported.
0.614 Mon Jan 9 00:24:10 2012 +1100 <joe@kafsemo.org>
- Fix regression in 0.613 and set encoding on GLOBs.
0.613 Sat Jan 7 22:51:26 2012 +1100 <joe@kafsemo.org>
- Use 'Object->new()' syntax throughout (#65840).
- Support passing in any arbitrary object that has a print() method (from Jason Rodrigues).
2012-01-26Updated to 3.39rhaen2-7/+6
ChangeLog:
- nothing noted upstream for 3.39

version 3.38
date: 2011-07-27
# minor maintenance release
fixed: RT 65865: _ should be allowed at the start of an XML name
       https://rt.cpan.org/Ticket/Display.html?id=65865
       reported by Steve Prokopowich
removed: making att and class lvalues created problems: in certain contexts they made regular calls to the method create empty attributes. I could find no satisfactory fix; they were either incomplete or too complex for often-used methods. So att and class are back to being regular, non l-value methods. latt and lclass are the l-value versions.
added: documented the -html option for xml_grep, which allows processing HTML input
added: the -Tidy option to xml_grep, which uses HTML::Tidy to convert HTML to XML
2012-01-24Recursive dependency bump for databases/gdbm ABI_DEPENDS change.sbd3-6/+6
2012-01-23+ py-pdf-parser.wiz1-1/+2
2012-01-23Initial import of py-pdf-parser-0.3.7:wiz5-0/+58
This tool will parse a PDF document to identify the fundamental elements used in the analyzed file. It will not render a PDF document. The code of the parser is quick-and-dirty, I'm not recommending this as text book case for PDF parsers, but it gets the job done.
2012-01-23Not MAKE_JOBS_SAFE.joerg1-1/+3
2012-01-23Don't use non-ASCII character literals.joerg2-1/+15
2012-01-21Adjust HOMEPAGEgls1-2/+2
2012-01-21+ cmigemoobache1-1/+2
2012-01-21Import cmigemo-1.3e.20110227 as textproc/cmigemo.obache9-0/+277
Based on PR 45815 by Kiyono, Goro.

Migemo is a library that generates regex patterns for searching Japanese text (including Hiragana and Kanji) easily, without any IME. C/Migemo is an implementation of that Migemo library.
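The idea behind Migemo can be sketched roughly as follows: romaji typed on a plain keyboard is expanded into a regex that also matches the corresponding kana and dictionary words. This is only a minimal illustration, not C/Migemo's code or API; the lookup tables and the function names `to_hiragana` and `migemo_pattern` are hypothetical, and real Migemo uses large dictionaries and also handles partial (incremental) input.

```python
import re

# Hypothetical, tiny lookup tables for illustration only; real Migemo
# ships large dictionaries and a full romaji conversion table.
ROMAJI_TO_HIRAGANA = {"ka": "か", "n": "ん", "ji": "じ", "ki": "き"}
KANJI_DICT = {"かんじ": ["漢字", "感じ", "幹事"], "き": ["木", "気"]}


def to_hiragana(romaji):
    """Greedily convert romaji to hiragana; return None if conversion fails."""
    out = []
    i = 0
    while i < len(romaji):
        # Try the longest candidate chunk first.
        for size in (3, 2, 1):
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += len(chunk)
                break
        else:
            return None  # no table entry matched at this position
    return "".join(out)


def migemo_pattern(romaji):
    """Build a regex matching the romaji, its kana, and dictionary words."""
    alternatives = [romaji]
    kana = to_hiragana(romaji)
    if kana is not None:
        alternatives.append(kana)
        alternatives.extend(KANJI_DICT.get(kana, []))
    return "|".join(re.escape(alt) for alt in alternatives)
```

With these toy tables, `re.search(migemo_pattern("kanji"), text)` finds text containing "kanji", "かんじ", or any of the listed kanji words, so no IME is needed to type the query.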
2012-01-20update to 0.0.25drochner3-9/+10
changes:
- README file was improved to provide better guidance for users
- show the text-web-browser converting command in verbose mode for better debugging
- work around passivetex limitation for chapter titles starting with L
- use passivetex/fop extensions by default, provide --noextensions option to disable them
- basic experimental support for conversion from docbook to epub
- bugfixes
2012-01-20Change HOMEPAGE to point to the GitHub page (the original URL is no longer available).obache1-2/+2
2012-01-18Revbump after db5 updateadam1-8/+9
2012-01-17Convert packages that add --libdir=* to CONFIGURE_ARGS to usesbd1-2/+2
GNU_CONFIGURE_LIBDIR or GNU_CONFIGURE_LIBSUBDIR.
2012-01-17update to 1.3.1drochner2-6/+6
changes: bugfixes
2012-01-17add patch from upstream to fix potential DOS problem (CVE-2011-3905)drochner3-7/+57
bump PKGREV
2012-01-16Fix build with boost 1.48.0.ryoon2-1/+27
Same as PR pkg/45803.
2012-01-14gsed related clean up.obache2-10/+8
* Stop treating NetBSD's sed as GNU sed; it is not fully compatible.
* As a result, there is no need to reset TOOLS_PLATFORM.gsed for NetBSD if USE_TOOLS+=gsed is set and real GNU sed is required.
* In addition, make plain USE_TOOLS+=gsed conditional, excluding NetBSD.
* Convert {BUILD_,}DEPENDS+=gsed to USE_TOOLS; all tools from gsed are the real gsed.
2012-01-14Convert to USE_TOOLS=zip.hans2-11/+5
2012-01-13Recursive bump from audio/libaudiofile, x11/qt4-libs and x11/qt4-tools ABI bump.obache7-14/+14
2012-01-12Simplify. Don't allow Python 3 due to unsupported setuptools dependency.joerg2-10/+6
2012-01-12add 2 patches from upstream:drochner3-7/+36
- fix buffer overflow on entity references with long name (CVE-2011-3919)
- fix error handling on realloc() failure
bump PKGREV
2012-01-11On a system without setuptools, this fails to build, therefore itschmonz1-2/+2
must be an egg.
2012-01-11Update to 5.1. From the changelog:schmonz3-11/+16
* Extensive, extensive unit test refactoring
* Convert the Docbook documentation to ReST
* Include the documentation in the source distribution
* Consolidate the disparate README files into one
* Support Jython somewhat (almost all unit tests pass)
* Support Python 3.2
* Fix Python 3 issues exposed by improved unit tests
* Fix international domain name issues exposed by improved unit tests
* Issue 148 (loose parser doesn't always return unicode strings)
* Issue 204 (FeedParserDict behavior should not be controlled by `assert`)
* Issue 247 (mssql date parser uses hardcoded tokyo timezone)
* Issue 249 (KeyboardInterrupt and SystemExit exceptions being caught)
* Issue 250 (`updated` can be a 9-tuple or a string, depending on context)
* Issue 252 (running setup.py in Python 3 fails due to missing sgmllib)
* Issue 253 (document that text/plain content isn't sanitized)
* Issue 260 (Python 3 doesn't decompress gzip'ed or deflate'd content)
* Issue 261 (popping from empty tag list)
* Issue 262 (docs are missing from distribution files)
* Issue 264 (vcard parser crashes on non-ascii characters)
* Issue 265 (http header comparisons are case sensitive)
* Issue 271 (monkey-patching sgmllib breaks other libraries)
* Issue 272 (can't pass bytes or str to `parse()` in Python 3)
* Issue 275 (`_parse_date()` doesn't catch OverflowError)
* Issue 276 (mutable types used as default values in `parse()`)
* Issue 277 (`python3 setup.py install` fails)
* Issue 281 (`_parse_date()` doesn't catch ValueError)
* Issue 282 (`_parse_date()` crashes when passed `None`)
* Issue 285 (crash on empty xmlns attribute)
* Issue 286 ('apos' character entity not handled properly)
* Issue 289 (add an option to disable microformat parsing)
* Issue 290 (Blogger's invalid img tags are unparseable)
* Issue 292 (atom id element not explicitly supported)
* Issue 294 ('categories' key exists but raises KeyError)
* Issue 297 (unresolvable external doctype causes crash)
* Issue 298 (nested nodes clobber actual values)
* Issue 300 (performance improvements)
* Issue 303 (unicode characters cause crash during relative uri resolution)
* Remove "Hot RSS" support since the format doesn't actually exist
* Remove the old feedparser.org website files from the source
* Remove the feedparser command line interface
* Remove the Zope interoperability hack
* Remove extraneous whitespace
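Issue 276 in the list above refers to a general Python pitfall: a mutable default argument is created once and shared across calls. The sketch below shows that class of bug and the usual fix. It is not feedparser's code; `fetch_buggy`, `fetch_fixed`, and the `X-Request` key are hypothetical names chosen for illustration.

```python
def fetch_buggy(url, headers={}):
    # The default dict is created once, when the function is defined,
    # so every call that omits `headers` mutates the same object.
    headers["X-Request"] = url
    return headers


def fetch_fixed(url, headers=None):
    # Use None as a sentinel and build a fresh dict on each call.
    if headers is None:
        headers = {}
    headers["X-Request"] = url
    return headers


first = fetch_buggy("a")
second = fetch_buggy("b")
# `first` and `second` are the same dict, so the first result was overwritten.
shared = first is second

third = fetch_fixed("a")
fourth = fetch_fixed("b")
# The fixed version returns independent dicts.
independent = third is not fourth
```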
2012-01-11Add and enable p5-Text-Typography.schmonz1-1/+2
2012-01-11Initial import of p5-Text-Typography, a thin wrapper for Johnschmonz3-0/+35
Gruber's SmartyPants plugin for various CMSs.

SmartyPants is a web publishing utility that translates plain ASCII punctuation characters into "smart" typographic punctuation HTML entities. SmartyPants can perform the following transformations:
* Straight quotes ( " and ' ) into "curly" quote HTML entities
* Backticks-style quotes (``like this'') into "curly" quote HTML entities
* Dashes (-- and ---) into en- and em-dash entities
* Three consecutive dots (...) into an ellipsis entity
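The four transformations listed above can be approximated in a few lines. This is a simplified sketch, not the SmartyPants or Text::Typography implementation: the function name `educate` is hypothetical, the mapping of `--` to an en-dash and `---` to an em-dash is an assumption taken from the order in the description, and the quote handling ignores cases such as nested quotes and text inside HTML tags.

```python
import re


def educate(text):
    """Apply a simplified subset of SmartyPants-style transformations."""
    # Backtick-style quotes first, so the straight-quote rules never see them.
    text = text.replace("``", "&#8220;").replace("''", "&#8221;")
    # Three dashes before two, so "---" is not consumed as "--" plus "-".
    text = text.replace("---", "&#8212;").replace("--", "&#8211;")
    # Three consecutive dots become an ellipsis entity.
    text = text.replace("...", "&#8230;")
    # A straight quote at the start of the text, or after whitespace or an
    # opening bracket, is treated as an opening quote; the rest are closing.
    text = re.sub(r'(^|[\s(\[{])"', r"\1&#8220;", text)
    text = text.replace('"', "&#8221;")
    text = re.sub(r"(^|[\s(\[{])'", r"\1&#8216;", text)
    text = text.replace("'", "&#8217;")
    return text
```

For example, `educate('He said "hi" -- ok...')` yields `He said &#8220;hi&#8221; &#8211; ok&#8230;`.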
2012-01-10Indent.schmonz2-24/+24
2012-01-10Update to 1.2.8.0. From the changelog:schmonz2-7/+6
1.2.8.0 Tue Dec 13 14:45:07 UTC 2011 [Changes contributed by Olly Betts] - Add note to README about documentation, pointing out that the docs for Xapian are useful. - Improve note in README about moving to SWIG-generated wrappers in the next release series. 1.2.7.0 Wed Aug 10 06:14:53 UTC 2011 [Changes contributed by Olly Betts] - Note in README that the hand-coded XS wrappers are heading for retirement. 1.2.6.0 Sun Jun 12 11:55:42 UTC 2011 [Changes contributed by Adam Sjøgren] - Wrap new method QueryParser::set_max_wildcard_expansion(). (ticket#350) 1.2.5.0 Mon Apr 4 14:00:38 UTC 2011 [Changes contributed by Olly Betts] - simpleindex.pl - use 'while' to loop over input lines - 'foreach' reads them all in and then loops over them, while reads and processes line by line. - Add '1;' to the end of t/symbol-test/SymbolTest.pm. 1.2.4.0 Thu Dec 19 12:41:49 UTC 2010 [Changes contributed by Olly Betts] - Xapian exceptions were still being thrown as strings in Perl in some cases. Now all cases throw a subclass of Search::Xapian::Error. For compatibility with code which expects the previous behaviour these subclasses auto-stringify to the string which would have been thrown before. - Make sure all Perl files have 'use strict;' and 'use warnings;'. - Remove superfluous 'use Carp;' from generated error classes. - t/document.t,t/index.t,t/search.t: Test TermIterator::get_termname(). - Makefile.PL now looks for CXXFLAGS and CPPFLAGS passed on the command line, and adds them to CCFLAGS in the generated Makefile. [Changes contributed by Tim Brody] - New testcase t/10query.t. [Changes contributed by David F. Skoll and Dave O'Neill] - Tell DynaLoader to load the module with RTLD_GLOBAL so exceptions still work when multiple Perl modules which link to xapian-core are loaded. (ticket#522) 1.2.3.0 Tue Aug 24 06:03:12 UTC 2010 [Changes contributed by Tim Brody] - Allow user-specified ExpandDecider to be specified to get_eset(). 
[Changes contributed by Jess Robinson] - Fix bogus "can't find libtool" error when rerunning Makefile.PL and XAPIAN_CONFIG isn't explicitly specified. 1.1.4.0 Mon Feb 15 14:08:51 UTC 2010 [Changes contributed by Henry Combrinck] - Add wrappers for the spelling correction functionality (ticket#420). - Add wrapper for Database::close() (ticket#422). 1.1.3.0 Wed Nov 18 11:00:23 UTC 2009 [Changes contributed by Olly Betts] - Wrap new Xapian::SerialisationError class. - Ship simplematchdecider.pl example, which was added in 1.0.13.1 but accidentally not added to 1.1.1.0. - Work around odd rerunning of Makefile.PL by MakeMaker when srcdir != builddir. 1.1.1.0 Tue Jun 9 13:22:07 UTC 2009 [Changes contributed by Olly Betts] - Add Search::Xapian::MSet::items() method which returns an array tied to the MSet (much like Search::Xapian::Enquire::matches(), but you get easy access to the MSet object itself too). - Add the ability to tie an ESet to an array and a new Search::Xapian::ESet::items() method to make use of it. - Add new translated version of the simple examples from the Python bindings. - Add more fully featured examples: full-indexer.pl and full-searcher.pl. - Add better test coverage for MatchDecider. - Catch C++ exceptions from methods of Document and rethrow as Perl exceptions (ticket#284). - Add dependency to regenerate Makefile if Xapian.pm changes (since the former contains a version number extracted from the latter). 1.1.0.0 Thu Apr 22 13:56:31 GMT 2009 [Changes contributed by Andreas Marienborg and Olly Betts] - Xapian C++ exceptions classes are now wrapped and C++ exceptions are caught and rethrown in Perl as the wrapped classes. [Changes contributed by Olly Betts] - Xapian-core now uses libtool 2.2.x, which has required changes to the how we cram libtool into the MakeMaker-generated Makefile. However, there's still a wrinkle in this change - you can't currently run "make install" in a tree configured to use an uninstalled xapian-core. 
1.0.23.0 Fri Jan 14 04:18:24 UTC 2011 [Changes contributed by David F. Skoll and Dave O'Neill] - Tell DynaLoader to load the module with RTLD_GLOBAL so exceptions still work when multiple Perl modules which link to xapian-core are loaded (ticket#522). 1.0.22.0 Sun Oct 3 12:36:44 UTC 2010 [Changes contributed by Jess Robinson] - Fix bogus "can't find libtool" error when rerunning Makefile.PL and XAPIAN_CONFIG isn't explicitly specified. [Changes contributed by Tim Brody] - New testcase t/10query.t.
2012-01-10Update to 1.2.8. Changelog since 1.0.18 is way too long and highlightsschmonz5-21/+32
aren't obvious. Lots of bug fixes.
2012-01-10Update to 1.2.8. From the changelog:schmonz5-64/+26
1.2.8: API: * Add support to TermGenerator and QueryParser for indexing and searching CJK text using n-grams. Currently this is only enabled when the environmental variable XAPIAN_CJK_NGRAM is set to a non-empty value. portability: + Some fixes for warnings when cross-compiling to mingw. * tests/soaktest/soaktest.cc: With Sun's compiler, random() and srandom() aren't in <cstdlib> so we need to use <stdlib.h> instead. 1.2.7: API: * Document objects now track whether any document positions have been modified so that replacing a modified document can completely skip considering updating positions if none have changed. Currently the flint, chert, and brass backends implement this optimisation. A common case this speeds up is adding and/or removing boolean filter terms to/from existing documents - for example this gives an 18% speedup for adding tags in notmuch. portability: * Fix -Wshadow warnings from GCC 4.6. * Fix warning from GCC 3.3. 1.2.6: API: * QueryParser: + Add new set_max_wildcard_expansion() method to allow limiting the number of terms a wildcard can expand to. (ticket#350) + If default_op is OP_NEAR or OP_PHRASE then disable stemming of the terms, since we don't index positional information for stemmed terms by default. * Spelling correction was failing to correctly handle words which had the same trigram in an even number of times. portability: * Fix to build for mingw. 1.2.5: API: * Enquire::get_eset() now accepts a min_wt argument to allow the minimum wanted weight to be specified. Default is 0, which gives the previous behaviour. * QueryParser: Handle NEAR/<offset> and ADJ/<offset> where offset isn't an integer the same way at the end of the query as in the middle. * Replication: + Only keep $XAPIAN_MAX_CHANGESETS changeset files when generating a new one (previously this variable only controlled if we generated changesets or not). Closes ticket#278. + $XAPIAN_MAX_CHANGESETS is reread each time, rather than only when the database is opened. 
+ If you build Xapian with DANGEROUS mode enabled, changeset files now actually have the appropriate flag set (the reader will currently throw an exception, but that's better than quietly handling them incorrectly). portability: * api/compactor.cc: Add missing header <ctime> for time() (ticket#530). * api/compactor.cc: Use msvc_posix_rename() under __WIN32__ to atomically update stub file after compaction (ticket#525). * Fix uninitialised variable warnings with gcc -O3. * Eliminate std::string member of global static object used when compiled with --enable-log which was causes problems on Mac OS X. * Fix some issues highlighted by clang++ warnings. 1.2.4: API: * QueryParser: + Avoid a double free if Query construction throws an exception in a particular case. Fixes ticket#515. + Allow phrase generators between a probabilistic prefix and the term itself (e.g. path:/usr/local). + The correct window size wasn't being set in some cases when default_op was set to OP_PHRASE. * Enquire::get_mset(): + Avoid pointlessly trying to allocate lots of memory if the first document requested is larger than the size of the database. + An empty query now returns an MSet with firstitem set correctly - previously firstitem was always 0 in this case. * Document: Initialise docid to 0 when creating a document from scratch, as documented. * Compactor: + Move the database compaction and merging functionality into this new class, and make xapian-compact a simple wrapper around this class. (ticket#175) + Inputs can now be stub database directories or files, in which case the databases in the stub are used as inputs. + Add support for compacting to a stub database, which can be one of the inputs (for atomic update). + If spellings and/or synonyms were only present in some source databases, they weren't copied to the output database, but now they are. portability: * configure: Add support for --enable-sse=sse and --enable-sse=sse2 to allow control of which SSE instructions to use. 
* configure: Enable use of SSE maths on x86 by default with Sun's compiler. * configure: Beef up the test for whether -lm is required and add a special case to force it to be for Sun's C++ compiler - there's some interaction with libtool and/or shared objects which means that the previous configure test didn't think -lm is needed here when it is. * Fix to build on OpenBSD 4.5 with GCC 3.3.5. * Need to avoid excess precision on m68k when targeting models 68010, 68020, 68030 as well as 68000. * Fix compilation with Sun's C++ compiler. * Fix testsuite to build on Solaris < 10. 1.2.3: API: * Database::get_spelling_suggestion() will now suggest a correction even if the passed word is in the dictionary, provided the correction has at least the same frequency. Partly addresses #225. * QueryParser: + Fix handling of groups of terms which are all stopwords - in situations where this causes a problem we now disable stopword checks for such groups. (ticket#245) + Fix to be smarter about handling a boolean filter term containing ".." in the presence of valuerangeprocessors. portability: * configure: Don't pass -mtune=generic unless GCC >= 4.2 is in use (ticket#492). * Fix handling of some obscure cases of resolving relative paths on Microsoft Windows. (ticket#243). * Optimise closing of all unwanted file descriptors after forking by using closefrom() if available, and otherwise providing our own implementation (optimised to some extent for many platforms). * Fix test harness to build under Microsoft Windows (ticket#495). 1.2.2: portability: * Revert 1.2.1 change to visibility of Xapian::Weight's copy constructor as it making it private broke compilation with GCC 4.1 (which seems to be a bug in this compiler version). * tests/harness/testsuite.cc: Need <cstdio> for sprintf(). Fixes compilation error which was masked if valgrind was installed. (ticket#489) pkgsrc changes: * Remove options (the "quartz" backend was unrelated to Darwin and no longer exists). 
* Unconditionally buildlink libuuid. If that's overzealous, improve its builtin detection.
2012-01-09Recursive bump from boost-libs shlib bump.obache2-4/+4
2012-01-08Add and enable p5-Text-Markdown-Discount.schmonz1-1/+2
2012-01-08Initial import of Text::Markdown::Discount, a Perl extension interfaceschmonz3-0/+28
for "Discount", an implementation of John Gruber's "markdown" in C developed by David Loren Parsons.
2012-01-04Requires Berkeley DB on platforms that don't have db1.85 in libc.dholland1-1/+2
Build fix, no revbump.
2012-01-02Allow 2012 in man page dates.wiz2-5/+5
2012-01-01Fix build on 5.1/amd64 (PR 45691) -- cast long tv_sec to time_t. Notshattered3-2/+17
a problem in -current, where tv_sec is time_t.
2011-12-29Make sure that the gsed package always has a 'gsed' executable.sbd2-3/+14
Bump PKGREVISION
2011-12-29Update groonga to 1.2.9.obache3-8/+9
Release 1.2.9 - 2011/12/29
--------------------------

Improvements
^^^^^^^^^^^^

* Supported Fedora 16.
* Dropped Fedora 15 support.
* [groonga] Improved the default server ID address to work on unresolved host name environment. [Reported by @uzulla]
* Supported MAP_HUGETLB.
* [admin] Supported throughput chart.
* Stopped adding nul character in ``grn_itoh()``. [#1194] [Reported by SHIDARA Yoji]
* Added ``grn_obj_get_values()``.
* Added ``grn_obj_delete_by_id()``.
* Supported string vector column for query expansion. [#1216]
* Added ``--filter`` option to :doc:`/commands/delete` to delete many records at once. [#1225]
* Supported approximate type customization for :doc:`/functions/geo_in_circle` and :doc:`/functions/geo_distance`. [#1226]
* Made ``geo_distance2()`` and ``geo_distance3()`` deprecated.
* Changed to use ``null`` instead of ``""`` for empty geo point value in JSON output.
* Almost supported MessagePack output. [#1215] [Worked by SHIDARA Yoji]
* Added missing newlines after drilldown result tags in XML output.
* Supported truncate for grn_dat.
* Supported longest common prefix search by grn_dat.

Fixes
^^^^^

* [windows] Fixed inverted map type.
* Fixed -Wno- compiler flag detection. [Patch by Arnaud Fontaine]
* Fixed a problem that ``groonga --version`` reports wrongly about MeCab. [#1209] [Patch by SHIDARA Yoji]
* Added missing lock into ``grn_obj_remove()``.
* Fixed Content-Type on error. [#1220] [Patch by SHIDARA Yoji]
* Fixed a problem that deleting SIS (Semi Infinite String) may leave garbage behind.
2011-12-26Don't include partial RCS ID to confuse the build info generation.joerg2-4/+4
2011-12-26Use LIBES to pass LDFLAGS to the build process.sbd1-1/+2
2011-12-18Add ruby-multi_json (Hi taca)sbd1-1/+2
2011-12-17Change default PKGNAME scheme for PECL packages.obache3-3/+6
Drop ${PHP_BASE_VARS} from PKGVERSION by default. It used to be required to support multiple PHP versions, but since the PHP-version-based ${PHP_PKG_PREFIX} was introduced, that trick is no longer required. In addition, such a version name scheme causes an unwanted version bump whenever the base PHP version is bumped, and it is hard to use in DEPENDS patterns.

To avoid downgrading packages that use the legacy version scheme, PECL_LEGACY_VERSION_SCHEME is introduced. If it is defined, the current version scheme is still used for the currently supported PHP versions (5 and 53), but instead of ${PHP_BASE_VARS}, the current fixed PHP base version in pkgsrc is used, to avoid unwanted version bumps from updates of the PHP base package. With newer PHP (54 and so on), the new version scheme will be used if it is defined. This trick will no longer be required and should be removed once php5 and php53 are gone from pkgsrc.
2011-12-17Remove duplicated RUBY_VERSION_SUPPORTED.taca1-4/+1
2011-12-17Update textproc/ruby-kramdown to 0.13.4.taca3-7/+11
Changes
* 1 minor change:
  - Added a converter that extracts the TOC of a document (requested by Brendan Hay). Note that this is only useful if you use kramdown as a library!
* 7 bug fixes
  - Fixed a typo: It should be --output and not --ouput (patch by postmodern)
  - Fixed HTML converter to correctly output empty span tags (patch by John Croisant)
  - Fixed bug RF#29350: Parsing of HTML tags with mismatched case now works
  - Fixed bug RF#29426: Content of style tags is treated as raw text now
  - HTML converter now uses rel instead of rev to be HTML5 compatible (patch by Joe Fiorini)
  - Fixed Ruby 1.9.3 related warnings
  - Fixed HTML parser to work around an implementation change of Array#delete_if in Ruby 1.9.3
2011-12-16textproc/py-enchant: Fix file permissions errormarino1-1/+5
The bad file permissions on ispell/README.txt prevented the package from building on DragonFly with PKG_DEVELOPER=yes. I'm not sure how it's passing on NetBSD unless the bulk reports I'm seeing aren't using that option.
2011-12-16Fix build on SunOS with gcc>=4.6hans1-1/+5
2011-12-16Importing textproc/ruby-multi_json package version 1.0.4.taca4-0/+43
MultiJSON Lots of Ruby libraries utilize JSON parsing in some form, and everyone has their favorite JSON library. In order to best support multiple JSON parsers and libraries, multi_json is a general-purpose swappable JSON backend library.