path: root/textproc
Age  Author  Files  Lines  Commit message
2004-05-19  minskim  1  -1/+3  USE_LANGUAGES= c c++
    USE_LIBTOOL= yes
2004-05-19  minskim  1  -1/+2  Bump PKGREVISION due to major version bump of the opensp library.
2004-05-19  minskim  6  -82/+26  Update opensp to 1.5.1.
    Changes:
    * Enable run-time selection of message format with the SP_MESSAGE_FORMAT environment variable. Value is one of XML, NONE, TRADITIONAL.
    * When validating/parsing a document using http, OpenSP will now follow any redirect headers/requests from the server.
    * The environment variable SP_HTTP_USER_AGENT can be used to specify a User-Agent: header.
    * The environment variable SP_HTTP_ACCEPT can be used to specify Accept: headers.
    * A number of enhancements have been made to the osx tool: security fixes in the handling of output files; addition of the "preserve case" option.
    * A testing framework together with some initial tests has been added. Currently there are 22 tests, 6 of which fail.
    * Support for Mac OS X/Darwin has been improved.
    * Build infrastructure and localisation fixes and enhancements.
    * Improved compiler support.
2004-05-18  recht  3  -217/+54  update to 0.2.1
    Patch provided by Michal Pasternak in PR pkg/25611.
    Release 0.2.1 (11 May 2004): Minor bugfixes and test improvements.
    Release 0.2.0 (20 Feb 2004): Reorganized code into modules; converted some iteration constructs to Python iterators and generators. All text processing internally is now handled as Unicode. Analyzers are back as generators of tokens. The changes to the code to make it more pythonic appear to have resulted in trading space for time: preliminary tests indicate about a 5% speedup on one dataset in exchange for a 20% increase in memory usage.
2004-05-17  seb  1  -2/+1  Garbage collect BUILDLINK_PKGBASE.<pkg> from buildlink3: it has not been
    used since revision 1.139 of mk/buildlink3/bsd.buildlink3.mk.
2004-05-16  seb  3  -2/+18  Fix build with older GNU libstdc++ (mentioned in PR pkg/25590).
    While here, add support for test target.
2004-05-15  seb  5  -34/+29  Update to version 1.9.
    Changes since last packaged version:
    * --no-doc option added to cancel the --doc option even if it is implied (e.g., when css is given) (as suggested by Keith Lea and Grant McLean)
    * deal with \r correctly (reported by barrett@9hells.org)
    * added scanner for language LUA (thanks to Marc Côté)
    * added scanner for CAML and SML (with the help of Jean-Baptiste Rouquier and James Riely)
    * fixed a bug in C++ scanner concerning tabs after # (reported by Don Stauffer).
    * If not specified, the source language will be guessed from the input file extension.
    * Added src-hilite-lesspipe.sh, a script that can be used with less in order to highlight the files processed with less (suggested by Konstantine Serebriany)
    * fixed a bug in perl scanning when \" is used in regular expressions (reported by Geir Nilsen)
    * html attribute values are generated in quotes (bug fixed by Patrick Wagstrom)
    * can generate anchors for line numbers (thanks to Oliver Fischer)
2004-05-13  xtraeme  1  -2/+2  Sort.
2004-05-12s/netbsd.org/NetBSD.org/igrant1-2/+2
2004-05-09  wiz  1  -22/+0  Unused.
2004-05-09  snj  3  -10/+15  Convert to buildlink3.
2004-05-09  wiz  1  -27/+0  Unused.
2004-05-09  minskim  1  -1/+2  Add and enable p5-XML-Sablotron.
2004-05-09  minskim  4  -0/+46  Import p5-XML-Sablotron from pkgsrc-wip. Packaged by Adam Migus and
    modified by me. This perl module is an encapsulation of the XSLT processor called Sablotron (textproc/sablotron). If you don't know what XSLT is, look at the http://www.w3.org/ site. If you don't know what Sablotron is, look at http://www.gingerall.com/.
2004-05-09  minskim  1  -2/+2  Update MASTER_SITES.
2004-05-08  uebayasi  1  -1/+3  Enable pkgviews installation.
2004-05-08  wiz  1  -8/+0  Unused.
2004-05-08  jschauma  2  -1/+24  Under Irix, vsnprintf(3) happily truncates longer strings and returns
    the number of size. This led to some of the commands being truncated and not executing appropriately. (The function in question was make_message in ./src/preproc/html/pre-html.cpp.) Patch this to also behave correctly with Irix' vsnprintf(3) family. This should address PR pkg/22563.
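    For illustration only, here is a minimal sketch of the kind of retry loop that copes with a truncating vsnprintf(3). It is not the patch from this commit: format_alloc and FORMAT_ALLOC_MAX are hypothetical names, and the sketch assumes that a truncating implementation reports either a negative value or the number of bytes it actually stored. Because a return value of exactly size - 1 could then mean either "exact fit" or "truncated", the loop treats it as a possible truncation and retries with a larger buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical upper bound so a persistent vsnprintf error cannot loop forever. */
#define FORMAT_ALLOC_MAX ((size_t)1 << 20)

/*
 * Hypothetical helper (not the groff make_message function): format into a
 * heap buffer, growing it until the result is known to fit completely.
 * Returns NULL on allocation failure or if the cap is exceeded.
 * The caller frees the returned string.
 */
char *format_alloc(const char *fmt, ...)
{
    size_t size = 16;   /* deliberately small so the retry path is exercised */
    char *buf = NULL;

    assert(fmt != NULL);

    for (;;) {
        if (size > FORMAT_ALLOC_MAX) {
            free(buf);
            return NULL;
        }

        char *grown = realloc(buf, size);
        if (grown == NULL) {
            free(buf);
            return NULL;
        }
        buf = grown;

        va_list ap;
        va_start(ap, fmt);
        int n = vsnprintf(buf, size, fmt, ap);
        va_end(ap);

        /* Accept only results strictly shorter than size - 1. A value of
         * exactly size - 1 may be a silent truncation on some systems. */
        if (n >= 0 && (size_t)n < size - 1)
            return buf;

        if (n >= 0 && (size_t)n >= size)
            size = (size_t)n + 2;   /* C99 behavior: n is the length needed */
        else
            size *= 2;              /* negative or ambiguous result: grow and retry */
    }
}
```

    A caller would use it as `char *cmd = format_alloc("%s %s", prog, arg);`, check the result for NULL, and free(3) it when done. The cost of the conservative check is one extra retry when the output happens to fill the buffer exactly.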
2004-05-07  hrs  3  -534/+578  Update to 1.78.
    Since version 1.67, the distfiles have moved to sourceforge. A lot of bugfixes, improvements, and more localization support have been added. This pkgsrc update was reviewed by hubertf@.
2004-05-07  seb  1  -2/+2  Shorter HOMEPAGE.
2004-05-07  wiz  1  -2/+2  Reset maintainer to tech-pkg@ (from ad@, since he is not working on them
    any longer).
2004-05-07  taca  1  -1/+2  Include converters/libiconv/buildlink3.mk since chasen now uses libiconv(3).
    This should fix PR pkg/25484 by diro at nixsys.bz. (I hadn't noticed it, thanks much.)
2004-05-07  xtraeme  4  -8/+8  Drop maintainership; I don't have enough free time to maintain
    all these packages.
2004-05-06  kristerw  2  -4/+4  Correct path to devel/darts.
2004-05-06  minskim  1  -5/+9  Register .dtd files in the system catalog and bump PKGREVISION.
    OK'ed by jmmv@.
2004-05-06  minskim  3  -5/+16  Avoid installing nonexistent files and directories for them. Also
    remove related @dirrm entries, which only cause annoying error messages with pkg_delete. PKGREVISION will be bumped in a minute with another fix for this package.
2004-05-06  minskim  3  -9/+9  Quote arguments properly for xmlcatmgr. OK'ed by jmmv@.
2004-05-06  taca  3  -3/+16  Use c++ for link since libchasen.so is linked with libstdc++.
    Bump PKGREVISION.
2004-05-06  taca  9  -160/+50  Update namazu package to 2.0.13.
    Overview of Changes in Namazu 2.0.13 - April 14, 2004
    * Include File::MMagic 1.20.
    * Add -X and --check-filesize options for mknmz text-processing.
    * Add Polish translations. (Contributed by Kryzystof Drewicz.)
    * Add German translations. (Contributed by Gerald Pfeifer.)
    * Add new filters (Ichitaro variants, OpenOffice.org, RTF, apachecache, MP3)
    * Add new filter (Macbinary)
    * Adapt new filter programs (wvWare 0.7.4, xpdf 2.02 - 3.00)
    * Add new directives for namazurc (SUICIDE_TIME, REGEX_SEARCH) (to prevent possibility of remote DoS, reported by sheepman.)
    * Add new directives for mknmzrc (HTML_ATTRIBUTES) (This pattern specifies attribute of a HTML tag which should be searchable.)
    * Change soname (LTVERSION 7:0:0, lib/libnmz.so.6 -> lib/libnmz.so.7)
    * Support $WAKATI="module_mecab"; in mknmzrc. (experimental)
    * Fix MacOSX compilation problem (getopt.c deviation from gengetopt-2.5)
    * Fix some bugs and possibility of security hole.
2004-05-06  taca  1  -2/+2  Update chasen (meta-package) to 2.3.3; chasen-base-2.3.3 and ipadic-2.7.0.
2004-05-06  taca  7  -42/+52  Update ipadic package to 2.7.0.
    ipadic-2.7.0 (2003/11/15)
    - Parameters are updated
    - CTYPE/CFORM are modified
    ipadic-2.6.3 (2003/8/15)
    - Parameters are updated
    ipadic-2.6.2 (2003/8/6)
    - Parameters are updated
    ipadic-2.6.1 (2003/8/2)
    - for ChaSen-2.3.2
      -- Makefile.am is modified
    - chasenrc.in
      -- COMPOSIT -> COMPOSIT_POS
    ipadic-2.6.0 (2003/6/14)
    - Parameters are updated
    - added new words from Prof. Tsurumaru (Nagasaki Univ.)
      -- listed in http://chasen.aist-nara.ac.jp/~masayu-a/ipadic/arch/ipadic-2.6.0-diff.txt
    - added new words
      -- listed in http://chasen.aist-nara.ac.jp/~masayu-a/ipadic/arch/ipadic-2.6.0-add.txt
    - for ChaSen-2.3.1
      -- removed .pat, .ary
      -- configuration for Double Array library Darts http://cl.aist-nara.ac.jp/%7etaku-ku/software/darts/
      -- option -i (Character Code) e:EUC-JP, s:Shift_JIS, w:UTF-8, a:ISO-8859-1
    ipadic-2.5.1 (2002/1/30)
    - Parameters are updated
    - added new words
      -- listed in http://chasen.aist-nara.ac.jp/~masayu-a/ipadic/arch/ipadic-2.5.1-newword.txt
    ipadic-2.5.0 (2001/4/13)
    - SJIS(CRLF) problem is fixed
    - Japanese Reading modification
2004-05-06  taca  9  -62/+59  Update chasen-base package to 2.3.3.
    ChaSen 2.3.3 (2003/08/16)
    - bug fix
    - print null strings with empty readings and pronunciations.
    - read the paths of chasenrc and grammar files from the registry on Windows.
    ChaSen 2.3.2 (2003/08/01)
    - bug fix
    - new dictionary format for registering conjugation form specified words.
    ChaSen 2.3.1 (2003/06/19)
    - removed PATDIC, SUFDIC
    - introduced -i option (Character Encoding) (e: EUC-JP, s:Shift_JIS, w:UTF-8, a:ISO-8859-1)
    ChaSen 2.3.0 (2003/02/24)
    - introduced a double array library "Darts" for dictionary look up
    - bug fix for sortdic
    - extension for the module reading `cforms.cha'
      -- to change BASE_FORM name
    - increased the number of dictionaries (*.int/pat/ary) from 5 to 32.
    - removed server and client mode
    - removed command interpreter
2004-05-06  taca  1  -5/+9  Update chasen's version to 2.3.3.
    ChaSen 2.3.3 (2003/08/16)
    - bug fix
    - print null strings with empty readings and pronunciations.
    - read the paths of chasenrc and grammar files from the registry on Windows.
    ChaSen 2.3.2 (2003/08/01)
    - bug fix
    - new dictionary format for registering conjugation form specified words.
    ChaSen 2.3.1 (2003/06/19)
    - removed PATDIC, SUFDIC
    - introduced -i option (Character Encoding) (e: EUC-JP, s:Shift_JIS, w:UTF-8, a:ISO-8859-1)
    ChaSen 2.3.0 (2003/02/24)
    - introduced a double array library "Darts" for dictionary look up
    - bug fix for sortdic
    - extension for the module reading `cforms.cha'
      -- to change BASE_FORM name
    - increased the number of dictionaries (*.int/pat/ary) from 5 to 32.
    - removed server and client mode
    - removed command interpreter
2004-05-05  cjep  2  -6/+6  Update to 20040505. Add --line-buffered option.
2004-05-05  recht  2  -8/+9  bl3ify
2004-05-05  snj  1  -24/+0  No longer used.
2004-05-05  snj  9  -37/+37  Convert to buildlink3.
2004-05-05  snj  2  -58/+0  No longer used.
2004-05-04  snj  44  -132/+128  Convert to buildlink3.
2004-05-04  wiz  1  -25/+0  Unused.
2004-05-03  danw  3  -6/+11  darwin is anal about cpp being a *C* preprocessor, so use m4 here
    instead. Also fix a braino in the Makefile that resulted in it always using the bytecode compiler rather than the native compiler. PKGREVISION++
2004-05-03  wiz  5  -10/+48  Convert to bl3 and make build with gcc3.
2004-05-02  jmmv  3  -12/+12  Update to 2.1:
    Second stable version of the 2.x branch, released on 2004/05/02.
    * Fixed an attribute name when parsing the `uri' tag in XML catalogs; it expects `name', not `uriId'.
    * Fixed a warning message when removing entries from an XML catalog.
    * Fixed several warnings when building mem.c code in a system with glibc 2.[23].x and -O2 enabled.
    * Added the `-p' flag which changes the behavior of the `add' action so that new entries are prepended instead of appended.
    * Improved consistency of the lookup action so that it behaves equally for SGML and XML catalogs (this includes making XML lookup show all matching entries).
    * Documentation is now installed in an unversioned directory by default.
2004-04-30  reed  2  -9/+18  This configures the to-be-installed mdoc.local file so the
    "volume-operating-system" macro is ${OPSYS}. It also sets the default .Os value to "pkgsrc" as suggested by wiz@. (It was hard-coded "NetBSD\~1.6".) Usually the mdoc.local "volume-operating-system" definition is for the operating system name often displayed at the top of man pages, while "operating-system" is for the default .Os value (operating system and version/release) and is usually displayed at the bottom of the man page. Bump PKGREVISION. This closes my PR #23100.
2004-04-29  minskim  1  -1/+2  Add and enable p5-Text-Quoted-1.5.
2004-04-29  minskim  4  -0/+34  Import p5-Text-Quoted from pkgsrc-wip. Packaged by Dieter Roelants.
    "Text::Quoted" examines the structure of some text which may contain multiple different levels of quoting, and turns the text into a nested data structure. The structure is an array reference containing hash references for each paragraph belonging to the same author. Each level of quoting recursively adds another list reference.
2004-04-29  sketch  1  -2/+2  Fix my email address.
2004-04-28  wiz  2  -46/+1  Remove support for bl2 since remaining packages
    using this have been converted to bl3.
2004-04-27  jmmv  4  -4/+8  Bl3ify (so that catalogs.mk uses the bl3 file); per wiz@ request.
2004-04-27  wiz  1  -0/+42  Re-instate for now (catalogs.mk _sets_ USE_BUILDLINK2).