author     sno <sno@pkgsrc.org>  2010-08-19 19:50:36 +0000
committer  sno <sno@pkgsrc.org>  2010-08-19 19:50:36 +0000
commit     e737f2cb6c2e0e28bb57e2acc9bd14994381d0e3
tree       af839ca5faad64a9775df80980ec8bef6492ff39 /textproc/p5-Lingua-EN-Tagger
parent     1fb984aa34eeca10bf5968af78eac556098ca5fb
download   pkgsrc-e737f2cb6c2e0e28bb57e2acc9bd14994381d0e3.tar.gz
Importing package for Perl5 module Lingua::EN::Tagger 0.16 into
textproc/p5-Lingua-EN-Tagger. It is a dependency of the scheduled import of
Lingua::EN::Inflect::Phrase, which in turn is a dependency of the scheduled
update of DBIx::Class::Schema::Loader.

The module is a probability-based, corpus-trained tagger that assigns
part-of-speech (POS) tags to English text using a lookup dictionary and a set
of probability values. It chooses tags by conditional probability: it examines
the preceding tag to determine the appropriate tag for the current word.
Unknown words are classified according to their morphology, or can be set to
be treated as nouns or other parts of speech.

The tagger also extracts as many nouns and noun phrases as it can, using a
set of regular expressions.
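
The tagging approach described in the commit message can be illustrated with a small sketch. This is not the module's implementation or API: it is a toy Python model, assuming a made-up lexicon, made-up transition weights, and hypothetical helper names (`guess_unknown`, `tag`, `noun_phrases`). It only shows the general idea of picking each tag from dictionary candidates weighted by the preceding tag, falling back to morphology for unknown words, and then pulling noun phrases out with a regular expression.

```python
import re

# Hypothetical toy lexicon: word -> candidate tags with made-up weights.
LEXICON = {
    "the": {"DET": 1.0},
    "old": {"JJ": 1.0},
    "dog": {"NN": 0.9, "VB": 0.1},
    "can": {"MD": 0.6, "NN": 0.3, "VB": 0.1},
    "bark": {"VB": 0.7, "NN": 0.3},
}

# Hypothetical transition weights: how likely a tag is, given the previous tag.
TRANSITIONS = {
    "START": {"DET": 0.6, "NN": 0.2, "JJ": 0.1, "MD": 0.05, "VB": 0.05},
    "DET": {"NN": 0.6, "JJ": 0.35, "VB": 0.03, "MD": 0.02},
    "JJ": {"NN": 0.8, "JJ": 0.15, "VB": 0.03, "MD": 0.02},
    "NN": {"MD": 0.4, "VB": 0.4, "NN": 0.15, "DET": 0.05},
    "MD": {"VB": 0.9, "NN": 0.05, "DET": 0.05},
    "VB": {"DET": 0.5, "NN": 0.3, "JJ": 0.2},
}


def guess_unknown(word, default="NN"):
    """Classify an out-of-lexicon word by crude suffix rules (an assumption)."""
    if word.endswith("ly"):
        return {"RB": 1.0}
    if word.endswith("ing") or word.endswith("ed"):
        return {"VB": 1.0}
    return {default: 1.0}  # configurable fallback, noun by default


def tag(words):
    """Greedily pick, for each word, the candidate tag that scores best
    given the tag chosen for the preceding word."""
    prev = "START"
    tagged = []
    for word in words:
        key = word.lower()
        candidates = LEXICON.get(key) or guess_unknown(key)
        trans = TRANSITIONS.get(prev, {})
        # Unseen transitions get a small floor weight instead of zero.
        best = max(candidates, key=lambda t: candidates[t] * trans.get(t, 0.01))
        tagged.append((word, best))
        prev = best
    return tagged


# Optional adjectives followed by one or more nouns, over "word/TAG" tokens.
NP_PATTERN = re.compile(r"(?:\S+/JJ\b ?)*(?:\S+/NN\b ?)+")


def noun_phrases(tagged):
    """Extract simple noun phrases from a tagged sentence with a regex."""
    text = " ".join(f"{w}/{t}" for w, t in tagged)
    return [
        " ".join(tok.rsplit("/", 1)[0] for tok in m.group().split())
        for m in NP_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sentence = tag("The old dog can bark".split())
    print(sentence)
    print(noun_phrases(sentence))
```

In this toy model "can" is tagged as a modal after a noun but as a noun after a determiner, which is the effect of conditioning on the preceding tag. The greedy left-to-right choice is a simplification made for brevity.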
Diffstat (limited to 'textproc/p5-Lingua-EN-Tagger')
-rw-r--r--  textproc/p5-Lingua-EN-Tagger/DESCR      9
-rw-r--r--  textproc/p5-Lingua-EN-Tagger/Makefile  25
-rw-r--r--  textproc/p5-Lingua-EN-Tagger/distinfo   5
3 files changed, 39 insertions, 0 deletions
diff --git a/textproc/p5-Lingua-EN-Tagger/DESCR b/textproc/p5-Lingua-EN-Tagger/DESCR
new file mode 100644
index 00000000000..e936e467b90
--- /dev/null
+++ b/textproc/p5-Lingua-EN-Tagger/DESCR
@@ -0,0 +1,9 @@
+The module is a probability based, corpus-trained tagger that assigns POS
+tags to English text based on a lookup dictionary and a set of probability
+values. The tagger assigns appropriate tags based on conditional
+probabilities - it examines the preceding tag to determine the appropriate
+tag for the current word. Unknown words are classified according to word
+morphology or can be set to be treated as nouns or other parts of speech.
+
+The tagger also extracts as many nouns and noun phrases as it can, using a
+set of regular expressions.
diff --git a/textproc/p5-Lingua-EN-Tagger/Makefile b/textproc/p5-Lingua-EN-Tagger/Makefile
new file mode 100644
index 00000000000..744d4096fab
--- /dev/null
+++ b/textproc/p5-Lingua-EN-Tagger/Makefile
@@ -0,0 +1,25 @@
+# $NetBSD: Makefile,v 1.1.1.1 2010/08/19 19:50:36 sno Exp $
+#
+
+DISTNAME= Lingua-EN-Tagger-0.16
+PKGNAME= p5-${DISTNAME}
+#PKGREVISION= 1
+CATEGORIES= textproc perl5
+MASTER_SITES= ${MASTER_SITE_PERL_CPAN:=Lingua/}
+
+MAINTAINER= pkgsrc-users@NetBSD.org
+HOMEPAGE= http://search.cpan.org/dist/Lingua-EN-Tagger/
+COMMENT= Part-of-speech tagger for English natural language processing
+LICENSE= gnu-gpl-v3
+
+DEPENDS+= p5-HTML-Parser>=3.45:../../www/p5-HTML-Parser
+DEPENDS+= p5-Lingua-Stem>=0.81:../../textproc/p5-Lingua-Stem
+DEPENDS+= p5-Memoize-ExpireLRU>=0.55:../../devel/p5-Memoize-ExpireLRU
+
+USE_LANGUAGES= # empty
+PERL5_PACKLIST= auto/Lingua/EN/Tagger/.packlist
+
+PKG_DESTDIR_SUPPORT= user-destdir
+
+.include "../../lang/perl5/module.mk"
+.include "../../mk/bsd.pkg.mk"
diff --git a/textproc/p5-Lingua-EN-Tagger/distinfo b/textproc/p5-Lingua-EN-Tagger/distinfo
new file mode 100644
index 00000000000..1b7a99e0c74
--- /dev/null
+++ b/textproc/p5-Lingua-EN-Tagger/distinfo
@@ -0,0 +1,5 @@
+$NetBSD: distinfo,v 1.1.1.1 2010/08/19 19:50:36 sno Exp $
+
+SHA1 (Lingua-EN-Tagger-0.16.tar.gz) = 3908945b39d7603df34c49045c0aefeb10615f1a
+RMD160 (Lingua-EN-Tagger-0.16.tar.gz) = add56f25ba3ecabd29f40e60272ef22ba94d0a28
+Size (Lingua-EN-Tagger-0.16.tar.gz) = 262264 bytes