path: root/textproc/p5-Lingua-EN-Tagger/DESCR
Age: 2010-08-19
Author: sno
Files: 1
Lines: -0/+9
Commit message:

Importing package for Perl5 module Lingua::EN::Tagger 0.16 into textproc/p5-Lingua-EN-Tagger as a dependency of the scheduled import of Lingua::EN::Inflect::Phrase, which is a dependency of the scheduled update of DBIx::Class::Schema::Loader.

The module is a probability-based, corpus-trained tagger that assigns POS tags to English text based on a lookup dictionary and a set of probability values. The tagger assigns appropriate tags based on conditional probabilities: it examines the preceding tag to determine the appropriate tag for the current word. Unknown words are classified according to word morphology or can be set to be treated as nouns or other parts of speech. The tagger also extracts as many nouns and noun phrases as it can, using a set of regular expressions.
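
The tagging approach described above (dictionary lookup, a choice conditioned on the preceding tag, and a fallback for unknown words) can be illustrated with a small sketch. This is not the code of Lingua::EN::Tagger, and it is written in Python rather than Perl so that it runs without the module installed. The lexicon, tag names, probability values, and the function names `guess_unknown` and `tag` are all invented for illustration.

```python
# Toy sketch of a dictionary-plus-previous-tag POS tagger.
# ASSUMPTION: every word, tag, and probability below is made up;
# the real module is trained on a corpus and uses its own tag set.

# P(tag | word): the lookup dictionary.
LEXICON = {
    "the": {"DET": 1.0},
    "dog": {"NN": 0.9, "VB": 0.1},
    "can": {"MD": 0.6, "NN": 0.3, "VB": 0.1},
    "run": {"VB": 0.7, "NN": 0.3},
}

# P(tag | previous tag): the conditional probabilities.
TRANSITIONS = {
    ("<s>", "DET"): 0.5,
    ("DET", "NN"): 0.8,
    ("NN", "MD"): 0.3,
    ("NN", "NN"): 0.2,
    ("NN", "VB"): 0.3,
    ("MD", "VB"): 0.9,
}
DEFAULT_TRANSITION = 0.01  # small value for tag pairs not listed above
UNKNOWN_TAG = "NN"         # configurable default for unknown words


def guess_unknown(word):
    """Classify an unknown word by simple morphology, else use the default."""
    if word.endswith("ly"):
        return {"RB": 1.0}
    if word.endswith("ing"):
        return {"VBG": 1.0}
    return {UNKNOWN_TAG: 1.0}


def tag(words):
    """Greedily pick, for each word, the tag that maximizes
    P(tag | word) * P(tag | previous tag)."""
    previous = "<s>"  # sentence-start marker
    tagged = []
    for word in words:
        candidates = LEXICON.get(word.lower()) or guess_unknown(word)
        best = max(
            candidates,
            key=lambda t: candidates[t]
            * TRANSITIONS.get((previous, t), DEFAULT_TRANSITION),
        )
        tagged.append((word, best))
        previous = best
    return tagged


if __name__ == "__main__":
    print(tag(["The", "dog", "can", "run"]))
    print(tag(["the", "can"]))
```

With these invented numbers, "can" is tagged as a modal after a noun but as a noun after a determiner, which is the effect of examining the preceding tag. Noun-phrase extraction by regular expression is left out of the sketch.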