Lingua::EN::Tagger is a probability-based, corpus-trained tagger that
assigns part-of-speech (POS) tags to English text using a lookup
dictionary and a set of probability values. It chooses each tag from
conditional probabilities: it looks at the preceding tag to decide
which tag best fits the current word. Unknown words are classified
according to their morphology, or can be configured to be treated as
nouns or other parts of speech.
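
The tagging idea described above can be sketched in a few lines of
Python. This is only an illustration of the general technique, not the
module's Perl implementation: the lexicon, the transition table, the
suffix rules, and the names LEXICON, TRANSITIONS, guess_unknown, and
tag are all invented for this example, and the probabilities are made
up rather than trained on a corpus.

```python
# Toy data, invented for illustration; not taken from the module.
# LEXICON maps a word to the tags it can take and how likely each is.
LEXICON = {
    "the": {"det": 1.0},
    "dog": {"nn": 0.9, "vb": 0.1},
    "can": {"md": 0.6, "nn": 0.2, "vb": 0.2},
    "run": {"vb": 0.7, "nn": 0.3},
}

# TRANSITIONS maps the preceding tag to the likelihood of the next tag.
TRANSITIONS = {
    "<s>": {"det": 0.6, "nn": 0.2, "md": 0.1, "vb": 0.1},
    "det": {"nn": 0.8, "md": 0.01, "vb": 0.01},
    "nn": {"md": 0.3, "nn": 0.2, "vb": 0.1},
    "md": {"vb": 0.9, "nn": 0.05},
}

SMOOTH = 1e-4  # fallback for tag pairs missing from the toy table


def guess_unknown(word, default_tag=None):
    """Tag a word that is not in the lexicon.

    If default_tag is given, always use it; otherwise fall back to a
    few hypothetical suffix (morphology) rules.
    """
    if default_tag is not None:
        return {default_tag: 1.0}
    if word.endswith("ing"):
        return {"vbg": 1.0}
    if word.endswith("ly"):
        return {"rb": 1.0}
    return {"nn": 1.0}


def tag(words, default_tag=None):
    """Greedily pick, for each word, the tag that maximizes
    P(tag | word) * P(tag | preceding tag)."""
    prev = "<s>"
    tagged = []
    for word in words:
        key = word.lower()
        candidates = LEXICON.get(key) or guess_unknown(key, default_tag)
        row = TRANSITIONS.get(prev, {})
        best = max(candidates, key=lambda t: candidates[t] * row.get(t, SMOOTH))
        tagged.append((word, best))
        prev = best
    return tagged
```

In this sketch "can" comes out as a noun after a determiner ("the can")
but as a modal after a noun ("the dog can run"), which is the effect of
consulting the preceding tag.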

The tagger also recursively extracts as many nouns and noun phrases as
it can, using a set of regular expressions.
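
As a rough illustration of regular-expression-based noun phrase
extraction, the Python sketch below matches runs of tagged tokens and
then recursively lists shorter phrases inside each match. The pattern,
the tag names, the "word/tag" encoding, and the functions noun_phrases
and sub_phrases are assumptions made for this example; the module's own
expressions and its exact notion of recursion may differ.

```python
import re

# Hypothetical pattern: optional determiner, any adjectives, then one
# or more nouns, over text encoded as "word/tag " tokens.
NP_PATTERN = re.compile(r"(?<!\S)(?:\S+/det )?(?:\S+/jj )*(?:\S+/nns? )+")


def sub_phrases(words):
    """Return the phrase plus every shorter phrase obtained by
    recursively dropping its first word."""
    if not words:
        return []
    return [" ".join(words)] + sub_phrases(words[1:])


def noun_phrases(tagged):
    """Extract noun phrases from a list of (word, tag) pairs."""
    text = " ".join(f"{word}/{tag}" for word, tag in tagged) + " "
    found = []
    for match in NP_PATTERN.finditer(text):
        # Strip the "/tag" suffix from each matched token.
        words = [token.rsplit("/", 1)[0] for token in match.group().split()]
        found.extend(sub_phrases(words))
    return found
```

Given the tagged input "the/det big/jj dog/nn chased/vbd a/det cat/nn",
this sketch yields "the big dog", "big dog", "dog", "a cat", and "cat".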

WWW: https://metacpan.org/release/Lingua-EN-Tagger