langid.py is a standalone Language Identification (LangID) tool.

The design principles are as follows:

    Fast
    Pre-trained over a large number of languages (currently 97)
    Not sensitive to domain-specific features (e.g. HTML/XML markup)
    Single .py file with minimal dependencies
    Deployable as a web service

Note: the main script, langid/langid.py, runs under both Python 2 and
Python 3. The accompanying training tools are still Python 2 only and are
therefore not installed by this port.
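
As a minimal sketch of using the installed module from Python, assuming the
upstream langid.classify() call, which returns a (language, score) pair. The
wrapper name detect_language and its classify parameter are hypothetical and
exist only so the example can be run without the model loaded:

```python
def detect_language(text, classify=None):
    """Return a (language_code, score) pair for `text`.

    `classify` is any callable shaped like langid.classify (assumed to
    return a (language, score) tuple). When it is omitted, the langid
    module installed by this port is imported lazily.
    """
    if classify is None:
        import langid  # assumption: importable once the port is installed
        classify = langid.classify
    lang, score = classify(text)
    return lang, score
```

With the port installed, detect_language("This is a short English sentence.")
should return a language code together with a score. The exact score depends
on the bundled model, so it is not shown here.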

See also the port textproc/py-langdetect for a similar program.

WWW: https://github.com/saffsd/langid.py