langid.py is a standalone Language Identification (LangID) tool.
The design principles are as follows:
- Fast
- Pre-trained over a large number of languages (currently 97)
- Not sensitive to domain-specific features (e.g. HTML/XML markup)
- Single .py file with minimal dependencies
- Deployable as a web service
Remark: the main script langid/langid.py is compatible with both Python 2
and Python 3, but the accompanying training tools are still Python 2-only
and are therefore not installed by this port.
See also the port textproc/py-langdetect for a similar program.