NKJP model for TnT Tagger

Here you can download a model for the TnT tagger, created by training the tagger on the one-million-word manually annotated subcorpus of the Polish National Corpus (NKJP). The model is available under a BSD license. The downloaded file must be unzipped before use.

The TnT tagger itself is not included: you need to obtain a copy from its author (see his webpage). To run the tagger, use:

tnt nkjp <file_name>

The input file must be split into sentences (separated by two end-of-line characters) and into words, following the tokenization used in the Polish National Corpus (use Morfeusz if unsure). The tagger's quality is around 88%.
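
The input layout described above can be sketched in Python. This is a minimal illustration, not part of the model distribution: the function name `format_for_tnt` and the sample sentences are hypothetical, and placing one token per line is an assumption based on TnT's usual input format rather than something stated on this page. The sketch does not tokenize anything; it only lays out sentences that have already been tokenized.

```python
def format_for_tnt(sentences):
    """Render already tokenized sentences as tagger input text.

    Assumption: one token per line. Sentences are separated by two
    end-of-line characters (i.e. an empty line), as required above.
    Empty sentences are skipped.
    """
    blocks = ["\n".join(tokens) for tokens in sentences if tokens]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data; real input should follow the
    # tokenization of the Polish National Corpus.
    sample = [["Ala", "ma", "kota", "."], ["To", "jest", "test", "."]]
    print(format_for_tnt(sample), end="")
```

The resulting text could then be saved to a file and passed to the tagger with the command shown earlier. The character encoding the model expects is not stated here, so it should be checked before writing the file.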