Revision 7 as of 2013-01-29 18:26:00
NKJP model for TnT Tagger
Here you can download the model for the TnT tagger. It was created by training the tagger on the one-million-word, manually annotated subcorpus of the Polish National Corpus (NKJP). The model is available under the BSD license. The downloaded file must be unzipped before use.
The TnT tagger itself is not included: you need to obtain a copy from its author (see his webpage). To run the tagger, use:
tnt nkjp <file_name>
The input file must already be split into sentences (separated by two end-of-line characters, i.e. a blank line) and into words, following the tokenization used in the Polish National Corpus. (Use Morfeusz if unsure.) The tagger's quality is around 88%.
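As a rough illustration of the input layout described above, the following Python sketch writes already-tokenized sentences with two end-of-line characters between sentences. The function name `format_for_tagger` is hypothetical, and placing one token per line is an assumption that this page does not state; check the TnT documentation before relying on it. The sketch does not perform tokenization itself, which should follow the Polish National Corpus conventions.

```python
from typing import Iterable, List


def format_for_tagger(sentences: Iterable[List[str]]) -> str:
    """Lay out tokenized sentences as text for the tagger's input file.

    Assumption (not stated on this page): one token per line.
    Stated on this page: sentences are separated by two end-of-line
    characters, which is what the "\n\n" join below produces.
    """
    # Skip empty sentences so they do not create extra blank lines.
    blocks = ["\n".join(tokens) for tokens in sentences if tokens]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical, already-tokenized example sentences.
    example = [["Ala", "ma", "kota", "."], ["To", "jest", "test", "."]]
    print(format_for_tagger(example), end="")
```

The resulting text can be saved to a file and passed as the file name argument of the `tnt nkjp` command shown above.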