Differences between revisions 1 and 6 (spanning 5 versions)

NKJP model for TnT Tagger

Here you can download the model for the TnT tagger, trained on the manually annotated one-million-word subcorpus of the Polish National Corpus (NKJP). The model is available under the BSD license. Unzip the downloaded file before use.

The TnT tagger itself is not included here; you need to obtain a copy from its author (see his webpage). To run the tagger, use:

tnt nkjp <file_name>

The input file must be split into sentences (separated by two end-of-line characters, i.e. an empty line) and into words, following the tokenization used in the Polish National Corpus (use Morfeusz if unsure). The tagger's quality is around 88%.
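
The input format described above can be sketched in Python. This is only an illustration, not part of the model distribution. It assumes that TnT expects one token per line and that the tokens have already been segmented the NKJP way (for example with Morfeusz). The helper names `to_tnt_input` and `run_tnt` are hypothetical, and the file encoding expected by the model is not stated on this page.

```python
import subprocess


def to_tnt_input(sentences):
    """Render pre-tokenized sentences as tagger input text.

    Assumption: one token per line. Sentences are separated by two
    end-of-line characters, i.e. one empty line between them.
    """
    blocks = ["\n".join(tokens) for tokens in sentences if tokens]
    return "\n\n".join(blocks) + "\n"


def run_tnt(input_path, model="nkjp", tnt_binary="tnt"):
    """Call `tnt <model> <input_path>` and return what it prints to stdout.

    Requires a TnT binary obtained from its author and the unzipped
    model files in the working directory.
    """
    result = subprocess.run(
        [tnt_binary, model, str(input_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Two already tokenized sentences (hypothetical example data).
    example = [["Ala", "ma", "kota", "."], ["To", "jest", "test", "."]]
    print(to_tnt_input(example), end="")
```

Saving the text returned by `to_tnt_input` to a file and passing that file's path to `run_tnt` corresponds to the `tnt nkjp` command shown above.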

-  Revision 1 as of 2013-01-29 16:51:23
  Size: 70
  Editor: MichalLenart
  Comment:
+  Revision 6 as of 2013-01-29 18:25:31
  Size: 827
  Editor: MarcinMilkowski
  Comment:
 Line 3:
-= TnT =
+= NKJP model for TnT Tagger =

Here you can download the model for the [[http://www.coli.uni-saarland.de/~thorsten/tnt/ | TnT tagger]], trained on the manually annotated one-million-word subcorpus of the Polish National Corpus (NKJP). The [[attachment:nkjp.zip|model]] is available under the BSD license. Unzip the downloaded file before use.

The TnT tagger itself is not included here; you need to obtain a copy from its author (see [[http://www.coli.uni-saarland.de/~thorsten/tnt/ | his webpage]]). To run the tagger, use:

tnt nkjp <file_name>

The input file must be split into sentences (separated by two end-of-line characters, i.e. an empty line) and into words, following the tokenization used in the Polish National Corpus (use Morfeusz if unsure). The tagger's quality is around 88%.
