
Diff for "NKJPNGrams"

Differences between revisions 5 and 7 (spanning 2 versions)
Revision 5 as of 2012-08-01 11:12:46
Size: 385
Editor: MichalLenart
Comment:
Revision 7 as of 2012-08-01 11:13:07
Size: 502
Editor: MichalLenart
Comment:
Deletions are marked with "-". Additions are marked with "+".
Line 5:
- == Download ==
+ == Downloads ==
Line 8:
+  * [[attachment:2grams.gz]]
+  * [[attachment:3grams.gz]]
+  * [[attachment:4grams.gz]]
+  * [[attachment:5grams.gz]]

N-grams from balanced National Corpus of Polish

The resource is a set of N-grams extracted from the balanced National Corpus of Polish, for N from 1 to 5. Each unigram is a maximal contiguous chunk of non-whitespace, lower-case characters. The resource contains all unique N-grams, each followed by its number of occurrences.
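
The description above can be illustrated with a minimal Python sketch. It is not the procedure actually used to build the resource: the function names are hypothetical, and it assumes that text is lower-cased before splitting on whitespace and that N-grams may span sentence boundaries. The `parse_line` helper further assumes that each line of a downloaded file holds whitespace-separated tokens with the count as the last field, which this page does not confirm.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lower-case first, then split on any run of whitespace,
    # so each token is a maximal contiguous chunk of non-whitespace characters.
    return text.lower().split()


def count_ngrams(text, max_n=5):
    """Return {n: Counter mapping n-gram tuples to occurrence counts}."""
    tokens = tokenize(text)
    counts = {}
    for n in range(1, max_n + 1):
        # An empty range (fewer than n tokens) simply yields an empty Counter.
        counts[n] = Counter(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


def parse_line(line):
    """Split one line into (n-gram tuple, count).

    Assumption: tokens are whitespace-separated and the count is the last
    field; the real layout of the attached files may differ.
    """
    *tokens, count = line.split()
    return tuple(tokens), int(count)
```

For example, `count_ngrams("Ala ma kota ala ma")[2]` counts the bigram `("ala", "ma")` twice, because both occurrences are lower-cased before counting.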

Downloads