Differences between revisions 3 and 5 (spanning 2 versions)
| Line 2: | Line 2: |
== Description == |
|
| Line 8: | Line 6: |
* [[attachment:1grams.gz]] |
N-grams from the balanced National Corpus of Polish
The resource is a set of N-grams extracted from the balanced National Corpus of Polish, for N from 1 to 5. Each unigram is a maximal continuous chunk of non-whitespace, lower-case characters. The resource lists every unique N-gram, followed by its number of occurrences.
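
The definition above can be illustrated with a minimal Python sketch. It is not the tool used to build the resource: the function names `tokenize` and `count_ngrams` and the sample sentence are hypothetical, and the sketch assumes that text is lower-cased before splitting on whitespace and that N-grams are counted over one continuous token stream (how sentence or document boundaries were handled is not stated here).

```python
from collections import Counter


def tokenize(text):
    # Assumption: lower-case first, then split on runs of whitespace, so each
    # token is a maximal chunk of non-whitespace, lower-case characters.
    return text.lower().split()


def count_ngrams(tokens, max_n=5):
    # One Counter per N, mapping a space-joined N-gram to its frequency.
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[n][" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample; punctuation is already separated by spaces here.
sample = "Ala ma kota , a kot ma Alę ."
counts = count_ngrams(tokenize(sample))

# Print each unique bigram followed by its number of occurrences.
for ngram, freq in counts[2].items():
    print(ngram, freq)
```

Storing one counter per N mirrors the one-file-per-N layout suggested by the `1grams.gz` attachment, though the exact file format is not described on this page.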
