This directory contains stemmer tables for Polish. They were built from
training samples of various sizes, drawn at random from a well-balanced
corpus of contemporary Polish containing ~70,000 nouns, verbs and adjectives.
The training corpus contained at least 4 inflected forms for each lemma.

The table names reflect the sample size: for example, stemmer_500.out was
built from a random training sample of 500 different lemmas.
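
The naming convention can be read back programmatically. The following is
only a minimal sketch: the helper name sample_size and the pattern
stemmer_<N>.out are assumptions inferred from the single example above, and
other tables in this directory may not follow that pattern exactly.

```python
import re
from typing import Optional

# Assumed pattern, inferred from the one example name "stemmer_500.out".
# Tables named differently will simply not match.
_TABLE_NAME = re.compile(r"stemmer_(\d+)\.out")


def sample_size(table_name: str) -> Optional[int]:
    """Return the sample size encoded in a table name, or None if no match.

    Hypothetical helper for illustration; it is not part of the stemmer.
    """
    match = _TABLE_NAME.fullmatch(table_name)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Number of lemmas in the training sample behind this table.
    print(sample_size("stemmer_500.out"))
```

Names that do not fit the assumed pattern yield None rather than raising an
error, so the helper can be run over a whole directory listing.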

Unfortunately, due to licensing restrictions, I am unable to distribute
all of the source corpora. In addition, the tables that achieve an
F-measure above 75% are available only commercially.
