Diff for "Concraft"

Differences between revisions 11 and 12
Revision 11 as of 2013-01-10 01:29:39
Size: 1699
Revision 12 as of 2013-01-10 01:31:54
Size: 1682
Line 20, revision 11:
We provide a [[attachment:model.bin|model]] for the Polish language which has been trained on the manually annotated subcorpus of the National Corpus of Polish. The training corpus has been first re-analysed with the [[http://nlp.pwr.wroc.pl/redmine/projects/libpltagger/wiki|Maca]] tool (using the `morfeusz-nkjp-official` configuration) and the same preprocessing pipeline should be used to prepare input for disambiguation.

Line 20, revision 12:
We provide a [[attachment:model.bin|model]] for the Polish language which has been trained on the manually annotated subcorpus of the National Corpus of Polish. The training corpus has been first re-analysed with the [[http://nlp.pwr.wroc.pl/redmine/projects/libpltagger/wiki|Maca]] tool (using the `morfeusz-nkjp-official` configuration) and the same preprocessing pipeline should be used for disambiguation.

Concraft

This page provides the official release of Concraft, a morphosyntactic disambiguation tool based on constrained conditional random fields.

Author: Jakub Waszczuk
License: 2-clause BSD

Documentation

See the README file from the development repository.

Downloads

Concraft is distributed as a software package that can be downloaded from Hackage using the Cabal tool. To compile Concraft you will also need the Glasgow Haskell Compiler (GHC). The simplest way to get both Cabal and GHC is to install the Haskell Platform. Please see the documentation for more information about the installation process.
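
As a rough sketch of the installation described above, the commands below fetch the Hackage package index and then build the package. They assume that Cabal and GHC are already on the PATH (for example via the Haskell Platform) and that the Hackage package is named `concraft`; the README in the development repository remains the authoritative source.

```shell
# Refresh the local copy of the Hackage package index.
cabal update

# Download, compile and install Concraft.
# Assumption: the package name on Hackage is `concraft`.
cabal install concraft
```

Cabal usually places the resulting executable in its own bin directory (often `~/.cabal/bin`), which may need to be added to the PATH.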

Pre-trained model

We provide a model for the Polish language, trained on the manually annotated subcorpus of the National Corpus of Polish. The training corpus was first re-analysed with the Maca tool (using the morfeusz-nkjp-official configuration), so the same preprocessing pipeline should be applied to input text before disambiguation.

Publications