Differences between revisions 9 and 10

Concraft

This page provides the official release of Concraft, a morphosyntactic disambiguation tool based on constrained conditional random fields.

Author: Jakub Waszczuk
License: 2-clause BSD

Documentation

See the README file from the development repository.

Downloads

Concraft is distributed as a software package that can be downloaded from Hackage using the Cabal tool. To compile Concraft you will also need the Glasgow Haskell Compiler (GHC). The simplest way to get both Cabal and GHC is to install the Haskell Platform. See the documentation for more information about the installation process.

Pre-trained model

A pre-trained Concraft model for the Polish language is available for download as the page attachment model.bin. The training material, the manually annotated 1-million-word subcorpus of the National Corpus of Polish, was first re-analysed using the Maca tool (http://nlp.pwr.wroc.pl/redmine/projects/libpltagger/wiki), set up to use the morfeusz-nkjp-official configuration. The same preprocessing pipeline should be used to prepare input data for subsequent disambiguation, so that the input matches the analysis the model was trained on.

Publications

Jakub Waszczuk. (2012). Harnessing the CRF complexity with domain-specific constraints. The case of morphosyntactic tagging of a highly inflected language.
In: Proceedings of COLING 2012, Mumbai, India.

Revision 10 as of 2013-01-10 01:08:36 (editor: JakubWaszczuk, size: 1746), following revision 9 as of 2013-01-09 22:59:23 (editor: JakubWaszczuk, size: 1721).
Change at line 20: the word "here" in the pre-trained model paragraph now links to the attachment model.bin; the rest of the paragraph is unchanged.
