Diff for "Concraft"

Differences between revisions 17 and 23 (spanning 6 versions)
Revision 17 as of 2013-02-27 15:45:30
Size: 1964
Comment:
Revision 23 as of 2013-04-18 15:07:42
Size: 2022
Comment:
Line 2: Line 2:
Revision 17: = Concraft =
Revision 23: = Concraft-pl =
Line 4: Line 4:
Revision 17: This page provides the official release of Concraft, a morphosyntactic disambiguation tool based on constrained conditional random fields.
Revision 23: This page provides the official release of Concraft-pl, a morphosyntactic tagger for Polish based on constrained conditional random fields. The tool combines the following components into a pipeline:

 * A morphosyntactic segmentation and analysis tool [[http://nlp.pwr.wroc.pl/redmine/projects/libpltagger/wiki|Maca]],
 * A morphosyntactic disambiguation library [[https://github.com/kawu/concraft#concraft|Concraft]].
Line 12: Line 15:
Revision 17: See the [[https://github.com/kawu/concraft/blob/master/README.md#concraft|README]] file from the development repository.
Revision 23: See the [[https://github.com/kawu/concraft-pl#concraft-pl|README]] file from the development repository.
Line 16: Line 19:
Revision 17: Concraft is available in a form of a software distribution which can be downloaded from [[http://hackage.haskell.org/package/concraft|Hackage]] using the [[http://www.haskell.org/cabal/|Cabal]] tool. To compile Concraft you will also need the [[http://www.haskell.org/ghc/|Glasgow Haskell Compiler]] (GHC). The simplest way to get both Cabal and GHC is to install the [[http://www.haskell.org/platform/|Haskell Platform]]. Please see the documentation for more information about the installation process.
Revision 23: Concraft-pl is available in a form of a software distribution which can be downloaded from [[http://hackage.haskell.org/package/concraft-pl|Hackage]] using the [[http://www.haskell.org/cabal/|Cabal]] tool. To compile Concraft-pl you will also need the [[http://www.haskell.org/ghc/|Glasgow Haskell Compiler]] (GHC). The simplest way to get both Cabal and GHC is to install the [[http://www.haskell.org/platform/|Haskell Platform]]. Please see the documentation for more information about the installation process.
Line 20: Line 23:
Revision 17: A model for the Polish language has been trained on the [[http://clip.ipipan.waw.pl/LRT?action=AttachFile&do=view&target=NKJP-PodkorpusMilionowy-1.1.tgz|manually annotated subcorpus]] of the [[http://nkjp.pl/index.php?page=0&lang=1|National Corpus of Polish]]. An archive file with the model can be downloaded from [[attachment:model.zip|here]] ('''note''': you need version 0.4 of Concraft or higher to use it). The corpus has been first re-analysed with the [[http://nlp.pwr.wroc.pl/redmine/projects/libpltagger/wiki|Maca]] tool (using the `morfeusz-nkjp-official` configuration) and the same preprocessing pipeline should be used to prepare input data for morphosyntactic disambiguation.
Revision 23: We provide Concraft-pl models trained on the [[http://clip.ipipan.waw.pl/LRT?action=AttachFile&do=view&target=NKJP-PodkorpusMilionowy-1.1.tgz|manually annotated subcorpus]] of the [[http://nkjp.pl/index.php?page=0&lang=1|National Corpus of Polish]]. Choose appropriate model depending on the version of Concraft-pl you are using.

|| Version || Model ||
|| 0.1 || [[attachment:model-0.5.gz|Download]] ||
|| 0.2 || [[attachment:model-0.2.gz|Download]] ||

Concraft-pl

This page provides the official release of Concraft-pl, a morphosyntactic tagger for Polish based on constrained conditional random fields. The tool combines the following components into a pipeline:

  • A morphosyntactic segmentation and analysis tool Maca,

  • A morphosyntactic disambiguation library Concraft.
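
The two-stage pipeline above can be illustrated with a small sketch. This is not Concraft-pl's code or API: the toy lexicon, the tag names, the hand-set scores, and the functions `analyse` and `disambiguate` are all hypothetical, and a plain Viterbi search stands in for the real conditional random field. The point it tries to show is the constraint: the disambiguation stage only chooses among the tags that the analysis stage proposed for each word.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon standing in for a morphosyntactic analyser:
# each word form maps to the tags it could carry.
LEXICON: Dict[str, List[str]] = {
    "mam": ["verb", "noun"],  # ambiguous form
    "kota": ["noun"],
}

# Hypothetical hand-set scores; a real tagger would learn these from a corpus.
EMIT: Dict[Tuple[str, str], float] = {("mam", "verb"): 1.0, ("mam", "noun"): 0.2}
TRANS: Dict[Tuple[str, str], float] = {("verb", "noun"): 1.0, ("noun", "noun"): 0.1}


def analyse(sentence: str) -> List[Tuple[str, List[str]]]:
    """Stage 1: split into words and attach every candidate tag."""
    words = [w.lower() for w in sentence.split()]
    return [(w, LEXICON.get(w, ["unknown"])) for w in words]


def disambiguate(analysed: List[Tuple[str, List[str]]]) -> List[Tuple[str, str]]:
    """Stage 2: Viterbi search restricted to the candidate tags of each word."""
    if not analysed:
        return []
    # Each column maps a tag to (best score of a path ending in it, previous tag).
    first_word, first_cands = analysed[0]
    columns: List[Dict[str, Tuple[float, Optional[str]]]] = [
        {t: (EMIT.get((first_word, t), 0.0), None) for t in first_cands}
    ]
    for word, cands in analysed[1:]:
        prev = columns[-1]
        column: Dict[str, Tuple[float, Optional[str]]] = {}
        for t in cands:  # only tags proposed by the analyser are considered
            best_prev, best_score = max(
                ((p, s + TRANS.get((p, t), 0.0)) for p, (s, _) in prev.items()),
                key=lambda pair: pair[1],
            )
            column[t] = (best_score + EMIT.get((word, t), 0.0), best_prev)
        columns.append(column)
    # Backtrack from the best final tag to recover the whole tag sequence.
    tag = max(columns[-1], key=lambda t: columns[-1][t][0])
    tags = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        tags.append(tag)
    tags.reverse()
    return [(word, t) for (word, _), t in zip(analysed, tags)]


if __name__ == "__main__":
    print(disambiguate(analyse("Mam kota")))
```

Restricting each position to the analyser's candidates keeps the search space small, which matters when the full tagset is large, as it is for Polish.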

Author: Jakub Waszczuk
License: 2-clause BSD

Documentation

See the README file from the development repository.

Downloads

Concraft-pl is available in the form of a software distribution, which can be downloaded from Hackage using the Cabal tool. To compile Concraft-pl, you will also need the Glasgow Haskell Compiler (GHC). The simplest way to get both Cabal and GHC is to install the Haskell Platform. Please see the documentation for more information about the installation process.

Pre-trained model

We provide Concraft-pl models trained on the manually annotated subcorpus of the National Corpus of Polish. Choose the model that matches the version of Concraft-pl you are using.

|| Version || Model ||
|| 0.1 || Download ||
|| 0.2 || Download ||

Publications