Revision 4 as of 2014-12-18 13:08:50
LemmaPL
LemmaPL is a lemmatization tool for Polish. It combines several existing tools and resources to achieve lemmatization performance above the previous state of the art. Specifically, it uses the following tools and resources:
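- Morfeusz analyzer (http://sgjp.pl/), versions 1 and 2, by Marcin Woliński,
- WCRFT tagger (http://nlp.pwr.wroc.pl/redmine/projects/wcrft/wiki), by Adam Radziszewski,
- Spejd parser (http://zil.ipipan.waw.pl/Spejd), by Bartosz Zaborowski and Adam Przepiórkowski,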
- Spejd grammar, by Katarzyna Głowińska, Łukasz Degórski and Piotr Przybyła,
- abbreviations dictionary,
- frequency data from the National Corpus of Polish.
Author: Łukasz Kobyliński
License: GPL
Usage
LemmaPL will be available as a web service (coming soon).
Currently, LemmaPL can be run from a Docker (https://www.docker.com/) container, using either the ipipan/langtools-all image or the ipipan/langtools-taggers image (with your own WCRFT model attached to the container).
Instructions for the ipipan/langtools-all image:
- docker pull ipipan/langtools-all
- docker run -v /home/username/my_tests:/root/my_tests -it ipipan/langtools-all /bin/bash
Inside the container:
- cd /root/lemmapl
- python lemmapl.py ../my_tests/test.txt
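The steps above can also be combined into a single non-interactive call. The following is only a sketch: the helper name `docker_run_command` is hypothetical, and it assumes that running `lemmapl.py` through `/bin/bash -c` behaves the same as running it from the interactive shell shown above, which has not been verified for this image. The image name, the mount point and the paths are taken from the instructions; `--rm` is a standard Docker flag that removes the container after it exits.

```python
import shlex

# Values taken from the instructions above.
IMAGE = "ipipan/langtools-all"
CONTAINER_TESTS = "/root/my_tests"    # mount point for the host directory
CONTAINER_LEMMAPL = "/root/lemmapl"   # LemmaPL directory inside the image


def docker_run_command(host_dir, input_name):
    """Build the `docker run` argument list that lemmatizes one file.

    host_dir   -- host directory holding the input (e.g. /home/username/my_tests)
    input_name -- file name inside host_dir (e.g. test.txt)

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    # Same relative path as in the manual instructions: ../my_tests/<file>
    relative_input = "../my_tests/" + input_name
    inner = "cd {} && python lemmapl.py {}".format(
        shlex.quote(CONTAINER_LEMMAPL), shlex.quote(relative_input)
    )
    return [
        "docker", "run", "--rm",
        "-v", "{}:{}".format(host_dir, CONTAINER_TESTS),
        IMAGE,
        "/bin/bash", "-c", inner,  # assumption: non-interactive bash is sufficient
    ]


if __name__ == "__main__":
    cmd = docker_run_command("/home/username/my_tests", "test.txt")
    # Print a copy-pasteable shell command; pass `cmd` to subprocess.run()
    # instead if Docker and the pulled image are available.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list rather than a single string keeps the host path and file name from being re-interpreted by the local shell; only the part executed inside the container is quoted with `shlex.quote`.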