Size: 12690
← Revision 109 as of 2022-09-09 07:00:19 ⇥
Size: 13257
Line 2: | Line 2: |
== Polish COMBO models == | = COMBO's models for Polish = |
Line 4: | Line 4: |
The [[https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/tree/master|COMBO]] models for Polish are trained on the current version of [[http://zil.ipipan.waw.pl/PDB|Polish Dependency Bank]]. The models use the [[https://huggingface.co/allegro/herbert-base-cased|HerBERT]] language model. | [[https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/tree/master|COMBO's]] models for Polish trained on the current version of [[http://zil.ipipan.waw.pl/PDB|Polish Dependency Bank]] and using the [[https://huggingface.co/allegro/herbert-base-cased|HerBERT]] language model. |
Line 6: | Line 6: |
== PDB-trained models == | === PDB-trained models === |
Line 11: | Line 11: |
== PDB-UD-trained model == | === PDB-UD-trained model === |
Line 43: | Line 43: |
== Parsing performance (outdated) == | === COMBO === * COMBO's [[https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/tree/master|source code]] * Beginner's [[https://colab.research.google.com/drive/1D1P4AiE40Cc_4SF3HY-Mz06JY0XMiEFs?hl=en|tutorial]] (collab notebook) * COMBO's [[https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/blob/master/docs/performance.md|performance]] on test sets for multiple languages from [[https://universaldependencies.org|Universal Dependencies]] {{{#!wiki comment === Parsing performance === |
Line 47: | Line 54: |
{{{#!wiki comment | |
Line 106: | Line 113: |
== PDB-based dependency parsing demos == | === COMBO demos === |
Line 108: | Line 115: |
* [[http://scwad-demo.nlp.ipipan.waw.pl:8000/dependency-parsing|COMBO demo]] (only in Polish) | * [[http://combo-demo.nlp.ipipan.waw.pl/combo-eng|English]] * [[http://combo-demo.nlp.ipipan.waw.pl/combo-pl|Polish]] {{{#!wiki comment * [[http://scwad-demo.nlp.ipipan.waw.pl:8000/dependency-parsing|COMBO demo]] |
Line 111: | Line 122: |
* To download the parser's output in CoNLL format, "Select output format:". | * To download the parser's output in CoNLL format, "Select output format:".}}} |
Line 113: | Line 124: |
== Publications == | === Publications === |
Line 122: | Line 133: |
== Licensing == | === Licensing === |
Line 126: | Line 137: |
== Acknowledgment == The research was founded by SONATA 8 grant no 2014/15/D/HS2/03486 from the National Science Centre Poland and by the Polish Ministry of Science and Higher Education as part of the investment in the CLARIN-PL research infrastructure. The computing was performed at Poznań Supercomputing and Networking Center. |
=== Acknowledgment === The research was founded by SONATA 8 grant no 2014/15/D/HS2/03486 from the National Science Centre Poland and by the Polish Ministry of Science, Higher Education as part of the investment in the CLARIN-PL research infrastructure and DARIAH-PL. The computing was performed at Poznań Supercomputing and Networking Center. |
Line 130: | Line 141: |
== Contact == | === Contact === |
COMBO's models for Polish
COMBO's models for Polish are trained on the current version of the Polish Dependency Bank and use the HerBERT language model.
PDB-trained models
model for dependency parsing only
model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types without semantic extensions, e.g. adjunct instead of adjunct_temp)
model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types with semantic extensions, e.g. adjunct_temp)
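The difference between the two label sets can be illustrated with a minimal sketch. It assumes that a semantic extension is an underscore suffix on a base label, as in the `adjunct` / `adjunct_temp` pair mentioned above. The function name `strip_semantic_extension`, the `base_labels` parameter, and the other labels in the sample list are hypothetical and are not part of COMBO or the Polish Dependency Bank documentation.

```python
def strip_semantic_extension(deprel, base_labels=("adjunct",)):
    """Collapse a label such as 'adjunct_temp' to its base label 'adjunct'.

    Assumption: only labels whose base appears in `base_labels` carry
    semantic extensions; every other label is returned unchanged.
    """
    for base in base_labels:
        if deprel.startswith(base + "_"):
            return base
    return deprel


# Illustrative labels only; 'adjunct_temp' is the example from the text.
labels = ["subj", "adjunct_temp", "adjunct", "obj"]
collapsed = [strip_semantic_extension(label) for label in labels]
print(collapsed)
```

Restricting the collapse to an explicit list of base labels, rather than cutting every label at the first underscore, avoids altering labels in which the underscore is not a semantic extension.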
PDB-UD-trained model
model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing
COMBO
COMBO's source code
Beginner's tutorial (Colab notebook)
COMBO's performance on test sets for multiple languages from Universal Dependencies
COMBO demos
Publications
Licensing
The dependency parsing models for Polish are released under the CC BY-NC-SA 4.0 licence and by downloading them you accept the conditions of that licence.
Acknowledgment
The research was funded by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre Poland and by the Polish Ministry of Science and Higher Education as part of the investment in the CLARIN-PL research infrastructure and DARIAH-PL. The computing was performed at Poznań Supercomputing and Networking Center.
Contact
Any questions or comments? Please send them to <alina AT SPAMFREE ipipan DOT waw DOT pl>.