COMBO's models for Polish
COMBO's models for Polish are trained on the current version of the Polish Dependency Bank (http://zil.ipipan.waw.pl/PDB) and use the HerBERT language model (https://huggingface.co/allegro/herbert-base-cased).
PDB-trained models
model for dependency parsing only
model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types without semantic extensions, e.g. adjunct instead of adjunct_temp)
model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types with semantic extensions, e.g. adjunct_temp)
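The difference between the two label inventories can be illustrated with a minimal sketch. It assumes that a semantic extension is the suffix after the underscore in an adjunct label, as in the adjunct_temp example above; the function name and the other labels in the sample list are hypothetical illustrations, not part of COMBO or of the Polish Dependency Bank tooling.

```python
def strip_semantic_extension(label: str) -> str:
    """Collapse an extended adjunct label (e.g. adjunct_temp) to plain adjunct.

    Assumption: only labels of the form adjunct_<extension> carry a semantic
    extension; every other label is returned unchanged.
    """
    base, sep, _extension = label.partition("_")
    if base == "adjunct" and sep:
        return base
    return label


# Illustrative labels only; the real inventory is defined by the treebank.
labels = ["subj", "adjunct_temp", "adjunct", "obj"]
print([strip_semantic_extension(label) for label in labels])
```

Labels that merely contain an underscore are deliberately left alone, because the sketch has no way of knowing whether their suffix is a semantic extension.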
PDB-UD-trained model
model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing
COMBO
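COMBO's source code: https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/tree/master
Beginner's tutorial (Colab notebook): https://colab.research.google.com/drive/1D1P4AiE40Cc_4SF3HY-Mz06JY0XMiEFs?hl=en
COMBO's performance on test sets for multiple languages from Universal Dependencies (https://universaldependencies.org): https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/blob/master/docs/performance.md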
COMBO demos
Publications
Licensing
The dependency parsing models for Polish are released under the CC BY-NC-SA 4.0 licence. By downloading them, you accept the conditions of that licence.
Acknowledgment
The research was funded by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre Poland and by the Polish Ministry of Science and Higher Education as part of the investment in the CLARIN-PL research infrastructure and DARIAH-PL. The computing was performed at the Poznań Supercomputing and Networking Center.
Contact
Any questions or comments? Please send them to <alina AT SPAMFREE ipipan DOT waw DOT pl>.