
Diff for "PDB/PDBparser"

Differences between revisions 4 and 109 (spanning 105 versions)
Revision 4 as of 2017-06-27 10:05:18
Size: 6335
Comment:
Revision 109 as of 2022-09-09 07:00:19
Size: 13257
Comment:
Line 2: Line 2:
= PDBparser = = COMBO's models for Polish =
Line 4: Line 4:
PDBparser is a Polish dependency parser trained on the current version of the [[http://zil.ipipan.waw.pl/PDB|Polish Dependency Bank]] with the publicly available parsing systems – [[http://maltparser.org|MaltParser]] or [[https://code.google.com/archive/p/mate-tools/|MateParser]]. `MaltParser` is a transition-based dependency parser that uses a deterministic parsing algorithm. The deterministic parsing algorithm builds a dependency structure of an input sentence based on transitions (shift-reduce actions) predicted by a classifier. The classifier learns to predict the next transition given training data and the parse history. `MateParser`, in turn, is a graph-based parser that defines a space of well-formed candidate dependency trees for an input sentence, scores them given an induced parsing model, and selects the highest scoring dependency tree as a correct analysis of the input sentence. [[https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/tree/master|COMBO's]] models for Polish are trained on the current version of the [[http://zil.ipipan.waw.pl/PDB|Polish Dependency Bank]] and use the [[https://huggingface.co/allegro/herbert-base-cased|HerBERT]] language model.
Line 6: Line 6:
== Dependency parsing models for Polish == === PDB-trained models ===
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO_pytorch/combo_PDB_parseonly_220906.tar.gz|model]] for dependency parsing only
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO_pytorch/combo_PDB_full_220906.tar.gz|model]] for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types '''without''' semantic extensions, e.g. adjunct instead of adjunct_temp)
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO_pytorch/combo_PDB_full_SEMLAB_220906.tar.gz|model]] for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types '''with''' semantic extensions, e.g. adjunct_temp)
Line 8: Line 11:
 * `MateParser` model for Polish: [[attachment:170608_PDBMate.mdl]]
 * `MaltParser` model for Polish: [[attachment:170608_PDBMalt.mco]]
=== PDB-UD-trained model ===
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO_pytorch/combo_PDBUD_full_220906.tar.gz|model]] for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing
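
The models listed above are plain `.tar.gz` archives served over HTTP, so they can be fetched with a short script. The following is a minimal sketch using only the Python standard library. The archive names and the base URL are copied from the links above; the short keys (`pdb-parseonly` etc.), the function names, and the `models` target directory are hypothetical and chosen only for illustration.

```python
import urllib.request
from pathlib import Path

# Base URL copied from the download links above.
BASE_URL = (
    "http://mozart.ipipan.waw.pl/~alina/"
    "Polish_dependency_parsing_models/COMBO_pytorch/"
)

# Archive names as they appear in the links above.
# The dictionary keys are hypothetical shorthands, not official model names.
MODELS = {
    "pdb-parseonly": "combo_PDB_parseonly_220906.tar.gz",
    "pdb-full": "combo_PDB_full_220906.tar.gz",
    "pdb-full-semlab": "combo_PDB_full_SEMLAB_220906.tar.gz",
    "pdbud-full": "combo_PDBUD_full_220906.tar.gz",
}


def model_url(key: str) -> str:
    """Return the download URL for one of the models listed above."""
    return BASE_URL + MODELS[key]


def download_model(key: str, target_dir: str = "models") -> Path:
    """Download the archive for `key` into `target_dir` unless it is already there."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    archive = target / MODELS[key]
    if not archive.exists():
        urllib.request.urlretrieve(model_url(key), str(archive))
    return archive
```

Calling `download_model("pdb-full")` would store the archive under `models/`. Remember that downloading a model means accepting its licence.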
Line 11: Line 14:
The dependency parsing models for Polish are released under the GNU General Public License v3 (GPL v.3) and by downloading it you accept the conditions of that licence. {{{#!wiki comment
[[https://github.com/360er0/COMBO|COMBO]], [[https://code.google.com/archive/p/mate-tools/|MateParser]] and [[http://maltparser.org|MaltParser]]. /* ''MaltParser'' is a transition-based dependency parser that uses a deterministic parsing algorithm. The deterministic parsing algorithm builds a dependency structure of an input sentence based on transitions (shift-reduce actions) predicted by a classifier. The classifier learns to predict the next transition given training data and the parse history. `MateParser`, in turn, is a graph-based parser that defines a space of well-formed candidate dependency trees for an input sentence, scores them given an induced parsing model, and selects the highest scoring dependency tree as a correct analysis of the input sentence. */
Line 13: Line 17:
== Parsing performance ==  * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO_pytorch/combo_PDB_parseonly_220906.tar.gz|COMBO-pytorch model]] for dependency parsing only (with [[https://huggingface.co/allegro/herbert-base-cased|HerBERT-base]] embeddings),
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO/20200930_COMBO_PDB_nosem_parseonly.pkl|COMBO model]] for dependency parsing only
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO/20200930_COMBO_PDB_nosem.pkl|COMBO model]] for part-of-speech tagging, lemmatisation, and dependency parsing
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO/20200930_COMBO_PDB_sem.pkl|COMBO model]] for part-of-speech tagging, lemmatisation, dependency parsing, and semantic role labelling

 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO/191107_COMBO_PDB_semlab_parseonly.pkl|COMBO model]] for (semantic) dependency parsing only

 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/MATE/20190612_MATE_PDB.pkl|MATE model]] for dependency parsing
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/MALT/190125_MALT_PDB.mco|MaltParser model]] for dependency parsing


== PDB-UD-trained dependency parsing models for Polish ==
The PDB-UD-based models are trained on the current version of [[http://git.nlp.ipipan.waw.pl/alina/PDBUD|Polish Dependency Bank in Universal Dependencies format]] with the publicly available parsing systems – [[https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/tree/master|COMBO-pytorch]], [[https://github.com/360er0/COMBO|COMBO]], [[http://ufal.mff.cuni.cz/udpipe|UDPipe]].

 * [[http://mozart.ipipan.waw.pl/~mklimaszewski/models/polish-herbert-base.tar.gz|COMBO-pytorch model]] for part-of-speech tagging, lemmatisation, and dependency parsing (with [[https://huggingface.co/allegro/herbert-base-cased|HerBERT-base]] embeddings),
 * [[http://mozart.ipipan.waw.pl/~mklimaszewski/models/polish-herbert-large.tar.gz|COMBO-pytorch model]] for part-of-speech tagging, lemmatisation, and dependency parsing (with [[https://huggingface.co/allegro/herbert-large-cased|HerBERT-large]] embeddings),
 * [[http://mozart.ipipan.waw.pl/~mklimaszewski/models/polish-ud27.tar.gz|COMBO-pytorch model]] for part-of-speech tagging, lemmatisation, and dependency parsing (with fastText embeddings),
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO/20200930_COMBO_PDBUD_nosem.pkl|COMBO model]] for part-of-speech tagging, lemmatisation, and dependency parsing
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/COMBO/20200930_COMBO_PDBUD_sem.pkl|COMBO model]] for part-of-speech tagging, lemmatisation, dependency parsing, and semantic role labelling
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/UDPIPE/20200930_PDBUD_ttp_embedd.udpipe|UDPipe model]] for tokenisation, part-of-speech tagging, lemmatisation, and dependency parsing
 * [[http://mozart.ipipan.waw.pl/~alina/Polish_dependency_parsing_models/UDPIPE/20200930_PDBUD_tokeniser.udpipe|UDPipe model]] for tokenisation}}}

{{{#!wiki comment
 * [[http://mozart.ipipan.waw.pl/~prybak/model_poleval2018/model_A_semi.pkl|COMBO]] model for Polish (the model estimated for the [[http://poleval.pl/tasks#task1|PolEval 2018]] competition)
 * [[attachment:180606_PDBUDPipe.udpipe|UDPipe]] model for Polish}}}

=== COMBO ===

 * COMBO's [[https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/tree/master|source code]]
 * Beginner's [[https://colab.research.google.com/drive/1D1P4AiE40Cc_4SF3HY-Mz06JY0XMiEFs?hl=en|tutorial]] (Colab notebook)
 * COMBO's [[https://gitlab.clarin-pl.eu/syntactic-tools/combo/-/blob/master/docs/performance.md|performance]] on test sets for multiple languages from [[https://universaldependencies.org|Universal Dependencies]]
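
For orientation, the sketch below shows how a downloaded model might be loaded and its output printed in a tab-separated layout. It assumes that the `combo` package is installed and exposes `COMBO.from_pretrained` and token attributes named `id`, `token`, `lemma`, `upostag`, `head` and `deprel`; check these against the source code and tutorial linked above before relying on them. The helper names and the stand-in tokens are hypothetical.

```python
from types import SimpleNamespace


def load_parser(model: str):
    """Load a COMBO model by name or local archive path.

    Assumption: the `combo` package is installed and provides
    `COMBO.from_pretrained`; verify this against the project's documentation.
    """
    from combo.predict import COMBO  # imported lazily so the helpers below work without it
    return COMBO.from_pretrained(model)


def token_rows(tokens):
    """Turn parsed tokens into (id, form, lemma, tag, head, deprel) tuples."""
    return [(t.id, t.token, t.lemma, t.upostag, t.head, t.deprel) for t in tokens]


def format_rows(rows):
    """Render the tuples as tab-separated lines, one token per line."""
    return "\n".join("\t".join(str(value) for value in row) for row in rows)


# Stand-in tokens (invented values) so the helpers can be tried without COMBO.
demo_tokens = [
    SimpleNamespace(id=1, token="Ala", lemma="Ala", upostag="PROPN", head=2, deprel="subj"),
    SimpleNamespace(id=2, token="śpi", lemma="spać", upostag="VERB", head=0, deprel="root"),
]
demo_output = format_rows(token_rows(demo_tokens))

# With COMBO installed, the intended use would look like this:
#   nlp = load_parser("models/combo_PDB_full_220906.tar.gz")
#   sentence = nlp("Ala śpi.")
#   print(format_rows(token_rows(sentence.tokens)))
```

Keeping the import inside `load_parser` lets the formatting helpers be tested on their own, which is convenient because installing COMBO pulls in heavy dependencies.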

{{{#!wiki comment
=== Parsing performance ===

See [[http://clip.ipipan.waw.pl/benchmarks#Dependency_parsing|Dependency parsing]] section.


  * [[attachment:190115_COMBO_PDB_nosem.pkl]] – PDB-based COMBO model for part-of-speech tagging, lemmatisation, and dependency parsing
  * [[attachment:190115_COMBO_PDB_sem.pkl]] – PDB-based COMBO model for part-of-speech tagging, lemmatisation, dependency parsing, and semantic role labelling

 * '''NEW!''' PDB-based COMBO model compatible with the tagset of Morfeusz 2: [[attachment:180912_PDBCOMBO.pkl]]

 * MateParser

  * '''NEW!''' PDB-based Mate model compatible with the tagset of Morfeusz 2: [[attachment:180322_PDBMate.mdl]]
  * PDB-based Mate model compatible with the tagset of Morfeusz: [[attachment:170608_PDBMate.mdl]]

 * MateParser
  * [[attachment:190125_MATE_PDB.model]] – PDB-based MateParser model for dependency parsing
 * MaltParser
  * [[attachment:190125_MALT_PDB.mco]] – PDB-based MaltParser model for dependency parsing


  * '''NEW!''' PDB-based MaltParser model compatible with the tagset of Morfeusz 2: [[attachment:180322_PDBMalt.mco]]
  * PDB-based MaltParser model compatible with the tagset of Morfeusz: [[attachment:170608_PDBMalt.mco]]
Line 17: Line 77:
|| `Polish MateParser` || 0.85 || 0.89 ||
|| `Polish MaltParser` || 0.84 || 0.89 ||
|| `PDBMate` || 0.85 || 0.89 ||
|| `PDBMalt` || 0.82 || 0.86 ||
Line 26: Line 86:
||<bgcolor="#eef3ff">abbrev_punct ||<bgcolor="#eef3ff">0.99 ||<bgcolor="#eef3ff">0.98 ||<bgcolor="#eef3ff">0.98 ||<bgcolor="#eef3ff">0.96 ||<bgcolor="#eef3ff"> 0.98 ||<bgcolor="#eef3ff">0.97 ||
||adjunct || 0.89 || 0.76 || 0.92 || 0.79 || 0.82 || 0.78 ||
||<bgcolor="#eef3ff">adjunct_qt ||<bgcolor="#eef3ff"> 0.74 ||<bgcolor="#eef3ff"> 0.27 ||<bgcolor="#eef3ff"> 0.76 ||<bgcolor="#eef3ff"> 0.20 ||<bgcolor="#eef3ff"> 0.75 ||<bgcolor="#eef3ff"> 0.23 ||
||<bgcolor="#eef3ff">abbrev_punct ||<bgcolor="#eef3ff">0.99 ||<bgcolor="#eef3ff">0.99 ||<bgcolor="#eef3ff">0.98 ||<bgcolor="#eef3ff">0.97 ||<bgcolor="#eef3ff"> 0.98 ||<bgcolor="#eef3ff">0.98 ||
||adjunct || 0.89 || 0.73 || 0.92 || 0.77 || 0.82 || 0.75 ||
||<bgcolor="#eef3ff">adjunct_qt ||<bgcolor="#eef3ff"> 0.74 ||<bgcolor="#eef3ff"> 0.51 ||<bgcolor="#eef3ff"> 0.76 ||<bgcolor="#eef3ff"> 0.58 ||<bgcolor="#eef3ff"> 0.75 ||<bgcolor="#eef3ff"> 0.55 ||
Line 30: Line 90:
||<bgcolor="#eef3ff">app ||<bgcolor="#eef3ff"> 0.75 ||<bgcolor="#eef3ff"> 0.58 ||<bgcolor="#eef3ff"> 0.69 ||<bgcolor="#eef3ff"> 0.47 ||<bgcolor="#eef3ff"> 0.72 ||<bgcolor="#eef3ff"> 0.51 ||
||aux || 0.95 || 0.92 || 0.97 || 0.92 || 0.96 || 0.92 ||
||<bgcolor="#eef3ff">comp ||<bgcolor="#eef3ff"> 0.90 ||<bgcolor="#eef3ff"> 0.88 ||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff"> 0.85  ||<bgcolor="#eef3ff"> 0.88 ||<bgcolor="#eef3ff"> 0.86 ||
||comp_ag || 0.95 || 0.90 || 0.96 || 0.90 || 0.94 || 0.90 ||
||<bgcolor="#eef3ff">comp_fin ||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff"> 0.61 ||<bgcolor="#eef3ff"> 0.86 ||<bgcolor="#eef3ff"> 0.71 ||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff"> 0.66 ||
||comp_inf || 0.95 || 0.90 || 0.96 || 0.90 || 0.93 || 0.90 ||
||<bgcolor="#eef3ff"> cond ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.97 ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.97 ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.97 ||
||conjunct || 0.85 || 0.74 || 0.82 || 0.72        || 0.82 || 0.73 ||
||<bgcolor="#eef3ff"> imp ||<bgcolor="#eef3ff">0.98 ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.91 ||<bgcolor="#eef3ff"> 0.74       ||<bgcolor="#eef3ff">0.94 ||<bgcolor="#eef3ff"> 0.85 ||
|| item || 0.87 || 0.62     || 0.73 || 0.51    || 0.61 || 0.56 ||
||<bgcolor="#eef3ff"> mwe ||<bgcolor="#eef3ff"> 0.90 ||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff"> 0.83 ||<bgcolor="#eef3ff"> 0.80 ||<bgcolor="#eef3ff">0.87 ||<bgcolor="#eef3ff"> 0.84 ||
||ne || 0.87 || 0.79 || 0.73 || 0.58 || 0.76 || 0.67 ||
||<bgcolor="#eef3ff"> neg ||<bgcolor="#eef3ff">0.99 ||<bgcolor="#eef3ff"> 0.98 ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.99 ||<bgcolor="#eef3ff">0.99 ||<bgcolor="#eef3ff"> 0.98 ||
||obj || 0.89 || 0.81 || 0.91 || 0.88 || 0.89 || 0.84 ||
||<bgcolor="#eef3ff"> obj_th ||<bgcolor="#eef3ff">0.83 ||<bgcolor="#eef3ff"> 0.79 ||<bgcolor="#eef3ff"> 0.76 ||<bgcolor="#eef3ff"> 0.68 ||<bgcolor="#eef3ff">0.80 ||<bgcolor="#eef3ff"> 0.73 ||
||pd || 0.86 || 0.80 || 0.80 || 0.78 || 0.87 || 0.79 ||
||<bgcolor="#eef3ff"> pre_coord ||<bgcolor="#eef3ff">0.86 ||<bgcolor="#eef3ff"> 0.85    ||<bgcolor="#eef3ff">0.78 ||<bgcolor="#eef3ff"> 0.57 ||<bgcolor="#eef3ff">0.82 ||<bgcolor="#eef3ff"> 0.68 ||
||punct || 0.97 || 0.81 || 0.98 || 0.81 || 0.88 || 0.81 ||
||<bgcolor="#eef3ff">refl ||<bgcolor="#eef3ff"> 0.99 ||<bgcolor="#eef3ff"> 0.97 ||<bgcolor="#eef3ff"> 0.99 ||<bgcolor="#eef3ff"> 0.97 ||<bgcolor="#eef3ff"> 0.99 ||<bgcolor="#eef3ff"> 0.97 ||
||root || 0.91 || 0.88 || 0.91 || 0.88 || 0.94 || 0.88 ||
||<bgcolor="#eef3ff"> subj ||<bgcolor="#eef3ff"> 0.94||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff">0.94 ||<bgcolor="#eef3ff"> 0.86 ||<bgcolor="#eef3ff">0.94 ||<bgcolor="#eef3ff"> 0.86 ||
||<bgcolor="#eef3ff">app ||<bgcolor="#eef3ff"> 0.75 ||<bgcolor="#eef3ff"> 0.58 ||<bgcolor="#eef3ff"> 0.69 ||<bgcolor="#eef3ff"> 0.52    ||<bgcolor="#eef3ff"> 0.72 ||<bgcolor="#eef3ff"> 0.55 ||
||aux || 0.95 || 0.90 || 0.97 || 0.92 || 0.96 || 0.91 ||
||<bgcolor="#eef3ff">comp ||<bgcolor="#eef3ff"> 0.90 ||<bgcolor="#eef3ff"> 0.85 ||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff"> 0.82 ||<bgcolor="#eef3ff"> 0.88 ||<bgcolor="#eef3ff"> 0.84 ||
||comp_ag || 0.95 || 0.90 || 0.96 || 0.91 || 0.94 || 0.90 ||
||<bgcolor="#eef3ff">comp_fin ||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff"> 0.75 ||<bgcolor="#eef3ff"> 0.86 ||<bgcolor="#eef3ff"> 0.79 ||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff"> 0.77 ||
||comp_inf || 0.95 || 0.91 || 0.96 || 0.90 || 0.93 || 0.90 ||
||<bgcolor="#eef3ff"> cond ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.97 ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.96 ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.96 ||
||conjunct || 0.85 || 0.71 || 0.82 || 0.65 || 0.82 || 0.68 ||
||<bgcolor="#eef3ff"> imp ||<bgcolor="#eef3ff">0.98 ||<bgcolor="#eef3ff"> 0.97 ||<bgcolor="#eef3ff"> 0.91 ||<bgcolor="#eef3ff"> 0.87 ||<bgcolor="#eef3ff">0.94 ||<bgcolor="#eef3ff"> 0.92 ||
|| item || 0.87 || 0.4 || 0.73 || 0.37 || 0.61 || 0.39 ||
||<bgcolor="#eef3ff"> mwe ||<bgcolor="#eef3ff"> 0.90 ||<bgcolor="#eef3ff"> 0.83 ||<bgcolor="#eef3ff"> 0.83 ||<bgcolor="#eef3ff"> 0.75 ||<bgcolor="#eef3ff">0.87 ||<bgcolor="#eef3ff"> 0.79 ||
||ne || 0.87 || 0.78 || 0.73 || 0.64 || 0.76 || 0.70 ||
||<bgcolor="#eef3ff"> neg ||<bgcolor="#eef3ff">0.99 ||<bgcolor="#eef3ff"> 0.97 ||<bgcolor="#eef3ff"> 1.00 ||<bgcolor="#eef3ff"> 0.98 ||<bgcolor="#eef3ff">0.99 ||<bgcolor="#eef3ff"> 0.98 ||
||obj || 0.89 || 0.81 || 0.91 || 0.86 || 0.89 || 0.83 ||
||<bgcolor="#eef3ff"> obj_th ||<bgcolor="#eef3ff">0.83 ||<bgcolor="#eef3ff"> 0.76   ||<bgcolor="#eef3ff"> 0.76 ||<bgcolor="#eef3ff"> 0.65 ||<bgcolor="#eef3ff">0.80 ||<bgcolor="#eef3ff"> 0.70 ||
||pd || 0.86 || 0.77 || 0.80 || 0.72 || 0.87 || 0.74 ||
||<bgcolor="#eef3ff"> pre_coord ||<bgcolor="#eef3ff">0.86 ||<bgcolor="#eef3ff"> 0.76 ||<bgcolor="#eef3ff">0.78 ||<bgcolor="#eef3ff"> 0.55 ||<bgcolor="#eef3ff">0.82 ||<bgcolor="#eef3ff"> 0.64 ||
||punct || 0.97 || 0.75 || 0.98 || 0.76 || 0.88 || 0.76 ||
||<bgcolor="#eef3ff">refl ||<bgcolor="#eef3ff"> 0.99 ||<bgcolor="#eef3ff"> 0.96 ||<bgcolor="#eef3ff"> 0.99 ||<bgcolor="#eef3ff"> 0.96 ||<bgcolor="#eef3ff"> 0.99 ||<bgcolor="#eef3ff"> 0.96 ||
||root || 0.91 || 0.80 || 0.91 || 0.81 || 0.94 || 0.80 ||
||<bgcolor="#eef3ff"> subj ||<bgcolor="#eef3ff"> 0.94||<bgcolor="#eef3ff"> 0.84 ||<bgcolor="#eef3ff">0.94 ||<bgcolor="#eef3ff"> 0.83 ||<bgcolor="#eef3ff">0.94 ||<bgcolor="#eef3ff"> 0.84 ||
}}}
Line 52: Line 113:
== Dependency parser integrated into Multiservice NLP for Polish ==
The !MaltParser model for Polish can be tested in Multiservice NLP – [[http://multiservice.nlp.ipipan.waw.pl]].
To parse a Polish text in Multiservice, choose "3: Pantera, `DependencyParser`" under "select predefined chain of actions", enter your text, and press the "Run" button.
=== COMBO demos ===
Line 56: Line 115:
== Publications ==
<<BibMate(key, "wrob:14", omitYears=true)>>
<<BibMate(key, "wro:prz:14", omitYears=true)>>
<<BibMate(key, "wroblewska:12", omitYears=true)>>
<<BibMate(key, "awmw:deparsing", omitYears=true)>>
 * [[http://combo-demo.nlp.ipipan.waw.pl/combo-eng|English]]
 * [[http://combo-demo.nlp.ipipan.waw.pl/combo-pl|Polish]]

{{{#!wiki comment
 * [[http://scwad-demo.nlp.ipipan.waw.pl:8000/dependency-parsing|COMBO demo]]
 * [[http://multiservice.nlp.ipipan.waw.pl|MaltParser demo in Multiservice NLP]]
  * To parse a Polish text in Multiservice, choose "5: Concraft, !DependencyParser" under "Select predefined chain of actions", enter your text, and press the "Run" button.
  * To download the parser's output in CoNLL format, choose it under "Select output format:".}}}

=== Publications ===

<<BibMate(key, "kli:wro:2021b", omitYears=true)>>
<<BibMate(key, "wro:ryb:2019", omitYears=true)>> (Note: Please contact the first author to get a copy of this article.)
{{{#!wiki comment
<<BibMate(key, "wro:14", omitYears=true)>>
<<BibMate(key, "wroblewska:12", omitYears=true)>>}}}
Line 63: Line 133:
== Contact == === Licensing ===

The dependency parsing models for Polish are released under the [[https://creativecommons.org/licenses/by-nc-sa/4.0/|CC BY-NC-SA 4.0]] licence and by downloading them you accept the conditions of that licence.

=== Acknowledgment ===
The research was funded by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre Poland and by the Polish Ministry of Science and Higher Education as part of the investment in the CLARIN-PL research infrastructure and DARIAH-PL. The computing was performed at Poznań Supercomputing and Networking Center.


=== Contact ===

COMBO's models for Polish

COMBO's models for Polish are trained on the current version of the Polish Dependency Bank and use the HerBERT language model.

PDB-trained models

  • model for dependency parsing only

  • model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types without semantic extensions, e.g. adjunct instead of adjunct_temp)

  • model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types with semantic extensions, e.g. adjunct_temp)
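
The difference between the two label sets can be illustrated with a small mapping from an extended label to its base label. This is only a sketch: it assumes that a semantic extension is a suffix attached with an underscore, and the only extension attested in the text above is `temp` (as in adjunct_temp). The function name and the suffix set are hypothetical and would need to be extended with the full inventory of extensions.

```python
# Assumption: semantic extensions are suffixes appended after an underscore.
# Only "temp" is attested in the text above; add further suffixes as needed.
SEMANTIC_SUFFIXES = frozenset({"temp"})


def strip_semantic_extension(label: str, suffixes=SEMANTIC_SUFFIXES) -> str:
    """Map a label such as adjunct_temp to its base label adjunct.

    Labels whose last underscore-separated part is not a known semantic
    suffix are returned unchanged.
    """
    base, separator, suffix = label.rpartition("_")
    if separator and suffix in suffixes:
        return base
    return label
```

An explicit suffix set is used instead of cutting at every underscore, because a label may contain an underscore that is not a semantic extension.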

PDB-UD-trained model

  • model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing

COMBO

COMBO demos

Publications

List of publications

Mateusz Klimaszewski and Alina Wróblewska. COMBO: State-of-the-art morphosyntactic analysis. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 50–62, Online and Punta Cana, Dominican Republic, 2021. Association for Computational Linguistics.

Alina Wróblewska and Piotr Rybak. Dependency parsing of Polish. Poznań Studies in Contemporary Linguistics, 55(2):305–337, 2019.

(Note: Please contact the first author to get a copy of this article.)

Licensing

The dependency parsing models for Polish are released under the CC BY-NC-SA 4.0 licence and by downloading them you accept the conditions of that licence.

Acknowledgment

The research was funded by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre Poland and by the Polish Ministry of Science and Higher Education as part of the investment in the CLARIN-PL research infrastructure and DARIAH-PL. The computing was performed at Poznań Supercomputing and Networking Center.

Contact

Any questions, comments? Please send them to <alina AT SPAMFREE ipipan DOT waw DOT pl>.