PDB/PDBparser

Polish COMBO models

The COMBO models for Polish are trained on the current version of the Polish Dependency Bank (PDB). The models use the HerBERT language model.

PDB-trained models

- model for dependency parsing only
- model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types without semantic extensions, e.g. adjunct instead of adjunct_temp)
- model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing (dependency relation types with semantic extensions, e.g. adjunct_temp)

PDB-UD-trained model

model for part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing

Parsing performance (outdated)

See Dependency parsing section.

COMBO
- 190115_COMBO_PDB_nosem.pkl – PDB-based COMBO model for part-of-speech tagging, lemmatisation, and dependency parsing
- 190115_COMBO_PDB_sem.pkl – PDB-based COMBO model for part-of-speech tagging, lemmatisation, dependency parsing and semantic role labelling
- NEW! PDB-based COMBO model compatible with the tagset of Morfeusz 2: 180912_PDBCOMBO.pkl
MateParser
- 190125_MATE_PDB.model – PDB-based MateParser model for dependency parsing
- NEW! PDB-based Mate model compatible with the tagset of Morfeusz 2: 180322_PDBMate.mdl
- PDB-based Mate model compatible with the tagset of Morfeusz: 170608_PDBMate.mdl
MaltParser
- 190125_MALT_PDB.mco – PDB-based MaltParser model for dependency parsing
- NEW! PDB-based MaltParser model compatible with the tagset of Morfeusz 2: 180322_PDBMalt.mco
- PDB-based MaltParser model compatible with the tagset of Morfeusz: 170608_PDBMalt.mco

10-fold cross-validation (avg.)

Model	LAS	UAS
`PDBMate`	0.85	0.89
`PDBMalt`	0.82	0.86
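
For readers unfamiliar with the two metrics in the table above, the following is a minimal sketch of how LAS (labelled attachment score) and UAS (unlabelled attachment score) are commonly computed. It is not the evaluation script used for these results: the function name `attachment_scores`, the token representation, and the toy data are all assumptions made for illustration, and the sketch scores every token, including punctuation.

```python
from typing import List, Tuple

# Each token is (head index, dependency relation label); head 0 marks the root.
# This representation is an assumption made for the example.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (LAS, UAS) for two aligned token sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must be aligned")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # UAS: the predicted head matches the gold head.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: both the head and the relation label match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return las_hits / n, uas_hits / n


# Toy data (invented): four tokens, one wrong label and one wrong head.
gold = [(2, "subj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "subj"), (0, "root"), (2, "adjunct"), (3, "punct")]
las, uas = attachment_scores(gold, pred)
print(f"LAS={las:.2f} UAS={uas:.2f}")  # prints: LAS=0.50 UAS=0.75
```

In a 10-fold setting, scores like these would be computed on each held-out fold and then averaged.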

Precision, recall and f-score of individual dependency relations (avg.)

A description of the Polish dependency relation types is available on the Polish dependency relation types page.

Dependency relation type	Precision		Recall		F-Measure
Dependency relation type	Mate	Malt	Mate	Malt	Mate	Malt
abbrev_punct	0.99	0.99	0.98	0.97	0.98	0.98
adjunct	0.89	0.73	0.92	0.77	0.82	0.75
adjunct_qt	0.74	0.51	0.76	0.58	0.75	0.55
aglt	1.00	0.98	1.00	0.98	0.98	0.98
app	0.75	0.58	0.69	0.52	0.72	0.55
aux	0.95	0.90	0.97	0.92	0.96	0.91
comp	0.90	0.85	0.87	0.82	0.88	0.84
comp_ag	0.95	0.90	0.96	0.91	0.94	0.90
comp_fin	0.87	0.75	0.86	0.79	0.87	0.77
comp_inf	0.95	0.91	0.96	0.90	0.93	0.90
cond	1.00	0.97	1.00	0.96	1.00	0.96
conjunct	0.85	0.71	0.82	0.65	0.82	0.68
imp	0.98	0.97	0.91	0.87	0.94	0.92
item	0.87	0.40	0.73	0.37	0.61	0.39
mwe	0.90	0.83	0.83	0.75	0.87	0.79
ne	0.87	0.78	0.73	0.64	0.76	0.70
neg	0.99	0.97	1.00	0.98	0.99	0.98
obj	0.89	0.81	0.91	0.86	0.89	0.83
obj_th	0.83	0.76	0.76	0.65	0.80	0.70
pd	0.86	0.77	0.80	0.72	0.87	0.74
pre_coord	0.86	0.76	0.78	0.55	0.82	0.64
punct	0.97	0.75	0.98	0.76	0.88	0.76
refl	0.99	0.96	0.99	0.96	0.99	0.96
root	0.91	0.80	0.91	0.81	0.94	0.80
subj	0.94	0.84	0.94	0.83	0.94	0.84
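
The per-relation figures in the table above can be read with the usual definitions: precision is the share of predicted arcs with a given label that are correct, recall is the share of gold arcs with that label that were recovered, and the F-measure is their harmonic mean, F = 2PR / (P + R). The sketch below illustrates this on invented data. It assumes that an arc counts as correct only when both head and label match; the helper name `per_relation_prf` is hypothetical and the original evaluation tool may apply different conventions.

```python
from collections import Counter
from typing import Dict, List, Tuple

Token = Tuple[int, str]  # (head index, dependency relation label)


def per_relation_prf(
    gold: List[Token], pred: List[Token]
) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, f_measure)} for aligned sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must be aligned")
    gold_count = Counter(label for _, label in gold)
    pred_count = Counter(label for _, label in pred)
    # Assumption: a token is correct for a label only if head AND label match.
    correct = Counter(g[1] for g, p in zip(gold, pred) if g == p)
    scores = {}
    for label in sorted(set(gold_count) | set(pred_count)):
        p = correct[label] / pred_count[label] if pred_count[label] else 0.0
        r = correct[label] / gold_count[label] if gold_count[label] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = (p, r, f)
    return scores


# Toy data (invented): the gold "obj" arc is mislabelled as "adjunct".
gold_arcs = [(2, "subj"), (0, "root"), (2, "obj"), (2, "adjunct")]
pred_arcs = [(2, "subj"), (0, "root"), (2, "adjunct"), (2, "adjunct")]
for label, (p, r, f) in per_relation_prf(gold_arcs, pred_arcs).items():
    print(f"{label}\t{p:.2f}\t{r:.2f}\t{f:.2f}")
```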

PDB-based dependency parsing demos

COMBO demo (only in Polish)
MaltParser demo in Multiservice NLP
- To parse a Polish text in Multiservice, choose "5: Concraft, DependencyParser" under "Select predefined chain of actions", enter your text, and press the "Run" button.
- To download the parser's output in CoNLL format, choose that format under "Select output format:".
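
Once the output has been downloaded, it can be processed with a few lines of code. The sketch below assumes the ten tab-separated columns of the CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) with blank lines between sentences; the demo's actual column layout may differ. The function name `read_conll` and the sample sentence with its tags are invented for illustration.

```python
from typing import Dict, List

# Assumed CoNLL-X column layout; adjust if the downloaded file differs.
COLUMNS = [
    "id", "form", "lemma", "cpostag", "postag",
    "feats", "head", "deprel", "phead", "pdeprel",
]


def read_conll(text: str) -> List[List[Dict[str, str]]]:
    """Split CoNLL text into sentences, each a list of token dictionaries."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split("\t")
        current.append(dict(zip(COLUMNS, fields)))
    if current:
        sentences.append(current)
    return sentences


# Invented sample; the tags and labels are illustrative only.
sample = (
    "1\tAla\tAla\tsubst\tsubst\t_\t2\tsubj\t_\t_\n"
    "2\tśpi\tspać\tfin\tfin\t_\t0\troot\t_\t_\n"
    "3\t.\t.\tinterp\tinterp\t_\t2\tpunct\t_\t_\n"
)
for token in read_conll(sample)[0]:
    print(token["form"], token["head"], token["deprel"])
```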

Publications

Mateusz Klimaszewski and Alina Wróblewska. COMBO: State-of-the-art morphosyntactic analysis. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 50–62, Online and Punta Cana, Dominican Republic, 2021. Association for Computational Linguistics.

Alina Wróblewska and Piotr Rybak. Dependency parsing of Polish. Poznań Studies in Contemporary Linguistics, 55(2):305–337, 2019.

(Note: Please contact the first author to get a copy of this article.)

Alina Wróblewska. Polish Dependency Parser Trained on an Automatically Induced Dependency Bank. Ph.D. dissertation, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2014.

Alina Wróblewska. Polish dependency bank. Linguistic Issues in Language Technology, 7(1), 2012.

Licensing

The dependency parsing models for Polish are released under the CC BY-NC-SA 4.0 licence and by downloading them you accept the conditions of that licence.

Acknowledgment

The research was funded by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre Poland and by the Polish Ministry of Science and Higher Education as part of the investment in the CLARIN-PL research infrastructure. The computing was performed at the Poznań Supercomputing and Networking Center.

Contact

Any questions, comments? Please send them to <alina AT SPAMFREE ipipan DOT waw DOT pl>.
