PDBparser
PDBparser is a Polish dependency parser trained on the current version of PDB (Polish Dependency Bank) with one of two publicly available parsing systems: MaltParser or MateParser. MaltParser is a transition-based dependency parser that uses a deterministic parsing algorithm. The algorithm builds the dependency structure of an input sentence step by step from transitions (shift-reduce actions) predicted by a classifier. The classifier is trained to predict the next transition given the current parser configuration and the parse history. MateParser, in turn, is a graph-based parser: it defines a space of well-formed candidate dependency trees for an input sentence, scores them with an induced parsing model, and selects the highest-scoring tree as the analysis of the sentence.
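To make the idea of transition-based parsing more concrete, the sketch below applies a hand-written sequence of arc-standard transitions to a three-word sentence. It is only an illustration under stated assumptions: MaltParser offers several parsing algorithms and predicts each transition with a trained classifier, whereas here the transition sequence is given by hand, and the function name `arc_standard` and the example sentence are hypothetical.

```python
from typing import List, Tuple


def arc_standard(words: List[str], transitions: List[str]) -> List[Tuple[int, int]]:
    """Apply arc-standard transitions and return (head, dependent) arcs.

    Tokens are numbered from 1; index 0 is the artificial root node.
    In a real parser a classifier would choose each transition.
    """
    stack = [0]                                # starts with the artificial root
    buffer = list(range(1, len(words) + 1))    # tokens still to be read
    arcs: List[Tuple[int, int]] = []

    for transition in transitions:
        if transition == "SHIFT":
            # Move the next input token onto the stack.
            stack.append(buffer.pop(0))
        elif transition == "LEFT-ARC":
            # The second-topmost stack item becomes a dependent of the topmost.
            dependent = stack.pop(-2)
            arcs.append((stack[-1], dependent))
        elif transition == "RIGHT-ARC":
            # The topmost stack item becomes a dependent of the one below it.
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
        else:
            raise ValueError(f"unknown transition: {transition}")
    return arcs


# "Ala ma kota" ("Ala has a cat"): "ma" heads both "Ala" and "kota".
sentence = ["Ala", "ma", "kota"]
sequence = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]
print(arc_standard(sentence, sequence))
```

Each arc is a pair (head, dependent), so the last arc produced attaches the verb to the artificial root.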
PDB-based parsing models for Polish
MateParser
NEW! PDBMate model (compatible with the tagset of Morfeusz 2): 180322_PDBMate.mdl
PDBMate model (compatible with the tagset of Morfeusz): 170608_PDBMate.mdl
MaltParser
NEW! PDBMalt model (compatible with the tagset of Morfeusz 2): 180322_PDBMalt.mco
PDBMalt model (compatible with the tagset of Morfeusz): 170608_PDBMalt.mco
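As a rough sketch of how a downloaded PDBMalt model might be applied, the helper below builds a MaltParser command line for parse mode. It relies on MaltParser's documented options `-c` (configuration name), `-i` (input file), `-o` (output file) and `-m` (mode); the jar file name, the input and output file names, and the helper name `maltparser_parse_command` are assumptions made for illustration. The input is assumed to be a CoNLL file already tagged with the tagset the model expects.

```python
from typing import List


def maltparser_parse_command(jar: str, model: str, infile: str, outfile: str) -> List[str]:
    """Build a MaltParser command that parses `infile` with a trained model.

    The configuration is passed by name, so a trailing ".mco" is dropped
    (assumption: the .mco file sits in the current working directory).
    """
    config = model[:-len(".mco")] if model.endswith(".mco") else model
    return ["java", "-jar", jar, "-c", config, "-i", infile, "-o", outfile, "-m", "parse"]


# Hypothetical file names; adjust the jar to the MaltParser release you use.
command = maltparser_parse_command(
    "maltparser-1.9.2.jar", "180322_PDBMalt.mco", "input.conll", "output.conll"
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where Java and MaltParser are installed.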
Parsing performance
10-fold cross-validation (avg.)
| Model | LAS | UAS |
|---|---|---|
| PDBMate | 0.85 | 0.89 |
| PDBMalt | 0.82 | 0.86 |
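For readers unfamiliar with the two metrics, the sketch below shows how they are commonly computed: UAS (unlabelled attachment score) is the share of tokens with the correct head, and LAS (labelled attachment score) is the share of tokens with both the correct head and the correct relation label. The function name and the toy data are hypothetical, and details of the evaluation reported here, such as whether punctuation tokens are counted, are not stated on this page.

```python
from typing import List, Tuple

Token = Tuple[int, str]  # (head index, dependency relation label)


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (LAS, UAS) for token-aligned gold and predicted analyses."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    # UAS counts tokens whose predicted head is correct.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS additionally requires the relation label to be correct.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return las_hits / len(gold), uas_hits / len(gold)


# Toy example: one wrong label (token 3) and one wrong head (token 4).
gold = [(2, "subj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "subj"), (0, "root"), (2, "adjunct"), (3, "punct")]
las, uas = attachment_scores(gold, pred)
print(f"LAS={las:.2f} UAS={uas:.2f}")
```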
Precision, recall and F-measure of individual dependency relations (avg.)
A description of the Polish dependency relation types is available on the Polish dependency relation types page.
| Dependency relation type | Precision (Mate) | Precision (Malt) | Recall (Mate) | Recall (Malt) | F-Measure (Mate) | F-Measure (Malt) |
|---|---|---|---|---|---|---|
| abbrev_punct | 0.99 | 0.99 | 0.98 | 0.97 | 0.98 | 0.98 |
| adjunct | 0.89 | 0.73 | 0.92 | 0.77 | 0.82 | 0.75 |
| adjunct_qt | 0.74 | 0.51 | 0.76 | 0.58 | 0.75 | 0.55 |
| aglt | 1.00 | 0.98 | 1.00 | 0.98 | 0.98 | 0.98 |
| app | 0.75 | 0.58 | 0.69 | 0.52 | 0.72 | 0.55 |
| aux | 0.95 | 0.90 | 0.97 | 0.92 | 0.96 | 0.91 |
| comp | 0.90 | 0.85 | 0.87 | 0.82 | 0.88 | 0.84 |
| comp_ag | 0.95 | 0.90 | 0.96 | 0.91 | 0.94 | 0.90 |
| comp_fin | 0.87 | 0.75 | 0.86 | 0.79 | 0.87 | 0.77 |
| comp_inf | 0.95 | 0.91 | 0.96 | 0.90 | 0.93 | 0.90 |
| cond | 1.00 | 0.97 | 1.00 | 0.96 | 1.00 | 0.96 |
| conjunct | 0.85 | 0.71 | 0.82 | 0.65 | 0.82 | 0.68 |
| imp | 0.98 | 0.97 | 0.91 | 0.87 | 0.94 | 0.92 |
| item | 0.87 | 0.40 | 0.73 | 0.37 | 0.61 | 0.39 |
| mwe | 0.90 | 0.83 | 0.83 | 0.75 | 0.87 | 0.79 |
| ne | 0.87 | 0.78 | 0.73 | 0.64 | 0.76 | 0.70 |
| neg | 0.99 | 0.97 | 1.00 | 0.98 | 0.99 | 0.98 |
| obj | 0.89 | 0.81 | 0.91 | 0.86 | 0.89 | 0.83 |
| obj_th | 0.83 | 0.76 | 0.76 | 0.65 | 0.80 | 0.70 |
| pd | 0.86 | 0.77 | 0.80 | 0.72 | 0.87 | 0.74 |
| pre_coord | 0.86 | 0.76 | 0.78 | 0.55 | 0.82 | 0.64 |
| punct | 0.97 | 0.75 | 0.98 | 0.76 | 0.88 | 0.76 |
| refl | 0.99 | 0.96 | 0.99 | 0.96 | 0.99 | 0.96 |
| root | 0.91 | 0.80 | 0.91 | 0.81 | 0.94 | 0.80 |
| subj | 0.94 | 0.84 | 0.94 | 0.83 | 0.94 | 0.84 |
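Per-relation figures of this kind can be computed along the lines of the sketch below. It assumes that a token counts as correct for a relation type only when both its head and its label match the gold standard; the exact criterion used for the evaluation above is not stated on this page, and the function name and toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Token = Tuple[int, str]  # (head index, dependency relation label)


def per_relation_scores(gold: List[Token], pred: List[Token]) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, F-measure)} for aligned tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    gold_count, pred_count, correct = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        gold_count[g[1]] += 1
        pred_count[p[1]] += 1
        if g == p:  # assumption: head and label must both be correct
            correct[g[1]] += 1

    scores = {}
    for label in set(gold_count) | set(pred_count):
        precision = correct[label] / pred_count[label] if pred_count[label] else 0.0
        recall = correct[label] / gold_count[label] if gold_count[label] else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores


# Toy example: the gold "obj" is mislabelled as "adjunct".
gold = [(2, "subj"), (0, "root"), (2, "obj"), (2, "adjunct")]
pred = [(2, "subj"), (0, "root"), (2, "adjunct"), (2, "adjunct")]
for label, (p, r, f) in sorted(per_relation_scores(gold, pred).items()):
    print(f"{label}: P={p:.2f} R={r:.2f} F={f:.2f}")
```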
Dependency parser integrated into Multiservice NLP for Polish
The performance of the MaltParser model for Polish can be tested in Multiservice NLP: http://multiservice.nlp.ipipan.waw.pl.
To parse a Polish text in Multiservice, choose "5: Concraft, DependencyParser" under "Select predefined chain of actions", enter your text, and press the "Run" button.
- To download the parser's output in CoNLL format, "Select output format:":
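Once the output has been downloaded, it can be read with a few lines of code. The sketch below assumes the common tab-separated, ten-column CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) with a blank line between sentences; whether Multiservice uses exactly this variant is an assumption, and the sample sentence with its tags is made up for illustration.

```python
from typing import Dict, List

# Column names of the CoNLL-X format (assumed layout of the downloaded file).
CONLL_COLUMNS = ["id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel"]


def read_conll(text: str) -> List[List[Dict[str, str]]]:
    """Split CoNLL text into sentences, each a list of token dictionaries."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(dict(zip(CONLL_COLUMNS, line.split("\t"))))
    if current:  # the last sentence may lack a trailing blank line
        sentences.append(current)
    return sentences


# Illustrative input only; the tags are not taken from real parser output.
sample = (
    "1\tAla\tAla\tsubst\tsubst\tsg|nom|f\t2\tsubj\t_\t_\n"
    "2\tma\tmieć\tfin\tfin\tsg|ter|imperf\t0\troot\t_\t_\n"
    "\n"
)
for token in read_conll(sample)[0]:
    print(token["form"], "->", token["head"], token["deprel"])
```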
Publications
Licensing
The dependency parsing models for Polish are released under the GNU General Public License v3 (GPL v3); by downloading them you accept the conditions of that licence.
Contact
Any questions, comments? Please send them to <alina AT SPAMFREE ipipan DOT waw DOT pl>.