Differences between revisions 12 and 19 (spanning 7 versions)

PDB-based dependency parsing models for Polish

The PDB-based models are trained on the current version of the Polish Dependency Bank (PDB) with publicly available parsing systems: COMBO, MaltParser, or MateParser. MaltParser is a transition-based dependency parser that uses a deterministic parsing algorithm. The algorithm builds the dependency structure of an input sentence from transitions (shift-reduce actions) predicted by a classifier, which learns from training data to predict the next transition given the parse history. MateParser, in turn, is a graph-based parser: it defines a space of well-formed candidate dependency trees for an input sentence, scores them with an induced parsing model, and selects the highest-scoring tree as the analysis of the sentence.
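
The transition-based procedure described above can be illustrated with a minimal sketch. It assumes an arc-standard transition system and replaces the trained classifier with a hypothetical scripted sequence of transitions; the function `parse`, the action names, and the example sentence "Ala ma kota" are illustrative assumptions and do not reflect MaltParser's actual implementation or configuration.

```python
from typing import Callable, List, Tuple

Arc = Tuple[int, str, int]  # (head index, relation label, dependent index)
Predictor = Callable[[List[int], List[int], List[Arc]], str]


def parse(n_tokens: int, predict: Predictor) -> List[Arc]:
    """Arc-standard shift-reduce parsing; index 0 is an artificial root node."""
    stack: List[int] = [0]
    buffer: List[int] = list(range(1, n_tokens + 1))
    arcs: List[Arc] = []
    # Stop once the buffer is empty and only the artificial root is left.
    while buffer or len(stack) > 1:
        # The predictor sees the parser state and the arcs built so far.
        action = predict(stack, buffer, arcs)
        name, _, label = action.partition(":")
        if name == "SHIFT" and buffer:
            # Move the next input token onto the stack.
            stack.append(buffer.pop(0))
        elif name == "LEFT-ARC" and len(stack) > 2:
            # The top of the stack becomes the head of the item below it.
            dependent = stack.pop(-2)
            arcs.append((stack[-1], label, dependent))
        elif name == "RIGHT-ARC" and len(stack) > 1:
            # The item below the top becomes the head of the top item.
            dependent = stack.pop()
            arcs.append((stack[-1], label, dependent))
        else:
            raise ValueError(f"transition {action!r} is not applicable")
    return arcs


# Hypothetical transition sequence for "Ala ma kota" (1=Ala, 2=ma, 3=kota).
# A real parser would obtain each transition from its trained classifier.
script = iter(["SHIFT", "SHIFT", "LEFT-ARC:subj",
               "SHIFT", "RIGHT-ARC:obj", "RIGHT-ARC:root"])
arcs = parse(3, lambda stack, buffer, history: next(script))
```

Because each step applies a single predicted transition and is never revised, the procedure is deterministic in the sense used above.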

PDB-based models

MateParser

NEW! PDBMate model (compatible with the tagset of Morfeusz 2): 180322_PDBMate.mdl
PDBMate model (compatible with the tagset of Morfeusz): 170608_PDBMate.mdl

MaltParser

NEW! PDBMalt model (compatible with the tagset of Morfeusz 2): 180322_PDBMalt.mco
PDBMalt model (compatible with the tagset of Morfeusz): 170608_PDBMalt.mco

PDBUD-based dependency parsing models for Polish

UDPipe

UDPipe model for Polish: 180606_PDBUDPipe.udpipe

Parsing performance

10-fold cross-validation (avg.)

Model	LAS	UAS
`PDBMate`	0.85	0.89
`PDBMalt`	0.82	0.86
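
For reference, the two metrics can be computed as in the sketch below, which follows the standard definitions: UAS is the fraction of tokens with the correct head, and LAS is the fraction with both the correct head and the correct relation label. The function name `attachment_scores` and the toy data are assumptions for illustration; whether punctuation tokens were counted in the evaluation reported here is not stated.

```python
from typing import List, Tuple

Token = Tuple[int, str]  # (head index, relation label) for one token


def attachment_scores(gold: List[Token], predicted: List[Token]) -> Tuple[float, float]:
    """Return (LAS, UAS) for two aligned token sequences."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and aligned")
    # UAS counts tokens whose predicted head matches the gold head.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS additionally requires the relation label to match.
    las_hits = sum(g == p for g, p in zip(gold, predicted))
    return las_hits / len(gold), uas_hits / len(gold)


# Toy example: token 3 has the right head but the wrong label,
# token 4 is attached to the wrong head.
gold = [(2, "subj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "subj"), (0, "root"), (2, "comp"), (3, "punct")]
las, uas = attachment_scores(gold, pred)
print(f"LAS={las:.2f} UAS={uas:.2f}")  # LAS=0.50 UAS=0.75
```

In a 10-fold cross-validation, these scores would be computed on each held-out fold and then averaged.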

Precision, recall and f-score of individual dependency relations (avg.)

The Polish dependency relation types are described on the page "Polish dependency relation types".

Dependency relation type	Precision (Mate)	Precision (Malt)	Recall (Mate)	Recall (Malt)	F-measure (Mate)	F-measure (Malt)
abbrev_punct	0.99	0.99	0.98	0.97	0.98	0.98
adjunct	0.89	0.73	0.92	0.77	0.82	0.75
adjunct_qt	0.74	0.51	0.76	0.58	0.75	0.55
aglt	1.00	0.98	1.00	0.98	0.98	0.98
app	0.75	0.58	0.69	0.52	0.72	0.55
aux	0.95	0.90	0.97	0.92	0.96	0.91
comp	0.90	0.85	0.87	0.82	0.88	0.84
comp_ag	0.95	0.90	0.96	0.91	0.94	0.90
comp_fin	0.87	0.75	0.86	0.79	0.87	0.77
comp_inf	0.95	0.91	0.96	0.90	0.93	0.90
cond	1.00	0.97	1.00	0.96	1.00	0.96
conjunct	0.85	0.71	0.82	0.65	0.82	0.68
imp	0.98	0.97	0.91	0.87	0.94	0.92
item	0.87	0.40	0.73	0.37	0.61	0.39
mwe	0.90	0.83	0.83	0.75	0.87	0.79
ne	0.87	0.78	0.73	0.64	0.76	0.70
neg	0.99	0.97	1.00	0.98	0.99	0.98
obj	0.89	0.81	0.91	0.86	0.89	0.83
obj_th	0.83	0.76	0.76	0.65	0.80	0.70
pd	0.86	0.77	0.80	0.72	0.87	0.74
pre_coord	0.86	0.76	0.78	0.55	0.82	0.64
punct	0.97	0.75	0.98	0.76	0.88	0.76
refl	0.99	0.96	0.99	0.96	0.99	0.96
root	0.91	0.80	0.91	0.81	0.94	0.80
subj	0.94	0.84	0.94	0.83	0.94	0.84
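
The per-relation measures can be sketched as follows. The sketch assumes that a token counts as correct for a relation type only when both its head and its label match the gold standard; this convention, the function name `per_relation_scores`, and the toy data are assumptions and may differ from the evaluation procedure actually used for the table.

```python
from collections import Counter
from typing import Dict, List, Tuple

Token = Tuple[int, str]  # (head index, relation label) for one token


def per_relation_scores(gold: List[Token],
                        predicted: List[Token]) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, F-measure)} for aligned sequences."""
    correct, gold_total, pred_total = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        gold_total[g[1]] += 1
        pred_total[p[1]] += 1
        if g == p:  # head and label both correct
            correct[g[1]] += 1
    scores = {}
    for label in set(gold_total) | set(pred_total):
        precision = correct[label] / pred_total[label] if pred_total[label] else 0.0
        recall = correct[label] / gold_total[label] if gold_total[label] else 0.0
        total = precision + recall
        # F-measure is the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores


# Toy example: the gold "obj" on token 3 is mislabelled as "adjunct".
gold = [(2, "subj"), (0, "root"), (2, "obj"), (2, "adjunct")]
pred = [(2, "subj"), (0, "root"), (2, "adjunct"), (2, "adjunct")]
scores = per_relation_scores(gold, pred)
```

Here `adjunct` is predicted twice but is correct only once, so its precision is 0.5 while its recall is 1.0; `obj` is never predicted, so all three of its values are 0.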

Dependency parser integrated into Multiservice NLP for Polish

The MaltParser model for Polish can be tested in Multiservice NLP at http://multiservice.nlp.ipipan.waw.pl.
To parse a Polish text in Multiservice, choose "5: Concraft, DependencyParser" under "Select predefined chain of actions", enter your text, and press the "Run" button.
To download the parser's output in CoNLL format, "Select output format:":

Publications

List of publications

Alina Wróblewska. Polish Dependency Parser Trained on an Automatically Induced Dependency Bank. Ph.D. dissertation, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2014.

Alina Wróblewska and Adam Przepiórkowski. Projection-based annotation of a Polish dependency treebank. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, pages 2306–2312, Reykjavík, Iceland, 2014. European Language Resources Association (ELRA).

Alina Wróblewska. Polish dependency bank. Linguistic Issues in Language Technology, 7(1), 2012.

Licensing

The dependency parsing models for Polish are released under the CC BY-NC-SA 4.0 licence; by downloading them you accept the conditions of that licence.

Contact

Any questions, comments? Please send them to <alina AT SPAMFREE ipipan DOT waw DOT pl>.

- Revision 12 as of 2018-03-22 16:31:48
  Size: 6739
  Editor: AlinaWroblewska
+ Revision 19 as of 2018-10-05 14:45:05
  Size: 6927
  Editor: AlinaWroblewska
-Deletions are marked like this.
+Additions are marked like this.
 Line 2:
-= PDBparser =
+= PDB-based dependency parsing models for Polish =
 Line 4:
-PDBparser is a Polish dependency parser trained on the current version of ([[http://zil.ipipan.waw.pl/PDB|Polish Depedency Bank]]) with the publicly available parsing systems – [[http://maltparser.org|MaltParser]] or [[https://code.google.com/archive/p/mate-tools/|MateParser]]. `MaltParser` is a transition-based dependency parser that uses a deterministic parsing algorithm. The deterministic parsing algorithm builds a dependency structure of an input sentence based on transitions (shift-reduce actions) predicted by a classifier. The classifier learns to predict the next transition given training data and the parse history. `MateParser`, in turn, is a graph-based parser that defines a space of well-formed candidate dependency trees for an input sentence, scores them given an induced parsing model, and selects the highest scoring dependency tree as a correct analysis of the input sentence.
+The PDB-based models are trained on the current version of ([[http://zil.ipipan.waw.pl/PDB|Polish Depedency Bank]]) with the publicly available parsing systems – [[|COMBO]], [[http://maltparser.org|MaltParser]] or [[https://code.google.com/archive/p/mate-tools/|MateParser]]. `MaltParser` is a transition-based dependency parser that uses a deterministic parsing algorithm. The deterministic parsing algorithm builds a dependency structure of an input sentence based on transitions (shift-reduce actions) predicted by a classifier. The classifier learns to predict the next transition given training data and the parse history. `MateParser`, in turn, is a graph-based parser that defines a space of well-formed candidate dependency trees for an input sentence, scores them given an induced parsing model, and selects the highest scoring dependency tree as a correct analysis of the input sentence.
 Line 6:
-== PDB-based parsing models for Polish ==
+== PDB-based models ==
 Line 10:
+ * '''NEW!''' PDBMate model (compatible with the tagset of Morfeusz 2): [[attachment:180322_PDBMate.mdl]]
-Line 11:
+Line 12:
- * PDBMate model (compatible with the tagset of Morfeusz 2):
 Line 14:
+ * '''NEW!''' PDBMalt model (compatible with the tagset of Morfeusz 2): [[attachment:180322_PDBMalt.mco]]
-Line 15:
+Line 17:
- * PDBMalt model (compatible with the tagset of Morfeusz 2): [[attachment:180322_PDBMalt.mco]]
-Line 17:
+Line 18:
-=== Semantic PDB models ===
 * Semantic PDBMate model:
 * Semantic PDBMalt model:
+= PDBUD-based dependency parsing models for Polish=
-Line 21:
+Line 20:
-The dependency parsing models for Polish are released under the GNU General Public License v3 (GPL v.3) and by downloading it you accept the conditions of that licence.
+=== UDPipe ===
 * UDPipe model for Polish: [[attachment:180606_PDBUDPipe.udpipe]]
-Line 25:
+Line 24:
 Line 69:
-<<BibMate(key, "wrob:14", omitYears=true)>>
+<<BibMate(key, "wro:14", omitYears=true)>>
 Line 72:
-<<BibMate(key, "awmw:deparsing", omitYears=true)>>
+<<BibMate(key, "awmw:departing", omitYears=true)>>
 Line 75:
-== Contact ==
+=== Licensing ===

The dependency parsing models for Polish are released under the [[https://creativecommons.org/licenses/by-nc-sa/4.0/|CC BY-NC-SA 4.0]] licence and by downloading it you accept the conditions of that licence.

=== Contact ===

Diff for "PDB/PDBparser"
