PDB-based dependency parsing models for Polish
The PDB-based models are trained on the current version of the Polish Dependency Bank with the publicly available parsing systems COMBO, MateParser and MaltParser.
PDB-based models
COMBO
NEW! PDB-based COMBO model compatible with the tagset of Morfeusz 2: 180912_PDBCOMBO.pkl
MateParser
NEW! PDB-based Mate model compatible with the tagset of Morfeusz 2: 180322_PDBMate.mdl
PDB-based Mate model compatible with the tagset of Morfeusz: 170608_PDBMate.mdl
MaltParser
NEW! PDB-based MaltParser model compatible with the tagset of Morfeusz 2: 180322_PDBMalt.mco
PDB-based MaltParser model compatible with the tagset of Morfeusz: 170608_PDBMalt.mco
PDBUD-based dependency parsing models for Polish
UDPipe
UDPipe model for Polish: 180606_PDBUDPipe.udpipe
Parsing performance
10-fold cross-validation (avg.)
| Model | LAS | UAS |
| PDBMate | 0.85 | 0.89 |
| PDBMalt | 0.82 | 0.86 |
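The LAS and UAS figures above follow the usual definitions: UAS is the share of tokens that receive the correct head, and LAS is the share that receive both the correct head and the correct relation label. The following is a minimal sketch of that computation, not the evaluation script used for the reported numbers. The function name `attachment_scores`, the `(head, label)` tuple representation and the sample sentence are assumptions made for illustration, and the page does not say whether punctuation tokens were included in scoring.

```python
from typing import List, Tuple

# Each token is a (head_index, relation_label) pair; 0 denotes the artificial root.
# This representation is an assumption made for the sketch.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (LAS, UAS) for two token-aligned lists of arcs."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty token list")
    # UAS counts tokens whose predicted head matches the gold head.
    correct_heads = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS additionally requires the relation label to match.
    correct_arcs = sum(g == p for g, p in zip(gold, predicted))
    total = len(gold)
    return correct_arcs / total, correct_heads / total


# Hypothetical four-token sentence: one wrong label, one wrong head.
gold = [(2, "subj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "subj"), (0, "root"), (2, "comp"), (3, "punct")]
las, uas = attachment_scores(gold, pred)
print(f"LAS={las:.2f} UAS={uas:.2f}")
```

In a 10-fold cross-validation setting, scores like these would be computed on each held-out fold and then averaged.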
Precision, recall and F-measure of individual dependency relations (avg.)
The Polish dependency relation types are described on the page Polish dependency relation types (http://zil.ipipan.waw.pl/PDB/DepRelTypes).
| Dependency relation type | Precision (Mate) | Precision (Malt) | Recall (Mate) | Recall (Malt) | F-measure (Mate) | F-measure (Malt) |
| abbrev_punct | 0.99 | 0.99 | 0.98 | 0.97 | 0.98 | 0.98 |
| adjunct | 0.89 | 0.73 | 0.92 | 0.77 | 0.82 | 0.75 |
| adjunct_qt | 0.74 | 0.51 | 0.76 | 0.58 | 0.75 | 0.55 |
| aglt | 1.00 | 0.98 | 1.00 | 0.98 | 0.98 | 0.98 |
| app | 0.75 | 0.58 | 0.69 | 0.52 | 0.72 | 0.55 |
| aux | 0.95 | 0.90 | 0.97 | 0.92 | 0.96 | 0.91 |
| comp | 0.90 | 0.85 | 0.87 | 0.82 | 0.88 | 0.84 |
| comp_ag | 0.95 | 0.90 | 0.96 | 0.91 | 0.94 | 0.90 |
| comp_fin | 0.87 | 0.75 | 0.86 | 0.79 | 0.87 | 0.77 |
| comp_inf | 0.95 | 0.91 | 0.96 | 0.90 | 0.93 | 0.90 |
| cond | 1.00 | 0.97 | 1.00 | 0.96 | 1.00 | 0.96 |
| conjunct | 0.85 | 0.71 | 0.82 | 0.65 | 0.82 | 0.68 |
| imp | 0.98 | 0.97 | 0.91 | 0.87 | 0.94 | 0.92 |
| item | 0.87 | 0.4 | 0.73 | 0.37 | 0.61 | 0.39 |
| mwe | 0.90 | 0.83 | 0.83 | 0.75 | 0.87 | 0.79 |
| ne | 0.87 | 0.78 | 0.73 | 0.64 | 0.76 | 0.70 |
| neg | 0.99 | 0.97 | 1.00 | 0.98 | 0.99 | 0.98 |
| obj | 0.89 | 0.81 | 0.91 | 0.86 | 0.89 | 0.83 |
| obj_th | 0.83 | 0.76 | 0.76 | 0.65 | 0.80 | 0.70 |
| pd | 0.86 | 0.77 | 0.80 | 0.72 | 0.87 | 0.74 |
| pre_coord | 0.86 | 0.76 | 0.78 | 0.55 | 0.82 | 0.64 |
| punct | 0.97 | 0.75 | 0.98 | 0.76 | 0.88 | 0.76 |
| refl | 0.99 | 0.96 | 0.99 | 0.96 | 0.99 | 0.96 |
| root | 0.91 | 0.80 | 0.91 | 0.81 | 0.94 | 0.80 |
| subj | 0.94 | 0.84 | 0.94 | 0.83 | 0.94 | 0.84 |
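Per-relation precision, recall and F-measure such as those listed above can be derived from token-aligned gold and predicted arcs. The sketch below is one plausible way to do so and is not taken from the evaluation behind the table. It assumes that a predicted arc counts as correct for a relation only when both its head and its label match the gold arc; the function name `per_relation_scores` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (head_index, relation_label) per token; an assumed representation.
Arc = Tuple[int, str]


def per_relation_scores(
    gold: List[Arc], predicted: List[Arc]
) -> Dict[str, Tuple[float, float, float]]:
    """Map each relation label to (precision, recall, F-measure)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    gold_count = Counter(label for _, label in gold)
    pred_count = Counter(label for _, label in predicted)
    # Assumption: an arc is correct only if head and label both match.
    correct = Counter(g[1] for g, p in zip(gold, predicted) if g == p)

    scores = {}
    for label in sorted(set(gold_count) | set(pred_count)):
        precision = correct[label] / pred_count[label] if pred_count[label] else 0.0
        recall = correct[label] / gold_count[label] if gold_count[label] else 0.0
        denom = precision + recall
        # F-measure is the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores


# Hypothetical five-token sentence in which "obj" is mislabelled as "adjunct".
gold = [(2, "subj"), (0, "root"), (2, "obj"), (2, "adjunct"), (2, "punct")]
pred = [(2, "subj"), (0, "root"), (2, "adjunct"), (2, "adjunct"), (2, "punct")]
scores = per_relation_scores(gold, pred)
for label, (p, r, f) in scores.items():
    print(f"{label}: P={p:.2f} R={r:.2f} F={f:.2f}")
```

Averaged figures would come from computing these scores per cross-validation fold and taking the mean for each relation.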
Dependency parser integrated into Multiservice NLP for Polish
The performance of the MaltParser model for Polish can be tested in Multiservice NLP: http://multiservice.nlp.ipipan.waw.pl.
To parse a Polish text in Multiservice, choose "5: Concraft, DependencyParser" under "Select predefined chain of actions", enter your text, and press the "Run" button.
To download the parser's output in CoNLL format, "Select output format:":
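Once the output has been downloaded, it can be read with a few lines of code. The sketch below assumes the file follows the ten-column, tab-separated CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) with blank lines between sentences; the exact layout produced by Multiservice is not described here and may differ. The function name `read_conll` and the sample sentence, including its tags and relations, are hypothetical.

```python
from typing import Dict, List

# Column names of the CoNLL-X format, in order.
CONLL_FIELDS = [
    "id", "form", "lemma", "cpostag", "postag",
    "feats", "head", "deprel", "phead", "pdeprel",
]


def read_conll(text: str) -> List[List[Dict[str, str]]]:
    """Split CoNLL-X text into sentences, each a list of token dictionaries."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLL_FIELDS):
            raise ValueError(f"expected {len(CONLL_FIELDS)} columns, got {len(columns)}")
        current.append(dict(zip(CONLL_FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical parser output for the sentence "Ala śpi."
sample = (
    "1\tAla\tAla\tsubst\tsubst\tsg|nom|f\t2\tsubj\t_\t_\n"
    "2\tśpi\tspać\tfin\tfin\tsg|ter|imperf\t0\troot\t_\t_\n"
    "3\t.\t.\tinterp\tinterp\t_\t2\tpunct\t_\t_\n"
)
sentences = read_conll(sample)
for token in sentences[0]:
    print(token["form"], token["head"], token["deprel"])
```

All values are kept as strings; convert `head` and `id` to integers if the tree structure needs to be traversed.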
Licensing
The dependency parsing models for Polish are released under the CC BY-NC-SA 4.0 licence. By downloading them you accept the conditions of that licence.
Contact
Any questions or comments? Please send them to <alina AT SPAMFREE ipipan DOT waw DOT pl>.