Size: 6927
Comment:
|
Size: 7561
Comment:
|
Line 2: | Line 2: |
= PDB-based dependency parsing models for Polish = | == PDB-based dependency parsing models for Polish == |
Line 4: | Line 4: |
The PDB-based models are trained on the current version of ([[http://zil.ipipan.waw.pl/PDB|Polish Depedency Bank]]) with the publicly available parsing systems – [[|COMBO]], [[http://maltparser.org|MaltParser]] or [[https://code.google.com/archive/p/mate-tools/|MateParser]]. `MaltParser` is a transition-based dependency parser that uses a deterministic parsing algorithm. The deterministic parsing algorithm builds a dependency structure of an input sentence based on transitions (shift-reduce actions) predicted by a classifier. The classifier learns to predict the next transition given training data and the parse history. `MateParser`, in turn, is a graph-based parser that defines a space of well-formed candidate dependency trees for an input sentence, scores them given an induced parsing model, and selects the highest scoring dependency tree as a correct analysis of the input sentence. | The PDB-based models are trained on the current version of [[http://zil.ipipan.waw.pl/PDB|Polish Depedency Bank]] with the publicly available parsing systems – [[https://github.com/360er0/COMBO|COMBO]], [[https://code.google.com/archive/p/mate-tools/|MateParser]] and [[http://maltparser.org|MaltParser]]. /* ''MaltParser'' is a transition-based dependency parser that uses a deterministic parsing algorithm. The deterministic parsing algorithm builds a dependency structure of an input sentence based on transitions (shift-reduce actions) predicted by a classifier. The classifier learns to predict the next transition given training data and the parse history. `MateParser`, in turn, is a graph-based parser that defines a space of well-formed candidate dependency trees for an input sentence, scores them given an induced parsing model, and selects the highest scoring dependency tree as a correct analysis of the input sentence. */ |
Line 6: | Line 6: |
== PDB-based models == | ==== COMBO ==== tba {{{#!wiki comment * '''NEW!''' PDB-based COMBO model compatible with the tagset of Morfeusz 2: [[attachment:180912_PDBCOMBO.pkl]]}}} |
Line 8: | Line 10: |
=== MateParser === | ==== MateParser ==== |
Line 10: | Line 12: |
* '''NEW!''' PDBMate model (compatible with the tagset of Morfeusz 2): [[attachment:180322_PDBMate.mdl]] * PDBMate model (compatible with the tagset of Morfeusz): [[attachment:170608_PDBMate.mdl]] |
* '''NEW!''' PDB-based Mate model compatible with the tagset of Morfeusz 2: [[attachment:180322_PDBMate.mdl]] * PDB-based Mate model compatible with the tagset of Morfeusz: [[attachment:170608_PDBMate.mdl]] |
Line 13: | Line 15: |
=== MaltParser === | ==== MaltParser ==== |
Line 15: | Line 17: |
* '''NEW!''' PDBMalt model (compatible with the tagset of Morfeusz 2): [[attachment:180322_PDBMalt.mco]] * PDBMalt model (compatible with the tagset of Morfeusz): [[attachment:170608_PDBMalt.mco]] |
* '''NEW!''' PDB-based MaltParser model compatible with the tagset of Morfeusz 2: [[attachment:180322_PDBMalt.mco]] * PDB-basd MaltParser model compatible with the tagset of Morfeusz: [[attachment:170608_PDBMalt.mco]] |
Line 18: | Line 20: |
= PDBUD-based dependency parsing models for Polish= | |
Line 20: | Line 21: |
=== UDPipe === * UDPipe model for Polish: [[attachment:180606_PDBUDPipe.udpipe]] |
== PDBUD-based dependency parsing models for Polish == The PDBUD-based models are trained on the current version of [[http://git.nlp.ipipan.waw.pl/alina/PDBUD|Polish Depedency Bank in Universal Dependencies format]] with the publicly available parsing systems – [[http://ufal.mff.cuni.cz/udpipe|UDPipe]] and [[https://github.com/360er0/COMBO|COMBO]]. ==== UDPipe ==== * [[attachment:180606_PDBUDPipe.udpipe|UDPipe]] model for Polish ==== COMBO ==== * [[http://mozart.ipipan.waw.pl/~prybak/model_poleval2018/model_A_semi.pkl|COMBO]] model for Polish |
Line 25: | Line 32: |
{{{#!wiki comment | |
Line 62: | Line 70: |
}}} | |
Line 63: | Line 72: |
== Dependency parser integrated into Multiservice NLP for Polish == | == PDB-based MaltParser in Multiservice == |
Line 68: | Line 77: |
== Publications == | === Publications === |
PDB-based dependency parsing models for Polish
The PDB-based models are trained on the current version of the Polish Dependency Bank with the publicly available parsing systems – COMBO, MateParser and MaltParser.
COMBO
To be announced.
MateParser
NEW! PDB-based Mate model compatible with the tagset of Morfeusz 2: 180322_PDBMate.mdl
PDB-based Mate model compatible with the tagset of Morfeusz: 170608_PDBMate.mdl
MaltParser
NEW! PDB-based MaltParser model compatible with the tagset of Morfeusz 2: 180322_PDBMalt.mco
PDB-based MaltParser model compatible with the tagset of Morfeusz: 170608_PDBMalt.mco
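A downloaded MaltParser model can be applied from the command line. The sketch below only assembles such a command in Python; it is an illustration, not part of the original page. It assumes the standard MaltParser options `-c` (model name), `-i` (input), `-o` (output) and `-m parse`. The jar name `maltparser-1.9.2.jar` and the input and output file names are hypothetical, and the model file is assumed to be in the current working directory.

```python
from pathlib import Path
from typing import List


def build_malt_command(jar: str, model: str, infile: str, outfile: str) -> List[str]:
    """Assemble a MaltParser call that parses `infile` with a trained model."""
    # MaltParser is given the model (configuration) name; drop the .mco suffix.
    config = Path(model).name
    if config.endswith(".mco"):
        config = config[: -len(".mco")]
    return ["java", "-jar", jar, "-c", config, "-i", infile, "-o", outfile, "-m", "parse"]


# Hypothetical jar and file names; only the model name comes from the list above.
cmd = build_malt_command(
    "maltparser-1.9.2.jar", "180322_PDBMalt.mco", "input.conll", "parsed.conll"
)
print(" ".join(cmd))

# To actually run the parser (requires Java and the MaltParser jar):
# import subprocess
# subprocess.run(cmd, check=True)
```

The input file has to be tokenised and tagged with the tagset the model was trained on (Morfeusz 2 for the model used here).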
PDBUD-based dependency parsing models for Polish
The PDBUD-based models are trained on the current version of the Polish Dependency Bank in Universal Dependencies format with the publicly available parsing systems – UDPipe and COMBO.
UDPipe
UDPipe model for Polish
COMBO
COMBO model for Polish
Parsing performance
PDB-based MaltParser in Multiservice
The performance of the MaltParser model for Polish can be tested in Multiservice NLP: http://multiservice.nlp.ipipan.waw.pl.
To parse a Polish text in Multiservice, choose "5: Concraft, DependencyParser" under "Select predefined chain of actions", enter your text, and press the "Run" button.
- To download the parser's output in CoNLL format, "Select output format:":
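Once the output has been downloaded, it can be read with a few lines of Python. The following is a minimal sketch, assuming the file uses the 10-column, tab-separated CoNLL-X layout with blank lines between sentences; the exact layout produced by Multiservice should be checked against a real download. The sample sentence and its tags and labels are made up for illustration and are not actual parser output.

```python
from typing import Dict, Iterator, List, Tuple

# Column names of the CoNLL-X layout (assumed here, not confirmed for Multiservice).
CONLL_COLUMNS = [
    "id", "form", "lemma", "cpostag", "postag",
    "feats", "head", "deprel", "phead", "pdeprel",
]

Token = Dict[str, str]


def read_conll(text: str) -> Iterator[List[Token]]:
    """Yield sentences as lists of tokens; a blank line ends a sentence."""
    sentence: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        sentence.append(dict(zip(CONLL_COLUMNS, line.split("\t"))))
    if sentence:
        yield sentence


def dependency_arcs(sentence: List[Token]) -> List[Tuple[str, str, str]]:
    """Return (head form, relation, dependent form) triples; head 0 is the root."""
    forms = {tok["id"]: tok["form"] for tok in sentence}
    forms["0"] = "ROOT"
    return [(forms[tok["head"]], tok["deprel"], tok["form"]) for tok in sentence]


# Illustrative sample only: tags and labels are hypothetical.
SAMPLE = (
    "1\tAla\tAla\tsubst\tsubst\tsg|nom|f\t2\tsubj\t_\t_\n"
    "2\tma\tmieć\tfin\tfin\tsg|ter|imperf\t0\tpred\t_\t_\n"
    "3\tkota\tkot\tsubst\tsubst\tsg|acc|m2\t2\tobj\t_\t_\n"
    "4\t.\t.\tinterp\tinterp\t_\t2\tpunct\t_\t_\n"
)

sentences = list(read_conll(SAMPLE))
for head, relation, dependent in dependency_arcs(sentences[0]):
    print(f"{head} --{relation}--> {dependent}")
```

To process a downloaded file, pass its contents to `read_conll`, for example `read_conll(open(path, encoding="utf-8").read())`, where `path` is the location of the saved output.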
Publications
Licensing
The dependency parsing models for Polish are released under the CC BY-NC-SA 4.0 licence. By downloading them you accept the conditions of that licence.
Contact
Any questions or comments? Please send them to <alina AT SPAMFREE ipipan DOT waw DOT pl>.