Polish TAG Grammar
This is a Tree Adjoining Grammar (TAG) for Polish. A description of the TAG formalism can be found in this paper: http://www.seas.upenn.edu/~joshi/joshi-schabes-tag-97.pdf. The grammar was extracted automatically from Składnica, a Polish constituency treebank. The extraction procedure is based on the one described in this paper: nlp.cs.nyu.edu/nycnlp/autoextract.ps.
Author: Katarzyna Krasnowska
License: GPL v3
Description
The grammar can be used with TuLiPA-pl, a modified version of TuLiPA (https://sourcesup.cru.fr/tulipa/), which is included with the grammar as a Java jar file. More information on the usage of TuLiPA-pl can be found in the README file. The grammar follows the 3-layer design adopted by the authors of TuLiPA (grammar, lexicon, morphology), but provides only the first two layers. The morphology can be either provided by the user or generated by TuLiPA-pl during parsing (using the Morfeusz morphological analyser).
The grammar and the lexicon are in the XMG (http://wiki.loria.fr/wiki/XMG/Documentation) and LEX2ALL (http://wiki.loria.fr/wiki/LEX2ALL) formats, respectively.
Contents of the package
- grammar/ directory, which contains the TAG grammar for Polish:
  - polish.mg - the grammar file in XMG metagrammar format
  - polish.xml - the same grammar in XML format, used by TuLiPA-pl
  - polish-lex - the lexicon file in LEX2ALL format
  - polish-lex.xml - the same lexicon in XML format, used by TuLiPA-pl
- TuLiPA-pl.jar - a Java jar archive containing the parser
- README file
- licence text (GPL v3)
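As a quick sanity check after unpacking, the listing above can be compared against what is actually on disk. The following is only a minimal sketch, not part of the package: the function name `missing_files` and the constant `EXPECTED_FILES` are hypothetical, and the files are searched for by name anywhere under the package root, so no particular directory layout is assumed. The README and the licence text are left out because their exact file names are not given here.

```python
from pathlib import Path

# File names copied from the package listing; everything else in this
# sketch (function and constant names) is hypothetical.
EXPECTED_FILES = [
    "polish.mg",       # grammar in XMG metagrammar format
    "polish.xml",      # the same grammar in XML, used by TuLiPA-pl
    "polish-lex",      # lexicon in LEX2ALL format
    "polish-lex.xml",  # the same lexicon in XML, used by TuLiPA-pl
    "TuLiPA-pl.jar",   # the parser
]


def missing_files(package_root):
    """Return the expected file names not found anywhere under package_root."""
    root = Path(package_root)
    # Collect the names of all regular files, at any depth.
    present = {p.name for p in root.rglob("*") if p.is_file()}
    return [name for name in EXPECTED_FILES if name not in present]


if __name__ == "__main__":
    # Assumes the script is run from the unpacked package directory.
    for name in missing_files("."):
        print("missing:", name)
```

An empty result means that every file named in the listing was found somewhere under the given directory.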