Polish TAG Grammar

This is a Tree Adjoining Grammar (TAG) for Polish. A description of the TAG formalism can be found in this paper: http://www.seas.upenn.edu/~joshi/joshi-schabes-tag-97.pdf. The grammar has been extracted automatically from Składnica (http://zil.ipipan.waw.pl/Składnica), a Polish constituency treebank. The extraction procedure was based on the one described in this paper: http://nlp.cs.nyu.edu/nycnlp/autoextract.ps.

Author: Katarzyna Krasnowska
License: GPL v3

The grammar can be used with TuLiPA-pl, a modified version of TuLiPA (https://sourcesup.cru.fr/tulipa/), which is included with the grammar as a Java jar file. More information on the usage of TuLiPA-pl can be found in the README file. The grammar follows the 3-layer design adopted by the authors of TuLiPA (grammar, lexicon, morphology), but provides only the first two layers. The morphology can be either provided by the user or generated by TuLiPA-pl during parsing (using the Morfeusz morphological analyser).

The grammar and lexicon are in XMG (http://wiki.loria.fr/wiki/XMG/Documentation) and LEX2ALL (http://wiki.loria.fr/wiki/LEX2ALL) formats respectively.

Contents of the package

The package pl-TAG contains:

  • grammar/ directory which contains the TAG grammar for Polish:
    • polish.mg - the grammar file in XMG metagrammar format
    • polish.xml - the same grammar in XML format, used by TuLiPA-pl
    • polish-lex - the lexicon file in LEX2ALL format
    • polish-lex.xml - the same lexicon in XML format, used by TuLiPA-pl
  • TuLiPA-pl.jar - a Java jar archive containing the parser
  • README file
  • licence text (GPL v3)
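
As a quick sanity check after unpacking, the listing above can be turned into a small script that reports which of the named files are absent. This is only a sketch: the file names come from the listing, but their placement relative to the package root, the exact name README, and the helper name missing_files are assumptions made for illustration. The licence file is left out because its file name is not given here.

```python
from pathlib import Path

# File names are taken from the package listing; the paths relative to the
# package root are an assumption about how the archive unpacks.
EXPECTED_FILES = [
    "grammar/polish.mg",       # grammar in XMG metagrammar format
    "grammar/polish.xml",      # same grammar in XML, used by TuLiPA-pl
    "grammar/polish-lex",      # lexicon in LEX2ALL format
    "grammar/polish-lex.xml",  # same lexicon in XML, used by TuLiPA-pl
    "TuLiPA-pl.jar",           # the parser
    "README",
]


def missing_files(package_root):
    """Return the expected files that are not present under package_root.

    Hypothetical helper, not part of the package itself.
    """
    root = Path(package_root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling missing_files with the path of the unpacked package returns an empty list when all of the listed files are in place, and the names of the absent ones otherwise.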