This file provides some basic information concerning POLFIE, an LFG grammar of Polish.

POLFIE is an LFG grammar of Polish implemented in XLE (Xerox Linguistic Environment). It has been developed at the Institute of Computer Science, Polish Academy of Sciences (IPI PAN) within two projects: NEKST and CLARIN-PL. It provides a two-layer representation: constituent structure (c-structure, a tree representation) and functional structure (f-structure, an AVM representation). It builds on two previously implemented grammars of Polish: its c-structure is based on GFJP2, a DCG grammar used by the Świgra parser, while its f-structure is inspired by FOJP, an HPSG grammar of Polish. Lexical entries used by the grammar are created using three resources: Morfeusz, the state-of-the-art morphological analyser for Polish; Walenty, a valence dictionary of Polish; and selected converted valence dictionaries used by Świgra.


This file is divided into the following sections:

LICENCE
CONTACT
AVAILABILITY
NOTE
LIST OF GRAMMAR FILES AND DIRECTORIES
SCRIPTS ACCOMPANYING THE GRAMMAR
MODULES ACCOMPANYING SCRIPTS
OBTAINING PREREQUISITES
PREPARING XLE CONFIG FILE
PREPARING GRAMMAR CONFIG FILE
TOKENIZER SELECTION
RUNNING XLE
RUNNING morf2xle-Walenty.py


#################################################################################################################################################################################################################

LICENCE:

The grammar is made available under the terms of the GNU General Public License (see the file gpl-3.0.txt) unless stated otherwise.

#################################################################################################################################################################################################################

CONTACT:

POLFIE is currently developed and maintained by Agnieszka Patejuk (aep@ipipan.waw.pl).

#################################################################################################################################################################################################################

AVAILABILITY:

For the most current version for local installation, go to http://zil.ipipan.waw.pl/LFG.

POLFIE is also available as a web service (it does not require a local installation of XLE) at http://iness.mozart.ipipan.waw.pl/iness/xle-web:
• choose a grammar from the Grammar menu:
  – POLFIE-Morfeusz2 (stable version)
  – POLFIE-Morfeusz2-OT (version with OT constraints)
  – POLFIE-Morfeusz2-dev (development version)
• write a sentence in the relevant field
• click the "Parse sentence" button

#################################################################################################################################################################################################################

NOTE:

All the instructions provided in this file assume that you use a Linux-type operating system.

Remember to put the path to XLE in morf2xle-Walenty.py (see below: RUNNING morf2xle-Walenty.py).

#################################################################################################################################################################################################################

LIST OF GRAMMAR FILES AND DIRECTORIES:

POLFIE:
common2xle.py
dicts2xle/
frame_maker_lib.py
gpl-3.0.txt
grammar/
morf2xle-Walenty.py
Morfeusz/
morfeusz_nkjp.py
morftagtrans.py
POLFIE.lfg
POLFIE-lex-core-auto.lfg
POLFIE-lex-core-mwe.lfg
POLFIE-lex-core-overlay.lfg
README
realizations_20140608.txt
slowal_20140608_gotowe_tymczasowy_sprawdzone_SPLIT_noORargsL.txt
Walenty/
.xlerc

POLFIE/dicts2xle:
compclasses
dict-val-adjs
dict-val-nouns
gfjp_slowskr.pl
gfjp_slowwyj.pl

POLFIE/grammar:
common.features.lfg
default-parse-tokenizer.fsmfile
POLFIE-features.lfg
POLFIE-morphology.lfg
POLFIE-overlay-rules-FOJP.lfg
POLFIE-overlay-rules-lexsem.lfg
POLFIE-rules-main.lfg
POLFIE-rules-meta-coord.lfg
POLFIE-rules-meta-main.lfg
POLFIE-rules-meta-punct.lfg
POLFIE-rules-NP.lfg
POLFIE-rules-paths.lfg
POLFIE-templates-ann.lfg
POLFIE-templates-aux.lfg
POLFIE-templates-features.lfg
POLFIE-templates-lexicalised.lfg
POLFIE-templates-lexicalised-comp.lfg
POLFIE-templates-lexicalised-conj.lfg
POLFIE-templates-lexicalised-rest.lfg
POLFIE-templates-modpart.lfg
POLFIE-templates-pos.lfg
POLFIE-templates-syntax.lfg
POLFIE-templates-valence.lfg
POLFIE-templates-valence-quasi.lfg

POLFIE/Morfeusz:
morfeusz-SGJP-linux32-20130413.tar.bz2
morfeusz-SGJP-linux64-20130413.tar.bz2

POLFIE/Walenty:
slowal_20140608.tar.gz

#################################################################################################################################################################################################################

SCRIPTS ACCOMPANYING THE GRAMMAR:

-- morf2xle-Walenty.py:
   runs in two modes:
      -- interactive (required: XLE):
         creates dictionary entries on the fly for sentences that you want to parse
      -- batch:
         creates a lexicon for the set of sentences provided for parsing
   – summary of prerequisites:
     – module containing common functions (it has its own prerequisites): common2xle.py
     – module for translating tags returned by Morfeusz: morftagtrans.py

#################################################################################################################################################################################################################

MODULES ACCOMPANYING SCRIPTS:

-- common2xle.py:
   module containing common functions and variables used by morf2xle-Walenty.py
   – summary of prerequisites:
     – converted dictionaries of nouns (dict-val-nouns), adjectives (dict-val-adjs)
     – base dictionary of abbreviations (used by Świgra): gfjp_slowskr.pl
     – dictionary of complementiser classes: compclasses
     – core lexicon: POLFIE-lex-core-auto.lfg

-- frame_maker_lib.py:
   module converting schemata from Walenty valence dictionary to LFG constraints
   – summary of prerequisites:
     – text export of Walenty (modified: slowal_20140608_gotowe_tymczasowy_sprawdzone_SPLIT_noORargsL.txt; original: slowal_20140608.tar.gz)
     – definition of category expansions from Walenty (realizations_20140608.txt)

-- morfeusz_nkjp.py:
   slightly modified version of the Python interface (by Jakub Wilk) to the Morfeusz morphological analyser

-- morftagtrans.py:
   module translating tags returned by Morfeusz to NKJP-compliant tags and eliminating/replacing bad analyses
   – summary of prerequisites:
     – slightly modified python interface to Morfeusz: morfeusz_nkjp.py
     – Świgra patch for filtering Morfeusz results: gfjp_slowwyj.pl

#################################################################################################################################################################################################################

OBTAINING PREREQUISITES:

To use this version of POLFIE, you need to obtain a copy of:
-- XLE (http://www2.parc.com/isl/groups/nltt/xle/)
-- Morfeusz (bundled in the package:
   – 32-bit: POLFIE/Morfeusz/morfeusz-SGJP-linux32-20130413.tar.bz2;
   – 64-bit: POLFIE/Morfeusz/morfeusz-SGJP-linux64-20130413.tar.bz2)
-- Walenty (bundled in the package: POLFIE/Walenty/slowal_20140608.tar.gz)
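
Choosing between the two bundled Morfeusz archives can be sketched in Python as below. This is only an illustration and not part of the POLFIE package: the function name morfeusz_archive is hypothetical, the archive paths are the ones listed above, and the check reports the bitness of the running Python interpreter, which is assumed to match the system.

```python
import platform

# Archive paths as listed in this README.
MORFEUSZ_ARCHIVES = {
    32: "POLFIE/Morfeusz/morfeusz-SGJP-linux32-20130413.tar.bz2",
    64: "POLFIE/Morfeusz/morfeusz-SGJP-linux64-20130413.tar.bz2",
}


def morfeusz_archive(bits=None):
    """Return the bundled Morfeusz archive for a 32- or 64-bit system.

    Hypothetical helper: if bits is not given, it is guessed from the
    bitness of the running Python interpreter (an assumption, see above).
    """
    if bits is None:
        # platform.architecture() returns a pair such as ('64bit', 'ELF').
        bits = 64 if platform.architecture()[0] == "64bit" else 32
    return MORFEUSZ_ARCHIVES[bits]
```

Calling morfeusz_archive() returns the path of the archive to unpack; passing 32 or 64 explicitly overrides the guess.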

#################################################################################################################################################################################################################

PREPARING XLE CONFIG FILE (.xlerc):

see the enclosed example (.xlerc)
for more information, check the XLE documentation (http://www2.parc.com/isl/groups/nltt/xle/doc/xle.html#SEC4)

#################################################################################################################################################################################################################

PREPARING GRAMMAR CONFIG FILE (POLFIE.lfg):

see the enclosed example (POLFIE.lfg)
1. if the name of the lexicon file is not "morfeusz_interactive_dict":
   -- change the name of the lexicon file
   -- or replace "morfeusz_interactive_dict" in the config file with the appropriate filename
2. ensure the character encoding line is there ("  CHARACTERENCODING utf-8.")
3. for more information, check the XLE documentation (http://www2.parc.com/isl/groups/nltt/xle/doc/notations.html#N6 and http://www2.parc.com/isl/groups/nltt/xle/doc/walkthrough.html#W.namedspecs)
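
The two checks above (lexicon file name and character encoding line) can be sketched as a small Python function. This is a rough sketch under stated assumptions, not a tool shipped with POLFIE: the function name check_grammar_config is hypothetical, and it performs a plain substring search, so it will not notice a line that is present but spaced differently or placed in the wrong section of the config file.

```python
def check_grammar_config(text, lexicon="morfeusz_interactive_dict"):
    """Return a list of problems found in the text of a grammar config file.

    Hypothetical helper: it only looks for the two strings mentioned in
    this README and knows nothing about the XLE config syntax.
    """
    problems = []
    if "CHARACTERENCODING utf-8." not in text:
        problems.append("missing 'CHARACTERENCODING utf-8.' line")
    if lexicon not in text:
        problems.append("lexicon file name %r not mentioned" % lexicon)
    return problems

# Possible use (assumes POLFIE.lfg is in the current directory):
#     with open("POLFIE.lfg", encoding="utf-8") as f:
#         print(check_grammar_config(f.read()))
```

An empty list means that both strings were found.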

#################################################################################################################################################################################################################

TOKENIZER SELECTION:

-- the basic one (provided in the XLE distribution; you need to copy XLE/bin/default-parse-tokenizer.fsmfile to the grammar/ directory):
   -- does not handle punctuation haplology, abbreviations or MWEs

*OR*

-- the more sophisticated one (written by Ron Kaplan, provided with the grammar):
   -- does handle punctuation haplology, abbreviations and MWEs

#################################################################################################################################################################################################################

RUNNING XLE:

if you need to install XLE, check the documentation (http://www2.parc.com/isl/groups/nltt/xle/doc/xle.html#SEC1.5)
1. run XLE by typing "xle" (in the directory containing the main grammar config file)
2. to load a grammar whose base config file name is POLFIE.lfg, type:
create-parser POLFIE.lfg
(you may skip this step if you put such a line in your .xlerc file; the .xlerc file provided with the grammar contains this line)
3. to parse the sentence "To zdanie jest proste.", type:
parse {To zdanie jest proste.}
4. to quit XLE, type:
exit
5. for more information, check the XLE documentation (http://www2.parc.com/isl/groups/nltt/xle/doc/xle.html#SEC2); you may also find the XLE Walkthrough very helpful (http://www2.parc.com/isl/groups/nltt/xle/doc/walkthrough.html)
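
For more than one sentence, the commands from steps 2-4 can be generated rather than typed by hand. The Python sketch below only builds the command lines shown above (create-parser, parse {...}, exit); it does not start XLE. The function name xle_commands is hypothetical, and sentences containing curly braces are rejected, on the assumption that they would clash with the braces that delimit the sentence in the parse command.

```python
def xle_commands(sentences, grammar="POLFIE.lfg"):
    """Return the XLE commands for loading a grammar and parsing sentences.

    Hypothetical helper: it produces exactly the commands shown in this
    README, one per list element, in the order they should be typed.
    """
    lines = ["create-parser " + grammar]
    for sentence in sentences:
        if "{" in sentence or "}" in sentence:
            # Assumption: braces inside the sentence would break the
            # {...} delimiters of the parse command.
            raise ValueError("sentence contains a curly brace: %r" % sentence)
        lines.append("parse {" + sentence + "}")
    lines.append("exit")
    return lines


for line in xle_commands(["To zdanie jest proste."]):
    print(line)
```

The printed lines can be typed or pasted at the XLE prompt.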

#################################################################################################################################################################################################################

RUNNING morf2xle-Walenty.py:

ensure that all necessary files are there (and that you have Python, Morfeusz and XLE installed)
1. put the path to XLE as the value of the "SCIEZKA_DO_XLE" variable in morf2xle-Walenty.py (default value: /opt/xle/bin/xle)
2. type on the command line:
python morf2xle-Walenty.py
3. follow the instructions; for interactive use with the lexicon stored in a file named "morfeusz_interactive_dict" (you need to put this file name in the grammar config file below the line "put the name of your output_dictionary below"), type:
      python morf2xle-Walenty.py -i morfeusz_interactive_dict
4. to quit, hit CTRL+D
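
Before starting the script, steps 1 and 3 can be double-checked with a short Python sketch like the one below. It is an illustration only and not part of POLFIE: the names xle_path_ok and morf2xle_command are hypothetical, the default path is the default value of SCIEZKA_DO_XLE quoted above, and the sketch does not read the value actually set in morf2xle-Walenty.py.

```python
import os


def xle_path_ok(path="/opt/xle/bin/xle"):
    """Return True if path points to an existing executable file.

    Hypothetical helper: pass the value you put in SCIEZKA_DO_XLE.
    """
    return os.path.isfile(path) and os.access(path, os.X_OK)


def morf2xle_command(lexicon="morfeusz_interactive_dict", python="python"):
    """Return the interactive-mode command from step 3 as an argument list."""
    return [python, "morf2xle-Walenty.py", "-i", lexicon]


print(xle_path_ok())
print(" ".join(morf2xle_command()))
```

The argument list returned by morf2xle_command can be passed to subprocess.call if you prefer to start the script from Python.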


HAPPY PARSING!