Grammatical Lexicon of Polish Economic Phraseology
The Grammatical Lexicon of Polish Economic Phraseology (SEJFEK – Słownik Elektroniczny Jednostek Frazeologicznych z EKonomii) is an electronic lexicon of multi-word nominal terms from Polish economic and financial terminology. It was created within the ERDF Nekst project.
Some aspects of its construction, contents and use have been described in:
GRALIŃSKI, F., SAVARY, A., CZEREPOWICKA, M., MAKOWIECKI, F. (2010): Computational Lexicography of Multi-Word Units: How Efficient Can It Be?, in Proceedings of Multiword Expressions: from Theory to Applications (MWE 2010), Workshop at COLING 2010, Beijing, China, August 28.
SAVARY, A., ZABOROWSKI, B., KRAWCZYK-WIECZOREK, A., MAKOWIECKI, F. (2012): SEJFEK — a Lexicon and a Shallow Grammar of Polish Economic Multi-Word Units, in Proceedings of Cognitive Aspects of the Lexicon (COGALEX-III), a Workshop at COLING 2012, Mumbai, India.
The lexicon contains:
- 11,212 multi-word nominal lexemes (e.g. aktywne ryzyko płynności),
- 146,861 corresponding inflected forms (e.g. aktywnego ryzyka płynności),
- 305 graph-based inflection paradigms.
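As a rough sanity check on these figures, the average number of inflected forms per lexeme can be computed from the two counts above. The comparison with 14 slots (7 cases times 2 numbers in Polish nominal inflection) is only an illustrative assumption: the listed forms may include variants, and some entries may lack singular or plural forms.

```python
# Counts as reported for the lexicon.
LEXEMES = 11_212
INFLECTED_FORMS = 146_861

# Polish nominal inflection distinguishes 7 cases and 2 numbers.
# Using this as the expected paradigm size is an assumption made here
# for comparison only; it is not a figure stated for SEJFEK.
CASES = 7
NUMBERS = 2


def average_forms_per_lexeme(forms: int, lexemes: int) -> float:
    """Return the mean number of inflected forms per lexeme."""
    return forms / lexemes


avg = average_forms_per_lexeme(INFLECTED_FORMS, LEXEMES)
slots = CASES * NUMBERS

print(f"average forms per lexeme: {avg:.1f}")
print(f"case/number slots per noun: {slots}")
```

The average comes out slightly below the number of case/number slots, which is consistent with most entries having a near-complete nominal paradigm.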
See also SEJFEK4Spejd – a shallow grammar for Spejd with fully lexicalized rules automatically generated from SEJFEK lexicon entries.
Authors
- Filip Makowiecki – lexicography
- Aleksandra Krawczyk-Wieczorek – lexicon-grammar conversion
- Agata Savary – automatic inflection and validation
- Bartosz Zaborowski – lexicon-grammar conversion
Tools
The lexicon was created with Toposław, a tool for developing and managing inflectional dictionaries of multi-word units. Toposław integrates:
- Morfeusz SGJP – a morphological analyser and generator of Polish,
- Multiflex – a morpho-syntactic generator of multi-word units,
- a graph editor derived from Unitex.
License
The data are available under the CC BY-SA license.
Available resources
Multiflex-compatible archive containing:
- the list of morphologically annotated lexemes,
- the list of corresponding inflected forms and variants,
- inflection graphs compatible with the Unitex graph editor,
- a list of known problems.
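To illustrate how the list of inflected forms might be consumed, here is a minimal sketch of a line parser. It assumes a DELAF-style layout, `<inflected form>,<lemma>.<tags>`, as used by Unitex-family dictionaries; the actual layout of the SEJFEK archive may differ. The `InflectedForm` type, the `parse_line` function, and the tag string in the sample are hypothetical and serve only as an example.

```python
from typing import NamedTuple


class InflectedForm(NamedTuple):
    """One entry of an inflected-forms list (hypothetical structure)."""
    form: str   # inflected multi-word form
    lemma: str  # base form of the lexeme
    tags: str   # morphological tags, kept as an opaque string


def parse_line(line: str) -> InflectedForm:
    """Parse one line of the assumed layout '<form>,<lemma>.<tags>'.

    Assumption: neither the form nor the lemma contains a comma or a dot.
    The tags may contain dots, so only the first dot after the comma
    is treated as the lemma/tags separator.
    """
    form, rest = line.rstrip("\n").split(",", 1)
    lemma, _, tags = rest.partition(".")
    return InflectedForm(form, lemma, tags)


# The form/lemma pair comes from the examples above; the tag string
# is a made-up placeholder, not taken from the lexicon.
sample = "aktywnego ryzyka płynności,aktywne ryzyko płynności.subst:sg:gen:n"
entry = parse_line(sample)
print(entry.form, "->", entry.lemma, f"[{entry.tags}]")
```

Keeping the tags as an uninterpreted string avoids committing to a particular tagset before the real file format has been checked.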
Future work
Defining an LMF format for the lexicon.