Diff for "SEJFEK"

Differences between revisions 10 and 11
Revision 10 as of 2012-10-16 17:03:09
Size: 2781
Editor: AgataSavary
Comment:
Revision 11 as of 2012-11-15 17:10:42
Size: 3039
Editor: AgataSavary
Comment:

Grammatical Lexicon of Polish Economic Phraseology

The Grammatical Lexicon of Polish Economic Phraseology (SEJFEK, Słownik Elektroniczny Jednostek Frazeologicznych z EKonomii) is an electronic lexicon of multi-word nominal terms from Polish economic and financial terminology. It was created within the ERDF Nekst project.

Some aspects of its construction, contents and use have been described in:

  • GRALIŃSKI, F., SAVARY, A., CZEREPOWICKA, M., MAKOWIECKI, F. (2010): Computational Lexicography of Multi-Word Units: How Efficient Can It Be?, in Proceedings of Multiword Expressions: from Theory to Applications (MWE 2010), Workshop at COLING 2010, Beijing, China, August 28.

  • SAVARY, A., ZABOROWSKI, B., KRAWCZYK-WIECZOREK, A., MAKOWIECKI, F. (2012): SEJFEK — a Lexicon and a Shallow Grammar of Polish Economic Multi-Word Units, in Proceedings of a Workshop on Cognitive Aspects of the Lexicon (COGALEX-III), Mumbai, India.

The lexicon contains:

  • 11,212 multi-word nominal lexemes (e.g. aktywne ryzyko płynności),

  • 146,861 corresponding inflected forms (e.g. aktywnego ryzyka płynności),

  • 305 graph-based inflection paradigms.
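The relation between lexemes and inflected forms can be illustrated with a minimal sketch. It assumes a hypothetical in-memory list of (inflected form, lexeme) pairs built from the two examples above; it does not reproduce the Toposław or Multiflex file formats, and the names `FORMS`, `build_index` and `find_terms` are invented for illustration only.

```python
from collections import defaultdict

# Hypothetical records: (inflected form, base lexeme).
# This is NOT the actual Toposław or Multiflex format; it only reuses
# the two example strings quoted in the list above.
FORMS = [
    ("aktywne ryzyko płynności", "aktywne ryzyko płynności"),
    ("aktywnego ryzyka płynności", "aktywne ryzyko płynności"),
]


def build_index(pairs):
    """Map each lowercased inflected form to the set of lexemes it realises."""
    index = defaultdict(set)
    for form, lexeme in pairs:
        index[form.lower()].add(lexeme)
    return index


def find_terms(text, index, max_len=5):
    """Return (matched form, lexemes) pairs found in whitespace-tokenised text.

    At each token position the longest candidate of up to max_len tokens
    is tried first. Punctuation is not stripped in this sketch.
    """
    tokens = text.lower().split()
    hits = []
    for start in range(len(tokens)):
        longest_end = min(len(tokens), start + max_len)
        for end in range(longest_end, start, -1):
            candidate = " ".join(tokens[start:end])
            if candidate in index:
                hits.append((candidate, sorted(index[candidate])))
                break
    return hits


index = build_index(FORMS)
hits = find_terms("Bank ocenia poziom aktywnego ryzyka płynności", index)
```

With the full list of inflected forms, a lookup of this kind would map any of the listed forms occurring in running text back to its lexeme; handling punctuation and overlapping matches is left out of the sketch.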

See also SEJFEK4Spejd - a shallow grammar for Spejd with fully lexicalized rules automatically generated from SEJFEK lexicon entries.

Authors

  • Filip Makowiecki - lexicography
  • Agata Savary - automatic inflection and validation

Tools

The lexicon was created with Toposław, a tool for developing and managing inflectional dictionaries of multi-word units. Toposław integrates:

  • Morfeusz SGJP -- a morphological analyser and generator of Polish,

  • Multiflex -- a morpho-syntactic generator of multi-word units,

  • a graph editor derived from Unitex.

License

The data are available under the CC BY-SA license.

Available resources

  • Slownik -- the binary source file in Toposław format

  • Multiflex-compatible archive containing:

    • the list of morphologically annotated lexemes,
    • the list of corresponding inflected forms and variants,
    • inflection graphs compatible with the Unitex graph editor,

    • a list of known problems.

Future work

Defining an LMF (Lexical Markup Framework) format for the lexicon.