Grammatical Lexicon of Polish Phraseology
The Grammatical Lexicon of Polish Phraseology (SEJF = Słownik Elektroniczny Języka polskiego dla wyrażeń Frazeologicznych) is an electronic lexicon of multi-word units (mainly nominal, adjectival and adverbial compounds) of general (non-terminological) Polish. It was created within the ERDF Nekst project.
Some aspects of its construction, contents and use have been described in:
GRALIŃSKI, F., SAVARY, A., CZEREPOWICKA, M., MAKOWIECKI, F. (2010): Computational Lexicography of Multi-Word Units: How Efficient Can It Be?, in Proceedings of Multiword Expressions: from Theory to Applications (MWE 2010), Workshop at COLING 2010, Beijing, China, August 28.
CZEREPOWICKA, M., KOSEK, I. (2011): Problemy opisu związków frazeologicznych w formalizmie „Multifleks” (na przykładzie rodzaju wyrażeń frazeologicznych), in "Różne formy, różne treści", pp. 117–126, Warszawa 2011.
CZEREPOWICKA, M. (2011): „Toposław” jako narzędzie znakowania jednostek wieloczłonowych, in Matusiak-Kempa, I., Przybyszewski, S. (eds.) Nowe zjawiska w języku, tekście, komunikacji. Kontekst a komunikacja, Olsztyn, pp. 28–35.
The lexicon contains about 5,000 multi-word lexemes, 93,000 corresponding inflected forms, and 160 graph-based inflection paradigms, with the following distribution:
3890 nominal compounds (e.g. bajońskie sumy),
455 adjectival compounds (e.g. prosty jak strzała, wprost proporcjonalny),
609 adverbial compounds (e.g. chcąc nie chcąc),
54 others (e.g. ni z gruszki, ni z pietruszki).
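As a quick sanity check on the distribution above, the following minimal sketch sums the four category counts and computes each category's share of the total. The dictionary keys and variable names are illustrative only and are not part of the lexicon's data format.

```python
# Category counts copied from the distribution listed above.
# The key names are hypothetical labels chosen for this sketch.
counts = {
    "nominal": 3890,
    "adjectival": 455,
    "adverbial": 609,
    "other": 54,
}

# Total number of multi-word lexemes across all categories.
total = sum(counts.values())

# Share of each category, as a percentage rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The counts add up to 5008, which is consistent with the figure of about 5,000 lexemes; nominal compounds make up roughly three quarters of the lexicon.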
Authors
Monika Czerepowicka - lexicography
Agata Savary - automatic inflection and validation
Tools
The lexicon was created with Toposław, a tool for developing and managing inflectional dictionaries of multi-word units. Toposław integrates:
Morfeusz SGJP -- a morphological analyser and generator of Polish,
Multiflex -- a morpho-syntactic generator of multi-word units,
a graph editor derived from Unitex.
License
The data are available under the CC-BY-SA license.