Polish Valence Dictionary (Walenty)

The Polish Valence Dictionary (Walenty) is an electronic dictionary of subcategorisation frames for Polish verbs and quasi-verbal predicates. Some textual snapshots of the dictionary are made available at the bottom of this page, together with articles describing its format in detail. What follows is an increasingly obsolete overview, based on earlier versions of the dictionary.

The dictionary represents valence as a list of individual frames describing a particular verbal base with a particular aspect (perfective, imperfective, or bi-aspectual, listed as _). The actual argument structure is presented as a set of positions which must be filled by phrases of appropriate types and parameters. Individual positions may be marked for their status as a subject (subj) or a passivisable direct object (obj), and for their role in control relations with other positions in the argument structure (controller and controlee).

The resource has been produced as part of the CESAR project (Central and Southeast European Resources), as well as other projects carried out at ZIL IPI PAN, and is made available on META-SHARE.

See the bottom of this page for the latest versions of the dictionary and for papers describing the format of Walenty in more detail.

Format

The format of the dictionary (devised by the authors listed below) is based on the electronic version of Świdziński's dictionary, but includes a number of significant changes:

Entry structure

The dictionary, in text format, consists of a list of valence frames. Every frame is associated with a lemma, in the following format:

The actual valence information is represented as a list of syntactic positions, expressed within curly braces and separated with plus signs. A position may include more than one type of argument if the arguments can be coordinated within the same position (arguments within a position are separated by semicolons).
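As a rough illustration of the notation just described, the following Python sketch splits a frame string into positions and then into the arguments of each position. The function names and the sample frame are made up for this example and are not taken from the dictionary or its tools. The sketch assumes that plus signs and semicolons act as separators only outside brackets, and it ignores any text before an opening brace.

```python
def split_top_level(text: str, sep: str) -> list[str]:
    """Split text on sep, skipping separators nested inside () or {}."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch in "({":
            depth += 1
        elif ch in ")}":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_positions(frame: str) -> list[list[str]]:
    """Return one list of argument strings per position in the frame."""
    positions = []
    for chunk in split_top_level(frame, "+"):
        # Keep only what lies between the outermost curly braces.
        body = chunk[chunk.index("{") + 1 : chunk.rindex("}")]
        positions.append(split_top_level(body, ";"))
    return positions


# Hypothetical frame, written only to exercise the notation.
sample = "{np(str)} + {np(dat); prepnp(do,gen)}"
print(parse_positions(sample))
```

The depth counter is there so that separators occurring inside a parameter list or a nested position are not mistaken for top-level ones.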

Positions may bear special categories (e.g. subject, passivisable object), listed before the relevant position. The following categories are distinguished:

Where different types of arguments may be coordinated within a single position, certain categories are relevant only to a subset of the listed arguments:

Since the dictionary is syntactic in nature, only the longest possible frames are listed. Shorter frames are subsumed by broader ones, regardless of differences in semantics, unless they differ in control relations or in the presence of the subject.

Types of arguments

The arguments listed in valence frames are categorised into the following types, written as argument(parameter1,parameter2,...):
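The argument(parameter1,parameter2,...) notation can be read mechanically. The sketch below is a minimal Python illustration, not part of any official Walenty tooling: the function name parse_argument and the sample strings are assumptions made for this example, and the code assumes that commas separate parameters only when they are not nested inside further brackets.

```python
def parse_argument(text: str) -> tuple[str, list[str]]:
    """Split 'name(p1,p2,...)' into the name and its top-level parameters."""
    text = text.strip()
    if "(" not in text:
        return text, []          # an argument type without parameters
    name, rest = text.split("(", 1)
    if not rest.endswith(")"):
        raise ValueError(f"unbalanced parentheses in {text!r}")
    inner = rest[:-1]
    if not inner.strip():
        return name.strip(), []
    params, current, depth = [], [], 0
    for ch in inner:
        if ch in "({":
            depth += 1
        elif ch in ")}":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma outside any nested bracket ends the current parameter.
            params.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    params.append("".join(current).strip())
    return name.strip(), params


# Illustrative inputs only.
for example in ["np(str)", "prepnp(do,gen)", "outer(inner(a,b),c)"]:
    print(parse_argument(example))
```

Nested parameters are returned as unparsed strings, so the function can be applied again to any parameter that itself has the argument(...) shape.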

Sources

Authors

The format of the dictionary has been devised by:

Manual editing has been carried out by:

License

The data are available under a CC BY-SA license.

Available resources

The following text versions of the dictionary are available:

The complete dictionary database, including corpus examples illustrating the use of individual frames, may also be viewed through the online application Slowal, which was used during the creation of the dictionary.

Errors

Please report any errors here.

Publications

List of publications

Adam Przepiórkowski, Elżbieta Hajnicz, Agnieszka Patejuk, and Marcin Woliński. Extended phraseological information in a valence dictionary for NLP applications. In Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing (LG-LP 2014), pages 83–91, Dublin, Ireland, 2014. Association for Computational Linguistics and Dublin City University.

Adam Przepiórkowski, Elżbieta Hajnicz, Agnieszka Patejuk, Marcin Woliński, Filip Skwarski, and Marek Świdziński. Walenty: Towards a comprehensive valence dictionary of Polish. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, pages 2785–2792, Reykjavík, Iceland, 2014. European Language Resources Association (ELRA).

Adam Przepiórkowski, Filip Skwarski, Elżbieta Hajnicz, Agnieszka Patejuk, Marek Świdziński, and Marcin Woliński. Modelowanie własności składniowych czasowników w nowym słowniku walencyjnym języka polskiego. Polonica, XXXIII:159–178, 2014.