Project factsheet

English name:

Compositional distributional modelling of Polish language semantics

Polish name:

Kompozycyjno-dystrybucyjne modelowanie semantyki języka polskiego

Project type:

The National Science Centre SONATA 8 grant 2014/15/D/HS2/03486


Project duration:

30 September 2015 ‒ 29 September 2018

Principal investigator:

Alina Wróblewska

Project summary

Within the project, basic research will be conducted on compositional distributional semantics as applied to modelling the meaning of phrases and sentences. A compositional distributional semantic model determines the meaning of a phrase or sentence by composing the distributional vectors of its words, and generates a vector representation of this meaning. The degree of similarity between two vectors that belong to the same vector space but represent the meanings of different sentences can be estimated with similarity measures. For Polish, this issue has so far been studied neither by us nor by other members of the natural language processing community in Poland. In these pioneering studies, we will investigate whether compositional distributional semantic models can be estimated for languages with a complex inflectional system and relatively free word order, such as Polish.
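The idea of composing word vectors and comparing the resulting sentence vectors can be illustrated with a minimal sketch. It assumes the simplest composition function, element-wise addition, and cosine similarity as the similarity measure; neither is stated to be the project's actual method. The toy vectors, the `WORD_VECTORS` table, and the function names are hypothetical and serve only as illustration.

```python
import math

# Toy 3-dimensional word vectors. The values are invented for illustration
# and do not come from any trained distributional model.
WORD_VECTORS = {
    "kot": [0.9, 0.1, 0.0],      # "cat"
    "pies": [0.8, 0.2, 0.1],     # "dog"
    "śpi": [0.1, 0.9, 0.0],      # "sleeps"
    "biegnie": [0.0, 0.7, 0.6],  # "runs"
}


def compose_additive(tokens):
    """Compose a sentence vector as the element-wise sum of its word vectors.

    Additive composition is an assumption made for this sketch; it is only
    one of many possible composition functions.
    """
    dims = len(next(iter(WORD_VECTORS.values())))
    sentence = [0.0] * dims
    for token in tokens:
        for i, value in enumerate(WORD_VECTORS[token]):
            sentence[i] += value
    return sentence


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors from the same vector space."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


sent_a = compose_additive(["kot", "śpi"])       # "the cat sleeps"
sent_b = compose_additive(["pies", "śpi"])      # "the dog sleeps"
sent_c = compose_additive(["pies", "biegnie"])  # "the dog runs"

sim_ab = cosine_similarity(sent_a, sent_b)
sim_ac = cosine_similarity(sent_a, sent_c)
print(sim_ab > sim_ac)  # the two "sleeps" sentences are closer to each other
```

Note that additive composition ignores word order: any permutation of the same tokens yields the same sentence vector. This makes it a convenient baseline, but it cannot capture syntactic relations between the words of a sentence.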


  • The morphosyntactic disambiguator "Toygger" won shared task 1(A) in the POLEVAL 2017 competition (Task 1(A) results)



  • Toygger - morphosyntactic disambiguator of Polish (Krasnowska-Kieraś, 2017)