This is the home page of the Polish Sentiment Treebank.

The dataset is a dependency treebank with sentiment annotations. It was parsed using the Polish dependency parser models available from http://zil.ipipan.waw.pl/PolishDependencyParser.

For each sentence in the treebank, the sentiment of each sub-phrase (sub-tree) has been assigned by a linguist. The sentiment of each leaf word has been labelled according to a Polish sentiment dictionary and partially verified manually.

Sentiment labels of both phrases and leaves include three classes: neutral, positive and negative.

The sentiment annotation of each token corresponds to the overall sentiment of the whole phrase it heads, that is, the token itself together with everything under it in the tree. Specifically:

The treebank has been created specifically for the purpose of analysing compositional sentiment effects in the Polish language.
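To make the annotation convention concrete (each token's label covers the whole sub-tree it heads), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Token` class, its field names, the toy sentence and its labels are invented and are not taken from the treebank, and the actual file format of the releases may differ.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int        # 1-based position in the sentence
    form: str       # word form
    head: int       # index of the head token, 0 for the root
    sentiment: str  # "neutral", "positive" or "negative" for the phrase headed here


def subtree(tokens, root_idx):
    """Return the tokens of the phrase headed by root_idx, in sentence order."""
    children = {}
    for t in tokens:
        children.setdefault(t.head, []).append(t.idx)
    covered, stack = set(), [root_idx]
    while stack:
        i = stack.pop()
        covered.add(i)
        stack.extend(children.get(i, []))
    return [t for t in tokens if t.idx in covered]


def phrase(tokens, root_idx):
    """Surface string of the phrase that a token's sentiment label refers to."""
    return " ".join(t.form for t in subtree(tokens, root_idx))


# Invented toy example ("not very good film"); labels are illustrative only.
sentence = [
    Token(1, "niezbyt", 2, "neutral"),   # label covers "niezbyt"
    Token(2, "dobry", 3, "negative"),    # label covers "niezbyt dobry"
    Token(3, "film", 0, "negative"),     # label covers the whole sentence
]

for t in sentence:
    print(f"{t.form}: '{phrase(sentence, t.idx)}' -> {t.sentiment}")
```

In this sketch the label stored on "dobry" describes the phrase "niezbyt dobry" rather than the word alone, which is the kind of compositional effect the treebank is meant to expose.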

Treebank Wydzwieku: version 1.0

In total, the first version of the treebank consists of 6555 sentiment-annotated phrases from the parse trees of 1200 sentences. It is the first freely available corpus with fully labelled parse trees that allows for a complete analysis of the compositional effects of sentiment in the Polish language.

Treebank Wydzwieku: version 2.0

In August 2018, after adding many new sentences, we released version 2.0 of the treebank! It contains the following new parts:

Download

Releases:

Have questions or ideas?

Please contact me: axw at ipipan dot ...