
Diff for "Scwad/CDSCorpus"

Differences between revisions 15 and 20 (spanning 5 versions)
Revision 15 as of 2017-08-09 09:18:42
Size: 1233
Comment:
Revision 20 as of 2017-10-03 08:32:25
Size: 1488
Comment:
Line 7: Line 7:
Polish CDSCorpus consists of 10K Polish sentence pairs which are human-annotated for semantic relatedness and entailment. The dataset may be used for the evaluation of compositional distributional semantics models of Polish. For more details, please refer to the [[http://www.aclweb.org/anthology/P/P17/P17-1073.pdf|paper]] describing the dataset (Wróblewska and Krasnowska-Kieraś, 2017).

== The dataset for compositional distributional semantics ==

Polish CDSCorpus consists of 10K Polish sentence pairs that are human-annotated for semantic relatedness and entailment. The dataset may be used for the evaluation of compositional distributional semantics models of Polish. The dataset was presented at ACL 2017. Please refer to [[http://www.aclweb.org/anthology/P/P17/P17-1073.pdf|Wróblewska and Krasnowska-Kieraś (2017)]] for a detailed description of the resource.
Line 13: Line 15:
Line 15: Line 18:
Alina Wróblewska and Katarzyna Krasnowska-Kieraś (2017) [[http://www.aclweb.org/anthology/P/P17/P17-1073.pdf|Polish evaluation dataset for compositional distributional semantics models]]. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 784–792, DOI: doi.org/10.18653/v1/P17-1073.

<<BibMate(key, "wrob:kier:17", omitYears=true)>>
Line 19: Line 22:

== People ==

 * Alina Wróblewska
 * Katarzyna Krasnowska-Kieraś
 * Alicja Dziedzic-Rawska
 * Bożena Itoya
 * Magdalena Król
 * Anna Latusek
 * Justyna Małek
 * Małgorzata Michalik
 * Agnieszka Norwa
 * Małgorzata Szajbel-Keck
 * Alicja Walichnowska
 * Konrad Zieliński
 * and others

== Acknowledgments ==
The development of the resource was supported by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre, Poland.

Polish CDSCorpus

The dataset for compositional distributional semantics

Polish CDSCorpus consists of 10K Polish sentence pairs that are human-annotated for semantic relatedness and entailment. The dataset may be used for the evaluation of compositional distributional semantics models of Polish. The dataset was presented at ACL 2017. Please refer to Wróblewska and Krasnowska-Kieraś (2017) for a detailed description of the resource.
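As a rough illustration of how such annotations might be used for evaluation, the sketch below scores a model's predictions against gold labels: Pearson correlation for relatedness and accuracy for entailment. It is not taken from the dataset's documentation. The function names, the toy scores, and the label strings are all hypothetical, and the actual file format and label inventory of CDSCorpus should be checked against the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def accuracy(gold, predicted):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two equal-length, non-empty label sequences")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Made-up values for four sentence pairs (NOT taken from CDSCorpus).
gold_relatedness = [4.5, 1.0, 3.2, 0.5]
pred_relatedness = [4.0, 1.5, 3.0, 1.0]

# Illustrative label strings; the corpus may use a different inventory.
gold_entailment = ["ENTAILMENT", "NEUTRAL", "NEUTRAL", "CONTRADICTION"]
pred_entailment = ["ENTAILMENT", "NEUTRAL", "CONTRADICTION", "CONTRADICTION"]

if __name__ == "__main__":
    print("relatedness (Pearson r):", pearson(gold_relatedness, pred_relatedness))
    print("entailment (accuracy):", accuracy(gold_entailment, pred_entailment))
```

Only the standard library is used so that the sketch stays self-contained. For real experiments, an established statistics package would be a more robust choice.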

Download

A sample of CDSCorpus (1K annotated sentence pairs) is available for preview. To obtain the entire CDSCorpus (10K annotated sentence pairs), please contact alina <at> ipipan.waw.pl (replace <at> with @).

Publication

List of publications

Contact

To contact Alina Wróblewska, please write to alina <at> ipipan.waw.pl.

People

  • Alina Wróblewska
  • Katarzyna Krasnowska-Kieraś
  • Alicja Dziedzic-Rawska
  • Bożena Itoya
  • Magdalena Król
  • Anna Latusek
  • Justyna Małek
  • Małgorzata Michalik
  • Agnieszka Norwa
  • Małgorzata Szajbel-Keck
  • Alicja Walichnowska
  • Konrad Zieliński
  • and others

Acknowledgments

The development of the resource was supported by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre, Poland.