Revision 26 as of 2021-02-08 13:24:40
Polish CDSCorpus
The dataset for compositional distributional semantics
Polish CDSCorpus consists of 10K Polish sentence pairs that are human-annotated for semantic relatedness and entailment. The dataset can be used to evaluate compositional distributional semantics models of Polish. It was presented at ACL 2017. Please refer to Wróblewska and Krasnowska-Kieraś (2017), http://www.aclweb.org/anthology/P/P17/P17-1073.pdf, for a detailed description of the resource.
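As an illustration of how such an evaluation might look, the sketch below scores a model's predictions against gold annotations: Pearson correlation for relatedness and accuracy for entailment. It is only a minimal sketch under stated assumptions. The function names, the label strings and all numbers are hypothetical placeholders, not taken from CDSCorpus, and the actual file layout should be checked in the repository.

```python
import math


def pearson(gold, predicted):
    """Pearson correlation between gold and predicted relatedness scores."""
    n = len(gold)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    mean_g = sum(gold) / n
    mean_p = sum(predicted) / n
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, predicted))
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_g == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_g * var_p)


def accuracy(gold, predicted):
    """Fraction of sentence pairs whose predicted entailment label is correct."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two equally long, non-empty lists")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


if __name__ == "__main__":
    # Made-up placeholder values for illustration only (not CDSCorpus data).
    gold_relatedness = [4.5, 1.0, 3.2, 2.8]
    model_relatedness = [4.0, 1.5, 3.0, 2.0]
    # Label strings are hypothetical; use whatever labels the dataset defines.
    gold_labels = ["ENTAILMENT", "NEUTRAL", "CONTRADICTION", "NEUTRAL"]
    model_labels = ["ENTAILMENT", "NEUTRAL", "NEUTRAL", "NEUTRAL"]

    print("Pearson r:", round(pearson(gold_relatedness, model_relatedness), 3))
    print("Accuracy:", accuracy(gold_labels, model_labels))
```

Both metrics are written in plain Python so the sketch has no dependencies; a statistics library could be substituted for the correlation if one is already in use.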
Dataset
Go to the CDSCorpus repository: http://git.nlp.ipipan.waw.pl/Scwad/SCWAD-CDSCorpus
Publication
Licence
The resource is distributed under the CC BY-NC-SA 4.0 licence (https://creativecommons.org/licenses/by-nc-sa/4.0/).
Contact
To contact Alina Wróblewska, please write to alina <at> ipipan.waw.pl (replace <at> with @).
People
- Alina Wróblewska
- Katarzyna Krasnowska-Kieraś
- Alicja Dziedzic-Rawska
- Bożena Itoya
- Magdalena Król
- Anna Latusek
- Justyna Małek
- Małgorzata Michalik
- Agnieszka Norwa
- Małgorzata Szajbel-Keck
- Alicja Walichnowska
- Konrad Zieliński
- and others
Acknowledgments
The creation of this resource was supported by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre, Poland.