Polish Discourse Corpus / Polski Korpus Metatekstowy

The following corpus of discourse relations is based on the Polish Coreference Corpus. The corpus was annotated using the Discann annotation tool.

Documentation

Please see the annotation instructions (in Polish, by Celina Heliasz).

Licence

Creative Commons Attribution 3.0 Unported License

Download

The corpus is available for download in the form of a zip file containing:

  • 1773 source XML TEI files of the Polish Coreference Corpus
  • metatext.xml file containing descriptions of all relations
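
After downloading, the archive contents can be checked against the description above. The following is only a minimal sketch: the function name summarize_archive is hypothetical, and the internal folder layout of the zip file is not documented on this page, so the code assumes only that source files end in .xml and that the relations file is named metatext.xml somewhere in the archive.

```python
import zipfile
from pathlib import PurePosixPath


def summarize_archive(zip_source):
    """Count source XML members and locate metatext.xml in a zip archive.

    zip_source may be a path or a binary file-like object.
    Assumption: any .xml member not named metatext.xml is a source file.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # Skip directory entries, which end with a slash.
        names = [n for n in archive.namelist() if not n.endswith("/")]

    xml_names = [n for n in names if n.lower().endswith(".xml")]
    metatext = [n for n in xml_names if PurePosixPath(n).name == "metatext.xml"]
    source_files = [n for n in xml_names if n not in metatext]

    return {
        "source_xml_count": len(source_files),
        "metatext_paths": metatext,
    }
```

Calling summarize_archive with the path of the downloaded file returns a small dictionary; if the layout assumption holds, source_xml_count should match the number of source files listed above and metatext_paths should contain exactly one entry.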

Please cite

List of publications

Maciej Ogrodniczuk, Aleksandra Tomaszewska, Daniel Ziembicki, Sebastian Żurowski, Ryszard Tuora, and Aleksandra Zwierzchowska. Polish Discourse Corpus (PDC): Corpus design, ISO-compliant annotation, data highlights, and parser development. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12829–12835, Torino, Italy, 2024. ELRA and ICCL.

Celina Heliasz and Maciej Ogrodniczuk. Eksplicytność a implicytność w świetle analizy korpusowej (meta)tekstu. Linguistica Copernicana, 16:75–100, 2019.