Differences between revisions 4 and 22 (spanning 18 versions)

Polish Discourse Corpus / Polski Korpus Metatekstowy

The corpus of discourse relations is based on the Polish Coreference Corpus. The annotation of the corpus was completed using the Discann annotation tool.

Version 0.1

Documentation

The annotation instructions (in Polish) were created by Celina Heliasz.

Download

The corpus is available for download as a compressed archive (corpus.tar.gz) containing:

 * 1773 source XML TEI files of the Polish Coreference Corpus
 * metatext.xml file containing descriptions of all relations
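
To check a downloaded copy against the contents listed above, a minimal Python sketch could look like the following. It is an illustration, not part of the corpus distribution: the helper names `list_archive_members` and `summarize_members` are hypothetical, and because the internal directory layout of the archive is not documented here, the sketch only counts files by extension and looks for a member named metatext.xml. It accepts both zip and tar archives.

```python
import tarfile
import zipfile
from collections import Counter
from pathlib import Path


def list_archive_members(archive_path):
    """Return the names of regular files stored in a zip or tar(.gz) archive."""
    path = str(archive_path)
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return [info.filename for info in zf.infolist() if not info.is_dir()]
    if tarfile.is_tarfile(path):
        with tarfile.open(path) as tf:
            return [member.name for member in tf.getmembers() if member.isfile()]
    raise ValueError(f"{path} is neither a zip nor a tar archive")


def summarize_members(names):
    """Count members by file extension and report whether metatext.xml is present."""
    by_suffix = Counter(Path(name).suffix.lower() for name in names)
    has_metatext = any(Path(name).name == "metatext.xml" for name in names)
    return by_suffix, has_metatext
```

Calling `summarize_members(list_archive_members("corpus.tar.gz"))` on a local copy returns the per-extension counts and a flag for metatext.xml, which can then be compared with the figures given on this page.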

Funding

Version 0.1 of the corpus was financed by the Polish Ministry of Education and Science under the agreement DIR/WK/2016/02.

Version 1.0

Documentation

The annotation instructions (in Polish) were created by Maciej Ogrodniczuk.

Download

The corpus is available for download in the form of a zip file in the Inforex format.

Funding

Version 1.0 of the corpus was financed by the European Regional Development Fund as part of the 2014–2020 Smart Growth Operational Programme, CLARIN: Common Language Resources and Technology Infrastructure, project no. POIR.04.02.00–00C002/19, and by the Polish Ministry of Education and Science grant 2022/WK/09. Funding continued as part of the investment CLARIN ERIC – European Research Infrastructure Consortium: Common Language Resources and Technology Infrastructure (period: 2024–2026), funded by the Polish Ministry of Science and Higher Education (programme: “Support for the participation of Polish scientific teams in international research infrastructure projects”), agreement number 2024/WK/01, and by CLARIN-PL, the European Regional Development Fund, FENG programme, agreement number FENG.02.04-IP.040004/24.

Licence

Creative Commons Attribution 3.0 Unported License

Please cite

List of publications

Maciej Ogrodniczuk, Aleksandra Tomaszewska, Daniel Ziembicki, Sebastian Żurowski, Ryszard Tuora, and Aleksandra Zwierzchowska. Polish Discourse Corpus (PDC): Corpus design, ISO-compliant annotation, data highlights, and parser development. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12829–12835, Torino, Italy, 2024. ELRA and ICCL.

Sebastian Żurowski, Daniel Ziembicki, Aleksandra Tomaszewska, Maciej Ogrodniczuk, and Agata Drozd. Adopting ISO 24617-8 for discourse relations annotation in Polish: Challenges and future directions. In Sara Carvalho, Anas Fahad Khan, Ana Ostroski Anić, Blerina Spahiu, Jorge Gracia, John P. McCrae, Dagmar Gromann, Barbara Heinisch, and Ana Castro Salgado, editors, Proceedings of the 4th Conference on Language, Data and Knowledge, pages 482–492, Vienna, Austria, 2023. NOVA CLUNL, Portugal.

Celina Heliasz and Maciej Ogrodniczuk. Eksplicytność a implicytność w świetle analizy korpusowej (meta)tekstu. Linguistica Copernicana, 16:75–100, 2019.

- Revision 4 as of 2020-12-30 15:19:35
  Size: 916
  Editor: MaciejOgrodniczuk
+ Revision 22 as of 2026-02-28 01:31:30
  Size: 2355
  Editor: MaciejOgrodniczuk
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
+## page was renamed from PolishDiscourseCorpus
-Line 4:
+Line 5:
-The Polish Discourse Corpus is a corpus of discourse relations based on the [[PCC|Polish Coreference Corpus]] as part of the [[http://clip.ipipan.waw.pl/CLARIN-PL-2|CLARIN-PL]] project.
+The corpus of discourse relations is based on the [[PCC|Polish Coreference Corpus]]. The annotation of the corpus was completed using the [[Discann|Discann annotation tool]].
-Line 6:
+Line 7:
-== Documentation ==
+== Version 0.1 ==
-Line 8:
+Line 9:
-Please see the [[attachment:instrukcja-anotacji-metatekstu.pdf|annotation instructions]], in Polish.
+=== Documentation ===

The [[attachment:instrukcja-anotacji-metatekstu.pdf|annotation instructions]] (in Polish) were created by Celina Heliasz.

=== Download ===

The corpus is available for download in the form of a [[attachment:corpus.tar.gz|zip file]] containing:
 * 1773 source XML TEI files of the Polish Coreference Corpus
 * metatext.xml file containing descriptions of all relations

=== Funding ===

Version 0.1 of the corpus was financed by the Polish Ministry of Education and Science under the agreement DIR/WK/2016/02.

== Version 1.0 ==

=== Documentation ===

The [[attachment:anotacja-pdc.pdf|annotation instructions]] (in Polish) were created by Maciej Ogrodniczuk.

=== Download ===

The corpus is available for download in the form of a [[attachment:pdc.zip|zip file]] in the [[https://clarin.biz/tools/inforex|Inforex]] format.

=== Funding ===

Version 1.0 of the corpus was financed by the European Regional Development Fund as part of the 2014–2020 Smart Growth Operational Programme, CLARIN: Common Language Resources and Technology Infrastructure, project no. POIR.04.02.00–00C002/19, and by the Polish Ministry of Education and Science grant 2022/WK/09. Funding continued as part of the investment CLARIN ERIC – European Research Infrastructure Consortium: Common Language Resources and Technology Infrastructure (period: 2024–2026), funded by the Polish Ministry of Science and Higher Education (programme: “Support for the participation of Polish scientific teams in international research infrastructure projects”), agreement number 2024/WK/01, and by CLARIN-PL, the European Regional Development Fund, FENG programme, agreement number FENG.02.04-IP.040004/24.
-Line 16:
+Line 43:
-== Downloads ==
-Line 18:
+Line 44:
-The corpus is available for download in the form of a [[attachment:corpus.tar.gz|zip file]] containing:
 * 1773 source XML TEI files of the Polish Coreference Corpus
 * metatext.xml file containing descriptions of all relations

== Citing ==
Please cite:
<<BibMate(key, "hel:ogr:19:lc", omitYears=true)>>
+== Please cite ==
<<BibMate(key, "ogr:etal:24", "tom:etal:24:iso", "zur:etal:23:ldk", "hel:ogr:19:lc", omitYears=true)>>
