Diff for "PolishDiscourseCorpus"

Differences between revisions 2 and 16 (spanning 14 versions)
Revision 2 as of 2020-12-18 16:34:33
Size: 1497
Comment:
Revision 16 as of 2024-06-10 13:38:40
Size: 1020
Comment:
Line 1: Line 1:
## page was renamed from PolishDiscourseCorpus
Line 4: Line 5:
This page offers the official release of the corpus of discourse relations created as a part of the [[http://clip.ipipan.waw.pl/CLARIN-PL-2|CLARIN-PL]] project. By downloading the corpus data you accept the conditions of that licence. The following corpus of discourse relations is based on the [[PCC|Polish Coreference Corpus]]. The annotation of the corpus was completed using the [[Discann|Discann annotation tool]].
Line 8: Line 9:
 * [[attachment:PCC_README_EN.pdf|Description of the corpus, in English]]
 * [[attachment:PCC_README_PL.pdf|Description of the corpus, in Polish]]
Please see the [[attachment:instrukcja-anotacji-metatekstu.pdf|annotation instructions]], in Polish (by Celina Heliasz).
Line 17: Line 17:
== Download ==
Line 18: Line 19:
== Downloads ==
The corpus is available for download in the form of a [[attachment:corpus.tar.gz|compressed archive (corpus.tar.gz)]] containing:
 * 1773 source XML TEI files of the Polish Coreference Corpus
 * metatext.xml file containing descriptions of all relations
Line 20: Line 23:
The corpus is available for download in 3 formats:
 * [[attachment:PCC-1.5-MMAX.zip|full corpus in MMAX format]] ([[attachment:example_text_mmax.zip|example text in MMAX format]])
 * [[attachment:PCC-1.5-TEI.zip|full corpus in TEI format]] ([[attachment:example_text_tei.zip|example text in TEI format]])
 * [[attachment:PCC-1.5-BRAT.zip|full corpus in BRAT format]] ([[attachment:example_text_brat.zip|example text in BRAT format]])

== Online version ==

The corpus is available:
 * [[http://cothec.nlp.ipipan.waw.pl/|for browsing]]
 * [[http://pcc.nlp.ipipan.waw.pl/|for search]]

You may also want to see [[PolishCoreferenceTools|Polish Coreference Tools site]].

== Citing ==
When using the Polish Discourse Corpus, please cite:
== Please cite ==
Line 36: Line 25:
<<BibMate(key, "hel:ogr:19:lc", omitYears=true)>>

Polish Discourse Corpus / Polski Korpus Metatekstowy

The following corpus of discourse relations is based on the Polish Coreference Corpus. The annotation of the corpus was completed using the Discann annotation tool.

Documentation

Please see the annotation instructions, in Polish (by Celina Heliasz).

Licence

Creative Commons Attribution 3.0 Unported License

Download

The corpus is available for download in the form of a compressed archive (corpus.tar.gz) containing:

  • 1773 source XML TEI files of the Polish Coreference Corpus
  • metatext.xml file containing descriptions of all relations
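
After downloading, it can be useful to check that the archive matches the description above. The following is a minimal sketch in Python, not an official tool. The function name summarize_archive is hypothetical, and the sketch assumes that the archive is a tar file (as the attachment name corpus.tar.gz suggests), that the relation descriptions sit in a member named metatext.xml, and that every other member ending in .xml is a source TEI file. The actual directory layout inside the archive is not documented on this page, so the counts may need a different filter.

```python
import sys
import tarfile
from pathlib import PurePosixPath


def summarize_archive(archive_path):
    """Return (number of .xml members other than metatext.xml,
    whether a metatext.xml member is present).

    Assumption: every .xml member other than metatext.xml is a source
    TEI file; the real layout of the archive may differ.
    """
    tei_count = 0
    has_metatext = False
    # "r:*" lets tarfile detect the compression (gzip for .tar.gz) itself.
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories and links
            name = PurePosixPath(member.name).name
            if name == "metatext.xml":
                has_metatext = True
            elif name.endswith(".xml"):
                tei_count += 1
    return tei_count, has_metatext


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_corpus.py corpus.tar.gz
    print(summarize_archive(sys.argv[1]))
```

If the layout matches these assumptions, the first value should equal the 1773 TEI files listed above and the second should be True. A different count does not necessarily mean a broken download; it may only mean that the assumed layout is wrong.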

Please cite

List of publications

Celina Heliasz and Maciej Ogrodniczuk. Eksplicytność a implicytność w świetle analizy korpusowej (meta)tekstu. Linguistica Copernicana, 16:75–100, 2019.
