Maciej Ogrodniczuk and Mateusz Kopeć. The Polish Summaries Corpus. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, pages 3712–3715, Reykjavík, Iceland, 2014. ELRA.
Polish Summaries Corpus
This page offers the official release of the corpus of Polish news summaries under the Creative Commons Attribution 3.0 Unported License. Its creation was co-funded by the ATLAS project and by the European Union from the resources of the European Social Fund (Project PO KL "Information technologies: Research and their interdisciplinary applications"). By downloading the corpus data you accept the conditions of that license.
Contact person: Mateusz Kopeć
License: CC BY 3.0
The texts to summarize were extracted from the corpus at http://www.cs.put.poznan.pl/dweiss/research/rzeczpospolita/ and are currently available under the terms stated on that corpus's web page.
Documentation
Description of the corpus (in English).
Downloads
A preliminary version of the corpus is available for download at the following link:
- Polish Summaries Corpus 1.0 (PSC_1.0.zip)
There is a Java API for the corpus:
- The source code is available in the git repository at http://git.nlp.ipipan.waw.pl/summarization/pscapi
- Maven users may add the following dependency:
    <dependency>
      <groupId>pl.waw.ipipan.zil.summ</groupId>
      <artifactId>pscapi</artifactId>
      <version>1.0</version>
    </dependency>
  and repository:
    <repository>
      <id>zil-maven-repo</id>
      <name>ZIL maven repository</name>
      <url>http://maven.nlp.ipipan.waw.pl/content/repositories/releases/</url>
    </repository>
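For readers unsure where these two fragments belong, the following is a minimal sketch of a pom.xml that combines them. The project's own coordinates (com.example, psc-demo, 0.1-SNAPSHOT) are hypothetical placeholders, not part of the corpus distribution. Only the dependency and repository entries come from this page.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates of your own project; replace as needed. -->
  <groupId>com.example</groupId>
  <artifactId>psc-demo</artifactId>
  <version>0.1-SNAPSHOT</version>

  <!-- Repository entry as listed on this page; it goes inside <repositories>. -->
  <repositories>
    <repository>
      <id>zil-maven-repo</id>
      <name>ZIL maven repository</name>
      <url>http://maven.nlp.ipipan.waw.pl/content/repositories/releases/</url>
    </repository>
  </repositories>

  <!-- Dependency entry as listed on this page; it goes inside <dependencies>. -->
  <dependencies>
    <dependency>
      <groupId>pl.waw.ipipan.zil.summ</groupId>
      <artifactId>pscapi</artifactId>
      <version>1.0</version>
    </dependency>
  </dependencies>
</project>
```

The repository entry is needed on the assumption that the pscapi artifact is not published to Maven Central. If it is, the dependency entry alone should be enough.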
Citing
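When using the Polish Summaries Corpus, please cite the following article: Maciej Ogrodniczuk and Mateusz Kopeć, "The Polish Summaries Corpus", Proceedings of LREC 2014, pages 3712–3715.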