
Diff for "ZILStart"

Differences between revisions 40 and 256 (spanning 216 versions)
Revision 40 as of 2013-04-09 14:03:01
Size: 8426
Comment:
Revision 256 as of 2025-07-29 18:54:17
Size: 17934
Comment:
Deletions are marked like this. Additions are marked like this.
Line 4: Line 4:
The Linguistic Engineering (LE) Group is part of the [[http://www.ipipan.waw.pl/en/dept/dept-ai.html|Department of Artificial Intelligence]] at the [[http://www.ipipan.waw.pl/en/|Institute of Computer Science]], [[http://www.english.pan.pl/|Polish Academy of Sciences]] (ICS PAS). The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the [[http://www.ipipan.waw.pl/en/|Institute of Computer Science]], [[https://institution.pan.pl/|Polish Academy of Sciences]] (IPI PAN).
Line 8: Line 8:
|| [[http://zil.ipipan.waw.pl/AnnaAndrzejczuk|Anna Andrzejczuk]], PhD || [[mailto:anna.andrzejczuk@ipipan.waw.pl|anna.andrzejczuk@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/LukaszDegorski|Łukasz Degórski]], MSc || [[mailto:ldegorski@ipipan.waw.pl|ldegorski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/ElzbietaHajnicz|Elżbieta Hajnicz]], PhD || [[mailto:elzbieta.hajnicz@ipipan.waw.pl|elzbieta.hajnicz@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/LukaszKobylinski|Łukasz Kobyliński]], PhD || [[mailto:lkobylinski@ipipan.waw.pl|lkobylinski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MateuszKopec|Mateusz Kopeć]], MSc || [[mailto:mateusz.kopec@ipipan.waw.pl|mateusz.kopec@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/KatarzynaKrasnowska|Katarzyna Krasnowska]], MSc || [[mailto:katarzyna.krasnowska@ipipan.waw.pl|katarzyna.krasnowska@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AnnaKupsc|Anna Kupść]], PhD (on leave) || [[mailto:anna.kupsc@ipipan.waw.pl|anna.kupsc@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MalgorzataMarciniak|Małgorzata Marciniak]], PhD || [[mailto:malgorzata.marciniak@ipipan.waw.pl|malgorzata.marciniak@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AgnieszkaMykowiecka|Agnieszka Mykowiecka]], PhD || [[mailto:agnieszka.mykowiecka@ipipan.waw.pl|agnieszka.mykowiecka@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MaciejOgrodniczuk|Maciej Ogrodniczuk]], PhD || [[mailto:maciej.ogrodniczuk@ipipan.waw.pl|maciej.ogrodniczuk@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/Agnieszka Patejuk|Agnieszka Patejuk]], MSc || [[mailto:aep@ipipan.waw.pl|aep@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/JakubPiskorski|Jakub Piskorski]], PhD, Associate || [[mailto:jakub.piskorski@ipipan.waw.pl|jakub.piskorski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AdamPrzepiorkowski|Adam Przepiórkowski]], PhD, Head of the Group || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/DominikaRogozinska|Dominika Rogozińska]] || ||
|| [[http://zil.ipipan.waw.pl/PiotrRychlik|Piotr Rychlik]], PhD || [[mailto:rychlik@ipipan.waw.pl|rychlik@ipipan.waw.pl]] ||
|| [[http://www.cs.albany.edu/~tomek/|Tomek Strzałkowski]], PhD, Foreign Associate || [[mailto:tomek@cs.albany.edu|tomek@cs.albany.edu]] ||
|| [[http://zil.ipipan.waw.pl/JanSzejko|Jan Szejko]] || [[mailto:jan.szejko@ipipan.waw.pl|jan.szejko@ipipan.waw.pl]] ||
|| [[http://www.site.uottawa.ca/~szpak/|Stan Szpakowicz]], PhD, Foreign Associate || [[mailto:szpak@site.uottawa.ca|szpak@site.uottawa.ca]] ||
|| [[http://zil.ipipan.waw.pl/JakubWaszczuk|Jakub Waszczuk]], MSc || [[mailto:jakub.waszczuk@ipipan.waw.pl|jakub.waszczuk@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AleksanderWawer|Aleksander Wawer]], MSc || [[mailto:aleksander.wawer@ipipan.waw.pl|aleksander.wawer@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MarcinWolinski|Marcin Woliński]], PhD || [[mailto:marcin.wolinski@ipipan.waw.pl|marcin.wolinski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/BeataWojtowicz|Beata Wójtowicz]], PhD (part time) || [[mailto:beata.wojtowicz@ipipan.waw.pl|beata.wojtowicz@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AlinaWroblewska|Alina Wróblewska]], MSc || [[mailto:alina.wroblewska@ipipan.waw.pl|alina.wroblewska@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/BartoszZaborowski|Bartosz Zaborowski]], MSc || [[mailto:bartosz.zaborowski@ipipan.waw.pl|bartosz.zaborowski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/SebastianZurowski|Sebastian Żurowski]], PhD (part time) || [[mailto:sebastian.zurowski@ipipan.waw.pl|sebastian.zurowski@ipipan.waw.pl]] ||
=== Core team ===

|| [[http://zil.ipipan.waw.pl/TomaszBartosiak|Tomasz Bartosiak]], MSc || [[mailto:tomasz.bartosiak@gmail.com|tomasz.bartosiak@gmail.com]] ||
|| [[https://www.diegofeinmann.com/|Diego Feinmann]], PhD || [[mailto:diego.feinmann@ipipan.waw.pl|diego.feinmann@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/ElzbietaHajnicz|Elżbieta Hajnicz]], PhD, Assoc. Prof. || [[mailto:elzbieta.hajnicz@ipipan.waw.pl|elzbieta.hajnicz@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/WitoldKieras|Witold Kieraś]], PhD || [[mailto:witold.kieras@ipipan.waw.pl|witold.kieras@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/LukaszKobylinski|Łukasz Kobyliński]], PhD || [[mailto:lkobylinski@ipipan.waw.pl|lukasz.kobylinski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/DorotaKomosi%C5%84ska|Dorota Komosińska]], MSc || [[mailto:dorota.komosinska@gmail.com|dorota.komosinska@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/KatarzynaKrasnowska|Katarzyna Krasnowska-Kieraś]], MSc || [[mailto:katarzyna.krasnowska@ipipan.waw.pl|katarzyna.krasnowska@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MalgorzataMarciniak|Małgorzata Marciniak]], PhD, Assoc. Prof. || [[mailto:malgorzata.marciniak@ipipan.waw.pl|malgorzata.marciniak@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AgnieszkaMykowiecka|Agnieszka Mykowiecka]], PhD, Assoc. Prof. || [[mailto:agnieszka.mykowiecka@ipipan.waw.pl|agnieszka.mykowiecka@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MaciejOgrodniczuk|Maciej Ogrodniczuk]], PhD, Assoc. Prof., Head of the Group || [[mailto:maciej.ogrodniczuk@ipipan.waw.pl|maciej.ogrodniczuk@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AgnieszkaPatejuk|Agnieszka Patejuk]], PhD || [[mailto:aep@ipipan.waw.pl|agnieszka.patejuk@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AdamPrzepiorkowski|Adam Przepiórkowski]], PhD, Full Prof. || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/PiotrPrzybyla|Piotr Przybyła]], PhD (on postdoctoral fellowship at [[https://www.upf.edu/web/erinia|UPF]]) || [[mailto:piotr.przybyla@ipipan.waw.pl|piotr.przybyla@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MichałRudolf|Michał Rudolf]], PhD || [[mailto:michal@rudolf.waw.pl|michal@rudolf.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/PiotrRychlik|Piotr Rychlik]], PhD || [[mailto:rychlik@ipipan.waw.pl|piotr.rychlik@ipipan.waw.pl]] ||
|| [[https://zil.ipipan.waw.pl/KarolinaSaputa|Karolina Saputa]], BEng || [[mailto:karolsaputa@gmail.com|karolsaputa@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/AleksandraTomaszewska|Aleksandra Tomaszewska]], PhD candidate || [[mailto:aleksandra.tomaszewska@ipipan.waw.pl|aleksandra.tomaszewska@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AleksanderWawer|Aleksander Wawer]], PhD || [[mailto:aleksander.wawer@ipipan.waw.pl|aleksander.wawer@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MarcinWolinski|Marcin Woliński]], PhD, Assoc. Prof. || [[mailto:marcin.wolinski@ipipan.waw.pl|marcin.wolinski@ipipan.waw.pl]] ||
|| Joanna Wołoszyn, PhD || [[mailto:joanna.woloszyn@ipipan.waw.pl|joanna.woloszyn@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AlinaWroblewska|Alina Wróblewska]], PhD || [[mailto:alina.wroblewska@ipipan.waw.pl|alina.wroblewska@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/SebastianZawada|Sebastian Zawada]], MSc || [[mailto:sebastian.zawada@ipipan.waw.pl|sebastian.zawada@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/NataliaZawadzka|Natalia Zawadzka-Paluektau]], PhD || [[mailto:natalia.zawadzka-paluektau@ipipan.waw.pl|natalia.zawadzka-paluektau@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/BartoszŻuk|Bartosz Żuk]], PhD candidate || [[mailto:bartoszzuk.poczta@gmail.com|bartoszzuk.poczta@gmail.com]] ||


=== Associates ===

|| [[http://zil.ipipan.waw.pl/AnnaAndrzejczuk|Anna Andrzejczuk]], PhD (on leave) || [[mailto:anna.andrzejczuk@ipipan.waw.pl|anna.andrzejczuk@ipipan.waw.pl]] ||
|| Wiktor Eźlakowski, MSc || [[mailto:wiktor.ezlakowski@ipipan.waw.pl|wiktor.ezlakowski@ipipan.waw.pl]] ||
|| Sonia Janicka || [[mailto:sonia.janicka@gmail.com|sonia.janicka@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/MateuszKlimaszewski|Mateusz Klimaszewski]], MSc || [[mailto:mk.klimaszewski@gmail.com|mk.klimaszewski@gmail.com]] ||
|| Jakub Piskorski, PhD || [[mailto:jpiskorski@gmail.com|jpiskorski@gmail.com]] ||
|| Piotr Rybak, MSc || [[mailto:piotr.cezary.rybak@gmail.com|piotr.cezary.rybak@gmail.com]] ||
|| Jakub Szymanik, PhD || [[mailto:jakub.szymanik@gmail.com|jakub.szymanik@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/RyszardTuora|Ryszard Tuora]], MSc || [[mailto:ryszardtuora@gmail.com|ryszardtuora@gmail.com]] ||
|| Grzegorz Wojdyga, MSc || [[mailto:g.wojdyga@ipipan.waw.pl|g.wojdyga@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/BeataWojtowicz|Beata Wójtowicz]], PhD, Assoc. Prof. || [[mailto:beata.wojtowicz@ipipan.waw.pl|beata.wojtowicz@ipipan.waw.pl]] ||
Line 39: Line 54:
 * (Polish) corpus linguistics; cf. the [[http://korpus.pl/en/|IPI PAN Corpus of Polish]] and the [[http://nkjp.pl/|National Corpus of Polish]],
 * syntactic and semantic parsing of Polish; cf. [[http://zil.ipipan.waw.pl/Spejd/|Spejd]] and [[http://nlp.ipipan.waw.pl/~wolinski/swigra/|Świgra]],
 * extraction of linguistic knowledge from corpora,
 * information extraction,
 * sentiment analysis,
 * morphosyntactic system of Polish,
 * (Polish) corpus linguistics ([[http://nkjp.pl/|National Corpus of Polish]]) /* ; cf. the [[http://korpus.pl/en/|IPI PAN Corpus of Polish]] and the [[http://nkjp.pl/|National Corpus of Polish]], */
 * morphosyntactic tagging and lemmatisation of Polish
 * syntactic and semantic parsing of Polish
 * extraction of linguistic knowledge from corpora
 * information extraction
 * distributional semantics and compositional distributional semantics
 * sentiment analysis
 * credibility assessment of online content
 * reference and discourse relations
Line 47: Line 65:
The Group is a member of [[http://www.clarin.eu/|CLARIN]], [[http://www.flarenet.eu/|FLaReNet]] and [[http://www.meta-net.eu/|META-NET]]. The Group is a member of [[http://www.clarin.eu/|CLARIN]], [[http://dariah.pl/|DARIAH-PL]], [[http://clip.ipipan.waw.pl/ELRC|ELRC]], [[http://www.flarenet.eu/|FLaReNet]] and [[http://www.meta-net.eu/|META-NET]].
Line 51: Line 69:
 * [[CORE|CORE]] (Computer-based methods for coreference resolution in Polish texts),
 * [[NEKST|NEKST]] (An adaptive system to support problem-solving on the basis of document collections in the Internet),
 * [[SYNAT|SYNAT]] (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society).
 * [[http://clip.ipipan.waw.pl/CLARIN-PL-3|CLARIN-PL]] (Polish chapter of [[http://www.clarin.eu/|Common Language Resources and Technology Infrastructure]])
 * [[CORMETAN]] (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)
 * [[http://clip.ipipan.waw.pl/CURLICAT|CURLICAT]] (Curated Multilingual Language Resources for CEF AT)
 * [[http://korpus-dekady.ipipan.waw.pl|Korpus Dekady]] ([[http://dariah.pl/|DARIAH-PL]] — Digital Research Infrastructure for the Arts and Humanities)
 * [[http://clip.ipipan.waw.pl/ELE|ELE]] (European Language Equality)
 * [[http://clip.ipipan.waw.pl/ELG|ELG]] (European Language Grid)
 * [[http://clip.ipipan.waw.pl/ELRC|ELRC]] (European Language Resource Coordination)
 * [[HOMADOS|HOMADOS]] (Hampering Misinformation by Assessing Credibility of Online Sources)
 * [[http://clip.ipipan.waw.pl/KORBA-2|KORBA 2]] (Extension of the "Electronic corpus of 17th and 18th century Polish texts" and its integration with the "Electronic Dictionary of the 17th–18th Century Polish")
 * [[http://zil.ipipan.waw.pl/Quantifiers|Kwantyfikatory w języku: użycie i znaczenie]] (Quantifiers in Language: Use and Meaning)
 * [[http://clip.ipipan.waw.pl/MARCELL|MARCELL]] (Multilingual Resources for CEF.AT in the legal domain)
 * [[http://clip.ipipan.waw.pl/Nexus|Nexus Linguarum]] (European network for Web-centred linguistic data science)
 * [[http://zil.ipipan.waw.pl/Scwad|Scwad]] (Compositional distributional modelling of Polish language semantics)
 * [[http://synamet.uw.edu.pl/|SYNAMET]] (Microcorpus of Synaesthetic Metaphors. Towards a Formal Description and Efficient Methods of Analysis of Metaphors in Discourse)
Line 56: Line 85:

 * [[ATLAS|ATLAS]] (Applied Technology for Language-Aided CMS),
 * [[CESAR|CESAR]] (CEntral and South-east europeAn Resources),
 * [[Construction of a treebank for Polish using automatic syntactic analysis|Construction of a treebank for Polish using automatic syntactic analysis]],
 * [[CLARIN|CLARIN]] (Common Language Resources and Technology Infrastructure),
 * [[NKJP|NKJP]] (National Corpus of Polish),
 * [[Automatic detection of semantic dependencies within verb argument structures in large treebanks|Automatic detection of semantic dependencies within verb argument structures in large treebanks]],
 * [[LUNA|LUNA]] (spoken Language UNderstanding in multilinguAl communication systems) with the Polish support,
 * [[LT4eL|LT4eL]] (Language Technology for eLearning),
 * [[Automatic extraction of linguistic knowledge from a large corpus of Polish|Automatic extraction of linguistic knowledge from a large corpus of Polish]],
 * [[Information Extraction from Polish free text|Information Extraction from Polish free text]],
 * [[IPI PAN Corpus|The IPI PAN Corpus of Polish]],
 * [[Test Suite of Polish Utterances|Treebank / Test Suite of Polish Utterances]],
 * [[HPSG Grammar of Polish|HPSG Grammar of Polish]].
 * [[ATLAS|ATLAS]] (Applied Technology for Language-Aided CMS)
 * [[http://zil.ipipan.waw.pl/Automatic%20detection%20and%20correction%20of%20annotation%20errors%20in%20Polish%20language%20corpora|Automatic detection and correction of annotation errors in Polish language corpora]]
 * [[Automatic detection of semantic dependencies within verb argument structures in large treebanks|Automatic detection of semantic dependencies within verb argument structures in large treebanks]]
 * [[Automatic extraction of linguistic knowledge from a large corpus of Polish|Automatic extraction of linguistic knowledge from a large corpus of Polish]]
 * [[CESAR|CESAR]] (CEntral and South-east europeAn Resources)
 * [[http://zil.ipipan.waw.pl/Chronofleks|Chronofleks]] (A diachronic formal model of Polish inflection and its implementation)
 * [[CLARIN|CLARIN]] (Polish chapter of [[http://www.clarin.eu/|Common Language Resources and Technology Infrastructure]], see also [[http://clip.ipipan.waw.pl/CLARIN-PL-2|CLARIN-PL 2]])
 * [[http://zil.ipipan.waw.pl/CoDeS|CoDeS]] (Compositional distributional semantic models for identification, discrimination and disambiguation of senses in Polish texts)
 * [[Construction of a treebank for Polish using automatic syntactic analysis|Construction of a treebank for Polish using automatic syntactic analysis]]
 * [[CORE|CORE]] (Computer-based methods for coreference resolution in Polish texts)
 * [[http://clip.ipipan.waw.pl/COTHEC|COTHEC]] (Unified theory of coreference in Polish and its corpus-based verification)
 * [[HPSG Grammar of Polish|HPSG Grammar of Polish]]
 * [[Information Extraction from Polish free text|Information Extraction from Polish free text]]
 * [[IPI PAN Corpus|IPI PAN Corpus of Polish]]
 * [[http://clip.ipipan.waw.pl/KORBA|KORBA]] (Electronic corpus of 17th and 18th century Polish texts)
 * [[LT4eL|LT4eL]] (Language Technology for eLearning)
 * [[LUNA|LUNA]] (spoken Language UNderstanding in multilinguAl communication systems) with the Polish support
 * [[NEKST|NEKST]] (An adaptive system to support problem-solving on the basis of document collections in the Internet)
 * [[NKJP|NKJP]] (National Corpus of Polish)
 * [[http://zil.ipipan.waw.pl/OPTA|OPTA]] (Automatic methods for recognising opinion targets and opinion expressions in Polish)
 * [[PARSEME|PARSEME]] (PARSing and Multi-word Expressions. Towards linguistic precision and computational efficiency in natural language processing)
 * [[http://clip.ipipan.waw.pl/Parthenos|Parthenos]] (Pooling Activities, Resources and Tools for Heritage, E-research Networking, Optimization and Synergies)
 * [[http://clip.ipipan.waw.pl/Readability|Readability]] (Measuring the degree of readability of nonliterary Polish texts)
 * [[SYNAT|SYNAT]] (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society)
 * [[Test Suite of Polish Utterances|Treebank / Test Suite of Polish Utterances]]
 * [[http://clip.ipipan.waw.pl/TextLink|TextLink]] (Structuring Discourse in Multilingual Europe)
 * [[http://clip.ipipan.waw.pl/TrendMiner|TrendMiner]] (Large-scale, Cross-lingual Trend Mining and Summarisation of Real-time Media Streams)
Line 73: Line 115:
Here are some of the tools and resources created within our projects. See [[http://clip.ipipan.waw.pl/|CLIP]] pages for a more exhaustive list of Polish tools and resources. Here are some of the tools and resources created within our projects. See [[http://clip.ipipan.waw.pl/|CLIP]] pages for a more exhaustive list of Polish tools and resources, including more tools and resources developed at ZIL IPI PAN.
Line 75: Line 117:
Tools (all open source, under [[http://www.gnu.org/copyleft/gpl.html|GPL]]): Some '''tools''' (all open source, under [[http://www.gnu.org/copyleft/gpl.html|GPL]]; see also [[http://clip.ipipan.waw.pl/|CLIP]]):
Line 77: Line 119:
 * [[http://nlp.ipipan.waw.pl/~wolinski/swigra/|Świgra]] – a DCG parser,  * [[http://morfeusz.sgjp.pl/|Morfeusz 2]] – a morphological analyser of Polish,
Line 79: Line 121:
 * [[http://zil.ipipan.waw.pl/%C5%9Awigra|Świgra]] – a DCG parser,
 * [[https://github.com/360er0/COMBO|COMBO]] – a language-independent system for natural language preprocessing (i.e. morphosyntactic tagging, lemmatisation, dependency parsing and thematic role labelling),
 * [[http://zil.ipipan.waw.pl/Concraft|Concraft]] – a CRF morphosyntactic tagger of Polish compatible with Morfeusz SGJP,
 * [[http://zil.ipipan.waw.pl/PANTERA|PANTERA]] – a morphosyntactic tagger for Polish,
Line 80: Line 126:
 * [[http://zil.ipipan.waw.pl/PANTERA|PANTERA]] – a morphosyntactic tagger for Polish,
Line 82: Line 127:
 * [[https://sourceforge.net/projects/poliqarp2/|Poliqarp2]] – a new generation corpus indexing and search engine,
Line 83: Line 129:
 * [[http://zil.ipipan.waw.pl/Anotatornia/|Anotatornia]] – a system for multi-level manual annotation of corpora (forthcoming),
 * [[http://zil.ipipan.waw.pl/WSDDE|WSDDE]] – a system for designing and performing Word Sense Disambiguation experiments (forthcoming),
 * [[http://nlp.ipipan.waw.pl/PPJP/|etc.]]
 * [[http://zil.ipipan.waw.pl/Anotatornia2/|Anotatornia 2]] – an annotation tool geared towards historical corpora,
 * [[http://zil.ipipan.waw.pl/WSDDE|WSDDE]] – a system for designing and performing Word Sense Disambiguation experiments,
 * [[http://multiservice.nlp.ipipan.waw.pl/|Multiservice]] – a web service for several of our tools,
 * [[http://zil.ipipan.waw.pl/TermoPL|TermoPL]] – multiword term extraction from text,
 * [[http://dsmodels.nlp.ipipan.waw.pl/sim1.html|DSmodels]] – a web service for calculating word similarity using Polish word embeddings.
Line 88: Line 136:
Resources:
Line 90: Line 137:
Main '''resources''' (many more at [[http://clip.ipipan.waw.pl/|CLIP]]):

 * [[http://walenty.ipipan.waw.pl/|Walenty]] – a valence dictionary of Polish (described [[http://zil.ipipan.waw.pl/Walenty|here]]),
Line 91: Line 141:
 * [[http://zil.ipipan.waw.pl/DistrNKJP/|DistrNKJP]] – a distributable (IPR-free) subcorpus of National Corpus of Polish,
 * [[http://korpus.pl/|IPI PAN Corpus of Polish]] (obsolete).
 * [[http://zil.ipipan.waw.pl/CoDeS|Polish word embeddings based on NKJP and Wikipedia]],
 * Polish dependency banks: [[http://zil.ipipan.waw.pl/PDB|PDB]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/PDB-UD_current|PDB-UD]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/PUD-PL_current|PUD-PL]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/NKJP1M-UD_current|NKJP1M-UD]],
 * [[http://zil.ipipan.waw.pl/PDB/PDBparser|Dependency parsing models for Polish]].
Line 100: Line 151:
 * [[http://nlp.ipipan.waw.pl/seminar-e.html|NLP Seminar at IPI PAN]];
 * [[http://iis.ipipan.waw.pl/|Intelligent Information Systems]] series of conferences.
 * [[http://jlm.ipipan.waw.pl/|Journal of Language Modelling]]
 * [[http://zil.ipipan.waw.pl/seminar|NLP Seminar at IPI PAN]]
 * [[http://poleval.pl/|PolEval]], the evaluation campaign for natural language processing tools for Polish
 * conferences organised by the Group:
  * [[http://iis.ipipan.waw.pl/|Intelligent Information Systems]] series of conferences
  * [[http://poltal.ipipan.waw.pl/|9th International Conference on Natural Language Processing]] (PolTAL 2014), 17–19 September 2014, Warsaw, Poland
  * [[http://tlt14.ipipan.waw.pl/|14th International Workshop on Treebanks and Linguistic Theories]] (TLT14), 11–12 December 2015, Warsaw, Poland
  * [[http://corbon.nlp.ipipan.waw.pl/2016/|Coreference Resolution Beyond OntoNotes]] (CORBON 2016) workshop at [[http://naacl.org/naacl-hlt-2016/|NAACL 2016]] (The 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies), 16 June 2016, San Diego, US
  * [[http://headlex16.ipipan.waw.pl/|Joint 2016 Conference on Head-driven Phrase Structure Grammar and Lexical Functional Grammar]] (!HeadLex16), 24–29 July 2016, Warsaw, Poland
  * [[http://corbon.nlp.ipipan.waw.pl/|2nd Workshop on Coreference Resolution Beyond OntoNotes]] (CORBON 2017) at [[http://eacl2017.org/|EACL 2017]] (The 15th Conference of the European Chapter of the Association for Computational Linguistics), 4 April 2017, Valencia, Spain
  * [[http://anawiki.essex.ac.uk/dali/crac18/|Computational Models of Reference, Anaphora, and Coreference]] workshop (CRAC) at [[http://naacl2018.org/|NAACL 2018]] (The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies), 6 June 2018, New Orleans, USA
  * [[https://nlpday.pl/|AI & NLP Workshop Day]], 19 October 2018, Warsaw
  * [[https://sites.google.com/view/crac2019/|Second Workshop on Computational Models of Reference, Anaphora and Coreference]] (CRAC 2019), 6 or 7 June 2019, Minneapolis
  * [[http://www.dynamicsoflanguage.edu.au/lfg-2019/|The 24th International Lexical-Functional Grammar Conference]] (LFG19), 8–10 July 2019, Canberra
  * [[https://lfg20.w.uib.no/|The 25th International Lexical-Functional Grammar Conference]] (LFG20), 23–25 June 2020, online
  * [[https://typo.uni-konstanz.de/lfg2021/|The 26th International Lexical-Functional Grammar Conference]] (LFG21), 13–15 July 2021, online


== Selected publications ==

<<BibMate(author, "Andrzejczuk", "Bartosiak", "Gawłowicz", "Hajnicz", "Kaczyński", "Kieraś", "Klimaszewski", "Kobyliński", "Krasnowska", "Marciniak", "Mykowiecka", "Nitoń", "Ogrodniczuk", "Patejuk", "Przepiórkowski", "Przybyła", "Rychlik", "Wawer", "Wojdyga", "Wołoszyn", "Woliński", "Wójtowicz", "Wróblewska", "Bolc")>>

The Linguistic Engineering Group

The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the Institute of Computer Science, Polish Academy of Sciences (IPI PAN).

People

Core team

|| Tomasz Bartosiak, MSc || tomasz.bartosiak@gmail.com ||
|| Diego Feinmann, PhD || diego.feinmann@ipipan.waw.pl ||
|| Elżbieta Hajnicz, PhD, Assoc. Prof. || elzbieta.hajnicz@ipipan.waw.pl ||
|| Witold Kieraś, PhD || witold.kieras@ipipan.waw.pl ||
|| Łukasz Kobyliński, PhD || lukasz.kobylinski@ipipan.waw.pl ||
|| Dorota Komosińska, MSc || dorota.komosinska@gmail.com ||
|| Katarzyna Krasnowska-Kieraś, MSc || katarzyna.krasnowska@ipipan.waw.pl ||
|| Małgorzata Marciniak, PhD, Assoc. Prof. || malgorzata.marciniak@ipipan.waw.pl ||
|| Agnieszka Mykowiecka, PhD, Assoc. Prof. || agnieszka.mykowiecka@ipipan.waw.pl ||
|| Maciej Ogrodniczuk, PhD, Assoc. Prof., Head of the Group || maciej.ogrodniczuk@ipipan.waw.pl ||
|| Agnieszka Patejuk, PhD || agnieszka.patejuk@ipipan.waw.pl ||
|| Adam Przepiórkowski, PhD, Full Prof. || adam.przepiorkowski@ipipan.waw.pl ||
|| Piotr Przybyła, PhD (on postdoctoral fellowship at UPF) || piotr.przybyla@ipipan.waw.pl ||
|| Michał Rudolf, PhD || michal@rudolf.waw.pl ||
|| Piotr Rychlik, PhD || piotr.rychlik@ipipan.waw.pl ||
|| Karolina Saputa, BEng || karolsaputa@gmail.com ||
|| Aleksandra Tomaszewska, PhD candidate || aleksandra.tomaszewska@ipipan.waw.pl ||
|| Aleksander Wawer, PhD || aleksander.wawer@ipipan.waw.pl ||
|| Marcin Woliński, PhD, Assoc. Prof. || marcin.wolinski@ipipan.waw.pl ||
|| Joanna Wołoszyn, PhD || joanna.woloszyn@ipipan.waw.pl ||
|| Alina Wróblewska, PhD || alina.wroblewska@ipipan.waw.pl ||
|| Sebastian Zawada, MSc || sebastian.zawada@ipipan.waw.pl ||
|| Natalia Zawadzka-Paluektau, PhD || natalia.zawadzka-paluektau@ipipan.waw.pl ||
|| Bartosz Żuk, PhD candidate || bartoszzuk.poczta@gmail.com ||

Associates

|| Anna Andrzejczuk, PhD (on leave) || anna.andrzejczuk@ipipan.waw.pl ||
|| Wiktor Eźlakowski, MSc || wiktor.ezlakowski@ipipan.waw.pl ||
|| Sonia Janicka || sonia.janicka@gmail.com ||
|| Mateusz Klimaszewski, MSc || mk.klimaszewski@gmail.com ||
|| Jakub Piskorski, PhD || jpiskorski@gmail.com ||
|| Piotr Rybak, MSc || piotr.cezary.rybak@gmail.com ||
|| Jakub Szymanik, PhD || jakub.szymanik@gmail.com ||
|| Ryszard Tuora, MSc || ryszardtuora@gmail.com ||
|| Grzegorz Wojdyga, MSc || g.wojdyga@ipipan.waw.pl ||
|| Beata Wójtowicz, PhD, Assoc. Prof. || beata.wojtowicz@ipipan.waw.pl ||

Research

The main research areas of the Group

  • (Polish) corpus linguistics (National Corpus of Polish)

  • morphosyntactic tagging and lemmatisation of Polish
  • syntactic and semantic parsing of Polish
  • extraction of linguistic knowledge from corpora
  • information extraction
  • distributional semantics and compositional distributional semantics
  • sentiment analysis
  • credibility assessment of online content
  • reference and discourse relations
  • generative linguistic formalisms, esp., HPSG and LFG.

The Group is a member of CLARIN, DARIAH-PL, ELRC, FLaReNet and META-NET.

Current externally funded projects

  • CLARIN-PL (Polish chapter of Common Language Resources and Technology Infrastructure)

  • CORMETAN (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)

  • CURLICAT (Curated Multilingual Language Resources for CEF AT)

  • Korpus Dekady (DARIAH-PL — Digital Research Infrastructure for the Arts and Humanities)

  • ELE (European Language Equality)

  • ELG (European Language Grid)

  • ELRC (European Language Resource Coordination)

  • HOMADOS (Hampering Misinformation by Assessing Credibility of Online Sources)

  • KORBA 2 (Extension of the "Electronic corpus of 17th and 18th century Polish texts" and its integration with the "Electronic Dictionary of the 17th–18th Century Polish")

  • Kwantyfikatory w języku: użycie i znaczenie (Quantifiers in Language: Use and Meaning)

  • MARCELL (Multilingual Resources for CEF.AT in the legal domain)

  • Nexus Linguarum (European network for Web-centred linguistic data science)

  • Scwad (Compositional distributional modelling of Polish language semantics)

  • SYNAMET (Microcorpus of Synaesthetic Metaphors. Towards a Formal Description and Efficient Methods of Analysis of Metaphors in Discourse)

Some of our past projects

Publicly available tools and resources

Here are some of the tools and resources created within our projects. See CLIP pages for a more exhaustive list of Polish tools and resources, including more tools and resources developed at ZIL IPI PAN.

Some tools (all open source, under GPL; see also CLIP):

  • Morfeusz 2 – a morphological analyser of Polish,

  • Spejd – a shallow parsing and disambiguation system,

  • Świgra – a DCG parser,

  • COMBO – a language-independent system for natural language preprocessing (i.e. morphosyntactic tagging, lemmatisation, dependency parsing and thematic role labelling),

  • Concraft – a CRF morphosyntactic tagger of Polish compatible with Morfeusz SGJP,

  • PANTERA – a morphosyntactic tagger for Polish,

  • TaKIPI – a morphosyntactic tagger for Polish,

  • Poliqarp – a corpus indexing and search engine,

  • Poliqarp2 – a new generation corpus indexing and search engine,

  • Dendrarium – a treebank development system (under development),

  • Anotatornia 2 – an annotation tool geared towards historical corpora,

  • WSDDE – a system for designing and performing Word Sense Disambiguation experiments,

  • Multiservice – a web service for several of our tools,

  • TermoPL – multiword term extraction from text,

  • DSmodels – a web service for calculating word similarity using Polish word embeddings.

Main resources (many more at CLIP):

Other activities

Links to some other activities of the Group:

Selected publications

List of publications

2025

Witold Kieraś, Małgorzata Marciniak, Marek Łaziński, Marcin Woliński, Krystyna Bojałkowska, Wiktor Eźlakowski, Łukasz Kobyliński, Dorota Komosińska, Katarzyna Krasnowska-Kieraś, Michał Rudolf, Aleksandra Tomaszewska, Joanna Wołoszyn, and Natalia Zawadzka-Paluektau. Korpus Współczesnego Języka Polskiego. Dekada 2011–2020. Język Polski, 105(2):5–20, 2025.

Aleksandra Tomaszewska, Dariusz Czerski, Bartosz Żuk, and Maciej Ogrodniczuk. NeoN: A tool for automated detection, linguistic and LLM-driven analysis of neologisms in Polish. In Michael H. Lees, Wentong Cai, Siew Ann Cheong, Yi Su, David Abramson, Jack J. Dongarra, and Peter M. A. Sloot, editors, Computational Science – ICCS 2025, pages 318–326, Cham, 2025. Springer Nature Switzerland.

Aleksandra Tomaszewska and Maciej Ogrodniczuk. Corpus studies in 2024: Emerging trends and applications. In David Bradley, Katarzyna Dziubalska-Kołaczyk, Camiel Hamans, Ik-Hwan Lee, and Frieda Steurs, editors, Contemporary Linguistics: Integrating Languages, Communities, and Technologies, pages 467–477. Brill, Leiden, The Netherlands, 2025.

Alina Wróblewska, Martyna Lewandowska, Aleksandra Tomaszewska, Karol Saputa, and Maciej Ogrodniczuk. Koncepcja form równościowych z asteryskiem inkluzywnym. Język Polski, 105(2):97–117, 2025.

2024

Tomaž Erjavec, Matyáš Kopp, Nikola Ljubešić, Taja Kuzman, Paul Rayson, Petya Osenova, Maciej Ogrodniczuk, Çağrı Çöltekin, Danijel Koržinek, Katja Meden, Jure Skubic, Peter Rupnik, Tommaso Agnoloni, José Aires, Starkaður Barkarson, Roberto Bartolini, Núria Bel, María Calzada Pérez, Roberts Dargis, Sascha Diwersy, Maria Gavriilidou, Ruben van Heusden, Mikel Iruskieta, Neeme Kahusk, Anna Kryvenko, Noémi Ligeti-Nagy, Carmen Magariños, Martin Mölder, Costanza Navarretta, Kiril Simov, Lars Magne Tungland, Jouni Tuominen, John Vidler, Adina Ioana Vladu, Tanja Wissik, Väinö Yrjänäinen, and Darja Fišer. ParlaMint II: Advancing comparable parliamentary corpora across Europe. Language Resources and Evaluation, 59:2071–2102, 2024.

Katarzyna Krasnowska-Kieraś and Marcin Woliński. Parsing headed constituencies. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12633–12643, Torino, Italy, 2024. ELRA and ICCL.

Maciej Ogrodniczuk and Łukasz Kobyliński, editors. Proceedings of the PolEval 2024 Workshop, Warsaw, 2024. Institute of Computer Science, Polish Academy of Sciences.

Maciej Ogrodniczuk, Anna Nedoluzhko, Massimo Poesio, Sameer Pradhan, and Vincent Ng, editors. Proceedings of The Seventh Workshop on Computational Models of Reference, Anaphora and Coreference, Miami, 2024. Association for Computational Linguistics.

Adam Przepiórkowski, Magdalena Borysiak, Adam Okrasiński, Bartosz Pobożniak, Wojciech Stempniak, Kamil Tomaszek, and Adam Głowacki. Symmetric dependency structure of coordination: Crosslinguistic arguments from dependency length minimization. In Daniel Dakota, Sarah Jablotschkin, Sandra Kübler, and Heike Zinsmeister, editors, Proceedings of the 22nd Workshop on Treebanks and Linguistic Theories (TLT 2024), pages 11–22, Hamburg, Germany, 2024. Association for Computational Linguistics.

Adam Przepiórkowski, Katarzyna Kuś, Agnieszka Patejuk, and Berke Şenşekerci. You can depend on the symmetry of coordination and that NPs and CPs can be conjoined. Presentation delivered on 5 July 2024 at the “Form and Meaning of Coordination” workshop in Göttingen, Germany (https://www.uni-goettingen.de/de/685553.html), 2024.

Adam Przepiórkowski. Case. In Stefan Müller, Anne Abeillé, Robert D. Borsley, and Jean-Pierre Koenig, editors, Head-Driven Phrase Structure Grammar: The Handbook, pages 261–294. Language Science Press, Berlin, 2nd edition, 2024.

Piotr Rybak, Piotr Przybyła, and Maciej Ogrodniczuk. PolQA: Polish question answering dataset. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12846–12855, Torino, Italy, 2024. ELRA and ICCL.

Karol Saputa, Angelika Peljak-Łapińska, and Maciej Ogrodniczuk. Polish Coreference Corpus as an LLM testbed: Evaluating coreference resolution within instruction-following language models by instruction–answer alignment. In Maciej Ogrodniczuk, Anna Nedoluzhko, Massimo Poesio, Sameer Pradhan, and Vincent Ng, editors, Proceedings of The Seventh Workshop on Computational Models of Reference, Anaphora and Coreference, pages 23–32, Miami, 2024. Association for Computational Linguistics.

Alina Wróblewska. Investigating large language models for their competence in extracting grammatically sound sentences from transcribed noisy utterances. In Libby Barak and Malihe Alikhani, editors, Proceedings of the 28th Conference on Computational Natural Language Learning, pages 10–23, Miami, FL, 2024. Association for Computational Linguistics.

2023

Łukasz Kobyliński, Maciej Ogrodniczuk, Piotr Rybak, Piotr Przybyła, Piotr Pęzik, Agnieszka Mikołajczyk, Wojciech Janowski, Michał Marcińczuk, and Aleksander Smywiński-Pohl. PolEval 2022/23 challenge tasks and results. In Maria Ganzha, Leszek Maciaszek, Marcin Paprzycki, and Dominik Ślęzak, editors, Proceedings of the 18th Conference on Computer Science and Intelligence Systems, volume 35 of Annals of Computer Science and Information Systems, pages 1237–1244, 2023.

Katarzyna Krasnowska-Kieraś and Marcin Woliński. Constituency parsing with spines and attachments. In Jiří Mikyška, Clélia de Mulatier, Maciej Paszynski, Valeria V. Krzhizhanovskaya, Jack J. Dongarra, and Peter M.A. Sloot, editors, Computational Science – ICCS 2023. 23rd International Conference, Prague, Czech Republic, July 3–5, 2023, Proceedings, Part I, number 14073 in Lecture Notes in Computer Science, pages 191–205, Cham, 2023. Springer Nature Switzerland.

Maciej Ogrodniczuk, editor. Analiza danych parlamentarnych. Warsztat pokonkursowy, Warsaw, 2023. Institute of Computer Science, Polish Academy of Sciences.

Maciej Ogrodniczuk, Piotr Pęzik, Marek Łaziński, and Marcin Miłkowski. Language Report Polish. In Georg Rehm and Andy Way, editors, European Language Equality: A Strategic Agenda for Digital Language Equality, pages 191–194. Springer International Publishing, Cham, 2023.

Adam Przepiórkowski and Michał Woźniak. Conjunct lengths in English, Dependency Length Minimization, and dependency structure of coordination. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15494–15512, Toronto, Canada, 2023. Association for Computational Linguistics.

Karol Saputa, Aleksandra Tomaszewska, Natalia Zawadzka-Paluektau, Witold Kieraś, and Łukasz Kobyliński. Korpusomat.eu: A multilingual platform for building and analysing linguistic corpora. In Jiří Mikyška, Clélia de Mulatier, Maciej Paszynski, Valeria V. Krzhizhanovskaya, Jack J. Dongarra, and Peter M.A. Sloot, editors, Computational Science – ICCS 2023. 23rd International Conference, Prague, Czech Republic, July 3–5, 2023, Proceedings, Part II, number 14074 in Lecture Notes in Computer Science, pages 230–237, Cham, 2023. Springer Nature Switzerland.

Marcin Woliński, Alina Wróblewska, Małgorzata Marciniak, Katarzyna Krasnowska-Kieraś, and Wiktor Eźlakowski. O konstrukcji …, ale nie… i podobnych w języku polskim. Język Polski, CIII(4):5–21, 2023.

Joanna Wołoszyn, Witold Kieraś, and Marcin Woliński. Sieć powiązań derywacyjnych na materiale Słownika gramatycznego języka polskiego: Propozycja klasyfikacji. LingVaria, 18(2):47–61, 2023.

Sebastian Żurowski, Daniel Ziembicki, Aleksandra Tomaszewska, Maciej Ogrodniczuk, and Agata Drozd. Adopting ISO 24617-8 for discourse relations annotation in Polish: Challenges and future directions. In Sara Carvalho, Anas Fahad Khan, Ana Ostroski Anić, Blerina Spahiu, Jorge Gracia, John P. McCrae, Dagmar Gromann, Barbara Heinisch, and Ana Castro Salgado, editors, Proceedings of the 4th Conference on Language, Data and Knowledge, pages 482–492, Vienna, Austria, 2023. NOVA CLUNL, Portugal.

2022

Elżbieta Hajnicz. Annotation of metaphorical expressions in the Basic Corpus of Polish Metaphors. In Proceedings of the Language Resources and Evaluation Conference, pages 5648–5653, Marseille, France, 2022. European Language Resources Association.

Maciej Ogrodniczuk, Sameer Pradhan, Anna Nedoluzhko, Vincent Ng, and Massimo Poesio, editors. Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference, Gyeongju, Republic of Korea, 2022. Association for Computational Linguistics.

Adam Przepiórkowski. Polyadic cover quantification in heterofunctional coordination. In Daniel Gutzmann and Sophie Repp, editors, Proceedings of Sinn und Bedeutung 26, pages 677–696, 2022.

Marcin Woliński, Bartłomiej Nitoń, Witold Kieraś, and Jakub Szymanik. HerBERT based language model detects quantifiers and their semantic properties in Polish. In Proceedings of the Language Resources and Evaluation Conference, pages 7140–7146, Marseille, France, 2022. European Language Resources Association.

2021

Maciej Ogrodniczuk and Łukasz Kobyliński, editors. Proceedings of the PolEval 2021 Workshop, Warsaw, 2021. Institute of Computer Science, Polish Academy of Sciences.

Maciej Ogrodniczuk and Piotr Przybyła. PolEval 2021 Task 4: Question Answering Challenge. In Maciej Ogrodniczuk and Łukasz Kobyliński, editors, Proceedings of the PolEval 2021 Workshop, pages 123–136, Warsaw, 2021. Institute of Computer Science, Polish Academy of Sciences.

Adam Przepiórkowski. Frazemowość narzędnikowych form liczebnikowych na -u. Język Polski, CI(3):5–15, 2021.

2020

Mary Dalrymple, Agnieszka Patejuk, and Mark-Matthias Zymla. XLE+Glue – A new tool for integrating semantic analysis in XLE. In Miriam Butt and Tracy Holloway King, editors, The Proceedings of the LFG'20 Conference, pages 89–108, Stanford, CA, 2020. CSLI Publications.

Maciej Ogrodniczuk and Łukasz Kobyliński, editors. Proceedings of the PolEval 2020 Workshop, Warsaw, 2020. Institute of Computer Science, Polish Academy of Sciences.

Tamás Váradi, Svetla Koeva, Martin Yamalov, Marko Tadić, Bálint Sass, Bartłomiej Nitoń, Maciej Ogrodniczuk, Piotr Pęzik, Verginica Barbu Mititelu, Radu Ion, Elena Irimia, Maria Mitrofan, Vasile Păiș, Dan Tufiș, Radovan Garabík, Simon Krek, Andraž Repar, Matjaž Rihtar, and Janez Brank. The MARCELL legislative corpus. In Proceedings of The 12th Language Resources and Evaluation Conference, pages 3761–3768, Marseille, France, 2020. European Language Resources Association (ELRA).

Marcin Woliński, Witold Kieraś, Dorota Komosińska, and Włodzimierz Gruszczyński. Results of the PolEval 2020 shared task 2: Morphosyntactic tagging of Middle, New and Modern Polish. In Maciej Ogrodniczuk and Łukasz Kobyliński, editors, Proceedings of the PolEval 2020 Workshop, pages 39–46, Warsaw, 2020. Institute of Computer Science, Polish Academy of Sciences.

Alina Wróblewska. Towards the Conversion of National Corpus of Polish to Universal Dependencies. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 5308–5315, Marseille, France, 2020. European Language Resources Association (ELRA).

2019

Celina Heliasz and Maciej Ogrodniczuk. Eksplicytność a implicytność w świetle analizy korpusowej (meta)tekstu. Linguistica Copernicana, 16:75–100, 2019.

Łukasz Kobyliński, Maciej Ogrodniczuk, Jan Kocoń, Michał M. Marcińczuk, Aleksander Smywiński-Pohl, Krzysztof Wołk, Danijel Koržinek, Michał Ptaszyński, Agata Pieciukiewicz, and Paweł Dybała. PolEval 2019 — the next chapter in evaluating Natural Language Processing tools for Polish. In Zygmunt Vetulani and Patrick Paroubek, editors, Human Language Technologies as a Challenge for Computer Science and Linguistics – 2019, pages 165–172. Wydawnictwo Nauka i Innowacje, Poznań, Poland, 2019.

Łukasz Kobyliński and Michał Wasiluk. Deep learning in event detection in Polish. In Christiane Fellbaum, Piek Vossen, Ewa Rudnicka, Marek Maziarz, and Maciej Piasecki, editors, Proceedings of the 10th Global WordNet Conference (GWC 2019), pages 216–221, Wrocław, 2019. Oficyna Wydawnicza Politechniki Wrocławskiej.

Katarzyna Krasnowska-Kieraś and Łukasz Kobyliński. Part of speech tagging for Polish. Poznań Studies in Contemporary Linguistics, 55(2):211–237, 2019.

Katarzyna Krasnowska-Kieraś and Alina Wróblewska. Empirical linguistic study of sentence embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5729–5739, Florence, Italy, 2019. Association for Computational Linguistics.

Maciej Ogrodniczuk, Rafał L. Górski, Marek Łaziński, and Piotr Pęzik. From the National Corpus of Polish to the Polish Corpus Infrastructure. Jazykovedný časopis, 70(2):315–323, 2019.

Adam Przepiórkowski. Status gramatyczny predykatywnych szkoda, wstyd, żal raz jeszcze. Polonica, XXXIX:85–110, 2019.

Ryszard Tuora and Łukasz Kobyliński. Integrating Polish language tools and resources in spaCy. In Proceedings of PP-RAI 2019 Conference, pages 210–214, Wrocław, 2019. Department of Systems and Computer Networks, Faculty of Electronics, Wrocław University of Science and Technology.

Alina Wróblewska and Piotr Rybak. Dependency parsing of Polish. Poznań Studies in Contemporary Linguistics, 55(2):305–337, 2019.

2018

Małgorzata Marciniak, Agnieszka Mykowiecka, and Piotr Rychlik. Recognition of irrelevant phrases in automatically extracted lists of domain terms. Terminology, 24(1):66–90, 2018.

Agnieszka Mykowiecka, Małgorzata Marciniak, and Piotr Rychlik. SimLex-999 for Polish. In Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, and Takenobu Tokunaga, editors, Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, France, 2018. European Language Resources Association (ELRA).

Agnieszka Mykowiecka, Małgorzata Marciniak, and Aleksander Wawer. Literal, metphorical or both? Detecting metaphoricity in isolated adjective-noun phrases. In Beata Beigman Klebanov, Ekaterina Shutova, Patricia Lichtenstein, Smaranda Muresan, and Chee Wee, editors, Proceedings of the Workshop on Figurative Language Processing, pages 27–33. Association for Computational Linguistics, 2018.

Agnieszka Mykowiecka, Aleksander Wawer, and Małgorzata Marciniak. Detecting figurative word occurrences using recurrent neural networks. In Beata Beigman Klebanov, Ekaterina Shutova, Patricia Lichtenstein, Smaranda Muresan, and Chee Wee, editors, Proceedings of the Workshop on Figurative Language Processing, pages 124–127. Association for Computational Linguistics, 2018.

Bartłomiej Nitoń, Paweł Morawiecki, and Maciej Ogrodniczuk. Deep neural networks for coreference resolution for Polish. In Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, and Takenobu Tokunaga, editors, Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pages 395–400, Paris, France, 2018. European Language Resources Association (ELRA).

Maciej Ogrodniczuk, Joanna Bilińska, Zbigniew Bronk, and Witold Kieraś. Multisłownik: Linking plWordNet-based lexical data for lexicography and educational purposes. In Francis Bond, Takayuki Kuribayashi, Christiane Fellbaum, and Piek Vossen, editors, Proceedings of the 9th Global WordNet Conference (GWC 2018), pages 368–375, Singapore, 2018. University of Tartu.

Adam Przepiórkowski. The origin of the valency metaphor in linguistics. Lingvisticæ Investigationes, 41(1):152–159, 2018.

Piotr Rybak and Alina Wróblewska. Semi-supervised neural system for tagging, parsing and lemmatization. Addendum. In Proceedings of the PolEval 2018 Workshop, pages 49–51. Institute of Computer Science, Polish Academy of Sciences, 2018.

Alina Wróblewska. Polish corpus of annotated descriptions of images. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pages 2141–2146. European Language Resources Association (ELRA), 2018.

Alina Wróblewska. Results of the PolEval 2018 Shared Task 1: Dependency Parsing. In Proceedings of the PolEval 2018 Workshop, pages 11–24. Institute of Computer Science, Polish Academy of Sciences, 2018.

Magdalena Zawisławska, Marta Falkowska, and Maciej Ogrodniczuk. Verbal synaesthesia in the Polish corpus of synaesthetic metaphors. LaMiCuS, 2:226–253, 2018.

2017

Witold Kieraś and Marcin Woliński. Morfeusz 2 – analizator i generator fleksyjny dla języka polskiego. Język Polski, XCVII(1):75–83, 2017.

Małgorzata Marciniak, Agnieszka Mykowiecka, and Piotr Rychlik. Automatyczne wydobywanie terminologii dziedzinowej z korpusów tekstowych. Język Polski, XCVII(1):64–74, 2017.

Bartłomiej Nitoń and Maciej Ogrodniczuk. Multi-pass sieve coreference resolution system for Polish. In Jorge Gracia, Francis Bond, John P. McCrae, Paul Buitelaar, Christian Chiarcos, and Sebastian Hellmann, editors, Proceedings of the 1st Conference on Language, Data and Knowledge (LDK 2017), number 10318 in Lecture Notes in Artificial Intelligence, pages 222–236. Springer International Publishing, Berlin, 2017.

Maciej Ogrodniczuk, Magdalena Derwojedowa, Marek Łaziński, and Piotr Pęzik. Narodowy Korpus Języka Polskiego – co dalej? Prace Filologiczne, LXXI:237–245, 2017.

Maciej Ogrodniczuk and Mateusz Kopeć. Lexical correction of Polish Twitter political data. In Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 115–125, Vancouver, Canada, 2017. Association for Computational Linguistics.

Maciej Ogrodniczuk and Vincent Ng, editors. Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017), Valencia, Spain, 2017. Association for Computational Linguistics.

Adam Przepiórkowski. Argumenty i modyfikatory w gramatyce i w słowniku. Wydawnictwa Uniwersytetu Warszawskiego, Warsaw, 2017.

Adam Przepiórkowski. On the argument–adjunct distinction in the Polish Semantic Syntax tradition. Cognitive Studies / Études Cognitives, 17:1–10, 2017.

Aleksander Wawer and Agnieszka Mykowiecka. Supervised and unsupervised word sense disambiguation on word embedding vectors of unambiguous synonyms. In Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications, pages 120–125. Association for Computational Linguistics, 2017.

Marcin Woliński, Witold Kieraś, and Dorota Komosińska. Anotatornia 2 — an annotation tool geared towards historical corpora. In Zygmunt Vetulani and Patrick Paroubek, editors, Proceedings of the 8th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 158–162, Poznań, Poland, 2017. Fundacja Uniwersytetu im. Adama Mickiewicza w Poznaniu.

Alina Wróblewska and Katarzyna Krasnowska-Kieraś. Polish evaluation dataset for compositional distributional semantics models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 784–792, Vancouver, Canada, 2017. Association for Computational Linguistics.

Alina Wróblewska, Katarzyna Krasnowska-Kieraś, and Piotr Rybak. Towards the evaluation of feature embedding models of the fusional languages. In Zygmunt Vetulani and Patrick Paroubek, editors, Proceedings of the 8th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 420–424, Poznań, Poland, 2017. Fundacja Uniwersytetu im. Adama Mickiewicza w Poznaniu.

2016

Joanna Bilińska, Magdalena Derwojedowa, Witold Kieraś, and Monika Kwiecień. Mikrokorpus polszczyzny 1830–1918. Komunikacja specjalistyczna, 11:149–161, 2016.

Renata Bronikowska, Włodzimierz Gruszczyński, Maciej Ogrodniczuk, and Marcin Woliński. The use of electronic historical dictionary data in corpus design. Studies in Polish Linguistics, 11(2):47–56, 2016.

Magdalena Derwojedowa, Witold Kieraś, Joanna Bilińska, and Monika Kwiecień. Dynamika zmian fleksyjnych i ortograficznych między reformami 1830–1918. Język Polski, XCVI(1):24–35, 2016.

Elżbieta Hajnicz, Agnieszka Patejuk, Adam Przepiórkowski, and Marcin Woliński. Walenty: słownik walencyjny języka polskiego z bogatym komponentem frazeologicznym. In Karolina Skwarska and Elżbieta Kaczmarska, editors, Výzkum slovesné valence ve slovanských zemích, pages 71–102. Slovanský ústav AV ČR, Prague, 2016.

Małgorzata Marciniak, Agnieszka Mykowiecka, and Piotr Rychlik. TermoPL — a flexible tool for terminology extraction. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation, LREC 2016, pages 2278–2284, Portorož, Slovenia, 2016. European Language Resources Association (ELRA).

Adam Przepiórkowski. How not to distinguish arguments from adjuncts in LFG. In Doug Arnold, Miriam Butt, Berthold Crysmann, Tracy Holloway King, and Stefan Müller, editors, The Proceedings of the Joint 2016 Conference on Head-driven Phrase Structure Grammar and Lexical Functional Grammar, pages 560–580, Stanford, CA, 2016. CSLI Publications.

Marcin Woliński and Witold Kieraś. The on-line version of Grammatical Dictionary of Polish. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation, LREC 2016, pages 2589–2594, Portorož, Slovenia, 2016. European Language Resources Association (ELRA).

2015

Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, and Adam Przepiórkowski, editors. Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT 14), Warsaw, 2015. Institute of Computer Science, Polish Academy of Sciences.

Elżbieta Hajnicz, Bartłomiej Nitoń, Agnieszka Patejuk, Adam Przepiórkowski, and Marcin Woliński. Internetowy słownik walencyjny języka polskiego oparty na danych korpusowych. Prace Filologiczne, LXV:95–110, 2015.

Katarzyna Krasnowska-Kieraś and Agnieszka Patejuk. Integrating Polish LFG with external morphology. In Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, and Adam Przepiórkowski, editors, Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT 14), pages 134–147, Warsaw, 2015. Institute of Computer Science, Polish Academy of Sciences.

Agnieszka Patejuk. Unlike Coordination in Polish: An LFG Account. Ph.D. dissertation, Institute of Polish Language, Polish Academy of Sciences, Cracow, 2015.

Adam Przepiórkowski. Towards a linguistically-oriented textual entailment test-suite for Polish based on the semantic syntax approach. Cognitive Studies / Études Cognitives, 15:177–191, 2015.

2014

Elżbieta Hajnicz. The procedure of lexico-semantic annotation of Składnica treebank. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, pages 2290–2297, Reykjavík, Iceland, 2014. European Language Resources Association (ELRA).

Mateusz Kopeć and Maciej Ogrodniczuk. Inter-annotator agreement in coreference annotation of Polish. In Janusz Sobecki, Veera Boonjing, and Suphamit Chittayasothorn, editors, Advanced Approaches to Intelligent Information and Database Systems, volume 551 of Studies in Computational Intelligence, pages 149–158. Springer International Publishing, Switzerland, 2014.

Adam Przepiórkowski, Elżbieta Hajnicz, Agnieszka Patejuk, and Marcin Woliński. Extended phraseological information in a valence dictionary for NLP applications. In Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing (LG-LP 2014), pages 83–91, Dublin, Ireland, 2014. Association for Computational Linguistics and Dublin City University.

Adam Przepiórkowski, Elżbieta Hajnicz, Agnieszka Patejuk, Marcin Woliński, Filip Skwarski, and Marek Świdziński. Walenty: Towards a comprehensive valence dictionary of Polish. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, pages 2785–2792, Reykjavík, Iceland, 2014. European Language Resources Association (ELRA).

Adam Przepiórkowski and Maciej Ogrodniczuk, editors. Advances in Natural Language Processing: Proceedings of the 9th International Conference on NLP, PolTAL 2014, Warsaw, Poland, September 17–19, 2014. Number 8686 in Lecture Notes in Artificial Intelligence. Springer International Publishing, Heidelberg, 2014.

Alina Wróblewska. Polish Dependency Parser Trained on an Automatically Induced Dependency Bank. Ph.D. dissertation, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2014.

2013

Włodzimierz Gruszczyński, Dorota Adamiec, and Maciej Ogrodniczuk. Elektroniczny korpus tekstów polskich z XVII i XVIII w. (do 1772 r.) — prezentacja projektu badawczego. Polonica, XXXIII:309–316, 2013.

Elżbieta Hajnicz. Actualising lexico-semantic annotation of Składnica Treebank to modified versions of source resources. In Zygmunt Vetulani, editor, Proceedings of the 6th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 178–182, Poznań, Poland, 2013. Wydawnictwo Poznańskie, Fundacja Uniwersytetu im. Adama Mickiewicza.

Mieczysław A. Kłopotek, Jacek Koronacki, Małgorzata Marciniak, Agnieszka Mykowiecka, and Sławomir T. Wierzchoń, editors. Language Processing and Intelligent Information Systems – 20th International Conference, IIS 2013, Warsaw, Poland, June 17–18, 2013. Proceedings, number 7912 in Lecture Notes in Computer Science, Berlin, Heidelberg, 2013. Springer-Verlag.

Katarzyna Krasnowska. Towards a Polish LTAG grammar. In Mieczysław A. Kłopotek, Jacek Koronacki, Małgorzata Marciniak, Agnieszka Mykowiecka, and Sławomir T. Wierzchoń, editors, Language Processing and Intelligent Information Systems – 20th International Conference, IIS 2013, Warsaw, Poland, June 17–18, 2013. Proceedings, number 7912 in Lecture Notes in Computer Science, pages 16–21, Berlin, Heidelberg, 2013. Springer-Verlag.

Barbara Lewandowska-Tomaszczyk, Rafał Górski, Marek Łaziński, and Adam Przepiórkowski. The National Corpus of Polish (NKJP). Language use and data analysis. In Irina Kor Chahine and Charles Zaremba, editors, Travaux de slavistique : Actes du VIe congrès de la Slavic Linguistic Society, pages 309–319. Presses Universitaires de Provence, 2013.

Małgorzata Marciniak and Agnieszka Mykowiecka. Terminology extraction from domain texts in Polish. In R. Bembenik, L. Skonieczny, H. Rybinski, M. Kryszkiewicz, and M. Niezgodka, editors, Intelligent Tools for Building a Scientific Information Platform. Advanced Architectures and Solutions, volume 467 of Studies in Computational Intelligence, pages 171–185. Springer-Verlag, 2013.

Maciej Ogrodniczuk. Cyfrowi mówcy uczą się szybko. Academia, 4(36):26–29, 2013.

Maciej Ogrodniczuk. Translation- and projection-based unsupervised coreference resolution for Polish. In Mieczysław A. Kłopotek, Jacek Koronacki, Małgorzata Marciniak, Agnieszka Mykowiecka, and Sławomir T. Wierzchoń, editors, Language Processing and Intelligent Information Systems – 20th International Conference, IIS 2013, Warsaw, Poland, June 17–18, 2013. Proceedings, number 7912 in Lecture Notes in Computer Science, pages 125–130. Springer-Verlag, Berlin, Heidelberg, 2013.

Maciej Ogrodniczuk and Michał Lenart. A multi-purpose online toolset for NLP applications. In Elisabeth Métais, Farid Meziane, Mohamed Saraee, Vijay Sugumaran, and Sunil Vadera, editors, Proceedings of the 18th International Conference on Applications of Natural Language to Information Systems, number 7934 in Lecture Notes in Computer Science, pages 392–395. Springer-Verlag, Berlin, Heidelberg, 2013.

Adam Przepiórkowski. The syntax of distance distributivity in Polish: Preserving generalisations with weak heads. In Stefan Müller, editor, Proceedings of the HPSG 2013 Conference, pages 161–181, Stanford, CA, 2013. CSLI Publications.

Adam Przepiórkowski, Maciej Piasecki, Krzysztof Jassem, and Piotr W. Fuglewicz, editors. Computational Linguistics: Applications. Springer-Verlag, Berlin, 2013.

Piotr Przybyła. Question Classification for Polish Question Answering. In Mieczysław A. Kłopotek, Jacek Koronacki, Małgorzata Marciniak, Agnieszka Mykowiecka, and Sławomir T. Wierzchoń, editors, Proceedings of the 20th International Conference on Language Processing and Intelligent Information Systems (LP&IIS 2013), pages 50–56. Springer-Verlag, 2013.

Sebastian Sulger, Miriam Butt, Tracy Holloway King, Paul Meurer, Tibor Laczkó, György Rákosi, Cheikh Bamba Dione, Helge Dyvik, Victoria Rosén, Koenraad De Smedt, Agnieszka Patejuk, Özlem Çetinoğlu, I Wayan Arka, and Meladel Mistica. ParGramBank: The ParGram parallel treebank. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 550–560, Sofia, Bulgaria, 2013. Association for Computational Linguistics.

2012

Szymon Acedański, Adam Slaski, and Adam Przepiórkowski. Machine learning of syntactic attachment from morphosyntactic and semantic co-occurrence statistics. In Proceedings of the ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages, pages 42–47, Jeju, Republic of Korea, 2012. Association for Computational Linguistics.

Anna Andrzejczuk. Klasyfikacja onomazjologiczna rzeczowników a ich charakterystyka gramatyczna. Nowy sposób opracowania materiału leksykograficznego. PhD thesis, Instytut Języka Polskiego, Polska Akademia Nauk, Cracow, 2012.

Pascal Bouvry, Mieczysław A. Kłopotek, Franck Leprevost, Małgorzata Marciniak, Agnieszka Mykowiecka, and Henryk Rybiński, editors. Security and Intelligent Information Systems: International Joint Conference, SIIS 2011, Warsaw, Poland, June 13–14, 2011, Revised Selected Papers. Number 7053 in Lecture Notes in Computer Science. Springer-Verlag, 2012.

Łukasz Degórski and Adam Przepiórkowski. Ręcznie znakowany milionowy podkorpus NKJP. In Adam Przepiórkowski, Mirosław Bańko, Rafał L. Górski, and Barbara Lewandowska-Tomaszczyk, editors, Narodowy Korpus Języka Polskiego, pages 51–58. Wydawnictwo Naukowe PWN, Warsaw, 2012.

Elżbieta Hajnicz. Znakowanie semantyczne Składnicy frazowej. założenia ogólne, nazwy własne, aktualizacja. Technical Report 1025, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2012.

Łukasz Kobyliński and Krzysztof Walczak. Emerging patterns and classification for spatial and image data. In Guozhu Dong and James Bailey, editors, Contrast Data Mining: Concepts, Algorithms and Applications, Data Mining and Knowledge Discovery, pages 285–302. Chapman & Hall/CRC, 2012.

Mateusz Kopeć, Rafał Młodzki, and Adam Przepiórkowski. Word Sense Disambiguation in the National Corpus of Polish. Prace Filologiczne, LXIII:155–165, 2012.

Mateusz Kopeć, Rafał Młodzki, and Adam Przepiórkowski. Automatyczne znakowanie sensami słów. In Adam Przepiórkowski, Mirosław Bańko, Rafał L. Górski, and Barbara Lewandowska-Tomaszczyk, editors, Narodowy Korpus Języka Polskiego, pages 209–224. Wydawnictwo Naukowe PWN, Warsaw, 2012.

Barbara Lewandowska-Tomaszczyk, Mirosław Bańko, Rafał L. Górski, Marek Łaziński, Piotr Pęzik, and Adam Przepiórkowski. Narodowy Korpus Języka Polskiego: geneza i dzień dzisiejszy. In Adam Przepiórkowski, Mirosław Bańko, Rafał L. Górski, and Barbara Lewandowska-Tomaszczyk, editors, Narodowy Korpus Języka Polskiego, pages 3–10. Wydawnictwo Naukowe PWN, Warsaw, 2012.

Maciej Ogrodniczuk, Piotr Pęzik, and Adam Przepiórkowski. Towards a comprehensive open repository of Polish language resources. In Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, pages 3593–3597, Istanbul, Turkey, 2012. European Language Resources Association (ELRA).

Adam Przepiórkowski, Mirosław Bańko, Rafał L. Górski, and Barbara Lewandowska-Tomaszczyk, editors. Narodowy Korpus Języka Polskiego. Wydawnictwo Naukowe PWN, Warsaw, 2012.

Adam Przepiórkowski. Znakowanie XML. In Adam Przepiórkowski, Mirosław Bańko, Rafał L. Górski, and Barbara Lewandowska-Tomaszczyk, editors, Narodowy Korpus Języka Polskiego, pages 169–193. Wydawnictwo Naukowe PWN, Warsaw, 2012.

Bartosz Zaborowski and Adam Przepiórkowski. Tagset conversion with decision trees. In Hitoshi Isahara and Kyoko Kanzaki, editors, Advances in Natural Language Processing: Proceedings of the 8th International Conference on NLP, JapTAL 2012, Kanazawa, Japan, October 22–24, 2012, number 7614 in Lecture Notes in Artificial Intelligence, pages 144–155. Springer-Verlag, Heidelberg, 2012.

2011

Anna Andrzejczuk. Dwoje urodzin to brzmi dziwnie. Norma językowa dotycząca połączeń rzeczowników PT z liczebnikami a jej realizacja w tekstach Narodowego Korpusu Języka Polskiego i w tekstach internetowych. Język Polski, XCI(4):273–283, 2011.

Elżbieta Hajnicz. Ordering slots of semantically related schemata of Polish verbs. In Zygmunt Vetulani, editor, Proceedings of the 5th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 232–236, Poznań, Poland, 2011.

Andreas Kathol, Adam Przepiórkowski, and Jesse Tseng. Advanced topics in HPSG. In Robert D. Borsley and Kersti Börjars, editors, Non-Transformational Syntax: Formal and Explicit Models of Grammar, pages 54–111. Blackwell, Oxford, 2011.

Łukasz Kobyliński and Krzysztof Walczak. Efficient mining of jumping emerging patterns with occurrence counts for classification. Transactions on Rough Sets, 13(6499):73–88, 2011.

Agnieszka Mykowiecka and Małgorzata Marciniak. Some remarks on automatic semantic annotation of a medical corpus. In Proceedings of the Third Louhi Workshop on Health Documentation Text Mining and Information Analysis at AIME, 2011.

Agnieszka Mykowiecka and Małgorzata Marciniak. Automatic semantic labeling of medical texts with feature structures. In Ivan Habernal and Václav Matoušek, editors, Text, Speech and Dialogue: 14th International Conference, TSD 2011, Plzeň, Czech Republic, number 6836 in Lecture Notes in Artificial Intelligence, pages 49–56, Heidelberg, 2011. Springer-Verlag.

Maciej Ogrodniczuk. The Packaged TEI P5-based Stand-off Annotation Format. Internal description of the Multiservice format, 2011.

Maciej Ogrodniczuk and Mateusz Kopeć. Rule-based coreference resolution module for Polish. In Proceedings of the 8th Discourse Anaphora and Anaphor Resolution Colloquium (DAARC 2011), pages 191–200, Faro, Portugal, 2011.

Adam Przepiórkowski and Piotr Bański. XML text interchange format in the National Corpus of Polish. In Stanisław Goźdź-Roszkowski, editor, Explorations across Languages and Corpora: PALC 2009, pages 55–65, Frankfurt am Main, 2011. Peter Lang.

Marcin Woliński, Katarzyna Głowińska, and Marek Świdziński. A preliminary version of Składnica—a treebank of Polish. In Zygmunt Vetulani, editor, Proceedings of the 5th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 299–303, Poznań, Poland, 2011.

2010

Elżbieta Hajnicz. Aggregating entries of semantic valence dictionary of Polish verbs. In Pier Marco Bertinetto, Anna Korhonen, Alessandro Lenci, Alissa Melinger, Sabine Schulte im Walde, and Aline Villavicencio, editors, Proceedings of the Interdisciplinary Workshop on the Identification and Representation of Verb Features (Verb 2010), pages 49–54, Pisa, 2010. Scuola Normale Superiore and Università di Pisa.

Łukasz Kobyliński and Krzysztof Walczak. Spatial emerging patterns for scene classification. In Leszek Rutkowski, Rafał Scherer, Ryszard Tadeusiewicz, Lotfi A. Zadeh, and Jacek M. Żurada, editors, Proceedings of the 10th International Conference on Artificial Intelligence and Soft Computing, number 6113 in Lecture Notes in Computer Science, pages 512–522. Springer-Verlag, 2010.

Małgorzata Marciniak, editor. Anotowany korpus dialogów telefonicznych. Akademicka Oficyna Wydawnicza EXIT, Warsaw, 2010.

Małgorzata Marciniak, Agnieszka Mykowiecka, and Piotr Rychlik. Medical text data anonymization. Journal of Medical Informatics & Technologies, 16:83–88, 2010.

Maciej Ogrodniczuk and Adam Przepiórkowski. Linguistic processing chains as Web Services: Initial linguistic considerations, 2010. CLARIN deliverable D5R-3a.

Marek Świdziński and Marcin Woliński. Towards a bank of constituent parse trees for Polish. In Petr Sojka, Aleš Horák, Ivan Kopeček, and Karel Pala, editors, Text, Speech and Dialogue: 13th International Conference, TSD 2010, Brno, Czech Republic, number 6231 in Lecture Notes in Artificial Intelligence, pages 197–204, Heidelberg, 2010. Springer-Verlag.

2009

Mieczysław A. Kłopotek, Adam Przepiórkowski, Sławomir T. Wierzchoń, and Krzysztof Trojanowski, editors. Recent Advances in Intelligent Information Systems. Akademicka Oficyna Wydawnicza EXIT, Warsaw, 2009.

Małgorzata Marciniak and Agnieszka Mykowiecka, editors. Aspects of Natural Language Processing. Essays dedicated to Leonard Bolc on the Occasion of His 75th Birthday. Number 5070 in Lecture Notes in Computer Science. Springer-Verlag, Berlin, 2009.

Agnieszka Mykowiecka, Krzysztof Marasek, Małgorzata Marciniak, Joanna Rabiega-Wiśniewska, and Ryszard Gubrynowicz. Annotated corpus of Polish spoken dialogues. In Zygmunt Vetulani and Hans Uszkoreit, editors, Human Language Technology: Challenges of the Information Society, number 5603 in Lecture Notes in Artificial Intelligence, pages 50–62, Berlin, 2009. Springer-Verlag.

Agnieszka Mykowiecka and Małgorzata Marciniak. Domain model for medical information extraction – the LightMedOnt ontology. In Aspects of Natural Language Processing, number 5070 in Lecture Notes in Computer Science. Springer-Verlag, 2009.

Marco Passarotti, Adam Przepiórkowski, Savina Raynaud, and Frank Van Eynde, editors. Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories (TLT 8), Milan, Italy, 2009.

Adam Przepiórkowski. A comparison of two morphosyntactic tagsets of Polish. In Violetta Koseska-Toszewa, Ludmila Dimitrova, and Roman Roszko, editors, Representing Semantics in Digital Lexicography: Proceedings of MONDILEX Fourth Open Workshop, pages 138–144, Warsaw, 2009.

Adam Przepiórkowski. TEI P5 as an XML standard for treebank encoding. In Marco Passarotti, Adam Przepiórkowski, Savina Raynaud, and Frank Van Eynde, editors, Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories (TLT 8), pages 149–160, Milan, Italy, 2009.

Marek Świdziński and Marcin Woliński. A new formal definition of Polish nominal phrases. In Małgorzata Marciniak and Agnieszka Mykowiecka, editors, Aspects of Natural Language Processing. Essays dedicated to Leonard Bolc on the Occasion of His 75th Birthday, number 5070 in Lecture Notes in Computer Science, pages 143–162. Springer-Verlag, Berlin, 2009.

Marcin Woliński. A relational model of Polish inflection in Grammatical Dictionary of Polish. In Zygmunt Vetulani and Hans Uszkoreit, editors, Human Language Technology: Challenges of the Information Society, number 5603 in Lecture Notes in Artificial Intelligence, pages 96–106. Springer-Verlag, Berlin, 2009.

2008

Mieczysław A. Kłopotek, Adam Przepiórkowski, Sławomir T. Wierzchoń, and Krzysztof Trojanowski, editors. Intelligent Information Systems. Akademicka Oficyna Wydawnicza EXIT, Warsaw, 2008.

Sandra Kübler, Jakub Piskorski, and Adam Przepiórkowski, editors. Proceedings of the LREC 2008 Workshop on Partial Parsing: Between Chunking and Deep Parsing, Marrakech, 2008. European Language Resources Association (ELRA).

Adam Przepiórkowski, Michał Marcińczuk, and Łukasz Degórski. Dealing with small, noisy and imbalanced data: Machine learning or manual grammars? In Petr Sojka, Aleš Horák, Ivan Kopeček, and Karel Pala, editors, Text, Speech and Dialogue: 11th International Conference, TSD 2008, Brno, Czech Republic, September 2008, number 5246 in Lecture Notes in Artificial Intelligence, pages 169–176, Berlin, 2008. Springer-Verlag.

2007

Anna Andrzejczuk. (Nie)tylko w liczbie mnogiej. Rozważania o szeroko rozumianych plurale tantum. LingVaria, 4(2):177–188, 2007. Cracow.

Elżbieta Hajnicz. Dobór czasowników do badań przy tworzeniu słownika semantycznego czasowników polskich. IPI PAN Research Report 1003, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2007.

Agnieszka Mykowiecka and Małgorzata Marciniak. Automatic spelling correction for the texts from a restricted domain. In Peter Kosta, Gerda Haßler, Lilia Schürcks, and Nadine Thielemann, editors, Linguistic Investigations into Formal Description of Slavic Languages: Contributions of the Sixth European Conference held at Potsdam University, November 30 – December 02, 2005, pages 65–73, Frankfurt am Main, 2007. Peter Lang.

Adam Przepiórkowski, Łukasz Degórski, Miroslav Spousta, Kiril Simov, Petya Osenova, Lothar Lemnitzer, Vladislav Kuboň, and Beata Wójtowicz. Towards the automatic extraction of definitions in Slavic. In Jakub Piskorski, Bruno Pouliquen, Ralf Steinberger, and Hristo Tanev, editors, Proceedings of the Workshop on Balto-Slavonic Natural Language Processing at ACL 2007, pages 43–50, Prague, 2007.

Adam Przepiórkowski, Łukasz Degórski, and Beata Wójtowicz. On the evaluation of Polish definition extraction grammars. In Zygmunt Vetulani, editor, Proceedings of the 3rd Language & Technology Conference, pages 473–477, Poznań, Poland, 2007.

2006

Agnieszka Mykowiecka and Małgorzata Marciniak. Domain-driven automatic spelling correction for mammography reports. In Mieczysław A. Kłopotek, Sławomir T. Wierzchoń, and Krzysztof Trojanowski, editors, Intelligent Information Processing and Web Mining, Advances in Soft Computing, pages 521–530. Springer-Verlag, Berlin, 2006.

Adam Przepiórkowski. Poliqarp: Przeszukiwarka korpusowa dla lingwistów. In Anna Duszak, Elżbieta Gajek, and Urszula Okulska, editors, Korpusy w angielsko-polskim językoznawstwie kontrastywnym, pages 398–426. Universitas, Cracow, 2006.

Marcin Woliński. Morfeusz — a practical tool for the morphological analysis of Polish. In Mieczysław A. Kłopotek, Sławomir T. Wierzchoń, and Krzysztof Trojanowski, editors, Intelligent Information Processing and Web Mining, Advances in Soft Computing, pages 503–512. Springer-Verlag, Berlin, 2006.

2005

Agnieszka Mykowiecka, Małgorzata Marciniak, and Anna Kupść. Making shallow look deeper: Anaphora and comparisons in medical information extraction. In Zygmunt Vetulani, editor, Proceedings of the 2nd Language & Technology Conference, pages 225–229, Poznań, Poland, 2005.

Dariusz Piechociński and Agnieszka Mykowiecka. Question answering in Polish using shallow parsing. In Radovan Garabík, editor, Computer Treatment of Slavic and East European Languages: Proceedings of the Third International Seminar, Bratislava, Slovakia, 10–12 November 2005, pages 167–173, Bratislava, 2005. VEDA: Vydavateľstvo Slovenskej akadémie vied.

Marcin Woliński. An efficient implementation of a large grammar of Polish. In Zygmunt Vetulani, editor, Proceedings of the 2nd Language & Technology Conference, pages 343–347, Poznań, Poland, 2005.

2004

Jakub Piskorski, Peter Homola, Małgorzata Marciniak, Agnieszka Mykowiecka, Adam Przepiórkowski, and Marcin Woliński. Information extraction for Polish using the SProUT platform. In Mieczysław A. Kłopotek, Sławomir T. Wierzchoń, and Krzysztof Trojanowski, editors, Intelligent Information Processing and Web Mining, Advances in Soft Computing, pages 227–236. Springer-Verlag, Berlin, 2004.

Adam Przepiórkowski. Korpus IPI PAN. Wersja wstępna. Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2004.

Adam Przepiórkowski. On case transmission in Polish control and raising constructions. Poznań Studies in Contemporary Linguistics, 39:103–123, 2004.

Marcin Woliński. Komputerowa weryfikacja gramatyki Świdzińskiego. Ph.D. dissertation, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2004.

2003

Małgorzata Marciniak, Agnieszka Mykowiecka, Adam Przepiórkowski, and Anna Kupść. An HPSG-annotated test suite for Polish. In Anne Abeillé, editor, Treebanks: Building and Using Parsed Corpora, volume 20 of Text, Speech and Language Technology, pages 129–146. Kluwer, Dordrecht, 2003.

Maciej Ogrodniczuk. Wzbogacenie korpusu słownika frekwencyjnego o nowe kody gramatyczne. In Janusz S. Bień, Maciej Ogrodniczuk, and Marcin Woliński, editors, Wzbogacony korpus Słownika frekwencyjnego polszczyzny współczesnej. Płyta CD-ROM. Katedra Lingwistyki Formalnej, Wydział Neofilologii Uniwersytetu Warszawskiego, Warsaw, 2003.

Maciej Ogrodniczuk. Rozszerzenie opisów morfologicznych w tekstach korpusu „Słownika frekwencyjnego polszczyzny współczesnej”. In Roman Huszcza and Jadwiga Linde-Usiekniewicz, editors, Prace lingwistyczne dedykowane prof. Jadwidze Sambor, pages 164–168. Wydział Polonistyki Uniwersytetu Warszawskiego, Warsaw, 2003.

Marcin Woliński. System znaczników morfosyntaktycznych w korpusie IPI PAN. Polonica, XXII–XXIII:39–55, 2003.

2001

Adam Przepiórkowski. arg-st on phrases: Evidence from Polish. In Dan Flickinger and Andreas Kathol, editors, Proceedings of the HPSG 2000 Conference, pages 267–284. CSLI Publications, Stanford, CA, 2001.

Marcin Woliński. Rodzajów w polszczyźnie jest osiem. In Włodzimierz Gruszczyński, Urszula Andrejewicz, Mirosław Bańko, and Dorota Kopcińska, editors, Nie bez znaczenia... Prace ofiarowane Profesorowi Zygmuntowi Saloniemu z okazji jubileuszu 15000 dni pracy naukowej, pages 303–305. Wydawnictwo Uniwersytetu Białostockiego, Białystok, 2001.

2000

Piotr Bański and Adam Przepiórkowski, editors. Proceedings of the First Generative Linguistics in Poland Conference. Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2000.

Anna Kupść, Małgorzata Marciniak, Agnieszka Mykowiecka, and Adam Przepiórkowski. Składniowe konstrukcje współrzędne w języku polskim: Próba opisu w HPSG. IPI PAN Research Report 914, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2000.

Adam Przepiórkowski. Optional and multiple Long Distance Genitive of Negation in Polish. In Piotr Bański and Adam Przepiórkowski, editors, Proceedings of the First Generative Linguistics in Poland Conference, pages 135–146, Warsaw, 2000. Institute of Computer Science, Polish Academy of Sciences.

1999

Robert D. Borsley and Adam Przepiórkowski, editors. Slavic in Head-Driven Phrase Structure Grammar. CSLI Publications, Stanford, CA, 1999.

Adam Przepiórkowski. On complements and adjuncts in Polish. In Robert D. Borsley and Adam Przepiórkowski, editors, Slavic in Head-Driven Phrase Structure Grammar, pages 183–210. CSLI Publications, Stanford, CA, 1999.

Adam Przepiórkowski. Negative polarity questions and Italian negative concord. In Valia Kordoni, editor, Tübingen Studies in Head-Driven Phrase Structure Grammar, Arbeitspapiere des Sonderforschungsbereichs 340, Bericht Nr. 132, pages 353–400, Tübingen, 1999. Universität Tübingen.

Adam Przepiórkowski and Anna Kupść. Eventuality negation and negative concord in Polish and Italian. In Robert D. Borsley and Adam Przepiórkowski, editors, Slavic in Head-Driven Phrase Structure Grammar, pages 211–246. CSLI Publications, Stanford, CA, 1999.

1998

Leonard Bolc, Krzysztof Dziewicki, Piotr Rychlik, and Andrzej Szałas. Wnioskowanie w logikach nieklasycznych. Automatyzacja wnioskowania. Akademicka Oficyna Wydawnicza RM, Warsaw, 1998.

Adam Przepiórkowski. Do So and lexical theories of passivization. Paper delivered at the Spring 1998 Meeting of the Linguistics Association of Great Britain, Lancaster, Great Britain, April 14–16, 1998.

Adam Przepiórkowski. 'A Unified Theory of Scope' revisited: Quantifier retrieval without spurious ambiguities. In Gosse Bouma, Geert-Jan M. Kruijff, and Richard T. Oehrle, editors, Proceedings of the Joint Conference on Formal Grammar, Head-Driven Phrase Structure Grammar, and Categorial Grammar, pages 185–195, Saarbrücken, 1998. Universität des Saarlandes.

1997

Anna Kupść, Małgorzata Marciniak, and Leonard Bolc. Anaphor binding in Polish. An attempt at an HPSG account. IPI PAN Research Report 836, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 1997.

Anna Kupść, Małgorzata Marciniak, and Agnieszka Mykowiecka. Komputerowe przetwarzanie języka naturalnego – wybrane zagadnienia. Informatyka, 1997.

Adam Przepiórkowski and Anna Kupść. Verbal negation and complex predicate formation in Polish. In Ralph C. Blight and Michelle J. Moosally, editors, Proceedings of the 1997 Texas Linguistics Society Conference on the Syntax and Semantics of Predication, volume 38 of Texas Linguistic Forum, pages 247–261, Austin, TX, 1997.

1996

Adam Przepiórkowski. Case assignment in Polish: Towards an HPSG analysis. In Claire Grover and Enric Vallduví, editors, Studies in HPSG, volume 12 of Edinburgh Working Papers in Cognitive Science, pages 191–228. Centre for Cognitive Science, University of Edinburgh, 1996.

1995

Leonard Bolc, Krzysztof Dziewicki, Piotr Rychlik, and Andrzej Szałas. Wnioskowanie w logikach nieklasycznych. Podstawy teoretyczne. Akademicka Oficyna Wydawnicza RM, Warsaw, 1995.

Anna Kupść, Małgorzata Marciniak, Agnieszka Mykowiecka, and Adam Przepiórkowski. Formal analysis of Polish in HPSG. In Mirosław Dąbrowski, Maciej Michalewicz, and Zbigniew Raś, editors, Intelligent Information Systems. Proceedings of the Fourth Workshop on Intelligent Information Systems, pages 295–305, Augustów, Poland, 1995. Wydawnictwa IPI PAN.

1992

Leonard Bolc and Agnieszka Mykowiecka. Podstawy przetwarzania języka naturalnego. Wybrane metody formalnego zapisu składni. Akademicka Oficyna Wydawnicza RM, Warsaw, 1992.

1991

Elżbieta Hajnicz. A formalization of absolute dates and relative dates based on the point calculus. International Journal of Man-Machine Studies, 34:717–730, 1991.

Elżbieta Hajnicz. Another approach to formalizing the point and interval calculi. International Journal of Man-Machine Studies, 34:703–716, 1991.

1989

Elżbieta Hajnicz. Absolute dates and relative dates in an inferential system on temporal dependencies between events. International Journal of Man-Machine Studies, 30:537–549, 1989.

Elżbieta Hajnicz. Formalizacja systemu wnioskowania o zależnościach czasowych między zdarzeniami. IPI PAN Research Report 658, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 1989.

1988

Małgorzata Marciniak. Problemy semantyczne w systemach przetwarzania języka naturalnego. IPI PAN Research Report 647, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 1988.