#acl +All:read Default

= The Linguistic Engineering Group =

The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the [[http://www.ipipan.waw.pl/en/|Institute of Computer Science]], [[https://institution.pan.pl/|Polish Academy of Sciences]] (IPI PAN).

== People ==

=== Core team ===
|| [[http://zil.ipipan.waw.pl/TomaszBartosiak|Tomasz Bartosiak]], MSc || [[mailto:tomasz.bartosiak@gmail.com|tomasz.bartosiak@gmail.com]] ||
|| [[https://www.diegofeinmann.com/|Diego Feinmann]], PhD || [[mailto:diego.feinmann@ipipan.waw.pl|diego.feinmann@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/ElzbietaHajnicz|Elżbieta Hajnicz]], PhD, Assoc. Prof. || [[mailto:elzbieta.hajnicz@ipipan.waw.pl|elzbieta.hajnicz@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/WitoldKieras|Witold Kieraś]], PhD || [[mailto:witold.kieras@ipipan.waw.pl|witold.kieras@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/LukaszKobylinski|Łukasz Kobyliński]], PhD || [[mailto:lkobylinski@ipipan.waw.pl|lukasz.kobylinski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/DorotaKomosi%C5%84ska|Dorota Komosińska]], MSc || [[mailto:dorota.komosinska@gmail.com|dorota.komosinska@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/KatarzynaKrasnowska|Katarzyna Krasnowska-Kieraś]], MSc || [[mailto:katarzyna.krasnowska@ipipan.waw.pl|katarzyna.krasnowska@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MalgorzataMarciniak|Małgorzata Marciniak]], PhD, Assoc. Prof. || [[mailto:malgorzata.marciniak@ipipan.waw.pl|malgorzata.marciniak@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AgnieszkaMykowiecka|Agnieszka Mykowiecka]], PhD, Assoc. Prof. || [[mailto:agnieszka.mykowiecka@ipipan.waw.pl|agnieszka.mykowiecka@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MaciejOgrodniczuk|Maciej Ogrodniczuk]], PhD, Assoc. Prof., Head of the Group || [[mailto:maciej.ogrodniczuk@ipipan.waw.pl|maciej.ogrodniczuk@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AgnieszkaPatejuk|Agnieszka Patejuk]], PhD || [[mailto:aep@ipipan.waw.pl|agnieszka.patejuk@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AdamPrzepiorkowski|Adam Przepiórkowski]], PhD, Full Prof. || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/PiotrPrzybyla|Piotr Przybyła]], PhD (on postdoctoral fellowship at [[https://www.upf.edu/web/erinia|UPF]]) || [[mailto:piotr.przybyla@ipipan.waw.pl|piotr.przybyla@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MichałRudolf|Michał Rudolf]], PhD || [[mailto:michal@rudolf.waw.pl|michal@rudolf.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/PiotrRychlik|Piotr Rychlik]], PhD || [[mailto:rychlik@ipipan.waw.pl|piotr.rychlik@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AleksandraTomaszewska|Aleksandra Tomaszewska]], PhD candidate || [[mailto:aleksandra.tomaszewska@hotmail.com|aleksandra.tomaszewska@hotmail.com]] ||
|| [[http://zil.ipipan.waw.pl/AleksanderWawer|Aleksander Wawer]], PhD || [[mailto:aleksander.wawer@ipipan.waw.pl|aleksander.wawer@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MarcinWolinski|Marcin Woliński]], PhD, Assoc. Prof. || [[mailto:marcin.wolinski@ipipan.waw.pl|marcin.wolinski@ipipan.waw.pl]] ||
|| Joanna Wołoszyn, PhD || [[mailto:joanna.woloszyn@ipipan.waw.pl|joanna.woloszyn@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/AlinaWroblewska|Alina Wróblewska]], PhD || [[mailto:alina.wroblewska@ipipan.waw.pl|alina.wroblewska@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/SebastianZawada|Sebastian Zawada]], MSc || [[mailto:sebastian.zawada@ipipan.waw.pl|sebastian.zawada@ipipan.waw.pl]] ||
|| Natalia Zawadzka, PhD || [[mailto:natalia.zawadzka-paluektau@ipipan.waw.pl|natalia.zawadzka-paluektau@ipipan.waw.pl]] ||
|| Bartosz Żuk, MSc || [[mailto:bartoszzuk.poczta@gmail.com|bartoszzuk.poczta@gmail.com]] ||

=== Associates ===
|| [[http://zil.ipipan.waw.pl/AnnaAndrzejczuk|Anna Andrzejczuk]], PhD (on leave) || [[mailto:anna.andrzejczuk@ipipan.waw.pl|anna.andrzejczuk@ipipan.waw.pl]] ||
|| Wiktor Eźlakowski, MSc || [[mailto:wiktor.ezlakowski@ipipan.waw.pl|wiktor.ezlakowski@ipipan.waw.pl]] ||
|| Sonia Janicka || [[mailto:sonia.janicka@gmail.com|sonia.janicka@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/MateuszKlimaszewski|Mateusz Klimaszewski]], MSc || [[mailto:mk.klimaszewski@gmail.com|mk.klimaszewski@gmail.com]] ||
|| Jakub Piskorski, PhD || [[mailto:jpiskorski@gmail.com|jpiskorski@gmail.com]] ||
|| Piotr Rybak, MSc || [[mailto:piotr.cezary.rybak@gmail.com|piotr.cezary.rybak@gmail.com]] ||
|| Karol Saputa, BEng || [[mailto:karolsaputa@gmail.com|karolsaputa@gmail.com]] ||
|| Jakub Szymanik, PhD || [[mailto:jakub.szymanik@gmail.com|jakub.szymanik@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/RyszardTuora|Ryszard Tuora]], MSc || [[mailto:ryszardtuora@gmail.com|ryszardtuora@gmail.com]] ||
|| Grzegorz Wojdyga, MSc || [[mailto:g.wojdyga@ipipan.waw.pl|g.wojdyga@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/BeataWojtowicz|Beata Wójtowicz]], PhD, Assoc. Prof. || [[mailto:beata.wojtowicz@ipipan.waw.pl|beata.wojtowicz@ipipan.waw.pl]] ||

== Research ==

=== The main research areas of the Group ===
 * (Polish) corpus linguistics ([[http://nkjp.pl/|National Corpus of Polish]]), /* ; cf. the [[http://korpus.pl/en/|IPI PAN Corpus of Polish]] and the [[http://nkjp.pl/|National Corpus of Polish]], */
 * morphosyntactic tagging and lemmatisation of Polish,
 * syntactic and semantic parsing of Polish,
 * extraction of linguistic knowledge from corpora,
 * information extraction,
 * distributional semantics and compositional distributional semantics,
 * sentiment analysis,
 * credibility assessment of online content, /* * morphosyntactic system of Polish, */
 * generative linguistic formalisms, esp. HPSG and LFG.

The Group is a member of [[http://www.clarin.eu/|CLARIN]], [[http://dariah.pl/|DARIAH-PL]], [[http://clip.ipipan.waw.pl/ELRC|ELRC]], [[http://www.flarenet.eu/|FLaReNet]] and [[http://www.meta-net.eu/|META-NET]].

=== Current externally funded projects ===
 * [[http://clip.ipipan.waw.pl/CLARIN-PL-3|CLARIN-PL]] (Polish chapter of [[http://www.clarin.eu/|Common Language Resources and Technology Infrastructure]])
 * [[CORMETAN]] (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)
 * [[http://clip.ipipan.waw.pl/CURLICAT|CURLICAT]] (Curated Multilingual Language Resources for CEF AT)
 * [[http://korpus-dekady.ipipan.waw.pl|Korpus Dekady]] ([[http://dariah.pl/|DARIAH-PL]] – Digital Research Infrastructure for the Arts and Humanities)
 * [[http://clip.ipipan.waw.pl/ELE|ELE]] (European Language Equality)
 * [[http://clip.ipipan.waw.pl/ELG|ELG]] (European Language Grid)
 * [[http://clip.ipipan.waw.pl/ELRC|ELRC]] (European Language Resource Coordination)
 * [[HOMADOS|HOMADOS]] (Hampering Misinformation by Assessing Credibility of Online Sources)
 * [[http://clip.ipipan.waw.pl/KORBA-2|KORBA 2]] (Extension of the "Electronic corpus of 17th and 18th century Polish texts" and its integration with the "Electronic Dictionary of the 17th–18th Century Polish")
 * [[http://zil.ipipan.waw.pl/Quantifiers|Kwantyfikatory w języku: użycie i znaczenie]] (Quantifiers in Language: Use and Meaning)
 * [[http://clip.ipipan.waw.pl/MARCELL|MARCELL]] (Multilingual Resources for CEF.AT in the legal domain)
 * [[http://clip.ipipan.waw.pl/Nexus|Nexus Linguarum]] (European network for Web-centred linguistic data science)
 * [[http://zil.ipipan.waw.pl/Scwad|Scwad]] (Compositional distributional modelling of Polish language semantics)
 * [[http://synamet.uw.edu.pl/|SYNAMET]] (Microcorpus of Synaesthetic Metaphors. Towards a Formal Description and Efficient Methods of Analysis of Metaphors in Discourse)

=== Some of our past projects ===
 * [[ATLAS|ATLAS]] (Applied Technology for Language-Aided CMS)
 * [[http://zil.ipipan.waw.pl/Automatic%20detection%20and%20correction%20of%20annotation%20errors%20in%20Polish%20language%20corpora|Automatic detection and correction of annotation errors in Polish language corpora]]
 * [[Automatic detection of semantic dependencies within verb argument structures in large treebanks|Automatic detection of semantic dependencies within verb argument structures in large treebanks]]
 * [[Automatic extraction of linguistic knowledge from a large corpus of Polish|Automatic extraction of linguistic knowledge from a large corpus of Polish]]
 * [[CESAR|CESAR]] (CEntral and South-east europeAn Resources)
 * [[http://zil.ipipan.waw.pl/Chronofleks|Chronofleks]] (A diachronic formal model of Polish inflection and its implementation)
 * [[CLARIN|CLARIN]] (Polish chapter of [[http://www.clarin.eu/|Common Language Resources and Technology Infrastructure]], see also [[http://clip.ipipan.waw.pl/CLARIN-PL-2|CLARIN-PL 2]])
 * [[http://zil.ipipan.waw.pl/CoDeS|CoDeS]] (Compositional distributional semantic models for identification, discrimination and disambiguation of senses in Polish texts)
 * [[Construction of a treebank for Polish using automatic syntactic analysis|Construction of a treebank for Polish using automatic syntactic analysis]]
 * [[CORE|CORE]] (Computer-based methods for coreference resolution in Polish texts)
 * [[http://clip.ipipan.waw.pl/COTHEC|COTHEC]] (Unified theory of coreference in Polish and its corpus-based verification)
 * [[HPSG Grammar of Polish|HPSG Grammar of Polish]]
 * [[Information Extraction from Polish free text|Information Extraction from Polish free text]]
 * [[IPI PAN Corpus|IPI PAN Corpus of Polish]]
 * [[http://clip.ipipan.waw.pl/KORBA|KORBA]] (Electronic corpus of 17th and 18th century Polish texts)
 * [[LT4eL|LT4eL]] (Language Technology for eLearning)
 * [[LUNA|LUNA]] (spoken Language UNderstanding in multilinguAl communication systems) with the Polish support
 * [[NEKST|NEKST]] (An adaptive system to support problem-solving on the basis of document collections in the Internet)
 * [[NKJP|NKJP]] (National Corpus of Polish)
 * [[http://zil.ipipan.waw.pl/OPTA|OPTA]] (Automatic methods for recognising opinion targets and opinion expressions in Polish)
 * [[PARSEME|PARSEME]] (PARSing and Multi-word Expressions. Towards linguistic precision and computational efficiency in natural language processing)
 * [[http://clip.ipipan.waw.pl/Parthenos|Parthenos]] (Pooling Activities, Resources and Tools for Heritage, E-research Networking, Optimization and Synergies)
 * [[http://clip.ipipan.waw.pl/Readability|Readability]] (Measuring the degree of readability of nonliterary Polish texts)
 * [[SYNAT|SYNAT]] (Creation of a universal, open repository platform for hosting and communication of networked resources of knowledge for science, education and open knowledge-based society)
 * [[Test Suite of Polish Utterances|Treebank / Test Suite of Polish Utterances]]
 * [[http://clip.ipipan.waw.pl/TextLink|TextLink]] (Structuring Discourse in Multilingual Europe)
 * [[http://clip.ipipan.waw.pl/TrendMiner|TrendMiner]] (Large-scale, Cross-lingual Trend Mining and Summarisation of Real-time Media Streams)

== Publicly available tools and resources ==

Here are some of the tools and resources created within our projects. See the [[http://clip.ipipan.waw.pl/|CLIP]] pages for a more exhaustive list of Polish tools and resources, including others developed at ZIL IPI PAN.

Some '''tools''' (all open source, under [[http://www.gnu.org/copyleft/gpl.html|GPL]]; see also [[http://clip.ipipan.waw.pl/|CLIP]]):
 * [[http://morfeusz.sgjp.pl/|Morfeusz 2]] – a morphological analyser of Polish,
 * [[http://zil.ipipan.waw.pl/Spejd/|Spejd]] – a shallow parsing and disambiguation system,
 * [[http://zil.ipipan.waw.pl/%C5%9Awigra|Świgra]] – a DCG parser,
 * [[https://github.com/360er0/COMBO|COMBO]] – a language-independent system for natural language preprocessing (i.e. morphosyntactic tagging, lemmatisation, dependency parsing and thematic role labelling),
 * [[http://zil.ipipan.waw.pl/Concraft|Concraft]] – a CRF morphosyntactic tagger of Polish compatible with Morfeusz SGJP,
 * [[http://zil.ipipan.waw.pl/PANTERA|PANTERA]] – a morphosyntactic tagger for Polish,
 * [[http://nlp.pwr.wroc.pl/takipi/|TaKIPI]] – a morphosyntactic tagger for Polish,
 * [[http://poliqarp.sourceforge.net/|Poliqarp]] – a corpus indexing and search engine,
 * [[https://sourceforge.net/projects/poliqarp2/|Poliqarp2]] – a new-generation corpus indexing and search engine,
 * [[http://sourceforge.net/projects/dendrarium/|Dendrarium]] – a treebank development system (under development),
 * [[http://zil.ipipan.waw.pl/Anotatornia2/|Anotatornia 2]] – an annotation tool geared towards historical corpora,
 * [[http://zil.ipipan.waw.pl/WSDDE|WSDDE]] – a system for designing and performing Word Sense Disambiguation experiments,
 * [[http://multiservice.nlp.ipipan.waw.pl/|Multiservice]] – a web service for several of our tools,
 * [[http://zil.ipipan.waw.pl/TermoPL|TermoPL]] – a tool for extracting multiword terms from text,
 * [[http://dsmodels.nlp.ipipan.waw.pl/sim1.html|DSmodels]] – a web service for calculating word similarity using Polish word embeddings.

Main '''resources''' (many more at [[http://clip.ipipan.waw.pl/|CLIP]]):
 * [[http://walenty.ipipan.waw.pl/|Walenty]] – a valence dictionary of Polish (described [[http://zil.ipipan.waw.pl/Walenty|here]]),
 * [[http://nkjp.pl/index.php?page=0&lang=1|National Corpus of Polish]],
 * [[http://zil.ipipan.waw.pl/CoDeS|Polish word embeddings based on NKJP and Wikipedia]],
 * Polish dependency banks: [[http://zil.ipipan.waw.pl/PDB|PDB]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/PDB-UD_current|PDB-UD]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/PUD-PL_current|PUD-PL]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/NKJP1M-UD_current|NKJP1M-UD]],
 * [[http://zil.ipipan.waw.pl/PDB/PDBparser|Dependency parsing models for Polish]].

== Other activities ==

Links to some other activities of the Group:
 * [[http://jlm.ipipan.waw.pl/|Journal of Language Modelling]]
 * [[http://zil.ipipan.waw.pl/seminar|NLP Seminar at IPI PAN]]
 * [[http://poleval.pl/|PolEval]], the evaluation campaign for natural language processing tools for Polish
 * conferences organised by the Group:
  * [[http://iis.ipipan.waw.pl/|Intelligent Information Systems]] series of conferences
  * [[http://poltal.ipipan.waw.pl/|9th International Conference on Natural Language Processing]] (PolTAL 2014), 17–19 September 2014, Warsaw, Poland
  * [[http://tlt14.ipipan.waw.pl/|14th International Workshop on Treebanks and Linguistic Theories]] (TLT14), 11–12 December 2015, Warsaw, Poland
  * [[http://corbon.nlp.ipipan.waw.pl/2016/|Coreference Resolution Beyond OntoNotes]] (CORBON 2016) workshop at [[http://naacl.org/naacl-hlt-2016/|NAACL 2016]] (The 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies), 16 June 2016, San Diego, USA
  * [[http://headlex16.ipipan.waw.pl/|Joint 2016 Conference on Head-driven Phrase Structure Grammar and Lexical Functional Grammar]] (!HeadLex16), 24–29 July 2016, Warsaw, Poland
  * [[http://corbon.nlp.ipipan.waw.pl/|2nd Workshop on Coreference Resolution Beyond OntoNotes]] (CORBON 2017) at [[http://eacl2017.org/|EACL 2017]] (The 15th Conference of the European Chapter of the Association for Computational Linguistics), 4 April 2017, Valencia, Spain
  * [[http://anawiki.essex.ac.uk/dali/crac18/|Computational Models of Reference, Anaphora, and Coreference]] workshop (CRAC) at [[http://naacl2018.org/|NAACL 2018]] (The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies), 6 June 2018, New Orleans, USA
  * [[https://nlpday.pl/|AI & NLP Workshop Day]], 19 October 2018, Warsaw
  * [[https://sites.google.com/view/crac2019/|Second Workshop on Computational Models of Reference, Anaphora and Coreference]] (CRAC 2019), 6 or 7 June 2019, Minneapolis
  * [[http://www.dynamicsoflanguage.edu.au/lfg-2019/|The 24th International Lexical-Functional Grammar Conference]] (LFG19), 8–10 July 2019, Canberra
  * [[https://lfg20.w.uib.no/|The 25th International Lexical-Functional Grammar Conference]] (LFG20), 23–25 June 2020, online
  * [[https://typo.uni-konstanz.de/lfg2021/|The 26th International Lexical-Functional Grammar Conference]] (LFG21), 13–15 July 2021, online

== Selected publications ==