The Linguistic Engineering Group

The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the Institute of Computer Science, Polish Academy of Sciences (IPI PAN).

People

Core team

Tomasz Bartosiak, MSc

tomasz.bartosiak@gmail.com

Diego Feinmann, PhD

diego.feinmann@ipipan.waw.pl

Elżbieta Hajnicz, PhD, Assoc. Prof.

elzbieta.hajnicz@ipipan.waw.pl

Witold Kieraś, PhD

witold.kieras@ipipan.waw.pl

Łukasz Kobyliński, PhD

lukasz.kobylinski@ipipan.waw.pl

Dorota Komosińska, MSc

dorota.komosinska@gmail.com

Katarzyna Krasnowska-Kieraś, MSc

katarzyna.krasnowska@ipipan.waw.pl

Adam Majczyk, PhD candidate

adam.majczyk@ipipan.waw.pl

Małgorzata Marciniak, PhD, Assoc. Prof.

malgorzata.marciniak@ipipan.waw.pl

Agnieszka Mykowiecka, PhD, Assoc. Prof.

agnieszka.mykowiecka@ipipan.waw.pl

Maciej Ogrodniczuk, PhD, Assoc. Prof., Head of the Group

maciej.ogrodniczuk@ipipan.waw.pl

Agnieszka Patejuk, PhD

agnieszka.patejuk@ipipan.waw.pl

Adam Przepiórkowski, PhD, Full Prof.

adam.przepiorkowski@ipipan.waw.pl

Piotr Przybyła, PhD (on postdoctoral fellowship at UPF)

piotr.przybyla@ipipan.waw.pl

Michał Rudolf, PhD

michal@rudolf.waw.pl

Piotr Rychlik, PhD

piotr.rychlik@ipipan.waw.pl

Karolina Saputa, BEng

Aleksandra Tomaszewska, MSc

aleksandra.tomaszewska@ipipan.waw.pl

Aleksander Wawer, PhD

aleksander.wawer@ipipan.waw.pl

Marcin Woliński, PhD, Assoc. Prof.

marcin.wolinski@ipipan.waw.pl

Joanna Wołoszyn, PhD

joanna.woloszyn@ipipan.waw.pl

Alina Wróblewska, PhD

alina.wroblewska@ipipan.waw.pl

Sebastian Zawada, PhD candidate

sebastian.zawada@ipipan.waw.pl

Natalia Zawadzka-Paluektau, PhD

natalia.zawadzka-paluektau@ipipan.waw.pl

Daniel Ziembicki, PhD

daniel.ziembicki@uw.edu.pl

Aleksandra Zwierzchowska, MSc

aazwierzchowska@gmail.com

Bartosz Żuk, PhD candidate

bartoszzuk.poczta@gmail.com

Associates

Anna Andrzejczuk, PhD (on leave)

anna.andrzejczuk@ipipan.waw.pl

Wiktor Eźlakowski, MSc

wiktor.ezlakowski@ipipan.waw.pl

Sonia Janicka

sonia.janicka@gmail.com

Mateusz Klimaszewski, MSc

mk.klimaszewski@gmail.com

Jakub Piskorski, PhD

jpiskorski@gmail.com

Piotr Rybak, MSc

piotr.cezary.rybak@gmail.com

Jakub Szymanik, PhD

jakub.szymanik@gmail.com

Ryszard Tuora, MSc

ryszardtuora@gmail.com

Grzegorz Wojdyga, MSc

g.wojdyga@ipipan.waw.pl

Beata Wójtowicz, PhD, Assoc. Prof.

beata.wojtowicz@ipipan.waw.pl

Research

The main research areas of the Group:

  • (Polish) corpus linguistics (National Corpus of Polish)

  • morphosyntactic tagging and lemmatisation of Polish
  • syntactic and semantic parsing of Polish
  • extraction of linguistic knowledge from corpora
  • information extraction
  • distributional semantics and compositional distributional semantics
  • sentiment analysis
  • credibility assessment of online content
  • reference and discourse relations
  • generative linguistic formalisms, especially HPSG and LFG.

The Group is a member of CLARIN, DARIAH-PL, ELRC, FLaReNet and META-NET.

Current externally funded projects

  • CLARIN-PL (Polish chapter of Common Language Resources and Technology Infrastructure)

  • CORMETAN (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)

  • CURLICAT (Curated Multilingual Language Resources for CEF AT)

  • Korpus Dekady (DARIAH-PL — Digital Research Infrastructure for the Arts and Humanities)

  • ELE (European Language Equality)

  • ELG (European Language Grid)

  • ELRC (European Language Resource Coordination)

  • HOMADOS (Hampering Misinformation by Assessing Credibility of Online Sources)

  • KORBA 2 (Extension of the "Electronic corpus of 17th and 18th century Polish texts" and its integration with the "Electronic Dictionary of the 17th–18th Century Polish")

  • Kwantyfikatory w języku: użycie i znaczenie (Quantifiers in Language: Use and Meaning)

  • MARCELL (Multilingual Resources for CEF.AT in the legal domain)

  • Nexus Linguarum (European network for Web-centred linguistic data science)

  • Scwad (Compositional distributional modelling of Polish language semantics)

  • SYNAMET (Microcorpus of Synaesthetic Metaphors. Towards a Formal Description and Efficient Methods of Analysis of Metaphors in Discourse)

Some of our past projects

Publicly available tools and resources

Here are some of the tools and resources created within our projects. See CLIP pages for a more exhaustive list of Polish tools and resources, including more tools and resources developed at ZIL IPI PAN.

Some tools (all open source, under GPL; see also CLIP):

  • Morfeusz 2 – a morphological analyser of Polish,

  • Spejd – a shallow parsing and disambiguation system,

  • Świgra – a DCG parser,

  • COMBO – a language-independent system for natural language preprocessing (i.e. morphosyntactic tagging, lemmatisation, dependency parsing and thematic role labelling),

  • Concraft – a CRF morphosyntactic tagger of Polish compatible with Morfeusz SGJP,

  • PANTERA – a morphosyntactic tagger for Polish,

  • TaKIPI – a morphosyntactic tagger for Polish,

  • Poliqarp – a corpus indexing and search engine,

  • Poliqarp2 – a new generation corpus indexing and search engine,

  • Dendrarium – a treebank development system (under development),

  • Anotatornia 2 – an annotation tool geared towards historical corpora,

  • WSDDE – a system for designing and performing Word Sense Disambiguation experiments,

  • Multiservice – a web service for several of our tools,

  • TermoPL – multiword term extraction from text,

  • DSmodels – a web service for calculating word similarity using Polish word embeddings.

Main resources (many more at CLIP):

Other activities

Links to some other activities of the Group:

Selected publications
