Revision 154 as of 2018-11-18 15:27:08



The Linguistic Engineering Group

The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the Institute of Computer Science, Polish Academy of Sciences (ICS PAS).

People

Core team

Anna Andrzejczuk, PhD

anna.andrzejczuk@ipipan.waw.pl

Tomasz Bartosiak, MSc

tomasz.bartosiak@gmail.com

Zbigniew Gawłowicz

zbigniew.gawlowicz@ipipan.waw.pl

Elżbieta Hajnicz, PhD, Assoc. Prof.

elzbieta.hajnicz@ipipan.waw.pl

Witold Kieraś, PhD

witold.kieras@ipipan.waw.pl

Łukasz Kobyliński, PhD

lukasz.kobylinski@ipipan.waw.pl

Dorota Komosińska, MSc

dorota.komosinska@gmail.com

Katarzyna Krasnowska, MSc

katarzyna.krasnowska@ipipan.waw.pl

Małgorzata Marciniak, PhD, Assoc. Prof.

malgorzata.marciniak@ipipan.waw.pl

Agnieszka Mykowiecka, PhD, Assoc. Prof.

agnieszka.mykowiecka@ipipan.waw.pl

Bartłomiej Nitoń, MSc

bartek.niton@gmail.com

Maciej Ogrodniczuk, PhD, Head of the Group

maciej.ogrodniczuk@ipipan.waw.pl

Agnieszka Patejuk, PhD

agnieszka.patejuk@ipipan.waw.pl

Adam Przepiórkowski, PhD, Assoc. Prof.

adam.przepiorkowski@ipipan.waw.pl

Piotr Rychlik, PhD

piotr.rychlik@ipipan.waw.pl

Aleksander Wawer, PhD

aleksander.wawer@ipipan.waw.pl

Marcin Woliński, PhD

marcin.wolinski@ipipan.waw.pl

Alina Wróblewska, PhD

alina.wroblewska@ipipan.waw.pl

Associates

Piotr Rybak

piotr.cezary.rybak@gmail.com

Filip Stefaniuk

filip.stefaniuk@gmail.com

Jakub Szymanik, PhD

jakub.szymanik@gmail.com

Grzegorz Wojdyga, MSc

g.wojdyga@gmail.com

Beata Wójtowicz, PhD

beata.wojtowicz@ipipan.waw.pl

Research

The main research areas of the Group

  • (Polish) corpus linguistics; cf. the IPI PAN Corpus of Polish and the National Corpus of Polish,

  • syntactic and semantic parsing of Polish; cf. Spejd and Świgra,

  • extraction of linguistic knowledge from corpora,
  • information extraction,
  • sentiment analysis,
  • the morphosyntactic system of Polish,
  • generative linguistic formalisms, especially HPSG and LFG.

The Group is a member of CLARIN, DARIAH-PL, ELRC, FLaReNet and META-NET.

Current externally funded projects

  • CLARIN-PL (the Polish chapter of the Common Language Resources and Technology Infrastructure)

  • Chronofleks (A diachronic formal model of Polish inflection and its implementation)

  • CoDeS (Compositional distributional semantic models for identification, discrimination and disambiguation of senses in Polish texts)

  • CORMETAN (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)

  • DARIAH-PL (Digital Research Infrastructure for the Arts and Humanities)

  • ELRC (European Language Resource Coordination)

  • MARCELL (Multilingual Resources for CEF.AT in the legal domain)

  • Quantifiers in Language: Use and Meaning (Polish title: Kwantyfikatory w języku: użycie i znaczenie)

  • Parthenos (Pooling Activities, Resources and Tools for Heritage, E-research Networking, Optimization and Synergies)

  • Counteracting misinformation (Counteracting misinformation in digital media by deception detection and facilitating access to reliable sources using machine learning and natural language processing)

  • Scwad (Compositional distributional modelling of Polish language semantics)

  • SYNAMET (Microcorpus of Synaesthetic Metaphors. Towards a Formal Description and Efficient Methods of Analysis of Metaphors in Discourse)

Some of our past projects

Publicly available tools and resources

Below are some of the tools and resources created within our projects. The CLIP pages give a more exhaustive list of Polish tools and resources, including others developed at ZIL IPI PAN.

Some tools (all open source, under GPL; see also CLIP):

  • Świgra – a DCG parser,

  • Spejd – a shallow parsing and disambiguation system,

  • TaKIPI – a morphosyntactic tagger for Polish,

  • PANTERA – a morphosyntactic tagger for Polish,

  • Poliqarp – a corpus indexing and search engine,

  • Poliqarp2 – a new generation corpus indexing and search engine,

  • Dendrarium – a treebank development system (under development),

  • Anotatornia – a system for multi-level manual annotation of corpora (forthcoming),

  • WSDDE – a system for designing and performing Word Sense Disambiguation experiments,

  • Multiservice – a web service for several of our tools,

  • TermoPL – multiword term extraction from text,

  • DSmodels – a web service for calculating word similarity using Polish word embeddings,

  • etc.

Main resources (many more at CLIP):

Other activities

Links to some other activities of the Group: