
Diff for "ZILStart"

Differences between revisions 194 and 202 (spanning 8 versions)
Revision 194 as of 2020-01-06 21:00:58
Size: 15179
Revision 202 as of 2020-08-28 09:47:11
Size: 15489
Line 34: Line 34:
|| Zbigniew Gawłowicz, BSc || [[mailto:zbigniew.gawlowicz@ipipan.waw.pl|zbigniew.gawlowicz@ipipan.waw.pl]] ||
|| Jakub Piskorski, PhD || [[mailto:jpiskorski@gmail.com|jpiskorski@gmail.com]] ||
Line 62: Line 62:
 * [[http://clip.ipipan.waw.pl/CURLICAT|CURLICAT]] (Curated Multilingual Language Resources for CEF AT)
Line 108: Line 109:
 * [[http://morfeusz.sgjp.pl/|Morfeusz]] – a morphological analyser of Polish,
 * [[http://morfeusz.sgjp.pl/|Morfeusz 2]] – a morphological analyser of Polish,
Line 110: Line 111:
 * [[http://nlp.ipipan.waw.pl/~wolinski/swigra/|Świgra]] – a DCG parser,
 * [[http://zil.ipipan.waw.pl/%C5%9Awigra|Świgra]] – a DCG parser,
Line 112: Line 113:
 * [[http://zil.ipipan.waw.pl/Concraft|Concraft]] — a CRF morphosyntactic tagger of Polish compatible with Morfeusz SGJP,
 * [[http://zil.ipipan.waw.pl/PANTERA|PANTERA]] – a morphosyntactic tagger for Polish,
Line 113: Line 116:
 * [[http://zil.ipipan.waw.pl/PANTERA|PANTERA]] – a morphosyntactic tagger for Polish,
Line 141: Line 143:
 * [[http://poleval.pl/|PolEval]], the evaluation campaign for natural language processing tools for Polish

The Linguistic Engineering Group

The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the Institute of Computer Science, Polish Academy of Sciences (ICS PAS).


Core team

Anna Andrzejczuk, PhD


Tomasz Bartosiak, MSc


Elżbieta Hajnicz, PhD, Assoc. Prof.


Witold Kieraś, PhD


Łukasz Kobyliński, PhD


Dorota Komosińska, MSc


Katarzyna Krasnowska-Kieraś, MSc


Małgorzata Maciejewska, PhD


Małgorzata Marciniak, PhD, Assoc. Prof.


Agnieszka Mykowiecka, PhD, Assoc. Prof.


Bartłomiej Nitoń, MSc


Maciej Ogrodniczuk, PhD, Assoc. Prof., Head of the Group


Agnieszka Patejuk, PhD


Adam Przepiórkowski, PhD, Full Prof.


Piotr Przybyła, PhD


Piotr Rychlik, PhD


Aleksander Wawer, PhD


Grzegorz Wojdyga, MSc


Marcin Woliński, PhD, Assoc. Prof.


Alina Wróblewska, PhD



Jakub Piskorski, PhD


Piotr Rybak


Jakub Szymanik, PhD


Beata Wójtowicz, PhD, Assoc. Prof.



The main research areas of the Group

  • (Polish) corpus linguistics (National Corpus of Polish),

  • morphosyntactic tagging and lemmatisation of Polish,
  • syntactic and semantic parsing of Polish,
  • extraction of linguistic knowledge from corpora,
  • information extraction,
  • distributional semantics and compositional distributional semantics,
  • sentiment analysis,
  • credibility assessment of online content,

  • generative linguistic formalisms, especially HPSG and LFG.

The Group is a member of CLARIN, DARIAH-PL, ELRC, FLaReNet and META-NET.

Current externally funded projects

  • CLARIN-PL (Polish chapter of Common Language Resources and Technology Infrastructure)

  • CoDeS (Compositional distributional semantic models for identification, discrimination and disambiguation of senses in Polish texts)

  • CORMETAN (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)

  • CURLICAT (Curated Multilingual Language Resources for CEF AT)

  • DARIAH-PL (Digital Research Infrastructure for the Arts and Humanities)

  • ELG (European Language Grid)

  • ELRC (European Language Resource Coordination)

  • KORBA 2 (Extension of the "Electronic corpus of 17th and 18th century Polish texts" and its integration with the "Electronic Dictionary of the 17th–18th Century Polish")

  • HOMADOS (Hampering Misinformation by Assessing Credibility of Online Sources)

  • MARCELL (Multilingual Resources for CEF.AT in the legal domain)

  • Nexus Linguarum (European network for Web-centred linguistic data science)

  • Quantifiers in Language: Use and Meaning (Kwantyfikatory w języku: użycie i znaczenie)

  • Parthenos (Pooling Activities, Resources and Tools for Heritage, E-research Networking, Optimization and Synergies)

  • Scwad (Compositional distributional modelling of Polish language semantics)

  • SYNAMET (Microcorpus of Synaesthetic Metaphors. Towards a Formal Description and Efficient Methods of Analysis of Metaphors in Discourse)

Some of our past projects

Publicly available tools and resources

Here are some of the tools and resources created within our projects. See the CLIP pages for a more exhaustive list of Polish tools and resources, including others developed at ZIL IPI PAN.

Some tools (all open source, under GPL; see also CLIP):

  • Morfeusz 2 – a morphological analyser of Polish,

  • Spejd – a shallow parsing and disambiguation system,

  • Świgra – a DCG parser,

  • COMBO – a language-independent system for natural language preprocessing (i.e. morphosyntactic tagging, lemmatisation, dependency parsing and thematic role labelling),

  • Concraft – a CRF morphosyntactic tagger of Polish compatible with Morfeusz SGJP,

  • PANTERA – a morphosyntactic tagger for Polish,

  • TaKIPI – a morphosyntactic tagger for Polish,

  • Poliqarp – a corpus indexing and search engine,

  • Poliqarp2 – a new generation corpus indexing and search engine,

  • Dendrarium – a treebank development system (under development),

  • Anotatornia 2 – an annotation tool geared towards historical corpora,

  • WSDDE – a system for designing and performing Word Sense Disambiguation experiments,

  • Multiservice – a web service for several of our tools,

  • TermoPL – multiword term extraction from text,

  • DSmodels – a web service for calculating word similarity using Polish word embeddings.

Main resources (many more at CLIP):

Other activities

Links to some other activities of the Group: