
Diff for "ZILStart"

Differences between revisions 151 and 208 (spanning 57 versions)
Revision 151 as of 2018-10-23 09:43:01
Size: 13427
Comment:
Revision 208 as of 2020-11-17 12:20:58
Size: 15826
Comment:
Line 4: Line 4:
The Linguistic Engineering (LE) Group is part of the [[http://www.ipipan.waw.pl/en/dept/dept-ai.html|Department of Artificial Intelligence]] at the [[http://www.ipipan.waw.pl/en/|Institute of Computer Science]], [[http://www.english.pan.pl/|Polish Academy of Sciences]] (ICS PAS). The Linguistic Engineering (LE) Group is part of the [[http://www.ipipan.waw.pl/en/dept/dept-ai.html|Department of Artificial Intelligence]] at the [[http://www.ipipan.waw.pl/en/|Institute of Computer Science]], [[http://www.english.pan.pl/|Polish Academy of Sciences]] (IPI PAN).
Line 12: Line 12:
|| Zbigniew Gawłowicz || [[mailto:zbigniew.gawlowicz@ipipan.waw.pl|zbigniew.gawlowicz@ipipan.waw.pl]] ||
Line 14: Line 13:
|| [[http://zil.ipipan.waw.pl/KonradKaczynski|Konrad Kaczyński]], MSc || [[mailto:konrad.kaczynski@ipipan.waw.pl|konrad.kaczynski@ipipan.waw.pl]] ||
Line 15: Line 15:
|| [[http://zil.ipipan.waw.pl/MateuszKlimaszewski|Mateusz Klimaszewski]], MSc || [[mailto:mk.klimaszewski@gmail.com|mk.klimaszewski@gmail.com]] ||
Line 17: Line 18:
|| [[http://zil.ipipan.waw.pl/KatarzynaKrasnowska|Katarzyna Krasnowska]], MSc        || [[mailto:katarzyna.krasnowska@ipipan.waw.pl|katarzyna.krasnowska@ipipan.waw.pl]] || || [[http://zil.ipipan.waw.pl/KatarzynaKrasnowska|Katarzyna Krasnowska-Kieraś]], MSc || [[mailto:katarzyna.krasnowska@ipipan.waw.pl|katarzyna.krasnowska@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MalgorzataMaciejewska|Małgorzata Maciejewska]], PhD || [[mailto:m.maciejewska@yahoo.co.uk|m.maciejewska@yahoo.co.uk]] ||
Line 21: Line 23:
|| [[http://zil.ipipan.waw.pl/MaciejOgrodniczuk|Maciej Ogrodniczuk]], PhD, Head of the Group || [[mailto:maciej.ogrodniczuk@ipipan.waw.pl|maciej.ogrodniczuk@ipipan.waw.pl]] || || [[http://zil.ipipan.waw.pl/MaciejOgrodniczuk|Maciej Ogrodniczuk]], PhD, Assoc. Prof., Head of the Group || [[mailto:maciej.ogrodniczuk@ipipan.waw.pl|maciej.ogrodniczuk@ipipan.waw.pl]] ||
Line 23: Line 25:
|| [[http://zil.ipipan.waw.pl/AdamPrzepiorkowski|Adam Przepiórkowski]], PhD, Assoc. Prof. || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] || || [[http://zil.ipipan.waw.pl/AdamPrzepiorkowski|Adam Przepiórkowski]], PhD, Full Prof. || [[mailto:adam.przepiorkowski@ipipan.waw.pl|adam.przepiorkowski@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/PiotrPrzybyla|Piotr Przybyła]], PhD || [[mailto:piotr.przybyla@ipipan.waw.pl|piotr.przybyla@ipipan.waw.pl]] ||
Line 26: Line 29:
|| [[http://zil.ipipan.waw.pl/MarcinWolinski|Marcin Woliński]], PhD               || [[mailto:marcin.wolinski@ipipan.waw.pl|marcin.wolinski@ipipan.waw.pl]] || || Grzegorz Wojdyga, MSc || [[mailto:g.wojdyga@ipipan.waw.pl|g.wojdyga@ipipan.waw.pl]] ||
|| [[http://zil.ipipan.waw.pl/MarcinWolinski|Marcin Woliński]], PhD, Assoc. Prof. || [[mailto:marcin.wolinski@ipipan.waw.pl|marcin.wolinski@ipipan.waw.pl]] ||
Line 32: Line 36:
|| Piotr Rybak || [[mailto:piotr.cezary.rybak@gmail.com|piotr.cezary.rybak@gmail.com]] ||
|| Filip Stefaniuk || [[mailto:filip.stefaniuk@gmail.com|filip.stefaniuk@gmail.com]] ||
|| Jakub Szymanik, PhD || [[mailto:jakub.szymanik@gmail.com|jakub.szymanik@gmail.com]] ||
|| Grzegorz Wojdyga, MSc || [[mailto:g.wojdyga@gmail.com|g.wojdyga@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/BeataWojtowicz|Beata Wójtowicz]], PhD || [[mailto:beata.wojtowicz@ipipan.waw.pl|beata.wojtowicz@ipipan.waw.pl]] ||
|| Jakub Piskorski, PhD || [[mailto:jpiskorski@gmail.com|jpiskorski@gmail.com]] ||
|| Piotr Rybak || [[mailto:piotr.cezary.rybak@gmail.com|piotr.cezary.rybak@gmail.com]] ||
|| Jakub Szymanik, PhD || [[mailto:jakub.szymanik@gmail.com|jakub.szymanik@gmail.com]] ||
|| [[http://zil.ipipan.waw.pl/BeataWojtowicz|Beata Wójtowicz]], PhD, Assoc. Prof. || [[mailto:beata.wojtowicz@ipipan.waw.pl|beata.wojtowicz@ipipan.waw.pl]] ||
Line 43: Line 46:
 * (Polish) corpus linguistics; cf. the [[http://korpus.pl/en/|IPI PAN Corpus of Polish]] and the [[http://nkjp.pl/|National Corpus of Polish]],
 * syntactic and semantic parsing of Polish; cf. [[http://zil.ipipan.waw.pl/Spejd/|Spejd]] and [[http://nlp.ipipan.waw.pl/~wolinski/swigra/|Świgra]],
 * (Polish) corpus linguistics ([[http://nkjp.pl/|National Corpus of Polish]]), /* ; cf. the [[http://korpus.pl/en/|IPI PAN Corpus of Polish]] and the [[http://nkjp.pl/|National Corpus of Polish]], */
 * morphosyntactic tagging and lemmatisation of Polish,
 * syntactic and semantic parsing of Polish,
Line 47: Line 51:
 * distributional semantics and compositional distributional semantics,
Line 48: Line 53:
 * morphosyntactic system of Polish,  * credibility assessment of online content,
 /* * morphosyntactic system of Polish, */
Line 56: Line 62:
 * [[http://zil.ipipan.waw.pl/Chronofleks|Chronofleks]] (A diachronic formal model of Polish inflection and its implementation)
Line 58: Line 63:
 * [[CORMETAN]] (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)
 * [[http://clip.ipipan.waw.pl/CURLICAT|CURLICAT]] (Curated Multilingual Language Resources for CEF AT)
Line 59: Line 66:
 * [[http://clip.ipipan.waw.pl/ELG|ELG]] (European Language Grid)
Line 60: Line 68:
 * [[http://clip.ipipan.waw.pl/KORBA-2|KORBA 2]] (Extension of the "Electronic corpus of 17th and 18th century Polish texts" and its integration with the "Electronic Dictionary of the 17th–18th Century Polish")
 * [[HOMADOS|HOMADOS]] (Hampering Misinformation by Assessing Credibility of Online Sources)
Line 61: Line 71:
 * [[http://clip.ipipan.waw.pl/Nexus|Nexus Linguarum]] (European network for Web-centred linguistic data science)
Line 73: Line 84:
 * [[http://zil.ipipan.waw.pl/Chronofleks|Chronofleks]] (A diachronic formal model of Polish inflection and its implementation)
Line 99: Line 111:
 * [[http://nlp.ipipan.waw.pl/~wolinski/swigra/|Świgra]] – a DCG parser,  * [[http://morfeusz.sgjp.pl/|Morfeusz 2]] – a morphological analyser of Polish,
Line 101: Line 113:
 * [[http://zil.ipipan.waw.pl/%C5%9Awigra|Świgra]] – a DCG parser,
 * [[https://github.com/360er0/COMBO|COMBO]] – a language-independent system for natural language preprocessing (i.e. morphosyntactic tagging, lemmatisation, dependency parsing and thematic role labelling),
 * [[http://zil.ipipan.waw.pl/Concraft|Concraft]] — a CRF morphosyntactic tagger of Polish compatible with Morfeusz SGJP,
 * [[http://zil.ipipan.waw.pl/PANTERA|PANTERA]] – a morphosyntactic tagger for Polish,
Line 102: Line 118:
 * [[http://zil.ipipan.waw.pl/PANTERA|PANTERA]] – a morphosyntactic tagger for Polish,
Line 106: Line 121:
 * [[http://zil.ipipan.waw.pl/Anotatornia/|Anotatornia]] – a system for multi-level manual annotation of corpora (forthcoming),  * [[http://zil.ipipan.waw.pl/Anotatornia2/|Anotatornia 2]] – an annotation tool geared towards historical corpora,
Line 111: Line 126:
 * [[http://nlp.ipipan.waw.pl/PPJP/|etc.]]
Line 118: Line 133:
 * [[http://zil.ipipan.waw.pl/CoDeS|Polish word embeddings based on NKJP and Wikipedia]].  * [[http://zil.ipipan.waw.pl/CoDeS|Polish word embeddings based on NKJP and Wikipedia]],
 * Polish dependency banks: [[http://zil.ipipan.waw.pl/PDB|PDB]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/PDB-UD_current|PDB-UD]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/PUD-PL_current|PUD-PL]], [[http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/NKJP1M-UD_current|NKJP1M-UD]],
 * [[http://zil.ipipan.waw.pl/PDB/PDBparser|Dependency parsing models for Polish]].
Line 128: Line 145:
 * [[http://poleval.pl/|PolEval]], the evaluation campaign for natural language processing tools for Polish
Line 135: Line 153:
  * [[http://anawiki.essex.ac.uk/dali/crac18/|Computational Models of Reference, Anaphora, and Coreference]] workshop (CRAC) at [[http://naacl2018.org/|NAACL 2017]] (The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies), 6 June 2018, New Orleans, USA   * [[http://anawiki.essex.ac.uk/dali/crac18/|Computational Models of Reference, Anaphora, and Coreference]] workshop (CRAC) at [[http://naacl2018.org/|NAACL 2018]] (The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies), 6 June 2018, New Orleans, USA
Line 137: Line 155:
  * [[https://sites.google.com/view/crac2019/|Second Workshop on Computational Models of Reference, Anaphora and Coreference]] (CRAC 2019), 6 or 7 June 2019, Minneapolis

The Linguistic Engineering Group

The Linguistic Engineering (LE) Group is part of the Department of Artificial Intelligence at the Institute of Computer Science, Polish Academy of Sciences (IPI PAN).

People

Core team

Anna Andrzejczuk, PhD

anna.andrzejczuk@ipipan.waw.pl

Tomasz Bartosiak, MSc

tomasz.bartosiak@gmail.com

Elżbieta Hajnicz, PhD, Assoc. Prof.

elzbieta.hajnicz@ipipan.waw.pl

Konrad Kaczyński, MSc

konrad.kaczynski@ipipan.waw.pl

Witold Kieraś, PhD

witold.kieras@ipipan.waw.pl

Mateusz Klimaszewski, MSc

mk.klimaszewski@gmail.com

Łukasz Kobyliński, PhD

lukasz.kobylinski@ipipan.waw.pl

Dorota Komosińska, MSc

dorota.komosinska@gmail.com

Katarzyna Krasnowska-Kieraś, MSc

katarzyna.krasnowska@ipipan.waw.pl

Małgorzata Maciejewska, PhD

m.maciejewska@yahoo.co.uk

Małgorzata Marciniak, PhD, Assoc. Prof.

malgorzata.marciniak@ipipan.waw.pl

Agnieszka Mykowiecka, PhD, Assoc. Prof.

agnieszka.mykowiecka@ipipan.waw.pl

Bartłomiej Nitoń, MSc

bartek.niton@gmail.com

Maciej Ogrodniczuk, PhD, Assoc. Prof., Head of the Group

maciej.ogrodniczuk@ipipan.waw.pl

Agnieszka Patejuk, PhD

agnieszka.patejuk@ipipan.waw.pl

Adam Przepiórkowski, PhD, Full Prof.

adam.przepiorkowski@ipipan.waw.pl

Piotr Przybyła, PhD

piotr.przybyla@ipipan.waw.pl

Piotr Rychlik, PhD

piotr.rychlik@ipipan.waw.pl

Aleksander Wawer, PhD

aleksander.wawer@ipipan.waw.pl

Grzegorz Wojdyga, MSc

g.wojdyga@ipipan.waw.pl

Marcin Woliński, PhD, Assoc. Prof.

marcin.wolinski@ipipan.waw.pl

Alina Wróblewska, PhD

alina.wroblewska@ipipan.waw.pl

Associates

Jakub Piskorski, PhD

jpiskorski@gmail.com

Piotr Rybak

piotr.cezary.rybak@gmail.com

Jakub Szymanik, PhD

jakub.szymanik@gmail.com

Beata Wójtowicz, PhD, Assoc. Prof.

beata.wojtowicz@ipipan.waw.pl

Research

The main research areas of the Group

  • (Polish) corpus linguistics (National Corpus of Polish),

  • morphosyntactic tagging and lemmatisation of Polish,
  • syntactic and semantic parsing of Polish,
  • extraction of linguistic knowledge from corpora,
  • information extraction,
  • distributional semantics and compositional distributional semantics,
  • sentiment analysis,
  • credibility assessment of online content,

  • generative linguistic formalisms, especially HPSG and LFG.

The Group is a member of CLARIN, DARIAH-PL, ELRC, FLaReNet and META-NET.

Current externally funded projects

  • CLARIN-PL (Polish chapter of Common Language Resources and Technology Infrastructure)

  • CoDeS (Compositional distributional semantic models for identification, discrimination and disambiguation of senses in Polish texts)

  • CORMETAN (Cognitive and sociocultural analysis of metaphoric expressions in Polish texts)

  • CURLICAT (Curated Multilingual Language Resources for CEF AT)

  • DARIAH-PL (Digital Research Infrastructure for the Arts and Humanities)

  • ELG (European Language Grid)

  • ELRC (European Language Resource Coordination)

  • KORBA 2 (Extension of the "Electronic corpus of 17th and 18th century Polish texts" and its integration with the "Electronic Dictionary of the 17th–18th Century Polish")

  • HOMADOS (Hampering Misinformation by Assessing Credibility of Online Sources)

  • MARCELL (Multilingual Resources for CEF.AT in the legal domain)

  • Nexus Linguarum (European network for Web-centred linguistic data science)

  • Kwantyfikatory w języku: użycie i znaczenie (Quantifiers in Language: Use and Meaning)

  • Parthenos (Pooling Activities, Resources and Tools for Heritage, E-research Networking, Optimization and Synergies)

  • Scwad (Compositional distributional modelling of Polish language semantics)

  • SYNAMET (Microcorpus of Synaesthetic Metaphors. Towards a Formal Description and Efficient Methods of Analysis of Metaphors in Discourse)

Some of our past projects

Publicly available tools and resources

Here are some of the tools and resources created within our projects. The CLIP pages give a more exhaustive list of Polish tools and resources, including others developed at ZIL IPI PAN.

Some tools (all open source, under GPL; see also CLIP):

  • Morfeusz 2 – a morphological analyser of Polish,

  • Spejd – a shallow parsing and disambiguation system,

  • Świgra – a DCG parser,

  • COMBO – a language-independent system for natural language preprocessing (i.e. morphosyntactic tagging, lemmatisation, dependency parsing and thematic role labelling),

  • Concraft – a CRF morphosyntactic tagger of Polish compatible with Morfeusz SGJP,

  • PANTERA – a morphosyntactic tagger for Polish,

  • TaKIPI – a morphosyntactic tagger for Polish,

  • Poliqarp – a corpus indexing and search engine,

  • Poliqarp2 – a new generation corpus indexing and search engine,

  • Dendrarium – a treebank development system (under development),

  • Anotatornia 2 – an annotation tool geared towards historical corpora,

  • WSDDE – a system for designing and performing Word Sense Disambiguation experiments,

  • Multiservice – a web service for several of our tools,

  • TermoPL – multiword term extraction from text,

  • DSmodels – a web service for calculating word similarity using Polish word embeddings.

Main resources (many more at CLIP):

Other activities

Links to some other activities of the Group: