Alina Wróblewska

I am an Assistant Professor in the Linguistic Engineering Group at the Institute of Computer Science, Polish Academy of Sciences (PAS). My research focuses on natural language processing, machine learning, and computational linguistics.

About Me

Curriculum Vitae

Contact details


<alina AT SPAMFREE ipipan DOT waw DOT pl>


Google Scholar, ACL Anthology, Semantic Scholar, ResearchGate, Nauka Polska



Recent projects

  • UniDive COST Action – Universality, diversity and idiosyncrasy in language technology

  • Universal Discourse: a multilingual model of discourse relations

  • CLARIN-PL – CLARIN-ERIC – European Research Infrastructure Consortium: Common Language Resources and Technology Infrastructure (Co-Investigator)

Past projects

  • PLLuM – Polish Large Language Model (Co-Investigator)

  • CLARIN-PL-BIZ – CLARIN-ERIC – European Research Infrastructure Consortium: Common Language Resources and Technology Infrastructure

  • DARIAH.Lab – Digital Research Infrastructure for the Arts and Humanities

  • Scwad (Principal Investigator)

  • NEKST (Co-Investigator)

NLP resources and tools

  • PDB – Polish Dependency Bank

  • PDB-UD (PDB in Universal Dependencies format)

  • CDSCorpus (10K Polish sentence pairs human-annotated for semantic relatedness and natural language inference)

  • COMBO-pytorch source code (language-independent system for dependency parsing, POS tagging and other pre-processing NLP tasks)

  • COMBO demo (demonstration of COMBO's part-of-speech tagging, morphological analysis, lemmatisation, and dependency parsing)

  • Scwad demo (Polish dependency parsing, semantic relatedness, and natural language inference)


List of publications


Agata Savary, Daniel Zeman, Verginica Barbu Mititelu, Anabela Barreiro, Olesea Caftanatov, Marie-Catherine de Marneffe, Kaja Dobrovoljc, Gülsen Eryiğit, Voula Giouli, Bruno Guillaume, Stella Markantonatou, Nurit Melnik, Joakim Nivre, Atul Kr. Ojha, Carlos Ramisch, Abigail Walsh, Beata Wójtowicz, and Alina Wróblewska. UniDive: A COST action on universality, diversity and idiosyncrasy in language technology. In Maite Melero, Sakriani Sakti, and Claudia Soria, editors, Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024, pages 372–382, Torino, Italy, 2024. ELRA and ICCL.

Martyna Wiącek, Piotr Rybak, Łukasz Pszenny, and Alina Wróblewska. NLPre: A revised approach towards language-centric benchmarking of natural language preprocessing systems. In Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, editors, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 12271–12287, Torino, Italy, 2024. ELRA and ICCL.

Alina Wróblewska. Investigating large language models for their competence in extracting grammatically sound sentences from transcribed noisy utterances. In Libby Barak and Malihe Alikhani, editors, Proceedings of the 28th Conference on Computational Natural Language Learning, pages 10–23, Miami, FL, 2024. Association for Computational Linguistics.


Marcin Woliński, Alina Wróblewska, Małgorzata Marciniak, Katarzyna Krasnowska-Kieraś, and Wiktor Eźlakowski. O konstrukcji …, ale nie… i podobnych w języku polskim. Język Polski, CIII(4):5–21, 2023.


Mateusz Klimaszewski and Alina Wróblewska. COMBO: A new module for EUD parsing. In Proceedings of the 17th International Conference on Parsing Technologies and the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies (IWPT 2021), pages 158–166. Association for Computational Linguistics, 2021.

Mateusz Klimaszewski and Alina Wróblewska. COMBO: State-of-the-art morphosyntactic analysis. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 50–62, Online and Punta Cana, Dominican Republic, 2021. Association for Computational Linguistics.

Robert Mroczkowski, Piotr Rybak, Alina Wróblewska, and Ireneusz Gawlik. HerBERT: Efficiently pretrained transformer-based language model for Polish. In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing, pages 1–10, Kyiv, Ukraine, 2021. Association for Computational Linguistics.


Alina Wróblewska. Towards the Conversion of National Corpus of Polish to Universal Dependencies. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 5308–5315, Marseille, France, 2020. European Language Resources Association (ELRA).

Alina Wróblewska, Katarzyna Krasnowska-Kieraś, and Piotr Rybak. Towards the evaluation of feature embedding models of the fusional languages. In Zygmunt Vetulani, Patrick Paroubek, and Marek Kubis, editors, Human Language Technology. Challenges for Computer Science and Linguistics, 8th Language and Technology Conference, LTC 2017, Poznań, Poland, November 17–19, 2017, Revised Selected Papers, number 12598 in Lecture Notes in Computer Science, pages 256–270, Cham, 2020. Springer International Publishing.


Katarzyna Krasnowska-Kieraś and Alina Wróblewska. Empirical linguistic study of sentence embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5729–5739, Florence, Italy, 2019. Association for Computational Linguistics.

Alina Wróblewska and Piotr Rybak. Dependency parsing of Polish. Poznań Studies in Contemporary Linguistics, 55(2):305–337, 2019.


Piotr Rybak and Alina Wróblewska. Semi-supervised neural system for tagging, parsing and lematization. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 45–54. Association for Computational Linguistics, 2018.

Piotr Rybak and Alina Wróblewska. Semi-supervised neural system for tagging, parsing and lemmatization. Addendum. In Proceedings of the PolEval 2018 Workshop, pages 49–51. Institute of Computer Science, Polish Academy of Sciences, 2018.

Alina Wróblewska. Polish corpus of annotated descriptions of images. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pages 2141–2146. European Language Resources Association (ELRA), 2018.

Alina Wróblewska. Results of the PolEval 2018 Shared Task 1: Dependency Parsing. In Proceedings of the PolEval 2018 Workshop, pages 11–24. Institute of Computer Science, Polish Academy of Sciences, 2018.

Alina Wróblewska. Extended and enhanced Polish dependency bank in Universal Dependencies format. In Marie-Catherine de Marneffe, Teresa Lynn, and Sebastian Schuster, editors, Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), pages 173–182. Association for Computational Linguistics, 2018.

Alina Wróblewska and Aleksandra Wieczorek. Status morfoskładniowy wyrazu jako we współczesnej polszczyźnie. Język Polski, XCVIII(3):16–30, 2018.


Alina Wróblewska and Katarzyna Krasnowska-Kieraś. Polish evaluation dataset for compositional distributional semantics models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 784–792, Vancouver, Canada, 2017. Association for Computational Linguistics.

Alina Wróblewska, Katarzyna Krasnowska-Kieraś, and Piotr Rybak. Towards the evaluation of feature embedding models of the fusional languages. In Zygmunt Vetulani and Patrick Paroubek, editors, Proceedings of the 8th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 420–424, Poznań, Poland, 2017. Fundacja Uniwersytetu im. Adama Mickiewicza w Poznaniu.


Adam Przepiórkowski and Alina Wróblewska. Supporting LFG parsing with dependency parsing. In Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, and Adam Przepiórkowski, editors, Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT 14), pages 168–178, Warsaw, 2015. Institute of Computer Science, Polish Academy of Sciences.


Alina Wróblewska. Polish Dependency Parser Trained on an Automatically Induced Dependency Bank. Ph.D. dissertation, Institute of Computer Science, Polish Academy of Sciences, Warsaw, 2014.

Alina Wróblewska and Adam Przepiórkowski. Projection-based annotation of a Polish dependency treebank. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, pages 2306–2312, Reykjavík, Iceland, 2014. European Language Resources Association (ELRA).

Alina Wróblewska and Adam Przepiórkowski. Towards a weighted induction method of dependency annotation. In Adam Przepiórkowski and Maciej Ogrodniczuk, editors, Advances in Natural Language Processing: Proceedings of the 9th International Conference on NLP, PolTAL 2014, Warsaw, Poland, September 17–19, 2014, number 8686 in Lecture Notes in Artificial Intelligence, pages 164–176. Springer International Publishing, Heidelberg, 2014.


Djamé Seddah, Reut Tsarfaty, Sandra Kübler, Marie Candito, Jinho D. Choi, Richárd Farkas, Jennifer Foster, Iakes Goenaga, Koldo Gojenola Galletebeitia, Yoav Goldberg, Spence Green, Nizar Habash, Marco Kuhlmann, Wolfgang Maier, Yuval Marton, Joakim Nivre, Adam Przepiórkowski, Ryan Roth, Wolfgang Seeker, Yannick Versley, Veronika Vincze, Marcin Woliński, Alina Wróblewska, and Éric Villemonte de la Clergerie. Overview of the SPMRL 2013 shared task: A cross-framework evaluation of parsing morphologically rich languages. In Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages, pages 146–182, Seattle, WA, 2013. Association for Computational Linguistics.

Alina Wróblewska and Piotr Sikora. Online service for Polish dependency parsing and results visualisation. In Mieczysław A. Kłopotek, Jacek Koronacki, Małgorzata Marciniak, Agnieszka Mykowiecka, and Sławomir T. Wierzchoń, editors, Language Processing and Intelligent Information Systems – 20th International Conference, IIS 2013, Warsaw, Poland, June 17–18, 2013. Proceedings, number 7912 in Lecture Notes in Computer Science, pages 39–44, Berlin, Heidelberg, 2013. Springer-Verlag.


Alina Wróblewska. Polish dependency bank. Linguistic Issues in Language Technology, 7(1), 2012.

Alina Wróblewska and Adam Przepiórkowski. Induction of dependency structures based on weighted projection. In Proceedings of the 4th International Conference on Computational Collective Intelligence Technologies and Applications (ICCCI 2012), Part I, number 7653 in Lecture Notes in Artificial Intelligence, pages 364–374, Berlin, 2012. Springer-Verlag.

Alina Wróblewska and Marcin Sydow. DEBORA: Dependency-based method for extracting entity-relationship triples from open-domain texts in Polish. In Li Chen, Alexander Felfernig, Jiming Liu, and Zbigniew W. Raś, editors, Foundations of Intelligent Systems. Proceedings of the 20th International Symposium, ISMIS 2012, Macau, China, number 7661 in Lecture Notes in Computer Science, pages 155–161, Berlin, Heidelberg, 2012. Springer-Verlag.

Alina Wróblewska and Marcin Woliński. Preliminary experiments in Polish dependency parsing. In Pascal Bouvry, Mieczysław A. Kłopotek, Franck Leprevost, Małgorzata Marciniak, Agnieszka Mykowiecka, and Henryk Rybiński, editors, Security and Intelligent Information Systems: International Joint Conference, SIIS 2011, Warsaw, Poland, June 13–14, 2011, Revised Selected Papers, number 7053 in Lecture Notes in Computer Science, pages 279–292. Springer-Verlag, 2012.


Alina Wróblewska. Polish-English Word Alignment: Preliminary Study. In Dominik Ryżko, Henryk Rybiński, Piotr Gawrysiak, and Marzena Kryszkiewicz, editors, Emerging Intelligent Technologies in Industry, volume 369 of Studies in Computational Intelligence, pages 123–132. Springer-Verlag, Berlin, 2011.


Alina Wróblewska and Anette Frank. Cross-Lingual Projection of LFG F-Structures: Building an F-Structure Bank for Polish. In Marco Passarotti, Adam Przepiórkowski, Savina Raynaud, and Frank Van Eynde, editors, Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories (TLT 8), pages 209–220, Milan, Italy, 2009.