Diff for "Scwad"

Differences between revisions 13 and 15 (spanning 2 versions)
Revision 13 as of 2019-03-27 10:10:15
Size: 3384
Comment:
Revision 15 as of 2019-03-27 10:11:35
Size: 3344
Comment:
Line 21: Line 21:
Revision 13:  * [[https://github.com/360er0/COMBO|COMBO]], the jointly trained neural tagger, morphological analyser, lemmatizer and dependency parser ranked 3rd/4th in the [[http://universaldependencies.org/conll18/results.html|CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies]].
Revision 15:  * [[https://github.com/360er0/COMBO|COMBO]], the jointly trained neural tagger, morphological analyser, lemmatizer and dependency parser ranked 3rd/4th in the [[http://universaldependencies.org/conll18/results.html|CoNLL 2018 Universal Dependencies shared task]].
Line 28: Line 28:
Revision 13:  * [[http://zil.ipipan.waw.pl/Scwad/AIDe|AIDe - Corpus of Annotated Image Descriptions]] (Wróblewska, 2018b)
Revision 15:  * [[http://zil.ipipan.waw.pl/Scwad/AIDe|AIDe - Corpus of Annotated Image Descriptions]] (Wróblewska, 2018a)
Line 31: Line 31:
Revision 13:  * [[https://github.com/360er0/COMBO|COMBO]] - the jointly trained neural tagger, morphological analyser, lemmatizer and dependency parser (Rybak and Wróblewska, 2018). The [[http://zil.ipipan.waw.pl/PDB/PDBparser|COMBO models for Polish]] trained on [[http://zil.ipipan.waw.pl/PDB|Polish Dependency Bank]] (Wróblewska, 2018) are publicly available.
Revision 15:  * [[https://github.com/360er0/COMBO|COMBO]] - the jointly trained neural tagger, morphological analyser, lemmatizer and dependency parser (Rybak and Wróblewska, 2018). The [[http://zil.ipipan.waw.pl/PDB/PDBparser|COMBO models for Polish]] trained on [[http://zil.ipipan.waw.pl/PDB|Polish Dependency Bank]] (Wróblewska, 2018b) are publicly available.
Line 37: Line 37:
<<BibMate(key, "wro:18a", omitYears=true)>>
Line 38: Line 39:
<<BibMate(key, "wro:18a", omitYears=true)>>

Scwad project

Project factsheet

English name: Compositional distributional modelling of Polish language semantics

Polish name: Kompozycyjno-dystrybucyjne modelowanie semantyki języka polskiego

Project type: The National Science Centre SONATA 8 grant 2014/15/D/HS2/03486

Duration: 30 September 2015 ‒ 29 September 2018 (extended to 30 September 2019)

Principal investigator: Alina Wróblewska

Project summary

Within the project, basic research will be conducted on compositional distributional semantics as a way of modelling the meaning of phrases and sentences. A compositional distributional semantic model aims to determine the meaning of a phrase or sentence by composing the distributional vectors of its words, and to produce a vector representation of that meaning. The degree of similarity between two vectors that belong to the same vector space but represent the meanings of different sentences can then be estimated with a similarity measure. For Polish, this problem has not yet been studied, either by us or by other members of the natural language processing community in Poland. In these pioneering studies, we will investigate whether compositional distributional semantic models can be estimated for languages with a complex inflectional system and relatively free word order, such as Polish.
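
To make the idea of composing word vectors and comparing the resulting sentence vectors concrete, here is a minimal sketch in Python. It is not the project's method: it assumes the simplest possible composition function (element-wise addition) and cosine similarity as the similarity measure, and the three-dimensional toy embeddings and the function names are invented purely for illustration.

```python
import math

# Toy 3-dimensional word vectors. The values are made up for illustration
# and do not come from any trained distributional model.
TOY_EMBEDDINGS = {
    "pies": [0.9, 0.1, 0.0],      # "dog"
    "kot": [0.8, 0.2, 0.1],       # "cat"
    "biegnie": [0.1, 0.9, 0.2],   # "runs"
    "śpi": [0.0, 0.3, 0.9],       # "sleeps"
}


def compose_additive(word_vectors):
    """Compose word vectors into one vector by element-wise addition.

    Additive composition is only a simple baseline (an assumption of this
    sketch); it ignores word order and syntactic structure.
    """
    if not word_vectors:
        raise ValueError("cannot compose an empty list of vectors")
    dim = len(word_vectors[0])
    return [sum(vec[i] for vec in word_vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for zero vectors in this sketch
    return dot / (norm_u * norm_v)


def sentence_vector(tokens, embeddings=TOY_EMBEDDINGS):
    """Look up each token and compose the vectors into a sentence vector."""
    return compose_additive([embeddings[token] for token in tokens])


# Sentences sharing a verb should come out closer than sentences that do not.
sim_same_verb = cosine_similarity(
    sentence_vector(["pies", "biegnie"]),
    sentence_vector(["kot", "biegnie"]),
)
sim_different_verb = cosine_similarity(
    sentence_vector(["pies", "biegnie"]),
    sentence_vector(["kot", "śpi"]),
)
print(sim_same_verb > sim_different_verb)
```

Because addition is commutative, this baseline assigns the same vector to every permutation of a sentence's words. That makes it indifferent to word order, but it also means it cannot use the inflectional information that signals grammatical roles in Polish, which is one reason richer composition functions are of interest.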

Successes

Resources

Tools

  • COMBO - the jointly trained neural tagger, morphological analyser, lemmatizer and dependency parser (Rybak and Wróblewska, 2018). The COMBO models for Polish trained on the Polish Dependency Bank (Wróblewska, 2018b) are publicly available.

  • Toygger - morphosyntactic disambiguator of Polish (Krasnowska-Kieraś, 2017)

Publications

List of publications

Piotr Rybak and Alina Wróblewska. Semi-supervised neural system for tagging, parsing and lematization. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 45–54. Association for Computational Linguistics, 2018.

Piotr Rybak and Alina Wróblewska. Semi-supervised neural system for tagging, parsing and lemmatization. Addendum. In Proceedings of the PolEval 2018 Workshop, pages 49–51. Institute of Computer Science, Polish Academy of Sciences, 2018.

Alina Wróblewska. Polish corpus of annotated descriptions of images. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pages 2141–2146. European Language Resources Association (ELRA), 2018.

Alina Wróblewska. Extended and enhanced Polish dependency bank in Universal Dependencies format. In Marie-Catherine de Marneffe, Teresa Lynn, and Sebastian Schuster, editors, Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), pages 173–182. Association for Computational Linguistics, 2018.

Alina Wróblewska and Katarzyna Krasnowska-Kieraś. Polish evaluation dataset for compositional distributional semantics models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 784–792, Vancouver, Canada, 2017. Association for Computational Linguistics.

Alina Wróblewska, Katarzyna Krasnowska-Kieraś, and Piotr Rybak. Towards the evaluation of feature embedding models of the fusional languages. In Zygmunt Vetulani and Patrick Paroubek, editors, Proceedings of the 8th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 420–424, Poznań, Poland, 2017. Fundacja Uniwersytetu im. Adama Mickiewicza w Poznaniu.

Katarzyna Krasnowska-Kieraś. Morphosyntactic disambiguation for Polish with bi-LSTM neural networks. In Zygmunt Vetulani and Patrick Paroubek, editors, Proceedings of the 8th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, pages 367–371, Poznań, Poland, 2017. Fundacja Uniwersytetu im. Adama Mickiewicza w Poznaniu.