Differences between revisions 3 and 108 (spanning 105 versions)

Natural Language Processing Seminar 2017–2018

The NLP Seminar is organised by the Linguistic Engineering Group at the Institute of Computer Science, Polish Academy of Sciences (ICS PAS). It takes place on (some) Mondays, normally at 10:15 am, in the seminar room of the ICS PAS (ul. Jana Kazimierza 5, Warszawa). All recorded talks are available on YouTube.

2 October 2017

Paweł Rutkowski (University of Warsaw)

Polish Sign Language from the perspective of corpus linguistics

Polish Sign Language (polski język migowy, PJM) is a full-fledged visual-spatial language used by the Polish Deaf community. It started to evolve in the second decade of the nineteenth century, with the foundation of the first school for the deaf in Poland. Until recently, PJM attracted very little attention from the linguistic community in Poland. The aim of this talk is to present a large-scale research project aimed at creating an extensive and representative corpus of PJM. The corpus is currently being compiled at the University of Warsaw. It is a collection of video clips showing Deaf people using PJM in a variety of different communication contexts. The videos are richly annotated: they are segmented, lemmatized, translated into Polish, tagged for various grammatical features and transcribed with HamNoSys symbols. The Corpus of PJM is currently one of the two largest sets of annotated sign language data in the world. Special attention will be paid to the issue of lexical frequency in PJM. Studies of this type are available for a handful of sign languages only, including American Sign Language, New Zealand Sign Language, British Sign Language, Australian Sign Language and Slovene Sign Language. Their empirical basis ranged from 100,000 tokens (NZSL) to as little as 4,000 tokens (ASL). The present talk contributes to our understanding of lexical frequency in sign languages by analyzing a much larger set of relevant data from PJM.

23 October 2017

Katarzyna Krasnowska-Kieraś, Piotr Rybak, Alina Wróblewska (Institute of Computer Science, Polish Academy of Sciences)

Towards the evaluation of feature embedding models of the fusional languages in the context of morphosyntactic disambiguation and dependency parsing

Neural networks have recently been very successful in various natural language processing tasks. An important component of a neural network approach is a dense vector representation of features, i.e. feature embedding. Various feature types are possible, e.g. words or part-of-speech tags. In our talk we are going to present the results of an analysis showing what should be used as features in estimating embedding models of fusional languages: tokens or lemmata. Furthermore, we are going to discuss the methodological question of whether the results of the intrinsic evaluation of embeddings are informative for downstream applications, or whether the embedding models should be evaluated extrinsically. The accompanying experiments were conducted on Polish – a fusional Slavic language with a relatively free word order. This research has inspired us to implement a morphosyntactic disambiguator – Toygger (Krasnowska-Kieraś, 2017). The tool won shared task 1 (A) in the PolEval 2017 competition and will be presented in our talk.

6 November 2017

Szymon Łęski (Samsung R&D Poland)

Deep neural networks in language models

In my talk I will first give an introduction to language models: traditional, n-gram-based ones, and new ones based on recurrent networks. Then, based on recent papers, I will discuss the most interesting extensions and modifications to RNN-based language models, such as modifying word representations or models with output not limited to a pre-defined vocabulary.

20 November 2017

Michał Ptaszyński (Kitami Institute of Technology, Japan)

Capturing Emotions in Context as a way towards Computational Phronesis

Research on emotions within Artificial Intelligence and related fields has flourished in recent years. Unfortunately, in most research emotions are analyzed without their context. I will argue that recognizing emotions without recognizing their context is incomplete and cannot be sufficient for real-world applications. I will also describe some consequences of disregarding the context of emotions. Finally, I will present one approach in which the context of emotions is considered, and briefly describe some of the first experiments performed in this matter.

4 December 2017

Sebastian Żurowski, Adam Dobaczewski, Piotr Sobotka (Nicolaus Copernicus University in Toruń)

Talk title will be available shortly

Talk summary will be available shortly.

Please see also the talks given in 2000–2015 and 2015–2017.

-  ⇤ ← Revision 3 as of 2016-06-27 22:36:46 → 
  Size: 866
  Editor: MaciejOgrodniczuk
  Comment:
+   ← Revision 108 as of 2017-10-25 13:29:22 → ⇥
  Size: 7156
  Editor: MaciejOgrodniczuk
  Comment:
 Line 3:
-= Natural Language Processing Seminar 2016–2017 =
+= Natural Language Processing Seminar 2017–2018 =
 Line 5:
-||<style="border:0;padding:0">The NLP Seminar is organised by the [[http://nlp.ipipan.waw.pl/|Linguistic Engineering Group]] at the [[http://www.ipipan.waw.pl/en/|Institute of Computer Science]], [[http://www.pan.pl/index.php?newlang=english|Polish Academy of Sciences]] (ICS PAS). It takes place on (some) Mondays, normally at 10:15 am, in the seminar room of the ICS PAS (ul. Jana Kazimierza 5, Warszawa). ||<style="border:0;padding-left:30px">[[seminarium-archiwum|{{attachment:pl.png}}]]||
+||<style="border:0;padding-bottom:10px">The NLP Seminar is organised by the [[http://nlp.ipipan.waw.pl/|Linguistic Engineering Group]] at the [[http://www.ipipan.waw.pl/en/|Institute of Computer Science]], [[http://www.pan.pl/index.php?newlang=english|Polish Academy of Sciences]] (ICS PAS). It takes place on (some) Mondays, normally at 10:15 am, in the seminar room of the ICS PAS (ul. Jana Kazimierza 5, Warszawa). All recorded talks are available [[https://www.youtube.com/channel/UC5PEPpMqjAr7Pgdvq0wRn0w|on YouTube]]. ||<style="border:0;padding-left:30px">[[seminarium|{{attachment:seminar-archive/pl.png}}]]||
 Line 7:
-||<style="border:0;padding-top:10px">It's summer holiday season, please come back in October! And now see [[http://nlp.ipipan.waw.pl/NLP-SEMINAR/previous-e.html|the talks given between 2000 and 2015]] and [[http://zil.ipipan.waw.pl/seminar|2015-16]].||||
+||<style="border:0;padding-top:5px;padding-bottom:5px">'''2 October 2017'''||
||<style="border:0;padding-left:30px;padding-bottom:0px">'''Paweł Rutkowski''' (University of Warsaw)||
||<style="border:0;padding-left:30px;padding-bottom:5px">[[https://www.youtube.com/watch?v=Acfdv6kUe5I|{{attachment:seminarium-archiwum/youtube.png}}]] '''[[attachment:seminarium-archiwum/2017-10-02.pdf|Polish Sign Language from the perspective of corpus linguistics]]''' {{attachment:seminarium-archiwum/icon-pl.gif|Talk delivered in Polish.}} {{attachment:seminarium-archiwum/icon-en.gif|Slides in English.}}||
||<style="border:0;padding-left:30px;padding-bottom:15px">Polish Sign Language (polski język migowy, PJM) is a full-fledged visual-spatial language used by the Polish Deaf community. It started to evolve in the second decade of the nineteenth century, with the foundation of the first school for the deaf in Poland. Until recently, PJM attracted very little attention from the linguistic community in Poland. The aim of this talk is to present a large-scale research project aimed at creating an extensive and representative corpus of PJM. The corpus is currently being compiled at the University of Warsaw. It is a collection of video clips showing Deaf people using PJM in a variety of different communication contexts. The videos are richly annotated: they are segmented, lemmatized, translated into Polish, tagged for various grammatical features and transcribed with !HamNoSys symbols. The Corpus of PJM is currently one of the two largest sets of annotated sign language data in the world. Special attention will be paid to the issue of lexical frequency in PJM. Studies of this type are available for a handful of sign languages only, including American Sign Language, New Zealand Sign Language, British Sign Language, Australian Sign Language and Slovene Sign Language. Their empirical basis ranged from 100,000 tokens (NZSL) to as little as 4,000 tokens (ASL). The present talk contributes to our understanding of lexical frequency in sign languages by analyzing a much larger set of relevant data from PJM.||

||<style="border:0;padding-top:5px;padding-bottom:5px">'''23 October 2017'''||
||<style="border:0;padding-left:30px;padding-bottom:0px">'''Katarzyna Krasnowska-Kieraś''', '''Piotr Rybak''', '''Alina Wróblewska''' (Institute of Computer Science, Polish Academy of Sciences)||
||<style="border:0;padding-left:30px;padding-bottom:5px">'''Towards the evaluation of feature embedding models of the fusional languages in the context of morphosyntactic disambiguation and dependency parsing''' {{attachment:seminarium-archiwum/icon-pl.gif|Talk delivered in Polish.}}||
||<style="border:0;padding-left:30px;padding-bottom:15px">Neural networks have recently been very successful in various natural language processing tasks. An important component of a neural network approach is a dense vector representation of features, i.e. feature embedding. Various feature types are possible, e.g. words or part-of-speech tags. In our talk we are going to present the results of an analysis showing what should be used as features in estimating embedding models of fusional languages: tokens or lemmata. Furthermore, we are going to discuss the methodological question of whether the results of the intrinsic evaluation of embeddings are informative for downstream applications, or whether the embedding models should be evaluated extrinsically. The accompanying experiments were conducted on Polish – a fusional Slavic language with a relatively free word order. This research has inspired us to implement a morphosyntactic disambiguator – Toygger (Krasnowska-Kieraś, 2017). The tool won shared task 1 (A) in the [[http://poleval.pl|PolEval 2017]] competition and will be presented in our talk.||

||<style="border:0;padding-top:5px;padding-bottom:5px">'''6 November 2017'''||
||<style="border:0;padding-left:30px;padding-bottom:0px">'''Szymon Łęski''' (Samsung R&D Poland)||
||<style="border:0;padding-left:30px;padding-bottom:5px">'''Deep neural networks in language models'''||
||<style="border:0;padding-left:30px;padding-bottom:15px">In my talk I will first give an introduction to language models: traditional, n-gram-based ones, and new ones based on recurrent networks. Then, based on recent papers, I will discuss the most interesting extensions and modifications to RNN-based language models, such as modifying word representations or models with output not limited to a pre-defined vocabulary.||

||<style="border:0;padding-top:5px;padding-bottom:5px">'''20 November 2017'''||
||<style="border:0;padding-left:30px;padding-bottom:0px">'''Michał Ptaszyński''' (Kitami Institute of Technology, Japan)||
||<style="border:0;padding-left:30px;padding-bottom:5px">'''Capturing Emotions in Context as a way towards Computational Phronesis'''||
||<style="border:0;padding-left:30px;padding-bottom:15px">Research on emotions within Artificial Intelligence and related fields has flourished in recent years. Unfortunately, in most research emotions are analyzed without their context. I will argue that recognizing emotions without recognizing their context is incomplete and cannot be sufficient for real-world applications. I will also describe some consequences of disregarding the context of emotions. Finally, I will present one approach in which the context of emotions is considered, and briefly describe some of the first experiments performed in this matter.||

||<style="border:0;padding-top:5px;padding-bottom:5px">'''4 December 2017'''||
||<style="border:0;padding-left:30px;padding-bottom:0px">'''Sebastian Żurowski''', '''Adam Dobaczewski''', '''Piotr Sobotka''' (Nicolaus Copernicus University in Toruń)||
||<style="border:0;padding-left:30px;padding-bottom:5px">'''Talk title will be available shortly'''||
||<style="border:0;padding-left:30px;padding-bottom:15px">Talk summary will be available shortly.||


||<style="border:0;padding-top:10px">Please see also [[http://nlp.ipipan.waw.pl/NLP-SEMINAR/previous-e.html|the talks given in 2000–2015]] and [[http://zil.ipipan.waw.pl/seminar-archive|2015–2017]].||

## [[attachment:seminarium-archiwum/2016-10-10.pdf|Paraphrase Detection Ensemble – SemEval 2016 winner]]''' &#160;{{attachment:seminarium-archiwum/icon-pl.gif|Talk delivered in Polish.}} {{attachment:seminarium-archiwum/icon-en.gif|Slides in English.}}
