Revision 40 as of 2017-01-22 21:29:46

Natural Language Processing Seminar 2016–2017

The NLP Seminar is organised by the Linguistic Engineering Group at the Institute of Computer Science, Polish Academy of Sciences (ICS PAS). It takes place on (some) Mondays, normally at 10:15 am, in the seminar room of the ICS PAS (ul. Jana Kazimierza 5, Warszawa).

10 October 2016

Katarzyna Pakulska, Barbara Rychalska, Krystyna Chodorowska, Wojciech Walczak, Piotr Andruszkiewicz (Samsung)

Paraphrase Detection Ensemble – SemEval 2016 winner  Talk delivered in Polish.

This seminar describes the winning solution designed for a core track within the SemEval 2016 English Semantic Textual Similarity (STS) task. The goal of the competition was to measure semantic similarity between two given sentences on a scale from 0 to 5. At the same time the solution should replicate human language understanding. The presented model is a novel hybrid of recursive auto-encoders from deep learning (RAE) and a WordNet award-penalty system, enriched with a number of other similarity models and features used as input for Linear Support Vector Regression.

24 October 2016

Adam Przepiórkowski, Jakub Kozakoszczak, Jan Winkowski, Daniel Ziembicki, Tadeusz Teleżyński (Institute of Computer Science, Polish Academy of Sciences / University of Warsaw)

Corpus of formalized textual entailment steps  Talk delivered in Polish.

The authors present resources created within the CLARIN project to support qualitative evaluation of RTE systems: two corpora of textual derivations and a corpus of textual entailment rules. A textual derivation is a series of atomic steps that connects the Text with the Hypothesis in a textual entailment pair. The original pairs are taken from the FraCaS corpus and a Polish translation of the RTE3 corpus. A textual entailment rule sanctions the textual entailment relation between the input and the output of a step, using syntactic patterns written in the UD standard together with other semantic, logical and contextual constraints expressed in FOL.

7 November 2016

Rafał Jaworski (Adam Mickiewicz University in Poznań)

Concordia – translation memory search algorithm  Talk delivered in Polish.

The talk covers the Concordia algorithm (http://tmconcordia.sourceforge.net/), which is used to maximize the productivity of a human translator. The algorithm combines the features of standard fuzzy translation memory searching with a concordancer. As the key non-functional requirement of computer-aided translation mechanisms is performance, Concordia incorporates upgraded versions of standard approximate searching techniques, aiming at reducing the computational complexity.

21 November 2016

Norbert Ryciak, Aleksander Wawer (Institute of Computer Science, Polish Academy of Sciences)

https://www.youtube.com/watch?v=hGKzZxFa0ik Using recursive deep neural networks and syntax to compute phrase semantics  Talk delivered in Polish.

The seminar presents initial experiments on recursive phrase-level sentiment computation using dependency syntax and deep learning. We discuss neural network architectures and implementations created within Clarin 2 and present results on English language resources. The seminar also covers ongoing work on Polish language resources.

5 December 2016

Dominika Rogozińska, Marcin Woliński (Institute of Computer Science, Polish Academy of Sciences)

Methods of syntax disambiguation for constituent parse trees in Polish as a post-processing phase of the Świgra parser  Talk delivered in Polish.

The presentation shows methods of syntax disambiguation for Polish utterances parsed by the Świgra parser. The presented methods include probabilistic context-free grammars and maximum entropy models. The best of the described models achieves an efficiency measure of 96.2%. The outcome of our experiments is a module for post-processing Świgra's parses.

9 January 2017

Agnieszka Pluwak (Institute of Slavic Studies, Polish Academy of Sciences)

Building a domain-specific knowledge representation with an extended frame-based method, based on a corpus of lease agreements in Polish, English and German  Talk delivered in Polish.

The FrameNet project is described by its authors as a lexical database of an ontological character (it is not an ontology sensu stricto because of its selective description of concepts and of relations between frames). Ontologies as knowledge representations in NLP should be applicable to specific domains and texts, but in the literature up to January 2016 I found no example of a knowledge representation based entirely on frames or on an extensive structure of relations between frames. I found only a few examples of domain-specific knowledge representations using selected FrameNet frames (BioFrameNet, Legal FrameNet, etc.), in which the frames were used to link data from different resources. In my doctoral thesis I decided to carry out an experiment in building a domain-specific knowledge representation based on relations between frames, determined through an analysis of lease agreement texts. The aim of the study was to describe frames useful for potential data extraction from lease agreements, that is, frames containing answers to the questions a professional analyst asks when reading the text of an agreement. In the thesis I posed various questions, including: would I be able to use ready-made FrameNet frames, or would I have to build my own? Would the Polish language introduce specific problems? How would specialised language affect the use of frames? And many others.

23 January 2017

Marek Rogalski (Lodz University of Technology)

Automatic paraphrasing  Talk delivered in Polish.

Paraphrasing is conveying the essential meaning of a message using different words. The ability to paraphrase is a measure of understanding. A teacher who asks a student "could you please tell us using your own words ..." is testing whether the student has understood the topic. In this presentation we will discuss the task of automatic paraphrasing. We will differentiate between syntax-level paraphrases and essential-meaning-level paraphrases. We will bring up several techniques from seemingly unrelated fields that can be applied in automatic paraphrasing. We will also show results that we have been able to produce with those techniques.

6 February 2017

Łukasz Kobyliński (Institute of Computer Science, Polish Academy of Sciences)

Korpusomat  Talk delivered in Polish.

The summary will be available shortly.

Please see also the talks given between 2000 and 2015 and in 2015–16.