AIDe - Corpus of Annotated Image Descriptions
AIDe is a dataset of image descriptions in Polish. It consists of 2K descriptions of 1K images. The descriptions are morphosyntactically analysed, and pairs of descriptions are annotated for semantic relatedness and entailment. All annotations are provided by human annotators with a strong linguistic background.
The dataset can be used to evaluate various systems that integrate language and vision. It is suitable for evaluating systems designed for caption generation from images (text generation) or for image generation from provided descriptions. As the selected images are split into thematic groups, the dataset is also useful for validating image classification approaches.
Download
Dataset of annotated image descriptions: AIDe
Images: we do not own the copyright to the images. If you wish to obtain the pictures, please contact alina <at> ipipan.waw.pl (replace <at> with @).
Publication
Licence
The resource is distributed under the CC BY-NC-SA 4.0 licence.
Contact
To contact Alina Wróblewska, please write to alina <at> ipipan.waw.pl.
Acknowledgments
The building of the resource was funded by SONATA 8 grant no. 2014/15/D/HS2/03486 from the National Science Centre, Poland, and by the Polish Ministry of Science and Higher Education as part of the investment in the CLARIN-PL research infrastructure.