University of Warwick
Publications service & WRAP
Variational recurrent sequence-to-sequence retrieval for stepwise illustration


Batra, Vishwas, Haldar, Aparajita, He, Yulan, Ferhatosmanoglu, Hakan, Vogiatzis, George and Guha, Tanaya (2020) Variational recurrent sequence-to-sequence retrieval for stepwise illustration. In: Advances in Information Retrieval : 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020, Proceedings, Part I. Lecture Notes in Computer Science, 12035 . Springer, pp. 50-64. ISBN 9783030454388

PDF (Accepted Version): WRAP-variational-recurrent-sequence-to-sequence-retrieval-stepwise-illustration-Batra-2020.pdf (4 MB)
Official URL: http://dx.doi.org/10.1007/978-3-030-45439-5_4


Abstract

We address and formalise the task of sequence-to-sequence (seq2seq) cross-modal retrieval. Given a sequence of text passages as a query, the goal is to retrieve a sequence of images that best describes and aligns with the query. This new task extends traditional cross-modal retrieval, where each image-text pair is treated independently, ignoring broader context. We propose a novel variational recurrent seq2seq (VRSS) retrieval model for this seq2seq task. Unlike most cross-modal methods, we generate an image vector corresponding to the latent topic obtained by combining the text semantics and context. This synthetic image embedding point, associated with every text embedding point, can then be employed for either image generation or image retrieval as desired. We evaluate the model on the stepwise illustration of recipes, where a sequence of relevant images is retrieved to best match the steps described in the text. To this end, we build and release a new Stepwise Recipe dataset for research purposes, containing 10K recipes (sequences of image-text pairs) with a total of 67K image-text pairs. To our knowledge, it is the first publicly available dataset to offer rich semantic descriptions in a focused category such as food or recipes. Our model is shown to outperform several competitive and relevant baselines in our experiments. We also provide a qualitative analysis of how semantically meaningful our model's results are, through human evaluation and comparison with relevant existing methods.
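
The following is a minimal illustrative sketch (not the authors' code) of the retrieval idea described in the abstract: a recurrent encoder carries context across the sequence of text steps, a variational latent "topic" is sampled per step, a synthetic image embedding is decoded from that latent, and images are retrieved by nearest-neighbour search against a gallery. It assumes PyTorch and pre-computed text/image feature vectors; all module names, dimensions, and helper functions here are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class VRSSSketch(nn.Module):
    def __init__(self, text_dim=300, hidden_dim=512, latent_dim=256, image_dim=2048):
        super().__init__()
        # Recurrent encoder: carries context across the sequence of text steps.
        self.rnn = nn.GRU(text_dim, hidden_dim, batch_first=True)
        # Variational head: latent "topic" z derived from the current hidden state.
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        # Decoder: maps the latent topic to a synthetic image embedding.
        self.to_image = nn.Linear(latent_dim, image_dim)

    def forward(self, text_seq):
        # text_seq: (batch, steps, text_dim) pre-encoded text passages.
        h, _ = self.rnn(text_seq)                                 # (batch, steps, hidden_dim)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        return self.to_image(z), mu, logvar                       # synthetic image embedding per step

def retrieve(pred_img_emb, gallery_emb):
    # Nearest-neighbour retrieval by cosine similarity against a gallery of real
    # image embeddings; returns the index of the best-matching image for each step.
    sims = F.normalize(pred_img_emb, dim=-1) @ F.normalize(gallery_emb, dim=-1).T
    return sims.argmax(dim=-1)

Training such a model would typically combine a reconstruction/matching loss between the synthetic and ground-truth image embeddings with a KL term on (mu, logvar); the specific objective used in the paper is described in the full text linked above.
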

Item Type: Book Item
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Programming languages (Electronic computers)—Semantics, Data sets, Sequential processing (Computer science)
Series Name: Lecture Notes in Computer Science
Publisher: Springer
ISBN: 9783030454388
ISSN: 0302-9743
Book Title: Advances in Information Retrieval : 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020, Proceedings, Part I
Official Date: 8 April 2020
Dates:
  • 8 April 2020: Published
  • 9 December 2019: Accepted
Volume: 12035
Page Range: pp. 50-64
DOI: 10.1007/978-3-030-45439-5_4
Status: Peer Reviewed
Publication Status: Published
Reuse Statement (publisher, data, author rights): The final authenticated version is available online at http://dx.doi.org/10.1007/978-3-030-45439-5_4
Access rights to Published version: Open Access (Creative Commons)
Date of first compliant deposit: 21 April 2020
Date of first compliant Open Access: 22 April 2020
Related URLs:
  • Publisher
Open Access Version:
  • Publisher
