Deep cross-modal discriminant adversarial learning for zero-shot sketch-based image retrieval
Jiao, Shichao, Han, Xie, Xiong, Fengguang, Yang, Xiaowen, Han, Huiyan, He, Ligang and Kuang, Liqun (2022) Deep cross-modal discriminant adversarial learning for zero-shot sketch-based image retrieval. Neural Computing and Applications, 34 (16). pp. 13469-13483. doi:10.1007/s00521-022-07169-6. ISSN 0941-0643.
Research output not available from this repository.
Official URL: http://dx.doi.org/10.1007/s00521-022-07169-6
Abstract
Zero-shot sketch-based image retrieval (ZS-SBIR) is an extension of sketch-based image retrieval (SBIR) that aims to search for relevant images using query sketches of unseen categories. Most previous methods focus on preserving semantic knowledge and improving domain alignment, but neglect to capture the correlation between inter-modal features, resulting in unsatisfactory performance. Hence, a sketch-image cross-modal retrieval framework is proposed to maximize the sketch-image correlation. For this framework, we develop a discriminant adversarial learning method that incorporates intra-modal discrimination, inter-modal consistency, and inter-modal correlation into a deep learning network for common feature representation learning. Specifically, sketch and image features are first projected into a shared feature subspace to achieve modality invariance. Subsequently, we adopt a category label predictor to achieve intra-modal discrimination, use adversarial learning to confuse modal information for inter-modal consistency, and introduce correlation learning to maximize inter-modal correlation. Finally, the trained deep learning model is used to test unseen categories. Extensive experiments conducted on three zero-shot datasets show that this method outperforms state-of-the-art methods. For retrieval accuracy on unseen categories, this method exceeds the state-of-the-art methods by approximately 0.6% on the RSketch dataset, 5% on the Sketchy dataset, and 7% on the TU-Berlin dataset. We also conduct experiments on a dataset for image-based 3D model scene retrieval; the proposed method significantly outperforms the state-of-the-art approaches on all standard metrics.
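The retrieval step the abstract describes — embedding sketches and images into a shared feature subspace and then matching a query sketch against the image gallery — can be sketched in a few lines. This is an illustrative toy, not the authors' code: the embeddings are made up, and ranking by cosine similarity is a common choice for such shared-subspace retrieval, assumed here rather than taken from the paper.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank_images(sketch_emb, image_embs):
    """Return gallery indices sorted by descending similarity to the sketch."""
    scores = [cosine(sketch_emb, e) for e in image_embs]
    return sorted(range(len(image_embs)), key=lambda i: -scores[i])

# Toy embeddings already projected into the shared subspace (illustrative only).
sketch = [1.0, 0.0, 0.5]
gallery = [
    [0.9, 0.1, 0.4],   # close to the sketch
    [-1.0, 0.2, 0.0],  # far from the sketch
    [0.5, 0.5, 0.5],   # in between
]
print(rank_images(sketch, gallery))  # → [0, 2, 1], most similar first
```

In the zero-shot setting, the gallery would contain images of categories never seen during training; the learned projection is what makes their embeddings comparable to sketch embeddings at all.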
Item Type: Journal Article
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Journal or Publication Title: Neural Computing and Applications
Publisher: Springer
ISSN: 0941-0643
Official Date: August 2022
Volume: 34
Number: 16
Page Range: pp. 13469-13483
DOI: 10.1007/s00521-022-07169-6
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access