Cross-modal retrieval based on deep regularized hashing constraints

Asad Khan, Sakander Hayat, Muhammad Ahmad, Jinyu Wen, Muhammad Umar Farooq, Meie Fang*, Wenchao Jiang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Cross-modal retrieval has attracted great attention in recent years due to the increasing demand for tremendous amounts of multimodal data. Retrieval can run either from text to image or from image to text. To address the problem of inappropriate information included between images and texts, we propose two cross-modal retrieval techniques built on a dual-branch neural network defined on a common subspace and on hash learning. First, we present a cross-modal retrieval technique based on a multilabel information deep ranking model (MIDRM). In this method, we introduce a triplet-loss function into the dual-branch neural network model. This function exploits the semantic information of the two modalities, attending not only to the similarities between matching image and text features but also to the distances between dissimilar images and texts. Second, we establish a new cross-modal hashing technique called the deep regularized hashing constraint (DRHC). In this method, a regularization function replaces the binary constraint, and the discrete values are constrained to a certain numerical range so that the network can be trained end to end. Overall, the time complexity is greatly improved, and the occupied storage space is greatly reduced. Experiments on two widely used data sets show that the proposed MIDRM and DRHC models outperform state-of-the-art methods and that our approach increases the mean average precision of cross-modal retrieval.
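The abstract does not give the loss functions themselves, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: a hinge-style triplet loss on cosine similarity, and a penalty of the form (|h| - 1)^2 standing in for the binary constraint. Both forms are common choices in the hashing literature and may differ from the formulations used in MIDRM and DRHC. All function names here are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Assumed form: zero once the matching pair is at least `margin`
    # more similar than the mismatched pair, positive otherwise.
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))

def quantization_penalty(codes):
    # Assumed regularizer: mean of (|h| - 1)^2. It is zero exactly when
    # every entry is +1 or -1, so it pushes real-valued outputs toward
    # binary codes while remaining differentiable almost everywhere.
    return sum((abs(h) - 1.0) ** 2 for h in codes) / len(codes)

def binarize(codes):
    # Sign function applied after training to obtain the final hash code.
    return [1 if h >= 0 else -1 for h in codes]

def hamming(a, b):
    # Number of positions where two binary codes differ.
    return sum(x != y for x, y in zip(a, b))

# Toy embeddings (made up for illustration, not data from the paper).
image = [0.9, 0.1, -0.8, 0.7]
matching_text = [0.8, 0.2, -0.9, 0.6]
unrelated_text = [-0.7, 0.9, 0.6, -0.8]

ranking_term = triplet_loss(image, matching_text, unrelated_text)
relaxation_term = quantization_penalty(image)
near = hamming(binarize(image), binarize(matching_text))
far = hamming(binarize(image), binarize(unrelated_text))
```

In this toy setup the matching text ends up with the same binary code as the image, while the unrelated text differs in three of four bits. Comparing short binary codes by Hamming distance instead of comparing real-valued vectors is the usual source of the speed and storage savings that hashing methods report.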

Original language: English
Pages (from-to): 6508-6530
Number of pages: 23
Journal: International Journal of Intelligent Systems
Volume: 37
Issue number: 9
State: Published - Sep 2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 Wiley Periodicals LLC.

Keywords

  • cross-modal retrieval
  • hashing learning
  • image search
  • multilabel information
  • neural network
  • ranking model
  • triplet loss

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Human-Computer Interaction
  • Artificial Intelligence
