
Comparative analysis on cross-modal information retrieval: A review.

Kaur, P., Pannu, H.S. and Malhi, A., 2021. Comparative analysis on cross-modal information retrieval: A review. Computer Science Review, 39 (February), 100336.

Full text available as:

cross_modal_survey.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial.


DOI: 10.1016/j.cosrev.2020.100336


Human beings experience life through a spectrum of modes such as vision, taste, hearing, smell, and touch. These modes are integrated for information processing in the brain through a complex network of neuron connections. Likewise, for artificial intelligence to mimic the human way of learning and evolve into its next generation, it must handle multi-modal information fusion efficiently. A modality is a channel that conveys information about an object or an event, such as image, text, video, or audio. A research problem is said to be multi-modal when it incorporates information from more than one modality. A multi-modal system accepts a query in one modality and may return results in any modality (the same or a different one), whereas a cross-modal system strictly retrieves information from a modality different from that of the query. Because the query and the retrieved results belong to different modal families, comparing them coherently remains an open challenge, given their primitive forms and the subjective definition of content similarity. Researchers have proposed numerous techniques to handle this issue and to reduce the semantic gap in information retrieval across modalities. This paper presents a comparative analysis of research works in the field of cross-modal information retrieval. It also compares several cross-modal representations and discusses the results of state-of-the-art methods on benchmark datasets. Finally, open issues are presented to give researchers a better understanding of the present scenario and to help identify future research directions.

Item Type: Article
Uncontrolled Keywords: Cross-modal; Multimedia; Information retrieval; Data fusion; Comparative analysis
Group: Faculty of Science & Technology
ID Code: 36355
Deposited By: Symplectic RT2
Deposited On: 17 Dec 2021 15:57
Last Modified: 14 Mar 2022 14:31

