Secret image sharing (SIS) has received increased attention from the research community because of its usefulness in multiparty secure computing, access control, blockchain distributive storage and other security-oriented applications. Prevention of fake ...
Image–text retrieval is a vital task in computer vision and has received growing attention, since it connects cross-modality data. It comes with the critical challenges of learning unified representations and eliminating the large gap between visual and ...
As a crucial part of natural language processing, event-centered commonsense inference task has attracted increasing attention. With a given observed event, the intention and reaction of the people involved in the event are required to be inferred with ...
This study provides a new understanding of the adversarial attack problem by examining the correlation between adversarial attack and visual attention change. In particular, we observed that: (1) images with incomplete attention regions are more ...
Video-based person re-identification (ReID) is challenging due to the presence of various interferences in video frames. Recent approaches handle this problem using temporal aggregation strategies. In this work, we propose a novel Context Sensing ...
With the progress of Mars exploration, numerous Mars image data are being collected and need to be analyzed. However, due to the severe train-test gap and quality distortion of Martian data, the performance of existing computer vision models is ...
Multi-modal medical image fusion is a long-standing important research topic that can obtain informative medical images and assist doctors diagnose and treat diseases more efficiently. However, most fusion methods extract and fuse features by subjectively ...
As an important and challenging problem, image generation with limited data aims at generating realistic images through training a GAN model given few samples. A typical solution is to transfer a well-trained GAN model from a data-rich source domain to ...
Audio information has not been considered an important factor in visual attention models regardless of many psychological studies that have shown the importance of audio information in the human visual perception system. Since existing visual attention ...
Image captioning is a promising task that attracted researchers in the last few years. Existing image captioning models are primarily trained to generate one caption per image. However, an image may contain rich contents, and one caption cannot express ...
Meta-learning approaches have recently achieved promising performance in multi-class incremental learning. However, meta-learners still suffer from catastrophic forgetting, i.e., they tend to forget the learned knowledge from the old tasks when they focus ...
Multi-label classification aims to recognize multiple objects or attributes from images. The key to solving this issue relies on effectively characterizing the inter-label correlations or dependencies, which bring the prevailing graph neural network. ...