Aphantasia, the inability to form mental images, poses a serious challenge to an influential theory of abstract thought in ...
Abstract: Multimodal Large Language Models (MLLMs) have shown promising capabilities in Audio-Video Question-Answering (AVQA) tasks. However, during training and inference, they often suffer from ...
Abstract: Multimodal emotion recognition (MER) aims to understand human emotions by leveraging multiple modalities. Previous MER methods have focused on learning enhanced multimodal representations ...