Multi-view aggregation and multi-relation alignment for few-shot fine-grained recognition
Published in Expert Systems with Applications, 2025
Few-shot fine-grained recognition (FS-FGR) aims to recognize nuanced categories that were not encountered during training, given only a limited number of labeled samples. Previous work has made significant progress by enhancing the learning of refined foreground regions and the alignment of consistent semantics. However, the detrimental impact of insufficient background diversity on constructing representative category prototypes has been overlooked. Meanwhile, the alignment of semantically consistent features has been hampered by reliance on a single metric, resulting in suboptimal feature extraction. To address these limitations, a novel framework with multi-view aggregation and multi-relation alignment (MVRA) is proposed. In this framework, we refine category prototypes by generating and consolidating multiple views from the limited learnable samples. Specifically, we generate foreground-refined views, which pinpoint discriminative regions, and background-obfuscated views, which broaden background diversity. Further, without relying on the entire prior, a global label assignment module is designed to automatically assign reliable labels to the query set samples. Finally, using these labels, the multi-relation alignment module exploits the enriched views and their semantic congruencies to facilitate robust feature extraction. The effectiveness of MVRA is evaluated through extensive experiments on three fine-grained benchmark datasets.
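As a rough illustration of the prototype idea in the abstract, the sketch below averages the embeddings of several views of each support sample into one class prototype and labels a query by its nearest prototype. This is the generic mean-prototype, nearest-neighbor scheme from prototypical-style few-shot learning, not the MVRA modules themselves. The function names, the two-dimensional toy embeddings, and the class labels are all hypothetical, and the view generation (foreground-refined, background-obfuscated) is assumed to have happened upstream.

```python
from math import dist
from statistics import fmean


def aggregate_prototype(view_embeddings):
    """Average equal-length embedding vectors into one prototype (assumed aggregation rule)."""
    return [fmean(dim) for dim in zip(*view_embeddings)]


def build_prototypes(support):
    """Build one prototype per class.

    `support` maps a class label to a list of samples; each sample is a
    list of view embeddings (hypothetical layout for this sketch).
    """
    prototypes = {}
    for label, samples in support.items():
        # Pool every view of every support sample of this class.
        all_views = [view for sample in samples for view in sample]
        prototypes[label] = aggregate_prototype(all_views)
    return prototypes


def assign_label(query_embedding, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: dist(query_embedding, prototypes[label]))


# Toy 1-shot, 2-way episode: one support sample per class, three views each.
support = {
    "sparrow": [[[1.0, 0.0], [0.8, 0.2], [1.2, -0.2]]],
    "finch": [[[0.0, 1.0], [0.2, 0.8], [-0.2, 1.2]]],
}
prototypes = build_prototypes(support)
predicted = assign_label([0.9, 0.1], prototypes)
print(predicted)  # sparrow
```

A plain mean treats every view equally and Euclidean distance is a single metric, which is exactly the kind of simplification the abstract argues against; the sketch only shows where multiple views enter the prototype computation.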
Recommended citation: Jiale Chen, Feng Xu, Xin Lyu, Tao Zeng, Xin Li, and Shangjing Chen. Multi-view aggregation and multi-relation alignment for few-shot fine-grained recognition. Expert Systems with Applications, 283:127764, 2025.
