Computer Science > Computer Vision and Pattern Recognition

arXiv:2507.23601 (cs)

[Submitted on 31 Jul 2025]

Title:Mamba-based Efficient Spatio-Frequency Motion Perception for Video Camouflaged Object Detection

Abstract:Existing video camouflaged object detection (VCOD) methods primarily rely on spatial appearance features to perceive motion cues for breaking camouflage. However, the high similarity between foreground and background in VCOD results in limited discriminability of spatial appearance features (e.g., color and texture), restricting detection accuracy and completeness. Recent studies demonstrate that frequency features can not only enhance feature representation to compensate for appearance limitations but also perceive motion through dynamic variations in frequency energy. Furthermore, the emerging state space model called Mamba, enables efficient perception of motion cues in frame sequences due to its linear-time long-sequence modeling capability. Motivated by this, we propose a novel visual camouflage Mamba (Vcamba) based on spatio-frequency motion perception that integrates frequency and spatial features for efficient and accurate VCOD. Specifically, we propose a receptive field visual state space (RFVSS) module to extract multi-scale spatial features after sequence modeling. For frequency learning, we introduce an adaptive frequency component enhancement (AFE) module with a novel frequency-domain sequential scanning strategy to maintain semantic consistency. Then we propose a space-based long-range motion perception (SLMP) module and a frequency-based long-range motion perception (FLMP) module to model spatio-temporal and frequency-temporal sequences in spatial and frequency phase domains. Finally, the space and frequency motion fusion module (SFMF) integrates dual-domain features for unified motion representation. Experimental results show that our Vcamba outperforms state-of-the-art methods across 6 evaluation metrics on 2 datasets with lower computation cost, confirming the superiority of Vcamba. Our code is available at: this https URL.

Comments:	11 pages, 11 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2507.23601 [cs.CV]
	(or arXiv:2507.23601v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2507.23601

Submission history

From: Xin Li [view email]
[v1] Thu, 31 Jul 2025 14:40:37 UTC (2,710 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Mamba-based Efficient Spatio-Frequency Motion Perception for Video Camouflaged Object Detection

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Mamba-based Efficient Spatio-Frequency Motion Perception for Video Camouflaged Object Detection

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators