Computer Vision and Pattern Recognition

Authors and titles for recent submissions

Total of 642 entries (showing 210-259)

[210] arXiv:2507.22062 [pdf, html, other]: Title: Meta CLIP 2: A Worldwide Scaling Recipe

Yung-Sung Chuang, Yang Li, Dong Wang, Ching-Feng Yeh, Kehan Lyu, Ramya Raghavendra, James Glass, Lifei Huang, Jason Weston, Luke Zettlemoyer, Xinlei Chen, Zhuang Liu, Saining Xie, Wen-tau Yih, Shang-Wen Li, Hu Xu

Comments: 10 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[211] arXiv:2507.22061 [pdf, html, other]: Title: MOVE: Motion-Guided Few-Shot Video Object Segmentation

Kaining Ying, Hengrui Hu, Henghui Ding

Comments: ICCV 2025, Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[212] arXiv:2507.22059 [pdf, html, other]: Title: StepAL: Step-aware Active Learning for Cataract Surgical Videos

Nisarg A. Shah, Bardia Safaei, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

Comments: Accepted to MICCAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[213] arXiv:2507.22058 [pdf, html, other]: Title: X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again

Zigang Geng, Yibing Wang, Yeyao Ma, Chen Li, Yongming Rao, Shuyang Gu, Zhao Zhong, Qinglin Lu, Han Hu, Xiaosong Zhang, Linus, Di Wang, Jie Jiang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214] arXiv:2507.22057 [pdf, html, other]: Title: MetaLab: Few-Shot Game Changer for Image Recognition

Chaofei Qi, Zhitai Liu, Jianbin Qiu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215] arXiv:2507.22052 [pdf, html, other]: Title: Ov3R: Open-Vocabulary Semantic 3D Reconstruction from RGB Videos

Ziren Gong, Xiaohan Li, Fabio Tosi, Jiawei Han, Stefano Mattoccia, Jianfei Cai, Matteo Poggi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[216] arXiv:2507.22041 [pdf, html, other]: Title: Shallow Deep Learning Can Still Excel in Fine-Grained Few-Shot Learning

Chaofei Qi, Chao Ye, Zhitai Liu, Weiyang Lin, Jianbin Qiu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217] arXiv:2507.22028 [pdf, html, other]: Title: From Seeing to Experiencing: Scaling Navigation Foundation Models with Reinforcement Learning

Honglin He, Yukai Ma, Wayne Wu, Bolei Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[218] arXiv:2507.22020 [pdf, html, other]: Title: XAI for Point Cloud Data using Perturbations based on Meaningful Segmentation

Raju Ningappa Mulawade, Christoph Garth, Alexander Wiebel

Comments: 18 pages, 14 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[219] arXiv:2507.22008 [pdf, html, other]: Title: VeS: Teaching Pixels to Listen Without Supervision

Sajay Raj

Comments: 6 pages, 1 figure, 1 table. Code and models are released

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220] arXiv:2507.22003 [pdf, other]: Title: See Different, Think Better: Visual Variations Mitigating Hallucinations in LVLMs

Ziyun Dai, Xiaoqiang Li, Shaohua Zhang, Yuanchen Wu, Jide Li

Comments: Accepted by ACM MM25

Journal-ref: 33rd ACM International Conference on Multimedia 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[221] arXiv:2507.22002 [pdf, html, other]: Title: Bridging Synthetic and Real-World Domains: A Human-in-the-Loop Weakly-Supervised Framework for Industrial Toxic Emission Segmentation

Yida Tao, Yen-Chia Hsu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[222] arXiv:2507.22000 [pdf, html, other]: Title: Staining and locking computer vision models without retraining

Oliver J. Sutton, Qinghua Zhou, George Leete, Alexander N. Gorban, Ivan Y. Tyukin

Comments: 10 pages, 9 pages of appendices, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[223] arXiv:2507.21985 [pdf, html, other]: Title: ZIUM: Zero-Shot Intent-Aware Adversarial Attack on Unlearned Models

Hyun Jun Yook, Ga San Jhun, Jae Hyun Cho, Min Jeon, Donghyun Kim, Tae Hyung Kim, Youn Kyu Lee

Comments: Accepted to ICCV2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[224] arXiv:2507.21977 [pdf, html, other]: Title: Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition

Jihao Gu, Kun Li, Fei Wang, Yanyan Wei, Zhiliang Wu, Hehe Fan, Meng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225] arXiv:2507.21971 [pdf, html, other]: Title: EIFNet: Leveraging Event-Image Fusion for Robust Semantic Segmentation

Zhijiang Li, Haoran He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226] arXiv:2507.21968 [pdf, html, other]: Title: A Deep Learning Pipeline Using Synthetic Data to Improve Interpretation of Paper ECG Images

Xiaoyu Wang, Ramesh Nadarajah, Zhiqiang Zhang, David Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[227] arXiv:2507.21960 [pdf, html, other]: Title: PanoSplatt3R: Leveraging Perspective Pretraining for Generalized Unposed Wide-Baseline Panorama Reconstruction

Jiahui Ren, Mochu Xiang, Jiajun Zhu, Yuchao Dai

Comments: Accepted to ICCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[228] arXiv:2507.21959 [pdf, html, other]: Title: Mitigating Spurious Correlations in Weakly Supervised Semantic Segmentation via Cross-architecture Consistency Regularization

Zheyuan Zhang, Yen-chia Hsu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[229] arXiv:2507.21949 [pdf, html, other]: Title: Contrast-Prior Enhanced Duality for Mask-Free Shadow Removal

Jiyu Wu, Yifan Liu, Jiancheng Huang, Mingfu Yan, Shifeng Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[230] arXiv:2507.21947 [pdf, html, other]: Title: Enhancing Generalization in Data-free Quantization via Mixup-class Prompting

Jiwoong Park, Chaeun Lee, Yongseok Choi, Sein Park, Deokki Hong, Jungwook Choi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[231] arXiv:2507.21945 [pdf, html, other]: Title: Attention-Driven Multimodal Alignment for Long-term Action Quality Assessment

Xin Wang, Peng-Jie Li, Yuan-Yuan Shen

Comments: Accepted to Applied Soft Computing

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2507.21924 [pdf, html, other]: Title: MMAT-1M: A Large Reasoning Dataset for Multimodal Agent Tuning

Tianhong Gao, Yannian Fu, Weiqun Wu, Haixiao Yue, Shanshan Liu, Gang Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233] arXiv:2507.21922 [pdf, html, other]: Title: SwinECAT: A Transformer-based fundus disease classification model with Shifted Window Attention and Efficient Channel Attention

Peiran Gu, Teng Yao, Mengshen He, Fuhao Duan, Feiyan Liu, RenYuan Peng, Bao Ge

Comments: 17 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[234] arXiv:2507.21917 [pdf, html, other]: Title: ArtSeek: Deep artwork understanding via multimodal in-context reasoning and late interaction retrieval

Nicola Fanelli, Gennaro Vessio, Giovanna Castellano

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[235] arXiv:2507.21912 [pdf, html, other]: Title: Predict Patient Self-reported Race from Skin Histological Images

Shengjia Chen, Ruchika Verma, Kevin Clare, Jannes Jegminat, Eugenia Alleva, Kuan-lin Huang, Brandon Veremis, Thomas Fuchs, Gabriele Campanella

Comments: Accepted to the MICCAI Workshop on Fairness of AI in Medical Imaging (FAIMI), 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Engineering, Finance, and Science (cs.CE)
[236] arXiv:2507.21905 [pdf, html, other]: Title: Evaluating Deepfake Detectors in the Wild

Viacheslav Pirogov, Maksim Artemev

Comments: Accepted to the ICML 2025 Workshop 'DataWorld: Unifying Data Curation Frameworks Across Domains'

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[237] arXiv:2507.21893 [pdf, other]: Title: Aether Weaver: Multimodal Affective Narrative Co-Generation with Dynamic Scene Graphs

Saeed Ghorbani

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238] arXiv:2507.21888 [pdf, html, other]: Title: CAPE: A CLIP-Aware Pointing Ensemble of Complementary Heatmap Cues for Embodied Reference Understanding

Fevziye Irem Eyiokur, Dogucan Yaman, Hazım Kemal Ekenel, Alexander Waibel

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[239] arXiv:2507.21858 [pdf, html, other]: Title: Low-Cost Test-Time Adaptation for Robust Video Editing

Jianhui Wang, Yinda Chen, Yangfan He, Xinyuan Song, Yi Xin, Dapeng Zhang, Zhongwei Wan, Bin Li, Rongchao Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[240] arXiv:2507.21857 [pdf, html, other]: Title: Unleashing the Power of Motion and Depth: A Selective Fusion Strategy for RGB-D Video Salient Object Detection

Jiahao He, Daerji Suolang, Keren Fu, Qijun Zhao

Comments: submitted to TMM on 11-Jun-2024, ID: MM-020522, still in peer review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[241] arXiv:2507.21844 [pdf, html, other]: Title: Cross-Architecture Distillation Made Simple with Redundancy Suppression

Weijia Zhang, Yuehao Liu, Wu Ran, Chao Ma

Comments: Accepted by ICCV 2025 (Highlight)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[242] arXiv:2507.21820 [pdf, html, other]: Title: Anyone Can Jailbreak: Prompt-Based Attacks on LLMs and T2Is

Ahmed B Mustafa, Zihan Ye, Yang Lu, Michael P Pound, Shreyank N Gowda

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243] arXiv:2507.21809 [pdf, html, other]: Title: HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

HunyuanWorld Team, Zhenwei Wang, Yuhao Liu, Junta Wu, Zixiao Gu, Haoyuan Wang, Xuhui Zuo, Tianyu Huang, Wenhuan Li, Sheng Zhang, Yihang Lian, Yulin Tsai, Lifu Wang, Sicong Liu, Puhua Jiang, Xianghui Yang, Dongyuan Guo, Yixuan Tang, Xinyue Mao, Jiaao Yu, Junlin Yu, Jihong Zhang, Meng Chen, Liang Dong, Yiwen Jia, Chao Zhang, Yonghao Tan, Hao Zhang, Zheng Ye, Peng He, Runzhou Wu, Minghui Chen, Zhan Li, Wangchen Qin, Lei Wang, Yifu Sun, Lin Niu, Xiang Yuan, Xiaofeng Yang, Yingping He, Jie Xiao, Yangyu Tao, Jianchen Zhu, Jinbao Xue, Kai Liu, Chongqing Zhao, Xinming Wu, Tian Liu, Peng Chen, Di Wang, Yuhong Liu, Linus, Jie Jiang, Tengfei Wang, Chunchao Guo

Comments: Technical Report; Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[244] arXiv:2507.21794 [pdf, html, other]: Title: Distribution-Based Masked Medical Vision-Language Model Using Structured Reports

Shreyank N Gowda, Ruichi Zhang, Xiao Gu, Ying Weng, Lu Yang

Comments: Accepted in MICCAI-W 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245] arXiv:2507.21786 [pdf, html, other]: Title: MSGCoOp: Multiple Semantic-Guided Context Optimization for Few-Shot Learning

Zhaolong Wang, Tongfeng Sun, Mingzheng Du, Yachao Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[246] arXiv:2507.21778 [pdf, html, other]: Title: AU-LLM: Micro-Expression Action Unit Detection via Enhanced LLM-Based Feature Fusion

Zhishu Liu, Kaishen Yuan, Bo Zhao, Yong Xu, Zitong Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[247] arXiv:2507.21761 [pdf, html, other]: Title: MOR-VIT: Efficient Vision Transformer with Mixture-of-Recursions

YiZhou Li

Comments: 18 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[248] arXiv:2507.21756 [pdf, html, other]: Title: LiteFat: Lightweight Spatio-Temporal Graph Learning for Real-Time Driver Fatigue Detection

Jing Ren, Suyu Ma, Hong Jia, Xiwei Xu, Ivan Lee, Haytham Fayek, Xiaodong Li, Feng Xia

Comments: 6 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[249] arXiv:2507.21745 [pdf, html, other]: Title: Few-Shot Vision-Language Reasoning for Satellite Imagery via Verifiable Rewards

Aybora Koksal, A. Aydin Alatan

Comments: ICCV 2025 Workshop on Curated Data for Efficient Learning (CDEL). 10 pages, 3 figures, 6 tables. Our model, training code and dataset will be at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250] arXiv:2507.21742 [pdf, html, other]: Title: Adversarial Reconstruction Feedback for Robust Fine-grained Generalization

Shijie Wang, Jian Shi, Haojie Li

Comments: ICCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[251] arXiv:2507.21741 [pdf, html, other]: Title: MAGE: Multimodal Alignment and Generation Enhancement via Bridging Visual and Semantic Spaces

Shaojun E, Yuchen Yang, Jiaheng Wu, Yan Zhang, Tiejun Zhao, Ziyan Chen

Comments: 9 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[252] arXiv:2507.21732 [pdf, html, other]: Title: SAMITE: Position Prompted SAM2 with Calibrated Memory for Visual Object Tracking

Qianxiong Xu, Lanyun Zhu, Chenxi Liu, Guosheng Lin, Cheng Long, Ziyue Li, Rui Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253] arXiv:2507.21723 [pdf, html, other]: Title: Detection Transformers Under the Knife: A Neuroscience-Inspired Approach to Ablations

Nils Hütten, Florian Hölken, Hasan Tercan, Tobias Meisen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[254] arXiv:2507.21715 [pdf, html, other]: Title: Impact of Underwater Image Enhancement on Feature Matching

Jason M. Summers, Mark W. Jones

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[255] arXiv:2507.21703 [pdf, html, other]: Title: Semantics versus Identity: A Divide-and-Conquer Approach towards Adjustable Medical Image De-Identification

Yuan Tian, Shuo Wang, Rongzhao Zhang, Zijian Chen, Yankai Jiang, Chunyi Li, Xiangyang Zhu, Fang Yan, Qiang Hu, XiaoSong Wang, Guangtao Zhai

Comments: Accepted to ICCV2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[256] arXiv:2507.21690 [pdf, html, other]: Title: APT: Improving Diffusion Models for High Resolution Image Generation with Adaptive Path Tracing

Sangmin Han, Jinho Jeong, Jinwoo Kim, Seon Joo Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[257] arXiv:2507.21665 [pdf, html, other]: Title: Automated Detection of Antarctic Benthic Organisms in High-Resolution In Situ Imagery to Aid Biodiversity Monitoring

Cameron Trotter, Huw Griffiths, Tasnuva Ming Khan, Rowan Whittle

Comments: Accepted to ICCV 2025's Joint Workshop on Marine Vision (ICCVW, CVAUI&AAMVEM). Main paper (11 pages, 3 figures, 3 tables) plus supplementary (7 pages, 5 figures, 2 tables)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258] arXiv:2507.21649 [pdf, html, other]: Title: The Evolution of Video Anomaly Detection: A Unified Framework from DNN to MLLM

Shibo Gao, Peipei Yang, Haiyang Guo, Yangyang Liu, Yi Chen, Shuai Li, Han Zhu, Jian Xu, Xu-Yao Zhang, Linlin Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259] arXiv:2507.21627 [pdf, html, other]: Title: GuidPaint: Class-Guided Image Inpainting with Diffusion Models

Qimin Wang, Xinda Liu, Guohua Geng

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Wed, 30 Jul 2025 (showing first 50 of 110 entries)