Computer Science > Computer Vision and Pattern Recognition

arXiv:2507.23755 (cs)

[Submitted on 31 Jul 2025]

Title: Slot Attention with Re-Initialization and Self-Distillation

Authors: Rongzhen Zhao, Yi Zhao, Juho Kannala, Joni Pajarinen

Abstract: Unlike popular solutions based on dense feature maps, Object-Centric Learning (OCL) represents visual scenes as sub-symbolic object-level feature vectors, termed slots, which are highly versatile for tasks involving visual modalities. OCL typically aggregates object superpixels into slots by iteratively applying competitive cross attention, known as Slot Attention, with the slots as the query. However, once initialized, these slots are reused naively, causing redundant slots to compete with informative ones for representing objects. This often results in objects being erroneously segmented into parts. Additionally, mainstream methods derive supervision signals solely from decoding slots into the input's reconstruction, overlooking potential supervision based on internal information. To address these issues, we propose Slot Attention with re-Initialization and self-Distillation (DIAS): i) We reduce redundancy in the aggregated slots and re-initialize an extra aggregation to update the remaining slots; ii) We drive the poor attention map at the first aggregation iteration to approximate the good one at the last iteration, enabling self-distillation. Experiments demonstrate that DIAS achieves state-of-the-art performance on OCL tasks such as object discovery and recognition, while also improving advanced visual prediction and reasoning. Our code is available on this https URL.
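
The two mechanisms the abstract names, competitive cross attention with slots as the query and a first-to-last attention distillation signal, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: it assumes a simplified Slot Attention with no learned projections, GRU, or MLP, and it assumes a cross-entropy form for the distillation loss, which the abstract does not specify. The function names `slot_attention` and `self_distillation_loss` are hypothetical, and the redundancy-reduction and re-initialization step is not covered.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slot_attention(features, slots, n_iters=3, eps=1e-8):
    """Simplified Slot Attention (assumption: no projections, GRU, or MLP).

    features: N feature vectors of length D; slots: K slot vectors of length D.
    Returns the final slots and one [N][K] attention map per iteration.
    """
    dim = len(slots[0])
    scale = 1.0 / math.sqrt(dim)
    attn_maps = []
    for _ in range(n_iters):
        # Competition: for each feature, softmax over the slot axis,
        # so slots compete to explain that feature.
        attn = [softmax([scale * dot(f, s) for s in slots]) for f in features]
        attn_maps.append(attn)
        new_slots = []
        for k in range(len(slots)):
            weights = [row[k] for row in attn]
            total = sum(weights) + eps
            # Each slot becomes the attention-weighted mean of the features.
            new_slots.append([
                sum(w * f[d] for w, f in zip(weights, features)) / total
                for d in range(dim)
            ])
        slots = new_slots
    return slots, attn_maps


def self_distillation_loss(attn_first, attn_last, eps=1e-8):
    """Assumed loss: cross-entropy of the first-iteration attention
    against the last-iteration attention, which is treated as the target."""
    total = 0.0
    for p_first, p_last in zip(attn_first, attn_last):
        total -= sum(t * math.log(p + eps) for t, p in zip(p_last, p_first))
    return total / len(attn_first)


random.seed(0)
features = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]
init_slots = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
final_slots, attn_maps = slot_attention(features, init_slots)
loss = self_distillation_loss(attn_maps[0], attn_maps[-1])
```

In a trainable version the last-iteration map would be detached from the gradient so that only the first iteration is pulled toward it; that detail is likewise an assumption here rather than something the abstract states.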

Comments:	Accepted by ACM MM 2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2507.23755 [cs.CV]
	(or arXiv:2507.23755v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2507.23755

Submission history

From: Rongzhen Zhao
[v1] Thu, 31 Jul 2025 17:41:18 UTC (925 KB)
