Research · 2026-06-29

Do Vision Models Truly Forget? New Findings from Representation-Level Certification of Visual Unlearning in Vertical Federated Learning

Originally published by arXiv cs.AI

arXiv:2605.20282v3. Abstract: Machine unlearning in Vertical Federated Learning (VFL) has attracted growing interest, yet existing methods certify forgetting solely using output-level metrics. We challenge these works by introducing Mirage, a representation-level...

The Mirage of Forgetting: Why Representation-Level Unlearning Changes the Game

A new arXiv preprint (2605.20282v3) introduces Mirage, a framework aimed at a blind spot in how machine unlearning is certified in Vertical Federated Learning (VFL). According to the abstract, existing methods certify forgetting solely with output-level metrics, that is, by checking how the model’s final predictions behave once a user’s data has been removed. Mirage argues that this is not enough: even when outputs look “forgotten,” the model’s internal representations can still encode sensitive information about the supposedly deleted data.

The researchers propose a representation-level certification method that directly audits the latent feature spaces shared between VFL parties. Their findings suggest that existing unlearning guarantees in VFL are often illusory—a “mirage” of privacy where data traces persist in hidden layers.
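The gap between the two kinds of check can be illustrated with a toy example. The sketch below is not Mirage's procedure, which the abstract does not spell out. It assumes a hypothetical forget set whose embeddings still carry the class signal, an "unlearned" head that has been zeroed out, and a simple least-squares linear probe as the representation-level check. All names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical forget set: 200 samples, binary labels, 8-dimensional embeddings.
n, dim = 200, 8
labels = rng.integers(0, 2, size=n)
embeddings = rng.normal(size=(n, dim))
# Assumption: the class signal survives along the first embedding axis.
embeddings[:, 0] += 3.0 * (2 * labels - 1)

# Assumed "unlearned" head: all-zero weights, so every output is uniform.
head = np.zeros((dim, 2))
logits = embeddings @ head
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Output-level check: mean probability assigned to the true label.
output_confidence = probs[np.arange(n), labels].mean()

# Representation-level check: a least-squares linear probe on the embeddings,
# fitted on the first half of the forget set and scored on the second half.
half = n // 2
X = np.hstack([embeddings, np.ones((n, 1))])  # add a bias column
targets = 2.0 * labels - 1.0                  # map {0, 1} to {-1, +1}
weights, *_ = np.linalg.lstsq(X[:half], targets[:half], rcond=None)
probe_predictions = (X[half:] @ weights > 0).astype(int)
probe_accuracy = (probe_predictions == labels[half:]).mean()

print(f"output-level confidence on true label: {output_confidence:.2f}")
print(f"linear-probe accuracy on embeddings:   {probe_accuracy:.2f}")
```

By construction the output-level number sits at chance (0.50), which an output-only audit would read as successful forgetting, while the probe still recovers the labels from the embeddings with high accuracy. A real audit would use the actual bottom-model embeddings and a held-out reference set rather than synthetic data.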

Why This Matters

This work strikes at the heart of a growing tension in AI: the demand for “right to be forgotten” compliance (GDPR, CCPA) versus the technical reality of neural networks. In VFL—where multiple parties collaborate without sharing raw data—unlearning is already harder than in centralized settings because data is split across feature spaces. If we cannot trust output-level certification, then:

Regulatory compliance becomes unverifiable. Companies claiming to have removed user data may be legally exposed if representations still leak information.
VFL adoption in sensitive domains (healthcare, finance) faces new risk. Institutions that rely on VFL for privacy-preserving collaboration may discover their protections are weaker than advertised.
The entire unlearning research field needs recalibration. Mirage implies that many published VFL unlearning methods may only achieve “forgetting theater” rather than genuine data removal.

Implications for AI Practitioners

For engineers and researchers building VFL systems, this paper introduces a new quality gate: representation-level auditing should become standard practice. Practically, this means:

Audit your latent spaces. If you deploy VFL with unlearning, you must verify that intermediate representations (not just outputs) are cleansed. This may require new tools—Mirage itself is a start, but production-ready solutions are needed.
Expect higher computational costs. Representation-level certification is more expensive than output checks. Budget for this overhead in compliance pipelines.
Reconsider unlearning guarantees. If your system claims to support data deletion, you may need to retrain from scratch unless you can prove representation-level forgetting. Incremental unlearning methods may be insufficient.
Watch for adversarial leakage. The paper suggests that a malicious VFL party could exploit residual representation information to reconstruct deleted data—a serious security concern.
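The adversarial-leakage concern in the last point can be made concrete with a minimal sketch. It assumes a hypothetical attacker who holds auxiliary records together with their embeddings and who can still observe the embeddings of "deleted" records. The bottom model here (a random linear map followed by tanh) and the ridge-regression decoder are illustrative stand-ins, not the attack studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
raw_dim, emb_dim = 6, 16

# Hypothetical bottom model: a fixed random linear map followed by tanh.
W = 0.2 * rng.normal(size=(raw_dim, emb_dim))

def bottom_model(x):
    """Map raw features to embeddings (illustrative stand-in)."""
    return np.tanh(x @ W)

# Assumption: the attacker has auxiliary raw records and their embeddings.
aux_x = rng.normal(size=(2000, raw_dim))
aux_h = bottom_model(aux_x)

# "Deleted" records whose embeddings are assumed to remain observable.
deleted_x = rng.normal(size=(50, raw_dim))
deleted_h = bottom_model(deleted_x)

# Ridge regression from embeddings back to raw features.
ridge = 1e-2
gram = aux_h.T @ aux_h + ridge * np.eye(emb_dim)
decoder = np.linalg.solve(gram, aux_h.T @ aux_x)
reconstructed = deleted_h @ decoder

# Compare against a trivial baseline that always guesses the auxiliary mean.
attack_mse = np.mean((reconstructed - deleted_x) ** 2)
baseline_mse = np.mean((deleted_x - aux_x.mean(axis=0)) ** 2)

print(f"reconstruction MSE: {attack_mse:.3f}")
print(f"mean-guess MSE:     {baseline_mse:.3f}")
```

If the reconstruction error is well below the mean-guess baseline, the embeddings alone are enough to recover much of the raw record in this toy setting. How far that carries over to a deployed VFL system depends on the real bottom model and on what the attacker can actually observe.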

The core insight is uncomfortable but important: in deep learning, forgetting is not a binary state. Representations can retain “ghosts” of deleted data long after outputs appear clean. Mirage forces us to confront that our current certification standards are too weak.
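One way to treat forgetting as a matter of degree is to score how similar the forget-set representations are before and after unlearning. The sketch below uses linear centered kernel alignment (CKA), a general-purpose representation-similarity measure swapped in here for illustration. It is not claimed to be the certification metric Mirage uses, and the embeddings are synthetic placeholders.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between two (n_samples, n_features) matrices, in [0, 1]."""
    x = x - x.mean(axis=0, keepdims=True)  # center each feature
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(2)
n, dim = 500, 16

# Synthetic stand-in for forget-set embeddings before unlearning.
before = rng.normal(size=(n, dim))
# Hypothetical outcome A: unlearning barely moved the representations.
barely_changed = before + 0.1 * rng.normal(size=(n, dim))
# Hypothetical outcome B: representations unrelated to the originals.
unrelated = rng.normal(size=(n, dim))

score_residual = linear_cka(before, barely_changed)
score_scrubbed = linear_cka(before, unrelated)

print(f"CKA, barely changed: {score_residual:.3f}")
print(f"CKA, unrelated:      {score_scrubbed:.3f}")
```

A score near 1 means the representations of the forget set are essentially unchanged, while a score near 0 means they share little linear structure with the originals. High similarity alone is not proof of leakage, since a model retrained without the data may also embed those samples similarly, so comparing against a retrained-from-scratch reference is likely a more meaningful baseline.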

Key Takeaways

Output-level unlearning certification is insufficient in VFL; representation-level auditing reveals hidden data traces that output checks miss.
Existing VFL unlearning methods may provide only illusory privacy, creating legal and technical risks for regulated industries.
Practitioners must adopt representation-level verification as a new standard, accepting higher computational and engineering costs.
The paper challenges the field to redefine what “forgetting” truly means in distributed, multi-party learning systems.

