Research | 2026-04-23
Image Generators are Generalist Vision Learners
Source: arXiv cs.AI
arXiv:2604.20329v1 | Announce Type: cross
Abstract: Recent works show that image and video generators exhibit zero-shot visual understanding behaviors, in a way reminiscent of how LLMs develop emergent capabilities of language understanding and reasoning from generative pretraining. While it has long...
Tags: arxiv, papers, vision