Research · 2026-05-06
When Audio-Language Models Fail to Leverage Multimodal Context for Dysarthric Speech Recognition
Source: Arxiv CS.AI
arXiv:2605.02782v1 (Announce Type: new)

Abstract: Automatic speech recognition (ASR) systems remain brittle on dysarthric and other atypical speech. Recent audio-language models raise the possibility of improving performance by conditioning on additional clinical context at inference time, but it is...
Tags: arxiv, papers, rag, multimodal