Research 2026-05-14
RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation
Source: arXiv cs.AI
arXiv:2605.13542v1 Announce Type: new Abstract: Intensive care units (ICUs) generate long, dense, and evolving streams of clinical information, and physicians must repeatedly reassess patient states under time pressure, underscoring a clear need for reliable AI decision support. Existing ICU...
arxiv, papers, agents, benchmark