Research 2026-05-11

Evaluating Large Language Models in Scientific Discovery

arXiv:2512.15567v2. Abstract: Large language models (LLMs) are increasingly applied to scientific research, yet prevailing science benchmarks probe decontextualized knowledge and overlook the iterative reasoning, hypothesis generation, and observation interpretation that drive...

Read Original Article on Arxiv CS.AI
