BeClaude
Research 2026-05-12

Magis-Bench: Evaluating LLMs on Magistrate-Level Legal Tasks

Source: Arxiv CS.AI

arXiv:2605.08437v1 Announce Type: cross Abstract: Existing benchmarks for legal AI focus primarily on tasks where LLMs must produce legal arguments or documents, yet the capacity to judge such arguments -- weighing competing claims, applying doctrine to facts, and rendering reasoned...
