Release 2024-08-13
Introducing SWE-bench Verified
Source: OpenAI
We’re releasing a human-validated subset of SWE-bench that more reliably evaluates AI models’ ability to solve real-world software issues.