BeClaude
Release: 2024-10-10

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

Source: OpenAI

We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering.

openai, gpt, agents