Research | 2026-05-01
Optimization before Evaluation: Evaluation with Unoptimised Prompts Can be Misleading
Source: arXiv cs.AI
arXiv:2604.27637v1 | Announce Type: new
Abstract: Current Large Language Model (LLM) evaluation frameworks utilize the same static prompt template across all models under evaluation. This differs from the common industry practice of using prompt optimization (PO) techniques to optimize the prompt for...
arxiv, papers, prompting