Research 2026-04-24
Fairness Evaluation and Inference Level Mitigation in LLMs
Source: arXiv cs.AI
arXiv:2510.18914v4
Abstract: Large language models often display undesirable behaviors embedded in their internal representations, undermining fairness, inconsistency drift, amplification of harmful content, and the propagation of unwanted patterns during extended...