BeClaude
Research 2026-04-24

Fairness Evaluation and Inference Level Mitigation in LLMs

Source: arXiv cs.AI

arXiv:2510.18914v4. Abstract: Large language models often display undesirable behaviors embedded in their internal representations, which undermine fairness and lead to inconsistency drift, amplification of harmful content, and the propagation of unwanted patterns during extended...
