Grounded Auditing as an Evaluation Policy: A Matched-Action Protocol and Stress Test

Under review at NeurIPS 2026

Abstract

Grounding an LLM auditor with retrieved evidence and citation requirements bundles four distinct policy choices: which information the auditor observes, which sources it may rely on, which decisions it can make, and how unsupported outputs are handled. We introduce a matched-action evaluation protocol that formalises grounded auditing as an interface policy I = (O, R, A, G), with one component for each of these choices. On NQ-Open and TriviaQA, evidence access improves correction, whereas citation gates reduce over-trust primarily by inducing abstention rather than by improving correction.
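To make the tuple I = (O, R, A, G) concrete, the following is a minimal illustrative sketch, not the protocol's actual implementation. It assumes that O, R, A, and G stand for observations, reliance sources, actions, and a gate on unsupported outputs, in the order the four choices are listed above. All names (`InterfacePolicy`, `citation_gate`, `apply_policy`, the action labels) are hypothetical. It also assumes one reading of "matched-action": compared conditions share the same action set, so they differ only in the other components.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

# Hypothetical action labels; the paper's actual action set is not given here.
ACTIONS: FrozenSet[str] = frozenset({"accept", "correct", "abstain"})


@dataclass(frozen=True)
class InterfacePolicy:
    """Illustrative container for I = (O, R, A, G)."""
    observations: FrozenSet[str]                 # O: what the auditor observes
    reliance: FrozenSet[str]                     # R: sources it may rely on
    actions: FrozenSet[str]                      # A: decisions it can make
    gate: Callable[[str, Optional[str]], str]    # G: handling of unsupported outputs


def no_gate(action: str, citation: Optional[str]) -> str:
    """Pass every decision through unchanged."""
    return action


def citation_gate(action: str, citation: Optional[str]) -> str:
    """Assumed gate: any non-abstain decision without a citation becomes an abstention."""
    if action != "abstain" and not citation:
        return "abstain"
    return action


def apply_policy(policy: InterfacePolicy, action: str,
                 citation: Optional[str] = None) -> str:
    """Check the action against A, then route it through G."""
    if action not in policy.actions:
        raise ValueError(f"action {action!r} is not in the policy's action set")
    return policy.gate(action, citation)


# Two conditions with identical O, R, and A that differ only in G.
observed = frozenset({"question", "candidate_answer", "retrieved_evidence"})
sources = frozenset({"retrieved_evidence"})
ungated = InterfacePolicy(observed, sources, ACTIONS, no_gate)
gated = InterfacePolicy(observed, sources, ACTIONS, citation_gate)

print(apply_policy(ungated, "accept"))            # unsupported accept passes through
print(apply_policy(gated, "accept"))              # unsupported accept is converted
print(apply_policy(gated, "correct", "doc-3"))    # cited correction passes through
```

In this sketch the gate never changes what a correction says. It only converts unsupported decisions into abstentions, which is the mechanism the abstract attributes to citation gates.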