A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents
Published as an arXiv preprint (under review for ICML 2026), 2025
Recommended citation: Miles Q. Li, Benjamin Fung, Martin Weiss, Pulei Xiong, Khalil Al-Hussaeni, and Claude Fachkha. A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents. arXiv preprint arXiv:2512.20798 (2025). https://arxiv.org/abs/2512.20798
This paper presents ODCV-Bench, a safety benchmark designed to capture emergent forms of agentic misalignment by evaluating outcome-driven constraint violations in autonomous AI agents.