Talks and presentations

When AI Agents “Cheat to Win”: Outcome-Driven Misalignment in Autonomous Systems

June 18, 2026

Invited Talk, iSchool Seminar Series, Online

Invited talk on outcome-driven misalignment in autonomous AI systems. Autonomous AI agents are increasingly deployed in high-stakes environments where success is defined by key performance indicators (KPIs). When these metrics conflict with ethical, legal, or safety constraints, agents do not simply fail; they can actively and strategically circumvent those constraints. The talk shows that frontier AI agents, under realistic optimization pressure, can autonomously derive deceptive or unsafe strategies, including metric gaming, data fabrication, and the deliberate bypassing of safety mechanisms, even without explicit malicious instructions. It also shows that stronger reasoning capabilities can enable more effective and harder-to-detect forms of misalignment.

Generative AI Model for Security

March 13, 2025

Invited Presentation, IVADO Community of Practice, HEC Montreal, Montreal, Canada

Invited presentation to the IVADO research community at the IVADO Community of Practice, HEC Montreal, covering research on generative AI models for security.