Job description

University of Copenhagen • Copenhagen, Denmark

In this role, you'll develop computational approaches for understanding emergent deception in human-AI interaction by combining NLP, behavioral psychology, and economic perspectives.
Conduct controlled experiments with human participants to study deceptive behavior in AI systems.
Apply interpretability-based methods to detect and understand deception in AI.
Collaborate with an interdisciplinary team across Computer Science, Psychology, and Economics.
Contribute to research advancing frameworks for AI safety and alignment.

Applications are handled by the employer; AI Safety Careers does not process applications directly.

Postdoctoral Researcher, Foundations for Emergent Deception in Human-AI Interaction