Job description
AI Security Institute • UK
About the AI Security Institute
The AI Security Institute is the world's largest and best-funded team dedicated to understanding advanced AI risks and translating that knowledge into action. We’re in the heart of the UK government with direct lines to No. 10 (the Prime Minister's office), and we work with frontier developers and governments globally.
We’re here because governments are critical for advanced AI going well, and UK AISI is uniquely positioned to mobilise them. With our resources, unique agility and international influence, this is the best place to shape both AI development and government action.
The deadline for applying to this role is Sunday 7th June 2026, end of day, anywhere on Earth.
**Team Description **
The ability to effectively evaluate and monitor AI systems will grow in importance as models become more capable, autonomous, and integrated into society. If models can detect and game evaluations, obscure their reasoning, or behave differently under observation, the safety claims that governments and developers rely on become unreliable. Understanding and addressing these risks is essential to ensuring that oversight of advanced AI systems keeps pace with their capabilities.
The Model Transparency team is a research team within AISI focused on ensuring that evaluations, assessments, and monitoring of frontier AI systems remain reliable as models become less transparent. We research how and why oversight is declining – through phenomena such as evaluation awareness, unfaithful chain-of-thought reasoning, and changes in model architectures – and develop methods (including white and black box methods) to detect, measure, and mitigate potential issues. We share our findings with frontier AI companies (including Anthropic, OpenAI, DeepMind), UK government officials, and allied governments, and publicly to inform their deployment, research, and policy decisions. We also work directly with safety teams at frontier labs, contributing to safety case reviews and helping improve their alignment evaluation methodology.