AI Safety Careers
Curated jobs in AI safety, governance and frontier AI.

PhD Studentship, Monitoring and Increasing LLM Safety

Cambridge University, Department of Engineering · Added today

Applications are handled by the employer on an external website. AI Safety Careers does not process applications directly.

AI Safety & Alignment

Cambridge, UK

In this studentship, you'll pursue a PhD exploring large language model (LLM) safety through mechanistic interpretability and behavioural research.

Investigate Chain-of-Thought (CoT) faithfulness and detect deceptive behaviour via perturbation methods and mechanistic analysis.

Monitor LLM behaviour at inference time and develop risk reduction strategies.

Either apply perturbation techniques to test CoT meaning, or train models for transparency using human predictor evaluation.

Collaborate with your supervisor to define research direction after completing initial 1.5-year projects.

This listing may be aggregated from a public source or submitted by a third party. If you represent this employer and would like to update or remove this listing, contact support@aisafetycareers.com.
