Job description

OpenAI • San Francisco, CA

San Francisco Bay Area

$207,000 - $295,000

In this role, you'll define how AI models should behave in high-risk contexts, translating safety concerns into behavioral specifications.
Design and maintain model policies across safety domains including dual-use, agentic, and emerging risk areas.
Partner with research, engineering, and product teams to operationalize policies into scalable model behavior.
Use red-teaming results and deployment data to identify emerging risks and improve policy quality.
Design human data campaigns to measure and improve policy adherence.

OpenAI is a frontier AI research and product company, with teams working on alignment, policy, and security. Our career review on working at a frontier AI company discusses concerns about doing harm in such roles, including concerns about OpenAI in particular.

Our Take On This Role: We have concerns about OpenAI's track record on safety and responsible development and recommend almost no roles at OpenAI. Nonetheless, it is possible that OpenAI will create AGI in the next decade, in which case safety and security work at the company could be extremely important. If you receive a job offer from OpenAI, consider contacting us for career advice.

Model Policy