Job description
Mistral AI • London / Paris
About Mistral
About the role
As a Model Behavior Architect on the Function Calling team, you are at the forefront of defining and measuring how LLMs use tools, invoke functions, and orchestrate complex agentic workflows.
We are looking for people who have built careers in engineering, machine learning, and large language models, and who are experts in model evaluation, policy writing, and building eval pipelines for tool use and function calling. You will work hand-in-hand with our Science team to define what 'good' looks like for function calling, from accurate parameter selection and schema adherence to multi-step tool orchestration, error recovery, and agentic reasoning.
Join us if you are passionate about tackling cutting-edge, open-ended research challenges and transforming your insights into best-in-class models.
What you will do
- Interact with models to identify where function calling and tool use behaviour can be improved
- Gather internal and external feedback on tool-calling behaviour to scope areas for improvement
- Design and implement evals, data guidelines, data generation, and synthetic tool environments and APIs
- Identify and fix edge-case behaviours (such as malformed arguments, hallucinated functions, and incorrect tool selection) through rigorous testing