Job description
TL;DR
Engineering Manager, Agent Prompts & Evals (AI): Lead the team that owns the infrastructure for shipping model and prompt changes with confidence, including eval frameworks, system prompt pipelines, and regression-detection systems. The role focuses on measuring model behavior, building collaboration with other teams, and shaping the team's investment in frontier eval development and model launch automation.
Location: San Francisco, CA or New York City, NY. All staff are expected to be in one of our offices at least 25% of the time.
Salary: $1 - $2 USD
Company
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.
What you will do
- Lead and grow a team of prompt engineers and platform software engineers.
- Own the product-side eval platform and system prompt infrastructure, including versioning, deployment, rollback, and review tooling.
- Be a steady hand through model launches, serving as the backstop when things get chaotic.
- Build durable collaboration with other evals groups across the company, focusing on ownership boundaries and shared roadmaps.
- Recruit, close, and retain engineers who want to work at the intersection of product engineering and model behavior.
- Shape where the team invests next, considering paths into frontier eval development, model launch automation, and deeper prompt engineering support.
Requirements
- 8+ years in software engineering with 3+ years managing engineering teams, including experience leading a platform, infra, or developer-tooling team.
- A track record of building tooling and processes that make it easy for other teams to do the right thing.
- Comfort managing a team with a mixed charter: platform ownership, service-to-other-teams, and a launch-driven operational rhythm.
- Enough technical depth to engage on system design, review pipeline architecture, and be credible in debates with strong ICs.
- A product mindset and willingness to wear multiple hats when the work calls for it.
- Demonstrated ability to build and maintain peer relationships with partner orgs, negotiating ownership and aligning roadmaps.
Nice to have
- Prior exposure to LLM evals, ML experimentation platforms, or model quality work.
- Experience with A/B testing infrastructure, feature flagging, or gradual rollout systems.
- Background in devtools, CI/CD platforms, or testing infrastructure at scale.
- A history of managing teams that sit between two larger orgs and making that position an asset rather than a liability.
- Interest in AI safety and alignment.
Culture & Benefits
- Competitive compensation and benefits.
- Optional equity donation matching.
- Generous vacation and parental leave.
- Flexible working hours.
- Lovely office space in which to collaborate with colleagues.