Posted 3 months ago

Engineering Manager, Agent Prompts & Evals (AI)

$1 - $2
Work format
hybrid
Employment type
full-time
Grade
lead
English
B2
Country
US

Job description

TL;DR

Engineering Manager, Agent Prompts & Evals (AI): Leading the team that owns the infrastructure for shipping model and prompt changes with confidence, including eval frameworks, system prompt pipelines, and regression-detection systems. Focus on measuring model behavior, building collaboration with other teams, and shaping the team's investment in frontier eval development and model launch automation.

Location: San Francisco, CA or New York City, NY. All staff are expected to be in one of the company's offices at least 25% of the time.

Salary: $1 - $2 USD

Company

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.

What you will do

  • Lead and grow a team of prompt engineers and platform software engineers.
  • Own the product-side eval platform and system prompt infrastructure, including versioning, deployment, rollback, and review tooling.
  • Be a steady hand through model launches, serving as the backstop when things get chaotic.
  • Build durable collaboration with other evals groups across the company, focusing on ownership boundaries and shared roadmaps.
  • Recruit, close, and retain engineers who want to work at the intersection of product engineering and model behavior.
  • Shape where the team invests next, considering paths into frontier eval development, model launch automation, and deeper prompt engineering support.

Requirements

  • 8+ years in software engineering with 3+ years managing engineering teams, including experience leading a platform, infra, or developer-tooling team.
  • A track record of building tooling and processes that make it easy for other teams to do the right thing.
  • Comfort managing a team with a mixed charter: platform ownership, service-to-other-teams, and a launch-driven operational rhythm.
  • Enough technical depth to engage on system design, review pipeline architecture, and be credible in debates with strong ICs.
  • A product mindset and willingness to wear multiple hats when the work calls for it.
  • Demonstrated ability to build and maintain peer relationships with partner orgs, negotiating ownership and aligning roadmaps.

Nice to have

  • Prior exposure to LLM evals, ML experimentation platforms, or model quality work.
  • Experience with A/B testing infrastructure, feature flagging, or gradual rollout systems.
  • Background in devtools, CI/CD platforms, or testing infrastructure at scale.
  • A history of managing teams that sit between two larger orgs and making that position an asset rather than a liability.
  • Interest in AI safety and alignment.

Culture & Benefits

  • Competitive compensation and benefits.
  • Optional equity donation matching.
  • Generous vacation and parental leave.
  • Flexible working hours.
  • Lovely office space in which to collaborate with colleagues.
