Posted 23 hours ago

AI Validation and Benchmarking Engineer (Student)

$30/hour
Work format: hybrid
Employment type: part-time
Grade: trainee
English: B2
Country: US
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

AI Validation and Benchmarking Engineer (Student): Contribute to the development of autonomous robotaxi technology by maintaining benchmark pipelines and evaluating AI agent performance, with an emphasis on data accuracy and failure mode analysis. The role focuses on building evaluation infrastructure, automating data collection, and improving validation metrics for complex AI systems.

Location: Must be able to work on-site in Foster City, CA

Compensation: $30/hour

Company

Zoox is an autonomous ride-hailing company building purpose-built, fully electric robotaxis designed to make transportation safer and more accessible.

What you will do

  • Run and maintain benchmark pipelines to identify routing errors and regressions.
  • Expand ground truth datasets to evaluate agent outputs against known-correct answers.
  • Develop new evaluation dimensions including label accuracy and structured output correctness.
  • Investigate failure modes in agent outputs and collaborate with engineers on improvements.
  • Automate data collection, result parsing, and metric reporting through scripts and tooling.
  • Document findings and present benchmark trends to the engineering team.

Requirements

  • Currently enrolled in a B.S. or M.S. program in Computer Science, Data Science, Engineering, or a related field.
  • Available to work on-site at the designated office location.
  • Commitment of at least 20 hours per week for a minimum of three months.
  • Proficiency in Python and experience modifying reproducible analysis scripts.
  • Understanding of evaluation concepts such as precision, recall, F1 score, and confusion matrices.
  • Comfortable working with structured data formats like CSV and JSON.

Nice to have

  • Prior exposure to LLM-based systems, prompt engineering, or AI agent evaluation.
  • Experience with collaboration tools like Jira or Slack.

Culture & Benefits

  • Opportunity to work on real-world autonomous vehicle projects alongside experienced researchers.
  • Hands-on professional experience in a high-impact engineering environment.
  • Flexible scheduling designed to complement academic studies.
  • Exposure to cutting-edge AI and robotics technology.
