Posted 4 days ago

Member of Technical Staff - Research Engineer (AI)

$180,000 - $290,000
Work format
remote/hybrid
Employment type
full-time
Grade
middle/senior
English
B2
Country
US/Germany

Job description

TL;DR

Member of Technical Staff - Research Engineer (AI): Developing and optimizing large-scale training systems for multimodal generative models, with an emphasis on GPU performance, numerical stability, and distributed training. The role centers on implementing custom kernels, building low-precision training paths, and debugging complex distributed training failures to enable frontier research.

Location: San Francisco (USA) or Freiburg (Germany). Hybrid (at least 2 days a week) or remote with a required monthly in-person week.

Salary: $180,000 - $290,000 + equity

Company

A frontier research lab behind foundational technologies like Stable Diffusion and FLUX, creating advanced generative models for images and video.

What you will do

  • Optimize the performance, reliability, and numerical stability of production training runs for large multimodal generative models.
  • Profile full training steps across model code, attention, kernels, data loading, and communication.
  • Implement GPU-level optimizations using CUDA, Triton, CuTe, and CUTLASS.
  • Develop and validate low-precision training paths including FP8, MXFP8, and FP4-style formats.
  • Debug distributed training failures such as NaNs, loss spikes, and NCCL issues.
  • Build benchmarking and profiling harnesses to validate performance across various hardware and configurations.

Requirements

  • Deep experience with large-scale training systems and strong PyTorch fluency.
  • Proficiency in distributed training concepts (FSDP, tensor/model parallelism, NCCL).
  • Hands-on experience improving training throughput, memory footprint, or stability.
  • Experience profiling GPU workloads with Nsight Systems, Nsight Compute, or torch profiler.
  • Understanding of low-precision training and quantization tradeoffs (FP8, FP4).
  • Must be based in or able to travel to San Francisco or Freiburg for monthly in-person weeks.

Nice to have

  • Experience co-owning training for a shipped frontier foundation model.
  • Proven ability to write or substantially improve forward/backward GPU kernels.
  • Experience with Hopper or Blackwell-class GPUs.
  • Background in diffusion, flow matching, DiT, or LLM training systems.

Culture & Benefits

  • Distributed team with physical offices in SF and Freiburg.
  • Company covers reasonable travel costs for required in-person weeks.
  • Culture based on scientific obsession, low ego, boldness, and kindness.
  • Equity compensation provided alongside base salary.
