2 days ago
Lead Machine Learning Engineer (Inference & Performance)
Job description
TL;DR
Lead Machine Learning Engineer (Inference & Performance) (AI): Build and optimize production LLM serving with an emphasis on throughput, latency, and GPU utilization. Focus on engineering inference and training performance using vLLM/SGLang, profiling bottlenecks, and deploying multiple models at scale on shared GPU clusters with Kubernetes.
Location: Remote
Company
builds AI products and platforms.
What you will do
- Optimize inference by building and tuning production LLM serving with vLLM and SGLang to maximize throughput and minimize latency.
- Profile and accelerate training/inference runs by instrumenting workloads, identifying bottlenecks, and applying the right attention implementations (e.g., FlashAttention) for the target hardware.
- Engineer for hardware by applying knowledge of GPU architecture and attention internals to select the right approach for each accelerator (H200, GB200).
- Serve at scale by deploying and operating multiple models on shared GPU clusters on GKE with autoscaling, bin-packing, and mixed-workload handling.
- Drive efficiency by owning GPU utilization as a first-class metric and improving throughput-per-dollar.
- Collaborate with clients to translate performance, latency, and cost requirements into serving and training architectures.
Requirements
- 5+ years of ML/AI engineering experience with a meaningful focus on performance, infrastructure, or systems.
- Proven experience deploying and optimizing models in production.
- Demonstrated experience profiling and improving GPU utilization for training and/or inference.
- Strong Kubernetes (GKE) experience deploying and autoscaling multiple models on shared GPU clusters.
- Mastery of Python and shell scripting; comfort reading and reasoning about CUDA-adjacent performance code is a strong plus.
- Knowledge of data engineering and SQL.
Culture & Benefits
- Remote work setup.
- Ownership-driven approach from profiling through production optimization.
- Rigor: measure before optimizing and use data to guide engineering effort.
- Consultative collaboration with clients to connect technical performance to business value.
- Emphasis on responsible AI development and data privacy.