Job description
TL;DR
Software Engineer (ML Infrastructure): Building and scaling a serverless LLM post-training platform (Cortex Training), with an emphasis on distributed systems, GPU orchestration, and multi-node training. The focus is on optimizing throughput, ensuring fault tolerance, and productionizing state-of-the-art research into enterprise-scale components.
Location: US-WA-Bellevue
Compensation: $160K – $230K
Company
Snowflake is a data cloud company powering the era of the agentic enterprise.
What you will do
- Design and build the full stack from public training APIs and SDKs to the GPU data plane.
- Scale distributed systems for serverless GPU compute, including multi-tenant scheduling and capacity-aware routing.
- Optimize end-to-end performance for training, inference, and RL loops to keep GPUs saturated.
- Partner with Snowflake Research to productionize state-of-the-art training and inference techniques into reliable components.
Requirements
- 5+ years building and shipping production ML systems.
- Strong foundation in distributed systems, including designing fault-tolerant services on Kubernetes.
- Familiarity with GPU and LLM infrastructure (e.g., PyTorch, DeepSpeed/FSDP, Ray, CUDA/NCCL, vLLM).
- Demonstrated ability to harden complex systems for reliability and cost efficiency.
- BS in Computer Science or a related field.
Nice to have
- Hands-on LLM post-training or modeling experience.
- MS or PhD in Computer Science or a related field.