Job description
TL;DR
Software Engineer (AI Inference): Building a distributed runtime for large-scale LLM inference with an emphasis on orchestration, model performance, and developer experience. Focus on designing routing, autoscaling, and scheduling systems for GPU workloads on Kubernetes.
Location: Hybrid in San Francisco
Compensation: $180K – $360K + Equity
Company
Baseten powers mission-critical inference for leading AI companies by uniting research, flexible infrastructure, and seamless developer tooling.
What you will do
- Develop infrastructure and orchestration systems for deploying and managing large-scale distributed LLM inference.
- Build platform capabilities related to routing, autoscaling, scheduling, observability, and runtime management.
- Improve the reliability, scalability, and usability of the inference stack.
- Collaborate with Model Performance engineers to make new inference optimizations broadly available.
- Debug complex production systems spanning Kubernetes, distributed runtimes, and GPU workloads.
- Own projects end-to-end from architecture and implementation to monitoring and iteration.
Requirements
- Bachelor's, Master's, or Ph.D. in Computer Science, Engineering, or a related field.
- Strong background in distributed systems, backend infrastructure, or platform engineering.
- Experience building and operating production systems where reliability, latency, and scale are critical.
- Ability to debug complex systems across multiple layers of the stack.
- Must be based in or able to work hybrid in San Francisco.
Nice to have
- Experience with Kubernetes operators and custom resources.
- Prior work with vLLM, SGLang, TensorRT-LLM, or Dynamo.
- Experience operating GPU workloads in production.
- Experience contributing to open-source infrastructure or ML systems.
Culture & Benefits
- Competitive compensation with meaningful equity grants.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Flexible PTO policy including a company-wide Winter Break.
- Paid parental leave and fertility/family-building stipend through Carrot.
- Company-facilitated 401(k).