7 мСсяцСв Π½Π°Π·Π°Π΄

Senior Software Engineer (AI)

$200,000 - $275,000
Work format
onsite
Employment type
fulltime
Grade
senior
English
b2
Country
US
Вакансия ΠΈΠ· списка Hirify.GlobalВакансия ΠΈΠ· Hirify Global, списка ΠΌΠ΅ΠΆΠ΄ΡƒΠ½Π°Ρ€ΠΎΠ΄Π½Ρ‹Ρ… tech-ΠΊΠΎΠΌΠΏΠ°Π½ΠΈΠΉ
Для мэтча ΠΈ ΠΎΡ‚ΠΊΠ»ΠΈΠΊΠ° Π½ΡƒΠΆΠ΅Π½ Plus

ΠœΡΡ‚Ρ‡ & Π‘ΠΎΠΏΡ€ΠΎΠ²ΠΎΠ΄

Для мэтча с этой вакансиСй Π½ΡƒΠΆΠ΅Π½ Plus

ОписаниС вакансии

TL;DR

Senior Software Engineer (AI): Build and optimize distributed training infrastructure and scalable pipelines for large-scale foundation models, with an emphasis on GPU utilization, training performance, and model adaptation. The role focuses on designing and implementing efficient training systems, collaborating cross-functionally, and advancing scalable AI model training technology.

Company

Baseten powers inference for leading AI companies by uniting applied AI research, flexible infrastructure, and developer tooling, backed by $150M Series D funding.

What you will do

  • Design, build, and maintain distributed training infrastructure for foundation models
  • Implement scalable pipelines for fine-tuning and training on heterogeneous GPU clusters
  • Optimize training performance using advanced techniques like FSDP, DDP, ZeRO, and mixed precision
  • Develop frameworks and tooling to improve training workflow efficiency and reproducibility
  • Collaborate with product and infrastructure teams to meet customer needs
  • Research and productionize emerging training efficiency techniques

Requirements

  • 5+ years of experience in ML infrastructure or distributed systems, including 2+ years in a tech lead or manager role
  • Strong expertise in distributed training frameworks and GPU utilization
  • Bachelor's degree or equivalent experience in Computer Science or a related field
  • Excellent communication skills bridging technical and business needs
  • Location: San Francisco or New York

Nice to have

  • Experience building APIs, SDKs, or developer tools for ML workflows
  • Familiarity with cluster management and scheduling tools
  • Knowledge of parameter-efficient fine-tuning methods and evaluation pipelines
  • Open-source contributions in distributed training or ML infrastructure
  • Experience with cloud environments and container orchestration

Culture & Benefits

  • Competitive compensation with meaningful equity
  • Full medical, dental, and vision insurance coverage
  • Generous PTO including company-wide Winter Break
  • Paid parental leave and 401(k) plan
  • Exposure to diverse ML startups and networking opportunities
