Manager, Software Engineering (ML Inference)
Job description
TL;DR
Manager, Software Engineering (ML Inference): Lead and mentor a team of ML infrastructure engineers who build and scale systems for model training, inference, and data pipelines, with an emphasis on high availability, distributed systems, and operational excellence. The role focuses on setting technical strategy, driving high-impact infrastructure initiatives, and growing high-performing engineering teams.
Location: Must be based in the US and work from an office 4+ days per week in Bellevue, Los Angeles, or Palo Alto.
Company
Snap Inc. is a technology company focused on visual messaging and camera-based products that empower people to express themselves and connect with the world.
What you will do
- Lead and mentor a team of ML infrastructure engineers responsible for scaling model training and inference systems.
- Define technical strategy, build roadmaps, and establish measurable goals for ML infrastructure initiatives.
- Perform design and code reviews to maintain high standards for technical excellence, security, and performance.
- Collaborate with ML engineers and cross-functional stakeholders to deliver scalable solutions.
- Hire, retain, and develop high-performing engineers through regular feedback and growth opportunities.
- Advocate for best practices in availability, scalability, and cost management.
Requirements
- Bachelor's degree in a technical field or equivalent experience.
- 9+ years of software engineering experience (or 8+ years with a Master's, 5+ years with a PhD).
- 1+ year of experience managing an engineering team.
- Strong understanding of ML infrastructure, including inference serving, feature stores, and data pipelines.
- Experience with distributed systems and large-scale ML infrastructure.
- Must be able to work in the office 4+ days per week.
Nice to have
- Advanced degree in a related technical field.
- Experience with ML frameworks like TensorFlow, PyTorch, or Spark ML.
- Familiarity with big data technologies such as Spark, Flink, or Ray.
- Experience with infrastructure tools including Kubernetes, NoSQL, Redis, Kafka, and cloud platforms (GCP/AWS).
- Experience with MLOps and production machine learning lifecycles.
Culture & Benefits
- Comprehensive medical coverage and emotional/mental health support programs.
- Paid parental leave.
- Compensation packages designed to let employees share in the company's long-term success.
- Commitment to diversity, equity, and inclusion as an equal opportunity employer.
- Collaborative environment emphasizing precision, privacy, and fast-paced execution.