Research Engineer, Infrastructure, Inference (AI)
Job description
TL;DR
Research Engineer, Infrastructure, Inference (AI): Design, optimize, and scale the systems that power large AI models, with an emphasis on performant and efficient model inference. Collaborate with researchers to improve the performance, latency, and reliability of AI infrastructure.
Location: San Francisco, California
Compensation: $350,000 - $475,000 USD
Company
Thinking Machines Lab empowers humanity through advancing collaborative general intelligence.
What you will do
- Work alongside researchers and engineers to bring cutting-edge AI models into production.
- Collaborate with research teams to enable high-performance inference for novel architectures.
- Design and implement new techniques, tools, and architectures that improve performance, latency, throughput, and efficiency.
- Optimize our codebase and compute fleet (e.g., GPUs) to fully utilize hardware FLOPs, bandwidth, and memory.
- Extend orchestration frameworks (e.g., Kubernetes, Ray, SLURM) for distributed inference, evaluation, and large-batch serving.
- Publish and share learnings through internal documentation, open-source libraries, or technical reports.
Requirements
- Bachelor's degree or equivalent experience in computer science, engineering, or a similar field.
- Understanding of deep learning frameworks (e.g., PyTorch, JAX) and their underlying system architectures.
- Experience with inference serving systems optimized for throughput and latency.
- Strong engineering skills, including the ability to contribute performant, maintainable code and to debug in complex codebases.