Company hidden
Updated 7 days ago

AI Senior Staff Systems Engineer (AI Infrastructure)

$136,500 – $253,500
Work format: onsite
Employment type: full-time
Grade: senior
English: B2
Country: US

Job description

TL;DR

AI Senior Staff Systems Engineer (AI Infrastructure): Lead the development, operations, and support of enterprise AI infrastructure, with an emphasis on high-performance GPU clusters and LLM deployment. The role focuses on architecting next-generation AI systems, optimizing inference throughput with vLLM/TGI, and building production-grade Agentic AI workflows.

Location: San Jose, CA

Salary: $136,500 – $253,500

Company

hirify.global is a leading provider of intelligent system design software, enabling the development of the world's most advanced electronic products.

What you will do

  • Design and implement next-generation AI infrastructure to support Agentic AI initiatives.
  • Lead the configuration, installation, and optimization of on-premise GPU server clusters and storage solutions.
  • Integrate and secure public cloud AI services, specifically Azure OpenAI and Google Cloud Platform (GCP) services like Gemini.
  • Deploy and optimize Large Language Models (LLMs) using techniques like quantization and frameworks such as vLLM, TGI, and TensorRT-LLM.
  • Architect production-grade Agentic AI workflows that integrate LLMs with external tools, APIs, and databases.
  • Develop automation scripts in Python, Bash, or Perl and implement monitoring for GPU utilization and system health.

Requirements

  • 10+ years of technical experience, with at least 5 years focused on HPC or AI infrastructure.
  • Expert-level knowledge of NVIDIA GPU architecture, CUDA, and cuDNN.
  • Proven experience managing access, usage, and billing for Azure OpenAI and GCP AI services.
  • Extensive hands-on experience with Docker, Kubernetes, and Linux system administration (RHEL preferred).
  • Proficiency in scripting languages such as Python, Bash, or Perl.
  • Must be based in or able to work from San Jose, CA.

Nice to have

  • Experience administering LSF clusters in production or research environments (Slurm is a plus).
  • Understanding of AI job profiling and tuning (memory, GPU, I/O).
  • Experience with macOS/Apple Silicon system administration and troubleshooting.

Culture & Benefits

  • Competitive compensation package including bonus and equity.
  • Comprehensive health benefits: medical, dental, and vision plan options.
  • Financial security through a 401(k) plan with employer match and an employee stock purchase plan.
  • Generous time off including paid vacation and paid holidays.
