Company hidden
1 day ago

Multimodal ML Engineer (AI)

$120,000 - $250,000
Work format
hybrid
Employment type
full-time
Grade
senior
English
B2
Country
France/UK
Relocation
France
Vacancy from Hirify Global, a list of international tech companies

Job description

TL;DR

Multimodal ML Engineer (PyTorch/Multimodal AI): Train and ship vision, audio, video, and speech models for an AI safety platform, with an emphasis on large-scale multimodal architecture and production optimization. The role focuses on building alignment pipelines, optimizing MoE architectures for efficient inference, and designing evaluation metrics for complex multimodal reasoning.

Location: Must be based in or able to relocate to Paris or London (Hybrid)

Salary: $120K – $250K + Equity

Company

hirify.global is an AI Safety company building the safety, reliability, and optimization layer for AI systems.

What you will do

  • Train and fine-tune large-scale multimodal models (vision, audio, speech) from scratch and from pretrained checkpoints.
  • Extend models across modalities, including image understanding, video temporal modeling, and streaming audio.
  • Design and execute experiments focusing on architecture changes, data mixes, and training recipes.
  • Build and maintain multimodal data pipelines and generate synthetic data for training.
  • Optimize MoE architectures for inference and deploy models end-to-end from research to production.
  • Define critical evaluation metrics and benchmarks for visual QA, spatial reasoning, and audio understanding.

Requirements

  • 3+ years of experience training large-scale deep learning models in multimodal domains.
  • Strong PyTorch skills with hands-on distributed training experience (DeepSpeed, FSDP).
  • Deep understanding of multimodal architectures such as LLaVA, Qwen-VL, Whisper, or similar.
  • Experience with RLHF/alignment techniques (GRPO, DPO, reward modeling) for multimodal data.
  • Proven track record of shipping optimized models to production with a focus on latency targets.
  • Must be based in or able to relocate to Paris or London.

Nice to have

  • Understanding of audio signal processing fundamentals, including spectrograms and mel features.

Culture & Benefits

  • Paid time off in accordance with local regulations.
  • Relocation package available for candidates moving to Paris.
  • Comprehensive medical insurance for the France-based team.
  • Full provision of hardware, tools, and paid subscriptions for AI agents and IDEs.
  • Bi-annual team off-sites (e.g., Alps, Saint-Tropez).

Hiring process

  • Introductory HR call (25 min).
  • Take-home technical task.
  • Technical interview with the Head of Applied Research (60 min).
  • Final conversation with the CEO (45 min).
