Updated 4 days ago

Data Scientist (AI)

$210,000–$385,000
Work format
hybrid
Employment type
fulltime
Grade
senior
English
b2
Country
UK/US/Serbia
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Data Scientist (AI): Architect and maintain automated evaluation pipelines that assess answer quality for an LLM-first search engine, with an emphasis on designing evaluation sets for tool calls and developing VLM-based solutions that evaluate visual rendering. The role also involves continuous review of public benchmarks, and its evaluation metrics directly shape product changes.

Location: Hybrid in London, New York City, or Belgrade. USD salary ranges apply only to U.S.-based positions. International salaries are set based on the local market.

Salary: $210,000–$385,000

Company

Perplexity serves tens of millions of users daily with a reliable, high-quality LLM-first search engine and specialized data sources.

What you will do

  • Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products.
  • Design evaluation sets and methods specifically to measure the impact of tool calls on final answer quality.
  • Develop VLM-based solutions to programmatically evaluate how final answers render visually across platforms and devices.
  • Continuously review and incorporate public benchmarks into regular performance measurements.
  • Collaborate closely with technical leadership to measure and improve Answer Quality.

Requirements

  • PhD or MS in a technical field or equivalent experience.
  • 4+ years of experience in data science or machine learning.
  • Strong proficiency in Python and SQL (expected to write production-grade code).
  • Experience building within a modern cloud data stack, specifically AWS and Databricks.
  • Comfortable with agentic coding workflows and using AI-assisted development tools.

Nice to have

  • 1+ years of experience working with LLMs at scale, specifically with LLM-as-a-judge setups.
  • Prior experience working on customer-facing web products or consumer apps, with real user traffic at scale.
  • A strong research background, with experience applying research methods to real-world ML problems.
  • Experience defining evaluation metrics and building ground truth datasets.

Culture & Benefits

  • Comprehensive benefits program including equity, health, dental, vision, retirement, fitness, commuter, and dependent care accounts for U.S. employees.
  • Full-time employees outside the U.S. enjoy a comprehensive benefits program tailored to their region of residence.
  • Operate within a small, high-impact team.
  • Evaluation metrics directly shape product changes.
