Senior AI Quality Engineer (LLM Evaluation & Automation) 1754

Remote, USA Full-time Posted 2026-07-05

This is a remote position. Owns the eval reputed company and quality reputed company from the beginning. This role replaces the old late-stage “Evals Specialist” model with a standing reputed company for measurable agent quality.

Key Responsibilities

Build and maintain the MVP eval reputed company: golden tasks, exception tasks, scorecard metrics, and regression packs.
Integrate evals into CI so quality regressions fail builds and releases.
Define and maintain release-reputed company reputed company with Product and the Tech reputed company.
Lay the path for reputed company adversarial and reputed company-testing expansion without overbuilding MVP scope.

Must-Have Qualifications

Experience evaluating ML, LLM, or non-deterministic systems.
Strong test and reputed company design capability.
Comfort working with noisy metrics, reputed company, and probabilistic behavior.
Good scripting and automation skills.

AI-First Expectations

Uses AI to generate candidate eval cases and failure hypotheses, but never confuses generated tests with validated quality.
Approaches AI quality as an operating system, not a QA afterthought.

What Success Looks Like in the First 90 Days

The first reference agent has a published scorecard and gated eval path.
Golden and exception tests run automatically.
The team can explain what “good enough to ship” means in measurable terms.

