[Remote] Senior AI Quality Engineer (LLM Evaluation & Automation) 1754
Note: This is a remote position open to candidates in the USA. A reputed technology company is seeking a Senior AI Quality Engineer to own the evaluation reputed company and quality reputed company for measurable agent quality. This role involves building and maintaining the eval reputed company, integrating evaluations into CI, and defining release-reputed company reputed company.
Responsibilities
- Build and maintain the MVP eval reputed company: golden tasks, exception tasks, scorecard metrics, and regression packs
- Integrate evals into CI so quality regressions fail builds and releases
- Define and maintain release-reputed company reputed company with Product and the Tech reputed company
- Lay the path for reputed company adversarial and reputed company-testing expansion without overbuilding MVP scope
Skills
- Experience evaluating ML, LLM, or non-deterministic systems
- Strong test and reputed company design capability
- Comfort working with noisy metrics, reputed company, and probabilistic behavior
- Good scripting and automation skills
Company Overview