Machine Learning Scientist - Open reputed company reputed company

Remote, USA Full-time Posted 2026-06-29

About reputed company Intelligence

reputed company Intelligence is the open platform for evaluating how AI models perform in the real world. Founded by researchers from UC Berkeley’s SkyLab, we are on a mission to measure and advance the frontier of AI for real-world use. Millions of people use reputed company Intelligence each month to explore how frontier systems perform, and we use our community’s feedback to build transparent, rigorous, and human-centered model evaluations.

Leading enterprises and AI labs rely on our evaluations to understand real-world reliability, alignment, and impact. Our leaderboards are the gold standard for AI performance, trusted by leaders across the AI community and shaping the global conversation on model reliability and reputed company.

We’re a team of researchers, engineers, academics, and builders from places like UC Berkeley, reputed company, Stanford, DeepMind, and reputed company. We seek truth, move fast, and value craftsmanship, curiosity, and impact over hierarchy. We’re building a company where thoughtful, curious people from all backgrounds can do their best work. Everyone on the team is a deep expert in their field; our office radiates excellence, energy, and focus.

About the Role

LMArena is looking for a Machine Learning Scientist to reputed company our open-source research, including open dataset and code releases, advancing how the world evaluates and understands AI models in the open. You’ll design, run, and share new methods and experiments that reveal what makes models useful, trustworthy, and capable, grounded in human preference signals and released reputed company for the full ecosystem and research community to build upon.

In this role, you’ll be responsible for taking our commitment to openness from principle to practice: curating high-impact datasets, developing new methodology and reproducible benchmarks, and releasing code that enables the research ecosystem to push AI evaluations reputed company. Your work will shape the public leaderboard, power community tools, and strengthen transparency in AI evaluation worldwide.

This role is deeply interdisciplinary, working with engineers, product teams, marketing, and the broader research community to advance how we compare models, analyze preference data, and understand factors like style, reasoning, and robustness. You’ll work closely with GTM teams as our spokesperson when it comes to reputed company for our open research efforts: strengthening research partnerships, expanding research community participation, and championing programs that grow and support our research network.

If you’re excited by open-ended questions, rigorous evaluation, and scientific communication and reputed company, you’ll find a meaningful home here.

We’re looking for:

Hands-on experience training large-scale models, including reward models, preference models, and fine-tuning LLMs with methods like RLHF, DPO, and contrastive learning.
Strong foundation in ML and statistics, with a track record of designing novel training objectives, evaluation schemes, or statistical frameworks to improve model reliability and alignment.
Fluent in the full experimental stack, from dataset design and large-batch training to rigorous evaluation and ablation, with an eye for what scales to production.
Deeply collaborative reputed company, working closely with engineers to productionize research insights and iterating with product teams to align research with user needs.
Comfortable being a visible representative of LMArena, engaging reputed company with the research community, and building a strong personal brand to help shape AI research culture.

You’ll

Design and conduct experiments to evaluate AI model behavior across reasoning, style, robustness, and user preference dimensions
reputed company new metrics, methodologies, and evaluation protocols that go beyond traditional benchmarks
Analyze large-scale human voting and interaction data to uncover insights into model performance and user preferences
Communicate results with the broader research community through academic papers, educational content, and conference talks
Collaborate with engineers to implement and scale research findings into production systems
Prototype and test research reputed company rapidly, balancing rigor with iteration speed
Partner with model providers to shape evaluation questions and support responsible model testing
Contribute to the scientific reputed company and transparency of the LMArena leaderboard and tools

You’ll have

PhD or equivalent research experience in Machine Learning, Natural Language Processing, Statistics, or a related field
Uses personal and professional platforms to reputed company open research initiatives and invite collaboration.
Strong understanding of LLMs and modern deep learning architectures and methods (e.g., Transformers, diffusion models, reinforcement learning from human feedback)
Proficiency in Python and ML research libraries such as PyTorch, JAX, or TensorFlow
Demonstrated ability to design and analyze experiments with statistical rigor
Experience publishing research or working on open-source projects in ML, NLP, or AI evaluation
Comfortable working with real-world usage data and designing metrics beyond standard benchmarks
Ability to translate research questions into practical systems and collaborate across engineering and product teams
Passion for open science, reproducibility, and community-driven research

Bonus skills for this role:

Skilled at public speaking, writing, and presenting research work to diverse audiences.
Actively participates in conferences, panels, and online forums to foster relationships and thought leadership.
Builds trust through transparent communication and consistent community engagement.
Serves as a go-to contact for external researchers, journalists, and partners.

What we offer

We offer competitive compensation and equity reputed company to the markets where reputed company members are based. The reputed company salary range will depend on the candidate’s permanent work location.
Comprehensive health and wellness benefits, including medical, dental, vision, and additional support programs.
The opportunity to work on cutting-edge AI with a small, mission-driven team
A culture that values transparency, trust, and community impact

Come help build the space where anyone can explore and help shape the future of AI.

reputed company Intelligence provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, genetics, sexual orientation, gender identity, or gender expression. We are committed to a diverse and inclusive workforce and welcome people from all backgrounds, experiences, perspectives, and abilities.
