
Position: In-House Evaluation Is Not Enough. Towards Robust Third-Party Evaluation and Flaw Disclosure for General-Purpose AI

Shayne Longpre, Kevin Klyman, Ruth E. Appel, Sayash Kapoor, Rishi Bommasani, Michelle Sahar, Sean McGregor, Avijit Ghosh, Borhane Blili Hamelin, Nathan Butters, Alondra Nelson, Amit Elazari, Andrew Sellars, Casey John Ellis, Dane Sherrets, Dawn Song, Harley Geiger, Ilona Cohen, Lauren McIlvenny, Madhulika Srikumar, Mark M. Jaycox, Markus Anderljung, Nadine Farid Johnson, Nicholas Carlini, Nicolas Miailhe, Nik Marda, Peter Henderson, Rebecca S. Portnoff, Rebecca Weiss, Victoria Westerhoff, Yacine Jernite, Rumman Chowdhury, Percy Liang, Arvind Narayanan

Research output: Contribution to journal › Conference article › peer-review

Abstract

The widespread deployment of general-purpose AI (GPAI) systems introduces significant new risks. Yet the infrastructure, practices, and norms for reporting flaws in GPAI systems remain seriously underdeveloped, lagging far behind more established fields like software security. Based on a collaboration between experts from the fields of software security, machine learning, law, social science, and policy, we identify key gaps in the evaluation and reporting of flaws in GPAI systems. We call for three interventions to advance system safety. First, we propose using standardized AI flaw reports and rules of engagement for researchers in order to ease the process of submitting, reproducing, and triaging flaws in GPAI systems. Second, we propose GPAI system providers adopt broadly-scoped flaw disclosure programs, borrowing from bug bounties, with legal safe harbors to protect researchers. Third, we advocate for the development of improved infrastructure to coordinate distribution of flaw reports across the many stakeholders who may be impacted. These interventions are increasingly urgent, as evidenced by the prevalence of jailbreaks and other flaws that can transfer across different providers' GPAI systems. By promoting robust reporting and coordination in the AI ecosystem, these proposals could significantly improve the safety, security, and accountability of GPAI systems.

Original language: English (US)
Pages (from-to): 81728-81758
Number of pages: 31
Journal: Proceedings of Machine Learning Research
Volume: 267
State: Published - 2025
Event: 42nd International Conference on Machine Learning, ICML 2025 - Vancouver, Canada
Duration: Jul 13 2025 - Jul 19 2025

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence
