Position: When Incentives Backfire, Data Stops Being Human

Research output: Contribution to journal › Conference article › peer-review

Abstract

Progress in AI has relied on human-generated data, from annotator marketplaces to the wider Internet. However, the widespread use of large language models now threatens the quality and integrity of human-generated data on these very platforms. We argue that this issue goes beyond the immediate challenge of filtering AI-generated content – it reveals deeper flaws in how data collection systems are designed. Existing systems often prioritize speed, scale, and efficiency at the cost of intrinsic human motivation, leading to declining engagement and data quality. We propose that rethinking data collection systems to align with contributors’ intrinsic motivations – rather than relying solely on external incentives – can help sustain high-quality data sourcing at scale while maintaining contributor trust and long-term participation.

Original language: English (US)
Pages (from-to): 82151-82165
Number of pages: 15
Journal: Proceedings of Machine Learning Research
Volume: 267
State: Published - 2025
Event: 42nd International Conference on Machine Learning, ICML 2025 - Vancouver, Canada
Duration: Jul 13 - Jul 19, 2025

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence

