Utilizing Past User Feedback for More Accurate Text-to-SQL

Matthias Urban, Jialin Ding, David Kernert, Kapil Vaidya, Tim Kraska

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In the classical problem formulation of Text-to-SQL in academia, each question is translated independently of the others into SQL. This differs from the setting in practice, where questions enter the Text-to-SQL system in sequence. Thus, for all but the first few questions, a translation history is available that contains past questions and how they were translated by the system. So far, it has not been sufficiently explored how Text-to-SQL systems can make use of the translation history to improve future translations. Another crucial difference from the academic setting is that in practice, users generally have a conversation with the Text-to-SQL system. More concretely, if the initial translation contains a mistake, the user can follow up with feedback messages to allow the system to fix the mistake. In this case, it might be helpful to remember these user feedback messages to avoid repeating past mistakes. Thus, in this paper, we explore how a history of such past conversations between users and the Text-to-SQL system can be used to make future Text-to-SQL translations more accurate. We explore several approaches for extracting relevant experiences and insights from this conversation history and show in an evaluation that utilizing them can improve translation accuracy by up to 14.9%.

Original language: English (US)
Title of host publication: HILDA 2025 - Workshop on Human-In-the-Loop Data Analytics, Co-located with SIGMOD 2025
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400719592
DOIs
State: Published - Jul 8 2025
Externally published: Yes
Event: 2025 Workshop on Human-In-the-Loop Data Analytics, HILDA 2025, Co-located with SIGMOD 2025 - Berlin, Germany
Duration: Jun 22 2025 → Jun 27 2025

Publication series

Name: HILDA 2025 - Workshop on Human-In-the-Loop Data Analytics, Co-located with SIGMOD 2025

Conference

Conference: 2025 Workshop on Human-In-the-Loop Data Analytics, HILDA 2025, Co-located with SIGMOD 2025
Country/Territory: Germany
City: Berlin
Period: 6/22/25 → 6/27/25

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computational Theory and Mathematics
  • Computer Science Applications

Keywords

  • NL2SQL
  • Natural Language Interface for Databases
  • Text-to-SQL
