TY - GEN
T1 - The PADS project
T2 - An overview
AU - Fisher, Kathleen
AU - Walker, David
PY - 2011
Y1 - 2011
N2 - The goal of the PADS project, which started in 2001, is to make it easier for data analysts to extract useful information from ad hoc data files. This paper does not report new results, but rather gives an overview of the project and how it helps bridge the gap between the unmanaged world of ad hoc data and the managed world of typed programming languages and databases. In particular, the paper reviews the design of PADS data description languages, describes the generated parsing tools and discusses the importance of meta-data. It also sketches the formal semantics, discusses useful tools and how they can be generated automatically from PADS descriptions, and describes an inferencing system that can learn useful PADS descriptions from positive examples of the data format.
AB - The goal of the PADS project, which started in 2001, is to make it easier for data analysts to extract useful information from ad hoc data files. This paper does not report new results, but rather gives an overview of the project and how it helps bridge the gap between the unmanaged world of ad hoc data and the managed world of typed programming languages and databases. In particular, the paper reviews the design of PADS data description languages, describes the generated parsing tools and discusses the importance of meta-data. It also sketches the formal semantics, discusses useful tools and how they can be generated automatically from PADS descriptions, and describes an inferencing system that can learn useful PADS descriptions from positive examples of the data format.
KW - Ad hoc data
KW - Data description languages
KW - Domain-specific languages
UR - http://www.scopus.com/inward/record.url?scp=79952323192&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79952323192&partnerID=8YFLogxK
U2 - 10.1145/1938551.1938556
DO - 10.1145/1938551.1938556
M3 - Conference contribution
AN - SCOPUS:79952323192
SN - 9781450305297
T3 - ACM International Conference Proceeding Series
SP - 11
EP - 17
BT - Database Theory - ICDT 2011
PB - Association for Computing Machinery
ER -