TY - GEN
T1 - LearnPADS++
T2 - 14th International Symposium on Practical Aspects of Declarative Languages, PADL 2012
AU - Zhu, Kenny Q.
AU - Fisher, Kathleen
AU - Walker, David
PY - 2012
Y1 - 2012
AB - An ad hoc data source is any semi-structured, non-standard data source. The format of such data sources is often evolving and frequently lacking documentation. Consequently, off-the-shelf tools for processing such data often do not exist, forcing analysts to develop their own tools, a costly and time-consuming process. In this paper, we present an incremental algorithm that automatically infers the format of large-scale data sources. From the resulting format descriptions, we can generate a suite of data processing tools automatically. The system can handle large-scale or streaming data sources whose formats evolve over time. Furthermore, it allows analysts to modify inferred descriptions as desired and incorporates those changes in future revisions.
UR - http://www.scopus.com/inward/record.url?scp=84857076712&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84857076712&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-27694-1_13
DO - 10.1007/978-3-642-27694-1_13
M3 - Conference contribution
AN - SCOPUS:84857076712
SN - 9783642276934
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 168
EP - 182
BT - Practical Aspects of Declarative Languages - 14th International Symposium, PADL 2012, Proceedings
Y2 - 23 January 2012 through 24 January 2012
ER -