TY - JOUR
T1 - Property-preserving data reconstruction
AU - Ailon, Nir
AU - Chazelle, Bernard
AU - Comandur, Seshadhri
AU - Liu, Ding
N1 - Funding Information:
This work was supported in part by NSF grants CCR-998817 and 0306283 and ARO grant DAAH04-96-1-0181.
PY - 2004
Y1 - 2004
N2 - We initiate a new line of investigation into online property-preserving data reconstruction. Consider a dataset which is assumed to satisfy various (known) structural properties; e.g., it may consist of sorted numbers, or points on a manifold, or vectors in a polyhedral cone, or codewords from an error-correcting code. Because of noise and errors, however, an (unknown) fraction of the data is deemed unsound, i.e., in violation of the expected structural properties. Can one still query into the dataset in an online fashion and be provided data that is always sound? In other words, can one design a filter which, when given a query to any item I in the dataset, returns a sound item J that, although not necessarily in the dataset, differs from I as infrequently as possible? No preprocessing should be allowed and queries should be answered online. We consider the case of a monotone function. Specifically, the dataset encodes a function f : {1,...,n} → R that is at (unknown) distance ε from monotone, meaning that f can - and must - be modified at εn places to become monotone. Our main result is a randomized filter that can answer any query in O(log^2 n log log n) time while modifying the function f at only O(εn) places. The amortized time over n function evaluations is O(log n). The filter works as stated with probability arbitrarily close to 1. We also provide an alternative filter with O(log n) worst-case query time and O(εn log n) function modifications.
AB - We initiate a new line of investigation into online property-preserving data reconstruction. Consider a dataset which is assumed to satisfy various (known) structural properties; e.g., it may consist of sorted numbers, or points on a manifold, or vectors in a polyhedral cone, or codewords from an error-correcting code. Because of noise and errors, however, an (unknown) fraction of the data is deemed unsound, i.e., in violation of the expected structural properties. Can one still query into the dataset in an online fashion and be provided data that is always sound? In other words, can one design a filter which, when given a query to any item I in the dataset, returns a sound item J that, although not necessarily in the dataset, differs from I as infrequently as possible? No preprocessing should be allowed and queries should be answered online. We consider the case of a monotone function. Specifically, the dataset encodes a function f : {1,...,n} → R that is at (unknown) distance ε from monotone, meaning that f can - and must - be modified at εn places to become monotone. Our main result is a randomized filter that can answer any query in O(log^2 n log log n) time while modifying the function f at only O(εn) places. The amortized time over n function evaluations is O(log n). The filter works as stated with probability arbitrarily close to 1. We also provide an alternative filter with O(log n) worst-case query time and O(εn log n) function modifications.
UR - http://www.scopus.com/inward/record.url?scp=35048812571&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35048812571&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-30551-4_4
DO - 10.1007/978-3-540-30551-4_4
M3 - Article
AN - SCOPUS:35048812571
SN - 0302-9743
VL - 3341
SP - 16
EP - 27
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -