TY - GEN
T1 - Dynamic inference of likely data preconditions over predicates by tree learning
AU - Sankaranarayanan, Sriram
AU - Chaudhuri, Swarat
AU - Ivaňić, Franjo
AU - Gupta, Aarti
PY - 2008
Y1 - 2008
N2 - We present a technique to infer likely data preconditions for procedures written in an imperative programming language. Given a procedure and a set of predicates over its inputs, our technique enumerates different truth assignments to the predicates, deriving test cases from each feasible truth assignment. The predicates themselves are derived automatically using simple heuristics. The enumeration of truth assignments is performed using a propositional SAT solver along with a theory satisfiability checker capable of generating unsatisfiable cores. For each assignment of truth values, a corresponding set of test cases are generated and executed. Based on the result of the execution, the truth assignment is classified as being safe or buggy. Finally, a decision tree classifier is used to generate a Boolean formula over the input predicates that explains the data obtained from the test cases. The resulting Boolean formula is, in effect, a likely data precondition for the procedure under consideration. We apply our techniques on a wide variety of functions from the standard C library. Our experiments show that the proposed technique is quite robust. For most cases, it successfully learns a precondition that captures a safe and permissive calling environment.
AB - We present a technique to infer likely data preconditions for procedures written in an imperative programming language. Given a procedure and a set of predicates over its inputs, our technique enumerates different truth assignments to the predicates, deriving test cases from each feasible truth assignment. The predicates themselves are derived automatically using simple heuristics. The enumeration of truth assignments is performed using a propositional SAT solver along with a theory satisfiability checker capable of generating unsatisfiable cores. For each assignment of truth values, a corresponding set of test cases are generated and executed. Based on the result of the execution, the truth assignment is classified as being safe or buggy. Finally, a decision tree classifier is used to generate a Boolean formula over the input predicates that explains the data obtained from the test cases. The resulting Boolean formula is, in effect, a likely data precondition for the procedure under consideration. We apply our techniques on a wide variety of functions from the standard C library. Our experiments show that the proposed technique is quite robust. For most cases, it successfully learns a precondition that captures a safe and permissive calling environment.
KW - Decision trees
KW - Machine learning
KW - SAT
KW - Software specification
KW - Verification
UR - http://www.scopus.com/inward/record.url?scp=57449084792&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=57449084792&partnerID=8YFLogxK
U2 - 10.1145/1390630.1390666
DO - 10.1145/1390630.1390666
M3 - Conference contribution
AN - SCOPUS:57449084792
SN - 9781605580500
T3 - ISSTA'08: Proceedings of the 2008 International Symposium on Software Testing and Analysis 2008
SP - 295
EP - 305
BT - ISSTA'08
T2 - 2008 International Symposium on Software Testing and Analysis, ISSTA 2008
Y2 - 20 July 2008 through 24 July 2008
ER -