TY - GEN
T1 - DTAM
T2 - 20th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, FSE 2012
AU - Ganai, Malay
AU - Lee, Dongyoon
AU - Gupta, Aarti
PY - 2012
Y1 - 2012
N2 - Testing and debugging multi-threaded programs are notoriously difficult due to non-determinism not only in inputs but also in OS schedules. In practice, dynamic analysis and failure replay systems instrument the program to record events of interest in the test execution, e.g., program inputs, accesses to shared objects, synchronization operations, context switches, etc. To reduce the overhead of logging during runtime, these testing and debugging efforts have proposed tradeoffs for sampling or selective logging, at the cost of reducing coverage or performing more expensive search offline. We propose to identify a subset of input sources and shared objects that are, in a sense, relevant for covering program behavior. We classify various types of relevancy in terms of how an input source or a shared object can affect control flow (e.g., a conditional branch) or dataflow (e.g., state of the shared objects) in the program. Such relevancy data can be used by testing and debugging methods to reduce their recording overhead and to guide coverage. To conduct relevancy analysis, we propose a novel framework based on dynamic taint analysis for multi-threaded programs, called DTAM. It performs thread-modular taint analysis for each thread in parallel during runtime, and then aggregates the thread-modular results offline. This approach has many advantages: (a) it is faster than conducting taint analysis for serialized multi-threaded executions, (b) it can compute results for alternate thread interleavings by generalizing the observed execution, and (c) it provides a knob to tradeoff precision with coverage, depending on how thread-modular results are aggregated to account for alternate interleavings. We have implemented DTAM and performed an experimental evaluation on publicly available benchmarks for relevancy analysis. Our experiments show that most shared accesses and conditional branches are dependent on some program input sources. Interestingly in our test runs, on average, only about 25% input sources and 3% shared objects affect other shared accesses through conditional branches. Thus, it is important to identify such relevant input sources and shared objects for testing and debugging.
AB - Testing and debugging multi-threaded programs are notoriously difficult due to non-determinism not only in inputs but also in OS schedules. In practice, dynamic analysis and failure replay systems instrument the program to record events of interest in the test execution, e.g., program inputs, accesses to shared objects, synchronization operations, context switches, etc. To reduce the overhead of logging during runtime, these testing and debugging efforts have proposed tradeoffs for sampling or selective logging, at the cost of reducing coverage or performing more expensive search offline. We propose to identify a subset of input sources and shared objects that are, in a sense, relevant for covering program behavior. We classify various types of relevancy in terms of how an input source or a shared object can affect control flow (e.g., a conditional branch) or dataflow (e.g., state of the shared objects) in the program. Such relevancy data can be used by testing and debugging methods to reduce their recording overhead and to guide coverage. To conduct relevancy analysis, we propose a novel framework based on dynamic taint analysis for multi-threaded programs, called DTAM. It performs thread-modular taint analysis for each thread in parallel during runtime, and then aggregates the thread-modular results offline. This approach has many advantages: (a) it is faster than conducting taint analysis for serialized multi-threaded executions, (b) it can compute results for alternate thread interleavings by generalizing the observed execution, and (c) it provides a knob to tradeoff precision with coverage, depending on how thread-modular results are aggregated to account for alternate interleavings. We have implemented DTAM and performed an experimental evaluation on publicly available benchmarks for relevancy analysis. Our experiments show that most shared accesses and conditional branches are dependent on some program input sources. Interestingly in our test runs, on average, only about 25% input sources and 3% shared objects affect other shared accesses through conditional branches. Thus, it is important to identify such relevant input sources and shared objects for testing and debugging.
KW - generalization
KW - relevancy
KW - taint analysis
UR - http://www.scopus.com/inward/record.url?scp=84871338744&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84871338744&partnerID=8YFLogxK
U2 - 10.1145/2393596.2393650
DO - 10.1145/2393596.2393650
M3 - Conference contribution
AN - SCOPUS:84871338744
SN - 9781450316149
T3 - Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, FSE 2012
BT - Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, FSE 2012
Y2 - 11 November 2012 through 16 November 2012
ER -