DTAM: Dynamic taint analysis of multi-threaded programs for relevancy

Malay Ganai, Dongyoon Lee, Aarti Gupta

Research output: Chapter in Book/Report/Conference proceedingConference contribution

29 Scopus citations

Abstract

Testing and debugging multi-threaded programs are notoriously difficult due to non-determinism not only in inputs but also in OS schedules. In practice, dynamic analysis and failure replay systems instrument the program to record events of interest in the test execution, e.g., program inputs, accesses to shared objects, synchronization operations, context switches, etc. To reduce the overhead of logging during runtime, these testing and debugging efforts have proposed tradeoffs for sampling or selective logging, at the cost of reducing coverage or performing more expensive search offline. We propose to identify a subset of input sources and shared objects that are, in a sense, relevant for covering program behavior. We classify various types of relevancy in terms of how an input source or a shared object can affect control flow (e.g., a conditional branch) or dataflow (e.g., state of the shared objects) in the program. Such relevancy data can be used by testing and debugging methods to reduce their recording overhead and to guide coverage. To conduct relevancy analysis, we propose a novel framework based on dynamic taint analysis for multi-threaded programs, called DTAM. It performs thread-modular taint analysis for each thread in parallel during runtime, and then aggregates the thread-modular results offline. This approach has many advantages: (a) it is faster than conducting taint analysis for serialized multi-threaded executions, (b) it can compute results for alternate thread interleavings by generalizing the observed execution, and (c) it provides a knob to tradeoff precision with coverage, depending on how thread-modular results are aggregated to account for alternate interleavings. We have implemented DTAM and performed an experimental evaluation on publicly available benchmarks for relevancy analysis. Our experiments show that most shared accesses and conditional branches are dependent on some program input sources. Interestingly in our test runs, on average, only about 25% input sources and 3% shared objects affect other shared accesses through conditional branches. Thus, it is important to identify such relevant input sources and shared objects for testing and debugging.

Original languageEnglish (US)
Title of host publicationProceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, FSE 2012
DOIs
StatePublished - 2012
Event20th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, FSE 2012 - Cary, NC, United States
Duration: Nov 11 2012Nov 16 2012

Publication series

NameProceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering, FSE 2012

Other

Other20th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, FSE 2012
Country/TerritoryUnited States
CityCary, NC
Period11/11/1211/16/12

All Science Journal Classification (ASJC) codes

  • Software

Keywords

  • generalization
  • relevancy
  • taint analysis

Fingerprint

Dive into the research topics of 'DTAM: Dynamic taint analysis of multi-threaded programs for relevancy'. Together they form a unique fingerprint.

Cite this