TY - GEN
T1 - Static typing for a faulty lambda calculus
AU - Walker, David
AU - Mackey, Lester
AU - Ligatti, Jay
AU - Reis, George A.
AU - August, David I.
PY - 2006
Y1 - 2006
N2 - A transient hardware fault occurs when an energetic particle strikes a transistor, causing it to change state. These faults do not cause permanent damage, but may result in incorrect program execution by altering signal transfers or stored values. While the likelihood that such transient faults will cause any significant damage may seem remote, over the last several years transient faults have caused costly failures in high-end machines at America Online, eBay, and the Los Alamos Neutron Science Center, among others [6, 44, 15]. Because susceptibility to transient faults is proportional to the size and density of transistors, the problem of transient faults will become increasingly important in the coming decades. This paper defines the first formal, type-theoretic framework for studying reliable computation in the presence of transient faults. More specifically, it defines λzap, a lambda calculus that exhibits intermittent data faults. In order to detect and recover from these faults, λzap programs replicate intermediate computations and use majority voting, thereby modeling software-based fault tolerance techniques studied extensively, but informally [10, 20, 30, 31, 32, 33, 41]. To ensure that programs maintain the proper invariants and use λzap primitives correctly, the paper defines a type system for the language. This type system guarantees that well-typed programs can tolerate any single data fault. To demonstrate that λzap can serve as an idealized typed intermediate language, we define a type-preserving translation from a standard simply-typed lambda calculus into λzap.
AB - A transient hardware fault occurs when an energetic particle strikes a transistor, causing it to change state. These faults do not cause permanent damage, but may result in incorrect program execution by altering signal transfers or stored values. While the likelihood that such transient faults will cause any significant damage may seem remote, over the last several years transient faults have caused costly failures in high-end machines at America Online, eBay, and the Los Alamos Neutron Science Center, among others [6, 44, 15]. Because susceptibility to transient faults is proportional to the size and density of transistors, the problem of transient faults will become increasingly important in the coming decades. This paper defines the first formal, type-theoretic framework for studying reliable computation in the presence of transient faults. More specifically, it defines λzap, a lambda calculus that exhibits intermittent data faults. In order to detect and recover from these faults, λzap programs replicate intermediate computations and use majority voting, thereby modeling software-based fault tolerance techniques studied extensively, but informally [10, 20, 30, 31, 32, 33, 41]. To ensure that programs maintain the proper invariants and use λzap primitives correctly, the paper defines a type system for the language. This type system guarantees that well-typed programs can tolerate any single data fault. To demonstrate that λzap can serve as an idealized typed intermediate language, we define a type-preserving translation from a standard simply-typed lambda calculus into λzap.
UR - http://www.scopus.com/inward/record.url?scp=34247250322&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34247250322&partnerID=8YFLogxK
U2 - 10.1145/1159803.1159809
DO - 10.1145/1159803.1159809
M3 - Conference contribution
AN - SCOPUS:34247250322
SN - 1595933093
SN - 9781595933096
T3 - Proceedings of the ACM SIGPLAN International Conference on Functional Programming, ICFP
SP - 38
EP - 49
BT - ICFP'06 - Proceedings of the Eleventh ACM SIGPLAN International Conference on Functional Programming
T2 - ICFP'06 - Eleventh ACM SIGPLAN International Conference on Functional Programming
Y2 - 16 September 2006 through 21 September 2006
ER -