Automatic instruction-level software-only recovery

Jonathan Chang, George A. Reis, David I. August

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

78 Scopus citations

Abstract

As chip densities and clock rates increase, processors are becoming more susceptible to transient faults that can affect program correctness. Computer architects have typically addressed reliability issues by adding redundant hardware, but these techniques are often too expensive to be used widely. Software-only reliability techniques have shown promise in their ability to protect against soft errors without any hardware overhead. However, existing low-level software-only fault tolerance techniques have only addressed the problem of detecting faults, leaving recovery largely unaddressed. In this paper, we present the concept, implementation, and evaluation of automatic, instruction-level, software-only recovery techniques, as well as various specific techniques representing different trade-offs between reliability and performance. Our evaluation shows that these techniques fulfill the promises of instruction-level, software-only fault tolerance by offering a wide range of flexible recovery options.

Original language: English (US)
Title of host publication: Proceedings - DSN 2006
Subtitle of host publication: 2006 International Conference on Dependable Systems and Networks
Pages: 83-92
Number of pages: 10
State: Published - 2006
Event: DSN 2006: 2006 International Conference on Dependable Systems and Networks - Philadelphia, PA, United States
Duration: Jun 25 2006 to Jun 28 2006

Publication series

Name: Proceedings of the International Conference on Dependable Systems and Networks
Volume: 2006

Other

Other: DSN 2006: 2006 International Conference on Dependable Systems and Networks
Country/Territory: United States
City: Philadelphia, PA
Period: 6/25/06 to 6/28/06

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
