PPU: A control error-tolerant processor for streaming applications with formal guarantees

Pareesa Ameneh Golnari, Yavuz Yetim, Margaret Martonosi, Yakir Vizel, Sharad Malik

Research output: Contribution to journal › Article › peer-review

Abstract

As technology scaling and design complexity increase, so do the threats from device and circuit failures. This is expected to worsen with post-CMOS devices. Current error-resilient solutions ensure the reliability of circuits through protection mechanisms such as redundancy, error correction, and recovery. However, the costs of these solutions may be high, rendering them impractical. In contrast, error-tolerant solutions allow errors in the computation and are well suited to error-tolerant applications such as media applications. For such programmable error-tolerant processors, the instruction set architecture (ISA) no longer serves as a specification, since it is acceptable for the processor to allow errors during the execution of instructions. In this work, we address this specification gap by defining the basic requirements that an error-tolerant processor must meet to provide acceptable results. Furthermore, we formally define properties that capture these requirements. Based on this, we propose the Partially Protected Uniprocessor (PPU), an error-tolerant processor that aims to meet these requirements with low-cost microarchitectural support. Rather than ensuring instruction-level or byte-level correctness, the protection mechanisms in PPU convert potentially fatal control errors into potentially tolerable data errors. They protect the system against crashes, unresponsiveness, and external device corruption, and they also provide support for achieving acceptable result quality. We further provide a methodology that formally proves the specification properties on PPU using model checking. This methodology uses models of the hardware and software that are integrated with the fault and recovery models. Finally, we experimentally demonstrate the results of model checking and the application-level quality of results for PPU.

Original language: English (US)
Article number: 43
Journal: ACM Journal on Emerging Technologies in Computing Systems
Volume: 13
Issue number: 3
State: Published - Apr 2017

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Electrical and Electronic Engineering

Keywords

  • Control flow
  • Error-tolerant computing
  • Progress
  • Reliability requirements
  • Streaming applications
  • Verification
