Abstract
An array processor is a collection of many similar processing elements (PE's) that can operate in both parallel and pipelined fashion. For the implementation of arrays with a large number of processors, fault tolerance has always been a very critical design issue. Very often, spare PE's and switching lattices are incorporated in the array to improve the (fabrication-time) yield and the (runtime) reliability. In this paper, an array grid model based on single-track switches is proposed. A reconfigurability theorem is developed to provide the theoretical footing for new reconfiguration algorithms for fabrication-time and runtime processing. For fabrication-time yield enhancement, the problem of finding a feasible reconfiguration using global control can be reformulated as a maximum independent set problem. An existing algorithm in graph theory is adopted to solve this problem. The simulations conducted indicate that the algorithm is computationally very efficient; therefore, it may also be applicable to certain runtime fault-tolerance scenarios. In real-time fault tolerance, the propagation time of data/control signals to and from the host computer incurred by global control is often prohibitively long; therefore, only distributed processing is feasible. Based on the same reconfigurability theorem, a distributive reconfiguration algorithm is developed for (asynchronous) array processors. The algorithm has several important features: 1) it is distributively executed by the PE's; 2) no global information is required by the individual PE's; 3) the time overhead for reconfiguration is independent of the array size; 4) transient faults are handled by retries or by deactivating/reactivating the temporarily failed PE. Based on simulations, the performance of the algorithms is evaluated.
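The reformulation as a maximum independent set problem can be illustrated with a small sketch. The abstract does not say how the graph is built, so the construction here is an assumption: each vertex is a hypothetical candidate compensation path for one faulty PE, an edge joins two candidates that cannot be used together, and a reconfiguration is taken to be feasible when the largest conflict-free set contains one path per faulty PE. The solver is a plain exponential branching search, not the existing graph-theory algorithm the paper adopts, and all names and conflict pairs are made up for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def build_graph(vertices: Iterable[str],
                conflicts: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from a list of conflicting pairs."""
    adj: Dict[str, Set[str]] = {v: set() for v in vertices}
    for a, b in conflicts:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def max_independent_set(adj: Dict[str, Set[str]]) -> Set[str]:
    """Exact maximum independent set by branching (small graphs only)."""

    def solve(candidates: FrozenSet[str]) -> Set[str]:
        if not candidates:
            return set()
        # Branch on a vertex with the most remaining neighbours.
        v = max(sorted(candidates), key=lambda u: len(adj[u] & candidates))
        nbrs = adj[v] & candidates
        # Option 1: take v, which rules out all of its neighbours.
        with_v = {v} | solve(candidates - nbrs - {v})
        if not nbrs:
            return with_v  # an isolated vertex is always safe to take
        # Option 2: skip v.
        without_v = solve(candidates - {v})
        return with_v if len(with_v) >= len(without_v) else without_v

    return solve(frozenset(adj))


# Hypothetical instance: three faulty PEs (f1, f2, f3) with candidate paths.
paths = ["f1:left", "f1:up", "f2:left", "f2:right", "f3:up"]
conflicts = [
    ("f1:left", "f1:up"),     # assumption: only one path per faulty PE
    ("f2:left", "f2:right"),  # assumption: only one path per faulty PE
    ("f1:left", "f2:left"),   # assumption: these two paths collide
    ("f1:up", "f3:up"),       # assumption: these two paths collide
]
num_faulty = 3

chosen = max_independent_set(build_graph(paths, conflicts))
feasible = len(chosen) == num_faulty
print(sorted(chosen), feasible)
```

In this toy instance every faulty PE receives exactly one path, so the sketch reports the configuration as feasible. Had the largest independent set been smaller than the number of faulty PE's, no assignment from these candidates would cover every fault.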
Original language | English (US) |
---|---|
Pages (from-to) | 501-514 |
Number of pages | 14 |
Journal | IEEE Transactions on Computers |
Volume | 38 |
Issue number | 4 |
State | Published - Apr 1989 |
All Science Journal Classification (ASJC) codes
- Software
- Theoretical Computer Science
- Hardware and Architecture
- Computational Theory and Mathematics
Keywords
- Array processor
- compensation path
- fabrication-time fault tolerance
- reconfiguration
- runtime fault tolerance
- transient faults
- yield enhancement