Design techniques to improve the resilience of computing systems: logic layer
High-reliability and high-dependability applications require integrated solutions against random hardware faults and transient faults. Random hardware faults, or intermittent faults, are generated by process variations or by time-dependent variations (i.e., aging), while transient faults are induced by radiation (soft errors), by extreme operating conditions, or by electronic interference. Indeed, static process variations, dynamic voltage and temperature fluctuations due to chip activity, Bias Temperature Instability caused by stress on the transistors, and single event effects or soft errors are reported to be very important issues in nanometric technology nodes [1,2]. If not properly addressed, these phenomena degrade performance and may reduce circuit lifetime and Mean Time To Failure. Hence, accurate on-chip yield, reliability and performance monitors that check for guardband violations, either online or periodically, have become necessary. Adaptive compensation schemes are combined with these monitors in an attempt to recover from potential errors when a timing violation occurs. This chapter presents the current state of the art in performance and reliability monitors, together with insertion methodology and experimental results for different sensors and monitors used to compensate for process and environmental variations as well as aging.