Design techniques to improve the resilience of computing systems: software layer

Book: Cross-Layer Reliability of Computing Systems

Hardware techniques for improving the robustness of a computing system can be very expensive and difficult to implement and validate. Moreover, they require long evaluation processes that can lead to a redesign of the hardware itself when reliability requirements are not satisfied. This chapter covers software techniques that improve a system's tolerance to hardware faults by acting at the software level only. It covers recently proposed approaches for detecting and correcting transient and permanent faults.

Chapter Contents:

  • 4.1 Introduction
  • 4.2 Fault taxonomy
      • 4.2.1 Software faults
  • 4.3 Software-Implemented Hardware Fault Tolerance
      • 4.3.1 Modify the software in order to reduce the probability of fault occurrences
      • 4.3.2 Detecting/tolerating the presence of an error
  • 4.4 Software-Based Self-Test
      • 4.4.1 Basics on SBST
  • 4.5 SBST for GPGPUs
      • 4.5.1 Introduction
      • 4.5.2 Effects of permanent faults in GPGPU devices
      • 4.5.3 SBST techniques for testing the GPGPU scheduler
  • References

Inspec keywords: fault diagnosis; fault tolerant computing

Other keywords: computing system; hardware fault techniques; reliability requirements; long evaluation processes; software layer techniques

Subjects: Systems analysis and programming
