FADI: A fault tolerant environment for open distributed computing

IEE Proceedings - Software

FADI (FAult tolerant DIstributed environment) is a complete programming environment for the reliable execution of distributed application programs. FADI encompasses all aspects of modern fault-tolerant distributed computing. The built-in user-transparent error detection mechanism covers processor node crashes and hardware transient failures. The mechanism also integrates user-assisted error checks into the system failure model. The nucleus non-blocking checkpointing mechanism combined with a novel selective message logging technique delivers an efficient, low-overhead backup and recovery mechanism for distributed processes. FADI also provides a means of remote automatic process allocation on distributed system nodes.
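
The abstract does not describe how FADI implements checkpointing or its selective message logging, so the following is only a generic sketch of the underlying idea of rollback recovery: save a process's state at a checkpoint, log the messages received afterwards, and on failure restore the checkpoint and replay the log. The `Process` class, its methods, and the toy running-sum state are all hypothetical. The sketch logs every message received after the checkpoint, and it keeps the checkpoint and log in memory, whereas a real system would need storage that survives the failure.

```python
import copy


class Process:
    """Toy process (hypothetical): state is a running sum of received values."""

    def __init__(self):
        self.state = {"total": 0, "count": 0}
        # Assumption: kept in memory here; a real system needs stable storage.
        self._checkpoint = copy.deepcopy(self.state)
        self._log = []  # messages received since the last checkpoint

    def _apply(self, value):
        # Deterministic state update, so replaying the log reproduces the state.
        self.state["total"] += value
        self.state["count"] += 1

    def receive(self, value):
        self._log.append(value)  # log the message before applying it
        self._apply(value)

    def checkpoint(self):
        # Save a copy of the state; earlier messages no longer need replaying.
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()

    def crash(self):
        # Simulate losing the volatile state.
        self.state = None

    def recover(self):
        # Roll back to the last checkpoint, then replay the logged messages.
        self.state = copy.deepcopy(self._checkpoint)
        for value in self._log:
            self._apply(value)


p = Process()
p.receive(3)
p.receive(4)
p.checkpoint()
p.receive(5)                      # received after the checkpoint, so only logged
before = copy.deepcopy(p.state)   # state just before the simulated failure
p.crash()
p.recover()
recovered = p.state
```

After `recover()`, `recovered` equals `before`: the checkpoint restores the effect of the first two messages and the log replays the third. Replay works here only because the state update is deterministic.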

http://iet.metastore.ingenta.com/content/journals/10.1049/ip-sen_20000702