FADI: A fault tolerant environment for open distributed computing
FADI (FAult tolerant DIstributed environment) is a complete programming environment for the reliable execution of distributed application programs. It covers the main aspects of fault-tolerant distributed computing: error detection, backup and recovery, and process allocation. The built-in, user-transparent error detection mechanism covers processor node crashes and transient hardware failures, and it also integrates user-assisted error checks into the system failure model. The core non-blocking checkpointing mechanism, combined with a novel selective message logging technique, provides low-overhead backup and recovery for distributed processes. FADI also provides automatic allocation of processes to remote nodes of the distributed system.
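To make the idea of combining checkpointing with message logging concrete, the following is a minimal sketch and not FADI's implementation. It assumes a toy process whose state is a running total, and a caller-supplied `should_log` predicate that stands in for the selection rule, which the abstract does not specify. The names `LoggedProcess`, `should_log`, `deliver`, `take_checkpoint`, and `recover` are all hypothetical. The sketch is sequential, so it does not model the non-blocking aspect of the checkpointing.

```python
import copy
from typing import Callable, Dict, List


class LoggedProcess:
    """Toy process: state is a running total updated by incoming messages."""

    def __init__(self, should_log: Callable[[dict], bool]) -> None:
        self.state: Dict[str, int] = {"total": 0}
        # Hypothetical selection rule: decides which messages are logged.
        self.should_log = should_log
        # Last saved copy of the state.
        self.checkpoint: Dict[str, int] = copy.deepcopy(self.state)
        # Messages logged since the last checkpoint.
        self.log: List[dict] = []

    def _apply(self, msg: dict) -> None:
        self.state["total"] += msg["value"]

    def deliver(self, msg: dict) -> None:
        # Log the message first if the rule selects it, then process it.
        if self.should_log(msg):
            self.log.append(msg)
        self._apply(msg)

    def take_checkpoint(self) -> None:
        # Save the state; earlier log entries are no longer needed.
        self.checkpoint = copy.deepcopy(self.state)
        self.log.clear()

    def recover(self) -> None:
        # Roll back to the checkpoint, then replay the logged messages in order.
        self.state = copy.deepcopy(self.checkpoint)
        for msg in self.log:
            self._apply(msg)
```

In this sketch, messages that the predicate does not select are not replayed after a failure, so the recovered state reflects only the checkpoint plus the logged messages. A real system would need some other way to regenerate the unlogged messages, for example by having the sender resend them.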