Error Recovery in Asynchronous Systems

NU author(s): Professor Brian Randell

Downloads

Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Abstract

The demand for highly reliable computer systems has led to techniques for the construction of fault-tolerant software systems. A fault-tolerant system detects errors created as the effects of a fault and applies error recovery provisions in the form of abnormal or exceptional mechanisms and algorithms to continue operation and resume normal computation. Backward error recovery is intended to restore a system state which occurred prior to the manifestation of the fault. Forward error recovery is intended to correct or isolate specific errors and is accomplished in the system state containing the errors. The organisation and control of error recovery in asynchronous systems is very complex. Nevertheless, it is possible to limit this complexity by appropriate structuring aids. Techniques for structuring backward error recovery are comparatively well understood. This paper proposes techniques for structuring forward error recovery measures in asynchronous systems and generalizes recent ideas of atomic actions (transactions) so as to support fault-tolerant interactions between processes.


Publication metadata

Author(s): Campbell RH, Randell B

Publication type: Article

Publication status: Published

Journal: IEEE Transactions on Software Engineering

Year: 1986

Volume: 12

Issue: 8

Pages: 811-826

ISSN (print): 0098-5589

ISSN (electronic): 1939-3520

Publisher: IEEE Computer Society

