Fault Tolerance and System Structuring

NU author(s): Professor Brian Randell


Abstract

We discuss a general approach to the design of fault-tolerant computing systems, concentrating on the issues of system structuring rather than on the design of particular algorithms. Three forms of structuring are described. The first is based on what we term "idealized fault-tolerant components". Such components provide a means of system structuring which makes it easy to identify what parts of a system have what responsibilities for trying to cope with what sorts of fault. The second is a "recursive structuring" scheme. It involves using complete computers as the basic idealized fault-tolerant components of a distributed computing system whose functionality matches that of its component computers. Finally we discuss a generalization of the usual concept of an "atomic action", which provides a means of structuring both forward and backward error recovery in distributed systems. These discussions are given in general terms, and also illustrated by brief accounts of recent and current work at Newcastle on the construction of UNIX-based fault-tolerant and distributed systems.


Publication metadata

Author(s): Randell B

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: 4th Jerusalem Conference on Information Technology: Next Decade in Information Technology (JCIT)

Year of Conference: 1984

Pages: 182-191

Publisher: IEEE Computer Society Press

ISBN: 0818605359

