Rigorous development of an embedded fault-tolerant system based on coordinated atomic actions

NU author(s): Professor Brian Randell, Professor Alexander Romanovsky, Dr Robert Stroud, Avelino Zorzo

Abstract

This paper describes our experience using coordinated atomic (CA) actions as a system structuring tool to design and validate a sophisticated embedded control system for a complex industrial application with high reliability and safety requirements. Our study is based on an extended production cell model, the specification and simulator for which were defined and developed by FZI (Forschungszentrum Informatik, Germany). This "Fault-Tolerant Production Cell" represents a manufacturing process involving redundant mechanical devices (provided to enable continued production in the presence of machine faults). The challenge posed by the model specification is to design a control system that maintains specified safety and liveness properties even in the presence of a large number and variety of device and sensor failures. Based on an analysis of such failures, we provide in this paper details of: 1) a design for a control program that uses CA actions to deal with both safety-related and fault tolerance concerns and 2) the formal verification of this design using model checking. We found that CA action structuring facilitated both the design and verification tasks by enabling the various safety problems (involving possible clashes of moving machinery) to be treated independently. Even complex situations involving the concurrent occurrence of any pair of the many possible mechanical and sensor failures can be handled simply yet appropriately. The formal verification activity was performed in parallel with the design activity, and the interaction between them resulted in a combined exercise in "design for validation". Formal verification proved very valuable in identifying some subtle residual bugs in early versions of our design that would otherwise have been difficult to detect.


Publication metadata

Author(s): Xu J, Randell B, Romanovsky A, Stroud RJ, Zorzo AF, Canver E, Von Henke F

Publication type: Article

Publication status: Published

Journal: IEEE Transactions on Computers

Year: 2002

Volume: 51

Issue: 2

Pages: 164-179

Print publication date: 01/02/2002

Date deposited: 13/09/2010

ISSN (print): 0018-9340

Publisher: IEEE Computer Society

URL: http://dx.doi.org/10.1109/12.980006

DOI: 10.1109/12.980006

