

Action Models: A Reliability Modeling Formalism for Fault-Tolerant Distributed Computing Systems

Lookup NU author(s): Professor Aad van Moorsel



Abstract

Modern-day computing system design and development is characterized by increasing system complexity and ever-shortening time to market. For modeling techniques to be deployed successfully, they must handle complex system models conveniently and must be quick and easy for non-specialists to use. In this paper we introduce “action models”, a modeling formalism that aims to achieve these goals for reliability evaluation of fault-tolerant distributed computing systems, including both software and hardware in the analysis. The metric of interest in action models is the job success probability, and we argue that the traditional availability metric is insufficient for the evaluation of fault-tolerant distributed systems. We formally specify action models and introduce path-based solution algorithms to deal with the potential solution complexity of the resulting models. In addition, we show several examples of action models and use a preliminary tool implementation to obtain reliability results for a reliable clustered computing platform.


Publication metadata

Author(s): van Moorsel A

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: Proceedings of the IEEE International Computer Performance and Dependability Symposium, IPDS '98

Year of Conference: 1998

Pages: 119-128

ISSN: 1087-2191

Publisher: IEEE Computer Society

URL: http://dx.doi.org/10.1109/IPDS.1998.707715

DOI: 10.1109/IPDS.1998.707715


ISBN: 0818686790

