ePrints

A System for Fault-Tolerant Execution of Data and Compute Intensive Programs Over a Network of Workstations

Lookup NU author(s): Dr James Smith, Emeritus Professor Santosh Shrivastava


Abstract

A well-known structuring technique for a wide class of parallel applications is the bag of tasks, which allows a computation to be partitioned dynamically among a collection of concurrent processes. This paper describes a fault-tolerant implementation of this structure that uses atomic actions (atomic transactions) to operate on persistent objects, which are accessed in a distributed setting via Remote Procedure Call (RPC). The system is suited to the parallel execution of data- and compute-intensive programs that require persistent storage and fault-tolerance facilities. Its suitability is examined through the measured performance of three specific applications: ray tracing, matrix multiplication and Cholesky factorization. The system runs on stock hardware and software platforms, specifically UNIX and C++.
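To make the bag-of-tasks structure concrete, the following is a minimal illustrative sketch in C++, not the implementation described in the report. It assumes an in-process bag protected by a mutex and worker threads in place of the report's persistent objects, atomic actions and RPC, so it shows only the dynamic partitioning of work and does not model fault tolerance. The names `Task`, `BagOfTasks` and `multiply_with_bag` are hypothetical, and matrix multiplication is used only because it is one of the three applications the abstract lists.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical unit of work: compute one row of the product C = A * B.
struct Task {
    std::size_t row;
};

// In-memory stand-in for the shared bag. In the report the bag is held in
// persistent objects and updated inside atomic actions; here a mutex only
// guarantees that each task is handed to exactly one worker.
class BagOfTasks {
public:
    void put(Task t) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(t);
    }

    // Removes one task into `out`; returns false when the bag is empty.
    bool take(Task& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        out = tasks_.back();
        tasks_.pop_back();
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<Task> tasks_;
};

using Matrix = std::vector<std::vector<double>>;

// Fills the bag with one task per row, then lets `workers` threads drain it.
// Faster workers simply take more tasks, which is the dynamic partitioning
// the bag-of-tasks structure provides.
Matrix multiply_with_bag(const Matrix& a, const Matrix& b, unsigned workers) {
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());
    const std::size_t rows = a.size();
    const std::size_t inner = b.size();
    const std::size_t cols = b[0].size();
    Matrix c(rows, std::vector<double>(cols, 0.0));

    BagOfTasks bag;
    for (std::size_t i = 0; i < rows; ++i) bag.put(Task{i});

    auto worker = [&]() {
        Task t{0};
        while (bag.take(t)) {
            // Each row of c is written by exactly one worker, so the
            // result matrix itself needs no locking.
            for (std::size_t j = 0; j < cols; ++j) {
                double sum = 0.0;
                for (std::size_t p = 0; p < inner; ++p) {
                    sum += a[t.row][p] * b[p][j];
                }
                c[t.row][j] = sum;
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return c;
}
```

Calling `multiply_with_bag(a, b, 3)` on two conforming matrices returns their product computed by three workers. In a fault-tolerant version along the lines the abstract describes, taking a task and storing its result would presumably be enclosed in a single atomic action, so that a task held by a crashed worker would reappear in the bag; that behaviour is outside the scope of this sketch.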


Publication metadata

Author(s): Smith JA, Shrivastava SK

Publication type: Report

Publication status: Published

Series Title: Department of Computing Science Technical Report Series

Year: 1996

Pages: 15

Print publication date: 01/01/1996

Source Publication Date: 1996

Report Number: 553

Institution: Department of Computing Science, University of Newcastle upon Tyne

Place Published: Newcastle upon Tyne

URL: .A System for Fault-Tolerant Execution of Data and Compute Intensive Programs Over a Network of Workstations

