Lookup NU author(s): Dr Kenneth Wright
A number of different parallel algorithms for the LU decomposition of a square matrix A are considered. One aim of the methods is to collect together updates to columns as far as possible, in order to make good use of the storage hierarchy of the shared-memory multiprocessor used to test the algorithms. Variants with L in unit lower triangular form and with U in unit upper triangular form are both considered. The results presented were obtained using the C++ programming language, with parallel constructs provided by the Encore Parallel Threads package, on an Encore Multimax computer. These results indicate significant improvements over a simple parallel implementation of the standard Crout algorithm, and good speedup compared to the sequential Crout algorithm.
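For orientation, the sequential Crout algorithm that the abstract uses as a baseline can be sketched as follows. This is only an illustrative sketch and is not taken from the report: the names `Matrix`, `LU` and `crout` are hypothetical, the unit upper triangular form for U is assumed, and no pivoting is performed, so a zero pivot is treated as an error. The parallel variants described in the abstract are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical types for this sketch: a dense row-major matrix and the factor pair.
using Matrix = std::vector<std::vector<double>>;

struct LU {
    Matrix L;  // lower triangular, general diagonal
    Matrix U;  // unit upper triangular (ones on the diagonal)
};

// Sequential Crout factorisation A = L * U without pivoting.
// Assumption: A is square and every pivot L[j][j] is nonzero.
LU crout(const Matrix& A) {
    const std::size_t n = A.size();
    for (const auto& row : A) {
        assert(row.size() == n && "A must be square");
    }

    Matrix L(n, std::vector<double>(n, 0.0));
    Matrix U(n, std::vector<double>(n, 0.0));

    for (std::size_t j = 0; j < n; ++j) {
        // Column j of L: subtract the contributions of the earlier columns.
        for (std::size_t i = j; i < n; ++i) {
            double s = A[i][j];
            for (std::size_t k = 0; k < j; ++k) {
                s -= L[i][k] * U[k][j];
            }
            L[i][j] = s;
        }

        if (L[j][j] == 0.0) {
            throw std::runtime_error("zero pivot: this sketch does not pivot");
        }

        // Row j of U: unit diagonal, remaining entries scaled by the pivot.
        U[j][j] = 1.0;
        for (std::size_t i = j + 1; i < n; ++i) {
            double s = A[j][i];
            for (std::size_t k = 0; k < j; ++k) {
                s -= L[j][k] * U[k][i];
            }
            U[j][i] = s / L[j][j];
        }
    }
    return {L, U};
}
```

Each step j updates one column of L and one row of U using all previously computed columns and rows, which is the kind of grouped update that a parallel variant would have to distribute across processors.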
Author(s): Kaya D, Wright K
Publication type: Report
Series Title: Department of Computing Science Technical Report Series
Report Number: 450
Institution: Department of Computing Science, University of Newcastle upon Tyne
Place Published: Newcastle upon Tyne