Abstract

Fast sparse matrix-vector multiplication on TeraFlop/s computers
Gerhard Wellein - Computing Center - Friedrich Alexander University Erlangen Nuernberg
Georg Hager - Computing Center - Friedrich Alexander University Erlangen Nuernberg
Achim Basermann - C&C Research Laboratories, NEC Europe
Holger Fehske - Institut fuer Physik, Ernst-Moritz-Arndt Universitaet Greifswald
Eigenvalue problems involving very large sparse matrices are common in many fields of science. In general, the numerical core of iterative eigenvalue algorithms is a matrix-vector multiplication (MVM) involving the large sparse matrix. We present three different programming approaches for parallel MVM on present-day supercomputers. In addition to a pure message-passing approach, two hybrid parallel implementations are introduced, based on the simultaneous use of message-passing and shared-memory programming models. For a modern SMP cluster (HITACHI SR8000), the performance and scalability of the hybrid implementations are discussed and compared with the pure message-passing approach on massively parallel systems (CRAY T3E), vector computers (NEC SX5e) and distributed shared-memory systems (SGI Origin3800).
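
For orientation, the sparse MVM kernel referred to above can be sketched serially as follows. This is a minimal illustration only: the abstract does not state which storage format is used, so the compressed sparse row (CSR) layout and the names `csr_matvec`, `values`, `col_idx` and `row_ptr` are assumptions made for this example, and no parallelization (message passing or shared memory) is shown.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A * x for a sparse matrix A held in CSR form (assumed layout).

    values  : nonzero entries of A, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the entries of row i
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Only the stored nonzeros of row i contribute to y[i].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Small 3x3 example matrix:
#   [[2, 0, 1],
#    [0, 3, 0],
#    [4, 0, 5]]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_matvec(values, col_idx, row_ptr, x)
print(y)  # [5.0, 6.0, 19.0]
```

The indirect access `x[col_idx[k]]` is what makes this kernel irregular in memory. In a distributed setting, one common arrangement (again an assumption here, not a description of the implementations presented) is to assign blocks of rows to processes and exchange the required entries of `x` before the local multiplication.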

Last update: Wed Jun 12 14:26:52 2002 WEST