Wavelet Transform for Large Scale Image Processing on Modern Microprocessors
D. Chaver - Universidad Complutense
C. Tenllado - Universidad Complutense
L. Piñuel - Universidad Complutense
M. Prieto - Universidad Complutense
Francisco Tirado - Universidad Complutense
In this paper we discuss several issues relevant to the vectorization of a 2-D
Discrete Wavelet Transform on current microprocessors. Our research is based
on previous studies about the efficient exploitation of the memory hierarchy,
due to its tremendous impact on performance. We have extended this work with
a more detailed analysis based on hardware performance counters and a study of
vectorization; in particular, we have used the Intel Pentium SSE instruction
set. Most of our optimizations are performed at the source code level to allow
automatic vectorization, though some compiler intrinsic functions have been
introduced to enhance performance. Given the level of abstraction at
which the optimizations are performed, the results obtained on an Intel
Pentium III microprocessor are quite satisfactory, although further
improvement could be obtained through more extensive use of compiler intrinsics.
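
To make the two approaches named above concrete, the following is a minimal
sketch, not the code evaluated in the paper. It assumes single-precision data
and uses a Haar-style averaging/differencing step between two image rows,
chosen only for brevity; the function names haar_rows_scalar and haar_rows_sse
are hypothetical. The first routine is plain source code arranged so that a
compiler may vectorize it automatically; the second uses SSE compiler
intrinsics directly and falls back to scalar code where SSE is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_*_ps */
#endif

/* Source-level version. The restrict qualifiers promise the compiler that
   the arrays do not overlap, and the loop is unit-stride with no
   loop-carried dependence, which is what an auto-vectorizer needs. */
void haar_rows_scalar(const float *restrict a, const float *restrict b,
                      float *restrict low, float *restrict high, size_t n)
{
    assert(a && b && low && high);
    for (size_t i = 0; i < n; ++i) {
        low[i]  = 0.5f * (a[i] + b[i]);   /* average of the two rows   */
        high[i] = 0.5f * (a[i] - b[i]);   /* half-difference (detail)  */
    }
}

/* Intrinsics version: processes four floats per iteration with unaligned
   loads and stores, then finishes any remaining elements in scalar code. */
void haar_rows_sse(const float *a, const float *b,
                   float *low, float *high, size_t n)
{
    assert(a && b && low && high);
    size_t i = 0;
#if defined(__SSE__)
    const __m128 half = _mm_set1_ps(0.5f);
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(low + i,  _mm_mul_ps(half, _mm_add_ps(va, vb)));
        _mm_storeu_ps(high + i, _mm_mul_ps(half, _mm_sub_ps(va, vb)));
    }
#endif
    for (; i < n; ++i) {                  /* scalar tail (or full fallback) */
        low[i]  = 0.5f * (a[i] + b[i]);
        high[i] = 0.5f * (a[i] - b[i]);
    }
}
```

Filtering along columns, as here, maps naturally onto SIMD because adjacent
outputs come from adjacent inputs in memory. Whether a given compiler actually
vectorizes the first routine depends on its version and flags, so that should
be checked in the compiler's vectorization report rather than assumed.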
