Lecture 5: Automatic Performance Tuning
Background Material
- Lecture 1 material on block matrix multiplication and the FFT.
- Lecture 4 on optimizing for the memory hierarchy.
Reading
- Kamen Yotov, Xiaoming Li, Gang Ren, Maria Jesus Garzaran,
David Padua, Keshav Pingali, and Paul Stodghill, "Is Search Necessary
to Generate High-Performance BLAS?", In Special Issue on: Program
Generation, Optimization, and Platform Adaptation, Proc. of the IEEE,
Vol. 93, No. 2, 2005.
- Jim Demmel (UC Berkeley), Jack Dongarra (U. Tennessee),
Victor Eijkhout (U. Tennessee), Erika Fuentes (Sun Microsystems),
Antoine Petitet (Sun Microsystems), Rich Vuduc (UC Berkeley),
R. Clint Whaley (Florida State Univ.), Kathy Yelick (UC Berkeley),
"Self-Adapting Linear Algebra Algorithms and Software", In Special Issue on: Program
Generation, Optimization, and Platform Adaptation, Proc. of the IEEE,
Vol. 93, No. 2, 2005.
- ATLAS
- Matteo Frigo, Steven G. Johnson (MIT),
"The Design and Implementation of FFTW3", In Special Issue on: Program
Generation, Optimization, and Platform Adaptation, Proc. of the IEEE,
Vol. 93, No. 2, 2005.
- FFTW
- Markus Püschel, José Moura, Jeremy Johnson, David Padua, Manuela Veloso,
Bryan Singer, Jianxin Xiong, Franz Franchetti, Aca Gacic, Yevgen Voronenko,
Kang Chen, Robert W. Johnson, and Nick Rizzolo, "SPIRAL: Code Generation for
DSP Transforms", In Special Issue on: Program
Generation, Optimization, and Platform Adaptation, Proc. of the IEEE,
Vol. 93, No. 2, 2005.
- SPIRAL
Topics
- ATLAS (Automatic tuning of matrix multiplication)
- FFTW (Automatic tuning of FFT)
- SPIRAL (Automatic tuning of DSP algorithms - fast transforms)
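The idea shared by these systems, choosing among code variants by timing them on the target machine, can be sketched in a few lines. The example below is only an illustration and is not ATLAS's actual search: it varies a single parameter (the tile size nb of a blocked matrix multiplication), whereas ATLAS searches over more parameters than this. The names blocked_matmul and search_block_size are hypothetical, and the problem size and candidate list are arbitrary assumptions. In pure Python the timings mostly reflect interpreter overhead rather than cache behavior, so the sketch shows the structure of an empirical search, not realistic results.

```python
import random
import time


def blocked_matmul(a, b, n, nb):
    """Multiply two n x n matrices (lists of lists) using nb x nb tiles."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, nb):
        for kk in range(0, n, nb):
            for jj in range(0, n, nb):
                # Multiply one tile of a by one tile of b into a tile of c.
                for i in range(ii, min(ii + nb, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + nb, n)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + nb, n)):
                            row_c[j] += aik * row_b[j]
    return c


def time_once(a, b, n, nb):
    """Wall-clock seconds for one blocked multiplication."""
    start = time.perf_counter()
    blocked_matmul(a, b, n, nb)
    return time.perf_counter() - start


def search_block_size(n, candidates, repeats=3):
    """Time every candidate tile size; return (best_nb, {nb: best seconds})."""
    rng = random.Random(0)  # fixed seed so every candidate sees the same data
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for nb in candidates:
        # Keep the minimum over several runs to reduce timing noise.
        timings[nb] = min(time_once(a, b, n, nb) for _ in range(repeats))
    best_nb = min(timings, key=timings.get)
    return best_nb, timings


if __name__ == "__main__":
    best, timings = search_block_size(n=64, candidates=[4, 8, 16, 32, 64])
    for nb, seconds in sorted(timings.items()):
        print(f"nb={nb:3d}  {seconds * 1e3:8.2f} ms")
    print("selected block size:", best)
```

The selected tile size can differ from run to run and from machine to machine, which is the reason for measuring on the target system instead of fixing the value in advance.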
Tasks
- Determine machine parameters (CPU type, CPU speed, CPU info such
as pipeline and functional units, memory, cache info)
- Time and instrument matrix multiplication code.
- Experiment with variants of matrix multiplication.
- Install ATLAS and MKL (compare to Numerical Recipes)
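A possible starting point for the first three tasks is sketched below, assuming Python is used for a first experiment. It reports the few machine parameters the standard library exposes, then times two loop-order variants of matrix multiplication and converts the time to MFLOPS using the usual 2n^3 operation count. The function names (machine_summary, matmul_ijk, matmul_ikj, measure) are hypothetical, and the problem size is an arbitrary assumption. Cache sizes and pipeline details are not available through these calls and have to come from other sources, for example lscpu or /proc/cpuinfo on Linux.

```python
import os
import platform
import random
import time


def machine_summary():
    """Collect the machine parameters the standard library can report."""
    return {
        "system": platform.system(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "logical_cpus": os.cpu_count(),
    }


def matmul_ijk(a, b, n):
    """Textbook variant: inner loop is a dot product down a column of b."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_ikj(a, b, n):
    """Reordered variant: inner loop walks along rows of b and c."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        row_c = c[i]
        for k in range(n):
            aik, row_b = a[i][k], b[k]
            for j in range(n):
                row_c[j] += aik * row_b[j]
    return c


def measure(variant, n, repeats=3):
    """Return (best_seconds, mflops) for one variant at size n."""
    rng = random.Random(0)  # fixed seed: every variant multiplies the same data
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        variant(a, b, n)
        best = min(best, time.perf_counter() - start)
    mflops = 2.0 * n ** 3 / best / 1e6  # n^3 multiplies plus n^3 adds
    return best, mflops


if __name__ == "__main__":
    for key, value in machine_summary().items():
        print(f"{key}: {value}")
    for variant in (matmul_ijk, matmul_ikj):
        seconds, mflops = measure(variant, n=100)
        print(f"{variant.__name__}: {seconds:.3f} s, {mflops:.1f} MFLOPS")
```

The same measurement harness carries over when the variants are rewritten in C or replaced by calls into ATLAS or MKL, so the results can be compared on one scale.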
Lecture Notes
Resources
Assignments
Created: Oct. 21, 2008 by jjohnson AT cs DOT drexel DOT edu