Covers the design, evaluation and use of high-performance processors,
including instruction set architecture, pipelining, superscalar execution,
instruction-level parallelism, vector instructions, memory hierarchy,
parallel computing on multi-core processors and GPUs, and high-performance I/O.
Special attention is given to using these features effectively,
by hand and through automated techniques, in the design and optimization of
performance-driven software.
The course will be organized around three fundamental computations:
matrix multiplication, the fast Fourier transform, and integer multiplication.