**Course Description**-
The fast evolution and increasing complexity of computing platforms pose a
major challenge for developers of high performance image and signal processing
libraries (Digital Signal Processing - DSP): it is increasingly difficult to
harness the available computing power; conversely, straightforward implementations
may lose as much as one or two orders of magnitude in performance. Creating
optimal implementations requires the developer to have an understanding of algorithms,
capabilities and limitations of compilers, and the target platform's microarchitecture.
For these reasons, a recent trend in numerical computing is towards "self-adaptable"
software to achieve optimal performance and portability with reduced coding effort.
One approach to self-adapting software is the automatic generation of algorithms
and implementations and the use of intelligent search to find the "best"
implementation on a given platform.
This course introduces the student to the foundations and state-of-the-art techniques, including automatic tuning and self-adaptation, in high performance software development for DSP libraries such as Intel's MKL (Math Kernel Library) and IPP (Integrated Performance Primitives). The course will focus on algorithms for fast signal transforms such as the Fast Fourier Transform (FFT), Walsh-Hadamard Transform (WHT), Discrete Trigonometric Transforms (e.g., the Discrete Cosine Transform (DCT)), Convolution and Filtering, and Fast Wavelet Transforms. Applications such as JPEG, radar, image processing, and integer multiplication will be discussed. The course will study the WHT package, FFTW, and the SPIRAL system.
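
As a rough illustration of what a "fast signal transform" looks like, the following is a minimal Python sketch of the textbook iterative Walsh-Hadamard Transform, which uses on the order of n log n additions instead of the n^2 operations of a direct matrix-vector product. It is not code from the WHT package, FFTW, or SPIRAL; the function names `wht` and `wht_direct` are hypothetical, and the transform is left unnormalized and in natural (Hadamard) order as an assumption of this sketch.

```python
def wht(x):
    """Unnormalized Walsh-Hadamard transform of x in natural (Hadamard) order.

    Illustrative sketch only; len(x) must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    y = list(x)
    h = 1
    while h < n:
        # Each pass combines blocks of size 2*h with butterflies
        # (a, b) -> (a + b, a - b); there are log2(n) passes.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    return y


def wht_direct(x):
    """Reference O(n^2) version: entry (i, j) of the Hadamard matrix
    is (-1) raised to the number of set bits in i & j."""
    n = len(x)
    return [
        sum(x[j] * (-1) ** bin(i & j).count("1") for j in range(n))
        for i in range(n)
    ]


if __name__ == "__main__":
    signal = [1, 0, 1, 0, 0, 1, 1, 0]
    print(wht(signal))
    print(wht(signal) == wht_direct(signal))
```

Because the Hadamard matrix is symmetric and its square is n times the identity, applying `wht` twice returns the input scaled by n, which gives a quick sanity check.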

More generally, topics include:

- Fundamental tools in algorithm theory and analysis
- Benchmarking and performance modeling
- Fast signal transform algorithms
- Features of modern computer architectures (memory hierarchy, pipelining, instruction level parallelism, speculative computation and branch prediction)
- Special instruction sets, such as SSE/MMX on Pentium
- Multi-core architectures and parallel implementations
- Optimizing compilers
- Self-adaptable software and program generators
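
Two of the topics above, benchmarking and self-adaptable software, can be illustrated together with a small Python sketch: time several functionally equivalent implementations on the current machine and keep the fastest. This is only a toy version of the idea, under the assumption of two hand-written candidates; real self-adapting systems search over far larger spaces of generated variants. All names here (`wht_iterative`, `wht_recursive`, `pick_fastest`) are hypothetical.

```python
import time


def wht_iterative(x):
    """Iterative Walsh-Hadamard transform (length must be a power of two)."""
    y = list(x)
    n = len(y)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    return y


def wht_recursive(x):
    """Recursive variant based on H_2n = [[H_n, H_n], [H_n, -H_n]]."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    top = wht_recursive(x[:half])
    bot = wht_recursive(x[half:])
    return [a + b for a, b in zip(top, bot)] + [a - b for a, b in zip(top, bot)]


def pick_fastest(candidates, data, repeats=5):
    """Time each candidate on data and return (winner_name, timings).

    The best of `repeats` runs is used to reduce measurement noise.
    """
    timings = {}
    for name, func in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            func(data)
            best = min(best, time.perf_counter() - t0)
        timings[name] = best
    winner = min(timings, key=timings.get)
    return winner, timings


if __name__ == "__main__":
    candidates = {"iterative": wht_iterative, "recursive": wht_recursive}
    winner, timings = pick_fastest(candidates, list(range(1024)))
    print("fastest on this machine:", winner)
```

Which candidate wins depends on the machine, the input size, and the language runtime, which is the reason the choice is measured rather than fixed in advance.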

**Course Objective**- To develop the skills required to implement high-performance software, including the interaction between algorithms, computer architecture and compilers. To learn techniques for analyzing the performance of programs and their interaction with the underlying hardware. To utilize techniques to automatically implement, optimize, and adapt programs to different platforms. To understand and be able to implement fast algorithms for various signal transforms.
**Course Audience**- Graduate Computer Science and Computer and Electrical Engineering students interested in High-Performance Computing with applications to Digital Signal Processing algorithms. Students should have a background in Computer Architecture, Algorithms, Linear Algebra, and Programming Languages. Knowledge of Signal Processing Transforms is not required, though students with such knowledge should benefit from this course provided they have sufficient computing knowledge. The course will count as an advanced topics course for CS students. Advanced undergraduate students (GPA of 3.0 or better) with the prerequisite knowledge may take the course with instructor permission.
**Prerequisites**- Graduate students should have had CS 521 (Data Structures and Algorithms I), CS 550 (Programming Languages), and undergraduate courses in discrete mathematics, linear algebra, and computer architecture. Students interested in the course who lack any of these prerequisites should contact the instructor.
**Instructor**- Jeremy Johnson
**Meeting Time**- R 6:00-9:00 in Crossings 149
**Textbook**-
There is no text.
The foundational material will come from standard
texts on algorithms (Cormen, Leiserson, and Rivest),
computer architecture (e.g. Hennessy and Patterson), and a summary
journal paper on compiler optimization. The remainder of the material
will come from notes from the instructor and recent journal papers including
papers from the recent issue (Feb. 2005) of the Proceedings of the IEEE on
"Program Generation, Optimization, and Platform Adaptation".

**Grading**-
- Project 1 - WHT Package (25%)
- Project 2 - Performance Models (25%)
- Project 3 - SPIRAL (25%)
- Class Participation (25%)

All assignments must be completed alone unless otherwise stated. No late assignments will be accepted without prior approval.

Resources

**Reference Books**- More to be added.
**Web Pages**- General Architecture References
- Intel
- MIPS
- Sun
- Benchmarks
- Simulators and Performance Tools
- PAPI (Performance Application Programming Interface)

- Scientific Computing
- Programming and Compiler Tools
- Architecture-Adapting Software
- SPIRAL project (Automatic Implementation of Signal Processing Algorithms)
- FFTW (High Performance, self-adapting FFT package)
- ATLAS (Automatically Tuned Linear Algebra Software)
- PHIPAC (Portable High Performance ANSI C)
- Sparsity (Automatically tuned sparse matrix package)
- WHT package (Self-adapting package for computing the Walsh-Hadamard Transform)

- More to be added.
**Other Reference**- More to be added.

Announcements

Look Here for Important Announcements

This list is tentative and may be modified at the instructor's discretion.

Lectures
- Lecture 1: WHT Package - A self-adapting package to compute the Walsh-Hadamard Transform (WHT).
- Lecture 2: Review of modern computer architecture and performance models for the WHT.
- Lecture 3: Vectorized WHT package.
- Lecture 4: Matrix Transposition
- Lecture 5: Shared Memory Parallel WHT package
- Lecture 6: Distributed Memory Parallel WHT package
- Lecture 7: The fast Fourier transform (FFT)
- Lecture 8: FFTW
- Lecture 9: Sigma-SPL and the SPL compiler
- Lecture 10: Generating vector and parallel code in SPIRAL.
- Project presentations.

Programs
- TBA.

Assignments
- Team projects due finals week.

Created: 8/17/07 by jjohnson AT cs DOT drexel DOT edu

Office: 100 University Crossings

phone: (215) 895-2669

e-mail: jjohnson@cs.drexel.edu

office hours: T 2-3, R 3-6. Additional hours by appointment.

Course mail list: HPC AT cs dot drexel dot edu