CS 680 Automatic Generation and DSP Algorithms

Course Description
The fast evolution and increasing complexity of computing platforms pose a major challenge for developers of high-performance image and signal processing libraries (Digital Signal Processing, DSP): it is increasingly difficult to harness the available computing power, and straightforward implementations may lose as much as one or two orders of magnitude in performance. Creating optimal implementations requires the developer to understand the algorithms, the capabilities and limitations of compilers, and the target platform's microarchitecture. For these reasons, a recent trend in numerical computing is towards "self-adaptable" software that achieves optimal performance and portability with reduced coding effort. One approach to self-adapting software is the automatic generation of algorithms and implementations, combined with intelligent search to find the "best" implementation on a given platform.

This course introduces the student to the foundations and state-of-the-art techniques, including automatic tuning and self-adaptation, in high-performance software development for DSP libraries such as Intel's MKL (Math Kernel Library) and IPP (Integrated Performance Primitives). The course will focus on algorithms for fast signal transforms such as the Fast Fourier Transform (FFT), the Walsh-Hadamard Transform (WHT), Discrete Trigonometric Transforms (e.g., the Discrete Cosine Transform (DCT)), convolution and filtering, and Fast Wavelet Transforms. Applications such as JPEG, radar, image processing, and integer multiplication will be discussed. The course will study the WHT package, FFTW, and the SPIRAL system.

More generally, topics include:

  1. Fundamental tools in algorithm theory and analysis
  2. Benchmarking and performance modeling
  3. Fast signal transform algorithms
  4. Features of modern computer architectures (memory hierarchy, pipelining, instruction-level parallelism, speculative computation and branch prediction)
  5. Special instruction sets, such as SSE/MMX on Pentium
  6. Multi-core architectures and parallel implementations
  7. Optimizing compilers
  8. Self-adaptable software and program generators
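
To give a flavor of topics 2 and 8, the following is a minimal sketch of empirical search: several interchangeable implementations are timed on the current machine and the fastest is selected. It is an illustration only and is not taken from the WHT package, FFTW, or SPIRAL; the function names (`sum_loop`, `sum_builtin`, `pick_fastest`) and the timing parameters are assumptions made for this example.

```python
import timeit


def sum_loop(xs):
    """Candidate 1 (hypothetical): explicit Python loop."""
    total = 0
    for v in xs:
        total += v
    return total


def sum_builtin(xs):
    """Candidate 2 (hypothetical): built-in sum."""
    return sum(xs)


def pick_fastest(candidates, data, repeats=5, number=20):
    """Time every candidate on `data` and return (best_name, timings).

    `timings` maps each candidate name to its best (minimum) wall-clock
    time in seconds over `repeats` runs of `number` calls each.
    """
    timings = {}
    for name, fn in candidates.items():
        runs = timeit.repeat(lambda: fn(data), repeat=repeats, number=number)
        timings[name] = min(runs)  # the minimum is the least noisy estimate
    best = min(timings, key=timings.get)
    return best, timings


candidates = {"loop": sum_loop, "builtin": sum_builtin}
best_name, timings = pick_fastest(candidates, list(range(10_000)))
print("selected:", best_name)
```

Which candidate wins depends on the platform and the input size, which is the point of the search: the choice is measured on the target machine instead of being fixed in advance.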

Course Objective
To develop the skills required to implement high-performance software, including the interaction between algorithms, computer architecture and compilers. To learn techniques for analyzing the performance of programs and their interaction with the underlying hardware. To utilize techniques to automatically implement, optimize, and adapt programs to different platforms. To understand and be able to implement fast algorithms for various signal transforms.
Course Audience
Graduate Computer Science and Computer and Electrical Engineering students interested in high-performance computing with applications to digital signal processing algorithms. Students should have a background in computer architecture, algorithms, linear algebra, and programming languages. Knowledge of signal processing transforms is not required, though students with such knowledge should benefit from this course provided they have sufficient computing knowledge. The course will count as an advanced topics course for CS students. Advanced undergraduate students (GPA of 3.0 or better) with the prerequisite knowledge may take the course with instructor permission.
Graduate students should have had CS 521 (Data Structures and Algorithms I), CS 550 (Programming Languages), and undergraduate courses in discrete mathematics, linear algebra, and computer architecture. Students who are interested in the course but lack some of these prerequisites should contact the instructor.

Jeremy Johnson
Office: 100 University Crossings
phone: (215) 895-2669
e-mail: jjohnson@cs.drexel.edu
office hours: T 2-3, R 3-6. Additional hours by appointment.
Course mail list: HPC AT cs dot drexel dot edu
Meeting Time
R 6:00-9:00 in Crossings 149
There is no textbook. The foundational material will come from standard texts on algorithms (e.g. Cormen, Leiserson, and Rivest) and computer architecture (e.g. Hennessy and Patterson), and from a summary journal paper on compiler optimization. The remainder of the material will come from the instructor's notes and recent journal papers, including papers from the February 2005 issue of the Proceedings of the IEEE on "Program Generation, Optimization, and Platform Adaptation".


  1. Project 1 - WHT Package (25%)
  2. Project 2 - Performance Models (25%)
  3. Project 3 - SPIRAL (25%)
  4. Class Participation (25%)

All assignments must be completed alone unless otherwise stated. No late assignments will be accepted without prior approval.


Reference Books
  1. More to be added.
Web Pages
Other Reference
  • More to be added.

Look Here for Important Announcements



This list is tentative and may be modified at the instructor's discretion.
  1. Lecture 1: WHT Package - A self-adapting package to compute the Walsh-Hadamard Transform (WHT)
  2. Lecture 2: Review of modern computer architecture and performance models for the WHT
  3. Lecture 3: Vectorized WHT package
  4. Lecture 4: Matrix transposition
  5. Lecture 5: Shared-memory parallel WHT package
  6. Lecture 6: Distributed-memory parallel WHT package
  7. Lecture 7: The fast Fourier transform (FFT)
  8. Lecture 8: FFTW
  9. Lecture 9: Sigma-SPL and the SPL compiler
  10. Lecture 10: Generating vector and parallel code in SPIRAL
  11. Project presentations
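
For orientation before Lecture 1, here is a minimal sketch of one standard iterative algorithm for the Walsh-Hadamard Transform: the unnormalized, natural (Hadamard) ordered transform computed with in-place butterflies. It is not the code of the WHT package, which chooses among many algorithm variants; the function name `wht` is an assumption made for this example.

```python
def wht(x):
    """Return the unnormalized Walsh-Hadamard transform of x.

    The length of x must be a power of two. Uses log2(n) passes of
    add/subtract butterflies, for n*log2(n) additions in total.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    y = list(x)  # work on a copy so the input is left unchanged
    h = 1
    while h < n:
        # Combine pairs of elements that are h apart within blocks of size 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    return y


print(wht([1, 2, 3, 4]))  # prints [10, -2, -4, 0]
```

Because the Hadamard matrix H satisfies H·H = n·I, applying `wht` twice returns the input scaled by n, which gives a quick correctness check.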



Created: 8/17/07 by jjohnson AT cs DOT drexel DOT edu