Lecture 4: Optimizing for the Memory Hierarchy
Date: Oct. 15
- Motivation for the memory hierarchy
- Processor-memory performance gap
- Principle of locality
- Memory performance equation:
Memory stall cycles = Instruction count × Memory references per instruction × Miss rate × Miss penalty
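As a worked instance of the memory performance equation, the short Python sketch below multiplies the four factors for a made-up workload. The function name `memory_stall_cycles` and all of the numbers (one million instructions, 1.5 memory references per instruction, a 2% miss rate, a 100-cycle miss penalty) are illustrative assumptions, not measurements from the lecture.

```python
def memory_stall_cycles(instruction_count, refs_per_instruction,
                        miss_rate, miss_penalty):
    """Memory stall cycles = instruction count
    x memory references per instruction x miss rate x miss penalty."""
    return instruction_count * refs_per_instruction * miss_rate * miss_penalty


# Hypothetical workload; every value here is an assumption for illustration.
stalls = memory_stall_cycles(
    instruction_count=1_000_000,
    refs_per_instruction=1.5,
    miss_rate=0.02,       # 2% of memory references miss in the cache
    miss_penalty=100,     # cycles lost per miss
)
print(f"memory stall cycles: {stalls:,.0f}")
```

With these assumed values the result is roughly 3 million stall cycles. Because the equation is a plain product, halving either the miss rate or the miss penalty halves the stall cycles.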
- Basics of caches
- Where can a block be placed in cache? (block placement)
- How can a block be found in cache? (block identification)
- Which block should be replaced on a miss? (block replacement)
- What happens on a write? (write strategy)
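One common way to answer the placement and identification questions is to split each address into a tag, a set index, and a block offset. The Python sketch below illustrates that split under stated assumptions: byte addresses, and a cache described by its total size, block size, and associativity. The function name `split_address` and its parameters are hypothetical, chosen only for this example.

```python
def split_address(addr, cache_bytes, block_bytes, ways):
    """Split a byte address into (tag, set index, block offset).

    The index selects the set where the block may be placed (placement);
    the tag is compared against the tags stored in that set to decide
    whether the block is present (identification).
    """
    num_sets = cache_bytes // (block_bytes * ways)
    offset = addr % block_bytes      # byte position within the block
    block = addr // block_bytes      # block number in memory
    index = block % num_sets         # set the block maps to
    tag = block // num_sets          # remaining high-order part
    return tag, index, offset


# Assumed example: 1 KiB cache with 16-byte blocks.
print(split_address(0x1234, 1024, 16, ways=1))   # direct mapped: 64 sets
print(split_address(0x1234, 1024, 16, ways=64))  # fully associative: 1 set
```

In the direct-mapped case the index picks exactly one line. In the fully associative case there is a single set, so the index is always 0 and the whole block number serves as the tag.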
- Three Cs model of cache misses
- Compulsory misses
- Capacity misses
- Conflict misses
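A conventional way to separate the three kinds of misses is to compare the cache of interest with a fully associative LRU cache of the same capacity. The sketch below is a minimal illustration, assuming a direct-mapped cache as the cache of interest and a list of byte addresses as input. The function name `classify_misses` and its parameters are hypothetical.

```python
from collections import OrderedDict


def classify_misses(addresses, num_blocks, block_size):
    """Count hits and compulsory/capacity/conflict misses for a
    direct-mapped cache holding num_blocks blocks of block_size bytes."""
    seen = set()                   # blocks referenced at least once
    direct = [None] * num_blocks   # direct-mapped cache: one block per line
    full = OrderedDict()           # fully associative LRU, same capacity
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for addr in addresses:
        block = addr // block_size
        line = block % num_blocks
        dm_hit = direct[line] == block
        fa_hit = block in full

        # Update the fully associative LRU reference cache.
        if fa_hit:
            full.move_to_end(block)
        else:
            if len(full) >= num_blocks:
                full.popitem(last=False)   # evict least recently used
            full[block] = True
        direct[line] = block               # update the direct-mapped cache

        if dm_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1      # first reference to this block
        elif fa_hit:
            counts["conflict"] += 1        # enough room, bad mapping
        else:
            counts["capacity"] += 1        # would miss even if fully associative
        seen.add(block)
    return counts


# Two blocks that map to the same line evict each other repeatedly.
print(classify_misses([0, 64, 0, 64], num_blocks=4, block_size=16))
```

In the usage line, blocks 0 and 4 both map to line 0 of a four-line cache, so after the two first-reference misses every further miss is a conflict miss: the fully associative cache of the same size would have kept both blocks.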
- Program to detect cache structure and performance
- Cache simulator (cache.map; see also cache.mw for a Maple worksheet containing the program and traces, or cache.html for a viewable copy of the worksheet)
- PAPI counters for memory performance
- cachegrind cache simulator
- Dinero IV cache simulator
- Determine the cache configuration for your machine.
- Run cache.c and analyze the results (try to determine the cache configuration from the access times).
- Run cache.c with PAPI.
- Run cache.c with cachegrind.
- Run cache.c with Dinero IV and explore different cache configurations.
- Create a memory trace for matrix multiplication for the cache simulator (cache.map) and explore the miss rate with different cache configurations.
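For the matrix multiplication exercise, the following Python sketch shows one way a trace could be generated and evaluated. It is not the cache.map simulator, and the trace format that cache.map expects is not described here, so the sketch simply produces a list of byte addresses. Assumptions: row-major storage, 8-byte elements, the three matrices stored back to back starting at address 0, ijk loop order, and C[i][j] accumulated in a register so that it is written once per element. The names `matmul_trace` and `miss_rate` are hypothetical.

```python
from collections import OrderedDict


def matmul_trace(n, elem_size=8):
    """Byte addresses touched by C = A * B for n x n matrices (ijk order)."""
    size = n * n * elem_size
    base_a, base_b, base_c = 0, size, 2 * size   # assumed layout: A, B, C
    trace = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                trace.append(base_a + (i * n + k) * elem_size)  # read A[i][k]
                trace.append(base_b + (k * n + j) * elem_size)  # read B[k][j]
            trace.append(base_c + (i * n + j) * elem_size)      # write C[i][j]
    return trace


def miss_rate(trace, cache_bytes, block_bytes, ways):
    """Miss rate of a set-associative LRU cache on a list of byte addresses."""
    num_sets = cache_bytes // (block_bytes * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // block_bytes
        s = sets[block % num_sets]
        if block in s:
            s.move_to_end(block)          # mark as most recently used
        else:
            misses += 1
            if len(s) >= ways:
                s.popitem(last=False)     # evict least recently used
            s[block] = True
    return misses / len(trace)


trace = matmul_trace(16)
# Explore a few assumed configurations: (cache bytes, block bytes, ways).
for config in [(1024, 32, 1), (1024, 32, 4), (4096, 64, 1), (4096, 64, 4)]:
    print(config, f"{miss_rate(trace, *config):.4f}")
```

Changing the loop order in `matmul_trace`, or the matrix size relative to the cache size, is one way to see how the access pattern of B (strided in the ijk order) affects the miss rate.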
Created: Oct. 15, 2008 by
jjohnson AT cs DOT drexel DOT edu