Course: Node-Level Performance Engineering
LRZ aktuell
publish at lrz.de
Thu Oct 18 11:04:08 CEST 2012
+----------------------------------------------------------------------+
| Date: |Thursday Dec 6, 2012 10:00 - 18:00 |
| |Friday Dec 7, 2012 09:00 - 17:00 |
|-------------+--------------------------------------------------------|
| |LRZ Building, University campus Garching, near Munich |
| Location: |Boltzmannstr. 1 |
| |Lecture hall (Hoersaal) H.E.009 |
|-------------+--------------------------------------------------------|
| |This course teaches performance engineering approaches |
| |on the compute node level. "Performance Engineering" as |
| |we define it is more than employing tools to identify |
| |hotspots and bottlenecks. It is about developing a |
| |thorough understanding of the interactions between |
| |software and hardware. This process must start at the |
| |core, socket, and node level, where the code that does |
| |the actual computational work is executed. Once |
| |the architectural requirements of a code are understood |
| |and correlated with performance measurements, the |
| |potential benefit of optimizations can often be |
| |predicted. We introduce a "holistic" node-level |
| |performance engineering strategy, apply it to different |
| |algorithms from computational science, and also show how|
| |an awareness of the performance features of an |
| |application may lead to notable reductions in power |
| |consumption. |
| | |
| |Introduction |
| | |
| | * Intel and AMD x86 architectures |
| | * ccNUMA |
| | * Performance modeling & engineering approaches |
| | * Our approach |
| | |
| |Practical performance analysis |
| | |
| | * The LIKWID tools |
| | * Typical performance patterns |
| | |
| |Microbenchmarks and the memory hierarchy |
| | |
| | * Understanding the memory hierarchy |
| | + Data transfer between memory levels |
| | + Write allocate vs. NT stores |
| | + Modeling of cache hierarchies |
| | + Contention |
| | * NUMA effects - anisotropy and asymmetry |
| | |
| |Typical node-level software overheads |
| | |
| | * Cost of synchronization |
| | * Work distribution |
| Contents: | |
| |Example Problem: The 3D Jacobi solver |
| | |
| | * Core-level optimizations |
| | + Blocking |
| | + Non-temporal stores |
| | + SIMD vectorization (SSE, AVX) |
| | * Multithreading - contention at different memory |
| | hierarchies |
| | * Temporal blocking |
| | |
| |Example Problem: The Lattice-Boltzmann Method (LBM) |
| | |
| | * Introduction |
| | * Roofline model |
| | * Data layout |
| | * Non-temporal stores |
| | * Model for in-cache data & multicore scaling |
| | * Sparse representation and options for propagation |
| | |
| |Example Problem: Sparse Matrix-Vector Multiplication |
| | |
| | * Data layouts |
| | * Performance model - CPU vs. GPU |
| | * Bandwidth reduction |
| | |
| |Example Problem: A backprojection algorithm for CT |
| |reconstruction |
| | |
| | * The algorithm |
| | * Naive analysis |
| | * Detailed analysis and performance model |
| | * Optimizations |
| | |
| |Energy & Parallel Scalability |
| | |
| | * Energy consumption of modern processors |
| | * The energy-to-solution metric |
| | * Performance engineering = Power engineering and |
| | energy efficiency |
| | * Case studies |
| | |
| |Between the modules, there is time for questions and |
| |answers! |
|-------------+--------------------------------------------------------|
|Prerequisites|Participants must have basic knowledge of programming |
| |in Fortran or C. |
|-------------+--------------------------------------------------------|
| Language: |English |
|-------------+--------------------------------------------------------|
| Teacher: |Prof. Gerhard Wellein/RRZE, Dr. Georg Hager/RRZE, et al.|
|-------------+--------------------------------------------------------|
|Registration:|Please register via the LRZ registration form |
| |(http://www.lrz.de/services/schulung/kursanmeldung). |
| |Please choose course HNPF1W12. |
+----------------------------------------------------------------------+
This information is also available on our web server
http://www.lrz-muenchen.de/services/compute/hlrb/aktuell/ali4442/
Matthias Brehm