Get "Up to Speed" with Cray Cascade - 11-14 March 2013
CSCS's next flagship system, to be brought into production in April 2013, is a Cray XC30 (Cascade) system with a peak performance of 750 Teraflops. It uses Intel Sandy Bridge processors, a new network interface chip and an advanced interconnect topology. The CSCS system is currently the largest Cray XC30 in the world, with 2256 compute nodes, over 36,000 compute cores and 70 Terabytes of memory, and it offers a five-fold increase in system bisection network bandwidth compared to Monte Rosa.
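As a rough sanity check, the headline figures above can be reproduced with simple arithmetic. The sketch below is illustrative only: the node count comes from this announcement, while the cores per node, clock frequency, flops per cycle and memory per node are assumptions that are not stated here.

```python
# Back-of-the-envelope check of the headline system figures.
# Only NODES comes from the announcement; every other constant is an
# ASSUMPTION chosen for illustration.

NODES = 2256             # from the announcement
CORES_PER_NODE = 16      # assumption: two 8-core sockets per node
CLOCK_HZ = 2.6e9         # assumption: 2.6 GHz nominal clock
FLOPS_PER_CYCLE = 8      # assumption: 8 double-precision flops per cycle per core (AVX)
MEM_PER_NODE_GIB = 32    # assumption: 32 GiB of memory per node


def system_estimates(nodes=NODES):
    """Return (total cores, peak Tflop/s, total memory in TiB)."""
    cores = nodes * CORES_PER_NODE
    peak_tflops = cores * CLOCK_HZ * FLOPS_PER_CYCLE / 1e12
    memory_tib = nodes * MEM_PER_NODE_GIB / 1024
    return cores, peak_tflops, memory_tib


if __name__ == "__main__":
    cores, peak, mem = system_estimates()
    print(f"cores: {cores}, peak: {peak:.1f} Tflop/s, memory: {mem:.1f} TiB")
```

Under these assumed per-node values, the estimates land close to the quoted "over 36,000 compute cores", "750 Teraflops" and "70 Terabytes".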
This three-and-a-half-day course gives an introduction to the Cray XC30 at CSCS, demonstrates how to get the best performance out of the Intel Sandy Bridge processors and shows how to take full advantage of the interconnect. Experts from Intel will dive into the features of the Sandy Bridge processor and demonstrate the use of the Intel tools for generating optimised code. Cray specialists will show how to use their set of tools and numerical libraries, and how to take full advantage of the MPI libraries and communication strategies.
The course will be rich in hands-on practical sessions to demonstrate the tools and techniques.
Registration deadline: March 6, 2013.
Please contact Neil Stringfellow (neil.stringfellow(at)cscs.ch) or Sadaf Alam (sadaf.alam(at)cscs.ch) for further technical information.
Instructors | Intel's High Performance Computing Team (Christopher Dahnken and Michael Klemm), Cray's Performance and Exascale Teams (Nathan Wichmann and Alfio Lazzaro) and CSCS's own staff (Neil Stringfellow). |
Venue | CSCS, Via Trevano 131, Lugano www.cscs.ch/about_us/visitor_information/index.html |
Time | Day 1: 13:30 - 17:30; Day 2: 09:00 - 18:00; Day 3: 09:00 - 18:00; Day 4: 09:00 - 18:00 |
Prerequisites | Competency in Fortran, C++ or C, combined with MPI and OpenMP. You will need to bring a laptop capable of ssh access to CSCS machines and of displaying output from applications using the X11 window system. |
Maximum number of participants | 30 |
Minimum number of participants | If the minimum number of participants is not reached, we reserve the right to cancel the course. You will be informed two weeks in advance. |
Accommodation | Participants are kindly requested to make their own arrangements for accommodation |
***
Course Outline
---------------------
Day 1 (afternoon only) - March 11, 2013
13:50-14:00 Welcome - [CSCS]
14:00-15:30 Overview of the Cray XC30 architecture - [Cray]
Cray will present the XC30 architecture in general, discussing node structure, the Aries network interface chip and Dragonfly network topology, and providing an overview of the software infrastructure and some early benchmark results.
16:00-18:00 Configuration of Piz Daint at CSCS - [CSCS]
CSCS staff will describe the particular configuration of the XC30 at CSCS ("Piz Daint"), including the filesystems and the external login environment. Best practices for using the Slurm and ALPS interfaces will be demonstrated and useful tools in the operating environment will be shown, with plenty of hands-on opportunity for participants to familiarise themselves with the machine.
Day 2 - March 12, 2013
9:00-10:30 Overview of the Intel Sandy Bridge processor technology - [Intel]
Intel will present a deep dive into the Intel Sandy Bridge architecture, discussing the processor pipelines, branch prediction, out-of-order operation, superscalarity and the operation of the AVX-enabled vector units, which is critical for high application performance. The memory subsystem will be covered together with its interaction with the Linux operating system, including the various levels of cache available on the processor, the virtual memory system and UMA/NUMA issues (uniform and non-uniform memory access).
11:00-13:00 and 14:00-15:30 Programming for Core Performance - [Intel]
Intel will show how to get the best single-core performance out of the Sandy Bridge processor through the use of a number of tools and techniques. The Intel C/C++ Compiler and Intel Fortran Compiler will be introduced, and the current compiler features for the different languages will be discussed together with compiler optimisation techniques. The all-important issue of vectorization will be presented, and a number of techniques to overcome vectorization problems will be shown using features available in the compiler, such as effective use of vectorization reports, compiler flags and pragmas.
16:00-17:00 Advanced performance programming - [Intel]
For those instances where the compiler cannot be persuaded to generate optimal code, Intel will give a presentation on "Understanding and programming machine language (with no fear!)" and will show how compiler intrinsics can be used to generate optimal code.
17:00-18:00 Using Intel Performance Libraries - [Intel]
Intel provide a set of numerical libraries that can be used in your applications including the Intel Math Kernel Library (MKL) for BLAS and LAPACK operations, and the use of these for common algorithms will be discussed.
Day 3 - March 13, 2013
9:00-10:30 Programming for Parallel Performance - [Intel and CSCS]
Moving beyond single-core performance to multi-core and full-node performance, Intel will discuss processes, threads and threading methods. There will be a review of OpenMP features in the latest compiler. CSCS will present affinity mappings on Cray systems and will show the performance differences that result from different options and environment variables. Intel will also give a brief introduction to processes and MPI.
11:00-13:00 Detecting problems - [Intel]
Intel will look at hardware performance monitoring with Intel tools, at detecting problems in the processor pipeline and at detecting parallel problems in threaded codes (practicals will be on a cluster, not on the Cray XC30 itself).
14:00-15:30 Cray XC30 vs Cray XE6 and early performance tips (including a little IO and MPI) - [Cray]
Cray will present the main differences and new features available on the XC30 platform for those people who are migrating from the XE6 Monte Rosa, and course participants will learn some early performance tips on the Cray XC system.
16:00-18:00 Introduction to tools on the Cray XC system and hands-on lab profiling applications - [Cray]
The Cray programming environment has a rich set of tools available for analysing and improving performance. In this session attendees will learn how to identify performance bottlenecks by using performance tools to profile their applications, and will be able to tune their applications in the hands-on sessions.
Day 4 - March 14, 2013
09:00-10:30 On-node optimisations and Cray compiling environment - [Cray]
Cray will present single-core and on-node optimisation strategies on the XC30 and will show the capabilities of the Cray compilation and programming environment to help deliver high performance.
11:00-13:00 Performance analysis hands-on lab tuning applications - [Cray]
Attendees will use Cray Apprentice2 for performance analysis and visualisation of performance issues, and will be able to test a variety of optimisation techniques for delivering improved performance. Participants will be able to use the tools and techniques to work on provided examples or their own applications.
14:00-15:00 Off-node communication and MPI tuning - [Cray]
Cray systems are designed for highly scalable applications and in this session attendees will discover optimisation and tuning strategies to enable improved parallel performance across large numbers of nodes, and will learn about advanced techniques to deal with scaling problems.
15:00-15:30 Cray scientific and math libraries - [Cray]
Cray will present the key features of the highly optimised Cray scientific library packages available on the XC30 system.
16:00-18:00 Hands-on lab tuning applications - [Cray]
Attendees will have the opportunity to continue with the lab tuning exercises (and will be able to make continued use of access to Cray performance tuning personnel).
===============
Day 5 - March 15, 2013
There will then be the opportunity for attendees to stay on for a fifth day to participate in the debugging course given by Allinea on the DDT debugging tool. Those people who wish to do so should register separately for that event here: www.cscs.ch/events/event_list/event_detail/index.html