CSCS offers internships at various levels and in most organisational units.

Currently, CSCS is offering different types of internships in the field of High Performance Computing for the development and optimization of scientific applications, in order to exploit the coming generations of supercomputing devices.

For the below positions applicants must be enrolled in a Swiss University (Bachelor/Master level) and the internship must be part of their mandatory education. The candidate must be a student in one of the following fields: Computer Science, Mathematics, Physics or related fields. Please note that due to Swiss labour laws the internship must be an imperative part of the education for non-EU 25 / EFTA nationals in order to obtain a cantonal work authorization. Ph.D. students will not be considered.

The ideal candidate is a team player and feels comfortable working in an international environment. Excellent command of written and spoken English (our official working language) is a must.

We look forward to receiving your complete online application, which we ask you to submit to Stephanie Frequente, CSCS, Via Trevano 131, 6900. Please state explicitly in your application a maximum of 2 topics that fit your interests. As demand for internships is high in certain periods and we can only offer 2 internships per quarter, kindly also state your availability. The closing date for applications is 30th November 2017. Applications will be reviewed immediately after the closing date, and not before.

For further information, please contact Dr. Claudio Gheller by email (no applications).

Python and High-performance software containers

Docker containers are a convenient way of packaging and deploying software that requires numerous and complex dependencies. The use of such containers in HPC environments is a rapidly emerging trend; however, it poses some unique challenges.

The objective of this internship is to develop and demonstrate new features for Shifter, a Docker-compatible container technology currently operating at scale on CSCS systems.

These enhancements are intended to enable innovative workflows and push the boundaries of software containers in HPC platforms.

The candidate should have experience in software development using Python. Previous experience with Linux or C++, or knowledge of Docker, is beneficial.

The details of the work plan will be defined according to the candidate’s experience and background.

Duration: 3 months
Mentor: Lucas Benedicic
Working place: Lugano

Extensions of the CLAW framework for porting code to multiple target architectures

The CLAW framework provides source-to-source translation of application code by interpreting a small set of parallelization directives. Currently it generates output source code containing OpenMP (multi-threading) or OpenACC (many-core) directives that are optimized for a specified target architecture (e.g., multi-core CPUs, Xeon Phi or GPGPUs). In this internship the candidate would extend CLAW in key ways to make it easier for the application developer to use and/or to generate better-optimized code. The CLAW framework is based on the OMNI compiler and uses Java to perform the code translation.

Several possible CLAW extensions are planned. The first is an improved translation of a point-wise ("box") or a one-dimensional ("single column") model, a common basis for describing physical phenomena, to a fully parallel execution on multiple threads, one for each box or column. This will allow atmospheric physicists to draft a simple local model, e.g., the scattering and absorption of long-wave radiation in the atmosphere, without having to worry about how it is executed on modern highly parallel architectures. A second extension is to generate GridTools code, which can then be compiled by a standard C++ compiler (such as GNU). As GridTools is designed for high performance on multiple architectures, a considerable improvement in application execution time is anticipated. Finally, an extension is foreseen to automatically insert the serialization statements for input and output variables, since this helps with the validation of components. As CLAW is currently evolving, other extensions might be suggested in place of or in addition to these.
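The single-column idea can be sketched in a few lines. The example below is only an illustration in Python, not CLAW output and not the language CLAW operates on; the function names and the toy absorption model are hypothetical. The scientist writes a function for one column, and a separate driver applies it independently to every column.

```python
from concurrent.futures import ThreadPoolExecutor


def absorb_column(flux_top, absorptivity):
    """Hypothetical single-column model.

    Attenuates a downward flux level by level; each level absorbs the
    given fraction of whatever flux reaches it.
    """
    fluxes = []
    flux = flux_top
    for a in absorptivity:           # loop over the vertical levels of ONE column
        flux = flux * (1.0 - a)
        fluxes.append(flux)
    return fluxes


def run_all_columns(flux_tops, absorptivities, workers=4):
    """Apply the column model to every column independently.

    Columns do not exchange data, so they can be handed to separate
    workers without changing the column code itself.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(absorb_column, flux_tops, absorptivities))


if __name__ == "__main__":
    tops = [100.0, 200.0]
    absorb = [[0.5, 0.5], [0.25, 0.0]]
    print(run_all_columns(tops, absorb))
```

The point of the sketch is the separation of concerns: `absorb_column` contains no parallel code, and only the driver decides how the columns are distributed.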

Prerequisites: Java
Duration: 4 months
Mentor: Hannes Vogt
Working place: Zurich

Porting Physical Parameterizations from a Climate Model to Accelerators

This internship involves the new ICON (ICOsahedral Non-hydrostatic) climate and numerical weather prediction model, being developed by the Max Planck Institute for Meteorology (MPI-M) and the German Weather Service (DWD). The task will involve porting certain physical parameterizations (which calculate the collective effect of physical phenomena which occur on a scale that is smaller than the underlying numerical grid) to accelerators using the OpenACC programming paradigm. The parameterization will first be isolated in a testbed subset of the model, so that subsequent changes can be easily validated. 

The finished product would be the validated parameterization running within this testbed framework. The work might also include its integration into the overall ICON model.

Prerequisites: Knowledge of high-level programming languages and parallel programming
Duration: 4 months
Mentor: William Sawyer
Working place: Lugano

Porting ICON atmospheric dynamics solver components using GridTools

This internship involves the new ICON (ICOsahedral Non-hydrostatic) climate and numerical weather prediction model, being developed by the Max Planck Institute for Meteorology (MPI-M) and the German Weather Service (DWD). The task will involve porting portions of the ICON atmospheric dynamics solver ("dycore") to multiple target architectures (e.g., multi-core CPUs, Xeon Phi, GPGPUs). One such component might be the advection of atmospheric constituents ("tracers").  

The implementation will use the GridTools library, a domain-specific embedded language that uses template meta-programming to expand computational kernels into C++ code. GridTools code can then be compiled by a standard C++ compiler (such as GNU), but uses an optimized back-end for the targeted architecture, thus ensuring performance portability.

The finished product would be the validated dycore component running within a testbed framework. The work might also include its integration into the overall ICON model.

Prerequisites: Knowledge of C++ template meta-programming and parallel programming
Duration: 4 months
Mentor: Mauro Bianco
Working place: Lugano

Scalability of TensorFlow and Caffe2

In the 2017 internship, the scalability of distributed TensorFlow was investigated. The results indicate good scalability on Piz Daint up to 128 nodes, which seems comparable to benchmarks from Google. Beyond 128 nodes there is a large performance drop. Furthermore, performance depends strongly on the number of deployed parameter servers, but there is no insight yet into why the number of parameter servers impacts performance the way it does. The first goal of the internship is to analyze the root causes of these performance issues.

TensorFlow is one of the most popular deep learning (DL) toolkits, yet it is one of the least HPC-aware. Other DL toolkits integrate HPC technology better, for example Caffe2, which uses MPI as a communication layer. The second goal of the internship is to study the performance of Caffe2 on Piz Daint. A performance analysis is required, together with an assessment of whether Caffe2 is in fact more HPC-friendly. For instance, does the use of MPI in Caffe2 improve performance and scalability enough that we should consider it a more suitable toolkit than TensorFlow for Piz Daint? The performance analysis will be similar to the one already carried out for TensorFlow.


Prerequisites:

  • Good math background
  • Basic concepts of parallelism
  • Good knowledge of Python

Duration: 4 months
Mentor: Maxime Martinasso
Working place: Lugano

Continuous Integration/Testing for large HPC systems

CSCS is developing ReFrame, a CI/regression testing tool for HPC systems. The goal of the ReFrame framework is to abstract away the complexity of the interactions with the system, separating the logic of a regression test from the low-level details that pertain to the system configuration and setup. This allows users to write easily portable regression tests, focusing only on the functionality.

Regression tests in ReFrame are simple Python classes that specify the basic parameters of the test. The framework will load the test and will send it down a well-defined pipeline that will take care of its execution. The stages of this pipeline take care of all the system interaction details, such as programming environment switching, compilation, job submission, job status query, sanity checking and performance assessment.
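The class-plus-pipeline idea can be illustrated with a minimal sketch. This is not the ReFrame API: the class, attribute and stage names below are hypothetical, and only two stages (run and sanity check) are modelled, whereas the real pipeline also covers environment switching, compilation, job submission and performance assessment.

```python
import subprocess
import sys


class RegressionTest:
    """Hypothetical test description; NOT the ReFrame API."""
    name = "unnamed"
    command = []            # command line to execute
    expected_output = ""    # text that must appear in stdout


class HelloTest(RegressionTest):
    """A test only declares its parameters; it contains no system logic."""
    name = "hello"
    command = [sys.executable, "-c", "print('hello from the test')"]
    expected_output = "hello"


def run_stage(test, state):
    """Execute the test command and record its output and exit status."""
    proc = subprocess.run(test.command, capture_output=True, text=True)
    state["stdout"] = proc.stdout
    state["returncode"] = proc.returncode


def sanity_stage(test, state):
    """Pass only if the command succeeded and printed the expected text."""
    state["passed"] = (state["returncode"] == 0
                       and test.expected_output in state["stdout"])


# The framework, not the test author, owns the order of the stages.
PIPELINE = [run_stage, sanity_stage]


def execute(test):
    """Send a test through every pipeline stage in order."""
    state = {}
    for stage in PIPELINE:
        stage(test, state)
    return state["passed"]


if __name__ == "__main__":
    print(HelloTest.name, "passed:", execute(HelloTest()))
```

Keeping the stages outside the test class is what would let the same test run unchanged on a system with a different scheduler or programming environment.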

Writing system regression tests in a high-level modern programming language is a great advantage for organizing and maintaining the tests. On the other hand, it may discourage some users from writing regression tests. In this internship, we will look for ways to improve the experience of regression test writers.

Some alternatives to consider are simple configuration files that are translated automatically to the framework's internal regression test API, or a domain-specific language embedded in Python that fits the needs of the framework's users more naturally. The intern will be part of the core ReFrame development team and will thereby also gain experience in agile code development and the Scrum methodology.


Prerequisites:

  • Good knowledge of Python
  • Knowledge of object-oriented programming concepts and design patterns
  • Experience with test-driven development is a plus
  • Experience with GitHub is a plus

Duration: 4 months
Mentor: Vasileios Karakasis
Working place: Lugano
Additional info:

  • Reframe public release:
  • Related tools: Python, Jenkins, CDash, Bamboo, Google Test, GitLab, Agile programming, test-driven development

Improving the life-cycle of scientific applications on HPC systems

Managing scientific software on HPC systems can be very challenging and time-consuming. CSCS has recently adopted a framework (EasyBuild) for the automatic build and deployment of optimised scientific software across our systems. The transition is ongoing, and we are constantly developing new ideas to ease this process.

The internship will focus on developing new features for EasyBuild and improving the integration with our existing tools. In particular:

  • Enhance integration with other tools (CI/Jenkins and Github)
  • Enable multiple external repositories for build recipes


Prerequisites:

  • Good knowledge of Python
  • Previous experience with scientific computing is considered a plus

Duration: 4 months
Mentor: Luca Marsella
Working place: Lugano
Additional info:

  • Reference: Making Scientific Software Installation Reproducible On Cray Systems Using EasyBuild
  • Related tools: EasyBuild, Python, Jenkins, Spack, Git

Regression testing of large HPC systems

CSCS operates large HPC systems. To check their sanity, a regression testing framework (ReFrame) is under development. While we use standard tests for the installed hardware and software, we also want to develop some customised tests.

A critical part of an HPC system is communication, which includes the network hardware and the software stack. With the test to be written, we want to check both the hardware and the software of the communication layer during operation and maintenance of our machines. One difficulty of testing is reproducibility, since different jobs might interact and congestion might occur. The new test will perform communication between nodes according to an algorithm to be developed and analyse the performance data by statistical means.
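One possible shape for the statistical analysis is sketched below. It is an assumption for illustration, not the algorithm to be developed in the internship: repeated latency measurements per node pair are reduced to a median (so that a single congested run does not dominate), and a pair is flagged when its median lies far above the others, measured in units of the median absolute deviation. The function name, the data layout and the threshold are hypothetical.

```python
import statistics


def flag_slow_links(samples, threshold=3.0):
    """Return the node pairs whose typical latency is unusually high.

    samples:   dict mapping (source, destination) to a list of repeated
               latency measurements for that pair.
    threshold: how many median absolute deviations above the overall
               median a pair must be to count as slow (assumed value).
    """
    # Median per pair: robust against one-off congestion in a single run.
    medians = {pair: statistics.median(times) for pair, times in samples.items()}
    overall = statistics.median(medians.values())
    # Median absolute deviation: a robust estimate of the normal spread.
    mad = statistics.median(abs(m - overall) for m in medians.values())
    if mad == 0:
        # No measurable spread; this simple sketch then reports nothing.
        return []
    return sorted(pair for pair, m in medians.items()
                  if (m - overall) / mad > threshold)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from any real system.
    measurements = {
        (0, 1): [1.0, 1.1, 5.0],   # one congested run, otherwise normal
        (0, 2): [1.2, 1.0, 1.1],
        (1, 2): [1.0, 1.2, 1.3],
        (2, 3): [9.0, 9.5, 1.0],   # consistently slow
    }
    print(flag_slow_links(measurements))
```

In this example the pair (0, 1) is not flagged despite one very slow run, while (2, 3) is, which is the kind of distinction between transient congestion and a persistent fault that the text describes.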

The intern will be part of the core ReFrame development team and will thereby also gain experience in agile code development and the Scrum methodology.


Prerequisites:

  • Good knowledge of C++
  • Familiarity with MPI is a plus
  • Knowledge of Python is a plus

Duration: 3-4 months
Mentor: Andreas Jocksch
Working place: Lugano

GPU implementation of Gadget3 with OpenACC

Gadget is a cosmological TreePM-MHD-SPH code that is highly optimized and fully MPI/OpenMP parallelized. It is used within various large-scale projects (e.g. Magneticum), typically running on hundreds of thousands of cores. It uses the FFTW library for the PM part of the gravity solver, while a tree code handles the local short-range part. Porting of these routines, as well as the core routines of the hydro solver (based on the smoothed particle hydrodynamics method, SPH), to modern architectures has begun; such systems provide GPU accelerators on the individual nodes in addition to multi-core CPUs (like Piz Daint at CSCS). Many of the additional physics modules (such as cooling, star formation and chemical networks), which are essential for modern cosmological applications, also need to be ported to GPU accelerators. Being purely local processes, they are an ideal target for the OpenACC programming model. Within this internship project, the applicant will learn the OpenACC programming model, apply it to a real cosmological application and gain insight into the complexity of highly parallel cosmological simulation tools.


Prerequisites:

  • Good knowledge of C
  • Familiarity with GPU programming
  • Basic knowledge of parallel programming concepts

Duration: 3-4 months
Mentor: Claudio Gheller
Working place: Lugano