November 19, 2014 - by Simone Ulmer

In Bonsai – the N-body gravitational tree-code – Simon Portegies Zwart’s team from Leiden University has a code that runs optimally on graphics processors. The scientists used it to simulate the long-term evolution of the Milky Way in high resolution, from the origin of the bar structure to the full formation of the galaxy’s spiral arms, based solely on the gravitational interaction of billions of individual bodies. These simulations, which were conducted on the supercomputers Piz Daint (CSCS) and Titan (Oak Ridge National Lab), earned the team a place among the Gordon Bell Prize finalists. Moreover, the scientists developed methods to use Piz Daint’s NVIDIA Tesla GPUs not only for computing, but also for visualizing and analyzing the simulation data on the fly, without having to save it to disk and analyze it in a later step. This saves time and energy, and avoids unnecessary data transfers between processors, memory and disk.

Seeing live what the supercomputer is calculating

While the simulation was running, the team succeeded in performing an in-situ visualisation on over 2,000 nodes. It lets the viewer follow the formation of the Milky Way from different angles and perspectives and, according to the researchers, in virtually film quality, at ten to twenty images per second. The in-situ visualisation can be viewed during the supercomputing conference SC14 at the NVIDIA booth and at the CSCS & hpc-ch booth.

Peter Messmer from NVIDIA, who was involved in the project, is excited about the possibilities that in-situ visualisation can offer. “If I can see how a system evolves and maybe even interact with it while the simulation is running, it helps tremendously to get an intuitive understanding of this complex system,” he says. After all, it gives the researcher direct insight into the simulations without any complicated and time-consuming post-processing of the data. For Messmer, this additional use of the graphics processors will become increasingly relevant in the so-called exascale era, which computer manufacturers and users have set their sights on. According to some experts, exascale systems could even replace the petaflop-class systems by the end of this decade. An exaflop computer would perform a thousand times more operations per second than a petaflop system: a quintillion operations per second (a one followed by eighteen zeros). “If we have such a high-performance supercomputer, we can no longer afford to write out and save the data first before it is analysed; otherwise, we will be spending more time and resources to write data to and from disk, than actually computing it.”

Analyse simulation data on the fly

This new approach allows the user to attach a data analysis tool to the running simulation. For this new tool, the scientists needed to combine two entirely different global data distribution approaches: one used by the simulation and one used by the custom in-situ renderer. The former uses a fractal particle distribution to achieve excellent parallel efficiency in the simulation, while the latter requires a rectangular domain to accommodate interactive volumetric rendering at scale. “At the same time, the GPUs are not only busy simulating the galaxy, but also rendering the data set – both of which are computation- and communication-intensive steps. By taking advantage of multiple CPU cores, our custom software carries out these tasks in an asynchronous manner,” says Evghenii Gaburov, from SURFsara, who is part of the team.
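
To give a rough feel for why the two data layouts have to be reconciled, the following is a minimal sketch and not the team's code. It assumes a Morton (Z-order) curve as a stand-in for whatever space-filling curve the simulation actually uses, assumes equal-width slabs along one axis as the renderer's rectangular domains, and uses hypothetical function names throughout. It assigns the same particles to ranks under both layouts and counts how many would have to move between each pair of ranks.

```python
import random

BITS = 10  # assumed grid resolution per axis for the curve key


def morton_key(ix, iy, iz, bits=BITS):
    """Interleave the bits of three integer coordinates into one Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def curve_owner(particles, n_ranks, bits=BITS):
    """Simulation-style layout (assumed): sort particles along the curve and
    cut the sorted list into chunks holding equal numbers of particles."""
    scale = (1 << bits) - 1
    keys = [morton_key(int(x * scale), int(y * scale), int(z * scale), bits)
            for x, y, z in particles]
    order = sorted(range(len(particles)), key=keys.__getitem__)
    owner = [0] * len(particles)
    for pos, idx in enumerate(order):
        owner[idx] = pos * n_ranks // len(particles)
    return owner


def slab_owner(particles, n_ranks):
    """Renderer-style layout (assumed): equal-width rectangular slabs in x."""
    return [min(int(x * n_ranks), n_ranks - 1) for x, _, _ in particles]


def exchange_plan(src, dst, n_ranks):
    """plan[s][d] = number of particles rank s must hand over to rank d."""
    plan = [[0] * n_ranks for _ in range(n_ranks)]
    for s, d in zip(src, dst):
        plan[s][d] += 1
    return plan


def demo(n_particles=10000, n_ranks=4, seed=0):
    """Build random particles in the unit cube and compare both layouts."""
    rng = random.Random(seed)
    particles = [(rng.random(), rng.random(), rng.random())
                 for _ in range(n_particles)]
    src = curve_owner(particles, n_ranks)
    dst = slab_owner(particles, n_ranks)
    return exchange_plan(src, dst, n_ranks)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Each row of the printed table belongs to one simulation rank and each column to one renderer rank. Off-diagonal entries are particles that would have to be shipped to another rank before rendering, which hints at why the quote above calls both steps communication-intensive.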