Exploiting Supercomputers and Containers for Data Science
The Swiss National Supercomputing Centre is pleased to announce that the workshop "Exploiting Supercomputers and Containers for Data Science" will be held from June 13 to 15, 2018 in the ML building of ETH Zurich in the meeting room ML H 37.1 (plan of the ETH Zurich Main Campus available below).
Researchers in data science, analytics and artificial intelligence increasingly see the need to incorporate supercomputing resources into their workflows. The challenge for supercomputing centres is to provide the right tools and interfaces to the data science community.
Container technologies such as Docker and Shifter (a container deployment for HPC environments) allow users or third parties to create and support workloads that run efficiently and easily on platforms ranging from laptops to HPC centres to commercial elastic cloud providers. Containers offer advantages in portability and reproducibility. Data science applications are particularly amenable to containerization because they tend to involve very complex software stacks, composed of anything from Python to GPU-enabled code, often with many version-specific software dependencies.
The focus of this workshop is data science applications and containerization. You will learn how to create and run your own container images, and how to make use of containers that are provided by third parties such as CSCS, Cray Inc., or NVIDIA.
A significant portion of the workshop will be dedicated to a hands-on exploration of Cray’s Urika-XC, which is an integrated suite of advanced analytics, AI, deep learning applications and graph tools that are optimized for the Cray XC platform and based on Shifter containers. Exercises will involve the use of interactive Jupyter notebooks. Familiarity with Cray systems is not a prerequisite for this course.
This two-and-a-half-day workshop will be of interest to data scientists who are already using – or are interested in exploring the use of – containerization to facilitate their workflows, as well as those who are interested in learning about novel data analytics tools and interfaces available at CSCS. All course attendees will be given the opportunity to present their use cases, experiences and expectations. Attendees will be able to test and deploy their workflows with the assistance of experts from Cray and CSCS.
Wednesday, June 13, 2018
12:00 – 13:00 Lunch and Registration
13:00 – 13:15 Welcome and Workshop Overview (Tim Robinson, CSCS)
13:15 – 14:30 Short Presentations by Participants (All)
14:30 – 15:00 Keynote: LHC on Cray (Maxime Martinasso, CSCS)
15:00 – 15:30 Coffee Break
15:30 – 18:00 Tutorial: Introduction to Creating and Using Containers (Alberto Madonna, CSCS)
Thursday, June 14, 2018
09:00 – 10:30 Introduction to CSCS Systems and Shifter (CSCS)
10:30 – 11:00 Coffee Break
11:00 – 12:30 Tutorial: Analytics and AI on Cray Systems, Part I (James Maltby, Charles Siegel, and Alessandro Rigazzi, Cray Inc.)
- Introduction to Urika-XC, a Container-based AI Environment
- Python, Anaconda, and Dask
- Deep Learning with TensorFlow
12:30 – 13:30 Lunch Break
13:30 – 15:00 Tutorial: Analytics and AI on Cray Systems, Part II (James Maltby, Charles Siegel, and Alessandro Rigazzi, Cray Inc.)
- Scaling Deep Learning with the Cray PE Machine Learning Plugin
- HPC, AI, and Analytics with R and pbdR
15:00 – 15:45 Keynote: ABCpy (Ritabrata Dutta, USI)
15:45 – 16:00 Coffee Break
16:00 – 17:30 Tutorial: Analytics and AI on Cray Systems, Part III (James Maltby, Charles Siegel, and Alessandro Rigazzi, Cray Inc.)
- Spark and Alchemist
Friday, June 15, 2018
09:00 – 10:30 Tutorial: Analytics and AI on Cray Systems, Part IV (James Maltby, Charles Siegel, and Alessandro Rigazzi, Cray Inc.)
- Cray Graph Engine
- Urika-XC Success Stories
10:30 – 11:00 Coffee Break
11:00 – 12:00 Keynote: Pipeline Interoperability for Biomedical Use Cases Using Docker and Singularity (Balazs Laurenczy, ETH Zurich)
12:00 – 13:00 Lunch Break
13:00 – 14:00 NVIDIA Solutions for Data Science (Peter Messmer and Vishal Mehta, NVIDIA)
14:00 – 15:00 NVIDIA Containers (Peter Messmer and Vishal Mehta, NVIDIA)
15:00 – 15:15 Coffee Break
15:15 – 17:30 Hands-On with Participants' Use Cases (All)
All participants must register for the meeting. The registration fee includes coffee breaks and lunches throughout the two-and-a-half-day course.
Course Fee: 240 CHF
Deadline for registration: Tuesday, June 5, 2018
Kindly note that the workshop can take place only if enough confirmed registrations are received by the deadline. The minimum number of participants is 8.
Please contact Tim Robinson (firstname.lastname@example.org) for questions related to the course content and email@example.com for questions related to the event logistics.
The ETH Zurich main building (HG Hauptgebäude), in dark red on the plan below, is located at the following address:
"ETH/Universitätsspital" is the closest tram stop to the ETH Zurich main building. This tram stop is reachable by tram number 6 and by tram number 10 from the main train station.
The ML building, where the meeting will take place, is located close to the ETH Zurich main building, and it is indicated with a blue circle on the plan below. The address of the ML building is:
The workshop will be held in the meeting room ML H 37.1.