Multi GPU Training with TensorFlow on Piz Daint
The Swiss National Supercomputing Centre (CSCS) is pleased to announce the workshop Multi GPU Training with TensorFlow on Piz Daint, which will be held online from September 7 to 8, 2020.
The Piz Daint supercomputer at CSCS provides an ideal platform for supporting intensive deep learning workloads as it comprises thousands of Tesla GPU compute nodes communicating through a high-speed interconnect. In this two-day course, we will look at how to run distributed deep learning workloads with TensorFlow on Piz Daint. We will use simple examples to demonstrate best practices for building efficient input pipelines to maximize the throughput of deep learning models with TensorFlow.
The course will focus on two main subjects: reading data through input pipelines and synchronous distributed training.
More specifically, the following topics will be covered:
- Running TensorFlow-2 on Piz Daint.
- Creating efficient input pipelines with TensorFlow's Dataset API for optimizing throughput on Piz Daint.
- Reading and writing data as TFRecords files.
- Understanding the stochastic gradient descent and distributed synchronous stochastic gradient descent algorithms.
- Running distributed training with TensorFlow using the ring allreduce algorithm implemented in Horovod.
- Inspecting TensorFlow's operation statistics with TensorBoard.
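As a rough illustration of the distributed synchronous SGD and ring allreduce topics listed above, the following is a minimal pure-Python sketch, not Horovod's implementation and not course material. It simulates the workers as entries of a list inside one process. The names `ring_allreduce`, `local_gradient` and `train`, the toy linear model, and the learning rate are all assumptions made for this example. Each simulated worker computes a gradient on its own data shard, the gradients are summed around a ring in a scatter-reduce phase followed by an allgather phase, and every worker then applies the same averaged update.

```python
def ring_allreduce(vectors):
    """Sum equal-length vectors across simulated workers arranged in a ring.

    vectors[r] is the local vector of worker r. Returns one list per worker,
    each holding the element-wise sum over all workers.
    """
    n = len(vectors)
    length = len(vectors[0])
    # Split every vector into n contiguous chunks (some may be empty).
    bounds = [(i * length // n, (i + 1) * length // n) for i in range(n)]
    data = [list(v) for v in vectors]  # copy so the inputs stay untouched

    # Phase 1, scatter-reduce: after n - 1 steps, worker r holds the
    # complete sum of chunk (r + 1) % n.
    for step in range(n - 1):
        # Collect all messages first so sends within a step are simultaneous.
        msgs = []
        for r in range(n):
            lo, hi = bounds[(r - step) % n]
            msgs.append((lo, data[r][lo:hi]))
        for r, (lo, values) in enumerate(msgs):
            dst = data[(r + 1) % n]  # right-hand neighbour in the ring
            for j, v in enumerate(values):
                dst[lo + j] += v

    # Phase 2, allgather: circulate the completed chunks around the ring.
    for step in range(n - 1):
        msgs = []
        for r in range(n):
            lo, hi = bounds[(r + 1 - step) % n]
            msgs.append((lo, data[r][lo:hi]))
        for r, (lo, values) in enumerate(msgs):
            data[(r + 1) % n][lo:lo + len(values)] = values
    return data


def local_gradient(w, shard):
    """Gradient of the mean squared error of y = w[0] * x + w[1] on one shard."""
    g0 = g1 = 0.0
    for x, y in shard:
        err = w[0] * x + w[1] - y
        g0 += 2.0 * err * x
        g1 += 2.0 * err
    return [g0 / len(shard), g1 / len(shard)]


def train(shards, steps=500, lr=0.2):
    """Synchronous data-parallel gradient descent over simulated workers."""
    n = len(shards)
    weights = [[0.0, 0.0] for _ in range(n)]  # identical initial model everywhere
    for _ in range(steps):
        # Each worker uses its whole shard as its batch (a simplification).
        grads = [local_gradient(weights[r], shards[r]) for r in range(n)]
        summed = ring_allreduce(grads)
        for r in range(n):
            # Dividing the summed gradient by n gives the average over workers.
            weights[r] = [w - lr * g / n for w, g in zip(weights[r], summed[r])]
    return weights


# Toy data for y = 3x + 1, split evenly across four simulated workers.
samples = [(i / 4.0, 3.0 * (i / 4.0) + 1.0) for i in range(8)]
shards = [samples[2 * r:2 * r + 2] for r in range(4)]
weights = train(shards)
print(weights[0])
```

Because every worker starts from the same weights and applies the same averaged gradient at each step, the model replicas stay identical without a central parameter server, which is the property synchronous training relies on.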
This course is an update of last year's workshop Efficient and Distributed Training with TensorFlow on Piz Daint. In this year's edition we will use TensorFlow-2 and TensorBoard, and we will learn how to run multi GPU training from a JupyterLab notebook.
This course is addressed to scientists who are planning or are already engaged in intensive machine learning workloads and wish to start using TensorFlow on Piz Daint.
Participants are required to have basic knowledge of deep learning and some familiarity with TensorFlow.
An agenda will be available closer to the meeting dates.
The course will start on Monday, September 7, 2020 at 9:00 and end on Tuesday, September 8 at 16:00 Central European Summer Time (CEST).
Dr. Rafael Sarmiento (Computational Scientist, CSCS)
Dr. Henrique Mendonça (Computational Scientist, CSCS)
Registration for this program is free of charge. Mentors and learning materials are provided by CSCS.
All participants must register for the course. A few days before the workshop starts, registered attendees will receive the Zoom details for participation at the email address they provided. The link and password you receive are unique to you and should not be shared with others.
Deadline for registration: Sunday, August 23, 2020.
Please note that the workshop can take place only if sufficient registrations are received by the deadline. The minimum number of participants is eight. Registration for the course will close automatically when the maximum number of participants (30) is reached.
Inquiries may be addressed to email@example.com.