February 19, 2019 - Interview by Simone Ulmer
Maria Grazia, are you bored after nearly a decade of user support?
Maria Grazia Giuffreda: It is difficult to be bored in this role (laughs). CSCS has always been a very dynamic workplace. With enthusiasm and readiness to get involved and be challenged, there is really no space for boredom. I cannot remember a single quiet year since I took over User Support. Every couple of years a new flagship system is installed or upgraded; at other times there is a new user community coming in, a new software stack, or new services. Frequent changes are common in High-Performance Computing and in data centers that are at the forefront of innovation and technology, like CSCS.
What does a Head of User Engagement and Support at CSCS do? What does your normal day look like?
I am responsible for the User Lab Program, including proposal submissions, and I am the liaison with the User Lab scientific community. Furthermore, I have account managers reporting to me on paying customers, and I am responsible for PRACE Tier-0 proposal calls. On a typical day I receive several emails from PIs and users asking me questions that need my attention, and I am involved in multiple meetings with colleagues and other leadership team members. I coordinate the activities of two groups, Scientific Computing Support and Compute and Data Services Support, with weekly discussions with the group leaders. I am also responsible for the account management team, which takes care of the relationships between CSCS and the paying customers, making sure that the needs and expectations of both parties are met. Additionally, I supervise help desk activities, problem intervention, and user communication.
You mentioned PRACE, the Partnership for Advanced Computing in Europe, where CSCS is a hosting member of a so-called Tier-0 system. What does this mean?
The objective of PRACE is to enable high-impact scientific discovery and engineering research and development across all disciplines, to enhance European competitiveness for the benefit of society, and to provide a persistent pan-European High-Performance Computing service and infrastructure. Being a hosting member in this organization means that CSCS is offering world-class computing and data management resources to scientists all over Europe and seeking to promote challenging and ambitious science. In PRACE, Swiss scientists can receive access to extreme-scale computing resources of different architectures. Besides Switzerland, the current hosting members are France, Italy, Germany and Spain.
Are there other European collaborations involving CSCS?
The beauty of working in this field is that science has no borders. Scientists and experts in extreme computing and data science thrive on collaborations and joint ventures. This is why CSCS is involved in a number of European and international collaborations, such as the European Centers of Excellence for HPC applications, MaX (Materials at eXascale), ESiWACE2 (Excellence in Simulation of Weather and Climate in Europe), the Human Brain Project, MAESTRO (Middleware for memory and data-awareness in workflows), and PLAN-E (Platform of National eScience/Data Research Centers in Europe), to name a few.
Users from the User Lab scientific community apply for free resources, but customers are users buying computing time without application?
Academic users can get access to computational and data resources for free, but they have to present high-quality projects that their peers deem to be worth pursuing. In particular, CSCS organizes two national calls for proposals each year and participates in two annual European PRACE Tier-0 calls. Proposals submitted to the national calls are first scrutinized by in-house experts for their technical soundness and feasibility and then sent to two scientific reviewers from academic institutions abroad. Based on these assessments, an independent expert committee ranks the proposals and makes recommendations on allocations of computer time, which the Director of CSCS has so far always followed in making final decisions. The painstaking procedure is designed to guarantee that all projects are treated equally and that all promising projects can be implemented on high-performance computers. Alternatively, users have the choice to buy resources and so become paying customers of CSCS. Allocations are then granted without peer review; however, the funds used typically come from funding organizations that implement their own selection process.
We are talking about 1,500 users. How many people take care of them, or in other words, how large is your team?
There are 20 members in the User Engagement and Support Unit. This might look like a lot but, actually, being part of this team does not mean that all we do is answer tickets from users. The team does tremendous work to keep the system healthy from a user's point of view. The team recently assembled a professional regression suite, known as ReFrame, to check the status of the system, and they have been so successful that other centers are starting to adopt their workflow and other teams within CSCS are starting to use it for their own daily work. The team has automated the installation of applications and scientific libraries such that, whenever there are major upgrades, we can easily reinstall and recompile our supported software stack. The team also prepares procedures for users to help them install their own applications easily. Furthermore, team members are working on benchmark suites for the production system and they are looking into new cloud services that CSCS is starting to provide, including continuous integration and interactive computing. What perhaps is not clear is that being part of the support team comes with a huge responsibility; after all, we take care of the core business of CSCS: if users are happy, we are happy, and we can consider ourselves successful.
“User Engagement” sounds like a challenge.
It depends on what we mean by User Engagement. For me it means to have an open channel with the user community. I am very excited by the User Lab Day, which we re-introduced in 2018. It is important that members of our user community know that there is an opportunity to come and discuss with us their wishes and their requests, and to make sure that they understand that we value their opinion. On the other hand, it is also of absolute importance for CSCS to reach out and present new services being offered. In my opinion, we cannot detach ourselves from our users. We need to make sure that we convey our messages, inform about our strategies, visions, and services, and work together with users who play a vital role in ensuring our success through supporting their outstanding science in the most effective way.
I assume there are users with a lot of experience as well as newcomers. How specific to a particular person is the user support?
The very experienced users are often considered collaborators more than anything else. They come to us with very high-level issues that regularly require the effort of both parties to diagnose and to solve, but we also get in touch with them when we need their help, for example when testing new services in pre-production. Newcomers are more likely to require our assistance to get started. There are numerous ways in which we support them, for instance with an interactive tool on our user portal that generates job scripts custom-tailored to their needs. We also offer webinars that help them get started at CSCS. Our webpage provides instructions and information on a range of topics. Furthermore, we offer courses that are relevant to new users. I think the bottom line is that all users are important. There are no silly or intelligent questions, there are just questions; and we are there to help our users to get the most out of our resources.
Has your job changed over the years, or has it stayed pretty much the same?
The job has certainly changed as it needs to adapt to the rapid advances in hardware and software technology. Responsibilities and even the strategy of CSCS as an organization may change whenever new services are implemented. I am always behind my team and find it very important to discuss changes in daily work as well as in medium-term goals. This may not be as obvious for other units, but whatever we do has an immediate impact on our user community. Whatever tools we deploy, any new service has to be robust, well thought out and well planned. We do not have the freedom to simply test and see how it goes, because the impact on the users will be immediate and non-negligible. Our services are evolving, and so are the responsibilities that come with them. Even in the user program, I face challenges at times when new scientific disciplines join our user community. Their requirements may be quite different from those that we are used to, and we may need to implement new tools, software, and services and even adapt proposal submissions.
In addition to user support, another challenging part of your work is your involvement in the distribution of computing time. You are a kind of “interface” between user and Scientific Advisory Board. What is the biggest challenge for you in that role?
This is something that I really enjoy doing. I like to look at proposals and find expert scientific reviewers, even though this requires a lot of time and concentration. We have an excellent Scientific Advisory Board who meticulously discuss every proposal based on the technical assessment at CSCS and the scientific review. The biggest challenge for me is to convey the right messages to the applicants concerning the outcome of their project proposals. There is a lot of competition for HPC resources on Piz Daint, and therefore only the very best projects are granted full allocation. Lower-ranked proposals need to be cut in allocation and some proposals need to be rejected. A proper response to the latter is not always easy. However, it is important to let applicants know why their proposals were cut or rejected, so that they can rest assured that their proposal was considered seriously and can find ways to improve their chances in future calls.
In addition to the daily business of your group, CSCS offers a wide range of training courses for users. Can you briefly name the most important ones?
Every year we develop a training program that includes new offerings covering new tools and technology, as well as courses we have repeatedly offered over the years due to their proven importance to many of our users. In 2019 we are offering courses on: Distributed TensorFlow, Scientific Python, GPU programming, OpenACC (in our Summer School), Interactive supercomputing (Jupyter and similar services), Advanced C++, and HPX as well as visualization.
Is the large spectrum of courses a consequence of the increasingly complex technologies and the increasingly complex scientific questions that researchers want to solve with the help of simulations, or are there other reasons?
We are trying to help our users use our production systems in the most effective way; therefore we offer courses in parallel computing on GPUs, among others. On the other hand, we also want to introduce users to new technologies and services that they might not be aware of but that may prove useful to them, or that they might know of but not in as much detail as is necessary to tap their full potential.
Has the user behavior changed during your many years of experience?
Of course, users always want to do their research as soon as possible, and, if possible, just “now”. Still I have noticed growing awareness of the complexity of running a computer center successfully, of offering stable and reliable HPC and storage resources. Users have definitely shown a growing readiness to collaborate with us and contribute to making our services as useful as possible. I think that these days, more than ever before, they see us as their peers, not on a scientific level, but for issues regarding the technical realization of their scientific projects.
As you mentioned earlier, CSCS re-introduced the user day in 2018. Besides a scientific presentation by ETH professor Vanessa Wood and insights into the work of CSCS behind the curtain, the annual user meeting offered, for the first time, workshops on various topics at which CSCS experts answered questions from the audience. This was very well received. What was the trigger for the new programme, which offered plenty of room for discussion?
We want to reach out to our users. We wish to make them aware of new services in place at CSCS. In other words, we need to move forward but we want the community to evolve with us and not be left behind to catch up only later. The new format establishes the better communication needed for CSCS to know about changing user needs and for users to learn about the future plans of CSCS. We also need to reach out to scientists who have not yet used HPC but might well benefit from it. The User Lab Day is the day where everybody can “meet the Swiss National Supercomputing Centre” to openly discuss wishes, visions, and services.
Will next year’s programme again have such a broad spectrum of topics or has the success inspired even more new ideas?
We will certainly repeat the format with parallel sessions to cover those topics that are important for CSCS and for our users. We are finalizing the program based also on feedback from the participants and the users.
If you made a wish for the CSCS users, what would it be?
Looking at the future, I wish for a community of users and customers that continues to be open-minded, like only scientists and dreamers can be, that embraces whatever new technologies and evolutions come our way, and that willingly accepts the challenges and the opportunities rather than looking back and wishing for what can no longer be, as nice as it might have been.