Tutorials

Tuesday, September 27th, 2022 / 09:00-12:30

Abstract:

Physics-Informed Neural Networks (PINNs) are machine learning tools that approximate the numerical solution of model equations such as Partial Differential Equations (PDEs) by adding them in some form as a component (e.g. loss function, activation function) of the neural network itself. PINNs have received a lot of attention lately, being used to solve not only deterministic and stochastic PDEs and their related inverse problems, but also fractional equations and integro-differential equations. In this tutorial, we provide a quick overview of these networks and their supervised and unsupervised flavors. We also contrast PINNs with classical numerical techniques, such as the Finite Element Method (FEM), showing some applications where PINNs shine. To end the tutorial, we provide some potential avenues for further research in the area. We supplement many of the examples explored throughout the tutorial with code snippets using some of the main software libraries used with PINNs, such as DeepXDE and JAX.
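To make the core idea concrete, the following is a minimal illustrative sketch (not from the tutorial itself) of a physics-informed loss for the ODE u'(x) = -u(x) with u(0) = 1. The one-hidden-layer tanh network and its analytic derivative are hypothetical stand-ins for what a library such as DeepXDE or JAX would provide via automatic differentiation:

```python
import math
import random

# Sketch of a physics-informed loss for u'(x) = -u(x), u(0) = 1,
# whose exact solution is exp(-x). The network is a one-hidden-layer
# tanh model small enough that its derivative can be written
# analytically, avoiding any autodiff library.

random.seed(0)
H = 8  # hidden units (hypothetical, untrained parameters)
w = [random.uniform(-1, 1) for _ in range(H)]
b = [random.uniform(-1, 1) for _ in range(H)]
a = [random.uniform(-1, 1) for _ in range(H)]

def u_hat(x):
    """Network prediction: sum_i a_i * tanh(w_i * x + b_i)."""
    return sum(a[i] * math.tanh(w[i] * x + b[i]) for i in range(H))

def du_hat(x):
    """Analytic derivative, using d/dz tanh(z) = 1 - tanh(z)^2."""
    return sum(a[i] * w[i] * (1 - math.tanh(w[i] * x + b[i]) ** 2)
               for i in range(H))

def pinn_loss(xs):
    """Mean squared ODE residual plus a boundary-condition penalty."""
    residual = sum((du_hat(x) + u_hat(x)) ** 2 for x in xs) / len(xs)
    boundary = (u_hat(0.0) - 1.0) ** 2
    return residual + boundary

collocation = [i / 10 for i in range(11)]  # collocation points in [0, 1]
loss = pinn_loss(collocation)
print(f"PINN loss at random parameters: {loss:.4f}")
```

Training would then minimize this loss over the network parameters; the key point is that the differential equation enters the training objective directly, with no labeled solution data required at the collocation points.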

Instructors:

• Alvaro Coutinho (Federal University of Rio de Janeiro, Brazil)
• Romulo Montalvão Silva (Federal University of Rio de Janeiro, Brazil)
• Antônio Tadeu Azevedo Gomes (National Laboratory for Scientific Computing, Brazil)
• Frédéric Valentin (National Laboratory for Scientific Computing, Brazil)

Wednesday, September 28th, 2022 / 09:00-10:30 - 14:22-15:50

Abstract:

Deep Learning (DL) is successfully used for many different tasks, from labeling cancerous cells in medical images to identifying traffic signals and pedestrians in self-driving cars. Supervised DL classifies raw input data according to patterns learned from a training set, typically built by manually labeling images, with the model optimized iteratively via some form of gradient descent. Several factors can determine the success of a computer vision task: the level of expertise of the professionals labeling the images (e.g., experts labeling earthquake damage), the clarity of the image (a cat showing only its tail under a sofa), and unexpected content in some images (a stop sign that has been graffitied). Moreover, most DL algorithms assume that the input data distribution is identical between training and test sets, but in reality it is not. The objective of uncertainty quantification for DL is to provide not just a single prediction but a distribution over predictions: for a classification problem, this means outputting a label together with a confidence; for a regression problem, a mean and a variance.
In this tutorial, we present an overview of different probabilistic analyses for evaluating the uncertainty in the output of a DL task. A probability density represents the uncertainty in the likelihood: it spreads the DL likelihood estimate over a range of values dictated by the uncertainty in the truth set. The goal is to create DL algorithms that can report how much they do not know, so that the output is not just a label but also the uncertainty associated with that label.
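As one illustrative sketch of turning a point prediction into a distribution over predictions (not the tutorial's own method), the snippet below uses Monte Carlo dropout: dropout is kept active at inference time, and several stochastic forward passes of a hypothetical trained linear layer yield a predictive mean and variance:

```python
import math
import random
import statistics

# Monte Carlo dropout sketch: keep dropout active at inference time
# and average over several stochastic forward passes, reporting a
# predictive mean and variance instead of a single number.
# WEIGHTS and P_DROP are hypothetical values for illustration.

random.seed(1)
WEIGHTS = [0.5, -0.3, 0.8, 0.1]  # hypothetical trained weights
P_DROP = 0.5                     # dropout probability

def forward(x, weights, p_drop):
    """One stochastic pass: drop each weight with prob p_drop."""
    total = 0.0
    for xi, wi in zip(x, weights):
        if random.random() >= p_drop:        # unit kept
            total += xi * wi / (1 - p_drop)  # inverted-dropout rescaling
    return total

def predict_with_uncertainty(x, n_samples=200):
    """Return (mean, variance) over n_samples stochastic passes."""
    samples = [forward(x, WEIGHTS, P_DROP) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.variance(samples)

mean, var = predict_with_uncertainty([1.0, 2.0, 0.5, -1.0])
print(f"prediction = {mean:.3f} +/- {math.sqrt(var):.3f}")
```

A large variance here signals an input the model is unsure about, which is exactly the extra information a DL-with-uncertainty output is meant to carry.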

Instructors:

• Maria Pantoja (California Polytechnic State University, USA)
• Drazen Fabris (Santa Clara University, CA, USA)
• Robert Kleinhenz (Santa Clara University, CA, USA)

Monday, September 26th, 2022 / 16:00-17:30

Tuesday, September 27th, 2022 / 14:00-15:30

Abstract:

Recently, several real quantum devices have become available through the cloud. Nevertheless, in the near term they are expected to be very limited in the number and quality of their fundamental storage elements, the qubits. Even so, the promise of developing a quantum computer sophisticated enough to execute quantum algorithms that significantly accelerate some complex tasks has been a primary motivator for advancing the field of quantum computation.
Software quantum simulators are widely available tools for designing and testing quantum algorithms. The simulation of quantum systems on classical computers is a relatively old problem. However, with the emergence of real quantum computers, the limits of what classical simulations can handle are being pushed in order to better understand their operation and verify that they behave as predicted. Simulations of NISQ (Noisy Intermediate-Scale Quantum) devices on classical computers represent an invaluable experimental testbed for noise characterization, the development of quantum error correction, and the verification of quantum systems. With this momentum, various quantum simulators capable of executing quantum algorithms on classical computers have emerged. This tutorial introduces participants to quantum computing using quantum simulators.
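To hint at what a statevector simulator does internally, here is a minimal sketch (not taken from the tutorial materials): a single-qubit state is a pair of complex amplitudes, and applying a gate is a 2x2 matrix-vector product. A Hadamard gate turns |0> into the equal superposition (|0> + |1>)/sqrt(2):

```python
import math

# Core of a toy statevector simulator: a single-qubit state is two
# complex amplitudes; a gate application is a 2x2 matrix-vector product.

def apply_gate(gate, state):
    """Multiply a 2x2 gate matrix by a 2-amplitude state vector."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]

s = 1 / math.sqrt(2)
HADAMARD = [[s, s],
            [s, -s]]

state = [1 + 0j, 0 + 0j]             # |0>
state = apply_gate(HADAMARD, state)  # H|0>

# Measurement probabilities are squared amplitude magnitudes.
probs = [abs(amp) ** 2 for amp in state]
print(f"P(0) = {probs[0]:.3f}, P(1) = {probs[1]:.3f}")
```

Full simulators such as those covered in the tutorial generalize this to n qubits (a vector of 2^n amplitudes), which is precisely why classical simulation becomes expensive as qubit counts grow.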

Language: English.

Instructors:

• Gilberto Díaz (SC3UIS - CAGE, Universidad Industrial de Santander, Bucaramanga, Colombia)
• Carlos J. Barrios H. (SC3UIS - CAGE, Universidad Industrial de Santander, Bucaramanga, Colombia)

Monday, September 26th, 2022 / 14:00-15:30

Abstract:

Often in Computer Science you need to demonstrate that a new concept, technique, or algorithm is feasible; show that a new method is better than an existing one; and understand the impact of various factors and parameters on the performance, scalability, or robustness of a system. In HPC, the main goal is to achieve the maximum performance and accuracy of all the elements, components, and processes in an advanced computing environment. This tutorial introduces profiling and debugging of scientific computing codes that run on HPC systems, highlighting promising practices and common mistakes.
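As a small illustrative sketch of the profiling workflow (the tutorial's own tool choices may differ), the snippet below profiles a deliberately naive Python function with the standard-library cProfile module. The same loop of run-under-profiler, sort hotspots, inspect the top entries carries over to HPC tools such as gprof, perf, or vendor profilers:

```python
import cProfile
import io
import pstats

# Profile a function with the standard-library cProfile module and
# print the top entries sorted by cumulative time.

def slow_sum(n):
    """Deliberately naive reduction used as the profiling target."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # top 5 entries
report = buffer.getvalue()
print(report)
```

Reading such a report before optimizing anything is the "promising practice" half of the story; guessing at hotspots without measuring is the common mistake.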

Language: English (Spanish instruction if necessary).

Instructors:

• Carlos J. Barrios H. (SC3UIS - CAGE, Universidad Industrial de Santander, Bucaramanga, Colombia)

Monday, September 26th, 2022 / 14:00-17:30

Abstract:

Developing software environments for experiments is an iterative and time-consuming process. Users usually build an image multiple times, each time adding a forgotten dependency or fixing a previously added one. As the build time of such images is on the order of ten minutes for full system tarballs, this does not encourage experimenters to follow good reproducibility practices when setting them up. As a result, those images can be neither rebuilt nor modified by someone else.
In this tutorial, we introduce users to NixOS Compose, a tool based on Nix and NixOS that generates and deploys reproducible environments on distributed platforms. We will first present Nix and the notions required to use NixOS Compose. As NixOS Compose can target several platforms, users will set up their environment with lightweight containers (Docker) on their local machines, allowing them to iterate quickly on their environment description. Once the environment is ready with containers, users will be able to quickly test it on Grid’5000 using kexec, before generating a full system tarball.

Instructors:

• Quentin Guilloteau (Univ. Grenoble Alpes, Inria, CNRS, LIG, F-38000 Grenoble, France)
• Jonathan Bleuzen (Univ. Grenoble Alpes, Inria, CNRS, LIG, F-38000 Grenoble, France)
• Millian Poquet (Université de Toulouse, IRIT, CNRS, Toulouse INP, UT3, Toulouse, France)
• Olivier Richard (Univ. Grenoble Alpes, Inria, CNRS, LIG, F-38000 Grenoble, France)

Wednesday, September 28th, 2022 / 09:00-10:30 - 14:22-15:50

Thursday, September 29th, 2022 / 14:00-15:30

Abstract:

In this tutorial we will present the new OpenMP Cluster (OMPC) distributed programming model. The OMPC runtime allows programmers to annotate their code with OpenMP target offloading directives and run the application seamlessly in a distributed environment using a task-based programming model. OMPC is responsible for scheduling tasks onto available nodes, transferring input/output data between nodes, and triggering remote execution, all while handling fault tolerance. The runtime leverages the LLVM infrastructure and is implemented using the well-known MPI library.

Instructors:

• Gustavo Leite (University of Campinas, Brazil)
• Guilherme Valarini (University of Campinas, Brazil)
• Guido Araújo (University of Campinas, Brazil)

Monday, September 26th, 2022 / 09:00-17:30

Abstract:

OpenMP was originally released in 1998 with an emphasis on multi-core shared-memory systems. Since then, the programming framework has evolved and is now one of the most widely used frameworks for developing software for parallel and heterogeneous computer systems. The latest revision, OpenMP 5.2, includes features for describing the creation of workers, workload distribution, and data management not only across multi-core systems but also on accelerator devices such as GPUs. These new functionalities have resulted in more complex semantics that extend beyond simple “parallel for” directives.
This hands-on tutorial is intended as a beginner's introduction to OpenMP with an emphasis on offloading to accelerator devices. We do not focus on performance, but rather on the usability of OpenMP for the latest HPC systems. We will begin by explaining basic concepts of parallelism, threads, and memory. We will also explain the offloading abstraction and how to program both CPUs and devices (mainly GPUs) using OpenMP. The tutorial uses Chameleon Cloud to provide access to compute resources containing GPUs, and it is written in Jupyter Notebooks, lowering the barrier for attendees with little command-line (CLI) experience.

Instructors:

• Jose M Monsalve Diaz (Argonne National Laboratory, USA)
• Aurelio Vivas (Universidad de los Andes, Colombia)
• Kevin Brown (Argonne National Laboratory, USA)
• Verónica Melesse Vergara (Oak Ridge National Laboratory, USA)

Monday, September 26th, 2022 / 09:00-17:30

Abstract:

This workshop teaches the fundamental tools and techniques for accelerating applications written in C/C++ to run on massively parallel architectures with CUDA. With CUDA, developers can dramatically accelerate the computation of applications on GPU (graphics processing unit) architectures. The course does not require any prior knowledge of CUDA, but it does require a basic understanding of C/C++, such as variable types, loops, conditional statements, functions, and array manipulation.
The workshop has the following learning objectives for attendees: (I) write parallel code to run on the GPU; (II) expose and express data- and instruction-level parallelism in C/C++ applications using CUDA; (III) use CUDA-managed memory and optimize memory migration using asynchronous prefetching; (IV) use concurrent streams for instruction-level parallelism; and (V) write GPU-accelerated applications in CUDA C/C++, or refactor existing CPU-only applications, using a profile-driven approach. By the end of the workshop, participants will have access to additional resources for accelerating new applications on the GPU on their own. In addition, after completing the final test, participants will receive an NVIDIA DLI certificate recognizing their competence in the subject and supporting their professional career growth.

Instructors:

• Arthur Lorenzon (Federal University of Rio Grande do Sul, Brazil)