Seminar Topics
List of Topics
This topic focuses on exploring the use of programming languages such
as Python to develop and synthesize algorithms for Field-Programmable
Gate Arrays (FPGAs). FPGAs are reconfigurable devices whose digital
circuits can be customized after manufacturing, allowing them to be
programmed for a wide variety of tasks. However, programming FPGAs can
be challenging because of the low-level, hardware-specific languages
(such as VHDL and Verilog) typically used. High-level programming
offers a more intuitive and efficient
approach to FPGA programming, enabling faster development and
iteration of designs. This topic will examine recent research in
using Python as a high-level language for FPGA programming, including
the PyLog and data-centric multi-level design approaches.
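To give a flavor of the programming model, the sketch below mimics a
decorator-based high-level synthesis flow in plain Python. The
`hls_kernel` decorator and its `synthesizable` flag are hypothetical
illustrations, not the actual PyLog API; a real flow would analyze the
decorated function and lower it to a hardware description.

```python
# Hypothetical decorator-based HLS flow (illustrative only, NOT the
# real PyLog API): a real flow would lower the function body to HDL
# instead of simply running it in Python.
def hls_kernel(fn):
    fn.synthesizable = True  # mark the function as a hardware kernel
    return fn

@hls_kernel
def vadd(a, b):
    # Element-wise vector add: the kind of loop that maps naturally
    # onto a pipelined datapath on an FPGA.
    return [x + y for x, y in zip(a, b)]

print(vadd([1, 2, 3], [4, 5, 6]))  # the kernel still runs as software
```

The appeal of such a flow is that the same Python function remains
executable for testing while serving as the source for synthesis.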
This topic explores how Compute Express Link (CXL) technology can be
used to
improve memory performance in High-Performance Computing (HPC)
systems. With CXL-enabled memory pooling, HPC systems can disaggregate
memory from individual servers and share it across them, allowing for
faster and more efficient data processing. The topic also covers
direct-access, high-performance memory disaggregation with DirectCXL,
which provides a low-latency, high-bandwidth interconnect between
processors, memory, and accelerators.
This topic explores the use of serverless architectures in
high-performance computing. Serverless architectures are based on
function-as-a-service (FaaS) platforms, allowing researchers and
scientists to execute computing tasks without worrying about the
underlying infrastructure. It examines the potential of serverless
computing for scientific applications, including the use of GPUs
(graphics processing units).
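To make the FaaS model concrete, here is a minimal handler sketch,
modeled on the event/context entry-point convention of AWS Lambda's
Python runtime; the payload shape and the toy workload are
illustrative assumptions, not any specific platform's schema.

```python
import json

def handler(event, context):
    # Stateless FaaS-style entry point: all input arrives in `event`,
    # and the platform, not the user, provisions the execution node.
    n = int(event.get("n", 0))
    # Toy stand-in for a scientific task (illustrative assumption):
    result = sum(i * i for i in range(n + 1))
    return {"statusCode": 200,
            "body": json.dumps({"sum_of_squares": result})}
```

Because the function is stateless, the platform can scale it out to
thousands of concurrent invocations without any cluster management by
the scientist.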
This topic explores the use of Graphcore's Intelligence Processing
Unit (IPU) as an accelerator in high-performance computing (HPC). The
topic will delve into the architecture and micro-benchmarking of
Graphcore's IPU, and how it can be utilized for both traditional HPC
applications and specialized fields such as particle physics. The goal
is to gain a deeper understanding of the potential of AI
accelerators in HPC.
AI accelerators are designed to handle specific types of computations,
such as matrix multiplication and convolution operations, which are
heavily used in deep learning models. One example of an AI accelerator
is the Cerebras Wafer-Scale Engine, a wafer-sized chip that contains a
very large number of processing elements and an on-chip fabric that
allows for efficient communication between different parts of the
chip. There is increasing interest in employing these architectures in
the field of HPC to accelerate workloads outside the field of AI.
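As a reminder of the computational pattern these accelerators target,
here is a naive one-dimensional convolution (strictly, the
cross-correlation used in deep learning) in plain Python; hardware
like the Wafer-Scale Engine parallelizes exactly this kind of
multiply-accumulate loop across many processing elements.

```python
def conv1d(signal, kernel):
    # Naive 1-D "convolution" as used in deep learning (no kernel
    # flip, i.e. cross-correlation): each output is a
    # multiply-accumulate over a sliding window — the core pattern
    # AI accelerators speed up.
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

print(conv1d([1, 2, 3, 4], [1, 0, -1]))  # edge-detector-style kernel
```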
Mixed precision is a topic in computer science and mathematics that
explores the use of different precision levels (e.g., single, half, or
double precision) in scientific computations. By optimizing the
precision level of different parts of an algorithm, mixed precision
can accelerate the execution of scientific computations while
maintaining high accuracy. This topic explores recent research in
mixed precision, including strategies for utilizing mixed precision in
numerical linear algebra and the potential new scientific
opportunities enabled by this technique.
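A classic strategy from the numerical-linear-algebra literature is
iterative refinement: factor and solve in low precision, then compute
residuals and corrections in high precision. The sketch below emulates
float32 arithmetic in plain Python (via `struct` rounding) on a 2x2
system; the helper names and the example problem are illustrative.

```python
import struct

def f32(x):
    # Round a Python float (binary64) to binary32, emulating the
    # low-precision arithmetic a fast solver would use.
    return struct.unpack("f", struct.pack("f", x))[0]

def solve2_low(A, b):
    # Gaussian elimination on a 2x2 system with every intermediate
    # rounded to float32 — the cheap, low-precision solve.
    m = f32(A[1][0] / A[0][0])
    u11 = f32(A[1][1] - f32(m * A[0][1]))
    x1 = f32(f32(b[1] - f32(m * b[0])) / u11)
    x0 = f32(f32(b[0] - f32(A[0][1] * x1)) / A[0][0])
    return [x0, x1]

def refine(A, b, iters=3):
    x = solve2_low(A, b)          # initial low-precision solution
    for _ in range(iters):
        # Residual in full double precision — the accuracy-critical
        # step that restores high accuracy cheaply.
        r = [b[i] - sum(A[i][j] * x[j] for j in range(2))
             for i in range(2)]
        d = solve2_low(A, r)      # cheap correction solve
        x = [x[i] + d[i] for i in range(2)]
    return x
```

Refining a float32 solve of A = [[4, 1], [1, 3]], b = [1, 2] this way
recovers the exact solution (1/11, 7/11) to near double-precision
accuracy, even though all solves ran in low precision.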
This topic explores the intersection of trusted execution environments
(TEEs) and high-performance computing (HPC). TEEs provide secure
enclaves within a computing environment that can protect sensitive
data and code from external threats. The topic will investigate the
use of TEEs in scientific computing workloads, confidential HPC in the
public cloud, and programming applications suitable for secure
multiparty computation based on TEEs.
This topic is focused on the architecture and programming of Intel's
Data Center GPU, specifically the Ponte Vecchio model. The goal is to
explore the technical details of the GPU architecture, systems, and
software, including the latest advancements in data-parallel
programming using C++. The topic also includes research on enhancing
productivity and performance through extensions to the SYCL
programming language.
This topic explores the latest approaches for simulating quantum
computers on high-performance computing (HPC) systems. Quantum
computers are expected to revolutionize computing by solving problems
that classical computers cannot. However, building and operating
large-scale quantum computers is extremely challenging. Hence,
simulating quantum computers on classical computers has become an
essential tool for exploring the potential of quantum computing and
developing quantum algorithms. The suggested resources provide an
overview of the latest approaches for quantum simulation with hardware
acceleration, just-in-time compilation, and scalable state vector
simulation of quantum circuits. This topic should cover the challenges
of simulating quantum computers on HPC systems and how novel
approaches address some of the limitations of traditional quantum
simulation methods.
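At its core, state-vector simulation stores all 2^n amplitudes of an
n-qubit register and applies each gate as a small linear
transformation. The pure-Python sketch below (real amplitudes only,
for brevity) prepares a Bell state with a Hadamard and a CNOT; the
function and variable names are illustrative.

```python
import math

def apply_1q(state, gate, target):
    # Apply a 2x2 single-qubit gate to qubit `target` of the vector.
    out = state[:]
    t = 1 << target
    for i in range(len(state)):
        if i & t == 0:
            a, b = state[i], state[i | t]
            out[i] = gate[0][0] * a + gate[0][1] * b
            out[i | t] = gate[1][0] * a + gate[1][1] * b
    return out

def apply_cnot(state, control, target):
    # CNOT permutes basis states: flip `target` where `control` is 1.
    out = state[:]
    for i in range(len(state)):
        if i & (1 << control):
            out[i] = state[i ^ (1 << target)]
    return out

s = math.sqrt(0.5)
H = [[s, s], [s, -s]]                 # Hadamard gate

state = [1.0, 0.0, 0.0, 0.0]          # 2-qubit register in |00>
state = apply_1q(state, H, 0)         # superpose qubit 0
state = apply_cnot(state, control=0, target=1)
# state is now the Bell state (|00> + |11>) / sqrt(2)
```

The vector doubles with every added qubit, which is exactly why
HPC-scale memory and hardware acceleration matter for these
simulations.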
This topic focuses on efforts to ensure that scientific software can
be reliably reproduced and executed in high-performance computing
(HPC) environments. This is critical to ensure that research results
are trustworthy and can be independently verified. The suggested
resources provide a comprehensive overview of the latest approaches
for managing software packages, creating user-controlled software
environments, and using reproducible containers to enable scientific
computing. The goal of this topic is to examine the limitations of
traditional software management and deployment in HPC environments and
how reproducibility efforts address them.
This topic explores the latest developments in containerization
technologies for high-performance computing (HPC). Containers have
become an essential tool for managing and deploying applications in a
reproducible and scalable manner. The suggested resources provide an
overview of the most recent approaches for containerization, including
the use of WebAssembly, unprivileged containers, and Singularity. This
topic covers the benefits and challenges of using containers in HPC,
and how these novel approaches address some of the limitations of
traditional containerization technologies.
SmartNICs are specialized network interface cards (NICs) capable of
performing complex computations at the edge of the network. These
intelligent NICs have
become increasingly important as data centers and cloud
infrastructures seek to offload processing tasks from the central
processing units (CPUs) of servers. The suggested resources provide a
comprehensive overview of SmartNIC architectures, applications, and
performance benchmarks.
Processing in Memory (PIM) is a technology that performs computations
directly in memory instead of transferring data between memory and
processing units. This topic is relevant in the context of
big data, machine learning, and other data-intensive applications that
require high-speed processing of large amounts of data. The goal is to
provide a comprehensive overview of this topic, from technical
concepts to practical implementation, and to provide an analysis of the
advantages and challenges of processing in memory and its potential
impact on the future of computing.
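A quick back-of-the-envelope calculation shows why data movement
dominates such workloads: a vector addition performs only one
floating-point operation per 24 bytes moved, far below what processors
can sustain from DRAM.

```python
# Arithmetic intensity of c = a + b over n 64-bit floats:
n = 10**6
flops = n                    # one add per element
bytes_moved = 3 * 8 * n      # read a, read b, write c (8 bytes each)
intensity = flops / bytes_moved
print(intensity)             # 1/24 ≈ 0.042 flop/byte: heavily
                             # memory-bound, hence the appeal of
                             # computing directly in memory
```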
This topic focuses on exploring the use of the RISC-V instruction set
architecture in high-performance computing applications. It covers
various aspects of RISC-V in HPC, including xBGAS, a global address
space extension on RISC-V for high-performance computing, Coyote, an
open-source simulation tool that enables RISC-V in HPC, and
Vitruvius+, an area-efficient RISC-V decoupled vector coprocessor for
high-performance computing applications. The goal for this topic is to
provide a deep understanding of the benefits and challenges of using
RISC-V in HPC and how to design and optimize RISC-V-based systems for
high-performance computing applications.
This topic explores the use of ARM processors as an alternative to x86
processors in HPC. This topic examines the performance and energy
consumption of HPC workloads on example ARM systems such as the
ThunderX2 CPU and the Fujitsu A64FX. It also analyzes the advantages
and disadvantages of these systems compared to alternatives.
This topic is focused on MLIR (Multi-Level Intermediate
Representation), an approach for building efficient, reusable compiler
infrastructure for domain-specific computations. One example is MLIR's
support for sparse tensor computations, which can lead to more
efficient code. The topic also explores high-level synthesis and
experimentation with MLIR polyhedral representations for accelerator
design.
Distributed Asynchronous Object Storage (DAOS) is an open-source,
distributed object store designed for HPC systems that require
high-performance storage solutions. The DAOS architecture is designed
to provide scalable, distributed, and asynchronous access to object
storage. It uses a distributed approach, with data access managed
through object-oriented APIs. This architecture enables applications
to access data in parallel, without the need for centralized metadata
management, which can be a bottleneck for high-performance storage
solutions. This topic has the goal of analyzing the pros and cons of
DAOS in comparison to alternatives such as Lustre and Ceph.
This topic explores a new number representation system that aims to
improve the precision and efficiency of scientific calculations in
high-performance computing. Posits are an alternative to traditional
floating-point numbers, and some researchers believe they may offer
superior accuracy and speed for certain types of HPC
applications. This topic will examine the theory behind posits and
assess their potential benefits in scientific computing, based on
recent research published in academic journals.
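To ground the theory, here is a minimal posit decoder in plain Python
following the published decoding rules (sign, regime, exponent,
fraction). It is a teaching sketch, not a validated implementation:
only zero and NaR are handled specially, and rounding is ignored.

```python
def decode_posit(bits, n=8, es=0):
    # Decode an n-bit posit with es exponent bits to a Python float.
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")              # NaR ("not a real")
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & ((1 << n) - 1)  # two's complement to decode
    body = bits & ((1 << (n - 1)) - 1)   # bits after the sign bit
    first = (body >> (n - 2)) & 1
    i, run = n - 2, 0
    while i >= 0 and ((body >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first else -run       # regime value
    i -= 1                               # skip the regime's stop bit
    rem_bits = max(i + 1, 0)             # exponent + fraction bits
    rem = body & ((1 << rem_bits) - 1)
    e_bits = min(es, rem_bits)
    e = (rem >> (rem_bits - e_bits)) << (es - e_bits) if e_bits else 0
    f_bits = rem_bits - e_bits
    frac = 1 + (rem & ((1 << f_bits) - 1)) / (1 << f_bits) \
        if f_bits else 1.0
    return (-1.0) ** sign * 2.0 ** (k * (1 << es) + e) * frac
```

For example, with n=8 and es=0 the pattern 0b01000000 decodes to 1.0,
0b01001000 to 1.25, and 0b11000000 (the two's complement of 1.0's
pattern) to -1.0, illustrating how precision tapers away from 1.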
This topic explores the use of Field-Programmable Gate Arrays (FPGAs)
to speed up the execution of programs that run on distributed
systems. Communication between the computers in a distributed system
is often a bottleneck that limits performance. FPGAs offer a way to
accelerate communication by offloading communication tasks to
hardware. This topic will examine recent research in using FPGAs to
accelerate communication in distributed systems, including
accelerating MPI collectives, scaling HPC challenge benchmarks via MPI
and inter-FPGA networks, and using streaming message interfaces for
high-performance distributed memory programming on reconfigurable
hardware.
This topic revolves around the concept of energy-aware computing,
which focuses on reducing the energy consumption of computing systems
and data centers. The topic covers aspects such as energy-efficient
hardware and software design, power management techniques, and
renewable energy integration.
Coarse-Grained Reconfigurable Architectures (CGRAs) are a class of
processors that are designed to provide high performance and energy
efficiency for a specific set of applications. They are typically
composed of a large number of processing elements (PEs) that can be
configured and reconfigured dynamically to perform a wide range of
computational tasks. One promising application domain for CGRAs is
High-Performance Computing (HPC), particularly for data-intensive
workloads such as matrix-based graph analytics.
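The reconfiguration idea can be illustrated with a toy software model:
a one-dimensional array of PEs, each configured with one operation and
a stationary operand, with a value streamed through the array. This is
a deliberately simplified sketch; real CGRAs configure 2-D fabrics
with much richer routing and parallelism.

```python
import operator

# Toy model of a 1-D CGRA: each processing element (PE) holds one
# configured operation and a stationary operand; data flows PE to PE.
OPS = {"add": operator.add, "mul": operator.mul, "max": max}

def run_fabric(config, x):
    # `config` is the per-PE configuration,
    # e.g. [("add", 1), ("mul", 2)].
    for op_name, operand in config:
        x = OPS[op_name](x, operand)
    return x

print(run_fabric([("add", 1), ("mul", 2)], 3))  # (3 + 1) * 2 = 8
# Reconfiguring the same fabric changes the computed function:
print(run_fabric([("mul", 2), ("add", 1)], 3))  # 3 * 2 + 1 = 7
```

The point of the sketch is that the hardware stays fixed while the
configuration, not the program counter, determines the computation.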