The available topics are listed below. The literature indicated here is intended as a starting point only; further literature should be researched independently.
1) Intel's Global Extensible Open Power Manager (GEOPM)
A framework for power and energy optimization in High Performance Computing (HPC). It dynamically coordinates
hardware settings across the compute nodes used by a given parallel application, in response to the application's
behavior and to requests from the resource management and scheduling system.
2) Parallel and Distributed Deep Learning
Deep Neural Networks (DNNs) are becoming an important vehicle for
modern large-scale computing applications, while their training in most cases still requires a significant amount of
time. The topic discusses the training problem and describes possible approaches for its parallelization.
Final technical report
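As one illustration of the parallelization approaches mentioned above, the following is a minimal sketch of synchronous data-parallel training, assuming a toy one-parameter model y = w * x and plain gradient descent. The function names (shard, local_gradient, data_parallel_step) are hypothetical, the "workers" run sequentially in one process, and the averaging step only stands in for the all-reduce a real distributed framework would perform.

```python
# Minimal sketch of synchronous data-parallel SGD on a 1-D linear model y = w * x.
# All names are hypothetical; workers are simulated sequentially in one process.

def shard(data, num_workers):
    """Split the dataset into num_workers round-robin shards."""
    return [data[i::num_workers] for i in range(num_workers)]

def local_gradient(w, samples):
    """Mean gradient of the squared error 0.5 * (w*x - y)^2 over one shard."""
    return sum((w * x - y) * x for x, y in samples) / len(samples)

def data_parallel_step(w, shards, lr):
    """One synchronous step: every worker computes a gradient, then they are averaged."""
    grads = [local_gradient(w, s) for s in shards]  # would run concurrently on real workers
    avg = sum(grads) / len(grads)                   # stands in for an all-reduce
    return w - lr * avg

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # samples drawn from y = 3x
shards = shard(data, 4)

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
```

Because the shards have equal size here, the averaged gradient equals the full-batch gradient, so the parallel run follows the same trajectory as single-worker training on the whole dataset.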
3) Tensor Processing Units (TPUs) and their In-Datacenter Deployment Analysis
A Tensor Processing Unit (TPU)
is an application-specific integrated circuit (ASIC) developed by Google for accelerating the inference phase of
artificial neural networks. The topic describes these AI accelerators and discusses their in-datacenter
performance.
- Jouppi, Norman P., Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates et al. "In-datacenter performance analysis of a tensor processing unit." In Computer Architecture (ISCA), 2017 ACM/IEEE 44th Annual International Symposium on, pp. 1-12. IEEE, 2017.
4) Container technologies (e.g. Singularity, Docker, Charliecloud, Shifter)
Containers, being instances of
Operating System (OS) level virtualization, are becoming more appealing due to their higher efficiency compared
to full, hardware-level virtualization. The topic describes the concept of containers and discusses the
differences among the main container technologies currently used in HPC, such as Singularity, Docker, Charliecloud,
and Shifter, along with the Kubernetes orchestrator.
Final technical report
- Abdelbaky, Moustafa, Javier Diaz-Montes, Manish Parashar, Merve Unuvar, and Malgorzata Steinder. "Docker containers across multiple clouds and data centers." In Utility and Cloud Computing (UCC), 2015 IEEE/ACM 8th International Conference on, pp. 368-371. IEEE, 2015.
- Kurtzer, Gregory M., Vanessa Sochat, and Michael W. Bauer. "Singularity: Scientific containers for mobility of compute." PloS one 12, no. 5 (2017): e0177459.
- Charliecloud
5) Deep Neural Networks for Malware Detection in Executables
As the use of computing devices increases in
everyday life, malware detection remains a serious challenge for corporations, governmental agencies,
and individuals. Today, malware detection systems still rely heavily on heuristic and signature-based methods, where
a signature represents a set of rules that are generally specific and thus usually fail to capture new malware. This
topic discusses an alternative approach that uses neural networks and the raw bytes of the binary program itself to
determine maliciousness without executing the target application.
Final technical report
- David, Omid E., and Nathan S. Netanyahu. "Deepsign: Deep learning for automatic malware signature generation and classification." In Neural Networks (IJCNN), 2015 International Joint Conference on, pp. 1-8. IEEE, 2015.
- Raff, Edward, Jon Barker, Jared Sylvester, Robert Brandon, Bryan Catanzaro, and Charles K. Nicholas. "Malware detection by eating a whole exe."
6) Apache Kafka: a distributed streaming platform
The topic discusses Apache Kafka, a streaming
platform that allows applications to publish and subscribe to streams of data and to store and process them.
Kafka is used by tens of thousands of organizations, is among the fastest growing open source projects, and is
currently one of the key technologies for managing and processing streams of data. The topic provides a general
overview of the system and describes Kafka's key concepts (writing data to and reading data from Kafka).
Final technical report
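The publish/subscribe idea behind writing to and reading from a stream can be sketched with a toy in-memory model. This is not the Kafka client API: the class TopicLog and its methods publish and poll are hypothetical names, and the sketch assumes a single append-only log in which each consumer group keeps its own read offset.

```python
# Toy in-memory model of a topic as an append-only log.
# NOT the Kafka client API; TopicLog, publish, and poll are hypothetical names.

class TopicLog:
    def __init__(self):
        self._records = []   # retained records, in arrival order
        self._offsets = {}   # consumer group name -> next offset to read

    def publish(self, record):
        """Append a record to the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return unread records for a group and advance that group's offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch

log = TopicLog()
for event in ("login", "click", "logout"):
    log.publish(event)

first = log.poll("analytics", max_records=2)  # first two records
second = log.poll("analytics")                # the remaining record
audit = log.poll("audit")                     # a new group starts from offset 0
```

Records are not removed when they are read, so each group sees the full stream independently; only the per-group offset changes.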
7) Streaming and batch processing with Apache Flink
According to a recent article from Forbes, around 90
percent of the data in the world today has been created in the last two years alone, and with the growth of Internet
of Things (IoT) devices data volumes are expected to grow further. The topic discusses Apache Flink, an open source
engine for distributed stream processing, built by an open community and engineered to overcome certain tradeoffs
that have limited the effectiveness and/or ease of use of other approaches to processing streaming and batch data.
Final technical report
- Carbone, Paris, Asterios Katsifodimos, Stephan Ewen, Volker Markl, Seif Haridi, and Kostas Tzoumas. "Apache flink: Stream and batch processing in a single engine." Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 36, no. 4 (2015).
- Friedman, Ellen, and Kostas Tzoumas. Introduction to Apache Flink: Stream Processing for Real Time and Beyond. O'Reilly Media, Inc., 2016.
8) Contemporary HPC and the World's Fastest Supercomputer Summit
The topic discusses supercomputing
today by analyzing the current TOP500 list and identifying the major trends. Additionally, it describes the Summit
system, deployed at Oak Ridge National Laboratory and recognized as the world's fastest computer in June 2018 at the
International Supercomputing Conference, after delivering 122.3 PFLOPS in the LINPACK run. Since then,
the system has been upgraded to deliver 143.5 PFLOPS LINPACK performance, keeping it the fastest supercomputer
in the world (TOP500 rankings of November 2018). The topic discusses the system design aspects and describes its
compute and cooling infrastructure.
Final technical report
9) Bridges: Converging HPC, AI, and Big Data
The Bridges system, deployed at the Pittsburgh Supercomputing Center
(PSC), is specifically designed to meet the requirements of both traditional and non-traditional HPC communities.
The system features an interconnected set of interacting systems that offer exceptional flexibility for data
analytics, simulation, workflows, and gateways, leveraging interactivity, parallel computing, Spark, and Hadoop. The
topic describes the highly heterogeneous architecture of the Bridges system and outlines its design advantages.
Final technical report
10) Can less complex processors enable HPC?
Currently, there are three technology architectures evolving for
HPC. The first takes commodity processors and puts them together with a high-performance interconnect
to create a supercomputer. The second way to build such a high-end system is to take commodity processors
and augment them with accelerators (e.g. GPUs) that boost the delivered FLOP rate, i.e. the performance of the system.
This topic discusses the third category, which uses very lightweight processor cores that have very simple
structures and are less complex than the commodity ones. More specifically, the topic
addresses the possibility of building HPC systems from ARM multicore chips and provides a detailed performance and
energy efficiency evaluation.
Final technical report
- Rajovic, Nikola, Paul M. Carpenter, Isaac Gelado, Nikola Puzovic, Alex Ramirez, and Mateo Valero. "Supercomputing with commodity CPUs: Are mobile SoCs ready for HPC?." In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, p. 40. ACM, 2013.
- Rajovic, Nikola, Alejandro Rico, Nikola Puzovic, Chris Adeniyi-Jones, and Alex Ramirez. "Tibidabo: Making the case for an ARM-based HPC system." Future Generation Computer Systems 36 (2014): 322-334.
11) An empirical survey of performance and power variation of recent Intel processors
Traditional
performance and energy/power efficiency analysis of homogeneous HPC systems (in terms of installed hardware and
system software stack) assumes homogeneity across performance and power characteristics of underlying system compute
nodes. As a result, processor power and performance variation were relegated to the background during
system/application evaluation and benchmarking procedures. This topic considers the performance and power variation
aspects of HPC-grade Intel processors of recent generations.
12) Machine Learning Based Classification Over Encrypted Data
Today's HPC systems enable the application
of Machine Learning (ML) based methods in various domains, ranging from face recognition to medical or genomics
predictions, as they provide the required data storage and computational power. However, many of these ML-based
applications rely on sensitive data, making it important to protect the privacy of both the data and the
underlying classifier. The topic discusses ML-based classification techniques over encrypted data.
Final technical report
13) Cross-Architectural Modelling of Power Consumption Using Neural Networks
Energy consumption is becoming
a dominating factor for the Total Cost of Ownership of many supercomputers, making it important to keep energy costs
in budget and to operate within the available capacities of power distribution and cooling systems. The topic
considers the prediction of the power consumption of HPC systems using artificial neural networks that take data
obtained from hardware performance counters as input. The topic discusses the accuracy of the proposed methodology,
which is portable across different micro-architecture implementations, and outlines its advantages over simpler,
linear-regression-based counterparts.
14) Next generation arithmetic for HPC and AI
Posit arithmetic, a form of universal number (unum)
computer arithmetic, is designed as a direct drop-in replacement for IEEE Standard 754 floating-point numbers
(floats). Posits provide compelling advantages over floats, including larger dynamic range, higher accuracy, better
closure, bitwise identical results across systems, simpler hardware, and simpler exception handling. Posits never
overflow to infinity or underflow to zero, and 'Not-a-Number' (NaN) indicates an action instead of a bit pattern. The
topic discusses posit arithmetic and outlines its advantages over the fixed-point arithmetic approaches currently
used for AI and signal processing.
Final technical report
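To make the posit format more concrete, the following is a minimal sketch of decoding a posit bit pattern into an exact rational value. It assumes the commonly described layout (sign bit, regime run, up to es exponent bits, then fraction bits) with value useed^k * 2^e * (1 + f), where useed = 2^(2^es). The function name decode_posit and the defaults n=8, es=0 are illustrative assumptions. The single pattern 100...0 is returned as None here, treating it as one exception value; some descriptions interpret that pattern as ±infinity instead.

```python
from fractions import Fraction

def decode_posit(bits, n=8, es=0):
    """Decode an n-bit posit with es exponent bits (hypothetical helper).

    Returns an exact Fraction, or None for the exception pattern 100...0.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # single exception value (assumption: not treated as infinity)
    negative = (bits >> (n - 1)) == 1
    if negative:
        bits = (-bits) & mask  # two's complement yields the magnitude encoding
    body = format(bits, f"0{n}b")[1:]  # drop the sign bit
    # Regime: a run of identical bits, ended by the opposite bit or the end of the word.
    first = body[0]
    run = len(body) - len(body.lstrip(first))
    k = run - 1 if first == "1" else -run
    rest = body[run + 1:]  # skip the terminating bit
    exp_bits = rest[:es].ljust(es, "0")  # truncated exponent bits count as zeros
    e = int(exp_bits, 2) if es else 0
    frac_bits = rest[es:]
    frac = Fraction(int(frac_bits, 2), 1 << len(frac_bits)) if frac_bits else Fraction(0)
    magnitude = Fraction(2) ** (k * (1 << es) + e) * (1 + frac)
    return -magnitude if negative else magnitude

one = decode_posit(0b01000000)         # regime k = 0, no fraction bits set
smallest = decode_posit(0b00000001)    # longest run of zeros in the regime
largest = decode_posit(0b01111111)     # longest run of ones in the regime
```

The regime run length sets the coarse scale, so values near 1 keep more fraction bits while very large or very small values trade fraction bits for range.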