The suggested topics are listed below. The literature indicated here is intended as a starting point only; further literature should be researched independently.
1) Modern accelerator technologies for machine learning applications
Accelerators that work alongside the CPUs of compute servers to speed up numerically intensive regions of an application are becoming widely adopted in HPC systems. This topic discusses contemporary accelerator technologies (GPUs, TPUs, etc.) that are used to boost the performance of machine learning applications. In addition, the topic examines the energy efficiency of the considered accelerators, analyzes the state-of-the-art benchmarks used for their evaluation, and provides a summary of the main differences between them.
- Reuther, Albert, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, and Jeremy Kepner. "Survey and benchmarking of machine learning accelerators." In 2019 IEEE high performance extreme computing conference (HPEC), pp. 1-9. IEEE, 2019.
- Wang, Yuxin, Qiang Wang, Shaohuai Shi, Xin He, Zhenheng Tang, Kaiyong Zhao, and Xiaowen Chu. "Benchmarking the performance and energy efficiency of AI accelerators for AI training." In 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), pp. 744-751. IEEE, 2020.
2) Introduction to Field-Programmable Gate Arrays (FPGAs)
This topic provides an introduction to FPGAs, outlines their purpose and usability, and discusses different design flows and concepts.
3) Arm Scalable Vector Extension version 2 (SVE2)
This topic introduces the Arm Scalable Vector Extension version 2 (SVE2), which builds on SVE (v1), the technology used by the processors of Fugaku, the fastest supercomputer in the November 2020 TOP500 ranking. SVE2 is used by Arm's Armv9-A architecture to further enhance the processing of ML workloads. The topic outlines the fundamentals of the SVE2 architecture, describes its new features, and discusses its power-efficiency aspects.
4) Contemporary HPC and the World's Fastest Supercomputer Fugaku
This topic discusses the state of supercomputing today by analyzing the current TOP500 list and identifying the major trends. Additionally, it describes the Fugaku system, deployed at the RIKEN Center for Computational Science (Japan) and currently recognized as the world's fastest supercomputer (TOP500 ranking of November 2020). The topic discusses the system design, describes its compute and cooling infrastructure, and examines its energy efficiency.
5) Emerging Non-Volatile Memory (NVM) Technologies
This topic discusses different contemporary NVM technologies, introduces their design and architecture, and outlines corresponding applications.
6) Ensemble learning
This topic introduces the concept of ensemble learning, a technique that combines the results of several models trained on the same data set in order to increase the overall prediction accuracy.
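As a rough illustration of combining model results, the following minimal sketch applies majority voting, one common ensemble strategy among several. The three "models" are represented only by hypothetical, made-up prediction lists, and the helper names `majority_vote` and `accuracy` are assumptions introduced for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    # zip(*predictions) groups the labels that all models assigned to the same sample.
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def accuracy(predicted, truth):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Made-up ground truth and predictions; each model is wrong on a different sample.
truth = [1, 0, 1, 1]
model_a = [1, 0, 0, 1]
model_b = [1, 1, 1, 1]
model_c = [0, 0, 1, 1]

ensemble = majority_vote([model_a, model_b, model_c])
print(accuracy(model_a, truth), accuracy(ensemble, truth))
```

In this toy data each individual model misclassifies one of four samples, while the vote corrects all of them because the errors do not overlap. Real ensembles only benefit to the extent that the errors of their members are not strongly correlated.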
7) Transfer learning
This topic introduces the concept of transfer learning, discusses its theory and the statistical guarantees.
- Yang, Qiang, Yu Zhang, Wenyuan Dai, and Sinno Jialin Pan. Transfer learning. Cambridge University Press, 2020.
- Tripuraneni, Nilesh, Michael I. Jordan, and Chi Jin. "On the theory of transfer learning: The importance of task diversity." Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
- Neyshabur, Behnam, Hanie Sedghi, and Chiyuan Zhang. "What is being transferred in transfer learning?" Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
8) Beyond supervised and unsupervised learning
This topic discusses machine learning paradigms beyond classical supervised and unsupervised learning, and outlines best practices for designing algorithms that take advantage of such paradigms.
9) Quantum machine learning
This topic discusses how quantum-enhanced machine learning can improve computational speed and data storage. It also introduces the TensorFlow Quantum library.
10) Evaluating unsupervised learning methods for outlier detection
Data sanitization is an essential step when building effective machine learning based models. This topic discusses different unsupervised learning methods for outlier detection, analyzes their differences, and outlines their advantages and disadvantages.
- Campos, Guilherme O., Arthur Zimek, Jörg Sander, Ricardo JGB Campello, Barbora Micenková, Erich Schubert, Ira Assent, and Michael E. Houle. "On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study." Data mining and knowledge discovery 30, no. 4 (2016): 891-927.
- Domingues, Rémi, Maurizio Filippone, Pietro Michiardi, and Jihane Zouaoui. "A comparative evaluation of outlier detection algorithms: Experiments and analyses." Pattern Recognition 74 (2018): 406-421.
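To make the idea of an unsupervised outlier score concrete, here is a minimal sketch of a distance-based score: each point is scored by the distance to its k-th nearest neighbor, and larger scores suggest outliers. This is only one simple method of the kind the topic compares. The function name `knn_outlier_scores`, the sample points, and the choice `k=2` are assumptions made for illustration.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Made-up 2D data: four points on a unit square and one point far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = knn_outlier_scores(points, k=2)

# The point with the largest score is the strongest outlier candidate.
outlier_index = max(range(len(points)), key=scores.__getitem__)
print(outlier_index, scores)
```

No labels are used anywhere, which is what makes the method unsupervised. Choosing `k` and a threshold on the score is left open here; how such choices affect the results is part of what evaluations like the two papers above examine.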
11) Deep Semi-Supervised Anomaly Detection
This topic continues the discussion of anomaly detection. It investigates cases where, in addition to a large set of unlabeled data, one has access to a small set of labeled data. The topic discusses how a semi-supervised learning approach can make the most of the available labeled data in order to increase the overall classification accuracy.
12) Unsupervised translation of programming languages
The topic introduces a machine learning based transcompiler and discusses its generalization.
13) Generating face images from sketches
This topic discusses recent machine learning based approaches that transform human-drawn sketches into images.
14) Edge computing: is the era of traditional data centers coming to an end?
This topic introduces the edge computing paradigm, outlines current trends and challenges, and discusses the impact this distributed computing paradigm might have on the future of data centers.
15) Building your own storage cloud with Raspberry Pi and external hard drives
This topic discusses ways of deploying a secure and robust storage cloud on a Raspberry Pi using open-source frameworks. The report provides a detailed analysis of the required open-source frameworks, evaluates their ease of deployment, security, and extensibility, and assesses the effort required for operation and maintenance.
16) Performance analysis of cloud applications
This topic discusses various strategies and techniques for analysing the performance of large-scale cloud applications.
17) Container orchestration tools: advantages, disadvantages, and challenges
This topic discusses contemporary container orchestration/management platforms and outlines their advantages and disadvantages. Furthermore, the topic touches upon deployment and operation challenges and discusses how these platforms help with the development of AI microservices.
18) Elasticsearch
Nowadays, whether we are checking emails, purchasing from an online store, or reading a document, we expect a fast search engine to work flawlessly for us in the background. Moreover, we typically expect these engines to be intelligent enough to support us in manipulating our (pending) search results (e.g. sorting, suggesting, etc.).
Elasticsearch is an open-source search engine developed in Java and built on the Apache Lucene library. This topic introduces the basic concepts and principles of Elasticsearch, and outlines how typical search problems can be tackled with its help.
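One of the basic concepts behind full-text search engines of this kind is the inverted index, which maps each term to the documents that contain it. The sketch below is a toy illustration in plain Python and does not use the Elasticsearch API. The function names `build_index` and `search`, the whitespace tokenization, and the sample documents are assumptions made for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis step: lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Made-up sample documents keyed by a hypothetical document id.
docs = {
    1: "Fast search engine",
    2: "Search results sorting",
    3: "Fast sorting",
}
index = build_index(docs)
print(search(index, "fast search"))
print(search(index, "sorting"))
```

Because a lookup touches only the entries for the query terms, the cost of a search depends on the matching documents rather than on the size of the whole collection. Production engines add much more on top of this, such as text analysis, relevance scoring, and distribution across nodes.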