Below is the tentative list of available topics. Please consider the literature specified for each topic as a starting point for your own literature research.
Topic Area 1: Machine Learning
- Convolutional Neural Networks and Vision Transformers
- Federated Learning (a FedAvg sketch follows the references)
- Konečný et al., Federated Optimization: Distributed Machine Learning for On-Device Intelligence, https://arxiv.org/abs/1610.02527, 2016
- Yang et al., Federated Machine Learning: Concept and Applications, https://arxiv.org/abs/1902.04885, 2019
- Kairouz et al., Advances and Open Problems in Federated Learning, https://arxiv.org/abs/1912.04977, 2021
- LEAF: A Benchmark for Federated Settings, https://leaf.cmu.edu/, 2019
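
To make this topic concrete, here is a minimal sketch of FedAvg-style aggregation, the weighted model averaging at the heart of the references above. The one-step local update and the synthetic data are illustrative placeholders, not a reference implementation.

```python
import numpy as np

def local_update(weights, data, lr=0.1):
    """Toy local step: one full-batch gradient step of least squares."""
    X, y = data
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def fedavg_round(global_weights, client_data):
    """One round: clients train locally on private data, the server
    averages the results weighted by local dataset size."""
    updates, sizes = [], []
    for data in client_data:
        updates.append(local_update(global_weights.copy(), data))
        sizes.append(len(data[1]))
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(updates, sizes))

# Illustrative run: 3 clients holding synthetic linear-regression data.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=50)))

w = np.zeros(2)
for _ in range(100):
    w = fedavg_round(w, clients)
print(w)  # converges toward [2, -1] without pooling the raw data
```
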
- ML in Computational Sciences and HPC (a physics-informed-loss sketch follows the references)
- Jumper et al., Highly accurate protein structure prediction with AlphaFold, https://www.nature.com/articles/s41586-021-03819-2, 2021
- Fox et al., Understanding ML driven HPC: Applications and Infrastructure, https://arxiv.org/abs/1909.02363, 2019
- Abramson et al., Accurate structure prediction of biomolecular interactions with AlphaFold 3, https://www.nature.com/articles/s41586-024-07487-w, 2024
- Fawzi et al., Discovering faster matrix multiplication algorithms with reinforcement learning, https://www.nature.com/articles/s41586-022-05172-4, 2022
- Li et al., Fourier Neural Operator for Parametric Partial Differential Equations, https://arxiv.org/abs/2010.08895, 2020
- Pestourie et al., Active learning of deep surrogates for PDEs: application to metasurface design, https://www.nature.com/articles/s41524-020-00431-2, 2021
- Karniadakis et al., Physics-informed machine learning, Nature Reviews, https://www.nature.com/articles/s42254-021-00314-5, 2021
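
A minimal sketch of the physics-informed loss idea surveyed by Karniadakis et al., applied to the toy ODE u'(x) = -u(x), u(0) = 1 (exact solution e^(-x)). The network size, optimizer settings, and collocation points are arbitrary choices for illustration.

```python
import torch

# Small MLP surrogate u_theta(x).
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.linspace(0.0, 2.0, 64).reshape(-1, 1).requires_grad_(True)
x0 = torch.zeros(1, 1)  # boundary point for u(0) = 1

for step in range(2000):
    u = net(x)
    # du/dx via autograd -- this is where the physics enters the loss.
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    loss = ((du + u) ** 2).mean() + (net(x0) - 1.0).pow(2).mean()  # residual + BC
    opt.zero_grad()
    loss.backward()
    opt.step()

print(net(torch.tensor([[1.0]])).item())  # roughly exp(-1) ~ 0.37
```
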
- AI Sustainability (a carbon-accounting sketch follows the references)
- Wu et al., Sustainable AI: Environmental Implications, Challenges and Opportunities, https://proceedings.mlsys.org/paper/2022/file/ed3d2c21991e3bef5e069713af9fa6ca-Paper.pdf, 2022
- Dodge et al., Measuring the Carbon Intensity of AI in Cloud Instances, https://arxiv.org/abs/2206.05229, 2022
- Strubell et al., Energy and Policy Considerations for Deep Learning in NLP, https://arxiv.org/pdf/1906.02243.pdf, 2019
- Patterson et al., Carbon Emissions and Large Neural Network Training, https://arxiv.org/abs/2104.10350, 2021
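
The papers above share a simple accounting model: emissions ≈ energy drawn × datacenter overhead (PUE) × grid carbon intensity, where the intensity varies strongly by region and time of day (the point of Dodge et al.). A rough calculator under that model; every number in the example is an illustrative placeholder, not a measured value.

```python
def training_emissions_kg(gpu_count, hours, gpu_kw, pue, grid_gco2_per_kwh):
    """Back-of-the-envelope CO2e estimate for a training run.

    gpu_kw:            average draw per accelerator in kW (illustrative)
    pue:               datacenter power usage effectiveness (>= 1.0)
    grid_gco2_per_kwh: carbon intensity of the local grid
    """
    energy_kwh = gpu_count * hours * gpu_kw * pue
    return energy_kwh * grid_gco2_per_kwh / 1000.0

# Hypothetical run: 64 GPUs for two weeks at 0.4 kW each,
# PUE 1.1, on a 300 gCO2/kWh grid -> ~2.8 t CO2e.
print(training_emissions_kg(64, 14 * 24, 0.4, 1.1, 300.0))
```
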
Topic Area 2: Generative AI
- Transformer Models (an attention sketch follows the references)
- Dubey et al., The Llama 3 Herd of Models, https://arxiv.org/abs/2407.21783, 2024
- OpenAI, GPT-4 Technical Report, https://arxiv.org/abs/2303.08774, 2023
- Touvron et al., Llama 2: Open Foundation and Fine-Tuned Chat Models, https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/, 2023
- Brown et al., Language Models are Few-Shot Learners, https://arxiv.org/abs/2005.14165, 2020
- Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, https://arxiv.org/abs/1810.04805, 2019
- Vaswani et al., Attention Is All You Need, https://arxiv.org/abs/1706.03762, 2017
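
The shared core of the models above is scaled dot-product attention from Vaswani et al.: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. A minimal single-head NumPy sketch, without masking or the multi-head projections:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d_k), K: (m, d_k), V: (m, d_v); single head, no mask."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])           # (n, m) similarities
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # (n, d_v) mixtures of values

rng = np.random.default_rng(0)
Q, K = rng.normal(size=(4, 8)), rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)    # (4, 8)
```
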
- Emerging LLM Architectures (a state-space sketch follows the references)
- Gu et al., Mamba: Linear-Time Sequence Modeling with Selective State Spaces, https://arxiv.org/abs/2312.00752, 2023
- Hasani et al., Liquid Structural State-Space Models, https://arxiv.org/abs/2209.12951, 2022
- Lieber et al., Jamba: A Hybrid Transformer-Mamba Language Model, https://arxiv.org/pdf/2403.19887, 2024
- Jiang et al., Mixtral of Experts, https://arxiv.org/abs/2401.04088, 2024
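
Mamba and Jamba replace (or mix) attention with linear state-space layers built on the recurrence h_t = A h_{t-1} + B x_t, y_t = C h_t, which costs O(T) in sequence length. A minimal NumPy scan of that recurrence; Mamba's defining addition, input-dependent ("selective") A/B/C, is deliberately omitted:

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    """Discrete linear SSM: h_t = A h_{t-1} + B x_t, y_t = C h_t.
    A: (d, d), B: (d,), C: (d,), xs: (T,) scalars -> (T,) outputs."""
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:                 # linear-time scan over the sequence
        h = A @ h + B * x
        ys.append(C @ h)
    return np.array(ys)

rng = np.random.default_rng(0)
d = 4
A = 0.9 * np.eye(d)              # stable toy dynamics
B, C = rng.normal(size=d), rng.normal(size=d)
print(ssm_scan(A, B, C, rng.normal(size=16)).shape)   # (16,)
```
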
- Diffusion Models (a forward-process sketch follows the references)
- Rombach et al., High-Resolution Image Synthesis with Latent Diffusion Models, https://arxiv.org/abs/2112.10752, 2022
- OpenAI, DALL·E 2, https://openai.com/dall-e-2/, 2022
- Ramesh et al., Hierarchical Text-Conditional Image Generation with CLIP Latents, https://arxiv.org/abs/2204.06125, 2022
- Goodfellow et al., Generative Adversarial Nets, Advances in Neural Information Processing Systems (NeurIPS), https://arxiv.org/abs/1406.2661, 2014
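
The diffusion entries above all build on the forward noising process of DDPM-style models (Ho et al., not listed here): x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps with eps ~ N(0, I), and a network trained to predict eps. A minimal NumPy sketch of that step with an illustrative linear beta schedule:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # illustrative linear schedule
alpha_bar = np.cumprod(1.0 - betas)       # \bar{alpha}_t, shrinks toward 0

def q_sample(x0, t, rng):
    """Forward diffusion: sample x_t directly from clean data x0."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps                         # the denoiser is trained to predict eps

rng = np.random.default_rng(0)
x0 = rng.normal(size=8)
for t in (0, 500, 999):
    print(t, float(np.sqrt(alpha_bar[t])))  # signal scale decays with t
```
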
- Large Multi-Modal Models (a contrastive-loss sketch follows the references)
- Radford et al., Learning Transferable Visual Models From Natural Language Supervision, https://arxiv.org/abs/2103.00020, 2021
- Driess et al., PaLM-E: An Embodied Multimodal Language Model, Google Research, https://palm-e.github.io/assets/palm-e.pdf, 2023
- Liu et al., Visual Instruction Tuning, https://arxiv.org/pdf/2304.08485.pdf, 2023
- Agrawal et al., Pixtral 12B, https://arxiv.org/abs/2410.07073, 2024
- Meta Blog, Llama 3.2, https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/, 2024
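
CLIP (Radford et al.) is the recurring building block here: image and text encoders trained with a symmetric contrastive loss so that matched pairs get high cosine similarity. A minimal NumPy sketch of that loss, with random vectors standing in for real encoder outputs:

```python
import numpy as np

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over an (n, n) cosine-similarity matrix;
    matched image/text pairs sit on the diagonal."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    labels = np.arange(len(logits))

    def xent(l):                     # row-wise cross entropy, diagonal targets
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
print(clip_loss(rng.normal(size=(4, 16)), rng.normal(size=(4, 16))))
```
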
- Neurosymbolic AI
- Garcez and Lamb, Neurosymbolic AI: the 3rd wave, https://openaccess.city.ac.uk/id/eprint/32536/1/2012.05876.pdf, 2023
- Sun et al., Neurosymbolic Programming for Science, https://arxiv.org/pdf/2210.05050, 2022
- Bhuyan et al., Neuro-symbolic artificial intelligence: a survey, https://link.springer.com/article/10.1007/s00521-024-09960-z, 2024
- Trinh et al., Solving olympiad geometry without human demonstrations, https://www.nature.com/articles/s41586-023-06747-5, 2024
- Benchmarking Generative AI
- Srivastava et al., Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models, https://arxiv.org/abs/2206.04615, 2023
- Liang et al., Holistic Evaluation of Language Models, https://arxiv.org/pdf/2211.09110.pdf, 2022
- Borji, Pros and Cons of GAN Evaluation Measures: New Developments, https://arxiv.org/abs/2103.09396, 2021
- MLCommons, https://mlcommons.org/, 2023
- Dehghani et al., The Benchmark Lottery, https://arxiv.org/abs/2107.07002, 2021
Topic Area 3: AI Inference, Agents and Scaling
- LLM Inference and Routers (a routing sketch follows the references)
- Ong et al., RouteLLM: Learning to Route LLMs with Preference Data, https://arxiv.org/abs/2406.18665, 2024
- Hu et al., RouterBench: A Benchmark for Multi-LLM Routing System, https://arxiv.org/pdf/2403.12031, 2024
- Kwon et al., Efficient Memory Management for Large Language Model Serving with PagedAttention, https://arxiv.org/abs/2309.06180, 2023
- vLLM, https://github.com/vllm-project/vllm, 2023
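
The routing idea in RouteLLM, reduced to a skeleton: score each query and send easy ones to a cheap model, hard ones to a strong one, trading cost against answer quality. Everything below is a hypothetical stand-in; RouteLLM learns the scorer from preference data, while this sketch uses query length as a crude proxy, and `cheap_model`/`strong_model` are stubs:

```python
def difficulty_score(query: str) -> float:
    """Stand-in scorer (query length). RouteLLM would learn this."""
    return min(len(query.split()) / 50.0, 1.0)

def route(query: str, threshold: float = 0.3) -> str:
    """Dispatch to the cheap model unless the query looks hard."""
    if difficulty_score(query) < threshold:
        return cheap_model(query)
    return strong_model(query)

# Stubs standing in for real LLM endpoints.
cheap_model = lambda q: f"[small model] {q[:25]}..."
strong_model = lambda q: f"[large model] {q[:25]}..."

print(route("What is 2 + 2?"))                             # -> small model
print(route(" ".join(["clause"] * 40) + " prove this."))   # -> large model
```
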
- Vector Databases (a Faiss example follows the references)
- Johnson et al., Billion-scale similarity search with GPUs, https://arxiv.org/abs/1702.08734, 2017
- Facebook AI Similarity Search (Faiss), https://faiss.ai, 2023
- Guo et al., Manu: A Cloud Native Vector Database Management System, https://arxiv.org/pdf/2206.13843.pdf, 2022
- Wang et al., Milvus: A Purpose-Built Vector Data Management System, https://arxiv.org/pdf/2107.10021.pdf, 2021
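
A minimal Faiss example of the exact nearest-neighbor search that Johnson et al. scale to billions of vectors on GPUs (requires the faiss-cpu package; the vectors are random placeholders):

```python
import numpy as np
import faiss  # pip install faiss-cpu

d, nb, nq = 64, 10_000, 5
rng = np.random.default_rng(0)
xb = rng.random((nb, d)).astype("float32")   # database vectors
xq = rng.random((nq, d)).astype("float32")   # query vectors

index = faiss.IndexFlatL2(d)    # exact L2 search; IVF/HNSW variants trade
index.add(xb)                   # a little recall for much more speed
D, I = index.search(xq, 4)      # distances and ids of the 4 nearest neighbors
print(I)
```
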
- Retrieval Augmented Generation
- AI Agents (a ReAct-loop sketch follows the references)
- Wu et al., AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation, https://arxiv.org/abs/2308.08155, 2023
- CrewAI, https://github.com/crewAIInc/crewAI, 2024
- Langgraph, https://www.langchain.com/langgraph, 2024
- Han et al., LLM Multi-Agent Systems: Challenges and Open Problems, https://arxiv.org/pdf/2402.03578, 2024
- Yao et al., ReAct: Synergizing Reasoning and Acting in Language Models, https://arxiv.org/abs/2210.03629, 2022
- Yang et al., SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering, https://arxiv.org/abs/2405.15793, 2024
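
ReAct (Yao et al.) interleaves reasoning with tool use: the model emits an action, the environment returns an observation, and the loop repeats until the model produces a final answer. A minimal sketch; the action format, the scripted stand-in for the LLM, and the toy tool table are all assumptions for illustration:

```python
def react_loop(llm, question, tools, max_steps=5):
    """llm(transcript) returns an action like "calc[2+2]" or "Finish[4]"."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        if step.startswith("Finish["):
            return step[len("Finish["):-1]        # final answer
        name, _, arg = step.partition("[")
        obs = tools[name.strip()](arg.rstrip("]"))
        transcript += f"Observation: {obs}\n"     # fed back on the next call
    return None

tools = {"calc": lambda expr: eval(expr)}     # toy tool; never eval untrusted input
scripted = iter(["calc[2+2]", "Finish[4]"])   # stands in for a real model
print(react_loop(lambda _: next(scripted), "What is 2+2?", tools))  # -> 4
```
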
- Scaling Machine Learning (a compute-optimal calculator follows the references)
- Hoffmann et al., Training Compute-Optimal Large Language Models (Chinchilla), DeepMind, https://arxiv.org/pdf/2203.15556.pdf, 2022
- Kaplan et al., Scaling Laws for Neural Language Models, https://arxiv.org/pdf/2001.08361.pdf, 2020
- Dean et al., Large Scale Distributed Deep Networks, 2012
- Krizhevsky, One weird trick for parallelizing convolutional neural networks, https://arxiv.org/abs/1404.5997, 2014
- Li et al., Scaling Distributed Machine Learning with the Parameter Server, OSDI, 2014
- Zhao et al., PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel, https://arxiv.org/pdf/2304.11277.pdf, 2023
- Narayanan et al., Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM, https://arxiv.org/abs/2104.04473, 2021
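
The Chinchilla result from Hoffmann et al., as a rule of thumb: training compute is roughly C ≈ 6·N·D FLOPs for N parameters and D tokens, and compute-optimal training keeps roughly D ≈ 20 tokens per parameter, so parameters and data should be scaled together. A small calculator under those approximations:

```python
def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Compute-optimal size under C ~= 6*N*D and D ~= 20*N (Hoffmann et al.).
    Solving C = 6 * N * (20 * N) gives N = sqrt(C / 120)."""
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    return n_params, tokens_per_param * n_params

# Example: Chinchilla's own budget of roughly 5.76e23 FLOPs.
n, d = chinchilla_optimal(5.76e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")  # ~7e10 params, ~1.4e12 tokens
```
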
- AI Hardware
- Hooker, The Hardware Lottery, https://arxiv.org/abs/2009.06489, 2020
- NVIDIA H100 Tensor Core GPU Architecture Overview, https://resources.nvidia.com/en-us-tensor-core/gtc22-whitepaper-hopper, 2022
- Graphcore, https://www.graphcore.ai/
- Jouppi et al., TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings, https://arxiv.org/abs/2304.01433, 2023
- Google Cloud TPU v5e, https://cloud.google.com/blog/products/compute/announcing-cloud-tpu-v5e-and-a3-gpus-in-ga, 2023
- Cerebras, https://www.cerebras.net/
Topic Area 4: Quantum Computing
- Quantum Machine Learning (a PennyLane example follows the references)
- Abbas et al., The power of quantum neural networks, https://arxiv.org/pdf/2011.00027.pdf, 2020
- Cerezo et al., Variational quantum algorithms, https://www.nature.com/articles/s42254-021-00348-9, 2021
- Biamonte et al., Quantum machine learning, https://www.nature.com/articles/nature23474, 2017
- Schuld et al., An introduction to quantum machine learning, https://arxiv.org/pdf/1409.3097.pdf, 2014
- Hubregtsen et al., Evaluation of Parameterized Quantum Circuits: on the relation between classification accuracy, expressibility and entangling capability, 2020
- PennyLane, https://pennylane.ai/
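
A minimal PennyLane example of the variational pattern in Cerezo et al.: a parameterized circuit whose measured expectation value is driven down by a classical optimizer (requires the pennylane package; the two-qubit ansatz and the ZZ observable are arbitrary toy choices):

```python
import pennylane as qml
from pennylane import numpy as np  # autograd-aware NumPy

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(params):
    """Tiny variational ansatz: single-qubit rotations plus an entangler."""
    qml.RY(params[0], wires=0)
    qml.RY(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

params = np.array([0.1, 0.2], requires_grad=True)
opt = qml.GradientDescentOptimizer(stepsize=0.4)
for _ in range(50):
    params = opt.step(circuit, params)  # classical loop around quantum gradients
print(circuit(params))                  # approaches -1, the ZZ minimum
```
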
- Quantum Chemistry
Topic Area 5: ChatGPT
- ChatGPT meets WolframAlpha: Combining LLMs with real-time computation tools (a tool-dispatch sketch follows the references)
- Helfrich-Schkarbanenko, Wolfram-Plugin, in: Mathematik und ChatGPT, Springer Spektrum, https://link.springer.com/chapter/10.1007/978-3-662-68209-8_16, 2023
- Spannagel, Hat ChatGPT eine Zukunft in der Mathematik? (Does ChatGPT have a future in mathematics?), Mitteilungen der Deutschen Mathematiker-Vereinigung 31(3), pp. 168-172, https://doi.org/10.1515/dmvm-2023-0055, 2023
- Wolfram, ChatGPT Gets Its 'Wolfram Superpowers'!, Stephen Wolfram Writings, https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/, 2023
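
The plugin pattern these pieces describe, reduced to a sketch: the LLM decides when a question needs exact computation and delegates it to an external engine, then phrases the verified result. `llm` and `wolfram_query` are hypothetical callables; the real Wolfram plugin exchanges structured requests rather than plain strings.

```python
def answer(question, llm, wolfram_query):
    """Hypothetical dispatch between free-form generation and exact computation."""
    decision = llm(f"Does this need exact computation? yes/no: {question}")
    if decision.strip().lower().startswith("yes"):
        result = wolfram_query(question)   # real-time symbolic/numeric engine
        return llm(f"Explain this verified result: {result}")
    return llm(question)

# Stubs standing in for a real model and the Wolfram API.
fake_llm = lambda p: "yes" if "exact computation?" in p else f"[llm] {p}"
fake_wolfram = lambda q: "integral = pi/2"
print(answer("Integrate sin(x)/x from 0 to infinity", fake_llm, fake_wolfram))
```
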