Below is the tentative list of available topics. Please consider the literature specified for each topic as a starting point for your own literature research.
Topic Area 1: Machine Learning
- Convolutional Neural Networks and Vision Transformers
- Federated Learning (a FedAvg sketch follows the references)
- Konečný et al., Federated Optimization: Distributed Machine Learning for On-Device Intelligence, https://arxiv.org/abs/1610.02527, 2016
- Yang et al., Federated Machine Learning: Concept and Applications, https://arxiv.org/abs/1902.04885, 2019
- Kairouz et al., Advances and Open Problems in Federated Learning, https://arxiv.org/abs/1912.04977, 2021
- LEAF: A Benchmark for Federated Settings, https://leaf.cmu.edu/, 2019
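
To make this topic concrete, here is a minimal sketch of FedAvg-style aggregation, the weighted model averaging at the heart of the references above. The one-step local update and the synthetic data are illustrative placeholders, not a reference implementation.

```python
import numpy as np

def local_update(weights, data, lr=0.1):
    """Toy local step: one full-batch gradient step of least squares."""
    X, y = data
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def fedavg_round(global_weights, client_data):
    """One round: clients train locally on private data, the server
    averages the results weighted by local dataset size."""
    updates, sizes = [], []
    for data in client_data:
        updates.append(local_update(global_weights.copy(), data))
        sizes.append(len(data[1]))
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(updates, sizes))

# Illustrative run: 3 clients holding synthetic linear-regression data.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=50)))

w = np.zeros(2)
for _ in range(100):
    w = fedavg_round(w, clients)
print(w)  # converges toward [2, -1] without pooling the raw data
```
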
- ML in Computational Sciences and HPC (a physics-informed-loss sketch follows the references)
- Jumper et al., Highly accurate protein structure prediction with AlphaFold, https://www.nature.com/articles/s41586-021-03819-2, 2021
- Fox et al., Understanding ML driven HPC: Applications and Infrastructure, https://arxiv.org/abs/1909.02363, 2019
- Abramson et al., Accurate structure prediction of biomolecular interactions with AlphaFold 3, https://www.nature.com/articles/s41586-024-07487-w, 2024
- Fawzi et al., Discovering faster matrix multiplication algorithms with reinforcement learning, https://www.nature.com/articles/s41586-022-05172-4, 2022
- Li et al., Fourier Neural Operator for Parametric Partial Differential Equations, https://arxiv.org/abs/2010.08895, 2020
- Pestourie et al., Active learning of deep surrogates for PDEs: application to metasurface design, https://www.nature.com/articles/s41524-020-00431-2, 2021
- Karniadakis et al., Physics-informed machine learning, Nature Reviews, https://www.nature.com/articles/s42254-021-00314-5, 2021
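
A minimal sketch of the physics-informed loss idea surveyed by Karniadakis et al., applied to the toy ODE u'(x) = -u(x), u(0) = 1 (exact solution e^(-x)). The network size, optimizer settings, and collocation points are arbitrary choices for illustration.

```python
import torch

# Small MLP surrogate u_theta(x).
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.linspace(0.0, 2.0, 64).reshape(-1, 1).requires_grad_(True)
x0 = torch.zeros(1, 1)  # boundary point for u(0) = 1

for step in range(2000):
    u = net(x)
    # du/dx via autograd -- this is where the physics enters the loss.
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    loss = ((du + u) ** 2).mean() + (net(x0) - 1.0).pow(2).mean()  # residual + BC
    opt.zero_grad()
    loss.backward()
    opt.step()

print(net(torch.tensor([[1.0]])).item())  # roughly exp(-1) ~ 0.37
```
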
- AI Sustainability (a carbon-accounting sketch follows the references)
- Wu et al., Sustainable AI: Environmental Implications, Challenges and Opportunities, https://proceedings.mlsys.org/paper/2022/file/ed3d2c21991e3bef5e069713af9fa6ca-Paper.pdf, 2022
- Dodge et al., Measuring the Carbon Intensity of AI in Cloud Instances, https://arxiv.org/abs/2206.05229, 2022
- Strubell et al., Energy and Policy Considerations for Deep Learning in NLP, https://arxiv.org/pdf/1906.02243.pdf, 2019
- Patterson et al., Carbon Emissions and Large Neural Network Training, https://arxiv.org/abs/2104.10350, 2021
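
The papers above share a simple accounting model: emissions ≈ energy drawn × datacenter overhead (PUE) × grid carbon intensity, where the intensity varies strongly by region and time of day (the point of Dodge et al.). A rough calculator under that model; every number in the example is an illustrative placeholder, not a measured value.

```python
def training_emissions_kg(gpu_count, hours, gpu_kw, pue, grid_gco2_per_kwh):
    """Back-of-the-envelope CO2e estimate for a training run.

    gpu_kw:            average draw per accelerator in kW (illustrative)
    pue:               datacenter power usage effectiveness (>= 1.0)
    grid_gco2_per_kwh: carbon intensity of the local grid
    """
    energy_kwh = gpu_count * hours * gpu_kw * pue
    return energy_kwh * grid_gco2_per_kwh / 1000.0

# Hypothetical run: 64 GPUs for two weeks at 0.4 kW each,
# PUE 1.1, on a 300 gCO2/kWh grid -> ~2.8 t CO2e.
print(training_emissions_kg(64, 14 * 24, 0.4, 1.1, 300.0))
```
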
Topic Area 2: Generative AI
- Transformer Models (an attention sketch follows the references)
- Dubey et al., The Llama 3 Herd of Models, https://arxiv.org/abs/2407.21783, 2024
- OpenAI, GPT-4 Technical Report, https://arxiv.org/abs/2303.08774, 2023
- Touvron et al., Llama 2: Open Foundation and Fine-Tuned Chat Models, https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/, 2023
- Brown et al., Language Models are Few-Shot Learners, https://arxiv.org/abs/2005.14165, 2020
- Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, https://arxiv.org/abs/1810.04805, 2019
- Vaswani et al., Attention Is All You Need, https://arxiv.org/abs/1706.03762, 2017
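
The shared core of the models above is scaled dot-product attention from Vaswani et al.: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. A minimal single-head NumPy sketch, without masking or the multi-head projections:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d_k), K: (m, d_k), V: (m, d_v); single head, no mask."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])           # (n, m) similarities
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # (n, d_v) mixtures of values

rng = np.random.default_rng(0)
Q, K = rng.normal(size=(4, 8)), rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)    # (4, 8)
```
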
- Emerging LLM Architectures (a state-space sketch follows the references)
- Gu et al., Mamba: Linear-Time Sequence Modeling with Selective State Spaces, https://arxiv.org/abs/2312.00752, 2023
- Hasani et al., Liquid Structural State-Space Models, https://arxiv.org/abs/2209.12951, 2022
- Lieber et al., Jamba: A Hybrid Transformer-Mamba Language Model, https://arxiv.org/pdf/2403.19887, 2024
- Jiang et al., Mixtral of Experts, https://arxiv.org/abs/2401.04088, 2024
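
Mamba and Jamba replace (or mix) attention with linear state-space layers built on the recurrence h_t = A h_{t-1} + B x_t, y_t = C h_t, which costs O(T) in sequence length. A minimal NumPy scan of that recurrence; Mamba's defining addition, input-dependent ("selective") A/B/C, is deliberately omitted:

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    """Discrete linear SSM: h_t = A h_{t-1} + B x_t, y_t = C h_t.
    A: (d, d), B: (d,), C: (d,), xs: (T,) scalars -> (T,) outputs."""
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:                 # linear-time scan over the sequence
        h = A @ h + B * x
        ys.append(C @ h)
    return np.array(ys)

rng = np.random.default_rng(0)
d = 4
A = 0.9 * np.eye(d)              # stable toy dynamics
B, C = rng.normal(size=d), rng.normal(size=d)
print(ssm_scan(A, B, C, rng.normal(size=16)).shape)   # (16,)
```
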
- Diffusion Models (a forward-process sketch follows the references)
- Rombach et al., High-Resolution Image Synthesis with Latent Diffusion Models, https://arxiv.org/abs/2112.10752, 2022
- OpenAI, DALL·E 2, https://openai.com/dall-e-2/, 2022
- Ramesh et al., Hierarchical Text-Conditional Image Generation with CLIP Latents, https://arxiv.org/abs/2204.06125, 2022
- Goodfellow et al., Generative Adversarial Nets, Advances in Neural Information Processing Systems (NeurIPS), https://arxiv.org/abs/1406.2661, 2014
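
The diffusion entries above all build on the forward noising process of DDPM-style models (Ho et al., not listed here): x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps with eps ~ N(0, I), and a network trained to predict eps. A minimal NumPy sketch of that step with an illustrative linear beta schedule:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # illustrative linear schedule
alpha_bar = np.cumprod(1.0 - betas)       # \bar{alpha}_t, shrinks toward 0

def q_sample(x0, t, rng):
    """Forward diffusion: sample x_t directly from clean data x0."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps                         # the denoiser is trained to predict eps

rng = np.random.default_rng(0)
x0 = rng.normal(size=8)
for t in (0, 500, 999):
    print(t, float(np.sqrt(alpha_bar[t])))  # signal scale decays with t
```
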
- Large Multi-Modal Models (a contrastive-loss sketch follows the references)
- Radford et al., Learning Transferable Visual Models From Natural Language Supervision, https://arxiv.org/abs/2103.00020, 2021
- Driess et al., PaLM-E: An Embodied Multimodal Language Model, Google Research, https://palm-e.github.io/assets/palm-e.pdf, 2023
- Liu et al., Visual Instruction Tuning, https://arxiv.org/pdf/2304.08485.pdf, 2023
- Agrawal et al., Pixtral 12B, https://arxiv.org/abs/2410.07073, 2024
- Meta Blog, Llama 3.2, https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/, 2024
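
CLIP (Radford et al.) is the recurring building block here: image and text encoders trained with a symmetric contrastive loss so that matched pairs get high cosine similarity. A minimal NumPy sketch of that loss, with random vectors standing in for real encoder outputs:

```python
import numpy as np

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over an (n, n) cosine-similarity matrix;
    matched image/text pairs sit on the diagonal."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    labels = np.arange(len(logits))

    def xent(l):                     # row-wise cross entropy, diagonal targets
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
print(clip_loss(rng.normal(size=(4, 16)), rng.normal(size=(4, 16))))
```
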
- Neurosymbolic AI
- Garcez and Lamb, Neurosymbolic AI: the 3rd wave, https://openaccess.city.ac.uk/id/eprint/32536/1/2012.05876.pdf, 2023
- Sun et al., Neurosymbolic Programming for Science, https://arxiv.org/pdf/2210.05050, 2022
- Bhuyan et al., Neuro-symbolic artificial intelligence: a survey, https://link.springer.com/article/10.1007/s00521-024-09960-z, 2024
- Trinh et al., Solving olympiad geometry without human demonstrations, https://www.nature.com/articles/s41586-023-06747-5, 2024
- Benchmarking Generative AI
- Srivastava et al., Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models, https://arxiv.org/abs/2206.04615, 2023
- Liang et al., Holistic Evaluation of Language Models, https://arxiv.org/pdf/2211.09110.pdf, 2022
- Borji, Pros and Cons of GAN Evaluation Measures: New Developments, https://arxiv.org/abs/2103.09396, 2021
- MLCommons, https://mlcommons.org/, 2023
- Dehghani et al., The Benchmark Lottery, https://arxiv.org/abs/2107.07002, 2021
Topic Area 3: AI Inference, Agents and Scaling
- LLM Inference and Routers (a routing sketch follows the references)
- Ong et al., RouteLLM: Learning to Route LLMs with Preference Data, https://arxiv.org/abs/2406.18665, 2024
- Hu et al., RouterBench: A Benchmark for Multi-LLM Routing System, https://arxiv.org/pdf/2403.12031, 2024
- Kwon et al., Efficient Memory Management for Large Language Model Serving with PagedAttention, https://arxiv.org/abs/2309.06180, 2023
- vLLM, https://github.com/vllm-project/vllm, 2023
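
The routing idea in RouteLLM, reduced to a skeleton: score each query and send easy ones to a cheap model, hard ones to a strong one, trading cost against answer quality. Everything below is a hypothetical stand-in; RouteLLM learns the scorer from preference data, while this sketch uses query length as a crude proxy, and `cheap_model`/`strong_model` are stubs:

```python
def difficulty_score(query: str) -> float:
    """Stand-in scorer (query length). RouteLLM would learn this."""
    return min(len(query.split()) / 50.0, 1.0)

def route(query: str, threshold: float = 0.3) -> str:
    """Dispatch to the cheap model unless the query looks hard."""
    if difficulty_score(query) < threshold:
        return cheap_model(query)
    return strong_model(query)

# Stubs standing in for real LLM endpoints.
cheap_model = lambda q: f"[small model] {q[:25]}..."
strong_model = lambda q: f"[large model] {q[:25]}..."

print(route("What is 2 + 2?"))                             # -> small model
print(route(" ".join(["clause"] * 40) + " prove this."))   # -> large model
```
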
- Vector Databases (a Faiss example follows the references)
- Johnson et al., Billion-scale similarity search with GPUs, https://arxiv.org/abs/1702.08734, 2017
- Facebook AI Similarity Search (Faiss), https://faiss.ai, 2023
- Guo et al., Manu: A Cloud Native Vector Database Management System, https://arxiv.org/pdf/2206.13843.pdf, 2022
- Wang et al., Milvus: A Purpose-Built Vector Data Management System, https://arxiv.org/pdf/2107.10021.pdf, 2021
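
A minimal Faiss example of the exact nearest-neighbor search that Johnson et al. scale to billions of vectors on GPUs (requires the faiss-cpu package; the vectors are random placeholders):

```python
import numpy as np
import faiss  # pip install faiss-cpu

d, nb, nq = 64, 10_000, 5
rng = np.random.default_rng(0)
xb = rng.random((nb, d)).astype("float32")   # database vectors
xq = rng.random((nq, d)).astype("float32")   # query vectors

index = faiss.IndexFlatL2(d)    # exact L2 search; IVF/HNSW variants trade
index.add(xb)                   # a little recall for much more speed
D, I = index.search(xq, 4)      # distances and ids of the 4 nearest neighbors
print(I)
```
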
- Retrieval Augmented Generation
- AI Agents (a ReAct-loop sketch follows the references)
- Wu et al., AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation, https://arxiv.org/abs/2308.08155, 2023
- CrewAI, https://github.com/crewAIInc/crewAI, 2024
- Langgraph, https://www.langchain.com/langgraph, 2024
- Han et al., LLM Multi-Agent Systems: Challenges and Open Problems, https://arxiv.org/pdf/2402.03578, 2024
- Yao et al., ReAct: Synergizing Reasoning and Acting in Language Models, https://arxiv.org/abs/2210.03629, 2022
- Yang et al., SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering, https://arxiv.org/abs/2405.15793, 2024
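
ReAct (Yao et al.) interleaves reasoning with tool use: the model emits an action, the environment returns an observation, and the loop repeats until the model produces a final answer. A minimal sketch; the action format, the scripted stand-in for the LLM, and the toy tool table are all assumptions for illustration:

```python
def react_loop(llm, question, tools, max_steps=5):
    """llm(transcript) returns an action like "calc[2+2]" or "Finish[4]"."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        if step.startswith("Finish["):
            return step[len("Finish["):-1]        # final answer
        name, _, arg = step.partition("[")
        obs = tools[name.strip()](arg.rstrip("]"))
        transcript += f"Observation: {obs}\n"     # fed back on the next call
    return None

tools = {"calc": lambda expr: eval(expr)}     # toy tool; never eval untrusted input
scripted = iter(["calc[2+2]", "Finish[4]"])   # stands in for a real model
print(react_loop(lambda _: next(scripted), "What is 2+2?", tools))  # -> 4
```
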
- Scaling Machine Learning (a compute-optimal calculator follows the references)
- Hoffmann et al., Training Compute-Optimal Large Language Models (Chinchilla), DeepMind, https://arxiv.org/pdf/2203.15556.pdf, 2022
- Kaplan et al., Scaling Laws for Neural Language Models, https://arxiv.org/pdf/2001.08361.pdf, 2020
- Dean et al., Large Scale Distributed Deep Networks, 2012
- Krizhevsky, One weird trick for parallelizing convolutional neural networks, https://arxiv.org/abs/1404.5997, 2014
- Li et al., Scaling Distributed Machine Learning with the Parameter Server, OSDI, 2014
- Zhao et al., PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel, https://arxiv.org/pdf/2304.11277.pdf, 2023
- Narayanan et al., Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM, https://arxiv.org/abs/2104.04473, 2021
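
The Chinchilla result from Hoffmann et al., as a rule of thumb: training compute is roughly C ≈ 6·N·D FLOPs for N parameters and D tokens, and compute-optimal training keeps roughly D ≈ 20 tokens per parameter, so parameters and data should be scaled together. A small calculator under those approximations:

```python
def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Compute-optimal size under C ~= 6*N*D and D ~= 20*N (Hoffmann et al.).
    Solving C = 6 * N * (20 * N) gives N = sqrt(C / 120)."""
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    return n_params, tokens_per_param * n_params

# Example: Chinchilla's own budget of roughly 5.76e23 FLOPs.
n, d = chinchilla_optimal(5.76e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")  # ~7e10 params, ~1.4e12 tokens
```
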
- AI Hardware
- Hooker, The Hardware Lottery, https://arxiv.org/abs/2009.06489, 2020
- NVIDIA H100 Tensor Core GPU Architecture Overview, https://resources.nvidia.com/en-us-tensor-core/gtc22-whitepaper-hopper, 2022
- Graphcore, https://www.graphcore.ai/
- Jouppi et al., TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings, https://arxiv.org/abs/2304.01433, 2023
- Google Cloud TPU v5e, https://cloud.google.com/blog/products/compute/announcing-cloud-tpu-v5e-and-a3-gpus-in-ga, 2023
- Cerebras, https://www.cerebras.net/
Topic Area 4: Quantum Computing
- Quantum Machine Learning (a PennyLane example follows the references)
- Abbas et al., The power of quantum neural networks, https://arxiv.org/pdf/2011.00027.pdf, 2020
- Cerezo et al., Variational quantum algorithms, https://www.nature.com/articles/s42254-021-00348-9, 2021
- Biamonte et al., Quantum machine learning, https://www.nature.com/articles/nature23474, 2017
- Schuld et al., An introduction to quantum machine learning, https://arxiv.org/pdf/1409.3097.pdf, 2014
- Hubregtsen et al., Evaluation of Parameterized Quantum Circuits: on the relation between classification accuracy, expressibility and entangling capability, 2020
- PennyLane, https://pennylane.ai/
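
A minimal PennyLane example of the variational pattern in Cerezo et al.: a parameterized circuit whose measured expectation value is driven down by a classical optimizer (requires the pennylane package; the two-qubit ansatz and the ZZ observable are arbitrary toy choices):

```python
import pennylane as qml
from pennylane import numpy as np  # autograd-aware NumPy

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(params):
    """Tiny variational ansatz: single-qubit rotations plus an entangler."""
    qml.RY(params[0], wires=0)
    qml.RY(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

params = np.array([0.1, 0.2], requires_grad=True)
opt = qml.GradientDescentOptimizer(stepsize=0.4)
for _ in range(50):
    params = opt.step(circuit, params)  # classical loop around quantum gradients
print(circuit(params))                  # approaches -1, the ZZ minimum
```
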
- Quantum Chemistry
Topic Area 5: ChatGPT
- ChatGPT meets WolframAlpha: Combining LLMs with real-time computation tools (a tool-dispatch sketch follows the references)
- Helfrich-Schkarbanenko, Wolfram-Plugin, in: Mathematik und ChatGPT, Springer Spektrum, https://link.springer.com/chapter/10.1007/978-3-662-68209-8_16, 2023
- Spannagel, Hat ChatGPT eine Zukunft in der Mathematik? (Does ChatGPT have a future in mathematics?), Mitteilungen der Deutschen Mathematiker-Vereinigung 31(3), pp. 168-172, https://doi.org/10.1515/dmvm-2023-0055, 2023
- Wolfram, ChatGPT Gets Its 'Wolfram Superpowers'!, Stephen Wolfram Writings, https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/, 2023
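
The plugin pattern these pieces describe, reduced to a sketch: the LLM decides when a question needs exact computation and delegates it to an external engine, then phrases the verified result. `llm` and `wolfram_query` are hypothetical callables; the real Wolfram plugin exchanges structured requests rather than plain strings.

```python
def answer(question, llm, wolfram_query):
    """Hypothetical dispatch between free-form generation and exact computation."""
    decision = llm(f"Does this need exact computation? yes/no: {question}")
    if decision.strip().lower().startswith("yes"):
        result = wolfram_query(question)   # real-time symbolic/numeric engine
        return llm(f"Explain this verified result: {result}")
    return llm(question)

# Stubs standing in for a real model and the Wolfram API.
fake_llm = lambda p: "yes" if "exact computation?" in p else f"[llm] {p}"
fake_wolfram = lambda q: "integral = pi/2"
print(answer("Integrate sin(x)/x from 0 to infinity", fake_llm, fake_wolfram))
```
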