Advanced Analytics and Machine Learning (AAML)

Block Lecture in the Summer Semester 2025

Prof. Dr. Dieter Kranzlmüller
Priv.-Doz. Dr. Andre Luckow
Dr. Karl Fuerlinger

News

  • 15.02.2025: The lecture will take place between March 1 and April 12, 2025. Enrollment is handled via Moodle; any remaining seats will be allocated at the first session on Saturday, March 1, 2025, starting at 9 c.t.

Content

The rapid digitalization of science, industry, and society has led to an unprecedented increase in data generation. This growth calls for scalable and efficient data processing, storage, and analytics solutions, and it also enables AI and machine learning. The exponential growth of machine-generated data, such as sensor readings, server logs, and transactional records, poses significant computational challenges. The proliferation of the Internet of Things (IoT) continues to accelerate this data explosion, reinforcing the need for advanced methodologies in high-performance and distributed computing.

This course explores the foundations and practical implementations of large-scale data processing, distributed machine learning, and scalable AI solutions. We will investigate computational frameworks designed for high-throughput data analytics, deep learning, and generative AI, with a focus on modern transformer-based architectures and foundation models such as LLaMA and DeepSeek. The curriculum also covers emerging trends in quantum machine learning and AI ethics, so that students understand both the technological advancements and their broader implications. Students will work with state-of-the-art distributed computing frameworks (Spark, Dask, Ray) and deep learning libraries (PyTorch, Transformers) to develop scalable AI solutions with applications spanning computer vision, natural language processing (NLP), and generative AI.

This class will cover the following topics:
  • Data applications in industry and sciences
  • Data-intensive methods in high performance computing
  • Large-scale data processing using Spark, Dask, Flink and Ray
  • SQL for unstructured data: Hive, Spark-SQL, Presto
  • Stream processing: Kafka, Spark Streaming, Flink
  • Data science and machine learning: unsupervised and supervised methods, tools (numpy, pandas, scikit-learn)
  • Deep learning for computer vision: convolutional neural networks (PyTorch)
  • Natural language processing: word embeddings, language models (RNNs, LSTMs, Transformers), including recent developments (reasoning models such as DeepSeek R1)
  • Quantum machine learning
  • AI ethics and responsible AI
The course will be offered as a block lecture. The lecture will be held in English.

Audience

The lecture is aimed at master's and bachelor's degree students in the computer science and data science programs.

Exercises

The exercises and the accompanying code are available at: https://github.com/scalable-infrastructure/exercise-2025

Scope and Exam

The class comprises 14 modules and 10 exercises (6 ECTS).

The final grade is based on a project and an oral examination. To be admitted to the examination, all exercises must be submitted and passed. To pass the course, a grade of 4.0 or better is required.

Pre-Requisites

Prior attendance of the lectures on computer networks and distributed systems, operating systems, and computer architecture, or comparable knowledge, is required. Programming experience in Python and familiarity with the Linux command line are also required.

Time and Location

Time / Dates: Saturdays, March 1 to April 12, 2025.

Tentative Schedule:


Location:

  • LMU, Oettingenstr. 67, Room 161 (to be confirmed). Exam in room: EU102.

Enrollment: Places are allocated via the Moodle application.
In your application, please describe your prior knowledge and explain your motivation for participating.

Downloads

Contact

For questions or inquiries please contact Andre Luckow.