Advanced Analytics and Machine Learning (AAML)
Block Lecture in the Summer Semester 2024
Prof. Dr. Dieter Kranzlmüller
Priv.-Doz. Dr. Andre Luckow
Dr. Karl Fuerlinger
News
- 29.04.2024: Due to an unforeseen personal obligation (a funeral), the oral exams have been rescheduled to June 21 between 2 and 4 pm in room EU102. See exam schedule.
- 13.04.2024: The oral exams will take place on May 8 between 2 and 4 pm. See exam schedule.
- 26.02.2024: The dates for the lecture will be March 2, 9, 16, 22 and April 13, 2024. Please make sure you are available on these dates. Attendance on all days is mandatory.
- 24.02.2024: Moodle registration is closed. To claim your final place in the course, you must be present in person on March 2, 2024, 9 am c.t. No-show spots will be backfilled, so you will lose your place if you are absent. Please also attend if you are interested in a standby spot.
- 16.02.2024: The first lecture will take place on March 2, 9 am c.t. in Oettingenstr. Attendance is mandatory. Please register via email: contact us.
- 06.01.2024: Dates: The lecture will take place between March 2 and April 17, 2024. Moodle page: https://moodle.lmu.de/course/view.php?id=32199
Content
The ongoing data deluge, driven by the increasing digitalization of science, society and industry, leads to a significant increase in demand for data storage, processing and analytics across many industrial domains. Science and industry are overwhelmed by the need to store large amounts of transactional and machine-generated data resulting from customer, service and manufacturing processes. Examples of machine-generated data are server logs and sensor data, which are generated at ever finer granularities and higher frequencies. Further, datasets are often enriched with web and open data from social media, blogs or other open data sources. The Internet of Things (IoT) will further blur the boundaries between the physical and the digital world, increasing the world's digital footprint even more. In this course, we will learn about data applications and their requirements.
In this lecture, we will learn about methods and technologies for handling large data volumes, analytics and machine learning. As part of the exercises, students will use different frameworks, e.g., MapReduce, Spark and TensorFlow/Keras, to implement different algorithms.
This class will cover the following topics:
- Data applications in industry and sciences
- Data-intensive methods in high performance computing
- Large-scale data processing using Spark, Dask, Flink and Ray
- SQL for unstructured data: Hive, Spark-SQL, Presto
- Stream processing: Kafka, Spark Streaming, Flink
- Data science and machine learning: unsupervised and supervised methods, tools (numpy, pandas, scikit-learn)
- Deep learning: convolutional neural networks (TensorFlow, Keras)
- Natural language processing: word embeddings, large language models (RNNs, LSTMs, Transformers)
- Quantum machine learning
- AI Ethics
The course will be offered as a block lecture. The lecture will be held in English.
Audience
The lecture is aimed at master's and bachelor's degree students in the computer science and data science programs.
Exercises
Exercises and the accompanying code are available at: https://github.com/scalable-infrastructure/exercise-2024
Scope and Exam
The class comprises 14 modules and 10 exercises (6 ECTS).
The final grade of the class is determined based on a project work and an oral examination. To be admitted to the exam, all exercises must be submitted and passed. To pass the course, a grade of 4 or better must be achieved.
Pre-Requisites
Attendance of the lectures on computer networks and distributed systems, operating systems, computer architecture, or comparable knowledge is required. Programming knowledge in Python and familiarity with the Linux command line are also required.
Time and Location
Time / Dates: Saturdays, March 2 to April 13, 2024.
Tentative Schedule:
Location:
- LMU, Oettingenstr. 67, Room 161. Exam in room: EU102.
Enrollment: The places will be allocated via Moodle: Moodle-Application.
We ask you to describe your previous knowledge in your application and to motivate your participation.
Downloads
Introduction, HPC, Hadoop and Spark
SQL, Data Science, Machine Learning, Deep Learning, Computer Vision
NLP, Transformer, Scalable Machine Learning, Performance
MLOps, Streaming, Quantum Computing, Ethics
Exam Schedule
Contact
For questions or inquiries please contact Andre Luckow.