Syllabus

Title
5377 Exploiting Data: The Machine Learning Approach
Instructors
Stefan Vamosi, MSc.
Contact details
Type
PI
Weekly hours
2
Language of instruction
English
Registration
02/16/22 to 03/10/22
Registration via LPIS
Notes to the course
Dates
Day Date Time Room
Tuesday 03/15/22 04:00 PM - 07:00 PM TC.5.04
Tuesday 03/22/22 04:00 PM - 07:00 PM TC.5.04
Tuesday 03/29/22 04:00 PM - 07:00 PM TC.5.04
Tuesday 04/05/22 04:00 PM - 07:00 PM TC.3.07
Tuesday 05/03/22 04:00 PM - 07:00 PM TC.5.04
Tuesday 05/10/22 04:00 PM - 07:00 PM TC.5.04
Tuesday 05/17/22 04:00 PM - 07:00 PM TC.5.04
Tuesday 05/31/22 04:00 PM - 06:00 PM D5.1.002
Contents

In customer and market analysis, the increasing amount of available data creates an immense opportunity for machine learning. This course introduces a small starter set of simple yet powerful machine learning methods, including their theoretical background, with practical hands-on exercises. The R programming language is used for the exercises and homework. Prior programming experience is not required, but it is recommended for passing this course (see details below).

Keywords: supervised and unsupervised learning, model performance evaluation, k-nearest neighbors (kNN), decision trees, random forest, k-means clustering, neural networks, support vector machines (SVM)

Learning outcomes

Our goal is to give you the ability to solve machine learning challenges on your own, and hopefully to inspire your future work in this exciting, dynamic field. Additionally, we aim to give insight into some of the many modern applications of machine learning that are increasingly entering our private and professional lives.

  • Understand the theory of machine learning and the concept of generalization.
  • Get familiar with machine learning in R.
  • Learn to choose the right method for a given task and evaluate results.
  • Carry out your own data science project.
Attendance requirements

The course will be held in person only, as regular classroom lectures (no hybrid mode).

You need to attend at least 80% of all classes to pass the course, which effectively means that you can miss no more than one session.
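The "one session" figure follows from the eight dates listed in the schedule. A minimal sketch of the arithmetic, assuming every listed date counts equally toward attendance (the syllabus does not say whether the shorter final session is weighted differently); the variable names are illustrative only:

```python
SESSIONS = 8            # number of dates listed under "Dates"
REQUIRED_PERCENT = 80   # attendance requirement stated in the syllabus

# Smallest whole number of sessions that reaches the required share
# (integer ceiling of SESSIONS * REQUIRED_PERCENT / 100).
min_attended = (SESSIONS * REQUIRED_PERCENT + 99) // 100
max_missed = SESSIONS - min_attended

print(min_attended, max_missed)
```

Attending six of eight sessions would give only 75%, which is why seven is the minimum under this assumption.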

However, we reserve the right to switch the course to a remote-only format (MS Teams or Zoom), depending on the Covid-19 situation. The course dates and times would remain as scheduled, and the homework assignments and student presentations would still take place.

Teaching/learning method(s)

The first two lectures are dedicated to the theoretical and practical background of machine learning, including an introduction to machine learning methodologies. In-class programming exercises ensure a hands-on approach and form the basis for the Study Project.

The rest of the course is built around the Study Project. It is a machine learning project that each student has to complete individually. The project includes solving a machine learning task on a real-world data set in R, and benchmarking and evaluating the results with appropriate metrics and alternative models.

The project and its results have to be presented in one of the lectures. The R code developed in the course of the project has to be submitted and is also part of the grading. Students are very welcome to propose their own ideas and data sets for the Study Project, but the lecturer will also suggest data sets and tasks if needed.

Assessment

The final grade is calculated as follows:

  • Class participation: [weight: 10%]
  • Study Project presentation: [weight: 50%]
  • Study Project R-Code: [weight: 40%]

To pass this course: 

  • your weighted final grade needs to be at least 60%
  • you need to present the Study Project
  • you need to submit a running R script of the Study Project

Grading-scheme:

< 60%               fail (5)
60% to 69.99%       sufficient (4)
70% to 79.99%       satisfactory (3)
80% to 89.99%       good (2)
>= 90%              excellent (1)
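
The weights and the grading scheme above can be combined as in the following sketch. It assumes each component is scored on a 0 to 100 scale, which the syllabus does not state; the function names and example scores are made up for illustration. The two additional pass requirements (presenting the project and submitting a running R script) are not modeled.

```python
# Component weights in percent, taken from the Assessment list.
WEIGHTS = {"participation": 10, "presentation": 50, "r_code": 40}

# Lower bounds of the grading scheme, highest first.
GRADE_BANDS = [
    (90, "excellent (1)"),
    (80, "good (2)"),
    (70, "satisfactory (3)"),
    (60, "sufficient (4)"),
]


def weighted_score(scores):
    """Combine component scores (assumed 0-100 each) into one percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def grade_for(percent):
    """Map a weighted percentage to the grade label of the scheme."""
    for lower_bound, label in GRADE_BANDS:
        if percent >= lower_bound:
            return label
    return "fail (5)"


# Hypothetical student: full participation, 80 on the presentation, 70 on the code.
example = {"participation": 100, "presentation": 80, "r_code": 70}
total = weighted_score(example)
print(total, grade_for(total))
```

Because the presentation and the R code together carry 90% of the weight, class participation alone cannot lift a weak project over the 60% threshold.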
Prerequisites for participation and waiting lists

The class does not take place in a computer room.

Please bring your own laptop with a working RStudio installation. 

The software is free and runs on Windows, macOS, and Linux.

Download RStudio Desktop from: https://rstudio.com/products/rstudio/ 

Knowledge of R is recommended.

Readings
1 Author: Brett Lantz
Title: Machine Learning with R
Publisher: PACKT
Edition: 2nd
Recommendation: Reference literature
Recommended previous knowledge and skills

Prior programming experience or experience with R is recommended, but it can also be acquired during the course and the Study Project. For those who are not familiar with R, there are many online resources that teach the basic skills required for this course, and we strongly encourage you to use them. The following is just a small sample of the vast resources available for free:

https://www.udemy.com/course/r-basics/

https://www.coursera.org/learn/r-programming

https://www.edx.org/course/data-science-r-basics

https://www.datacamp.com/courses/free-introduction-to-r

Availability of lecturer(s)

Consultations by prior arrangement via email.

Last edited: 2021-12-21


