Syllabus

Title
6231 Data Management and Analytics Group B
Instructors
Dr. Christian Haas, Univ.Prof. Dr. Axel Polleres
Contact details
axel.polleres@wu.ac.at, christian.haas@wu.ac.at. In emails to the instructors, please use the subject line "[Data Management and Analytics]".
Type
PI
Weekly hours
2
Language of instruction
English
Registration
02/13/23 to 02/17/23
Registration via LPIS
Notes to the course
This class is only offered in summer semesters.
Subject(s) Master Programs
Dates
Day Date Time Room
Friday 03/03/23 10:00 AM - 12:00 PM TC.3.05
Friday 03/10/23 10:00 AM - 12:00 PM TC.3.05
Friday 03/17/23 10:00 AM - 12:00 PM TC.3.05
Friday 03/24/23 10:00 AM - 12:00 PM TC.3.05
Friday 04/14/23 10:00 AM - 12:00 PM TC.3.05
Friday 04/21/23 10:00 AM - 12:00 PM TC.3.05
Friday 04/28/23 10:00 AM - 12:00 PM TC.3.05
Friday 05/05/23 10:00 AM - 12:00 PM TC.3.05
Friday 05/12/23 10:00 AM - 12:00 PM TC.3.05
Friday 05/26/23 10:00 AM - 12:00 PM TC.3.05
Friday 06/02/23 10:00 AM - 12:00 PM TC.3.05
Friday 06/16/23 10:00 AM - 12:00 PM TC.3.05
Friday 06/23/23 09:00 AM - 11:00 AM TC.0.02
Contents

This course assumes familiarity with basic data management and storage techniques (such as ER models and SQL) as well as basic coding skills (Python and R); where needed, these will be reviewed in a bridging course. Building on that foundation, the course teaches the essentials of data management and analytics, from the concepts to their application in practical examples (for example, applied to Web data but also to business scenarios).

In part one (A) on Data Management, we will focus on advanced databases, storage and data management techniques, and analytical queries using SQL and Relational Database Management Systems, and we will also discuss Document and Graph Databases. In part two (B) on Data Analytics, we will discuss how to make certain tasks scale with big data (i.e., high-volume, high-velocity, or highly heterogeneous data). We will also look into (Descriptive and Predictive) Data Analytics techniques and Data Visualization topics, and give a brief overview of Natural Language Processing (NLP).

The two parts will be taught in interleaved, parallel sessions, so that students advance their skills in both data management and analytics, driven by common use-case scenarios that serve as motivating examples throughout the course. Specifically, a session from part A will be followed by a session from part B, and vice versa.

Learning outcomes

In this course you shall

  • learn how to structure and model data for analytics
  • understand how to store this data in modern database systems
  • understand how to extract knowledge from a database by formulating complex questions as queries using SQL and other query languages
  • understand how to improve query performance for common queries using indexes
  • learn about data analytics tasks to be performed on data in a database or collected/integrated from different structured and unstructured data sources
  • apply what you have learned conceptually to practical cases with (publicly available) real data, using tools such as R and Python
Attendance requirements

According to the examination regulations, full attendance is intended for a PI. Attendance at 80% of all classes is compulsory.

Teaching/learning method(s)

The topics will be covered in 12 classes. For each class, students individually prepare the concepts beforehand from videos or reading materials; these concepts are then applied in Jupyter notebooks during the lecture.

Assessment

Each class will be accompanied by an interactive Jupyter Notebook that we will work on together in class, including some group tasks. The notebook is to be submitted after each class as documentation of in-class participation, along with up to 3 (4 in the first class) clicker quiz questions per class. In-class participation counts for max. 6% per class; the worst result can be discarded, i.e. max. 60% for in-class participation overall. In addition, there are two individualised assessments: an individual project on part B (20%) after the first half of the semester, and a final take-home exam on part A (20%) at the end of the semester.

 

Grading Scheme:

>= 90% ... Excellent (1)

>= 80% ... Good (2)

>= 70% ... Satisfactory (3)

>= 60% ... Sufficient (4)

 <  60% ... Fail
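
As an illustration only, the arithmetic of the assessment components and the grading scheme can be sketched in Python. This is not an official grade calculator. It assumes that the worst in-class result is always dropped, that the participation total is capped at 60 points, and that no rounding is applied; the function and variable names are hypothetical.

```python
# Illustrative sketch only; names and the exact capping rule are assumptions.

# Grade bands as listed in the grading scheme (lower bound in percent, label).
GRADE_BANDS = [
    (90, "Excellent (1)"),
    (80, "Good (2)"),
    (70, "Satisfactory (3)"),
    (60, "Sufficient (4)"),
]


def participation_points(class_scores, per_class_max=6.0, overall_max=60.0):
    """Sum the per-class points, drop the single worst result, cap the total."""
    if not class_scores:
        return 0.0
    # Clip each class result to the range [0, per_class_max].
    clipped = [min(max(float(s), 0.0), per_class_max) for s in class_scores]
    # Assumption: the worst result is always discarded.
    total = sum(clipped) - min(clipped)
    return min(total, overall_max)


def final_grade(class_scores, exam_a, project_b):
    """Return (total percent, grade label) for the three components."""
    total = (
        participation_points(class_scores)
        + min(max(float(exam_a), 0.0), 20.0)     # take-home exam on part A
        + min(max(float(project_b), 0.0), 20.0)  # individual project on part B
    )
    for threshold, label in GRADE_BANDS:
        if total >= threshold:
            return total, label
    return total, "Fail"


if __name__ == "__main__":
    # 12 classes with 5 points each, 15 points on the exam and on the project.
    print(final_grade([5] * 12, 15, 15))  # (85.0, 'Good (2)')
```

In this example, dropping one of the twelve class results leaves 55 participation points, which together with 30 points from the two individual assessments gives 85% and therefore the grade Good (2).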

Readings

Please log in with your WU account to use all functionalities of read!t. For off-campus access to our licensed electronic resources, remember to activate your VPN connection. In case you encounter any technical problems or have questions regarding read!t, please feel free to contact the library at readinglists@wu.ac.at.

Recommended previous knowledge and skills

We expect that you fully understand and will build upon the contents of the Bridging Course in DigEcon: IT & IS Skills, particularly:

  • Block 2: Basic Programming Skills
  • Block 3: Databases

In more detail, we expect a basic understanding of Data Modeling and Schemas, basic SQL, and relational algebra. We will also use Jupyter and Python as tools in the course, which you should have become familiar with during the Bridging Course.

In particular, we may check the prior knowledge of Block 3 in a "prior knowledge check" quiz in the first unit of the course.

Last edited: 2023-01-10


