SYLLABUS
COURSE TITLE: STATISTICAL LEARNING
DEPARTMENT: Faculty of Mathematics and Natural Sciences / INSTITUTE OF COMPUTER SCIENCE
COURSE CODE
DEGREE PROGRAMME: COMPUTER SCIENCE
FACULTY
COURSE FORMAT
INSTRUCTOR(S): WOJCIECH RZĄSA
YEAR AND SEMESTER: YEAR II, SEMESTER I
COURSE COORDINATOR: WOJCIECH RZĄSA, PHD
QUALIFICATION LEVEL: 2
STUDY MODE: Full-time studies
COURSE OBJECTIVES
The student should acquire basic knowledge of the most fundamental and useful notions, concepts, and methods of statistical learning, and the ability to use a computer program to explore sample data.
PREREQUISITES Operations on matrices, computing eigenvalues and eigenvectors, knowledge of Bayes’ theorem, the normal distribution, and metrics
LEARNING OUTCOMES
KNOWLEDGE:
A student:
- knows what machine learning is,
- understands the aim of the following three aspects of ML: preprocessing, clustering, and classification,
- knows some statistical methods and algorithms of data mining.
SKILLS:
A student can use a computer program to:
- prepare real-life data for data mining,
- group data into clusters,
- induce classification and decision trees and classify new cases.
FINAL COURSE OUTPUT - SOCIAL COMPETENCES
COURSE ORGANISATION – LEARNING FORMAT AND NUMBER OF HOURS
Lecture: 15 hours
Laboratory: 30 hours
COURSE DESCRIPTION
The purpose of the course is to present the most fundamental concepts and methods of statistical learning, with emphasis on their application to real-life data.
Course topics:
1. Introduction to statistical learning – motivation, notion of machine learning, typical areas of machine learning application.
2. Preprocessing
   a. PCA for feature extraction and feature selection
   b. Fisher’s linear discriminant method
   c. Handling missing values
3. Clustering
   a. Agglomerative approach
   b. K-means algorithm
4. Classification
   a. Bayesian methods
   b. K-nearest neighbor algorithm
   c. Classification and decision trees
METHODS OF INSTRUCTION Lecture, classes, laboratory
REQUIREMENTS AND ASSESSMENTS Lecture: written essay
Laboratory: Questions during every class, exploration of sample data.
GRADING SYSTEM
TOTAL STUDENT WORKLOAD NEEDED TO ACHIEVE EXPECTED LEARNING OUTCOMES EXPRESSED IN TIME AND ECTS CREDIT POINTS: 100 hours, 4 ECTS
LANGUAGE OF INSTRUCTION: ENGLISH
INTERNSHIP: -
MATERIALS
PRIMARY OR REQUIRED BOOKS/READINGS:
K.J. Cios, W. Pedrycz, R.W. Swiniarski, L.A. Kurgan, Data Mining: A Knowledge Discovery Approach, Springer, 2007
UCI ML Repository, www.ics.uci.edu/~mlearn/
SUPPLEMENTAL OR OPTIONAL BOOKS/READINGS:
COURSE COORDINATOR’S SIGNATURE
DEPARTMENT HEAD’S SIGNATURE