Facultatea de Științe Economice și Gestiunea Afacerilor

Str. Teodor Mihali nr. 58-60
Cluj-Napoca, RO-400951
Tel.: 0264-41.86.52-5
Fax: 0264-41.25.70
econ@econ.ubbcluj.ro
www.econ.ubbcluj.ro
DETAILED SYLLABUS
Methods in Data Science
1. Information about the study program
1.1 University: Babeș-Bolyai
1.2 Faculty: Economic Sciences and Business Administration
1.3 Department: Business Information Systems
1.4 Field of study: Business Information Systems
1.5 Program level (bachelor or master): Master
1.6 Study program / Qualification: Business Modeling and Distributed Computing
2. Information about the subject
2.1 Subject title: Methods in Data Science
2.2 Course activities professor: Lect. Dr. Darie Moldovan
2.3 Seminar activities professor: Lect. Dr. Darie Moldovan
2.4 Year of study: I
2.5 Semester: I
2.6 Type of assessment: Summative
2.7 Subject regime: Mandatory
3. Total estimated time (teaching hours per semester)
3.1 Number of hours per week: 4
out of which: 3.2 course: 2
out of which: 3.3 seminar/laboratory: 2
3.4 Total number of hours in the curriculum: 56
out of which: 3.5 course: 28
out of which: 3.6 seminar/laboratory: 28
Time distribution (hours)
Study based on textbook, course support, references and notes: 38
Additional documentation in the library, through specialized databases and field activities: 24
Preparing seminars/laboratories, essays, portfolios and reports: 45
Tutoring: 8
Assessment (examinations): 4
Other activities: 0
3.7 Total hours for individual study: 119
3.8 Total hours per semester: 175
3.9 Number of credits: 7
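The totals in section 3 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the hour values are copied from the syllabus, while the number of weeks and the hours per credit are not stated anywhere in it and are merely inferred from the totals. All variable names are hypothetical.

```python
# Cross-check of the time budget in section 3 (values copied from the syllabus).
hours_per_week = {"course": 2, "seminar/laboratory": 2}        # items 3.2, 3.3
curriculum_hours = {"course": 28, "seminar/laboratory": 28}    # items 3.5, 3.6

individual_study = {
    "textbook, course support, references and notes": 38,
    "additional documentation": 24,
    "preparing seminars/laboratories, essays, portfolios, reports": 45,
    "tutoring": 8,
    "assessment (examinations)": 4,
    "other activities": 0,
}

weekly_total = sum(hours_per_week.values())             # item 3.1
curriculum_total = sum(curriculum_hours.values())       # item 3.4
individual_total = sum(individual_study.values())       # item 3.7
semester_total = curriculum_total + individual_total    # item 3.8
num_credits = 7                                         # item 3.9

# Assumptions (not stated in the syllabus): both values are inferred from the totals.
weeks = curriculum_total // weekly_total
hours_per_credit = semester_total / num_credits

print(weekly_total, curriculum_total, individual_total, semester_total)
print(weeks, hours_per_credit)
```

Under these assumptions the figures are mutually consistent: 4 hours per week over 14 weeks gives 56 curriculum hours, and 56 + 119 = 175 hours corresponds to 25 hours for each of the 7 credits.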
NOTE: This document represents an informal translation performed by the faculty.
4. Preconditions (if necessary)
4.1 Curriculum: Not necessary
4.2 Skills: Basic programming skills, basic statistics knowledge
5. Conditions (if necessary)
5.1. For course development: Laptop, video projector, Internet connection
5.2. For seminar / laboratory development: Computers with Internet connection
6. Acquired specific competences
Professional competences:
Obtain key competences in data science
- Cleaning and sampling data sets
- Data management
- Exploratory data analysis
- Prediction based on statistical methods
- Communication of results
Transversal competences:
Gain competences in working within a team, dividing tasks, and learning from the different areas connected to the problem being addressed.
7. Subject objectives (arising from the acquired specific competences)
7.1 Subject’s general objective: Students must be familiar with data science methods and work through a data science project end to end.
7.2 Specific objectives: Students have to:
- learn how to analyze a dataset
- be able to access big data
- explore data and generate hypotheses
- use specific methods such as regression and classification for prediction
- communicate the results of their research using visualization tools and summaries
8. Contents
8.1 Course
1. Introduction. Course overview. About Data Science.
   Teaching methods: Lecture, demonstration, open discussion
   Observations: 1 lecture
2. Univariate linear regression. Applications.
   Teaching methods: Lecture, demonstration, open discussion
   Observations: 1 lecture
3. Multivariate linear regression. Applications.
   Teaching methods: Lecture, open discussion
   Observations: 1 lecture
4. Classification methods. Logistic regression. Decision Trees.
   Teaching methods: Lecture, demonstration, open discussion
   Observations: 2 lectures
5. Neural networks.
   Teaching methods: Lecture, open discussion
   Observations: 1 lecture
6. Data Visualization. Effective Information visualization.
   Teaching methods: Lecture, demonstration, open discussion
   Observations: 1 lecture
7. Applying learning algorithms. Data preprocessing.
   Teaching methods: Lecture, open discussion
   Observations: 1 lecture
8. Support Vector Machines.
   Teaching methods: Lecture, open discussion, case studies
   Observations: 1 lecture
9. Clustering.
   Teaching methods: Lecture, open discussion, demonstration
   Observations: 1 lecture
10. Solution deployment.
   Teaching methods: Lecture, open discussion, demonstration
   Observations: 1 lecture
11. Big Data and MapReduce.
   Teaching methods: Lecture, demonstration, open discussion
   Observations: 2 lectures
12. Large-scale data mining.
   Teaching methods: Lecture, demonstration
   Observations: 1 lecture
References:
1. Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed., Morgan Kaufmann, 2011.
2. Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning, Springer, 2009.
3. Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge, 2011.
4. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
5. Richard Duda, Peter Hart and David Stork, Pattern Classification, 2nd ed., John Wiley & Sons, 2001.
6. Drew Conway, John Myles White, Machine Learning for Hackers: Case Studies and Algorithms to Get You Started, O'Reilly Media, 2012.
7. Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
8. S. Haykin, Neural Networks and Learning Machines, 3rd ed., Prentice Hall, 2008.
8.2 Seminar/laboratory
1. Demonstrative example case.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
2. Building a simple linear regression model.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
3. Multivariate linear regression in practice.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
4. Classification methods. Naïve Bayes, Decision trees, Logistic regression.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 2 laboratories
5. Neural networks.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
6. Data Visualization tools.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
7. Feature selection, sampling the datasets and other preprocessing operations.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
8. Support Vector Machines.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
9. Clustering.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
10. Deploying the solution.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 1 laboratory
11. MapReduce tools.
   Teaching methods: Running examples and individual exercises / homework
   Observations: 3 laboratories
References:
1. Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed., Morgan Kaufmann, 2011.
2. Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning, Springer, 2009.
3. Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge, 2011.
4. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley, 2006.
5. Richard Duda, Peter Hart and David Stork, Pattern Classification, 2nd ed., John Wiley & Sons, 2001.
6. Drew Conway, John Myles White, Machine Learning for Hackers: Case Studies and Algorithms to Get You Started, O'Reilly Media, 2012.
7. Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
8. S. Haykin, Neural Networks and Learning Machines, 3rd ed., Prentice Hall, 2008.
9. Corroboration / validation of the subject’s content in relation to the expectations coming from representatives of the epistemic community, of the professional associations and of the representative employers in the program’s field.
- The data scientist profession has recently become very popular owing to the growing volume of data available for analysis. Increasing computational power has opened a new field to statisticians and other specialists who work with data: automated data analysis, which requires interdisciplinary skills in statistics, machine learning and their applications.
10. Assessment (examination)
Type of activity: 10.4 Course
10.1 Assessment criteria: Multiple choice test grid
10.2 Assessment methods: Multiple choice quiz
10.3 Weight in the final grade: 40%
Type of activity: 10.5 Seminar/laboratory
Homework assignments; 10.3 Weight in the final grade: 20%
End of semester project; 10.3 Weight in the final grade: 40%
10.6 Minimum performance standard
• Minimum 50% of points for the course component
• Minimum 50% of points for the seminar component
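The weighting and the minimum performance standard can be sketched in code. This is an illustration under stated assumptions, not the faculty's grading procedure: scores are assumed to be percentages (0 to 100), the weights are assumed to map to the quiz (40%), homework (20%) and project (40%), and the seminar component is assumed to combine homework and project in proportion to their weights. The function name `final_score` is hypothetical.

```python
# Weights in percent, as listed under 10.3 (mapping to activities is an assumption).
WEIGHTS = {"quiz": 40, "homework": 20, "project": 40}


def final_score(quiz, homework, project):
    """Return (weighted score in percent, passed) for scores given in percent."""
    # Course component: the multiple choice quiz alone.
    course_component = quiz

    # Assumption: homework and project are combined by their relative weights.
    seminar_weight = WEIGHTS["homework"] + WEIGHTS["project"]
    seminar_component = (
        WEIGHTS["homework"] * homework + WEIGHTS["project"] * project
    ) / seminar_weight

    total = (
        WEIGHTS["quiz"] * quiz
        + WEIGHTS["homework"] * homework
        + WEIGHTS["project"] * project
    ) / 100

    # 10.6: at least 50% is required on each component separately.
    passed = course_component >= 50 and seminar_component >= 50
    return total, passed


print(final_score(80, 60, 70))  # passes both minimums
print(final_score(40, 90, 90))  # high total, but the course component is below 50%
```

The second call shows why the two minimums matter: a weighted total of 70% does not pass when the course component alone is under 50%.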
Date of filling: 28.01.2015
Signature of the course professor: Lect. Dr. Darie Moldovan
Signature of the seminar professor: Lect. Dr. Darie Moldovan
Date of approval by the department: 28.01.2015
Head of department’s signature: Prof. habil. Dr. Gheorghe Cosmin Silaghi