DAO2702 Course overview

NATIONAL UNIVERSITY OF SINGAPORE
NUS Business School
Department of Analytics & Operations
DAO2702/DSC2008 Programming for Business Analytics
Session: Semester 2, 2021/2022
Instructor: Peng Xiong bizxio@nus.edu.sg
Tutors: Peng Zheng peng@nus.edu.sg; Bin Huang huang.bin@nus.edu.sg
Description:
This module is an introductory course in business analytics and data science. It covers basic Python
programming and preliminary statistics, with a strong emphasis on practical business problems and
real datasets. Data science is an interdisciplinary field that requires business insight and expertise, proficiency
in programming, and a strong background in mathematics and statistics. Lectures and tutorials
this semester will therefore focus on training in the following areas:
• Python programming and Pythonic coding styles
• Analytical and visualization packages
• Math and statistics
• Practical business insights and problem-solving skills
Syllabus:
1. Basics of Python programming
    1. Data structures and flow control
    2. Functions and packages
2. Data analysis with Python
    1. Analytical tools: NumPy, SciPy, Pandas
    2. Data visualization: Matplotlib
    3. Data collection and cleaning
3. Statistical inference
    1. Sampling and inference
    2. Confidence intervals
    3. Hypothesis testing
4. Linear regression
    1. Model assumptions and interpretations
    2. Categorical variables and modelling nonlinearity
    3. Package Statsmodels for regression analysis
Software:
Anaconda: installation
E-learning Mode:
• Lectures will be delivered as pre-recorded videos.
• The scheduled lecture time is used for discussing exercise questions and for online consultation. Each session will be
recorded and uploaded to LumiNUS multimedia channels.
• Tutorial sessions will be held face-to-face and broadcast via Zoom.
• Limited face-to-face meetings can be arranged.
Class Materials:
• Offline lecture videos uploaded to LumiNUS multimedia channels
• Jupyter Notebook files as lecture notes
• Jupyter Notebook files as tutorial case studies
• Jupyter Notebook files as exercises
• Slides as supplementary materials
• The folder "Advanced topics" provides supplementary reading materials. They are not tested but may be
helpful for your project.
Materials for each week can be located via LumiNUS->Module Overview.
Reference Books:
Python programming:
• Python data science handbook, by Jake VanderPlas
Statistics and data analysis:
• Introductory econometrics, by Jeffrey M. Wooldridge
• Storytelling with data, by Cole Nussbaumer Knaflic
Online Resources:
• Python tutor: https://pythontutor.com/
• Programming for Business Analytics: https://share.streamlit.io/xiongpengnus/learn_dao/main/web.py or
https://nus-biz-dao.herokuapp.com/
• Storytelling with data: https://www.storytellingwithdata.com/blog
Workload:
• Students are required to review the lectures covered each week.
• Students are expected to work on the exercises for each week. Solutions will be uploaded the following
week, and there is no need to submit the exercises.
• All tutorial case studies are related to topics covered in previous lectures.
Week t − 1: Upload notes, tutorial, and exercise for Lecture t
Week t: Discuss exercise solution for Lecture t
Week t + 1: Discuss tutorial solution for Lecture t + 1
Assessments:
Continuous Assessment:
Class Participation: 10%
• Answering questions posted by the instructor on the forum.
• Posting questions and answering questions posted by other students will also be considered.
• Class participation marks will be given according to 1) the quality of the posts; 2) the time of the posts.
• A question may be closed once it has been answered by many students.
Group Project: 20% for report and 15% for presentation
• Teamwork. Each team has five to six members.
• You may choose your own teammates (in the same tutorial session) by filling in a survey. All team
members need to fill in the survey. If you fail to do so before the deadline, we will randomly
assign you to a team. A randomly assigned team may have fewer members.
• If you ask to change to another team after the deadline without a valid reason, there will
be a 20% penalty on your report.
• An eight-page (4 sheets of double-sided paper) report and a formal 10 to 15-minute presentation.
• In the video, please display the name of the current presenter; otherwise there will be a 20%
penalty on your presentation.
• All files should be compressed into a single zip file named Project_<tutorial session name>_<team name>.
Please make sure that you follow the naming convention; otherwise there will be a 20% penalty.
• Your project grade will also be affected by the peer evaluation from your teammates. Zero marks for
zero contribution!
• Zero marks for plagiarism.
• More details can be found in Appendix B.
Final Examination: 55%
• Paper-based closed-book examination.
• A number of multiple-choice questions and a written question.
• You are allowed to bring a double-sided A4 cheat sheet and a calculator. Other notes and
electronic devices are prohibited.
• All topics covered in lectures/exercises/tutorials could be tested.
Schedule:
Week 1. Course Overview; Introduction to Programming and Business Analytics
Week 2. Python Basics I
Week 3. Python Basics II
Week 4. Built-in compound data types
Week 5. Functions, modules, and packages
Week 6. Data arrays and data visualization
Week 7. Basics of Pandas
Week 8. Facts from data
Week 9. Confidence intervals and hypothesis testing
Week 10. Introduction to regression analysis
Week 11. Regression analysis for explanatory modeling
Week 12. Nonlinearity and categorical variables
Week 13. Review
FAQ:
Lecture and tutorial registrations
What if I want to change lecture sessions or have difficulties in securing a tutorial slot?
Please contact the BBA office or ModReg as soon as possible. The instructor and tutors do not have
access to the module registration system, so emailing them will not help.
Class Participation
How are class participation marks calculated?
The majority of class participation marks come from posting answers to the "Questions given by
instructors" on the forum. A minor part comes from the "DAO2702 StackOverflow" forum, where
students can post their own questions and answers.
The questions given by instructors are typically open-ended, so every student should have a
chance to contribute to the forum discussion.
Marks are given according to the ranking of your contributions on the forum. We mainly
consider 1) the quality of your posts; and 2) the time of your posts (e.g. a post made after two months is not
considered as valuable as one made within the first week).
Will tutorial attendance count toward class participation? What if I am unable to attend
one of my tutorial sessions?
You are encouraged to attend the F2F tutorial sessions, but attendance is not mandatory and will not
affect your class participation grade. The tutor may ask you to mark your attendance, but this is only for
Covid-19 safety measures.
All tutorial sessions provide Zoom links for students to attend remotely, so if you have difficulty
attending one of your F2F sessions, you can always attend any other session via Zoom at your
convenience. There is no need to inform your tutor.
Project
Are the cover page, table of contents, references, and appendices counted in the 8-page limit of the
report?
The cover page, table of contents, references, and appendices are not counted in the 8-page limit. However, do not
place your key findings or important diagrams in the appendices.
Is it allowed to place data visualizations in appendices?
Please make sure that the graders can read your report without referring to the appendices.
Are we supposed to include code in the project report or presentation?
Code is submitted in a separate file. Please avoid showing code in your project report or presentation.
Are there any requirements for the font, line spacing, and margins of the report?
There are no specific requirements for the font, spacing, or margins of the report. You may use any page layout
as long as the report is clear and neat to read (at A4 size).
What if my presentation recording file is too large to be uploaded to LumiNUS?
You could upload the recordings to a cloud storage drive and enclose the link of the drive in the submitted
zip file.
Is the peer evaluation survey anonymous? Is it compulsory? What shall I do if one of my teammates
is not contributing to the project?
The peer evaluation survey is anonymous: only the instructor and tutors can see your responses.
The survey is entirely optional, and there is no need for the whole team to fill it in.
You can ignore it if you have nothing to report.
If your teammates are not contributing, it is better to communicate with them first and resolve the issue
internally. If the issue cannot be settled internally and they still refuse to contribute, please keep all
evidence and file a report in the peer evaluation survey. Your report will be taken into consideration in
grading the project.
Final Exam
What are the topics to be tested in the final exam?
All topics covered in lectures/tutorials/exercises could be tested. These topics include 1) basics of Python
programming (Lecture 2 to Lecture 6); 2) data arrays, frames, and visualization (Lecture 7 to Lecture 9);
and 3) statistics and regression analysis (Lecture 10 to Lecture 13). Only the advanced topics will not be
tested.
Where can I find the time and venue for the final exam?
Please visit the EduRec: https://myedurec.nus.edu.sg/psp/cs90prd/?cmd=login
What is needed for the paper-based exam?
1) The double-sided A4 cheat sheet
2) A scientific calculator (a graphing/programmable calculator is allowed, but it will not give you any special
advantage in the exam)
3) A pencil for filling in the bubble card. All MCQs will be graded based on the submitted bubble cards.
Except for the calculator, no electronic devices, including laptops, are allowed in the paper-based exam.
Appendix A: Important dates
Event | Date | Consequence of missing the date
Deadline of the general survey | 14/01/2022 at 11:59 PM | None
Deadline of the project group survey | 18/02/2022 at 11:59 PM | You will be randomly assigned to a team.
Deadline of submitting the project | 15/04/2022 at 11:59 PM | 20% penalty on late submission.
The forum of instructors' questions expires | 15/04/2022 at 11:59 PM | No posts can be made on the forum after this date.
The forum of students' questions expires | 22/04/2022 at 11:59 PM | No posts can be made on the forum after this date.
Deadline of the peer evaluation survey | 03/05/2022 at 11:59 PM | Peer evaluation cannot be submitted after this date.
Appendix B: Project Guidelines
Submission
Each team must submit a zip file containing:
• A formal eight-page (4 sheets of double-sided paper) report saved as a PDF file (no code should appear in the formal
report).
• Your source code saved as HTML files.
• A video presentation (about 10 to 15 minutes) in which every team member participates. It can
be recorded with an ordinary smartphone, as long as it shows the presenters and the slides clearly.
The zip file name must follow the format: Project_<tutorial session name>_<team name>. Please make sure
that you follow the naming convention; otherwise there will be a 20% penalty.
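As a minimal sketch, a team could assemble the zip file with Python's standard zipfile module. This is only an illustration, not an official course tool: the helper names project_zip_name and build_submission, the session name "T01", the team name "TeamAlpha", and the file names are all hypothetical placeholders, and appending the ".zip" extension to the required name is an assumption.

```python
import tempfile
import zipfile
from pathlib import Path


def project_zip_name(tutorial_session, team_name):
    """Build the archive name Project_<tutorial session name>_<team name>.zip."""
    return f"Project_{tutorial_session}_{team_name}.zip"


def build_submission(files, tutorial_session, team_name, out_dir):
    """Zip the given files (stored flat, without folders) under the required name."""
    target = Path(out_dir) / project_zip_name(tutorial_session, team_name)
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, arcname=Path(f).name)
    return target


# Demo with placeholder files in a temporary folder (all names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    files = []
    for name in ("report.pdf", "code.html"):
        path = tmp / name
        path.write_text("placeholder")
        files.append(path)
    archive = build_submission(files, "T01", "TeamAlpha", out_dir=tmp)
    archive_name = archive.name
    with zipfile.ZipFile(archive) as zf:
        archived = sorted(zf.namelist())

print(archive_name, archived)
```

Generating the name in code rather than typing it by hand helps avoid naming mistakes that would trigger the penalty.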
Any files submitted after the deadline are considered a late submission. There will be a 20% penalty for a late
submission made within 24 hours after the deadline. A submission more than 24 hours late will be treated as
no submission and will receive zero marks.
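The late-submission rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the official grading procedure: the function name adjusted_mark is hypothetical, the 20% penalty is assumed to be taken off the raw mark multiplicatively, and the deadline is taken from Appendix A at minute precision.

```python
from datetime import datetime, timedelta

DEADLINE = datetime(2022, 4, 15, 23, 59)  # project deadline listed in Appendix A
LATE_WINDOW = timedelta(hours=24)         # late submissions accepted within this window
LATE_PENALTY = 0.20                       # assumed to be a fraction of the raw mark


def adjusted_mark(raw_mark, submitted_at, deadline=DEADLINE):
    """Return the mark after applying the late-submission rule (assumed interpretation)."""
    if submitted_at <= deadline:
        return raw_mark                       # on time: no penalty
    if submitted_at - deadline <= LATE_WINDOW:
        return raw_mark * (1 - LATE_PENALTY)  # late by at most 24 hours
    return 0.0                                # later than 24 hours: treated as no submission


print(adjusted_mark(80, datetime(2022, 4, 15, 12, 0)))  # submitted on time
print(adjusted_mark(80, datetime(2022, 4, 16, 10, 0)))  # late, within 24 hours
print(adjusted_mark(80, datetime(2022, 4, 17, 9, 0)))   # more than 24 hours late
```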
Scope of the Project
The team project requires students to solve ONE practical business problem by exploiting real datasets. Possible
sources of real datasets: 1) Data.world; 2) Kaggle; 3) Data.gov.sg. You can also download datasets from other
websites, or even conduct your own survey to collect data. A few examples are given below:
1) Make property investment decisions based on property price data or Airbnb data;
2) Use a dataset of movies/video games made in recent years to decide what kind of movies/games to make in
the future;
3) Use a dataset of employees in a company to investigate if there is discrimination in the company.
4) Based on the demand data of a perishable product, use Monte Carlo simulation to find the optimal
production quantity.
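As a rough illustration of example 4, the sketch below resamples historical demand to estimate the average profit of each candidate production quantity and picks the best one. It is not taken from the course materials: the demand history, price, and cost are made-up values, the function names simulate_profit and best_quantity are hypothetical, and the model assumes that unsold units have no salvage value and that unmet demand carries no extra penalty.

```python
import random


def simulate_profit(quantity, demands, price, cost):
    """Average profit of one production quantity over the simulated demands."""
    total = 0.0
    for d in demands:
        sold = min(quantity, d)  # unsold units perish (assumed zero salvage value)
        total += price * sold - cost * quantity
    return total / len(demands)


def best_quantity(history, candidates, price, cost, n_sims=2000, seed=0):
    """Pick the candidate quantity with the highest simulated average profit."""
    rng = random.Random(seed)
    # Monte Carlo step: draw demand scenarios by resampling the historical data.
    demands = rng.choices(history, k=n_sims)
    # The same scenarios are reused for every candidate so comparisons are fair.
    return max(candidates, key=lambda q: simulate_profit(q, demands, price, cost))


# Hypothetical daily demand history for a perishable product.
history = [82, 95, 101, 110, 97, 120, 88, 105, 99, 113]
q = best_quantity(history, range(60, 141), price=10.0, cost=4.0)
print("Suggested production quantity:", q)
```

In a project, the hypothetical history list would be replaced by the actual demand column of the dataset, for example loaded with Pandas.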
You are also allowed to work in other directions rather than just analysing real datasets. For example,
you could create a dashboard for data visualization, or develop your own Python package. For such
project topics, please contact the instructor first for permission.
Detailed Guidelines and Grading Criteria for Project Report
The main focus of the project should be on data visualization, but you are allowed to use whatever
programming/analytics techniques you know (not restricted to this module). Please do not write your project
report as an assignment with separate tasks. The parts of your report should flow well from one to the next,
and all of them should contribute to solving ONE problem.
The project report of the whole team would be evaluated following the guidelines below:
Components and suggestions:
Problem statements (25%)
• A clear and concise description of only one business problem.
• Emphasize how the dataset would help solve the problem.
• Justify any assumptions/approximations/missing data.
• Novelty and creativity in the selected problem/dataset are highly valued.
Data visualization (40%)
• Data visualization is the key component of this project.
• Use proper graphs to present the information in the dataset.
• Clear presentation and accurate description of your graphs.
• Do not create irrelevant graphs; good data visualization should support your
analysis well and deliver the information in the data efficiently.
Analysis and discussion of the results (25%)
• Derive relevant information from the data visualization and use it to
solve the business problem.
• You may use statistical models (KNN, linear regression, or any other models you
know) covered in the second half of the semester, but this is not compulsory.
• Finalize your report with the necessary discussion/conclusion/recommendations.
• Possible discussion of the influence of missing data and model limitations; give
suggestions on solutions or future research.
• Well-structured flow and clear logic.
Writing (10%)
• Keep a list of references for all the literature/external sources you refer to.
• Good English, free of errors/typos.
Detailed Guidelines for Project Presentation
For the presentation,
• Every team member needs to participate.
• You may simply use your cell phone to record the video.
• Due to the current situation, you are allowed to record separate video clips at home and combine them into a
single presentation.
• The audience should be able to see your slides and your face. The preferred formats are shown in the
following pictures.
• No need to dress professionally.
• Make sure that the presenter's name is displayed while he/she is talking; otherwise there will be
a 20% penalty on the presenter.
• Use proper data visualization for your slides.
• Try to focus more on the business problem itself and make it as friendly as possible for an audience
with a limited programming/math/statistics background.
• The presentation is graded individually, according to each student's presentation skills.
• DO NOT read a script! Students will be penalized if they read from a script on their screen.
Reference
This book may help you with your project report/presentation: Storytelling with Data
Some videos from the same author:
https://www.youtube.com/watch?v=8EMW7io4rSI
https://www.youtube.com/channel/UCjhGlILWNloXJdR2NTCBMlA/videos?view=0&sort=da&flow=grid
Appendix C: ACADEMIC HONESTY & PLAGIARISM
Academic integrity and honesty is essential for the pursuit and acquisition of knowledge. The University and
School expect every student to uphold academic integrity & honesty at all times. Academic dishonesty is any
misrepresentation with the intent to deceive, or failure to acknowledge the source, or falsification of
information, or inaccuracy of statements, or cheating at examinations/tests, or inappropriate use of resources.
Plagiarism is "the practice of taking someone else's work or ideas and passing them off as one's own" (The New
Oxford Dictionary of English). The University and School will not condone plagiarism. Students should adopt
this rule - You have the obligation to make clear to the assessor which is your own work, and which is the work
of others. Otherwise, your assessor is entitled to assume that everything being presented for assessment is
being presented as entirely your own work. This is a minimum standard. In case of any doubts, you should
consult your instructor.
Additional guidance is available at:
http://www.nus.edu.sg/registrar/adminpolicy/acceptance.html#NUSCodeofStudentConduct
Online Module on Plagiarism: http://emodule.nus.edu.sg/ac/