Syllabus

B8114 Applied Regression Analysis
FALL 2014, 2nd Half
PROFESSOR
Peter Kolesar
Office Location: 314 Uris
Office Phone: 212 854 4105
Fax: 212 316 9180
E-mail: pjk4@columbia.edu
Class Meetings: Thursdays 2:30 to 5:00 pm
Office Hours: Thursdays, noon to 2 pm and by appointment. Skype conferences welcomed
TEACHING ASSISTANTS: To be named
REQUIRED COURSE MATERIAL: See course description for more information
Recommended Textbook: Samprit Chatterjee, Ali S. Hadi and Bertram Price, Regression Analysis by Example, 4th
edition (Wiley 2006) ISBN 978-0-471-74696-6
Recommended Computer Software: Minitab 17 (a free 30-day trial version may be downloaded from
http://it.minitab.com/en-us/products/minitab/free-trial.aspx)
PREREQUISITE
This course will utilize, build on and extend concepts covered in the MBA core statistics course B6100 or the
equivalent – that is, the basics of probability and statistical inference. Students who qualified out of the core
statistics course will generally be accepted, but should consult with the professor prior to the start of term, or
in the first class.
COURSE DESCRIPTION
This course studies the family of statistical methods called regression analysis and is a logical successor to the core
B6100 Managerial Statistics course. It is frequently taken by 2nd-year MBA students wishing to solidify and extend their
quantitative and statistical data analysis skills.
Regression analysis is used to build statistical models of the relationships between variables that can be used for
enhanced understanding of the causes of a phenomenon and, when it works, for prediction of future outcomes. In
business the ultimate goal of regression analysis is often to support better decision making. In the contemporary world
of ‘big data’, regression provides foundational methods and ideas for many of the techniques used in ‘data mining.’
Regressions have been used in financial analyses of investment opportunities, in marketing analyses of customer
behavior, in human resources to test the fairness of employment policies, in operations to identify the determinants of
product quality, and in strategic planning to create sales forecasts. Regression models are also widely used in many
other fields in the sciences, economics and engineering.
Although contemporary computing hardware and statistical software have made it extraordinarily easy to mechanically
produce regression analyses (for example, Microsoft Excel has a powerful regression tool that is easy to use without any
knowledge of the underlying concepts or theory) it is a challenge to create a regression model that is really useful and
reliable. The explicit goal of this course is to learn how, in a business context, to create reliable, valid and useful
regressions, and to be able to judge the validity and usefulness of regressions done by others. The course premise is
that successful applications of regression require understanding of both the practical problem situation, and the
underlying statistical theory. The course blends theory and applications -- avoiding the extremes of presenting
unneeded theory in isolation, or of giving application tools without the foundation needed for practical understanding.
The course integrates three topics: First and most basic, is an approach to data and data analysis that is based on
statistical theory, the scientific method and on some pragmatic epistemology. Second, is regression analysis mechanics
and theory, including extensions of the basic linear regression model to logistic regressions, non-linear models and
multivariate methods. Third, is forecasting of time series from historical data. We will seek to introduce some elements
of modern ‘big data/data mining’ as time permits. The title of our textbook is descriptive of our approach: Regression
Analysis by Example. Concepts and procedures shall generally be introduced by example. Moreover, we will emphasize
applications in which the business context matters.
Computing: The course will be computationally hands-on from the very first lecture. Your laptop computer will be
used for all data analysis. Some of the course work, at least at the outset, can be done in Excel and we assume a basic
familiarity with its data analysis tools and capabilities. However, there are advantages and conveniences to using a
statistical software package. Several important regression procedures cannot be done in Excel, so we will supplement it
with the Minitab statistical analysis system. Minitab gives us professional statistical analysis capabilities while being
inexpensive and very easy to learn and use. An advantage of Minitab is the ease with which it interfaces with
Microsoft’s Excel, Word and PowerPoint. Any version of Minitab, or indeed any other software that can do regression,
stepwise regression and logistic regression will be adequate. Students who already are familiar with, or have access to,
another software package that has these capabilities are welcome to use it instead (e.g., STATA, BMDP, SAS, S4,
JMPIN, R).
Conduct of the Course
Course Project: A major part of the course will be a term project consisting of a significant regression-oriented data
analysis in a real business context. I will provide a standard ‘default’ project. However, I suggest that students who have
particular application interests propose their own project, as this can increase greatly the value you get out of the
course. The term project can be either an individual effort or by a team of two. Specifications for the final project
report and timing will be provided in class.
Workload and Grading: It is expected that students will attend class regularly and participate fully in class discussions.
Since many of these discussions will be based on our analytic homework assignments (mini-cases), it is important that
assigned work be done thoroughly and on time. Most regular homework will be of the Business School’s “Type A”
variety, but with the group size limited to a maximum of 2 people. You may make one submission and an identical grade
will be given for both members of the group. You have the option of doing these exercises individually as well. Some
homework, specified in advance, will need to be done individually.
There will be one short electronically administered exam.
In class I will generally expect professional comportment appropriate to a serious learning environment. On the other
hand, I intend that we all will have fun while learning.
The overall work load should be moderate, but as in any serious learning endeavor, you will benefit from the course in
proportion to what you put in. The final course grade will be composed of four components:
Exam: 15%
Attendance and class participation: 15%
Written Assignments: 35%
Term Project: 35%
Textbooks and Software
The course will follow the same general outline and notation as the textbook Regression Analysis by Example by
Chatterjee, Hadi and Price, listed below. In assignments and lectures we will often refer to the book as RAE. RAE is
a good resource and reference, and goes into greater depth on some topics than we will have time for in class. This
textbook strikes a balance between providing a theoretical understanding and keeping a concrete focus on applications.
While we may reference some of the book’s examples in class, we will frequently use different examples so the book will
offer complementary views and illustrations on some issues and procedures. We strongly recommend purchasing it;
however it is possible to do very well in this course without owning the textbook – it will be on reserve in the Business
School Library. There are a good number of excellent books on regression and if you already own another book, it may
suffice. As an alternative, I recommend the excellent Introduction to Linear Regression Analysis by Montgomery, Peck
and Vining, which is at a slightly higher mathematical level than RAE. As stated above, in addition to Excel we
will use the Minitab statistical package, the software for which comes with a helpful user’s manual.
Textbook: Samprit Chatterjee, Ali S. Hadi and Bertram Price, Regression Analysis by Example, 4th edition (Wiley 2006)
ISBN 978-0-471-74696-6
Computer Software: Minitab 17 (A free 30-day trial version may be downloaded from www.minitab.com
http://www.minitab.com/products/minitab/demo/)
COURSE OBJECTIVES
The primary objective of the course is to enable you to carry out meaningful regression analyses in a business context
and to be a knowledgeable consumer of such analyses done on your behalf or on behalf of your firm by others. Another
goal is to provide a foundation for other data mining techniques such as regression trees and discriminant analysis. In the
process the course should greatly enhance your understanding of, and comfort with, variation, statistics and probability.
For further information contact me, Professor Peter Kolesar, preferably initially by email and I will follow up by phone or
Skype.
pjk4@columbia.edu
212 854 4105 (office) or 845 557 6307 (alternative phone number)