MIS2502003 Final exam practice

MIS 2502 Section 03 Final Exam
May 10th 2011, Speakman Hall 114
10:30 am – 11:30 am
PLEASE DO NOT TURN THE PAGES UNTIL INSTRUCTED BY THE PROFESSOR.
Please read the following instructions carefully before you start the exam.
You will have 60 minutes to complete this exam.
This exam has 6 pages (including the cover) and is worth 100 points. The number of points for each
question is indicated next to the question number. Budget your time accordingly.
Space is provided to answer each of the questions. If you need more space, please use the back and
number your answers. Be sure to write legibly – I cannot grade an exam that I can’t read.
Good luck!
TUID:
Name:
I. Briefly discuss (no more than 30 words each) whether or not each of the following
activities is a data mining task. Whether your answer is yes or no, name the task as best you can.
(8 x 4 = 32 points total)
1. Computing the revenue of a company
2. Sorting a customer database based on customer names
3. Predicting the future stock price of a company using historical records
4. Clustering Temple students by their registered department names
5. Clustering Temple students into several groups by analyzing course GPA
6. Testing IQ for each student
7. Predicting the music preference of a user based on his or her favorite songs
8. Using ERD theory to establish a data warehouse for a company
II. Multiple Choice Questions. Each has only one correct answer. (5 x 4 = 20 points total)
_____1. The software that we use for data mining practice is developed by:
A. SPSS
B. SAS
C. Oracle
D. Microsoft
_____2. Microsoft usually clusters all potential customers into three groups: home users, professional
users, and enterprise users. It customizes each product into three different versions
based on the needs of each user group. Apple, however, prefers to provide just one universal version,
since it does not cluster its customers by need. From the clustering view of data mining, which
clustering strategy is clearly better?
A. Microsoft
B. Apple
C. Hard to say
_____3. OLAP (On-Line Analytical Processing) is a set of multidimensional data analysis techniques.
Which of the following is not an OLAP technique?
A. Slice
B. Dice
C. Cluster
D. Pivot
_____4. Which of the following questions could NOT be answered with OLAP?
A. What are the total sales for each product?
B. How much salary did each employee receive for each month?
C. Which salesperson has sold the most?
D. Which product did each salesperson sell the most?
_____5. Which of the following statements about regression is NOT correct?
A. R-square measures how much of the variance can be explained by a model
B. If the p value of a factor is smaller than 0.1, this factor has a significant influence on the target
C. Predictors of a regression model can be correlated with each other
D. Estimated coefficients are meaningless if the corresponding p values are larger than 0.1
III. Data Analytics (48 points)
A marketing manager is in charge of promoting the company's dungaree jeans, which come in four different styles:
leisure, stretch, fashion, and original.
1. (10 points) The manager first ran a clustering and segmentation analysis in order to understand the
sales features of different stores and develop different marketing strategies. In the output, he saw
the following information:
Please explain the features of this segment in your own words.
2. (10 points) Understanding customer purchase behavior is essential, so the manager also analyzed
the historical transaction records to identify the underlying association rules. He found the following
information in the output:
Rule 1: Original → Leisure (lift value = 6.0)
Rule 2: Leisure → Original (lift value = 5.2)
What can you conclude from the above output?
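As a study aid, here is a minimal Python sketch of how a lift value is commonly computed from transaction counts: the confidence of the rule divided by the support of its right-hand side. The function name and all counts are hypothetical and are not taken from the manager's data. The tool that produced the output above may define lift differently, since this textbook form gives the same value in both rule directions.

```python
def lift(n_total, n_lhs, n_rhs, n_both):
    """Lift of the rule LHS -> RHS, computed from transaction counts."""
    support_lhs = n_lhs / n_total      # share of transactions containing LHS
    support_rhs = n_rhs / n_total      # share of transactions containing RHS
    support_both = n_both / n_total    # share containing both LHS and RHS
    confidence = support_both / support_lhs
    return confidence / support_rhs


# Hypothetical counts, chosen only for illustration:
# 1000 transactions, 100 with the LHS item, 50 with the RHS item, 30 with both.
example_lift = lift(n_total=1000, n_lhs=100, n_rhs=50, n_both=30)
print(f"lift = {example_lift:.2f}")
```

A value of 1 corresponds to the two styles being bought independently of each other.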
3. (20 points) The manager has planned and implemented several marketing plans. After a year, he
decides to evaluate his marketing plans by running a regression on the company's operational data. In this
regression, the dependent variable and independent variables (predictors) are listed in Table 1.
Table 1. Regression Variables
Variable Name   1 Unit   Description                                        Role
MarketShare     1%       Weekly market share that this company is taking.   Target
DirectMail      $1.00    Weekly spend on Direct Mail advertisement          Input
Internet        $1.00    Weekly spend on Internet advertisement             Input
PrintMedia      $1.00    Weekly spend on Print Media advertisement          Input
TVRadio         $1.00    Weekly spend on TV or Radio advertisement          Input
Regression Output:
Model Fit Statistics
R-Square   0.5267      Adj R-Sq   0.5038
AIC        -586.7872   BIC        -584.1920
SBC        -574.4005   C(p)       5.0000
Analysis of Maximum Likelihood Estimates
Parameter    DF   Estimate    Standard Error   t Value   P value
Intercept    1    0.7547      0.0153           49.23     <.0001
DirectMail   1    -0.00011    0.000049         -2.25     0.0272
Internet     1    0.00025     0.000037         -6.98     <.0001
PrintMedia   1    -0.00049    0.000092         -5.34     <.0001
TVRadio      1    0.000026    0.000019         1.35      0.1820
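As a reading aid for the output above, the following minimal Python sketch shows the usual relationship between the columns: the t value is the estimate divided by its standard error. The numbers are copied from three rows of the output, and the helper name `t_value` is hypothetical. Because the printed estimates are rounded, the recomputed ratios only approximate the reported t values. The Internet row is left out because its printed estimate and t value have opposite signs, which this ratio cannot reproduce.

```python
# (estimate, standard error, reported t value), copied from the output above
reported = {
    "DirectMail": (-0.00011, 0.000049, -2.25),
    "PrintMedia": (-0.00049, 0.000092, -5.34),
    "TVRadio": (0.000026, 0.000019, 1.35),
}


def t_value(estimate, std_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return estimate / std_error


# Recompute each t value from the rounded estimate and standard error.
recomputed = {name: t_value(est, se) for name, (est, se, _) in reported.items()}

for name, (_, _, t_reported) in reported.items():
    print(f"{name:<11} recomputed {recomputed[name]:7.2f}   reported {t_reported:7.2f}")
```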
Please answer the following questions based on the above output:
a. How good is this regression model?
b. Based on the output, which marketing strategy is the best?
c. How would you interpret the estimated coefficient of Internet?
d. If you were the marketing manager, what actions would you take based on the above regression
results?
4. (8 points) In practice, tools using collective intelligence have performed better than theorists can
explain, especially in the IT industry. Can this manager use collective intelligence to improve his
marketing performance? If yes, please design a detailed strategy that demonstrates the power of collective
intelligence in marketing. If no, please explain why not.