MIS2502003 Practice Final answer keys

MIS 2502 Section 03 Final Exam Practice
May 10th 2011, Speakman Hall 114
10:30 am – 11:30 am
PLEASE DO NOT TURN THE PAGES UNTIL INSTRUCTED BY THE PROFESSOR.
Please read the following instructions carefully before you start the exam.
You will have 60 minutes to complete this exam.
This exam has 6 pages (including the cover) and is worth 100 points. The number of points for each
question is indicated next to the question number. Budget your time accordingly.
Space is provided to answer each of the questions. If you need more space, please use the back and
number your answers. Be sure to write legibly – I cannot grade an exam that I can’t read.
Good luck!
TUID:
Name:
Please do your best on this practice exam. If you have finished all the labs and homework, I hope
your score is higher than 85.
I. Briefly discuss (in no more than 30 words each) whether each of the following
activities is a data mining task. Whether the answer is yes or no, name the task as best you can.
(8 x 4 = 32 points total)
Grading: If “yes” or “no” is correct, give 4 points. Deduct at most 2 points if the answer is not
blank.
1. Computing the revenue of a company
Answer: No. This is just a routine calculation. Data mining aims to uncover underlying patterns
that are not obvious.
2. Sorting a customer database based on customer names
Answer: No. This is just a sorting task, which can be done with SQL.
3. Predicting the future stock price of a company using historical records
Answer: Yes. Regression can do that and is a typical data mining task.
4. Clustering Temple students by their registered department names
Answer: No. Department names are already given, so grouping by them reveals no underlying pattern.
5. Clustering Temple students into several groups by analyzing course GPA
Answer: Yes. This is a clustering task.
6. Testing IQ for each student
Answer: No. That is just an examination, a way to collect data.
7. Predicting the music preference of a user based on his or her favorite songs
Answer: Yes. This is a prediction task.
8. Using ERD theory to establish a data warehouse for a company
Answer: No. That is data modeling (ERD) and data warehouse design.
II. Multiple Choice Questions. Each has only one correct answer. (5 x 4 = 20 points total)
B_____1. The software that we use for data mining practice is developed by:
A. SPSS
B. SAS
C. Oracle
D. Microsoft
C_____2. Microsoft usually clusters all potential customers into three groups: home users,
professional users, and enterprise users. For each product, Microsoft customizes three
different versions based on the needs of each user group. Apple, however, prefers to provide just one
universal version, since it does not cluster its customers by need. From a data mining clustering
point of view, which clustering strategy is clearly better?
A. Microsoft
B. Apple
C. Hard to say
C_____3. OLAP (On-Line Analytical Processing) is a set of multidimensional data analysis
techniques. Which of the following is not an OLAP technique?
A. Slice
B. Dice
C. Cluster
D. Pivot
B_____4. Which of the following questions could NOT be answered with OLAP?
A. What are the total sales for each product?
B. How much salary did each employee receive for each month?
C. Which salesperson has sold the most?
D. Which product did each salesperson sell the most?
C_____5. Which of the following statements about regression is NOT correct?
A. R square measures how much variance can be explained by a model
B. If the p value of a factor is smaller than 0.1, this factor has a significant influence on the target
C. Predictors of a regression model can be correlated with each other
D. Estimated coefficients are meaningless if the corresponding p values are larger than 0.1
III. Data Analytics (48 points)
A marketing manager is in charge of promoting their dungaree jeans which have four different styles:
leisure, stretch, fashion, and original.
1. (10 points) The manager first did a clustering and segmentation analysis in order to understand the
sales features of different stores and design different marketing strategies. In the output, he saw
the following information:
Please use your own words to explain the feature of this segment.
Answer: This segment contains stores that sell a higher-than-average number of stretch jeans.
Grading: if the answer is close, give full points.
2. (10 points) Understanding customer purchase behavior is essential, so the manager also analyzed
the historical transaction records to identify the underlying association rules. He found the
following information in the output:
Rule 1: Original → Leisure (lift value = 6.0)
Rule 2: Leisure → Original (lift value = 5.2)
What can you conclude from the above output?
Answer: Customers who purchase the original style are also likely to purchase the leisure style, and
vice versa. So it is better for stores to place these two styles next to each other.
Grading: if the answer is close, give full points.
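Note: the following is a minimal Python sketch of how support, confidence, and lift are commonly computed for association rules. The transactions are a hypothetical toy data set (not the exam's data), and the helper names `support`, `confidence`, and `lift` are illustrative. A lift well above 1 means the two styles are bought together more often than independence would predict.

```python
# Hypothetical toy transactions; each set is one customer's purchase.
transactions = [
    {"original", "leisure"},
    {"original", "leisure"},
    {"stretch"},
    {"fashion"},
    {"stretch", "fashion"},
    {"original", "leisure", "stretch"},
    {"fashion"},
    {"stretch"},
    {"leisure"},
    {"fashion", "stretch"},
]


def support(items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs):
    """Estimated P(rhs | lhs): support of both sides divided by support of lhs."""
    return support(set(lhs) | set(rhs)) / support(lhs)


def lift(lhs, rhs):
    """Confidence of lhs -> rhs relative to the baseline frequency of rhs."""
    return confidence(lhs, rhs) / support(rhs)


for lhs, rhs in [({"original"}, {"leisure"}), ({"leisure"}, {"original"})]:
    print(sorted(lhs), "->", sorted(rhs),
          "confidence:", round(confidence(lhs, rhs), 2),
          "lift:", round(lift(lhs, rhs), 2))
```

Under this textbook definition, confidence depends on the direction of the rule while lift does not. The toy numbers are not meant to reproduce the lift values in the output above.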
3. (20 points) The manager has planned and implemented several marketing plans. After a year, he
decides to evaluate his marketing plans based on their operational data with regression. In this
regression, the dependent variable and independent variables (predictors) are listed in table 1.
Variable Name   Role     1 Unit   Description
MarketShare     Target   1%       Weekly market share that this company is taking.
DirectMail      Input    $1.00    Weekly spend on Direct Mail advertisement
Internet        Input    $1.00    Weekly spend on Internet advertisement
PrintMedia      Input    $1.00    Weekly spend on Print Media advertisement
TVRadio         Input    $1.00    Weekly spend on TV or Radio advertisement
Table 1. Regression Variables
Regression Output:
Model Fit Statistics
R-Square   0.5267      Adj R-Sq   0.5038
AIC        -586.7872   BIC        -584.1920
SBC        -574.4005   C(p)       5.0000
Analysis of Maximum Likelihood Estimates
Parameter    DF   Estimate    Standard Error   t Value   P value
Intercept    1    0.7547      0.0153           49.23     <.0001
DirectMail   1    -0.00011    0.000049         -2.25     0.0272
Internet     1    0.00025     0.000037         -6.98     <.0001
PrintMedia   1    -0.00049    0.000092         -5.34     <.0001
TVRadio      1    0.000026    0.000019         1.35      0.1820
Please answer the following questions based on the above output:
a. How good is this regression model?
Answer: R square is 52.67%. That means 52.67% of the variance in market share can be explained by
this model.
Grading: if the answer is close, give full points.
b. Based on the output, which marketing strategy is the best?
Answer: Internet, because it is the only strategy that significantly helps to increase the market share.
Grading: if the answer is close, give full points.
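Note: the reasoning behind answer b can be sketched in Python as below. The estimates and p values are copied from the regression output. The 0.1 threshold is an assumption taken from Section II, question 5, `<.0001` is entered as the upper bound 0.0001, and the function name `classify` is illustrative.

```python
# (estimate, p value) pairs copied from the regression output.
# "<.0001" is entered as 0.0001, an upper bound (assumption for illustration).
rows = {
    "DirectMail": (-0.00011, 0.0272),
    "Internet": (0.00025, 0.0001),
    "PrintMedia": (-0.00049, 0.0001),
    "TVRadio": (0.000026, 0.1820),
}

ALPHA = 0.1  # significance threshold assumed from Section II, question 5


def classify(estimate, p_value, alpha=ALPHA):
    """Label a coefficient by its significance and the sign of its estimate."""
    if p_value >= alpha:
        return "not significant"
    return "significant positive" if estimate > 0 else "significant negative"


summary = {name: classify(est, p) for name, (est, p) in rows.items()}
for name, verdict in summary.items():
    print(f"{name:<11}{verdict}")
```

Only Internet comes out as significant and positive. TVRadio has a positive estimate, but its p value of 0.1820 is above the threshold.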
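c. How should the estimated coefficient of Internet be interpreted?
Answer: The estimated coefficient is 0.00025. With the other spends held fixed, every additional $1.00
invested in Internet advertisement increases the market share by 1 * 0.00025 = 0.00025, in the units
of MarketShare given in Table 1.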
Grading: if the answer is close, give full points.
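Note: a minimal Python sketch of the fitted equation, using the estimates from the regression output. The weekly budget in `base` is hypothetical and the function name `predict_market_share` is illustrative. It shows that, with the other spends held fixed, one extra dollar of Internet spend changes the predicted MarketShare by exactly the Internet coefficient, in the units of Table 1.

```python
# Estimates copied from the regression output above.
COEF = {
    "Intercept": 0.7547,
    "DirectMail": -0.00011,
    "Internet": 0.00025,
    "PrintMedia": -0.00049,
    "TVRadio": 0.000026,
}


def predict_market_share(spend):
    """Predicted weekly MarketShare for a dict of weekly spends in dollars."""
    return COEF["Intercept"] + sum(COEF[name] * dollars for name, dollars in spend.items())


# Hypothetical weekly budget, for illustration only.
base = {"DirectMail": 100.0, "Internet": 100.0, "PrintMedia": 100.0, "TVRadio": 100.0}
more_internet = dict(base, Internet=base["Internet"] + 1.0)

# Effect of one extra dollar of Internet spend, everything else unchanged.
delta = predict_market_share(more_internet) - predict_market_share(base)
print(f"change in predicted MarketShare: {delta:.5f}")
```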
d. If you were the marketing manager, what actions would you take based on the above regression
result?
Answer: Cancel all other marketing plans and invest only in Internet advertisement.
Grading: if the answer is close, give full points.
4. (8 points) In practice, tools using collective intelligence have performed better than theorists can
explain, especially in the IT industry. Can this manager use collective intelligence to improve his
marketing performance? If yes, please design a detailed strategy to explain the power of collective
intelligence in marketing. If no, please explain.
Answer: Definitely yes. One example is viral marketing, in which a mechanism is designed to have
customers promote the products to their friends.
Grading: if the answer is yes, give at least 6 points. If it is no but an explanation is given, give 4 points.