STAT 503 Assignment 1: Movie Ratings
SOLUTION NOTES
These are my suggestions on how to analyze this data and organize the results. I have posed more questions below than I can address in one analysis, to show the range of questions groups addressed in their reports.
1 Data Description
With the upcoming Academy Awards and the just-finished Golden Globes, we collected data on 14 recent movies from http://movies.yahoo.com/. The movies are Electra, The Aviator, Ocean’s Twelve, Coach Carter, Meet the Fockers, Racing Stripes, In Good Company, White Noise, Lemony Snicket, Fat Albert, The Incredibles, Alexander, Shall We Dance, and Closer. The movies were selected mostly from the top 10 list on January 14, 2005; several were also suggested by class members.
We also collected ratings from 9 critics (Atlanta Journal, Boston Globe, Chicago Sun-Times, Chicago Tribune, E! Online, filmcritic, Hollywood Reporter, New York Times, Seattle Post), and the average Yahoo! user rating. These critics were the ones who rated all 14 movies.
The main question we want to answer is:
“Which movies are consistently rated highly by both critics and reviewers?”
Other questions of interest might be:
• Are some critics more generous in their ratings than others?
• Can the critics’ ratings accurately predict the average user rating?
• Does the number of users rating the movie accurately predict the average user rating?
• Is there a relationship between the weekly gross of a movie and the average user rating or the average critic rating?
2 Suggested approaches
Approach: Data restructuring. Make new rating variables by converting the letter grades to GPAs.
Reason: The letter grade is an ordinal variable, and so it needs to be treated numerically for various calculations. (A sketch of this conversion follows the table.)

Approach: Summary statistics. Tabulate the average critic rating and average user rating for each movie, and the average rating given by each critic; calculate confidence intervals for the critics’ ratings.
Reason: To compare the numbers for each movie, and for each critic.
Questions addressed: “What is the average critic rating for each movie?” “Does the average user rating compare similarly to the average critic rating?”

Approach: Dotplots. Plot the critics’ ratings for each movie.
Reason: To examine the distribution of the ratings for each movie.
Questions addressed: “Are some movies consistently rated higher by the critics?” “Do some movies split the critics, and draw dramatically different ratings from different critics?”

Approach: Pairwise scatterplots. Plot the average critic rating against the average user rating, and the user rating against the number of users.
Reason: To explore the relationship between these pairs of variables.
Questions addressed: “What is the relationship between the number of users and the average user rating?”

Approach: Regression. Regress the average user rating on the critics’ ratings, and on the number of users.
Reason: To build a numerical model for the relationship between the pairs of variables.
Questions addressed: “Can the critics’ ratings accurately predict the average user rating?” “Does the number of users rating the movie accurately predict the average user rating?”
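As a minimal sketch of the grade conversion (the full version in the Appendix uses the grade labels exactly as they appear in the data files, with a leading space), the letter grades map onto a GPA-style scale through an ordered factor:

# Sketch: convert letter grades to a GPA-style scale via an ordered factor.
# The example grades here are illustrative, not taken from the data.
grade.levels <- c("F","D-","D","D+","C-","C","C+","B-","B","B+","A-","A","A+")
grades <- c("B+", "A-", "C", "F", "B")
as.numeric(factor(grades, levels=grade.levels, ordered=TRUE))/3
# yields 3.33 3.67 2.00 0.33 3.00; e.g. B is the 9th level, and 9/3 = 3.00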
3 Actual Results

3.1 Summary Statistics
Table 1 contains a summary of the data for each movie. The users’ average grade and the critics’ average grade are listed, along with the standard deviation and Bonferroni-adjusted confidence intervals for the critics’ ratings for each movie. The top rated movie is The Incredibles. The lowest rated movie is White Noise. Coach Carter is rated comparatively lower by critics than by users. The other movies rated better by users than critics are Lemony Snicket, Shall We Dance, Fat Albert, Electra and White Noise. Alexander is rated low by the critics, C, but even lower by the users, C-. Closer is another movie rated lower by users, B-, than critics, B. Most of the differences between the critics’ average ratings of the movies are not statistically significant. The one exception is that Electra is rated significantly lower than The Incredibles.
Table 1: Summary of Movie Ratings. The top rated movie is The Incredibles. The lowest rated movie is White Noise. Coach Carter is rated comparatively lower by critics than by users. Alexander is rated low by the critics but even lower by the users. Bonferroni-adjusted confidence intervals for the average critic ratings for each movie are provided to compare the significance of the differences. Most of the differences between the critics’ average ratings are not statistically significant. The one exception is that Electra is rated significantly lower than The Incredibles.
Movie              Number  User Average  User Letter  Critic Average  Critic Letter  Critic SD  Lower CI  Upper CI
The Incredibles     31517      3.67          A-            3.48            B+           0.69       2.56      4.40
The Aviator          5214      3.00          B             3.19            B+           0.67       2.30      4.08
In Good Company      1761      3.00          B             2.96            B            0.68       2.06      3.86
Closer               5625      2.67          B-            2.93            B            0.91       1.72      4.14
Coach Carter         7320      3.67          A-            2.89            B            0.24       2.58      3.20
Meet the Fockers    22667      3.00          B             2.89            B            0.24       2.58      3.20
Lemony Snicket      16816      3.00          B             2.78            B-           0.67       1.89      3.67
Ocean’s Twelve      14027      2.67          B-            2.56            B-           0.44       1.97      3.14
Shall We Dance       6933      2.67          B-            2.44            C+           0.90       1.25      3.64
Racing Stripes       1320      2.33          C+            2.33            C+           0.37       1.84      2.83
Alexander           12340      1.67          C-            2.00            C            0.85       0.87      3.13
Fat Albert           7740      2.67          B-            1.93            C            0.57       1.17      2.69
Electra              4191      2.67          B-            1.74            C-           0.47       1.12      2.36
White Noise          7197      2.33          C+            1.70            C-           0.81       0.63      2.78
Table 2 contains a summary of the data for each critic. Both Chicago newspapers rate movies higher on average, and the Hollywood Reporter and The New York Times rate movies the lowest on average. Bonferroni-adjusted confidence intervals are given to compare the significance of the differences in average ratings. None of the differences between the critics’ averages are statistically significant.
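As a sketch of how the adjusted intervals in Tables 1 and 2 are computed (the Appendix hard-codes the rounded quantile levels 0.998 and 0.997; the helper function name here is illustrative):

# Bonferroni-adjusted t confidence interval for one group of ratings,
# widening the critical value to account for k simultaneous intervals
# (k = 14 movies in Table 1, k = 9 critics in Table 2)
bonferroni.ci <- function(x, k, alpha=0.05) {
  n <- length(x)
  tcrit <- qt(1 - alpha/(2*k), df=n-1)
  mean(x) + c(-1, 1) * tcrit * sd(x)/sqrt(n)
}
# e.g. the interval for one movie, from its 9 critic ratings:
# bonferroni.ci(ratings.for.one.movie, k=14)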
Table 2: Summary of Critics’ Ratings. Both Chicago newspapers rate movies higher on average, and the Hollywood Reporter and The New York Times rate movies the lowest on average. Bonferroni-adjusted confidence intervals are given to compare the significance of the differences in average ratings. None of the differences between the critics’ averages are statistically significant.

Critic               Average    SD   Lower CI  Upper CI
Chicago Tribune         2.98   0.89     2.20      3.76
Chicago Sun-Times       2.88   0.66     2.30      3.46
E! Online               2.71   0.94     1.89      3.54
Boston Globe            2.69   0.70     2.08      3.30
Seattle Post            2.60   0.66     2.02      3.17
Atlanta Journal         2.48   0.60     1.96      3.00
filmcritic              2.33   0.73     1.70      2.97
New York Times          2.19   0.87     1.43      2.96
Hollywood Reporter      2.17   0.97     1.32      3.01

3.2 Dotplots

Figure 1 shows the distribution of the critics’ ratings for each movie. Although The Incredibles was rated very highly on average (as reported in Table 1), we can see from these plots that two critics gave it much lower ratings: a C from the Atlanta Journal, and a B- from the New York Times. Shall We Dance was rated reasonably well, B, but it was given an F by the Hollywood Reporter, and a C- by E! Online. The ratings for Alexander are uniformly spread, with one critic, the Chicago Tribune, rating it as high as A-. Coach Carter and Meet the Fockers were consistently rated near B.
Figure 1: Ratings given by 9 critics to 14 movies.
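The spread visible in Figure 1 can also be summarized numerically; a sketch, assuming critics.ratings.gpa and d.movies.critics as defined in the Appendix:

# Range of the 9 critic ratings for each movie: a large range flags
# movies that split the critics (e.g. Shall We Dance, Alexander)
movie.range <- tapply(critics.ratings.gpa, d.movies.critics[,1],
                      function(x) diff(range(x)))
sort(movie.range, decreasing=TRUE)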
3.3 Scatterplots
Figure 2 shows scatterplots of the users’ ratings against each critic’s ratings. Some critics’ ratings match the user ratings reasonably well: E! Online, filmcritic, New York Times. Some critics’ ratings don’t match the user ratings at all: Atlanta Journal and Chicago Tribune.
Figure 2: Scatterplots of users’ ratings against critics’ ratings, to compare their similarity.
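The correlations annotated on the Figure 2 panels can also be computed in one step; a sketch, assuming the per-critic matrix critics.ratings.gpa.mat and users.ratings.gpa from the Appendix (critic.cor is an illustrative name):

# Correlation of each critic's ratings with the average user ratings;
# high values (E! Online, filmcritic) indicate close agreement
critic.cor <- cor(critics.ratings.gpa.mat, users.ratings.gpa)
# label the rows with the critic names, taken from the first 9 data rows
rownames(critic.cor) <- as.character(d.movies.critics[1:9,2])
round(critic.cor[order(critic.cor, decreasing=TRUE),], 2)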
3.4 Regression
We first fitted a multiple linear model of average user rating against all the critics’ ratings. This model did not explain a significant amount of the variation in users’ ratings, and no coefficients were significant.
So we fitted a separate simple regression model for each critic. Table 3 summarizes the results (a sketch of these fits follows the table). The ratings of E! Online and filmcritic most closely match the users’ ratings. The New York Times, Chicago Sun-Times and Seattle Post do a quite reasonable job of predicting the average user rating. The worst performers are the Chicago Tribune and the Atlanta Journal.
Table 3: Summary of individual regression models for each critic. The ratings of E! Online and filmcritic most closely match the users’ ratings. The New York Times, Chicago Sun-Times and Seattle Post do a quite reasonable job of predicting the average user rating. The worst performers are the Chicago Tribune and the Atlanta Journal.
Critic               Deviance  Intercept  Slope  p-value for slope
E! Online               1.37      1.63     0.43       0.001
filmcritic              1.52      1.54     0.53       0.002
New York Times          2.18      2.00     0.36       0.021
Chicago Sun-Times       2.53      1.62     0.41       0.057
Seattle Post            2.58      1.75     0.40       0.065
Hollywood Reporter      2.70      2.24     0.25       0.090
Boston Globe            2.99      2.05     0.27       0.191
Chicago Tribune         3.19      2.29     0.17       0.323
Atlanta Journal         3.21      2.20     0.24       0.343
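A minimal sketch of the per-critic fits behind Table 3, assuming the objects from the Appendix; lm() is used here in place of the gaussian glm() the Appendix uses, so the deviance is the residual sum of squares (reg.summary is an illustrative name):

# Separate simple regression of average user rating on each critic's
# ratings, collecting deviance, coefficients and the slope's p-value
reg.summary <- NULL
for (i in 1:9) {
  fit <- summary(lm(users.ratings.gpa ~ critics.ratings.gpa.mat[,i]))
  reg.summary <- rbind(reg.summary,
    c(deviance=sum(fit$residuals^2),
      intercept=fit$coefficients[1,1],
      slope=fit$coefficients[2,1],
      p.value=fit$coefficients[2,4]))
}
reg.summary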
4 Conclusions
• Only The Incredibles is rated highly, A-, by both users and critics. The Aviator and In Good Company are rated well by both users and critics, at a B.
• The critics’ ratings for a movie can vary a lot. Although The Incredibles is rated very highly on average, two critics gave it markedly lower scores: a C from the Atlanta Journal, and a B- from the New York Times. Shall We Dance was rated reasonably well, B, by most, but it received an F from the Hollywood Reporter, and a C- from E! Online.
• There is some disagreement between critics and users. Coach Carter is rated poorly by critics but highly by users. Electra was rated poorly by critics, C-, but reasonably by the users, B-. Alexander is a movie that users rated even worse than the critics did, and both rated it poorly.
• There is some bias in the critics’ ratings. The two Chicago newspapers rate movies more highly on average than other critics. The Hollywood Reporter and The New York Times rate movies lower on average.
• The critics whose ratings best predict the users’ ratings are E! Online, filmcritic and the New York Times. The ratings given by the Atlanta Journal and the Chicago Tribune both differ substantially from the users’ ratings.
5 References
R Development Core Team (2003) R: A language and environment for statistical computing, R Foundation
for Statistical Computing, Vienna, Austria, ISBN 3-900051-00-3, http://www.R-project.org.
Yahoo (2005) Yahoo! Movies Website. http://movies.yahoo.com. Accessed Jan 14, 2005.
Appendix
# Read in the data: one file of critic ratings, one of user ratings
# (read.csv makes the movie and critic columns factors in this version of R)
d.movies.critics <- read.csv("movies-critics.csv")
table(d.movies.critics[,1], d.movies.critics[,3])
d.movies.users <- read.csv("movies-users.csv")
table(d.movies.users[,1], d.movies.users[,3])

# Automatically create numerical values for the ratings; the grade
# labels in the data files carry a leading space
critics.ratings <- factor(as.character(d.movies.critics[,3]),
  levels=c(" F"," D-"," D"," D+"," C-"," C"," C+"," B-"," B"," B+",
           " A-"," A"," A+"), ordered=TRUE)
critics.ratings.gpa <- as.numeric(critics.ratings)/3
users.ratings <- factor(as.character(d.movies.users[,3]),
  levels=c(" F"," D-"," D"," D+"," C-"," C"," C+"," B-"," B"," B+",
           " A-"," A"," A+"), ordered=TRUE)
users.ratings.gpa <- as.numeric(users.ratings)/3
# Summarize the ratings by movie; the critical values for the
# confidence intervals are Bonferroni-adjusted over the 14 movies
# (0.998 is approximately 1 - 0.05/(2*14))
options(digits=3, width=100)
movies.summary <- NULL
for (i in 1:14) {
  mn <- mean(critics.ratings.gpa[d.movies.critics[,1]==
             levels(d.movies.critics[,1])[i]])
  sd1 <- sd(critics.ratings.gpa[d.movies.critics[,1]==
            levels(d.movies.critics[,1])[i]])
  movies.summary <- rbind(movies.summary,
    cbind(levels(d.movies.critics[,1])[i],
          round(mn, digits=3), round(sd1, digits=3),
          round(mn - sd1*qt(0.998,8)/sqrt(9), digits=3),
          round(mn + sd1*qt(0.998,8)/sqrt(9), digits=3)))
}
dimnames(movies.summary) <- list(NULL,
  c("Movie","Critics Av","Critics SD",
    "Critics Lower CI","Critics Upper CI"))
movies.summary[order(movies.summary[,2], decreasing=TRUE),]
# Summarize the ratings by critic; the critical values are
# Bonferroni-adjusted over the 9 critics
# (0.997 is approximately 1 - 0.05/(2*9))
critics.summary <- NULL
for (i in 1:9) {
  mn <- mean(critics.ratings.gpa[d.movies.critics[,2]==
             levels(d.movies.critics[,2])[i]])
  sd1 <- sd(critics.ratings.gpa[d.movies.critics[,2]==
            levels(d.movies.critics[,2])[i]])
  critics.summary <- rbind(critics.summary,
    cbind(levels(d.movies.critics[,2])[i],
          round(mn, digits=3), round(sd1, digits=3),
          round(mn - sd1*qt(0.997,13)/sqrt(14), digits=3),
          round(mn + sd1*qt(0.997,13)/sqrt(14), digits=3)))
}
dimnames(critics.summary) <- list(NULL,
  c("Critic","Av","SD","Lower CI","Upper CI"))
critics.summary[order(critics.summary[,2], decreasing=TRUE),]
# Change the levels of the movie names from alphabetical order to the
# order of the mean critic rating
ordered.movies <- factor(as.character(d.movies.critics[,1]),
  levels=c("The Incredibles","The Aviator","In Good Company","Closer",
           "Coach Carter","Meet the Fockers","Lemony Snicket","Ocean’s Twelve",
           "Shall We Dance","Racing Stripes","Alexander","Fat Albert","Electra",
           "White Noise"), ordered=TRUE)

# Graphics for movies: jittered dotplot of critic ratings (Figure 1)
library(lattice)
dotplot(ordered.movies ~ jitter(critics.ratings.gpa),
        ylab="Movie", xlab="Rating", aspect=0.8, col=1)
# Regression (Figure 2 and Table 3)
# Reshape the critic ratings into a 14 x 9 matrix: one row per movie,
# one column per critic (the data are ordered movie by movie)
critics.ratings.gpa.mat <- matrix(critics.ratings.gpa, ncol=9, byrow=TRUE)

# Scatterplot of each critic's ratings against the average user
# ratings, annotated with the correlation (Figure 2)
par(mfrow=c(3,3), mar=c(4,5,3,1))
for (i in 1:9) {
  plot(critics.ratings.gpa.mat[,i], users.ratings.gpa, pch=16,
       xlab="Critic", ylab="User", xlim=c(0,4), ylim=c(0,4), cex=1.5)
  text(0.5, 3.5, paste("r=",
    round(cor(critics.ratings.gpa.mat[,i], users.ratings.gpa), digits=2)),
    cex=1.5)
  title(as.character(d.movies.critics[i,2]))  # critic name, from rows 1-9
}

# Multiple regression of user rating on all critics' ratings, with a
# chi-squared test of the overall drop in deviance
critics.users.reg <- glm(users.ratings.gpa ~ critics.ratings.gpa.mat)
summary(critics.users.reg)
1 - pchisq(critics.users.reg$null.deviance - critics.users.reg$deviance, 9)

# Separate regression for each critic: deviance, intercept, slope and
# the p-value for the slope (Table 3)
x <- NULL
for (i in 1:9) {
  y <- summary(glm(users.ratings.gpa ~ critics.ratings.gpa.mat[,i]))
  x <- rbind(x, c(y$deviance, y$coefficients[c(1,2,8)]))
}