STAT 503 Assignment 1: Movie Ratings

SOLUTION NOTES

These are my suggestions on how to analyze this data and organize the results. I have given more questions below than I address in my analysis, to show the range of questions that groups addressed in their reports.

1 Data Description

With the upcoming Academy Awards and the just-finished Golden Globes we collected data on 14 recent movies from http://movies.yahoo.com/. The movies are Electra, The Aviator, Ocean's Twelve, Coach Carter, Meet the Fockers, Racing Stripes, In Good Company, White Noise, Lemony Snicket, Fat Albert, The Incredibles, Alexander, Shall We Dance, Closer, Sideways. The movies were selected mostly from the top 10 list on January 14, 2005. Several movies were also suggested by class members. We also collected ratings from 9 critics (Atlanta Journal, Boston Globe, Chicago Sun-Times, Chicago Tribune, E! Online, filmcritic, Hollywood Reporter, New York Times, Seattle Post), and the average Yahoo user rating. The critics listed are the ones that rated all 14 movies.

The main question we want to answer is: "Which movies are consistently rated highly by both critics and users?" Other questions of interest might be:

• Are some critics more generous in their ratings than others?
• Can the critics' ratings accurately predict the average user rating?
• Does the number of users rating the movie accurately predict the average user rating?
• Is there a relationship between the weekly gross of a movie and the average user rating or the average critic rating?

2 Suggested approaches

Data restructuring: make new ratings variables by converting the letter grades to GPAs (a minimal sketch of this recoding follows this list).
  Reason: the letter grade is an ordinal variable, so it needs to be treated numerically for various calculations.

Summary statistics: tabulate the average critic rating and average user rating for each movie, and the average rating given by each critic; calculate confidence intervals for the critics' ratings.
  Reason: to compare the numbers for each movie, and for each critic.
  Questions addressed: "What is the average critic rating for each movie?" "Does the average user rating compare similarly to the average critic rating?"

Dotplots: of the critics' ratings for each movie.
  Reason: examine the distribution of the ratings for each movie.
  Questions addressed: "Are some movies consistently rated higher by the critics?" "Do some movies split the critics, and draw dramatically different ratings from different critics?"

Pairwise scatterplots: of the average critic rating against the average user rating, and the user rating against the number of users.
  Reason: explore the relationship between these pairs of variables.
  Questions addressed: "What is the relationship between number of users and average user rating?"

Regression: of the average critic rating against the average user rating, and the user rating against the number of users.
  Reason: to build a numerical model for the relationship between the pairs of variables.
  Questions addressed: "Can the critics' ratings accurately predict the average user rating?" "Does the number of users rating the movie accurately predict the average user rating?"
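As a concrete illustration of the data-restructuring step, here is a minimal sketch of the letter-grade recoding using a hypothetical input vector; the full script, which reads the grades from the csv files, appears in the Appendix.

# Recode letter grades as GPA-style numbers (full version in the Appendix).
# 'grades' is a hypothetical example vector of letter grades.
grade.levels <- c("F","D-","D","D+","C-","C","C+","B-","B","B+","A-","A","A+")
grades <- c("B+","A-","C")
grades.ord <- factor(grades, levels=grade.levels, ordered=TRUE)
grades.gpa <- as.numeric(grades.ord)/3   # maps F,...,A+ to 1/3,...,13/3 (B=3, A=4)
grades.gpa                               # 3.33 3.67 2.00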
3 Actual Results

3.1 Summary Statistics

Table 1 contains a summary of the data for each movie. The users' average grade and the critics' average grade are listed, along with the standard deviation and a Bonferroni-adjusted confidence interval for the critics' ratings of each movie. The top-rated movie is The Incredibles. The lowest-rated movie is White Noise. Coach Carter is rated comparatively lower by critics than by users. The other movies rated better by users than by critics are Lemony Snicket, Shall We Dance, Fat Albert, Electra and White Noise. Alexander is rated low by the critics, C, but even lower by the users, C-. Closer is another movie rated lower by users, B-, than by critics, B. Most of the differences between the critics' average ratings of the movies are not statistically significant. The one exception is that Electra is rated significantly lower than The Incredibles.

Table 1: Summary of movie ratings, ordered by the critics' average rating. The top-rated movie is The Incredibles; the lowest-rated movie is White Noise. Coach Carter is rated comparatively lower by critics than by users; Alexander is rated low by the critics but even lower by the users. Bonferroni-adjusted confidence intervals for the average critic rating of each movie are provided to assess the significance of the differences. Most of the differences are not significant; the one exception is that Electra is rated significantly lower than The Incredibles.

Movie              No. users  User Av  User Grade  Critic Av  Critic Grade  Critic SD  Lower CI  Upper CI
The Incredibles        31517     3.67     A-          3.48        B+           0.69       2.56      4.40
The Aviator             5214     3.00     B           3.19        B+           0.67       2.30      4.08
In Good Company         1761     3.00     B           2.96        B            0.68       2.06      3.86
Closer                  5625     2.67     B-          2.93        B            0.91       1.72      4.14
Coach Carter            7320     3.67     A-          2.89        B            0.24       2.58      3.20
Meet the Fockers       22667     3.00     B           2.89        B            0.24       2.58      3.20
Lemony Snicket         16816     3.00     B           2.78        B-           0.67       1.89      3.67
Ocean's Twelve         14027     2.67     B-          2.56        B-           0.44       1.97      3.14
Shall We Dance          6933     2.67     B-          2.44        C+           0.90       1.25      3.64
Racing Stripes          1320     2.33     C+          2.33        C+           0.37       1.84      2.83
Alexander              12340     1.67     C-          2.00        C            0.85       0.87      3.13
Fat Albert              7740     2.67     B-          1.93        C            0.57       1.17      2.69
Electra                 4191     2.67     B-          1.74        C-           0.47       1.12      2.36
White Noise             7197     2.33     C+          1.70        C-           0.81       0.63      2.78

Table 2 contains a summary of the data for each critic. Both Chicago newspapers rate movies higher on average, and the Hollywood Reporter and The New York Times rate movies the lowest on average. Bonferroni-adjusted confidence intervals are given to assess the significance of the differences in average ratings. None of the differences between the critics' averages are statistically significant.

Table 2: Summary of critics' ratings. Both Chicago newspapers rate movies higher on average, and the Hollywood Reporter and The New York Times rate movies the lowest on average. Bonferroni-adjusted confidence intervals are given to assess the significance of the differences in average ratings; none of the differences are statistically significant.

Critic               Average    SD   Lower CI  Upper CI
Chicago Tribune         2.98   0.89     2.20      3.76
Chicago Sun-Times       2.88   0.66     2.30      3.46
E! Online               2.71   0.94     1.89      3.54
Boston Globe            2.69   0.70     2.08      3.30
Seattle Post            2.60   0.66     2.02      3.17
Atlanta Journal         2.48   0.60     1.96      3.00
filmcritic              2.33   0.73     1.70      2.97
New York Times          2.19   0.87     1.43      2.96
Hollywood Reporter      2.17   0.97     1.32      3.01
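The intervals in Tables 1 and 2 use a Bonferroni correction: the 5% error rate is split across the 14 movies (or the 9 critics) being compared, which is what the qt(0.998, 8) and qt(0.997, 13) calls in the Appendix do. A minimal sketch of the calculation for one movie, using hypothetical critic GPAs:

# Bonferroni-adjusted 95% confidence interval for one movie's mean critic rating.
# x holds the nine critic GPAs for that movie (hypothetical values).
x <- c(3.67, 4.00, 3.33, 3.67, 2.00, 3.33, 3.67, 4.00, 2.67)
alpha <- 0.05/14                                   # error rate split over 14 movies
half <- qt(1 - alpha/2, df=length(x)-1) * sd(x)/sqrt(length(x))
c(mean(x) - half, mean(x) + half)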
3.2 Dotplots

Figure 1 shows the distribution of the critics' ratings for each movie. Although The Incredibles was rated very highly on average (as reported in Table 1), we can see from these plots that two critics gave it much lower ratings: C from The Atlanta Journal, and B- from the New York Times. Shall We Dance was rated reasonably well, B, but it was given an F by The Hollywood Reporter and a C- by E! Online. The ratings for Alexander are uniformly spread, with one critic, the Chicago Tribune, rating it as high as A-. Coach Carter and Meet the Fockers were consistently rated near B.

Figure 1: Ratings given by 9 critics to 14 movies.

3.3 Scatterplots

Figure 2 shows the scatterplots of the users' ratings against each critic's ratings. Some critics' ratings match the user ratings reasonably well: E! Online, filmcritic, New York Times. Some critics' ratings don't match the user ratings at all: Atlanta Journal and Chicago Tribune.

Figure 2: Scatterplots of users' ratings against critics' ratings, to compare their similarity.

3.4 Regression

We first fitted a multiple linear model of the average user rating against all the critics' ratings. This model did not explain a significant amount of the variation in the users' ratings, and no coefficients were significant. So we fitted a separate simple regression model for each critic (a sketch of these fits follows Table 3). Table 3 summarizes the results. The ratings of E! Online and filmcritic most closely match the users' ratings. The New York Times, Chicago Sun-Times and Seattle Post do a quite reasonable job of predicting the average user rating. The worst performers are the Chicago Tribune and the Atlanta Journal.

Table 3: Summary of the individual regression models for each critic, ordered by residual deviance. The ratings of E! Online and filmcritic most closely match the users' ratings. The New York Times, Chicago Sun-Times and Seattle Post do a quite reasonable job of predicting the average user rating. The worst performers are the Chicago Tribune and the Atlanta Journal.

Critic               Deviance  Intercept  Slope  p-value for slope
E! Online                1.37       1.63   0.43              0.001
filmcritic               1.52       1.54   0.53              0.002
New York Times           2.18       2.00   0.36              0.021
Chicago Sun-Times        2.53       1.62   0.41              0.057
Seattle Post             2.58       1.75   0.40              0.065
Hollywood Reporter       2.70       2.24   0.25              0.090
Boston Globe             2.99       2.05   0.27              0.191
Chicago Tribune          3.19       2.29   0.17              0.323
Atlanta Journal          3.21       2.20   0.24              0.343
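A minimal sketch of how the per-critic models in Table 3 can be fitted. Here user.gpa (the vector of 14 average user ratings) and critic.mat (the 14 x 9 matrix of critic GPAs) are hypothetical names of my own; the Appendix builds the equivalent objects from the raw files.

# One simple regression of the average user rating on each critic's rating,
# collecting the residual deviance, intercept, slope and slope p-value.
reg.table <- NULL
for (i in 1:9) {
  s <- summary(glm(user.gpa ~ critic.mat[, i]))
  reg.table <- rbind(reg.table,
                     c(s$deviance, s$coefficients[1, 1],
                       s$coefficients[2, 1], s$coefficients[2, 4]))
}
colnames(reg.table) <- c("Deviance", "Intercept", "Slope", "p-value for slope")
round(reg.table[order(reg.table[, 1]), ], 3)   # order by deviance, as in Table 3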
4 Conclusions

• Only "The Incredibles" is rated highly, A-, by both users and critics. The Aviator and In Good Company are rated well, at a B, by both users and critics.
• The critics' ratings for each movie can vary a lot. Although The Incredibles is rated very highly on average, two critics gave it low scores: C from The Atlanta Journal, and B- from the New York Times. Shall We Dance was rated reasonably well, B, by most critics, but it received an F from The Hollywood Reporter and a C- from E! Online.
• There is some disagreement between critics and users. Coach Carter is rated poorly by critics but highly by users. Electra was rated poorly by critics, C-, but reasonably by the users, B-. Alexander is a movie that users rated worse than critics, and both rated it poorly.
• There is some bias in the critics' ratings. The two Chicago newspapers rate movies more highly on average than the other critics. The Hollywood Reporter and The New York Times rate movies lower on average.
• The critics whose ratings best predict the users' ratings are E! Online, filmcritic and the New York Times. The ratings given by the Atlanta Journal and the Chicago Tribune both differ substantially from the users' ratings.

5 References

R Development Core Team (2003) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-00-3, http://www.R-project.org.

Yahoo (2005) Yahoo! Movies Website. http://movies.yahoo.com. Accessed Jan 14, 2005.

Appendix

# Read in the data and check it
d.movies.critics <- read.csv("movies-critics.csv")
table(d.movies.critics[,1], d.movies.critics[,3])
d.movies.users <- read.csv("movies-users.csv")
table(d.movies.users[,1], d.movies.users[,3])

# Automatically create numerical (GPA-style) values for the letter ratings
critics.ratings <- factor(as.character(d.movies.critics[,3]),
  levels=c(" F"," D-"," D"," D+"," C-"," C"," C+"," B-"," B"," B+"," A-",
    " A"," A+"), ordered=TRUE)
critics.ratings.gpa <- as.numeric(critics.ratings)/3
users.ratings <- factor(as.character(d.movies.users[,3]),
  levels=c(" F"," D-"," D"," D+"," C-"," C"," C+"," B-"," B"," B+"," A-",
    " A"," A+"), ordered=TRUE)
users.ratings.gpa <- as.numeric(users.ratings)/3

# Summarize the ratings by movie; the critical value for the confidence
# intervals is Bonferroni-adjusted: 0.998 = 1 - 0.05/(2*14)
options(digits=3, width=100)
movies.summary <- NULL
for (i in 1:14) {
  mn <- mean(critics.ratings.gpa[d.movies.critics[,1]==
    levels(d.movies.critics[,1])[i]])
  sd1 <- sd(critics.ratings.gpa[d.movies.critics[,1]==
    levels(d.movies.critics[,1])[i]])
  movies.summary <- rbind(movies.summary,
    cbind(levels(d.movies.critics[,1])[i],
      round(mn, digits=3), round(sd1, digits=3),
      round(mn-sd1*qt(0.998,8)/sqrt(9), digits=3),
      round(mn+sd1*qt(0.998,8)/sqrt(9), digits=3)))
}
dimnames(movies.summary) <- list(NULL, c("Movie","Critics Av","Critics SD",
  "Critics Lower CI","Critics Upper CI"))
movies.summary[order(movies.summary[,2], decreasing=TRUE),]

# Summarize the ratings by critic; 0.997 = 1 - 0.05/(2*9)
critics.summary <- NULL
for (i in 1:9) {
  mn <- mean(critics.ratings.gpa[d.movies.critics[,2]==
    levels(d.movies.critics[,2])[i]])
  sd1 <- sd(critics.ratings.gpa[d.movies.critics[,2]==
    levels(d.movies.critics[,2])[i]])
  critics.summary <- rbind(critics.summary,
    cbind(levels(d.movies.critics[,2])[i],
      round(mn, digits=3), round(sd1, digits=3),
      round(mn-sd1*qt(0.997,13)/sqrt(14), digits=3),
      round(mn+sd1*qt(0.997,13)/sqrt(14), digits=3)))
}
dimnames(critics.summary) <- list(NULL,
  c("Critic","Av","SD","Lower CI","Upper CI"))
critics.summary[order(critics.summary[,2], decreasing=TRUE),]

# Change the levels of the movie names from alphabetical to the order of the mean
ordered.movies <- factor(as.character(d.movies.critics[,1]),
  levels=c("The Incredibles","The Aviator","In Good Company","Closer",
    "Coach Carter","Meet the Fockers","Lemony Snicket","Ocean's Twelve",
    "Shall We Dance","Racing Stripes","Alexander","Fat Albert","Electra",
    "White Noise"), ordered=TRUE)

# Graphics for movies (Figure 1)
library(lattice)
dotplot(ordered.movies~jitter(critics.ratings.gpa),
  ylab="Movie", xlab="Rating", aspect=0.8, col=1)

# Scatterplots of user against critic ratings (Figure 2), then regression
critics.ratings.gpa.mat <- matrix(critics.ratings.gpa, ncol=9, byrow=TRUE)
par(mfrow=c(3,3), mar=c(4,5,3,1))
for (i in 1:9) {
  plot(critics.ratings.gpa.mat[,i], users.ratings.gpa, pch=16,
    xlab="Critic", ylab="User", xlim=c(0,4), ylim=c(0,4), cex=1.5)
  text(0.5, 3.5, paste("r=",
    round(cor(critics.ratings.gpa.mat[,i], users.ratings.gpa), digits=2)),
    cex=1.5)
  title(as.character(d.movies.critics[i,2]))
}
critics.users.reg <- glm(users.ratings.gpa~critics.ratings.gpa.mat)
summary(critics.users.reg)
1-pchisq(critics.users.reg$null.deviance-critics.users.reg$deviance, 9)

# Separate regression for each critic (Table 3)
x <- NULL
for (i in 1:9) {
  y <- summary(glm(users.ratings.gpa~critics.ratings.gpa.mat[,i]))
  x <- rbind(x, c(y$deviance, y$coefficients[c(1,2,8)]))
}
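The matrix x built at the end of the script holds the quantities reported in Table 3, but is neither labelled nor ordered. A short follow-on (not part of the original script) that does this; it assumes, as the plotting loop above already does, that the critics appear in the same order within each movie block of the file.

# Label the per-critic regression summaries and order them by deviance (Table 3)
dimnames(x) <- list(as.character(d.movies.critics[1:9, 2]),
  c("Deviance", "Intercept", "Slope", "p-value for slope"))
round(x[order(x[,1]),], 3)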