Movies

Josh Finkelstein, John Hottinger, Jenny Yaillen, Xiang Huang, Edward Han, Rory MacDonald, Tyronne Martin

Intro
• For an economist, econometrics (the statistical analysis of economic data) is extremely important for understanding economic phenomena. By analyzing past economic data, economists hope to discover trends and tendencies that can be used to predict future economic events with greater accuracy.
• Cinema has strongly influenced society since its inception, not only socially but economically. It is not uncommon for a modern movie to cost tens of millions of dollars to produce and to gross many times that amount. With figures this large, it is clear that movies make a discernible impact on the economy. A clearer understanding of the factors that influence the gross profit generated by movies is vital to predicting the impact they will have on the economy.

What are we studying?
• For this study, fifty highly popular movies made within the past sixty years were selected for analysis. The aspects studied were each movie's rating, budget, domestic gross, length, viewer score, critic score and profit. Regressions were then run on these variables to determine the important relationships among them and their relation to the revenue generated by the movies.

Variables
– Rating (R, PG-13, PG, G)
– Production Budget
– Gross (domestic only)
– Length
– Viewer Rating (Rotten Tomatoes)
– Critic Rating (Rotten Tomatoes)
– Profit (Gross - Budget)
Sources: www.the-numbers.com, www.rottentomatoes.com

Why are we studying it?
• Study past economic data to predict future events
• Better understand the factors that influence the economic impact of movies
• Understanding past movies allows us to predict the gross and budget of future movies

On average, what type of movie (i.e., R, PG-13, PG) has the highest gross and budget (in millions)?
[Figure: bar chart "Average Gross & Budget by Rating"; axis: Budget & Gross (in millions); series: Budget, Gross. Data labels: R 32.343125, 51.41; PG-13 114.375, 138.62; PG 70.125, 209.18]

Does there seem to be a trend in what type of movies are being produced?

Year  Rating  Movie
1939  G       Gone with the Wind
1975  PG      Jaws
1978  PG      Grease
1978  R       Halloween
1981  PG      Raiders of the Lost Ark
1984  R       Terminator
1986  R       Aliens
1989  PG-13   Indiana Jones and the Last Crusade
1990  PG-13   Ghost
1993  R       Schindler's List
1994  PG-13   Forrest Gump
1994  R       Pulp Fiction
1994  R       True Lies
1995  R       Braveheart
1995  PG-13   The American President
1996  R       Executive Decision
1996  PG-13   Independence Day
1996  PG-13   Multiplicity
1996  R       Scream
1997  PG-13   As Good As It Gets
1997  R       Chasing Amy
1997  PG      Contact
1997  PG-13   Dante's Peak
1997  R       Good Will Hunting
1997  R       I Know What You Did Last Summer
1997  PG-13   Men in Black
1997  PG-13   Speed 2: Cruise Control
1997  PG-13   The Fifth Element
1997  R       The Game
1997  PG-13   Titanic
1997  PG-13   Volcano
1998  PG-13   Armageddon
1998  PG-13   Deep Impact
1998  R       Hard Rain
1998  R       Saving Private Ryan
1998  PG-13   The Man in the Iron Mask
1999  PG      Star Wars Ep. I: The Phantom Menace
2000  PG      How the Grinch Stole Christmas
2001  PG      Harry Potter and the Sorcerer's Stone
2002  PG-13   Spider-Man
2003  PG-13   The Lord of the Rings: The Return of the King
2004  PG      Shrek 2
2005  PG-13   Star Wars Ep. III: Revenge of the Sith
2006  PG-13   Pirates of the Caribbean: Dead Man's Chest
2007  PG-13   Spider-Man 3
2008  PG-13   The Dark Knight
2009  PG-13   Avatar
2009  R       Paranormal Activity
2010  G       Toy Story 3
2010  PG-13   Robin Hood

Does Critic Rating Affect Gross?

[Figure: plot of GROSS against CRITICRATING]
[Figure: "Histogram of Critic Ratings for Movies"; axes: Critic Rating, Number of Movies, Total Gross for all Movies ($Million); series: Movies (#), Total Gross, Avg Gross]
Dependent Variable: LNGROSS (log of gross)
Method: Least Squares
Date: 11/27/10  Time: 15:02
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
CRITICRATING    0.010266      0.004614     2.224831      0.0308
C               4.248005      0.339487     12.51302      0.0000

R-squared            0.093482    Mean dependent var      4.946064
Adjusted R-squared   0.074596    S.D. dependent var      0.952939
S.E. of regression   0.916708    Akaike info criterion   2.703121
Sum squared resid    40.33693    Schwarz criterion       2.779602
Log likelihood       -65.57803   F-statistic             4.949874
Durbin-Watson stat   1.133514    Prob(F-statistic)       0.030828

Is There a Relationship Between Viewer Rating and Gross?

Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/27/10  Time: 15:08
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
VIEWERRATING    0.017814      0.007589     2.347496      0.0231
C               3.632812      0.574099     6.327852      0.0000

R-squared            0.102984    Mean dependent var      4.946064
Adjusted R-squared   0.084296    S.D. dependent var      0.952939
S.E. of regression   0.911891    Akaike info criterion   2.692585
Sum squared resid    39.91414    Schwarz criterion       2.769066
Log likelihood       -65.31462   F-statistic             5.510737
Durbin-Watson stat   1.067283    Prob(F-statistic)       0.023071

Is there a relationship between the length of a movie and its gross?

Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/27/10  Time: 15:06
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
LENGTH          0.011416      0.004302     2.653518      0.0108
C               3.432083      0.584553     5.871292      0.0000

R-squared            0.127925    Mean dependent var      4.946064
Adjusted R-squared   0.109757    S.D. dependent var      0.952939
S.E. of regression   0.899124    Akaike info criterion   2.664386
Sum squared resid    38.80433    Schwarz criterion       2.740867
Log likelihood       -64.60965   F-statistic             7.041160
Durbin-Watson stat   1.113324    Prob(F-statistic)       0.010770

Is there a relationship between critic and viewer ratings?
Dependent Variable: CRITICRATING
Method: Least Squares
Date: 11/24/10  Time: 00:05
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
VIEWERRATING    1.209273      0.162735     7.430955      0.0000
C               -21.14761     12.31142     -1.717723     0.0923

R-squared            0.534970    Mean dependent var      68.00000
Adjusted R-squared   0.525282    S.D. dependent var      28.38223
S.E. of regression   19.55530    Akaike info criterion   8.823548
Sum squared resid    18355.67    Schwarz criterion       8.900029
Log likelihood       -218.5887   F-statistic             55.21909
Durbin-Watson stat   2.155193    Prob(F-statistic)       0.000000

Is there a relationship between profit and viewer rating?

Dependent Variable: PROFIT
Method: Least Squares
Date: 11/24/10  Time: 00:01
Sample: 1 50
Included observations: 50
Estimated equation: PROFIT = 3.00441646*VIEWERRATING - 96.84122145

Variable        Coefficient   Std. Error   t-Statistic   Prob.
VIEWERRATING    3.004416      1.051507     2.857247      0.0063
C               -96.84122     79.55012     -1.217361     0.2294

R-squared            0.145358    Mean dependent var      124.6444
Adjusted R-squared   0.127553    S.D. dependent var      135.2781
S.E. of regression   126.3564    Akaike info criterion   12.55527
Sum squared resid    766364.5    Schwarz criterion       12.63175
Log likelihood       -311.8817   F-statistic             8.163863
Durbin-Watson stat   1.200674    Prob(F-statistic)       0.006300

Is there a relationship between the profit of a movie (gross - budget) and the rating received from critics?

Dependent Variable: PROFIT
Method: Least Squares
Date: 11/23/10  Time: 23:59
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
CRITICRATING    2.232723      0.607806     3.673416      0.0006
C               -27.18079     44.71995     -0.607800     0.5462

R-squared            0.219436    Mean dependent var      124.6444
Adjusted R-squared   0.203174    S.D. dependent var      135.2781
S.E. of regression   120.7561    Akaike info criterion   12.46460
Sum squared resid    699938.2    Schwarz criterion       12.54108
Log likelihood       -309.6150   F-statistic             13.49399
Durbin-Watson stat   1.219835    Prob(F-statistic)       0.000602

The same regression with the constant dropped:

Dependent Variable: PROFIT
Method: Least Squares
Date: 11/27/10  Time: 15:18
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
CRITICRATING    1.891296      0.230608     8.201334      0.0000

R-squared            0.213428    Mean dependent var      124.6444
Adjusted R-squared   0.213428    S.D. dependent var      135.2781
S.E. of regression   119.9766    Akaike info criterion   12.43227
Sum squared resid    705325.2    Schwarz criterion       12.47051
Log likelihood       -309.8067   Durbin-Watson stat      1.188816

Is there a relationship between how much a movie makes (profit = gross - budget) and its length?

Dependent Variable: PROFIT
Method: Least Squares
Date: 11/24/10  Time: 00:13
Sample: 1 50
Included observations: 50
Estimated equation: PROFIT = 1.299759485*LENGTH - 47.7297429

Variable        Coefficient   Std. Error   t-Statistic   Prob.
LENGTH          1.299759      0.626510     2.074603      0.0434
C               -47.72974     85.12612     -0.560694     0.5776

R-squared            0.082288    Mean dependent var      124.6444
Adjusted R-squared   0.063169    S.D. dependent var      135.2781
S.E. of regression   130.9357    Akaike info criterion   12.62647
Sum squared resid    822920.0    Schwarz criterion       12.70295
Log likelihood       -313.6617   F-statistic             4.303979
Durbin-Watson stat   1.239945    Prob(F-statistic)       0.043408

The same regression with the constant dropped:

Dependent Variable: PROFIT
Method: Least Squares
Date: 11/27/10  Time: 15:21
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
LENGTH          0.956890      0.135325     7.071047      0.0000

R-squared            0.076277    Mean dependent var      124.6444
Adjusted R-squared   0.076277    S.D. dependent var      135.2781
S.E. of regression   130.0165    Akaike info criterion   12.59300
Sum squared resid    828309.8    Schwarz criterion       12.63124
Log likelihood       -313.8249   Durbin-Watson stat      1.205902

Making the LNGROSS regression more significant

Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/27/10  Time: 15:09
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
CRITICRATING    0.006765      0.006616     1.022541      0.3119
VIEWERRATING    0.002967      0.011707     0.253405      0.8011
LENGTH          0.009267      0.004711     1.967141      0.0552
C               3.038402      0.678472     4.478300      0.0000

R-squared            0.182595    Mean dependent var      4.946064
Adjusted R-squared   0.129286    S.D. dependent var      0.952939
S.E. of regression   0.889207    Akaike info criterion   2.679645
Sum squared resid    36.37171    Schwarz criterion       2.832607
Log likelihood       -62.99113   F-statistic             3.425221
Durbin-Watson stat   1.110069    Prob(F-statistic)       0.024724

Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/30/10  Time: 11:46
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
CRITICRATING    0.007972      0.004547     1.753158      0.0861
LENGTH          0.009715      0.004322     2.247498      0.0293
C               3.115627      0.600113     5.191738      0.0000

R-squared            0.181454    Mean dependent var      4.946064
Adjusted R-squared   0.146622    S.D. dependent var      0.952939
S.E. of regression   0.880310    Akaike info criterion   2.641040
Sum squared resid    36.42248    Schwarz criterion       2.755762
Log likelihood       -63.02601   F-statistic             5.209447
Durbin-Watson stat   1.123581    Prob(F-statistic)       0.009047

Final LNGROSS regression

Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/30/10  Time: 11:39
Sample: 1 50
Included observations: 50

Variable        Coefficient   Std. Error   t-Statistic   Prob.
R               -1.192232     0.218347     -5.460266     0.0000
LENGTH          0.006888      0.003442     2.000871      0.0513
CRITICRATING    0.013170      0.003704     3.555065      0.0009
C               3.518552      0.478231     7.357425      0.0000

R-squared            0.503352    Mean dependent var      4.946064
Adjusted R-squared   0.470962    S.D. dependent var      0.952939
S.E. of regression   0.693120    Akaike info criterion   2.181392
Sum squared resid    22.09913    Schwarz criterion       2.334354
Log likelihood       -50.53480   F-statistic             15.54032
Durbin-Watson stat   1.552562    Prob(F-statistic)       0.000000

[Figure: "Average Gross & Budget by Rating", shown again before the conclusion]

Conclusion
• The lower (less restrictive) the rating, the higher the gross; R-rated movies gross the least
• Higher critic rating = higher gross and profit
• Higher viewer rating = higher gross and profit
• Critic rating and viewer rating are correlated
• By combining some of the most significant relationships we found, we were able to create our final, statistically significant LNGROSS regression
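
Reading the final LNGROSS regression: because the dependent variable is the log of gross, each coefficient is roughly a proportional effect rather than a dollar effect. The sketch below is only an illustration built from the coefficients and standard errors reported in the final regression output. It rechecks the t-statistics (coefficient divided by standard error) and converts each coefficient b into a percent change, 100*(exp(b) - 1). Two things are assumptions and not stated in the slides: that R is a 0/1 dummy for R-rated movies, and that exponentiating the fitted log is an acceptable point prediction. The function names are hypothetical helpers, not part of any package.

```python
import math

# Coefficients and standard errors copied from the final LNGROSS regression.
# Each entry maps a variable name to (coefficient, standard error).
FINAL_MODEL = {
    "R": (-1.192232, 0.218347),
    "LENGTH": (0.006888, 0.003442),
    "CRITICRATING": (0.013170, 0.003704),
    "C": (3.518552, 0.478231),
}


def t_statistic(name):
    """Return coefficient / standard error for one variable."""
    coef, std_err = FINAL_MODEL[name]
    return coef / std_err


def percent_effect(name):
    """Percent change in gross for a one-unit increase in the regressor.

    In a log-linear model this is 100 * (exp(b) - 1).
    """
    coef, _ = FINAL_MODEL[name]
    return 100.0 * (math.exp(coef) - 1.0)


def predicted_gross(is_r_rated, length, critic_rating):
    """Fitted gross implied by the model (hypothetical helper).

    Assumes R is a 0/1 dummy for R-rated movies. Taking exp() of the
    fitted log ignores retransformation bias, so treat the result as a
    rough point prediction only.
    """
    ln_gross = (
        FINAL_MODEL["C"][0]
        + FINAL_MODEL["R"][0] * (1 if is_r_rated else 0)
        + FINAL_MODEL["LENGTH"][0] * length
        + FINAL_MODEL["CRITICRATING"][0] * critic_rating
    )
    return math.exp(ln_gross)


if __name__ == "__main__":
    for name in ("R", "LENGTH", "CRITICRATING"):
        print(f"{name}: t = {t_statistic(name):.3f}, "
              f"effect = {percent_effect(name):.2f}% per unit")
```

Under these assumptions, the R coefficient implies that an R-rated movie grosses roughly 70% less than an otherwise similar movie with another rating, and each additional critic-rating point goes with roughly 1.3% more gross.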