Movies
Josh Finkelstein, John Hottinger, Jenny Yaillen, Xiang Huang, Edward Han, Rory MacDonald, Tyronne Martin
Feb 25, 2016
Intro
• For an economist, the study of econometrics, the statistical analysis of economic data, is extremely important in understanding economic phenomena. By analyzing sets of past economic data, economists hope to discover trends and tendencies that can be used to predict future economic events with greater accuracy. Cinema has strongly influenced society since its inception, not only socially but also economically. It is not uncommon for modern movies to be produced at a cost of tens of millions of dollars and to gross many times that. With figures this large, it is clear that movies make a discernible impact on the economy. A clearer understanding of which factors influence the gross and profit generated by movies is vital to predicting the impact they will have on the economy.
What are we studying?
• For this study, fifty highly popular movies created within the past sixty years were selected for analysis. The aspects studied were the rating, budget, domestic gross, length, viewer score, critic score, and profit. Regressions were then run on these variables to determine important relationships among them and their relation to the revenue generated by the movies.
Variables
– Rating (R, PG-13, PG, G)
– Production Budget
– Gross (domestic only)
– Length
– Viewer Rating (Rotten Tomatoes)
– Critic Rating (Rotten Tomatoes)
– Profit (Gross - Budget)
www.the-numbers.com
www.rottentomatoes.com
Why are we studying it?
• Study past economic data to predict future events
• Better understand factors that influence economic impact of movies
• Understanding past movies allows us to predict the gross and budget of future movies
On average, which type of movie (i.e., R, PG-13, PG) has the highest gross and budget (in millions)?
[Figure: Average Gross & Budget by Rating. Horizontal bar chart with categories PG, PG-13, R and series Budget and Gross; value axis: Budget & Gross (in millions), 0 to 250. Data labels as extracted: 209.18, 138.62, 51.41 and 70.125, 114.375, 32.343125.]
Does there seem to be a trend in what type of movies are being produced?
Title | Year | Rating
Gone with the Wind | 1939 | G
Jaws | 1975 | PG
Grease | 1978 | PG
Halloween | 1978 | R
Raiders of the Lost Ark | 1981 | PG
Terminator | 1984 | R
Aliens | 1986 | R
Indiana Jones and the Last Crusade | 1989 | PG-13
Ghost | 1990 | PG-13
Schindler's List | 1993 | R
Forrest Gump | 1994 | PG-13
Pulp Fiction | 1994 | R
True Lies | 1994 | R
Braveheart | 1995 | R
The American President | 1995 | PG-13
Executive Decision | 1996 | R
Independence Day | 1996 | PG-13
Multiplicity | 1996 | PG-13
Scream | 1996 | R
As Good As It Gets | 1997 | PG-13
Chasing Amy | 1997 | R
Contact | 1997 | PG
Dante's Peak | 1997 | PG-13
Good Will Hunting | 1997 | R
I Know What You Did Last Summer | 1997 | R
Men in Black | 1997 | PG-13
Speed 2: Cruise Control | 1997 | PG-13
The Fifth Element | 1997 | PG-13
The Game | 1997 | R
Titanic | 1997 | PG-13
Volcano | 1997 | PG-13
Armageddon | 1998 | PG-13
Deep Impact | 1998 | PG-13
Hard Rain | 1998 | R
Saving Private Ryan | 1998 | R
The Man in the Iron Mask | 1998 | PG-13
Star Wars Ep. I: The Phantom Menace | 1999 | PG
How the Grinch Stole Christmas | 2000 | PG
Harry Potter and the Sorcerer's Stone | 2001 | PG
Spider-Man | 2002 | PG-13
The Lord of the Rings: The Return of the King | 2003 | PG-13
Shrek 2 | 2004 | PG
Star Wars Ep. III: Revenge of the Sith | 2005 | PG-13
Pirates of the Caribbean: Dead Man's Chest | 2006 | PG-13
Spider-Man 3 | 2007 | PG-13
The Dark Knight | 2008 | PG-13
Avatar | 2009 | PG-13
Paranormal Activity | 2009 | R
Toy Story 3 | 2010 | G
Robin Hood | 2010 | PG-13
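To look at the trend question more concretely, the movie list above can be tallied by rating and by period. This is only a rough sketch: the (year, rating) pairs are transcribed from the list, while the split at the year 2000 and the helper name `r_share` are our own assumptions for illustration, not part of the original study.

```python
from collections import Counter

# (year, rating) pairs transcribed from the movie list above, in the same order.
MOVIES = [
    (1939, "G"), (1975, "PG"), (1978, "PG"), (1978, "R"), (1981, "PG"),
    (1984, "R"), (1986, "R"), (1989, "PG-13"), (1990, "PG-13"), (1993, "R"),
    (1994, "PG-13"), (1994, "R"), (1994, "R"), (1995, "R"), (1995, "PG-13"),
    (1996, "R"), (1996, "PG-13"), (1996, "PG-13"), (1996, "R"), (1997, "PG-13"),
    (1997, "R"), (1997, "PG"), (1997, "PG-13"), (1997, "R"), (1997, "R"),
    (1997, "PG-13"), (1997, "PG-13"), (1997, "PG-13"), (1997, "R"), (1997, "PG-13"),
    (1997, "PG-13"), (1998, "PG-13"), (1998, "PG-13"), (1998, "R"), (1998, "R"),
    (1998, "PG-13"), (1999, "PG"), (2000, "PG"), (2001, "PG"), (2002, "PG-13"),
    (2003, "PG-13"), (2004, "PG"), (2005, "PG-13"), (2006, "PG-13"), (2007, "PG-13"),
    (2008, "PG-13"), (2009, "PG-13"), (2009, "R"), (2010, "G"), (2010, "PG-13"),
]

# Overall number of movies per rating.
rating_counts = Counter(rating for _, rating in MOVIES)


def r_share(movies):
    """Fraction of the given movies that are R-rated (hypothetical helper)."""
    return sum(1 for _, rating in movies if rating == "R") / len(movies)


# Assumed cut-off year, chosen only to illustrate an early/late comparison.
before_2000 = [m for m in MOVIES if m[0] < 2000]
from_2000 = [m for m in MOVIES if m[0] >= 2000]

print(dict(rating_counts))
print(f"R share before 2000: {r_share(before_2000):.2f}")
print(f"R share from 2000 on: {r_share(from_2000):.2f}")
```

In this sample the R-rated share is much lower among the later movies, although the sample was not drawn at random, so this says little about the industry as a whole.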
[Figure: scatter plot of GROSS (y-axis, 0 to 800) against CRITICRATING (x-axis, 0 to 100).]
[Figure: Histogram of Critic Ratings for Movies. x-axis: Critic Rating (0 to 100); left axis: Number of Movies (0 to 16); right axis: Total Gross for all Movies ($Million, -100 to 3400); series: Movies (#), Total Gross, Avg Gross.]
Does Critic Rating Affect Gross?
Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/27/10 Time: 15:02
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
CRITICRATING 0.010266 0.004614 2.224831 0.0308
C 4.248005 0.339487 12.51302 0.0000
R-squared 0.093482 Mean dependent var 4.946064
Adjusted R-squared 0.074596 S.D. dependent var 0.952939
S.E. of regression 0.916708 Akaike info criterion 2.703121
Sum squared resid 40.33693 Schwarz criterion 2.779602
Log likelihood -65.57803 F-statistic 4.949874
Durbin-Watson stat 1.133514 Prob(F-statistic) 0.030828
Note: the dependent variable is the log of gross (LNGROSS).
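One way to read the CRITICRATING coefficient: because the dependent variable is the log of gross, each additional critic-rating point is associated with roughly a 100*(exp(b) - 1) percent change in gross. A minimal sketch using the coefficient and standard error reported in the output above, assuming LNGROSS is the natural log; the helper name is ours and not part of the original analysis.

```python
import math

# Values copied from the regression output above.
coef = 0.010266      # CRITICRATING coefficient
std_err = 0.004614   # its standard error


def pct_change_per_point(b):
    """Percent change in gross per one-unit change in the regressor (hypothetical helper)."""
    return (math.exp(b) - 1.0) * 100.0


pct = pct_change_per_point(coef)

# The t-statistic is the coefficient divided by its standard error; with the
# rounded inputs it should land close to the reported 2.224831.
t_stat = coef / std_err

print(f"about {pct:.2f}% higher gross per critic-rating point")
print(f"t-statistic recomputed from rounded values: {t_stat:.3f}")
```

The effect is statistically significant at the 5% level (Prob. 0.0308), but the R-squared of about 0.09 means critic rating alone explains little of the variation in gross.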
Is There a Relationship Between Viewer Rating and Gross?
Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/27/10 Time: 15:08
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
VIEWERRATING 0.017814 0.007589 2.347496 0.0231
C 3.632812 0.574099 6.327852 0.0000
R-squared 0.102984 Mean dependent var 4.946064
Adjusted R-squared 0.084296 S.D. dependent var 0.952939
S.E. of regression 0.911891 Akaike info criterion 2.692585
Sum squared resid 39.91414 Schwarz criterion 2.769066
Log likelihood -65.31462 F-statistic 5.510737
Durbin-Watson stat 1.067283 Prob(F-statistic) 0.023071
Is there a relationship between the length of a movie and its gross?
Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/27/10 Time: 15:06
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
LENGTH 0.011416 0.004302 2.653518 0.0108
C 3.432083 0.584553 5.871292 0.0000
R-squared 0.127925 Mean dependent var 4.946064
Adjusted R-squared 0.109757 S.D. dependent var 0.952939
S.E. of regression 0.899124 Akaike info criterion 2.664386
Sum squared resid 38.80433 Schwarz criterion 2.740867
Log likelihood -64.60965 F-statistic 7.041160
Durbin-Watson stat 1.113324 Prob(F-statistic) 0.010770
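A quick consistency check on the three single-regressor LNGROSS outputs above: in a regression with one explanatory variable plus a constant, the F-statistic equals the square of the slope's t-statistic. The sketch below simply recomputes that from the reported numbers; the dictionary and function names are ours.

```python
# (t-statistic, F-statistic) pairs copied from the three LNGROSS outputs above.
reported = {
    "CRITICRATING": (2.224831, 4.949874),
    "VIEWERRATING": (2.347496, 5.510737),
    "LENGTH": (2.653518, 7.041160),
}


def f_from_t(t_stat):
    """F-statistic implied by the slope t-statistic in a one-regressor model."""
    return t_stat * t_stat


# Largest absolute gap between the implied and the reported F-statistic.
max_gap = max(abs(f_from_t(t) - f) for t, f in reported.values())

for name, (t, f) in reported.items():
    print(f"{name}: t^2 = {f_from_t(t):.6f}, reported F = {f:.6f}")
print(f"largest gap: {max_gap:.1e}")
```

This is why Prob(F-statistic) matches the slope's Prob. value in each of these three tables.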
Is there a relationship between critic and viewer ratings?
Dependent Variable: CRITICRATING
Method: Least Squares
Date: 11/24/10 Time: 00:05
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
VIEWERRATING 1.209273 0.162735 7.430955 0.0000
C -21.14761 12.31142 -1.717723 0.0923
R-squared 0.534970 Mean dependent var 68.00000
Adjusted R-squared 0.525282 S.D. dependent var 28.38223
S.E. of regression 19.55530 Akaike info criterion 8.823548
Sum squared resid 18355.67 Schwarz criterion 8.900029
Log likelihood -218.5887 F-statistic 55.21909
Durbin-Watson stat 2.155193 Prob(F-statistic) 0.000000
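For a regression with a single explanatory variable and a constant, the correlation coefficient between the two variables is the square root of R-squared, carrying the sign of the slope. A small sketch using the numbers reported above; the variable names are ours.

```python
import math

# Values copied from the CRITICRATING on VIEWERRATING output above.
r_squared = 0.534970
slope = 1.209273

# Correlation between critic and viewer ratings, signed like the slope.
correlation = math.copysign(math.sqrt(r_squared), slope)

print(f"correlation between critic and viewer ratings: {correlation:.3f}")
```

A correlation of roughly 0.73 suggests the two ratings carry overlapping information, which is worth keeping in mind when both are used as regressors in the same equation.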
Is there a relationship between profit and viewer rating?
Dependent Variable: PROFIT
Method: Least Squares
Date: 11/24/10 Time: 00:01
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
VIEWERRATING 3.004416 1.051507 2.857247 0.0063
C -96.84122 79.55012 -1.217361 0.2294
R-squared 0.145358 Mean dependent var 124.6444
Adjusted R-squared 0.127553 S.D. dependent var 135.2781
S.E. of regression 126.3564 Akaike info criterion 12.55527
Sum squared resid 766364.5 Schwarz criterion 12.63175
Log likelihood -311.8817 F-statistic 8.163863
Durbin-Watson stat 1.200674 Prob(F-statistic) 0.006300
PROFIT = 3.00441646*VIEWERRATING - 96.84122145
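The fitted equation above can be used directly for prediction. The sketch below evaluates it for a hypothetical viewer rating of 80 and finds the rating at which predicted profit is zero; the function name and the example rating are our own, and profit is assumed to be in millions of dollars, as in the chart earlier in the deck.

```python
# Coefficients copied from the fitted equation above.
SLOPE = 3.00441646
INTERCEPT = -96.84122145


def predict_profit(viewer_rating):
    """Predicted profit for a given viewer rating (hypothetical helper)."""
    return SLOPE * viewer_rating + INTERCEPT


# Hypothetical movie with a viewer rating of 80.
example_profit = predict_profit(80)

# Viewer rating at which the fitted line crosses zero profit.
break_even_rating = -INTERCEPT / SLOPE

print(f"predicted profit at rating 80: {example_profit:.1f}")
print(f"break-even viewer rating: {break_even_rating:.1f}")
```

With an R-squared of about 0.15 and a standard error of regression near 126, any single prediction from this line is very imprecise.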
Is there a relationship between the profit of a movie (gross - budget) and the rating received from critics?
Dependent Variable: PROFIT
Method: Least Squares
Date: 11/23/10 Time: 23:59
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
CRITICRATING 2.232723 0.607806 3.673416 0.0006
C -27.18079 44.71995 -0.607800 0.5462
R-squared 0.219436 Mean dependent var 124.6444
Adjusted R-squared 0.203174 S.D. dependent var 135.2781
S.E. of regression 120.7561 Akaike info criterion 12.46460
Sum squared resid 699938.2 Schwarz criterion 12.54108
Log likelihood -309.6150 F-statistic 13.49399
Durbin-Watson stat 1.219835 Prob(F-statistic) 0.000602
Dependent Variable: PROFIT
Method: Least Squares
Date: 11/27/10 Time: 15:18
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
CRITICRATING 1.891296 0.230608 8.201334 0.0000
R-squared 0.213428 Mean dependent var 124.6444
Adjusted R-squared 0.213428 S.D. dependent var 135.2781
S.E. of regression 119.9766 Akaike info criterion 12.43227
Sum squared resid 705325.2 Schwarz criterion 12.47051
Log likelihood -309.8067 Durbin-Watson stat 1.188816
Note: constant dropped.
Is there a relationship between how much a movie makes (profit = gross - budget) and its length?
Dependent Variable: PROFIT
Method: Least Squares
Date: 11/24/10 Time: 00:13
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
LENGTH 1.299759 0.626510 2.074603 0.0434
C -47.72974 85.12612 -0.560694 0.5776
R-squared 0.082288 Mean dependent var 124.6444
Adjusted R-squared 0.063169 S.D. dependent var 135.2781
S.E. of regression 130.9357 Akaike info criterion 12.62647
Sum squared resid 822920.0 Schwarz criterion 12.70295
Log likelihood -313.6617 F-statistic 4.303979
Durbin-Watson stat 1.239945 Prob(F-statistic) 0.043408
PROFIT = 1.299759485*LENGTH - 47.7297429
Dependent Variable: PROFIT
Method: Least Squares
Date: 11/27/10 Time: 15:21
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
LENGTH 0.956890 0.135325 7.071047 0.0000
R-squared 0.076277 Mean dependent var 124.6444
Adjusted R-squared 0.076277 S.D. dependent var 135.2781
S.E. of regression 130.0165 Akaike info criterion 12.59300
Sum squared resid 828309.8 Schwarz criterion 12.63124
Log likelihood -313.8249 Durbin-Watson stat 1.205902
Note: constant dropped.
Making the lngross regression more significant…
Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/27/10 Time: 15:09
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
CRITICRATING 0.006765 0.006616 1.022541 0.3119
VIEWERRATING 0.002967 0.011707 0.253405 0.8011
LENGTH 0.009267 0.004711 1.967141 0.0552
C 3.038402 0.678472 4.478300 0.0000
R-squared 0.182595 Mean dependent var 4.946064
Adjusted R-squared 0.129286 S.D. dependent var 0.952939
S.E. of regression 0.889207 Akaike info criterion 2.679645
Sum squared resid 36.37171 Schwarz criterion 2.832607
Log likelihood -62.99113 F-statistic 3.425221
Durbin-Watson stat 1.110069 Prob(F-statistic) 0.024724
Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/30/10 Time: 11:46
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
CRITICRATING 0.007972 0.004547 1.753158 0.0861
LENGTH 0.009715 0.004322 2.247498 0.0293
C 3.115627 0.600113 5.191738 0.0000
R-squared 0.181454 Mean dependent var 4.946064
Adjusted R-squared 0.146622 S.D. dependent var 0.952939
S.E. of regression 0.880310 Akaike info criterion 2.641040
Sum squared resid 36.42248 Schwarz criterion 2.755762
Log likelihood -63.02601 F-statistic 5.209447
Durbin-Watson stat 1.123581 Prob(F-statistic) 0.009047
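The two LNGROSS outputs above can be compared through adjusted R-squared, which penalizes extra regressors. The sketch below recomputes it from the reported R-squared values using the standard formula 1 - (1 - R2)(n - 1)/(n - k), where k counts every estimated coefficient including the constant; the function name is ours.

```python
def adjusted_r2(r2, n_obs, n_coefs):
    """Adjusted R-squared; n_coefs counts all estimated coefficients, constant included."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_coefs)


N_OBS = 50  # "Included observations" in both outputs

# CRITICRATING, VIEWERRATING, LENGTH and a constant: 4 coefficients.
with_viewer = adjusted_r2(0.182595, N_OBS, 4)

# CRITICRATING, LENGTH and a constant: 3 coefficients.
without_viewer = adjusted_r2(0.181454, N_OBS, 3)

print(f"with VIEWERRATING:    {with_viewer:.6f}")
print(f"without VIEWERRATING: {without_viewer:.6f}")
```

Dropping VIEWERRATING barely lowers R-squared but raises adjusted R-squared, consistent with its high Prob. value (0.8011) in the three-regressor equation.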
Dependent Variable: LNGROSS
Method: Least Squares
Date: 11/30/10 Time: 11:39
Sample: 1 50
Included observations: 50
Variable Coefficient Std. Error t-Statistic Prob.
R -1.192232 0.218347 -5.460266 0.0000
LENGTH 0.006888 0.003442 2.000871 0.0513
CRITICRATING 0.013170 0.003704 3.555065 0.0009
C 3.518552 0.478231 7.357425 0.0000
R-squared 0.503352 Mean dependent var 4.946064
Adjusted R-squared 0.470962 S.D. dependent var 0.952939
S.E. of regression 0.693120 Akaike info criterion 2.181392
Sum squared resid 22.09913 Schwarz criterion 2.334354
Log likelihood -50.53480 F-statistic 15.54032
Durbin-Watson stat 1.552562 Prob(F-statistic) 0.000000
Final lngross regression
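A sketch of how the final equation can be read and used, assuming LNGROSS is the natural log of gross in millions of dollars, R is a 0/1 dummy for R-rated movies, and LENGTH is in minutes. The example movie (120 minutes, critic rating 80) and the function names are hypothetical. Exponentiating a predicted log is only a rough point estimate and ignores the retransformation adjustment.

```python
import math

# Coefficients copied from the final LNGROSS output above.
COEFS = {"C": 3.518552, "R": -1.192232, "LENGTH": 0.006888, "CRITICRATING": 0.013170}


def predict_ln_gross(is_r_rated, length, critic_rating):
    """Predicted LNGROSS from the final equation (hypothetical helper)."""
    return (COEFS["C"]
            + COEFS["R"] * (1 if is_r_rated else 0)
            + COEFS["LENGTH"] * length
            + COEFS["CRITICRATING"] * critic_rating)


# Percent difference in gross for an R-rated movie, other regressors held fixed.
r_effect_pct = (math.exp(COEFS["R"]) - 1.0) * 100.0

# Hypothetical movie: 120 minutes long, critic rating of 80.
ln_gross_not_r = predict_ln_gross(False, 120, 80)
gross_not_r = math.exp(ln_gross_not_r)
gross_r = math.exp(predict_ln_gross(True, 120, 80))

print(f"R-rated effect on gross: {r_effect_pct:.1f}%")
print(f"predicted gross, not R-rated: {gross_not_r:.1f}")
print(f"predicted gross, R-rated:     {gross_r:.1f}")
```

Under these assumptions, the R dummy implies that an R-rated movie grosses roughly 70% less than an otherwise similar movie with another rating.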
[Figure, repeated from earlier in the deck: Average Gross & Budget by Rating (categories PG, PG-13, R; series Budget and Gross, in millions).]
Conclusion
• The lower (less restrictive) the rating, the higher the gross; R-rated movies gross the least
• Higher critic rating = higher gross and profit
• Higher viewer rating = higher gross and profit
• Critic rating and viewer rating are positively correlated
• By combining the most significant relationships we found, we were able to build our final lngross regression, which is jointly significant