The Internet Movie Database (IMDB) user ratings of the movie Top Gun: Maverick can be analyzed in R with the following command:

TGM.ratings <- c(rep(x=1, times=3266), rep(x=2, times=968), rep(x=3, times=1225), rep(x=4, times=1840), rep(x=5, times=4243), rep(x=6, times=12048), rep(x=7, times=39036), rep(x=8, times=89864), rep(x=9, times=109439), rep(x=10, times=124807))

Counts are valid as of October 1, 2022.

Note: If you want, you may pick a different movie that you like and use its IMDB user ratings on this problem, so long as at least 1,000 people have submitted ratings. If you do this, name the movie and provide the code you used to input the ratings into R.

a) Make a histogram of the ratings. Start the breaks at 0.5, end the breaks at 10.5, and make each rectangle width 1. (This centres the rectangles at the integers 1 through 10.)
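
One way to set up the breaks described in part (a) is sketched below. It rebuilds the ratings vector in a more compact form (a `counts` vector passed to `rep`, which is an equivalent construction and not the command given in the problem), and the title and axis label are placeholder choices.

```r
# Rating counts for 1 through 10, copied from the problem statement.
counts <- c(3266, 968, 1225, 1840, 4243, 12048, 39036, 89864, 109439, 124807)

# Equivalent to the c(rep(...), ...) command: repeat each rating by its count.
TGM.ratings <- rep(1:10, times = counts)

# Breaks run from 0.5 to 10.5 in steps of 1, so each bar is centred on an integer.
h <- hist(TGM.ratings,
          breaks = seq(0.5, 10.5, by = 1),
          main = "IMDB user ratings: Top Gun: Maverick",  # placeholder title
          xlab = "Rating")

# The bar heights should reproduce the counts that were entered.
print(h$counts)
```

Checking `h$counts` against the original counts is a quick way to catch a mistyped value before moving on.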

b) Is the histogram approximately normal? Does it matter, regarding the Central Limit Theorem? (That is, do the sample data need to be approximately normal in order to use z-tools? Consider problems 1-4 of this assignment.)

c) Set up a z-interval of the average user rating. Is this valid statistical inference? If it is, how is it interpreted, and what population is it about? If it is not valid inference, why not?
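
The arithmetic of a z-interval can be sketched as follows. Two things here are assumptions rather than part of the problem: the 95% confidence level (the problem does not state one) and the use of the sample standard deviation in place of a known population standard deviation. The sketch only computes the interval; whether it counts as valid inference is the question the part asks.

```r
# Rating counts for 1 through 10, copied from the problem statement.
counts <- c(3266, 968, 1225, 1840, 4243, 12048, 39036, 89864, 109439, 124807)
TGM.ratings <- rep(1:10, times = counts)

n    <- length(TGM.ratings)   # number of submitted ratings
xbar <- mean(TGM.ratings)     # sample mean
s    <- sd(TGM.ratings)       # sample standard deviation (stands in for sigma)

conf.level <- 0.95            # ASSUMPTION: level not specified in the problem
z.star <- qnorm(1 - (1 - conf.level) / 2)   # critical value for that level
se     <- s / sqrt(n)                       # standard error of the mean

# Interval: xbar plus or minus z.star standard errors.
z.interval <- c(lower = xbar - z.star * se, upper = xbar + z.star * se)
print(z.interval)
```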

d) Set up a two-tail z-test of whether the average user rating is significantly different (either way) from 5. Give the z-statistic and the P-value.
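
A minimal sketch of the two-tail calculation is shown below, again assuming the sample standard deviation is used in place of a known population value. The names `mu0`, `z.stat`, and `p.value` are illustrative choices, not names from the course script.

```r
# Rating counts for 1 through 10, copied from the problem statement.
counts <- c(3266, 968, 1225, 1840, 4243, 12048, 39036, 89864, 109439, 124807)
TGM.ratings <- rep(1:10, times = counts)

mu0 <- 5                                   # null-hypothesis mean from part (d)
n   <- length(TGM.ratings)
se  <- sd(TGM.ratings) / sqrt(n)           # standard error (sample sd for sigma)

z.stat  <- (mean(TGM.ratings) - mu0) / se  # z-statistic
p.value <- 2 * pnorm(-abs(z.stat))         # area in both outer tails

print(z.stat)
print(p.value)
```

Using `-abs(z.stat)` inside `pnorm` gives the lower-tail area regardless of the sign of the statistic, and doubling it accounts for both tails.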

e) Provide a plot (made in R) of the normal curve with the outer two tails shaded that illustrates this P-value. (This would be hard to do “from scratch.” Instead, modify the shade.norm function provided. See also the shade.t.outer function provided in my R script in eLearning.)
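
For orientation, the general idea behind shading two outer tails with base R graphics is sketched below. This is not the function provided with the course; `shade.outer.sketch` is a hypothetical name, and the value z = 2 in the demonstration call is an arbitrary illustration, not the statistic from part (d). When |z| is large, the shaded regions fall outside the default plotting range, so `xlim` may need to be widened.

```r
# Hypothetical helper: draw the standard normal curve and shade both tails
# beyond -|z| and +|z|. Returns the two-tail area invisibly.
shade.outer.sketch <- function(z, xlim = c(-4, 4)) {
  z <- abs(z)
  x <- seq(xlim[1], xlim[2], length.out = 501)
  plot(x, dnorm(x), type = "l", xlab = "z", ylab = "Density",
       main = "Standard normal curve, outer tails shaded")

  # Only shade if the cutoffs lie inside the plotted range.
  if (z < xlim[2] && -z > xlim[1]) {
    # Left tail: follow the curve up to -z, then close along the axis.
    xl <- seq(xlim[1], -z, length.out = 101)
    polygon(c(xl, -z, xlim[1]), c(dnorm(xl), 0, 0), col = "gray")

    # Right tail: follow the curve from z, then close along the axis.
    xr <- seq(z, xlim[2], length.out = 101)
    polygon(c(xr, xlim[2], z), c(dnorm(xr), 0, 0), col = "gray")
  }

  invisible(2 * pnorm(-z))
}

# Demonstration with an arbitrary illustrative value of z.
p.demo <- shade.outer.sketch(z = 2)
print(p.demo)
```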

f) Is this a valid statistical test? If it is, how is it interpreted? If it is not, why not?