In fact, adding a second variable that is correlated with the first one distorts the values of the regression coefficients. However, we can often predict Y well, even when there is multicollinearity. Let’s consider an example where multicollinearity exists to see how it affects regression. For the past 12 months, the manager of the Pizza Shack restaurant has been putting a series of advertisements in the local newspaper. Ads are scheduled and paid for the month before they appear.
Each ad contains a coupon that lets the customer buy two pizzas and pay only for the more expensive one. The manager collected the data in Table 13-4 and would like to use it to predict pizza sales.
Figures 13-6 and 13-7 give the Minitab results for the regressions of total sales on the number of ads and on the cost of ads, respectively. For the regression on the number of ads, the observed t-value is 3.95. With 10 degrees of freedom and a significance level of 0.01, the critical t-value (from Appendix Table 2) is 3.169. Since t_o > t_c (or, equivalently, since p is less than 0.01), we conclude that the number of ads is a highly significant explanatory variable for total sales. Note also that r² = 61.0%, so the number of ads explains approximately 61% of the variation in pizza sales.
For the regression on the cost of ads, the observed t-value is 4.54, so the cost of ads is an even more significant explanatory variable for total sales than the number of ads (for which the observed t-value was only 3.95). In this regression, r² = 67.3%, so the cost of ads explains about 67% of the variation in pizza sales.
Since both explanatory variables are highly significant on their own, we try using both in a multiple regression. The result is presented in Figure 13-8. The multiple regression is highly significant as a whole, since the ANOVA p-value is 0.006.
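The t-values and r² figures quoted for the two simple regressions can be cross-checked against each other. In a regression with a single explanatory variable, the slope's t-statistic and r² are linked by t² = r²(n − 2)/(1 − r²). The Python sketch below is only an illustration and is not part of the Minitab output: the function name t_from_r2 is hypothetical, n = 12 is taken from the 12 months of data (hence 10 degrees of freedom), and the r² inputs are the rounded values quoted in the text.

```python
import math


def t_from_r2(r2: float, n: int) -> float:
    """Magnitude of the slope t-statistic in a one-predictor regression,
    recovered from r^2 and the sample size n via
    t^2 = r^2 * (n - 2) / (1 - r^2)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if n <= 2:
        raise ValueError("n must exceed 2")
    return math.sqrt(r2 * (n - 2) / (1.0 - r2))


N_MONTHS = 12        # 12 months of data, so n - 2 = 10 degrees of freedom
T_CRITICAL = 3.169   # critical t quoted in the text (10 df, 0.01 level)

# r^2 values as quoted in the text (rounded, so results are approximate)
R2_QUOTED = {"ADS": 0.610, "COST": 0.673}

for name, r2 in R2_QUOTED.items():
    t = t_from_r2(r2, N_MONTHS)
    print(f"{name}: t = {t:.2f}, exceeds critical value: {t > T_CRITICAL}")
```

Up to rounding in the quoted r² values, this reproduces the observed t-values of 3.95 and 4.54, both above the critical value of 3.169. The identity holds only for regressions with one explanatory variable, so it does not carry over to the multiple regression in Figure 13-8.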
Regression Analysis
The regression equation is
SALES = 16.9 + 2.08 ADS
Predictors: Constant, ADS
Regression Analysis
The regression equation is
SALES = 4.17 + 2.87 COST
Predictors: Constant, COST
s = 3.849
FIGURE 13-8 Minitab Regression for Sales on the Number and Cost of Ads
The coefficient of multiple determination is R² = 68.