ANOVA and Least Squares Analysis

Correlation between annual income and amount spent on car

From the given data on annual income level and amount spent on car, a positive correlation between the two variables is expected. In other words, higher annual income is accompanied by a higher amount spent on car. For instance, when income increases from 38,000 to 40,000, the amount spent on car increases from 12,000 to 16,000. Likewise, the amount spent on car is 41,000 when income is 117,000, compared to 21,000 when income is 79,000. Not a single datum indicates a negative relationship. The correlation is also expected to be strong, given the large increase in amount spent on car that accompanies even a slight increase in annual income. For instance, while annual income increases by 2,000 (from 38,000 to 40,000), amount spent on car increases by 4,000 (from 12,000 to 16,000). The strength of a correlation, however, depends on how closely the points cluster around a line rather than on how steep the line is, so this expectation has to be checked against the correlation coefficient.

To test the relationship between annual income level and amount spent on car, regression is more appropriate than Analysis of Variance (ANOVA). According to Field (2009), regression analysis is an important test for “predicting an outcome variable from one predictor variable (simple regression) or several predictor variables (multiple regression)” (p. 198). Regression fits a model to the data and uses it to predict the outcome, with the linear model being the most common choice. A linear model summarizes the data set with a straight line, and the method of least squares identifies the line that best describes the data by minimizing the sum of squared differences between the observed and predicted values. With the above data on income level and expenditure on car, a regression analysis allows the amount spent on car (the dependent variable, DV) to be predicted from annual income level (the predictor variable). The fitted linear model gives the regression coefficients, namely the gradient of the line and the Y intercept. It also gives the error for each observation, which is the difference between the actual measurement and the predicted value. The sign of the gradient, together with the direction of the plotted line, shows whether the relationship is positive (positive gradient) or negative (negative gradient). The relationship can then be written as an equation in which a value of the predictor variable is used to calculate the predicted outcome. These characteristics make regression analysis the most appropriate method for testing the relationship between annual income level and amount spent on car.
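The least squares calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the reported output: the function names `least_squares` and `residuals` are hypothetical, and the data are only the four income and spending pairs quoted in the text, so the coefficients it produces should not be expected to match those from the full data set.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Return the errors: observed value minus predicted value."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Assumption: only the four pairs quoted in the text, not the full data set
incomes = [38_000, 40_000, 79_000, 117_000]
spending = [12_000, 16_000, 21_000, 41_000]

slope, intercept = least_squares(incomes, spending)
errors = residuals(incomes, spending, slope, intercept)

print(f"gradient = {slope:.4f}, intercept = {intercept:.2f}")
print("errors:", [round(e, 2) for e in errors])
```

A positive `slope` corresponds to a positive relationship, and the errors of a least squares line always sum to zero (up to rounding), which is a quick check that the fit was computed correctly.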

In contrast, ANOVA is useful when one wants to substantiate differences between groups on a given variable by comparing their means. In a one-way ANOVA, for instance, the objective is to find any statistically significant difference in means among three or more groups (Online Statistics, n.d.). For the above data, an ANOVA test would be inappropriate because the data are not divided into groups that could be tested for a statistically significant difference. Furthermore, the task in this case is to describe the relationship between the dependent variable (DV) and the independent variable (IV), not to identify differences between groups, which is the function of ANOVA. As such, regression analysis is best suited for testing the relationship between annual income level and amount spent on car.

On conducting a simple regression on the data (Waner & Costenoble, 1999), the following output was produced showing the equation of the line of best fit:

y = 0.329224x - 3752.76

r = 0.88895

The gradient of the line is 0.329 and the Y intercept is -3752.76, and the points in the scatter plot lie close to the line of best fit. The gradient implies that each additional unit of annual income is associated with an increase of about 0.33 units in amount spent on car, that is, roughly 329 more spent on car for every additional 1,000 of income. The intercept implies a predicted expenditure of -3752.76 at zero annual income. Since a negative expenditure is not meaningful, the intercept serves to position the line rather than to give a realistic prediction. The results also show a strong positive correlation (r = .89) between annual income level and amount spent on car. Overall, the results and the graph confirm a positive relationship between annual income and amount spent on car.
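To make the reported coefficients concrete, the fitted line can be evaluated directly. The sketch below uses only the gradient, intercept and r reported in the output above. The function name `predicted_spending` and the example incomes are illustrative assumptions, not part of the original analysis.

```python
# Coefficients reported in the regression output
SLOPE = 0.329224
INTERCEPT = -3752.76
R = 0.88895


def predicted_spending(income):
    """Predicted amount spent on car for a given annual income."""
    return SLOPE * income + INTERCEPT


# Change in predicted spending for an extra 1,000 of income
# (the incomes 40,000 and 41,000 are arbitrary illustrative values)
extra_per_thousand = predicted_spending(41_000) - predicted_spending(40_000)

# Proportion of variance in spending accounted for by income
r_squared = R ** 2

print(f"extra spending per 1,000 of income: {extra_per_thousand:.2f}")
print(f"prediction at zero income: {predicted_spending(0):.2f}")
print(f"r squared: {r_squared:.4f}")
```

Because the model is linear, the gradient describes a change in units rather than in percentages, and squaring r shows that income accounts for roughly 79 percent of the variation in amount spent on car.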

ANOVA Results

The results of an ANOVA statistical test performed at 00:08 on 4-FEB-2011

Source of Variation    Sum of Squares    d.f.    Mean Squares    F
between                7.3692E+09        2       3.6846E+09      6.614
error                  9.4705E+09        17      5.5709E+08
total                  1.6840E+10        19

The probability of this result, assuming the null hypothesis, is 0.008

The ANOVA test conducted to identify differences between income groups indicated a significant difference in amount spent on car, F(2, 17) = 6.614, p = .008. In a one-way ANOVA the between-groups degrees of freedom equal the number of groups minus one, so the value of 2 corresponds to three income groups rather than two halves of the data. In other words, the mean amount spent on car differs significantly across the groups of income level.
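The F ratio and probability in the ANOVA output can be checked from the sums of squares and degrees of freedom alone. The sketch below is a minimal verification using only the reported values, with illustrative variable names. It relies on the fact that, when the numerator degrees of freedom equal 2, the upper-tail probability of the F distribution has the closed form (1 + 2F/d)^(-d/2), where d is the error degrees of freedom. This shortcut does not apply to other numerator degrees of freedom.

```python
# Values taken from the ANOVA output
ss_between = 7.3692e9
ss_error = 9.4705e9
df_between = 2
df_error = 17

# Mean squares are sums of squares divided by their degrees of freedom
ms_between = ss_between / df_between
ms_error = ss_error / df_error

# F is the ratio of between-groups to error mean squares
f_stat = ms_between / ms_error

# Upper-tail probability of F(2, df_error); closed form valid only
# because the numerator degrees of freedom equal 2
p_value = (1 + df_between * f_stat / df_error) ** (-df_error / 2)

print(f"F({df_between}, {df_error}) = {f_stat:.3f}")
print(f"p = {p_value:.4f}")
```

The recomputed F ratio agrees with the reported 6.614, and the probability is consistent with the reported 0.008 once rounded to three decimal places.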