Instructions for Conducting One-Way ANOVA in SPSS
One-way ANOVA is used to examine mean differences between two or more groups. It is
a bivariate test with one IV and one DV. The IV must be categorical and the DV must be
continuous. In this demonstration we describe how to conduct one-way ANOVA as well as planned
and post hoc comparisons. The instructions also include a check for whether the homogeneity of
variance assumption has been met. The DV is a measure of base year standardized reading scores
(BY2XRSTD); the IV is an indicator of the geographic region of each school (G8REGON). The
IV has four values—northeast (coded as 1), north central (coded as 2), south (coded as 3), and
west (coded as 4). This analysis is based on 300 randomly selected cases from the NELS
database with no missing observations. Although not used in the actual analysis, the data set also
includes one weight variable (F2PNWLWT).
Copies of the data set and output are available on the companion website. The data set
file is entitled “ONE WAY ANOVA.sav”. The output file is entitled “One Way ANOVA
results.spv”.
The following instructions are divided into three sets of steps:
1. Conduct an exploratory analysis to a) examine descriptive statistics, b) check for
outliers, c) check that the normality assumption is met, and d) verify that there are
mean differences between groups to justify ANOVA.
2. Conduct the actual one-way ANOVA to determine whether group means are different
from one another (warranting planned or post hoc comparison tests, as described in
step 3). Also, check that the homogeneity of variance assumption is met.
3. Conduct planned or post hoc comparisons if warranted. For illustration purposes, we
provide instructions for conducting both planned and post hoc comparisons.
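For readers who want to see what the omnibus test in step 2 computes, the following is a minimal Python sketch (not SPSS syntax, and not part of the original instructions) of the one-way ANOVA F ratio: the between-groups mean square divided by the within-groups mean square. The function name and the small data set are hypothetical, chosen only for illustration. No p-value is computed because the Python standard library has no F distribution; SPSS reports it for you.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.fmean(all_values)
    k = len(groups)        # number of groups (levels of the IV)
    n = len(all_values)    # total number of cases

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.fmean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum(
        (x - statistics.fmean(group)) ** 2
        for group in groups
        for x in group
    )

    df_between = k - 1
    df_within = n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three groups (not NELS data).
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df1, df2 = one_way_anova_f(example_groups)
print(f"F({df1}, {df2}) = {f_ratio:.2f}")
```

A large F ratio means the group means vary more than would be expected from the spread of scores within groups, which is what justifies moving on to the comparisons in step 3.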
NOTE: There is no way to use SPSS to check that the independence assumption has been met. The
independence assumption means that the errors associated with each observation are independent
of one another. For this particular example, it is the assumption that if two students are in the
same region/group, the extent to which the score of one student deviates from the group mean is
not correlated with the extent to which the scores of other students deviate from the group mean.
The independence assumption is met when case selection is random. Thus it is based on the
sampling procedures used to create the sample. Because the NELS study incorporated random
sampling, we assume that the independence assumption has been met. If the independence
assumption is violated, then one-way ANOVA should not be used.
To get started, open the SPSS data file entitled, ONE WAY ANOVA.sav.
STEP 1: Exploratory Analyses
First we calculate descriptive statistics. From the Analyze menu, select Compare Means.
Click on Means. Highlight BY2XRSTD (Reading Standardized Score) and move it to the
Dependent List box. Highlight G8REGON (Composite Geographic Region of School) and move
it to the Independent List box. Click on Options. In the Statistics box, highlight
Variance, Skewness, and Std. Error of Skewness. Click the arrow button to move them to the Cell
Statistics box. By default Mean, Number of Cases, and Standard Deviation are already in the
Cell Statistics box. If they are not in the Cell Statistics box, move them over now. Click
Continue. Click OK.
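The statistics requested in the Means dialog can also be reproduced by hand. The sketch below, written in Python rather than SPSS, computes them for each group. It assumes the sample (n - 1) formulas for variance and standard deviation and the adjusted Fisher-Pearson skewness coefficient with its usual standard error, which is our understanding of what SPSS reports. The function name and the scores are made up for illustration and are not NELS values.

```python
import math
import statistics


def describe(values):
    """Cell statistics for one group; requires at least 3 non-identical scores."""
    n = len(values)
    mean = statistics.fmean(values)
    variance = statistics.variance(values)   # sample variance (n - 1)
    sd = math.sqrt(variance)

    # Adjusted Fisher-Pearson skewness coefficient.
    sum_cubed_dev = sum((x - mean) ** 3 for x in values)
    skewness = n * sum_cubed_dev / ((n - 1) * (n - 2) * sd ** 3)

    # Standard error of skewness depends only on the group size.
    se_skewness = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

    return {
        "mean": mean,
        "n": n,
        "sd": sd,
        "variance": variance,
        "skewness": skewness,
        "se_skewness": se_skewness,
    }


# Hypothetical reading scores keyed by region code (1 = northeast, ... 4 = west).
scores_by_region = {
    1: [55.0, 61.2, 48.9, 66.4, 58.1, 52.7],
    2: [49.5, 53.0, 47.2, 56.8, 50.1, 44.9],
    3: [50.3, 46.7, 54.2, 48.8, 52.0, 45.5],
    4: [51.1, 47.9, 55.6, 49.0, 43.8, 53.4],
}

for region, scores in scores_by_region.items():
    stats = describe(scores)
    print(region, {name: round(value, 3) for name, value in stats.items()})
```

Comparing the printed means across regions is the code equivalent of reading the Means report, and the skewness column can be checked against the -1 to 1 rule of thumb.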
In addition to the case processing summary, the output includes a report (see following
table) that provides descriptive statistics for each level of the IV. As the report
indicates, the means of standardized reading scores for the West, South, and North Central
regions are similar to one another while the mean for the Northeast region is somewhat higher.
This suggests that these groups may differ with regard to their average standardized reading
scores. These differences, especially between the northeast region and the other regions, warrant
an ANOVA.
Skewness statistics provide preliminary information about the existence of outliers.
Skewness values outside the range of -1 to 1 suggest that outliers may be present. The report shows that
skewness values for all regions fall within the -1 to 1 range, suggesting no outliers. We can examine
box plots to confirm this. Select Graphs from the menu, then Legacy Dialogs, then click
Boxplot. Click Simple. In the “Data in Chart Are” box, select Summaries for groups of cases.
Click Define. Highlight BY2XRSTD (Reading Standardized Score) and move it to the
Variable box. Highlight G8REGON (Composite Geographic Region of School) and move
it to the Category Axis box. Click OK to create a box plot graph.
The output includes the case processing summary and a graph of the box plots of reading
scores by region. The box plots indicate that none of the regions include outliers.
If outliers were present, their case numbers would be displayed beyond either end of the
respective box plot, as illustrated in the following example. In that example, cases 163 and 237 are outliers.
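The rule behind the box plot can be sketched in a few lines of Python. A case is treated as an outlier here when it lies more than 1.5 box lengths (interquartile ranges) below the first quartile or above the third quartile. This is an illustrative approximation: the function name and scores are hypothetical, and the quartile interpolation used by Python's `statistics.quantiles` may differ slightly from the hinges SPSS uses, so borderline cases could be classified differently.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Return (case_number, value) pairs lying beyond k box lengths of the box."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    box_length = q3 - q1                      # interquartile range
    lower_fence = q1 - k * box_length
    upper_fence = q3 + k * box_length
    # Case numbers start at 1, in the order the values were supplied.
    return [
        (case, x)
        for case, x in enumerate(values, start=1)
        if x < lower_fence or x > upper_fence
    ]


# Hypothetical scores for one region; the last case is deliberately extreme.
region_scores = [50, 52, 49, 51, 53, 48, 50, 95]
print(boxplot_outliers(region_scores))
```

Running the function once per region mirrors what the grouped box plot shows: an empty list for a region corresponds to a box plot with no flagged cases.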
Next, we check whether the normality assumption has been met. The normality
assumption means that the residuals are normally distributed, roughly following the
shape of the normal curve. We check this assumption by examining histograms of each group. At