University of San Francisco

Experimental Data: t-test and ANOVA

In this section...

The Independent Samples t-test
Analysis of Variance (ANOVA)
One-Way ANOVA
Two-Way ANOVA


The Independent Samples t-test

To compare mean differences between two (and only two) groups that vary on some measure, you are likely to run an independent samples t-test. The command string for this test is Analyze --> Compare means --> Independent samples t-test. When you execute this command string for the first time you will see that under Compare Means there are other types of t-test you can run: the one-sample t-test and the paired-samples t-test. The former is used to determine if the mean on a dependent measure is significantly different from a constant predetermined by the researcher; the latter to test whether the mean difference between paired measures in a single group is significantly different from zero (e.g., pre- and post-test scores for each subject). The one-sample and paired-samples t-tests are less commonly used than the independent samples t-test, and for that reason are not addressed here.

Using the Independent Samples t-test

To use the independent samples t-test your data set must have at least two variables. SPSS labels these the "test variable" and the "grouping variable." The test variable is the outcome measure you are interested in studying, and the grouping variable divides the data into two mutually exclusive groups. (You could consider the test variable the dependent variable and the grouping variable the independent variable.) For example, in a study of income disparity you may wish to determine if there is a difference in salary between males and females, or between minority and non-minority employees. In this instance salary level would be the test (dependent) variable, and gender or ethnicity would be the grouping (independent) variable. The test variable must be interval in nature and normally distributed, and the grouping variable must have two and only two levels (or categories).

Analyze --> Compare means --> Independent samples t-test

From the complete list of variables displayed at the left of the dialog box, choose the variable you wish to study (the test variable) and move it to the "test variable" box. (Note: Although SPSS allows you to enter any number of test variables at once, your research question should determine which ones you examine.) Below the test variable box you'll see the "grouping variable" box. Choose the grouping (independent) variable from the list of variables and move it here. Remember, since you are comparing means between two and only two groups your grouping variable must be dichotomous; that is, it cannot have more than two categories.

You will have to tell SPSS which values identify the two groups in order to proceed. To do this, click on the "define groups" button. Under "Group 1" enter the value that codes the first group in your data set (for example, 1), and then tab and repeat the procedure for the second group (for example, 2). These values must match the codes already entered for the grouping variable; they have no numerical meaning of their own and serve only to "mark" or indicate group membership. Once you have entered the values, click Continue, and then click OK.
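To see what the group values do, it can help to picture the split outside SPSS. The Python sketch below is illustrative only: the records, the codes 1 and 2, and the function name are invented for this example. It separates a test variable into two groups using the codes stored in the grouping variable, and it refuses codes that match no cases.

```python
# Hypothetical records: (grouping code, salary in $1,000s). Invented data.
records = [(1, 52), (2, 44), (1, 48), (2, 50), (1, 60), (2, 41)]

def split_by_group(rows, group1_code, group2_code):
    """Split the test variable into two lists using the grouping codes."""
    group1 = [score for code, score in rows if code == group1_code]
    group2 = [score for code, score in rows if code == group2_code]
    if not group1 or not group2:
        raise ValueError("each code must match at least one case in the data")
    return group1, group2

group1, group2 = split_by_group(records, 1, 2)
print("Group 1:", group1)
print("Group 2:", group2)
```

Entering a code that does not occur in the data (say, 3) leaves one group empty, which is why the values you type must be the ones actually used to code the grouping variable.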

If you haven't been working with your data set until now, it is recommended you do so before proceeding.


Two tables are created and displayed in the Output Editor. The first, "Group Statistics," summarizes and describes the data on the test variable by group. The number of subjects in each group and the means, standard deviations, and standard errors on the test variable for each group are presented here. This table should give you a general feeling for the differences between the groups under study. Review it carefully.
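If you would like to check your understanding of the "Group Statistics" table, the figures in it can be reproduced by hand. The following is a minimal Python sketch, assuming made-up salary figures and a hypothetical helper named group_statistics; it is not how SPSS itself is implemented.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical salaries (in $1,000s), invented for illustration only.
salaries = {
    "male": [52, 48, 60, 55, 45],
    "female": [44, 50, 41, 47, 43],
}

def group_statistics(values):
    """Return N, mean, standard deviation, and standard error of the mean."""
    n = len(values)
    sd = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {"n": n, "mean": mean(values), "sd": sd, "se": sd / sqrt(n)}

for group, values in salaries.items():
    stats = group_statistics(values)
    print(f"{group}: N={stats['n']} mean={stats['mean']:.2f} "
          f"sd={stats['sd']:.2f} se={stats['se']:.2f}")
```

The standard error is simply the standard deviation divided by the square root of the group size, so larger groups yield smaller standard errors for the same spread.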

The second table, "Independent Samples Test," contains the information you need to answer the question, "Is there a statistically significant difference in X (the test variable) between these two groups?" Concentrate your attention on the column labeled "Sig. (2-tailed)." If the value in this column is less than or equal to your predetermined level of significance (generally .05 or .01), the result is statistically significant, and you can report that a difference exists on the test variable between the two groups. Conversely, if the value in the significance column is larger than your predetermined significance level (.05 or .01), there is no evidence to suggest that there is a (statistically significant) difference on the test variable between the groups. Any difference you observe between the group means may simply reflect chance (sampling fluctuation).

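Why are there two t values and two significance values in this table? Notice the header over the first columns of statistics: Levene's Test for Equality of Variances. This test determines whether the variance on the test variable is equal in both groups. Like normality, equality of variance is assumed in running an independent samples t-test, and Levene's test checks that assumption. If the significance value for Levene's test is greater than .05, the variances can be treated as equal, and you read the t value and significance level from the row labeled "Equal variances assumed." If it is .05 or less, the assumption is violated, and you read the row labeled "Equal variances not assumed" instead.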
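The two rows of the table correspond to two ways of computing t: one that pools the variances of the two groups (equal variances assumed) and one, usually credited to Welch, that does not. The Python sketch below shows both calculations under stated assumptions: the salary figures are invented, and the function names are hypothetical. It stops at t and the degrees of freedom; converting these to a significance level requires the t distribution, which SPSS handles for you.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical salaries (in $1,000s), invented for illustration only.
males = [52, 48, 60, 55, 45]
females = [44, 50, 41, 47, 43]

def pooled_t(a, b):
    """t and df when the two group variances are assumed equal."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

def welch_t(a, b):
    """t and approximate df when equal variances are not assumed."""
    v1, v2 = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(a) - 1) + v2 ** 2 / (len(b) - 1))
    return t, df

print("Equal variances assumed:     t = %.3f, df = %d" % pooled_t(males, females))
print("Equal variances not assumed: t = %.3f, df = %.2f" % welch_t(males, females))
```

With equal group sizes, as here, the two t values coincide and only the degrees of freedom differ; with unequal group sizes and unequal variances the two rows can diverge noticeably.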

The standard method of reporting the results of a t-test is to include the t value, the degrees of freedom (df), and the significance level, e.g., t(df) = 10.945, p = .003.

Run an independent samples t-test on your data. Did you uncover a difference between the groups?


Analysis of Variance (ANOVA)

ANOVA is a popular tool, widely used by researchers and students in many disciplines. This may be due to the versatility and wide applicability of the test. ANOVAs can be used in situations ranging from the very simple (involving only a single dependent variable and a single independent variable, a "one-way ANOVA") to the highly complex (involving multiple independent and dependent variables, "factorial ANOVA" or "MANOVA"). In order to use this test you need at least one quantitative, normally distributed, dependent variable and at least one independent variable with two or more levels.

In the fictitious income disparity study discussed above an independent samples t-test was used to answer the question, "Is there a difference in salary between males and females?" In this instance only two groups were being examined (male and female) and only two mean scores were available (mean salary for males, mean salary for females). In cases where three or more groups are under study, and three or more mean scores on a dependent variable are available, a more advanced statistical test is required: analysis of variance. This statistical test assesses the effect of one or more independent variables on one or more dependent variables. Within the context of our income disparity study, ANOVA would answer more complex questions, such as: "Is there a difference in salary between Caucasians, Asians, Hispanics, and African Americans?" Four groups (based on ethnicity) and four mean scores (average salary for each ethnic group) would be compared. Or, in a more extreme case, "Is there a difference in salary between Caucasians, Asians, Hispanics, and African Americans depending upon type of college or university attended (public 2-year, private 2-year, public 4-year, private 4-year)?" In this example mean salary scores for 16 groups would be compared (Caucasian/public 2-year, Caucasian/private 2-year, Caucasian/public 4-year, Caucasian/private 4-year; Asian/public 2-year, Asian/private 2-year, and so on). The diagram below may help you visualize this particular research question (one involving two independent variables each with more than two levels) and help you understand the power of ANOVA.

                                    Ethnicity
School Type        Caucasian   Hispanic   African American   API
Public 2-year      grp 1       grp 3      grp 5              grp 7
Public 4-year      grp 2       grp 4      grp 6              grp 8
Private 2-year     grp 9       grp 11     grp 13             grp 15
Private 4-year     grp 10      grp 12     grp 14             grp 16
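The 16 groups in the diagram are simply every pairing of a school type with an ethnic category. As a small illustration (the variable names below are chosen for this sketch and do not come from SPSS), the cells of such a design can be enumerated in Python:

```python
from itertools import product

ethnicities = ["Caucasian", "Hispanic", "African American", "API"]
sectors = ["Public", "Private"]
lengths = ["2-year", "4-year"]

# School type combines sector and program length, giving four levels.
school_types = [f"{sector} {length}" for sector, length in product(sectors, lengths)]

# Every school-type x ethnicity pairing is one cell (group) of the design.
cells = list(product(school_types, ethnicities))

print(len(school_types), "school types x", len(ethnicities),
      "ethnicities =", len(cells), "groups")
for school_type, ethnicity in cells[:4]:
    print(school_type, "/", ethnicity)
```

The number of groups is the product of the number of levels of each independent variable, which is why adding levels or variables makes a design grow quickly.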


One-Way ANOVA (also called "simple ANOVA" or "one factor ANOVA")

In order to run a one-way ANOVA your data set must contain one independent variable with two or more levels (categories) and one normally distributed dependent variable for each subject or case.

Note: The number of independent variables under study determines which ANOVA test you use. With one independent variable with two or more levels, you use a one-way ANOVA; with two independent variables with two or more levels each, you use a two-way ANOVA. The naming does not continue indefinitely, however: you will not encounter a 12- or 22-way ANOVA in practice. Designs that examine two or more independent variables at once are known collectively as factorial ANOVAs.

Analyze --> Compare means --> One-way ANOVA

Example 1

If your independent variable has only two levels or categories, say gender with "male" and "female" or ethnicity with "minority" and "non-minority," running a one-way ANOVA is very straightforward. Let's start there.

In our hypothetical compensation study, we want to examine differences in current salary between minority and non-minority employees. Current salary is the variable of interest, the measure on which we suspect there is a difference between the two groups under study. This is the dependent variable. Ethnicity, which we have defined as "minority" and "non-minority," is the independent variable and has two levels.

Execute the command sequence Analyze --> Compare means --> One-way ANOVA. Choose from the complete list of variables on the left your variable of interest and move it to the "Dependent" box. Next, choose and move to the "factor" box the independent variable, that is, the measure that divides the data set into at least two groups for the purpose of comparison.

Click on the options button at the bottom of the dialog box. If the descriptives box is not already marked, it is recommended you mark it now. Click the continue button to return to the main dialog box, and then click OK.

You may wish to work through to this stage with your data set before continuing.


If you checked the descriptives box as outlined above, the first table in the Viewer presents common descriptive statistics on your data: frequencies, means, standard deviations, and the range. While this information is important and always worthy of your review, the "meat" of the ANOVA is presented in the second table, the ANOVA table. Focus your attention on the "F" value in the fifth column and the statistical significance of this value reported in the sixth.

The "F" value is the ratio of the variance between the groups to the variance within the groups. Don't bother too much about that now. Just know that when running an ANOVA, you generally want "F" values larger than 1.0. The larger the F value, the more likely it is that you have uncovered a true difference between the groups, not one due to extraneous factors (such as sampling fluctuation). If the F value is significant (as noted in the sixth column), you can report a statistically significant difference between the groups on the variable of interest. If the F value is not significant, you have no evidence of a difference between the groups.
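For readers who do want to see where F comes from, the ratio can be worked out by hand. The Python sketch below is a minimal illustration with invented salary figures and a hypothetical function name; it computes the between-groups and within-groups mean squares and divides one by the other.

```python
from statistics import mean

# Hypothetical salaries (in $1,000s), invented for illustration only.
minority = [44, 50, 41, 47, 43]
non_minority = [52, 48, 60, 55, 45]

def one_way_anova(groups):
    """Return F, df between groups, and df within groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f, df_between, df_within = one_way_anova([minority, non_minority])
print(f"F({df_between}, {df_within}) = {f:.2f}")
```

An F near 1.0 means the groups differ from one another about as much as individuals differ within a group; the further F rises above 1.0, the more the group means stand apart. The function accepts any number of groups, so the same calculation applies when the independent variable has three or more levels.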

The standard method of reporting results of an ANOVA is to include the "F" value, the degrees of freedom (from the "df" column of the ANOVA table), and the significance level: e.g., F(1, 276) = 105.78, p < .001. (Notice how large the F value is in this example. When SPSS displays a significance of .000, report it as p < .001.)
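If you report many results, a small helper can keep the format consistent. The following Python sketch is only a suggestion: the function name is hypothetical, the numbers passed to it are placeholders rather than real findings, and the conventions it applies (no leading zero on p, and "p < .001" for very small values) are common but should be checked against the style guide you are following.

```python
def format_anova_result(f_value, df_between, df_within, p_value):
    """Format an F test as 'F(df1, df2) = value, p = .xxx'."""
    if p_value < 0.001:
        p_text = "p < .001"
    else:
        # Drop the leading zero, e.g. 0.019 becomes .019.
        p_text = "p = " + f"{p_value:.3f}".lstrip("0")
    return f"F({df_between}, {df_within}) = {f_value:.2f}, {p_text}"

# Placeholder values, not results from any real data set.
print(format_anova_result(4.27, 2, 57, 0.019))
```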

Go now to SPSS and look at the ANOVA table for your own data. Did you find a significant difference between the two groups? How would you report your results?


Example 2

Let's move now to something a little more complex. Let's say you are working in a data set where the independent variable has three levels. This translates into three mean scores on the dependent variable. In this scenario, to answer your research question you will need to compare the mean scores among all three groups. That is, you will have to compare the mean of group 1 with the mean of group 2, and the mean of group 1 with the mean of group 3, and the mean of group 2 with the mean of group 3.

How might this look in our fictitious salary study? We want to find out if there is a difference in current salary level (the dependent variable) among employees with different educational backgrounds. The independent variable in this case is educational background, which we have defined as "high school," "college" or "graduate school." Our independent variable thus has three levels. We will be comparing mean salary level among all three groups to see if there are differences among them; that is, we will compare the mean salary level for high school graduates with the mean salary level of college graduates and the mean salary level of those with graduate degrees; we will also be comparing the mean salary level of college graduates with the mean salary level of holders of graduate degrees. We still choose to run a one-way ANOVA, however, since we have only one independent variable.

There are many similarities between running a one-way ANOVA with an independent variable with two levels and running one with an independent variable with three levels. In both instances you are interested in examining differences between groups on a single dependent variable. In both instances, an F value will be calculated. In both instances you hope for a large calculated F value. There is one important difference, however. In a one-way ANOVA comparing only two group means the F test will indicate whether there is a difference between those two groups since they are the only two groups under study. In a one-way ANOVA where three groups are being compared, the same single F test indicates only that there is a difference. It does not tell you where the difference is. Is there a difference between groups 1 and 2? Or between 1 and 3? Or between 2 and 3? Perhaps there are differences among all groups being compared! In order to further investigate where exactly the differences are to be found, you must run a post-hoc comparison test.
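The pairs a post-hoc test must examine can be listed mechanically. This short Python sketch, using the three education levels from our example, enumerates every pairwise comparison; it only lists the pairs and performs no statistical test.

```python
from itertools import combinations

levels = ["high school", "college", "graduate school"]

# Each unordered pair of groups is one post-hoc comparison.
pairs = list(combinations(levels, 2))
for first, second in pairs:
    print(f"{first} vs. {second}")

# With k groups there are k * (k - 1) / 2 distinct pairs.
k = len(levels)
print("Number of comparisons:", k * (k - 1) // 2)
```

Three groups produce three comparisons, but four groups produce six and five produce ten, which is one reason post-hoc tests adjust for the number of comparisons being made.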

To run a post hoc you need to add only one step to the process outlined in the first example. Let's learn how to do that now.

Remember, it is strongly recommended that you have a data set running while working through this section. If you do not have a data set of your own, you should access one available in SPSS. Take a moment to catch up before proceeding.


Analyze --> Compare means --> One-way ANOVA

To illustrate we will use a research question from our working example: is there a difference in current salary level based on education level? Education level is defined as "high school", "college" or "graduate."

Execute the command sequence Analyze --> Compare means --> One-way ANOVA. Choose from the complete list of variables on the left the dependent variable and move it to the "Dependent" box (from our example, current salary level). Next, choose and move to the "factor" box the independent variable, that is, the measure that divides the data set into three groups for the purpose of comparison (educational level is the factor).

Click on the options button at the bottom of the dialog box. If the descriptives box is not already marked, it is recommended that you mark it at this stage. Click the continue button to return to the main dialog box.

Notice to the left of the options button the post hoc button. Click it now. You'll see the wide variety of post hoc tests available to you. It is recommended you choose one of the three most common from the "equal variances assumed" listing: Bonferroni, Scheffe, or Tukey.

Before leaving the post hoc dialog box, you should check the significance level at the bottom. The value ".05" defaults in. If you prefer the ".01" significance level, place the cursor in the field and enter that value. Click Continue to return to the main dialog box; the next step is to click OK.

If you haven't been working in your data set up to this point, it is strongly recommended you do so before clicking OK.


If you checked the descriptives box as outlined above, the first table in the Viewer presents common descriptive statistics on your data: frequencies, means, standard deviations, and range. You should always review this data.

Next is the ANOVA table. Look at the "F" value in the fifth column and its statistical significance in the sixth. If the F value is significant, you have discovered a difference somewhere among the means, but you don't know where just yet. If the F value is not significant, you are essentially done with the analysis. There are no statistically significant differences among the groups.

The results of the post hoc tests are reported in the third table, called a "multiple comparisons" table. If you indicated that you wanted only one post hoc run, the results of that test are provided here; if you decided you wanted multiple post hocs run, all the results are reported in this table.

Take a moment to study this table. See if you can understand what it is reporting.

The first line of the table indicates that SPSS compared the mean of group 1 with the means of group 2 and 3; in the second line, it compared the mean of group 2 with the means of group 1 and 3; and in the third line it compared the mean of group 3 with the means of group 1 and 2. (Is this double reporting of the same result redundant? Certainly. But you can appreciate the thoroughness of the calculations.) To determine which differences are significant, examine the values in the fifth column. If the value there is less than or equal to your level of significance (.05 or .01), you have uncovered a statistically significant difference between the groups.
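To make the logic of reading the significance column concrete, here is a minimal Python sketch of the general Bonferroni idea: each unadjusted p-value is multiplied by the number of comparisons (and capped at 1) before being compared with the significance level. The p-values below are invented, the function name is hypothetical, and the sketch is not a description of SPSS's internal calculations.

```python
# Hypothetical unadjusted p-values for the three pairwise comparisons.
raw_p = {
    ("high school", "college"): 0.004,
    ("high school", "graduate school"): 0.0005,
    ("college", "graduate school"): 0.03,
}

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}

alpha = 0.05
adjusted = bonferroni(raw_p)
significant = [pair for pair, p in adjusted.items() if p <= alpha]

for pair, p in adjusted.items():
    flag = "significant" if p <= alpha else "not significant"
    print(f"{pair[0]} vs. {pair[1]}: adjusted p = {p:.4f} ({flag})")
```

In this invented example the college versus graduate school comparison would have passed at .05 before adjustment but does not pass afterward, which illustrates why a post hoc result can be more conservative than a simple pairwise test.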

Before leaving this section you should practice running a one-way ANOVA on your data set. For a refresher, you may wish to run one with an independent variable with two levels and then proceed to running a second one with an independent variable with three levels. Did you detect differences among the three groups?


Two-Way ANOVA (two factor ANOVA)

As pointed out above, ANOVA is a remarkably useful addition to your statistical toolbox. Its power and flexibility are considerable. The move from an analysis of the effect of one independent variable on a dependent variable to an analysis of the effect of two independent variables on a dependent variable is relatively easy. It will also enable you to ask much more complex questions of your data.

Tip: As the complexity of the analysis increases it is doubly wise to have an active data set running as you work through these sections.


Up to this point our fictitious compensation study has looked separately at the impact of gender and ethnicity on salary level. First, a one-way ANOVA was run with gender as the independent variable, and then a second one-way ANOVA was run using ethnicity as the independent variable. A much simpler, quicker, and more statistically sound approach would be to run a single two-way ANOVA. Use a two-way ANOVA when you wish to examine the impact of two independent variables (each with at least two levels) on a quantitative, normally-distributed dependent variable. The added advantage to running a two-way ANOVA is that it analyzes not only the impact on the dependent variable of each independent variable separately, it also looks at the impact on the dependent measure of a combination of the two variables.

To illustrate the range of a two-way ANOVA, we return to our compensation study. A two-way ANOVA will enable us to examine differences between male and female minority employees, and between male and female non-minority employees, and between male minority and male non-minority employees, and between female minority and female non-minority employees. In other words, with a two-way ANOVA you can study the impact of gender alone on salary level; and of ethnicity alone on salary level; and the interaction of ethnicity and gender on salary level. (A statistician would correctly point out that you are studying the "main effects" of gender and ethnicity, separately and individually, and the "interaction effect" of gender and ethnicity combined.)

Analyze --> General Linear Model --> Univariate

Note: The command string for a two-way ANOVA is different from the command string of a one-way ANOVA.

From the complete list of variables at the left of the dialog box, click and move the dependent measure to the "dependent variable" box. Then click and move both independent variables to the "fixed factor(s)" box. Next, click on the Options button. The list of variables in the "Factor and Factor Interactions" box at the left shows each independent variable (the main effects) as well as a combination of both independent variables (the interaction effect). Move all three of these to the "Display Means for" box on the right. Move now to the "Display box" below and click "descriptives." Before clicking Continue, you should check the significance level appearing at the bottom and adjust it if necessary.

As you know from an earlier discussion, if your independent variables have three or more levels you must run a post hoc test to identify where exactly differences between the groups are found. To run a post hoc, click the post hoc button. From the display box on the left move all variables that have three or more levels to the box on the right labelled "Post hoc tests for." (Don't be tricked! If one of your variables of interest is gender, don't move it. Since gender is dichotomous, meaning there are always two and only two groups, there is no need to run a post hoc comparison). Notice that once you move a variable to the "post hoc tests for" box, the lower half of the screen is activated. Choose at least one post hoc from the options listed. Common choices for post hoc tests are Bonferroni, Scheffe, and Tukey. Click on the continue button to return to the main dialog box. Click OK (located in the bottom left hand corner).

If you haven't been working with a data set up to this point, take a moment now to catch up.


Don't panic because of the amount of output you have generated! Take it table by table.

In the first table, for example, don't be thrown by the statistical title! All that is presented here are frequency counts by category (or level) of each independent variable. This is information you certainly want to know (and may already have seen if you ran frequencies or crosstabs on your data early on). Always spend a moment reviewing this data.

The second table provides more descriptive information--means, standard deviations, and frequency counts. But the information presented here is no longer unidimensional. Rather, notice that SPSS has provided means, standard deviations, and frequency counts within and across levels of both independent variables. In our fictitious salary study, this table would indicate the mean salary level (and standard deviation and count) of Hispanic males and females, and of African American males and females, and of Caucasian males and females, and of Asian males and females. It's in your best interest to also spend some time studying this information.

The third table is the ANOVA table, mysteriously renamed "Tests of Between-Subjects Effects" for a two-way ANOVA. (Don't be intimidated.) Just as with a one-way ANOVA, you are interested in significant F values reported in columns five and six. In a two-way ANOVA, however, you're interested in more than one significant result; in fact, you're interested in three results. Remember, a two-way ANOVA analyzes the impact on the dependent variable of each independent variable separately (main effects) as well as the impact of a combination of the independent variables (interaction effect). In other words, you may discover that only one main effect is significant (e.g., that only gender impacts salary level) or you may learn that both main effects are significant (e.g., gender in and of itself impacts salary level and ethnicity in and of itself impacts salary level) or you may discover that gender and ethnicity interact to impact salary level. The bottom line is: remember to look at three F values when evaluating a two-way ANOVA table.
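The three F values can be traced back to three separate sums of squares. The Python sketch below works them out for a tiny invented data set; it assumes a balanced design (the same number of cases in every cell), and its function and variable names are hypothetical. With unequal cell sizes the arithmetic is more involved, and you should rely on the SPSS output.

```python
from statistics import mean

# Hypothetical salaries (in $1,000s) keyed by (gender, ethnicity). Invented data.
cells = {
    ("male", "minority"): [40, 44],
    ("male", "non-minority"): [50, 54],
    ("female", "minority"): [38, 42],
    ("female", "non-minority"): [42, 46],
}

def two_way_anova(data):
    """Return F for each main effect and the interaction (balanced design only)."""
    levels_a = sorted({a for a, _ in data})
    levels_b = sorted({b for _, b in data})
    n = len(next(iter(data.values())))
    if any(len(scores) != n for scores in data.values()):
        raise ValueError("this sketch handles balanced designs only")
    grand = mean([x for scores in data.values() for x in scores])
    mean_a = {a: mean([x for b in levels_b for x in data[(a, b)]]) for a in levels_a}
    mean_b = {b: mean([x for a in levels_a for x in data[(a, b)]]) for b in levels_b}
    # Main effects: how far each marginal mean sits from the grand mean.
    ss_a = n * len(levels_b) * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n * len(levels_a) * sum((m - grand) ** 2 for m in mean_b.values())
    # Interaction: what the cell means show beyond the two main effects.
    ss_ab = n * sum((mean(data[(a, b)]) - mean_a[a] - mean_b[b] + grand) ** 2
                    for a in levels_a for b in levels_b)
    ss_within = sum((x - mean(scores)) ** 2 for scores in data.values() for x in scores)
    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    ms_within = ss_within / (len(levels_a) * len(levels_b) * (n - 1))
    return {
        "gender": (ss_a / df_a) / ms_within,
        "ethnicity": (ss_b / df_b) / ms_within,
        "interaction": (ss_ab / (df_a * df_b)) / ms_within,
    }

for effect, f_value in two_way_anova(cells).items():
    print(f"{effect}: F = {f_value:.2f}")
```

Each F value divides the mean square for one effect by the same within-groups mean square, so the three results can be significant or non-significant independently of one another.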

At the present time there is no need to spend time on the estimated marginal means tables. Skip over them to the post hoc tables.

The ANOVA table indicates that significant differences were found somewhere between and among the variables under study. The post hoc table tells you where exactly those differences lie. As discussed in the section on the one-way ANOVA, the first line of the table indicates that SPSS compared the mean of group 1 with the means of group 2 and 3; in the second line, it compared the mean of group 2 with the means of group 1 and 3; and in the third line it compared the mean of group 3 with the means of group 1 and 2. To determine which of the mean differences are significant, examine the values in the fifth column. If the value there is less than or equal to your level of significance (.05 or .01), you have uncovered a statistically significant difference between the groups.

Run a two-way ANOVA on your own and see if it makes sense to you.
 