University of San Francisco

Non-Parametric Tests

In this section...

The Mann-Whitney U
The Kruskal-Wallis H Test
Chi-Square

If you have not worked with SPSS before, it is recommended that you learn some navigation and data entry techniques before beginning this section. To learn the basics on these important skills click here.

For all users, it is suggested that you have an active SPSS data set to use while working through this section. It can be your actual data or one of the data sets available in SPSS (to learn how to access SPSS data sets click here). It is further suggested that you run descriptives on your data before continuing. You can refresh your memory on how to run descriptive statistics by clicking here.

Like all of the statistical tests discussed up to this point, non-parametric tests are used to investigate the relationship between two or more variables. Recall from our discussion at the start of this module that one of the key factors in determining which statistical test to run is the nature of the data to be analyzed. All of the statistical techniques you have learned up to now have made assumptions regarding the data (in particular regarding the population parameters estimated by the data). Correlation, ANOVA, independent and paired-samples t-tests, and regression all assume that the data are (1) drawn from a normally distributed population (values correspond roughly to the bell-shaped normal curve); (2) quantitative in nature (the values can be manipulated arithmetically in a meaningful manner); and (3) measured, at the very least, on an interval scale (differences between values are captured by equal intervals). These are conditions that should be met in order to run parametric tests.

But if you reflect for a moment on the nature of data in general, you will realize that not all data sets meet these assumptions. Consider, for example, the following: what if in our fictitious compensation study salary levels for our sample "bunch" around the extremes (high salary and low salary), with very few people earning amounts in the "average" range? Data such as these are not normally distributed--they "violate the normality assumption." Or say one of our questions is "are you a college graduate?" and we offer only two response options, "yes" or "no." This dichotomous variable is not quantitative in nature (how do you determine the mean of "yes"?). Lastly, there are many variables that are not captured on an interval or ratio scale. Some data simply divide the cases into mutually exclusive groups--USF graduates and non-USF graduates, for example. Such data are called "nominal" or "categorical": they "name" a group or create a mutually exclusive "category." Or you might discover that your research question involves a variable that is ordinal in nature, that is, one where the relationship between cases or subjects is expressed in terms of rank (first, second, third, etc.). A research question from our compensation study that includes an ordinal variable might be: Is there a relationship between current salary and the national ranking of the college or university from which the subject received the highest degree?

Because not all data are the same and because research questions should not be limited by available data analytic techniques, alternate methods of statistical analysis are necessary. Non-parametric tests are one such useful alternative. Non-parametric tests, unlike their parametric equivalents, do not assume that the data follow a particular distribution defined by population parameters (hence the name). You should consider using non-parametrics in the situations listed below (if the truth be told, in some cases you have no choice but to use them). When your data do meet the parametric assumptions, you can run either a parametric or a non-parametric test to answer your research questions. Parametric tests pack more statistical muscle in that situation, though, so examine your data closely and make this choice wisely. Considering the statistical power of each option will help you decide.

Use non-parametric tests when:

  1. One or more variables in your data set, including the dependent variable, is measured on a nominal or ordinal scale.
  2. One or more variables in your data set, including the dependent variable, violates the normality assumption.
  3. The sample size is small (< 20 cases or subjects).

There are many non-parametric tests. In fact, non-parametric equivalents exist for most "standard" parametric techniques. In this section we will describe only three common non-parametric tests: the Mann-Whitney U (the non-parametric equivalent of an independent samples t-test), the Kruskal-Wallis H test (the non-parametric parallel to a one-way ANOVA), and Chi-square.

As you progress through the next few sections you should not forget that research questions answered by non-parametric tests are not much different from research questions answered by parametric tests. The purpose of all of these techniques is to compare two or more groups on a dependent measure. Parametric tests answer research questions by calculating and comparing the means of the groups under study on the dependent variable. Non-parametric tests accomplish the same task by comparing the mean rank (or median) across groups. (In a clever trick, non-parametrics combine data on the dependent measure from all groups under consideration, then rank all subjects or cases based on the total sample, and then separate the groups out again. You need do nothing but interpret the results!)
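The ranking trick described above can be sketched in a few lines of Python. This is only an illustration of the idea, not what SPSS runs internally: the function names and the salary figures (in thousands) are made up for the example, and tied scores are assumed to share the average of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values 1..n; tied values share the average of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def mean_ranks(groups):
    """groups maps a group label to its scores; returns label -> mean rank."""
    labels, pooled = [], []
    for label, scores in groups.items():
        # Step 1: combine the scores from all groups into one sample
        labels.extend([label] * len(scores))
        pooled.extend(scores)
    # Step 2: rank every case against the total sample
    ranks = average_ranks(pooled)
    # Step 3: separate the groups again and average the ranks in each
    result = {}
    for label in groups:
        group_ranks = [r for r, lab in zip(ranks, labels) if lab == label]
        result[label] = sum(group_ranks) / len(group_ranks)
    return result


# Hypothetical salaries, not data from the compensation study
salaries = {"female": [30, 42, 35, 50], "male": [38, 61, 45, 70]}
ranks_by_group = mean_ranks(salaries)
print(ranks_by_group)
```

A group whose mean rank is clearly higher than the other's tends to hold the larger scores; the tests in this section decide whether that gap is larger than chance would produce.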


The Mann-Whitney U (Wilcoxon-Mann-Whitney)

As an illustrative example in the discussion of the independent samples t-test at the beginning of this module, we considered an investigation of the relationship between gender and current salary level. The research question asked was this: is there a difference in salary level between men and women? In that case we were interested in comparing the mean differences between two groups on a dependent measure. As you will see, the Mann-Whitney U is interested in the same question but works with the data in a slightly different way.

In the earlier discussion we chose to run an independent samples t-test to compare the mean salary level for men with that of women. Why did we decide against the non-parametric equivalent, the Mann-Whitney U? Review (and then apply) the three conditions outlined above in which a non-parametric test is indicated. Is at least one variable nominal or ordinal? Certainly gender is an obvious dichotomous nominal variable (male, female). This alone suggests that a non-parametric test is at least plausible in this case. Condition 2 asks if at least one variable violates the normality assumption. Does the distribution of current salary levels approximate the normal curve? Possibly, but since we don't know for certain we would need to run Descriptives to check. Finally, condition 3 deals with the sample size. Are we dealing with an unusually small sample size in this instance? Let's assume we've done due diligence and that we have a sufficient sample.

Reflect now on the parametric versus non-parametric dilemma in this case. What was the determining factor in the choice of the independent samples t-test over the Mann-Whitney U to compare the differences between the two groups?

The answer is found in the assumption made by the researcher on the nature of the distribution of the dependent variable (current salary level). The very act of choosing the independent samples t-test over the Mann-Whitney U indicates that it was assumed current salary level is normally distributed. If we abandon that assumption or prove that it is false (here's where running Descriptives is critical), we are left with a categorical variable (gender) and a quantitative variable in violation of the normality assumption. These are two important conditions that call for the use of a non-parametric test.

Remember, the research question and purpose of the Mann-Whitney U are the same as those of the independent samples t-test. We are still interested in determining if there is a difference in salary level between men and women. The only difference is in what we know about our data. This seemingly minor point has a huge impact, and directs us to choose one statistical procedure over another. It is also an important reminder to run and analyze descriptives on your data set!

Analyze --> Nonparametric Tests --> 2 Independent Samples

Remember, it is recommended that you have an active SPSS data set to use while working through this section. If you don't have a data set of your own, you should access one of the data sets available in SPSS. To learn how to access SPSS data sets, click here. Run Descriptives on your data first to make certain it is appropriate for non-parametric tests.

Choose from the complete list of variables displayed at the left of the dialog box the dependent variable and move it to the "test variable" box on the right (from our compensation study, this would be "current salary"). Next, choose from the list the independent variable and move it to the "grouping variable" box ("gender" from the compensation study). (Note: The term "grouping variable" should suggest to you that this variable should be categorical.)

Just as with the independent samples t-test, you will now have to define the groups to be used in the analysis. Click on the "define groups" button. Enter under "Group 1" the value that identifies the first group in your data, and then tab and repeat the procedure for the second group. In our running example, this might be "1" for females and "2" for males. These values have no inherent meaning; rather, they serve only to "mark" or indicate group membership. Once you have entered the values, click Continue. Notice in the test type box at the lower left that Mann-Whitney U is already marked.

For thoroughness' sake, click now on the Options button and mark Descriptives.

If you haven't been working with your data set to this point, it is recommended you do so before clicking OK.

The Output Editor shows three tables. The first table contains the traditional descriptives on your data set: number of cases and means and standard deviations on both variables. Look closely at the mean of the dependent variable, since that might give you a sense of the differences between the two groups. (On a side note, consider the mean and standard deviation for your dichotomous variable. What does this mean?) Before leaving this table, notice that nothing in it can confirm or rule out a violation of the normality assumption. Running more detailed Descriptives as suggested above is necessary for that.

The second table is labelled "ranks." This should not surprise you when you recall that non-parametric tests investigate differences between groups by comparing ranks (and sometimes medians). Just as you examined the means on the dependent variable in the independent samples t-test output, look now at the relative ranks of the dependent variable in table 2. In our hypothetical example from the compensation study, the mean rank of current salary level for males is 149.7 and for females it's 309.2. Do you suspect there is a difference in salary level for men and women?

The final table in this output confirms our hunch. The first line of table 3 presents a calculated value for the Mann-Whitney U test. In order to determine if this value is significant or not, check the value on line 4. If it is less than or equal to your predetermined significance level (.05 or .01), you have found a statistically significant difference between the groups.
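To make the numbers in that final table less mysterious, here is a minimal Python sketch of how a U value and an approximate significance level can be computed by hand. It assumes the standard large-sample normal approximation with no tie or continuity correction, so its results may differ from what SPSS prints; the function name and the salary figures are hypothetical.

```python
import math


def mann_whitney_u(group1, group2):
    """Return (U, z, two-sided p) using the normal approximation."""
    n1, n2 = len(group1), len(group2)
    # U1 counts the pairs in which a group1 score beats a group2 score;
    # a tied pair counts as one half. This equals the rank-sum formula
    # R1 - n1 * (n1 + 1) / 2, where R1 is the sum of group1's pooled ranks.
    u1 = sum(1.0 if x > y else 0.5 if x == y else 0.0
             for x in group1 for y in group2)
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # assumes no ties
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal curve
    return u, z, p


# Hypothetical salaries (in thousands), not data from the compensation study
female = [30, 42, 35, 50]
male = [38, 61, 45, 70]
u, z, p = mann_whitney_u(female, male)
print(f"U = {u}, z = {z:.3f}, p = {p:.3f}")
```

With only four cases per group the normal approximation is rough; the sketch is meant to show where the reported values come from, not to replace the SPSS output.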

Run a Mann-Whitney test on your data set now to see if there is a difference between the two groups. Remember--run Descriptives first!


The Kruskal-Wallis H Test

The Mann-Whitney U is used to compare differences between two independent groups. When investigating differences between more than two groups, the Kruskal-Wallis H test is the more appropriate choice. The relationship between these two tests is the same as between their parametric equivalents, the independent samples t-test and the one-way ANOVA.

To illustrate this procedure, we propose to investigate the relationship between ethnicity and salary level from our made-up compensation study. Our four ethnic categories are: African American, Hispanic, Asian Pacific Islander, and Caucasian. Here ethnicity is a categorical (nominal) independent variable with four levels; hence, to answer our research question we will be considering four groups. In the case of salary level, we have two options. The first is to run Descriptives on current salary level to determine if the data violate the normality assumption. If that's the case, we simply proceed to analysis. The second option is a bit more complicated. We can recode current salary level into a dichotomous variable ("high" and "low") and then run the analysis. In doing so we clearly meet the criteria for running a non-parametric test. (To refresh your memory on how to recode variables, click here. In this example, to determine where to draw the line between high and low salary, running Frequencies to determine the median of the distribution is helpful.)
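The median split mentioned above can be sketched as follows. This is an illustration only: the function name and salary figures are hypothetical, and placing cases that fall exactly on the median in the "low" group is an arbitrary assumption you would need to decide for your own data.

```python
from statistics import median


def recode_high_low(salaries):
    """Split salaries at the median; returns (labels, cutoff)."""
    cutoff = median(salaries)
    # Assumption: a salary exactly equal to the median is coded "low"
    labels = ["high" if s > cutoff else "low" for s in salaries]
    return labels, cutoff


# Hypothetical salaries (in thousands)
salaries = [30, 42, 35, 50, 38, 61, 45, 70]
salary_hilo, cutoff = recode_high_low(salaries)
print(cutoff, salary_hilo)
```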

Analyze --> Nonparametric Tests --> K Independent Samples

Note: "K" refers to the number of independent groups (samples) being compared, that is, the number of levels of the grouping variable. It is assumed that k > 2.

Remember, it is recommended that you have an active SPSS data set to use while working through this section. If you don't have a data set of your own, you should access one of the data sets available in SPSS. To learn how to access SPSS data sets, click here. Run Descriptives on your data first to make certain it is appropriate for non-parametric tests.

Choose from the complete list of variables displayed at the left of the dialog box the dependent variable and move it to the "test variable" list on the right (from our compensation study, this would be the recoded dichotomous salary variable, with values "high salary" and "low salary"). Next, choose from the list the independent variable and move it to the "grouping variable" box ("ethnicity" from the compensation study). You must now define the range of values for the groups to be used in the analysis. (A range of values is necessary for this test because you are examining more than two groups.) Click on the "define range" button, and enter the range of values for your variable (in the salary study, the range of values is 1 to 4, one value for each ethnic group). Remember, these values have no inherent meaning; they only "mark" or indicate group membership. Once you have entered the values, click Continue. Notice in the test type box at the lower left that Kruskal-Wallis H is already marked.

For thoroughness' sake, click now on the Options button and mark Descriptives.

If you haven't been working with your own (or SPSS's) data set to this point, it is recommended you do so before clicking OK.

The Output Editor shows three tables. The first table contains the traditional descriptives on your data set: number of cases, means, and standard deviations on both variables. Since in our fictitious example from the compensation study both variables are categorical, this table is not particularly helpful. Still, always review the contents of this table.

The second table is the rank table. Notice a rank on the dependent variable has been calculated for all levels of your independent variable. In our ongoing example, we have ranks for four groups--Hispanic, Asian, Caucasian and African American. To get a sense of differences between groups, look at the relative ranks of the dependent variable. In our hypothetical example from the compensation study, the mean rank of salary level is 239.01 for Caucasians, 245.95 for Asians, 284.76 for African Americans, and 279.14 for Hispanics. Given this information, do you suspect there is a difference in salary category (high versus low) based on ethnicity?

Table 3 contains the answer to that question. To judge the Chi-square value reported on line 1, compare the significance value reported in the same table with your predetermined significance level (.05 or .01). If the Chi-square value is not significant, there is no statistically significant difference between the groups on the dependent measure under study. If it is significant, there is a difference between the groups.
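For readers curious about where that Chi-square value comes from, the following Python sketch computes the Kruskal-Wallis H statistic and its approximate significance for four groups. It makes several assumptions: the data and group layout are invented, H is not corrected for ties (so it may differ from a tie-corrected value), and the p-value uses the closed-form chi-square tail for exactly 3 degrees of freedom (four groups minus one).

```python
import math


def average_ranks(values):
    """Rank values 1..n; tied values share the average of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def kruskal_wallis_h(groups):
    """groups is a list of score lists; returns H (no tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # this group's rank sum
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Hypothetical scores for four groups coded 1 to 4
groups = [[27, 31, 35], [29, 33, 40], [45, 52, 38], [48, 55, 60]]
h = kruskal_wallis_h(groups)
p = chi2_sf_df3(h)
print(f"H = {h:.3f}, p = {p:.3f}")
```

H is compared against a chi-square distribution, which is why the output table labels the statistic "Chi-square" even though the test itself is the Kruskal-Wallis H.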

Like the results from an ANOVA, these results tell you only that a difference exists somewhere among the groups; they do not tell you exactly where the difference(s) is (are). So, as with ANOVA, post hoc tests are required to determine which groups differ on the dependent variable. Unfortunately, there is not a quick and easy procedure for running non-parametric post hocs in SPSS. Rather, the researcher must run a Mann-Whitney U on all pairs of groups in order to determine if there are differences between them. In our example, it's likely that there is a difference in rank salary level between Asians and African Americans and between Asians and Hispanics. But is there a difference between Caucasians and Asians? Or between Hispanics and African Americans? It's good practice to run a post hoc test for each pair of groups under study; you will not want to report in your thesis, dissertation, or article that "visual inspection of mean ranks indicates there is no difference between the groups." In our example, we would run a Mann-Whitney U comparing group 1 with groups 2, 3, and 4; group 2 with 3 and 4; and group 3 with 4. To refresh your memory on the Mann-Whitney U, click here.
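The list of pairwise comparisons can be generated mechanically, as in the short Python sketch below. The group codes 1 to 4 follow the example above. The Bonferroni adjustment shown is not part of this tutorial's procedure; it is included as one common, conservative convention for keeping the overall error rate near the chosen level when several tests are run, and you should check what your field expects.

```python
from itertools import combinations

group_codes = [1, 2, 3, 4]  # one code per ethnic group, as in the example

# Every pair of groups that needs its own Mann-Whitney U
pairs = list(combinations(group_codes, 2))

alpha = 0.05
# Assumption: Bonferroni adjustment, dividing alpha by the number of tests
adjusted_alpha = alpha / len(pairs)

for first, second in pairs:
    print(f"Compare group {first} with group {second}")
print(f"{len(pairs)} tests, each judged at alpha = {adjusted_alpha:.4f}")
```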

Run a Kruskal-Wallis H test on your data set now to see if there is a difference between groups. Remember--run Descriptives first!


Chi-Square

Reminder: you should have an active SPSS data set running while working through this section. If you don't have a data set of your own, you can access one of the data sets available in SPSS. To learn how to access SPSS data sets, click here. Run Descriptives on your data first to make certain it is appropriate for non-parametric tests.

Probably the most commonly used non-parametric test is chi-square. Chi-square is quite versatile and can be applied in a variety of situations. Two of the more frequent uses of chi-square are described in this section.

The most common use of the chi-square test is to examine the relationship between two nominal variables. If this sounds like correlation to you, it should. This application of Chi-square can be thought of, loosely, as the non-parametric counterpart of a Pearson correlation coefficient: it tells you whether two variables are related, though not the direction of the relationship. From our compensation study, for example, we could use Chi-square to examine the relationship between gender and salary group ("high" and "low").

Analyze --> Descriptive Statistics --> Crosstabs

Identify from the complete list of variables displayed on the left of the dialog box the two variables you wish to study. Move one of the variables to the "row" box and the other to the "column" box. (It doesn't really matter which one goes where.) Click on the Statistics button at the bottom of the dialog box. Mark Chi-square and then click Continue. Click OK.

If you haven't been following along in your own data set up to this point, take a moment to catch up.

The first table in the output, Case Processing Summary, indicates how many cases were used in the analysis. It also shows how many cases were missing. Remember, it is important to always review this information. Consider this: what if your results are based on data from only half the total sample? Would you have as much confidence in the results?

The next table is the crosstabulation of the two variables under study (in our case, gender and salary group). This table shows the distribution of one variable across the levels of the other. In our example, the crosstabs function tells us exactly how many men and how many women fall into "high salary" and "low salary" groups, respectively.

The final table, the Chi-Square Tests table, tells you whether the two variables under study are related. Direct your attention to the very first line of that table: the Pearson Chi-Square. (The fortuitous naming of this coefficient is a handy reminder of the purpose of this test!) If the value reported in column four on the first line is equal to or less than your level of significance (.05 or .01), you can conclude that the two variables are related.
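The Pearson Chi-Square value compares each cell of the crosstabulation with the count you would expect if the two variables were unrelated. The Python sketch below shows that calculation for a 2 x 2 table. The counts are invented, the function name is hypothetical, no continuity correction is applied, and the p-value formula used here holds only for 1 degree of freedom (a 2 x 2 table).

```python
import math


def pearson_chi_square(table):
    """table is a list of rows of observed counts; returns (chi2, df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical crosstab: rows = female, male; columns = high, low salary
observed = [[20, 60],
            [50, 30]]
chi2, df = pearson_chi_square(observed)

# Upper-tail chi-square probability; this closed form is valid for df == 1
p = math.erfc(math.sqrt(chi2 / 2))
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.6f}")
```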

Run a Chi-square on your own data (if you haven't already).

A second, more advanced use of Chi-square is for hypothesis testing. You can use Chi-square to test a hunch or suspicion you have about the relationship between two variables. In this application of the test, Chi-square examines the frequency of occurrence within a group. Specifically, Chi-square evaluates whether the observed (or actual) frequency of occurrence within a category or group is different from an expected frequency of occurrence within that group. That is, is the real world different from what you hypothesize it to be?

Let's go back to our compensation study to illustrate. Let's assume we are interested in examining the relationship between gender (male, female) and salary level (high, low). Given what we know from research about the discrepancy in earning power between men and women, it would be foolish to expect an equal number (percentage) of men and women to be in the "high" or "low" group. A 50-50 expected frequency of occurrence therefore seems unlikely. We may hypothesize, however, based upon reading in the management/compensation literature, that women will comprise roughly 25% of the "high salary" group and 75% of the "low salary" group. Chi-square can be used to test this hypothesis: do women truly make up 25% of the high salary group and 75% of the low salary group? A non-significant Chi-square result would be consistent with the hypothesis, suggesting that our hypothesized (or expected) frequencies match the observed frequencies, and women do in fact comprise only one quarter of the high salary group and three quarters of the low salary group. On the other hand, a significant result would suggest that our expected frequencies were different from the observed frequencies, meaning that women do not make up 25% of the high salary or 75% of the low salary groups. The exact proportion of women in either group is unclear, but the results of the Chi-square test tell us it is not 25-75. (Your hypothesis must be grounded in the literature of your field. It would not be appropriate to "guess" your way to the result you want by just randomly plugging in numbers.)

Analyze --> Nonparametric Tests --> Chi-Square

Choose from the complete list of variables displayed on the left of the dialog box the two variables under study and move them to the Test Variable List on the right (from our compensation study example, this would be gender and the recoded variable "salary_hilo"). In the Expected Values box directly below the Test Variable List, specify the frequencies you suspect or hypothesize in each group. From the fictitious compensation study, the expected values are "25" and "75". Select the Values option, enter the first expected frequency (25), and click Add; repeat this procedure for the second expected value (75). Click on the Options button in the lower right-hand corner. Choose Descriptives and then click on Continue. Back on the main dialog box, click OK.

If you haven't been following along with your own data up to this point, take a moment now to catch up.

The first table in the Output, Descriptive Statistics, contains the standard descriptive statistics you have come to expect by now. Make it a habit to review this information before continuing.

The next three tables are directly related to the Chi-Square test. The first two are labeled individually with the name of the variables you have chosen to study ("gender" and "salary_hilo" in our ongoing example). The first column of each table shows the number of mutually exclusive categories within each variable. In the salary study we are looking at gender and salary level (defined as "high salary" and "low salary"), two clearly dichotomous variables. We should not be surprised then to see that the first column in both tables shows two values.

The second column of each table shows us the number of observed occurrences within each category of the variable. Remember, this is the count (frequency) of actual occurrences within the data set of the category of the variable. The third column contains a calculated value corresponding to the hypothesized frequency we expected to see (either 25% or 75% from the compensation study). The fourth column reports the difference between observed and expected frequencies.

The fourth and final table, Test Statistics, answers the question: is there a difference between what is and what we expected? If the significance level reported on the third line of this table is less than or equal to your predetermined level of significance (.05 or .01), you can conclude that the observed and expected frequencies of occurrence are different. In the compensation study, for example, if we had a significant result we could report that there is a difference between what we expected (25% of females in the high salary group and 75% in the low salary group) and what in fact was observed.
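The observed, expected, and residual columns described above, along with the test statistic, can be reproduced with a short Python sketch. The observed counts here are invented, the function name is hypothetical, and the expected values 25 and 75 are treated as proportions of the total number of cases. The p-value formula is valid only for two categories (1 degree of freedom).

```python
import math


def chi_square_goodness_of_fit(observed, expected_weights):
    """Returns (expected counts, residuals, chi2) for one variable."""
    n = sum(observed)
    weight_total = sum(expected_weights)
    # Scale the hypothesized weights (e.g. 25 and 75) to the sample size
    expected = [n * w / weight_total for w in expected_weights]
    residuals = [o - e for o, e in zip(observed, expected)]
    chi2 = sum(r ** 2 / e for r, e in zip(residuals, expected))
    return expected, residuals, chi2


# Hypothetical observed counts in two categories, tested against 25-75
observed = [30, 50]
expected, residuals, chi2 = chi_square_goodness_of_fit(observed, [25, 75])

# Two categories give df = 1, for which this closed-form tail applies
p = math.erfc(math.sqrt(chi2 / 2))
print(expected, residuals, f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

In this made-up case the result is significant at .05, so the observed split would be judged different from the hypothesized 25-75.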

Go now to your own data and see if you can determine the relationship between two categorical variables.
 