Chapter 8

In this case we will collect two samples of observations and use the data to infer whether the two populations from which the samples came are the same. You can in fact test hypotheses concerning any population parameter, but we will confine ourselves to tests that involve the mean and variance.

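Suppose that you have two populations of raccoons and you want to study variation in body weight (kg). You collect 11 raccoons from population one and 8 raccoons from population two. The sample means are Xbar_{1} = 4.2 and Xbar_{2} = 5.6, and the sample variances are s_{1}^{2} = 21.87 and s_{2}^{2} = 15.36.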

We would like to test the null hypothesis that the populations have the same weight and, in this case, that the means of the populations are the same.

H_{0}: mu_{1} = mu_{2}

H_{A}: mu_{1} =/ mu_{2}

Note that we do not know what the population means are. From the results of our test we are going to infer whether the means are equal or not, but in reality we will never know for sure. Note also that the Xbars are not equal (4.2 =/ 5.6), and clearly a test of the Xbars would not be meaningful.

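To test this null hypothesis we will use a two-sample t-test. The test makes several assumptions about the data, among them that both samples come from normally distributed populations and that the two populations have equal variances.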

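The assumption of equal population variances can be checked with an appropriate statistical test called the variance ratio test (F test). Its hypotheses are

H_{0}: sigma_{1}^{2} = sigma_{2}^{2}

H_{A}: sigma_{1}^{2} =/ sigma_{2}^{2}

To test H_{0} we calculate F, the ratio of the two sample variances.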

The values of the F distribution vary from 0 to + infinity; however, the tables only give you values for alpha less than or equal to 0.50. Thus, to ensure that your calculated F value will be within the range of values given in the table, it is important that the larger of the two sample variances be in the numerator. The calculated F will then be greater than 1.

F = s_{1}^{2}/s_{2}^{2} or F = s_{2}^{2}/s_{1}^{2}, whichever is larger

In our case, sample 1 has the larger sample variance (21.87) and thus it will be put in the numerator.

F = 21.87/15.36 = 1.42

To determine whether or not we reject our null hypothesis, we need to compare our calculated F to a critical F value from the table. The critical value will have an alpha(2) = 0.05; (2) because we are doing a two-tailed test. The numerator degrees of freedom is equal to n_{1} - 1 = 10 and the denominator degrees of freedom is equal to n_{2} - 1 = 7.

F critical = F_{0.05(2),10,7} = 4.76

This value is on page 29 of the appendix (the page for numerator degrees of freedom equal to 10). Use the column for alpha(2) = 0.05 and the row for 7 degrees of freedom. The intersection of this column and row yields the value 4.76.

As our calculated F value (1.42) is less than the critical F value (4.76), we fail to reject the null hypothesis that the population variances are equal. We can now pool our variances and proceed with the t-test.
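The variance ratio test above can be sketched in a few lines of Python. This is only an illustration: the function and variable names are made up for this example, the sample size of 8 for sample 2 is inferred from the 7 denominator degrees of freedom, and the critical value is simply copied from the table rather than computed.

```python
# Summary statistics quoted in the notes (body weight, kg).
n1, var1 = 11, 21.87   # sample 1: size and sample variance
n2, var2 = 8, 15.36    # sample 2: size inferred from 7 degrees of freedom


def variance_ratio(var_a, n_a, var_b, n_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


F, df_num, df_den = variance_ratio(var1, n1, var2, n2)

# Critical value copied from the F table as quoted in the notes:
# alpha(2) = 0.05 with 10 and 7 degrees of freedom.
F_CRITICAL = 4.76

reject_equal_variances = F >= F_CRITICAL
print(f"F = {F:.2f}, df = ({df_num}, {df_den}), reject H0: {reject_equal_variances}")
```

Putting the larger variance in the numerator inside the function means the order in which the two samples are passed does not matter.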

The formula for the t-value is

t = (Xbar_{1} - Xbar_{2}) / s_{Xbar1 - Xbar2}     (8.1)

The first quantity to be determined is s_{Xbar1 - Xbar2}.

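To do this we first need the pooled variance, s_{p}^{2}.

s_{p}^{2} = (SS_{1} + SS_{2}) / (v_{1} + v_{2})

where v_{1} = n_{1} - 1 and v_{2} = n_{2} - 1.

Remember that SS = (s^{2})(v), so SS_{1} = (21.87)(10) = 218.70 and SS_{2} = (15.36)(7) = 107.52.

s_{p}^{2} = (218.70 + 107.52) / (10 + 7) = 326.22/17 = 19.19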

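The pooled variance is then used to calculate the standard error of the difference between the means, s_{Xbar1 - Xbar2}.

s_{Xbar1 - Xbar2} = sqrt(s_{p}^{2}/n_{1} + s_{p}^{2}/n_{2})

s_{Xbar1 - Xbar2} = sqrt(19.19/11 + 19.19/8) = sqrt(4.143) = 2.035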

This quantity (2.035) is then substituted along with the Xbars into equation 8.1

t = (4.2 - 5.6)/2.035 = -0.687
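The hand calculation of the t-value can be checked with a short Python sketch. The variable names are illustrative only, and the sample size of 8 for sample 2 is inferred from the degrees of freedom used elsewhere in these notes. Because the code carries more decimal places than the hand calculation, the last digit of t may differ slightly from the value above.

```python
import math

# Summary statistics quoted in the notes (body weight, kg).
n1, mean1, var1 = 11, 4.2, 21.87
n2, mean2, var2 = 8, 5.6, 15.36   # n2 inferred from 7 degrees of freedom

# Degrees of freedom for each sample.
v1, v2 = n1 - 1, n2 - 1

# Sums of squares recovered from the variances: SS = s^2 * v.
ss1 = var1 * v1
ss2 = var2 * v2

# Pooled variance and standard error of the difference between the means.
pooled_var = (ss1 + ss2) / (v1 + v2)
se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)

# t-value (equation 8.1) and its degrees of freedom.
t = (mean1 - mean2) / se_diff
df = v1 + v2

print(f"pooled variance = {pooled_var:.2f}")
print(f"standard error  = {se_diff:.3f}")
print(f"t = {t:.3f}, df = {df}")
```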

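We now need to compare our calculated t-value to a critical value from the t table (table B3). For alpha(2) = 0.05 and 17 degrees of freedom (v_{1} + v_{2} = 10 + 7), the critical value is 2.110. As the absolute value of our calculated t (0.687) is less than 2.110, we fail to reject the null hypothesis that the population means are equal.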

Use SigmaStat to run a two-sample t-test.

The t-test is quite robust, meaning that it tolerates moderate departures from its assumptions.

There is a modified formula for the t-value, t', that can be used when the population variances are not equal.

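In the above example we made the choice to reject or not reject based upon an alpha(2) = 0.05. We can also look at this decision based on P values. If the P value is less than or equal to alpha, we reject the null hypothesis; if it is greater than alpha, we do not reject it.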

The calculated t-value in the above example was -0.687. As we are doing a two-tailed test and the curve is symmetrical, we can use the value 0.687 when determining the P value. In table B3, find the row for the degrees of freedom of our test (in this case 17). Go across the row, comparing the numbers in the table to our calculated value. You are looking for the first value that is larger than the calculated t-value. In our case the first entry in the row is 0.689, which is greater than our calculated t-value, so we do not need to look any further. If you go to the top of the column containing 0.689, you see that the alpha(2) value associated with 0.689 is 0.50. Also note that as the t-values get larger, the associated alpha(2)s get smaller. Thus, as 0.687 is less than 0.689, our P value is greater than 0.50, and based on the table we can say that P > 0.50. This is nowhere near the cut-off of 0.05.

If our calculated t-value had been, say, 1.897, it would have fallen between the values 1.740 and 2.110, which are associated with alpha(2) values of 0.10 and 0.05, respectively. We could then say that P is greater than 0.05 and less than 0.10 (0.05 < P < 0.10). This is still not a significant difference between the means.
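The table lookup described above can be written as a small Python sketch. It is only an illustration: the function names are made up, and the row contains just the three table entries for 17 degrees of freedom that are quoted in these notes, so the brackets it reports can be coarser than those from the full table.

```python
# (alpha(2), critical t) pairs for 17 degrees of freedom, ordered by
# decreasing alpha. Only the three entries quoted in the notes are included.
ROW_DF17 = [(0.50, 0.689), (0.10, 1.740), (0.05, 2.110)]


def bracket_p(t_value, row):
    """Return (lower, upper) bounds on the two-tailed P value.

    None means the table row gives no bound on that side.
    """
    t_abs = abs(t_value)   # the curve is symmetrical, so use |t|
    lower, upper = None, None
    for alpha, critical in row:
        if t_abs >= critical:
            upper = alpha      # t reaches this column, so P is at most alpha
        else:
            lower = alpha      # first larger table value: P exceeds this alpha
            break
    return lower, upper


def describe_p(t_value, row):
    """Format the bracket the way it is usually reported."""
    lower, upper = bracket_p(t_value, row)
    if upper is None:
        return f"P > {lower:.2f}"
    if lower is None:
        return f"P < {upper:.2f}"
    return f"{lower:.2f} < P < {upper:.2f}"


print(describe_p(-0.687, ROW_DF17))
print(describe_p(1.897, ROW_DF17))
```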

Most computer programs calculate the exact value of P and include this in the output. Thus for our example, the computer may have said that the P value was equal to 0.52 (P = 0.52). This P value is usually reported when presenting the results of statistical tests in publications. It might look like this (t = -0.687, df = 17, P = 0.52). It will be important for you to be able to determine the range for P values from the statistical tables.

Study sections 8.2, 8.3, and 8.4 for the concept of power and how power can be determined. However, I will not expect you to remember these formulas or to be able to do these calculations on a test.

Do problems 8.1, 8.2, 8.3 and 8.4 at the end of chapter 8 (page 159). Use SigmaStat to do problems 8.1 and 8.2.

While the t-test is quite robust, you can deviate so far from the assumptions that the power becomes too small or the test begins to perform poorly in other ways. Sometimes coding or transforming the data may produce a normal distribution, but we will not spend any time on the transformation of data. If your data do not conform to the assumptions, you can do a non-parametric test, which is a test that makes no assumptions about the nature of the underlying distributions (normality and equality of variances). In general, the Mann-Whitney test is more powerful when the assumptions of a t-test are not met, and a t-test is more powerful when the assumptions are met. You want to choose the test that has the highest power, that is, the greatest ability to reject a false null hypothesis.

Just as the two-sample t-test is the appropriate parametric test when you have two samples and you want to test whether the population means are equal, the Mann-Whitney test is the appropriate non-parametric test for two samples. The test is carried out on the ranks of the measurements and not on the original measurements. You will note that the null hypothesis for a Mann-Whitney test does not say anything about the population means. Remember, mu is a population parameter, and as this test is non-parametric, it does not address whether the means are equal. For example, if you were looking at the difference in body weight of raccoons from Kansas in comparison to those from North Dakota and you had determined that your data fit the assumptions of a t-test, your null hypothesis might be

H_{0}: mu_{Kansas} = mu_{North Dakota}

H_{A}: mu_{Kansas} =/ mu_{North Dakota}

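However, if you had determined that the data did not fit the assumptions of a t-test and you were to do a Mann-Whitney test, your null hypothesis would be

H_{0}: The body weights of raccoons from Kansas and North Dakota are the same.

H_{A}: The body weights of raccoons from Kansas and North Dakota are not the same.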

I will not expect you to be able to do the calculations of the Mann-Whitney test on a test; instead, I will expect you to be able to interpret the output from SigmaStat. However, you should pay close attention to how you look up critical values to reject a null hypothesis, how to do a one-tailed test, and when the normal approximation to a Mann-Whitney test can be used. Some computer programs automatically calculate the normal approximation. Be sure to study examples 8.13 and 8.14 in your textbook.
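To see what a program does with the ranks, here is a minimal Python sketch of the Mann-Whitney U statistic. It is not SigmaStat's procedure. The body weights are hypothetical values invented for illustration, the function names are made up, and the sketch uses one common form of the statistic, U = n_{1}n_{2} + n_{1}(n_{1} + 1)/2 - R_{1}, with ranks assigned from smallest to largest and tied values given their mean rank.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward; ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (U, U') computed from the rank sum of sample1."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])                     # sum of ranks in sample 1
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u_prime = n1 * n2 - u
    return u, u_prime


# Hypothetical body weights (kg), invented purely for illustration.
kansas = [3.1, 4.0, 4.4, 5.2]
north_dakota = [4.8, 5.5, 6.1, 6.3, 7.0]

u, u_prime = mann_whitney_u(kansas, north_dakota)
print(f"U = {u}, U' = {u_prime}")
```

The statistic would then be compared with a critical value from the Mann-Whitney table for the two sample sizes, which this sketch does not attempt.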

Use SigmaStat to run a Mann-Whitney Test.

Do problems 8.12 and 8.13 at the end of chapter 8 (page 160). Use SigmaStat to do problems 8.12 and 8.13.

Last updated on 27 September 2000.