Multisample hypotheses
Chapter 10

Often you will have an experimental design that includes comparisons among more than two samples. For example, you may wish to test whether the mean raccoon weight is the same at five locations (five samples from five different locations). The statistical tools that you have to this point will not allow you to do this. You might be tempted to test whether the means of the five populations are equal by doing two-sample t-tests for all possible pairs of locations. However, this would lead to a very large increase in the chance of making a Type I error.

The problem is that for each t-test there is a 5% chance of falsely rejecting a true null hypothesis (Type I error). You would need to do 10 such tests in order to test all possible pairs, with each test having a 5% chance of a Type I error. For a series of 10 t-tests with alpha set at 0.05, there would be about a 40% chance (1 - 0.95^10 = 0.40, treating the tests as independent) that at least one of the tests was a Type I error (see table 10.1). Thus it becomes far too likely that you will conclude that there is a difference among these populations when, in fact, there is none. We clearly need a test that can do simultaneous comparisons among the five populations without increasing the chance of making a Type I error. This method is called analysis of variance.
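The growth in the Type I error rate can be sketched in a few lines of Python. This is only an illustration: the function name familywise_error is my own, and the formula treats the pairwise tests as independent, which is an approximation.

```python
from math import comb

def familywise_error(k_groups, alpha=0.05):
    """Chance of at least one Type I error when every pair of
    k_groups means is compared with its own t-test.
    Assumes the tests are independent (an approximation)."""
    n_tests = comb(k_groups, 2)            # number of distinct pairs
    return 1 - (1 - alpha) ** n_tests

# Number of groups, number of pairwise tests, chance of at least one error
for k in (2, 3, 5, 7, 10):
    print(k, comb(k, 2), round(familywise_error(k), 2))
```

The more groups you have, the more pairs there are to test, and the faster the chance of at least one false rejection climbs.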

Analysis of Variance (chapter 10.1)

Analysis of variance (ANOVA) is a statistical test that allows you to test the null hypothesis that all means in a study are equal (Ho: mu1 = mu2 = mu3 = ... = muk). Analysis of variance tests this hypothesis by comparing two estimates of the variance, hence the term analysis of variance.

One of the estimates of the variance is the "average variance" based upon the variances of each group. This term is essentially like the pooled variance term in a two-sample t-test. The other estimate is based upon the calculation of a standard error that measures how much each sample mean differs from the grand mean for the data. This standard error can then be used to estimate a population variance. If the five samples of raccoons above came from populations whose means are equal then these two variances will estimate the same quantity and thus be very close to equal. However, if the means among the populations are different, then the variance estimated from the standard error will be much larger than the "pooled variance". The ratio of these two variances provides a value (F-value), with which we can evaluate the null hypothesis.

We will now look at these two variances. Each will be calculated in a way that shows the source of the variation. Bear in mind that the method normally used to calculate these values is very different from the method that I will use first, but this first method will help to explain how ANOVA works. We will use the data in the table below.

Note that there are seven groups with five observations in each group. There are several quantities that we will use for the subsequent calculations and these are defined below.

```
group                 1        2        3        4        5        6        7

                     41       48       40       40       49       40       41
                     44       49       50       39       41       48       46
                     48       49       44       46       50       51       54
                     43       49       48       46       39       47       44
                     42       45       50       41       42       51       42

sum of the Xs       218      240      232      212      221      237      227
Xbar               43.6     48.0     46.4     42.4     44.2     47.4     45.4
Xbar^2          1900.96  2304.00  2152.96  1797.76  1953.64  2246.76  2061.16
sum X^2            9534    11532    10840     9034     9867    11315    10413
sum squares        29.2     12.0     75.2     45.2     98.8     81.2    107.2
```

Sum of the Xs is simply the sum of the observations for each group. Xbar is the sample mean for each group. Xbar^2 is the sample mean squared. Sum X^2 is the sum of the squared observations. Sum squares is the sum of squares as previously defined when we discussed the variance.
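
If you want to check the summary rows yourself, here is a minimal Python sketch. The variable and function names (groups, summarize) are my own, and the sum of squares uses the usual shortcut formula, sum X^2 - (sum X)^2/n.

```python
# One list per group (the columns of the data table, groups 1 to 7).
groups = [
    [41, 44, 48, 43, 42],
    [48, 49, 49, 49, 45],
    [40, 50, 44, 48, 50],
    [40, 39, 46, 46, 41],
    [49, 41, 50, 39, 42],
    [40, 48, 51, 47, 51],
    [41, 46, 54, 44, 42],
]

def summarize(xs):
    """Return the summary quantities of the table for one group."""
    n = len(xs)
    total = sum(xs)                       # sum of the Xs
    mean = total / n                      # Xbar
    sum_x2 = sum(x * x for x in xs)       # sum of the squared observations
    ss = sum_x2 - total ** 2 / n          # sum of squares (shortcut formula)
    return {"sum": total, "mean": mean, "mean_sq": mean ** 2,
            "sum_x2": sum_x2, "ss": ss}

summaries = [summarize(g) for g in groups]
for i, s in enumerate(summaries, start=1):
    print(i, s["sum"], round(s["mean"], 1), round(s["mean_sq"], 2),
          s["sum_x2"], round(s["ss"], 1))
```

Each printed line gives the group number followed by the five summary quantities in the same order as the rows of the table.
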
The first variance, the "average variance" based upon the variances of each group, could be called the within groups variance; however, the more common term is the within groups mean square.
To calculate this we first need the within groups sum of squares, which is the sum of the sums of squares across all groups: (29.2 + 12.0 + 75.2 + 45.2 + 98.8 + 81.2 + 107.2) = 448.8.
Next we need the within groups degrees of freedom, which is the sum of (n - 1) across all groups: (4 + 4 + 4 + 4 + 4 + 4 + 4) = 28. Another way to calculate the degrees of freedom is the total number of observations (N) minus the number of groups: (35 - 7) = 28. The within groups variance (within groups mean square) is given by the within groups sum of squares divided by the within groups degrees of freedom: 448.8/28 = 16.029.
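
The within groups calculation can be sketched in Python as follows. This is only an illustration with names of my own choosing (sum_of_squares, ss_within, and so on); it computes each group's sum of squares directly from the deviations rather than from the shortcut formula.

```python
# One list per group (the columns of the data table).
groups = [
    [41, 44, 48, 43, 42],
    [48, 49, 49, 49, 45],
    [40, 50, 44, 48, 50],
    [40, 39, 46, 46, 41],
    [49, 41, 50, 39, 42],
    [40, 48, 51, 47, 51],
    [41, 46, 54, 44, 42],
]

def sum_of_squares(xs):
    """Sum of squared deviations of the observations from their own mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

ss_within = sum(sum_of_squares(g) for g in groups)   # add up the group SS
df_within = sum(len(g) - 1 for g in groups)          # same as N - k
ms_within = ss_within / df_within                    # within groups mean square

print(round(ss_within, 1), df_within, round(ms_within, 3))
```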

The second variance is based upon the difference of each group mean from the grand mean (the mean of all 35 observations, 1587/35 = 45.343). We will first calculate the variance among the means: Sum((Xbar - grand mean)^2)/(number of groups - 1) = 25.417/6 = 4.2362. This value, 4.2362, is essentially a variance of the means (s_Xbar^2), and to convert it to an estimate of the population variance (s^2) we need to multiply by the sample size of each group: s^2 = n x s_Xbar^2 = 5 x 4.2362 = 21.181. This variance term could be called the among groups variance or, more commonly, the among groups mean square.
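
Here is the same among groups calculation as a short Python sketch. The names are my own, and the sketch assumes equal sample sizes in every group, as in our data; with unequal sample sizes the simple "multiply by n" step no longer applies.

```python
# One list per group (the columns of the data table).
groups = [
    [41, 44, 48, 43, 42],
    [48, 49, 49, 49, 45],
    [40, 50, 44, 48, 50],
    [40, 39, 46, 46, 41],
    [49, 41, 50, 39, 42],
    [40, 48, 51, 47, 51],
    [41, 46, 54, 44, 42],
]

k = len(groups)                       # number of groups
n = len(groups[0])                    # observations per group (assumed equal)

group_means = [sum(g) / len(g) for g in groups]
grand_mean = sum(sum(g) for g in groups) / (k * n)

# Variance of the group means around the grand mean
var_of_means = sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)

# Scale up to an estimate of the population variance
ms_among = n * var_of_means           # among groups mean square

print(round(grand_mean, 3), round(var_of_means, 4), round(ms_among, 3))
```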

We now have two estimates of the population variance, one based upon the variance within the groups and one based upon the variance among the means. If the means of the groups are equal, then these two variances should be approximately equal. If the means of the groups are not equal, then the among groups variance term should be much larger than the within groups variance.
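
One way to convince yourself of this is a small simulation. The sketch below is my own illustration, not part of the textbook method: it draws seven groups of five observations from normal populations (the mean of 45, the standard deviation of 4, and the shifts of 6 units are arbitrary choices) and averages the ratio of the two variance estimates over many repetitions.

```python
import random
from statistics import mean, variance

def variance_ratio(groups):
    """Among groups mean square divided by within groups mean square.
    Assumes every group has the same sample size."""
    n = len(groups[0])
    group_means = [mean(g) for g in groups]
    ms_within = mean([variance(g) for g in groups])   # average group variance
    ms_among = n * variance(group_means)              # n x variance of means
    return ms_among / ms_within

def average_ratio(shifts, n=5, sd=4.0, reps=2000, seed=1):
    """Average variance ratio over many simulated experiments.
    shifts gives how far each population mean sits above 45."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        groups = [[rng.gauss(45 + s, sd) for _ in range(n)] for s in shifts]
        ratios.append(variance_ratio(groups))
    return mean(ratios)

equal_means = average_ratio([0, 0, 0, 0, 0, 0, 0])
unequal_means = average_ratio([0, 0, 0, 0, 6, 6, 6])
print(round(equal_means, 2), round(unequal_means, 2))
```

When all population means are equal the average ratio should sit close to 1, and when some means are shifted it should be clearly larger.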

To evaluate the relative magnitude of these two numbers, we will calculate a ratio with the among groups mean square in the numerator and the within groups mean square in the denominator. This ratio is called an F-value (remember the variance ratio test). We compare the F-value to a critical value to decide whether to reject our null hypothesis that all of the means are equal. In this case the F-value that we calculate is equal to 21.181/16.029 = 1.32. This value is compared to a critical F-value with alpha(1) = 0.05 and 6 and 28 degrees of freedom. Note that we always look up a one-tailed value: F(alpha(1) = 0.05; df = 6, 28) = 2.45. As our calculated F-value is less than the critical F-value, we fail to reject our null hypothesis.
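
The decision rule can be written as a tiny Python function. The function name f_test_decision is my own, and the critical value is passed in by hand; the 2.45 used here is the one-tailed value I read from an F table for alpha = 0.05 and 6 and 28 degrees of freedom, so check it against your own table.

```python
def f_test_decision(ms_among, ms_within, f_critical):
    """Return the F-value and whether the null hypothesis is rejected."""
    f_value = ms_among / ms_within
    reject = f_value > f_critical      # one-tailed: only large F counts
    return f_value, reject

# Mean squares from the worked example; critical value looked up by hand
# (assumption: one-tailed, alpha = 0.05, df = 6 and 28).
f_value, reject = f_test_decision(21.181, 16.029, f_critical=2.45)
print(round(f_value, 2), "reject Ho" if reject else "fail to reject Ho")
```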

An important point to remember is that the sum of squares and degrees of freedom are additive. The within groups sum of squares plus the among groups sum of squares equals the total sum of squares (equation 10.8). In addition, the within groups degrees of freedom plus the among groups degrees of freedom equals the total degrees of freedom (equation 10.9). Using the formulas presented in table 10.2, we can summarize our results in what is called an ANOVA table.

source of variation   sum of squares   degrees of freedom   mean square   F-value
total                        575.886                   34
among groups                 127.086                    6        21.181      1.32
within groups                448.800                   28        16.029
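
The additivity of the sums of squares and the degrees of freedom can be checked directly from the raw data. The Python sketch below is only an illustration with names of my own choosing; it computes the total, among groups, and within groups sums of squares separately and then compares them.

```python
# One list per group (the columns of the data table).
groups = [
    [41, 44, 48, 43, 42],
    [48, 49, 49, 49, 45],
    [40, 50, 44, 48, 50],
    [40, 39, 46, 46, 41],
    [49, 41, 50, 39, 42],
    [40, 48, 51, 47, 51],
    [41, 46, 54, 44, 42],
]

all_obs = [x for g in groups for x in g]
n_total = len(all_obs)
k = len(groups)
grand_mean = sum(all_obs) / n_total

# Total SS: every observation against the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

# Among groups SS: each group mean against the grand mean, weighted by n
ss_among = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within groups SS: every observation against its own group mean
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_total, df_among, df_within = n_total - 1, k - 1, n_total - k

print(round(ss_total, 3), round(ss_among, 3), round(ss_within, 3))
print(df_total, df_among, df_within)
```

The among groups and within groups values should add up to the total in both printed rows.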

On tests, I will not expect you to be able to generate the sum of squares values from raw data; however, I will expect you to be able to do the calculations within the ANOVA table. For instance, if I gave you the sums of squares and the sample sizes, you could determine the correct degrees of freedom, calculate the mean squares and the F-value, and then determine whether to reject the null hypothesis.
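
As a way to practice, the table calculations can be sketched in Python. The function anova_table and its argument names are hypothetical, made up for this illustration; it takes the two sums of squares, the number of groups, and the total number of observations, and fills in the rest of the table.

```python
def anova_table(ss_among, ss_within, k, n_total):
    """Fill in a one-way ANOVA table from the sums of squares.
    k is the number of groups, n_total the total number of observations."""
    df_among = k - 1
    df_within = n_total - k
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    return {
        "total": {"ss": ss_among + ss_within, "df": df_among + df_within},
        "among": {"ss": ss_among, "df": df_among, "ms": ms_among},
        "within": {"ss": ss_within, "df": df_within, "ms": ms_within},
        "F": ms_among / ms_within,
    }

# Values from the worked example: 7 groups, 35 observations in total
table = anova_table(ss_among=127.086, ss_within=448.8, k=7, n_total=35)
for source in ("total", "among", "within"):
    print(source, {key: round(val, 3) for key, val in table[source].items()})
print("F =", round(table["F"], 2))
```

You would then compare the F-value with the one-tailed critical value for the among groups and within groups degrees of freedom.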

Use SigmaStat to do the following exercise in ANOVA.
Last updated on 21 July 1999.
Provide comments to Dwight Moore at mooredwi@emporia.edu.