Two-way Analysis of Variance
Chapter 12

When we first introduced ANOVA, we looked at designs in which there were several groups and the groups differed on a single factor (location in raccoons, for example). However, it is possible, and often desirable, to look at the effect of two factors at the same time (location and sex in raccoons). This is called a factorial analysis of variance or two-way analysis of variance. While the test for significance among the groups for each factor is essentially the same as for a one-way ANOVA, you will also be able to test for interactions between the factors, something that is not possible with multiple one-way tests.

In addition, the two-factor design will also be used to derive the repeated measures and randomized block designs for single-factor ANOVAs.

In a two-factor (factorial) design, every combination of the levels of the two factors is present, and a pool of subjects is randomly assigned to the combinations. For example, in a study examining the effects of different brands of fertilizer (factor A), with 4 brands, and different brands of pesticide (factor B), with 3 brands, plants would be randomly assigned to one of the 12 (4 x 3) treatment combinations. If we have equal sample sizes (say 4) in all cells, then we would need 48 plants to assign to the 12 combinations (cells). Factor A is said to have 4 levels (designated by the letter a) and factor B is said to have 3 levels (designated by the letter b). These levels were fixed by the experimenter, so this is a Model I or fixed-effects ANOVA for both factors. The calculations for a two-factor design are much easier, and the power of the test is higher, if the sample sizes are equal.
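The layout of such a design can be sketched in a few lines of Python (the brand labels here are hypothetical placeholders):

```python
from itertools import product

fertilizers = ["F1", "F2", "F3", "F4"]   # factor A, a = 4 levels
pesticides = ["P1", "P2", "P3"]          # factor B, b = 3 levels
n = 4                                    # plants (replicates) per cell

# every combination of a fertilizer with a pesticide is one cell
cells = list(product(fertilizers, pesticides))

print(len(cells))        # 12 cells
print(len(cells) * n)    # 48 plants needed in total
```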

With this design we have three null hypotheses to test.
1) Ho: mu of fertilizer 1 = mu of fertilizer 2 = mu of fertilizer 3 = mu of fertilizer 4.
2) Ho: mu of pesticide 1 = mu of pesticide 2 = mu of pesticide 3
3) Ho: there is no interaction between fertilizer and pesticide

The first two null hypotheses were covered in chapter 10, and they are tested just as if you had run a one-factor experiment with a sample size of 12 for each fertilizer (3 pesticides x 4 replicates) or a sample size of 16 for each pesticide (4 fertilizers x 4 replicates). Multiple comparison tests can be run, just as before, if we reject one or both of these null hypotheses. The third null hypothesis, of no interaction, is new and needs some additional explanation.
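For a balanced design like this one, the sums of squares and degrees of freedom behind all three tests can be computed directly. A minimal sketch with NumPy, using simulated (purely illustrative) growth data:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, n = 4, 3, 4                      # levels of factor A, factor B, replicates per cell

# hypothetical growth data, one value per plant, shape (a, b, n)
data = rng.normal(loc=50, scale=3, size=(a, b, n))

grand = data.mean()
cell = data.mean(axis=2)               # cell means, shape (a, b)
A_mean = data.mean(axis=(1, 2))        # marginal means for factor A levels
B_mean = data.mean(axis=(0, 2))        # marginal means for factor B levels

SS_A = b * n * ((A_mean - grand) ** 2).sum()
SS_B = a * n * ((B_mean - grand) ** 2).sum()
SS_AB = n * ((cell - A_mean[:, None] - B_mean[None, :] + grand) ** 2).sum()
SS_within = ((data - cell[:, :, None]) ** 2).sum()
SS_total = ((data - grand) ** 2).sum()

df_A, df_B = a - 1, b - 1
df_AB = (a - 1) * (b - 1)
df_within = a * b * (n - 1)

F_A = (SS_A / df_A) / (SS_within / df_within)
F_B = (SS_B / df_B) / (SS_within / df_within)
F_AB = (SS_AB / df_AB) / (SS_within / df_within)

# the four components partition the total sum of squares
print(np.isclose(SS_A + SS_B + SS_AB + SS_within, SS_total))  # True
```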

Interactions among factors

As this is a fixed-effects model, we would expect the means among the different levels of a factor to differ by constant amounts, if they differ at all. Let us suppose that, in general, relative to fertilizer 1, fertilizer 2 is greater by 4, fertilizer 3 is less by 6, and fertilizer 4 is greater by 7. If there were no interaction between pesticide and fertilizer, we would expect this relationship to hold regardless of the brand of pesticide being tested: even though plants with pesticide B are larger overall, fertilizer 2 is still greater than fertilizer 1 by 4, fertilizer 3 is less by 6, and fertilizer 4 is greater by 7. In this case the effect of different brands of fertilizer is independent of the effect caused by different brands of pesticide; thus, there is no interaction between the two factors. The figure below is a plot of the cell means (each combination of a fertilizer with a pesticide is called a cell). The three different lines correspond to the three different pesticides, with the brands of fertilizer plotted on the x-axis. You can see that the lines are parallel, which means that there is no interaction.

An interaction would mean that the effect of the different fertilizers depends upon which brand of pesticide is being used. For example, when brand C pesticide is used, relative to fertilizer 1, fertilizer 2 is decreased by 13, fertilizer 3 is decreased by 3, and fertilizer 4 is decreased by 7. However, when brand A pesticide is used, relative to fertilizer 1, fertilizer 2 is greater by 4, fertilizer 3 is less by 6, and fertilizer 4 is greater by 7. In the figure below, this interaction can be seen in that the lines are no longer parallel. With an interaction, we cannot say for sure which fertilizer gives the highest plant growth, as that depends on which pesticide is being used.
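The parallel-lines check can be made concrete with a small sketch. The cell means below are hypothetical numbers chosen only to match the offsets described above:

```python
# hypothetical cell means: rows are pesticides A, B, C; columns are fertilizers 1-4
cell_means = {
    "A": [50, 54, 44, 57],
    "B": [60, 64, 54, 67],
    "C": [55, 42, 52, 48],
}

# effect of each fertilizer relative to fertilizer 1, within each pesticide
effects = {p: [m - row[0] for m in row] for p, row in cell_means.items()}

print(effects["A"])   # [0, 4, -6, 7]
print(effects["B"])   # [0, 4, -6, 7]   same pattern as A: lines are parallel
print(effects["C"])   # [0, -13, -3, -7]  different pattern: interaction
```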

Without doing a two factorial analysis of the data, we would not have been able to test for this interaction. The effect of an interaction is to make the discussion of your results more complicated in that statement about the effect of one factor must be interpreted in light of the level of a second factor.

The results of a two-factor analysis of variance for the experiment above, in which the graph of the cell means did not produce parallel lines, can be summarized in the ANOVA table below. Note carefully how the degrees of freedom, the mean square terms, and the F-values are calculated, and how the P-values are looked up.

source of variation     sum of squares   degrees of freedom    mean square   F-value   P-value
among groups, factor A       600         a - 1 = 3                 200        14.40    P < 0.0005
among groups, factor B       300         b - 1 = 2                 150        10.80    P < 0.0005
interaction of A x B         300         (a - 1)(b - 1) = 6         50         3.60    0.005 < P < 0.01
within groups                500         (a x b)(n - 1) = 36        13.89
total                       1700         N - 1 = 47
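Assuming SciPy is available, the F-values and P-values in this table can be reproduced from the sums of squares and degrees of freedom; `scipy.stats.f.sf` gives the upper-tail area of the F distribution, replacing the table lookup:

```python
from scipy.stats import f

MS_within = 500 / 36                      # within-groups mean square, about 13.89

# (sum of squares, degrees of freedom) for each tested source of variation
sources = {
    "factor A":    (600, 3),
    "factor B":    (300, 2),
    "interaction": (300, 6),
}

for name, (ss, df) in sources.items():
    F = (ss / df) / MS_within             # mean square / within-groups mean square
    p = f.sf(F, df, 36)                   # upper-tail probability of the F distribution
    print(f"{name}: F = {F:.2f}, P = {p:.5f}")
```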

For the first null hypothesis that concerns the differences among the means for fertilizer (factor A), we reject this null hypothesis.
Ho: mu of fertilizer 1 = mu of fertilizer 2 = mu of fertilizer 3 = mu of fertilizer 4.
The mean square for factor A divided by the within-groups mean square equals 200/13.89, yielding an F value of 14.40. Looking this value up for 3 and 36 degrees of freedom (we will use 3 and 35 degrees of freedom, as the table does not give an entry for 36), we find that we reject the null hypothesis with P < 0.0005.

For the second null hypothesis, which concerns the differences among the means for pesticide (factor B), we also reject the null hypothesis.
Ho: mu of pesticide 1 = mu of pesticide 2 = mu of pesticide 3
The mean square for factor B divided by the within groups mean square equals 150/13.89 to yield an F value of 10.80. Looking this value up for 2 and 36 degrees of freedom (we will use 2 and 35 degrees of freedom as the table does not give an entry for 36), we find that we reject the null hypothesis with P < 0.0005.

For the third null hypothesis that there is no interaction between factor A and factor B, we also reject this null hypothesis.
Ho: there is no interaction between fertilizer and pesticide.
The mean square for the interaction divided by the within groups mean square equals 50/13.89 to yield an F value of 3.60. Looking this value up for 6 and 36 degrees of freedom (we will use 6 and 35 degrees of freedom as the table does not give an entry for 36), we find that we reject the null hypothesis with 0.005 < P < 0.01.

Be sure and study example 12.1 and the ANOVA summary table in example 12.2.

Zar gives a nice discussion of replication within two-factor designs. The bottom line is that when designing two-factor experiments, you need to be very careful that your replication scheme is appropriate. While some computer programs can handle disproportional replication, many cannot. Zar discusses ways to estimate missing data and the guidelines for using such estimates. I will not expect you to know these formulas, but you should be able to recognize a proportional design versus a disproportional design and discuss the problems associated with unequal replication.

Two-way ANOVA without replication (chapter 12.3)

You might have a design in which the sample size for each cell is 1, that is, there is no replication within each cell. In this case, you can still test the null hypotheses that are associated with differences among the means for factor A and for factor B. However, you cannot test whether there is an interaction between the two factors. In fact, the analysis proceeds under the additional assumption that there is no interaction between the two factors. If an interaction does exist, then the results of the tests of the null hypotheses must be interpreted with caution (see tables 12.5 and 12.6).

Randomized block design (chapter 12.5)

In this design, an experimenter assigns subjects to blocks with the assumption that each block is homogeneous within itself but varies in some way (a way we are not interested in) from the other blocks. Observations within a block are related to one another in a way that observations in different blocks are not. This is, in fact, the multisample analog of the paired-sample t-test.

For example, let us suppose that we wish to study the effect on growth of 4 brands of fertilizer. We could randomly assign plants to one of the four fertilizers and place them randomly in the greenhouse (a rather straightforward, completely randomized one-way design); however, it is clear that because of the orientation of the greenhouse, one end is much sunnier and warmer than the other. To control for this we are going to divide the greenhouse into six sections. Each section differs from the others, but within a section we are reasonably sure that the environment is homogeneous. We now randomly assign plants to the six blocks, with four plants in each block, and within each block each plant is randomly assigned to a different fertilizer. Thus we need 24 (4 x 6) plants for this experiment. Zar discusses the use of random number tables to carry out the process of assigning subjects to blocks and treatment groups.
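The assignment process can be sketched with Python's standard library in place of a random number table (plant numbers and brand labels here are hypothetical):

```python
import random

random.seed(1)
fertilizers = ["F1", "F2", "F3", "F4"]
n_blocks = 6

plants = list(range(24))          # 24 plants, identified by number
random.shuffle(plants)            # randomize which plants land in which block

assignment = {}                   # block -> {fertilizer brand: plant}
for block in range(n_blocks):
    group = plants[block * 4:(block + 1) * 4]
    ferts = fertilizers[:]
    random.shuffle(ferts)         # within the block, randomize plant-to-brand pairing
    assignment[block] = dict(zip(ferts, group))

print(assignment[0])
```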

The analysis can be carried out as a Model III ANOVA without replication. We are only interested in testing the null hypothesis of no difference in mean plant growth due to fertilizer; we are not interested in any difference that might exist among blocks (positions in the greenhouse). We must also assume that there is no interaction between the block effect and the fertilizer effect. The ANOVA table would look like the one below.

source of variation   sum of squares   degrees of freedom    mean square   F-value   P-value
among fertilizers          600         a - 1 = 3                 200         6.00    0.005 < P < 0.01
among blocks              1000         b - 1 = 5                 200
within groups              500         (a - 1)(b - 1) = 15        33.33
total                     2100         N - 1 = 23
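The F-value in this table follows directly from the sums of squares and degrees of freedom:

```python
# values from the randomized-block ANOVA table above
SS_fert, df_fert = 600, 3
SS_blocks, df_blocks = 1000, 5     # blocks are not of interest, so no F is computed
SS_within, df_within = 500, 15

MS_fert = SS_fert / df_fert        # 200.0
MS_within = SS_within / df_within  # about 33.33
F = MS_fert / MS_within            # tests the fertilizer null hypothesis
print(round(F, 2))                 # 6.0
```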

For the null hypothesis that concerns the differences among the means for fertilizer, we reject this null hypothesis.
Ho: mu of fertilizer 1 = mu of fertilizer 2 = mu of fertilizer 3 = mu of fertilizer 4.
The mean square for fertilizer divided by the within-groups mean square equals 200/33.33, yielding an F value of 6.00. Looking this value up for 3 and 15 degrees of freedom, we find that we reject the null hypothesis with 0.005 < P < 0.01.

If we had analyzed the data with a one-way ANOVA, the among-blocks variation would have been combined with the within-groups source of variation, and we would have been unable to reject our null hypothesis of no difference among the means due to fertilizer, as the table below shows. Thus, just as a paired-sample t-test is more powerful than a two-sample t-test, a randomized block design is more powerful than a completely randomized one-way design.

source of variation   sum of squares   degrees of freedom    mean square   F-value   P-value
among fertilizers          600         a - 1 = 3                 200         2.67    0.05 < P < 0.1
within groups             1500         a(n - 1) = 20              75
total                     2100         N - 1 = 23
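The loss of power is visible in the arithmetic: pooling the block variation into the error term more than doubles the within-groups mean square, and the F-value drops below significance:

```python
# one-way reanalysis: block variation (SS = 1000, df = 5) is pooled into error
SS_within = 500 + 1000                 # 1500
df_within = 15 + 5                     # a(n - 1) = 20
MS_within = SS_within / df_within      # 75.0

F = (600 / 3) / MS_within              # fertilizer MS / pooled within-groups MS
print(round(F, 2))                     # 2.67
```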

Be sure to study example 12.4, paying close attention to the sources of variation, the degrees of freedom, and the method for calculating the F-values.
Do problems 12.1, 12.2, and 12.4 using SigmaStat.

Last updated on 27 July 1999.
Provide comments to Dwight Moore at mooredwi@emporia.edu.