Chapter 12 Part 2

Interpretation of Coefficients

The "hard" part of doing two-way ANOVAs this way is that there is a lot of information, and we need to be able to interpret the coefficients. There are essentially three "different" ways to approach the interpretation of the coefficients. They are:

  1. Reexpression of the codes in terms of cell means.
  2. Interpretation of the interaction terms as changes in slopes.
  3. Consideration of the parameter estimates as the adjustments required for conditional predictions.
Each of these is considered in turn.

Interpretation in terms of cell means

First, let's take a look at the cell means. They are:
GROUP     =        1.000  MEAN                 32.000
GROUP     =        2.000  MEAN                 26.000
GROUP     =        3.000  MEAN                 17.000
GROUP     =        4.000  MEAN                 17.000
GROUP     =        5.000  MEAN                 19.000
GROUP     =        6.000  MEAN                   9.000
Remember that a regression coefficient can be found by this equation
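
With equal numbers of subjects in each cell, the coefficient for a contrast-coded predictor can be written as below. The symbols are notation assumed here, not taken from these notes: \lambda_k is the code assigned to group k and \bar{Y}_k is the mean of group k.

```latex
b = \frac{\sum_{k} \lambda_k \bar{Y}_k}{\sum_{k} \lambda_k^{2}}
```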

And remember that the first code was 1 1 -2 1 1 -2

If b1 is equal to zero (this is what the t-test tests), then with some math (see page 336 in your text) you will see that it must be that:
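
Written out for the first code, with \bar{Y}_1 through \bar{Y}_6 standing for the six cell means (notation assumed here), the steps would look roughly like this:

```latex
\begin{aligned}
b_1 = \frac{\bar{Y}_1 + \bar{Y}_2 - 2\bar{Y}_3 + \bar{Y}_4 + \bar{Y}_5 - 2\bar{Y}_6}{12} &= 0 \\
\bar{Y}_1 + \bar{Y}_2 + \bar{Y}_4 + \bar{Y}_5 &= 2\bar{Y}_3 + 2\bar{Y}_6 \\
\frac{\bar{Y}_1 + \bar{Y}_2 + \bar{Y}_4 + \bar{Y}_5}{4} &= \frac{\bar{Y}_3 + \bar{Y}_6}{2}
\end{aligned}
```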

So you are testing whether the average of the means for the two drug treatments equals the average of the means for the placebo groups.

Let's review the computer output for this problem.
DEP VAR:    MOOD      N:      18   MULTIPLE R: 0.975  SQUARED MULTIPLE R: 0.950
ADJUSTED SQUARED MULTIPLE R: 0.930     STANDARD ERROR OF ESTIMATE:        2.041

  VARIABLE    COEFFICIENT    STD ERROR     STD COEF TOLERANCE    T    P(2 TAIL)

CONSTANT           20.000        0.481        0.000      .      41.569    0.000
      X1            3.500        0.340        0.661     1.000   10.288    0.000
      X2            1.000        0.589        0.109     1.000    1.697    0.115
      X3            5.000        0.481        0.667     1.000   10.392    0.000
      X4            0.500        0.340        0.094     1.000    1.470    0.167
      X5            2.000        0.589        0.218     1.000    3.394    0.005


                             ANALYSIS OF VARIANCE

   SOURCE   SUM-OF-SQUARES    DF  MEAN-SQUARE     F-RATIO       P

 REGRESSION        960.000     5      192.000      46.080       0.000
   RESIDUAL         50.000    12        4.167
The coefficient b1 is 3.5. Let's look at the means.

The average of the means for the groups with the drug treatment is 23.5. The average of the means for the placebo groups is 13. That difference is 10.5. Now the codes for the drug groups are 1 and the codes for the placebo group are -2. So we moved three units to go from the treatment groups to the placebo groups. If we take 10.5 and divide by 3 we get the regression coefficient of 3.5!!!
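
The same arithmetic can be checked for all five codes at once. Here is a minimal Python sketch, assuming equal cell sizes so that each coefficient is the sum of code times cell mean divided by the sum of squared codes. The helper name contrast_coefficient is made up for this illustration, and the group labels in the comments are inferred from the means reported in these notes.

```python
# Cell means for groups 1-6, copied from the output above.
cell_means = [32.0, 26.0, 17.0, 17.0, 19.0, 9.0]

# Contrast codes, one value per group (labels inferred from the reported means).
codes = {
    "X1": [1, 1, -2, 1, 1, -2],   # both drugs vs. placebo
    "X2": [1, -1, 0, 1, -1, 0],   # Drug A vs. Drug B
    "X3": [1, 1, 1, -1, -1, -1],  # enzyme present vs. absent
}
# Interaction codes are products of the main-effect codes.
codes["X4"] = [a * b for a, b in zip(codes["X1"], codes["X3"])]
codes["X5"] = [a * b for a, b in zip(codes["X2"], codes["X3"])]


def contrast_coefficient(lambdas, means):
    """Return sum(lambda * mean) / sum(lambda ** 2) for one contrast code."""
    numerator = sum(lam * m for lam, m in zip(lambdas, means))
    denominator = sum(lam * lam for lam in lambdas)
    return numerator / denominator


coefficients = {name: contrast_coefficient(lam, cell_means)
                for name, lam in codes.items()}
for name, b in coefficients.items():
    print(name, b)
```

The printed values should agree with the COEFFICIENT column of the regression output for X1 through X5.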

The other main effect coefficients are just this easy. Let's take on something more difficult, the first interaction code (X4). We obtained X4 by multiplying X1 and X3 together.

X1   1  1 -2  1  1 -2
X3   1  1  1 -1 -1 -1
X4   1  1 -2 -1 -1  2

Set b4 equal to zero and multiply everything by 12 (the 12 in the denominator vanishes).
We get:

Now divide everything by two and regroup
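
With \bar{Y}_1 through \bar{Y}_6 standing for the six cell means (notation assumed here), the two steps just described would look roughly like this:

```latex
\begin{aligned}
\frac{\bar{Y}_1 + \bar{Y}_2 - 2\bar{Y}_3 - \bar{Y}_4 - \bar{Y}_5 + 2\bar{Y}_6}{12} &= 0 \\
\bar{Y}_1 + \bar{Y}_2 - 2\bar{Y}_3 - \bar{Y}_4 - \bar{Y}_5 + 2\bar{Y}_6 &= 0 \\
\frac{\bar{Y}_1 + \bar{Y}_2}{2} - \bar{Y}_3 &= \frac{\bar{Y}_4 + \bar{Y}_5}{2} - \bar{Y}_6
\end{aligned}
```

With the cell means above, the left side is 29 - 17 = 12 and the right side is 18 - 9 = 9.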

Thus, concluding that β4 does not equal zero would tell us that the difference between the average of the two drug groups and the placebo group is different depending on the presence or absence of the enzyme. Remember this test was nonsignificant. In other words, the observed difference between the two sides of the regrouped equation (12 versus 9) is not significantly different from zero.


Let's try the next interaction, which is significant:
We obtained X5 by multiplying X2 and X3 together.
X2   1 -1  0  1 -1  0
X3   1  1  1 -1 -1 -1
X5   1 -1  0 -1  1  0

Set b5 equal to zero and multiply everything by 4 (the 4 in the denominator vanishes).
We get:

Now regroup
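
With \bar{Y}_1, \bar{Y}_2, \bar{Y}_4 and \bar{Y}_5 standing for the four drug cell means (notation assumed here; the placebo cells have a code of zero and drop out), the steps would look roughly like this:

```latex
\begin{aligned}
\frac{\bar{Y}_1 - \bar{Y}_2 - \bar{Y}_4 + \bar{Y}_5}{4} &= 0 \\
\bar{Y}_1 - \bar{Y}_2 - \bar{Y}_4 + \bar{Y}_5 &= 0 \\
\bar{Y}_1 - \bar{Y}_2 &= \bar{Y}_4 - \bar{Y}_5
\end{aligned}
```

With the cell means above, the left side is 32 - 26 = 6 and the right side is 17 - 19 = -2.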

Thus, concluding that β5 does not equal zero is equivalent to concluding that the difference between those who received Drug A and Drug B with the enzyme is different from the difference between those who received Drug A and Drug B without the enzyme.

The text states this as: "In still other words, the superiority of Drug A over Drug B is not constant, but instead depends on the level of the other variable, which in this case is the presence of Enzyme E." The difference is 6 points (32 - 26) with the enzyme versus -2 points (17 - 19) without it, and those two differences are not equal. The difference between them is 8 (6 minus -2), and dividing by the 4 in the denominator (the sum of the squared X5 codes) gives the regression weight of 2.

Note, your book regrouped differently than I did and they state this significant interaction as: "The degree to which scores for those with Enzyme E tend to be higher than scores for those without Enzyme E is not constant, but instead depends on the level of the other variable, which in this case is type of drug."


Interpretation in Terms of Slope changes

To evaluate interactions this way, start with the regression equation.
 VARIABLE    COEFFICIENT    STD ERROR     STD COEF TOLERANCE    T    P(2 TAIL)

CONSTANT           20.000        0.481        0.000      .      41.569    0.000
      X1            3.500        0.340        0.661     1.000   10.288    0.000
      X2            1.000        0.589        0.109     1.000    1.697    0.115
      X3            5.000        0.481        0.667     1.000   10.392    0.000
      X4            0.500        0.340        0.094     1.000    1.470    0.167
      X5            2.000        0.589        0.218     1.000    3.394    0.005
Mood = 20.0 + 3.5X1 + 1X2 + 5X3 + .5X4 + 2X5
Then substitute in the product definitions (i.e., X4 is X1*X3 and X5 is X2*X3).
Mood = 20.0 + 3.5X1 + 1X2 + 5X3 + .5X1X3 + 2X2X3
Then regroup the terms to get a "simple" equation in X3.
Mood = (20.0 + 3.5X1 + X2) + (5 + .5X1 + 2X2)X3
The first parenthetical terms constitute the intercept and the second parenthetical terms constitute the slope in this simple equation.
Remember that these are orthogonal contrasts so their means equal zero. Let's substitute zero for X1 and X2 and draw the "simple" regression equation.
Mood = 20.0 + 5X3
Now X3 ranges from -1 (enzyme absent) to +1 (enzyme present), so the graph is a straight line rising from 15 at X3 = -1, through 20 at X3 = 0, to 25 at X3 = +1.

Notice that 20 is the intercept. The mean for all the groups with no enzyme is 15 and the mean for the groups with the enzyme is 25.

Obviously X1 can have values of 1 or -2, and X2 can have values of -1, 0, and 1. We could look at those six regression equations. Here they are:
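
A minimal Python sketch of these simple equations follows. Of the six possible (X1, X2) pairs, only three actually occur in this design: (1, 1), (1, -1) and (-2, 0). The sketch evaluates those three, and any other pair can be passed to the helper in the same way. The helper name simple_equation and the condition labels are assumptions made for this illustration.

```python
# Coefficients from the regression output above.
b0, b1, b2, b3, b4, b5 = 20.0, 3.5, 1.0, 5.0, 0.5, 2.0

# (X1, X2) pairs that occur in the design; labels inferred from the codes.
conditions = {"Drug A": (1, 1), "Drug B": (1, -1), "Placebo": (-2, 0)}


def simple_equation(x1, x2):
    """Return (intercept, slope) of Mood on X3 for fixed values of X1 and X2."""
    intercept = b0 + b1 * x1 + b2 * x2
    slope = b3 + b4 * x1 + b5 * x2
    return intercept, slope


simple = {name: simple_equation(x1, x2) for name, (x1, x2) in conditions.items()}
for name, (intercept, slope) in simple.items():
    print(f"{name}: Mood = {intercept} + {slope} * X3")
```

Plugging X3 = +1 or X3 = -1 into each printed equation should give back the corresponding cell mean.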


These differences in slope are the essence of an interaction. Looking at this figure, you can tell that changes in X3 are more important when X2 is +1 than when it is -1.


Interpretation in Terms of Conditional Predictions.

The simplest model says guess the grand mean (20) for everyone. Obviously the regression equation gives other predictions. As in the case with the one-way ANOVA, the regression equation predicts cell means for subjects.
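
To see this for the present problem, here is a minimal Python sketch that plugs each group's codes into the regression equation and compares the prediction with that group's cell mean. The function name predict is made up for this illustration, and the code triples assume the group order used in the cell means listing.

```python
# Coefficients from the regression output.
coef = {"const": 20.0, "X1": 3.5, "X2": 1.0, "X3": 5.0, "X4": 0.5, "X5": 2.0}

# Codes (X1, X2, X3) for groups 1-6, in the same order as the cell means.
group_codes = [(1, 1, 1), (1, -1, 1), (-2, 0, 1),
               (1, 1, -1), (1, -1, -1), (-2, 0, -1)]
cell_means = [32.0, 26.0, 17.0, 17.0, 19.0, 9.0]


def predict(x1, x2, x3):
    """Predicted mood from the full equation, with X4 = X1*X3 and X5 = X2*X3."""
    return (coef["const"] + coef["X1"] * x1 + coef["X2"] * x2 + coef["X3"] * x3
            + coef["X4"] * x1 * x3 + coef["X5"] * x2 * x3)


predictions = [predict(*c) for c in group_codes]
for group, (pred, mean) in enumerate(zip(predictions, cell_means), start=1):
    print(f"group {group}: predicted {pred}, cell mean {mean}")
```

Each coefficient can then be read as an adjustment to the grand-mean guess of 20 that moves the prediction toward the subject's own cell mean.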


Journal Summary

Here is the suggestion for this study from the text.

On average, both drug treatments produced higher mood scores than the placebo condition (means 23.5 versus 13, PRE = .90, F(1,12) = 105.8, p < .0001). There is no statistically significant difference in the average mood scores produced by Drug A versus Drug B (means 24.5 versus 22.5, PRE = .19, F(1,12) = 2.9, p = .11). Averaged across all three drug conditions, patients who had Enzyme E in their blood had higher mood scores than those who did not (means 25 versus 15, PRE = .90, F(1,12) = 108, p < .0001). However, there was a significant interaction between type of drug administered (A versus B) and whether or not Enzyme E was present such that Drug A, relative to Drug B, produced an increase in mood scores for those with Enzyme E but produced a decrease in mood scores for those without Enzyme E (PRE = .49, F(1,12) = 11.5, p < .005).

Note: I would leave out the differential adjustment statement.
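
The F and PRE values in that summary can be recovered from the t statistics in the regression output. For a single-degree-of-freedom test, F is t squared and PRE is F / (F + error df). A minimal Python sketch, with a function name (f_and_pre) made up for this illustration:

```python
# t statistics from the regression output for the effects in the summary.
t_values = {"X1": 10.288, "X2": 1.697, "X3": 10.392, "X5": 3.394}
df_error = 12  # residual degrees of freedom from the ANOVA table


def f_and_pre(t, df_error):
    """Return (F, PRE) for a one-degree-of-freedom test with the given t."""
    f = t ** 2
    pre = f / (f + df_error)
    return f, pre


results = {name: f_and_pre(t, df_error) for name, t in t_values.items()}
for name, (f, pre) in results.items():
    print(f"{name}: F(1,{df_error}) = {f:.1f}, PRE = {pre:.2f}")
```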

Finally, a useful addition to an article with a significant interaction is a graph using the cell means to display the interaction. See the two possible ways of doing this with this problem on pages 347 and 348. On the blackboard, I will (hopefully) demonstrate the coding of a 4X3X2 ANOVA. Wish me luck. I'll use Helmert contrasts for the same problem in your text.


Unequal numbers of subjects in the cells

Not much of a problem. Just remember that you probably shouldn't do the analysis with unequal numbers without a complete set of contrast codes. Again, however, the nuisance is that the SS for the individual contrast-coded predictors do not sum to the regression SS.
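
The reason the SS stop adding up is that codes which are uncorrelated with equal cell sizes become correlated once the cell sizes differ. Here is a minimal Python sketch using X1 and X3 from this problem. The unequal cell sizes are hypothetical (one subject dropped from group 6), and the function name covariance is made up for this illustration.

```python
# X1 and X3 codes for groups 1-6.
x1 = [1, 1, -2, 1, 1, -2]
x3 = [1, 1, 1, -1, -1, -1]


def covariance(cell_sizes):
    """Covariance of X1 and X3 across subjects, given the subjects per cell."""
    n = sum(cell_sizes)
    mean1 = sum(k * a for k, a in zip(cell_sizes, x1)) / n
    mean3 = sum(k * b for k, b in zip(cell_sizes, x3)) / n
    return sum(k * (a - mean1) * (b - mean3)
               for k, a, b in zip(cell_sizes, x1, x3)) / n


equal = covariance([3, 3, 3, 3, 3, 3])    # the design in these notes (N = 18)
unequal = covariance([3, 3, 3, 3, 3, 2])  # hypothetical: group 6 loses a subject
print("equal n:", equal)
print("unequal n:", unequal)
```

A nonzero covariance means the two predictors share some variance, so their individual SS overlap.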