Beginning Chapter 7 section 3

Regular Test

Testing the regular single-parameter and two-parameter models against one another.

Step 1


Here we use the Judd 7-4 data set. MODEL C predicts third grade math from its mean alone, and MODEL A predicts third grade math from second grade math. The compact model estimates one parameter (the mean), and the augmented model estimates two parameters, the intercept and slope of the regression equation.

Step 2

Set the alpha level = .05

Step 3

The data set, along with two extra columns of deviated scores, is found in the Judd 7-4e data set.

Step 4

You have seen simple regression analysis done using Statlets several times by now. Simply copy the data set above into Statlets, and conduct a simple regression analysis.

Here is the output from the Summary tab. Of course, you should do this exercise using Statlets yourself. The output is identical.


Regression Analysis for THIRD versus SECOND
-----------------------------------------------------------------------
Model type: Linear
-----------------------------------------------------------------------
Equation: THIRD = -97.5558 + 2.16366*SECOND
-----------------------------------------------------------------------
Coefficient     Estimate        Std. Error      t-value         P-value
-----------------------------------------------------------------------
Intercept       -97.5558        35.8947         -2.71784        0.0176
Slope           2.16366         0.439644        4.92139         3.0E-4
-----------------------------------------------------------------------
Correlation = 0.8067
R-squared = 65.07%
Std. error of est. = 14.4909

Below is the ANOVA tab output.

-------------------------------------------------------------------------
                           Analysis of Variance
-------------------------------------------------------------------------
Source         Sum of Squares   Df    Mean Square    F-Ratio      P-Value
-------------------------------------------------------------------------
Model          5085.9           1     5085.9         24.22        3.0E-4
Residual       2729.83          13    209.987
-------------------------------------------------------------------------
  Lack-of-Fit  1918.33          9     213.148        1.05064      0.5221
  Pure Error   811.5            4     202.875
-------------------------------------------------------------------------
Total (Corr.)  7815.73          14

What does it all mean?

The last section, labeled Analysis of Variance, consists of the source table that provides the sums of squares, degrees of freedom, and mean squares used for comparing the augmented two-parameter model with the compact single-parameter model. The row labeled Model gives the reduction in the sum of squared errors when we move from the compact to the augmented model. The row labeled Residual gives the sum of squared errors remaining under the augmented two-parameter model. The Total row gives the error in the compact model if, and only if, the compact model is the mean model.
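The relationship among the rows of the source table can be checked by hand. The following is a minimal Python sketch that recomputes the Model sum of squares, PRE, and F from the Total and Residual rows; the variable names are illustrative and are not part of Statlets.

```python
# Values copied from the Analysis of Variance output above.
sse_compact = 7815.73    # Total (Corr.): error of the mean-only compact model
sse_augmented = 2729.83  # Residual: error remaining under the augmented model
n, pa, pc = 15, 2, 1     # observations, parameters in MODEL A, parameters in MODEL C

ssr = sse_compact - sse_augmented   # reduction in error (the Model row)
pre = ssr / sse_compact             # proportional reduction in error
f_ratio = (ssr / (pa - pc)) / (sse_augmented / (n - pa))

print(round(ssr, 2), round(pre, 4), round(f_ratio, 2))
```

The PRE computed this way should agree with the R-squared reported in the Summary tab, and the F should agree with the F-Ratio in the Model row.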

The Summary output lists the name of the dependent variable, the correlation coefficient, and the squared correlation or PRE value. The standard error of estimate (14.49) was also called the standard error of prediction in Chapter 6.

The t-tests in the Summary output test the null hypothesis that each parameter (the intercept, or constant, and the slope) equals zero. The first row provides the estimate of the intercept beta0 (-97.556), followed by the t statistic that results from testing the null hypothesis that beta0 = 0. As defined in Chapter 5, this t statistic is simply the square root of the F statistic that results if we compare the augmented two-parameter model with a compact single-parameter model in which beta0 is forced to equal zero. Thus, squaring the value -2.718 gives the F with 1 and 13 degrees of freedom that would result from the comparison of the two models illustrated in the figure below:



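We are not told the PRE value that results from this comparison, but we can calculate it with the following formula, where n is the number of observations and PA and PC are the numbers of parameters in the augmented and compact models:

PRE = F / (F + (n - PA)/(PA - PC))

Note for this case that F = (-2.718)^2 = 7.39, n - PA = 13, PA - PC = 1, and PRE = .362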
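The conversion from the t statistic to F and then to PRE can be checked in a few lines. This is a sketch in Python; the helper name pre_from_f is hypothetical, and it assumes the usual relationship F = [PRE/(PA - PC)] / [(1 - PRE)/(n - PA)] solved for PRE.

```python
def pre_from_f(f_ratio, n, pa, pc):
    """Convert an F ratio back to PRE: PRE = F / (F + (n - PA) / (PA - PC))."""
    return f_ratio / (f_ratio + (n - pa) / (pa - pc))

t_intercept = -2.71784          # t-value for the intercept in the Summary output
f_intercept = t_intercept ** 2  # F with 1 and 13 degrees of freedom
pre_intercept = pre_from_f(f_intercept, n=15, pa=2, pc=1)

print(round(f_intercept, 3), round(pre_intercept, 3))
```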

This test is not very informative, since the intercept value of -97.556 is simply the predicted third grade math score when the second grade math score is zero.


The next row of the Summary output provides the beta1 or slope estimate (2.1637). This value indicates that as second grade math scores improve by one unit, predicted third grade math scores improve by 2.16 units. Again, you are given a t statistic to see whether this value differs from zero. If you square that value, you will note that it is the same as the F value found in the ANOVA table below it.

When we get to models with more than one predictor variable, the ANOVA table at the bottom will always test the full augmented model, with all of its estimated parameters, against the single-parameter mean model. On the other hand, the tests of significance in the Summary tab output will always result from comparing the full augmented model with a compact model in which only that particular parameter has been forced to equal zero.

If you take a particular parameter estimate and divide it by its reported standard error, the result is the t statistic for testing whether that parameter equals zero. If you multiply the standard error by the critical value of the t statistic, you get the half-width of the parameter's confidence interval:


b_j ± t_crit × (Standard Error of b_j)
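As an illustration, the t statistics and 95% confidence intervals for both coefficients can be rebuilt from the Summary output. This is a Python sketch only: the function name is hypothetical, and the critical value 2.160 (two-tailed, alpha = .05, 13 degrees of freedom) is taken from a standard t table rather than from the Statlets output.

```python
def t_and_interval(estimate, std_error, t_crit):
    """Return the t statistic and the confidence interval for one coefficient."""
    t_value = estimate / std_error
    half_width = t_crit * std_error
    return t_value, (estimate - half_width, estimate + half_width)

# Assumption: critical t for 13 df at alpha = .05 (two-tailed), from a t table.
T_CRIT_13DF = 2.160

# Estimates and standard errors copied from the Summary output above.
t_slope, ci_slope = t_and_interval(2.16366, 0.439644, T_CRIT_13DF)
t_int, ci_int = t_and_interval(-97.5558, 35.8947, T_CRIT_13DF)

print(round(t_slope, 3), [round(v, 3) for v in ci_slope])
print(round(t_int, 3), [round(v, 3) for v in ci_int])
```

Neither interval should contain zero, which is consistent with both P-values falling below .05.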

Step 5

We could conclude from this analysis that there is a reliable positive relationship between third grade mathematics test performance and performance one year earlier. The resulting slope value is positive and reliably different from zero. The intercept is also reliably different from zero, but this is virtually meaningless in this context.


A New Test

Testing to see whether the third grade math score differs reliably from some hypothesized value, controlling for second grade performance.

Scaling the x-axis

First let's visualize what happens when we simply scale the x values by subtracting the mean of x from every x value. In the applet on the left, this has been done. The new zero point is shown using the magenta line. Zero on the x-axis is now where the old mean on the x-axis was. When we scale the x variable in this manner, the new constant must be the average of the y values. You can check this.

The first t-test in the Summary tab output tests whether the constant is different from zero. As you move the data values in the applet, the differences that actually get tested are graphed on the right side of the applet. The magenta bar represents the distance tested in the model where the x values have been scaled. If we have scaled the x data as detailed above, the first t-test determines whether the mean of y is different from zero. The green bar represents the distance tested in the old model where x was not scaled. If the x data were not scaled, the first t-test investigates whether the old constant is different from zero.

Notice that, within the range of values possible in this applet, you cannot create a negative constant using the scaled x values model. However, you can create a negative constant in the original unscaled model.
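The claim that the constant becomes the mean of y once x is scaled can also be checked numerically. The following Python sketch uses a small hypothetical data set (not the Judd 7-4 scores) and a hand-written least-squares helper, fit_line, which is an illustrative name and not a Statlets function.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a simple regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical toy data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
x_scaled = [xi - mean_x for xi in x]  # subtract the mean of x from every x

b0_raw, b1_raw = fit_line(x, y)
b0_scaled, b1_scaled = fit_line(x_scaled, y)

print("unscaled:", b0_raw, b1_raw)
print("scaled:  ", b0_scaled, b1_scaled, "mean of y:", mean_y)
```

The slope is unchanged by the scaling; only the constant moves, from the predicted y at x = 0 to the mean of y.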

Scaling the x-axis plus y-axis

In the applet on the immediate left, the y-axis has been scaled by subtracting 50 from each y value. Compare the y-axis with the one on the applet above. Where y used to equal 50, it now equals zero. Notice the new magenta line indicating this point.

When the t-test investigates whether the constant is different from zero, the magenta bar indicates the difference tested. The green bar shows the difference that would have been tested if neither the y-axis nor the x-axis had been scaled.

Note how the constant in the new model is y-bar minus 50, so testing whether it is different from 0 is equivalent to testing whether y-bar is different from 50, the value that became zero when the y-axis was scaled. You can subtract any hypothesized value from the y values to conduct this type of test.
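A short numerical sketch of the same idea follows, again in Python with hypothetical toy data (not the applet's values or the Judd scores). After scaling both axes, the constant should equal the mean of y minus the hypothesized value, here 50.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a simple regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical toy data for illustration only.
x = [40, 45, 50, 55, 60]
y = [48, 50, 57, 55, 60]
hypothesized = 50

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
x_dev = [xi - mean_x for xi in x]        # scale the x-axis
y_dev = [yi - hypothesized for yi in y]  # scale the y-axis

b0_dev, b1_dev = fit_line(x_dev, y_dev)
print(b0_dev, mean_y - hypothesized, b1_dev)
```

Testing whether this constant differs from zero is therefore the same as testing whether the mean of y differs from the hypothesized value.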

A Textbook Problem

As was shown in section 7.2 in Judd & McClelland, if we control for another predictor variable, we get a more powerful test. See pages 135-140 in your text.

Remember in Chapter 5 we tested the third grade math data to see if it reliably differed from a score of 65. See the data set on page 74. The PRE value we calculated was .25 with an associated F with 1,14 df = 4.64. We will now let Statlets do the calculation while controlling for second grade performance.

Step 1



Recall that the intercept will equal the mean of Y (Y-bar) if the predictor variable is transformed so that its mean equals zero. In other words we need to transform the predictor into mean deviation form.

With Statlets, testing a null hypothesis that the intercept equals a certain value can be accomplished by deviating the criterion variable from that value. In other words, we also need to deviate the third grade math scores from 65.

Then our two models may be written:
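Based on the description above, the two models can presumably be stated as follows. This is a reconstruction under the assumption that the hypothesized value of 65 enters only through the deviated criterion THIRDDEV; the symbols for the parameters and the error term follow the usual regression notation and do not appear in the Statlets output.

```latex
\begin{align*}
\text{MODEL A:}\quad \mathit{THIRDDEV}_i &= \beta_0 + \beta_1\,\mathit{XDEV}_i + \varepsilon_i \\
\text{MODEL C:}\quad \mathit{THIRDDEV}_i &= 0 + \beta_1\,\mathit{XDEV}_i + \varepsilon_i
\end{align*}
```

Forcing the intercept to zero in MODEL C corresponds to the null hypothesis that the mean third grade score equals 65 once second grade performance is controlled.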

Step 2

Set the alpha level = .05

Step 3

The deviated scores for both variables are provided in the last two columns of Judd 7-4e. If you were producing these deviated scores yourself, you would need to subtract the mean of the second grade scores from each second grade score, producing XDEV. You would also need to subtract the value of 65 from each of the third grade scores, producing THIRDDEV. This could easily be done in a spreadsheet application.
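If you prefer a script to a spreadsheet, the two deviated columns could be produced along the following lines. This is a Python sketch only: the function name deviate_scores is hypothetical, and the scores shown are made-up placeholders, not the Judd 7-4 data.

```python
def deviate_scores(second, third, hypothesized=65):
    """Return (XDEV, THIRDDEV) for paired second and third grade scores."""
    mean_second = sum(second) / len(second)
    xdev = [s - mean_second for s in second]      # second grade minus its mean
    thirddev = [t - hypothesized for t in third]  # third grade minus 65
    return xdev, thirddev

# Placeholder scores for illustration only; substitute the real columns.
second = [70, 80, 90]
third = [60, 75, 95]

xdev, thirddev = deviate_scores(second, third)
print(xdev)
print(thirddev)
```

By construction XDEV has a mean of zero, which is what makes the intercept of the new regression equal to the mean of THIRDDEV.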

Step 4

Choose the regression command and regress the deviated criterion score THIRDDEV on the deviated predictor score XDEV. The Input tab will look like the figure directly below.



Here are the Summary tab results using Statlets.

Regression Analysis for THIRDDEV versus XDEV
-----------------------------------------------------------------------
Model type: Linear
-----------------------------------------------------------------------
Equation: THIRDDEV = 13.1333 + 2.16366*XDEV
-----------------------------------------------------------------------
Coefficient     Estimate        Std. Error      t-value         P-value
-----------------------------------------------------------------------
Intercept       13.1333         3.74154         3.51014         0.0038
Slope           2.16366         0.439644        4.92139         3.0E-4
-----------------------------------------------------------------------
Correlation = 0.8067
R-squared = 65.07%
Std. error of est. = 14.4909

Followed by the ANOVA tab output.

-------------------------------------------------------------------------
                           Analysis of Variance
-------------------------------------------------------------------------
Source         Sum of Squares   Df    Mean Square    F-Ratio      P-Value
-------------------------------------------------------------------------
Model          5085.9           1     5085.9         24.22        3.0E-4
Residual       2729.83          13    209.987
-------------------------------------------------------------------------
  Lack-of-Fit  1918.33          9     213.148        1.05064      0.5221
  Pure Error   811.5            4     202.875
-------------------------------------------------------------------------
Total (Corr.)  7815.73          14


Note how these results compare with the results on page 146 in your text.
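To see what controlling for second grade performance buys us, the intercept test can be rebuilt from the Summary output and set beside the Chapter 5 result. The Python sketch below relies on the standard result that, when the predictor is in mean deviation form, the standard error of the intercept is the standard error of estimate divided by the square root of n; the variable names are illustrative.

```python
import math

n = 15                      # observations (Total df of 14 plus 1)
std_error_of_est = 14.4909  # from the Summary output
intercept = 13.1333         # mean of third grade scores minus 65

# With a mean-deviated predictor, SE(intercept) = s / sqrt(n).
se_intercept = std_error_of_est / math.sqrt(n)
t_intercept = intercept / se_intercept
f_intercept = t_intercept ** 2                        # F with 1 and 13 df
pre_intercept = f_intercept / (f_intercept + (n - 2))

# Chapter 5 test of the same hypothesis without the predictor.
f_chapter5, pre_chapter5 = 4.64, 0.25

print(round(se_intercept, 4), round(t_intercept, 3))
print(round(f_intercept, 2), round(pre_intercept, 3))
print(f_intercept > f_chapter5, pre_intercept > pre_chapter5)
```

Both F and PRE for the test against 65 come out larger than the Chapter 5 values, which is the gain in power described in section 7.2.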