Working Dumb and Hard
We would like to test the idea that adding PIQ to a regression analysis that already includes VIQ improves our prediction of READ scores.
Step 1
Ho: MODEL C = MODEL A
H1: MODEL C not= MODEL A
Note that MODEL C contains only VIQ as a predictor, while MODEL A contains both VIQ and PIQ as predictors.
Step 2
alpha = .05
Step 3
You will want to use the School Referrals data set.
Step 4
As you remember from the last lecture, the SSE in each output is the sum of squared errors for the model being estimated. Most statistical packages call this the Residual Sum-of-squares. Therefore, for MODEL C regress READ on VIQ; then for MODEL A regress READ on VIQ and PIQ.
The following output was produced using SYSTAT (another professional statistical program). These results are presented here because unlike the version of Statlets we are using, SYSTAT can handle large sample sizes. Later you will be asked to redo this analysis using the School Referrals data set and Statlets.
Here are the results for MODEL C:
Dep var: READ   N: 200   Multiple R: .542   Squared multiple R: .294
Adjusted squared multiple R: .290   Standard error of estimate: 12.318

Variable    Coefficient   Std error   Std coef   Tolerance       T   P(2 tail)
CONSTANT         42.262       5.086      0.000           .   8.309       0.000
VIQ               0.530       0.058      0.542    .100E+01   9.077       0.000

Analysis of Variance
Source       Sum-of-squares    DF   Mean-square   F-ratio       P
Regression        12502.171     1     12502.171    82.401   0.000
Residual          30041.329   198       151.724
Here is the output for the Augmented model:
Dep var: READ   N: 200   Multiple R: .542   Squared multiple R: .294
Adjusted squared multiple R: .287   Standard error of estimate: 12.347

Variable    Coefficient   Std error   Std coef   Tolerance        T   P(2 tail)
CONSTANT         42.882       5.627      0.000           .    7.620       0.000
VIQ               0.543       0.078      0.556   0.5697305    7.006       0.000
PIQ              -0.019       0.075     -0.021   0.5697305   -0.260       0.795

Analysis of Variance
Source       Sum-of-squares    DF   Mean-square   F-ratio       P
Regression        12512.502     2      6256.251    41.040   0.000
Residual          30030.998   197       152.442

Note that the Residual Sum-of-squares (the SSE) for this model is 30030.998.
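The comparison of the two Residual Sum-of-squares values can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not something SYSTAT or Statlets produces: the function name pre_and_f and its arguments are made up here, and it assumes the usual definitions PRE = (SSE(C) - SSE(A)) / SSE(C) and F = (reduction in SSE per added parameter) / (SSE(A) per error df).

```python
def pre_and_f(sse_c, sse_a, df_c, df_a):
    """Return (PRE, F) for a compact model versus an augmented model.

    sse_c, sse_a: Residual Sum-of-squares for MODEL C and MODEL A
    df_c, df_a:   residual degrees of freedom for MODEL C and MODEL A
    """
    ssr = sse_c - sse_a                          # error removed by the extra parameter(s)
    pre = ssr / sse_c                            # proportional reduction in error
    f = (ssr / (df_c - df_a)) / (sse_a / df_a)   # mean square reduced / mean square error
    return pre, f


# Values copied from the two SYSTAT outputs above (hypothetical variable names).
pre, f = pre_and_f(sse_c=30041.329, sse_a=30030.998, df_c=198, df_a=197)
print(f"PRE = {pre:.5f}, F = {f:.3f}")
```

With these inputs the reduction in error is 10.331, which is tiny next to the 30041.329 left unexplained by MODEL C.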

Working Fast and Smart
Look at the augmented output above.
First remember that the PRE value we calculated for adding PIQ to the model was .00034. The t value of -.26 is reported in the middle table. If we square it we get F = .0676, or about .068. This t value tests whether adding this variable in an augmented model improves prediction accuracy over a compact model that includes every other variable. In SYSTAT you can even get the F values directly by using the Test command after the full regression. Here is the dialog box, and then the Test results from SYSTAT.
TEST FOR EFFECT CALLED: CONSTANT
TEST OF HYPOTHESIS
SOURCE              SS    DF         MS        F       P
HYPOTHESIS    8851.675     1   8851.675   58.066   0.000
ERROR        30030.998   197    152.442
--------------------------------------------------------------------------------------
TEST FOR EFFECT CALLED: VIQ
TEST OF HYPOTHESIS
SOURCE              SS    DF         MS        F       P
HYPOTHESIS    7483.192     1   7483.192   49.089   0.000
ERROR        30030.998   197    152.442
--------------------------------------------------------------------------------------
TEST FOR EFFECT CALLED: PIQ
TEST OF HYPOTHESIS
SOURCE              SS    DF         MS        F       P
HYPOTHESIS      10.331     1     10.331    0.068   0.795
ERROR        30030.998   197    152.442
--------------------------------------------------------------------------------------
Note the F value of .068 reported for PIQ. Likewise, the F value of 49.089 tests the significance of adding VIQ to a compact model that contains PIQ as the only predictor.
From the reported t or F values, the PRE values are easily calculated. Note that these tests have 1 df in the numerator, so PRE = F / (F + error df), where the error df here is 197.
This PRE for the addition of exactly one parameter has the special name coefficient of partial determination. The square root of this special PRE is usually called the partial correlation coefficient because it is the simple correlation between Y and Xp when the effects of the other p-1 predictors have been removed from both Y and Xp.
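As a rough sketch, here is how the PRE and the partial correlation could be recovered from the t and F values reported above. The helper names pre_from_f and partial_r are invented for this example. It assumes the standard relation PRE = F * df_num / (F * df_num + df_error) and gives the partial correlation the sign of the regression coefficient.

```python
import math


def pre_from_f(f, df_error, df_num=1):
    """PRE implied by an F ratio with df_num and df_error degrees of freedom."""
    return (f * df_num) / (f * df_num + df_error)


def partial_r(pre, coefficient):
    """Square root of a 1-df PRE, signed like the regression coefficient."""
    return math.copysign(math.sqrt(pre), coefficient)


df_error = 197                               # error df of the augmented model (N = 200)

pre_piq = pre_from_f(0.260 ** 2, df_error)   # from t = -0.260, since F = t squared
pre_viq = pre_from_f(49.089, df_error)       # from the Test output for VIQ

r_piq = partial_r(pre_piq, -0.019)           # PIQ coefficient is negative
r_viq = partial_r(pre_viq, 0.543)            # VIQ coefficient is positive

print(f"PIQ: PRE = {pre_piq:.5f}, partial r = {r_piq:.3f}")
print(f"VIQ: PRE = {pre_viq:.4f}, partial r = {r_viq:.3f}")
```

The PIQ result should land at roughly .00034, the same PRE obtained from the two Residual Sum-of-squares values, while VIQ comes out near .20.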
Tolerance
R²p is simply the PRE obtained when all the other predictors are used to predict the predictor in question. Thus, it is a measure of the redundancy of that predictor with the other predictors. The term 1 - R²p has the special name tolerance. Tolerance is a measure of the predictor's uniqueness in the regression. Only the unique part of a predictor is useful in reducing error. If tolerance is low (say below .01 or .001), it will be exceedingly difficult for the predictor to be helpful.
Note that some programs, including Statlets, report a variance inflation factor (VIF), which is the inverse of tolerance (VIF = 1 / tolerance).
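To make the link between the two reports concrete, here is a small sketch that converts the tolerance SYSTAT printed for VIQ and PIQ in the two-predictor model into the redundancy measure and the VIF. The variable and function names are hypothetical. The only formulas used are tolerance = 1 - (PRE from predicting the predictor with the others) and VIF = 1 / tolerance.

```python
def vif_from_tolerance(tolerance):
    """Variance inflation factor is the inverse of tolerance."""
    return 1.0 / tolerance


tolerance = 0.5697305            # reported for both VIQ and PIQ in the SYSTAT output
redundancy = 1.0 - tolerance     # PRE when the other predictor predicts this one
vif = vif_from_tolerance(tolerance)

print(f"redundancy = {redundancy:.4f}, tolerance = {tolerance:.4f}, VIF = {vif:.3f}")
```

A tolerance of about .57 is nowhere near the .01 danger zone, so redundancy is not what keeps PIQ from helping here.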
More than Two Predictors
Let's increase the complexity of our problem by using VIQ, PIQ, AGGRESS & WITHDRAW as predictors. We will simply do the augmented model. To input all four predictors choose the menus Model/Regression/Multiple Regression and complete the Input tab as shown in the figure below.

Here are the results from the Model Fit Tab:
---------------------------------------------------------------------------
                              Standard            T
Parameter       Estimate        Error         Statistic       P-Value
---------------------------------------------------------------------------
CONSTANT         30.1774       7.02608           4.30          1.0E-4
VIQ             0.559027       0.10748           5.20          1.0E-4
PIQ            0.0695531      0.104357           0.67          0.5067
AGGRESS         0.554008      0.265687           2.09          0.0397
WITHDRAW       0.0629026       0.14494           0.43          0.6653
---------------------------------------------------------------------------

Analysis of Variance
---------------------------------------------------------------------------
Source            Sum of Squares     Df    Mean Square    F-Ratio    P-Value
---------------------------------------------------------------------------
Model                    10179.4    4.0        2544.84      19.26     1.0E-4
Residual                 12549.4   95.0        132.099
---------------------------------------------------------------------------
Total (Corr.)            22728.8   99.0

R-squared = 44.7862 percent
R-squared (adjusted for d.f.) = 42.4615 percent
Standard error of est. = 11.4934
Coeff. of variation = 13.1881 percent
Mean absolute error = 8.90248
Durbin-Watson statistic = 1.96066
Further ANOVA for Variables in the Order Fitted
---------------------------------------------------------------------------
Source            Sum of Squares     Df    Mean Square    F-Ratio    P-Value
---------------------------------------------------------------------------
VIQ                      9513.24    1.0        9513.24      72.02     1.0E-4
PIQ                      91.1905    1.0        91.1905       0.69     0.4081
AGGRESS                  550.044    1.0        550.044       4.16     0.0441
WITHDRAW                 24.8807    1.0        24.8807       0.19     0.6653
---------------------------------------------------------------------------
Model                    10179.4    4.0

Here is the same table with the predictors entered in a different order, so that VIQ comes last:

Further ANOVA for Variables in the Order Fitted
---------------------------------------------------------------------------
Source            Sum of Squares     Df    Mean Square    F-Ratio    P-Value
---------------------------------------------------------------------------
PIQ                      5952.65    1.0        5952.65      45.06     1.0E-4
AGGRESS                  454.016    1.0        454.016       3.44     0.0669
WITHDRAW                 199.068    1.0        199.068       1.51     0.2226
VIQ                      3573.62    1.0        3573.62      27.05     1.0E-4
---------------------------------------------------------------------------
Model                    10179.4    4.0

Statistical Interpreter
-----------------------
This table shows the statistical significance of each variable as it was added to the model. You can use this table to help determine how much the model could be simplified, especially if you are fitting a polynomial.
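One way to read the two "Order Fitted" tables is to check two things by hand: the sums of squares add up to the same Model value in either order, and the F for whichever variable is entered last matches the square of its t statistic in the Model Fit table. The sketch below does that arithmetic with the numbers copied from the Statlets output; the dictionary names are made up for the example.

```python
# Sequential sums of squares, in the order the predictors were fitted.
order_1 = {"VIQ": 9513.24, "PIQ": 91.1905, "AGGRESS": 550.044, "WITHDRAW": 24.8807}
order_2 = {"PIQ": 5952.65, "AGGRESS": 454.016, "WITHDRAW": 199.068, "VIQ": 3573.62}

mse = 132.099                        # Residual Mean Square of the four-predictor model

total_1 = sum(order_1.values())      # both totals should be close to Model SS = 10179.4
total_2 = sum(order_2.values())

# Each variable has 1 df, so its F is its sum of squares divided by the MSE.
f_withdraw_last = order_1["WITHDRAW"] / mse   # compare with t = 0.43 squared
f_viq_last = order_2["VIQ"] / mse             # compare with t = 5.20 squared

print(f"totals: {total_1:.1f} and {total_2:.1f}")
print(f"WITHDRAW entered last: F = {f_withdraw_last:.2f}, t squared = {0.43 ** 2:.2f}")
print(f"VIQ entered last:      F = {f_viq_last:.2f}, t squared = {5.20 ** 2:.2f}")
```

Only the last line of each table answers the question "does this variable help beyond all the others?". The earlier lines depend on the order of entry, which is why VIQ shows 9513.24 when entered first and 3573.62 when entered last.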
95.0% confidence intervals for model coefficients
----------------------------------------------------------------------------
                             Standard        Lower         Upper
Parameter       Estimate       Error         Limit         Limit      V.I.F.
----------------------------------------------------------------------------
CONSTANT         30.1774      7.02608       16.2289        44.126
VIQ             0.559027      0.10748      0.345652      0.772403       2.15
PIQ            0.0695531     0.104357     -0.137623      0.276729       2.11
AGGRESS         0.554008     0.265687     0.0265516       1.08146       1.04
WITHDRAW       0.0629026      0.14494      -0.22484      0.350645       1.08
----------------------------------------------------------------------------

Correlation matrix for coefficient estimates
---------------------------------------------------------------------------
             CONSTANT        VIQ        PIQ    AGGRESS   WITHDRAW
CONSTANT       1.0000    -0.3192    -0.4009    -0.0949     3.0E-4
VIQ           -0.3192     1.0000    -0.7145     0.0077    -0.1499
PIQ           -0.4009    -0.7145     1.0000    -0.0779    -0.0053
AGGRESS       -0.0949     0.0077    -0.0779     1.0000     0.1775
WITHDRAW       3.0E-4    -0.1499    -0.0053     0.1775     1.0000
---------------------------------------------------------------------------

With all this information you could produce a full ANOVA table like that shown in your text.
In asking several questions, we have the problem of doing multiple statistical tests on the same set of data (increasing the family-wise error rate). If we use a given level of alpha in repeated tests, our chances of making at least one Type I error increase rapidly. It is safer (but seldom done) to use alpha/p as the cutoff for each of the repeated tests, where p is the number of tests. Using alpha/p as the criterion follows from the Bonferroni inequality for multiple comparisons.
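A short sketch shows how quickly the family-wise error rate grows and what the alpha/p cutoff does to the four coefficient tests in the Model Fit table. The p-values are copied from that table as printed (1.0E-4 is taken at face value). The family-wise formula 1 - (1 - alpha)^p assumes the tests are independent, which is only an approximation here; the variable names are illustrative.

```python
alpha = 0.05

# P-values from the Model Fit table for the four-predictor model.
p_values = {"VIQ": 1.0e-4, "PIQ": 0.5067, "AGGRESS": 0.0397, "WITHDRAW": 0.6653}
n_tests = len(p_values)

# Chance of at least one Type I error if every test uses alpha
# (assumes independent tests, so treat it as a rough guide).
familywise = 1 - (1 - alpha) ** n_tests

# Bonferroni cutoff: each test must beat alpha / p instead of alpha.
cutoff = alpha / n_tests

unadjusted = {name: p < alpha for name, p in p_values.items()}
bonferroni = {name: p < cutoff for name, p in p_values.items()}

print(f"family-wise error rate with {n_tests} tests: {familywise:.3f}")
print(f"Bonferroni cutoff: {cutoff:.4f}")
for name in p_values:
    print(f"{name:9s} alpha: {unadjusted[name]!s:5s}  alpha/p: {bonferroni[name]}")
```

Under the stricter cutoff of .0125, AGGRESS (p = .0397) no longer counts as significant, while VIQ still does.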
Now let's add FSIQ as a predictor. Notice the variance inflation factors (V.I.F., the inverse of tolerance) in the following output! What do these values indicate?
---------------------------------------------------------------------------
                              Standard            T
Parameter       Estimate        Error         Statistic       P-Value
---------------------------------------------------------------------------
CONSTANT         45.0232       15.8079           2.85          0.0054
VIQ            -0.473985      0.991308          -0.48          0.6337
PIQ            -0.908559      0.938907          -0.97          0.3357
FSIQ             1.86773       1.78177           1.05          0.2972
AGGRESS         0.521996      0.267299           1.95          0.0538
WITHDRAW       0.0405218      0.146429           0.28          0.7826
---------------------------------------------------------------------------

Analysis of Variance
---------------------------------------------------------------------------
Source            Sum of Squares     Df    Mean Square    F-Ratio    P-Value
---------------------------------------------------------------------------
Model                    10324.4    5.0        2064.87      15.65     1.0E-4
Residual                 12404.4   94.0        131.962
---------------------------------------------------------------------------
Total (Corr.)            22728.8   99.0

R-squared = 45.4242 percent
R-squared (adjusted for d.f.) = 42.5212 percent
Standard error of est. = 11.4875
Coeff. of variation = 13.1812 percent
Mean absolute error = 8.91412
Durbin-Watson statistic = 2.05376

Further ANOVA for Variables in the Order Fitted
---------------------------------------------------------------------------
Source            Sum of Squares     Df    Mean Square    F-Ratio    P-Value
---------------------------------------------------------------------------
VIQ                      9513.24    1.0        9513.24      72.09     1.0E-4
PIQ                      91.1905    1.0        91.1905       0.69     0.4079
FSIQ                     215.397    1.0        215.397       1.63     0.2045
AGGRESS                  494.423    1.0        494.423       3.75     0.0559
WITHDRAW                 10.1058    1.0        10.1058       0.08     0.7826
---------------------------------------------------------------------------
Model                    10324.4    5.0

95.0% confidence intervals for model coefficients
----------------------------------------------------------------------------
                             Standard        Lower         Upper
Parameter       Estimate       Error         Limit         Limit      V.I.F.
----------------------------------------------------------------------------
CONSTANT         45.0232      15.8079       13.6361       76.4103
VIQ            -0.473985     0.991308      -2.44225       1.49428     182.99
PIQ            -0.908559     0.938907      -2.77278      0.955667     171.05
FSIQ             1.86773      1.78177      -1.67002       5.40548     605.50
AGGRESS         0.521996     0.267299   -0.00873423       1.05273       1.05
WITHDRAW       0.0405218     0.146429     -0.250218      0.331261       1.10
----------------------------------------------------------------------------

Correlation matrix for coefficient estimates
---------------------------------------------------------------------------
             CONSTANT        VIQ        PIQ       FSIQ    AGGRESS   WITHDRAW
CONSTANT       1.0000    -0.9060    -0.9102     0.8959    -0.1442    -0.1305
VIQ           -0.9060     1.0000     0.9794    -0.9941     0.1144     0.1289
PIQ           -0.9102     0.9794     1.0000    -0.9938     0.1049     0.1443
FSIQ           0.8959    -0.9941    -0.9938     1.0000    -0.1143    -0.1458
AGGRESS       -0.1442     0.1144     0.1049    -0.1143     1.0000     0.1911
WITHDRAW      -0.1305     0.1289     0.1443    -0.1458     0.1911     1.0000
---------------------------------------------------------------------------
Notice how dramatically the Variance Inflation Factor has changed for the variables that are highly correlated with one another (VIQ, PIQ, and FSIQ).
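To put those V.I.F. values on the tolerance scale used earlier, a quick sketch can invert them and flag anything below the .01 rule of thumb. The numbers are copied from the FSIQ output above; the names and the choice of .01 as the threshold are just for illustration.

```python
# V.I.F. values from the model that includes FSIQ.
vifs = {"VIQ": 182.99, "PIQ": 171.05, "FSIQ": 605.50, "AGGRESS": 1.05, "WITHDRAW": 1.10}

threshold = 0.01                     # "low tolerance" rule of thumb from the notes

# Tolerance is the inverse of the variance inflation factor.
tolerances = {name: 1.0 / v for name, v in vifs.items()}
low = [name for name, tol in tolerances.items() if tol < threshold]

for name, tol in tolerances.items():
    print(f"{name:9s} tolerance = {tol:.4f}")
print("below threshold:", low)
```

All three IQ scores fall below .01, which says that almost nothing about any one of them is unique once the other two are in the model.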
Testing the Addition of a Set of Predictors
Instead of asking whether the addition of just one additional parameter is worthwhile, we sometimes want to know whether the addition of a set of predictors would be useful.
Let's suppose we want to know whether the set of behavioral scores, AGGRESS and WITHDRAW, improves on the prediction we get from VIQ and PIQ alone.
We start by estimating the compact model:
MODEL C: READ = β0 + β1 VIQ + β2 PIQ + ei
Here are the Results:
---------------------------------------------------------------------------
Dependent variable: READ
---------------------------------------------------------------------------
                              Standard            T
Parameter       Estimate        Error         Statistic       P-Value
---------------------------------------------------------------------------
CONSTANT         31.5595        7.0776           4.46          1.0E-4
VIQ             0.558374       0.10748           5.20          1.0E-4
PIQ            0.0864393      0.105291           0.82          0.4137
---------------------------------------------------------------------------

Analysis of Variance
---------------------------------------------------------------------------
Source            Sum of Squares     Df    Mean Square    F-Ratio    P-Value
---------------------------------------------------------------------------
Model                    9604.43    2.0        4802.21      35.49     1.0E-4
Residual                 13124.3   97.0        135.302
---------------------------------------------------------------------------
Total (Corr.)            22728.8   99.0

R-squared = 42.2567 percent
R-squared (adjusted for d.f.) = 41.0662 percent
Standard error of est. = 11.632
Coeff. of variation = 13.347 percent
Mean absolute error = 9.05871
Durbin-Watson statistic = 1.93272

Remember that the Residual Sum of Squares is the SSE for this model. For the compact model the value is 13124.3.
Now estimate the augmented model:
MODEL A: READ = β0 + β1 VIQ + β2 PIQ + β3 AGGRESS + β4 WITHDRAW + ei
Only the important results are repeated here:
Analysis of Variance
---------------------------------------------------------------------------
Source            Sum of Squares     Df    Mean Square    F-Ratio    P-Value
---------------------------------------------------------------------------
Model                    10179.4    4.0        2544.84      19.26     1.0E-4
Residual                 12549.4   95.0        132.099
---------------------------------------------------------------------------
Total (Corr.)            22728.8   99.0
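With both Residual Sums of Squares in hand, the test for the set of two predictors is again a matter of arithmetic. The sketch below uses the values from the two Statlets outputs and assumes the usual PRE and F definitions for comparing a compact and an augmented model; note that the numerator now has 2 df because two parameters were added. The variable names are illustrative.

```python
# Residual Sum of Squares and residual df from the two Statlets outputs.
sse_c, df_c = 13124.3, 97            # MODEL C: VIQ and PIQ
sse_a, df_a = 12549.4, 95            # MODEL A: adds AGGRESS and WITHDRAW

ssr = sse_c - sse_a                  # error removed by the two added predictors
df_num = df_c - df_a                 # number of added parameters

pre = ssr / sse_c                    # proportional reduction in error
f = (ssr / df_num) / (sse_a / df_a)  # F with df_num and df_a degrees of freedom

print(f"SSR = {ssr:.1f}, PRE = {pre:.4f}, F({df_num}, {df_a}) = {f:.3f}")
```

The resulting F would then be compared with the critical value of F with 2 and 95 degrees of freedom at alpha = .05.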

