MULTIPLE REGRESSION II EXTRA SUM OF SQUARES


Multiple Regression - II


Extra Sum of Squares

An extra sum of squares measures the marginal reduction in the error sum of squares when one or more predictor variables are added to the regression model, given that other predictor variables are already in the model. Equivalently, one can view an extra sum of squares as measuring the marginal increase in the regression sum of squares when one or several predictor variables are added to the regression model.
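In symbols, when X1 is added to a model that already contains X2 (the standard definition, stated here for reference):

SSR(X1 | X2) = SSE(X2) - SSE(X1, X2) = SSR(X1, X2) - SSR(X2)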

Example: Body fat (Y) is to be explained by up to three predictors, singly and in combination: triceps skinfold thickness (X1), thigh circumference (X2), and midarm circumference (X3).


Body fat is hard to measure, but the predictor variables are easy to obtain.



Model (X1) Fit and ANOVA


Model (X2) Fit and ANOVA


Model (X1, X2) Fit and ANOVA


Model (X1, X2, X3) Fit and ANOVA

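For later reference, the error sums of squares of these four fits can be recovered from the total and extra sums of squares reported in the tables below (SSE(X2) is approximate, backed out of the 3.1% reduction quoted later in the partial-determination example):

SSTO = 495.39
SSE(X1) = 495.39 - 352.27 = 143.12
SSE(X1, X2) = 143.12 - 33.17 = 109.95
SSE(X1, X2, X3) = 109.95 - 11.54 = 98.41
SSE(X2) ≈ 113.4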

Extra Sum of Squares


Decomposition of SSR into Extra Sum of Squares

SSR(X1, X2, X3) = SSR(X1) + SSR(X2 | X1) + SSR(X3 | X1, X2)

What are other possible decompositions?

Note that the order of the X variables is arbitrary.
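With three predictors there are 3! = 6 possible orderings, for example:

SSR(X1, X2, X3) = SSR(X1) + SSR(X2 | X1) + SSR(X3 | X1, X2)
               = SSR(X2) + SSR(X3 | X2) + SSR(X1 | X2, X3)
               = SSR(X3) + SSR(X1 | X3) + SSR(X2 | X1, X3), and so on.

For the body fat data the first ordering gives 352.27 + 33.17 + 11.54 = 396.98, as shown in the tables below.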

ANOVA Table Containing Decomposition of SSR

Source of Variation       SS                     df       MS
Regression                SSR(X1, X2, X3)         3       MSR(X1, X2, X3)
  X1                      SSR(X1)                 1       MSR(X1)
  X2 | X1                 SSR(X2 | X1)            1       MSR(X2 | X1)
  X3 | X1, X2             SSR(X3 | X1, X2)        1       MSR(X3 | X1, X2)
Error                     SSE(X1, X2, X3)       n - 4     MSE(X1, X2, X3)
Total                     SSTO                  n - 1


ANOVA Table with Decomposition of SSR - Body Fat Example with Three Predictor Variables.

Source of Variation       SS        df      MS
Regression               396.98      3     132.33
  X1                     352.27      1     352.27
  X2 | X1                 33.17      1      33.17
  X3 | X1, X2             11.54      1      11.54
Error                     98.41     16       6.15
Total                    495.39     19


Computer Packages

SAS uses the term "Type I" (sequential) sums of squares to refer to the extra sums of squares.

Example in SAS Using Body Fat Data
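A call like the following produces a listing of this kind; the dataset name bodyfat and the variable names y, x1, x2, x3 are placeholders and should match whatever names the data actually use:

proc glm data=bodyfat;             /* "bodyfat" is an assumed dataset name */
  /* Type I (sequential) SS are the extra sums of squares in the order
     x1, x2, x3; Type III SS give each variable's extra SS given the others */
  model y = x1 x2 x3 / solution;   /* SOLUTION requests the parameter estimates */
run;
quit;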


The GLM Procedure


Dependent Variable: y


Sum of

Source DF Squares Mean Square F Value Pr > F


Model 3 396.9846118 132.3282039 21.52 <.0001


Error 16 98.4048882 6.1503055


Corrected Total 19 495.3895000



R-Square Coeff Var Root MSE y Mean


0.801359 12.28017 2.479981 20.19500



Source DF Type I SS Mean Square F Value Pr > F


x1 1 352.2697968 352.2697968 57.28 <.0001

x2 1 33.1689128 33.1689128 5.39 0.0337

x3 1 11.5459022 11.5459022 1.88 0.1896



Source DF Type III SS Mean Square F Value Pr > F


x1 1 12.70489278 12.70489278 2.07 0.1699

x2 1 7.52927788 7.52927788 1.22 0.2849

x3 1 11.54590217 11.54590217 1.88 0.1896






Standard

Parameter Estimate Error t Value Pr > |t|


Intercept 117.0846948 99.78240295 1.17 0.2578

x1 4.3340920 3.01551136 1.44 0.1699

x2 -2.8568479 2.58201527 -1.11 0.2849

x3 -2.1860603 1.59549900 -1.37 0.189



For more details on the decomposition of SSR into extra sums of squares, see the schematic representation in Figure 7.1 on page 261 of ALSM.



Mean Squares

An extra mean square is an extra sum of squares divided by its degrees of freedom; for example, MSR(X2 | X1) = SSR(X2 | X1) / 1 and MSR(X3 | X1, X2) = SSR(X3 | X1, X2) / 1.

Note that each extra sum of squares involving a single extra X variable has associated with it one degree of freedom.

Extra Sum of Squares from Several Variables

SSR(X2, X3 | X1) = SSE(X1) - SSE(X1, X2, X3)

SSR(X2, X3 | X1) = SSR(X2 | X1) + SSR(X3 | X1, X2)

Extra sums of squares involving two extra X variables, such as SSR(X2, X3| X1), have two degrees of freedom associated with them. This follows because we can express such an extra sum of squares as a sum of two extra sums of squares, each associated with one degree of freedom.
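For example, using the body fat values from the ANOVA table above: SSR(X2, X3 | X1) = SSR(X2 | X1) + SSR(X3 | X1, X2) = 33.17 + 11.54 = 44.71, carrying 2 degrees of freedom.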

Uses of Extra Sums of Squares in Tests for Regression Coefficients

Test Whether a Single Beta Coefficient is Zero (two tests are available).

  1. t-test (6.51b) discussed in chapter 6

  2. General Linear Test Approach

Full Model


Yi = β0 + β1 Xi1 + β2 Xi2 + β3 Xi3 + εi

Hypotheses

H0: β3 = 0        Ha: β3 ≠ 0

Reduced Model when H0 holds

Yi = β0 + β1 Xi1 + β2 Xi2 + εi

General Form of Test Statistic

F* = [ (SSE(R) - SSE(F)) / (dfR - dfF) ] ÷ [ SSE(F) / dfF ]

Form of Test Statistic for Testing a Single Beta Coefficient Equal Zero

F* = [ SSR(X3 | X1, X2) / 1 ] ÷ [ SSE(X1, X2, X3) / (n - 4) ] = MSR(X3 | X1, X2) / MSE(X1, X2, X3)

We do not need to fit both the full model and the reduced model: fitting only the full model in SAS already provides SSR(X3 | X1, X2) (the Type I sum of squares for x3, which for the last variable entered equals its Type III sum of squares) and MSE(X1, X2, X3). See the SAS output above.

Note: (1) Here the t-test and the F-test are equivalent tests.

(2) The F test of whether or not β3 = 0 is called a partial F test.

(3) The F test of whether or not all of β1, β2, ..., βp-1 equal zero is called the overall F test.
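As a quick numerical check against the SAS output above (the Type III row for x3 and the error mean square):

F* = MSR(X3 | X1, X2) / MSE(X1, X2, X3) = 11.546 / 6.150 ≈ 1.88,

which matches the reported F value for x3; the reported t value for x3 is -1.37, and (-1.37)² ≈ 1.88 = F*, illustrating the equivalence of the two tests.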

Test Whether Several Beta Coefficients Are Zero (only one test available).

General Linear Test Approach

Full Model


Yi = β0 + β1 Xi1 + β2 Xi2 + β3 Xi3 + εi

Hypotheses

H0: β2 = β3 = 0        Ha: not both β2 and β3 equal zero

Reduced Model when H0 holds

Yi = β0 + β1 Xi1 + εi

General Form of Test Statistic

F* = [ (SSE(R) - SSE(F)) / (dfR - dfF) ] ÷ [ SSE(F) / dfF ]   (the same general linear test statistic as before)

Form of Test Statistic for Testing Several Beta Coefficients Equal Zero

F* = [ SSR(X2, X3 | X1) / 2 ] ÷ [ SSE(X1, X2, X3) / (n - 4) ] = MSR(X2, X3 | X1) / MSE(X1, X2, X3)


Example: Body Fat

To test H0: β2 = β3 = 0 in the body fat data: F* = [ SSR(X2, X3 | X1) / 2 ] ÷ MSE(X1, X2, X3) = (44.71 / 2) / 6.15 ≈ 3.6, which is compared with the F(2, 16) distribution.

Other Tests, Where Extra Sums of Squares Cannot Be Used: here both the full model and the reduced model must be fitted.

Example

A test such as H0: β1 = β2 (equality of two coefficients) cannot be expressed in terms of extra sums of squares, so the reduced model has to be fitted explicitly.

Full Model

Yi = β0 + β1 Xi1 + β2 Xi2 + β3 Xi3 + εi

Hypotheses

H0: β1 = β2        Ha: β1 ≠ β2

Reduced Model when H0 holds

Yi = β0 + βc (Xi1 + Xi2) + β3 Xi3 + εi,   where βc denotes the common value of β1 and β2

General Test Statistic
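The statistic is again the general linear test statistic, now computed from two separate fits:

F* = [ (SSE(R) - SSE(F)) / (dfR - dfF) ] ÷ [ SSE(F) / dfF ],

where SSE(R) must come from actually fitting the reduced model, because the difference SSE(R) - SSE(F) is no longer one of the extra sums of squares printed by the regression package.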

Coefficients of Partial Determination

Descriptive measures of relationships that make use of the extra sums of squares.

Two Predictor Variables

The coefficient of multiple determination measures the proportionate reduction in the variation of Y achieved by the introduction of the entire set of X variables.
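For the body fat data with all three predictors, R² = SSR(X1, X2, X3) / SSTO = 396.98 / 495.39 = 0.80, which matches the R-Square of 0.8014 reported in the SAS output above.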

The coefficient of partial determination uses Y and X1, both "adjusted for X2," and measures the proportionate reduction in the variation of the "adjusted Y" achieved by including the "adjusted X1" (see Comment 2 on page 270).


R²(Y1|2) = SSR(X1 | X2) / SSE(X2)        R²(Y2|1) = SSR(X2 | X1) / SSE(X1)

General Case

R²(Y1|23) = SSR(X1 | X2, X3) / SSE(X2, X3)
R²(Y2|13) = SSR(X2 | X1, X3) / SSE(X1, X3)
R²(Y3|12) = SSR(X3 | X1, X2) / SSE(X1, X2)

Example

  • When X2 is added to the model containing X1, SSE is reduced by 23.2%:
    R²(Y2|1) = SSR(X2 | X1) / SSE(X1) = 33.17 / 143.12 = 0.232

  • When X3 is added to the model containing X1 and X2, SSE is reduced by 10.5%:
    R²(Y3|12) = SSR(X3 | X1, X2) / SSE(X1, X2) = 11.54 / 109.95 = 0.105

  • When X1 is added to the model containing X2, SSE is reduced by only 3.1%:
    R²(Y1|2) = SSR(X1 | X2) / SSE(X2) = 0.031



Multicollinearity and Its Effects

Some questions frequently asked are:

  1. What is the relative importance of the effects of the different predictor variables?

  2. What is the magnitude of the effect of a given predictor variable on the response variable?

  3. Can any predictor variable be dropped from the model because it has little or no effect on the response variable?

  4. Should any predictor variable not yet included in the model be considered for possible inclusion?

If the predictor variables included in the model are uncorrelated among themselves, and uncorrelated with any other predictor variables that are related to the response but omitted from the model, then relatively simple answers can be given. Unfortunately, in many nonexperimental situations in business, economics, and the social and biological sciences, the predictor variables are correlated.

For example, in the body fat data, triceps skinfold thickness (X1) and thigh circumference (X2) are highly correlated with each other.

When the predictor variables are correlated among themselves, intercorrelation or multicollinearity among them is said to exist.

Example of Perfectly Uncorrelated Predictor Variables (Table 7.6)



Models:


X1 and X2 are uncorrelated.

The regression coefficient for X1 is the same whether or not X2 is in the model, and likewise the regression coefficient for X2 is the same whether or not X1 is in the model.

This is one reason for conducting controlled experiments where possible: the levels of the predictor variables can be chosen to ensure that they are uncorrelated.

SSR(X1|X2)=SSR(X1)

SSR(X2|X1)=SSR(X2)

(1)

Y regressed on both X1 and X2:

Source of Variation SS df MS

Regression 402.250 2 201.125

Error 17.625 5 3.525

Total 419.875 7



(2)

Y regressed on X1 alone:

Source of Variation SS df MS

Regression 231.125 1 231.125

Error 188.75 6 31.458

Total 419.875 7



(3)

Y regressed on X2 alone:

Source of Variation SS df MS

Regression 171.125 1 171.125

Error 248.75 6 41.458

Total 419.875 7
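A numerical check using the three ANOVA tables above: the two single-predictor regression sums of squares add up to the two-predictor regression sum of squares, 231.125 + 171.125 = 402.250, which is exactly the statement SSR(X1|X2) = SSR(X1) and SSR(X2|X1) = SSR(X2) for uncorrelated predictors.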















Example of Perfectly Correlated Predictor Variables

Case i     Xi1     Xi2       Y     Pred-Y (Model 1)     Pred-Y (Model 2)
   1         2       6      23            23                   23
   2         8       9      83            83                   83
   3         6       8      63            63                   63
   4        10      10     103           103                  103



Models:



(1)

Ŷ = -87 + X1 + 18 X2

Perfect Relation between predictors:

X2=5+0.5 X1

(2)

Ŷ = -7 + 9 X1 + 2 X2

Two Key Implications

  1. The perfect relation between X1 and X2 does not inhibit our ability to obtain a good fit to the data (see the SAS sketch after this list).

  2. Since many different response functions provide the same good fit, we cannot interpret any one set of regression coefficients as reflecting the effect of different predictor variables.
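A minimal SAS sketch of this situation, using the four cases from the table above (the dataset and variable names are assumptions):

data perfect;                      /* hypothetical dataset name */
  input x1 x2 y;
  datalines;
2 6 23
8 9 83
6 8 63
10 10 103
;
run;

proc reg data=perfect;
  model y = x1 x2;                 /* x2 = 5 + 0.5*x1, so the X'X matrix is singular */
run;
quit;

PROC REG should report that the model is not of full rank and print only one of the infinitely many least-squares solutions; the fitted values nevertheless reproduce Y exactly, as in the Pred-Y columns above.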

Effects of Multicollinearity

We seldom find variables that are perfectly correlated. However, the implications just noted in our idealized example still have relevance.

  1. The fact that some or all predictor variables are correlated among themselves does not, in general, inhibit our ability to obtain a good fit.

  2. The counterpart in real life to many different regression functions providing equally good fits to the data in our idealized example is that the estimated regression coefficients tend to have large sampling variability when the predictor variables are highly correlated.

  3. The common interpretation of a regression coefficient, as measuring the change in the expected value of the response variable when the given predictor variable is increased by one unit while all the other predictors are held constant, is not fully applicable when multicollinearity exists.

Example: Body Fat


Effects on Regression Coefficients

  1. The estimated coefficients change markedly as each additional predictor is entered into the model.

  2. In the full three-predictor model, although the overall F-test is significant, none of the t-tests for the individual coefficients is significant.

  3. In the full three-predictor model, the variances of the estimated coefficients are inflated.


  4. The standard error of estimate is not substantially improved as more variables are entered into the model; thus fitted values and predictions are neither markedly more nor less precise.


Theoretical reason for the inflated variances: as the correlation between two predictors approaches one, the sampling variances of their estimated coefficients grow without bound.

  1. The primed variables Y', X1', X2' are obtained by the correlation transformation (see the sketch after this list).

  2. The X'X matrix of the primed variables is the correlation matrix rXX of the predictors.

  3. As r12² approaches 1, the variances of the estimated coefficients increase without bound.
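For reference, the correlation transformation referred to in point 1 is the standard one (stated here as a reminder):

Y'i = (1 / sqrt(n - 1)) (Yi - Ybar) / sY        X'ik = (1 / sqrt(n - 1)) (Xik - Xbar_k) / sk

where sY and sk are the sample standard deviations of Y and Xk. With this scaling, the X'X matrix of the transformed predictors is exactly the correlation matrix rXX.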

For two standardized predictors, inverting rXX gives σ²{b1'} = σ²{b2'} = (σ')² / (1 - r12²), where (σ')² is the error variance of the transformed model; these variances blow up as r12² approaches 1.

For more details, please read pages 272-278 of ALSM.


