The background: I was making notes about the ANOVA table for “least squares” linear regression and reviewing how to derive the “sum of squares” equality:

Total Sum of Squares = Sum of Squares Regression + Sum of Squares Error or…

If $y_i$ is the observed response, $\bar{y}$ the sample mean of the responses, and $\hat{y}_i$ the responses predicted by the best fit line (simple linear regression here), then:

$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

(where each sum runs over the $n$ observations).

Now for each $i$ it is easy to see that $y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$, and while this equality fails term by term once each side is squared, it still holds provided you sum over all $n$ observations!
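A quick numerical check makes both points (this is a sketch using NumPy with made-up toy data, not anything from the derivation itself):

```python
import numpy as np

# Made-up toy data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Simple linear regression via np.polyfit (degree 1 returns slope, intercept).
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x
y_bar = y.mean()

sst = np.sum((y - y_bar) ** 2)       # total sum of squares
ssr = np.sum((y_hat - y_bar) ** 2)   # sum of squares regression
sse = np.sum((y - y_hat) ** 2)       # sum of squares error

# The squared identity fails for the individual terms...
print(np.allclose((y - y_bar) ** 2, (y_hat - y_bar) ** 2 + (y - y_hat) ** 2))
# ...but holds for the sums.
print(np.isclose(sst, ssr + sse))
```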

And it was going over the derivation of this that reminded me about an important fact about least squares that I had overlooked when I first presented it.

If you go into the derivation and calculate:

$\sum (y_i - \bar{y})^2 = \sum \left[(\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)\right]^2$

which equals

$\sum (\hat{y}_i - \bar{y})^2 + 2\sum (\hat{y}_i - \bar{y})(y_i - \hat{y}_i) + \sum (y_i - \hat{y}_i)^2$

the proof is completed by showing that:

$\sum (\hat{y}_i - \bar{y})(y_i - \hat{y}_i) = \sum \hat{y}_i (y_i - \hat{y}_i) - \bar{y}\sum (y_i - \hat{y}_i)$

and that BOTH of these sums are zero.

But why?

Let’s go back to how the least squares equations were derived:

Given that the least squares coefficients minimize

$F(a,b) = \sum (y_i - (a + bx_i))^2$

setting $\frac{\partial F}{\partial a} = -2\sum (y_i - a - bx_i) = 0$ yields $\sum (y_i - \hat{y}_i) = 0$. That is, under the least squares equations, the sum of the residuals is zero.

Now setting $\frac{\partial F}{\partial b} = -2\sum x_i (y_i - a - bx_i) = 0$ yields

$\sum x_i (y_i - \hat{y}_i) = 0$

That is, the sum of the residuals, weighted by the corresponding $x$ values (inputs), is also zero. Note: this holds in multilinear regression as well.
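Both identities are easy to verify numerically (again a sketch with made-up data; any data set will do, since the identities hold for every least squares fit):

```python
import numpy as np

# Made-up data for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

b, a = np.polyfit(x, y, 1)   # slope and intercept of the best fit line
r = y - (a + b * x)          # residuals

print(r.sum())               # sum of residuals: zero up to rounding
print((x * r).sum())         # x-weighted sum of residuals: also zero
```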

Really, that is what the least squares process does: it sets the sum of the residuals and the sum of the weighted residuals equal to zero.

Yes, there is a linear algebra formulation of this.
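A sketch of that formulation, in standard notation the post does not itself define: let $X$ be the $n \times 2$ design matrix whose first column is all ones and whose second column holds the $x_i$. The normal equations say the residual vector is orthogonal to the columns of $X$:

```latex
X^{\top} X \hat{\beta} = X^{\top} y
\quad\Longleftrightarrow\quad
X^{\top}\left(y - X\hat{\beta}\right) = 0
```

The first row of $X^{\top}$ (all ones) gives $\sum (y_i - \hat{y}_i) = 0$, and the second row (the $x_i$) gives $\sum x_i (y_i - \hat{y}_i) = 0$; with more predictor columns this is exactly the multilinear case noted above.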

Anyhow, returning to our sum:

$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + 2\sum (\hat{y}_i - \bar{y})(y_i - \hat{y}_i) + \sum (y_i - \hat{y}_i)^2$

Now for the other term:

$2\sum (\hat{y}_i - \bar{y})(y_i - \hat{y}_i) = 2\sum \hat{y}_i (y_i - \hat{y}_i) - 2\bar{y}\sum (y_i - \hat{y}_i)$

Now $2\bar{y}\sum (y_i - \hat{y}_i) = 0$ as it is a constant multiple of the sum of the residuals, and $2\sum \hat{y}_i (y_i - \hat{y}_i) = 2a\sum (y_i - \hat{y}_i) + 2b\sum x_i (y_i - \hat{y}_i) = 0$ as it splits into a constant multiple of the sum of the residuals plus a constant multiple of the sum of the residuals weighted by the $x_i$.
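And as a final numerical sanity check (same sketch style as before, with made-up data), the cross term really does vanish:

```python
import numpy as np

# Made-up data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.9, 4.2, 5.8, 8.1, 10.2, 11.8])

b, a = np.polyfit(x, y, 1)
y_hat = a + b * x
y_bar = y.mean()

# The cross term from the expansion of the total sum of squares.
cross = np.sum((y_hat - y_bar) * (y - y_hat))
print(cross)   # zero up to floating point rounding
```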

That was pretty easy, wasn’t it?

But the role that the basic least squares equations played in this derivation went right over my head!