Explained variance and Pythagoras’ theorem
31 July, 2012 at 14:39 | Posted in Statistics & Econometrics
In many statistical and econometric studies R² is used to measure goodness of fit – or more technically, the fraction of variance “explained” by a regression.
But it’s actually a rather weird measure. As eminent mathematical statistician David Freedman writes:
The math is fine, but the concept is a little peculiar … Let’s take an example. Sacramento is about 78 miles from San Francisco, as the crow flies. Or, the crow could fly 60 miles East and 50 miles North, passing near Stockton at the turn. If we take the 60 and 50 as exact, Pythagoras tells us that the squared hypotenuse in the triangle is
60² + 50² = 3600 + 2500 = 6100 miles².
With “explained” as in “explained variance”, the geography lesson can be cruelly summarized. The area – squared distance – between San Francisco and Sacramento is 6100 miles², of which 3600 is explained by East …
The theory of explained variance boils down to Pythagoras’ theorem on the crow’s triangular flight. Explaining the area between San Francisco and Sacramento by East is zany, and explained variance may not be much better.
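
To make the analogy concrete, here is a minimal Python sketch that puts the two calculations side by side. The regression data are made up purely for illustration (they are not from Freedman or from any study), and the variable names are my own. In a least-squares regression with an intercept, the fitted deviations and the residuals are at right angles, so the total sum of squares splits into an “explained” and a residual part exactly as the crow’s squared distance splits into East and North.

```python
import math

# Freedman's crow: 60 miles East, then 50 miles North.
east, north = 60.0, 50.0
squared_distance = east**2 + north**2      # 3600 + 2500 = 6100 square miles
distance = math.sqrt(squared_distance)     # about 78.1 miles
share_east = east**2 / squared_distance    # the share "explained by East"

# Hypothetical toy data (an assumption for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Ordinary least squares for y = a + b*x.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar
fitted = [a + b * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# The three "squared sides" of the regression triangle.
tss = sum((yi - y_bar) ** 2 for yi in y)        # total (squared hypotenuse)
ess = sum((fi - y_bar) ** 2 for fi in fitted)   # "explained"
rss = sum(e ** 2 for e in residuals)            # residual

# The right angle: residuals are orthogonal to the fitted deviations.
dot = sum(e * (fi - y_bar) for e, fi in zip(residuals, fitted))

r_squared = ess / tss

print(f"crow: {squared_distance:.0f} square miles, share East = {share_east:.3f}")
print(f"regression: TSS = {tss:.4f}, ESS + RSS = {ess + rss:.4f}, R2 = {r_squared:.4f}")
```

Here `share_east` plays the same role for the crow that `r_squared` plays for the regression: both are one squared leg divided by the squared hypotenuse.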