## Econometrics — the signal-to-noise problem

19 Mar, 2020 at 11:24 | Posted in Statistics & Econometrics | 3 Comments

When we first encounter the term “noisy data” in econometrics, we are usually told that it refers to the problem of measurement error, or errors-in-variables, especially in the explanatory variables (x). Most textbooks contain a discussion of measurement error bias. In a bivariate regression, y = a + bx + u, classical measurement error in x biases the ordinary least squares (OLS) estimator toward zero: the OLS slope converges to b multiplied by the “reliability ratio” σ²x/(σ²x + σ²e), where σ²e is the measurement error variance. The magnitude of the bias therefore depends on the ratio of the measurement error variance to the variance of x. If that ratio is small, the bias is negligible; if it is large, the measurement error can “drown” the true variation in x, and the bias is severe.

In principle, the extent of the bias can be assessed by a simple formula, but in practice, this is rarely done. This is partly because we need to know the variance of the measurement error and, in most cases, we simply don’t know that. But there is more to it than that. There is a common opinion among many econometricians that, relative to the other problems of econometrics, a little bit of measurement error really doesn’t matter very much. Unfortunately, this misses the point. It is not the absolute size of the measurement error that matters, but its size relative to the variation in x. Nevertheless, many econometricians just ignore the problem …
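The attenuation result can be checked with a small simulation (a sketch, assuming a normal regressor and classical, additive measurement error; variable names and parameter values are illustrative). The OLS slope on the noisy regressor shrinks toward zero by the reliability ratio σ²x/(σ²x + σ²e):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
a, b = 1.0, 2.0                 # true intercept and slope (assumed values)
sigma_x, sigma_u = 1.0, 1.0     # std. dev. of true x and of the regression error u

x = rng.normal(0.0, sigma_x, n)            # true (unobserved) regressor
y = a + b * x + rng.normal(0.0, sigma_u, n)

for sigma_e in (0.1, 1.0, 3.0):            # measurement-error std. dev.
    x_obs = x + rng.normal(0.0, sigma_e, n)            # observed, noisy regressor
    b_hat = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)  # OLS slope estimate
    reliability = sigma_x**2 / (sigma_x**2 + sigma_e**2)    # attenuation factor
    print(f"sigma_e = {sigma_e}: b_hat = {b_hat:.3f}, "
          f"plim = {b * reliability:.3f}")
```

With a small error variance relative to var(x) the estimate stays close to the true slope of 2; when the error variance equals var(x) the slope is roughly halved; and when noise dominates, the estimate is dragged most of the way to zero, illustrating why it is the *relative*, not absolute, size of the measurement error that matters.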

Kalman proposed to “adopt the contemporary—very wide—implications of the word ‘noise,’ as used in physics and engineering: any causal or random factors that should not or cannot be modeled, about which further information is not available, which are not analyzable, which may not recur reproducibly, etc. Thus, ‘noise’ = the ‘unexplained.’ This is a much more comprehensive category.”

This means that “noise” should include not just measurement errors and ambiguities in our economic concepts, but also any idiosyncrasies and peculiarities in individual observations which are not explained by the economic relationship we are interested in, and which indeed obscure that relationship. Noisy data become a problem when the noise dominates the signal we want to observe. For Kalman, moreover, noisy data cannot be ignored, because noisy data must imply a noisy model. More precisely: “When we have noisy data, the uncertainty in the data will be inherited by the model. This is a fundamental difficulty; it can be camouflaged by adopting some prejudice but it cannot be eliminated.”

Peter Swann

## 3 Comments »

1. The hardness of econometric analysis lies in our attempt to statistically explain and project social variables in open and uncontrollable settings that are ever more globally connected. Since that is next to impossible, the error noise must be ameliorated (not eliminated) by a “wisdom bias”: expert experience that can provide wise advice about a particular subject. For example, since oil is a factor of production in the economic growth of all OECD nations, if you are projecting that growth, get expert opinion on the range of oil prices and run different scenarios of what may be. Econometrics, like a 🚗, is a valuable tool, but it needs both experienced mechanics and seasoned drivers who can read a map.

2. In an information system where information is used as a means of competition and profiteering there is not only noise. All the different methods of Information Warfare are used.

3. There is noisy and there is dirty. Treating dirty as noisy is being more than a bit in denial.