## On the scaling property of randomness (wonkish)

7 July, 2016 at 00:16 | Posted in Economics | 1 Comment

I thought hard and long on how to explain with as little mathematics as possible the difference between noise and meaning, and how to show why the time scale is important in judging an historical event. The Monte Carlo simulator can provide us with such an intuition. We will start with an example borrowed from the investment world …

Let us manufacture a happily retired dentist, living in a pleasant sunny town. We know a priori that he is an excellent investor, and that he will be expected to earn a return of 15% in excess of Treasury bills, with a 10% error rate per annum (what we call volatility). It means that out of 100 sample paths, we expect close to 68 of them to fall within a band of plus and minus 10% around the 15% excess return, i.e. between 5% and 25% (to be technical; the bell-shaped normal distribution has 68% of all observations falling between −1 and 1 standard deviations). It also means that 95 sample paths would fall between −5% and 35% …

A 15% return with a 10% volatility (or uncertainty) per annum translates into a 93% probability of making money in any given year. But seen at a narrow time scale, this translates into a mere 50.02% probability of making money over any given second. Over the very narrow time increment, the observation will reveal close to nothing …

This scaling property of randomness is generally misunderstood, even by professionals. I have seen Ph.D.s argue over a performance observed in a narrow time scale (meaningless by any standard) …

The same methodology can explain why the news (the high scale) is full of noise and why history (the low scale) is largely stripped of it (though fraught with interpretation problems).

Nassim N. Taleb
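Taleb's numbers can be checked with a short calculation rather than a full Monte Carlo run. If excess returns are i.i.d. normal, a horizon of t years has mean 0.15t and standard deviation 0.10√t, so the probability of a positive return is Φ(0.15√t / 0.10). The sketch below assumes a trading calendar of 250 eight-hour days (a hypothetical choice, needed to recover Taleb's "per second" figure):

```python
from math import erf, sqrt

def p_positive(mu: float, sigma: float, t: float) -> float:
    """P(return > 0) over a horizon of t years, assuming i.i.d. normal
    returns: mean mu*t, sd sigma*sqrt(t), so P = Phi(mu*sqrt(t)/sigma)."""
    z = mu * sqrt(t) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF at z

MU, SIGMA = 0.15, 0.10           # 15% excess return, 10% annual volatility
SECONDS_PER_YEAR = 250 * 8 * 3600  # assumed trading calendar

print(p_positive(MU, SIGMA, 1.0))                    # ~0.93 over one year
print(p_positive(MU, SIGMA, 1 / SECONDS_PER_YEAR))   # ~0.5002 over one second
```

Note how the edge μ√t/σ shrinks with √t: the signal fades much faster than the horizon does, which is exactly the scaling property the dentist example illustrates.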

And it actually gets even worse if we leave the ‘certainty’ of Monte Carlo simulations and get into reality. Since return volatilities often follow different scaling laws at different horizons, there is no way of simply converting short horizon ‘risks’ into long horizon ‘risks’ by using a universal scaling parameter (unless you assume the data distribution is iid, which, of course, it is not if we are talking about financial time series data).
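To see concretely why a universal scaling parameter fails once the iid assumption is dropped, here is a minimal sketch (my illustration, not anything from the text above) comparing the square-root-of-time rule with the exact volatility of autocorrelated AR(1) daily returns. Even mild positive autocorrelation (φ = 0.3) makes the true annual volatility substantially larger than the √t rule predicts:

```python
from math import sqrt

def scaled_vol(sigma_daily: float, n: int) -> float:
    """Square-root-of-time rule: n-day vol under the iid assumption."""
    return sigma_daily * sqrt(n)

def ar1_sum_vol(sigma_daily: float, phi: float, n: int) -> float:
    """Exact sd of the n-day sum of AR(1) returns with marginal sd
    sigma_daily and autocorrelation phi at lag k equal to phi**k."""
    var = n * sigma_daily**2
    for k in range(1, n):
        var += 2 * (n - k) * (phi**k) * sigma_daily**2
    return sqrt(var)

sigma, days = 0.01, 252           # 1% daily vol, 252 trading days
print(scaled_vol(sigma, days))    # ~0.159: what the sqrt(t) rule reports
print(ar1_sum_vol(sigma, 0.3, days))  # ~0.216: the true annual vol
```

With φ = 0 the two functions agree exactly, which is the iid special case; any persistence in returns breaks the conversion, so short-horizon ‘risk’ numbers cannot simply be rescaled to long horizons.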

In my view, regression models are not a particularly good way of doing empirical work in the social sciences today, because the technique depends on knowledge that we do not have. Investigators who use the technique are not paying adequate attention to the connection – if any – between the models and the phenomena they are studying. Their conclusions may be valid for the computer code they have created, but the claims are hard to transfer from that microcosm to the larger world …

Regression models often seem to be used to compensate for problems in measurement, data collection, and study design. By the time the models are deployed, the scientific position is nearly hopeless. Reliance on models in such cases is Panglossian …

Causal inference from observational data presents many difficulties, especially when underlying mechanisms are poorly understood. There is a natural desire to substitute intellectual capital for labor, and an equally natural preference for system and rigor over methods that seem more haphazard. These are possible explanations for the current popularity of statistical models.

Indeed, far-reaching claims have been made for the superiority of a quantitative template that depends on modeling – by those who manage to ignore the far-reaching assumptions behind the models. However, the assumptions often turn out to be unsupported by the data. If so, the rigor of advanced quantitative methods is a matter of appearance rather than substance.