Laplace’s rule of succession and Bayesian priors
26 Jan, 2022 at 15:37 | Posted in Statistics & Econometrics

After their first night in paradise, and having seen the sun rise in the morning, Adam and Eve were wondering whether they would experience another sunrise or not. Given the rather restricted sample of sunrises experienced, what could they expect? According to Laplace’s rule of succession, the probability of an event E happening again after it has occurred n times is p(E|n) = (n+1)/(n+2).
The probabilities can be calculated using Bayes’ rule, but to get the calculations going, Adam and Eve must have an a priori probability (a base rate) to start with. The Bayesian rule of thumb is simply to assume that all outcomes are equally likely. Applying this rule, Adam’s and Eve’s probabilities become 1/2, 2/3, 3/4 … after zero, one, two, … observed sunrises.
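To see where this sequence comes from, here is a minimal sketch (assuming the usual reading of Laplace’s rule: a uniform Beta(1,1) prior on the unknown probability of a sunrise, so that after n observed sunrises the posterior predictive probability of another one is (n+1)/(n+2)):

```python
from fractions import Fraction

def rule_of_succession(n: int) -> Fraction:
    """Probability of a further success after n successes in n trials,
    under a uniform (Beta(1,1)) prior on the unknown success probability."""
    return Fraction(n + 1, n + 2)

# Adam and Eve's sunrise probabilities after 0, 1, 2, 3 observed sunrises:
print([rule_of_succession(n) for n in range(4)])
# [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4), Fraction(4, 5)]
```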
Now this might seem rather straightforward, but as Keynes (1921), for example, already noted in his Treatise on Probability, there might be a problem here. The problem has to do with the prior probability and where it is assumed to come from. Is the appeal to the principle of insufficient reason — the principle of indifference — really warranted?
Assume there is a certain quantity of liquid containing wine and water, mixed so that the ratio of wine to water (r) lies between 1/3 and 3/1. What, then, is the probability that r ≤ 2? The principle of insufficient reason means that we have to treat all r-values as equiprobable, assigning a uniform probability distribution between 1/3 and 3/1, which gives the probability of r ≤ 2 as (2 - 1/3)/(3 - 1/3) = 5/8.
But to say r ≤ 2 is equivalent to saying that 1/r ≥ 1/2. Since the water-to-wine ratio 1/r also lies between 1/3 and 3/1, applying the principle to 1/r instead gives the probability of 1/r ≥ 1/2 as (3 - 1/2)/(3 - 1/3) = 15/16. So we seem to get two different answers for the same event, both following from the same application of the principle of insufficient reason. Given this unsolved paradox, we have reason to stick with Keynes and be skeptical of Bayesianism.
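A quick numerical check of the two calculations (a rough Monte Carlo sketch, one run for each of the two indifference priors described above):

```python
import random

random.seed(0)
N = 1_000_000

# Prior 1: treat r (wine/water) as uniform on [1/3, 3].
p1 = sum(random.uniform(1/3, 3) <= 2 for _ in range(N)) / N

# Prior 2: treat 1/r (water/wine) as uniform on [1/3, 3];
# the event r <= 2 is then the event 1/r >= 1/2.
p2 = sum(random.uniform(1/3, 3) >= 0.5 for _ in range(N)) / N

print(p1)   # ~0.625  = 5/8
print(p2)   # ~0.9375 = 15/16
```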
5 Comments
Is this a fundamental problem for math?
.
r/r = 1 by fundamental identity, right?
.
But doesn’t r calculated the first way (0.625) times 1/r, calculated another way as 0.0625, equal 10, not 1 as the foundational mathematical identity predicts?
.
Can we set r <= 2?
.
r <= (2 - (1/3)) / (3 - (1/3)) = 0.625
.
1/r <= 1/0.625 = 1.6
.
But, 1/r <= ((1/2)-(1/3)) / (3-(1/3)) = 0.0625, not 1.6 as predicted by simple bedrock identity theory r/r = 1?
.
Is basic math off by, calculated this way at least, a factor of ten?
.
Checks: r * 1/r = 1; 2 * 1/2 = 1; 0.625 * 1.6 = 1; 0.625 * 0.0625 = 0.039 (doesn't the first relation predict 1 here, so is that 96% off?)
Comment by rsm — 28 Jan, 2022
No. Not a fundamental problem for math, but otherwise, yes.
1/2 and 1/3 are multiplicative inverses of 2 and 3, but that’s totally meaningless in the linear scaling that all the other calculations are doing. Draw it out on a number line, see how much closer 1/2 is to 1/3 than 2 is to 3, then think what happens when you use these differences in denominators, believing they’re going to be comparable. Hats off to Prof. Syll for a really sneaky counterfactual.
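A tiny numerical illustration of that point (a sketch):

```python
# 1/2 and 1/3 sit much closer together on the number line than 2 and 3 do,
# so the reciprocal map s = 1/r stretches and shrinks intervals unevenly.
print(abs(1/2 - 1/3))   # ~0.1667 : gap between the reciprocals
print(abs(2 - 3))       # 1.0     : gap between the originals

# Hence "uniform over r" and "uniform over 1/r" put different amounts of
# probability on one and the same event (r <= 2, i.e. 1/r >= 1/2).
```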
Comment by Mel — 28 Jan, 2022
Mel (after having wrestled with this problem quite extensively over the past few days), may I still present my argument that a fundamental breakdown in math is occurring?
.
To make things simple, can we take r = 5 on a 0-10 scale?
.
r x (1/r) = 1; 5 x (1/5) = 1; but when we scale by 10, do we need to find (r/10) x (1/(r/10))?
.
So, (5/10) x (1 / (5/10))? Or 0.5 x 2 = 1? But what does “2” mean here, 200%? What sense does it make to say the probability of (1/r) = (1/ (5/10)) = 200%?
.
Don’t we really want 1/r to equal 0.2? Then the probability of ((1/5)/10) = 2%, not 200%? But to get our preferred, rational outcome of 1/r = 2% (not 200%), don’t we have to rearrange associations so that (1/r) = ((1/5)/10)?
.
Thus, if we want to calculate r x (1/r) we use (1/r) = (1/(5/10)); but when we want a sensible percentage for (1/r), we use (1/r) = ((1/5)/10)?
.
Checks: r = 5/10; (5/10) x (1/(5/10)) = 1; but since (1/(5/10)) = 2 = 200%, which makes no sense as a percentage for 1/r, we must arbitrarily regroup 1/r so that it equals ((1/5)/10) = 0.02 = 2%?
.
So 1/r calculated the first way correctly solves r x (1/r) but only if we regroup 1/r do we get a correct percentage value for 1/r? (And when we regroup 1/r to yield 2% not 200%, does not that make our r x (1/r) calculation = 0.01 not the 1 we expect?)
.
What axiom allows such an arbitrary regrouping depending on what result we want?
.
What am I missing?
.
Why does 1/r have to be both (1/(5/10)) and ((1/5)/10)?
Comment by rsm — 31 Jan, 2022
A Bayesian analysis only allows one prior distribution.
In a particular application, if for some strange reason Prof. Syll really does think that all r ratios (wine/water) are equally likely, that’s his privilege.
But he can’t simultaneously use a very different inconsistent prior such as equiprobable 1/r (water/wine).
So the alleged paradox doesn’t exist.
The principle of insufficient reason — the principle of indifference — is not an excuse for inconsistent woolly thinking.
The prior should say what you mean and mean what you say, not two different things at the same time.
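A minimal simulation of that point (a sketch, assuming for illustration the single uniform prior on r over [1/3, 3]):

```python
import random

random.seed(1)
N = 1_000_000

# One prior only: r (wine/water) uniform on [1/3, 3].
rs = [random.uniform(1/3, 3) for _ in range(N)]

p_via_r = sum(r <= 2 for r in rs) / N       # P(r <= 2)
p_via_s = sum(1/r >= 0.5 for r in rs) / N   # the same event, expressed via 1/r

print(p_via_r, p_via_s)   # both ~0.625: one prior, one answer, however the event is phrased
```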
Comment by Kingsley Lewis — 26 Jan, 2022
“Given this unsolved paradox, we have reason to stick with Keynes and be skeptical of Bayesianism” – The apparent paradox is only a reason to be skeptical of Laplace’s rule and of uniform or indifference priors as general defaults. It means nothing for the thousands of Bayesian variants that ignore the inverse-probability (premodern Bayesian) approaches which prevailed up to, but not beyond, the time of Keynes.
As usual, we have a Good tonic against dated critiques of Bayesianism,
https://www.jstor.org/stable/20115014
and there are much more germane reasons than the shortcomings of Laplace’s rule for being wary of some modern Bayesian promotions (like their failure to make solid logical contact with frequencies, causation, satisficing, and other ways we actually perceive the world and organize those perceptions into laws)…
https://www.jstor.org/stable/2683105
https://www.semanticscholar.org/paper/Why-I-am-not-a-Bayesian-Glymour/4a050fb3cba56d571a454397224ddd79c16dd91f
https://aiws.net/practicing-principles/modern-causal-inference/judea-pearls-works/writing/bayesianism-and-causality-or-why-i-am-only-a-half-bayesian/
https://projecteuclid.org/journals/bayesian-analysis/volume-3/issue-3/Objections-to-Bayesian-statistics/10.1214/08-BA318.full
[I know at least the latter has been cited prominently on this blog before]
Comment by lesdomes — 26 Jan, 2022