Probability and confidence — Keynes vs. Bayes

22 December, 2016 at 15:29 | Posted in Economics | 13 Comments

An alternative possibility is to accept the consequences of the apparent fact that the central prediction of the Bayesian model in its descriptive capacity, that people’s choices are or are ‘as if’ they are informed by real-valued subjective probabilities, is, in general, false …

According to Keynes’s decision theory it is rational to prefer to be guided by probabilities determined on the basis of greater evidential ‘weight’, the amount or completeness, in some sense, of the relevant evidence on which a judgement of probability is based … Keynes later suggests a link between weight and confidence, distinguishing between the ‘best estimates we can make of probabilities and the confidence with which we make them’ … The distinction between judgement of probability and the confidence with which it is made has no place in the world of a committed Bayesian because it drives a wedge into the link between choices and degrees of belief on which it is founded.

Jochen Runde

According to Keynes we live in a world permeated by unmeasurable uncertainty – not quantifiable stochastic risk – which often forces us to make decisions based on anything but ‘rational expectations.’ Keynes rather thinks that we base our expectations on the confidence or ‘weight’ we put on different events and alternatives. To Keynes, expectations are a question of weighing probabilities by ‘degrees of belief,’ beliefs that often have precious little to do with the kind of stochastic probabilistic calculations made by rational agents as modeled by mainstream economics.


13 Comments

  1. I wanted to study statistics, but reading your blog I’m changing my mind. It seems statistical models can hardly be applied to anything, because their assumptions are never verified in reality.

    • Oh don’t do that! I definitely think you should study statistics for at least a semester or two. We certainly need a new generation of statistics cognocenti that are aware of its possibilities and — as David Freedman — its limitations (especially when applied in social sciences).

    • Oh, thanks for the answer. Still, I wonder why a growing number of firms hire statisticians and data scientists if they are of so little use.

      • Remember that firms use statistics mostly at a descriptive and ‘data reduction’ level, where it is rather unproblematic. It’s when it comes to the kind of statistics that researchers are interested in — inferential statistics in general, and more specifically for making causal inferences — that problems begin to mount.
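        To make the distinction concrete, a minimal Python sketch (my own illustration, with invented numbers):

        import numpy as np
        from scipy import stats

        sales = np.array([102, 98, 110, 95, 107, 99, 103, 101])  # hypothetical figures

        # Descriptive 'data reduction': just summarizes the data at hand,
        # no assumptions required
        print(sales.mean(), sales.std(ddof=1))

        # Inferential: testing 'population mean = 100' already assumes the
        # figures are an IID sample from some (roughly normal) population,
        # and this is where the problems begin to mount
        print(stats.ttest_1samp(sales, popmean=100))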

    • Many thanks for the replies. I’m considering a degree in statistics, and I wanted to understand whether statistical models are applicable to reality (even if one has to be careful) or whether they are beautiful formulas and theoretical constructs with no practical application.
      For example, does econometrics (and microeconometrics) still have its uses, albeit scaled back, or is it completely useless?

  2. “the central prediction of the Bayesian model in its descriptive capacity, that people’s choices are or are ‘as if’ they are informed by real-valued subjective probabilities, is, in general, false”
    .
    According to which authority is this “the central prediction of the Bayesian model”?
    .
    The Bayesian model posits that subjective belief may be reduced to a real-valued probability via the artificial mechanism of a wager. The Bayesian model (at least anywhere I’ve read) does not “predict” the converse, i.e. that subjective belief is composed from real-valued probabilities via the mechanism of wagers.
    .
    The flaw of Bayesian statistics (to the extent there is one) is also shared by frequentist/Fisherian statistics: the use of point estimation rather than interval estimation for distribution parameters. Both are simplifications/reductions of the likelihood function of the given distribution, parameterized by the given observation.
    .
    Used correctly, both approaches converge to the same point estimates with larger samples as the corresponding interval estimates narrow.
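    To see what this convergence claim amounts to, a minimal numerical sketch (my own construction; the prior, seed and sample sizes are arbitrary choices):

    import numpy as np
    from scipy.stats import beta

    rng = np.random.default_rng(0)
    p_true = 0.3
    a0, b0 = 2.0, 2.0                         # Beta(2, 2) prior, arbitrarily chosen

    for n in (10, 100, 10_000):
        x = rng.binomial(1, p_true, size=n)
        k = int(x.sum())
        mle = k / n                           # frequentist point estimate
        post_mean = (a0 + k) / (a0 + b0 + n)  # Bayesian point estimate
        se = (mle * (1 - mle) / n) ** 0.5
        wald = (mle - 1.96 * se, mle + 1.96 * se)            # 95% confidence interval
        cred = beta.ppf([0.025, 0.975], a0 + k, b0 + n - k)  # 95% credible interval
        print(n, round(mle, 3), round(post_mean, 3), wald, cred)

    Both point estimates approach p_true and both intervals narrow as n grows.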
    .
    Again, if the complaint is that people commonly apply statistical techniques that are only valid under IID to problems with no assurance of IID (e.g. “a world permeated by unmeasurable uncertainty – not quantifiable stochastic risk”), well, yes, but Bayesians certainly have no monopoly on that malpractice. Far from it.

    • “Bayesians certainly have no monopoly on that malpractice” — on that we totally agree!

  3. Complete the sentence: “the central prediction of the Bayesian model in its descriptive capacity”. That is, insofar as it attempts to tell us something about what probability is and how probability functions in the real world.
    .
    The central descriptive claim of Bayesianism is that probability just is the subjective “degree of belief” of an agent. Literally nothing else in Bayesianism makes sense or is in any way motivated without this claim.
    .
    Moreover, “degree of belief” is interpreted to be a quantitative matter. Again, this is a claim about the world, about the very nature of probability. If human beliefs don’t come in quantitative degrees, then there are no probabilities for the Bayesian.
    .
    Do beliefs come in degrees? Is there such a thing as “a third of a belief”? Few pause to ask such questions. At the very least, the language of “degrees of belief” is unfortunate, if not outright nonsensical.
    .
    Then, let’s say for argument’s sake that beliefs do come in degrees. Are those degrees quantitative? It is entirely erroneous to assume that degree entails quantity — though again, this issue is almost never broached. If they are not quantitative, then there is no basis whatsoever for assuming that they obey the probability calculus and its mathematical laws. In other words, conclusions derived from applying mathematical operations to those “degrees of belief” are quite invalid.
    .
    If they are quantitative, then they can be measured. How do you measure something subjective? Ramsey’s admittedly clever response was to present the betting interpretation.
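    For concreteness, a toy sketch of the betting interpretation and the associated Dutch-book argument (my own illustration, not Ramsey’s formulation):

    def dutch_book(price_e: float, price_not_e: float):
        """The agent's 'degree of belief' in an event is read off as the price
        they consider fair for a ticket paying 1 if the event occurs. If the
        prices for an event and its complement don't sum to 1, a sure profit
        can be extracted from the agent; return it, or None if coherent."""
        total = price_e + price_not_e
        if total > 1:            # agent overpays: sell them both tickets,
            return total - 1     # collect total, pay out exactly 1 regardless
        if total < 1:            # agent underprices: buy both tickets,
            return 1 - total     # pay total, receive exactly 1 regardless
        return None              # prices obey the probability calculus

    print(dutch_book(0.6, 0.5))  # 0.1 sure profit: 'beliefs' incoherent
    print(dutch_book(0.7, 0.3))  # None: coherent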
    .
    Your formulation — that “subjective belief may be reduced to a real-valued probability via the artificial mechanism of a wager” — is fine, depending on what is meant by “reduced to”. Three interpretations come to mind. One is ontological: subjective belief just is the odds placed, or perhaps the “disposition” to place specific odds in a specific situation. I don’t quite know what you mean when denying that Bayesianism is committed to the idea that “subjective belief is composed from real-valued probabilities via the mechanism of wagers”, but far from the ontological interpretation being somehow alien to “the Bayesian model”, it was fervently advocated by one of Bayesianism’s most famous partisans, Bruno de Finetti.
    .
    The other two interpretations I can think of are explanatory (subjective degrees of belief cause agents to pick the odds they do, or “would do”) or methodological (i.e. betting odds measure subjective degrees of belief). I’ve already dealt with the measurement interpretation. Successful measurement requires that the attributes being measured have quantitative structure. So the claim of the Bayesian model to measure subjective degrees of belief through real-valued wagers requires the degrees of belief to be real-valued themselves. Otherwise, none of the results of mathematical manipulations conducted on those “measurements” has any validity.
    .
    The causal interpretation faces the same objections, plus additional ones. (Why should we care about features of the effect when it’s the alleged cause we’re actually interested in? On what basis can we infer from features of the effect to features of the cause?)
    .
    Now the betting interpretation of degrees of belief faces all kinds of well-known problems anyway, regardless of how it is (in its turn) interpreted. It is a form of operationalism and behaviouralism, and faces their obvious objections. (Relatedly, it faces the same intrinsic problems as Samuelson’s muddled “revealed preferences” technique.)

    “Used correctly, both approaches converge to the same point estimates with larger samples as the corresponding interval estimates narrow.”

    Indeed, if “used correctly” includes “if their assumptions are actually satisfied by the target phenomenon to which they’re applied”.

    “only valid under IID to problems with no assurance of IID (e.g. ‘a world permeated by unmeasurable uncertainty – not quantifiable stochastic risk’)”

    Although the problems of the failure to verify that IID obtains are indeed serious and widely underappreciated, the phenomenon of “unmeasurable uncertainty” goes way beyond IID failure.

    • “The central descriptive claim of Bayesianism is that probability just is the subjective “degree of belief” of an agent. Literally nothing else in Bayesianism makes sense or is in any way motivated without this claim.”
      .
      Uh, no. The central claim of Bayesianism is that “priors” are formed from an initial real-valued representation of subjective degree of belief updated with observations. That’s the formula. More observations, more arithmetic, less subjective degree of belief.
      .
      It is not in any way “Bayesian” to maintain a subjective degree of belief which is inconsistent with observations. Human, yes; Bayesian no.
      .
      The part about Bayesianism which seems to have caused so much trauma is that in the absence of observation, all degrees of belief are consistent with observation. When observations are few, the degrees of belief consistent with observations are many. So, why does someone just get to arbitrarily pick one?
      .
      Simply because it is a requirement of point estimation to estimate a point. If, instead, one estimates the interval rather than a point, then, for a given distribution, a given threshold of likelihood, and a given (possibly empty) set of observations, the range of probabilities consistent with observation is mathematically objective. Subjectivity only plays a role if one insists on collapsing the interval to a point. It is a fundamentally arbitrary response to the imposition of a fundamentally arbitrary constraint.
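      A minimal sketch of the interval estimation I’m describing (my own construction; the grid resolution and threshold are arbitrary choices):

      import numpy as np

      def likelihood_interval(successes, n, threshold=1/8):
          # all Bernoulli parameters p whose likelihood, given the observations,
          # is within a factor 'threshold' of the maximum likelihood
          grid = np.linspace(0.001, 0.999, 999)
          loglik = successes * np.log(grid) + (n - successes) * np.log(1 - grid)
          rel = np.exp(loglik - loglik.max())
          inside = grid[rel >= threshold]
          return inside.min(), inside.max()

      print(likelihood_interval(0, 0))       # no observations: essentially all of [0, 1]
      print(likelihood_interval(3, 10))      # few observations: a wide interval
      print(likelihood_interval(300, 1000))  # many observations: a narrow one

      Given the distribution, the threshold and the observations, the interval comes out the same for everyone; subjectivity enters only if one insists on a single point.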

        “It is not in any way ‘Bayesian’ to maintain a subjective degree of belief which is inconsistent with observations. Human, yes; Bayesian no.”

        Since nothing I said implied this, I will ignore it as an obvious red herring.

        “Uh, no. The central claim of Bayesianism is that ‘priors’ are formed from an initial real-valued representation of subjective degree of belief updated with observations. That’s the formula. More observations, more arithmetic, less subjective degree of belief.”

        Which of course contains the formulation of probability in terms of “subjective degree of belief” within it. Not to go all Wikipedia on you, but you can find something like this formulation of the Bayesian interpretation of probability anywhere:

        “Bayesian probability is an interpretation of the concept of probability, in which, instead of frequency or propensity of some phenomenon, assigned probabilities represent states of knowledge or belief.”

        Of course Bayesianism proposes rational updating of belief according to the famous formula, assignment of evidential probabilities to observation events, “washing out of priors”, convergence theorems, etc. etc. Who ever said otherwise? But the definition of probability itself for Bayesians is, and remains, subjective degree of belief. This applies not just to the “priors” but to the “posterior probabilities” as well. Of course, the “posteriors” are supposed to be more “rational” than the priors, because they’ve come into contact with the available evidence. But they’re still subjective degrees of belief. And this is true of subjectivist and objectivist versions of Bayesianism. It is central to the entire apparatus.
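        To make concrete what that “washing out” machinery claims (granting, purely for illustration, that a determinate true parameter exists), a minimal sketch:

        import numpy as np

        rng = np.random.default_rng(1)
        data = rng.binomial(1, 0.4, size=5000)  # hypothetical Bernoulli(0.4) stream

        for n in (5, 50, 5000):
            k = int(data[:n].sum())
            optimist = (8 + k) / (8 + 2 + n)    # posterior mean under a Beta(8, 2) prior
            pessimist = (2 + k) / (2 + 8 + n)   # posterior mean under a Beta(2, 8) prior
            print(n, round(optimist, 3), round(pessimist, 3))

        Two sharply different priors end up at nearly the same posterior mean, which is exactly the step that presupposes there is something determinate to converge on.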
        .
        Your response basically ignores the fundamental issues I raised. One of my very points is that what you call “more arithmetic” — the use of mathematical operations on something not known to be mathematical in structure — is precisely where the problem lies. Note moreover that this isn’t just a problem with the “priors”. It’s a problem with the entire “updating” formula, which is a mathematical function that can only legitimately be applied to terms whose real-world counterparts have quantitative structure. If they don’t, none of the axioms or laws of probability can be assumed to follow, which undermines the entire logic of the formula.
        .
        It is also a problem with the likelihood assignments and the treatment of “observations” or “evidence”. It simply assumes that the probabilities assigned to those “observations” represent an underlying, determinate “true” parameter or probability distribution that attaches to a broader “population” of interest. Hence the use of “convergence theorems” and “updating” and the conviction that the influence of (subjective) priors is being “washed out” as the sample size increases. But all of that means nothing if there is nothing for the posterior probabilities to converge on.
        .
        Such convergence requires much more than the obvious fact that any random countable collection of unrelated things can be assigned descriptive statistical properties. Statistical inference — frequentist or Bayesian — requires that the parameters, intervals or distributions in question be precise, objective, determinate attributes of the larger “population” of interest. (And that requires knowing what that larger “population” is as well as whether each observation can be uniquely and unambiguously assigned to the exact “population” you claim to be investigating. But of course no one knows this. Which is one reason why Bayesians are wrong when they pretend that they manage to avoid the crippling problem of the “reference class” when evaluating single-case probabilities. It’s all hidden in the allegedly straightforward treatment and interpretation of the likelihoods….)
        .
        One can always correlate social variables and come up with “results”. But statistical inference is supposed to be ampliative — to tell you more than just the features of this or that group of observations. So I suppose it’s fine to say “the range of probabilities consistent with observation is mathematically objective”. But so what? That’s not what social scientists want or need. And proposing intervals instead of point estimates may be more modest and indeed useful, given existing practices in the social sciences, but it still relies on all the same demanding assumptions — in order to apply mathematical formulae to the real world and get valid conclusions from that exercise.

      • //So I suppose it’s fine to say “the range of probabilities consistent with observation is mathematically objective”. But so what? That’s not what social scientists want or need.//
        .
        My personal application of statistics is in a purely commercial setting, and part of my job is to frequently remind business stakeholders that, as much as they may wish it were otherwise, statistics is not there to give them what they want or need.
        .
        If you’re lucky, on a good day, you have enough data of sufficient quality that statistical analysis may tell you a hypothesis is likely or unlikely. Most of the time, though, the value provided by statistics is the knowledge that you still don’t really know one way or the other.
        .
        And this is with squeaky-clean, fine-granularity data sets, collected under tightly controlled conditions with millisecond accuracy, with cardinality many orders of magnitude greater than the time series data fed into econometric models.
        .
        So, yeah, maybe “that’s not what social scientists want or need”. Too bad.
        .
        Either you have observations that form an “exchangeable sequence of random variables”, and you can therefore employ analytic techniques which rely upon that condition being satisfied for validity, or you do not and you cannot. You can’t “want” or “need” your way into it, and no amount of ontological/epistemological hair-splitting will change that, frequentist, Bayesian, or other (cf. “replication crisis in […]”).
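        As a rough sketch of what checking that condition can look like in practice (my own construction, not a standard named test):

        import numpy as np

        def lag1_autocorr(x):
            return np.corrcoef(x[:-1], x[1:])[0, 1]

        def exchangeability_pvalue(x, n_perm=2000, seed=0):
            # under exchangeability every reordering is equally likely, so an
            # order-sensitive statistic should look typical among permutations
            rng = np.random.default_rng(seed)
            observed = abs(lag1_autocorr(x))
            hits = sum(abs(lag1_autocorr(rng.permutation(x))) >= observed
                       for _ in range(n_perm))
            return (hits + 1) / (n_perm + 1)

        rng = np.random.default_rng(42)
        print(exchangeability_pvalue(rng.normal(size=500)))             # large: no evidence against it
        print(exchangeability_pvalue(np.cumsum(rng.normal(size=500))))  # tiny: exchangeability untenable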
        .
        If, however, you *are* able to satisfy that condition, Bayesian statistical approaches are no worse than, and often more useful than, frequentist approaches.
        .
        I have found in a great many cases, though, that neither approach is as useful as plotting out the likelihood function given the available data and simply looking at the graph.
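        For instance (a minimal sketch with made-up counts):

        import numpy as np
        import matplotlib.pyplot as plt

        successes, n = 7, 20
        grid = np.linspace(0.001, 0.999, 500)
        likelihood = grid**successes * (1 - grid)**(n - successes)

        plt.plot(grid, likelihood / likelihood.max())
        plt.xlabel("p")
        plt.ylabel("relative likelihood")
        plt.title(f"Likelihood of Bernoulli p given {successes}/{n} successes")
        plt.show()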
        .
        The question of what that gets you, though, which I believe is your question, becomes a matter of how effective you are at modeling the domain of interest, how inspired you are in formulating hypotheses, and how clever you are in designing your experiments to exclude confounding factors.
        .
        And if you aren’t any good at doing that, then it really doesn’t matter whether you’re a Bayesian or not, does it?

