Bayesian inference in a “large world”

11 Sep, 2013 at 07:33 | Posted in Theory of Science & Methodology | 1 Comment

The view that Bayesian decision theory is only genuinely valid in a small world was asserted very firmly by Leonard Savage when laying down the principles of the theory in his path-breaking Foundations of Statistics. He makes the distinction between small and large worlds in a folksy way by quoting the proverbs “Look before you leap” and “Cross that bridge when you come to it”. You are in a small world if it is feasible always to look before you leap. You are in a large world if there are some bridges that you cannot cross before you come to them.


As Savage comments, when proverbs conflict, it is proverbially true that there is some truth in both: they apply in different contexts. He then argues that some decision situations are best modeled in terms of a small world, but others are not. He explicitly rejects the idea that all worlds can be treated as small as both “ridiculous” and “preposterous” … Frank Knight draws a similar distinction between making decisions under risk and under uncertainty …

Bayesianism is understood [here] to be the philosophical principle that Bayesian methods are always appropriate in all decision problems, regardless of whether the relevant set of states in the relevant world is large or small. For example, the world in which financial economics is set is obviously large in Savage’s sense, but the suggestion that there might be something questionable about the standard use of Bayesian updating in financial models is commonly greeted with incredulity or laughter.

Someone who acts as if Bayesianism were correct will be said to be a Bayesianite. It is important to distinguish a Bayesian like myself—someone convinced by Savage’s arguments that Bayesian decision theory makes sense in small worlds—from a Bayesianite. In particular, a Bayesian need not join the more extreme Bayesianites in proceeding as though:

• All worlds are small.
• Rationality endows agents with prior probabilities.
• Rational learning consists simply in using Bayes’ rule to convert a set of prior probabilities into posterior probabilities after registering some new data.

Bayesianites are often understandably reluctant to make an explicit commitment to these principles when they are stated so baldly, because it then becomes evident that they are implicitly claiming that David Hume was wrong to argue that the principle of scientific induction cannot be justified by rational argument …

Bayesianites believe that the subjective probabilities of Bayesian decision theory can be reinterpreted as logical probabilities without any hassle. They therefore hold that Bayes’ rule is the solution to the problem of scientific induction. No support for such a view is to be found in Savage’s theory, nor in the earlier theories of Ramsey, de Finetti, or von Neumann and Morgenstern. Savage’s theory is entirely and exclusively a consistency theory. It says nothing about how decision-makers come to have the beliefs ascribed to them; it asserts only that, if the decisions taken are consistent (in a sense made precise by a list of axioms), then they act as though maximizing expected utility relative to a subjective probability distribution …
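
[The "as though maximizing expected utility" claim can be written out in symbols. What follows is only a sketch under a simplifying assumption of finitely many states; the symbols S (set of states), p (subjective probability), u (utility), and f, g (acts mapping states to consequences) are notation introduced here, not taken from the quoted text. Consistent preferences over acts are then represented as:

```latex
f \succsim g
\quad\Longleftrightarrow\quad
\sum_{s \in S} p(s)\, u\bigl(f(s)\bigr) \;\ge\; \sum_{s \in S} p(s)\, u\bigl(g(s)\bigr)
```

Note that p appears only as part of the representation of the preferences: nothing in the formula says where p came from, which is the point being made about a pure consistency theory.]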

A reasonable decision-maker will presumably wish to avoid inconsistencies. A Bayesianite therefore assumes that it is enough to assign prior beliefs to a decision-maker, and then forget the problem of where beliefs come from. Consistency then forces any new data that may appear to be incorporated into the system via Bayesian updating. That is, a posterior distribution is obtained from the prior distribution using Bayes’ rule.
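
[For readers who want to see the updating step concretely, here is a minimal sketch in Python. The coin example, the state names and the function name bayes_update are all hypothetical, chosen only for illustration; the prior of one half on each state is simply assumed, which is exactly the step the passage says gets shelved.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Return the posterior over states after observing one datum.

    prior:      dict mapping state -> P(state)
    likelihood: dict mapping state -> P(datum | state)
    """
    # Unnormalised posterior: prior weight times likelihood of the datum.
    joint = {s: prior[s] * likelihood[s] for s in prior}
    evidence = sum(joint.values())  # P(datum), the normalising constant
    if evidence == 0:
        raise ValueError("datum has probability zero under the prior")
    return {s: joint[s] / evidence for s in joint}


# Hypothetical small world: a coin is either fair or biased towards heads.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
# Probability of observing heads in each state (assumed values).
heads_likelihood = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

posterior = bayes_update(prior, heads_likelihood)
print(posterior)  # fair: 2/5, biased: 3/5
```

The arithmetic is mechanical once the prior and the likelihoods are given; the sketch has nothing to say about how either was arrived at.]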

The naiveté of this approach doesn’t consist in using Bayes’ rule, whose validity as a piece of algebra isn’t in question. It lies in supposing that the problem of where the priors came from can be quietly shelved.

Savage did argue that his descriptive theory of rational decision-making could be of practical assistance in helping decision-makers form their beliefs, but he didn’t argue that the decision-maker’s problem was simply that of selecting a prior from a limited stock of standard distributions with little or nothing in the way of soul-searching. His position was rather that one comes to a decision problem with a whole set of subjective beliefs derived from one’s previous experience that may or may not be consistent …

But why should we wish to adjust our gut-feelings using Savage’s methodology? In particular, why should a rational decision-maker wish to be consistent? After all, scientists aren’t consistent, on the grounds that it isn’t clever to be consistently wrong. When surprised by data that shows current theories to be in error, they seek new theories that are inconsistent with the old theories. Consistency, from this point of view, is only a virtue if the possibility of being surprised can somehow be eliminated. This is the reason for distinguishing between large and small worlds. Only in the latter is consistency an unqualified virtue.
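
[One way to picture the role of surprise is a toy model in which the agent's list of states leaves something out. The sketch below is purely illustrative: the urn, the state names and the function update are hypothetical. The agent models an urn as holding only red and blue balls, and Bayes' rule works as long as the data stay inside that model. A green ball has probability zero in every modelled state, so the rule returns nothing usable and the model itself has to be revised.

```python
from fractions import Fraction

# Hypothetical urn model: P(colour drawn | state). Green is not modelled.
LIKELIHOOD = {
    "mostly_red": {"red": Fraction(3, 4), "blue": Fraction(1, 4)},
    "mostly_blue": {"red": Fraction(1, 4), "blue": Fraction(3, 4)},
}


def update(prior, datum):
    """Apply Bayes' rule over the modelled states.

    Returns None when no modelled state gives the datum positive
    probability, i.e. when the agent is surprised.
    """
    joint = {
        s: p * LIKELIHOOD[s].get(datum, Fraction(0)) for s, p in prior.items()
    }
    evidence = sum(joint.values())
    if evidence == 0:
        return None  # Bayes' rule is undefined: the small world has failed
    return {s: j / evidence for s, j in joint.items()}


prior = {"mostly_red": Fraction(1, 2), "mostly_blue": Fraction(1, 2)}

after_red = update(prior, "red")      # anticipated datum: ordinary updating
after_green = update(prior, "green")  # unanticipated datum: a surprise

print(after_red)    # mostly_red: 3/4, mostly_blue: 1/4
print(after_green)  # None
```

Staying consistent with the original prior is no help in the second case; the only way forward is a new set of states that is inconsistent with the old one.]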

Ken Binmore

1 Comment

  1. As a mathematician I find Ken’s approach technically correct, but in the sections that you quote he has not engaged with a common pragmatic version of Bayesianism. This is to behave like a Bayesianite in the short-term but to be prepared to recognize problems and to make new judgments, adaptively. Many seem to suppose that this is the best one can do, and it sometimes is, but when?

    I have observed that many experienced and seemingly effective decision-makers reason as if even in short-term activity one needs to take account of the non-probabilistic uncertainties, which makes me think that in many areas this kind of uncertainty matters, and that maybe resilience is more important than narrow consistency.

    In science, for example, one can distinguish between Kuhnian refinement of ‘the established paradigm’ and what I think of as ‘science proper’. Kuhn envisages scientists as believing in some dominant theory, or at least acting as if it were true, pragmatically. Occasionally they find that they cannot refine the theory, and there is a period of uncertainty and – hopefully – a revolution establishing a new orthodoxy. The alternative is for a scientist to be aware of the key experiments, to consider alternative explanations and to devise and perform experiments to challenge the dominant theory. This is clearly not pragmatic in the same way, nor is the behaviour necessarily consistent: as soon as a superior theory has been established, the scientist will seek to refute it. Madness! Perhaps the problem is not just in the ‘Bayesian’ part, but in a simplistic notion of the products of inference.

    In a small world, one can use an extrapolation as a prediction. In a large world, this may be dangerous.

