Solving the St Petersburg paradox

18 June, 2014 at 17:52 | Posted in Economics | 7 Comments


Solving the St Petersburg paradox in the way Peters suggests involves arguments about ergodicity and the all-important difference between time averages and ensemble averages. These are difficult concepts that many students of economics have problems understanding. So let me try to explain the meaning of these concepts by means of a couple of simple examples.

Let’s say you’re offered a gamble where, on a roll of a fair die, you will get €10 billion if you roll a six, and pay me €1 billion if you roll any other number.

Would you accept the gamble?

If you’re an economics student you probably would, because that’s what you’re taught to be the only thing consistent with being rational. You would arrest the arrow of time by imagining six different “parallel universes” where the independent outcomes are the numbers from one to six, and then weight them using their probability distribution. Calculating the expected value of the gamble (the ensemble average) by averaging over all these weighted outcomes, you would actually be a moron if you didn’t take the gamble (the expected value of the gamble being 5/6*(−€1 billion) + 1/6*€10 billion ≈ €0.83 billion).
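As a quick check of the ensemble-average arithmetic, here is a minimal Python sketch. It assumes the €1 billion loss enters the sum with a negative sign; the name expected_value is purely illustrative and not taken from Peters.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Ensemble average: sum of probability * payoff over all outcomes."""
    return sum(p * x for p, x in outcomes)

# Payoffs in billions of euros (assumption: the loss is counted as -1).
die_gamble = [
    (Fraction(1, 6), 10),   # roll a six: win 10
    (Fraction(5, 6), -1),   # any other number: lose 1
]

ev = expected_value(die_gamble)
print(ev, float(ev))
```

With the loss counted as −€1 billion, the expected value comes out at 5/6 of a billion, roughly €0.83 billion. It is still positive, so the ensemble-average reasoning still recommends taking the gamble.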

If you’re not an economist you would probably trust your common sense and decline the offer, knowing that a large risk of bankruptcy is not a very rosy prospect for the future. Since you can’t really arrest or reverse the arrow of time, you know that once you have lost the €1 billion, it’s all over. The large likelihood that you go bust weighs heavier than the 17% chance of you becoming enormously rich. By computing the time average (imagining one real universe where the six different but dependent outcomes occur consecutively) we would soon be aware of our assets disappearing, and a fortiori that it would be irrational to accept the gamble.

From a mathematical point of view you can (somewhat non-rigorously) describe the difference between ensemble averages and time averages as a difference between arithmetic averages and geometric averages. Tossing a fair coin and gaining 20% on the stake (S) if winning (heads) and having to pay 20% on the stake (S) if losing (tails), the arithmetic average of the return on the stake, assuming the outcomes of the coin toss to be independent, would be [(0.5*1.2S + 0.5*0.8S) – S]/S = 0%. If the two outcomes of the toss are considered not to be independent, the relevant time average would be a geometric average return of sqrt(1.2S*0.8S)/S – 1 ≈ –2%.
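The two averages in the coin-toss example can be reproduced in a few lines of Python. This is only a sketch of the arithmetic; the variable names are my own, and the stake S is normalised to 1 since it cancels out.

```python
import math

up, down = 1.2, 0.8  # growth factors of the stake for heads and tails

# Arithmetic (ensemble) average return per toss
arithmetic_return = 0.5 * up + 0.5 * down - 1

# Geometric (time) average return per toss, one heads followed by one tails
geometric_return = math.sqrt(up * down) - 1

print(f"arithmetic average return: {arithmetic_return:.2%}")
print(f"geometric average return:  {geometric_return:.2%}")
```

The geometric figure is sqrt(0.96) – 1, which is slightly more negative than –2%.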

Why is the difference between ensemble and time averages of such importance in economics? Well, basically, because when assuming the processes to be ergodic, ensemble and time averages are identical.

Assume we have a market with an asset priced at €100. Then imagine the price first goes up by 50% and then later falls by 50%. The ensemble average for this asset would be €100, because we here envision two parallel universes (markets) where the asset price falls by 50% to €50 in one universe (market) and goes up by 50% to €150 in another, giving an average of €100 ((150+50)/2). The time average for this asset would be €75, because we here envision one universe (market) where the asset price first rises by 50% to €150, and then falls by 50% to €75 (0.5*150).
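The asset example can be written out as a short Python sketch, using the numbers given above. The variable names are illustrative only.

```python
start_price = 100.0
up, down = 1.5, 0.5  # +50% and -50% as growth factors

# Ensemble view: two parallel markets, one moves up and the other moves down
ensemble_average = (start_price * up + start_price * down) / 2

# Time view: a single market where the price first rises and then falls
price_over_time = start_price * up * down

print(ensemble_average, price_over_time)
```

The order of the two moves does not matter in the time view: falling first to €50 and then rising by 50% also ends at €75.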

From the ensemble perspective nothing really, on average, happens. From the time perspective lots of things really, on average, happen. Assuming ergodicity there would have been no difference at all.

On a more economic-theoretical level the difference between ensemble and time averages also highlights the problems concerning the neoclassical theory of expected utility that I have raised before.

When applied to the neoclassical theory of expected utility, one thinks in terms of “parallel universes” and asks what the expected return of an investment is, calculated as an average over these “parallel universes”. In our coin-tossing example, it is as if one supposes that various “I”s are tossing a coin and that the losses of many of them will be offset by the huge profits that one of these “I”s makes. But this ensemble average does not work for an individual, for whom a time average better reflects the experience made in the “non-parallel universe” in which we live.

Time averages give a more realistic answer, where one thinks in terms of the only universe we actually live in, and asks what the expected return of an investment is, calculated as an average over time.

Since we cannot go back in time – entropy and the arrow of time make this impossible – and the bankruptcy option is always at hand (extreme events and “black swans” are always possible) we have nothing to gain from thinking in terms of ensembles.

Actual events follow a fixed pattern of time, where events are often linked in a multiplicative process (as e. g. investment returns with “compound interest”) which is basically non-ergodic.
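To illustrate how a multiplicative process can drive the ensemble average and the typical individual outcome apart, here is a small simulation sketch. It reuses the +20%/–20% coin toss from above, compounds it over many rounds, and compares the mean over many simulated paths with the median path. The number of rounds, the number of paths, the seed and the function name are all my own assumptions, chosen only for illustration.

```python
import random
import statistics

def simulate_final_wealth(rounds, rng, up=1.2, down=0.8):
    """Compound a fair coin toss over `rounds` tosses, starting from wealth 1."""
    wealth = 1.0
    for _ in range(rounds):
        wealth *= up if rng.random() < 0.5 else down
    return wealth

rng = random.Random(42)  # fixed seed so the run is repeatable
finals = [simulate_final_wealth(100, rng) for _ in range(10_000)]

mean_wealth = statistics.mean(finals)      # average across the ensemble of paths
median_wealth = statistics.median(finals)  # what a typical single path ends with
share_below_start = sum(w < 1.0 for w in finals) / len(finals)

print(f"mean final wealth:   {mean_wealth:.3f}")
print(f"median final wealth: {median_wealth:.3f}")
print(f"share of paths ending below the starting wealth: {share_below_start:.1%}")
```

In expectation the ensemble mean stays at the starting wealth, since each toss has an arithmetic average return of 0%. The typical path shrinks, because wealth after 50 heads and 50 tails is 0.96 to the power of 50, about 0.13 of the starting wealth. A few very lucky paths hold up the mean.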


Instead of arbitrarily assuming that people have a certain type of utility function, as in the neoclassical theory, time average considerations show that we can obtain a less arbitrary and more accurate picture of real people’s decisions and actions by basically assuming that time is irreversible. When our assets are gone, they are gone. The fact that in a parallel universe they could conceivably have been refilled is of little comfort to those who live in the one and only possible world that we call the real world.

So — solving the St Petersburg paradox may at first seem to be a highly esoteric kind of thing. As Peters shows — it’s not.



  1. Reblogged this on Digital Real Edge and commented:
    The St Petersburg paradox revisited!

  2. This video with his reflections towards the end confirms my point in the other post.

  3. “From a mathematical point of view you can (somewhat non-rigorously) describe the difference between ensemble averages and time averages as a difference between arithmetic averages and geometric averages. ”

    If by non-rigorous you mean made-up, then yes. Mathematical definitions of ergodicity and ergodic theorems do not make any such distinction and both time and ensemble averages are defined as “arithmetic” averages.

    Peters seems to have invented his own definition of ergodicity, which makes little sense. For an example, define x_t = exp(e_t), where e_t is Gaussian white noise, and compute arithmetic ensemble and geometric time averages of the process {x_t}. You’ll find they differ. So apparently a sequence of iid random variables is a nonergodic process!

  4. “If you’re an economics students you probably would”

    This is total nonsense as there is no paradox once you take risk measures into account. The game you describe implies a lot of risk. Conventional theory realistically assumes that risk aversion decreases in wealth so in a standard mainstream model only billionaires or near-billionaires would play the game.

  5. I actually agree with Ole’s findings, but in trying to be very general his arguments come across as quite tame and one might well wonder if he hasn’t perhaps misunderstood the theory that he is criticising, or if there might not be a simple way to fix it.

    Turing showed that a wide class of dynamic systems have ‘critical instabilities’ and hence multiple pseudo-equilibria. Ensemble averages are uninformative about the long run.

    For example, if you win a small amount your short-run consumption might increase, but the long-run impact would be negligible. But if you win a large amount your whole way of living might change, including your attitude to risk, sense of values and personal relationships. Turing, in effect, argues that it is impossible to make a sensible decision without thinking through these issues, which is – in effect – what Ole is recommending. Turing also notes that critical instabilities are driven by increases in ‘relative ordering’, of which leverage is an obvious example. Thus it makes no sense to assess values and risk and then to decide on leverage: leverage and one’s leveraging strategies create critical instabilities. This is a quite separate notion from that of ‘financial risk’. It is something to do with ‘Knightian uncertainty’, except that one can be sure that excessive leverage will destabilize.

  6. Question:

    As an amateur, my understanding of ergodicity is that probability distributions from the past can be used in place of probability distributions in the future. Or that probability distributions are stable through time.
    I’m not clear how the toss of a fair coin could be non-ergodic as in:

    “because when assuming the processes to be ergodic, ensemble and time averages are identical.”

    My understanding of why the ergodic axiom is a problem is that it was applied to social phenomena (economics) where it is not valid.

  7. This may be a silly question but why do we take 5/6*€0 and not 5/6*€-1 billion?
