Solving the St. Petersburg Paradox

22 Aug, 2017 at 18:57 | Posted in Economics | 4 Comments


Solving the St Petersburg paradox in the way Peters suggests involves arguments about ergodicity and the all-important difference between time averages and ensemble averages. These are difficult concepts that many students of economics have trouble understanding. So let me just try to explain the meaning of these concepts by means of a couple of simple examples.

Let’s say you’re offered a gamble where, on a roll of a fair die, you will get €10 billion if you roll a six, and pay me €1 billion if you roll any other number.

Would you accept the gamble?

If you’re an economics student you probably would, because that’s what you’re taught to be the only thing consistent with being rational. You would arrest the arrow of time by imagining six different ‘parallel universes’ where the independent outcomes are the numbers from one to six, and then weight them using their probability distribution. Calculating the expected value of the gamble (the ensemble average) by averaging over all these weighted outcomes, you would actually be a moron if you didn’t take the gamble (the expected value of the gamble being 5/6*(-€1 billion) + 1/6*€10 billion ≈ €0.83 billion).

If you’re not an economist you would probably trust your common sense and decline the offer, knowing that a large risk of bankruptcy is not a very rosy prospect for the future. Since you can’t really arrest or reverse the arrow of time, you know that once you have lost the €1 billion, it’s all over. The large likelihood that you go bust weighs heavier than the 17% chance of becoming enormously rich. By computing the time average (imagining one real universe where the six different but dependent outcomes occur consecutively) we would soon be aware of our assets disappearing, and a fortiori that it would be irrational to accept the gamble.
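As a rough sketch of the two ways of looking at the die gamble, the snippet below computes the per-roll expected value from the payoffs as stated in the offer, and the chance of being wiped out before a six ever turns up. The parameter `affordable_losses` (how many €1 billion losses the player can absorb) is a hypothetical illustration, not something specified in the gamble itself.

```python
from fractions import Fraction

P_SIX = Fraction(1, 6)   # probability of rolling a six with a fair die
WIN = Fraction(10)       # gain in billions of euros on a six
LOSS = Fraction(-1)      # payment in billions of euros on any other number


def ensemble_average():
    """Expected value per roll, averaged over all six 'parallel' outcomes."""
    return P_SIX * WIN + (1 - P_SIX) * LOSS


def prob_ruined_before_a_six(affordable_losses):
    """Chance that the first `affordable_losses` rolls are all non-sixes.

    `affordable_losses` is an assumed budget: the number of losses the
    player can pay before going bust.
    """
    return (1 - P_SIX) ** affordable_losses


print(float(ensemble_average()))            # about 0.83 (billion euros)
print(float(prob_ruined_before_a_six(1)))   # about 0.83
print(float(prob_ruined_before_a_six(3)))   # about 0.58
```

With room for only one loss, the player is ruined five times out of six, even though the ensemble average is positive.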

From a mathematical point of view, you can (somewhat non-rigorously) describe the difference between ensemble averages and time averages as a difference between arithmetic averages and geometric averages. Take a fair coin toss where you gain 20% on the stake (S) if you win (heads) and have to pay 20% on the stake (S) if you lose (tails). Assuming the outcomes of the coin toss are independent, the arithmetic average of the return on the stake would be [(0.5*1.2S + 0.5*0.8S) - S]/S = 0%. If the two outcomes are instead considered not to be independent, occurring one after the other on the same stake, the relevant time average would be a geometric average return of sqrt(1.2S * 0.8S)/S - 1 ≈ -2%.
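The coin-toss arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the function names are my own, and the stake S cancels out, so the code works directly with the growth factors 1.2 and 0.8.

```python
import math


def arithmetic_mean_return(factors):
    """Average return across outcomes treated as parallel alternatives."""
    return sum(factors) / len(factors) - 1


def geometric_mean_return(factors):
    """Per-round return when the outcomes compound one after the other."""
    return math.prod(factors) ** (1 / len(factors)) - 1


factors = [1.2, 0.8]   # +20% on heads, -20% on tails

print(arithmetic_mean_return(factors))  # 0 up to floating-point rounding
print(geometric_mean_return(factors))   # about -0.02
```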

Why is the difference between ensemble and time averages of such importance in economics? Well, basically, because when assuming the processes to be ergodic, ensemble and time averages are identical.

Assume we have a market with an asset priced at €100. Then imagine the price first goes up by 50% and then later falls by 50%. The ensemble average for this asset would be €100, because here we envision two parallel universes (markets): in one the asset price falls by 50% to €50, and in the other it goes up by 50% to €150, giving an average of €100 ((150+50)/2). The time average for this asset would be €75, because here we envision one universe (market) where the asset price first rises by 50% to €150 and then falls by 50% to €75 (0.5*150).

From the ensemble perspective nothing really, on average, happens. From the time perspective lots of things really, on average, happen. Assuming ergodicity there would have been no difference at all.
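The asset-price example can be reproduced in a short sketch (the helper names are hypothetical). It also shows that the order of the rise and the fall makes no difference along a single path: either way the price ends at €75.

```python
START_PRICE = 100.0
UP, DOWN = 1.5, 0.5   # +50% and -50% as growth factors


def ensemble_average(price, factors):
    """Average end price over parallel markets, one factor applied in each."""
    return sum(price * f for f in factors) / len(factors)


def time_path(price, factors):
    """End price in a single market where the factors apply in sequence."""
    for f in factors:
        price *= f
    return price


print(ensemble_average(START_PRICE, [UP, DOWN]))  # 100.0
print(time_path(START_PRICE, [UP, DOWN]))         # 75.0
print(time_path(START_PRICE, [DOWN, UP]))         # 75.0
```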

On a more economic-theoretical level, the difference between ensemble and time averages also highlights the problems concerning the neoclassical theory of expected utility that I have raised before (e.g. here).

When applied to the neoclassical theory of expected utility, one thinks in terms of ‘parallel universes’ and asks what the expected return of an investment is, calculated as an average over those ‘parallel universes’. In our coin-tossing example, it is as if one supposes that various ‘I’s are tossing a coin and that the losses of many of them will be offset by the huge profits one of these ‘I’s makes. But this ensemble average does not work for an individual, for whom a time average better reflects the experience had in the ‘non-parallel universe’ in which we live.

Time averages give a more realistic answer, where one thinks in terms of the only universe we actually live in and asks what the expected return of an investment is, calculated as an average over time.

Since we cannot go back in time – entropy and the arrow of time make this impossible – and the bankruptcy option is always at hand (extreme events and ‘black swans’ are always possible) we have nothing to gain from thinking in terms of ensembles.

Actual events follow a fixed pattern of time, where events are often linked to a multiplicative process (e.g. investment returns with ‘compound interest’), which is basically non-ergodic.
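To illustrate what non-ergodicity means for a multiplicative process, here is a small sketch that reuses the ±20% coin toss from earlier and plays it repeatedly on the same, compounding stake. It uses exact binomial weights rather than simulation; the choice of 100 rounds and all function names are assumptions made for illustration.

```python
import math

UP, DOWN = 1.2, 0.8   # +20% on heads, -20% on tails


def final_wealth(heads, rounds):
    """Wealth factor after `rounds` tosses of which `heads` were wins."""
    return UP ** heads * DOWN ** (rounds - heads)


def ensemble_mean(rounds):
    """Average final wealth over all possible paths, weighted by probability."""
    return sum(
        math.comb(rounds, k) / 2 ** rounds * final_wealth(k, rounds)
        for k in range(rounds + 1)
    )


def typical_wealth(rounds):
    """Wealth on a path with equal numbers of heads and tails (even `rounds`)."""
    return final_wealth(rounds // 2, rounds)


def share_of_paths_below_start(rounds):
    """Probability of ending with less than the starting wealth."""
    return sum(
        math.comb(rounds, k) / 2 ** rounds
        for k in range(rounds + 1)
        if final_wealth(k, rounds) < 1.0
    )


ROUNDS = 100
print(ensemble_mean(ROUNDS))               # close to 1.0
print(typical_wealth(ROUNDS))              # roughly 0.13
print(share_of_paths_below_start(ROUNDS))  # roughly 0.86
```

The ensemble mean stays at the starting wealth, yet the large majority of individual paths end well below it. The average is held up by a few very lucky paths.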

Instead of arbitrarily assuming that people have a certain type of utility function, as in the neoclassical theory, time average considerations show that we can obtain a less arbitrary and more accurate picture of real people’s decisions and actions by basically assuming that time is irreversible. When our assets are gone, they are gone. The fact that in a parallel universe they could conceivably have been refilled is of little comfort to those who live in the one and only possible world that we call the real world.

So — solving the St Petersburg paradox may at first seem to be a highly esoteric kind of thing. As Peters shows — it’s not!

4 Comments

  1. The market player knows he can bet $1 billion indefinitely as long as he can bluster his way into getting his IOUs accepted by someone who will expand their balance sheet. A bank can go to the Fed for an overnight loan, and keep rolling it until the six comes up. The trader just has to keep getting you to go “double or nothing” …

  2. Sorry for my spelling errors, I meant RISK.

  3. How does prospect theory fit in? Is not prospect theory more or less a Bernoulli solution based on diminishing marginal returns?

  4. It is still remarkable that people who cannot solve the mathematics of the situation are fully capable of saying no to participating in such a game. Keynes was probably right when referring to a perception of risc, what happens if you lose. That is less complicated than solving the 250-year-old paradox.

    So, how come people are observant of risc? Is it a learned skill? Is it part of our biology, or a combination? Trial and error and instructions from the surrounding social group? It’s hard to see it as an individual utility function based on individual needs; it seems more a talent derived from family upbringing and perhaps a bit of arithmetic learned in school.

