Tuesday, November 16, 2010

Picking the bigger number of two even if one is unknown

Here is a nice problem from the xkcd blog: Two real numbers, A and B, are drawn using some unknown, possibly probabilistic process and written on papers that go into two envelopes. You randomly pick one and open it to find some number on it. You now have to decide whether you want to receive that number as an amount in dollars or rather the number that is in the other envelope (which is still sealed). Can you come up with a process that with probability >50% picks the larger amount?

Think about it.


You can. You will need some function f that maps the real numbers to the open interval (0,1) in a strictly monotonic way. You could for example take f(x) = (1+tanh(x))/2. Assume the number that you found in the first envelope was X. Then throw an unfair coin such that with probability f(X) you keep X and otherwise take the other envelope. Obviously (?), if you started with the envelope with the smaller number you are more likely to switch than if you had started with the larger number.

This sounds a bit counter intuitive. How can you increase your expected payoff if you know nothing about the number in the second envelope?

You might have the idea that something fishy is going on. What comes to mind is for example that you can produce paradoxes when assuming that there is a uniform probability distribution on the reals (or integers). But I believe, that this is not what is going on here since I did not say how the numbers were picked. They could have been picked with any perfectly fine probability measure on the reals, nobody said all numbers were equally likely. Below I will compute the expected outcome for any probability distribution that might have been used and it always works, not just in average.

More precisely, I think this is unrelated to a similar puzzle: In that second puzzle, there are also two envelopes that contain numbers representing payout but none is opened but instead it is known that the number in one envelope is twice the number in the other. You just don't know if you have the half oder double. There, assuming your envelope contains X then you could be tempted to argue that with probability 50% the other contains 2X and with 50% it contains X/2 and thus the expectation value is 50% x 2 X + 50% x X/2= 5/4 X and thus you could increase your expectation by 25% by switching. But then you could increase it by another 25% by using the same argument again and switching back.

In the second puzzle, it is really the implied uniform distribution of X's that is the origin of the paradox: You can see this by giving the additional information that both numbers are definitely smaller than 100 trillion dollars. That sounds like a trivial information but note that the calculation of the expectation value changes: If X is greater than 50 trillion dollars, you know with certainty that the other number cannot be 2X and thus the expectation of taking the other envelope is nor 125%X but X/2. If you now carefully go through the expectation value calculation you will find that averaged over all values of X the expectation for switching is the same as for keeping the first envelope.

Some of my readers will notice that the second puzzle is related to recent arguments that were made in the Landscape scenario about the imminent end of the world.

Back to the first game. Let's do some calculation to compute the expectation of the outcome. We will assume that the numbers were picked according to some probability measure rho(x)dx and that has a finite expectation value, i.e. the integral E=E(X)=int x rho(x)dx converges.

Then the expected outcome of the strategy above is X with probability f(X) and E with probability (1-f(X)) (as in that case we take the number in the second envelope).

We can now compute the expectation E(f(X) X + (1-f(X))E)= E(f(X)X) + E - E E(f(X)). For simplicity assume that E=0. Otherwise we could pay out E immediately and then subtract E from all number in the envelopes. Thus the expected payout of our strategy is E(f(X)X) but it is easy to see that this is positive (and thus we make more than the average E=0): In computing

E(f(X)X) = int f(x) x rho(x) dx

we can for x<0 overestimate f(x) by f(0) and for x>0 underestimate f(x) by f(0) and then conclude (unless rho(x) = delta(x) and we always spit out 0s)

E(f(X)X) > f(0)E(X) = 0

Thus we do better on the average than by deciding for one envelope or the other not taking into account the contents of the first one.

Not that the difference to the first puzzle is that this works for any rho(x), we did not have to assume some (non existent) uniform rho(x) and the effect does not go away as soon as a cut-off is introduced contrary to the other puzzle.