Thursday, March 24, 2011

Mixed superrationality does not beat pure in prisoner's dilemma

The prisoner's dilemma is probably one of the most famous toy games of game theory. Two criminals who have been caught by the police are interrogated individually and offered the following deal: If both remain silent ("cooperate" with each other), both go to prison for $S$ ('short') years for small crimes that the police can prove. But if one prisoner admits the big crime ("defects") while the other stays silent, he goes free and the other spends $L$ ('long') years in prison. And if both admit the crime, they both face an $M$ ('middle') year sentence. To be a dilemma the sentences should obey $0<S<M<L$ and by picking an appropriate normalisation of the unit of time, we can set $S=1$.

The standard (economist) analysis of the game goes as follows: I assume that the other prisoner has already made his decision. Then, no matter what he decided I am better off by defecting: If he cooperates, my choice is between going free and $S$ years while if he is defecting I can choose between $M$ and $L$. So I defect and he comes to the same conclusion, so we end up spending $M$ years in prison. Both defecting is in fact a Nash equilibrium.
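The dominance argument can be checked mechanically. The following is only a minimal sketch: the sentence lengths $S=1$, $M=2$, $L=3$ are illustrative values that satisfy $0<S<M<L$ (they are not from any source), and the function names are made up for this example.

```python
from itertools import product


def sentences(S=1.0, M=2.0, L=3.0):
    """Years in prison (mine, other's) for each pair of moves.

    'C' = cooperate (stay silent), 'D' = defect (confess).
    The default numbers are illustrative assumptions with 0 < S < M < L.
    """
    return {
        ("C", "C"): (S, S),
        ("C", "D"): (L, 0.0),
        ("D", "C"): (0.0, L),
        ("D", "D"): (M, M),
    }


def is_nash(profile, table):
    """True if neither player can shorten his own sentence by deviating alone."""
    a, b = profile
    mine, theirs = table[(a, b)]
    best_mine = min(table[(x, b)][0] for x in "CD")
    best_theirs = min(table[(a, y)][1] for y in "CD")
    return mine == best_mine and theirs == best_theirs


def nash_equilibria(S=1.0, M=2.0, L=3.0):
    """List all pure-strategy Nash equilibria of the one-shot game."""
    table = sentences(S, M, L)
    return [p for p in product("CD", repeat=2) if is_nash(p, table)]


if __name__ == "__main__":
    print(nash_equilibria())
```

For any sentences with $0<S<M<L$ the only profile that survives this check is mutual defection, even though mutual cooperation gives both players a shorter sentence.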

That's not too exciting, as we could do better by both cooperating and serving only $S$ years, which is Pareto optimal but unstable because there is the temptation for each player to defect and then go free. So much for the classic analysis of this game (not iterated) which is a model for many decision problems where one has to decide between a personal advantage or the global optimum.

I first learned about this game many, many years ago, when I was still in high school, from a Douglas Hofstadter column in Scientific American. He makes the following observation: When defecting, I am counting on the fact that the other prisoner is not as clever as me. It only pays if the situation is asymmetric. But since the other prisoner is faced with the same problem, he will come up with the same solution so the asymmetric case of one player cooperating and the other defecting will not occur. Thus the only real possibilities are both cooperating (yielding $S$ years) and both defecting (yielding $M$ years) of which the obvious better choice is to cooperate. Hofstadter calls this argument "superrational". It is the realization that in the analysis of the Nash equilibrium the idea that my decision is independent of the other prisoner's decision might be wrong.

Then Hofstadter points out another version of this game: You receive a letter from a very rich person stating that she is studying human intelligence and she figured that you are one of the top ten intelligent people in the world. She offers you (and also the other nine top-brainers) the following game: On the bottom of the letter is a coupon. You can either ignore the letter (in which case nothing more will happen) or you write your name on the coupon and send it back. If out of the ten possible coupons she receives exactly one she gives the person who returned the coupon 100 Million dollars. If any other number of coupons arrive until the end of this year nobody will receive any money. And as a warning: You are watched over by a number of private investigators. If they notice you trying to find out who the other nine people are the whole thing is called off and again nobody will get any money. So don't even think about it.

This does not look very promising: Obviously, if you don't send in the coupon you won't get any money. So you have to send the coupon but so will the other nine and again you will receive nil. Too bad.

Well, unless you widen your strategy space and besides 'pure', deterministic strategies you also allow for 'mixed', i.e. probabilistic strategies. You could for example come up with the following strategy: You roll dice and then send the coupon only with probability $p$. Let's see which $p$ optimizes your expectation assuming the other nine players follow the same strategy: You only get the money if you send the letter (probability $p$) and all nine others don't (probability $(1-p)^9$) so the expectation is $E=p(1-p)^9$. Setting to zero the $p$ derivative of $E$ gives $0=(1-p)^9-9p(1-p)^8=(1-p)^8(1-p-9p)$ thus $p=1/10$. So you could prepare ten envelopes but only one with the coupon and mail a random one of these to optimize your expectation.
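The value $p=1/10$ is easy to confirm numerically. This is a small sketch that scans a grid of probabilities; the function names and the grid resolution are arbitrary choices for illustration.

```python
def expected_win_probability(p, players=10):
    """Chance that I send the coupon and none of the other players does."""
    return p * (1.0 - p) ** (players - 1)


def best_probability(players=10, steps=10_000):
    """Grid search over p in [0, 1] for the symmetric strategy with the best odds."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: expected_win_probability(p, players))


if __name__ == "__main__":
    p_star = best_probability()
    print(p_star, expected_win_probability(p_star))
```

Even at the optimum the chance of winning is only $0.1\cdot 0.9^9\approx 3.9\%$ per player, so the ten together collect the prize in about 39% of the cases.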

But with this idea of taking into account also mixed strategies we can go back to the prisoner's dilemma and see what happens when both players defect with probability $p$ (this is the new part of the story I came up with this morning under the shower. Of course, I do not claim any originality here). Then the expected number of years I spend in prison is $p^2M+Lp(1-p)+(1-p)^2$. Quick check: for $p$ being 0 or 1 I get back the two deterministic values. So can I do better? Obviously, this is a quadratic function of $p$ going through $(0,1)$ and $(1,M)$. So it has its minimum in the interior of the range $p\in[0,1]$ exactly when the slope at $p=0$ is negative (remember $M>S=1$). But the slope is $2(M-L+1)p+L-2$, which at $p=0$ equals $L-2$ and is thus positive as long as $L>2$. But this is really the interesting parameter range for the game since for $L<2$ it is better for both players to always switch between cooperate-defect and defect-cooperate since the average sentence in the asymmetric case is shorter than the one year sentence of both cooperating. So, unless that is the case, always cooperating is still the better symmetric strategy of superrational players than the probabilistic ones.
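To see both regimes side by side, here is a rough numerical sketch of the expected sentence and its slope. The parameter pairs $(M,L)=(2,3)$ and $(1.2,1.5)$ are illustrative assumptions, chosen to have one case with $L>2$ and one with $L<2$; the function names are invented for this example.

```python
def expected_years(p, M, L, S=1.0):
    """Expected sentence when both players defect independently with probability p."""
    return p * p * M + p * (1 - p) * L + (1 - p) ** 2 * S


def slope(p, M, L, S=1.0):
    """Derivative of expected_years with respect to p."""
    return 2 * p * M + (1 - 2 * p) * L - 2 * (1 - p) * S


def best_defection_probability(M, L, S=1.0, steps=1000):
    """Grid search for the p in [0, 1] that minimises the expected sentence."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda p: expected_years(p, M, L, S))


if __name__ == "__main__":
    for M, L in [(2.0, 3.0), (1.2, 1.5)]:
        p_star = best_defection_probability(M, L)
        print(M, L, slope(0.0, M, L), p_star, expected_years(p_star, M, L))
```

In the first case the slope at $p=0$ is positive and the search returns $p=0$, i.e. always cooperate. In the second case the slope at $p=0$ is negative and a genuinely mixed strategy brings the expected sentence below one year.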

Tuesday, March 15, 2011

Formulas in Blogger

To include formulas in blogger.com I have so far used mimetex which uses an external server running a cgi-script to convert TeX-style formulas to pictures.

This did its job most of the time except that mathphys, the old machine in Bremen that hosted my mimetex service died a couple of months ago and that the formulas have that stupid box around them which is particularly annoying for single symbols (this could probably be fixed by investing some time staring at the stylesheet for this blog). This is very much 8bit pixel style and does not scale nicely but I never touched it since it allowed you to read what I wrote.

Now, some reader suggested MathJax which I try out here:

Let's start with a wave function $\psi$, we define the velocity field $\vec v= \frac1{2m}\Im(\frac{\nabla \psi}{\psi})$. This leads to a conserved current:
$$\frac{\partial\rho}{\partial t}= -\vec\nabla\cdot (\bar\psi\psi\vec v).$$
At first, I thought it did not work but it just takes some time to reload.

Wednesday, March 02, 2011

Bohmian mechanics threatened by Occam's razor

Last semester, I ran a seminar on "Foundation of Quantum Mechanics" (wiki page) for TMP students who had been disappointed that "Mathematical Quantum Mechanics" was not on foundations.

Overall, I am quite satisfied with the outcome. We covered several approaches to foundational issues, in particular the relation of quantum to classical physics and here specifically the "measurement problem" (which I am convinced is not a problem but is explained within quantum theory by decoherence). We will produce a reader with all the contributions and I myself will write some introduction (which I will post here as well once it is finished).

But today, I want to discuss Bohmian mechanics, which was one of the topics and which has strong support from some local experts. I never really cared about this approach (being one of the Gallic villages where a small group of people know they are doing it better than the rest of the ignorant world, much like algebraic QFT or loop quantum gravity), being satisfied with quantum physics without any extras.

But now was the time to find out what Bohmian mechanics is really about and in this post I would like to share my findings. The big question everybody asks really is "do they make any predictions that differ from usual quantum mechanics i.e. can it be distinguished by some sort of experiment or is it just an alternative interpretation?" but unfortunately I do not have a final answer. But more below.

Before I start, let me put it a bit in perspective: Inequalities of Bell type (and I would include the Kochen-Specker theorem and GHZ type experiments) show in effect that the world cannot be both "realistic" and "local". Realistic means here that all properties have values at any instant of time irrespective of whether they are measured or not while local means that any decision I take here and now (for example whether I measure the x or y component of the spin of my half of an EPR singlet state) cannot influence measurements that are so far away that they cannot be reached even at the speed of light.

Thus one has to give up either realism or locality. The common interpretation of quantum mechanics gives up realism (the x component of the spin does not have a value when I measure the y component) but is local. In some of the popular literature you will find statements to the contrary but they are mistaken: It is true, there can be non-local correlations. But this is no different from classical physics: Most of the time the color of the sock on my right foot is correlated with the color of the sock on the left foot, even at the same instant of time (when they are space-like to each other). But the question of locality is not about states (which are always global), it is about operators or measurements. And measuring the color (as compared for example to the size) of one of the socks does not influence the other sock, the local operators do commute.

Bohmian mechanics insists on realism and the price it has to pay is to give up locality. It does not violate causality in an obviously measurable way but doing the x- or y-measurement here influences what happens far far away. But enough of these philosophical remarks, let's look at some formulas.

In its pure form, Bohmian mechanics is about non-relativistic systems of $N$ particles with Hamiltonian of the form $H=\sum_i p_i^2 + V(x_1,\ldots,x_N)$. Everybody knows that the norm-squared wave function in position representation $\rho(x_1,\ldots,x_N)=|\psi(x_1,\ldots,x_N)|^2$ gives the probability distribution of finding particle 1 at $x_1$, particle 2 at $x_2$ etc. and there is a conserved current $j(x_1,\ldots,x_N)=\Im(\bar\psi\nabla\psi)$ for this density. That is if you start with some distribution $\rho$ at an initial time then wait a bit while you flow according to the current you end up with the new $\rho$ at a later time.
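For completeness, the conservation law follows from the Schrödinger equation in a few lines. The sketch below is for a single particle, with $\hbar=1$ and the kinetic term written as $-c\,\Delta$, where the constant $c>0$ is introduced here only to keep track of the normalisation ($c=1$ for the Hamiltonian above, $c=1/2m$ in the more usual convention) and $V$ is assumed real:

$$\begin{aligned}
i\,\partial_t\psi &= -c\,\Delta\psi + V\psi, \qquad -i\,\partial_t\bar\psi = -c\,\Delta\bar\psi + V\bar\psi,\\
\partial_t(\bar\psi\psi) &= \bar\psi\,\partial_t\psi + \psi\,\partial_t\bar\psi
 = i c\,\bigl(\bar\psi\,\Delta\psi - \psi\,\Delta\bar\psi\bigr)\\
 &= i c\,\nabla\cdot\bigl(\bar\psi\,\nabla\psi - \psi\,\nabla\bar\psi\bigr)
 = -2c\,\nabla\cdot\Im\bigl(\bar\psi\,\nabla\psi\bigr).
\end{aligned}$$

The potential drops out because it is real, and one reads off $\partial_t\rho+\nabla\cdot j=0$ with $j=2c\,\Im(\bar\psi\nabla\psi)$. The constant prefactor depends only on how the kinetic term is normalised and is suppressed in the formula for $j$ in the text.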

The new thing for the Bohmians is to interpret this current as an actual current of particles with velocities $\dot Q(x_1,\ldots,x_N) = v = j/\rho = \Im(\nabla\psi/\psi)$. According to the Bohmians, these particles with joint coordinates $Q$ are dots that for example show up on the screen of a double slit experiment. Obviously, if you start with a probability distribution of particle positions given by $|\psi|^2$ at an initial time and follow the deterministic flow equation for $Q$ above, then at any later time the particles will be distributed according to $|\psi|^2$. The Bohmians claim, that there are really particles and at any instant of time their position is $Q$ and the velocity is $\dot Q$ no matter whether they are measured or not. That's it.

A few trivial remarks: This theory is non-local as the velocity of the i-th particle does depend via the wave function on the positions of all the other particles. Bohmians say that this is nothing to worry about since their theory is non-relativistic and this is like for example the Coulomb interaction in non-relativistic quantum mechanics where the force on one electron depends on the instantaneous positions of the other charged particles.

The next remark is that in Bohm's theory there is also the wave function that follows the same Schroedinger equation as in usual quantum mechanics. Thus any question involving only the wave function trivially gives the same answer as in quantum mechanics. The equation of motion for the particle positions $Q$, which are the new ingredient in the Bohm theory, depends on the wave function but not the other way around. There is no feed-back and the wave function does not know about the $Q$. Any quantum mechanical measurement that in the end measures position (like for example Stern-Gerlach) gives the same result, as the $Q$ follow the wave function that determines the outcome in the usual interpretation.

All observables that are functions of the coordinates $x_i$ at one instant of time do commute with each other and one can thus give them all sharp values at that instant of time. Thus there is no problem with claiming those positions are $Q$ even in the usual interpretation.

Position measurements at different times in general do not commute and thus they have no common meaning. Thus the only hope to find disagreement is in experiments that in the Bohmian interpretation require sharp positions at different instants of time.

When it comes to spin I have the impression that the Bohmians cheat a bit: They declare that "spin is not a property of a point-like particle" meaning that realism does not apply to the different components and like in the usual interpretation, the components do not have a meaning unless measured. One can read this as a manifestation of the preferred role the Bohmians give to observables that are a function of the position operators over all other operators. In effect they claim only those position observables deserve realism.

Of course, one can reformulate the Bell type experiments mentioned above in terms of positions (e.g. by translating spins into positions via Stern-Gerlach set-ups) but then the non-local flow equation seems to prevent any obvious contradictions with quantum mechanics.

There are more formal problems: For time-reversal invariant Hamiltonians, one can always choose the eigenfunctions of the Hamiltonian to be real. Thus if the wave function is such an eigenfunction, $\dot Q=0$: the particles don't move, even in e.g. the Coulomb field of a hydrogen atom. You may say that this is not the classical world but the quantum world and there are other equations of motion but I must say I find particles standing still even in the presence of forces a bit strange.
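The vanishing velocity for real wave functions is easy to see numerically. The following one-dimensional sketch evaluates $\Im(\psi'/\psi)$ with a finite difference; the two test functions (a real Gaussian standing in for a real eigenfunction, and a plane wave with an arbitrarily chosen $k=2$) and all names are assumptions made for illustration only.

```python
import cmath
import math


def bohm_velocity(psi, x, h=1e-5):
    """Im(psi'(x) / psi(x)), with psi' from a central finite difference."""
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
    return complex(dpsi / psi(x)).imag


def gaussian(x):
    """A real wave function (unnormalised), e.g. an oscillator ground state."""
    return math.exp(-x * x / 2)


def plane_wave(x, k=2.0):
    """A complex wave function exp(i k x) carrying momentum k."""
    return cmath.exp(1j * k * x)


if __name__ == "__main__":
    for x in (-1.0, 0.3, 2.0):
        print(x, bohm_velocity(gaussian, x), bohm_velocity(plane_wave, x))
```

The real function gives velocity zero at every point, whatever its shape, while the plane wave gives a constant velocity close to $k$. Normalisation does not matter here since only the ratio $\psi'/\psi$ enters.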

That brings us to my main criticism: It is not clear to me how to observe the particle at $Q$. Do experiments measure the wave function (via $\langle O \rangle= \langle\psi|O|\psi\rangle$) or do they measure $Q$? And if so, can I prepare (and later measure) $Q$ without significantly disturbing the wave function? If that is the case I can of course put an electron in a hydrogen atom in an energy eigenstate at some $Q$ and later check whether I find it at some other place (which quantum mechanics would predict).

There are of course ways to wiggle out: You could argue that this experiment is impossible since I would always disturb the wave function significantly by placing a particle at $Q$ and thus everything gets screwed up.

But this excuse is pretty much equivalent to "you cannot observe Q (directly)". But then we are adding something (the particles at positions Q) to our theory which is not observable. And that sounds to me to be directly threatened by Occam's razor.

Anyway. Unless somebody explains to me how to measure Q, I maintain that adding Q to the theory is as good as adding invisible angels.

Update: The promised write-up is here.