Wednesday, November 30, 2011

More than one nature for natural units

Hey blog, long time no see! Bee has put together a nice video on natural units. There are one or two aspects that I would put slightly differently and rather than writing a comment I thought it might be better to write a post myself.

The first thing is that, strictly speaking, there is no single natural unit system: it depends on the problem you are interested in. For example, if you are interested in atoms, the typical mass is that of the electron, so you will likely be interested in masses as multiples of $m_e$. Then, interactions are Coulomb and you will want to express charges as multiples of the electron charge $e$. Finally, quantum mechanics is your relevant framework, so it is natural to express actions in multiples of $\hbar$. Then a quick calculation shows that this unit system of setting $m_e=e=\hbar=1$ implies that distances are dimensionless and the distance $r=1$ happens to be the Bohr radius that sets the natural scale for the size of atoms. Naturalness here lets you guess the size of an atom just from identifying the electron mass, the electric charge and quantum mechanics as the relevant ingredients.
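As a rough numerical check of the Bohr radius claim, here is a small Python sketch that converts the length unit of the $m_e=e=\hbar=1$ system back to meters. The constants are CODATA values typed in by hand, the function name is made up for this illustration, and the factor $4\pi\epsilon_0$ shows up only because SI measures charge in Coulombs.

```python
import math

# SI values (CODATA 2018), entered by hand for this sketch
HBAR = 1.054571817e-34      # J s
M_E = 9.1093837015e-31      # kg
E_CHARGE = 1.602176634e-19  # C
EPS0 = 8.8541878128e-12     # F/m

def atomic_length_unit():
    """Length that equals 1 when m_e = e = hbar = 1 (with 4*pi*eps0 = 1)."""
    return 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)

a0 = atomic_length_unit()
print(f"unit length = {a0:.4e} m")
```

The result comes out at roughly half an Angstrom, which is indeed the typical size of an atom.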

When you are doing high energy particle physics, quantum physics and special relativity are relevant and thus it is convenient to use units in which $\hbar=c=1$, which is Bee's example. In this unit system, masses and energies have units of inverse length.

If you are a classical relativist contemplating solutions of Einstein's equations, then quantum mechanics (and thus $\hbar$) does not concern you but Newton's constant $G$ does. These people thus use units with $c=G=1$. Confusingly, in this unit system, masses have units of length (and not inverse length as above). In particular, the length scale of a black hole with mass $M$, the Schwarzschild radius, is $R=2M$ (the 2 being there to spice up life a bit). So you have to be a bit careful when you convert energies to lengths: you have to identify whether you are in a quantum field theory or in a classical gravity situation.
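To make the difference between the two conventions concrete, here is a minimal Python sketch that turns a mass into a length both ways. The function names and the solar mass example are my own choices for illustration; the constants are standard SI values typed in by hand.

```python
HBAR = 1.054571817e-34  # J s
C = 299792458.0         # m/s
G = 6.67430e-11         # m^3 kg^-1 s^-2

def length_qft(mass_kg):
    """hbar = c = 1: a mass corresponds to the inverse length hbar/(m c)."""
    return HBAR / (mass_kg * C)

def length_gravity(mass_kg):
    """c = G = 1: a mass corresponds to the length G m / c^2."""
    return G * mass_kg / C**2

M_SUN = 1.98847e30  # kg, approximate solar mass
schwarzschild_radius = 2 * length_gravity(M_SUN)  # R = 2M
print(f"R = 2M for the sun: {schwarzschild_radius:.0f} m")
print(f"hbar/(m c) for the sun: {length_qft(M_SUN):.2e} m")
```

Doubling the mass doubles the first length but halves the second, which is exactly the "length versus inverse length" confusion mentioned above.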

My other remark is that the number of independent units is a matter of convention. Many people think that in mechanics you need three (e.g. length, mass and time; meters, kilograms and seconds in the SI system), a fourth if you include thermodynamics (like temperature, measured in Kelvin) and a fifth if there is electromagnetism (like charge or alternatively current, Amperes in SI). But these numbers are just what we are used to. The number can change when we change our understanding of a relation from "physical law" to "conversion factor". The price is a dimensionful constant: In the SI system, it is a law that in equipartition each degree of freedom carries the energy $E=\frac 12k_BT$, Coulomb's law equates a mechanical force to an electrostatic expression via $F=\frac{1}{4\pi\epsilon_0}\frac{qQ}{r^2}$, and it is a law that light moves at a speed $c=s/t$.

But alternatively, we could use these laws to define what we actually mean by temperature (then measured in units of energy), charge (effectively setting $4\pi\epsilon_0$ to unity and thereby expressing charge in mechanical units) and length (expressing a distance by the time light needs to traverse it). Each such step eliminates a law and a unit. What remains of the law is only the fact that one can do this without reference to circumstances, e.g. that the distance from here to Paris does not depend on the time of the year (and thus on the direction of the velocity of the earth on its orbit around the sun and thus potentially relative to the ether). If the speed of light were not constant and we tried to measure distances by the time it takes light to traverse them, then distances would vary in exactly those situations where we would otherwise say that the speed of light varies.
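Read as conversion factors, $k_B$ and $c$ are nothing more than numbers that translate between two scales for the same quantity. A small Python sketch, with hypothetical function names and SI values typed in by hand:

```python
K_B = 1.380649e-23      # J/K, here just a conversion factor
EV = 1.602176634e-19    # J per electron volt
C = 299792458.0         # m/s, here just a conversion factor

def temperature_as_energy_ev(kelvin):
    """Temperature expressed directly as an energy (in eV)."""
    return K_B * kelvin / EV

def distance_as_time(meters):
    """Distance expressed as the time light needs to traverse it (in s)."""
    return meters / C

print(temperature_as_energy_ev(300.0))   # room temperature as an energy
print(distance_as_time(1000.0))          # one kilometer in light-seconds
```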

There is even an example showing that you can increase the number of units beyond what we are used to (although it is a bit artificial): It is not god given what kinds of things we consider 'of the same type' and thus possible to be measured in the same units. We are used to measuring all distances in the same unit (like for example meters) or derived units like kilometers or feet (with a fixed numerical conversion factor). But in nautical situations it is common to treat horizontal distances as entirely different from vertical distances. Horizontal distances, like the way to the next island, you would measure in nautical miles while vertical distances (like the depth of water) you measure in fathoms. It is then a natural law that the ratio between a given depth and a given horizontal distance is constant over time, and there is a dimensionful constant of nature (fathoms per mile) that allows you to compute a horizontal distance from a depth.

Friday, June 03, 2011

Bitcoin explained

Like me, you might have recently heard about "Bitcoin", the internet currency that tries to be safe without a central authority like a bank or a credit card company that says which transactions are legitimate. So far, all mentions in blogs, podcasts or the press that I have seen had in common that they did not say how it works, i.e. what the mechanisms are that make sure Bitcoins operate like money. So I looked it up and this is what I found:

Bitcoin uses two cryptographic primitives: hashes and public key encryption. In case you don't know what these are: A hash is a function that reads in a string (or file or number, those are technically all the same) and produces some sort of checksum. The important properties are that everybody can do this computation (with some small amount of effort) and produce the same checksum. On the other hand, it is "random" in the sense that you cannot work backwards, i.e. if you only know the checksum you effectively have no idea about the original string. It is computationally hard to find a string for a given checksum (more or less the best you can do is guess random strings and compute their checksums until you succeed). A related hard problem is to find such a string with prescribed first $N$ characters.

This can be used as a proof of effort: You can pose the problem to find a string (possibly with prescribed first characters) such that the first $M$ digits of the checksum have a prescribed value. In binary notation you could for example ask for $M$ zeros. Then on average you have to make $2^M$ guesses for the string until you succeed. Presenting such a string then proves you have invested an effort of $O(2^M)$. The nice thing is that this effort is additive: You can start your string with the characters "The message '....' has checksum 000000xxxxxxxxxxx" and continue it such that the checksum of the total string starts with many zeros. That proves that in addition to the zeros your new string has, somebody has already spent some work on the string I wrote as dots. Common hash functions are SHA-1 (and older and not as reliable: MD5).
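The guessing game can be sketched in a few lines of Python. This is only an illustration of the idea, not Bitcoin's actual data format: I use SHA-256 from the standard `hashlib` module as the hash function, a simple counter as the source of guesses, and function names of my own invention.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def find_suffix(prefix: bytes, m: int) -> bytes:
    """Guess suffixes until SHA-256(prefix + suffix) starts with m zero bits."""
    for nonce in count():
        suffix = str(nonce).encode()
        if leading_zero_bits(hashlib.sha256(prefix + suffix).digest()) >= m:
            return suffix

prefix = b"some prescribed first characters:"
suffix = find_suffix(prefix, 12)   # about 2**12 guesses on average
digest = hashlib.sha256(prefix + suffix).hexdigest()
print(suffix, digest)
```

Checking the result takes a single hash computation, while finding it takes thousands of guesses; each additional required zero bit doubles the expected effort.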

The second cryptographic primitive is public key encryption. Here you have two keys: $A$, the public key, which you tell everybody about, and $B$, your secret key (which you tell nobody about). These have the property that you can use one of the keys to "encrypt" a string and then the other key can be used to recover the original string. In particular, you need to know the private key to produce a message that can be decrypted with the public key. This is called a "signature": You have a message $M$ and encrypt it using $B$. Let us call the result $B(M)$. Then you can show $A$ and $M$ and $B(M)$ to somebody to prove that you are in possession of $B$ without revealing $B$, since that person can verify that $B(M)$ can be decrypted using $A$. Here, an example is the RSA algorithm.
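For the curious, here is a toy version of such a signature using textbook RSA with absurdly small primes. It is a sketch to show the mechanics only: the numbers are my own example, real keys are hundreds of digits long, and real signature schemes add hashing and padding on top.

```python
# Toy RSA with tiny primes; insecure, for illustration only.
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
a = 17                         # public exponent (key A together with n)
b = pow(a, -1, phi)            # secret exponent (key B), needs Python 3.8+

def sign(message: int) -> int:
    """'Encrypt' the message with the secret key: B(M)."""
    return pow(message, b, n)

def verify(message: int, signature: int) -> bool:
    """Anyone knowing A can check that A undoes B(M)."""
    return pow(signature, a, n) == message

m = 65                         # the message, a number smaller than n
s = sign(m)
print(verify(m, s), verify(m + 1, s))
```

The first check succeeds and the second fails: the signature fits only the message it was computed for, and producing it required the secret exponent.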

Now to Bitcoin. Let's go through the list of features that you want your money to have. The first is that you want to be able to prove that your coins belong to you. This is done by making coins files that contain the public key $A$ of their owner. Then, as explained in the previous paragraph, you can prove that you are the legitimate owner of the private key belonging to that coin and thus you are its owner. Note that you can have as many public-private key pairs as you like, possibly one for every coin. The key pair is just there to equate knowing a secret (key) with owning the coin.

Second, you want to be able to transfer ownership of the coin. Let us assume that the recipient has the public key $A'$. Then you transfer the coin (which already contains your public key $A$) by appending the string "This coin is transferred to the owner of the secret key to the public key $A'$". Then you sign the whole thing with your private key $B$. The recipient can now prove that the coin was transferred to him as the coin contains both your public key (from before) and your statement of the transfer (which only you, knowing $B$, can have authorized; this can be checked by everybody by checking the signature). So the recipient can prove you owned the coin and agreed to transfer it to him.

The last property is that once you have transferred the coin to somebody else you cannot give it to a third person as you do not own it anymore. Or put differently: If you try to transfer a coin a second time, that should not work and the recipient should not accept it or at least it should be illegitimate.

But what happens if two people claim they own the same coin, how can we resolve this conflict? This is done via a public time-line that is kept collaboratively between all participants. Once you receive a coin you want to be able to prove later that you already owned it at a specific time (in particular at the time when somebody else claims he received it).

This is done as follows: You compute the hash function of the transfer (or the coin after transfer, see above, including the signature of the previous owner of the coin stating that he has given it to you) and add it to the time line. This means you take the hash value of the time line so far, add the hash of the transfer and compute a new hash. This whole package you then send to your network peers and ask them to also include your transfer in their version of the time line.

So the time line is a record of all the transfers that have happened in the past and each participant in the network keeps his own copy of it.
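The chaining of hashes can be sketched as follows. This is a simplified picture under my own assumptions (SHA-256 from `hashlib`, transfers as plain strings, an arbitrary starting value, made-up function names), not the format Bitcoin actually uses.

```python
import hashlib

def extend(timeline_hash: str, transfer: str) -> str:
    """New time line hash from the old one and the hash of a transfer."""
    transfer_hash = hashlib.sha256(transfer.encode()).hexdigest()
    return hashlib.sha256((timeline_hash + transfer_hash).encode()).hexdigest()

def replay(transfers):
    """Recompute the time line hash from scratch for a list of transfers."""
    h = "0" * 64                      # arbitrary starting value
    for t in transfers:
        h = extend(h, t)
    return h

history = ["coin 1: A -> A'", "coin 2: C -> D", "coin 1: A' -> E"]
tampered = ["coin 1: A -> X", "coin 2: C -> D", "coin 1: A' -> E"]
print(replay(history) == replay(history))   # same record, same hash
print(replay(history) == replay(tampered))  # altered past, different hash
```

Because every new hash depends on all earlier ones, changing an old transfer changes every hash after it, so two participants only need to compare the latest hash to see whether their records agree.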

There could still be a conflict when two incompatible time lines are around. Which is the correct one that should be trusted? One could have a majority vote amongst the participants but (as everybody knows from internet discussions) nothing is easier than to come up with a large number of sock puppets that swing any poll. Here comes the proof of work that I mentioned above in relation to hash functions: There is a field in the time line that can be filled with anything in the attempt to construct something that has a hash with as many zeros as possible. Remember, producing $N$ leading zeros amounts to $O(2^N)$ work. Having a time line with many zeros demonstrates that you were willing to put a lot of effort into this time line. But as explained above, this proof of effort is additive and all the participants in the network continuously try to add zeros to their time line hashes. If they share and combine their time lines often enough that they stay coherent, they are (due to additivity) all working on finding zeros on the same time line. So rather than everybody working for themselves, everybody works together as long as their time lines stay coherent. And going back through a time line it is easy to see how much zero finding work has been put in. Thus in the case of conflicting time lines one simply takes the one that contains more zero finding work. If you wanted to establish an alternative time line (possibly one where at some point in time you did not transfer a coin but rather kept it to yourself so you could give it to somebody else later), you would have to outperform all other computers in the network that are all busy working on computing zeros for the other, correct, time line.
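The rule for resolving conflicts can be written down in a few lines. In this sketch a time line is reduced to a list of the numbers of leading zero bits found for its entries; the example numbers and function names are invented for illustration.

```python
def work(zero_bits_per_entry):
    """Expected number of guesses behind a time line: sum of 2**N per entry."""
    return sum(2 ** n for n in zero_bits_per_entry)

def pick(timeline_a, timeline_b):
    """Trust the time line that contains more zero finding work."""
    return timeline_a if work(timeline_a) >= work(timeline_b) else timeline_b

honest = [20, 21, 20, 22]     # zero bits found by the whole network
forged = [20, 21, 12, 13]     # a lone forger redoing the last two entries
print(pick(honest, forged) is honest)
```

A forger working alone cannot keep up with the combined guessing rate of everybody else, so his rewritten entries carry far fewer zeros and his time line loses the comparison.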

Of course, if you want to receive a bitcoin you should make sure that in the generally accepted time line that same coin has not already been given to somebody else. This is why the transfers take some time: You want to wait a bit until the information that the coin has been transferred to you has spread sufficiently on the network and has been included in the collective time line, so that it cannot be reversed anymore.

There are some finer points like how subdividing coins (currently worth about 13 dollars) is done and how new coins can be created (again with a lot of CPU work) but I think they are not as essential in case you want to understand the technical basis of bitcoin before you put real money in.

BTW, if you liked this exposition (or some other here) feel free to transfer me some bitcoins (or fractions thereof). My receiving address is:
19cFYVExc2ZS4p7ZARGyENFijnV43y6ts1

Thursday, March 24, 2011

Mixed superrationality does not beat pure in prisoner's dilemma

The prisoner's dilemma is probably one of the most famous toy games of game theorists. It is about two criminals who, having been caught by the police, are interrogated individually and offered the following deal: If both remain silent ("cooperate" with each other) both go to prison for $S$ ('short') years for small crimes that the police can prove. But if one prisoner admits the big crime ("defects") he goes free and the other spends $L$ ('long') years in prison. But if both admit the crime they both face an $M$ ('middle') year sentence. To be a dilemma the sentences should obey $0<S<M<L$ and by picking an appropriate normalisation of the unit of time, we can set $S=1$.

The standard (economist) analysis of the game goes as follows: I assume that the other prisoner has already made his decision. Then, no matter what he decided I am better off by defecting: If he cooperates, my choice is between going free and $S$ years while if he is defecting I can choose between $M$ and $L$. So I defect and he comes to the same conclusion, so we end up spending $M$ years in prison. Both defecting is in fact a Nash equilibrium.
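This analysis is easy to check by brute force. The following Python sketch uses example sentences of my own choosing ($S=1$, $M=3$, $L=5$) and hypothetical helper names; it tests all four combinations of moves for the defining property of a Nash equilibrium.

```python
from itertools import product

S, M, L = 1, 3, 5          # example sentences with 0 < S < M < L
C, D = "cooperate", "defect"

def years(mine, other):
    """Years I spend in prison given my move and the other prisoner's move."""
    return {(C, C): S, (C, D): L, (D, C): 0, (D, D): M}[(mine, other)]

def is_nash(a, b):
    """Neither player can shorten his own sentence by switching alone."""
    a_ok = all(years(a, b) <= years(alt, b) for alt in (C, D))
    b_ok = all(years(b, a) <= years(alt, a) for alt in (C, D))
    return a_ok and b_ok

equilibria = [(a, b) for a, b in product((C, D), repeat=2) if is_nash(a, b)]
print(equilibria)
```

Only the pair in which both defect survives the check.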

That's not too exciting, as we could do better by both cooperating and serving only $S$ years, which is Pareto optimal but unstable because there is the temptation for each player to defect and then go free. So much for the classic analysis of this game (not iterated) which is a model for many decision problems where one has to decide between a personal advantage or the global optimum.

I first learned about this game many many years ago, when still attending high school, from a column by Douglas Hofstadter in Scientific American. He makes the following observation: When defecting, I am counting on the fact that the other prisoner is not as clever as me. It only pays if the situation is asymmetric. But since the other prisoner is faced with the same problem, he will come up with the same solution, so the asymmetric case of one player cooperating and the other defecting will not occur. Thus the only real possibilities are both cooperating (yielding $S$ years) and both defecting (yielding $M$ years), of which the obviously better choice is to cooperate. Hofstadter calls this argument "superrational". It is the realization that in the analysis of the Nash equilibrium the idea that my decision is independent of the other prisoner's decision might be wrong.

Then Hofstadter points out another version of this game: You receive a letter from a very rich person stating that she is studying human intelligence and she figured that you are one of the ten most intelligent people in the world. She offers you (and also the other nine top-brainers) the following game: On the bottom of the letter is a coupon. You can either ignore the letter (in which case nothing more will happen) or you write your name on the coupon and send it back. If out of the ten possible coupons she receives exactly one, she gives the person who returned the coupon 100 million dollars. If any other number of coupons arrives by the end of this year nobody will receive any money. And as a warning: You are watched over by a number of private investigators. If they notice you trying to find out who the other nine people are the whole thing is called off and again nobody will get any money. So don't even think about it.

This does not look very promising: Obviously, if you don't send in the coupon you won't get any money. So you have to send the coupon but so will the other nine and again you will receive nil. Too bad.

Well, unless you widen your strategy space and besides 'pure', deterministic strategies you also allow for 'mixed', i.e. probabilistic strategies. You could for example come up with the following strategy: You roll dice and then send the coupon only with probability $p$. Let's see which $p$ optimizes your expectation assuming the other nine players follow the same strategy: You only get the money if you send the letter (probability $p$) and all nine others don't (probability $(1-p)^9$), so the expectation is $E=p(1-p)^9$. Setting the $p$ derivative of $E$ to zero gives $0=(1-p)^9-9p(1-p)^8=(1-p)^8(1-p-9p)$ and thus $p=1/10$. So you could prepare ten envelopes but only one with the coupon and mail a random one of these to optimize your expectation.
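If you do not trust the derivative, a crude numerical search confirms the result. The grid of 1001 values for $p$ and the function name are arbitrary choices for this sketch.

```python
def expectation(p, players=10):
    """Chance that I send the coupon and none of the others do."""
    return p * (1 - p) ** (players - 1)

grid = [i / 1000 for i in range(1001)]
best_p = max(grid, key=expectation)
print(best_p, expectation(best_p))
```

The maximum sits at $p=1/10$, where the chance of winning is just under 4 percent for each player.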

But with this idea of also taking mixed strategies into account we can go back to the prisoner's dilemma and see what happens when both players defect with probability $p$ (this is the new part of the story, which I came up with this morning in the shower. Of course, I do not claim any originality here). Then the expected number of years I spend in prison is $E(p)=p^2M+Lp(1-p)+(1-p)^2$. Quick check: for $p$ being 0 or 1 I get back the two deterministic values. So can I do better? Obviously, this is a quadratic function of $p$ going through $(0,1)$ and $(1,M)$. So it has its minimum in the interior of the range $p\in[0,1]$ only if the slope at $p=0$ is negative (remember $M>S=1$). But the slope is $E'(p)=2(M-L+1)p+L-2$, which at $p=0$ equals $L-2$ and is thus positive as long as $L>2$. But this is really the interesting parameter range for the game, since for $L<2$ it is better for both players to always switch between cooperate-defect and defect-cooperate: the average sentence $L/2$ in the asymmetric case is then shorter than the one year sentence of both cooperating. So, unless that is the case, always cooperating is still a better symmetric strategy for superrational players than any probabilistic one.
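Again this can be checked numerically. The sketch below evaluates the expected sentence from the text on a grid of probabilities; the two parameter sets are examples I picked, one with $L>2$ and one with $L<2$.

```python
def expected_years(p, M, L):
    """Expected sentence if both defect independently with probability p (S = 1)."""
    return p**2 * M + L * p * (1 - p) + (1 - p) ** 2

def best_p(M, L, steps=1000):
    """Defection probability on a grid that minimizes the expected sentence."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda p: expected_years(p, M, L))

print(best_p(M=3, L=5))      # L > 2
print(best_p(M=1.5, L=1.8))  # L < 2
```

In the first case the minimum is at $p=0$, i.e. always cooperating. In the second case a small but nonzero defection probability does slightly better, in line with the observation that for $L<2$ the asymmetric outcomes are attractive on average.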

Tuesday, March 15, 2011

Formulas in Blogger

To include formulas in blogger.com I have so far used mimetex, which uses an external server running a cgi-script to convert TeX-style formulas to pictures.

This did its job most of the time except that mathphys, the old machine in Bremen that hosted my mimetex service died a couple of months ago and that the formulas have that stupid box around them which is particularly annoying for single symbols (this could probably be fixed by investing some time staring at the stylesheet for this blog). This is very much 8bit pixel style and does not scale nicely but I never touched it since it allowed you to read what I wrote.

Now, some reader suggested MathJax which I try out here:

Let's start with a wave function $\psi$ and define the velocity field $\vec v= \frac1{2m}\Im(\frac{\nabla \psi}{\psi})$. This leads to a conserved current:
$$\frac{\partial\rho}{\partial t}= -\vec\nabla\cdot (\bar\psi\psi\vec v).$$
At first, I thought it did not work but it just takes some time to reload.

Wednesday, March 02, 2011

Bohmian mechanics threatened by Occam's razor

Last semester, I ran a seminar on "Foundation of Quantum Mechanics" (wiki page) for TMP students who had been disappointed that "Mathematical Quantum Mechanics" was not on foundations.

Overall, I am quite satisfied with the outcome. We covered several approaches to foundational issues, in particular the relation of quantum to classical physics and here specifically the "measurement problem" (which I am convinced is not a problem but is explained within quantum theory by decoherence). We will produce a reader with all the contributions and I myself will write some introduction (which I will post here as well once it is finished).

But today, I want to discuss Bohmian mechanics, which was one of the topics and which has strong support from some local experts. I never really cared about this approach (being one of the Gallic villages where a small group of people know they are doing it better than the rest of the ignorant world, much like algebraic QFT or loop quantum gravity), being satisfied with quantum physics without any extras.

But now was the time to find out what Bohmian mechanics is really about and in this post I would like to share my findings. The big question everybody asks really is "do they make any predictions that differ from usual quantum mechanics i.e. can it be distinguished by some sort of experiment or is it just an alternative interpretation?" but unfortunately I do not have a final answer. But more below.

Before I start, let me put it a bit in perspective: Inequalities of Bell type (and I would include the Kochen-Specker theorem and GHZ type experiments) show in effect that the world cannot be both "realistic" and "local". Realistic means here that all properties have values at any instant of time irrespective of whether they are measured or not, while local means that any decision I take here and now (for example whether I measure the x or y component of the spin of my half of an EPR singlet state) cannot influence measurements that are so far away that they cannot be reached even at the speed of light.

Thus one has to give up either realism or locality. The common interpretation of quantum mechanics gives up realism (the x component of the spin does not have a value when I measure the y component) but is local. In some of the popular literature you will find statements to the contrary but they are mistaken: It is true, there can be non-local correlations. But this is no different from classical physics: Most of the time the color of the sock on my right foot is correlated with the color of the sock on the left foot, even at the same instant of time (when they are space-like to each other). But the question of locality is not about states (which are always global), it is about operators or measurements. And measuring the color (as compared for example to the size) of one of the socks does not influence the other sock; the local operators do commute.

Bohmian mechanics insists on realism and the price it has to pay is to give up locality. It does not violate causality in an obviously measurable way but doing the x- or y-measurement here influences what happens far far away. But enough of these philosophical remarks, let's look at some formulas.

In its pure form, Bohmian mechanics is about non-relativistic systems of $N$ particles with Hamiltonian of the form $H=\sum_i p_i^2 + V(x_1,\ldots,x_N)$. Everybody knows that the norm-squared wave function in position representation $\rho(x_1,\ldots,x_N)=|\psi(x_1,\ldots,x_N)|^2$ gives the probability distribution of finding particle 1 at $x_1$, particle 2 at $x_2$ etc. and there is a conserved current $j(x_1,\ldots,x_N)=\Im(\bar\psi\nabla\psi)$ for this density. That is, if you start with some distribution $\rho$ at an initial time and then wait a bit while you flow according to the current, you end up with the new $\rho$ at a later time.

The new thing for the Bohmians is to interpret this current as an actual current of particles with velocities $\dot Q(x_1,\ldots,x_N) = v =j/\rho= \Im(\nabla\psi/\psi)$. According to the Bohmians, these particles with joint coordinates $Q$ are the dots that for example show up on the screen of a double slit experiment. Obviously, if you start with a probability distribution of particle positions given by $|\psi|^2$ at an initial time and follow the deterministic flow equation for $Q$ above, then at any later time the particles will be distributed according to $|\psi|^2$. The Bohmians claim that there are really particles and at any instant of time their position is $Q$ and the velocity is $\dot Q$ no matter whether they are measured or not. That's it.

A few trivial remarks: This theory is non-local as the velocity of the $i$-th particle does depend via the wave function on the positions of all the other particles. Bohmians say that this is nothing to worry about since their theory is non-relativistic and this is like for example the Coulomb interaction in non-relativistic quantum mechanics where the force on one electron depends on the instantaneous positions of the other charged particles.

The next remark is that in Bohm's theory there is also the wave function that follows the same Schroedinger equation as in usual quantum mechanics. Thus any question involving only the wave function trivially gives the same answer as in quantum mechanics. The equation of motion for the particle positions $Q$, which are the new ingredient in the Bohm theory, depends on the wave function but not the other way around. There is no feed-back and the wave function does not know about the $Q$. Any quantum mechanical measurement that in the end measures position (like for example Stern-Gerlach) gives the same result, as the $Q$ follow the wave function that determines the outcome in the usual interpretation.

All observables that are functions of the coordinates $x_i$ at one instant of time do commute with each other and one can thus give them all sharp values at that instant of time. Thus there is no problem with claiming those positions are $Q$ even in the usual interpretation.

Position measurements at different times in general do not commute and thus they have no common meaning. Thus the only hope to find disagreement is in experiments that in the Bohmian interpretation require sharp positions at different instants of time.

When it comes to spin I have the impression that the Bohmians cheat a bit: They declare that "spin is not a property of a point-like particle", meaning that realism does not apply to the different components and, like in the usual interpretation, the components do not have a meaning unless measured. One can read this as a manifestation of the preferred role the Bohmians give to observables that are a function of the position operators over all other operators. In effect they claim only those position observables deserve realism.

Of course, one can reformulate the Bell type experiments mentioned above in terms of positions (e.g. by translating spins into positions via Stern-Gerlach set-ups) but then the non-local flow equation seems to prevent any obvious contradictions with quantum mechanics.

There are more formal problems: For time-reversal invariant Hamiltonians, one can always choose the eigenfunctions of the Hamiltonian to be real. Thus if the wave function is such an eigenfunction, $\dot Q=0$: the particles don't move, even in e.g. the Coulomb field of a hydrogen atom. You may say that this is not the classical world but the quantum world and there are other equations of motion, but I must say I find particles standing still even in the presence of forces a bit strange.
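The vanishing velocity for real wave functions is easy to see numerically. The following one-dimensional Python sketch evaluates $\Im(\psi'/\psi)$ with a finite difference; the plane wave and the Gaussian (a real harmonic oscillator ground state, used here instead of a hydrogen eigenfunction for simplicity) are my own examples, as are the function names.

```python
import cmath
import math

def velocity(psi, x, h=1e-6):
    """Im(psi'/psi) at x, with the derivative from a central difference."""
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
    return (dpsi / psi(x)).imag

K = 2.0  # wave number of the plane wave

def plane_wave(x):
    """Complex momentum eigenfunction exp(i K x)."""
    return cmath.exp(1j * K * x)

def ground_state(x):
    """Real Gaussian, the (unnormalized) oscillator ground state."""
    return math.exp(-x * x / 2)

print(velocity(plane_wave, 0.3))    # close to K
print(velocity(ground_state, 0.3))  # zero
```

The plane wave gives a constant velocity equal to its wave number, while the real eigenfunction gives zero everywhere, no matter how strong the potential is at that point.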

That brings us to my main criticism: It is not clear to me how to observe the particle at $Q$. Do experiments measure the wave function (via $\langle O \rangle= \langle\psi|O|\psi\rangle$) or do they measure $Q$? And if so, can I prepare (and later measure) $Q$ without significantly disturbing the wave function? If that is the case I can of course put an electron in a hydrogen atom in an energy eigenstate at some $Q$ and later check whether I find it at some other place (which quantum mechanics would predict).

There are of course ways to wiggle out: You could argue that this experiment is impossible since I would always disturb the wave function significantly by placing a particle at $Q$ and thus everything gets screwed up.

But this excuse is pretty much equivalent to "you cannot observe $Q$ (directly)". But then we are adding something (the particles at positions $Q$) to our theory which is not observable. And that sounds to me to be directly threatened by Occam's razor.

Anyway. Unless somebody explains to me how to measure $Q$, I maintain that adding $Q$ to the theory is as good as adding invisible angels.

Update: The promised write-up is here.