Tuesday, November 06, 2012

A Few Comments On Firewalls

I was stupid enough to agree to talk about Firewalls in our strings lunch seminar this Wednesday without having read the paper (or what other people say about them) except for talking to Raphael Busso at the Strings 2012 conference and reading Joe Polichinski's guest post over at the Cosmic Variance blog.

Now, of course I had to read (some of) the papers and I have to say that I am confused. I admit, I did not get the point. Even more, I cannot understand a large part of the discussion. There is a lot of prose and very little formulas and I have failed to translate the prose to formulas or hard facts for myself. Many of the statements taken at face value do not make sense to me but on the other hand, I know the authors to be extremely clever people and thus the problem is most likely on my end.

In this post, I would like to share some of my thoughts in my endeavor to decode these papers but probably they are to you even more confusing than the original papers to me. But maybe you can spot my mistakes and correct me in the comment section.

I had a long discussion with Cristiano Germani on these matters for which I am extremely grateful. If this post contains any insight it is his while all errors are for course mine.

What is the problem?

I have a very hard time not to believe in "no drama", i.e. that anything special can happen at an event horizon. First of all, the event horizon is a global concept and its location now does in general depend on what happens in the future (e.g. how much further stuff is thrown in the black hole). So who can it be that the location of a anything like a firewall can depend on future events?

Furthermore, I have never seen such a firewall so far. But I might have already passed an event horizon (who knows what happens at cosmological scales?). Even more, I cannot see a local difference between a true event horizon like that of a black hole and the horizon of an accelerated observer in the case of the Unruh-effect. That the later I am pretty sure I have crossed already many times and I have never seen a firewall.

So I was trying to understand why there should be one. And whenever I tried to flesh out the argument for one they way I understood it it fell apart. So, here are some of my thoughts;

The classical situation

No question, Hawking radiation is a quantum effect (even though it happens at tree level in QFT on curved space-time and is usually derived in a free theory or, equivalently, by studying the propagator). But apart from that not much of the discussion (besides possibly the monogamy of entanglement, see below) seems to be particular quantum. Thus we might gain some mileage by studying classical field theory on the space time of a forming and decaying black hole as given by the causal diagram:
A decaying black hole, image stolen from Sabine Hossenfelder.

Issues of causality a determined by the characteristics of the PDE in question (take for example the wave equation) and those are invariant under conformal transformations even if the field equation is not. So, it is enough to consider the free wave equation on the causal diagram (rather than the space-time related to it by a conformal transformation). 

For example we can give initial data on I- (and have good boundary conditions at the r=0 vertical lines). At the dashed horizontal line, the location of the singularity, we just stop evolving (free boundary conditions) and then we can read off outgoing radiation at I+. The only problematic point is the right end of the singularity: This is the end of the black hole evaporation and to me it is not clear how we can here start to impose again some boundary condition at the new r=0 line without affecting what we did earlier. But anyway, this is in a region of strong curvature, where quantum gravity becomes essential and thus what we conclude should better not depend too much on what's going on there as we don't have a good understanding of that regime.

The firewall paper, when it explains the assumptions of complementarity mentions an S-matrix where it tries to formalize the notion of unitary time evolution. But it seems to me, this might be the wrong formalization as the S-matrix is only about asymptotic states and even fails in much simpler situations when there are bound states and the asymptotic Hilbert spaces are not complete. Furthermore, strictly speaking, this (in the sense of LSZ reduction) is not what we can observe: Our detectors are never at spatial infinity, even if CMS is huge, so we should better come up with a more local concept. 
Two regions M and N on a Cauchy surface C with their causal shadows

In the case of the wave equation, this can be encoded in terms of domains of dependence: By giving initial data on a region of a Cauchy surface I determine the solution on its causal shadow (in the full quantum theory maybe plus/minus an epsilon for quantum uncertainties). In more detail: If I have two sets of initial data on one Cauchy surface that agree on a local region. Than the two solutions have to agree on the causal shadow of this region no matter what the initial data looks like elsewhere. This encodes that "my time-evolution is good and I do not lose information on the way" in a local fashion.

States

Some of my confusion comes from talking about states in a way that at least when taken at face value is  in conflict with how we understand states both in classical and in better understood quantum (both quantum mechanics and quantum field theory) circumstances.

First of all (and quite trivially), a state is always at one instant of time, that is it lives on a Cauchy surface (or at least a space-like hyper surface, as our space-time might not be globally hyperbolic), not in a region of space-time. Hilbert space, as the space of (pure) states thus also lives on a Cauchy surface (and not for example in the region behind the horizon). If one event is after another (i.e. in its forward light-cone) it does not make sense to say they belong to different tensor factors of the Hilbert (or different Hilbert spaces for that matter).

Furthermore, a state is always a global concept, it is everywhere (in space, but not in time!). There is nothing like "the space of this observer". What you can do of course is restrict a state to a subset of observables (possibly those that are accessible to one observer) by tracing out a tensor factor of the Hilbert space. But in general, the total state cannot be obtained by merging all these restricted states as those lack information about correlations and possible entanglement.

This brings me to the next confusion: There is nothing wrong with states containing correlations of space-like separated observables. This is not even a distinguishing property of quantum physics, as this happens all the time even in classical situations: In the morning, I pick a pair of socks from my drawer without turning on the light and put it on my feet. Thus I do not know which socks I am wearing, in particular, I don't know their color. But as I combined matching socks when they came from the washing machine (as far as this is possible given the tendency of socks going missing) I know by looking at the sock on my right foot what the color of the sock on my left foot is, even when my two feet are spatially separated. Before looking, the state of the color of the socks was a statistical mixture but with non-local correlations. And of course there is nothing quantum about my socks (even if in German "Quanten" is not only "quantum" but also a pejorative word for feet). This would even be true (and still completely trivial) if I had put one of my feet through an event horizon while the other one is still outside. This example shows that locality is not a property that I should demand of states in order to be sure my theory is free of time travel. The important locality property is not in the states, it is in the observables: The measurement of an observable here must not depend of whether or not I apply an operator at a space-like distance. Otherwise that would imply I could send signals faster than the speed of light. But it is the operators, not the states that have to be local (i.e. commute for spatial separation).

If two operators, however, are time-like separated (i.e. one is after the other in its forward light cone), I can of course influence one's measurement by applying the other. But this is not about correlations, this is about influence. In particular, if I write something in my notebook and then throw it across the horizon of a black hole, there is no point in saying that there is a correlation (or even entanglement) between the notebook's state now and after crossing the horizon. It's just the former influencing the later.

Which brings us to entanglement. This must not be confused with correlation, the former being a strict quantum property whereas the other can be both quantum or classical. Unfortunately, you can often see this in popular talks about quantum information where many speakers claim to explain entanglement but in fact only explain correlations. As a hint: For entanglement, one must discuss non-commuting observables (like different components of a the same spin) as otherwise (by the GNS reconstruction theorem) one deals with a commutative operator algebra which always has a classical interpretation (functions on a classical space). And of course, it is entanglement which violates Bell's inequality or shows up in the GHZ experiment. But you need something of this complexity (i.e. involving non-commuting observables) to make use of the quantumness of the situation. And it is only this entanglement (and not correlation) that is "monogamous": You cannot have three systems that are fully entangled for all pairs. You can have three spins that are entangled, but once you only look at two they are no longer entangles (which makes quantum cryptography work as the eavesdropper cannot clone the entanglement that is used for coding).

And once more, entanglement is a property of a state when it is split according to a tensor product decomposition of the Hilbert space. And thus lives on a Cauchy surface. You can say that a state contains entanglement of two regions on a Cauchy surface but it makes no sense to say to regions that are time-like to each other to be entangled (like the notebook before and after crossing the horizon). And therefore monogamy cannot be invoked with respect to also taking the outgoing radiation in as the third player.

Monday, September 24, 2012

The future of blogging (for me) and in particular twitter

As you might have noticed, breaks between two posts here get bigger and bigger. This is mainly due to lack of ideas on my side but also as I am busy with other things (now that with Ella H. kid number two has joined the family but there is also a lot of TMP admin stuff to do).

This is not only true for me writing blog posts but also about reading: Until about a year ago, I was using google reader not to miss a single blog post of a list of about 50 blogs. I have completely stopped this and systematically read blogs only very occasionally (that is other than being directed to a specific post by a link from somewhere else).

What I still do (and more than ever) is use facebook (mainly to stay in contact with not so computer affine friends) and of course twitter (you will know that I am @atdotde there). Twitter seems to be the ideal way to stay current on a lot of matters you are interested in (internet politics for example) while not wasting too much time given the 140 character limit.

Twitter's only problem is that they don't make (a lot of) money. This is no problem for the original inventors of the site (they have sold their shares to investors) but the current owners now seem desperate to change this. From what they say they want to move twitter more to a many to one (marketing) communication platform and force users to see ads they mix among the genuine tweets.

One of the key aspects of the success of twitter was its open API (application programmers interface): Everybody could write programs (and for example I did) that interacted with twitter so for example everybody can choose their favourite client program on any OS to read and write tweets. Since the recent twitter API policy changes this is no longer the case: A client can now have only 100,000 users (or if they already have more can double the number of users), a small number given the allegedly about 4,000,000 million twitter accounts. And there are severe restrictions how you may display tweets to your users (e.g. you are not allowed to use them in any kind of cloud service or mix them with other social media sites, i.e. blend them with Facebook updates). The message that this sends is clearly: "developers go away" (the idea seems to be to force users to use the twitter website and their own clients) and anybody who still invests in twitter developing is betting on a dead horse. But it is not hard to guess that in the long run this will also make the while twitter unattractive to a lot of (if not eventually all) their users.

People (often addicted to twitter feeds) are currently evaluating alternatives (like app.net) but this morning I realized that maybe the twitter managers are not so stupid as they seem to be (or maybe they just want to cash in what they have and don't care if this ruins the service), there is still an alternative that would make twitter profitable and would secure the service in the long run: They could offer to developers to allow them to use the old API guidelines but for a fee (say a few $/Euros per user per month): This would bring in the cash they are apparently looking for while still keeping the healthy ecosystem of many clients and other programs. twitter.com would only be dealing with developers while those would forward the costs to their users and recollect the money by selling their apps (so twitter would not have to collect money from millions of users).

But maybe that's too optimistic and they just want to earn advertising money NOW.

Tuesday, February 07, 2012

AdS/cond-mat

Last week, Subir Sachdev came to Munich to give three Arnold Sommerfeld Lectures. I want to take this opportunity to write about a subject that has attracted a lot of attention in recent years, namely applying AdS/CFT techniques to condensed matter systems like trying to write gravity duals for D-wave superconducturs or strange metals (it's surprisingly hard to find a good link for this keyword).

My attitude towards this attempt has somewhat changed from "this will never work" to "it's probably as good as anything else" and in this post I will explain why I think this. I should mention as well that Sean Hartnoll has been essential in this phase transition of my mind.

Let me start by sketching (actually: caricaturing) what I am talking about. You want to understand some material, typically the electrons in a horribly complicated lattice like bismuth strontium calcium copper oxide, or BSCCO. To this end, you come up with a five dimensional theory of gravity coupled to your favorite list of other fields (gauge fields, scalars with potentials, you name it) and place that in an anti-de-Sitter background (or better, for finite temperature, in an asymptotically anti-de-Sitter black hole). Now, you compute solutions with prescribed behavior at infinity and interpret these via Witten's prescription as correlators in your condensed matter theory. For example you can read off Green functions and (frequency dependent) conductivities, densities of state.

How can this ever work, how are you supposed to guess the correct field content (there is no D-brane/string description anywhere near that could help you out) and how can you ever be sure you got it right?

The answer is you cannot but it does not matter. It does not matter as it does not matter elsewhere in condensed matter physics. To clarify this, we have to be clear about what it means for a condensed matter theorist to "understand" a system. Expressed in our high energy lingo, most of the time, the "microscopic theory" is obvious: It is given by the Schrödinger equation for $10^23$ electrons plus as similar number of noclei feeling the Coulomb potential of the nuclei and interacting themselves with Coulomb repulsion. There is nothing more to be known about this. Except that this is obviously not what we want. These are far too many particles to worry about and, what is more important, we are interested in the behavior at much much lower energy scales and longer wave lengths, at which all the details of the lattice structure are smoothed out and we see only the effect of a few electrons close to the Fermi surface. As an estimate, one should compare the typical energy scale of the Coulomb interactions, the binding energies of the electrons to the nucleus (Z times 13.6 eV) or in terms of temperature (where putting in the constants equates 1eV to about 10,000K) to the milli-eV binding energy of Cooper pairs or the typical temperature where superconductivity plays a role.

In the language of the renormalization group, the Coulomb interactions are the UV theory but we want to understand the effective theory that this flows to in the IR. The convenient thing about such effective theories is that they do not have to be unique: All we want is a simple to understand theory (in which we can compute many quantities that we would like to know) that is in the same universality class as the system we started from. Differences in relevant operators do not matter (at least to leading order).

Surprisingly often, one can find free theories or weakly (and thus almost free) theories that can act as the effective theory we are looking for. BCS is a famous example, but Landau's Fermi Liquid Theory is another: There the idea is that you can almost pretend that your fermions are free (and thus you can just add up energies taking into account the Pauli exclusion principle giving you Fermi-surfaces etc) even though your electrons are interacting (remember, there is always the Coulomb interaction around). The only effect the interactions have, is to renormalize the mass, to deform the Fermi surface away from a ball and to change the hight of the jump in the T=0 occupation number. Experience shows that this is an excellent description in more than one dimension (that has the exception of the Luttinger liquid) and can probably traced back to the fact that a four-Fermi-interaction is non-renormalizable and thus invisible in the IR.

Only, it is important to remember that the fields/particles in that effective theories are not really the electrons you started with but just quasi-particles that are build in complicated ways out of the microscopic particles carrying around clouds of other particles and deforming the lattice they move in. But these details don't matter and that is the point.

It is only important to guess the effective theory in the same universality class. You never derive this (or: hardly ever). Following an exact renormalization group flow is just way beyond what is possible. You make a hopefully educated guess (based on symmetries etc) and then check that you get good descriptions. But only the fact, that there are not too many universality classes makes this process of guessing worthwhile.

Free or weakly coupled theories are not the only possible guesses for effective field theories in which one can calculate. 2d conformal field theories are others. And now, AdS-technology gives us another way of writing down correlation functions just as Feynman-rules give us correlation functions for weakly coupled theories. And that is all one needs: Correlation functions of effective field theory candidates. Once you have those you can check if you are lucky and get evidence that you are in the correct universality class. You don't have to derive the IR theory from the UV. You never do this. You always just guess. And often enough this is good enough to work. And strictly speaking, you never know if your next measurement shows deviations from what you thought would be an effective theory for your system.

In a sense, it is like the mystery that chemistry works: The periodic table somehow pretends that the electrons in atoms are arranged in states that group together like for the hydrogen atom, you get the same n,l,m,s quantum numbers and the shells are roughly the same (although with some overlap encoded in the Aufbau principle) as for hydrogen. This pretends that the only effect of the electron-electron Coulomb potential is to shield the charge of the nucleus and every electron sees effectively a hydrogen like atom (although not necessarily with integer charge Z) and Pauli's exclusion principle regulates that no state is filled more than once. One could have thought that the effect of n-1 electrons on the last is much bigger, after all, they have a total charge that is almost the same of the nucleous, but it seems, the last electron only sees the nucleus with a 1/r potential although with reduced charge.

If you like, the only thing one should might worry about is that the Witten prescription to obtain boundary correlators from bulk configurations really gives you valid n-point functions of a quantum theory (if you feel sufficient mathematical masochism for example in the sense of Wightman) but you don't want to show that it is the quantum field theory corresponding to the material you started with.

Friday, February 03, 2012

Write-ups

Not much to say, but I would like to mention that, finally, we have been able two finalize two write-ups that I have announced here in the past:

First, there are the notes of a block course that I have in the summer on how to fix some mathematicla lose ends in QFT (notes written by our students Mario Flory and Constantin Sluka):


How I Learned to Stop Worrying and Love QFT

Lecture notes of a block course explaining why quantum field theory might be in a better mathematical state than one gets the impression from the typical introduction to the topic. It is explained how to make sense of a perturbative expansion that fails to converge and how to express Feynman loop integrals and their renormalization using the language of distribtions rather than divergent, ill-defined integrals.

Then there are the contributions to a seminar on "Foundations of Quantum Mechanics" (including an introduction by your's truly) that I taught a year ago. From the contents:


  1. C*-algebras, GNS-construction, states, (Sebastian)
  2. Stone-von-Neumann Theorem (Dennis)
  3. Pure Operations, POVMs (Mario)
  4. Measurement Problem (Anupam, David)
  5. EPR and Entanglement, Bell's Theorem, Kochen–Specker theorem (Isabel, Matthias)
  6. Decoherence (Kostas, Cosmas)
  7. Pointer Basis (Greeks again)
  8. Consistent Histories (Hao)
  9. Many Worlds (Max)
  10. Bohmian Interpretation (Henry, Franz)
See also the seminar's wiki page.

Have fun!



Wednesday, November 30, 2011

More than one nature for natural units

Hey blog, long time no see! Bee has put together a nice video on natural units. There are one or two aspects that I would put slightly differently and rather than writing a comment I thought it might better be to write a post myself.

The first thing is that strictly speaking, there is not the natural unit system, it depends on the problem you are interested in. For example, if you are interested in atoms, the typical mass is that of the electron, so you will likely be interested in masses as multiples of $m_e$. Then, interactions are Coulomb and you will want to express charges as multiples of the electron charge $e$. Finally, quantum mechanics is your relevant framework, so it is natural to express actions in multiples of $\hbar$. Then a quick calculation shows that this unit system of setting $m_e=e=\hbar=1$ implies that distances are dimensionless and the distance $r=1$ happens to be the Bohr radius that sets the natural scale for the size of atoms. Naturalness here lets you guess the size of an atom from just identifying the electron mass, the electric charge and quantum mechanics to be the relevant ingredients.

When you are doing high energy particle physics quantum physics and special relativity are relevant and thus it is convenient to use units in which $\hbar=c=1$ which is Bee's example. In this unit system, masses and energy have inverse units of length.

If you are a classical relativist contemplating solutions of Einstein's equations, then quantum mechanics (and thus $\hbar$) does not concern you but Newton's constant $G$ does. These people thus use units with $c=G=1$. Confusingly, in this unit system, masses have units of length (and not inverse length as above). In particular, the length scale of a black hole with mass M, the Schwarzschild radius is $R=2M$ (the 2 being there to spice up life a bit). So you have to be a bit careful when you convert energies to lengths, you have to identify if you are in a quantum field theory or in a classical gravity situation.

My other remark is that it is conventional how many independent units you have. Many people think, that in mechanics you need three (e.g. length, mass and time, meters, kilograms and seconds in the SI system) and a fourth if you include thermodynamics (like temperature measured in Kelvins) and a fifth if there is electromagnetism (like charge or alternatively current, Amperes in SI). But these numbers are just what we are used to. This number can change when we change our understanding of a relation from "physical law" to "conversion factor". The price is a dimensionful constant: In the SI system, it is a law that in equipartition of energy $E=\frac 12k_bT$ and Coulombs law equates a mechanical force to an electrostatic expression via $F=\frac{qQ} 1{4\pi\epsilon_0r}$ and it is a law that light moves at a speed $c=s/t$.

But alternatively, we could use these laws to define what we actually mean by Temperature (then measured in units of energy), charge (effectively setting $4\pi\epsilon_0$ to unity and thereby expressing charge in mechanical units) and length (expressing a distance by the time light need to traverse it). This eliminates a law and a unit. What remains of the law is only the fact that one can do that without reference to circumstances, that a distance from here to Paris does not depend for example on the time of the year (and thus on the direction of the velocity of the earth on its orbit around the sun and thus potentially relative to the ether). If the speed of light would not be constant and we would try to measure distances by the time it takes light to traverse them then distances would suddenly vary when we would say that the speed of light varies.

There is even an example that you can increase the number of units to more than what we are used to (although a bit artificial): It is not god given what kinds of things we consider 'of the same type' and thus possible to be measured in the same units. We are used to measuring all distances in the same unit (like for example meters) or derived units like kilometers or feet (with a fixed numerical conversion factor). But in nautical situations it is common to treat horizontal distance to be entirely different from vertical distances. Horizontal distances like the way to the next island you would measure in nautical miles while vertical distances (like the depth of water) you measure in fathoms. It is then a natural law that the ratio between a given depth and a given horizontal distance is constant over time and there is dimensionful constant (fathoms per mile) of nature that allows to compute a horizontal distance from a depth.

Friday, June 03, 2011

Bitcoin explained

As me, you might have recently heared about "Bitcoin", the internet currency that tries to be safe without a central authority like a bank or a credit card company that say which transactions are legitimate. So far, all mentions in blogs, podcasts or the press that I have seen had in common that they did not say how it works, what are the mechanisms that make sure Bitcoins operate like money. So I looked it up and this is what I found:

Bitcoin uses to cryptographic primitives: hashes and public key encryption. I case you don't know what these are: A hash is a function that reads in a string (or file or number, those are technically all the same) and produces some sort of checksum. The important properties are that everybody can do this computation (with some small amount of effort) and produce the same checksum. On the other hand, it is "random" in the sense that you cannot work backwards, i.e. if you only know the checksum you effectively have no idea about the original string. It is computationally hard to find a string for a given checksum (more or less the best you can do is guess random strings, compute their checksums until you succeed). A related hard problem is to find such a string with prescribed first $N$ characters.

This can be used as a proof of effort: You can pose the problem to find a string (possibly with prescribed first characters) such that the first $M$ digits of the checksum have a prescribed value. In binary notation you could for example you could ask for $M$ zeros. Then on the average you have to make $2^M$ guesses for the string until you succeed. Presenting such a string then proves you have invested an effort of $O(2^M)$. The nice thing is that this effort is additive: You can start your string with the characters "The message '....' has checksum 000000xxxxxxxxxxx" and continue it such that the checksum of the total string starts with many zeros. That proves that in addition to the zeros your new string has, somebody has already spent some work on the string I wrote as dots. Common hash functions are SHA-1 (and older and not as reliable: MD5).

The second cryptographic primitive is public key encryption. Here you have two keys $A$, the public key which you tell everybody about and $B$ your secret key (you tell nobody about). These have the properties that you can use one of the keys to "encrypt" a string and then the other key can be used to recover the original string. In particular, you need to know the private key to produce a message that can be decrypted with the public key. This is called a "signature": You have a message $M$ and encrypt it using $B$. Let us call the result $B(M)$. Then you can show $A$ and $M$ and $B(M)$ to somebody to prove that you are in possession of $B$ without revealing $B$ since that person can verify that $B(M)$ can be decrypted using $A$. Here, an example is the RSA algorithm.

Now to Bitcoin. Let's go through the list of features that you want your money to have. The first is that you want to be able to prove that your coins belong to you. This is done by making coins files that contain the public key $A$ of their owner. Then, as explained in the previous paragraph you can prove that you are the legitimate owner of the private key belonging to that coin and thus you are its owner. Note that you can have as many public-private key pairs as you like possibly one for every coin. It is just there to equate knowing of a secret (key) to owning the coin.

Second you want to be able to transfer ownership of the coin. Let us assume that the recipient has the public key $A'$. Then you transfer the coin (which already contains your public key $A$) by appending the string "This coin is transfered to the owner of the secrete key to the public key $A'$". Then you sign the whole thing with your private key $B$. The recipient can now prove that the coin was transferred to him as the coin contains both your public key (from before) and your statement of the transfer (which only you, knowing $B$ can have authorized. This can be checked by everybody by checking the signature). So the recipient can prove you owned the coin and agreed to transfer it to him.

The last property is that once you transfered the coin to somebody else you cannot give it to a third person as you do not own it anymore. Or put differently: If you try to transfer a coin a second time that should not work and the recipient should not accept it or at least it should be illegitimate.

But what happens if two people claim they own the same coin, how can we resolve this conflict? This is done via a public time-line that is kept collaboratively between all participants. Once you receive a coin you want to be able to prove later that you already owned it at a specific time (in particular at the time when somebody else claims he received it).

This is done as follows: You compute the hash function of the transfer (or the coin after transfer, see a,bove including the signature of the previous owner of the coin that he has given it to you) and add it to the time line. This means you take the hash value of the time line so far, at the hash of the transfer and compute new hash. This whole package you then send to your network peers and ask them to also include your transfer in their version of the time line.

So the time line is a record of all the transfers that have happened in the past and each participant in the network keeps his own copy of it.

There could still be a conflict when two incompatible time lines are around. Which is the correct one that should be trusted? One could have a majority vote amongst the participants but (as everybody knows from internet discussions) nothing is easier than to come up with a large number of sock puppets that swing any poll. Here comes the proof of work that I mentioned above in relation to hash functions: There is a field in the time line that can be filled with anything in the attempt to construct something that has a hash with as many zeros as possible. Remember, producing $N$ leading zeros amounts to $O(2^N)$ work. Having a time line with many zeros demonstrates that were willing to put a lot of effort into this time line. But as explained above, this proof of effort is additive and all the participants in the network continuously try to add zeros to their time line hashes. But if they share and combine their time lines often enough such that they stay coherent they are (due to additivity) all working on fining zeros on the same time line. So rather than everybody working for themselves everybody works together as long as their time lines stay coherent. And going back through a time line it is easy to see how much zero finding work has been but in. Thus in the case of conflicting time lines one simply takes that that contains more zero finding work. If you wanted to establish an alternative time line (possibly one where at some point in time you did not transfer a coin but rather kept it to yourself so you could give it to somebody else later) to establish it you would have to outperform all other computers in the network that are all busy working on computing zeros for the other, correct, time line.

Of course, if you want to receive a bitcoin you should make sure that in the generally accepted time line that same coin has not already been given to somebody else. This is why the transfers take some time: You want to wait for a bit that the information that the coin has been transferred to you has been significantly spread on the network and included in the collective time line that it cannot be reversed anymore.

There are some finer points like how subdividing coins (currently worth about 13 dollars) is done and how new coins can be created (again with a lot CPU work) but I think they are not as essential in case you want to understand the technical basis of bitcoin before you but real money in.

BTW, if you liked this exposition (or some other here) feel free to transfer me some bitcoins (or fractions of it). My receiving address is
19cFYVExc2ZS4p7ZARGyENFijnV43y6ts1
.

Thursday, March 24, 2011

Mixed superrationality does not beat pure in prisoner's dilemma

The prisoner's dilemma is probably one of the most famous toy games of game theorists. It amounts to two criminals that being caught by the police are interrogated individually are offered the following deal: If both remain silent ("cooperate" with each other) both go to prison for $S$ ('short') years for small crimes that the police can prove. But if one prisoner admits the big crime ("defects") he goes free and the other spends $L$ ('long') years in prison. But if both admit the crime they both face a $M$ ('middle') year sentence. To be a dilemma the sentences should obey $0<S<M<L$ and by picking an appropriate normalisation of the unit of time, we can set $S=1$.

The standard (economist) analysis of the game goes as follows: I assume that the other prisoner has already made his decision. Then, no matter what he decided I am better off by defecting: If he cooperates, my choice is between going free and $S$ years while if he is defecting I can choose between $M$ and $L$. So I defect and he comes to the same conclusion, so we end up spending $M$ years in prison. Both defecting is in fact a Nash equilibrium.

That's not too exciting, as we could do better by both cooperating and serving only $S$ years, which is Pareto optimal but unstable because there is the temptation for each player to defect and then go free. So much for the classic analysis of this game (not iterated) which is a model for many decision problems where one has to decide between a personal advantage or the global optimum.

I first learned about this game many many years ago when still attending high school from a Douglas Hofstaedter column in the Scientific American. He makes the following observation: When defecting, I am counting on the fact that the other prisoner is not as clever as me. It only pays if the situation is asymmetric. But since the other prisoner is faced with the same problem, he will come up with the same solution so the asymmetric case of one player cooperating and the other defecting will not occur. Thus the only real possibilities are both cooperating (yielding $S$ years) and both defecting (yielding $M$ years) of which the obvious better choice is to cooperate. Hofstadter calls this argument "superrational". It is the realization that in the analysis of the Nash equilibrium the idea that my decision is independent of the other prisoner's decision might be wrong.

Then Hofstadter points out another version of this game: You receive a letter from a very rich person stating that she is studying human intelligence and she figured that you are one of the top ten intelligent people in the world. She offers you (and also the other nine top-brainers) the following game: On the bottom of the letter is a coupon. You can either ignore the letter (in which case nothing more will happen) or you write your name on the coupon and send it back. If out of the ten possible coupons she receives exactly one she gives the person who returned the coupon 100 Million dollars. If any other number of coupons arrive until the end of this year nobody will receive any money. And as a warning: You are watched over by a number of private investigators. If they notice you trying to find out who the other nine people are the whole thing is called off and again nobody will get any money. So don't even think about it.

This does not look very promising: Obviously, if you don't send in the coupon you won't get any money. So you have to send the coupon but so will the other nine and again you will receive nil. Too bad.

Well, unless you widen your strategy space and besides 'pure', deterministic strategies you also allow for 'mixed', i.e. probabilistic strategies. You could for example come up with the following strategy: You roll dice and then send the coupon only with probability $p$. Let's see which $p$ optimizes your expectation assuming the other nine player follow the same strategy: You only get the money if you send the letter (probability $p$) and all nine other don't (probablity $(1-p)^9$) so the expectation is $E=p(1-p)^9$. Setting to zero the $p$ derivative of $E$ gives $0=(1-p)^9-9p(1-p)^8=(1-p)^8(1-p-9p)$ thus $p=1/10$. So you could prepare ten envelopes but only one with the coupon and mail a random one of these to optimize your expectation.

But with this idea of taking into account also mixed strategies we can go back to the prisoner's dilemma and see what happens when both players defect with probability $p$ (this is the new part of the story I came up with this morning under the shower. Of course, I do not claim any originality here). Then the expected number of years I spend in prison is $p^2M+Lp(1-p)+(1-p)^2$. Quick check for $p$ being 0 or 1 I get back the two deterministic values. So can I do better? Obviously, this is a quadratic function of $p$ going through $(0,1)$ and $(1,M)$. So it has is minimum in the interior of the range $p\in[0,1]$ if the slope at $p=0$ is negative (remember $M>S=1$). But the slope is $2(M-L+1)p+L-2$ which is positive as long as $L>2$. But this is really the interesting parameter range for the game since for $L<2$ it is better for both players to always switch between cooperate-defect and defect-cooperate since the average sentence in the asymmetric case is shorter than the one year sentence of both cooperating. So, unless that is the case, always cooperating is still the better symmetric strategy of superrational players than the probabilistic ones.