I was stupid enough to agree to talk about
Firewalls in our strings lunch seminar this Wednesday without having read the paper (or
what other people say about them) except for talking to Raphael Busso at the Strings 2012 conference and reading
Joe Polichinski's guest post over at the Cosmic Variance blog.
Now, of course I had to read (some of) the papers and I have to say that I am confused. I admit, I did not get the point. Even more, I cannot understand a large part of the discussion. There is a lot of prose and very little formulas and I have failed to translate the prose to formulas or hard facts for myself. Many of the statements taken at face value do not make sense to me but on the other hand, I know the authors to be extremely clever people and thus the problem is most likely on my end.
In this post, I would like to share some of my thoughts in my endeavor to decode these papers but probably they are to you even more confusing than the original papers to me. But maybe you can spot my mistakes and correct me in the comment section.
I had a long discussion with Cristiano Germani on these matters for which I am extremely grateful. If this post contains any insight it is his while all errors are for course mine.
What is the problem?
I have a very hard time not to believe in "no drama", i.e. that anything special can happen at an event horizon. First of all, the event horizon is a global concept and its location now does in general depend on what happens in the future (e.g. how much further stuff is thrown in the black hole). So who can it be that the location of a anything like a firewall can depend on future events?
Furthermore, I have never seen such a firewall so far. But I might have already passed an event horizon (who knows what happens at cosmological scales?). Even more, I cannot see a local difference between a true event horizon like that of a black hole and the horizon of an accelerated observer in the case of the Unruheffect. That the later I am pretty sure I have crossed already many times and I have never seen a firewall.
So I was trying to understand why there should be one. And whenever I tried to flesh out the argument for one they way I understood it it fell apart. So, here are some of my thoughts;
The classical situation
No question, Hawking radiation is a quantum effect (even though it happens at tree level in QFT on curved spacetime and is usually derived in a free theory or, equivalently, by studying the propagator). But apart from that not much of the discussion (besides possibly the monogamy of entanglement, see below) seems to be particular quantum. Thus we might gain some mileage by studying classical field theory on the space time of a forming and decaying black hole as given by the causal diagram:
Issues of causality a determined by the characteristics of the PDE in question (take for example the wave equation) and those are invariant under conformal transformations even if the field equation is not. So, it is enough to consider the free wave equation on the causal diagram (rather than the spacetime related to it by a conformal transformation).
For example we can give initial data on I (and have good boundary conditions at the r=0 vertical lines). At the dashed horizontal line, the location of the singularity, we just stop evolving (free boundary conditions) and then we can read off outgoing radiation at I+. The only problematic point is the right end of the singularity: This is the end of the black hole evaporation and to me it is not clear how we can here start to impose again some boundary condition at the new r=0 line without affecting what we did earlier. But anyway, this is in a region of strong curvature, where quantum gravity becomes essential and thus what we conclude should better not depend too much on what's going on there as we don't have a good understanding of that regime.
The firewall paper, when it explains the assumptions of complementarity mentions an Smatrix where it tries to formalize the notion of unitary time evolution. But it seems to me, this might be the wrong formalization as the Smatrix is only about asymptotic states and even fails in much simpler situations when there are bound states and the asymptotic Hilbert spaces are not complete. Furthermore, strictly speaking, this (in the sense of LSZ reduction) is not what we can observe: Our detectors are never at spatial infinity, even if CMS is huge, so we should better come up with a more local concept.

Two regions M and N on a Cauchy surface C with their causal shadows 
In the case of the wave equation, this can be encoded in terms of domains of dependence: By giving initial data on a region of a Cauchy surface I determine the solution on its causal shadow (in the full quantum theory maybe plus/minus an epsilon for quantum uncertainties). In more detail: If I have two sets of initial data on one Cauchy surface that agree on a local region. Than the two solutions have to agree on the causal shadow of this region no matter what the initial data looks like elsewhere. This encodes that "my timeevolution is good and I do not lose information on the way" in a local fashion.
States
Some of my confusion comes from talking about states in a way that at least when taken at face value is in conflict with how we understand states both in classical and in better understood quantum (both quantum mechanics and quantum field theory) circumstances.
First of all (and quite trivially), a state is always at one instant of time, that is it lives on a Cauchy surface (or at least a spacelike hyper surface, as our spacetime might not be globally hyperbolic), not in a region of spacetime. Hilbert space, as the space of (pure) states thus also lives on a Cauchy surface (and not for example in the region behind the horizon). If one event is after another (i.e. in its forward lightcone) it does not make sense to say they belong to different tensor factors of the Hilbert (or different Hilbert spaces for that matter).
Furthermore, a state is always a global concept, it is everywhere (in space, but not in time!). There is nothing like "the space of this observer". What you can do of course is restrict a state to a subset of observables (possibly those that are accessible to one observer) by tracing out a tensor factor of the Hilbert space. But in general, the total state cannot be obtained by merging all these restricted states as those lack information about correlations and possible entanglement.
This brings me to the next confusion: There is nothing wrong with states containing correlations of spacelike separated observables. This is not even a distinguishing property of quantum physics, as this happens all the time even in classical situations: In the morning, I pick a pair of socks from my drawer without turning on the light and put it on my feet. Thus I do not know which socks I am wearing, in particular, I don't know their color. But as I combined matching socks when they came from the washing machine (as far as this is possible given the tendency of socks going missing) I know by looking at the sock on my right foot what the color of the sock on my left foot is, even when my two feet are spatially separated. Before looking, the state of the color of the socks was a statistical mixture but with nonlocal correlations. And of course there is nothing quantum about my socks (even if in German "Quanten" is not only "quantum" but also a pejorative word for feet). This would even be true (and still completely trivial) if I had put one of my feet through an event horizon while the other one is still outside. This example shows that locality is not a property that I should demand of states in order to be sure my theory is free of time travel. The important locality property is not in the states, it is in the observables: The measurement of an observable here must not depend of whether or not I apply an operator at a spacelike distance. Otherwise that would imply I could send signals faster than the speed of light. But it is the operators, not the states that have to be local (i.e. commute for spatial separation).
If two operators, however, are timelike separated (i.e. one is after the other in its forward light cone), I can of course influence one's measurement by applying the other. But this is not about correlations, this is about influence. In particular, if I write something in my notebook and then throw it across the horizon of a black hole, there is no point in saying that there is a correlation (or even entanglement) between the notebook's state now and after crossing the horizon. It's just the former influencing the later.
Which brings us to entanglement. This must not be confused with correlation, the former being a strict quantum property whereas the other can be both quantum or classical. Unfortunately, you can often see this in popular talks about quantum information where many speakers claim to explain entanglement but in fact only explain correlations. As a hint: For entanglement, one must discuss noncommuting observables (like different components of a the same spin) as otherwise (by the GNS reconstruction theorem) one deals with a commutative operator algebra which always has a classical interpretation (functions on a classical space). And of course, it is entanglement which violates Bell's inequality or shows up in the GHZ experiment. But you need something of this complexity (i.e. involving noncommuting observables) to make use of the quantumness of the situation. And it is only this entanglement (and not correlation) that is "monogamous": You cannot have three systems that are fully entangled for all pairs. You can have three spins that are entangled, but once you only look at two they are no longer entangles (which makes quantum cryptography work as the eavesdropper cannot clone the entanglement that is used for coding).
And once more, entanglement is a property of a state when it is split according to a tensor product decomposition of the Hilbert space. And thus lives on a Cauchy surface. You can say that a state contains entanglement of two regions on a Cauchy surface but it makes no sense to say to regions that are timelike to each other to be entangled (like the notebook before and after crossing the horizon). And therefore monogamy cannot be invoked with respect to also taking the outgoing radiation in as the third player.