Wednesday, October 17, 2018

Bavarian electoral system

Last Sunday, we had the election for the federal state of Bavaria. Since the electoral system is kind of odd (but not as odd as first past the post), I would like to analyse how some variations of the rule (assuming the actual distribution of votes) would have worked out. So first, here is how the seats are actually distributed: Each voter gets two ballots. On the first ballot, each party lists one candidate from the local constituency and you can select one. On the second ballot, you can vote for a party list (it's even more complicated, because there too you can select individual candidates to determine their position on the list, but let's ignore that for today).

Then in each constituency, the votes on ballot one are counted. The candidate with the most votes (as in first past the post) is elected to parliament directly (and is called a "direct candidate"). Then, overall, the votes for each party on both ballots are summed up (this is where the system differs from the federal elections). All votes for parties with less than 5% of the grand total of all votes are discarded (actually including their direct candidates, but this is not of particular concern here). Let's call the rest the "reduced total". The seats are then distributed according to each party's fraction of this reduced total.

Of course, the first problem is that you can only distribute seats in integer multiples of 1. This is solved using the Hare-Niemeyer method: You first distribute the integer parts. This clearly leaves fewer open seats than there are parties. Those you then give to the parties where the rounding error to the integer below was greatest. Check out the Wikipedia page explaining how this can lead to a party losing seats when the total number of seats available is increased.
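To make the rounding step concrete, here is a minimal sketch of the Hare-Niemeyer (largest remainder) method; Python for illustration (my actual analysis below used a Perl script, and the function name and data layout here are my own):

```python
from math import floor

def hare_niemeyer(votes, seats):
    """Largest-remainder apportionment: distribute `seats` among
    parties proportionally to `votes` (a dict party -> vote count)."""
    total = sum(votes.values())
    # ideal (fractional) share of seats per party
    quotas = {p: seats * v / total for p, v in votes.items()}
    # step 1: everyone gets the integer part of their quota
    alloc = {p: floor(q) for p, q in quotas.items()}
    # step 2: remaining seats go to the largest fractional remainders
    leftover = seats - sum(alloc.values())
    for p in sorted(votes, key=lambda p: quotas[p] - alloc[p], reverse=True)[:leftover]:
        alloc[p] += 1
    return alloc

print(hare_niemeyer({'A': 6000, 'B': 3100, 'C': 900}, 10))  # -> {'A': 6, 'B': 3, 'C': 1}
```

Note that party C gets a seat ahead of B here despite far fewer votes, because its fractional remainder (0.9) is larger; this sensitivity of the remainder step is exactly what makes the paradoxes possible.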

Because this is what happens in the next step: Remember that we already allocated a number of seats to constituency winners in the first round. Those count towards the number of seats that each party is supposed to get in step two according to its fraction of votes. Now it can happen that a party has won more direct candidates than the seats allocated to it in step two. If that happens, more seats are added to the total number of seats and distributed according to the rules of step two until each party has been allocated at least as many seats as it has direct candidates. This happens in particular if one party is stronger than all the others, so that it wins almost all direct candidates (in Bavaria this happened to the CSU, which won all direct candidates except five in Munich and one in Würzburg, which went to the Greens).
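This seat-expansion rule can be sketched schematically too; here is a simplified single-district toy version in Python (party names and numbers are made up, and the real procedure has more details):

```python
from math import floor

def allocate(votes, seats):
    """Largest-remainder (Hare-Niemeyer) allocation of `seats`."""
    total = sum(votes.values())
    quotas = {p: seats * v / total for p, v in votes.items()}
    alloc = {p: floor(q) for p, q in quotas.items()}
    leftover = seats - sum(alloc.values())
    for p in sorted(votes, key=lambda p: quotas[p] - alloc[p], reverse=True)[:leftover]:
        alloc[p] += 1
    return alloc

def final_seats(votes, direct, base_seats):
    """Grow the house until every party keeps its direct mandates."""
    seats = base_seats
    while True:
        alloc = allocate(votes, seats)
        if all(alloc[p] >= direct.get(p, 0) for p in votes):
            return alloc
        seats += 1

# toy numbers: party A wins 6 direct mandates but only ~4.5 proportional seats
print(final_seats({'A': 45, 'B': 30, 'C': 25}, {'A': 6}, 10))
# the house grows from 10 to 13 seats
```

In the toy example the house has to grow by three seats before A's proportional allocation catches up with its direct mandates, which is the same mechanism that inflated the Bavarian parliament.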

A final complication is that Bavaria is split into seven electoral districts and the above procedure is carried out for each district separately. So the rounding and seat-adding procedure happens seven times.

Sunday's election resulted in the following distribution of seats:

After the whole procedure, there are 205 seats distributed as follows

• CSU 85 (41.5% of seats)
• SPD 22 (10.7% of seats)
• FW 27 (13.2% of seats)
• GREENS 38 (18.5% of seats)
• FDP 11 (5.4% of seats)
• AFD 22 (10.7% of seats)

Now, for example, one can calculate the distribution without districts, throwing everything into a single super-district. Then there are 208 seats distributed as

• CSU 85 (40.8%)
• SPD 22 (10.6%)
• FW 26 (12.5%)
• GREENS 40 (19.2%)
• FDP 12 (5.8%)
• AFD 23 (11.1%)
You can see that the CSU in particular, the party with the largest number of votes, profits from doing the rounding seven times rather than just once, while the last three parties would benefit from giving up the districts.

But then there is actually an issue of negative vote weight: The Greens are particularly strong in Munich, where they managed to win five direct seats. If those seats had instead gone to the CSU (as elsewhere), the number of seats for Oberbayern, the district Munich belongs to, would have had to be increased to accommodate those additional direct candidates for the CSU. That increases the weight of Oberbayern compared to the other districts, which would then benefit the Greens, as they are particularly strong in Oberbayern. So if I give all the direct candidates to the CSU (without modifying the total numbers of votes), I get the following distribution:
Seats: 221
• CSU 91 (41.2%)
• SPD 24 (10.9%)
• FW 28 (12.6%)
• GREENS 42 (19.0%)
• FDP 12 (5.4%)
• AFD 24 (10.9%)
That is, the Greens would have gotten a higher fraction of seats if they had won fewer constituencies. Voting for Green candidates in Munich actually hurt the party as a whole!

The effect is not so big that it actually changes majorities (CSU and FW are likely to form a coalition) but still, the constitutional court does not like (predictable) negative weight of votes. Let's see if somebody challenges this election and what that would lead to.

The perl script I used to do this analysis is here.

Postscript:
The above analysis in the last point is not entirely fair, as not winning a constituency means getting fewer votes, which are then missing from the grand total. Taking this into account makes the effect smaller. In fact, subtracting from the Greens the votes by which they were leading in the constituencies they won leads to an almost zero effect:

Seats: 220
• CSU 91 (41.4%)
• SPD 24 (10.9%)
• FW 28 (12.7%)
• GREENS 41 (18.6%)
• FDP 12 (5.4%)
• AFD 24 (10.9%)
Letting the Greens win München-Mitte (a newly created constituency that was supposed to act like a bad bank for the CSU, taking up all of central Munich's more left-leaning voters; do I hear somebody say "gerrymandering"?) yields

Seats: 217
• CSU 90 (41.5%)
• SPD 23 (10.6%)
• FW 28 (12.9%)
• GREENS 41 (18.9%)
• FDP 12 (5.5%)
• AFD 23 (10.6%)
Or letting them win all but Moosach and Würzburg-Stadt, where their leads were smallest:

Seats: 210

• CSU 87 (41.4%)
• SPD 22 (10.5%)
• FW 27 (12.9%)
• GREENS 40 (19.0%)
• FDP 11 (5.2%)
• AFD 23 (11.0%)

Thursday, March 29, 2018

Machine Learning for Physics?!?

Today was the last day of a nice workshop here at the Arnold Sommerfeld Center organised by Thomas Grimm and Sven Krippendorf on the use of Big Data and Machine Learning in string theory. While the former (at this workshop mainly in the form of developments following Kreuzer/Skarke and taking it further for F-theory constructions, orbifolds and the like) appears to be quite advanced as of today, the latter is still in its very early days. At best.

I got the impression that many physicists who have not yet spent much time with this expect deep learning, and in particular deep neural networks, to be some kind of silver bullet that can answer all kinds of questions that humans have not been able to answer despite considerable effort. I think this hope is at best premature. Looking at the (admittedly impressive) examples where it works (playing Go, classifying images, speech recognition, event filtering at the LHC), these seem to be problems where humans have at least a rough idea of how to solve them (if it is not something that humans do every day anyway, like understanding text) and roughly how one would code it, but that are too messy or vague to be treated by a traditional program.

So, during some of the less entertaining talks, I sat down and thought about problems where I would expect neural networks to perform badly. If this approach fails even in simple cases that are fully under control, one should maybe curb the expectations for the more complex cases that one would love to have the answer for. In the case of the workshop, that would be guessing some topological (discrete) data (that depends very discontinuously on the model parameters). A simple toy problem is a 2-torus wrapped by two 1-branes, where the computer is supposed to compute the number of matter generations arising from open strings at the intersections. I.e., given two branes (in terms of their slopes w.r.t. the cycles of the torus), how often do they intersect? Of course these numbers depend sensitively on the slopes (as real numbers), since for rational slopes $p/q$ and $m/n$ the intersection number is the absolute value of $pn-qm$. My guess would be that this is almost impossible for a neural network to get right, let alone the much more complicated variants of this simple problem.
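To see just how discontinuous this target function is in the slopes as real numbers, here is a tiny illustration (my own toy code, not from the workshop; the convention for the winding numbers is my own):

```python
from math import gcd

def intersections(p1, q1, p2, q2):
    """Intersection number of two 1-branes on the 2-torus with rational
    slopes p1/q1 and p2/q2 (fractions assumed reduced): |p1*q2 - q1*p2|."""
    assert gcd(p1, q1) == 1 and gcd(p2, q2) == 1
    return abs(p1 * q2 - q1 * p2)

# nearly equal slopes, wildly different intersection numbers:
print(intersections(3, 2, 1, 1))        # slopes 1.5   and 1 -> 1
print(intersections(1499, 1000, 1, 1))  # slopes 1.499 and 1 -> 499
```

A network fed the slopes as real inputs would have to resolve this kind of number-theoretic sensitivity, which is precisely what smooth function approximators are bad at.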

Related, but with the possibility of nicer pictures, is the following: Can a neural network learn the shape of the Mandelbrot set? Let me remind those of you who cannot remember the 80ies anymore: for a complex number $c$ you recursively apply the function
$f_c(z)= z^2 +c$
starting from 0 and ask whether this stays bounded (a quick check shows that once you leave the disk $|z| < 2$ you cannot avoid running off to infinity). You color the point $c$ in the complex plane according to the number of times you have to apply $f_c$ to 0 to leave this circle. I decided to do this for complex numbers x+iy in the rectangle -0.74
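The escape-time computation just described can be sketched in a few lines; a Python toy version (the image in this post was actually produced with a Mathematica program):

```python
def escape_time(c, max_iter=100):
    """Number of iterations of f_c(z) = z^2 + c (starting from z = 0)
    until |z| exceeds 2; returning max_iter means 'presumably bounded'."""
    z = 0
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return n
    return max_iter

print(escape_time(0))  # in the set: never escapes -> 100
print(escape_time(1))  # -> 2
print(escape_time(3))  # outside immediately -> 0
```

The training data for the predictor is then just pairs of a point $c$ and its escape time on a grid over the rectangle.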
I have written a small Mathematica program to compute this image. Built into Mathematica is also a neural network: you can feed training data to the function Predict[]; for me these were 1,000,000 points in this rectangle together with the number of steps it takes to leave the 2-ball. Then Mathematica thinks for about 24 hours and spits out a predictor function, which you can plot as well:

There is some similarity, but clearly it has no idea about the fractal nature of the Mandelbrot set. If you really believe in magical powers of neural networks, you might even hope that once it has learned the function on this rectangle, it could extrapolate to outside it. Well, at least in this case, this hope is not justified: the neural network thinks the correct continuation looks like this:
Ehm. No.

All this, of course, with the caveat that I am no expert on neural networks and did not attempt to tune the result; I only used the neural network function built into Mathematica. Maybe with a bit of coding and TensorFlow one can do much better. But on the other hand, this is a simple two-dimensional problem; at least for traditional approaches it should be much simpler than the much higher-dimensional problems physicists are really interested in.

Thursday, December 14, 2017

What are the odds?

It's the time of year when you give out special problems in your classes, so this is mine for the blog. It is motivated by this picture of the home secretaries of the German federal states after their annual meeting, as well as some recent discussions on Facebook:
I would like to call it Summers' problem:

Let's have two real random variables $M$ and $F$ that are drawn according to two probability distributions $\rho_{M/F}(x)$ (for starters, you may assume both to be Gaussians, possibly with different means and variances). Take $N$ draws from each and order the $2N$ results. What is the probability that the $k$ largest ones are all from $M$ rather than $F$? Express your results in terms of the $\rho_{M/F}(x)$. We are also interested in asymptotic results for $N$ large and $k$ fixed, as well as $N$ and $k$ large with $k/N$ fixed.
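Short of solving the problem analytically, one can at least estimate the probability by simulation; a quick Monte Carlo sketch (my own, with made-up parameters):

```python
import random

def prob_top_k_all_M(mu_M, mu_F, N, k, sigma=1.0, trials=20000):
    """Monte Carlo estimate of the probability that the k largest of
    2N draws (N from M, N from F, both Gaussian) all come from M."""
    hits = 0
    for _ in range(trials):
        m = [random.gauss(mu_M, sigma) for _ in range(N)]
        f = [random.gauss(mu_F, sigma) for _ in range(N)]
        # the top k are all from M iff the k-th largest M beats the best F
        if sorted(m)[-k] > max(f):
            hits += 1
    return hits / trials

# sanity check: for identical distributions, exchangeability gives
# exactly C(N, k) / C(2N, k); for N=10, k=3 that is 120/1140 ~ 0.105
print(prob_top_k_all_M(0.0, 0.0, N=10, k=3))
```

Even a modest shift between the means changes this probability drastically, which is presumably the point of the bonus question below.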

Last bonus question: How many of the people who say they hire only based on merit, and end up with an all-male board, realise that they are thereby saying that women are not as good by quite a margin?

Thursday, November 09, 2017

Why is there a supercontinent cycle?

One of the most influential books of my early childhood was my "Kinderatlas"
There were many things to learn about the world (maps actually made up only the last third of the book), and for example I blame my fascination with scuba diving on this book. Also, last year, when we visited the Mont-Doré in the Auvergne and I had to explain to my kids how volcanos are formed, to make them forget how many stairs still lay ahead of them to the summit, I did so while mentally picturing the pages of that book about plate tectonics.

But there is one thing about tectonics that has been bothering me for a long time, and I still haven't found a good explanation for it (or at least an acknowledgement that there is something to explain): Since the days of Alfred Wegener, we know that the jigsaw-puzzle pieces of the continents fit together so well that geologists believe that some hundred million years ago they were all connected as the supercontinent Pangea.

By Original upload by en:User:Tbower - USGS animation A08, Public Domain, Link

In fact, that was only the last in a series of supercontinents that keep forming and breaking up in the "supercontinent cycle".

By SimplisticReps - Own work, CC BY-SA 4.0, Link

So here is the question: I am happy with the idea of several (say $N$) plates, roughly containing a continent each, that are floating around on the magma, driven by all kinds of convection processes in the liquid part of the earth. They move around in a pattern that looks pretty chaotic to me (in the non-technical sense), and of course for random motion you would expect that from time to time two of them collide and then maybe stick together for a while.

Then it would be possible for a third plate to collide with the two, but that would be a coincidence (two random lines typically intersect, but three random lines typically intersect in pairs, not in a triple intersection). But to form a supercontinent, you need all $N$ plates to miraculously collide at the same time. This order-$N$ process seems highly unlikely for random motion, let alone the fact that it seems to repeat. So this motion cannot be random (yes, Sabine, this is a naturalness argument). This needs an explanation.

So why, every few hundred million years, do all the land masses of the earth assemble on one side of the earth?

One explanation could be, for example, that during those times the center of mass of the earth is not at the symmetry center, so the water of the oceans flows to one side of the earth and reveals the seabed on the opposite side. Then you would have essentially one big island. But this seems not to be the case, as the continents (those parts that are above sea level) appear to be stable on much longer time scales. It is not that the seabed comes up on one side while the land on the other goes under water; the land masses actually move around to meet on one side.

I have asked this question whenever I ran into people with a geosciences education, but it is still open (and I have to admit that in a non-zero number of cases I failed even to make clear that an $N$-body collision needs an explanation). But I am sure you, my readers, know the answer, or even better, can come up with one.

Friday, June 16, 2017

I got this wrong

In yesterday's post, I totally screwed up when identifying the middle part of the spectrum as low frequency. It is not. Please ignore what I said or better take it as a warning what happens when you don't double check.

Apologies to everybody that I stirred up!

Thursday, June 15, 2017

Some DIY LIGO data analysis

UPDATE: After some more thinking about this, I have very serious doubts about my previous conclusions. From looking at the power spectrum, I (wrongly) assumed that the middle part of the spectrum is the low-frequency part (my original idea was that the frequencies should be symmetric around zero, but the periodicity of the Bloch cell bit me). Quite the opposite: taking the wrapping into account, this is the high-frequency part (at almost the sample rate). So this is neither physics nor noise but an artifact of the sample rate. For documentation, I do not delete the original post but leave it with this comment.

Recently, in the Arnold Sommerfeld Colloquium, we had Andrew Jackson of NBI talk about his take on the LIGO gravitational wave data, see this announcement with link to a video recording. He encouraged the audience to download the freely available raw data and play with it a little bit. This sounded like fun, so I had my go at it. Now that his paper is out, I would like to share what I did with you and ask for your comments.

I used Mathematica for my experiments, so I guess the way to proceed is to guide you to an HTML export of my (admittedly cleaned-up) notebook (source for your own experiments here).

The executive summary is that, apparently, you can eliminate most of the "noise" in the interesting low-frequency part by adding to the signal its time reversal, casting some doubt on the stochasticity of this "noise".

I would love to hear what this is supposed to mean or what I am doing wrong, in particular from my friends in the gravitational wave community.

Thursday, June 08, 2017

Relativistic transformation of temperature

Apparently, there is a long history of controversy, going back to Einstein and Planck, about the proper way to deal with temperature relativistically. And I admit I don't know what exactly the modern ("correct") point of view is. So I would like to ask your opinion about a puzzle we came up with during yesterday's after-colloquium dinner with Erik Verlinde:

Imagine a long rail of a railroad track. It is uniformly heated to a temperature T and is in thermodynamic equilibrium (if you like mathematical language: it is in a KMS state). On this track travels Einstein's relativistic train at velocity v. From the perspective of the conductor, the track in front of the train approaches the train with velocity v, so one might expect the temperature T to appear blue-shifted, while behind the train the track moves away with v, so the temperature appears red-shifted.
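For what it's worth, this naive Doppler expectation can be made quantitative: a black-body spectrum seen head-on from a moving frame remains Planckian, with the temperature rescaled by the relativistic Doppler factor. A tiny numerical illustration (my own sketch, with made-up numbers):

```python
from math import sqrt

def doppler_temperature(T, beta):
    """Temperature ascribed to black-body radiation seen head-on from a
    frame moving with velocity beta*c towards (beta > 0) or away from
    (beta < 0) the source: the Planck spectrum stays Planckian, with T
    rescaled by the relativistic Doppler factor sqrt((1+beta)/(1-beta))."""
    return T * sqrt((1 + beta) / (1 - beta))

T_rail, v = 300.0, 0.5  # a 300 K rail, train at half the speed of light
print(doppler_temperature(T_rail, +v))  # track ahead:  ~ 519.6 K
print(doppler_temperature(T_rail, -v))  # track behind: ~ 173.2 K
```

So the two directions would differ by a factor of 3 at this speed, which makes the puzzle about equilibrium rather sharp.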

Following this line of thought, one would conclude that the conductor thinks the rail has different temperatures in different places and thus is out of equilibrium.

On the other hand, the question of equilibrium should be independent of the observer. So, is the assumption of the Doppler shift wrong?

A few remarks: If you are worried that Doppler shifts should apply to radiation, then you are free to assume that both in front and behind, there are black bodies in thermal contact with the rail, thus exhibiting a photon gas at the same temperature as the rail.

You could probably also make the case for the temperature transforming like the time component of a four-vector (since it is essentially an energy). Then the transformed temperature would be independent of the sign of v. One could argue for this, for example, by assuming the temperature is so high that in your black-body photon gas you also create electron-positron pairs, which would be heavier due to their relativistic speed relative to the train, thus requiring more energy (and thus temperature) for their creation.

A final remark about an operational definition of temperature at relativistic speeds: It might be difficult to bring a relativistic thermometer into equilibrium with a system if there is a large relative velocity (when we define temperature as the criterion for two systems in contact to be in equilibrium). Or to operate a heat engine between the front part of the rail and the back while moving along at relativistic speed and then argue about the efficiency (defining the temperature that way).

Update one day later:
Thanks for all your comments. We also had some further discussions here and I would like to share my conclusions:

1) It probably boils down to what exactly you mean by "temperature". Of course, you want that this, at least in familiar situations, agrees with what thermometers of one type or another measure. In the original text I had hinted at two possible definitions that I learned about from a very interesting paper by Buchholz and Solveen discussing the Unruh effect and what would actually be observed there: Either you define temperature as the property that characterises equilibrium states, such that there is no heat exchange when you bring into contact two systems of the same temperature. This is, for example, close to what a mercury thermometer measures. Alternatively, you operate a perfect heat engine between two reservoirs and define your temperatures via
$$\eta = \frac{T_h - T_c}{T_h}.$$
This is, for example, hinted at in the Feynman Lectures on Physics.

One of the commentators suggested using the ratio of eigenvalues of the energy-momentum tensor as a definition of temperature. Even though this might give the usual thing for a perfect fluid, I am not really convinced that this generalises in the right way.

2) I would rather define the temperature as the parameter in the Gibbs (or rather KMS) state (it should only exist in equilibrium anyway). So if your state is described by a density matrix $\rho$ that can be written as
$$\rho = \frac{e^{-\beta H}}{tr(e^{-\beta H})}$$
then $1/\beta$ is the temperature. Obviously, this requires a priori knowledge of what the Hamiltonian is.
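As a toy illustration of why the Hamiltonian must be known: for a two-level system, the Gibbs weights determine $\beta$ only once the energy gap is given (my own example, not from the discussion):

```python
import math

def temperature_from_populations(p0, p1, E0, E1):
    """Read off T = 1/beta of a two-level Gibbs state from its occupation
    probabilities p0, p1, given the energy levels E0 < E1."""
    beta = math.log(p0 / p1) / (E1 - E0)
    return 1.0 / beta

# build a Gibbs state at beta = 2 and recover T = 1/beta = 0.5
beta, (E0, E1) = 2.0, (0.0, 1.5)
Z = math.exp(-beta * E0) + math.exp(-beta * E1)
p0, p1 = math.exp(-beta * E0) / Z, math.exp(-beta * E1) / Z
print(temperature_from_populations(p0, p1, E0, E1))  # ~ 0.5
```

Rescale the energy levels and the same occupation probabilities yield a different temperature, so the state alone does not fix $T$.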

For such states, under mild assumptions, you can prove nice things: energy-entropy inequalities ("minimisation of free energy"), stability, return to equilibrium, and most importantly here: passivity, i.e. the fact that you cannot extract mechanical work from this state in a cyclic process.

3) I do not agree that it is out of the question to have a thermometer with a relative velocity in thermal equilibrium with a heat bath at rest. You could, for example, imagine a mirror fixed next to the track and in thermal equilibrium with the track. A second mirror is glued to the train (and again in thermal equilibrium, this time with a thermometer). Between the mirrors is a photon gas (black body) that you could imagine equilibrating with the mirrors on both ends. The question is whether that is the case.

4) Maybe rails and trains are a bit too non-spherical cows, so let's better look at an infinitely extended free quantum gas (bosons or fermions, your pick). You put it in a thermal state at rest, i.e. up to normalisation, its density matrix is given by
$$\rho = e^{-\beta P^0}.$$
Here $P^0$ is the Poincaré generator of time translations.

Now, the question above can be rephrased as: Is there a $\beta'$ such that also
$$\rho = e^{-\beta' (\cosh\alpha P^0 + \sinh \alpha P^1)}?$$
And to the question formulated this way, the answer is pretty clearly "no". A thermal state singles out a rest frame and that's it. It is not thermal in the moving frame and thus there is no temperature.

It's also pretty easy to see that this state is not passive (in the above sense): You could operate a windmill in the slipstream of particles coming more likely from the front than from the back. So in particular, this state is not KMS (this argument I learned from Sven Bachmann).

5) Another question would be about gravitational redshift: Let's take some curved space-time and for simplicity assume it has no horizons (for example, let the far field be Schwarzschild but smooth it out in the center, far outside the Schwarzschild radius, like the space-time created by the sun). Make it static, so it contains a timelike Killing vector (otherwise there is no hope for a thermal state). Now prepare a scalar field in the thermal state with temperature T. Couple to it a harmonic oscillator via
$$H_{int}(r) = a^\dagger a + \phi(t, r) (a^\dagger + a).$$
You could now compute a "local temperature" by computing the probability that the harmonic oscillator is in its first excited state. Then, how does this depend on $r$?