Tuesday, September 18, 2007

Not quite infinite

Lubos has a memo where he discusses how physicists make (finite) sense of divergent sums like 1+10+100+1000+... or 1+2+3+4+5+... . The last is, as string theorists know, of course -1/12, as explained for example in GSW. Their trick is to read that sum as the value at s=-1 of zeta(s) = sum_{n>=1} n^(-s) and to define that value via the analytic continuation of the given expression, which is well defined only for real part of s>1.

Alternatively, he regularises the sum as sum_{n>=1} n exp(-epsilon n) = 1/epsilon^2 - 1/12 + O(epsilon^2). Then, in an obscure analogy with minimal subtraction, he throws away the divergent term and takes the finite remainder as the physical value.
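The cut-off version is easy to check numerically. The following is only a sketch: it assumes the exponential regulator sum n exp(-epsilon n) (the form used in Polchinski's book, assumed here to be the one meant above), and the function names and the `tail` parameter are made up for illustration.

```python
import math

def cutoff_sum(eps, tail=60.0):
    """Directly sum n * exp(-eps * n) until the remaining terms are negligible."""
    n_max = int(tail / eps) + 1
    return sum(n * math.exp(-eps * n) for n in range(1, n_max + 1))

def finite_part(eps):
    """Regularised sum with the divergent 1/eps^2 piece subtracted."""
    return cutoff_sum(eps) - 1.0 / eps**2

# As eps shrinks, the remainder should settle near -1/12 = -0.08333...
for eps in (0.5, 0.1, 0.02):
    print(eps, finite_part(eps))
```

The remainder differs from -1/12 only by terms of order epsilon^2, so nothing here depends on ever writing down the divergent sum itself.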

He justifies this by claiming agreement with experiment (here in the case of a Casimir force). This, I think, however, is a bit too weak. If you rely on arguments like this it is unclear how far they take you when you want to apply them to new problems where you do not yet know the answer. Of course, it is good practice for physicists to take calculational short-cuts. But you should always be aware that you are doing this and it feels much better if you can say "This is a bit dodgy, I know, and if you really insist we could actually come up with a rigorous argument that gives the same result.", i.e. if you have a justification up your sleeve for what you are doing.

Most of the time, when in a physics calculation you encounter an infinity that should not be there, you are actually asking the wrong question. (Of course, often "infinity" is just the correct result; questions like "how much energy do I have to put into accelerating an electron to bring it up to the speed of light?" come to mind.) This could for example be because you made an idealisation that is not physically justified.

Some examples come to my mind: The 1+2+3+... sum arises when you try to naively compute the commutator of two Virasoro generators L_n for the free boson (the X fields on the string world sheet). There, L_n is given as an infinite sum over bilinears in a_k's, the modes of X. In the commutator, each summand gives a constant from operator ordering and when you sum up these constants you face the sum 1+2+3+...

Once you have such an expression, you can of course regularise it. But you should be suspicious that it is actually meaningful what you do. For example, it could be that you can come up with two regularisations that give different finite results. In that case you should better have an argument to decide which is the better one.

Such an argument could be a way to realise that the infinity is unphysical in the first place: In the Virasoro example, one should remember that the L_n stand for transformations of the states rather than observables themselves (outer vs. inner transformations of the observable algebra). Thus you should always apply them to states. But for a state that is a finite linear combination of excitations of the Fock vacuum there are always only a finite number of terms in the sum for the L_n that do not annihilate the state. Thus, for each such state the sum is actually finite. Thus the infinite sum is an illusion and if you take a bit more care about which terms actually contribute you find a result equivalent to the -1/12 value. This calculation is the one you should have actually done but the zeta function version is of course much faster.
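To make the "only finitely many terms contribute" point concrete, here is a minimal sketch for a single free boson. It assumes the standard conventions [a_j, a_k] = j delta_{j+k,0}, a vacuum with a_k|0> = 0 for k >= 0 (zero momentum), and L_{-m}|0> = (1/2) sum_{k=1}^{m-1} a_{-k} a_{-(m-k)} |0>; the function name is hypothetical.

```python
from fractions import Fraction

def central_term(m):
    """<0| L_m L_{-m} |0> for one free boson, using only finitely many modes.

    Assumes [a_j, a_k] = j * delta_{j+k,0} and a_k|0> = 0 for k >= 0, so that
    L_{-m}|0> = (1/2) * sum_{k=1}^{m-1} a_{-k} a_{-(m-k)} |0>.
    """
    # The two Wick contractions give 2 * k * (m - k) per term; together with
    # the (1/2)^2 prefactor the net weight of each term is 1/2.
    return Fraction(1, 2) * sum(k * (m - k) for k in range(1, m))

# Compare with the closed form (m^3 - m)/12 of the c = 1 central term.
for m in range(1, 7):
    print(m, central_term(m), Fraction(m**3 - m, 12))
```

Since L_m annihilates this vacuum for m > 0, the expectation value equals that of the commutator [L_m, L_{-m}], and every sum involved has exactly m-1 terms.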

My problem with the zeta function version is that to me (and to all people I have asked so far) it looks accidental: I have no expansion of the argument that connects it to the rigorous calculation. From the Virasoro algebra perspective it is very unnatural to introduce s as at least I know of no way to do the calculation with L_n and a_k with a free parameter s.

Another example is the infinities that arise in Feynman diagrams. Those arise when you do integrals over all momenta p. There are of course the usual tricks to avoid these infinities. But the reason they work is that the integral over all p is unphysical: For very large p, your quantum field theory is no longer the correct description and you should include quantum gravity effects or similar things. You should only integrate p up to the scale where these other effects kick in and then do a proper computation that includes those effects. Again, the infinity disappears.

If you have a renormalisable theory you are especially lucky: There you don't really have to know the details of that high energy theory, you can subsume them into a proper redefinition of your coupling constants.

A similar thing can be seen in fluid dynamics: The Navier-Stokes equation has singular solutions much like Einstein's equations lead to singularities. So what shall we do with, for example, infinite pressure? Well, the answer is simple: The Navier-Stokes equation applies to a fluid. But the fluid equations are only an approximation valid at macroscopic scales. If you look at small scales you find individual water molecules and this discreteness is what saves you from actually encountering infinite values.

There is an approach to perturbative QFT developed by Epstein and Glaser and explained for example in this book that demonstrates that the usual infinities arise only because you have not been careful enough earlier in your calculation.

There, the idea is that your field operators are actually operator valued distributions and that you cannot always multiply distributions. Sometimes you can, if their singularities (the places where they are not a function but really a distribution) are in different places or in different directions (in a precise sense) but in general you cannot.

The typical situation is that what you want to define (for example delta(x)^2) is still defined for a subset of your test functions. For example delta(x)^2 is well defined for test functions that vanish in a neighbourhood of 0. So you start with a distribution defined only for those test functions. Then, you want to extend that definition to all test functions, even those that are non-zero around 0. It turns out that if you restrict the degree of divergence (the maximum number of derivatives acting on delta, this will later turn out to be related to the superficial scaling dimension) to be below some value, there is a finite dimensional solution space to this extension problem. In the case of phi^4 theory for example the two point distribution is fixed up to a multiple of delta(x) and a multiple of the d'Alembertian of delta(x), so the solution space is two dimensional (if Lorentz invariance is taken into account). The two coefficients have to be fixed experimentally and of course are nothing but mass and wave function renormalisation. In this approach the counter terms are nothing but ambiguities of an extension problem of distributions.

It has been shown in highly technical papers that this procedure is equivalent to BPHZ regularisation and dimensional regularisation, and thus it's safe to use the physicist's short-cuts. But it's good to know that the infinities that one cures could have been avoided in the first place.

My last example is of a slightly different flavour: Recently, I have met a number of mathematical physicists (i.e. mathematicians) who work on very complicated theorems about what they call stability of matter. What they are looking at is the quantum mechanics of molecules in terms of a Hamiltonian that includes a kinetic term for electrons and Coulomb potentials for electron-electron and electron-nucleus interactions. The positions of the nuclei are external (classical) parameters and usually you minimise the energy with respect to them. What you want to show is that the spectrum of this Hamiltonian is bounded from below. This is highly non-trivial as the Coulomb potential alone is not bounded from below (-1/r becomes arbitrarily negative) and you have to balance it with the kinetic term. Physically, you want to show that you cannot gain an infinite amount of energy by throwing an electron into the nucleus.
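The balance between the kinetic term and the Coulomb potential can be illustrated in the simplest possible setting. This is only a toy sketch, not the many-body theorem: it assumes one electron, one point nucleus of unit charge, atomic units, and trial states exp(-r/a) of width a, for which the kinetic energy is 1/(2 a^2) and the potential energy is -1/a. The names `trial_energy` and `widths` are made up for the example.

```python
def trial_energy(a):
    """Energy of the trial state exp(-r/a) for one electron and one point nucleus.

    Atomic units (hbar = m = e = 1): kinetic term 1/(2 a^2), Coulomb term -1/a.
    """
    return 1.0 / (2.0 * a * a) - 1.0 / a

# Scan the width a: squeezing the electron (a -> 0) costs kinetic energy
# like 1/a^2, faster than the Coulomb term -1/a can pay it back.
widths = [0.01 * k for k in range(1, 1001)]
best_a = min(widths, key=trial_energy)
print(best_a, trial_energy(best_a))
```

Within this family the energy is bounded below, with the minimum at a = 1 and energy -1/2; the hard part of the actual theorems is showing something similar for many electrons and nuclei.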

Mathematically, this is a problem about complicated PDEs and people have made progress using very sophisticated tools. What is not clear to me is if this question is really physical: It could well be that it arises from an over-simplification: The nuclei are not point-like and thus the true charge distribution is not singular. Thus the physical potential is not unbounded from below. In addition, if you are worried about high energies (as would be around if the electron fell into a nucleus) the Schrödinger equation would no longer be valid and would have to be replaced with a Dirac equation, and then of course the electro-magnetic interaction should no longer be treated classically and a proper QED calculation should be done. Thus if you are worried about what happens to the electron close to the nucleus in Schrödinger theory, you are asking an unphysical question. What could still be a valid result (and that might look very similar to a stability result) is to show that you don't really leave the domain of applicability of your theory, as the kinetic term prevents the electrons from spending too much time very close to the nucleus (classically speaking).

What is shared by all these examples is that some calculation of a physically finite property encounters infinities that have to be treated, and I tried to show that those typically arise because earlier in your calculation you have not been careful and stretched an approximation beyond its validity. If you had taken that into account there wouldn't have been an infinity but possibly a much more complicated calculation. And in lucky cases (similar to the renormalisable situation) you can get away with ignoring these complications. However, you can sleep much better if you know that there would have been another calculation without infinities.

Update: I have just found a very nice text by Terry Tao on a similar subject to "knowing there is a rigorous version somewhere".


Joe Polchinski said...

In chapter 1 of my book, eq. 1.3.34, I derive the `correct' value of this infinite sum by the requirement that one cancel the Weyl anomaly introduced by the regulator by a local counterterm; this fixes the finite value completely.

At various points later in the book (see index item `normal ordering constants') I derive the constant by a fully finite calculation that respects the Weyl symmetry throughout.

Robert said...

For those readers who don't have Joe's book at hand let me reproduce his argument: In the cut-off version, epsilon is in fact dimensionful, and a constant, n-independent term would equally be the consequence of a world-sheet cosmological constant. Thus the 1/epsilon^2 is in fact a renormalisation of the world-sheet cosmological constant. This would be in conflict with Weyl invariance and thus one has to add a counter term which makes it vanish.

This is what I should have written instead of calling the argument "obscure".

This leaves me still looking for a physical justification for the introduction of s in the zeta regularisation and the hope that physics is actually analytic in s. Maybe this could be related to dimensional regularisation on the world sheet?

Lumo said...

Dear robert, I am somewhat confused by your skepticism. A similar comment to yours by ori - I suppose it could even be Ori Ganor - appeared on my blog.

Why am I confused? Because I think that Joe's argument is, at the level of physics, a rigorous argument. Let me start with the vacuum energy subtraction.

We require Weyl invariance of the physical quantities. So the total zero-point function must vanish. It is clearly the case because such a result is dimensionful and any dimensionful quantity has a scale and breaks scale invariance.

So one needs to add exactly a counterterm that makes the total vacuum energy vanish, and this counterterm thus has exactly the role of killing the 1/epsilon^2 term. Joe has a lot of detailed extra factors of length etc. in his formulae to make it really transparent how the terms depend on the length. This makes the mathematical essence of the regularization more convoluted than it is but it should make the physical interpretation much more unambiguous.

Now the zeta function.

You ask about the "hope" that physics is analytic in complex "s". I don't know why you call it a hope. It is an easily demonstrable fact that is, as you correctly hint, analogous to the case of dim reg. Just substitute a complex "s" and calculate what the result is. You only get nice functions so of course the result is locally holomorphic in "s".

Just like in the case of dimreg, one doesn't have to have an interpretation of complex values of "s". The only thing we call "physics for complex s" are the actual formulae and their results and they are clearly holomorphic.

Beisert and Tseytlin have checked a highly nontrivial zeta-function regularization of some AdS/CFT spinning calculation up to four loops. That's where they argued to understand the three-loop discrepancy as an order of limits issue.

See also a 600+ citation paper by Hawking who checks curved spaces in all dimensions etc. These regularizations work and it's no coincidence.


Robert said...


You misunderstand me. I have no doubt that in field theory calculations where, for example, you want to compute tr(log(O)) for some operator O, as this gives you the 1-loop effective action, zeta function regularisation of log(O) works as well as any other regularisation (and is often nicer as it preserves more symmetries than more ad hoc versions).

What I am looking for is a version where you not only reinterpret n as 1/n^s for s=-1 once you encounter an obviously divergent expression but start out with something that includes s from the beginning such that for say Re(s)>1 everything is finite at all stages and in the end you can take s->-1 analytically. Can you come up with (s dependent) definitions of a_n and their commutation relations or L_n such that the commutator of L_n's (which is something you calculate rather than define) gives the expression including s?

BTW, in the LQG version of the string, the correct constant appears as Tr([A_2,B_2]) where A and B are generators of diffeomorphisms and the subscript 2 refers to

A_2 = (A + JAJ)/2

where J multiplies positive modes by i and negative modes by -i. Thus it's the 'beta'-part in the language of Bogoliubov transformations. Needless to mention, this expression is in fact finite even though there is a trace in an infinite dimensional Hilbert space, as it can be shown that A_2 is a Hilbert-Schmidt operator (that is, the product of two such operators has a finite trace). Of course you need an infinite dimensional space for a commutator to have a non-vanishing trace.

Lumo said...

More generally about your comments, Robert.

I think that it is entirely wrong to say "this argument is dodgy blah blah blah" (in the context of the vacuum energy subtraction) because the argument is transparent and rigorous when looked at properly. Both of them in fact.

Also, I disagree with your general statement that an infinity means that we have asked a wrong question. Only IR divergences are about wrong questions. UV divergences are about a theory being effective. But even QCD that is UV finite gives UV divergences - they're responsible e.g. for the running. There's no way to ask a better question about the exact QCD theory that we know and love that would remove the infinity.

QCD also falsifies your statement that "the integral over all p is unphysical". It's not unphysical. QCD is well-defined at arbitrarily high values of "p" but it still requires one to deal with and subtract the infinities properly.

Sorry to say but the comments that physicists are always expected to say "we're dodgy, everything is unreliable, we need experiments" just mean that you don't quite understand the technology. Your comments are Woit-Lite comments. In each case, there is a completely well-defined answer to the questions whether a particular symmetry constrains the terms or not, whether a given regularization preserves the symmetry or not, and consequently, whether a given regularization gives a correct result or not. There is no ambiguity here whatsoever and the examples listed are guaranteed to give the right results.

Lumo said...

Dear Robert, concerning your comment, I understood pretty well that you wanted to define the whole theory for complex unphysical values of "s".

That's exactly why I pre-emptively wrote that it is wrong to try to define the whole theory for wrong values of "s" just like it is wrong to define a theory in a complex dimension "d" in dimreg. Such a theory probably doesn't exist, especially not in the dimreg case.

But you don't need the full theory in 3.98+0.2i spacetime dimensions in order to prove that dimreg preserves gauge invariance, do you? In the same way, you don't need to define the operator algebra in a CFT for complex values of "s" or something like that.

I don't understand how to combine this discussion with the "LQG version of a string". The texts I wrote above were trying to help to clarify how the quantities actually behave in correct physics while LQG is a supreme example how the divergences and other things are treated physically incorrectly.

Of course the things I write are incompatible with the LQG quantization. But the reason is that the LQG quantization is wrong while e.g. Joe's arguments are correct. Your conclusion that physics is ambiguous is not a correct conclusion.

Robert said...

All I am saying is you should have a way (fine if done retroactively) to treat infinities without them actually occurring. And if you do that by adding an epsilon dependent counter term (that diverges by itself when you take epsilon to 0) that's fine with me. As long as you can physically justify it.

Otherwise you are prone to arguments like

And sorry, "an argument is correct if it gives the correct result" is not good enough. I would like to have a way to decide if an argument is valid before I know the answer from somewhere else.

Robert said...

By "LQG string" I meant our version where we (in a slightly mathematically more careful language) re-derive the usual central charge (same content, different formalism) rather than the polymer version (different content of which you know I do not approve).

Lumo said...

Dear Robert, I disagree that one can only trust a theory if infinities never occur. A particular regularization that replaces infinities by finite numbers as the intermediate results is just a mathematical trick but the actual physical result is independent of all details of the regularization which really means that it directly follows from a correct calculation inside the theory that contains these infinities.

In other words, you only need the Lagrangian of standard QCD (one that leads to divergent Feynman diagrams) plus correct physical rules that constrain/dictate how to deal with infinities to get the right QCD predictions. You don't need any theory that is free of infinities. Such a theory is just a psychological help if one feels uncertain.

I agree with you that one should be able to decide whether an argument is correct before the result is compared with another one. And indeed, it is possible. This is what this discussion is about. You argue that it is impossible to decide whether an argument or calculation is correct as long as it started with an infinite expression, and others are telling you that it is possible.

If you rederive the same physics in what you call "LQG string", why do you talk about "LQG string" as opposed to just a "string"? Can't you reformulate your argument in normal physics as opposed to one of the kinds of LQG physics?

Sabine's calculation you linked to is manifestly wrong because she doubles one of the infinities in order to subtract them and get a wrong finite part. There was no symmetry principle that would constrain the right result in her calculation. The original integral was perfectly convergent and she just added (2-1) times infinity (by rescaling the cutoff by a factor of 2 in one term), pretending that 2-1=0. I don't quite know why you think that I am prone to such arguments. ;-) Maybe Sabine is but I am not.

She didn't make any proper analysis of counterterms, any proper analysis of any symmetries, and she didn't make any analytical continuation of anything to a convergent region either. Why do you think it's analogous to a valid calculation?

If you mentioned it because of the relationship between 1+2+3+... and 1-2+3-4+..., the derived relationship between them may remind you of Sabine's wrong calculation. But it is not analogous. These rescalings and alternating sums can be calculated by the zeta function regularization that allows me to make these arguments adding subseries and rescaling them.

For example, you get the correct sum for antiperiodic fields, 1/2 + 3/2 + 5/2 + ... can also be calculated by taking the normal sum 1+2+3 and subtracting a multiple of it from itself.

So if the zeta-function reg gives a Weyl-invariant value of the alternating sum, it also gives the right value of the normal sum as well as the shifted Neveu-Schwarz sum and others.
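The relations between these sums can be checked numerically. The sketch below is an illustration under stated assumptions, not the zeta-function calculation itself: it swaps in an exponential cut-off exp(-epsilon x) and subtracts the same 1/epsilon^2 term wherever one appears. With S standing for the regulated value of 1+2+3+..., rescaling and subtracting subseries suggests 1-2+3-4+... = S - 4S and 1/2+3/2+5/2+... = (S - 2S)/2. The helper `regulated_sum` and the variable names are hypothetical.

```python
import math

def regulated_sum(term, eps, tail=60.0):
    """Sum sign * x * exp(-eps * x), where (sign, x) = term(n) for n = 1, 2, ..."""
    total = 0.0
    for n in range(1, int(tail / eps) + 2):
        sign, x = term(n)
        total += sign * x * math.exp(-eps * x)
    return total

eps = 0.01
# 1 + 2 + 3 + ... with the 1/eps^2 divergence subtracted.
S = regulated_sum(lambda n: (1, n), eps) - 1.0 / eps**2
# 1 - 2 + 3 - 4 + ... has no divergent piece with this cut-off.
alternating = regulated_sum(lambda n: ((-1) ** (n - 1), n), eps)
# 1/2 + 3/2 + 5/2 + ... has the same 1/eps^2 divergence as the integer sum.
half_integer = regulated_sum(lambda n: (1, n - 0.5), eps) - 1.0 / eps**2

print(S, alternating, -3 * S)
print(half_integer, -S / 2)
```

For small epsilon the three numbers come out close to -1/12, 1/4 and 1/24, consistent with the rescaling relations up to corrections of order epsilon^2.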

Lumo said...

Let me say more physically what she actually did. In order to calculate a convergent integral in the momentum space (x), she wrote it as a difference of two divergent ones. That would be perfectly compatible with physics and nothing wrong could follow from it. The error only occurs when she rescales the "x" by a factor of 1/2 or 2 in the two terms. This is equivalent to confusing what is her cutoff - by a factor of two up or down. Because her integral is logarithmically divergent, it is a standard example of a running coupling. So she has effectively added "g(2.lambda)-g(lambda/2)" - the difference of gauge couplings at two different scales, pretending that it is zero. Of course, it is not zero: this is exactly the way how running couplings arise.
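The effect of mismatched cut-offs on a logarithmically divergent integral can be seen in a few lines. This is a sketch using the toy integral of dx/x from 1 up to a cut-off, which is an assumption made for illustration and not Sabine's actual integrand; the function names are hypothetical.

```python
import math

def log_integral(lower, upper):
    """Integral of dx/x from lower to upper, evaluated exactly as a logarithm."""
    return math.log(upper / lower)

def consistent_difference(cutoff):
    """Subtract two copies of the divergent integral with the same cut-off."""
    return log_integral(1.0, cutoff) - log_integral(1.0, cutoff)

def mismatched_difference(cutoff):
    """Same subtraction, but with the cut-off doubled in one term and halved in the other."""
    return log_integral(1.0, 2 * cutoff) - log_integral(1.0, cutoff / 2)

for cutoff in (1e3, 1e6, 1e9):
    print(cutoff, consistent_difference(cutoff), mismatched_difference(cutoff))
```

The mismatched difference stays at log(4) however large the cut-off is taken, which is the finite piece that gets smuggled in when the two terms are rescaled differently.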

An experienced physicist would never make this error - using inconsistent cutoffs for different contributions in the same expression. Hers is just a physics error, if we interpret it as a physics calculation. One can't say that her calculation is analogous to the correct calculations such as Joe's subtractions of the vacuum energy even though it seems that this is precisely what you're saying.

There is a very clear a priori difference between correct and wrong calculations: correct ones have no physical errors of this or other kinds.

Robert said...

My final comment for tonight: For those readers who did not get this from my comments above: I completely agree with Joe's derivation of including a regularisation and imposing Weyl invariance. Do not try to convince me it is correct. It is.

My point about Sabine's calculation was that you can of course (and nobody I believe doubts this) produce non-sense if you are not careful about infinite quantities. Once you regulate, the error is obvious.

My final remark (and this is not serious, thus I will delete any comments referring to it) is that there is a shorter version of Sabine's argument which goes: "int dx/x is always zero in dimensional regularisation" (this is how I learned to actually apply dim reg from a particle phenomenologist: Bring your integrals to form finite + int dx/x and set the second term to zero).

Anonymous said...

When physicists proceed 'formally', it's usually explicitly stated as such.

There are many examples throughout history where this actually turns out to be wrong when done rigorously.

The interesting thing (for mathematicians) is when it turns out to be correct, as it usually means there's some hidden principle in there somewhere and often can lead to new and nontrivial mathematics (e.g. distribution theory).

Lumo said...

Dear Robert, if you exactly agree with Joe's derivation, why do you exactly write that this derivation is based on an "obscure analogy with minimal subtraction"?

There is nothing obscure about it and, if looked at properly, there is nothing obscure about the minimal subtraction either. One can easily prove why it works whenever it works.

I agree that one must be careful about infinite quantities but we seem to disagree what it means to be careful. In my picture, it means that you must carefully include them whenever they are nonzero. In the polymer LQG string that you researched, for example, they are very careful to throw all these important terms arising as infinities away which is wrong, and your work is an interpolation between the correct result and the wrong result which is thus also wrong, at least partially. ;-)

I disagree that your "nonserious" comment is not serious. It is absolutely serious. Don't try to erase this comment because of it. The comment that you call "nonserious" is the standard insight - certainly taught in QFT courses at most good graduate schools - that power law divergences are zero in dim reg. In the case of the log divergence it is still true as long as you consistently extract the finite part by taking correct limits of the integral.

Thomas Larsson said...

Why are zeta-function techniques better than simply calculating the action of the Virasoro generators on some state? It is very easy to compute [L_m, L_-m] |0>, and you can read off the central charge from this, without ever having to introduce any infinities.

What is less trivial is how to generalize this to d dimensions, where the diffeomorphism generators are labelled by vectors m = (m_0, m_1, ...) in Z^d rather than a scalar integer m in Z. In fact, I was stuck on this problem for many years (and ran out of funding in the meantime), before it was solved in a seminal paper by Rao and Moody.

amused said...

Hi Robert, Lubos, and anyone else,
I have a question/doubt about something Lubos wrote in his post on this topic and would appreciate your views or clarifications. (Normally I would post this on the blog of the person who wrote it, but seeing as in this case it's Lubos...hope you don't mind me posting it here instead)

LM wrote:
"The fact that different regularizations lead to the same final results is a priori non-trivial but can be mathematically demonstrated to be inevitably true by the tools of the renormalization group."

Is this really true? E.g., I don't recall any mention of this in Peskin & Schroeder's book, even though they discuss the RG in detail. To explain my doubts, consider the case of perturbative QCD: two different regularizations which preserve gauge invariance are dimensional reg. and lattice formulation. In fact there are a whole lot of different possible lattice discretizations, and not all of them can be expected to produce results which agree with the physical ones obtained using dimensional regularization. E.g., there must at least be some kind of locality condition on the lattice QCD formulation that one uses, and I don't think anyone knows at present what the mildest possible locality requirement is that guarantees that the lattice formulation will produce correct results. In light of this, I don't see how it can be asserted that different regularizations (which preserve the appropriate symmetries) are always guaranteed to give the same final results...

Robert said...

I know there is some literature about different regularisation/renormalisation schemes giving identical results but trying to locate some using google scholar was unsuccessful. I know for sure that BPHZ and Epstein-Glaser have been shown to be equivalent and would be surprised if the ones more often used in practical calculations (i.e. dim reg) had not been connected as well. Step zero for such a proof (which in character is mathematical and not very physics oriented) is to define what exactly you mean by scheme X. That would have to be a prescription that works at all loop orders for all graphs and not like in QFT textbooks where a few simple graphs are calculated (most often only one loop so they do not encounter overlapping divergences) and then a "you proceed along the same lines for other graphs" instruction is given.

Lattice regularisation, however, is very different in spirit as it is not perturbative (it does not expand in the coupling constant) so it is not supposed to match a perturbative calculation up to some fixed loop order. Thus it does not compare directly with Feynman graph calculations. Only the continuum limit of the lattice theory is supposed to match with an all loop calculation that also takes into account non-perturbative effects.

In fact, the lattice version of gauge theories is probably the best definition of what you mean by "the full quantum theory including non-perturbative effects" as those are not computed directly in perturbation theory and there are only indirect hints from asymptotic expansions and of course S-duality.

OTOH, starting from the lattice theory, you have to show that the continuum limit in fact has Lorentz symmetry and is causal, two properties that this regularisation destroys. Once you managed this, it's likely you are not too far from claiming the 1 million dollars:


amused said...

Thanks Robert. You seem to have in mind the nonperturbative lattice formulation used in computer simulations, but there is also a perturbative version which does expand in the coupling constant - see, e.g., T.Reisz, NPB 318 (1989) 417 where perturbative renormalizability of lattice QCD was proved to all orders in the loop expansion. However, it is not clear to me that this will always give the correct physical results for any choice of lattice QCD formulation. There must surely be some conditions on the formulation; in particular some minimal locality condition. That's why I was surprised by the claim that any regularization (preserving the symmetries) must lead to the same end results.

(Btw, extraction of physics results from the lattice involves perturbative calculations as well as the computer simulations. I recall some nice posts about this on the "life on the lattice" blog at some point..)

cecil kirksey said...

Interesting subject. I think I can accept the "mathematical" definition of summing divergent series because the "sum" can be defined in a potentially consistent manner.

However, in any real-world situation the question is: does it EVER make sense to use such a divergent series? Would it ever make sense to add (sum?) an infinite number of measurable quantities? If not, exactly what is being added in ST? Thanks.