Thursday, June 15, 2017

Some DIY LIGO data analysis

UPDATE: After some more thinking about this, I have very serious doubts about my previous conclusions. From looking at the power spectrum, I (wrongly) assumed that the middle part of the spectrum is the low frequency part (my original idea was that the frequencies should be symmetric around zero, but the periodicity of the Bloch cell bit me). Quite to the contrary, once the wrapping is taken into account, this is the high frequency part (at almost the sample rate). So this is neither physics nor noise but the sample rate. For documentation, I do not delete the original post but leave it with this comment.
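To make the wrapping concrete, here is a minimal sketch in Python with NumPy (not the Mathematica notebook used for the original analysis). It assumes the 4096 Hz sample rate quoted in the comments and a made-up, tiny sample count. In NumPy's convention, bin 0 of a raw FFT is zero frequency and the middle bin sits at half the sample rate, so the low frequencies live at the two ends of the array, not in the middle.

```python
import numpy as np

fs = 4096.0   # sample rate in Hz (the rate quoted in the comments)
n = 16        # hypothetical, tiny number of samples so the output is readable

# Frequency assigned to each FFT bin, in the order the FFT returns them.
freqs = np.fft.fftfreq(n, d=1.0 / fs)

# Bin 0 is zero frequency; the middle bin is the highest frequency, fs / 2.
# The second half of the array holds the negative (wrapped) frequencies.
middle_bin = n // 2
print(freqs[0])                 # 0.0
print(abs(freqs[middle_bin]))   # 2048.0

# fftshift reorders the bins so that zero frequency sits in the middle,
# which is the layout one might naively assume for a raw FFT plot.
shifted = np.fft.fftshift(freqs)
print(shifted[middle_bin])      # 0.0
```

The same `fftfreq` call turns a bin number into a frequency in Hz for any plot of the spectrum.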


Recently, in the Arnold Sommerfeld Colloquium, we had Andrew Jackson of NBI talk about his take on the LIGO gravitational wave data, see this announcement with a link to a video recording. He encouraged the audience to download the freely available raw data and play with it a little bit. This sounded like fun, so I had my go at it. Now that his paper is out, I would like to share what I did with you and ask for your comments.

I used Mathematica for my experiments, so I guess the way to proceed is to guide you to an HTML export of my (admittedly cleaned up) notebook (Source for your own experiments here).

The executive summary is that apparently, you can eliminate most of the "noise" in the interesting low frequency part by adding to the signal its time reversal, which casts some doubt on the stochasticity of this "noise".


I would love to hear what this is supposed to mean or what I am doing wrong, in particular from my friends in the gravitational wave community.



3 comments:

Shantanu said...

There are several things that I have not understood in what you did or why your results are surprising. Following are my questions.

o Just to be clear.
When you say "raw strain data at 4096 Hz", I presume you mean the h(t) time series containing the GW signal + residual LIGO noise, right?
Is it completely raw (except for being down-sampled to 4096 Hz)?

o You then mention that since the signal is real, the Fourier transform has a phase of 0 or pi for a constant phase
and you find out that it is pi. I don't know what this tells us. What frequency does the region of constant phase correspond to?
(Maybe you can show one plot with frequency in Hz on the x-axis instead of the frequency bin number.)

o Why do you consider h(t) + time-reversed h(t) (containing the signal) noise?
Only h(t) (containing the signal + LIGO noise) minus the best-fit signal template would be noise?
But I don't think you are doing that. So I don't understand how the sum of the original signal and its time reversal is noise. Also, I don't see a cancellation at other frequencies.

o Also, have you repeated the same exercise for Hanford? Do you get the same results?
I guess at any rate the best way to resolve this is through some toy numerical experiments, injecting a mock signal into a mock time series containing white noise (and also doing the same with colored noise).
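A toy experiment along these lines could look like the following sketch in Python with NumPy. Everything in it is an assumption made for illustration: the 8 s duration, the sine-Gaussian mock signal, the 1/f amplitude shaping used as "colored" noise, and the 20 to 300 Hz band. For noise with random phases one would expect adding the time reversal to roughly double the band power, not to cancel it.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 4096                      # sample rate in Hz, as in the data discussed here
n = 8 * fs                     # hypothetical 8 s stretch of mock data
t = np.arange(n) / fs

def colored_noise(n, rng):
    """White noise shaped by a 1/f amplitude filter (an arbitrary choice of color)."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    f[0] = f[1]                # avoid dividing by zero at zero frequency
    shaped = np.fft.irfft(spectrum / f, n=n)
    return shaped / shaped.std()

def band_power(x, f_lo, f_hi):
    """Mean power of x in the band [f_lo, f_hi) Hz."""
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)
    p = np.abs(np.fft.rfft(x)) ** 2
    mask = (f >= f_lo) & (f < f_hi)
    return p[mask].mean()

# Mock signal: a weak sine-Gaussian burst at 150 Hz, centered at t = 5 s.
signal = 0.1 * np.exp(-((t - 5.0) / 0.05) ** 2) * np.sin(2 * np.pi * 150 * t)

noises = {"white": rng.standard_normal(n), "colored": colored_noise(n, rng)}
ratios = {}
for name, noise in noises.items():
    data = noise + signal
    summed = data + data[::-1]          # add the time reversal
    ratios[name] = band_power(summed, 20, 300) / band_power(data, 20, 300)
    print(name, ratios[name])
```

A ratio well below 2 on real data, but not on mock data like this, would point to something other than stochastic noise, for instance an artifact of how the spectrum is being read.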

Shantanu said...

I asked someone who has extensively dealt with time-series analysis about this and that person said

"Time reversing and subtracting a noisy time series should only cancel out 1 sample (for white noise) and increase the variance (by a factor of 2) for the rest. I see from the plot that the noise is canceled over a much longer time (?) before the variance increases. Could be an effect of colored noise though… I would expect the cancellation to be over the auto-correlation timescale (which is 1 sample for white noise but longer for colored noise + lines)."
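The expectation in this quote can be checked with a small sketch in Python with NumPy. It uses AR(1) noise as a stand-in for colored noise; that model, the coefficient 0.9, the series length and the number of realizations are all assumptions chosen for illustration. For unit-variance noise with autocorrelation rho, the difference between the samples j steps to either side of the center has variance 2 (1 - rho(2j)), so it vanishes at the center and climbs to 2 over the autocorrelation timescale.

```python
import numpy as np

rng = np.random.default_rng(1)
n_real = 4000          # hypothetical number of noise realizations
n = 201                # odd length, so there is a single center sample
center = n // 2

def ar1(n_real, n, phi, rng):
    """AR(1) noise x[t] = phi * x[t-1] + e[t], scaled to unit variance; phi = 0 is white."""
    e = rng.standard_normal((n_real, n))
    x = np.empty_like(e)
    x[:, 0] = e[:, 0] / np.sqrt(1 - phi**2)   # start in the stationary distribution
    for t in range(1, n):
        x[:, t] = phi * x[:, t - 1] + e[:, t]
    return x * np.sqrt(1 - phi**2)

def variance_near_center(x, offsets):
    """Variance over realizations of x minus its time reversal, at center + offsets."""
    d = x - x[:, ::-1]
    return d[:, center + offsets].var(axis=0)

offsets = np.array([0, 1, 5, 50])
white = variance_near_center(ar1(n_real, n, 0.0, rng), offsets)
colored = variance_near_center(ar1(n_real, n, 0.9, rng), offsets)

print("offsets:", offsets)
print("white:  ", white)     # zero at the center, close to 2 elsewhere
print("colored:", colored)   # rises gradually towards 2 away from the center
```

With white noise only the center sample cancels, while the correlated noise stays suppressed for a few samples on either side, which is the behavior the quote describes.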

Robert Helling said...

"Is it completely raw (except for being down-sampled to 4096 Hz)?"

That's how I understand the LIGO open science web page. I take that data set as is.

"You then mention that since the signal is real, the Fourier transform has a phase of 0 or pi for a constant phase
and you find out that it is pi. I don't know what this tells us. What frequency does the region of constant phase correspond to?"

The phase should be random (as it is outside the low frequency region: the dots come in all colors). What is the frequency? You can do the math yourself (and I am too lazy), but given that this is the region of low noise, I would expect it to be up to a few hundred Hz.

"Why do you consider h(t) + time-reversed h(t) (containing the signal) noise?
Only h(t) (containing the signal + LIGO noise) minus the best-fit signal template would be noise?
But I don't think you are doing that. So I don't understand how the sum of the original signal and its time reversal is noise. Also, I don't see a cancellation at other frequencies."

The signal has orders of magnitude less power than the noise. So not taking out the signal does not really matter.

The effect is strongest for the dataset I mentioned.

And your time-series analysis friend is right: adding to random noise its time reversal should yield noise that is sqrt(2) times stronger (except at omega=0).