So … I’ve spent a lot of pages talking about filtering and the spectrum that results from the Fourier analysis of a time series – and used up much of that space on plots. Tired of plots yet? There’s only one on this last page.
I used the two most common normalized filters in 4th-order configurations (a quick code sketch of both follows the list):
1) the Butterworth, normalized for a zero-frequency amplitude of 1 and a corner frequency of 1 rad/s
2) the Bessel filter – aka the Thomson or linear-phase filter – normalized for a zero-frequency group delay of 1 sec.
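For anyone who’d rather poke at these responses than read about them, here’s a minimal sketch of the two prototypes in Python/SciPy – my own translation, not the Mathematica routines linked at the end of the post:

```python
import numpy as np
from scipy import signal

# 4th-order Butterworth analog prototype: corner (-3 dB) at 1 rad/s, DC gain of 1
b_bw, a_bw = signal.butter(4, 1.0, analog=True)

# 4th-order Bessel (Thomson / linear-phase) analog prototype,
# normalized for a group delay of 1 s at zero frequency (norm='delay')
b_be, a_be = signal.bessel(4, 1.0, analog=True, norm='delay')

# magnitude of each response at w = 1 rad/s
_, h_bw = signal.freqs(b_bw, a_bw, worN=[1.0])
_, h_be = signal.freqs(b_be, a_be, worN=[1.0])
print(abs(h_bw[0]))   # ~0.707: the Butterworth sits at its -3 dB corner
print(abs(h_be[0]))   # closer to 1: the Bessel's -3 dB point lies above 1 rad/s
```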
The Butterworth is more common and often “easier” to design. It has a uniform (maximally flat) amplitude response and monotonic attenuation, but its group delay is greater than 1 sec, non-monotonic, and peaks near the upper end of the passband. The Bessel, by contrast, has an essentially flat group delay, but with the unit-delay normalization its corner frequency sits above the Butterworth’s 1 rad/s – something that needs to be considered when it’s used as an anti-aliasing filter. The Bessel can instead be normalized for a corner frequency of 1 rad/s to match the Butterworth; that increases the delay time – although the delay stays uniform – and its attenuation roll-off is slower, and begins sooner, than the Butterworth’s.
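The group-delay behavior described above can be checked numerically by differentiating the unwrapped phase of each analog response – again a rough Python sketch of my own, not the original analysis:

```python
import numpy as np
from scipy import signal

w = np.linspace(0.01, 3.0, 3000)   # rad/s, spanning the passband and corner region

for name, (b, a) in [
    ("Butterworth", signal.butter(4, 1.0, analog=True)),
    ("Bessel", signal.bessel(4, 1.0, analog=True, norm='delay')),
]:
    _, h = signal.freqs(b, a, worN=w)
    gd = -np.gradient(np.unwrap(np.angle(h)), w)   # group delay = -d(phase)/dw, in seconds
    print(f"{name:11s}: delay near DC = {gd[0]:.2f} s, peak delay = {gd.max():.2f} s")

# Expect the Butterworth delay to exceed 1 s and peak near the corner,
# while the Bessel stays essentially flat at ~1 s across the passband.
```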
The Butterworth should be used where phase linearity over a wide bandwidth is not an issue. The Bessel should be used where a linear phase response over a wide bandwidth is desirable.
In many situations, the desired data is collected in pre-determined “sample lengths” – some fixed period of time. In my “geophysics” world of making “earth” measurements (the term “earth” covering many things, including minerals in asteroids or water on Mars), this sample length may be set by geometrical considerations: for example, each data segment should represent 1 km along a survey line. The question then becomes something like “data every 10 m or every 100 m?”
In most of the examples previously discussed, the signal bandwidth covered just shy of a full decade – the lowest frequency was 0.1, the highest 0.9. With samples collected at a rate of 2, 1 cycle of the lowest frequency takes 10 sec, or 20 samples per cycle, so 5003 samples represent 250 cycles of the lowest frequency – and 2500 cycles of a signal at the Nyquist frequency of 1. My high-side rule-of-thumb of 50 samples per cycle suggests a minimum sample length of 1000 to provide sufficient sampling for the lowest frequency. If my lowest project frequency were 100 kHz and the sample frequency 2 MHz, I’d want a minimum sample length of 10 ms.
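Making the bookkeeping in that paragraph explicit – just the arithmetic, nothing more:

```python
fs = 2.0                  # sample rate used in the earlier examples
f_low, f_nyq = 0.1, 1.0   # lowest signal frequency and the Nyquist frequency
n = 5003                  # record length in samples

duration = n / fs          # 2501.5 time units
print(fs / f_low)          # 20 samples per cycle of the lowest frequency
print(duration * f_low)    # ~250 cycles of the lowest frequency in the record
print(duration * f_nyq)    # ~2500 cycles at the Nyquist frequency
```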
The spectrum examples provided were based on a sample rate of twice the Butterworth corner frequency of 1 – that is, the corner sat right at the Nyquist frequency, which in practice is often a poor choice of sample rate and/or corner frequency. The variations in response were defined by the data sample length. It is not uncommon for “more samples” to be selected as the route to good frequency discrimination. What that solution doesn’t address are the effects of the relationship between the signal frequencies and the system – in this case, the filter corner frequency.
The scenario presented included a sinusoidal shift in phase related to changes in signal travel path during measurements. Phase variations will appear as modulation components. In some situations, this suggests AM demodulation techniques as a means of extracting the desired data.
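As a toy illustration – my own sketch with made-up numbers, not the project data – a carrier whose phase wobbles sinusoidally picks up modulation sidebands at the carrier frequency plus and minus the wobble frequency:

```python
import numpy as np

fs, T = 100.0, 200.0                  # sample rate and record length (arbitrary units)
t = np.arange(0, T, 1 / fs)
fc, fm, beta = 5.0, 0.25, 0.3         # carrier, phase-wobble rate, wobble depth in radians

# carrier with a small sinusoidal phase shift, as in the travel-path scenario
x = np.cos(2 * np.pi * fc * t + beta * np.sin(2 * np.pi * fm * t))

spec = np.abs(np.fft.rfft(x)) / len(x)
freqs = np.fft.rfftfreq(len(x), 1 / fs)
top3 = np.sort(freqs[np.argsort(spec)[-3:]])
print(top3)                           # [4.75, 5.0, 5.25]: the carrier plus sidebands at fc +/- fm
```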
Even with 4th-order filters, a good case can be made for configuring the cut-off frequency at least 1 decade above the highest signal frequency. This was examined in the previous section – in fact, a separation of 2 decades performed even better … but other factors not discussed here come into play.
Let’s talk through the criteria for data analysis … again, I’ll work through the problem on the basis of a filter corner frequency of 1.
I need a time length of data sufficient to measure the desired information (the amplitude of a known frequency below the corner frequency of the filter). The previous examples used a sample frequency of 2 – chosen for no other reason than that it was a parameter of an actual project I once worked on (and was, by intent, the Nyquist sample frequency).
I’ve two time factors here: number of samples per cycle and number of cycles to analyze.
I’ve recommended an initial estimate of the sample frequency between 20x and 50x the highest signal frequency, rather than the 2x of the examples. The examples in the previous section showed a factor of 100 to be satisfactory … but some compromise is often necessary.
I have a time length of data. How long does it need to be? Do I have some idea of the signal to be measured?
The filter frequency is 1. The units – Hz or rad/s – don’t matter, except when converting to time (time = 1/f, not 1/ω). I’ve recommended a sample rate of 50x this highest signal frequency; I had been sampling at 2x. At a sample frequency of 2, 1 cycle in 1 time unit (conveniently seconds when assuming units of Hz) will have 2 samples at ½-sec intervals. At a sample rate of 50x, I’ll have samples at 1/50-cycle intervals. However, I need to ensure that I have a sufficient data length to adequately analyze the data.
If my frequency is 1 Hz and I sample every 1/50 sec, I have 1 cycle in 1 sec with 50 samples. If the frequency is 0.1 Hz, I have 1/10 of a cycle in that same second – still with 50 samples – and I need a sample length of 10 sec to capture one full cycle … 500 samples. At f = 0.01, I need 100 sec for 1 cycle; at samples every 1/50 sec, that’s 5000 samples. But if I have a DC signal, 1 sample is (ideally) sufficient. The difference is frequency resolution. Perhaps I don’t need to resolve frequencies down to 1/100 of the filter frequency … which would be equivalent to stating I don’t need to resolve a 10 kHz signal with a filter corner frequency of 1 MHz.
And maybe I don’t – like always, it depends on project goals.
In this case, I do desire that resolution. To obtain 1 cycle of information from a signal of fin = 0.01, I need fs/fin = 50/0.01 = 5000 samples. In actuality, I’d wrap that up in a bit of pseudo-code rather than do it by hand.
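Something along these lines, sketched here in Python (the helper name and rounding up to a whole sample are my choices, not a fixed recipe):

```python
import math

def samples_needed(f_in, f_s, n_cycles=1):
    """Samples required to capture n_cycles of a signal at frequency f_in when sampling at f_s."""
    return math.ceil(n_cycles * f_s / f_in)

print(samples_needed(0.01, 50))   # 5000 samples for one cycle at f_in = 0.01, f_s = 50
print(samples_needed(0.1, 50))    # 500
print(samples_needed(1.0, 50))    # 50
```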
So, all this blabber aside, what would I do? In general, and for 1st-pass estimates (subject to modification under differing circumstances), I’d define my anti-aliasing filter to be more than 2nd-order with a cut-off frequency at least a factor of 5 above my highest signal frequency – preferably a full decade. I’d set my sample frequency to a minimum of 5x my filter frequency … but I’d want between 20 and 50 samples per period of the highest signal frequency.
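Turned into numbers, that first-pass recipe might look something like this sketch – the function name and the specific factors (10x for the corner, 5x for the sample-rate floor) are just my reading of the loose recommendations above:

```python
def first_pass_setup(f_signal_max):
    """Rough starting points for the anti-alias corner and sample rate, per the rules of thumb above."""
    f_corner = 10 * f_signal_max                      # at least 5x the highest signal frequency, preferably a decade
    f_sample = max(5 * f_corner, 20 * f_signal_max)   # at least 5x the corner, and 20-50 samples per signal period
    return f_corner, f_sample

print(first_pass_setup(0.5))   # (5.0, 25.0): the allocations marked on the plot below
```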
What follows is a generic plot of a 4th-order Butterworth response with my suggestions for frequency allocations.
The filter corner frequency is 1. The signal bandwidth is actually open-ended on the low side but encompasses one decade between 0.05 and 0.5. Notice the signal magnitude is no longer 1 at a frequency of 0.5 – this is the reason I suggest 1/10 the filter corner frequency as the upper signal limit. It may be that this difference is not significant … it’s a loose recommendation anyway.
On the other side of the corner frequency, I’ve indicated a range of sample frequencies. These are well above the mathematical Nyquist limit of 2 times the signal frequency – I can’t build a mathematical brick-wall filter, so the ideal 2x limit doesn’t apply.
The lower limit is 5, which is 10x the highest signal frequency and corresponds to about 56 dB (1/625) of attenuation. The upper end is also open-ended, but as shown the higher mark is 25 – 50x the highest signal frequency – with attenuation of about 112 dB (1/390,625).
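Those attenuation figures fall straight out of the ideal 4th-order Butterworth magnitude; here’s a quick check using the closed-form expression rather than the plotted response:

```python
import math

def butter4_mag(w, wc=1.0):
    """Magnitude of an ideal 4th-order Butterworth low-pass with corner frequency wc."""
    return 1.0 / math.sqrt(1.0 + (w / wc) ** 8)

for w in (0.5, 5.0, 25.0):
    m = butter4_mag(w)
    print(f"w = {w:>4}: |H| = {m:.3e} ({20 * math.log10(m):7.1f} dB)")

# w =  0.5: ~0.998, just below 1 at the top of the signal band
# w =  5.0: ~1/625, roughly -56 dB
# w = 25.0: ~1/390,625, roughly -112 dB
```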
Long as this 9-part post is, I’ve barely covered the topic, even constraining myself to these two simple filters in an ideal situation. There are plenty of “cookbook” approaches, but following a cookbook without understanding can leave you with a fried chicken dinner when the goal was roast turkey – and it might be a wonderful fried chicken, but it’s not the requested turkey. (It’s Thanksgiving time as I write this – I needed a Thanksgiving analogy.)
I think this has been enough discussion on this topic so …
That’s a wrap.
Some of my history with Mathematica, and a few of the routines used to analyze and plot the data, can be found here.