Almost every explanation of FIR and IIR filters has not worked for me. I have a reasonable background in maths, but the maths used by mathematicians rather than physicists or engineers. This means I can deal with complex numbers, and more of less Fourier analysis, but anything to do with Laplace makes me go cross-eyed. Somehow, every explanation of FIR and IIR, being framed for physicists or engineers, goes over my head. So, I've tried to reframe it in terms I understand.
This document is my attempt at explanation.
Given a sequence X_n, we can make a FIR filter of it, Y_n, by calculating it as follows:
Y_n = a_0 * X_n + a_1 * X_n-1 + a_2 * X_n-2 + ...
for some finite number of terms.
It's a finite impulse response filter, because if you put a sequence with a finite number of non-zero elements in, you get a finite number of non-zeros out.
An IIR, on the other hand, looks like this:
Y_n = a_0 * X_n + a_1 * X_n-1 + a_2 * X_n-2 + ... + b_1 * Y_n-1 + b_2 * Y_n-2 + ...
Again, with a finite number of terms.
Because of the feedback, a finite impulse in can lead to an infinite response out. They tend to make better filters.
Yes. While the underlying systems are usually continuous, when dealt with as digital filters, they're discrete, and to avoid the pain of "How does this discretise?", I'm modelling it as a discrete system.
We'll deal with IIR filters, treating FIR as a subset of IIR.
Let's say we have the same filter being applied to seqeuences X1 and X2, producing Y1 and Y2:
Y1_n = a_0 * X1_n + a_1 * X1_n-1 + a_2 * X1_n-2 + ... + b_1 * Y1_n-1 + b_2 * Y1_n-2 + ...
Y2_n = a_0 * X2_n + a_1 * X2_n-1 + a_2 * X2_n-2 + ... + b_1 * Y2_n-1 + b_2 * Y2_n-2 + ...
What would happen if, instead, we applied the filter to the element-wise sum of X1 + X2 (call it Z)?
Z_n = a_0 * (X1_n + X2_n) + a_1 * (X1_n-1 + X2_n-1) + a_2 * (X1_n-2 + X2_n-2) + ...
+ b_1 * Z_n-1 + b_2 * Z_n-2 + ...
Z_n = a_0 * X1_n + a_1 * X1_n-1 + a_2 * X1_n-2 + ...
+ a_0 * X2_n + a_1 * X2_n-1 + a_2 * X2_n-2 + ...
+ b_1 * Z_n-1 + b_2 * Z_n-2 + ...
I will now do what feels like some slightly dodgy induction. If we
start calculating our sequences at index 0, we have to find some
values for index i
, where i < 0
. We'll
assume they're zero. This gives us Z_i = Y1_i + Y2_i
for i < 0
. Inductively, if we assume
Z_m = Y1_m + Y2_m
for m < n
:
Z_n = a_0 * X1_n + a_1 * X1_n-1 + a_2 * X1_n-2 + ...
+ a_0 * X2_n + a_1 * X2_n-1 + a_2 * X2_n-2 + ...
+ b_1 * Z_n-1 + b_2 * Z_n-2 + ...
= a_0 * X1_n + a_1 * X1_n-1 + a_2 * X1_n-2 + ...
+ a_0 * X2_n + a_1 * X2_n-1 + a_2 * X2_n-2 + ...
+ b_1 * (Y1_n-1 + Y2_n-1) + b_2 * (Y1_n-2 + Y2_n-2) + ...
= a_0 * X1_n + a_1 * X1_n-1 + a_2 * X1_n-2 + ...
+ b_1 * Y1_n-1 + b_2 * Y1_n-2 + ...
+ a_0 * X2_n + a_1 * X2_n-1 + a_2 * X2_n-2 + ...
+ b_1 * Y2_n-1 + b_2 * Y2_n-2 + ...
= Y1_n + Y2_n
Hence, by induction, Z_n = Y1_n + Y2_n
. It's linear!
This means that, for any input, to get the output of the filter, we can break the input down into convenient signals, pass them through the filter, and sum the results together.
Convienient signals like... sine waves. If we know how the filter reacts to sine waves, we know everything about it. (Thanks, Fourier!)
For our purposes, it's easiest to treat sine/cosine as the real part of a complex exponenential. Sorry, it just is. Complex addition is just a lot easier than dealing with a bunch of trig identities.
I'm now going to start indexing the sequence with t
(for time) as
i
is going to be the square root of minus one.
Assume the sequence elements X_t
are samples at
frequency s
of a cosine of
frequency f
. Then X_t = Re (e^(i 2 pi t f /
s))
. Let's set w = 2 pi f / s
for simplicity
(w
is my ASCII approximation to omega, angular
velocity).
Then
Y_n = a_0 * Re(e^(iwn)) + a_1 * Re(e^(iw(n-1))) + a_2 * Re(e^(iw(n-2))) + ...
+ b_1 * Y_n-1 + b_2 * Y_n-2 + ...
and quite frankly we can (assuming the a
s
and b
s are real) just drop the "Re" operators and just
take the real component of Y
as we need it, to make our
lives easier:
Y_n = a_0 * e^(iwn) + a_1 * e^(iw(n-1)) + a_2 * e^(iw(n-2)) + ...
+ b_1 * Y_n-1 + b_2 * Y_n-2 + ...
Next, we're going to do that dodgy thing where we assume the form
of a solution, and then check it works. Specifically, we'll
assume Y_n = y * e^(iwn)
, where y
is some
complex (not necessarily real) number:
y * e^(iwn) = a_0 * e^(iwn) + a_1 * e^(iw(n-1)) + a_2 * e^(iw(n-2)) + ...
+ b_1 * y * e^(iw(n - 1)) + b_2 * y * e^(iw(n-2)) + ...
Let's divide everything by e^(iwn)
:
y = a_0 + a1 * e^(-iw) + a2 * (e^(-iw))^2 + ...
+ b_1 * y * e^(-iw) + b_2 * y * (e^(-iw))^2 + ...
Solving for y
, we get:
y = (a_0 + a1 * e^(-iw) + a2 * (e^(-iw))^2 + ...)
/ (1 - (b_1 * e^(-iw) + b_2 * (e^(-iw))^2 + ...))
The value of y
as w
varies is the frequency response of the filter. It
turns out the frequency response is simply a rational polynomial of
e^(-iw)
.
What you usually care about is the magnitude of y
,
which gives the amplification of that frequency, while the argument
of y
is the phase shift that the filter gives to that
signal.
The value of w
usually lies in the range between 0 (DC
signal) and pi (the Nyquist frequency for that sampling rate). You can
squint and think of e^(-iw)
as providing a bit of
distortion on a linear interpolation of w
between 1 and
-1 (most distorted at the highest and lowest frequencies).
All the fancy filter designs are really about choosing a polynomial that gives the shape of frequency response (and phase, for those that care) curve that people want.
The "poles" of filter design are the points at which the denominator goes to zero, creating infinite amplification. Physically, that's resonance, and you probably don't want a pole inside the range of inputs you expect, unless you really, really want a pole there (when what you're building is called an "oscillator :).
Zeroes are when the numerator goes to zero, and that frequency is not passed through at all.
Posted 2023-11-06.