Before reading this book I had no idea that David Duchovny has a Master's degree in English literature from Yale. And if I knew that, I'd say that he did it as practice for the role of Hank Moody in Californication.

Holy Cow, his debut novel, reveals another side of

]]>Before reading this book I had no idea that David Duchovny has a Master's degree in English literature from Yale. And if I knew that, I'd say that he did it as practice for the role of Hank Moody in Californication.

Holy Cow, his debut novel, reveals another side of his personality. Having been written in recent years, the story is set in modern time. Elsie Bowary, a cow from a farm in the USA, tells us about her escape along with a turkey and a pig to India, Turkey, and Israel respectively. She peeps a TV-set showing a program about meat production that reveals to her the destiny of cows to be slaughtered. The pig and the turkey solicit Elsie to take them with. They go, take a flight, visit several countries, take part in adventures, meet other animals and humans.

The novel is written in a light and humorous style. However, it concerns the way the human race uses the other world, and its propensity to build an industry upon others' lives. Having read up to the middle of the book I thought that Mr. Dukhovny might be a vegetarian.

I liked the book because it was written by David Dukhovny. This is smooth and humoruos reading and It may be good alternative to pulp fiction. On the other hand, I can recommend it only if you enjoy easy-going entertaining books. It's pretty shallow on my opinion.

]]>Obviously, once you've measured some imperfectness, you want to fix it

]]>In the previous post I measured the Room Impulse Response and tried a naive way of its compensation - the Weiners Deconvolution. The next step I took was figuring out how can I correct the frequency and phase characteristics.

Obviously, once you've measured some imperfectness, you want to fix it somehow. An evident approach is to design a FIR filter using this function, but, for no reason, I didn't feel like doing that and wanted to learn something new. I wanted to learn how true scientists solve this problem.

I found some methods, but it seems they're not suitable for this problem. In this post, I describe my researches of these hard ways, and later, I hope, I'll do a simple and straightforward compensation.

ARC is supposed to adjust a played back sound so that a listener would hear its original version. ARC is supposed to vanish out rooms acoustic influence. It performs as an equalizer which is tuned by a computer.

The same task is solved in other areas such as:

echo cancelation solves this problem in a way

However, there's no single approach that suits all needs. For example, equalizers in wireless communication aimed to correct a channel with a wide bandwidth and smallest power consumption. In the problem I'm digging into, one can do whatever they want with a room and equipment, there's no need in real-time processing, no deficit of processing power or memory. The problem I dealt with was a lack of spare time only.

For the simplicity's sake, let's assume we just need to solve an inverse task of a convolution. The result of a deconvolution is an estimation of the \(x\), knowing \(y\) and hoping that we've estimated \(h\) correctly:

\[h \ast x = y \]

This problem has no precise solution, but there are several approaches to it. I tried to find any paper relevant to the ARC, but a short search wasn't very successful, so I adopted the algorithm from this paper, which corresponds to some close but not the same problem.

This method is based on a simple trick: just change the convolution with \(h\) in the equation above to a matrix multiplication and you'll get a well-known problem - an overdetermined system of linear equations. The matrix, that represents the Room Impulse Response \(h\), is a Toeplitz matrix with values of coefficients of \(h\), then the previous equation transforms to this one:

\[\begin{bmatrix} h_{0} & 0 & 0 & 0 & \dots & 0 & 0 & 0 \\ h_{1} & h_{0} & 0 & 0 & \dots & 0 & 0 & 0 \\ h_{2} & h_{1} & h_{0} & 0 & \dots & 0 & 0 & 0 \\ h_{3} & h_{2} & h_{1} & h_{0} & \dots & 0 & 0 & 0 \\ \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots \\ h_{n-3} & h_{n-4} & h_{n-5} & h_{n-6} & \dots & h_{0} & 0 & 0 \\ h_{n-2} & h_{n-3} & h_{n-4} & h_{n-5} & \dots & h_{1} & h_{0} & 0 \\ h_{n-1} & h_{n-2} & h_{n-3} & h_{n-4} & \dots & h_{2} & h_{1} & h_{0} \end{bmatrix} \cdot \begin{bmatrix} x_{0} \\ x_{1} \\ x_{2} \\ x_{3} \\ \vdots \\ x_{n-3} \\ x_{n-2} \\ x_{n-1} \end{bmatrix} = \begin{bmatrix} y_{0} \\ y_{1} \\ y_{2} \\ y_{3} \\ \vdots \\ y_{n-3} \\ y_{n-2} \\ y_{n-1} \end{bmatrix} \]

So, we are to simply build a proper \(h\) and \(y\), and ask any solver to find the best \(x\). The only problem here is that the matrix \(h\) can't be big enough, otherwise the system of equations becomes too big. My laptop is able to deal with up to 2048 numbers across.

This method performs so-called piece-wise deconvolution, which isn't very useful for real-time processing. I want to build a FIR filter that I'd be able to apply to a signal in order to fix an acoustic environment. I call this filter the inverse impulse response (IR). The inverted IR \(k\) should satisfy the following equation:

\[ h\ast (k \ast x) = x \]

This equation reveals that the original signal should be prefiltered with \(k\) just before playing it through speakers (that are the filter \(h\)).

We can estimate \(k\) implying that it is an invertion of \(h\) and solving the following equation using the same approach that has been showed recently:

\[ h \ast k = \delta \]

\( \delta \) here is the digital delta impulse, whose all cells except one equal to zero and the remaining single sample equals to one.

The next step is pretty straightforward, we just try to estimate \( k \) in the following equation:

\[\begin{bmatrix} h_{0} & 0 & 0 & 0 & \dots & 0 & 0 & 0 \\ h_{1} & h_{0} & 0 & 0 & \dots & 0 & 0 & 0 \\ h_{2} & h_{1} & h_{0} & 0 & \dots & 0 & 0 & 0 \\ h_{3} & h_{2} & h_{1} & h_{0} & \dots & 0 & 0 & 0 \\ \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots \\ h_{n-3} & h_{n-4} & h_{n-5} & h_{n-6} & \dots & h_{0} & 0 & 0 \\ h_{n-2} & h_{n-3} & h_{n-4} & h_{n-5} & \dots & h_{-1} & h_{0} & 0 \\ h_{n-1} & h_{n-2} & h_{n-3} & h_{n-4} & \dots & h_{-2} & h_{-1} & h_{0} \\ 0 & h_{n-1} & h_{n-2} & h_{n-3} & \dots & h_{-3} & h_{-2} & h_{-1} \\ 0 & 0 & h_{n-1} & h_{n-2} & \dots & h_{-4} & h_{-3} & h_{-2} \\ 0 & 0 & 0 & h_{n-1} & \dots & h_{-5} & h_{-4} & h_{-3} \\ \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots \\ 0 & 0 & 0 & 0 & \dots & 0 & 0 & h_{n-1} \end{bmatrix} \cdot \begin{bmatrix} k_{0} \\ k_{1} \\ k_{2} \\ k_{3} \\ \vdots \\ k_{n-3} \\ k_{n-2} \\ k_{n-1} \\ k_{n} \\ k_{n+1} \\ k_{n+2} \\ \vdots \\ k_{m} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \\ \textbf{1} \\ 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \]

It's almost the same system of equations except that I added the tail to \(h\) and \(y\) in order to deal with the edge effect on the end of the convolution. This is why \(k\) is bigger than \(h\) in my experiments.

To debug an implementation of this algorithm I made some simulation setup: I used the Room Response that I obtained from the measurements from the previous post, a test signal is a simulated sum of several sines with additive noise. This is what I've got:

There is no Room Response and its inversion on the image as it's not interesting at all, really. On the other hand, the image consists of:

- the
**quasi-delta impulse**is convolution of the Room Response and its computed inversion - the most exhilarating plot for me; - the spectrum of the quasi-delta-impulse;
- comparison of the virgin signal (before play) -
**Y_original**, blue - and the one that was prefiltered with the inverted room response and then "played with speakers" (which was just simulated by convolution with the Room Response) -**Y**, green; - spectrum of the room response and its inversion;
- residual is a difference between the virgin signal and prefiltered and played one.

Test signal **Y_original** is obtained with this code:

```
# Making the Y signal -- sum of couple of sines and awg-noise:
time = np.array(range(irr_len*4))
A = [0.5, 0.2]
freq=[np.pi/16, np.pi/20]
phi = [0, np.pi*1.3]
Y_original = sum([a*np.sin(2*np.pi*f*time + p) for a,f,p in zip(A, freq, phi)])
# Add some noise
Y_original += np.random.normal(0, 0.27, size=Y_original.shape[0])
```

Then this signal was prefiltered and then it was played in the next few lines:

```
# Prefilter it with inverse room response
Y_predistorted = fftconvolve(Y_original, inv_room_response)[:Y_original.shape[0]]
# Filter it like it was played through the speakers (note: we do it with the long version of
# the room response):
Y = fftconvolve(room_response, Y_predistorted)[:Y_predistorted.shape[0]]
```

I solved this system with scipy's lstsq with a default solver suitable for this purpose.

Here is the code of building this matrix equation and solving it:

```
def inverted_ir(IR, invir_len=None):
'''
Calculates reverse impulse response so that we can use it as
a predistortion filter.
IR -- the estimated impulse response.
invir_len -- the length of the target reversed IR.
'''
ir_len = IR.shape[0]
if invir_len is None:
invir_len = ir_len
# The digital delta-impulse:
Y = np.zeros(invir_len + ir_len-1)
Y[ir_len - 1] = 1
ir_zpadded = np.append(IR, np.zeros(invir_len-ir_len))
H = toeplitz(ir_zpadded, np.zeros(invir_len))
# Some magick for dealing with an edge-effect.
H_tail = toeplitz(np.zeros(Y.shape[0]), \
ir_zpadded[-1::-1])[:ir_len-1]
H = np.append(H, H_tail, axis=0)
inverse = lstsq(H, Y)[0]
inverse /= np.max(fftconvolve(IR, inverse))
return inverse
```

Note the fact that H matrix is built of two parts (lines 19 and 21), the latter part forms an edge effect of the convolution in order to emulate convolution more precisely.

This code performs on small pieces of a signal as a solving system of linear equations and it consumes much time on pieces larger than 1024 samples.

It's always useful to talk to someone in order to have all thoughts sorted in the mind. While I was writing this post I thought it should be much efficient to express gradient of error function analytically and feed it a solver.

This is the how I define error function and its partial derivations, that make gradient:

\[E(k) = \Sigma (h \ast k - \delta)^2 \] \[\frac{dE}{dk_i} (k) = H*H^T * k - H^T * \delta\]

Its computation is pretty simple and is checked here:

```
H_sqr = np.matmul(H.T, H)
H_delta = np.matmul(H.T, delta)
def grad(k0):
# Calculate gradient with single matrix multiplication
return (np.matmul(H_sqr, k0)- H_delta)
# Find the minimum -- res.x:
res = minimize(E, k0, method='BFGS', jac=grad)
plot(fftconvolve(res.x,h))
```

However this approach worse than just applying lstsq as the results are equal and lstsq is way faster. Though I practiced in math again :-)

These exercises didn't give me a practical solution as both methods are too computationally demanding. If I set my sights a little lower and stop asking for a perfect inversion of an Impulse Response, I can correct frequency response and group delay separately.

]]>I've just made some experiments with Room Response Estimation method described in "Simultaneous measurement of impulse response and distortion with a swept-sine technique" Angelo Farina [1]. The code measures Impulse Response of the room by playing a probe impulse (just specially prepared audio-signal)

]]>I've just made some experiments with Room Response Estimation method described in "Simultaneous measurement of impulse response and distortion with a swept-sine technique" Angelo Farina [1]. The code measures Impulse Response of the room by playing a probe impulse (just specially prepared audio-signal) and recording the sound which appears to reflect back to a microphone. It's available on github. So maybe someone will come into sight to share their opinion or even improvements.

These weird plots show how I correct my laptop speakers (I'll tell you about it later):

Impulse Response of a room is an acoustic signal you can hear in a particular place in the room when someone makes a super short impulse in another place in the room. Strictly speaking, Room Response is a sum of acoustic reflections of Diracs impulse. Why measure it? Because this signal affects every sound you hear. It's actually is the primary characteristics of the room reverberation. Estimating reverberation let us compensate it with predistortion of the audio signal we're going to playback.

Homer produces very short and loud impulse, the arrows are some particular ways of acoustic waves:

And this is the signal that approaches the microphone:

Modern high-end audio receivers can perform a so-called Active Room Correction on their own, though I haven't a chance to check how it affects an impression of the listening. There're lots of experts reviews about such receivers and this feature. Also, there are several programs aimed to measure the impulse response that let you know what the environment is.

I've made this small piece of code just out of curiosity. It showed interesting results so I hope I'll continue digging this deeper in my spare time.

In ancient times it was common to measure an impulse response by recording on a tape the shortest sound one could make (e.g. piercing a balloon). However, nowadays there're several wide-known modern computational techniques for Room Response Estimation. They all involve speakers and microphone, though they use different math underneath. In short, the basic idea beyond all of them is producing a long self-orthogonal signal with a speaker and calculating cross-correlation between played and measured signals.

Realy, why don't we just play a sine of a particular frequency, record the response then increase the frequency and play and record again etc, then perform Inversed Fourier Transform and voilà - here we've got the Room Response! Just like all digital network analyzers do!

Every tiny error in measurements would make the result of the IFFT complex. And the fact is that Farina's method is almost the same but much more usable.

The most popular modern digital methods of measuring the room response are:

MLS (Maximum Length Sequence) technique: a program playbacks a periodic pseudo-random signal, which is almost the same as white noise, but the program is able to correlate it's acquired copy with original signal in order to recover room impulse response.

IRS (inverted repeated sequence): the sequence consists of MLS where every even sample is inverted.

Time-Stretched Pulses: "this method is based on a time expansion and compression technique of an impulsive signal. The aim of using an expansion process for the excitation signal is to increase the amount of sound power emitted for a fixed magnitude of this signal and therefore to increase the signal-to-noise ratio without increasing the nonlinearities introduced by the measurement system" [2]

Logarithmic SineSweep technique: probe signal is built with a sine-sweep with logarithmic changing frequency, so as to diminish harmonics influence on the result.

Here is a comparison SNRs of these methods [2]:

MLS | IRS | Time-Stretched Pulses | SineSweep |
---|---|---|---|

60.5 dB | 63.2 dB | 77.0 dB | 80.1 dB |

Logarithmic SineSweep is the widely used method of the room response so far. This method performs very alike to a pulse radar: it emits a probe sound and records the reflection (and of course it gets the direct signal too because transmitters and receivers are not actually the same as in a radar). It lets us measure a pure linear room response vanishing out all non-linearity of the sound tract. Although it's very sensitive to an external noise. Fortunately, it's easier to get rid of the noise than to decrease non-linear harmonics volume.

I've chosen this method to implement on my own because of the content of the table in the previous part. It appeared to be simpler to implement than I expected, despite of it's improved performance. It has a straightforward algorithm and clean description in the Farina's paper [1]. I'll give a short overview in the text below.

So, in order to measure the response, we need to playback a probe sound. This impulse must satisfy following requirements:

it has to contain all frequencies you want to measure,

it has to be self-orthogonal, so as to compress itself into Dirac's impulse during auto-correlation.

The Farina's method also eliminates non-linear parts of recorded signal which is a very useful feature too.

The logarithmic sine-sweep impulse is very similar to widely-known sine-sweep signal which is described by following equation:

\(x_{linsweep}(t) = sin(\omega_1 t + \frac{\omega_2-\omega_1}{T} \frac{t^2}{2})\)

This signal lasts for \(T\) seconds and contains a sine with linearly increasing frequency from \(\omega_1\) to \(\omega_2\). This is what most radars emit and receive.

But we can get more if we use exponential law for changing the frequency. In that case cross-correlation of the recorded signal won't produce fake copies of the probe impulse.

The log sweep is made with this not-so-simple formula:

\( x_{logsweep}(t) = sin[ \frac{\omega_1 T}{ln(\frac{\omega_2} {\omega_1})} ( e ^ {\frac{t}{T} ln( \frac{\omega_2}{\omega_1} )} - 1) ]\)

Here is how it's looks like (it's smoothy on the boundaries on purpose, I modulated it with exponential windows):

And what's more important, spectrogram of probe impulse of range from 100 to 15 000 Hz:

And its spectrum:

It only remains for us to build inverse filter - the time-reversed copy of the \( x_{logsweep}(t) \), compensate its amplitude characteristic and convolve the record with it.

The log-sine-sweep impulse has skewed spectrum so that we need to correct the amplitude characteristic of its counterpart. Farina recommends in [1]:

The generation of the inverse filter is simply matter of time-reversing the excitation signal, and then applying to it (the reversed impulse) an amplitude envelope to reduce the level by 6 dB/octave, starting from \(0\) dB and ending to \(−6 \cdot log_2(\frac{\omega_2}{\omega_1}) \).

The inverse filter spectrogram:

Its actual graph is like that:

And its spectrum:

Now we can check the whole system by filtering original probe impulse with the inverse filter. I use for this the same frequency range from 100 Hz to 15 000 Hz and 10 seconds.

If you expected to see a delta impulse on this plot, I'm sorry than, it's impossible until you'd have an unlimited bandwidth. Actually, this signal is a sinc (approximately), mixed with (multiplied by) a carrier frequency in the middle of the range (7 550 Hz). This simply means that we perform Room Response estimation over a limited bandwidth.

All code is available in this repository.

The heart of the room response estimation is inside **RoomResponseEstimator** class (room_response_estimator.py). Scripts **evaluation.py** and **measure.py** includes examples of its use. Class **SoundEnvirenment** (signal_env.py) mocks the real process of measurements. It receives an impulse response as an argument and do everything else inside its method **run(...)**. I used it while I was digging inside the math as a playground, but this code wasn't preserved for descendants.

The probe impulse is generated inside **probe** method, which is called from the constructor:

```
def probe(self):
w1 = self.w1
w2 = self.w2
T = self.T
# page 5
def lin_freq(t):
return w1*t + (w2-w1)/T * t*t / 2
# page 6
def log_freq(t):
K = T * w1 / np.log(w2/w1)
L = T / np.log(w2/w1)
return K * (np.exp(t/L)-1.0)
freqs = log_freq(range(int(T)))
impulse = np.sin(freqs)
return impulse
```

Everything must be pretty clear here, I've just implemented equations you saw in Probe impulse part of this post. **w1** and **w2** are the boundaries of our target frequency range, **T** is the number of samples in the probe impulse.

The counterpart of the impulse response is made by just time reverse and frequency-dependent gain compensation of the probe pulse itself (mentioned in How to build correct a reconstructing filter)

```
# This is what the value of K will be at the end (in dB):
kend = 10**((-6*np.log2(self.w2/self.w1))/20)
# dB to rational number.
k = np.log(kend)/self.T
# Making reverse probe impulse so that convolution will just
# calculate dot product. Weighting it with exponent to acheive
# 6 dB per octave amplitude decrease.
self.reverse_pulse = self.probe_pulse[-1::-1] * \
np.array(list(\
map(lambda t: np.exp(t*k), np.arange(self.T))\
)\
)
```

And then a small workaround comes:

```
# Now we have to normilze energy of result of dot product.
# This is "naive" method but it just works.
Frp = fft(fftconvolve(self.reverse_pulse, self.probe_pulse))
self.reverse_pulse /= np.abs(Frp[round(Frp.shape[0]/4)])
```

I just divide the impulse response of the reconstructing filter by the amplitude of the harmonic from the middle of the spectrum.

The whole process of estimation is just applying reconstructing filter to the recorded response, it's implemented in the same class in **estimate** method:

```
def estimate(self, response):
I = fftconvolve( response, self.reverse_pulse, mode='full')
I = I[self.probe_pulse.shape[0]:self.probe_pulse.shape[0]*2+1]
return I
```

You can run `measure.py`

and get several wave files, some of them you can reuse for debug.

This script just creates the probe impulse, stores it in the wav, calls for **SoX** to playback it and record the result (I call sox just because didn't find any lib for python playing and recording sound that works for me), then it reads the record and plots some stuff which I describe little bit later.

The good way to start is just to call that: `./measure.py -b 100 -e 15000 -d 10`

. It should play the sweep and show you plots (and store the sweep and the record in wave-files).

Here is what you can see on the plots:

**Measured Room Response**is the reason I did this piece of code,**Deconvoluted Room Response**is the result of Wiener Deconvolution of room response with itself just for check,**Recorded log-sine sweep**is the sound the microphone has just heard,**Spectrum of the record**is the spectrum of the previous signal,**Deconvolved sine sweep**is the result of Wiener Deconvolution of record with the measured room response just for check,**Spectrum of deconvolved sine sweep**is the spectrum of the previous signal.

Active Room Correction in common sense is just a predistortion of samples to be played by a speaker in order to achieve almost exactly the same sound that was designed by a record engineer of a music recording studio. That implies applying a linear filter which attenuates peaks of the Room Response spectrum, though it's hard to pull out zeros of standing waves by only digital means. Also one may want to tune phase characteristics of each channel in order to achieve right panorama.

I'm going to use hrir-module of PulseAudio to make amplitude-frequency characteristic of my speakers and room response spectrum smoother.

Stay tuned

[1] Simultaneous measurement of impulse response and
distortion with a swept-sine technique, **Angelo Farina**

[2] Comparison of different impulse response measurement techniques, **Stan Guy-Bart, Embrechts Jean-Jacques, Archambeau Dominique**

Hi, my name is Mikhail Baranov, I work as an embedded software engineer and I'm fond of digital signal processing stuff too. I'm native Russian speaker so please bear my English. I'm doing my best to learn it but it's hard to do in Moscow, you know. I do some

]]>Hi, my name is Mikhail Baranov, I work as an embedded software engineer and I'm fond of digital signal processing stuff too. I'm native Russian speaker so please bear my English. I'm doing my best to learn it but it's hard to do in Moscow, you know. I do some pet projects which are available on github. You can reach me on facebook or linkedin.

]]>Internet of Things isn't just things with internet connection. Kettle's web-interface doesn't change user experience in major: user still have to put water in it and make an order to boil. It's not a game changing feature to be able to turn on the kettle lying in

]]>Internet of Things isn't just things with internet connection. Kettle's web-interface doesn't change user experience in major: user still have to put water in it and make an order to boil. It's not a game changing feature to be able to turn on the kettle lying in the bed or turn off the lights at home while your are in the office. It's just a remote control.

The main feature of getting things connected is synergy they could provide by exchanging infomation to each other. The washing machine could know the type of the clothes inside itself and make the decision when and how to operate based on this information and the electricity prices which it could acquire from the power-meter. The fitness tracker could know if the user has a party tonight so as to tell the fridge that it has to order a beer or analgene in addition to regular milk.

Internet of Things makes sense if different Things connects to each other in order to provide an environment to run applications affecting few of such devices. And the ability of execute that applications is a game changing feature.

Sometime later IoT will become a mess with a few clusters of compatible things inside. Some applications would have standards tightly defining interconnections and functions of whole system, just like smart grid in USA.

Big players would have platforms joining very different devices. Google's glasses, lenses, powermeter and cars are already here. The most interesting business is in providing the platform for running high level applications affecting things of all descriptions. And providing a market of that applications.

I won a contest with this text. They didn't tell anything about competitors so I think I was alone :-)

]]>