Audio assignment lab script

MSc Comms/IWC/DSE/DSP
Advanced Multimedia Applications: Audio
Assignment on Spatial Audio
Introduction
By the end of this lab you will be able to play sounds through a pair of headphones that appear to
circle around the listener’s head.
As discussed in lectures, the head-related impulse response (HRIR) is the time-domain description
of the acoustic system between a sound source and the eardrum. Its frequency-domain
equivalent is the head-related transfer function (HRTF). One can be obtained from the other by
applying the Fourier transform. Passing an audio signal through a left/right pair of HRTFs and
playing the output through headphones produces the illusion that the sound is located at a
particular point outside the listener’s head. This forms the basis for binaural spatial audio and is
the subject of this assignment, which comprises five tasks:
1. Obtaining and inspecting a set of HRIRs
2. Spatialising a noise burst in a specific direction
3. Moving a noise source from one side of the head to the other
4. Making a noise source, speech or music circle smoothly around the listener
5. An optional extension exercise of your own choosing (not assessed)
Some text in this lab script has a line down the left side (like this paragraph). This indicates that
the results of any instructions or the answers to any questions in the text should be included in
your audio PowerPoint presentation. You will also need to submit the m-files and .wav files
named in this script and you may want to include clickable buttons in the presentation to play the
sounds within the presentation. (You may assume that the imaginary audience for your
presentation are wearing headphones.)
Part 1: Obtaining and inspecting a set of suitable HRIRs
You are provided with a set of HRIRs for 24 equally-spaced directions in the horizontal plane
around the listener. Using the standard coordinate system described in lectures, all the HRIRs
have zero elevation and azimuth angles which run from 0° to 345° in 15° steps, as shown in Fig. 1.
[Diagram: source directions around the listener, with 0°, 90°, 180° and 270° marked.]
Figure 1: Diagram showing the source directions
for which HRIRs have been provided.
The HRIRs were obtained from the IRCAM HRIR database [1], which contains measured HRIRs for
about 50 people. Only the HRIRs measured in the horizontal plane have been selected and they
have been re-organised to make it easier for you to perform the tasks in this assignment. They
are contained within the MATLAB data file:
N:\course\elec\examples\matlab\jjw\HRIRs_0el_IRC_subject59.mat
Copy the file into the folder where you want to keep your work on this assignment and select the
folder as your current working directory in MATLAB. To extract the variables stored in this file
and place them into MATLAB’s workspace, type at the command line prompt:
load HRIRs_0el_IRC_subject59
The following variables will appear in MATLAB’s workspace:
HRIR_set_L – the 24 HRIRs in the horizontal plane for the left ear
HRIR_set_R – the corresponding set of HRIRs for the right ear
zero_az_angles – the list of directions (in degrees) for these HRIRs
Naz_angles – the number of distinct azimuth directions
Fs – the sampling frequency
Fig. 2 shows how the data in the vector (one-dimensional array) and matrices (two-dimensional
arrays) are organised.
Row index | zero_az_angles (angle) | HRIR_set_L & HRIR_set_R (column index: sample 1, 2, 3, ..., 512)
1         | 0°                     | 512-sample HRIR for direction az = 0°
2         | 15°                    | 512-sample HRIR for direction az = 15°
3         | 30°                    | 512-sample HRIR for direction az = 30°
...       | ...                    | ...
23        | 330°                   | 512-sample HRIR for direction az = 330°
24        | 345°                   | 512-sample HRIR for direction az = 345°
25        | 360°                   | 512-sample HRIR for direction az = 360°
Figure 2: How to access the correct HRIR samples and azimuth directions
Plot the left and right HRIRs for the direction 0° azimuth on the same axes. Label the graph fully
and clearly. Similarly, plot the HRIR pair for 90° azimuth. Explain the key differences between
the two plots. Inspect the plots for a few other directions to check they vary in the way you
expect.
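The row look-up and plotting steps above might be sketched as follows. This is only an illustration: the stand-in data block exists so that the sketch runs without the .mat file, and the stand-in Fs value and the names az, row and t_ms are assumptions rather than part of the supplied data.

```matlab
% Stand-in data so that this sketch runs on its own. In the lab, replace
% this block with:  load HRIRs_0el_IRC_subject59
Fs = 44100;                                 % assumed sampling frequency (Hz)
zero_az_angles = 0:15:360;                  % angles as drawn in Fig. 2
HRIR_set_L = randn(numel(zero_az_angles), 512);   % placeholder, NOT real HRIRs
HRIR_set_R = randn(numel(zero_az_angles), 512);   % placeholder, NOT real HRIRs

az  = 90;                                   % direction to inspect (degrees)
row = find(zero_az_angles == az, 1);        % row holding the HRIR for this angle
t_ms = (0:size(HRIR_set_L, 2) - 1) / Fs * 1000;   % time axis in milliseconds

figure;
plot(t_ms, HRIR_set_L(row, :), 'b', t_ms, HRIR_set_R(row, :), 'r');
xlabel('Time (ms)');
ylabel('Amplitude');
title(sprintf('HRIR pair for azimuth %d degrees, elevation 0 degrees', az));
legend('Left ear', 'Right ear');
grid on;
```

Looking the row up from zero_az_angles, rather than hard-coding an index, keeps the script correct however many rows the matrices contain.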
[1] http://recherche.ircam.fr/equipes/salles/listen/system_protocol.html
Part 2: Spatialising a noise burst in a specific direction
When listening through headphones, there are two distinct ways to process a monophonic audio
signal so that it appears to come from a particular direction. In the time domain, the signal’s
waveform is convolved with the appropriate pair of HRIRs. In the frequency domain, the signal’s
spectrum is multiplied by the appropriate pair of HRTFs. For the purposes of this exercise it is not
necessary to understand the details of convolution, though there are plenty of tutorials online if
you wish to study it further.
In this task you are asked to spatialise a one-second noise burst in the direction (az = 45°, el = 0°),
first in the time domain and then in the frequency domain. Use MATLAB’s random number
generator (rand) to produce the noise signal and place it in a vector called noise. Use a sampling
frequency, Fs, of 44.1 kHz. Save the time-domain code you produce in a MATLAB script file called
static_noise_TD.m and your frequency-domain solution in static_noise_FD.m. The code you
produce should be well commented, so that others can understand how it works.
The signal should have a uniform distribution between -1 and +1 (check this by plotting the noise
signal you produce).
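One possible way to generate such a signal is sketched below. Since rand returns values uniformly distributed between 0 and 1, the output has to be scaled and shifted to cover the range -1 to +1. The row-vector orientation and the 50-bin histogram are assumptions made for this sketch.

```matlab
Fs = 44100;                      % sampling frequency (Hz)
dur = 1;                         % burst duration (s)

% rand is uniform on (0, 1); 2*rand - 1 is uniform on (-1, +1).
% A row vector is used here so that it matches one row of an HRIR matrix.
noise = 2 * rand(1, dur * Fs) - 1;

% Visual checks: the waveform should fill the range -1..+1 and the
% histogram should be approximately flat.
figure;
subplot(2, 1, 1);
plot((0:numel(noise) - 1) / Fs, noise);
xlabel('Time (s)'); ylabel('Amplitude'); title('Noise burst');
subplot(2, 1, 2);
hist(noise, 50);
xlabel('Sample value'); ylabel('Count'); title('Distribution of sample values');
```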
In static_noise_TD.m incorporate the following lines of code to perform the convolution:
L_TD = conv(HRIR_L, noise);
R_TD = conv(HRIR_R, noise);
To perform the equivalent frequency-domain process in static_noise_FD.m include the lines:
L_FD = real(ifft(fft(HRIR_L) .* fft(noise)));
R_FD = real(ifft(fft(HRIR_R) .* fft(noise)));
In this case, HRIR_L and HRIR_R must first be extended with zeros to make them the same length
as noise (a process known as zero padding) and then they are transformed into HRTFs of the
same length using fft. The special array multiply operator .* tells MATLAB to multiply each
element in the two HRTFs by the corresponding element in the spectrum of noise to create a
product vector of the same size.
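The relationship between the two methods can be checked numerically, as in the sketch below, which uses random stand-in vectors in place of the real HRIR and noise. It relies on a standard property of the DFT: multiplying two spectra of length N corresponds to circular convolution of length N. With the HRIR padded to the length of noise, the frequency-domain result therefore matches conv from sample 512 onwards, while the first 511 samples also contain the wrapped-around tail of the linear convolution. All variable names other than conv, fft and ifft are illustrative.

```matlab
M = 512;  N = 44100;
h = randn(1, M);                 % stand-in for HRIR_L (NOT a real HRIR)
x = 2 * rand(1, N) - 1;          % stand-in for noise

% Time domain: linear convolution, length N + M - 1
y_td = conv(h, x);

% Frequency domain with the HRIR zero-padded to the length of noise
h_pad = [h, zeros(1, N - M)];
y_fd  = real(ifft(fft(h_pad) .* fft(x)));        % circular convolution, length N

% From sample M onwards the two results agree
err_tail = max(abs(y_fd(M:N) - y_td(M:N)));

% The first M-1 samples also contain the wrapped-around tail of y_td
err_head = max(abs(y_fd(1:M-1) - (y_td(1:M-1) + y_td(N+1:N+M-1))));

% Padding both signals to N + M - 1 samples reproduces conv exactly
Nfull = N + M - 1;
y_fd_full = real(ifft(fft(h, Nfull) .* fft(x, Nfull)));
err_full  = max(abs(y_fd_full - y_td));

fprintf('tail %.2e, head %.2e, full %.2e\n', err_tail, err_head, err_full);
```

For a one-second noise burst the wrapped-around portion lasts only about 12 ms, so the two outputs should sound very similar.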
Listen to the audio outputs using headphones (open-back headphones or small in-ear ‘bud’
earphones usually give good spatialisation). The noise should sound as if it comes from
approximately the correct direction.
Plot fully labelled graphs of the stereo output from each script for the noise input. Check that the
results appear reasonable.
Save the audio output from static_noise_TD.m and static_noise_FD.m as stereo .wav files called
noise_burst_45_TD.wav and noise_burst_45_FD.wav, respectively. Remember to apply the same
scaling factor to left and right channels when setting the highest absolute sample value to just
under unity. If you do not, the relative amplitude of the two channels may be altered, affecting
the IID cues.
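A minimal sketch of the common scaling step is given below, using stand-in channel signals. The target peak of 0.99, the file name scaling_demo.wav and the variable names are assumptions made for illustration, and audiowrite is assumed to be available (older MATLAB releases provided wavwrite instead).

```matlab
Fs = 44100;
L_out = 3.0 * (2 * rand(Fs, 1) - 1);     % stand-in left channel (too loud)
R_out = 0.5 * (2 * rand(Fs, 1) - 1);     % stand-in right channel (quieter)

stereo = [L_out(:), R_out(:)];           % one column per channel

% One scaling factor for BOTH channels, taken from the overall peak, so
% that the level difference between the ears is preserved.
peak = max(abs(stereo(:)));
stereo_scaled = 0.99 * stereo / peak;    % highest absolute value just under 1

audiowrite('scaling_demo.wav', stereo_scaled, Fs);
```

Normalising each channel separately would bring both peaks to the same level and remove the interaural level difference that the HRIRs introduced.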
Experiment using HRIR pairs for a variety of directions and comment on how effective the
spatialisation is. Try spatialising speech or music as well as noise.
Part 3: Moving a noise source from one side of the head to the other
Building on your experience of spatialising a fixed sound, in this section you will implement a
simple method for moving a sound in virtual 3D space. Initially, the sound should lie at az = 270°
(i.e. to the extreme left of the listener; see Fig. 3). Over a period of 2 seconds, it should pass
through the listener’s head and finish at az = 90° (i.e. to the extreme right of the listener).
[Diagram: trajectory of the virtual sound from the Start position to the Finish position.]
Figure 3: The desired trajectory for the virtual sound.
To achieve the impression of movement, write a MATLAB script called move_noise.m which
produces two 2-second audio waveforms from the same noise source, one for a sound spatialised
in the direction az = 270° and one in the direction az = 90°. Over the 2-second period of
movement, linearly fade down the sound in the Start position and gradually fade up the sound in
the Finish position. You may find the function linspace useful for this. Spatialise the audio using
processing in either the time domain or the frequency domain, whichever you prefer.
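The crossfade itself might look like the sketch below, which uses stand-in stereo signals in place of the two spatialised waveforms. The names start_sig, finish_sig, fade_down and fade_up are hypothetical.

```matlab
Fs  = 44100;
dur = 2;                                 % duration of the movement (s)
N   = dur * Fs;

% Stand-ins for the stereo (N-by-2) signals spatialised at the Start and
% Finish positions. In the lab these come from the HRIR processing.
start_sig  = 2 * rand(N, 2) - 1;
finish_sig = 2 * rand(N, 2) - 1;

fade_down = linspace(1, 0, N)';          % gain applied to the Start signal
fade_up   = linspace(0, 1, N)';          % gain applied to the Finish signal

% Apply the same gain ramp to both channels of each signal and sum
out = start_sig  .* [fade_down, fade_down] + ...
      finish_sig .* [fade_up,   fade_up];
```

The two gains sum to one at every sample, so a signal that is identical at both positions would pass through unchanged.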
Plot a graph of the waveforms for both channels on the same axes and save the audio output in a
stereo .wav file called xfade_noise.wav. Experiment a little by moving sounds in different
directions, at different speeds and over different distances. Is the effect always convincing or
does it work better for some kinds of movement than it does for others?
Part 4: Making a sound circle smoothly around the listener
Moving a sound smoothly around the head is an extension of the previous exercise. In this case,
however, instead of implementing a single crossfade, crossfades are applied repeatedly to move
the sound steadily from one HRIR direction to the next.
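As a starting point, the direction needed at each moment can be turned into a row of the HRIR matrices. The sketch below picks the nearest available direction for each of a hypothetical number of processing frames in one circuit. The frame count and the names frame_az and frame_row are assumptions, and the angle list is a stand-in matching Fig. 2.

```matlab
zero_az_angles = 0:15:360;     % stand-in for the supplied list (see Fig. 2)
Nframes = 96;                  % hypothetical number of frames in one circuit

% Azimuth at the start of each frame, increasing steadily from 0 towards 360
frame_az = 360 * (0:Nframes - 1) / Nframes;

% Row of the nearest available HRIR direction for each frame
frame_row = zeros(1, Nframes);
for k = 1:Nframes
    [~, frame_row(k)] = min(abs(zero_az_angles - frame_az(k)));
end
```

Switching abruptly between neighbouring HRIRs in this way is only a first approximation. The repeated crossfades described above are what smooth the transitions.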
In lectures, the short-time Fourier transform (STFT) was introduced in the context of creating a
spectrogram. This exercise demonstrates another application of the STFT; here it can be used to
alter the spectrum of a signal dynamically (i.e. as a function of time). This can be achieved in the
frequency domain by applying a different HRTF filter to each frame of the STFT. One possible
implementation is shown diagrammatically in Fig. 4.
A partially completed script, STFT_framework.m, for implementing the method in Fig. 4, is
available at
N:\course\elec\examples\matlab\jjw\circling_sound.m
There is no requirement to use it, however, and any method can be used to move the sound,
provided you describe it clearly. Name your script circling_sound.m.
Include a fully labelled graph which shows the left and right output waveforms for a noise source
circling the head once and comment on its appearance.
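[Block diagram: a frame of the input waveform is multiplied by a square-root von Hann window
and transformed with an FFT. In the frequency domain it is multiplied by an HRTF, obtained by
applying an FFT to the HRIR for the current direction, which varies with time. The product is
transformed back with an IFFT, re-windowed and added to the output waveform. The process
then moves right by half a frame and repeats with a new HRIR.]
Figure 4: Block diagram of one method for making sounds move around the head. The process is
shown for only one of the two channels.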
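The windowing and overlap-add steps of Fig. 4 can be sketched for one channel as follows. This is not the supplied framework script, only an illustration under stated assumptions: the frame length of 1024, the variable names and the stand-in HRIR (a unit impulse, whose HRTF is 1 at every frequency) are all hypothetical. With that stand-in the output should reproduce the input, which confirms that a square-root von Hann window applied twice with a half-frame hop sums to unity.

```matlab
Nframe = 1024;                             % frame length (assumed, even)
hop    = Nframe / 2;                       % move right by half a frame
n      = (0:Nframe - 1)';
win    = sqrt(0.5 * (1 - cos(2 * pi * n / Nframe)));   % square-root von Hann

x = 2 * rand(8 * Nframe, 1) - 1;           % stand-in input waveform
y = zeros(size(x));                        % output waveform

% Stand-in HRIR: a unit impulse. In the lab, a row of HRIR_set_L or
% HRIR_set_R would be chosen inside the loop according to the direction.
hrir = [1; zeros(511, 1)];

Nframes = floor((length(x) - Nframe) / hop) + 1;
for k = 1:Nframes
    idx   = (k - 1) * hop + (1:Nframe)';   % samples covered by this frame
    frame = x(idx) .* win;                 % apply the analysis window
    X     = fft(frame);                    % spectrum of the windowed frame
    H     = fft(hrir, Nframe);             % HRTF (HRIR zero-padded to Nframe)
    yfrm  = real(ifft(X .* H));            % filtered frame
    y(idx) = y(idx) + yfrm .* win;         % re-window and overlap-add
end

% Away from the first and last half frame, every sample is covered by two
% frames and the output should equal the input for this stand-in HRIR.
interior = (hop + 1):(length(x) - hop);
err = max(abs(y(interior) - x(interior)));
fprintf('Maximum reconstruction error: %.2e\n', err);
```

The first and last half frames are covered by only one window and therefore fade in and out. Padding the input with half a frame of zeros at each end is one way to avoid this.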
Save examples of the audio output from your code for noise, a 1 kHz sine wave and speech.
Name the files circling_noise.wav, circling_tone.wav and circling_speech.wav, respectively.
Comment on the effectiveness of the spatialisation. Does it depend on the type of signal source?
Are there directions that work well and any that are particularly poor? If there are
shortcomings, try to suggest possible reasons.
Part 5: Optional extension exercise
Now that you are able to move sounds in virtual space, you might like to experiment with an
interesting 3D spatial audio effect of your own choosing. You could add reverberation
(should you add it before or after spatialising the direct sound?), move more than one sound
around at the same time, or extend the IRCAM HRIRs used in the previous exercises to include
different elevations.
MSc_MMapps_audio_assignment_lab_AIT_V1-3a.doc