D6.1 VOCAL TRACT REPLICAS AND ACOUSTIC MEASUREMENTS

Xavier Pelorson, Rémi Blandin, Annemie Van Hirtum and Xavier Laval
Gipsa-lab, 11 rue des Mathématiques, Grenoble Campus, Saint Martin d'Hères, France
e-mail: [email protected], [email protected], [email protected], [email protected]

The goal of this work is to provide accurate and extensive acoustic measurements on various vocal tract replicas in order to validate the numerical simulations performed in WP5. This task first required the development of a specific measurement set-up to acquire the acoustic pressure inside vocal tract replicas at specific positions or over a whole surface using a 3D positioning stage. The second major challenge was the optimization of each element of this set-up and of the post-processing of the acquired data. In close collaboration with WP5, measurements have been performed on vocal tract replicas of increasing complexity. The comparisons with the numerical simulations performed in WP5 were complemented by comparisons with theoretical predictions obtained from simple acoustical theory.

Version | Date | WP number | WP leader | WP leader email
V1 | 05/03/2014 | 6 | Xavier Pelorson | [email protected]

EUNISON is supported by the Future and Emerging Technologies (FET) programme within the ICT theme of the Seventh Framework Programme for Research of the European Commission

EUNISON: Extensive UNIfied-domain SimulatiON of the human voice

Contents
1 Introduction
2 Experimental setup
  2.1 Setup
  2.2 Vocal tract replicas and acoustic excitation
  2.3 Acoustic pressure modulus and phase estimation
3 Theory
  3.1 Plane wave theory
    3.1.1 One tube
    3.1.2 Two tubes
4 Transfer function measurement
  4.1 Transfer function measurement method
  4.2 Problems encountered when trying to perform measurements near the source
  4.3 Measured transfer functions
    4.3.1 One tube replica
    4.3.2 Two tubes replica
    4.3.3 Vowels
5 Reflection coefficient estimation
  5.1 The two microphone method
  5.2 Reflection coefficient estimation from experimental data
6 Surface measurement
  6.1 One tube replica
  6.2 Two tube replica
7 Conclusion
REFERENCES
A Article : three dimensional vocal tract acoustics

1. Introduction

A very commonly used geometrical approximation of the vocal tract consists of a succession of tubes having different cross sections and sharing the same axis. This approximation implicitly assumes plane wave propagation. However, the human vocal tract is not perfectly axisymmetric, and transverse modes could be generated and take part in the propagation of sound in the frequency range of interest for speech production.
An approximation consisting of a succession of tubes that takes the eccentricity of the vocal tract and the transverse propagation modes into account may therefore be more appropriate. To investigate to what extent the plane wave assumption is accurate, transfer functions and pressure patterns at given frequencies are measured inside vocal tract replicas. Several replicas of increasing complexity are studied. As a particularly illustrative example, two replicas made of a succession of two tubes are compared, one in which the two tubes share the same axis and one in which their axes differ. These measurements are compared with the plane wave acoustic theory and with Finite Element Method (FEM) simulations.

The experimental setup used to perform the measurements is presented first. Then, the simple acoustic theory for a one tube and a two tube geometry is introduced. Afterwards, the transfer function estimation method is explained and the experimental results are presented and discussed. An estimation of the reflection coefficient for a one tube replica is described and the experimental data are compared with theory. Finally, the surface measurements confirm the observations and the assumptions made from the transfer function measurements.

2. Experimental setup

2.1 Setup

The acoustic pressure inside a vocal tract replica is measured with a dedicated experimental setup. It is composed of an acoustic source, a probe microphone (B&K type 4182 with a 200 mm long and 1 mm wide probe) moved by a 3-axis positioning system (OWIS PS35), and an anechoic room [4] (1.92 × 1.95 × 1.99 m, Vol = 7.45 m³) (see figure 2a). A BNC board connects the electrical signals to a computer containing a data acquisition card (PCI-MIO 16XE) (see figure 1). Data acquisition is controlled using Labview. The positioner is used to measure the pressure at various locations inside and outside of the vocal tract replica.
The source generates sinusoidal signals at given frequencies. The setup is placed in the anechoic room and acoustic foam is placed under the screen to avoid reflections. The temperature is measured for each experiment with a thermometer placed in the anechoic room. For a given position of the probe or frequency of the source, the pressure and the source input voltage signals are recorded for about 1 s. This makes it possible to compute the modulus and phase of the pressure.

Figure 1: Experimental setup.

Figure 2: Experimental setup. (a) 3D positioning system inside the anechoic room. (b) Connection between the source and the replica.

2.2 Vocal tract replicas and acoustic excitation

Six different vocal tract replicas, of increasing geometrical complexity, have been constructed for this study:
• a simple uniform straight tube of dimensions 29.5 × 170 mm (see figure 3a).
• a two-tube cascade with dimensions 14 × 85 mm for the first tube and 29.5 × 85 mm for the second one. Two geometrical configurations have been considered. In the first one, the centered configuration, both tubes share the same axis of revolution, while in the second one, the eccentered configuration, the axes of revolution are different (see figures 3b and 3c).
• three 3D printings of vocal tract geometries corresponding to the vowels /a/, /i/ and /u/, realised in rigid acrylic. These geometries have been taken from the literature [3] (see figure 4).

The acoustic excitation of the replicas was realised using compression chambers. In order to cover the full frequency range of interest for speech (i.e.
up to 10 kHz), two compression chambers were used: a Monacor KU-916T for the lowest frequency range (50 Hz - 2 kHz) and an Eminence PSD:2002S-8 for the highest frequencies. To prevent acoustic interference, the sound source was located outside the anechoic room. The connection between the sound source and the vocal tract replicas was made using an adaptation part (see figure 2b). The acoustic excitation of the replicas was therefore radiated through a hole of 1 mm diameter.

Figure 3: One and two tubes replicas. (a) One tube. (b) Two centric tubes. (c) Two eccentric tubes.

Figure 4: Vowels replicas. (a) Vowel /a/ replica. (b) Vowel /i/ cut view.

2.3 Acoustic pressure modulus and phase estimation

Despite all the care taken during the measurements, the recorded signals can be altered by spurious phenomena. The presence of noise cannot be excluded and harmonic distortion of the acoustic source cannot be avoided. In addition, the signals can have a continuous (DC) component, and transients can be present when the frequency changes and when the source starts to generate sound. To avoid all of these artefacts, careful signal processing is performed. The first 200 ms are removed to avoid transient phenomena, then the Fourier transform of the signal is computed. The spectrum amplitude is normalised by multiplying it by 2/N (N being the number of samples of the analysed signal). The Fourier transform is computed using zero-padding to get a frequency resolution finer than 0.1 Hz. The maximum of the spectrum is searched for in a frequency band centered on the expected signal frequency. The frequency and the phase corresponding to this maximum are then extracted.
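The processing steps described so far (transient removal, zero-padded Fourier transform, 2/N normalisation and peak search around the expected frequency) could be sketched as follows. This is a minimal illustration, not the code used for the measurements: the function name `estimate_tone`, the sampling rate and the width of the search band are assumptions, since the text does not specify them.

```python
import numpy as np

def estimate_tone(signal, fs, f_expected, band=20.0, skip=0.2, df_max=0.1):
    """Return (amplitude, frequency, phase) of the sine closest to f_expected.

    fs, band (Hz) and the function name are illustrative assumptions;
    skip (s) and df_max (Hz) follow the values quoted in the text.
    """
    x = np.asarray(signal, dtype=float)[int(round(skip * fs)):]  # drop the transient
    n = x.size
    # Zero-pad to a power of two giving a bin spacing finer than df_max.
    nfft = 1 << int(np.ceil(np.log2(max(n, fs / df_max + 1))))
    spectrum = np.fft.rfft(x, nfft) * 2.0 / n       # 2/N amplitude normalisation
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    # Search the maximum only in a band centred on the expected frequency,
    # which keeps the DC component and the harmonics out of the estimate.
    in_band = np.flatnonzero(np.abs(freqs - f_expected) <= band / 2.0)
    k = in_band[np.argmax(np.abs(spectrum[in_band]))]
    return np.abs(spectrum[k]), freqs[k], np.angle(spectrum[k])

# Synthetic 1 s record: a tone with a DC offset (made-up values, not measured data).
fs = 8000.0                                    # assumed sampling rate (Hz)
t = np.arange(int(fs)) / fs
pressure = 0.5 * np.cos(2 * np.pi * 1000.3 * t + 0.7) + 0.1
amp, freq, phase = estimate_tone(pressure, fs, f_expected=1000.0)
```

The phase returned here is relative to the start of the analysed window, so it is only meaningful as a difference between two signals processed identically, for instance the pressure and the source input voltage recorded at the same time.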
A parabolic interpolation is performed on the 3 points closest to the maximum to get a better estimate of the amplitude (the point having the maximal amplitude is not necessarily the maximum of the Fourier transform).

3. Theory

3.1 Plane wave theory

3.1.1 One tube

The wave field produced by a source located in a single uniform tube can be described at any abscissa $x$ by an acoustic pressure and a flow of the following form (a time factor $e^{-j\omega t}$ is understood throughout this part):

$$P = A\left(e^{-jkx} + R\,e^{jkx}\right), \qquad U = Z_c^{-1} A\left(e^{-jkx} - R\,e^{jkx}\right) \tag{1}$$

where $A$ is an amplitude factor, $R$ is a reflection coefficient, $k = 2\pi f/c$ is the wave number ($f$ being the frequency and $c$ the sound speed) and $Z_c = \rho c/S$ is the characteristic impedance ($\rho$ being the air density and $S$ the tube cross section area). At the exit of the tube (abscissa $x = 0$) the radiation impedance $Z_r$ gives the following boundary condition:

$$\frac{P(0)}{U(0)} = Z_r = \frac{1 + R}{Z_c^{-1}\,(1 - R)} \tag{2}$$

This equation gives the expression of the reflection coefficient $R$:

$$R = \frac{Z_r/Z_c - 1}{Z_r/Z_c + 1} \tag{3}$$

At the source ($x = x_s$), the following condition is satisfied:

$$U(x_s) = U_s \tag{4}$$

where $U_s$ is the amplitude of the acoustic source. This leads to the expression of $A$:

$$A = \frac{U_s Z_c}{e^{-jkx_s} - R\,e^{jkx_s}} \tag{5}$$

The viscothermal losses can be taken into account using a complex wavenumber.

3.1.2 Two tubes

The wave field produced by a source located in a segmented two tube waveguide can be described at any abscissa $x$ by the following equations:

$$\begin{aligned}
P_1 &= A_1\left(e^{-jkx} + R_1 e^{jkx}\right), & U_1 &= Z_{c1}^{-1} A_1\left(e^{-jkx} - R_1 e^{jkx}\right),\\
P_2 &= A_2\left(e^{-jkx} + R_2 e^{jkx}\right), & U_2 &= Z_{c2}^{-1} A_2\left(e^{-jkx} - R_2 e^{jkx}\right)
\end{aligned} \tag{6}$$

where index 1 refers to tube 1 and index 2 to tube 2 (see figure 5).

Figure 5: Diagram of a junction between two simple tubes.

The continuity of pressure and the conservation of the acoustic flow at the junction (abscissa $x = 0$) give two equations:

$$A_1\,[1 + R_1] = A_2\,[1 + R_2] \tag{7a}$$

$$S_1 A_1\,[1 - R_1] = S_2 A_2\,[1 - R_2] \tag{7b}$$

where $S_1$ and $S_2$ are the cross section areas of tubes 1 and 2. Dividing (7b) by $S_1$ and adding the result to (7a) gives a relationship between $A_1$ and $A_2$:

$$A_1 = \frac{1}{2} A_2 \left[1 + R_2 + \frac{S_2}{S_1}(1 - R_2)\right] = C A_2 \tag{8}$$

Dividing (7a) by (7b) leads to a relationship between $R_1$ and $R_2$ (which is equivalent to writing the equality of the impedances on both sides of the junction):

$$R_1 = \frac{S_1(1 + R_2) - S_2(1 - R_2)}{S_1(1 + R_2) + S_2(1 - R_2)} \tag{9}$$

The reflection coefficient $R_2$ can be found from the boundary condition at the exit:

$$P_2(l_2) = Z_r\,U_2(l_2) \tag{10}$$

Thus:

$$R_2 = e^{-2jkl_2}\,\frac{Z_r/Z_{c2} - 1}{Z_r/Z_{c2} + 1} \tag{11}$$

At the source, assuming that the source is located inside tube 1, the following condition is satisfied:

$$U_1(x_s) = Z_{c1}^{-1} A_1\left(e^{-jkx_s} - R_1 e^{jkx_s}\right) = U_s \tag{12}$$

where $U_s$ is the amplitude of the source flow.
This leads to the following expression for the amplitude factor $A_1$:

$$A_1 = \frac{U_s Z_{c1}}{e^{-jkx_s} - R_1 e^{jkx_s}} \tag{13}$$

If the source is located inside tube 2, the following relationship holds:

$$A_2 = \frac{U_s Z_{c2}}{e^{-jkx_s} - R_2 e^{jkx_s}} \tag{14}$$

4. Transfer function measurement

4.1 Transfer function measurement method

To measure the transfer function between two points inside a vocal tract replica, a stepped frequency sweep method is used. A sinusoidal signal is generated for a fixed amount of time at each frequency at which the transfer function value is wanted. A single microphone is used, so that no absolute calibration is necessary. The measurement is made in two stages. The pressure $s_a$ is first measured for each frequency at the first point, of coordinate $a$, and then the pressure $s_b$ is measured at the second point, of coordinate $b$ (see figure 6). During each measurement the supply voltage $s_0$ is recorded at the same time in order to have a phase reference.

Figure 6: Transfer function measurement method.

The acquired signals are therefore:

$$\begin{aligned}
s_0 &= A_0 e^{j\varphi_0} \quad\text{and}\quad s_a = A_a e^{j\varphi_a},\\
s_0 &= A_0 e^{j\varphi_0} \quad\text{and}\quad s_b = A_b e^{j\varphi_b}
\end{aligned} \tag{15}$$

Both transfer functions $H_{0a}$ and $H_{0b}$ between the supply voltage and the pressure at the measurement points are then estimated. The modulus is obtained by dividing the amplitude of the signal measured by the microphone by the supply voltage amplitude. The phase is obtained as the phase shift between the signal measured by the microphone and the supply voltage. The transfer functions $H_{0a}$ and $H_{0b}$ are thus:

$$H_{0a} = \frac{A_a}{A_0}\,e^{j(\varphi_a - \varphi_0)}, \qquad H_{0b} = \frac{A_b}{A_0}\,e^{j(\varphi_b - \varphi_0)} \tag{16}$$

The transfer function $H_{ab}$ between the measurement points $a$ and $b$ is obtained as the ratio $H_{0b}/H_{0a}$.
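As a sketch of this two-stage procedure, the amplitudes and phases extracted from each recording can be combined into complex transfer functions as in equation (16). The function and variable names below are hypothetical, and the numerical values are made up for illustration only.

```python
import cmath
import math

def transfer_to_point(a_point, phi_point, a_ref, phi_ref):
    """H_0x of equation (16): microphone signal relative to the supply voltage."""
    return (a_point / a_ref) * cmath.exp(1j * (phi_point - phi_ref))

def transfer_between(h_0a, h_0b):
    """H_ab as the ratio H_0b / H_0a; factors common to both measurements cancel."""
    return h_0b / h_0a

# Illustrative amplitudes and phases (radians), not measured data.
h_0a = transfer_to_point(a_point=0.20, phi_point=1.10, a_ref=2.0, phi_ref=0.30)
h_0b = transfer_to_point(a_point=0.05, phi_point=-0.40, a_ref=2.0, phi_ref=0.30)
h_ab = transfer_between(h_0a, h_0b)

gain_db = 20.0 * math.log10(abs(h_ab))        # modulus in dB
phase_deg = math.degrees(cmath.phase(h_ab))   # phase in degrees, wrapped to (-180, 180]
```

Note that `cmath.phase` returns a wrapped phase, so a phase curve plotted over a wide frequency range would need unwrapping first.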
The transfer function $H_{0a}$ corresponds to the product of the transfer functions of the acoustic source, the propagation of sound from the source to the point $x_a$, the probe, the microphone and the microphone conditioner. If the experimental conditions are exactly the same for the measurement of $H_{0a}$ and $H_{0b}$, the transfer function $H_{0b}$ is the product of $H_{0a}$ and the wanted transfer function $H_{ab}$. Thus we have:

$$H_{ab} = \frac{H_{0b}}{H_{0a}} \tag{17}$$

The transfer function of the whole measurement system is thus eliminated by this computation. Although somewhat time-consuming, this method gives good-quality results because enough energy is supplied at each frequency to make a good measurement.

4.2 Problems encountered when trying to perform measurements near the source

Transfer function estimation was first performed between the entrance and the exit of the one tube replica. The measurement points were very close to the communication hole between the source and the tube, and at the exit of the tube. The frequency range was chosen below the cutoff frequency of the first transverse mode (about 7000 Hz), so the planar mode was expected to be predominant. However, the transfer functions obtained were not in agreement with the plane wave theory.

Figure 7: Pressure field (|P| in dB, plotted against x (m) and y (m)) measured on a 20 mm × 10 mm surface containing the tube axis just in front of the communication hole (located at y = 0 and x = 0) for a source frequency of 2548 Hz. Non planar evanescent modes due to radiation of the source through the communication hole inside the tube can be seen.
These differences are due to the fact that evanescent non planar modes are not negligible at the tube extremities. A pressure measurement on a surface (20 mm × 10 mm) perpendicular to the tube axis just in front of the communication hole shows that the plane wave assumption is not valid at this place (see figure 7). One can see that, close to the source, the pressure perturbation can be significant over a short distance. Even if the theory takes evanescent non planar modes into account, errors due to probe location uncertainty remain critical. The neighborhood of the communication hole has therefore been avoided for transfer function estimation.

4.3 Measured transfer functions

4.3.1 One tube replica

Three transfer functions have been measured with the previously described setup and method. A 160 W Sphynx SP-DYN-PRO2 acoustic source and a type 4182 B&K microphone with a 200 mm long and 1 mm wide probe have been used. The duct used was 170 mm long and had an internal diameter of 30 mm. It ended in a 300 mm wide and 400 mm long screen. The pressure has been measured at 3 points labeled 1, 2 and 3, respectively located at 120 mm, 80 mm and 40 mm from the duct entrance (see figure 8), at frequencies varying from 2 kHz to 10 kHz in steps of 50 Hz. The modulus and phase of the transfer function measured between points 1 and 3 are presented in figure 9. The theoretical transfer functions have been computed with a plane wave theory assuming that the duct ends in an infinite screen. Viscothermal losses are taken into account. The experimental results show a good agreement with theory.

Figure 8: Measurement points locations in the one tube replica.
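For reference, the plane wave transfer function between two points of a uniform tube follows directly from equation (1), since the amplitude factor A cancels in the ratio. The sketch below is a lossless illustration under stated assumptions: an ideal open end (R = -1) stands in for the infinite-screen radiation impedance and viscothermal losses are ignored, so it does not reproduce the theoretical curve of figure 9. The sound speed and the function names are also assumptions.

```python
import cmath
import math

def pressure(x, k, refl, amplitude=1.0):
    """Equation (1): P = A (exp(-jkx) + R exp(jkx)), with the tube exit at x = 0."""
    return amplitude * (cmath.exp(-1j * k * x) + refl * cmath.exp(1j * k * x))

def plane_wave_transfer(x_a, x_b, freq, refl, c=343.0):
    """Transfer function P(x_b) / P(x_a); c = 343 m/s is an assumed sound speed."""
    k = 2.0 * math.pi * freq / c
    return pressure(x_b, k, refl) / pressure(x_a, k, refl)

# Points 1 and 3 lie 120 mm and 40 mm from the entrance of the 170 mm duct,
# i.e. 50 mm and 130 mm from the exit. For R = -1 the result does not depend
# on the sign chosen for the abscissa; for a general R it would.
x1, x3 = 0.050, 0.130
h13 = plane_wave_transfer(x1, x3, freq=3000.0, refl=-1.0)  # ideal open end (assumption)
level_db = 20.0 * math.log10(abs(h13))
```

Replacing `refl` by a frequency-dependent value computed from a radiation impedance with equation (3), and `k` by a complex wavenumber, would bring the sketch closer to the theory used in the text.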
Figure 9: Modulus (|H13| in dB) and phase (φ in degrees) of the transfer function, plotted against frequency from 2000 Hz to 10000 Hz, between two points located at 120 mm and 40 mm from the entrance of a duct which is 170 mm long and has an internal diameter of 30 mm. The dots have been obtained by measurement and the line is computed from the theory.

4.3.2 Two tubes replica

Six transfer functions have been measured on both the centric and the eccentric two tube replicas. The measurements have been performed in two stages because the source used for generating high frequencies (above 2000 Hz) cannot be used for low frequencies. The transfer functions have therefore first been measured between 2000 Hz and 10000 Hz with a high frequency source (Eminence PSD:2002S-8) and then between 100 Hz and 2000 Hz with a low frequency source (Monacor KU-916T). The pressure has been measured at 4 points labeled 1, 2, 3 and 4 (see figures 10 and 11). The transfer functions between these points have then been computed. As an example, the one between points 1 and 2 is presented in figure 12. As one can see, the transfer functions of the two configurations are quite similar at low frequency (up to about 5000 Hz). This is no longer the case at high frequency. The most noticeable difference is the presence of maxima (for example at 7220 Hz and 7910 Hz) and minima (for example at 7060 Hz and 7500 Hz) above 7000 Hz in the eccentric case, which do not appear in the centric case. This difference is due to the fact that the non planar propagation modes are excited in the eccentric configuration, whereas they are almost non-existent in the centric one. When the frequency is higher than the cutoff frequency of the non planar modes, these modes can propagate and generate resonances other than the plane wave resonances. This explains the additional maxima in the transfer function.
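The cutoff frequency mentioned above can be estimated from the standard result for a rigid-walled circular duct, in which the first non planar mode starts to propagate at f_c = j'_11 c / (π d), with j'_11 ≈ 1.8412 the first zero of the derivative of the Bessel function J_1 and d the duct diameter. The sketch below applies this formula to the two tube diameters of the replica. The sound speed of 343 m/s is an assumption (the measured temperatures are not reported here), and the ideal-duct formula is only indicative for a real replica.

```python
import math

J11_PRIME = 1.8412  # first zero of J1'(x): first non planar mode of a circular duct

def first_cut_on_frequency(diameter, c=343.0):
    """Cut-on frequency (Hz) of the first non planar mode of a rigid circular duct."""
    return J11_PRIME * c / (math.pi * diameter)

f_wide = first_cut_on_frequency(0.0295)    # 29.5 mm tube
f_narrow = first_cut_on_frequency(0.014)   # 14 mm tube
```

Under these assumptions the wide tube gives roughly 6.8 kHz, in line with the additional maxima and minima reported above 7000 Hz, while the narrow tube gives a value above the 10 kHz upper limit of the measurements.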
The results obtained from FEM simulation are in agreement with these measurements and confirm this difference of behaviour between both configurations. For a comparison between experiment and FEM simulations the reader is referred to the WP5 deliverable (D5.1 Simulation and Validation of VT sound with static geometries).

Figure 10: Dimensions and measurement points locations for the two tubes centric replica.

Figure 11: Dimensions and measurement points locations for the two tubes eccentric replica.

Figure 12: Modulus (|H12| in dB) and phase (φ(H12) in degrees) of the transfer function, plotted against frequency from 0 Hz to 10000 Hz, between points 1 and 2 (located at 30 mm and 90 mm from the source) of the centric and eccentric two tube replicas (see figures 10 and 11).

4.3.3 Vowels

Transfer function measurements have been performed on 3D printed vocal tract replicas. These replicas are concatenations of cylinders corresponding respectively to the vowels /a/, /i/ and /u/ (the area functions have been taken from [3]). All the cylinders share the same central axis. The pressure has been measured at three locations inside and outside of these replicas (see figure 13). Three transfer functions between these points have then been computed in the same way as for the two tube replicas. The acoustic sources used are also the same.

Figure 13: Measurement points locations in vowels replicas.
Three examples corresponding to the 3 vowels are displayed in figure 14. No significant maxima that could be attributed to transverse propagation modes are noticeable at high frequency, in contrast to what was observed for the eccentric two tube replica. This result is expected, since the axisymmetric configuration chosen does not allow the first transverse propagation modes to be generated. This will be checked with an eccentric replica of vowel /a/, which will soon be available for measurements.

Figure 14: Modulus (|H12| in dB) and phase (φ(H12) in degrees) of the transfer function, plotted against frequency from 2000 Hz to 10000 Hz, between points 1 and 2 (located at 100 mm and 40 mm from the exit) of the vowel /a/, /i/ and /u/ replicas (see figure 13).

5. Reflection coefficient estimation

5.1 The two microphone method

The measurement of transfer functions between two points inside a tube makes it possible to estimate the reflection coefficient. This method is called the two-microphone method [1] [2]. Considering that the pressure at each point is the sum of an incident wave and a reflected one, it can be expressed at points 1 and 2 by the following equations:

$$P_1 = e^{-jkx_1} + R\,e^{jkx_1}, \qquad P_2 = e^{-jkx_2} + R\,e^{jkx_2} \tag{18}$$

where $R$ is the reflection coefficient at the end of the tube. The transfer function between the two points is $H_{12} = P_2/P_1$. Replacing $P_1$ and $P_2$ by their expressions in (18) provides an expression of $R$:

$$R = \frac{e^{-jkx_2} - H_{12}\,e^{-jkx_1}}{H_{12}\,e^{jkx_1} - e^{jkx_2}} \tag{19}$$

This reflection coefficient can be compared to the theoretical one obtained from the radiation impedance $Z_R$, given by expression (20).
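Equation (19) can be checked numerically by synthesising the pressures of equation (18) for a known reflection coefficient and recovering it from their ratio. The following sketch does so; the function names, the sound speed and the numerical values are illustrative assumptions, not measured data.

```python
import cmath
import math

def reflection_from_transfer(h12, k, x1, x2):
    """Equation (19): reflection coefficient from H12 = P2 / P1."""
    numerator = cmath.exp(-1j * k * x2) - h12 * cmath.exp(-1j * k * x1)
    denominator = h12 * cmath.exp(1j * k * x1) - cmath.exp(1j * k * x2)
    return numerator / denominator

def synthetic_pressure(x, k, refl):
    """Equation (18): incident plus reflected plane wave at abscissa x."""
    return cmath.exp(-1j * k * x) + refl * cmath.exp(1j * k * x)

k = 2.0 * math.pi * 3000.0 / 343.0    # wavenumber at 3 kHz, c = 343 m/s assumed
x1, x2 = 0.050, 0.130                 # microphone abscissas (m), illustrative
r_true = -0.8 * cmath.exp(-0.3j)      # arbitrary reflection coefficient
h12 = synthetic_pressure(x2, k, r_true) / synthetic_pressure(x1, k, r_true)
r_est = reflection_from_transfer(h12, k, x1, x2)   # should match r_true
```

Substituting (18) into the denominator of (19) shows that it is proportional to sin(k(x2 - x1)), so the estimate becomes ill-conditioned when the spacing between the two points approaches a multiple of half a wavelength. With a spacing of 80 mm and a radius of 15 mm, for instance, this corresponds to ka close to multiples of about 0.59.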
$$R = \frac{Z_R/Z_c - 1}{Z_R/Z_c + 1} \tag{20}$$

A common way of representing the reflection coefficient is to plot its modulus and the length correction corresponding to the phase shift induced by the reflection. This length correction $\delta$ is given by the following relation:

$$R = -|R|\,e^{-j2k\delta} \tag{21}$$

5.2 Reflection coefficient estimation from experimental data

The transfer functions measured in the one tube replica have been used to estimate its reflection coefficient. The theoretical reflection coefficient has been estimated in two different ways:
• using the theoretical expression (20);
• by computing the pressure at points 1, 2 and 3 with expression (1), taking viscothermal losses into account. The same transfer functions as in the experiment are then computed and the reflection coefficient is deduced from equation (19).

Both the theoretical values and the experimental values are plotted in figure 15. The experimental values are close to the theoretical ones except for some values of ka (0.6, 1.2, 1.7 and 2.3). The theoretical values obtained when taking viscothermal losses into account reduce these differences, indicating that viscothermal losses are a plausible explanation of the differences between experiments and theory.

Figure 15: Modulus and ratio of length correction to radius of the reflection coefficient estimated from the transfer function between points 1 and 3 of the one tube replica (see figure 8), plotted against the ka product. Curves: without viscothermal (VT) losses, with VT losses, experiments.

6. Surface measurement

Measurements on surfaces at a given frequency have been performed to identify the kind of modes involved in the propagation inside the replicas. The frequencies have been chosen as close as possible to resonances and anti-resonances.

6.1 One tube replica

Measurements performed on the one tube replica show that at low frequency (up to about 5000 Hz) the acoustic field inside the replica (see figure 16) behaves as one-dimensional (plane waves), except at the ends of the tube where evanescent non planar modes exist. At higher frequency, non planar propagation modes can be observed (see figure 17). The mode observed in figure 17 is not expected for a perfectly axisymmetric geometry. It exists experimentally because the replica is not perfectly axisymmetric. This shows that modelling a vocal tract as a concatenation of cylinders sharing the same axis does not take this kind of mode into account, whereas a real vocal tract is not axisymmetric and this kind of mode is expected to exist.

6.2 Two tube replica

The same kind of measurement has been performed on the two tube replicas. As for the one tube replica, one can see that at low frequency, in both the centric and eccentric cases, the plane wave theory describes the internal wave field well (see figures 18 and 20), except where evanescent non planar modes are present. At high frequency one can see the effect of eccentricity.
At the same frequency (7400 Hz), although non planar modes can be detected in the centric configuration, one can see that these higher order modes are predominant in the eccentric case. This result illustrates that, in a vocal tract geometry made of concatenated sections, the eccentricity of each section has an influence. This difference also confirms the assumptions made from the transfer functions.

Figure 16: Amplitude of acoustic pressure (P in dB, plotted against x (m) and y (m)) measured inside and outside of a one tube vocal tract replica at 3340 Hz.

Figure 17: Amplitude of acoustic pressure (P in dB, plotted against x (m) and y (m)) measured inside and outside of a one tube vocal tract replica at 6810 Hz.

Figure 18: Amplitude of acoustic pressure (P in dB, plotted against x (m) and y (m)) measured inside and outside of a two centric tubes replica at 3060 Hz.

Figure 19: Amplitude of acoustic pressure (P in dB, plotted against x (m) and y (m)) measured inside and outside of a two centric tubes replica at 7400 Hz.
At the zeros observed at high frequency, the wave field is dominated by the non planar modes in the eccentric case, whereas in the centric case the plane waves remain predominant.

Figure 20: Amplitude of acoustic pressure (P in dB, plotted against x (m) and y (m)) measured inside and outside of a two eccentric tubes replica at 2550 Hz.

Figure 21: Amplitude of acoustic pressure (P in dB, plotted against x (m) and y (m)) measured inside and outside of a two eccentric tubes replica at 7400 Hz.

The same pressure patterns are obtained with FEM simulations, with the difference that for the centric case at high frequency no transverse mode can be seen, since the mesh used is perfectly symmetric. For a comparison between experiment and FEM simulations the reader is referred to the WP5 deliverable (D5.1 Simulation and Validation of VT sound with static geometries).

7. Conclusion

The main challenge of this first year was to design, build and use a specific experimental set-up in order to measure accurately and extensively the acoustics of vocal tract replicas. A step-by-step procedure, starting with simple academic geometries, made it possible to optimise the set-up as well as the associated signal processing techniques. In particular, by comparing the measured data with theoretical expectations, some spurious experimental artefacts have been suppressed or avoided.
With this work completed, reliable and meaningful data could be shared with WP5 in order to validate the numerical simulations and to investigate the possible origin of some discrepancies. This work also allowed us to illustrate some important features that are seldom mentioned in the speech literature. The occurrence of higher acoustical modes is a spectacular example that affects both the internal and the radiated sound field. A closer study of this effect further underlines that the greatest care must be taken with the three-dimensional geometrical description of the vocal tract.

The experimental set-up has been successfully extended to deformable vocal tract replicas. During the practical work of Boris Mondet (Université Joseph Fourier), a single deformable tube was constrained by two plates driven by a stepper motor in order to generate and control a dynamic constriction. This will allow us to simulate slight to large vocal tract movements (articulation), in particular in view of the goals of years 2 and 3 of the present project.

Publications :
• Rémi Blandin, Xavier Pelorson, Annemie Van Hirtum, Rafaël Laboissière, Oriol Guasch and Marc Arnela (2014), "Effet des modes de propagation non plan dans les guides d'ondes à section variable", accepted at the 12th French Congress on Acoustics, to appear in proceedings.
• Xavier Pelorson, Annemie Van Hirtum, Boris Mondet, Oriol Guasch and Marc Arnela (2013), "Three-dimensional vocal tract acoustics", Acoustics 2013, November 10-15, New Delhi, India.
• Boris Mondet (2013), "Comportement acoustique du conduit vocal humain", rapport de stage, Juin-Juillet 2013, Université Joseph Fourier, Département Licence Sciences et Technologies.

REFERENCES

1 M Åbom and H Bodén. Error analysis of two-microphone measurements in ducts with flow. The Journal of the Acoustical Society of America, 83:2429, 1988.
2 AF Seybert and DF Ross.
Experimental determination of acoustic properties using a two-microphone random-excitation technique. The Journal of the Acoustical Society of America, 61:1362, 1977.
3 BH Story. Comparison of magnetic resonance imaging-based vocal tract area functions obtained from the same speaker in 1994 and 2002. The Journal of the Acoustical Society of America, 123:327, 2008.
4 A Van Hirtum and Y Fujiso. Insulation room for aero-acoustic experiments at moderate Reynolds and low Mach numbers. Applied Acoustics, 73(1):72–77, 2012.

A. Article: three-dimensional vocal tract acoustics

THREE-DIMENSIONAL VOCAL TRACT ACOUSTICS

Xavier Pelorson, Annemie Van Hirtum, Boris Mondet
Gipsa-Lab, Département Parole et Cognition, UMR 5216 CNRS/INPG/UJF/Université Stendhal, 11 rue des Mathématiques, F-38420 Saint Martin d'Hères, France

Oriol Guasch, Marc Arnela
GTM Grup de recerca en Tecnologies Mèdia, La Salle, Universitat Ramon Llull, C/Quatre Camins 2, Barcelona 08022, Catalonia, Spain
e-mail: [email protected]

At present, the theoretical models used in speech synthesis as well as in speech analysis (such as inverse filtering, for instance) rely on low-frequency acoustic propagation models (one-dimensional approximation) through one-dimensional vocal tract approximations (using an area function). The one-dimensional approximation can be justified, to a certain extent, in the case of voiced sounds due to the low-frequency behavior of the glottal source and due to its position inside the vocal tract.
This is not the case for plosives and fricatives, for which one can expect the generation and the propagation of higher acoustical modes. These higher modes not only become predominant inside a resonator but also have a spectacular effect on the radiated sound in terms of directivity. Based on anatomical considerations, one can estimate the first cut-on frequency of these higher acoustical modes to lie around 4-5 kHz, which is in the middle of a typical speech spectrum and close to the maximum sensitivity of our ears. Perceptual effects of these higher acoustical modes can therefore be expected to be considerable. A theoretical model based on a modal approach is then presented as an alternative to plane-wave models. It is shown that, using this theoretical model, the solution of the wave equation is analytical in the case of a simple geometry and can be extended numerically to the case of more complex resonator shapes (closer to the human vocal tract) by a mode-matching procedure. Measurements of the acoustic pressure inside and radiated from replicas of the vocal tract, using a sound probe driven by a micrometric 3-D stage positioning system, will be presented and discussed. The experimental data will then be compared with the theoretical predictions and with numerical simulations using the Finite Element Method. Simple geometries, using concatenated tubes, will first be considered in order to illustrate three-dimensional effects. Different vowel replicas, obtained by 3-D printing from MRI data, will then be considered.

ACOUSTIS2013NEWDELHI, New Delhi, India, November 10-15, 2013

1. Introduction

Classical textbooks on physical models of speech production [1], [2] describe the propagation of sound inside the vocal tract on the basis of a plane wave decomposition. Although the same textbooks clearly indicate that this description relies on a low-frequency assumption, the limits of validity of the underlying theory are not clearly established.
As our knowledge of the sound sources and of the three-dimensional vocal tract geometry [3] increases in complexity and in accuracy, in-vivo measurements and computer simulations clearly reveal spectacular departures from plane wave theory even at moderate frequencies (of the order of 5 kHz) [4]. As a plausible explanation for these departures, we first present a theoretical investigation of sound propagation inside a simplified vocal-tract-like waveguide, focusing in particular upon the three-dimensional effects due to the presence of higher acoustical modes. Results obtained using numerical simulations and measurements on replicas of the vocal tract will then be presented and discussed.

2. Theoretical aspects

We first consider the case of a uniform waveguide. Let (O, x1, x2, x3) be any coordinate system, x3 being parallel to the waveguide axis. In the frequency domain, a general solution of the wave equation for the acoustic pressure p may be sought in the form:

$$p = \sum_{m,n=0}^{\infty} \psi_{mn}(x_1,x_2)\left[A_{mn}\,e^{-j\gamma_{mn}x_3} + B_{mn}\,e^{j\gamma_{mn}x_3}\right] = \sum_{m,n=0}^{\infty} \psi_{mn}(x_1,x_2)\,P_{mn} \tag{1}$$

where $A_{mn}$ and $B_{mn}$ are two constants depending on the end conditions. The (m,n) mode wave number $\gamma_{mn}$ as well as the eigenfunctions $\psi_{mn}$ depend on the geometry of the waveguide, on the hygrometry and on the boundary conditions at the wall. When the coordinate system (x1, x2) is separable, the eigenfunctions $\psi_{mn}$ can be written in the form:

$$\psi_{mn}(x_1,x_2) = N_{mn}\,f_m(x_1)\,g_n(x_2) \tag{2}$$

with $f_m$ and $g_n$ two orthogonal functions and $N_{mn}$ a constant. $k_m$ and $k_n$ are the eigenvalues associated with the eigenfunctions $\psi_{mn}$. The dispersion relationship provides:

$$\gamma_{mn}^2 = k^2 - k_m^2 - k_n^2 \tag{3}$$

Equation (3) shows that a given acoustical mode (m,n) will be propagating only if $k^2 > k_m^2 + k_n^2$. As an example, writing $k = 2\pi f/c$ and $k_m = 2\pi f_m/c$, one sees that the (m,0) mode will be propagating only if the excitation frequency f is higher than $f_m$. $f_m$ is called the cut-on frequency of the mode. Mode (0,0) is always propagating because $f_0 = 0$. This is the so-called plane wave.
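As a rough numerical illustration of the cut-on condition, the following Python sketch evaluates cut-on frequencies for two idealised rigid-walled cross-sections. The rectangular formula follows from taking k_m = mπ/L_x and k_n = nπ/L_y; the circular case uses the first zero of the derivative of the Bessel function J_1 (about 1.8412), a standard result that is not stated in the text. The speed of sound (343 m/s), the 30 mm dimensions and all function names are assumptions chosen for illustration only.

```python
import math

C_AIR = 343.0  # assumed speed of sound in air (m/s), roughly 20 degrees C


def cut_on_rectangular(m, n, width, height, c=C_AIR):
    """Cut-on frequency (Hz) of mode (m, n) in a rigid-walled rectangular duct."""
    return 0.5 * c * math.sqrt((m / width) ** 2 + (n / height) ** 2)


def cut_on_first_circular(radius, c=C_AIR):
    """Cut-on frequency (Hz) of the first non-planar mode in a rigid circular duct."""
    first_zero_of_j1_prime = 1.8412  # standard tabulated value
    return first_zero_of_j1_prime * c / (2.0 * math.pi * radius)


def is_propagating(f, f_cut_on):
    """A mode propagates only if the excitation frequency exceeds its cut-on."""
    return f > f_cut_on


# Hypothetical 30 mm x 30 mm square duct and 30 mm diameter circular duct.
f_rect = cut_on_rectangular(1, 0, 0.030, 0.030)
f_circ = cut_on_first_circular(0.015)

print(f"square duct, mode (1,0): {f_rect:.0f} Hz")
print(f"circular duct, first higher mode: {f_circ:.0f} Hz")
print("plane wave cut-on:", cut_on_rectangular(0, 0, 0.030, 0.030), "Hz")
```

Under these assumptions both cut-on frequencies fall between 5 and 7 kHz, the same order of magnitude as the range discussed in the text, while the plane wave has a cut-on frequency of zero and therefore always propagates.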
It is worth mentioning that the decomposition used in (2) is only possible when the waveguide geometry is compatible with the coordinate system (x1, x2). In practice, this corresponds to rectangular shapes (compatible with Cartesian coordinates) or elliptic shapes (compatible with prolate spheroidal coordinates). More complex shapes can however be assessed using approximate methods [5]. Lastly, viscous and thermal losses can be accounted for using a boundary layer approximation. A change of geometry may be described using a piecewise method as a succession of local discontinuities.

Figure 1: Change of section between two waveguides i and i+1 (sections i and i+1, of cross-sectional areas Si and Si+1, along the x3 axis).

Let $p^{(i)}$ and $p^{(i+1)}$ be the components of the acoustical pressure in section (i) and in section (i+1), respectively. Using modal decomposition one has:

$$p^{(i)} = \sum_{m,n=0}^{\infty} \psi_{mn}(x_1,x_2)\,P^{(i)}_{mn} \tag{4}$$

$$p^{(i+1)} = \sum_{p,q=0}^{\infty} \phi_{pq}(x_1,x_2)\,P^{(i+1)}_{pq} \tag{5}$$

where $\psi_{mn}$ (respectively $\phi_{pq}$) are the eigenfunctions associated with guide (i) (respectively (i+1)). Applying the continuity of pressure at the junction between the two guides, of cross-sections Si and Si+1, gives:

$$P^{(i)}_{mn} = \frac{1}{S_i}\sum_{p,q=0}^{\infty} P^{(i+1)}_{pq} \int_{S_i} \phi_{pq}(x_1,x_2)\,\psi^{*}_{mn}(x_1,x_2)\,dS \tag{6}$$

In a similar way, continuity of the velocity along x3 provides a second relationship between the velocity amplitudes $V^{(i+1)}_{pq}$ and $V^{(i)}_{mn}$:

$$V^{(i+1)}_{pq} = \frac{1}{S_{i+1}}\sum_{m,n=0}^{\infty} V^{(i)}_{mn} \int_{S_{i+1}} \phi^{*}_{pq}(x_1,x_2)\,\psi_{mn}(x_1,x_2)\,dS \tag{7}$$

For a vocal tract geometry discretized using N sections, equations (6) and (7) thus form a system of N equations with N+2 unknowns. The specific boundary conditions at both ends of the vocal tract (at section 1 and section N) provide the last two equations. Equation (6) already points out an important geometrical effect. If two sections share the same axis as in figure 1, because the first acoustical modes are antisymmetric, the resulting integral in (6) will always equal zero.
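The vanishing of the coupling integral for coaxial sections can be checked numerically on a simplified case. The sketch below is a one-dimensional analogue, not the circular geometry of the replicas: it assumes a wide rigid-walled duct of width L whose first transverse mode is cos(πx/L), and a narrow duct carrying only a plane wave (constant profile). The widths, the function name and the midpoint-rule integration are all assumptions made for illustration.

```python
import math


def overlap_with_first_mode(center, small_width, large_width, steps=2000):
    """Normalised overlap between a plane wave in the narrow duct and the first
    transverse mode cos(pi*x/L) of the wide duct, integrated over the narrow
    duct opening with the midpoint rule."""
    x_start = center - 0.5 * small_width
    dx = small_width / steps
    total = 0.0
    for i in range(steps):
        x = x_start + (i + 0.5) * dx
        total += math.cos(math.pi * x / large_width) * dx
    return total / small_width  # division plays the role of the 1/S_i factor


LARGE = 0.030   # hypothetical width of the wide duct (m)
SMALL = 0.0145  # hypothetical width of the narrow duct (m)

# Narrow duct on the axis of the wide duct: the antisymmetric mode averages out.
centered = overlap_with_first_mode(0.5 * LARGE, SMALL, LARGE)
# Narrow duct pushed against one wall: the mode is excited.
eccentric = overlap_with_first_mode(0.5 * SMALL, SMALL, LARGE)

print(f"centered:  {centered:.2e}")
print(f"eccentric: {eccentric:.4f}")
```

In this analogue the centered overlap is zero to numerical precision, whereas the eccentric one is of order one, so only the eccentric junction transfers energy from the plane wave to the first higher mode.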
This effect is illustrated in the synthetic example of two connected tubes. The first one is 85 mm long and its diameter is 14.5 mm, while the second one is also 85 mm long with a diameter of 30 mm. Figure 2 presents the calculated transfer functions (output acoustic pressure / glottal volume velocity) assuming, or not, that the tubes are centered.

Figure 2: Transfer function of a two-tube junction. Left: centered, Right: eccentered.

3. Experimental and Numerical Methods

3.1 Experimental set-up

The experimental set-up uses replicas of the vocal tract made of Plexiglas or of ABS printed using 3-D printers. The exit end of each replica is mounted inside a rigid plane baffle, while a compression chamber provides the excitation, through a 1 mm diameter hole, at the entrance. A sound pressure probe (Brüel & Kjær 4182) can be displaced inside and outside the replica using a stage positioning system (with an accuracy of 4 µm). All measurements were performed in a sound-insulated room.

3.2 Numerical simulations

To carry out the numerical simulations, the Finite Element Method (FEM) has been used to solve the acoustic wave equation in the time domain. In order to account for free-field propagation while keeping a computational domain of reasonable size, the latter has been surrounded with a Perfectly Matched Layer (PML), which avoids any spurious reflection at the domain boundaries. The PML formulation developed in [6] has been adapted to the FEM framework and the resulting modified wave equation has been solved using an explicit time evolving scheme (see [7] for details of the implemented formulation). Each simulated duct system or vowel exits at a rigid baffle with dimensions 0.25 m x 0.25 m. The baffle constitutes one surface of a rectangular volume of 0.25 m x 0.25 m x 0.1 m in size, which allows sound waves emanating from the tube system to propagate towards infinity.
As said, this volume is surrounded by a 0.1 m wide PML with a relative reflection coefficient of $10^{-4}$. With regard to the boundary conditions, a boundary admittance constant in frequency, µ = 0.0005, has been assigned at the duct walls to introduce some losses, and a sinusoid having the same frequency as that of the corresponding experimental test has been imposed at the duct entrance. The resulting computational domains have been meshed following the ten nodes per wavelength accuracy criterion [8]. Proper time step values have been chosen for each FEM mesh to fulfil a stability condition of the Courant-Friedrichs-Lewy type. The speed of sound has been computed using the temperature at which the experiments were performed. A numerical simulation lasting 25 ms has been carried out for each analyzed case, capturing the acoustic pressure within the tube and in the near field on a predefined grid with a spatial resolution of 0.002 m, to allow comparisons with experiments. The mean pressure at each grid point has been computed from the last 5 ms of the numerical simulation.

4. Results

4.1 Two-tubes

We present the FEM simulation of the two connected tubes configuration considered in section 2. The first tube is 85 mm long and its diameter is 14.5 mm, while the second one is also 85 mm long but its diameter is 30 mm. Two configurations are considered, one for which the two tubes are centered and another for which they are eccentered. The simulated transfer functions for both configurations are presented in figure 3.

Figure 3: FEM simulations of the transfer function of two connected tubes. Left curve: centered tubes, right curve: eccentered tubes.

This result confirms the theoretical expectation presented in figure 2. The influence of the relative position of the two tubes can be clearly seen and analyzed as an effect of higher acoustical modes.
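The discretisation rules quoted in section 3.2 (ten nodes per wavelength, a stability bound of the Courant-Friedrichs-Lewy type, a 25 ms run averaged over its last 5 ms) can be turned into rough orders of magnitude. The sketch below is illustrative only: the speed of sound, the CFL constant of 0.5, the 7400 Hz test frequency and the function names are assumptions, not values taken from the simulations, and the actual stability limit depends on the element type and time scheme used.

```python
C_AIR = 343.0  # assumed speed of sound (m/s)


def max_element_size(f_max, nodes_per_wavelength=10, c=C_AIR):
    """Largest mesh spacing (m) keeping about ten nodes per wavelength at f_max."""
    wavelength = c / f_max
    return wavelength / nodes_per_wavelength


def stable_time_step(h, cfl=0.5, c=C_AIR):
    """Time step (s) satisfying a CFL-type bound c*dt/h <= cfl (cfl is assumed)."""
    return cfl * h / c


f_test = 7400.0  # hypothetical excitation frequency (Hz)
h = max_element_size(f_test)
dt = stable_time_step(h)
periods_averaged = f_test * 0.005  # periods contained in the last 5 ms

print(f"element size  <= {h * 1e3:.2f} mm")
print(f"time step     <= {dt * 1e6:.2f} microseconds")
print(f"periods in the 5 ms averaging window: {periods_averaged:.0f}")
```

With these assumed values the mesh spacing is a few millimetres and the time step a few microseconds, so a 25 ms run requires several thousand explicit steps.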
4.2 Vowels

The 3-D geometry of several vowels [3] was used for both FEM simulations and acoustical measurements. Figure 4 presents an example of comparison between FEM simulations and measured data in the case of vowel /i/.

Figure 4: Left: FEM simulation of the acoustical pressure inside and outside an /i/ vocal tract. Right: comparison between the simulation and the measured data on the center line. Results for an excitation at 4500 Hz.

Figure 5: Comparison between FEM simulation and theoretical transfer function for vowel /a/.

As a last example, we present, in figure 5, a comparison between FEM simulations and theoretical prediction for the transfer function of vowel /a/. FEM simulations agree very well with both theoretical expectations and measured data. Some discrepancies can however be observed at high frequency, which can probably be attributed to the different radiation models.

5. Conclusions

This paper describes an extension of the plane wave theory to the case of 3-D vocal tract geometry. The theoretical model has been successfully compared with both FEM simulations and experimental data obtained on casts of vocal tracts. The 3-D effects appeared to be significant in the high-frequency domain and to depend strongly on the geometrical discretization. The appearance of higher acoustical modes led to zeros in the transfer function (at the cut-on frequency of these modes) and to extra resonances. Further, contrary to plane waves, these higher modes generate a highly directive sound pressure field. Because these effects occur in the higher frequency range, their relevance for vowels might be small, since the glottal sound source is of a low-frequency nature. However, in the case of plosives or fricatives we can probably expect them to be significant. Additional experiments and simulations will be performed to confirm this conclusion.

6. Acknowledgements

This research has been supported by EU-FET grant EUNISON 308874.

REFERENCES
[1] Fant G. (1960). Acoustic Theory of Speech Production. Mouton, The Hague.
[2] Flanagan J.L. (1972). Speech Analysis Synthesis and Perception, 2nd Edition, Springer-Verlag, Berlin.
[3] Story B.H. (2008). Comparison of magnetic resonance imaging-based vocal tract area functions obtained from the same speaker in 1994 and 2002. J. Acoust. Soc. Am., 123, 327-335.
[4] Elmasri S., Pelorson X., Saguet P., Badin P. (1998). The use of the Transmission Line Matrix in acoustics and in Speech. International Journal of Numerical Modeling, 11, 133-151.
[5] Laboissière R., Yehia H.C., Pelorson X. Higher order modes propagation in the human vocal tract, Proceedings of Acoustics 2012, Nantes, France.
[6] Grote M. and Sim I. (2010). Efficient PML for the wave equation, Global Science Preprint, arXiv: math.NA/1001.0319v1.
[7] Arnela M. and Guasch O. (2013). Finite element computation of elliptical vocal tract impedances using the two-microphone transfer function method, Journal of the Acoustical Society of America, 133 (6), 4197–4209.
[8] Ihlenburg F. (1998). Finite Element Analysis of Acoustic Scattering, Applied Mathematical Sciences, Springer, Berlin, Chap. 2.