Institut für Biologie - General Zoology and Neurobiology

Functional Mapping of Basic Acoustic Parameters in the Human
Central Auditory System
DISSERTATION
submitted to the Fakultät für Biowissenschaften, Pharmazie und Psychologie
of the Universität Leipzig
for the academic degree of
Doctor rerum naturalium
(Dr. rer. nat.)
presented by
Dipl.-Biol. Marc Schönwiesner
born on 10 September 1975 in
Halle/Saale
To Vee and the science of the polyps.
BIBLIOGRAPHICAL DATA
Marc Schönwiesner
Functional Mapping of Basic Acoustic Parameters in the Human
Central Auditory System
University of Leipzig, dissertation, 101 pages, 315 references, 24 figures, 3 tables
Abstract
This dissertation describes five studies aimed at understanding the representation of
acoustic signals in the human auditory system. The studies focus on basic properties of
the auditory cortex and subcortical structures: topographic frequency representation,
functional integration of binaural input, and hemispherical asymmetries in spatial and
spectro-temporal processing. The method in all experiments is functional magnetic
resonance imaging (fMRI) with the echo-planar imaging (EPI) acquisition protocol.
Table of Contents

Acknowledgments

Overview

1 Is it tonotopy, after all?
  1.1 Introduction
  1.2 Material and Methods
    1.2.1 Subjects
    1.2.2 Acoustic stimulation and experimental design
    1.2.3 Investigational procedure
    1.2.4 FMRI data acquisition and analysis
    1.2.5 Second level analysis
  1.3 Results
    1.3.1 Anatomical variability
    1.3.2 Frequency-dependent activity patterns
    1.3.3 Grouped data
    1.3.4 Frequency profiles along Heschl’s gyri
    1.3.5 Variation of the frequency selectivity along Heschl’s gyrus
  1.4 Discussion
    1.4.1 Comparison with physiological data
    1.4.2 Frequency profiles
    1.4.3 Anatomical variation
  1.5 References

2 Binaural processing in the human brainstem
  2.1 Introduction
  2.2 Materials and Methods
    2.2.1 Stimuli and experimental protocol
    2.2.2 fMRI data acquisition
    2.2.3 Data analysis
    2.2.4 Listeners
  2.3 Results
    2.3.1 Comparison between all sounds and silence
    2.3.2 BD contrast
    2.3.3 Motion contrast
  2.4 Discussion
    2.4.1 Is the inhibition exerted by the ipsilateral or the contralateral signal?
    2.4.2 Absence of binaural facilitation
    2.4.3 Hierarchical processing of binaural cues
  2.5 References

3 Asymmetry in spectro-temporal processing
  3.1 Introduction
  3.2 Material and Methods
    3.2.1 Subjects
    3.2.2 Acoustic stimuli
    3.2.3 Procedure
    3.2.4 fMRI data acquisition
    3.2.5 Data analysis
  3.3 Results
    3.3.1 Covariation analysis
    3.3.2 Region-of-interest analysis
  3.4 Discussion
    3.4.1 Spectral processing
    3.4.2 Temporal processing
    3.4.3 Microanatomical hemispheric differences as a possible basis for functional specialization
  3.5 References

4 Representation of left and right auditory space
  4.1 Introduction
  4.2 Materials and Methods
    4.2.1 Stimuli and experimental protocol
    4.2.2 fMRI data acquisition
    4.2.3 Data analysis
    4.2.4 Subjects
  4.3 Results
    4.3.1 Comparison between all sounds and silence
    4.3.2 Differential sensitivity to lateralized sounds: contralateral asymmetry
    4.3.3 Differential sensitivity to moving sounds: right-hemisphere dominance
    4.3.4 Activations outside ‘classical’ auditory structures
  4.4 Discussion
    4.4.1 Representation of spatial attributes in stationary sounds
    4.4.2 Specialized auditory “where” processing stream
    4.4.3 Auditory motion processing
  4.5 References

5 Activation asymmetry in the auditory pathway
  5.1 Introduction
  5.2 Material and Methods
    5.2.1 Subjects
    5.2.2 Stimuli and experimental protocol
    5.2.3 fMRI data acquisition
    5.2.4 Data analysis
  5.3 Results
    5.3.1 Activation during monaural left and right ear stimulation
  5.4 Discussion
    5.4.1 The contralateral activation predominance
    5.4.2 The right ear advantage
    5.4.3 Functional asymmetries in the auditory cortex, and the corticofugal projection system
  5.5 References

Summary
ACKNOWLEDGMENTS
This thesis occupied most of my waking hours and many of my sleeping ones during the
last three years. It is nevertheless the result of an effort of many people. Some of them
I wish to thank here.
I am grateful to my supervisor Prof. Dr. Rudolf Rübsamen. When he first suggested
learning fMRI I said: “Sound’s great!”, without having a clue what the letters meant.
I am also grateful to Prof. Dr. D. Yves von Cramon for the opportunity to work in his
multidisciplinary fMRI group, and for many helpful comments on my experiments.
I thank Dr. Katrin Krumbholz for filtering out my silly remarks and scientific naïvety
during our collaboration, and teaching me practical science instead.
I am thankful to Prof. Dr. Gereon Fink for hosting our collaboration projects at the Research Center Jülich.
I am thankful to Manon Grube for uncounted discussions of science and non-science, and
for sharing a desk, a tent, and several missed airplanes with me.
I thank Bettina Mech, my co-founder and fellow beneficiary of the Society for the moral
support of promising doctoral students in the Lessingstraße 32, for sharing a flat and her
opinion that even a scientist should be able to name the Under-Secretary-General of the
UN.
I thank Dr. Mari Tervaniemi for the opportunity to hibernate in Helsinki to compile the
last two chapters of this work.
I thank my family, all friends, co-workers, co-authors, and staff members of the Neuroscience Unit of the University of Leipzig and the Max Planck Institute for Human Cognitive and Brain Sciences.
I thank Vee Simoens for becoming my Vee of life.
Overview
This document describes five studies aimed at understanding the representation of acoustic signals in the human auditory system. The studies focus on basic properties of the auditory cortex and subcortical structures: topographic frequency representation, functional integration of binaural input, and hemispherical asymmetries in spatial and spectro-temporal processing.
The first chapter discusses the topographic frequency representation (tonotopy) in
the auditory cortex. Tonotopy is a fundamental organizational principle of the auditory
system of mammals, and its presence has been demonstrated in various animal species
with single-cell electrophysiological recordings. Invasive electrophysiology is, of course, only rarely applicable in human research, and scientists resort to non-invasive functional imaging techniques such as functional magnetic resonance imaging or positron emission tomography. Despite their many advantages, these methods offer only low spatial and temporal resolution compared to invasive methods. If these limitations are not properly taken into account during the design and analysis of functional imaging experiments, erroneous conclusions may result. The experiment described in chapter
one demonstrated that the activation of several regions on the superior temporal plane
depends on the frequency content of the stimuli. A detailed comparison of these activation
sites with the results of electrophysiological and cytoarchitectonical studies in humans
and monkeys suggests that frequency-dependent activation sites found in this and other
recent investigations are not a direct indication of tonotopic organization. As an alternative
interpretation, it is proposed that the activation sites correspond to different cortical fields,
engaged in the processing of acoustic features in different frequency bands.
The second experiment focuses on the integration of sound input from the left and
the right ear. The integration of binaural information is essential for auditory spatial
processing, which includes the localization of sound sources. Binaural integration relies
mainly on ‘fast’ binaural neurons in the auditory brainstem, which are difficult to access
in humans. Chapter two describes a method that successfully demonstrated a correlate of
binaural integration in the inferior colliculi in the midbrain, the medial geniculate body in
the thalamus, and the primary auditory cortex. The experiment also investigated dynamic
aspects of spatial processing by including stimuli that simulated moving sound sources.
Whereas all processing stages were activated by stationary sounds, only non-primary
auditory fields on the planum temporale responded selectively to the moving sounds,
thereby suggesting a hierarchical organization of auditory spatial processing.
The topographic organization of frequency bands and binaural information is the
same in both cerebral hemispheres. It results from the organization of the fiber tracts
in the ascending auditory system and therefore mirrors anatomical properties of the system. Other, more abstract features of the acoustic input are often preferentially processed
in either one of the cerebral hemispheres. The earliest indications of asymmetries in the
function of the hemispheres came from studies of language impaired neurological patients in the early 19th century. Since then, much evidence for a dominance of the left
hemisphere in the processing of speech signals has been gathered. A complementary
rightward lateralization of tonal and melody processing has also been proposed. Recent
studies suggest that the hemispheric lateralization does not arise from the semantic content of the signal (speech or non-speech), but from its content of spectro-temporal modulations. Chapter three describes a study that investigated the hemispheric specialization
for spectral and temporal processing by varying the spectral and temporal complexity of
dynamic wideband stimuli. To overcome the limitations of previous studies, these stimuli
were designed to permit a clear separation of the effects of spectral complexity from those
of melody on cortical activation. A region on the right superior temporal gyrus responded
exclusively to spectral modulation, whereas the equivalent portion on the left superior
temporal gyrus responded predominantly to temporal modulations. These findings permitted a generalization of the hemispheric specialization model to include processing of
simultaneously present spectral peaks and demonstrated the involvement of the primary
auditory cortex and the right superior temporal gyrus in spectral integration.
Hemispherical specializations appear to exist in many processing streams in the cerebral cortex, and the hemispheric dominance for speech processing and its underlying
anatomical asymmetries may have delegated other auditory functions to the non-dominant
hemisphere. Chapter four reports evidence for a hemispherical asymmetry in the processing of spatial sound information that relies on binaural differences of a few microseconds.
Non-primary auditory cortex in the left hemisphere responded predominantly to sound
movement within the right hemifield, whereas the right hemisphere responded to sound
movement in both hemifields.
Functional asymmetries might not be confined to the cerebral hemispheres. Chapter
five reports evidence for an asymmetrical activation of the left and right auditory brainstem, thalamus and cerebral cortex in response to sounds presented monaurally to either
ear. The experiment took advantage of the organization of the ascending auditory pathway by using the activity recorded from the first brainstem processing center, the cochlear
nucleus, as a control for activation asymmetries in subsequent auditory structures. The
cochlear nucleus receives input only from the cochlea on the same side of the head and should
therefore only be activated by sounds presented to the closer ear. As expected, ipsilateral
stimulation elicited a larger signal change than contralateral stimulation in the left and
right cochlear nucleus. In contrast, subsequent auditory structures responded asymmetrically: the right-side structures responded equally well to sound stimulation from the
left and right ear, whereas the left-side structures responded predominantly to the right
ear stimulation. The study demonstrated that activation asymmetries can be found as
early as in the inferior colliculi and continue up to the auditory thalamus and cortex. It
is discussed how these asymmetries might arise from the anatomical and physiological
asymmetries in the afferent and efferent auditory pathway.
The main method in all experiments was functional magnetic resonance imaging
(fMRI) with the echo-planar imaging (EPI) acquisition protocol. This non-invasive technique uses differences in the magnetic properties of oxygenated and deoxygenated
hemoglobin to detect cortical activation with a spatial resolution of a few millimeters.
Measuring fMRI responses from the auditory cortex requires special data acquisition
protocols. Because fMRI scanning can be accompanied by noise of up to 100 dB, an
acquisition protocol called ‘sparse temporal sampling’ was used, which separates the cortical responses to scanner noise from the responses to the experimental stimuli by inserting a silent period in between subsequent scans. Recording from subcortical structures
provides the additional challenge that the human brainstem undulates with the cardiac
cycle. In the present experiments, this movement was accounted for by coupling the data
acquisition to the phase of the cardiac cycle (cardiac gating). The magnetic field generated by the MRI scanner also limits the means of sound presentation. High-fidelity
MR-compatible headphones were constructed to guarantee undistorted sound presentation to subjects inside the scanner.
The core of every auditory study is the acoustic stimulus. Specialized stimuli were
constructed for each experiment, some of which had never before been used in functional
imaging. The experiment on topographic frequency representation employed random frequency modulation walks that combine a narrow spectral peak, suitable for activating narrow iso-frequency bands, with stochastic frequency modulations that prevent habituation,
and hence signal loss, in auditory cortical neurons. The experiments on binaural integration and hemispheric asymmetries in auditory spatial processing used noise bursts with
dynamically varying interaural delays in the order of microseconds. For the investigation
of hemispheric lateralization in spectro-temporal processing, random spectrogram sounds
provided a means to independently change spectral and temporal complexity of the stimuli. In sum, the five studies provide evidence on topology and functional lateralization of
basic processing mechanisms in the ascending auditory pathway. These mechanisms are
thought to be among the building blocks of the complex abilities that humans nevertheless
accomplish effortlessly, like navigating in complex auditory environments, transforming
speech into language and enjoying music.
Chapter 1
Is it tonotopy, after all?
Abstract
In this functional MRI study the frequency-dependent localization of acoustically evoked
BOLD responses within the human auditory cortex was investigated. A blocked design
was employed, consisting of periods of tonal stimulation (random frequency modulations
with center frequencies 0.25, 0.5, 4.0, and 8.0 kHz) and resting periods during which only
the ambient scanner noise was audible. Multiple frequency dependent activation sites
were reliably demonstrated on the surface of the auditory cortex. The individual gyral
pattern of the superior temporal plane (STP), especially the anatomy of Heschl’s gyrus,
was found to be the major source of inter-individual variability. Accounting for this variability by tracking the responsiveness to the four stimulus frequencies along individual Heschl’s gyri revealed medio-lateral gradients, with responsiveness to high frequencies medially and to low frequencies laterally. It is, however, argued that, in light of the results of electrophysiological and cytoarchitectonical studies in humans and in non-human primates, the multiple frequency-dependent activation sites found in the present study, as well as in other recent fMRI investigations, are not a direct indication of tonotopic
organization of cytoarchitectonical areas. An alternative interpretation is that the activation sites correspond to different cortical fields, the topological organization of which
cannot be resolved with the current spatial resolution of fMRI. On this view, the detected
frequency selectivity of different cortical areas arises from an excess of neurons engaged
in the processing of different acoustic features, which are associated with different frequency bands. Differences in the response properties of medial compared to lateral and
frontal compared to occipital portions of HG strongly support this notion.
1.1 Introduction
Tonotopy is a general principle of the functional organization of the auditory system. It
arises in the sensory epithelium through the structure of the cochlea and is maintained
throughout the central auditory pathway by means of orderly projections between auditory nuclei. Details of the tonotopic organization of the auditory cortex in non-human
primates were revealed by electrophysiological recordings (Merzenich and Brugge 1973;
Imig et al. 1977; Morel and Kaas 1992; Morel et al. 1993). Cytoarchitectonical studies in both monkeys and humans gave further insight into the anatomical parcellation
of respective cortical areas located on the superior temporal plane (STP) (Mesulam and
Pandya 1973; Pandya and Sanides 1973; Imig et al. 1977; Fitzpatrick and Imig 1980; Galaburda and Sanides 1980; Galaburda and Pandya 1983; Rauschecker 1997; Rauschecker
et al. 1997; Rivier and Clarke 1997). Subsequently, various attempts were made to align
the functional and the anatomical ‘maps’, which, in brief, led to the following widely accepted model: (1) A core region of the auditory cortex, distinguished by a dense granular
layer IV (koniocortex), comprises two (Merzenich and Brugge 1973; Imig et al. 1977;
Morel et al. 1993) or three (Morel and Kaas 1992; Hackett et al. 1998; Kaas and Hackett
1998; Kaas and Hackett 2000) tonotopic maps with mirror-oriented frequency gradients.
The respective maps consist of medio-laterally oriented isofrequency bands which are
aligned from occipito-medial to fronto-lateral along the lower bank of the lateral sulcus. It is not known which particular acoustic features, if any, are represented along the
auditory cortex perpendicular to the tonotopic gradient. (2) A number of different areas, jointly named the auditory belt, surround the core region and embody second level
auditory processing. The auditory belt is thought to include seven or more cytoarchitectonically distinct cortical areas, some of which seem to be tonotopically organized as well
(Pandya and Sanides 1973; Kaas and Hackett 1998; Kaas and Hackett 2000). These second level areas receive only a sparse thalamic input and depend largely on the input from
the core fields (Rauschecker et al. 1997). Electrophysiological recordings gave evidence
that neurons of the auditory belt have rather variable response properties and are often
best activated by complex combinations of signal features (Rauschecker et al. 1997). (3)
A third level of auditory processing is thought to take place in a region named the auditory
parabelt, a cortical domain localized laterally to the auditory belt on the dorsal and dorsolateral surface of the superior temporal gyrus (Pandya and Sanides 1973; Rauschecker
et al. 1995; Rauschecker et al. 1997).
In humans, however, the relationship between cytoarchitectonically defined fields of
the auditory cortex and physiologically characterized areas is less well established. For
obvious reasons, electrophysiological mapping of the cortical tonotopy was rarely performed (Ojemann 1983; Howard et al. 1996). The available data mostly come from studies which make use of non-invasive imaging techniques such as magnetoencephalography
(Romani et al. 1982; Pantev et al. 1988; Pantev et al. 1989), positron emission tomography
(Lauter et al. 1985; Lockwood et al. 1999) and functional magnetic resonance imaging
(Wessinger et al. 1997; Bilecen et al. 1998; Talavage et al. 2000).
Despite the discrepancies in the precise orientation and the number of tonotopic maps
proposed in the cited studies, they consistently show that high-frequency responsive areas
are located occipito-medially from low-frequency areas on Heschl’s gyrus (HG). These
findings are taken as an indication of tonotopical organization of the human auditory
cortex, despite (1) the fact that they do not match the complex functional architecture
of the auditory cortex in non-human primates and (2) the uncertainty of constructing a
tonotopic map from only two data points, without probing the progression in between
them. The latter difficulty arises from the fact that the bulk of studies employed only two
spectrally different stimuli, which are insufficient to probe tonotopic gradients.
Furthermore, the spatial relationship of tonotopic maps from different studies and
their orientation with respect to defined cortical landmarks is still subject to debate. Even
in the most recent studies there is only little consensus with regard to the functional and
anatomical parcellations of the auditory cortex (including the nomenclatures used) (Rivier and Clarke 1997; Scheich et al. 1998; Hashimoto et al. 2000; Talavage et al. 2000;
Morosan et al. 2001).
The present functional MRI study investigates the frequency-dependent localization
of acoustically evoked BOLD responses within the human auditory cortex using four
spectrally different stimuli. Since non-primary auditory cortex is better activated by
bandpassed and modulated signals than by pure tones (Rauschecker 1997; Wessinger et
al. 2001), we used stochastically modulated pure tones with sufficient acoustical complexity to activate primary- as well as higher order auditory areas.
By examining the cortical activity pattern in individual subjects, we intended to identify areas with significant frequency-dependent activation in the region of the superior
temporal plane and their role with regard to proposed tonotopic maps. In addition, we
wanted to study the influence of the individual gyral structure on the activation pattern.
1.2 Material and Methods
1.2.1 Subjects
Thirteen healthy individuals (six females, seven males) ranging in age from 22 to 27 years were tested on two days. The subjects had no history of neurological illness and were accustomed to the scanning equipment and procedure. All subjects were right-handed, as assessed by the Edinburgh Inventory (Oldfield 1971). The study
was approved by the local ethics review board at the University of Leipzig.
1.2.2 Acoustic stimulation and experimental design
In this work, random frequency modulated sine tones (RFMs) were used for acoustic
stimulation. The RFMs consisted of a series of short frequency modulation sweeps with
random slope and direction and a total length of 1250 ms. Center frequencies of the four
stimuli were 0.25, 0.5, 4.0, and 8.0 kHz, and the respective modulation depth was 20%.
The RFM-stimuli combine reasonable acoustic complexity with restrictions of the bandwidth necessary to obtain frequency-specific cortical activation patterns. Their stochastic
nature only causes a slight widening of the Fourier spectra. Additionally, pure tones with
frequencies set at the center frequencies of the RFMs were generated and one subject
underwent the experimental procedure twice, with the only difference being the type of
acoustic stimuli that were presented: either RFMs or pure tones.
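The RFM construction described above can be sketched as follows. The sampling rate, the number of sweeps per stimulus, and the piecewise-linear interpolation between random target frequencies are illustrative assumptions, not parameters reported in this chapter:

```python
import numpy as np

def rfm_tone(fc, dur=1.25, fs=44100, depth=0.2, n_sweeps=10, seed=None):
    """Random frequency modulated (RFM) tone: a series of short linear FM
    sweeps with random slope and direction around center frequency fc.
    The instantaneous frequency stays within fc * (1 +/- depth)."""
    rng = np.random.default_rng(seed)
    n = int(dur * fs)
    # random target frequencies at the sweep boundaries, within the modulation depth
    targets = fc * (1 + depth * rng.uniform(-1, 1, n_sweeps + 1))
    # piecewise-linear instantaneous frequency over the whole stimulus
    t = np.linspace(0, dur, n, endpoint=False)
    knots = np.linspace(0, dur, n_sweeps + 1)
    f_inst = np.interp(t, knots, targets)
    # integrate the instantaneous frequency to obtain the phase
    phase = 2 * np.pi * np.cumsum(f_inst) / fs
    return np.sin(phase)

sig = rfm_tone(250.0, seed=1)  # a 1250-ms RFM at the 0.25-kHz center frequency
```

Because the frequency trajectory is continuous and confined to a 20% band around the center frequency, the spectrum retains a narrow peak with only a slight widening, as noted above.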
Stimulus frequencies from 0.8 to 3.0 kHz were avoided, since the major part of the
sound energy arising from gradient switching lies within that frequency band and could
have interfered with the acoustic stimulation.
It has been shown that actively directing the subjects’ attention toward a stimulus
can lead to increased activity in auditory cortical areas (Woldorff et al. 1993; Grady et
al. 1997). In order to increase and control the subject’s attention, a simple deviant detection task was utilized in the present study. Half of the presented RFMs contained an
additional minute spectral shift, which could be easily perceived as a twitch of the center
frequency. These stimuli are in the following referred to as deviants. The stimuli without
such additional modulations are referred to as standard stimuli. With each stimulus the
subjects had to decide whether they heard a standard or a deviant stimulus and they had
to indicate their decision by pressing one of two buttons.
A common problem of auditory fMRI studies is the hemodynamic interaction between the experimental stimuli and the scanner noise. Several solutions to this problem
have been proposed (low noise image acquisition sequences, Scheich et al. 1998; clustered volume acquisition, Edmister et al. 1999; sparse imaging, Hall et al. 1999). These
methods inevitably lead to a reduced number of images acquired in a given period of time.
In the present study, a fast EPI sequence was utilized, with image acquisitions clustered
at the first 750 ms of a repetition interval of 2 s. During the remaining 1250 ms, the
auditory stimuli were presented. Additionally, only a small amount of the sound energy
of the scanner noise fell within the frequency bands used for auditory stimulation. These
two precautions resulted in a clear separation of stimuli and scanner noise in the time and frequency domains. Contrasting the experimental conditions, which included tonal
stimulation and scanner noise, with a baseline condition during which only the ambient
scanner noise was audible, eliminated the unwanted BOLD response to the scanner noise.
The stimuli were presented in an epoch-related design, where each block was 20 s
long and consisted of 10 sequential repetitions of 2 s. Within a single block either acoustic stimuli of one of the four center frequencies (frequency block) or no stimuli (baseline
block) were presented. The frequency and baseline blocks were presented in pseudorandomized order. In order to increase the number of recorded brain volumes per frequency block only 20% of the presented blocks were baseline blocks and consequently
not every frequency block was followed by a baseline block. While this design has implications for highpass filtering the fMRI time series (see ‘fMRI data acquisition and analysis’), it is irrelevant for statistical modeling. Within the four different frequency blocks
(0.25, 0.5, 4.0 and 8.0 kHz) standard stimuli were presented interleaved with the appropriate deviant stimuli in equal proportions and pseudo-randomized order. The frequency and the baseline blocks were repeated 24 times, leading to 240 recorded brain volumes of
each experimental condition and an overall experimental time of 40 min.
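A minimal sketch of such a block sequence follows; the actual pseudo-randomization constraints are not specified in the text, so a plain seeded shuffle stands in for them:

```python
import random

def block_sequence(freqs=(0.25, 0.5, 4.0, 8.0), reps=24, seed=0):
    """24 blocks per frequency condition plus 24 baseline blocks (20% of the
    120 blocks in total), in shuffled order. At 20 s per block (10 repetition
    intervals of 2 s) this gives 40 min overall."""
    rng = random.Random(seed)
    blocks = [f for f in freqs for _ in range(reps)] + ["baseline"] * reps
    rng.shuffle(blocks)
    return blocks

seq = block_sequence()
```

Each frequency condition thus yields 24 blocks of 10 volumes each, i.e. the 240 recorded brain volumes per condition stated above.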
1.2.3 Investigational procedure
Prior to the functional scanning, the subjects had to read an instruction text on a video
display where examples of all acoustic stimuli were presented and the keys to press were
indicated. The display was mounted at the face of the gradient coil and was visible to the
subjects via mirror goggles. During the experiment, the light was extinguished and the
video display was switched off.
The subjects’ heads were immobilized with padding on the scanner bed. Pulse and
oxygen level were monitored during the experiment. The acoustic stimuli were generated
1.2. MATERIAL AND METHODS
5
with a PC sound card and presented via electrostatic headphones (Resonance Technology
Inc., California, USA). The earmuffs of the headphones served as passive noise attenuation and reduced the intensity of the scanner noise by approximately 25 dB. Clinical
earplugs attenuated the scanner noise by another 25 dB. The output adjustments of the
sound card were adapted empirically to compensate for the nonlinear filter characteristics
of the earplugs.
1.2.4 FMRI data acquisition and analysis
The study was performed at 3 Tesla using a Bruker Medspec 30/100 system (Bruker
Medizintechnik, Ettlingen, Germany). A gradient-echo EPI sequence was used with a TE of 30 ms, flip angle 90 degrees, TR 2 s, and acquisition bandwidth 100 kHz. Acquisition of the
slices within the TR was arranged so that the slices were all rapidly acquired followed by
a period of no acquisition to complete the TR. The matrix acquired was 64×64 with a
FOV of 19.2 mm, resulting in an in-plane resolution of 3×3 mm. The slice thickness was
3 mm with an interslice gap of 1 mm. Six horizontal slices parallel to the AC–PC line
were scanned. During the same session and prior to the functional scanning, anatomical
images were acquired to assist localization of activation foci using a T1 weighted 3-D
segmented MDEFT sequence (Ugurbil et al. 1993) (data matrix 256×256, TR 1.3 s, TE
10 ms, with a non-slice-selective inversion pulse followed by a single excitation of each slice (Norris 2000)).
The fMRI time series was analyzed on a single-subject basis using in-house software
(Lohmann et al. 2001). Preprocessing of the raw fMRI time series included motion correction using a matching metric based on linear correlation, correction for the temporal
offset between the slices acquired in one scan and removal of low-frequency baseline drift
with a temporal highpass filter. The cutoff of the highpass filter was calculated using a standard procedure in fMRI data analysis: for each condition, the maximal time difference between consecutive trial onsets was determined, and the minimum of these per-condition maxima was multiplied by two. This procedure ensures that the highpass filter preserves all signal changes induced by the paradigm. Because of the non-alternating experimental design, the cutoff was relatively high (142 time steps), and we checked the individual fMRI raw data for residual baseline effects. No baseline drift was detected in any dataset after filtering.
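The cutoff computation described above can be sketched as follows (a minimal illustration with onset times in scan units; the function name is hypothetical and the in-house implementation may differ in detail):

```python
def highpass_cutoff(onsets_by_condition, factor=2):
    """Highpass cutoff as described in the text: per condition,
    find the maximal time difference between consecutive trial
    onsets, then multiply the minimum of these per-condition
    maxima by the given factor."""
    max_gaps = []
    for onsets in onsets_by_condition.values():
        gaps = [b - a for a, b in zip(onsets, onsets[1:])]
        max_gaps.append(max(gaps))
    return factor * min(max_gaps)
```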
The anatomical slices were then co-registered with a full-brain scan that resided in
the stereotaxic coordinate system by means of a rigid linear registration with six degrees
of freedom (3 rotational, 3 translational). The transformation parameters obtained from
this step were subsequently applied to the functional slices so that they were also registered into the stereotaxic space. The statistical evaluation was based on a least-squares
estimation using the general linear model for serially autocorrelated observations (Friston et al. 1995a; Friston et al. 1995b; Worsley and Friston 1995). The design matrix
was generated with a boxcar function that included a response delay of 6 s. The model equation, including the observation data, the design matrix and the error term, was temporally smoothed by convolution with a Gaussian kernel with a dispersion of 4 s FWHM. No spatial smoothing was performed. The model included an estimate of temporal autocorrelation that was used to estimate the effective degrees of freedom. Contrasts were calculated
between the four frequency conditions and the baseline condition and between the two
high-frequency conditions versus the two low-frequency conditions. A t-statistic was computed from the estimated parameters and the resulting t-maps were converted to z-maps (SPM{Z}). The p-values (uncorrected) pertaining to the z-scores were used to test
anatomically constrained hypotheses about the location of activated areas in individual
subjects.
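The regressor construction and temporal smoothing described above can be sketched as follows (a simplified illustration under the stated parameters — TR 2 s, 6 s response delay, 4 s FWHM; function names are hypothetical and the in-house software will differ in detail):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

TR = 2.0      # repetition time (s)
DELAY = 6.0   # assumed hemodynamic response delay (s)
FWHM = 4.0    # temporal smoothing kernel width (s)

def boxcar_regressor(block_onsets, block_len, n_scans):
    """Boxcar regressor: 1 during each stimulation block,
    shifted by the response delay, sampled at the TR."""
    x = np.zeros(n_scans)
    for onset in block_onsets:
        start = int(round((onset + DELAY) / TR))
        stop = start + int(round(block_len / TR))
        x[start:min(stop, n_scans)] = 1.0
    return x

def temporal_smooth(x):
    """Convolve with a Gaussian kernel of 4 s FWHM
    (sigma = FWHM / 2.355, expressed in scan units)."""
    sigma_scans = FWHM / (2.0 * np.sqrt(2.0 * np.log(2.0))) / TR
    return gaussian_filter1d(x, sigma_scans)
```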
1.2.5 Second level analysis
Activated areas that were consistently found in the subjects were identified and named in
accordance with the nomenclature introduced by Talavage and coworkers (2000) based
on the correspondence of Talairach coordinates of respective activation sites reported in
both studies. The nomenclature was slightly expanded to accommodate all activated areas
found in the present study. Mean locations of activation foci were computed by averaging
the locations of respective local maxima in the SPM{Z} across subjects.
In order to account for anatomical variations of Heschl’s gyrus in different subjects
the percentage signal change was traced along the frontal and the occipital wall of the
individual HG (‘tubular regions of interest’). In each (anatomical) sagittal slice of the
subjects, Heschl’s gyrus was identified according to the definitions given by Penhune et
al. (1996) and Leonard et al. (1998), and voxels with 45% of the maximal grey-value in
the data set were marked on the image (Fig. 1.1C). Subsequently, Talairach coordinates of
the frontal-most (highest Talairach y coordinate) and the occipital-most (lowest Talairach
y coordinate) of the marked voxels forming the contour of HG were recorded. For statistical analysis, the distance in Talairach millimeters between the Talairach y coordinates
of the two voxels was taken as the width of HG on the respective sagittal slice. In the registered functional data sets, the voxels corresponding to the frontal- and occipital-most
anatomical voxels of the HG contour were obtained. The mean percentage signal change
was computed along the frontal and occipital wall of HG respectively by averaging the
signal change in these voxels and their eight in-plane neighbours for each sagittal plane.
The gradient of the signal change along the walls of HG was then plotted for the four
frequency conditions, normalized by linear interpolation to the length of the longest HG
in the sample (40 mm) and averaged across subjects. These plots will be referred to as
frequency profiles. Computing the differences in percentage signal change between the
pooled high- vs. the low-frequency conditions yielded gradients of frequency selectivity
along the walls of HG.
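The profile extraction can be sketched as follows (a minimal illustration; the array layout, voxel indexing, and function name are assumptions, not the in-house implementation):

```python
import numpy as np

def frequency_profile(signal_change, wall_voxels, target_len=40):
    """Mean percent signal change along one wall of HG.
    signal_change: 3-D array (x, y, z) of percent signal change
    for one frequency condition; wall_voxels: one (x, y, z) voxel
    index per sagittal plane, tracing the wall."""
    values = []
    for x, y, z in wall_voxels:
        # average the wall voxel with its eight in-plane
        # (sagittal-plane) neighbours
        patch = signal_change[x, y - 1:y + 2, z - 1:z + 2]
        values.append(patch.mean())
    profile = np.asarray(values)
    # normalize by linear interpolation to the longest HG (40 mm)
    old_axis = np.linspace(0.0, 1.0, len(profile))
    new_axis = np.linspace(0.0, 1.0, target_len)
    return np.interp(new_axis, old_axis, profile)
```

A frequency-selectivity gradient (as in the last step above) would then simply be the difference of two such profiles, e.g. the pooled low-frequency profile minus the pooled high-frequency profile.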
1.3 Results
1.3.1 Anatomical variability
Although most of the obtained activation sites were consistent in size and spatial relationship across subjects, there was still a distinct inter-individual variability, which had
to be considered in analyzing the data. The transverse gyrus of Heschl is known to exhibit a great inter-individual anatomical variability, partly because its crown is frequently
indented by an intermediate sulcus (SI), causing a partial or complete duplication (bifurcation) of the gyrus (Penhune et al. 1996; Leonard et al. 1998). Three of the thirteen
subjects studied showed a bifurcation of HG in the left hemisphere and four subjects in
Figure 1.1: (A) Graphical representation (box and whisker plot, box: median and inter-quartile range, whiskers: data range, red crosses: outliers) of the average course of the frontal and occipital walls of the left Heschl’s gyrus in terms of Talairach coordinates. (B) Graphical representation of the width of the left HG along its medio-lateral extension (box and whisker plot as above). The average width is nearly constant along HG (~15 mm), whereas the variance of the width increases laterally. (C) Illustration of the procedure employed to extract the course of the walls of
HG from individual anatomical slices. For each subject, HG was identified and the coordinates of
its frontal-most (HG front) and occipital-most (HG occip) voxel on each sagittal slice (between
Talairach x −50 and −32) were recorded. The same coordinates were also used for the construction of the individual tubular regions of interest, along which the fMRI signal was tracked for subsequent analysis. The diameter of these regions, 3×3 functional voxels, is indicated by the
white squares overlapping HG frontally and occipitally. In the subject shown, the crown of HG
is indented by the intermediate sulcus (SI). (D) A ranking of the subjects according to the width
of their HG (indicated by bars) between x −47 and −45 (grey bars) was performed. A correlation analysis of this ranking and a second one according to the distance between foci #1a and
#1b (mean Talairach x −47) revealed a significant correlation (Spearman rank order correlation
coefficient rs = 0.57, p < 0.05). Only the 11 subjects clearly showing both foci were included in
the analysis.
the right hemisphere. In two subjects HG was bifurcated on both sides, and the remaining
four subjects had non-bifurcated gyri on either side. Taken together, 11 out of 26 HG
showed a bifurcation, five in the left and six in the right hemisphere. The bifurcation
of HG was most prominent in fronto-lateral aspects of the STP, while the gyri merged
towards its occipito-medial border.
For the left hemisphere, Figure 1.1 gives a graphical representation of the course of
the frontal and occipital walls of HG in terms of Talairach coordinates (Fig. 1.1A). Both
landmarks angle forward in their medio-lateral progression. The distance between the
frontal and the occipital wall of HG, projected onto a Talairach z-plane, was taken as an
estimate of the width of HG (Fig. 1.1B, C). The median of the width of HG varied only
slightly along the medio-lateral HG extension (Fig. 1.1B) still the variance of the HG
width was higher at its lateral extreme (Talairach x −50 to −47) compared to the medial
extreme (Talairach x −35 to −33; p < 0.05, Moses rank-like test for scale differences).
In order to test if the anatomical variability of the lateral HG is related to the variability
of functional activation patterns (described below), subjects were ranked according to the
width of their HG between Talairach x −47 and −45 (Fig. 1.1D). A second ranking was
performed according to a parameter of the localization of activated areas, i.e. the distance
between two activation foci on (or near) lateral HG (mean Talairach x −47). The two
rankings showed a significant correlation (Spearman rank order correlation coefficient
rs = 0.57, p < 0.05). For illustration, Figures 1.2 & 1.3 show data from two individuals with different anatomies and the respective patterns of activation.
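The rank-correlation step can be illustrated as follows (the numbers below are made up for demonstration and do not reproduce the study’s rs = 0.57; SciPy’s `spearmanr` stands in for whatever statistics package was actually used):

```python
from scipy.stats import spearmanr

# Hypothetical values for 11 subjects: HG width (mm) between
# Talairach x -47 and -45, and distance (mm) between foci #1a
# and #1b. Illustrative numbers only, not the study's data.
hg_width = [12, 18, 15, 22, 9, 14, 20, 11, 16, 25, 13]
foci_dist = [8, 14, 10, 18, 6, 12, 13, 7, 11, 20, 9]

rho, p = spearmanr(hg_width, foci_dist)
print(f"r_s = {rho:.2f}, p = {p:.4f}")
```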
Subject FR1T had a non-bifurcated HG in the left hemisphere with a constant width of about 1 cm (Fig. 1.2A). Acoustic stimulation with RFM signals limited to defined frequency ranges yielded discrete areas of activation as seen on an axial slice through HG
frequency ranges yielded discrete areas of activation as seen on an axial slice through HG
(Fig. 1.2B; a detailed analysis of the frequency dependent activation will be given below).
When high- and low-frequency conditions were contrasted (Fig. 1.2C), two HF foci (marked in blue) appeared in proximity to the occipito-medial border of HG and two interconnected LF foci (marked in red) flanked the gyrus at more fronto-lateral aspects. The parasagittal
slice indicates that the activated LF areas coincide with the superior part of the frontal
and occipital wall of HG (Fig. 1.2D). The present example does not reveal whether the
fronto-lateral area of activity represents a single or two separate foci.
In subject SJ2T, an expanded and fronto-laterally bifurcated Heschl’s gyrus is seen
in the STP-surface reconstruction (Fig. 1.3A). The distance between the two sulci, which
form the respective frontal and occipital border of Heschl’s ‘complex’ (the sulcus temporalis transversus primus and -secundus) increases from approximately 1 cm occipito-medially to 3 cm fronto-laterally. This specific anatomy is reflected in the pattern of
activity (Fig. 1.3 B&C). The locations of the two HF areas near the occipito-medial border of HG (marked in blue; Fig. 1.3D) are directly comparable to the respective foci seen
in subject FR1T. However, the two LF-activated fronto-lateral areas (marked in red) are
clearly separated in the present case. When the bifurcation of Heschl’s gyrus is considered, the activated areas are located at the respective frontal and occipital walls of
Heschl’s ‘complex’, and thus in the same relative position as in the above case.
The other eleven subjects showed the same correspondence between the individual
anatomy of the superior temporal plane and the distribution of activated HF and LF areas.
Figure 1.2: Individual gyral pattern and distribution of activated areas in subject FR1T. (A) Threedimensional reconstruction of the left temporal lobe with the STP surface exposed and Heschl’s
gyrus highlighted in red. (B) Activation pattern (SPM{Z}) for the four frequency conditions
shown on an axial plane through the left Heschl’s gyrus. The low-frequency stimulation caused
lateral activation foci (#1a & #1b, discernible in the upper two images, see text for further explanations) whereas high-frequency stimulation caused medial activation foci (#2 and #4, lower two
images). This difference is emphasized in (C) and (D) where low- and high-frequency conditions
are contrasted. Foci with significant HF dependent responses are shown in blue and LF foci in
red. (D) The lateral LF activations coincide with the frontal and occipital walls of HG as seen on
the parasagittal plane. Note that because of the small fronto-occipital extension of HG the lateral
activation sites partly overlap.
Figure 1.3: Gyral pattern and distribution of activated areas in subject SJ2T. Details like in Fig
1.1. Note that the large fronto-occipital extension of HG due to its bifurcation causes the lateral
activation sites (foci #1a & #1b, B & D) to be further apart than the respective foci in subject
FR1T (Fig. 1.2). The parasagittal plane in (C) shows the medial HF foci (#2 and #4) at the fronto- and occipito-medial walls of HG.
1.3.2 Frequency-dependent activity patterns
Since RFM stimuli had not been used in previous fMRI studies, the effects of the increased bandwidth and of the more complex temporal signal structure on the cortical activation were tested. The resulting differences in activation sites are shown for subject MA3T, for whom images were obtained using both pure-tone and RFM stimuli (Fig. 1.4).
For this comparison, the pure tone frequencies were set at the center frequencies of
the respective RFM stimuli. As seen in Fig. 1.4, the RFM induced activations were more
prominent, and inevitably more spread out, than those induced by pure tones. Typically,
the latter were located amidst the respective RFM activation sites. The major advantage
of using RFM stimuli was that they gave prominence to the activation sites on lateral
aspects of STP compared to pure tone stimulation.
Next, the typical pattern of frequency-specific activation sites will be described for subject HF1T, with special emphasis on those sites which were consistently seen in all subjects (Fig. 1.5). Seven activation sites could be differentiated on the left STP, most
of which showed a significant responsiveness during the four frequency conditions (Fig.
1.5A). The strength of activation, however, differed in either the high- (4.0 and 8.0 kHz) or the low-frequency conditions (0.25 and 0.5 kHz). Two foci located at the occipito-medial border of Heschl’s gyrus (#2 and #4; for an explanation of the nomenclature see below) showed a significantly stronger activation during the 4.0 and 8.0 kHz stimulation compared to the 0.25 and 0.5 kHz stimulation (Fig. 1.5B). The reverse accentuation of frequency dependent response strength was seen in two fronto-lateral activation sites overlapping Heschl’s gyrus (#1a and #1b).
Three more foci were consistently found in all subjects. One with a prominent low-frequency activation was seen at the fronto-lateral transition of Heschl’s gyrus to the STG
(#6). In the area of the planum temporale (PT) one high-frequency focus (#3) was located
occipital to HG and lateral to focus #4, approximately halfway on the medio-lateral PT
extension (Fig. 1.5A intermediate and superior planes). The frequency dependence of the
last two foci (#3 and #6) was less prominent than for foci #1a, #1b, #2, and #4, but still
met the p < 0.05 significance criterion. The most occipital focus (#8) was located where
the STG bent towards the angular gyrus. In HF1T this focus did not show a significant
frequency dependent response, but had LF preference in the majority of the subjects. Two
more activation sites are shown in Figure 1.5A which were not consistently found in the subjects. The first, with a low-frequency preference, was located on the fronto-lateral PT; the second (♦), which did not show a significant frequency dependent response, extended over the occipital wall of HG between foci #4 and #1b. In the intermediate and superior planes, these two foci merge with foci #1a and #1b,
yielding a single large activation site (Fig. 1.5A).
1.3.3 Grouped data
Averaging the centers of the subject-specific activation sites in all 13 subjects, using the Talairach coordinates as a reference, gave evidence that the pattern of activation
described above comprises distinct auditory processing domains on the STP (Fig. 1.6).
For the identification of these frequency responsive areas, the numbering scheme from
Talavage and coworkers (2000) was adopted. Seven frequency responsive foci could
Figure 1.4: Cortical activation in subject MA3T resulting from stimulation with (A) random frequency modulation walks and (B) pure tones for four frequency conditions (0.25, 0.5, 4.0 and
8.0 kHz). The statistical parameter maps show z-values > 6.0 (p < 0.001) overlaid on corresponding anatomical images. The pure tone frequencies were set at the center frequencies of
the respective RFM stimuli. Note that the RFM stimulation led to a more prominent activation
particularly in the lateral STP.
Figure 1.5: Pattern of activated areas in three axial planes (Talairach z=4, 7, and 10) for subject
HF1T (SPM{Z}) overlaid on corresponding anatomical images. Each image is centered on Heschl’s gyrus, which extends diagonally from occipito-medial to fronto-lateral. (A) The upper four
rows show contrasts of the four frequency conditions vs. baseline (0.25, 0.5, 4.0, 8.0 kHz). All
voxels of the resulting z maps with z > 6.0 (p < 0.001, uncorrected) are shown. Note that the
large area of activation seen laterally in the intermediate and superior planes during low-frequency
stimulation (0.25 & 0.5 kHz condition) separate into three distinct foci (#1a, #1b, and ) with the
RFMs shifted towards higher frequencies. (B) The bottom row (LF-HF) shows a contrast of the
pooled low- vs. pooled high-frequency conditions. The resulting z maps depict areas with a significant difference (p < 0.001) in activation strength in the low- (marked in red) vs. high-frequency
conditions (marked in blue). The numbers indicate activation sites on HG and in its vicinity consistently found in the subjects.
Table 1.1: Criteria for identification of the frequency responsive foci with the locations in Talairach coordinates and the mean percent signal change. Additionally, the coordinates reported
by Talavage and coworkers (2000) are given for reference. The Talairach coordinates (mean
±S.E.M.) are an average across foci locations in the left hemispheres of the subjects. Note that
the percent signal change is highest for activation sites #1a, #1b, #2 and #4 and decreases in the sites farther away from HG (secondary or tertiary auditory cortex).
consistently be differentiated in the left hemisphere. Table 1.1 indicates the anatomical
criteria for focus identification, the respective locations in Talairach coordinates, and the
corresponding coordinates reported by Talavage and coworkers (2000).
The foci #2 and #4 were the most stable activation sites across subjects in terms of
location and frequency specificity (Fig. 1.6). They were found in all subjects and exhibited significantly greater activation strength during periods of high-frequency- (4.0 and
8.0 kHz) compared to low-frequency stimulation (0.25 and 0.5 kHz). Foci #2 and #4
were also the most medially located activation sites overlapping with the frontal- (focus
#2) and the occipital wall (focus #4) of Heschl’s gyrus at its medial border. Focus #3
was located on the PT occipital to HG and lateral to focus #4, approximately halfway
on the medio-lateral PT extension. Focus #6 was located at the fronto-lateral transition
of Heschl’s gyrus to the STG. The latter two foci (#3 and #6) were predominantly activated by low-frequency stimulation. Focus #8 was the most occipital activation site and
occupied an area where the STG ascends towards the angular gyrus. This focus showed
low-frequency dependence only in some subjects. The area corresponding to Talavage’s (2000) focus #1 could be further subdivided into two distinct low-frequency responsive areas. One area, overlapping with HG fronto-laterally, was termed #1a and the other, located
more occipital, #1b. The high standard errors in fronto-lateral direction for the mean location of foci #1a and #1b are an indication of the above-described anatomical variability
of the lateral portions of HG (Fig. 1.6). No lateral shift of activation was observed when
the stimulus frequencies were lowered from 0.5 to 0.25 kHz; nor did any other significant change in the location of the lateral activation site occur. Similarly, the 8 kHz stimulation did not induce activation more medially located than, or otherwise different from, the 4 kHz activation.
The same pattern of frequency responsiveness was also seen on the right STP (not shown). As on the left side, the HF-activated foci were found occipito-medially and the LF foci fronto-laterally on Heschl’s gyrus. Similarly, the anatomical inter-subject variability increased from occipito-medial to fronto-lateral. Nevertheless, it was not possible to unequivocally distinguish as many separate activation sites on the right HG and in its vicinity. In nine subjects, the volumes of foci on the frontal border of Heschl’s gyrus were connected with foci bordering the gyrus occipitally. The altered pattern on the right STP might relate to the fact that the right Heschl’s gyrus is considerably narrower than its left counterpart, so that the limit of imaging resolution prevents a more refined analysis.
1.3.4 Frequency profiles along Heschl’s gyri
The data presented indicate that acoustically evoked activation mainly coincides with the
frontal and occipital walls of Heschl’s gyrus. This suggests that variations of frequency
dependent activation might line up with these anatomical landmarks. In order to assess
frequency dependent activation along HG, we defined for each subject four tubular regions of interest (two left and two right) exactly overlying the individual frontal and occipital walls of HG (Fig. 1.7A). The strength of activation resulting from stimulation in the four frequency bands was analyzed along each of these three-dimensional domains, 21–42 mm in length. For subject ID1T, the frequency profiles are shown along
the frontal (Fig. 1.7B) and occipital wall of HG (Fig. 1.7C). For all stimulus conditions, the response strength varied systematically along these regions. Referring to the frontal wall of the left HG (Fig. 1.7B), the 4.0 and 8.0 kHz stimulation caused a significant activation near the medial border of Heschl’s gyrus, while laterally, under the same stimulus conditions, there was a tendency for the activation to fall even below the resting level.
The 0.25 and 0.5 kHz stimulation caused significant activation all along the frontal wall
of HG with a broad maximum between 60–90% of its longitudinal extension. For the
occipital HG wall (Fig. 1.7C), the variation of activation strength resulting from LF stimulation showed a similar gradient with the maximum activation in the lateral half of HG.
The activation caused by HF stimulation was maximal at the medial HG border, where
it exceeded the LF induced activation and decreased towards the lateral HG. The same
frequency gradients were also observed for the frontal and occipital walls of HG on the
right side (not shown).
Averaging the frequency profiles for all 13 subjects for the left and the right STP
(Fig. 1.8) revealed frequency dependent response patterns matching with those shown for
subject ID1T (Fig. 1.7). Medial portions of HG responded more vigorously to high-frequency and lateral portions predominantly to low-frequency stimulation. Differences in activation strength were highly significant in medial and lateral HG portions, but not in a transition area between the two.
Figure 1.6: The locations of seven frequency dependent foci are shown on schematic outlines
of the cortex based on the Talairach atlas (Talairach and Tournoux 1988) (outlines are from the
Talairach daemon [Lancaster et al. 2000]). The crossbars indicate the mean location and the 95%
confidence area of the respective foci. The header of each image gives the number of the depicted
focus (adopted from Talavage et al. (2000)) followed by the frequency dependence (HF or LF) and
the Talairach coordinates. Note the large fronto-occipital variance for foci #1a and #1b, which
is due to the anatomical variability of HG across the subjects. The location of foci #2 and #4
showed the least variability. In order to facilitate comparison with results of other studies, the two
rightmost subfigures show fractions of relevant anatomical (Galaburda and Sanides 1980; Rivier
and Clarke 1997; Morosan et al. 2001) and functional maps (Scheich et al. 1998; Hashimoto
et al. 2000; Di Salle et al. 2001) overlaid on schematic drawings of HG and the surrounding
cortex. The maps have been considerably simplified and the reader is advised to consult the
original publications for further details of the parcellations. In the compilation of anatomical
maps, only the cytoarchitectonical fields of Galaburda and Sanides (1980, outlined in red) are
labeled (KAm medial koniokortex, KAlt lateral koniokortex, ProA prokoniokortex, PaAi internal
parakonoikortex, PaAe external parakoniokortex, PaAc/d caudo-dorsal parakoniokortex). The
large blue-encircled area covering most of HG in the same subfigure is AI of Rivier and Clarke
(1997). The three adjoining areas outlined in green are fields Te 1.1, Te1.0 and Te1.2 (from medial
to lateral HG) described by Morosan et al. (2001). In the compilation of functional maps shown
in the lower right subfigure, the red outlines pertain to the functional fields T1a, T1b, T2 and T3
proposed by Scheich et al. (1998). The two green-encircled areas (A, B) were found to exhibit
differential BOLD response patterns (type a and type b decay pattern) during 1 kHz pure tone
stimulation by Di Salle et al. (2001). Finally, blue lines surround the areas described by Hashimoto
et al. (2000, A1, A2m and A2l). The occipito-medial area A2m was differentially activated by
a dichotic vs. a diotic listening condition. (See “Discussion” for the proposed attribution of our
activation foci to these maps)
Figure 1.7: Tubular regions of interest (ROI) used to extract individual frequency profiles along
Heschl’s gyri. (A) Two tubular ROIs were manually aligned with the frontal and occipital walls
of Heschl’s gyrus. (B) The relative activation strength of voxels in the four frequency conditions
analyzed from medial to lateral along the regions indicated in (A) at the frontal and occipital wall
of HG.
Figure 1.8: Averaged frequency profiles for the frontal and occipital walls of the left and right
Heschl’s gyri (frequencies indicated in the graph). The strength of activation caused by high-frequency stimulation decreases from medial to lateral in the four domains analyzed. Low-frequency stimulation caused maximum activation in the lateral third of the medio-lateral HG
extension. Note the differences between the frontal and the occipital frequency profiles as well as
between the medial and the lateral HG portions.
Figure 1.9: Average frequency selectivity for the frontal and occipital walls of the left and right
Heschl’s gyri. The frequency selectivity was quantified as the difference in mean percent signal change between the LF and HF stimulus conditions (LF−HF). Note that the frequency selectivity increases
from medial to lateral in all four domains analyzed. The lateral LF responsive areas are highly
selective for low frequencies (LF−baseline), whereas the medial HF areas are activated by both
low- and high frequencies with the HF activation strength surpassing the LF activation strength
(HF−baseline and LF−baseline).
1.3.5 Variation of the frequency selectivity along Heschl’s gyrus
The frequency profiles showed asymmetrical gradients for high- and low-frequency responses (Fig. 1.9). The fronto-medial and occipito-medial portions of HG were activated by both low and high frequencies, but with the HF activation significantly surpassing the LF activation. In contrast, the fronto-lateral and occipito-lateral portions of Heschl’s gyrus were almost exclusively activated by low-frequency stimulation; high frequencies did not cause significant activation there. The level of frequency selectivity can be assessed by quantifying the difference in activation strength caused by high- and low-frequency stimulation (Fig. 1.9). In all four HG domains examined, this analysis showed a similar increase of the
frequency selectivity from medial to lateral portions of HG. This analysis also revealed
that the frequency profiles along HG on the right and the left STP were almost identical.
1.4 Discussion
The present results provide evidence for multiple frequency dependent activation sites
on the STP. The anatomical inter-subject variability of Heschl’s gyrus was recognized
as the major source of functional inter-subject variability. By examining the variation of
frequency dependent activity along individual Heschl’s gyri—the main landmarks of the
auditory cortex—we found that (a) the response strength resulting from high- and low-frequency stimulation varies systematically in opposite directions along HG and (b) the degree of frequency selectivity increases from medial to lateral.
Because of technical limitations and the need for averaging the data from many subjects, early imaging studies identified only a medial high- and a lateral low-frequency
activation site on STP (Lauter et al. 1985; Wessinger et al. 1997). Still, the results were
considered as an indication of a tonotopic organization of the auditory cortex. This finding was inconsistent with electrophysiological studies in non-human primates, in which
multiple tonotopic maps within the primary auditory cortex were reported (Merzenich
and Brugge 1973; Imig et al. 1977; Morel and Kaas 1992; Morel et al. 1993). It seems
likely that these studies only detected the most prominent activation sites with a high degree of frequency selectivity, namely #1a and #1b jointly as one lateral LF activation site,
and #2 and #4 as a single medial HF activation site.
More recent imaging studies were devoted to revealing a more complex tonotopic parcellation of the human auditory cortex. Undeniably, these studies succeeded in differentiating multiple activation sites on STP, but the attribution of those sites to cytoarchitectonically defined areas still requires more in-depth analyses. There are a number of
basic factors to be considered when one tries to line up imaging- and cytoarchitectonical
data. First, it is not entirely clear how many activation foci found in a functional image
correspond to a single cytoarchitectonically defined field. The argument is as follows: (1)
Combined cytoarchitectonical and electrophysiological studies in non-human primates
point to a clear correspondence between anatomically and physiologically defined fields
to an extent that cellular response properties can be used as markers for cytoarchitectonical field borders (Tian et al. 2001). (2) Electrophysiological studies disclose clear
tonotopic organization in the primary auditory cortex, but in only one of the secondary areas (Rauschecker et al. 1995) and in none of the tertiary areas. The primary (core) area embodies two (rhesus monkey, Merzenich and Brugge 1973; Imig et al. 1977; Morel et
al. 1993) or three (owl monkey, Morel and Kaas 1992; Hackett et al. 1998; Kaas and
Hackett 1998; Kaas and Hackett 2000) more or less complete tonotopic maps. (3) With
the exception of the field CM (Rauschecker et al. 1995), the secondary (auditory belt)
and tertiary (auditory parabelt) fields did not exhibit a distinctive tonotopy. Consequently,
frequency dependent activation foci should be predominantly found in the area of the primary auditory cortex, i.e. the medial two thirds of HG, while, towards higher auditory
areas, like belt and parabelt, the frequency selectivity should decrease. The attribution
of activation foci to cytoarchitectonical fields is often based on the Euclidean distance in
Talairach space between the centers of mass of such foci and the centers of the respective
cytoarchitectonical fields. This mostly results in a one-to-one correspondence of foci to
fields. While this procedure undoubtedly yields an objective parameter (the distance) for
establishing such a correspondence, it completely ignores the underlying physiology and
seems inappropriate to arrive at conclusions regarding the tonotopic organization. One
frequency dependent activation focus in a cytoarchitectonical field (regardless of whether it is selective to high or low frequencies) cannot be considered an indication of a tonotopic organization, since such an indication would presuppose at least two activation foci of opposing frequency selectivity in a single cytoarchitectonical field. While we presently
also propose a correspondence between the seven activation foci and parcellations based
on cytoarchitectonical criteria, we still arrive at different conclusions regarding tonotopic
organization.
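The distance-based attribution criticized above can be stated in a few lines. The sketch below uses made-up Talairach coordinates (the field names follow Morosan et al., but none of the numbers are taken from the cited maps); it only illustrates why the procedure almost always yields a one-to-one assignment of foci to fields:

```python
import math

def nearest_field(focus, fields):
    """Assign an activation focus to the cytoarchitectonical field whose
    center of mass lies closest in Talairach space (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(fields, key=lambda name: dist(focus, fields[name]))

# Hypothetical field centers of mass in Talairach space (mm), illustration only
fields = {"Te1.0": (-46, -19, 8), "Te1.1": (-38, -28, 10), "Te1.2": (-52, -11, 4)}
focus_1a = (-48, -17, 7)  # hypothetical center of mass of an activation focus
print(nearest_field(focus_1a, fields))  # → Te1.0
```

Because `min` always returns exactly one field name, the rule cannot express that a focus lies near a border between fields, which is the physiological ambiguity the text points out.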
Another critical issue inherent in fMRI is the effect draining veins have on the localization of BOLD response activation foci. These veins can be small and hard to detect.
The location of activation foci might depend on the individual vein pattern rather than on
the individual functional parcellation of the cortex. Still, we would argue that the location
of the present activation foci, especially of the foci at the frontal and occipital walls of
HG, is an indication of functional parcellation rather than a reflection of the course of
draining veins. The argument rests on the finding of differences of the BOLD response
properties of frontal and occipital areas on HG found in the present study and in other
recent studies detailed below. The assumption that foci on the frontal and occipital wall
of HG are merely a single activated area dispersed by the effects of veins running along
either wall of HG, implies that these areas should be coactivated under all experimental
conditions. This is not the case, as shown by Hashimoto et al. (2000). Furthermore, the
location of veins and activation foci on STG was found to be uncorrelated by Di Salle
et al. (2001). Nonetheless, the exact localization of fMRI foci may still be affected by the course of veins; e.g., veins that run in the sulcal basins might shift the foci away from the crown of HG towards the depth of the limiting sulci. While this would weaken
the comparability of fMRI data with electrophysiological data, it will not impede the
comparability of different fMRI studies.
The cytoarchitectonical maps of the auditory cortex that we will refer to are those reported by Galaburda & Sanides (1980), Rivier & Clarke (1997), and Morosan et al. (2001).
Our attempt was complicated by the fact that in these three studies the parcellation of Heschl’s gyrus differs. Originally, Galaburda and Sanides (1980) differentiated two koniocortical fields: KAm, occupying the fronto-medial wall and the crown of HG, and KAlt,
occupying the lateral crown and stretching into the occipitally bordering sulcus. More
recently, Rivier and Clarke (1997) identified only a single large field, termed A1, covering most of HG. Morosan et al. (2001) (whose parcellation refines the primary auditory
cortex [TC] in the classical map from von Economo and Koskinas (1925)) emphasized
medio-lateral cytoarchitectonical differences along HG by distinguishing three adjoining
areas, Te1.1, Te1.0 and Te1.2.
Of the presently identified activation sites, focus #1a (Fig. 1.6) is the most likely
candidate for a response from the primary auditory cortex, as it lies directly on the frontolateral surface of HG in all subjects. This activation site corresponds to parts of KAm,
A1 and Te1.0 (here and in the remainder of the paragraph the respective abbreviations
point to the three studies of Galaburda and Sanides, Rivier and Clarke and Morosan et
al.). Focus #1b, which lies in the sulcus bordering HG occipitally, could be attributed to
KAlt (primary auditory cortex) or to PaAi (secondary auditory cortex) of Galaburda &
Sanides. With reference to the parcellation of Rivier and Clarke, #1b can also equally
well be attributed to the primary or the secondary auditory cortex, the fields A1 or LA,
respectively. Focus #2 is located near the fronto-medial border of HG, where, according to
Galaburda and Sanides, three cytoarchitectonical fields adjoin, impeding an unequivocal
attribution of this focus to one of them. KAm extends medio-laterally on HG and is
bordered frontally alongside the first sulcus of Heschl by ProA, which covers a portion of
the planum frontale. Both fields stretch up to the posterior caudo-dorsal parakoniocortex
(PaAc/d) at the fronto-medial border of HG. Rivier & Clarke did not include this area in
their cytoarchitectonical analysis. Their fields A1 on HG as well as MA (which resembles
ProA of Galaburda & Sanides) do not cover the most fronto-medial extent of HG and the frontally bordering sulcus. Still, A1 is the best choice for attributing focus #2. Referring to the map of Morosan and coworkers, focus #2 corresponds to the field Te1.1, which
covers the entire medial portion of HG.
The same field also comprises focus #4, located just occipitally from #2 near the
occipito-medial border of HG. Focus #4 is difficult to match with the map of Galaburda
& Sanides as four cytoarchitectonical fields (KAm, KAlt, PaAi and PaAc/d) meet at its location. In the map of Rivier and Clarke, the field PA occupies the occipito-medial portion
of HG and parts of the medial planum temporale, a location that is in good correspondence with the location of focus #4.
The location of focus #3 on the planum temporale is consistent with field PaAi of
Galaburda and Sanides, which stretches occipitally from HG along frontal aspects of the
planum temporale. In the map of Rivier & Clarke, this location is occupied by the field
LA. Foci #6 and #8 are located near the rim between STP and the superior temporal
gyrus (STG), a region termed auditory parabelt which is thought to embody the tertiary
auditory cortex. With reference to Galaburda and Sanides’ nomenclature, foci #6 and #8
lie in the region of the internal and external parakoniocortex (PaAi, PaAe). With regard
to the map of Rivier and Clarke, only focus #8 would correspond to the superior temporal
area (STA), whereas focus #6 lies just frontal of it at the fronto-lateral extreme of HG.
The latter location does not correspond to one particular field in their map.
The aforementioned four foci, #1a, #1b, #2, and #4 would be the best choice if one
considers a tonotopic organization of the auditory cortex. However, the apparent ambiguity of the anatomical attribution of the four foci makes it unlikely that all four correspond
to the primary auditory cortex. Additionally, recent experimental evidence suggests that
the activation near the occipital border of HG seen in imaging studies (our #1b and #4)
is a response from secondary auditory cortex (see below). The remaining foci #3, #6 and
#8 cluster around HG and might correspond to secondary and tertiary auditory areas.
1.4.1 Comparison with physiological data
At least as important as the anatomical attribution is the correspondence of activation foci
between recent imaging studies. The activation sites found in the present study show
a good correspondence to activation patterns reported by Hashimoto et al. (2000), Talavage et al. (2000), Scheich et al. (1998) and Di Salle et al. (2001). In these studies,
characteristic “stripe-like” clusters of auditory activation were found with maxima arranged alongside the frontal and occipital walls of HG. Additional activation sites were
reported on PT. Taken together, the studies suggest a physiological distinction between
the frontal and occipital wall of HG, whereby the frontal wall corresponds to the primary and the occipital wall to the secondary auditory cortex (Scheich et al. 1998; Hashimoto et
al. 2000; Di Salle et al. 2001). The latter can be further subdivided into a medial and lateral portion by their specific response properties (Hashimoto et al. 2000). Our activation
sites can be associated with the functional fields proposed in these studies on the basis of
anatomical ties. The activation sites on the frontal HG (#1a and #2) lie in the area T1b of Scheich et al. (1998), while the occipital ones (#1b and #4) lie in T2. Area T3, as described in
the same paper, covers most of PT and includes the anatomical locus of our foci #3 and
#8. Hashimoto et al. (2000) described (among other results) three activated areas on STP
(A1, A2m and A2l). The activation site A1 extends along the frontal wall of HG, covering an area comparable to our foci #1a and #2. Their activation site A2, stretching along
the occipital wall of HG, was further divided into a medial (A2m) and a lateral portion
(A2l). The locations of these two sites correspond with our foci #1b and #4. Di Salle
and coworkers (2001) described distinct activation clusters along the frontal and occipital
walls of HG, which differed in the time course of the hemodynamic response. The frontal
activation covers the anatomical locations of our foci #1a and #2, whereas the occipital
one covers foci #1b and #4.
In the context of previous results, the present findings support the notion of physiological differences between medial/lateral and frontal/occipital portions of HG. On the
frontal wall the HF activation decreases laterally to zero, while on the occipital wall it
does not. Furthermore, the frequency selectivity in the lateral low-frequency parts of the frontal and occipital HG walls is significantly greater than in the medial parts. This
concept is consistent with differences in BOLD response properties (between frontal and
occipital HG; Di Salle et al. 2001), sensitivity to task and stimulus differences (occipitomedial and occipito-lateral HG; Hashimoto et al. 2000) and location of activation foci
(Scheich et al. 1998; Talavage et al. 2000; Hashimoto et al. 2000; Di Salle et al. 2001).
These physiological differences make it less likely that different pairs of activation foci combine as endpoints of tonotopic representations. An alternative interpretation is that the activation sites correspond to different cortical fields, the topological organization of which cannot be resolved with a spatial resolution of several millimeters. In this view, the detected frequency selectivity of different cortical areas arises from an excess
of neurons engaged in the processing of different acoustic features, which are associated
with different frequency bands.
1.4.2 Frequency profiles
The frequency profiles show the variation of activation strength along individual Heschl’s
gyri for the four frequency conditions. This analysis is based on the assumption that HG can be used as a marker for the primary and secondary auditory cortex.
Heschl’s gyrus has been taken as a landmark for the primary auditory cortex since Flechsig’s original article in 1920. Recently, at least two cytoarchitectonical studies have questioned the reliability of the relationship between the macroanatomical landmark, HG, and the exact course of microanatomical area borders (Hackett et al. 2001; Morosan et
al. 2001). Both studies state that it is not possible to precisely localize the borders of
the primary auditory cortex. Hackett and coworkers (2001) report that in the case of a
duplication of HG, the auditory core field may occupy variable portions of both gyri,
spanning the intermediate sulcus. In the case of one HG, the core occupied most of its
surface and was constrained by its sulcal boundaries (Hackett et al. 2001). Morosan and
coworkers (2001) state that a comparison of the Talairach coordinates of the transverse
sulci and those of the borders of primary auditory cortex show significant discrepancies
between both markers. We agree that this is true for the sub-millimeter and millimeter
scale. Nevertheless, we think that given the in-plane resolution of 3×3 mm of the present and many other fMRI studies, and taking into account that MRI permits no direct access
to the microanatomy of individual subjects, the HG still provides the best estimate for the
location of primary auditory cortex.
The medial portions of the frontal and occipital walls of HG responded predominantly
to high frequencies, while lateral portions were more active during low-frequency stimulation. Assuming the high- and low-frequency activation sites as indicators of tonotopy,
one would expect a systematic decrease of LF induced activation towards medial portions
of HG and a corresponding increase in HF induced activation. However, such systematic gradients were not observed. Medial portions of HG were more balanced in their
responses to LF and HF stimulation, with the HF activation slightly surpassing the LF activation. In contrast, lateral portions of HG showed almost exclusively LF activation. This difference in frequency selectivity further adds to the notion of a functional medio-lateral
distinction of HG and renders it difficult to combine medial and lateral activation sites
as endpoints of a single tonotopic representation. In the alternative interpretation, that
the different activation sites correspond to different cortical fields, the detected frequency
selectivity stems from an excess of neurons tuned to high or low frequencies. The differences in overall frequency tuning in different neuronal populations could arise from their
engagement in the processing of different acoustic features, which are associated with
different frequency bands. Medio-lateral differences in feature processing on HG have
been proposed for humans and non-human primates. A concept adapted from research
in the visual domain is the distinction of auditory processing streams for object recognition and spatial information (Tian et al. 2001, Rauschecker and Tian 2000, Romanski et
al. 1999). Most of the spectral cues important for spatial localization of sound sources
lie in higher frequency bands, whereas human speech is confined to lower frequencies.
Another emerging concept is that spectral and temporal features are preferentially processed at the medial and lateral HG, respectively (Hall et al. 2002, Griffiths et al. 2001).
Again, if most of the relevant spectral and temporal cues are assumed to lie in different
frequency bands this would lead to a different preferential frequency tuning of medial
and lateral auditory areas. The frequency selectivity, defined as the percent signal change between HF and LF stimulation, was highest at the lateral activation sites #1a and #1b,
followed by #2 and #4 at the medial border of HG. The remaining activation sites (#3, #6, #8) showed a lesser degree of frequency selectivity, just reaching statistical significance. This low degree of frequency selectivity of secondary and tertiary areas is more
likely to correspond to functional specialization than to tonotopic gradients (see above).
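This selectivity measure reduces to simple arithmetic. The sketch below uses invented percent-signal-change values (not the measured data) to show how the measure ranks activation sites:

```python
def frequency_selectivity(hf_psc, lf_psc):
    """Frequency selectivity as the absolute difference in percent signal
    change between high- and low-frequency stimulation."""
    return abs(hf_psc - lf_psc)

# Hypothetical (HF, LF) percent-signal-change pairs per activation focus
foci = {"#1a": (0.2, 1.4), "#2": (1.1, 0.5), "#3": (0.7, 0.9)}

# Rank foci by selectivity, most selective first
ranked = sorted(foci, key=lambda f: frequency_selectivity(*foci[f]), reverse=True)
print(ranked)  # → ['#1a', '#2', '#3']
```

A site that responds strongly but equally to both frequency ranges thus scores near zero, which is why the weakly selective foci #3, #6 and #8 are more plausibly read as functional specialization than as tonotopic endpoints.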
1.4.3 Anatomical variation
The anatomical variability of Heschl’s gyrus found in the subjects is in good agreement
with data from recent morphological studies. Penhune and coworkers (1996) reported
an incidence of 20% for duplications of HG in a quantitative MRI study on normal subjects. Leonard and coworkers (1998), using a more refined analysis, concluded that the
incidence of HG duplications increases with the distance from the sagittal plane. The
incidence of what the authors called ‘common stem duplications’ (i.e. where the gyrus
bifurcates at some point between the medial and lateral end) increased up to 40% on the
left- and 50% on the right STP. Complete duplications reaching up to the medial base
of HG were only found in 20% of the subjects. In our subjects, 38% (5/13) of the left
and 46% (6/13) of the right Heschl’s gyri bifurcated and no complete duplications were
detected. This variable duplication pattern is most prominent at lateral aspects of HG
and explains the laterally increased variance of the distance between frontal and occipital
walls of HG (see Fig. 1.1B).
The finding that the HG anatomy constitutes a major source of inter-individual functional variability has implications for future group studies on the auditory cortex. Two
procedures are commonly used to normalize groups of brain volumes: transformation
into Talairach-Fox space and brain warping. Neither of these procedures can adequately
deal with the high anatomical variability of STP. If several functional volumes are subjected to one or to both transformations in order to create an average activation map, the
more subtle activations tend to cancel out. Thus, if possible, the distribution of activation
sites should be studied in individual subjects. If activation sites are to be compared across
subjects, regions of interest can be defined and aligned with individual anatomical landmarks.
Another possibility is to select subjects according to their HG bifurcation pattern, since
warping algorithms can usually handle a consistent number of gyri on STP.
The apparent inter-individual differences in the length and width of Heschl’s gyri, and the associated differences in the extent of cortical areas, raise the question of possible functional significance. We are presently engaged in a study that aims to correlate the individual extent of STP regions with differences in basic auditory discrimination performance.
In sum, (i) the combination of physiologically plausible RFM stimuli differing in
spectral content, (ii) an analysis referring to the individual gyral pattern, and (iii) a careful
comparison of the activation sites with cytoarchitectonical and imaging studies sheds new
light on the results of earlier attempts to reveal the tonotopic organization of auditory
areas on STP. We suggested a hypothesis for interpreting the activation pattern on STP that, we think, better fits the known cytoarchitectonical and physiological properties of the involved cortical areas.
1.5 References
Bilecen, D., K. Scheffler, N. Schmid, K. Tschopp and J. Seelig 1998. Tonotopic organization of the human auditory cortex as detected by BOLD-FMRI. Hearing Research
126(1-2): 19-27.
Di Salle, F., E. Formisano, E. Seifritz, D. E. Linden, K. Scheffler, C. Saulino, G. Tedeschi,
F. E. Zanella, A. Pepino, R. Goebel and E. Marciano 2001. Functional fields in human
auditory cortex revealed by time-resolved fMRI without interference of EPI noise. Neuroimage 13(2): 328-38.
Edmister, W. B., T. M. Talavage, P. J. Ledden and R. M. Weisskoff 1999. Improved
auditory cortex imaging using clustered volume acquisitions. Human Brain Mapping
7(2): 89-97.
Fitzpatrick, K. A. and T. J. Imig 1980. Auditory cortico-cortical connections in the owl
monkey. Journal of Comparative Neurology 192(3): 589-610.
Friston, K. J., C. D. Frith, R. Turner and R. S. Frackowiak 1995a. Characterizing evoked
hemodynamics with fMRI. Neuroimage 2(2): 157-65.
Friston, K. J., A. P. Holmes, J. B. Poline, P. J. Grasby, S. C. Williams, R. S. Frackowiak
and R. Turner 1995b. Analysis of fMRI time-series revisited. Neuroimage 2(1): 45-53.
Galaburda, A. and F. Sanides 1980. Cytoarchitectonic organization of the human auditory
cortex. Journal of Comparative Neurology 190(3): 597-610.
Galaburda, A. M. and D. N. Pandya 1983. The intrinsic architectonic and connectional organization of the superior temporal region of the rhesus monkey. Journal of Comparative
Neurology 221(2): 169-84.
Grady, C. L., J. W. Van Meter, J. M. Maisog, P. Pietrini, J. Krasuski and J. P. Rauschecker
1997. Attention-related modulation of activity in primary and secondary auditory cortex.
Neuroreport 8(11): 2511-6.
Griffiths, T. D., S. Uppenkamp, I. Johnsrude, O. Josephs and R. D. Patterson 2001. Encoding of the temporal regularity of sound in the human brainstem. Nature Neuroscience
4(6): 633-7.
Hackett, T. A., I. Stepniewska and J. H. Kaas 1998. Subdivisions of auditory cortex
and ipsilateral cortical connections of the parabelt auditory cortex in macaque monkeys.
Journal of Comparative Neurology 394(4): 475-95.
Hall, D. A., I. S. Johnsrude, M. P. Haggard, A. R. Palmer, M. A. Akeroyd and A. Q.
Summerfield 2002. Spectral and temporal processing in human auditory cortex. Cerebral
Cortex 12(2): 140-49.
Hall, D. A., M. P. Haggard, M. A. Akeroyd, A. R. Palmer, A. Q. Summerfield, M. R.
Elliott, E. M. Gurney and R. W. Bowtell 1999. “Sparse” temporal sampling in auditory
fMRI. Human Brain Mapping 7(3): 213-23.
Hashimoto, R., F. Homae, K. Nakajima, Y. Miyashita and K. L. Sakai 2000. Functional
differentiation in the human auditory and language areas revealed by a dichotic listening
task. Neuroimage 12(2): 147-58.
Howard, M. A., 3rd, I. O. Volkov, P. J. Abbas, H. Damasio, M. C. Ollendieck and M. A.
Granner 1996. A chronic microelectrode investigation of the tonotopic organization of
human auditory cortex. Brain Research 724(2): 260-4.
Imig, T. J., M. A. Ruggero, L. M. Kitzes, E. Javel and J. F. Brugge 1977. Organization of
auditory cortex in the owl monkey (Aotus trivirgatus). Journal of Comparative Neurology
171(1): 111-28.
Kaas, J. H. and T. A. Hackett 1998. Subdivisions of auditory cortex and levels of processing in primates. Audiology & Neuro-Otology 3(2-3): 73-85.
Kaas, J. H. and T. A. Hackett 2000. Subdivisions of auditory cortex and processing
streams in primates. Proceedings of the National Academy of Sciences of the United
States of America 97(22): 11793-9.
Lancaster, J. L., M. G. Woldorff, L. M. Parsons, M. Liotti, C. S. Freitas, L. Rainey, P. V.
Kochunov, D. Nickerson, S. A. Mikiten and P. T. Fox 2000. Automated Talairach atlas
labels for functional brain mapping. Human Brain Mapping 10(3): 120-31.
Lauter, J. L., P. Herscovitch, C. Formby and M. E. Raichle 1985. Tonotopic organization
in human auditory cortex revealed by positron emission tomography. Hearing Research
20(3): 199-205.
Leonard, C. M., C. Puranik, J. M. Kuldau and L. J. Lombardino 1998. Normal variation
in the frequency and location of human auditory cortex landmarks. Heschl’s gyrus: where
is it? Cerebral Cortex 8(5): 397-406.
Lockwood, A. H., R. J. Salvi, M. L. Coad, S. A. Arnold, D. S. Wack, B. W. Murphy
and R. F. Burkard 1999. The functional anatomy of the normal human auditory system:
responses to 0.5 and 4.0 kHz tones at varied intensities. Cerebral Cortex 9(1): 65-76.
Lohmann, G., K. Mueller, V. Bosch, H. Mentzel, S. Hessler, L. Chen and D. Y. von
Cramon 2001. Lipsia - A new software system for the evaluation of functional magnetic
resonance images of the human brain. Computerized Medical Imaging and Graphics
25(6): 449-457.
Merzenich, M. M. and J. F. Brugge 1973. Representation of the cochlear partition of the
superior temporal plane of the macaque monkey. Brain Research 50(2): 275-96.
Mesulam, M. M. and D. N. Pandya 1973. The projections of the medial geniculate complex within the sylvian fissure of the rhesus monkey. Brain Research 60(2): 315-33.
Morel, A., P. E. Garraghty and J. H. Kaas 1993. Tonotopic organization, architectonic
fields, and connections of auditory cortex in macaque monkeys. Journal of Comparative
Neurology 335(3): 437-59.
Morel, A. and J. H. Kaas 1992. Subdivisions and connections of auditory cortex in owl
monkeys. Journal of Comparative Neurology 318(1): 27-63.
Morosan, P., J. Rademacher, A. Schleicher, K. Amunts, T. Schormann and K. Zilles 2001.
Human primary auditory cortex: cytoarchitectonic subdivisions and mapping into a spatial reference system. Neuroimage 13(4): 684-701.
Norris, D. G. 2000. Reduced power multi-slice MDEFT imaging. Journal of Magnetic
Resonance Imaging 11: 445-51.
Ojemann, G. A. 1983. Brain organization for language from the perspective of electrical
stimulation mapping. Behavioral and Brain Sciences (2): 189-230.
Oldfield, R. C. 1971. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9(1): 97-113.
Pandya, D. N. and F. Sanides 1973. Architectonic parcellation of the temporal operculum
in rhesus monkey and its projection pattern. Zeitschrift fur Anatomie und Entwicklungsgeschichte 139(2): 127-61.
Pantev, C., M. Hoke, K. Lehnertz, B. Lutkenhoner, G. Anogianakis and W. Wittkowski
1988. Tonotopic organization of the human auditory cortex revealed by transient auditory evoked magnetic fields. Electroencephalography & Clinical Neurophysiology 69(2):
160-70.
Pantev, C., M. Hoke, B. Lutkenhoner and K. Lehnertz 1989. Tonotopic organization of
the auditory cortex: pitch versus frequency representation. Science 246(4929): 486-8.
Penhune, V. B., R. J. Zatorre, J. D. MacDonald and A. C. Evans 1996. Interhemispheric
anatomical differences in human primary auditory cortex: probabilistic mapping and volume measurement from magnetic resonance scans. Cerebral Cortex 6(5): 661-72.
Rauschecker, J. P. and B. Tian 2000. Mechanisms and streams for processing of “what”
and “where” in auditory cortex. Proceedings of the National Academy of Sciences of the
United States of America 97(22): 11800-6.
Rauschecker, J. P. 1997. Processing of complex sounds in the auditory cortex of cat,
monkey, and man. Acta Oto-Laryngologica—Supplement 532: 34-8.
Rauschecker, J. P., B. Tian and M. Hauser 1995. Processing of complex sounds in the
macaque nonprimary auditory cortex. Science 268(5207): 111-4.
Rauschecker, J. P., B. Tian, T. Pons and M. Mishkin 1997. Serial and parallel processing
in rhesus monkey auditory cortex. Journal of Comparative Neurology 382(1): 89-103.
Rivier, F. and S. Clarke 1997. Cytochrome oxidase, acetylcholinesterase, and NADPH-diaphorase staining in human supratemporal and insular cortex: evidence for multiple
auditory areas. Neuroimage 6(4): 288-304.
Romani, G. L., S. J. Williamson and L. Kaufman 1982. Tonotopic organization of the
human auditory cortex. Science 216(4552): 1339-40.
Romanski, L. M., J. F. Bates and P. S. Goldman-Rakic 1999. Auditory belt and parabelt projections to the prefrontal cortex in the rhesus monkey. Journal of Comparative
Neurology 403(2): 141-57.
Scheich, H., F. Baumgart, B. Gaschler-Markefski, C. Tegeler, C. Tempelmann, H. J.
Heinze, F. Schindler and D. Stiller 1998. Functional magnetic resonance imaging of a
human auditory cortex area involved in foreground-background decomposition. European Journal of Neuroscience 10(2): 803-9.
Talairach, J. and P. Tournoux 1988. Co-planar Stereotaxic Atlas of the Human Brain.
Stuttgart, Thieme.
Talavage, T. M., P. J. Ledden, R. R. Benson, B. R. Rosen and J. R. Melcher 2000.
Frequency-dependent responses exhibited by multiple regions in human auditory cortex.
Hearing Research 150(1-2): 225-44.
Tian, B., D. Reser, A. Durham, A. Kustov and J. P. Rauschecker 2001. Functional specialization in rhesus monkey auditory cortex. Science 292(5515): 290-3.
Ugurbil, K., M. Garwood, J. Ellermann, K. Hendrich, R. Hinke, X. Hu, S. G. Kim, R.
Menon, H. Merkle and S. Ogawa 1993. Imaging at high magnetic fields: initial experiences at 4 T. Magnetic Resonance Quarterly 9(4): 259-77.
von Economo, C. and G. N. Koskinas 1925. Die Cytoarchitektonik der Hirnrinde. Berlin, Springer.
Wessinger, C. M., M. H. Buonocore, C. L. Kussmaul and G. R. Mangun 1997. Tonotopy in human auditory cortex examined with functional magnetic resonance imaging. Human Brain Mapping 5: 18-25.
Wessinger, C. M., J. VanMeter, B. Tian, J. Van Lare, J. Pekar and J. P. Rauschecker 2001.
Hierarchical organization of the human auditory cortex revealed by functional magnetic
resonance imaging. Journal of Cognitive Neuroscience 13(1): 1-7.
Woldorff, M. G., C. C. Gallen, S. A. Hampson, S. A. Hillyard, C. Pantev, D. Sobel and
F. E. Bloom 1993. Modulation of early sensory processing in human auditory cortex
during auditory selective attention. Proceedings of the National Academy of Sciences of
the United States of America 90(18): 8722-6.
Worsley, K. J. and K. J. Friston 1995. Analysis of fMRI time-series revisited—again. Neuroimage 2(3): 173-81.
Chapter 2
Hierarchical processing of sound
location and motion in the human
brainstem and planum temporale
Abstract
Horizontal sound localization relies on the extraction of binaural acoustic cues by integration of the signals from the two ears at the level of the brainstem. The present experiment
was aimed at detecting the sites of binaural integration in the human brainstem using
fMRI and a binaural difference (BD) paradigm, in which the responses to binaural sounds
were compared with the sum of the responses to the corresponding monaural sounds.
The experiment also included a moving sound condition, which was contrasted against a
spectrally and energetically matched stationary sound condition to assess which of the structures involved in general binaural processing are specifically specialized in
motion processing. The BD contrast revealed a substantial binaural response suppression
in the inferior colliculus (IC) in the midbrain, the medial geniculate body in the thalamus,
and the primary auditory cortex (PAC). The size of the suppression suggests that it was
brought about by neural inhibition at a level below the IC, the only possible candidate
being the superior olivary complex. Whereas all structures up to and including the PAC
were activated as strongly by the stationary as by the moving sounds, non-primary auditory fields in the planum temporale (PT) responded selectively to the moving sounds.
These results suggest a hierarchical organization of auditory spatial processing in which
the general analysis of binaural information begins as early as the brainstem, while the
representation of dynamic binaural cues relies on non-primary auditory fields in the PT.
2.1 Introduction
In humans, horizontal sound localization mainly relies on the analysis of interaural differences in sound arrival time and level by comparison of the signals from the two ears.
The processing of these binaural cues begins at the level of the superior olivary complex
(SOC) in the brainstem. Neurons in the medial superior olive (MSO) receive excitatory
projections from both cochleae (EE neurons), and their responses are facilitated by coincident binaural input (cat: Yin and Chan, 1990). In contrast, the lateral superior olive (LSO) contains neurons whose main input from one cochlea is inhibitory and from the other excitatory (EI neurons). EE neurons are sensitive to interaural time differences (ITDs;
Joris et al., 1998), whereas EI neurons are sensitive to both ITDs (Joris and Yin, 1995;
Batra et al., 1997) and interaural level differences (ILDs; Tollin 2003).
While it is generally assumed that the brainstem plays a vital role in spatial hearing,
there is still little consensus about the mechanisms underlying the processing of binaural
cues in the brainstem. Part of the problem is the difficulty of investigating brainstem binaural processing in humans: so far, the sole established correlate of binaural integration
in the human brainstem is the binaural difference (BD) in the auditory evoked potentials (AEPs), often referred to as binaural interaction component (Riedel and Kollmeier,
2001). The BD is defined as the difference between the response to a binaural sound
and the sum of the responses to the corresponding monaural sounds presented separately
[BD = Bin − (Left + Right)]. Any deviation from zero BD is interpreted as an indication of binaural functional coupling. In particular, the BD would be expected to be
positive for EE neurons, and negative for EI neurons. An EI neuron’s binaural response
would be even smaller than the response to the neuron’s excitatory monaural input alone.
The BD in the human brainstem AEPs is invariably negative, amounting to about 14-23%
of the sum of the monaural responses (McPherson and Starr, 1993). When interpreting
the sign of the BD in the AEPs, however, one needs to keep in mind that the AEPs represent spatially distributed activity (Kaufman et al., 1981), and so, both EE and EI neurons
may contribute to the BD, their respective effects partially canceling out.
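The BD operation can be sketched numerically; the response amplitudes below are hypothetical illustrations of EE-like and EI-like behavior, not values from this study:

```python
# Sketch of the binaural difference (BD) operation; the response amplitudes
# below are hypothetical illustrations, not measured values.

def binaural_difference(bin_resp, left_resp, right_resp):
    """BD = Bin - (Left + Right); zero if binaural responses sum linearly."""
    return bin_resp - (left_resp + right_resp)

# EE-like behavior: coincident binaural input facilitates the response,
# so the binaural response exceeds the sum of the monaural ones (BD > 0).
bd_ee = binaural_difference(bin_resp=2.5, left_resp=1.0, right_resp=1.0)

# EI-like behavior: the binaural response is even smaller than the response
# to the excitatory monaural input alone, so the BD is strongly negative.
bd_ei = binaural_difference(bin_resp=0.6, left_resp=1.0, right_resp=0.1)
```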
The present study investigates the BD with fMRI. The aims were (1) to devise a
method that would make it possible to image brainstem binaural processing in a spatially specific
manner, and (2) to characterize sites of facilitatory and inhibitory binaural interaction in
the ascending auditory pathway. The experiment also included a motion paradigm similar
to those used in previous fMRI studies of spatial hearing (Baumgart et al., 1999; Warren
et al., 2002), in which moving sounds were contrasted against appropriately matched stationary sounds. The comparison between the BD and the motion contrast was expected to
reveal which of the regions involved in general binaural processing are specialized in motion processing, and thus complement the physiological data on this
question (Spitzer and Semple, 1998; Malone et al., 2002; McAlpine and Palmer, 2002).
2.2 Materials and Methods
2.2.1 Stimuli and experimental protocol
The experiment comprised two binaural and two monaural sound conditions as well as
a silence condition (Sil). In the monaural conditions (Left, Right), trains of noise bursts
CHAPTER 2. BINAURAL PROCESSING IN THE HUMAN BRAINSTEM
were played either to the left or right ear separately. In the binaural conditions (Diotic,
Move), the same noise bursts were played to both ears simultaneously. The two binaural
conditions differed from each other only in the sounds’ interaural temporal properties: in
the Diotic condition, the noise bursts were identical at both ears, so the perception was
that of a stationary sound in the center of the head. In the Move condition, the noise
bursts were presented with an ITD that varied continuously between −1000 and 1000 µs,
to create the perception of a sound that moves back and forth between the two ears. By
convention, a positive ITD means that the sound to the left ear is lagging the sound to the
right ear, whereas a negative ITD denotes the reverse situation. The ITD variation in the
Move condition was linear with a rate of 1000 µs per s, so it took 2 s for the sounds to move
from one ear to the other. The starting point of the movement was randomized from trial
to trial. In both binaural conditions, the noise bursts had the same energy at both ears,
and the energy to each ear was equal to the energy of either of the monaural noises. The
noise bursts had a duration of 50 ms; they were filtered between 200 and 3200 Hz and
presented at a rate of 10 per s. The noise was continuously generated afresh (Tucker Davis
Technologies, System 3), so that none of the noise bursts was ever repeated during the
experiment. The sounds were presented through electrostatic headphones (Sennheiser)
that passively shielded the listener from the scanner noise.
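The interaural timing of the Move condition can be illustrated as follows. Only the numerical parameters (±1000 µs range, 1000 µs/s rate, 10 bursts per s) come from the description above; the function name, the triangle-wave formulation, and the start convention are assumptions of this sketch:

```python
import numpy as np

# Sketch of the Move condition's ITD trajectory: the ITD sweeps linearly
# between -1000 and +1000 microseconds at 1000 microseconds per second
# (a full ear-to-ear sweep therefore takes 2 s), sampled at the 10-per-s
# burst rate. Function name and start convention are assumptions.

def itd_trajectory(duration_s, rate_us_per_s=1000.0, limit_us=1000.0,
                   burst_rate_hz=10.0, start_us=0.0):
    """ITD (microseconds) applied at each 50-ms noise burst onset."""
    t = np.arange(0.0, duration_s, 1.0 / burst_rate_hz)  # burst onset times
    period_s = 4.0 * limit_us / rate_us_per_s            # full back-and-forth
    phase = ((start_us + limit_us) / rate_us_per_s + t) % period_s
    ramp = rate_us_per_s * phase                         # 0 .. 4*limit_us
    # Fold the ramp into a triangle wave between -limit_us and +limit_us.
    return np.where(ramp <= 2 * limit_us, ramp - limit_us, 3 * limit_us - ramp)
```

With the default start at 0 µs, the ITD reaches +1000 µs after 1 s, returns through 0 µs at 2 s, and reaches −1000 µs at 3 s, consistent with a 2-s ear-to-ear sweep.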
Cardiac gating (Guimaraes et al., 1998) was used to minimize motion artifacts in
the brainstem signal due to pulsation of the basilar artery. The functional images were
triggered 300 ms after the R-wave in the electrocardiogram, when the cardiac cycle is
in its diastolic phase. The sparse imaging technique (Hall et al., 1999) was applied to
avoid masking of the experimental sounds by the scanner noise and reduce the effect of
scanner noise on the recorded activity. The gaps between consecutive image acquisitions,
during which the sounds or the silence were presented, had a duration of about 7 s. The
exact duration of the gaps, and thus also the repetition time of the image acquisitions
(TR), varied slightly due to cardiac gating. The average TR over all listeners and trials
amounted to 10.5 s. The experimental conditions were presented in epochs, during which
five images were acquired. Four sound epochs containing the four sound conditions in
pseudorandom order were alternated with a single silence epoch. A total of 250 images
(corresponding to 50 epochs) were acquired per listener.
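The epoch schedule can be sketched in a few lines; only the counts (four sound conditions, one silence epoch, five images per epoch, 250 images in total) come from the text, while the block grouping and the seeding are assumptions of this illustration:

```python
import random

# Sketch of the epoch schedule: blocks of four sound epochs in pseudorandom
# order alternate with a single silence epoch; five images are acquired per
# epoch, giving 50 epochs and 250 images. Grouping and seed are assumptions.

def build_epoch_sequence(n_blocks=10, images_per_epoch=5, seed=0):
    rng = random.Random(seed)
    epochs = []
    for _ in range(n_blocks):
        sounds = ["Left", "Right", "Diotic", "Move"]
        rng.shuffle(sounds)               # pseudorandom order of sound epochs
        epochs.extend(sounds + ["Sil"])   # one silence epoch after each block
    return epochs, len(epochs) * images_per_epoch

epochs, n_images = build_epoch_sequence()  # 50 epochs, 250 images
```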
Listeners were asked to attend to the sounds and take particular notice of their spatial
attributes. To avoid eye movements in the direction of the sounds, the listeners had to
fixate a cross at the midpoint of the visual axis and perform a visual control task. The
task was to press a button with the left or right index finger upon each occurrence of the
capital letter ‘Z’ in either of two simultaneous, but uncorrelated, sequences of random
one-digit numbers that were shown to the left and the right of the fixation cross. The
numbers were presented once every 2 s for 50 ms.
2.2.2 fMRI data acquisition
Blood-oxygen level dependent (BOLD) contrast images were acquired with a 3-T Bruker
Medspec whole body scanner using gradient echo planar imaging (average TR = 10.5 s;
TE = 30 ms; flip angle = 90°; acquisition bandwidth = 100 kHz). The functional images
consisted of 28 ascending slices with an in-plane resolution of 3×3 mm, a slice thickness of
3 mm and an inter-slice gap of 1 mm. The slices were oriented along the line connecting
the anterior and posterior commissures and positioned so that the lowest slices covered
the cochlear nucleus (CN) just below the pons. They were acquired in direct temporal
succession. The acquisition time amounted to 2.1 s. A high-resolution structural image
was acquired from each listener using a 3D MDEFT sequence (Ugurbil et al., 1993) with
128 1.5-mm slices (FOV = 25×25×19.2 cm; data matrix 256×256; TR = 1.3 s; TE =
10 ms). For registration purposes, a set of T1-weighted EPI images were acquired using
the same parameters as for the functional images (inversion time = 1200 ms; TR = 45 s;
four averages).
2.2.3 Data analysis
The data were analyzed with the software package LIPSIA (Lohmann et al., 2001). The
functional images of each listener were corrected for head motion and rotated into the
Talairach coordinate system by co-registering the structural MDEFT and EPI-T1 images
acquired in this experiment with a high-resolution structural image residing in a listener
database. The functional images were then normalized and were spatially smoothed with
two different Gaussian kernels (3 and 10 mm full width at half maximum; FWHM) to
optimize for the signals from the brainstem and the cortex, respectively. The auditory
structures in the brainstem are only a few millimeters in size and their location with respect
to macro-anatomical landmarks varies little across individuals, and so, the chances of detecting auditory activity in the brainstem can be increased by using a small smoothing
kernel. In contrast, auditory cortical regions are comparatively large and their boundaries exhibit a considerable inter-individual variability with respect to macro-anatomy
(Rademacher et al., 2001), which means that a larger smoothing kernel is more suitable
for analyzing the auditory cortical signal. The smoothed image time series of twelve
listeners, comprising a total of 3000 image volumes, were subjected to a fixed-effects
group analysis using the general linear model. Each of the five experimental conditions
(silence and four sound conditions) was modeled as a box-car function convolved with
a generic hemodynamic response function including a response delay of 6 s. The data
were highpass filtered at 0.0019 Hz to remove low-frequency drifts, and lowpass filtered
by convolution with a Gaussian function (4 s FWHM) to control for temporal autocorrelation. The height threshold for activation was t = 3.1 (p < 0.001 uncorrected).
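A minimal sketch of one such regressor is given below. The box-car-convolved-with-HRF construction and the roughly 6-s response delay come from the description above; the gamma-variate response shape, the function names, and the 52.5-s epoch example (five scans at the average TR of 10.5 s) are assumptions, not the exact LIPSIA implementation:

```python
import numpy as np

# Sketch of a single GLM regressor: a box-car for one condition convolved
# with a generic hemodynamic response peaking about 6 s after onset. The
# gamma-variate shape is an assumption, not the exact LIPSIA kernel.

def hrf(t, delay=6.0):
    """Generic gamma-variate response peaking at `delay` seconds."""
    t = np.maximum(np.asarray(t, dtype=float), 0.0)
    return (t / delay) ** delay * np.exp(delay - t)

def boxcar_regressor(onsets_s, dur_s, tr_s, n_scans, dt=0.1):
    """Convolve a box-car (1 during epochs, 0 otherwise) with the HRF."""
    n = int(round(n_scans * tr_s / dt))
    stim = np.zeros(n)
    for onset in onsets_s:
        stim[int(round(onset / dt)):int(round((onset + dur_s) / dt))] = 1.0
    kernel = hrf(np.arange(0.0, 32.0, dt))        # 32-s kernel support
    reg = np.convolve(stim, kernel)[:n] * dt      # continuous-time regressor
    scan_idx = np.rint(np.arange(n_scans) * tr_s / dt).astype(int)
    return reg[scan_idx]                          # sample at scan times

# One sound epoch of five scans at the average TR of 10.5 s:
reg = boxcar_regressor(onsets_s=[0.0], dur_s=52.5, tr_s=10.5, n_scans=10)
```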
2.2.4 Listeners
Twelve right-handed listeners (6 male, 6 female) between 23 and 32 years of age, with no
history of hearing disorder or neurological disease, participated in the experiment after
having given informed consent. The experimental procedures were approved by the local
ethics committee.
2.3 Results
2.3.1 Comparison between all sounds and silence
In order to reveal brain regions that showed a general sensitivity to the noise stimuli used
in the present experiment, and thereby identify possible candidates for nonlinear binaural
Figure 2.1: Activation for the contrast between all four sound conditions (Left, Right, Diotic and Move)
and the silent baseline (Sil), rendered onto the average structural image of the group. The left
column depicts three coronal slices at y = −42, −36 and −29 mm (from bottom to top). The
lower two panels in the right column show axial slices at z = −29 and −2 mm; the slice shown
in the upper right panel is oriented parallel to the Sylvian fissure (see small inset at the top). The
color bar (top) shows the t-values for the statistical comparison. The contrast revealed bilateral
activation in the cochlear nucleus (CN), the inferior colliculus (IC), the medial geniculate body
(MGB) and the auditory cortex (AC) on the supratemporal plane.
interaction, we first compared the average activation produced by all sound conditions
(Left, Right, Diotic and Move) to the activation in the silence condition. This all sounds
versus silence contrast revealed bilateral activation at four different levels of the auditory
processing hierarchy (Fig. 2.1).
The lower two panels of Fig. 2.1 show activation in both cochlear nuclei (CN). The
CN is the first processing stage in the auditory system and receives purely monaural
input from the ipsilateral cochlea. The location of the CN activations with respect to
macroanatomical landmarks (Fig. 2.1) corresponds well with the location of the respective activations in the data of Griffiths et al. (2001; see their Fig. 2.2) and Melcher et al.
(1999; see their Fig. 2.4). The Talairach coordinates of the left and right CN activations
amounted to −14, −42, −30 mm and 10, −42, −30 mm, respectively. These coordinates
transform to about −14, −42, −38 mm and 10, −42, −38 mm in the MNI space, which
brain region   coordinates x, y, z   z-value
left CN        −14, −42, −30         > 3.1
right CN        10, −42, −30         > 3.1
left IC         −8, −36, −3          > 4.5
right IC         4, −36, −3          > 4.5
left MGB       −17, −30, 0           > 3.1
right MGB       13, −30, −3          > 3.1
left STP       −47, −27, 12          > 16
right STP       40, 35, 25           > 19
Table 2.1: Talairach coordinates and z-values of auditory activation foci in the all sounds versus
silence contrast. CN: cochlear nucleus; IC: inferior colliculus; MGB: medial geniculate body;
STP: supratemporal plane
corresponds reasonably with the respective coordinates reported by Griffiths et al. (−12,
−40, −46 mm and 8, −34, −48 mm).
The middle panels of Fig. 2.1 show activation in the inferior colliculi (IC) in the
midbrain and the medial geniculate bodies (MGBs) in the thalamus (right panel). The IC
is the last auditory processing stage in the brainstem and contains a mandatory synapse
for all ascending auditory pathways. The inferior colliculi are strongly interconnected by
commissural fibers, suggesting that the IC plays an important role in binaural
processing. The MGB activation can also be seen in the upper left panel of Fig. 2.1. As
for the CN, the Talairach coordinates of the most significantly activated voxel in the IC
and MGB (Table 2.1) correspond well with the coordinates of the respective activations
reported by Griffiths et al. (2001). The upper right panel depicts a slice parallel to the
Sylvian fissure, showing activation in the auditory cortices.
The SOC failed to exhibit any significant activation in the all sounds versus silence
contrast, and indeed in any of the other contrasts tested, probably because it is too small
to be detectable with standard-resolution fMRI sequences. In humans, the largest nucleus
of the SOC, the MSO, has a rostrocaudal extent of about 2.6 mm and a dorsoventral
extent of 1.8–2.4 mm (Bazwinsky et al., 2003), which is smaller than even a single voxel
in the functional images, or the width of the spatial smoothing kernel (3 mm). Thus, even
in the case that the SOC completely falls into a single voxel, which is in itself improbable,
the activation of this voxel would probably fail to reach statistical significance.
2.3.2 BD contrast
In order to reveal sites of facilitatory and inhibitory binaural interactions, which underlie
the processing of auditory spatial information, the sum of the hemodynamic responses to
the left and right monaural sounds (Left, Right) was compared with the response to the
diotic binaural sound (Diotic). This comparison is analogous to the binaural difference
operation that has previously been applied to AEP data (Riedel and Kollmeier, 2002).
The particular difficulty in applying this operation to fMRI data lies in the fact that it
involves comparing a single sound condition (Diotic) with the sum of two sound conditions (Left+Right). Such a comparison would be unbalanced for any of the non-auditory
processes that were also active during sound presentation, such as the visual control
task, and the corresponding contrast would not be estimable if the baseline is modeled
explicitly, as was the case in the current experiment. The problem can be circumvented
by referencing each of the sound conditions in the BD contrast to the silence condition
(Sil), yielding BD = (Diotic − Sil) − [(Left − Sil) + (Right − Sil)], which reduces
to BD = (Diotic + Sil) − (Left + Right). By this means, the BD contrast was not
only balanced for any non-auditory processes that were active during both sound and silence epochs, but also for sound energy, because the intensity and presentation rate of the
left and right monaural sounds were equal to those of the left- and right-ear stimuli in
the diotic sound, and the silence condition did not contain any sound energy. Balancing
the contrast for sound energy is the prerequisite for recording nonlinear (facilitatory and
inhibitory) binaural interactions. In particular, testing for a negative BD (−BD > 0) reveals regions whose binaural response is suppressed relative to the monaural responses,
whereas a positive BD (BD > 0) would be associated with regions that exhibit facilitatory
binaural coupling.
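Expressed as contrast weights over the five modeled conditions, the balancing works out as follows (a sketch; the ordering of the conditions is an arbitrary choice of this illustration):

```python
import numpy as np

# Sketch of the balanced BD contrast as weights over the five modeled
# conditions, ordered here as [Diotic, Left, Right, Move, Sil].
# BD = (Diotic - Sil) - [(Left - Sil) + (Right - Sil)] reduces to
# Diotic + Sil - Left - Right, whose weights sum to zero.

conditions = ["Diotic", "Left", "Right", "Move", "Sil"]
bd = np.array([1.0, -1.0, -1.0, 0.0, 1.0])

# The naive version Diotic - (Left + Right) is unbalanced: its weights sum
# to -1, so any process common to all conditions would bias the contrast.
naive = np.array([1.0, -1.0, -1.0, 0.0, 0.0])
```

A zero weight sum means that any activity common to all conditions, auditory or not, cancels out of the contrast.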
The BD contrast yielded a significant bilateral response in the IC, the MGB, and the
medial and central part of Heschl’s gyrus (HG), which is the site of the primary auditory
cortex (PAC) in humans (Fig. 2.2a). In contrast, the CN exhibited no significant BD response, as would be expected, since the CN receives purely monaural input. In all regions
which showed a significant BD contrast, the BD was invariably negative. In fact, the size
of the binaural response (red bars in Fig. 2.2b) never exceeded 50% of the sum of the
monaural responses (horizontal dashed lines on blue bars in Fig. 2.2b). On average, the
proportion between the binaural response and the sum of the monaural responses was
37% in the IC, and 46% in both the MGB and PAC. Thus, the binaural response was not
only smaller than the sum of the monaural responses, but was also smaller than the larger
of the two monaural responses alone. From the level of the IC upwards, contralateral
monaural sounds usually produce a larger response than ipsilateral sounds, which is consistent with the notion that the majority of ascending auditory pathways cross from the
ipsilateral to the contralateral side below the level of the IC (Melcher et al., 1999, 2000;
Pantev et al., 1998; Woldorff et al., 1999). In the present experiment, the ratio between
the contralateral and ipsilateral monaural responses amounted to an average of 153% in
the IC, 124% in the MGB, and 126% in the PAC. Thus, when expressed relative to the
contralateral monaural response, the suppression of the binaural response averaged 38%
in the IC, 17% in the MGB, and 18% in the PAC.
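The internal consistency of these percentages can be verified with a little arithmetic; in the sketch below, `f` and `r` are simply the rounded ratios reported above, so the outputs agree with the reported suppression values only up to rounding:

```python
# Sketch checking that the reported percentages are mutually consistent.
# If the binaural response is a fraction f of the summed monaural responses
# and the contra/ipsi ratio is r, then Bin/Contra = f * (1 + r) / r, and the
# suppression relative to the contralateral response is 1 - f * (1 + r) / r.

def suppression_vs_contra(f_bin_over_sum, contra_over_ipsi):
    r = contra_over_ipsi
    # Contra = r * Ipsi; Left + Right = (1 + r) * Ipsi; Bin = f * (1 + r) * Ipsi
    return 1.0 - f_bin_over_sum * (1.0 + r) / r

ic  = suppression_vs_contra(0.37, 1.53)  # ~0.39; reported 38 % (inputs rounded)
mgb = suppression_vs_contra(0.46, 1.24)  # ~0.17; reported 17 %
pac = suppression_vs_contra(0.46, 1.26)  # ~0.17; reported 18 %
```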
Testing for a positive BD yielded no significant activation whatsoever, anywhere in
the auditory pathway.
2.3.3 Motion contrast
In order to assess which brain regions are specialized in auditory motion processing and
whether they overlap with those regions that are involved in general binaural processing
as shown by the BD contrast, we compared the activation produced by the Move condition
to the activation produced by the Diotic condition. The Move and Diotic conditions only
differed in the noise bursts’ interaural temporal characteristics, the ITD being fixed at
0 µs in the Diotic condition and varying linearly over time in the Move condition. The
motion contrast (Fig. 2.3) revealed significant bilateral activation in the planum temporale (PT),
viz, the part of the supratemporal plane that lies posterior to HG, and the temporo-parietal
Figure 2.2: BD contrast. Panel a: activation to the BD contrast rendered onto two coronal slices at
y = –36 and –29 mm (left and middle) and one oblique slice oriented parallel to the Sylvian fissure
as in Fig. 2.1 (right). The BD contrast yielded bilateral activation in the IC, the MGB and on HG
in the region of the primary auditory cortex; there was no activation on the planum temporale
(PT), behind HG. Panel b shows the size of the response to the binaural stationary sounds (Diotic;
red bars) and the sum of the responses to the two monaural sounds (Left+Right; blue bars) relative
to the silent baseline in each of these regions. The binaural response never exceeded 50% of the
sum of the monaural responses (horizontal dashed lines). The absence of BD activation in the
PT (at the location of the most significant voxel in the motion contrast) was due to the fact that
the responses to all of the stationary sounds (Left, Right and Diotic) on the whole were greatly
reduced in this region (two right-most sets of bars).
Figure 2.3: Motion contrast. Upper panel: activation to the motion contrast (blue highlight)
rendered onto an oblique slice oriented parallel to the Sylvian fissure (a), a sagittal slice at x =
54 mm (b), and two coronal slices at y = −36 and −29 mm (c, d). For comparison, the red
highlight shows the activation to the BD contrast. Whereas the BD produced activation in the IC,
the MGB and on HG, the activation to the motion contrast was largely confined to the PT and the
TPJ, behind HG. Panel e shows the contrast-weighted beta-values for the motion contrast (blue
bars) and the negative BD contrast (−BD; red bars) in each of these regions. The gray shading
highlights those regions, where the motion response surpassed the BD response.
junction (TPJ; blue highlight in Fig. 2.3a/b). The PT contains non-primary auditory fields
and, like the TPJ, has previously been implicated with the processing of sound location
and sound movement (Baumgart et al., 1999; Warren et al., 2002; Zatorre et al., 2002).
Unlike the BD contrast, the motion contrast did not produce any significant activation
in the IC, the MGB, or the PAC on HG. Usually, the absence of activation is difficult
to interpret, because activation may still be present even if it does not meet the
significance criterion. In the current experiment, however, the response to the motion contrast
can be directly compared with the BD response. In the IC and MGB, the motion response
(blue bars in Fig. 2.3e) was minuscule compared to the BD response (red bars in Fig. 2.3e).
In the PAC, the motion contrast produced a small response, which did not, however, reach
statistical significance. Only in the PT and the TPJ was the motion response larger than
the BD response (gray highlight in Fig. 2.3e). Even though the BD contrast produced no
significant activation in the PT (at the location of the most significant voxel in the motion
contrast), the response to the stationary binaural sound (Diotic) was still smaller than the
sum of the monaural responses (Left+Right), as is shown by the two right-most sets of
bars in Fig. 2.2b. The absence of activation to the BD contrast in the PT was due to the fact
that the responses to all stationary sound conditions on the whole (Left, Right and Diotic)
were very small in this region.
2.4 Discussion
In this study, we present a new paradigm that makes it possible to investigate binaural processing in the human brainstem in a spatially specific manner using fMRI. The BD contrast
revealed a substantial binaural interaction in the IC, the MGB, and the PAC. Interestingly, the BD was invariably negative in these regions. In fact, the binaural response
was not only smaller than the sum of the monaural responses, but was even smaller than
the contralateral monaural response alone. This finding suggests that the observed binaural suppression was caused by inhibitory processes rather than by response saturation
(Fig. 2.4). Saturation alone could produce binaural suppression in the absence of any inhibition
(Fig. 2.4a). In this case, however, the binaural response should be larger than the larger
of the two monaural responses, which is the response to the contralateral monaural sound
from the level of the IC upwards (Contra in Fig. 2.4a). The fact that the observed binaural
response was actually smaller than the contralateral monaural response, particularly in
the IC, strongly suggests that binaural suppression involves inhibition (Fig. 2.4b). The
current results are consistent with those of Jäncke et al. (2002), who found that the superior temporal response to binaural consonant-vowel syllables and tones is smaller than
the sum of the responses to the corresponding monaural sounds, and in some cases even
smaller than the response to the contralateral sound alone.
It is important to bear in mind that the hemodynamic effects of inhibitory and excitatory processes are likely to be indistinguishable at the synaptic level, because both kinds
of processes are metabolically similarly costly. Thus, the observed binaural suppression
probably reflects inhibitory processes at a level below the IC, the only possible candidate
being the SOC. The fact that the relative size of the BD was largest in the IC and decreased slightly towards higher levels, suggests that the binaural suppression in the MGB
and PAC was simply ‘inherited’ from the IC, which implies that inhibition at and above
the level of the IC does not affect the monaural and binaural responses differentially.
Physiological data in mammals have shown that not only EI neurons in the LSO,
but also EE neurons in the MSO receive inhibitory inputs, mediated by extremely fast
projections via the trapezoid bodies (Oertel 1997, 1999; Grothe, 2000). The inhibition
in the MSO is tightly time-locked to the sound’s temporal structure and may thus play
an important part in the processing of ITDs (guinea pig: Brandt et al., 2002; Grothe
2003). Any binaural suppression effected by this temporally precise inhibition would be
expected to depend on the exact temporal register between the sounds at the two ears.
Whether or not such temporally precise inhibition contributes to binaural suppression
could be tested, for instance, by measuring the BD contrast with interaurally delayed or
uncorrelated rather than diotic noise.
In non-human mammals, the inhibitory input to EI neurons in the LSO is mediated
by the medial nucleus of the trapezoid body (MNTB). Intriguingly, anatomical studies
Figure 2.4: Schematic representations of saturation and inhibition accounts of binaural suppression. The left part of each panel shows the responses to the ipsi- and contralateral monaural sounds
(Ipsi, Contra) and to the binaural sound (Bin); the right part of each panel shows the binaural response (Bin) in comparison with the sum of the monaural responses (L+R). Response saturation
(schematically represented by curved, solid line in panel a) may cause binaural response suppression even in the absence of any inhibition; in this case, the binaural response would be expected
to be larger than the larger of the two monaural responses (Contra), and thus larger than 50% of
the sum of the monaural responses (horizontal dashed lines). The fact that the binaural response
was actually smaller than 50% of the sum of the monaural responses indicates that suppression
was brought about by the convergence of excitatory (+) and inhibitory (−) effects from the two
ears (panel b).
have so far failed to yield unequivocal evidence for the existence of a human MNTB
(Moore, 2000; Bazwinsky et al., 2003). The current data indicate, however, that neural
inhibition is a prominent feature of binaural integration in the human SOC, suggesting
that a structure functionally and phylogenetically equivalent to the MNTB may also exist
in humans.
2.4.1 Is the inhibition exerted by the ipsilateral or the contralateral signal?
Physiological data indicate that the vast majority of EI-type neurons in and above the
IC are excited by contralateral and inhibited by ipsilateral input (Imig and Adrian, 1977;
Middlebrooks and Zook, 1982; Reser et al., 2000; Tollin, 2003), suggesting that the
binaural suppression observed in the present study reflects inhibition that the ipsilateral
signal exerts on the contralateral one. In contrast, accounts of the right ear advantage in
dichotic listening (Tervaniemi and Hugdahl, 2003) are generally based on the assumption that the stronger contralateral signal suppresses the weaker ipsilateral signal before
reaching the left-hemisphere speech system (Kimura, 1967). It is difficult to reconcile
the notion of contralateral suppression with the ipsilateral inhibition effected by EI neurons in the IC and auditory cortex, other than by assuming that the two processes are
functionally unrelated and that any contralateral suppression possibly occurs above the
level of auditory cortex. Recently, Fujiki et al. (2002) reported evidence for contralateral suppression in auditory cortex using the so-called frequency-tagging method and
magnetoencephalography (see also Kaneko et al., 2003). However, the validity of their
conclusions is challenged by the fact that a good part of the putatively ‘binaural’ suppression obtained with the frequency-tagging method may actually be an entirely monaural
effect (Picton et al., 1987; Lins and Picton, 1995; Draganova et al., 2002).
2.4.2 Absence of binaural facilitation
The absence of any evidence for facilitatory binaural interaction in the current data is surprising from the point of view of the prevalent theories of binaural processing (Colburn,
1996). According to the Jeffress model (Jeffress, 1949), which is still the basis of most
of the current models of interaural temporal processing (Joris et al., 1998; see however
Fitzpatrick et al., 2002), ITDs are processed by EE-type neurons that are tuned to narrow
ranges of ITDs by virtue of a coincidence mechanism. This mechanism would be expected to produce strongly facilitated binaural responses at each neuron’s best ITD, viz,
the ITD producing maximal discharge. The best ITD is assumed to vary parametrically
across neurons to create a topographic map of ITD, with a concentration of best ITDs
around the midline (0 µs), where ITD perception is most accurate (Durlach and Colburn,
1978). Midline sounds with a large proportion of low-frequency energy, like the diotic
noise bursts in the current experiment, would thus be expected to elicit a strongly facilitated response in Jeffress-type coincidence neurons, and the complete absence of any
facilitation in the current data calls the model into question. Many MSO neurons actually
behave like coincidence detectors, in that they are strongly sensitive to ITDs and exhibit
facilitated binaural responses at their best ITD (Joris et al., 1998). However, in small
rodents, the majority of best ITDs in the MSO and IC have been found to be concentrated
around a mean of 200 to 300 µs, well away from the midline and outside the range of
ITDs that these animals encounter in natural sounds (McAlpine et al., 2001; Brandt et
al., 2002; McAlpine and Grothe, 2003). If these physiological results generalize to humans, the absence of any facilitatory responses to the midline sounds used in the current
experiment would be unsurprising. In that case, one may expect to observe facilitatory
responses to strongly lateralized sounds with ITDs of several hundred microseconds.
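To make the coincidence mechanism concrete, a Jeffress-type place code can be caricatured in a few lines; this is an illustration only, and the Gaussian tuning width and the chosen range of best ITDs are arbitrary assumptions:

```python
import numpy as np

# Caricature of a Jeffress-type place code (illustration only; the Gaussian
# tuning and the chosen ITD range are arbitrary): each unit fires maximally
# when its internal delay compensates the stimulus ITD, and a bank of units
# with graded best ITDs forms a topographic map.

def coincidence_response(itd_us, best_itd_us, tuning_us=200.0):
    """Response of a unit whose best ITD is best_itd_us (microseconds)."""
    return np.exp(-0.5 * ((itd_us - best_itd_us) / tuning_us) ** 2)

best_itds = np.linspace(-1000.0, 1000.0, 21)        # map of best ITDs
responses = coincidence_response(500.0, best_itds)  # stimulus ITD of +500 us
decoded = best_itds[np.argmax(responses)]           # the peak labels the ITD
```

Under this scheme a diotic (0-µs) stimulus would strongly facilitate the midline units, which is exactly the facilitation that was absent from the present data.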
Nonetheless, the absence of any evidence for binaural facilitation in the current data
remains surprising in view of the fact that binaural sounds are perceived as about twice as
loud as the corresponding monaural sounds (Hirsch, 1948), and the finding that activity
in the PAC increases with increasing loudness (Hart et al., 2002). Our observation that
the binaural response was less than half as large as the sum of the monaural responses on
the entire HG suggests that binaural loudness summation is represented by some means other than an
increase in discharge rate in the PAC.
2.4.3 Hierarchical processing of binaural cues
Whereas the BD contrast revealed activation in the brainstem (IC), the thalamus (MGB)
and the PAC on HG, the motion contrast only produced activation in non-primary auditory
fields in the PT and in the TPJ, posterior to HG. Thus, the BD paradigm and the motion
paradigm yield complementary measures of auditory spatial processing, which appear
to be associated with different levels in the processing hierarchy. The stationary and
moving binaural sounds produced similar activations up to the level of and including the
PAC. In contrast, in the PT, the activation to all stationary sounds was greatly reduced
relative to the lower levels, whereas the moving sounds still produced a sizeable response.
The reduction of the responses to the stationary sounds may be due to the fact that non-primary auditory fields exhibit largely phasic responses to prolonged auditory stimuli,
whereas responses in and below the PAC are more tonic (Giraud et al., 2000; Harms and
Melcher, 2002; Seifritz et al., 2002). Phasic responses would be expected to produce a
lesser activation than tonic responses in the blocked sparse imaging design used in the
current experiment. The results suggest that the processing of motion conveyed by time-varying interaural cues (ITDs) starts in the PT, and that motion sensitivity in the PT is
established by adaptation to invariant sound features.
In summary, this study shows that the BD paradigm makes it possible to measure brainstem
binaural processing with fMRI. Comparing the BD and motion paradigms revealed a hierarchical organization of binaural processing in humans, with binaural integration starting below the IC, and motion sensitivity emerging only above the level of the PAC, in the
PT.
2.5 References
Batra R, Kuwada S, Fitzpatrick DC (1997) Sensitivity to interaural temporal disparities
of low- and high-frequency neurons in the superior olivary complex. I. Heterogeneity of
responses. J Neurophysiol 78:1222–1236.
Baumgart F, Gaschler-Markefski B, Woldorff MG, Heinze HJ, Scheich H (1999) A movement-sensitive area in auditory cortex. Nature 400:724-726.
Bazwinsky I, Hilbig H, Bidmon HJ, Rübsamen R (2003) Characterization of the human superior olivary complex by calcium binding proteins and neurofilament H (SMI-32). J Comp Neurol 456:292–303.
Brandt A, Behrend O, Marquardt T, McAlpine D, Grothe B (2002) Precise inhibition is essential for microsecond interaural time difference coding. Nature 417:543–547.
Colburn HS (1996) Computational models of binaural processing. In: Auditory computation (Hawkins HL, McMullen TA, Popper AN, Fay RR, eds) pp332–400. New York: Springer.
Draganova R, Ross B, Borgmann C, Pantev C (2002) Auditory cortical response patterns to multiple rhythms of AM sound. Ear Hear 23:254–265.
Durlach NI, Colburn HS (1978) Binaural phenomena. In: Handbook of perception, Vol. IV (Carterette EC, Friedman M, eds) pp405–466. New York: Academic Press.
Fitzpatrick DC, Kuwada S, Batra R (2002) Transformations in processing interaural time differences between the superior olivary complex and inferior colliculus: beyond the Jeffress model. Hear Res 168:79–89.
Fujiki N, Jousmaki V, Hari R (2002) Neuromagnetic responses to frequency-tagged sounds: a new method to follow inputs from each ear to the human auditory cortex during binaural hearing. J Neurosci 22:RC205.
Giraud AL, Lorenzi C, Ashburner J, Wable J, Johnsrude I, Frackowiak R, Kleinschmidt A (2000) Representation of the temporal envelope of sounds in the human brain. J Neurophysiol 84:1588–1598.
Grothe B (2000) The evolution of temporal processing in the medial superior olive, an auditory brainstem structure. Prog Neurobiol 61:581–610.
Grothe B (2003) New roles for synaptic inhibition in sound localization. Nat Rev Neurosci 4:540–550.
Guimaraes AR, Melcher JR, Talavage TM, Baker JR, Ledden P, Rosen BR, Kiang NY, Fullerton BC, Weisskoff RM (1998) Imaging subcortical auditory activity in humans. Hum Brain Mapp 6:33–41.
Hall DA, Haggard MP, Akeroyd MA, Palmer AR, Summerfield AQ, Elliott MR, Gurney EM, Bowtell RW (1999) “Sparse” temporal sampling in auditory fMRI. Hum Brain Mapp 7:213–223.
Harms MP, Melcher JR (2002) Sound repetition rate in the human auditory pathway: representations in the waveshape and amplitude of fMRI activation. J Neurophysiol 88:1433–1450.
Hart HC, Palmer AR, Hall DA (2002) Heschl’s gyrus is more sensitive to tone level than non-primary auditory cortex. Hear Res 171:177–190.
Hirsch IJ (1948) Binaural summation—a century of investigation. Psychol Bull 45:193–206.
Imig TJ, Adrian HO (1977) Binaural columns in the primary field (A1) of cat auditory cortex. Brain Res 138:241–257.
Jäncke L, Wüstenberg T, Schulze K, Heinze HJ (2002) Asymmetric hemodynamic responses of the human auditory cortex to monaural and binaural stimulation. Hear Res 170:166–178.
Jeffress LA (1949) A place theory of sound localization. J Comp Physiol Psychol 41:35–
39.
Joris PX, Yin TC (1995) Envelope coding in the lateral superior olive. I. Sensitivity to
interaural time differences. J Neurophysiol 73:1043–1062.
Joris PX, Smith PH, Yin TC (1998) Coincidence detection in the auditory system: 50
years after Jeffress. Neuron 21:1235–1238.
Kaufman L, Okada Y, Brenner D, Williamson SJ (1981) On the relation between somatic
evoked potentials and fields. Int J Neurosci 15:223–239.
Kaneko K, Fujiki N, Hari R (2003) Binaural interaction in the human auditory cortex
revealed by neuromagnetic frequency tagging: no effect of stimulus intensity. Hear Res
183:1–6.
Lins OG, Picton TW (1995) Auditory steady–state responses to multiple simultaneous
stimuli. Electroencephalogr Clin Neurophysiol 96:420–432.
Lohmann G, Müller K, Bosch V, Mentzel H, Hessler S, Chen L, von Cramon DY (2001)
Lipsia—A new software system for the evaluation of functional magnetic resonance images of the human brain. Comput Med Imaging Graphics 25:449–457.
Malone BJ, Scott BH, Semple MN (2002) Context–dependent adaptive coding of interaural phase disparity in the auditory cortex of awake macaques. J Neurosci 22:4625–4638.
McAlpine D, Palmer AR (2002) Blocking GABAergic inhibition increases sensitivity to
sound motion cues in the inferior colliculus. J Neurosci 22:1443–1453.
McAlpine D, Jiang D, Palmer AR (2001) A neural code for low–frequency sound localization in mammals. Nat Neurosci 4:396–401.
McAlpine D, Grothe B (2003) Sound localization and delay lines—do mammals fit the
model? Trends Neurosci 26:347–350.
McPherson DL, Starr A (1993) Binaural interaction in auditory evoked potentials: brainstem, middle- and long-latency components. Hear Res 66:91–98.
Melcher JR, Talavage TM, Harms MP (1999) Functional MRI of the auditory system. In:
Medical radiology—diagnostic imaging and radiation oncology, functional MRI (Moonen C, Bandettini P eds), pp393–406. Berlin: Springer.
Melcher JR, Sigalovsky IS, Guinan JJ Jr, Levine RA (2000) Lateralized tinnitus studied
with functional magnetic resonance imaging: abnormal inferior colliculus activation. J
Neurophysiol 83:1058–1072.
Middlebrooks JC, Zook JM (1983) Intrinsic organization of the cat’s medial geniculate
body identified by projections to binaural response-specific bands in the primary auditory
cortex. J Neurosci 3:203–224.
Moore JK (2000) Organization of the human superior olivary complex. Microsc Res Tech
51:403–412.
Norris DG (2000) Reduced power multislice MDEFT imaging. J Magn Reson Imaging
11:445–451.
Oertel D (1997) Encoding of timing in brain stem auditory nuclei of vertebrates. Neuron
19:959–962.
Oertel D (1999) The role of timing in the brain stem auditory nuclei of vertebrates. Annu
Rev Physiol 61:497–519.
Pantev C, Lutkenhoner B, Hoke M, Lehnertz K (1986) Comparison between simultaneously recorded auditory-evoked magnetic fields and potentials elicited by ipsilateral,
contralateral and binaural tone burst stimulation. Audiology 25:54–61.
Pantev C, Ross B, Berg P, Elbert T, Rockstroh B (1998) Study of the human auditory cortices using a whole-head magnetometer: left vs. right hemisphere and ipsilateral vs. contralateral stimulation. Audiol Neurootol 3:183–190.
Picton TW, Skinner CR, Champagne SC, Kellett AJ, Maiste AC (1987) Potentials evoked
by the sinusoidal modulation of the amplitude or frequency of a tone. J Acoust Soc Am
82:165–178.
Reite M, Zimmerman JT, Zimmerman JE (1981) Magnetic auditory evoked fields: interhemispheric asymmetry. Electroencephalogr Clin Neurophysiol 51:388–392.
Reser DH, Fishman YI, Arezzo JC, Steinschneider M (2000) Binaural interactions in
primary auditory cortex of the awake macaque. Cereb Cortex 10:574–584.
Riedel H, Kollmeier B (2002) Auditory brain stem responses evoked by lateralized clicks:
is lateralization extracted in the human brain stem? Hear Res 163:12–26.
Seifritz E, Esposito F, Hennel F, Mustovic H, Neuhoff JG, Bilecen D, Tedeschi G, Scheffler K, Di Salle F (2002) Spatiotemporal pattern of neural processing in the human auditory cortex. Science 297:1706–1708.
Spitzer MW, Semple MN (1998) Transformation of binaural response properties in the
ascending auditory pathway: influence of time-varying interaural phase disparity. J Neurophysiol 80:3062–3076.
Tervaniemi M, Hugdahl K (2003) Lateralization of auditory-cortex functions. Brain Res
Rev 43:231–246.
Tiihonen J, Hari R, Kaukoranta E, Kajola M (1989) Interaural interaction in the human
auditory cortex. Audiology 28:37–48.
Tollin DJ (2003) The lateral superior olive: a functional role in sound source localization.
Neuroscientist 9:127–143.
Ugurbil K, Garwood M, Ellermann J, Hendrich K, Hinke R, Hu X, Kim SG, Menon R,
Merkle H, Ogawa S, Salmi R (1993) Imaging at high magnetic fields: initial experiences
at 4 T. Magn Reson Q 9:259–277.
Warren JD, Zielinski BA, Green GGR, Rauschecker JP, Griffiths TD (2002) Perception
of sound-source motion by the human brain. Neuron 34:139–148.
Woldorff MG, Tempelmann C, Fell J, Tegeler C, Gaschler-Markefski B, Hinrichs H, Heinze HJ, Scheich H (1999) Lateralized auditory spatial perception and the contralaterality of cortical processing as studied with functional magnetic resonance imaging and magnetoencephalography. Hum Brain Mapp 7:49–66.
Yin TC, Chan JC (1990) Interaural time sensitivity in medial superior olive of cat. J
Neurophysiol 64:465–488.
Zatorre RJ, Bouffard M, Ahad P, Belin P (2002) Where is ‘where’ in the human auditory
cortex? Nat Neurosci 5:905–909.
Chapter 3
Spectral and temporal processing in
the human auditory cortex
—revised.
Abstract
The present study investigates the hemispheric specialization for spectral and temporal processing by measuring cortical responses to a new class of parametric, wideband, dynamic acoustic stimuli with functional magnetic resonance imaging. The stimuli were
designed to permit a clearer separation of spectral integration from tonal sequence processing than was possible in previous studies. Importantly, the sounds have a complex,
multi-peaked spectrum at any instant, rather than one spectral peak that varies in time.
Cortical activation caused by the stimuli therefore indicates integration along the frequency axis (spectral integration), rather than integration over time (tonal sequence processing). The density of modulated spectral components (spectral parameter) and the
temporal modulation rate of these components (temporal parameter) were varied independently. BOLD-responses from the left and right primary auditory cortex covaried
with the spectral parameter, while the covariation analysis for the temporal parameter revealed mainly an area on the left superior temporal gyrus (STG). The equivalent region
on the right STG responded exclusively to the spectral parameter. These findings support
the hemispheric specialization model and permit a generalization of the model to include
processing of simultaneously present spectral peaks. The results also demonstrate the
involvement of the primary auditory cortex and the right STG in spectral integration.
3.1 Introduction
The human cerebral hemispheres show notable differences in their anatomy and function.
Among the proposed functional specializations in the auditory system are the left hemisphere dominance for speech and the right hemisphere dominance for music processing
(Zatorre 2001; Zatorre et al. 2002; Tervaniemi and Hugdahl 2003). Hemispheric asymmetries in the processing of spectral and temporal sound information are thought to underlie the complementary lateralization for speech and music (Schwartz and Tallal 1980;
Zatorre and Belin 2001). In a recent study on this hypothesis, Zatorre and Belin (2001)
used nonverbal stimuli that varied independently along spectral and temporal dimensions
to show that an increasing rate of temporal change preferentially recruits left auditory
cortical areas, while an increasing number of spectral elements engages right auditory
cortical regions more strongly. They used sequential, melodic stimuli to specifically
address functional asymmetries in tonal processing. The conclusions drawn from that
study are therefore restricted to the processing of sequential spectral information. In electrophysiologically oriented work, however, the terms ‘spectral processing’ and ‘spectral
complexity’ usually refer to a different concept, the modulation of the stimulus spectrum
at any given instant. The primate auditory cortex contains a considerable proportion of
neurons with complex, multi-peaked tuning curves (Kadia and Wang 2003). These neurons respond best to broadband stimuli with complex spectral profiles, i.e. when several
harmonically related peaks in the stimulus spectrum are present simultaneously. They are
therefore thought to be involved in spectral integration (Kadia and Wang 2003). A presumably related phenomenon is observed in human functional imaging: spectrally rich
sounds evoke stronger BOLD-responses than pure tones (Hall et al. 2002). Using stimuli
that excite multi-peaked neurons in functional imaging experiments would help to relate
findings from functional studies in humans to electrophysiological studies in animals, and
hence link the human imaging data more closely to neuronal response properties. Using
the term ‘spectral processing’ indiscriminately to refer to the integration of simultaneous spectral peaks and to the processing of sequential tonal patterns can lead to seemingly
contradictory findings from functional imaging and electrophysiological studies. In humans, spectral processing (in the sense of tonal sequence processing) is associated with
right-lateralized activity in non-primary auditory areas (Zatorre 2001; Zatorre and Belin
2001; Patterson et al. 2002), whereas electrophysiological studies demonstrated spectral
processing (in the sense of integration of multiple simultaneous frequency peaks) in the
primary auditory cortex (Kadia and Wang 2003). On grounds of the electrophysiological
finding of multi-peaked neurons, we predict activation of the primary auditory cortex in
response to stimuli with complex spectral profiles.
Several studies have argued for a hierarchical model of the processing of tonal sequences, in which the primary auditory cortex (PAC) extracts the pitch of individual tones by mechanisms based on spectral
or temporal regularities. Subsequent structures integrate slow changes in pitch over time
that are found for instance in melodies (Zatorre et al. 1994; Rauschecker 1997; Griffiths
et al. 1998; Griffiths et al. 1999; Griffiths et al. 2001; Zatorre 2001; Patterson et al. 2002).
In humans, processing of pitch sequences is associated with right-lateralized activity in
non-primary auditory areas (Zatorre 2001; Zatorre and Belin 2001; Patterson et al. 2002).
The hierarchical model predicts a hemispherical specialization in non-primary auditory
areas, whereas the response of the primary auditory cortex is supposedly the same in both
hemispheres.
Challenging the hemispheric specialization hypothesis for spectral processing, Patterson and colleagues (2002) demonstrated that the right-lateralization of tonal processing
does not necessarily rely on spectral features. Pitch changes in their tonal sequences were
produced by manipulating the temporal characteristics of noise to enhance the incidence
of one particular time interval. Pitch differences in these stimuli are not accompanied
by differences in the stimulus spectrum. The present experiment also tests whether the
right-lateralization of tonal processing depends solely on temporal mechanisms, by using
stimuli with spectral changes mostly in frequency bands above 1.5 kHz, the upper limit
of peripheral neuronal coupling to the stimulus waveform. Auditory nerve firing does not
encode temporal regularities above this frequency limit.
If processing of spectral information in general is lateralized, then not only sequential but also simultaneously present spectral peaks would preferentially engage the right-side
auditory cortical structures. The present study tests this hypothesis by seeking cortical
areas in which the BOLD signal covaries with the number of simultaneous spectral components (spectral complexity) or the temporal modulation rate (temporal complexity) of
the stimuli. While this covariation would highlight areas that are presumably involved in
converting spectral and temporal stimulus parameters into neuronal activity, it does not
yield information about the conversion mechanism.
3.2 Material and Methods
3.2.1 Subjects
Sixteen subjects (6 male, 10 female; all right-handed (Oldfield 1971)) between 22 and
30 years of age, with no history of hearing disorder or neurological disease, participated
in the experiment after having given informed consent. The experimental procedures
were approved by the local ethics committee.
3.2.2 Acoustic stimuli
The experiment comprised 10 stimulus conditions in a parametric design with five levels
of the factors spectral and temporal complexity. The stimuli differed in temporal modulation rate (‘temporal complexity’) and number of independently modulated spectral components (‘spectral complexity’), but not in bandwidth. The stimuli were random spectrogram sounds, a new class of parametric, wideband, dynamic acoustic stimuli. These
sounds permit independent variation of the density of modulations along the spectral and
temporal dimension. They were constructed as follows: A two-dimensional random field
with dimensions matching the desired temporal and spectral modulation rate was generated. The rows of this matrix modulated the amplitude of sinusoids with frequencies
equally spaced with respect to equivalent rectangular bandwidth (Moore and Glasberg
1983) between 250 Hz and 16 kHz. The sine tones were added together to form a sound
with a spectrogram that equalled the initial two-dimensional matrix. One dimension of the matrix determined the temporal modulation rate and the other the number of spectral components in the resulting stimulus. Because the frequency resolution of the auditory periphery (in equivalent rectangular bandwidth) was taken into
account, average sound energy is equally distributed across the six octaves of stimulus
bandwidth. The advantage of random spectrogram sounds over the tonal stimuli used
in comparable experiments is perceptual homogeneity. Increasing spectral complexity is
not accompanied by an emerging melodic pattern and increasing temporal modulation
rate does not result in monotonous staccato. Because the stimuli contain several spectral peaks simultaneously they are suited to investigate spectral integration mechanisms
that act across frequencies. In contrast, melodic successions of tones, as used by Zatorre
and Belin (2001), investigate spectral integration over time. The five levels of parametric
variation of spectral complexity were 4, 6, 8, 12 and 16 independently modulated components. The five levels of temporal modulation rate were 5, 8, 14, 20 and 30 Hz. All stimuli
in the spectral variation condition had a fixed temporal modulation rate of 3 Hz and all
stimuli in the temporal variation condition had a fixed number of 3 spectral components.
The stimulus parameters were chosen to yield a linear increase of perceived acoustic complexity, according to psychoacoustical ratings by subjects in a pre-test. The sounds were
presented through MR compatible electrostatic headphones (Sennheiser model HE 60)
with modified industrial ear protectors (Bilsom model 2452) (Palmer et al. 1998).
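The stimulus construction described above can be sketched as follows. This is a minimal illustration, not the authors' original code: it uses the later Glasberg and Moore (1990) ERB-rate approximation rather than the cited 1983 formula, envelope samples are linearly interpolated, and all function and parameter names are hypothetical.

```python
import numpy as np

def erb_rate(f_hz):
    """Glasberg & Moore (1990) ERB-rate approximation (assumed here)."""
    return 21.4 * np.log10(4.37 * f_hz / 1000.0 + 1.0)

def inverse_erb_rate(e):
    """Frequency in Hz corresponding to an ERB-rate value."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def random_spectrogram_sound(n_components=8, mod_rate_hz=5.0,
                             duration_s=2.0, fs=44100,
                             f_lo=250.0, f_hi=16000.0, seed=0):
    rng = np.random.default_rng(seed)
    # Carrier frequencies equally spaced on the ERB-rate scale,
    # 250 Hz to 16 kHz as in the experiment.
    freqs = inverse_erb_rate(
        np.linspace(erb_rate(f_lo), erb_rate(f_hi), n_components))
    # Two-dimensional random field: one row per spectral component,
    # one column per temporal modulation step.
    n_steps = int(np.ceil(duration_s * mod_rate_hz))
    field = rng.random((n_components, n_steps))
    # Each row, upsampled to the audio rate, modulates the amplitude
    # of the corresponding sinusoid; the sum is the stimulus.
    t = np.arange(int(duration_s * fs)) / fs
    step_times = (np.arange(n_steps) + 0.5) / mod_rate_hz
    sound = np.zeros_like(t)
    for row, f in zip(field, freqs):
        envelope = np.interp(t, step_times, row)
        sound += envelope * np.sin(2 * np.pi * f * t)
    return sound / np.max(np.abs(sound))  # normalize to +-1
```

One matrix dimension (rows) thus sets the number of spectral components and the other (columns) the temporal modulation rate, mirroring the two stimulus parameters varied independently in the experiment.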
3.2.3 Procedure
Before scanning, each subject rated the subjective complexity of the different stimulus
levels in a thirty-minute psychoacoustical test. The individual ratings were later used
for the covariation analysis. During scanning, 284 functional volumes were acquired
per subject in blocks of four volumes. After an initial dummy block that was discarded
during analysis, blocks for each of the five levels of the spectral and temporal factor were
presented interleaved and repeated seven times during the experiment. The experimental
time was 38 minutes.
3.2.4 fMRI data acquisition
Blood-oxygen level dependent (BOLD) contrast images were acquired with a 3-T Bruker
Medspec whole body scanner using gradient echo planar imaging (TR = 8 s; TE = 30 ms;
flip angle = 90◦ ; acquisition bandwidth = 100 kHz). The functional images consisted of
6 ascending slices with an in-plane resolution of 3×3 mm, a slice thickness of 3 mm and
an inter-slice gap of 1 mm. The slices were oriented parallel to the lateral sulcus, completely covering the superior temporal plane, and acquired in direct temporal succession
in the first 0.5 s of the TR, followed by 7.5 s of stimulus presentation without interfering
acquisition noise. Clustering the slice acquisition at the beginning of a long TR (sparse
imaging technique) reduces the effect of scanner noise on the recorded BOLD-response
to the stimuli (Edmister et al. 1999, Hall et al. 1999). A high-resolution structural image was acquired from each listener using a 3D MDEFT sequence (Ugurbil et al. 1993)
with 128 1.5-mm slices (FOV = 25×25×19.2 cm; data matrix 256×256; TR = 1.3 s;
TE = 10 ms). A set of T1-weighted EPI images, acquired with the same parameters as
for the functional images (inversion time = 1200 ms; TR = 45 s; four averages), assisted
alignment of structural and functional brain volumes.
3.2.5 Data analysis
The data were analyzed with the software package LIPSIA (Lohmann et al. 2001). The
functional images of each listener were corrected for head motion and rotated into the
Talairach coordinate system by co-registering the structural MDEFT and EPI-T1 images acquired in this experiment with a high-resolution structural image residing in a
database. The functional images were normalized and spatially smoothed (10 mm full
width at half maximum Gaussian kernel). The preprocessed image time series of 16 subjects was subjected to two independent fixed-effects group analyses (2240 volumes each)
for the spectral and the temporal factor using a general linear model. The experimental
conditions were modeled as box-car functions convolved with a generic hemodynamic response function including a response delay of 6 s. The height thresholds for activation were z = 3.1 (p < 0.001 uncorrected) or z = 6 (p < 0.05 corrected).
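The regressor construction for this general linear model can be sketched as follows. This is a simplified illustration under assumed parameters (a single-gamma response peaking at 6 s, evaluated on a 0.1-s grid), not the LIPSIA implementation.

```python
import numpy as np

def hrf(t, a=6.0, b=1.2):
    """Single-gamma hemodynamic response, peaking at (a - 1) * b = 6 s."""
    t = np.maximum(np.asarray(t, dtype=float), 0.0)
    h = t ** (a - 1.0) * np.exp(-t / b)
    return h / h.max()

def boxcar_regressor(onsets_s, block_dur_s, tr_s, n_scans):
    """Box-car stimulus function convolved with the HRF, one value per TR."""
    dt = 0.1                                  # fine time grid for convolution
    n_fine = int(n_scans * tr_s / dt)
    box = np.zeros(n_fine)
    for onset in onsets_s:                    # mark each stimulus block
        box[int(onset / dt):int((onset + block_dur_s) / dt)] = 1.0
    kernel = hrf(np.arange(0.0, 32.0, dt))    # 32-s HRF kernel
    conv = np.convolve(box, kernel)[:n_fine]
    # Sample the convolved regressor at the scan times (one per TR).
    samples = conv[(np.arange(n_scans) * tr_s / dt).astype(int)]
    return samples / samples.max()            # normalized regressor
```

Fitting such regressors (one per condition) to each voxel's time series yields the parameter estimates whose covariation with the stimulus levels is tested below.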
3.3 Results
A parametric model of the BOLD-response produced by different levels of spectral and
temporal modulation revealed cortical areas whose activation strength covaried with stimulus complexity. A subsequent region-of-interest (ROI) analysis established the link between the activation data and anatomical regions on the superior temporal plane. The probabilistic map of Penhune et al. (1996) defines the location of Heschl’s gyri (HG) in a large sample of
subjects. Additionally, ROIs for the left and right HG defined individually in all subjects
were used for single subject analyses. Previously published cytoarchitectonical probabilistic maps defined the location of the primary auditory cortex (PAC) (Morosan et
al. 2001). ROIs on the left and right lateral superior temporal gyrus (STG) were defined
by local maxima in the BOLD-response. The average BOLD signal change was extracted
from each ROI in all subjects and plotted against the spectral and temporal stimulus levels in order to assess region-specific differences in the response to spectral and temporal
processing.
3.3.1 Covariation analysis
The analysis of covariation for the spectral parameter revealed two regions of significant
BOLD-response covariation, one on the left and one on the right superior temporal plane
(STP) (Fig. 3.1a). The Talairach coordinates (Talairach and Tournoux 1988) of the foci
were −49, −20, 11, and 45, −14, 8. The activated cortical volume was mainly, but not
exclusively, confined to Heschl’s gyrus and the primary auditory cortex, as determined
by comparison to the subjects’ individual HG locations and a probabilistic map of HG
(Penhune et al. 1996), and to a probabilistic map based on cytoarchitecture (Morosan et
al. 2001), respectively. The covariation analysis for the temporal parameter also revealed
two regions, one located on the superior temporal gyrus of the left hemisphere, posterior
and lateral to HG, and another on the right HG. The Talairach coordinates of the foci were
−57, −14, 9 and 41, −19, 11. In the left hemisphere, the focus of temporal covariation is
located on the STG, lateral and slightly anterior to the spectral focus, whereas in the right
hemisphere the temporal focus is located in PAC, medial and posterior to the spectral
focus. The portion of the STG that showed significant covariation with the temporal but
Figure 3.1: A) Areas of significant BOLD-response covariation with the spectral (red, opaque) and the temporal (blue)
parameter rendered onto average structural images of the group. The height threshold for activation was z = 6 (p < 0.05
corrected) for the spectral, and z = 3.2 (p < 0.001 uncorrected) for the temporal covariation. The light red highlight
shows the position of Heschl’s gyrus (HG) (80% probability in the HG maps for 16 individuals). The cytoarchitectonic
region of the primary auditory cortex (Morosan et al. 2001) is indicated by a green highlight and its overlap with HG by
a yellow highlight (see color code in lower right corner). The sections are in neurological convention and the axial slice
runs parallel to the lateral sulcus, as indicated in the inset (lower left corner). B) Mean z-scores (red and blue bars, left
ordinate) and effect sizes (light blue circles, right ordinate, mean±standard deviation) for the ROIs in the left and right
primary auditory cortex (lPAC and rPAC) and the superior temporal gyrus (lSTG and rSTG). The z-scores are normalized
t-values and denote the significance of the effect, whereas effect sizes are the GLM parameter estimates weighted by the contrast vector (which would correspond to the percentage signal change in ON-OFF contrasts). The BOLD-response
in both PAC and the right STG covaried strongly with the spectral and weakly with the temporal parameter in terms of
z-value and effect size. The covariation of the left STG activity was more significant for the temporal than for the spectral
parameter, although the difference in effect size was small. C) Relative effect sizes, computed by normalizing the effect
sizes evoked by spectral variation to the maximal effect size for the temporal parameter. The relative difference of the
effects of spectral and temporal variation characterizes the asymmetrical response in the STG best, because it accounts
for the overall stronger response to spectral variation. D) Slopes of BOLD-signal changes in the four ROIs as a function of
spectral and temporal input parameters. Symbols indicate average percentage signal changes with one standard deviation.
The inset lines show the slopes of a least-squares linear regression through these points. In the middle panel, the ROIs are
rendered onto a slice of the average anatomical image of the group.
not spectral parameter in the left hemisphere, exhibited the reverse covariation pattern
(spectral but not temporal) in the right hemisphere. The described pattern of regional
covariation was found in the average as well as the majority of the subjects.
3.3.2 Region-Of-Interest analysis
We extracted the mean size-of-effect and level of significance (z-score) of the covariation analysis from the ROIs in the primary auditory cortices and superior temporal gyri
(Fig. 3.1b). The level of activation in response to the spectral modulation exceeded that
of the activation due to temporal modulation in left and right PAC and in the right STG.
Only the left lateral STG showed higher statistical scores and slightly larger effect sizes in
response to the temporal modulation. Because the spectral modulation appeared to evoke
overall higher BOLD-responses, the percentage effect sizes permit a clearer view of the
relative responses to spectral and temporal modulation (Fig. 3.1c).
To visualize the precise relationship between parameter value and BOLD-response in
the ROIs, we extracted the mean percentage signal change evoked by the different stimuli
in the primary auditory cortices and superior temporal gyri (Fig. 3.1d). The left and right
PAC showed essentially the same responses, a roughly linear increase in BOLD-response
with increasing spectral complexity, and no discernible effect of temporal modulation
rate. The right STG also responded with linearly increasing activation with increasing
spectral complexity, but not with temporal complexity. The activation in left STG showed
a significant positive correlation with the temporal modulation rate. Spectrally complex
stimuli appeared to activate the left STG slightly as well, but only at a constant level: the percentage signal change did not increase further for spectral parameter values greater than two.
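The slope analysis shown in Fig. 3.1d amounts to a least-squares linear regression of percentage signal change against stimulus level. A brief sketch follows; the spectral levels are those of Section 3.2.2, but the response values are hypothetical illustrations, not the measured data.

```python
import numpy as np

def response_slope(levels, signal_change):
    """Slope of a least-squares line through (level, % signal change)."""
    slope, _intercept = np.polyfit(levels, signal_change, 1)
    return slope

# Spectral levels from Section 3.2.2; hypothetical % changes for one ROI.
spectral_levels = np.array([4, 6, 8, 12, 16])
roi_percent_change = np.array([0.20, 0.35, 0.50, 0.80, 1.10])
slope = response_slope(spectral_levels, roi_percent_change)
```

A positive slope for the spectral levels but a flat slope for the temporal levels would reproduce the PAC pattern described above.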
3.4 Discussion
The results provide evidence for different effects of spectral and temporal sound complexity in the left versus right hemisphere, and in different regions in each hemisphere. The
left and right primary auditory cortices (PAC) responded strongly to the spectral parametric modulation. The only significant response to the temporal modulation within the PAC
was confined to the middle right PAC (area Te1.0 (Morosan et al. 2001)). In contrast, portions of the left and right superior temporal gyrus (STG) responded noticeably differently
to spectral and temporal modulation. Activation of the right STG increased steeply with
increasing spectral, but not temporal, complexity, whereas activation of the left STG was
weighted towards temporal complexity.
These findings support the hemispheric specialization model, which proposes a preference of the left-hemispheric auditory structures for fast temporal processing and a right-hemisphere preference for fine-grained spectral processing (Zatorre and Belin 2001). The
results also permit a generalization of the model to include processing of complex spectral
profiles.
3.4.1 Spectral processing
Differences in the spectral composition between the stimuli used in the study by Zatorre
and Belin (2001) and in the present one also explain the noticeable differences in the
reported activation patterns. Using pure-tone sequences, they found a stronger response
to temporal than to spectral complexity in the left and right Heschl’s gyrus. In contrast,
spectral variation of the wideband modulated stimuli activated the auditory cortical areas, including the PAC, highly efficiently in the present study. The significant covariation of the
PAC activity with the number of independently modulated spectral components in the
present data demonstrate that neurons in the PAC of both hemispheres are responsive to complex spectral
profiles.
The response in secondary areas appears to follow the hierarchical asymmetry pattern observed for pitch processing (Patterson et al. 2002): secondary auditory areas, but
not PAC, exhibit right-lateralized activation. Whereas Patterson and colleagues (2002)
argued for a processing mechanism based on temporal regularity, the rightward lateralization found in the present study relates to the processing of complex spectral contours.
The spectral changes in the random spectrogram sounds occurred in six octaves from
0.25 to 16 kHz, of which the upper four octaves lie above the upper frequency limit of
temporal encoding in the auditory nerve fibers. The two findings could be reconciled by
considering the secondary auditory areas as being involved in pitch processing irrespective of whether the apparent pitch is based on spectral or temporal regularities. In this
view, non-primary auditory cortex receives pitch information extracted from temporal
and spectral mechanisms, while the pitch extraction itself is accomplished in preceding
auditory processing stages. In accordance with this idea, Griffiths and colleagues (2001)
demonstrated an encoding of temporal regularity already in auditory brainstem structures.
3.4.2 Temporal processing
The present data provide evidence for an involvement of the left STG in the processing of
slow temporal modulations. The BOLD-response covaried with the temporal modulation
rate of the random spectrogram stimuli, indicating that faster modulations recruit more
neural activity in that area. Left-hemispheric non-primary auditory cortex is often implicated in speech processing (Kimura 1961; Geschwind 1972; Gazzaniga 1983; Zatorre
et al. 2002), and the covariation with temporal modulation rate in the left STG might be
related to the processing of fast transients, important elements in human language perception (Schwartz and Tallal 1980; Tallal et al. 1993; Shannon et al. 1995).
The precise location of the areas most responsive to temporal modulations appears
to depend strongly on the modulation frequency. Giraud and coworkers (2000) demonstrated that the modulation frequency ranges from 4 to 16 Hz and from 128 to 256
Hz are represented in the cortex by different response types, but without consistent segregation. Large clusters on the superior temporal plane responded to the lower modulation
frequency range, whereas the higher range yielded a mosaic of small clusters. Zatorre
and Belin (2001) found the PAC to be responsive to temporal modulation in the range of
1.5 Hz to 47 Hz. Patterson and colleagues (2002) reported activation in the left and right
lateral HG in response to fixed pitch stimuli (83 Hz), while melodies (4 Hz pitch change
rate) activated the right STP and planum polare. Psychoacoustics and electrophysiology
CHAPTER 3. ASYMMETRY IN SPECTRO-TEMPORAL PROCESSING
provide additional evidence for a difference in the processing of temporal modulations
below and above 30 Hz, the frequency that demarcates the lower limit of pitch perception (Krumbholz et al. 2000; Pressnitzer et al. 2001). With single cell recordings in the
AC of awake monkeys, Lu and Wang (2000) found two largely distinct neuronal populations: one explicitly representing slowly occurring sound sequences with a temporal code,
and the other implicitly representing rapidly occurring events with a rate code. Models of
temporal processing in the auditory system suggest that modulations above approximately
30 Hz (corresponding to time intervals of 33 ms, the lower limit of pitch) are integrated into an interval-based pitch estimate (Patterson et al. 1995). In the present study, the temporal parameter varied between 5 Hz and 30 Hz, and the covariation analysis permitted a selective investigation of the effects of temporal modulations in this frequency range.
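The interval-based account can be illustrated with a toy autocorrelation pitch estimator. This is not the Patterson et al. (1995) model itself but a minimal sketch of the idea: the dominant waveform interval shorter than about 33 ms (the 30-Hz lower limit of pitch) determines the pitch estimate. Function name and parameters are chosen for illustration only.

```python
import numpy as np

def autocorrelation_pitch(signal, fs, fmin=30.0, fmax=1000.0):
    """Toy interval-based pitch estimate: return the frequency corresponding
    to the autocorrelation lag with the largest peak between 1/fmax and
    1/fmin. The 30-Hz lower limit of pitch corresponds to a maximum
    candidate interval of about 33 ms."""
    ac = np.correlate(signal, signal, mode="full")[signal.size - 1:]
    lo = int(fs / fmax)   # shortest candidate interval, in samples
    hi = int(fs / fmin)   # longest candidate interval (~33 ms at fmin=30)
    lag = lo + np.argmax(ac[lo:hi + 1])
    return fs / lag       # dominant interval, expressed as a frequency

# A 100-Hz tone has a dominant interval of 10 ms, so the estimate is 100 Hz:
fs = 8000
t = np.arange(0, 0.5, 1.0 / fs)
print(autocorrelation_pitch(np.sin(2 * np.pi * 100 * t), fs))
```

Real pitch models add peripheral filtering and half-wave rectification before the interval analysis; this sketch only shows why the 33-ms bound limits the candidate intervals.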
3.4.3 Microanatomical hemispheric differences as a possible basis for functional specialization
Zatorre and colleagues (Zatorre 2001; Zatorre and Belin 2001; Zatorre et al. 2002) discussed possible anatomical causes of the lateralized processing of spectral and temporal
modulations, based on interhemispheric microanatomical differences in cell size (Hayes
and Lewis 1993), myelination (Penhune et al. 1996), width of microcolumns (Seldon
1981b, a; Buxhoeveden et al. 2001), and spacing of macrocolumns (Galuske et al. 2000).
Microcolumns are considered as candidates for the smallest cortical processing units
(Jones 2000), while macrocolumns are 200-700-µm-wide patches of neurons with similar connection patterns and receptive fields (Reale et al. 1983). In the left non-primary auditory cortex, cortical microcolumns have greater width and intercolumnar distance than in
the right auditory cortex (Seldon 1981b, a; Buxhoeveden et al. 2001). Together with the
finding that macrocolumns appear to be of the same size in both hemispheres (Galuske
et al. 2000), this suggests a greater number of microcolumnar units per macrocolumn in
the right non-primary auditory cortex. If pitch is represented by a population code in
macrocolumns, then the greater number of microcolumns in the right hemisphere could
indicate a greater encoding precision. The thicker myelination in the left hemisphere, in contrast, could enhance fast temporal processing (Zatorre 2001). These anatomical considerations remain highly speculative, because precise microanatomical data from both hemispheres are available for only a few auditory regions, and the neural mechanisms underlying the processing of complex spectral and temporal patterns are not fully understood.
3.5 References
Buxhoeveden, D. P., A. E. Switala, M. Litaker, E. Roy and M. F. Casanova (2001). Lateralization of minicolumns in human planum temporale is absent in nonhuman primate
cortex. Brain Behav Evol 57(6): 349-58.
Galuske, R. A., W. Schlote, H. Bratzke and W. Singer (2000). Interhemispheric asymmetries of the modular structure in human temporal cortex. Science 289(5486): 1946-9.
Gazzaniga, M. S. (1983). Right hemisphere language following brain bisection. A 20-year perspective. Am Psychol 38(5): 525-37.
Geschwind, N. (1972). Language and the brain. Sci Am 226(4): 76-83.
Giraud, A. L., C. Lorenzi, J. Ashburner, J. Wable, I. Johnsrude, R. Frackowiak and A.
Kleinschmidt (2000). Representation of the temporal envelope of sounds in the human
brain. J Neurophysiol 84(3): 1588-98.
Griffiths, T. D., C. Buchel, R. S. Frackowiak and R. D. Patterson (1998). Analysis of
temporal structure in sound by the human brain. Nat Neurosci 1(5): 422-7.
Griffiths, T. D., I. Johnsrude, J. L. Dean and G. G. Green (1999). A common neural
substrate for the analysis of pitch and duration pattern in segmented sound? Neuroreport
10(18): 3825-30.
Griffiths, T. D., S. Uppenkamp, I. Johnsrude, O. Josephs and R. D. Patterson (2001).
Encoding of the temporal regularity of sound in the human brainstem. Nat Neurosci 4(6):
633-7.
Hall, D. A., I. S. Johnsrude, M. P. Haggard, A. R. Palmer, M. A. Akeroyd and A. Q.
Summerfield (2002). Spectral and temporal processing in human auditory cortex. Cereb
Cortex 12(2): 140-9.
Hayes, T. L. and D. A. Lewis (1993). Hemispheric differences in layer III pyramidal
neurons of the anterior language area. Arch Neurol 50(5): 501-5.
Jones, E. G. (2000). Microcolumns in the cerebral cortex. Proc Natl Acad Sci U S A
97(10): 5019-21.
Kadia, S. C. and X. Wang (2003). Spectral integration in A1 of awake primates: neurons
with single- and multipeaked tuning characteristics. J Neurophysiol 89(3): 1603-22.
Kimura, D. (1961). Some effects of temporal-lobe damage on auditory perception. Can
J Psychol 15: 156-65.
Krumbholz, K., R. D. Patterson and D. Pressnitzer (2000). The lower limit of pitch as
determined by rate discrimination. J Acoust Soc Am 108(3 Pt 1): 1170-80.
Lohmann, G., K. Muller, V. Bosch, H. Mentzel, S. Hessler, L. Chen, S. Zysset and D.
Y. von Cramon (2001). LIPSIA–a new software system for the evaluation of functional
magnetic resonance images of the human brain. Comput Med Imaging Graph 25(6):
449-57.
Lu, T. and X. Wang (2000). Temporal discharge patterns evoked by rapid sequences of
wide- and narrowband clicks in the primary auditory cortex of cat. J Neurophysiol 84(1):
236-46.
Moore, B. C. and B. R. Glasberg (1983). Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J Acoust Soc Am 74(3): 750-3.
Morosan, P., J. Rademacher, A. Schleicher, K. Amunts, T. Schormann and K. Zilles
(2001). Human primary auditory cortex: cytoarchitectonic subdivisions and mapping
into a spatial reference system. Neuroimage 13(4): 684-701.
Oldfield, R. C. (1971). The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9(1): 97-113.
Patterson, R. D., M. H. Allerhand and C. Giguere (1995). Time-domain modeling of
peripheral auditory processing: a modular architecture and a software platform. J Acoust
Soc Am 98(4): 1890-4.
Patterson, R. D., S. Uppenkamp, I. S. Johnsrude and T. D. Griffiths (2002). The processing of temporal pitch and melody information in auditory cortex. Neuron 36(4): 767-76.
Penhune, V. B., R. J. Zatorre, J. D. MacDonald and A. C. Evans (1996). Interhemispheric
anatomical differences in human primary auditory cortex: probabilistic mapping and volume measurement from magnetic resonance scans. Cereb Cortex 6(5): 661-72.
Pressnitzer, D., R. D. Patterson and K. Krumbholz (2001). The lower limit of melodic
pitch. J Acoust Soc Am 109(5 Pt 1): 2074-84.
Rauschecker, J. P. (1997). Processing of complex sounds in the auditory cortex of cat,
monkey, and man. Acta Otolaryngol Suppl 532: 34-8.
Reale, R. A., J. F. Brugge and J. Z. Feng (1983). Geometry and orientation of neuronal
processes in cat primary auditory cortex (AI) related to characteristic-frequency maps.
Proc Natl Acad Sci U S A 80(17): 5449-53.
Schwartz, J. and P. Tallal (1980). Rate of acoustic change may underlie hemispheric
specialization for speech perception. Science 207(4437): 1380-1.
Seldon, H. L. (1981a). Structure of human auditory cortex. I. Cytoarchitectonics and
dendritic distributions. Brain Res 229(2): 277-94.
Seldon, H. L. (1981b). Structure of human auditory cortex. II. Axon distributions and
morphological correlates of speech perception. Brain Res 229(2): 295-310.
Shannon, R. V., F. G. Zeng, V. Kamath, J. Wygonski and M. Ekelid (1995). Speech
recognition with primarily temporal cues. Science 270(5234): 303-4.
Talairach, J. and P. Tournoux (1988). Co-planar stereotaxic atlas of the human brain. New
York, Thieme.
Tallal, P., S. Miller and R. H. Fitch (1993). Neurobiological basis of speech: a case for
the preeminence of temporal processing. Ann N Y Acad Sci 682: 27-47.
Tervaniemi, M. and K. Hugdahl (2003). Lateralization of auditory-cortex functions.
Brain Res Brain Res Rev 43(3): 231-46.
Zatorre, R. J. (2001). Neural specializations for tonal processing. Ann N Y Acad Sci 930:
193-210.
Zatorre, R. J. and P. Belin (2001). Spectral and temporal processing in human auditory
cortex. Cereb Cortex 11(10): 946-53.
Zatorre, R. J., P. Belin and V. B. Penhune (2002). Structure and function of auditory
cortex: music and speech. Trends Cogn Sci 6(1): 37-46.
Zatorre, R. J., A. C. Evans and E. Meyer (1994). Neural mechanisms underlying melodic
perception and memory for pitch. J Neurosci 14(4): 1908-19.
Chapter 4
Representation of interaural
temporal information from left and
right auditory space in the human
planum temporale and inferior
parietal lobe
Abstract
The localization of low-frequency sounds mainly relies on the processing of microsecond
temporal disparities between the ears, since low frequencies produce little or no interaural
energy differences. Thus, the overall auditory cortical response to low-frequency sounds
is largely symmetric between the two hemispheres, even when the sounds are lateralized. However, the effects of unilateral lesions in auditory cortex (AC) suggest that the
spatial information mediated by lateralized sounds is distributed asymmetrically across
the hemispheres. This paper describes an fMRI experiment which shows that the interaural temporal processing of lateralized sounds produces an enhanced neural response in
the posterior aspect of the respective contralateral AC. The response is stronger and extends further into adjacent inferior parietal regions when the sound is moving than when
it is stationary. The differential responses to moving sounds further revealed that the
left hemisphere responded predominantly to sound movement within the right hemifield,
whereas the right hemisphere responded to sound movement in both hemifields. This
rightward asymmetry parallels the asymmetry associated with the allocation of visuospatial attention and may underlie unilateral auditory neglect phenomena. The current
results demonstrate that the interaural temporal information is reorganized according to
hemifields at subcortical processing levels.
4.1 Introduction
Many sounds that are behaviorally relevant to humans, such as speech and music, contain
predominantly low-frequency energy. The horizontal localization of these sounds mainly
relies on the processing of interaural temporal disparities (ITDs; Wightman and Kistler,
1992), produced by path length differences from the sound source to the two ears, as low
frequencies produce little or no interaural energy differences. Consequently, humans—
and other mammals with good low-frequency hearing—have evolved a remarkable sensitivity to ITDs of the order of a few tens of microseconds (Durlach and Colburn, 1978).
Sounds with the same energy at the two ears activate both auditory cortices about equally
strongly, even when they are completely lateralized towards one or the other ear by means
of ITDs (Woldorff et al., 1999). This is probably why unilateral lesions in auditory cortex
(AC) usually have surprisingly little effect on most auditory functions, such as the ability
to understand speech or to appreciate music (for a review, see Engelien et al., 2001). In
contrast, lateralized visual stimuli produce a largely contralateral response in early visual
cortical areas, and unilateral lesions in visual cortex may lead to complete blindness in
the contralesional hemifield. Unilateral auditory cortical lesions do, however, often lead
to deficits in sound localization (for a review, see Clarke et al., 2000). Several studies reported selective sound localization deficits in the contralesional hemifield following AC
lesions in either hemisphere. Other studies described localization deficits in both hemifields after lesions in one (either left or right) but not the other hemisphere (e.g., Zatorre
and Penhune, 2001). The lesion results suggest that the processing of the spatial information mediated by lateralized sounds differs from the non-spatial information, in that it is
distributed asymmetrically across the two hemispheres.
In order to verify this notion, one would have to measure the responses to sounds that
have the same energy at the two ears and that are lateralized solely by means of ITDs, because only in that way would any functional asymmetry in the observed responses not be
confounded by the known anatomical asymmetry in the number of crossed and uncrossed
excitatory projections in the ascending auditory pathway (Webster et al., 1992). Unfortunately, animal physiological data on the representation of ITDs in AC are still scarce. A
recent study by Fitzpatrick et al. (2000) suggests that, in the rabbit, the distribution of best
ITDs (the ITD producing maximal discharge) is skewed towards the contralateral hemifield. The recordings of Fitzpatrick et al. are from the primary AC and the distribution
might look somewhat different in non-primary auditory areas, particularly in areas belonging to the dorsal “where” stream that is assumed to be specialized in auditory spatial
processing in the monkey (for a review, see Rauschecker and Tian, 2000). Moreover, the
results of lesions in AC suggest that, in humans, any contralateral asymmetry in the representation of auditory space may be shifted somewhat towards one or the other hemisphere
(e.g., Zatorre and Penhune, 2001; for a review, see Clarke et al., 2000).
Deficits in sound localization may also be observed in patients with hemispatial neglect (Bellmann et al., 2001). Chronic neglect most reliably occurs after right- and not
left-hemisphere lesions. This asymmetry is generally explained by assuming that the
left hemisphere deploys attention mainly within the right hemifield, whereas the right
hemisphere deploys attention within both hemifields. In accordance with this notion are
findings which show that parietal activations associated with the allocation of spatial attention, and more generally with global spatial processing, exhibit an asymmetry towards
the right hemisphere (reviewed in Mesulam, 1999 and Marshall and Fink, 2001). Moreover, auditory spatial processing has been found to activate parietal and frontal regions
more strongly in the right than in the left hemisphere (Griffiths et al., 1998; Weeks et al.,
1999, 2000). It is unclear whether a similar rightward asymmetry is also inherent in the
preattentive, sensory processing of spatial information. In the auditory domain, the existing evidence from human lesion data (Clarke et al., 2000; Zatorre and Penhune, 2001)
and from neuroimaging and electrophysiological studies of auditory spatial processing
(Baumgart et al., 1999; Kaiser et al., 2000; Warren et al., 2002; Zatorre et al., 2002a) is contradictory with respect to this question.
The current study uses fMRI to investigate how the interaural temporal information
mediated by low-frequency sounds is represented in AC. In order to isolate brain regions involved in interaural temporal processing, we compared the blood oxygen level-dependent (BOLD) responses to sounds that were matched in energy and spectral composition and differed solely in their interaural temporal properties. Our hypothesis was that
lateralized sounds would yield a stronger activation in the contralateral AC as compared
to midline sounds. Any contralateral asymmetry in the auditory cortical representation
of sound laterality may or may not be superposed by a right-hemisphere dominance for
auditory spatial processing.
4.2 Materials and Methods
4.2.1 Stimuli and experimental protocol
The experiment comprised a total of five sound conditions, as well as a silent condition.
The sounds consisted of 50-ms bursts of noise, filtered to the low-frequency region (200-3200 Hz), where interaural temporal cues are most salient, and presented at a rate of 10
per s. All sounds had the same energy at both ears. They were delivered through electrostatic headphones, which produced minimal image distortions and passively shielded the
subjects from the scanner noise. In three of the five sound conditions, labeled ‘center’,
‘left static’, and ‘right static’, the noise bursts were presented with static ITDs of 0, −500, or 500 µs, respectively, so the perception was that of a stationary sound centered on the
midline, or lateralized towards the left or right ear, respectively. By convention, a positive
ITD means that the sound to the left ear is lagging the sound to the right ear, whereas a
negative ITD denotes the reverse situation. In the remaining two sound conditions, labeled ‘left moving’ and ‘right moving’, the train of noise bursts moved back and forth
between the midline and the left or right ear. The impression of movement was created
by varying the ITD continuously between 0 and −1000 or 1000 µs. The ITD variation was linear with a rate of 1000 µs per s, so it took 2 s for the sounds to move from the midline
to the left or right ear and back to the midline again. The starting point of the movement
was randomized from trial to trial.
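A minimal sketch of this stimulus construction is given below. The sampling rate, function names, and the FFT-based fractional delay are illustrative assumptions, not the actual signal-generation procedure used in the experiment; the delay is circular, which is an acceptable approximation for a short noise burst.

```python
import numpy as np

FS = 44100  # sampling rate in Hz (assumed; not stated in the text)

def noise_burst(duration_s=0.05, low_hz=200.0, high_hz=3200.0, seed=0):
    """A 50-ms noise burst restricted to the 200-3200 Hz region, where
    interaural temporal cues are most salient."""
    rng = np.random.default_rng(seed)
    n = int(duration_s * FS)
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / FS)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    burst = np.fft.irfft(spectrum, n)
    return burst / np.max(np.abs(burst))

def delayed(x, delay_s):
    """Fractional (circular) delay implemented as a phase shift in the
    frequency domain; the energy of the signal is unchanged."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / FS)
    return np.fft.irfft(spectrum * np.exp(-2j * np.pi * freqs * delay_s), x.size)

def apply_itd(burst, itd_us):
    """Return a (left, right) channel pair with the requested ITD. By the
    convention in the text, a positive ITD means the left ear lags the right."""
    shifted = delayed(burst, abs(itd_us) * 1e-6)
    return (shifted, burst) if itd_us >= 0 else (burst, shifted)

left, right = apply_itd(noise_burst(), 500)  # 'right static' condition
```

Because the delay is applied as a pure phase shift, both channels have exactly the same energy and spectral composition, matching the constraint stated above.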
The sparse imaging technique (Hall et al., 1999) was applied to minimize the effect of
the scanner noise on the recorded activity, and cardiac triggering (Guimaraes et al., 1998)
of image acquisition was used to reduce motion artifacts in the brainstem signal resulting
from basilar artery pulsation. Each image acquisition was triggered by the first R-wave
of the electrocardiogram occurring after a 6.5-s period of either sound presentation or
silence. No images were acquired during this 6.5-s period. Due to cardiac triggering,
the exact repetition time of image acquisitions (TR) varied slightly over time and across
subjects; the average TR amounted to 9.24 s. Five sound epochs containing the five
sound conditions in pseudorandom order were alternated with a single silence epoch.
Each epoch lasted about 46 s, during which the stimulus was presented five times. A total
of 300 images were acquired per listener (50 for each condition).
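The epoch arithmetic can be checked with a short sketch. The shuffling routine is an assumption for illustration; only the counts (five sound epochs plus one silence epoch per cycle, five acquisitions per epoch, 300 images in total) come from the text:

```python
import random

CONDITIONS = ["center", "left static", "right static", "left moving", "right moving"]
ACQUISITIONS_PER_EPOCH = 5

def build_schedule(n_cycles=10, seed=1):
    """One cycle = the five sound conditions in pseudorandom order followed
    by a silence epoch; ten cycles reproduce the session described above."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_cycles):
        order = list(CONDITIONS)
        rng.shuffle(order)
        schedule.extend(order)
        schedule.append("silence")
    return schedule

schedule = build_schedule()
# 60 epochs x 5 acquisitions = 300 images, i.e. 50 per condition:
assert len(schedule) * ACQUISITIONS_PER_EPOCH == 300
```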
Subjects were asked to listen to the sounds and take particular notice of their spatial
attributes. To prevent the subjects from moving their eyes in the direction of the sounds,
they were asked to fixate a cross at the midpoint of the visual axis and perform a visual
control task. The task was to press a button with the left or right index finger upon
each occurrence of the capital letter ‘Z’ in either of two simultaneous, but uncorrelated,
sequences of random one-digit numbers that were shown to the left and the right of the
fixation cross. The numbers were presented for 50 ms once every 2 s.
4.2.2 fMRI data acquisition
Blood oxygen level-dependent (BOLD) contrast image volumes were acquired with a
Siemens Vision 1.5-T whole body scanner and gradient echo planar imaging (TR = 9.24 s;
TE = 66 ms). Each brain volume consisted of twenty 4-mm slices with an interslice gap
of 0.4 mm and an in-plane resolution of 3.125×3.125 mm, which were acquired in ascending order. At the beginning of each measurement, a high-resolution structural image
was acquired using the 3D MP-RAGE sequence. The midsagittal slice of the structural
image was used to orient the slices of the functional images along the line between the
anterior and posterior commissures. The functional slices were positioned so that the
inferior colliculus (IC) in the midbrain was covered by the third slice.
4.2.3 Data analysis
Structural and functional images were analyzed using SPM99 (http://www.fil.ion.ucl.ac.uk/spm). After realignment, slice time correction, coregistration with structural images,
normalization and smoothing (10 mm full width at half maximum), the functional image
time series of fourteen subjects, comprising a total of 4200 volumes, were subjected to a
fixed-effects group analysis. The height threshold for activation was t = 4.65 (pvoxel ≤ 0.05, corrected for multiple comparisons across the entire scanned volume). In Fig. 4.5, a cluster threshold (pcluster ≤ 0.001, corrected) was used to illustrate the whole extent of the respective activations. The contrasts between the static lateralized sound conditions and the center condition failed to meet the threshold criterion of t ≥ 4.65, but did produce significant auditory cortical activation when a less stringent criterion was used (t ≥ 3.09; pvoxel ≤ 0.001, uncorrected). In these cases, a small volume correction was
applied within bilateral spheres of 15-mm radius centered on the plana temporale (PT;
dashed outlines in Figs. 4.3a and 4.3b). The position of PT was approximated as 10 mm
posterior and lateral, and 5 mm superior to the ‘center of gravity’ of the probability map
of Heschl’s gyrus (HG) for the fourteen subjects who participated in the experiment (see
Table 4.1 for MNI coordinates). The probability map was constructed by labeling HG
in both hemispheres of each subject using the MRIcro software (http://www.psychology.nottingham.ac.uk/staff/crl/mricro.html). For that, the area between the face of HG and
the connecting line between the first transverse sulcus and Heschl’s sulcus, or the sulcus
intermedius in the case of a duplicate HG, was marked in successive coronal slices of the individual structural scans between the posterior and the anterior edge of HG. The individual marked volumes of fourteen subjects were then averaged to produce a probability map of HG.

Figure 4.1: Activation for the contrast between all sound conditions and silence, rendered onto the average structural image of the group. Axial section at z = 12 mm showing bilateral activation in AC (a), coronal section at y = −36 mm showing activation in IC (b). The color scale gives the t-value for the comparison between the BOLD responses to the sound conditions and silence. Activation was thresholded at t = 4.65 (pvoxel ≤ 0.05, corrected).
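The probability-map and region-of-interest construction described in this section can be sketched in NumPy. The sketch assumes that all labeled masks are already voxel-aligned arrays and handles coordinates in mm directly; the affine transforms between voxel and MNI space that a real analysis would need are omitted, and the function names are illustrative:

```python
import numpy as np

def probability_map(masks):
    """Voxelwise average of binary (0/1) HG labelings across subjects."""
    return np.mean(np.stack(masks, axis=0), axis=0)

def center_of_gravity(pmap):
    """Probability-weighted mean voxel coordinate of a map."""
    coords = np.indices(pmap.shape).reshape(pmap.ndim, -1).astype(float)
    weights = pmap.ravel()
    return coords @ weights / weights.sum()

def pt_center(hg_center_mm, left_hemisphere):
    """PT approximated as 10 mm posterior and lateral and 5 mm superior to
    the HG center, as in the Methods; x is negative in the left hemisphere
    in MNI space."""
    x, y, z = hg_center_mm
    lateral = -10.0 if left_hemisphere else 10.0
    return (x + lateral, y - 10.0, z + 5.0)

def sphere_mask(shape, center_voxel, radius_voxels):
    """Boolean mask of all voxels within a given radius of a center voxel,
    e.g. for a 15-mm small-volume correction sphere."""
    grid = np.indices(shape)
    dist_sq = sum((g - c) ** 2 for g, c in zip(grid, center_voxel))
    return dist_sq <= radius_voxels ** 2
```

For example, a hypothetical left-HG center of (−46, −20, 8) mm would place the PT search sphere at (−56, −30, 13) mm.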
4.2.4 Subjects
Fourteen right-handed subjects (6 male, 8 female), between 22 and 33 years of age, with
no history of hearing disorder or neurological disease participated in the experiment after
having given informed consent. The experimental procedure was approved by the local
ethics committee.
4.3 Results
4.3.1 Comparison between all sounds and silence
The BOLD response produced by all five sound conditions (center, left and right static,
left and right moving) was compared to the response produced by the silent condition to
reveal regions sensitive to the noise stimuli used in the current experiment. The contrast
yielded three clusters of significant activation in the auditory pathway, one large cluster
in each superior temporal plane (STP; Fig. 4.1a), and a smaller cluster spanning both
inferior colliculi (IC) in the midbrain (Fig. 4.1b). The MNI coordinates and t-values of
the most significant voxels in these activation clusters are listed in Table 4.1. The STP
activation comprised the region of Heschl’s gyrus (HG) and extended onto the planum
temporale (PT).
The lateralized sounds produced a largely symmetric auditory cortical response when
contrasted against the silence condition (Fig. 4.2). The activation patterns for the contrasts between all left- (Fig. 4.2b) and all right-lateralized sounds (Fig. 4.2c) versus silence were similar to the activation pattern produced by the all sounds versus silence
contrast (Fig. 4.2a). This is in accordance with the results of Woldorff et al. (1999), who also contrasted lateralized sounds against a silent baseline and found no significant interhemispheric differences in activation strength. The light-gray ellipses in Fig. 4.2 mark the approximate position of Heschl's gyrus in the group of fourteen listeners.

Contrast               Brain region        Coordinates x, y, z    t       pvoxel (corr.)
----------------------------------------------------------------------------------------
all sounds-silence     left STG            −40, −28, 12           29.36   < 0.001
                       right STG           46, −24, 10            25.36   < 0.001
                       IC                  −2, −36, −8            6.24    < 0.001
all left-center        right PT            54, −24, 10            5.62    0.001
all right-center       left PT             −46, −28, 6            5.45    0.001
left static-center     right PT            56, −24, 8             4.42    0.002*
right static-center    left PT             −62, −28, 20           3.65    0.032*
left moving-center     right PT/TPJ/IPL    64, −32, 14            6.35    < 0.001
right moving-center    left PT/TPJ/IPL     −56, −26, 12           6.61    < 0.001
                       right PT/TPJ/IPL    66, −34, 16            5.92    < 0.001
right-left moving      left PT/TPJ         −54, −28, 12           4.70    0.04

Table 4.1: MNI coordinates and t-values of auditory activation foci. STG: superior temporal gyrus; PT: planum temporale; TPJ: temporoparietal junction; IPL: inferior parietal lobe; IC: inferior colliculus. Asterisks mark p-values obtained with the small volume correction described in the Methods.
4.3.2 Differential sensitivity to lateralized sounds: contralateral asymmetry
Contrasts between sound conditions and silence would be expected to reveal all brain areas that are sensitive to the sounds or to any of the sounds' various perceptual attributes.
In order to isolate those regions involved in interaural temporal processing, and examine
their response to lateralized sounds, the response to all left- or all right-lateralized sounds
(left static/moving or right static/moving) was compared to the response to the central
sound (center). Figs. 4.3a and 4.3b show that the lateralized sounds produced a stronger
contralateral response compared to the central sound. The activation to the all left versus
center contrast was largely confined to the right AC (Fig. 4.3a), whereas the main area
of activation in the all right versus center contrast was in the left AC (Fig. 4.3b). The
differential activation produced by the lateralized sounds was limited to the PT, that is
the part of the supratemporal plane posterior to HG (Figs. 4.3a and 4.3b; Table 4.1). The PT has previously been implicated in the processing of spatial sound attributes and sound
movement in humans (Baumgart et al., 1999; Warren et al., 2002; Zatorre et al., 2002a).
In the monkey, non-primary auditory fields posterior to primary AC have been shown to
form a posterior-dorsally directed processing stream that is assumed to be specialized in
auditory spatial processing (Rauschecker and Tian, 2000).
In order to assess the relative contributions of the static and moving sound conditions
to the activation in PT, each of the lateralized sound conditions (left/right static/moving)
was contrasted separately against the center condition. Figs. 4.3c and 4.3d show that the moving sounds (cross-hatched bars) produced a consistently stronger activation in PT than the
static sounds (hatched bars). In fact, neither the left static versus center contrast nor the
right static versus center contrast produced any activation that exceeded the threshold criterion of t = 4.65, corresponding to a p-value of 0.05 or better, corrected for multiple
Figure 4.2: Contrasts between lateralized sounds and silence. The two lower panels show the axial
projection of the activations to the contrasts between all left- (b) and all right-lateralized sounds (c)
versus silence for a height threshold of t = 4.65 (pvoxel ≤ 0.05, corrected). For comparison, the
upper panel shows the activation to the all sounds versus silence contrast, replotted from Fig. 4.1.
The light-gray ellipses mark the approximate position of HG. When contrasted against silence,
the lateralized sounds produced a largely symmetric response.
Figure 4.3: Contrasts between the lateralized sounds and the central sound. The upper panels
show the axial projection of the activation to all left-lateralized (a) and all right-lateralized sounds (b) relative to the central sound; the height threshold was t = 4.65 (pvoxel ≤ 0.05, corrected).
The differential activation to the lateralized sounds was confined to the contralateral PT. Panels c
and d depict the relative contributions of the static and moving sounds to the contrasts shown in
panels a and b. Panel c shows the contrast-weighted beta-values for the left static versus center
(hatched bar) and left moving versus center contrasts (cross-hatched bar) at the most significant
voxel in the all left versus center comparison (gray arrow pointing to panel a). Panel d shows the
analogous analysis for the right-lateralized sounds. The moving sounds activated the PT more
strongly than the static sounds. The black, dotted outlines in panels a and b mark the regions used
for the volume of interest analyses of the left and right static versus center contrasts (see text).
comparisons across the entire scanned volume. However, a more lenient threshold criterion (t = 3.09; pvoxel ≤ 0.001, uncorrected) combined with a hypothesis-driven (Warren et al.,
2002; Zatorre et al., 2002a) volume of interest analysis revealed that the left static versus
center and right static versus center contrasts produced a significant activation in the PT
of the respective contralateral hemisphere (see Table 4.1); there was no significant activation of the corresponding region in the ipsilateral hemisphere. The search volumes for
these analyses were spheres of 15-mm radius centered on the left and right PT; they are
marked by black dotted outlines in Figs. 4.3a and 4.3b. The position of PT in each hemisphere was approximated as 10 mm posterior and lateral, and 5 mm superior to the center
of HG, which was derived from the averaged map of HG for the fourteen subjects who
participated in this experiment (see Methods).
4.3.3 Differential sensitivity to moving sounds: right-hemisphere dominance
Unlike the contrasts between the static sounds and center, the contrasts between the moving sounds and center did reach the predefined threshold criterion of t = 4.65 (Figs. 4.4a and 4.4b; Table 4.1). The activation produced by the left moving versus center contrast was
largely confined to the right hemisphere (Fig. 4.4a), whereas the right moving versus center comparison produced a more bilateral pattern of activation, comprising a larger activation cluster in the left hemisphere and a smaller cluster in the right hemisphere (Fig. 4.4b).
This suggests a right-hemisphere dominance in the processing of sound movement, in the
sense that the right AC represents movement in both hemifields, whereas the left AC predominantly represents movement within the right hemifield. The lower panels in Fig. 4.4
corroborate this conjecture. In the right AC (Fig. 4.4c), the differential response to the
right moving sounds (cross-hatched bar) is almost as large as the response to the left moving sounds (hatched bar). In contrast, the differential response of the left AC to the left
moving sounds (hatched bar in Fig. 4.4d) is much smaller than the left-AC response to the
right moving sounds (cross-hatched bar in Fig. 4.4d). In order to verify the effect statistically,
we calculated the contrasts between the right moving and left moving sound conditions
and vice versa. If the right moving sounds produce a reliably stronger left-AC activation
than the left moving sounds, but the left and right moving sounds activate the right AC
similarly strongly, the right moving versus left moving contrast should yield a significant
activation in the left AC, but the left moving versus right moving contrast should yield
no significant activation in either AC. Figure 4.5 shows that this was indeed the case. In
order to reveal all significantly activated voxels in the auditory cortices, even those which
would be insignificant at the corrected level, the activation in Fig. 4.5 was thresholded at
t = 3.09 (pvoxel ≤ 0.001, uncorrected) and masked with the all sounds versus silence
contrast; the uncorrected p-value of the mask was set to 0.001. Even with this relatively
lenient threshold criterion, the left moving versus right moving contrast yielded no activation in either AC (Fig. 4.5a). In contrast, the right moving versus left moving contrast
produced a significant activation in the left AC, parts of which even surpassed the more
conservative threshold criterion of t = 4.65 (pvoxel ≤ 0.05, corrected; see Fig. 4.5b and
Table 4.1).
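The two threshold criteria quoted above can be related with a few lines of code. The Gaussian approximation of the t-map and the search volume of 50,000 voxels are our illustrative assumptions; the corrected threshold of t = 4.65 used in the study comes from random-field theory, which additionally accounts for the spatial smoothness of the data:

```python
from scipy.stats import norm

# At the high degrees of freedom of a group-level fMRI contrast, the t-map
# is well approximated by a unit normal, so the uncorrected voxel threshold
# follows directly from the one-sided p-value.
t_uncorrected = norm.isf(0.001)
print(round(t_uncorrected, 2))  # 3.09

# A Bonferroni correction over an assumed search volume of 50,000 voxels
# shows why the corrected threshold is much higher; random-field theory
# gives a slightly more lenient cutoff (t = 4.65 above) because
# neighboring voxels are spatially correlated.
t_bonferroni = norm.isf(0.05 / 50_000)
print(round(t_bonferroni, 2))  # 4.75
```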
Figure 4.4: Contrasts between the moving sounds and the central sound. The upper panels show the axial projection of the activation to the left moving sounds (a) and the right moving sounds (b) relative to the central sound, thresholded at t = 4.65 (pvoxel ≤ 0.05, corrected). The lower panels show the contrast-weighted beta-values for the left moving versus center (hatched bar) and right moving versus center contrasts (cross-hatched bar), evaluated at the most significant voxels in the left moving versus center (c) and right moving versus center comparisons (d), which were located in the right and left AC, respectively (gray arrows in panels a and b). The analysis shows that the right AC was activated by both the left and right moving sounds (c), whereas the left AC predominantly responded to the right moving sounds (d).

Figure 4.5: Activation to the contrasts between the left and right moving sounds in axial projection. The activation was thresholded at t = 3.09 (pvoxel ≤ 0.001, uncorrected) and masked with the all sounds versus silence contrast to reveal all significant voxels in the auditory cortices. Whereas the right moving versus left moving contrast yielded a significant activation in the left AC (b), the left moving versus right moving contrast produced no activation in either AC (a), corroborating the notion of a right-hemisphere dominance in auditory motion processing.

Figure 4.6a shows how the differential activation to moving sounds is distributed on
the supratemporal plane. The red color marks voxels with t-values of 4.65 or larger
(pvoxel ≤ 0.05, corrected). The green color depicts the whole extent of the respective
activation clusters (t ≥ 3.09; pvoxel ≤ 0.001, uncorrected). The white highlight shows
a 50% probability map of HG for the group of subjects (see Methods). The shape of the
activation to moving sounds is roughly triangular in both hemispheres and comprises the
lateral half to two-thirds of the PT. Some activation to the moving sounds overlaps parts
of HG medially and laterally, however, there is little or no movement-related activity on
the central part of HG, which is the site of the primary AC in humans (Rademacher et al.,
2001).
The differential activation to moving sounds also comprised the temporo-parietal
junction (TPJ) and extended into regions of the inferior parietal lobe (IPL; Fig. 4.6b).
The uncorrected significant activation (t ≥ 3.09; pvoxel ≤ 0.001, uncorrected; green in
Fig. 4.6) in the PT and IPL formed contiguous clusters in both hemispheres. The parietal
activations to the left moving versus center and right moving versus center contrasts were
located at MNI coordinates 54, −38, 30 (t = 3.66) and −56, −36, 26 mm (t = 5.48),
respectively. Similar to the supratemporal activation, the inferior parietal activation to
the left moving versus center contrast was confined to the right hemisphere (left panels
in Fig. 4.6b), whereas the inferior parietal activation to the right moving versus center
contrast was essentially bilateral (right panels in Fig. 4.6b), albeit with lesser significance
on the ipsilateral side (t = 4.9 at 54, −38, 30 mm versus t = 5.48 at −56, −36, 26 mm).
The moving sounds produced no differential activation in the IC. In view of the much
lower t-values of the IC activation in the all sounds versus silence contrast compared to
the AC activation (Table 4.1), however, the lack of IC activation in the differential sound
contrasts may be a mere threshold effect.
4.3.4 Activations outside ‘classical’ auditory structures
The contrasts between sound conditions and silence, and between the lateralized sounds
and center also produced some activations in structures outside the ‘classical’ AC (see
Figs. 1–4).

Figure 4.6: Differential activation to the moving sounds rendered onto the average structural image of the group. Red: voxels with t-values of 4.65 or larger (pvoxel ≤ 0.05, corrected). Green: voxels with t-values of 3.09 or larger (pvoxel ≤ 0.001, uncorrected) that were located in clusters of highly significant size (pcluster ≤ 0.001, corrected). In panel a, the location and orientation of the section is shown in the small inset at the bottom. The locations of the coronal and sagittal sections shown in panel b (x = 52 mm, y = −36 mm, x = −52 mm) are indicated by brown vertical lines in the images themselves. The white highlight shows the 50% probability map of HG for the group of subjects.

The most significant activation outside AC in the all sounds versus silence
contrast was located at the base of the inferior frontal sulcus in the left hemisphere, close
to the junction between the inferior frontal and precentral sulci (t = 5.42; pvoxel ≤ 0.05,
corrected at −28, 16, 22 mm). In the contrast between all lateralized sound conditions and
the center condition, the most significant activation outside AC was in the left thalamus
(t = 4.8; pvoxel ≤ 0.05, corrected at −10, −8, 4 mm) and the left and right pulvinar (t =
4.47; pvoxel ≤ 0.001, uncorrected at −12, −26, −4 mm and t = 4.31; pvoxel ≤ 0.001,
uncorrected at 10, −24, −4 mm). These activations may, at least in part, be related to the
fact that subjects were asked to perform a visual control task whilst listening to the sounds
(see Methods). Performing the control task would be expected to be more difficult during
the sound conditions than during the silence condition, because subjects had to divide
their attention between the auditory and visual modalities. Moreover, the spatial foci of
auditory and visual attention were disparate during the lateralized sound conditions, and
subjects had to suppress the temptation to move their eyes in the direction of the sounds
in order to do the visual task.
4.4 Discussion
The current data show that the internal representation of interaural temporal information
mediated by lateralized sounds is predominantly contralateral in the human AC (Fig. 4.2).
All sounds used in the current experiment had the same energy at the two ears and the
impression of laterality or movement was created solely by interaural temporal manipulations, which are inaudible when listening to each ear separately. This means that
the observed asymmetry was unconfounded by the known asymmetry in the number of
crossed and uncrossed excitatory projections in the ascending auditory pathway (Webster
et al., 1992) and must be a result of the interaural temporal processing of the sounds.
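To illustrate, a lateralized noise of this kind, in which the two ear signals differ only by a pure time delay and therefore carry identical energy, can be generated in a few lines of NumPy. The sampling rate, low-pass cutoff, and ITD value below are arbitrary choices for the sketch, not the parameters of the actual experiment:

```python
import numpy as np

def itd_noise(itd_s, dur=1.0, fs=44100, cutoff_hz=1500, seed=0):
    """Binaural (2, N) low-pass noise lateralized purely by an interaural
    time difference (ITD); positive itd_s delays the left channel, so the
    right ear leads and the image moves to the right."""
    n = int(dur * fs)
    noise = np.random.default_rng(seed).standard_normal(n)
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n, 1 / fs)
    spec[freqs > cutoff_hz] = 0  # ITD fine structure is only usable at low frequencies
    phase = np.exp(-2j * np.pi * freqs * itd_s)  # a pure delay is a linear phase shift
    left = np.fft.irfft(spec * phase, n)
    right = np.fft.irfft(spec, n)
    return np.vstack([left, right])

stim = itd_noise(itd_s=500e-6)  # 500 µs ITD, within the physiological range
# The delay changes only the timing, not the level: energy is equal at both ears.
assert np.isclose(np.sum(stim[0] ** 2), np.sum(stim[1] ** 2), rtol=1e-6)
```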
ITD processing involves comparing the temporal structure of the signals from the two
ears on a sub-millisecond scale. This comparison must be accomplished in the brainstem
(Oertel, 1997), because the spike discharges of AC neurons do not exhibit the temporal precision that would be necessary to convey timing differences on that fine a scale
(Lu et al., 2001; Eggermont, 2002). It is generally assumed that interaural temporal information is converted to a more stable neural code at the level of the superior olivary
complex (SOC), which is the first auditory brainstem structure, where information from
the two ears is integrated (Joris et al., 1998). The current data show how the left and right
auditory hemifields are recreated as a result of this subcortical recoding. The view that
the observed contralateral asymmetry originates in the brainstem is supported by lesion
studies in animals (Casseday and Neff, 1975; Thompson and Masterton, 1978; Jenkins and
Masterton, 1982) and in humans (Furst et al., 1995; Pratt et al., 1998). Both animal and
human studies showed that brainstem lesions above the level of the SOC, for example, in
the lateral lemniscus or the IC, impair sound localization in the hemifield contralateral to
the site of the lesion, whereas damage below the level of the SOC causes more diffuse
deficits.
4.4.1 Representation of spatial attributes in stationary sounds
The stationary lateralized sounds used in the current experiment produced a significant,
albeit small activation increase in the PT of the respective contralateral hemisphere compared to the central sound. In contrast, Zatorre et al. (2002a) observed no reliable cerebral
blood flow change (measured with PET) associated with variations in the spatial attributes
of stationary sounds, at least not when the sounds were presented sequentially, as in the
current experiment. However, the spatial ranges of the sounds used by Zatorre et al. were
centered around the midline, and thus always comprised equal parts of both hemifields,
and so, Zatorre et al. were unable to detect the contralateral spatial tuning that was observed in the current study. Taken together, the current results and the results of Zatorre
et al. suggest that there is no differential spatial tuning within each hemisphere, which
means that different ITDs, viz., different lateral positions in the two hemifields, must be
coded non-topographically (see also Middlebrooks, 2002). A topographic map of interaural temporal information would contain neurons that are tuned to narrow ranges of
ITDs, and the ITD producing maximal discharge (best ITD) would vary parametrically
across neurons, spanning the entire physiologically relevant range of ITDs. Interaural delay would thus be represented by the position of maximal discharge in the map. Contrary
to these expectations, electrophysiological data by McAlpine et al. (2001) and Brandt et
al. (2002) demonstrated that most binaurally sensitive neurons in the brainstem and midbrain of the guinea pig and the gerbil are tuned to ITDs outside, rather than within, the
physiologically relevant ITD range. The majority of neurons were tuned to ITDs favoring
the contralateral ear, and the slope of the tuning functions was generally steepest around
zero ITD, where ITD discrimination performance is most accurate in humans (Durlach
and Colburn, 1978). These results suggest that ITDs are coded by the activity level in
two hemispheric channels, each of which is broadly tuned to the respective contralateral
hemifield, rather than by the position of maximal activity in topographic neural maps
(Grothe, 2003; McAlpine and Grothe, 2003). Alternatively, different azimuthal positions
within one hemifield may be coded by the timing of action potentials (e.g., first-spike latency) in broadly tuned neurons of the respective contralateral hemisphere (Middlebrooks
et al., 1998; Furukawa and Middlebrooks, 2002; Middlebrooks et al., 2002). The current
data are consistent with both of these hypotheses.
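The hemispheric-channel hypothesis can be made concrete with a toy rate-code model: two opponent, broadly tuned channels whose logistic tuning curves are steepest at zero ITD, and from whose rate difference the ITD can be recovered. The tuning shape and the slope constant are illustrative assumptions, not values fitted to data:

```python
import numpy as np

SIGMA_US = 500.0  # slope constant of the tuning curves (illustrative)

def channel_rates(itd_us):
    """Normalized firing rates of the two hemispheric channels; each is
    broadly tuned to the contralateral hemifield with its steepest slope
    near 0 µs ITD, as reported by McAlpine et al. (2001)."""
    left = 1.0 / (1.0 + np.exp(-itd_us / SIGMA_US))   # prefers right hemifield
    right = 1.0 / (1.0 + np.exp(itd_us / SIGMA_US))   # prefers left hemifield
    return left, right

def decode_itd(left, right):
    """Recover the ITD from the rate difference: for logistic tuning,
    left - right = tanh(itd / (2 * sigma))."""
    return 2.0 * SIGMA_US * np.arctanh(left - right)

# The ITD is represented by the relative activity of the two channels,
# not by the position of a peak in a topographic map.
for itd in (-600.0, 0.0, 250.0):
    assert abs(decode_itd(*channel_rates(itd)) - itd) < 1e-9
```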
4.4.2 Specialized auditory “where” processing stream
Unlike the sound versus silence contrasts, none of the differential sound contrasts (lateralized versus central sound conditions) yielded any activation in the region of the primary
AC on HG. Rather, the differential activations to the static lateralized and moving sounds
were largely confined to regions posterior to HG. In the monkey, (at least) two different auditory processing streams have been distinguished on the basis of distinct patterns
of cortico-cortical connections (Romanski et al., 1999; Kaas and Hackett, 2000; Lewis
and Van Essen, 2000). Based on analogy with the visual system, Rauschecker and colleagues (Rauschecker, 1998; Rauschecker and Tian, 2000; Tian et al., 2001) proposed that the anterior-ventrally directed stream is specialized in the processing of non-spatial sound features (“what”),
whereas the posterior-dorsally directed stream is specifically concerned with auditory
spatial processing (“where”). The current data are consistent with this hypothesis, suggesting that, in humans, interaural temporal information is projected posteriorly from primary AC into PT, and then further posteriorly from PT to the TPJ and into the IPL (see,
however, Budd et al., 2003). In the current experiment, changes in the sounds’ spatial attributes were unconfounded with changes in their monaural spectro-temporal properties,
as would have been the case, had lateralization been mediated by filtering with head-related transfer functions (Wightman and Kistler, 1993). This means that the observed
activations in PT cannot be attributed to the processing of “spectral motion” (Belin and
Zatorre, 2000). As a note of caution, however, we would like to emphasize that the notion
of a specialized “where” stream remains conjectural as long as the mechanisms by which
auditory spatial information is processed are not properly understood. In particular, it is
conceivable that regions in the putative anterior “what” stream encode sound location by
action potential timing rather than by firing rate (Furukawa and Middlebrooks, 2002). In
this case, auditory spatial processing in these regions would not be associated with any
increase in BOLD signal and would thus be undetectable with fMRI. Evidence that auditory cortical regions anterior to HG may indeed be involved in sound localization comes
from human lesion data (Zatorre and Penhune, 2001).
4.4.3 Auditory motion processing
In addition to posterior temporal regions (PT), the differential response to the moving
sounds also comprised regions in the inferior parietal lobe. The posterior temporal activation to the moving sounds probably reflects the preattentive, sensory processing of time-varying spatial cues, whereas the inferior parietal activation may be related to higher-order processes associated with the conscious perception of movement, as for instance
the attentional tracking of the moving stimulus through space, or the integration of auditory spatial cues into multimodal spatial representations (Bushara et al., 1999; Bremmer
et al., 2001). Griffiths and colleagues (Griffiths et al., 1998, 2000) compared moving
sounds with sounds that contained the same physical movement cues as the moving
sounds but were nevertheless perceived as stationary, because different cues were traded
against each other to produce exact cancellation of their respective perceptual effects.
This comparison emphasizes activation associated with the perceptual and cognitive processing of movement by neutralizing activation related to the low-level processing of the
acoustic movement cues. In accordance with our interpretation of the posterior temporal
and inferior parietal parts of the movement-evoked response, the comparison reported by
Griffiths and coworkers revealed significant activation in the IPL and other parietal and
frontal regions, but not in the AC. The notion that the inferior parietal activation reflects
attentional or supramodal aspects of motion processing is also supported by the fact that
lesions in the IPL and, in particular, the TPJ are a frequent cause of the hemispatial neglect syndrome, which is known to be a supramodal deficit that may affect the visual,
auditory and somatosensory modalities (Halligan et al., 2003).
Both the posterior temporal and the inferior parietal activation to moving sounds exhibited a relative rightward asymmetry, in the sense that the right hemisphere was activated to similar degrees by sounds moving within the left or right hemifields, whereas the
left hemisphere was predominantly activated by sounds moving within the right hemifield. These results indicate that the functional hemispheric asymmetry in the sensory
representation of interaural temporal information parallels the asymmetry associated with
attentional and supramodal components of spatial processing. Hemispheric functional
asymmetries have also been observed in melody and speech processing in the auditory
pathway (Zatorre et al., 2002b; Patterson et al., 2002). In these cases, one hemisphere appears to devote more neuronal resources to the respective task than the other hemisphere.
In the case of auditory motion processing, on the other hand, the functional difference
between the hemispheres seems to be more a qualitative rather than a quantitative one,
in that auditory motion processing is more global in the right hemisphere and more local
in the left hemisphere. In this sense, the hemispheric asymmetry in auditory motion processing resembles the asymmetry in the processing of global and local aspects of visual
stimuli (Fink et al., 1996; Fink et al., 1997; Marshall and Fink, 2001). The difference
does not mean that the left hemisphere plays a lesser role in auditory space perception
than the right hemisphere in neurologically intact subjects. In the current data, the left-AC activation to the right moving sounds was stronger (Fig. 4.2) and spanned a larger
area (Fig. 4.5) than the right-AC activation to the left moving sounds. In the case of unilateral lesion, however, the right hemisphere would be expected to be better prepared to
take over the function of the left hemisphere than vice versa. The observed asymmetry in
the auditory motion processing may thus underlie the reported disparities in the auditory
spatial deficits following unilateral temporal or parietal lesions in the left versus the right
hemisphere (Bellmann et al., 2001; Zatorre and Penhune, 2001).
4.5 References
Baumgart F, Gaschler-Markefski B, Woldorff MG, Heinze HJ, Scheich H (1999) A movement-sensitive area in auditory cortex. Nature 400:724–726.
Belin P, Zatorre RJ (2000) ‘What’, ‘where’ and ‘how’ in auditory cortex. Nat Neurosci
3:965–966.
Bellmann A, Meuli R, Clarke S (2001) Two types of auditory neglect. Brain 124:676–
687.
Bremmer F, Schlack A, Shah NJ, Zafiris O, Kubischik M, Hoffmann K, Zilles K, Fink
GR (2001) Polymodal motion processing in posterior parietal and premotor cortex: A human fMRI study strongly implies equivalencies between humans and monkeys. Neuron,
29:287–296.
Brandt A, Behrend O, Marquardt T, McAlpine D, Grothe B (2002) Precise inhibition is
essential for microsecond interaural time difference coding. Nature 417:543–547.
Budd TW, Hall DA, Goncalves MS, Akeroyd MA, Foster JR, Palmer AR, Head K, Summerfield AQ (2003) Binaural specialisation in human auditory cortex: an fMRI investigation of interaural correlation sensitivity. Neuroimage 20:1783–1794.
Bushara KO, Weeks RA, Ishii K, Catalan MJ, Tian B, Rauschecker JP, Hallett M (1999)
Modality-specific frontal and parietal areas for auditory and visual spatial localization in
humans. Nat Neurosci 2:759–766.
Casseday JH, Neff WD (1975) Auditory localization: role of auditory pathways in brain
stem of the cat. J Neurophysiol 38:842–858.
Clarke S, Bellmann A, Meuli R, Assal G, Steck A (2000) Auditory agnosia and auditory
spatial deficits following left hemispheric lesions: evidence for distinct processing pathways. Neuropsychologia 38:797–807.
Durlach NI, Colburn HS (1978) Binaural phenomena. In: Handbook of perception, Vol.
IV (Carterette EC, Friedman M, eds) pp405–466. New York: Academic Press.
Eggermont JJ (2002) Temporal modulation transfer functions in cat primary auditory
cortex: separating stimulus effects from neural mechanisms. J Neurophysiol 87:305–321.
Engelien A, Stern E, Silbersweig D (2001) Functional neuroimaging of human central auditory processing in normal subjects and patients with neurological and neuropsychiatric
disorders. J Clin Exp Neuropsychol 23:94–120.
Fink GR, Halligan PW, Marshall JC, Frith CD, Frackowiak RS, Dolan RJ (1996) Where
in the brain does visual attention select the forest and the trees? Nature 382:626–628.
Fink GR, Halligan PW, Marshall JC, Frith CD, Frackowiak RS, Dolan RJ (1997) Neural mechanisms involved in the processing of global and local aspects of hierarchically
organized visual stimuli. Brain 120:1779–1791.
Fitzpatrick DC, Kuwada S, Batra R (2000) Neural sensitivity to interaural time differences: beyond the Jeffress model. J Neurosci 20:1605–1615.
Furst M, Levine RA, Korczyn AD, Fullerton BC, Tadmor R, Algom D (1995) Brainstem
lesions and click lateralization in patients with multiple sclerosis. Hear Res 82:109–124.
Furukawa S, Middlebrooks JC (2002) Cortical representation of auditory space: information-bearing features of spike patterns. J Neurophysiol 87:1749–1762.
Griffiths TD, Rees G, Rees A, Green GGR, Witton C, Rowe D, Buchel C, Turner R,
Frackowiak RS (1998) Right parietal cortex is involved in the perception of sound movement in humans. Nat Neurosci 1:74–79.
Griffiths TD, Green GGR, Rees A, Rees G (2000) Human brain areas involved in the
analysis of auditory movement. Hum Brain Mapp 9:72–80.
Grothe B (2003) New roles for synaptic inhibition in sound localization. Nat Rev Neurosci 4:540–550.
Guimaraes AR, Melcher JR, Talavage TM, Baker JR, Ledden P, Rosen BR, Kiang NY,
Fullerton BC, Weisskoff RM (1998) Imaging subcortical auditory activity in humans.
Hum Brain Mapp 6:33–41.
Hall DA, Haggard MP, Akeroyd MA, Palmer AR, Summerfield AQ, Elliott MR, Gurney
EM, Bowtell RW (1999) “Sparse” temporal sampling in auditory fMRI. Hum Brain
Mapp 7:213–223.
Halligan PW, Fink GR, Marshall JC, Vallar G (2003) Spatial cognition: evidence from
visual neglect. Trends Cogn Sci 7:125–133.
Jenkins WM, Masterton RB (1982) Sound localization: effects of unilateral lesions in
central auditory system. J Neurophysiol 47:987–1016.
Joris PX, Smith PH, Yin TC (1998) Coincidence detection in the auditory system: 50
years after Jeffress. Neuron 21:1235–1238.
Kaas JH, Hackett TA (2000) Subdivisions of auditory cortex and processing streams in
primates. Proc Natl Acad Sci USA 97:11793–11799.
Kaiser J, Lutzenberger W, Preissl H, Ackermann H, Birbaumer N (2000) Right-hemisphere
dominance for the processing of sound-source lateralization. J Neurosci 20:6631–6639.
Lewis JW, Van Essen DC (2000) Corticocortical connections of visual, sensorimotor, and
multimodal processing areas in the parietal lobe of the macaque monkey. J Comp Neurol
428:112–137.
Lu T, Liang L, Wang X (2001) Temporal and rate representations of time-varying signals
in the auditory cortex of awake primates. Nat Neurosci 4:1131–1138.
Malone BJ, Scott BH, Semple MN (2002) Context-dependent adaptive coding of interaural phase disparity in the auditory cortex of awake macaques. J Neurosci 22:4625–4638.
Marshall JC, Fink GR (2001) Spatial cognition: where we were and where we are. Neuroimage 14:S2–S7.
McAlpine D, Jiang D, Shackleton TM, Palmer AR (2000) Responses of neurons in the
inferior colliculus to dynamic interaural phase cues: evidence for a mechanism of binaural
adaptation. J. Neurophysiol. 83:1356–1365.
McAlpine D, Jiang D, Palmer AR (2001) A neural code for low-frequency sound localization in mammals. Nat Neurosci 4:396–401.
McAlpine D, Grothe B (2003) Sound localization and delay lines—do mammals fit the
model? Trends Neurosci 26:347–350.
Mesulam MM (1999) Spatial attention and neglect: parietal, frontal and cingulate contributions to the mental representation and attentional targeting of salient extrapersonal
events. Philos Trans R Soc Lond B 354:1325–1346.
Middlebrooks JC (2002) Auditory space processing: here, there or everywhere? Nat
Neurosci 5:824–826.
Middlebrooks JC, Xu L, Eddins AC, Green DM (1998) Codes for sound-source location
in nontonotopic auditory cortex. J Neurophysiol 80:863–881.
Middlebrooks JC, Xu L, Furukawa S, Mickey BJ (2002) Location signaling by cortical
neurons. In: Integrative functions in the mammalian auditory pathway (Oertel D, Fay
RR, Popper AN, eds) pp319–357. New York: Springer.
Oertel D (1997) Encoding of timing in brain stem auditory nuclei of vertebrates. Neuron
19:959–962.
Olivares R, Montiel J, Aboitiz F (2001) Species differences and similarities in the fine
structure of the mammalian corpus callosum. Brain Behav Evol 57:98–105.
Patterson RD, Uppenkamp S, Johnsrude IS, Griffiths TD (2002) The processing of temporal pitch and melody information in auditory cortex. Neuron 36:767–776.
Pratt H, Polyakov A, Ahronson V, Korczyn AD, Tadmor R, Fullerton BC, Levine RA,
Furst M (1998) Effects of localized pontine lesions on auditory brain-stem evoked potential and binaural processing in humans. EEG Clin Neurophysiol 108:511–520.
Rademacher J, Morosan P, Schormann T, Schleicher A, Werner C, Freund HJ, Zilles
K (2001) Probabilistic mapping and volume measurement of human primary auditory
cortex. Neuroimage 13:669–683.
Rauschecker JP (1998) Cortical processing of complex sounds. Curr Opin Neurobiol
8:516–521.
Rauschecker JP, Tian B (2000) Mechanisms and streams for processing of “what” and
“where” in auditory cortex. Proc Natl Acad Sci USA 97:11800–11806.
Romanski LM, Tian B, Fritz J, Mishkin M, Goldman-Rakic PS, Rauschecker JP (1999)
Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nat Neurosci 2:1131–1136.
Thompson GC, Masterton RB (1978) Brainstem auditory pathways involved in reflexive
head orientation to sound. J Neurophysiol 41:1183–1202.
Tian B, Reser D, Durham A, Kustov A, Rauschecker JP (2001) Functional specialization
in rhesus monkey auditory cortex. Science 292:290–293.
Warren JD, Zielinski BA, Green GGR, Rauschecker JP, Griffiths TD (2002) Perception
of sound-source motion by the human brain. Neuron 34:139–148.
Weeks RA, Aziz-Sultan A, Bushara KO, Tian B, Wessinger CM, Dang N, Rauschecker
JP, Hallett M (1999) A PET study of human auditory spatial processing. Neurosci Lett
262:155–158.
Weeks R, Horwitz B, Aziz-Sultan A, Tian B, Wessinger CM, Cohen LG, Hallett M,
Rauschecker JP (2000) A positron emission tomographic study of auditory localization
in the congenitally blind. J Neurosci 20:2664–2672.
Webster DB, Popper AN, Fay RR (1992) The Mammalian Auditory Pathway: Neuroanatomy. New York: Springer.
Wightman FL, Kistler DJ (1992) The dominant role of low-frequency interaural time
differences in sound localization. J Acoust Soc Am 91:1648–1661.
Wightman FL, Kistler DJ (1993) Sound localization. In: Human psychophysics (Yost WA,
Popper AN, Fay RR, eds.), pp. 155–192. New York: Springer.
Woldorff MG, Tempelmann C, Fell J, Tegeler C, Gaschler-Markefski B, Hinrichs H,
Heinze HJ, Scheich H (1999) Lateralized auditory spatial perception and the contralaterality of cortical processing as studied with functional magnetic resonance imaging and
magnetoencephalography. Hum Brain Mapp 7:49–66.
Zatorre RJ, Penhune VB (2001) Spatial localization after excision of human auditory
cortex. J Neurosci 21:6321–6328.
Zatorre RJ, Bouffard M, Ahad P, Belin P (2002a) Where is ‘where’ in the human auditory
cortex? Nat Neurosci 5:905–909.
Zatorre RJ, Belin P, Penhune VB (2002b) Structure and functions of auditory cortex:
music and speech. Trends Cogn Sci 6:37–46.
Chapter 5
Top-down or bottom-up:
hemispheric asymmetry in response
to monaural stimulation in the
human auditory brainstem,
thalamus, and cortex
Abstract
This study reports evidence for an asymmetrical activation of the left- and right-side auditory brainstem, thalamus and cortex in response to left or right monaural sound stimulation. Neural activity elicited by monaural sound stimulation was measured in the human
auditory pathway from the cochlear nucleus to the cortex. Functional magnetic resonance
imaging (fMRI) of the whole brain with cardiac triggering allowed simultaneous observation of activity in the brainstem, thalamus and cerebrum; sparse temporal sampling was
employed to separate the effects of scanner noise from the response to the experimental
stimuli. Left and right cortical and subcortical structures responded differently to the
monaural sound conditions. In the left and right cochlear nucleus, ipsilateral stimulation
elicited a larger signal change than contralateral stimulation, as expected from the exclusively ipsilateral afferent projections to the CN. In contrast, the inferior colliculi, medial
geniculate bodies and auditory cortices responded asymmetrically to left and right ear
stimulation: the right-side structures responded equally well to sound stimulation from
the left and right ear, whereas the left-side structures responded predominantly to right ear
stimulation. The data show that neural activation asymmetries can be found as early as in
the inferior colliculi and continue up to the auditory thalamus and cortex. It is discussed
how these asymmetries might arise from the anatomical and physiological asymmetries
in the afferent (bottom-up) and efferent (top-down) auditory pathway.
5.1 Introduction
In the primate auditory system, sound information travels in the form of action potentials
along the ascending auditory pathway from the spiral ganglion in the cochlea to the auditory cortex, and on to higher polymodal cortical areas. Along this way, the information
traverses more processing stages than in any other sensory system. Among the processing
nuclei are: the cochlear nucleus (CN), the superior olivary complex (SOC), the inferior
colliculus (IC), and the medial geniculate body (MGB) that relays information to all subdivisions of the auditory cortex (AC: in this article, the term ‘auditory cortex’ denotes
the part of the superior temporal plane that is sensitive to sound stimulation). All of
the structures mentioned, including the CN (Needham and Paolini 2003), project to their
contralateral homologues and to several higher processing areas, both ipsi- and contralaterally. Monaural sound input into one ear activates the ipsilateral CN, after which the activation spreads out ipsi- and contralaterally into the complex system of brainstem nuclei
with a stronger neural excitation in the contralateral structures. All activation pathways
finally converge at the highest auditory brainstem nucleus, the IC. The majority of ascending projections from the IC to the AC via the MGB are ipsilateral and hence preserve
the contralateral activation predominance in the MGB and AC. Although all neurons
in the primary AC respond to sounds from both ears (Zhang et al. 2004), the contralateral projections take precedence in the number of excitatory fibers (Rosenzweig 1951;
Glendenning and Masterton 1983). This contralateral activation predominance has been
demonstrated in the human AC with EEG, MEG, PET and fMRI (Loveless et al. 1994;
Hirano et al. 1997; Scheffler et al. 1998; Jancke et al. 2002; Suzuki et al. 2002), and with
fMRI in the IC (Melcher et al. 2000).
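The crossing pattern sketched above can be caricatured as a chain of relay stages with ipsilateral and contralateral weights; all weights below are arbitrary illustrative values, not anatomical measurements:

```python
def relay(left, right, ipsi, contra):
    """Propagate a (left, right) activation pair through one relay stage."""
    return left * ipsi + right * contra, right * ipsi + left * contra

# Monaural left-ear input enters the ipsilateral (left) CN only.
l, r = 1.0, 0.0
l, r = relay(l, r, ipsi=0.5, contra=1.0)  # CN -> SOC: predominantly crossed
l, r = relay(l, r, ipsi=1.0, contra=0.3)  # SOC -> IC: pathways converge at the IC
l, r = relay(l, r, ipsi=1.0, contra=0.0)  # IC -> MGB: mostly ipsilateral
l, r = relay(l, r, ipsi=1.0, contra=0.0)  # MGB -> AC: mostly ipsilateral

# The contralateral (right) side ends up more strongly activated,
# reproducing the contralateral activation predominance described above.
assert r > l
```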
However, even when both ears are stimulated with exactly the same input, the activation of the left and right auditory structures is symmetric only for very simple stimuli.
Complex natural stimuli appear to be processed preferentially in either one of the cortical
hemispheres, depending on their acoustic characteristics. Among the proposed functional
specializations in the auditory system are the left hemisphere dominance for speech and
the right hemisphere dominance for music processing (Zatorre et al. 2002; Tervaniemi
and Hugdahl 2003). Zatorre and Belin (2001) suggested that a hemispheric asymmetry
in the processing of spectral and temporal sound information underlies the speech/music
asymmetry. There is evidence for a hemispheric asymmetry in auditory spatial processing as well. Lesion studies demonstrated that a damaged right auditory cortex impairs
sound localization performance more severely than a damaged left AC (Zatorre and Penhune 2001). A recent fMRI study by Krumbholz and colleagues (2004) corroborated
these results by demonstrating that the right auditory cortex responds to perceived sound
movement in both acoustic hemifields, while the left AC responds predominantly to the
contralateral (right) hemifield.
Asymmetries in the activation pattern of left vs. right auditory structures are not confined to the cerebral hemispheres, but have also been demonstrated at several levels of
the subcortical auditory system, although much less consistently than in the cortex. For
example, King and coworkers (1999) measured multiunit responses of neurons in the left
and right MGB of anesthetized guinea pigs, presented with sinusoids, clicks and human
speech sounds. The majority of the animals showed greater multiunit response amplitudes in the left than in the right MGB and the degree of the left-ward asymmetry was
CHAPTER 5. ACTIVATION ASYMMETRY IN THE AUDITORY PATHWAY
positively correlated with acoustic signal complexity. Surprisingly, although not explicitly discussed by the authors, monaural stimulation of either ear elicited larger responses
in the left than in the right MGB. In accordance with the latter finding are two earlier EEG
studies by Levine and colleagues (Levine and McGaffigan 1983; Levine et al. 1988), who
reported an asymmetry in the brainstem auditory evoked potentials (AEP) associated with
monaural stimulation. Monaural sound stimulation of the right ear elicited larger AEP
amplitudes than stimulation of the left ear, suggesting an increased responsiveness of the
left-side auditory brainstem structures. The authors hypothesized that the known cerebral language asymmetry might thus be related to asymmetries in the brainstem auditory
system. Finally, there is evidence for a response asymmetry already in the auditory periphery. Spontaneous otoacoustic emissions (Kemp 1978, 2002), caused by the motion of
the cochlea’s sensory hair cells, are more frequent in the right than in the left ear. Moreover, Khalfa and coworkers (Khalfa and Collet 1996; 1997; 1998a) demonstrated that the
medial olivo-cochlear system, the pathway of efferent projections from the SOC to the
CN, is more active on the right than on the left side. The authors linked this peripheral
asymmetry to cerebral hemispheric functional asymmetries on the grounds that, in pathological cases, a dysfunctional peripheral asymmetry is often accompanied by hemispheric
lateralization disorders.
In the present article, we report a left-right asymmetry in the activation of the IC,
the MGB and the AC, but not the CN, in response to monaurally presented sounds and
develop hypotheses that subsume the present as well as previous findings.
5.2 Material and Methods
5.2.1 Subjects
Twelve subjects (6 male, 6 female; all right-handed (Oldfield 1971)) between 23
and 32 years of age, with no history of hearing disorder or neurological disease, participated in the experiment after having given informed consent. The experimental procedures were approved by the local ethics committee.
5.2.2 Stimuli and experimental protocol
The experiment comprised two monaural and two binaural sound conditions as well as
a silent baseline condition (Sil). In the monaural conditions (Left, Right), trains of noise
bursts were played either to the left or right ear separately. In the two binaural conditions,
the same noise bursts were played to both ears simultaneously, one with stationary and the
other with dynamically varying interaural time differences in the microsecond range. The
binaural conditions were included to assess binaural interaction in the human auditory
system, and the respective results are presented in Chapter 2, while the present article focuses on the effects of monaural sound stimulation. The noise bursts had a duration of 50
ms each; they were filtered between 200 and 3200 Hz and presented at a rate of 10 per s.
The noise was continuously generated afresh (Tucker Davis Technologies, System 3), so
that none of the noise bursts was ever repeated during the experiment. The sounds were
presented through MR-compatible electrostatic headphones (Sennheiser model HE 60),
which were fitted into industrial ear protectors that passively shielded the subjects from
the scanner noise (Bilsom model 2452) (Palmer et al. 1998). Cardiac gating (Guimaraes
et al. 1998) was used to minimize motion artifacts in the brainstem signal resulting from
pulsation of the basilar artery. The functional images were triggered 300 ms after the
R-wave in the electrocardiogram, when the cardiac cycle is in its diastolic phase. The
sparse imaging technique (Edmister et al. 1999; Hall et al. 1999) was applied to avoid
masking of the experimental sounds by the scanner noise and reduce the effect of scanner
noise on the recorded activity. The gaps between consecutive image acquisitions, during
which the sounds or the silence were presented, were of about 7 s duration. The exact duration of the gaps, and thus also the repetition time of the image acquisitions (TR), varied
slightly due to cardiac gating. The average TR over all subjects and trials amounted to
10.5 s. The experimental conditions were presented in epochs, during which five images
were acquired. Four sound epochs containing the four sound conditions in pseudorandom
order were alternated with a single silence epoch. A total of 250 images (corresponding
to 50 epochs) were acquired per subject. To avoid eye movements in the direction of
the sounds, subjects fixated a cross at the midpoint of the visual axis and performed a
visual control task. The task was to press a button with the left or right index finger upon
each occurrence of the capital letter ‘Z’ in either of two simultaneous, but uncorrelated,
sequences of random one-digit numbers that were shown to the left and the right of the
fixation cross. The numbers were presented once every 2 s for 50 ms.
5.2.3 fMRI data acquisition
Blood-oxygen level dependent (BOLD) contrast images were acquired with a 3-T Bruker
Medspec whole body scanner using gradient echo planar imaging (average TR = 10.5 s;
TE = 30 ms; flip angle = 90◦ ; acquisition bandwidth = 100 kHz). The functional images
consisted of 28 ascending slices with an in-plane resolution of 3×3 mm, a slice thickness
of 3 mm and an inter-slice gap of 1 mm. The slices were oriented along the line connecting the anterior and posterior commissures and positioned so that the lowest slices
covered the cochlear nucleus (CN) just below the pons. The slices were acquired in direct
temporal succession and the acquisition time amounted to 2.1 s.
A high-resolution structural image was acquired from each subject using a 3D MDEFT
sequence (Ugurbil et al. 1993) with 128 1.5-mm slices (FOV = 25×25×19.2 cm; data matrix 256×256; TR = 1.3 s; TE = 10 ms). For registration purposes, a set of T1-weighted
EPI images were acquired using the same parameters as for the functional images (inversion time = 1200 ms; TR = 45 s; four averages).
5.2.4 Data analysis
The data were analyzed with the software package LIPSIA (Lohmann et al. 2001). Each
subject’s functional images were corrected for head motion and rotated into the Talairach
coordinate system by co-registering the structural MDEFT and EPI-T1 images acquired
in this experiment with a high-resolution structural image residing in a subject database.
The functional images were then normalized and were spatially smoothed with two different Gaussian kernels (3 and 10 mm full width at half maximum; FWHM) to optimize
for the signals from the brainstem and the cortex, respectively. The auditory structures in
the brainstem are only a few millimeters across, and their location with respect to macroanatomical landmarks varies little across individuals. Thus, the chances of detecting auditory activity in the brainstem can be increased by using a small smoothing kernel. In
contrast, auditory cortical regions are comparatively large, and their boundaries exhibit
a considerable inter-individual variability with respect to macro-anatomy (Penhune et
al. 1996; Rademacher et al. 2001), which means that a larger smoothing kernel is more
suitable for analyzing the auditory cortical signal. The smoothed image time series of
twelve subjects, comprising a total of 3000 image volumes, were subjected to a fixed-effects group analysis using the general linear model. The experimental conditions were
modeled as box-car functions convolved with a generic hemodynamic response function
including a response delay of 6 s. The data were highpass filtered at 0.0019 Hz to remove
low-frequency drifts, and lowpass filtered by convolution with a Gaussian function (4 s
FWHM) to control for temporal autocorrelation. The height threshold for activation was
z = 3.1 (p = 0.001 uncorrected).
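The regressor construction and temporal filtering described above can be sketched as follows. The gamma-variate response function and the discrete-cosine realization of the high-pass filter are stand-ins, since the exact functions used by LIPSIA are not detailed here, and the design is simplified to a single alternating condition:

```python
import numpy as np
from scipy.stats import gamma
from scipy.ndimage import gaussian_filter1d

TR = 10.5        # average repetition time in s (sparse imaging, cardiac-gated)
N_SCANS = 250    # functional images per subject
EPOCH = 5        # scans per epoch

# Box-car on a fine time grid: one condition alternating in 5-scan epochs
# (the real design interleaved four sound conditions with silence).
dt = 0.5
hi_t = np.arange(0, N_SCANS * TR, dt)
box = np.zeros_like(hi_t)
epoch_s = EPOCH * TR
for start in np.arange(0, N_SCANS * TR, 2 * epoch_s):
    box[(hi_t >= start) & (hi_t < start + epoch_s)] = 1.0

# Convolve with a hemodynamic response peaking ~6 s after onset
# (a gamma-variate stand-in for the generic HRF used in the analysis).
hrf = gamma.pdf(np.arange(0, 32, dt), a=7.0)   # mode of gamma(a=7) is at 6 s
reg_hi = np.convolve(box, hrf)[:hi_t.size] * dt
regressor = np.interp(np.arange(N_SCANS) * TR, hi_t, reg_hi)

# Low-pass: convolution with a 4-s-FWHM Gaussian (FWHM converted to sigma
# in units of scans).
sigma = 4.0 / (2.0 * np.sqrt(2.0 * np.log(2.0))) / TR
regressor_smooth = gaussian_filter1d(regressor, sigma)

# High-pass at 0.0019 Hz, here realized as discrete-cosine drift regressors
# below the cutoff (one common implementation; LIPSIA's filter may differ).
cutoff = 0.0019
n_drift = int(np.floor(2 * N_SCANS * TR * cutoff)) + 1
k = np.arange(1, n_drift + 1)
drifts = np.cos(np.pi * np.outer(np.arange(N_SCANS) + 0.5, k) / N_SCANS)
```

With 250 scans at a 10.5-s TR, the 0.0019-Hz cutoff corresponds to about ten drift regressors, i.e. cycles slower than roughly one per 530 s are removed.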
5.3 Results
The activation produced by the left and right monaural sound conditions was compared
with the activation during the silent baseline condition to illustrate the cortical and subcortical regions sensitive to the noise stimuli. These regions included the cochlear nuclei
(CN), the inferior colliculi (IC), the medial geniculate bodies (MGB) and the auditory
cortices (AC) on both sides of the brain (see Fig. 5.2, top panel). The Talairach coordinates (x,y,z) of the most significant voxel in each structure in this contrast were as follows:
left CN: −14, −42, −30; right CN: 10, −42, −30; left IC: −8, −36, −3; right IC 4, −36,
−3; left MGB: −17, −30, 0; right MGB: 13, −30, −3; left AC: −47, −27, 12; and right
AC: 40, 35, 25. Subsequently, the monaural conditions were individually compared to
the silence condition as well as to each other in order to reveal significant differences in
the cortical and subcortical activation during monaural sound stimulation.
5.3.1 Activation during monaural left and right ear stimulation
The activation during monaural stimulation of the left (Left) and right (Right) ear was
compared with the activation during the silent baseline condition (Sil) in two separate
contrasts, Left-Sil and Right-Sil. Statistical parameter maps of these two contrasts are
shown in Figure 5.1.
Left-ear stimulation caused a significant activation of the left CN. Activation of the
right CN just reached significance. In subsequent auditory structures, activation shifted to
the right side. The right IC and MGB responded more strongly than their left-side counterparts, which just reached significance. The auditory cortex was activated bilaterally,
with the activation being more pronounced in the right hemisphere.
Figure 5.2 shows the percent signal change (lower panel) at the most significant voxel
in each auditory structure (indicated by lines in the upper panel). In the left and right
CN, the ipsilateral stimulation elicited a larger signal change than the contralateral stimulation, with the difference and the absolute change being approximately equal in the left
and right CN. In contrast, the IC, MGB and AC responded asymmetrically to the left
and right ear stimulation. The right-side structures responded about equally strongly to
sound stimulation from the left and right ear (no significant differences in the percent
Figure 5.1: From left to right, ascending the auditory pathway, axial and coronal anatomical
slices through the cochlear nuclei (CN), the inferior colliculi (IC), the medial geniculate bodies
(MGB), and the auditory cortices (AC) are shown. (White arrows denote the respective structures
of interest in slices with several activated areas visible.) Superimposed are color-coded statistical
parameter maps that show significant activation of auditory structures following monaural stimulation of the left and right ear compared to the silent baseline condition. Note that monaural left
and right ear stimulation resulted in a strong activation of the ipsilateral CN, whereas the activation pattern in the higher structures is asymmetric. The IC, MGB, and AC of the left hemisphere
responded more strongly to the (contralateral) right ear stimulation and less strongly to the (ipsilateral)
left ear stimulation than the respective structures of the right hemisphere. In the right hemisphere,
no such activation difference was observed. (The color-coded SPMs are z-maps thresholded at
3.1 (p < 0.001 uncorrected). For the axial display of the AC, the slicing plane has been rotated
by 30◦ , as indicated by the dashed line in the schematic inset, to show the full length of Heschl’s
gyrus.)
Figure 5.2: The upper panel shows coronal slices through the cochlear nuclei (CN), the inferior
colliculi (IC), the medial geniculate bodies (MGB), and the auditory cortices (AC) overlaid with
the significant activations in a contrast where the sum of both monaural stimulation conditions is
contrasted with the silent baseline condition. The lower panel gives the percent signal change in
response to the left ear (red bars) and right ear stimulation (blue bars, mean±standard deviation)
for the most active voxel of each auditory structure in this contrast (marked by the black lines).
Note the equivalence of the response to left and right stimulation in the IC, MGB, and AC of
the right hemisphere. In the left hemispheric counterparts of these structures, the activation in
response to right ear stimulation always exceeded the right hemispheric activation while the activation in response to left ear stimulation was weaker than (IC, AC) or equal to (MGB) the right
hemispheric activation. The CN showed a stronger response to the ipsilateral stimulation in both
hemispheres. In terms of absolute activation strength, the lowest signal changes were recorded
from the CN followed by MGB and IC. The AC gave the strongest signal, although its absolute
signal change should not be compared to that of the subcortical structures, since a wider spatial
smoothing kernel was applied during data analysis (10 mm FWHM, compared to 3 mm FWHM
for the subcortical structures) to account for the greater inter-individual anatomical differences.
[Figure 5.3 graphic: bar chart of the size of effect (percent signal change, y-axis 0.00 to 0.29) for left- and right-ear stimulation in lCN, rCN, lIC, rIC, lMGB, rMGB, lAC, and rAC, grouped into left and right hemisphere.]
Figure 5.3: The percentage signal changes in the auditory structures for monaural left and right
stimulation (see also Fig. 5.2) are shown sorted by hemisphere to render the hemispherical differences more easily appreciable. The left part of the diagram summarizes the responses of the
left hemispheric auditory structures to left and right ear stimulation; the right part gives the responses of the right hemispherical structures. Note the clear hemispherical activation asymmetry
in response to left and right ear stimulation.
signal change), while the left-side structures responded predominantly to the right ear
stimulation (t-test, IC: p < 0.001, t22 = 7.35; MGB: p = 0.001, t22 = 3.84; and AC:
p < 0.001, t22 = 9.8). This is further illustrated in Fig. 5.3, where the signal changes are
sorted by hemisphere. Apart from the CN, the left- and right-ear stimulation produced
similarly large responses in the right-side auditory structures, while the left-side structures
responded approximately twice as strongly to the right-ear than the left-ear stimulation.
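The percent-signal-change values and t-statistics reported above can be illustrated with a small sketch. The voxel time series and per-subject values below are purely illustrative, and the two-sample t-test is one reading of the reported 22 degrees of freedom (two groups of 12 values):

```python
import numpy as np
from scipy import stats

def percent_signal_change(ts, condition_mask, baseline_mask):
    """Percent signal change at one voxel relative to the silent baseline."""
    base = ts[baseline_mask].mean()
    return 100.0 * (ts[condition_mask].mean() - base) / base

# Toy time series for one voxel: 250 scans with a ~0.2% signal increase
# during the 'Right' epochs (all numbers illustrative).
rng = np.random.default_rng(1)
ts = 1000.0 + rng.standard_normal(250)
right = np.zeros(250, dtype=bool); right[50:100] = True
sil = np.zeros(250, dtype=bool); sil[200:250] = True
ts[right] += 2.0
psc_right = percent_signal_change(ts, right, sil)

# Comparing left- vs right-ear responses in one structure: a two-sample
# t-test with 12 values per condition has 22 degrees of freedom, matching
# the reported t22 statistics (group means are illustrative only).
left_psc = rng.normal(0.11, 0.03, 12)
right_psc = rng.normal(0.21, 0.03, 12)
t, p = stats.ttest_ind(right_psc, left_psc)
```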
The direct contrast between the left- and right-ear stimulation conditions highlights
regions with significant differences in activation strength between both conditions (Fig.
5.4). The left IC, MGB and AC showed a significantly greater activation in response to the
right ear stimulation. None of the right-side structures showed any activation difference.
5.4 Discussion
The present results provide evidence for an asymmetry in the activation of the IC, MGB
and AC in response to left versus right monaural sound stimulation. There has been
recent evidence for a possible activation asymmetry in the AC in response to monaural
stimulation (Devlin et al. 2003). The present study extends these findings by providing a
more comprehensive view with results from the major structures of the ascending auditory
pathway. Previous results in the fields of cortical functional specialization, anatomy and
physiology of subcortical auditory structures and their connectivity suggest a number of
candidate explanations for the present findings.
One possible explanation is the superposition of a stronger contralateral activation
Figure 5.4: Contrasting right ear vs. left ear stimulation. Axial and coronal anatomical slices
through the inferior colliculi (IC, lower panels), the medial geniculate bodies (MGB, middle panels), and the auditory cortices (AC, upper panels) are shown. (White arrows denote the respective
structures of interest in slices with several activated areas visible.) The overlaid color-coded statistical parameter maps show areas with significant activation differences between right and left
ear stimulation. The left IC, MGB, and AC were more responsive to right ear stimulation. As
seen in Fig. 5.3, this response difference results not only from an enhanced activation in response to contralateral stimulation, but also from a decreased activation in response to ipsilateral
stimulation. No areas exhibiting a stronger response to the left ear stimulation were found. The
CN showed a statistical sub-threshold tendency to favor ipsilateral stimulation.
and a right ear advantage. First, the concepts of contralateral activation predominance
and right ear advantage will be reviewed briefly; then it will be discussed how the present
results could emerge from both effects. This explanation considers primarily the ascending auditory pathway (bottom-up) and relies on functional asymmetries in the brainstem
auditory system. A second possible explanation is based on the back projections from
the auditory cortex to the thalamus and brainstem (corticofugal system, top-down) and
focuses on the contribution of the cerebral auditory system. After a brief review of functional and anatomical aspects of cerebral hemispheric asymmetries and the corticofugal
system, it will be argued that the present results could arise from a corticofugal backprojection of cerebral functional asymmetries to subcortical structures.
5.4.1 The contralateral activation predominance
For the following consideration, the observed BOLD-response of the IC, MGB and AC
is conceptually divided into two parts, a mirror-symmetric part that shows a contralateral
activation predominance, and an asymmetric part that increases the responses to the right
ear stimulation and corresponds to the right ear advantage. To visualize this hypothetical
distinction, one can imagine the response to the right ear stimulation in Fig. 5.2 to be reduced by about one quarter. The response to right and left ear stimulation is then virtually
mirror-symmetric and clearly shows a stronger activation of the contralateral structures.
The asymmetric part of the observed response might arise from a higher sensitivity of the
left and right IC, MGB and AC to right ear stimulation and corresponds to the right ear
advantage. Further down it is argued that such a difference in the hemispheric sensitivity
to right and left ear stimulation might relate to physiological differences in the left- and
right-side auditory periphery.
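The hypothetical decomposition can be made concrete with a small numerical sketch. All values are illustrative; they are chosen only so that reducing the right-ear responses by about one quarter, as suggested above, leaves a mirror-symmetric contralateral pattern:

```python
# Hypothetical percent-signal-change values for one left/right structure
# pair (e.g., left and right IC); all numbers are illustrative.
observed = {
    ("L", "left_ear"): 0.12, ("L", "right_ear"): 0.24,   # left-side structure
    ("R", "left_ear"): 0.18, ("R", "right_ear"): 0.16,   # right-side structure
}

# Remove the hypothesized right-ear-advantage part by reducing every
# right-ear response by about one quarter, as in the text.
symmetric = {key: (val * 0.75 if key[1] == "right_ear" else val)
             for key, val in observed.items()}

# What remains is mirror-symmetric with a contralateral predominance:
assert symmetric[("L", "right_ear")] > symmetric[("L", "left_ear")]   # contra > ipsi
assert symmetric[("R", "left_ear")] > symmetric[("R", "right_ear")]   # contra > ipsi
assert abs(symmetric[("L", "right_ear")] - symmetric[("R", "left_ear")]) < 1e-9
assert abs(symmetric[("L", "left_ear")] - symmetric[("R", "right_ear")]) < 1e-9
```

In the observed values, the right-side structure responds about equally to both ears while the left-side structure responds twice as strongly to the right ear, mirroring the pattern in Fig. 5.3; after the subtraction, only the symmetric contralateral predominance remains.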
Considering only the mirror-symmetric part of the hypothetical division of the observed responses, the auditory structures above the CN showed a stronger activation contralateral to the side of sound stimulation. The anatomy and physiology of this activation
shift to the opposite side of the brainstem between the CN and the IC (“acoustic chiasm”)
has been described previously by Glendenning and Masterton (1983).
The shift of activation to the contralateral side in the auditory system is first established in the lateral superior olivary nuclei (LSO). Neurons in the LSO send either inhibitory ipsilateral projections or excitatory contralateral projections to the auditory midbrain through the lateral lemniscus (Glendenning et al. 1992). Therefore, stimulation of
one ear causes a net excitation of the contralateral auditory midbrain structures and a net
inhibition of the ipsilateral structures. The subsequent processing stage, the lateral lemniscus, reinforces the contralateral bias already established by the SOC. The majority of
neurons in the dorsal nucleus of the lateral lemniscus (DNLL) inhibit the contralateral
DNLL and the IC via the commissure of Probst, thereby enhancing the contralaterality of
excitatory responses in the central auditory system from the IC upwards (Kelly and Kidd
2000).
In humans, these findings are supported by neuroimaging studies using positron emission tomography, fMRI, MEG, or EEG (Loveless et al. 1994; Hirano et al. 1997; Scheffler
et al. 1998; Suzuki et al. 2002), which show consistently that monaurally presented stimuli elicit stronger responses in the contralateral auditory cortex. The present study extends
this evidence to the MGB in the thalamus and the IC in the brainstem. The CN served
as a physiological control for the activation asymmetries in the present study, by showing
the predominantly ipsilateral activation expected from the almost exclusively ipsilateral
afferents.
5.4.2 The right ear advantage
In addition to a contralaterally stronger activation, the IC, MGB and AC also appeared
to be more sensitive to right ear stimulation. This phenomenon could correspond to the
so-called right ear advantage (REA), observed in behavioural studies. When two different stimuli are presented simultaneously, one to each ear (dichotic stimulation), subjects
show a better discrimination performance for stimuli delivered to the right than to the left
ear (Hugdahl 1995; 2000). Usually, speech-related stimuli, like consonant-vowel pairs,
have been used in dichotic listening paradigms to demonstrate a REA in normal subjects
(Hugdahl 1995). The REA for the recognition of speech is thought to arise from the
stronger projections that the language-specialized left auditory cortex receives from the
right ear (Kimura 1967). More recently, King and coworkers (1999) reported evidence
for an alternative explanation. With electrophysiological recordings from aggregate cell
groups of the ventral and caudomedial subdivision of the MGB of guinea pigs, the authors detected a left-right asymmetry in the response to speech and click trains already in
the thalamus. Synthesized speech-like signals elicited larger response onset amplitudes
in the left than in the right MGB, irrespective of whether the stimuli were presented to
the left or right ear, or to both ears. A significant, albeit smaller, response asymmetry
was detected during pure tone stimulation. The authors suggested that cortical functional
lateralizations are at least partly based on subcortical (thalamic) modulation of the input
into the left and right auditory cortices. That study also indicated a subcortical activation asymmetry for modulated non-speech stimuli (clicks and tone bursts), although to
a lesser degree than for speech stimuli (synthesized /da/). There is some evidence for
an even more peripheral contribution to the right ear advantage. Khalfa and coworkers
(Khalfa and Collet 1996; 1997; 1998a) argued in a series of studies for a rightward lateralization of several aspects of auditory function in the auditory periphery and brainstem.
The authors found the olivocochlear projection system to be more active on the right than
on the left side (Khalfa and Collet 1996). The olivocochlear system originates in the superior olivary complex and exerts a modulatory influence on the hair cell activity in the
cochlea. An index of the function of this system is its influence on so-called otoacoustic emissions, sounds that are generated in the cochlea by the contractions of hair cells.
Amplitude and rate of spontaneous otoacoustic emissions are higher in the right ear not
only in normal subjects but also in preterm neonates, indicating an early manifestation of
this peripheral asymmetry (Khalfa et al. 1997). The functional activity differences in the
olivocochlear system are not restricted to the auditory periphery, but also correlate with
different perceptions of stimuli presented to the left and right ear (Khalfa et al. 2000), as
well as handedness (Khalfa et al. 1998c). According to Khalfa and colleagues (1998b),
these findings suggest a contribution of the peripheral asymmetry to the right ear advantage, in which case signals from the right cochlea would elicit slightly stronger responses
in the ascending auditory pathway above the level of the CN.
In summary, the present findings could be explained by a hypothetical division of the
observed responses into a stronger contralateral activation of auditory structures above
the CN, and a higher sensitivity of these structures to right ear stimulation. According
to this hypothesis, the stronger response to the right-ear stimulation and the contralateral
activation predominance cancel out in the right hemispheric structures, while they add up
in the left-side structures, thereby causing the observed left-right activation asymmetry.
So far, mainly the contribution of the peripheral auditory system was considered,
but the auditory thalamus and brainstem also receive strong efferent (top-down) projections from the cortex. These corticofugal projections might convey cortical functional
asymmetries down to the auditory brainstem and thalamus. Physiological and anatomical
asymmetries of the auditory cortex and the corticofugal system support this hypothesis.
5.4.3 Functional asymmetries in the auditory cortex, and the corticofugal
projection system
Much of the work on the functional differences between the left and right AC has been
dedicated to the lateralization of speech and music, but there is also evidence for a lateralization of auditory spatial processing. Zatorre and Penhune (2001) reported that patients
with right-hemispheric AC lesions showed impaired sound source localization in both
auditory hemifields, whereas patients with left AC lesions were mainly impaired in localizing sound sources in the right hemifield only. Kaiser and Lutzenberger (2001) found
the same lateralization tendency by recording the mismatch response to changes in the
direction of a perceived sound source in a MEG study. The right hemisphere was activated by sound-source shifts in both auditory hemifields, while left-hemisphere regions
responded predominantly to contralateral events. A recent fMRI study by Krumbholz,
Schönwiesner and colleagues (2004) confirms these findings. Using stimuli that were lateralized by means of interaural temporal differences, the authors reported an asymmetry
in the response of the left and right auditory cortex to moving sounds in the left and right
hemifield. The auditory cortex of the right hemisphere responded equally well to sounds
in either hemifield, whereas the left auditory cortex responded preferentially to sounds in
the right hemifield. Considering that monaural tones can be regarded as maximally lateralized to either the left or the right acoustical hemifield, this cortical activation asymmetry
exactly parallels the asymmetrical response pattern in the present study.
What influences could an asymmetrical activation of the cortical hemispheres exert
upon the subcortical activation balance? The afferent and efferent portions of the auditory
system are closely interconnected, introducing multiple feedback loops in the ascending
auditory pathways: monosynaptic projections descend from the auditory cortex to the
MGB (FitzPatrick and Imig 1978; Winer et al. 2001), IC (FitzPatrick and Imig 1978),
and CN (Jacomme et al. 2003). These corticofugal projections are highly organized and
have a variety of modulatory effects. The main recipient of cortical back projections is
the MGB. The efferent fiber tracts from the AC to the MGB are considerably larger than
the afferent projections from the MGB (Winer et al. 2001). The corticothalamic projections to the MGB are focal, clustered, and follow the tonotopic organization. They
convey short-term and long-term facilitation (50-300 ms; He 1997; 2002) and inhibition
(250-1000 ms; He 2003) to MGB neurons. The pattern of cortical input into the IC
is equally elaborate, although the corticocollicular fiber tracts are smaller than the corticothalamic tracts. Winer and colleagues (1998; 2001) have demonstrated a significant
projection from every cortical field to the IC in various mammals, and individual cortical areas project to several collicular subdivisions. These projections stay mainly on the
ipsilateral side (Saldana et al. 1996; Druga et al. 1997). Action potentials from the AC
reach the IC in 6-20 ms (Bledsoe et al. 2003) and studies in rats (Syka and Popelar 1984),
guinea pigs (Torterolo et al. 1998) and mice (Yan and Ehret 2001, 2002) demonstrated
that corticofugal modulation of IC neurons can be excitatory or inhibitory. Focal electrical activation of the AC elicits frequency-specific changes in tonotopy, frequency tuning,
sensitivity and temporal response pattern in IC neurons (Yan and Suga 1996; Zhang et
al. 1997; Yan and Suga 1998; 2000; Yan and Ehret 2001, 2002).
If this elaborate corticofugal system is a general feature of the mammalian auditory system, the resulting selective cortical influences on subcortical neuronal activity could convey activation asymmetries in the cerebral hemispheres down to the ipsilateral subcortical processing levels.
From this consideration follows the hypothesis that the hemispheric activation asymmetry
in IC and MGB reflects a top-down modulation of the right-lateralized auditory spatial
processing.
In conclusion, the present study reports evidence for an asymmetrical activation of
the left- and right-side auditory cortex and subcortical structures in response to monaural
sound stimulation. Two hypotheses based on asymmetries in either the brainstem or the
cerebral auditory system were suggested. These hypotheses lie on either end of a spectrum of possible explanations, and it is likely that the efferent and the afferent auditory
system both contribute to the observed activation asymmetry. Our speculation about the
functional significance of subcortical activation asymmetries is the following. The incoming sound information is directed to either hemisphere according to its spectro-temporal
features. These features are represented with maximal temporal resolution by neurons
in the auditory brainstem nuclei. Auditory brainstem nuclei therefore appear to be the
natural platform for sorting sound information with respect to fine temporal cues, and
directing the information to either hemisphere. The parameters that determine whether
the information is preferentially sent to the left or right side are set by the cerebral hemispheres via the efferent projections. The afferent and efferent auditory pathways can be
disentangled with methods providing a high temporal resolution, like MEG and cellular electrophysiology, as well as with appropriate experimental designs in fMRI. The
hypotheses put forward are thus amenable to testing in future studies.
5.5 References
Bledsoe, S. C., S. E. Shore and M. J. Guitton (2003). Spatial representation of corticofugal input in the inferior colliculus: a multicontact silicon probe approach. Exp Brain Res
153(4): 530-42.
Devlin, J. T., J. Raley, E. Tunbridge, K. Lanary, A. Floyer-Lea, C. Narain, I. Cohen, T.
Behrens, P. Jezzard, P. M. Matthews and D. R. Moore (2003). Functional asymmetry for
auditory processing in human primary auditory cortex. J Neurosci 23(37): 11516-22.
Druga, R., J. Syka and G. Rajkowska (1997). Projections of auditory cortex onto the
inferior colliculus in the rat. Physiol Res 46(3): 215-22.
Edmister, W. B., T. M. Talavage, P. J. Ledden and R. M. Weisskoff (1999). Improved
auditory cortex imaging using clustered volume acquisitions. Hum Brain Mapp 7(2):
89-97.
FitzPatrick, K. A. and T. J. Imig (1978). Projections of auditory cortex upon the thalamus
and midbrain in the owl monkey. J Comp Neurol 177(4): 573-55.
Glendenning, K. and R. Masterton (1983). Acoustic chiasm: efferent projections of the
lateral superior olive. J Neurosci 3(8): 1521-37.
Glendenning, K. K., B. N. Baker, K. A. Hutson and R. B. Masterton (1992). Acoustic
chiasm V: inhibition and excitation in the ipsilateral and contralateral projections of LSO.
J Comp Neurol 319(1): 100-22.
Guimaraes, A. R., J. R. Melcher, T. M. Talavage, J. R. Baker, P. Ledden, B. R. Rosen,
N. Y. Kiang, B. C. Fullerton and R. M. Weisskoff (1998). Imaging subcortical auditory
activity in humans. Hum Brain Mapp 6(1): 33-41.
Hall, D. A., M. P. Haggard, M. A. Akeroyd, A. R. Palmer, A. Q. Summerfield, M. R.
Elliott, E. M. Gurney and R. W. Bowtell (1999). “Sparse” temporal sampling in auditory
fMRI. Hum Brain Mapp 7(3): 213-23.
He, J. (1997). Modulatory effects of regional cortical activation on the onset responses of
the cat medial geniculate neurons. J Neurophysiol 77(2): 896-908.
He, J. (2003). Corticofugal modulation of the auditory thalamus. Exp Brain Res 153(4):
579-90.
He, J., Y. Q. Yu, Y. Xiong, T. Hashikawa and Y. S. Chan (2002). Modulatory effect of
cortical activation on the lemniscal auditory thalamus of the Guinea pig. J Neurophysiol
88(2): 1040-50.
Hirano, S., Y. Naito, H. Okazawa, H. Kojima, I. Honjo, K. Ishizu, Y. Yenokura, Y. Nagahama, H. Fukuyama and J. Konishi (1997). Cortical activation by monaural speech sound
stimulation demonstrated by positron emission tomography. Exp Brain Res 113(1): 75-80.
Hugdahl, K. (1995). Dichotic listening: Probing temporal lobe functional integrity. Brain
asymmetry. R. J. Davidson and K. Hugdahl. Cambridge MA, MIT Press: 123-56.
Hugdahl, K. (2000). Lateralization of cognitive processes in the brain. Acta Psychol
(Amst) 105(2-3): 211-35.
Jacomme, A. V., F. R. Nodal, V. M. Bajo, Y. Manunta, J. M. Edeline, A. Babalian and
E. M. Rouiller (2003). The projection from auditory cortex to cochlear nucleus in guinea
pigs: an in vivo anatomical and in vitro electrophysiological study. Exp Brain Res 153(4):
467-76.
Jancke, L., T. Wustenberg, K. Schulze and H. J. Heinze (2002). Asymmetric hemodynamic responses of the human auditory cortex to monaural and binaural stimulation. Hear
Res 170(1-2): 166-78.
Kaiser, J. and W. Lutzenberger (2001). Location changes enhance hemispheric asymmetry of magnetic fields evoked by lateralized sounds in humans. Neurosci Lett 314(1-2):
17-20.
Kelly, J. B. and S. A. Kidd (2000). NMDA and AMPA Receptors in the Dorsal Nucleus of the Lateral Lemniscus Shape Binaural Responses in Rat Inferior Colliculus. J
Neurophysiol 83(3): 1403-14.
Kemp, D. T. (1978). Stimulated acoustic emissions from within the human auditory
system. J Acoust Soc Am 64(5): 1386-91.
Kemp, D. T. (2002). Otoacoustic emissions, their origin in cochlear function, and use. Br
Med Bull 63: 223-41.
Khalfa, S. and L. Collet (1996). Functional asymmetry of medial olivocochlear system in
humans. Towards a peripheral auditory lateralization. Neuroreport 7(5): 993-6.
Khalfa, S., C. Micheyl, E. Pham, S. Maison, E. Veuillet and L. Collet (2000). Tones
disappear faster in the right ear than in the left. Percept Psychophys 62(3): 647-55.
Khalfa, S., C. Micheyl, E. Veuillet and L. Collet (1998a). Peripheral auditory lateralization assessment using TEOAEs. Hear Res 121(1-2): 29-34.
Khalfa, S., T. Morlet, C. Micheyl, A. Morgon and L. Collet (1997). Evidence of peripheral hearing asymmetry in humans: clinical implications. Acta Otolaryngol 117(2):
192-6.
Khalfa, S., T. Morlet, E. Veuillet, X. Perrot and L. Collet (1998b). [Peripheral auditory
lateralization]. Ann Otolaryngol Chir Cervicofac 115(3): 156-60.
Khalfa, S., E. Veuillet and L. Collet (1998c). Influence of handedness on peripheral
auditory asymmetry. Eur J Neurosci 10(8): 2731-7.
Kimura, D. (1967). Functional asymmetry of the brain in dichotic listening. Cortex 3:
163-78.
King, C., T. Nicol, T. McGee and N. Kraus (1999). Thalamic asymmetry is related to
acoustic signal complexity. Neurosci Lett 267(2): 89-92.
Krumbholz, K., M. Schönwiesner, D. Y. von Cramon, R. Rübsamen, N. J. Shah, K. Zilles
and G. R. Fink (2004). Representation of interaural temporal information from left and
right auditory space in the human planum temporale and inferior parietal lobe. Cerebral
Cortex (submitted).
Levine, R. A., J. Liederman and P. Riley (1988). The brainstem auditory evoked potential
asymmetry is replicable and reliable. Neuropsychologia 26(4): 603-14.
Levine, R. A. and P. M. McGaffigan (1983). Right-left asymmetries in the human brain
stem: auditory evoked potentials. Electroencephalogr Clin Neurophysiol 55(5): 532-7.
Lohmann, G., K. Muller, V. Bosch, H. Mentzel, S. Hessler, L. Chen, S. Zysset and D.
Y. von Cramon (2001). LIPSIA–a new software system for the evaluation of functional
magnetic resonance images of the human brain. Comput Med Imaging Graph 25(6):
449-57.
Loveless, N., J. P. Vasama, J. Makela and R. Hari (1994). Human auditory cortical mechanisms of sound lateralisation: III. Monaural and binaural shift responses. Hear Res
81(1-2): 91-9.
Melcher, J. R., I. S. Sigalovsky, J. J. Guinan, Jr. and R. A. Levine (2000). Lateralized
tinnitus studied with functional magnetic resonance imaging: abnormal inferior colliculus
activation. J Neurophysiol 83(2): 1058-72.
Needham, K. and A. G. Paolini (2003). Fast inhibition underlies the transmission of
auditory information between cochlear nuclei. J Neurosci 23(15): 6357-61.
Oldfield, R. C. (1971). The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9(1): 97-113.
Palmer, A. R., D. C. Bullock and J. D. Chambers (1998). A high-output, high quality
sound system for use in auditory fMRI. NeuroImage 7: S359.
Penhune, V. B., R. J. Zatorre, J. D. MacDonald and A. C. Evans (1996). Interhemispheric
anatomical differences in human primary auditory cortex: probabilistic mapping and volume measurement from magnetic resonance scans. Cereb Cortex 6(5): 661-72.
Rademacher, J., P. Morosan, T. Schormann, A. Schleicher, C. Werner, H. J. Freund and
K. Zilles (2001). Probabilistic mapping and volume measurement of human primary
auditory cortex. Neuroimage 13(4): 669-83.
Rosenzweig, M. R. (1951). Representations of the two ears at the auditory cortex. Am J Physiol 167: 147-214.
Saldana, E., M. Feliciano and E. Mugnaini (1996). Distribution of descending projections from primary auditory neocortex to inferior colliculus mimics the topography of
intracollicular projections. J Comp Neurol 371(1): 15-40.
Scheffler, K., D. Bilecen, N. Schmid, K. Tschopp and J. Seelig (1998). Auditory cortical responses in hearing subjects and unilateral deaf patients as detected by functional
magnetic resonance imaging. Cereb Cortex 8(2): 156-63.
Suzuki, M., H. Kitano, T. Kitanishi, R. Itou, A. Shiino, Y. Nishida, Y. Yazawa, F. Ogawa
and K. Kitajima (2002). Cortical and subcortical activation with monaural monosyllabic
stimulation by functional MRI. Hear Res 163(1-2): 37-45.
Syka, J. and J. Popelar (1984). Inferior colliculus in the rat: neuronal responses to stimulation of the auditory cortex. Neurosci Lett 51(2): 235-40.
Tervaniemi, M. and K. Hugdahl (2003). Lateralization of auditory-cortex functions.
Brain Res Brain Res Rev 43(3): 231-46.
Torterolo, P., P. Zurita, M. Pedemonte and R. A. Velluti (1998). Auditory cortical efferent
actions upon inferior colliculus unitary activity in the guinea pig. Neurosci Lett 249(2-3):
172-6.
Ugurbil, K., M. Garwood, J. Ellermann, K. Hendrich, R. Hinke, X. Hu, S. G. Kim, R.
Menon, H. Merkle, S. Ogawa and et al. (1993). Imaging at high magnetic fields: initial
experiences at 4 T. Magn Reson Q 9(4): 259-77.
Winer, J. A., J. J. Diehl and D. T. Larue (2001). Projections of auditory cortex to the
medial geniculate body of the cat. J Comp Neurol 430(1): 27-55.
Winer, J. A., D. T. Larue, J. J. Diehl and B. J. Hefti (1998). Auditory cortical projections
to the cat inferior colliculus. J Comp Neurol 400(2): 147-74.
Yan, J. and G. Ehret (2001). Corticofugal reorganization of the midbrain tonotopic map
in mice. Neuroreport 12(15): 3313-6.
Yan, J. and G. Ehret (2002). Corticofugal modulation of midbrain sound processing in
the house mouse. Eur J Neurosci 16(1): 119-28.
Yan, J. and N. Suga (1996). Corticofugal modulation of time-domain processing of
biosonar information in bats. Science 273(5278): 1100-3.
Yan, W. and N. Suga (1998). Corticofugal modulation of the midbrain frequency map in
the bat auditory system. Nat Neurosci 1(1): 54-8.
Zatorre, R. J. and P. Belin (2001). Spectral and temporal processing in human auditory
cortex. Cereb Cortex 11(10): 946-53.
Zatorre, R. J., P. Belin and V. B. Penhune (2002). Structure and function of auditory
cortex: music and speech. Trends Cogn Sci 6(1): 37-46.
Zatorre, R. J. and V. B. Penhune (2001). Spatial localization after excision of human
auditory cortex. J Neurosci 21(16): 6321-8.
Zhang, J., K. T. Nakamoto and L. M. Kitzes (2004). Binaural interaction revisited in the
cat primary auditory cortex. J Neurophysiol 91(1): 101-17.
Zhang, Y. and N. Suga (2000). Modulation of responses and frequency tuning of thalamic
and collicular neurons by cortical activation in mustached bats. J Neurophysiol 84(1):
325-33.
Zhang, Y., N. Suga and J. Yan (1997). Corticofugal modulation of frequency processing
in bat auditory system. Nature 387(6636): 900-3.
Summary
This work reports evidence on the topography and hemispherical asymmetries in hemodynamic correlates of the auditory processing of basic acoustic parameters. The following topics were investigated:
• topographic frequency representation in the auditory cortex,
• integration of binaural input in the subcortical and cortical auditory system,
• cortical activation asymmetries for spectral and temporal processing,
• cortical asymmetries in the representation of the left and right auditory hemifield, and
• cortical and subcortical activation asymmetries in response to monaural stimulation.
The study on the topographic representation of frequencies led to the conclusion that valid results concerning tonotopy require the combination of specialized stimuli, so-called random frequency-modulation walks, with an analysis that respects the individual anatomy of the auditory cortex, and a careful comparison of the activation sites with cytoarchitectonic and imaging studies. The finding of differences in frequency selectivity between distinct auditory areas called for a reinterpretation of the results of earlier studies. The suggested interpretation is that the different activation sites correspond to different cortical fields, whose tonotopic organization cannot be resolved with the current spatial resolution of functional imaging methods.
The study on binaural integration introduces a functional imaging paradigm, the binaural difference paradigm, that enables the measurement of brainstem binaural processing. The extraction of binaural acoustic cues by integration of the signals from the two
ears is known to be the basis for horizontal sound localization. The binaural difference
paradigm revealed a substantial binaural response suppression in the inferior colliculus
in the brainstem, the medial geniculate body in the thalamus, and the primary auditory
cortex. The size of the suppression suggests that it was brought about by neural inhibition
at a level below the IC, the only possible candidate being the superior olivary complex.
The experiment also included a moving sound condition, which was contrasted against a
spectrally and energetically matched stationary sound condition to reveal structures that
are specialized in auditory motion processing. Comparing the sites of binaural integration and motion processing revealed a hierarchical organization of binaural processing
in humans, with binaural integration starting below the inferior colliculus, and motion
sensitivity emerging in the planum temporale.
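The logic of the binaural difference paradigm, comparing the binaural response with the sum of the two monaural responses, can be sketched on hypothetical response amplitudes. The function and the numbers below are illustrative only, not the analysis code or data of the study.

```python
import numpy as np

def binaural_difference(resp_left, resp_right, resp_binaural):
    """Binaural difference paradigm: compare the measured binaural response
    with the sum of the two monaural responses. A negative difference
    (binaural < left + right) indicates binaural response suppression."""
    resp_left = np.asarray(resp_left, dtype=float)
    resp_right = np.asarray(resp_right, dtype=float)
    resp_binaural = np.asarray(resp_binaural, dtype=float)
    diff = resp_binaural - (resp_left + resp_right)
    # percent suppression relative to the summed monaural responses
    pct = 100.0 * diff / (resp_left + resp_right)
    return diff, pct

# hypothetical response amplitudes (arbitrary units) for one structure
left, right, both = 1.0, 0.9, 1.2
diff, pct = binaural_difference(left, right, both)
```

In this hypothetical example the binaural response falls well short of the summed monaural responses, the pattern interpreted in the study as evidence for inhibitory binaural integration below the inferior colliculus.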
The study on hemispherical asymmetries in spectral and temporal processing yielded
supporting evidence for a hemispheric specialization model that proposes a preference
of the left-hemispheric auditory structures for fast temporal processing and a right-hemispheric preference for fine-grained spectral processing. These asymmetries in the processing of spectral and temporal sound information are thought to underlie the left-hemispheric dominance for speech and the right-hemispheric dominance for the processing of tonal sequences (melodies). A new class of parametric, wideband, dynamic,
acoustic stimuli was constructed, which permitted independent variation of spectral and
temporal sound characteristics. Cortical responses from the left and right primary auditory cortex covaried with the spectral parameter, while a covariation analysis for the temporal parameter revealed an area on the left superior temporal gyrus (STG). The equivalent region on the right STG responded exclusively to the spectral parameter. These
findings support the hemispheric specialization model and permit a generalization of the
model to include processing of simultaneously present spectral peaks. Because the stimuli
were inherently unrelated to melodic sequences, the results also provide the first unequivocal evidence for a right-lateralization of spectral integration in general.
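One way to realize wideband stimuli whose spectral and temporal characteristics vary independently is a bank of log-spaced carriers amplitude-modulated by a drifting spectro-temporal envelope, with a temporal rate (Hz) and a spectral density (cycles/octave) as separate parameters. The construction and parameter values below are an assumed illustration of this class of stimuli, not the exact design used in the study.

```python
import numpy as np

def ripple_stimulus(rate_hz=4.0, density_cyc_per_oct=1.0, duration=1.0,
                    fs=16000, f_lo=250.0, n_carriers=40, depth=0.9, seed=0):
    """Wideband dynamic stimulus with independently adjustable temporal
    modulation rate (Hz) and spectral modulation density (cycles/octave):
    log-spaced sinusoidal carriers (8 per octave) are amplitude-modulated
    by a drifting spectro-temporal envelope."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration * fs)) / fs
    x = np.arange(n_carriers) / 8.0           # carrier position in octaves above f_lo
    freqs = f_lo * 2.0 ** x                   # stays below Nyquist for these defaults
    phases = rng.uniform(0.0, 2.0 * np.pi, n_carriers)  # random carrier phases
    sound = np.zeros_like(t)
    for fi, xi, ph in zip(freqs, x, phases):
        env = 1.0 + depth * np.sin(2.0 * np.pi * (rate_hz * t
                                                  + density_cyc_per_oct * xi))
        sound += env * np.sin(2.0 * np.pi * fi * t + ph)
    return sound / np.max(np.abs(sound))      # normalize peak amplitude
```

Setting `density_cyc_per_oct` to zero yields a purely temporally modulated sound, and setting `rate_hz` to zero a purely spectrally modulated one, so the two dimensions can be varied orthogonally.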
The study on hemispheric asymmetries in the processing of auditory space shows
that the internal representation of interaural temporal information mediated by lateralized sounds is predominantly contralateral in the human auditory cortex. The differential
responses to moving sounds further revealed that the left hemisphere responded predominantly to sound movement within the right hemifield, whereas the right hemisphere responded to sound movement in both hemifields. All sounds used in the experiment had
the same energy at the two ears, and the impression of laterality or movement was created solely by interaural temporal manipulations. This allows the confident conclusion
that the observed functional asymmetries are not confounded by the known asymmetry
in the number of crossed and uncrossed excitatory projections in the ascending auditory
pathway, but result from the interaural temporal processing of the sounds.
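The principle that laterality can be imposed purely by interaural timing, with identical energy at the two ears, can be sketched as follows; the delay implementation and the 500 µs value are illustrative assumptions, not the study's stimulus code.

```python
import numpy as np

def lateralize_by_itd(mono, fs, itd_s):
    """Create a stereo sound whose perceived laterality rests solely on an
    interaural time difference: the same waveform is sent to both ears,
    with one ear delayed by itd_s seconds (positive = right ear leads).
    The energy at the two ears is identical by construction."""
    delay = int(round(abs(itd_s) * fs))
    delayed = np.concatenate([np.zeros(delay), mono])
    leading = np.concatenate([mono, np.zeros(delay)])
    if itd_s >= 0:            # right ear leads -> sound lateralized to the right
        left, right = delayed, leading
    else:                     # left ear leads -> sound lateralized to the left
        left, right = leading, delayed
    return np.stack([left, right], axis=0)

fs = 16000
noise = np.random.default_rng(0).standard_normal(fs) * 0.1  # 1 s of noise
stereo = lateralize_by_itd(noise, fs, itd_s=500e-6)         # 500 microsecond ITD
```

Because both channels contain the same samples, only shifted in time, any hemispheric response difference to such sounds must arise from interaural temporal processing rather than from interaural level or energy cues.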
A possible subcortical basis of cortical hemispheric specialization is discussed in the
study on activation asymmetries in the auditory pathway. The study reports evidence
for an asymmetrical activation of the left- and right-side auditory cortex, thalamus and
brainstem in response to monaural sound stimulation. A hypothesis based on anatomical
and physiological properties of the afferent and efferent auditory pathway is suggested,
in which the cerebral functional asymmetries are conveyed to subcortical structures via
the corticofugal fiber tracts.
CURRICULUM VITAE
Marc Schönwiesner
Scientific Education
2000–2004    Ph.D. student at the University of Leipzig and the Max-Planck-Institute of Human Cognitive and Brain Sciences, with a scholarship from the German National Academic Foundation
08/2000      Diploma (1.0; “with distinction”)
1995–2000    Biology studies at the University of Leipzig
School Education
1990–1994    Abitur at the “Adolph-Reichwein” Gymnasium in Halle/Saale
1982–1990    Polytechnical High School “Clara Zetkin” in Halle/Saale
Professional Education and Experience
since 10/2003    Visiting researcher at the Cognitive Brain Research Unit, University of Helsinki, Helsinki, Finland
07/2003          Invited talk at the Montréal Neurological Institute, McGill University, Montréal, Canada, host: Dr. R. Zatorre
10/2002          Invited talk at the Max-Planck-Institute of Biophysical Chemistry, Göttingen, Germany, host: Dr. P. Dechent
08/2002          Visiting researcher at the Institute of Medicine, Research Center Jülich, Germany
07/2002          Invited talk at the Institute of Medicine, Research Center Jülich, Germany, host: Dr. K. Krumbholz
06/2002          Trainee Travel Award from the National Institutes of Health and the Organization for Human Brain Mapping
04/2002          Lectures on Brain Theory and the Frontiers of Cognitive Neuroscience (ca. 100 students)
2000–2001        Courses in brain anatomy (Frankfurt/M.), computational neuroscience (Bochum) and transcranial magnetic stimulation (Göttingen), organised by the German Neuroscience Society
1998–1999        Teaching assistant for basic zoology seminars for biology students at the University of Leipzig, class size: 20 pers.
1998             Internship at Medtronic GmbH (heart pacemaker programming)