Research Article
Detection of Modulated Tones in Modulated Noise by Non-human
Primates
PETER BOHLEN,1 MARGIT DYLLA,1 COURTNEY TIMMS,1 AND RAMNARAYAN
RAMACHANDRAN1
1Department of Hearing and Speech Sciences, Vanderbilt
University School of Medicine, Nashville, TN 37232, USA
Received: 24 January 2014; Accepted: 8 May 2014; Online
publication: 5 June 2014
ABSTRACT
In natural environments, many sounds are amplitude-modulated.
Amplitude modulation is thought to be a signal that aids auditory
object formation. A previous study of the detection of signals in
noise found that when tones or noise were amplitude-modulated,
the noise was a less effective masker, and detection thresholds for
tones in noise were lowered. These results suggest that the
detection of modulated signals in modulated noise would be enhanced.
This paper describes the results of experiments investigating
how detection is modified when both signal and noise are
amplitude-modulated. Two monkeys (Macaca mulatta) were trained to
detect amplitude-modulated tones in continuous, amplitude-modulated
broadband noise. When the phase difference of otherwise similarly
amplitude-modulated tones and noise was varied, detection
thresholds were highest when the modulations were in phase and
lowest when the modulations were anti-phase. When the depth of
the modulation of tones or noise was varied, detection thresholds
decreased if the modulations were anti-phase. When the modulations
were in phase, increasing the depth of tone modulation caused an
increase in tone detection thresholds, but increasing the depth of
noise modulation did not affect tone detection thresholds. Changing
the modulation frequency of tone or noise caused changes in threshold
that saturated at modulation frequencies higher than 20 Hz; thresholds
decreased when the tone and noise modulations were in phase and
decreased when they were anti-phase. The relationship between
reaction times and tone level was not modified by manipulations to
the nature of temporal variations in the signal or noise. The
changes in behavioral threshold were consistent with a model where
the brain subtracted noise from signal. These results suggest that
the parameters of the modulation of signals and maskers heavily
influence detection in very predictable ways. These results are
consistent with some results in humans and avians and form the
baseline for neurophysiological studies of mechanisms of detection
in noise.
Keywords: amplitude modulation, detection, behavior,
comodulation
Correspondence to: Ramnarayan Ramachandran, Department of
Hearing and Speech Sciences, Vanderbilt University School of
Medicine, Nashville, TN 37232, USA. Telephone: (615)
322-4991; email: [email protected]
JARO 15: 801–821 (2014) DOI: 10.1007/s10162-014-0467-7 © 2014
Association for Research in Otolaryngology
JARO: Journal of the Association for Research in
Otolaryngology
INTRODUCTION
The amplitudes of natural sounds fluctuate with time. Due to the
prevalence of temporally modulated sounds, the auditory system may
be specially adapted to encode and even take advantage of these
features (Gans 1992). Studies of physiological responses of
auditory-responsive neurons have shown that one such adaptation,
phase locking, could lead to an enhancement of up to 20 dB in
sensitivity to sounds (Joris et al. 1994). However, natural
environments are composed of multitudes of sounds, and the
amplitude of any or all of them could vary with time. Thus,
behaviorally relevant target sounds and behaviorally irrelevant
distractors could both tap into the auditory sensitivity for
modulations. This represents part of the complexity of the
auditory scene analysis problem and highlights the difficulty of
auditory processing in the complex, noisy environments that
characterize the natural world. Research in visual systems
suggests that visual scene analysis, specifically scene
segmentation, depends on feature borders and contrasts
between local stimulus properties and global stimulus properties
(e.g., Julesz 1986; reviewed in Nothdurft 1994).
While many studies of auditory scene analysis highlight pattern
discrimination and identification, some studies deal with the
processing of contrast between local signals and global signals.
Amplitude modulation is one way to integrate multiple stimuli into a
single auditory object (Yost and Sheft 1989). Consistent with such a
hypothesis, detection thresholds of a steady-state signal in a
modulated masker were lower relative to when the signal and the
masker were not temporally modulated or when the modulation of
the masker was uncorrelated across different spectral regions (e.g.,
Hall et al. 1984; Schooneveldt and Moore 1989; Fantini 1991;
Langemann and Klump 2001; Dylla et al. 2013). When both signals
(local stimulus to an auditory filter) and masker (global stimulus)
were temporally modulated, behavioral performance was highly
dependent on temporal correlations between the signal and the
masker: detection thresholds were lower when the modulations of the
signal and the masker were different relative to when the signal and
the masker were modulated similarly (e.g., McFadden 1987; Cohen and
Schubert 1987; Fantini and Moore 1994). Since animals also live in
environments where signals and maskers are both modulated,
potentially similar results and rules could apply to animals as well
(Bee and Micheyl 2008). Consistent with that hypothesis,
experiments in avians have found that correlations between signal
and masker resulted in higher thresholds for the detection of signal
relative to when the signal and masker were not correlated with each
other (corvids: Jensen 2007; passerines: Langemann and Klump
2007).
With the recent popularity of the macaque as a model for hearing,
it is an open question whether some of the properties of scene
analysis and auditory object processing that have been described in
humans apply to macaques as well. Studies have found that macaques
have U-shaped audiograms, similar to humans (e.g., Stebbins et al.
1966; Pfingst et al. 1975, 1978), and that the modification of the
audiograms by noise is similar to that in humans (compare results
from Dylla et al. (2013) with Hawkins and Stevens (1950)). An early
indication of modulation-based release in masking in macaques was
the observation that tone detection thresholds were lower when
either the signal or the noise was modulated (Dylla et al. 2013),
consistent with findings in humans and other species (e.g.,
Gustafsson and Arlinger 1994; Bacon et al. 1998; Langemann and Klump
2001; Velez and Bee 2010). In this paper, we extend the findings of
our previous behavioral study to further investigate how detection
is modified when both tones and noise were time-varying (temporal
variation was created by amplitude modulation) and suggest a model
for the computation underlying the detection. If amplitude
modulation helps auditory object formation, then thresholds to
detect an amplitude-modulated signal in a similarly
amplitude-modulated noise would be higher than when the signal and
noise were modulated differently. Theories of dip listening would
suggest that detection thresholds would decrease as the energy in
the dip of the masker decreased. The behavioral performance of the
monkeys is consistent with both predictions, and an energetic
masking model where the nervous system effectively subtracts noise
from the signal can account for the results. The results of these
experiments form the baseline for neurophysiological experiments
exploring the mechanisms underlying auditory scene analysis,
auditory object formation, and the detection of signals in noise.
METHODS
Experiments were conducted on two male rhesus macaque monkeys
(Macaca mulatta) that were both 5 years of age at the beginning of
these experiments (monkeys C and D). The monkeys were prepared for
chronic experiments using standard techniques used in primate
research (e.g., Ramachandran and Lisberger 2005; Dylla et al. 2013),
and their audiograms as well as the effects of noise on their
audiograms were consistent with previous reports on non-human
primates, including studies from our laboratory (Stebbins et al.
1966; Pfingst et al. 1975, 1978; Dylla et al. 2013). All procedures
were approved by the Institutional Animal Care and Use Committee at
Vanderbilt University and were in strict compliance with the
guidelines for animal research established by the National
Institutes of Health.
The surgical and experimental procedures have been described in
detail earlier (Dylla et al. 2013). Briefly, monkeys were prepared
for this study with a surgical procedure conducted using isoflurane
anesthesia and performed under sterile conditions. During this
surgical procedure, bone cement and screws were used to secure a
head holder to the skull. The monkey was allowed to recover with a
regimen of analgesics and antibiotics (if necessary) and was under
careful observation by both laboratory staff and veterinary
personnel. The head holder was used to position the monkey's head
in a constant location in the chair (via a head-post) relative to
the speakers during experiments.
All experiments were conducted in a double-walled pseudo-anechoic
sound booth (model 1200A, Industrial Acoustics Corp., NY). The
monkeys were seated comfortably in an acrylic primate chair that
was custom-designed for their comfort and to leave the area around
the ears clear. The monkeys' heads were fixed to the chair by
means of the implanted head holder such that the head was level with
the center of speakers positioned directly in front at a distance of
90.1 cm from the ears. The speakers (Rhyme Acoustics speakers,
Madisound, WI) could deliver sounds between 50 Hz and 40 kHz and
were driven by linear amplifiers such that the output of the
speakers varied by ±3 dB over the entire frequency range. The
efficacy of the sound system was frequently tested by calibrating
the output with a ½″ probe microphone system (PS 9200, ACO Pacific,
Belmont, CA). All calibrations were performed with the probe
microphone placed at the location of one of the ears of the monkey
with its head fixed. The same speaker was used to deliver tones and
noise, so that there was no spatial separation between the two
stimuli. Tones were calibrated by presenting the stimuli, measuring
the signal with the probe microphone placed at the location of the
monkey's ears, and using the known sensitivity of the microphone.
Noise was calibrated by filtering the noise into 1-Hz bands using
custom software written in Matlab, calibrating the sound pressure
level over the entire frequency range of the noise (thus measuring
dB spectrum level, see below), and then calculating the overall
level based on the known relationship between decibel overall level
and decibel spectrum level (see below for details).
Behavioral Task
The experiments were controlled by a computer running OpenEx
software (System 3, TDT Inc., Alachua, FL). Signals (tones and
noise) were generated with a sampling rate of 97.6 kHz. Lever
state was sampled at a rate of 24.4 kHz, with a temporal resolution
of about 40 μs on the lever release. The details of the task, basic
stimulus, and experimental conditions have been described elsewhere
(Dylla et al. 2013). Briefly, the monkeys initiated trials by
holding down a lever (Model 829 Single Axis Hall Effect Joystick,
P3America, San Diego, CA). When signals (duration=200 ms, 10 ms rise
and fall times) were presented (∼80 % of the trials,
tones/amplitude-modulated tones), monkeys were required to release
the lever within a 600 ms response window beginning at tone onset.
A correct release resulted in fluid reward, incorrect non-releases
were not penalized, and early release was treated as a false alarm.
On catch trials (∼20 % of the trials, when no signals were
presented), monkeys were required to hold through the response
window. Correct rejects were not rewarded, but incorrect releases
(false alarms) resulted in a variable duration (6–10 s) time-out
period during which no new trials could be initiated. Broadband
noise (bandwidth 5 Hz–40 kHz) was used and was presented
continuously, beginning 10 s before the first trial could be
initiated so that the monkey was adapted to the noise. On signal
trials, monkeys were required to detect the signal (tone/modulated
tone) in noise (broadband noise/amplitude-modulated broadband
noise), and on catch trials, monkeys were required to reject the
noise.
Tones were generated using the formula S(t) = A sin(2π fc t + ϕc),
where S(t) represents the tone signal, A represents the amplitude in
volts, fc represents the carrier (tone) frequency, and ϕc represents
the carrier phase. The carrier phase was usually set to 0 (zero) in
the experiments described below. Broadband noise (N(t)) was
generated using the "Random" function in OpenEx, which generated
flat-spectrum noise with roughly equal amplitude at all frequencies
and was further band-limited to 40 kHz. The amplitude of the
broadband noise is always given as the total level, in decibels
(dB). Usually, the mean noise amplitude was set at a 55-dB overall
level. The amplitude in dB SPL spectrum level may be computed by
subtracting from that overall level an amount equal to
10*log10(bandwidth), i.e., 46 dB for the 40-kHz bandwidth. The
measure of signal level used was power (the signal duration was not
taken into account for the calculation of signal level). In these
experiments, the sound pressure level of the tone could vary over a
90-dB range, going from −16 to 74 dB SPL. Tone levels were usually
presented in steps of 3 or 5 dB, and sound pressure levels were
randomly interleaved within a block. Under the conditions of the
experiments, broadband noise at 55 dB caused a roughly 30 dB change
in tone thresholds across many frequencies, consistent with previous
results in our laboratory (Dylla et al. 2013). Figure 1 shows the
audiograms of the two monkeys to tones presented alone (large
symbols and solid lines) and in continuous broadband noise at the
noise level used in this study (55-dB overall level; small symbols
and dashed lines). The noise level used caused significant
threshold shifts that showed frequency-specific trends consistent
with previous data in macaques (Dylla et al. 2013) and with data
from humans (Hawkins and Stevens 1950). Note that the use of higher
noise levels (99 dB SPL spectrum level) would result in higher
masked thresholds (e.g., Dylla et al. 2013) and may cause different
amounts of masking release as a result of parametric variations in
the signal or noise modulations.
Temporal variations of signals were created via sinusoidal
amplitude modulation. For any signal S(t), sinusoidal amplitude
modulation was produced according to

$$S_{AM}(t) = S(t) \cdot \left[1 + d_s \cdot \sin\left(2\pi f_{ms} t + \phi_s\right)\right],$$

where SAM(t) is the amplitude-modulated signal, ds is the depth
of signal modulation, and fms and ϕs represent the modulation
frequency and modulation phase of the signal, respectively.
Amplitude-modulated noises were created similarly according to

$$N_{AM}(t) = N(t) \cdot \left[1 + d_n \cdot \sin\left(2\pi f_{mn} t + \phi_n\right)\right],$$

where NAM(t) is the amplitude-modulated noise, dn is the depth of
noise modulation, and fmn and ϕn represent the modulation frequency
and modulation phase of the noise, respectively. In both of these
cases, the mean sound pressure level is the value reported in the
data, so the signal and the noise had peaks that were 6 dB higher
than the reported mean level when the modulation depth was set at
1. The parameterization shown above allowed us to vary ds, dn, fms,
fmn, ϕs, and ϕn independently. The experiments were performed in a
block design so that all modulation parameters were constant within
a block, except for the tone amplitude A; this way, the threshold
and reaction time metrics could be determined using the method of
constant stimuli. Across blocks, modulation parameters could be
systematically varied and their effects on behavior measured.
Data Analysis
The analytical techniques have been described previously (Dylla
et al. 2013). All analyses were based on signal detection theoretic
methods (Green and Swets 1966; Macmillan and Creelman 2005) and
implemented using MATLAB (MathWorks, Natick, MA). Briefly, the hit
rate (H) and false alarm rate (FA) were calculated based on the
number of releases at each tone sound pressure level (A) for each
block. Signal detection theory dictates that the behavioral
sensitivity for a Go/No-Go task can be analyzed in the following
way:

$$p(c) = z^{-1}\left(\frac{z(H) - z(FA)}{2}\right),$$

where z converts hit rate and false alarm rate into units of
standard deviation of a standard normal distribution (z-score,
norminv in MATLAB) (Macmillan and Creelman 2005). The inverse z
(z−1) then converts a unique number of standard deviations of a
standard normal distribution into a probability correct (p(c),
normcdf in MATLAB). Care was taken to adjust for cases when hit
rates and false alarm rates were 1 and 0, respectively, using
methods described previously (Dylla et al. 2013; Macmillan and
Creelman 2005). The probability correct values were calculated for
all signal amplitudes to create the psychometric function.
The false alarms (10 % or less in all the blocks) and sometimes
less than perfect performance at higher sound pressure levels
caused the psychometric functions to be non-ideal. To account for
that, psychometric functions were fit with a modified Weibull
cumulative distribution function (cdf) using the following
equation:

$$p(c)_{fit} = c - d \cdot e^{-\left(level/\lambda\right)^{k}},$$

where level represents the tone sound pressure level in dB SPL
and is related to A by a logarithmic function, λ and k represent
the threshold and slope parameters, respectively, and c and d set
the probability correct at higher sound levels (c) and the estimate
of chance performance at sound levels below threshold (c − d),
respectively. To account for the sound pressure levels below 0 dB
SPL, sound levels were translated by up to 16 dB, fit with a
Weibull function, and then sound levels and thresholds were
translated back by the same amount as the original translation.
From the Weibull cdf, threshold was calculated as that tone sound
pressure level that would cause a probability correct value of
0.76.
[Figure 1. Threshold (dB SPL) vs. frequency (Hz). Legend: Monkey D,
55 dB noise; Monkey C, 55 dB noise; Monkey D, tone alone; Monkey C,
tone alone.]
FIG. 1. Thresholds to tones alone and to tones in noise. Threshold
to a 200-ms tone is plotted against the tone frequency for monkeys C
(red circles and lines) and D (blue diamonds and lines). Thresholds
are shown when tones were presented alone (large symbols, solid
lines) and when tones were presented embedded in continuous
broadband noise at a 55-dB overall level.
These analyses were performed under the various conditions used
in this study.
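As an illustration of how a threshold follows from the fitted Weibull function, the equation can be inverted in closed form at the 0.76 criterion. The parameter values below are hypothetical, and the fitting step itself (done in MATLAB by the authors) is not shown.

```python
import math


def weibull_pc(level, lam, k, c, d):
    """Modified Weibull cdf: c - d * exp(-(level / lam) ** k), for level >= 0."""
    return c - d * math.exp(-((level / lam) ** k))


def weibull_threshold(lam, k, c, d, criterion=0.76):
    """Level at which the fitted function crosses the criterion."""
    ratio = (c - criterion) / d
    if not 0.0 < ratio < 1.0:
        raise ValueError("criterion must lie strictly between c - d and c")
    return lam * (-math.log(ratio)) ** (1.0 / k)


# Hypothetical fitted parameters on a level axis shifted up by 16 dB, so that
# the lowest tone level used (-16 dB SPL) maps to 0.
SHIFT_DB = 16.0
params = dict(lam=40.0, k=4.0, c=0.98, d=0.48)

threshold_shifted = weibull_threshold(**params)
threshold_db_spl = threshold_shifted - SHIFT_DB  # undo the translation
```

The shift mirrors the translation described in the text: the Weibull function is only defined for non-negative levels, so levels are moved up before fitting and the threshold is moved back afterwards.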
In all cases, reaction time was also computed, based on the time
of the lever release. Reaction time was computed as follows:

Reaction time = time of lever release − tone onset time

Reaction time was computed on all correct Go responses. We
performed statistical analyses on the reaction times to explore the
variation of reaction time with signal strength, with noise level,
and with the modulation of noise or signal.
Statistical Analysis
All statistical analyses were implemented using MATLAB and were
either coded by one of the authors based on the theory described in
Zar (1984) or implemented using a built-in function.
In many cases, the variability in the data could only be
estimated using bootstrap methods (Efron and Tibshirani 1993).
Briefly, trials were resampled using random draws with
replacement, while taking care to maintain the substructure of the
block (e.g., number of trials at each sound level). For example, the
variability in threshold measurements would be estimated by
resampling each block of behavioral responses 1,000 times. The
responses at each tone level (including catch trials) were drawn
with replacement from the original dataset at that particular tone
level, taking care that the number of bootstrapped trials at that
tone level matched the number obtained behaviorally. This was
repeated at all tone levels to generate one bootstrapped
behavioral dataset and, from it, one bootstrapped threshold
estimate. The same procedure was repeated 1,000 times to generate
1,000 estimates of bootstrapped threshold. This procedure permitted
the calculation of the variability of the measured threshold. In
all cases, the number of iterations was restricted to the lowest
number at which the parameters converged. In most cases, the
distributions converged by 1,000 iterations.
RESULTS
Effect of Phase Difference
One way of varying the temporal relationship between two
modulated sounds is to impose a phase difference between the
modulations. The effect of the phase difference between the
modulations of tone and noise (δϕ=ϕs−ϕn) was investigated in two
macaques. Dip listening theories predict that as more of the signal
(modulated tone) occurred in the dips of the noise, thresholds
would be reduced; i.e., the thresholds would be lowered when phase
differences approached 180° and would be systematically higher as
the phase differences deviated from 180°. Figure 2 shows the
results of such a manipulation in one monkey during the detection
of a 12.8 kHz tone in broadband noise. Both the tone and the noise
were amplitude-modulated at 10 Hz, and both tone and noise were
modulated to a depth of 1. Figure 2A shows the hit rates (colored
circles) and false alarm rates (colored dashed lines, labeled FA)
as a function of the tone sound pressure level during the detection
task for four different phase differences. The different colors
represent different phase differences between the tone and the
noise modulations (see legend). The hit rates diverged from false
alarm rates at very different sound pressure levels depending on
the phase difference of the modulations. This implies that the
monkey could reliably release the lever at lower sound levels when
the tone and noise modulations were in anti-phase at tone onset
(δϕ=180°) relative to when the tone and noise modulations were in
phase at tone onset (δϕ=0°). The tone levels required for a
reliable lever release for the intermediate phase differences
(δϕ=90° and δϕ=270°) fell between those for the other two δϕ values
and appeared similar to each other. The behavioral accuracy in the
task at each sound pressure level was calculated by taking hit rate
and false alarm rate into consideration (as in the "METHODS"
section) and plotted as psychometric functions relating probability
correct (p(c)) and tone sound pressure level in Figure 2B. The
psychometric functions were fit with Weibull cdfs to generate
smooth estimates of behavioral accuracy and to estimate behavioral
thresholds. The psychometric functions varied with the modulation
phase difference in a manner similar to the hit rates. The
detection thresholds were lowest for δϕ=180°, intermediate for
δϕ=90° and δϕ=270°, and highest when δϕ=0°. These results are
consistent with theories of dip listening that suggest decreases in
threshold as more of the signal falls into the dip of the masker.
Figure 2C shows how reaction times changed with sound pressure
level. The color scheme is the same as in Figure 2B. In all cases,
the reaction times decreased as the tone levels increased, similar
to the trend for steady state tones and for steady state tones
masked by noise. The slopes of the reaction time vs. tone level
relationship did not differ significantly with modulation phase
difference (ANOVA after bootstrapping, F(7,993)=1.58, p=0.137).
Figure 3 shows how the phase differences between the signal and
noise modulations (δϕ=ϕs−ϕn) influenced detection thresholds and
reaction times. Figure 3A shows the relationship between the
thresholds and δϕ for the exemplar case shown in Figure 2. The
thresholds decreased as the phase difference increased from 0 to
180°, but then increased as the phase difference wrapped back to
360°. The thresholds appeared to be sinusoidally modulated by phase
difference and were best fit with a sinusoidal function of half the
phase difference with an amplitude of 16.4 dB. The sinusoidal shape
of the curve fit was consistent with a subtraction model, where the
noise amplitude was subtracted from the signal amplitude, or one
where the modulation waveform of the noise was subtracted from the
modulation waveform of the tone. Figure 3B shows the trend over all
other frequencies tested, ranging between 0.4 and 25.6 kHz (shown
in different colors). The offset in the curves was highly
correlated with and was most likely related to the audiometric
thresholds at those frequencies. The trend in threshold changes as
a function of modulation phase difference was similar across fc
values, and the magnitude of the threshold change as a function of
δϕ was not significantly different as a function of frequency
(Kruskal-Wallis test, df=5, H=8.57, p=0.127). These results did not
vary depending on the onset phase of the tone or noise modulation,
as long as δϕ was maintained constant. These results are consistent
with listening in the "dips" of the noise; as the phase difference
between the signal and noise modulations was varied, the amount of
signal in the dips of the noise increased, which could result in
improved thresholds.
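The sinusoidal fit reported for the exemplar case (Threshold = 21.02 − 16.4*sin(δϕ*π/360), as labeled in Figure 3A) can be evaluated directly. The function name below is hypothetical; the coefficients are those given in the figure.

```python
import math


def fitted_threshold_db(delta_phi_deg, offset_db=21.02, amplitude_db=16.4):
    """Threshold = offset - amplitude * sin(delta_phi * pi / 360)."""
    half_phase_rad = math.radians(delta_phi_deg) / 2.0  # delta_phi * pi / 360
    return offset_db - amplitude_db * math.sin(half_phase_rad)


for phase in (0, 90, 180, 270, 360):
    print(f"delta_phi = {phase:3d} deg -> {fitted_threshold_db(phase):.2f} dB SPL")
```

The fit gives the highest threshold at 0° and 360°, the lowest at 180°, and equal intermediate values at 90° and 270°, in line with the thresholds described above. One way to see why half the phase difference appears: for equal depths, the difference of the two modulation waveforms, sin(x + δϕ) − sin(x) = 2 sin(δϕ/2) cos(x + δϕ/2), has an amplitude of 2 sin(δϕ/2). Whether this is exactly the derivation behind the authors' subtraction model is not stated in this section.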
Figure 3C and D show the effect of δϕ on reaction times at the
exemplar fc (12.8 kHz) condition shown in Figure 2. The slope of
the linear fit to reaction time vs. sound level did not differ
significantly as a function of phase difference for any frequency
studied (see Fig. 2 for an example). We investigated the reaction
times at each sound level as a function of the modulation phase
difference δϕ. The reaction times at each sound pressure level did
not vary significantly with δϕ (individual reaction times are not
shown for clarity; lines joining medians are shown in Fig. 3C).
When we examined the reaction times at sound levels relative to
threshold, the reaction times did not vary significantly as a
function of δϕ (Fig. 3D, lines joining medians shown for clarity).
This lack of significant modulation held for both monkey subjects
and all tone frequency conditions studied.
[Figure 2. Effect of modulation phase difference (fc=12.8 kHz,
fms=fmn=10 Hz, ds=dn=1). Panels: A, proportion of lever releases
(with false alarm rates, FA); B, probability correct; C, reaction
time (s); each plotted against tone level (dB SPL) for δϕ = 0°,
90°, 180°, and 270°, with data and fits.]
FIG. 2. The effect of changing the phase difference between the
amplitude modulation of the signal and noise waveform during a
detection task. A Hit rate (probability of lever release) vs. tone
level during detection of a 12.8-kHz tone embedded in broadband
noise. Tone and noise were both amplitude-modulated at 10 Hz and a
depth of 1. Noise level was 55-dB overall level. Performance during
phase differences of 0° (black), 90° (green), 180° (blue), and 270°
(red) is shown. Dashed horizontal lines represent false alarm rate
(FA) during the blocks of the phase differences shown and are
color-coded. B Behavioral accuracy (probability correct, see the
"METHODS" section for calculation) vs. tone level for the exemplar
conditions shown in A. The symbols are color-coded as in A. Weibull
cumulative distribution function (cdf) fits are shown and are
color-coded by phase. The horizontal line shows p(c)=0.76; the
vertical dashed lines show the behavioral thresholds under the
phase difference conditions shown. C Reaction time vs. sound level
during the detection of the amplitude-modulated tone. The reaction
times are color-coded based on the phase difference between tone
and noise modulation as in A and B. The reaction time vs. level
relationship was captured by a linear fit (shown color-coded).
Effect of Modulation Depth
The depth of modulation should have a large effecton detection
thresholds. Our previous study foundthat modulation of signal or
noise caused a maskingrelease (lower thresholds) relative to
thresholds forunmodulated tones in unmodulated noise (Dylla et
al.2013). Thus, as the depth of the tone or the noisemodulation was
parametrically increased from 0 to 1,thresholds would be expected
to parametrically de-crease. When modulation depth is changed,
thedepth of the trough (or dip) changes by a much
larger amount than the height of the peak. Thereduction in
behavioral thresholds could be expecteddue to the dramatic increase
in the depth of the dipwhen the noise modulation depth was
increased (thusresulting in a much larger signal to noise ratio
aroundthe dip). Figure 4 shows an exemplar case describingthe
effects of changing modulation depth duringdetection of modulated
tones in modulated noise.Figure 4A shows the hit rate during the
detection of a8 kHz tone modulated at 10 Hz at various
tonemodulation depths (ds); the masker was broadbandnoise-modulated
at 10 Hz at a depth of unity and waspresented at a 55 dB overall
level (9 dB SPL spectrumlevel). Increasing tone modulation depths
causessmall increases in the peak amplitude of the signal(up to 6
dB for ds=1). The noise modulation was in
0 45 90 135 180 225 270 315 360
Thr
esho
ld (
dB S
PL)
0
10
20
30
C, fc = 12.8 kHzThreshold = 21.02-16.4*sin(δφ*π/360)
A
δφ, Phase difference (º)0 45 90 135 180 225 270 315 360
Rea
ctio
n T
ime
(s)
0.2
0.3
0.4
0.5
0.6
22dB25dB28dB32dB35dB38dB42dB
C
δφ, Phase difference (º)
0 45 90 135 180 225 270 315 360
0.2
0.3
0.4
0.5
0.6
First level > Threshold+3 dB+6 dB+10 dB+13 dB+16 dB+20 dB
D
0 45 90 135 180 225 270 315 360
0
10
20
30
C, fc=0.4 kHzD, fc=0.8 kHzD, fc=1.6 kHzC, fc=3.2 kHzC, fc=12.8
kHzC, fc=25.6 kHz
Threshold = 23.7-15.02*sin(δφ*π/360)
Bfms=fmn=10 Hz, ds=dn=1 fms=fmn=10 Hz, ds=dn=1
FIG. 3. Behavioral performance as a result of varying
phasedifference between tone and noise modulations. A Threshold as
afunction of modulation phase difference for the exemplar
frequencyshown in Figure 1. The circles represent thresholds at the
variousmodulation phase differences, and the dashed red line
represents thebest fit (sinusoid) to the threshold variations. B
Threshold as afunction of modulation phase difference for multiple
tone
frequencies tested. The individual frequencies are
color-coded.Fits to individual fc values are not shown. Thedashed
line is the best fit tothe entire data.C Trends of reaction time as
a result of modulation phasedifference. The lines connectmedian
thresholds at specific sound levels.Different colors show different
sound levels. D Similar to C, but levelsare considered relative to
threshold.
BOHLEN ET AL.: Detection of Modulated Tones in Modulated Noise
807
-
0 10 20 30 40 50 60
Pro
port
ion
of le
ver
rele
ase
0
0.25
0.5
0.75
1A
ds=1.0ds=0.25
0 10 20 30 40 50 60 70
0
0.25
0.5
0.75
1D
dn=1.0dn=0.25
0 10 20 30 40 50 60
Pro
babi
lity
corr
ect (
p(c)
)
0.5
0.75
1B
0 10 20 30 40 50 60 70
0.5
0.75
1E
dn=1.0dn=0.25
DataFit
Tone Level (dB SPL)
0 10 20 30 40 50 60
Rea
ctio
n tim
e (s
)
0.2
0.3
0.4
0.5
0.6C
ds=1.0ds=0.25
DataFit
Tone Level (dB SPL)
0 10 20 30 40 50 60 70
0.2
0.3
0.4
0.5
0.6F
dn=1.0dn=0.25
DataFit
ds=1.0ds=0.25
DataFit
fc=8 kHz, fms=fmn=10 Hz, dn=1, δφ=0˚ fc=25.6 kHz, fms=fmn=10 Hz,
ds=1, δφ=180˚Variation of depth of tone modulation Variation of
depth of noise modulation
FIG. 4. The effect of varying depth of modulation of tones (A–C)
or noise (D–F) on the detection of modulated tones in modulated
noise. Format is similar to Figure 2. A Hit rate vs. tone level
during detection of an 8-kHz tone in broadband noise at a 55-dB
overall level for two depths of tone modulation: 0.25 (green) and
1.0 (red). Tone and noise were amplitude-modulated at 10 Hz, and
the modulations had a phase difference of 0 °. The depth of
noise modulation was held at 1. Dashed lines show false alarm rates.
B Probability correct vs. tone level for the two depths of
tone modulation in A. The psychometric functions (circles) were fit
with a Weibull cdf (solid lines). The horizontal line represents
the threshold criterion (p(c)=0.76), and the vertical lines
represent threshold under the two conditions. C Reaction time vs.
tone level during detection at the two depths of tone modulation.
The relation of reaction times (circles) to sound level was captured
by a linear fit (solid lines). D–F Same as A–C, but hit rates (D),
probability correct and thresholds (E), and reaction times (F) vs.
tone level when the depth of noise modulation was manipulated. Tone
frequency was 25.6 kHz, tone and noise modulation frequencies were
set at 10 Hz, and noise level was 55 dB. The depth of tone
modulation was held at 1.0, and the modulations had a phase
difference of 180 °.
phase with the tone modulation at tone onset (δϕ=0 °). The
different colored symbols show hit rates at two different tone
modulation depths (ds=0.25 and ds=1), and the hit rate vs. sound
level function shows that as the modulation depth increased, the tone
levels required to produce hit rates above the false alarm rates
increased. Figure 4B shows the behavioral accuracy (p(c)) for the
same case. The psychometric functions (circles) and the associated
Weibull fits (lines) detailing the behavioral performance at the two
depths of tone modulation show that the tone detection thresholds
increased as the tone modulation depth increased. The reaction
times under these conditions are shown in Figure 4C. In all
cases, reaction times decreased as the tone levels increased.
When reaction times were compared across modulation depths,
the slopes were not significantly different (ANOVA
after bootstrapping, F(3,997)=1.47, p=0.22).
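As an illustration of how a threshold can be read off a fitted psychometric function, the sketch below inverts a Weibull cdf at the criterion p(c)=0.76. The parameterization (ceiling c, range d, scale lam, shape k) is an assumed common form and may differ from the one used in this study; the parameter values are hypothetical and are not fits to the data.

```python
import math

def weibull_pc(level, lam, k, c=1.0, d=0.5):
    """Assumed Weibull cdf form: p(c) rises from c - d toward c with level.

    Levels are assumed to be positive (e.g., dB re some floor).
    """
    return c - d * math.exp(-((level / lam) ** k))

def threshold(lam, k, c=1.0, d=0.5, criterion=0.76):
    """Level at which the fitted function crosses the criterion p(c)."""
    if not (c - d < criterion < c):
        raise ValueError("criterion must lie between the floor and the ceiling")
    # Solve c - d*exp(-(x/lam)**k) = criterion for x.
    return lam * (-math.log((c - criterion) / d)) ** (1.0 / k)

# Hypothetical fit parameters, for illustration only (not values from the study).
lam, k = 30.0, 4.0
thr = threshold(lam, k)
print(f"threshold = {thr:.2f} (same units as level)")
```

With this form, a rightward shift of the psychometric function (larger lam) raises the threshold by the same factor, which is how the threshold increases with tone modulation depth would appear in the fits.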
Figure 4D–F shows similar data for a case in which the depth of
noise modulation (dn) was varied. Increasing the depth of noise
modulation caused a small increase in the peak amplitude and
large decreases in amplitude at the trough (e.g., Malone et al.
2010). Figure 4D shows hit rates for two different dn values when a
25.6-kHz tone was being detected; tone modulation frequency and
depth were held constant at 10 Hz and 1, respectively, the
noise modulation frequency was 10 Hz, and the modulation phase
difference δϕ was 180 °. The mean noise level was held constant at a
55 dB overall level across the different modulation depth
conditions. The tone level required to produce hit rates higher than
the false alarm rate was lower for dn=1 compared with dn=0.25. This
is in contrast to the experiments with tone modulation where the
tone and noise modulations were in phase (see Fig. 4A). The resulting
psychometric functions and their Weibull fits (Fig. 4E) show
that the behavioral accuracy increased and thresholds decreased as
the noise modulation depth increased. As in previous cases, there
were no significant changes in the relationship between reaction
time and tone level as a function of the noise modulation
depth (ANOVA after bootstrapping, F(3,997)=1.14, p=0.33).
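The resampling step behind a comparison of reaction-time slopes can be sketched as follows. This is not the ANOVA-after-bootstrapping procedure used here; it is a simpler percentile bootstrap of an ordinary least-squares slope, shown only to illustrate how resampling yields a distribution of slopes. The reaction times are synthetic and all function names are hypothetical.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("need at least two distinct x values")
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

def bootstrap_slopes(xs, ys, n_boot=1000, seed=0):
    """Resample (x, y) pairs with replacement and refit the slope each time."""
    rng = random.Random(seed)
    pairs = list(zip(xs, ys))
    slopes = []
    while len(slopes) < n_boot:
        sample = [rng.choice(pairs) for _ in pairs]
        if len({x for x, _ in sample}) < 2:
            continue  # degenerate resample (one x value only); draw again
        sx, sy = zip(*sample)
        slopes.append(ols_slope(sx, sy))
    return slopes

def percentile_interval(values, lo=2.5, hi=97.5):
    """Simple percentile interval from the sorted bootstrap estimates."""
    ordered = sorted(values)
    pick = lambda q: ordered[min(len(ordered) - 1, int(q / 100.0 * len(ordered)))]
    return pick(lo), pick(hi)

# Synthetic reaction times (s) vs. tone level (dB SPL); illustration only.
rng = random.Random(1)
levels = [20, 30, 40, 50, 60] * 10
rts = [0.55 - 0.004 * lv + rng.gauss(0.0, 0.02) for lv in levels]
slopes = bootstrap_slopes(levels, rts)
ci = percentile_interval(slopes)
print(f"slope 95% interval: {ci[0]:.4f} to {ci[1]:.4f} s/dB")
```

Two conditions could then be compared by bootstrapping each and asking whether the distribution of slope differences includes zero.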
The exemplar data and data from some other tone frequencies (fc)
are summarized in Figure 5. For all examples and data shown, the
tone and noise modulation frequencies were held constant and equal at
10 Hz. As expected from Figure 4A, increasing the depth of
tone modulation resulted in increased tone detection thresholds
when tone and noise modulations were in phase (δϕ=0 °) (Fig. 5A). The
exemplar case of Figure 4A–C is shown in blue colors. The threshold
changes as a function of ds were significantly different from zero
for each case (ANOVA after bootstrapping, p<0.01) and were fit with a
line. The slopes of the linear fits at the different tone
frequencies were all significantly different from
zero (t test for slopes, p<0.01 in all cases) and were not
significantly different from each other (ANOVA after bootstrapping,
F(2,997)=1.48, p=0.228). This result could be because (1) the noise
and the tone modulations became more similar as the depth of tone
modulation increased or (2) the signal energy in the dips of the
masker decreased with increased depth of tone modulation. When the
tone and noise modulations were 180 ° out of phase at tone onset
(δϕ=180 °), dip listening theories would predict that the trend
would be reversed relative to the in-phase condition, due to the
increase in the amplitude of the peak during the dips of the masker.
The experimental test of this hypothesis showed that the trend between
threshold and tone modulation depth when the tone and noise
modulations were anti-phase at tone onset was reversed relative to
when the modulations were in phase (Fig. 5B). Increasing the depth
of modulation of the tone caused a decrease in the tone detection
thresholds. The threshold changes were significantly
different from zero (t test for slopes, p<0.008 in all
cases). The relationship between threshold and ds was best
captured by a linear fit. Whether this trend held across all tone
carrier frequencies was tested: the slopes of the linear fits were
not significantly different from each other for the various
frequencies tested (ANOVA after resampling, F(2,997)=1.79,
p=0.1675). Note that the threshold difference between the highest and
lowest modulation depth conditions was smaller when δϕ=180 °
(modulations were anti-phase) compared to when δϕ=0 ° (modulations
were in phase). This result is consistent with smaller increases in
the peak of the modulated signal with increases in modulation depth
(important for δϕ=180 °) as opposed to large decreases in amplitude
at the trough with increases in modulation depth (important for δϕ=0
°) (e.g., Malone et al. 2010).
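The asymmetry between peak and trough can be made concrete with a short calculation. The sketch assumes a sinusoidal envelope of the form 1 + d·sin(2πft) applied to amplitude, with levels expressed in dB relative to the unmodulated amplitude; any normalization used to hold the mean level constant is not modeled.

```python
import math

def envelope_extremes_db(depth):
    """Peak and trough of 1 + depth*sin(...) in dB re the unmodulated amplitude."""
    peak_db = 20.0 * math.log10(1.0 + depth)
    trough = 1.0 - depth
    # At depth 1 the envelope reaches zero, so the trough level is unbounded.
    trough_db = 20.0 * math.log10(trough) if trough > 0 else float("-inf")
    return peak_db, trough_db

for depth in (0.25, 0.5, 0.75, 1.0):
    peak_db, trough_db = envelope_extremes_db(depth)
    print(f"d={depth:.2f}: peak {peak_db:+.1f} dB, trough {trough_db:+.1f} dB")
```

Under this assumption the peak grows by at most about 6 dB as depth goes from 0 to 1, whereas the trough falls without bound, so a manipulation that acts through the trough has more room to change thresholds than one that acts through the peak.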
The effect of varying noise modulation depth on tone detection
thresholds is shown in Figure 5C and D. Changing the noise modulation
depth changes the depth of the dip in the masker; thus, lower noise
modulation depths were expected to be correlated with tone
detection at higher thresholds when the tone and noise modulations
are anti-phase, and vice versa. As shown in Figure 4E, increasing the
depth of modulation of noise (dn) caused a decrease in tone
detection thresholds when the tone and noise modulations were 180 °
out of phase. This trend is summarized for the exemplar frequency
(shown in blue) and for some other frequencies (other
colors) in Figure 5D. The thresholds varied significantly as a
result of changing dn (t test for slopes, p<0.01 in all cases), and
the relationship between them was captured by a linear fit. The
slopes of the linear fits were not significantly different across
frequency (ANOVA after resampling, F(2,997)=2.013, p=0.15). The
threshold changes as a result of changing dn when δϕ=180 °
were comparable to the threshold differences after changing ds when
δϕ=0 ° (compare Fig. 5A and D).
Surprisingly, changing dn while keeping δϕ=0 ° did not result in
a significant change in tone thresholds (Kruskal–Wallis test, p>0.3
in each case). Figure 5C shows the summary of two examples of
changing dn (using two different tone frequencies). In these
cases, the tone and noise were modulated at 10 Hz, and
the modulations were in phase. The slope of the relationship
between modulation depth and tone threshold was not significantly
different from 0 for either of the two examples or the several other
tone carrier frequencies tested (t test for slopes, p>0.24 in
each case).
Effect of Modulation Frequency
If the tone and noise were modulated at the same frequency
(fms=fmn), one would expect that the tone detection threshold would be
high; when the modulation frequencies are different, detection
thresholds would be expected to be lower than in the equal
modulation frequency case (Bregman 1994). We tested that prediction
by varying the tone modulation frequency or the noise modulation
frequency, with modulation frequency held fixed within a block. The
results of two experiments are shown in Figure 6. Figure 6A–C
shows the results of an experiment in which the tone
[Figure 5. Threshold (dB SPL) vs. depth of modulation (0 to 1), with fms=fmn=10 Hz. A Variation of tone modulation depth (ds), dn=1, δφ=0°. B Variation of tone modulation depth (ds), dn=1, δφ=180°. C Variation of noise modulation depth (dn), ds=1, δφ=0°. D Variation of noise modulation depth (dn), ds=1, δφ=180°. Legend entries pair a subject label (C or D) with a tone frequency (fc=8, 12.8, or 25.6 kHz). Symbols show data; lines show fits.]
FIG. 5. The effects of varying depths of tone and noise
modulations. A Threshold as a function of change in the depth of
tone modulation. Thresholds are shown for three different tone
frequencies (different colors) at various depths of tone
modulation. The relationship was best captured by a linear fit
(solid lines). The tone and noise modulations were in phase during
these blocks. B Similar to A, but for these blocks, the tone and
noise modulations were anti-phase (phase difference=180 °). C
Threshold as a function of depth of noise modulation. Format is the
same as in A and B. For these blocks, the tone and noise modulations
were in phase. D Similar to C, but for these blocks, tone and noise
modulations were anti-phase.
[Figure 6. Left column (A–C), variation of tone modulation frequency: fc=3.2 kHz, fmn=10 Hz, ds=dn=1, δφ=0°; curves for fms=10, 20, and 40 Hz. Right column (D–F), variation of noise modulation frequency: fc=25.6 kHz, fms=10 Hz, ds=dn=1, δφ=0°; curves for fmn=10, 20, and 40 Hz. Abscissa: tone level (dB SPL). Ordinates: proportion of lever release (A, D), probability correct p(c) (B, E), reaction time in s (C, F). Symbols show data; lines show fits.]
FIG. 6. The effects of varying modulation frequency of tones
(A–C) or noise (D–F) on detection of modulated tones in modulated
noise. The format of the figure is the same as in Figure 4. A–C Hit
rate vs. tone level (A), psychometric functions, Weibull cdf fits and
detection thresholds (B), and reaction times vs. tone level (C)
during detection of a modulated tone in modulated noise. Tone
frequency was 3.2 kHz; noise level was 55 dB, noise modulation
frequency was 10 Hz, tone and noise modulation depths were 1.0 each,
and the modulations were in phase at tone onset. Data are shown for
three tone modulation frequencies: 10 Hz (blue), 20 Hz (green), and
40 Hz (red). D–F Similar to A–C, but as noise modulation frequencies
were changed. Tone frequency was 25.6 kHz, noise level was 55 dB,
frequency of tone modulation was 10 Hz, depths of modulation of
tone and noise were 1.0, and the tone and noise modulations were in
phase at tone onset.
modulation frequency was changed between blocks (varying fms),
and Figure 6D–F shows the results of an experiment in which noise
modulation frequency (fmn) was varied. In both cases, the
modulations of tone and noise were in phase at the onset of the tone
(onset phase difference, δϕ=0 °). Figure 6A shows the hit rates as
the tone modulation frequency was changed between 10 (blue), 20
(green), and 40 Hz (red). The false alarm rates were zero in all
cases and are shown as separated dashed lines for clarity. As
expected, for each modulation frequency, the hit rates matched
false alarm rates for low sound levels and then increased rapidly
until they reached high values close to unity for higher sound levels
(Fig. 6A). The effect of changing the modulation frequency of the tone
was that the tone level at which the hit rates diverged from false
alarm rates was lower as fms changed from 10 to 20 Hz and differed
more from the modulation frequency of the noise (Fig. 6A). But at a
higher fms, the threshold did not change much. This was also true at
a higher fms value tested (80 Hz, data not shown). The behavioral
accuracy was computed from the hit rates, and the psychometric
functions and Weibull fits in Figure 6B show that the detection
thresholds decreased as the tone modulation frequency increased from
10 to 20 Hz (compare blue and green symbols and lines), but did not
show a large change going from fms=20 Hz to fms=40 Hz. The reaction
times as a result of changing the tone modulation frequency are
shown in Figure 6C. As in previous cases, reaction times
decreased as tone level increased under each of the tone
modulation frequency conditions. The relationship between reaction
time and tone level was best captured by a linear fit (shown by the
blue, green, and red lines). There was no change in the
relationship between reaction time and tone level as a result of
changing the tone modulation frequency (slopes were not different,
intercepts were not different). Reaction times examined in
greater detail as a function of modulation frequency (similar to Fig.
3C and D) did not show a trend when examined with absolute sound
level or with sound level re: threshold (data not shown).
Figure 6D–F shows an example in which fmn was
varied over different blocks. As mentioned above, the phase
difference between the modulations was zero at tone onset. Figure 6D
shows the hit rates, using the same format as Figure 6A. False alarm
rates were ∼6 % for the fmn=10 Hz condition (blue dashed lines), but
were zero for the other two conditions (green and red dashed
lines). As with Figure 6A, the false alarm rates are shown
staggered for the two cases when they were zero. The effect of
changing fmn was different from the effect of changing fms. At
fmn=20 Hz, the tone levels required to change the hit rate from the
false alarm rate to higher levels were reduced relative to fmn=10 Hz
(compare blue and green symbols, Fig. 6D). When the noise
modulation frequency was changed to 40 Hz, the tone
levels required to change the hit rate to levels above the false
alarm rate increased above those for the 20-Hz condition, but were
still lower than those for the 10-Hz condition. This trend was
reflected in the psychometric functions and their Weibull fits (Fig.
6E). Psychometric functions for fmn=20 Hz were shifted to lower
levels relative to those for fmn=10 Hz, as well as those for
fmn=40 Hz; the psychometric functions for fmn=40 Hz were shifted to
lower levels relative to fmn=10 Hz (Fig. 6E). As in previous cases,
reaction times decreased as the tone levels increased and were best
related to tone level by a linear fit. The linear fit was not
significantly impacted by changes in fmn. A closer examination
revealed that reaction times were not impacted by fmn, whether one
examined the relationship based on absolute tone sound pressure
level or tone sound pressure level re: threshold (data not shown).
Figure 7 summarizes the effects on threshold, at
various fc values, of changing fmn or fms. Figure 7A
shows the effect of varying fms while keeping δϕ=0 °. Theories of
dip listening predict that the detection thresholds would be
lower when tone modulation frequencies increased, due to more signal
energy in the dip of the masker. Each color and symbol represents a
different tone frequency (fc) tested (see legend in Fig. 7B for
details). For all of these cases, fmn=10 Hz and noise level was 55 dB
overall level. The detection threshold was largest at fms=10 Hz
and was lower for higher values of fms. The thresholds for
fms>10 Hz were not different from each other (ANOVA after
resampling for each frequency, p>0.2). A similar trend held when the
noise modulation frequency (fmn) was changed for the same fc
values tested (Fig. 7B). When noise modulation frequencies were
varied, previous studies have found that the thresholds increased
due to a reduction in the duration of the masker dip, and thus
smaller integration time (e.g., Velez and Bee 2010). In these
experiments, tone detection thresholds were highest when fmn=10 Hz
and were lower for the other values of fmn. However, thresholds at
fmn=20 Hz were lower than those for higher fmn values, a trend that
held for all fc values (ANOVA after resampling, p>0.17).
One concern is that when the modulation frequency
was changed, the instantaneous phase of the tone
modulation waveform and the noise modulation waveform
changed as a function of time. If a subject had
multiple looks at the stimuli during the tone presentation
(i.e., the subject were to sample instantaneous signal
and noise waveforms multiple times) and based
the response on instantaneous phase difference, then
the phase difference at tone onset would have no influence
on the effect of modulation frequency on
detection thresholds. This was tested by measuring the
effect of modulation frequency with δϕ=180 °. As a
result of this manipulation, the relationship between
thresholds and modulation frequency had an inverted shape
relative to δϕ=0 °. One example is shown for changes in fms and one
for changes in fmn. Both δϕ=0 ° and δϕ=180 ° cases are shown for
both modulation frequency variations. When δϕ=0 ° and fms was
varied, thresholds at 10 Hz were highest, and thresholds at higher
fms values were not different from each other (Fig. 7C, see red
symbols). When δϕ=180 °, the thresholds at 10 Hz were lower than
thresholds at higher fms values, and the thresholds at higher fms
values were not different from each other (blue symbols and lines,
Fig. 7C). Note that the thresholds at fms≥20 Hz did not differ as a
function of δϕ (Kruskal–Wallis test after resampling, p>0.11 at
every fc value tested). This trend was true for other tone
frequencies tested (data not shown). Similarly, changing the δϕ
value while varying fmn caused the relationship between fmn and
threshold to be inverted relative to δϕ=0 °. When δϕ=0 °,
thresholds were highest at fmn=10 Hz, lowest at fmn=20 Hz, and had
values intermediate between the above two at higher fmn values (red
symbols and lines, Fig. 7D). When δϕ=180 °, thresholds at 10 Hz
were lowest, and thresholds were higher at the other fmn values.
Similar to when fms was varied, the thresholds for fmn≥20 Hz in the
δϕ=180 ° and the δϕ=0 ° conditions were not significantly
different from each other (Kruskal–Wallis test after
resampling, p>0.2 for every fc value). The same trend
was observed at all fc values tested (data not shown).
Predictions of a Model Based on Stimulus Structure
In situations such as this, it is instructive to fit a simple
model to the behavioral data in an attempt to infer the
computations underlying this behavior. Our goal is to
compare the best model with
[Figure 7. Threshold (dB SPL) vs. modulation frequency (10, 20, 40, 80 Hz), with ds=dn=1. A Variation of tone modulation frequency (fms), fmn=10 Hz, δφ=0°. B Variation of noise modulation frequency (fmn), fms=10 Hz, δφ=0°. C Tone modulation frequency (fms), fc=3.2 kHz, δφ=0° and δφ=180°. D Modulation frequency of noise (fmn), fc=3.2 kHz, δφ=0° and δφ=180°. Legend entries pair a subject label (C or D) with a tone frequency (fc=0.8, 1.6, 3.2, 12.8, or 25.6 kHz) in A and B, and with a phase difference in C and D.]
FIG. 7. The effects of varying the frequency of tone or
noise modulations. A Threshold as a function of tone
modulation frequency. Thresholds for detection of tones of varying
carrier frequencies (fc, different colors and symbols; legend with
panel B) in modulated noise, when tone modulation frequencies
were changed. The tone and noise modulations were in
phase during these blocks. B Similar to A, but shows detection
thresholds for tones having the same fc values as in A (different
colors) in modulated noise when the noise modulation
frequencies were varied. C Threshold as a function of tone
modulation frequency when the tone and noise modulations were in
phase (red) or anti-phase (blue) at tone onset for two subjects
(solid and dashed lines, respectively). The tone frequency was
3.2 kHz. D Similar to C, but for variations in frequency of
noise modulation.
empirical results and models of processing at
various stages of the pathway to localize transformations
in signal processing. The sinusoidal change in
threshold with the variation of modulation phase
differences suggests that a difference model would
fit the threshold changes as a result of the manipulation
of tone and noise parameters, reminiscent of the
equalization-cancellation (EC) model proposed for
binaural processing (Durlach 1963). Note that a
secondary formulation of the model would involve
just subtraction of the envelopes, similar to models
proposed by Hall et al. (1988); the trend in the
predicted results of a model that computed envelope
differences matched those of the proposed
model, but getting threshold equivalents proved
problematic since the envelope subtraction model
was independent of stimulus or noise levels. An
alternate formulation of the model computed signal
to noise ratio at the dips of the masker (dips were
designated as the time intervals when the instantaneous
masker levels fell below steady state levels, after
Velez and Bee (2010)). The results of this model
showed trends that did not match the behavioral
results for the effect of manipulating tone modulation
frequency with δϕ=0 ° (see below). To formalize the
difference model in our analysis, we computed the
difference between the amplitude-modulated signal
waveform and the amplitude-modulated noise waveform
(SAM(t)−NAM(t)) as a function of time for each
tone level. The amplitudes of the tone and noise
waveforms were logarithmically transformed so as to
match sensitivity to perceptually related parameters. If
the log-transformed noise amplitude at t=t0 was
larger than the log-transformed signal amplitude at
t=t0, the signal would be masked; in those cases, we set
the difference equal to zero. When the log-transformed
signal amplitude exceeded the log-transformed
noise amplitude (either positive or negative),
the absolute value of the difference was calculated.
The total energy of this difference function was
calculated by integrating the difference waveform
numerically over time, consistent with the finding
that for short duration (<1 s) signals, signal to noise
ratio is best expressed as a dimensionless quantity of
signal energy to noise spectrum level (e.g., Green et
al. 1959). The area under this accumulated difference
curve was calculated using the trapezoidal rule and
should be directly related to the behavioral performance
at that tone level. That is, if the area under the
difference curve increased with the parametric variation,
it is expected that the hit rate would increase.
The area was calculated at all tone levels that were
used in the experiment. We then assumed
that the criterion for behavioral threshold was the
same across all conditions and used a criterion to
define the threshold for a specific set of parameters.
We set the criterion such that the threshold to tones
alone matched audiometric thresholds. The criterion
was varied in the simulation; the specific value of the
criterion changed the absolute threshold level, but
did not affect the change in threshold as a result of
the parametric manipulation. Figure 8 provides the
results of such a model calculation.
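The steps of the difference model can be sketched in a few lines. This is a simplified reading of the description, not the simulation code used for Figure 8: it operates on modulation envelopes rather than full waveforms, floors the envelope at a small value (EPS) so that the logarithm stays finite, and uses a hypothetical effective masker level and criterion. All names and numerical values are assumptions.

```python
import math

EPS = 1e-3  # assumed floor on the envelope so that log10 stays finite

def envelope(t, f_mod, depth, phase_deg=0.0):
    """Sinusoidal amplitude-modulation envelope, 1 + d*sin(2*pi*f*t + phase)."""
    return 1.0 + depth * math.sin(2.0 * math.pi * f_mod * t + math.radians(phase_deg))

def difference_area(tone_db, noise_db, fms=10.0, fmn=10.0, ds=1.0, dn=1.0,
                    dphi_deg=0.0, duration=0.2, n=2000):
    """Area under the rectified log-amplitude difference (units: dB*s)."""
    dt = duration / n
    diffs = []
    for i in range(n + 1):
        t = i * dt
        s_db = tone_db + 20.0 * math.log10(max(envelope(t, fms, ds, dphi_deg), EPS))
        n_db = noise_db + 20.0 * math.log10(max(envelope(t, fmn, dn), EPS))
        diffs.append(max(s_db - n_db, 0.0))  # masked samples contribute zero
    # Trapezoidal rule over the tone duration.
    return dt * (sum(diffs) - 0.5 * (diffs[0] + diffs[-1]))

def model_threshold(criterion, noise_db, levels, **stim):
    """Lowest tested tone level whose area reaches the criterion, else None."""
    for level in sorted(levels):
        if difference_area(level, noise_db, **stim) >= criterion:
            return level
    return None

levels = range(0, 71)            # tone levels in dB, 1-dB steps (hypothetical)
noise_db, criterion = 30.0, 0.9  # hypothetical effective masker level and criterion
thr_in = model_threshold(criterion, noise_db, levels, dphi_deg=0.0)
thr_anti = model_threshold(criterion, noise_db, levels, dphi_deg=180.0)
print(f"in phase: {thr_in} dB, anti-phase: {thr_anti} dB")
```

Because the same criterion is applied to every condition, changing its value shifts all thresholds together, which is why only threshold differences between conditions are informative in a sketch of this kind.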
The results shown in Figure 8 were obtained using
a tone frequency of 1 kHz. The results did not change
with the use of other frequencies, so the results shown
could represent any frequency within the audible
range of the primate. The simulation was such that
the model predicted a threshold of 1.5 dB SPL when a
1-kHz tone was presented alone, and the predicted
threshold in 55 dB noise was 29 dB SPL. These values
were similar to the actual threshold values measured
for the two monkeys used as subjects in this study
(tone alone: monkey C: −0.5 dB; D: −0.8 dB; tone in
55 dB noise: C: 29.5 dB; D: 30.5 dB; see Fig. 1). The
criterion area value used to define threshold for all
subsequent simulations was kept identical to that for
the unmodulated tone presented alone and in noise.
The noise and the tone were then amplitude-modulated,
and the model was run as described. The effects of
manipulating the modulation phase difference on
the model thresholds are shown in Figure 8A. The
thresholds were highest at δϕ=0 °; as the values of δϕ
increased from 0 to 180 °, thresholds decreased and
then increased as δϕ wrapped back around to 360 °
(green circles, Fig. 8A). This trend in model thresholds
closely matches the data shown in Figure 3A
and B. The relationship between the threshold and δϕ
was best fit with a sinusoidal function with an
amplitude of 13.9 dB (green line, Fig. 8A), much like
the behavioral data were fit by a sinusoid (Fig. 3A and
B). The magnitude of the effect of phase difference
on model thresholds was very similar to its effect on
the behavioral thresholds (compare Fig. 3A and B
with Fig. 8A). This suggests that a difference model is
sufficient to capture the effects of changing δϕ.
FIG. 8. Results of a simple energy difference model that
predicts variation in behavioral threshold as a result of the
parameter manipulations for the studies presented here. Parameters
match those used in the experiments. A Effect of varying phase
difference between tone and noise modulations. Circles show
thresholds; solid line represents best fit to the data. B Effect of
varying frequency of noise amplitude modulation. Thresholds are
shown when tone and noise modulation were in phase (red triangles)
and anti-phase (blue diamonds) at tone onset. Dashed and dotted
lines show best fit to the symbols (exponential functions). C
Similar to B, but for variations in tone amplitude modulation
frequency. Inset. The results of behavioral experiments in two
monkeys testing the effects of frequency of tone modulation between
10 and 20 Hz show an undershoot in threshold that matches model
predictions when tone and noise modulations were in phase at tone
onset. D Effects of varying depth of noise modulation. Format is
similar to C. E Similar to D, but parameter varied was depth of tone
modulation.
[Figure 8. Model thresholds (dB SPL). A vs. onset phase difference δφ (0 to 360°). B vs. noise modulation frequency (fmn; 10, 20, 40, 80 Hz). C vs. tone modulation frequency (fms; 10, 20, 40, 80 Hz); inset, behavioral threshold (dB SPL) vs. tone modulation frequency for δφ=0°. D vs. depth of noise modulation (dn, 0 to 1). E vs. depth of tone modulation (ds, 0 to 1). Panels B–E show δφ=0° and δφ=180° conditions.]
Figure 8B and C shows the effect of varying the
frequency of amplitude modulation of the tone or the
noise on the model thresholds. Figure 8B shows the
effects of varying the frequency of noise modulation (fmn)
while keeping the frequency of tone modulation (fms)
constant at 10 Hz. With δϕ=0 °, and as fmn differed from
fms, the thresholds decreased rapidly and then saturated
for fmn≥20 Hz (red triangles, Fig. 8B). The range of
model threshold values and their trend are similar to the
behavioral data shown in Figure 6B. This relationship was
best captured by an exponential function of (fmn−fms)
(red dashed line, Fig. 8B). With δϕ=180 °, the thresholds
increased from a smaller value when fmn=10 Hz and saturated
at fmn≥20 Hz (blue diamonds, Fig. 8B). It is
noteworthy that the model thresholds matched for the
two δϕ values at fmn≥20 Hz, in a manner similar to the
behavioral data (Fig. 7D). This result is consistent with
behavioral thresholds for δϕ=180 °, which increased as fmn
increased and saturated at fmn values larger than 20 Hz
(Fig. 7D). An exponential function of fmn−fms best
fit the model thresholds. Note also that the model predicts
the threshold changes in the δϕ=0 ° condition to be larger
than those in the δϕ=180 ° condition. This is, in fact,
consistent with the behavioral data from both subjects in
this study (see Fig. 7D).
The effects of changing fms while maintaining
constant fmn on threshold are shown in Figure 8C. As
fms differed from fmn, and with δϕ=180 °, the model
predicts thresholds that increased and then saturated
for fms≥20 Hz (blue diamonds, Fig. 8C), similar to the
effect of changing fmn. The behavioral results were
similar to this function (see Fig. 7C). However, when
δϕ=0 °, the area increased as fms increased, attained a
peak value at fms=13 Hz, and then decreased to
saturate for fms≥20 Hz. Because of the low resolution
of the sampling of fms values in the data reported in
Figure 7, the correlation with the behavioral values is
not clear. To clarify the match of this model result
with behavior, we tested the monkeys at fms=11, 12,
and 13 Hz, and the behavioral results are shown in the
inset. The behavioral results (see Fig. 8C, inset) show
that the thresholds initially decreased, attained a minimum at
fms=13 Hz, and then increased at fms=20 Hz. This
trend, which was observed for both monkeys (different
colors, Fig. 8C, inset), matched the model
prediction. Further analyses with the model indicated
that the dip in threshold is matched with the
frequency at which the lowest cross-correlation was
obtained with a 10-Hz sine wave (representing the
noise modulation) for a 200-ms duration signal,
and the frequency at which the modulation envelopes
of the signal and the noise were most
different cumulatively. The range of threshold
values and their trend matched behavioral values.
As with fmn values, the model predicted similar
thresholds in the δϕ=0 ° and δϕ=180 ° conditions
for fms≥20 Hz and predicted smaller threshold changes
in the δϕ=180 ° condition relative to the δϕ=0 ° condition.
Both of these were also
consistent with behavioral results (Fig. 7).
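The cross-correlation argument can be explored with a small frequency scan. The sketch computes the zero-lag, unnormalized inner product of a 10-Hz sine (standing in for the noise modulation) with a sine at each candidate tone modulation frequency over a 200-ms window, both starting in sine phase. These choices (zero-mean modulators, no normalization, zero lag) are assumptions and need not match the analysis performed with the model.

```python
import math

def inner_product(f_sig, f_noise=10.0, duration=0.2, n=2000):
    """Trapezoidal estimate of the integral of sin(2*pi*f_sig*t)*sin(2*pi*f_noise*t)."""
    dt = duration / n
    vals = [math.sin(2.0 * math.pi * f_sig * i * dt) *
            math.sin(2.0 * math.pi * f_noise * i * dt) for i in range(n + 1)]
    return dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

freqs = [10.0 + 0.1 * k for k in range(101)]   # 10 to 20 Hz in 0.1-Hz steps
products = [inner_product(f) for f in freqs]
f_min = freqs[products.index(min(products))]
print(f"lowest inner product with the 10-Hz modulator at about {f_min:.1f} Hz")
```

Under these assumptions the minimum falls between 13 and 14 Hz, in the neighborhood of the 13 Hz at which the behavioral thresholds were lowest; a different normalization or envelope definition would be expected to shift it somewhat.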
The model predictions for the effects of modulation
depth are shown in Figure 8D and E. For these
calculations, fms and fmn were held constant at 10 Hz.
When the depth of tone modulation (ds) increased from
0 to 1, and δϕ=0 °, the predicted thresholds increased
(red triangles, Fig. 8E) and were fit with a straight line.
This trend and its magnitude are both consistent with
the behavioral data (compare with Fig. 5). When ds
increased from 0 to 1, and δϕ=180 °, the predicted
thresholds decreased (blue diamonds, Fig. 8E) and were
best fit with a straight line. This trend and the range of
thresholds predicted were also consistent with the
behavioral data (Fig. 5). Note that the model is
consistent with larger threshold differences in the δϕ=180 °
condition relative to the δϕ=0 ° condition. When
the depth of noise modulation (dn) increased from 0 to
1, and δϕ=180 °, the thresholds decreased (blue
diamonds, Fig. 8D) and were fit with a straight line.
This is consistent with behavioral thresholds decreasing
under the same conditions (Fig. 5). When dn increased
from 0 to 1, and δϕ=0 °, the model thresholds stayed
identical for modulation depths of 0 to 0.75, and then
increased by 2 dB for a noise modulation depth of 1 (red
triangles, Fig. 8D). The model thresholds were fit with a
straight line, the slope of which was not different from
zero (t test for slopes, p=0.473). This result is also similar
to the behavioral data, which suggest that varying dn
does not significantly change amplitude-modulated
tone detection thresholds (Fig. 5). This may be related
to increased sensitivity to tone modulation relative to
noise modulation as a result of two-tone suppression at
the level of the auditory nerve. Note also that the model
is consistent with larger threshold differences in the δϕ=180
° condition relative to the δϕ=0 ° condition. Thus, a
single energy difference accumulation model can
account for all the results.
DISCUSSION
The results of this study show the effects of varying the
temporal relationship between time-varying signal and
time-varying noise. By systematically varying the
parameters that characterize the relationship
between signal and noise modulations, this study shows
that the computations underlying the
detection of signal in noise are consistent with a
differencing operation.
Comparison with Previous Results
The power spectrum model of hearing suggests that during the
masked detection of a sine-wave signal, the
subject utilizes information from the auditory filter that is
centered on the sinusoid to be detected (Moore 2003). While there
is evidence that the system is able to utilize across-frequency
cues in certain circumstances, a test of within- vs.
across-frequency band cues requires manipulating the bandwidth of
the noise, which was not done in this study. The results of this
study are generally consistent with the finding that when the
modulation properties of the signal and the noise are different,
signal detection thresholds are lower; when the modulation
properties of signal and noise are similar, signal detection
thresholds are higher. These results are consistent with previous
findings in humans (e.g., McFadden 1987; Cohen and Schubert 1987;
Fantini and Moore 1994), passerines (Langemann and Klump 2007),
and corvids (Jensen 2007).
Recent studies have suggested that dip listening (listening
selectively during the trough of the masker) is sufficient to
explain changes in detection thresholds or recognition thresholds
(e.g., Velez and Bee 2010, 2011). Some of the results of this study
are generally consistent with dip listening mechanisms that account
for trends in threshold changes. For example, changing δϕ changes
the amount of signal energy in the dip of the masker; the least
signal energy was at δϕ=0°, and the most was at δϕ=180°, which
would result in threshold changes consistent with the trends
observed in Figure 3B. However, dip listening theories predict that
as the modulation frequency of the masker increases, detection
thresholds increase because of the reduced duration of the dips
(e.g., Gustafsson and Arlinger 1994; Bacon et al. 1998; Velez and
Bee 2010, 2011). Those predictions are not consistent with the
findings after the manipulation of noise modulation frequency
(Fig. 7). In fact, the only model that explains the data across all
conditions is the energy difference model (see Fig. 8).
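The dependence of signal energy in the masker dips on δϕ can be
illustrated numerically. The sketch below is not the model used in
this study. It assumes sinusoidal envelopes of the form
1 + d·sin(2π·fm·t + phase), a common modulation frequency of 10 Hz,
full modulation depth, and a simple working definition of a "dip"
as any time at which the noise envelope is below its mean. The
function name and all parameter names are hypothetical.

```python
import math

def signal_energy_in_dips(phase_deg, fm=10.0, depth=1.0, dur=1.0, fs=10000):
    """Fraction of tone-envelope energy that falls inside masker dips.

    Assumptions (illustrative only, not taken from the study):
    - both envelopes are sinusoidal: 1 + depth*sin(2*pi*fm*t + phase)
    - a "dip" is any sample where the noise envelope is below its mean (1.0)
    """
    phase = math.radians(phase_deg)
    energy_in_dip = 0.0
    energy_total = 0.0
    for k in range(int(dur * fs)):
        theta = 2.0 * math.pi * fm * k / fs
        noise_env = 1.0 + depth * math.sin(theta)
        tone_energy = (1.0 + depth * math.sin(theta + phase)) ** 2
        energy_total += tone_energy
        if noise_env < 1.0:  # sample lies in a dip of the masker envelope
            energy_in_dip += tone_energy
    return energy_in_dip / energy_total

# Phase differences between tone and noise modulation, in degrees.
phases = [0, 45, 90, 135, 180, 225, 270, 315]
fractions = {p: signal_energy_in_dips(p) for p in phases}
for p, frac in fractions.items():
    print(f"phase difference {p:3d} deg: tone energy in dips = {frac:.3f}")
```

Under these assumptions, the fraction of tone energy in the dips is
smallest at a phase difference of 0° and largest at 180°, which is
the qualitative pattern that a dip listening account would need in
order to produce the lowest thresholds for anti-phase modulations.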
In general, detection experiments involving modulated sounds have
been done in the context of comodulation. In comodulation masking
release (CMR) experiments, the detectability of static signals of
various sorts was determined in the presence of multiple narrow
bands of noise having either the same or different modulations, or
in the presence of bandpass noise that was amplitude-modulated
(e.g., Hall et al. 1984; Cohen and Schubert 1987; Hall 1986;
McFadden 1986). The present results cannot be directly compared
with CMR studies because both signal and noise were modulated in
this study. A better comparison would be studies of comodulation
detection differences (CDDs), in which subjects were asked to
detect a modulated band of noise that was simultaneously masked by
one or more spectrally non-overlapping noisebands (called cue or
flanking bands) that were also modulated. When the modulation of
the signal band is different from that of the flanking band(s),
detection thresholds can be 10–12 dB better when all of the
flanking band envelopes are the same (co-uncorrelated condition)
than when all the flanking band modulations are different (all
random condition; Cohen and Schubert 1987; McFadden 1987; Wright
1990; Fantini and Moore 1994). Experiments in corvids and
passerines showed similar threshold changes under CDD measurements
(Langemann and Klump 2007; Jensen 2007), suggesting that the CDD is
not specific to humans, but may be a general processing mode used
to segregate sounds in complex environments (Cohen and Schubert
1987; Bee and Micheyl 2008). The experimental conditions in this
paper are similar to the “all correlated” condition in CDD studies
(when the signal and the masker had similar modulations) or the
“all uncorrelated” condition (when the signal and masker had
different modulations). While the experiments in the current study
did not test CMR or CDD explicitly by using bands of noise as
signal or noise, the results are consistent with the large
threshold changes that resulted from changes in the correlation
between the signal and noise in the above studies.
The parameters manipulated in this paper have also been manipulated
previously, but mainly in studies of the CMR, in which signals were
unmodulated and maskers were modulated (e.g., Hall et al. 1988;
Schooneveldt and Moore 1989; Grose and Hall 1989; Fantini 1991).
Many studies have documented that human subjects are able to
discriminate the modulation parameters manipulated here (e.g.,
Wakefield and Edwards 1987; Yost and Sheft 1989; Wakefield and
Viemeister 1990). Most of the experiments in which the signal and
masker were both modulated involved manipulation of the correlation
between the different noisebands (noisebands generated with
different amplitude and phase parameters; McFadden and Wright 1990;
Wright 1990; Borrill and Moore 2002) rather than the depth of
modulation, the modulation phase, or the modulation frequency. The
threshold changes in the current study were roughly comparable to
those seen for human behavior caused by changed noiseband
correlations (e.g., ∼10 dB, McFadden and Wright 1990); however,
maskers in previous studies had no spectral overlap with the signal
(e.g., Cohen and Schubert 1987; Langemann and Klump 2007). Perhaps
the lack of uncertainty of the signal or noiseband modulations
contributed to the large effects in the current study (see the
∼15-dB threshold change for modulating phase difference in Fig. 3,
and the ∼10–20-dB threshold change while manipulating depth of
modulation, Fig. 5).
In general, the detection or discrimination of target sounds
among distractors is facilitated under conditions that promote the
perceptual segregation of targets from interferers, especially if
targets and interferers share some common features (Gockel et al.
1999; Micheyl and Carlyon 1998; Micheyl et al.
2005) or when they vary rapidly and unpredictably over time (Kidd
et al. 1994, 1995, 2002; Micheyl et al. 2007). This suggests that
(1) changing the modulation phase difference between signal and
noise increases the segregation between signal and noise (Fig. 3);
(2) changing the depth of modulation of the tone to values closer to
the depth of noise modulation when the tone and noise modulations
were in phase decreased the segregation of signal and noise
(Fig. 5); (3) when tone and noise modulations were anti-phase,
changing the depth of noise modulation to values closer to the tone
modulation depth improved the segregation of tone and noise
(Fig. 5B and D; probably as a result of enhanced dip listening
arising from deeper dips in noise modulation); (4) changing the
difference between tone and noise modulation frequency increased
segregation when the tone and noise modulations were in phase,
whereas the same stimulus manipulation when the modulations were
anti-phase at tone onset decreased segregation (Fig. 7). All of
these are consistent with theories of auditory scene analysis, which
suggest that when modulation parameters are different, stream
segregation is enhanced (Bregman 1994). This is also consistent with
the idea that amplitude modulation is an important contributor to
object formation (Yost and Sheft 1989) and that the monkeys had
lower thresholds detecting signal from noise when the differences
between the properties of the signal and noise modulations were
larger (implying signal and noise were treated as two different
objects). An addition to the theories of scene analysis here is that
the various factors causing stream segregation are not independent;
rather, they interact in predictable ways (e.g., modulation phase
and depth of modulation).
Alternatively, all the data in this study could be explained by
selectively listening in the dips of the masker. Dip listening
caused enhanced behavioral performance when the masker modulation
frequencies were low (Gustafsson and Arlinger 1994; Bacon et al.
1998; Velez and Bee 2010). However, these data show that the
relationship between the modulation frequencies of signal and masker
forms an important determinant of behavioral performance.
Previous studies have suggested either perceptual segregation of
signal and noise, or dip listening, or suppression in the auditory
pathways as a mechanism to explain effects such as those seen in
this paper (e.g., Borrill and Moore 2002; Moore and Borrill 2002;
see Moore 2003 for a review). The perceptual performance in such
stream segregation or dip listening tasks can be explained by
comparing the different segregated streams (effectively a
subtraction operation, similar to Durlach’s EC model (1963)). This
would suggest that the signal and the noise in this study could be
segregated when the parameters of the signal and noise were
different (e.g., when δϕ≠0, or when fms≠fmn, or when ds≠dn).
Consistent with such a suggestion, the threshold changes as a result
of the manipulations in this study were consistent with a
differencing or comparator operation being performed by the auditory
system (compare Fig. 8 with Figs. 3, 5, and 7). This is consistent
with previous suggestions that subtraction mechanisms may be in play
in a detection task (e.g., Hall et al. 1988). However, dip listening
may also apply to the current study, although it could not apply to
the Hall et al. study; the thresholds were lower when the tone and
the noise modulations were anti-phase at the onset of the signal
relative to when the tone and noise modulations were in phase. In
the cases where modulation frequency was changed, having a higher
tone modulation frequency meant that even if the signal and noise
modulations were in phase at signal onset, there was some signal
during the dip of the noise; when signal and noise modulations were
anti-phase at signal onset, the energy at the dip decreased as a
result of changing the tone modulation frequency, so thresholds
increased as the tone modulation frequency increased (Fig. 7).
However, even in those conditions where dip listening could explain
the behavioral performance, a signal-to-noise difference or
comparison best explained the threshold changes. Some previous
studies have found that perceptual streaming and dip listening did
not apply under certain conditions, and the only mechanism that
could explain detection based on correlations would be neural
suppression (Borrill and Moore 2002).
An interesting finding is that δϕ values that are equally separated
from 180° produced roughly equal thresholds (i.e., thresholds for
45° and 315° phase shifts were very similar, as were thresholds for
90° and 270°, etc.; Figs. 1 and 2). This suggests that the exact
timing of the peaks and troughs of the signal and noise did not
matter, just whether the peaks were coincident or not. Thus, the
behavioral strategy used by the monkeys did not involve the relative
timing of the features of the tone and noise stimulus and was
possibly related simply to the stimulus energy.
The one manipulation that did not affect behavioral thresholds (or
simulated thresholds) was varying the depth of noise modulation
while the tone and noise modulations were in phase at the onset of
the signal (data: Fig. 5C; model: Fig. 8D). Previous results in
macaques (Dylla et al. 2013) and in humans (e.g., Hall et al. 1984)
suggest that detection thresholds in modulated maskers were much
lower than those in steady-state (unmodulated) maskers. These two
results together suggest that the system is highly tuned to the
salience of the tone modulation, and that any advantage provided by
noise modulation was potentially minimal when the tone and noise
were in phase.
An unusual prediction of the model was that the detection
threshold for modulated tones would decrease from its high values
at fms=10 Hz, be lowest at
fms=13 Hz, and then saturate at a higher value for fms≥20 Hz (see
Fig. 8C). Thus, 13 Hz represents the tone modulation frequency at
which an observer would notice the greatest dissimilarity between
the tone and the noise modulation waveforms. This finding lends some
credence to the idea that correlations could play a role in the
generation of perceptual streams. Such sensitivity might require
modulation shape discrimination, such as that observed in the
auditory cortex in macaques (Malone et al. 2007).
Potential Neurophysiological Mechanisms
While the neurophysiological responses under these exact conditions
have not been studied, some studies have examined neuronal responses
when signals are masked by modulated noise. Studies in songbirds
have shown that changes in the envelope correlation between signal
and noisebands cause changes in the response thresholds of neurons
in the analog of the primary auditory cortex (Bee et al. 2007). The
range of changes in these forebrain neuronal thresholds roughly
matches those observed behaviorally in the same species (Langemann
and Klump 2007). Similarly, studies in mammalian species have shown
that cortical neurons modulate their responses in a manner similar
to behavior in response to stimuli in the presence of modulated
maskers (Fishman et al. 2001, 2012). One possible mechanism of
signal detection in the presence of a time-varying masker is that
the masker causes a synchronization of the responses of a population
of neurons, and the presentation of a signal (modulated or
otherwise) desynchronizes the responses of neurons tuned to the
signal parameters from the rest of the neurons (Nelken et al. 1999).
Recent studies have found that neurons in auditory cortex are very
sensitive to changes in amplitude (both increases and decreases) and
function as envelope shape discriminators with a wide range of
response characteristics (Malone et al. 2007, 2010). This suggests
that cortical neurons would be able to respond differentially to the
parameters of the modulations of tones and noise. This is consistent
with results from Sutter’s laboratory that show that cortical
neurons change their responses depending on the modulation
parameters as well as the behavioral state of the animal and the
variations in behavioral performance (Yin et al. 2011; Niwa et al.
2012a, b; Johnson et al. 2012). While those results suggest that
there is enough information in the responses of forebrain neurons to
account for behavior, it is not clear whether such information
represented by auditory objects is present in the earlier parts of
the brain or what exact computations or mechanisms are involved in
the generation of the responses.
Studies in the visual system have implicated early structures like
the primary visual cortex and even the retina in some species in
the processing of local vs. global stimulus properties to account
for scene segmentation (e.g., Olveczky et al. 2003; Baccus et al.
2008; Nothdurft 1994). Very few studies of the auditory system have
looked at neuronal responses in relation to scene segmentation, and
they have been mainly in the auditory cortex (e.g., Fishman et al.
2001, 2012; Fishman and Steinschneider 2010; Gutschalk et al. 2005;
Nelken and Bar-Yosef 2008). Modulations clearly are a major signal
for segregating or integrating sounds. Modulated sounds change
responses in multiple parts of the auditory system. The
manipulations in this study (the onset phase difference, modulation
frequency, and modulation depth) cause changes of activity in many
parts of the auditory system such as the cochlear nucleus (CN;
e.g., Rhode and Greenberg 1994; Joris et al. 1994; Moller 1976) and
the inferior colliculus (e.g., Nelson and Carney 2007; Krishna and
Semple 2000; Langner and Schreiner 1988; Muller-Preuss et al. 1994;
Rees and Moller 1983). However, very few studies have directly
tested the neuronal correlates of signal detection when the masker
is temporally modulated (CN: Pressnitzer et al. 2001; Neuert et al.
2004). The studies in the cochlear nucleus found that very few
neurons in the ventral CN showed neuronal correlates of enhanced
thresholds such as those seen in behavior (Pressnitzer et al. 2001).
However, a majority of neurons in the dorsal CN showed such neuronal
correlates (Neuert et al. 2004), and these threshold enhancements
were postulated to result from wideband inhibition. However, the
magnitude of neuronal threshold changes in the dorsal CN could not
account for behavioral threshold changes. A study in the inferior
colliculus of cats has shown that a majority of inferior colliculus
(IC) neurons show responses that are associated with wideband
inhibition beyond that observed in the CN (Davis et al. 2003). This
suggests that neurons in the IC should show detection thresholds
that are larger than those seen in CN and may be more in line with
behavioral observations. These responses may be further modified at
the level of thalamus and cortex to represent the behaviorally
similar changes in neuronal responses seen in the forebrain during
such tasks (e.g., Bee et al. 2007; Fishman et al. 2012).
ACKNOWLEDGMENTS
This research was funded by a grant from the National Institutes
of Health, R01 DC 11092. The authors would like to thank Mary
Feurtado for help during surgery, and Bruce and Roger Williams for
the hardware. Meagan Quinlan and
Dr. Jason Grigsby collected some preliminary data and performed
some preliminary data analysis.
REFERENCES
BACCUS SA, OLVECZKY BP, MANU M, MEISTER M (2008) A retinal
circuit that computes object motion. J Neurosci 28:6807–6817
BACON SP, OPIE JM, MONTOYA DY (1998) The effects of hearing
loss and noise masking on the masking release for speech
in temporally complex backgrounds. J Speech Lang Hear
Res 41:549–563
BEE MA, MICHEYL C (2008) The cocktail party problem: what is
it? How can it be solved? And why should animal behaviorists
study it? J Comp Psychol 122:235–251
BEE MA, BUSCHERMOHLE M, KLUMP GM (2007) Detecting
modulated signals in modulated noise: (II) neural thresholds in
the songbird forebrain. Eur J Neurosci 26:1979–1994
BORRILL SJ, MOORE BC (2002) Evidence that comodulation
detection differences depend on within-channel mechanisms. J Acoust
Soc Am 111:309–319
BREGMAN AS (1994) Auditory scene analysis: the perceptual
organization of sound. MIT Press, Cambridge
COHEN MF, SCHUBERT ED (1987) The effect of
cross-spectrum correlation on the detectability of a noise band. J
Acoust Soc Am 81:721–723
DAVIS KA, RAMACHANDRAN R, MAY BJ (2003) Auditory processing
of spectral cues for sound localization in the inferior colliculus.
J Assoc Res Otolaryngol 4:148–163
DURLACH NI (1963) Equalization and cancellation theory of
binaural masking level differences. J Acoust Soc Am 35:1206–1218
DYLLA M, HRNICEK A, RICE C, RAMACHANDRAN R (2013) Detection
of tones and their modification by noise in nonhuman primates.
J Assoc Res Otolaryngol 14:547–560
EFRON B, TIBSHIRANI RJ (1993) An introduction to the
bootstrap. Chapman & Hall/CRC, Boca Raton
FANTINI DA (1991) The processing of envelope information
in comodulation masking release (CMR) and envelope discrimination.
J Acoust Soc Am 90:1876–1888
FANTINI DA, MOORE BC (1994) Profile analysis and
comodulation detection differences using narrow bands of noise and
their relation to comodulation masking release. J Acoust Soc
Am 95:2180–2191
FISHMAN YI, STEINSCHNEIDER M (2010) Neural correlates of
auditory scene analysis based on inharmonicity in monkey
primary auditory cortex. J Neurosci 30:12480–12494
FISHMAN YI, VOLKOV IO, NOH MD, GARELL PC, BAKKEN H, AREZZO
JC, HOWARD MA, STEINSCHNEIDER M (2001) Consonance and dissonance of
musical chords: neural correlates in auditory cortex of monkeys and
humans. J Neurophysiol 86:2761–2788
FISHMAN YI, MICHEYL C, STEINSCHNEIDER M (2012) Neural
mechanisms of rhythmic masking release in monkey primary auditory
cortex: implications for models of auditory scene analysis.
J Neurophysiol 107:2366–2382
GANS C (1992) An overview of the evolutionary biology of
hearing. In: Webster DB, Fay RR, Popper AN (eds) The
evolutionary biology of hearing. Springer, New York, pp 3–13
GOCKEL H, CARLYON RP, MICHEYL C (1999) Context dependence
of fundamental-frequency discrimination: lateralized
temporal fringes. J Acoust Soc Am 106:3553–3563
GREEN DM, SWETS JA (1966) Signal detection theory and
psychophysics. Krieger, Huntingdon
GREEN DM, MCKEY MJ, LICKLIDER JCR (1959) Detection of a
pulsed sinusoid in noise as a function of frequency. J Acoust Soc
Am 31:1446–1452
GROSE JH, HALL JW 3RD (1989) Comodulation masking release
using SAM tonal complex maskers: effects of modulation depth
and signal position. J Acoust Soc Am 85:1276–1284
GUSTAFSSON HA, ARLINGER SD (1994) Masking of speech by
amplitude-modulated noise. J Acoust Soc Am 95:518–529
GUTSCHALK A, MICHEYL C, MELCHER JR, RUPP A, SCHERG M, OXENHAM
AJ (2005) Neuromagnetic correlates of streaming in human auditory
cortex. J Neurosci 25:5382–5388
HALL JW (1986) The effect of across-frequency differences
in masking level on spectro-temporal pattern analysis. J Acoust Soc
Am 79:781–787
HALL JW, HAGGARD MP, FERNANDES MA (1984) Detection in noise
by spectro-temporal pattern analysis. J Acoust Soc Am 76:50–56
HALL JW 3RD, GROSE JH, HAGGARD MP (1988) Comodulation
masking release for multicomponent signals. J Acoust Soc Am
83:677–686
HAWKINS JEJ, STEVENS SS (1950) The masking of pure tones and
of speech by white noise. J Acoust Soc Am 22:6–13
JENSEN KK (2007) Comodulation detection differences in the hooded
crow (Corvus corone cornix), with direct comparison to human
subjects. J Acoust Soc Am 121:1783–1789
JOHNSON JS, YIN P, O’CONNOR KN, SUTTER ML (2012) Ability
of primary auditory cortical neurons to detect amplitude modulation
with rate an