
BRAIN RESEARCH 1171 (2007) 52–66

available at www.sciencedirect.com

www.elsevier.com/locate/brainres

Research Report

The temporal representation of the delay of dynamic iterated rippled noise with positive and negative gain by single units in the ventral cochlear nucleus

Mark Sayles, Ian Michael Winter⁎

Centre for the Neural Basis of Hearing, The Physiological Laboratory, Downing Street, Cambridge, CB2 3EG, UK

ARTICLE INFO

Article history:
Accepted 21 June 2007
Available online 6 August 2007

Keywords:
Pitch
Iterated rippled noise
Inter-spike interval
Autocorrelation

ABSTRACT

Spike trains were recorded from single units in the ventral cochlear nucleus of the anaesthetised guinea-pig in response to dynamic iterated rippled noise with positive and negative gain. The short-term running waveform autocorrelation functions of these stimuli show peaks at integer multiples of the time-varying delay when the gain is +1, and troughs at odd-integer multiples and peaks at even-integer multiples of the time-varying delay when the gain is −1. In contrast, the short-term autocorrelation of the Hilbert envelope shows peaks at integer multiples of the time-varying delay for both positive and negative gain stimuli. A running short-term all-order interspike interval analysis demonstrates the ability of single units to represent the modulated pitch contour in their short-term interval statistics. For units with low best frequency (≲1.1 kHz) the temporal discharge pattern reflected the waveform fine structure regardless of unit classification (Primary-like, Chopper). For higher best frequency units the pattern of response varied according to unit type. Chopper units with best frequency ≳1.1 kHz responded to envelope modulation; showing no difference between their response to stimuli with positive and negative gain. Primary-like units with best frequencies in the range 1–3 kHz were still able to represent the difference in the temporal fine structure between dynamic rippled noise with positive and negative gain. No unit with a best frequency above 3 kHz showed a response to the temporal fine structure. Chopper units in this high frequency group showed significantly greater representation of envelope modulation relative to primary-like units with the same range of best frequencies. These results show that at the level of the cochlear nucleus there exists sufficient information in the time domain to represent the time-varying pitch associated with dynamic iterated rippled noise.

© 2007 Elsevier B.V. All rights reserved.

⁎ Corresponding author.
E-mail address: [email protected] (I.M. Winter).
Abbreviations: ACF, autocorrelation function; ANF, auditory nerve fibre; BF, best frequency; CN, cochlear nucleus; CS, sustained chopper; CT, transient chopper; CV, coefficient of variation; DIRN, dynamic iterated rippled noise; FM, frequency modulation; IRN, iterated rippled noise; ISIH, interspike interval histogram; LF, low frequency unit; PL, primary-like; PN, primary-like with notch; PSTH, peri-stimulus time histogram; VCN, ventral cochlear nucleus

0006-8993/$ – see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.brainres.2007.06.098


1. Introduction

1.1. IRN signals and perception

Pitch is an important aspect of our everyday auditory sensation. For example, differences in pitch are a major cue for the perceptual segregation of sound sources, whereas sounds with similar pitches tend to be grouped into auditory objects. Neurophysiological and psychophysical studies of pitch perception have commonly used stimuli with static pitches although it is well known that speech sounds contain modulated pitch contours on multiple time scales (e.g. O'Shaughnessy and Allen, 1983). These include formant and fundamental frequency transitions characteristic of consonants (Liberman et al., 1956) and transitions on longer time scales providing prosodic information (cf. Poeppel, 2003). In some languages changes in pitch can even provide lexical distinction (Stagray et al., 1992). Here we examine the neurophysiological representation of dynamic pitch by utilizing a stimulus, iterated rippled noise (IRN), which has been extensively used to study the representation of static pitch in the nervous system (see below).

Rippled noise is produced by passing a broadband signal such as white noise through a delay-and-add filter. The output signal contains peaks in the amplitude spectrum and elicits the sensation of a pitch at the reciprocal of the delay, d, known as repetition pitch (Bilsen and Ritsma, 1969/70). This effect occurs in our natural environment and was first described by the Dutch physicist Christian Huygens in 1693. In this early demonstration the effect was produced by the interaction of a source of broadband noise (falling water from a fountain) and a nearby reflective surface (a stone staircase). Standing between the fountain and staircase Huygens noticed the emergent pitch, and attributed this to the iterative delay and add process introduced by the series of regularly spaced steps. A similar process can be implemented using digital signal processing to delay a broadband noise by time d, multiply it by a gain factor g and add it back to the original waveform. Iterating the delay and add process increases the pitch strength (Yost et al., 1996), and the resulting signal is known as iterated rippled noise. The iterative delay and add process introduces temporal regularity into the fine structure of the noise and a “ripple” into the long-term power spectrum of the waveform.

When the gain factor g is positive (IRN(+)) the perceived pitch corresponds to 1/d Hz. When the gain is negative (IRN(−)) the perceived pitch depends on the stimulus frequency content and number of iterations. For broadband IRN(−) with high iteration number (>4) the perceived pitch corresponds to 1/2d Hz (Raatgever and Bilsen, 1992; Yost, 1996). The perceived pitch of band-filtered IRN(−), and of broadband IRN(−) with fewer than 4 iterations (Yost, 1996), is ambiguous, with subjects matching the pitch to values corresponding to 0.88/d Hz, 1.14/d Hz and 1/2d Hz (Bilsen and Ritsma, 1969/70; Raatgever and Bilsen, 1992). The difference in pitch between IRN(+) and IRN(−) reflects a difference in the temporal fine structure of the stimuli, which can be demonstrated by comparing their waveform autocorrelation functions (ACFs). The waveform ACF for IRN(+) shows peaks at d and its integer multiples, whereas for broadband IRN(−) there are troughs at d and its odd-integer multiples with peaks at even-integer multiples of d. In contrast, the ACFs of the signal envelopes of IRN(+) and IRN(−) both contain peaks at d and integer multiples thereof (Yost et al., 1998; Shofner, 1999; see Fig. 1). The temporal fine structure of a waveform refers to the rapid variations in pressure, whereas the waveform temporal envelope refers to the slower changes in amplitude of these rapid pressure fluctuations. The envelope of the broadband waveform can be extracted in several different ways. The simplest is to apply half-wave rectification and low-pass filtering to the signal, whereas the “true” envelope in a mathematical sense is the Hilbert envelope, which is the magnitude of the analytic signal (see Hartmann, 1997).
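The delay-and-add construction and the waveform ACF signatures described above can be sketched in a few lines. This is a minimal illustration, not the stimulus-generation code used in the study: the sampling rate, the noise duration, the "add-same" form of the iteration (each pass adds a delayed, scaled copy of the current signal to itself) and the helper names `irn` and `acf_at` are all assumptions made for the example.

```python
import random

def irn(noise, delay, gain, n_iter):
    """Iterated delay-and-add. Each pass adds a delayed, scaled copy of the
    current signal to itself (the 'add-same' form, assumed here).
    `delay` is in samples."""
    y = list(noise)
    for _ in range(n_iter):
        # The comprehension reads the previous pass; y is rebound afterwards.
        y = [y[i] + (gain * y[i - delay] if i >= delay else 0.0)
             for i in range(len(y))]
    return y

def acf_at(x, lag):
    """Normalised autocorrelation of x at a single lag (in samples)."""
    mean = sum(x) / len(x)
    z = [v - mean for v in x]
    return sum(a * b for a, b in zip(z, z[lag:])) / sum(a * a for a in z)

rng = random.Random(0)
fs = 20000                       # illustrative sampling rate (Hz)
d = round(0.004 * fs)            # 4 ms delay expressed in samples
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]   # 1 s of Gaussian noise

irn_pos = irn(noise, d, +1.0, 16)    # IRN[4, +1, 16]
irn_neg = irn(noise, d, -1.0, 16)    # IRN[4, -1, 16]

for name, sig in (("IRN(+)", irn_pos), ("IRN(-)", irn_neg)):
    print(name, "ACF at d:", round(acf_at(sig, d), 2),
          "ACF at 2d:", round(acf_at(sig, 2 * d), 2))
```

With these settings the ACF of IRN(+) should be strongly positive at both d and 2d, whereas that of IRN(−) should be strongly negative at d and positive at 2d, in line with the peak and trough pattern described in the text.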

At relatively low frequencies (below approximately 3 kHz in the guinea pig) auditory nerve fibres (ANFs) can represent the temporal fine structure information in their phase-locked discharge patterns. At higher best frequencies, beyond the limit of phase-locking, the temporal discharge pattern reflects the envelope modulation at the output of the peripheral filter. Using band-filtered IRN in psychophysical experiments several authors have demonstrated a change in the perceived pitch of IRN(−) from 1/2d Hz (corresponding to the temporal fine structure) to 1/d Hz (corresponding to the envelope periodicity) as the filter centre frequency is increased (Wiegrebe and Winter, 2001b; Yost et al., 1998; Supin et al., 1994).
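To make the distinction between fine structure and envelope concrete, the simpler of the two envelope extraction methods mentioned earlier (half-wave rectification followed by low-pass filtering) can be sketched as below. The one-pole smoothing filter, its 500 Hz cut-off, the amplitude-modulated test tone and the function name `halfwave_lowpass_envelope` are illustrative assumptions; this is neither a model of the auditory periphery nor the Hilbert envelope used for Fig. 1.

```python
import math

def halfwave_lowpass_envelope(x, fs, cutoff_hz):
    """Half-wave rectify x, then smooth it with a one-pole low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, state = [], 0.0
    for v in x:
        state += alpha * (max(v, 0.0) - state)   # rectify, then smooth
        env.append(state)
    return env

def mean_of(values):
    return sum(values) / len(values)

fs = 20000                             # illustrative sampling rate (Hz)
carrier_hz, mod_hz = 3000.0, 100.0     # fine structure and envelope rates
x = [(1.0 + math.cos(2 * math.pi * mod_hz * i / fs))
     * math.sin(2 * math.pi * carrier_hz * i / fs)
     for i in range(fs // 10)]         # 100 ms amplitude-modulated tone

env = halfwave_lowpass_envelope(x, fs, cutoff_hz=500.0)

peak = mean_of(env[380:420])      # about 20 ms: maximum of the modulator
trough = mean_of(env[480:520])    # about 25 ms: minimum of the modulator
print(round(peak, 3), round(trough, 3))
```

The smoothed output follows the 100 Hz modulation and largely discards the 3 kHz carrier, which is the sense in which a response "to the envelope" differs from a response to the fine structure.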

Fig. 1 – Waveforms, spectrograms and autocorrelation functions (ACFs) of the waveform as well as the Hilbert envelope of DIRN[2, 1, +1, 16] (top row) and DIRN[2, 1, −1, 16] (bottom row). Time-windowed ACFs (middle and right columns) are calculated in 25 ms sliding Hanning windows slid in 5 ms steps, for the waveform (middle column) and the Hilbert envelope (right column). Each ACF(t) is normalised for stimulus periodicity at t and averaged through time to yield the normalised ACF. Colour scale bars correspond to both plots in each column.

In the environment the effects of the delay and add process are particularly salient when the broadband source is in motion, so that the delay between the direct and reflected sounds changes as a function of time, resulting in a modulated pitch sensation. For example, the sound of a jet airplane and the reflections of this broadband noise from nearby buildings result in a filtered signal with a moving delay as the airplane takes off (Hartmann, 1997). Bilsen and Ritsma (1969/70) describe a similar situation involving a steam train blowing off steam whilst stationary at a platform. Standing on the platform the listener hears a repetition pitch due to reflected sound from the station platform adding back (after some delay) with the original broadband noise from the steam at the listener's ears. As they walk towards the locomotive the listener hears the pitch becoming progressively lower. Both of these examples involve the introduction of a time delay between the arrival of the direct and indirect components of sound at the ears which itself varies as a function of time. If the travel-time (and hence the time delay, d) between a series of steps was not constant, but monotonically increasing or decreasing with step number, we might also predict a modulated repetition pitch. Such architecture has been suggested as the origin of the chirped-echo effect noticed at the ancient Mayan stone pyramid at Chichen Itza in Mexico (Lubman, 1998; Declercq et al., 2004). In response to a handclap (or presumably any other broadband impulsive sound) the listener, standing at the foot of the pyramid, hears an echo containing a frequency glide, which has recently been explained as a repetition pitch (Bilsen, 2006) in which the repetition period grows gradually longer. Bilsen (2006) notes that at the pyramid a continuous sound source such as broadband noise would not elicit the same frequency glide percept due to the simultaneous mixing of all the different delays.

It is possible, though, with digital techniques, by making the delay itself a function of time, to create IRN with a non-static pitch; this has been termed dynamic iterated rippled noise (DIRN) (Denham, 2005; see Fig. 1). A single iteration of this delay(t)-and-add process is theoretically equivalent to the steam train example explained by Bilsen and Ritsma (1969/70). Increasing the number of iterations makes the pitch glide percept increasingly salient. This makes DIRN an attractive stimulus for the study of modulated pitch contour representation since any pitch contour can be easily imposed on the signal by changing the delay function accordingly (Denham, 2005; Krishnan et al., 2006; Swaminathan et al., 2007). Here we have studied the simple case of a linear transition in the “fundamental” frequency (Fig. 1). Our stimuli are described as DIRN[d0, d1, g, n] where d0 is the delay at the start of the stimulus, d1 the delay at the end (always d0/2), g is the gain (+1 or −1), and n is the number of iterations (16 throughout). Similar nomenclature has previously been used to describe static IRN signals as IRN[d, g, n] (e.g. Neuert et al., 2005). Fig. 1 shows waveforms, spectrograms and ACFs of the waveform and of the Hilbert envelope of DIRN(+) and DIRN(−). The spectrograms (Fig. 1, left column) show peaks at 1/d(t) Hz and integer multiples for DIRN(+) and peaks at odd-integer multiples of 1/2d(t) Hz for DIRN(−). The waveform ACFs (Fig. 1, middle column) show clear differences between the two stimuli with peaks at d(t) and integer multiples for DIRN(+). For DIRN(−) there are peaks in the waveform ACF at even-integer multiples of d(t), and troughs at odd-integer multiples of d(t). The Hilbert envelope ACF (Fig. 1, right column) shows peaks at integer multiples of d(t) for both DIRN(+) and DIRN(−).
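A DIRN[d0, d1, g, n] stimulus and one frame of the time-windowed waveform ACF can be sketched as follows. This is a hedged illustration only: the 2 s duration, the 10 kHz sampling rate, the rounding of d(t) to whole samples, the "add-same" iteration and the function names `dirn` and `windowed_acf` are assumptions for the example; only the 25 ms Hanning window is taken from the Fig. 1 caption.

```python
import math
import random

def dirn(noise, fs, d0, d1, gain, n_iter):
    """DIRN[d0, d1, g, n]: delay-and-add in which 1/d(t) glides linearly
    from 1/d0 to 1/d1 over the stimulus (delays given in seconds)."""
    n = len(noise)
    f0, f1 = 1.0 / d0, 1.0 / d1
    # Delay in whole samples at every instant (rounding is a simplification).
    delay = [round(fs / (f0 + (f1 - f0) * i / (n - 1))) for i in range(n)]
    y = list(noise)
    for _ in range(n_iter):
        y = [y[i] + (gain * y[i - delay[i]] if i >= delay[i] else 0.0)
             for i in range(n)]
    return y, delay

def windowed_acf(x, centre, win_len, lag):
    """Normalised ACF at one lag inside a Hanning window centred on `centre`."""
    start = centre - win_len // 2
    seg = x[start:start + win_len]
    mean = sum(seg) / win_len
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / (win_len - 1))
            for i in range(win_len)]
    z = [w * (v - mean) for w, v in zip(hann, seg)]
    return sum(a * b for a, b in zip(z, z[lag:])) / sum(a * a for a in z)

fs = 10000                                   # illustrative sampling rate (Hz)
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(2 * fs)]   # 2 s (assumed duration)

dirn_pos, delay = dirn(noise, fs, 0.004, 0.002, +1.0, 16)   # DIRN[4, 2, +1, 16]
dirn_neg, _ = dirn(noise, fs, 0.004, 0.002, -1.0, 16)       # DIRN[4, 2, -1, 16]

win_len = round(0.025 * fs)    # 25 ms Hanning window, as in Fig. 1
centre = round(0.105 * fs)     # one analysis frame, centred at 105 ms
lag = delay[centre]            # the stimulus delay d(t) at that instant

print("lag (samples):", lag)
print("DIRN(+):", round(windowed_acf(dirn_pos, centre, win_len, lag), 2))
print("DIRN(-):", round(windowed_acf(dirn_neg, centre, win_len, lag), 2))
```

Repeating the frame calculation over a range of lags, with the window centre advanced in 5 ms steps, would give a running ACF of the kind plotted in Fig. 1. At the lag equal to d(t) the value should be positive for DIRN(+) and negative for DIRN(−).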

1.2. Representations of IRN pitch in neurons

Because of the relative ease with which the pitch (by changing the delay) and pitch strength (by changing the iteration number) of IRN stimuli can be manipulated, IRN has proved a popular stimulus for examining the representation of pitch in the auditory system at many levels, such as the auditory nerve (ten Kate and van Bekkum, 1988), cochlear nucleus (Bilsen et al., 1975; Neuert et al., 2005; Shofner, 1991, 1999; Verhey and Winter, 2006; Wiegrebe and Winter, 2001a; Winter et al., 2001), inferior colliculus (Meddis and O'Mard, 2006), and auditory cortex (Bendor and Wang, 2005; Hall et al., 2006; Hall and Plack, 2007; Patterson et al., 2002). In this study we have chosen to examine the temporal representation of DIRN in single units of the ventral cochlear nucleus (VCN). The cochlear nucleus (CN) is the site of the first obligatory synapse for all ANFs. Primary-like (PL) responses are obtained from spherical bushy cells in the antero-ventral division (AVCN). These neurons receive large endbulb of Held synapses from ANFs and respond to pure-tone stimulation in a very similar manner to ANFs. A similar response known as primary-like with notch (PN) is obtained from globular bushy cells. Because of the very secure synapse directly onto the soma, bushy cells are capable of preserving the precise timing information present at the level of the auditory nerve (Bourk, 1976; Blackburn and Sachs, 1989; Winter and Palmer, 1990). Chopper units, which correspond to T-multipolar cells, represent another major response type in the CN. In response to acoustic stimulation chopper units fire regularly spaced action potentials; they have intrinsic periodicity. However, at low frequencies chopper units phase-lock to the stimulus fine structure, but the upper frequency limit of this phase-locking ability is lower than in primary-like units (Blackburn and Sachs, 1989; Winter and Palmer, 1990). Chopper units are commonly divided into two classes, transient chopper (CT) and sustained chopper (CS), on the basis of the coefficient of variation of their inter-spike interval distribution in response to a best-frequency (BF) tone (Young et al., 1988). The existence of several different response types in the CN has led to the general view that parallel streams of information ascending to higher levels are established at this early stage of brainstem auditory processing. An understanding of the processing undertaken at this stage is critical for our interpretation of the responses of more distal auditory loci and their representations of perceptually important qualities of acoustic signals. One such perceptual quality is the pitch of complex sounds. Many studies have examined the representation of static pitch in the auditory nerve and CN, from the point of view of a frequency domain and/or a time domain representation.

Temporal theories of pitch rely on the temporal pattern of action potential discharge in ANFs. This information is preserved to varying degrees in different neuronal populations at the level of the CN. The temporal information present in the response of a neuron is often represented as an all-order interspike interval histogram. Studies have previously demonstrated that the predominant interspike interval in an all-order interval analysis of populations of ANFs provides a robust representation of the pitch of many complex sounds (Cariani and Delgutte, 1996a,b). Using IRN several authors have shown that the delay of IRN is represented in the first- and all-order interspike interval histograms (ISIHs) in both ANFs and single units in the CN (ten Kate and van Bekkum, 1988; Neuert et al., 2005; Shofner, 1991, 1999; Winter et al., 2001; Wiegrebe and Winter, 2001a; Verhey and Winter, 2006). Shofner (1999) showed that VCN primary-like units had peaks in their all-order ISIHs at integer multiples of the delay d when the IRN gain was +1 but nulls flanked by a pair of peaks at d and odd-integer multiples of d, and positive peaks at even-integer multiples of d when the gain was −1; thus representing the temporal fine structure of IRN(+/−). In a population of chopper units Shofner (1999) showed a response to envelope modulation, i.e. in response to both positive and negative gain IRN these units showed peaks in the all-order ISIH at d. This result was replicated by Verhey and Winter (2006); however, chopper units in the VCN could also represent the temporal fine structure, and therefore the difference in perceived pitch between IRN(+) and IRN(−), in their temporal discharge pattern if their BFs were less than 1.1 kHz. This work extends the findings of previous studies by examining the dynamic temporal representation of DIRN. By considering a short-term running all-order interspike interval representation we show that both primary-like (PL/PN) and chopper (CS/CT) units in the VCN can represent the modulated fine structure and envelope of DIRN in their short-term interspike interval statistics; providing a running representation of quasi-periodicity and therefore a representation of dynamic pitch. This result was BF dependent with primary-like units able to represent the dynamic pitch to higher BFs than chopper units.

Fig. 2 – The response of a low BF primary-like unit (BF=813 Hz, Unit 1342008) to DIRN(+), DIRN(−), IRN(+) and IRN(−). (A–C) Responses to pure tone stimulation at BF. (A) PSTH in response to a BF tone at 20 dB above threshold. (B) First-order interspike interval distribution in response to the same stimulus as in (A). (C) The mean (solid line) and standard deviation (dashed line) of the interspike interval distribution as a function of time through stimulus presentation (CV=coefficient of variation). (A–C) Binwidth=0.2 ms. (D and E) Responses to DIRN. (D) Time-windowed all-order interspike interval distribution in response to DIRN[4, 2, +1, 16]. The colour scale represents firing rate. The lower part of the panel shows the normalised distribution, collapsed across time. The grey shaded area shows the response to Gaussian noise, and the black area the response to DIRN. (E) Same as D but in response to DIRN[4, 2, −1, 16]. (F and G) Responses of the same unit to static IRN with positive and negative gain at delays corresponding to the extremes of the DIRN stimuli shown in D and E. Dashed lines indicate d and 2d. (D–G) Binwidth=0.1 ms.

Fig. 3 – The response of a low BF sustained chopper unit (BF=877 Hz, Unit 1341004) to DIRN(+), DIRN(−), IRN(+) and IRN(−). (A–C) Responses to pure tone stimulation at BF. (A) PSTH in response to a BF tone at 20 dB above threshold. (B) First-order interspike interval distribution in response to the same stimulus as in (A). (C) The mean (solid line) and standard deviation (dashed line) of the interspike interval distribution as a function of time through stimulus presentation. (A–C) Binwidth=0.2 ms. (D and E) Responses to DIRN. (D) Time-windowed all-order interspike interval distribution in response to DIRN[8, 4, +1, 16]. The lower part of the panel shows the normalised distribution, collapsed across time. (E) Same as D but in response to DIRN[8, 4, −1, 16]. (F and G) Responses of the same unit to IRN with positive and negative gain at delays corresponding to the extremes of the DIRN stimuli shown in D and E. Dashed lines indicate d and 2d. (D–G) Binwidth=0.1 ms.

Fig. 4 – The response of a transient chopper unit (BF=2.63 kHz, Unit 1341007) to broadband (D) and low-pass filtered (E) DIRN(+) and DIRN(−) with three different starting delays, as indicated in the figure. (A–C) Responses to pure tone stimulation at BF. (A) BF tone PSTH at 20 dB above threshold. (B) ISIH corresponding to the PSTH in A. (C) Mean (solid line) and standard deviation (dashed line) of the first-order ISI distribution as a function of time. (A–C) Binwidth=0.2 ms. (D) Responses to broadband DIRN normalised to stimulus periodicity and collapsed across time. (E) The responses to low-pass filtered DIRN. Dashed lines indicate d and 2d. (D–E) Binwidth=0.1 ms.

2. Results

2.1. Single unit data

We have recorded the responses of 89 units (17 PL, 13 PN, 17 CS, 28 CT, and 14 low frequency (LF)) in the VCN to DIRN. Fig. 2 shows the responses of a PL unit with a relatively low best frequency (BF=813 Hz). The top row of Fig. 2 shows the BF-tone peri-stimulus time histogram (PSTH) (Fig. 2A), the first-order interspike interval histogram (ISIH) calculated from the BF-tone response (Fig. 2B) and the mean (solid line) and standard deviation (dashed line) of the interspike interval (ISI) distribution as a function of time through the BF-tone response (Fig. 2C). Fig. 2D shows the response to DIRN[4, 2, +1, 16]. A large peak can be seen following the time-varying delay and a second peak can be seen at twice the delay. Normalising the short-term analysis to the corresponding stimulus periodicity at time t and then averaging the response across time gives the summary plot shown in the lower part of the panel. The grey shaded histogram shows the response of the same neuron to Gaussian noise, normalised and averaged in the same way as for the DIRN stimulus. The black shaded areas show where the response to DIRN is greater than that to Gaussian noise. This normalised histogram shows the peaks in the ISI distribution are centred at integer multiples of the delay. The largest peak is at d. In contrast, the response of the same unit to DIRN[4, 2, −1, 16] (Fig. 2E) shows a null at d(t) flanked by a pair of peaks, and a peak centred at 2d(t). The pattern of peaks in the short-term all-order interval analysis for the unit shown in Fig. 2 corresponds to the moving stimulus periodicity and indicates that this low best-frequency PL unit is responding to stimulus fine structure rather than the waveform envelope; i.e. there is a clear difference between the responses to positive and negative gain stimuli. It has been suggested that the largest peak in an all-order ISIH is the cue to pitch (Cariani and Delgutte, 1996a) and on this basis the results from the unit shown in Fig. 2 suggest a pitch corresponding to 1/d(t) Hz for DIRN(+) and 1/2d(t) Hz for DIRN(−). For comparison purposes we show responses of the same unit to static IRN with positive and negative gain at delays corresponding to the two extremes of the DIRN responses (Figs. 2F and G). The largest peak in each of these histograms corresponds to the perceived pitch for broadband IRN stimuli with >4 iterations. There are also large peaks in the interval histograms either side of d.
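The running all-order interval analysis and its normalisation to the stimulus periodicity can be sketched as below. This is a minimal sketch under stated assumptions and not the analysis code used for the figures: the 50 ms analysis window, 10 ms step, the rule that both spikes of a pair must fall inside the window, the rebinning onto an axis in units of d(t), and the idealised spike train (one spike per stimulus period) are all hypothetical choices. Only the 0.1 ms bin width is taken from the figure captions.

```python
def all_order_isih(spikes, max_interval, bin_width):
    """Count intervals between all pairs of spikes, not only neighbours."""
    n_bins = round(max_interval / bin_width)
    counts = [0] * n_bins
    spikes = sorted(spikes)
    for i, t0 in enumerate(spikes):
        for t1 in spikes[i + 1:]:
            if t1 - t0 >= max_interval:
                break
            counts[min(int((t1 - t0) / bin_width), n_bins - 1)] += 1
    return counts

def running_isih(spikes, t_stop, win, step, max_interval, bin_width):
    """All-order ISIH in successive analysis windows [t, t + win)."""
    starts, frames, t = [], [], 0.0
    while t + win <= t_stop:
        in_window = [s for s in spikes if t <= s < t + win]
        frames.append(all_order_isih(in_window, max_interval, bin_width))
        starts.append(t)
        t += step
    return starts, frames

def normalise_to_delay(frames, delays, bin_width, norm_bin=0.05, norm_max=3.0):
    """Re-express each frame's interval axis in units of the stimulus delay
    at that time, then average the frames across time."""
    n_out = round(norm_max / norm_bin)
    total = [0.0] * n_out
    for hist, d in zip(frames, delays):
        for k, count in enumerate(hist):
            j = int((k + 0.5) * bin_width / d / norm_bin)
            if j < n_out:
                total[j] += count
    return [v / len(frames) for v in total]

# Idealised spike train (times in ms): one spike per period of a delay that
# glides from 4 ms to 2 ms, with 1/d(t) changing linearly over 1 s.
T, d0, d1 = 1000.0, 4.0, 2.0

def delay_at(t):
    f0, f1 = 1.0 / d0, 1.0 / d1
    return 1.0 / (f0 + (f1 - f0) * t / T)

spikes, t = [], 0.0
while t < T:
    spikes.append(t)
    t += delay_at(t)

win, step, bin_width = 50.0, 10.0, 0.1
starts, frames = running_isih(spikes, T, win, step, 10.0, bin_width)
delays = [delay_at(s + win / 2) for s in starts]   # d(t) at each window centre
summary = normalise_to_delay(frames, delays, bin_width)
```

For this idealised train the time-averaged histogram `summary` has its mass concentrated near 1 and 2 on the normalised axis (intervals of d and 2d) and is empty near 1.5, which is the pattern the lower panels of Fig. 2D are described as showing for DIRN(+). A response to Gaussian noise processed through the same functions would provide the baseline for comparison.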

Fig. 5 – The responses of a primary-like unit (BF=2.85 kHz, Unit 1349017) to broadband DIRN with three different starting delays (E) and broadband IRN with three different delays (F). (A–C) Responses to pure tone stimulation at BF. (A) BF tone PSTH at 20 dB above threshold. (B) ISIH corresponding to the PSTH in A. (C) Mean (solid line) and standard deviation (dashed line) of the first-order ISI distribution as a function of time. (A–C) Binwidth=0.2 ms. (D) Mean spike waveform calculated from 1351 spontaneous spikes; * indicates the position of the small pre-potential. (E) Responses to broadband DIRN normalised to stimulus periodicity and collapsed across time. (F) Responses to broadband IRN. Dashed lines indicate d and 2d. (E–F) Binwidth=0.1 ms.

Fig. 6 – (A) All-order interval distribution peaks in response to IRN[4, −1, 16] as a function of BF. (B) Normalised peak position in response to IRN[4, −1, 16], IRN[8, −1, 16] and IRN[16, −1, 16] as a function of harmonic region. (A and B) Data points are the positions of peaks in the all-order interspike interval distribution with magnitude greater than the mean plus twice the standard deviation of the response to Gaussian noise. (A) Curved dashed lines indicate the peak positions predicted by Bilsen and Ritsma (1969/70) from their equation relating the predicted autocorrelation peaks (τ) to filter centre frequency (CF); τ=d±1/2CF. Horizontal dashed lines indicate 2d and 4d. (B) Horizontal dashed lines indicate 1/1.14d and 1/0.88d. Vertical dashed lines indicate the positions of the 3rd and 5th harmonics. (A) Star symbol=chopper (CT/CS), circle=primary-like (PL/PN), and triangle=low frequency unit (LF).

Primary-like units such as that shown in Fig. 2 are capable of preserving the fine temporal information present in the auditory nerve. It has been suggested that this information is transformed at the level of the cochlear nucleus into a temporal place code in units characterised by a CS response pattern (Wiegrebe and Winter, 2001a; Winter et al., 2001; Wiegrebe and Meddis, 2004). Although their responses are often thought of as being dominated by their intrinsic chopping rate, chopper units of low BF can represent the fine structure of static IRN stimuli in their temporal discharge statistics (Verhey and Winter, 2006). The responses of such a low-BF CS unit (BF=877 Hz) to both dynamic and static IRN are shown in Fig. 3. These responses are similar to those shown for the PL unit in Fig. 2. Note, however, that the stimulus delay starts at 8 ms. There are clear peaks at d(t) and 2d(t) in the short-term all-order interval analysis for DIRN[8, 4, +1, 16] (Fig. 3D). With negative gain (Fig. 3E) there is a null at d(t) with a small peak at 2d(t). In this case the peak at 2d(t) is not the largest, meaning that on the basis of the “largest peak wins” hypothesis this single unit would not indicate a pitch corresponding to 1/2d(t) Hz in response to DIRN(−). The largest peaks are those at interspike intervals surrounding d(t). These peaks, and those surrounding d(t) in Fig. 2E, may contribute to the perception of pitches at 1/0.88d Hz and 1/1.14d Hz reported in psychophysical experiments (e.g. Raatgever and Bilsen, 1992). The large response in Fig. 3D and E at an interspike interval of approximately 2.5 ms (indicated by an arrow) corresponds to this unit's chopping frequency. Figs. 3F and G show the responses to static IRN with positive and negative gain. The response to static IRN is similar to the response to DIRN in that the largest peak in the negative gain histograms (Fig. 3G) does not correspond to the perceived pitch of broadband IRN(−) (1/2d Hz). Despite the peak at 2d not being the largest in the histogram this low frequency sustained chopper unit nevertheless represents the difference in waveform fine structure between the positive and negative gain stimuli, both for IRN and DIRN.
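The “largest peak wins” reading of an all-order interval histogram can be written down in a few lines. The sketch below is illustrative only: the function name `largest_peak_pitch`, the optional `min_interval_ms` argument (a hypothetical way to set aside a peak at the chopping interval) and the toy counts are assumptions for the example and are not data from any recorded unit.

```python
def largest_peak_pitch(hist, bin_width_ms, min_interval_ms=0.0):
    """Return (interval_ms, pitch_hz) for the tallest bin at or beyond
    min_interval_ms, using the bin centre as the interval estimate."""
    first = int(min_interval_ms / bin_width_ms)
    k = max(range(first, len(hist)), key=lambda i: hist[i])
    interval_ms = (k + 0.5) * bin_width_ms
    return interval_ms, 1000.0 / interval_ms

bin_width = 0.5      # ms, coarse bins for the toy example
d = 8.0              # ms, stimulus delay

# Toy histogram shaped like a positive-gain response: peaks at d and 2d.
pos = [0] * 40
pos[16], pos[32] = 50, 30

# Toy histogram shaped like a negative-gain response from a chopper unit:
# a chopping peak near 2.5 ms, peaks flanking d, and a smaller peak at 2d.
neg = [0] * 40
neg[5], neg[14], neg[18], neg[32] = 60, 22, 20, 15

print(largest_peak_pitch(pos, bin_width))
print(largest_peak_pitch(neg, bin_width))
print(largest_peak_pitch(neg, bin_width, min_interval_ms=5.0))
```

In the toy negative-gain case the tallest bin lies at the chopping interval, and once short intervals are excluded it lies at one of the peaks flanking d rather than at 2d. This mirrors the point made above: a unit can encode the difference between positive and negative gain without its largest interval peak matching the perceived pitch of 1/2d Hz.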

Not all VCN chopper units are able to represent the fine structure of broadband rippled noise stimuli. Responses from another unit classified as having a chopper PSTH (CT) are shown in Fig. 4. The BF of this unit is 2.34 kHz. In response to broadband DIRN (Fig. 4D) at three different starting delays this unit responds to the envelope modulation; i.e. there is no difference between the responses to positive and negative gain stimuli. By low-pass filtering the DIRN at 1 kHz and increasing the sound level by 20 dB (Fig. 4E), to stimulate in the tail of the receptive field, the unit responds to the fine structure, with a null at d and a peak at 2d for the negative gain stimuli. This indicates that the responses in Fig. 4D do not reflect an intrinsic limitation of the neuron but rather reflect the fact that with broadband stimuli the neuron is dominated by stimulus components close to BF, resulting in a response to envelope modulation, whereas if the stimulus is limited to frequency regions in the unit's phase-locking range the response can switch to that of a temporal fine-structure encoder. These responses contrast with those presented in Fig. 5 which shows the responses of a relatively high BF (2.85 kHz) primary-like unit. Fig. 5E shows the responses of this high BF primary-like unit to DIRN at three different starting delays with positive (left column) and negative (right column) gain. Notice that in each case there is a clear difference between the responses to DIRN(+) and DIRN(−). This is also the case for the responses of the same unit to IRN with a static delay (Fig. 5F). The responses to the broadband DIRN resemble those of the chopper unit of similar BF in Fig. 4 to low-pass DIRN. Thus even though this primary-like unit is dominated by on-BF inputs it can presumably phase-lock to sufficiently high frequencies to represent the waveform fine structure at nearly 3 kHz. An analysis of this unit's spike waveform revealed the presence of a small pre-potential (Fig. 5D). Winter and Palmer (1990) showed that pre-potential primary-like neurons show the best phase-locking ability of all VCN units, often matching that seen in auditory nerve fibres in the same species (Palmer and Russell, 1986).

The positions of the two peaks flanking the null at d in Figs. 2E, G and 3E, G are characteristic of the unit's best frequency (Bilsen and Ritsma, 1969/70; Raatgever and Bilsen, 1992; Shofner, 1999; Yost et al., 1978). The positions of corresponding peaks around d in the ACF of band-filtered IRN(−) are given by d − 1/(2CF) and d + 1/(2CF), where CF is the filter centre frequency (Bilsen and Ritsma, 1969/70; Yost et al., 1978). Fig. 6A presents a population analysis of the response to static IRN[4, −1, 16]. Datapoints are the positions of peaks in the all-order interspike interval distributions. A peak is any localised maximum having magnitude greater than two standard deviations from the mean of the all-order interval distribution in response to Gaussian noise. The curved dashed lines indicate the peak positions around d ms predicted by Bilsen and Ritsma (1969/70) for a 1/3 octave band-pass filter. Horizontal dashed lines indicate 2d and 4d. The majority of datapoints fall along these predicted intervals. Fig. 6B shows the same data as in Fig. 6A, combined with data from the same units in response to IRN[8, −1, 16] and IRN[16, −1, 16], normalised for stimulus delay and harmonic number. The horizontal dashed lines indicate 1/1.14d and 1/0.88d, the ambiguous pitches reported by subjects listening to rippled noise with negative gain (Bilsen and Ritsma, 1969/70; Raatgever and Bilsen, 1992; Yost et al., 1978; Yost, 1996). In the region of the 3rd to the 5th harmonic (the dominance region for pitch, indicated by the vertical dashed lines) the data fall close to the horizontal dashed lines. There is also a strong response at 2d across the BF axis.
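As a hedged illustration of the prediction quoted above, the following minimal sketch evaluates the two ACF peak positions that flank the null at d for band-filtered IRN(−), namely d − 1/(2CF) and d + 1/(2CF). The function name `flanking_peaks` and the example values (d = 4 ms, a filter centred on the 4th harmonic of 1/d) are assumptions chosen for illustration; they are not part of the original analysis.

```python
def flanking_peaks(d_ms, cf_khz):
    """Return the two ACF peak positions (ms) either side of the null at d
    for band-filtered IRN(-): d - 1/(2*CF) and d + 1/(2*CF).

    With d in ms and CF in kHz, 1/(2*CF) is already expressed in ms.
    """
    half_period_ms = 1.0 / (2.0 * cf_khz)
    return d_ms - half_period_ms, d_ms + half_period_ms


# Assumed example values: d = 4 ms, CF = 4/d = 1 kHz.
d = 4.0
lower, upper = flanking_peaks(d, 4.0 / d)
print(lower / d, upper / d)  # peak positions as a fraction of d
```

For these assumed values the peaks fall at 0.875d and 1.125d, and the two positions converge on d as CF is raised.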

2.2. Population data and a simple model

Verhey and Winter (2006) showed that chopper units with a BF≤1.1 kHz are able to represent the waveform temporal fine structure of IRN[4, ±1, 16] in their all-order interval statistics. This corresponds to human pitch perception, where it has been demonstrated that the pitch of IRN(−) is at 1/2d Hz when the stimulus is high-pass filtered with a cut-off of 1.25 kHz or lower and at 1/d Hz when the stimulus is filtered with a cut-off frequency higher than 1.25 kHz (Wiegrebe and Winter, 2001b). Fig. 7 shows the mean responses to DIRN[8, 4, ±1, 16] of units grouped according to unit type and BF with the same cut-off (1.25 kHz) as in Verhey and Winter (2006). Data are plotted as mean change in firing rate in response to DIRN relative to Gaussian noise presented at the same sound level. Fig. 7A shows this analysis for units classified as showing a chopper

60 B R A I N R E S E A R C H 1 1 7 1 ( 2 0 0 7 ) 5 2 – 6 6

PSTH (CT and CS grouped together). There is a clear difference in the responses to DIRN(+) and DIRN(−) in those chopper units with BFs below 1.25 kHz. This is also the case for primary-like units (Fig. 7B), although the primary-like group shows a more pronounced null at d(t) for the negative gain stimulus compared to chopper units. Chopper units with BFs>1.25 kHz show no appreciable difference in their response to positive and negative gain stimuli. The summary figure for primary-like units with BFs>1.25 kHz (Fig. 7B) includes some units which do show a differential response between DIRN(+) and DIRN(−), such as that shown in Fig. 5, and some units with BFs>3 kHz showing a response to envelope modulation. Therefore the peak at d(t) in the response to DIRN(−) for the high BF group in Fig. 7B is not as clearly defined as in the corresponding positive gain response. A group of low best-frequency units which could not be classified as chopper or primary-like (Fig. 7C) showed the clearest distinction between DIRN(+) and DIRN(−).

Fig. 7 – The mean change in firing rate relative to the response to Gaussian noise for DIRN[8, 4, ±1, 16] grouped according to unit type and BF. The number of individual unit recordings contributing to each average is indicated in the lower left of each panel. Colour scale represents the mean difference in firing rate between DIRN and Gaussian noise expressed on a normalised scale. Binwidth=0.1 ms. (A) Mean responses of units classified as chopper (CT and CS). (B) Mean response of primary-like units (includes PL and PN groups) to the same stimuli as in A. Units grouped according to the same BF range as in A. (C) Mean response for units which could not be grouped as either primary-like or chopper because of strong phase-locking at BF.

Using a simplified model of peripheral auditory processing Verhey and Winter (2006) showed the responses of VCN chopper neurons to IRN(+) and IRN(−) could be well predicted by simple peripheral filtering. Here we have examined the output of a similar model of peripheral auditory processing in response to DIRN (Fig. 8). The model used here consisted of a bank of 100 4th-order gammachirp filters with bandwidths appropriate for the guinea-pig (Evans, 2001) centred logarithmically between 0.1 and 10 kHz. The output of each filter is half-wave rectified and low-pass filtered with a cut-off frequency of 1 kHz. The normalised ACF of the low-pass filtered, half-wave rectified output is calculated for short duration segments in the same way as the individual all-order ISIHs. Fig. 8 shows the average responses to DIRN[8, 4, ±1, 16] of the filters grouped into two categories; those with centre frequencies (CFs)≤1.25 kHz and those with CFs>1.25 kHz. The colour code represents the average difference between the short-term ACFs in response to Gaussian noise and in response to the DIRN stimulus. For filters centred at 1.25 kHz and below (Figs. 8A and B) there is a clear difference between the response to DIRN(+) and DIRN(−), with a peak (red) at delays corresponding to d(t) in Fig. 8A and a null (blue) at those same delays in Fig. 8B. In both cases there is a small peak at 2d(t). Normalising and collapsing through time (Figs. 8E and F) reveals this small peak at 2d(t) more clearly. In contrast the response of filters with CFs>1.25 kHz (Figs. 8C and D) shows no difference between positive and negative gain stimuli. The low-pass filter with a cut-off of 1 kHz at the output of each filter in the model simulates the fall off of phase-locking in guinea-pig VCN chopper units (Winter and Palmer, 1990). Fig. 9A shows the effect of increasing this cut-off frequency to 3 kHz. The change in ACF magnitude at a normalised interval of 1 in response to DIRN relative to Gaussian noise is plotted against filter centre frequency. Closed symbols are for DIRN(−) while open symbols are for DIRN(+). Star symbols are for a low-pass filter at 1 kHz and circles for a low-pass filter at 3 kHz. With the lower cut-off frequency there is a difference between the positive and negative gain stimuli in filters with CFs up to ∼1.3 kHz. Increasing the cut-off frequency to 3 kHz increases the limit of fine structure information to channels centred up to ∼4 kHz. Fig. 9B plots the single unit data in the same format as in Fig. 9A. Qualitatively the single unit data and the model predictions appear similar; however, a more quantitative comparison is not possible due to the unknown relationship between change in firing rate and change in ACF magnitude.

Fig. 8 – Model short-term autocorrelation functions (A–D) and corresponding normalised autocorrelation functions (E–H). The autocorrelation functions (ACFs) were calculated on the output of a simplified model using the half-wave rectified and low-pass filtered output of a bank of 100 gammachirp filters with centre frequencies distributed logarithmically between 0.1 and 10 kHz. The ACFs shown in A–D were determined by subtracting the short-term ACF for Gaussian noise from the short-term ACF for the DIRN and averaging across centre frequency in two groups (CFs≤1.25 kHz, CFs>1.25 kHz). Zero values are represented in green, positive values in yellow through red and negative values in blue. The normalised ACFs (E–H) are calculated by collapsing the short-term ACFs through time in the same way as for the neuronal responses. Here the result is expressed as change in mean ACF magnitude relative to Gaussian noise.

Fig. 9 – Model prediction of the difference between chopper and primary-like units' ability to represent the difference in fine structure between DIRN(+) and DIRN(−) as a function of filter centre frequency (A) and the corresponding neuronal data (B). (A) Change in ACF magnitude between DIRN and Gaussian noise as a function of filter centre frequency (CF) for positive gain (open symbols) and negative gain (closed symbols) stimuli. Stars are values obtained with the low-pass filter cut-off frequency at 1 kHz (to simulate the phase-locking limit of chopper units), circles with the cut-off at 3 kHz (to simulate the phase-locking limit of primary-like units). (B) Change in firing rate between DIRN and Gaussian noise as a function of BF for the three unit types. Data are averaged across stimulus conditions. As in A open symbols indicate positive gain stimuli, closed symbols negative gain stimuli. Star symbols=Chopper units, Circles=Primary-like units, Triangles=Low Frequency units. (A and B) Dashed lines in both plots indicate zero change from Gaussian noise.

An analysis by BF range for the single unit data (Fig. 10) demonstrates a significant difference between the response to positive and negative gain stimuli for all units below 1 kHz (ANOVA, p<0.02). Only those units classified as primary-like or primary-like with a notch showed a significant difference between the response to positive and negative gain stimuli in the BF range 1–3 kHz (ANOVA, p<0.02). As would be expected, based on the phase-locking cut-off in CN units in the guinea pig, neither the chopper nor the primary-like group with BFs above 3 kHz showed a significant difference between responses to positive and negative gain stimuli. High BF chopper units show a significantly greater response to envelope modulation compared to high BF primary-like units (ANOVA, p<0.005).

Fig. 10 – Mean change in firing rate at d relative to Gaussian noise as a function of unit type and BF range. White bars represent responses to positive gain stimuli, black negative gain stimuli. Error bars represent S.E.M. The mean change in firing rate relative to Gaussian noise was significantly different (* one-way ANOVA, Scheffe's post-hoc test, p<0.02) between the positive and negative gain conditions in all three unit types with BFs below 1 kHz, and in PL units with BFs in the range 1–3 kHz. **Chopper units with BFs>3 kHz show a significantly greater change in firing rate than do primary-like units in the same BF range for both positive and negative gain stimuli (** one-way ANOVA, Scheffe's post-hoc test, p<0.005).
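The stages that follow each filter in the simple model of Section 2.2 (half-wave rectification, low-pass filtering at 1 kHz, normalised ACF) can be illustrated with a single-channel sketch. This is not the model used for Figs. 8 and 9: the gammachirp filterbank is omitted, the stimulus is a static IRN with one iteration rather than DIRN with 16, and the one-pole low-pass, the mean removal before the ACF, the 10-kHz sample rate and all function names are assumptions made to keep the example short.

```python
import math
import random


def half_wave_rectify(x):
    """Keep positive samples, set negative samples to zero."""
    return [v if v > 0.0 else 0.0 for v in x]


def one_pole_lowpass(x, cutoff_hz, fs_hz):
    """First-order recursive low-pass (the filter order is an assumption)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs_hz)
    out, state = [], 0.0
    for v in x:
        state = (1.0 - a) * v + a * state
        out.append(state)
    return out


def normalised_acf(x, max_lag):
    """ACF of the mean-removed signal, scaled so that lag 0 equals 1."""
    mean = sum(x) / len(x)
    xc = [v - mean for v in x]
    r0 = sum(v * v for v in xc)
    return [sum(xc[i] * xc[i + lag] for i in range(len(xc) - lag)) / r0
            for lag in range(max_lag + 1)]


def static_irn(noise, lag, gain):
    """One delay-and-add iteration with a fixed delay of `lag` samples."""
    return [v + gain * (noise[k - lag] if k >= lag else 0.0)
            for k, v in enumerate(noise)]


fs = 10000.0                      # Hz, assumed (chosen for speed)
lag = 40                          # 4-ms delay at 10 kHz
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]

acf_at_d = {}
for gain in (+1.0, -1.0):
    channel = static_irn(noise, lag, gain)   # stand-in for one filter output
    processed = one_pole_lowpass(half_wave_rectify(channel), 1000.0, fs)
    acf_at_d[gain] = normalised_acf(processed, lag)[lag]

print(acf_at_d)  # positive at lag d for gain +1, negative for gain -1
```

Because there is no band-pass filtering here, the sketch shows only that the rectified, low-pass filtered waveform keeps the sign of the correlation at d; it cannot reproduce the dependence on centre frequency seen in Figs. 8 and 9.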

3. Discussion

3.1. Temporal fine structure vs. envelope modulation

Using a short-term temporal analysis we have demonstrated a representation of the dynamic pitch of both positive and negative gain DIRN stimuli in populations of VCN primary-like and chopper units. Those units with low BFs were able to represent the difference in pitch between positive and negative gain stimuli, whereas those with relatively high BFs responded to envelope modulation which is common to both the positive and negative gain stimuli. The BF at which this transition occurred was dependent on unit type, with primary-like units able to represent stimulus fine structure up to 3 kHz and chopper units failing to show a representation of fine structure above 1 kHz (Fig. 10). This result is consistent with the lower limit of phase-locking in chopper units compared with primary-like units (Blackburn and Sachs, 1989; Winter and Palmer, 1990). The lower cut-off frequency for phase-locking to stimulus fine structure observed in chopper units (T-multipolar cells) is likely due to the low-pass filter imposed on their responses by their inputs synapsing on the dendritic tree rather than onto the soma as is the case in primary-like units (Bushy cells) (White et al., 1994). A similar hierarchy in the ability of primary-like and chopper units to represent temporal fine structure has been reported in the response to broadband noise (Louage et al., 2005). The traditional view that VCN chopper units code amplitude modulation with greater gain than primary-like units (Frisina et al., 1990a,b; Rhode and Greenberg, 1994; Rhode, 1998; Keilson et al., 1997) was not the case for IRN stimuli in Shofner's (1999) study. Our data suggest that VCN chopper units with BFs>3 kHz do show significant enhancement over primary-like units in the same frequency range in their ability to represent amplitude modulation (Fig. 10). This may be due to the difference in stimuli which in our case were frequency modulated.

The transition from a temporal fine structure to an envelope modulation response with increasing unit BF corresponds to a transition in the perceived pitch of high-pass filtered IRN[4, −1, 16] from 125 Hz (1/2d Hz) to 250 Hz (1/d Hz) when the lower cut-off frequency is increased above 1.25 kHz (Wiegrebe and Winter, 2001b). Using band-pass filtered IRN(+) and IRN(−) it has been shown that as the filter centre frequency is increased above approximately 6 kHz listeners can no longer distinguish between IRN(+) and IRN(−) (Yost et al., 1998; Supin et al., 1994), although even with a pass band of 8–10 kHz listeners can distinguish both IRN(+) and IRN(−) from flat-spectrum noise (Yost et al., 1998), arguing for the use of envelope cues in these high frequency regions. Using DIRN stimuli Denham (2005) shows that human subjects can distinguish between upward-going and downward-going pitch contours when the signals are high-pass filtered above 2.4 kHz and that performance improves as the filter cut-off frequency is lowered. From this observation Denham (2005) has argued that the use of temporal information alone is sufficient to identify the direction of the pitch sweep when spectral cues from resolved peaks in the stimulus are removed; an advantage in discrimination is accrued when low frequency, resolved, components are included.

3.2. Spectral dominance and the pitch of IRN

It is well known that the lower harmonics of a complex sound dominate the pitch percept (Plomp, 1967; Ritsma, 1967; Moore et al., 1985; Dai, 2000). Although there is variation in the exact figures measured in these studies there is broad agreement that the dominant region spans the 3rd through 5th harmonics of a complex sound. The strong influence of the low frequency partials of a complex on the perceived pitch of a complex could arise either from a rate-place representation of resolved peaks in the amplitude spectrum, a temporal response to the stimulus fine structure rather than to the relatively small envelope modulation in higher frequency channels, or a combination of these and other mechanisms. In this regard it is interesting to note the existence of a spectral dominance region for the pitch of IRN around 4/d Hz (Bilsen and Ritsma, 1969/70; Shofner and Yost, 1997).

The analysis presented in Fig. 6 shows the interspike interval distribution peaks in response to IRN(−) as a function of BF. Yost et al. (1978) show a similar analysis (their Fig. 7) based on passing cosine (−) noise (IRN(−) with a single iteration) through octave wide band-pass filters. The major peaks in the ACF of the filtered output were positioned around 1/1.1d and 1/0.9d for a filter centre frequency of 4/d Hz. Increasing the filter centre frequency resulted in the peaks moving closer to d. This is a similar result to that obtained by Bilsen and Ritsma (1969/70) in response to passing cos (−) random interval pulse trains through 1/3 octave band-pass filters. Our data show that in the dominance region around 3/d Hz through 5/d Hz the peaks in the ISIHs correspond to 1/0.88d, 1/1.14d and 2d (Fig. 6B). As unit BF increases the peaks move closer to d. This result is consistent with the ambiguous pitches of IRN(−) being determined by temporal processing within the dominance region for pitch.

3.3. The chopper theory of pitch

In addition to their ability to phase-lock to relatively low frequency stimulus fine structure chopper units show band-pass periodicity tuning by virtue of their intrinsic periodicity. Winter et al. (2001) showed that units, including choppers, were tuned to IRN-delays as predicted by peaks in their first-order ISIHs in response to white noise. However, Frisina et al. (1990a,b) found little correlation between intrinsic chopping frequency and best modulation frequency, whereas Kim et al. (1990) found a much higher correlation between a unit's intrinsic oscillation and its best modulation frequency. Wiegrebe and Winter (2001a) provide evidence for a level-independent representation of the delay of IRN(+) in units classified as sustained choppers (CS). Based on these observations Winter et al. (2001) emphasized the role of CS units as a potential pathway carrying pitch related information to higher levels of the auditory system. It has been hypothesised that CS units are a key processing stage in the representation of pitch in which a purely temporal all-order interval code at the level of the auditory nerve is converted to a first-order interval temporal-place code at the level of the CN (Wiegrebe and Winter, 2001a; Wiegrebe and Meddis, 2004). This hypothesis requires an array of chopper units tuned to a range of modulation frequencies covering the range of pitch perception in iso-frequency laminae in the cochlear nucleus such that common periodicity would be represented across the BF axis. Evidence for such a range of chopping frequencies is lacking; most intrinsic chopping periods in the VCN are in the range of 2 to 5 ms, whilst the lower limit of human pitch perception (Krumbholz et al., 2000; Pressnitzer et al., 2001) would require chopping periods in the region of 30 ms. In the case of DIRN stimuli, or any stimuli with a modulated pitch contour such as running speech, the pattern of activity in such a model would take the form of a peak of activation moving through the population of chopper units tuned to the instantaneous periodicity at any moment in time. There would necessarily be some constraints on the accuracy with which this system could encode the pitch at any instant, imposed by the need to integrate information from the auditory nerve over a finite time window. In response to static IRN, or any periodic complex sound, CS units are presumed to provide the most stable representation of the periodicity through time as a result of their ability to maintain regularity throughout continuous stimulation. In contrast, CT units would be expected to represent the periodicity reliably only over the first few milliseconds of their response as their periodicity adapts over time. In the case of a dynamic pitch each population of chopper units tuned to a different periodicity would only be maximally driven for a short time, indicating that the adaptation seen in CT units' firing patterns would add less noise to any system attempting to estimate the instantaneous periodicity from the pattern of activity in chopper units. This could potentially account for the observation that relatively weak pitches tend to become more salient when they change over time (Davis et al., 1951).

3.4. Temporal integration and dynamic pitch

Pitch-evoking sounds are usually periodic, with the pitch being related to the period. In order to estimate the pitch using either spectral or temporal cues the auditory system must integrate information over time. The length of this integration window has been the subject of some debate. A long integration time would potentially render any estimation of the pitch unreliable due to variations in pitch over time, whereas a shorter window reduces this “blurring” effect at the expense of reduced reliability of the estimation process: the time–frequency trade-off. Wiegrebe et al. (1998) provide evidence for two stages of integration separated by a non-linearity. The first stage was a window duration of 1.5 ms, and the second a somewhat longer window. Later work by the same author (Wiegrebe, 2001) argued for a period-dependent time window; the minimum duration for pitch perception being twice the pitch period, with the additional constraint of a minimum integration time of approximately 2.5 ms. Also in favour of a short integration time is the observation that a pulse pair with d=5 ms presented every 500 ms is able to evoke a perceptible pitch (Bilsen and Ritsma, 1970). Using iterated rippled noise stimuli Denham (2005) argues for some tolerance in the pitch period estimation process. When presented with a continuous random sequence of IRN stimuli each shorter than twice the IRN delay subjects do not hear a pitch. However, when these same short IRN stimuli are presented in order of decreasing delay subjects hear a clear upward pitch glide. Pure tone thresholds have been shown to vary inversely with duration up to approximately 100 ms (Moore, 1973). This suggests integration over a relatively long time window. Here we have analysed our data in windows of 80 ms duration, which are refreshed every 10 ms. There is some evidence that dynamic pitch may be perceived differently to static pitch. Instead of the system using a “sampling” mechanism and comparing window segments across time there may be some mechanism which is sensitive to the change in frequency (Sek and Moore, 1999). We show that the peripheral mechanisms allowing a temporal representation of the pitch of a complex sound, both to the temporal fine structure and the envelope periodicity, are robust in the face of a modulated pitch contour when using a fixed FM rate of 2 octaves per second. This is important for temporal theories of pitch processing when considering that running (English) speech with normal intonation contains fundamental frequency transitions at up to 5 octaves per second and typically covering a whole octave (O'Shaughnessy and Allen, 1983). Modulated pitch contours are arguably of even greater importance in tone languages such as Mandarin Chinese and Thai where they form the basis for lexical distinctions (Stagray et al., 1992). Our analysis, which uses a sampling mechanism, does not imply that the information contained in the temporal discharge pattern of VCN units is necessarily interpreted in this way by higher centres in the auditory system, only that information regarding the modulated pitch exists in the temporal domain at the level of the cochlear nucleus. This information must then be extracted in some way by higher levels.

4. Experimental procedures

4.1. The preparation

Experiments were performed on pigmented guinea pigs (Cavia porcellus), weighing between 315 and 610 g. The animals were anaesthetised with urethane (1.0 g/kg, ip); Hypnorm (fentanyl/fluanisone) was administered as supplementary analgesia (1 ml/kg, im). Anaesthesia and analgesia were maintained at a depth sufficient to abolish the pedal withdrawal reflex (front paw). Additional doses of Hypnorm or urethane were administered on indication. Core temperature was monitored with a rectal probe and maintained at 38 °C using a thermostatically controlled heating blanket (Harvard Apparatus). The trachea was cannulated and on signs of suppressed respiration, the animal was ventilated artificially with a pump (Bioscience, UK). Surgical preparation and recordings took place in a sound-attenuated chamber (Industrial Acoustics Company). The animal was placed in a stereotaxic frame, which had ear bars coupled to hollow speculae designed for the guinea pig ear. A mid-sagittal scalp incision was made and the periosteum and the muscles attached to the temporal and occipital bones were removed. The bone overlaying the left bulla was fenestrated and a silver-coated wire was inserted into the bulla to contact the round window of the cochlea for monitoring compound action potentials (CAP). The hole was resealed with Vaseline. The CAP threshold was determined at selected frequencies at the start of the experiment and thereafter upon indication. If the thresholds had deteriorated by more than 10 dB and were non-recoverable (for example, by removing fluid from the bulla) the experiment was terminated. A craniotomy was performed exposing the left cerebellum. The overlying dura was removed and the exposed cerebellum was partially aspirated to reveal the underlying cochlear nucleus. The hole left from the aspiration was then filled with 1.5% agar in saline to prevent desiccation. The experiments performed in this study have been carried out under the terms and conditions of the project licence issued by the United Kingdom Home Office to the second author.

4.2. Neural recordings

Single units were recorded extracellularly with glass-coated tungsten microelectrodes (Merrill and Ainsworth, 1972). Electrodes were advanced in the sagittal plane by a hydraulic microdrive (650W; David Kopf Instruments, Tujunga, CA) at an angle of 45°. Single units were isolated using broadband noise as a search stimulus. All stimuli were digitally synthesized in real-time with a PC equipped with a DIGI 9636 PCI card that was connected optically to an AD/DA converter (ADI-8 DS, RME audio products, Germany). The AD/DA converter was used for digital-to-analog conversion of the stimuli as well as for analog-to-digital conversion of the amplified (× 1000) neural activity. The sample rate was 96 kHz. The AD/DA converter was driven using ASIO (Audio Streaming Input Output) and SDK (Software Developer Kit) from Steinberg (Lloyd, 2002).

After digital-to-analog conversion, the stimuli were equalized (phonic graphic equalizer, model EQ 3600; Apple Sound) to compensate for the speaker and coupler frequency response and fed into a power amplifier (Rotel RB971) and a programmable end attenuator (0–75 dB in 5 dB steps, custom built) before being presented over a speaker (Radio Shack 30-1777 tweeter assembled by Mike Ravicz, MIT, Cambridge, MA) mounted in the coupler designed for the ear of a guinea pig. The stimuli were monitored acoustically using a condenser microphone (B&K 4134) attached to a calibrated 1-mm diameter probe tube that was inserted into the speculum close to the eardrum. Neural spikes were discriminated in software, stored as spike times on a PC and analysed off-line.

4.3. Unit classification

Upon isolation of a unit, its BF and excitatory threshold were determined using audio-visual criteria. Spontaneous activity was measured over a 10-s period. Single units were classified based on their peri-stimulus time histograms (PSTH), the first-order interspike interval distribution and the coefficient of variation (CV) of the discharge regularity. The CV was calculated by averaging the ratio of the standard deviation (σ) of the ISI to its mean (μ) between 12 and 20 ms after onset, as defined by Young et al. (1988). PSTHs were generated from spike times collected in response to 250 sweeps of a 50-ms tone at the unit's BF at 20 dB and 50 dB suprathreshold. Rise–fall time of the tones was 1 ms (cos² gate) and the starting phase was randomised. The tone bursts were repeated with a period of 250 ms. PSTHs were classified as primary-like (PL), primary-like with a notch (PN), sustained chopper (CS) and transient chopper (CT). For some units with very low BFs (<∼0.5 kHz) it was not possible to assign them to one of the above categories. In the absence of a definitive classification these are grouped together as “low frequency” (LF) units.
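The regularity measure described above can be sketched as follows. This is a minimal illustration, assuming that each first-order ISI is assigned to the time bin containing its first spike and that the per-bin CV (σ/μ) is averaged over bins between 12 and 20 ms. The 1-ms bin width, the function name `regularity_cv` and the synthetic spike trains are assumptions, not values taken from the text or from Young et al. (1988).

```python
import random
import statistics


def regularity_cv(sweeps, t_start=12.0, t_end=20.0, bin_ms=1.0):
    """Mean coefficient of variation (sigma/mu) of first-order ISIs.

    sweeps: list of sorted spike-time lists (ms re stimulus onset).
    Each ISI goes into the time bin that contains its first spike; the CV
    is computed per bin and averaged over bins in [t_start, t_end).
    """
    n_bins = int(round((t_end - t_start) / bin_ms))
    bins = [[] for _ in range(n_bins)]
    for spikes in sweeps:
        for t0, t1 in zip(spikes, spikes[1:]):
            if t_start <= t0 < t_end:
                idx = min(int((t0 - t_start) / bin_ms), n_bins - 1)
                bins[idx].append(t1 - t0)
    cvs = []
    for isis in bins:
        if len(isis) >= 2:                      # need two ISIs for an SD
            cvs.append(statistics.stdev(isis) / statistics.mean(isis))
    return statistics.mean(cvs) if cvs else float("nan")


# Synthetic examples (assumed): a perfectly regular train and a
# Poisson-like train, each with a mean ISI of 2 ms.
regular = [[2.0 * k for k in range(20)] for _ in range(5)]

rng = random.Random(1)
irregular = []
for _ in range(200):
    t, train = 0.0, []
    while t < 30.0:
        t += rng.expovariate(0.5)               # mean ISI = 2 ms
        train.append(t)
    irregular.append(train)

cv_regular = regularity_cv(regular)
cv_irregular = regularity_cv(irregular)
print(cv_regular, cv_irregular)
```

A perfectly regular train gives a CV of zero, whereas a Poisson-like train gives a CV near one.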

4.4. Complex stimuli

DIRN was generated from a noise waveform with Gaussian distribution of the instantaneous amplitudes. Each point is then delayed by time d, multiplied by a gain factor g and then added back to the input waveform. The gain g was either +1 (DIRN(+)) or −1 (DIRN(−)). The delay d is a function of time, so that each point in the input waveform is delayed by a different amount. Thus the signal can be described by:

$$y_i(t) = y_{i-1}(t) + g\,y_{i-1}\bigl(t - d(t)\bigr), \quad \text{for } i = 1, \ldots, n. \tag{1}$$

The delay function d(t) is simply the reciprocal of a linear frequency transition. Delay functions were chosen to give the DIRN signals a linear up-going frequency sweep covering one whole octave in 0.5 s; hence the sweep rate is 2 octaves per second. d0 was varied in octave steps between 2 and 32 ms. The output of one delay and add stage served as the input to the next delay and add stage (equivalent to “add same”, Yost et al., 1996). Stimuli were generated with 16 iterations. In addition we presented low-pass filtered Gaussian noise at the same level as the complex stimuli as a control stimulus. All stimuli were 500 ms in duration, presented with a 1-s repetition period, gated on and off with 5 ms cos² ramps and were low-pass filtered with a cut-off frequency of 10 kHz. Stimuli were presented in random order for typically 50 or 100 repetitions at approximately 75 dB SPL. When recording time allowed we also presented IRN(+) and IRN(−) as described in Verhey and Winter (2006) for comparison. Fig. 1 shows example waveforms, spectral representations and short-term waveform and Hilbert envelope autocorrelation functions for DIRN signals; DIRN(+) (top row) and DIRN(−) (bottom row).
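The stimulus construction described above can be sketched in a few lines. This is an illustrative reading of Eq. (1) and of the delay function, not the authors' synthesis code: the delay is rounded to the nearest sample, samples before the start of the waveform are taken as zero, a 20-kHz sample rate is used instead of 96 kHz to keep the pure-Python loop fast, and gating and low-pass filtering are omitted. The function names `delay_function` and `dirn` are hypothetical.

```python
import random


def delay_function(t, d0, sweep_s=0.5):
    """d(t) in seconds: reciprocal of a linear up-going frequency sweep
    from 1/d0 to 2/d0 (one octave) over sweep_s seconds."""
    f0 = 1.0 / d0
    return 1.0 / (f0 * (1.0 + t / sweep_s))


def dirn(noise, fs, d0, gain, n_iter=16, sweep_s=0.5):
    """Apply the delay-and-add stage of Eq. (1) n_iter times, feeding the
    output of each stage into the next.

    Assumptions: delays are rounded to whole samples, and samples that
    would come from before the start of the waveform are set to zero.
    """
    lags = [int(round(delay_function(k / fs, d0, sweep_s) * fs))
            for k in range(len(noise))]
    y = list(noise)
    for _ in range(n_iter):
        prev = y
        y = [v + gain * (prev[k - lags[k]] if k >= lags[k] else 0.0)
             for k, v in enumerate(prev)]
    return y


fs = 20000.0                                   # Hz, assumed for speed
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(int(0.5 * fs))]   # 500 ms

dirn_pos = dirn(noise, fs, d0=0.008, gain=+1.0)   # DIRN(+), d0 = 8 ms
dirn_neg = dirn(noise, fs, d0=0.008, gain=-1.0)   # DIRN(-), d0 = 8 ms
print(delay_function(0.0, 0.008), delay_function(0.5, 0.008))
```

With d0 = 8 ms the delay falls from 8 ms at stimulus onset to 4 ms at 0.5 s, which corresponds to the one-octave upward sweep described in the text.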

4.5. Analysis

Spike trains were analysed using a modified version of the approach taken by Verhey and Winter (2006). We calculated the all-order interspike interval distribution in 80 ms rectangular windowed segments of the response and slid this window in 10 ms steps through the spike train responses. The ISIHs are expressed as firing rate by dividing bin values by the product of binwidth (0.1 ms) and the total number of spikes (cf. Shofner, 1999) and then smoothed with a sliding 0.9 ms Hanning window (Neuert et al., 2005). The interspike interval distribution from each windowed segment is plotted against the time at the centre of the analysis window. In order to provide a measure of the response to stimulus driven periodicity through time we normalise the ISIH from each time-windowed segment of the response to the stimulus periodicity and then average the response through time to yield the normalised ISIH where the dimensionless units along the abscissa are normalised intervals. On this scale periodicity corresponding to d(t) has a value of 1, and that corresponding to 2d(t) a value of 2.
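The running analysis described above can be sketched as follows, under stated assumptions. The sketch computes an all-order ISIH in 80-ms windows stepped by 10 ms, scales it by the product of binwidth and spike count, and expresses an interval on the normalised axis by dividing it by the delay at the window centre. The Hanning smoothing and the averaging through time are omitted, spike times are assumed to be sorted and in ms, and all function names are hypothetical.

```python
def all_order_isih(spike_trains, t_start, t_end, max_interval_ms, bin_ms=0.1):
    """All-order ISIH for spikes falling in [t_start, t_end), pooled over
    sweeps and divided by (binwidth in s) x (number of spikes)."""
    n_bins = int(round(max_interval_ms / bin_ms))
    counts = [0] * n_bins
    n_spikes = 0
    for train in spike_trains:                  # each train must be sorted
        win = [t for t in train if t_start <= t < t_end]
        n_spikes += len(win)
        for i, t0 in enumerate(win):
            for t1 in win[i + 1:]:              # all later spikes: all-order
                idx = int((t1 - t0) / bin_ms)
                if idx >= n_bins:
                    break                       # later intervals are longer
                counts[idx] += 1
    if n_spikes == 0:
        return [0.0] * n_bins
    scale = (bin_ms / 1000.0) * n_spikes
    return [c / scale for c in counts]


def sliding_isih(spike_trains, duration_ms, max_interval_ms,
                 win_ms=80.0, step_ms=10.0):
    """Return (window-centre time, ISIH) pairs for a rectangular window
    slid through the response."""
    out, start = [], 0.0
    while start + win_ms <= duration_ms:
        isih = all_order_isih(spike_trains, start, start + win_ms,
                              max_interval_ms)
        out.append((start + win_ms / 2.0, isih))
        start += step_ms
    return out


def normalised_interval(bin_index, d_ms, bin_ms=0.1):
    """Centre of an ISIH bin expressed in units of the delay d."""
    return (bin_index + 0.5) * bin_ms / d_ms


# Synthetic check (assumed data): a periodic train with a 4.03-ms period.
period_ms = 4.03
train = [k * period_ms for k in range(124)]
windows = sliding_isih([train], duration_ms=500.0, max_interval_ms=10.0)
centre, isih = windows[0]
peak_bin = isih.index(max(isih))
print(len(windows), centre, normalised_interval(peak_bin, period_ms))
```

For a 500-ms response these settings give 43 analysis windows. For the synthetic periodic train the largest ISIH peak lies at a normalised interval close to 1, as expected when the period is used as the normalising delay.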

Acknowledgments

We thank Roy Patterson and two anonymous reviewers for helpful comments on an earlier version of the manuscript. This work was supported by the BBSRC. MS receives financial support from the Frank Edward Elmore fund of the Cambridge clinical school MB/PhD programme.

REFERENCES

Bendor, D., Wang, X., 2005. The neuronal representation ofpitch in primate auditory cortex. Nature 436, 1161–1165.

Bilsen, F.A., Ritsma, R.J., 1969/70. Repetition pitch and itsimplication for hearing theory. Acoustica. 22 63–73.

Bilsen, F.A., Ritsma, R.J., 1970. Some parameters influencing theperceptibility of pitch. J. Acoust. Soc. Am. 47, 469–475.

Bilsen, F.A., ten Kate, J.H., Buunen, T.J., Raatgever, J., 1975.Responses of single units in the cochlear nucleus of the cat tocosine noise. J. Acoust. Soc. Am. 58, 858–866.

Bilsen, F.A., 2006. Repetition Pitch glide from the step pyramid atChichen Itza. J. Acoust. Soc. Am. 120, 594–596.

Blackburn, C.C., Sachs, M.B., 1989. Classification of unit types inthe anteroventral cochlear nucleus: PST histograms andregularity analysis. J. Neurophysiol. 62, 1303–1329.

Bourk, T.R., 1976. Electrical responses of neural units in theanteroventral cochlear nucleus of the cat (PhD thesis).Cambridge MA: MIT.

Cariani, P.A., Delgutte, B., 1996a. Neural correlates of the pitch ofcomplex tones. I. Pitch and pitch salience. J. Neurophysiol. 76,1698–1716.

Cariani, P.A., Delgutte, B., 1996b. Neural correlates of the pitch ofcomplex tones. II. Pitch shift, pitch ambiguity, phaseinvariance, pitch circularity, rate pitch, and the dominanceregion for pitch. J. Neurophysiol. 76, 1717–1734.

Dai, H., 2000. On the relative influence of individual harmonics onpitch judgement. J. Acoust. Soc. Am. 107, 953–959.

Davis, H., Silverman, S.R., McAuliffe, D.R., 1951. Some observationson pitch and frequency. J. Acoust. Soc. Am. 23, 40–42.

Declercq, N.F., Degrieck, J., Briers, R., Leroy, O., 2004. A theoreticalstudy of special acoustic effects caused by the staircase of theEl Castillo pyramid at the Mayan ruins of Chichen Itza inMexico. J. Acoust. Soc. Am. 116, 3328–3335.

Denham, S., 2005. Pitch detection of dynamic iterated rippled noise by humans and a modified auditory model. Biosystems 79, 199–206.

Evans, E.F., 2001. Latest comparisons between physiological and behavioural frequency selectivity. In: Breebaart, D., Houtsma, A., Kohlrausch, A., Prijs, V., Schoonhoven, R. (Eds.), Proceedings of the 12th International Symposium on Hearing, Physiological and Psychophysical Bases of Auditory Function. Shaker BV, Maastricht, pp. 382–387.

Frisina, R.D., Smith, R.L., Chamberlain, S.C., 1990a. Encoding of amplitude modulation in the gerbil cochlear nucleus: I. A hierarchy of enhancement. Hear. Res. 44, 99–122.

Frisina, R.D., Smith, R.L., Chamberlain, S.C., 1990b. Encoding of amplitude modulation in the gerbil cochlear nucleus: II. Possible neural mechanisms. Hear. Res. 44, 123–142.

Hall, D.A., Edmonson-Jones, A.M., Fridriksson, J., 2006. Periodicity and frequency coding in human auditory cortex. Eur. J. Neurosci. 24, 3601–3610.

Hall, D.A., Plack, C.J., 2007. The human ‘pitch centre’ responds differently to iterated noise and Huggins pitch. NeuroReport 18, 323–327.

Hartmann, W.M., 1997. Signals, Sound, and Sensation. Springer, New York.

Keilson, S.E., Richards, V.M., Wyman, B.E., Young, E.D., 1997. The representation of concurrent vowels in the cat anaesthetized ventral cochlear nucleus: evidence for a periodicity-tagged spectral representation. J. Acoust. Soc. Am. 102, 1056–1070.

Kim, D.O., Sirianni, J.G., Chang, S.O., 1990. Responses of DCN-PVCN neurons and auditory nerve fibres in unanaesthetised decerebrate cats to AM and pure tones: analysis with autocorrelation/power-spectrum. Hear. Res. 45, 95–113.

Krishnan, A., Swaminathan, J., Gandour, J., 2006. Pitch encoding of dynamic iterated rippled noise in the human brainstem is sensitive to language experience. Assoc. Res. Otolaryngol. Abs. 37–38.

Krumbholz, K., Patterson, R.D., Pressnitzer, D., 2000. The lower limit of pitch as determined by rate discrimination. J. Acoust. Soc. Am. 108, 1170–1180.

Liberman, A.M., Delattre, P.C., Gerstman, L.J., Cooper, F.S., 1956. Tempo of frequency changes as a cue for distinguishing classes of speech sounds. J. Exp. Psychol. 52, 127–137.

Lloyd, M.E.A., 2002. A fast inexpensive stimulus presentation and data acquisition system for auditory neuroscience. Int. J. Audiol. 41, 263.

Louage, D.H.G., van der Heijden, M., Joris, P.X., 2005. Enhanced temporal response properties of anteroventral cochlear nucleus neurons to broadband noise. J. Neurosci. 25, 1560–1570.

Lubman, D., 1998. Archaeological acoustic study of chirped echo from the Mayan pyramid at Chichen Itza. J. Acoust. Soc. Am. 104, 1763.

Meddis, R., O'Mard, L.P., 2006. Virtual pitch in a computational model. J. Acoust. Soc. Am. 120, 3861–3869.

Merrill, E.G., Ainsworth, A., 1972. Glass-coated platinum-plated tungsten microelectrodes. Med. Biol. Eng. 10, 662–672.

Moore, B.C.J., 1973. Frequency difference limens for short-duration tones. J. Acoust. Soc. Am. 54, 610–619.

Moore, B.C.J., Glasberg, B.R., Peters, R.W., 1985. Relative dominance of individual partials in determining the pitch of complex tones. J. Acoust. Soc. Am. 77, 1853–1860.

Neuert, V., Verhey, J.L., Winter, I.M., 2005. Temporal representation of the delay of iterated rippled noise in the dorsal cochlear nucleus. J. Neurophysiol. 93, 2766–2776.

O'Shaughnessy, D., Allen, J., 1983. Linguistic modality effects on fundamental frequency in speech. J. Acoust. Soc. Am. 74, 1155–1171.

Patterson, R.D., Uppenkamp, S., Johnsrude, I.S., Griffiths, T.D., 2002. The processing of temporal pitch and melody information in auditory cortex. Neuron 36, 767–776.

Palmer, A.R., Russell, I.J., 1986. Phase-locking in the cochlear nerve of the guinea pig and its relation to the receptor potential of inner hair cells. Hear. Res. 24, 1–15.

Plomp, R., 1967. Pitch of complex tones. J. Acoust. Soc. Am. 41, 1526–1533.

Poeppel, D., 2003. The analysis of speech in different temporal integration windows: cerebral lateralisation as ‘asymmetric sampling in time’. Speech Commun. 41, 245–255.

Pressnitzer, D., Patterson, R.D., Krumbholz, K., 2001. The lower limit of melodic pitch. J. Acoust. Soc. Am. 109, 2074–2084.

Raatgever, J., Bilsen, F.A., 1992. The pitch of anharmonic comb filtered noise reconsidered. In: Cazals, Y., Horner, K., Demany, L. (Eds.), Auditory Physiology and Perception. Pergamon, New York, pp. 215–222.

Rhode, W.S., 1998. Neural encoding of single-formant stimuli in the ventral cochlear nucleus of the chinchilla. Hear. Res. 117, 39–56.

Rhode, W.S., Greenberg, S., 1994. Encoding of amplitude modulation in the cochlear nucleus of the cat. J. Neurophysiol. 71, 1797–1825.

Ritsma, R.J., 1967. Frequencies dominant in the perception of the pitch of complex sounds. J. Acoust. Soc. Am. 42, 191–198.

Sek, A., Moore, B.C.J., 1999. Discrimination of frequency steps linked by glides of various durations. J. Acoust. Soc. Am. 106, 351–359.

Shofner, W.P., 1991. Temporal representation of rippled noise in the anteroventral cochlear nucleus of the chinchilla. J. Acoust. Soc. Am. 90, 2450–2466.

Shofner, W.P., Yost, W.A., 1997. Discrimination of rippled-spectrum noise from flat-spectrum noise by chinchillas: evidence for a spectral dominance region. Hear. Res. 110, 15–24.

Shofner, W.P., 1999. Responses of cochlear nucleus units in the Chinchilla to iterated rippled noises: analysis of neural autocorrelograms. J. Neurophysiol. 81, 2662–2674.

Stagray, J.R., Downs, D., Sommers, R.K., 1992. Contributions of the fundamental, resolved harmonics, and unresolved harmonics in tone-phoneme identification. J. Speech Hear. Res. 35, 1406–1409.

Supin, A., Popov, V.V., Milekhina, O.N., Tarakanov, N.B., 1994. Frequency resolving power measured with rippled noise. Hear. Res. 78, 31–40.

Swaminathan, J., Ananthanarayan, K., Gandour, J.T., Xu, Y., 2007. Applications of static and dynamic iterated rippled noise to evaluate pitch encoding in the human auditory brainstem. IEEE Trans. Biomed. Eng. doi:10.1109/TBME.2007.896592.

ten Kate, J.H., van Bekkum, M.F., 1988. Synchrony-dependant autocorrelation in eighth-nerve-fiber response to rippled noise. J. Acoust. Soc. Am. 84, 2092–2102.

Verhey, J.L., Winter, I.M., 2006. The temporal representation of the delay of iterated rippled noise with positive and negative gain by chopper units in the cochlear nucleus. Hear. Res. 216–217, 43–51.

Wiegrebe, L., Patterson, R.D., Demany, L., Carlyon, R.P., 1998. Temporal dynamics of pitch strength in regular interval noises. J. Acoust. Soc. Am. 104, 2307–2313.

Wiegrebe, L., 2001. Searching for the time constant of neural pitch integration. J. Acoust. Soc. Am. 109, 1082–1091.

Wiegrebe, L., Winter, I.M., 2001a. Temporal representation of iterated rippled noise as a function of delay and sound level in the ventral cochlear nucleus. J. Neurophysiol. 85, 1206–1219.

Wiegrebe, L., Winter, I.M., 2001b. Psychophysics and physiology of regular interval noise: critical experiments for current models and evidence for a first order, temporal pitch code. In: Breebaart, et al. (Eds.), Physiological and Psychophysical Bases of Auditory Function. Shaker Publishing, Maastricht.

Wiegrebe, L., Meddis, R., 2004. The representation of periodic sounds in simulated sustained chopper units in the ventral cochlear nucleus. J. Acoust. Soc. Am. 115, 1207–1217.

Winter, I.M., Palmer, A.R., 1990. Responses of single units in the anteroventral cochlear nucleus of the guinea pig. Hear. Res. 44, 161–178.

Winter, I.M., Wiegrebe, L., Patterson, R.D., 2001. The temporal representation of the delay of iterated rippled noise in the ventral cochlear nucleus of the guinea-pig. J. Physiol. 537, 533–566.

White, J.A., Young, E.D., Manis, P.B., 1994. The electrotonic structure of regularly spiking neurons in the ventral cochlear nucleus may determine their response properties. J. Neurophysiol. 71, 1774–1786.

Yost, W.A., Hill, R., Perez-Falcon, T., 1978. Pitch and pitch discrimination of broadband signals with rippled power spectra. J. Acoust. Soc. Am. 63, 1166–1173.

Yost, W.A., 1996. Pitch of iterated rippled noise. J. Acoust. Soc. Am. 100, 511–518.

Yost, W.A., Patterson, R.D., Sheft, S., 1996. A time-domain description for the pitch strength of iterated rippled noise. J. Acoust. Soc. Am. 99, 1066–1078.

Yost, W.A., Patterson, R.D., Sheft, S., 1998. The role of envelope in processing iterated rippled noise. J. Acoust. Soc. Am. 104, 2349–2361.

Young, E.D., Robert, J.M., Shofner, W.P., 1988. Regularity and latency of units in the ventral cochlear nucleus: implications for unit classification and generation of response properties. J. Neurophysiol. 60, 1–29.