Neurotelemetry reveals putative predictive activity in HVC ...

43
Copyright © 2020 Ma et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed. Research Report: Regular Manuscript Neurotelemetry reveals putative predictive activity in HVC during call-based vocal communications in zebra finches https://doi.org/10.1523/JNEUROSCI.2664-19.2020 Cite as: J. Neurosci 2020; 10.1523/JNEUROSCI.2664-19.2020 Received: 11 November 2019 Revised: 22 May 2020 Accepted: 11 June 2020 This Early Release article has been peer-reviewed and accepted, but has not been through the composition and copyediting processes. The final version may differ slightly in style or formatting and will contain links to any extended data. Alerts: Sign up at www.jneurosci.org/alerts to receive customized email alerts when the fully formatted version of this article is published.

Transcript of Neurotelemetry reveals putative predictive activity in HVC ...

Copyright © 2020 Ma et al.This is an open-access article distributed under the terms of the Creative Commons Attribution4.0 International license, which permits unrestricted use, distribution and reproduction in anymedium provided that the original work is properly attributed.

Research Report: Regular Manuscript

Neurotelemetry reveals putative predictiveactivity in HVC during call-based vocalcommunications in zebra finches

https://doi.org/10.1523/JNEUROSCI.2664-19.2020

Cite as: J. Neurosci 2020; 10.1523/JNEUROSCI.2664-19.2020

Received: 11 November 2019Revised: 22 May 2020Accepted: 11 June 2020

This Early Release article has been peer-reviewed and accepted, but has not been throughthe composition and copyediting processes. The final version may differ slightly in style orformatting and will contain links to any extended data.

Alerts: Sign up at www.jneurosci.org/alerts to receive customized email alerts when the fullyformatted version of this article is published.

1

Neurotelemetry reveals putative predictive activity in HVC during call-based vocal 1

communications in zebra finches 2

3

Abbreviated title: Neural prediction in zebra finches 4

5

Shouwen Ma, Andries Ter Maat, Manfred Gahr 6

Behavioural Neurobiology, Max Planck Institute for Ornithology, Eberhard-Gwinner-7 Straße, 82319, Seewiesen, Germany 8

9

Corresponding author: S.M. ([email protected]) 10

Number of pages: 35 11

Number of figures: 6 12

Number of words for abstract: 157 13

Number of words for introduction: 540 14

Number of words for discussion: 1,483 15

16

Declaration of interest: The authors declare no competing interests 17

18

Acknowledgements: We thank Lisa Trost and Dr. Susanne Seltmann for their help 19 with the surgery, Markus Abels and Hannes Sagunsky for their custom-written 20 software and technical support. We thank Dr. Henrik Brumm, Dr. Erica Ehrhardt and 21 Dr. Susanne Hoffmann for critical readings of the manuscript. We Thank Dr. Jonathan 22 Benichov for comments and English proof of this manuscript. 23

24

Author contributions: S.M., A.T.M., M.G. conception; S.M. performed the 25 experiments and analyzed the data. S.M. and M.G. interpreted the data and wrote 26 the paper. 27

2

Abstract: Premotor predictions facilitate vocal interactions. Here we study such 28

mechanisms in the forebrain nucleus HVC (proper name), a cortex-like sensorimotor 29

area of songbirds, otherwise known for being essential for singing in zebra finches. 30

To study the role of the HVC in calling interactions between male and female mates, 31

we used wireless telemetric systems for simultaneous measurement of neuronal 32

activity of male zebra finches and vocalizations of males and females that freely 33

interact with each other. In a non-social context, male HVC neurons displayed 34

stereotypic premotor activity in relation to active calling and showed auditory-35

evoked activity to hearing of played-back female calls. In a social context, HVC 36

neurons displayed auditory-evoked activity to hearing of female calls only if that 37

neuron showed activity preceding the upcoming female calls. We hypothesize that 38

this activity preceding the auditory-evoked activity in the male HVC represents a 39

neural correlate of behavioral anticipation, predictive activity that helps to 40

coordinate vocal communication between social partners. 41

42

Keywords: sensorimotor, prediction, vocal communication, zebra finch 43

Significance statement 44

Most social-living vertebrates produce large numbers of calls per day and the calls 45

have prominent roles in social interactions. Here we show neuronal mechanisms 46

that are active during call-based vocal communication of zebra finches, a highly 47

social songbird species. HVC, a forebrain nucleus known for its importance in control 48

of learned vocalizations of songbirds, displays predictive activity that may enable the 49

3

male to adjust his own calling pattern in order to produce very fast sequences of 50

male female call exchanges. 51

52

Introduction 53

Many group-living species, such as zebra finches, use a large numbers of calls 54

to interact within their social group and, in particular, with their mates (Zann, 1996; 55

Marler, 2004; Beckers and Gahr, 2010; Gill et al., 2015; Elie and Theunissen, 2016). 56

During social interactions, each individual is confronted with mixtures of sounds that 57

are emitted by various group members. Relevant information can only be retrieved 58

and delivered with an effective system to process auditory inputs while controlling 59

vocal outputs (Prather et al., 2008; Beckers and Gahr, 2012). In zebra finches, calling 60

interaction patterns show that vocal responses can be fast (~200 milliseconds 61

[msec]) and that calling becomes mate-specific during the breeding cycle (ter Maat 62

et al., 2014; Gill et al., 2015). Previous computational and behavioral studies have 63

shown, that animals and humans benefit from predictive control that compensates 64

for sensorimotor delays, thereby enabling effective sensory processing and motor 65

response (Wolpert and Ghahramani, 2000; Wolpert et al., 2011). The ability to 66

predict upcoming calling events may allow zebra finches to achieve such rapid and 67

specific responses. While mechanisms of prediction have been studied in particularly 68

in humans using fMRI techniques within the framework of decision neuroscience 69

(Soon et al., 2008), little attention has been paid to the socio-sexual context and 70

related selection pressures (Mobbs et al., 2018). Thus, the predictive control of social 71

4

communication and the related neural mechanisms of prediction in sensorimotor 72

systems are unknown. 73

To study the neural mechanisms of stimulus-response (sensorimotor) 74

prediction underlying vocal interactions in a social context, we employed telemetric 75

devices that enabled the continuous synchronized monitoring of vocal activity and of 76

neurophysiological activity in HVC (proper name), a sensorimotor cortex-like nucleus 77

known for its control of song pattern (Nottebohm and Arnold, 1976; Yu and 78

Margoliash, 1996; Hahnloser et al., 2002). HVC is an important region of the song 79

control system of zebra finches that has been used extensively to study vocal 80

learning and the neural coding of birdsong (Doupe and Kuhl, 1999; Long and Fee, 81

2008; Mooney, 2009; Vallentin et al., 2016). With telemetric techniques, it has been 82

shown previously that zebra finches respond more contingently to their sexual 83

partners than to other group members (ter Maat et al., 2014; Gill et al., 2015) and 84

that the premotor nucleus RA (robust nucleus of the arcopallium) plays a role in 85

active calling in zebra finches (ter Maat et al., 2014). The motor cortex-like RA is the 86

target area of HVC in the descending song control pathway of songbirds (Nottebohm 87

and Arnold, 1976; Bolhuis and Gahr, 2006). HVC has a number of features that make 88

it possible to predict the calling of a social partner in order to optimize vocal 89

communication, (i) lesion of the HVC to RA connection disrupts the timing of call 90

responses in playback experiments (Benichov et al., 2016), (ii) the sequential timing 91

of motor units is the function of HVC during singing (Yu and Margoliash, 1996; 92

Hahnloser et al., 2002; Long and Fee, 2008), and (iii) HVC obtains environmental 93

information via a direct thalamic input (Wild, 1994; Akutagawa and Konishi, 2005; 94

Coleman et al., 2007). In this study, we analyzed auditory and motor-related 95

5

neuronal activity and their dependence on social context in the male HVC during 96

calling interactions of freely behaving zebra finches. 97

98

99

Materials and Methods 100

101

Animals 102

Animal experiments were carried out according to the regulations of the 103

government of Upper Bavaria (Az. 55.2-1-54-231-25-09). Experimental zebra finches 104

were obtained from our breeding facility. Male and female zebra finches (> 100 days 105

post hatch) were used for the audio- and neuro-telemetry experiments. Zebra finch 106

pairs were kept in custom-made, sound-attenuated chambers. Each sound-107

attenuated chamber was equipped with a microphone (TC20, Earthworks, U.S.A.), a 108

speaker (FRS 8, 30w, 8Ω, VISATON, Germany) and a telescopic antenna for wireless 109

recordings. Zebra finches were kept in a 14/10 Light/Dark cycle (fluorescent lamps), 110

24 oC and 60-70% humidity. 111

112

Sound recording 113

Custom-made wireless microphones (0.6 g, including battery) were used for wireless 114

sound recording (ter Maat et al., 2014). The wireless microphone was placed on the 115

back and fixed with an elastic band around the upper thighs of the bird. The 116

6

frequency modulated radio signals were received with communication receivers 117

(AOR5000, AOR, Ltd., Japan). Audio signals were either fed into an eight channels 118

audio A/D converter (Fast Track Ultra 8R, Avid Technology, Inc. U.S.A.) and recorded 119

with custom-made software or registered on a DASH8X data recorder (Astro-Med, 120

Inc., RI, U.S.A.). 121

122

Playback experiment 123

Playback experiments were performed in cohabitation (male and female were 124

together in a same box) and in isolation (male was separated from female). 125

Recorded stack calls of the female mates were played back by a speaker (FRS 8, 30w, 126

8Ω, VISATON, Germany). Female stack calls used for playbacks were recorded by the 127

in-box microphone (TC20, Earthworks, U.S.A.) prior to the experiments. The intervals 128

between played-back calls were uniformly randomized between 1 and 30 seconds. 129

130

Implantation of electrode and chronic recording of neuronal activity 131

Male zebra finches that showed stable antiphonal interactions with their mates were 132

chosen to carry the neural telemetric device. They were anesthetized using 133

isoflurane inhalation (0.8-1.8% at 0.5l O2/min). During anesthesia, the animals were 134

kept warm on a constant temperature pad (160 x 160 mm, 12V/6W, Thermo, GmbH, 135

Germany) and wrapped in a thin gauze blanket. Some feathers of the head were 136

removed, the skin was disinfected, opened and treated with lidocaine (Xylocain Gel 137

2%, AstraZeneca, Germany). After a window on the skull was opened, the bifurcation 138

7

of the mid-sagittal sinus served as a reference coordinate. A 2 MΩ tungsten 139

electrode (FHC, Bowdoin, U.S.A.) was lowered into HVC using a micromanipulator 140

(Luigs and Neumann, Germany). After verifying the signals of HVC by playing back 141

birds-own-song (BOS), conspecific songs and noise (Margoliash and Fortune, 1992), 142

the window on the skull and the pins for both reference and recording electrode was 143

covered and fixed with dental cement (Tetric evoflow, Ivoclar Vivadent, GmbH, 144

Germany). The animal was equipped with a custom-made neural telemetric device 145

and placed back into the sound box for chronic recordings of both vocal and neural 146

signals (ter Maat et al., 2014). We monitored the neuronal activity during chronic 147

recording. To ensure that the recording site was in HVC, we only selected recording 148

sites that showed premotor activity in relation to singing (Fig. 2B), since several 149

classes of HVC neurons including RA- and X- projecting neurons and interneurons are 150

active during singing (Kozhevnikov and Fee, 2007). For each recording site we found 151

one stack call-related unit, while other units showed activity related to songs, 152

distance calls or showed no specific activity. To further verify electrode placement, a 153

lesion was made at the recording site by connecting the reference and the recording 154

electrodes to the linear stimulus isolator (WPI, Inc, U.S.A.). A current of 4.5 μA was 155

applied for 6 seconds (ter Maat et al., 2014). The brains were then transferred to 4% 156

paraformaldehyde and cryocut in 30 μm sections for Nissl staining. The lesion sites 157

were confirmed with a standard light microscope (Leica, DM 6000B, Germany). 158

159

Amplitude of the stack calls and sensitivity of the backpack microphone 160

8

The sounds of males and females were recorded simultaneously with a calibrated 161

central microphone (TC20, Earthworks, U.S.A.) and with one backpack microphone 162

mounted on the male and another backpack microphone mounted on the female. All 163

events recorded by the central microphone during a 4-hours period were compared 164

with those recorded by the backpacks at that time. In all cases, the backpack 165

microphones mounted on the female or male, respectively, recorded the calls 166

emitted by the microphone carriers. The sound levels of calls were estimated by 167

calibrating between the central microphone and the sound meter (HD600, Extech, 168

U.S.A. with A-weighting, 125 msec response time, 1.4 dB accuracy). 169

170

Behavioral data acquisition and analyses 171

Vocal data were recorded continuously at 22,050 Hz sampling rate and written to 172

WAV files. Vocalizations were extracted from audio files by using custom-written 173

software (sound explorer). The vocalizations of zebra finches were clustered by 174

analyzing their sound features (ter Maat et al., 2014). The peri-stimulus time 175

histograms (PSTH) indicate the times at which males called in relation to the onset of 176

female calls (bin width: 50 msec). Z-Scores were calculated as: , where x, μ and 177

are call counts, mean and standard deviation of call counts, respectively. 178

179

Neurotelemetric recordings and analyses 180

The neurotelemetric signals were received and amplified with communication 181

receivers (AOR5000, AOR, Ltd., Japan). The signals were digitized (Fast Track Ultra 182

9

8R, Avid Technology, Inc. U.S.A.) and recorded with custom-made software or 183

registered on a DASH8X data recorder (Astro-Med, Inc., RI, U.S.A.). We used the 184

same sampling rate (22,050 Hz) to record the neurophysiological data 185

simultaneously with vocalizations of female and male zebra finches. Spike sorting 186

was carried out offline by using Spike2 (CED, Cambridge, U.K.). A template matching 187

algorithm was applied for spike sorting in 300-10,000 Hz bandpass filtered 188

recordings. Principal component analysis (PCA, Spike2) confirmed isolated units and 189

we obtained 2-3 units from each animal (8 birds). We selected one single-unit for 190

each bird for further analysis that showed premotor activity associated with call 191

production. The raster plots and the peri-stimulus time histograms (PSTH) indicate 192

the times at which HVC neurons fired in relation to the onset of vocal events (bin 193

width: 10 msec). Z-Scores were calculated as: , where x, μ and σ are spike counts, 194

mean and standard deviation of spike counts, respectively. The Z-Score is used for 195

indicating a significant increase or decrease in activity with the critical value ±1.645 196

(α = 0.05). The signal-to-noise ratio (SNR) was calculated as: 10 × log , where 197

and are the variabilities of the signal and the noise, respectively. We 198

used an F-test to compare the neuronal activity patterns. We compared the 199

neuronal activity in relation to associated female calls during cohabitation (“coh-aF”) 200

against spontaneous female calls during cohabitation (“coh-sF”), associated female 201

calls during separation (“sep-aF”) and spontaneous female calls during separation 202

(“sep-sF”). In the context of separation, we separated female and male mates into 203

two distant boxes with acoustic interconnections. The sum of the squared distances 204

between the mean curves of “coh-aF” and individual curves of respective groups 205

10

were calculated separately. The null hypothesis is that the difference (D0) between 206

the mean curves of “coh-aF” and the individual curves of “coh-aF” yields the same 207

variability as the difference (Dtest) between the mean curves of “coh-aF” and the 208

individual curves of “coh-sF”, “sep-aF” and “sep-sF”, respectively. The F-ratio 209

between the sum of squared differences between D0 and Dtest and the sum of 210

squared difference within D0 and Dtest was calculated. The degrees of freedom for 211

between-groups and within-groups are dfb and dfw, respectively. The p value for the 212

test is given by: 1− ( , , ), where x denotes F-ratio. 213

214

False discovery rate statistics of auditory-evoked HVC activity 215

In figure 3, we analyzed 8 single-units from 8 male zebra finches for hearing-related 216

activity in HVC. Of each unit, the spikes within [-1, -0.5), [-0.5, 0), [0, +0.5) and [+0.5, 217

1] sec bins were counted in relation to the onsets of the female’s stack calls, which 218

were aligned at 0 sec. For each neuron, we calculated the percentage of spikes 219

occurring within the time window [-0.5, 0) sec in relation to all spikes within the time 220

window [-1, 1] sec. Then we sorted the calling events according to the percentage of 221

spikes occurring during [-0.5, 0) into 10 quantiles, 0-10% spikes, 10-20% spikes, and 222

so on. Of the events of each quantile, we calculated the PSTHs (bin width: 10 msec) 223

for the [-1, +1] sec time window. For these PSTHs we calculated the Z-Scores and 224

defined the peak value of auditory-evoked HVC activity as the maximal Z-Score 225

within the time window [0, +0.5) sec after the onset of female’ stack calls (max Z-226

Scorepost). The maximal Z-Score within the time window [-0.5, 0) sec preceding the 227

female calls is defined as “max Z-Scorepre”. The Z-Scores were converted to p-values 228

11

(n = 80, 10 quantiles of 8 birds). We controlled for the false discovery rate by using 229

Benjamini-Hochberg procedure of q = 0.01 (Benjamini and Hochberg, 1995). 230

231

Experimental design and statistical analysis 232

Eight male and eight female zebra finches were used for the audio- and neuro-233

telemetry experiments (cohabitation, separation). Three of these pairs were 234

additionally used for the isolation experiment. Additional seven pairs were used for 235

the behavioral experiment of call reaction times between mates during cohabitation 236

and separation. In the context of cohabitation, zebra finch pairs were kept in 237

custom-made, sound-attenuated chambers (cohabitation). In the context of 238

separation, we separated female and male mates into two distant boxes with 239

acoustic interconnections (the microphone and the speaker were interconnected). In 240

the context of isolation, male zebra finches were kept alone in sound-attenuated 241

chambers. Statistical analyses for behavioral and neurophysiological data were 242

conducted using R (version 3.5.2). The behavioral data of reaction times were 243

analyzed with paired t-tests. We used F-tests to compare the neuronal activity in 244

relation to different contexts (cohabitation, separation and isolation). False 245

discovery rate statistics (p.adjust, R version 3.5.2) was also used to determine the 246

dependency between the auditory-evoked activity and the predictive activity in HVC. 247

248

249

Results 250

12

Call-based communication between pair-bonded zebra finches 251

Zebra finches live in loose social groups. Although all group members 252

exchange various vocalizations, males and females address their calls with increasing 253

selectivity to their mate during the progressing breeding cycle (Gill et al., 2015). 254

Here, we study the dynamics of such dyadic vocal communications. Vocalizations of 255

males and females were recorded simultaneously using telemetric microphones 256

mounted on their backs, as described previously (Gill et al., 2015; Gill et al., 2016). 257

While song is only uttered by males, male and female zebra finches share an 258

extensive repertoire of calls (Fig. 1A-C), of which the stack calls were the most 259

commonly uttered (65.5 ± 27.4% in male and 89.9 ± 4.8% in female; mean ± s.d., 16 260

males, 16 females). Since stack calling is the preferred mode of communication of 261

paired zebra finches, we chose stack calling for further analysis of neural 262

mechanisms of natural vocal communications. Although the dynamic range of calling 263

reactions in zebra finches is very large, their stack call communication exhibits an 264

antiphonal pattern (ter Maat et al., 2014; Benichov et al., 2016; Ma et al., 2017; Pika 265

et al., 2018). The temporal association of stack calls between male and female of 266

zebra finch pairs showed the calling interaction pattern was consistent over days 267

after pair bonding (Fig. 1D). Certain stack calls are temporally associated with the 268

calling behaviors of their mates, while other stack calls are uttered spontaneously 269

both without following and being followed by other calls. For simplicity, we indexed 270

the former as associated calls (“aF”: associated female calls) and the latter as 271

spontaneous calls (“sF”: spontaneous female calls), when the spontaneous calls 272

occurred both without following and being followed by other vocalizations (including 273

13

self- and other-produced calls and song elements) for half a second (Fig. 1E; “aF” is 274

defined as the opposite). 275

To test the impact of social context on the stack calling dynamic, we 276

separated male and female mates into two distant boxes with acoustic 277

interconnections and compared the reaction times (RT) between the contexts of 278

cohabitation and separation in male zebra finches. Because the distribution of the 279

calling intervals in zebra finches is scale-free (Ma et al., 2017) and only has finite 280

weighted mean, we analyzed the medians of all intervals. Males increased their 281

reaction times when answering their mates’ stack calls during visual separation (sep) 282

(Fig. 1F, median RTsep = 1.20 sec (median 20% fastest RTsep = 0.23 sec); p < 0.001, 283

paired t-test, n = 15) and isolation (iso) (Fig. 1F, median RTiso = 2.61 sec (median 20% 284

fastest RTiso = 0.43 sec); p = 0.008, paired t-test, n = 8) as compared to cohabitation 285

(coh, median RTcoh = 0.94 sec (median 20% fastest RTcoh = 0.15 sec)), while the 286

reaction times during isolation were longer than the reaction times during 287

separation (Fig. 1F, p = 0.02, paired t-test, n = 8). 288

289

Telemetric monitoring of neuronal activity in male HVC during social interactions 290

We recorded the neuronal activity of the male HVC (n = 8 birds) with wireless 291

telemetric device simultaneously with his own vocalizations and with those of his 292

female mate. We show example of neurotelemetric recordings from the HVC of one 293

male zebra finch during male-female vocal interactions (Fig. 2). HVC increased 294

extracellular activity both in relation to calling and singing (Fig. 2A & 2B). We 295

obtained 2-3 HVC single-units from each bird including one single-unit per recording 296

14

site that showed premotor activity associated with productions of stack calls. We 297

also recorded bursting activity associated with one specific type of syllables in the 298

same animal (Fig. 2C). The stack call-related HVC units (Fig. 2D, black trace) 299

recovered faster than did a bursting unit that did not show stack call-related activity 300

(Fig. 2D, red trace). The waveform widths of stack call-related units at 25% peak 301

amplitude were 0.22 ± 0.03 msec, which are comparable in spike width to those 302

obtained with putative interneurons <0.3 msec (Rauske et al., 2003; Raksin et al., 303

2012). Based on the previous studies and the spike waveform of the stack call-304

related HVC units suggest that these HVC units are putative interneurons (Kosche et 305

al., 2015). The stack call-related HVC units exhibited premotor activity at -17.5 ± 13.9 306

msec before own calling. The stack call-related units had an average spontaneous 307

firing rate of 7 ± 5 spikes/sec. They became active during calling and singing and had 308

firing rate of 116 ± 25 spikes/sec and 98 ± 28 spikes/sec, respectively (n = 8 birds). In 309

this study, we focused on stack call-related HVC units of males during female-male 310

calling interactions. 311

In addition to what has been previously reported (McCasland, 1987; 312

Margoliash, 1997; Hahnloser et al., 2002), we found that HVC activity occurred not 313

only in relation to the males’ own stack calls (Fig. 2E, the Z-Score of one HVC single-314

unit in relation to onset of his calls), but also associated with the female’s stack calls 315

during call-based vocal interactions (Fig. 2F, same unit as in Fig. 2E; the Z-Score of 316

one HVC single-unit in relation to onset of female calls). The female call-related 317

activity in male HVC included two peaks, one following the female’s calls and one 318

preceding the female’s calls by about 200 msec (Fig. 2F). The excitatory activity 319

15

preceding the female calls was always followed by activity suppression that lasted 320

about 200 msec (Fig. 2F, the Z-Score of one HVC single-unit in relation to numbers of 321

female calls). 322

Since males produced different types of vocalizations (songs and various call 323

types) including stack calls in association with the occurrence of the female stack 324

calls, these associated calls could be the reason for the preceding activity, thus either 325

being premotor- or auditory-evoked. If so, the spiking patterns of male HVC neurons 326

related to spontaneous female calls (not associated with any other vocalizations 327

including her own calls) should be different from the pattern related to the female 328

calls associated to other calls of the male or female. Therefore, we compared the 329

neuronal correlation of the associated female calls (“coh-aF”) with the neuronal 330

correlation of the spontaneous female calls in the context of cohabitation (“coh-sF”). 331

As similar as for the “coh-aF” (Fig. 2G, the Z-Score of one HVC single-unit), both the 332

HVC activity preceding and following the occurrence of the female’s stack calls 333

remained present for the “coh-sF” class (Fig. 2H, the Z-Score of one HVC single-unit). 334

Further, due to the definition of “coh-sF” calls, male calling was not related to these 335

female calls. 336

It would have been possible that the “coh-sF” were associated with some 337

very soft calls, which were not recorded by the backpack microphones or which were 338

missed by the amplitude-based sound detection software. Although some female 339

calls were at very low amplitude (27-31 dB), the backpack microphone recorded all 340

sounds that were recorded by a calibrated standard microphone placed in the cage. 341

Further, visual inspection of the sonograms of all “coh-sF” events confirmed the lack 342

16

of low amplitude female calls occurring briefly before the “coh-sF”. We were able to 343

exclude the occurrence of very low amplitude calls of the males uttered before the 344

“coh-sF” since the backpack microphone recorded all calls. 345

During call production, HVC single-unit did not show auditory-evoked activity 346

for female calls that occurred within 500 msec time window before (“FM-”) and after 347

male calls (“-MF”) (Fig. 2I & 2J, n = 8). Since the male HVC activity that preceded the 348

female stack calling in the social setting was neither premotor nor auditory-evoked, 349

we named this activity “predictive activity”. We discuss this term in relation to 350

possible explanations such as repressed premotor activity in the discussion. 351

352

In a social context, HVC neurons are auditory only if preceded by predictive activity 353

We tested the relationship between the predictive activity and the auditory-354

evoked activity (see Materials and Methods: False discovery rate statistics). We 355

found significant auditory-evoked activity only in calling events, in which the activity 356

increased transiently before the onset of the females’ calls (Fig. 3A). This increased 357

activity preceding the female calls corresponded to the predictive activity shown in 358

figure 2. We analyzed the correspondence between the auditory-evoked activity 359

(max Z-Scorepost) and the predictive activity (max Z-Scorepre) in relation to female 360

calls. The female calls were sorted into 10 quantiles according to the percentage of 361

spikes occurring half a second before the female calls (Fig. 3B, n = 8 birds). We only 362

found 9 (out of 80) cases with significant auditory-evoked activity (max Z-Scorepost) 363

after Benjamini-Hochberg corrections, in which 7 cases of significant auditory-364

17

evoked activity were found to be contingent upon there also being activity preceding 365

the female calls (Fig. 3A & 3B, the 20-30% quantile, cyan triangle). These results 366

suggest that, in the social context, a transient increase in predictive HVC activity is 367

required for the auditory-evoked activity to occur in HVC. 368

Next, we isolated the male from his female mate and played back her stack 369

calls to him. The male HVC neurons showed the calling-related premotor activity in 370

both isolated (Fig. 4A & 4C) and social context (Fig. 2E). In the isolated situation, the 371

played-back calls would occasionally fall within the 500 msec criterion before and 372

after a male call. We named these cases as “iso-aPb” (Fig. 4A). If the playback calls 373

did not fall within 500 msec criterion, we named these cases as “iso-sPb” (Fig. 4B). In 374

about 60% of the playbacks (553 of 960 played-back calls), the male did not respond 375

to the playbacks with calling and HVC neurons showed activity, which was auditory-376

evoked (Fig. 4B & 4D). In this isolated context, there was no HVC activity preceding 377

the hearing of the played-back female calls (Fig. 4D, the Z-Score of one HVC single-378

unit in relation to numbers of played-back calls). This result remained even if we split 379

the analysis of the playbacks into associated and “spontaneous” events; the 380

auditory-evoked activity was significant in both cases (“iso-aPb”, Fig. 4E; “iso-sPb”, 381

Fig. 4F), while there was no predictive activity preceding the hearing of the 382

playbacks. The average latency of the auditory-evoked activity (0.1 second, n = 3 383

birds) to the playbacks during isolation is shorter than the average latency (0.25 384

second, n = 8 birds) in relation to hearing of female calls in the social context (Mann-385

Whitney U-test, p = 0.012). 386

387

18

Male predicting requires context-dependent information from the female 388

In order to predict upcoming calling events, male zebra finches must have 389

acquired conjunctive information to foresee that their partners are about to 390

vocalize. Conjunctive cues can either be the vocal context such as strings of related 391

calls (Perez et al., 2015; Hernandez et al., 2016) (vocal cues) or non-auditory social-392

sensory cues such as social gestures or postures (Partan and Marler, 2005), for 393

simplicity, referred to as visual cues. If neither vocal nor visual cues are available, the 394

female call should not be predictable, and HVC should not be able to generate 395

predictive activity related to upcoming female calls. In the cohabitation experiment 396

(i.e. male and female being in direct contact), the predictive activity in response to 397

the female calls was still detectable, even though some of these female calls 398

occurred spontaneously as in “sF” condition (Fig. 5A, e.g. Fig. 2F, “coh-sF” vs “coh-399

aF”, F(1,14) = 2.21, p = 0.16, n = 8 pairs). To test the importance of sensory cues 400

further, we separated the female and male mates into two distant boxes with 401

acoustic interconnections. Visual isolation alone had little impact on the predictive 402

activity, which was similar to that of the cohabitation context (Fig. 5B, “sep-aF” vs 403

“coh-aF”, F(1,10) = 3.66, p = 0.08, n = 6 pairs). However, if the vocal cue was not 404

available during visual isolation, when the female calls occurred unexpectedly 405

without temporal association with either vocal or visual cues for half a second, the 406

males no longer exhibited significant predictive activity or auditory-evoked activity in 407

response to female calls (Fig. 5C, “sep-sF” vs “coh-aF”, F(1,10) = 9.11, p = 0.01, n = 6 408

pairs). When we played back female calls while the male and the female were 409

cohabitating in the same aviary, the male HVC did not show auditory-evoked activity 410

19

in relation to the played-back female calls (Fig. 5D, “coh-Pb” vs “coh-aF”, F(1,9) = 411

8.22, p = 0.02, n = 3). The playbacks were identical to those used in the isolated 412

setting reported above (see Fig. 4). 413

In summary, for all studied males and HVC single-units we found the 414

following patterns of task- and social context-dependent activity in HVC neurons of 415

male zebra finches (Fig. 6): HVC single-units showed stack call-related premotor 416

activity, regardless of whether the males were in social contexts, constrained social 417

contexts (separation) or isolated contexts (Fig. 6A, 6D & 6F). In the social context, 418

the HVC single-units showed predictive activity preceding the hearing of female calls 419

in all males. The auditory-evoked activity occurred only subsequently to the 420

predictive activity but not subsequently to premotor activity in the social context of 421

all males (Fig. 6B). Despite constrained interactions in the context of separation, the 422

HVC single-units displayed predictive activity and auditory-evoked activity (Fig. 6E). 423

The playbacks of female calls did not evoke auditory activity in male HVC during 424

cohabitation (Fig. 6C). When the playbacks became the only communication channel 425

in the isolated context, HVC single-units displayed auditory-evoked activity in 426

relation to hearing female calls but no predictive activity (Fig. 6G). Thus, in the social 427

context, the males produce predictive activity that helps to gate auditory response 428

that evokes HVC activity for calls of the social partner. 429

430

431

Discussion 432

20

Zebra finches, like most other group-living songbirds, communicate using 433

large numbers of acoustically invariant calls in addition to songs (Marler, 2004; 434

Beckers and Gahr, 2010; Gill et al., 2015; Elie and Theunissen, 2016). Sensorimotor 435

areas play a crucial role in motor planning (Riehle and Requin, 1989; Crammond and 436

Kalaska, 2000; Maranesi et al., 2014), decision-making (Wolpert et al., 2011), and the 437

accuracy and timing of actions in response to sensory input (Wolpert et al., 1995; 438

Hommel et al., 2013). The HVC of zebra finches is one of these sensorimotor areas, 439

which is well-known for its coding of temporal song patterns (Hahnloser et al., 2002; 440

Long and Fee, 2008). Previous studies have occasionally reported calling related 441

premotor activity in HVC neurons in the context of singing (McCasland, 1987; 442

Margoliash, 1997; Hahnloser et al., 2002; Kozhevnikov and Fee, 2007). This calling-443

related activity in HVC was either pointed out in relation to the production of 444

learned calls (Rajan, 2018) and was ignored in the more common unlearned calls 445

(Hahnloser et al., 2002). Furthermore, bilateral damage of RA and HVC had no effect 446

on the spectro-temporal morphology structure of innate calls of zebra finches 447

(Simpson and Vicario, 1990), which was used to argue that HVC is not involved in the 448

control of innate vocalizations of male and female songbirds. Since we detected 449

neurons that participate in calling and auditory responses, we suggest that HVC is a 450

general nucleus for vocal communication rather than a song-specific nucleus, likely 451

responsible for both timing call production and predicted perception of calls, 452

especially in a social context. 453

The stack call-related premotor activity of HVC interneurons is corroborated 454

by recent results (Benichov and Vallentin, 2020). For singing, structured activity of 455

21

interneurons is thought to provide permissive time windows for the activity of the 456

descending premotor neurons, which leads to a sequential motor pattern (Kosche et 457

al., 2015). During antiphonal callings, HVC interneurons might function in a similar 458

way, that is, by pruning away a range of possibilities they determine narrow time 459

windows in which call output in HVC is possible. Since individual interneurons were 460

active in ~90% of all stack calls, and since we found such neurons at each HVC 461

recording site, we assume that most HVC interneurons are synchronously active at 462

each call production. Thus, premotor activity likely involves both local and long-463

range excitatory and inhibitory circuits that lead to synchronized premotor activity of 464

most HVC interneurons (Kosche et al., 2015; Kornfeld et al., 2017). Further studies 465

should determine how the synchronized excitation of the HVC interneurons leads to 466

the timed activity of RA-projecting HVC neurons, which affects the timing of 467

antiphonal calling. 468

Instead of the premotor activity, the same neuron showed an increase of 469

activity preceding female calls that was followed by an auditory-evoked activity in 470

response to ~20% of the female calls during antiphonal callings. This activity might 471

be a predictive activity, generated to predict female calling or a fictive premotor 472

activity of HVC interneurons that did not lead to a motor output but led to auditory-473

evoked responses. Considering that predictive/fictive activity does not occur in RA 474

during stack call interactions (ter Maat et al., 2014), the predictive/fictive activity 475

would happen in HVC. We suggest that the predictive/fictive activity is produced by 476

a mechanism that reduces HVC-wide synchrony of interneurons involved in 477

premotor activity, so that ~10% of the interneurons are not active in timing call 478

22

output. Those interneurons that do not participate in call timing might participate in 479

premotor/fictive activity. Although interneurons can either be premotor or 480

predictive/fictive during calling interaction, these two neuronal states do not occur 481

within 500 msec from each other. Nevertheless, since most HVC interneurons 482

contribute to the premotor activity during antiphonal callings, these neurons could 483

determine the predictive/fictive activity that did not participate in the premotor 484

activity, e.g. via HVC internal feedback circuits (Rosen and Mooney, 2006). However, 485

since the predictive/fictive activity was generated even if the male did not call 486

immediately before or after female calls (Fig. 2H), also other mechanisms could 487

extract information of female’s previous behaviors in order to generate 488

predictive/fictive activity. In that case, the synchronized activity of a small fraction of 489

HVC interneurons might not be sufficient to excite RA-projecting neurons, i.e. will 490

not result in a motor output. The predictive/fictive activity might reflect the 491

evolutionary history of HVC and the song control system in general as a network that 492

evolved originally to control the timing of vocalizations such as innate calls. Only 493

later during evolution, this auditory-vocal flexibility that originally concerned the 494

time domain, was extended to the spectral domain enabling vocal learning. Since the 495

terms fictive or predictive matter only from an evolutionary perspective (Bubic et al., 496

2010) but not for the proximate effect of gated auditory responses, we continue the 497

discussion using the term “predictive”. 498

The predictive premotor HVC activity was contingent on the activity of the 499

same HVC interneuron activated by hearing the female call. During antiphonal 500

callings, HVC interneurons seem to determine narrow time windows in which call 501

23

processing in HVC is possible. Unlike motor performance, which can provide external 502

feedback, the predictive activity may generate a feed-forward signal to auditory 503

areas such as nucleus Avalanche (and caudomedial nidopallium (NCM) and caudal 504

mesopallium (CM) interconnected with Avalanche) that respond to the incoming 505

calls of the female mate, similar to an efferent copy of motor activity. Comparison of 506

the timing of the male’s predicting feed-forward signal, and of the timing of the 507

female calls in the auditory areas, would result in an expectation error that measures 508

the discrepancy between the predicted and the actual timing of female’s calls. 509

Neurons that detect errors in auditory feedback related to singing are present in CM 510

(Keller and Hahnloser, 2009). Since Avalanche projects back to HVC (Akutagawa and 511

Konishi, 2010; Roberts et al., 2017), Avalanche neurons would excite HVC in case of 512

small expectation errors, which are calculated in Avalanche and related auditory 513

areas. Large prediction errors would not excite HVC and might happen if the 514

prediction was wrong, or if no prediction was computed; an example might be the 515

lack of an auditory response to playbacks of female calls in the presence of a female 516

since those playbacks occurred unexpectedly (Fig. 5C). We currently don’t know 517

whether Avalanche neurons excite HVC interneurons directly. In case of contingent 518

excitation due to well-timed predictive activity and female calling, the excitatory 519

input of Avalanche would result in auditory evoked activity in HVC interneurons. 520

These complex computations would explain the delays of HVC activity when hearing 521

female calls in a social context compared to the playback of her calls in a non-social 522

context. 523

24

How might the perception of context affect HVC activity? The predictive HVC 524

activity was still produced when we excluded all visual information or all short-term 525

auditory information that preceded female calls. Since stack calling of zebra finch 526

mates is steady (Ma et al., 2017) and follows a stable pair-specific calling pattern 527

(Fig. 1D), the males seem to be able to extract information from the previous calling 528

activity of females in order to predict future calling events during vocal interactions. 529

Such auditory and visual information about the mate might reach HVC via Uva, which 530

exerts synaptic influence on HVC interneurons and those projecting to RA and Area X 531

(Coleman et al., 2007). Uva obtains auditory input from the ventral nucleus of the 532

lateral lemniscus (VNLL) (Coleman et al., 2007), visual input from the optic tectum 533

(Wild, 1994) and attentional input of the medial habenula (Akutagawa and Konishi, 534

2005). VNLL and Uva are, however, not sensitive to spectral details of sounds that 535

would identify the calling individual (Coleman et al., 2007). This specific information 536

could be provided to HVC by auditory association-cortex-like areas such as the NCM, 537

CM and nucleus Avalanche that are interconnected (Akutagawa and Konishi, 2010; 538

Roberts et al., 2017) and are likely to analyze the auditory scene in a non-gated way 539

(Beckers and Gahr, 2012; Elie and Theunissen, 2015). However, this auditory 540

information is only reaching HVC in case of predictive activity in a social context, or if 541

the male is entirely in a non-social context where prediction is not an option. In 542

relation to this, there are multiple mechanisms and pathways by which auditory 543

responses in HVC can be gated (Cardin and Schmidt, 2004; Coleman et al., 2007). 544

An ability to predict allows animals to avoid surprise and to minimize energy 545

cost in sensory sampling and processing (Friston, 2010). Alternatively, the ability to 546

predict might enable the individual to signal fitness, i.e. to signal quality or 547

25

commitment to a potential mate or to address a certain individual. Zebra finch pairs 548

that are reproductively successful have more highly coordinated call interactions 549

than those pairs that did not develop such cooperative calling behaviors (Gill et al., 550

2015). Birds may take advantage of expectations which optimize call timing to signal 551

their own commitment for pair bonding or to evaluate the commitment of the 552

female mate to the pair bonding. 553

554

26

References 555

Akutagawa E, Konishi M (2005) Connections of thalamic modulatory centers to the 556

vocal control system of the zebra finch. Proc Natl Acad Sci U S A 102:14086-557

14091. 558

Akutagawa E, Konishi M (2010) New brain pathways found in the vocal control 559

system of a songbird. J Comp Neurol 518:3086-3100. 560

Beckers GJL, Gahr M (2010) Neural Processing of Short-Term Recurrence in Songbird 561

Vocal Communication. PLoS One 5:e11129. 562

Beckers GJL, Gahr M (2012) Large-Scale Synchronized Activity during Vocal Deviance 563

Detection in the Zebra Finch Auditory Forebrain. Journal of Neuroscience 564

32:10594-10608. 565

Benichov JI, Vallentin D (2020) Inhibition within a premotor circuit controls the 566

timing of vocal turn-taking in zebra finches. Nature Communications 11:221. 567

Benichov JI, Benezra SE, Vallentin D, Globerson E, Long MA, Tchernichovski O (2016) 568

The Forebrain Song System Mediates Predictive Call Timing in Female and 569

Male Zebra Finches. Curr Biol 26:309-318. 570

Benjamini Y, Hochberg Y (1995) Controlling the False Discovery Rate - a Practical and 571

Powerful Approach to Multiple Testing. J Roy Stat Soc B Met 57:289-300. 572

Bolhuis JJ, Gahr M (2006) Neural mechanisms of birdsong memory. Nat Rev Neurosci 573

7:347-357. 574

Bubic A, von Cramon DY, Schubotz RI (2010) Prediction, cognition and the brain. 575

Frontiers in human neuroscience 4:25. 576

27

Cardin JA, Schmidt MF (2004) Auditory responses in multiple sensorimotor song 577

system nuclei are co-modulated by behavioral state. J Neurophysiol 91:2148-578

2163. 579

Coleman MJ, Roy A, Wild JM, Mooney R (2007) Thalamic gating of auditory 580

responses in telencephalic song control nuclei. J Neurosci 27:10024-10036. 581

Crammond DJ, Kalaska JF (2000) Prior information in motor and premotor cortex: 582

activity during the delay period and effect on pre-movement activity. J 583

Neurophysiol 84:986-1005. 584

Doupe AJ, Kuhl PK (1999) Birdsong and human speech: common themes and 585

mechanisms. Annu Rev Neurosci 22:567-631. 586

Elie JE, Theunissen FE (2015) Meaning in the avian auditory cortex: neural 587

representation of communication calls. Eur J Neurosci 41:546-567. 588

Elie JE, Theunissen FE (2016) The vocal repertoire of the domesticated zebra finch: a 589

data-driven approach to decipher the information-bearing acoustic features 590

of communication signals. Animal Cognition 19:285-315. 591

Friston K (2010) The free-energy principle: a unified brain theory? Nat Rev Neurosci 592

11:127-138. 593

Gill LF, Goymann W, Ter Maat A, Gahr M (2015) Patterns of call communication 594

between group-housed zebra finches change during the breeding cycle. eLife 595

4:e07770. 596

Gill LF, D'Amelio PB, Adreani NM, Sagunsky H, Gahr MC, ter Maat A (2016) A 597

minimum-impact, flexible tool to study vocal communication of small animals 598

with precise individual-level resolution. Methods Ecol Evol 7:1349-1358. 599

28

Hahnloser RH, Kozhevnikov AA, Fee MS (2002) An ultra-sparse code underlies the 600

generation of neural sequences in a songbird. Nature 419:65-70. 601

Hernandez AM, Perez EC, Mulard H, Mathevon N, Vignal C (2016) Mate Call as 602

Reward: Acoustic Communication Signals Can Acquire Positive Reinforcing 603

Values During Adulthood in Female Zebra Finches (Taeniopygia guttata). 604

Journal of Comparative Psychology 130:36-43. 605

Hommel B, Prinz W, Beisert M, Herwig A (2013) Ideomotor action control: on the 606

perceptual grounding of voluntary actions and agents. Action science: 607

Foundations of an emerging discipline:113-136. 608

Keller GB, Hahnloser RH (2009) Neural processing of auditory feedback during vocal 609

practice in a songbird. Nature 457:187-190. 610

Kornfeld J, Benezra SE, Narayanan RT, Svara F, Egger R, Oberlaender M, Denk W, 611

Long MA (2017) EM connectomics reveals axonal target variation in a 612

sequence-generating network. eLife 6. 613

Kosche G, Vallentin D, Long MA (2015) Interplay of inhibition and excitation shapes a 614

premotor neural sequence. J Neurosci 35:1217-1227. 615

Kozhevnikov AA, Fee MS (2007) Singing-related activity of identified HVC neurons in 616

the zebra finch. J Neurophysiol 97:4271-4283. 617

Long MA, Fee MS (2008) Using temperature to analyse temporal dynamics in the 618

songbird motor pathway. Nature 456:189-194. 619

Ma S, ter Maat A, Gahr M (2017) Power-law scaling of calling dynamics in zebra 620

finches. Scientific reports 7:8397. 621

29

Maranesi M, Livi A, Fogassi L, Rizzolatti G, Bonini L (2014) Mirror neuron activation 622

prior to action observation in a predictable context. J Neurosci 34:14827-623

14832. 624

Margoliash D (1997) Functional organization of forebrain pathways for song 625

production and perception. J Neurobiol 33:671-693. 626

Margoliash D, Fortune ES (1992) Temporal and harmonic combination-sensitive 627

neurons in the zebra finch's HVc. J Neurosci 12:4309-4326. 628

Marler P (2004) Bird calls: their potential for behavioral neurobiology. Ann N Y Acad 629

Sci 1016:31-44. 630

McCasland JS (1987) Neuronal control of bird song production. J Neurosci 7:23-39. 631

Mobbs D, Trimmer PC, Blumstein DT, Dayan P (2018) Foraging for foundations in 632

decision neuroscience: insights from ethology. Nat Rev Neurosci 19:419-427. 633

Mooney R (2009) Neurobiology of song learning. Curr Opin Neurobiol 19:654-660. 634

Nottebohm F, Arnold AP (1976) Sexual dimorphism in vocal control areas of the 635

songbird brain. Science 194:211-213. 636

Partan SR, Marler P (2005) Issues in the classification of multimodal communication 637

signals. Am Nat 166:231-245. 638

Perez EC, Fernandez MSA, Griffith SC, Vignal C, Soula HA (2015) Impact of visual 639

contact on vocal interaction dynamics of pair-bonded birds. Animal 640

Behaviour 107:125-137. 641

Pika S, Wilkinson R, Kendrick KH, Vernes SC (2018) Taking turns: bridging the gap 642

between human and animal communication. Proceedings of the Royal 643

Society B-Biological Sciences 285. 644

30

Prather JF, Peters S, Nowicki S, Mooney R (2008) Precise auditory-vocal mirroring in 645

neurons for learned vocal communication. Nature 451:305-U302. 646

Rajan R (2018) Pre-Bout Neural Activity Changes in Premotor Nucleus HVC Correlate 647

with Successful Initiation of Learned Song Sequence. Journal of Neuroscience 648

38:5925-5938. 649

Raksin JN, Glaze CM, Smith S, Schmidt MF (2012) Linear and nonlinear auditory 650

response properties of interneurons in a high-order avian vocal motor 651

nucleus during wakefulness. J Neurophysiol 107:2185-2201. 652

Rauske PL, Shea SD, Margoliash D (2003) State and neuronal class-dependent 653

reconfiguration in the avian song system. J Neurophysiol 89:1688-1701. 654

Riehle A, Requin J (1989) Monkey primary motor and premotor cortex: single-cell 655

activity related to prior information about direction and extent of an 656

intended movement. J Neurophysiol 61:534-549. 657

Roberts TF, Hisey E, Tanaka M, Kearney MG, Chattree G, Yang CF, Shah NM, Mooney 658

R (2017) Identification of a motor-to-auditory pathway important for vocal 659

learning. Nature Neuroscience 20:978-+. 660

Rosen MJ, Mooney R (2006) Synaptic interactions underlying song-selectivity in the 661

avian nucleus HVC revealed by dual intracellular recordings. J Neurophysiol 662

95:1158-1175. 663

Simpson HB, Vicario DS (1990) Brain pathways for learned and unlearned 664

vocalizations differ in zebra finches. J Neurosci 10:1541-1556. 665

Soon CS, Brass M, Heinze HJ, Haynes JD (2008) Unconscious determinants of free 666

decisions in the human brain. Int J Psychol 43:238-238. 667

31

ter Maat A, Trost L, Sagunsky H, Seltmann S, Gahr M (2014) Zebra finch mates use 668

their forebrain song system in unlearned call communication. PLoS ONE 669

9:e109334. 670

Vallentin D, Kosche G, Lipkind D, Long MA (2016) Neural circuits. Inhibition protects 671

acquired song segments during vocal learning in zebra finches. Science 672

351:267-271. 673

Wild JM (1994) The auditory-vocal-respiratory axis in birds. Brain Behav Evol 44:192-674

209. 675

Wolpert DM, Ghahramani Z (2000) Computational principles of movement 676

neuroscience. Nature Neuroscience 3:1212-1217. 677

Wolpert DM, Ghahramani Z, Jordan MI (1995) An internal model for sensorimotor 678

integration. Science 269:1880-1882. 679

Wolpert DM, Diedrichsen J, Flanagan JR (2011) Principles of sensorimotor learning. 680

Nat Rev Neurosci 12:739-751. 681

Yu AC, Margoliash D (1996) Temporal Hierarchical Control of Singing in Birds. Science 682

273:1871-1875. 683

Zann RA (1996) The Zebra Finch. In: Oxford Ornithology Series (Perrins CM, ed), pp 684

196-213. Oxford: Oxford University Press. 685

686

687

32

Figure captions 688

Figure 1. Vocal interactions between the male and the female of zebra finch pairs. A, 689

Successions of male and female vocal events of one pair. Each vertical line 690

corresponds to a vocal event and the color of the line relates to the vocal type 691

shown in the pie chart B, The latter gives the proportions of different types of 692

vocalizations in males and females; females lack song elements. C, The sonograms 693

show the stack call of a male and a female zebra finch. D, The Z-Score of male stack 694

calls before and after female stack calls are plotted as a function of time with 50 695

msec time intervals; the onset of female stack calls was aligned at zero. Gray lines: Z-696

Scores of the stack call interactions at three different (entire) days after pair bonding 697

of a representative pair. In this case, the female responded to the male calls more 698

frequently than the male responded to the female calls. Blue line: average Z-Score of 699

three days. E, Schematic illustration of an associated female call (“aF”) and a 700

spontaneous female call (“sF”) in relation to the timing of male calls (black vertical 701

lines). F, Comparisons of the median reaction times of males in response to the 702

female calls in different contexts: Cohabitation (coh) = mate present; separation 703

(sep) = mate can be heard but not seen; isolation (iso) = playback of the mate’s calls. 704

Significance index: ***: p < 0.001; **: 0.001 < p ≤ 0.01; *: 0.01 < p ≤ 0.05. 705

706

Figure 2. Neurotelemetry revealed both stack calling-related and hearing-related 707

activity of the same HVC single-unit of male zebra finches during call-based vocal 708

communications in the social context. A, Raw extracellular voltage trace (lower 709

panel) showing the HVC activity of one male in relation to the call of his female mate 710

33

(upper panel) and to his own calling (middle panel). B, Raw extracellular voltage 711

trace showing the HVC activity of the stack call-related neuron of 2A in relation to 712

singing. C, Raw extracellular voltage trace (lower panel) showing the activity of a 713

bursting unit during a song syllable (upper panels). D, Comparison of the waveforms 714

between stack call-related unit (black) and busting unit (red) recorded from the 715

same animal. E, PSTH of a stack call-related HVC single-unit with premotor activity in 716

relation to the male’s own active calling (stack calls, n = 233). Blue trace relates to 717

the Z-Score of one male. Black trace: average Z-Score of 8 males. F, Activity of the 718

same HVC single-unit shown in 2E in relation to hearing of his female’s stack calls (n 719

= 233 randomly selected from 8,689 female stack calls). Note that this unit shows a 720

transient increase in activity already before the onset of the female calls (named 721

predictive activity) and a second increase in activity following her calls (named 722

auditory-evoked activity). Magenta trace relates to the Z-Score of one male. Black 723

trace: average Z-Score of 8 males. G, The activity pattern of the HVC unit shown in 2E 724

in relation to the associated female stack calls (“coh-aF”) during cohabitation. H, 725

Activity of the same HVC unit shown in 2E associated with the spontaneous female 726

stack calls (“coh-sF”) during cohabitation. Note that the putative predictive activity 727

preceding the female call remained. I, The activity pattern of male HVC units related 728

to the onset of the males’ calls in cases that the female called within the 500 msec 729

time window before the males’ call production (“FM-”). These female calls did not 730

evoke auditory activity. J, The activity pattern of male HVC units related to the onset 731

of the males’ calls in cases that the female called within the 500 msec time window 732

after the males’ call production (“-MF”). These females’ calls did not evoke auditory 733

activity. Gray and black traces represent the individual and the average Z-Scores of 8 734

34

males, respectively. Horizontal dashed lines indicate the significant Z-Score = ±1.645. 735

Blue and red color bars indicate the duration of male and female calls, respectively. 736

The signal-to-noise ratio of this single-unit is 3.8 dB. 737

738

Figure 3. Hearing-related activity in male HVC according to different quantiles of 739

spiking activity preceding the onset of female calls. A, The upper and lower panels 740

illustrate the raster plots of one bird and the average Z-Scores of 8 single-units (n = 8 741

birds), respectively. The dashed lines indicate the critical Z-Score of ±1.645. Gray 742

lines: the individual Z-Scores of the single-units; Magenta lines: the average Z-Scores 743

of 8 HVC single-units (n = 8 birds). Note that maximum of 250 events were randomly 744

selected for raster plots. B, The auditory-evoked HVC activity was significant (max Z-745

Scorepost > 2.73), if the activity preceding the females’ calls increased (max Z-Scorepre 746

> 0). The maximum Z-Score of HVC activity during 500 msec following the female call 747

(max Z-Scorepost) is shown as a function of the maximum Z-Score of HVC activity 748

during 500 msec preceding the female call (max Z-Scorepre). The quantiles are color 749

coded according to the different quantiles of spiking activity preceding the onset of 750

female calls (see Materials and Methods). The degree of HVC activity (0-100%) 751

preceding the females’ calls are shown from violet to dark red. Triangles indicate 752

significant auditory-evoked activity with Z-Score > 2.73 (corresponding to Benjamini-753

Hochberg false discovery rate of q = 0.01). Circles indicate events that were not 754

significant. n = 8 HVC single-units sampled from 8 male zebra finches. 755

756

35

Figure 4. Calling-related premotor activity and auditory-evoked activity occurred in 757

the same HVC single-unit in isolated males during playback experiments. In A, the 758

raw extracellular voltage trace (lower panel) shows the neural activity of a male in 759

relation to the associated playback of his female’s call (iso-aPb) (upper panel) and to 760

his own call (middle panel). In B, the raw extracellular voltage trace (lower panel) 761

shows the neural activity of the unit shown in 4A in relation to the “spontaneous” 762

playback (iso-sPb) of his female’s call (upper panel). This HVC unit fired 763

stereotypically in relation to C active male calling (n = 250 calls randomly selected 764

from 1,629 own stack calls) and to D hearing of the played-back calls of his female 765

mate (n = 250 playbacks randomly selected from 960 played-back female calls). The 766

insert shows the spike shape of the depicted unit (vertical & horizontal scale bars: 50 767

mV & 0.5 msec). Blue and magenta traces represent the Z-Score of one male in 768

relation to call productions and hearing playbacks, respectively. Black trace 769

represents the average Z-Score of 8 males. E and F show that HVC unit’s activity after 770

splitting the playbacks into “associated” (iso-aPb) and “spontaneous” (iso-sPb) 771

events during isolation. Note that auditory-evoked activity of HVC in the isolated 772

male occurred without increased activity preceding the female calls. Blue and red 773

color bars indicate the duration of male and female calls, respectively. The signal-to-774

noise ratios of this single-unit is 5.8 dB. 775

776

Figure 5. Neuronal activity of male HVC single-units in relation to female calls is 777

context-dependent during calling interactions. A, Comparing predictive and auditory-778

evoked activity of males’ HVCs between the associated female calls (“aF”, magenta) 779

36

and spontaneous female calls (“sF”, green) of their female mates in cohabitation 780

(coh), n = 8 pairs. B, Predictive and auditory-evoked activity in males’ HVCs in 781

relation to the “aF” in separation (sep), i.e. without visual input but with access to 782

the females’ vocal output, n = 6 pairs. C, Lack of predictive and auditory-evoked 783

activity in HVC in relation to the “sF” in separation (sep), where the females’ vocal 784

output during half a second preceding a focal call was eliminated, in addition to the 785

missing visual input, n = 8 pairs. D, Lack of auditory-evoked activity in the males in 786

relation to the played-back calls of their female mates in cohabitation (coh), n = 3 787

pairs. Red bars: the durations of the females’ calls and played-back females calls. 788

Gray lines: the individual Z-Scores of the single-units; colored lines: the average Z-789

Scores. 790

791

Figure 6. Summary of the task- and social context-dependent activity in HVC neurons 792

of male zebra finches. Stack call-related premotor activity occurred in each context 793

(A, D, F). Auditory-evoked activity occurred in the social context only if that 794

recording showed predictive activity (B, E). The playbacks of female calls did not 795

evoke auditory activity in male HVC during cohabitation (C). The playbacks of female 796

calls evoked auditory activity in male HVC during isolation (G). Social: male and 797

female cohabited in the same cage; separated: male and female had only auditory 798

contact; isolated: male had no contact with the female. 799

0 15k5k 10k

stack calls (69.8%)tet calls (1.3%)

hat calls (0.5%)distance calls (28.4%)

stack calls (63.4%)

tet calls (12.4%)hat calls (8.5%)distance calls (0.7%)

song elements (15.0%)

male

female

0 15k5k 10k

6

0Fre

qu

en

cy (

KH

z)

Fre

qu

en

cy (

KH

z)

6

0

male

female

A

F

1 s

E

0.12 s

0

-1 -0.5 0 10.5

Z-S

core

8

6

4

2

-2

Time (s)

DMale

coh

sep

iso

**

Me

dia

n r

ea

ctio

n tim

e (

s)

0.01

0.1

1

10

100

1,000

Time (s)

Time (s)

aF sFaF

0.12 s

B C

*

*

**

C

20 mV

1 ms

Time (s)

20

A

Fre

qu

en

cy (

KH

z)

0

3

6

-50

0

50

Mill

ivo

lts (

mV

)

0

3

6

male

maleHVC

−1 −0.5 0 0.5 1−20

4

Num

bers

of m

ale

calls

Z-S

core

E

−1 −0.5 0 0.5 1−2

0

2

4

Num

bers

of fe

male

calls

Z-S

core

F

2

6

−1 −0.5 0 0.5 1−2

0

4

Z-S

core 2

−1 −0.5 0 0.5 1−2

0

4

Z-S

core 2

G

coh-sF

coh-aF

H0

150

200

250

100

50

0

150

200

250

100

50

Time (s) Time (s)

Time (s)

Time (s)

female

50 ms

Time (s)

10

Fre

qu

en

cy (

KH

z)

0

3

6

-50

0

50

Mill

ivo

lts (

mV

)0

3

6

male

maleHVC

female

20 mV

B

−1 −0.5 0 0.5 1−3

0

6

Z-S

core 3

−1 −0.5 0 0.5 1−3

0

6

Z-S

core 3

I

coh-M

(-MF)

coh-M

(FM-)

JTime (s)

Time (s)

D

−2 0 2 4 6 8

−1

0

1

2

3

4

max Z-Score pre

0%

100%

max Z

-Score

po

st

eve

nts

of fe

ma

le c

alls

0

250

0

30

0

150

0

80

0

40

−1 −0.5 0 0.5 1

−20

Z-S

core

Time (s)

24

−1 −0.5 0 0.5 1

−20

Time (s)

24

−1 −0.5 0 0.5 1

−20

Time (s)

24

−1 −0.5 0 0.5 1

−20

Time (s)

24

−1 −0.5 0 0.5 1

−20

Time (s)

24

−1 −0.5 0 0.5 1

−20

Z-S

core

Time (s)

24

−1 −0.5 0 0.5 1

−20

Time (s)

24

−1 −0.5 0 0.5 1

−20

Time (s)

24

−1 −0.5 0 0.5 1

−20

Time (s)

24

−1 −0.5 0 0.5 1

−20

Time (s)

24

0 - 10% 10 - 20% 20 - 30% 30 - 40% 40 - 50%

50 - 60% 60 - 70% 70 - 80% 80 - 90% 90 - 100%

eve

nts

of fe

ma

le c

alls

−1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1−1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1

A B

0

250

−1 −0.5 0 0.5 10

250

−1 −0.5 0 0.5 10

250

−1 −0.5 0 0.5 10

250

−1 −0.5 0 0.5 10

250

−1 −0.5 0 0.5 1

B

0

3

6

Fre

qu

en

cy (

KH

z)

0

3

6

Time (s)0 10.5

-50

0

50

Mill

ivo

lts (

mV

)

playback

male

1.5

maleHVC

A

0

3

6

Fre

qu

en

cy (

KH

z)

0

3

6

Time (s)0 10.5

-50

0

50

Mill

ivo

lts (

mV

)

male

1.5

male HVC

−1 −0.5 0 0.5 1−2

0

4

Nu

mb

ers

of m

ale

ca

llsZ

-Score

Time (s)

C

−1 −0.5 0 0.5 1−2

0

2

4

Time (s)

Nu

mb

ers

of fe

ma

le c

alls

Z-S

core

D

2

6

−1 −0.5 0 0.5 1−2

0

4

Z-S

core

Time (s)

2

−1 −0.5 0 0.5 1−2

0

4

Z-S

core

Time (s)

2

E

iso-sPb

iso-aPb

iso-aPb iso-sPb

F

playback

0

100

200

250

50

0

100

200

250

50

1 s

A

−1 −0.5 0 0.5 1

−2

0

4

Z-S

co

re

Time (s)

2

coh-aFcoh-sF

D

−1 −0.5 0 0.5 1

−2

0

4

Z-S

co

re

Time (s)

2

coh-Pb

1 s

−1 −0.5 0 0.5 1

−2

0

4

Z-S

co

re

Time (s)

2

sep-aF

1 s

B

1 s

C

−2

0

4

Z-S

co

re

Time (s)

2

sep-sF

−1 −0.5 0 0.5 1

Context (male) Premotor Prediction Auditory

social (listening)

separated (listening)

social (playback)

isolated (calling)

social (calling)

isolated (playback)

separated (calling)

A

B

C

D

E

F

G