Insect Molecular Biology (1999) 8 (4), 481–490
© 1999 Blackwell Science Ltd

Intraspecific nucleotide variation at the pheromone binding protein locus in the turnip moth, Agrotis segetum

S. M. LaForest,1 G. D. Prestwich2 and C. Löfstedt3

1 Department of Ecology and Evolution, State University of New York at Stony Brook (institution at which research was conducted), 2 Department of Medicinal Chemistry, University of Utah, and 3 Department of Ecology, Lund University, Sweden

Accepted 19 March 1999. Correspondence: Siana M. LaForest, Massachusetts Institute of Technology, Division of Health Sciences and Technology E25–548, 77 Massachusetts Ave., Cambridge, MA, 02139, U.S.A. E-mail: [email protected]

Abstract

Inter- and intraspecific amino acid variability in the pheromone binding proteins (PBPs) of the Lepidoptera is believed to contribute to a molecular mechanism of pheromone blend discrimination. Messenger RNA coding for PBP sequence in Agrotis segetum (Noctuidae) was cloned, and nucleotide and inferred amino acid variation across a 769-bp region of a PBP locus was studied in two populations. A single gene copy was fully sequenced, revealing an intron/exon structure conserved with distant saturniids. While several nucleotide substitutions are predicted to result in amino acid replacement, tests for the presence of natural selection suggest that the observed variation is neutral. A phylogenetic analysis provides evidence that the two populations are in the process of genetic isolation.

Keywords: Agrotis segetum, pheromone binding proteins, odourant binding proteins.

Introduction

Many insects have evolved the ability to use volatile odourants as cues for locating distant mates as well as potential sources of food. In insects, volatile odourants are detected by sensilla, specialized hair-like structures arrayed along the antennae. In male moths, a specialized subset of these sensilla contains pheromone-sensitive neurones which are highly sensitive and specific to sex pheromone compounds produced by conspecific females. On the molecular level, pheromone perception is mediated by proteins that are associated with these pheromone-sensitive sensilla and pheromone-sensitive neurones. The best known class of proteins involved in insect pheromone perception is the pheromone binding proteins (PBPs), which are members of a larger superfamily of odourant proteins, the insect odourant binding proteins (OBPs).

PBPs were initially identified by their capacity to bind sex pheromone compounds in combination with the tissue-specific pattern of their expression. They are small (14–16 kDa), soluble proteins expressed in high concentrations (10 mM) in the aqueous lymph surrounding the dendrites of pheromone-sensitive neurones within pheromone-specific sensilla (Vogt & Riddiford, 1981). It is believed that PBPs play a crucial role in pheromone perception by binding to hydrophobic pheromone compounds as they cross the cuticle surrounding each sensillum and transporting them across the aqueous lymph to receptor neurones (Vogt, 1987; Vogt et al., 1988; Prestwich, 1993). Using photoaffinity labelling in a recombinant PBP, odourant binding has been localized to a conserved hydrophobic region in at least one PBP, Apol-1 from Antheraea polyphemus (Du et al., 1994). Two additional subfamilies of lepidopteran OBPs, the general odourant binding proteins (GOBPs), are classified on the basis of sequence similarities as GOBP1 or GOBP2 proteins (Vogt et al., 1991a). GOBPs are expressed in sensilla that are sensitive to a broad class of general odourants in both males and females (Steinbrecht et al., 1992; Laue et al., 1994). Their odourant binding activity is now being investigated, but preliminary evidence with binding assays suggests that GOBPs bind plant volatiles (Feng & Prestwich, 1997). Proteins that share significant sequence homology and/or chemoreceptive tissue-specific patterns of expression have been identified in Drosophila melanogaster (Diptera) (McKenna et al., 1994; Pikielny et al., 1994), Lygus lineolaris (Heteroptera) (Dickens et al., 1995), Anomala osakana and Popillia japonica (Coleoptera) (Wojtaseck et al., 1998), and Carausius morosus (Phasmatodea) (Tuccini et al., 1996). Full and partial PBP and GOBP sequences have been described in ten species of Lepidoptera, including those of A. segetum described here (Merritt et al., 1998; reviewed in Pelosi & Maida, 1995; Prestwich et al., 1995). PBPs are more variable between species than the highly conserved
GOBPs, and this variation has been attributed to selection for binding to different pheromone compounds produced by each species (Vogt et al., 1991; LaForest, 1998). Within a species, multiple PBPs have been found in Antheraea pernyi (Krieger et al., 1991; Raming et al., 1990b), Lymantria dispar (Vogt et al., 1989; Merritt et al., 1998) and Mamestra brassicae (Maibéche-Coisnè et al., 1998). All of the species mentioned above utilize multiple compounds in their sex pheromone blend (Pelosi & Maida, 1995), and it has been further suggested that different PBPs expressed within a single species may play a role in pheromone blend discrimination by binding selectively to each of the pheromone components (Vogt et al., 1991; Prestwich et al., 1995). Evidence for this model is based on the expression of different OBP subtypes in subsets of olfactory sensilla in both Lepidoptera and Diptera (Laue et al., 1994; Vogt et al., 1991; Steinbrecht et al., 1992; McKenna et al., 1994; Steinbrecht, 1996) and on differences in the affinities of recombinant PBPs from Antheraea species for pheromone analogues (Du & Prestwich, 1995).

The turnip moth, Agrotis segetum, has been a model system for researchers for over a decade, but to date no full-length pheromone binding protein, complementary DNA (cDNA) or gene sequence has been described in this species. This noctuid moth is a significant pest of a wide variety of crops across Europe, Eurasia and parts of Africa. The mean ratio of the three major pheromone compounds, the proportion of sensilla specific to each pheromone compound, and the male behavioural response vary significantly across these regions (Löfstedt et al., 1986; Hansson et al., 1990; Tóth et al., 1992). The greatest difference in blend ratios can be seen between the European populations and the Zimbabwean population. Scandinavian females produce a blend of the three major components in a mean ratio of 1/5/2.5 (Z5–10:OAc/Z7–12:OAc/Z9–14:OAc) (Löfstedt et al., 1982). Zimbabwean females produce the same range of volatiles as Scandinavian females; however, the three major components are produced in a mean ratio of 1/0.25/0.03 (Wu, 1995; LaForest et al., 1997). In both field and flight tunnel tests, males from each of these populations are significantly more attracted to the mean blend ratio produced by females from their own population (Wu, 1995). These data suggest that the two populations may be in the process of genetic isolation and divergence of their systems of mate recognition.

PBP electrophoretic mobility variation has been previously demonstrated in two populations of A. polyphemus (Vogt & Prestwich, 1988). Intraspecific nucleotide variation at a PBP locus has not been characterized in any moth species to date. Our goal was to describe and analyse nucleotide and amino acid polymorphism in the PBPs of two pheromone races of A. segetum. Messenger RNA coding for a PBP sequence was cloned and sequenced, and primers derived from the cDNA were used to amplify PBP alleles in both populations. Tests for the presence of natural selection were used to analyse variation at silent sites within the coding and non-coding region of the PBP gene, which may indicate the presence of selectively advantageous or disadvantageous alleles coding for different PBPs. A phylogenetic analysis of nucleotide variation was used to determine if Scandinavian and Zimbabwean alleles form distinct clades, which may indicate genetic isolation.

Results

As partial 3′ PBP cDNA sequences were available (Prestwich et al., 1995), a 5′ rapid amplification of cDNA ends (RACE) cloning strategy was used to obtain full-length cDNA sequences. Briefly, a 3′ gene-specific primer (primer 1) was used to drive first-strand cDNA synthesis, an anchor sequence of fourteen dCTP nucleotides was added to the 5′ end of the cDNA, and the resulting cDNA was then amplified using gene specific primers 2 and 3 at the 3′ end and a 5′ anchor primer containing a 3′ complementary dGTP tail (Fig. 1). The RACE-based cloning strategy yielded seven positive clones. Sequencing of these seven positives yielded only one cDNA coding for a single protein. This clone, which we will call Aseg-1, is 429 bp long and codes for 143 amino acids (Fig. 2). The predicted amino acid sequence differs by one amino acid change from the partial PBP sequence (112 amino acids) described by Prestwich et al. (1995). By manual alignment with the Prestwich et al. (1995) sequence and other published PBP sequences, we determined that this sequence codes for a mature PBP.

Figure 1. Sequences of primers used in the cDNA cloning and amplification of PBP genes from genomic DNA, and a schematic representation of sites for PCR primer annealing and direction of amplification. Patterns indicate homologous coding regions between the cDNA and the gene. SP = inferred signal peptide, based on the 5′ RACE cDNA. Primer 6 contains sequence from the 5′ RACE cDNA sequence (GenBank accession no. AF134293).

Primers designed from the full cDNA sequences (1–6) were used to amplify copies of the PBP gene from Scandinavian and Zimbabwean individuals. Although DNA was extracted from ten individuals from each of the two populations, only eight Scandinavian individuals and seven Zimbabwean individuals supported amplification. In some of these individuals, sequencing of several positive clones resulted in several identical sequences, and one or two additional sequences which differed from the others by only one nucleotide. When the nucleotide difference was shared with a sequence amplified from another individual, the sequence was included in the data set as a separate allele, since the chance of the same polymerase chain reaction (PCR)-derived misincorporation occurring at the same site in two sequences is extremely low. When the nucleotide difference was unique to an individual, the sequence was not included in the analysis. If the discarded sequences represented different alleles rather than simply Taq DNA polymerase error, eliminating these sequences from the analyses may have created a bias in our tests for the presence of natural selection by lowering the number of rare alleles. However, only three of twenty-two unique sequences were eliminated from the analyses in this manner.
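The inclusion rule described above can be illustrated with a short sketch. It is a simplification under stated assumptions, not the procedure actually used: each clone is reduced to a hypothetical (individual, sequence) pair, an individual's most frequent clone sequence is always kept, and a minor sequence is kept only if the identical sequence was also recovered from a different individual. The function name `call_alleles` and the toy sequences are invented for illustration.

```python
from collections import Counter, defaultdict


def call_alleles(clones):
    """Return {individual: sorted list of sequences retained as alleles}.

    clones: iterable of (individual_id, sequence) pairs, one per sequenced clone.
    Simplified rule (an assumption): keep an individual's most frequent sequence,
    and keep a minor sequence only if another individual also carries it.
    """
    counts = defaultdict(Counter)   # clone counts per individual
    carriers = defaultdict(set)     # individuals in which each sequence was seen
    for individual, seq in clones:
        counts[individual][seq] += 1
        carriers[seq].add(individual)

    alleles = {}
    for individual, seq_counts in counts.items():
        major = seq_counts.most_common(1)[0][0]
        alleles[individual] = sorted(
            seq for seq in seq_counts
            if seq == major or len(carriers[seq]) > 1
        )
    return alleles


# Toy data: S1's minor clone matches S2's major sequence and is kept;
# S2's minor clone is seen nowhere else and is treated as possible Taq error.
example_clones = [
    ("S1", "ACGT"), ("S1", "ACGT"), ("S1", "ACGA"),
    ("S2", "ACGA"), ("S2", "ACGA"), ("S2", "TCGA"),
]
alleles = call_alleles(example_clones)
```

A variant confined to a single individual is discarded even if it is a genuine rare allele, which is the source of the possible bias against rare alleles noted in the text.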

From these positive clones, nineteen alleles representing ten Scandinavian and nine Zimbabwean alleles were found. In one of the Scandinavian alleles (S2A, Fig. 3) the region encompassing all three exons and both introns was fully sequenced (GenBank accession no. AF134294). Intron/exon boundaries (Fig. 2) are identical to those of the only other PBP gene sequenced to date, Aper-1 from Antheraea pernyi (Krieger et al., 1991). In Aper-1, intron 1 is 363 and intron 2 is 992 nucleotides long; in the A. segetum PBP gene S2A intron 1 is 318 and intron 2 is 993 nucleotides long. Of the remaining eighteen alleles, only the region encompassing the first intron and exons 1, 2 and 3 was then sequenced. Polymorphisms are present in each of the three exons as well as the first intron (Figs 2 and 3). In both sets of alleles, most of the polymorphisms occur in silent sites in the intron and in synonymous sites in the exons. There are ten fixed nucleotide differences between the two populations, all of which are in intron 1. In exons 1, 2 and 3 in the Scandinavian alleles, there are a total of eighteen synonymous polymorphisms (segregating sites) out of an estimated 70.46 effectively silent sites, and three replacement polymorphisms out of 271.54 non-synonymous sites (Table 1). The first intron contains seventy segregating sites out of a total of 256 sites. By contrast, the Zimbabwean alleles appear less variable: in exons 1, 2 and 3 of the Zimbabwean alleles, there are a total of five synonymous segregating sites out of an estimated 70.7 effectively silent sites, and two replacement polymorphisms out of 271.37 non-synonymous sites. In the Zimbabwean alleles, the first intron contains seven segregating sites out of a total of 313 sites.

Figure 2. Full nucleotide and inferred amino acid sequence of the A. segetum PBP cDNA Aseg-1 coding for mature protein (GenBank accession no. AF134292), with the positions of the two introns of the corresponding gene indicated. Nucleotide residues in Aseg-1 are numbered in the 5′ to 3′ direction. Sites that show variation in the PBP gene sequences are highlighted and underlined. Predicted amino acid polymorphisms in the PBP gene sequences are indicated below the relevant nucleotide site. Highlighted amino acids correspond to the putative pheromone binding region in the PBPs of A. polyphemus (Du et al., 1994).

Figure 3. CLUSTAL W alignment of partial sequences of nineteen PBP alleles from A. segetum in the PBP locus. Only variable sites are shown. The numbers above the sequence are the position numbers of each variable site, and are read vertically. Nucleotides that are identical to the corresponding nucleotide in the reference sequence, S1A, are indicated with a dot. Intron and exon boundaries are shown above the site numbers. For each of the alleles, the first letter indicates the population of origin (S = Scandinavia, Z = Zimbabwe), the number indicates the individual from which it was amplified, and the second letter indicates the allele. Variable sites that occur within the putative binding region (481, 487 and 509) are located in Exon 2 and are underlined in the reference sequence. * = Sites at which nucleotide substitutions result in amino acid replacements. Gaps introduced by CLUSTAL W are indicated by dashes. ? = Missing sequence, and alleles that required primer 2 for amplification. Sequence variation that occurred in other alleles at these sites was not included in analyses. GenBank accession numbers are as follows: S1A: AF134253, AF134254; S2A: AF134294; S3A: AF134255, AF134256; S4A: AF134257, AF134258; S4B: AF134259, AF134260; S5A: AF134261, AF134262; S6A: AF134263, AF134264; S7A: AF134265, AF134266; S7B: AF134267, AF134268; S8A: AF134269, AF134270; Z1A: AF134271, AF134272; Z1B: AF134273, AF134274; Z2A: AF134275, AF134276; Z3A: AF134277, AF134278; Z3B: AF134279, AF134280; Z4A: AF134281, AF134282; Z5A: AF134283, AF134284; Z6A: AF134285, AF134286; Z7A: AF134287, AF134288.

A phylogenetic analysis using the neighbour-joining method was conducted in order to address the question of whether alleles from Scandinavian and Zimbabwean populations form distinct clades. A neighbour-joining tree is shown in Fig. 4. This tree supports the inclusion of all the Zimbabwean alleles and seven of the Scandinavian alleles as distinct clades. However, the relationship between S2A, S4A, S4B and the clade consisting of the other Scandinavian alleles is not strongly supported. Total nucleotide diversity and silent nucleotide diversity were assessed by computing two estimates of heterozygosity, θW (Watterson, 1975) and θT (Tajima, 1989), for alleles from each population (Table 2). To compare levels of silent nucleotide diversity between the two populations, the χ² test was used. The ratio of segregating silent sites (88) to fixed sites (238.8) in the Scandinavian population is significantly greater than the ratio of segregating silent sites (12) to fixed sites (372.04) in the Zimbabwean population (χ² = 84.796, P = 0.0001).

Table 1. The observed number of fixed and segregating non-synonymous (non-syn.), synonymous (syn.) and non-coding sites (silent) in each set of alleles.

                 Exon 1: 1–66        Intron: 67–406   Exon 2: 407–587     Exon 3: 588–769
                 non-syn.   syn.     silent           non-syn.   syn.     non-syn.   syn.
Scandinavia
  Total sites    54.17      11.83    256              141.87     38.13    75.5       20.5
  Fixed          54.17      9.83     182              139.87     27.13    74.5       15.5
  Segregating    0          2        70               2          11       1          5
Zimbabwe
  Total sites    54.17      11.83    313              141.87     38.2     75.33      20.67
  Fixed          54.17      10.83    302              139.87     35.2     75.33      19.67
  Segregating    0          1        7                2          3        0          2

Figure 4. Neighbour-joining tree (Saitou & Nei, 1987) of PBP alleles, including all sites, using the Hvir-1 cDNA from the noctuid Heliothis virescens (Krieger et al., 1993) as an outgroup. Sequences were aligned in CLUSTAL W (Thompson et al., 1994) and distance estimates generated under the Kimura two-parameter model (Kimura, 1980). Bootstrap values are indicated at each node. Branch lengths are proportional and the scale of distance is indicated. For each of the alleles, the first letter indicates the population of origin (S = Scandinavia, Z = Zimbabwe), the number indicates the individual from which it was amplified, and the second letter indicates the allele. Identical alleles are represented by a single sequence in the analysis and are shown here separated by commas.

The possibility that the observed variation may reflect selection on one or more amino acid polymorphisms was addressed by using two statistical tests for the presence of natural selection on nucleotide data, Tajima’s (1989) and Fu and Li’s (1993) tests. Tajima’s test is based on detecting significant differences between θW and θT, and Fu and Li’s test compares the numbers of mutations on internal and external branches of the gene genealogy with those expected under a neutral model. Both methods generate a D-test statistic, the significance of which may be tested by referring to the relevant 95% confidence intervals supplied by Tajima (1989) and Fu & Li (1993). Significantly positive values of D indicate an excess of intermediate frequency polymorphisms, as might be expected if there is a stable amino acid polymorphism maintained by balancing selection, and significantly negative values of D indicate an excess of singletons, which may result from a recent selective sweep or from deleterious amino acid variation. In both cases the D statistic fell well within the predicted 95% confidence limits, so the null model of neutral variation was not rejected.
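To make the logic of Tajima's test concrete, the following is a minimal sketch of the D statistic computed from an ungapped alignment, using the standard constants from Tajima (1989). It is illustrative only: the four toy sequences are invented, sites are treated as simple matches or mismatches, and the values are per sequence rather than per site, so nothing here reproduces the estimates in Table 2.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (S, pairwise_diversity, theta_w, D) for equal-length sequences."""
    n = len(seqs)
    length = len(seqs[0])

    # S: number of segregating (polymorphic) sites in the sample.
    s = sum(1 for i in range(length) if len({seq[i] for seq in seqs}) > 1)

    # Mean number of pairwise differences (Tajima's estimator of theta).
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    theta_w = s / a1  # Watterson's estimator, per sequence
    d = (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return s, pi, theta_w, d


# Invented toy alignment (not data from this study).
toy_alignment = ["AAAA", "AAAT", "AATT", "ATTT"]
S, pi, theta_w, D = tajimas_d(toy_alignment)
print(f"S={S} pi={pi:.3f} theta_W={theta_w:.3f} D={D:.3f}")
```

D is positive when the pairwise estimate exceeds Watterson's estimate (an excess of intermediate-frequency variants) and negative when rare variants dominate; the function is undefined for a sample with no segregating sites.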

Inspection of Table 1 reveals high levels of silent polymorphism associated with exon 2 amongst the Scandinavian alleles. This exon maps to a putative binding region described by Du et al. (1994) in a recombinant PBP of A. polyphemus, and contains the only amino acid polymorphism shared across both populations. Levels of variation at silent sites will be elevated if they are closely linked to a stable amino acid polymorphism maintained by balancing selection, or reduced if the amino acid polymorphism is either deleterious or in the process of rapid adaptive fixation. As an alternative to Tajima’s and Fu & Li’s tests, a goodness-of-fit test (χ²) (Kreitman & Hudson, 1991) was used. In this test, obvious differences in selective constraints across coding regions can be eliminated as a factor by considering only variation at synonymous and non-coding sites, which in the absence of intragenic recombination are considered tightly linked to sites under selection which occur in the same locus. The test compares the number of observed fixed and segregating silent sites within each region with estimated numbers of fixed and segregating silent sites given an overall estimate of θW for the entire gene. With an overall silent θW = 0.09519 for Scandinavia and θW = 0.01150 for Zimbabwe, the observed number of segregating sites for each region is not significantly different from the expected number of segregating sites in either population (data not shown).
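As a worked illustration of how the overall silent θW relates to the site counts, the sketch below applies Watterson's (1975) estimator per site, θW = S / (a1 L), to the silent-site totals quoted in the Results (88 segregating and 238.8 fixed sites for ten Scandinavian alleles; 12 and 372.04 for nine Zimbabwean alleles). With these inputs it returns the reported values to five decimal places. The per-region expectations are a simplified reading of the goodness-of-fit comparison, not the full Kreitman & Hudson (1991) procedure, and the region sizes assume that the smaller exon 2 site count in Table 1 is the synonymous column.

```python
def harmonic_a1(n_alleles):
    """Watterson's a1: the sum of 1/i for i = 1 .. n-1."""
    return sum(1.0 / i for i in range(1, n_alleles))


def theta_w_per_site(n_segregating, n_sites, n_alleles):
    """Watterson's estimator of theta, expressed per site."""
    return n_segregating / (harmonic_a1(n_alleles) * n_sites)


def expected_segregating(theta_per_site, n_sites, n_alleles):
    """Segregating sites expected in a region of n_sites under one shared theta."""
    return theta_per_site * harmonic_a1(n_alleles) * n_sites


# Silent sites = fixed + segregating, as quoted in the Results.
theta_scandinavia = theta_w_per_site(88, 238.8 + 88, 10)
theta_zimbabwe = theta_w_per_site(12, 372.04 + 12, 9)
print(round(theta_scandinavia, 5), round(theta_zimbabwe, 5))

# Scandinavian silent regions: (silent sites, observed segregating sites),
# read from Table 1 under the column assumption stated above.
scandinavian_regions = {
    "exon 1 (syn.)": (11.83, 2),
    "intron 1": (256, 70),
    "exon 2 (syn.)": (38.13, 11),
    "exon 3 (syn.)": (20.5, 5),
}
for name, (sites, observed) in scandinavian_regions.items():
    expected = expected_segregating(theta_scandinavia, sites, 10)
    print(f"{name}: observed {observed}, expected {expected:.1f}")
```

Under a single θW the expected count in each region is simply proportional to its number of silent sites, so a region tightly linked to a balanced polymorphism would stand out as an excess of observed over expected segregating sites.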

Discussion

Many female moths produce multiple compounds in their pheromone blend. Male moths possess thousands of sensory hairs containing individual sensory neurones that respond selectively to each of these compounds. Consequently, receptor proteins that are specific for each compound must be associated with these neurones. Within the sensory hairs, PBPs may play a role in selectively binding and transporting each compound to its appropriate neuronal receptor (Vogt et al., 1991; Prestwich et al., 1995). If PBPs play a role in blend discrimination, diversifying selection is expected to influence PBP sequence divergence between species that use different pheromone compounds, or between PBP loci within a species if each gene product binds selectively to each compound. The presence of multiple PBPs in a number of different species suggested that cDNAs coding for more than one PBP may also be found in A. segetum. Only one unique A. segetum cDNA was obtained through the methods employed here, and this result may reflect the presence of only one A. segetum PBP gene. Alternatively, if there are multiple PBP loci, this result may be due to difficulties in screening for non-allelic products that share low amino acid identity, such as between the two PBPs of L. dispar, which share only 50% identity (Merritt et al., 1998).

Table 2. Estimates of nucleotide diversity.

              All sites                 Silent sites              D-test statistic
Population    θW           θT           θW           θT           Tajima’s        Fu and Li’s
Scandinavia   0.05772      0.05144      0.09519      0.09397      −0.063          0.110
(n = 10)      (± 0.024)    (± 0.025)    (± 0.039)    (± 0.045)    (−1.73–1.98)    (−1.81–1.42)
Zimbabwe      0.00684      0.00569      0.01150      0.00969      −0.744          −1.02
(n = 9)       (± 0.0033)   (± 0.0031)   (± 0.0056)   (± 0.0053)   (−1.71–1.95)    (−1.79–1.42)

Note. The standard errors of estimates of θW and θT are indicated in the parentheses below each. Ninety-five per cent confidence limits are indicated in parentheses below each D statistic. No D statistics are significant.

Because both populations currently use the same three pheromone compounds, albeit in different ratios, we do not expect there to be selection that would result in divergence of PBP sequence between them. Although only one cDNA corresponding to a single PBP was identified, sequencing of several alleles in both populations revealed amino acid variation. The PBP sequences obtained are all quite similar at the amino acid level, showing only four variable amino acid residues among the nineteen sequences, none of which represents fixed differences between the two populations. Of the nucleotide polymorphisms that result in amino acid replacement, one (serine/proline) includes a change in polarity, occurs in both populations and is located within the putative binding region. Three other replacement polymorphisms include leucine/proline, lysine/arginine, and proline/leucine substitutions (Fig. 2). The possibility that one or more of the observed variants may be in the process of elimination through purifying selection, fixation through positive selection, or retained as a polymorphism through balancing selection was tested using Tajima’s (1989) and Fu and Li’s (1993) methods, as well as a goodness-of-fit test (Kreitman & Hudson, 1991).

These tests failed to detect a significant departure from the null hypothesis, indicating that the observed amino acid variation is most likely to be neutral. The estimates of silent nucleotide diversity (θW and θT), particularly those derived from the Scandinavian population, are either comparable to or higher than these estimates in most other nuclear genes which have been sequenced in Drosophila melanogaster, D. simulans and D. pseudoobscura (Moriyama & Powell, 1996). Our values of diversity estimated from silent sites are higher than estimates which consider variation at all sites; this reflects a typical pattern of selective constraint on amino acid sequence (Li, 1997, pp. 237–267). If there are multiple loci in A. segetum, each of which produces PBPs that bind specifically to a single component, amino acid variation within this locus might be relatively conserved due to selection for binding to only one of the compounds.

Balancing selection maintains a polymorphism or polymorphisms within a population against loss by genetic drift or fixation by natural selection. The coding region of the Adh locus in D. melanogaster has been demonstrated to be quite heterogeneous when compared to other loci in D. melanogaster, with an excess of silent nucleotide polymorphisms in the third exon which has been attributed to a single amino acid polymorphism maintained by balancing selection (Hudson et al., 1987; Kreitman & Hudson, 1991). A PBP polymorphism may be maintained by selection if each amino acid allele plays a unique role in a male moth’s ability to discriminate between two or more different compounds in a pheromone blend. The serine/proline polymorphism which occurs in the binding region may be a balanced polymorphism which we were unable to detect with our tests.

At site 509 (Fig. 3), we can presume that the T is ancestral because it is possessed by the Heliothis virescens cDNA used as an outgroup in our phylogenetic analyses. The T→C substitution, which is predicted to result in the replacement of serine by proline, is shared by S4A, A1A and A2A. Thus, if S4A and S4B are monophyletic with the other Scandinavian alleles, it is most likely that the Ser/Pro substitution occurred before the split between the two populations. In contrast to the polar amino acid serine, proline is non-polar, and differs from all other amino acids in that its side chain loops back and binds to its own amine nitrogen, which is likely to cause a bend in the polypeptide chain within the binding region. In the PBP genes of other lepidopteran species sequenced thus far, the site which contains this intraspecific polymorphism contains only serine or alanine (Pelosi & Maida, 1995). If this mutation is neutral, it suggests a greater flexibility in the shape of the binding region than previously assumed.
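The link between a single T→C change and a serine-to-proline replacement can be checked against the standard genetic code. The sketch below builds the standard codon table and lists every single T→C substitution that converts a serine codon into a proline codon. The actual codon at site 509 is not given in the text, so this shows only which codon changes are possible in principle.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def ser_to_pro_by_t_to_c():
    """List (serine codon, codon position, proline codon) for single T->C changes."""
    changes = []
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "S":
            continue
        for pos, base in enumerate(codon):
            if base != "T":
                continue
            mutant = codon[:pos] + "C" + codon[pos + 1:]
            if CODON_TABLE[mutant] == "P":
                changes.append((codon, pos, mutant))
    return sorted(changes)


changes = ser_to_pro_by_t_to_c()
for codon, pos, mutant in changes:
    print(f"{codon} -> {mutant} (codon position {pos + 1})")
```

Only the four TCN serine codons can become proline (CCN) in this way, and always through a change at the first codon position, which is why such a substitution is necessarily a replacement rather than a silent change.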

Separation of the Scandinavian and Zimbabwean alleles (with the exception of alleles S4A and S4B) in our phylogenetic analysis suggests that these two populations are in the process of genetic isolation. Throughout North America, Europe and Eurasia, several moths in the genus Agrotis are known migrants and A. segetum is a suspected migrant (Wu, 1995). The existence of distinct ‘pheromone races’, in which female pheromone blend and male response patterns vary across Europe and Eurasia, also suggests that this species is not truly panmictic despite its wide range and flight capabilities. All other forces being equal, the significantly lower levels of nucleotide diversity at silent sites in Zimbabwean alleles are consistent with a model of loss of variation through genetic drift during the history of this lineage, possibly due to a recent bottlenecking event or a recent selective sweep either at the PBP locus or at a closely linked locus. The pattern of bottlenecking and lower effective population size may be explained if the Zimbabwean population is an isolate from a larger European/Eurasian population. This is in contrast to the opposite pattern in D. melanogaster of expansion from Africa into temperate regions, also suggested by estimates of nucleotide diversity at nuclear loci (Begun & Aquadro, 1993).
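The population contrast in silent-site diversity rests on the comparison of fixed and segregating silent sites given in the Results (238.8 fixed and 88 segregating in Scandinavia; 372.04 fixed and 12 segregating in Zimbabwe). A minimal sketch of such a comparison, assuming a plain Pearson χ² test on a 2 × 2 table with one degree of freedom, is given below. The exact procedure behind the reported χ² = 84.796 is not specified in the text, so this sketch gives a statistic of similar magnitude rather than reproducing that figure.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p_value); the p-value is the upper tail for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: Scandinavia, Zimbabwe; columns: fixed, segregating silent sites.
chi2, p_value = pearson_chi2_2x2(238.8, 88, 372.04, 12)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2e}")
```

Whatever the exact variant of the test, the proportion of silent sites that are segregating differs by roughly an order of magnitude between the two samples (about 27% against about 3%).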

PBPs almost certainly play a role in the filtering of pheromone signals from ‘molecular noise’ during the process of pheromone perception. Current research in PBPs focuses on the extent to which variation between PBPs also plays a role in discrimination between the pheromone compounds of different species, or in discrimination between different components produced in a single pheromone blend. Evidence from phylogenetic analyses suggests that the evolution of the lepidopteran OBP gene family is marked by a series of gene duplications followed by diversification, a common pattern in molecular evolution (LaForest, 1998). Other phylogenetic analyses suggest that sequence divergence between two L. dispar PBPs has been influenced by natural selection (Merritt et al., 1998). Now, more direct evidence is needed to address the issue of whether selection for specific binding to unique pheromone compounds is a major force shaping PBP variation. Work is underway to uncover other lepidopteran OBP loci, characterize the biochemical properties of the proteins expressed, and examine PBP nucleotide and amino acid variation.

Experimental procedures

Source of Agrotis segetum moths

Scandinavian A. segetum were taken from a laboratory culture. Each generation consisted of several hundred offspring that were the result of a mass mating between ten to fifteen randomly selected males and ten to fifteen randomly selected females. Every five to six generations this culture was refreshed by mating approximately fifteen laboratory-bred adults with approximately fifteen adults that emerged from wild pupae collected from Denmark and Southern Sweden. The Zimbabwean A. segetum culture was established from pupae collected in Zimbabwe, courtesy of Dr C. B. Cottrell. The Zimbabwean individuals used in this study were the offspring of a colony that was started from adult insects which had emerged from fifty wild collected pupae, and which had been maintained in the manner described above for the Scandinavian insects for two generations.

cDNA cloning

Although partial 3′ cDNA sequences were available from a previous cDNA library, 192 base pairs in the 5′ upstream region, including the signal peptide and start codon, were unknown. To obtain a 5′ cDNA sequence, a RACE strategy was therefore employed. Fifty A. segetum antennae were collected from 1- to 2-day-old males and frozen in liquid nitrogen. Total RNA was extracted from these antennae using RNAzol (guanidinium thiocyanate solution from Tel-Test, Inc.), precipitated in isopropanol in the presence of 0.2 M NaCl, and washed in 75% ethanol. cDNAs were synthesized using a Gibco BRL kit for 5′ Rapid Amplification of cDNA Ends. First strand cDNA synthesis was conducted using total RNA and primer 1 (Fig. 1). Primer 1 contains the twenty-one 3′ terminal nucleotides of a cDNA sequence which was obtained from the previous cDNA library, corresponding to the C-terminal region which is highly conserved across all PBPs sequenced to date (Prestwich et al., 1995). An anchor sequence consisting of a homopolymeric tail of fourteen dCTP nucleotides was then added to the 5′ end of the cDNA using terminal transferase and dCTP. The tailed cDNA was then PCR amplified using a second internal gene-specific primer 2, which is derived from the partial cDNA sequence, and the Gibco BRL deoxyinosine-containing Anchor Primer provided with the system.

For all PCR reactions, 2 units of Taq DNA polymerase were used in buffer containing 1 mM of each primer, 20 mM Tris (pH 8.4), 50 mM KCl, 0.20 mM of each dNTP, and varying concentrations (1.5–2.5 mM) of MgCl2. PCR was carried out for forty cycles under the following conditions: denaturation at 95 °C for 1 min, annealing at 45 °C for 30 s, and extension at 72 °C for 1 min. One microlitre of a 1/250 dilution of the resulting PCR product was then reamplified using a third, internal gene-specific primer 3, also derived from partial cDNA sequence, with the Gibco BRL Universal Amplification primer. PCR was carried out for thirty-five cycles under the following conditions: denaturation at 95 °C for 1 min, annealing at 50 °C for 30 s, and extension at 72 °C for 1 min. The resulting RACE product was then purified using Qiagen PCR purification kits, and sequenced on an ABI automatic sequencer.

To obtain a full-length cDNA sequence, fresh RNA was isolated from male antennae using the RNAzol kit and procedures described above. New first strand cDNA was synthesized with a Boehringer Mannheim kit and gene-specific primer 1. This cDNA mixture was PCR amplified using primer 5 and primer 1. PCR product from this amplification was gel purified, and then ligated overnight at 4 °C into pGEM-T vector (Promega). The recombinant plasmid was transformed into competent SURE strain Escherichia coli which were plated on to LB/amp/X-gal medium for screening by β-galactosidase. Positive clones identified by β-galactosidase were further screened for positives by selecting white colonies, resuspending these colonies in PCR buffer, and amplifying using primers 3 and 5. Colonies yielding specific amplification products of the appropriate size were re-plated and plasmid was isolated using a Qiagen mini-prep kit. The plasmid preps were sequenced using Sequenase (US Biochemical) and T7, Sp6 and gene-specific primers. Sequencing reactions were run on both standard acrylamide gels and Long Ranger acrylamide gels (AT Biochem), and both strands were completely sequenced.

Polymerase chain reaction amplification and sequencing of pheromone binding protein genes

Genomic DNA was extracted and purified from single larvae (10 mg each) frozen in liquid nitrogen. Ten nanograms of genomic template was PCR amplified using primers 6 and 1, derived from cDNA sequence obtained as described above. For some individuals internal primer 2 was required to yield amplification. PCR product was precipitated in ethanol in the presence of 1.5 M NH4OAc, washed in 75% ethanol, dried, and resuspended in ligation buffer. To separate sequences of different haplotypes present in heterozygous individuals, PCR product was then cloned into pGEM-T vector, and the recombinant vector was transformed into competent E. coli cells. These cells were screened for the presence of positives by β-galactosidase. For each individual moth, twenty to twenty-five positives identified by β-galactosidase were then rescreened using PCR as described above with primers 2 and 5. Three to five positives per individual were then sequenced as described above.

Analysis of DNA sequence data

The region that encompasses exon 1, the first intron, and exons 2 and 3 was sequenced in each allele and aligned in CLUSTAL W (Thompson et al., 1994). With the exception of maximum parsimony analysis, all analysis of aligned sequences excluded sites containing gaps inserted by CLUSTAL W and sites with missing information. Phylogenetic analysis using the neighbour-joining method (Saitou & Nei, 1987) on distance estimates generated under the Kimura 2-parameter model (Kimura, 1980) was conducted with the assistance of MEGA (Kumar et al., 1993). Two estimates of nucleotide heterozygosity per site (θ) were employed: θw (Watterson, 1975) and θT (Tajima, 1983). θw and θT were estimated with the assistance of DnaSP software (Rozas & Rozas, 1997). θw and θT were estimated at all sites (non-synonymous, synonymous and non-coding sites), and at silent sites (synonymous and non-coding sites). The number of non-synonymous (replacement) and synonymous sites in each exon was calculated with the assistance of DnaSP using the method of Nei & Gojobori (1986). All sites in the intron, excluding the GT/AG splice junctions, were considered effectively silent. The neutrality of the observed nucleotide variation at silent sites in the PBP locus was also tested using Tajima’s (1989) and Fu and Li’s (1993) methods. A goodness-of-fit (χ2) test was used to assess differences in levels of diversity at silent sites between Scandinavia and Zimbabwe by comparing the number of observed fixed and segregating sites between the Scandinavian and Zimbabwean alleles.
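The two heterozygosity estimators can be sketched as follows. This is a minimal illustration of the standard formulas, not the DnaSP implementation used in this study. Watterson's estimator divides the number of segregating sites S by a1 = 1 + 1/2 + ... + 1/(n − 1) for n sequences, and Tajima's estimator is the mean number of pairwise differences; both are expressed per site. The alignment `toy` is hypothetical and is not data from this paper, and the functions assume a gap-free alignment of equal-length sequences.

```python
from itertools import combinations


def segregating_sites(alignment):
    """Count alignment columns that contain more than one nucleotide."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def watterson_theta(alignment):
    """Watterson's (1975) estimator per site: S / (a1 * L)."""
    n = len(alignment)
    length = len(alignment[0])
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(alignment) / a1 / length


def tajima_theta(alignment):
    """Tajima's (1983) estimator per site: mean pairwise differences / L."""
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    total = sum(
        sum(a != b for a, b in zip(seq1, seq2)) for seq1, seq2 in pairs
    )
    return total / len(pairs) / length


# Hypothetical toy alignment: four sequences, ten sites, gaps already removed.
toy = [
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "ACGTACGTAC",
]

if __name__ == "__main__":
    print("S =", segregating_sites(toy))
    print("theta_w =", watterson_theta(toy))
    print("theta_T =", tajima_theta(toy))
```

Under neutrality and constant population size the two estimators have the same expectation, and Tajima's (1989) test is based on the difference between them.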

Acknowledgements

We thank G. Du for assistance with Aseg-1 cloning, and D. Futuyma, W. Eanes, B. Verrelli and I.-N. Wang for valuable comments on an earlier version of this manuscript. We also thank the NSF for a US–Sweden International Cooperative Research Award (INT-0914102), the USDA (Grant no. 9601859) and the Herman Frasch Foundation for financial support.

References

Begun, D.J. and Aquadro, C.F. (1993) African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365: 548–550.

Dickens, J.C., Callahan, F.E., Wergin, W.P. and Erbe, E.F. (1995) Olfaction in a hemimetabolous insect: antennal-specific protein in adult Lygus lineolaris (Heteroptera: Miridae). J Insect Physiol 41: 857–867.

Du, G.H., Ng, C.S. and Prestwich, G.D. (1994) Odorant binding by a pheromone binding protein: active site mapping by photoaffinity labeling. Biochemistry 33: 4812–4819.

Du, G. and Prestwich, G.D. (1995) Protein structure encodes the ligand binding specificity in pheromone binding proteins. Biochemistry 34: 8726–8732.

Feng, L. and Prestwich, G.D. (1997) Expression and characterization of a lepidopteran general odorant binding protein. Insect Biochem Mol Biol 27: 405–412.

Fu, Y.X. and Li, W.H. (1993) Statistical tests of neutrality of mutations. Genetics 133: 693–709.

Hansson, B.S., Tóth, M., Löfstedt, C., Szöcs, G., Subchev, M. and Löfqvist, J. (1990) Pheromone variation among eastern European and a western Asian population of the turnip moth Agrotis segetum. J Chem Ecol 16: 1611–1622.

Hudson, R.R., Kreitman, M. and Aguadé, M. (1987) A test of neutral molecular evolution based on nucleotide data. Genetics 116: 153–159.

Kimura, M. (1980) A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 2: 87–90.

Kreitman, M. and Hudson, R.R. (1991) Inferring the evolutionary histories of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence. Genetics 127: 565–582.

Krieger, J., Raming, J. and Breer, H. (1991) Cloning of genomic and complementary DNA encoding insect pheromone binding proteins: evidence for microdiversity. Biochim Biophys Acta 1088: 277–284.

Krieger, J., Raming, J., Ganssle, H. and Breer, H. (1993) Odorant binding protein of Heliothis virescens. Insect Biochem Mol Biol 23: 449–456.

Kumar, S., Tamura, K. and Nei, M. (1993) Mega: molecular evolutionary genetics analysis. The Pennsylvania State University, University Park.

LaForest, S. (1998) The evolution of sex pheromone communication in the turnip moth, Agrotis segetum. PhD Thesis. State University of New York at Stony Brook, Stony Brook, New York.

LaForest, S., Wu, W. and Löfstedt, C. (1997) A genetic analysis of female pheromone production and male response in the turnip moth, Agrotis segetum. J Chem Ecol 23: 1487–1503.

Laue, M., Steinbrecht, R.A. and Ziegelberger, G. (1994) Immunocytochemical localization of general odorant binding in olfactory sensilla of the silkmoth Antheraea polyphemus. Naturwissenschaften 81: 178–180.

Li, W.-H. (1997) Molecular Evolution. Sinauer Associates, Inc., Sunderland, MA.

Löfstedt, C., Löfqvist, J., Lanne, B.S., Van Der Pers, J.N.C. and Hansson, B.S. (1986) Pheromone dialects in European turnip moths Agrotis segetum. Oikos 46: 250–257.

Löfstedt, C., Van Der Pers, J.N.C., Löfqvist, L., Lanne, B.S., Appelgren, M., Bergström, G. and Thelin, B. (1982) Sex pheromone components of the turnip moth, Agrotis segetum: chemical identification, electrophysiological evaluation and behavioral activity. J Chem Ecol 8: 1305–1321.

Maibéche-Coisnè, M., Jacquin-Joly, E., François, M.-C. and Nagnan-Le Meillour, P. (1998) Molecular cloning of two pheromone binding proteins in the cabbage armyworm Mamestra brassicae. Insect Biochem Mol Biol 28: 815–818.

McKenna, M.P., Hekmat, S.D., Gaines, P. and Carlson, J.R. (1994) Putative Drosophila pheromone-binding proteins expressed in a subregion of the olfactory system. J Biol Chemistry 269: 16340–16347.

Merritt, T.J., LaForest, S., Prestwich, G.D., Quattro, J.M. and Vogt, D.G. (1998) Patterns of gene duplication in lepidopteran pheromone binding proteins. J Mol Evol 46: 272–276.

Moriyama, E.N. and Powell, J.R. (1996) Intraspecific nuclear DNA variation in Drosophila. Mol Biol Evol 13: 261–277.

Nei, M. and Gojobori, T. (1986) Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3: 418–426.

Pelosi, P. and Maida, R. (1995) Odorant-binding proteins in insects. Comp Biochem Physiology B Biochem Mol Biol 111: 503–514.

Pikielny, C.W., Hasan, G., Rouyer, F. and Rosbach, M. (1994) Members of a family of Drosophila putative odorant-binding proteins are expressed in different subsets of olfactory hairs. Neuron 12: 35–49.

Prestwich, G.D. (1993) Chemical studies of pheromone receptors in insects. Arch Insect Biochem Physiol 22: 75–83.


Prestwich, G.D., Du, G. and LaForest, S.M. (1995) How is pheromone specificity encoded in proteins? Chem Senses 20: 461–469.

Raming, K., Krieger, J. and Breer, H. (1990) Primary structure of a pheromone-binding protein from Antheraea pernyi: homologies with other ligand-carrying proteins. J Comp Physiol B 160: 503–509.

Rozas, J. and Rozas, R. (1997) DnaSP, Version 2.0: a novel software package for extensive molecular population genetics analysis. Comput Applic Biosci 13: 307–311.

Saitou, N. and Nei, M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4: 406–425.

Steinbrecht, R.A. (1996) Are odorant-binding proteins involved in odorant discrimination? Chem Senses 20: 461–469.

Steinbrecht, R.A., Ozaki, M. and Ziegelberger, G. (1992) Immunocytochemical localization of pheromone-binding protein in moth antennae. Cell Tissue Res 270: 287–302.

Tajima, F. (1983) Evolutionary relationship of DNA sequences in finite populations. Genetics 105: 437–460.

Tajima, F. (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585–595.

Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Research 22: 4673–4680.

Tóth, M., Löfstedt, C., Blair, B.W., et al. (11 co-authors) (1992) Attraction of male turnip moths Agrotis segetum (Lepidoptera: Noctuidae) to sex pheromone components and their mixtures at 11 sites in Europe, Asia and Africa. J Chem Ecol 18: 1337–1347.

Tuccini, A., Maida, R., Rovero, P., Mazza, M. and Pelosi, P. (1996) Putative odorant-binding protein in antennae and legs of Carausius morosus (Insecta, Phasmatodea). Insect Biochem Mol Biol 26: 19–24.

Vogt, R.G. (1987) The molecular basis of pheromone reception: its influence on behavior. Pheromone Biochemistry (Prestwich, G.D. and Blomquist, G.J., eds), pp. 385–431. Academic Press, Orlando, FL.

Vogt, R.G. and Prestwich, G.D. (1988) Variation in olfactory proteins: evolvable elements encoding insect behavior. Ann N Y Acad Sci 510: 689–691.

Vogt, R.G. and Riddiford, L.M. (1981) Pheromone binding and inactivation by moth antennae. Nature 293: 161–163.

Vogt, R.G., Koehne, A.C., Dubnau, J.T. and Prestwich, G.D. (1989) Expression of pheromone binding proteins during development in the gypsy moth, Lymantria dispar. J Neurosci 9: 3332–3348.

Vogt, R.G., Prestwich, G.D. and Lerner, M.R. (1991) Odorant binding protein subfamilies associate with distinct classes of olfactory receptor neurons in insects. J Neurobiol 22: 74–84.

Vogt, R.G., Prestwich, G.D. and Riddiford, L.M. (1988) Sex pheromone receptor proteins: visualization using a radiolabeled photoaffinity analog. J Biol Chem 263: 3952–3959.

Watterson, G.A. (1975) On the number of segregating sites in genetical models without recombination. Theor Pop Biol 7: 256–276.

Wojtaseck, H., Hansson, B.S. and Leal, W.S. (1998) Attracted or repelled? A matter of two neurons, one pheromone binding protein, and a chiral center. Biochem Biophys Res Comm 250: 217–222.

Wu, W.-Q. (1995) Mechanisms of specificity in moth pheromone production and response. PhD Thesis. Lund University, Sweden.
