
Analytical Biochemistry 427 (2012) 193–201


journal homepage: www.elsevier.com/locate/yabio

Design and synthesis of cleavable biotinylated dideoxynucleotides for DNA sequencing by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry

Chunmei Qiu a,b, Shiv Kumar a,b, Jia Guo a,c, Lin Yu a,b, Wenjing Guo a,b, Shundi Shi a,b, James J. Russo a,b, Jingyue Ju a,b,⇑

a Columbia Genome Center, Columbia University College of Physicians and Surgeons, New York, NY 10032, USA
b Department of Chemical Engineering and Pharmacology, Columbia University, New York, NY 10027, USA
c Department of Chemistry, Columbia University, New York, NY 10027, USA

Article info

Article history:
Received 3 March 2012
Accepted 17 April 2012
Available online 25 April 2012

Keywords:
MALDI–TOF MS
DNA sequencing
Cleavable biotinylated nucleotides
Solid-phase capture

Abstract

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI–TOF MS)-based methods have been widely explored for DNA sequencing. We report here the design, synthesis, and evaluation of a novel set of chemically cleavable biotinylated dideoxynucleotides, ddNTPs-N3-biotin, for the DNA polymerase extension reaction and its application in DNA sequencing by mass spectrometry (MS). These nucleotide analogs have a biotin moiety attached to the 5 position of the pyrimidines (C and U) or the 7 position of the purines (A and G) via a chemically cleavable azido-based linker, with different length linker arms serving as mass tags that contribute to large mass differences among the nucleotides. We demonstrate that these modified nucleotides are efficiently incorporated by DNA polymerase, and the DNA strand bearing biotinylated nucleotides is captured by streptavidin-coated beads and efficiently released using tris(2-carboxyethyl)phosphine in aqueous solution, which is compatible with DNA and downstream procedures. We performed Sanger sequencing reactions using these nucleotides to generate DNA fragments for MALDI–TOF MS analysis. Both synthetic DNA and polymerase chain reaction (PCR) products were accurately decoded, and a read length of approximately 37 bases was achieved using these nucleotides in MS sequencing.

© 2012 Elsevier Inc. All rights reserved.

⇑ Corresponding author at: Department of Chemical Engineering and Pharmacology, Columbia University, New York, NY 10027, USA. Fax: +1 212 851 9330.
E-mail address: [email protected] (J. Ju).

0003-2697/$ - see front matter © 2012 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ab.2012.04.021

Since the completion of the first human genome project [1], there has been increased interest in resequencing of target genes at high throughput, with high accuracy and sensitivity, with the goal of understanding the molecular basis of disease and the development of new therapeutics. Although next generation sequencing approaches have become the methods of choice for very large-scale projects such as resequencing of the whole genome or every exon or for deep transcriptome sequencing [2], in many cases one is interested in sequencing only a limited area (one or a few genes or coding regions) with rapid speed. In these cases, next generation sequencing is unwarranted and too expensive, leaving only the traditional electrophoresis-based Sanger sequencing approach most conveniently performed with fluorescent dideoxynucleotides [3–6]. Although the latter is well established, producing generally high quality and long sequence reads, it has some limitations [7]. In addition to its being time-consuming and carrying an overall high cost per read, in certain regions where single nucleotide polymorphisms (SNPs)¹ fall within or where there are deletions or insertions, sequence quality becomes problematic. Moreover, despite substantial improvements during recent years, due to the resolution of the separation matrices and gel-based artifacts caused by secondary structures in GC-rich sequence, difficulties exist in identifying the first few bases after the priming site. One option that can overcome some of these limitations and is particularly efficient at sequencing bases immediately succeeding the primer is to take advantage of mass spectrometry (MS) to distinguish the four nucleotides incorporated during the polymerase reaction.

¹ Abbreviations used: SNP, single nucleotide polymorphism; MS, mass spectrometry; MALDI–TOF MS, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry; RHOD, ras homolog family member D; TCEP, tris(2-carboxyethyl)phosphine; PCR, polymerase chain reaction; NMR, nuclear magnetic resonance; ESI, electrospray ionization; HPLC, high-performance liquid chromatography; DMF, N,N-dimethyl formamide; B/W, binding and washing.

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI–TOF MS) has been widely explored for rapid and accurate DNA sequencing [8–16]. MALDI–TOF MS completely eliminates some of the difficulties typically encountered in fluorescence capillary electrophoresis sequencing systems, achieves fast separation of DNA fragments in microseconds, provides high resolution and sensitivity, and generates highly accurate data because mass measurements are based directly on the intrinsic properties of the molecules as opposed to the added fluorescent tag. In general, in MS sequencing, the Sanger dideoxy sequencing reaction is used to produce DNA sequencing fragments for MS characterization. In this MS-based analysis, besides stringent purity requirements, such as the elimination of alkaline and alkaline earth salts and other contaminants, it is critical to isolate and purify DNA sequencing fragments from other DNA strands in the reaction mixture such as excess primers and falsely terminated DNA fragments. Therefore, we previously developed solid-phase sequencing chemistry using biotinylated dideoxynucleotides that allowed streptavidin-coated solid-phase isolation of 3′ biotin-terminated DNA extension products for accurate DNA sequencing and genotyping [8]. This method was shown to be particularly advantageous in detecting deletions and insertions in candidate disease genes [17].

The approach for solid-phase isolation of DNA fragments took advantage of the strong, specific, and stable interaction between biotin and streptavidin [18] that has been used for a wide variety of biological purifications [19–23]. Nonetheless, harsh conditions, such as treatment with formamide at a high temperature, are required to cleave the biotin–streptavidin bond. This complicates downstream procedures because the isolated products need to be ethanol precipitated prior to the desalting step for MS analysis. This is a tedious, time-consuming process and is prone to sample loss. Although some groups have tried to develop mild approaches for breaking the bond between biotin and streptavidin [24], the presence of the biotin moiety in purified DNA fragments introduces its own complications in higher resolution MS analysis because biotin contains a sulfur atom that exists as four major stable isotopes. In addition, the decreasing resolving capacity of the mass spectrometer for larger DNA fragments requires biotinylated dideoxynucleotides with appropriate mass differences to achieve high resolution and accuracy. These problems could be solved by introducing cleavable biotinylated nucleotide analogs [25], thereby improving the sensitivity, accuracy, and efficiency of MS-based sequencing.

We report here the design, synthesis, and evaluation of a new set of chemically cleavable biotinylated dideoxynucleotides, ddNTPs-N3-biotin (ddATP-N3-biotin, ddGTP-N3-biotin, ddCTP-N3-biotin, and ddUTP-N3-biotin), for the DNA polymerase extension reaction and demonstrate their application in MS-based DNA sequencing of part of the RHOD (ras homolog family member D) oncogene. In an accompanying article [31], we describe their utility for multiplex SNP genotyping by single base extension focusing on allelic heteroplasmy in mitochondrial disease. The nucleotide analogs are designed with a biotin moiety attached to the 5 position of pyrimidines (C and U) or the 7 position of purines (A and G) via a cleavable linker, and nucleotide precursors with two different length carbon linker arms between the base and the biotin molecule were used to increase the mass differences among these nucleotide analogs. Previously, we reported that chemically cleavable fluorescent dideoxynucleotides using azido-based linkers could be used successfully for DNA sequencing by synthesis, and the fluorophores were shown to be completely removable under very mild cleavage conditions using an aqueous tris(2-carboxyethyl)phosphine (TCEP) solution [26]. Similarly, we reasoned that azido linkers were ideal to attach the biotin to the nucleotides in the current MS-based sequencing design because the linker can be cleaved by TCEP very efficiently and TCEP is compatible with the downstream desalting step. We demonstrate here that ddNTPs-N3-biotin are able to be incorporated into the growing DNA strands during the polymerase extension reaction and that DNA strands with biotinylated dideoxynucleotides at their 3′ end can be efficiently captured by streptavidin-coated magnetic beads and then released from the beads with TCEP. We performed Sanger DNA sequencing reactions using ddNTPs-N3-biotin to generate DNA fragments of different lengths that were characterized by MS. Synthetic DNA templates and polymerase chain reaction (PCR) products were accurately sequenced, and a read length of more than 35 bases was achieved using these nucleotides in MS sequencing. The same nucleotide analogs have also been used successfully in multiplex SNP detection by MALDI–TOF MS, as demonstrated in the accompanying article [31]. Similarly designed nucleotides should be useful as well in isolating DNA–protein complexes.

Materials and methods

All chemicals were purchased from Sigma–Aldrich unless otherwise indicated. ¹H NMR (nuclear magnetic resonance) spectra were recorded on a Bruker DPX-400 (400 MHz) spectrometer. Electrospray ionization (ESI) mass spectra were recorded on a Bruker Daltonics Esquire 6000 mass spectrometer. Mass measurement of DNA was performed on a Voyager DE MALDI–TOF mass spectrometer (Applied Biosystems/Life Technologies, San Diego, CA, USA). High-performance liquid chromatography (HPLC) was performed on a Waters system (Milford, MA, USA) consisting of a Rheodyne 7725i injector, a 600 controller, and a 996 photodiode array detector. Oligonucleotides were purchased from Integrated DNA Technologies (IDT, Coralville, IA, USA). Thermo Sequenase was obtained from GE Healthcare (Piscataway, NJ, USA). Streptavidin-coated magnetic beads (Dynabeads MyOne Streptavidin C1) were obtained from Life Technologies.

Synthesis of ddNTP-N3-biotin

The propargylamino-dideoxynucleotides (1–4) were either purchased from PerkinElmer Life and Analytical Sciences or prepared following the procedure described by Hobbs and Cocuzza [27]. The longer linker arm dideoxynucleotides (5 and 6) were prepared according to Duthie and coworkers [28] and purified by HPLC (Fig. 1). The azido linker, (2-{2-[3-(2-amino-ethylcarbamoyl)-phenoxy]-1-azido-ethoxy}-ethoxyl)-acetic acid (7, Fig. 2) was prepared according to the literature [26,29].

Synthesis of biotin-N3-linker acid (8)

Azido linker (7, 74 mg, 0.2 mmol) was dissolved in anhydrous N,N-dimethyl formamide (DMF, 5 ml) and 1.5 ml of 1 M NaHCO3 aqueous solution. A solution of biotin–NHS (N-hydroxysuccinimide) ester (75 mg, 0.22 mmol) in 4 ml of anhydrous DMF was added slowly to the stirred reaction mixture and stirred overnight at room temperature. The reaction mixture was concentrated in vacuo and purified on a silica gel chromatography column using 25% methanol in methylene chloride to 40% methanol in methylene chloride. The appropriate fractions were combined and concentrated to give compound 8 as a white solid (72 mg, 60%). ¹H NMR (400 MHz, D2O): δ 7.48–7.39 (m, 3H), 7.16 (d, 1H), 5.05 (t, 1H), 4.48 (t, 1H), 4.29–4.20 (m, 3H), 4.00–3.90 (m, 4H), 3.78 (m, 2H), 3.53–3.45 (2d, 4H), 2.91–2.87 (dd, 1H), 2.69 (d, 1H), 2.36 (t, 2H), 1.66–1.37 (m, 6H); FAB–MS m/z: calculated for C25H36N7O8S (M+H⁺) 594.65; found 594.75.
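As a quick consistency check on the quantities above, the reported 60% yield can be reproduced from the stated amounts. This is a minimal sketch and not part of the original procedure: the neutral molar mass of 8 is an assumption obtained by subtracting one hydrogen from the calculated (M+H⁺) value, and all names are illustrative.

```python
def percent_yield(product_mg, limiting_mmol, product_molar_mass):
    """Percent yield relative to the limiting reagent (mmol * g/mol = mg)."""
    theoretical_mg = limiting_mmol * product_molar_mass
    return 100.0 * product_mg / theoretical_mg

# Quantities quoted in the procedure.
LINKER_7_MMOL = 0.2       # azido linker 7, limiting reagent
BIOTIN_NHS_MMOL = 0.22    # biotin-NHS ester
PRODUCT_8_MG = 72.0       # isolated compound 8

# Assumption: neutral molar mass = calculated (M+H+) value minus one hydrogen.
PRODUCT_8_MOLAR_MASS = 594.65 - 1.008

equivalents = BIOTIN_NHS_MMOL / LINKER_7_MMOL
yield_pct = percent_yield(PRODUCT_8_MG, LINKER_7_MMOL, PRODUCT_8_MOLAR_MASS)

print(f"biotin-NHS equivalents: {equivalents:.2f}")
print(f"yield of 8: {yield_pct:.1f}%")
```

Under these assumptions the biotin–NHS ester is used in a slight excess (about 1.1 equivalents), and the computed yield agrees with the reported value after rounding.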

[Figure 1: chemical structures of 7-propargylamino-ddATP (4-ddATP, 1), 5-propargylamino-ddCTP (4-ddCTP, 2), 7-propargylamino-ddGTP (4-ddGTP, 3), 5-propargylamino-ddUTP (4-ddUTP, 4), 7-propargylamidoaminocaproyl-ddGTP (11-ddGTP, 5), and 5-propargylamidoaminocaproyl-ddUTP (11-ddUTP, 6). Conditions shown for the linker extension giving 5 and 6: (a) 0.1 M carbonate-bicarbonate buffer, pH 8.5; (b) NH4OH.]

Fig. 1. Structures of propargylamino-ddNTPs (1–4) and the nucleotides 11-ddGTP (5) and 11-ddUTP (6) with elongated linkers. The extension of the linkers to form the 11-ddGTP (5) and 11-ddUTP (6) nucleotides is carried out to increase their molecular weight relative to the 4-ddATP (1) and 4-ddCTP (2), respectively.

General method for synthesis of ddNTP-N3-biotin (10–13)

Biotin-N3-linker acid (8, 4 mg, 6.75 µmol) was coevaporated with anhydrous DMF under reduced pressure and redissolved in anhydrous DMF (0.8 ml). A solution of O-(N-succinimidyl)-1,1,3,3-tetramethyluronium tetrafluoroborate (TSTU, 20 µmol) in anhydrous DMF (0.4 ml) was added to the stirred solution under an argon atmosphere, and the reaction mixture was stirred at room temperature for 15 min. The appropriate amino-ddNTP (1, 2, 5, or 6, 4 µmol) in 0.1 M NaHCO3–Na2CO3 buffer (pH 8.7, 300 µl) was added to the activated ester, and the reaction mixture was stirred at room temperature overnight (Fig. 2). The reaction mixture was purified by reverse-phase HPLC on a 150 × 4.6 mm C18 column (Supelco, Bellefonte, PA, USA) using the following mobile phase: A, 8.6 mM triethylamine/100 mM hexafluoroisopropyl alcohol in water (pH 8.1); B, methanol. Elution was performed with 100% A isocratic for 10 min, followed by a linear gradient of 0 to 50% B for 20 min and then 50% B isocratic for another 20 min. The isolated pure adducts were characterized by ESI–MS analysis and single base extension followed by MALDI–TOF MS analysis.
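The elution program can be written as a piecewise function of time, which makes it easier to see where in the program a given retention time falls. The sketch below is illustrative only: the function name is hypothetical, and it ignores instrument dwell volume and column dead time, so the composition actually reaching the column may lag the programmed value.

```python
def percent_b(t_min):
    """Programmed %B (methanol) at time t_min for the 50 min elution program."""
    if t_min < 0 or t_min > 50:
        raise ValueError("time outside the 50 min program")
    if t_min <= 10:
        return 0.0                             # 100% A isocratic
    if t_min <= 30:
        return 50.0 * (t_min - 10) / 20        # linear gradient, 0 to 50% B
    return 50.0                                # 50% B isocratic

for t in (0, 10, 20, 30, 30.7, 50):
    print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

Under these assumptions, the retention times reported for compounds 10–13 (30.1–30.8 min) fall right at the end of the linear gradient, where the program reaches 50% B.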

ddATP-N3-biotin (10). HPLC retention time 30.7 min; TOF MS ES+ m/z: anal. calculated for C39H52N12O18P3S (M–H⁻) 1101.89; found 1101.2.

ddCTP-N3-biotin (11). HPLC retention time 30.1 min; TOF MS ES+ m/z: anal. calculated for C37H51N11O19P3S (M–H⁻) 1078.85; found 1078.2.

ddGTP-N3-biotin (12). HPLC retention time 30.8 min; TOF MS ES+ m/z: anal. calculated for C45H63N13O20P3S (M–H⁻) 1231.04; found 1230.3.

ddUTP-N3-biotin (13). HPLC retention time 30.68 min; TOF MS ES+ m/z: anal. calculated for C43H61N11O21P3S (M–H⁻) 1192.99; found 1192.2.
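The calculated values above can be cross-checked against the printed molecular formulas. The following sketch is not from the original article; it uses rounded standard atomic weights and monoisotopic masses (assumed reference values, with the electron mass ignored) and a small formula parser written for this illustration.

```python
import re

# Rounded standard atomic weights (average) and monoisotopic masses in Da.
# These reference values are assumptions of this sketch, not from the article.
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974, "S": 32.06}
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "P": 30.973762, "S": 31.972071}

def parse_formula(formula):
    """Turn a simple formula such as 'C39H52N12O18P3S' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts

def formula_mass(formula, table):
    """Sum the tabulated mass of every atom in the formula."""
    return sum(table[element] * n for element, n in parse_formula(formula).items())

# name: (formula as printed, calculated value, found value)
COMPOUNDS = {
    "ddATP-N3-biotin (10)": ("C39H52N12O18P3S", 1101.89, 1101.2),
    "ddCTP-N3-biotin (11)": ("C37H51N11O19P3S", 1078.85, 1078.2),
    "ddGTP-N3-biotin (12)": ("C45H63N13O20P3S", 1231.04, 1230.3),
    "ddUTP-N3-biotin (13)": ("C43H61N11O21P3S", 1192.99, 1192.2),
}

for name, (formula, calculated, found) in COMPOUNDS.items():
    average = formula_mass(formula, AVERAGE)
    mono = formula_mass(formula, MONOISOTOPIC)
    print(f"{name}: average {average:.2f} (printed {calculated}), "
          f"monoisotopic {mono:.2f} (found {found})")
```

On this arithmetic, the printed "calculated" figures appear to correspond to the average masses of the printed formulas, whereas the "found" values lie closer to the monoisotopic masses. This is an observation from the sketch, not a statement made by the authors.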

Polymerase extension using ddNTPs-N3-biotin, solid-phase capture, and cleavage

[Figure 2: reaction scheme. Azido linker 7 is reacted with biotin–NHS ester (1 M NaHCO3, DMF) to give biotin-N3-linker acid 8; treatment with 50 mM TSTU in DMF gives the activated ester 9; coupling with amino-ddNTP 1, 2, 5, or 6 in 0.1 M NaHCO3/Na2CO3 (pH 8.7) and DMF gives ddNTP-N3-biotin 10–13.]

Fig. 2. Synthesis and structures of biotin-N3-linker attached ddNTPs.

The four cleavable biotinylated dideoxynucleotides, ddNTPs-N3-biotin (ddATP-N3-biotin, ddGTP-N3-biotin, ddCTP-N3-biotin, and ddUTP-N3-biotin), were first characterized by performing four separate single base extension reactions, each with a different self-priming DNA template allowing the four dideoxynucleotide analogs to be incorporated. The following four self-priming DNA templates (26mer hairpin DNA with a 4-base 5′ overhang) were used for the extension: 5′-GACTGCGCCGCGCCTTGGCGCGGCGC-3′ for ddATP-N3-biotin, 5′-GATCGCGCCGCGCCTTGGCGCGGCGC-3′ for ddGTP-N3-biotin, 5′-ATCGGCGCCGCGCCTTGGCGCGGCGC-3′ for ddCTP-N3-biotin, and 5′-GTCAGCGCCGCGCCTTGGCGCGGCGC-3′ for ddUTP-N3-biotin. Each of the extension reactions consisted of 40 pmol of self-priming DNA template, 60 pmol of corresponding ddNTP-N3-biotin, 1× Thermo Sequenase reaction buffer, and 2 U of Thermo Sequenase in a total volume of 20 µl. The reaction mixture was incubated at 65 °C for 15 min. For the incorporation test, the extension products were desalted by using ZipTips (Millipore, Billerica, MA, USA) and analyzed by MALDI–TOF MS. The matrix solution was made by dissolving 35 mg of 3-hydroxypicolinic acid with 6 mg of ammonium citrate in 800 µl of 50% acetonitrile. For the cleavage, extension products were mixed with 20 µl of TCEP solution (100 mM, pH 9.0, adjusted with ammonium hydroxide) and incubated at 65 °C for 10 min to yield DNA cleavage products that were characterized by MALDI–TOF MS. To evaluate solid-phase capture efficiency and DNA recovery from the solid phase, the same single base extension reactions were performed, and 20 µl of the single base extension products was incubated with 20 µl of streptavidin-coated magnetic beads that had been prewashed with 1× binding and washing (B/W) buffer (5 mM Tris–HCl, 0.5 mM EDTA, and 1 M NaCl, pH 7.5) three times, resuspended in 20 µl of 2× B/W buffer, and allowed to incubate for 1 h at room temperature. The streptavidin-coated magnetic beads bearing extended self-priming DNA templates were washed three times with 1× B/W buffer and three times with deionized water and then suspended in 20 µl of TCEP solution and incubated at 65 °C for 25 min. This process removed the biotin moiety from the dideoxynucleotides and, hence, released extended self-priming templates from the magnetic beads. The supernatant was collected and desalted with ZipTips for MALDI–TOF MS characterization.
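The pairing of each self-priming template with its dideoxynucleotide analog follows from the hairpin design: the 3′ end folds back onto the stem, so the first base to be copied is the overhang base adjacent to the stem. The sketch below illustrates this reasoning; the 10-base stem and 2-base loop are assumptions inferred from the 26mer length and the 4-base overhang, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Self-priming templates as listed in the text, keyed by the analog they test.
TEMPLATES = {
    "ddATP-N3-biotin": "GACTGCGCCGCGCCTTGGCGCGGCGC",
    "ddGTP-N3-biotin": "GATCGCGCCGCGCCTTGGCGCGGCGC",
    "ddCTP-N3-biotin": "ATCGGCGCCGCGCCTTGGCGCGGCGC",
    "ddUTP-N3-biotin": "GTCAGCGCCGCGCCTTGGCGCGGCGC",
}
OVERHANG = 4   # 4-base 5' overhang, from the text
STEM = 10      # assumed stem length: (26 - 4 overhang - 2 loop) / 2

def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def stem_is_paired(template):
    """Check that the 3' arm is the reverse complement of the 5' arm."""
    five_prime_arm = template[OVERHANG:OVERHANG + STEM]
    three_prime_arm = template[-STEM:]
    return five_prime_arm == reverse_complement(three_prime_arm)

def expected_terminator(template):
    """Name the analog complementary to the first unpaired template base."""
    templating_base = template[OVERHANG - 1]   # overhang base next to the stem
    base = COMPLEMENT[templating_base]
    base = "U" if base == "T" else base        # the T analog here is ddUTP
    return f"dd{base}TP-N3-biotin"

for name, seq in TEMPLATES.items():
    print(name, len(seq), stem_is_paired(seq), expected_terminator(seq))
```

Because each 5′ overhang contains one each of G, A, T, and C, the four templates share the same base composition, which is consistent with a single unextended primer mass being quoted for all four reactions.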

Sanger DNA sequencing reaction

A synthetic 90mer template with sequence related to a portion of the RHOD gene (5′-TGCCTCTCCGAAGCCTCCTCACACCCTCCCCCGCCCTGCTTCTCCTCAGAGCTACACCCCCACGGTGTTTGAGCGGTACATGGTCAACCT-3′) and the corresponding primer (5′-TGTACCGCTC-3′) were first used to test the sequencing method using ddNTPs-N3-biotin. Another template, a 150-bp double-stranded PCR product that contains a portion of the RHOD gene (5′-CAACTACGCTCCACTGACCCCCAAGGAGGGAGCAGCGCTGTGGACAGACCAAGTCCCCAGTGCCTCTCCGAAGCCTCCTCACACCCTCCCCCGCCCTGCTTCTCCTCAGAGCTACACCCCCACGGTGTTTGAGCGGTACATGGTCAACCT-3′), was generated by PCR. The forward primer (5′-CAACTACGCTCCACTGACCC-3′) and reverse primer (5′-AGGTTGACCATGTACCGCTC-3′) were used for PCR, which was carried out in 50 µl of PCR cocktail mixture containing 10 ng of template (DNA extracted from anonymous sample), 10 nmol of dNTPs, 30 pmol of reverse and forward primers, 1 U of JumpStart REDTaq DNA Polymerase (Sigma–Aldrich, St. Louis, MO, USA), and 1× corresponding polymerase reaction buffer. The amplification was performed at 94 °C for 2 min, followed by 36 cycles of 94 °C for 20 s, 58 °C for 30 s, 72 °C for 30 s, and a final extension at 72 °C for 5 min. The PCR product was subsequently treated with ExoSAP-IT (USB/Affymetrix, Cleveland, OH, USA) and further purified using a MinElute PCR Purification Kit (Qiagen, Valencia, CA, USA) before conducting the sequencing reaction. The corresponding primer for sequencing the PCR product was 5′-CCACTGACCC-3′. Sanger sequencing reactions contained 2000 pmol each of dATP, dGTP, dCTP, and dTTP; 20 pmol each of ddATP-N3-biotin, ddGTP-N3-biotin, ddCTP-N3-biotin, and ddUTP-N3-biotin; 6 U of Thermo Sequenase; 1× Thermo Sequenase reaction buffer; 60 pmol of synthetic DNA template (or ~2 µg of PCR products); and 300 pmol of primer (200 pmol for PCR product) in a total volume of 20 µl. The sequencing reactions were subjected to 60 cycles of 94 °C for 30 s, 30 °C for 1 min (36 °C for PCR product), and 65 °C for 30 s.

Solid-phase purification of DNA sequencing products for MS measurements

The scheme for the solid-phase purification of DNA sequencing fragments is shown in Fig. 3. The detailed procedure is described as follows. First, 20 µl of DNA sequencing products were combined with 40 µl of streptavidin-coated magnetic beads that were prewashed with 1× B/W buffer and then resuspended in 20 µl of 2× B/W buffer and allowed to incubate for 1 h at room temperature. After solid-phase capture, the beads containing the biotinylated DNA fragments were washed three times with 1× B/W buffer and three times with deionized water. Then the beads were suspended in 30 µl of TCEP and incubated at 65 °C for 25 min. In this way, the biotin moiety was removed from the dideoxynucleotides, and different lengths of DNA sequencing fragments were released from the magnetic beads. The supernatant containing DNA sequencing fragments was desalted twice with a ZipTip and characterized by MS.

Fig. 3. Scheme for purification of DNA sequencing fragments for MALDI–TOF MS analysis. DNA sequencing fragments are isolated from the sequencing solution containing excess primers, falsely terminated fragments, and salts by streptavidin-coated magnetic beads. Then the sequencing fragments are cleaved from the beads with TCEP for MALDI–TOF MS analysis, leaving the biotin moiety still bound to the surface.
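The primer and template sequences listed for the Sanger sequencing reactions can be cross-checked with a few string operations. The sketch below is an illustration rather than part of the published method; hyphens that appear inside the printed sequences are assumed to be line-break artifacts and are omitted, and the variable names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))

# Sequences as printed in the text (5' to 3'), hyphen artifacts removed.
TEMPLATE_90 = ("TGCCTCTCCGAAGCCTCCTCACACCCTCCCCCGCCCTGCTTCTCCTCAGA"
               "GCTACACCCCCACGGTGTTTGAGCGGTACATGGTCAACCT")
# The printed 150-bp product ends with the 90mer, so it is built from a prefix.
AMPLICON = ("CAACTACGCTCCACTGACCCCCAAGGAGGGAGCAGCGCTGTGGACAGACCAAGTCCCCAG"
            + TEMPLATE_90)

SEQ_PRIMER_SYNTHETIC = "TGTACCGCTC"
FORWARD_PRIMER = "CAACTACGCTCCACTGACCC"
REVERSE_PRIMER = "AGGTTGACCATGTACCGCTC"
SEQ_PRIMER_PCR = "CCACTGACCC"

# The sequencing primer anneals where its reverse complement occurs in the template.
site = TEMPLATE_90.find(reverse_complement(SEQ_PRIMER_SYNTHETIC))

# Extension copies the template bases 5' of the annealing site, so the first
# incorporated bases are the reverse complement of that upstream region.
first_bases = reverse_complement(TEMPLATE_90[:site])[:5]

print("template length:", len(TEMPLATE_90))
print("primer site starts at index:", site)
print("first bases added to the primer:", first_bases)
print("amplicon length:", len(AMPLICON))
print("forward primer matches 5' end:", AMPLICON.startswith(FORWARD_PRIMER))
print("reverse primer matches 3' end:",
      AMPLICON.endswith(reverse_complement(REVERSE_PRIMER)))
print("PCR sequencing primer index:", AMPLICON.find(SEQ_PRIMER_PCR))
```

With the sequences transcribed this way, the sequencing primer for the synthetic template anneals near its 3′ end and leaves 70 template bases available for extension, and the PCR sequencing primer has the same sense as the printed strand, so it would anneal to the complementary strand of the double-stranded product.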

Results and discussion

We generated a set of cleavable biotinylated dideoxynucleotides, ddNTPs-N3-biotin, which are specifically designed for direct use in DNA polymerase reactions. The introduction of a chemically cleavable biotin moiety into the DNA strand enabled highly effective isolation and purification of the DNA extension products, which could then be applied for DNA sequencing by MALDI–TOF MS. The ddNTPs-N3-biotin, synthesized as shown in Fig. 2, have a biotin moiety attached to the 5 position of the pyrimidines (C/U) and the 7 position of the purines (A/G) via chemically cleavable azido-based linkers. It was shown previously that when these respective positions were modified with bulky fluorescent dyes through such azido-based linkers, DNA polymerase could still incorporate the modified nucleotides into the growing DNA strand and the azido-based linker could be efficiently cleaved by the Staudinger reaction using an aqueous TCEP solution [26]. Hence, azido-based linkers were chosen for this study.

With increasing DNA fragment size, mass spectral peak widths increase, resulting in a decreasing resolution of the mass spectrometer for measuring larger DNA fragments. In addition, very high accuracy is required to demonstrate the existence of single base polymorphisms (SNPs). Hence, for the unambiguous determination of sequence in the higher mass range as well as heterozygote detection (see accompanying paper for SNP genotyping [31]), it is important to design the modified nucleotides with clearly distinguishable mass tags [8,30]. In this study, as shown in Fig. 1, dideoxynucleotide precursors with two different carbon linker arms were chosen to increase the mass difference between different nucleotides before and after cleavage. Thus, the smallest mass difference between two modified dideoxynucleotides is 23 Da (A and C), whereas it is only 9 Da for standard ddNTPs (A and T) [30] and 16 Da for previously used biotinylated nucleotides (A and G) [8]. The mass difference between A and G shifted from 16 to 129 Da, and the mass difference between C and G shifted from 39 to 152 Da. These mass-tagged linker halves still remain after cleavage because they are positioned on the side of the azido group proximal to the base in the full linker. The adjusted masses of these tagged nucleotides clearly provide better resolution and enhanced accuracy within the separable range for DNA extension products.

[Figure 4: reaction scheme. Self-priming hairpin template → (Thermo Sequenase, ddATP-N3-biotin) → extension product (1) → (capture with streptavidin-coated magnetic beads and wash) → capture product (2) → (TCEP) → released product (3), ready for MS measurement.]

Fig. 4. Polymerase extension reaction using ddATP-N3-biotin as substrate and TCEP cleavage of DNA fragments containing ddA-N3-biotin on streptavidin-coated beads. The DNA polymerase Thermo Sequenase incorporates ddATP-N3-biotin onto the looped DNA template primer to generate the extension product (1), which is then captured by streptavidin-coated beads (capture product 2). Cleavage by TCEP of the DNA extension products captured on streptavidin-coated beads yields released product (3), whereas the biotin moiety remains on the solid surface of the beads.

Fig. 5. MALDI–TOF mass spectra of the DNA extension products, their subsequent cleavage products in solution, and released DNA products from the solid-phase streptavidin-coated magnetic beads: (A) primers extended with ddATP-N3-biotin (1) (8889 m/z); (B) their cleavage products (2) (8416 m/z); (C) released products from solid phase (3) (8416 m/z); (D) primers extended with ddGTP-N3-biotin (4) (9018 m/z); (E) their cleavage products (5) (8545 m/z); (F) released products from solid phase (6) (8545 m/z); (G) primers extended with ddCTP-N3-biotin (7) (8866 m/z); (H) their cleavage products (8) (8393 m/z); (I) released products from solid phase (9) (8393 m/z); (J) primers extended with ddUTP-N3-biotin (10) (8980 m/z); (K) their cleavage products (11) (8507 m/z); (L) released products from solid phase (12) (8507 m/z).

Fig. 6. Mass sequencing spectrum generated using ddNTPs-N3-biotin on synthetic template. The nested insets show increasing magnifications of the lower intensity region.

Fig. 7. Mass sequencing spectrum generated using ddNTPs-N3-biotin on a PCR product. The nested insets show increasing magnifications of the lower intensity region.

In developing the MS DNA sequencing method, it was essential that the biotinylated nucleotides could be efficiently incorporated into the growing DNA strand during the polymerase reaction and the DNA strand bearing biotinylated nucleotides could be effectively captured on the streptavidin-coated solid phase and then efficiently released after cleavage while leaving the biotin behind on the solid phase. To verify this, single base extension reactions with four corresponding self-priming DNA templates were carried out in solution and purified according to the scheme in Fig. 4. As shown in Fig. 5A, D, G, and J, essentially 100% incorporation was confirmed with MALDI–TOF MS by observing the disappearance of the primer peak (7966 m/z) and its replacement by the extension product for each dideoxynucleotide (8889 m/z for ddATP-N3-biotin, 9018 m/z for ddGTP-N3-biotin, 8866 m/z for ddCTP-N3-biotin, and 8980 m/z for ddUTP-N3-biotin). Incubation of the extension products in TCEP solution led to cleavage of the N3-based linker tethering the biotin to the dideoxynucleotides. As shown in Fig. 5B, E, H, and K, the mass peaks for the extension products have completely disappeared, whereas single peaks corresponding to the cleavage products appear at 8416, 8545, 8393, and 8507 m/z, respectively, indicating 100% cleavage efficiency.
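As a quick arithmetic check on the peaks listed above, the mass added on extension and the mass lost on TCEP cleavage can be computed for each nucleotide. This is a sketch only; it assumes the 7966 m/z primer peak applies to all four templates, and the names PEAKS and mass_shifts are illustrative.

```python
PRIMER_MZ = 7966  # primer peak reported in the text

# (extension product m/z, cleavage product m/z) as reported for Fig. 5.
PEAKS = {
    "ddATP-N3-biotin": (8889, 8416),
    "ddGTP-N3-biotin": (9018, 8545),
    "ddCTP-N3-biotin": (8866, 8393),
    "ddUTP-N3-biotin": (8980, 8507),
}


def mass_shifts(primer_mz, extended_mz, cleaved_mz):
    """Summarize the mass changes across extension and linker cleavage."""
    return {
        "added_on_extension": extended_mz - primer_mz,
        "lost_on_cleavage": extended_mz - cleaved_mz,
        "retained_after_cleavage": cleaved_mz - primer_mz,
    }


shifts = {
    name: mass_shifts(PRIMER_MZ, extended, cleaved)
    for name, (extended, cleaved) in PEAKS.items()
}

for name, values in shifts.items():
    print(name, values)
```

With the reported values, every nucleotide loses the same mass on cleavage, which is what one would expect if the same biotin-bearing linker half is removed in each case, while the retained mass differs by base and linker arm.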

To evaluate these modified nucleotides for solid-phase purification, after performing the same single base extension reactions, the extension products were captured on streptavidin-coated magnetic beads, released by TCEP, and subsequently analyzed by MALDI–TOF MS. The results are shown in Fig. 5C, F, I, and L, where single peaks corresponding to released extension products appear at 8416, 8545, 8393, and 8507 m/z, respectively, matching the expected masses of the cleavage products.

Taking ddATP-N3-biotin as an example, the detailed process is illustrated in Fig. 4. After single base extension, DNA extension product 1 (Fig. 5A) was first captured by streptavidin-coated magnetic beads. After cleavage by TCEP, the DNA extension products captured on streptavidin-coated magnetic beads yielded released product 3 (Fig. 5C), whereas the biotin moiety stayed on the solid surface of the beads. The portion of the linker that remained on the nucleotides served as an enhanced mass tag that increased the mass differences among the modified nucleotides. Compared with uncleavable biotinylated dideoxynucleotides used previously for solid-phase purification, there is no need for ethanol precipitation because there is no formamide treatment with the ddNTPs-N3-biotin approach. Because TCEP is compatible with the ZipTip desalting procedure, the supernatant containing released products is taken directly to the desalting step, saving time and also avoiding the sample loss associated with a precipitation step. Proper adjustment of the pH of the TCEP with ammonium hydroxide avoids the production and retention of alkaline salts typically remaining after ethanol precipitation, simplifying the overall desalting procedure. In addition, the released products from the solid phase do not contain the biotin moiety. This excludes the interference from the four isotopes of sulfur in the biotin molecule, resulting in higher resolution and clarity of the mass spectrum.

We first investigated the application of the cleavable biotinylated dideoxynucleotides in sequencing of a synthetic single-stranded DNA template. The resulting mass spectrum is shown in Fig. 6. The first peak in the spectrum corresponds to the primer plus the first nucleotide, which is complementary to the corresponding nucleotide in the DNA template. The mass difference between each peak and the prior dNTP-extended primer is measured to determine the identity of the base at each position because each dideoxynucleotide with its attached half linker has a unique molecular weight. The read length of 37 bases is an improvement over the previous noncleavable biotinylated dideoxynucleotide-based sequencing results [8], due mostly to the higher cleavage efficiency and lower product loss during the desalting step.
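The base-calling logic described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis procedure: the terminator increments are derived from the Fig. 5 cleavage-product peaks minus the 7966 m/z primer peak (assuming that primer mass applies to all four templates), the dNMP residue masses are standard average values that do not come from this article, and the function names and the 2 Da tolerance are hypothetical.

```python
# Assumed average masses (Da) of natural dNMP residues within a DNA strand.
# These are standard textbook values, not taken from the article.
DNMP_RESIDUE = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Mass added to the strand by each cleaved, tagged terminator, derived from
# Fig. 5 as (cleavage product m/z - 7966 primer m/z). Assumption: the same
# primer mass applies to all four single base extension templates.
TERMINATOR_INCREMENT = {"A": 450.0, "G": 579.0, "C": 427.0, "U": 541.0}


def call_bases(primer_mass, peaks, tolerance=2.0):
    """Read a sequence from a ladder of terminated-fragment masses."""
    calls = []
    running = primer_mass  # primer extended by the natural dNMPs called so far
    for peak in sorted(peaks):
        delta = peak - running
        # Pick the terminator whose increment is closest to the observed delta.
        base, increment = min(
            TERMINATOR_INCREMENT.items(), key=lambda kv: abs(kv[1] - delta)
        )
        if abs(increment - delta) > tolerance:
            raise ValueError(f"peak {peak:.1f} matches no tagged terminator")
        calls.append("T" if base == "U" else base)
        # The next fragment carries the natural dNMP at this position.
        running += DNMP_RESIDUE[calls[-1]]
    return "".join(calls)


def simulate_ladder(primer_mass, sequence):
    """Build an idealized ladder of fragment masses for a known sequence."""
    peaks, running = [], primer_mass
    for base in sequence:
        terminator = "U" if base == "T" else base
        peaks.append(running + TERMINATOR_INCREMENT[terminator])
        running += DNMP_RESIDUE[base]
    return peaks


ladder = simulate_ladder(7966.0, "GATTACA")
print(call_bases(7966.0, ladder))
```

The simulated ladder is idealized; real spectra would also require mass calibration and handling of adducts and noise peaks, which this sketch ignores.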


We next applied this method to sequence a PCR product (Fig. 7). A portion of the RHOD gene was first amplified by PCR on an anonymous sample. To remove the excess dNTPs and primers, ExoSAP-IT treatment and column purification were performed. After carrying out the sequencing steps, a read length of 32 bases was achieved. The appearance of a few extra peaks might have been caused by incomplete purification of the PCR product because the spectrum obtained with the synthetic template was free of such extraneous peaks. However, these peaks do not interfere with the sequence determination because they do not correspond to any obvious incorrectly extended products based on the known sequence of the RHOD gene. The sequencing of the PCR product demonstrates the potential of using these novel nucleotide analogs to sequence biological samples for mutation detection. With improvements in post-PCR cleanup and further optimization of sequencing conditions, longer read lengths with lower background signals might be achieved.

In conclusion, we have synthesized a set of chemically cleavable biotinylated and mass-tagged dideoxynucleotides, ddNTPs-N3-biotin, and evaluated their application in solid-phase purification of DNA extension products for DNA sequence determination by MALDI–TOF MS. These nucleotide analogs were shown to be excellent substrates for the DNA polymerase Thermo Sequenase in DNA extension reactions. DNA fragments incorporating ddNTPs-N3-biotin could be isolated via biotin–streptavidin interaction and efficiently recovered under mild conditions. The read length of 37 bases on a synthetic template and a similar read length on a biological sample demonstrate the potential of this approach for MS-based Sanger sequencing. These modified nucleotide analogs that carry a biotin and a chemically cleavable linker allow purification of DNA products under mild conditions, facilitating and simplifying sample handling, and the mass tags designed into these nucleotide analogs help to produce unambiguous nucleotide identification. Such nucleotides will be very useful for DNA sequencing by MS, especially for decoding short stretches of DNA containing polymorphisms where fluorescence-based Sanger sequencing is inadequate, for instance, in some long (>15 bases) homopolymer stretches and in the case of short indels in heterozygous individuals. Next-generation sequencing approaches, although accurate, are unnecessarily expensive for such limited sequences. These nucleotide analogs will also be invaluable in the design of MS-based multiplex genotyping assays, which, as shown in the accompanying article [31], can reveal low levels of heteroplasmy.

Acknowledgment

This work was supported by National Institutes of Health (NIH) Grants R01NS060762 and R01HG004774.

References

[1] E.S. Lander, L.M. Linton, B. Birren, C. Nusbaum, M.C. Zody, J. Baldwin, K. Devon, K. Dewar, M. Doyle, W. FitzHugh, et al., Initial sequencing and analysis of the human genome, Nature 409 (2001) 860–892.

[2] M.L. Metzker, Sequencing technologies—the next generation, Nat. Rev. Genet. 11 (2010) 31–46.

[3] F. Sanger, S. Nicklen, A.R. Coulson, DNA sequencing with chain-terminating inhibitors, Proc. Natl. Acad. Sci. USA 74 (1977) 5463–5467.

[4] L.M. Smith, J.Z. Sanders, R.J. Kaiser, P. Hughes, C. Dodd, C.R. Connell, C. Heiner, S.B. Kent, L.E. Hood, Fluorescence detection in automated DNA sequencing analysis, Nature 321 (1986) 674–679.

[5] J. Ju, C. Ruan, C.W. Fuller, A.N. Glazer, R.A. Mathies, Energy transfer fluorescent dye-labeled primers for DNA sequencing and analysis, Proc. Natl. Acad. Sci. USA 92 (1995) 4347–4351.

[6] I. Kheterpal, J.R. Scherer, S.M. Clark, A. Radhakrishnan, J. Ju, C.L. Ginther, G.F. Sensabaugh, R.A. Mathies, DNA sequencing using a four-color confocal fluorescence capillary array scanner, Electrophoresis 17 (1996) 1852–1859.

[7] J.M. Bowling, K.L. Bruner, J.L. Cmarik, C. Tibbetts, Neighboring nucleotide interactions during DNA sequencing gel electrophoresis, Nucleic Acids Res. 19 (1991) 3089–3097.

[8] J.R. Edwards, Y. Itagaki, J. Ju, DNA sequencing using biotinylated dideoxynucleotides and mass spectrometry, Nucleic Acids Res. 29 (2001) e104.

[9] D.J. Fu, K. Tang, A. Braun, D. Reuter, B. Darnhofer-Demar, D.P. Little, M.J. O’Donnell, C.R. Cantor, H. Köster, Sequencing exons 5 to 8 of the p53 gene by MALDI–TOF mass spectrometry, Nat. Biotechnol. 16 (1998) 381–384.

[10] M.C. Fitzgerald, L. Zhu, L.M. Smith, The analysis of mock DNA sequencing reactions using matrix-assisted laser desorption/ionization mass spectrometry, Rapid Commun. Mass Spectrom. 7 (1993) 895–897.

[11] H. Köster, K. Tang, D.J. Fu, A. Braun, D. van den Boom, C.L. Smith, R.J. Cotter, C.R. Cantor, A strategy for rapid and efficient DNA sequencing by mass spectrometry, Nat. Biotechnol. 14 (1996) 1123–1128.

[12] J. Monforte, C. Becker, High-throughput DNA analysis by time-of-flight mass spectrometry, Nat. Med. 3 (1997) 360–362.

[13] F. Kirpekar, E. Nordhoff, L.K. Larsen, K. Kristiansen, P. Roepstorff, F. Hillenkamp, DNA sequence analysis by MALDI mass spectrometry, Nucleic Acids Res. 26 (1998) 2554–2559.

[14] M.T. Roskey, P. Juhasz, I.P. Smirnov, E.J. Takach, S.A. Martin, L.A. Haff, DNA sequencing by delayed extraction–matrix-assisted laser desorption/ionization time of flight mass spectrometry, Proc. Natl. Acad. Sci. USA 93 (1996) 4724–4729.

[15] E. Nordhoff, D. Luebbert, G. Thiele, V. Heiser, H. Lehrach, Rapid determination of short DNA sequences by the use of MALDI–MS, Nucleic Acids Res. 28 (2000) e86.

[16] F. Mauger, K. Bauer, C.D. Calloway, J. Semhoun, T. Nishimoto, T.W. Myers, D.H. Gelfand, I.G. Gut, DNA sequencing by MALDI–TOF MS using alkali cleavage of RNA/DNA chimeras, Nucleic Acids Res. 35 (2007) e62.

[17] S. Kim, H.D. Ruparel, T.C. Gilliam, J. Ju, Digital genotyping using molecular affinity and mass spectrometry, Nat. Rev. Genet. 4 (2003) 1001–1008.

[18] O.H. Laitinen, V.P. Hytönen, H.R. Nordlund, M.S. Kulomaa, Genetically engineered avidins and streptavidins, Cell. Mol. Life Sci. 63 (2006) 2992–3017.

[19] M. Leblond-Francillard, M. Dreyfus, F. Rougeon, Isolation of DNA–protein complexes based on streptavidin and biotin interaction, Eur. J. Biochem. 166 (1987) 351–355.

[20] L.O. Penalva, J.D. Keene, Biotinylated tags for recovery and characterization of ribonucleoprotein complexes, BioTechniques 37 (2004) 604–610.

[21] V. Tchikov, S. Schütze, Immunomagnetic isolation of tumor necrosis factor receptosomes, Methods Enzymol. 442 (2008) 101–123.

[22] X. Tong, L.M. Smith, Solid-phase method for the purification of DNA sequencing reactions, Anal. Chem. 64 (1992) 2672–2677.

[23] T.L. Hawkins, T. O’Connor-Morin, A. Roy, C. Santillan, DNA purification and isolation using a solid phase, Nucleic Acids Res. 22 (1994) 4543–4544.

[24] A. Holmberg, A. Blomstergren, O. Nord, M. Lukacs, J. Lundeberg, M. Uhlén, The biotin–streptavidin interaction can be reversibly broken using water at elevated temperatures, Electrophoresis 26 (2005) 501–510.

[25] X. Bai, S. Kim, Z. Li, N.J. Turro, J. Ju, Design and synthesis of a photocleavable biotinylated nucleotide for DNA analysis by mass spectrometry, Nucleic Acids Res. 32 (2004) 535–541.

[26] J. Guo, N. Xu, Z. Li, S. Zhang, J. Wu, D.H. Kim, M.S. Marma, Q. Meng, H. Cao, X. Li, S. Shi, L. Yu, S. Kalachikov, J.J. Russo, N.J. Turro, J. Ju, Four-color DNA sequencing with 3′-O-modified nucleotide reversible terminators and chemically cleavable fluorescent dideoxynucleotides, Proc. Natl. Acad. Sci. USA 105 (2008) 9145–9150.

[27] F.W. Hobbs, A.J. Cocuzza, Alkynylamino-nucleotides, US patent 5047519 (1991).

[28] R.S. Duthie, I.M. Kalve, S.B. Samols, S. Hamilton, I. Livshin, M. Khot, S. Nampalli, S. Kumar, C.W. Fuller, Novel cyanine dye-labeled dideoxynucleoside triphosphates for DNA sequencing, Bioconjug. Chem. 13 (2002) 699–706.

[29] J. Milton, S. Ruediger, X. Liu, Labeled nucleotides, US patent application 2006/0160081 A1 (2006).

[30] Z. Fei, T. Ono, L.M. Smith, MALDI–TOF mass spectrometric typing of single nucleotide polymorphisms with mass-tagged ddNTPs, Nucleic Acids Res. 26 (1998) 2827–2828.

[31] C. Qiu, S. Kumar, J. Guo, J. Lu, S. Shi, S.M. Kalachikov, J.J. Russo, A.B. Naini, E.A. Schon, J. Ju, Mitochondrial single nucleotide polymorphism genotyping by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry using cleavable biotinylated dideoxynucleotides, Anal. Biochem. (2012), http://dx.doi.org/10.1016/j.ab.2012.05.001.