Post on 25-Apr-2023
Laura Poliseno (ed.), Pseudogenes: Functions and Protocols, Methods in Molecular Biology, vol. 1167, DOI 10.1007/978-1-4939-0835-6_18, © Springer Science+Business Media New York 2014
Chapter 18
Discrimination of Pseudogene and Parental Gene DNA Methylation Using Allelic Bisulfite Sequencing
Luke B. Hesson and Robyn L. Ward
Abstract
Determining the methylation status of genes with pseudogenes can be technically challenging due to sequence homology. High sequence homology can result in the amplification of both pseudogene and parental gene alleles, potentially leading to data misinterpretation. Allelic bisulfite sequencing allows for detection of the methylation status of individual alleles at nucleotide resolution and represents the most reliable method for discriminating pseudogene and parental gene sequences. Here, we discuss important points that should be considered when investigating pseudogene and parental gene methylation status and we describe the method of allelic bisulfite sequencing, including assay design.
Key words Pseudogene, Epigenetics, Methylation, Bisulfite, Sequencing
1 Introduction
Pseudogenes are ancestral nonfunctional copies of protein coding genes that have lost the potential to give rise to a protein product [1]. Pseudogenes that arise by genomic duplication are called unprocessed pseudogenes, whereas those formed by retrotransposition, through reverse transcription of an mRNA intermediate and reintegration into the genome, are known as processed pseudogenes. Pseudogenes are not restricted by the same selective pressures as functional parental genes and accumulate deleterious sequence changes over time, usually resulting in stop codons that render the “open reading frame” nonfunctional. Pseudogenes are ubiquitous in the human genome, with current estimates indicating that there are over 17,000 pseudogenes [2].
Given the close similarity between pseudogenes and almost all coding genes, it is challenging to develop molecular analyses that are specific for the gene of interest rather than pseudogenes [3]. Most notably, amplification of DNA sequences using PCR can be problematic if the target region is not unique in the genome. Pseudogenes with high sequence homology can therefore be a source of “nonspecific” amplification when investigating gene expression, mutations, or DNA methylation [4, 5].
Both processed and unprocessed pseudogenes often show high sequence homology with the promoter regions of parental genes. Promoter regions are usually the focus of attention when assaying for methylation changes. In a recent study of the methylation status of pseudogene–parental gene pairs, Cortese et al. found that the majority of pseudogenes were methylated in different tissues compared with parental genes [6]. Therefore, when investigating the promoter methylation status of genes with pseudogenes, it is essential that the assays used are able to reliably discriminate pseudogene and parental gene sequences in order to avoid data misinterpretation.
Recently, we have demonstrated the technical challenges associated with analyzing the methylation status of the PTEN CpG island promoter, which shows >95 % sequence homology with the 5′ region of the PTENP1 pseudogene [7]. Using allelic bisulfite sequencing, we were able to unequivocally demonstrate that methylation of the PTEN CpG island is a rare event in cancer cell lines and that apparent methylation in fact originates from homologous regions of the PTENP1 pseudogene [7]. Allelic bisulfite sequencing involves bisulfite PCR, bacterial cloning of PCR amplicons, and fluorescent automated DNA sequencing of individual alleles.
Here, we describe a methodological approach to determine the methylation status of genes with pseudogenes or regions sharing high sequence similarity, using allelic bisulfite sequencing.
2 Materials
2.1 In Silico Characterization
1. Genome browser [8].
2. Sequence alignment tool such as BLAT [8] or BLAST [9].
2.2 Sodium Bisulfite DNA Modification
1. EZ DNA Methylation-Gold™ Kit (Zymo Research).
2.3 Bisulfite PCR and PCR Purification
1. Thermocycler.
2. Platinum® Taq DNA polymerase complete with 10× PCR buffer and 50 mM MgCl2 (Invitrogen).
3. dNTP mixture.
4. Primers for amplification of region of interest from bisulfite modified DNA.
5. PCR purification kit or gel extraction kit.
2.4 Ligation and Transformation of PCR Products
1. pCR®2.1-TOPO® TA Cloning vector kit (Invitrogen).
2. Luria–Bertani (LB) agar plates supplemented with 50 μg/mL carbenicillin, 80 μg/mL 5-bromo-4-chloro-3-indolyl-β-d-galactopyranoside (X-β-gal), and 500 μM isopropyl-β-d-thiogalactopyranoside (IPTG, BDH).
3. Water bath set to 42 °C.
4. Super Optimal broth with Catabolite repression (SOC) media: 2 % (w/v) Bacto-Tryptone, 0.5 % (w/v) yeast extract, 10 mM NaCl, 2.5 mM KCl and 10 mM MgCl2, 20 mM glucose.
5. Ice box.
6. Orbital shaker.
7. Incubator set to 37 °C.
8. Chemically competent DH5α E. coli.
2.5 Colony PCR
1. Standard PCR reagents and equipment as listed in Subheading 2.3, items 1–3.
2. M13 sequencing primers (5′-GTTTTCCCAGTCACGAC-3′ and 5′-CAGGAAACAGCTATGAC-3′).
3. 96-well PCR reaction plates.
2.6 Phosphatase and Exonuclease Treatment
1. Thermocycler.
2. Antarctic phosphatase, 5,000 U/mL complete with 10× Antarctic phosphatase buffer.
3. Exonuclease I, 20,000 U/mL.
2.7 Fluorescent Automated DNA Sequencing
1. Thermocycler.
2. BigDye® Terminator v3.1 Cycle Sequencing Kit complete with 5× reaction buffer (Applied Biosystems).
3. Ethanol (95 % and 70 % (v/v)).
4. 3 M Sodium Acetate (pH 5.2).
5. Refrigerated centrifuge with a plate spinning rotor capable of 2,235 RCF.
6. ABI3730 DNA analyzer (Applied Biosystems).
2.8 Data Interpretation
1. DNA sequence viewing software.
2. “CpGviewer” interactive bisulfite DNA sequencing analysis tool [10] or equivalent bisulfite DNA sequencing analysis software.
3 Methods
3.1 In Silico Characterization
1. Obtain the sequence of the CpG island promoter or of other regions of interest.
2. Using a sequence alignment tool, search for regions of homology in the genome (Fig. 1a).
3. Identify the sequence differences between the pseudogene and the parental gene (Fig. 1b).
4. Design bisulfite PCR primers that amplify regions that contain informative sequence differences between the parental gene and the pseudogene (see Notes 1–4 for hints on bisulfite PCR primer design).
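Step 3 of Subheading 3.1 can be sketched in code. The following is a minimal illustration only, assuming the parental gene and pseudogene sequences have already been aligned (for example with BLAT or BLAST) and exported as equal-length gapped strings. The function name `find_differences` and the toy sequences are hypothetical and are not taken from PTEN or PTENP1.

```python
def find_differences(parent_aln: str, pseudo_aln: str):
    """Return (alignment_column, parent_base, pseudo_base) for every mismatch or gap.

    Both inputs are assumed to be rows of the same pairwise alignment,
    so they must have identical length; '-' marks an alignment gap.
    """
    if len(parent_aln) != len(pseudo_aln):
        raise ValueError("aligned sequences must have the same length")
    diffs = []
    for col, (p, q) in enumerate(zip(parent_aln.upper(), pseudo_aln.upper())):
        if p != q:
            diffs.append((col, p, q))
    return diffs


# Toy aligned sequences (hypothetical, for illustration only).
parent = "ACGTCCGA-TTG"
pseudo = "ACGTTCGAGTTA"

for col, p, q in find_differences(parent, pseudo):
    kind = "indel" if "-" in (p, q) else "substitution"
    print(col, p, q, kind)
```

Not every difference found this way stays informative after bisulfite modification (see Note 1), so the list is best treated as a set of candidates to be checked on the strand chosen for amplification.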
Fig. 1 Regions of homology between the PTEN and PTENP1 pseudogene and identification of discriminating sequence differences. (a) The PTEN CpG island (green bar) is a bidirectional promoter encompassing the 5′UTR of the KLLN and PTEN genes (blue bars). The fragmented black bar indicates the region that shares high sequence homology with the PTENP1 pseudogene and the degree of similarity. Thin vertical red lines indicate single nucleotide differences, whilst gaps indicate larger regions of sequence variation between the PTEN and PTENP1 genes across this region. The bracket indicates the region amplified using bisulfite PCR primers as described in Bennett et al. [11]. (b) Shown is the region of the PTENP1 processed pseudogene (light blue bar) that is amplified by the bisulfite PCR described in (a). Thin vertical red lines and gaps within the black bar indicate the locations of sequence differences (bottom) that can be used to distinguish PTEN alleles from PTENP1 alleles
3.2 Sodium Bisulfite DNA Modification
Extract genomic DNA from the cell line or tissue of interest using phenol-chloroform DNA extraction or a commercially available kit. Extracted genomic DNA is then bisulfite modified, which involves the selective chemical conversion of cytosine to uracil, whereas 5-methylcytosine remains refractory to this conversion.
1. Dilute 1 μg of genomic DNA to 20 μL in nuclease-free water.
2. Prepare the “CT conversion reagent” (provided in the EZ DNA Methylation-Gold™ kit) according to the manufacturer’s instructions and add 130 μL to the DNA.
3. Place the reaction tube in a thermocycler and incubate at 98 °C for 10 min followed by 53 °C for 18 h. This extended incubation time ensures the complete modification of DNA.
4. Recover the modified DNA using the DNA binding columns provided in the EZ DNA Methylation-Gold™ kit according to the manufacturer’s instructions. Elute the modified DNA in 50 μL nuclease-free water to obtain ~20 ng/μL bisulfite modified DNA.
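The ~20 ng/μL figure in step 4 follows from the 1 μg input and the 50 μL elution volume if recovery from the column is taken to be complete. The short sketch below makes that arithmetic explicit and shows what volume would supply a given mass of template. The `recovery` parameter and both function names are assumptions added for illustration; actual recovery is not stated in the protocol.

```python
def eluted_concentration(input_ng: float, elution_ul: float, recovery: float = 1.0) -> float:
    """Concentration (ng/uL) of bisulfite modified DNA after column elution.

    recovery=1.0 assumes no loss on the column, which is what the
    ~20 ng/uL estimate in the protocol implies.
    """
    return input_ng * recovery / elution_ul


def template_volume(target_ng: float, conc_ng_per_ul: float) -> float:
    """Volume (uL) of eluate needed to deliver target_ng of template."""
    return target_ng / conc_ng_per_ul


conc = eluted_concentration(input_ng=1000, elution_ul=50)
print(f"estimated concentration: {conc} ng/uL")
for ng in (40, 100):  # template range used for bisulfite PCR in Subheading 3.3
    print(f"{ng} ng template = {template_volume(ng, conc)} uL of eluate")
```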
3.3 Bisulfite PCR and PCR Purification
1. The optimum conditions for each PCR must be determined empirically. We commonly use the following conditions when optimizing bisulfite PCRs: 0.2 mM dNTPs, 0.4–1 μM each primer, 0.5–1 U Platinum® Taq DNA polymerase, 2 mM MgCl2, and 40–100 ng bisulfite modified DNA. The thermocycling program consists of 5 min at 95 °C, 35–40 cycles of 1 min at 94 °C, 1 min at the calculated annealing temperature of the primers and 2 min at 74 °C, followed by 10 min at 72 °C.
2. Purify PCR amplicons using a PCR purification or gel extraction kit (see Note 5).
3.4 Ligation and Transformation of PCR Products
1. Perform ligation into the pCR®2.1-TOPO® TA cloning vector according to the manufacturer’s instructions. Ligation is performed for 30 min at room temperature.
2. Thaw DH5α E. coli on ice. Aliquot 50 μL into a fresh 1.5 mL tube and add 2 μL of ligation reaction. Gently mix with the pipette tip. Do not vortex or pipette up and down. Incubate on ice for 30 min.
3. Transform the bacteria by heat shock in a 42 °C water bath for 30 s. Incubate on ice for 2 min.
4. Add 450 μL room temperature SOC media.
5. Incubate in an orbital shaker for 90 min at 37 °C, shaking at 250 rpm.
6. Evenly spread 100 μL of each transformation mixture onto the surface of a pre-warmed LB agar plate (see Note 6).
7. Incubate at 37 °C for 20 h.
3.5 Colony PCR
1. Remove the LB agar plate from the incubator and mark the desired number of white colonies for colony PCR.
2. Place 5 μL nuclease-free water into the bottom of the desired number of wells within a 96-well PCR plate, plus one additional well for a negative control (see Note 7).
3. To inoculate the water, gently touch a white colony using a pipette tip and place it in a well within the 96-well plate. Leave the tip in the well and continue picking colonies until the desired number is reached (see Note 8).
4. Gently shake the plate to agitate the tips. This will disperse the bacterial cells into the water. To remove the tips without cross-contaminating wells, invert the plate over a waste disposal bin.
5. Prepare a PCR master mix containing 0.25 mM dNTPs, 2 mM MgCl2, 1 U Platinum® Taq DNA polymerase, and M13 sequencing primers (0.4 μM each primer, see Note 9).
6. Place 20 μL PCR master mix into each well to obtain a final volume of 25 μL. Seal the plate and place it in a thermocycler.
7. Incubate for 5 min at 95 °C followed by 30 cycles of 30 s at 95 °C, 30 s at 50 °C, 30 s at 72 °C, and a final incubation for 10 min at 72 °C (see Note 10).
8. To identify reactions ready for sequencing, load 5 μL of each reaction into a 2 % agarose gel (see Note 11).
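When many colonies are screened, the master mix in step 5 of Subheading 3.5 has to be scaled to the number of wells. The sketch below shows one way to do this using C1V1 = C2V2. Several inputs are assumptions rather than values from the protocol: the concentrations in step 5 are treated as final concentrations in the 25 μL reaction, and the dNTP (10 mM), primer (10 μM) and polymerase (5 U/μL) stock concentrations are hypothetical and should be replaced with those of the reagents actually in use. Only the 10× buffer and the 50 mM MgCl2 stock come from Subheading 2.3.

```python
def per_reaction_volumes(final_ul: float = 25.0, mix_ul: float = 20.0) -> dict:
    """Volumes (uL) of each master-mix component for one colony PCR well.

    Uses V_stock = C_final * V_final / C_stock. Stock concentrations marked
    'assumed' are illustrative and are not specified in the protocol.
    """
    v = {
        "10x PCR buffer": final_ul / 10,
        "dNTPs (10 mM stock, assumed)": 0.25 * final_ul / 10,
        "MgCl2 (50 mM stock)": 2.0 * final_ul / 50,
        "M13 forward primer (10 uM stock, assumed)": 0.4 * final_ul / 10,
        "M13 reverse primer (10 uM stock, assumed)": 0.4 * final_ul / 10,
        "Platinum Taq (5 U/uL stock, assumed)": 1.0 / 5,
    }
    # Water brings the mix to 20 uL; the remaining 5 uL is the colony suspension.
    v["nuclease-free water"] = mix_ul - sum(v.values())
    return v


def master_mix(n_wells: int, excess: float = 0.1) -> dict:
    """Scale the per-reaction volumes to n_wells plus a pipetting excess."""
    scale = n_wells * (1 + excess)
    return {name: round(vol * scale, 2) for name, vol in per_reaction_volumes().items()}


# Example: 24 colonies plus one negative control well.
for name, vol in master_mix(25).items():
    print(f"{name}: {vol} uL")
```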
3.6 Phosphatase and Exonuclease Treatment
1. Add 0.5 μL (2.5 U) Antarctic phosphatase, 0.5 μL (10 U) exonuclease I, 2.5 μL 10× Antarctic phosphatase buffer, and 1 μL nuclease-free water (final volume 25 μL, see Note 12) to each reaction.
2. Incubate in a thermocycler for 30 min at 37 °C followed by 20 min at 80 °C.
3.7 Fluorescent Automated DNA Sequencing
1. Remove 4 μL of each reaction and place it in a fresh PCR plate.
2. Add 0.5 μL BigDye® Terminator v3.1 Cycle Sequencing reagent, 2 μL 5× BigDye sequencing reaction buffer, 1 μL of 3.2 μM primer (see Note 13), and 2.5 μL nuclease-free water (final volume 10 μL). Place the reaction in a thermocycler.
3. Incubate for 25 cycles of 20 s at 94 °C, 20 s at 50 °C and 4 min at 60 °C. Store reactions at 4 °C and protect from light.
4. Add 25 μL 95 % (v/v) ethanol (chilled on ice) and 1 μL 3 M Sodium Acetate (pH 5.2) to each well.
5. Seal the plate and place it in a 4 °C centrifuge.
6. Centrifuge at 2,235 RCF for 20 min.
7. Remove ethanol and add 50 μL 70 % (v/v) ethanol. Repeat steps 5 and 6.
8. Remove the 70 % (v/v) ethanol and air-dry for 10 min.
9. Sequence using an ABI3730 DNA analyzer.
3.8 Data Interpretation
1. Using DNA sequence viewing software, separate pseudogene and parental gene alleles based on sequence differences identified in step 3 in Subheading 3.1.
2. Determine the methylation status of each individual CpG dinucleotide using the “CpGviewer” software [10] (Fig. 2). To use this software, the genomic sequences of the regions analyzed (both the pseudogene and the parental gene) are required in plain text (.txt format), as well as the electropherogram for each allele (.ab1 format).
Fig. 2 Allelic bisulfite sequencing shows DNA methylation is specifically associated with PTENP1 and not the PTEN CpG island. (a) Allelic bisulfite sequencing data showing the methylation status of PTEN and PTENP1-derived alleles in the hematological cancer cell line Raji (taken from Hesson et al. [7]). Each line represents a single allele. Circles indicate the positions of CpG dinucleotides; black circles indicate methylated CpG dinucleotides, white circles indicate unmethylated CpG dinucleotides; yellow diamonds indicate the positions of nucleotide variations within PTENP1 alleles used to discriminate between PTEN and PTENP1 alleles; black diamonds indicate the positions of additional CpG dinucleotides specific to PTENP1 alleles that were also methylated. (b) Representative electropherograms showing an unmethylated PTEN allele and a methylated PTENP1 allele. Indicated by the black arrow is the position of a nucleotide variation used to discriminate the PTEN and PTENP1 alleles
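The logic of the two steps in Subheading 3.8 can be illustrated with a short sketch. This is not a substitute for CpGviewer or for inspection of the electropherograms. It assumes that each clone has already been base-called to text, reads the same strand as the reference, and aligns to it without insertions or deletions. The function names, marker dictionary and toy sequences are hypothetical.

```python
def cpg_positions(ref: str) -> list:
    """Indices of the C in every CpG dinucleotide of the genomic reference."""
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def assign_allele(read: str, markers: dict) -> str:
    """Assign a clone to the parental gene or the pseudogene.

    markers maps position -> (parental_base, pseudogene_base), using the bases
    expected AFTER bisulfite conversion on the sequenced strand.
    """
    parent_hits = sum(read[pos] == pb for pos, (pb, _) in markers.items())
    pseudo_hits = sum(read[pos] == qb for pos, (_, qb) in markers.items())
    if parent_hits > pseudo_hits:
        return "parental"
    if pseudo_hits > parent_hits:
        return "pseudogene"
    return "undetermined"


def call_methylation(ref: str, read: str) -> dict:
    """At each reference CpG, a retained C is methylated and a T is unmethylated."""
    calls = {}
    for i in cpg_positions(ref):
        calls[i] = {"C": "methylated", "T": "unmethylated"}.get(read[i], "ambiguous")
    return calls


# Toy genomic references (hypothetical); they differ by G/A at index 4.
parent_ref = "ACGTGACGTCA"
pseudo_ref = "ACGTAACGTCA"
markers = {4: ("G", "A")}

# Two toy clones: an unmethylated parental allele and a methylated pseudogene allele.
clones = ["ATGTGATGTTA", "ACGTAACGTTA"]
for read in clones:
    allele = assign_allele(read, markers)
    ref = parent_ref if allele == "parental" else pseudo_ref
    print(allele, call_methylation(ref, read))
```

Using a G/A marker in the toy data is deliberate: as Note 1 explains, a C/T difference on the sequenced strand would not survive conversion of an unmethylated cytosine.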
4 Notes
1. Choice of the DNA strand. Following bisulfite modification, the two previously complementary DNA strands become non-complementary single-stranded DNAs that can be amplified separately using strand-specific PCR primers. Once the region of interest has been identified, it is crucial to choose the most appropriate DNA strand, so that the sequence differences between the parental gene and the pseudogene remain informative even after the treatment with sodium bisulfite. For example, C/T mismatches between the parental gene and the pseudogene (with the C unmethylated) are lost following bisulfite conversion, but the corresponding G/A mismatches present on the other strand do persist.
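The C/T example in Note 1 can be checked in silico before primers are ordered. The sketch below is a simplified model in which every unmethylated cytosine converts and is read as T after PCR. The helper names and the 7-nt toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq: str, methylated=()) -> str:
    """Convert every C to T except those at the given methylated indices.

    Indices refer to the strand passed in, read 5'->3'.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


# Toy top-strand sequences with a single C/T difference at index 3.
parent_top = "TGACAGT"
pseudo_top = "TGATAGT"

# Top strand: the unmethylated C becomes T, so the two alleles look identical.
print(bisulfite_convert(parent_top), bisulfite_convert(pseudo_top))

# Bottom strand: the same site is a G/A difference, which bisulfite leaves intact.
parent_bottom = reverse_complement(parent_top)
pseudo_bottom = reverse_complement(pseudo_top)
print(bisulfite_convert(parent_bottom), bisulfite_convert(pseudo_bottom))
```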
2. Basic principles of bisulfite PCR primer design. It is essential that bisulfite PCR primers are specific for bisulfite modified DNA. To achieve this goal, we design them so that they include thymines (originally cytosines) at critical positions, such as a thymine (originally cytosine) at the most 3′ base or a short stretch of thymines (originally cytosines) in the central part of the primer.
Bisulfite modification reduces the complexity of the DNA sequence, making it more difficult to design primers with a low rate of off-target binding. It is therefore important to incorporate a mix of the three remaining bases A, T, and G, whenever possible. In this respect, increasing primer length (between 25 and 40 nt) increases primer specificity and also compensates for reduced primer annealing temperature due to the loss of cytosine bases.
For allelic bisulfite sequencing, amplification of modified DNA with no bias towards original methylation status is also crucial. This is achieved by avoiding CpG dinucleotides within primer binding sites.
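The design rules in Note 2 lend themselves to a quick automated check of candidate binding sites. The sketch below handles only the simplest case, a forward primer matching the bisulfite-converted top strand, and assumes the site contains no methylated cytosines. The function name, the returned field names and the example site are hypothetical.

```python
def check_forward_primer_site(genomic_site: str) -> dict:
    """Screen a genomic binding site (5'->3', unconverted) against Note 2.

    The forward primer is the site with every C written as T. The check on
    the 3' base does not look beyond the site, so confirm separately that
    a 3' cytosine is not followed by G in the genome (which would make it
    part of a CpG).
    """
    site = genomic_site.upper()
    return {
        "primer": site.replace("C", "T"),
        "length_ok": 25 <= len(site) <= 40,         # suggested primer length
        "cpg_free": "CG" not in site,               # avoids methylation bias
        "converted_c_count": site.count("C"),       # thymines derived from cytosines
        "three_prime_is_converted_c": site.endswith("C"),
    }


# Hypothetical 27-nt binding site with five non-CpG cytosines.
report = check_forward_primer_site("GGTATTAGCAGTTCAATGGACCTTTAC")
for key, value in report.items():
    print(f"{key}: {value}")
```

A reverse primer would need the same checks applied to the reverse complement of the converted strand, which this sketch does not attempt.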
3. Amplicon size and nested PCRs. Generally bisulfite PCR works well with small amplicon sizes (up to ~500 bp). Nested PCRs can improve the amplification of particularly large amplicons or regions for which it is difficult to obtain specific amplicons. Nested bisulfite PCRs are split into two reactions, the first of which involves a limited number of cycles (~20). A small amount of this reaction is then transferred to a second PCR reaction, which includes nested primers.
4. When analyzing the methylation status of any gene with a pseudogene that shares high sequence homology, we advise against the use of other techniques such as bisulfite pyrosequencing, methylation-specific PCR (MSP) or combined bisulfite restriction analysis (COBRA). These techniques may not be informative for the proportion of the amplicon that is pseudogene-derived and/or may not allow for discrimination of whether methylation originates from the pseudogene or the parental gene.
5. Ligation efficiency is improved by PCR purification. This can be done using PCR column purification or gel extraction. Gel extraction is desirable if the reaction contains significant primer dimer or nonspecific PCR products.
6. If low numbers of transformants are expected, then the entire transformation mixture can be plated following collection of cells by centrifugation and resuspension in 100 μL SOC media.
7. Each white colony should contain a single allele of the region of interest. The number of colonies to be screened depends largely on the application. Sequencing of a greater number of alleles will give a more accurate representation of the methylation status of a region across a population of cells, as well as of the proportion of methylated pseudogene and parental gene-derived sequences.
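One way to make the point in Note 7 concrete is to look at how the uncertainty on an observed proportion narrows as more clones are sequenced. The sketch below uses the Wilson score interval, which is our choice for illustration and is not prescribed by the protocol. It treats each clone as an independent draw, which ignores any cloning or PCR bias.

```python
import math


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple:
    """Approximate 95 % Wilson score interval for a proportion k/n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Same observed proportion (30 % methylated alleles) at two screening depths.
for k, n in ((3, 10), (30, 100)):
    low, high = wilson_interval(k, n)
    print(f"{k}/{n} methylated clones: {low:.2f} to {high:.2f}")
```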
8. When picking the colonies, avoid scraping the entire colony as this will overload the PCR reaction with too much DNA. If a plasmid miniprep containing a cloned allele is required, the remainder of the colony can be used to inoculate LB broth containing 50 μg/mL carbenicillin.
9. The use of M13 primers standardizes colony PCR conditions and also ensures that different primers can be used for colony PCR and sequencing reactions, which reduces background in sequencing reactions.
10. The initial incubation for 5 min at 95 °C is essential to activate Platinum®Taq DNA polymerase and also releases plasmid DNA from bacteria.
11. Colony PCR product size is defined by the size of the insert ligated into the pCR®2.1-TOPO® TA cloning vector plus ~200 bp of flanking vector sequence.
12. Prior to sequencing, unincorporated dNTPs and primers must be removed from the PCR reaction. This can be done enzymatically or through purification columns. We recommend enzymatic treatment using Antarctic phosphatase and exonuclease I, which dephosphorylate dNTPs and remove single-stranded DNA primers, respectively. These enzymes are then heat inactivated, thereby preventing interference with subsequent sequencing. The use of these enzymes allows for a more convenient high-throughput sequencing of individual alleles in a 96-well format.
13. Add only one primer from the set used to obtain the original PCR amplicon.
References
1. Zheng D, Frankish A, Baertsch R et al (2007) Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution. Genome Res 17:839–851
2. Karro JE, Yan Y, Zheng D et al (2007) Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res 35:D55–D60
3. Kalyana-Sundaram S, Kumar-Sinha C, Shankar S et al (2012) Expressed pseudogenes in the transcriptional landscape of human cancers. Cell 149:1622–1634
4. Whang YE, Wu X, Sawyers CL (1998) Identification of a pseudogene that can masquerade as a mutant allele of the PTEN/MMAC1 tumor suppressor gene. J Natl Cancer Inst 90:859–861
5. Zysman MA, Chapman WB, Bapat B (2002) Considerations when analyzing the methylation status of PTEN tumor suppressor gene. Am J Pathol 160:795–800
6. Cortese R, Krispin M, Weiss G et al (2008) DNA methylation profiling of pseudogene-parental gene pairs and two gene families. Genomics 91:492–502
7. Hesson LB, Packham D, Pontzer E et al (2012) A reinvestigation of somatic hypermethylation at the PTEN CpG island in cancer cell lines. Biol Proced Online 14:5
8. Meyer LR, Zweig AS, Hinrichs AS et al (2012) The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res 41(Database issue):D64–D69
9. Altschul SF, Gish W, Miller W et al (1990) Basic local alignment search tool. J Mol Biol 215:403–410
10. Carr IM, Valleley EM, Cordery SF et al (2007) Sequence analysis and editing for bisulphite genomic sequencing projects. Nucleic Acids Res 35:e79
11. Bennett KL, Mester J, Eng C (2010) Germline epigenetic regulation of KILLIN in Cowden and Cowden-like syndrome. JAMA 304:2724–2731