DNA sequence organization in the alga Euglena gracilis


Biochimica et Biophysica Acta, 563 (1979) 1-16 © Elsevier/North-Holland Biomedical Press

BBA 99452

DNA SEQUENCE ORGANIZATION IN THE ALGA EUGLENA GRACILIS

JAMES R.Y. RAWSON *, VIRGINIA K. ECKENRODE, CINDY L. BOERMA and STEPHANIE CURTIS

Departments of Botany and Biochemistry, University of Georgia, Athens, GA 30602 (U.S.A.)

(Received September 19th, 1978)

Key words: DNA sequence organization; Repetitive DNA sequence; Genome complexity; (Euglena gracilis)

Summary

The sequence organization of nuclear DNA in the single-celled alga Euglena gracilis has been studied by a combination of techniques: (1) comparison of the reassociation kinetics of DNA fragments 300, 2000 and 8100 nucleotides long; (2) reassociation of 32P-labeled DNA fragments of various lengths with driver fragments 300 nucleotides long; (3) the hyperchromicity of DNA structures formed by the reassociation of repetitive sequences; and (4) direct measurement of the size of the duplex regions of reassociated repetitive DNA resistant to S1 nuclease.

The single copy DNA sequences are approximately 1500 nucleotide pairs long and are interspersed with repetitive DNA sequences. The repetitive DNA, which includes both highly repetitive and middle repetitive sequences, consists of one fraction of nucleotide sequences (0.67) with an average size of 4900 nucleotide pairs and a second fraction (0.33) with an average size of 1000 nucleotide pairs. 34% of the DNA consists of foldback sequences which are present on 45% of the DNA 4000 nucleotides long.

Introduction

Repetitive and single copy DNA sequences are interspersed in the genomes of a variety of eukaryotes [1-4]. The interspersion of these DNA sequences is characterized by single copy sequences approximately 1000-2000 nucleotide pairs long immediately adjacent to and bracketed by short (200-400 nucleotide pairs) repetitive sequences. This pattern of organization of DNA was first demonstrated in Xenopus DNA [1] and has come to be referred to as short-term interspersion or the 'Xenopus pattern' of DNA organization. A different pattern of DNA sequence organization has been demonstrated in several other eukaryotes (Drosophila melanogaster, the honeybee and the water mold Achlya) [5-7] and may be depicted as large repetitive sequences (1000-10 000 nucleotide pairs) linked to single copy sequences 10 000 nucleotide pairs or longer. Such sequence organization has been referred to as long-term interspersion or the 'Drosophila pattern' of DNA organization [6].

* To whom reprint requests should be directed. Abbreviation: Pipes, piperazine-N,N'-bis(2-ethanesulfonic acid).

Studies of DNA sequence organization in eukaryotes have usually been carried out using organisms which are evolutionarily well-defined, although distant with respect to one another. Euglena gracilis is a unicellular alga, classified as a protist [8], and is evolutionarily quite isolated from other lower eukaryotes. The chromosomes in Euglena are also unusual in that they are continually condensed [9].

Initial studies in characterizing the genome of Euglena by reassociation kinetics suggested that there were two kinetic components, and that 36% of the genome consisted of single copy DNA [10]. We have further characterized the kinetics of reassociation of nuclear DNA in Euglena and have found that the repeated DNA sequences are, on average, somewhat larger than those in Xenopus but are interspersed with single copy DNA sequences 1500 nucleotide pairs long.

Materials and Methods

Cell growth and DNA isolation. E. gracilis var. Z was grown in a heterotrophic medium [10]. Total cell DNA with double-stranded and single-stranded molecular weights of 9 × 10⁶ (13 500 nucleotide pairs) and 8 × 10⁵ (2400 nucleotides), respectively, was prepared as previously described [10].

DNA with a single-stranded length of 9000 nucleotides was prepared by a gentler procedure. Euglena cells (1 g wet weight/4 ml) were suspended in 0.15 M NaCl, 1 mM EDTA and 10 mM Tris-HCl, pH 7.5, plus 2.5% (w/v) sodium dodecyl sulfate. Pronase (predigested for 2 h at 37°C in 0.15 M NaCl) was added to the cell slurry to a final concentration of 200 µg/ml and the mixture incubated at 37°C for 2 h. The cell lysate was deproteinized with an equal volume of a phenol mixture containing cresol (10%, v/v) and 8-hydroxyquinoline (0.1%, w/v) and saturated with 0.15 M NaCl, 1 mM EDTA and 10 mM Tris-HCl (pH 7.5). The emulsion was shaken at room temperature for 30 min and centrifuged at 6000 rev./min for 10 min. The aqueous phase was further deproteinized with two volumes of chloroform/isoamyl alcohol (24 : 1, v/v) and the nucleic acids precipitated from the aqueous phase with 0.1 vol. 3 M sodium acetate and 2 vols. ethanol. The precipitate was suspended in 0.15 M NaCl, 1 mM EDTA and 10 mM Tris-HCl (pH 7.5) (2 ml/g cells) and digested with 25 µg/ml pancreatic RNAase plus 20 units/ml of T1 RNAase for 90 min at 37°C. The mixture was extracted first with phenol then with chloroform, and the DNA was recovered from the aqueous phase by successive precipitations with ethanol and isopropanol. The DNA was further purified in CsCl equilibrium density gradients [10].

Labeling of DNA with 32P. Euglena cells were grown in a heterotrophic medium containing one-tenth the normal phosphate [11]. Cells were adapted to this low phosphate medium for ten generations and then inoculated (1 × 10⁴ cells/ml) into 500 ml of the same medium containing 40 µCi/ml [32P]orthophosphate (New England Nuclear) and grown to late log phase. The specific activity of the 32P-labeled DNA was approximately 200 000 cpm/µg.

Molecular weight determinations. DNA was sheared to various single-stranded molecular weights either by sonication [10] or by forcing it through small needles (23 or 27 gauge) with a syringe. Molecular weights of DNA preparations were determined by band sedimentation in the Spinco Model E [12]. Sedimentation coefficients of double-stranded and single-stranded DNA were measured in 1.0 M NaCl and 0.9 M NaCl/0.1 N NaOH, respectively. The observed sedimentation coefficients were corrected for viscosity and density of the solvents [12]. The s°20,w of double-stranded or single-stranded DNA was converted to molecular weight using the formula of Freifelder [13] or Studier [12], respectively.

Ranges of molecular weights for preparations of double-stranded DNA were determined by electrophoresis on agarose slab gels [14]. Two sets of molecular weight standards were included on each gel: λ-DNA digested with EcoRI [15] and λ-DNA digested with HindIII [16].

Renaturation of DNA. The renaturation of DNA was followed by separating single-stranded from double-stranded DNA in a reaction mix on hydroxyapatite columns [10,17,18]. Hydroxyapatite columns were prepared and used as described previously [10]. Samples in 0.12 M sodium phosphate (pH 6.8) were reassociated at 60°C, while those in 0.48 M sodium phosphate (pH 6.8) were reassociated at 73°C and corrected to the equivalent Cot (M nucleotide·s) according to Britten et al. [18]. Double-stranded structures prepared from low molecular weight DNA (less than 500 nucleotides) were eluted from hydroxyapatite columns with 0.48 M sodium phosphate (pH 6.7). Duplex structures prepared from larger single-stranded DNA were eluted from hydroxyapatite columns with 0.12 M sodium phosphate (pH 6.8) at 98°C. The total recovery of DNA from hydroxyapatite columns was always greater than 95%. The data were fit to a curve using a non-linear least squares regression and assuming second-order kinetics [19].
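The second-order model assumed in the curve fitting can be sketched as follows. This is a minimal illustration of the ideal multi-component form only, not the fitting program used in the paper; the function names and the example parameters are hypothetical.

```python
def fraction_single_stranded(cot, components, unreactive=0.0):
    """Fraction of DNA still single-stranded at a given Cot (M*s).

    components: list of (fraction_of_dna, observed_k) pairs,
                with observed_k in M^-1 s^-1.
    unreactive: fraction of the DNA that never reassociates.
    """
    remaining = unreactive
    for fraction, k_obs in components:
        # Ideal second-order reassociation of one kinetic component.
        remaining += fraction / (1.0 + k_obs * cot)
    return remaining


def cot_half(k_obs):
    """Cot at which a single component is half reassociated."""
    return 1.0 / k_obs


# Illustrative parameters only (NOT the fitted values from this paper).
example = [(0.6, 1.0), (0.4, 0.001)]
for cot in (0.01, 1.0, 100.0, 10000.0):
    print(cot, round(fraction_single_stranded(cot, example), 3))
```

A least squares fit would adjust the fractions and rate constants until this curve matches the measured hydroxyapatite binding; that step is omitted here.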

Preparation of radioactive single copy DNA. Single copy DNA was prepared by allowing DNA 500 nucleotides long to reassociate to Cot 100. The single-stranded single copy DNA was separated from the reassociated DNA by hydroxyapatite chromatography, concentrated by alcohol precipitation and reassociated a second time to Cot 100. Hyperpolymers of single copy DNA were prepared by renaturing the DNA to Cot 20 000 [20]. DNA polymerase I was used to incorporate [3H]TTP into the gaps of the single copy hyperpolymers [20]. The DNA polymerase I reaction was carried out at 15°C for 48 h in 0.2 ml containing 20 µg hyperpolymer single copy DNA, 37.5 µM of dATP, dCTP and dGTP, 7.5 mM MgCl2, 60 mM sodium phosphate (pH 6.8), 1 mM EDTA, 16 µM [3H]TTP (47 Ci/mmol) and nine units of DNA polymerase I (Boehringer Mannheim). The radioactive DNA was purified from the reactants on a Sephadex G-25 column. A small fraction (0.15) of the deoxyribonucleotides is incorporated into DNA fragments containing self-complementary or foldback regions. These were removed by incubating the DNA to very low Cot values and passing it over a hydroxyapatite column [20]. The 3H-labeled single copy DNA prepared in this fashion had a single-stranded length of 300 nucleotides and a specific activity of 3 × 10⁶ cpm/µg.

Melting curves. The thermal stability of native and reassociated samples of DNA was measured either by thermal elution of radioactive DNA from hydroxyapatite or by following the hyperchromicity of the DNA in a Beckman VIM double-beam spectrophotometer. The absorbance was corrected for thermal expansion of water [21]. The hyperchromicity of the DNA, H, was equal to [A260 (98°C) - A260 (60°C)]/A260 (98°C).
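As a small worked illustration of the hyperchromicity definition, the calculation can be written as below. The function name and the absorbance readings are hypothetical, and the readings are assumed to have been corrected for thermal expansion already.

```python
def hyperchromicity(a260_98, a260_60):
    """H = [A260(98 C) - A260(60 C)] / A260(98 C)."""
    if a260_98 <= 0:
        raise ValueError("A260 at 98 C must be positive")
    return (a260_98 - a260_60) / a260_98


# Hypothetical readings, for illustration only.
print(round(hyperchromicity(1.00, 0.80), 3))
```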

S1 nuclease digestion of single-stranded DNA. S1 nuclease was isolated from crude α-amylase powder from Aspergillus oryzae [22]. The sulfo-Sephadex chromatographic step was eliminated without altering the ratio of digestion of single-stranded to double-stranded DNA. S1 nuclease digestion of single-stranded DNA was carried out in 0.15 M NaCl, 10 mM piperazine-N,N'-bis(2-ethanesulfonic acid) (Pipes), 25 mM sodium acetate (pH 4.6), 0.1 mM ZnSO4 and 5 mM 2-mercaptoethanol. A ten-fold excess of enzyme was added to assure complete digestion of all single-stranded DNA. The reaction was carried out at 37°C for 45 min, and the nuclease reaction was stopped by adding cold 0.12 M sodium phosphate (pH 6.8). Using these conditions, more than 95% of single-stranded 3H-labeled pBR313 DNA was digested.

S1 nuclease-resistant duplex structures were separated from digestion products by hydroxyapatite chromatography and dialyzed against 0.12 M sodium phosphate (pH 6.8). A portion of the DNA was melted and the remainder of the sample was sized on an Agarose A-50 column.

Agarose chromatography. The size distribution of reassociated DNA duplexes was determined by chromatography on Agarose A-50 [3]. Agarose A-50 was poured into a column (92 cm × 1.5 cm) containing 6-mm glass beads [23]. The DNA was chromatographed in 0.12 M sodium phosphate (pH 6.8) and the DNA content of the column fractions determined by absorbance at 260 nm. The column was calibrated using native DNA and KI as exclusion and inclusion markers, respectively.

Scintillation counting. Radioactive DNA samples were precipitated with 5% (w/v) trichloroacetic acid, collected on Millipore filters and counted in a Packard Liquid Scintillation counter [24].

Results

Renaturation kinetics of total cellular DNA

A preliminary study of the renaturation kinetics of E. gracilis DNA revealed two kinetic classes of DNA [10]. Fig. 1 shows the renaturation kinetics of single-stranded DNA 300 nucleotides long spanning a range of Cot values from 7 × 10⁻⁵ to 4 × 10⁴. Cot values of less than 1 × 10⁻² were achieved using low concentrations of 32P-labeled DNA. These data were analyzed in three different ways. First, the presence of two second-order kinetic components was assumed and the total nuclear DNA content of Euglena taken to be 3 pg [10]. The observed second-order rate constant for the slowest renaturing component was fixed at four different ploidy levels (diploid, tetraploid, hexaploid and octaploid). The non-linear least squares fit of the data analyzed in this fashion with the lowest root mean square (0.0268) had an observed rate constant of 0.00269 M⁻¹·s⁻¹ for the single copy DNA. Next, the same type of analysis was performed on the data assuming the presence of three kinetic components. The best fit (root mean square 0.0267) had an observed second-order rate constant for the single copy DNA of 0.00081 M⁻¹·s⁻¹.

Fig. 1. Reassociation kinetics of Euglena DNA fragments 300 nucleotides long. DNA with a single-stranded fragment length of 300 nucleotides was reassociated in 0.12 M sodium phosphate (pH 6.8) at 60°C or in 0.48 M sodium phosphate (pH 6.8) at 73°C. Data from the reassociation of DNA in 0.48 M sodium phosphate (pH 6.8) were corrected to equivalent Cot (ECot) as described by Britten et al. [18]. The single-stranded and double-stranded products of the reaction were fractionated on hydroxyapatite. ○, reassociation of total cell DNA; ●, reassociation of 3H-labeled single copy DNA in the presence of a driver DNA. The curve depicting the reassociation kinetics of the total cell DNA was determined by a least squares fit for three components holding the rate constant for the single copy DNA equal to that expected for an organism with a genome size of 1.5 pg (root mean square 0.0267). The curve drawn through the points showing the reassociation of single copy DNA was determined by a least squares fit for a single component and allowing all the parameters to free float (root mean square 0.032). Abscissa: ECot (M·s).

Finally, the observed second-order rate constant for the single copy DNA was independently measured by monitoring the renaturation of 3H-labeled single copy DNA in the presence of an excess of non-radioactive total cell DNA (Fig. 1). The best curve fit through these data was determined by allowing all the parameters in the computer program to free float and assuming the presence of a single kinetic component. The root mean square for this fit was 0.0319 and the observed second-order rate constant for the single copy DNA was 0.00111 M⁻¹·s⁻¹. This value is most compatible with the observed second-order rate constant for the single copy DNA determined by assuming the presence of three kinetic components and that the genome complexity of Euglena was 1.5 pg (1.36 × 10⁹ nucleotide pairs) or that of a diploid organism.

Table I summarizes the best non-linear least squares fit of the data in Fig. 1 assuming three kinetic components. 14% of the DNA reassociates at a Cot value of less than 7 × 10⁻⁵, suggesting that this component consists of foldback DNA sequences. 34% of the DNA consists of highly repetitive DNA sequences with an observed second-order rate constant of 0.908 M⁻¹·s⁻¹. A middle repetitive fraction (31% of the DNA) has an observed second-order rate constant of 0.00439 M⁻¹·s⁻¹. The single copy DNA consists of 12% of the total DNA.
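The derived columns of Table I can be recomputed from the fitted fractions and observed rate constants. The sketch below uses the E. coli standard and genome size quoted in the Table I legend. The reiteration frequency formula (fraction × genome size / kinetic complexity) is an assumption: it is not stated in the text, but it reproduces the tabulated values to within rounding. Function names are hypothetical.

```python
E_COLI_COMPLEXITY = 4.24e6  # nucleotide pairs (Table I legend)
E_COLI_K = 0.259            # M^-1 s^-1 for E. coli DNA (Table I legend)
GENOME_SIZE = 1.36e9        # nucleotide pairs, i.e. 1.5 pg (Table I legend)


def pure_k(observed_k, fraction):
    """Rate constant the component would have if isolated from other DNA."""
    return observed_k / fraction


def kinetic_complexity(k_pure):
    """Complexity in nucleotide pairs, scaled from the E. coli standard."""
    return E_COLI_COMPLEXITY * E_COLI_K / k_pure


def reiteration_frequency(fraction, complexity):
    """Assumed definition: copies per genome of the component's sequences."""
    return fraction * GENOME_SIZE / complexity


# (fraction of DNA, observed k) as reported in Table I.
components = {
    "highly repetitive": (0.339, 0.908),
    "middle repetitive": (0.310, 0.00439),
    "single copy": (0.124, 0.00081),
}

for name, (fraction, k_obs) in components.items():
    kp = pure_k(k_obs, fraction)
    cx = kinetic_complexity(kp)
    print(name, round(kp, 5), f"{cx:.3g}", round(reiteration_frequency(fraction, cx), 1))
```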

Interspersion of kinetic components

The presence of interspersion of repetitive and single copy sequences can be determined by comparing the hydroxyapatite binding of reassociated DNA of various lengths [1,4-7]. If there is interspersion of different kinetic components of DNA, the observed renaturation rate of the DNA will be equal to that of the least complex class of nucleotide sequences contained in a given DNA fragment. The observed rate constant of this least complex class of DNA will vary only as a function of the square root of the ratio of the molecular weights of the two DNA samples being compared: k1/k2 = (L1/L2)^0.5, where k1 and k2 are the rate constants for reassociation of short (L1) and long (L2) fragments, respectively [25].

TABLE I

KINETIC COMPONENTS OF Euglena DNA

DNA 300 nucleotides long was reassociated. The data were analyzed by a non-linear least squares regression assuming second-order kinetics and fixing the observed rate constant for the single copy component at that of a cell with a genome complexity of 1.5 pg (1.36 × 10⁹ nucleotide pairs). The root mean square for this fit is 0.0267. The fraction of DNA unreassociated is 0.091. Pure k = observed k ÷ fraction of DNA. The kinetic complexity is calculated relative to the genome complexity of E. coli (4.24 × 10⁶ nucleotide pairs) and the second-order rate constant for reassociation of E. coli DNA in our laboratory (0.259 M⁻¹·s⁻¹).

Component           Fraction   Observed k    Pure k       Kinetic complexity   Reiteration
                    of DNA     (M⁻¹·s⁻¹)     (M⁻¹·s⁻¹)    (nucleotide pairs)   frequency
Foldback            0.136      -             -            -                    -
Highly repetitive   0.339      0.908         2.674        4.11 × 10⁵           1120
Middle repetitive   0.310      0.00439       0.0142       7.78 × 10⁷           5
Single copy         0.124      0.00081       0.00653      1.68 × 10⁸           1

Fig. 2 shows the reassociation kinetics of Euglena DNA fragments 2000 and 8100 nucleotides long. The curves drawn through the data were calculated by assuming second-order kinetics and allowing all the parameters in the computer program to free float. Table II summarizes these data. Both sets of data were best fit by assuming the presence of two kinetic components.

Fig. 2. Reassociation kinetics of Euglena DNA fragments 2000 and 8100 nucleotides long. DNA with single-stranded fragment lengths of 2000 and 8100 nucleotides was reassociated in 0.12 M sodium phosphate (pH 6.8). The reaction products were fractionated on hydroxyapatite columns. Single-stranded DNA was eluted in 0.12 M sodium phosphate (pH 6.8) at 60°C and material containing duplex structures was thermally eluted at 98°C from hydroxyapatite columns. The lines drawn through the points represent a least squares fit for two components allowing all parameters to free float. The reassociation of DNA fragments 2000 and 8100 nucleotides long is represented by (○) and (●), respectively. Abscissa: ECot (M·s).

Nearly all the long DNA fragments (both 2000 and 8100 nucleotides in length) reassociate at Cot values of less than 100. If single copy DNA were not interspersed with repetitive DNA, we would expect the slowest reassociating component to be single copy DNA with an observed rate constant similar to that predicted solely from the acceleration due to size. The predicted rate constants for single copy DNA 2000 and 8100 nucleotides long are 0.0021 and 0.0042 M⁻¹·s⁻¹, respectively. The actual observed second-order rate constants for the slowest fraction of DNA 2000 and 8100 nucleotides long are 0.118 and 0.569 M⁻¹·s⁻¹, respectively, similar to those predicted for the middle repetitive component on the basis of molecular weight (see Table II).

The kinetic component reassociating earliest in these experiments is assumed to be the highly repetitive class of nucleotide sequences. Using DNA 2000 nucleotides long, the predicted rate constant for this fraction is 2.34 M⁻¹·s⁻¹, similar to the observed rate constant for this fraction (3.01 M⁻¹·s⁻¹). The predicted rate constant for DNA 8100 nucleotides long for the highly repetitive component is 4.71 M⁻¹·s⁻¹ while the observed rate constant is 46.75 M⁻¹·s⁻¹, ten times greater than expected. This observation can best be explained by assuming that the reassociation of the highly repetitive DNA sequences is accelerated by linkage of these sequences on DNA fragments 8100 nucleotides long to an even more rapidly renaturing component (possibly foldback DNA).
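The size-corrected predictions quoted above follow from the square-root length dependence, k1/k2 = (L1/L2)^0.5. The sketch below recomputes them from the rate constants measured with 300-nucleotide fragments; the function name is hypothetical and only the single copy and highly repetitive values quoted in the text are used.

```python
import math


def predicted_k(k_known, length_known, length_new):
    """Scale a rate constant to a new fragment length (square-root rule)."""
    return k_known * math.sqrt(length_new / length_known)


K_SINGLE_COPY_300 = 0.00081  # M^-1 s^-1, 300-nucleotide fragments
K_HIGHLY_REP_300 = 0.908     # M^-1 s^-1, 300-nucleotide fragments

for length in (2000, 8100):
    print(
        length,
        round(predicted_k(K_SINGLE_COPY_300, 300, length), 4),
        round(predicted_k(K_HIGHLY_REP_300, 300, length), 2),
    )

# Ratio of observed to predicted k for the highly repetitive component
# on 8100-nucleotide fragments (observed value 46.75 from the text).
print(round(46.75 / predicted_k(K_HIGHLY_REP_300, 300, 8100), 1))
```

The last ratio comes out close to ten, which is the discrepancy the text attributes to linkage with a faster renaturing component.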

Interspersion of foldback sequences

Euglena DNA has an unusually large fraction of foldback or zero-time binding DNA. 14% of DNA 300 nucleotides long binds to hydroxyapatite at Cot values of 1.5 × 10⁻⁶. If 32P-labeled DNA of increasing size is incubated to Cot 1 × 10⁻⁶, the fraction of DNA bound to hydroxyapatite increases and plateaus at approximately 45% using DNA 4000 nucleotides long (Fig. 3). A similar fraction of DNA 8700 nucleotides in length also binds to hydroxyapatite, but the molecular integrity of the DNA after these experiments is uncertain. These data were analyzed by a non-linear least squares regression first assuming a regular and then a random interspersion of foldback DNA sequences [26]. The root mean squares of the two analyses were similar and too high to distinguish with any confidence between the two models.

TABLE II

KINETIC COMPONENTS OF Euglena DNA

DNA samples 2000 and 8100 nucleotides long were reassociated. The data were analyzed by a non-linear least squares regression assuming second-order kinetics and allowing all parameters to free float until the lowest root mean square was found. The predicted k was calculated from the expression (L1/L2)^0.5 · k2, where the subscripts indicate the length of the fragments being compared and k2 is the known rate constant for DNA L2 long.

Fragment   Component           Fraction      Observed k    Predicted k
length                         of DNA        (M⁻¹·s⁻¹)     (M⁻¹·s⁻¹)
2000 *     Foldback            0.372         -             -
           Highly repetitive   0.215         3.01          2.34
           Middle repetitive   0.309         0.118         0.113
8100 **    Foldback            0.561         -             -
           Highly repetitive   0.139         46.75         4.71
           Middle repetitive   [illegible]   0.569         0.228

* Fraction unreassociated = 0.104 and root mean square is 0.2014.
** Fraction unreassociated = 0.016 and root mean square is 0.0222.

Fig. 3. Fraction of foldback DNA binding to hydroxyapatite as a function of fragment length. 32P-labeled DNA of various fragment lengths was adjusted to 15 pg/ml in 0.12 M sodium phosphate (pH 6.8) and boiled for 3 min. The DNA mix was cooled for 5 s in an ice-bath and loaded onto a hydroxyapatite column. The DNA had equilibrated to 60°C within this period of time. The single-stranded material was eluted from the hydroxyapatite column within 30 s after loading. The Cot values for all points were less than 1.5 × 10⁻⁶ M·s. The line drawn through the observed points is a non-linear least squares fit using the equation for random interspersion of foldback DNA [25]. Abscissa: fragment length (nucleotides).

The structures responsible for binding DNA to hydroxyapatite at these low Cot values are double-stranded structures, as evidenced by the following experiment. 32P-labeled DNA with a single-stranded length of 1900 nucleotides was reassociated to Cot 1 × 10⁻⁵. Unreassociated single-stranded regions were digested with S1 nuclease and the mixture adsorbed to a hydroxyapatite column. The hydroxyapatite column was washed extensively with 0.12 M sodium phosphate (pH 6.8) to elute S1 nuclease-digested material. The 32P-labeled DNA was thermally eluted, and the fraction of the total DNA bound was calculated as a function of the temperature. These DNA structures melted in a cooperative fashion (Fig. 4), and the Tm was 91°C, 3°C higher than that of native DNA. The thermal elution profile shows a pronounced fraction (approx. 25%) which melts considerably earlier than the rest of the DNA. This fraction of DNA may be due to either A + T rich sequences of DNA or the presence of relatively short duplexes (less than 50 nucleotide pairs) which are not effectively retained on hydroxyapatite.

Fig. 4. Melting profile of foldback DNA. 32P-labeled DNA fragments 1900 nucleotides long were renatured to Cot 1.3 × 10⁻⁵ M·s. The unreassociated single-stranded regions were removed by treatment with S1 nuclease. After treatment with S1 nuclease, the mix was loaded onto a hydroxyapatite column, and the nucleotide digests eluted in 0.12 M sodium phosphate (pH 6.8) at 60°C. The double-stranded structures were thermally eluted at 2°C intervals using 0.12 M sodium phosphate (pH 6.8). The fractions were precipitated with trichloroacetic acid and counted in a scintillation counter. The 32P-labeled DNA eluted up to and including a given temperature was calculated as a fraction of the total amount of radioactivity recovered. At 100°C a 0.48 M sodium phosphate (pH 6.8) wash removed an additional 1.9% of the total radioactivity. Abscissa: temperature (°C).

The increase and then the plateauing of the binding of DNA to hydroxyapatite at these low Cot values demonstrates that in 45% of the DNA 4000 nucleotides long there are foldback nucleotide sequences interspersed with other classes of DNA. The Tm of these duplex structures is 3°C greater than that of total cell DNA and can be used to calculate the mol% G + C content of this DNA fraction to be 53%.
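The text does not say which relation was used to convert Tm to base composition. As a hedged illustration, the sketch below assumes the Marmur-Doty relation, Tm = 69.3 + 0.41 × (mol% G + C), which was derived for standard saline citrate rather than for 0.12 M sodium phosphate. Under that assumption a Tm of 91°C gives a value that rounds to the 53% quoted above. The function name is hypothetical.

```python
def gc_percent_from_tm(tm_celsius, intercept=69.3, slope=0.41):
    """mol% G+C from Tm, ASSUMING the Marmur-Doty relation.

    The intercept and slope are the standard saline citrate values;
    applying them to 0.12 M sodium phosphate is an approximation.
    """
    return (tm_celsius - intercept) / slope


# Tm of the foldback duplexes reported in the text.
print(round(gc_percent_from_tm(91.0), 1))
```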

Interspersion of repetitive and single copy DNA sequences

Radioactive DNA of varying fragment lengths was incubated to Cot 25 with driver DNA 300 nucleotides long, and the fraction of the radioactive DNA which bound to hydroxyapatite was measured [1,4]. At this Cot all of the highly repetitive DNA, approximately 50% of the middle repetitive DNA and less than 10% of the single copy DNA will have reassociated. The binding due to foldback DNA of various lengths was corrected by the formula R = (B - Z)/(0.91 - Z), where B is the fraction of fragments binding at Cot 25, Z is the fraction of DNA binding due to foldback sequences [1,4] and 0.91 is that fraction of the DNA capable of reassociating. These data are shown in Fig. 5 and indicate that a major portion (81%) of Euglena DNA which reassociates at Cot 25 exists as interspersed repetitive and single copy sequences. The steep rise in the early part of the curve and the abrupt change in slope to nearly zero argue that nearly all the single copy sequences which are interspersed with repetitive DNA are approximately 1500 nucleotides long. The shaded portion of this curve represents the range of statistically significant curves which may be drawn through these data. The average size of the repetitive DNA reassociating at Cot 25 is obtained from extrapolation of the initial part of the curve to the x-axis [4]. The range of the size of the repetitive sequences is from 1700 to 2100 nucleotide pairs and is dependent upon which fit for the data is used for extrapolation.

Fig. 5. Fraction of Euglena DNA containing repetitive sequence elements as a function of fragment length. 32P-labeled DNA fragments of various lengths were mixed with a 1750-fold excess of driver DNA 300 nucleotides long and incubated to Cot 25. The fraction of DNA fragments containing duplex regions (B) was measured on hydroxyapatite. B was corrected for the fraction of zero-time or foldback sequences (Z) and for the 9% unreactability using the equation R = (B - Z)/(0.91 - Z). The lines drawn through the points were calculated by a linear regression of the data and represent the two outside limits for a family of curves which can be drawn through the data points. Abscissa: fragment length (nucleotides).
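The foldback correction R = (B - Z)/(0.91 - Z) can be written as a short helper. This is only a sketch of the arithmetic; the function name and the input values are hypothetical and are not data from Fig. 5.

```python
def repetitive_fraction(b, z, reactable=0.91):
    """R = (B - Z) / (reactable - Z).

    b: fraction of fragments bound to hydroxyapatite at Cot 25.
    z: fraction bound at zero time (foldback) for the same fragment length.
    reactable: fraction of the DNA capable of reassociating (0.91 in the text).
    """
    if not 0.0 <= z < reactable:
        raise ValueError("z must lie in [0, reactable)")
    return (b - z) / (reactable - z)


# Hypothetical inputs, for illustration only.
print(round(repetitive_fraction(b=0.80, z=0.40), 3))
```

Subtracting Z from both numerator and denominator restricts the measurement to fragments that do not bind through foldback structures alone, so R reaches 1 when every reactable fragment carries a repetitive element.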

Hyperchromicity and S1 nuclease studies on reassociated long DNA fragments

The interspersion of single copy and repetitive DNA sequences was studied by measuring the hyperchromicity and S1 nuclease resistance of long DNA fragments reassociated to Cot 97. The reassociated DNA duplexes were prepared at Cot 97 rather than Cot 25 to include a greater fraction of the middle repetitive DNA (0.85) while still allowing less than 10% of the single copy DNA sequences to reassociate. The duplex structures were recovered from hydroxyapatite columns and melted in a spectrophotometer. Fig. 6 shows the increase in the relative A260 of the duplexes as a function of temperature resulting from the reassociation of single-stranded fragments 750 and 3500 nucleotides long. Table III summarizes the melting temperature, the hyperchromicity and the fraction of fragments bound to hydroxyapatite for these experiments. The Tm values of the duplexes formed by the reassociation to Cot 97 of DNA of various lengths are 80-81°C, 8°C below the Tm of native DNA. The hyperchromicity of these duplexes decreases with an increase in the size of the fragments.

The hyperchromicity due to collapse of single-stranded DNA and of reassociated pure repetitive DNA (S1 nuclease-treated Cot 97 DNA) was corrected for in calculating the average duplex content of the reassociated molecules [3,4]. When DNA samples 750 and 3500 nucleotides long are reassociated to Cot 97, the hyperchromicity of the molecules containing double-stranded regions is

[Fig. 6 plot: relative A260 as a function of temperature (°C, 60--100); one curve is labeled Native.]

Fig. 6. Thermal denaturation of reassociated Euglena DNA of various lengths. DNA samples with fragment lengths of 750 and 3500 nucleotides were adjusted to 0.12 M sodium phosphate (pH 6.8), denatured by boiling 5 min and reassociated to Cot 97. The samples were fractionated by hydroxyapatite chromatography, and the double-stranded fractions were dialyzed against 0.12 M sodium phosphate (pH 6.8) and melted in a spectrophotometer. A thermal denaturation profile of native DNA is included for comparison.

TABLE III

DUPLEX CONTENT OF Euglena DNA FRAGMENTS REASSOCIATED TO Cot 97

DNA of varying single-stranded lengths was reassociated to Cot 97. The size of the DNA was calculated by alkaline band sedimentation as described in Materials and Methods. Hyperchromicity (H) was calculated according to the formula H = [A260(98°C) - A260(60°C)]/A260(98°C). All absorbance measurements were corrected for thermal expansion of water. The Tm is that temperature which produced 50% increase in hyperchromicity. The average length of duplex fragments (nucleotide pairs) is equal to the initial single-stranded fragment length × fraction as duplex at Cot 97.

Single-stranded fragment length (nucleotides)              750       2500      3500

Fraction bound to hydroxyapatite                           0.72      0.83      0.81
Hyperchromicity                                            0.181     --        0.138
Tm (°C)                                                    80.0      --        81.0
Average duplex content of hydroxyapatite-bound fragments
  From hyperchromicity *                                   0.76      --        0.47
  From S1 nuclease **                                      --        0.61      --
Average length of duplex fragments (nucleotide pairs)      570       1525      1645

* Calculated relative to the hyperchromicity of Euglena DNA duplexes resistant to S1 nuclease and containing mismatched regions [3,4]. When DNA 3500 nucleotides long is reassociated to Cot 97 and the single-stranded regions digested with S1 nuclease, the remaining duplex structures have a hyperchromicity of 0.216 and a Tm of 61.5°C. The hyperchromicity and Tm of native DNA are 0.286 and 83.2°C, respectively. The hyperchromicity due to collapse of single-stranded DNA is 0.069 (after correction for thermal expansion). The average fraction of nucleotides in a duplex (D) is then: D = (H - 0.069)/(0.216 - 0.069).

** The reassociated DNA was treated with S1 nuclease prior to hydroxyapatite fractionation. The fraction of reassociated DNA which was S1 nuclease resistant was measured by hydroxyapatite chromatography. The average duplex content of the bound fragments was calculated as D = fraction S1 nuclease resistant/F, where F is the fraction of DNA prior to S1 nuclease treatment which will bind to hydroxyapatite.


0.181 and 0.138, respectively. The average length of the double-stranded regions in Cot 97 DNA duplexes increases with fragment length over the range of sizes used in these studies (Fig. 3) and is at least 1600 nucleotide pairs in length.
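The conversion from hyperchromicity to duplex content described in the first footnote of Table III can be followed numerically. The sketch below is only an illustration of that arithmetic: the function and constant names are hypothetical, and the reference hyperchromicities (0.069 for collapsed single strands, 0.216 for S1-resistant duplexes) are the values stated in the footnote text.

```python
# Reference hyperchromicities quoted in the footnote to Table III.
H_SINGLE_STRAND = 0.069  # collapse of single-stranded DNA
H_S1_DUPLEX = 0.216      # S1 nuclease-resistant Cot 97 duplexes


def duplex_fraction(hyperchromicity):
    """Average fraction of nucleotides in duplex, D = (H - 0.069)/(0.216 - 0.069)."""
    return (hyperchromicity - H_SINGLE_STRAND) / (H_S1_DUPLEX - H_SINGLE_STRAND)


def duplex_length(fragment_length, hyperchromicity):
    """Average duplex length = single-stranded fragment length x D."""
    return fragment_length * duplex_fraction(hyperchromicity)


# Fragment lengths and hyperchromicities reported for the two melted samples.
for length, h in [(750, 0.181), (3500, 0.138)]:
    d = duplex_fraction(h)
    print(f"{length} nt: D = {d:.2f}, duplex length ~ {duplex_length(length, h):.0f} np")
```

The D values round to the 0.76 and 0.47 listed in Table III; the lengths come out within a few nucleotide pairs of the tabulated 570 and 1645, which appear to have been computed from the rounded D values.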

The fraction of the duplex content of reassociated DNA was also determined by S1 nuclease digestion of structures formed from DNA 2500 nucleotides long. The reassociated DNA was divided in half, and one sample was used to measure how much of the DNA would bind to hydroxyapatite (83%). The remainder of the sample was treated with S1 nuclease, and the fraction of reassociated DNA which was S1 nuclease resistant (0.51) was measured by hydroxyapatite chromatography. The average duplex content of the bound fragments was calculated as D = fraction S1 nuclease resistant/F, where F is the fraction of DNA prior to S1 nuclease treatment which will bind to hydroxyapatite. The average size of the duplex fragments calculated in this fashion using DNA with initial length of 2500 nucleotides is 1500 nucleotide pairs.
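The S1 nuclease estimate is a simple ratio, and a minimal sketch of it is given here using the values quoted above (0.51 resistant, 0.83 bound, 2500 nucleotides). The function name is hypothetical and the code only restates the arithmetic; it is not the authors' procedure.

```python
def s1_duplex_content(s1_resistant_fraction, bound_fraction):
    """D = fraction S1 nuclease resistant / F (fraction bound before S1 treatment)."""
    if bound_fraction <= 0:
        raise ValueError("bound_fraction must be positive")
    return s1_resistant_fraction / bound_fraction


fragment_length = 2500  # initial single-stranded length, nucleotides
d = s1_duplex_content(0.51, 0.83)
average_duplex_length = fragment_length * d

print(f"D = {d:.2f}")
print(f"average duplex length ~ {average_duplex_length:.0f} nucleotide pairs")
```

Rounding D to 0.61 before multiplying gives the 1525 nucleotide pairs listed in Table III, which the text rounds to 1500.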

Both the hyperchromicity and the S1 nuclease measurements indicate that repetitive sequences equal to or longer than the non-repetitive sequences are interspersed with one another. The Tm of these repetitive duplex structures is 7--8°C less than the Tm for native DNA, indicating approximately 4--5% mismatch in the renatured duplexes [27].

Estimation of repetitive sequence lengths with S1 nuclease

DNA 2500 nucleotides long was reassociated to Cot 97. The resulting duplexes were treated with a single-stranded specific nuclease, and the size of the resistant material was measured directly [3]. The molecules containing reassociated repetitive duplexes were separated from the digestion products by hydroxyapatite chromatography, and their size distribution was determined by gel filtration on an Agarose A-50 column.

Fig. 7 shows the Agarose A-50 column profile of S1-resistant DNA reassociated to Cot 97. Approximately two-thirds of the S1-resistant DNA is excluded from the column. The excluded and included fractions were individually combined and the mean double-stranded and single-stranded size measured by band sedimentation in the analytical ultracentrifuge. The range of sizes of double-stranded molecules in each fraction was determined by electrophoresis on agarose gels. Table IV summarizes the sizes of the excluded and included fractions from the Agarose A-50 column.

The double-stranded structures formed at Cot 97 ranged in size from 100 to 20 000 nucleotide pairs and were fractionated into two components on the Agarose A-50 column. The DNA duplexes in the excluded fraction ranged in size from 850 to 20 000 nucleotide pairs. The size of this DNA determined by neutral band sedimentation is 4900 nucleotide pairs. The size of the single-stranded DNA making up these duplexes is 1030 nucleotides. The DNA in the included fraction of this column ranged in size from 100 to 850 nucleotide pairs on agarose gels. The size of the double-stranded DNA measured by band sedimentation was 1100 nucleotide pairs and consisted of renatured single-stranded DNA fragments 235 nucleotides long.

The observation that the molecular weight of the double-stranded structures in the excluded volume of the Agarose A-50 column is greater than the initial

[Fig. 7 plot: column elution profile as a function of fraction number; the excluded (Exc.) and included (Inc.) regions are marked.]

Fig. 7. Agarose A-50 column profile of S1 nuclease-resistant Cot 97 DNA. Euglena DNA with a single-stranded length of 2500 nucleotides was adjusted to 0.18 M NaCl and 10 mM Pipes, denatured by boiling and reassociated to Cot 97. The sample was adjusted to the appropriate ionic conditions for S1 nuclease digestion (see Materials and Methods) and treated with S1 nuclease for 1 h at 37°C. The digestion products were removed by hydroxyapatite chromatography, and the size distribution of the S1 nuclease-resistant duplexes was determined by gel filtration on an Agarose A-50 column. The size of the DNA in the excluded and included fractions was measured by electrophoresis on agarose gels and by band sedimentation in an analytical ultracentrifuge (see Table IV).

TABLE IV

MOLECULAR WEIGHT DETERMINATION OF FRACTIONS FROM AGAROSE COLUMN

DNA 2500 nucleotides long was reassociated in 0.18 M NaCl and 10 mM Pipes to Cot 97. The sample was adjusted to the S1 buffer conditions described in Materials and Methods and treated with S1 nuclease for 45 min at 37°C. The digestion products were removed by hydroxyapatite chromatography. The S1 nuclease-resistant duplex structures were eluted from hydroxyapatite using 0.48 M sodium phosphate (pH 6.8) and fractionated on an Agarose A-50 column as described in Fig. 7. The excluded and included fractions were individually combined, dialyzed against 12 mM sodium phosphate (pH 6.8) and concentrated with butanol [30].

DNA fraction                        Method of analysis       Double-stranded size    Single-stranded size
                                                             (nucleotide pairs)      (nucleotides)

Cot 97, S1 nuclease-resistant       Agarose A-50 column      67% > 1200              --
duplex from hydroxyapatite                                   33% < 1200              --
                                    Agarose gel *            100--20 000             --
Excluded fraction from Agarose      Agarose gel              850--20 000             --
A-50 column                         Band sedimentation **    4860                    1030
Included fraction from Agarose      Agarose gel              100--850                --
A-50 column                         Band sedimentation       1075                    235

* The double-stranded size of the DNA was analyzed by electrophoresis in agarose gels as described in Materials and Methods. The agarose gels were calibrated with λ-DNA digested with EcoRI and λ-DNA digested with HindIII. A combination of these two standards yielded DNA fragments varying in size from 0.5 to 21.9 kilobase pairs.

** Observed sedimentation coefficients of double-stranded DNA were measured in 1.0 M NaCl. The S20,w was calculated from the observed sedimentation coefficient as described in Materials and Methods. The S20,w values were used to calculate the molecular weights using Freifelder's equation [13]. The observed sedimentation coefficients of single-stranded DNA were measured in 0.9 M NaCl and 0.1 N NaOH. The S20,w was calculated from the observed S(pH 13) as described in Materials and Methods and used to calculate the molecular weights using Studier's [12] equation.


length of the single-stranded DNA suggests the presence of hyperpolymers and/or multi-stranded S1 nuclease-resistant structures. Either structure could result from the renaturation of tandem repeats. This is consistent with the earlier suggestion that the highly repetitive DNA sequences are linked to a more rapidly renaturing component.

Discussion

A Cot curve of Euglena DNA was originally shown to be composed of two kinetic components [10]. Acquisition of more data and extension of the range of Cot values used in the analysis of the reassociation kinetics revealed three kinetic components and a relatively high fraction (0.14) of foldback DNA in the nuclear DNA of Euglena. The data for the reassociation kinetics can be easily fit by assuming the observed second-order rate constant is that expected for a diploid organism containing 3 pg of DNA/nucleus [10] and having a genome size of 1.5 pg or 1.36 · 10^9 nucleotide pairs.
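The step from 3 pg of DNA/nucleus to a genome size in nucleotide pairs can be checked with a short calculation. The sketch below is an assumption-laden illustration: the average mass of about 660 g/mol per nucleotide pair is a commonly used approximation that is not stated in the text, and the function and constant names are hypothetical.

```python
AVOGADRO = 6.022e23                    # molecules per mole
ASSUMED_PAIR_MASS_G_PER_MOL = 660.0    # assumed average mass of one nucleotide pair


def picograms_to_nucleotide_pairs(picograms, pair_mass=ASSUMED_PAIR_MASS_G_PER_MOL):
    """Convert a DNA mass in picograms to a number of nucleotide pairs."""
    grams = picograms * 1e-12
    return grams * AVOGADRO / pair_mass


diploid_content_pg = 3.0                      # DNA per nucleus, from the text
haploid_content_pg = diploid_content_pg / 2   # genome size assuming diploidy
genome_pairs = picograms_to_nucleotide_pairs(haploid_content_pg)

print(f"{haploid_content_pg} pg ~ {genome_pairs:.2e} nucleotide pairs")
```

Under this assumed conversion factor the result is about 1.37 · 10^9 nucleotide pairs, in close agreement with the 1.36 · 10^9 quoted above; a slightly different mass per nucleotide pair would account for the small difference.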

Euglena contains a large fraction of repetitive DNA (0.65) and a relatively high proportion of foldback DNA (0.14). The repetitive DNA sequences consist of a highly repetitive component with a reiteration frequency of 1100 and a middle repetitive component whose sequences occur approximately 5 times/genome. The single copy DNA comprises only 12% of the genome and has a kinetic complexity approximately 40 times that of Escherichia coli.
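The comparison with Escherichia coli follows from the genome size and the single copy fraction. In the sketch below, the E. coli genome size of 4.2 · 10^6 nucleotide pairs is an assumed reference value that does not appear in the text, and the names are hypothetical; the Euglena figures are those quoted in this Discussion.

```python
EUGLENA_GENOME_PAIRS = 1.36e9   # genome size quoted in the text
SINGLE_COPY_FRACTION = 0.12     # single copy fraction quoted in the text
ASSUMED_E_COLI_PAIRS = 4.2e6    # assumed E. coli genome size, not from the text


def single_copy_complexity(genome_pairs, single_copy_fraction):
    """Complexity of the single copy component, in nucleotide pairs."""
    return genome_pairs * single_copy_fraction


complexity = single_copy_complexity(EUGLENA_GENOME_PAIRS, SINGLE_COPY_FRACTION)
ratio_to_e_coli = complexity / ASSUMED_E_COLI_PAIRS

print(f"single copy complexity ~ {complexity:.2e} nucleotide pairs")
print(f"ratio to assumed E. coli genome ~ {ratio_to_e_coli:.0f}")
```

With these inputs the ratio is a little under 40; a different assumed E. coli genome size would shift the ratio proportionally.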

DNA fragments 2000 nucleotides long contain both single copy and repetitive DNA sequences. The absence of a single copy component in the reassociation kinetics of DNA 2000 nucleotides long suggests that the single copy sequences are linked to repetitive DNA sequences on fragments of this length. The highly repetitive and middle repetitive DNA sequences on fragments of this size reassociate as expected based on the increase in size relative to that of DNA 300 nucleotides long. When DNA 8100 nucleotides long is reassociated, there also are two kinetic components. The observed rate constant for the middle repetitive DNA is similar to that expected for such sequences of this size, while the observed rate constant for the highly repetitive DNA is ten-fold greater than predicted. Again, the absence of any single copy component in the Cot curve argues for interspersion of the repetitive DNA with the single copy sequences. The large acceleration of the reassociation of the highly repetitive DNA suggests these sequences exist on DNA fragments 8100 nucleotides long with some other nucleotide sequences which reassociate very quickly, possibly foldback DNA sequences.

The foldback sequences present in Euglena DNA are present on 45% of DNA fragments 4000 and possibly 8600 nucleotides long, and they are separated by approximately 2000 nucleotides of repetitive and/or single copy sequences. Some of the foldback sequences may consist of DNA with a significantly higher mol% G + C content (53%) than the overall base composition of Euglena nuclear DNA (48 mol% G + C).

Reassociation kinetics monitored by hydroxyapatite chromatography tend to overestimate the actual fraction of repetitive DNA classes, thereby underestimating the single copy fraction. An alternative means of estimating the repetitive DNA content is by extrapolation of the R vs. L curve (Fig. 5). This


particular method generally gives estimates of the amount of repeated DNA somewhat less than that calculated from Cot curves [4]. The data for this curve were collected at Cot 25 and, therefore, should give a value for that fraction of repetitive DNA expected to reassociate at this Cot. Extrapolation of this curve through the y-axis gives a value of 46--48% for the fraction of repetitive DNA expected to reassociate for DNA fragments of finite size at Cot 25. A larger value of 63% for DNA 300 nucleotides long is obtained from the Cot curve shown in Fig. 1. This discrepancy in the fraction of repetitive DNA reassociated at this Cot is undoubtedly due to single copy DNA tails attached to the repetitive DNA sequences which reassociate at Cot 25 on DNA fragments 300 nucleotides long.

The length of the repetitive DNA sequences was determined by four independent methods, and all are in good agreement with one another. The R vs. L curve yields a length for the repetitive DNA sequences in the range 1700--2100 nucleotide pairs. The hyperchromicity studies of reassociated repetitive duplexes yield a length for the repetitive DNA of 1600 nucleotide pairs. The average length of duplex fragments obtained by measuring the fraction of S1 nuclease-resistant DNA which will bind to hydroxyapatite is 1500 nucleotide pairs. Direct measurement of the size of S1 nuclease-resistant DNA duplexes at Cot 97 yields a bimodal distribution of repetitive DNA sequences. Two-thirds of this DNA is approximately 4900 nucleotide pairs long while the remaining one-third is about 1100 nucleotide pairs long.

Euglena contains chloroplast and mitochondrial DNA. These sequences have been included in the DNA isolated for these experiments, but neither type of DNA molecule should have contributed significantly to the kinetic components seen in these Cot curves. Chloroplast DNA is the more abundant of the two organelle DNAs and accounts for a total chemical complexity of 2.76 · 10^4 nucleotide pairs or 2% of the total cellular DNA [28].

There are no appreciable nuclear satellites in Euglena (Rawson, J.R.Y., unpublished results). When total cell DNA has been analyzed by complexing the DNA with Hg(II) or Ag+ and banded in Cs2SO4 gradients, less than 5% of the DNA appears as unique DNA components. Half of the satellite sequences are chloroplast and mitochondrial DNAs [29].

The nuclear DNA in Euglena is contained in 45--50 chromosomes which are permanently condensed [9]. The presence of the large fraction of foldback DNA in this organism may have some significance in the peculiar continual condensed structure of its chromosomes. A final peculiarity of this alga is that it requires such a large genome complexity for such a simple lifestyle.

Acknowledgements

This work was supported by a grant from the National Science Foundation (PCM77-0529). S.C. was supported by an NIH Predoctoral Traineeship (5-T32-GM07103-03).

References

1 Davidson, E.H., Hough, B.R., Amenson, C.S. and Britten, R.J. (1973) J. Mol. Biol. 77, 1--23
2 Zimmerman, J.L. and Goldberg, R.B. (1977) Chromosoma 59, 227--252
3 Goldberg, R.B., Crain, W.R., Ruderman, J.V., Moore, G.P., Barnett, T.R., Higgins, R.C., Gelfand, R.A., Galau, G.A., Britten, R.J. and Davidson, E.H. (1975) Chromosoma 51, 225--251
4 Graham, D.E., Neufeld, B.R., Davidson, E.H. and Britten, R.J. (1974) Cell 1, 127--137
5 Manning, J.E., Schmid, C.W. and Davidson, N. (1975) Cell 4, 141--155
6 Crain, W.R., Eden, F., Pearson, W., Davidson, E.H. and Britten, R.J. (1976) Chromosoma 56, 309--326
7 Crain, W.R., Davidson, E.H. and Britten, R.J. (1976) Chromosoma 59, 1--12
8 Whittaker, R.H. (1969) Science 136, 150--160
9 Leedale, G.F. (1958) Nature 181, 502--503
10 Rawson, J.R.Y. (1975) Biochim. Biophys. Acta 402, 171--178
11 Brown, R.D. and Haselkorn, R. (1971) J. Mol. Biol. 59, 491--503
12 Studier, F.W. (1965) J. Mol. Biol. 11, 373--390
13 Freifelder, D. (1970) J. Mol. Biol. 54, 567--577
14 Rawson, J.R.Y., Kushner, S.R., Vapnek, D., Alton, N.K. and Boerma, C.L. (1978) Gene 3, 191--209
15 Helling, R.B., Goodman, H.M. and Boyer, H.W. (1974) J. Virol. 14, 1235--1244
16 Wellauer, P.K., Reeder, R.H., Carroll, D., Brown, D.D., Deutch, A., Higashinakagawa, T. and Dawid, I.B. (1975) Proc. Natl. Acad. Sci. U.S. 71, 2823--2827
17 Britten, R.J. and Kohne, D.E. (1968) Science 161, 529--540
18 Britten, R.J., Graham, D.E. and Neufeld, B.R. (1974) Methods Enzymol. 29, 363--418
19 Pearson, W.R., Wu, J. and Bonner, J. (1978) Biochemistry 17, 51--59
20 Galau, G.A., Britten, R.J. and Davidson, E.H. (1974) 2, 9--20
21 Mandel, M., Schildkraut, C.L. and Marmur, J. (1968) Methods Enzymol. 12, 184--195
22 Vogt, V.M. (1973) Eur. J. Biochem. 33, 192--200
23 Sachs, D.H. and Painter, E. (1972) Science 175, 781--782
24 Rawson, J.R.Y. and Boerma, C.L. (1976) Biochemistry 15, 588--592
25 Wetmur, J.G. and Davidson, N. (1968) J. Mol. Biol. 31, 349--370
26 Hamer, D.H. and Thomas, C.A., Jr. (1974) J. Mol. Biol. 84, 139--144
27 Bonner, T., Brenner, D.J., Neufeld, B.R. and Britten, R.J. (1973) J. Mol. Biol. 81, 123--135
28 Rawson, J.R.Y. and Boerma, C.L. (1976) Proc. Natl. Acad. Sci. U.S. 73, 2401--2404
29 Gruol, D., Rawson, J.R.Y. and Haselkorn, R. (1975) Biochim. Biophys. Acta 414, 20--29
30 Stafford, D.W. and Bieber, D. (1975) Biochim. Biophys. Acta 378, 18--21