Intercalation-Mediated Synthesis and Replication: A New Approach to the Origin of Life

J. theor. Biol. (2000) 205, 543–562
doi:10.1006/jtbi.2000.2084, available online at http://www.idealibrary.com

NICHOLAS V. HUD*† AND FRANK A. L. ANET‡

*School of Chemistry and Biochemistry, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, U.S.A. and ‡Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, U.S.A.

†Author to whom correspondence should be addressed. E-mail: hud@chemistry.gatech.edu

(Received on 1 December 1999, Accepted in revised form on 15 April 2000)

We propose that a molecular midwife, a flat molecule approximately 10 Å × 10 Å with two hydrophobic faces, was essential to the origin of life. This molecule was positively charged, water soluble and did not strongly associate with itself in solution. It may have been a derivative of phthalocyanine that no longer exists on the Earth today, and might have been formed solely from hydrogen cyanide and formaldehyde. The midwife tended to intercalate between side groups (bases, similar to those in RNA) of polymers to form stacks, which incorporated bare bases. The midwife alternated in these stacks with hydrogen-bonded tetrads of bases. Under conditions of low water activity, as in a desert during the day, bare bases in the stacks were joined together by neutral and chemically heterogeneous backbones of no fixed chirality. The components of the backbones were the products of the formose reaction of formaldehyde, and were involved in the reversible formation of N-glycosides and acetals catalysed by divalent metal ions. The final product of this assemblage was a fully intercalated quadruplex of four information-containing polymer strands (four proto-RNA molecules). This process constituted replication of the original polymer that had seeded the formation of the stack. The stack structure ensured that the polymer's base sequence was replicated faithfully despite the lack of both homochirality and chemical homogeneity in the backbone. At night, water from condensing dew would suddenly come in contact with these products, quenching all chemical reactions and releasing midwife molecules and single- or double-stranded proto-RNA. Evaporation of water during the day then gave new stacks containing one or two proto-RNA strands, bare bases, and midwife molecules, which could begin a new replication cycle. Our model also allows for the generation of new stacks and the extension of existing ones, without restricting the base sequence of either, thereby providing a source of genetic information. The proto-RNA replication cycle is driven purely by concentration changes caused by the Sun and the rotation of the Earth. We propose that this system as a whole could have gradually evolved into the RNA World.

© 2000 Academic Press

1. Introduction

It is widely accepted that life evolved from a far simpler self-replicating molecular system than that existing in current living organisms (Deamer & Fleischaker, 1994). However, what molecules actually participated in the origin of life has not been determined (Miller, 1987; Orgel, 1998). The farther life evolves from its origins, the more difficult it will be to prove that any given system was the one which first achieved self-replication and evolved into life as we know it. This is particularly true if life has ceased to utilize one or more molecular species that were crucial to its origins.

Since the discovery of ribozymes nearly two decades ago (Guerrier-Takada et al., 1983; Kruger et al., 1982), much attention has focused on the hypothesis that an early form of life used RNA for both information storage and chemical catalysis (Gesteland & Atkins, 1999), before the emergence of DNA and protein enzymes. This "RNA World" hypothesis is attractive because it alleviates the long-standing chicken-and-the-egg paradox concerning which came first, proteins or nucleic acids. Ardent supporters of the RNA World hypothesis believe that RNA participated in the early evolution of life and perhaps even in the actual origin of life (Hager et al., 1996). A feasible process for the spontaneous generation and replication of RNA (or any information-containing polymer) under acceptable pre-biotic conditions would largely solve the problem of the origin of life, because Darwinian evolution could then come into play to select mutations and ultimately create more complex living forms. However, experimental support for the spontaneous generation and replication of RNA is far from satisfactory (Cairns-Smith, 1982; Joyce & Orgel, 1999; Shapiro, 1984). Two extreme and opposite views can be formulated: (a) it is just a matter of time and further research for the difficulties (listed in Section 2) to be surmounted, and (b) it is impossible to overcome the difficulties, except for meaningless improvements.

Numerous workers interested in the origin of life have come to the conclusion that it is impossible for RNA to have been the first molecule of life. They have searched for alternative theories, with an origin of life based, for example, on proteins (de Duve, 1991), lipids (Segre & Lancet, 1997), inorganic crystals (Cairns-Smith, 1982; Hartman, 1998), autotrophy based on FeS–H2S (Wächtershäuser, 1992) or extraterrestrial sources of life (postponing its origin) (Crick, 1981). All these theories have serious difficulties of their own and none has received wide acceptance. A rather complete description of extant theories of life's origin has recently been published (Lahav, 1999).

The idea of a self-replicating RNA under pre-biotic conditions is nevertheless very appealing because of its simplicity and directness, and therefore attempts have been made to remove some of the difficulties by making minor variations in this polymer, but these have not been really successful and this area of research seems to have reached an impasse (see Appendix A). We believe that several large and simultaneous changes in the current thinking are necessary in order to formulate a feasible scheme for the spontaneous generation and replication of an information-carrying polymer, more or less resembling RNA. This requires finding solutions to all of the items in the list of difficulties given below and that is the aim of the present paper.

2. Difficulties with the Spontaneous Generation and Replication of RNA

Before elaborating our hypothesis, we list here nine specific difficulties which stand in the way of RNA polymers being accepted as the first self-replicating molecules of life.

(1) The need for sources of the two purine bases, adenine (A) and guanine (G), and the two pyrimidine bases, uracil (U) and cytosine (C). Syntheses of these bases from simple molecules such as HCN, ammonia, cyanate, and urea, in small to moderate yields have been fairly successful (Ferris et al., 1968; Miller, 1987; Oró & Kimball, 1961; Sanchez et al., 1966, 1967), but not under the same conditions and not without the formation of related compounds.

(2) The need for a source of a homochiral sugar (D-ribose). The simplest source of sugars, namely the formose reaction of formaldehyde, produces only about 1% of DL-ribose in a complex mixture of chemically related products (Decker et al., 1982; Miller, 1987; Mizuno & Weiss, 1974), and there is no known mechanism for the separation of D-ribose (from L-ribose and other products) under plausible pre-biotic conditions.

(3) The need for chemically combining the bases with D-ribose in a selective way to give the natural nucleosides.

(4) The need for a source of high-energy ("activated") phosphate groups (or of molecules that can produce such groups). These compounds are known to undergo rapid hydrolysis in water and thus tend not to accumulate. The inactive hydrolysis (waste) products associated with the use of activated phosphates must be constantly recycled, or at least eliminated from the immediate environment.

(5) The need for the nucleosides to undergo phosphorylation at specific positions on the ribose residues so as to give the natural nucleotides.

(6) The need to convert the nucleotides to their activated forms.

(7) The need to build a new RNA polymer chain from the activated nucleotides using an existing RNA molecule as a template, so as to produce a complementary RNA strand containing as few errors as possible.

(8) The need to dissociate the new RNA chain from the old one, so that further replication can occur.

(9) All of the above eight requirements must be fulfilled in the absence of enzymes.

3. Inspirations from Cairns-Smith's Inorganic Hypothesis for the Origin of Life

Cairns-Smith (1982), in his inorganic crystal (clay) hypothesis for the origin of life, has stated that crystals are known to be highly selective in the molecules that can be incorporated at a growing crystal surface, unlike the lack of selectivity for molecules added to the end of a growing polymer chain. Furthermore, since covalent bond formation is not involved in the growth of a crystal, the addition of molecules is a reversible process. This allows errors, such as the incorporation of an unwanted molecule, to be corrected by the exchange of molecules between the crystal and solution. These advantages are normally absent in a growing organic polymer chain, unless enzymes are present to select molecules and to correct errors, as occurs in current living organisms. However, we argue that these are not inherent differences between organic and inorganic systems, and therefore one should try to devise organic systems that have the advantages claimed by Cairns-Smith for inorganic crystals. The selectivity of crystals comes from the packing of molecules in a three-dimensional array, but selectivity, even if somewhat reduced, should also occur in two-dimensional and even in one-dimensional arrays of non-covalently bonded molecules. The advantages of a reversible addition of molecules enjoyed by crystals can also be obtained in building a covalently linked organic polymer, provided that its structure is appropriate and reversible reactions are used. We propose that this was in fact the case for the first informational polymers of life, which were functionally and structurally analogous, but not identical, to present-day nucleic acids.

4. Structure of a Proposed Proto-RNA

Our polymer analog of RNA, which for simplicity we call proto-RNA, has some peculiar features but is nevertheless similar to RNA, and therefore a potential candidate for incremental evolution into RNA. It is an information-carrying polymer, making use of four kinds of bases, which can best be described as A-, U-, G-, and C-like. Each kind of base is not necessarily a unique molecule, but most likely a member of a set of molecules which can replace one another in an information recognition role. However, the four sets of molecules have no molecule in common. The reason for using four groups of bases is discussed further in Appendix B.

In our proto-RNA, the four kinds of bases are linked together so as to be separated by a distance similar to the adjacent base separation in RNA (approximately 7 Å), but the linking molecular components are not D-ribose and phosphate. The molecules that covalently link the bases in proto-RNA form a backbone that is not unique and constant for all links. Thus, the backbone is chemically heterogeneous and its component molecules have no fixed chirality. The precise chemical nature of this backbone is generally not retained during replication of proto-RNA, unlike the base sequence. A general structure for proto-RNA is depicted in Fig. 1.

The source of the backbone molecules in proto-RNA is postulated to be the products of the formose reaction of formaldehyde, CH2O (Decker et al., 1982). These products are of two different structural types. The first type includes aldehydic sugar molecules such as pentoses and hexoses, whether in the furanose or pyranose form. These sugars have a cyclic hemiacetal structure and can combine with the bases, which contain weakly acidic NH groups, to give stable cyclic N-glycosides which we call proto-nucleosides in analogy with the natural RNA nucleosides.

FIG. 1. A diagrammatic structure for proto-RNA, showing a portion of the polymer chain. The bases are A-, U-, G- or C-like. The sugar molecules are typically pentoses or hexoses of no fixed chirality and form N-glycoside bonds with the bases (the oxygen atoms shown are actually part of the nearest sugar molecule). The R group can be CH2OH, CH(CH2OH)2, or CHOH–CH2OH, for example, and can vary along the chain.

The second type of molecule is also aldehydic, but cannot form stable monomeric hemiacetal structures. This group includes formaldehyde itself, glycolaldehyde (CHO–CH2OH, a major product in the formose reaction), glyceraldehyde (CHO–CHOH–CH2OH), tetroses and short branched-chain sugars, and accounts for roughly half of the products of the formose reaction. These compounds exist in water largely in a hydrated form, containing a terminal –CH(OH)2 rather than an actual –CHO group (Wagner et al., 1990). The hydrated aldehydes cannot form simple stable cyclic N-glycosides of the type mentioned above, but can link together two different molecules containing OH groups to give acetal structures that are stable to hydrolysis in neutral or basic aqueous solutions. The thermodynamic equilibria between an aldehyde hydrate, R–CH(OH)2, the hemiacetal, R–CH(OH)(OR′), and the acetal, R–CH(OR′)(OR″), strongly favor the hemiacetal over the hydrate and the acetal over the hemiacetal (Wiberg et al., 1994) and are discussed further in Appendix B. Acetals are more thermodynamically favored than are dialkyl phosphates relative to their respective hydrolysis products, especially near a neutral pH, a favorable circumstance for our model. Oligonucleotides with one phosphate link replaced by an acetal derived from formaldehyde have been prepared and found to form duplex structures with RNA and DNA (Gao, 1997; Jones et al., 1993). This suggests that our proposed proto-RNA would have had structural properties similar to contemporary nucleic acids.
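The hydrate, hemiacetal and acetal equilibria mentioned in this section can be written out schematically as follows. The equilibrium constants K_1 and K_2 and the water activity symbol a_{H2O} are labels introduced here for illustration only; the text gives no numerical values, and R'OH and R''OH stand for any two hydroxyl-bearing molecules.

```latex
\begin{align}
\mathrm{RCH(OH)_2} + \mathrm{R'OH} &\;\rightleftharpoons\; \mathrm{RCH(OH)(OR')} + \mathrm{H_2O},
& K_1 &= \frac{[\mathrm{RCH(OH)(OR')}]\,a_{\mathrm{H_2O}}}{[\mathrm{RCH(OH)_2}]\,[\mathrm{R'OH}]}, \\
\mathrm{RCH(OH)(OR')} + \mathrm{R''OH} &\;\rightleftharpoons\; \mathrm{RCH(OR')(OR'')} + \mathrm{H_2O},
& K_2 &= \frac{[\mathrm{RCH(OR')(OR'')}]\,a_{\mathrm{H_2O}}}{[\mathrm{RCH(OH)(OR')}]\,[\mathrm{R''OH}]}, \\
\frac{[\mathrm{RCH(OR')(OR'')}]}{[\mathrm{RCH(OH)_2}]} &= K_1 K_2\,\frac{[\mathrm{R'OH}]\,[\mathrm{R''OH}]}{a_{\mathrm{H_2O}}^{2}}.
\end{align}
```

Both steps release water, so for fixed K_1 and K_2 a lower water activity raises the acetal-to-hydrate ratio. Read this way, the sketch is consistent with the model's reliance on periods of low water activity for backbone formation.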

That a proto-RNA structure of the type described could undergo faithful replication of its base sequence information is, at least at first sight, a heresy. It has been repeatedly and very forcefully enunciated that an information-carrying polymer must consist of homochiral and chemically homogeneous molecular components (Avetisov & Goldanskii, 1996; Bolli et al., 1997; Goldanskii et al., 1986; Joyce et al., 1984). This supposition, and the general interest in the origin of biological homochirality, have led to an enormous (but unsuccessful, we think) effort to find a plausible pre-biotic source of homochiral molecules (e.g. homochiral nucleotides) (Avetisov & Goldanskii, 1996; Bolli et al., 1997; Bonner, 1991). Before we can discuss the replication of proto-RNA, we need to discuss further the nature of the bases and introduce the midwife molecule, a critical component of our hypothesis.

5. Base Tetrads and the Molecular Midwife

We assume that proto-RNA base recognition and information transfer involves base–base hydrogen bonding, similar to that of nucleic acids in contemporary life. For example, an A-like base can form a Watson–Crick base pair with a U-like base. This is important for proto-RNA function, and to facilitate its evolution into RNA. However, we believe that such a simple base pairing is not sufficient by itself for the selection, with a low error rate, of the required bases from the array of related compounds potentially present on the pre-biotic Earth (Benner et al., 1999). It would also not select a purine–pyrimidine base pairing system over possible alternatives (e.g. a purine–purine base pairing system).

Watson–Crick base pairing is generally accepted by supporters of the RNA World to have been the mode of information transfer since the advent of nucleic acids (Joyce & Orgel, 1999). Nevertheless, attempts to use single-stranded nucleic acid polymers as templates for the spontaneous polymerization of activated nucleosides, which relies exclusively on the formation of Watson–Crick base pairs, have met with limited success and have fallen far short of what is considered necessary to initiate self-replication (Joyce & Orgel, 1999). The difficulties observed in these studies include the inefficient polymerization of pyrimidine nucleotides on purine-rich templates, and the tendency for single-stranded templates to form stable intramolecular structures (Joyce, 1987). Additional difficulties associated with enzyme-free template-directed polymerization, such as the need to separate the individual strands of a duplex to act as templates (Luther et al., 1998) [problem (8) of Section 2], also indicate that an alternative mechanism for proto-RNA replication is in order.

FIG. 2. Isomorphous (A·U)2 and (G·C)2 tetrads constructed by placing the sugar C1′ carbon atom of all nucleosides in both tetrads at the corners of congruent rectangles (after the models of McGavin, 1971). Both tetrads have two-fold (C2) rotational symmetry about an axis perpendicular to the paper, as shown. Although the four N–R bonds in each structure are not related by exact symmetry, they are approximately related by rotation about horizontal and vertical two-fold axes that intersect in the center of the structure.

To circumvent the problems which thwart replication via single-stranded template-directed polymerization, we propose a mechanism for proto-RNA information transfer and replication that is based upon two distinct yet highly coupled ideas.

(1) Base selection, information transfer and proto-RNA replication involve the formation of base tetrads which incorporate Watson–Crick base pairs together with additional hydrogen bonds.

(2) A molecular entity critical for proto-RNA synthesis and replication is present in the environment and functions as a template for base tetrad formation. By intercalation, it can cause a structural transition of proto-RNA from its "biologically" active (i.e. folded and twisted) form to its untwisted replicative state in a stack. We refer to this molecular entity as the molecular midwife and recognize that it can be one of several closely related molecules that perform the same function.

Our hypothesis allows only two kinds of base tetrads (Fig. 2), either (G·C)2-like or (A·U)2-like, which are formed by the dimerization of two identical Watson–Crick base pairs. These tetrads would be more selective for the bases than Watson–Crick pairs alone due to the additional hydrogen bond complementarity required between the edges of like base pairs. Base tetrads also have several advantages over base pairs for proto-RNA replication, as will be discussed below. However, to propose that free bases or nucleosides spontaneously formed tetrads with an existing proto-RNA polymer in solution is unreasonable, since the act of immobilizing a large number of molecules would be associated with a considerable and unfavorable change in entropy. In fact, the free bases do not form stable Watson–Crick base pairs in aqueous solution, but rather associate face to face with hydrogen bonds satisfied by water molecules (Ts'o, 1974).

One way to stabilize a tetrad of bases would be to place a single planar molecule on each face to generate a columnar stack, or at least the beginning of such a stack, as shown in Fig. 3. The added molecule should be large enough to just cover the four bases in the tetrad and must not interfere with hydrogen bonding between the bases in the tetrad. This is our molecular midwife. The molecular midwife must be water soluble and must not tend to associate strongly with itself, yet its two faces should be more or less hydrophobic in order to attract the bases. In addition to tetrad stabilization by stacking interactions, the space between two midwife molecules would provide a local water-free environment which favors the tetrad hydrogen bonds (Williams et al., 1989). Thus, two adjacent midwife molecules in a stack could be viewed as a bimolecular receptor of a base tetrad.

The ability of the midwife faces to absorb the bases from solution serves, in one capacity, to increase the local base concentration. This would be critical in a solution where the bases were either of low concentration or in the presence of closely related molecules that could not function as proto-RNA bases. This role is analogous to that previously proposed for inorganic surfaces in the origin of life (Bernal, 1951), but is accomplished in a more versatile and satisfactory way by the molecular midwife than by a surface on a solid.

We suggest that the molecular midwife is possible and that it may have occurred on the Earth in pre-biotic times, although it is no longer found on the Earth today. It may have been a mixture of closely related isomers such as positively charged isomers of tetramethylated octa-aza derivatives (Fig. 4) of the well-known (insoluble) pigment molecule, phthalocyanine (Fig. 4) (Kudrevich & van Lier, 1996; Linstead et al., 1937). The (positive) electric charge on the methylated nitrogen atoms leads to water solubility and prevents, or at least reduces, the direct self-association of these molecules in solution. Related tetra-aza derivatives are known and have been used as water-soluble histochemical dyes (Scott, 1972). The primary reason we propose this class of molecules is the remarkable correspondence between the shape of the base tetrads and phthalocyanine (Fig. 5). The similarity between one-half of phthalocyanine and a Watson–Crick base pair was previously noted (Scott, 1972). Here, we emphasize the similarity between a whole phthalocyanine and a base tetrad. Another satisfying feature of the tetramethylated octa-aza derivatives of phthalocyanine is the existence of plausible syntheses from pre-biotic molecules (Appendix B).

6. Replication of Proto-RNA

6.1. THE PHYSICAL ENVIRONMENT

We now consider the replication of relatively short proto-RNA polymers, even though the original creation of such polymers (see Section 7) has not yet been discussed. The reason we do this is that replication must be a well-defined process since it is a common event that is continually taking place. On the other hand, the original formation of a moderately short proto-RNA polymer could be a rare event, as long as the polymer can replicate and there is a feasible mechanism for chain extension.

For simplicity, we start with a single-stranded molecule of proto-RNA surrounded by water. Present in solution are the midwife molecules, the four kinds of bases, the two kinds of products of the formose reaction (as discussed above) and inorganic divalent ions such as Ca2+ and Mg2+. The initial concentrations of these compounds are low enough so that there is no appreciable inter-molecular association.

The physical environment (Lahav, 1999) that we consider the most appropriate for replication purposes is not the classical primordial soup, nor even a beach or a small pond, but instead tiny cavities just below the surface of a desert, similar to the Atacama desert in Chile. Rainfall is virtually absent in this desert, but dew forms at night because of the presence of a cold ocean current nearby, allowing plants to exist and grow (Horowitz, 1992). In the absence of membranes this environment could be advantageous, for rain or excess water might be deleterious as it could cause unwanted dilution and dissociation of organized arrays of molecules. Furthermore, our process for the chemistry of proto-RNA synthesis requires periods of extremely low water activity, which are of course inherent to a desert.

FIG. 3. (a) A minimalist representation of the hypothetical molecular midwife as a planar aromatic molecule with a surface of approximately the same size and shape as the base tetrads in Fig. 2. (b) A space-filling model of an (A·U)2 tetrad stacked on the surface of the molecular midwife. (c) Two molecular midwives intercalated by an (A·U)2 tetrad. (d) A space-filling model of a (G·C)2 tetrad stacked on top of a molecular midwife which is itself stacked on top of an (A·U)2 tetrad. Molecular midwives are blue. Adenine and uracil are red and green, respectively. Guanine and cytosine are light blue and yellow, respectively. The hydrogen atoms of the bases which will be displaced upon nucleoside formation are shown in magenta.

FIG. 4. Top: the structure of phthalocyanine. Divalent metal cations, such as Cu2+ and Mg2+, can replace the two inside NH protons. Bottom: a suggested structure for the molecular midwife. There are three other isomers possible if the four methyl groups are restricted to different rings, and any synthesis is likely to give a mixture of all four isomers.

FIG. 5. (Left and right) Space-filling models of the (A·U)2 and (G·C)2 tetrads. (Center) Space-filling model of phthalocyanine. The outline of phthalocyanine is superimposed on the two tetrads to illustrate the similarity of the shape of the aromatic surfaces of these structures. Color scheme: C, green; N, blue; O, red; H, white.

6.2. FIRST STAGE OF REPLICATION: FORMATION OF STACKS

We start with a small cavity, shielded from direct UV irradiation, and containing only a few drops of water formed by dew percolating into the cavity during the cold night. Proto-RNA and the molecules mentioned above are in solution. In the course of the day, the ground becomes hot and water in the cavity evaporates. The increase in concentration causes the midwife molecules to become intercalated between the bases of the proto-RNA. Because the area of the midwife molecule (more or less square shaped, about 10 Å on each side) is much larger than that of a single base, a low-energy structure can only be formed if free bases from the aqueous solution are also incorporated into a stack, which then contains alternating midwife molecules and base tetrads, as shown in Fig. 6. The formation of columnar stacks such as these has been observed for a wide variety of planar molecules in studies of liquid crystals (Guillon, 1999). Rod-like aggregates of molecules (e.g. porphyrins) in dilute aqueous solutions have also been studied rather extensively (Fuhrhop et al., 1992). We refer to the stack as though it were positioned with its primary axis in a vertical direction, so the four bases in a tetrad are horizontal and tetrads and midwife molecules are stacked vertically.

An important feature of this model is that the single complete proto-RNA molecule that exists in the stack was originally synthesized (Section 7) as part of a quadruplex of proto-RNA molecules that were fully intercalated with midwife molecules. This ensures that the proto-RNA molecule allows maximum intercalation by midwife molecules and is easily accommodated into a new stack, irrespective of the heterogeneity of the proto-RNA's backbone sugar and acetal linkages. This is reminiscent of molecular imprinting (Wulff, 1993, 1995). In this process polymers which recognize a target molecule are generated by allowing heterogeneous monomers to bind through non-covalent bonds to the target molecule and then cross linking these monomers into a polymer. Recognition of the molecular midwife by proto-RNA should be even more specific than that typically achieved by imprinted polymers, because of the tight integration of proto-RNA and midwife molecules in a stack.

FIG. 6. (Top) Replication of a single-stranded proto-RNA polymer by the formation of a replication stack with molecular midwives and free bases. (a) A single-stranded polymer folded into a three-dimensional structure. (b) Molecular midwives intercalate successive bases and straighten the polymer backbone. (c) Complete intercalation and tetrad formation results in a replication stack with a base sequence dictated by the single-stranded polymer. (d) Formation of a backbone along the edges with free bases results in the production of three progeny strands. (Bottom) Replication of a proto-RNA duplex. (e) A proto-RNA duplex (the actual degree of helical twist is unknown due to the chemical heterogeneity of the proto-RNA backbone). (f) Intercalation of base pairs by molecular midwives causes the duplex to unwind. (g) The cleft generated by the extension of the molecular midwives becomes a receptor for Watson–Crick pairs from free bases which can participate in the formation of tetrads with the existing duplex. (h) Formation of backbones along the stack edges with covalent links to the bases, resulting in the production of two daughter strands in the stack.

At this stage, only one base in each tetrad is linked to a backbone, this being the base of the original proto-RNA molecule [Fig. 6(c)]. Because of hydrogen bonding between the bases, and the thermodynamic preference for the shape of the base assemblies to match that of the flanking midwife molecules, there should be excellent selectivity for the free bases that become incorporated from solution into any particular tetrad unit of the stack. For example, if the proto-RNA (backbone-linked) base in a particular tetrad of the stack is an A-like purine, such as adenine itself, then the other three bases consist of two complementary U-like pyrimidine bases and one A-like purine base, hydrogen bonded as previously described (Fig. 2). The sequences of (unattached) bases that are on three vertical edges of the stack are either identical or complementary to the base sequence in the proto-RNA that resides on the remaining stack edge.
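The complementarity rule described above can be made concrete with a short sketch. It assumes a one-letter shorthand (A, U, G, C) for the A-like, U-like, G-like and C-like base groups, and it pairs bases position by position along the stack without reversing the strand. The shorthand and the function name `progeny_edges` are illustrative assumptions, not notation from the model itself.

```python
# Illustrative sketch only: one-letter codes stand for base *groups*
# (A-like, U-like, G-like, C-like), an assumed shorthand.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def progeny_edges(template: str) -> list[str]:
    """Return the base sequences on the three free edges of a stack.

    For each tetrad, the template base is joined by one base of the same
    group and two bases of the complementary group, so the three free
    edges carry one identical and two complementary sequences.
    """
    identical = template
    complementary = "".join(COMPLEMENT[base] for base in template)
    return [identical, complementary, complementary]


print(progeny_edges("AUGC"))
```

Under these assumptions, a template edge determines all three remaining edges, which is the sense in which the stack carries the sequence information of the seeding polymer.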

The bases in the stack, if not linked to a backbone, are in a dynamic equilibrium with the bases in solution and can exit and be reintroduced into the stack. Because this is costly in terms of energy, only one base in the stack is likely to be removed at any given time, so that the stack as a whole maintains its integrity. This process, of course, achieves the reversibility and selectivity claimed for inorganic crystals by Cairns-Smith (1982), and which we aimed at introducing into an organic polymer.

6.3. SECOND STAGE OF REPLICATION: FORMATION OF BASE GLYCOSIDES IN STACKS

The replication process is not yet finished because the new bases introduced into the stack, though they have the correct sequential order, are not covalently linked together. However, the bases are held by the stack structure at the correct distance from one another for the next step in the proto-RNA synthesis. The midwife molecules in such a stack structure behave much as do enzymes that bring molecules together by specific complexation. Upon further evaporation of the water to near dryness, the highly water-soluble products of the formose reaction aggregate around the outside of the midwife-tetrad stack, and the inorganic divalent ions coordinate with the oxygen atoms of these molecules.

At this stage, the bases in the stack that are not part of the proto-RNA combine with sugars (mostly pentoses or hexoses) to give N-glycosides, analogous to the natural nucleosides, as elaborated previously in the description of proto-RNA (Section 4). The fact that the reacting bases are wedged between positively charged midwife molecules in the stacks should significantly increase the acidity of the exposed NH groups of these bases, greatly facilitating reactions with sugars. This is particularly important for the pyrimidine bases, as it has proved almost impossible to convert these bases to nucleosides by their reaction with ribose using plausible prebiotic chemistry (albeit without the assistance of our molecular midwife). A second consequence of the bases being a part of a tetrad in a stack is that N-glycosidation can only take place on those NH groups that are exposed to the outside, but not on those that are internally hydrogen bonded and on the inside of the tetrad. This should prevent, for example, any reaction on the adenine NH₂ group, which can occur when free adenine in solution reacts with a sugar (Fuller et al., 1972).

6.4. THIRD STAGE OF REPLICATION: COMPLETION OF BACKBONES

The final step in backbone formation is the linking together of vertically adjacent base-attached sugars on the outside of the stack via acetal bonds (Fig. 6). Aldehydic molecules of the second type from the formose reaction are involved in this joining of neighboring sugar residues, as was also discussed in our description of proto-RNA (Section 4).

In our hypothesis, the linkage of sugars along each edge of a stack which contains new bases is favored over the formation of linkages with sugars in solution. This does not rely so much on the chemical composition of the backbone (which can vary widely within certain limits) but on the positional identity of the molecules in the stack. That is, the linkage of two neighboring and immobilized sugar molecules on a stack is entropically favored over the linkage of two sugar molecules that are free in solution, or the linkage of a free sugar with one on a stack.

The stack structure also favors linkages between adjacent proto-RNA “nucleoside” tetrads over linkages within a tetrad. The desired process of backbone formation involves the linking of vertically adjacent sugar molecules on the stack. These sugar molecules are attached by parallel N-glycosidic bonds to base nitrogen atoms that are 6.8 Å apart, and acetal formation automatically selects any OH groups that are appropriately placed to form a strainless acetal structure. In contrast, the nitrogen–nitrogen distance for N-glycosidic bonds of two horizontally adjacent sugars of the same tetrad is 9.0 Å and the N–C glycosidic bonds diverge outward at an angle of 90°. This should be sufficient to prevent unwanted acetal bonds within the same tetrad. The formation of linkages within the same tetrad is undesirable as it could result in the cross linking of two proto-RNA polymers.

The consequences of cross linking two proto-RNA chains in a stack would, however, not necessarily be fatal. If such cross links are rare, then perhaps only two of the four proto-RNA molecules in the stack will be cross linked. Such a stack might still dissociate. The two cross linked proto-RNA molecules can re-associate in the next replication cycle with midwife molecules and bases to give a stack. When the water activity has fallen sufficiently, the cross link could be destroyed since acetal formation is reversible under these conditions. In any case, the two newly synthesized proto-RNA molecules will have normal structures and do not inherit the cross link. If several cross links are formed between the four proto-RNA molecules in a stack, no dissociation may take place, but the cross linking errors may be removed in the next (or some future) replication cycle, again because of the reversibility of acetal formation.

Acetal formation under reversible conditions would also allow the base-attached (N-glycosidic) sugars to exchange, at least to some extent, with free sugars in solution. During these processes, it is important that the stack not cleave (or at least do so only rarely). The danger is that the original single strand of proto-RNA could have its backbone broken before the backbones (at the same horizontal stack position) of any of the three new proto-RNA chains are formed. The lifetime of such a backbone-deficient stack may be sufficient to allow reformation of covalent links before the stack cleaves. The direct replication of double-stranded proto-RNA, which is possible in our model and is discussed in Section 6.6, is less susceptible to this problem as it can accommodate a small number of gaps in the backbone.

Acetal formation could take place intra-molecularly on a sugar, and this uses two OH groups that are then no longer available for the desired inter-molecular acetal formation. However, intra-molecular acetals will often have some strain, decreasing their concentrations at equilibrium. In any case, only some of the sugars will partake in these unwanted reactions. During acetal condensation more O-glycoside attachments to sugars may occur than those which link together adjacent sugars of a stack. However, as long as complete backbones are formed, extra O-glycoside attachments to unused sugar OH groups of proto-RNA molecules would do no harm. If the sugar molecule attached to a given base does not have OH groups disposed to form acetal bonds with adjacent sugar molecules vertically above and below it on the stack, the reversibility of N-glycosidation allows this sugar molecule to be replaced by another one. Thermodynamics under low water activity will favor the greatest number of low-energy acetal groups, as their formation releases water molecules which are then lost by evaporation, thus driving backbone formation.

6.5. FOURTH STAGE OF REPLICATION: SEPARATION OF STRANDS

At this stage the four molecules of proto-RNA are bound into a quadruplex with intercalated midwife molecules and need to be separated. This can occur when the stack is suddenly immersed in (cool) water from percolating dew, formed at night. At low concentrations, entropy drives the equilibrium towards dissociation, because the midwife molecules are not covalently bonded to the stack. Initially, the stack will dissociate into midwife molecules and twisted proto-RNA duplexes, which can further dissociate into single strands. These monomeric strands can base pair internally in some regions and acquire conformations like those found in RNA (e.g. hairpins and loops).
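The concentration dependence of this separation step can be illustrated with a simple two-state equilibrium. The sketch below treats only the final duplex-to-single-strand step, D ⇌ S1 + S2, with both complementary strands at the same total concentration. The dissociation constant `k_d` is a hypothetical parameter, and the midwife molecules and the quadruplex stage are deliberately left out, so this is a toy model of the argument rather than a description of the full stack.

```python
import math


def fraction_single_stranded(total_conc: float, k_d: float) -> float:
    """Fraction of each strand that is unpaired at equilibrium.

    Assumes a two-state equilibrium D <=> S1 + S2 with both strands at
    the same total concentration `total_conc` and a hypothetical
    dissociation constant `k_d` (same units as the concentration).
    """
    # Mass balance: total_conc = s + d, with k_d = s * s / d.
    # Substituting d = s**2 / k_d gives s**2 + k_d * s - k_d * total_conc = 0.
    s = (-k_d + math.sqrt(k_d**2 + 4.0 * k_d * total_conc)) / 2.0
    return s / total_conc


# Dilution (smaller total_conc) shifts the equilibrium toward free strands.
for conc in (100.0, 1.0, 0.01):
    print(conc, round(fraction_single_stranded(conc, k_d=1.0), 3))
```

In this model the unpaired fraction approaches one as the total concentration falls well below `k_d`, which is the behavior the dilution by dew is proposed to exploit.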

This completes the replication of a proto-RNA. The final product (apart from an unchanged proto-RNA molecule) is one identical and two Watson–Crick complementary proto-RNA molecules, if the whole replication process takes place without base errors and all the backbones are completely formed.

6.6. REPLICATION OF Proto-RNA DUPLEXES WITHOUT PRIOR DISSOCIATION

Midwife molecules could intercalate duplex proto-RNA and free bases to form a tetrad-midwife stack containing initially two fully formed proto-RNA molecules (Fig. 6). Completion of two new backbones followed by dissociation as described previously would give a new duplex proto-RNA containing the same base sequence information as the original duplex proto-RNA. This may be a valuable replication mechanism when the chain length of the proto-RNA becomes so long that dissociation to give single-stranded proto-RNA does not take place at feasibly low concentrations.

Another consequence of chain lengthening is that as the length of a proto-RNA increases, having every new base (i.e. bases that were not part of the original proto-RNA molecule) in the stack covalently linked to its adjacent bases by a backbone requires increasing efficiency in the formation of both N-glycosides and acetals, as well as the avoidance of unwanted covalent links. This is probably the most difficult chemistry that our model has to satisfy. Here double-stranded replication would be valuable, since a small number of gaps in the backbones of each proto-RNA in a duplex can exist without cleavage of the strands, making it more robust than single-strand replication.

6.7. BASE N-GLYCOSIDES AND RELATED ACETAL-LINKED OLIGOMERS AS Proto-RNA BUILDING BLOCKS

During the course of proto-RNA replication, when the backbones of progeny strands are only partially formed, proto-nucleosides (base N-glycosides), proto-dinucleotides (two acetal-linked base N-glycosides), and higher oligomers are present in the stacks. (Note: the proto-RNA analog of a nucleotide is a labile short-lived acyclic hemiacetal and hence has no significance here.) These assorted molecules contain bases that were incorporated into a stack before their glycosylation, as described in Section 6.2. We have already suggested that non-glycosylated bases in a stack could exchange with free bases in solution. Likewise, proto-nucleosides and very short oligomers could leave a stack and become free in solution, to be replaced in the stack by free bases from solution. Of course, under the conditions of (reversible) backbone formation water activity would be very low, but some water molecules are still expected to be present to allow this exchange to take place. The release of partially formed proto-RNA progeny, and their replacement by free bases, would define another function for the molecular midwife-tetrad stack: that of a synthetase for the production of proto-nucleosides and proto-oligonucleotides. A second source of these molecules would be the hydrolytic destruction of proto-RNA.

If proto-nucleosides and related very short oligomers are present in the environment, the simple scheme of Sections 6.2–6.4 needs to be modified because these molecules would be readily incorporated into replication stacks and ultimately become part of proto-RNA polymers. Indeed, this would greatly facilitate proto-RNA replication by reducing the number of steps needed for backbone synthesis. Furthermore, in the case of proto-dinucleotides and higher oligomers, the pre-organization of the bases in these molecules would favor their incorporation into a stack (with the appropriate base sequence) at an earlier stage with respect to the incorporation of free bases. This would also provide a mechanism to reduce stack disruption during single-stranded proto-RNA replication, which might result from the cleavage of the parental polymer backbone when conditions are such that acetal formation is (reversibly) taking place. In other words, even before the water activity is reduced to the level required for acetal formation, more than one base of a tetrad in a stack could already be linked to a base of either adjacent tetrad. One link would originate from the fully formed proto-RNA which seeded the stack and one or more could come from short oligomers incorporated on any of the remaining three stack edges. This would provide protection similar to that enjoyed by direct double-stranded replication of proto-RNA (Section 6.6). The latter kind of replication would likewise become even more reliable with the usage of short proto-oligonucleotides.

7. Generation of an Initial Proto-RNA Polymer and Chain Length Extension

Because replication rapidly increases the number of proto-RNA molecules, the synthesis of initial proto-RNA with random base sequences does not have to be as favorable or as well defined a reaction as replication, and might only give a very low equilibrium concentration of such molecules. Clearly, very short proto-RNA has the best chance to be formed de novo. This could occur from a very low equilibrium concentration of short stacks composed simply of midwife molecules and free bases arranged in hydrogen-bonded tetrads (i.e. without a proto-RNA template). After the formation of backbones to create short proto-RNA polymers, by a process of backbone formation essentially the same as that described above for replication, longer chains can result if there is chain extension (addition of a tetrad of unattached bases to a stack containing proto-RNA), or if two stacks can join end to end. Our replicating mechanism obviously allows such “errors” or modifications of existing proto-RNA polymers to transpire. Again, these chain-lengthening reactions can be relatively rare, as long as replication itself is an efficient process.

An important aspect of initial chain formation and chain extension discussed here is that whenever a wholly new tetrad is generated de novo (and not because of replication), the type of tetrad added (i.e. (C·G)₂-like vs. (A·U)₂-like) and its base arrangement with respect to the adjacent and already present tetrad (i.e. purine over purine vs. purine over pyrimidine) is close to random, assuming an equal supply of all four types of bases. This results from the isolation of tetrads from one another by intervening midwife molecules and it ensues that proto-RNA can be formed without exclusion of any particular sequence. This might not be expected if the base tetrads are stacked directly upon each other, for the free energy for the stacking of base pairs has been shown to be very sensitive to the sequence of the stacked bases (Burkard et al., 1999). The randomness of the sequence in the origination of a stack is to be contrasted with the high faithfulness of the replication process itself in our model.

The replication of proto-RNA necessarily consumes component molecules until the concentration of one component becomes so low that net polymer production stops. Since some of the proto-RNA molecules will be destroyed by hydrolysis reactions, particularly when in aqueous solutions, new proto-RNA molecules will be continually formed to balance the destroyed proto-RNA so as to produce a steady state. Proto-RNA molecules that fold into stable structures would be more chemically stable than unfolded proto-RNA. In the latter case, acetal cleavage will result in the separation of a proto-RNA into two parts due to diffusion. A single acetal cleavage in the backbone of a folded proto-RNA, on the other hand, might be repairable since non-covalent bonds could hold the two halves of the polymer together until the acetal backbone is reformed. This being the case, the chain length of proto-RNA will tend to increase and reach a steady state. This length will be limited, of course, because longer chains have more acetal and N-glycoside links and inherently more sites of attack for chain cleavage than do shorter chains.
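The length limit described above can be illustrated with a deliberately simple model, which is an assumption introduced here and not part of the hypothesis itself: every cleavable backbone link is taken to break independently, with the same probability, during one period of high water activity. The names `n_links` and `p_cleave` are hypothetical.

```python
def intact_probability(n_links: int, p_cleave: float) -> float:
    """Probability that a chain with `n_links` cleavable links survives
    one wet period, assuming each link is cleaved independently with
    probability `p_cleave` (an illustrative assumption)."""
    if n_links < 0 or not 0.0 <= p_cleave <= 1.0:
        raise ValueError("n_links must be >= 0 and p_cleave in [0, 1]")
    return (1.0 - p_cleave) ** n_links


# Longer chains expose more links and therefore survive less often.
for n in (10, 50, 200):
    print(n, intact_probability(n, p_cleave=0.01))
```

The sketch ignores the repair of folded molecules discussed above; including it would lower the effective `p_cleave` for sequences that fold, which is the direction of the selective advantage the text proposes.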

The possibility that certain proto-RNA base sequences could be protected from cleavage by assuming tightly packed folded structures is an intriguing point, because such molecules will also be more likely to produce “proto-ribozymes”, the ancient analog of ribozymes regularly produced today by in vitro selection (Lorsch and Szostak, 1996). Those that do not form stable folded structures will not form proto-ribozymes and are therefore dispensable as far as enzymatic activity is concerned. Some of these proto-ribozymes might also promote their own survival and propagation. For example, a base sequence might lead to a folded proto-RNA with the catalytic activity to cleave proto-RNA molecules without well-defined folded structures during the periods of high water activity. This would greatly accelerate the replication of proto-RNA sequences with folded structures (and perhaps other catalytic activities) by making available resources for proto-RNA replication. The Darwinian selection of proto-ribozymes with any sequences that facilitate their own replication, from an initial pool of random sequences, could represent the emergence of life's first genes.

8. Discussion

8.1. EFFECTS OF THE BACKBONE HETEROGENEITY

Our process for proto-RNA synthesis allows for a variety of base linkages and produces a heterogeneous backbone, which might be thought to have only deleterious consequences, but this is probably not so. Consider the problem of full intercalation of bases in a polymer, such as RNA or proto-RNA, to give an alternating base–intercalator motif. Intercalation of DNA duplexes (and presumably RNA duplexes) is restricted by the “nearest-neighbor exclusion principle” (Saenger, 1984) and can only proceed as far as an alternating base–base–intercalator motif. The backbone of DNA is homogeneous and its chemical linkages are fixed. Only its conformation can change, but this becomes difficult when the backbone is close to being fully extended rather than coiled. If the chemical structure of the backbone does not allow full intercalation to take place in a strainless way, only partial intercalation may take place. This might seem to prohibit replication by our proposed mechanism. However, proto-RNA has a chemically adjustable backbone. It can potentially use any sugar and make use of any hydroxyl groups of these sugars to build a strainless backbone even with full intercalation. Thus, a heterogeneous backbone that is perhaps limited in some ways with respect to a homogeneous one can have an advantage in versatility.

The chemical and stereochemical heterogeneity of the backbone is expected to reduce, more or less severely, the enzymatic activity of proto-RNA as compared to that of a polymer with a homochiral and homogeneous backbone. The level of this reduction would necessarily depend on the degree of heterogeneity in the backbone, and upon how many proto-nucleotides are required for the generation of a proto-RNA active site. The minimum polymer length necessary to achieve any form of catalysis is unknown, even for RNA. However, there is reason to believe that this length may not be prohibitively large. For example, the trinucleotide UUU has been shown to promote the specific cleavage of GAAACp in the presence of metal ions (Kazakov & Altman, 1992; Vogtherr & Limmer, 1998). Thus, if relatively small proto-RNA sequence elements were also capable of catalysing a reaction, a significant fraction of proto-RNA molecules containing the same base sequence might show enzymatic activity of the same type. Furthermore, when proto-RNA is formed as part of a stack, adjacent sugar units within a particular stack may show some structural and stereochemical correlation, despite the backbone's overall chemical heterogeneity, and thereby reduce the number of possibilities from that of a purely random arrangement. In any case, the chemical homogeneity of the proto-RNA would be expected to increase through chemical evolution as soon as any catalytic proto-RNA (i.e. proto-ribozyme) creates a selective pressure on the nature of the backbone. As a result, all potential catalysis by this evolved proto-RNA will become increasingly effective.
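The dependence on backbone heterogeneity and on active-site size can be written as a one-line estimate. As an assumption made purely for illustration, suppose each proto-nucleotide position independently carries a catalytically compatible backbone unit with probability `q_compatible`, and an active site needs `site_length` such positions; both parameter names are hypothetical.

```python
def active_fraction(q_compatible: float, site_length: int) -> float:
    """Fraction of informationally identical molecules whose active site
    has a catalytically compatible backbone at every position.

    `q_compatible` is the assumed per-position probability of a
    compatible backbone unit; positions are treated as independent,
    which ignores the stack-induced correlations mentioned in the text.
    """
    if not 0.0 <= q_compatible <= 1.0 or site_length < 0:
        raise ValueError("q_compatible in [0, 1] and site_length >= 0")
    return q_compatible ** site_length


# A short site tolerates heterogeneity far better than a long one.
print(active_fraction(0.5, 3))
print(active_fraction(0.5, 30))
```

Because the estimate falls off exponentially with site length, small catalytic elements of the UUU kind are the case in which a heterogeneous backbone is least damaging. Any correlation between neighboring sugar units would raise the fraction above this independent-position value.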

This potential heterogeneity in early proto-ribozymes is not at the base sequence level, as in Eigen's quasi-species (Eigen, 1971). Of course, a replicating proto-ribozyme might reach a steady state under certain conditions and could constitute a quasi-species in the Eigen sense, but it would also have an additional non-informational heterogeneity due to backbone variations.

8.2. EVOLUTION OF Proto-RNA

In addition to growth in chain length, proto-RNA might evolve through the following incremental changes, whose order is not necessarily as given here: (1) Formation of cyclic stacks. (2) Evolution of proto-RNA sequences having enzymatic activity, which might lead to the start of metabolism and the selection of specific molecules for the bases and the backbone, including chiral symmetry breaking (see Section 8.3). (3) The replacement of the uncharged acetal-forming aldehyde by one carrying a negative charge, e.g. by using glyoxylate, CH(OH)₂–CO₂⁻. (4) Dispensing with the molecular midwife for the replication process, because too many proto-RNA molecules are being produced compared to the number of existing midwife molecules. (5) Eventual evolution to give the RNA World and then the most primitive fossil and currently known living organisms (Lahav, 1999; Schopf, 1999; Szathmáry and Maynard Smith, 1997).

8.3. SYMMETRY BREAKING IN Proto-RNA

Discussions on the origin of life and the origin of biological homochirality are generally intertwined. In our hypothesis, replication of base sequence information and initial evolution are achieved without adopting a fixed homochiral proto-RNA backbone, so that chiral symmetry is not broken. However, our system is even more symmetrical than that, but to appreciate this we need to apply symmetry in a way not normally done in origin of life discussions.

The term “symmetry” is used in a broader context by physicists than by chemists. For example, physicists say that different particles, which may have different properties (e.g. rest mass, spin, and charge), exhibit symmetry at very high temperatures and energies because the particles then behave in the same way and cannot be differentiated. At lower energies, they can be differentiated and this symmetry is broken. This usage can be applied to proto-RNA and its component molecules as well as to the molecular midwife. If two molecules behave in the same way and are not differentiated in a given system, we may say that they exhibit a symmetry (valid for that system); otherwise, they exhibit a broken symmetry.

For example, uracil, 5-hydroxymethyluracil, and N-methylcyanuric acid all belong to the U-like group of bases and are not differentiated in proto-RNA, thus exhibiting a symmetry. On the other hand, uracil and cytosine are differentiated as they belong to the U- and C-like groups of bases, respectively, and thus exhibit broken symmetry. The molecules that can act as bases in proto-RNA can be characterized as having partially (but not totally) broken symmetry. The aldehydic products of the formose reaction, as components of proto-RNA, exhibit partial symmetry breaking because these molecules fall into one of two groups. Some of these molecules are chiral, but there is no differentiation on that basis and there is thus chiral symmetry. While any single molecule of proto-RNA would be chiral, a collection of a very large number of such molecules, even with the same base sequence information, would have no optical rotation, and any helical structures present would have equal amounts of left- and right-handed forms. Again, there is chiral symmetry.
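This usage of “symmetry” amounts to an equivalence relation: two molecules exhibit a symmetry when the system places them in the same group. A minimal sketch follows, using only the bases named in the text; the dictionary `BASE_GROUP` and the function `exhibits_symmetry` are illustrative names, and the list is not meant to be a complete inventory of plausible proto-RNA bases.

```python
# Base groups named in the text; the dictionary itself is an illustrative
# data structure, not a complete list of plausible proto-RNA bases.
BASE_GROUP = {
    "uracil": "U-like",
    "5-hydroxymethyluracil": "U-like",
    "N-methylcyanuric acid": "U-like",
    "cytosine": "C-like",
    "adenine": "A-like",
}


def exhibits_symmetry(base_a: str, base_b: str) -> bool:
    """True if the two bases are not differentiated in proto-RNA,
    i.e. they fall into the same base group."""
    return BASE_GROUP[base_a] == BASE_GROUP[base_b]


print(exhibits_symmetry("uracil", "5-hydroxymethyluracil"))
print(exhibits_symmetry("uracil", "cytosine"))
```

In these terms, symmetry breaking corresponds to splitting a group into smaller groups, and it is complete when every group contains a single molecule.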

Our proto-RNA has no fixed directionality in its backbone and therefore has symmetry with respect to this property. By contrast, RNA has directionality in its backbone and the symmetry is broken. In the latter case, the symmetry breaking involves the relative orientation of adjacent ribose molecules in the polymer chain.

Symmetry breaking then corresponds to a higher selectivity in the choice (and possibly the chemical linkage) of molecules. It becomes complete when only a unique molecule will suffice for any one polymer component, which must be linked in a unique way, as occurs in RNA in current life. At the origin of life, selectivity must have been difficult to achieve since enzyme-like molecules were absent. The most likely first replicating molecules would have had the least symmetry breaking in their component molecules, provided, of course, that information transfer during replication took place with an appropriately low error rate. Darwinian selection would then have increased efficiency by increasing symmetry breaking so as to give a replicating polymer like the present-day RNA. The choice of chiral sense would have been accidental.

8.4. HAVE ALL DIFFICULTIES BEEN SURMOUNTED IN OUR HYPOTHESIS?

Difficulty (1) (Section 2), which pertains to the synthesis of the four RNA bases, has been much reduced by not requiring a source of four unique bases, but by requiring only four different groups of bases, each of which can be a mixture of several compounds. Difficulties (2)–(8) essentially do not exist in our hypothesis.

Our model does require successful backbone formation in the replication process, but this chemistry is far less demanding than that required to replicate present-day RNA via polymerization of nucleotides under pre-biotic conditions. Of course, our model also requires midwife molecules, but we do at least provide possible candidate structures and a plausible pre-biotic route to the synthesis of a molecular midwife (Appendix B).

9. Conclusions

We conclude that it was feasible and indeed likely that the origin of life occurred as a result of the spontaneous generation of a self-replicating and information-containing polymer (proto-RNA), which gradually evolved into RNA. We argue that all this took place in a pre-biotic environment that included three important classes of organic compounds. The first class consisted of chemically heterogeneous monocyclic and bicyclic bases, derived chiefly from hydrogen cyanide, cyanogen and cyanoacetylene. The second class consisted of racemic products, such as various pentose and hexose sugars, as well as a few achiral molecules such as glycolaldehyde, all originating from the formose reaction of formaldehyde. The third class consisted of planar achiral molecules, which we call the molecular midwife, and which were of a size and shape resembling the tetrad of bases obtained by hydrogen bonding together two Watson–Crick purine–pyrimidine base pairs. We suggest that the midwife may have been a derivative of phthalocyanine and that it might have been formed from hydrogen cyanide and formaldehyde under pre-biotic conditions. In any case, the molecular midwife was ideally suited for face-to-face association with a planar tetrad of bases. In the presence of proto-RNA and free bases, it caused the self-assembly of stacks containing alternating base tetrads and midwife molecules. These stacks should possess, we claim, enzyme-like properties. We postulate that such a molecular midwife was essential to the origin of life and this represents a unique feature of our hypothesis.

Our model has a second critically important feature, which is also unique. It involves the principle of least symmetry breaking (or minimum required selectivity), which means that discrimination between molecules can be avoided as long as the transfer of base sequence information is accomplished with a low error rate during replication. Discrimination between bases is limited to two types each of mono- and bicyclic heterocycles, depending on their hydrogen bonding motifs and different sizes, but not otherwise. There is discrimination between two types of aldehydic products resulting from the formose reaction, according to whether strainless intra-molecular hemiacetal formation is possible or not. Our model does not discriminate between enantiomers and chiral symmetry is not broken.

The proto-RNA of our hypothesis differs from the present-day RNA in several important respects. Perhaps the most revolutionary difference is that proto-RNA does not have a chemically homogeneous or homochiral backbone. A comparison of a set of proto-RNA molecules, all having precisely the same base sequence information (i.e. an informationally homogeneous polymer), would reveal that corresponding base, sugar, and acetal-forming aldehyde moieties were generally different at corresponding positions along the polymer chain of different molecules. This heterogeneous molecular nature of proto-RNA, even when informationally homogeneous, will present challenging new problems in analytical chemistry.

All the organic chemical components needed for our model to be active are defined. Thus, although our approach to the origin of life is hypothetical and speculative, the model of a self-replicating polymer that is presented is immediately open to both experimental and computational tests.

The authors thank Dr Ragini Anet and Mrs Mona Hud for helpful discussions and critical reviews of the manuscript. N.V.H. is grateful to Dr Charles Stevens for early discussion and stimulating his interest in the origin of life.

REFERENCES

AVETISOV, V. & GOLDANSKII, V. (1996). Mirror symmetry breaking at the molecular level. Proc. Nat. Acad. Sci. U.S.A. 93, 11435–11442.

BENNER, S., BURGSTALLER, P., BATTERSBY, T. & JURCZYK, S. (1999). Did the RNA world exploit an expanded genetic alphabet? In: The RNA World, Second Edition: The Nature of Modern RNA Suggests a Prebiotic RNA World (Gesteland, R. F. & Atkins, J. F., eds), pp. 163–181. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

BERNAL, J. D. (1951). The Physical Basis of Life. London: Routledge & Kegan Paul.

BOHLER, C., NIELSEN, P. E. & ORGEL, L. E. (1995). Template switching between PNA and RNA oligonucleotides. Nature 376, 578–581.

BOLLI, M., MICURA, R. & ESCHENMOSER, A. (1997). Pyranosyl-RNA: chiroselective self-assembly of base sequences by ligative oligomerization of tetranucleotide-2′,3′-cyclophosphates (with a commentary concerning the origin of biomolecular homochirality). Chem. Biol. 4, 309–320.

BONNER, W. A. (1991). The origin and amplification of biomolecular chirality. Origins Life Evol. Biosphere 21, 59–111.

BURKARD, M. E., TURNER, D. H. & TINOCO, I. (1999). The interactions that shape RNA structure. In: The RNA World, Second Edition: The Nature of Modern RNA Suggests a Prebiotic RNA World (Gesteland, R. F. & Atkins, J. F., eds), pp. 233–264. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

CAIRNS-SMITH, A. G. (1982). Genetic Takeover and the Mineral Origins of Life. Cambridge: Cambridge University Press.

CRICK, F. (1981). Life Itself: Its Origin and Nature. New York: Simon and Schuster.

CRICK, F. (1999). Foreword to the first edition. In: The RNA World, Second Edition: The Nature of Modern RNA Suggests a Prebiotic RNA World (Gesteland, R. F. & Atkins, J. F., eds), pp. xiii–xvi. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

DE DUVE, C. (1991). Blueprint for a Cell: The Nature and Origin of Life. New York: Patterson.

DEAMER, D. W. & FLEISCHAKER, G. R., eds (1994). Origins of Life: The Central Concepts. Boston: Jones and Bartlett Publishers.

DECKER, P., SCHWEER, P. & POHLMANN, R. (1982). Identification of formose sugars, presumable prebiotic metabolites, using capillary gas chromatography/gas chromatography-mass spectroscopy of n-butoxime trifluoroacetates on OV-225. J. Chromatogr. 244, 281–291.

EIGEN, M. (1971). Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften 58, 465–523.

FERRIS, J. P., SANCHEZ, R. A. & ORGEL, L. E. (1968). Synthesis of pyrimidines from cyanoacetylene and cyanate. J. Mol. Biol. 33, 693–704.

FUHRHOP, J.-H., DEMOULIN, C., BOETTCHER, C., KONING, J. & SIGGEL, U. (1992). Chiral micellar porphyrin fibers with 2-aminoglycosamide head groups. J. Am. Chem. Soc. 114, 4159–4165.

FULLER, W. D., SANCHEZ, R. A. & ORGEL, L. E. (1972). Studies in prebiotic synthesis: VII. Solid-state synthesis of purine nucleosides. J. Mol. Evol. 1, 249–257.

GAO, X. (1997). Conformation of formacetal and 3′-thioformacetal nucleoside linkers and stability of their antisense RNA·DNA hybrid duplexes. Biochemistry 36, 399–411.

GESTELAND, R. & ATKINS, J. F., eds (1999). The RNA World, Second Edition: The Nature of Modern RNA Suggests a Prebiotic RNA World. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

GOLDANSKII, V. I., AVETISOV, V. A. & KUZMIN, V. V. (1986). Chiral purity of nucleosides as a necessary condition of complementarity. FEBS Lett. 207, 181–183.


GROBKE, K., HUNZIKER, J., FRASE, W., PENG, L., DIEDERICHSEN, U., ZIMMERMAN, K., HOLZNER, A., LEUMANN, C. & ESCHENMOSER, A. (1998). Warum Pentose- und nicht Hexose-Nucleinsäuren? (Purin-Purin)-Basenpaarung in der homo-DNS-Reihe: Guanin, Isoguanin, 2,6-Diaminopurin und Xanthin. Helv. Chim. Acta 81, 375–474.

GUERRIER-TAKADA, C., GARDINER, K., MARSH, T., PACE, N. & ALTMAN, S. (1983). The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell 35, 849–857.

GUILLON, D. (1999). Columnar order in thermotropic mesophases. In: Liquid Crystals II (Mingos, D. M. P., ed.), pp. 41–82. New York: Springer-Verlag.

HAGER, A. J., POLLARD, J. D. & SZOSTAK, J. W. (1996). Ribozymes: aiming at RNA replication and protein synthesis. Chem. Biol. 3, 717–725.

HARTMAN, H. (1998). Photosynthesis and the origin of life. Origins Life Evol. Biosphere 28, 515–521.

HOROWITZ, A. (1992). Palynology of Arid Lands. Amsterdam: Elsevier.

JONES, R. J., LIN, K.-Y., MILLIGAN, J. F., WADWANI, S. & MATTEUCCI, M. D. (1993). Synthesis and binding properties of pyrimidine oligonucleotide analogs containing neutral phosphodiester replacements: the formacetal and 3′-thioformacetal internucleoside linkages. J. Org. Chem. 58, 2983–2991.

JOYCE, G. F. (1987). Nonenzymatic template-directed synthesis of information macromolecules. Cold Spring Harbor Symp. Quant. Biol. 52, 41–51.

JOYCE, G. F. & ORGEL, L. E. (1999). Prospects for understanding the origin of the RNA world. In: The RNA World, Second Edition: The Nature of Modern RNA Suggests a Prebiotic RNA World (Gesteland, R. F. & Atkins, J. F., eds), pp. 49–77. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.

JOYCE, G., VISSER, G., VAN BOECKEL, C., VAN BOOM, J., ORGEL, L. & VAN WESTRENEN, J. (1984). Chiral selection in poly(C)-directed synthesis of oligo(G). Nature 310, 602–604.

KAZAKOV, S. & ALTMAN, S. (1992). A trinucleotide can promote metal ion-dependent specific cleavage of RNA. Proc. Nat. Acad. Sci. U.S.A. 89, 7939–7943.

KETTANI, A., KUMAR, R. A. & PATEL, D. J. (1995). Solution structure of a DNA quadruplex containing the fragile X syndrome triplet repeat. J. Mol. Biol. 254, 638–656.

KRUGER, K., GRABOWSKI, P. J., ZAUG, A. J., SANDS, J., GOTTSCHLING, D. E. & CECH, T. R. (1982). Self-splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell 31, 147–157.

KUDREVICH, S. & VAN LIER, J. (1996). Azaanalogs of phthalocyanines: syntheses and properties. Coord. Chem. Rev. 156, 163–182.

LAHAV, N. (1999). Biogenesis. Theories of Life's Origin. Oxford: Oxford University Press.

LINSTEAD, R., NOBLE, E. & WRIGHT, J. (1937). Phthalocyanines. Part IX. Derivatives of thiophen, thionaphthen, pyridine, and pyrazine, and a note on the nomenclature. J. Chem. Soc. 1937, 911–921.

LORSCH, J. & SZOSTAK, J. (1996). Chance and necessity in the selection of nucleic acid catalysts. Acc. Chem. Res. 29, 103–110.

LUTHER, A., BRANDSCH, R. & VON KIEDROWSKI, G. (1998). Surface-promoted replication and exponential amplification of DNA analogues. Nature 396, 245–248.

MCGAVIN, S. (1971). Models of specifically paired like (homologous) nucleic acid structures. J. Mol. Biol. 55, 293–298.

MILLER, S. L. (1987). Which organic compounds could have occurred on the prebiotic Earth? Cold Spring Harbor Symp. Quant. Biol. 52, 17–27.

MIZUNO, T. & WEISS, A. H. (1974). Synthesis and utilization of formose sugars. Adv. Carbohydr. Chem. Biochem. 29, 173–227.

MÜLLER, D., PITSCH, S., KITTAKA, A., WAGNER, E., WINTNER, C. E. & ESCHENMOSER, A. (1990). Chemie von α-Aminonitrilen. Aldomerisierung von Glycolaldehyd-phosphat zu racemischen Hexose-2,4,6-triphosphaten und (in Gegenwart von Formaldehyd) racemischen Pentose-2,4-diphosphaten: rac-Allose-2,4,6-triphosphat und rac-Ribose-2,4-diphosphat sind die Reaktionshauptprodukte. Helv. Chim. Acta 73, 1410–1468.

O'BRIEN, E. J. (1966). Structure of a purine-pyrimidine crystalline complex: 9-ethylguanine with 1-methyl-5-fluorocytosine. J. Mol. Biol. 22, 377–379.

O'BRIEN, E. J. (1967). Crystal structures of two complexes containing guanine and cytosine derivatives. Acta Crystallogr. 23, 92–106.

ORGEL, L. E. (1998). The origin of life – a review of facts and speculations. TIBS 23, 491–495.

ORÓ, J. & KIMBALL, A. (1961). Synthesis of purines under possible primitive Earth conditions. I. Adenine from hydrogen cyanide. Arch. Biochem. Biophys. 94, 217–227.

SAENGER, W. (1984). In: Principles of Nucleic Acid Structure (Cantor, C., ed.), Springer Advanced Texts in Chemistry. New York: Springer-Verlag.

SANCHEZ, R., FERRIS, J. & ORGEL, L. (1966). Conditions for purine synthesis: did prebiotic synthesis occur at low temperatures? Science 153, 72–73.

SANCHEZ, R. A., FERRIS, J. P. & ORGEL, L. E. (1967). Studies in prebiotic synthesis II. Synthesis of purine precursors and amino acids from aqueous hydrogen cyanide. J. Mol. Biol. 30, 223–253.

SCHALL, O. F. & GOKEL, G. W. (1994). Molecular boxes derived from crown ethers and nucleotide bases: probes for Hoogsteen vs Watson–Crick H-bonding and other base–base interactions in self-assembly processes. J. Am. Chem. Soc. 116, 6089–6100.

SCHOPF, J. W. (1999). Cradle of Life. The Discovery of Earth's Earliest Fossils. Princeton: Princeton University Press.

SCOTT, J. (1972). Histochemistry of Alcian Blue. Histochemie 30, 215–234.

SEGRE, D. & LANCET, D. (1997). Mutually catalytic amphiphiles: simulated chemical evolution and implications to exobiology. In: Exobiology: Matter, Energy, and Information in the Origin and Evolution of Life in the Universe, Proc. the 5th Trieste Conf. on Chemical Evolution (Chela-Flores, J. & Raulin, F., eds), pp. 123–131. Trieste: Kluwer Academic Publishers.

SHAPIRO, R. (1984). The improbability of prebiotic nucleic acid synthesis. Origins Life 14, 565–570.

SIMUNDZA, G., SAKORE, T. D. & SOBELL, H. M. (1970). Base-pairing configurations between purines and pyrimidines in the solid state. V. Crystal and molecular structure of two 1:1 hydrogen-bonded complexes, 1-methyluracil: 9-ethyl-8-bromo-2,6-diaminopurine and 1-ethylthymine: 9-ethyl-8-bromo-2,6-diaminopurine. J. Mol. Biol. 48, 263–278.


SZATHMÁRY, E. & MAYNARD SMITH, J. (1997). From replicators to reproducers: the first major transitions leading to life. J. theor. Biol. 187, 555–571.

TS'O, P. O. P. (1974). Bases, nucleosides, and nucleotides. In: Basic Principles in Nucleic Acid Chemistry (Ts'o, P. O. P., ed.), Vol. 1, pp. 453–584. New York: Academic Press.

VOGTHERR, M. & LIMMER, S. (1998). NMR study on the impact of metal ion binding and deoxynucleotide substitution upon local structure and stability of a small ribozyme. FEBS Lett. 433, 301–306.

WÄCHTERSHÄUSER, G. (1988). An all-purine precursor of nucleic acids. Proc. Nat. Acad. Sci. U.S.A. 84, 1134–1135.

WÄCHTERSHÄUSER, G. (1992). Groundworks for an evolutionary biochemistry: the iron–sulphur world. Prog. Biophys. Mol. Biol. 58, 85–201.

WAGNER, E., XIANG, Y.-B., BAUTMANN, K., GÜCK, J. & ESCHENMOSER, A. (1990). Chemie von α-Aminonitrilen. Aziridin-2-carbonitril, ein Vorläufer von rac-O3-Phosphoserinnitril und Glycolaldehyd-phosphat. Helv. Chim. Acta 73, 1391–1409.

WIBERG, K. B., MORGAN, K. M. & MALTZ, H. (1994). Thermochemistry of carbonyl reactions. 6. A study of hydration equilibria. J. Am. Chem. Soc. 116, 11067–11077.

WILLIAMS, N. G., WILLIAMS, L. D. & SHAW, B. R. (1989). Dimers, trimers, and tetramers of cytosine with guanine. J. Am. Chem. Soc. 111, 7205–7209.

WULFF, G. (1993). The role of binding-site interactions in the molecular imprinting of polymers. Trends Biotechnol. 11, 85–87.

WULFF, G. (1995). Molecular imprinting in cross-linked materials with the aid of molecular templates – A way towards artificial antibodies. Angew. Chem. Int. Ed. Engl. 34, 1812–1832.

APPENDIX A

It has been suggested that there were ancestor polymers to RNA and that these polymers later evolved into RNA (Joyce & Orgel, 1999). An immediate problem with this approach is deciding which changes may be made to the structure of RNA and when to stop making them. As Orgel has argued, one does not even know where to start! Actually, Orgel himself has considered chemical variants of RNA with the idea of overcoming the homochirality problem. This involves replacing the phosphate-linked chiral sugars in RNA with an achiral molecule that links adjacent bases together (Bohler et al., 1995). But this still leaves many other difficulties, including that of chemical selectivity in general, which pervades most of the thorny problems listed in Section 2.

The achievement of selectivity (and especially stereochemical selectivity) in chemical reactions has been the dominant goal of organic chemical synthesis, especially in recent years. Tremendous progress has been made in this area, to the extent that enzymatic selectivity, which once appeared to be almost magical in its power, can be mimicked by much simpler (but very carefully designed) non-biological molecular reagents and catalysts. However, to achieve such selectivity under pre-biotic conditions, where human-designed molecules were not available, is quite another matter. It is almost as difficult as accepting the presence of pre-biotic enzymes. The search for selectivity in pre-biotic chemistry has been intense and has pervaded research in this area. For example, Eschenmoser's search for conditions for the pre-biotic formation of DL-ribose has led to a slightly greater than 50% yield of this compound (as the 2,4-diphosphate), but requires starting with glycolaldehyde phosphate and formaldehyde, instead of just using the latter compound as in the formose reaction, and stopping the reaction at a precise time under strongly basic conditions (Müller et al., 1990; Wagner et al., 1990). The two phosphate groups in the product are not in the same positions on the ribose molecules as are those in RNA. In our view, these results do not provide a satisfactory source of chemically homogeneous DL-ribose (let alone of D-ribose). Nevertheless, this and later work by Eschenmoser's group (e.g. Bolli et al., 1997; Grobke et al., 1998) is of great elegance and importance in organic chemistry and significantly increases our understanding of the chemical behavior of nucleic acids and their analogs.

We have chosen another, almost diametrically opposite path. Instead of seeking selectivity for all the molecules needed in our model, we have, wherever possible, embraced diversity in the molecules that can be accommodated. This is our principle of least symmetry breaking. Of course, the model has to be able to replicate faithfully the information content of a polymer despite the chemical heterogeneity of the polymer.

APPENDIX B

B.1. Bases in Proto-RNA

The bases in proto-RNA, as mentioned in the text, are of four types. We classify all possible base structures for incorporation into proto-RNA as being A-, U-, G- or C-like, and these can hydrogen bond to one another in a Watson–Crick sense just as A, U, G, and C do. The A- and G-like bases are bicyclic heterocycles, typically purine-derived heterocycles, for example, adenine, guanine, hypoxanthine, and diaminopurine. The U- and C-like bases are monocyclic six-membered heterocycles, such as derivatives of pyrimidine or 1,3,5-triazine. For example, uracil, 5-substituted uracils and N-methyl cyanuric acid are U-like, whereas cytosine is C-like. Watson–Crick hydrogen bonding always involves a hydrogen bond from a ring NH group to a ring tertiary nitrogen (Fig. 2), such as the hydrogen bond between N-3 of uracil and N-1 of adenine. In the A-/U-like pair, this NH group resides on the U-like base, whereas in the G-/C-like base pair, the NH group is on the G-like base. The existence of two kinds of bases (purine- and pyrimidine-like) together with an unsymmetrical hydrogen-bonding motif for the two ring nitrogens explains why there are just four possible groups of bases.
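The counting argument above (two ring systems times two roles for the ring nitrogen gives four groups) can be sketched in a few lines of Python. This is an illustration only: the feature labels ("bicyclic", "monocyclic", "donor", "acceptor") and the function names are our own assumptions, not terms defined in the paper, and the pairing rule deliberately ignores any non-Watson–Crick pairing.

```python
# Sketch of the four base groups described in the text.
# Feature labels and function names are illustrative assumptions.

# Two ring systems x two roles for the ring nitrogen at the pairing position.
GROUPS = {
    ("bicyclic", "acceptor"): "A-like",    # ring tertiary N (e.g. N-1 of adenine)
    ("monocyclic", "donor"): "U-like",     # ring NH (e.g. N-3 of uracil)
    ("bicyclic", "donor"): "G-like",       # ring NH is on the purine-like base
    ("monocyclic", "acceptor"): "C-like",  # ring tertiary N on the pyrimidine-like base
}

# Reverse lookup: group name -> (ring system, ring-nitrogen role).
FEATURES = {group: features for features, group in GROUPS.items()}


def classify(ring_system: str, ring_n_role: str) -> str:
    """Return the base group for a ring system and ring-nitrogen role."""
    return GROUPS[(ring_system, ring_n_role)]


def can_pair(group_a: str, group_b: str) -> bool:
    """True if the two groups differ in both ring system and ring-nitrogen role."""
    ring_a, role_a = FEATURES[group_a]
    ring_b, role_b = FEATURES[group_b]
    return ring_a != ring_b and role_a != role_b


if __name__ == "__main__":
    for group in sorted(FEATURES):
        partners = [other for other in sorted(FEATURES) if can_pair(group, other)]
        print(group, "pairs with", partners)
```

Under these assumptions each group has exactly one partner, which is one way to see why a base cannot be both A-/U-like and G-/C-like at the same time.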

Although a two-base system (purine and pyrimidine) seems simpler and less demanding than a four-base one, the unsymmetrical nature of the hydrogen bonding means that certain molecules have to be arbitrarily excluded: they must be either A- and U-like or they must be G- and C-like, but not both. Another alternative for a two-base system is to use a monocyclic (pyrimidine) only or a bicyclic (purine) only system, and the latter has been proposed to offer advantages (Wächtershäuser, 1988). However, the pre-biotic environment probably contained many compounds derived from both purine and pyrimidine, so that the unused components would build up in concentration and ultimately compete with the desired components. Other molecules capable of hydrogen bonding as occurs between the monocyclic and bicyclic bases include acyclic molecules, such as simple amides and urea, as well as tricyclic and more complex molecules. The former, while common, may be too water soluble while the latter may be too insoluble as well as being relatively rare. The use of both a smaller monocyclic and a larger bicyclic base in the information-carrying polymer of life thus appears natural and would have been further promoted by the existence of stacks containing tetrads as well as molecular midwives of the correct size, such as derivatives of phthalocyanine.

Isoguanine and xanthine are special cases that do not immediately fit in our tetrad structure. However, isoguanine, as the appropriate tautomer, might replace adenine, with the isoguanine hydroxyl group hydrogen bonding to a uracil carbonyl group. Xanthine could replace guanine in a corresponding way.

B.2. Experimentally Observed Base Tetrad Structures

The formation of (A·U)₂ and (G·C)₂ tetrads from modified bases and their nucleotides has been reported under certain conditions. For example, the (G·C)₂ tetrad was observed in the structures of 9-ethylguanine in a co-crystal with 1-methyl-5-fluorocytosine, and 9-ethylguanine in a co-crystal with 1-methylcytosine (O'Brien, 1966, 1967). Similarly, a structure analogous to the (A·U)₂ tetrad was observed for 9-ethyl-8-Br-2,6-diaminopurine in a co-crystal with 1-methyluracil (Simundza et al., 1970). Evidence has also been presented for the formation of (A·T)₂ and (G·C)₂ tetrads by lipophilic nucleotide derivatives in apolar solutions (Schall & Gokel, 1994; Williams et al., 1989). Most recently, two (G·C)₂ tetrads were reported in the solution-state structure of a DNA quadruplex (Kettani et al., 1995). In this latter example, guanine quartets (a nucleic base pairing motif stabilized by cation coordination) apparently provide aromatic surfaces and backbone pre-organization that facilitate the formation of (G·C)₂ tetrads.

B.3. Acetals in Proto-RNA

Experimental NMR data and ab initio quantum mechanical calculations carried out at a high level of theory unambiguously show that, at equilibrium, an acetal is favored over a hemiacetal, which is itself favored over an aldehyde hydrate, irrespective of the aldehyde and alcohol used (Wiberg et al., 1994). In the formation of the proto-RNA backbone, the equilibrium concentration of acetals depends on the concentrations of aldehydes and OH-containing sugar molecules and the water activity. Acetal formation in the laboratory is typically carried out under anhydrous conditions in the presence of a trace of strong acid and occurs readily even below room temperature (Wiberg et al., 1994). We propose that the acetal bonds in proto-RNA are formed under conditions of very low water activity (i.e. under nearly dry conditions) in the presence of calcium or magnesium ions to provide general acid catalysis, so that the solution itself does not have to be strongly acidic. The counter anions to the molecular midwife and the metallic cations might be chloride, but also could include organic anions derived from α-hydroxy or α-amino acids, which are likely to be present in the environment. This could conceivably increase the catalytic activity of the cations. Furthermore, proto-RNA is not made free in solution in our model, but arises from its individual components in an organized stack of molecules, as described in Section 6.

FIG. B1. Proposed main steps in the formation of a precursor of the midwife molecule, shown in Fig. 4, starting from diaminomaleonitrile (a tetramer of HCN) by successive reactions with glycolaldehyde and formaldehyde.
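The dependence of the equilibrium acetal concentration on the aldehyde and sugar concentrations and on the water activity, as stated in Section B.3, can be made explicit with a standard mass-action expression. The following is a sketch under stated assumptions: K is a hypothetical overall equilibrium constant, R and R' are generic substituents, and the organic species are treated as ideal (concentrations in place of activities). None of these symbols appears in the original text.

```latex
% Overall acetal formation from one aldehyde and two sugar hydroxyl groups
% (K, R and R' are illustrative symbols, not taken from the paper).
\[
  \mathrm{RCHO} + 2\,\mathrm{R'OH}
  \;\rightleftharpoons\;
  \mathrm{RCH(OR')_2} + \mathrm{H_2O}
\]
% Mass-action form, with water entering through its activity a_{H2O}.
\[
  K = \frac{[\mathrm{RCH(OR')_2}]\; a_{\mathrm{H_2O}}}
           {[\mathrm{RCHO}]\,[\mathrm{R'OH}]^{2}}
  \qquad\Longrightarrow\qquad
  [\mathrm{RCH(OR')_2}] = \frac{K\,[\mathrm{RCHO}]\,[\mathrm{R'OH}]^{2}}
                               {a_{\mathrm{H_2O}}}
\]
```

On this simplified picture, lowering the water activity tenfold at fixed aldehyde and hydroxyl concentrations raises the equilibrium acetal concentration tenfold, which is the sense in which nearly dry conditions favor backbone formation.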

B.4. Pre-biotic Origin of the Molecular Midwife

In some discussions on the origin of life the term "midwife" has been used to designate a self-replicating system that eventually gave rise to another, the latter using RNA for information storage (Crick, 1999). It appears to us that these earlier usages refer to a species of polymer or even the self-replicating clays of Cairns-Smith (1982). We believe that our usage of midwife, and more particularly "molecular midwife", is distinct from these in that it refers to a molecule which itself does not contain sequence information and cannot self-replicate, but is a critical component of a system that achieves self-replication, only for itself to become obsolete after the polymer component of the system is refined by evolution.

Our molecular midwife (which might consist of a set of several closely related molecules) is a derivative of phthalocyanine (Fig. 4). Phthalonitrile (o-dicyanobenzene) and its derivatives can be converted to phthalocyanine and its derivatives, respectively, under a wide variety of conditions (basic, acidic, metal, and metal–salt catalysed) (Kudrevich & van Lier, 1996; Linstead et al., 1937). The product is obtained either in the free form or as a metal ion complex (copper, magnesium, etc., with the loss of two NH protons). Whereas copper is difficult to remove from its complex, dilute acid can remove magnesium.

Diaminomaleonitrile (Fig. B1) is a known tetramer of HCN that forms under weakly basic conditions from HCN (together with adenine, the pentamer of HCN) and has been considered to be a possible pre-biotic molecule (Miller, 1987). It has been converted via 2,3-dicyano-1,4-pyrazine (and derivatives thereof) into octaazaphthalocyanine and corresponding derivatives (Kudrevich & van Lier, 1996), but these compounds do not appear to be good routes to positively charged midwife molecules under pre-biotic conditions. However, diaminomaleonitrile might react with glycolaldehyde, a product of the formose reaction, to give a dihydro-2,3-dicyano-1,4-pyrazine, as shown in Fig. B1. Further reaction with formaldehyde might give a positively charged N-methyl derivative of 2,3-dicyano-1,4-pyrazine (Fig. B1), which might lead directly to the required midwife molecule, as a mixture of isomers (Fig. 4). While none of these reactions have been shown to take place experimentally, feasible mechanisms involving easy prototropic rearrangements (keto–enol and imino–enamine) can be written for these overall reactions, which are therefore not implausible and are given here for illustration purposes. It is desirable that midwife molecules not contain functional groups, such as hydroxyl, that could be involved in unwanted acetal bond formation to base N-glycosides present in stacks. Substitution of other α-hydroxyaldehydes for glycolaldehyde and/or of other aldehydes for formaldehyde would produce such undesirable products, but there are mechanistic reasons for suggesting that this might occur only to a minor extent. The suggested structure of the midwife is intriguing as it is derived from just two components: hydrogen cyanide and formaldehyde. The former is implicated in the pre-biotic synthesis of purines, and the latter in the synthesis of sugars. Incidentally, phthalocyanine and its derivatives absorb visible light efficiently and have been considered for solar energy conversion (Kudrevich & van Lier, 1996). They might have been involved in early photosynthesis and production of molecules, such as sugar phosphates.
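The description of diaminomaleonitrile as a tetramer of HCN and of adenine as its pentamer can be checked with simple formula arithmetic. The short Python sketch below is illustrative only: the helper name `oligomer` is our own, and the molecular formulas C4H4N4 (diaminomaleonitrile) and C5H5N5 (adenine) are standard values supplied here rather than quoted from the paper.

```python
from collections import Counter

# Elemental composition of hydrogen cyanide, HCN.
HCN = Counter({"H": 1, "C": 1, "N": 1})

# Standard molecular formulas (supplied here, not quoted from the paper).
DIAMINOMALEONITRILE = Counter({"C": 4, "H": 4, "N": 4})  # C4H4N4
ADENINE = Counter({"C": 5, "H": 5, "N": 5})              # C5H5N5


def oligomer(monomer: Counter, n: int) -> Counter:
    """Composition of n monomer units combined with no atoms gained or lost."""
    return Counter({element: count * n for element, count in monomer.items()})


if __name__ == "__main__":
    print("4 x HCN matches diaminomaleonitrile:",
          oligomer(HCN, 4) == DIAMINOMALEONITRILE)
    print("5 x HCN matches adenine:",
          oligomer(HCN, 5) == ADENINE)
```

The check concerns composition only; it says nothing about mechanism or yield, which the text treats as open questions.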

B.5. Delivery of Ingredients for Replication of Proto-RNA

The bases and diaminomaleonitrile (formed from HCN and related compounds) are the most chemically stable of the molecules needed for our model and could have arisen at a time (possibly far) earlier than that when generation and replication of proto-RNA took place. The compounds might be present in the ground, mixed with inorganic material such as sand or clay. In a dry state, protected from UV light and in the absence of oxygen, these molecules would be expected to be extremely long lived, especially compared to the dissolved state in a "primordial soup". Although some of these compounds, such as guanine, have quite low solubilities, they would all dissolve to some extent when in contact with water.

Formaldehyde, perhaps formed photochemically in the atmosphere, might deposit on the surface of the ground, and after dissolution in water (from dew), might undergo the formose reaction in the presence of inorganic mineral catalysts present in the ground. This would provide a continual source of aldehydes and sugars needed for the synthesis of the molecular midwife as well as for the replication of proto-RNA in nearby and connected cavities.