Transition in the Appearance of Enzymes Urzymology: Experimental Access to a Key Minireviews

9
Charles W. Carter, Jr. Transition in the Appearance of Enzymes Urzymology: Experimental Access to a Key Minireviews: doi: 10.1074/jbc.R114.567495 originally published online September 10, 2014 2014, 289:30213-30220. J. Biol. Chem. 10.1074/jbc.R114.567495 Access the most updated version of this article at doi: . JBC Affinity Sites Find articles, minireviews, Reflections and Classics on similar topics on the Alerts: When a correction for this article is posted When this article is cited to choose from all of JBC's e-mail alerts Click here Supplemental material: http://www.jbc.org/content/suppl/2014/09/10/R114.567495.DC1.html http://www.jbc.org/content/289/44/30213.full.html#ref-list-1 This article cites 69 references, 26 of which can be accessed free at at University of North Carolina at Chapel Hill on November 3, 2014 http://www.jbc.org/ Downloaded from at University of North Carolina at Chapel Hill on November 3, 2014 http://www.jbc.org/ Downloaded from

Transcript of Transition in the Appearance of Enzymes Urzymology: Experimental Access to a Key Minireviews

Charles W. Carter, Jr.  Transition in the Appearance of EnzymesUrzymology: Experimental Access to a KeyMinireviews:

doi: 10.1074/jbc.R114.567495 originally published online September 10, 20142014, 289:30213-30220.J. Biol. Chem. 

  10.1074/jbc.R114.567495Access the most updated version of this article at doi:

  .JBC Affinity SitesFind articles, minireviews, Reflections and Classics on similar topics on the

 Alerts:

  When a correction for this article is posted• 

When this article is cited• 

to choose from all of JBC's e-mail alertsClick here

Supplemental material:

  http://www.jbc.org/content/suppl/2014/09/10/R114.567495.DC1.html

  http://www.jbc.org/content/289/44/30213.full.html#ref-list-1

This article cites 69 references, 26 of which can be accessed free at

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

Urzymology: ExperimentalAccess to a Key Transition in theAppearance of Enzymes*□S

Published, JBC Papers in Press, September 10, 2014, DOI 10.1074/jbc.R114.567495

Charles W. Carter, Jr.1

From the Department of Biochemistry and Biophysics, University of NorthCarolina, Chapel Hill, North Carolina 27599-7260

Urzymes are catalysts derived from invariant cores of proteinsuperfamilies. Urzymes from both aminoacyl-tRNA synthetaseclasses possess sophisticated catalytic mechanisms: pre-steadystate bursts, significant transition-state stabilization of bothamino acid activation, and tRNA acylation. However, they haveinsufficient specificity to ensure a fully developed genetic code,suggesting that they participated in synthesizing statistical pro-teins. They represent a robust experimental platform fromwhich to articulate and test hypotheses both about their ownancestors and about how they, in turn, evolved into modernenzymes. They help reshape numerous paradigms from theRNA World hypothesis to protein structure databases andallostery.

Novel Methods and Targets

Urzyme derives from the German/Dutch prefix “Ur,” whichmeans primitive or authentic. Woese (1) used the related term“ur-enzyme,” which he attributed to K. C. Atwood. As the namesuggests, my colleagues and I believe that Urzymes representlegitimate experimental models for very early ancestralenzymes. This belief is based on phylogenetic evidence, theirconservation in modern enzymes, and biochemical evidence,their surprisingly high catalytic rate enhancements.

On an evolutionary scale (Fig. 1), Urzymes occupy a gap thatis inaccessible to other approaches to evolution. Urzymologyhas different objectives and methods from ancestral genereconstruction and directed evolution. During most of biolog-ical evolution, fully articulated enzymes assumed new functionsvia point mutations and short insertions and/or deletions: aprocess called neofunctionalization (2). Many such events canbe recovered from multiple sequence alignments by ancestralgene reconstruction (3–9). Directed evolution uses selection toevoke changes in contemporary enzymes.

My colleagues and I seek to characterize catalysts that are50 – 85% smaller and more remote than their contemporarydescendants, and are missing entire domains of genetic infor-mation. Loss of these substantial modules creates a crucial bar-

rier to accessing early developmental evolution; phylogenetictrees based on multiple sequence alignments representingessentially modern enzymes lose coherence at that stage and donot root in the invariant structural cores (10).

Urzymology uses three-dimensional structural superposi-tion to identify invariant cores. Novel combinations of modularprotein engineering, design, biochemistry, and high sensitivityassays allow my colleagues and I then to reconstruct and ana-lyze representations of simpler primordial enzymes that havelong been extinct in any form, and whose accessible descend-ants are much more sophisticated.

Defining evolutionary intermediates is analogous to provingchemical reaction intermediates (11). Intermediate states mustbe identified and characterized. Then, they must be shown toresult from a plausible evolutionary precursor. Finally, theymust be shown to give rise to more modern molecular speciesby similarly plausible evolutionary processes. Biological evolu-tion intermediates are also constrained by using phylogeneticmethods to trace ancestries.

This logic extends that articulated by Thornton et al. (12).Urzymes inferred from multiple structure alignments are nottrue ancestors, but are our best approximations. Given theirhomology to contemporary enzymes and their robust catalysis,it is very likely that the catalytic cores of these enzymes con-served across all branches of the tree of life point to a quitesimilar common molecular ancestor. Moreover, irrespective ofhow closely they resemble true ancestral forms, their activitiesrepresent a powerful new tool for studies of contemporaryenzyme mechanisms.

The gap between the earliest peptide catalysts and modernenzymes represents an intellectually challenging era that coin-cides with the emergence of the genetic code, and hence of bothgenetics and biology. Codon-dependent translation is the nexusbetween chemical and biological evolution. Amino acid activa-tion and tRNA acylation are necessary and sufficient totranslate the genetic code (supplemental Fig. S1). Thus, my col-leagues and I have studied Urzymes derived from aminoacyl-tRNA synthetases (aaRS)2 (13–18). aaRS form two distinctsuperfamilies with unrelated primary, secondary, and tertiarystructures, as well as with significant mechanistic differences(19) (supplemental Figs. S2 and S3).

Fig. 2 illustrates both the notion of and the precedentinvolved in creating Urzymes. Schwob and Söll (20) nearly pro-duced the first Urzyme by selecting random internal deletionsof a suppressor glutaminyl-tRNA synthetase (GlnRS; Class I)for their ability to rescue an amber lacZ mutation. Much of theconnective peptide 1 (CP1) insertion and the anticodon-bind-ing domain (ABD) could be removed without eliminating sup-pression, and a “minimal GlnRS” fusion of the largest CP1 andABD deletions also restored prototrophy on minimal plates.* This work was supported, in whole or in part, by National Institutes of Health

Grants GM 78227 and GM 90406 through the NIGMS (to C. W. C.). This is thethird article in the Thematic Minireview series “Enzyme Evolution.”

□S This article contains supplemental Figs. S1–S5.1 To whom correspondence should be addressed: Dept. of Biochemistry and

Biophysics, CB 7260 University of North Carolina at Chapel Hill, Chapel Hill,NC 27599-7260. Tel.: 919-966-3263; Fax: 919-966-2852; E-mail: [email protected].

2 The abbreviations used are: aaRS, aminoacyl-tRNA synthetase(s); CP1, con-necting peptide 1; ABD, anticodon-binding domain; TrpRS, tryptophanyl-tRNA synthetase; TyrRS, tyrosyl-tRNA synthetase; LeuRS, leucyl-tRNA syn-thetase; HisRS, histidyl-tRNA synthetase; GlnRS, glutaminyl-tRNA synthetase;RO, Rodin-Ohno.

THE JOURNAL OF BIOLOGICAL CHEMISTRY VOL. 289, NO. 44, pp. 30213–30220, October 31, 2014© 2014 by The American Society for Biochemistry and Molecular Biology, Inc. Published in the U.S.A.

OCTOBER 31, 2014 • VOLUME 289 • NUMBER 44 JOURNAL OF BIOLOGICAL CHEMISTRY 30213

MINIREVIEW

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

The length of this minimal GlnRS (260 residues) is still roughlytwice the size of the putative GlnRS Urzyme, which is shown asa ribbon within the context of the full-length GlnRS.

Similar work established the functionality of tRNA acceptorstems (21, 22) and microhelices (23), which could be acylated byintact cognate aaRS. Thus, contemporary aaRS and tRNAs bothcontain functional subsets that are 50 – 85% smaller than theirfull-length relatives. The distinction between these forms andtheir full-length relatives is both quantitative and qualitative.They have fewer domains and exhibit significant reductions incatalyzed rates.

The radical, directed protein surgery necessary to createUrzymes was motivated by the Rodin-Ohno (RO) hypothesis(24) (supplemental Figs. S2 and S3) that ancestral Class I andClass II aaRS coding sequences were originally complementarystrands of the same gene. The only segments of the superfami-lies that could be aligned antiparallel as suggested by Rodin andOhno also turned out to be the invariant cores that positionactive site residues. The RO hypothesis thus predicted thatthese cores were intermediates in aaRS evolution, and henceshould be catalytically active (18).

Excising Urzymes from full-length enzymes exposes manyhydrophobic side chains. These residues must be identified andmutated to restore solubility. Computational methods identi-fied side chains with the greatest newly generated solvent-ac-cessible surface area. Suitable mutations were suggested by theRosetta protein design program (25). Native TrpRS Urzymesequences at interfaces with deleted sequences are active whenthe wild type Urzyme is fused to the anticodon-binding domain(15).

Physical principles motivating the choices that Rosettamakes for these extinct sequences may overlap those inducedby selective pressures for stability (26), evoking surrogates forsequence information missing in multiple sequence alignmentsfrom living organisms. Thus, protein design extends the studyof protein evolution substantially closer to the origin of life.

My colleagues and I are fortunate to have begun by investi-gating aaRS Urzymes. Urzymes derived from the invariantstructural cores of Class I TrpRS (17), LeuRS,3 and Class IIHisRS (16) all have only �15–25% of the total contemporarymass, yet they accelerate both amino acid activation and tRNAaminoacylation proportionately by �108-fold over the uncata-lyzed rates (27).

Their strong phylogenetic support (structural invariance)and high catalytic proficiencies afford a robust platform. Theypoint both backward in time to yet more ancient antecedentsand forward to the fully developed genetic code (Fig. 1). Thus,their catalytic activities establish a base camp for articulatingand testing new and previously inaccessible experimental stud-ies to identify and test intermediates in molecular evolution andallostery (13, 27).

Urzyme Catalytic Activities Satisfy Complementary Testsof Authenticity

Three lines of evidence, pre-steady state burst size, sensitivityto mutation, and substrate binding affinity, reinforce the con-clusion that the Urzymes themselves are the authentic sourcesof the observed catalytic activities. Most unexpected was thatthe TrpRS, LeuRS, and HisRS Urzymes all exhibit pre-steady

3 O. Erdogan and M. Collier, unpublished results.

FIGURE 1. Urzymology and molecular evolution. A, proposed role of aaRSUrzymes in the development of codon-dependent translation (yellow back-ground). The scale indicates time in years since the earth formed. Urzymes arerehabilitated forms of invariant structural cores of modern enzyme super-families. Many modern enzymes likely preceded the last universal commonancestor and first “organism” (LUCA). B, methods appropriate for studyingobjects and processes along the timeline in A. Because urzymology connectsthe earliest genetic coding to the emergence of modern enzymes, it affords apowerful enabling technology for studying key transitions in the evolution ofthe genetic code.

FIGURE 2. Minimal Glutaminyl-tRNA synthetase (20) and its putativeUrzyme (firebrick ribbon). Selection of functional deletions from CP1(100 amino acids; blue) and the ABD (134 amino acids; green) yielded aminimal enzyme (wheat) that is still twice the size of the invariant coreused to construct TrpRS and LeuRS Urzymes. Active-site ligands are shownas spheres.

MINIREVIEW: Urzymology and the Appearance of Enzymes

30214 JOURNAL OF BIOLOGICAL CHEMISTRY VOLUME 289 • NUMBER 44 • OCTOBER 31, 2014

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

state bursts comparable in magnitude with the catalyst concen-trations, ruling out contamination by tiny amounts of full-length aaRS (16 –18). Bursts established that rate-limitingproduct release (28) was a third fundamental link to contempo-rary enzymes, in addition to accelerating amino acid activationand tRNA aminoacylation (see supplemental Fig. S1).

Mutational and protein engineering experiments also sup-port the authenticity of aaRS Urzyme catalytic activities. TrpRSand HisRS Urzymes, expressed as maltose-binding protein(MBP) fusions, are activated �50-fold by tobacco etch virusprotease cleavage. Further, four different HisRS Urzymes dif-fering in the presence or absence of class-defining signatureMotif 3 and a 6-residue N-terminal extension exhibit catalyticdifferences consistent with catalytic contributions of eachmodule plus a significant (i.e. �1.6 kcal/mole) synergistic inter-action between them (16). Finally, point mutation of active-siteresidues alter catalytic activity by an order of magnitude ormore (16, 17). None of these effects are consistent with activityfrom a contaminating catalyst.

Steady-state kinetic parameters afford a third line of evi-dence for authenticity. Both TrpRS and HisRS Urzymes bindATP tightly, but amino acid affinities are 10 –100-fold lowerthan those of the full-length enzymes. The TrpRS Urzyme tryp-

tophan Km is 1–2 mM, 500 times that of intact TrpRS (17). Weakcognate amino acid binding suggests that discriminationagainst similar, non-cognate amino acids is also weakened, asobserved (13, 15, 17).

aaRS Urzymes: Low Specificity, High ProficiencyCatalysts

aaRS Urzyme catalytic activities are dramatically higher thanestimates for the uncatalyzed rates (Fig. 3A). Notably, Urzymesfrom both classes accelerate both activation and acylation 105–106-fold more than necessary to launch ribosome-independentprotein synthesis (Fig. 3B). Further, the two classes achieve sim-ilar activities with similar masses, consistent with their jointrequirement for protein synthesis. Thus, both classes appear tohave achieved comparable proficiency increments as theirentirely different domain architectures grew comparably insize.

Urzyme specificities are low. Spectra for Class I LeuRS andClass II HisRS Urzyme specificities (Fig. 3C) are similar andcomplementary. Both Urzymes activate a range of non-cognateamino acids. Nonetheless, each Urzyme exhibits an �5-foldpreference for amino acids from its own class. Unknown differ-

FIGURE 3. Quantitative framework for assessing the catalytic significance of Urzymes and other putative stages of aaRS evolution. A, rate accelerationsestimated from experimental data for single (red) and bisubstrate (black and bold) reactions adapted from Ref. 63 to include uncatalyzed and catalyzed ratesof bisubstrate reactions of the ribosome (71), amino acid activation (18), and kinases (72). Abbreviations: CAN, carbonic anhydrase; CMU, chorismate mutase;KSI, ketosteroid isomerase; RIBO, ribosome; CDA, cytidine deaminase; PEP, carboxypeptidase B; MAN, mandelate racemase; KINA, hexokinase; FUM, fumarase;GLU, �-glucosidase; ODC, orotidylate carboxylase; ADC, amino acid decarboxylase; IPT, inositol phosphatase. Second order rate constants (black bars) wereconverted into comparable units by multiplying by 0.002 M, which is the ATP concentration used to assay the catalysts shown in B. B, experimental rateaccelerations estimated from steady state kinetics as kcat/Km for a series of catalysts derived from Class I and Class II aaRS. A and B have the same vertical scales,and the origin in B has been set equal to the uncatalyzed rate of amino acid activation (AAact) in A. Red bars, Class I TrpRS and LeuRS (see Footnote 3) constructs;blue bars, Class II HisRS constructs; green bar, a ribozymal catalyst (38), for comparison. Research presented in A and B was originally published in Ref. 27. © TheAmerican Society for Biochemistry and Molecular Biology. C, Class I LeuRS and Class II HisRS Urzyme amino acid specificities. More negative �Gkcat/Km valuesindicate higher activities. By �1 kcal/mole (light bands) or �5-fold, each Urzyme prefers substrates from its own class (dark bands).

MINIREVIEW: Urzymology and the Appearance of Enzymes

OCTOBER 31, 2014 • VOLUME 289 • NUMBER 44 JOURNAL OF BIOLOGICAL CHEMISTRY 30215

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

ences between aaRS Urzymes and the true ancestral forms mayaccount for some of their promiscuity.

The TrpRS Urzyme provided a unique “molecular knock-out” lacking the entire CP1 and ABD, affording a baselineagainst which to determine contributions of the two deletedmodules. Neither CP1 nor the ABD restored any specificity(15), which results entirely from their energetic coupling.

Using allosteric interactions between genetic modules entirelyabsent from the Urzymes to enhance specificity resolves chal-lenges (29 –33) associated with failure of rational amino acid-binding pocket point mutants to accomplish anything butreducing catalytic activity. Moreover, generating orthogonalsynthetase-tRNA pairs appears to require pruning all aminoacid-binding residues to alanine and using random mutation toselect rebuilt pockets to match the altered substrates consistentwith induced-fit mechanisms (34).

These partial results suggest that aaRS Urzymes could notsupport a canonical 20-amino acid alphabet. Low Urzyme spec-ificity and the fact that Urzymes cannot utilize the tRNA anti-codon for recognition form the first experimental basis for theconjecture of Woese (1, 35) that the first coded proteins werestatistical ensembles, without unique sequences.

Reduction and Recapitulation

“You have to deeply understand the essence of a product tobe able to get rid of the parts that are not essential.” (Jony Ive,quoted in Ref. 36)

Urzymes afford a robust platform for reductionist experi-ments aimed at characterizing even simpler ancestral proteincatalysts that look backward in time (37, 39)4 and for recapitu-lating plausible intermediates to test possible evolutionarypaths that look forward in time (15) (Fig. 1).

The ability to measure the subtle effects of modules as smallas 6 –20 amino acids (16) greatly enhances the resolution ofmodular deconstruction as a tool in protein science. Radicalsurgery of Class II aaRS afforded evidence that Motif 3, consid-ered essential to catalytic activity because of its interactionswith ATP, is dispensable and synergistic even with modularadditions elsewhere, including a 6-amino acid N-terminalextension to Motif 1 (15). The catalytic role of Motif 3 may berealized fully only when the Class II insertion domain is presentbetween the Urzyme and Motif 3 (40), or in full-length HisRS.Further experiments are necessary to map the intermodularsynergy.

The unprecedented radical protein surgery that gave rise tothe Class I TrpRS and LeuRS Urzymes entailed removing one ormore long internal peptides as well as conventional truncationof the C-terminal anticodon-binding domain. Deleting CP1, aninternal subdomain, was non-trivial because the two remainingfragments had to be joined together without corrupting theactive site. The LeuRS Urzyme5 entailed removing CP2, in addi-tion to CP1. Removing internal segments was facilitated by thefact that � carbon atoms of their N- and C-terminal residues infull-length aaRS are separated by the length of a peptide bond(13).

Straightforward removal of CP1 from TrpRS and LeuRS andCP2 from LeuRS affords existence proofs that the reverse pro-cess, assimilation of a CP1 or CP2 ancestor, is a legitimate evo-lutionary operation accounting for these insertions into theClass I aaRS as they evolved. Further support for this conclusionarises because insertions into the Toprim domain (41), the clos-est contemporary homolog of Class I aaRS Urzymes, occur atthe locations of Class I CP1 and CP2 insertions (Fig. 4) (42).

CP1 wraps nearly 360° around the TrpRS specificity �-helix,constraining it against the N-terminal crossover connectionthat binds ATP. Molecular dynamics simulations (17) sug-gested it might stabilize that helix, which reorients markedly inlong trajectories in the presence of ATP, but withouttryptophan.

Full TrpRS specificity and tRNATrp aminoacylation activity(15) both require essentially complete interdomain synergy(also called epistasis (43– 45)). The observed epistasis (15)implies that both modules interact in the decision to activatethe amino acid present in the active site. Combinatorialmutagenesis (13) implicated the D1 Switch, a dynamic (46)packing motif (47), in mediating allosteric communicationbetween domain movement and catalysis. The four-way ener-getic coupling between mutated D1 residues (13) makes a quan-titatively equivalent contribution to both catalysis and specific-ity as is observed for the intermodular CP1XABD energeticcoupling (15).

Unexpectedly, the TrpRS Urzyme appears to have higher fit-ness at the two tasks, amino acid recognition and tRNA amino-acylation, required of aaRS than either of the two larger inter-mediate, potentially more advanced constructs. Urzymes thusappear to lie closer to the actual path of aaRS evolution. Thenegative epistasis suggests that evolutionary growth of contem-porary aaRS must be subtler than simply accumulating eitherCP1 or ABD domains.

RO Hypothesis Redux

The RO hypothesis may have been ignored because it was notobvious how to test it. The relevant objects, ancestral Class Iand II aaRS, are so remote that it was hardly evident that thehypothesis could be falsified (48). My colleagues and I articu-lated and verified bioinformatic (14) and biochemical (16 –18,27) predictions. The balanced, proportionate rate accelerationsof both amino acid activation and tRNA aminoacylation byTrpRS and HisRS Urzymes confirmed the prediction that ClassI and II segments consistent with antiparallel alignment shouldbe active (27).

Bioinformatic predictions of sense/antisense coding ancestrywere tested by excerpting a 94-residue Urgene from �200 con-temporary coding sequences of Class I TrpRS and Class IIHisRS. Tyrosyl-tRNA synthetase (TyrRS) and prolyl-tRNAsynthetase (ProRS) served as outgroups in rooting the respec-tive trees. Codon middle bases formed base pairs in �0.34 ofall-by-all antiparallel alignments in all four cases, with a stand-ard error of �0.0003, as compared with a well established valueof 0.25 for the null hypothesis (supplemental Fig. S4). Middle-base pairing increased in independently reconstructed ances-tral sequences for the two trees (supplemental Fig. S5 (14). Mid-dle-base pairing of sense/antisense-related sequences thus

4 C. W. Carter, Jr., and R. Wolfenden, manuscript in preparation.5 M. Collier, O. Erdogan, and C. W. Carter, Jr., manuscript in preparation.

MINIREVIEW: Urzymology and the Appearance of Enzymes

30216 JOURNAL OF BIOLOGICAL CHEMISTRY VOLUME 289 • NUMBER 44 • OCTOBER 31, 2014

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

appears to be a phylogenetic metric that persists far deeper intothe past than do metrics for multiple sequence alignments for asingle phylogenetic tree.

Their sophistication and tuned catalytic activities argue thataaRS Urzymes had far simpler ancestors. The sense/antisenseancestry of the Urzymes suggests in turn that these simplerancestors also were encoded on opposite strands of the samegene(s). Phylogenetic evidence and the similar comparisonsbetween two Class IC and two Class IIA aaRS sequence align-ments (14) and analysis of amino acid activation by 46-residueATP-binding sites coded by a designed sense/antisense gene(37) suggest, in turn, that certain properties of these ancestorsmay be accessible via methods analogous to those of ancestralgene reconstruction (8, 12). See Ref. 39 for additional details.

Urzymology-driven Paradigm Shifts

My colleagues and I expected that experimental studies ofancestral aaRS would identify novel perspectives on the originsof translation itself and hence in contemporary molecular biol-ogy and biochemistry. However, unexpected new perspectiveson phylogenetics/genomics and the origins of protein fold-ing, catalytic activity, specificity, and allostery will also likelyshift and enhance comprehensive paradigms in genetics andbiophysics.

Codon-dependent Translation

The evolutionary history of the universal genetic code is, in areal sense, that of the two synthetase superfamilies and their

cognate tRNAs. Support for the RO hypothesis argues againstparadigms holding that the aaRS classes appeared independ-ently, one after the other (49, 50). Urzyme tRNA acylation activ-ity opens to more detailed testing the proposal that early trans-lation used an “operational RNA code” (51) vested only in thetRNA acceptor stem bases, the only parts of tRNA that can berecognized by aaRS Urzymes.

Sense/antisense coding projects further into the past thanother metrics (14), implying that catalytic peptides responsiblefor activating amino acids co-evolved with tRNA from a veryprimitive state. Thus, contrary to the prevailing RNA Worldhypothesis, my colleagues and I restated (27) the proposal(52, 53) that genetic coding emerged from mutually catalyticRNA and peptides, using rudimentary stereochemical codingbetween the two biopolymers.

Phylogenetics/Genomics

Systematic protein structure classifications, SCOP (54) andCATH (55), fail to identify Urzymes of either aaRS class asancestral forms (10). However, aaRS Urzymes represent plau-sible ancestors for a wide spectrum of contemporary proteins.The Rossmannoid superfamily (56), the biggest in the proteome(57), includes consensus homologs of Class I aaRS. My col-leagues and I argued (58) that the 26 families in the HSP70/actinATPase superfamily are ancient paralogs of Class II aaRS (pfam:CL0108).

Class II aaRS Urzymes illustrate a distinct, but related prob-lem. One might expect their descendants to include a propor-

FIGURE 4. Structural homology between Class I Urzymes and the TOPRIM domain found in topoisomerases and primases (41). Core domain structuresare superimposed in the center. Insertion points corresponding to Class I CP1 and CP2 (42) are shared by various TOPRIM domains, as indicated. The N and Ctermini of the insertions are indicated by red spheres and correspond closely throughout the domain superfamily. P. horikoshii, Pyrococcus horikoshii; D. radio-durans, Deinococcus radiodurans; E. coli, Escherichia coli.

MINIREVIEW: Urzymology and the Appearance of Enzymes

OCTOBER 31, 2014 • VOLUME 289 • NUMBER 44 JOURNAL OF BIOLOGICAL CHEMISTRY 30217

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

tion of the contemporary proteome comparable with that(0.25– 0.3) occupied by the Rossmannoid protein superfamilythat includes Class I aaRS. The only proposed relative is the dualfunction biotin synthase/repressor BirA (59). Although evi-dence for that homology is strong, it relies heavily on the C-ter-minal halves of the molecules that follow Motif 2. This essen-tially modern homology may also be misleading because thesmallest Class II Urzymes lack both Motif 3 and C-terminalinsertion domains, which may obscure more distant homolo-gies based on the Class II Urzyme, for example, the HSP70/actin superfamily (58).

Validating the RO sense/antisense coding hypothesis (24)thus introduces new, potentially deep relationships betweenprotein superfamilies. This conundrum highlights possible lim-itations in current structural database annotations, and accom-modating Urzymes into evolutionary frameworks should there-fore enhance inferences drawn from structural databases.

Protein Folding

Information carried by a gene is generally perceived to beunambiguous. However, for sense/antisense genes, this uniqueinformation has two valid but quite different interpretationsresulting in two different folds, depending on which strand isread. The inverse duality in the genetic code (60) ensures thatsecondary structures coded by opposite strands of the samegene can both be amphipathic, consistent with formatting glob-ular ensembles that are, in a sense, “inside out” (Fig. 6 of Ref. 14).

Catalysis

A widespread consensus (61– 63) holds that enzymic rateenhancements require strong bonds to the transition state andhence favorable enthalpies of activation. To minimize unfavor-able entropy changes in forming such strong transition stateinteractions, one naively expects aaRS Urzymes to be properlyfolded proteins. We do not yet know whether Urzymes are fullyfolded. Recently, however, Hu (64) confirmed that alternativeproperly folded and molten globular forms of the same enzyme(65) achieve the same rate enhancements by different enthalpy/entropy compensations. Molten globular proteins can actuallyachieve higher transition state complementarity than morerigid, folded proteins. Dissociating catalysis from the require-ment for folding suggests that a much broader range of earlytranslation products may have been catalytically active, accel-erating early stages of natural selection.

aaRS Modularity

The modularity of contemporary protein structures is a per-plexing problem that intensifies challenges posed by proteinevolution. aaRS enzymes exhibit unusually high percentages(5-fold more than other families (66)) of accreted sequenceswith new functionality, i.e. physiocrines, along the organismalphylogenetic trees. Physiocrines appear in aaRS from bothclasses. Their functions are unrelated to translation, rangingfrom endocrine regulation of cardiovascular development toimmune system activities, as well as regulation of mammaliantarget of rapamycin (mTOR), IFN-�, and p53 signaling (67).Urzymology introduces techniques needed to recapitulate thegain of such functions.

Specificity and Allostery

The most significant puzzle arising thus far from studies ofurzymology is that enhanced specificity of contemporaryTrpRS, relative to its Urzyme, involves negative epistasis,requiring allosteric energetic coupling between the ABD andthe CP1 insertion via a switching element intrinsic to theUrzyme (13, 15). Intramolecular epistasis may have developedwhile different modules present in contemporary aaRS func-tioned in trans. Some authors suggest that off-loading functionssuch as specificity to allosteric effects may be beneficial by mak-ing specificity a more robust property (68). The dynamicswitching element responsible for this coupling is a widely con-served packing motif (47). How this “protoallosteric” motif (69)functions without the ABD and CP1 modules and how couplingmight have emerged before modules became covalently joinedremain outstanding questions.

Like knock-out mice, Urzymes are extensive molecularknock-outs that provide unique experimental baselines forhelping to answer such questions. Their measurable catalyticrates facilitate multidimensional thermodynamic analysis ofmodular epistasis in mechanistic enzymology (13, 15).

Protein Design

Recovering Urzymes by deleting non-essential proteinmasses has adverse effects on stability and solubility. Muta-tions, identified by Rosetta (60, 61), compensate for theseeffects. Interfaces between Urzymes and more recentlyacquired modules can, in principle, also be redesigned. As pro-ficient, relatively nonspecific catalysts, Urzymes can likely beengineered to acylate tRNAs with non-canonical amino acids(34, 70) for industrial purposes.

Urzymes Have Measurable Activities

Most enzyme-catalyzed rates are within the same order ofmagnitude, irrespective of the uncatalyzed reaction rates (63).Roughly comparable enzyme-catalyzed rates appear to be arequirement for biology. An important implication is that cat-alytic activities of different enzymes have always been subject tosuch a constraint. The 108-fold rate accelerations observed forClass I and II aaRS Urzymes (3, 6) suggests that other Urzymesare likely also to have measurable catalytic rates.

Conclusions

In launching urzymology, my colleagues and I took threesteps for which there was little, if any, precedent: (i) validatingthe Rodin-Ohno hypothesis; (ii) testing the evolutionary impli-cation that the most highly conserved portions of enzymes (i.e.their invariant cores) probably have significant catalytic activi-ties; and (iii) using protein design (Rosetta (25)) to compensatefor protein mass lost on creating Urzymes, facilitating radicalprotein surgery. Previously, there was no way to formulate ortest hypotheses about how simple, extinct ancestors cameto resemble contemporary proteins. Urzymology now offerscoherent paradigms that open unprecedented experimentalaccess to mechanisms of very early protein evolution, as well asto novel and effective studies of contemporary mechanisticenzymology.

MINIREVIEW: Urzymology and the Appearance of Enzymes

30218 JOURNAL OF BIOLOGICAL CHEMISTRY VOLUME 289 • NUMBER 44 • OCTOBER 31, 2014

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

The first examples of urzymology introduced the ability touse high resolution modular engineering pro-actively, toaddress previously inaccessible questions. My colleagues and Ideveloped Class I and II aaRS Urzymes to test the Rodin-Ohnohypothesis that ancestral forms of each class descended fromopposite strands of the same gene. Unexpected results fromthat effort will likely change prevailing paradigms in severalareas. In validating the hypothesis, my colleagues and I discov-ered that urzymology, because it connects the earliest geneticcoding to the emergence of modern enzymes, affords a power-ful enabling technology for studying a key transitional period inthe evolution of the genetic code. aaRS Urzymes afford a basecamp for probing even more primitive peptide catalysts and forrecapitulating subsequent evolutionary steps leading to theemergence of full-length aaRS.

Acknowledgments—I am grateful to numerous laboratory membersfor their contributions over the past decade and to Richard Wolfendenfor helpful discussions.

REFERENCES1. Woese, C. R. (1965) On the origin of the genetic code. Proc. Natl. Acad. Sci.

U.S.A. 54, 1546 –15522. Rastogi, S., and Liberles, D. A. (2005) Subfunctionalization of duplicated

genes as a transition state to neofunctionalization. BMC Evol. Biol. 5, 283. Hanson-Smith, V., Kolaczkowski, B., and Thornton, J. W. (2010) Robust-

ness of ancestral sequence reconstruction to phylogenetic uncertainty.Mol. Biol. Evol. 27, 1988 –1999

4. Bridgham, J. T., Ortlund, E. A., and Thornton, J. W. (2009) An epistaticratchet constrains the direction of glucocorticoid receptor evolution. Na-ture 461, 515–519

5. Dean, A. M., and Thornton, J. W. (2007) Mechanistic approaches to thestudy of evolution: the functional synthesis. Nat. Rev. Genet. 8, 675– 688

6. Thornton, J. W. (2004) Resurrecting ancient genes: experimental analysisof extinct molecules. Nat. Rev. Genet. 5, 366 –375

7. Stackhouse, J., Presnell, S. R., McGeehan, G. M., Nambiar, K. P., andBenner, S. A. (1990) The ribonuclease from an extinct bovid ruminant.FEBS Letters 262, 104 –106

8. Gaucher, E. A., Govindarajan, S., and Ganesh, O. K. (2008) Palaeotem-perature trend for Precambrian life inferred from resurrected proteins.Nature 451, 704 –707

9. Benner, S. A., Sassi, S. O., and Gaucher, E. A. (2007) Molecular paleosci-ence: systems biology from the past. Adv. Enzymol. Relat. Areas Mol. Biol.75, 1–132

10. Caetano-Anollés, G., Wang, M., and Caetano-Anollés, D. (2013) Struc-tural phylogenomics retrodicts the origin of the genetic code and uncoversthe evolutionary impact of protein flexibility. PLoS One 8, e72225

11. Fersht, A. R. (1999) Structure and Mechanism in Protein Science, pp.217–218, W. H. Freeman and Company, New York

12. Thornton, J. W., Need, E., and Crews, D. (2003) Resurrecting the ancestralsteroid receptor: ancient origin of estrogen signaling. Science 301,1714 –1717

13. Weinreb, V., Li, L., Chandrasekaran, S. N., Koehl, P., Delarue, M., andCarter, C. W., Jr. (2014) Enhanced amino acid selection in fully evolvedtryptophanyl-tRNA synthetase, relative to its Urzyme, requires domainmovement sensed by the D1 Switch, a remote, dynamic packing motif.J. Biol. Chem. 289, 4367– 4376

14. Chandrasekaran, S. N., Yardimci, G. G., Erdogan, O., Roach, J., and Carter,C. W., Jr. (2013) Statistical evaluation of the Rodin-Ohno hypothesis:sense/antisense coding of ancestral Class I and II aminoacyl-tRNA syn-thetases. Mol. Biol. Evol. 30, 1588 –1604

15. Li, L., and Carter, C. W., Jr. (2013) Full Implementation of the genetic codeby tryptophanyl-tRNA synthetase requires intermodular coupling. J. Biol.Chem. 288, 34736 –34745

16. Li, L., Weinreb, V., Francklyn, C., and Carter, C. W., Jr. (2011) Histidyl-tRNA synthetase Urzymes: Class I and II aminoacyl-tRNA synthetaseUrzymes have comparable catalytic activities for cognate amino acid ac-tivation. J. Biol. Chem. 286, 10387–10395

17. Pham, Y., Kuhlman, B., Butterfoss, G. L., Hu, H., Weinreb, V., and Carter,C. W., Jr. (2010) Tryptophanyl-tRNA synthetase Urzyme: a model to re-capitulate molecular evolution and investigate intramolecular comple-mentation. J. Biol. Chem. 285, 38590 –38601

18. Pham, Y., Li, L., Kim, A., Erdogan, O., Weinreb, V., Butterfoss, G. L.,Kuhlman, B., and Carter, C. W., Jr. (2007) A minimal TrpRS catalyticdomain supports sense/antisense ancestry of Class I and II aminoacyl-tRNA synthetases. Mol. Cell 25, 851– 862

19. Carter, C. W., Jr. (1993) Cognition, mechanism, and evolutionary relation-ships in aminoacyl-tRNA synthetases. Annu. Rev. Biochem. 62, 715–748

20. Schwob, E., and Söll, D. (1993) Selection of a ’minimal’ glutaminyl synthe-tase and the evolution of Class I synthetases. EMBO J. 12, 5201–5208

21. Francklyn, C., Musier-Forsyth, K., and Schimmel, P. (1992) Small RNAhelices as substrates for aminoacylation and their relationship to chargingof transfer RNAs. Eur. J. Biochem. 206, 315–321

22. Francklyn, C., and Schimmel, P. (1989) Aminoacylation of RNA minihe-lices with alanine. Nature 337, 478 – 481

23. Francklyn, C., and Schimmel, P. (1990) Enzymatic aminoacylation of aneight-base-pair microhelix with histidine. Proc. Natl. Acad. Sci. U.S.A. 87,8655– 8659

24. Rodin, S. N., and Ohno, S. (1995) Two types of aminoacyl-tRNA synthe-tases could be originally encoded by complementary strands of the samenucleic acid. Orig. Life Evol. Biosph. 25, 565–589

25. Kuhlman, B., Dantas, G., Ireton, G. C., Varani, G., Stoddard, B. L., andBaker, D. (2003) Design of a novel globular protein fold with atomic-levelaccuracy. Science 302, 1364 –1368

26. Dokholyan, N. V., and Shakhnovich, E. I. (2001) Understanding hierarchi-cal protein evolution from first principles. J. Mol. Biol. 312, 289 –307

27. Li, L., Francklyn, C., and Carter, C. W., Jr. (2013) Aminoacylating Urzymeschallenge the RNA World hypothesis. J. Biol. Chem. 288, 26856 –26863

28. Fersht, A. R. (1987) Dissection of the structure and activity of the tyrosyl-tRNA synthetase by site-directed mutagenesis. Biochemistry 26,8031– 8037

29. Perona, J. J., and Gruic-Sovulj, I. (2014) Synthetic and editing mechanismsof aminoacyl-tRNA synthetases. Top. Curr. Chem. 344, 1– 41

30. Corigliano, E. M., and Perona, J. J. (2009) Architectural underpinnings ofthe genetic code for glutamine. Biochemistry 48, 676 – 687

31. Bullock, T. L., Rodríguez-Hernández, A., Corigliano, E. M., and Perona,J. J. (2008) A rationally engineered misacylating aminoacyl-tRNA synthe-tase. Proc. Natl. Acad. Sci. U.S.A. 105, 7428 –7433

32. Ibba, M., and Söll, D. (2004) Aminoacyl-tRNAs: setting the limits of thegenetic code. Genes Dev. 18, 731–738

33. Praetorius-Ibba, M., Stange-Thomann, N., Kitabatake, M., Ali, K., Söll, I.,Carter, C. W., Jr., Ibba, M., and Söll, D. (2000) Ancient adaptation of theactive site of tryptophanyl-tRNA synthetase for tryptophan binding. Bio-chemistry 39, 13136 –13143

34. Wang, L., Brock, A., Herberich, B., and Schultz, P. G. (2001) Expanding thegenetic code of Escherichia coli. Science 292, 498 –500

35. Woese, C. (1998) The universal ancestor. Proc. Natl. Acad. Sci. U.S.A. 95,6854 – 6859

36. Isaacson, W. (2011) Steve Jobs, p. 364, Simon & Schuster, New York37. Jimenez, M., Williams, T., González-Rivera, A. K., Li, L., Erdogan, O., and

Carter, C. W. J. (2014) Did Class 1 and Class 2 aminoacyl-tRNA synthe-tases descend from genetically complementary, catalytically active ATP-binding motifs? Biophys. J. 106, 675a

38. Kumar, R. K., and Yarus, M. (2001) RNA-catalyzed amino acid activation.Biochemistry 40, 6998 –7004

39. Carter, C. W., Jr., Li, L., Weinreb, V., Collier, M., Gonzalez-Rivera, K.,Jimenez-Rodriguez, M., Erdogan, O., Kuhlman, B., Ambroggio, X.,Williams, T., and Chandrasekharan, S. N. (2014) The Rodin-Ohno hy-pothesis that two enzyme superfamilies descended from one ancestralgene: an unlikely scenario for the origins of translation that will not bedismissed. Biol. Direct 9, 11

40. Augustine, J., and Francklyn, C. (1997) Design of an active fragment of a

MINIREVIEW: Urzymology and the Appearance of Enzymes

OCTOBER 31, 2014 • VOLUME 289 • NUMBER 44 JOURNAL OF BIOLOGICAL CHEMISTRY 30219

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from

Class II aminoacyl-tRNA synthetase and its significance for synthetaseevolution. Biochemistry 36, 3473–3482

41. Aravind, L., Leipe, D. D., and Koonin, E. V. (1998) Toprim: a conservedcatalytic domain in type IA and II topoisomerases, DnaG-type primases,OLD family nucleases and RecR proteins. Nucleic Acids Res. 26,4205– 4213

42. Burbaum, J. J., Starzyk, R. M., and Schimmel, P. (1990) Understandingstructural relationships in proteins of unsolved three-dimensional struc-ture. Proteins 7, 99 –111

43. Gong, L. I., Suchard, M. A., and Bloom, J. D. (2013) Stability-mediatedepistasis constrains the evolution of an influenza protein. eLife 2, e00631

44. Nasrallah, C. A., and Huelsenbeck, J. P. (2013) A phylogenetic model forthe detection of epistatic interactions. Mol. Biol. Evol. 30, 2197–2208

45. Covert, A. W., 3rd, Lenski, R. E., Wilke, C. O., and Ofria, C. (2013) Exper-iments on the role of deleterious mutations as stepping stones in adaptiveevolution. Proc. Natl. Acad. Sci. U.S.A. 110, E3171–EE3178

46. Kapustina, M., Weinreb, V., Li, L., Kuhlman, B., and Carter, C. W., Jr.(2007) A conformational transition state accompanies amino acid activa-tion by B. stearothermophilus tryptophanyl-tRNA synthetase. Structure15, 1272–1284

47. Cammer, S., and Carter, C. W., Jr. (2010) Six Rossmannoid folds, includingthe Class I aminoacyl-tRNA synthetases, share a partial core with theanticodon-binding domain of a Class II aminoacyl-tRNA Synthetase.Bioinformatics 26, 709 –714

48. Popper, K. R. (1959) The Logic of Scientific Discovery, pp. 124 –130, Rout-ledge, Oxford, United Kingdom

49. Weiner, A. M. (1999) Molecular evolution: aminoacyl-tRNA synthetaseson the loose. Curr. Biol. 9, R842–R844

50. Klipcan, L., and Safro, M. (2004) Amino acid biogenesis, evolution of thegenetic code and aminoacyl-tRNA synthetases. J. Theor. Biol. 228,389 –396

51. Schimmel, P., Giegé, R., Moras, D., and Yokoyama, S. (1993) An opera-tional RNA code for amino acids and possible relationship to genetic code.Proc. Natl. Acad. Sci. U.S.A. 90, 8763– 8768

52. Carter, C. W. J. (1975) Cradles for molecular evolution. New Sci. 65,784 –787

53. Carter, C. W., Jr., and Kraut, J. (1974) A proposed model for interaction ofpolypeptides with RNA. Proc. Nat. Acad. Sci., U.S.A. 71, 283–287

54. Andreeva, A., Howorth, D., Chandonia, J. M., Brenner, S. E., Hubbard,T. J., Chothia, C., and Murzin, A. G. (2008) Data growth and its impact onthe SCOP database: new developments. Nucleic Acids Res. 36,D419 –D425

55. Pearl, F. M., Bennett, C. F., Bray, J. E., Harrison, A. P., Martin, N., Shep-herd, A., Sillitoe, I., Thornton, J., and Orengo, C. A. (2003) The CATHdatabase: an extended protein family resource for structural and func-tional genomics. Nucleic Acids Res. 31, 452– 455

56. Goldstein, R. (2008) The structure of protein evolution and the evolutionof protein structure. Curr. Opin. Struct. Biol. 18, 170 –177

57. Shakhnovich, B. E., Dokholyan, N. V., DeLisi, C., and Shakhnovich, E. I.(2003) Functional fingerprints of folds: evidence for correlated structure-function evolution. J. Mol. Biol. 326, 1–9

58. Carter, C. W., Jr., and Duax, W. L. (2002) Did tRNA synthetase classesarise on opposite strands of the same gene? Mol. Cell 10, 705–708

59. Artymiuk, P. J., Rice, D. W., Poirrette, A. R., and Willet, P. (1994) A tale oftwo synthetases. Nat. Struct. Biol. 1, 758 –760

60. Zull, J. E., and Smith, S. K. (1990) Is genetic code redundancy related toretention of structural information in both DNA strands? TrendsBiochem. Sci. 15, 257–261

61. Wolfenden, R. (2011) Benchmark reaction rates, the stability of biologicalmolecules in water, and the evolution of catalytic power in enzymes. Ann.Rev. Biochem. 80, 645– 667

62. Stockbridge, R. B., Lewis, C. A., Jr., Yuan, Y., and Wolfenden, R. (2010)Impact of temperature on the time required for the establishment of pri-mordial biochemistry, and for the evolution of enzymes. Proc. Natl. Acad.Sci. U.S.A. 107, 22102–22105

63. Wolfenden, R., and Snider, M. J. (2001) The depth of chemical time andthe power of enzymes as catalysts. Acc. Chem. Res. 34, 938 –945

64. Hu, H. (2014) Wild-type and molten globular chorismate mutase achievecomparable catalytic rates using very different enthalpy/entropy compen-sations. Science China 57, 156 –164

65. Pervushin, K., Vamvaca, K., Vögeli, B., and Hilvert, D. (2007) Structureand dynamics of a molten globular enzyme. Nat. Struct. Mol. Biol. 14,1202–1206

66. Guo, M., Yang, X.-L., and Schimmel, P. (2010) New functions of amino-acyl-tRNA synthetases beyond translation. Nat. Rev. Mol. Cell Biol. 11,668 – 674

67. Guo, M., and Schimmel, P. (2013) Essential nontranslational functions oftRNA synthetases. Nat. Chem. Biol. 9, 145–153

68. del Sol, A., Fujihashi, H., Amoros, D., and Nussinov, R. (2006) Residuescrucial for maintaining short paths in network communication mediatesignaling in proteins. Mol. Syst. Biol. 2, 2006.0019

69. Weinreb, V., Li, L., Campbell, C. L., Kaguni, L. S., and Carter, C. W., Jr.(2009) Mg2�-assisted catalysis by B. stearothermophilus TrpRS is pro-moted by allosteric effects. Structure 17, 952–964

70. Liu, C. C., and Schultz, P. G. (2010) Adding new chemistries to the geneticcode. Ann. Rev. Biochem. 79, 413– 444

71. Schroeder, G. K., and Wolfenden, R. (2007) The rate enhancement pro-duced by the ribosome: an improved model. Biochemistry 46, 4037– 4044

72. Stockbridge, R. B., and Wolfenden, R. (2009) The intrinsic reactivity ofATP and the catalytic proficiencies of kinases acting on glucose, N-acetyl-glucosamine, and homoserine. J. Biol. Chem. 284, 22747–22757

MINIREVIEW: Urzymology and the Appearance of Enzymes

30220 JOURNAL OF BIOLOGICAL CHEMISTRY VOLUME 289 • NUMBER 44 • OCTOBER 31, 2014

at University of N

orth Carolina at C

hapel Hill on N

ovember 3, 2014

http://ww

w.jbc.org/

Dow

nloaded from