J. Basic Microbiol. 44 (2004) 5, 383–399 DOI: 10.1002/jobm.200410422

© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 0233-111X/04/0510-0383

(¹Department of Agricultural Biotechnology and Microbiology – Szent István University, Páter K. u. 1., Gödöllő, H-2103, Hungary; ²Agricultural Biotechnology Center, Gödöllő, H-2101, P.O.Box 411, Hungary; ³Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA)

Cloning, characterization and phylogenetic relationships of cel5B, a new endoglucanase encoding gene from Thermobifida fusca

KATALIN POSTA¹, EMESE BÉKI¹,², DAVID B. WILSON³, JÓZSEF KUKOLYA¹,* and LÁSZLÓ HORNOK¹,²

(Received 21 April 2004/Accepted 25 May 2004)

Thermobifida fusca, a thermophilic, aerobic, cellulolytic bacterium, has a highly complex cellulase system comprising three endoglucanases, two exoglucanases and one processive endoglucanase. Zymogram analysis indicated that additional cellulases may exist in T. fusca strain TM51; therefore, a TM51 expression library was prepared in Streptomyces lividans TK24 and screened for hydrolases. A new endoglucanase gene, named Tf cel5B, was identified. Heterologous Cel5B, produced in S. lividans, had temperature and pH optima of 77 °C and 8.2, respectively, and retained more than 60% of its activity after 24 h incubation at 60 °C. Domain analysis revealed an N-terminal catalytic domain with homology to known endoglucanases in family GH5 and a C-terminal cellulose binding module III domain (CBD). Comparing the domain structures of all seven known T. fusca cellulases showed that the cellulase system of this organism consists of pairs of enzymes from the same GH family, including Cel5A − Cel5B, Cel6A − Cel6B and Cel9A − Cel9B, plus a single family GH48 enzyme (Cel48A). Furthermore, the catalytic and substrate binding domains of enzymes belonging to the same GH family were arranged in opposite orientations. Phylogenetic comparisons of the catalytic domain sequences of the T. fusca cellulases to other family GH5, GH6, GH9 and GH48 cellulases of bacterial origin revealed that the enzyme pairs in the same GH family are not closely related to each other; instead, they showed significant similarities to various cellulase enzymes from taxonomically distinct organisms. Therefore, the complex and highly efficient cellulase system of T. fusca seems to have evolved as a result of horizontal gene transfers rather than gene duplication events.

Thermobifida fusca is one of the most extensively studied aerobic, thermophilic, cellulose degrading bacteria. This actinomycete utilizes various plant cell wall polymers, including cellulose, as the major carbon source and secretes multiple cellulases: three endoglucanases (Cel9B, Cel6A, Cel5A, formerly named E1, E2, and E5), two exoglucanases (Cel6B and Cel48A, formerly E3 and E6), and an endo/exoglucanase (Cel9A, formerly E4), which have been characterized in detail (IRWIN et al. 1998, 2000, JUNG et al. 1993, LAO et al. 1991, GHANGAS and WILSON 1988). These cooperatively acting extracellular enzymes degrade the cellulose chain in concert, yielding cellobiose as the main product. Cellobiose is further degraded to glucose by an intracellular β-glucosidase, named BglC (SPIRIDONOV and WILSON 2001). This system resembles the cellulase enzyme system of the aerobic mesophilic actinomycete, Cellulomonas fimi. Particularly interesting similarities are the multiplicity of endoglucanases and cellobiohydrolases (CBHs) (WARREN 1996), the presence of the two types of CBHs acting on the opposite ends of the cellulose chain (SHEN et al. 1996, BARR et al. 1996) and the existence of the same unique processive endoglucanase in both organisms (TOMME et al. 1996, IRWIN et al. 1998).

During a previous survey we isolated a number of thermophilic actinomycetes from the hot core of manure compost and determined their endoglucanase, cellobiohydrolase, endoxylanase, β-mannosidase, protease, amylase, as well as lignin solubilization activities (KUKOLYA et al. 1997). One strain of T. fusca, designated TM51, showed the highest lignocellulose degrading capability (KUKOLYA et al. 2002). Preliminary zymogram analysis of the cellulolytic enzyme system of this strain revealed a more complex endoglucanase pattern than that described previously in T. fusca YX. In order to find whether these additional proteins are encoded by unknown cellulase genes, or result from post-translational modifications, more detailed studies were initiated. The present paper reports on the cloning and characterization of a new endoglucanase gene, named Tf cel5B. Data on physico-chemical parameters and enzyme activity of the Cel5B protein are also provided. Investigations on domain organization and phylogenetic relationships of this new enzyme and the other cellulases of T. fusca described thus far allowed us to delineate the evolution of the highly complex and efficient cellulose degrading system in this model organism.

* Corresponding author: Dr. J. KUKOLYA; e-mail: kukolya@abc.hu

Materials and methods

Chemicals: The different p-nitrophenyl glycosides (pNP-β-D-cellobioside, pNP-β-D-glucopyranoside, pNP-β-D-mannopyranoside, pNP-β-D-xylopyranoside), IPTG, laminarin, lichenin, birch wood xylan, oat spelt xylan, CM-cellulose and locust bean gum (LBG) were all purchased from SIGMA. MN300 cellulose powder was from MACHEREY-NAGEL, Avicel from MERCK, and pustulan from CALBIOCHEM. Restriction endonucleases were from PROMEGA and Gibco-BRL, T4 DNA ligase from BOEHRINGER Mannheim, and thiostrepton was obtained from FLUKA.

Strains and plasmids: Identification of Thermobifida fusca TM51, based on standard procedures (KUKOLYA et al. 1997), was confirmed by partial 16S rDNA sequencing and DNA-DNA hybridization; T. fusca ATCC 27730 served as the reference strain. Streptomyces lividans TK24 was used to prepare an expression library from T. fusca TM51. DNA fragments from cellulase-positive clones were subcloned in Escherichia coli DH5α. Plasmids pIJ699 and pBluescript KS+ (Stratagene) were used for constructing the expression library and subcloning DNA fragments in E. coli, respectively. Stock cultures were maintained at –70 °C in LURIA-BERTANI (LB) medium containing 15% glycerol or were freeze-dried for long term preservation. Plasmid DNA was stored in water at –20 °C.

Growth conditions: T. fusca TM51 was grown in LB broth (200 ml in 1000 ml ERLENMEYER flasks) inoculated with a 1 ml conidiospore suspension (10⁶ ml⁻¹) collected by washing from 7-day-old cultures on MN300 plates. Cultures were incubated in a rotary shaker at 200 r.p.m. at 50 ± 2 °C for one week. Cells were collected by centrifugation (6000 r.p.m.), washed two times with distilled water, freeze-dried and subjected to DNA extraction. For protoplast preparation S. lividans TK24 was grown as shaken culture at 30 °C for 2 days in yeast extract – malt extract medium (YEME) supplemented with 5 mM MgCl2 and 0.5% (w/v) glycine (HUNTER 1985). S. lividans transformants were grown for mass production of endoglucanases in LB medium (200 ml in 1000 ml flasks) inoculated with 1 ml conidium suspension (10⁶ ml⁻¹) and incubated in a rotary shaker (250 r.p.m.) at 30 °C for 2 days; 200 µg thiostrepton ml⁻¹ were added to maintain stability of the transformants.

Construction and screening of the genomic library of T. fusca TM51: Genomic DNA, prepared from T. fusca TM51 according to KUKOLYA et al. (2002), was partially digested in serial dilutions of SalI for 1 hour to find the optimum enzyme concentration yielding the highest proportion of fragments sized between 10 and 15 kb. These fragments, separated by electrophoresis, were isolated from the agarose gel with a DNA extraction kit from FERMENTAS, and ligated into pIJ699 purified by the Wizard® Plus Maxipreps DNA Purification System (PROMEGA) according to the manufacturer’s recommendations. Protoplast preparation, transformation, regeneration and selection of the transformants were carried out largely as described by HUNTER et al. (1985). After transformation, cells were allowed to regenerate on solid R2 agar (HUNTER 1985) for 24 h. Plates were then overlaid with 1 ml thiostrepton solution (200 µg ml⁻¹) to select the transformants. Transformants were maintained on R2 plates at 4 °C or in LB medium containing 15% glycerol at –70 °C. Thiostrepton was added to both media (200 µg ml⁻¹). Transformants were screened for endoglucanase activity by growing them as shaken cultures in 500 µl LB containing 200 µg thiostrepton ml⁻¹ at 30 °C for 2 days, and culture supernatants were tested on agar plates containing 0.5% (w/v) CM-cellulose; endoglucanase activities were visualized with Congo red staining after 30 min incubation at 50 °C.

In order to identify the isoenzyme patterns of the cellulase positive S. lividans transformants, supernatants of two-day-old cultures grown in LB were collected by centrifugation at 5000 × g for 15 min and concentrated with a Vivaspin 4 concentrator (Vivascience). Proteins were separated by 8% SDS-PAGE using standard protocols (LAEMMLI 1970). For zymography, we used the method described earlier (BÉKI et al. 2003) and gels were stained with 1% (w/v) Congo red solution.

PCR identification of T. fusca endoglucanases in S. lividans transformants: Oligonucleotide primers based on known T. fusca endoglucanase sequences were designed (Table 1) and used for PCR-identification of the cloned sequences. PCR was carried out in 50 µl reaction mixture containing 1.5 mM MgCl2, 100 µM of each dNTP, 0.5 µM of each primer of a pair, 10 ng DNA purified from cellulase positive S. lividans transformants and 1 U of Taq polymerase (PROMEGA). Amplifications were performed in a PERKIN-ELMER DNA thermal cycler by using the following program: one cycle of 95 °C for 2 min, 30 cycles of 94 °C for 1 min, 55 °C for 1 min, 72 °C for 1 min, one cycle of 72 °C for 10 min, and storage at 4 °C.
The PCR products were separated by electrophoresis in 0.8% (w/v) agarose gel, stained with ethidium bromide and visualized under UV light.

Cloning of the new endoglucanase gene: One cellulase-positive S. lividans transformant, named E61, yielded no PCR-fragment with any of the primers listed in Table 1. This transformant was presumed to harbour a new endoglucanase gene from T. fusca. Plasmid DNA purified from the transformant was digested with HindIII and a 5.2 kb fragment was ligated into pBluescript KS+, yielding the vector pCELG. This vector was used to transform E. coli DH5α according to standard protocols (SAMBROOK et al. 1989). Plasmid DNA was purified from positive transformants for subcloning and sequencing by a plasmid miniprep kit (Qiagen Qiaprep Spin Miniprep Kit). DNA sequencing was performed with the fmol sequencing system (PROMEGA) using both M13 and sequence-specific primers. Sequence analysis and comparisons were carried out by using the DNA Inspector IIe (TEXTCO) program and GCG software (DEVEREUX et al. 1984).

Domain structure and phylogenetic analysis of the new endoglucanase: Total and codon position G + C content determination of the cellulases of T. fusca were made by GCG software. The CAZY internet server (http://afmb.cnrs-mrs.fr/∼pedro/CAZY) was used to assign the new enzyme into the appropriate glycosyl hydrolase (GH) family (HENRISSAT and BAIROCH 1996). The Pfam site (http://www.sanger.ac.uk/Software/Pfam/) was used to determine the domain structure and composition (BATEMAN et al. 2002). For phylogenetic analysis, sequence data were imported from the Swissprot database. Multiple alignments and phylogenetic analysis were performed using the program GROWTREE from the GCG sequence analysis software package, by the Neighbour-joining method. The nucleotide sequence of the Tf cel5B gene has been deposited in the GenBank data base under accession number AY298814.

Isolation and purification of the recombinant enzyme: LB medium (200 ml in 1000 ml ERLENMEYER flasks) was inoculated with spores (at a final concentration of 10⁴ spores ml⁻¹) from a 5-day-old culture of transformant E61 grown on MN300 agar containing 1% (w/v) sucrose and incubated in a rotary shaker at 200 r.p.m. at 30 °C for 2 days. Culture supernatant was filtered through glass wool and centrifuged at 5000 × g for 15 min at room temperature; phenylmethylsulfonyl fluoride (0.1 mM final concentration) was added to prevent proteolysis. The supernatant, concentrated with Vivaspin (Vivascience) and redissolved in Tris-HCl buffer (50 mM, pH 7.4), was purified with FPLC (Pharmacia) by cellulose affinity chromatography on a microcrystalline cellulose column. Linear gradient elution was performed using NaCl (1 M NaCl, 50 mM Tris-HCl, pH 7.4) and then SDS (0.1%, w/v, 50 mM Tris-HCl, pH 7.4) solutions. Fractions were tested for protein homogeneity by standard SDS-PAGE and for activity using CM-cellulose as substrate.


Substrate specificity and mode of action of Cel5B: Hydrolytic activity was assayed on low viscosity CM-cellulose, MN300 cellulose, Avicel, oat-spelt xylan, birch wood xylan, LBG-mannan, lichenin, pustulan and laminarin as polysaccharide substrates, and pNP-cellobioside, -glucoside, -xyloside, and -mannoside as oligosaccharide substrates at 70 °C for 20 min. The reaction mixture contained 0.25 µg purified Cel5B enzyme and 10 mg of each polysaccharide or 20 mM of each pNP-glycoside, respectively, in 1 ml phosphate buffer (0.1 M, pH 7.0). When polysaccharide substrates were used, enzyme activities were determined by measuring the reducing sugars by the dinitrosalicylic acid assay (MILLER 1959): one unit (U) of activity was equivalent to the amount of enzyme producing 1 µM of reducing sugar per minute. In activity tests on pNP-substrates, the reaction was terminated by adding an equal volume of borate buffer (0.2 M, pH 10.0) to the reaction mixture and the liberated p-nitrophenol was measured at 400 nm: the amount of enzyme that released one nmol of nitrophenol per minute was calculated as one unit. The mode of action of Cel5B was demonstrated by viscosimetry according to IRWIN et al. (1993). Purified Cel5B (0.25 µg) was added to 10 ml 0.2% (w/v) CM-cellulose solution and the incubation time needed for a 50% decrease in relative viscosity of the solution was recorded. All activity measurements were repeated three times. Protein concentrations were determined as described by BRADFORD (1976) by using bovine serum albumin as the standard.

Enzyme characterization – pH and temperature relationships: Endoglucanase activity of the recombinant enzyme was determined by using RBB-CM-cellulose (Remazol brilliant blue-dyed cellulose, LÖEWE Biochemica) substrate, according to the manufacturer’s recommendations. Ten µl of concentrated enzyme solution was incubated with 200 µl substrate (20 mg RBB-CM-cellulose ml⁻¹) in 590 µl of buffer at the appropriate temperature for 15 min. The rate of hydrolysis was determined by measuring the released dye at 600 nm and activities were expressed as relative activity. The effect of pH on the activity of the recombinant enzyme was determined over the range between 4.5 and 11.0 with increments of 0.5 pH-units by using sodium succinate, potassium phosphate and glycyl-glycine buffer (50 mM of each, containing 0.1% bovine serum albumin and 150 mM NaCl) between pH 4.5–6.0, 6.0–8.0 and 8.0–11.0, respectively, at 70 °C. The thermal optimum of the enzyme activity was determined at different temperatures in the range from 20 to 80 °C. Thermostability of the enzyme was determined by incubating the substrate-free reaction mixture (10 µl enzyme in 590 µl 100 mM phosphate buffer pH 7.0) at 60, 66, 70 and 77 °C for 48 h. Aliquots of 600 µl were taken at 1, 2, 4, 8, 16, 24, 36 and 48 h, pre-incubated at 70 °C for 10 min and subjected to activity assay on CM-cellulose at the same temperature for 15 min.
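
The unit definitions and relative-activity readouts used in these assays reduce to simple arithmetic. The following minimal Python sketch illustrates one way to express them; the function names and all numbers in the usage note are hypothetical and are not taken from the paper, and the reducing-sugar unit is assumed here to mean 1 µmol of product released per minute.

```python
from typing import List, Sequence


def activity_units(product_micromol: float, minutes: float) -> float:
    """Units of activity, ASSUMING 1 U = 1 µmol of product released per minute."""
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return product_micromol / minutes


def specific_activity(units: float, protein_mg: float) -> float:
    """Units per mg of protein (protein quantified separately, e.g. by Bradford)."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return units / protein_mg


def relative_activity(readings: Sequence[float]) -> List[float]:
    """Express each reading as a percentage of the highest reading in the series."""
    top = max(readings)
    if top <= 0:
        raise ValueError("at least one reading must be positive")
    return [100.0 * r / top for r in readings]
```

With hypothetical inputs, `activity_units(3.0, 20)` gives 0.15 U, and `relative_activity([1, 2, 4])` expresses a pH or temperature series as 25%, 50% and 100% of its maximum.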

Results

Cloning of a new endoglucanase gene

Altogether 2000 thiostrepton resistant S. lividans transformants were collected and cultured in LB medium. Screening of the culture supernatants for endoglucanase activities on CM-cellulose plates resulted in the identification of 16 endoglucanase positive clones. Based on their zymogram patterns, these 16 clones could be assigned into 5 groups (Fig. 1). Representatives of each group were subjected to PCR analyses using the specific oligonucleotide primers (Table 1) we designed on the basis of the other endoglucanase sequences described previously in T. fusca. Transformants K1, B33, A17 and A24 were found to contain known T. fusca genes, whereas transformant E61, which produced a ∼67 kDa endoglucanase positive band as determined by zymogram analysis, yielded no PCR product with any of these specific primers, indicating that this transformant contained a new endoglucanase encoding gene. Plasmid DNA prepared from E61 was digested with HindIII and a 5.2 kb fragment was cloned into pBluescript, yielding the vector pCELG. Pilot sequencing confirmed that this 5.2 kb DNA fragment contained an unknown endoglucanase gene. Based on restriction mapping information a 1.4 kb HindIII – XhoI and a 2.7 kb BamHI – SacI fragment with a 295 bp overlap were subcloned and used for sequencing.


Fig. 1 Zymogram analysis of Thermobifida fusca TM51 endoglucanases expressed in Streptomyces lividans TK24. Lane 1: non-transformed host, Streptomyces lividans TK24; lanes 2–12: A17, A24, A77, A95, B33, E61, F20, G41, K1, K34, and Q7, endoglucanase positive transformants of S. lividans TK24; lane 13: Thermobifida fusca TM51. Proteins were separated in 10% (w/v) polyacrylamide gel containing 0.1% (w/v) CMC and stained for activity with Congo red. Note: no endoglucanase activity was induced in S. lividans on LB medium.

Sequence analysis

Table 1 Specific primers for PCR-based identification of known cellulases of T. fusca

Cellulases          Sense primer                  Antisense primer            DNA-fragments
Cel5A (L01577)      5′-CACCGGGATGACCTCTTGG-3′     5′-GAGTCGCGCAGGCCG-3′       911 bp
Cel6A (M73321)      5′-CAGTCACCAGCGCCTACC-3′      5′-GCCGTTGCCGTTGCG-3′       409 bp
Cel6B (U189785)     5′-GCCAACGTCACCATCAC-3′       5′-GGCGTCGATGTAGTTGTA-3′    912 bp
Cel9B (L20094)      5′-GCTCCGGCACCGCACCC-3′       5′-GAAGAGGTCATCCCGGTG-3′    859 bp
Cel48A (AF144563)   5′-CCGAAGAGGTCATCCCGGTG-3′    5′-CGGGGTCCTTGATCTTC-3′     1039 bp

Sequence analysis of the cloned DNA revealed an ORF of 1851 nucleotides, encoding a putative protein of 616 aa, with a calculated molecular mass of 67 665 Da and an estimated pI of 4.22 (Fig. 2). The ORF had a G+C content of 65.47%, not significantly different from the overall 67% G+C content determined for the whole genome of T. fusca (KUKOLYA et al. 2002). A potential ribosome binding site (AGGA) was identified 22 nucleotides upstream of the translation initiation codon. All cellulase genes from T. fusca contain a 14-bp inverted repeat sequence, TGGGAGCGCTCCCA, in their 5′ regulatory regions, which serves as the binding site for CelR, a transcriptional regulator protein (SPIRIDONOV and WILSON 1998) providing the coordinated regulation of these enzymes. In this new, putative endoglucanase encoding gene an imperfect copy of the CelR binding sequence, CGGGAGCGCACCCT, was identified 67 nucleotides before the translational start codon.

The deduced amino acid sequence of the putative endoglucanase was compared with other protein sequences in GenBank (Fig. 3); it showed 56, 56 and 54% identity over 423, 424 and 482 aa stretches, respectively, with THAN (a family 5 β-glucanase from the thermophilic anaerobe, NA10), CELB (another family 5 endoglucanase from Caldicellulosiruptor saccharolyticus) and CEND (again a family 5 endoglucanase from Cellulomonas fimi). These findings indicated that the new endoglucanase identified in the present study belongs to family GH5 (HENRISSAT and BAIROCH 1996). According to a more recent nomenclature, proposed by HENRISSAT et al. (1998), the correct name of this new endoglucanase is Tf Cel5B (the corresponding gene is Tf cel5B), as another GH5 endoglucanase, Cel5A (E5), has already been described from T. fusca (LAO et al. 1991). Tf Cel5B could also be named E7, according to a former nomenclature used for T. fusca cellulases (LAO et al. 1991, JUNG et al. 1993).
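
The comparison between the consensus CelR binding site and its imperfect copy upstream of cel5B can be reproduced with a few lines of code. The Python sketch below is only an illustration, not the authors' procedure: the function names are hypothetical, the two 14-bp sites are those quoted in the text, and the 120-nt upstream string is transcribed from the first row of Fig. 2 (positions −120 to −1), so any transcription error in the figure would carry over.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")

CELR_CONSENSUS = "TGGGAGCGCTCCCA"  # 14-bp inverted repeat quoted in the text
CEL5B_SITE = "CGGGAGCGCACCCT"      # imperfect copy upstream of cel5B

# Upstream region of cel5B as printed in Fig. 2 (positions -120 to -1).
UPSTREAM = (
    "AATCTTCTCGTGGCTCCAGTGAAGTTTTCTCCTTCCATTCGGGAGCGCACCCTGCGCTTT"
    "TGGTCCCCCCACGCTTCTGCGGAGGCGTGTTCCCCGAAAGGACTACCGGCAGTGGGCAGA"
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq: str) -> bool:
    """A perfect inverted repeat equals its own reverse complement."""
    return seq == reverse_complement(seq)


def mismatch_positions(a: str, b: str) -> List[int]:
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def gap_to_start_codon(upstream: str, site: str) -> int:
    """Number of nucleotides between the end of the site and the start codon."""
    idx = upstream.find(site)
    if idx < 0:
        raise ValueError("site not found in upstream region")
    return len(upstream) - (idx + len(site))
```

Under these assumptions the consensus is a perfect inverted repeat, the cel5B copy differs from it at positions 1, 10 and 14, and `gap_to_start_codon(UPSTREAM, CEL5B_SITE)` returns 67, in agreement with the distance reported above.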

Domain analysis

Cel5B has a 42 aa signal peptide preceding the catalytic domain, which is similar to other actinomycete signal sequences both in size and aa composition. The signal peptide cleavage site was predicted between aa 30 and 31 by the SignalP program (NIELSEN et al. 1997). Further computer searches for protein homologies confirmed that Cel5B contains a family 5 catalytic domain located between aa 43 and 385 and a 78 aa long CBD III type cellulose binding domain, starting from aa 476. Alignments of the Cel5B catalytic domain to similar regions of other cellulases revealed significant homologies. There was a 67% identity (over a 389 aa stretch) with an endoglucanase from Cellulomonas fimi, a 66% identity (in a 392 aa stretch) with Cel5A of Cellulomonas flavigena and a 60% identity (in a 342 aa stretch) with an endoglucanase from Caldicellulosiruptor saccharolyticus. On the basis of sequence homologies with other members of family 5 glycosyl hydrolases, two conserved catalytic residues, Glu201 and Glu344, could be predicted as the putative proton donor and nucleophile, respectively. A comparison of the putative cellulose binding domain (CBD) of Cel5B with

Fig. 2 Nucleotide and deduced amino acid sequences of Cel5B. Sequence numbering begins with the putative ATG codon. The putative CelR binding site is marked by bold, italics. Asterisks indicate the putative catalytic domain regions. The dashed lines indicate the putative substrate binding regions. The boxed sequences correspond to unique triplicated motifs in the linker region


-120 AATCTTCTCGTGGCTCCAGTGAAGTTTTCTCCTTCCATTCGGGAGCGCACCCTGCGCTTTTGGTCCCCCCACGCTTCTGCGGAGGCGTGTTCCCCGAAAGGACTACCGGCAGTGGGCAGA

1 ATGACCCCCCTGACGCGACGACTGCGTGCCGGAGCAGCCGCGATCGCGATCGGAGCGTCTGCTCTTATCCCCCTGACTTCCTCCCCCGCAGCCGCCTCAGGCACCGCTGACTGGCTGCAC

M T P L T R R L R A G A A A I A I G A S A L I P L T S S P A A A S G T A D W L H 40

121 ACGGACGGCAACCGGATCGTGGACTCCGCGGGCAACGAGGTGTGGCTCACCGGAGCCAACTGGTTCGGCTTCAACACCAGCGAACGGATGTTCCACGGGCTGTGGGCCGCCAACATCGAG

T D G N R I V D S A G N E V W L T G A N W F G F N T S E R M F H G L W A A N I E 80

***************************************************************************************************************

241 GACATCACCAGTGCGATGGCCGAGCGCGGCATCAACATGGTGCGTGTCCCCATCAGCACCCAACTGCTGTTGGAGTGGAAGAACGGCCAGGCCGGACCGAGTGGAGTCAACGAGTACGTC

D I T S A M A E R G I N M V R V P I S T Q L L L E W K N G Q A G P S G V N E Y V 120

************************************************************************************************************************

361 AACCCCGAACTGGCGGGGATGAACACCCTCGAAGTGTTCGACTACTGGCTGCAACTGTGCGAAGAGTACGGCCTCAAAGTCATGCTTGACGTGCACAGCGCGGAGGCCGACAACTCCGGG

N P E L A G M N T L E V F D Y W L Q L C E E Y G L K V M L D V H S A E A D N S G 160

************************************************************************************************************************

481 CACTACTACCCGGTCTGGTACAAGGGCGATATCACCACCGAGGACTTCTACACGGCCTGGGAGTGGGTCACCGAGCGGTACAAGAACAACGACACCATCGTCGCCGCAGACATCAAGAAC

H Y Y P V W Y K G D I T T E D F Y T A W E W V T E R Y K N N D T I V A A D I K N 200

************************************************************************************************************************

601 GAGCCCCACGGCAAAGCCAACGAGACCCCGCGCGCCAAGTGGGACGGCTCCACGGACATCGACAACTTCAAGCACGTCTGCGAGACCGCCGGTAAGCGCATCCTCGCGATCAACCCGAAC

E P H G K A N E T P R A K W D G S T D I D N F K H V C E T A G K R I L A I N P N 240

************************************************************************************************************************

721 ATGCTCATCCTGTGCGAGGGGATCGAGATCTACCCCAAGGATGGGCAGGACTGGTCCTCCACCGATGGGCGGGACTACTACTCCACCTGGTGGGGCGGCAACCTGCGCGGCGTCGCCGAC

M L I L C E G I E I Y P K D G Q D W S S T D G R D Y Y S T W W G G N L R G V A D 280

************************************************************************************************************************

841 CACCCCGTCGACCTGGGCGCCCACCAGGACCAGTTGGTCTACTCGCCGCACGACTACGGTCCCAGCGTGTTCGAGCAGCCCTGGTTCGAAGGCGAGTGGAACCGGCAGACCCTGACCGAG

H P V D L G A H Q D Q L V Y S P H D Y G P S V F E Q P W F E G E W N R Q T L T E 320

************************************************************************************************************************

961 GACGTGTGGCGTCCCAACTGGCTCTACATCCACGAAGACGACATCGCTCCGCTGCTCATCGGCGAGTGGGGCGGCTTCCTGGATGGGGGCGACAACGAGAAGTGGATGACGGCGCTGCGC

D V W R P N W L Y I H E D D I A P L L I G E W G G F L D G G D N E K W M T A L R 360

************************************************************************************************************************

1081 TCGCTCATCATCGACGAGAAGATGCACCACACCTTCTGGGCGCTGAACCCGAACTCGGGTGACACCGGCGGCCTGCTCAACTACGACTGGACAACCTGGGACGAGGCCAAGTACGCGTTC

S L I I D E K M H H T F W A L N P N S G D T G G L L N Y D W T T W D E A K Y A F 400

************************************************************************

1201 CTCAAGCCTGCGTTGTGGCAGGACGCCAACGGCAAGTTCGTCGGCCTGGACCACGATGTGCCGCTGGGCGGGGTGGGCTCCACCACCGGTGTGTCGCTGAACCAGTACTACGGCGGGGGC

L K P A L W Q D A N G K F V G L D H D V P L G G V G S T T G V S L N Q Y Y G G G 440

1321 GGACCGAGCCAGCCGCCCACGGAACCCACGGAGCCGCCGACGGAGCCCACGGAGCCGCCGACGGAGCCGACGGAGCCGCCCGCCAACCCCACGGGTGCGCTCGAGGTCTACTACCGCAAC

G P S Q P P T E P T E P P T E P T E P P T E P T E P P A N P T G A L E V Y Y R N 480

---------------

1441 AACAGTCTTGCGGCCGACGACAGCCAGATCGCGCCGGGGCTGCGTCTGGTCAACACCGGAAGCAGCACGGTCGACCTGGCTGACGTGGAAATCCACTACTACTTCACCAACGAGCCCGGC

N S L A A D D S Q I A P G L R L V N T G S S T V D L A D V E I H Y Y F T N E P G 520

------------------------------------------------------------------------------------------------------------------------

1561 GGTACCCTCCAGTTCACCTGCGACTGGGCTCAAGTGGGCTGCGCCAACGTCAACGCGTCCTTCACGTCGCTGTCGGCTCCGGGCGCCGACACCTCCCTGGTGCTCACCCTCAGCGGCAGC

G T L Q F T C D W A Q V G C A N V N A S F T S L S A P G A D T S L V L T L S G S 560

------------------------------------------------------------------------------------------------------

1681 CTCGCTCCCGGGGCGAGCACCGAGCTCCAAGGCCGGATCCACACCGCGAACTGGGCGAACTTCGACGAAAGCGACGACTACAGCCGCGGCACCAACACTGACTGGGAGCTCAGCGAGGTG

L A P G A S T E L Q G R I H T A N W A N F D E S D D Y S R G T N T D W E L S E V 600

1801 ATCACCGCATACCTCGGCGGCACCCTGGTCTGGGGTACGCCTCCCGCCTAA

I T A Y L G G T L V W G T P P A *
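
The reading frame shown in Fig. 2 can be spot-checked by translating individual rows with the standard genetic code. The Python sketch below is a hedged illustration rather than the software used in the study (the authors report using GCG and DNA Inspector): the helper names are hypothetical, and the two nucleotide rows are transcribed from the figure (nucleotides 1–120 and 1801–1851).

```python
# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Rows transcribed from Fig. 2 (first and last rows of the ORF).
FIRST_ROW = (
    "ATGACCCCCCTGACGCGACGACTGCGTGCCGGAGCAGCCGCGATCGCGATCGGAGCGTCT"
    "GCTCTTATCCCCCTGACTTCCTCCCCCGCAGCCGCCTCAGGCACCGCTGACTGGCTGCAC"
)
LAST_ROW = "ATCACCGCATACCTCGGCGGCACCCTGGTCTGGGGTACGCCTCCCGCCTAA"


def translate(dna: str) -> str:
    """Translate in frame 1; a stop codon is rendered as '*'."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def gc_percent(dna: str) -> float:
    """G+C content of a sequence, in percent."""
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def protein_length(orf_nucleotides: int) -> int:
    """Residues encoded by an ORF whose length includes the stop codon."""
    return orf_nucleotides // 3 - 1
```

Translating `FIRST_ROW` and `LAST_ROW` reproduces the residue rows printed beneath them in the figure, and `protein_length(1851)` gives the reported 616 aa; `gc_percent` would need the complete ORF to be compared against the reported 65.47%.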


CEL5 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~MTPLTRRLRAGAAAIA

THAN 460 FKSGAGQLQPGKDTGEIQIRFNKSDWSNYNQGNDWSWIQSMTSYGENMKVTAYIDGVLVWGQEPTGATAAPIATPTPTPAPTATPTPTSTPTPTPTAAPT

CELB 507 FKSGAGQLQPGKDTGEIQIRFNKSDWSNYNQGNDWSWLQSMTSYGENEKVTAYIDGVLVWGQEPSGATPA....PTMTVAPTATPTPTLSPTVTPTPAPT

CEND 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~VHSASRTRARTRVRTAVSGLL

↓ ↓↓↓ ****************************************************************

CEL5 17 IGASALIPLT..SSPAAAS.......GTADWLHTDGNRIVDSAGNEVWLTGANWFGFNT.SERMFHGLWAANIEDITSAMAERGINIVRVPISTQLLLEW

THAN 560 ITPTPTITATPAPTPAPTSTPAYLDDTNDDWLYVSGNKIVDKDGRPVWLTGVNWFGYNT.GTNVFDGVWSCNLKSTLAEIANRGFNLLRVPISAELILNW

CELB 603 QTAIPTPTLTPNPTPT.SSIP...DDTNDDWLYVSGNKIVDKDGRPVWLTGINWFGYNT.GTNVFDGVWSCNLKDTLAEIANRGFNLLRVPISAELILNW

CEND 22 AATVLAAPLTLVAAPAQAA.......TGDDWLHVEGNTIVDSTGKEAILSGVNWFGFNA.SERVFHGLWSGNITQITQQMAQRGINVVRVPVSTQLLLEW

****************************************************************************************************

CEL5 107 KNGQAGPSG.VNEYVNPELAGMNTLEVFDYWLQLCEEYGLKVMLDVHSAEADNSGHYYPVWYKGDITTEDFYTAWEWVTERYKNNDTIVAADIKNEPHGK

THAN 659 SQGIYPKPN.INYYVNPELEGKNSLEVFDIVVQTCKEVGLKIMLDIHSIKTDAMGHIYPVWYDEKFTPEDFYKACEWITNRYKNDDTIIAFDLKNEPHGK

CELB 698 SQGIYPKPN.INYYVNPELEGKNSLEVFDIVVQTCKEVGLKIMLDIHSIKTDAMGHIYPVWYDEKFTPEDFYKACEWITNRYKNDDTIIAFDLKNEPHGK

CEND 114 KAGTFLKPN.VNTYANPELEGKNSLQIFEYWLTLCQKYGIKVFLDVHSAEADNSGHVYNMWWKGDITTEDVYEGWEWAATRWKDDDTIVGADIKNEPHGT

****************************************************************************************************

CEL5 206 A.NETPRAKWDGSTDIDNFKHVCETAGKRILAINPNMLILCEGIEIYPKDGQDWSSTDGRDYYSTWWGGNLRGVADHPVDLGAHQDQLVYSPHDYGPSVF

THAN 758 PWQDTTFAKWDNSTDINNWKYAAETCAKRILNINPNLLIVIEGIEAYPKDDVTWTSKSSSDYYSTWWGGNLRGVRKYPINLGKYQNKVVYSPHDYGPSVY

CELB 797 PWQDTTFAKWDNSTDINNWKYAAETCAKRILNINPNLLIVIEGIEAYPKDDVTWTSKSSSDYYSTWWGGNLRGVRKYPINLGKYQNKVVYSPHDYGPSVY

CEND 213 Q.GSTERAKWDGTTDKDNFKHFAETASKKILAINPNWLVFVEGVEIYPKPGVPWTSTGLTDYYGTWWGGNLRGVRDHPIDLGAHQDQLVYSPHDYGPLVF

**********************************************************************************

CEL5 305 EQPWFEGEWNRQTLTEDVWRPNWLYIHEDDIAPLLIGEWGGFLDGGD..NEKWMTALRSLIIDEKLHHTFWALNPNSGDTGGLLNYDWTTWDEAKYA.FL

THAN 858 QQPWFYPGFTKESLLQDCWRPNWAYIMEENIAPLLIGEWGGHLDGAD..NEKWMKYLRDYIIENHIHHTFWCFNANSGDTGGLVGYDFTTWDEKKYS.FL

CELB 897 QQPWFYPGFTKESLLQDCWRPNWAYIMEENIAPLLIGEWGGHLDGAD..NEKWMKYLRDYIIENHIHHTFWCFNANSGDTGGLVGYDFTTWDEKKYS.FL

CEND 312 DQKWFQKDFDKASLTADVWGPNWLFIHDEDIAPLLIGEWGGRL.GQDPRQDKWMAALRDLVAERRLSQTFWVLNPNSGDTGGLLLDDWKTWDEVKYSTML

CEL5 402 KPALWQDANGKFVGLDHDVPLGGVGSTT....GVSLNQYYGGGGPSQPPTEPT....EPPTEPTEPPT..EPTEPPANPTGALEVYYRNNSLAADDSQIA

THAN 955 KPALWQDSQGRFVGLDHKRPLGTNGK......NINITTYYNNNEPEPVPATK~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

CELB 994 KPALWQDSQGRFVGLDHKRPLGTNGK......NINITTYYNNNEPEPVPASK~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

CEND 411 EPTLWKH.GGKYVGLDHQVPLGGVGSTT....GTSISQVGGGTPDTTAPTAPTGLRAGTPTASTVPLTWSASTDTGGSGVAGYEV.YRGTTLVGTTTA..

Fig. 3 Alignment of Cel5B of Thermobifida fusca with THAN of the thermophilic anaerobe NA10, CELB of Caldicellulosiruptor saccharolyticus and CEND of Cellulomonas fimi. Identical and similar residues are indicated by black and grey backgrounds, respectively. Arrow indicates the predicted signal peptide cleavage site. Asterisks indicate the putative catalytic domain regions
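As a rough illustration of how the identity shading of such an alignment could be quantified, the sketch below computes percent identity over aligned columns. The function name `percent_identity` and the decision to skip gap columns are assumptions made for this example and are not the procedure of the alignment program used for Fig. 3. The two strings are the final 20 columns of the CEL5 row beginning at residue 206 and of the CEND row beginning at residue 213.

```python
# Characters treated as alignment gaps (assumption: "." and "~" as used in Fig. 3).
GAP_CHARS = {".", "~", "-"}


def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of gap-free aligned columns in which both rows carry the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(row_a, row_b):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # skip columns with a gap in either row
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Excerpts copied from the Fig. 3 alignment (last 20 columns of the two rows).
cel5_tail = "LGAHQDQLVYSPHDYGPSVF"
cend_tail = "LGAHQDQLVYSPHDYGPLVF"

print(f"{percent_identity(cel5_tail, cend_tail):.1f}% identity over the excerpt")
```

The excerpt is far too short to say anything about the overall relatedness of the two proteins; it only shows the column-wise bookkeeping.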


Fig. 4 Domain organization of cellulases described in T. fusca: domain structure of the known cellulases of T. fusca (Cel5A, Cel6A, Cel6B, Cel9A, Cel9B and Cel48A) and the newly described endoglucanase Cel5B. Colour code: signal peptide − brown, catalytic domain − green, cellulose binding module II (CBM2) − purple, CBM3 − red, CBM6 − blue, N-terminal Ig-like domain (CelD-N) − pink, fibronectin type III domain (FN3) − yellow, polycystic kidney disease protein like domain (PKD) − orange

known CBD sequences from other bacterial β-1,4-glucanases also revealed striking similarities. Identities of 41, 46 and 46% were found between the CBD of Cel5B and those of the putative endoglucanases of Bacillus subtilis (KUNST et al. 1997) and Streptomyces avermitilis (OMURA et al. 2001) and the β-xylanase of Caldibacillus cellulovorans (SUNNA et al. 2000), respectively. A putative linker region separating the catalytic and the cellulose binding domains was identified between aa 385 and 476. The sequence of this linker is quite different from those of the linkers found in the other glucanases mentioned above; it contains multiple Pro, Ser and Thr residues, as well as a unique triplicate PPTEPTE motif. The length of this region is 91 aa, considerably exceeding the length of the linker sequences of the other T. fusca cellulases. Fig. 4 compares the domain organization of Cel5B with that of the other cellulases known from T. fusca. As can be seen, the domains of enzymes that belong to the same GH family are arranged in opposite orientations.
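The repeated motif in the linker can be checked with a few lines of code. The following is a minimal sketch, assuming the Pro/Thr-rich excerpt is copied from the CEL5 row of the Fig. 3 alignment (with "." marking alignment gaps); the helper names `strip_gaps` and `motif_positions` are hypothetical and serve only this illustration.

```python
# Excerpt of the CEL5 row of Fig. 3 covering the Pro/Thr-rich stretch.
aligned_excerpt = "GGGGPSQPPTEPT....EPPTEPTEPPT..EPTEPPANPTGALEVYY"
MOTIF = "PPTEPTE"


def strip_gaps(aligned: str) -> str:
    """Remove the gap characters used in the alignment figure."""
    return aligned.replace(".", "").replace("~", "")


def motif_positions(sequence: str, motif: str) -> list:
    """Return the 0-based start positions of every (possibly overlapping) motif hit."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


linker_excerpt = strip_gaps(aligned_excerpt)
hits = motif_positions(linker_excerpt, MOTIF)

# Fraction of Pro, Ser and Thr residues in the excerpt.
pst_fraction = sum(linker_excerpt.count(res) for res in "PST") / len(linker_excerpt)

# Linker length from the coordinates given in the text (aa 385 to 476).
linker_length = 476 - 385

print(f"{MOTIF} found {len(hits)} times at offsets {hits}")
print(f"Pro/Ser/Thr fraction of excerpt: {pst_fraction:.2f}")
print(f"Linker length: {linker_length} aa")
```

In this excerpt the three copies of the motif follow one another directly, each starting seven residues after the previous one.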

Phylogenetic interrelationships of the new endoglucanase

We compared the G+C content of the different cellulase encoding genes of T. fusca to assess their evolutionary relatedness. The G+C contents of these seven genes ranged between 65.47 and 67.19%, which correlates well with the average 66.5% G+C content of the genome of this actinomycete (KUKOLYA et al. 2002). Although cel5B was found to have the lowest G+C content, amounting to only 65.47%, this small divergence gives no indication of a distant origin for this gene. The third codon position G+C values were also assessed for evidence of potential horizontal transfers of these cellulase genes. The mean third position G+C content of the seven genes was 90.5 ± 3.05% and again few differences were observed among the T. fusca cellulases, with the exception of cel9A, which showed a significantly lower G+C value of 84.9% at the third position.

Alignments of the catalytic domain sequences of the seven cellulases of T. fusca and 35 other microbial glycosyl hydrolases (GHs) belonging to families GH5, GH6, GH9 and GH48 revealed several interesting relationships. Fig. 5 shows the phylogenetic tree constructed from this multiple alignment. Of the two GH5 endoglucanases of T. fusca, Cel5B clustered together with GUNB of Cellulomonas fimi and GUNG and GUNB of Clostridium thermocellum, whereas Cel5A fell into a separate branch containing proteins like GUNA of Streptomyces lividans, GUN2 of Butyrivibrio fibrisolvens and GUN4 of Ruminococcus albus. A similar separation was observed for the two GH6 cellulases. Cel6A showed a definite affinity towards GUNA of Microbispora bispora, GUNA of Cellulomonas fimi and Q53488 of Micromonospora cellulolyticum, while Cel6B formed a group with protein O86730 of S. coelicolor and GUXA of Cellulomonas fimi. The family GH9 enzymes of T. fusca also showed divergence: Cel9A clustered together with GUNB of Cellulomonas fimi and GUNF of Clostridium thermocellum, whereas Cel9B clustered with GUNC of Cellulomonas fimi, GUN1 of Streptomyces reticuli and GUND of Clostridium thermocellum. Cel48A, which still lacks a counterpart in T. fusca, grouped together with enzymes like GUXB of Cellulomonas fimi, Q9XCD of S. coelicolor and GUNS of Clostridium thermocellum.

Fig. 5 Phylogenetic relationships among catalytic domains of cellulases from different microorganisms. Cellulases from Thermobifida fusca are indicated in bold italics. The dendrogram was constructed from the matrix of correlation distances by the program GROWTREE from the GCG sequence analysis software package, using the neighbour-joining method. SwissProt accession numbers and source organisms are: P54583 Acidothermus cellulolyticus; P22541 Butyrivibrio fibrisolvens; P07984, P14090, P26225, P50400, P50401, P50899 Cellulomonas fimi; Q02934, Q05332, Q59325, Q9L3J8, P04956, P38686 Clostridium thermocellum; P50900 Clostridium stercorarium; Q59394 Erwinia carotovora; P26414 Microbispora bispora; Q53488 Micromonospora cellulolyticum; Q50901 Myxococcus xanthus; Q8KKF7 Paenibacillus sp.; P10476 Pseudomonas fluorescens; Q07940 Ruminococcus albus; O86728, O86730 Streptomyces coelicolor; P27035 Streptomyces lividans; Q05156 Streptomyces reticuli; AY298814, P26221, P26222, Q01786, Q08166, Q60029, Q9XCD4 Thermobifida fusca; Q8P514 Xanthomonas campestris; Q9PDW2 Xylella fastidiosa
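The overall and third codon position G+C values compared in this section can be computed as in the following minimal sketch. The function names `gc_percent` and `gc3_percent` are hypothetical, and the calculation assumes an in-frame coding sequence whose length is a multiple of three. The demonstration fragment is the last 51 nt of the cel5B sequence listing (nt 1801 to the stop codon); it is much too short to reproduce the whole-gene values reported above.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc3_percent(cds: str) -> float:
    """Percentage of G and C bases at the third position of each codon."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    third_positions = cds[2::3]  # every third base, starting at codon position 3
    return gc_percent(third_positions)


# Final 17 codons of cel5B as given in the sequence listing (illustration only).
cel5b_3prime = "ATCACCGCATACCTCGGCGGCACCCTGGTCTGGGGTACGCCTCCCGCCTAA"

print(f"G+C of fragment:  {gc_percent(cel5b_3prime):.1f}%")
print(f"GC3 of fragment:  {gc3_percent(cel5b_3prime):.1f}%")
```

Applied to complete open reading frames, the same two functions would give the per-gene values that are compared with the genomic average.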

Characterization of the recombinant Cel5B enzyme

Tf cel5B showed no expression in E. coli. However, the recombinant protein was synthesized and secreted by S. lividans TK24, allowing its biochemical characterization. When extracellular proteins from transformant E61 were separated by cellulose affinity chromatography and subjected to SDS-PAGE, a distinct protein band was resolved at ∼67 kDa. The estimated molecular mass of the mature protein agreed well with the deduced value of 67,665 Da calculated from the nt sequence of the Tf cel5B gene. The substrate specificity of the recombinant enzyme was tested on CM-cellulose, MN300 cellulose, Avicel, oat-spelt xylan, birch wood xylan, LBG-mannan, laminarin, lichenin, pustulan and pNP-cellobioside, -glucoside, -xyloside and -mannoside. Cel5B exhibited high activity only on CM-cellulose (121.4 U mg⁻¹), showed low activity towards MN300 (2.6 U mg⁻¹) and Avicel (3.9 U mg⁻¹) and gave no measurable activity on hemicellulosic and oligosaccharide substrates (Table 2).

Table 2 Hydrolytic activity of purified Cel5B on various polysaccharide substrates

Substrate            Hydrolytic activity (U mg⁻¹ enzyme)
CM-cellulose         121.4
Avicel               3.9
MN300 cellulose      2.6
Oat spelt xylan      ND
Birch wood xylan     ND
Mannan               ND
Lichenin             ND
Laminarin            ND
Pustulan             ND

ND: not detected
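To make the preference for the soluble substrate explicit, the activities of Table 2 can be expressed relative to CM-cellulose. The short sketch below does only this normalization; the variable names are chosen for the example, and substrates listed as ND are omitted.

```python
# Specific activities from Table 2 (U per mg enzyme).
activities = {
    "CM-cellulose": 121.4,
    "Avicel": 3.9,
    "MN300 cellulose": 2.6,
}

# Express each activity as a percentage of the CM-cellulose value.
reference = activities["CM-cellulose"]
relative = {substrate: 100.0 * value / reference
            for substrate, value in activities.items()}

for substrate, percent in relative.items():
    print(f"{substrate:16s} {percent:6.1f}% of CM-cellulose activity")
```

On this scale the activities on the two insoluble celluloses amount to only a few percent of that measured on CM-cellulose.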

Viscosimetric assays showed that even a small amount of Cel5B caused a rapid decrease in the relative viscosity of the CM-cellulose solution. When 0.5 µg purified enzyme was added to 10 ml solution containing 200 mg CM-cellulose, less than 9 min of incubation was needed to observe a 50% loss of viscosity, confirming the endo-acting nature of Cel5B. The effect of pH on the CM-cellulase activity of Cel5B was also examined. The pH optimum of Cel5B, determined with CM-cellulose as substrate, was ∼8.2, and the enzyme retained 90% of its maximal activity between pH 7.0 and 8.5 after 15 min incubation at 70 °C (Fig. 6).
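The enzyme loading of the viscosimetric assay can be put in perspective with simple arithmetic. The sketch below uses only the quantities stated above and the specific activity on CM-cellulose from Table 2; treating that specific activity as valid under the viscosimetric conditions is an assumption, so the unit figure is nominal.

```python
# Quantities stated for the viscosimetric assay.
enzyme_ug = 0.5        # purified Cel5B added (micrograms)
volume_ml = 10.0       # assay volume (ml)
substrate_mg = 200.0   # CM-cellulose in the assay (mg)

# Specific activity on CM-cellulose from Table 2 (U per mg enzyme).
specific_activity_u_per_mg = 121.4

# Derived values.
enzyme_conc_ug_per_ml = enzyme_ug / volume_ml
substrate_percent_w_v = substrate_mg / 1000.0 / volume_ml * 100.0   # g per 100 ml
substrate_to_enzyme = substrate_mg * 1000.0 / enzyme_ug             # mass ratio
enzyme_units = specific_activity_u_per_mg * enzyme_ug / 1000.0      # nominal U in assay

print(f"Enzyme concentration:  {enzyme_conc_ug_per_ml:.2f} ug/ml")
print(f"CM-cellulose:          {substrate_percent_w_v:.1f}% (w/v)")
print(f"Substrate : enzyme:    {substrate_to_enzyme:,.0f} : 1 (by mass)")
print(f"Nominal enzyme amount: {enzyme_units:.4f} U")
```

A halving of viscosity at a substrate to enzyme mass ratio of this order is what would be expected from random internal chain cleavage and not from end-wise attack.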


Fig. 6 Effect of temperature and pH on endoglucanase activity of Cel5B. a: effect of temperature (x-axis, temperature in °C, 20 to 90) on relative activity; b: effect of pH (x-axis, pH 4.5 to 10.5) on relative activity

The temperature optimum of Cel5B was found to be ∼77 °C (Fig. 6). However, the enzyme underwent rapid inactivation at this temperature, losing 50% of its original activity after 3 h of incubation (Fig. 7).
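If the thermal inactivation at 77 °C is assumed to follow first-order kinetics (an assumption made here for illustration only; the kinetic order was not determined), the 3 h half-life translates into an inactivation rate constant as sketched below. The function name `residual_activity_percent` is hypothetical.

```python
import math

# Half-life of Cel5B activity at 77 degrees C, as reported in the text (hours).
half_life_h = 3.0

# First-order rate constant k = ln(2) / t_half (ASSUMES first-order decay).
k_per_h = math.log(2) / half_life_h


def residual_activity_percent(time_h: float, k: float) -> float:
    """Residual activity (% of initial) after time_h hours of first-order decay."""
    return 100.0 * math.exp(-k * time_h)


print(f"k = {k_per_h:.3f} per hour")
for t in (1, 2, 4, 8):
    print(f"{t:2d} h: {residual_activity_percent(t, k_per_h):5.1f}% residual activity")
```

Predictions of this kind can be compared with the measured time course; systematic deviations would indicate that a simple first-order model does not describe the inactivation.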


Fig. 7 plot: relative activity (0 to 100) against incubation time (0 to 48 h), with one curve each for 60, 66, 70 and 77 °C

Discussion

Comparative sequence analysis of the E61 clone and biochemical investigations of the expressed protein showed that T. fusca contains a second family GH5 endoglucanase, named Cel5B (E7 according to the former nomenclature), in addition to the Cel5A (E5) enzyme described earlier by LAO et al. (1991).

The GH5 family enzymes are known to possess cellulose 1,4-β-cellobiosidase, β-mannosidase, glucan 1,3-β-glucosidase, licheninase, glucan endo-1,6-β-glucosidase, mannan endo-1,4-β-mannosidase or endo-1,4-β-xylanase activities; therefore we tested a number of different substrates to determine the substrate specificity of Cel5B. It was found to hydrolyze only cellulose and showed no activity on xylan, mannan or other hemicellulosic substrates. According to previous studies, Cel5A, the counterpart of Cel5B, rapidly hydrolyzed methylumbelliferyl-cellobioside, whereas Cel5B was found to have no cellobiosidase activity.

Measuring the viscosity reduction of CM-cellulose-containing solutions is a widely used method to distinguish endo- and exo-acting cellulolytic activity. IRWIN et al. (1993) demonstrated that Cel6B, an exoglucanase of T. fusca, caused no significant reduction of viscosity and Cel9A, the processive endoglucanase of this organism, caused only a slow reduction, whereas adding small amounts of Cel5A and Cel6A, the two previously described true endoglucanases of T. fusca, to a CM-cellulose-containing solution resulted in a rapid and strong decrease of viscosity. In the present experiments, Cel5B was found to decrease viscosity as efficiently as Cel5A and Cel6A, indicating that this protein is also a typical endo-acting glucanase, the third one known to be produced by this thermophilic actinomycete.

The thermostable, alkalitolerant nature of this extracellular enzyme could be explained by the natural habitat of T. fusca. Composted horse manure had a pH around 8.0 and a temperature of 75 °C when sampled for isolating T. fusca TM51.
The enzyme was stable up to 66 °C; at higher temperatures its stability decreased rapidly. Despite this relative instability, the temperature optimum of Cel5B was ∼77 °C (Fig. 6), owing to the stabilizing effect of the CM-cellulose substrate. Similar substrate-mediated stabilization has previously been observed by MAWADZA et al. (2000) in studies on endoglucanases from Bacillus sp. strains CH43 and HR68. This 77 °C temperature optimum of Cel5B falls within the 70–80 °C optimum range of the other endoglucanase, endoxylanase and endomannanase enzymes of T. fusca (IRWIN et al. 1994, HILGE et al. 1998), suggesting that this newly described endoglucanase has similar advantages for industrial utilization to the other thermostable hydrolases from this thermophilic microorganism.

Fig. 7 Comparison of the thermostability of Cel5B at different incubation temperatures. Purified Cel5B fractions were incubated at 60, 66, 70 and 77 °C for 1, 2, 4, 8, 16, 24, 36 and 48 h in 0.1 M phosphate buffer, pH 7.0. Residual activities were assayed on RBB-CM-cellulose

Cellulases of T. fusca described thus far belong to four different GH families: GH5, GH6, GH9 and GH48. The most interesting findings of Fig. 4, which compares the domain structures of all seven T. fusca cellulases, are that (i) each GH family, with the exception of GH48, seems to be represented by two cellulase enzymes in this organism and (ii) the domains of the two enzymes belonging to the same family are at opposite ends. Of the two GH5 family endoglucanases, Cel5A (E5) contains its cellulose binding domain (CBD) at the N-terminal region, whereas Cel5B (E7) has this domain at its C-terminal part. A similar opposition of CBDs is present in the two GH6 family cellulases. The two GH9 family endoglucanases both have two CBDs. Cel9B (E1) shows some irregularity, as the two CBDs of this protein were found in symmetrical positions, one at the C-terminal and the other at the N-terminal region. However, Cel9A seems to return to the rule with its two C-terminal CBDs, thus forming the counterpart of the single N-terminal CBD of Cel9B. This highly efficient lignocellulose degrading organism seems to be equipped with a complex cellulase system comprising pairs of enzymes of the same GH family with CBDs in opposite orientations. At the present stage of research, the biological significance of this type of organization is difficult to predict, but this apparently regular arrangement may promote substrate binding, thus contributing to the well-documented synergistic interactions of these enzymes (IRWIN et al. 1993, KIM et al. 1998). It is worth noting that Cellulomonas fimi has a similar pattern: enzymes of the same GH family also form pairs in this organism and their CBDs are in opposite orientations.
Theoretically, this cellulase system, composed of pairs of enzymes that belong to the same GH family, may have evolved in T. fusca as the result of gene duplications. Some previous studies provide support for this assumption. For example, the XYLA enzyme of Neocallimastix patriciarum contains two homologous, duplicated catalytic domains (GILBERT et al. 1992), the two GH2 family endoglucanases from Bacillus sp. strain N-4 share 77% sequence identity (FUKUMORI et al. 1986), and CelK and CbhA, two cellobiohydrolases from Clostridium thermocellum, also show more than 80% homology (KATAEVA et al. 1999, ZVERLOV et al. 1999), indicating that these pairs of sequences originate from gene duplication events. In other cases, however, horizontal gene transfers were postulated to contribute to the complexity of the cellulase systems. Based on the sequence similarities found between Cel9B of T. fusca and CbhA of C. thermocellum, ZVERLOV et al. (1998) have already suggested that such an event might have happened between these two organisms. In the present study, where the catalytic domain sequences of the seven known cellulase enzymes of T. fusca were compared with the corresponding sequences of 35 cellulases of families GH5, GH6, GH9 and GH48 from 18 organisms, we demonstrated that, while the enzyme pairs of the same GH family in T. fusca are not closely related to each other, they show in many cases significant similarities to various cellulase enzymes of taxonomically distinct organisms. The close similarity observed between several enzymes of T. fusca and C. fimi is especially noteworthy. Sources and chronologies of horizontal gene transfers can be estimated by comparing the G+C content and the codon usage patterns of the whole genome with those of the putatively acquired genes (GARCIA-VALLVÉ et al. 1999).
Such comparisons have recently been used by GARCIA-VALLVÉ et al. (2000) to demonstrate the horizontal transfer of an endoglucanase gene from the rumen bacterium Fibrobacter succinogenes to Orpinomyces joyonii, a rumen inhabiting fungus. In T. fusca, no significant differences were observed between the G+C contents of the seven cellulase genes and that of the whole genome, indicating that all these cellulases were acquired by this thermophilic actinomycete in ancient times or that they arrived from taxonomically not too distant organisms. When the G+C contents at the third codon position were compared, only cel9A was found to differ significantly, indicating that this component of the cellulase system of T. fusca was acquired most recently, though still in early times.
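The size of the deviations discussed above can be illustrated with the values reported in the Results. The sketch below expresses the third position G+C content of cel9A as a distance from the mean of the seven genes in units of the reported standard deviation, and gives the differences of the lowest and highest gene G+C contents from the genomic average. This standardization is an assumption made for illustration; the statistical criterion actually used to judge significance is not stated in the text, and the reported mean includes cel9A itself.

```python
# Values reported in the Results section.
mean_gc3 = 90.5      # mean third-position G+C of the seven cellulase genes (%)
sd_gc3 = 3.05        # reported spread around that mean (%)
cel9a_gc3 = 84.9     # third-position G+C of cel9A (%)

genome_gc = 66.5     # average G+C content of the T. fusca genome (%)
lowest_gene_gc = 65.47   # cel5B, lowest G+C among the seven genes (%)
highest_gene_gc = 67.19  # highest G+C among the seven genes (%)

# Deviation of cel9A from the mean GC3, in standard-deviation units.
cel9a_deviation_sd = (cel9a_gc3 - mean_gc3) / sd_gc3

# Differences of the extreme gene values from the genomic average.
lowest_offset = lowest_gene_gc - genome_gc
highest_offset = highest_gene_gc - genome_gc

print(f"cel9A GC3 deviation: {cel9a_deviation_sd:.2f} SD from the mean")
print(f"Gene G+C range relative to genome: {lowest_offset:+.2f} to {highest_offset:+.2f} percentage points")
```

The overall G+C contents of the genes thus lie within about one percentage point of the genomic value, whereas the third position value of cel9A lies almost two standard deviations below the mean of the seven genes.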


Potential sources of horizontal gene transfer are other compost inhabiting, lignocellulose degrading actinomycetes, like Cellulomonas or Streptomyces spp., which have an average G+C content of 70%, higher than that of the genome of T. fusca. These organisms, while sharing the same ecosystem, have to compete for lignocellulose, a poorly digestible source of carbon and energy. Successful competition could only be achieved by building up highly efficient lignocellulose degrading enzyme systems. Plant debris, composed mainly of lignocellulose, creates a virtually closed microcosm, which can be colonized exclusively by species capable of utilizing this poor carbon source. Such a closed system provides, at the same time, an ideal setting for the transfer of genetic material among those organisms that have gained a footing there.

Acknowledgements

The authors are grateful to Dr. ENDRE BARTA (Agricultural Biotechnology Center) for consulting on the phylogenetic comparisons. This work was supported by a grant from OTKA TS 044778. J. K. is the recipient of a Bolyai János research fellowship.

References

BARR, B. K., HSIEH, Y. L., GANEM, B. and WILSON, D. B., 1996. Identification of two functionally different classes of exocellulases. Biochem., 35, 586–592.

BATEMAN, A., BIRNEY, E., CERRUTI, L., DURBIN, R., ETWILLER, L., EDDY, S. R., GRIFFITHS-JONES, S., HOWE, K. L., MARSHALL, M. and SONNHAMMER, E. L., 2002. The Pfam protein families database. Nucleic Acids Res., 30, 276–280.

BÉKI, E., NAGY, I., VANDERLEYDEN, J., JÄGER, S., KISS, L., FÜLÖP, L., HORNOK, L. and KUKOLYA, J., 2003. Cloning and heterologous expression of a β-D-mannosidase encoding gene from Thermobifida fusca TM51. Appl. Environ. Microbiol., 69, 1944–1952.

BRADFORD, M. M., 1976. A rapid, and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem., 72, 248–254.

DEVEREUX, J. P., HAEBERLI, P. and SMITHIES, O., 1984. A comprehensive set of sequence analysis programs for VAX. Nucleic Acids Res., 12, 387–395.

FUKUMORI, F., SASHIHARA, N., KUDO, T. and HORIKOSHI, K., 1986. Nucleotide sequences of two cellulase genes from alkalophilic Bacillus sp. strain N-4 and their strong homology. J. Bacteriol., 168, 479–485.

GARCIA-VALLVÉ, S., PALAU, J. and ROMEU, A., 1999. Horizontal gene transfer in glycosyl hydrolases inferred from codon usage in Escherichia coli and Bacillus subtilis. Mol. Biol. Evol., 16, 1125–1134.

GARCIA-VALLVÉ, S., ROMEU, A. and PALAU, J., 2000. Horizontal gene transfer of glycosyl hydrolases of the rumen fungi. Mol. Biol. Evol., 17, 352–361.

GHANGAS, G. S. and WILSON, D. B., 1988. Cloning of the Thermomonospora fusca endoglucanase E2 gene in Streptomyces lividans: affinity purification and functional domains of cloned product. Appl. Environ. Microbiol., 54, 2521–2526.

GILBERT, H. J., HAZLEWOOD, G. P., LAURIE, J. I., ORPIN, C. G. and XUE, G. P., 1992. Homologous catalytic domains in a rumen fungal xylanase: evidence for gene duplication and procaryotic origin. Mol. Microbiol., 6, 2065–2072.

HENRISSAT, B. and BAIROCH, A., 1996. Updating the sequence-based classification of glycosyl hydrolases. Biochem. J., 316, 695–696.

HENRISSAT, B., TEERI, T. T. and WARREN, R. A. J., 1998. A scheme for designating enzymes that hydrolyse the polysaccharides in the cell walls of plants. FEBS Lett., 425, 352–354.

HILGE, M., GLOOR, S. M., RYPNIEWSKI, W., SAUER, O., HEIGHTMAN, T. D., ZIMMERMANN, W., WINTERHALTER, K. and PIONTEK, K., 1998. High-resolution native and complex structures of thermostable beta-mannanase from Thermomonospora fusca – substrate specificity in glycosyl hydrolase family 5. Structure, 6, 1433–1444.


HUNTER, I. S., 1985. Gene cloning in Streptomyces. In: DNA Cloning: a Practical Approach, pp. 19–44 (edited by D. M. GLOVER). Oxford, IRL Press.

IRWIN, D., JUNG, E. D. and WILSON, D. B., 1994. Characterization and sequence of a Thermomonospora fusca xylanase. Appl. Environ. Microbiol., 60, 763–770.

IRWIN, D., SHIN, D. H., ZHANG, S., BARR, B. K., SAKON, J., KARPLUS, P. A. and WILSON, D. B., 1998. Roles of the catalytic domain and two cellulose binding domains of Thermomonospora fusca E4 in cellulose hydrolysis. J. Bacteriol., 180, 1709–1714.

IRWIN, D. C., SPEZIO, M., WALKER, L. P. and WILSON, D. B., 1993. Activity studies of eight purified cellulases: specificity, synergism, and binding domain effects. Biotech. Bioeng., 42, 1002–1013.

IRWIN, D. C., ZHANG, S. and WILSON, D. B., 2000. Cloning, expression and characterization of a family 48 exocellulase, Cel48A, from Thermobifida fusca. Eur. J. Biochem., 267, 4988–4997.

JUNG, E. D., LAO, G., IRWIN, D., BARR, B. K., BENJAMIN, A. and WILSON, D. B., 1993. DNA sequences and expression in Streptomyces lividans of an exoglucanase gene and an endoglucanase gene from Thermomonospora fusca. Appl. Environ. Microbiol., 59, 3032–3043.

KATAEVA, I., LI, X. L., CHEN, H., CHOI, S. K. and LJUNGDAHL, L. G., 1999. Cloning and sequence analysis of a new cellulase gene encoding CelK, a major cellulosome component of Clostridium thermocellum: evidence for gene duplication and recombination. J. Bacteriol., 181, 5288–5295.

KIM, E., IRWIN, D. C., WALKER, L. P. and WILSON, D. B., 1998. Factorial optimization of a six-cellulase mixture. Biotech. Bioeng., 58, 494–501.

KUKOLYA, J., DOBOLYI, C. and HORNOK, L., 1997. Isolation and identification of thermophilic cellulolytic actinomycetes. Acta Phytopath. Entomo. Hung., 32, 97–107.

KUKOLYA, J., NAGY, I., LÁDAY, M., ORAVECZ, O., MÁRIALIGETI, K. and HORNOK, L., 2002. Thermobifida cellulolytica sp. nov., a new lignocellulose decomposing member of the genus Thermobifida. Int. J. Syst. Evol. Microbiol., 52, 1193–1199.

KUNST, F., OGASAWARA, N., MOSZER, I., ALBERTINI, A. M., ALLONI, G., AZEVEDO, V., BERTERO, M. G., BESSIERES, P., BOLOTIN, A., BORCHERT, S., BORRISS, R., BOURSIER, L., BRANS, A., BRAUN, M., BRIGNELL, S. C., BRON, S., BROUILLET, S., BRUSCHI, C. V., CALDWELL, B., CAPUANO, V., CARTER, N. M., CHOI, S. K., CODANI, J. J., CONNERTON, I. F. and DANCHIN, A., 1997. The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature, 390, 249–256.

LAEMMLI, U. K., 1970. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature, 227, 680–685.

LAO, G., GHANGAS, G. S., JUNG, E. D. and WILSON, D. B., 1991. DNA sequences of three β-1,4-endoglucanase genes from Thermomonospora fusca. J. Bacteriol., 173, 3397–3407.

MAWADZA, C., HATTI-KAUL, R., ZVAUYA, R. and MATTIASSON, B., 2000. Purification and characterization of cellulases produced by two Bacillus strains. J. Biotechnol., 83, 177–187.

MILLER, G. L., 1959. Use of dinitrosalicylic acid reagent for determination of reducing sugar. Anal. Chem., 31, 426–428.

NIELSEN, H., ENGELBRECHT, J., BRUNAK, S. and VON HEIJNE, G., 1997. Identification of procaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng., 10, 1–6.

OMURA, S., IKEDA, H., ISHIKAWA, J., HANAMOTO, A., TAKAHASHI, C., SHINOSE, M., TAKAHASHI, Y., HORIKAWA, H., NAKAZAWA, H., OSONOE, T., KIKUCHI, H., SHIBA, T., SAKAKI, Y. and HATTORI, M., 2001. Genome sequence of an industrial microorganism Streptomyces avermitilis: deducing the ability of producing secondary metabolites. Proc. Natl. Acad. Sci., 98, 12215–12220.

SAMBROOK, J., FRITSCH, E. F. and MANIATIS, T., 1989. Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.

SHEN, H., MEINKE, A., TOMME, P., DAMUDE, H. G. and KWAN, E., 1996. Cellulomonas fimi cellobiohydrolases. In: Enzymatic Degradation of Insoluble Carbohydrates, pp. 174–196 (edited by J. N. SADDLER, M. H. PENNER). Oxford University Press.

SPIRIDONOV, N. A. and WILSON, D. B., 1998. Regulation of biosynthesis of individual cellulases in Thermomonospora fusca. J. Bacteriol., 180, 3529–3532.

SPIRIDONOV, N. A. and WILSON, D. B., 2001. Cloning and biochemical characterization of BglC, a β-glucosidase from the cellulolytic actinomycete Thermobifida fusca. Curr. Microbiol., 42, 295–301.

SUNNA, A., GIBBS, M. D. and BERGQUIST, P. L., 2000. A novel thermostable multidomain 1,4-beta-xylanase from Caldibacillus cellulovorans and effect of its xylan-binding domain on enzyme activity. Microbiology, 146, 2947–2955.

TOMME, P., KWAN, E., GILKES, N. R., KILBURN, D. G. and WARREN, R. A. J., 1996. Characterization of CenC, an enzyme from Cellulomonas fimi with both endo- and exoglucanase activities. J. Bacteriol., 178, 4216–4223.


WARREN, R. A. J., 1996. Microbial hydrolysis of polysaccharides. Annu. Rev. Microbiol., 50, 183–212.

ZVERLOV, V. V., VELIKODVORSKAYA, G. V., SCHWARZ, W. H., BRONNENMEIER, K., KELLERMANN, J. and STAUDENBAUER, W., 1998. Multidomain structure and cellulosomal localization of the Clostridium thermocellum cellobiohydrolase CbhA. J. Bacteriol., 180, 3091–3099.

ZVERLOV, V. V., VELIKODVORSKAYA, G. A., SCHWARZ, W. H., KELLERMANN, J. and STAUDENBAUER, W. L., 1999. Duplicated Clostridium thermocellum cellobiohydrolase gene encoding cellulosomal subunits S3 and S5. Appl. Microbiol. Biotechnol., 51, 852–859.

Mailing address: Dr. JÓZSEF KUKOLYA, Szent István University, Department of Agricultural Biotechnology and Microbiology, Páter K. u. 1, Gödöllő, H-2103, Hungary. Tel: +36 28 522-910, Fax: +36 28 410 804, E-mail: [email protected]