Genome-Wide Analysis of MicroRNAs in Sacred Lotus, Nelumbo nucifera (Gaertn)

Yun Zheng & Guru Jagadeeswaran & Kanchana Gowdu & Nian Wang & Shaohua Li & Ray Ming & Ramanjulu Sunkar

Received: 29 April 2013 / Accepted: 3 July 2013
© Springer Science+Business Media New York 2013

Abstract MicroRNAs (miRNAs) are small non-coding regulatory RNAs that degrade their messenger RNA targets or repress their translation. This mode of posttranscriptional gene regulation is critical for plant growth and development as well as adaptation to stress conditions. Sacred lotus (Nelumbo nucifera) is a land plant that has adapted to the aquatic environment. It is a basal eudicot in the order Proteales, with significant taxonomic importance. Thus, identification of miRNAs in sacred lotus could provide information about miRNA evolution, particularly the conservation as well as divergence of miRNAs in dicots. To identify conserved and novel miRNAs in sacred lotus, small RNA libraries from leaves and flowers were sequenced and a computational strategy was employed. These approaches resulted in the identification of 81 miRNAs that can be grouped into 41 conserved/known miRNA families and 52 novel miRNAs forming 49 novel miRNA families. Using 3 mismatches between miRNAs and their mRNA targets as the cutoff, we predicted 137 genes as targets for the conserved and known miRNAs. Overall, this analysis provides a glimpse of miRNA-dependent posttranscriptional gene regulation in sacred lotus.

Keywords MicroRNAs . MicroRNA targets . Post-transcriptional gene regulation . Sacred lotus . Small RNAs

Abbreviations
AGO             Argonaute
ARF             Auxin response factor
AP2-like        Apetala 2-like transcription factor
CSD             Cu/Zn superoxide dismutase
DCL1            Dicer like-1
GRFs            Growth regulating factors
HD-Zip factors  Homeodomain leucine zipper family of transcription factors
HEN1            Hua Enhancer 1
HYL1            Hyponastic leaves 1
miRNAs          MicroRNAs
NAC factors     NAM, ATAF1/2 and CUC2 domain containing transcription factors
NBS-LRR genes   Nucleotide-binding site leucine rich repeat genes
RISC            RNA-induced silencing complex
RPTM            Reads per ten million
SE              Serrate 1
SPL             Squamosa promoter binding protein-like
tasiRNAs        Trans-acting small interfering RNAs
TCP factors     Teosinte branched 1, Cycloidea, PCF (TCP)-domain protein family
TIR1            Transport inhibitor response 1

Communicated by: Luiz Vieira

Electronic supplementary material The online version of this article (doi:10.1007/s12042-013-9127-z) contains supplementary material, which is available to authorized users.

Y. Zheng : G. Jagadeeswaran : K. Gowdu : R. Sunkar
Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK 74074, USA

Y. Zheng
Institute of Developmental Biology and Molecular Medicine, School of Life Sciences, Fudan University, Shanghai 200433, China

N. Wang : S. Li
Key Laboratory of Plant Germplasm Enhancement and Speciality Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China

R. Ming
Department of Plant Biology, University of Illinois, UIUC, Urbana Champaign, IL, USA

R. Sunkar (*)
Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK 74078, USA
e-mail: [email protected]

Tropical Plant Biol. DOI 10.1007/s12042-013-9127-z


Introduction

In plants, miRNA-controlled target gene expression is required for growth and development as well as adaptation to stress conditions (Chen 2009; Sunkar et al. 2012). Primary miRNAs (pri-miRNAs) are transcribed from nuclear-encoded MIR genes by RNA polymerase II (Chen 2009; Voinnet 2009). Dicer-like 1 (DCL1), with assistance from HYPONASTIC LEAVES 1 (HYL1), SERRATE (SE) and several other proteins, catalyzes the release of 20–22 nt miRNA/miRNA* duplexes from the pri-miRNAs, which adopt a hairpin-like structure. The miRNA/miRNA* duplexes are then methylated at the 3′ terminus by HUA ENHANCER 1 (HEN1) and exported into the cytoplasm by HASTY (HST1). In the cytoplasm, the miRNA is loaded into an RNA-induced silencing complex (RISC) containing an Argonaute (AGO) protein, and guides the RISC to cause site-specific cleavage or translational repression of the mRNA targets (Jones-Rhoades et al. 2006; Sunkar and Zhu 2007; Voinnet 2009). Less frequently, miRNAs also silence gene expression transcriptionally by causing DNA methylation (Khraiwesh et al. 2010; Wu et al. 2010).

MicroRNAs have been reported from diverse plant species but not from sacred lotus, a basal eudicot that completes its life cycle in an aquatic environment. Thus, identification of miRNAs could provide valuable information about miRNA conservation as well as divergence in this basal eudicot. Given its phylogenetic importance as well as other interesting traits such as seed longevity and adaptation to the aquatic environment, the sacred lotus genome has recently been sequenced (Ming et al. 2013). The availability of a genome sequence is critical for identification of miRNAs, particularly novel miRNAs, in an organism. In this study, we report the identification of miRNAs by sequencing small RNA libraries and using computational strategies. These analyses led to the identification of 81 conserved/known miRNAs (41 miRNA families) and 52 novel miRNAs (49 novel miRNA families). These miRNAs have been predicted to regulate the expression of approximately 137 genes that are likely to play essential roles in growth and development as well as other biological processes in sacred lotus.

Results and Discussion

Identification of Conserved miRNAs in Sacred Lotus

To identify both conserved and novel miRNAs in sacred lotus, we generated small RNA libraries from leaves and flowers. These libraries were sequenced using an Illumina GAII analyzer, which yielded 18,505,940 and 29,067,085 small RNA reads of 18–30 nt (47,573,025 reads in total) from leaves and flowers, respectively (Table 1). Like most plant small RNA populations, small RNAs in sacred lotus showed peaks at 21 and 24 nt (Fig. 1). These small RNA reads correspond to approximately 3.5 and 5 million non-redundant reads (together 8,348,852 unique reads from both libraries) from leaves and flowers, respectively. Approximately half of these unique reads could be mapped to the recently sequenced sacred lotus genome (Ming et al. 2013), implying that almost 50 % of the unique reads could not be analyzed. One potential reason is that a large portion of small RNAs are likely to be derived from the centromeres, which are enriched with repetitive sequences, and such regions are known to be only partially sequenced in genome sequencing projects (Hayden and Willard 2012). Additionally, incomplete genome sequencing cannot be excluded as another potential reason. Genotypic variation can be excluded as a possible reason because we used the same genotype (the China Antique variety of sacred lotus) that was sequenced by Ming et al. (2013). Approximately 50 % of the genome-matching unique reads in both libraries appear to be degradation products of non-coding RNAs such as rRNAs and tRNAs, and these were discarded (Table 1). Further, small RNAs that mapped to messenger RNAs and repeat-rich regions were removed. The remaining unique reads were mapped to miRBase to identify conserved as well as poorly conserved miRNA homologs (the latter also referred to as “known miRNAs”, which are not well conserved and have been reported from only some plant species) in sacred lotus. This resulted in the identification of 76 known miRNA variants belonging to 36 miRNA families (Table 2). Like miRNAs, trans-acting small interfering RNAs (tasiRNAs), a class of conserved endogenous siRNAs that are derived from TAS3 loci, regulate the expression of Auxin Response Factors (ARFs) at the post-transcriptional level in plants (Allen and Howell 2010). Three conserved tasiRNAs were recovered from the small RNA libraries in sacred lotus (Table 2).
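The size distribution summarized in Fig. 1 can be tabulated by counting total and unique (non-redundant) reads per length. The following is a minimal Python sketch, not the pipeline used in this study: the function name `size_distribution` and the toy reads are assumptions made for illustration, and adapter trimming is assumed to have been done already.

```python
from collections import Counter


def size_distribution(reads, min_len=18, max_len=30):
    """Return (total, unique) read counts per length for reads within [min_len, max_len]."""
    kept = [r for r in reads if min_len <= len(r) <= max_len]
    total = Counter(len(r) for r in kept)        # every read counts
    unique = Counter(len(r) for r in set(kept))  # each distinct sequence counts once
    return total, unique


# Toy input for illustration only (two sequences borrowed from Table 2 plus a short fragment).
toy_reads = [
    "UUGACAGAAGAGAGAGAGCAC",   # 21 nt
    "UUGACAGAAGAGAGAGAGCAC",   # same sequence again: adds to total, not to unique
    "UGAAGCUGCCAGCAUGAUCUAA",  # 22 nt
    "ACGU",                    # shorter than 18 nt, filtered out
]

total, unique = size_distribution(toy_reads)
for length in sorted(total):
    print(length, total[length], unique[length])
```

Applied to a full library, the same two counters would give the "total reads" and "unique reads" series for each size class.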

In parallel, we also conducted bioinformatics analysis to identify miRNA homologs that are poorly conserved (conserved only in a few plant species) but could not be identified using the small RNA sequencing-based approach, either because of their extremely low abundance or because they do not accumulate in leaves and flowers, the tissues used in this study. Homologs of miRNAs such as miR869, miR1863, miR2275, miR3948 and miR4414 have been identified in a few plant species such as Arabidopsis thaliana, A. lyrata, rice, maize, soybean and Medicago truncatula (www.miRBase.org). Although no reads were recovered for these miRNAs in sacred lotus, they have been predicted in sacred lotus on the basis of sequence homology coupled with a predictable hairpin-like structure for the miRNA precursor sequence (Table 2). Thus, the computational approach identified 5 miRNA families (miR869, miR1863, miR2275, miR3948 and miR4414) in sacred lotus. Taking the small RNA sequencing and bioinformatics prediction together, we have identified a total of 81 miRNAs that can be grouped into 41 miRNA families in sacred lotus (Table 2).

Although homologous sequences with one to two nucleotide variations relative to miR481 (Populus trichocarpa), miR1171 (Chlamydomonas reinhardtii), miR1533 (Glycine max), miR3946 (Citrus sinensis), miR5139 (Rehmannia glutinosa), miR5653 (Arabidopsis) and several miRNA homologs from Medicago truncatula such as miR5221, miR5225, miR5227 and miR5234 were predicted in sacred lotus, their annotation as miRNAs in those respective plant species is not based on miRNA* identification (Meyers et al. 2008). Therefore, these homologous sequences were not considered as miRNAs in N. nucifera in this study.
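The "one to two nucleotide variation" criterion can be illustrated as a simple substitution count between two sequences of equal length. This is only a sketch under stated assumptions: the helper names `count_mismatches` and `is_candidate_homolog` are hypothetical, insertions and deletions are not handled, and the additional evidence discussed above (a predictable hairpin and miRNA* support) is not modeled. The example inputs are variants listed in Table 2, used here as toy data.

```python
def count_mismatches(seq_a, seq_b):
    """Count positions that differ between two equal-length RNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def is_candidate_homolog(query, known, max_mismatches=2):
    """True if query has the same length as known and at most max_mismatches substitutions."""
    return len(query) == len(known) and count_mismatches(query, known) <= max_mismatches


# Sequences taken from Table 2 and used purely as illustrative inputs.
MIR156A = "UUGACAGAAGAGAGAGAGCAC"
MIR156F = "UUGACAGAAGAUAGAGAGCAC"
MIR398A = "UGUGUUCUCAGGUCGCCCCUG"
MIR398B = "UGUGUUCUCAGGUCACCCCUU"
MIR160A = "UGCCUGGCUCCCUGUAUGCCA"

print(count_mismatches(MIR156A, MIR156F))      # one substitution
print(count_mismatches(MIR398A, MIR398B))      # two substitutions
print(is_candidate_homolog(MIR156A, MIR160A))  # unrelated families, so False
```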

MicroRNA Organization and Number of Loci in Sacred Lotus Genome

Of these 41 miRNA families, some are represented by multiple loci whereas others are represented by a single locus in the genome. The miR169 family has by far the largest number of loci (22 loci), followed by the miR171 family with 12 loci and miR399 with 11 loci (Supplemental Table 1). On the other hand, conserved miRNA families such as miR390 and miR397 are each represented by a single locus in the N. nucifera genome. A few of the highly conserved miRNA families show significant size variation in sacred lotus (Fig. 2) compared with the number of loci for each miRNA family in Arabidopsis, Populus, grapevine and rice. For instance, the miR168 and miR393 families are each represented by 4 loci in sacred lotus compared to 2 loci in other plant species. Similarly, the miR164 (8 in sacred lotus compared to 6 or 4 in other plant species) and miR408 (2 in sacred lotus compared to 1 in other plant species) families showed a considerable increase in miRNA loci in sacred lotus compared to other plants.

A few conserved miRNA families exist as clusters in the genomes of various plant species (Sunkar and Jagadeeswaran 2008). In the N. nucifera genome, five miRNA families (miR165/166, miR169, miR395, miR399 and miR3627) exist in clusters (data not shown). Of these, the most clusters (7) were identified for the miR169 family, which has 22 loci in the genome. In most plant species, miRNA clusters are usually represented by two miRNA genes. Interestingly, in sacred lotus, 4 miR169 genes (miR169r,k,m,v) are found in one cluster. Clustered miRNAs usually originate from tandem duplication events.

Abundance of miRNA Families and Family Members in Sacred Lotus

Because small RNA libraries were generated from leaves and flowers in this study, it is possible to identify differences in miRNA expression levels between these two tissues. However,

Table 1 Summary of the small RNA library read analysis

                   Leaves                        Flowers
                   Total reads    Unique reads   Total reads    Unique reads
Non-coding RNAs    10,597,073     650,160        16,938,371     841,423
pre-miRNAs         630,072        7,429          775,064        9,840
Messenger RNAs     169,293        113,498        276,756        184,603
Repeat elements    3,918,045      73,871         6,121,404      85,403
Chromosomes        8,233,537      1,489,245      12,911,215     2,220,048
Total              18,505,940     3,349,548      29,067,085     4,970,971

Fig. 1 Small RNA read abundance based on their size classes in sacred lotus. a Small RNA library from leaves. b Small RNA library from flowers. In both panels the x-axis is the small RNA size in nucleotides (18–29) and the plotted series are total reads and unique reads

Table 2 Conserved and known microRNAs and their abundances in leaves and flowers of sacred lotus. Five miRNAs that were computationally predicted but not sequenced are also included. miRNA* sequences that were recovered at greater frequencies than their corresponding miRNAs are highlighted. Normalized abundance is denoted as RPTM (reads per ten million small RNAs)

miRNA  Sequence  Leaf total reads  Leaf normalized reads (RPTM)  Flower total reads  Flower normalized reads (RPTM)

miR156a UUGACAGAAGAGAGAGAGCAC 3799 2053 4059 1396

miR156b,c,d,e UGACAGAAGAGAGUGAGCAC 14428 7796 18546 6380

miR156f,g,h,i UUGACAGAAGAUAGAGAGCAC 196778 106332 207149 71266

miR157 UUGACAGAAGAUAGAGAGAC 55 30 61 21

miR159 UUUGGAUUGAAGGGAGCUCUG 528 285 711 245

miR159b UUUGCAUAUUUCAGGAGCUGC 7228 3906 11248 3870

miR160a,b,c UGCCUGGCUCCCUGUAUGCCA 289 156 354 122

miR160d UGCCUGGCUCCCUGAAUGCCA 70 38 96 33

miR160e UGCCUGGCUCCCUGAGUGCCA 26 14 61 21

miR162a,b UCGAUAAACCUCUGCAUCCGG 974 526 1439 495

miR164a,c UGGAGAAGGGGAGCACGUGCA 1 1 1 0

miR164b,d,f,g UGGAGAAGCAGGGCACGUGCA 4025 2175 5943 2045

miR164h UGGAGAAGCAGGGCACGUGCU 312 169 427 147

miR165a UCGGACCAGGCUUCAUCCCCG 1660 897 1927 663

miR166a-to-h,j UCGGACCAGGCUUCAUUCCCC 94276 50944 112664 38760

miR166i UCUCGGAUCAGGCUUCAUUCC 37460 20242 43934 15115

miR167a,f UGAAGCUGCCAGCAUGAUCUA 14885 8043 20943 7205

miR167c,d,h UGAAGCUGCCAGCAUGAUCUAA 30948 16723 44444 15290

miR168a (22nt) UCGCUUGGUGCAGGUCGGGUGC 576 311 739 254

miR168c,e (21nt) UCGCUUGGUGCAGGUCGGGAA 11570 6252 16037 5517

miR169a GAGCCAAGGAUGACUUGCCGA 55 30 98 34

miR169b,c,d,r CAGCCAAGGAUGACUUGCCGG 650 351 914 314

miR169e,f,j,s,n,o,p,q UAGCCAAGGAUGACUUGCCUA 136 73 151 52

miR169g CAGCCAAGAAUGACUUGCCGG 119 64 182 63

miR169h,i,l,y CAGCCAAGAAUGACUUGCCGA 38 21 50 17

miR169k UAGCCAAGGAUGACUUGCCUG 40 22 61 21

miR169m,v UAGCCAAGGAUGAUUUGCCUG 1 1 1 0

miR169u CAGCCAAGAAUGACUUGCCGU 14 8 16 6

miR169w,x CAGCCAAGAAUGACUUGCCA 82 44 144 50

miR169z UCGCCAAGGAUGACUUGCCUA 5 3 0 0

miR171a,b UUGAGCCGCGUCAAUAUCUCC 247 133 340 117

miR171d,j UUGAGCCGCGCCAAUAUCACU 1 1 0 0

miR171c,e,f,g,h,i,m UGAUUGAGCCGUGCCAAUAUC 2489 1345 3836 1320

miR172b,c GGAAUCUUGAUGAUGCUGCAU 16411 8868 22848 7860


miR172e AGAAUCUUGAUGAUGCUGCAU 17678 9553 21674 7457

miR172f,g UGAAUCUUGAUGAUGCUACAU 0 0 2 1

miR319a,b UUUGGACUGAAGGGAGCUCCU 17 9 34 12

miR319c,d,e UUGGACUGAAGGGAGCUCCCU 27 15 45 15

miR390 AAGCUCAGGAGGGAUAGCGCC 572 309 908 312

miR393 UCCAAAGGGAUCGCAUUGAUCC 90 49 155 53

miR393* AUCAUGCGAUCUCUUCGGAAU 13077 7066 13185 4536

miR394a,b,c,d UUGGCAUUCUGUCCACCUCC 411 222 582 200

miR395a,b,c,g,h,i,j CUGAAGUGUUUGGGGGAACUC 63 34 108 37

miR395d,e,f CUGAAGGGUUUGGAGGAACUC 0 0 1 0

miR396a,f,k UUCCACAGCUUUCUUGAACUG 2335 1262 3417 1176

miR396b,c,d,e,j,l UUCCACAGCUUUCUUGAACUU 1634 883 2257 776

miR396i UUUUCCACGGCUUUCUUGAAC 665 359 914 314

miR397 UCAUUGAGUGCAGCGUUGAUG 74 40 119 41

miR398a UGUGUUCUCAGGUCGCCCCUG 4 2 6 2

miR398b UGUGUUCUCAGGUCACCCCUU 2 1 5 2

miR399a CGCCAAAGGAGAGUUGCCCU 5 3 7 2

miR399d,j CGCCAAAGGAGAAUUGCCCUG 3 2 0 0

miR399e CGCCAAAGGAGAGUUGCCCUU 0 0 3 1

miR399f,g UGCCAAAGGAGAUUUGCCCUG 0 0 1 0

miR403a,b UUAGAUUCACGCACAAGCUCG 237 128 357 123

miR408a,b UGCACUGCCUCUUCCCUGGC 44 24 55 19

miR408a* ACAGGGAUAAGACAGAGCAUG 1410 762 2174 748

miR408b* GCAGGGAUAAGGCAGAGCAUG 1664 899 2422 833

miR472 UUUCCCACACCGCCCAUUCCUA 150 81 271 93

miR482 CUUCCAAUUCCGCCCAUUCCUA 2408 1301 3453 1188

miR529a AGAAGAGAGAGAGUACAGCUU 44 24 67 23

miR530 AUGCAGGUGCAGGUCCAGACG 17 9 27 9

miR535a UGACAACGAGAGAGAGCACGC 39627 21413 57501 19782

miR827 UUAGAUGAUCAUCAACAAACA 61 33 103 35

miR1030 UCUGCAUUUGCACCUGCACCG 9 5 15 5

miR1511a,b AACCAGGCUCUGAUACCAUGU 1 1 6 2

miR1432 UCAGGAGAGAUGAUGCCGGCGU 32519 17572 47613 16380

miR2111a,b,c UAAUCUGCAUCCUGAGGUUUA 97 52 98 34

miR2111a* AUCCUCUGGAUGCAGGUUACC 190 103 281 97

miR2111d,e AUCCUCUGGAUGCAGGUUACC 193 104 286 98

miR2118a,c UUCCCAAGGCCUCCCAUGCCGA 1900 1027 3031 1043

miR2118b UUUCCGAUCCCACCCAUACCUA 782 423 1254 431

miR2950a,b UUCCAUCUCUUGCACACUGGA 1031 557 1456 501

miR2950a* CGGUGUGCAGAGGAUGGAACA 1495 808 2342 806

miR3627a,c,e UUGUCGCAGGAGAGAUGGCAC 93 50 148 51


the overall trend of normalized miRNA family abundance is very similar between the libraries: the most abundantly expressed miRNA family in leaves is also the most abundantly expressed miRNA family in flowers. The five most abundant miRNA families in both tissues are miR156, miR166, miR167, miR535 and miR1432. The abundance of several conserved miRNAs such as miR398, miR399, miR530, miR1030 and miR1511 was 10 RPTM or lower in both tissues. In most plant species, miR399 levels are induced during phosphate deprivation (Jagadeeswaran et al. 2009). Thus, the low abundance of miR399 can be expected in these small RNA libraries, which were generated from tissues grown under normal conditions.
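The RPTM normalization used for these comparisons can be sketched as follows. The sketch assumes that the denominator is the total read count of each library in Table 1 and that values are rounded to the nearest integer; neither detail is stated explicitly in the text, but under these assumptions the miR156a values in Table 2 are reproduced. The function name `rptm` is hypothetical.

```python
READS_PER_TEN_MILLION = 10_000_000

# Library totals from Table 1 (assumed here to be the normalization denominators).
LEAF_TOTAL = 18_505_940
FLOWER_TOTAL = 29_067_085


def rptm(read_count, library_total):
    """Normalize a raw read count to reads per ten million (RPTM), rounded to an integer."""
    return round(read_count / library_total * READS_PER_TEN_MILLION)


# miR156a raw counts from Table 2: 3799 reads in leaves, 4059 reads in flowers.
print(rptm(3799, LEAF_TOTAL))    # 2053, as listed in Table 2
print(rptm(4059, FLOWER_TOTAL))  # 1396, as listed in Table 2
```

Because the flower library is larger, a higher raw count in flowers can correspond to a lower normalized value, which is why the comparison between tissues is made on RPTM rather than on raw reads.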

MicroRNA families consist of miRNA isoforms/variants differing by one or two nucleotides. The sequencing-based approach facilitates quantification of differences in expression levels among the members of the same miRNA family. As reported earlier in other plant species, the abundance of individual miRNAs within a family varied significantly in sacred lotus. Analysis of miRNA variants (miRNA isoforms) indicated that the miR169 family has 10 members, representing the largest miRNA family in sacred lotus (Table 2). The ten members have distinct expression levels; the miR169b,c,d,r isoform is expressed at relatively high levels (about 300 RPTM) compared with the miR169m,v variant, which is barely expressed (only one read was recovered from one of the libraries). Similarly, the overall abundance of the miR164, miR171 and miR172 families is relatively high, but one variant from each of these miRNA families is represented by single reads, suggesting that some of these loci are hardly expressed in the tissues examined in this study (Table 2). Such differential expression of family members is likely to be spatially separated to fine-tune target mRNA expression.

21 and 22 nt Long Isoforms of miR168 and Argonaute Homeostasis in Sacred Lotus

miR168 is one of the deeply conserved miRNAs in plants and is conserved even in Selaginella. It targets Argonaute 1, the main component of the RISC in plants. miR168 is represented by 2 loci in most plant genomes but by 4 loci in N. nucifera (Fig. 2). Of the two loci (MIR168a and MIR168b) in Arabidopsis and rice, only MIR168a is actively transcribed and abundantly expressed, whereas MIR168b is expressed at extremely low levels (Sunkar et al. 2008; Vaucheret 2009). Most importantly, these two loci have specialized functions in Arabidopsis (Vaucheret 2009). The MIR168a locus generates the 21-nt long miR168 species, whereas the MIR168b locus produces equal amounts of 21-nt and 22-nt long miR168b species. The combinatorial action of both 21-nt and 22-nt long miR168s is necessary to fine-tune Argonaute 1 homeostasis in Arabidopsis (Vaucheret 2009). Because four loci have been identified for miR168 in N. nucifera, it will be interesting to see whether what has been observed for Arabidopsis miR168 is also true for N. nucifera. Indeed, both 21 and 22 nt long miR168 members have been identified in the small RNA libraries of sacred lotus (Table 2). Of these, the 22 nt long miR168 is expressed at very low levels (about 20-fold lower) relative to the 21 nt long miR168 (Table 2), suggesting a potentially similar Argonaute homeostasis mechanism in sacred lotus as observed in Arabidopsis (Vaucheret 2009).
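The roughly 20-fold difference between the two miR168 isoforms can be checked directly from the normalized values in Table 2. A minimal sketch, in which the dictionary layout and the helper name `fold_difference` are assumptions made for illustration:

```python
# Normalized abundances (RPTM) of the miR168 isoforms, copied from Table 2.
MIR168_RPTM = {
    "leaf": {"21nt": 6252, "22nt": 311},
    "flower": {"21nt": 5517, "22nt": 254},
}


def fold_difference(high, low):
    """Ratio of the more abundant isoform to the less abundant one."""
    return high / low


folds = {
    tissue: fold_difference(values["21nt"], values["22nt"])
    for tissue, values in MIR168_RPTM.items()
}

for tissue, fold in folds.items():
    print(f"{tissue}: 21-nt isoform is {fold:.1f}-fold more abundant than the 22-nt isoform")
```

Both tissues give a ratio close to 20, consistent with the statement above.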

Table 2 (continued)

miR3627b UUUGUCGCAGGAGAGAUGGCAC 779 421 1229 423

miR3627d AUGUCGCAGGAGAGAUGGCGC 231 125 311 107

miR5179 UCUUGCUCAAGACCGCGCAGC 1484 802 2117 728

miR869 AUUGGUUCAAUUCGGGUGUUG P

miR1863 GCUCUGAUACCAUGUGAACUG P

miR2275 AGAAUUGGAGGAAACCAAACUGA P

miR3948 GGAGUGGGAGUGGGAGGAGGG P

miR4414 AGCUGCUGACUCGCUGGUUCA P

TAS3 UUCUUGACCUUGUAAGGCCCC 20 11 23 8

TAS3 UUCUUGACCUUGUAAGACCUU 28 15 35 12

TAS3 UUCUUGGCCUUGUAAGACCCC 33 18 26 9


miR1432 is Expressed in Sacred Lotus

Thus far, miR1432 has been reported from rice, sorghum, maize and sugarcane (Sunkar et al. 2008; Lu et al. 2008; Paterson et al. 2009; Zhang et al. 2009; Zanca et al. 2010). Despite the fact that several dozen dicots have been analyzed for their miRNA populations, miR1432 has been reported only from monocots, suggesting that it may be specific to monocots. Surprisingly, we have identified miR1432 in sacred lotus. In fact, a MIR1432 locus has been identified in the genome of sacred lotus (Supplemental Table 1). Furthermore, miR1432 is abundantly expressed in both leaves and flowers of sacred lotus (Table 2). These findings indicate that miR1432 is also expressed in at least some basal eudicots. Identification of miR1432 in sacred lotus provides an opportunity to trace its evolution, conservation and divergence among dicots. miR1432 is predicted to target glutathione S-transferase and a SET domain protein in sacred lotus, whereas it targets EF-hand proteins in rice (Sunkar et al. 2008; Lu et al. 2008).

miR2118/miR482 Superfamily in Sacred Lotus

miR2118, miR482 and miR472 can be grouped into one superfamily on the basis of sequence homology (Shivaprasad et al. 2012). The phylogenetic distribution of these miRNAs in land plants is highly varied: miR472 was found in A. thaliana, A. lyrata, Populus and Citrus; miR482 was reported from several dicots and from gymnosperms (Pinus and Picea) but not from monocots; and miR2118 has been reported from gymnosperms, some dicots (mostly legumes but also some Solanaceae members) (Jagadeeswaran et al. 2009; Arenas-Huertero et al. 2009; Zhai et al. 2011; Shivaprasad et al. 2012; Li et al. 2012) and some monocots such as rice and maize (Johnson et al. 2009; Song et al. 2012). Interestingly, all three members of this superfamily, miR2118, miR482 and miR472, have been identified in sacred lotus (Table 2). The normalized abundance of the miR2118 and miR482 families ranges between 1,000 and 1,300 RPTM, suggesting that these two miRNAs are more abundantly expressed than miR472, which has less than 100 RPTM. Identification of miR2118, miR482 and miR472 in sacred lotus, coupled with the identification of miR2118 and miR482 in some gymnosperms (Zhai et al. 2011), suggests that these miRNAs evolved during early land plant evolution but were selectively lost in specific lineages of core eudicots and monocots.

miR2118 targets NBS-LRR genes, which are implicated in host-pathogen interactions through effector-triggered immunity (Jagadeeswaran et al. 2009; Zhai et al. 2011; Shivaprasad et al. 2012). miR482 and miR472 have also been shown to regulate NBS-LRR genes in Nicotiana benthamiana (Shivaprasad et al. 2012) or are predicted to target putative disease resistance genes in this study (Table 4). These are 22-nt long miRNAs (the canonical length is 21 nt) that are able to

Fig. 2 Number of miRNA loci for each of the deeply conserved miRNA families in sacred lotus, Arabidopsis, grapevine, Populus and rice. The y-axis of both panels is the number of miRNA family members; the plotted series are Arabidopsis, grapevine, Populus, rice and sacred lotus

Table 3 Identified novel miRNAs and their abundance in sacred lotus

NmiRNA#  miRNA sequence  Leaves  Leaves normalized (RPTM)  Flowers  Flowers normalized (RPTM)

NmiRNA#1 GGAAUGGACGGGUCGGAAACA 2386 1289 3384 1164

NmiRNA#2 UUCCCCCCGUCGGCAACGGUA 1991 1076 2922 1005

NmiRNA#3 UGUGGCUGCAGUGCACGGUGG 910 492 1426 491

NmiRNA#4 UUCUCUGUGAAGUAGAUGAAC 641 346 747 257

NmiRNA#5 UUGACAAGUCAGGAUCAGAGG 439 237 655 225

NmiRNA#6 UGCUGUGGGGCGUUGUGGGGCGGA 374 202 595 205

NmiRNA#7 UUGGAAAGAGUACAAAUGCAA 344 186 488 168

NmiRNA#8 AAUGCACAGGGGGAUGGGCUG 330 178 487 168

NmiRNA#9 CGGAAUCAAAUCGGUGAUAAG 302 163 457 157

NmiRNA#10a UUCUGGAAUAGAAGAGGCAAG 276 149 431 148

NmiRNA#10b UUUUUGAAUGGAGAAGGCAAG 23 12 27 9

NmiRNA#10c UUCUUGAAUGGAAGAGGUAAG 5 3 10 3

NmiRNA#11 GGUGAGCGGAGAUAGGCGGAG 117 63 174 60

NmiRNA#12 UAUUUCAGGAUAUGGGGUUUG 106 57 191 66

NmiRNA#13 CCGACUGAGAAAGCAGUGGCA 83 45 122 42

NmiRNA#14 CGGUGAUGAAACAGAGGCGAG 79 43 105 36

NmiRNA#15 UGGGCAUGAACCAGGAGAUGA 71 38 115 40

NmiRNA#16 ACUCUGCUCUGAUACCAUGUUGAG 71 38 87 30

NmiRNA#17 UGGACUCGAAGGCAUUUUUGC 61 33 74 25

NmiRNA#18 UCUAUUGUGCUGUGGACGGAG 54 29 99 34

NmiRNA#19 UCGACUGGCGGAACUCGAAGGGGG 47 25 97 33

NmiRNA#20 UGUGCUGUGGGGCGGCCCCUUG 34 18 39 13

NmiRNA#21 GGAGUAACCUGGAGAACGACG 30 16 40 14

NmiRNA#22 UGCGACGGCUGUGCCGAGAGG 29 16 50 17

NmiRNA#23 AGUAGAGGAGGCUAUGGAACA 23 12 30 10

NmiRNA#24 AAGGAAAGUGUAGGGAAAGAA 22 12 31 11

NmiRNA#25 GAUGUUCCCAACUCACGGCUG 21 11 29 10

NmiRNA#26 UGGUUCGAGUUGUAGCGGCGG 21 11 45 15

NmiRNA#27 UACAAAUUAUGUAGAGUGCAAA 19 10 24 8

NmiRNA#28 GAUGUGGGACGUUGUGGGGCG 14 8 22 8

NmiRNA#29a CGAGGAAGGCCAUGACGUUGA 3 2 13 4

NmiRNA#29b CGAGGAAGGCCAUGAUGUUGA 11 6 20 7

NmiRNA#30 AUUCGUUCGAACACCGUUGGA 11 6 16 6

NmiRNA#31 UCACGGAUGAAACGUUCUGUG 9 5 12 4

NmiRNA#32 UCUGAAGCAAGUCAAUGGAUG 9 5 16 6

NmiRNA#33 AGAGAGAAACAUAAGGUUGAG 7 4 4 1

NmiRNA#34 UGCGACGGUUGUGCUGAGAGG 7 4 9 3

NmiRNA#35 AAGGUAGGAGUUGACAAUUGC 6 3 10 3

NmiRNA#36 AUGCAUGGAGGGGAUGUGCUGU 6 3 8 3

NmiRNA#37 UGAAGUAAAAAGUAGGGAAAG 5 3 13 4

NmiRNA#38 CAGUGCAAGGUUGCAAGAGGC 5 3 15 5

NmiRNA#39 CACGGCUGAUAUGUGCUGUGA 5 3 2 1

NmiRNA#40 AGAAGAAGGAAAGGGGAAAAA 4 2 5 2

NmiRNA#41 UAGGUAGCCCGGCCCGGUGUG 3 2 4 1

NmiRNA#42 AGAUAUUCCCGAUGUAAGGCU 3 2 4 1

NmiRNA#43 UUGCAGCUGUGUAUUUAGAAU 3 2 6 2

NmiRNA#44 CACGGCGGAGAUGUGCUAUGG 3 2 2 1

NmiRNA#45 AGAUGCUCUCAAUACAUGGCGGGG 2 1 6 2


direct secondary siRNA biogenesis from the 3′ cleaved fragment of the target mRNA after miRNA-guided cleavage (Chen et al. 2010; Cuperus et al. 2010). Identification of this superfamily and its conserved targets in sacred lotus raises the hypothesis that this superfamily may also generate secondary siRNAs from its RNA targets in this plant species.
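The length criterion mentioned above can be illustrated by flagging the 22-nt members among the superfamily sequences listed in Table 2. This is a sketch only: length alone does not demonstrate secondary siRNA production, which remains a hypothesis in the text, and the variable names and the 21-nt control are choices made for this example.

```python
# miR2118/miR482/miR472 superfamily sequences copied from Table 2.
SUPERFAMILY = {
    "miR472": "UUUCCCACACCGCCCAUUCCUA",
    "miR482": "CUUCCAAUUCCGCCCAUUCCUA",
    "miR2118a,c": "UUCCCAAGGCCUCCCAUGCCGA",
    "miR2118b": "UUUCCGAUCCCACCCAUACCUA",
}

# A canonical 21-nt miRNA from Table 2, included only as a contrast.
CONTROL = {"miR156a": "UUGACAGAAGAGAGAGAGCAC"}

TRIGGER_LENGTH = 22  # length associated in the text with secondary siRNA biogenesis

all_sequences = {**SUPERFAMILY, **CONTROL}
candidates = sorted(
    name for name, seq in all_sequences.items() if len(seq) == TRIGGER_LENGTH
)

for name in candidates:
    print(f"{name}: {len(all_sequences[name])} nt")
```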

miR156/miR535 and miR529 Superfamily in Sacred Lotus

The miR156 family negatively regulates the expression of SPL transcription factors, and this regulation is critical for the vegetative-to-reproductive phase transition in plants (Poethig 2009). miR156, miR529 and miR535 share sequence similarities and thus can be grouped as a superfamily, like the miR482/2118/472 superfamily. On the basis of normalized read abundance, miR156 is the most abundantly expressed, miR535 is moderately expressed and miR529 is the least expressed in sacred lotus (Table 2). miR156 is one of the deeply conserved miRNAs in plant lineages including bryophytes. miR529 was found in rice, maize and Physcomitrella, and miR535 was found in various plant species such as grapevine and Populus as well as Physcomitrella (a bryophyte), but not in Arabidopsis (miRBase). All three miRNAs have been reported from bryophytes, but miR529 and miR535 have been selectively lost in some of the angiosperms. Interestingly, all three miRNAs that belong to this superfamily are found in sacred lotus (Table 2). Notably, miR156 is expressed at very high levels, and miR535 is expressed at moderate levels and miR529 is

Table 3 (continued)

NmiRNA#  miRNA sequence  Leaves  Leaves normalized (RPTM)  Flowers  Flowers normalized (RPTM)

NmiRNA#46 CACCGAGGGGAUAGGCAGUGG 2 1 7 2

NmiRNA#47 UCCAAAGUCCAAGGUAGAAGA 2 1 7 2

NmiRNA#48 CAUGGCGAGAAUGUGCUGCAG 2 1 3 1

NmiRNA#49 UGGGGCGCAUUCUGUUGGGGGG 2 1 7 2

Fig. 3 Predicted hairpin-like structures for some of the novel miRNA families in sacred lotus (panels: NmiRNA#10a, NmiRNA#10b, NmiRNA#10c, NmiRNA#29a and NmiRNA#29b)


Table 4 Summary of conserved and known miRNA targets found in N. nucifera compared with Arabidopsis thaliana and Oryza sativa. The total numbers of targets in Arabidopsis thaliana and Oryza sativa that were reported in published studies are indicated. The N. nucifera column lists the number of predicted miRNA targets in sacred lotus. The number of mismatches in the sacred lotus miRNA:target complementary site is indicated if it exceeds 3

miR family Target gene family Ath Osa N. nucifera No. of mismatches

miR156/529/535 SBP transcription factors 11 10 10

miR159 MYB transcription factors 7 2 2

miR3627 MYB transcription factors 2

miR159/319 TCP 5 4 3

miR319 Transcription factor GAMYB NNU_021320-RA

VP1 Regulatory protein viviparous-1 NNU_024976-RA

miR160 ARF transcription factors 3 4 3

miR162 DCL 1 1 1

miR164 NAC transcription factors 7 6 4

BHLH transcription factors 1

miR165/166 HD-Zip transcription factors 6 4 4

miR167 ARF transcription factors 2 4 3 4

Putative disease resistance protein; 14-3-3-like protein; Probable serine/threonine-protein kinase; Similar to CHX15 Cation/H NNU_018065-RA; NNU_001202-RA; NNU_001500-RA; NNU_017765-RA

miR168 Argonaute 1 6 2

miR169 HAP2/CAAT BF/ 7 8 3

miR170/171 SCL transcription factors 4 5 2

miR172 AP2 transcription factors 6 5 2

miR172 E3 ubiquitin-protein ligase MARCH9 NNU_009068-RA

Superoxide dismutase [Cu-Zn], chl NNU_023478-RA

miR390/391 TAS3 3 3 2

miR393 F-Box 5 2 1

miR394 F-Box 1 1 1 4

Probable WRKY transcription factor 19 NNU_010140-RA

Putative G3BP-like protein NNU_000984-RA

Similar to KEA1 K NNU_017671-RA

miR395 APS 3 1 2

SO2 Transp. 1 3 2

miR396 GRF 7 12 4

Protein of unknown function NNU_018235-RA

Similar to Kafirin PGK1 NNU_006992-RA

Serine/threonine-protein kinase AKL1 NNU_012471-RA

Serine-rich adhesin for platelets NNU_002283-RA

Protein of unknown function NNU_013120-RA

Protein of unknown function NNU_002395-RA

Similar to Cell wall protein AWA1 NNU_014113-RA

Similar to Protein rtoA NNU_000550-RA

Similar to cyb5d2 Neuferricin NNU_000071-RA

Serine/threonine-protein kinase phg2 NNU_001408-RA

miR397 Laccase 3 16 9

miR398 CSD 2 2 3

CCS1 1 1 1 3.5

miR399 PO4 Transp. 1 4 1 4.5

E2-UBC 1 1 2


expressed at very low levels in sacred lotus. miR156 is predicted to target 10 SPL transcription factors in sacred lotus (Table 4). Of these 10 SPLs, 8 are also targeted by miR529 and 4 by miR535 (Table 4). This suggests potential combinatorial as well as tissue-specific regulation of SPL factors in sacred lotus, as has been suggested for these miRNAs in rice (Jeong et al. 2011).

Greater Abundance of miRNA-Stars than Their Corresponding miRNAs in Sacred Lotus

In general, miRNA* species (the complementary strand of the miRNA in the duplex) are degraded much faster than the miRNA itself. Their proportion in sequenced libraries is therefore low, typically 10–50 times lower than that of the corresponding miRNA species. Similarly, their detection by small RNA blot analysis yields only a very faint signal compared with the miRNA species. In this study, we found that the miRNA* species of four known miRNA families (miR393, miR408, miR2111 and miR2950) were more abundant than their miRNA counterparts (Table 2). miR393 and miR408 are two highly conserved miRNA families in angiosperms (Sunkar and Jagadeeswaran 2008). Relative to miR393, miR393* levels were unusually high (130- and 78-fold greater abundance in leaves and flowers, respectively). miR408 is represented by two loci (miR408a and miR408b) with an identical mature miRNA sequence. However, the miRNA* sequence differs between the two loci, and the normalized abundances of the miRNA* sequences are 30–40-fold higher than that of miR408 (Table 2). Two other miRNAs that are not widely conserved (miR2111a and miR2950a) also showed much greater abundances for their respective miRNA* species (Table 2). miR2111 has been identified in 9 different species including soybean, grape,

Table 4 (continued)

miR family Target gene family Ath Osa N. nucifera No. of mismatches

miR403 Argonaute 2 2 0 1

miR408 Plantacyanin 3 7 2

Laccase 3 2 0

TAS3-siR ARF 3 5 7

miR827 SPX domain-containing membrane protein NNU_021156-RA

miR1432 Glutathione S-transferase theta-2 NNU_006730-RA

O-glucosyltransferase rumi homolog NNU_006740-RA

Protein SET DOMAIN GROUP 41 NNU_010056-RA

miR1511ab Unknown function NNU_023088-RA

Unknown function NNU_018912-RA

miR2111abc F-box/kelch-repeat protein NNU_004122-RA

F-box/kelch-repeat protein NNU_003086-RA

miR2118ac Putative disease resistance RPP13-like protein 1 NNU_020694-RA

Putative disease resistance RPP13-like protein 1 NNU_016497-RA

Putative disease resistance RPP13-like protein 1 NNU_012693-RA

RPM1 Disease resistance protein RPM1 NNU_017037-RA

TMV resistance protein N NNU_006666-RA

miR2950ab FBX13 F-box only protein 13 NNU_010524-RA

FBX13 F-box only protein 13 NNU_018151-RA

Cyclin-L1-1 NNU_024354-RA

F-box/kelch-repeat protein NNU_023566-RA

miR3627 Calcium-transporting ATPase 8, plasma membrane-type NNU_017972-RA

Probable histone acetyltransferase HAC-like 1 NNU_021107-RA

At1g80170 Probable polygalacturonase At1g80170 NNU_002791-RA

MYBBP1A Myb-binding protein 1A NNU_005057-RA

MYB86 Transcription factor MYB86 NNU_009950-RA

Putative Myb family transcription factor At1g14600 NNU_016277-RA

miR5179 Similar to FD Protein FD NNU_013614-RA

Heat stress transcription factor A-1 NNU_024675-RA


wheat and Medicago, whereas miR2950 has been identified in cotton and grapevine (miRBase Release 19, August 2012). These results suggest a potential function for miRNA* species in sacred lotus, in addition to that of the corresponding miRNAs. Indeed, it was recently demonstrated in Arabidopsis that miR393* targets a SNARE gene (Zhang et al. 2011) and miR171* targets SU(VAR)3-9 HOMOLOG8 (Manavella et al. 2013), suggesting that miRNA* species are also regulatory molecules. Experimental confirmation, for example by degradome analysis, will reveal whether both strands of the duplex (miRNA and miRNA*) or only one of them are functional in sacred lotus.
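The comparison of miRNA* and miRNA abundance reported here comes down to a ratio of normalized read counts. The following is a minimal Python sketch of that calculation. The function names, the family labels and the example values are hypothetical and are not taken from Table 2.

```python
def star_fold(mature_rptm, star_rptm):
    """Fold abundance of the miRNA* relative to its mature miRNA."""
    if mature_rptm < 0 or star_rptm < 0:
        raise ValueError("abundances must be non-negative")
    if mature_rptm == 0:
        # No mature reads: report infinity if any miRNA* reads exist.
        return float("inf") if star_rptm > 0 else 0.0
    return star_rptm / mature_rptm


def star_dominant_families(abundances, min_fold=1.0):
    """Return (family, fold) pairs where the miRNA* exceeds the miRNA,
    sorted from the highest fold to the lowest."""
    folds = [(name, star_fold(m, s)) for name, (m, s) in abundances.items()]
    hits = [(name, fold) for name, fold in folds if fold > min_fold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical normalized abundances (RPTM): family -> (miRNA, miRNA*)
example = {
    "famA": (10.0, 1300.0),
    "famB": (500.0, 25.0),
    "famC": (40.0, 1400.0),
}

for family, fold in star_dominant_families(example):
    print(f"{family}: miRNA* is {fold:.0f}-fold more abundant than the miRNA")
```

Families in which the miRNA dominates, as is usually the case, are filtered out by the `min_fold` threshold.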

Novel miRNAs in Sacred Lotus

The microRNA population of a plant species comprises not only conserved miRNAs but also novel species-specific or lineage-specific miRNAs. Conserved miRNAs have established functions, whereas novel miRNAs are thought to acquire new gene regulatory functions and to stabilize such regulation over time if it confers a biological advantage. Several such species-specific or lineage-specific miRNAs have been reported in plants. Because the small RNA population is huge and diverse and miRNAs are only a small fraction of it, sequencing of the miRNA* is required to annotate a novel small RNA as a novel miRNA in plants (Meyers et al. 2008). We have annotated 52 novel miRNAs that can be grouped into 49 novel miRNA families in sacred lotus on the basis of the presence of miRNA* reads in our small RNA libraries. Additionally, hairpin-like structures were predicted for the precursors of all these novel miRNAs (Supplemental Figure 1). Some of the novel miRNA families had more than one member. For instance, the NmiRNA#10 family is represented by three variants (#10a, b and c) whose expression levels varied significantly, and NmiRNA#29 is represented by two members in sacred lotus (Table 3, Fig. 3). Although most of the novel miRNAs are expressed at low levels, the expression levels of some are comparable with those of conserved miRNAs. For instance, NmiRNA#1 has 1,289 RPTM and 1,164 RPTM in leaves and flowers, respectively. Similarly, NmiRNA#2 has approximately 1,000 RPTM in both libraries. Such high accumulation suggests an important gene regulatory function for some of the novel miRNAs in sacred lotus.
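The RPTM (reads per ten million) values quoted for the novel miRNAs are raw read counts scaled to a common library size. A minimal sketch of this normalization follows, assuming simple proportional scaling. The library size in the example is hypothetical, since per-library totals are not given in this section.

```python
def to_rptm(raw_count, library_total):
    """Scale a raw read count to reads per ten million (RPTM)."""
    if library_total <= 0:
        raise ValueError("library_total must be positive")
    if raw_count < 0:
        raise ValueError("raw_count must be non-negative")
    # Multiply before dividing to limit floating-point rounding.
    return raw_count * 10_000_000 / library_total


def normalize_counts(raw_counts, library_total):
    """Normalize a dict of sequence -> raw count to sequence -> RPTM."""
    return {seq: to_rptm(n, library_total) for seq, n in raw_counts.items()}


# Hypothetical library of 20 million reads containing 2 reads of one sequence.
print(to_rptm(2, 20_000_000))
```

Normalizing each library separately makes read counts from libraries of different sequencing depth comparable.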

Bioinformatics Prediction of miRNA Targets in Sacred Lotus

To predict the genes targeted by conserved miRNAs in sacred lotus, we used a conservative cutoff of 3 mismatches between miRNAs and their target genes. This prediction yielded a total of 137 genes as targets for highly conserved and known miRNAs in sacred lotus (Table 4). As previously described in several other plant species, most of the predicted miRNA targets in sacred lotus are transcription factors (Table 4). Besides the conserved targets, our prediction also identified several non-conserved targets for many conserved miRNAs in sacred lotus. Notable examples are the following: besides NAC factors, miR164 is predicted to target BHLH transcription factors; in addition to the conserved Apetala 2 targets, miR172 also has complementarity with a CSD (Cu/Zn superoxide dismutase) gene (Table 4); besides its conserved CSD targets, miR398 is predicted to target several blue copper proteins; and miR396 displayed complementarity with 9 genes that are non-conserved targets for this miRNA. Target prediction also led to the identification of three MYB transcription factors as targets for miR3627, in addition to the conserved MYB transcription factors that are targeted by miR159 in sacred lotus (Table 4). AGO1 is a well-known target of miR168 in plants. AGO1 appears to have undergone tandem duplication in the sacred lotus genome (AGO1 and AGO1B), and miR168 targets both AGO1 and AGO1B (Table 4). ARFs targeted by miR167 are highly conserved across plant species, but with a cutoff of 3.0 mismatches between miR167 and ARFs, these genes cannot be predicted as targets in sacred lotus. To predict ARFs as targets, the cutoff has to be relaxed to 4.0 mismatches (Table 4). Although most confirmed plant miRNA targets are captured by the 3.0-mismatch cutoff, the prediction of ARFs as miR167 targets in sacred lotus required relaxation of this criterion, suggesting that this target relationship was stabilized in core eudicots compared with the basal eudicot sacred lotus. With the 3.0-mismatch cutoff, we predicted 4 other genes (NNU_018065, NNU_001202, NNU_001500 and NNU_017765) as targets for miR167 in sacred lotus (Table 4). Whether these 4 genes are true targets of miR167 in sacred lotus requires additional study.

Targets involved in other biological processes were also predicted in sacred lotus, including sulfate/phosphate uptake or remobilization (sulfate transporter and phosphate transporter), sulfate assimilation (ATP sulfurylases) and ubiquitination (miR393, miR394, miR399 and miR2950 are predicted to target TIR1, UBC24 and other F-box proteins). miR398 targets 2 CSD genes in both Arabidopsis and rice; these genes are involved in scavenging superoxide radicals and thus decrease oxidative stress. In sacred lotus, a total of 4 CSDs are predicted to be miRNA targets: miR398 is predicted to target 3 CSD genes and miR172 targets another CSD gene (Table 4). Only four GRFs were predicted as targets for miR396 in N. nucifera, whereas 7 and 12 GRFs were confirmed as miR396 targets in Arabidopsis and rice, respectively. On the other hand, TAS3 siRNAs were predicted to target seven Auxin Response Factors (ARFs) in N. nucifera, but only 3 and 5 ARFs were confirmed as targets in Arabidopsis and rice, respectively. Thus, there are many similarities between N. nucifera and other higher plants with respect to miRNA-guided gene regulation, but there are also minor differences with respect to the number of miRNAs and miRNA targets in N. nucifera.

An expansion of copper-dependent proteins has been found in the sacred lotus genome, and it is thought that such an expansion in multi-copper oxidases may have a role in adaptation to the aquatic environment (Ming et al. 2013). Copper-containing proteins such as basic blue proteins/chemocyanins, whose roles are not well known, are conserved targets for miR408 in different plant species. Besides the CSDs that are conserved targets for miR398, two blue-copper proteins have been predicted as targets for miR398 in sacred lotus (Table 4). These genes have not been predicted as targets for miR398 in other plant species, suggesting that miRNA regulation of copper-dependent proteins is also expanded in sacred lotus. The conserved and novel miRNAs identified in sacred lotus provide an opportunity to reveal whether miRNAs play a role in adaptation to the aquatic environment. The miRNAs identified in this study may also have a role in thermogenesis, which is thought to enhance pollination in sacred lotus. The use of only two tissues in this study limits novel miRNA discovery in sacred lotus, and the number of novel miRNAs could increase with the sequencing of small RNA libraries from different tissues, developmental stages or even stress treatments.

Methods

MicroRNA Library Construction, Sequence Analysis and Identification of Conserved and Novel miRNAs in Sacred Lotus

To identify conserved and novel miRNAs in the transcriptome of sacred lotus, small RNA libraries were constructed and analyzed as described earlier (Li et al. 2011; Jagadeeswaran et al. 2012) using RNA isolated from pooled tissues of leaves and flowers from several sacred lotus plants. In brief, small RNAs of 18–30 nucleotides were size-fractionated electrophoretically, isolated from the gel and ligated to the 5′ and 3′ RNA adapters. The ligated product was reverse transcribed and subsequently amplified using 10–12 PCR cycles. The purified PCR product was sequenced using an Illumina Genome Analyzer. A total of 47.5 million small RNA reads representing 8,348,852 unique reads were obtained from these libraries. The unique reads were used to identify conserved miRNA homologs by mapping to miRBase. The remaining sequences were analyzed to discard breakdown products of messenger RNA, rRNA, tRNA, small nuclear RNA and small nucleolar RNA by mapping to these categories of RNAs. For identification of novel miRNAs, the unique reads were first mapped to the sacred lotus genome. The surrounding sequences (downstream and upstream) were extended by 300 bp and used to predict fold-back structures using mfold. From the predicted fold-back structures, miRNA* sequences for the candidate novel miRNAs were predicted, which aided in identifying miRNA* sequences in the sequenced small RNA libraries. On the basis of the presence of miRNA* sequences in our small RNA libraries (Meyers et al. 2008), novel miRNAs were identified in sacred lotus (Supplemental Table 2).
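The step in which a mapped read is extended into its genomic context before folding can be sketched as follows. This is an illustration only: the function name and coordinate convention (0-based, half-open) are hypothetical, the flank is assumed to be applied on each side, and the folding itself (done with mfold in this study) is not reproduced.

```python
def extract_flanked_region(genome_seq, start, end, flank=300):
    """Return the read plus up to `flank` nt on each side, and the
    offset of the read inside the returned window.

    Coordinates are 0-based and half-open: the read is genome_seq[start:end].
    """
    if not (0 <= start < end <= len(genome_seq)):
        raise ValueError("read coordinates fall outside the sequence")
    if flank < 0:
        raise ValueError("flank must be non-negative")
    left = max(0, start - flank)              # clip at the scaffold start
    right = min(len(genome_seq), end + flank)  # clip at the scaffold end
    return genome_seq[left:right], start - left


# Toy scaffold: 500 nt of filler, a 21-nt read, then 100 nt of filler.
read = "CACCGAGGGGATAGGCAGTGG"
scaffold = "A" * 500 + read + "C" * 100
window, offset = extract_flanked_region(scaffold, 500, 521)
print(len(window), offset, window[offset:offset + len(read)] == read)
```

Returning the offset keeps track of where the candidate miRNA sits in the window, which is needed later to decide on which arm of the predicted hairpin it lies.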

Computational Prediction of Known miRNA Homologs in Sacred Lotus

To predict known miRNA homologs in sacred lotus that were not represented in the small RNA libraries, unique miRNA sequences were used as queries against the N. nucifera genome using BLASTN (Altschul et al. 1990). Hits with no more than two mismatches were identified, and the regions flanking the mapped mature miRNAs (150 nt downstream and 150 nt upstream) were extracted and used to predict fold-back structures using mfold (Zuker 2003). The predicted fold-back structures were examined for the presence of the miRNA on the same arm of the hairpin as the known family members from other plants. These precursor sequences were further evaluated by MIRcheck (Jones-Rhoades and Bartel 2004), and candidates with ≤6 mismatches, ≤2 bulged or asymmetrically unpaired nucleotides and ≤3 continuous mismatches within the mature miRNA were selected.
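The three thresholds applied to the mature miRNA within the hairpin can be expressed as a simple filter. The sketch below is not MIRcheck; it assumes a hypothetical per-position encoding of the mature miRNA in the predicted duplex ("|" paired, "x" unpaired, "b" bulged or asymmetrically unpaired) and treats every non-paired position as a mismatch, which is one possible reading of the criteria.

```python
from itertools import groupby


def passes_hairpin_filter(states, max_unpaired=6, max_bulged=2, max_run=3):
    """Check the pairing states of a mature miRNA against three thresholds.

    states: string with one character per miRNA position,
            "|" = paired, "x" = unpaired, "b" = bulged/asymmetric.
    """
    if set(states) - set("|xb"):
        raise ValueError("states may only contain '|', 'x' and 'b'")
    unpaired = sum(1 for s in states if s != "|")
    bulged = states.count("b")
    # Length of the longest stretch of consecutive non-paired positions.
    longest_run = max(
        (sum(1 for _ in group)
         for is_unpaired, group in groupby(states, key=lambda s: s != "|")
         if is_unpaired),
        default=0,
    )
    return (unpaired <= max_unpaired
            and bulged <= max_bulged
            and longest_run <= max_run)


# Hypothetical 21-nt candidates
print(passes_hairpin_filter("||xx||b||||||x|||||||"))
print(passes_hairpin_filter("||||xxxx|||||||||||||"))
```

The default thresholds mirror the values stated in the text; they are exposed as parameters so that stricter or looser settings can be tried.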

Prediction of miRNA Targets in Sacred Lotus

The HitSensor algorithm was used to predict complementary sites of unique miRNAs and tasiRNAs on lotus mRNA sequences (Zheng and Zhang 2010). The conserved miRNA/tasiRNA target families were selected based on published results (Zheng et al. 2012). A maximum of 3.0 mismatches between the miRNA and its target mRNA was allowed in predicting targets for the N. nucifera miRNAs. Although most confirmed plant miRNA targets are captured by this cutoff, the prediction of some authentic targets required relaxation of this criterion. For instance, ARF genes that are known targets of miR167 could not be identified using this cutoff and required relaxation of the criterion to 4.0 mismatches. A few additional targets, such as a phosphate transporter for miR399, CCS1 for miR398 and an F-box protein for miR394, also required relaxation of the number of mismatches allowed between the miRNA and its target, as shown in Table 4.
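The mismatch cutoff can be illustrated with a small scoring function. This is a sketch and not the HitSensor algorithm: it assumes an ungapped alignment and a penalty of 1.0 per mismatch and 0.5 per G:U wobble pair, a convention common in plant target prediction that would be consistent with the fractional values (3.5, 4.5) in Table 4 but is not stated in the text. All function names are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def _rna(seq):
    """Upper-case the sequence and convert DNA letters to RNA."""
    return seq.upper().replace("T", "U")


def mismatch_score(mirna, target_site):
    """Score an ungapped miRNA:target duplex (both given 5'->3').

    Assumed penalties: mismatch = 1.0, G:U wobble = 0.5.
    """
    mirna, target_site = _rna(mirna), _rna(target_site)
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    score = 0.0
    # The 5' end of the miRNA pairs with the 3' end of the target site.
    for m, t in zip(mirna, reversed(target_site)):
        if (m, t) in WATSON_CRICK:
            continue
        score += 0.5 if (m, t) in WOBBLE else 1.0
    return score


def is_predicted_target(mirna, target_site, cutoff=3.0):
    """Apply the mismatch cutoff used for target prediction."""
    return mismatch_score(mirna, target_site) <= cutoff


def reverse_complement(seq):
    """Perfectly complementary target site for a miRNA."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(_rna(seq)))


mirna = "UCCAAAGUCCAAGGUAGAAGA"  # NmiRNA#47 sequence from Table 3
site = reverse_complement(mirna)
print(mismatch_score(mirna, site), is_predicted_target(mirna, site))
```

Relaxing the criterion, as was done for the miR167:ARF pairs, corresponds to calling `is_predicted_target` with `cutoff=4.0`.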

Acknowledgments This research was supported by the Oklahoma Agricultural Experiment Station to RS and by a start-up grant of Fudan University to YZ.

References

Allen E, Howell MD (2010) miRNAs in the biogenesis of trans-acting siRNAs in higher plants. Semin Cell Dev Biol 21:798–804

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410

Arenas-Huertero C, Perez B, Rabanal F, Blanco-Melo D, De la Rosa C, Estrada-Navarrete G et al (2009) Conserved and novel miRNAs in the legume Phaseolus vulgaris in response to stress. Plant Mol Biol 70:385–401

Chen X (2009) Small RNAs and their roles in plant development. Annu Rev Cell Dev Biol 25:21–44

Chen H-M, Chen L-T, Patel K, Li Y-H, Baulcombe DC, Wu S-H (2010) 22-Nucleotide RNAs trigger secondary siRNA biogenesis in plants. Proc Natl Acad Sci U S A 107:15269–15274

Cuperus JT, Carbonell A, Fahlgren N, Garcia-Ruiz H, Burke RT, Takeda A, Sullivan CM, Gilbert SD, Montgomery TA, Carrington JC (2010) Unique functionality of 22-nt miRNAs in triggering RDR6-dependent siRNA biogenesis from target transcripts in Arabidopsis. Nat Struct Mol Biol 17:997–1003

Hayden KE, Willard HF (2012) Composition and organization of active centromere sequences in complex genomes. BMC Genomics 13:324

Jagadeeswaran G, Zheng Y, Li Y, Shukla L, Matts J, Hoyt P, Graham MS, Roe BA, Zhang W, Sunkar R (2009) Sequencing of a small RNA library from Medicago truncatula revealed four families of novel legume-specific and candidate microRNAs. New Phytol 184:85–98

Jagadeeswaran G, Nimmakayala P, Zheng Y, Gowdu K, Reddy UK, Sunkar R (2012) Characterization of small RNA component of the leaves and fruits from four cucurbit species. BMC Genomics 13(1):329

Jeong DH, Park S, Zhai J, Gurazada SG, De Paoli E, Meyers BC, Green PJ (2011) Massive analysis of rice small RNAs: mechanistic implications of regulated microRNAs and variants for differential target RNA cleavage. Plant Cell 23:4185–4207

Johnson C, Kasprzewska A, Tennessen K, Fernandes J, Nan GL, Walbot V, Sundaresan V, Vance V, Bowman LH (2009) Clusters and superclusters of phased small RNAs in the developing inflorescence of rice. Genome Res 19:1429–1440

Jones-Rhoades MW, Bartel DP (2004) Computational identification of plant microRNAs and their targets, including a stress-induced miRNA. Mol Cell 14:787–799

Jones-Rhoades MJ, Bartel B, Bartel DP (2006) MicroRNAs and their regulatory targets in plants. Annu Rev Plant Biol 57:19–53

Khraiwesh B, Arif MA, Seumel GI, Ossowski S, Weigel D, Reski R, Frank W (2010) Transcriptional control of gene expression by microRNAs. Cell 140:111–122

Li Z, Zheng Y, Jagadeeswaran G, Li Y, Gowdu K, Sunkar R (2011) Identification and temporal expression analysis of conserved and novel miRNAs in Sorghum. Genomics 98:460–468

Li F, Pignatta D, Bendix C, Brunkard JO, Cohn MM, Tung J, Sun H, Kumar P, Baker B (2012) MicroRNA regulation of plant innate immune receptors. Proc Natl Acad Sci U S A 109:1790–1795

Lu C et al (2008) Genome-wide analysis for discovery of rice microRNAs reveals natural antisense microRNAs (nat-miRNAs). Proc Natl Acad Sci U S A 105:4951–4956

Manavella PA, Koenig D, Rubio-Somoza I, Burbano HA, Becker C, Weigel D (2013) Tissue-specific silencing of Arabidopsis SU(VAR)3-9 HOMOLOG8 by miR171a. Plant Physiol 161:805–812

Meyers BC, Axtell MJ, Bartel B et al (2008) Criteria for annotation of plant MicroRNAs. Plant Cell 20:3186–3190

Ming R, VanBuren R, Liu Y et al (2013) The genome of the long-living sacred lotus (Nelumbo nucifera, Gaertn.). Genome Biol 14:R41

Paterson AH, Bowers JE, Bruggmann R et al (2009) The Sorghum bicolor genome and the diversification of grasses. Nature 457:551–556

Poethig RS (2009) Small RNAs and developmental timing in plants. Curr Opin Genet Dev 19:374–378

Shivaprasad PV, Chen HM, Patel K, Bond DM, Santos BA, Baulcombe DC (2012) A microRNA superfamily regulates nucleotide binding site-leucine-rich repeats and other mRNAs. Plant Cell 24:859–874

Song X et al (2012) Roles of DCL4 and DCL3b in rice phased small RNA biogenesis. Plant J 69:462–474

Sunkar R, Jagadeeswaran G (2008) In silico identification of conserved miRNAs in large number of diverse plant species. BMC Plant Biol 8:37

Sunkar R, Zhu JK (2007) Micro RNAs and short-interfering RNAs in plants. J Integr Plant Biol 49:817–826

Sunkar R, Zhou X, Zheng Y, Zhang W, Zhu J-K (2008) Identification of novel and candidate miRNAs in rice by high throughput sequencing. BMC Plant Biol 8:25

Sunkar R, Li Y, Jagadeeswaran G (2012) Functions of microRNAs in plant stress responses. Trends Plant Sci 17:196–203

Vaucheret H (2009) AGO1 Homeostasis involves differential production of 21-nt and 22-nt miR168 Species by MIR168a and MIR168b. PLoS One 4:e6442

Voinnet O (2009) Origin, biogenesis, and activity of plant microRNAs. Cell 136:669–687

Wu L, Zhou H, Zhang Q, Zhang J, Ni F, Liu C, Qi Y (2010) DNA methylation mediated by a microRNA pathway. Mol Cell 38:465–475

Zanca AS, Vicentini R, Ortiz-Morea FA, Del Bem LE, da Silva MJ, Vincentz M, Nogueira FT (2010) Identification and expression analysis of microRNAs and targets in the biofuel crop sugarcane. BMC Plant Biol 10:260

Zhai J et al (2011) MicroRNAs as master regulators of the plant NB-LRR defense gene family via the production of phased, trans-acting siRNAs. Genes Dev 25:2540–2553

Zhang L, Chia JM, Kumari S, Stein JC, Liu Z, Narechania A, Maher CA, Guill K, McMullen MD, Ware D (2009) A genome-wide characterization of microRNA genes in maize. PLoS Genet 5:e1000716

Zhang X, Zhao H, Gao S, Wang WC, Katiyar-Agarwal S, Huang HD, Raikhel N, Jin H (2011) Arabidopsis Argonaute 2 regulates innate immunity via miRNA393(*)-mediated silencing of a Golgi-localized SNARE gene, MEMB12. Mol Cell 42:356–366

Zheng Y, Zhang W (2010) Animal microRNA target prediction using diverse sequence-specific determinants. J Bioinform Comput Biol 8:763–788

Zheng Y, Li Y-F, Sunkar R, Zhang W (2012) SeqTar: An effective method for identifying microRNA guided cleavage sites from degradome of polyadenylated transcripts in plants. Nucleic Acids Res 40:e28

Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31:3406–3415
