Similar literature
 20 similar articles found (search time: 859 ms)
1.
2.
Plant genomes have undergone multiple rounds of duplication that contributed massively to the growth of gene families. The structure of the resulting families has been studied in depth for protein-coding genes. However, little is known about the impact of duplications on noncoding RNA (ncRNA) genes. Here we perform a systematic analysis of duplicated regions in the rice genome in search of such ncRNA repeats. We observe that, just like their protein-coding counterparts, most ncRNA genes have undergone multiple duplications that left visible sequence conservation footprints. The extent of ncRNA gene duplication in plants is such that these sequence footprints can be exploited for the discovery of novel ncRNA gene families on a large scale. We developed an SVM model that is able to retrieve likely ncRNA candidates among the 100,000+ repeat families in the rice genome, with a reasonably low false-positive discovery rate. Among the nearly 4000 ncRNA families predicted by this means, only 90 correspond to putative snoRNA or miRNA families. About half of the remaining families are classified as structured RNAs. New candidate ncRNAs are particularly enriched in UTR and intronic regions. Interestingly, 89% of the putative ncRNA families do not produce a detectable signal when their sequences are compared to another grass genome such as maize. Our results show that a large fraction of rice ncRNA genes are present in multiple copies and are species-specific or of recent origin. Intragenome comparison is a unique and potent source for the computational annotation of this major class of ncRNA.

3.
Many noncoding RNAs (ncRNAs) function through both their sequences and secondary structures. Thus, secondary structure derivation is an important issue in today's RNA research. The state-of-the-art structure annotation tools are based on comparative analysis, which derives the consensus structure of homologous ncRNAs. Despite promising results from existing ncRNA alignment and consensus structure derivation tools, there is a need for more efficient and accurate ncRNA secondary structure modeling and alignment methods. In this work, we introduce a consensus structure derivation approach based on the grammar string, a novel ncRNA secondary structure representation that encodes an ncRNA's sequence and secondary structure in the parameter space of a context-free grammar (CFG) and a full RNA grammar including pseudoknots. Being a string defined on a special alphabet constructed from a grammar, the grammar string converts ncRNA alignment into sequence alignment. We derive consensus secondary structures for hundreds of ncRNA families from BraliBase 2.1 and for 25 families containing pseudoknots using grammar string alignment. Our experiments show that grammar string-based structure derivation competes favorably in consensus structure quality with Murlet and RNASampler. Source code and experimental data are available at http://www.cse.msu.edu/~yannisun/grammar-string.

4.
5.
Non-coding RNAs (ncRNAs) are ubiquitous in normal and cancer cells. Despite their prevalence, the functions of most long ncRNAs remain uncharacterized. The fission yeast Schizosaccharomyces pombe expresses >1800 ncRNAs annotated to date, but most unconventional ncRNAs (excluding tRNA, rRNA, snRNA and snoRNA) remain uncharacterized. To discover functional ncRNAs, we performed a combined screen of computational and biological tests. First, all S. pombe ncRNAs were screened in silico for those showing conservation in sequence as well as in secondary structure with ncRNAs in closely related species. Almost half of the 151 selected conserved ncRNA genes were uncharacterized. Twelve ncRNA genes that did not overlap with protein-coding sequences were next chosen for biological screening that examined defects in growth or sexual differentiation, as well as sensitivities to drugs and stresses. Finally, we highlighted an ncRNA transcribed from SPNCRNA.1669, which inhibited untimely initiation of sexual differentiation. A domain that the computational analysis predicted to form a conserved secondary structure was essential for the ncRNA to function. Thus, this study demonstrates that in silico selection focusing on conservation of secondary structure across species is a powerful method to pinpoint novel functional ncRNAs.

6.
7.
Secondary structure remains the most exploitable feature for noncoding RNA (ncRNA) gene finding in genomes. However, methods based on secondary structure prediction may generate an excessive number of candidates for validation and have yet to deliver the performance needed to complement experimental efforts in ncRNA gene finding. This paper investigates a novel measure, unpaired structural entropy (USE), of the fold stability of ncRNA structures. USE proves effective in distinguishing from the genome background a class of ncRNAs that contain a long-stem hairpin loop, such as precursor microRNAs (pre-miRNAs). On pre-miRNAs, USE correlates well with, and performs better than, other measures, including the previously formulated structural entropy. When used in an SVM classifier, USE outperforms existing pre-miRNA classifiers. A long-stem hairpin loop is common to a number of other functional RNAs, including intron splicing hairpin loops and intrinsic termination hairpin loops. We believe USE can be further applied in developing ab initio prediction programs for a larger class of ncRNAs.

8.
9.
10.
11.
We present a survey of non-coding RNAs and other structured RNA motifs in the genomes of Caenorhabditis elegans and Caenorhabditis briggsae using the RNAz program. This approach explicitly evaluates comparative sequence information to detect stabilizing selection acting on RNA secondary structure. We detect 3,672 structured RNA motifs, of which only 678 are known non-translated RNAs (ncRNAs) or clear homologs of known C. elegans ncRNAs. Most of these signals are located in introns or at a distance from known protein-coding genes. With an estimated false positive rate of about 50% and a sensitivity on the order of 50%, we estimate that the nematode genomes contain between 3,000 and 4,000 RNAs with evolutionarily conserved secondary structures. Only a small fraction of these belong to the known RNA classes, including tRNAs, snoRNAs, snRNAs, or microRNAs. A relatively small class of ncRNA candidates is associated with previously observed RNA-specific upstream elements.
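
The estimate of 3,000 to 4,000 structured RNAs in the abstract above can be reproduced with simple arithmetic. The sketch below is an illustration, not the authors' calculation: it assumes that "false positive rate" means the fraction of reported hits that are false, and the function name is hypothetical.

```python
def estimate_total_structured_rnas(hits: int,
                                   false_fraction: float,
                                   sensitivity: float) -> float:
    """Back-of-the-envelope estimate of the true number of structured RNAs.

    Assumption (not stated in the abstract): `false_fraction` is the share
    of reported hits that are false positives.
    """
    true_hits = hits * (1.0 - false_fraction)  # hits that are real
    return true_hits / sensitivity             # correct for missed RNAs


# Figures quoted in the abstract: 3,672 hits, ~50% false, ~50% sensitivity.
estimate = estimate_total_structured_rnas(3672, 0.5, 0.5)
print(estimate)
```

Under this reading, about 1,836 hits are genuine, and correcting for the half that are missed gives a total that falls inside the quoted range.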

12.
13.
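In recent years, a growing body of research has shown that RNA-binding proteins (RBPs) and many types of noncoding RNAs (ncRNAs) regulate one another through diverse mechanisms. On the one hand, RBPs can regulate the biogenesis, stability, and function of ncRNAs; on the other hand, ncRNAs can affect the function and structure of RBPs. In addition, interactions between RBPs and ncRNAs play an important role in regulating other target genes, and thereby take part in numerous biological processes, such as tissue development, metabolic diseases, neurodegenerative diseases, antiviral immunity, and various cancers. This article reviews research progress on the modes of interaction and regulatory mechanisms between RBPs and common types of ncRNAs, including miRNA, lncRNA, and circRNA.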

14.
15.
Sequence-based heuristics for faster annotation of non-coding RNA families (cited 7 times in total: 0 self-citations, 7 citations by others)
MOTIVATION: Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool to find new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy, while making searches significantly faster for virtually all ncRNA families. However, these rigorous filters make searches slower than heuristics could be. RESULTS: In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, whereas our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that, unlike family-specific solutions, can scale to hundreds of ncRNA families. AVAILABILITY: The source code is available under GNU Public License at the supplementary web site.
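
The filtering idea in the abstract above, a cheap sequence-only pass that decides which windows reach the slow sequence-plus-structure scorer, can be sketched as follows. This is a toy illustration only: the two scoring functions are hypothetical stand-ins (simple position matching and mirror-position base pairing), not a profile HMM or a covariance model, and the thresholds are arbitrary.

```python
from typing import List, Tuple

# Watson-Crick and G-U wobble pairs, used by the toy structure score.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def sequence_score(window: str, consensus: str) -> int:
    """Cheap filter score: positions that match the consensus (stand-in only)."""
    return sum(1 for a, b in zip(window, consensus) if a == b)


def structure_score(window: str) -> int:
    """Toy 'expensive' term: base pairs between mirrored positions of a hairpin."""
    n = len(window)
    return sum(1 for i in range(n // 2) if (window[i], window[n - 1 - i]) in PAIRS)


def filtered_search(windows: List[str], consensus: str,
                    filter_threshold: int,
                    final_threshold: int) -> Tuple[List[Tuple[str, int]], int]:
    """Run the expensive scorer only on windows that pass the cheap filter."""
    hits, expensive_calls = [], 0
    for w in windows:
        seq = sequence_score(w, consensus)
        if seq < filter_threshold:   # heuristic filter: skip unlikely windows
            continue
        expensive_calls += 1
        total = seq + structure_score(w)
        if total >= final_threshold:
            hits.append((w, total))
    return hits, expensive_calls


windows = ["GGGAAACCC", "GGGAUACCC", "AUAUAUAUA", "CCCAAAGGG"]
hits, calls = filtered_search(windows, "GGGAAACCC", filter_threshold=6, final_threshold=10)
print(hits, calls)
```

In this toy run only two of the four windows reach the expensive scorer. A heuristic filter of this kind trades a guarantee of accuracy for speed, which is the contrast with rigorous filters that the abstract draws.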

16.
17.
18.
19.
20.