20 similar articles found.
1.
2.
Dynalign: an algorithm for finding the secondary structure common to two RNA sequences (total citations: 28; self-citations: 0; citations by others: 28)
With the rapid increase in the size of the genome sequence database, computational analysis of RNA will become increasingly important in revealing structure-function relationships and potential drug targets. RNA secondary structure prediction for a single sequence is 73% accurate on average for a large database of known secondary structures. This level of accuracy provides a good starting point for determining a secondary structure either by comparative sequence analysis or by the interpretation of experimental studies. Dynalign is a new computer algorithm that improves the accuracy of structure prediction by combining free energy minimization and comparative sequence analysis to find a low free energy structure common to two sequences without requiring any sequence identity. It uses a dynamic programming construct suggested by Sankoff. Dynalign, however, restricts the maximum distance, M, allowed between aligned nucleotides in the two sequences. This makes the calculation tractable because the complexity is reduced to O(M^3 N^3), where N is the length of the shorter sequence. The accuracy of Dynalign was tested with sets of 13 tRNAs, seven 5S rRNAs, and two R2 3' UTR sequences. On average, Dynalign predicted 86.1% of known base pairs in the tRNAs, as compared to 59.7% for free energy minimization alone. For the 5S rRNAs, the average accuracy improves from 47.8% to 86.4%. The secondary structure of the R2 3' UTR from Drosophila takahashii is poorly predicted by standard free energy minimization. With Dynalign, however, the structure predicted in tandem with the sequence from Drosophila melanogaster nearly matches the structure determined by comparative sequence analysis.
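The effect of the distance limit M can be illustrated with a small sketch. This is not the Dynalign recursion itself; it only enumerates which aligned-position pairs survive a band constraint. It assumes the constraint has the form |i - k| <= M for position i in one sequence and position k in the other, which the abstract does not spell out, and the function names are hypothetical.

```python
def banded_pairs(n1, n2, m):
    """Yield 0-based index pairs (i, k) that a band of half-width m allows.

    Assumption: position i of sequence 1 may align to position k of
    sequence 2 only when |i - k| <= m.
    """
    for i in range(n1):
        lo = max(0, i - m)        # first allowed position in sequence 2
        hi = min(n2 - 1, i + m)   # last allowed position in sequence 2
        for k in range(lo, hi + 1):
            yield i, k


def count_banded_pairs(n1, n2, m):
    """Count the aligned-position pairs that survive the band constraint."""
    return sum(1 for _ in banded_pairs(n1, n2, m))
```

For two sequences of length 5, `count_banded_pairs(5, 5, 1)` keeps 13 of the 25 possible pairs. The number of pairs grows roughly linearly in N for fixed M instead of quadratically, which is the kind of saving that makes the banded calculation tractable.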
3.
Shima Jahani Elahe Nazeri Keivan Majidzadeh-A Mona Jahani Rezvan Esmaeili 《Journal of cellular physiology》2020,235(7-8):5501-5510
Circular RNAs (circRNAs) were recently discovered as a looped subset of competing endogenous RNAs with the ability to regulate gene expression by microRNA sponging. Several studies have examined their potential roles in the development of cancers such as colorectal cancer and basal cell carcinoma. However, there is still a significant gap in knowledge about circRNA functions in breast cancer (BC) progression. The current study systematically reviewed circRNA biogenesis and the potential role of circRNAs as novel biomarkers in BC, based on studies published in the MEDLINE®/PubMed, Cochrane®, and Scopus® databases. The results showed a general dysregulation of circRNA expression in BC cells in a cell-type- and stage-specific manner. The potential connections between circRNAs and BC cell proliferation, apoptosis, metastasis, and chemotherapy sensitivity and resistance are discussed.
4.
Wenlin Li Lisa N. Kinch P. Andrew Karplus Nick V. Grishin 《Protein science : a publication of the Protein Society》2015,24(7):1075-1086
Chameleon sequences (ChSeqs) are sequence strings of identical amino acids that can adopt different conformations in protein structures. Researchers have detected and studied ChSeqs to understand the interplay between local and global interactions in protein structure formation. The different secondary structures adopted by one ChSeq challenge sequence-based secondary structure predictors. With increasing numbers of available Protein Data Bank structures, we here identify a large set of ChSeqs ranging from 6 to 10 residues in length. The homologous ChSeqs discovered highlight the structural plasticity involved in biological function. Compared with previous studies, the set of unrelated ChSeqs found represents an approximately 20-fold increase in the number of detected sequences, as well as an increase in the longest ChSeq length from 8 to 10 residues. We applied secondary structure predictors to our ChSeqs and found that methods based on a sequence profile outperformed methods based on a single sequence. For the unrelated ChSeqs, the evolutionary information provided by the sequence profile typically allows successful prediction of the prevailing secondary structure adopted in each protein family. Our dataset will facilitate future studies of ChSeqs, as well as interpretations of the interplay between local and nonlocal interactions. A user-friendly web interface for this ChSeq database is available at prodata.swmed.edu/chseq.
5.
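In recent years, a growing body of research has shown that RNA binding proteins (RBPs) and many types of noncoding RNAs (ncRNAs) regulate one another through diverse mechanisms. On the one hand, RBPs can regulate the biogenesis, stability, and function of ncRNAs; on the other hand, ncRNAs can affect the function and structure of RBPs. In addition, interactions between RBPs and ncRNAs play an important role in regulating other target genes and thereby participate in numerous biological processes, such as tissue development, metabolic diseases, neurodegenerative diseases, antiviral immunity, and various cancers. This article reviews research progress on the modes of interaction and regulatory mechanisms between RBPs and common types of ncRNAs, including miRNA, lncRNA, and circRNA.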
6.
Yuting He Xingsong Li Yuhuan Meng Shuying Fu Ying Cui Yong Shi Hongli Du 《Journal of cellular biochemistry》2019,120(10):16692-16702
Breast cancer, the most common cancer in women worldwide, is associated with high mortality. Long non-coding RNAs (lncRNAs), which have little capacity to code for proteins, play an increasingly important role in the cancer paradigm. Accumulating evidence demonstrates that lncRNAs have crucial connections with breast cancer prognosis, while the study of lncRNAs in breast cancer is still at an early stage. In this study, we collected 1052 clinical patient samples, a comparatively large sample size, including 13,159 lncRNA expression profiles of breast invasive carcinoma (BRCA) from The Cancer Genome Atlas database to identify prognosis-related lncRNAs. We randomly separated these clinical patient samples into training and testing sets. In the training set, we performed univariable Cox regression analysis for primary screening and ran the robust likelihood-based survival model 1000 times. Then 11 lncRNAs with a frequency of more than 600 were selected for prediction of the prognosis of BRCA. Using multivariate Cox regression analysis, we established a signature risk-score formula for the 11 lncRNAs to identify the relationship between the lncRNA signature and overall survival (OS). The 11-lncRNA signature was validated in both the testing set and the complete set and could effectively separate high- and low-risk groups with different OS. We also verified our results in different stages. Moreover, we analyzed the connection between the 11 lncRNAs and the genes ESR1, PGR, and Her2, whose protein products (ESR, PGR, and HER2) are widely used to classify breast cancer subtypes. The results indicated correlations between the 11 lncRNAs and the genes PGR and ESR1. Thus, a prognostic model based on the expression of 11 lncRNAs was developed to classify the BRCA clinical patient samples, providing new avenues for understanding potential therapeutic methods for breast cancer.
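A signature risk score of the kind described above is commonly a linear combination of expression values weighted by Cox regression coefficients. The sketch below shows that general form only. The coefficient values, gene names, and the median cutoff used to split patients are illustrative assumptions, not values or choices taken from the study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values: sum(coef[g] * expr[g]).

    `coefficients` maps each signature gene to a (hypothetical) Cox
    regression coefficient; `expression` maps genes to expression values.
    """
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk.

    Assumption: the cutoff is the median score, with scores strictly
    above the median labeled 'high'.
    """
    cutoff = median(scores.values())
    return {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores.items()
    }
```

For example, with hypothetical coefficients `{"lncA": 0.5, "lncB": -1.0}` and expression `{"lncA": 2.0, "lncB": 1.0}`, the score is 0.5 × 2.0 − 1.0 × 1.0 = 0.0.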
7.
8.
Daniel Luis Notari Aurione Molin Vanessa Davanzo Douglas Picolotto Helena Graziottin Ribeiro Scheila de Avila e Silva 《Bioinformation》2014,10(6):381-383
A whole genome contains not only coding regions but also non-coding regions. These are located between the end of a given coding region and the beginning of the following coding region. For this reason, information about the gene regulation process lies in intergenic regions. There is no easy way to obtain intergenic regions from currently available databases. IntergenicDB was developed to integrate data on intergenic regions and their related gene information from NCBI databases. The main goal of IntergenicDB is to offer a user-friendly database of intergenic sequences from bacterial genomes.
Availability
http://intergenicdb.bioinfoucs.com/
9.
10.
Signature sequences are contiguous patterns of amino acids 10-50 residues long that are associated with a particular structure or function in proteins. These may be of three types (by our nomenclature): superfamily signatures, remnant homologies, and motifs. We have performed a systematic search through a database of protein sequences to automatically and preferentially find remnant homologies and motifs. This was accomplished in three steps: 1. We generated a nonredundant sequence database. 2. We used BLAST3 (Altschul and Lipman, Proc. Natl. Acad. Sci. U.S.A. 87:5509-5513, 1990) to generate local pairwise and triplet sequence alignments for every protein in the database vs. every other. 3. We selected "interesting" alignments and grouped them into clusters. We find that most of the clusters contain segments from proteins which share a common structure or function. Many of them correspond to signatures previously noted in the literature. We discuss three previously recognized motifs in detail (FAD/NAD-binding, ATP/GTP-binding, and cytochrome b5-like domains) to demonstrate how the alignments generated by our procedure are consistent with previous work and make structural and functional sense. We also discuss two signatures (for N-acetyltransferases and glycerol-phosphate binding) which to our knowledge have not been previously recognized.
11.
In proteome studies, identification of proteins requires searching protein sequence databases. The public protein sequence databases (e.g., NCBInr, UniProt) each contain millions of entries, and private databases add thousands more. Although much of the sequence information in these databases is redundant, each database uses distinct identifiers for the identical protein sequence and often contains unique annotation information. Users of one database obtain a database-specific sequence identifier that is often difficult to reconcile with the identifiers from a different database. When multiple databases are used for searches or the databases being searched are updated frequently, interpreting the protein identifications and associated annotations can be problematic. We have developed a database of unique protein sequence identifiers called Sequence Globally Unique Identifiers (SEGUID) derived from primary protein sequences. These identifiers serve as a common link between multiple sequence databases and are resilient to annotation changes in either public or private databases throughout the lifetime of a given protein sequence. The SEGUID Database can be downloaded (http://bioinformatics.anl.gov/SEGUID/) or easily generated at any site with access to primary protein sequence databases. Since SEGUIDs are stable, predictions based on the primary sequence information (e.g., pI, Mr) can be calculated just once; we have generated approximately 500 different calculations for more than 2.5 million sequences. SEGUIDs are used to integrate MS and 2-DE data with bioinformatics information and provide the opportunity to search multiple protein sequence databases, thereby providing a higher probability of finding the most valid protein identifications.
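An identifier derived purely from the primary sequence can be sketched as a digest of the sequence string. The abstract does not state how SEGUIDs are computed. The version below assumes a base64-encoded SHA-1 digest of the uppercased sequence with trailing padding removed. The whitespace stripping and the function name are additions for illustration.

```python
import base64
import hashlib


def sequence_digest_id(sequence):
    """Return a sequence-derived identifier (illustrative, SEGUID-like).

    Assumptions: whitespace is ignored, the sequence is uppercased, and
    the identifier is the base64 SHA-1 digest without '=' padding.
    """
    normalized = "".join(sequence.split()).upper()
    digest = hashlib.sha1(normalized.encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")
```

Because the identifier depends only on the residues, two database entries with different accession numbers but the same sequence map to the same key, and an annotation change leaves the key untouched.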
12.
Kersten T. Schroeder Scott A. McPhee Jonathan Ouellet David M.J. Lilley 《RNA (New York, N.Y.)》2010,16(8):1463-1468
The kink-turn (k-turn) is a common structural motif in RNA that introduces a tight kink into the helical axis. k-turns play an important architectural role in RNA structures and serve as binding sites for a number of proteins. We have created a database of known and postulated k-turn sequences and three-dimensional (3D) structures, available via the internet. This site provides (1) a database of sequence and structure, as a resource for the RNA community, and (2) a tool to enable the manipulation and comparison of 3D structures where known.
13.
14.
15.
16.
Mohammad-Taher Moradi Hossein Fallahi Zohreh Rahimi 《Journal of cellular biochemistry》2019,120(3):3339-3352
The competitive endogenous RNA (ceRNA) hypothesis suggests that long noncoding RNAs (lncRNAs) can function as sinks for pools of microRNAs (miRNAs); thereby, in the presence of a ceRNA, messenger RNAs (mRNAs) targeted by specific miRNAs can be released and translated into protein. Maternally expressed gene 3 (MEG3) is a lncRNA whose expression has been detected in various normal tissues, while it is lost or downregulated in human tumors. MEG3 is an imprinted gene that is methylated and suppressed by the DNA methyltransferase (DNMT) family. miRNAs are also involved in the regulation of MEG3 gene expression. Interestingly, the lncRNA MEG3 (lnc-MEG3), as a ceRNA, affects various cell processes such as proliferation, apoptosis, and angiogenesis by sponging miRNAs. These miRNAs, in turn, regulate different mRNAs in different pathways. This review focuses on the interaction between lnc-MEG3 and experimentally validated miRNAs. In addition, the discussion is supplemented by data obtained from the mirPath (v.3) and TarBase (v.8) databanks to provide more details about the pathways affected by this ceRNA.
17.
18.
19.
20.
Competing endogenous RNA database (total citations: 1; self-citations: 0; citations by others: 1)
A given mRNA can be regulated by interactions with miRNAs, and in turn the availability of these miRNAs can be regulated by their interactions with alternate mRNAs. The concept of regulation of a given mRNA by an alternate mRNA (competing endogenous mRNA) by virtue of interactions with miRNAs through shared miRNA response elements is poised to become a fundamental genetic regulatory mechanism. The molecular basis of this mRNA-mRNA cross-talk is the miRNA response elements, which can be predicted based on both molecular interaction and evolutionary conservation. By examining the co-occurrence of miRNA response elements in mRNAs on a genome-wide basis, we predict competing endogenous RNAs for specific mRNAs targeted by miRNAs. Comparison of the mRNAs predicted to regulate PTEN with recently published work indicates that the results presented within the competing endogenous RNA database (ceRDB) have biological relevance.
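The co-occurrence idea can be sketched as a simple ranking: for a target mRNA, score every other mRNA by the number of miRNAs whose response elements both transcripts carry. This is a minimal illustration under stated assumptions, not the scoring used by ceRDB. The input mapping, the shared-count score, and all names are hypothetical.

```python
def rank_cerna_candidates(target, mre_map):
    """Rank mRNAs by how many miRNAs they share with `target`.

    `mre_map` maps each mRNA name to the set of miRNAs predicted to have
    response elements in it (hypothetical input). Genes sharing no miRNA
    with the target are omitted. Ties are broken alphabetically.
    """
    target_mirnas = mre_map[target]
    scores = {}
    for gene, mirnas in mre_map.items():
        if gene == target:
            continue
        shared = len(target_mirnas & mirnas)  # co-occurring miRNAs
        if shared:
            scores[gene] = shared
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

With a toy mapping in which `geneA` carries sites for three miRNAs, `geneB` shares two of them, and `geneC` shares one, the call `rank_cerna_candidates("geneA", mre_map)` lists `geneB` ahead of `geneC`.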