Found 20 similar articles; search took 265 ms
1.
Background
Upwards of 1200 miRNA loci have hitherto been annotated in the human genome. The specific features defining a miRNA precursor and deciding its recognition and subsequent processing are not yet exhaustively described, and miRNA loci thus cannot be computationally identified with sufficient confidence.
Results
We rendered pre-miRNA and non-pre-miRNA hairpins as strings of integrated sequence-structure information, and used the software Teiresias to identify sequence-structure motifs (ss-motifs) of variable length in these data sets. Using only ss-motifs as features in a Support Vector Machine (SVM) algorithm for pre-miRNA identification achieved 99.2% specificity and 97.6% sensitivity on a human test data set, which is comparable to previously published algorithms employing combinations of sequence-structure and additional features. Further analysis of the ss-motif information contents revealed strongly significant deviations from those of the respective training sets, providing important potential clues as to how the sequence and structural information of RNA hairpins are utilized by the miRNA processing apparatus.
Conclusion
Integrated sequence-structure motifs of variable length apparently capture nearly all information required to distinguish miRNA precursors from other stem-loop structures.
2.
Andrei Kochegarov Ashley Moses William Lian Jessica Meyer Michael C Hanna Larry F Lemanski 《Journal of biomedical science》2013,20(1):20
Background
A recessive mutation “c” in the Mexican axolotl, Ambystoma mexicanum, results in the failure of normal heart development. In homozygous recessive embryos, the hearts do not have organized myofibrils and fail to beat. In our previous studies, we identified a noncoding Myofibril-Inducing RNA (MIR) from axolotls which promotes myofibril formation and rescues heart development.
Results
We randomly cloned RNAs from fetal human heart. RNA from clone #291 promoted myofibril formation and induced heart development of mutant axolotls in organ culture. This RNA induced expression of cardiac markers in mutant hearts: tropomyosin, troponin and α-syntrophin. This cloned RNA matches in partial sequence alignment to human microRNA-499a and b, although it differs in length. We have concluded that this cloned RNA is unique in its length, but is still related to the microRNA-499 family. We have named this unique RNA, microRNA-499c. Thus, we will refer to this RNA derived from clone #291 as microRNA-499c throughout the rest of the paper.
Conclusions
This new form, microRNA-499c, plays an important role in cardiac development.
3.
4.
Background
Structured RNAs have many biological functions ranging from catalysis of chemical reactions to gene regulation. Yet, many homologous structured RNAs display most of their conservation at the secondary or tertiary structure level. As a result, strategies for structured RNA discovery rely heavily on identification of sequences sharing a common stable secondary structure. However, correctly distinguishing structured RNAs from surrounding genomic sequence remains challenging, especially during de novo discovery. RNA also has a long history as a computational model for evolution due to the direct link between genotype (sequence) and phenotype (structure). From these studies it is clear that evolved RNA structures, like protein structures, can be considered robust to point mutations. In this context, an RNA sequence is considered robust if its neutrality (extent to which single mutant neighbors maintain the same secondary structure) is greater than that expected for an artificial sequence with the same minimum free energy structure.
Results
In this work, we bring concepts from evolutionary biology to bear on the structured RNA de novo discovery process. We hypothesize that alignments corresponding to structured RNAs should consist of neutral sequences. We evaluate several measures of neutrality for their ability to distinguish between alignments of structured RNA sequences drawn from Rfam and various decoy alignments. We also introduce a new measure of RNA structural neutrality, the structure ensemble neutrality (SEN). SEN seeks to increase the biological relevance of existing neutrality measures in two ways. First, it uses information from an alignment of homologous sequences to identify a conserved biologically relevant structure for comparison. Second, it only counts base-pairs of the original structure that are absent in the comparison structure and does not penalize the formation of additional base-pairs.
Conclusion
We find that several measures of neutrality are effective at separating structured RNAs from decoy sequences, including both shuffled alignments and flanking genomic sequence. Furthermore, as an independent feature classifier to identify structured RNAs, SEN yields comparable performance to current approaches that consider a variety of features including stability and sequence identity. Finally, SEN outperforms other measures of neutrality at detecting mutational robustness in bacterial regulatory RNA structures.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-014-1203-8) contains supplementary material, which is available to authorized users.
5.
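For the structure ensemble neutrality (SEN) abstract above, the second design point (count only the original base-pairs that disappear, and do not penalize newly formed ones) can be illustrated with a small sketch. This is not the authors' SEN implementation: the dot-bracket input format and the function names `pairs` and `lost_pair_fraction` are assumptions made for illustration, and no structure prediction or alignment step is included.

```python
def pairs(dot_bracket: str) -> set:
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, found = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            found.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure")
    return found


def lost_pair_fraction(reference: str, comparison: str) -> float:
    """Fraction of reference base pairs that are absent from the comparison.

    Pairs present only in the comparison structure are ignored, so extra
    base-pairing in a mutant is not penalized (hypothetical helper).
    """
    ref = pairs(reference)
    if not ref:
        return 0.0
    return len(ref - pairs(comparison)) / len(ref)


if __name__ == "__main__":
    # One of four reference pairs is lost in the comparison structure.
    print(lost_pair_fraction("((((...))))", ".(((...)))."))
    # The comparison keeps every reference pair and adds two more.
    print(lost_pair_fraction("..((...))..", "((((...))))"))
```

A symmetric base-pair distance would score the second case as a change; the asymmetric count treats it as fully preserved, which is the behavior the abstract describes.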
6.
7.
8.
Background
The power of genome-wide association studies begins to decline when the minor allele frequency (MAF) is below 0.05. Here, we proposed the use of Cohen’s h in detecting disease-associated rare variants. The variance-stabilizing effect of the arcsine square root transformation of MAFs used to generate Cohen’s h contributed to the statistical power of rare variant analysis. We re-analyzed published datasets, one microarray-based and one sequencing-based, and used simulation to compare the performance of Cohen’s h with the risk difference (RD) and odds ratio (OR).
Results
The analysis showed that the type 1 error rate of Cohen’s h was as expected and Cohen’s h and RD were both less biased and had higher power than OR. The advantage of Cohen’s h was more obvious when MAF was less than 0.01.
Conclusions
Cohen’s h can increase the power to find genetic association of rare variants and diseases, especially when MAF is less than 0.01.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-875) contains supplementary material, which is available to authorized users.
9.
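For the Cohen’s h abstract above, the three effect sizes it compares can be written down directly from their standard definitions. The sketch below uses the textbook formula h = 2·arcsin(√p1) − 2·arcsin(√p2); the allele frequencies in the example are made up for illustration and are not taken from the paper, and the function names are assumptions.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference of arcsine-square-root transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


def risk_difference(p1: float, p2: float) -> float:
    """Risk difference (RD): plain difference of the two proportions."""
    return p1 - p2


def odds_ratio(p1: float, p2: float) -> float:
    """Odds ratio (OR) of two proportions; undefined if either is 0 or 1."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


if __name__ == "__main__":
    # Hypothetical case/control allele frequencies with the same RD (0.005),
    # once for a rare variant and once for a common one.
    for case_maf, control_maf in [(0.010, 0.005), (0.310, 0.305)]:
        print(
            case_maf,
            control_maf,
            risk_difference(case_maf, control_maf),
            cohens_h(case_maf, control_maf),
            odds_ratio(case_maf, control_maf),
        )
```

With equal risk differences, h is larger for the rare-variant pair than for the common-variant pair, which illustrates how the transformation stretches differences near zero frequency.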
10.
Background
Recent advances in deep digital sequencing have unveiled an unprecedented degree of clonal heterogeneity within a single tumor DNA sample. Resolving such heterogeneity depends on accurate estimation of fractions of alleles that harbor somatic mutations. Unlike substitutions or small indels, structural variants such as deletions, duplications, inversions and translocations involve segments of DNA and are potentially more accurate for allele fraction estimations. However, no systematic method exists that can support such analysis.
Results
In this paper, we present a novel maximum-likelihood method that estimates allele fractions of structural variants integratively from various forms of alignment signals. We develop a tool, BreakDown, to estimate the allele fractions of most structural variants including medium size (from 1 kilobase to 1 megabase) deletions and duplications, and balanced inversions and translocations.
Conclusions
Evaluation based on both simulated and real data indicates that our method systematically enables structural variants for clonal heterogeneity analysis and can greatly enhance the characterization of genomically unstable tumors.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-299) contains supplementary material, which is available to authorized users.
11.
12.
13.
Raheleh Salari Cagri Aksay Emre Karakoc Peter J. Unrau Iman Hajirasouliha S. Cenk Sahinalp 《PloS one》2009,4(5)
Background
Non-coding RNAs (ncRNAs) have important functional roles in the cell: for example, they regulate gene expression by means of establishing stable joint structures with target mRNAs via complementary sequence motifs. Sequence motifs are also important determinants of the structure of ncRNAs. Although ncRNAs are abundant, discovering novel ncRNAs on genome sequences has proven to be a hard task; in particular, past attempts at ab initio ncRNA search mostly failed, with the exception of tools that can identify microRNAs.
Methodology/Principal Findings
We present a very general ab initio ncRNA gene finder that exploits differential distributions of sequence motifs between ncRNAs and background genome sequences.
Conclusions/Significance
Our method, once trained on a set of ncRNAs from a given species, can be applied to the genome sequences of other organisms to find not only ncRNAs homologous to those in the training set but also others that potentially belong to novel (and perhaps unknown) ncRNA families. Availability: http://compbio.cs.sfu.ca/taverna/smyrna
14.
Anna Maisa Ute Ströher Hans-Dieter Klenk Wolfgang Garten Thomas Strecker 《PLoS neglected tropical diseases》2009,3(6)
Background
Proteolytic processing of the Lassa virus envelope glycoprotein precursor GP-C by the host proprotein convertase site 1 protease (S1P) is a prerequisite for the incorporation of the subunits GP-1 and GP-2 into viral particles and, hence, essential for infectivity and virus spread. Therefore, we tested in this study the concept of using S1P as a target to block efficient virus replication.
Methodology/Principal Finding
We demonstrate that stable cell lines inducibly expressing S1P-adapted α1-antitrypsin variants inhibit the proteolytic maturation of GP-C. Introduction of the S1P recognition motifs RRIL and RRLL into the reactive center loop of α1-antitrypsin resulted in abrogation of GP-C processing by endogenous S1P to a similar level observed in S1P-deficient cells. Moreover, S1P-specific α1-antitrypsins significantly inhibited replication and spread of a replication-competent recombinant vesicular stomatitis virus expressing the Lassa virus glycoprotein GP as well as authentic Lassa virus. Inhibition of viral replication correlated with the ability of the different α1-antitrypsin variants to inhibit the processing of the Lassa virus glycoprotein precursor.
Conclusions/Significance
Our data suggest that glycoprotein cleavage by S1P is a promising target for the development of novel anti-arenaviral strategies.
15.
Michael J. Monument Kirsten M. Johnson Elizabeth McIlvaine Lisa Abegglen W. Scott Watkins Lynn B. Jorde Richard B. Womer Natalie Beeler Laura Monovich Elizabeth R. Lawlor Julia A. Bridge Joshua D. Schiffman Mark D. Krailo R. Lor Randall Stephen L. Lessnick 《PloS one》2014,9(8)
Background
The genetics involved in Ewing sarcoma susceptibility and prognosis are poorly understood. EWS/FLI and related EWS/ETS chimeras upregulate numerous gene targets via promoter-based GGAA-microsatellite response elements. These microsatellites are highly polymorphic in humans, and preliminary evidence suggests EWS/FLI-mediated gene expression is highly dependent on the number of GGAA motifs within the microsatellite.
Objectives
Here we sought to examine the polymorphic spectrum of a GGAA-microsatellite within the NR0B1 promoter (a critical EWS/FLI target) in primary Ewing sarcoma tumors, and characterize how this polymorphism influences gene expression and clinical outcomes.
Results
A complex, bimodal pattern of EWS/FLI-mediated gene expression was observed across a wide range of GGAA motifs, with maximal expression observed in constructs containing 20–26 GGAA motifs. Relative to white European and African controls, the NR0B1 GGAA-microsatellite in tumor cells demonstrated a strong bias for haplotypes containing 21–25 GGAA motifs suggesting a relationship between microsatellite function and disease susceptibility. This selection bias was not a product of microsatellite instability in tumor samples, nor was there a correlation between NR0B1 GGAA-microsatellite polymorphisms and survival outcomes.
Conclusions
These data suggest that GGAA-microsatellite polymorphisms observed in human populations modulate EWS/FLI-mediated gene expression and may influence disease susceptibility in Ewing sarcoma.
16.
Background
Recent evidence suggests that the number and variety of functional RNAs (ncRNAs as well as cis-acting RNA elements within mRNAs) is much higher than previously thought; thus, the ability to computationally predict and analyze RNAs has taken on new importance. We have computationally studied the secondary structures in an alignment of six Aspergillus genomes. Little is known about the RNAs present in this set of fungi, and this diverse set of genomes has an optimal level of sequence conservation for observing the correlated evolution of base-pairs seen in RNAs.
Methodology/Principal Findings
We report the results of a whole-genome search for evolutionarily conserved secondary structures, as well as the results of clustering these predicted secondary structures by structural similarity. We find a total of 7450 predicted secondary structures, including a new predicted ∼60 bp long hairpin motif found primarily inside introns. We find no evidence for microRNAs. Different types of genomic regions are over-represented in different classes of predicted secondary structures. Exons contain the longest motifs (primarily long, branched hairpins), 5′ UTRs primarily contain groupings of short hairpins located near the start codon, and 3′ UTRs contain very little secondary structure compared to other regions. There is a large concentration of short hairpins just inside the boundaries of exons. The density of predicted intronic RNAs increases with the length of introns, and the density of predicted secondary structures within mRNA coding regions increases with the number of introns in a gene.
Conclusions/Significance
There are many conserved, high-confidence RNAs of unknown function in these Aspergillus genomes, as well as interesting spatial distributions of predicted secondary structures. This study increases our knowledge of secondary structure in these Aspergillus organisms.
17.
18.
Background and Aims
The green algal class Chlorophyceae comprises five orders (Chlamydomonadales, Sphaeropleales, Chaetophorales, Chaetopeltidales and Oedogoniales). Attempts to resolve the relationships among these groups have met with limited success. Studies of single genes (18S rRNA, 26S rRNA, rbcL or atpB) have largely failed to unambiguously resolve the relative positions of Oedogoniales, Chaetophorales and Chaetopeltidales (the OCC taxa). In contrast, recent genomics analyses of plastid data from OCC exemplars provided a robust phylogenetic analysis that supports a monophyletic OCC alliance.
Methods
An ITS2 data set was assembled to independently test the OCC hypothesis and to evaluate the performance of these data in assessing green algal phylogeny at the ordinal or class level. Sequence-structure analysis designed for use with ITS2 data was employed for phylogenetic reconstruction.
Key Results
Results of this study yielded trees that were, in general, topologically congruent with the results from the genomic analyses, including support for the monophyly of the OCC alliance.
Conclusions
Not all nodes from the ITS2 analyses exhibited robust support, but our investigation demonstrates that sequence-structure analyses of ITS2 provide a taxon-rich means of testing phylogenetic hypotheses at high taxonomic levels. Thus, the ITS2 data, in the context of sequence-structure analysis, provide an economical supplement or alternative to the single-marker approaches used in green algal phylogeny.
19.
Miaofei Xu Yufeng Qin Jianhua Qu Chuncheng Lu Ying Wang Wei Wu Ling Song Shoulin Wang Feng Chen Hongbing Shen Jiahao Sha Zhibin Hu Yankai Xia Xinru Wang 《PloS one》2013,8(11)
Background
Oligozoospermia is one of the severe forms of idiopathic male infertility. However, its pathology is largely unknown, and few genetic factors have been defined. Our previous genome-wide association study (GWAS) has identified four risk loci for non-obstructive azoospermia (NOA).
Objective
To investigate the potentially functional genetic variants (including not only common variants, but also less-common and rare variants) of these loci on spermatogenic impairment, especially oligozoospermia.
Design, Setting, and Participants
A total of 784 individuals with oligozoospermia and 592 healthy controls were recruited to this study between March 2004 and January 2011.
Measurements
We conducted a two-stage study to explore the association between oligozoospermia and new markers near NOA risk loci. In the first stage, we used next generation sequencing (NGS) in 96 oligozoospermia cases and 96 healthy controls to screen oligozoospermia-susceptible genetic variants. Next, we validated these variants in a large cohort containing 688 cases and 496 controls by SNPscan for high-throughput Single Nucleotide Polymorphism (SNP) genotyping.
Results and Limitations
In total, we observed seven oligozoospermia-associated variants (rs3791185 and rs2232015 in PRMT6, rs146039840 and rs11046992 in Sox5, rs1129332 in PEX10, rs3197744 in SIRPA, rs1048055 in SIRPG) in the first stage. In the validation stage, rs3197744 in SIRPA and rs11046992 in Sox5 were associated with increased risk of oligozoospermia with an odds ratio (OR) of 4.62 (P = 0.005, 95%CI 1.58-13.4) and 1.82 (P = 0.005, 95%CI 1.01-1.64), respectively. Further investigation in larger populations and functional characterizations are needed to validate our findings.
Conclusions
Our study provides evidence of independent oligozoospermia risk alleles driven by variants in the potentially functional regions of genes discovered by GWAS. Our findings suggest that integrating sequence data with large-scale genotyping will serve as an effective strategy for discovering risk alleles in the future.
20.
Alejandro Floriano Icíar Santa-Olalla Alberto Sanchez-Reyes 《Reports of Practical Oncology and Radiotherapy》2013,18(3):173-178