首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
2.

Background

Post-crystallization dehydration methods, applying either vapor diffusion or humidity control devices, have been widely used to improve the diffraction quality of protein crystals. Despite the fact that RNA crystals tend to diffract poorly, there is a dearth of reports on the application of dehydration methods to improve the diffraction quality of RNA crystals.

Results

We use dehydration techniques with a Free Mounting System (FMS, a humidity control device) to recover the poor diffraction quality of RNA crystals. These approaches were applied to RNA constructs that model various RNA-mediated repeat expansion disorders.

Conclusion

The method we describe herein could serve as a general tool to improve diffraction quality of RNA crystals to facilitate structure determinations.
  相似文献   

3.
4.
5.
6.

Background

RNA secondary structure prediction methods based on probabilistic modeling can be developed using stochastic context-free grammars (SCFGs). Such methods can readily combine different sources of information that can be expressed probabilistically, such as an evolutionary model of comparative RNA sequence analysis and a biophysical model of structure plausibility. However, the number of free parameters in an integrated model for consensus RNA structure prediction can become untenable if the underlying SCFG design is too complex. Thus a key question is, what small, simple SCFG designs perform best for RNA secondary structure prediction?

Results

Nine different small SCFGs were implemented to explore the tradeoffs between model complexity and prediction accuracy. Each model was tested for single sequence structure prediction accuracy on a benchmark set of RNA secondary structures.

Conclusions

Four SCFG designs had prediction accuracies near the performance of current energy minimization programs. One of these designs, introduced by Knudsen and Hein in their PFOLD algorithm, has only 21 free parameters and is significantly simpler than the others.
  相似文献   

7.
8.
9.
10.

Background

Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advancements of single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. Subsequently, a challenging computational problem is to cluster high dimensional noisy datasets with substantially fewer cells than the number of genes.

Methods

In this paper, we introduced a consensus clustering framework conCluster, for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions to consensus clusters.

Results

Applied to real cancer scRNA-seq datasets, conCluster can more accurately detect cancer subtypes than the widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.

Conclusions

Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.
  相似文献   

11.
12.
13.
14.
15.

Background

The rate at which fitness declines as an organism's genome accumulates random mutations is an important variable in several evolutionary theories. At an intuitive level, it might seem natural that random mutations should tend to interact synergistically, such that the rate of mean fitness decline accelerates as the number of random mutations is increased. However, in a number of recent studies, a prevalence of antagonistic epistasis (the tendency of multiple mutations to have a mitigating rather than reinforcing effect) has been observed.

Results

We studied in silico the net amount and form of epistatic interactions in RNA secondary structure folding by measuring the fraction of neutral mutants as a function of mutational distance d. We found a clear prevalence of antagonistic epistasis in RNA secondary structure folding. By relating the fraction of neutral mutants at distance d to the average neutrality at distance d, we showed that this prevalence derives from the existence of many compensatory mutations at larger mutational distances.

Conclusions

Our findings imply that the average direction of epistasis in simple fitness landscapes is directly related to the density with which fitness peaks are distributed in these landscapes.
  相似文献   

16.

Background

Single-cell RNA sequencing (scRNA-seq) technology provides an effective way to study cell heterogeneity. However, due to the low capture efficiency and stochastic gene expression, scRNA-seq data often contains a high percentage of missing values. It has been showed that the missing rate can reach approximately 30% even after noise reduction. To accurately recover missing values in scRNA-seq data, we need to know where the missing data is; how much data is missing; and what are the values of these data.

Methods

To solve these three problems, we propose a novel model with a hybrid machine learning method, namely, missing imputation for single-cell RNA-seq (MISC). To solve the first problem, we transformed it to a binary classification problem on the RNA-seq expression matrix. Then, for the second problem, we searched for the intersection of the classification results, zero-inflated model and false negative model results. Finally, we used the regression model to recover the data in the missing elements.

Results

We compared the raw data without imputation, the mean-smooth neighbor cell trajectory, MISC on chronic myeloid leukemia data (CML), the primary somatosensory cortex and the hippocampal CA1 region of mouse brain cells. On the CML data, MISC discovered a trajectory branch from the CP-CML to the BC-CML, which provides direct evidence of evolution from CP to BC stem cells. On the mouse brain data, MISC clearly divides the pyramidal CA1 into different branches, and it is direct evidence of pyramidal CA1 in the subpopulations. In the meantime, with MISC, the oligodendrocyte cells became an independent group with an apparent boundary.

Conclusions

Our results showed that the MISC model improved the cell type classification and could be instrumental to study cellular heterogeneity. Overall, MISC is a robust missing data imputation model for single-cell RNA-seq data.
  相似文献   

17.

Objectives

To explore the effects of Lin28A on progression of osteocarcinoma (OS) cells.

Results

Lin28A mRNA and protein expressions were significantly increased in OS tissues compared with that in normal adjacent tissues. Expressions of Lin28A and long noncoding RNA MALAT1 were positively correlated. Patients with higher Lin28A expression had shorter overall survival. Moreover, Lin28A knockdown inhibited OS cells proliferation, migration, invasion and promoted cell apoptosis; Lin28A was found to harbor binding sites on MALAT1 sequences and associated with MALAT1, and increased MALAT1 stability and expression. Notably, the inhibition of Lin28A knockdown was attenuated or even reversed by MALAT1 overexpression.

Conclusions

RNA binding protein Lin28A could facilitate OS cells progression by associating with the long noncoding RNA MALAT1.
  相似文献   

18.
19.
20.

Background

Gene signatures are important to represent the molecular changes in the disease genomes or the cells in specific conditions, and have been often used to separate samples into different groups for better research or clinical treatment. While many methods and applications have been available in literature, there still lack powerful ones that can take account of the complex data and detect the most informative signatures.

Methods

In this article, we present a new framework for identifying gene signatures using Pareto-optimal cluster size identification for RNA-seq data. We first performed pre-filtering steps and normalization, then utilized the empirical Bayes test in Limma package to identify the differentially expressed genes (DEGs). Next, we used a multi-objective optimization technique, “Multi-objective optimization for collecting cluster alternatives” (MOCCA in R package) on these DEGs to find Pareto-optimal cluster size, and then applied k-means clustering to the RNA-seq data based on the optimal cluster size. The best cluster was obtained through computing the average Spearman’s Correlation Score among all the genes in pair-wise manner belonging to the module. The best cluster is treated as the signature for the respective disease or cellular condition.

Results

We applied our framework to a cervical cancer RNA-seq dataset, which included 253 squamous cell carcinoma (SCC) samples and 22 adenocarcinoma (ADENO) samples. We identified a total of 582 DEGs by Limma analysis of SCC versus ADENO samples. Among them, 260 are up-regulated genes and 322 are down-regulated genes. Using MOCCA, we obtained seven Pareto-optimal clusters. The best cluster has a total of 35 DEGs consisting of all-upregulated genes. For validation, we ran PAMR (prediction analysis for microarrays) classifier on the selected best cluster, and assessed the classification performance. Our evaluation, measured by sensitivity, specificity, precision, and accuracy, showed high confidence.

Conclusions

Our framework identified a multi-objective based cluster that is treated as a signature that can classify the disease and control group of samples with higher classification performance (accuracy 0.935) for the corresponding disease. Our method is useful to find signature for any RNA-seq or microarray data.
  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号