20 similar articles found (search time: 15 ms)
1.
2.
3.
Computational analysis of microarray data (total citations: 1; self-citations: 0; citations by others: 1)
Quackenbush J 《Nature reviews. Genetics》2001,2(6):418-427
Microarray experiments are providing unprecedented quantities of genome-wide data on gene-expression patterns. Although this technique has been enthusiastically developed and applied in many biological contexts, the management and analysis of the millions of data points that result from these experiments has received less attention. Sophisticated computational tools are available, but the methods that are used to analyse the data can have a profound influence on the interpretation of the results. A basic understanding of these computational tools is therefore required for optimal experimental design and meaningful data analysis.
4.
5.
Computational analysis of shotgun proteomics data (total citations: 2; self-citations: 0; citations by others: 2)
MacCoss MJ 《Current opinion in chemical biology》2005,9(1):88-94
Proteomics technology is progressing at an incredible rate. The latest generation of tandem mass spectrometers can now acquire tens of thousands of fragmentation spectra in a matter of hours. Furthermore, quantitative proteomics methods have been developed that incorporate a stable isotope-labeled internal standard for every peptide within a complex protein mixture for the measurement of relative protein abundances. These developments have opened the doors for 'shotgun' proteomics, yet have also placed a burden on the computational approaches that manage the data. With each new method that is developed, the quantity of data that can be derived from a single experiment increases. To deal with this increase, new computational approaches are being developed to manage the data and assess false positives. This review discusses current approaches for analyzing proteomics data by mass spectrometry and identifies present computational limitations and bottlenecks.
6.
7.
Computational analysis of small RNA cloning data (total citations: 1; self-citations: 0; citations by others: 1)
Cloning and sequencing is the method of choice for small regulatory RNA identification. Using deep sequencing technologies one can now obtain up to a billion nucleotides--and tens of millions of small RNAs--from a single library. Careful computational analyses of such libraries enabled the discovery of miRNAs, rasiRNAs, piRNAs, and 21U RNAs. Given the large number of sequences that can be obtained from each individual sample, deep sequencing may soon become an alternative to oligonucleotide microarray technology for mRNA expression profiling. In this report we present the methods that we developed for the annotation and expression profiling of small RNAs obtained through large-scale sequencing. These include a fast algorithm for finding nearly perfect matches of small RNAs in sequence databases, a web-accessible software system for the annotation of small RNA libraries, and a Bayesian method for comparing small RNA expression across samples.
8.
Zhixiang Zhang Shuishui Qi Nan Tang Xinxin Zhang Shanshan Chen Pengfei Zhu Lin Ma Jinping Cheng Yun Xu Meiguang Lu Hongqing Wang Shou-Wei Ding Shifang Li Qingfa Wu 《PLoS pathogens》2014,10(12)
Replicating circular RNAs are independent plant pathogens known as viroids, or act to modulate the pathogenesis of plant and animal viruses as their satellite RNAs. The rate of discovery of these subviral pathogens was low over the past 40 years because the classical approaches are technically demanding and time-consuming. We previously described an approach for homology-independent discovery of replicating circular RNAs by analysing the total small RNA populations from samples of diseased tissues with a computational program known as progressive filtering of overlapping small RNAs (PFOR). However, PFOR, written in Perl, is extremely slow and is unable to discover those subviral pathogens that do not trigger in vivo accumulation of extensively overlapping small RNAs. Moreover, PFOR is yet to identify a new viroid capable of initiating independent infection. Here we report the development of PFOR2, which adopted parallel programming in the C++ language and was 3 to 8 times faster than PFOR. A new computational program was further developed and incorporated into PFOR2 to allow the identification of circular RNAs by deep sequencing of long RNAs instead of small RNAs. PFOR2 analysis of the small RNA libraries from grapevine and apple plants led to the discovery of Grapevine latent viroid (GLVd) and Apple hammerhead viroid-like RNA (AHVd-like RNA), respectively. GLVd was proposed as a new species in the genus Apscaviroid, because it contained the typical structural elements found in this group of viroids and initiated independent infection in grapevine seedlings. AHVd-like RNA encoded a biologically active hammerhead ribozyme in both polarities, and was not specifically associated with any of the viruses found in apple plants. We propose that these computational algorithms have the potential to discover novel circular RNAs in plants, invertebrates and vertebrates regardless of whether they replicate and/or induce the in vivo accumulation of small RNAs.
9.
Bacterial leaf pustule (BLP) disease is caused by Xanthomonas axonopodis pv. glycines (Xag). To investigate the plant basal defence mechanisms induced in response to Xag, differential gene expression in near-isogenic lines (NILs) of BLP-susceptible and BLP-resistant soybean was analysed by RNA-Seq. Of a total of 46 367 genes that were mapped to soybean genome reference sequences, 1978 and 783 genes were found to be up- and down-regulated, respectively, in the BLP-resistant NIL relative to the BLP-susceptible NIL at 0, 6, and 12 h after inoculation (hai). Clustering analysis revealed that these genes could be grouped into 10 clusters with different expression patterns. Functional annotation based on gene ontology (GO) categories was carried out. Among the putative soybean defence response genes identified (GO:0006952), 134 exhibited significant differences in expression between the BLP-resistant and -susceptible NILs. In particular, pathogen-associated molecular pattern (PAMP) and damage-associated molecular pattern (DAMP) receptors and the genes induced by these receptors were highly expressed at 0 hai in the BLP-resistant NIL. Additionally, pathogenesis-related (PR)-1 and -14 were highly expressed at 0 hai, and PR-3, -6, and -12 were highly expressed at 12 hai. There were also significant differences in the expression of the core JA-signalling components MYC2 and JASMONATE ZIM-motif. These results indicate that powerful basal defence mechanisms involved in the recognition of PAMPs or DAMPs and a high level of accumulation of defence-related gene products may contribute to BLP resistance in soybean.
10.
Computational cluster validation in post-genomic data analysis (total citations: 9; self-citations: 0; citations by others: 9)
MOTIVATION: The discovery of novel biological knowledge from the ab initio analysis of post-genomic data relies upon the use of unsupervised processing methods, in particular clustering techniques. Much recent research in bioinformatics has therefore been focused on the transfer of clustering methods introduced in other scientific fields and on the development of novel algorithms specifically designed to tackle the challenges posed by post-genomic data. The partitions returned by a clustering algorithm are commonly validated using visual inspection and concordance with prior biological knowledge--whether the clusters actually correspond to the real structure in the data is somewhat less frequently considered. Suitable computational cluster validation techniques are available in the general data-mining literature, but have been given only a fraction of the same attention in bioinformatics. RESULTS: This review paper aims to familiarize the reader with the battery of techniques available for the validation of clustering results, with a particular focus on their application to post-genomic data analysis. Synthetic and real biological datasets are used to demonstrate the benefits, and also some of the perils, of analytical cluster validation. AVAILABILITY: The software used in the experiments is available at http://dbkweb.ch.umist.ac.uk/handl/clustervalidation/. SUPPLEMENTARY INFORMATION: Enlarged colour plots are provided in the Supplementary Material, which is available at http://dbkweb.ch.umist.ac.uk/handl/clustervalidation/.
11.
RNA-Seq analysis in MeV (total citations: 1; self-citations: 0; citations by others: 1)
12.
With the increased number of single-cell RNA sequencing (scRNA-seq) datasets in public repositories, integrative analysis of multiple scRNA-seq datasets has become commonplace. Batch effects among different datasets are inevitable because of differences in cell isolation and handling protocols, library preparation technology, and sequencing platforms. To remove these batch effects for effective integration of multiple scRNA-seq datasets, a number of methodologies have been developed based on diverse concepts and approaches. These methods have proven useful for examining whether cellular features, such as cell subpopulations and marker genes, identified from a certain dataset, are consistently present, or whether their condition-dependent variations, such as increases in cell subpopulations in particular disease-related conditions, are consistently observed in different datasets generated under similar or distinct conditions. In this review, we summarize the concepts and approaches of the integration methods and their pros and cons as reported in previous literature.
13.
14.
15.
16.
High-throughput proteomics experiments involving tandem mass spectrometry produce large volumes of complex data that require sophisticated computational analyses. As such, the field offers many challenges for computational biologists. In this article, we briefly introduce some of the core computational and statistical problems in the field and then describe a variety of outstanding problems that readers of PLoS Computational Biology might be able to help solve.
17.
Tissue-specific alternative splicing is a key mechanism for generating tissue-specific proteomic diversity in eukaryotes. Splicing regulatory elements (SREs) in pre-mature messenger RNA play a very important role in regulating alternative splicing. In this article, we use mouse RNA-Seq data to determine a positive data set where SREs are over-represented and a reliable negative data set where the same SREs are most likely under-represented for a specific tissue and then employ a powerful discriminative approach to identify SREs. We identified 456 putative splicing enhancers or silencers, of which 221 were predicted to be tissue-specific. Most of our tissue-specific SREs are likely different from constitutive SREs, since only 18% of our exonic splicing enhancers (ESEs) are contained in constitutive RESCUE-ESEs. A relatively small portion (20%) of our SREs is included in tissue-specific SREs in human identified in two recent studies. In the analysis of position distribution of SREs, we found that a dozen SREs were biased to a specific region. We also identified two very interesting SREs that can function as an enhancer in one tissue but a silencer in another tissue from the same intronic region. These findings provide insight into the mechanism of tissue-specific alternative splicing and give a set of valuable putative SREs for further experimental investigations.
18.
RNA-Seq and microarray platforms have emerged as important tools for detecting changes in gene expression and RNA processing in biological samples. We present ExpressionPlot, a software package consisting of a default back end, which prepares raw sequencing or Affymetrix microarray data, and a web-based front end, which offers a biologically centered interface to browse, visualize, and compare different data sets. Download and installation instructions, a user's manual, discussion group, and a prototype are available at .
19.
20.
J Bermúdez D López A Sanahuja M Viñas J G Lorén 《Canadian journal of microbiology》1988,34(9):1058-1062
In this work a bacterial classification method based on the discriminant analysis of the microcalorimetric data provided by the growth power-time (p-t) curves is developed. This method is applied to classify several species of Enterobacteria of different origins, and the results are compared with those obtained by conventional techniques. The proposed analysis allows us to classify bacteria into species and discriminate among strains of the same species. The classification is carried out using one run of each isolate after standardization of inocula and growth conditions. The discrimination power of available microcalorimetric data is also discussed, and the most discriminant set of data is proposed as the input variables of the analysis. Finally, the advantages of microcalorimetry as a taxonomical technique are discussed.