共查询到20条相似文献,搜索用时 31 毫秒
1.
Background
An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time consuming task. 相似文献2.
Background
Concomitant with the rise in the popularity of DNA microarrays has been a surge of proposed methods for the analysis of microarray data. Fully controlled "spike-in" datasets are an invaluable but rare tool for assessing the performance of various methods. 相似文献3.
Background
Maximum parsimony phylogenetic tree reconstruction from genetic variation data is a fundamental problem in computational genetics with many practical applications in population genetics, whole genome analysis, and the search for genetic predictors of disease. Efficient methods are available for reconstruction of maximum parsimony trees from haplotype data, but such data are difficult to determine directly for autosomal DNA. Data more commonly is available in the form of genotypes, which consist of conflated combinations of pairs of haplotypes from homologous chromosomes. Currently, there are no general algorithms for the direct reconstruction of maximum parsimony phylogenies from genotype data. Hence phylogenetic applications for autosomal data must therefore rely on other methods for first computationally inferring haplotypes from genotypes. 相似文献4.
Peter A DiMaggioJr Scott R McAllister Christodoulos A Floudas Xiao-Jiang Feng Joshua D Rabinowitz Herschel A Rabitz 《BMC bioinformatics》2008,9(1):458
Background
The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Biclustering in particular has emerged as an important problem in the analysis of gene expression data since genes may only jointly respond over a subset of conditions. Biclustering algorithms also have important applications in sample classification where, for instance, tissue samples can be classified as cancerous or normal. Many of the methods for biclustering, and clustering algorithms in general, utilize simplified models or heuristic strategies for identifying the "best" grouping of elements according to some metric and cluster definition and thus result in suboptimal clusters. 相似文献5.
Hunter AA Macgregor AB Szabo TO Wellington CA Bellgard MI 《Source code for biology and medicine》2012,7(1):1
Background
There is a significant demand for creating pipelines or workflows in the life science discipline that chain a number of discrete compute and data intensive analysis tasks into sophisticated analysis procedures. This need has led to the development of general as well as domain-specific workflow environments that are either complex desktop applications or Internet-based applications. Complexities can arise when configuring these applications in heterogeneous compute and storage environments if the execution and data access models are not designed appropriately. These complexities manifest themselves through limited access to available HPC resources, significant overhead required to configure tools and inability for users to simply manage files across heterogenous HPC storage infrastructure. 相似文献6.
Background
In high-dimensional data analysis such as differential gene expression analysis, people often use filtering methods like fold-change or variance filters in an attempt to reduce the multiple testing penalty and improve power. However, filtering may introduce a bias on the multiple testing correction. The precise amount of bias depends on many quantities, such as fraction of probes filtered out, filter statistic and test statistic used. 相似文献7.
Background
Principal component analysis (PCA) has gained popularity as a method for the analysis of high-dimensional genomic data. However, it is often difficult to interpret the results because the principal components are linear combinations of all variables, and the coefficients (loadings) are typically nonzero. These nonzero values also reflect poor estimation of the true vector loadings; for example, for gene expression data, biologically we expect only a portion of the genes to be expressed in any tissue, and an even smaller fraction to be involved in a particular process. Sparse PCA methods have recently been introduced for reducing the number of nonzero coefficients, but these existing methods are not satisfactory for high-dimensional data applications because they still give too many nonzero coefficients. 相似文献8.
Background
Increasing amounts of data from large scale whole genome analysis efforts demands convenient tools for manipulation, visualization and investigation. Whole genome plots offer an intuitive window to the analysis. We describe two applications that enable users to easily plot and explore whole genome data from their own or other researchers' experiments. 相似文献9.
Tiago Antao Ana Lopes Ricardo J Lopes Albano Beja-Pereira Gordon Luikart 《BMC bioinformatics》2008,9(1):323
Background
Testing for selection is becoming one of the most important steps in the analysis of multilocus population genetics data sets. Existing applications are difficult to use, leaving many non-trivial, error-prone tasks to the user. 相似文献10.
Luca Corradi Marco Fato Ivan Porro Silvia Scaglione Livia Torterolo 《BMC bioinformatics》2008,9(1):480
Background
Microarray techniques are one of the main methods used to investigate thousands of gene expression profiles for enlightening complex biological processes responsible for serious diseases, with a great scientific impact and a wide application area. Several standalone applications had been developed in order to analyze microarray data. Two of the most known free analysis software packages are the R-based Bioconductor and dChip. The part of dChip software concerning the calculation and the analysis of gene expression has been modified to permit its execution on both cluster environments (supercomputers) and Grid infrastructures (distributed computing). 相似文献11.
Background
Cluster analysis is an important technique for the exploratory analysis of biological data. Such data is often high-dimensional, inherently noisy and contains outliers. This makes clustering challenging. Mixtures are versatile and powerful statistical models which perform robustly for clustering in the presence of noise and have been successfully applied in a wide range of applications. 相似文献12.
Stephan Pabinger Gerhard G Thallinger René Snajder Heiko Eichhorn Robert Rader Zlatko Trajanoski 《BMC bioinformatics》2009,10(1):268
Background
Since its introduction quantitative real-time polymerase chain reaction (qPCR) has become the standard method for quantification of gene expression. Its high sensitivity, large dynamic range, and accuracy led to the development of numerous applications with an increasing number of samples to be analyzed. Data analysis consists of a number of steps, which have to be carried out in several different applications. Currently, no single tool is available which incorporates storage, management, and multiple methods covering the complete analysis pipeline. 相似文献13.
Patrik Rydén Henrik Andersson Mattias Landfors Linda Näslund Blanka Hartmanová Laila Noppa Anders Sjöstedt 《BMC bioinformatics》2006,7(1):300-17
Background
Recently, a large number of methods for the analysis of microarray data have been proposed but there are few comparisons of their relative performances. By using so-called spike-in experiments, it is possible to characterize the analyzed data and thereby enable comparisons of different analysis methods. 相似文献14.
Background
Analysis of DNA microarray data usually begins with a normalization step where intensities of different arrays are adjusted to the same scale so that the intensity levels from different arrays can be compared with one other. Both simple total array intensity-based as well as more complex "local intensity level" dependent normalization methods have been developed, some of which are widely used. Much less developed methods for microarray data analysis include those that bypass the normalization step and therefore yield results that are not confounded by potential normalization errors. 相似文献15.
Ramon Diaz-Uriarte 《BMC bioinformatics》2008,9(1):30
Background
Censored data are increasingly common in many microarray studies that attempt to relate gene expression to patient survival. Several new methods have been proposed in the last two years. Most of these methods, however, are not available to biomedical researchers, leading to many re-implementations from scratch of ad-hoc, and suboptimal, approaches with survival data. 相似文献16.
Background
With current technology, vast amounts of data can be cheaply and efficiently produced in association studies, and to prevent data analysis to become the bottleneck of studies, fast and efficient analysis methods that scale to such data set sizes must be developed. 相似文献17.
Marcin Olszewski Anna Grot Marek Wojciechowski Marta Nowak Małgorzata Mickiewicz Józef Kur 《BMC microbiology》2010,10(1):260
Background
In recent years, there has been an increasing interest in SSBs because they find numerous applications in diverse molecular biology and analytical methods. 相似文献18.
Background
Low-level processing and normalization of microarray data are most important steps in microarray analysis, which have profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study the different normalization methods in detail and the nature of microarray data in general. 相似文献19.
Lior Shamir Nikita Orlov D Mark Eckley Tomasz Macura Josiah Johnston Ilya G Goldberg 《Source code for biology and medicine》2008,3(1):13
Background
Biological imaging is an emerging field, covering a wide range of applications in biological and clinical research. However, while machinery for automated experimenting and data acquisition has been developing rapidly in the past years, automated image analysis often introduces a bottleneck in high content screening. 相似文献20.