Found 20 similar articles (search time: 15 ms)
1.
Background
Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale.
2.
David M Mutch Alvin Berger Robert Mansourian Andreas Rytz Matthew-Alan Roberts 《BMC bioinformatics》2002,3(1):17-11
Background
The biomedical community is developing new methods of data analysis to more efficiently process the massive data sets produced by microarray experiments. Systematic and global mathematical approaches that can be readily applied to a large number of experimental designs become fundamental to correctly handle the otherwise overwhelming data sets.
3.
Background
There is an increasing demand to assemble and align large-scale biological sequence data sets. The commonly used multiple sequence alignment programs are still limited in their ability to handle very large amounts of sequences because the system lacks a scalable high-performance computing (HPC) environment with a greatly extended data storage capacity.
4.
Microarray data analysis: a practical approach for selecting differentially expressed genes
David M Mutch Alvin Berger Robert Mansourian Andreas Rytz Matthew-Alan Roberts 《Genome biology》2001,2(12):preprint00-29
Background
The biomedical community is rapidly developing new methods of data analysis for microarray experiments, with the goal of establishing new standards to objectively process the massive datasets produced from functional genomic experiments. Each microarray experiment measures thousands of genes simultaneously producing an unprecedented amount of biological information across increasingly numerous experiments; however, in general, only a very small percentage of the genes present on any given array are identified as differentially regulated. The challenge then is to process this information objectively and efficiently in order to obtain knowledge of the biological system under study and by which to compare information gained across multiple experiments. In this context, systematic and objective mathematical approaches, which are simple to apply across a large number of experimental designs, become fundamental to correctly handle the mass of data and to understand the true complexity of the biological systems under study.
5.
6.
Background
Genome assemblers have grown very large and complex in response to the need for algorithms to handle the challenges of large whole-genome sequencing projects. Many of the most common uses of assemblers, however, are best served by a simpler type of assembler that requires fewer software components, uses less memory, and is far easier to install and run.
7.
Background
The allele frequencies of single-nucleotide polymorphisms (SNPs) are needed to select an optimal subset of common SNPs for use in association studies. Sequence-based methods for finding SNPs with allele frequencies may need to handle thousands of sequences from the same genome location (sequences of deep coverage).
8.
9.
Missing value imputation for epistatic MAPs (total citations: 1; self-citations: 0; citations by others: 1)
Background
Epistatic miniarray profiling (E-MAPs) is a high-throughput approach capable of quantifying aggravating or alleviating genetic interactions between gene pairs. The datasets resulting from E-MAP experiments typically take the form of a symmetric pairwise matrix of interaction scores. These datasets have a significant number of missing values - up to 35% - that can reduce the effectiveness of some data analysis techniques and prevent the use of others. An effective method for imputing interactions would therefore increase the types of possible analysis, as well as increase the potential to identify novel functional interactions between gene pairs. Several methods have been developed to handle missing values in microarray data, but it is unclear how applicable these methods are to E-MAP data because of their pairwise nature and the significantly larger number of missing values. Here we evaluate four alternative imputation strategies, three local (Nearest neighbor-based) and one global (PCA-based), that have been modified to work with symmetric pairwise data.
10.
Background
Large-scale genetic mapping projects require data management systems that can handle complex phenotypes and detect and correct high-throughput genotyping errors, yet are easy to use.
11.
Victoria Martin-Requena Antonio Muñoz-Merida M Gonzalo Claros Oswaldo Trelles 《BMC bioinformatics》2009,10(1):16
Background
Nowadays, microarray gene expression analysis is a widely used technology that scientists handle but whose final interpretation usually requires the participation of a specialist. The need for this participation is due to the requirement of some background in statistics that most users lack or have a very vague notion of. Moreover, programming skills could also be essential to analyse these data. An interactive, easy to use application seems therefore necessary to help researchers to extract full information from data and analyse them in a simple, powerful and confident way.
12.
Angela CM Luyf Barbera DC van Schaik Michel de Vries Frank Baas Antoine HC van Kampen Silvia D Olabarriaga 《BMC bioinformatics》2010,11(1):598
Background
Bioinformatics is confronted with a new data explosion due to the availability of high throughput DNA sequencers. Data storage and analysis becomes a problem on local servers, and therefore it is needed to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users.
13.
Isabel A Nepomuceno-Chamorro Jesus S Aguilar-Ruiz Jose C Riquelme 《BMC bioinformatics》2010,11(1):517
Background
Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful in order to determine whether two genes have a strong global similarity but do not detect local similarities.
14.
Background
Visualization of sequence annotation is a common feature in many bioinformatics tools. For many applications it is desirable to restrict the display of such annotation according to a score cutoff, as biological interpretation can be difficult in the presence of the entire data. Unfortunately, many visualisation solutions are somewhat static in the way they handle such score cutoffs.
15.
Background
The analysis of high-throughput screening data sets is an expanding field in bioinformatics. High-throughput screens by RNAi generate large primary data sets which need to be analyzed and annotated to identify relevant phenotypic hits. Large-scale RNAi screens are frequently used to identify novel factors that influence a broad range of cellular processes, including signaling pathway activity, cell proliferation, and host cell infection. Here, we present a web-based application utility for the end-to-end analysis of large cell-based screening experiments by cellHTS2.
16.
Background
Automated protein function prediction methods are the only practical approach for assigning functions to genes obtained from model organisms. Many of the previously reported function annotation methods are of limited utility for fungal protein annotation. They are often trained only to one species, are not available for high-volume data processing, or require the use of data derived by experiments such as microarray analysis. To meet the increasing need for high throughput, automated annotation of fungal genomes, we have developed a tool for annotating fungal protein sequences with terms from the Gene Ontology.
17.
Background
Protein-protein interaction data used in the creation or prediction of molecular networks is usually obtained from large scale or high-throughput experiments. This experimental data is liable to contain a large number of spurious interactions. Hence, there is a need to validate the interactions and filter out the incorrect data before using them in prediction studies.
18.
Sacha AFT van Hijum Anne de Jong Richard JS Baerends Harma A Karsens Naomi E Kramer Rasmus Larsen Chris D den Hengst Casper J Albers Jan Kok Oscar P Kuipers
Background
In research laboratories using DNA-microarrays, usually a number of researchers perform experiments, each generating possible sources of error. There is a need for a quick and robust method to assess data quality and sources of errors in DNA-microarray experiments. To this end, a novel and cost-effective validation scheme was devised, implemented, and employed.
19.
Benjamin Schmid Johannes Schindelin Albert Cardona Mark Longair Martin Heisenberg 《BMC bioinformatics》2010,11(1):274
Background
Current imaging methods such as Magnetic Resonance Imaging (MRI), Confocal microscopy, Electron Microscopy (EM) or Selective Plane Illumination Microscopy (SPIM) yield three-dimensional (3D) data sets in need of appropriate computational methods for their analysis. The reconstruction, segmentation and registration are best approached from the 3D representation of the data set.
20.
Jeffrey C Miecznikowski Senthilkumar Damodaran Kimberly F Sellers Richard A Rabin 《Proteome science》2010,8(1):66