首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.

Background  

Fungi from environmental samples are typically identified to species level through DNA sequencing of the nuclear ribosomal internal transcribed spacer (ITS) region for use in BLAST-based similarity searches in the International Nucleotide Sequence Databases. These searches are time-consuming and regularly require a significant amount of manual intervention and complementary analyses. We here present software – in the form of an identification pipeline for large sets of fungal ITS sequences – developed to automate the BLAST process and several additional analysis steps. The performance of the pipeline was evaluated on a dataset of 350 ITS sequences from fungi growing as epiphytes on building material.  相似文献   

2.

Background  

The use of haplotype-based association tests can improve the power of genome-wide association studies. Since the observed genotypes are unordered pairs of alleles, haplotype phase must be inferred. However, estimating haplotype phase is time consuming. When millions of single-nucleotide polymorphisms (SNPs) are analyzed in genome-wide association study, faster methods for haplotype estimation are required.  相似文献   

3.

Background

Protein-protein interaction (PPI) plays a core role in cellular functions. Massively parallel supercomputing systems have been actively developed over the past few years, which enable large-scale biological problems to be solved, such as PPI network prediction based on tertiary structures.

Results

We have developed a high throughput and ultra-fast PPI prediction system based on rigid docking, “MEGADOCK”, by employing a hybrid parallelization (MPI/OpenMP) technique assuming usages on massively parallel supercomputing systems. MEGADOCK displays significantly faster processing speed in the rigid-body docking process that leads to full utilization of protein tertiary structural data for large-scale and network-level problems in systems biology. Moreover, the system was scalable as shown by measurements carried out on two supercomputing environments. We then conducted prediction of biological PPI networks using the post-docking analysis.

Conclusions

We present a new protein-protein docking engine aimed at exhaustive docking of mega-order numbers of protein pairs. The system was shown to be scalable by running on thousands of nodes. The software package is available at: http://www.bi.cs.titech.ac.jp/megadock/k/.
  相似文献   

4.
A whole-genome association study for pig reproductive traits   总被引:3,自引:0,他引:3  
A whole-genome association study was performed for reproductive traits in commercial sows using the PorcineSNP60 BeadChip and Bayesian statistical methods. The traits included total number born (TNB), number born alive (NBA), number of stillborn (SB), number of mummified foetuses at birth (MUM) and gestation length (GL) in each of the first three parities. We report the associations of informative QTL and the genes within the QTL for each reproductive trait in different parities. These results provide evidence of gene effects having temporal impacts on reproductive traits in different parities. Many QTL identified in this study are new for pig reproductive traits. Around 48% of total genes located in the identified QTL regions were predicted to be involved in placental functions. The genomic regions containing genes important for foetal developmental (e.g. MEF2C) and uterine functions (e.g. PLSCR4) were associated with TNB and NBA in the first two parities. Similarly, QTL in other foetal developmental (e.g. HNRNPD and AHR) and placental (e.g. RELL1 and CD96) genes were associated with SB and MUM in different parities. The QTL with genes related to utero-placental blood flow (e.g. VEGFA) and hematopoiesis (e.g. MAFB) were associated with GL differences among sows in this population. Pathway analyses using genes within QTL identified some modest underlying biological pathways, which are interesting candidates (e.g. the nucleotide metabolism pathway for SB) for pig reproductive traits in different parities. Further validation studies on large populations are warranted to improve our understanding of the complex genetic architecture for pig reproductive traits.  相似文献   

5.
He Y  Li C  Amos CI  Xiong M  Ling H  Jin L 《PloS one》2011,6(7):e22097
The genome-wide association study (GWAS) has become a routine approach for mapping disease risk loci with the advent of large-scale genotyping technologies. Multi-allelic haplotype markers can provide superior power compared with single-SNP markers in mapping disease loci. However, the application of haplotype-based analysis to GWAS is usually bottlenecked by prohibitive time cost for haplotype inference, also known as phasing. In this study, we developed an efficient approach to haplotype-based analysis in GWAS. By using a reference panel, our method accelerated the phasing process and reduced the potential bias generated by unrealistic assumptions in phasing process. The haplotype-based approach delivers great power and no type I error inflation for association studies. With only a medium-size reference panel, phasing error in our method is comparable to the genotyping error afforded by commercial genotyping solutions.  相似文献   

6.

Background  

Massively parallel sequencing readouts of epigenomic assays are enabling integrative genome-wide analyses of genomic and epigenomic variation. Pash 3.0 performs sequence comparison and read mapping and can be employed as a module within diverse configurable analysis pipelines, including ChIP-Seq and methylome mapping by whole-genome bisulfite sequencing.  相似文献   

7.
8.
Spectronet is a package that uses various methods for exploring and visualising complex evolutionary signals. Given an alignment in NEXUS format, the package works by computing a collection of weighted splits or bipartitions of the taxa and then allows the user to interactively analyse the resulting collection using tools such as Lento-plots and median networks. The package is highly interactive and available for PCs.  相似文献   

9.
ClustalW-MPI: ClustalW analysis using distributed and parallel computing   总被引:7,自引:0,他引:7  
ClustalW is a tool for aligning multiple protein or nucleotide sequences. The alignment is achieved via three steps: pairwise alignment, guide-tree generation and progressive alignment. ClustalW-MPI is a distributed and parallel implementation of ClustalW. All three steps have been parallelized to reduce the execution time. The software uses a message-passing library called MPI (Message Passing Interface) and runs on distributed workstation clusters as well as on traditional parallel computers.  相似文献   

10.
Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.  相似文献   

11.
An algorithm for multiple sequence comparison was implementedin FORTRAN 77 for VAX/VMS in GCG-atible format. The MULTICOMPprogram package includes several procedures with which one querysequence can be compared simultaneously to several DNA, RNAor amino acid sequences. The same technique was also introducedfor comparing propensities of secondary structural features,which can be predicted on the basis of amino acid sequences.The technique has been applied to a wide range of sequence andstructural analyses.  相似文献   

12.
SUMMARY: TREE-PUZZLE is a program package for quartet-based maximum-likelihood phylogenetic analysis (formerly PUZZLE, Strimmer and von Haeseler, Mol. Biol. Evol., 13, 964-969, 1996) that provides methods for reconstruction, comparison, and testing of trees and models on DNA as well as protein sequences. To reduce waiting time for larger datasets the tree reconstruction part of the software has been parallelized using message passing that runs on clusters of workstations as well as parallel computers. AVAILABILITY: http://www.tree-puzzle.de. The program is written in ANSI C. TREE-PUZZLE can be run on UNIX, Windows and Mac systems, including Mac OS X. To run the parallel version of PUZZLE, a Message Passing Interface (MPI) library has to be installed on the system. Free MPI implementations are available on the Web (cf. http://www.lam-mpi.org/mpi/implementations/).  相似文献   

13.
A software package is described to reconstruct three-dimensional pictures in true perspective from a series of parallel sections using a low-cost computer system (Apple II plus). Data sampling via a graphic tablet and graphical output on the monitor screen or a digital plotter are assigned to different programs under control of a menu program. The number of data representing the object under study is unlimited. Originally written in BASIC, the programs were translated to machine language. As an application of the package, reconstructions of an identified large interneuron of the locust brain are presented.  相似文献   

14.
The large choice of Distributed Computing Infrastructures (DCIs) available allows users to select and combine their preferred architectures amongst Clusters, Grids, Clouds, Desktop Grids and more. In these hybrid DCIs, elasticity is emerging as a key property. In elastic infrastructures, resources available to execute application continuously vary, either because of application requirements or because of constraints on the infrastructure, such as node volatility. In the former case, there is no guarantee that the computing resources will remain available during the entire execution of an application. In this paper, we show that Bag-of-Tasks (BoT) execution on these “Best-Effort” infrastructures suffer from a drop of the task completion rate at the end of the execution. The SpeQuloS service presented in this paper improves the Quality of Service (QoS) of BoT applications executed on hybrid and elastic infrastructures. SpeQuloS monitors the execution of the BoT, and dynamically supplies fast and reliable Cloud resources when the critical part of the BoT is executed. SpeQuloS offers several features to hybrid DCIs users, such as estimating completion time and execution speedup. Performance evaluation shows that BoT executions can be accelerated by a factor 2, while offloading less than 2.5 % of the workload to the Cloud. We report on several scenarios where SpeQuloS is deployed on hybrid infrastructures featuring a large variety of infrastructures combinations. In the context of the European Desktop Grid Initiative (EDGI), SpeQuloS is operated to improve QoS of Desktop Grids using resources from private Clouds. We present a use case where SpeQuloS uses both EC2 regular and spot instances to decrease the cost of computation while preserving a similar QoS level. Finally, in the last scenario SpeQuloS allows to optimize Grid5000 resources utilization.  相似文献   

15.
16.
Programs for the calculation, storage and analysis of restriction maps derived from the analysis of partial digestion products from end labelled DNA (1,2,3) and their correlation with digestion - and hybridisation patterns in total digestions and Southern blot experiments are described. These programs allow direct input of gel patterns from partial or complete digestion experiments using a digitizer tablet, calculation of molecular weights and restriction maps, plotting of maps and actual or predicted fragment patterns and automated identification of overlapping cosmids from partial restriction mapping results. Programs are written in PASCAL and have been implemented on a VAX/VMS system, with a HP-7221T plotter and a digitizing tablet.  相似文献   

17.
18.
ANTHEPROT: a package for protein sequence analysis using a microcomputer   总被引:2,自引:0,他引:2  
A simple microcomputer package is described to make the theoreticalanalysis of protein sequences. Several methods designed to comparetwo sequences, to model proteolytic reactions and to predictthe secondary structure, the hydro-phobic/hydrophilic regionsand the potential antigenic sites of proteins have been includedin an Apple II microcomputer software. The package comprises21 programs as well as the secondary structure database of Kabschand Sander (1983). Received on November 24, 1987; accepted on March 8, 1988  相似文献   

19.

Background  

Large multigene sequence alignments have over recent years been increasingly employed for phylogenomic reconstruction of the eukaryote tree of life. Such supermatrices of sequence data are preferred over single gene alignments as they contain vastly more information about ancient sequence characteristics, and are thus more suitable for resolving deeply diverging relationships. However, as alignments are expanded, increasingly numbers of sites with misleading phylogenetic information are also added. Therefore, a major goal in phylogenomic analyses is to maximize the ratio of information to noise; this can be achieved by the reduction of fast evolving sites.  相似文献   

20.
Copy number alterations are important contributors to many genetic diseases, including cancer. We present the readDepth package for R, which can detect these aberrations by measuring the depth of coverage obtained by massively parallel sequencing of the genome. In addition to achieving higher accuracy than existing packages, our tool runs much faster by utilizing multi-core architectures to parallelize the processing of these large data sets. In contrast to other published methods, readDepth does not require the sequencing of a reference sample, and uses a robust statistical model that accounts for overdispersed data. It includes a method for effectively increasing the resolution obtained from low-coverage experiments by utilizing breakpoint information from paired end sequencing to do positional refinement. We also demonstrate a method for inferring copy number using reads generated by whole-genome bisulfite sequencing, thus enabling integrative study of epigenomic and copy number alterations. Finally, we apply this tool to two genomes, showing that it performs well on genomes sequenced to both low and high coverage. The readDepth package runs on Linux and MacOSX, is released under the Apache 2.0 license, and is available at http://code.google.com/p/readdepth/.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号