首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
Abstract

We discuss some of the problems that have frustrated the development of reliable model intermolecular potentials for polyatomic molecules. In particular, the usual assumption of an isotropic atom-atom model potential is analysed, and evidence for its inadequacies is presented. A new approach to designing model potentials, an anisotropic site—site model, is introduced by describing several applications to both small and organic molecules, including molecular dynamics and Monte Carlo simulations. The anisotropy required in an atom—atom potential can be directly linked to the non-spherical features in the valence electron distribution, such as lone pairs and π electrons. An accurate electrostatic model for these effects can be constructed from a distributed multipole analysis of the ab initio wavefunction. The empirically required forms of anisotropy in the repulsion potential can also be qualitatively linked to the molecular electron density difference map. Thus, consideration of the molecular bonding can be a useful indication of how to construct adequate model intermolecular pair potentials.  相似文献   

2.
VennPainter is a program for depicting unique and shared sets of genes lists and generating Venn diagrams, by using the Qt C++ framework. The software produces Classic Venn, Edwards’ Venn and Nested Venn diagrams and allows for eight sets in a graph mode and 31 sets in data processing mode only. In comparison, previous programs produce Classic Venn and Edwards’ Venn diagrams and allow for a maximum of six sets. The software incorporates user-friendly features and works in Windows, Linux and Mac OS. Its graphical interface does not require a user to have programing skills. Users can modify diagram content for up to eight datasets because of the Scalable Vector Graphics output. VennPainter can provide output results in vertical, horizontal and matrix formats, which facilitates sharing datasets as required for further identification of candidate genes. Users can obtain gene lists from shared sets by clicking the numbers on the diagram. Thus, VennPainter is an easy-to-use, highly efficient, cross-platform and powerful program that provides a more comprehensive tool for identifying candidate genes and visualizing the relationships among genes or gene families in comparative analysis.  相似文献   

3.
Integrative genomics predictors, which score highly in predicting bacterial essential genes, would be unfeasible in most species because the data sources are limited. We developed a universal approach and tool designated Geptop, based on orthology and phylogeny, to offer gene essentiality annotations. In a series of tests, our Geptop method yielded higher area under curve (AUC) scores in the receiver operating curves than the integrative approaches. In the ten-fold cross-validations among randomly upset samples, Geptop yielded an AUC of 0.918, and in the cross-organism predictions for 19 organisms Geptop yielded AUC scores between 0.569 and 0.959. A test applied to the very recently determined essential gene dataset from the Porphyromonas gingivalis, which belongs to a phylum different with all of the above 19 bacterial genomes, gave an AUC of 0.77. Therefore, Geptop can be applied to any bacterial species whose genome has been sequenced. Compared with the essential genes uniquely identified by the lethal screening, the essential genes predicted only by Gepop are associated with more protein-protein interactions, especially in the three bacteria with lower AUC scores (<0.7). This may further illustrate the reliability and feasibility of our method in some sense. The web server and standalone version of Geptop are available at http://cefg.uestc.edu.cn/geptop/ free of charge. The tool has been run on 968 bacterial genomes and the results are accessible at the website.  相似文献   

4.
Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. The smaller number of features reduces the problem’s dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation examines and evaluates simultaneously large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.  相似文献   

5.
Near linear scaling fragment based quantum chemical calculations are becoming increasingly popular for treating large systems with high accuracy and is an active field of research. However, it remains difficult to set up these calculations without expert knowledge. To facilitate the use of such methods, software tools need to be available to support these methods and help to set up reasonable input files which will lower the barrier of entry for usage by non-experts. Previous tools relies on specific annotations in structure files for automatic and successful fragmentation such as residues in PDB files. We present a general fragmentation methodology and accompanying tools called FragIt to help setup these calculations. FragIt uses the SMARTS language to locate chemically appropriate fragments in large structures and is applicable to fragmentation of any molecular system given suitable SMARTS patterns. We present SMARTS patterns of fragmentation for proteins, DNA and polysaccharides, specifically for D-galactopyranose for use in cyclodextrins. FragIt is used to prepare input files for the Fragment Molecular Orbital method in the GAMESS program package, but can be extended to other computational methods easily.  相似文献   

6.
The perpetually increasing rate at which viral full-genome sequences are being determined is creating a pressing demand for computational tools that will aid the objective classification of these genome sequences. Taxonomic classification approaches that are based on pairwise genetic identity measures are potentially highly automatable and are progressively gaining favour with the International Committee on Taxonomy of Viruses (ICTV). There are, however, various issues with the calculation of such measures that could potentially undermine the accuracy and consistency with which they can be applied to virus classification. Firstly, pairwise sequence identities computed based on multiple sequence alignments rather than on multiple independent pairwise alignments can lead to the deflation of identity scores with increasing dataset sizes. Also, when gap-characters need to be introduced during sequence alignments to account for insertions and deletions, methodological variations in the way that these characters are introduced and handled during pairwise genetic identity calculations can cause high degrees of inconsistency in the way that different methods classify the same sets of sequences. Here we present Sequence Demarcation Tool (SDT), a free user-friendly computer program that aims to provide a robust and highly reproducible means of objectively using pairwise genetic identity calculations to classify any set of nucleotide or amino acid sequences. SDT can produce publication quality pairwise identity plots and colour-coded distance matrices to further aid the classification of sequences according to ICTV approved taxonomic demarcation criteria. Besides a graphical interface version of the program for Windows computers, command-line versions of the program are available for a variety of different operating systems (including a parallel version for cluster computing platforms).  相似文献   

7.
Presumptive identification of different Enterobacteriaceae species is routinely achieved based on biochemical properties. Traditional practice includes manual comparison of each biochemical property of the unknown sample with known reference samples and inference of its identity based on the maximum similarity pattern with the known samples. This process is laborintensive, time-consuming, error-prone, and subjective. Therefore, automation of sorting and similarity in calculation would be advantageous. Here we present a MATLAB-based graphical user interface(GUI) tool named Bio Cluster. This tool was designed for automated clustering and identification of Enterobacteriaceae based on biochemical test results. In this tool, we used two types of algorithms, i.e., traditional hierarchical clustering(HC) and the Improved Hierarchical Clustering(IHC), a modified algorithm that was developed specifically for the clustering and identification of Enterobacteriaceae species. IHC takes into account the variability in result of 1–47 biochemical tests within this Enterobacteriaceae family. This tool also provides different options to optimize the clustering in a user-friendly way. Using computer-generated synthetic data and some real data, we have demonstrated that Bio Cluster has high accuracy in clustering and identifying enterobacterial species based on biochemical test data. This tool can be freely downloaded at http://microbialgen.du.ac.bd/biocluster/.  相似文献   

8.
《Fly》2013,7(5):279-281
Microsatellites show tremendous variation between genomes in terms of their occurrence and composition. Availability of whole genome sequences allows us to study microsatellite characteristics of fully sequenced insect genomes to understand the evolution and biological significance of microsatellites. InSatDb is an insect microsatellite database that provides an interactive interface to query information on microsatellites annotated with size (in base pairs and repeat units); genomic location (exon, intron, up-stream or transposon); nature (perfect or imperfect); and sequence composition (repeat motif and GC%). Here, we present a snap shot of the distribution and composition of microsatellites in introns and exons of insect genomes. The data present interesting observations regarding the microsatellite life-cycle and genome flux.  相似文献   

9.
In the last decade, the cost of genomic sequencing has been decreasing so much that researchers all over the world accumulate huge amounts of data for present and future use. These genomic data need to be efficiently stored, because storage cost is not decreasing as fast as the cost of sequencing. In order to overcome this problem, the most popular general-purpose compression tool, gzip, is usually used. However, these tools were not specifically designed to compress this kind of data, and often fall short when the intention is to reduce the data size as much as possible. There are several compression algorithms available, even for genomic data, but very few have been designed to deal with Whole Genome Alignments, containing alignments between entire genomes of several species. In this paper, we present a lossless compression tool, MAFCO, specifically designed to compress MAF (Multiple Alignment Format) files. Compared to gzip, the proposed tool attains a compression gain from 34% to 57%, depending on the data set. When compared to a recent dedicated method, which is not compatible with some data sets, the compression gain of MAFCO is about 9%. Both source-code and binaries for several operating systems are freely available for non-commercial use at: http://bioinformatics.ua.pt/software/mafco.  相似文献   

10.
A Genomic Islands (GI) is a chunk of DNA sequence in a genome whose origin can be traced back to other organisms or viruses. The detection of GIs plays an indispensable role in biomedical research, due to the fact that GIs are highly related to special functionalities such as disease-causing GIs - pathogenicity islands. It is also very important to visualize genomic islands, as well as the supporting features corresponding to the genomic islands in the genome. We have developed a program, Genomic Island Visualization (GIV), which displays the locations of genomic islands in a genome, as well as the corresponding supportive feature information for GIs. GIV was implemented in C++, and was compiled and executed on Linux/Unix operating systems.

Availability

GIV is freely available for non-commercial use at http://www5.esu.edu/cpsc/bioinfo/software/GIV  相似文献   

11.
In the present paper the ability of calibration free laser induced breakdown spectroscopy (CF-LIBS) as a quality control tool to monitor the composition of different minerals present in food supplement samples belonging to Indian brands (brand-A and brand-B) has been demonstrated. LIBS spectra of these two food supplements (brand-A and brand-B) available in the form of tablet have been recorded. As reported by manufacturers of these two food supplements, LIBS spectra of brand-A contains the spectral signatures of minerals like Ca, Mg, C, P, Zn, Fe, Cu, and Cr whereas LIBS spectra of brand-B shows the presence of spectral lines like Ca, Mg and C. The spectral signatures of Na and K are also found in both brands whereas spectral signature of Ti is observed only in brand-B but these elements are not mentioned on the nutritional label of the brands. The quantitative analysis of mineral contents in food supplements has been done using CF-LIBS for brand A and brand B to verify the content of the minerals reported by the manufacturer of the food supplements. Our results show that Ca and Mg are the main matrix elements of these brands. The concentration of minor and trace elements estimated using CF-LIBS technique is found in agreement with the reported nutritional values of both the brands. The concentration of major elements Ca and Mg are also estimated from Atomic Absorption Spectroscopy which is in close agreement with CF-LIBS result.  相似文献   

12.

Background

Highly parallel sequencing technologies have become important tools in the analysis of sequence polymorphisms on a genomic scale. However, the development of customized software to analyze data produced by these methods has lagged behind.

Methods/Principal Findings

Here I describe a tool, ‘galign’, designed to identify polymorphisms between sequence reads obtained using Illumina/Solexa technology and a reference genome. The ‘galign’ alignment tool does not use Smith-Waterman matrices for sequence comparisons. Instead, a simple algorithm comparing parsed sequence reads to parsed reference genome sequences is used. ‘galign’ output is geared towards immediate user application, displaying polymorphism locations, nucleotide changes, and relevant predicted amino-acid changes for ease of information processing. To do so, ‘galign’ requires several accessory files easily derived from an annotated reference genome. Direct sequencing as well as in silico studies demonstrate that ‘galign’ provides lesion predictions comparable in accuracy to available prediction programs, accompanied by greater processing speed and more user-friendly output. We demonstrate the use of ‘galign’ to identify mutations leading to phenotypic consequences in C. elegans.

Conclusion/Significance

Our studies suggest that ‘galign’ is a useful tool for polymorphism discovery, and is of immediate utility for sequence mining in C. elegans.  相似文献   

13.
14.
15.
胚胎干细胞研究是20世纪90年代以来在生物医学领域中最引人注目的热点之一,而新近发展起来的RNA干扰技术,能快速有效地沉默基因表达,将成为胚胎干细胞生物学研究的得力工具。现对RNA干扰的作用机制,以及RNA干扰应用于胚胎干细胞研究的方法与RNA干扰在胚胎干细胞研究领域的进展作一综述,以期为今后这方面的研究提供参考。  相似文献   

16.

Background & Objective

Managing data from large-scale projects (such as The Cancer Genome Atlas (TCGA)) for further analysis is an important and time consuming step for research projects. Several efforts, such as the Firehose project, make TCGA pre-processed data publicly available via web services and data portals, but this information must be managed, downloaded and prepared for subsequent steps. We have developed an open source and extensible R based data client for pre-processed data from the Firehouse, and demonstrate its use with sample case studies. Results show that our RTCGAToolbox can facilitate data management for researchers interested in working with TCGA data. The RTCGAToolbox can also be integrated with other analysis pipelines for further data processing.

Availability and implementation

The RTCGAToolbox is open-source and licensed under the GNU General Public License Version 2.0. All documentation and source code for RTCGAToolbox is freely available at http://mksamur.github.io/RTCGAToolbox/ for Linux and Mac OS X operating systems.  相似文献   

17.
18.
19.
Modern evolutionary research has much to contribute to medical research and health care practices. Conversely, evolutionary biologists are tapping into the rapidly expanding databases of medical genomic information to further their research. These two fields, which have historically functioned in almost complete isolation, are finding mutual benefit in the exchange of information. The long-term benefits of this synthesis of two major areas of research include improved health care. Recently, efforts to catalyze this relationship have brought together evolutionary biologists, medical practitioners, anthropologists, and ethicists to lay the groundwork for further collaboration and exploration. The range of overlap is surprisingly broad and potentially invaluable.  相似文献   

20.

Background

The exponential increase of published biomedical literature prompts the use of text mining tools to manage the information overload automatically. One of the most common applications is to mine protein-protein interactions (PPIs) from PubMed abstracts. Currently, most tools in mining PPIs from literature are using co-occurrence-based approaches or rule-based approaches. Hybrid methods (frame-based approaches) by combining these two methods may have better performance in predicting PPIs. However, the predicted PPIs from these methods are rarely evaluated by known PPI databases and co-occurred terms in Gene Ontology (GO) database.

Methodology/Principal Findings

We here developed a web-based tool, PPI Finder, to mine human PPIs from PubMed abstracts based on their co-occurrences and interaction words, followed by evidences in human PPI databases and shared terms in GO database. Only 28% of the co-occurred pairs in PubMed abstracts appeared in any of the commonly used human PPI databases (HPRD, BioGRID and BIND). On the other hand, of the known PPIs in HPRD, 69% showed co-occurrences in the literature, and 65% shared GO terms.

Conclusions

PPI Finder provides a useful tool for biologists to uncover potential novel PPIs. It is freely accessible at http://liweilab.genetics.ac.cn/tm/.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号