Found 20 similar articles (search time: 0 ms)
1.
In this report we describe a novel, graphically oriented method for pathway modeling and a software package that allows both modeling and visualization of biological networks in a user-friendly format. The Visinets mathematical approach is based on causal mapping (CMAP), which has been fully integrated with a graphical interface. This integration allows a fully graphical and interactive modeling process, from building the network to simulating the finished model. To test the performance of the Visinets software, we applied it to (a) create an executable EGFR-MAPK pathway model using an intuitive graphical way of modeling based on biological data, and (b) translate an existing ordinary differential equation (ODE) based insulin signaling model into the CMAP formalism and compare the results. Our testing confirmed the potential of the CMAP method for broad application in pathway modeling and visualization and, additionally, showed a significant advantage in computational efficiency. Furthermore, we showed that the Visinets web-based graphical platform, along with a standardized method of pathway analysis, may offer a novel and attractive alternative for real-time dynamic simulation for broader use in biomedical research. Since Visinets uses graphical elements with the mathematical formulas hidden from users, we believe that this tool may be particularly suited to those who are new to pathway modeling and lack the in-depth modeling skills often required by other software packages.
2.
Jing Li Xudong Qu Xinyi He Lian Duan Guojun Wu Dexi Bi Zixin Deng Wen Liu Hong-Yu Ou 《PloS one》2012,7(9)
Thiopeptides are a growing class of sulfur-rich, highly modified heterocyclic peptides that are mainly active against Gram-positive bacteria, including various drug-resistant pathogens. Recent studies also reveal that many thiopeptides inhibit the proliferation of human cancer cells, further expanding their potential for clinical use. Thiopeptide biosynthesis shares a common paradigm, featuring a ribosomally synthesized precursor peptide and conserved posttranslational modifications that afford a characteristic core system, but differs in the tailoring steps that furnish individual members. Identifying new thiopeptide gene clusters, by taking advantage of the increasing amount of DNA sequence information from bacteria, may facilitate the discovery of new thiopeptides and enrich the set of unique biosynthetic elements available for producing novel drug leads by combinatorial biosynthesis. In this study, we have developed a web-based tool, ThioFinder, to rapidly identify thiopeptide biosynthetic gene clusters from DNA sequences using a profile Hidden Markov Model approach. Fifty-four new putative thiopeptide biosynthetic gene clusters were found in the sequenced bacterial genomes of microorganisms not previously known to be producers. ThioFinder is fully supported by an open-access database, ThioBase, which contains information on the 99 known thiopeptides regarding chemical structure, biological activity, producing organism, and biosynthetic gene (cluster), along with the associated genome if available. The ThioFinder website offers researchers a unique resource and great flexibility for sequence analysis of thiopeptide biosynthetic gene clusters. ThioFinder is freely available at http://db-mml.sjtu.edu.cn/ThioFinder/.
3.
Despite recent developments in analyzing RNA secondary structures, relatively few RNA structures have been determined. To date, many investigators have relied on the traditional method of using structure-specific RNAse enzymes to probe RNA secondary structures. However, if these data were combined with novel computational approaches, investigators would have an informative and valuable tool for RNA structural analysis. To this end, we created the web server “RNAdigest.” RNAdigest uses mfold RNA structural models to predict the results of RNAse digestion experiments. RNAdigest also uses both the RNA sequence and the experimental digestion patterns to formulate constraints for predicting the secondary structures of the RNA. Thus, RNAdigest allows for the structural interpretation of RNAse digestion experiments. Overall, RNAdigest simplifies the analysis of RNAse digestion results while allowing for the identification of unique fragments. These unique fragments can then be used for testing predicted mfold structures and for designing structure-specific DNA/RNA probes.
4.
ScriptingRT is a new open source tool to collect response latencies in online studies of human cognition. ScriptingRT studies run as Flash applets in enabled browsers. ScriptingRT provides the building blocks of response latency studies, which are then combined with generic Apache Flex programming. Six studies evaluate the performance of ScriptingRT empirically. Studies 1–3 use specialized hardware to measure the variance of response time measurement and stimulus presentation timing. Studies 4–6 implement a Stroop paradigm and run it both online and in the laboratory, comparing ScriptingRT to other response latency software. Altogether, the studies show that Flash programs developed in ScriptingRT show a small lag and an increased variance in response latencies. However, this did not significantly influence measured effects: the Stroop effect was reliably replicated in all studies, and the observed effects did not depend on the software used. We conclude that ScriptingRT can be used to test response latency effects online.
5.
Background
The analysis of biological networks has become a major challenge due to the recent development of high-throughput techniques that are rapidly producing very large data sets. The exploding volumes of biological data demand extreme computational power and special computing facilities (i.e. super-computers). An inexpensive solution, such as General Purpose computation based on Graphics Processing Units (GPGPU), can be adapted to tackle this challenge, but the limited internal memory of the device can pose a new problem of scalability. Efficient data and computational parallelism with partitioning is required to provide a fast and scalable solution to this problem.
Results
We propose an efficient parallel formulation of the k-Nearest Neighbour (kNN) search problem, which is a popular method for classifying objects in several fields of research, such as pattern recognition, machine learning and bioinformatics. Although kNN search is simple and straightforward, its performance degrades dramatically for large data sets, since the task is computationally intensive. The proposed approach is not only fast but also scalable to large-scale instances. Based on our approach, we implemented a software tool GPU-FS-kNN (GPU-based Fast and Scalable k-Nearest Neighbour) for CUDA enabled GPUs. The basic approach is simple and adaptable to other available GPU architectures. We observed speed-ups of 50–60 times compared with CPU implementation on a well-known breast microarray study and its associated data sets.
Conclusion
Our GPU-based Fast and Scalable k-Nearest Neighbour search technique (GPU-FS-kNN) provides a significant performance improvement for nearest neighbour computation in large-scale networks. Source code and the software tool are available under GNU Public License (GPL) at https://sourceforge.net/p/gpufsknn/.
6.
NAExplor is a software tool for converting coordinate files between the software packages AMBER, CHARMM, and XPLOR. In addition, it manages the conversion of NMR-derived distance restraint information from the MARDIGRAS program into the appropriate file formats used for input in AMBER, CHARMM, and XPLOR. Analyses of H-H distances in nucleic acid structures and calculations of torsion angles for nucleic acid backbone and riboses are also possible.
7.
Michael A. DeJesus Chaitra Ambadipudi Richard Baker Christopher Sassetti Thomas R. Ioerger 《PLoS computational biology》2015,11(10)
TnSeq has become a popular technique for determining the essentiality of genomic regions in bacterial organisms. Several methods have been developed to analyze the wealth of data that has been obtained through TnSeq experiments. We developed a tool for analyzing Himar1 TnSeq data called TRANSIT. TRANSIT provides a graphical interface to three different statistical methods for analyzing TnSeq data. These methods cover a variety of approaches capable of identifying essential genes in individual datasets as well as comparative analysis between conditions. We demonstrate the utility of this software by analyzing TnSeq datasets of M. tuberculosis grown on glycerol and cholesterol. We show that TRANSIT can be used to discover genes which have been previously implicated for growth on these carbon sources. TRANSIT is written in Python, and thus can be run on Windows, OSX and Linux platforms. The source code is distributed under the GNU GPL v3 license and can be obtained from the following GitHub repository: https://github.com/mad-lab/transit
This is a PLOS Computational Biology Software paper.
8.
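J. R. Garbe Y. Da 《遗传学报 (Acta Genetica Sinica)》2003,30(12):1193-1195
Large pedigree structures in genetic and family studies remain difficult to analyze. Drawing the pedigree is usually the first step in the analysis of a genetic trait. A pedigree diagram shows the structure of the whole population, the relationships among individuals, and the direction of gene flow, which helps in understanding the nature of a genetic trait. As the number and complexity of the families studied increase, drawing a clear pedigree can become very difficult. The software Pedigraph was developed to address this problem. Pedigraph can draw pedigrees for large, complex populations and perform the corresponding pedigree analyses. Preliminary tests indicate that the software is a useful tool for research on the genetics and breeding of animals and plants, and that it can also be used for studies of human populations, history, and related areas.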
9.
The identification of virulent proteins in any de novo sequenced genome is useful in estimating its pathogenic ability and understanding the mechanism of pathogenesis. Similarly, the identification of such proteins could be valuable in comparing the metagenomes of healthy and diseased individuals and estimating the proportion of pathogenic species. However, the common challenge in both of these tasks is the identification of virulent proteins, since a significant proportion of genomic and metagenomic proteins are novel and as yet unannotated. The currently available tools for identifying virulent proteins provide limited accuracy and cannot be used on large datasets. Therefore, we have developed MP3, a standalone tool and web server for the prediction of pathogenic proteins in both genomic and metagenomic datasets. MP3 uses an integrated Support Vector Machine (SVM) and Hidden Markov Model (HMM) approach to carry out fast, sensitive and accurate prediction of pathogenic proteins. It displayed Sensitivity, Specificity, MCC and accuracy values of 92%, 100%, 0.92 and 96%, respectively, on a blind dataset constructed using complete proteins. On the two metagenomic blind datasets (Blind A: 51–100 amino acids and Blind B: 30–50 amino acids), it displayed Sensitivity, Specificity, MCC and accuracy values of 82.39%, 97.86%, 0.80 and 89.32% for Blind A and 71.60%, 94.48%, 0.67 and 81.86% for Blind B, respectively. In addition, the performance of MP3 was validated on selected bacterial genomic and real metagenomic datasets. To our knowledge, MP3 is the only program that specializes in fast and accurate identification of partial pathogenic proteins predicted from short (100–150 bp) metagenomic reads and also performs exceptionally well on complete protein sequences. MP3 is publicly available at http://metagenomics.iiserb.ac.in/mp3/index.php.
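The four figures reported for MP3 all derive from a confusion matrix. The sketch below shows the standard formulas in Python. The counts are hypothetical: the abstract does not give the size or class balance of the blind dataset, so a balanced set of 100 pathogenic and 100 non-pathogenic proteins is assumed here only because it reproduces the reported 92% / 100% / 0.92 / 96%.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Matthews correlation coefficient; defined as 0 when a marginal is empty
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical balanced blind set (an assumption, not data from the paper)
metrics = classification_metrics(tp=92, fn=8, tn=100, fp=0)
print({name: round(value, 2) for name, value in metrics.items()})
```

Because MCC depends on class balance, the same sensitivity and specificity would give a different MCC on an unbalanced dataset.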
10.
We outline a framework for evaluating food- and water-borne surveillance systems using hospitalization records, and demonstrate the approach using data on salmonellosis, campylobacteriosis and giardiasis in persons aged ≥65 years in Massachusetts. For each infection, and for each reporting jurisdiction, we generated smoothed standardized morbidity ratios (SMR) and surveillance to hospitalization ratios (SHR) by comparing observed surveillance counts with expected values or the number of hospitalized cases, respectively. We examined the spatial distribution of SHR and related this to the mean for the entire state. Through this approach municipalities that deviated from the typical experience were identified and suspected of under-reporting. Regression analysis revealed that SHR was a significant predictor of SMR, after adjusting for population age-structure. This confirms that the spatial “signal” depicted by surveillance is in part influenced by inconsistent testing and reporting practices since municipalities that reported fewer cases relative to the number of hospitalizations had a lower relative risk (as estimated by SMR). Periodic assessment of SHR has potential in assessing the performance of surveillance systems.
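The two ratios defined above are simple to compute once the counts are available. The following Python sketch uses invented counts for three municipalities and omits the smoothing step mentioned in the abstract. The flagging rule (an SHR below half of the pooled statewide SHR) and the use of a pooled ratio as the state reference are assumptions made for illustration, not the authors' criteria.

```python
# Hypothetical counts per municipality:
# (surveillance reports, hospitalized cases, expected surveillance count)
towns = {
    "A": (40, 10, 35.0),
    "B": (6, 12, 30.0),
    "C": (25, 8, 22.0),
}

def surveillance_ratios(towns, flag_fraction=0.5):
    """Return the pooled statewide SHR and per-town SMR, SHR and a flag.

    Assumes every town has non-zero hospitalizations and expected counts.
    """
    total_reports = sum(reports for reports, _, _ in towns.values())
    total_hosp = sum(hosp for _, hosp, _ in towns.values())
    state_shr = total_reports / total_hosp
    results = {}
    for name, (reports, hosp, expected) in towns.items():
        smr = reports / expected   # observed vs expected surveillance count
        shr = reports / hosp       # surveillance vs hospitalized cases
        results[name] = {
            "smr": smr,
            "shr": shr,
            # Illustrative rule only: far below the statewide ratio
            "flagged": shr < flag_fraction * state_shr,
        }
    return state_shr, results

state_shr, results = surveillance_ratios(towns)
flagged = sorted(name for name, r in results.items() if r["flagged"])
print(round(state_shr, 2), flagged)
```

In this toy data, town B reports few cases relative to its hospitalizations and also has the lowest SMR, which is the pattern the abstract describes.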
11.
12.
13.
Sherif H. Elmeligy Abdelhamid Chris J. Kuhlman Madhav V. Marathe Henning S. Mortveit S. S. Ravi 《PloS one》2015,10(8)
Discrete dynamical systems are used to model various realistic systems in network science, from social unrest in human populations to regulation in biological networks. A common approach is to model the agents of a system as vertices of a graph, and the pairwise interactions between agents as edges. Agents are in one of a finite set of states at each discrete time step and are assigned functions that describe how their states change based on neighborhood relations. Full characterization of state transitions of one system can give insights into fundamental behaviors of other dynamical systems. In this paper, we describe a discrete graph dynamical systems (GDSs) application called GDSCalc for computing and characterizing system dynamics. It is an open access system that is used through a web interface. We provide an overview of GDS theory. This theory is the basis of the web application; i.e., an understanding of GDS provides an understanding of the software features, while abstracting away implementation details. We present a set of illustrative examples to demonstrate its use in education and research. Finally, we compare GDSCalc with other discrete dynamical system software tools. Our perspective is that no single software tool will perform all computations that may be required by all users; tools typically have particular features that are more suitable for some tasks. We situate GDSCalc within this space of software tools.
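As a minimal illustration of the graph dynamical system idea described above, the Python sketch below enumerates the full phase space of a small synchronous system. The choices here are assumptions for the example and are not taken from GDSCalc: binary states, a threshold vertex function over the closed neighborhood, and a 4-cycle graph.

```python
from itertools import product

def step(state, neighbors, threshold):
    """Synchronous update: a vertex becomes 1 when the number of 1s in its
    closed neighborhood (itself plus its neighbors) reaches the threshold."""
    return tuple(
        1 if state[v] + sum(state[u] for u in neighbors[v]) >= threshold else 0
        for v in range(len(state))
    )

def phase_space(neighbors, threshold):
    """Map every one of the 2^n states to its successor state."""
    n = len(neighbors)
    return {s: step(s, neighbors, threshold) for s in product((0, 1), repeat=n)}

# Hypothetical example graph: a cycle on four vertices
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
transitions = phase_space(neighbors, threshold=2)
fixed_points = sorted(s for s, t in transitions.items() if s == t)
print(len(transitions), fixed_points)
```

Exhaustive enumeration is only feasible for small graphs, since the number of states grows as 2^n; here it reveals six fixed points and one 2-cycle between (1, 0, 1, 0) and (0, 1, 0, 1).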
14.
It is inevitable that tree species will undergo considerable range shifts in response to anthropogenically induced climate change, even in the near future. Species Distribution Models (SDMs) are valuable tools in exploring general temporal trends and spatial patterns of potential range shifts. Understanding projections to future climate for tree species will facilitate policy making in forestry. Comparative studies for a large number of tree species require the availability of suitable and standardized indices. A crucial limitation when deriving such indices is the threshold problem in defining ranges, which has made interspecies comparison problematic until now. Here we propose a set of threshold-free indices, which measure range explosion (I), overlapping (O), and range center movement in three dimensions (Dx, Dy, Dz), based on fuzzy set theory (Fuzzy Set based Potential Range Shift Index, F-PRS Index). A graphical tool (PRS_Chart) was developed to visualize these indices. This technique was then applied to 46 Pinaceae species that are widely distributed and partly common in China. The spatial patterns of the modeling results were then statistically tested for significance. Results showed that range overlap was generally low; no trends in range size changes and longitudinal movements could be found, but northward and poleward movement trends were highly significant. Although range shifts seemed to exhibit huge interspecies variation, they were very consistent for certain climate change scenarios. Comparing the IPCC scenarios, we found that scenario A1B would lead to a larger extent of range shifts (less overlapping and more latitudinal movement) than the A2 and the B1 scenarios. It is expected that the newly developed standardized indices and the respective graphical tool will facilitate studies on PRSs for other tree species groups that are important in forestry as well, and thus support climate adaptive forest management.
15.
16.
Clifton K. Fagerquist Brandon R. Garbus Katherine E. Williams Anna H. Bates Síobhán Boyle Leslie A. Harden 《Applied and environmental microbiology》2009,75(13):4341-4353
We have developed web-based software for the rapid identification of protein biomarkers of bacterial microorganisms. Proteins from bacterial cell lysates were ionized by matrix-assisted laser desorption ionization (MALDI), mass isolated, and fragmented using a tandem time of flight (TOF-TOF) mass spectrometer. The sequence-specific fragment ions generated were compared to a database of in silico fragment ions derived from bacterial protein sequences whose molecular weights are the same as the nominal molecular weights of the protein biomarkers. A simple peak-matching and scoring algorithm was developed to compare tandem mass spectrometry (MS-MS) fragment ions to in silico fragment ions. In addition, a probability-based significance-testing algorithm (P value), developed previously by other researchers, was incorporated into the software for the purpose of comparison. The speed and accuracy of the software were tested by identification of 10 protein biomarkers from three Campylobacter strains that had been identified previously by bottom-up proteomics techniques. Protein biomarkers were identified using (i) their peak-matching scores and/or P values from a comparison of MS-MS fragment ions with all possible in silico N and C terminus fragment ions (i.e., ions a, b, b-18, y, y-17, and y-18), (ii) their peak-matching scores and/or P values from a comparison of MS-MS fragment ions to residue-specific in silico fragment ions (i.e., in silico fragment ions resulting from polypeptide backbone fragmentation adjacent to specific residues [aspartic acid, glutamic acid, proline, etc.]), and (iii) fragment ion error analysis, which distinguished the systematic fragment ion error of a correct identification (caused by calibration drift of the second TOF mass analyzer) from the random fragment ion error of an incorrect identification.
Food-borne illness is a serious and continuing problem, with an estimated 76 million cases in the United States per year (http://www.cdc.gov).
It is often caused by bacteria and viruses that are often ubiquitous in the environment and are difficult to eliminate due to their ability to adapt. In addition to the resulting morbidity, food-borne illness also has enormous societal costs, including losses in worker productivity due to illness, recall of food products determined (or suspected) to be contaminated, etc. Consequently, there is a critical need to develop rapid and sensitive methods for detection and accurate identification of food-borne pathogens.
A number of techniques have been developed for detection and identification of food-borne pathogens. A relatively recent technique for bacterial identification involves the use of mass spectrometry (MS). Because of its sensitivity and high specificity, MS has become a popular technique for chemicotaxonomic classification of microorganisms (16, 27). The use of MS in the analysis of microorganisms is a relatively recent application that was dramatically accelerated by the development of two ionization techniques in the late 1980s and early 1990s: electrospray ionization (15) and matrix-assisted laser desorption ionization (MALDI) (24, 37). When coupled with time of flight (TOF) MS, MALDI has been demonstrated to be a powerful tool for “fingerprinting” microorganisms by ionization and detection of proteins from intact bacterial cells or extracts resulting from bacterial cell lysis (1, 2, 3, 8-12, 19, 21, 25, 26, 29, 34, 40, 41, 42). Typically, MALDI-TOF MS “fingerprinting” of microorganisms involves analysis using either pattern recognition or bioinformatic algorithms.
Pattern recognition analysis compares MALDI-TOF MS spectra of samples of unknown microorganisms to spectra of known microorganisms. A high degree of similarity between the MS spectrum of an unknown microorganism and an MS spectrum of a known microorganism strongly suggests the identity of the unknown microorganism (22, 39, 43).
It should be noted that pattern recognition analysis does not rely on actual identification of the biomarker ion peaks in an MS spectrum. It is the pattern generated by multiple ion peaks that constitutes a microorganism's “fingerprint.” The actual identities of individual ion peaks are not specified, and the peaks could be peaks for any of a number of possible biological molecules generated by a microorganism, including proteins, nucleic acids, lipids, etc.
Microorganism identification by bioinformatic analysis of MALDI-TOF MS data involves using the protein molecular weights (MWs) in bacterial genomic databases to assign biomarker ion peaks in a mass spectrum to specific proteins (4, 5, 32, 33, 45). If a significant number of biomarker ion peaks in a mass spectrum correspond to protein MWs for the open reading frames of a microorganism's genome, then the microorganism is considered identified. Such an analysis has also incorporated the simplest and most common posttranslational modification (PTM) observed for bacterial proteins, N-terminal methionine cleavage (5). It should be noted, however, that “identification” of a microorganism relies solely on a sufficient number of protein MWs derived from open reading frames of its genome corresponding to the m/z of biomarker ions in a MALDI-TOF MS spectrum. However, the protein MW alone is not sufficient to definitively identify a biomarker ion as a specific protein. Protein biomarkers are considered to be tentatively assigned instead of definitively identified.
Analysis of samples containing multiple bacterial organisms presents increased challenges for MALDI-TOF MS when protein MW is the sole criterion for protein biomarker identification. Clearly, it would be advantageous if researchers could obtain more information about a biomarker in addition to its MW.
In the case of protein biomarkers, this can be accomplished by enzymatically digesting a protein in solution and analyzing its tryptic peptides by MS (peptide mass mapping) or by tandem MS (MS-MS) (sequence tags) (45). Alternatively, it is possible to fragment mature, intact proteins (without digestion) in the gas phase to obtain sequence-specific and PTM information. This approach is referred to as top-down proteomics. Until recently, top-down proteomics was possible only if Fourier transform ion cyclotron resonance MS involving complicated gas phase ion dissociation techniques was used (6, 23).
Although not originally designed for top-down proteomics, recently developed MALDI-tandem TOF (MALDI-TOF-TOF) MS was shown to fragment small or modest-size proteins (5 kDa > molecular mass < 15 kDa) without prior digestion (28). Demirev and coworkers (7) identified Bacillus atrophaeus and Bacillus cereus spores by fragmenting their protein biomarkers using a MALDI tandem mass spectrometer and analyzing the sequence-specific fragment ions generated by comparison to in silico fragment ions derived from protein amino acid sequences from genomic databases. Protein and microorganism identities were determined using a probability-based significance-testing algorithm (P value). The P value algorithm calculates the probability that a protein or microorganism identification occurred randomly. The smaller the P value, the lower the probability that an identification occurred randomly. The data analysis was performed using software developed in house (7).
In the current study, web-based software and databases, developed in house at the U.S. Department of Agriculture (USDA), were used to identify 10 protein biomarkers from three pure strains of Campylobacter by sequence-specific fragmentation using a MALDI-TOF-TOF mass spectrometer.
Many of the protein biomarkers had been identified previously by bottom-up proteomics techniques (9, 11, 12), which provided an excellent data set to test the accuracy and performance of the algorithms incorporated into the software. MALDI-TOF-TOF MS-MS fragment ions were compared with a database of in silico fragment ions derived from bacterial protein sequences. The sequence-specific MS-MS fragment ions were used to identify a protein and thus the source microorganism. A simple peak-matching mathematical algorithm, incorporated into the software, was used to score and rank protein and microorganism identifications. In addition, the P value algorithm of Demirev and coworkers (7) was also incorporated into the USDA software (available with execution of appropriate control usage agreement) for comparison to the peak-matching algorithm. The peak-matching algorithm correctly identified a protein biomarker among as many as ∼1,400 possible bacterial proteins and gave rankings for protein identification comparable to the rankings obtained by more complicated and computationally intensive P value calculation. We often observed enhancement of the score for correct identification when results for MS-MS fragment ions were compared to results for residue-specific in silico fragment ions compared to non-residue-specific in silico fragment ions. In addition, the correctness of the algorithm's identification was, in certain cases, further confirmed by fragment ion error analysis which compared random error caused by false matches between MS-MS fragment ions and in silico fragment ions with the systematic error observed for correct matches due to drift in the calibration of the TOF mass analyzer (38).
(Portions of this work were presented at the 121st AOAC Conference [13] and at the 55th American Society of Mass Spectrometry Conference [14].)
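The text describes the peak-matching algorithm only as "simple", so the Python sketch below is one plausible reading rather than the USDA implementation. It assumes that a match means an observed MS-MS fragment ion lies within a fixed m/z tolerance of any in silico fragment ion, and that the score is the percentage of observed ions matched. The function names, the tolerance and all m/z values are hypothetical.

```python
from bisect import bisect_left

def count_matches(observed, in_silico, tol):
    """Count observed m/z values lying within +/- tol of an in silico ion."""
    in_silico = sorted(in_silico)
    matches = 0
    for mz in observed:
        # First in silico ion at or above the lower edge of the window
        i = bisect_left(in_silico, mz - tol)
        if i < len(in_silico) and in_silico[i] <= mz + tol:
            matches += 1
    return matches

def rank_candidates(observed, candidates, tol=2.5):
    """Score each candidate protein (assumed score: percent of observed
    ions matched) and return (score, name) pairs, best first."""
    if not observed:
        return []
    scored = [
        (100.0 * count_matches(observed, frags, tol) / len(observed), name)
        for name, frags in candidates.items()
    ]
    return sorted(scored, reverse=True)

# Hypothetical MS-MS fragment ions and in silico fragment lists
observed = [1203.4, 2310.9, 3450.2, 4788.6]
candidates = {
    "protein_A": [1202.1, 2311.5, 3449.0, 4790.0, 5100.3],
    "protein_B": [1500.0, 2312.0, 3900.7, 4600.2],
}
ranking = rank_candidates(observed, candidates)
print(ranking)
```

A fixed tolerance treats all errors alike; the fragment ion error analysis described in the text goes further by checking whether the signed errors of the matches are systematic (as with calibration drift) or random.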
17.
T346Hunter (Type Three, Four and Six secretion system Hunter) is a web-based tool for the identification and localisation of type III, type IV and type VI secretion systems (T3SS, T4SS and T6SS, respectively) clusters in bacterial genomes. Non-flagellar T3SS (NF-T3SS) and T6SS are complex molecular machines that deliver effector proteins from bacterial cells into the environment or into other eukaryotic or prokaryotic cells, with significant implications for pathogenesis of the strains encoding them. Meanwhile, T4SS is a more functionally diverse system, which is involved in not only effector translocation but also conjugation and DNA uptake/release. Development of control strategies against bacterial-mediated diseases requires genomic identification of the virulence arsenal of pathogenic bacteria, with T3SS, T4SS and T6SS being major determinants in this regard. Therefore, computational methods for systematic identification of these specialised machines are of particular interest. With the aim of facilitating this task, T346Hunter provides a user-friendly web-based tool for the prediction of T3SS, T4SS and T6SS clusters in newly sequenced bacterial genomes. After inspection of the available scientific literature, we constructed a database of hidden Markov model (HMM) protein profiles and sequences representing the various components of T3SS, T4SS and T6SS. T346Hunter performs searches of such a database against user-supplied bacterial sequences and localises enriched regions in any of these three types of secretion systems. Moreover, through the T346Hunter server, users can visualise the predicted clusters obtained for approximately 1700 bacterial chromosomes and plasmids. T346Hunter offers great help to researchers in advancing their understanding of the biological mechanisms in which these sophisticated molecular machines are involved. T346Hunter is freely available at http://bacterial-virulence-factors.cbgp.upm.es/T346Hunter.
18.
19.
A web-based resource, Microbial Community Analysis (MiCA), has been developed to facilitate studies on microbial community ecology that use analyses of terminal-restriction fragment length polymorphisms (T-RFLP) of 16S and 18S rRNA genes. MiCA provides an intuitive web interface to access two specialized programs and a specially formatted database of 16S ribosomal RNA sequences. The first program performs virtual polymerase chain reaction (PCR) amplification of rRNA genes and restriction of the amplicons using primer sequences and restriction enzymes chosen by the user. This program, in silico PCR and Restriction (ISPaR), uses a binary encoding of DNA sequences to rapidly scan large numbers of sequences in databases searching for primer annealing and restriction sites while permitting the user to specify the number of mismatches in primer sequences. ISPaR supports multiple digests with up to three enzymes. The number of base pairs between the 5′ and 3′ primers and the proximal restriction sites can be reported, printed, or exported in various formats. The second program, APLAUS, infers a plausible community structure(s) based on T-RFLP data supplied by a user. APLAUS estimates the relative abundances of populations and reports a listing of phylotypes that are consistent with the empirical data. MiCA is accessible at .
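The in silico PCR and restriction step can be sketched in Python as follows. This is a simplified illustration, not ISPaR itself: it uses plain string comparison instead of the binary DNA encoding the abstract mentions, handles a single enzyme, and runs on a made-up template and made-up primers. The recognition sequence CCGG with the cut after the first base (C^CGG) corresponds to MspI; all function and parameter names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_site(template, primer, max_mismatches, start=0):
    """Index of the first window matching primer with <= max_mismatches."""
    for i in range(start, len(template) - len(primer) + 1):
        window = template[i:i + len(primer)]
        if sum(a != b for a, b in zip(window, primer)) <= max_mismatches:
            return i
    return -1

def terminal_fragments(template, fwd, rev, site, cut_offset, max_mismatches=0):
    """Amplify in silico, then return (5' fragment, 3' fragment) lengths in
    bp measured on the top strand, or None if no amplicon or no site."""
    start = find_site(template, fwd, max_mismatches)
    if start < 0:
        return None
    # The reverse primer anneals where its reverse complement appears
    end = find_site(template, revcomp(rev), max_mismatches, start + len(fwd))
    if end < 0:
        return None
    amplicon = template[start:end + len(rev)]
    first, last = amplicon.find(site), amplicon.rfind(site)
    if first < 0:
        return None
    return first + cut_offset, len(amplicon) - (last + cut_offset)

# Made-up template with two CCGG sites between the primer binding sites
template = "TTAGAGTTTGATCATGGCCCGGTACGTACCGGAATAAGTCGTAGG"
fragments = terminal_fragments(
    template, fwd="AGAGTTTC", rev="TACGACTT",  # fwd carries one mismatch
    site="CCGG", cut_offset=1, max_mismatches=1,
)
print(fragments)
```

The two returned lengths correspond to the distances between each primer end and its proximal restriction site, which is what a T-RFLP experiment with labeled primers would measure.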
20.
《Fly》2013,7(5):279-281
Microsatellites show tremendous variation between genomes in terms of their occurrence and composition. Availability of whole genome sequences allows us to study microsatellite characteristics of fully sequenced insect genomes to understand the evolution and biological significance of microsatellites. InSatDb is an insect microsatellite database that provides an interactive interface to query information on microsatellites annotated with size (in base pairs and repeat units); genomic location (exon, intron, up-stream or transposon); nature (perfect or imperfect); and sequence composition (repeat motif and GC%). Here, we present a snapshot of the distribution and composition of microsatellites in introns and exons of insect genomes. The data present interesting observations regarding the microsatellite life-cycle and genome flux.
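The annotations listed above (size in base pairs and repeat units, repeat motif, GC%) can be illustrated with a small Python sketch that scans a sequence for perfect microsatellites. It is not the pipeline behind InSatDb: the regular-expression approach, the motif length range of 2 to 6 bp and the minimum of three repeat units are assumptions chosen for the example, and imperfect repeats are not handled.

```python
import re

def find_perfect_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=3):
    """Find tandem repeats of a short motif with no interruptions."""
    # Lazy motif group so the shortest repeating unit is reported,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif, tract = m.group(1), m.group(0)
        hits.append({
            "start": m.start(),
            "motif": motif,
            "repeat_units": len(tract) // len(motif),
            "length_bp": len(tract),
            "gc_percent": 100.0 * sum(c in "GC" for c in tract) / len(tract),
        })
    return hits

# Made-up sequence containing (CA)4 and (AGG)3
hits = find_perfect_microsatellites("TTCACACACATTAGGAGGAGGTC")
for hit in hits:
    print(hit)
```

With these settings a homopolymer run of six or more bases would be reported as a dinucleotide repeat, so a production tool would need separate handling for mononucleotide tracts.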