Similar literature
 Found 20 similar articles; search took 31 ms
1.
A widely used algorithm for computing an optimal local alignment between two sequences requires a parameter set with a substitution matrix and gap penalties. It is recognized that a proper parameter set should be selected to suit the level of conservation between sequences. We describe an algorithm for selecting an appropriate substitution matrix at given gap penalties for computing an optimal local alignment between two sequences. In the algorithm, a substitution matrix that leads to the maximum alignment similarity score is selected among substitution matrices at various evolutionary distances. The evolutionary distance of the selected substitution matrix is defined as the distance of the computed alignment. To show the effects of gap penalties on alignments and their distances and help select appropriate gap penalties, alignments and their distances are computed at various gap penalties. The algorithm has been implemented as a computer program named SimDist. The SimDist program was compared with an existing local alignment program named SIM for finding reciprocally best-matching pairs (RBPs) of sequences in each of 100 protein families, where RBPs are commonly used as an operational definition of orthologous sequences. SimDist produced more accurate results than SIM on 50 of the 100 families, whereas both programs produced the same results on the other 50 families. SimDist was also used to compare three types of substitution matrices in scoring 444,461 pairs of homologous sequences from the 100 families.
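The selection step described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the SimDist implementation: the three match/mismatch "matrices" and their nominal distances are hypothetical toy values for DNA, the gap penalty is linear rather than whatever scheme SimDist uses, and the matrices are assumed to be on a comparable scale so that raw scores can be compared.

```python
from typing import Callable, Dict, Tuple

Scorer = Callable[[str, str], int]


def local_alignment_score(a: str, b: str, score: Scorer, gap: int) -> int:
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            curr[j] = max(
                0,                                        # start a new local alignment
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align two residues
                prev[j] - gap,                            # gap in b
                curr[j - 1] - gap,                        # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def make_matrix(match: int, mismatch: int) -> Scorer:
    """Toy substitution 'matrix': one score for identities, one for mismatches."""
    return lambda x, y: match if x == y else mismatch


# Hypothetical matrices keyed by a nominal evolutionary distance.
MATRICES: Dict[int, Scorer] = {
    10: make_matrix(6, -9),
    50: make_matrix(5, -4),
    120: make_matrix(3, -1),
}


def select_matrix(a: str, b: str, gap: int) -> Tuple[int, int]:
    """Return (distance, score) of the matrix giving the highest local score."""
    scores = {d: local_alignment_score(a, b, m, gap) for d, m in MATRICES.items()}
    distance = max(scores, key=scores.get)
    return distance, scores[distance]
```

Calling `select_matrix(seq1, seq2, gap=8)` returns the nominal distance of the winning matrix, which under the abstract's definition would be reported as the distance of the alignment. Repeating the call over several `gap` values mirrors the abstract's suggestion of examining alignments at various gap penalties.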

2.
We have created databases and software applications for the analysis of DNA mutations at the human p53 gene, the human hprt gene and both the rodent transgenic lacI and lacZ loci. The databases themselves are stand-alone dBASE files and the software for analysis of the databases runs on IBM-compatible computers. Each database has a separate software analysis program. The software created for these databases permits the filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web (WWW). Open the following home page with a Web browser: http://sunsite.unc.edu/dnam/mainpage.html. Alternatively, the databases and programs are available via public FTP from: anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found beneath the subdirectory: pub/academic/biology/dna-mutations. Two other programs are available at the site: a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.

3.
We have created databases and software applications for the analysis of DNA mutations at the human p53 gene, the human hprt gene and both the rodent transgenic lacI and lacZ loci. The databases themselves are stand-alone dBASE files and the software for analysis of the databases runs on IBM-compatible computers with Microsoft Windows. Each database has a separate software analysis program. The software created for these databases permits the filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web. Open the following home page with a Web browser: http://sunsite.unc.edu/dnam/mainpage.html. Alternatively, the databases and programs are available via public FTP from: anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found beneath the subdirectory: pub/academic/biology/dna-mutations. Two other programs are available at the site: a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.

4.

Background

While the theory of enzyme kinetics is fundamental to analyzing and simulating biochemical systems, the derivation of rate equations for complex mechanisms of enzyme-catalyzed reactions is cumbersome and error prone. Therefore, a number of algorithms and related computer programs have been developed to assist in such derivations. Yet although a number of algorithms, programs, and software packages are reported in the literature, one or more significant limitations are associated with each of these tools. Furthermore, none is freely available for download and use by the community.

Results

We have implemented an algorithm based on the schematic method of King and Altman (KA) that employs the topological theory of linear graphs for systematic generation of valid reaction patterns in a GUI-based stand-alone computer program called KAPattern. The underlying algorithm allows for the assumption of steady-state, rapid equilibrium-binding, and/or irreversibility for individual steps in catalytic mechanisms. The program can automatically generate MathML and MATLAB output files that users can easily incorporate into simulation programs.
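As a rough illustration of the King-Altman idea (not KAPattern's own code, which the abstract does not show), the steady-state weight of each enzyme form can be computed numerically by enumerating the spanning trees directed toward that form and summing the products of their rate constants. The state names and rate values used with this sketch are hypothetical, and parallel steps between the same two forms are assumed to have been summed into a single pseudo-first-order rate beforehand.

```python
from itertools import product
from typing import Dict, List, Tuple

# (from_state, to_state) -> pseudo-first-order rate, e.g. k1 * [S]
Rates = Dict[Tuple[str, str], float]


def reaches(start: str, root: str, nxt: Dict[str, str]) -> bool:
    """Follow the chosen outgoing edges from start; True if root is reached without a cycle."""
    seen = set()
    node = start
    while node != root:
        if node in seen:
            return False
        seen.add(node)
        node = nxt[node]
    return True


def ka_weights(states: List[str], rates: Rates) -> Dict[str, float]:
    """King-Altman weight of each form: sum over spanning trees directed toward it."""
    weights = {}
    for root in states:
        others = [s for s in states if s != root]
        # Every non-root form contributes exactly one outgoing step to a pattern.
        choices = [[(u, t) for (u, t) in rates if u == s] for s in others]
        total = 0.0
        for pick in product(*choices):
            nxt = dict(pick)
            if all(reaches(s, root, nxt) for s in others):  # keep acyclic patterns only
                term = 1.0
                for edge in pick:
                    term *= rates[edge]
                total += term
        weights[root] = total
    return weights


def fractions(states: List[str], rates: Rates) -> Dict[str, float]:
    """Steady-state fraction of total enzyme in each form."""
    w = ka_weights(states, rates)
    denom = sum(w.values())
    return {s: w[s] / denom for s in states}
```

For the two-form mechanism E ⇌ ES → E + P, passing `{("E", "ES"): k1 * S, ("ES", "E"): k_minus1 + k2}` and multiplying the ES fraction by `k2` reproduces the Michaelis-Menten rate per unit enzyme. A symbolic tool such as the one described would emit the rate expression itself instead of numbers.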

Conclusion

A computer program called KAPattern, for generating rate equations for complex enzyme systems, is freely available and can be accessed at http://www.biocoda.org.

5.
MOTIVATION: The annotation of the Arabidopsis thaliana genome remains a problem in terms of time and quality. To improve the annotation process, we want to choose the most appropriate tools to use inside a computer-assisted annotation platform. We therefore need evaluation of prediction programs with Arabidopsis sequences containing multiple genes. RESULTS: We have developed AraSet, a data set of contigs of validated genes, enabling the evaluation of multi-gene models for the Arabidopsis genome. Besides conventional metrics to evaluate gene prediction at the site and the exon levels, new measures were introduced for the prediction at the protein sequence level as well as for the evaluation of gene models. This evaluation method is of general interest and could apply to any new gene prediction software and to any eukaryotic genome. The GeneMark.hmm program appears to be the most accurate software at all three levels for the Arabidopsis genomic sequences. Gene modeling could be further improved by combination of prediction software. AVAILABILITY: The AraSet sequence set, the Perl programs and complementary results and notes are available at http://sphinx.rug.ac.be:8080/biocomp/napav/. CONTACT: Pierre.Rouze@gengenp.rug.ac.be.

6.
We have created databases and software applications for the analysis of DNA mutations in the human p53 gene, the human hprt gene and the rodent transgenic lacZ locus. The databases themselves are stand-alone dBase files and the software for analysis of the databases runs on IBM-compatible computers. The software created for these databases permits filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web (WWW). Open home page http://sunsite.unc.edu/dnam/mainpage.html with a WWW browser. Alternatively, the databases and programs are available via public ftp from anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found in subdirectory pub/academic/biology/dna-mutations. Two other programs are available at the WWW site: a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.

7.
The current status and portability of our sequence handling software.   Total citations: 94 (self-citations: 15; citations by others: 79)
I describe the current status of our sequence analysis software. The package contains a comprehensive suite of programs for managing large shotgun sequencing projects, a program containing 61 functions for analysing single sequences and a program for comparing pairs of sequences for similarity. The programs that have been described before have been improved by the addition of new functions and by being made very much easier to use. The major interactive programs have 125 pages of online help available from within them. Several new programs are described, including screen editing of aligned gel readings for shotgun sequencing projects, a method to highlight errors in aligned gel readings, and new methods for searching for putative signals in sequences. We use the programs on a VAX computer but the whole package has been rewritten to make it easy to transport it to other machines. I believe the programs will now run on any machine with a FORTRAN77 compiler and sufficient memory. We are currently putting the programs onto an IBM PC XT/AT and another micro running under UNIX.

8.
We developed novel programs for displaying and analyzing the transmembrane alpha-helical segments (TMSs) in the aligned sequences of homologous integral membrane proteins. TMS_ALIGN predicts the positions of putative TMSs in multiply aligned protein sequences and graphically shows the TMSs in the alignment. TMS_SPLIT (1) predicts the positions of TMSs for each sequence, (2) allows a user to select proteins with a specified number of TMSs, and (3) splits the sequences into groups of TMSs of equal numbers. TMS_CUT works like TMS_SPLIT, but it can cut sequences with any combination of TMSs. The BASS program similarly allows comparison of protein repeat elements, equivalent to TMS_SPLIT plus IC, but it provides the comparison data expressed in BLAST E values. These programs, together with the IntraCompare program, facilitate the identification of repeat sequences in integral membrane proteins. They also facilitate the estimation of protein topology and the determination of evolutionary pathways.

9.
Each amino acid in a protein is considered to be an individual, mutable characteristic of the species from which the protein is extracted. For a branching tree representing the evolutionary history of the known sequences in different species, our computer programs use majority logic and parsimony of mutations to determine the most likely ancestral amino acid for each position of the protein at each node of the tree. The number of mutations necessary between the ancestral and present species is summed for each branch and the entire tree. The programs then move branches to make many different configurations, from which we select the one with the minimum number of mutations as the most likely evolutionary history. We used this method to elucidate primate phylogeny from sequences of fibrinopeptides, carbonic anhydrase, and the hemoglobin beta, delta and alpha chains. All available sequences indicate that the early Pongidae had diverged into two lines before the divergence of an ancestor for the human line alone. We have constructed some probable ancestral sequences at major points during primate evolution and have developed tentative trees showing the order of divergences and evolutionary distances among primate groups. Further questions on primate evolution could be answered in the future by the determination of the appropriate sequences.
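The counting of mutations on a fixed tree, and the choice of the tree with the fewest, can be illustrated with Fitch's small-parsimony algorithm. That algorithm is a standard technique swapped in here for illustration; it is not necessarily the majority-logic procedure the authors used. The taxa, the two-residue sequences and the candidate trees used with this sketch are all hypothetical.

```python
from typing import Dict, List, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, residues: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate ancestral residues, minimum mutations) for one alignment column."""
    if isinstance(tree, str):
        return {residues[tree]}, 0
    left_set, left_cost = fitch(tree[0], residues)
    right_set, right_cost = fitch(tree[1], residues)
    common = left_set & right_set
    if common:
        # The children agree on at least one residue: no extra mutation needed here.
        return common, left_cost + right_cost
    # The children disagree: keep every option and count one mutation.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum the minimum mutation counts over all alignment columns."""
    n_cols = len(next(iter(alignment.values())))
    total = 0
    for col in range(n_cols):
        column = {name: seq[col] for name, seq in alignment.items()}
        total += fitch(tree, column)[1]
    return total


def most_parsimonious(trees: List[Tree], alignment: Dict[str, str]) -> Tree:
    """Pick the candidate topology requiring the fewest mutations."""
    return min(trees, key=lambda t: tree_length(t, alignment))
```

A real search would generate the candidate topologies by rearranging branches, as the abstract describes; this sketch only scores a list of trees supplied by the caller.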

10.
Many different programs are available to analyze microarray images. Most programs are commercial packages; some are free. In the latter group only a few offer automatic grid alignment and batch mode. More often than not a program implements only one quantification algorithm. AGScan is an open source program that works on all major platforms. It is based on the ImageJ library [Rasband (1997-2006)] and offers a plug-in extension system to add new functions to manipulate images, align grids and quantify spots. It is appropriate for daily laboratory use and also as a framework for new algorithms. AVAILABILITY: The program is freely distributed under the X11 Licence. The install instructions can be found in the user manual. The software can be downloaded from http://mulcyber.toulouse.inra.fr/projects/agscan/. Questions and plug-ins can be sent to the contact listed below.

11.
The increase in computer power and the development of new mathematical concepts implemented in software have allowed computational chemistry to emerge as a new research field. Although programs were freely distributed during the "golden age" of this discipline, today they are usually copyrighted and have become easier and easier to use through sophisticated graphical interfaces. This "democratization" is a vector of success for this discipline. Nowadays, non-theoreticians can use such programs more easily and solve chemistry-related problems with the computer. The number of program offerings has rapidly grown and private companies specialized in molecular modeling have appeared and compete to sell their products. Thus, numerous software packages, often presenting similar capabilities, are now available on the market. Within this context, the availability of the program source code remains, in our opinion, an important criterion for program selection.

12.
High throughput macromolecular structure determination is essential in structural genomics, as the amount of available sequence information far exceeds the number of available 3D structures. ACORN, a freely available resource in the CCP4 suite of programs, is a comprehensive and efficient program for phasing in the determination of protein structures when atomic resolution data are available. ACORN, together with the automatic model-building program ARP/wARP and the refinement program REFMAC, is a suitable combination for high throughput structural genomics. ACORN can also be run with secondary structural elements such as helices and sheets as inputs with high resolution data. In situations where ACORN phasing is not sufficient for building the protein model, the fragments (incomplete model/dummy atoms) can again be used as a starting input. Iterative ACORN has proved to work efficiently in the subsequent model building stages in congerin (PDB-ID: 1is3) and catalase (PDB-ID: 1gwe), for which models are available.

13.
We present an approach for analyzing internal dependencies in counting processes. This covers the case with repeated events on each of a number of individuals, and more generally, the situation where several processes are observed for each individual. We define dynamic covariates, i.e., covariates depending on the past of the processes. The statistical analysis is performed mainly by the nonparametric additive approach. This yields a method for analyzing multivariate survival data, which is an alternative to the frailty approach. We present cumulative regression plots, statistical tests, residual plots, and a hat matrix plot for studying outliers. A program in R and S-PLUS for analyzing survival data with the additive regression model is available on the web site http://www.med.uio.no/imb/stat/addreg. The program has been developed to fit the counting process framework.

14.
15.
We describe the use of image software programs available for both PC and Macintosh computers to quantify the accumulation and distribution of gold-labeled constructs within two-dimensional cell sections. The compartmentalization of a biotinylated peptide was visualized in radiation-induced fibrosarcoma cells by transmission electron microscopy, using a gold particle-streptavidin conjugate. This study illustrates the ease of tabulating gold particles observed in scanned electron micrographs, using Adobe Photoshop in conjunction with the public domain NIH Image program (Version 1.61). Quantitative information regarding the localization of molecules inside cells is crucial in defining their sites of action and in developing more effective therapeutic agents.

16.
Invasive species are a cause for concern in natural and economic systems and require both monitoring and management. There is a trade‐off between the amount of resources spent on surveying for the species and conducting early management of occupied sites, and the resources that are ultimately spent in delayed management at sites where the species was present but undetected. Previous work addressed this optimal resource allocation problem assuming that surveys continue despite detection until the initially planned survey effort is consumed. However, a more realistic scenario is often that surveys stop after detection (i.e., follow a “removal” sampling design) and then management begins. Such an approach will indicate a different optimal survey design and can be expected to be more efficient. We analyze this case and compare the expected efficiency of invasive species management programs under both survey methods. We also evaluate the impact of mis‐specifying the type of sampling approach during the program design phase. We derive analytical expressions that optimize resource allocation between monitoring and management in surveillance programs when surveys stop after detection. We do this under a scenario of unconstrained resources and scenarios where survey budget is constrained. The efficiency of surveillance programs is greater if a “removal survey” design is used, with larger gains obtained when savings from early detection are high, occupancy is high, and survey costs are not much lower than early management costs at a site. Designing a surveillance program disregarding that surveys stop after detection can result in an efficiency loss. Our results help guide the design of future surveillance programs for invasive species. Addressing program design within a decision‐theoretic framework can lead to a better use of available resources. We show how species prevalence, its detectability, and the benefits derived from early detection can be considered.

17.
Conserved segments in DNA or protein sequences are strong candidates for functional elements and thus appropriate methods for computing them need to be developed and compared. We describe five methods and computer programs for finding highly conserved blocks within previously computed multiple alignments, primarily for DNA sequences. Two of the methods are already in common use; these are based on good column agreement and high information content. Three additional methods find blocks with minimal evolutionary change, blocks that differ in at most k positions per row from a known center sequence and blocks that differ in at most k positions per row from a center sequence that is unknown a priori. The center sequence in the latter two methods is a way to model potential binding sites for known or unknown proteins in DNA sequences. The efficacy of each method was evaluated by analysis of three extensively analyzed regulatory regions in mammalian beta-globin gene clusters and the control region of bacterial arabinose operons. Although all five methods have quite different theoretical underpinnings, they produce rather similar results on these data sets when their parameters are adjusted to best approximate the experimental data. The optimal parameters for the method based on information content varied little for different regulatory regions of the beta-globin gene cluster and hence may be extrapolated to many other regulatory regions. The programs based on maximum allowed mismatches per row have simple parameters whose values can be chosen a priori and thus they may be more useful than the other methods when calibration against known functional sites is not available.
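The known-center variant mentioned in this abstract lends itself to a short sketch: slide a window the width of the center sequence across the alignment and keep the start columns where every row is within k mismatches of the center. This is a minimal reading of the abstract's one-line description, not the authors' program; treating gap characters as ordinary symbols and the toy sequences used with it are assumptions.

```python
from typing import List


def mismatches(segment: str, center: str) -> int:
    """Number of positions at which segment differs from the center sequence."""
    return sum(x != y for x, y in zip(segment, center))


def find_blocks(alignment: List[str], center: str, k: int) -> List[int]:
    """Start columns of blocks where every row differs from center in at most k positions.

    All rows are assumed to have equal length; gap characters such as '-'
    are compared like any other symbol (an assumption of this sketch).
    """
    width = len(center)
    n_cols = len(alignment[0])
    starts = []
    for start in range(n_cols - width + 1):
        if all(mismatches(row[start:start + width], center) <= k for row in alignment):
            starts.append(start)
    return starts
```

Because `k` and the center are the only parameters, they can be fixed in advance, which matches the abstract's point that these methods need no calibration against known functional sites.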

18.
19.
MOTIVATION: Many evolutionarily distant, but functionally meaningful links between proteins come to light through comparison of spatial structures. Most programs that assess structural similarity compare two proteins to each other and find regions in common between them. Structural classification experts look for a particular structural motif instead. Programs base similarity scores on superposition or closeness of either Cartesian coordinates or inter-residue contacts. Experts pay more attention to the general orientation of the main chain and mutual spatial arrangement of secondary structural elements. There is a need for a computational tool to find proteins with the same secondary structures, topological connections and spatial architecture, regardless of subtle differences in 3D coordinates. RESULTS: We developed ProSMoS, a Protein Structure Motif Search program that emulates an expert. Starting from a spatial structure, the program uses previously delineated secondary structural elements. A meta-matrix of interactions between the elements (parallel or antiparallel), taking into account the handedness of connections (left or right) and other features (e.g. element lengths and hydrogen bonds), is constructed prior to or during the searches. All structures are reduced to such meta-matrices that contain just enough information to define a protein fold, but this definition remains very general and deviations in 3D coordinates are tolerated. The user supplies a meta-matrix for a structural motif of interest, and ProSMoS finds all proteins in the protein data bank (PDB) that match the meta-matrix. ProSMoS performance is compared to other programs and is illustrated on a beta-Grasp motif. A brief analysis of all beta-Grasp-containing proteins is presented. Program availability: ProSMoS is freely available for non-commercial use from ftp://iole.swmed.edu/pub/ProSMoS.

20.
MOTIVATION: With hundreds of completely sequenced microbial genomes available, and advancements in DNA microarray technology, the detection of genes in microbial communities consisting of hundreds of thousands of sequences may be possible. The existing strategies developed for DNA probe design, geared toward identifying specific sequences, are not suitable due to the lack of coverage, flexibility and efficiency necessary for applications in metagenomics. METHODS: ProDesign is a tool developed for the selection of oligonucleotide probes to detect members of gene families present in environmental samples. Gene family-specific probe sequences are generated based on specific and shared words, which are found with the spaced seed hashing algorithm. To detect more sequences, those sharing some common words are re-clustered into new families, then probes specific for the new families are generated. RESULTS: The program is very flexible in that it can be used for designing probes for detecting many gene families simultaneously and specifically in one or more genomes. Neither the length nor the melting temperature of the probes needs to be predefined. We have found that ProDesign provides more flexibility, coverage and speed than other software programs used in the selection of probes for genomic and gene family arrays. AVAILABILITY: ProDesign is licensed free of charge to academic users. ProDesign and Supplementary Material can be obtained by contacting the authors. A web server for ProDesign is available at http://www.uhnresearch.ca/labs/tillier/ProDesign/ProDesign.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
