Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
GeMS: an advanced software package for designing synthetic genes   Total citations: 3, self-citations: 0, citations by others: 3
A user-friendly, advanced software package for gene design is described. The software comprises an integrated suite of programs (also provided as stand-alone tools) that automatically performs the following tasks in gene design: restriction site prediction, codon optimization for any expression host, restriction site inclusion and exclusion, separation of long sequences into synthesizable fragments, Tm and stem–loop determinations, optimal oligonucleotide component design and design verification/error-checking. The output is a complete design report and a list of optimized oligonucleotides to be prepared for subsequent gene synthesis. The user interface accommodates both inexperienced and experienced users. For inexperienced users, explanatory notes are provided such that detailed instructions are not necessary; for experienced users, a streamlined interface is provided without such notes. The software has been extensively tested in the design and successful synthesis of over 400 kb of genes, many of which exceeded 5 kb in length.
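Two of the tasks listed in this abstract, Tm determination and the separation of long sequences into synthesizable fragments, can be illustrated with a minimal sketch. The abstract does not say how GeMS computes either, so the Wallace rule (a rough estimate for short oligonucleotides) and the fixed-overlap splitting below are assumptions chosen for illustration, and the function names are hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this is a textbook approximation for short oligos,
    not the method used by GeMS.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def split_into_fragments(seq: str, max_len: int, overlap: int) -> list[str]:
    """Cut seq into pieces of at most max_len that share `overlap` bases.

    Hypothetical helper: consecutive fragments overlap so that they
    could later be joined; real designs also weigh Tm and secondary structure.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(seq[start:start + max_len])
        if start + max_len >= len(seq):
            break
        start += step
    return fragments


if __name__ == "__main__":
    print(wallace_tm("ATGCATGC"))
    print(split_into_fragments("ABCDEFGHIJ", max_len=4, overlap=1))
```

A production tool would more likely use a nearest-neighbour thermodynamic model for Tm; the Wallace rule is used here only because it fits in one line.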

2.
With the development of high-throughput experimental techniques such as microarray, mass spectrometry and large-scale mutagenesis, there is an increasing need to automatically annotate gene sets and identify the involved pathways. Although many pathway analysis tools have been developed, new tools are still needed to meet the requirements of flexible or advanced analyses. Here, we developed an R-based software package (SubpathwayMiner) for flexible pathway identification. SubpathwayMiner facilitates sub-pathway identification of metabolic pathways by using pathway structure information. Additionally, SubpathwayMiner provides more flexibility in annotating gene sets and identifying the involved pathways (entire pathways and sub-pathways): (i) SubpathwayMiner is able to provide the most up-to-date pathway analysis results for users; (ii) SubpathwayMiner supports multiple species (∼100 eukaryotes, 714 bacteria and 52 Archaea) and different gene identifiers (Entrez Gene IDs, NCBI-gi IDs, UniProt IDs, PDB IDs, etc.) in the KEGG GENE database; (iii) the system is quite efficient in cooperating with other R-based tools in biology. SubpathwayMiner is freely available at http://cran.r-project.org/web/packages/SubpathwayMiner/.

3.
Increasingly, data on shape are analysed in combination with molecular genetic or ecological information, so that tools for geometric morphometric analysis are required. Morphometric studies most often use the arrangements of morphological landmarks as the data source and extract shape information from them by Procrustes superimposition. The MorphoJ software combines this approach with a wide range of methods for shape analysis in different biological contexts. The program offers an integrated and user-friendly environment for standard multivariate analyses such as principal components, discriminant analysis and multivariate regression as well as specialized applications including phylogenetics, quantitative genetics and analyses of modularity in shape data. MorphoJ is written in Java and versions for the Windows, Macintosh and Unix/Linux platforms are freely available from http://www.flywings.org.uk/MorphoJ_page.htm.

4.
ModEco: an integrated software package for ecological niche modeling   Total citations: 2, self-citations: 0, citations by others: 2
Qinghua Guo, Yu Liu. Ecography, 2010, 33(4): 637-642
ModEco is a software package for ecological niche modeling. It integrates a range of niche modeling methods within a geographical information system. ModEco provides a user-friendly platform that enables users to explore, analyze, and model species distribution data with relative ease. ModEco has several unique features: 1) it deals with different types of ecological observation data, such as presence and absence data, presence‐only data, and abundance data; 2) it provides a range of models when dealing with presence‐only data, such as presence‐only models, pseudo‐absence models, background vs presence data models, and ensemble models; and 3) it includes relatively comprehensive tools for data visualization, feature selection, and accuracy assessment.

5.
We present a new software package (hzar) that provides functions for fitting molecular genetic and morphological data from hybrid zones to classic equilibrium cline models using the Metropolis–Hastings Markov chain Monte Carlo (MCMC) algorithm. The software applies likelihood functions appropriate for different types of data, including diploid and haploid genetic markers and quantitative morphological traits. The modular design allows flexibility in fitting cline models of varying complexity. To facilitate hypothesis testing, an autofit function is included that allows automated model selection from a set of nested cline models. Cline parameter values, such as cline centre and cline width, are estimated and may be compared statistically across clines. The package is written in the R language and is available through the Comprehensive R Archive Network (CRAN; http://cran.r-project.org/). Here, we describe hzar and demonstrate its use with a sample data set from a well‐studied hybrid zone in western Panama between white‐collared (Manacus candei) and golden‐collared manakins (M. vitellinus). Comparisons of our results with previously published results for this hybrid zone validate the hzar software. We extend analysis of this hybrid zone by fitting additional models to molecular data where appropriate.
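The idea of fitting a cline by Metropolis–Hastings sampling can be sketched in a few lines. The following is not the hzar API (hzar is an R package and its functions are not reproduced here); it is a Python illustration that assumes the simplest sigmoid cline with only a centre and a width, a binomial likelihood for allele counts, flat priors, and a Gaussian random-walk proposal. All function names are hypothetical.

```python
import math
import random


def cline_frequency(x, center, width):
    """Sigmoid cline: expected allele frequency at position x.

    width is 1 / (maximum slope), so the slope at the centre is 1 / width.
    """
    z = 4.0 * (x - center) / width
    if z >= 0:  # numerically stable logistic function
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(data, center, width):
    """Binomial log-likelihood; data is a list of (position, k_alleles, n_sampled)."""
    total = 0.0
    for x, k, n in data:
        p = min(max(cline_frequency(x, center, width), 1e-12), 1.0 - 1e-12)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


def metropolis_hastings(data, n_steps, start=(0.0, 10.0), step=(1.0, 1.0), seed=0):
    """Random-walk Metropolis sampler over (center, width) with flat priors, width > 0."""
    rng = random.Random(seed)
    center, width = start
    current = log_likelihood(data, center, width)
    chain = []
    for _ in range(n_steps):
        new_center = center + rng.gauss(0.0, step[0])
        new_width = width + rng.gauss(0.0, step[1])
        if new_width > 0:  # proposals outside the prior support are rejected
            proposed = log_likelihood(data, new_center, new_width)
            # Symmetric proposal: accept with probability min(1, likelihood ratio).
            if rng.random() < math.exp(min(0.0, proposed - current)):
                center, width, current = new_center, new_width, proposed
        chain.append((center, width))
    return chain


if __name__ == "__main__":
    # Synthetic transect (made-up numbers): true centre 5, true width 4.
    data = [(x, round(100 * cline_frequency(x, 5.0, 4.0)), 100)
            for x in (0, 2, 4, 5, 6, 8, 10)]
    chain = metropolis_hastings(data, n_steps=5000)
    kept = chain[1000:]  # discard burn-in
    print(sum(c for c, _ in kept) / len(kept), sum(w for _, w in kept) / len(kept))
```

Comparing nested models, as the autofit function described above does, would additionally require an information criterion or similar score per fitted model, which this sketch omits.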

6.
7.
We compared trypsin-digested protein samples desalted by ZipTip(C18) reverse-phase microcolumns with on-plate washing of peptides deposited either on paraffin-coated plates (PCP), Teflon-based AnchorChip plates, or stainless steel plates, before analysis by matrix-assisted laser desorption/ionization-time of flight-mass spectrometry (MALDI-TOF-MS). Trypsinized bovine serum albumin and ovalbumin and 16 protein spots extracted from silver-stained two-dimensional gels of murine C2C12 myoblasts or human leukocytes, prepared by the above two methods, were subjected to MALDI on PCP, AnchorChip plates, or uncoated stainless steel plates. Although most peptide mass peaks were identical regardless of the method of desalting and concentrating of protein samples, samples washed and concentrated by the PCP-based method had peptide peaks that were not seen in the samples prepared using the ZipTip(C18) columns. The mass spectra of peptides desalted and washed on uncoated stainless steel MALDI plates were consistently inferior due to loss of peptides. Some peptides of large molecular masses were apparently lost from samples desalted by ZipTip(C18) microcolumns, thus diminishing the quality of the fingerprint needed for protein identification. We demonstrate that the method of washing of protein samples on paraffin-coated plates provides an easy, reproducible, inexpensive, and high-throughput alternative to ZipTip(C18)-based purification of protein prior to MALDI-TOF-MS analysis.

8.
9.
10.
Algorithms and software for support of gene identification experiments   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: Gene annotation is the final goal of gene prediction algorithms. However, these algorithms frequently make mistakes and therefore the use of gene predictions for sequence annotation is hardly possible. As a result, biologists are forced to conduct time-consuming gene identification experiments by designing appropriate PCR primers to test cDNA libraries or applying RT-PCR, exon trapping/amplification, or other techniques. This process frequently amounts to 'guessing' PCR primers on top of unreliable gene predictions and frequently leads to wasted experimental effort. RESULTS: The present paper proposes a simple and reliable algorithm for experimental gene identification which bypasses the unreliable gene prediction step. Studies of the performance of the algorithm on a sample of human genes indicate that an experimental protocol based on the algorithm's predictions achieves an accurate gene identification with relatively few PCR primers. Predictions of PCR primers may be used for exon amplification in preliminary mutation analysis during an attempt to identify a gene responsible for a disease. We propose a simple approach to find a short region from a genomic sequence that with high probability overlaps with some exon of the gene. The algorithm is enhanced to find one or more segments that are probably contained in the translated region of the gene and can be used as PCR primers to select appropriate clones in cDNA libraries by selective amplification. The algorithm is further extended to locate a set of PCR primers that uniformly cover all translated regions and can be used for RT-PCR and further sequencing of (unknown) mRNA.

11.
The Protein Structure Discovery software and information system was developed. The system can be used for a wide range of tasks in the field of computational proteomics, including prediction of the function, structure and immunological properties of proteins. A specially created section of the system allows the quantitative and qualitative effects of mutations on the structural and functional properties of proteins to be evaluated. Nineteen different programs are integrated into the system, including the PDBSite database of protein functional sites, the PDBSiteScan program for the prediction of functional sites in three-dimensional structures of proteins, and the WebProAnalyst program for the quantitative analysis of the structure-activity relationship of proteins. Protein Structure Discovery has a Web interface and is available to users through the Internet (http://www-bionet.sscc.ru/psd/). For example, tests on zinc ion and ADP binding sites showed that the PDBSiteScan method is highly robust to errors in the reconstruction of protein spatial structures when recognizing functional sites in model structures.

12.

Background

Deviations in the amount of genomic content that arise during tumorigenesis, called copy number alterations, are structural rearrangements that can critically affect gene expression patterns. Additionally, copy number alteration profiles allow insight into cancer discrimination, progression and complexity. On data obtained from high-throughput sequencing, improving quality through GC bias correction and keeping false positives to a minimum help build reliable copy number alteration profiles.

Results

We introduce seqCNA, a parallelized R package for an integral copy number analysis of high-throughput sequencing cancer data. The package includes novel methodology on (i) filtering, reducing false positives, and (ii) GC content correction, improving copy number profile quality, especially under great read coverage and high correlation between GC content and copy number. Adequate analysis steps are automatically chosen based on availability of paired-end mapping, matched normal samples and genome annotation.
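To make the notion of GC content correction concrete, here is a deliberately simple sketch. It is not seqCNA's novel methodology, which the abstract does not describe in detail; it assumes the common baseline of binning genomic windows by GC fraction and rescaling each window's read count by the median count of its bin. The function name and the numbers in the example are hypothetical.

```python
from collections import defaultdict
from statistics import median


def gc_correct(counts, gc_fractions, n_bins=10):
    """Rescale per-window read counts so every GC bin has the same median.

    counts        read count per genomic window
    gc_fractions  GC fraction (0..1) of the same windows
    Assumption: a plain median-per-bin normalisation, used only as a
    baseline illustration of GC bias correction.
    """
    if len(counts) != len(gc_fractions):
        raise ValueError("counts and gc_fractions must have the same length")
    if not counts:
        return []
    # Assign each window to a GC bin; a GC fraction of 1.0 falls in the last bin.
    bins = [min(int(gc * n_bins), n_bins - 1) for gc in gc_fractions]
    by_bin = defaultdict(list)
    for b, c in zip(bins, counts):
        by_bin[b].append(c)
    bin_median = {b: median(values) for b, values in by_bin.items()}
    overall = median(counts)
    corrected = []
    for b, c in zip(bins, counts):
        m = bin_median[b]
        # Windows in a bin with zero median cannot be rescaled meaningfully.
        corrected.append(c * overall / m if m > 0 else float("nan"))
    return corrected


if __name__ == "__main__":
    # Made-up example: GC-rich windows are over-represented two-fold.
    print(gc_correct([100, 100, 200, 200], [0.35, 0.35, 0.65, 0.65]))
```

In tumour data, a true copy number change can itself correlate with GC content, which is the situation the abstract singles out; a per-bin median would then remove real signal, so this baseline should be read as a starting point only.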

Conclusions

seqCNA, available through Bioconductor, provides accurate copy number predictions in tumoural data, thanks to the extensive filtering and better GC bias correction, while providing an integrated and parallelized workflow.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-178) contains supplementary material, which is available to authorized users.

13.
14.
Rapid development, transparency and small size are the outstanding features of zebrafish that make it an increasingly important vertebrate system for developmental biology, functional genomics, disease modeling and drug discovery. Zebrafish has been regarded as an ideal animal species for studying the relationship between genotype and phenotype, for pathway analysis and systems biology. However, the tremendous amount of data generated from large numbers of embryos has led to a bottleneck in data analysis and modeling. The zebrafish image quantitator (ZFIQ) software provides streamlined data processing and analysis capability for developmental biology and disease modeling using the zebrafish model. AVAILABILITY: ZFIQ is available for download at http://www.cbi-platform.net.

15.
The aim of the ecospat package is to make available novel tools and methods to support spatial analyses and modeling of species niches and distributions in a coherent workflow. The package is written in the R language (R Development Core Team) and contains several features, unique in their implementation, that are complementary to other existing R packages. Pre‐modeling analyses include species niche quantifications and comparisons between distinct ranges or time periods, measures of phylogenetic diversity, and other data exploration functionalities (e.g. extrapolation detection, ExDet). Core modeling brings together the new approach of ensemble of small models (ESM) and various implementations of the spatially‐explicit modeling of species assemblages (SESAM) framework. Post‐modeling analyses include evaluation of species predictions based on presence‐only data (Boyce index) and of community predictions, phylogenetic diversity and environmentally‐constrained species co‐occurrence analyses. The ecospat package also provides some functions to supplement the ‘biomod2’ package (e.g. data preparation, permutation tests and cross‐validation of model predictive power). With this novel package, we intend to stimulate the use of comprehensive approaches in spatial modelling of species and community distributions.

16.
Suppression subtractive hybridization (SSH) is a widely used technique for the identification of differentially expressed genes. SSH as well as other types of sequencing projects generate large amounts of anonymous sequences. SSHSuite automates the handling and storage of these sequences and enables identification through similarity searches. SSHSuite also offers analysis tools for the retrieval and comparison of the resulting similarity data. SSHSuite consists of four programs: SSHHandler, SSHOverview, SSHAnalysis, and SSHCompare.

17.
The discovery of unanticipated protein modifications is one of the most challenging problems in proteomics. Whereas widely used algorithms such as Sequest and Mascot enable mapping of modifications when the mass and amino acid specificity are known, unexpected modifications cannot be identified with these tools. We have developed an algorithm and software called P-Mod, which enables discovery and sequence mapping of modifications to target proteins known to be represented in the analysis or identified by Sequest. P-Mod matches MS/MS spectra to peptide sequences in a search list. For spectra of modified peptides, P-Mod calculates mass differences between search peptide sequences and MS/MS precursors and localizes the mass shift to a sequence position in the peptide. Because modifications are detected as mass shifts, P-Mod does not require the user to guess at masses or sequence locations of modifications. P-Mod uses extreme value statistics to assign p value estimates to sequence-to-spectrum matches. The reported p values are scaled to account for the number of comparisons, so that error rates do not increase with the expanded search lists that result from incorporating potential peptide modifications. Combination of P-Mod searches from multiple LC-MS/MS analyses and multiple samples revealed previously unreported BSA modifications, including a novel decarboxymethylation or D→G substitution at position 579 of the protein. P-Mod can serve a unique role in the identification of protein modifications both from exogenous and endogenous sources and may be useful for identifying modified protein forms as biomarkers for toxicity and disease processes.
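The core quantity in this abstract, the mass difference between a search peptide and an observed precursor, is easy to compute. The sketch below is not the P-Mod algorithm: P-Mod localizes the shift from the MS/MS fragment ions and scores matches statistically, whereas this illustration only computes the shift from standard monoisotopic residue masses and lists single-residue substitutions that would explain it. The function names and the three-residue example peptide are hypothetical.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the sum of residues


def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_shift(sequence: str, observed_mass: float) -> float:
    """Observed neutral precursor mass minus the theoretical peptide mass."""
    return observed_mass - peptide_mass(sequence)


def substitutions_matching(sequence: str, shift: float, tol: float = 0.01):
    """List (position, old, new) single-residue substitutions explaining the shift.

    Positions are 1-based. This is a naive enumeration for illustration;
    it cannot tell apart candidates of equal mass (for example L and I).
    """
    hits = []
    for pos, old in enumerate(sequence, start=1):
        for new, mass in RESIDUE_MASS.items():
            if new != old and abs((mass - RESIDUE_MASS[old]) - shift) <= tol:
                hits.append((pos, old, new))
    return hits


if __name__ == "__main__":
    # Hypothetical peptide "ADK" observed with the mass of "AGK".
    shift = mass_shift("ADK", peptide_mass("AGK"))
    print(round(shift, 4), substitutions_matching("ADK", shift))
```

The example reproduces the arithmetic behind the D→G substitution mentioned above: replacing aspartate with glycine lowers the mass by about 58.005 Da, the same change as losing a carboxymethyl group.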

18.

Background

The immunoglobulins (IG) and the T cell receptors (TR) play a key role in antigen recognition during the adaptive immune response. Recent progress in next-generation sequencing technologies has provided an opportunity for deep T cell receptor repertoire profiling. However, specialised software is required for the rational analysis of the massive data generated by next-generation sequencing.

Results

Here we introduce tcR, a new R package, representing a platform for the advanced analysis of T cell receptor repertoires, which includes diversity measures, shared T cell receptor sequences identification, gene usage statistics computation and other widely used methods. The tool has proven its utility in recent research studies.
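The kinds of measures named here can be illustrated with a short sketch. This is not the tcR API (tcR is an R package and its function names are not reproduced here); it assumes a repertoire is represented as a mapping from CDR3 sequence to clone count, and shows two standard diversity indices plus the set of clonotypes shared between two repertoires. The sequences in the example are made up.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of clonotype frequencies."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index: the effective number of equally abundant clonotypes."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return 1.0 / sum((c / total) ** 2 for c in counts)


def shared_clonotypes(repertoire_a, repertoire_b):
    """Sequences present in both repertoires (dicts of sequence -> count)."""
    return set(repertoire_a) & set(repertoire_b)


if __name__ == "__main__":
    # Hypothetical repertoires keyed by CDR3 amino acid sequence.
    sample_a = {"CASSLGF": 30, "CASSPGF": 10, "CASRDTF": 10}
    sample_b = {"CASSLGF": 5, "CASSQETF": 45}
    print(shannon_entropy(sample_a.values()))
    print(inverse_simpson(sample_a.values()))
    print(shared_clonotypes(sample_a, sample_b))
```

Exact sequence identity is the simplest definition of a shared clonotype; definitions that also match V and J gene segments are equally common and would need a richer key than the sequence alone.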

Conclusions

tcR is an R package for the advanced analysis of T cell receptor repertoires after primary TR sequence extraction from raw sequencing reads. The stable version can be directly installed from The Comprehensive R Archive Network (http://cran.r-project.org/mirrors.html). The source code and development version are available at tcR GitHub (http://imminfo.github.io/tcr/) along with the full documentation and typical usage examples.

19.
MOTIVATION: Bayesian analysis is one of the most popular methods in phylogenetic inference. The most commonly used methods fix a single multiple alignment and consider only substitutions as phylogenetically informative mutations, though alignments and phylogenies should be inferred jointly as insertions and deletions also carry informative signals. Methods addressing these issues have been developed only recently, and so far there has been no user-friendly program with a graphical interface that implements these methods. RESULTS: We have developed an extendable software package in the Java programming language that samples from the joint posterior distribution of phylogenies, alignments and evolutionary parameters by applying the Markov chain Monte Carlo method. The package also offers tools for efficient on-the-fly summarization of the results. It has a graphical interface to configure, start and supervise the analysis, to track the status of the Markov chain and to save the results. The background model for insertions and deletions can be combined with any substitution model. It is easy to add new substitution models to the software package as plugins. The samples from the Markov chain can be summarized in several ways, and new postprocessing plugins may also be installed.

20.
Shadforth I, Crowther D, Bessant C. Proteomics, 2005, 5(16): 4082-4095
Current proteomics experiments can generate vast quantities of data very quickly, but this has not been matched by data analysis capabilities. Although there have been a number of recent reviews covering various aspects of peptide and protein identification methods using MS, comparisons of which methods are either the most appropriate for, or the most effective at, their proposed tasks are not readily available. As the need for high-throughput, automated peptide and protein identification systems increases, the creators of such pipelines need to be able to choose algorithms that are going to perform well both in terms of accuracy and computational efficiency. This article therefore provides a review of the currently available core algorithms for PMF, database searching using MS/MS, sequence tag searches and de novo sequencing. We also assess the relative performances of a number of these algorithms. As there is limited reporting of such information in the literature, we conclude that there is a need for the adoption of a system of standardised reporting on the performance of new peptide and protein identification algorithms, based upon freely available datasets. We go on to present our initial suggestions for the format and content of these datasets.