Similar literature
 Found 20 similar documents; search took 906 ms
1.
We developed a resource, the Arabidopsis PeptideAtlas (www.peptideatlas.org/builds/arabidopsis/), to solve central questions about the Arabidopsis thaliana proteome, such as the significance of protein splice forms and post-translational modifications (PTMs), or simply to obtain reliable information about specific proteins. PeptideAtlas is based on published mass spectrometry (MS) data collected through ProteomeXchange and reanalyzed through a uniform processing and metadata annotation pipeline. All matched MS-derived peptide data are linked to spectral, technical, and biological metadata. Nearly 40 million out of ∼143 million MS/MS (tandem MS) spectra were matched to the reference genome Araport11, identifying ∼0.5 million unique peptides and 17,858 uniquely identified proteins (one isoform per gene) at the highest confidence level (false discovery rate 0.0004; 2 non-nested peptides of ≥9 amino acids each), designated canonical proteins, and 3,543 lower-confidence proteins. Physicochemical protein properties were evaluated for targeted identification of unobserved proteins. Additional proteins and isoforms currently not in Araport11 were identified that were generated from pseudogenes, alternative starts, stops, and/or splice variants, and small Open Reading Frames; these features should be considered when updating the Arabidopsis genome. Phosphorylation can be inspected through a sophisticated PTM viewer. PeptideAtlas is integrated with community resources including TAIR, tracks in JBrowse, PPDB, and UniProtKB. Subsequent PeptideAtlas builds will incorporate millions more MS/MS spectra.

A web resource providing the global community with mass spectrometry-based Arabidopsis proteome information and its spectral, technical, and biological metadata integrated with TAIR and JBrowse.
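The "2 non-nested peptides of ≥9 amino acids" criterion mentioned above can be illustrated with a minimal sketch. It assumes that "nested" means one peptide sequence is fully contained in the other; the function name and the example sequences are hypothetical and are not taken from PeptideAtlas.

```python
from itertools import combinations


def has_two_non_nested(peptides, min_len=9):
    """Return True if at least two peptides of length >= min_len exist
    such that neither sequence is contained in the other.

    Assumption: 'nested' is interpreted as a plain substring relation.
    """
    # De-duplicate and drop peptides that are too short to count.
    long_enough = sorted({p for p in peptides if len(p) >= min_len})
    # Any single non-nested pair is enough to satisfy the rule.
    return any(a not in b and b not in a
               for a, b in combinations(long_enough, 2))


# Hypothetical peptide evidence for one protein.
print(has_two_non_nested(["ACDEFGHIKLM", "ACDEFGHIK"]))    # nested pair only
print(has_two_non_nested(["ACDEFGHIKLM", "NPQRSTVWYAC"]))  # independent pair
```

The false discovery rate filter is a separate step and is not modelled here.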

2.
Finding homologous and orthologous protein sequences is often the first step in evolutionary studies, annotation projects, and functional complementation experiments. Despite the many computational tools currently available, there remains a need for easy-to-use tools that provide functional information. Here, a new web application called orthoFind is presented. It allows a quick search for homologous and orthologous proteins given one or more query sequences, supports a recurrent and exhaustive search against reference proteomes, and can include user databases. It addresses the protein multidomain problem by searching for homologs with the same domain architecture, and it gives a simple functional analysis of the results to help in the annotation process. orthoFind is easy to use and has been shown to provide accurate results with different datasets. Availability: http://www.bioinfocabd.upo.es/orthofind/.

3.
Libraries of randomised peptides displayed on phages or viral particles are essential tools in a wide spectrum of applications. However, there is only limited understanding of a library's fundamental dynamics and of the influence of encoding schemes and sizes on its quality. Numeric properties of libraries, such as the expected number of different peptides and the library's coverage, have long been in use as measures of a library's quality. Here, we present a graphical framework of these measures together with a library's relative efficiency to help describe libraries in enough detail for researchers to plan new experiments in a more informed manner. In particular, these values allow us to answer, in a probabilistic fashion, the question of whether a specific library does indeed contain one of the "best" possible peptides. The framework is implemented in a web-interface based on two packages, discreteRV and peptider, for the statistical software environment R. We further provide a user-friendly web-interface called PeLiCa (Peptide Library Calculator, http://www.pelica.org), allowing scientists to plan and analyse their peptide libraries.
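To make the measures named above concrete, here is a rough sketch of the expected number of different peptides, the coverage, and a relative efficiency. It rests on a strong simplifying assumption that is not in the abstract: every possible peptide is equally likely, which ignores the encoding scheme (for example codon bias). Under that assumption, a library of N clones drawn from k possible peptides contains a given peptide with probability 1 − (1 − 1/k)^N. The function name, the dictionary keys, and the definition of relative efficiency as expected distinct peptides divided by library size are all illustrative choices.

```python
import math


def library_stats(peptide_length, library_size, alphabet_size=20):
    """Diversity measures for a uniformly random peptide library.

    Assumption: all alphabet_size ** peptide_length peptides are
    equally likely, so encoding-scheme effects are not modelled.
    """
    k = alphabet_size ** peptide_length
    # log of the probability that one specific peptide is absent;
    # log1p/expm1 keep precision when 1/k is tiny.
    log_absent = library_size * math.log1p(-1.0 / k)
    p_present = -math.expm1(log_absent)
    expected_distinct = k * p_present
    return {
        "possible": k,
        "expected_distinct": expected_distinct,
        "coverage": p_present,  # equals expected_distinct / k
        "relative_efficiency": expected_distinct / library_size,
    }


# Example: 7-mer library with one billion clones.
stats = library_stats(peptide_length=7, library_size=10**9)
print(stats["coverage"], stats["relative_efficiency"])
```

Under the uniform assumption, the coverage also equals the probability that any one specific peptide, such as a "best" binder, is present in the library.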

4.
Seed storage proteins, the major food proteins, possess unique physicochemical characteristics which determine their nutritional importance and influence their utilization by humans. Here, we describe a database-driven tool named Seed Pro-Nutra Care, which comprises a systematic compendium of seed storage proteins and their bioactive peptides influencing several vital organ systems for maintenance of health. Seed Pro-Nutra Care is an integrated resource on seed storage proteins. This resource helps in (I) characterization of proteins as belonging to the seed storage protein group or not; (II) identification of bioactive peptides and their sequences by peptide name; (III) determination of physicochemical properties of seed storage proteins; (IV) epitope identification and mapping; and (V) allergenicity prediction and characterization. Seed Pro-Nutra Care is a compilation of data on bioactive peptides present in seed storage proteins from our own collections and other published and unpublished sources. The database provides an information resource for a variety of seed-related biological information and its use in nutritional and biomedical applications.

Availability

http://www.gbpuat-cbsh.ac.in/departments/bi/database/seed_pro_nutra_care/

5.
6.
Tn-seq is a high throughput technique for analysis of transposon mutant libraries. Tn-seq Explorer was developed as a convenient and easy-to-use package of tools for exploration of the Tn-seq data. In a typical application, the user will have obtained a collection of sequence reads adjacent to transposon insertions in a reference genome. The reads are first aligned to the reference genome using one of the tools available for this task. Tn-seq Explorer reads the alignment and the gene annotation, and provides the user with a set of tools to investigate the data and identify possibly essential or advantageous genes as those that contain significantly low counts of transposon insertions. Emphasis is placed on providing flexibility in selecting parameters and methodology most appropriate for each particular dataset. Tn-seq Explorer is written in Java as a menu-driven, stand-alone application. It was tested on Windows, Mac OS, and Linux operating systems. The source code is distributed under the terms of GNU General Public License. The program and the source code are available for download at http://www.cmbl.uga.edu/downloads/programs/Tn_seq_Explorer/ and https://github.com/sina-cb/Tn-seqExplorer.
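The idea of flagging genes with unusually few transposon insertions can be sketched as follows. This is not Tn-seq Explorer's own method, whose statistics the abstract does not specify; it is a simplified illustration that counts unique insertion sites per gene and applies a fixed density threshold. The function names, the threshold, and the toy coordinates are hypothetical.

```python
from bisect import bisect_left, bisect_right


def insertion_density(genes, insertion_sites):
    """Unique insertion sites per base pair for each gene.

    genes: iterable of (name, start, end), 1-based inclusive coordinates.
    insertion_sites: iterable of genome positions from aligned reads.
    """
    sites = sorted(set(insertion_sites))  # unique sites, sorted for bisect
    density = {}
    for name, start, end in genes:
        # Count the sites that fall inside [start, end].
        n_sites = bisect_right(sites, end) - bisect_left(sites, start)
        density[name] = n_sites / (end - start + 1)
    return density


def low_density_genes(genes, insertion_sites, max_density):
    """Names of genes at or below a density threshold (assumed cutoff)."""
    density = insertion_density(genes, insertion_sites)
    return [name for name, d in density.items() if d <= max_density]


# Toy example with two 100 bp genes.
genes = [("geneA", 1, 100), ("geneB", 101, 200)]
sites = [5, 50, 50, 150]
print(low_density_genes(genes, sites, max_density=0.015))
```

A real analysis would replace the fixed cutoff with a significance test and would likely trim gene ends, since insertions near the termini are often tolerated.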

7.
Synthetic Biology Open Language (SBOL) Visual is a graphical standard for genetic engineering. It consists of symbols representing DNA subsequences, including regulatory elements and DNA assembly features. These symbols can be used to draw illustrations for communication and instruction, and as image assets for computer-aided design. SBOL Visual is a community standard, freely available for personal, academic, and commercial use (Creative Commons CC0 license). We provide prototypical symbol images that have been used in scientific publications and software tools. We encourage users to use and modify them freely, and to join the SBOL Visual community: http://www.sbolstandard.org/visual.

8.
An important step in mass spectrometry (MS)-based proteomics is the identification of peptides by their fragment spectra. Regardless of the identification score achieved, almost all tandem-MS (MS/MS) spectra contain remaining peaks that are not assigned by the search engine. These peaks may be explainable by human experts but the scale of modern proteomics experiments makes this impractical. In computer science, Expert Systems are a mature technology to implement a list of rules generated by interviews with practitioners. We here develop such an Expert System, making use of literature knowledge as well as a large body of high mass accuracy and pure fragmentation spectra. Interestingly, we find that even with high mass accuracy data, rule sets can quickly become too complex, leading to over-annotation. Therefore we establish a rigorous false discovery rate, calculated by random insertion of peaks from a large collection of other MS/MS spectra, and use it to develop an optimized knowledge base. This rule set correctly annotates almost all peaks of medium or high abundance. For high resolution HCD data, median intensity coverage of fragment peaks in MS/MS spectra increases from 58% by search engine annotation alone to 86%. The resulting annotation performance surpasses a human expert, especially on complex spectra such as those of larger phosphorylated peptides. Our system is also applicable to high resolution collision-induced dissociation data. It is available both as a part of MaxQuant and via a webserver that only requires an MS/MS spectrum and the corresponding peptide sequence, and which outputs publication quality, annotated MS/MS spectra (www.biochem.mpg.de/mann/tools/). It provides expert knowledge to beginners in the field of MS-based proteomics and helps advanced users to focus on unusual and possibly novel types of fragment ions.

In MS-based proteomics, peptides are matched to peptide sequences in databases using search engines (1–3). Statistical criteria are established for accepted versus rejected peptide spectra matches based on the search engine score, and usually a 99% certainty is required for reported peptides. The search engines typically only take sequence specific backbone fragmentation into account (i.e. a, b, and y ions) and some of their neutral losses. However, tandem mass spectra, especially of larger peptides, can be quite complex and contain a number of medium or even high abundance peptide fragments that are not annotated by the search engine result. This can result in uncertainty for the user, especially if only relatively few peaks are annotated, because it may reflect an incorrect identification. However, the most common cause of unlabeled peaks is that another peptide was present in the precursor selection window and was cofragmented. This has variously been termed “chimeric spectra” (4–6), or the problem of low precursor ion fraction (PIF) (7). Such spectra may still be identifiable with high confidence. The Andromeda search engine in MaxQuant, for instance, attempts to identify a second peptide in such cases (8, 9). However, even “pure” spectra (those with a high PIF) often still contain many unassigned peaks. These can be caused by different fragment types, such as internal ions, single or combined neutral losses as well as immonium and other ion types in the low mass region. A mass spectrometric expert can assign many or all of these peaks, based on expert knowledge of fragmentation and manual calculation of fragment masses, resulting in a higher degree of confidence for the identification. However, there are more and more practitioners of proteomics without in-depth training or experience in annotating MS/MS spectra, and such annotation would in any case be prohibitive for hundreds of thousands of spectra. Furthermore, even human experts may wrongly annotate a given peak, especially with low mass accuracy tandem mass spectra, or fail to consider every possibility that could have resulted in this fragment mass.

Given the desirability of annotating fragment peaks to the highest degree possible, we turned to “Expert Systems,” a well-established technology in computer science. Expert Systems achieved prominence in the 1970s and 1980s and were meant to solve complex problems by reasoning about knowledge (10, 11). Interestingly, one of the first examples was developed by Nobel Prize winner Joshua Lederberg more than 40 years ago, and dealt with the interpretation of mass spectrometric data. The program's name was Heuristic DENDRAL (12), and it was capable of interpreting the mass spectra of aliphatic ethers and their fragments. The hypotheses produced by the program described molecular structures that are plausible explanations of the data. To infer these explanations from the data, the program incorporated a theory of chemical stability that provided limiting constraints as well as heuristic rules.

In general, the aim of an Expert System is to encode knowledge extracted from professionals in the field in question. This then powers a rule-based system that can be applied broadly and in an automated manner. A rule-based Expert System represents the information obtained from human specialists in the form of IF-THEN rules. These are used to perform operations on input data to reach an appropriate conclusion. A generic Expert System is essentially a computer program that provides a framework for performing a large number of inferences in a predictable way, using forward or backward chains, backtracking, and other mechanisms (13). Therefore, in contrast to statistics based learning, the “expert program” does not know what it knows through the raw volume of facts in the computer's memory. Instead, like a human expert, it relies on a reasoning-like process of applying an empirically derived set of rules to the data.

Here we implemented an Expert System for the interpretation of high mass accuracy tandem mass spectrometry data of peptides. It was developed in an iterative manner together with human experts on peptide fragmentation, using the published literature on fragmentation pathways as well as large data sets of higher-energy collisional dissociation (HCD) (14) and collision-induced dissociation (CID) based peptide identifications. Our goal was to achieve an annotation performance similar to or better than that of experienced mass spectrometrists (15), thus making comprehensively annotated peptide spectra available in large scale proteomics.
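The two ideas described above, rule-based peak annotation and a false discovery estimate obtained by inserting peaks from other spectra, can be illustrated with a deliberately small sketch. It is not the MaxQuant Expert System: the single matching rule, the ppm tolerance, the function names, and all m/z values are assumptions made only for illustration.

```python
import random


def annotate(peaks, theoretical, tol_ppm=20.0):
    """Label each observed m/z that lies within tol_ppm of a theoretical
    fragment m/z. Returns a list of (observed_mz, label) pairs.

    This stands in for one IF-THEN rule: IF a peak matches a predicted
    fragment mass within tolerance THEN assign that fragment's label.
    """
    hits = []
    for mz in peaks:
        for label, theo in theoretical.items():
            if abs(mz - theo) / theo * 1e6 <= tol_ppm:
                hits.append((mz, label))
                break  # one label per peak in this sketch
    return hits


def decoy_hit_rate(theoretical, decoy_pool, n_insert, tol_ppm=20.0, seed=0):
    """Fraction of randomly inserted foreign peaks that receive a label,
    used here as a rough proxy for the false annotation rate."""
    rng = random.Random(seed)
    decoys = rng.sample(decoy_pool, n_insert)
    return len(annotate(decoys, theoretical, tol_ppm)) / n_insert


# Hypothetical fragment masses and observed peaks.
theoretical = {"y1": 175.119, "b2": 200.0}
print(annotate([175.1195, 300.0], theoretical))
print(decoy_hit_rate(theoretical, [200.0, 300.0], n_insert=2))
```

In this picture, adding more rules increases the number of labelled peaks but also raises the decoy hit rate, which is the over-annotation trade-off the text describes.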

9.
Scanning mutagenesis is a powerful protein engineering technique used to study protein structure-function relationships, map binding sites and design more stable proteins or proteins with altered properties. One of the time-consuming tasks encountered in application of this technique is the design of primers for site-directed mutagenesis. Here we present an open-source multi-platform software AAscan developed to design primers for this task according to a set of empirical rules such as melting temperature, overall length, length of overlap regions, and presence of GC clamps at the 3’ end, for any desired substitution. We also describe additional software tools which are used to analyse a large number of sequencing results for the presence of desired mutations, as well as related software to design primers for ligation independent cloning. We have used AAscan software to design primers to make over 700 mutants, with a success rate of over 80%. We hope that the open-source nature of our software and ready availability of freeware tools used for its development will facilitate its adaptation and further development. The software is distributed under GPLv3 licence and is available at http://www.psi.ch/lbr/aascan.
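The kind of empirical rules listed above (melting temperature, overall length, GC clamp at the 3' end) can be sketched as simple checks. This is not AAscan's algorithm: the melting temperature uses the crude Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough guide for short oligonucleotides, and the length and Tm thresholds are hypothetical defaults chosen for illustration.

```python
def wallace_tm(primer):
    """Approximate melting temperature in degrees C by the Wallace rule."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def has_gc_clamp(primer):
    """True if the 3' terminal base is G or C."""
    return primer.upper().endswith(("G", "C"))


def passes_rules(primer, min_len=18, max_len=40, min_tm=55):
    """Apply the three checks; all thresholds are assumed values."""
    return (min_len <= len(primer) <= max_len
            and wallace_tm(primer) >= min_tm
            and has_gc_clamp(primer))


# Hypothetical 20-mer candidate primer.
candidate = "ATGCATGCATGCATGCATGC"
print(wallace_tm(candidate), has_gc_clamp(candidate), passes_rules(candidate))
```

A production tool would more plausibly use a nearest-neighbour Tm model and also score the overlap regions, which this sketch omits.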

10.
Dbf4-dependent kinase (DDK) and cyclin-dependent kinase (CDK) are essential to initiate DNA replication at individual origins. During replication stress, the S-phase checkpoint inhibits the DDK- and CDK-dependent activation of late replication origins. Rad53 kinase is a central effector of the replication checkpoint and both binds to and phosphorylates Dbf4 to prevent late-origin firing. The molecular basis for the Rad53–Dbf4 physical interaction is not clear but occurs through the Dbf4 N terminus. Here we found that both Rad53 FHA1 and FHA2 domains, which specifically recognize phospho-threonine (pT), interacted with Dbf4 through an N-terminal sequence and an adjacent BRCT domain. Purified Rad53 FHA1 domain (but not FHA2) bound to a pT Dbf4 peptide in vitro, suggesting a possible phospho-threonine-dependent interaction between FHA1 and Dbf4. The Dbf4–Rad53 interaction is governed by multiple contacts that are separable from the Cdc5- and Msa1-binding sites in the Dbf4 N terminus. Importantly, abrogation of the Rad53–Dbf4 physical interaction blocked Dbf4 phosphorylation and allowed late-origin firing during replication checkpoint activation. This indicated that Rad53 must stably bind to Dbf4 to regulate its activity.

11.
12.
13.
14.

Motivation

In mass spectrometry-based proteomics, XML formats such as mzML and mzXML provide an open and standardized way to store and exchange the raw data (spectra and chromatograms) of mass spectrometric experiments. These file formats are being used by a multitude of open-source and cross-platform tools which allow the proteomics community to access algorithms in a vendor-independent fashion and perform transparent and reproducible data analysis. Recent improvements in mass spectrometry instrumentation have increased the data size produced in a single LC-MS/MS measurement and put substantial strain on open-source tools, particularly those that are not equipped to deal with XML data files that reach dozens of gigabytes in size.

Results

Here we present a fast and versatile parsing library for mass spectrometric XML formats available in C++ and Python, based on the mature OpenMS software framework. Our library implements an API for obtaining spectra and chromatograms under memory constraints using random access or sequential access functions, allowing users to process datasets that are much larger than system memory. For fast access to the raw data structures, small XML files can also be completely loaded into memory. In addition, we have improved the parsing speed of the core mzML module by over 4-fold (compared to OpenMS 1.11), making our library suitable for a wide variety of algorithms that need fast access to dozens of gigabytes of raw mass spectrometric data.

Availability

Our C++ and Python implementations are available for the Linux, Mac, and Windows operating systems. All proposed modifications to the OpenMS code have been merged into the OpenMS mainline codebase and are available to the community at https://github.com/OpenMS/OpenMS.
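The principle of sequential access under memory constraints can be shown without the library itself. The sketch below does not use the OpenMS or pyOpenMS API; it uses Python's standard `xml.etree.ElementTree.iterparse` to stream through an mzML-like document and visit one `spectrum` element at a time. The function name and the tiny in-memory document are illustrative assumptions.

```python
import io
import xml.etree.ElementTree as ET


def iter_spectrum_ids(source):
    """Yield the id of each <spectrum> element while streaming the XML.

    source: a file name or a binary file-like object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop a namespace prefix such as '{http://psi.hupo.org/ms/mzml}'.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "spectrum":
            yield elem.get("id")
            # Discard the element's content so processed spectra
            # do not accumulate in memory.
            elem.clear()


# Minimal mzML-like document held in memory for demonstration.
doc = (b'<mzML xmlns="http://psi.hupo.org/ms/mzml"><run>'
       b'<spectrumList count="2">'
       b'<spectrum id="scan=1" index="0"/>'
       b'<spectrum id="scan=2" index="1"/>'
       b'</spectrumList></run></mzML>')
for spectrum_id in iter_spectrum_ids(io.BytesIO(doc)):
    print(spectrum_id)
```

Random access, the other mode described above, additionally needs an index of byte offsets per spectrum, which indexed mzML files provide and which this sketch does not implement.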

15.
We present a computational toolkit of five utility programs for performing basic operations on a protein structure file in PDB format. The programs can be integrated into a pipeline for computational protein structure characterization or used as a standalone analysis package. They include tools for amino acid chirality checks (ProChiral), contact map generation (CoMa), data redundancy (DaRe), hydrogen bond potential energy (HyPE) and electrostatic interaction energy (EsInE). All programs in the toolkit can be accessed and downloaded through the following link: http://www.iitg.ac.in/bpetoolkit/.
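As an illustration of what contact map generation involves, the following sketch builds a binary contact map from a list of 3D coordinates. It is not the CoMa program: the choice of one representative atom per residue, the 8 Å cutoff, and the function name are assumptions, and reading coordinates from a PDB file is left out.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Binary residue-residue contact map.

    coords: list of (x, y, z) tuples, one representative atom per
            residue (for example C-alpha), in angstroms.
    cutoff: distance threshold in angstroms (assumed value).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Two residues are in contact if their atoms lie within the cutoff.
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Three hypothetical residues: the first two are 5 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
for row in contact_map(coords):
    print(row)
```

The diagonal is left at zero here; some conventions mark each residue as in contact with itself instead.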

16.
Homologous recombination is associated with the dynamic assembly and disassembly of DNA–protein complexes. Assembly of a nucleoprotein filament comprising ssDNA and the RecA homolog, Rad51, is a key step required for homology search during recombination. The budding yeast Srs2 DNA translocase is known to dismantle Rad51 filaments in vitro. However, there is limited evidence to support the dismantling activity of Srs2 in vivo. Here, we show that Srs2 indeed disrupts Rad51-containing complexes from chromosomes during meiosis. Overexpression of Srs2 during the meiotic prophase impairs meiotic recombination and removes Rad51 from meiotic chromosomes. This dismantling activity is specific for Rad51, as Srs2 overexpression does not remove Dmc1 (a meiosis-specific Rad51 homolog), Rad52 (a Rad51 mediator), or replication protein A (RPA; a single-stranded DNA-binding protein). Rather, RPA replaces Rad51 under these conditions. A mutant Srs2 lacking helicase activity cannot remove Rad51 from meiotic chromosomes. Interestingly, the Rad51-binding domain of Srs2, which is critical for Rad51-dismantling activity in vitro, is not essential for this activity in vivo. Our results suggest that a precise level of Srs2, in the form of the Srs2 translocase, is required to appropriately regulate the Rad51 nucleoprotein filament dynamics during meiosis.

17.
The oocytes of most sexually reproducing animals arrest in meiotic prophase I. Oocyte growth, which occurs during this period of arrest, enables oocytes to acquire the cytoplasmic components needed to produce healthy progeny and to gain competence to complete meiosis. In the nematode Caenorhabditis elegans, the major sperm protein hormone promotes meiotic resumption (also called meiotic maturation) and the cytoplasmic flows that drive oocyte growth. Prior work established that two related TIS11 zinc-finger RNA-binding proteins, OMA-1 and OMA-2, are redundantly required for normal oocyte growth and meiotic maturation. We affinity purified OMA-1 and identified associated mRNAs and proteins using genome-wide expression data and mass spectrometry, respectively. As a class, mRNAs enriched in OMA-1 ribonucleoprotein particles (OMA RNPs) have reproductive functions. Several of these mRNAs were tested and found to be targets of OMA-1/2-mediated translational repression, dependent on sequences in their 3′-untranslated regions (3′-UTRs). Consistent with a major role for OMA-1 and OMA-2 in regulating translation, OMA-1-associated proteins include translational repressors and activators, and some of these proteins bind directly to OMA-1 in yeast two-hybrid assays, including OMA-2. We show that the highly conserved TRIM-NHL protein LIN-41 is an OMA-1-associated protein, which also represses the translation of several OMA-1/2 target mRNAs. In the accompanying article in this issue, we show that LIN-41 prevents meiotic maturation and promotes oocyte growth in opposition to OMA-1/2. Taken together, these data support a model in which the conserved regulators of mRNA translation LIN-41 and OMA-1/2 coordinately control oocyte growth and the proper spatial and temporal execution of the meiotic maturation decision.

18.
19.
20.
PathVisio is a commonly used pathway editor, visualization and analysis software. Biological pathways have been used by biologists for many years to describe the detailed steps in biological processes. Those powerful, visual representations help researchers to better understand, share and discuss knowledge. Since the first publication of PathVisio in 2008, the original paper has been cited more than 170 times and PathVisio has been used in many different biological studies. As an online editor, PathVisio is also integrated in the community-curated pathway database WikiPathways.

Here we present the third version of PathVisio with the newest additions and improvements of the application. The core features of PathVisio are pathway drawing, advanced data visualization and pathway statistics. Additionally, PathVisio 3 introduces a powerful new extension system that allows other developers to contribute additional functionality in the form of plugins without changing the core application.

PathVisio can be downloaded from http://www.pathvisio.org, and in 2014 PathVisio 3 was downloaded over 5,500 times. There are already more than 15 plugins available in the central plugin repository. PathVisio is a freely available, open-source tool published under the Apache 2.0 license (http://www.apache.org/licenses/LICENSE-2.0). It is implemented in Java and thus runs on all major operating systems. The code repository is available at http://svn.bigcat.unimaas.nl/pathvisio. The support mailing list for users is available on https://groups.google.com/forum/#!forum/wikipathways-discuss and for developers on https://groups.google.com/forum/#!forum/wikipathways-devel.
This is a PLOS Computational Biology software article.
