1.

The RESID Database of Protein Modifications as a resource and annotation tool

Garavelli JS 《Proteomics》2004,4(6):1527-1533

2.

The RESID Database of protein structure modifications.

J S Garavelli 《Nucleic acids research》1999,27(1):198-199

Because the number of post-translational modifications requiring standardized annotation in the PIR-International Protein Sequence Database was large and steadily increasing, a database of protein structure modifications was constructed in 1993 to assist in producing appropriate feature annotations for covalent binding sites, modified sites and cross-links. In 1995 RESID was publicly released as a PIR-International text database distributed on CD-ROM and accessible through the ATLAS program. In 1998 it was made available on the PIR Web site at http://www-nbrf.georgetown.edu/pir/searchdb++ +.html . The RESID Database includes such information as: systematic and frequently observed alternate names; Chemical Abstracts Service registry numbers; atomic formulas and weights; enzyme activities; indicators for N-terminal, C-terminal or peptide chain cross-link modifications; keywords; and literature citations with database cross-references. The RESID Database can be used to predict atomic masses for peptides, and is being enhanced to provide molecular structures for graphical presentation on the PIR Web site using widely available molecular viewing programs.
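
The remark that modification data can be used to predict peptide masses can be illustrated with a small calculation. The following is a minimal sketch, not RESID's own method: the residue masses and the two modification mass shifts are standard monoisotopic values supplied here as assumptions rather than taken from the database, and the names `peptide_mass` and `PTM_DELTA` are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values, assumed here;
# they are not read from RESID).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per peptide (free N- and C-termini)

# Mass shifts for two common modifications (illustrative subset).
PTM_DELTA = {"phospho": 79.96633, "acetyl": 42.01057}


def peptide_mass(sequence, mods=None):
    """Return the monoisotopic mass of a peptide.

    mods maps a 1-based residue position to a key of PTM_DELTA.
    """
    mods = mods or {}
    mass = WATER + sum(MONO[aa] for aa in sequence)
    for position, name in mods.items():
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} is outside the peptide")
        mass += PTM_DELTA[name]
    return mass


if __name__ == "__main__":
    print(round(peptide_mass("GS"), 5))
    print(round(peptide_mass("GS", {2: "phospho"}), 5))
```

A database of modifications would supply the mass shift for each annotated modification; the two-entry table above only stands in for that lookup.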

3.

The RESID Database of protein structure modifications and the NRL-3D Sequence-Structure Database

Garavelli JS Hou Z Pattabiraman N Stephens RM 《Nucleic acids research》2001,29(1):199-201

The RESID Database is a comprehensive collection of annotations and structures for protein post-translational modifications including N-terminal, C-terminal and peptide chain cross-link modifications. The RESID Database includes systematic and frequently observed alternate names, Chemical Abstracts Service registry numbers, atomic formulas and weights, enzyme activities, taxonomic range, keywords, literature citations with database cross-references, structural diagrams and molecular models. The NRL-3D Sequence-Structure Database is derived from the three-dimensional structure of proteins deposited with the Research Collaboratory for Structural Bioinformatics Protein Data Bank. The NRL-3D Database includes standardized and frequently observed alternate names, sources, keywords, literature citations, experimental conditions and searchable sequences from model coordinates. These databases are freely accessible through the National Cancer Institute-Frederick Advanced Biomedical Computing Center at these web sites: http://www.ncifcrf.gov/RESID, http://www.ncifcrf.gov/NRL-3D; or at these National Biomedical Research Foundation Protein Information Resource web sites: http://pir.georgetown.edu/pirwww/dbinfo/resid.html, http://pir.georgetown.edu/pirwww/dbinfo/nrl3d.html

4.

PTMD: A Database of Human Disease-associated Post-translational Modifications

Haodong Xu Yongbo Wang Shaofeng Lin Wankun Deng Di Peng Qinghua Cui Yu Xue 《Genomics, Proteomics & Bioinformatics》2018,16(4):244-251

Various posttranslational modifications (PTMs) participate in nearly all aspects of biological processes by regulating protein functions, and aberrant states of PTMs are frequently implicated in human diseases. Therefore, an integral resource of PTM–disease associations (PDAs) would be a great help for both academic research and clinical use. In this work, we reported PTMD, a well-curated database containing PTMs that are associated with human diseases. We manually collected 1950 known PDAs in 749 proteins for 23 types of PTMs and 275 types of diseases from the literature. Database analyses show that phosphorylation has the largest number of disease associations, whereas neurologic diseases have the largest number of PTM associations. We classified all known PDAs into six classes according to the PTM status in diseases and demonstrated that the upregulation and presence of PTM events account for a predominant proportion of disease-associated PTM events. By reconstructing a disease–gene network, we observed that breast cancers have the largest number of associated PTMs and AKT1 has the largest number of PTMs connected to diseases. Finally, the PTMD database was developed with detailed annotations and can be a useful resource for further analyzing the relations between PTMs and human diseases. PTMD is freely accessible at http://ptmd.biocuckoo.org.

5.

The Protein Mutant Database. (total citations: 3; self-citations: 0; citations by others: 3)

T Kawabata M Ota K Nishikawa 《Nucleic acids research》1999,27(1):355-357

Currently the protein mutant database (PMD) contains over 81 000 mutants, including artificial as well as natural mutants of various proteins extracted from about 10 000 articles. We recently developed a powerful viewing and retrieving system (http://pmd.ddbj.nig.ac.jp), which is integrated with the sequence and tertiary structure databases. The system has the following features: (i) mutated sequences are displayed after being automatically generated from the information described in the entry together with the sequence data of wild-type proteins integrated. This is a convenient feature because it allows one to see the position of altered amino acids (shown in a different color) in the entire sequence of a wild-type protein; (ii) for those proteins whose 3D structures have been experimentally determined, a 3D structure is displayed to show mutation sites in a different color; (iii) a sequence homology search against PMD can be carried out with any query sequence; (iv) a summary of mutations of homologous sequences can be displayed, which shows all the mutations at a certain site of a protein, recorded throughout the PMD.

6.

The Pfam Protein Families Database (total citations: 17; self-citations: 0; citations by others: 17)

Alex Bateman Ewan Birney Lorenzo Cerruti Richard Durbin Laurence Etwiller Sean R. Eddy Sam Griffiths-Jones Kevin L. Howe Mhairi Marshall Erik L. L. Sonnhammer 《Nucleic acids research》2002,30(1):276-280

Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgb.ki.se/Pfam/, in France at http://pfam.jouy.inra.fr/ and in the US at http://pfam.wustl.edu/. The latest version (6.6) of Pfam contains 3071 families, which match 69% of proteins in SWISS-PROT 39 and TrEMBL 14. Structural data, where available, have been utilised to ensure that Pfam families correspond with structural domains, and to improve domain-based annotation. Predictions of non-domain regions are now also included. In addition to secondary structure, Pfam multiple sequence alignments now contain active site residue mark-up. New search tools, including taxonomy search and domain query, greatly add to the functionality and usability of the Pfam resource.

7.

The Eukaryotic Promoter Database (EPD): recent developments. (total citations: 2; self-citations: 3; citations by others: 2)

R C Prier T Junier C Bonnard P Bucher 《Nucleic acids research》1999,27(1):307-309

8.

The PROF_PAT Protein Pattern Database: Assessment of Efficiency

Nizolenko L. Ph. Bachinsky A. G. Naumochkin A. N. Yarygin A. A. Grigorovich D. A. 《Molecular Biology》2004,38(2):210-217

9.

The PIR-International Protein Sequence Database. (total citations: 1; self-citations: 0; citations by others: 1)

D G George W C Barker H W Mewes F Pfeiffer A Tsugita 《Nucleic acids research》1996,24(1):17-20

From its origin the Protein Sequence Database has been designed to support research and has focused on comprehensive coverage, quality control and organization of the data in accordance with biological principles. Since 1988 the database has been maintained collaboratively within the framework of PIR-International, an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. The database is widely distributed and is available on the World Wide Web, via ftp, email server, on CD-ROM and magnetic media. It is widely redistributed and incorporated into many other protein sequence data compilations, including SWISS-PROT and the Entrez system of the NCBI.

10.

The PIR-International Protein Sequence Database.

D G George W C Barker H W Mewes F Pfeiffer A Tsugita 《Nucleic acids research》1994,22(17):3569-3573

PIR-International is an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. A major objective of PIR-International is to continue the development of the Protein Sequence Database as an essential public resource for protein sequence information. This paper briefly describes the architecture of the Protein Sequence Database and how it and associated data sets are distributed and can be accessed electronically.

11.

The RESID database of protein structure modifications: 2000 update

Garavelli JS 《Nucleic acids research》2000,28(1):209-211

The RESID Database contains supplemental information on post-translational modifications for the standardized annotations appearing in the PIR-International Protein Sequence Database. The RESID Database includes: systematic and frequently observed alternate names, Chemical Abstracts Service registry numbers, atomic formulas and weights, enzyme activities, indicators for N-terminal, C-terminal or peptide chain cross-link modifications, keywords, literature citations with database cross-references, structural diagrams and molecular models. Since 1995 updates of the RESID Database have appeared as often as weekly, and full releases appear quarterly. The database is freely accessible through the PIR Web site http://pir.georgetown.edu/pirwww/dbinfo/resid.html and by FTP.

12.

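Research Progress on Protein Post-translational Modifications (total citations: 1; self-citations: 0; citations by others: 1)

郭会灿 (Guo Huican) 《生物技术通报 (Biotechnology Bulletin)》2011,(7)

Post-translational modifications play an important role in protein processing and maturation. They can alter the physical and chemical properties of a protein and affect its spatial conformation, steric hindrance, and stability, thereby influencing its biological activity and changing its function. The structural characteristics of the modifying group itself also have a profound effect on the properties and function of the protein. Building on existing research, this paper reviews the main types of protein post-translational modification and the potential biological functions of each.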

13.

PSPDB: Plant Stress Protein Database (total citations: 1; self-citations: 0; citations by others: 1)

S. Anil Kumar P. Hima Kumari Vijayaraghava Seshadri Sundararajan Prashanth Suravajhala Rajaraman Kanagasabai P. B. Kavi Kishor 《Plant Molecular Biology Reporter》2014,32(4):940-942

Plants produce various proteins to overcome biotic and abiotic stresses. Current plant stress databases report plant genes without protein annotations specific to these stresses. To our knowledge, no dedicated plant stress protein database that covers both biotic and abiotic stresses and links out to other related databases is currently available to plant biologists. This need prompted us to build a database that includes important resources for stress-based factors. We developed the Plant Stress Protein Database (PSPDB), a web-accessible resource that covers 2,064 manually curated plant stress proteins from a wide array of 134 plant species with 30 different types of biotic and abiotic stresses. Functional and experimental validation of proteins associated with biotic and abiotic stresses has been employed as the sole criterion for inclusion in the database. The database is available at http://www.bioclues.org/pspdb/.

14.

VPDB: Viral Protein Structural Database

Om Prakash Sharma Ankush Jadhav Afzal Hussain Muthuvel Suresh Kumar 《Bioinformation》2011,6(8):324-326

15.

Protein Language: Post-Translational Modifications Talking to Each Other

Lam Dai Vu Kris Gevaert Ive De Smet 《Trends in plant science》2018,23(12):1068-1080

16.

The Essential Role of Mass Spectrometry in Characterizing Protein Structure: Mapping Posttranslational Modifications

Roland S. Annan Steven A. Carr 《Journal of Protein Chemistry》1997,16(5):391-402

Over the last few years we have developed mass spectrometry-based approaches for selective identification of a variety of posttranslational modifications, and for sequencing the modified peptides. These methods do not involve radiolabeling or derivatization. Instead, modification-specific fragment ions are produced by collision-induced dissociation (CID) during analysis of peptides by ESMS. The formation and detection of these marker ions on-the-fly during the LC-ESMS analysis of a protein digest is a powerful technique for identifying posttranslationally modified peptides. Using the marker ion strategy in an orthogonal fashion, a precursor ion scan can detect peptides which give rise to a diagnostic fragment ion, even in an unfractionated protein digest. Once the modified peptide has been located, the appropriate precursor ion can be sequenced by tandem MS. The utility and interplay of this approach to mapping PTM is illustrated with examples that involve protein glycosylation and phosphorylation.
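
The marker-ion idea described above can be mimicked in software by filtering fragment spectra for a diagnostic m/z value. The sketch below is an in-silico analogue only, not the instrument method used by the authors: the `Spectrum` type, the function name, the tolerance, and the example marker value (about 78.959 for the PO3− ion often used for phosphopeptides in negative-ion mode) are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Spectrum:
    """One fragment spectrum: the precursor m/z and its fragment m/z values."""
    precursor_mz: float
    fragment_mzs: List[float]


def precursors_with_marker(spectra, marker_mz, tol=0.01):
    """Return precursor m/z values whose fragments include the marker ion.

    A fragment counts as the marker if it lies within `tol` of `marker_mz`.
    """
    hits = []
    for spectrum in spectra:
        if any(abs(mz - marker_mz) <= tol for mz in spectrum.fragment_mzs):
            hits.append(spectrum.precursor_mz)
    return hits


if __name__ == "__main__":
    # Hypothetical spectra, made up for this example.
    spectra = [
        Spectrum(500.25, [78.959, 200.10]),
        Spectrum(620.30, [175.119, 300.20]),
    ]
    print(precursors_with_marker(spectra, marker_mz=78.959))
```

The precursors returned by such a filter would be the candidates selected for sequencing by tandem MS in the workflow the abstract describes.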

17.

HPID: the Human Protein Interaction Database (total citations: 1; self-citations: 0; citations by others: 1)

Han K Park B Kim H Hong J Park J 《Bioinformatics (Oxford, England)》2004,20(15):2466-2470

The Human Protein Interaction Database (http://www.hpid.org) was designed (1) to provide human protein interaction information pre-computed from existing structural and experimental data, (2) to predict potential interactions between proteins submitted by users and (3) to provide a depository for new human protein interaction data from users. Two types of interaction are available from the pre-computed data: (1) interactions at the protein superfamily level and (2) those transferred from the interactions of yeast proteins. Interactions at the superfamily level were obtained by locating known structural interactions of the PDB in the SCOP domains and identifying homologs of the domains in the human proteins. Interactions transferred from yeast proteins were obtained by identifying homologs of the yeast proteins in the human proteins. For each human protein in the database and each query submitted by users, the protein superfamilies and yeast proteins assigned to the protein are shown, along with their interacting partners. We have also developed a set of web-based programs so that users can visualize and analyze protein interaction networks in order to explore the networks further. AVAILABILITY: http://www.hpid.org.
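
The transfer of interactions from yeast to human proteins via homologs can be sketched as a simple set operation. This is a minimal illustration under stated assumptions, not HPID's actual pipeline: the homolog mapping is taken as given (how homologs are detected is outside the sketch), and the function name and all protein identifiers are hypothetical placeholders.

```python
def transfer_interactions(yeast_interactions, human_homologs):
    """Predict human interactions from yeast interactions.

    yeast_interactions: iterable of (yeast_a, yeast_b) pairs.
    human_homologs: dict mapping a yeast protein to a set of human homologs.
    Returns a set of unordered human pairs, stored as sorted tuples.
    """
    predicted = set()
    for yeast_a, yeast_b in yeast_interactions:
        for human_a in human_homologs.get(yeast_a, ()):
            for human_b in human_homologs.get(yeast_b, ()):
                # Sort so that (x, y) and (y, x) count as the same pair.
                predicted.add(tuple(sorted((human_a, human_b))))
    return predicted


if __name__ == "__main__":
    # Placeholder identifiers, not real proteins.
    yeast_interactions = [("Y_A", "Y_B"), ("Y_B", "Y_C")]
    human_homologs = {"Y_A": {"H_1", "H_2"}, "Y_B": {"H_3"}}
    for pair in sorted(transfer_interactions(yeast_interactions, human_homologs)):
        print(pair)
```

Yeast proteins with no human homolog (here `Y_C`) contribute no predictions, so the second yeast interaction is dropped.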

18.

Focus on Metabolism: Posttranslational Protein Modifications in Plant Metabolism

Giulia Friso Klaas J. van Wijk 《Plant physiology》2015,169(3):1469-1487

Posttranslational modifications (PTMs) of proteins greatly expand proteome diversity, increase functionality, and allow for rapid responses, all at relatively low costs for the cell. PTMs play key roles in plants through their impact on signaling, gene expression, protein stability and interactions, and enzyme kinetics. Following a brief discussion of the experimental and bioinformatics challenges of PTM identification, localization, and quantification (occupancy), a concise overview is provided of the major PTMs and their (potential) functional consequences in plants, with emphasis on plant metabolism. Classic examples that illustrate the regulation of plant metabolic enzymes and pathways by PTMs and their cross talk are summarized. Recent large-scale proteomics studies mapped many PTMs to a wide range of metabolic functions. Unraveling of the PTM code, i.e. a predictive understanding of the (combinatorial) consequences of PTMs, is needed to convert this growing wealth of data into an understanding of plant metabolic regulation.

The primary amino acid sequence of proteins is defined by the translated mRNA, often followed by N- or C-terminal cleavages for preprocessing, maturation, and/or activation. Proteins can undergo further reversible or irreversible posttranslational modifications (PTMs) of specific amino acid residues. Proteins are directly responsible for the production of plant metabolites because they act as enzymes or as regulators of enzymes. Ultimately, most proteins in a plant cell can affect plant metabolism (e.g. through effects on plant gene expression, cell fate and development, structural support, transport, etc.). Many metabolic enzymes and their regulators undergo a variety of PTMs, possibly resulting in changes in oligomeric state, stabilization/degradation, and (de)activation, and PTMs can facilitate the optimization of metabolic flux.
However, the direct in vivo consequence of a PTM on a metabolic enzyme or pathway is frequently not very clear, in part because it requires measurements of input and output of the reactions, including flux through the enzyme or pathway. This Update will start out with a short overview on the major PTMs observed for each amino acid residue (Table I) … PTMs, including determination of the localization within proteins (i.e. the specific residues) and occupancy. Challenges in dealing with multiple PTMs per protein and cross talk between PTMs will be briefly outlined. We then describe the major physiological PTMs observed in plants as well as PTMs that are nonenzymatically induced during sample preparation (Table I) … PTMs, in particular for enzymes in primary metabolism (Calvin cycle, glycolysis, and respiration) and the C4 shuttle accommodating photosynthesis in C4 plants (Table III).

Table I.

PTMs observed in plants

Amino Acid Residue	Observed Physiological PTM in Plants	PTMs Caused by Sample Preparation
Ala (A)	Not known
Arg (R)	Methylation, carbonylation
Asn (N)	Deamidation, N-linked glycosylation	Deamidation
Asp (D)	Phosphorylation (in two-component system)
Cys (C)	Glutathionylation (SSG), disulfide bonded (S-S), sulfenylation (-SOH), sulfonylation (-SO₃H), acylation, lipidation, acetylation, nitrosylation (SNO), methylation, palmitoylation, phosphorylation (rare)	Propionamide
Glu (E)	Carboxylation, methylation	Pyro-Glu
Gln (Q)	Deamidation	Deamidation, pyro-Glu
Gly (G)	N-Myristoylation (N-terminal Gly residue)
His (H)	Phosphorylation (infrequent)	Oxidation
Ile (I)	Not known
Leu (L)	Not known
Lys (K)	N-ε-Acetylation, methylation, hydroxylation, ubiquitination, sumoylation, deamination, O-glycosylation, carbamylation, carbonylation, formylation
Met (M)	(De)formylation, excision (NME), (reversible) oxidation, sulfonation (-SO₂), sulfoxation (-SO)	Oxidation, 2-oxidation, formylation, carbamylation
Phe (F)	Not known
Pro (P)	Carbonylation	Oxidation
Ser (S)	Phosphorylation, O-linked glycosylation, O-linked GlcNAc (O-GlcNAc)	Formylation
Thr (T)	Phosphorylation, O-linked glycosylation, O-linked GlcNAc (O-GlcNAc), carbonylation	Formylation
Trp (W)	Glycosylation (C-mannosylation)	Oxidation
Tyr (Y)	Phosphorylation, nitration
Val (V)	Not known
Free NH₂ of protein N termini	Preprotein processing, Met excision, formylation, pyro-Glu, N-myristoylation, N-acylation (i.e. palmitoylation), N-terminal α-amine acetylation, ubiquitination	Formylation (Met), pyro-Glu (Gln)

Table II.

Most significant and/or frequent PTMs observed in plants

Type of PTM (Reversible, Except if Marked with an Asterisk)	Spontaneous (S; Nonenzymatic) or Enzymatic (E)	Comment on Subcellular Location and Frequency
Phosphorylation (Ser, Thr, Tyr, His, Asp)	E	His and Asp phosphorylation have low frequency
S-Nitrosylation (Cys) and nitration* (Tyr)	S (RNS), but reversal is enzymatic for Cys by thioredoxins	Throughout the cell
Acetylation (N-terminal α-amine, Lys ε-amine)	E	In mitochondria, very little N-terminal acetylation, but high Lys acetylation; Lys acetylation correlates to [acetyl-CoA]
Deamidation (Gln, Asn)	S, but reversal of isoAsp is enzymatic by isoAsp methyltransferase	Throughout the cell
Lipidation (S-acetylation, N-myristoylation, prenylation; Cys, Gly, Lys, Trp, N terminal)	E	Not (or rarely) within plastids, mitochondria, peroxisomes
N-Linked glycosylation (Asn); O linked (Lys, Ser, Thr, Trp)	E	Only proteins passing through the secretory system; O linked in the cell wall
Ubiquitination (Lys, N terminal)	E	Not within plastids, mitochondria, peroxisomes
Sumoylation (Lys)	E	Not within plastids, mitochondria, peroxisomes
Carbonylation* (Pro, Lys, Arg, Thr)	S (ROS)	High levels in mitochondria and chloroplast
Methylation (Arg, Lys, N terminal)	E	Histones (nucleus) and chloroplasts; still underexplored
Glutathionylation (Cys)	E	High levels in chloroplasts
Oxidation (Met, Cys)	S (ROS) and E (by PCOs; see Fig. 1B), but reversal is enzymatic by Met sulfoxide reductases, glutaredoxins, and thioredoxins, except if double oxidized	High levels in mitochondria and chloroplast
Peptidase* (cleavage peptidyl bond)	E	Throughout the cell
S-Guanylation (Cys)	S (RNS)	Rare; 8-nitro-cGMP is signaling molecule in guard cells
Formylation (Met)	S, but deformylation is enzymatic by peptide deformylase	All chloroplasts and mitochondria-encoded proteins are synthesized with initiating formylated Met

Table III.

Regulation by PTMs in plant metabolism and classic examples of well-studied enzymes and pathways. Many of these enzymes also undergo allosteric regulation through cellular metabolites. GAPDH, Glyceraldehyde-3-phosphate dehydrogenase; PRK, phosphoribulokinase.

Process	Enzymes	PTMs, Protein Modifiers, Localization	References
Calvin-Benson cycle (chloroplasts)	Many enzymes	Oxidoreduction of S-S bonds, reversible nitrosylation, glutathionylation; through ferredoxin/ferredoxin-thioredoxin reductase/thioredoxins (mostly f and m) and glutaredoxins; proteomics studies in Arabidopsis and C. reinhardtii	Michelet et al. (2013)
	Rubisco	Methylation, carbamylation, acetylation, N-terminal processing, oligomerization; classical studies in pea (Pisum sativum), spinach (Spinacia oleracea), and Arabidopsis	Houtz and Portis (2003); Houtz et al. (2008)
	GAPDH/CP12/PRK supercomplex	Dynamic heterooligomerization through reversible S-S bond formation controlled by thioredoxins	Graciet et al. (2004); Michelet et al. (2013); López-Calcagno et al. (2014)
Glycolysis	Cytosolic PEPC	Phosphorylation (S, T), monoubiquitination	O’Leary et al. (2011)
Photorespiration	Seven enzymes are phosphorylated	Phosphorylation from meta-analysis of public phosphoproteomics data for Arabidopsis; located in chloroplasts, peroxisomes, mitochondria	Hodges et al. (2013)
	Maize glycerate kinase	Redox-regulated S-S bond; thioredoxin f; studied extensively in chloroplasts of C4 maize	Bartsch et al. (2010)
Respiration (mitochondria)	Potentially many enzymes, but functional/biochemical consequences are relatively unexplored	Recent studies suggested PTMs for many tricarboxylic acid cycle enzymes, including Lys acetylation and thioredoxin-driven S-S formation; in particular, succinate dehydrogenase and fumarase are inactivated by thioredoxins	Lázaro et al. (2013); Schmidtmann et al. (2014); Daloso et al. (2015)
	PDH	Ser (de)phosphorylation by intrinsic kinase and phosphatase; ammonia and pyruvate control PDH kinase activity; see Figure 1B	Thelen et al. (2000); Tovar-Méndez et al. (2003)
C4 cycle (C3 and C4 homologs also involved in glycolysis and/or gluconeogenesis)	Pyruvate orthophosphate dikinase	Phosphorylation by pyruvate orthophosphate dikinase-RP, an S/T bifunctional kinase-phosphatase; in chloroplasts	Chastain et al. (2011); Chen et al. (2014)
	PEPC	Phosphorylation; allosteric regulation by malate and Glc-6-P; in cytosol in mesophyll cells in C4 species (e.g. Panicum maximum); see Figure 1A	Izui et al. (2004); Bailey et al. (2007)
	PEPC kinase	Ubiquitination resulting in degradation (note also diurnal mRNA levels and linkage to activity level; very low protein level); in cytosol in mesophyll cells in C4 species (e.g. Flaveria spp. and maize)	Agetsuma et al. (2005)
	PEPC kinase	Phosphorylation in cytosol in bundle sheath cells	Bailey et al. (2007)
Starch metabolism (chloroplasts)	ADP-Glc pyrophosphorylase	Redox-regulated disulfide bonds and dynamic oligomerization; thioredoxins; see Figure 1C	Geigenberger et al. (2005); Geigenberger (2011)
	Starch-branching enzyme II	Phosphorylation by Ca²⁺-dependent protein kinase; P-driven heterooligomerization	Grimaud et al. (2008); Tetlow and Emes (2014)
Suc metabolism (cytosol)	SPS (synthesis of Suc)	(De)phosphorylation; SPS kinase and SPS phosphatase; 14-3-3 proteins; cytosol (maize and others)	Huber (2007)
	Suc synthase (breakdown of Suc)	Phosphorylation; Ca²⁺-dependent protein kinase; correlations to activity, localization, and turnover	Duncan and Huber (2007); Fedosejevs et al. (2014)
Photosynthetic electron transport (chloroplast thylakoid membranes)	PSII core and light-harvesting complex proteins	(De)phosphorylation by state-transition kinases (STN7/8) and PP2C phosphatases (PBCP and PPH1/TAP38)	Pesaresi et al. (2011); Tikkanen et al. (2012); Rochaix (2014)
Nitrogen assimilation	Nitrate reductase	(De)phosphorylation; 14-3-3 proteins	Lillo et al. (2004); Huber (2007)

There are many recent reviews focusing on specific PTMs in plant biology, many of which are cited in this Update. However, the last general review on plant PTMs is from 2010; given the enormous progress in PTM research in plants over the last 5 years, a comprehensive overview is overdue. Finally, this Update does not review allosteric regulation by metabolites or other types of metabolic feedback and flux control, even if this is extremely important in the regulation of metabolism and (de)activation of enzymes. Recent reviews for specific pathways, such as isoprenoid metabolism, tetrapyrrole metabolism, the Calvin-Benson cycle, starch metabolism, and photorespiration, provide more in-depth discussions of metabolic regulation through various posttranslational mechanisms. Many of the PTMs that have been discovered in the last decade through large-scale proteomics approaches have not yet been integrated in such pathway-specific reviews, because these data are not always easily accessible and because the biological significance of many PTMs is simply not yet understood. We hope that this Update will increase the general awareness of the existence of these PTM data sets, such that their biological significance can be tested and incorporated in metabolic pathways.

19.

Zipping and Unzipping: Protein Modifications Regulating Synaptonemal Complex Dynamics

Jinmin Gao Monica P. Colaiácovo 《Trends in genetics : TIG》2018,34(3):232-245

20.

Negative Example Selection for Protein Function Prediction: The NoGO Database

Noah Youngs Duncan Penfold-Brown Richard Bonneau Dennis Shasha 《PLoS computational biology》2014,10(6)

Negative examples – genes that are known not to carry out a given protein function – are rarely recorded in genome and proteome annotation databases, such as the Gene Ontology database. Negative examples are required, however, for several of the most powerful machine learning methods for integrative protein function prediction. Most protein function prediction efforts have relied on a variety of heuristics for the choice of negative examples. Determining the accuracy of methods for negative example prediction is itself a non-trivial task, given that the Open World Assumption as applied to gene annotations rules out many traditional validation metrics. We present a rigorous comparison of these heuristics, utilizing a temporal holdout, and a novel evaluation strategy for negative examples. We add to this comparison several algorithms adapted from Positive-Unlabeled learning scenarios in text-classification, which are the current state of the art methods for generating negative examples in low-density annotation contexts. Lastly, we present two novel algorithms of our own construction, one based on empirical conditional probability, and the other using topic modeling applied to genes and annotations. We demonstrate that our algorithms achieve significantly fewer incorrect negative example predictions than the current state of the art, using multiple benchmarks covering multiple organisms. Our methods may be applied to generate negative examples for any type of method that deals with protein function, and to this end we provide a database of negative examples in several well-studied organisms, for general use (The NoGO database, available at: bonneaulab.bio.nyu.edu/nogo.html).