1.

Protein structure determination using a database of interatomic distance probabilities

Wall ME Subramaniam S Phillips GN 《Protein science : a publication of the Protein Society》1999,8(12):2720-2727

The accelerated pace of genomic sequencing has increased the demand for structural models of gene products. Improved quantitative methods are needed to study the many systems (e.g., macromolecular assemblies) for which data are scarce. Here, we describe a new molecular dynamics method for protein structure determination and molecular modeling. An energy function, or database potential, is derived from distributions of interatomic distances obtained from a database of known structures. X-ray crystal structures are refined by molecular dynamics with the new energy function replacing the van der Waals potential. Compared to standard methods, this method improved the atomic positions, interatomic distances, and side-chain dihedral angles of structures randomized to mimic the early stages of refinement. The greatest enhancement in side-chain placement was observed for groups that are characteristically buried. More accurate calculated model phases will follow from improved interatomic distances. Details usually seen only in high-resolution refinements were improved, as is shown by an R-factor analysis. The improvements were greatest when refinements were carried out using X-ray data truncated at 3.5 Å. The database potential should therefore be a valuable tool for determining X-ray structures, especially when only low-resolution data are available.

2.

Estimating false discovery rates for peptide and protein identification using randomized databases

Gregory Hather Roger Higdon Andrew Bauman Priska D. von Haller Eugene Kolker 《Proteomics》2010,10(12):2369-2376

MS‐based proteomics characterizes protein contents of biological samples. The most common approach is to first match observed MS/MS peptide spectra against theoretical spectra from a protein sequence database and then to score these matches. The false discovery rate (FDR) can be estimated as a function of the score by searching together the protein sequence database and its randomized version and comparing the score distributions of the randomized versus nonrandomized matches. This work introduces a straightforward isotonic regression‐based method to estimate the cumulative FDRs and local FDRs (LFDRs) of peptide identification. Our isotonic method not only performed as well as other methods used for comparison, but also has the advantages of being: (i) monotonic in the score, (ii) computationally simple, and (iii) not dependent on assumptions about score distributions. We demonstrate the flexibility of our approach by using it to estimate FDRs and LFDRs for protein identification using summaries of the peptide spectra scores. We reconfirmed that several of these methods were superior to a two‐peptide rule. Finally, by estimating both the FDRs and LFDRs, we showed for both peptide and protein identification, moderate FDR values (5%) corresponded to large LFDR values (53 and 60%).
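
The abstract above describes estimating the FDR as a function of score from a combined search against a database and its randomized version. A minimal sketch of that counting idea follows. It is not the authors' isotonic regression method: the function name `decoy_fdr`, the input layout (pairs of score and decoy flag), the decoy/target ratio, and the running-minimum step used to make the estimate monotonic in the score are all assumptions made for illustration.

```python
def decoy_fdr(matches):
    """Estimate cumulative FDR at each score threshold.

    `matches` is a list of (score, is_decoy) pairs from a combined search
    against the real and randomized databases (hypothetical input layout).
    Returns (score, fdr) pairs ordered from highest to lowest score.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = 0
    decoys = 0
    raw = []
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Assumed estimator: decoy hits approximate the number of false
        # target hits accepted at this threshold.
        raw.append((score, decoys / max(targets, 1)))

    # Enforce monotonicity with a running minimum taken from the lowest
    # score upward (a simple stand-in, not isotonic regression).
    smoothed = []
    best = float("inf")
    for score, fdr in reversed(raw):
        best = min(best, fdr)
        smoothed.append((score, best))
    smoothed.reverse()
    return smoothed


example = [(10, False), (9, False), (8, True), (7, False), (6, False)]
estimates = decoy_fdr(example)
```

Tied scores are not treated specially in this sketch; a real analysis would need to decide how to handle them.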

3.

Validation of endogenous peptide identifications using a database of tandem mass spectra (total citations: 1; self-citations: 0; citations by others: 1)

Fälth M Svensson M Nilsson A Sköld K Fenyö D Andren PE 《Journal of proteome research》2008,7(7):3049-3053

The SwePep database is designed for endogenous peptides and mass spectrometry. It contains information about the peptides such as mass, pI, precursor protein and potential post-translational modifications. Here, we have improved and extended the SwePep database with tandem mass spectra, by adding a locally curated version of the global proteome machine database (GPMDB). In peptidomic experiment practice, many peptide sequences contain multiple tandem mass spectra with different quality. The new tandem mass spectra database in SwePep enables validation of low quality spectra using high quality tandem mass spectra. The validation is performed by comparing the fragmentation patterns of the two spectra using algorithms for calculating the correlation coefficient between the spectra. The present study is the first step in developing a tandem spectrum database for endogenous peptides that can be used for spectrum-to-spectrum identifications instead of peptide identifications using traditional protein sequence database searches.
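
The validation step above compares two spectra through a correlation coefficient. The abstract does not say which algorithm SwePep uses, so the following is only a sketch under stated assumptions: peaks are binned by m/z into fixed-width bins and the Pearson correlation of the binned intensities is reported. The function names, the 1.0 bin width, and the 2000 m/z upper limit are hypothetical choices.

```python
import math


def bin_spectrum(peaks, bin_width=1.0, max_mz=2000.0):
    """Sum peak intensities into fixed-width m/z bins (assumed scheme)."""
    n_bins = int(max_mz / bin_width) + 1
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        index = int(mz / bin_width)
        if 0 <= index < n_bins:
            vector[index] += intensity
    return vector


def pearson(x, y):
    """Pearson correlation of two equal-length vectors; 0.0 if one is flat."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def spectrum_correlation(peaks_a, peaks_b, bin_width=1.0, max_mz=2000.0):
    """Correlation between two spectra given as (m/z, intensity) lists."""
    return pearson(
        bin_spectrum(peaks_a, bin_width, max_mz),
        bin_spectrum(peaks_b, bin_width, max_mz),
    )


reference = [(100.2, 50.0), (200.7, 100.0), (300.1, 25.0)]
rescaled = [(mz, 2.0 * intensity) for mz, intensity in reference]
unrelated = [(150.0, 50.0), (250.0, 100.0), (350.0, 25.0)]
```

Because Pearson correlation ignores overall scale, `spectrum_correlation(reference, rescaled)` is close to 1, while spectra with no shared bins, such as `unrelated`, score near zero.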

4.

Population abundance estimation with heterogeneous encounter probabilities using numerical integration

Gary C. White Evan G. Cooch 《The Journal of wildlife management》2017,81(2):322-336

5.

Prediction of error associated with false-positive rate determination for peptide identification in large-scale proteomics experiments using a combined reverse and forward peptide sequence database strategy

Huttlin EL Hegeman AD Harms AC Sussman MR 《Journal of proteome research》2007,6(1):392-398

In recent years, a variety of approaches have been developed using decoy databases to empirically assess the error associated with peptide identifications from large-scale proteomics experiments. We have developed an approach for calculating the expected uncertainty associated with false-positive rate determination using concatenated reverse and forward protein sequence databases. After explaining the theoretical basis of our model, we compare predicted error with the results of experiments characterizing a series of mixtures containing known proteins. In general, results from characterization of known proteins show good agreement with our predictions. Finally, we consider how these approaches may be applied to more complicated data sets, as when peptides are separated by charge state prior to false-positive determination.

6.

Protein probabilities in shotgun proteomics: evaluating different estimation methods using a semi-random sampling model (total citations: 3; self-citations: 0; citations by others: 3)

Xue X Wu S Wang Z Zhu Y He F 《Proteomics》2006,6(23):6134-6145

The calculation of protein probabilities is one of the most intractable problems in large-scale proteomic research. Current available estimating methods, for example, ProteinProphet, PROT_PROBE, Poisson model and two-peptide hits, employ different models trying to resolve this problem. Until now, no efficient method is used for comparative evaluation of the above methods in large-scale datasets. In order to evaluate these various methods, we developed a semi-random sampling model to simulate large-scale proteomic data. In this model, the identified peptides were sampled from the designed proteins and their cross-correlation scores were simulated according to the results from reverse database searching. The simulated result of 18 control proteins was consistent with the experimental one, demonstrating the efficiency of our model. According to the simulated results of human liver sample, ProteinProphet returned slightly higher probabilities and lower specificity than real cases. PROT_PROBE was a more efficient method with higher specificity. Predicted results from a Poisson model roughly coincide with real datasets, and the method of two-peptide hits seems solid but imprecise. However, the probabilities of identified proteins are strongly correlated with several experimental factors including spectra number, database size and protein abundance distribution.

7.

Peptide identification using peptide amino acid attribute vectors

Halligan BD Dratz EA Feng X Twigger SN Tonellato PJ Greene AS 《Journal of proteome research》2004,3(4):813-820

We describe the theoretical basis for a peptide identification method wherein peptides are represented as vectors based on their amino acid composition and grouped into clusters. Unknown peptides are identified by finding the database cluster and peptide entries with the shortest Euclidian distance. We demonstrate that the amino acid composition of peptides is virtually as informative as the sequence and allows rapid peptide identification more accurately than peptide mass alone.
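
The method above represents each peptide as a vector of its amino acid composition and identifies an unknown peptide by shortest Euclidean distance. A minimal sketch of the distance lookup follows. It skips the clustering stage and assumes the query composition is already known. The names `composition_vector` and `nearest_peptide`, the use of raw residue counts as the vector, and the example peptides are illustrative assumptions, not the authors' attribute definition.

```python
import math

# The 20 standard amino acids, one vector dimension per residue type.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(peptide):
    """Count each standard residue in the peptide (assumed attribute vector)."""
    return [peptide.count(residue) for residue in AMINO_ACIDS]


def nearest_peptide(query_vector, database):
    """Return (peptide, distance) for the database entry closest to the query.

    This is a linear scan; the clustering described in the abstract would
    narrow the search first and is not reproduced here.
    """
    best_peptide = None
    best_distance = float("inf")
    for peptide in database:
        distance = math.dist(query_vector, composition_vector(peptide))
        if distance < best_distance:
            best_peptide = peptide
            best_distance = distance
    return best_peptide, best_distance


database = ["PEPTIDE", "SAMPLER", "ELVISK"]
# A permutation of PEPTIDE has the same composition, so the distance is 0.
match, distance = nearest_peptide(composition_vector("EDITPEP"), database)
```

The example also shows the limit of the representation: peptides that are permutations of each other share one vector, so composition alone cannot separate them.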

8.

Posterior probabilities for a change-point using ranks (total citations: 1; self-citations: 0; citations by others: 1)

PETTITT A. N. 《Biometrika》1981,68(2):443-450

9.

The construction of a bioactive peptide database in Metazoa

Liu F Baggerman G Schoofs L Wets G 《Journal of proteome research》2008,7(9):4119-4131

Bioactive peptides play critical roles in regulating most biological processes in animals, and have considerable biological, medical and industrial importance. A number of peptides have been discovered usually based on their biological activities in vitro or based on their sequence similarities in silico. Through searches in Swiss-Prot and TrEMBL protein databases using BLAST alignment tools and other in silico methods, all currently known bioactive peptides and their precursor proteins are extracted. In addition, 132 recently discovered putative peptide genes in Drosophila as well as their orthologs in other species are collected. In total, 20 027 bioactive peptides from 19 438 precursor proteins covering 2820 metazoan species are retained, and they, respectively, make up a peptide and a peptide precursor database. The peptides and peptide precursor proteins are further classified into 373 families, 178 of which are represented by Prosite, Pfam or Smart motifs, or by typical peptide motifs that have been constructed recently. The remaining 195 families are novel peptide families. The motifs characterizing the 178 peptide families are saved into a peptide motif database. The peptide, peptide precursor and peptide motif databases (version 1.0) are the most complete peptide, precursor and peptide motif collection in Metazoa so far. They are available on the WWW at http://www.peptides.be/.

10.

Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification

Li Honglan Joh Yoon Sung Kim Hyunwoo Paek Eunok Lee Sang-Won Hwang Kyu-Baek 《BMC genomics》2016,17(13):1031-162

11.

Large improvements in MS/MS-based peptide identification rates using a hybrid analysis

Cannon WR Rawlins MM Baxter DJ Callister SJ Lipton MS Bryant DA 《Journal of proteome research》2011,10(5):2306-2317

We report a hybrid search method combining database and spectral library searches that allows for a straightforward approach to characterizing the error rates from the combined data. Using these methods, we demonstrate significantly increased sensitivity and specificity in matching peptides to tandem mass spectra. The hybrid search method increased the number of spectra that can be assigned to a peptide in a global proteomics study by 57-147% at an estimated false discovery rate of 5%, with clear room for even greater improvements. The approach combines the general utility of using consensus model spectra typical of database search methods with the accuracy of the intensity information contained in spectral libraries. A common scoring metric based on recent developments linking data analysis and statistical thermodynamics is used, which allows the use of a conservative estimate of error rates for the combined data. We applied this approach to proteomics analysis of Synechococcus sp. PCC 7002, a cyanobacterium that is a model organism for studies of photosynthetic carbon fixation and biofuels development. The increased specificity and sensitivity of this approach allowed us to identify many more peptides involved in the processes important for photoautotrophic growth.

12.

PEAKS DB: de novo sequencing assisted database search for sensitive and accurate peptide identification

Zhang J Xin L Shan B Chen W Xie M Yuen D Zhang W Zhang Z Lajoie GA Ma B 《Molecular & cellular proteomics : MCP》2012,11(4):M111.010587

Many software tools have been developed for the automated identification of peptides from tandem mass spectra. The accuracy and sensitivity of the identification software via database search are critical for successful proteomics experiments. A new database search tool, PEAKS DB, has been developed by incorporating the de novo sequencing results into the database search. PEAKS DB achieves significantly improved accuracy and sensitivity over two other commonly used software packages. Additionally, a new result validation method, decoy fusion, has been introduced to solve the issue of overconfidence that exists in the conventional target decoy method for certain types of peptide identification software.

13.

Efficiency improvement of peptide identification for an organism without complete genome sequence, using expressed sequence tag database and tandem mass spectral data

Kwon KH Kim M Kim JY Kim KW Kim SI Park YM Yoo JS 《Proteomics》2003,3(12):2305-2309

We compared peptide identification by database (DB) search methods with de novo sequencing results for proteomics study in an organism without genome sequence information. When the former was done by searching the Expressed Sequence Tag (EST) DB of the sample organism or the NCBI nonredundant (nr) protein DB of green plants using either the MASCOT or SEQUEST software program, it was confirmed that the former is as accurate as the latter. Peptides identified from EST DB were twice as many as those from the nr protein DB, in spite of the fact that the EST DB has less data (26 222 EST) than the NCBI nr protein DB (224 238). This study demonstrates that EST DB with tandem mass spectra can be used reliably for high-throughput proteomics studies in an organism without genome information.

14.

Stochastic inequality probabilities for adaptively randomized clinical trials

Cook JD Nadarajah S 《Biometrical journal. Biometrische Zeitschrift》2006,48(3):356-365

We examine stochastic inequality probabilities of the form P (X > Y) and P (X > max (Y, Z)) where X, Y, and Z are random variables with beta, gamma, or inverse gamma distributions. We discuss the applications of such inequality probabilities to adaptively randomized clinical trials as well as methods for calculating their values.
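
The abstract does not specify the calculation methods the authors discuss, so as a rough illustration of the quantity P(X > Y) for beta-distributed X and Y, here is a plain Monte Carlo sketch. The function name `prob_x_greater_y`, the default draw count, and the fixed seed are assumptions; simulation is used here only as a simple stand-in for whatever methods the paper covers.

```python
import random


def prob_x_greater_y(alpha_x, beta_x, alpha_y, beta_y, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(X > Y) for independent beta variables.

    X ~ Beta(alpha_x, beta_x) and Y ~ Beta(alpha_y, beta_y). A seeded
    generator keeps the estimate reproducible between runs.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        if rng.betavariate(alpha_x, beta_x) > rng.betavariate(alpha_y, beta_y):
            wins += 1
    return wins / n_draws


# Identical distributions: the probability is 1/2 by symmetry.
same_arms = prob_x_greater_y(1, 1, 1, 1)
# Beta(2, 1) against Beta(1, 2): the exact value works out to 5/6.
better_arm = prob_x_greater_y(2, 1, 1, 2)
```

The second case can be checked by hand: with density 2x for X and distribution function 2y - y^2 for Y, the integral of 2x(2x - x^2) over [0, 1] is 4/3 - 1/2 = 5/6. In an adaptively randomized trial, X and Y would typically be posterior distributions of response rates for two arms.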

15.

Rapid identification of high affinity peptide ligands using positional scanning synthetic peptide combinatorial libraries.

C Pinilla J R Appel P Blanc R A Houghten 《BioTechniques》1992,13(6):901-905

We describe here a conceptually unique set of individual synthetic peptide combinatorial libraries (SPCLs), termed a positional scanning SPCL (PS-SPCL), that can be used for the rapid (i.e., a single day) identification of peptide sequences that bind with high affinity to antibodies, receptors or other acceptor molecules. The PS-SPCL described here is made up of six individual positional peptide libraries, each one consisting of hexamers with a single position defined and five positions as mixtures. As an example of the utility of such PS-SPCLs, the antigenic determinants recognized by two different monoclonal antibodies were correctly identified upon a single screening.

16.

Cross-validation in nonparametric estimation of probabilities and probability densities (total citations: 3; self-citations: 0; citations by others: 3)

BOWMAN ADRIAN W.; HALL PETER; TITTERINGTON D. M. 《Biometrika》1984,71(2):341-351

17.

A hybrid method for peptide identification using integer linear optimization, local database search, and quadrupole time-of-flight or OrbiTrap tandem mass spectrometry

DiMaggio PA Floudas CA Lu B Yates JR 《Journal of proteome research》2008,7(4):1584-1593

A novel hybrid methodology for the automated identification of peptides via de novo integer linear optimization, local database search, and tandem mass spectrometry is presented in this article. A modified version of the de novo identification algorithm PILOT is utilized to construct accurate de novo peptide sequences. A modified version of the local database search tool FASTA is used to query these de novo predictions against the nonredundant protein database to resolve any low-confidence amino acids in the candidate sequences. The computational burden associated with performing several alignments is alleviated with the use of distributive computing. Extensive computational studies are presented for this new hybrid methodology, as well as comparisons with MASCOT for a set of 38 quadrupole time-of-flight (QTOF) and 380 OrbiTrap tandem mass spectra. The results for our proposed hybrid method for the OrbiTrap spectra are also compared with a modified version of PepNovo, which was trained for use on high-precision tandem mass spectra, and the tag-based method InsPecT. The de novo sequences of PILOT and PepNovo are also searched against the nonredundant protein database using CIDentify to compare with the alignments achieved by our modifications of FASTA. The comparative studies demonstrate the excellent peptide identification accuracy gained from combining the strengths of our de novo method, which is based on integer linear optimization, and database driven search methods.

18.

Comprehensive bibliography database using a microcomputer

P J R Harkin 《BMJ (Clinical research ed.)》1986,293(6539):136-137

19.

PhoPepMass: A database and search tool assisting human phosphorylation peptide identification from mass spectrometry data

Menghuan Zhang Hui Cui Lanming Chen Ying Yu Michael O. Glocker Lu Xie 《Journal of Genetics and Genomics (遗传学报)》2018,45(7):381-388

Protein phosphorylation, one of the most important protein post-translational modifications, is involved in various biological processes, and the identification of phosphorylation peptides (phosphopeptides) and their corresponding phosphorylation sites (phosphosites) will facilitate the understanding of the molecular mechanism and function of phosphorylation. Mass spectrometry (MS) provides a high-throughput technology that enables the identification of large numbers of phosphosites. PhoPepMass is designed to assist human phosphopeptide identification from MS data based on a specific database of phosphopeptide masses and a multivariate hypergeometric matching algorithm. It contains 244,915 phosphosites from several public sources. Moreover, the accurate masses of peptides and fragments with phosphosites were calculated. It is the first database that provides a systematic resource for the query of phosphosites on peptides and their corresponding masses. This allows researchers to search certain proteins of which phosphosites have been reported, to browse detailed phosphopeptide and fragment information, to match masses from MS analyses with defined threshold to the corresponding phosphopeptide, and to compare proprietary phosphopeptide discovery results with results from previous studies. Additionally, a database search software is created and a “two-stage search strategy” is suggested to identify phosphopeptides from tandem mass spectra of proteomics data. We expect PhoPepMass to be a useful tool and a source of reference for proteomics researchers. PhoPepMass is available at https://www.scbit.org/phopepmass/index.html.

20.

Genome-wide identification and extensive analysis of rice-endosperm preferred genes using reference expression database

Woo-Jong Hong Yo-Han Yoo Sun-A Park Sunok Moon Sung-Ruyl Kim Gynheung An Ki-Hong Jung 《Journal of Plant Biology》2017,60(3):249-258
