Similar articles
Found 20 similar articles; search took 15 ms
1.
2.
Adamczak R  Porollo A  Meller J 《Proteins》2005,59(3):467-475
Owing to the use of evolutionary information and advanced machine learning protocols, secondary structures of amino acid residues in proteins can be predicted from the primary sequence with more than 75% per-residue accuracy for the 3-state (i.e., helix, beta-strand, and coil) classification problem. In this work we investigate whether further progress may be achieved by incorporating the relative solvent accessibility (RSA) of an amino acid residue as a fingerprint of the overall topology of the protein. Toward that goal, we developed a novel method for secondary structure prediction that uses predicted RSA in addition to attributes derived from evolutionary profiles. Our general approach follows the 2-stage protocol of Rost and Sander, with a number of Elman-type recurrent neural networks (NNs) combined into a consensus predictor. The RSA is predicted using our recently developed regression-based method that provides real-valued RSA, with the overall correlation coefficients between the actual and predicted RSA of about 0.66 in rigorous tests on independent control sets. Using the predicted RSA, we were able to improve the performance of our secondary structure prediction by up to 1.4% and achieved the overall per-residue accuracy between 77.0% and 78.4% for the 3-state classification problem on different control sets comprising, together, 603 proteins without homology to proteins included in the training. The effects of including solvent accessibility depend on the quality of RSA prediction. In the limit of perfect prediction (i.e., when using the actual RSA values derived from known protein structures), the accuracy of secondary structure prediction increases by up to 4%. We also observed that projecting real-valued RSA into 2 discrete classes with the commonly used threshold of 25% RSA decreases the classification accuracy for secondary structure prediction. 
While the level of improvement of secondary structure prediction may be different for prediction protocols that implicitly account for RSA in other ways, we conclude that an increase in the 3-state classification accuracy may be achieved when combining RSA with a state-of-the-art protocol utilizing evolutionary profiles. The new method is available through a Web server at http://sable.cchmc.org.
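The two quantities this abstract reports, per-residue 3-state accuracy and the projection of real-valued RSA into two classes at a 25% threshold, can be illustrated with a minimal sketch. The function names and the choice to treat exactly 25% as "exposed" are assumptions made for illustration; they are not taken from the authors' method or the SABLE server.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted 3-state label (H, E or C)
    matches the observed label. Both strings must have equal length."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def rsa_to_two_state(rsa_percent: float, threshold: float = 25.0) -> str:
    """Project a real-valued RSA (in percent) into two classes.
    Assumption: values at or above the threshold count as exposed."""
    return "exposed" if rsa_percent >= threshold else "buried"


if __name__ == "__main__":
    print(q3_accuracy("HHEC", "HHCC"))
    print(rsa_to_two_state(30.0), rsa_to_two_state(10.0))
```

The thresholding step discards how far a residue lies from the cut-off, which is one plausible reading of why the abstract reports lower accuracy with the two-class projection than with real-valued RSA.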

3.

Background  

Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio.

4.
Zhang H  Zhang T  Gao J  Ruan J  Shen S  Kurgan L 《Amino acids》2012,42(1):271-283
Proteins fold through a two-state (TS) process, with no visible intermediates, or a multi-state (MS) process, via at least one intermediate. We analyze sequence-derived factors that determine folding types by introducing a novel sequence-based folding type predictor called FOKIT. This method implements a logistic regression model with six input features which hybridize information concerning amino acid composition and predicted secondary structure and solvent accessibility. FOKIT provides predictions with average Matthews correlation coefficient (MCC) between 0.58 and 0.91 measured using out-of-sample tests on four benchmark datasets. These results are shown to be competitive with or better than the results of four modern predictors. We also show that FOKIT outperforms these methods when predicting chains that share low similarity with the chains used to build the model, which is an important advantage given the limited number of annotated chains. We demonstrate that inclusion of solvent accessibility helps in discrimination of the folding kinetic types and that three of the features constitute statistically significant markers that differentiate TS and MS folders. We found that increased content of exposed Trp and buried Leu is indicative of MS folding, which implies that the exposure/burial of certain hydrophobic residues may play an important role in the formation of the folding intermediates. Our conclusions are supported by two case studies.
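For readers unfamiliar with the metric quoted above, the following is a minimal sketch of the Matthews correlation coefficient computed from a binary confusion matrix. The function name and the convention of returning 0.0 when a marginal total is zero are assumptions for illustration; nothing here reproduces FOKIT itself.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from the four cells of a binary confusion matrix.
    Ranges from -1 (total disagreement) to +1 (perfect prediction)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, e.g. MS folders as the positive class.
    print(matthews_corrcoef(tp=10, tn=10, fp=0, fn=0))
    print(matthews_corrcoef(tp=5, tn=5, fp=5, fn=5))
```

Unlike plain accuracy, MCC uses all four cells, so it stays informative when the TS and MS classes are unevenly represented.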

5.
Garg A  Kaur H  Raghava GP 《Proteins》2005,61(2):318-324
The present study is an attempt to develop a neural network-based method for predicting the real value of solvent accessibility from the sequence using evolutionary information in the form of multiple sequence alignment. In this method, two feed-forward networks with a single hidden layer have been trained with standard back-propagation as a learning algorithm. The Pearson's correlation coefficient increases from 0.53 to 0.63, and the mean absolute error (MAE) decreases from 18.2% to 16%, when a multiple sequence alignment obtained from PSI-BLAST is used as input instead of a single sequence. The performance of the method further improves from a correlation coefficient of 0.63 to 0.67 when secondary structure information predicted by PSIPRED is incorporated in the prediction. The final network yields a mean absolute error of 15.2% between the experimental and predicted values when tested on two different nonhomologous and nonredundant datasets of varying sizes. The method consists of two steps: (1) in the first step, a sequence-to-structure network is trained with the multiple alignment profiles in the form of PSI-BLAST-generated position-specific scoring matrices, and (2) in the second step, the output obtained from the first network and PSIPRED-predicted secondary structure information are used as input to the second, structure-to-structure network. Based on the present study, a server, SARpred (http://www.imtech.res.in/raghava/sarpred/), has been developed that predicts the real value of solvent accessibility of residues for a given protein sequence. We have also evaluated the performance of SARpred on 47 proteins used in CASP6 and achieved a correlation coefficient of 0.68 and an MAE of 15.9% between predicted and observed values.
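The two evaluation measures quoted in this abstract, Pearson's correlation coefficient and mean absolute error between predicted and observed accessibility, can be sketched as follows. The function names and the toy values are hypothetical; this is only an illustration of the metrics, not of SARpred's networks.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(x: Sequence[float], y: Sequence[float]) -> float:
    """Average absolute difference, in the same units as the inputs."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


if __name__ == "__main__":
    observed = [10.0, 20.0, 30.0]   # hypothetical RSA values, in percent
    predicted = [12.0, 18.0, 33.0]
    print(pearson_r(observed, predicted))
    print(mean_absolute_error(observed, predicted))
```

Reporting both is useful because correlation is insensitive to a constant offset in the predictions, while MAE captures it directly.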

6.
The global analysis of proteins is now feasible due to improvements in techniques such as two-dimensional gel electrophoresis (2-DE), mass spectrometry, yeast two-hybrid systems and the development of bioinformatics applications. These experiments form the basis of proteomics, and present significant challenges in data analysis, storage and querying. We argue that a standard format for proteome data is required to enable the storage, exchange and subsequent re-analysis of large datasets. We describe the criteria that must be met for the development of a standard for proteomics. We have developed a model to represent data from 2-DE experiments, including difference gel electrophoresis along with image analysis and statistical analysis across multiple gels. This part of proteomics analysis is not represented in current proposals for proteomics standards. We are working with the Proteomics Standards Initiative to develop a model encompassing biological sample origin, experimental protocols, a number of separation techniques and mass spectrometry. The standard format will facilitate the development of central repositories of data, enabling results to be verified or re-analysed, and the correlation of results produced by different research groups using a variety of laboratory techniques.

7.

Background

Estimation of allele frequency is of fundamental importance in population genetic analyses and in association mapping. In most studies using next-generation sequencing, a cost-effective approach is to use medium- or low-coverage data (e.g., < 15X). However, SNP calling and allele frequency estimation in such studies are associated with substantial statistical uncertainty because of varying coverage and high error rates.

Results

We evaluate a new maximum likelihood method for estimating allele frequencies in low and medium coverage next-generation sequencing data. The method is based on integrating over uncertainty in the data for each individual rather than first calling genotypes. This method can be applied to directly test for associations in case/control studies. We use simulations to compare the likelihood method to methods based on genotype calling, and show that the likelihood method outperforms the genotype calling methods in terms of: (1) accuracy of allele frequency estimation, (2) accuracy of the estimation of the distribution of allele frequencies across neutrally evolving sites, and (3) statistical power in association mapping studies. Using real re-sequencing data from 200 individuals obtained from an exon-capture experiment, we show that the patterns observed in the simulations are also found in real data.

Conclusions

Overall, our results suggest that association mapping and estimation of allele frequencies should not be based on genotype calling in low to medium coverage data. Furthermore, if genotype calling methods are used, it is usually better not to filter genotypes based on the call confidence score.
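The idea of integrating over genotype uncertainty rather than calling genotypes first can be sketched with a small expectation-maximization loop. This is a generic illustration under stated assumptions (a biallelic site, Hardy-Weinberg genotype priors, and per-individual genotype likelihoods already computed from the reads); the function name and parameters are hypothetical and this is not presented as the authors' implementation.

```python
from typing import Sequence, Tuple


def em_allele_frequency(
    genotype_likelihoods: Sequence[Tuple[float, float, float]],
    f_init: float = 0.2,
    n_iter: int = 100,
    tol: float = 1e-8,
) -> float:
    """Maximum likelihood estimate of the alternate-allele frequency.

    Each tuple holds P(reads | genotype) for 0, 1 and 2 alternate alleles.
    Assumes 0 < f_init < 1 and at least one positive likelihood per tuple.
    """
    f = f_init
    n = len(genotype_likelihoods)
    for _ in range(n_iter):
        expected_alt = 0.0
        for l0, l1, l2 in genotype_likelihoods:
            # Unnormalized posterior: Hardy-Weinberg prior times likelihood.
            p0 = (1.0 - f) ** 2 * l0
            p1 = 2.0 * f * (1.0 - f) * l1
            p2 = f ** 2 * l2
            total = p0 + p1 + p2
            # E-step: expected alternate-allele count for this individual.
            expected_alt += (p1 + 2.0 * p2) / total
        # M-step: expected alternate alleles over 2n chromosomes.
        f_new = expected_alt / (2.0 * n)
        converged = abs(f_new - f) < tol
        f = f_new
        if converged:
            break
    return f


if __name__ == "__main__":
    # Hypothetical likelihoods for four individuals at one site.
    data = [(0.9, 0.1, 0.0), (0.2, 0.7, 0.1), (0.0, 0.3, 0.7), (0.8, 0.2, 0.0)]
    print(em_allele_frequency(data))
```

When every individual's likelihood puts all its weight on one genotype, the estimate reduces to simple allele counting; with flat likelihoods the data carry no information and the estimate stays at its starting value.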

8.
The prediction of loop regions in the process of protein structure prediction by homology is still an unsolved problem. In an earlier publication, we showed that the correct placement of the amino acids serving as anchor groups to be connected by a loop fragment with a predicted geometry is a highly important step and an essential requirement within the process (Lessel and Schomburg, Proteins 1999; 37:56-64). In this article, we present an analysis of the quality of possible loop predictions with respect to gap length, fragment length, amino acid type, secondary structure, and solvent accessibility. For 550 insertions and 544 deletions, we test all possible positions for anchor groups with an inserted loop of a length between 3 and 12 amino acids. We show that approximately 80% of the indel regions could be predicted within 1.5 Å RMSD from a knowledge-based loop database if criteria for the correct localization of anchor groups could be found and the loops could be sorted correctly. From our analysis, several conclusions regarding the optimal placement of anchor groups become obvious: (1) the correct placement of anchor groups is even more important for longer gap lengths, (2) medium-length fragments (length 5-8) perform better than short or long ones, (3) the placement of anchor groups at hydrophobic amino acids gives a higher chance of including the best possible loop, (4) anchor groups within secondary structure elements, in particular beta-sheets, are suitable, and (5) amino acids with lower solvent accessibility are better anchor groups. A preliminary test using a combination of the anchor group positioning criteria deduced from our analysis shows very promising results.

9.
E R Wohlfeil  R A Hudson 《Biochemistry》1991,30(29):7231-7241
The heterobifunctional organomercurial reagents 3-(acetoxymercurio)- and 3-(chloromercurio)-5-nitrosalicylaldehyde were prepared, characterized in model studies, and used to probe the interaction between cobratoxin, purified from the venom of the Thailand cobra (Naja naja siamensis), and the affinity-purified nicotinic acetylcholine receptor (AcChR) from Torpedo californica electroplax. These reagents may also be useful in introducing chemically well-defined heavy metal atoms into proteins containing no reactive thiols. Model reagent adducts were prepared in situ by reductive amination with N-butylamine and Nα-acetyllysine-N-methylamide. The nitrophenolic pKa values of the amine adducts were similar to those of the aldehyde reagents, though reduced by 1.3-1.5 units when compared with the hydroxymethyl reduction product. Reaction of either mercuriosalicylaldehyde with cobratoxin led to a single major modification product incorporating 1 mol of the reagent into cobratoxin at Lys 23. The Lys 23 modified toxin had a reduced binding affinity for the AcChR compared with that of the native toxin (Kd 2.75 nM cf. 0.3 nM). Reduction of the purified AcChR with 1 mM dithiothreitol (DTT) followed by removal of excess thiol led to cross-linking of the Lys 23 modified cobratoxin to both the alpha and beta subunits of the AcChR complex. Reaction of DTT-treated AcChR with N-ethylmaleimide (NEM) blocked cross-linking, while treatment of the initially cross-linked toxin-AcChR complex with mercaptoethanol led to reversal of cross-linking. These observations strongly support cross-linking mediated by the formation of a mercury-sulfur bond and further lend support to the identity of the respective interacting sites in AcChR and toxin.

10.
Platelet-activating factor receptor (PAFR) is a member of the G-protein-coupled receptor (GPCR) superfamily. Understanding the regulation mechanisms of PAFR by its agonists and antagonists at the atomic level is essential for designing PAFR antagonists as drug candidates for treating PAF-mediated diseases. In this study, a 3D model of PAFR was constructed by a hierarchical approach integrating homology modeling, molecular docking and molecular dynamics (MD) simulations. Based on the 3D model, regulation mechanisms of PAFR by agonists and antagonists were investigated via three 8-ns MD simulations on the systems of apo-PAFR, PAFR-PAF and PAFR-GB. The simulations revealed that binding of PAF to PAFR triggers the straightening process of the kinked helix VI, leading to its activated state. In contrast, binding of GB to PAFR locks PAFR in its inactive state.

11.
Nicotinic acetylcholine receptors (AChRs) immunoaffinity-purified from brains are composed of only two kinds of subunits rather than the four kinds present in muscle-type AChRs. Here we report the N-terminal protein sequences of the structural subunits of AChRs from rat and chicken brains and the cloning of full-length cDNAs for the chicken brain AChR structural subunit. Previously, the N-terminal amino acid sequence of the ACh-binding subunit of AChR immunoaffinity-purified from rat brain was shown to correspond to the cDNA alpha 4. Thus, cDNA sequences are now known for both of the subunits that form one AChR subtype in vivo.

12.
GABA(A) receptors (GABA(A)Rs) are ligand-gated chloride ion channels that mediate overall inhibitory signaling in the CNS. A detailed understanding of their structure is important to gain insights into, e.g., ligand binding and functional properties of this pharmaceutically important target. Homology modeling is a necessary tool in this regard because experimentally determined structures are lacking. Here we present an exhaustive approach for creating a high quality model of the α(1)β(2)γ(2) subtype of the GABA(A)R ligand binding domain, and we demonstrate its usefulness in understanding details of orthosteric ligand binding. The model was constructed by using multiple templates and by incorporation of knowledge from biochemical/pharmacological experiments. It was validated on the basis of objective energy functions, its ability to account for available residue specific information, and its stability in molecular dynamics (MD) compared with that of the two homologous crystal structures. We then combined the model with extensive structure-activity relationships available from two homologous series of orthosteric GABA(A)R antagonists to create a detailed hypothesis for their binding modes. Excellent agreement with key experimental data was found, including the ability of the model to accommodate and explain a previously developed pharmacophore model. A coupling to agonist binding was thereby established and discussed in relation to activation mechanisms. Our results highlight the importance of critical evaluation and optimization of each step in the homology modeling process. The approach taken here can greatly aid in increasing the understanding of GABA(A)Rs and related receptors where structural insight is limited and reliable models are difficult to obtain.

13.
To explore the spatial organization and functional dynamics of the citrate transport protein (CTP), a nitroxide scan was carried out along 22 consecutive residues within the fourth transmembrane domain (TMDIV). This domain has been implicated as being of unique importance to the CTP mechanism due to (i) the presence of two intramembranous positive charges that are essential for CTP function and (ii) the existence of a transmembrane aqueous surface within this domain which likely corresponds to a portion of the citrate translocation pathway. The sequence-specific variation in the mobilities of the introduced nitroxides and their accessibilities to molecular O2 reveal an alpha-helical conformation along the sequence. The accessibilities to NiEDDA are out of phase with accessibilities to O2, indicating that one face of the helix is solvated by the lipid bilayer while the other is solvated by an aqueous environment. A gradient of NiEDDA accessibility is observed along the helix surface facing the aqueous phase, and the EPR spectral line shapes at these sites indicate considerable motional restriction. In the context of the model where TMDIV lines the translocation pathway, these data suggest a barrier to passive diffusion through the pathway. This paper reports the first use of site-directed spin labeling to study mitochondrial transporter structure.

14.
A new subunit, beta 2, of the neuronal nicotinic receptor family has been identified. This subunit has the structural features of a non-agonist-binding subunit. We provide evidence that beta 2 can substitute for the muscle beta 1 subunit to form a functional nicotinic receptor in Xenopus oocytes. Expression studies performed in oocytes have demonstrated that three different neuronal nicotinic acetylcholine receptors can be formed by the pairwise injection of beta 2 mRNA and each of the neuronal alpha subunit mRNAs. The beta 2 gene is expressed in PC12 cells and in areas of the central nervous system where the alpha 2, alpha 3, and alpha 4 genes are expressed. These results lead us to propose that the nervous system expresses diverse forms of neuronal nicotinic acetylcholine receptors by combining beta 2 subunits with different agonist-binding alpha subunits.

15.
A simple method is presented for projecting the conformation of extended secondary structure elements of peptides and proteins that extend over four Cα atoms onto a simple two-dimensional surface. A new set of two degrees of freedom is defined: a pseudo-dihedral involving four sequential Cα atoms, and the triple scalar product of the vectors describing the orientation of the three intervening peptide groups. The method provides a reduction in dimensionality, from the usual combination of multiple ϕ,ψ pairs to a single pair, yielding valuable information concerning the structure and dynamics of these important elements. The new two-dimensional surface is explored by reference to 63 selected protein crystal structures together with a comparison of model-built peptides representing the common secondary structural elements. Dynamical aspects on this new surface are examined using a molecular dynamics trajectory of Basic Pancreatic Trypsin Inhibitor. © 1997 Wiley-Liss, Inc.
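The first of the two coordinates described above, a pseudo-dihedral over four sequential Cα atoms, can be computed with the standard dihedral-angle formula. The sketch below assumes Cartesian coordinates given as 3-tuples and uses a hypothetical function name; the sign convention is simply the one that falls out of this formula, and the second coordinate (the triple scalar product of peptide-group orientation vectors) is not attempted because the abstract does not define those vectors.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def pseudo_dihedral(ca1: Vec, ca2: Vec, ca3: Vec, ca4: Vec) -> float:
    """Dihedral angle in degrees, in (-180, 180], defined by four points."""
    b1 = _sub(ca2, ca1)
    b2 = _sub(ca3, ca2)
    b3 = _sub(ca4, ca3)
    n1 = _cross(b1, b2)  # normal of the plane through points 1, 2, 3
    n2 = _cross(b2, b3)  # normal of the plane through points 2, 3, 4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates: four coplanar points in a trans arrangement.
    print(pseudo_dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0)))
```

Using `atan2` rather than `acos` keeps the sign of the angle and avoids numerical trouble near 0 and 180 degrees.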

16.
Recently, several experimental techniques have emerged for probing RNA structures based on high-throughput sequencing. However, most secondary structure prediction tools that incorporate probing data are designed and optimized for particular types of experiments. For example, RNAstructure-Fold is optimized for SHAPE data, while SeqFold is optimized for PARS data. Here, we report a new RNA secondary structure prediction method, restrained MaxExpect (RME), which can incorporate multiple types of experimental probing data and is based on a free energy model and an MEA (maximizing expected accuracy) algorithm. We first demonstrated that RME substantially improved secondary structure prediction with perfect restraints (base pair information of known structures). Next, we collected structure-probing data from diverse experiments (e.g. SHAPE, PARS and DMS-seq) and transformed them into a unified set of pairing probabilities with a posterior probabilistic model. By using the probability scores as restraints in RME, we compared its secondary structure prediction performance with two other well-known tools, RNAstructure-Fold (based on a free energy minimization algorithm) and SeqFold (based on a sampling algorithm). For SHAPE data, RME and RNAstructure-Fold performed better than SeqFold, because they markedly altered the energy model with the experimental restraints. For high-throughput data (e.g. PARS and DMS-seq) with lower probing efficiency, the secondary structure prediction performances of the tested tools were comparable, with performance improvements for only a portion of the tested RNAs. However, when the effects of tertiary structure and protein interactions were removed, RME showed the highest prediction accuracy in the DMS-accessible regions by incorporating in vivo DMS-seq data.

17.
Alcohol and nicotine are coabused, and preclinical and clinical data suggest that common genes may influence responses to both drugs. A gene in a region of mouse chromosome 9 that includes a cluster of three nicotinic acetylcholine receptor (nAChR) subunit genes influences the locomotor stimulant response to ethanol. The current studies first used congenic mice to confirm the influential gene on chromosome 9. Congenic F2 mice were then used to more finely map the location. Gene expression of the three subunit genes was quantified in strains of mice that differ in response to ethanol. Finally, the locomotor response to ethanol was examined in mice heterozygous for a null mutation of the α3 nAChR subunit gene (Chrna3). Congenic data indicate that a gene on chromosome 9, within a 46 cM region that contains the cluster of nAChR subunit genes, accounts for 41% of the genetic variation in the stimulant response to ethanol. Greater expression of Chrna3 was found in whole brain and dissected brain regions relevant to locomotor behavior in mice that were less sensitive to ethanol-induced stimulation compared to mice that were robustly stimulated; the other two nAChR subunit genes in the gene cluster (α5 and β4) were not differentially expressed. Locomotor stimulation was not expressed on the genetic background of Chrna3 heterozygous (+/−) and wild-type (+/+) mice; +/− mice were more sensitive than +/+ mice to the locomotor depressant effects of ethanol. Chrna3 is a candidate gene for the acute locomotor stimulant response to ethanol that deserves further examination.

18.
We report the isolation and sequence of a cDNA clone that encodes a locust (Schistocerca gregaria) nervous system nicotinic acetylcholine receptor (AChR) subunit (alpha L1). The calculated molecular weight of the unglycosylated polypeptide, which contains in the proposed extracellular domain two adjacent cysteine residues which are characteristic of alpha (ligand binding) subunits, is 60,641 daltons. Injection into Xenopus oocytes, of RNA synthesized from this clone in vitro, results in expression of functional nicotinic receptors in the oocyte membrane. In these, nicotine opens a cation channel; the receptors are blocked by both alpha-bungarotoxin (alpha-Bgt) and kappa-bungarotoxin (kappa-Bgt). Reversible block of the expressed insect AChR by mecamylamine, d-tubocurarine, tetraethylammonium, bicuculline and strychnine has also been observed. These data are entirely consistent with previously reported electrophysiological studies on in vivo insect nicotinic receptors and also with biochemical studies on an alpha-Bgt affinity purified locust AChR. Thus, a functional receptor exhibiting the characteristic pharmacology of an in vivo insect nicotinic AChR can be expressed in Xenopus oocytes by injection with a single subunit RNA.

19.
20.
Mansoor SE  McHaourab HS  Farrens DL 《Biochemistry》1999,38(49):16383-16393
We report an investigation of how much protein structural information could be obtained using a site-directed fluorescence labeling (SDFL) strategy. In our experiments, we used 21 consecutive single-cysteine substitution mutants in T4 lysozyme (residues T115-K135), located in a helix-turn-helix motif. The mutants were labeled with the fluorescent probe monobromobimane and subjected to an array of fluorescence measurements. Thermal stability measurements show that introduction of the label is substantially perturbing only when it is located at buried residue sites. At buried sites (solvent surface accessibility of <40 Å²), the destabilizations are between 3 and 5.5 kcal/mol, whereas at more exposed sites, ΔΔG values of ≤1.5 kcal/mol are obtained. Of all the fluorescence parameters that were explored (excitation λmax, emission λmax, fluorescence lifetime, quantum yield, and steady-state anisotropy), the emission λmax and the steady-state anisotropy values most accurately reflect the solvent surface accessibility at each site as calculated from the crystal structure of cysteine-less T4 lysozyme. The parameters we identify allow the classification of each site as buried, partially buried, or exposed. We find that the variations in these parameters as a function of residue number reflect the sequence-specific secondary structure, the determination of which is a key step for modeling a protein of unknown structure.
