Found 20 similar articles; search took 15 ms
1.
Background
Many proteins contain disordered regions that lack fixed three-dimensional (3D) structure under physiological conditions but have important biological functions. Prediction of disordered regions in protein sequences is important for understanding protein function and in high-throughput determination of protein structures. Machine learning techniques, including neural networks and support vector machines, have been widely used in such predictions. Predictors designed for long disordered regions are usually less successful in predicting short disordered regions. Combining prediction of short and long disordered regions will dramatically increase the complexity of the prediction algorithm and make the predictor unsuitable for large-scale applications. Efficient batch prediction of long disordered regions alone is of greater interest in large-scale proteome studies.
2.
Protein structure prediction methods such as Rosetta search for the lowest energy conformation of the polypeptide chain. However, the experimentally observed native state is at a minimum of the free energy, rather than the energy. The neglect of the missing configurational entropy contribution to the free energy can be partially justified by the assumption that the entropies of alternative folded states, while very much less than unfolded states, are not too different from one another, and hence can be to a first approximation neglected when searching for the lowest free energy state. The shortcomings of current structure prediction methods may be due in part to the breakdown of this assumption. Particularly problematic are proteins with significant disordered regions which do not populate single low energy conformations even in the native state. We describe two approaches within the Rosetta structure modeling methodology for treating such regions. The first does not require advance knowledge of the regions likely to be disordered; instead these are identified by minimizing a simple free energy function used previously to model protein folding landscapes and transition states. In this model, residues can be either completely ordered or completely disordered; they are considered disordered if the gain in entropy outweighs the loss of favorable energetic interactions with the rest of the protein chain. The second approach requires identification in advance of the disordered regions either from sequence alone using for example the DISOPRED server or from experimental data such as NMR chemical shifts. During Rosetta structure prediction calculations the disordered regions make only unfavorable repulsive contributions to the total energy. We find that the second approach has greater practical utility and illustrate this with examples from de novo structure prediction, NMR structure calculation, and comparative modeling.
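The two-state idea in this abstract (each residue is either fully ordered or fully disordered, and it is disordered when the entropy gain outweighs the lost favorable interactions) can be illustrated with a toy calculation. The sketch below is not the Rosetta implementation: the pairwise `contact` energies, the per-residue `entropy` gains, the `temperature` parameter, and the greedy flipping scheme are all assumptions made for illustration.

```python
from typing import List, Tuple


def free_energy(ordered: List[bool], contact: List[List[float]],
                entropy: List[float], temperature: float = 1.0) -> float:
    """Toy free energy: contact energy between ordered residue pairs
    minus temperature times the entropy gained by disordered residues."""
    n = len(ordered)
    energy = sum(contact[i][j]
                 for i in range(n) for j in range(i + 1, n)
                 if ordered[i] and ordered[j])
    entropy_gain = sum(entropy[i] for i in range(n) if not ordered[i])
    return energy - temperature * entropy_gain


def assign_disorder(contact: List[List[float]], entropy: List[float],
                    temperature: float = 1.0) -> Tuple[List[bool], float]:
    """Greedy local minimization (an assumed scheme, not the published one):
    flip one residue at a time and keep the flip only if it lowers G."""
    ordered = [True] * len(entropy)
    best = free_energy(ordered, contact, entropy, temperature)
    improved = True
    while improved:
        improved = False
        for i in range(len(ordered)):
            ordered[i] = not ordered[i]
            g = free_energy(ordered, contact, entropy, temperature)
            if g < best - 1e-12:
                best, improved = g, True
            else:
                ordered[i] = not ordered[i]  # revert the flip
    return ordered, best


# Hypothetical 3-residue chain: residues 0 and 1 share a favorable contact,
# residue 2 has no contacts and therefore gains more by being disordered.
contact = [[0.0, -3.0, 0.0], [-3.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
entropy = [1.0, 1.0, 1.0]
state, g = assign_disorder(contact, entropy)
```

A greedy search only finds a local minimum; for a real landscape an exhaustive or stochastic search would be needed.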
3.
Proteins in general consist not only of globular structural domains (SDs), but also of intrinsically disordered regions (IDRs), i.e. those that do not assume unique three-dimensional structures by themselves. Although IDRs are especially prevalent in eukaryotic proteins, the functions are mostly unknown. To elucidate the functions of IDRs, we first divided eukaryotic proteins into subcellular localizations, identified IDRs by the DICHOT system that accurately divides entire proteins into SDs and IDRs, and examined charge and hydropathy characteristics. On average, mitochondrial proteins have IDRs more positively charged than SDs. Comparison of mitochondrial proteins with orthologous prokaryotic proteins showed that mitochondrial proteins tend to have segments attached at both N and C termini, high fractions of which are IDRs. Segments added to the N-terminus of mitochondrial proteins contain not only signal sequences but also mature proteins and exhibit a positive charge gradient, with the magnitude increasing toward the N-terminus. This finding is consistent with the notion that positively charged residues are added to the N-terminus of proteobacterial proteins so that the extended proteins can be chromosomally encoded and efficiently transported to mitochondria after translation. By contrast, nuclear proteins generally have positively charged SDs and negatively charged IDRs. Among nuclear proteins, DNA-binding proteins have enhanced charge tendencies. We propose that SDs in nuclear proteins tend to be positively charged because of the need to bind to negatively charged nucleotides, while IDRs tend to be negatively charged to interact with other proteins or other regions of the same proteins to avoid premature proteasomal degradation.
4.
Background
The evolutionary rate of a protein is a basic measure of evolution at the molecular level. Previous studies have shown that genes expressed in the brain have significantly lower evolutionary rates than those expressed in somatic tissues.
Results
We study the evolutionary rates of genes expressed in 21 different human brain regions. We find that genes highly expressed in the more recent cortical regions of the brain have lower evolutionary rates than genes highly expressed in subcortical regions. This may partially result from the observation that genes that are highly expressed in cortical regions tend to be highly expressed in subcortical regions, and thus their evolution faces a richer set of functional constraints. The frequency of mammal-specific and primate-specific genes is higher in the highly expressed gene sets of subcortical brain regions than in those of cortical brain regions. The basic inverse correlation between evolutionary rate and gene expression is significantly stronger in brain versus nonbrain tissues, and in cortical versus subcortical regions. Extending upon this cortical/subcortical trend, this inverse correlation is generally more marked for tissues that are located higher along the cranial vertical axis during development, giving rise to the possibility that these tissues are also more evolutionarily recent.
Conclusions
We find that cortically expressed genes are more conserved than subcortical ones, and that gene expression levels exert stronger constraints on sequence evolution in cortical versus subcortical regions. Taken together, these findings suggest that cortically expressed genes are under stronger selective pressure than subcortically expressed genes.
5.
6.
7.
POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions (total citations: 1, self-citations: 0, citations by others: 1)
Hirose S Shimizu K Kanai S Kuroda Y Noguchi T 《Bioinformatics (Oxford, England)》2007,23(16):2046-2053
8.
Background
Understanding the adaptive changes that alter the function of proteins during evolution is an important question for biology and medicine. The increasing number of completely sequenced genomes from closely related organisms, as well as individuals within species, facilitates systematic detection of recent selection events by means of comparative genomics.
9.
Sanghita Banerjee Sandip Chakraborty 《Journal of biomolecular structure & dynamics》2017,35(2):233-249
Why intrinsically disordered regions evolve within the human proteome has been an interesting question for a decade. To date, it remains an unsolved yet intriguing question why some disordered regions evolve rapidly while the rest are highly conserved across mammalian species. Identifying the key biological factors responsible for the variation in the conservation rate of different disordered regions within the human proteome may help address this question. We emphasized that, among the biological features considered in our study (multifunctionality, gene essentiality, protein connectivity, number of unique domains, gene expression level and expression breadth), the number of unique protein domains acts as a strong determinant that negatively influences the conservation of disordered regions. In this context, we argued that proteins with fewer types of domains preferentially need to conserve their disordered regions to enhance their structural flexibility, which in turn facilitates their molecular interactions. In contrast, the selection pressure acting on stretches of disordered regions is not as strong in multi-domain proteins. Therefore, we reasoned that the presence of conserved disordered stretches may compensate for the functions of multiple domains within a single-domain protein. Interestingly, we noticed that the unique domain number and expression level influence the evolution of disordered regions differently from that of well-structured ones.
10.
Motonori Ota Ryotaro Koike Takayuki Amemiya Takeshi Tenno Pedro R. Romero Hidekazu Hiroaki A. Keith Dunker Satoshi Fukuchi 《Journal of structural biology》2013,181(1):29-36
Intrinsically disordered proteins (IDPs) do not adopt stable three-dimensional structures in physiological conditions, yet these proteins play crucial roles in biological phenomena. In most cases, intrinsic disorder manifests itself in segments or domains of an IDP, called intrinsically disordered regions (IDRs), but fully disordered IDPs also exist. Although IDRs can be detected as missing residues in protein structures determined by X-ray crystallography, no protocol has been developed to identify IDRs from structures obtained by Nuclear Magnetic Resonance (NMR). Here, we propose a computational method to assign IDRs based on NMR structures. We compared missing residues of X-ray structures with residue-wise deviations of NMR structures for identical proteins, and derived a threshold deviation that gives the best correlation of ordered and disordered regions of both structures. The obtained threshold of 3.2 Å was applied to proteins whose structures were only determined by NMR, and the resulting IDRs were analyzed and compared to those of X-ray structures with no NMR counterpart in terms of sequence length, IDR fraction, protein function, cellular location, and amino acid composition, all of which suggest distinct characteristics. The structural knowledge of IDPs is still inadequate compared with that of structured proteins. Our method can collect and utilize IDRs from structures determined by NMR, potentially enhancing the understanding of IDPs.
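The thresholding step described in this abstract can be sketched as follows. Only the 3.2 Å cutoff comes from the abstract. The definition of the residue-wise deviation (RMS distance of each residue from its mean position across already superimposed models), the strict `>` comparison, and the function names are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
THRESHOLD_ANGSTROM = 3.2  # cutoff reported in the abstract


def residue_deviations(models: Sequence[Sequence[Coord]]) -> List[float]:
    """RMS deviation of each residue from its mean position across models.
    Assumes the models are already superimposed and equally long."""
    n_models = len(models)
    deviations = []
    for r in range(len(models[0])):
        mean = [sum(m[r][k] for m in models) / n_models for k in range(3)]
        mean_sq = sum((m[r][k] - mean[k]) ** 2
                      for m in models for k in range(3)) / n_models
        deviations.append(math.sqrt(mean_sq))
    return deviations


def assign_idr(deviations: Sequence[float],
               threshold: float = THRESHOLD_ANGSTROM) -> List[bool]:
    """Flag a residue as disordered when its deviation exceeds the threshold
    (whether the published rule is strict or inclusive is not stated)."""
    return [d > threshold for d in deviations]


# Hypothetical two-model ensemble with two residues:
# residue 0 is fixed, residue 1 moves by 10 Å between models.
models = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
]
flags = assign_idr(residue_deviations(models))
```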
11.
A systematic survey of intrinsically disordered (ID) regions was carried out in 2109 human plasma membrane proteins with full assignment of the transmembrane topology with respect to the lipid bilayer. ID regions with 30 consecutive residues or more were detected in 41.0% of the human proteins, a much higher percentage than the corresponding figure (4.7%) for inner membrane proteins of Escherichia coli. The domain organization of each membrane protein in terms of transmembrane helices, structural domains, ID, and unassigned regions as well as the distinction of inside or outside of the cell was determined. Long ID regions constitute 13.3 and 3.5% of the human plasma membrane proteins on the inside and outside of the cell, respectively, showing that they preferentially occur on the cytoplasmic side. We interpret this phenomenon as a reflection of the general scarcity of ID regions on the extracellular side and their relative abundance on the cytoplasmic side in multicellular eukaryotic organisms.
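The "30 consecutive residues or more" criterion can be expressed as a run-length filter over per-residue disorder flags. This is a minimal sketch: the input representation (a list of booleans) and the function name are assumptions, and the abstract does not describe how the per-residue assignments were produced.

```python
from itertools import groupby
from typing import List, Tuple


def long_disordered_regions(flags: List[bool],
                            min_length: int = 30) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges of disordered runs
    that are at least `min_length` residues long."""
    regions = []
    position = 0
    for is_disordered, run in groupby(flags):
        length = len(list(run))
        if is_disordered and length >= min_length:
            regions.append((position, position + length))
        position += length
    return regions


# Hypothetical protein: a 30-residue disordered run qualifies,
# a 29-residue run does not.
flags = [False] * 5 + [True] * 30 + [False] * 3 + [True] * 29
regions = long_disordered_regions(flags)
```

A protein would then count toward the reported percentage if `regions` is non-empty.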
12.
13.
14.
Predicting disordered regions in proteins based on decision trees of reduced amino acid composition.
Pengfei Han Xiuzhen Zhang Raymond S Norton Zhi-Ping Feng 《Journal of computational biology》2006,13(10):1723-1734
Intrinsically unstructured proteins (IUPs) are proteins lacking a fixed three dimensional structure or containing long disordered regions. IUPs play an important role in biology and disease. Identifying disordered regions in protein sequences can provide useful information on protein structure and function, and can assist high-throughput protein structure determination. In this paper we present a system for predicting disordered regions in proteins based on decision trees and reduced amino acid composition. Concise rules based on biochemical properties of amino acid side chains are generated for prediction. Coarser information extracted from the composition of amino acids can not only improve the prediction accuracy but also increase the learning efficiency. In cross-validation tests, with four groups of reduced amino acid composition, our system can achieve a recall of 80% at a 13% false positive rate for predicting disordered regions, and the overall accuracy can reach 83.4%. This prediction accuracy is comparable to most, and better than some, existing predictors. Advantages of our approach are high prediction accuracy for long disordered regions and efficiency for large-scale sequence analysis. Our software is freely available for academic use upon request.
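A reduced amino acid composition replaces the 20-letter alphabet with a few property-based groups and uses the fraction of each group in a sequence segment as a feature. The abstract does not list the four groups the authors used, so the grouping below is a hypothetical one chosen only to show the feature computation. It is not the published alphabet, and no decision-tree rules are reproduced here.

```python
from collections import Counter
from typing import Dict

# Hypothetical four-group alphabet (an assumption, not the authors' grouping).
GROUPS: Dict[str, str] = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "positive": "KRH",
    "negative": "DE",
}
RESIDUE_TO_GROUP = {aa: name for name, residues in GROUPS.items()
                    for aa in residues}


def reduced_composition(segment: str) -> Dict[str, float]:
    """Fraction of residues in each group; unknown letters are ignored."""
    counts = Counter(RESIDUE_TO_GROUP[aa] for aa in segment.upper()
                     if aa in RESIDUE_TO_GROUP)
    total = sum(counts.values())
    return {name: (counts[name] / total if total else 0.0) for name in GROUPS}


features = reduced_composition("AAKD")
```

The resulting four-number vector per segment is the kind of coarse feature a decision-tree learner could split on.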
15.
Intraneoplastic diversity in human tumors is a widespread phenomenon of critical importance for tumor progression and the response to therapeutic intervention. Insights into the evolutionary events that control tumor heterogeneity would be a major breakthrough in our comprehension of cancer development and could lead to more effective prevention methods and therapies. In this paper, we design an evolutionary mathematical framework to study the dynamics of heterogeneity over time. We consider specific situations arising during tumorigenesis, such as the emergence of positively selected mutations ("drivers") and the accumulation of neutral variation ("passengers"). We perform exact computer simulations of the emergence of diverse tumor cell clones over time, and derive analytical estimates for the extent of heterogeneity within a population of cancer cells. Our methods contribute to a quantitative understanding of tumor heterogeneity and the impact of heritable alterations on this tumor trait.
16.
Collagen heterogeneity within different growth regions of long bones of rachitic and nonrachitic chicks
Bryan P. Toole Andrew H. Kang Robert L. Trelstad Jerome Gross 《The Biochemical journal》1972,127(4):715-720
The different anatomical regions involved in osteogenesis in the chick long bone have been examined for heterogeneities in collagen structure that might relate to the mechanism of ossification. Experimentally induced lathyrism was employed to enhance collagen solubility, and vitamin D deficiency to allow accumulation of osteoid, the precursor of bone matrix. The extractable lathyritic collagens of the cartilaginous and osseous regions of growing long bones from rachitic and non-rachitic chicks were examined for alpha-chain type and amino acid composition. In both groups of animals the growth plate and cartilaginous regions of the epiphysis gave collagen molecules of the constitution [alpha1(II)](3), whereas the ossifying regions contained [alpha1(I)](2) alpha2. The degree of hydroxylation of the lysine moieties was increased by approximately 50% in the alpha1(I)-chain and alpha2-chain of rachitic bone collagen. Since uncalcified osteoid is greatly enriched in rachitic bone, it is concluded that the collagen of osteoid has the configuration [alpha1(I)](2) alpha2, similar to that of bone matrix, but has an elevated hydroxylysine content. The possible relationship of this difference to the mechanism of calcification is discussed.
17.
Background
Intrinsically disordered regions are enriched in short interaction motifs that play a critical role in many protein-protein interactions. Since new short interaction motifs may easily evolve, they have the potential to rapidly change protein interactions and cellular signaling. In this work we examined the dynamics of gain and loss of intrinsically disordered regions in duplicated proteins to inspect if changes after genome duplication can create functional divergence. For this purpose we used Saccharomyces cerevisiae and the outgroup species Lachancea kluyveri.
Principal Findings
We find that genes duplicated as part of a genome duplication (ohnologs) are significantly more intrinsically disordered than singletons (p<2.2e-16, Wilcoxon), reflecting a preference for retaining intrinsically disordered proteins in duplicate. In addition, there have been marked changes in the extent of intrinsic disorder following duplication. A large number of duplicated genes have more intrinsic disorder than their L. kluyveri ortholog (29% for duplicates versus 25% for singletons) and an even greater number have less intrinsic disorder than the L. kluyveri ortholog (37% for duplicates versus 25% for singletons). Finally, we show that the number of physical interactions is significantly greater in the more intrinsically disordered ohnolog of a pair (p = 0.003, Wilcoxon).
Conclusion
This work shows that intrinsic disorder gain and loss in a protein is a mechanism by which a genome can also diverge and innovate. The higher number of interactors for proteins that have gained intrinsic disorder compared with their duplicates may reflect the acquisition of new interaction partners or new functional roles.
18.
Zhang T Faraggi E Xue B Dunker AK Uversky VN Zhou Y 《Journal of biomolecular structure & dynamics》2012,29(4):799-813
Short and long disordered regions of proteins have different preferences for different amino acid residues. Different methods often have to be trained to predict them separately. In this study, we developed a single neural-network-based technique called SPINE-D that makes a three-state prediction first (ordered residues and disordered residues in short and long disordered regions) and reduces it into a two-state prediction afterwards. SPINE-D was tested on various sets composed of different combinations of DisProt annotated proteins and proteins directly from the PDB annotated for disorder by missing coordinates in X-ray determined structures. While disorder annotations are different according to DisProt and X-ray approaches, SPINE-D's prediction accuracy and ability to predict disorder are relatively independent of how the method was trained and what type of annotation was employed but strongly depend on the balance in the relative populations of ordered and disordered residues in short and long disordered regions in the test set. With greater than 85% overall specificity for detecting residues in both short and long disordered regions, the residues in long disordered regions are easier to predict at 81% sensitivity in a balanced test dataset with 56.5% ordered residues but more challenging (at 65% sensitivity) in a test dataset with 90% ordered residues. Compared to eleven other methods, SPINE-D yields the highest area under the curve (AUC), the highest Matthews correlation coefficient for residue-based prediction, and the lowest mean square error in predicting disorder contents of proteins for an independent test set with 329 proteins. In particular, SPINE-D is comparable to a meta predictor in predicting disordered residues in long disordered regions and superior in short disordered regions. SPINE-D participated in the CASP9 blind prediction and is one of the top servers according to the official ranking. In addition, SPINE-D was examined for prediction of functional molecular recognition motifs in several case studies.
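The reduction from a three-state to a two-state prediction, and the residue-level measures quoted in the abstract (sensitivity, specificity, Matthews correlation coefficient), can be sketched as below. The reduction rule used here (sum the two disorder probabilities and compare with the ordered probability) is an assumption, since the abstract does not say how SPINE-D collapses the states. The metric formulas are the standard definitions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def to_two_state(probs: Sequence[Tuple[float, float, float]]) -> List[bool]:
    """Collapse (p_ordered, p_short_disorder, p_long_disorder) to a boolean
    disorder call. Assumed rule: disordered if p_short + p_long > p_ordered."""
    return [(p_short + p_long) > p_ordered
            for p_ordered, p_short, p_long in probs]


def residue_metrics(predicted: Sequence[bool],
                    actual: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity and MCC with disorder as the positive class."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical per-residue outputs and annotation for a 4-residue example.
probs = [(0.2, 0.5, 0.3), (0.4, 0.35, 0.25), (0.7, 0.2, 0.1), (0.9, 0.05, 0.05)]
predicted = to_two_state(probs)
metrics = residue_metrics(predicted, [True, False, False, False])
```

Because sensitivity and specificity are computed per class, they are less affected by the ordered/disordered balance of a test set than overall accuracy, which is relevant to the abstract's comparison of balanced and 90%-ordered datasets.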
19.
Terminal regions of flagellin are disordered in solution (total citations: 8, self-citations: 0, citations by others: 8)
Limited proteolysis of flagellin from Salmonella typhimurium SJW1103 by subtilisin, trypsin and thermolysin results in homologous degradation patterns. The terminal regions of flagellin are very sensitive to proteolysis. These parts are degraded into small oligopeptides at the very early stage of a mild digestion that yields a relatively stable fragment with a molecular weight of 40,000. Further proteolytic degradation results in a stable 27,000 Mr fragment. The 40,000 Mr tryptic fragment has been identified as residues 67 to 446 of the flagellin sequence, while the 27,000 Mr fragment involves the 179 to 418 segment. The NH2-terminal sequence positions for the corresponding fragments produced by subtilisin are 60 and 174 for the 40,000 Mr and 27,000 Mr fragments, respectively. The fragments lost their polymerizing ability. Structural properties of flagellin and its 40,000 Mr tryptic fragment were compared by circular dichroism spectroscopy and differential scanning calorimetry. Analysis of the calorimetric melting profiles suggests that terminal parts of flagellin have no significant internal stability and they are in extensive contact with water. However, these regions contain some secondary structure, probably alpha-helices, as revealed by comparison of the circular dichroic spectra in the far-ultraviolet region. Our results indicate that, although the terminal regions of flagellin may contain some alpha-helical secondary structure of marginal stability, they have no compact ordered tertiary structure in solution. On the contrary, the central region of the molecule involves at least two compact structural units.
20.
Kai-Lieh Huang Amanda B. Chadee Chyi-Ying A. Chen Yueqiang Zhang Ann-Bin Shyu 《RNA (New York, N.Y.)》2013,19(3):295-305
Cytoplasmic poly(A)-binding protein (PABP) C1 recruits different interacting partners to regulate mRNA fate. The majority of PABP-interacting proteins contain a PAM2 motif to mediate their interactions with PABPC1. However, little is known about the regulation of these interactions or the corresponding functional consequences. Through in silico analysis, we found that PAM2 motifs are generally embedded within an extended intrinsic disorder region (IDR) and are located next to cluster(s) of potential serine (Ser) or threonine (Thr) phosphorylation sites within the IDR. We hypothesized that phosphorylation at these Ser/Thr sites regulates the interactions between PAM2-containing proteins and PABPC1. In the present study, we have tested this hypothesis using complementary approaches to increase or decrease phosphorylation. The results indicate that changing the extent of phosphorylation of three PAM2-containing proteins (Tob2, Pan3, and Tnrc6c) alters their ability to interact with PABPC1. Results from experiments using phospho-blocking or phosphomimetic mutants in PAM2-containing proteins further support our hypothesis. Moreover, the phosphomimetic mutations appreciably affected the functions of these proteins in mRNA turnover and gene silencing. Taken together, these results provide a new framework for understanding the roles of intrinsically disordered proteins in the dynamic and signal-dependent control of cytoplasmic mRNA functions.