
1.

A novel algorithm for identifying low-complexity regions in a protein sequence

Li X Kahveci T 《Bioinformatics (Oxford, England)》2006,22(24):2980-2987

MOTIVATION: We consider the problem of identifying low-complexity regions (LCRs) in a protein sequence. LCRs are regions of biased composition, normally consisting of different kinds of repeats. RESULTS: We define new complexity measures to compute the complexity of a sequence based on a given scoring matrix, such as BLOSUM 62. Our complexity measures also consider the order of amino acids in the sequence and the sequence length. We develop a novel graph-based algorithm called GBA to identify LCRs in a protein sequence. In the graph constructed for the sequence, each vertex corresponds to a pair of similar amino acids. Each edge connects two pairs of amino acids that can be grouped together to form a longer repeat. GBA finds short subsequences as LCR candidates by traversing this graph. It then extends them to find longer subsequences that may contain full repeats with low complexities. Extended subsequences are then post-processed to refine repeats to LCRs. Our experiments on real data show that GBA has significantly higher recall compared to existing algorithms, including 0j.py, CARD, and SEG. AVAILABILITY: The program is available on request.
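For readers unfamiliar with composition-based LCR detection, the following is a minimal sketch of the general idea, not of GBA or of the complexity measures defined in the paper. It assumes plain Shannon entropy over a sliding window as the complexity score; the function names, window size, and threshold are illustrative choices only, and unlike the paper's measures this score ignores residue order and any scoring matrix.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits) of the residue composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq: str, window: int = 12, threshold: float = 2.2):
    """Return merged half-open (start, end) intervals whose windows score
    at or below the entropy threshold (hypothetical default values)."""
    regions = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= threshold:
            if regions and i <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], i + window)
            else:
                regions.append((i, i + window))
    return regions


if __name__ == "__main__":
    example = "MKT" + "Q" * 12 + "WERDFHCL"
    print(low_complexity_windows(example, window=8, threshold=1.0))
```

A composition-only score like this cannot distinguish an ordered repeat from a shuffled segment with the same composition, which is one reason order-aware measures such as those described in the abstract are of interest.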

2.

Low Complexity Regions in Mammalian Proteins are Associated with Low Protein Abundance and High Transcript Abundance

Zachery W. Dickson G. Brian Golding 《Molecular biology and evolution》2022,39(5)


3.

Understanding hierarchical protein evolution from first principles.

N V Dokholyan E I Shakhnovich 《Journal of molecular biology》2001,312(1):289-307

We propose a model that explains the hierarchical organization of proteins in fold families. The model, which is based on the evolutionary selection of proteins by their native state stability, reproduces patterns of amino acids conserved across protein families. Due to its dynamic nature, the model sheds light on the evolutionary time-scales. By studying the relaxation of the correlation function between consecutive mutations at a given position in proteins, we observe separation of the evolutionary time-scales: at short time intervals families of proteins with similar sequences and structures are formed, while at long time intervals the families of structurally similar proteins that have low sequence similarity are formed. We discuss the evolutionary implications of our model. We provide a "profile" solution to our model and find agreement between predicted patterns of conserved amino acids and those actually observed in nature.

4.

Amino Acid Metabolic Origin as an Evolutionary Influence on Protein Sequence in Yeast

Benjamin L. de Bivort Ethan O. Perlstein Sam Kunes Stuart L. Schreiber 《Journal of molecular evolution》2009,68(5):490-497

The metabolic cycle of Saccharomyces cerevisiae consists of alternating oxidative (respiration) and reductive (glycolysis) energy-yielding reactions. The intracellular concentrations of amino acid precursors generated by these reactions oscillate accordingly, attaining maximal concentration during the middle of their respective yeast metabolic cycle phases. Typically, the amino acids themselves are most abundant at the end of their precursor’s phase. We show that this metabolic cycling has likely biased the amino acid composition of proteins across the S. cerevisiae genome. In particular, we observed that the metabolic source of amino acids is the single most important source of variation in the amino acid compositions of functionally related proteins and that this signal appears only in (facultative) organisms using both oxidative and reductive metabolism. Periodically expressed proteins are enriched for amino acids generated in the preceding phase of the metabolic cycle. Proteins expressed during the oxidative phase contain more glycolysis-derived amino acids, whereas proteins expressed during the reductive phase contain more respiration-derived amino acids. Rare amino acids (e.g., tryptophan) are greatly overrepresented or underrepresented, relative to the proteomic average, in periodically expressed proteins, whereas common amino acids vary by a few percent. Genome-wide, we infer that 20,000 to 60,000 residues have been modified by this previously unappreciated pressure. This trend is strongest in ancient proteins, suggesting that oscillating endogenous amino acid availability exerted genome-wide selective pressure on protein sequences across evolutionary time. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users. Benjamin L. de Bivort and Ethan O. Perlstein have contributed equally to this work.

5.

Codep: maximizing co-evolutionary interdependencies to discover interacting proteins

Tillier ER Biro L Li G Tillo D 《Proteins》2006,63(4):822-831

Approaches for the determination of interacting partners from different protein families (such as ligands and their receptors) have made use of the property that interacting proteins follow similar patterns and relative rates of evolution. Interacting protein partners can then be predicted from the similarity of their phylogenetic trees or evolutionary distance matrices. We present a novel method called Codep, for the determination of interacting protein partners by maximizing co-evolutionary signals. The order of sequences in the multiple sequence alignments from two protein families is determined in such a manner as to maximize the similarity of substitution patterns at amino acid sites in the two alignments and, thus, phylogenetic congruency. This is achieved by maximizing the total number of interdependencies of amino acid sites between the alignments. Once ordered, the corresponding sequences in the two alignments indicate the predicted interacting partners. We demonstrate the efficacy of this approach with computer simulations and in analyses of several protein families. A program implementing our method, Codep, is freely available to academic users from our website: http://www.uhnresearch.ca/labs/tillier/.

6.

Phylogenetic trees constructed from hydrophobicity values of protein sequences

J A Leunissen W W de Jong 《Journal of theoretical biology》1986,119(2):189-196

Information about conformational properties of a protein is contained in the hydrophobicity values of the amino acids in its primary sequence. We have investigated the possibility of extracting meaningful evolutionary information from the comparison of the hydrophobicity values of the corresponding amino acids in the sequences of homologous proteins. Distance matrices for six families of homologous proteins were made on the basis of the differences in hydrophobicity values of the amino acids. The phylogenetic trees constructed from such matrices were at least as good (as judged from their faithful reflection of evolutionary relationships), as trees constructed from the usual minimum mutation distance matrix.
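The distance-matrix step described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes the Kyte-Doolittle hydropathy scale (the paper may have used a different scale), pre-aligned sequences of equal length with "-" as the gap character, and the mean absolute per-position difference as the distance. The names `hydrophobicity_distance` and `distance_matrix` are hypothetical.

```python
# Kyte-Doolittle hydropathy index; an assumed scale, chosen for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_distance(a: str, b: str) -> float:
    """Mean absolute hydropathy difference over aligned, ungapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    total = sum(abs(KYTE_DOOLITTLE[x] - KYTE_DOOLITTLE[y]) for x, y in pairs)
    return total / len(pairs)


def distance_matrix(aligned_seqs):
    """Symmetric matrix of pairwise hydrophobicity distances."""
    n = len(aligned_seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = hydrophobicity_distance(aligned_seqs[i], aligned_seqs[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A matrix of this kind could then be passed to any distance-based tree-building method; which method the authors used is not stated in the abstract.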

7.

Proteomic analysis reveals perturbed energy metabolism and elevated oxidative stress in hearts of rats with inborn low aerobic capacity

Burniston JG Kenyani J Wastling JM Burant CF Qi NR Koch LG Britton SL 《Proteomics》2011,11(16):3369-3379

Selection on running capacity has created rat phenotypes of high-capacity runners (HCRs) that have enhanced cardiac function and low-capacity runners (LCRs) that exhibit risk factors of metabolic syndrome. We analysed hearts of HCRs and LCRs from generation 22 of selection using DIGE and identified proteins from MS database searches. The running capacity of HCRs was six-fold greater than LCRs. DIGE resolved 957 spots and proteins were unambiguously identified in 369 spots. Protein expression profiling detected 67 statistically significant (p<0.05; false discovery rate <10%, calculated using q-values) differences between HCRs and LCRs. Hearts of HCR rats exhibited robust increases in the abundance of each enzyme of the β-oxidation pathway. In contrast, LCR hearts were characterised by the modulation of enzymes associated with ketone body or amino acid metabolism. LCRs also exhibited enhanced expression of antioxidant enzymes such as catalase and greater phosphorylation of αB-crystallin at serine 59, which is a common point of convergence in cardiac stress signalling. Thus, proteomic analysis revealed selection on low running capacity is associated with perturbations in cardiac energy metabolism and provided the first evidence that the LCR cardiac proteome is exposed to greater oxidative stress.

8.

Processes of fungal proteome evolution and gain of function: gene duplication and domain rearrangement

Cohen-Gihon I Sharan R Nussinov R 《Physical biology》2011,8(3):035009

During evolution, organisms have gained functional complexity mainly by modifying and improving existing functioning systems rather than creating new ones ab initio. Here we explore the interplay between two processes which during evolution have had major roles in the acquisition of new functions: gene duplication and protein domain rearrangements. We consider four possible evolutionary scenarios: gene families that have undergone none of these event types; only gene duplication; only domain rearrangement, or both events. We characterize each of the four evolutionary scenarios by functional attributes. Our analysis of ten fungal genomes indicates that at least for the fungi clade, species significantly appear to gain complexity by gene duplication accompanied by the expansion of existing domain architectures via rearrangements. We show that paralogs gaining new domain architectures via duplication tend to adopt new functions compared to paralogs that preserve their domain architectures. We conclude that evolution of protein families through gene duplication and domain rearrangement is correlated with their functional properties. We suggest that in general, new functions are acquired via the integration of gene duplication and domain rearrangements rather than each process acting independently.

9.

Function and factor interactions of a locus control region element in the mouse T cell receptor-alpha/Dad1 gene locus (total citations: 1; self-citations: 0; citations by others: 1)

Ortiz BD Harrow F Cado D Santoso B Winoto A 《Journal of immunology (Baltimore, Md. : 1950)》2001,167(7):3836-3845


10.

Comparison of the frequency of functional SH3 domains with different limited sets of amino acids using mRNA display

Tanaka J Yanagawa H Doi N 《PloS one》2011,6(3):e18034

Although modern proteins consist of 20 different amino acids, it has been proposed that primordial proteins consisted of a small set of amino acids, and additional amino acids have gradually been recruited into the genetic code. This hypothesis has recently been supported by comparative genome sequence analysis, but no direct experimental approach has been reported. Here, we utilized a novel experimental approach to test the hypothesis that native-like globular proteins might be more easily simplified, with retention of their structure and function, by a set of putative primitive amino acids than by a set of putative new amino acids. We performed in vitro selection of a functional SH3 domain as a model from partially randomized libraries with different sets of amino acids using mRNA display. Consequently, a library rich in putative primitive amino acids included a larger number of functional SH3 sequences than a library rich in putative new amino acids. Further, the functional SH3 sequences were enriched from the primitive library slightly earlier than from a randomized library with the full set of amino acids, while the function and structure of the selected SH3 proteins with the primitive alphabet were comparable with those from the 20 amino acid alphabet. Application of this approach to various combinations of codons in protein sequences may be useful not only for clarifying the precise order of the amino acid expansion in the early stages of protein evolution but also for efficiently creating novel functional proteins in the laboratory.

11.

Locus control regions: coming of age at a decade plus.

Q Li S Harju K R Peterson 《Trends in genetics : TIG》1999,15(10):403-408


12.

Aminoacylation of Plasmodium falciparum tRNA^Asn and Insights in the Synthesis of Asparagine Repeats

Denis Filisetti Anne Théobald-Dietrich Nassira Mahmoudi Joëlle Rudinger-Thirion Ermanno Candolfi Magali Frugier 《The Journal of biological chemistry》2013,288(51):36361-36371

Genome sequencing revealed an extreme AT-rich genome and a profusion of asparagine repeats associated with low complexity regions (LCRs) in proteins of the malarial parasite Plasmodium falciparum. Despite their abundance, the function of these LCRs remains unclear. Because they occur in almost all families of plasmodial proteins, the occurrence of LCRs cannot be associated with any specific metabolic pathway; yet their accumulation must have given selective advantages to the parasite. Translation of these asparagine-rich LCRs demands extraordinarily high amounts of asparaginylated tRNA^Asn. However, unlike other organisms, Plasmodium codon bias is not correlated to tRNA gene copy number. Here, we studied tRNA^Asn accumulation as well as the catalytic capacities of the asparaginyl-tRNA synthetase of the parasite in vitro. We observed that asparaginylation in this parasite can be considered standard, which is expected to limit the availability of asparaginylated tRNA^Asn in the cell and, in turn, slow down the ribosomal translation rate when decoding asparagine repeats. This observation strengthens our earlier hypothesis considering that asparagine rich sequences act as “tRNA sponges” and help cotranslational folding of parasite proteins. However, it also raises many questions about the mechanistic aspects of the synthesis of asparagine repeats and about their implications in the global control of protein expression throughout Plasmodium life cycle.

13.

Prediction of RNA-binding residues in proteins from primary sequence using an enriched random forest model with a novel hybrid feature (total citations: 1; self-citations: 0; citations by others: 1)

Ma X Guo J Wu J Liu H Yu J Xie J Sun X 《Proteins》2011,79(4):1230-1239


14.

Discovery of Unique Lanthionine Synthetases Reveals New Mechanistic and Evolutionary Insights

Yuki Goto Bo Li Jan Claesen Yanxiang Shi Mervyn J. Bibb Wilfred A. van der Donk 《PLoS biology》2010,8(3)

Lantibiotic synthetases are remarkable biocatalysts generating conformationally constrained peptides with a variety of biological activities by repeatedly utilizing two simple posttranslational modification reactions: dehydration of Ser/Thr residues and intramolecular addition of Cys thiols to the resulting dehydro amino acids. Since previously reported lantibiotic synthetases show no apparent homology with any other known protein families, the molecular mechanisms and evolutionary origin of these enzymes are unknown. In this study, we present a novel class of lanthionine synthetases, termed LanL, that consist of three distinct catalytic domains and demonstrate in vitro enzyme activity of a family member from Streptomyces venezuelae. Analysis of individually expressed and purified domains shows that LanL enzymes install dehydroamino acids via phosphorylation of Ser/Thr residues by a protein kinase domain and subsequent elimination of the phosphate by a phosphoSer/Thr lyase domain. The latter has sequence homology with the phosphothreonine lyases found in various pathogenic bacteria that inactivate host mitogen activated protein kinases. A LanC-like cyclase domain then catalyzes the addition of Cys residues to the dehydro amino acids to form the characteristic thioether rings. We propose that LanL enzymes have evolved from stand-alone protein Ser/Thr kinases, phosphoSer/Thr lyases, and enzymes catalyzing thiol alkylation. We also demonstrate that the genes for all three pathways to lanthionine-containing peptides are widespread in Nature. Given the remarkable efficiency of formation of lanthionine-containing polycyclic peptides and the latter's high degree of specificity for their cognate cellular targets, it is perhaps not surprising that (at least) three distinct families of polypeptide sequences have evolved to access this structurally and functionally diverse class of compounds.

15.

Evolutionary origin, diversification and specialization of eukaryotic MutS homolog mismatch repair proteins (total citations: 11; self-citations: 2; citations by others: 9)

Culligan KM Meyer-Gauen G Lyons-Weiler J Hays JB 《Nucleic acids research》2000,28(2):463-471

Most eubacteria, and all eukaryotes examined thus far, encode homologs of the DNA mismatch repair protein MutS. Although eubacteria encode only one or two MutS-like proteins, eukaryotes encode at least six distinct MutS homolog (MSH) proteins, corresponding to conserved (orthologous) gene families. This suggests evolution of individual gene family lines of descent by several duplication/specialization events. Using quantitative phylogenetic analyses (RASA, or relative apparent synapomorphy analysis), we demonstrate that comparison of complete MutS protein sequences, rather than highly conserved C-terminal domains only, maximizes information about evolutionary relationships. We identify a novel, highly conserved middle domain, as well as clearly delineate an N-terminal domain, previously implicated in mismatch recognition, that shows family-specific patterns of aromatic and charged amino acids. Our final analysis, in contrast to previous analyses of MutS-like sequences, yields a stable phylogenetic tree consistent with the known biochemical functions of MutS/MSH proteins, that now assigns all known eukaryotic MSH proteins to a monophyletic group, whose branches correspond to the respective specialized gene families. The rooted phylogenetic tree suggests their derivation from a mitochondrial MSH1-like protein, itself the descendent of the MutS of a symbiont in a primitive eukaryotic precursor.

16.

The Origin of Trypsin: Evidence for Multiple Gene Duplications in Trypsins

António M. Baptista Per Harald Jonson Edward Hough Steffen B. Petersen 《Journal of molecular evolution》1998,47(3):353-362

The trypsin family of serine proteases is one of the most studied protein families, with a wealth of amino acid sequence information available in public databases. Since trypsin-like enzymes are widely distributed in living organisms in nature, likely evolutionary scenarios have been proposed. A novel methodology for Fourier transformation of biological sequences (FOTOBIS) is presented. The methodology is well suited for the identification of the size and extent of short repeats in protein sequences. In the present paper the trypsin family of enzymes is analyzed with FOTOBIS and strong evidence for tandem gene duplication is found. A likely evolutionary path for the development of present-day trypsins involved an intrinsic extensive tandem gene duplication of a small DNA fragment of 15–18 nucleotides, corresponding to five or six amino acids. This ancestral trypsin gene was subsequently duplicated, leading to the earliest version of a full-sized trypsin, from which the contemporary trypsins have developed. Received: 22 November 1997 / Accepted: 26 January 1998
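To give a feel for how a Fourier transform can expose short repeats in a sequence, here is a minimal sketch. It is not FOTOBIS, whose encoding and processing steps are not described in the abstract. It assumes a simple binary indicator encoding for one residue type and a direct discrete Fourier transform written with the standard library; the function names and the `tol` parameter are hypothetical.

```python
import cmath


def indicator_power_spectrum(seq: str, residue: str):
    """Power spectrum (frequencies 0..n//2) of the mean-centred indicator
    signal that is 1 where `residue` occurs and 0 elsewhere."""
    n = len(seq)
    signal = [1.0 if aa == residue else 0.0 for aa in seq]
    mean = sum(signal) / n
    signal = [v - mean for v in signal]  # remove the zero-frequency offset
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, v in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def dominant_period(seq: str, residue: str, tol: float = 0.99):
    """Period (in residues) of the lowest frequency whose power is within
    `tol` of the spectral peak, or None if the signal has no periodicity."""
    if len(seq) < 2:
        return None
    power = indicator_power_spectrum(seq, residue)
    peak = max(power[1:])
    if peak <= 1e-12:
        return None
    # A perfect repeat has equal power at all harmonics, so take the
    # lowest qualifying frequency (the fundamental).
    k = next(k for k in range(1, len(power)) if power[k] >= tol * peak)
    return len(seq) / k


if __name__ == "__main__":
    print(dominant_period("GASTLV" * 5, "G"))
```

Taking the lowest near-peak frequency is a design choice made here to report the fundamental repeat length rather than one of its harmonics.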

17.

A role for selection in regulating the evolutionary emergence of disease-causing and other coding CAG repeats in humans and mice (total citations: 8; self-citations: 0; citations by others: 8)

Hancock JM Worthey EA Santibáñez-Koref MF 《Molecular biology and evolution》2001,18(6):1014-1023

The evolutionary expansion of CAG repeats in human triplet expansion disease genes is intriguing because of their deleterious phenotype. In the past, this expansion has been suggested to reflect a broad genomewide expansion of repeats, which would imply that mutational and evolutionary processes acting on repeats differ between species. Here, we tested this hypothesis by analyzing repeat- and flanking-sequence evolution in 28 repeat-containing genes that had been sequenced in humans and mice and by considering overall lengths and distributions of CAG repeats in the two species. We found no evidence that these repeats were longer in humans than in mice. We also found no evidence for preferential accumulation of CAG repeats in the human genome relative to mice from an analysis of the lengths of repeats identified in sequence databases. We then investigated whether sequence properties, such as base and amino acid composition and base substitution rates, showed any relationship to repeat evolution. We found that repeat-containing genes were enriched in certain amino acids, presumably as the result of selection, but that this did not reflect underlying biases in base composition. We also found that regions near repeats showed higher nonsynonymous substitution rates than the remainder of the gene and lower nonsynonymous rates in genes that contained a repeat in both the human and the mouse. Higher rates of nonsynonymous mutation in the neighborhood of repeats presumably reflect weaker purifying selection acting in these regions of the proteins, while the very low rate of nonsynonymous mutation in proteins containing a CAG repeat in both species presumably reflects a high level of purifying selection. Based on these observations, we propose that the mutational processes giving rise to polyglutamine repeats in human and murine proteins do not differ. 
Instead, we propose that the evolution of polyglutamine repeats in proteins results from an interplay between mutational processes and selection.

18.

Essential amino acid usage and evolutionary nutrigenomics of eukaryotes--insights into the differential usage of amino acids in protein domains and extra-domains

Santana-Santos L Prosdocimi F Ortega JM 《Genetics and molecular research : GMR》2008,7(3):839-852


19.

Low-complexity regions within protein sequences have position-dependent roles (total citations: 1; self-citations: 0; citations by others: 1)

Alain Coletta John W Pinney David Y Weiss Solís James Marsh Steve R Pettifer Teresa K Attwood 《BMC systems biology》2010,4(1):43

Background

Regions of protein sequences with biased amino acid composition (so-called Low-Complexity Regions (LCRs)) are abundant in the protein universe. A number of studies have revealed that i) these regions show significant divergence across protein families; ii) the genetic mechanisms from which they arise lend them remarkable degrees of compositional plasticity. They have therefore proved difficult to compare using conventional sequence analysis techniques, and functions remain to be elucidated for most of them. Here we undertake a systematic investigation of LCRs in order to explore their possible functional significance, placed in the particular context of Protein-Protein Interaction (PPI) networks and Gene Ontology (GO)-term analysis.

20.

Synchronization of stochastic Ca²⁺ release units creates a rhythmic Ca²⁺ clock in cardiac pacemaker cells

Maltsev AV Maltsev VA Mikheev M Maltseva LA Sirenko SG Lakatta EG Stern MD 《Biophysical journal》2011,100(2):271-283

In sinoatrial node cells of the heart, beating rate is controlled, in part, by local Ca²⁺ releases (LCRs) from the sarcoplasmic reticulum, which couple to the action potential via electrogenic Na⁺/Ca²⁺ exchange. We observed persisting, roughly periodic LCRs in depolarized rabbit sinoatrial node cells (SANCs). The features of these LCRs were reproduced by a numerical model consisting of a two-dimensional array of stochastic, diffusively coupled Ca²⁺ release units (CRUs) with fixed refractory period. Because previous experimental studies showed that β-adrenergic receptor stimulation increases the rate of Ca²⁺ release through each CRU (dubbed I_spark), we explored the link between LCRs and I_spark in our model. Increasing the CRU release current I_spark facilitated Ca²⁺-induced Ca²⁺ release and local recruitment of neighboring CRUs to fire more synchronously. This resulted in a progression in simulated LCR size (from sparks to wavelets to global waves), LCR rhythmicity, and decrease of LCR period that parallels the changes observed experimentally with β-adrenergic receptor stimulation. The transition in LCR characteristics was steeply nonlinear over a narrow range of I_spark, resembling a phase transition. We conclude that the (partial) periodicity and rate regulation of the "Calcium clock" in SANCs are emergent properties of the diffusive coupling of an ensemble of interacting stochastic CRUs. The variation in LCR period and size with I_spark is sufficient to account for β-adrenergic regulation of SANC beating rate.