Similar literature
 Found 20 similar documents (search time: 46 ms)
1.

Background  

Amino acids in proteins are not used equally. Some of the differences in the amino acid composition of proteins are between species (mainly due to nucleotide composition and lifestyle) and some are between proteins from the same species (related to protein function, expression or subcellular localization, for example). As several factors contribute to the different amino acid usage in proteins, it is difficult both to analyze these differences and to separate the contributions made by each factor.

2.

Background  

The kelch motif is an ancient and evolutionarily widespread sequence motif of 44–56 amino acids in length. It occurs as five to seven repeats that form a β-propeller tertiary structure. Over 28 kelch-repeat proteins have been sequenced and functionally characterised from diverse organisms spanning from viruses, plants and fungi to mammals, and it is evident from expressed sequence tag, domain and genome databases that many additional hypothetical proteins contain kelch-repeats. In general, kelch-repeat β-propellers are involved in protein-protein interactions; however, the modest sequence identity between kelch motifs, the diversity of domain architectures, and the partial information on this protein family in any single species all present difficulties to developing a coherent view of the kelch-repeat domain and the kelch-repeat protein superfamily. To understand the complexity of this superfamily of proteins, we have analysed by bioinformatics the complement of kelch-repeat proteins encoded in the human genome and have made comparisons to the kelch-repeat proteins encoded in other sequenced genomes.

3.

Background

Microsatellites have been used extensively in the field of comparative genomics. By studying microsatellites in coding regions we have a simple model of how genotypic changes undergo selection, as they are directly expressed in the phenotype as altered proteins. The simplest of these tandem repeats in coding regions are the tri-nucleotide repeats, which produce a repeat of a single amino acid when translated into proteins. Tri-nucleotide repeats are often disease-associated, and are also known to be unstable to both expansion and contraction. This makes them sensitive markers for studying proteome evolution in closely related species.
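The relationship between a tri-nucleotide repeat and a single amino acid repeat can be illustrated with a minimal Python sketch. It is not taken from the study: the codon table is deliberately reduced to the few codons used here, and the function name `translate` and the example sequence are assumptions made for illustration only.

```python
# Reduced codon table: only the codons needed for this illustration
# (standard genetic code assignments; "*" marks a stop codon).
CODON_TABLE = {"ATG": "M", "CAG": "Q", "GCC": "A", "AAT": "N", "TAA": "*"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon, stopping at a stop codon."""
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical coding sequence: start codon, six CAG repeats, one GCC, stop.
coding = "ATG" + "CAG" * 6 + "GCC" + "TAA"
print(translate(coding))  # MQQQQQQA
```

Adding or removing one CAG unit in the DNA lengthens or shortens the glutamine run by exactly one residue, which is why expansion and contraction of such repeats are visible directly at the protein level.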

Results

The evolutionary history of the family of malaria-causing parasites, Plasmodia, is complex because of the life-cycle of the organism, in which it interacts with a number of different hosts and goes through a series of tissue-specific stages. This study shows that the divergence between the primate and rodent malarial parasites has resulted in a lineage-specific change in the simple amino acid repeat (SAAR) distribution that is correlated to A–T content. The paper also shows that this altered use of amino acids in SAARs is consistent with the repeat distributions being under selective pressure.

Conclusions

The study shows that simple amino acid repeat distributions can be used to group related species and to examine their phylogenetic relationships. This study also shows that an outgroup species with a similar A–T content can be distinguished based only on the amino acid usage in repeats, and suggests that this might be a useful feature for proteome clustering. The lineage-specific use of amino acids in repeat regions suggests that comparative studies of SAAR distributions between proteomes give an insight into the mechanisms of expansion and the selective pressures acting on the organism.

4.

Background  

Classification of bacteria within the genus Brucella has been difficult, due in part to considerable genomic homogeneity between the different species and biovars, in spite of clear differences in phenotypes. Therefore, many different methods have been used to assess Brucella taxonomy. In the current work, we examine 32 sequenced genomes from the genus Brucella, representing the six classical species as well as more recently described species, using bioinformatic methods. Comparisons were made at the level of genomic DNA using oligonucleotide-based methods (Markov chain based genomic signatures, genomic codon and amino acid frequency based comparisons) and at the level of proteomes (all-against-all BLAST protein comparisons and pan-genomic analyses).

5.

Background  

Many parasitic organisms, eukaryotes as well as bacteria, possess surface antigens with amino acid repeats. Making up the interface between host and pathogen, such repetitive proteins may be virulence factors involved in immune evasion or cytoadherence. They find immunological applications in serodiagnostics and vaccine development. Here we use proteins which contain perfect repeats as a basis for comparative genomics between parasitic and free-living organisms.

6.

Background

Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. (II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats.

Results

ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development.

Conclusions

We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found.

7.
Mutation patterns of amino acid tandem repeats in the human proteome   Total citations: 1, self-citations: 0, citations by others: 1

Background

Amino acid tandem repeats are found in nearly one-fifth of human proteins. Abnormal expansion of these regions is associated with several human disorders. To gain further insight into the mutational mechanisms that operate in this type of sequence, we have analyzed a large number of mutation variants derived from human expressed sequence tags (ESTs).

Results

We identified 137 polymorphic variants in 115 different amino acid tandem repeats. Of these, 77 contained amino acid substitutions and 60 contained gaps (expansions or contractions of the repeat unit). The analysis showed that at least about 21% of the repeats might be polymorphic in humans. We compared the mutations found in different types of amino acid repeats and in adjacent regions. Overall, repeats showed a five-fold increase in the number of gap mutations compared to adjacent regions, reflecting the action of slippage within the repetitive structures. Gap and substitution mutations were very differently distributed between different amino acid repeat types. Among repeats containing gap variants we identified several disease and candidate disease genes.

Conclusion

This is the first report at a genome-wide scale of the types of mutations occurring in the amino acid repeat component of the human proteome. We show that the mutational dynamics of different amino acid repeat types are very diverse. We provide a list of loci with highly variable repeat structures, some of which may be involved in disease.

8.

Background

Trypanosoma cruzi has a single flagellum attached to the cell body by a network of specialized cytoskeletal and membranous connections called the flagellum attachment zone. Previously, we isolated a DNA fragment (clone H49) which encodes tandemly arranged repeats of 68 amino acids associated with a high molecular weight cytoskeletal protein. In the current study, the genomic complexity of H49 and its relationships to the T. cruzi calpain-like cysteine peptidase family, comprising active calpains and calpain-like proteins, is addressed. Immunofluorescence analysis and biochemical fractionation were used to demonstrate the cellular location of H49 proteins.

Methods and Findings

All H49 repeats are associated with calpain-like sequences. Sequence analysis demonstrated that this protein, now termed H49/calpain, consists of an amino-terminal catalytic cysteine protease domain II, followed by a large region of tandemly arranged 68-amino acid repeats and a carboxy-terminal segment carrying the protease domains II and III. The H49/calpains can be classified as calpain-like proteins, as the cysteine protease catalytic triad has been partially conserved in these proteins. The H49/calpain repeats share less than 60% identity with other calpain-like proteins in Leishmania and T. brucei, and there is no immunological cross-reaction among them. It is suggested that the expansion of H49/calpain repeats only occurred in T. cruzi after separation of a T. cruzi ancestor from other trypanosomatid lineages. Immunofluorescence and immunoblotting experiments demonstrated that H49/calpain is located along the flagellum attachment zone adjacent to the cell body.

Conclusions

H49/calpain contains a large central region composed of tandemly arranged 68-amino acid repeats. H49/calpains can be classified as calpain-like proteins, as the cysteine protease catalytic triad is partially conserved in these proteins. H49/calpains could have a structural role, namely that of ensuring that the cell body remains attached to the flagellum by connecting the subpellicular microtubule array to it.

9.
A cDNA clone, pMA1949, detects two mRNA species in wheat seedling tissue that are late embryogenesis-abundant (LEA) and dehydration stress-inducible. Sequence analysis of the pMA1949 clone shows it to be a 991 bp partial cDNA encoding a polypeptide of 317 amino acids with homology to two group 3 LEA proteins, carrot (DC8) and a soybean protein encoded by pGmPM2 cDNA. Molecular analysis of the deduced protein reveals a 33 kDa acidic and extremely hydrophilic protein with potential amphiphilic α-helical regions. In addition, the protein contains eleven similar, contiguous repeats of 11 amino acids, which are separated by 118 amino acids from two additional and unique repeats of 36 residues each at the carboxyl end of the protein. Comparisons of sequences of reported group 3 LEA proteins revealed that there are two types, separable by sequence similarity of the 11 amino acid repeating motifs and by the presence or absence of a certain amino acid stretch at the carboxyl terminus. Based on results from these comparisons, we propose a second type of group 3 LEA proteins, called group 3 LEA (II).

10.
11.

Background  

Evolutionary relations of similar segments shared by different protein folds remain controversial, even though many examples of such segments have been found. To date, several methods, such as those based on the results of structure comparisons, sequence-based classifications, and sequence-based profile-profile comparisons, have been applied to identify such protein segments that possess local similarities in both sequence and structure across protein folds. However, no method reported to date captures more precise sequence-structure relations by combining structure-based profiles with sequence-based profiles derived from evolutionary information. The former are generally regarded as representing the amino acid preferences at each position of a specific conformation of a protein segment. They might reflect the nature of ancient short peptide ancestors, using the results of structural classifications of protein segments.

12.

Background  

The nature of the protein molecular clock, the protein-specific rate of amino acid substitutions, is among the central questions of molecular evolution. Protein expression level is the dominant determinant of the clock rate in a number of organisms. It has been suggested that highly expressed proteins evolve slowly in all species mainly to maintain robustness to translation errors that generate toxic misfolded proteins. Here we investigate this hypothesis experimentally by comparing the growth rate of Escherichia coli expressing wild type and misfolding-prone variants of the LacZ protein.

13.

Background  

Single amino acid repeats make up a significant proportion of all the proteomes that have currently been determined. They have been shown to be functionally and medically significant, and are associated with cancers and neuro-degenerative diseases such as Huntington's Chorea, where a poly-glutamine repeat is responsible for causing the disease. The COPASAAR database is a new tool to facilitate the rapid analysis of single amino acid repeats at a proteome level. The database aims to simplify the comparison of repeat distributions between proteomes in order to provide a better understanding of their function and evolution.
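As a rough illustration of what locating single amino acid repeats in a protein sequence involves, the following Python sketch scans a sequence for runs of one residue. It does not reproduce COPASAAR's method: the function name `find_single_aa_repeats`, the minimum run length of 5, and the example sequence are all assumptions chosen for the illustration.

```python
import re


def find_single_aa_repeats(seq: str, min_len: int = 5):
    """Return (residue, start, length) for each run of a single residue.

    min_len is a hypothetical threshold; runs shorter than it are ignored.
    Start positions are 0-based.
    """
    # One uppercase letter followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0))) for m in pattern.finditer(seq)]


# Hypothetical protein fragment with a poly-glutamine and a poly-asparagine run.
seq = "MKT" + "Q" * 8 + "LSA" + "N" * 5 + "GG"
print(find_single_aa_repeats(seq))  # [('Q', 3, 8), ('N', 14, 5)]
```

Tallying the residues returned for every protein in a proteome would give a per-residue repeat distribution of the kind that can then be compared between proteomes.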

14.

Background  

Understanding how amino acid substitutions affect protein functions is critical for the study of proteins and their implications in diseases. Although methods have been developed for predicting potential effects of amino acid substitutions using sequence, three-dimensional structural, and evolutionary properties of proteins, the applications are limited by the complexity of the features and the availability of protein structural information. Another limitation is that the prediction results are hard to interpret with physicochemical principles and biological knowledge.

15.

Background

Among bacteria and archaea, amino acid usage is correlated with habitat temperatures. In particular, protein surfaces in species thriving at higher temperatures appear to be enriched in amino acids that stabilize protein structure and depleted in amino acids that decrease thermostability. Does this observation reflect a causal relationship, or could the apparent trend be caused by phylogenetic relatedness among sampled organisms living at different temperatures? And do proteins from endothermic and ectothermic vertebrates show similar differences?

Results

We find that the observed correlations between the frequencies of individual amino acids and prokaryotic habitat temperature are strongly influenced by evolutionary relatedness between the species analysed; however, a proteome-wide bias towards increased thermostability remains after controlling for phylogeny. Do eukaryotes show similar effects of thermal adaptation? A small shift of amino acid usage in the expected direction is observed in endothermic ('warm-blooded') mammals and chicken compared to ectothermic ('cold-blooded') vertebrates with lower body temperatures; this shift is not simply explained by nucleotide usage biases.

Conclusion

Protein homologs operating at different temperatures have different amino acid composition, both in prokaryotes and in vertebrates. Thus, during the transition from ectothermic to endothermic life styles, the ancestors of mammals and of birds may have experienced weak genome-wide positive selection to increase the thermostability of their proteins.

16.

Background  

The OmcB protein is one of the most immunogenic proteins in C. trachomatis and C. pneumoniae infections. This protein is highly conserved, leading to serum cross-reactivity between the various chlamydial species. Since previous studies based on recombinant proteins failed to identify a species-specific immune response against the OmcB protein, this study evaluated an in silico predicted specific and immunogenic antigen from the OmcB protein for the serodiagnosis of C. trachomatis infections.

17.

Background  

The amino acid substitution model is the core component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Although several general amino acid substitution models have been estimated from large and diverse protein databases, they remain inappropriate for analyzing specific species, e.g., viruses. Emerging epidemics of influenza viruses raise the need for comprehensive studies of these dangerous viruses. We propose an influenza-specific amino acid substitution model to enhance the understanding of the evolution of influenza viruses.

18.

Background  

A controversial topic in evolutionary developmental biology is whether morphological diversification in natural populations can be driven by expansions and contractions of amino acid repeats in proteins. To promote adaptation, selection on protein length variation must overcome deleterious effects of multiple correlated traits (pleiotropy). Thus far, systems that demonstrate this capacity include only ancient or artificial morphological diversifications. The Hawaiian Islands, with their linear geological sequence, present a unique environment to study recent, natural radiations. We have focused our research on the Hawaiian endemic mints (Lamiaceae), a large and diverse lineage with paradoxically low genetic variation, in order to test whether a direct relationship between coding-sequence repeat diversity and morphological change can be observed in an actively evolving system.

19.

Background  

Amino acid repeat-containing proteins have a broad range of functions and their identification is of relevance to many experimental biologists. In human-infective protozoan parasites (such as the Kinetoplastid and Plasmodium species), they are implicated in immune evasion and have been shown to influence virulence and pathogenicity. RepSeq is a new database of amino acid repeat-containing proteins found in lower eukaryotic pathogens. The RepSeq database is accessed via a web-based application which also provides links to related online tools and databases for further analyses.

20.

Background  

Widely used substitution models for proteins, such as the Jones-Taylor-Thornton (JTT) or Whelan and Goldman (WAG) models, are based on empirical amino acid interchange matrices estimated from databases of protein alignments that incorporate the average amino acid frequencies of the data set under examination (e.g., JTT + F). Variation in the evolutionary process between sites is typically modelled by a rates-across-sites distribution such as the gamma (Γ) distribution. However, sites in proteins also vary in the kinds of amino acid interchanges that are favoured, a feature that is ignored by standard empirical substitution matrices. Here we examine the degree to which the pattern of evolution at sites differs from that expected based on empirical amino acid substitution models and evaluate the impact of these deviations on phylogenetic estimation.
