Similar literature
 Found 20 similar articles (search time: 15 ms)
1.

Background

The highly homologous PE_PGRS (Proline-glutamic acid_polymorphic GC-rich repetitive sequence) genes are members of the PE multigene family which is found only in mycobacteria. PE genes are particularly abundant within the genomes of pathogenic mycobacteria where they seem to have expanded as a result of gene duplication events. PE_PGRS genes are characterized by their high GC content and extensive repetitive sequences, making them prone to recombination events and genetic variability.

Results

Comparative sequence analysis of Mycobacterium tuberculosis genes PE_PGRS17 (Rv0978c) and PE_PGRS18 (Rv0980c) revealed a striking genetic variation associated with this typical tandem duplicate. In comparison to the M. tuberculosis reference strain H37Rv, the variation (named the 12/40 polymorphism) consists of an in-frame 12-bp insertion invariably accompanied by a set of 40 single nucleotide polymorphisms (SNPs) that occurs either in PE_PGRS17 or in both genes. Sequence analysis of the paralogous genes in a representative set of worldwide distributed tubercle bacilli isolates revealed data which supported previously proposed evolutionary scenarios for the M. tuberculosis complex (MTBC) and confirmed the very ancient origin of "M. canettii" and other smooth tubercle bacilli. Strikingly, the identified polymorphism appears to be coincident with the emergence of the post-bottleneck successful clone from which the MTBC expanded. Furthermore, the findings provide direct and clear evidence for the natural occurrence of gene conversion in mycobacteria, which appears to be restricted to modern M. tuberculosis strains.

Conclusion

This study provides a new perspective to explore the molecular events that accompanied the evolution, clonal expansion, and recent diversification of tubercle bacilli.

2.
We present evidence of remarkable genome-wide mobility and evolutionary expansion for a class of protein domains whose borders locate close to the borders of their encoding exons. These exon-bordering domains are more numerous and widely distributed in the human genome than other domains. They also co-occur with more diverse domains to form a larger variety of domain architectures in human proteins. A systematic comparison of nine animal genomes from nematodes to mammals revealed that exon-bordering domains expanded faster than other protein domains in both abundance and distribution, as well as the diversity of co-occurring domains and the domain architectures of harboring proteins. Furthermore, exon-bordering domains exhibited a particularly strong preference for class 1-1 intron phase. Our findings suggest that exon-bordering domains were amplified and interchanged within a genome more often and/or more successfully than other domains during evolution, probably the result of extensive exon shuffling and gene duplication events. The diverse biological functions of these domains underscore the important role they play in the expansion and diversification of animal proteomes.

3.
Three common protein isoforms of apolipoprotein E (apoE), encoded by the epsilon2, epsilon3, and epsilon4 alleles of the APOE gene, differ in their association with cardiovascular and Alzheimer's disease risk. To gain a better understanding of the genetic variation underlying this important polymorphism, we identified sequence haplotype variation in 5.5 kb of genomic DNA encompassing the whole of the APOE locus and adjoining flanking regions in 96 individuals from four populations: blacks from Jackson, MS (n=48 chromosomes), Mayans from Campeche, Mexico (n=48), Finns from North Karelia, Finland (n=48), and non-Hispanic whites from Rochester, MN (n=48). In the region sequenced, 23 sites varied (21 single nucleotide polymorphisms, or SNPs, 1 diallelic indel, and 1 multiallelic indel). The 22 diallelic sites defined 31 distinct haplotypes in the sample. The estimate of nucleotide diversity (site-specific heterozygosity) for the locus was 0.0005 ± 0.0003. Sequence analysis of the chimpanzee APOE gene showed that it was most closely related to human epsilon4-type haplotypes, differing from the human consensus sequence at 67 synonymous (54 substitutions and 13 indels) and 9 nonsynonymous fixed positions. The evolutionary history of allelic divergence within humans was inferred from the pattern of haplotype relationships. This analysis suggests that haplotypes defining the epsilon3 and epsilon2 alleles are derived from the ancestral epsilon4s and that the epsilon3 group of haplotypes has increased in frequency, relative to epsilon4s, in the past 200,000 years. Substantial heterogeneity exists within all three classes of sequence haplotypes, and there are important interpopulation differences in the sequence variation underlying the protein isoforms that may be relevant to interpreting conflicting reports of phenotypic associations with variation in the common protein isoforms.

4.
The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome.

5.
Prion diseases are invariably fatal neurodegenerative disorders affecting man and various animal species. A large body of evidence supports the notion that the causative agent of these diseases is the prion, which, devoid of nucleic acids, is composed largely, if not entirely, of a conformationally abnormal isoform (PrP(Sc)) of the cellular prion protein (PrPc). PrPc is a highly conserved and ubiquitously expressed sialoglycoprotein, the normal function of which is, however, still ill defined. Several modules have been recognised in PrPc structure. Their extensive analysis by different experimental approaches, including transgenic animal models, has allowed a putative role in PrPc physiology to be assigned to several modules. Concurrently, it has underscored the possibility that alteration of specific domains may determine the switching from a beneficial role of PrPc into one that becomes detrimental to neurons, and/or promote the conversion of PrPc into the pathogenic PrP(Sc) conformer.

6.
Toll-interleukin-1 receptor (TIR)-encoding genes represent one of the most important families of disease resistance genes in plants. Studies that have explored the functional details of these genes tended to focus on only a few limited groups; the origin and evolutionary history of these genes were therefore unclear. In this study, focusing on the four principal groups of TIR-encoding genes, we conducted an extensive genome-wide survey of 32 fully sequenced plant genomes and Expressed Sequence Tags (ESTs) from the gymnosperm Pinus taeda and explored the origins and evolution of these genes. Through the identification of the TIR-encoding genes, the analysis of chromosome positions, the identification and analysis of conserved motifs, and sequence alignment and phylogenetic reconstruction, our results showed that the genes of the TIR-X family (TXs) had an earlier origin and a wider distribution than the genes from the other three groups. TIR-encoding genes experienced large-scale gene duplications during evolution. A skeleton motif pattern of the TIR domain was present in all spermatophytes, and the genes with this skeleton pattern exhibited a conserved and independent evolutionary history in all spermatophytes, including monocots, that followed their gymnosperm origin. This study used comparative genomics to explore the origin and evolutionary history of the four main groups of TIR-encoding genes. Additionally, we unraveled the mechanism behind the uneven distribution of TIR-encoding genes in dicots and monocots.

7.
Modularity is a hallmark of molecular evolution. Whether considering gene regulation, the components of metabolic pathways or signaling cascades, the ability to reuse autonomous modules in different molecular contexts can expedite evolutionary innovation. Similarly, protein domains are the modules of proteins, and modular domain rearrangements can create diversity with seemingly few operations, in turn allowing for swift changes to an organism's functional repertoire. Here, we assess the patterns and functional effects of modular rearrangements at high resolution. Using a well resolved and diverse group of pancrustaceans, we illustrate arrangement diversity within closely related organisms, estimate arrangement turnover frequency and establish, for the first time, branch-specific rate estimates for fusion, fission, domain addition and terminal loss. Our results show that roughly 16 new arrangements arise per million years and that between 64% and 81% of these can be explained by simple, single-step modular rearrangement events. We find evidence that the frequencies of fission and terminal deletion events increase over time, and that modular rearrangements impact all levels of the cellular signaling apparatus and thus may have strong adaptive potential. Novel arrangements that cannot be explained by simple modular rearrangements contain a significant amount of repeat domains that occur in complex patterns which we term “supra-repeats”. Furthermore, these arrangements are significantly longer than those with a single-step rearrangement solution, suggesting that such arrangements may result from multi-step events. In summary, our analysis provides an integrated view and initial quantification of the patterns and functional impact of modular protein evolution in a well resolved phylogenetic tree. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.

8.
9.
Thioredoxin reductase (TR) and thioredoxin constitute a major cellular redox system present in all organisms. In contrast to a single form of thioredoxin, there are two TR types: One (bacterial type or small TR) is present in bacteria, archaea, plants, and most unicellular eukaryotes, whereas the second (animal or large TR) is only found in animals and typically contains a carboxy-terminal penultimate selenocysteine encoded by TGA. Surprisingly, we detected sequences of large TRs in various unicellular eukaryotes. Moreover, the green alga Chlamydomonas reinhardtii had both small and large TRs, with the latter being a selenoprotein, but no examples of horizontal gene transfer from animals to the green algae could be detected. In addition, phylogenetic analyses revealed that large TRs formed a subgroup of lower eukaryotic glutathione reductases (GRs). The data suggest that the large TR evolved in a lower eukaryote capable of selenocysteine insertion rather than in an animal. The enzyme appeared to evolve by a carboxy-terminal extension of GR such that the resulting carboxy-terminal glutathione-like peptide became an intramolecular substrate for GR and a reductant for thioredoxin. Subsequently, small TRs were lost in an organism that gave rise to animals, large TRs were lost in plants and fungi, and selenocysteine/cysteine replacements took place in some large TRs. Our data implicate carboxy-terminal extension of proteins as a general mechanism of evolution of new protein function.

10.

11.

Background  

In higher multicellular eukaryotes, complex protein domain combinations contribute to various cellular functions such as regulation of intercellular or intracellular signaling and interactions. To elucidate the characteristics and evolutionary mechanisms that underlie such domain combinations, it is essential to examine the different types of domains and their combinations among different groups of eukaryotes.

12.
Jadwin JA  Ogiue-Ikeda M  Machida K 《FEBS letters》2012,586(17):2586-2596
The ability of modular protein domains to independently fold and bind short peptide ligands both in vivo and in vitro has allowed a significant number of protein-protein interaction studies to take advantage of them as affinity and detection reagents. Here, we refer to modular domain based proteomics as "domainomics" to draw attention to the potential of using domains and their motifs as tools in proteomics. In this review we describe core concepts of domainomics, established and emerging technologies, and recent studies by functional category. Accumulation of domain-motif binding data should ultimately provide the foundation for domain-specific interactomes, which will likely reveal the underlying substructure of protein networks as well as the selectivity and plasticity of signal transduction.

13.
Gastric fluid is a source of gastric cancer biomarkers. However, very little is known about the normal gastric fluid proteome and its biological variations. In this study, we performed a comprehensive analysis of the human gastric fluid proteome using samples obtained from individuals with benign gastric conditions. Gastric fluid proteins were prefractionated using ultracentrifuge filters (3 kDa cutoff) and analyzed by two-dimensional gel electrophoresis (2-DE) and multidimensional LC-MS/MS. Our 2-DE analysis of 170 gastric fluid samples revealed distinct protein profiles for acidic and neutral samples, highlighting pH effects on protein composition. By 2D LC-MS/MS analysis of pooled samples, we identified 284 and 347 proteins in acidic and neutral samples respectively (FDR ≤1%), of which 265 proteins (72.4%) overlapped. However, unlike neutral samples, most proteins in acidic samples were identified from peptides in the filtrate (i.e., <3 kDa). Consistent with this finding, immunoblot analysis of six potential gastric cancer biomarkers rarely detected full-length proteins in acidic samples. These findings have important implications for biomarker studies because a majority of gastric cancer patients have neutral gastric fluid compared to noncancer controls. Consequently, sample stratification, choice of proteomic approaches, and validation strategy can profoundly affect the interpretation of biomarker findings. These observations should help to refine gastric fluid biomarker studies.
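The abstract does not say which denominator the 72.4% overlap refers to. A minimal check, assuming the percentage is taken over the union of the two protein sets, reproduces the reported figure; the function name below is hypothetical and used only for illustration.

```python
def overlap_percent_of_union(n_a: int, n_b: int, n_shared: int) -> float:
    """Percentage of shared items relative to the union of two sets.

    Assumption (not stated in the abstract): the reported overlap is
    computed over the union, whose size is n_a + n_b - n_shared.
    """
    union = n_a + n_b - n_shared
    return 100.0 * n_shared / union


# Counts taken from the abstract: 284 acidic, 347 neutral, 265 shared.
percent = overlap_percent_of_union(284, 347, 265)
print(round(percent, 1))
```

Under this reading the union holds 366 proteins. Dividing by either single set instead (284 or 347) gives a different percentage, so the union interpretation appears to be the one that matches the abstract.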

14.
The main mechanisms shaping the modular evolution of proteins are gene duplication, fusion and fission, recombination and loss of fragments. While a large body of research has focused on duplications and fusions, we concentrated, in this study, on how domains are lost. We investigated motif databases and introduced a measure of protein similarity that is based on domain arrangements. Proteins are represented as strings of domains and comparison was based on the classic dynamic alignment scheme. We found that domain losses and duplications were more frequent at the ends of proteins. We showed that losses can be explained by the introduction of start and stop codons which render the terminal domains nonfunctional, such that further shortening, until the whole domain is lost, is not evolutionarily selected against. We demonstrated that domains which also occur as single-domain proteins are less likely to be lost at the N terminus and in the middle, than at the C terminus. We conclude that fission/fusion events with single-domain proteins occur mostly at the C terminus. We found that domain substitutions are rare, in particular in the middle of proteins. We also showed that many cases of substitutions or losses result from erroneous annotations, but we were also able to find courses of evolutionary events where domains vanish over time. This is explained by a case study on the bacterial formate dehydrogenases.
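The idea of comparing proteins as strings of domains with a dynamic-programming alignment can be sketched as follows. This is an illustrative global alignment in the style of Needleman-Wunsch, not the authors' implementation: the scoring values (match, mismatch, gap), the function name, and the example domain labels are all assumptions.

```python
from typing import Sequence


def domain_alignment_score(
    a: Sequence[str],
    b: Sequence[str],
    match: int = 1,      # assumed reward for identical domains
    mismatch: int = -1,  # assumed penalty for a domain substitution
    gap: int = -1,       # assumed penalty for a domain loss or gain
) -> int:
    """Global alignment score of two domain arrangements.

    Each protein is a sequence of domain names; whole domains, rather
    than residues, are the units that are aligned.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align two domains
                score[i - 1][j] + gap,       # domain present only in a
                score[i][j - 1] + gap,       # domain present only in b
            )
    return score[n][m]


# Hypothetical arrangements: the second lacks the N-terminal domain.
full = ["SH3", "SH2", "Kinase"]
truncated = ["SH2", "Kinase"]
print(domain_alignment_score(full, truncated))
```

With these assumed scores, a terminal domain loss costs a single gap while the remaining domains still match, which is the kind of event the abstract reports as most frequent at protein ends.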

15.
A model for tRNA molecule origin is discussed. The model postulates that this molecule originated simply by direct duplication (and subsequent evolution) of a gene coding for an RNA hairpin structure, which can thus be hypothesized as the evolutionary precursor of the tRNA molecule. The main properties are defined for these hairpin structures and it is suggested that these structures might have housed, near their 3' end, anticodons that were transferred to the loop of the tRNA anticodon during duplication of the hairpin structures. Moreover, the main characteristics are given for the evolutionary intermediary formed by direct duplication of the hairpin structure, i.e. the double hairpin. The evolutionary stages envisaged by this model for tRNA origin seem to naturally imply some evolutionary transitions through which the origin of protein synthesis passed. Finally, some strong historical evidence is provided to corroborate the model.

16.
17.
Intron boundaries were extracted from genomic data and mapped onto single-domain human and murine protein structures taken from the Protein Data Bank. A first analysis of this set of proteins shows that intron boundaries prefer to be in non-regular secondary structure elements, while avoiding alpha-helices and beta-strands. This fact alone suggests an evolutionary model in which introns are constrained by protein structure, particularly by tertiary structure contacts. In addition, in silico recombination experiments of a subset of these proteins together with their homologues, including those in different species, show that introns have a tendency to occur away from artificial crossover hot spots. Altogether, these findings support a model in which genes can preferentially harbour introns in less constrained regions of the protein fold they code for. In the light of these findings, we discuss some implications for protein modelling and design.

18.
HnifU, a gene exhibiting similarity to nifU genes of nitrogen fixation gene clusters, was identified in the course of expressed sequence tag (EST) generation from a human fetal heart cDNA library. Northern blot of human tissues and polymerase chain reaction (PCR) using human genomic DNA verified that the hnifU gene represented a human gene rather than a microbial contaminant of the cDNA library. Conceptual translation of the hnifU cDNA yielded a protein product bearing 77% and 70% amino acid identity to NifU-like hypothetical proteins from Haemophilus influenzae and Saccharomyces cerevisiae, respectively, and 40–44% identity to the N-terminal regions of NifU proteins from several diazotrophs (i.e., nitrogen-fixing organisms). Pairwise determination of amino acid identities between the NifU-like proteins of nondiazotrophs showed that these NifU-like proteins exhibited higher sequence identity to each other (63–77%) than to the diazotrophic NifU proteins (40–48%). Further, the NifU-like proteins of non-nitrogen-fixing organisms were similar only to the N-terminal region of diazotrophic NifU proteins and therefore identified a novel modular domain in these NifU proteins. These findings support the hypothesis that NifU is indeed a modular protein. The high degree of sequence similarity between NifU-like proteins from species as divergent as humans and H. influenzae suggests that these proteins perform some basic cellular function and may be among the most highly conserved proteins. Correspondence to: C.-C. Liew

19.
20.
Members of the newly discovered regulator of G protein signaling (RGS) families of proteins have a common RGS domain. This RGS domain is necessary for conferring upon RGS proteins the capacity to regulate negatively a variety of Galpha protein subunits. However, RGS proteins are more than simply negative regulators of signaling. RGS proteins can function as effector antagonists, and recent evidence suggests that RGS proteins can have positive effects on signaling as well. Many RGS proteins possess additional C- and N-terminal modular protein-binding domains and motifs. The presence of these additional modules within the RGS proteins provides for multiple novel regulatory interactions performed by these molecules. These regions are involved in conferring regulatory selectivity to specific Galpha-coupled signaling pathways, enhancing the efficacy of the RGS domain, and the translocation or targeting of RGS proteins to intracellular membranes. In other instances, these domains are involved in cross-talk between different Galpha-coupled signaling pathways and, in some cases, likely serve to integrate small GTPases with these G protein signaling pathways. This review discusses these C- and N-terminal domains and their roles in the biology of the brain-enriched RGS proteins. Methods that can be used to investigate the function of these domains are also discussed.
