Similar literature
Found 20 similar articles (search time: 31 ms)
1.
Mirny LA, Gelfand MS. Genome biology, 2002, 3(3): preprint00-20

Background  

The concepts of orthology and paralogy are becoming increasingly important as whole-genome comparison allows orthologs and paralogs to be identified in complete genomes. Functional specificity of proteins is assumed to be conserved among orthologs and to differ among paralogs. We used this assumption to identify residues that determine the specificity of protein-DNA and protein-ligand recognition. Finding such residues is crucial for understanding mechanisms of molecular recognition and for rational protein and drug design.

2.
3.
DNA glycosylases are important repair enzymes that eliminate a diverse array of aberrant nucleobases from the genomes of all organisms. Individual bacterial species often contain multiple paralogs of a particular glycosylase, yet the molecular and functional distinctions between these paralogs are not well understood. The recently discovered HEAT-like repeat (HLR) DNA glycosylases are distributed across all domains of life and are distinct in their specificity for cationic alkylpurines and mechanism of damage recognition. Here, we describe a number of phylogenetically diverse bacterial species with two orthologs of the HLR DNA glycosylase AlkD. One ortholog, which we designate AlkD2, is substantially less conserved. The crystal structure of Streptococcus mutans AlkD2 is remarkably similar to AlkD but lacks the only helix present in AlkD that penetrates the DNA minor groove. We show that AlkD2 possesses only weak DNA binding affinity and lacks alkylpurine excision activity. Mutational analysis of residues along this DNA binding helix in AlkD substantially reduced binding affinity for damaged DNA, for the first time revealing the importance of this structural motif for damage recognition by HLR glycosylases.

4.
Transmembrane transport is an essential component of cell life. Many genes encoding known or putative transport proteins are found in bacterial genomes. In most cases their substrate specificity has not been determined experimentally and is only approximately predicted by comparative genomic analysis. Even less is known about the 3D structure of transporters. Nevertheless, the published experimental data demonstrate that channel-forming residues determine the substrate specificity of secondary transporters, and analysis of these residues would provide a better understanding of the transport mechanism. We developed a simple computational method for identifying channel-forming residues in transporter sequences. It is based on the analysis of amino acid frequencies in bacterial secondary transporters. We applied this method to a variety of transmembrane proteins with resolved 3D structures. The predictions are in reasonably good agreement with the resolved protein structures.

5.
6.
7.
8.
Orthologs are generally under selective pressure against loss of function, while paralogs usually accumulate mutations and are eventually lost or diverge in function or regulation. Most ortholog detection methods contaminate the resulting datasets with a substantial number of paralogs. We therefore aimed to implement a straightforward method that detects ortholog clusters with a reduced number of paralogs from completely sequenced genomes. The described cross-species expansion of the reciprocal best BLAST hit method is a time-effective method for ortholog detection; it yields 68% truly orthologous clusters, and the procedure specifically enriches single-copy orthologs. The detection of true orthologs can provide a phylogenetic toolkit to better understand evolutionary processes. In a study across six photosynthetic eukaryotes, nuclear genes of putative mitochondrial origin were shown to be over-represented among single-copy orthologs. These orthologs are involved in fundamental biological processes such as amino acid metabolism and translation. Molecular clock analyses based on this dataset yielded divergence time estimates for the red/green algae (1,142 MYA), green algae/land plant (725 MYA), mosses/seed plant (496 MYA), gymno-/angiosperm (385 MYA) and monocotyledons/core eudicotyledons (301 MYA) splits.
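As an illustration of the plain pairwise reciprocal best hit step that this abstract builds on, here is a minimal sketch. It does not reproduce the authors' cross-species expansion; the `Hits` type, the function names, and the toy bit scores are all assumptions made for this example, and a real pipeline would parse the scores from BLAST output.

```python
from typing import Dict, List, Tuple

# (query gene, subject gene) -> similarity score, e.g. a BLAST bit score.
# The data layout is an assumption for this sketch.
Hits = Dict[Tuple[str, str], float]


def best_hits(hits: Hits) -> Dict[str, str]:
    """Map each query gene to its highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in hits.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Return gene pairs (a, b) that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores (hypothetical): a3 hits b2 best, but b2 prefers a2,
# so a3 is not reported as an ortholog candidate.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b2"): 120.0}
b_vs_a = {("b1", "a1"): 245.0, ("b2", "a2"): 175.0, ("b2", "a3"): 110.0}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Requiring the best hit to hold in both directions is what filters out many paralogs: a duplicated gene can still find a good match, but that match usually points back to a closer copy.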

9.
10.
The function of most proteins is not determined experimentally, but is extrapolated from homologs. According to the "ortholog conjecture", or standard model of phylogenomics, protein function changes rapidly after duplication, leading to paralogs with different functions, while orthologs retain the ancestral function. We report here that a comparison of experimentally supported functional annotations among homologs from 13 genomes mostly supports this model. We show that to analyze GO annotation effectively, several confounding factors need to be controlled: authorship bias, variation of GO term frequency among species, variation of background similarity among species pairs, and propagated annotation bias. After controlling for these biases, we observe that orthologs generally have more similar functional annotations than paralogs. This is especially strong for sub-cellular localization. We observe only a weak decrease in functional similarity with increasing sequence divergence. These findings hold over a large diversity of species; notably, orthologs from model organisms such as E. coli, yeast or mouse have conserved function with human proteins.

11.
Gene duplication is one of the main mechanisms by which genomes can acquire novel functions. It has been proposed that the retention of gene duplicates can be associated with processes of tissue expression divergence. These models predict that divergent expression patterns should be acquired shortly after the duplication, and that larger divergence in tissue expression would be expected for paralogs than for orthologs of a similar age. Many studies have shown that gene duplicates tend to have divergent expression patterns and that gene family expansions are associated with high levels of tissue specificity. However, the timeframe in which these processes occur has rarely been investigated in detail, particularly in vertebrates, and most analyses do not include direct comparisons of orthologs as a baseline for the expected levels of tissue specificity in the absence of duplications. To assess the specific contribution of duplications to expression divergence, we combine here phylogenetic analyses and expression data from human and mouse. In particular, we study differences in spatial expression among human-mouse paralogs, specifically those duplicated after the radiation of mammals, and compare them to pairs of orthologs in the same species. Our results show that gene duplication leads to increased levels of tissue specificity and that this tends to occur promptly after the duplication event.

12.
Protein-DNA recognition plays an essential role in the regulation of gene expression. Regulatory proteins are known to recognize specific DNA sequences directly through atomic contacts (intermolecular readout) and/or indirectly through the conformational properties of the DNA (intramolecular readout). However, little is known about the respective contributions made by these so-called direct and indirect readout mechanisms. We addressed this question by making use of information extracted from a structural database containing many protein-DNA complexes. We quantified the specificity of intermolecular (direct) readout by statistical analysis of base-amino acid interactions within protein-DNA complexes. The specificity of the intramolecular (indirect) readout due to DNA was quantified by statistical analysis of the sequence-dependent DNA conformation. Systematic comparison of these specificities in a large number of protein-DNA complexes revealed that both intermolecular and intramolecular readouts contribute to the specificity of protein-DNA recognition, and that their relative contributions vary depending upon the protein-DNA complex. We demonstrated that combining the intermolecular and intramolecular energies derived from the statistical analyses leads to enhanced specificity, and that the combined energy could explain experimental data on binding affinity changes caused by base mutations. These results provide new insight into the relationship between specificity and structure in the process of protein-DNA recognition, which could lead to prediction of specific protein-DNA binding sites.

13.
Alpha and beta tubulin are well-characterized paralogs with similar structures and functions. We quantify the variability of every amino acid position in both tubulins from the aligned sequences of their numerous known orthologs. By aligning the variability profiles, we identify residues that differ significantly in variability between alpha and beta tubulin. Most of these residues are part of well-defined secondary structures and are clustered around the nucleotide binding pocket, the site of greatest functional difference between the two paralogs. The remaining residues with a large difference in variability are located in the N-terminal loop between H1 and S2. We therefore predict that certain residues in this unstructured region also contribute to a functional difference between alpha and beta tubulin. Furthermore, we find that the most restrictive variability-based alignment is nearly identical to the true structure-based alignment. Thus, by using a stringent variability-based alignment to approximate the true alignment, the method introduced here may predict sites of functional distinction between paralogous proteins even in the absence of structural information.
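The idea of comparing per-position variability between two paralog families can be sketched as follows. The abstract does not state which variability measure was used, so Shannon entropy per alignment column is an assumption here, as are the function names and the toy sequences. The sketch also assumes the two ortholog alignments already share the same column numbering.

```python
import math
from collections import Counter
from typing import List


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def variability_profile(alignment: List[str]) -> List[float]:
    """Entropy of every column in an alignment of equal-length sequences."""
    length = len(alignment[0])
    return [column_entropy("".join(seq[i] for seq in alignment))
            for i in range(length)]


def variability_difference(family_a: List[str], family_b: List[str]) -> List[float]:
    """Per-column entropy of family_a minus that of family_b."""
    return [a - b for a, b in zip(variability_profile(family_a),
                                  variability_profile(family_b))]


# Hypothetical toy alignments: column 1 varies only in the first family,
# column 2 varies only in the second.
alpha_orthologs = ["MKAL", "MKAL", "MRAL", "MQAL"]
beta_orthologs = ["MKAL", "MKGL", "MKSL", "MKTL"]
diff = variability_difference(alpha_orthologs, beta_orthologs)
print(diff)
```

Columns with a large absolute difference are the candidates for functional distinction between the paralogs; columns near zero are either conserved in both families or variable in both.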

14.
15.
Olfactory receptors (ORs) are a large family of proteins involved in the recognition and discrimination of numerous odorants. These receptors belong to the G-protein coupled receptor (GPCR) hyperfamily, for which little structural data are available. In this study we predict the binding site residues of OR proteins by analyzing a set of 1441 OR protein sequences from mouse and human. The central insight utilized is that functional contact residues would be conserved among pairs of orthologous receptors, but considerably less conserved among paralogous pairs. Using judiciously selected subsets of 218 ortholog pairs and 518 paralog pairs, we have identified 22 sequence positions that are both highly conserved among the putative orthologs and variable among paralogs. These residues are disposed on transmembrane helices 2 to 7, and on the second extracellular loop of the receptor. Strikingly, although the prediction makes no assumption about the location of the binding site, these amino acid positions are clustered around a pocket in a structural homology model of ORs, mostly facing the inner lumen. We propose that the identified positions constitute the odorant binding site. This conclusion is supported by the observation that all but one of the predicted binding site residues correspond to ligand-contact positions in other rhodopsin-like GPCRs.
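The selection criterion described above (positions conserved within ortholog pairs but variable within paralog pairs) can be illustrated with a small sketch. The thresholds `min_ortholog` and `max_paralog`, the function names, and the toy sequences are assumptions for this example; the abstract does not give the scoring scheme actually used.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # two aligned sequences of equal length


def identity_per_position(pairs: List[Pair]) -> List[float]:
    """Fraction of pairs whose two residues match at each aligned position."""
    length = len(pairs[0][0])
    fractions = []
    for i in range(length):
        # A gap ("-") never counts as a match.
        same = sum(1 for s, t in pairs if s[i] == t[i] and s[i] != "-")
        fractions.append(same / len(pairs))
    return fractions


def candidate_positions(orthologs: List[Pair], paralogs: List[Pair],
                        min_ortholog: float = 0.9,
                        max_paralog: float = 0.5) -> List[int]:
    """Positions conserved in ortholog pairs yet variable in paralog pairs."""
    orth = identity_per_position(orthologs)
    para = identity_per_position(paralogs)
    return [i for i, (o, p) in enumerate(zip(orth, para))
            if o >= min_ortholog and p <= max_paralog]


# Hypothetical toy data: positions 1 and 2 differ between paralogs only.
ortholog_pairs = [("ACDK", "ACDK"), ("ACEK", "ACEK")]
paralog_pairs = [("ACDK", "ACEK"), ("ACDK", "AGEK")]
positions = candidate_positions(ortholog_pairs, paralog_pairs)
print(positions)
```

Positions that are identical in every pair of both kinds (such as 0 and 3 in the toy data) are excluded, since conservation across paralogs suggests a structural role rather than ligand specificity.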

16.
Structural and biochemical studies of Cys(2)His(2) zinc finger proteins initially led several groups to propose a "recognition code" involving a simple set of rules relating key amino acid residues in the zinc finger protein to bases in its DNA site. One recent study from our group, involving geometric analysis of protein-DNA interactions, has discussed limitations of this idea and has shown how the spatial relationship between the polypeptide backbone and the DNA helps to determine what contacts are possible at any given position in a protein-DNA complex. Here we report a study of a zinc finger variant that highlights yet another source of complexity inherent in protein-DNA recognition. In particular, we find that mutations can cause key side-chains to rearrange at the protein-DNA interface without fundamental changes in the spatial relationship between the polypeptide backbone and the DNA. This is clear from a simple analysis of the binding site preferences and co-crystal structures for the Asp20→Ala point mutant of Zif268. This point mutation in finger one changes the specificity of the protein from GCG TGG GCG to GCG TGG GC(G/T), and we have solved crystal structures of the D20A mutant bound to both types of sites. The structure of the D20A mutant bound to the GCG site reveals that contacts from key residues in the recognition helix are coupled in complex ways. The structure of the complex with the GCT site also shows an important new water molecule at the protein-DNA interface. These side-chain/side-chain interactions, and the resultant changes in hydration at the interface, affect binding specificity in ways that cannot be predicted either from a simple recognition code or from analysis of spatial relationships at the protein-DNA interface. Accurate computer modeling of protein-DNA interfaces remains a challenging problem and will require systematic strategies for modeling side-chain rearrangements and changes in hydration.

17.
The enhanceosome   Total citations: 1 (self-citations: 0, citations by others: 1)

18.
A phylogenomic study of the MutS family of proteins.   Total citations: 23 (self-citations: 4, citations by others: 19)
The MutS protein of Escherichia coli plays a key role in the recognition and repair of errors made during the replication of DNA. Homologs of MutS have been found in many species including eukaryotes, Archaea and other bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species; many species encode multiple MutS homologs with distinct functions. To better characterize the MutS protein family, I have used a combination of phylogenetic reconstructions and analysis of complete genome sequences. This phylogenomic analysis is used to infer the evolutionary relationships among the MutS family members and to divide the family into subfamilies of orthologs. Analysis of the distribution of these orthologs in particular species and examination of the relationships within and between subfamilies are used to identify likely evolutionary events (e.g. gene duplications, lateral transfer and gene loss) in the history of the MutS family. In particular, evidence is presented that a gene duplication early in the evolution of life resulted in two main MutS lineages, one including proteins known to function in mismatch repair and the other including proteins known to function in chromosome segregation and crossing-over. The inferred evolutionary history of the MutS family is used to make predictions about some of the uncharacterized genes and species included in the analysis. For example, since function is generally conserved within subfamilies and lineages, it is proposed that the function of uncharacterized proteins can be predicted by their position in the MutS family tree. The uses of phylogenomic approaches to the study of genes and genomes are discussed.

19.
Sun X, Cao Y, Wang S. Plant physiology, 2006, 140(3): 998-1008
The rice (Oryza sativa) Xa26 gene, which confers resistance to bacterial blight disease and encodes a leucine-rich repeat (LRR) receptor kinase, resides at a locus clustered with tandem homologous genes. To investigate the evolution of this family, four haplotypes from the two subspecies of rice, indica and japonica, were analyzed. Comparative sequence analysis of 34 genes of 10 types of paralogs of the family revealed haplotype polymorphisms and pronounced paralog diversity. The orthologs in different haplotypes were more similar than the paralogs in the same haplotype. At least five types of paralogs were formed before the separation of indica and japonica subspecies. Only 7% of amino acid sites were detected to be under positive selection, which occurred in the extracytoplasmic domain. Approximately 74% of the positively selected sites were solvent-exposed amino acid residues of the LRR domain that have been proposed to be involved in pathogen recognition, and 73% of the hypervariable sites detected in the LRR domain were subject to positive selection. The family is formed by tandem duplication followed by diversification through recombination, deletion, and point mutation. Most variation among genes in the family is caused by point mutations and positive selection.

20.