Similar articles
20 similar articles found
1.
Bacterial species, and even strains within species, can vary greatly in their gene contents and metabolic capabilities. We examine the evolution of this diversity by assessing the distribution and ancestry of each gene in 13 sequenced isolates of Escherichia coli and Shigella. We focus on the emergence and demise of two specific classes of genes, ORFans (genes with no homologs in present databases) and HOPs (genes with distant homologs), since these genes, in contrast to most conserved ancestral sequences, are known to be a major source of the novel features in each strain. We find that the rates of gain and loss of these genes vary greatly among strains as well as through time, and that ORFans and HOPs show very different behavior with respect to their emergence and demise. Although HOPs, which mostly represent gene acquisitions from other bacteria, originate more frequently, ORFans are much more likely to persist. This difference suggests that many adaptive traits are conferred by completely novel genes that do not originate in other bacterial genomes. With respect to the demise of these acquired genes, we find that strains of Shigella lose genes, both by disruption events and by complete removal, at accelerated rates.

2.
Structural biology sheds light on the puzzle of genomic ORFans (cited 5 times: 0 self-citations, 5 citations by others)
Genomic ORFans are orphan open reading frames (ORFs) with no significant sequence similarity to other ORFs. ORFans comprise 20-30% of the ORFs of most completely sequenced genomes. Because nothing can be learnt about ORFans via sequence homology, the functions and evolutionary origins of ORFans remain a mystery. Furthermore, because relatively few ORFans have been experimentally characterized, it has been suggested that most ORFans are not likely to correspond to functional, expressed proteins, but rather to spurious ORFs, pseudo-genes or to rapidly evolving proteins with non-essential roles. As a snapshot view of current ORFan structural studies, we searched for ORFans among proteins whose three-dimensional structures have been recently determined. We find that functional and structural studies of ORFans are not as underemphasized as previously suggested. These recently determined structures correspond to ORFans from all kingdoms of life, and include proteins that have previously been functionally characterized, as well as structural genomics targets of unknown function labeled as "hypothetical proteins". This suggests that many of the ORFans in the databases are likely to correspond to expressed, functional (and even essential) proteins. Furthermore, the recently determined structures include examples of the various types of ORFans, suggesting that the functions and evolutionary origins of ORFans are diverse. Although this survey sheds some light on the ORFan mystery, further experimental studies are required to gain a better understanding of the role and origins of the tens of thousands of ORFans awaiting characterization.

3.
The mimivirus genome contains many genes that lack homologs in the sequence database and are thus known as ORFans. In addition, mimivirus genes that encode proteins belonging to known fold families are in some cases fused to domain-sized segments that cannot be classified. One such ORFan region is present in the mimivirus enzyme R596, a member of the Erv family of sulfhydryl oxidases. We determined the structure of a variant of full-length R596 and observed that the carboxy-terminal region of R596 assumes a folded, compact domain, demonstrating that these ORFan segments can be stable structural units. Moreover, the R596 ORFan domain fold is novel, hinting at the potential wealth of protein structural innovation yet to be discovered in large double-stranded DNA viruses. In the context of the R596 dimer, the ORFan domain contributes to formation of a broad cleft enriched with exposed aromatic groups and basic side chains, which may function in binding target proteins or localization of the enzyme within the virus factory or virions. Finally, we find evidence for an intermolecular dithiol/disulfide relay within the mimivirus R596 dimer, the first such extended, intersubunit redox-active site identified in a viral sulfhydryl oxidase.

4.
ORFans are orphan open reading frames. The number of ORFans is steadily increasing despite the growth of genome databases. Characterizing ORFans is essential to fully understanding the diversity of the structure and function of proteins in nature. In this study, MPN423 from Mycoplasma pneumoniae has been cloned, expressed, purified, and crystallized. MPN423 is an orthologous ORFan whose only known homologue in the whole genome database is MG296 from M. genitalium. X-ray diffraction data were collected to 2.7 Å from a crystal of selenomethionine-substituted MPN423. The crystal belongs to the primitive monoclinic space group P2(1), with unit-cell parameters a = 50.5 Å, b = 89.2 Å, c = 50.6 Å, and β = 102.9°. A preliminary electron density map shows five alpha-helical segments per MPN423 molecule. A full structure determination is under way to provide helpful information on general questions about orthologous ORFan products.

5.
The complete nucleotide sequences of over 37 microbial and three eukaryote genomes are already publicly available, and more sequencing is in progress. Despite this accumulation of data, newly sequenced microbial genomes continue to reveal up to 50% of functionally uncharacterized "anonymous" genes. A majority of these anonymous proteins have homologues in other organisms, whereas the rest exhibit no clear similarity to any other sequence in the databases. This set of unique, apparently species-specific, sequences is referred to as ORFans. The biochemical and structural analysis of ORFan gene products is of both evolutionary and functional interest. Here we report the cloning and expression of the Escherichia coli ORFan gene ykfE and the functional characterization of the encoded protein. Under physiological conditions, the protein is a homodimer with a strong affinity for C-type lysozyme, as revealed by co-purification and co-crystallization. Activity measurements and fluorescence studies demonstrated that the YkfE gene product is a potent C-type lysozyme inhibitor (Ki of approximately 1 nM). To denote this newly assigned function, ykfE has now been registered under the new gene name Ivy (inhibitor of vertebrate lysozyme) at the E. coli genetic stock center.

6.
HOPs (HSP70–HSP90 organizing proteins) are a highly conserved family of HSP70 and HSP90 co-chaperones whose role in assisting the folding of various hormonal receptors has been extensively studied in mammals. In plants, HOPs are mainly associated with stress response, but their potential involvement in hormonal networks remains completely unexplored. In this article we show that a member of the HOP family, HOP3, is involved in the jasmonic acid (JA) pathway and is linked to plant defense responses not only to pathogens, but also to a generalist herbivore. The JA pathway regulates responses to Botrytis cinerea infection and to Tetranychus urticae feeding; our data demonstrate that the Arabidopsis (Arabidopsis thaliana) hop3-1 mutant shows an increased susceptibility to both. The hop3-1 mutant exhibits reduced sensitivity to JA derivatives in root growth assays and downregulation of different JA-responsive genes in response to methyl jasmonate, further revealing the relevance of HOP3 in the JA pathway. Interestingly, yeast two-hybrid assays and in planta co-immunoprecipitation assays found that HOP3 interacts with COI1, suggesting that COI1 is a target of HOP3. Consistent with this observation, COI1 activity is reduced in the hop3-1 mutant. All these data strongly suggest that, specifically among HOPs, HOP3 plays a relevant role in the JA pathway by regulating COI1 activity in response to JA and, consequently, participating in defense signaling to biotic stresses.

One-sentence summary: The co-chaperone protein HOP3 (HSP70-HSP90 ORGANIZING PROTEIN 3) regulates the activity of jasmonic acid co-receptor CORONATINE INSENSITIVE 1 and functions in plant defense.

7.

Background

Mimivirus isolated from A. polyphaga is the largest virus discovered so far. It is unique among all the viruses in having genes related to translation, DNA repair and replication which bear close homology to eukaryotic genes. Nevertheless, only a small fraction of the proteins (33%) encoded in this genome has been assigned a function. Furthermore, a large fraction of the unassigned protein sequences bear no sequence similarity to proteins from other genomes. These sequences are referred to as ORFans. Because of their lack of sequence similarity to other proteins, they cannot be assigned putative functions using standard sequence comparison methods. As part of our genome-wide computational efforts aimed at characterizing Mimivirus ORFans, we have applied fold-recognition methods to predict the structure of these ORFans, and functions were then derived based on conservation of functionally important residues in sequence-template alignments.

Results

Using fold recognition, we have identified highly confident computational 3D structural assignments for 21 Mimivirus ORFans. In addition, highly confident functional predictions for 6 of these ORFans were derived by analyzing the conservation of functional motifs between the predicted structures and proteins of known function. This analysis allowed us to classify these 6 previously unannotated ORFans into their specific protein families: carboxylesterase/thioesterase, metal-dependent deacetylase, P-loop kinases, 3-methyladenine DNA glycosylase, BTB domain and eukaryotic translation initiation factor eIF4E.

Conclusion

Using stringent fold recognition criteria we have assigned three-dimensional structures for 21 of the ORFans encoded in the Mimivirus genome. Further, based on the 3D models and an analysis of the conservation of functionally important residues and motifs, we were able to derive functional attributes for 6 of the ORFans. Our computational identification of important functional sites in these ORFans can be the basis for a subsequent experimental verification of our predictions. Further computational and experimental studies are required to elucidate the 3D structures and functions of the remaining Mimivirus ORFans.

8.
Siew N, Fischer D. Proteins 2003, 53(2): 241-251
Singleton sequence ORFans are orphan ORFs (open reading frames) that have no detectable sequence similarity to any other sequence in the databases. ORFans are of particular interest not only as evolutionary puzzles but also because we can learn little about them using bioinformatics tools. Here, we present a first systematic analysis of singleton ORFans in the first 60 fully sequenced microbial genomes. We show that although ORFans have been underemphasized, the number of ORFans is steadily growing, currently accounting for 23,634 sequences. At the same time, the percentage of ORFans as a fraction of all sequences is slowly diminishing, and is currently about 14%. Short ORFans comprise about 61% of all ORFans. The abundance of short ORFans may be due to a yet unexplained artifact. The data also suggest that the number of longer ORFans may soon diminish as more genomes of closely related organisms become available. To better address the questions about the functions and origins of ORFans, we propose to focus further studies on the longer ORFans, with emphasis on three new types of ORFans: ORFan modules, paralogous ORFans, and orthologous ORFans. We conclude that the large number of ORFans reflects an intrinsic property of the genetic material not yet fully understood. Further computational and experimental studies aimed at understanding Nature's protein diversity should also include ORFans.
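The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below back-calculates the approximate size of the sequence set and the split between short and longer ORFans. It assumes the quoted percentages are rounded values, so the results are only rough estimates; the function and variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract; the two percentages are rounded in the source.
ORFAN_COUNT = 23_634      # singleton ORFans in the first 60 microbial genomes
ORFAN_FRACTION = 0.14     # "about 14%" of all sequences
SHORT_FRACTION = 0.61     # "about 61%" of all ORFans are short


def implied_totals(orfans: int, orfan_fraction: float, short_fraction: float):
    """Return (approx. total sequences, approx. short ORFans, approx. longer ORFans)."""
    total_sequences = round(orfans / orfan_fraction)
    short_orfans = round(orfans * short_fraction)
    long_orfans = orfans - short_orfans
    return total_sequences, short_orfans, long_orfans


total, short, long_ = implied_totals(ORFAN_COUNT, ORFAN_FRACTION, SHORT_FRACTION)
print(f"~{total:,} sequences in total; ~{short:,} short and ~{long_:,} longer ORFans")
```

Because the inputs are rounded, the estimates should be read as orders of magnitude (roughly 170,000 sequences, of which somewhat over 9,000 are longer ORFans) and not as exact counts.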

9.
Siew N, Saini HK, Fischer D. FEBS Letters 2005, 579(14): 3175-3182
A large number of sequences in each newly sequenced genome correspond to lineage- and species-specific proteins, also known as ORFans. Amongst these ORFans, a large number are sequences with unknown structures and functions. We have identified a family of sequences, annotated as hypothetical proteins, which are specific to Bacillus and have carried out a computational study aimed at characterizing this family. Fold-recognition methods predict that these sequences belong to the alpha/beta hydrolase fold. We suggest possible catalytic triads for the ORFans and propose a hypothesis regarding the possible families within the alpha/beta hydrolase superfamily to which they may belong.

10.
11.
12.
ORFans are hypothetical proteins lacking any significant sequence similarity with other proteins. Here, we highlighted by quantitative proteomics the TGAM_1934 ORFan from the hyperradioresistant Thermococcus gammatolerans archaeon as one of the most abundant hypothetical proteins. This protein has been selected as a priority target for structure determination on the basis of its abundance in three cellular conditions. Its solution structure has been determined using multidimensional heteronuclear NMR spectroscopy. TGAM_1934 displays an original fold, although sharing some similarities with the 3D structure of the bacterial ortholog of frataxin, CyaY, a protein conserved in bacteria and eukaryotes and involved in iron–sulfur cluster biogenesis. These results highlight the potential of structural proteomics in prioritizing ORFan targets for structure determination based on quantitative proteomics data. The proteomic data and structure coordinates have been deposited to the ProteomeXchange with identifier PXD000402 (http://proteomecentral.proteomexchange.org/dataset/PXD000402) and Protein Data Bank under the accession number 2mcf, respectively.

13.
Type II restriction enzymes are commercially important deoxyribonucleases and very attractive targets for protein engineering of new specificities. At the same time they are a very challenging test bed for protein structure prediction methods. Typically, enzymes that recognize different sequences show little or no amino acid sequence similarity to each other and to other proteins. Based on crystallographic analyses that revealed the same PD-(D/E)XK fold for more than a dozen case studies, they were nevertheless considered to be related until the combination of bioinformatics and mutational analyses demonstrated that some of these proteins belong to other, unrelated folds: PLD, HNH, and GIY-YIG. As a part of a large-scale project aiming at identification of a three-dimensional fold for all type II REases with known sequences (currently approximately 1000 proteins), we carried out preliminary structure prediction and selected candidates for experimental validation. Here, we present the analysis of HpaI REase, an ORFan with no detectable homologs, for which we detected a structural template by protein fold recognition, constructed a model using the FRankenstein monster approach and identified a number of residues important for DNA binding and catalysis. These predictions were confirmed by site-directed mutagenesis and in vitro analysis of the mutant proteins. The experimentally validated model of HpaI will serve as a low-resolution structural platform for evolutionary considerations in the subgroup of blunt-cutting REases with different specificities. The research protocol developed in the course of this work represents a streamlined version of the previously used techniques and can be used in a high-throughput fashion to build and validate models for other enzymes, especially ORFans that exhibit no sequence similarity to any other protein in the database.

14.
15.
Phosphagen kinases are found throughout the animal kingdom and catalyze the transfer of a high-energy gamma phosphoryl-group from ATP to a guanidino group on a suitable acceptor molecule such as creatine or arginine. Recent genome sequencing efforts in several proteobacteria, including Desulfotalea psychrophila LSv54, Myxococcus xanthus, Sulfurovum sp. NBC37-1, and Moritella sp. PE36 have revealed what appears to be a phosphagen kinase homolog present in their genomes. Based on sequence comparisons these putative homologs bear a strong resemblance to arginine kinases found in many invertebrates and some protozoa. We describe here a biochemical characterization of one of these homologs from D. psychrophila expressed in E. coli that confirms its ability to reversibly catalyze phosphoryl transfer from ATP to arginine. A phylogenetic analysis suggests that these bacterial homologs are not widely distributed in proteobacterial species. They appear more related to protozoan arginine kinases than to similar proteins seen in some Gram-positive bacteria that share key catalytic residues but encode protein tyrosine kinases. This raises the possibility of horizontal gene transfer as a likely origin of the bacterial arginine kinases.

16.
Seventy integral membrane proteins from the Mycobacterium tuberculosis genome have been cloned and expressed in Escherichia coli. A combination of T7 promoter-based vectors with hexa-His affinity tags and BL21 E. coli strains with additional tRNA genes to supplement sparsely used E. coli codons has been most successful. The expressed proteins have a wide range of molecular weights and numbers of transmembrane helices. Expression of these proteins has been observed in the membrane and insoluble fraction of E. coli cell lysates and, in some cases, in the soluble fraction. The highest expression levels in the membrane fraction were restricted to a narrow range of molecular weights and relatively few transmembrane helices. In contrast, overexpression in insoluble aggregates was distributed over a broad range of molecular weights and numbers of transmembrane helices.

17.
18.
The genome of the choanoflagellate Monosiga brevicollis contains at least three genes for the phosphoryl transfer enzyme, arginine kinase (AK; EC 2.7.3.3). Bioinformatic analyses of the deduced amino acid sequences of the proteins coded for by two of these genes showed that one of these AKs is cytoplasmic (denoted AK1) while the other appears to have an N-terminal mitochondrial targeting peptide (denoted AK2). Cloning and expression of the cDNA for AK1 yielded considerable soluble AK activity. Three AK2 constructs were expressed: one corresponding to the full length protein and two corresponding to truncated versions in which the signal peptide had been deleted. Expression of the former construct yielded minimal soluble activity. In contrast, significant AK activity was found in both truncated constructs, confirming the importance of removal of the targeting peptide for proper folding and catalytic activity. Both AK1 and AK2 are functional oligomers, unlike typical AKs, which are monomeric. A phylogenetic analysis showed that these choanoflagellate AKs group more closely with a supercluster consisting of cytoplasmic and mitochondrial CKs and invertebrate AKs that evolved secondarily from a CK-like ancestor. Reaction-diffusion constraints in choanoflagellates are likely mitigated by the presence of AK isoforms which facilitate energy transport in these highly polarized cells.

19.
MOTIVATION: A large fraction of open reading frames (ORFs) identified as 'hypothetical' proteins correspond to either 'conserved hypothetical' proteins, representing sequences homologous to ORFs of unknown function from other organisms, or to hypothetical proteins lacking any significant sequence similarity to other ORFs in the databases. Elucidating the functions and three-dimensional structures of such orphan ORFs, termed ORFans or poorly conserved ORFs (PCOs), is essential for understanding biodiversity. However, it has been claimed that many ORFans may not encode expressed proteins. RESULTS: A genome-wide experimental study of 'paralogous PCOs' in the halophilic archaeon Halobacterium sp. NRC-1 was conducted. Paralogous PCOs are ORFs with at least one homolog in the same organism, but with no clear homologs in other organisms. The results reveal that mRNA is synthesized for a majority of the Halobacterium sp. NRC-1 paralogous PCO families, including those comprising relatively short proteins, strongly suggesting that these Halobacterium sp. NRC-1 paralogous PCOs correspond to true, expressed proteins. Hence, further computational and experimental studies aimed at characterizing PCOs in this and other organisms are merited. Such efforts could shed light on PCOs' functions and origins, thereby serving to elucidate the vast diversity observed in the genetic material.
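The categories defined in this abstract can be expressed as a small decision rule. The following is a minimal sketch, not the authors' pipeline: it assumes that significant homology hits have already been computed by some search tool, that self-hits have been removed, and that each hit records its source organism. The `Hit` record, the `classify_orf` function and the example identifiers are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One significant homology hit for a query ORF (hypothetical record type)."""
    subject_id: str
    organism: str


def classify_orf(query_organism: str, hits: list[Hit]) -> str:
    """Classify an ORF from its significant hits (self-hits already removed)."""
    has_other_organism_hit = any(h.organism != query_organism for h in hits)
    has_same_organism_hit = any(h.organism == query_organism for h in hits)
    if has_other_organism_hit:
        return "conserved"          # homologs exist in other organisms
    if has_same_organism_hit:
        return "paralogous PCO"     # homologs only within the same organism
    return "singleton ORFan"        # no detectable homologs at all


# Made-up hit lists, for demonstration only.
nrc1 = "Halobacterium sp. NRC-1"
print(classify_orf(nrc1, []))
print(classify_orf(nrc1, [Hit("orf_0002", nrc1)]))
print(classify_orf(nrc1, [Hit("orf_0002", nrc1), Hit("orf_x", "Escherichia coli")]))
```

What counts as a "clear" homolog depends on the significance threshold of the underlying search, which this sketch deliberately leaves to the caller.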

20.
We have tried to approach the nature of the last common ancestor to Haemophilus influenzae and Escherichia coli and to determine how each bacterium could have diverged from this putative organism. The approach used was exhaustive analysis of the homologous proteins coded by genes present in these bacteria, using as criteria for sequence relatedness an alignment of at least 80 amino acid residues and a PAM distance (number of accepted point mutations per 100 residues separating two sequences) below 250. Evolutionarily significant similarities were found between 1,345 H. influenzae proteins (85% of the total genome) and 3,058 E. coli proteins (75% of the total genome), many of them belonging to families of various sizes (from 666 doublets to 35 large groups of more than 10 members). Nearly all the genes found by this approach to be duplicated in both bacteria were already duplicated in their last common ancestor. This was deduced from (1) the comparison of the respective distributions of evolutionary distances between orthologs (genes separated only by speciation events) and paralogs (genes duplicated in the same genome) and (2) the analysis of the phylogenetic trees reconstructed for each family of paralogs containing at least two members belonging to each bacterium. The distributions of the different categories of homologs show a significant loss of paralogous genes in H. influenzae (reduction proportional to the genome size), of many sequences which are still present in one copy in E. coli, and of some entire gene families. Phylogenetic trees also confirmed this recent loss of paralogous genes in H. influenzae. Thus, the genome size of the last common ancestor of these two bacteria would have been close to that of present-day E. coli, and the evolution of H. influenzae toward a parasitic life led to an important decrease in its genome size by some mechanism of streamlining.
During this recent evolution, the memory of the gene order present in the last common ancestor has been blurred, but a few short conserved chromosomal fragments can still be detected in present-day E. coli and H. influenzae.
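The relatedness criterion stated in this abstract (an alignment of at least 80 residues and a PAM distance below 250) amounts to a two-condition filter. The sketch below applies it to pre-computed alignment statistics. It is an illustration only: the function name, the constants' names and the example pairs are assumptions, and computing the alignments and PAM distances themselves is outside its scope.

```python
# Thresholds taken from the abstract.
MIN_ALIGNED_RESIDUES = 80    # alignment must span at least 80 amino acid residues
MAX_PAM_DISTANCE = 250.0     # PAM distance must be strictly below 250


def is_related(aligned_residues: int, pam_distance: float) -> bool:
    """Return True when a protein pair meets both relatedness criteria."""
    return aligned_residues >= MIN_ALIGNED_RESIDUES and pam_distance < MAX_PAM_DISTANCE


# Made-up alignment statistics: (pair label, aligned residues, PAM distance).
candidate_pairs = [
    ("pair_A", 120, 180.0),   # long enough and close enough
    ("pair_B", 60, 90.0),     # alignment too short
    ("pair_C", 200, 250.0),   # distance not below 250
]

related = [name for name, n, pam in candidate_pairs if is_related(n, pam)]
print(related)
```

Only pairs passing both conditions would be carried forward into family building and the ortholog/paralog distance comparisons described in the abstract.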
