Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Distribution of orphan metabolic activities   Total citations: 2 (self-citations: 0, citations by others: 2)
A significant fraction (30-40%) of known metabolic activities is currently orphan. Although orphan activities have been biochemically characterized, we do not know a single gene responsible for these reactions in any organism. The problem of orphan activities represents one of the major challenges of modern biochemistry. We analyze the distribution of orphans across biochemical space, through years of enzymatic characterization, and by biological organisms. We find that orphan metabolic activities have been accumulating for many decades. They are widely distributed across enzymatic functional space and metabolic network neighborhoods. Although orphans are relatively more abundant in less studied species, over half of orphan reactions have been experimentally characterized in more than one organism. Shrinking the space of orphan activities will likely require a close collaboration between computational and experimental laboratories.

2.
Despite the current wealth of sequencing data, one-third of all biochemically characterized metabolic enzymes lack a corresponding gene or protein sequence, and as such can be considered orphan enzymes. They represent a major gap between our molecular and biochemical knowledge, and consequently are not amenable to modern systemic analyses. As 555 of these orphan enzymes have metabolic pathway neighbours, we developed a global framework that utilizes the pathway and (meta)genomic neighbour information to assign candidate sequences to orphan enzymes. For 131 orphan enzymes (37% of those for which (meta)genomic neighbours are available), we associate sequences with them using scoring parameters with an estimated accuracy of 70%, implying functional annotation of 16 345 gene sequences in numerous (meta)genomes. As a case in point, two of these candidate sequences were experimentally validated to encode the predicted activity. In addition, we augmented the currently available genome-scale metabolic models with these new sequence–function associations and were able to expand the models by on average 8%, with a considerable change in the flux connectivity patterns and improved essentiality prediction.

3.
4.
The metabolome refers to the complete set of metabolites synthesized through a series of enzymatic steps from various biochemical pathways that process the information encoded in the plant genome. Knowledge about the synthesis and regulation of plant metabolic substances has improved substantially with the availability of omics data originating from the sequencing of plant genomes. Metabolic profiling of crops is becoming increasingly popular for assessing plant phenotypes and genetic diversity. Metabolic compositional changes vividly reflect the changes occurring during plant growth and development and in response to stress. Hence, the study of plant metabolic pathways and the interconnections between them in the context of systems biology is becoming increasingly popular for identifying candidate genes. The present article reviews recent developments in the analysis of plant metabolomics, available bioinformatics techniques and databases employed for comparative pathway analysis, metabolic QTLs, and their application in plants.

5.

Metabolons are multi-enzyme protein complexes composed of enzymes catalyzing sequential reactions in a metabolic pathway. Metabolons mediate substrate channeling between the enzyme catalytic cores to enhance the pathway reactions, to contain reactive intermediates, and to prevent competing enzymes from accessing the intermediates. These properties provide unique advantages in metabolic regulation. The discovery of plant metabolons has been accelerated by recent technical developments, and a considerable number of metabolons involved in both primary and secondary metabolism have been reported in the last decade. The findings on plant metabolons are comprehensively reviewed here, indicating metabolome-wide engagement of metabolons. However, unexplored frontiers remain for the further discovery of metabolons in plant metabolism. Pathways with high potential for novel metabolons, and technical issues to be solved for future discovery, are also discussed.


6.
Genome-scale metabolic model of Helicobacter pylori 26695   Total citations: 6 (self-citations: 0, citations by others: 6)
A genome-scale metabolic model of Helicobacter pylori 26695 was constructed from genome sequence annotation, biochemical, and physiological data. This represents an in silico model largely derived from genomic information for an organism for which there is substantially less biochemical information available relative to previously modeled organisms such as Escherichia coli. The reconstructed metabolic network contains 388 enzymatic and transport reactions and accounts for 291 open reading frames. Within the paradigm of constraint-based modeling, extreme-pathway analysis and flux balance analysis were used to explore the metabolic capabilities of the in silico model. General network properties were analyzed and compared to similar results previously generated for Haemophilus influenzae. A minimal medium required by the model to generate required biomass constituents was calculated, indicating the requirement of eight amino acids, six of which correspond to essential human amino acids. In addition, a list of potential substrates capable of fulfilling the bulk carbon requirements of H. pylori was identified. A deletion study was performed wherein reactions and associated genes in central metabolism were deleted and their effects were simulated under a variety of substrate availability conditions, yielding a number of reactions that are deemed essential. Deletion results were compared to recently published in vitro essentiality determinations for 17 genes. The in silico model accurately predicted 10 of 17 deletion cases, with partial support for additional cases. Collectively, the results presented herein suggest an effective strategy of combining in silico modeling with experimental technologies to enhance biological discovery for less characterized organisms and their genomes.

7.
Organisms that live in deserts offer the opportunity to investigate how species adapt to environmental conditions that are lethal to most plants and animals. In the hot deserts of North America, high temperatures and lack of water are conspicuous challenges for organisms living there. The cactus mouse (Peromyscus eremicus) displays several adaptations to these conditions, including low metabolic rate, heat tolerance, and the ability to maintain homeostasis under extreme dehydration. To investigate the genomic basis of desert adaptation in cactus mice, we built a chromosome-level genome assembly and resequenced 26 additional cactus mouse genomes from two locations in southern California (USA). Using these data, we integrated comparative, population, and functional genomic approaches. We identified 16 gene families exhibiting significant contractions or expansions in the cactus mouse compared to 17 other Myodontine rodent genomes, and found 232 sites across the genome associated with selective sweeps. Functional annotations of candidate gene families and selective sweeps revealed a pervasive signature of selection at genes involved in the synthesis and degradation of proteins, consistent with the evolution of cellular mechanisms to cope with protein denaturation caused by thermal and hyperosmotic stress. Other strong candidate genes included receptors for bitter taste, suggesting a dietary shift towards chemically defended desert plants and insects, and a growth factor involved in lipid metabolism, potentially involved in prevention of dehydration. Understanding how species adapted to deserts will provide an important foundation for predicting future evolutionary responses to increasing temperatures, droughts and desertification in the cactus mouse and other species.

8.
Researchers in applied biocatalysis are now reaping the rewards of intensive effort and technological developments in the sequencing of the genomes of microbial and plant species. The genomic resource contains the sequences of millions of new genes with potential application in industrial biotechnology and includes families of enzymes within discrete genomes that potentially catalyze equivalent chemical reactions. One of the key emerging characteristics of these intragenomic complements of enzymes is the impressive breadth of catalytic diversity that is observed within them. This diversity may have been acquired either in order to combat the spectrum of metabolic challenges with which the organism may be presented in its natural environment or as part of the biosynthetic machinery evolved to produce a spectrum of secondary metabolites that will prove to be advantageous in establishing a niche. Attempts have been made to functionally characterize the intragenomic complements of enzyme families catalyzing diverse reactions including carbonyl reduction, ester hydrolysis and Baeyer–Villiger oxidation, in Gram-positive bacteria, yeasts, filamentous fungi and the plant Arabidopsis thaliana. These studies are beginning to describe in detail for the first time the impressive range of catalytic potential within single organisms for attributes such as substrate range, enantioselectivity or thermostability, each of which is of interest from an enzyme discovery perspective.

9.
Genome-wide association (GWA) studies represent a powerful strategy for identifying susceptibility genes for complex diseases in human populations but results must be confirmed and replicated. Because of the close homology between mouse and human genomes, the mouse can be used to add evidence to genes suggested by human studies. We used the mouse quantitative trait loci (QTL) map to interpret results from a GWA study for genes associated with plasma HDL cholesterol levels. We first positioned single nucleotide polymorphisms (SNPs) from a human GWA study on the genomic map for mouse HDL QTL. We then used mouse bioinformatics, sequencing, and expression studies to add evidence for one well-known HDL gene (Abca1) and three newly identified genes (Galnt2, Wwox, and Cdh13), thus supporting the results of the human study. For GWA peaks that occur in human haplotype blocks with multiple genes, we examined the homologous regions in the mouse to prioritize the genes using expression, sequencing, and bioinformatics from the mouse model, showing that some genes were unlikely candidates and adding evidence for candidate genes Mvk and Mmab in one haplotype block and Fads1 and Fads2 in the second haplotype block. Our study highlights the value of mouse genetics for evaluating genes found in human GWA studies.

10.
Plant genomes appear to exploit the process of gene duplication as a primary means of acquiring biochemical and developmental flexibility. Thus, for example, most of the enzymatic components of plant secondary metabolism are encoded by small families of genes that originated through duplication over evolutionary time. The dynamics of gene family evolution are well illustrated by the genes that encode chalcone synthase (CHS), the first committed step in flavonoid biosynthesis. We review pertinent facts about CHS evolution in flowering plants with special reference to the morning glory genus, Ipomoea. Our review shows that new CHS genes are recruited recurrently in flowering plant evolution. Rates of nucleotide substitution are frequently accelerated in new duplicate genes, and there is clear evidence for repeated shifts in enzymatic function among duplicate copies of CHS genes. In addition, we present new data on expression patterns of CHS genes as a function of tissue and developmental stage in the common morning glory (I. purpurea). These data show extensive differentiation in gene expression among duplicate copies of CHS genes. We also show that a single mutation which blocks anthocyanin biosynthesis in the floral limb is correlated with a loss of expression of one of the six duplicate CHS genes present in the morning glory genome. This suggests that different duplicate copies of CHS have acquired specialized functional roles over the course of evolution. We conclude that recurrent gene duplication and subsequent differentiation is a major adaptive strategy in plant genome evolution.

11.
Bardet-Biedl syndrome (BBS) is an autosomal recessive, genetically heterogeneous, pleiotropic human disorder characterized by obesity, retinopathy, polydactyly, renal and cardiac malformations, learning disabilities, and hypogenitalism. Eight BBS genes representing all known mapped loci have been identified. Mutation analysis of the known BBS genes in BBS patients indicates that additional BBS genes exist and/or that unidentified mutations exist in the known genes. To identify new BBS genes, we performed homozygosity mapping of small, consanguineous BBS pedigrees, using moderately dense SNP arrays. A bioinformatics approach combining comparative genomic analysis and gene expression studies of a BBS-knockout mouse model was used to prioritize BBS candidate genes within the newly identified loci for mutation screening. By use of this strategy, parathyroid hormone-responsive gene B1 (B1) was found to be a novel BBS gene (BBS9), supported by the identification of homozygous mutations in BBS patients. The identification of BBS9 illustrates the power of using a combination of comparative genomic analysis, gene expression studies, and homozygosity mapping with SNP arrays in small, consanguineous families for the identification of rare autosomal recessive disorders. We also demonstrate that small, consanguineous families are useful in identifying intragenic deletions. This type of mutation is likely to be underreported because of the difficulty of deletion detection in the heterozygous state by the mutation screening methods that are used in many studies.

12.
Although the proteins of the lysine fermentation pathway were biochemically characterized more than thirty years ago, the genes encoding the proteins that catalyze three steps of this pathway are still unknown. We combined gene context, similarity of enzymatic mechanisms, and molecular weight comparisons with known proteins to select candidate genes for these three orphan proteins. We used a wastewater metagenomic collection of sequences to find and characterize the missing genes of the lysine fermentation pathway. After recombinant protein production and purification following cloning in Escherichia coli, we demonstrated that these genes (named kdd, kce, and kal) encode an L-erythro-3,5-diaminohexanoate dehydrogenase, a 3-keto-5-aminohexanoate cleavage enzyme, and a 3-aminobutyryl-CoA ammonia lyase, respectively. Because all of the genes of the pathway are now identified, we used this breakthrough to detect lysine-fermenting bacteria in sequenced genomes. We identified twelve bacteria that possess these genes and thus are expected to ferment lysine, and their gene organization is discussed.

13.
Highly reduced genomes of 144-416 kilobases have been described from nutrient-provisioning bacterial symbionts of several insect lineages [1-5]. Some host insects have formed stable associations with pairs of bacterial symbionts that live in specialized cells and provide them with essential nutrients; genomic data from these systems have revealed remarkable levels of metabolic complementarity between the symbiont pairs [3, 4, 6, 7]. The mealybug Planococcus citri (Hemiptera: Pseudococcidae) contains dual bacterial symbionts existing with an unprecedented organization: an unnamed gammaproteobacterium, for which we propose the name Candidatus Moranella endobia, lives inside the betaproteobacterium Candidatus Tremblaya princeps [8]. Here we describe the complete genomes and metabolic contributions of these unusual nested symbionts. We show that whereas there is little overlap in retained genes involved in nutrient production between symbionts, several essential amino acid pathways in the mealybug assemblage require a patchwork of interspersed gene products from Tremblaya, Moranella, and possibly P. citri. Furthermore, although Tremblaya has the smallest cellular genome yet described, it contains a genomic inversion present in both orientations in individual insects, starkly contrasting with the extreme structural stability typical of highly reduced bacterial genomes [4, 9, 10].

14.
15.
The Acidianus hospitalis W1 genome consists of a minimally sized chromosome of about 2.13 Mb and a conjugative plasmid, pAH1, and the organism is a host for the model filamentous lipothrixvirus AFV1. The chromosome carries three putative replication origins in conserved genomic regions and two large regions where non-essential genes are clustered. Within these variable regions, a few orphan orfB and other elements of the IS200/607/605 family are concentrated with a novel class of MITE-like repeat elements. There are also 26 highly diverse vapBC antitoxin–toxin gene pairs proposed to facilitate maintenance of local chromosomal regions and to minimise the impact of environmental stress. Complex and partially defective CRISPR/Cas/Cmr immune systems are present and interspersed with five vapBC gene pairs. Remnants of integrated viral genomes and plasmids are located at five intron-less tRNA genes, and several non-coding RNA genes are predicted that are conserved in other Sulfolobus genomes. The putative metabolic pathways for sulphur metabolism show some significant differences from those proposed for other Acidianus and Sulfolobus species. The small and relatively stable genome of A. hospitalis W1 renders it a promising candidate for developing the first Acidianus genetic systems.

16.
The metabolic network is an important biological network which consists of enzymes and chemical compounds. However, a large number of metabolic pathways remain unknown, and most organism-specific metabolic pathways contain many missing enzymes. We present a novel method to identify the genes coding for missing enzymes using available genomic and chemical information from bacterial genomes. The proposed method consists of two steps: (a) estimation of the functional association between the genes with respect to chromosomal proximity and evolutionary association, using supervised network inference; and (b) selection of gene candidates for missing enzymes based on the original candidate score and the chemical reaction information encoded in the EC number. We applied the proposed methods to infer the metabolic network for the bacterium Pseudomonas aeruginosa from two genomic datasets: gene position and phylogenetic profiles. Next, we predicted several missing enzyme genes to reconstruct the lysine-degradation pathway in P. aeruginosa using EC number information. As a result, we identified PA0266 as a putative 5-aminovalerate aminotransferase (EC 2.6.1.48) and PA0265 as a putative glutarate semialdehyde dehydrogenase (EC 1.2.1.20). To verify our prediction, we conducted biochemical assays and examined the activity of the products of the predicted genes, PA0265 and PA0266, in a coupled reaction. We observed that the predicted gene products catalyzed the expected reactions; no activity was seen when both gene products were omitted from the reaction.

17.

   

Bacterial and Archaeal cells use selenium structurally in selenouridine-modified tRNAs, in proteins translated with selenocysteine, and in the selenium-dependent molybdenum hydroxylases (SDMH). The first two uses both require the selenophosphate synthetase gene, selD. Examining over 500 complete prokaryotic genomes finds selD in exactly two species lacking both the selenocysteine and selenouridine systems, Enterococcus faecalis and Haloarcula marismortui. Surrounding these orphan selD genes, forming bidirectional best hits between species, and detectable by Partial Phylogenetic Profiling vs. selD, are several candidate molybdenum hydroxylase subunits and accessory proteins. We propose that certain accessory proteins, and orphan selD itself, are markers through which new selenium-dependent molybdenum hydroxylases can be found.

18.

Background

For most sequenced prokaryotic genomes, about a third of the annotated protein-coding genes encode "orphan proteins", that is, proteins that lack homology to known proteins. These hypothetical genes are typically short and randomly scattered throughout the genome. This trend is seen for most of the bacterial and archaeal genomes published to date.

Results

In contrast, we have found that a large fraction of the genes coding for such orphan proteins in the Methanopyrus kandleri AV19 genome occur within two large regions. These genes have no known homologs except for other M. kandleri genes. However, analysis of their lengths, codon usage, and ribosome binding site (RBS) sequences shows that they are most likely true protein-coding genes and not random open reading frames.

Conclusions

Although these regions can be considered as candidates for massive lateral gene transfer, our bioinformatics analysis suggests that this is not the case. We predict many of the organism-specific proteins to be transmembrane and to belong to protein families that are non-randomly distributed between the regions. Consistent with this, we suggest that the two regions are most likely unrelated, and that they may be integrated plasmids.

19.
20.
Satoshi Fukuchi, Ken Nishikawa. DNA Research, 2004, 11(4): 219-31, 311-313
Genome annotation produces a considerable number of putative proteins lacking sequence similarity to known proteins. These are referred to as "orphans." The proportion of orphan genes varies among genomes, and is independent of genome size. In the present study, we show that the proportion of orphan genes roughly correlates with the isolation index of organisms (IIO), an indicator introduced in the present study, which represents the degree of isolation of a given genome as measured by sequence similarity. However, there are outlier genomes with respect to the linear correlation, consisting of those genomes that may contain excess amounts of orphan genes. Comparisons of genome sequences among closely related strains revealed that some of the annotated genes are not conserved, suggesting that they are ORFs occurring by chance. Exclusion of these non-conserved ORFs within closely related genomes improved the correlation between the proportion of orphan genes and the IIO values. Assuming that the correlation holds in general, this relationship was used to estimate the number of "authentic" orphan genes in a genome. Using this definition of authentic orphan genes, the anomalies arising from over-assignments, e.g., the percentages of structural annotations, were corrected for 16 genomes, including those of five archaea.

