Similar literature
 20 similar articles found
1.
Gene essentiality determines chromosome organisation in bacteria   Total citations: 4 (self-citations: 1, citations by others: 3)
In Escherichia coli and Bacillus subtilis, essentiality, not expressivity, drives the distribution of genes between the two replicating strands. Although essential genes tend to be coded in the leading replicating strand, the underlying selective constraints and the evolutionary extent of these findings have still not been subject to comparative studies. Here, we extend our previous analysis to the genomes of low G + C firmicutes and γ-proteobacteria, and in a second step to all sequenced bacterial genomes. The inference of essentiality by homology allows us to show that essential genes are much more frequent in the leading strand than other genes, even when compared with non-essential highly expressed genes. Smaller biases were found in the genomes of obligatory intracellular bacteria, for which the assignment of essentiality by homology from fast growing free-living bacteria is most problematic. Cross-comparisons used to assess potential errors in the assignment of essentiality by homology revealed that, in most cases, variations in the assignment criteria have little influence on the overall results. Essential genes tend to be more conserved in the leading strand than average genes, which is consistent with selection for this positioning and may impose a strong constraint on chromosomal rearrangements. These results indicate that essentiality plays a fundamental role in the distribution of genes in most bacterial genomes.

2.
Large-scale analyses of protein complexes have recently become available for Escherichia coli and Mycoplasma pneumoniae, yielding 443 and 116 heteromultimeric soluble protein complexes, respectively. We have coupled the results of these mass spectrometry-characterized protein complexes with the 285 “gold standard” protein complexes identified by EcoCyc. A comparison with databases of gene orthology, conservation, and essentiality identified proteins conserved or lost in complexes of other species. For instance, of 285 “gold standard” protein complexes in E. coli, less than 10% are fully conserved among a set of 7 distantly-related bacterial “model” species. Complex conservation follows one of three models: well-conserved complexes, complexes with a conserved core, and complexes with partial conservation but no conserved core. Expanding the comparison to 894 distinct bacterial genomes illustrates fractional conservation and the limits of co-conservation among components of protein complexes: just 14 out of 285 model protein complexes are perfectly conserved across 95% of the genomes used, yet we predict more than 180 may be partially conserved across at least half of the genomes. No clear relationship between gene essentiality and protein complex conservation is observed, as even poorly conserved complexes contain a significant number of essential proteins. Finally, we identify 183 complexes containing well-conserved components and uncharacterized proteins which will be interesting targets for future experimental studies.

3.
The rate of conservation of a gene in evolution is believed to be correlated with its biological importance. Recent studies have devised various conservation measures for genes and have shown that they are correlated with several biological characteristics of functional importance. Specifically, the state-of-the-art propensity for gene loss (PGL) measure was shown to be strongly correlated with gene essentiality and its number of protein–protein interactions (PPIs). The observed correlation between conservation and functional importance varies, however, between conservation measures, underscoring the need for accurate and general measures for the rate of gene conservation. Here we develop a novel maximum-likelihood approach to computing the rate at which a gene is lost in evolution, motivated by the same principles as those underlying PGL. However, in contrast to PGL, which considers only the most parsimonious ancestral states of the internal nodes of the phylogenetic tree relating the species, our approach weighs all possible ancestral states in a probabilistic manner and includes the branch length information as part of the probabilistic model. In application to data of 16 eukaryotic genomes, our approach shows higher correlations with experimental data than PGL, including data on gene lethality, level of connectivity in a PPI network and coherence within functionally related genes.

4.
Unlike eukaryotes, which often recruit duplicated genes into existing protein-protein interaction (PPI) networks, the low levels of gene duplication coupled with the high probability of lateral transfer of novel genes alter the manner in which PPI networks can evolve in bacteria. By inferring the PPIs present in the ancestor to contemporary Gammaproteobacteria, we were able to trace the changes in gene repertoires, and their consequences on PPI network evolution, in several bacterial lineages that have independently undergone reductions in genome size and genome contents. As genomes degrade, virtually all multi-partner proteins have lost interactors; however, the overall average number of connections increases due to the preferential elimination of proteins that interact with only one other protein partner. We also studied the effect of lateral gene transfer on PPI network evolution by analyzing the connectivity of genes that have been gained along the Escherichia coli lineage, as well as those acquired genes subsequently silenced in Shigella flexneri, since diverging from the gammaproteobacterial ancestor. The situation in PPI networks, in which newly acquired genes preferentially attach to the hubs of the network, contrasts with that observed in metabolic networks, which evolve by the peripheral gain and loss of genes, and in regulatory networks, in which high connectivity increases the propensity of loss.

5.
The bacterial core genome is of intense interest and the volume of whole genome sequence data in the public domain available to investigate it has increased dramatically. The aim of our study was to develop a model to estimate the bacterial core genome from next-generation whole genome sequencing data and use this model to identify novel genes associated with important biological functions. Five bacterial datasets were analysed, comprising 2096 genomes in total. We developed a Bayesian decision model to estimate the number of core genes, calculated pairwise evolutionary distances (p-distances) based on nucleotide sequence diversity, and plotted the median p-distance for each core gene relative to its genome location. We designed visually-informative genome diagrams to depict areas of interest in genomes. Case studies demonstrated how the model could identify areas for further study, e.g. 25% of the core genes with higher sequence diversity in the Campylobacter jejuni and Neisseria meningitidis genomes encoded hypothetical proteins. The core gene with the highest p-distance value in C. jejuni was annotated in the reference genome as a putative hydrolase, but further work revealed that it shared sequence homology with beta-lactamase/metallo-beta-lactamases (enzymes that provide resistance to a range of broad-spectrum antibiotics) and thioredoxin reductase genes (which reduce oxidative stress and are essential for DNA replication) in other C. jejuni genomes. Our Bayesian model of estimating the core genome is principled, easy to use and can be applied to large genome datasets. This study also highlighted the lack of knowledge currently available for many core genes in bacterial genomes of significant global public health importance.
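
The p-distance mentioned in this abstract is conventionally the proportion of aligned sites at which two sequences differ. The following is a minimal sketch of that calculation and of the per-gene median, not the authors' pipeline. The function names, the choice to skip gapped columns, and the example alleles are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import median


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumed convention: columns with a gap in either sequence are skipped.
    compared = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
                if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    differing = sum(1 for a, b in compared if a != b)
    return differing / len(compared)


def median_p_distance(alleles: list[str]) -> float:
    """Median p-distance over all pairs of aligned alleles of one gene."""
    return median(p_distance(a, b) for a, b in combinations(alleles, 2))


# Hypothetical aligned alleles of a single core gene.
alleles = ["ATGCATGC", "ATGCATGA", "ATGTATGA"]
print(median_p_distance(alleles))
```

Plotting one such median per core gene against its genome position would give the kind of diversity profile the abstract describes.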

6.

Background  

In many protein-protein interaction (PPI) networks, densely connected hub proteins are more likely to be essential proteins. This is referred to as the "centrality-lethality rule", which indicates that the topological placement of a protein in a PPI network is connected with its biological essentiality. Though such connections are observed in many PPI networks, the underlying topological properties for these connections are not yet clearly understood. Some suggested putative connections are the involvement of essential proteins in the maintenance of overall network connections, or that they play a role in essential protein clusters. In this work, we have attempted to examine the placement of essential proteins and the network topology from a different perspective by determining the correlation of protein essentiality and reverse nearest neighbor topology (RNN).

7.
8.
Understanding the system-level adaptive changes taking place in an organism in response to variations in the environment is a key issue of contemporary biology. Current modeling approaches, such as constraint-based flux-balance analysis, have proved highly successful in analyzing the capabilities of cellular metabolism, including its capacity to predict deletion phenotypes, the ability to calculate the relative flux values of metabolic reactions, and the capability to identify properties of optimal growth states. Here, we use flux-balance analysis to thoroughly assess the activity of Escherichia coli, Helicobacter pylori, and Saccharomyces cerevisiae metabolism in 30,000 diverse simulated environments. We identify a set of metabolic reactions forming a connected metabolic core that carry non-zero fluxes under all growth conditions, and whose flux variations are highly correlated. Furthermore, we find that the enzymes catalyzing the core reactions display a considerably higher fraction of phenotypic essentiality and evolutionary conservation than those catalyzing noncore reactions. Cellular metabolism is characterized by a large number of species-specific conditionally active reactions organized around an evolutionarily conserved, but always active, metabolic core. Finally, we find that most current antibiotics interfering with bacterial metabolism target the core enzymes, indicating that our findings may have important implications for antimicrobial drug-target discovery.

9.
Horizontal gene transfer (HGT) has been occasionally mentioned in eukaryotic genomes, but such events appear much less numerous than in prokaryotes, where they play important functional and evolutionary roles. In yeasts, few independent cases have been described, some of which correspond to major metabolic functions, but no systematic screening of horizontally transferred genes has been attempted so far. Taking advantage of the synteny conservation among five newly sequenced and annotated genomes of Saccharomycetaceae, we carried out a systematic search for HGT candidates amidst genes present in only one species within conserved synteny blocks. Out of 255 species-specific genes, we discovered 11 candidates for HGT, based on their similarity with bacterial proteins and on reconstructed phylogenies. This corresponds to a minimum of six transfer events because some horizontally acquired genes appear to rapidly duplicate in yeast genomes (e.g. YwqG genes in Kluyveromyces thermotolerans and serine recombinase genes of the IS607 family in Saccharomyces kluyveri). We show that the resulting copies are subject to a strong functional selective pressure. The mechanisms of DNA transfer and integration are discussed, in relation to the generally small size of HGT candidates. Our results on a limited set of species expand by 50% the number of previously published HGT cases in hemiascomycetous yeasts, suggesting that this type of event is more frequent than usually thought. Our restrictive method does not exclude the possibility that additional HGT events exist. Actually, ancestral events common to several yeast species must have been overlooked, and the absence of homologs in present databases leaves open the question of the origin of the 244 remaining species-specific genes inserted within conserved synteny blocks.

10.

Background

One of the crucial steps toward understanding the biological functions of a cellular system is to investigate protein–protein interaction (PPI) networks. As an increasing number of reliable PPIs become available, there is a growing need for discovering PPIs to reconstruct PPI networks of interesting organisms. Some interolog-based methods and homologous PPI families have been proposed for predicting PPIs from the known PPIs of source organisms.

Results

Here, we propose a multiple-strategy scoring method to identify reliable PPIs for reconstructing the mouse PPI network from two well-known organisms: human and fly. We first identified the PPI candidates of target organisms based on homologous PPIs, sharing significant sequence similarities (joint E-value ≤ 1 × 10^-40), from source organisms using generalized interolog mapping. These PPI candidates were evaluated by our multiple-strategy scoring method, combining sequence similarities, normalized ranks, and conservation scores across multiple organisms. According to 106,825 PPI candidates in yeast derived from human and fly, our scoring method can achieve high prediction accuracy and outperform generalized interolog mapping. Experimental results show that our multiple-strategy score can avoid the influence of the protein family size and length to significantly improve PPI prediction accuracy and reflect the biological functions. In addition, the top-ranked and conserved PPIs are often orthologous/essential interactions and share functional similarity. Based on these reliable predicted PPIs, we reconstructed a comprehensive mouse PPI network, which is a scale-free network and can reflect the biological functions and high connectivity of 292 KEGG modules, including 216 pathways and 76 structural complexes.

Conclusions

Experimental results show that our scoring method can improve the prediction accuracy based on the normalized rank and evolutionary conservation from multiple organisms. Our predicted PPIs share similar biological processes and cellular components, and the reconstructed genome-wide PPI network can reflect network topology and modularity. We believe that our method is useful for inferring reliable PPIs and reconstructing a comprehensive PPI network of an interesting organism.

11.
Recent reports indicate that mutations in viral genomes tend to preserve RNA secondary structure, and those mutations that disrupt secondary structural elements may reduce gene expression levels, thereby serving as a functional knockout. In this article, we explore the conservation of secondary structures of mRNA coding regions, a previously unknown factor in bacterial evolution, by comparing the structural consequences of mutations in essential and nonessential Escherichia coli genes accumulated over 40 000 generations in the course of the ‘long-term evolution experiment’. We monitored the extent to which mutations influence minimum free energy (MFE) values, assuming that a substantial change in MFE is indicative of structural perturbation. Our principal finding is that purifying selection tends to eliminate those mutations in essential genes that lead to greater changes of MFE values and, therefore, may be more disruptive for the corresponding mRNA secondary structures. This effect implies that synonymous mutations disrupting mRNA secondary structures may directly affect the fitness of the organism. These results demonstrate that the need to maintain intact mRNA structures imposes additional evolutionary constraints on bacterial genomes, which go beyond preservation of structure and function of the encoded proteins.

12.
Microorganisms have adapted intricate signal transduction mechanisms to coordinate tolerance to toxic levels of metals, including two-component regulatory systems (TCRS). In particular, both cop and czc operons are regulated by TCRS; the cop operon plays a key role in bacterial tolerance to copper, whereas the czc operon is involved in the efflux of cadmium, zinc, and cobalt from the cell. Although the molecular physiology of heavy metal tolerance genes has been extensively studied, their evolutionary relationships are not well-understood. Phylogenetic relationships among heavy-metal efflux proteins and their corresponding two-component regulatory proteins revealed orthologous and paralogous relationships from species divergences and ancient gene duplications. The presence of heavy metal tolerance genes on bacterial plasmids suggests these genes may be prone to spread through horizontal gene transfer. Phylogenetic inferences revealed nine potential examples of lateral gene transfer associated with metal efflux proteins and two examples for regulatory proteins. Notably, four of the examples suggest lateral transfer across major evolutionary domains. In most cases, differences in GC content in metal tolerance genes and their corresponding host genomes confirmed lateral gene transfer events. Three-dimensional protein structures predicted for the response regulators encoded by cop and czc operons showed a high degree of structural similarity with other known proteins involved in TCRS signal transduction, which suggests common evolutionary origins of functional phenotypes and similar mechanisms of action for these response regulators.

13.
Protein-protein interaction (PPI) networks are commonly explored for the identification of distinctive biological traits, such as pathways, modules, and functional motifs. In this respect, understanding the underlying network structure is vital to assess the significance of any discovered features. We recently demonstrated that PPI networks show degree-weighted behavior, whereby the probability of interaction between two proteins is generally proportional to the product of their numbers of interacting partners or degrees. It was surmised that degree-weighted behavior is a characteristic of randomness. We expand upon these findings by developing a random, degree-weighted, network model and show that eight PPI networks determined from single high-throughput (HT) experiments have global and local properties that are consistent with this model. The apparent random connectivity in HT PPI networks is counter-intuitive with respect to their observed degree distributions; however, we resolve this discrepancy by introducing a non-network-based model for the evolution of protein degrees or "binding affinities." This mechanism is based on duplication and random mutation, for which the degree distribution converges to a steady state that is identical to one obtained by averaging over the eight HT PPI networks. The results imply that the degrees and connectivities incorporated in HT PPI networks are characteristic of unbiased interactions between proteins that have varying individual binding affinities. These findings corroborate the observation that curated and high-confidence PPI networks are distinct from HT PPI networks and not consistent with a random connectivity. These results provide an avenue to discern indiscriminate organizations in biological networks and suggest caution in the analysis of curated and high-confidence networks.
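
The degree-weighted behavior described in this abstract, where the interaction probability is proportional to the product of two proteins' degrees, can be illustrated with a short sketch. The abstract does not give the proportionality constant, so the normalisation below (dividing by the sum of all degrees, as in a Chung-Lu style random graph) is an assumption, as are the function name and the example degrees.

```python
from itertools import combinations


def degree_weighted_probabilities(
    degrees: dict[str, int],
) -> dict[tuple[str, str], float]:
    """Pairwise interaction probabilities proportional to d_u * d_v."""
    total = sum(degrees.values())
    if total == 0:
        raise ValueError("at least one protein must have a non-zero degree")
    probabilities = {}
    for u, v in combinations(sorted(degrees), 2):
        # Assumed normalisation: d_u * d_v / sum of degrees, capped at 1.
        probabilities[(u, v)] = min(1.0, degrees[u] * degrees[v] / total)
    return probabilities


# Hypothetical proteins and degrees, for illustration only.
degrees = {"A": 3, "B": 2, "C": 2, "D": 1}
for pair, p in degree_weighted_probabilities(degrees).items():
    print(pair, p)
```

Comparing such expected probabilities with the observed edges of a network is one simple way to ask whether its connectivity looks degree-weighted.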

14.
Podoviruses are among the major viral groups that infect marine picocyanobacteria Prochlorococcus and Synechococcus. Here, we report the genome sequences of five Synechococcus podoviruses isolated from the estuarine environment, and perform comparative genomic and phylogenomic analyses based on a total of 20 cyanopodovirus genomes. The genomes of all the known marine cyanopodoviruses are highly syntenic. A pan-genome of 349 clustered orthologous groups was determined, among which 15 were core genes. These core genes make up nearly half of each genome in length, reflecting the high level of genome conservation among this cyanophage type. The whole genome phylogenies based on concatenated core genes and gene content were highly consistent and confirmed the separation of two discrete marine cyanopodovirus clusters MPP-A and MPP-B. The genomes within cluster MPP-B grouped into subclusters mainly corresponding to Prochlorococcus or Synechococcus host types. Auxiliary metabolic genes tend to occur in a specific phylogenetic group of these cyanopodoviruses. All the MPP-B phages analyzed here encode the photosynthesis gene psbA, which is absent in all the MPP-A genomes thus far. Interestingly, all the MPP-B and two MPP-A Synechococcus podoviruses encode the thymidylate synthase gene thyX, while at the same genome locus all the MPP-B Prochlorococcus podoviruses encode the transaldolase gene talC. Both genes are hypothesized to have the potential to facilitate the biosynthesis of deoxynucleotide for phage replication. Inheritance of specific functional genes could be important to the evolution and ecological fitness of certain cyanophage genotypes. Our analyses demonstrate that cyanopodoviruses of estuarine and oceanic origins share a conserved core genome and suggest that accessory genes may be related to environmental adaptation.
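
The pan-genome and core-gene counts in this abstract follow from simple set operations once each genome has been reduced to its set of orthologous groups: the pan-genome is the union of those sets and the core is their intersection. The sketch below assumes that clustering has already been done. The phage names and group identifiers are hypothetical.

```python
def core_and_pan(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (core, pan) orthologous-group sets for a collection of genomes."""
    if not genomes:
        raise ValueError("at least one genome is required")
    groups = list(genomes.values())
    pan = set().union(*groups)        # groups present in any genome
    core = set.intersection(*groups)  # groups present in every genome
    return core, pan


# Hypothetical genomes, each given as a set of orthologous-group identifiers.
genomes = {
    "phage_1": {"og1", "og2", "og3", "psbA"},
    "phage_2": {"og1", "og2", "og4"},
    "phage_3": {"og1", "og2", "og3", "thyX"},
}
core, pan = core_and_pan(genomes)
print(sorted(core), len(pan))
```

Groups in the pan-genome but outside the core are the accessory genes that the abstract links to environmental adaptation.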

15.
16.
17.

Background

Parkinson's disease (PD) is one of the most prevalent neurodegenerative diseases. Improving diagnoses and treatments of this disease is essential, as currently there exists no cure for it. Microarray and proteomics data have revealed abnormal expression of several genes and proteins responsible for PD. Nevertheless, few studies have been reported involving PD-specific protein-protein interactions.

Results

Microarray-based gene expression data and protein-protein interaction (PPI) databases were combined to construct the PPI networks of differentially expressed (DE) genes in post mortem brain tissue samples of patients with Parkinson's disease. Samples were collected from the substantia nigra and the frontal cerebral cortex. From the microarray data, two sets of DE genes were selected by 2-tailed t-tests and Significance Analysis of Microarrays (SAM), run separately to construct two Query-Query PPI (QQPPI) networks. Several topological properties of these networks were studied. Nodes with High Connectivity (hubs) and High Betweenness Low Connectivity (bottlenecks) were identified to be the most significant nodes of the networks. Three- and four-cliques were identified in the QQPPI networks. These cliques contain most of the topologically significant nodes of the networks, which form core functional modules consisting of tightly knit sub-networks. Thirty-seven previously unreported PD markers were identified based on their topological significance in the networks. Of these 37 markers, eight were significantly involved in the core functional modules and showed significant change in co-expression levels. Four (ARRB2, STX1A, TFRC and MARCKS) out of the 37 markers were found to be associated with several neurotransmitters including dopamine.

Conclusion

This study represents a novel investigation of the PPI networks for PD, a complex disease. The 37 proteins identified in our study can be considered as PD network biomarkers. These network biomarkers may serve as potential therapeutic targets for PD applications development.

18.
The COG database: an updated version includes eukaryotes   Total citations: 4 (self-citations: 0, citations by others: 4)

Background

The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies.

Results

We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (~1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes.

Conclusion

The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.

19.
20.
Background

Cockroaches are terrestrial insects that strikingly eliminate waste nitrogen as ammonia instead of uric acid. Blattabacterium cuenoti (Mercier 1906) strains Bge and Pam are the obligate primary endosymbionts of the cockroaches Blattella germanica and Periplaneta americana, respectively. The genomes of both bacterial endosymbionts have recently been sequenced, making possible a genome-scale constraint-based reconstruction of their metabolic networks. The mathematical expression of a metabolic network and the subsequent quantitative studies of phenotypic features by Flux Balance Analysis (FBA) represent an efficient functional approach to these uncultivable bacteria.

Results

We report the metabolic models of Blattabacterium strains Bge (iCG238) and Pam (iCG230), comprising 296 and 289 biochemical reactions, associated with 238 and 230 genes, and 364 and 358 metabolites, respectively. Both models reflect both the striking similarities and the singularities of these microorganisms. FBA was used to analyze the properties, potential and limits of the models, assuming some environmental constraints such as aerobic conditions and the net production of ammonia from these bacterial systems, as has been experimentally observed. In addition, in silico simulations with the iCG238 model have enabled a set of carbon and nitrogen sources to be defined, which would also support a viable phenotype in terms of biomass production in the strain Pam, which lacks the first three steps of the tricarboxylic acid cycle. FBA reveals a metabolic condition that renders these enzymatic steps dispensable, thus offering a possible evolutionary explanation for their elimination. We also confirm, by computational simulations, the fragility of the metabolic networks and their host dependence.

Conclusions

The minimized Blattabacterium metabolic networks are surprisingly similar in strains Bge and Pam, after 140 million years of evolution of these endosymbionts in separate cockroach lineages. FBA performed on the reconstructed networks from the two bacteria helps to refine the functional analysis of the genomes enabling us to postulate how slightly different host metabolic contexts drove their parallel evolution.

