Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Dinucleotide usage is known to vary among the genomes of organisms. The dinucleotide usage profiles, or genome signatures, are similar for sequence samples taken from the same genome but differ between taxonomically distant species. This concept of genome signatures has been used to study several organisms, including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host–pathogen interactions, where molecular interactions between the two species take place continuously and can influence their genomic composition. In this study, whole genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, were analysed for variation in the genome signature of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly as a consequence of host-specific selection.
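
A minimal sketch of how a dinucleotide genome signature could be computed, assuming the common odds-ratio definition rho_xy = f_xy / (f_x * f_y). The abstract does not state the exact formula or software used, so the function names and the single-strand simplification below are illustrative assumptions only.

```python
from collections import Counter

BASES = "ACGT"


def dinucleotide_signature(seq):
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides.

    Assumption: single-strand counts; published signatures are often
    computed on the sequence combined with its reverse complement.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    signature = {}
    for x in BASES:
        for y in BASES:
            expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
            observed = di[x + y] / n_di if n_di else 0.0
            # Dinucleotides whose bases never occur get a ratio of 0.0.
            signature[x + y] = observed / expected if expected > 0 else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between the two 16-value signatures."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16
```

Comparing sequences sampled in different years with `signature_distance` would be one simple way to look for the kind of temporal drift the abstract describes; a statistical test would still be needed to call any change significant.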

2.
It is hard to imagine a world without food-associated microbes. The production of bread, wine, beer, salami, coffee, chocolate, cheese and many other foods and beverages relies on specific microbes. In cheese, myriad microbial species collaborate to yield the complex organoleptic properties that are appreciated by millions of people worldwide. In the early days of cheese making, these complex communities emerged spontaneously from the natural flora associated with the raw materials, the equipment, the production environment or the craftsmen involved in the production process. However, in some cases, the microbes shifted their natural habitat to the new cheese-associated environment. The most obvious cause of this is backslopping, where part of a fermented product is used to inoculate the next batch. In addition, some microbes may simply adhere to the tools used in the production process. These microbial communities gradually adapted to the novel man-made niches, a process referred to as “domestication.” Domestication is associated with specific genomic and phenotypic changes and ultimately leads to lineages that are genetically and phenotypically distinct from their wild ancestors. In this issue of Molecular Ecology, Dumas et al. have investigated a prime example of cheese-associated microbes, the fungus Penicillium roqueforti. The authors identified several hallmarks of domestication in the genome and phenome of this species, allowing them to hypothesize about the origin of the domestication of blue-veined cheese fungi and the specific evolutionary processes involved in adaptation to the cheese matrix.

3.

Background  

Halophilic prokaryotes are adapted to thrive in extreme conditions of salinity. Identification and analysis of distinct macromolecular characteristics of halophiles provide insight into the factors responsible for their adaptation to high-salt environments. The current report presents an extensive and systematic comparative analysis of the genome and proteome composition of halophilic and non-halophilic microorganisms, with a view to identifying such macromolecular signatures of haloadaptation.

4.
DNA signatures are nucleotide sequences that can be used to detect the presence of an organism and to distinguish that organism from all other species. Here we describe Insignia, a new, comprehensive system for the rapid identification of signatures in the genomes of bacteria and viruses. With the availability of hundreds of complete bacterial and viral genome sequences, it is now possible to use computational methods to identify signature sequences in all of these species, and to use these signatures as the basis for diagnostic assays to detect and genotype microbes in both environmental and clinical samples. The success of such assays critically depends on the methods used to identify signatures that properly differentiate between the target genomes and the sample background. We have used Insignia to compute accurate signatures for most bacterial genomes and made them available through our Web site. A sample of these signatures has been successfully tested on a set of 46 Vibrio cholerae strains, and the results indicate that the signatures are highly sensitive for detection as well as specific for discrimination between these strains and their near relatives. Our approach, whereby the entire genomic complement of organisms is compared to identify probe targets, is a promising method for diagnostic assay development, and it provides assay designers with the flexibility to choose probes from the most relevant genes or genomic regions. The Insignia system is freely accessible via a Web interface and has been released as open source software at: http://insignia.cbcb.umd.edu.
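
The idea of a DNA signature (a sequence present in the target but absent from the background) can be illustrated with a naive k-mer set difference. This is only a toy sketch under that assumption, not the Insignia algorithm, whose implementation is not described in the abstract; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all overlapping k-mers in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_signatures(target, background_genomes, k):
    """k-mers found in the target but on neither strand of any background genome."""
    background = set()
    for genome in background_genomes:
        genome = genome.upper()
        background |= kmers(genome, k)
        background |= kmers(reverse_complement(genome), k)
    return sorted(kmers(target, k) - background)
```

For example, `candidate_signatures("ACGTTT", ["ACGTAA"], 3)` returns `["GTT", "TTT"]`. Holding every k-mer in memory is only workable for small inputs; a real system comparing hundreds of genomes would need an indexed data structure.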

5.
Dong Yang, Ying Jiang, Fuchu He. 《遗传学报》 2009, 36(11): 645-651
Genome sequencing opened the floodgates of "-omics" studies, among which research on correlations between genomic and phenomic variables is an important part. With the development of functional genomics and systems biology, genome-wide investigation of the correlations between many genomic and phenomic variables has become possible. This review focuses on five genomic variables, namely evolution rate (or "age" of the gene), intron length, ORF length (protein length), amino acid composition bias, and codon usage bias, along with phenomic variables related to expression patterns (level and breadth). In most cases, genes with higher mRNA/protein expression levels tend to evolve slowly, have less intronic DNA, code for smaller proteins, and have stronger biases in amino acid composition and codon usage. In addition, broadly expressed proteins evolve more slowly and are shorter than tissue-specific proteins. Studies in this field help deepen understanding of the signatures of selection mediated by features of gene expression and are of great significance for enriching evolutionary theory.

6.

Background  

Pathogenicity islands (PAIs), distinct genomic segments of pathogens encoding virulence factors, represent a subgroup of genomic islands (GIs) that have been acquired by horizontal gene transfer events. Up to now, computational approaches for identifying PAIs have focused on detecting genomic regions that differ from the rest of the genome only in their base composition and codon usage. These approaches often lead to the identification of genomic islands rather than PAIs.

7.
8.
A gene in a genome is defined as putative alien (pA) if its codon usage difference from the average gene exceeds a high threshold and codon usage differences from ribosomal protein genes, chaperone genes and protein-synthesis-processing factors are also high. pA gene clusters in bacterial genomes are relevant for detecting genomic islands (GIs), including pathogenicity islands (PAIs). Four other analyses appropriate to this task are G+C genome variation (the standard method); genomic signature divergences (dinucleotide bias); extremes of codon bias; and anomalies of amino acid usage. For example, the cagA domain of Helicobacter pylori is highly deviant in its genome signature and codon bias from the rest of the genome. Using these methods we can detect two potential PAIs in the Neisseria meningitidis genome, which contain hemagglutinin and/or hemolysin-related genes. Additionally, G+C variation and genome signature differences of the Mycobacterium tuberculosis genome indicate two pA gene clusters.
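
Of the analyses listed above, G+C variation is the simplest to illustrate. The sketch below flags sliding windows whose G+C fraction deviates from the genome-wide value; the window size, step, and threshold are hypothetical parameters chosen by the caller, not values taken from the abstract.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / total if total else 0.0


def gc_outlier_windows(genome, window, step, threshold):
    """Return (start, gc) for windows deviating from genome-wide G+C.

    Assumption: a window is an outlier when its G+C fraction differs
    from the whole-genome fraction by at least `threshold`.
    """
    genome_gc = gc_content(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_content(genome[start:start + window])
        if abs(gc - genome_gc) >= threshold:
            hits.append((start, gc))
    return hits
```

On a toy genome such as `"AT" * 10 + "GC" * 5 + "AT" * 10`, a 10-base window with a threshold of 0.5 flags only the GC-rich block starting at position 20. As the abstract notes, compositional outliers of this kind suggest genomic islands in general, so further evidence is needed before calling a region a pathogenicity island.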

9.
10.
Background: Synonymous codon usage bias has typically been correlated with, and attributed to, translational efficiency. However, other pressures on genomic sequence composition, such as mutational biases, can also affect codon usage patterns. This study provides an analysis of the codon usage patterns in Arabidopsis thaliana in relation to gene expression levels, codon volatility, mutational biases and selective pressures. Results: We have performed synonymous codon usage and codon volatility analyses for all genes in the A. thaliana genome. In contrast to reports for species from other kingdoms, we find that neither codon usage nor volatility is correlated with selection pressure (as measured by dN/dS) or with gene expression levels at the genome-wide level. Our results show that codon volatility and usage are not synonymous; rather, both are correlated with the abundance of G and C at the third codon position (GC3). Conclusions: Our results indicate that while the A. thaliana genome shows evidence of synonymous codon usage bias, this is not related to the expression levels of its constituent genes. Neither codon volatility nor codon usage is correlated with expression levels or selective pressures but, because they are directly related to the composition of G and C at the third codon position, they are the result of mutational bias. Therefore, in A. thaliana, codon volatility and usage do not result from selection for translation efficiency or protein functional shift as measured by positive selection.
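
GC3, the quantity the abstract identifies as the main correlate of codon usage and volatility, can be computed directly from an in-frame coding sequence. A minimal sketch, assuming the sequence starts at codon position 1; whether the stop codon should be excluded is left to the caller.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame CDS.

    Assumptions: the CDS starts in frame; trailing partial codons and
    ambiguous bases (anything other than A, C, G, T) are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete final codon
    third_bases = [cds[i] for i in range(2, usable, 3) if cds[i] in "ACGT"]
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

For instance, `gc3("ATGGCCAAA")` looks at the third bases G, C and A and returns 2/3.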

11.

Background

The genetic code is redundant, meaning that most amino acids can be encoded by more than one codon. Highly expressed genes tend to use optimal codons to increase the accuracy and speed of translation. Thus, codon usage biases provide a signature of the relative expression levels of genes, which can, uniquely, be quantified across the domains of life.

Results

Here we describe a general statistical framework to exploit this phenomenon and to systematically associate genes with environments and phenotypic traits through changes in codon adaptation. By inferring evolutionary signatures of translation efficiency in 911 bacterial and archaeal genomes while controlling for confounding effects of phylogeny and inter-correlated phenotypes, we linked 187 gene families to 24 diverse phenotypic traits. A series of experiments in Escherichia coli revealed that 13 of 15, 19 of 23, and 3 of 6 gene families with changes in codon adaptation in aerotolerant, thermophilic, or halophilic microbes, respectively, confer specific resistance to hydrogen peroxide, heat, and high salinity. Further, we demonstrate experimentally that changes in codon optimality alone are sufficient to enhance stress resistance. Finally, we present evidence that multiple genes with altered codon optimality in aerobes confer oxidative stress resistance by controlling the levels of iron and NAD(P)H.

Conclusions

Taken together, these results provide experimental evidence for a widespread connection between changes in translation efficiency and phenotypic adaptation. As the number of sequenced genomes increases, this novel genomic context method for linking genes to phenotypes based on sequence alone will become increasingly useful.

12.
Understanding the extent and causes of biases in codon usage and nucleotide composition is essential to the study of viral evolution, particularly the interplay between viruses and host cells or immune responses. To understand the common features and differences among viruses, we analyzed the genomic characteristics of a representative collection of all sequenced vertebrate-infecting DNA viruses. This revealed that patterns of codon usage bias are strongly correlated with overall genomic GC content, suggesting that genome-wide mutational pressure, rather than natural selection for specific coding triplets, is the main determinant of codon usage. Further, we observed a striking difference in CpG content between DNA viruses with large and small genomes. While the majority of large-genome viruses show the expected frequency of CpG, most small-genome viruses have CpG contents far below expected values. The exceptions to this generalization, the large gammaherpesviruses and iridoviruses and the small dependoviruses, have sufficiently different life-cycle characteristics that they may help reveal some of the factors shaping the evolution of CpG usage in viruses. Electronic supplementary material is available for this article and accessible to authorised users. [Reviewing Editor: Dr. Nicolas Galtier]

13.
Can genome analysis tell us about the lifestyle of an organism? We address this question through a thorough cross-comparison of thermophilic and mesophilic genomes, since the number of available genomes is now sufficient to ensure statistical significance of the results. Using principal component analysis (PCA), we analyze the codon composition of a database comprising 116 genomes, selected so as to include one species for each genus, and show that a cross-genomic approach allows the extraction of common determinants of thermostability at the genome level. The results of our analysis indicate that all the known features of thermostability can be found in the 64 component loadings of the second principal axis of the PCA. From this, we develop an index of thermostability that discriminates between mesophiles and thermophiles with 98% accuracy at the genome level and with 95% accuracy at the protein sequence level. We also show that these results are not due to phylogenetic differences between archaea and bacteria.

14.
15.
Along the gene, nucleotides in various codon positions tend to exert a slight but observable influence on the nucleotide choice at neighboring positions. Such context biases differ between organisms and can be used as genomic signatures. In this paper, we focus specifically on the dinucleotide composed of a third codon position nucleotide and its succeeding first position nucleotide. Using the 16 possible dinucleotide combinations, we calculate how well individual genes conform to the observed mean dinucleotide frequencies of an entire genome, forming a distance measure for each gene. It is found that genes from different genomes can be separated with a high degree of accuracy according to these distance values. In particular, we address the problem of recent horizontal gene transfer, and how imported genes may be evaluated by their poor assimilation to the host's context biases. By concentrating on the third-position and succeeding first-position nucleotides, we eliminate most spurious contributions from codon usage and amino-acid requirements, focusing mainly on mutational effects. Since imported genes are expected to converge only gradually to genomic signatures, it is possible to question whether a gene present in only one of two closely related organisms has been imported into one organism or deleted in the other. Striking correlations between the proposed distance measure and poor homology are observed when Escherichia coli genes are compared to Salmonella typhi, indicating that sets of outlier genes in E. coli may contain a high number of genes that have been imported into E. coli, and not deleted in S. typhi. Received: 16 January 2001 / Accepted: 30 August 2001
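
The per-gene distance described above can be sketched as follows: count the dinucleotides formed by each third codon position and the first position of the next codon, convert the 16 counts to frequencies, and compare a gene with the genome-wide mean. The abstract does not specify the distance metric, so the Euclidean distance used here is an assumption, as are the function names.

```python
from collections import Counter
from itertools import product
from math import sqrt

PAIRS = ["".join(p) for p in product("ACGT", repeat=2)]
_PAIR_SET = set(PAIRS)


def context_frequencies(cds_list):
    """Frequencies of (codon position 3, next codon position 1) dinucleotides.

    Assumption: every CDS is in frame, starting at codon position 1.
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        # Index 2 is the third base of codon 1; index 3 starts codon 2.
        for i in range(2, len(cds) - 1, 3):
            pair = cds[i:i + 2]
            if pair in _PAIR_SET:
                counts[pair] += 1
    total = sum(counts.values())
    return {p: (counts[p] / total if total else 0.0) for p in PAIRS}


def context_distance(gene, genome_genes):
    """Euclidean distance between a gene's profile and the genome-wide profile."""
    gene_freq = context_frequencies([gene])
    genome_freq = context_frequencies(genome_genes)
    return sqrt(sum((gene_freq[p] - genome_freq[p]) ** 2 for p in PAIRS))
```

Genes with unusually large `context_distance` values relative to the rest of the genome would be the outliers that the abstract treats as candidates for recent import. Short genes give noisy profiles, so a minimum length cutoff would likely be needed in practice.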

16.
Genome scans have become a common approach for identifying genomic signatures of natural selection and reproductive isolation, as well as the genomic bases of ecologically relevant phenotypes, based on patterns of polymorphism and differentiation among populations or species. Here, we review the results of studies taking genome scan approaches in plants, consider the patterns of genomic differentiation documented and their possible causes, discuss the results in light of recent models of genomic differentiation during divergent adaptation and speciation, and consider assumptions and caveats in their interpretation. We find that genomic regions of high divergence generally appear quite small in comparisons of both closely and more distantly related populations, and for the most part, these differentiated regions are spread throughout the genome rather than strongly clustered. Thus, the genome scan approach appears well-suited for identifying genomic regions or even candidate genes that underlie adaptive divergence and/or reproductive barriers. We consider other methodologies that may be used in conjunction with genome scan approaches, and suggest further developments that would be valuable. These include broader use of sequence-based markers of known genomic location, greater attention to sampling strategies to make use of parallel environmental or phenotypic transitions, more integration with approaches such as quantitative trait loci mapping and measures of gene flow across the genome, and additional theoretical and simulation work on processes related to divergent adaptation and speciation.

17.
New and simple numerical criteria based on a codon adaptation index are applied to the complete genomic sequences of 80 Eubacteria and 16 Archaea to infer weak and strong genome tendencies toward content bias, translational bias, and strand bias. These criteria can be applied to all microbial genomes, even those for which little biological information is known, and a codon bias signature, that is, the collection of strong biases displayed by a genome, can be automatically derived. A codon bias space, where genomes are identified by their preferred codons, is proposed as a novel formal framework to interpret genomic relationships. Principal component analysis confirms that although GC content has a dominant effect on codon bias space, thermophilic and mesophilic species can be identified and separated by codon preferences. Two more examples concerning lifestyle are studied with linear discriminant analysis: suitable separating functions characterized by sets of preferred codons are provided to discriminate translationally biased (hyper)thermophiles from mesophiles, and to discriminate organisms with different respiratory characteristics (aerobic, anaerobic, facultative aerobic and facultative anaerobic). These results suggest that codon bias space might reflect the geometry of a prokaryotic "physiology space." Evolutionary perspectives are noted, numerical criteria and distances among organisms are validated on known cases, and various results and predictions are discussed on both methodological and biological grounds.
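
For readers unfamiliar with the codon adaptation index (CAI) on which these criteria build, here is a minimal sketch of the classical definition: each codon gets a weight equal to its count in a reference gene set divided by the count of the most used synonymous codon, and a gene's CAI is the geometric mean of the weights of its codons. This is not the paper's specific set of criteria; the standard genetic code, the 0.5 floor for unseen codons, and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import exp, log

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def codons(cds):
    """Split an in-frame CDS into complete codons."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]


def relative_adaptiveness(reference_genes):
    """Weight w = count(codon) / count(most used synonymous codon)."""
    counts = Counter(
        c for gene in reference_genes for c in codons(gene) if c in CODON_TABLE
    )
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        families[amino_acid].append(codon)
    weights = {}
    for amino_acid, family in families.items():
        # Stop codons and single-codon amino acids (Met, Trp) carry no bias.
        if amino_acid == "*" or len(family) == 1:
            continue
        best = max(counts[c] for c in family)
        if best == 0:
            continue  # amino acid absent from the reference set
        for c in family:
            # Assumption: unseen codons are floored at 0.5 to avoid log(0).
            weights[c] = max(counts[c], 0.5) / best
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the gene's scorable codons."""
    logs = [log(weights[c]) for c in codons(gene) if c in weights]
    return exp(sum(logs) / len(logs)) if logs else 0.0
```

With a reference set in which GCT is used three times and GCC once, a gene made only of GCT codons scores 1.0, while mixing in GCC lowers the score. The choice of reference set (for example, ribosomal protein genes) strongly affects the result and is left to the caller.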

18.
Microbial communities capable of degrading biopolymers and surfactants typically found in graywater were selected in continuous-flow bioreactors operated at 30, 44, 53, or 62°C. The effect of temperature upon microbial activity and community composition was determined. Microbial respiration of the organic components of the medium (including linear alkylbenzene sulfonate) was detected in samples from each reactor. The microbial community in each reactor was adapted to the operating temperature. Nucleic acid-based analyses of community composition showed that distinct consortia were present at each temperature. Community complexity was inversely related to temperature. The specific maintenance rate was twofold higher at 62°C than at the lower temperatures. Under starvation conditions, microbes in the 62°C system lost membrane integrity 30- to 100-fold faster than microbes at lower temperatures. Received: 02 April 1999 / Accepted in revised form: 17 May 1999

19.
Possessing three circular chromosomes is a distinct genomic characteristic of Burkholderia cenocepacia AU 1054, a clinically important pathogen in cystic fibrosis. In this study, base composition, codon usage and functional role category were analyzed in the B. cenocepacia AU 1054 genome. Although no bias in base and codon usage was detected between any two chromosomes, functional differences did exist among the genes of each chromosome. Similar base composition and differential functional role categories indicated that genes on these three chromosomes were relatively stable and that a proper division of labor was established. Based on variations in base or codon usage, four small gene clusters were observed among all of the genes. Multivariate analysis revealed that protein hydrophobicity played a predominant role in shaping base usage bias, while horizontal gene transfer and the gene expression level were the two most important factors that affected the codon usage bias. Interestingly, we also found that these gene clusters were correlated with different biological functions: (i) 45 pyrimidine-leading-codon preferred genes were predominantly involved in regulatory function; (ii) most drug resistance-related genes were among the 826 genes coding for hydrophobic proteins; (iii) most of the 111 horizontally transferred genes were responsible for genomic plasticity; and (iv) 73 highly expressed genes (predicted by their codon adaptation index values) showed environmental adaptation to cystic fibrosis. Our results showed that genes with base or codon usage bias were affected by mutational pressure and natural selection, and their functions could contribute to drug resistance and transmissible activity in B. cenocepacia.

20.
The COVID-19 pandemic has demonstrated the serious potential for novel zoonotic coronaviruses to emerge and cause major outbreaks. The immediate animal origin of the causative virus, SARS-CoV-2, remains unknown, and identifying such origins is a notoriously challenging task for emerging disease investigations. Coevolution with hosts leads to specific evolutionary signatures within viral genomes that can inform likely animal origins. We obtained a set of 650 spike protein and 511 whole genome nucleotide sequences from 222 and 185 viruses belonging to the family Coronaviridae, respectively. We then trained random forest models independently on genome composition biases of spike protein and whole genome sequences, including dinucleotide and codon usage biases, in order to predict animal host (of nine possible categories, including human). In hold-one-out cross-validation, predictive accuracy on unseen coronaviruses consistently reached ~73%, indicating that the evolutionary signal in spike proteins is just as informative as that in whole genome sequences. However, different composition biases were informative in each case. Applying optimised random forest models to classify human sequences of MERS-CoV and SARS-CoV revealed evolutionary signatures consistent with their recognised intermediate hosts (camelids, carnivores), while human sequences of SARS-CoV-2 were predicted as having bat hosts (suborder Yinpterochiroptera), supporting bats as the suspected origin of the current pandemic. In addition to phylogeny, variation in genome composition can act as an informative approach to predict emerging virus traits as soon as sequences are available. More widely, this work demonstrates the potential of combining genetic resources with machine learning algorithms to address long-standing challenges in emerging infectious diseases.
