Similar articles
1.

Background  

The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes under study, few support an efficient and organized analysis across large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes.

2.

Background  

Microbial genomes contain an abundance of genes with conserved proximity forming clusters on the chromosome. However, the conservation can be a result of many factors, such as vertical inheritance or functional selection. Thus, identification of conserved gene clusters that are under functional selection provides an effective channel for gene annotation, microarray screening, and pathway reconstruction. The problem of devising a robust method to identify these conserved gene clusters and to evaluate the significance of the conservation in multiple genomes has a number of implications for comparative, evolutionary and functional genomics as well as synthetic biology.

3.

Background

Microsporidia are intracellular parasites that are highly-derived relatives of fungi. They have compacted genomes and, despite a high rate of sequence evolution, distantly related species can share high levels of gene order conservation. To date, only two species have been analysed in detail, and data from one of these largely consists of short genomic fragments. It is therefore difficult to determine how conservation has been maintained through microsporidian evolution, and impossible to identify whether certain regions are more prone to genomic stasis.

Principal Findings

Here, we analyse three large fragments of the Enterocytozoon bieneusi genome (in total 429 kbp), a species of medical significance. A total of 296 ORFs were identified, annotated and their context compared with Encephalitozoon cuniculi and Antonospora locustae. Overall, a high degree of conservation was found between all three species, and interestingly the level of conservation was similar in all three pairwise comparisons, despite the fact that A. locustae is more distantly related to E. cuniculi and E. bieneusi than either are to each other.

Conclusions/Significance

Any two genes that are found together in any pair of genomes are more likely to be conserved in the third genome as well, suggesting that a core of genes tends to be conserved across the entire group. The mechanisms of rearrangements identified among microsporidian genomes were consistent with a very slow evolution of their architecture, as opposed to the very rapid sequence evolution reported for these parasites.

4.

Background

Extant genomes share regions where genes have the same order and orientation, which are thought to arise from the conservation of an ancestral order of genes during evolution. Such regions of so-called conserved synteny, or synteny blocks, must be precisely identified and quantified, as a prerequisite to better understand the evolutionary history of genomes.

Results

Here we describe PhylDiag, software that identifies statistically significant synteny blocks in pairwise comparisons of eukaryote genomes. Compared to previous methods, PhylDiag uses gene trees to define gene homologies, thus allowing gene deletions to be considered as events that may break the synteny. PhylDiag also accounts for gene orientations, blocks of tandem duplicates and lineage-specific de novo gene births. Starting from two genomes and the corresponding gene trees, PhylDiag returns synteny blocks with gaps less than or equal to the maximum gap parameter gapmax. This parameter is theoretically estimated, and together with a utility to graphically display results, contributes to making PhylDiag a user-friendly method. In addition, putative synteny blocks are subject to a statistical validation to verify that they are unlikely to be due to a random combination of genes.
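The idea of building blocks whose gaps do not exceed a maximum gap can be illustrated with a minimal sketch. This is not PhylDiag's algorithm: it ignores gene trees, gene orientation, tandem duplicates and the statistical validation step. It assumes homologies are given as (index in genome A, index in genome B) pairs, that the gap is measured as the Chebyshev distance minus one, and that a greedy chaining along one diagonal direction is acceptable. The names `chain_blocks` and `gap_max` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (gene index in genome A, gene index in genome B)
Point = Tuple[int, int]


def chebyshev(p: Point, q: Point) -> int:
    """Chebyshev (chessboard) distance between two cells of the homology matrix."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def direction(p: Point, q: Point) -> int:
    """+1 if the genome B index increases from p to q, otherwise -1."""
    return 1 if q[1] > p[1] else -1


def chain_blocks(homologies: List[Point], gap_max: int) -> List[List[Point]]:
    """Greedily chain homologies into diagonal blocks with gaps <= gap_max.

    Assumption: the gap between two consecutive homologies is their
    Chebyshev distance minus one.
    """
    blocks: List[List[Point]] = []
    for p in sorted(homologies):
        for block in blocks:
            last = block[-1]
            gap = chebyshev(last, p) - 1
            # Skip blocks that p cannot extend: no progress on either axis,
            # or a gap larger than allowed.
            if p[0] <= last[0] or p[1] == last[1] or gap > gap_max:
                continue
            # A block keeps the diagonal direction set by its first two points.
            if len(block) == 1 or direction(block[0], block[1]) == direction(last, p):
                block.append(p)
                break
        else:
            # p extended no existing block, so it starts a new one.
            blocks.append([p])
    # Isolated homologies are not reported as blocks.
    return [b for b in blocks if len(b) >= 2]


example = [(0, 0), (1, 1), (3, 2), (10, 20), (11, 19), (12, 18), (5, 40)]
blocks = chain_blocks(example, gap_max=1)
```

With `gap_max=1` the example yields one forward block and one inverted block; the isolated homology at (5, 40) is dropped.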

Conclusions

We benchmark several known metrics to measure 2D-distances in a matrix of homologies and we compare PhylDiag to i-ADHoRe 3.0 on real and simulated data. We show that PhylDiag correctly identifies small synteny blocks even with insertions, deletions, incorrect annotations or micro-inversions. Finally, PhylDiag allowed us to identify the most relevant distance metric for 2D-distance calculation between homologies.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-268) contains supplementary material, which is available to authorized users.

5.

Background

Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda.

Results

The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic.

Conclusions

The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the resulting tree topologies suspect. As such, these data are likely inappropriate for investigating such ancient relationships.

6.

Background  

Comparative genomics has provided valuable insights into the nature of gene sequence variation and chromosomal organization of closely related bacterial species. However, questions about the biological significance of gene order conservation, or synteny, remain open. Moreover, few comprehensive studies have been reported for rhizobial genomes.

7.

Background  

Gene order in eukaryotic genomes is not random, with genes with similar expression profiles tending to cluster. In yeasts, the model taxon for gene order analysis, such syntenic clusters of non-homologous genes tend to be conserved over evolutionary time. Whether similar clusters show gene order conservation in other lineages is, however, undecided. Here, we examine this issue in Drosophila melanogaster using high-resolution chromosome rearrangement data.

8.
Zheng Y, Roberts RJ, Kasif S. Genome Biology 2002, 3(11): research0060.1-research00609

Background  

The current speed of sequencing already exceeds the capability of annotation, creating a potential bottleneck. A large proportion of the genes in microbial genomes remains uncharacterized. Here we propose a new method for functional annotation using the conservation patterns of gene clusters. If several gene clusters show the same coevolution pattern across different genomes it is reasonable to infer they are functionally related. The gene cluster phylogenetic profile integrates chromosomal proximity information and phylogenetic profile information and allows us to infer functional dependences between the gene clusters even at great distance on the chromosome.
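As a rough illustration of comparing conservation patterns across genomes, the sketch below represents each gene cluster by a presence/absence profile (1 if the cluster is conserved in a genome, 0 otherwise) and reports pairs of clusters whose profiles agree. The data layout, the agreement score and the names `profile_similarity` and `related_clusters` are assumptions made for illustration, not the authors' method.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def profile_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of genomes in which two presence/absence profiles agree."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and cover the same genomes")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def related_clusters(
    profiles: Dict[str, Sequence[int]], min_similarity: float = 1.0
) -> List[Tuple[str, str]]:
    """Return cluster pairs whose profiles agree in at least min_similarity of genomes."""
    pairs = []
    for (name1, p1), (name2, p2) in combinations(sorted(profiles.items()), 2):
        if profile_similarity(p1, p2) >= min_similarity:
            pairs.append((name1, name2))
    return pairs


# Hypothetical profiles over five genomes.
profiles = {
    "clusterA": (1, 0, 1, 1, 0),
    "clusterB": (1, 0, 1, 1, 0),
    "clusterC": (0, 1, 0, 1, 1),
}
candidates = related_clusters(profiles)
```

A cluster that is present in every genome matches any other ubiquitous cluster trivially, so a practical scoring scheme would need to account for how informative each profile is.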

9.

Background

The massive scale of microarray derived gene expression data allows for a global view of cellular function. Thus far, comparative studies of gene expression between species have been based on the level of expression of the gene across corresponding tissues, or on the co-expression of the gene with another gene.

Results

To compare gene expression between distant species on a global scale, we introduce the "expression context". The expression context of a gene is based on the co-expression with all other genes that have unambiguous counterparts in both genomes. Employing this new measure, we show 1) that the expression context is largely conserved between orthologs, and 2) that sequence identity shows little correlation with expression context conservation after gene duplication and speciation.
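One way to read the definition of the expression context is as a vector of co-expression values between a gene and a fixed set of reference genes, with conservation scored by comparing the vectors of two orthologs. The sketch below assumes Pearson correlation for both steps and a simple dictionary layout for the expression data; the abstract does not specify either, and the function names are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes equal lengths and non-constant inputs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def expression_context(
    gene: str, expr: Dict[str, Sequence[float]], reference: Sequence[str]
) -> List[float]:
    """Co-expression of `gene` with each reference gene, in reference order."""
    return [pearson(expr[gene], expr[r]) for r in reference]


def context_conservation(
    gene1: str, expr1: Dict[str, Sequence[float]], refs1: Sequence[str],
    gene2: str, expr2: Dict[str, Sequence[float]], refs2: Sequence[str],
) -> float:
    """Correlation between two expression contexts.

    refs1[i] and refs2[i] are assumed to be unambiguous counterparts.
    """
    return pearson(
        expression_context(gene1, expr1, refs1),
        expression_context(gene2, expr2, refs2),
    )


# Hypothetical expression profiles; the species need not share sample counts.
species1 = {"g": [1, 2, 3, 4], "r1": [2, 4, 6, 8], "r2": [4, 3, 2, 1], "r3": [1, 3, 2, 4]}
species2 = {"h": [10, 20, 30], "s1": [1, 2, 3], "s2": [3, 2, 1], "s3": [1, 3, 2]}
score = context_conservation(
    "g", species1, ["r1", "r2", "r3"], "h", species2, ["s1", "s2", "s3"]
)
```

Because each context is computed within one species, the two data sets only need a shared list of reference gene pairs, not matching samples or tissues.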

Conclusion

This means that the degree of sequence identity has a limited predictive quality for differential expression context conservation between orthologs, and thus presumably also for other facets of gene function.

10.

Background

Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and many repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant worldwide. Sequencing the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes.

Methodology/Principal Findings

We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein-coding genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the locations of cp-like migrations and found several fragments closely linked with mitochondrial genes.

Conclusion

The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

11.

Background

Gene order in eukaryotic genomes is not random. Genes showing similar expression (coexpression) patterns are often clustered along the genome. The goal of this study is to characterize coexpression clustering in mammalian genomes and to investigate the underlying mechanisms.

Methodology/Principal Findings

We detect clustering of coexpressed genes across multiple scales, from neighboring genes to chromosomal domains that span tens of megabases and, in some cases, entire chromosomes. Coexpression domains may be positively or negatively correlated with other domains, within and between chromosomes. We find that long-range expression domains are associated with gene density, which in turn is related to physical organization of the chromosomes within the nucleus. We show that gene expression changes between healthy and diseased tissue samples occur in a gene density-dependent manner.

Conclusions/Significance

We demonstrate that coexpression domains exist across multiple scales. We identify potential mechanisms for short-range as well as long-range coexpression domains. We provide evidence that the three-dimensional architecture of the chromosomes may underlie long-range coexpression domains. Chromosome territory reorganization may play a role in common human diseases such as Alzheimer's disease and psoriasis.

12.

Background  

Identification of homologous regions or conserved syntenies across genomes is a crucial step in comparative genomics. This task is usually performed by genome alignment software such as WABA or blastz. In the case of conserved syntenies, such regions are defined as conserved gene orders. At the gene order level, homologous regions can even be found between distantly related genomes that do not align at the nucleotide sequence level.

13.

Background  

In search of new antifungal targets of potential interest for pharmaceutical companies, we initiated a comparative genomics study to identify the most promising protein-coding genes in fungal genomes. One criterion was the protein sequence conservation between reference pathogenic genomes. A second criterion was that the corresponding gene in Saccharomyces cerevisiae should be essential. Since thiamine pyrophosphate is an essential product involved in a variety of metabolic pathways, proteins responsible for its production satisfied these two criteria.

14.
15.

Background  

An increasing number of whole viral and bacterial genomes are being sequenced and deposited in public databases. In parallel with the mounting interest in whole genomes, the number of whole-genome analysis software tools is also increasing. GeneOrder was originally developed to provide an analysis of genes between two genomes, allowing visualization of gene order and synteny comparisons of any small genomes. It was originally developed for comparing virus, mitochondrion and chloroplast genomes. This is now extended to small bacterial genomes of less than 2 Mb.

16.

Background  

The decrease in sequencing cost and the improvement in technologies have made the re-sequencing of large genomes, as well as the parallel sequencing of small genomes, easier and more common. It is possible to completely sequence a small genome within days, and this increases the number of publicly available genomes. Among the genomes being rapidly sequenced are those of microbes and viruses responsible for infectious diseases. However, accurate gene prediction remains a challenge in decoding a newly sequenced genome. Therefore, accurate and efficient gene prediction programs are highly desired for rapid and cost-effective surveillance of RNA viruses through full genome sequencing.

17.
Wu J. BMC Genomics 2008, 9(Z2): S13

Background

Computational gene prediction tools routinely generate large volumes of predicted coding exons (putative exons). One common limitation of these tools is the relatively low specificity due to the large amount of non-coding regions.

Methods

A statistical approach is developed that largely improves the gene prediction specificity. The key idea is to utilize the evolutionary conservation principle relative to the coding exons. By first exploiting the homology between genomes of two related species, a probability model for the evolutionary conservation pattern of codons across different genomes is developed. A probability model for the dependency between adjacent codons/triplets is added to differentiate coding exons and random sequences. Finally, the log odds ratio is developed to classify putative exons into the group of coding exons and the group of non-coding regions.
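The final classification step can be pictured as a log odds ratio between a coding model and a non-coding model. The sketch below is a simplification under stated assumptions: each aligned codon pair is reduced to one of three hypothetical conservation categories, the probabilities are made-up placeholders rather than values from the paper, positions are treated as independent (the adjacent-codon dependency model is omitted), and the decision threshold of zero is arbitrary.

```python
from math import log
from typing import Dict, Sequence

# Placeholder probabilities for illustration only; not taken from the paper.
CODING: Dict[str, float] = {"identical": 0.6, "synonymous": 0.3, "nonsynonymous": 0.1}
NONCODING: Dict[str, float] = {"identical": 0.4, "synonymous": 0.1, "nonsynonymous": 0.5}


def log_odds(
    observations: Sequence[str],
    coding: Dict[str, float] = CODING,
    noncoding: Dict[str, float] = NONCODING,
) -> float:
    """Sum of per-codon log ratios, assuming independent codon positions."""
    return sum(log(coding[o] / noncoding[o]) for o in observations)


def classify(observations: Sequence[str], threshold: float = 0.0) -> str:
    """Label a putative exon by comparing its log odds with a threshold."""
    return "coding" if log_odds(observations) > threshold else "non-coding"


conserved_exon = ["identical", "synonymous", "identical"]
divergent_region = ["nonsynonymous", "nonsynonymous", "nonsynonymous"]
labels = (classify(conserved_exon), classify(divergent_region))
```

Raising the threshold trades sensitivity for specificity, which is the trade-off the Results below quantify for the actual method.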

Results

The method was tested on pre-aligned human-mouse sequences where the putative exons are predicted by GENSCAN and TWINSCAN. The proposed method is able to improve the exon specificity by 73% and 32% respectively, while the loss of sensitivity is ≤ 1%. The method also keeps 98% of RefSeq gene structures that are correctly predicted by TWINSCAN when removing 26% of predicted genes that are in non-coding regions. The estimated number of true exons in TWINSCAN's predictions is 157,070. The results and the executable code can be downloaded from http://www.stat.purdue.edu/~jingwu/codon/

Conclusion

The proposed method demonstrates an application of the evolutionary conservation principle to coding exons. It is a complementary method which can be used as an additional criterion to refine many existing gene predictions.

18.

Background

Nucleomorphs are residual nuclei derived from eukaryotic endosymbionts in chlorarachniophyte and cryptophyte algae. The endosymbionts that gave rise to nucleomorphs and plastids in these two algal groups were green and red algae, respectively. Despite their independent origin, the chlorarachniophyte and cryptophyte nucleomorph genomes share similar genomic features such as extreme size reduction and a three-chromosome architecture. This suggests that similar reductive evolutionary forces have acted to shape the nucleomorph genomes in the two groups. Thus far, however, only a single chlorarachniophyte nucleomorph and plastid genome has been sequenced, making broad evolutionary inferences within the chlorarachniophytes and between chlorarachniophytes and cryptophytes difficult. We have sequenced the nucleomorph and plastid genomes of the chlorarachniophyte Lotharella oceanica in order to gain insight into nucleomorph and plastid genome diversity and evolution.

Results

The L. oceanica nucleomorph genome was found to consist of three linear chromosomes totaling ~610 kilobase pairs (kbp), much larger than the 373 kbp nucleomorph genome of the model chlorarachniophyte Bigelowiella natans. The L. oceanica plastid genome is 71 kbp in size, similar to that of B. natans. Unexpectedly long (~35 kbp) sub-telomeric repeat regions were identified in the L. oceanica nucleomorph genome; internal multi-copy regions were also detected. Gene content analyses revealed that nucleomorph house-keeping genes and spliceosomal intron positions are well conserved between the L. oceanica and B. natans nucleomorph genomes. More broadly, gene retention patterns were found to be similar between nucleomorph genomes in chlorarachniophytes and cryptophytes. Chlorarachniophyte plastid genomes showed near identical protein coding gene complements as well as a high level of synteny.

Conclusions

We have provided insight into the process of nucleomorph genome evolution by elucidating the fine-scale dynamics of sub-telomeric repeat regions. Homologous recombination at the chromosome ends appears to be frequent, serving to expand and contract nucleomorph genome size. The main factor influencing nucleomorph genome size variation between different chlorarachniophyte species appears to be expansion-contraction of these telomere-associated repeats rather than changes in the number of unique protein coding genes. The dynamic nature of chlorarachniophyte nucleomorph genomes lies in stark contrast to their plastid genomes, which appear to be highly stable in terms of gene content and synteny.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-374) contains supplementary material, which is available to authorized users.

19.

Background

The molecular components in synapses that are essential to the life cycle of synaptic vesicles are well characterized. Nonetheless, many aspects of synaptic processes, in particular how they relate to complex behaviour, remain elusive. The genomes of flies, mosquitoes, the honeybee and the beetle are now fully sequenced and span an evolutionary breadth of about 350 million years; this provides a unique opportunity to conduct a comparative genomics study of the synapse.

Results

We compiled a list of 120 gene prototypes that comprise the core of presynaptic structures in insects. Insects lack several scaffolding proteins in the active zone, such as bassoon and piccolo, and the most abundant protein in the mammalian synaptic vesicle, namely synaptophysin. The pattern of evolution of synaptic protein complexes is analyzed. According to this analysis, the components of presynaptic complexes as well as proteins that take part in organelle biogenesis are tightly coordinated. Most synaptic proteins are involved in rich protein interaction networks. Overall, the number of interacting proteins and the degrees of sequence conservation between human and insects are closely correlated. Such a correlation holds for exocytotic but not for endocytotic proteins.

Conclusion

This comparative study of human and insects sheds light on the composition and assembly of protein complexes in the synapse. Specifically, the nature of the protein interaction graphs differentiates exocytotic from endocytotic proteins and suggests unique evolutionary constraints for each set. General principles in the design of proteins of the presynaptic site can be inferred from a comparative study of human and insect genomes.

20.