Similar articles
 20 similar articles found (search time: 31 ms)
1.
2.
3.

Background

Songbirds (oscine Passeriformes) are among the most diverse and successful vertebrate groups, comprising almost half of all known bird species. Identifying the genomic innovations that might be associated with this success, as well as with characteristic songbird traits such as vocal learning and the brain circuits that underlie this behavior, has proven difficult, in part due to the small number of avian genomes available until recently. Here we performed a comparative analysis of 48 avian genomes to identify genomic features that are unique to songbirds, and made an initial assessment of their function by investigating their tissue distribution and predicted protein domain structure.

Results

Using BLAT alignments and gene synteny analysis, we curated a large set of Ensembl gene models that were annotated as novel or duplicated in the most commonly studied songbird, the Zebra finch (Taeniopygia guttata), and then extended this analysis to 47 additional avian and 4 non-avian genomes. We identified 10 novel genes uniquely present in songbird genomes. A refined map of chromosomal synteny disruptions in the Zebra finch genome revealed that the majority of these novel genes localized to regions of genomic instability associated with apparent chromosomal breakpoints. Analyses of in situ hybridization and RNA-seq data revealed that a subset of songbird-unique genes is expressed in the brain and/or other tissues, and that 2 of these (YTHDC2L1 and TMRA) are highly differentially expressed in vocal learning-associated nuclei relative to the rest of the brain.

Conclusions

Our study reveals novel genes unique to songbirds, including some that may subserve their unique vocal control system, substantially improves the quality of Zebra finch genome annotations, and contributes to a better understanding of how genomic features may have evolved in conjunction with the emergence of the songbird lineage.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1082) contains supplementary material, which is available to authorized users.

4.

Background

Nucleomorphs are residual nuclei derived from eukaryotic endosymbionts in chlorarachniophyte and cryptophyte algae. The endosymbionts that gave rise to nucleomorphs and plastids in these two algal groups were green and red algae, respectively. Despite their independent origin, the chlorarachniophyte and cryptophyte nucleomorph genomes share similar genomic features such as extreme size reduction and a three-chromosome architecture. This suggests that similar reductive evolutionary forces have acted to shape the nucleomorph genomes in the two groups. Thus far, however, only a single chlorarachniophyte nucleomorph and plastid genome has been sequenced, making broad evolutionary inferences within the chlorarachniophytes and between chlorarachniophytes and cryptophytes difficult. We have sequenced the nucleomorph and plastid genomes of the chlorarachniophyte Lotharella oceanica in order to gain insight into nucleomorph and plastid genome diversity and evolution.

Results

The L. oceanica nucleomorph genome was found to consist of three linear chromosomes totaling ~610 kilobase pairs (kbp), much larger than the 373 kbp nucleomorph genome of the model chlorarachniophyte Bigelowiella natans. The L. oceanica plastid genome is 71 kbp in size, similar to that of B. natans. Unexpectedly long (~35 kbp) sub-telomeric repeat regions were identified in the L. oceanica nucleomorph genome; internal multi-copy regions were also detected. Gene content analyses revealed that nucleomorph house-keeping genes and spliceosomal intron positions are well conserved between the L. oceanica and B. natans nucleomorph genomes. More broadly, gene retention patterns were found to be similar between nucleomorph genomes in chlorarachniophytes and cryptophytes. Chlorarachniophyte plastid genomes showed near identical protein coding gene complements as well as a high level of synteny.

Conclusions

We have provided insight into the process of nucleomorph genome evolution by elucidating the fine-scale dynamics of sub-telomeric repeat regions. Homologous recombination at the chromosome ends appears to be frequent, serving to expand and contract nucleomorph genome size. The main factor influencing nucleomorph genome size variation between different chlorarachniophyte species appears to be expansion-contraction of these telomere-associated repeats rather than changes in the number of unique protein coding genes. The dynamic nature of chlorarachniophyte nucleomorph genomes lies in stark contrast to their plastid genomes, which appear to be highly stable in terms of gene content and synteny.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-374) contains supplementary material, which is available to authorized users.

5.
6.

Background

Gene prediction is a challenging but crucial part of most genome analysis pipelines. Various methods have evolved that predict genes either ab initio on reference sequences or in an evidence-based manner with the help of additional information, such as RNA-Seq reads or EST libraries. However, none of these strategies is bias-free, and one method alone does not necessarily provide a complete set of accurate predictions.

Results

We present IPred (Integrative gene Prediction), a method that integrates ab initio and evidence-based gene identifications to combine the advantages of different prediction strategies. IPred builds on the output of gene finders and generates a new combined set of gene identifications that represents the integrated evidence of the single-method predictions.
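As a rough illustration of the integration idea only (the abstract does not specify IPred's actual algorithm, so this is not it), the Python sketch below keeps an ab initio prediction when an evidence-based prediction on the same strand covers it by a minimum fraction. The `Gene` tuple, the `integrate` function and the 0.8 threshold are hypothetical names and values chosen for the example.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    start: int   # 1-based inclusive start coordinate
    end: int     # 1-based inclusive end coordinate
    strand: str  # "+" or "-"


def overlap_fraction(a: Gene, b: Gene) -> float:
    """Fraction of gene `a` covered by gene `b` (0.0 if strands differ)."""
    if a.strand != b.strand:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    return max(shared, 0) / (a.end - a.start + 1)


def integrate(ab_initio: List[Gene], evidence: List[Gene],
              min_overlap: float = 0.8) -> List[Gene]:
    """Keep ab initio genes supported by at least one evidence-based gene.

    Hypothetical rule: support means coverage >= min_overlap.
    """
    return [
        gene for gene in ab_initio
        if any(overlap_fraction(gene, ev) >= min_overlap for ev in evidence)
    ]


# Toy input: three ab initio calls, two evidence-based calls.
ab_initio = [Gene(100, 400, "+"), Gene(900, 1200, "-"), Gene(2000, 2300, "+")]
evidence = [Gene(90, 410, "+"), Gene(950, 1000, "-")]
combined = integrate(ab_initio, evidence)
```

In this toy input only the first ab initio gene is fully covered by an evidence-based call; the second is covered over a small fraction of its length and the third has no support at all.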

Conclusion

We evaluate IPred in simulations and real-data experiments on Escherichia coli and human data. We show that IPred improves prediction accuracy in comparison to single-method predictions and to existing methods for prediction combination.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1315-9) contains supplementary material, which is available to authorized users.

7.

Background

Mutations often accompany DNA replication. Since there may be fewer cell cycles per year in the germlines of long-lived than short-lived angiosperms, the genomes of long-lived angiosperms may be diverging more slowly than those of short-lived angiosperms. Here we test this hypothesis.

Results

We first constructed a genetic map for walnut, a woody perennial. All linkage groups were short, and recombination rates were greatly reduced in the centromeric regions. We then used the genetic map to construct a walnut bacterial artificial chromosome (BAC) clone-based physical map, which contained 15,203 exonic BAC-end sequences, and used it to quantify synteny between the walnut genome and the genomes of three long-lived woody perennials, Vitis vinifera, Populus trichocarpa, and Malus domestica, and three short-lived herbs, Cucumis sativus, Medicago truncatula, and Fragaria vesca. Each measure of synteny we used showed that the genomes of woody perennials were less diverged from the walnut genome than those of herbs. We also estimated the nucleotide substitution rate at silent codon positions in the walnut lineage. It was one-fifth and one-sixth of published nucleotide substitution rates in the Medicago and Arabidopsis lineages, respectively. We uncovered a whole-genome duplication in the walnut lineage, dated it to the neighborhood of the Cretaceous-Tertiary boundary, and allocated the 16 walnut chromosomes into eight homoeologous pairs. We pointed out that during polyploidy-dysploidy cycles, the dominant tendency is to reduce the chromosome number.

Conclusion

Slow rates of nucleotide substitution are accompanied by slow rates of synteny erosion during genome divergence in woody perennials.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1906-5) contains supplementary material, which is available to authorized users.

8.

Background

Recent advances in DNA sequencing techniques resulted in more than forty sequenced plant genomes representing a diverse set of taxa of agricultural, energy, medicinal and ecological importance. However, gene family curation is often only inferred from DNA sequence homology and lacks insights into evolutionary processes contributing to gene family dynamics. In a comparative genomics framework, we integrated multiple lines of evidence provided by gene synteny, sequence homology and protein-based Hidden Markov Modelling to extract homologous super-clusters composed of multi-domain resistance (R)-proteins of the NB-LRR type (for NUCLEOTIDE BINDING/LEUCINE-RICH REPEATS), that are involved in plant innate immunity.

Results

To assess the diversity of R-proteins within and between species, we screened twelve eudicot plant genomes including six major crops and found a total of 2,363 NB-LRR genes. Our curated R-proteins set shows a 50% average for tandem duplicates and a 22% fraction of gene copies retained from ancient polyploidy events (ohnologs). We provide evidence for strong positive selection and show significant differences in molecular evolution rates (Ka/Ks-ratio) among tandem- (mean = 1.59), ohnolog (mean = 1.36) and singleton (mean = 1.22) R-gene duplicates. To foster the process of gene-edited plant breeding, we report species-specific presence/absence of all 140 NB-LRR genes present in the model plant Arabidopsis and describe four distinct clusters of NB-LRR “gatekeeper” loci sharing syntenic orthologs across all analyzed genomes.

Conclusion

By curating a near-complete set of multi-domain R-protein clusters on a eudicot-wide scale, our analysis offers significant insight into evolutionary dynamics underlying diversification of the plant innate immune system. Furthermore, our methods provide a blueprint for future efforts to identify and more rapidly clone functional NB-LRR genes from any plant species.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-966) contains supplementary material, which is available to authorized users.

9.
10.
11.

Background

Microarray technology, as well as other functional genomics experiments, allows simultaneous measurements of thousands of genes within each sample. Both the prediction accuracy and interpretability of a classifier could be enhanced by performing the classification based only on selected discriminative genes. We propose a statistical method for selecting genes based on an analysis of how expression data overlap across classes. This method results in a novel measure, called proportional overlapping score (POS), of a feature’s relevance to a classification task.

Results

We apply POS, along with four widely used gene selection methods, to several benchmark gene expression datasets. The classification error rates computed using the Random Forest, k Nearest Neighbor and Support Vector Machine classifiers show that POS achieves better performance.

Conclusions

A novel gene selection method, POS, is proposed. POS analyzes the overlap of expression values across classes, taking into account the proportions of overlapping samples. It robustly defines a mask for each gene that allows it to minimize the effect of expression outliers. The constructed masks, along with a novel gene score, are exploited to produce the selected subset of genes.
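The general notion of scoring a gene by how much its expression values overlap between classes can be sketched as below. This is a deliberately simplified illustration, not the published POS: it uses raw class ranges rather than the robust masks mentioned in the abstract, and the function name `overlap_proportion` and the toy values are assumptions made for the example.

```python
from typing import Sequence


def overlap_proportion(class_a: Sequence[float],
                       class_b: Sequence[float]) -> float:
    """Share of all samples whose value lies in the interval where the
    two class ranges overlap (0.0 when the ranges are disjoint).

    Simplified stand-in for an overlap score; lower means the gene
    separates the two classes better.
    """
    low = max(min(class_a), min(class_b))    # start of the shared interval
    high = min(max(class_a), max(class_b))   # end of the shared interval
    if low > high:
        return 0.0
    values = list(class_a) + list(class_b)
    inside = sum(low <= v <= high for v in values)
    return inside / len(values)


# Expression of one hypothetical gene in two classes of four samples each.
score_partial = overlap_proportion([1.0, 1.2, 1.5, 2.0], [1.8, 2.5, 3.0, 3.2])
score_disjoint = overlap_proportion([1.0, 2.0], [3.0, 4.0])
```

Because the ranges here are taken from the minimum and maximum of each class, a single outlier can widen the shared interval considerably, which is the sensitivity that a robust per-gene mask is meant to reduce.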

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-274) contains supplementary material, which is available to authorized users.

12.

Background

Single copy genes are common across angiosperm genomes. With sufficiently high-quality sequenced genomes, the large-scale identification of single copy genes among multiple species is possible. Although some characteristics have been reported, our study provides novel insights into single copy genes.

Results

We identified single copy genes across 29 angiosperm genomes. A significant negative correlation was found between the number of duplicate blocks and the number of single copy genes. We found that a considerable number of single copy genes are located in organelles, showing a preference for binding and catalytic activity. The analysis of the effective number of codons (Nc) illustrates that single copy genes have a stronger codon bias than non-single copy genes in eudicots. The relatively high expression level of single copy genes was partially confirmed by the RNA-seq data, rather than by the Codon Adaptation Index (CAI). Unlike in most other species, a strongly negative correlation occurs between Nc and GC3 among single copy genes in grass genomes. When compared to all non-single copy genes, single copy genes show more conservation (as indicated by Ka and Ks values). However, our alternative splicing (AS) results reveal that selective constraints are weaker in single copy genes than in low copy family genes (1–10 in-paralogs) and stronger than in high copy family genes (>10 in-paralogs). Using concatenated shared single copy genes, we obtained a well-resolved phylogenetic tree. With the addition of intron sequences, the branch support is improved, but striking incongruences are also evident. Therefore, it is noteworthy that inclusion of intron sequences seems more appropriate for phylogenetic reconstruction at lower taxonomic levels.
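For readers unfamiliar with GC3, it is the G+C fraction at third codon positions of a coding sequence. The Python sketch below shows one plain way to compute it; the function name `gc3` and the toy sequence are illustrative assumptions, and conventions differ on details such as whether stop codons are counted (this sketch counts every complete codon).

```python
def gc3(cds: str) -> float:
    """G+C fraction at third codon positions of an in-frame coding sequence.

    Trailing bases that do not form a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # length truncated to whole codons
    third = seq[2:usable:3]            # every third base, starting at index 2
    if not third:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)


# Codons ATG GCC AAA TAG have third bases G, C, A, G.
example_gc3 = gc3("ATGGCCAAATAG")
```

A correlation such as the one reported between Nc and GC3 would then be computed across genes from one such value per coding sequence.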

Conclusions

Our analysis provides insight into the evolutionary characteristics of single copy genes across 29 angiosperm genomes. The results suggest that there are key differences in evolutionary constraints between single copy genes and non-single copy genes. To some extent, these evolutionary constraints show species-specific differences, especially between eudicots and monocots. Our preliminary evidence also suggests that the concatenated shared single copy genes are well suited for use in resolving phylogenetic relationships.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-504) contains supplementary material, which is available to authorized users.

13.

Background

The correct taxonomic assignment of bacterial genomes is a primary and challenging task. With the availability of whole genome sequences, gene-content-based approaches appear promising for inferring bacterial taxonomy. Complete sequencing of a bacterial genome often reveals a substantial number of unique genes, present only in that genome, which can be used for its taxonomic classification.

Results

In this study, we have proposed a comprehensive method which uses the taxon-specific genes for the correct taxonomic assignment of existing and new bacterial genomes. The taxon-specific genes identified at each taxonomic rank have been successfully used for the taxonomic classification of 2,342 genomes present in the NCBI genomes, 36 newly sequenced genomes, and 17 genomes for which the complete taxonomy is not yet known. This approach has been implemented for the development of a tool ‘Microtaxi’ which can be used for the taxonomic assignment of complete bacterial genomes.

Conclusion

The taxon-specific gene-based approach provides a valuable alternative methodology for the taxonomic classification of newly sequenced or existing bacterial genomes.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1542-0) contains supplementary material, which is available to authorized users.

14.

Background

Cryptosporidium hominis is a dominant species for human cryptosporidiosis. Within the species, IbA10G2 is the most virulent subtype, responsible for all C. hominis–associated outbreaks in Europe and Australia, and is a dominant outbreak subtype in the United States. In recent years, IaA28R4 has been emerging as a major new subtype in the United States. In this study, we sequenced the genomes of two field specimens from each of the two subtypes and conducted a comparative genomic analysis of the obtained sequences with those from the only fully sequenced Cryptosporidium parvum genome.

Results

Altogether, 8.59-9.05 Mb of Cryptosporidium sequences in 45–767 assembled contigs were obtained from the four specimens, representing 94.36-99.47% coverage of the expected genome. These genomes had complete synteny in gene organization and 96.86-97.0% and 99.72-99.83% nucleotide sequence similarities to the published genomes of C. parvum and C. hominis, respectively. Several major insertions and deletions were seen between C. hominis and C. parvum genomes, involving mostly members of multicopy gene families near telomeres. The four C. hominis genomes were highly similar to each other and divergent from the reference IaA25R3 genome in some highly polymorphic regions. Major sequence differences among the four specimens sequenced in this study were in the 5′ and 3′ ends of chromosome 6 and the gp60 region, largely the result of genetic recombination.

Conclusions

The sequence similarity among specimens of the two dominant outbreak subtypes and genetic recombination in chromosome 6, especially around the putative virulence determinant gp60 region, suggest that genetic recombination plays a potential role in the emergence of hyper-transmissible C. hominis subtypes. The high sequence conservation between C. parvum and C. hominis genomes and significant differences in copy numbers of MEDLE family secreted proteins and insulinase-like proteases indicate that telomeric gene duplications could potentially contribute to host expansion in C. parvum.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1517-1) contains supplementary material, which is available to authorized users.

15.

Background

Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs).

Results

The existence of many AMs and the development of new ones create the need to (a) compare different annotations for a single genome and (b) generate an annotation by combining individual ones. To address these issues, we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides a detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations by combining individual ones. To illustrate BEACON’s utility, we provide a comparative analysis of multiple annotations generated for four genomes and show with these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced.

Conclusions

We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1826-4) contains supplementary material, which is available to authorized users.

16.

Background

The community composition of the human microbiome is known to vary across distinct anatomical niches, but little is known about the nature of variations, if any, at the genome/sub-genome levels of a specific microbial community across different niches. The present report aims to explore, as a case study, the variations in the gene repertoire of 28 Prevotella reference genomes derived from different human body sites, as reported earlier by the Human Microbiome Consortium.

Results

The pan-genome for Prevotella remains “open”. On average, 17% of the predicted protein-coding genes of any particular Prevotella genome represent conserved core genes, while the remaining 83% are flexible genes or singletons. The study reveals the exclusive presence of 11798, 3673, 3348 and 934 gene families and the exclusive absence of 17, 221, 115 and 645 gene families in Prevotella genomes derived from the human oral cavity, gastrointestinal tract (GIT), urogenital tract (UGT) and skin, respectively. The distribution of various functional COG categories differs significantly among the habitat-specific genes. No niche-specific variations could be observed in the distribution of KEGG pathways.
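The core / flexible / singleton partition used in pan-genome analyses can be sketched from a presence/absence table as follows. This is a minimal illustration under common definitions (core: present in every genome; singleton: present in exactly one; flexible: everything in between), not the study's pipeline. The function name `classify_families` and the toy family and genome labels are hypothetical, and the sketch assumes that every genome appears in at least one family.

```python
from typing import Dict, Set


def classify_families(presence: Dict[str, Set[str]]) -> Dict[str, str]:
    """Label each gene family as 'core', 'flexible' or 'singleton'.

    `presence` maps a gene family to the set of genomes that contain it.
    """
    # The full genome set is taken to be every genome seen in the table.
    genomes: Set[str] = set().union(*presence.values())
    labels: Dict[str, str] = {}
    for family, members in presence.items():
        if members == genomes:
            labels[family] = "core"        # found in every genome
        elif len(members) == 1:
            labels[family] = "singleton"   # found in exactly one genome
        else:
            labels[family] = "flexible"    # found in some but not all
    return labels


# Toy table with three genomes and three gene families.
toy_presence = {
    "famA": {"g1", "g2", "g3"},
    "famB": {"g1", "g2"},
    "famC": {"g3"},
}
toy_labels = classify_families(toy_presence)
```

Niche-exclusive families, such as those reported per body site, could be found the same way by comparing each family's genome set with the set of genomes from one habitat.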

Conclusions

Prevotella genomes derived from different body sites differ appreciably in gene repertoire, suggesting that these microbiome components might have developed distinct genetic strategies for niche adaptation within the host. Each individual microbe might also have a component of its own genetic machinery for host adaptation, as suggested by the large number of singletons.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1350-6) contains supplementary material, which is available to authorized users.

17.
18.

Background

Evidence based on genomic sequences is urgently needed to confirm the phylogenetic relationship between Mesorhizobium strain MAFF303099 and M. huakuii. To define underlying causes for the rather striking difference in host specificity between M. huakuii strain 7653R and MAFF303099, several probable determinants also require comparison at the genomic level. An improved understanding of mobile genetic elements that can be integrated into the main chromosomes of Mesorhizobium to form genomic islands would enrich our knowledge of how genome dynamics may contribute to Mesorhizobium evolution in general.

Results

In this study, we sequenced the complete genome of 7653R and compared it with five other Mesorhizobium genomes. Genomes of 7653R and MAFF303099 were found to share a large set of orthologs and, most importantly, a conserved chromosomal backbone and even larger perfectly conserved synteny blocks. We also identified candidate molecular differences responsible for the different host specificities of these two strains. Finally, we reconstructed an ancestral Mesorhizobium genomic island that has evolved into diverse forms in different Mesorhizobium species.

Conclusions

Our ortholog and synteny analyses firmly establish MAFF303099 as a strain of M. huakuii. Differences in nodulation factors and secretion systems T3SS, T4SS, and T6SS may be responsible for the unique host specificities of 7653R and MAFF303099 strains. The plasmids of 7653R may have arisen by excision of the original genomic island from the 7653R chromosome.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-440) contains supplementary material, which is available to authorized users.

19.

Background

Vertebrate mitochondrial genomes (mitogenomes) are 16–18 kbp double-stranded circular DNAs that encode a set of 37 genes. The arrangement of these genes and the major noncoding region is relatively conserved through evolution although gene rearrangements have been described for diverse lineages. The tandem duplication-random loss model has been invoked to explain the mechanisms of most mitochondrial gene rearrangements. Previously reported mitogenomic sequences for geckos rarely included gene rearrangements, which we explore in the present study.

Results

We determined seven new mitogenomic sequences from Gekkonidae using a high-throughput sequencing method. The Tropiocolotes tripolitanus mitogenome involves a tandem duplication of the gene block: tRNA-Arg, NADH dehydrogenase subunit 4L, and NADH dehydrogenase subunit 4. One of the duplicate copies for each protein-coding gene may be pseudogenized. A duplicate copy of the tRNA-Arg gene appears to have been converted to a tRNA-Gln gene by a C to T base substitution at the second anticodon position, although this gene may not be fully functional in protein synthesis. The Stenodactylus petrii mitogenome includes several tandem duplications of tRNA-Leu genes, as well as a translocation of the tRNA-Ala gene and a putative origin of light-strand replication within a tRNA gene cluster. Finally, the Uroplatus fimbriatus and U. ebenaui mitogenomes feature the apparent loss of the tRNA-Glu gene from its original position. Uroplatus fimbriatus appears to retain a translocated tRNA-Glu gene adjacent to the 5’ end of the major noncoding region.

Conclusions

The present study describes several new mitochondrial gene rearrangements from Gekkonidae. The loss and reassignment of tRNA genes are not very common in vertebrate mitogenomes, and our findings raise new questions as to how missing tRNAs are supplied and whether the reassigned tRNA gene is fully functional. These new examples of mitochondrial gene rearrangements in geckos should broaden our understanding of the evolution of mitochondrial gene arrangements.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-930) contains supplementary material, which is available to authorized users.

20.