Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.
Corallimorpharia are the closest noncalcifying relatives of reef‐building corals. Aside from their popularity among aquarium hobbyists, their evolutionary position between the Actiniaria (sea anemones) and the Scleractinia (hard corals) makes them ideal candidates for comparative studies aiming at understanding the evolution of hexacorallian orders in general and reef‐building corals in particular. Here we have sequenced and assembled two draft genomes for the Corallimorpharia species Amplexidiscus fenestrafer and Discosoma sp. The draft genomes encompass 370 and 445 Mbp, respectively, and encode for 21,372 and 23,199 genes. To facilitate future studies using these resources, we provide annotations for the predicted gene models—not only at gene level, by annotating gene models with the function of the best‐matching homologue, and GO terms when available; but also at protein domain level, where gene function can be better verified through the conservation of the sequence and order of protein domains. Further, we provide an online platform ( http://corallimorpharia.reefgenomics.org ), which includes a blast interface and a genome browser to facilitate the use of these resources. We believe that these two genomes are important resources for future studies on hexacorallian systematics and the evolutionary basis of their specific traits such as the symbiotic relationship with dinoflagellates of the genus Symbiodinium or the evolution of calcification in reef‐building corals.  相似文献   

3.
Plants frequently possess operon‐like gene clusters for specialized metabolism. Cultivated rice, Oryza sativa, produces antimicrobial diterpene phytoalexins represented by phytocassanes and momilactones, and the majority of their biosynthetic genes are clustered on chromosomes 2 and 4, respectively. These labdane‐related diterpene phytoalexins are biosynthesized from geranylgeranyl diphosphate via ent‐copalyl diphosphate or syn‐copalyl diphosphate. The two gene clusters consist of genes encoding diterpene synthases and chemical‐modification enzymes including P450s. In contrast, genes for the biosynthesis of gibberellins, which are labdane‐related phytohormones, are scattered throughout the rice genome similar to other plant genomes. The mechanism of operon‐like gene cluster formation remains undefined despite previous studies in other plant species. Here we show an evolutionary insight into the rice gene clusters by a comparison with wild Oryza species. Comparative genomics and biochemical studies using wild rice species from the AA genome lineage, including Oryza barthii, Oryza glumaepatula, Oryza meridionalis and the progenitor of Asian cultivated rice Oryza rufipogon indicate that gene clustering for biosynthesis of momilactones and phytocassanes had already been accomplished before the domestication of rice. Similar studies using the species Oryza punctata from the BB genome lineage, the distant FF genome lineage species Oryza brachyantha and an outgroup species Leersia perrieri suggest that the phytocassane biosynthetic gene cluster was present in the common ancestor of the Oryza species despite the different locations, directions and numbers of their member genes. However, the momilactone biosynthetic gene cluster evolved within Oryza before the divergence of the BB genome via assembly of ancestral genes.  相似文献   

4.
Rickettsia are best known as strictly intracellular vector‐borne bacteria that cause mild to severe diseases in humans and other animals. Recent advances in molecular tools and biological experiments have unveiled a wide diversity of Rickettsia spp. that include species with a broad host range and some species that act as endosymbiotic associates. Molecular phylogenies of Rickettsia spp. contain some ambiguities, such as the position of R. canadensis and relationships within the spotted fever group. In the modern era of genomics, with an ever‐increasing number of sequenced genomes, there is enhanced interest in the use of whole‐genome sequences to understand pathogenesis and assess evolutionary relationships among rickettsial species. Rickettsia have small genomes (1.1–1.5 Mb) as a result of reductive evolution. These genomes contain split genes, gene remnants and pseudogenes that, owing to the colinearity of some rickettsial genomes, may represent different steps of the genome degradation process. Genomics reveal extreme genome reduction and massive gene loss in highly vertebrate‐pathogenic Rickettsia compared to less virulent or endosymbiotic species. Information gleaned from rickettsial genomics challenges traditional concepts of pathogenesis that focused primarily on the acquisition of virulence factors. Another intriguing phenomenon about the reduced rickettsial genomes concerns the large fraction of non‐coding DNA and possible functionality of these “non‐coding” sequences, because of the high conservation of these regions. Despite genome streamlining, Rickettsia spp. contain gene families, selfish DNA, repeat palindromic elements and genes encoding eukaryotic‐like motifs. These features participate in sequence and functional diversity and may play a crucial role in adaptation to the host cell and pathogenesis. 
Genome analyses have identified a large fraction of mobile genetic elements, including plasmids, suggesting the possibility of lateral gene transfer in these intracellular bacteria. Phylogenetic analyses have identified several candidates for horizontal gene acquisition among Rickettsia spp. including tra, pat2, and genes encoding for the type IV secretion system and ATP/ADP translocase that may have been acquired from bacteria living in amoebae. Gene loss, gene duplication, DNA repeats and lateral gene transfer all have shaped rickettsial genome evolution. A comprehensive analysis of the entire genome, including genes and non‐coding DNA, will help to unlock the mysteries of rickettsial evolution and pathogenesis.  相似文献   

5.
Current sequencing methods produce large amounts of data, but genome assemblies based on these data are often woefully incomplete. These incomplete and error-filled assemblies result in many annotation errors, especially in the number of genes present in a genome. In this paper we investigate the magnitude of the problem, both in terms of total gene number and the number of copies of genes in specific families. To do this, we compare multiple draft assemblies against higher-quality versions of the same genomes, using several new assemblies of the chicken genome based on both traditional and next-generation sequencing technologies, as well as published draft assemblies of chimpanzee. We find that upwards of 40% of all gene families are inferred to have the wrong number of genes in draft assemblies, and that these incorrect assemblies both add and subtract genes. Using simulated genome assemblies of Drosophila melanogaster, we find that the major cause of increased gene numbers in draft genomes is the fragmentation of genes onto multiple individual contigs. Finally, we demonstrate the usefulness of RNA-Seq in improving the gene annotation of draft assemblies, largely by connecting genes that have been fragmented in the assembly process.  相似文献   

6.

Background  

One of the many gene families that expanded in early vertebrate evolution is the neuropeptide Y (NPY) receptor family of G-protein coupled receptors. Earlier work by our lab suggested that several of the NPY receptor genes found in extant vertebrates resulted from two genome duplications before the origin of jawed vertebrates (gnathostomes) and one additional genome duplication in the actinopterygian lineage, based on their location on chromosomes sharing several gene families. In this study we have investigated, in five vertebrate genomes, 45 gene families with members close to the NPY receptor genes in the compact genomes of the teleost fishes Tetraodon nigroviridis and Takifugu rubripes. These correspond to Homo sapiens chromosomes 4, 5, 8 and 10.

7.
Spirodela polyrhiza is a fast‐growing aquatic monocot with highly reduced morphology, genome size and number of protein‐coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158‐Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome‐wide physical maps combined with high‐coverage short‐read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela‐specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non‐essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large‐scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family.  相似文献   

8.
9.
Hybridization between divergent lineages generates new allelic combinations. One mechanism that can hinder the formation of hybrid populations is mitonuclear incompatibility, that is, dysfunctional interactions between proteins encoded in the nuclear and mitochondrial genomes (mitogenomes) of diverged lineages. Theoretically, selective pressure due to mitonuclear incompatibility can affect genotypes in a hybrid population in which nuclear genomes and mitogenomes from divergent lineages admix. To directly and thoroughly observe this key process, we de novo sequenced the 747‐Mb genome of the coastal goby, Chaenogobius annularis, and investigated its integrative genomic phylogeographics using RNA‐sequencing, RAD‐sequencing, genome resequencing, whole mitogenome sequencing, amplicon sequencing, and small RNA‐sequencing. Chaenogobius annularis populations have been geographically separated into Pacific Ocean (PO) and Sea of Japan (SJ) lineages by past isolation events around the Japanese archipelago. Despite the divergence history and potential mitonuclear incompatibility between these lineages, the mitogenomes of the PO and SJ lineages have coexisted for generations in a hybrid population on the Sanriku Coast. Our analyses revealed accumulation of nonsynonymous substitutions in the PO‐lineage mitogenomes, including two convergent substitutions, as well as signals of mitochondrial lineage‐specific selection on mitochondria‐related nuclear genes. Finally, our data implied that a microRNA gene was involved in resolving mitonuclear incompatibility. Our integrative genomic phylogeographic approach revealed that mitonuclear incompatibility can affect genome evolution in a natural hybrid population.  相似文献   

10.
Baculoviruses, members of the family Baculoviridae, are large, enveloped viruses that contain a double‐stranded circular DNA genome of 80–180 kbp, encoding 90–180 putative proteins. These viruses are exclusively pathogenic for arthropods, particularly insects, and have been developed, or are being developed, as environmentally sound pesticides and eukaryotic vectors for foreign protein expression, surface display, gene delivery for gene therapy, vaccine production and drug screening. The baculoviruses contain a set of approximately 30 core genes that are conserved among all baculovirus genomes sequenced to date. Individual baculoviruses also contain a number of lineage‐ or species‐specific genes that have greatly impacted the diversification and evolution of baculoviruses. In this review, we first describe the general properties and biology of baculoviruses and then focus on the baculovirus genes and mechanisms involved in the replication, spread and survival of baculoviruses within the context of their diversity, evolution and insect manipulation.  相似文献   

11.
Combining high‐throughput sequencing with targeted sequence capture has become an attractive tool to study specific genomic regions of interest. Most studies have so far focused on the exome using short‐read technology. These approaches are not designed to capture intergenic regions needed to reconstruct genomic organization, including regulatory regions and gene synteny. Here, we demonstrate the power of combining targeted sequence capture with long‐read sequencing technology for comparative genomic analyses of the haemoglobin (Hb) gene clusters across eight species separated by up to 70 million years. Guided by the reference genome assembly of the Atlantic cod (Gadus morhua) together with genome information from draft assemblies of selected codfishes, we designed probes covering the two Hb gene clusters. Use of custom‐made barcodes combined with PacBio RSII sequencing led to highly continuous assemblies of the LA (~100 kb) and MN (~200 kb) clusters, which include syntenic regions of coding and intergenic sequences. Our results revealed an overall conserved genomic organization of the Hb genes within this lineage, yet with several, lineage‐specific gene duplications. Moreover, for some of the species examined, we identified amino acid substitutions at two sites in the Hbb1 gene as well as length polymorphisms in its regulatory region, which has previously been linked to temperature adaptation in Atlantic cod populations. This study highlights the use of targeted long‐read capture as a versatile approach for comparative genomic studies by generation of a cross‐species genomic resource elucidating the evolutionary history of the Hb gene family across the highly divergent group of codfishes.  相似文献   

12.
Wood‐feeding lower termites harbour symbiotic gut protists that support the termite nutritionally by degrading recalcitrant lignocellulose. These protists themselves host specific endo‐ and ectosymbiotic bacteria, the functions of which remain largely unknown. Here, we present draft genomes of a dominant, uncultured ectosymbiont belonging to the order Bacteroidales, ‘Candidatus Symbiothrix dinenymphae’, which colonizes the cell surface of the cellulolytic gut protists Dinenympha spp. We analysed four single‐cell genomes of Ca. S. dinenymphae; the highest genome completeness was estimated to be 81.6–82.3%, with a predicted genome size of 4.28–4.31 Mb. The genome retains genes encoding large parts of the amino acid, cofactor and nucleotide biosynthetic pathways. In addition, the genome contains genes encoding various glycoside hydrolases such as endoglucanases and hemicellulases. The genome indicates that Ca. S. dinenymphae ferments lignocellulose‐derived monosaccharides to acetate, a major carbon and energy source of the host termite. We suggest that the ectosymbiont digests lignocellulose and provides nutrients to the host termites, and hypothesize that the hydrolytic activity might also function as a pretreatment for the host protist to effectively decompose the crystalline cellulose components.

13.
Sesame (Sesamum indicum L.) is an important oil crop renowned for its high oil content and quality. Recently, genome assemblies for five sesame varieties, including two landraces (S. indicum cv. Baizhima and Mishuozhima) and three modern cultivars (S. indicum var. Zhongzhi13, Yuzhi11 and Swetha), have become available, providing a rich resource for comparative genomic analyses and gene discovery. Here, we employed a reference‐assisted assembly approach to improve the draft assemblies of four of the sesame varieties. We then constructed a sesame pan‐genome of 554.05 Mb. The pan‐genome contained 26 472 orthologous gene clusters; 15 409 (58.21%) of them were core (present across all five sesame genomes), whereas the remaining 41.79% (11 063) clusters and the 15 890 variety‐specific genes were dispensable. Comparisons between varieties suggest that modern cultivars from China and India display significant genomic variation. The gene families unique to the sesame modern cultivars contain genes mainly related to yield and quality, while those unique to the landraces contain genes involved in environmental adaptation. Comparative evolutionary analysis indicates that several genes involved in plant‐pathogen interaction and lipid metabolism are under positive selection, which may be associated with sesame environmental adaptation and selection for high seed oil content. This study of the sesame pan‐genome provides insights into the evolution and genomic characteristics of this important oilseed and constitutes a resource for further sesame crop improvement.

14.
Erigeron breviscapus is an important medicinal plant in Compositae and the first species to realize the whole process from the decoding of the draft genome sequence to scutellarin biosynthesis in yeast. However, the previous low‐quality genome assembly has hindered the optimization of candidate genes involved in scutellarin synthesis and the development of molecular‐assisted breeding based on the genome. Here, the E. breviscapus genome was updated using PacBio RSII sequencing data and Hi‐C data, and increased in size from 1.2 Gb to 1.43 Gb, with a scaffold N50 of 156.82 Mb and contig N50 of 140.95 kb, and a total of 43,514 protein‐coding genes were obtained and oriented onto nine pseudo‐chromosomes, thus becoming the third plant species assembled to chromosome level after sunflower and lettuce in Compositae. Fourteen genes with evidence for positive selection were identified and found to be related to leaf morphology, flowering and secondary metabolism. The number of genes in some gene families involved in flavonoid biosynthesis in E. breviscapus has been significantly expanded. In particular, additional candidate genes involved in scutellarin biosynthesis, such as flavonoid‐7‐O‐glucuronosyltransferase genes (F7GATs), were identified using the updated genome. In addition, three candidate genes encoding indole‐3‐pyruvate monooxygenase YUCCA2 (YUC2), serine carboxypeptidase‐like 18 (SCPL18), and F‐box protein (FBP), respectively, were identified as probably related to leaf development and flowering by resequencing 99 individuals. These results provided a substantial genetic basis for improving agronomic and quality traits of E. breviscapus, and provided a platform for improving other draft genome assemblies to chromosome level.

15.
Long terminal repeat retrotransposons (LTR‐RTs) represent a major fraction of plant genomes, but processes leading to transposition bursts remain elusive. Polyploidy expectedly leads to LTR‐RT proliferation, as the merging of divergent diploids provokes a genome shock activating LTR‐RTs and/or genetic redundancy supports the accumulation of active LTR‐RTs through relaxation of selective constraints. Available evidence supports interspecific hybridization as the main trigger of genome dynamics, but few studies have addressed the consequences of intraspecific polyploidy (i.e. autopolyploidy), where the genome shock is expectedly minimized. The dynamics of LTR‐RTs was thus here evaluated through low coverage 454 sequencing of three closely related diploid progenitors and three independent autotetraploids from the young Biscutella laevigata species complex. Genomes from this early diverging Brassicaceae lineage presented a minimum of 40% repeats and a large diversity of transposable elements. Differential abundances and patterns of sequence divergence among genomes for 37 LTR‐RT families revealed contrasted dynamics during species diversification. Quiescent LTR‐RT families with limited genetic variation among genomes were distinguished from active families (37.8%) having proliferated in specific taxa. Specific families proliferated in autopolyploids only, but most transpositionally active families in polyploids were also differentiated among diploids. Low expression levels of transpositionally active LTR‐RT families in autopolyploids further supported that genome shock and redundancy are non‐mutually exclusive triggers of LTR‐RT proliferation. Although reputed stable, autopolyploid genomes show LTR‐RT fractions presenting analogies with polyploids between widely divergent genomes.  相似文献   

16.
17.
18.
Aldehyde dehydrogenase (ALDH) superfamily represents a group of NAD(P)+-dependent enzymes that catalyze the oxidation of a wide spectrum of endogenous and exogenous aldehydes. With the advent of megabase genome sequencing, the ALDH superfamily is expanding rapidly on many fronts. As expected, ALDH genes are found in virtually all genomes analyzed to date, indicating the importance of these enzymes in biological functions. Complete genome sequences of various species have revealed additional ALDH genes. As of July 2000, the ALDH superfamily consists of 331 distinct genes, of which eight are found in archaea, 165 in eubacteria, and 158 in eukaryota. The number of ALDH genes in some species with their genomes completely sequenced and annotated, Escherichia coli and Caenorhabditis elegans, ranges from 10 to 17. In the human genome, 17 functional genes and three pseudogenes have been identified to date. Divergent evolution, based on multiple alignment analysis of 86 eukaryotic ALDH amino-acid sequences, was the basis of the standardized ALDH gene nomenclature system (Pharmacogenetics 9: 421–434, 1999). Thus far, the eukaryotic ALDHs comprise 20 gene families. A complete list of all ALDH sequences known to date is presented here along with the evolution analysis of the eukaryotic ALDHs.  相似文献   

19.
Aims: The aims of this study are to obtain the draft genome sequence of Streptomyces coelicoflavus ZG0656, which produces novel acarviostatin family α‐amylase inhibitors, and then to reveal the putative acarviostatin‐related gene cluster and the biosynthetic pathway. Methods and Results: The draft genome sequence of S. coelicoflavus ZG0656 was generated using a shotgun approach employing a combination of 454 and Solexa sequencing technologies. Genome analysis revealed a putative gene cluster for acarviostatin biosynthesis, termed sct‐cluster. The cluster contains 13 acarviostatin synthetic genes, six transporter genes, four starch degrading or transglycosylation enzyme genes and two regulator genes. On the basis of bioinformatic analysis, we proposed a putative biosynthetic pathway of acarviostatins. The intracellular steps produce a structural core, acarviostatin I00‐7‐P, and the extracellular assemblies lead to diverse acarviostatin end products. Conclusions: The draft genome sequence of S. coelicoflavus ZG0656 revealed the putative biosynthetic gene cluster of acarviostatins and a putative pathway of acarviostatin production. Significance and Impact of the Study: To our knowledge, S. coelicoflavus ZG0656 is the first strain in this species for which a genome sequence has been reported. The analysis of sct‐cluster provided important insights into the biosynthesis of acarviostatins. This work will be a platform for producing novel variants and yield improvement.  相似文献   

20.
Glycine latifolia (Benth.) Newell & Hymowitz (2n = 40), one of the 27 wild perennial relatives of soybean, possesses genetic diversity and agronomically favorable traits that are lacking in soybean. Here, we report the 939‐Mb draft genome assembly of G. latifolia (PI 559298) using exclusively linked‐reads sequenced from a single Chromium library. We organized scaffolds into 20 chromosome‐scale pseudomolecules utilizing two genetic maps and the Glycine max (L.) Merr. genome sequence. High copy numbers of putative 91‐bp centromere‐specific tandem repeats were observed in consecutive blocks within predicted pericentromeric regions on several pseudomolecules. No 92‐bp putative centromeric repeats, which are abundant in G. max, were detected in G. latifolia or Glycine tomentella. Annotation of the assembled genome and subsequent filtering yielded a high confidence gene set of 54 475 protein‐coding loci. In comparative analysis with five legume species, genes related to defense responses were significantly overrepresented in Glycine‐specific orthologous gene families. A total of 304 putative nucleotide‐binding site (NBS)‐leucine‐rich‐repeat (LRR) genes were identified in this genome assembly. Different from other legume species, we observed a scarcity of TIR‐NBS‐LRR genes in G. latifolia. The G. latifolia genome was also predicted to contain genes encoding 367 LRR‐receptor‐like kinases, a family of proteins involved in basal defense responses and responses to abiotic stress. The genome sequence and annotation of G. latifolia provide a valuable source of alternative alleles and novel genes to facilitate soybean improvement. This study also highlights the efficacy and cost‐effectiveness of the application of Chromium linked‐reads in diploid plant genome de novo assembly.
