Similar literature (20 results)
1.

Background

Single copy genes are common across angiosperm genomes. With sufficiently high-quality sequenced genomes, large-scale identification of single copy genes across multiple species is possible. Although some characteristics have been reported, our study provides novel insights into single copy genes.

Results

We identified single copy genes across 29 angiosperm genomes. A significant negative correlation was found between the number of duplicate blocks and the number of single copy genes. We found that a considerable number of single copy genes are located in organelles, showing a preference for binding and catalytic activity. The analysis of effective number of codons (Nc) illustrates that single copy genes have a stronger codon bias than non-single copy genes in eudicots. The relatively high expression level of single copy genes was partially confirmed by the RNA-seq data, rather than by the Codon Adaptation Index (CAI). Unlike in most other species, a strongly negative correlation occurs between Nc and GC3 among single copy genes in grass genomes. When compared to all non-single copy genes, single copy genes show more conservation (as indicated by Ka and Ks values). However, our alternative splicing (AS) results reveal that selective constraints are weaker in single copy genes than in low copy family genes (1–10 in-paralogs) and stronger than in high copy family genes (>10 in-paralogs). Using concatenated shared single copy genes, we obtained a well-resolved phylogenetic tree. With the addition of intron sequences, the branch support is improved, but striking incongruences are also evident. Therefore, it is noteworthy that inclusion of intron sequences seems more appropriate for phylogenetic reconstruction at lower taxonomic levels.
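To make the GC3 statistic mentioned above concrete (GC content at third codon positions), here is a minimal Python sketch. The function name `gc3` and the toy sequence are assumptions for illustration only; the study does not describe its implementation.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    seq = cds.upper()
    # Drop trailing bases that do not form a complete codon.
    usable = len(seq) - len(seq) % 3
    # Third codon positions are indices 2, 5, 8, ...
    third = seq[2:usable:3]
    if not third:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)


# Hypothetical toy sequence with codons ATG GCC AAA TTC.
print(gc3("ATGGCCAAATTC"))  # 0.75
```

A per-gene GC3 value computed this way could then be correlated with Nc across genes, which is the kind of comparison the abstract reports.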

Conclusions

Our analysis provides insight into the evolutionary characteristics of single copy genes across 29 angiosperm genomes. The results suggest that there are key differences in evolutionary constraints between single copy genes and non-single copy genes. And to some extent, these evolutionary constraints show some species-specific differences, especially between eudicots and monocots. Our preliminary evidence also suggests that the concatenated shared single copy genes are well suited for use in resolving phylogenetic relationships.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-504) contains supplementary material, which is available to authorized users.

2.
Pomegranate (Punica granatum L.) is one of the oldest known edible fruits. It has grown in popularity and is a profitable fruit crop due to its attractive features including a bright red appearance and its biological activities. Scientific exploration of the genetics and evolution of these beneficial traits has been hampered by limited genomic information. In this study, we sequenced the complete chloroplast (cp) genome of the native P. granatum (cultivar Helow) cultivated in the mountains of Jabal Al-Akhdar, Oman. The results revealed a P. granatum cp genome length of 158,630 bp, characterized by a relatively conserved structure containing 2 inverted repeat regions of 25,466 bp, an 18,686 bp small single copy region, and an 89,015 bp large single copy region. The genome contains 86 protein-coding genes, 37 transfer RNA genes, and 8 ribosomal RNA genes. Comparison of the P. granatum whole cp genome with seven Lagerstroemia species revealed an overall high degree of sequence similarity with divergence among intergenic spacers. The location, distribution, and divergence of repeat sequences and shared genes of the Punica and Lagerstroemia species were highly similar. Analyses of nucleotide substitution, insertion/deletions, and highly variable regions in these cp genomes identified potential plastid markers for taxonomic and phylogenetic studies in Myrtales. A phylogenetic study of the cp genomes and 76 shared coding regions generated similar cladograms. The complete cp genome of P. granatum will aid in taxonomical studies of the family Lythraceae.

3.

Background

The number of completely sequenced plastid genomes available is growing rapidly. This array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is often useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the genomes reported here: Nuphar advena (from a basal-most lineage) and Ranunculus macranthus (a basal eudicot). We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages) to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs) and longer dispersed repeats (SDR), and patterns of nucleotide composition.

Results

The Nuphar [GenBank:NC_008788] and Ranunculus [GenBank:NC_008796] plastid genomes share characteristics of gene content and organization with many other chloroplast genomes. Like other plastid genomes, these genomes are A+T-rich, except for rRNA and tRNA genes. Detailed comparisons of Nuphar with Nymphaea, another member of Nymphaeaceae, show that more than two-thirds of these genomes exhibit at least 95% sequence identity and that most SSRs are shared. In broader comparisons, SSRs vary among genomes in terms of abundance and length, and most contain repeat motifs based on A and T nucleotides.

Conclusion

SSR and SDR abundance varies by genome and, for SSRs, is proportional to genome size. Long SDRs are rare in the genomes assessed. SSRs occur less frequently than predicted and, although the majority of the repeat motifs do include A and T nucleotides, the A+T bias in SSRs is less than that predicted from the underlying genomic nucleotide composition. In codon usage, third positions show an A+T bias; however, variation in codon usage does not correlate with differences in A+T-richness. Thus, although plastome nucleotide composition shows "A+T richness", an A+T bias is not apparent upon more in-depth analysis, at least in these aspects. The pattern of evolution in the sequences identified as ycf15 and ycf68 is not consistent with them being protein-coding genes. In fact, these regions show no evidence of sequence conservation beyond what is normal for non-coding regions of the IR.
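The SSR counts discussed in this abstract depend on how a repeat is defined. The following Python sketch shows one simple way to scan a sequence for perfect SSRs with a regular expression. The function name `find_ssrs` and the minimum repeat thresholds are illustrative assumptions, not the criteria used in the study.

```python
import re

# Minimum repeat counts per motif length; illustrative assumptions only.
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) tuples for perfect SSRs, sorted by start."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A motif of length k followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if k > 1 and len(set(motif)) == 1:
                continue  # homopolymer run; the k = 1 scan handles these
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical toy sequence: a 12-base A run and six AT repeats.
toy = "GC" + "A" * 12 + "CG" + "AT" * 6 + "GGC"
print(find_ssrs(toy))  # [(2, 'A', 12), (16, 'AT', 6)]
```

Changing the thresholds changes the counts, so SSR abundance is best compared across genomes only when the same definition is applied to each.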

4.
More than 190 plastid genomes have been completely sequenced during the past two decades due to advances in DNA sequencing technologies. Based on this unprecedented abundance of data, extensive genomic changes have been revealed in the plastid genomes. Inversion is the most common mechanism that leads to gene order changes. Several inversion events have been recognized as informative phylogenetic markers, such as a 30‐kb inversion found in all living vascular plants except lycopsids and two short inversions putatively shared by all ferns. Gene loss is a common event throughout plastid genome evolution. Many genes were independently lost or transferred to the nuclear genome in multiple plant lineages. The trnR‐CCG gene was lost in some clades of lycophytes, ferns, and seed plants, and all the ndh genes were absent in parasitic plants, gnetophytes, Pinaceae, and the Taiwan moth orchid. Certain parasitic plants have, in particular, lost plastid genes related to photosynthesis because of the relaxation of functional constraint. The dramatic growth of plastid genome sequences has also promoted the use of whole plastid sequences and genomic features to solve phylogenetic problems. Chloroplast phylogenomics has provided additional evidence for deep‐level phylogenetic relationships as well as increased phylogenetic resolution at low taxonomic levels. However, chloroplast phylogenomics is still in its infancy and rigorous analysis methodology has yet to be developed.

5.

Background

Artemisia frigida Willd. is an important Mongolian traditional medicinal plant used to stanch bleeding and reduce swelling. However, little sequence and genomic information is available for Artemisia frigida, which makes phylogenetic identification, evolutionary studies, and genetic improvement of its value very difficult. We report the complete chloroplast genome sequence of Artemisia frigida based on 454 pyrosequencing.

Methodology/Principal Findings

The complete chloroplast genome of Artemisia frigida is 151,076 bp, including a large single copy (LSC) region of 82,740 bp, a small single copy (SSC) region of 18,394 bp and a pair of inverted repeats (IRs) of 24,971 bp. The genome contains 114 unique genes and 18 duplicated genes. The chloroplast genome of Artemisia frigida contains a small 3.4 kb inversion within a large 23 kb inversion in the LSC region, a unique feature in Asteraceae. The gene order in the SSC region of Artemisia frigida is inverted compared with the other 6 Asteraceae species with sequenced chloroplast genomes. This inversion was likely caused by an intramolecular recombination event that occurred only in Artemisia frigida. The existence of rich SSR loci in the Artemisia frigida chloroplast genome provides a rare opportunity to study population genetics of this Mongolian medicinal plant. Phylogenetic analysis demonstrates a sister relationship between Artemisia frigida and four other species in Asteraceae, including Ageratina adenophora, Helianthus annuus, Guizotia abyssinica and Lactuca sativa, based on 61 protein-coding sequences. Furthermore, Artemisia frigida was placed in the tribe Anthemideae in the subfamily Asteroideae (Asteraceae) based on ndhF and trnL-F sequence comparisons.
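The region sizes reported above can be checked against the total genome length, since a quadripartite plastome consists of one LSC, one SSC and two IR copies. The short Python sketch below does this; the function name `quadripartite_length` is an assumption for illustration, and the figures are those given in the abstract.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) reported for Artemisia frigida in the abstract above.
total = quadripartite_length(lsc=82_740, ssc=18_394, ir=24_971)
print(total)  # 151076
```

The result matches the reported 151,076 bp, which confirms that the IR length is given per copy.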

Conclusion

The chloroplast genome sequence of Artemisia frigida was assembled and analyzed in this study, representing the first plastid genome sequenced in the Anthemideae tribe. This complete chloroplast genome sequence will be useful for molecular ecology and molecular phylogeny studies within Artemisia species and also within the Asteraceae family.

6.

Background

Genome level analyses have enhanced our view of phylogenetics in many areas of the tree of life. With the production of whole genome DNA sequences of hundreds of organisms and large-scale EST databases, a large number of candidate genes for inclusion in phylogenetic analysis have become available. In this work, we exploit the burgeoning genomic data being generated for plant genomes to address one of the more important plant phylogenetic questions, concerning the hierarchical relationships of the several major seed plant lineages (angiosperms, Cycadales, Ginkgoales, Gnetales, and Coniferales), which continues to be a work in progress despite numerous studies using single, few or several genes and morphology datasets. Although most recent studies support the notion that gymnosperms and angiosperms are monophyletic and sister groups, they differ on the topological arrangements within each major group.

Methodology

We exploited the EST database to construct a supermatrix of DNA sequences (over 1,200 concatenated orthologous gene partitions for 17 taxa) to examine non-flowering seed plant relationships. This analysis employed programs that offer rapid and robust orthology determination of novel, short sequences from plant ESTs based on reference seed plant genomes. Our phylogenetic analysis retrieved an unbiased (with respect to gene choice), well-resolved and highly supported phylogenetic hypothesis that was robust to various outgroup combinations.
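To illustrate what building a supermatrix involves, the Python sketch below concatenates per-gene alignments and pads taxa that are missing from a partition with a missing-data symbol. The function name `build_supermatrix`, the `?` padding character and the toy alignments are assumptions for illustration; the study's own pipeline is not described at this level of detail.

```python
def build_supermatrix(partitions, taxa, missing="?"):
    """Concatenate per-gene alignments; pad absent taxa with missing-data symbols."""
    matrix = {taxon: [] for taxon in taxa}
    for alignment in partitions:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("each partition must be aligned to a single length")
        width = lengths.pop()
        for taxon in taxa:
            # Taxa without a sequence for this gene get a run of missing symbols.
            matrix[taxon].append(alignment.get(taxon, missing * width))
    return {taxon: "".join(chunks) for taxon, chunks in matrix.items()}


# Hypothetical toy data: two gene partitions, three taxa.
genes = [
    {"A": "ACGT", "B": "ACGA"},
    {"A": "GG", "C": "GA"},
]
print(build_supermatrix(genes, taxa=["A", "B", "C"]))
# {'A': 'ACGTGG', 'B': 'ACGA??', 'C': '????GA'}
```

Padding keeps every row the same length, which is why the proportion of missing data becomes one of the variables such analyses need to evaluate.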

Conclusions

We evaluated character support and the relative contribution of numerous variables (e.g. gene number, missing data, partitioning schemes, taxon sampling and outgroup choice) on tree topology, stability and support metrics. Our results indicate that while missing characters and order of addition of genes to an analysis do not influence branch support, inadequate taxon sampling and limited choice of outgroup(s) can lead to spurious inference of phylogeny when dealing with phylogenomic scale data sets. As expected, support and resolution increase significantly as more informative characters are added, until reaching a threshold, beyond which support metrics stabilize, and the effect of adding conflicting characters is minimized.

7.
Several individuals of the Caribbean Zamia clade and other cycad genera were used to identify single‐copy nuclear genes for phylogeographic and phylogenetic studies in Cycadales. Two strategies were employed to select target loci: (i) a tblastX search of the Arabidopsis conserved ortholog sequence (COS) set and (ii) a tblastX search of Arabidopsis‐Populus‐Vitis‐Oryza Shared Single‐Copy genes (APVO SSC) against the Zamia EST databases in GenBank. From the first strategy, 30 loci were selected, and from the second, 16 loci. In both cases, the matching GenBank accessions of Zamia were used as a query for retrieving highly similar sequences from Cycas, Picea, Pinus species or Ginkgo biloba. After retrieving and aligning all the sequences in each locus, intron predictions were completed to assist in primer design. PCR was carried out in three rounds to detect paralogous loci. A total of 29 loci were successfully amplified as a single band, of which 20 were likely single‐copy loci. These loci showed different diversity and divergence levels. A preliminary screening allowed us to select 8 promising loci (40S, ATG2, BG, GroES, GTP, LiSH, PEX4 and TR) for the Zamia pumila complex and 4 loci (COS26, GroES, GTP and HTS) for all other cycad genera.

8.
Mitochondrial genomes of plants are much larger than those of mammals and often contain conserved open reading frames (ORFs) of unknown function. Here, we show that one of these conserved ORFs is actually the gene for ribosomal protein L10 (rpl10) in plants. No rpl10 gene has heretofore been reported in any mitochondrial genome other than the exceptionally gene-rich genome of the protist Reclinomonas americana. Conserved ORFs corresponding to rpl10 are present in a wide diversity of land plant and green algal mitochondrial genomes. The mitochondrial rpl10 genes are transcribed in all nine land plants examined, with five seed plant genes subject to RNA editing. In addition, mitochondrial-rpl10-like cDNAs were identified in EST libraries from numerous land plants. In three lineages of angiosperms, rpl10 is either lost from the mitochondrial genome or is a pseudogene. In two of them (Brassicaceae and monocots), no nuclear copy of mitochondrial rpl10 is identifiably present, and instead a second copy of nuclear-encoded chloroplast rpl10 is present. Transient assays using green fluorescent protein indicate that this duplicate gene is dual targeted to mitochondria and chloroplasts. We infer that mitochondrial rpl10 has been functionally replaced by duplicated chloroplast counterparts in Brassicaceae and monocots.

9.
10.

Background

Plastids have inherited their own genomes from a single cyanobacterial ancestor, but the majority of cyanobacterial genes, once retained in the ancestral plastid genome, have been lost or transferred into the eukaryotic host nuclear genome via endosymbiotic gene transfer. Although previous studies showed that cyanobacterial gnd genes, which encode 6-phosphogluconate dehydrogenase, are present in several plastid-lacking protists as well as primary and secondary plastid-containing phototrophic eukaryotes, the evolutionary paths of these genes remain elusive.

Results

Here we show an extended phylogenetic analysis including novel gnd gene sequences from Excavata and Glaucophyta. Our analysis demonstrated the patchy distribution of the excavate genes in the gnd gene phylogeny. The Diplonema gene was related to cytosol-type genes in red algae and Opisthokonta, while heterolobosean genes occupied basal phylogenetic positions with plastid-type red algal genes within the monophyletic eukaryotic group that is sister to cyanobacterial genes. Statistical tests based on exhaustive maximum likelihood analyses strongly rejected that heterolobosean gnd genes were derived from a secondary plastid of green lineage. In addition, the cyanobacterial gnd genes from phototrophic and phagotrophic species in Euglenida were robustly monophyletic with Stramenopiles, and this monophyletic clade was moderately separated from those of red algae. These data suggest that these secondary phototrophic groups might have acquired the cyanobacterial genes independently of secondary endosymbioses.

Conclusion

We propose an evolutionary scenario in which plastid-lacking Excavata acquired cyanobacterial gnd genes via eukaryote-to-eukaryote lateral gene transfer or primary endosymbiotic gene transfer early in eukaryotic evolution, and then lost either their pre-existing or cyanobacterial gene.

11.
12.

Background

Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogenetic studies, such as those available in large EST libraries. One rarely used method of inference, gene tree parsimony, can infer species trees from gene families undergoing duplication and loss, but its performance has not been evaluated at a phylogenomic scale for EST data in plants.

Results

A gene tree parsimony analysis based on EST data was undertaken for six angiosperm model species and Pinus, an outgroup. Although a large fraction of the tentative consensus sequences obtained from the TIGR database of ESTs was assembled into homologous clusters too small to be phylogenetically informative, some 557 clusters contained promising levels of information. Based on maximum likelihood estimates of the gene trees obtained from these clusters, gene tree parsimony correctly inferred the accepted species tree with strong statistical support. A slight variant of this species tree was obtained when maximum parsimony was used to infer the individual gene trees instead.

Conclusion

Despite the complexity of the EST data and the relatively small fraction eventually used in inferring a species tree, the gene tree parsimony method performed well in the face of very high apparent rates of duplication.

13.
The complete plastid genome sequence of the American cranberry (Vaccinium macrocarpon Ait.) was reconstructed using next-generation sequencing data by in silico procedures. We used Roche 454 shotgun sequence data to isolate cranberry plastid-specific sequences of “HyRed” via homology comparisons with complete sequences from several species available at the National Center for Biotechnology Information database. Eleven cranberry plastid contigs were selected for the construction of the plastid genome, based on homologies and on raw reads flowing through contigs and connection information. We assembled and annotated a cranberry plastid genome (82,284 reads; 185x coverage) with a length of 176 kb and the typical structure found in plants, but with several structural rearrangements in the large single-copy region when compared to other asterid plastid genomes. To evaluate the reliability of the sequence data, phylogenetic analysis of 30 species outside the order Ericales (with 54 genes) showed Vaccinium inside the clade Asteridae, as reported in other studies using single genes. The cranberry plastid genome sequence will allow the accumulation of critical data useful for breeding and a suite of other genetic studies.

14.
Lagerstroemia (crape myrtle) is an important plant genus used in ornamental horticulture in temperate regions worldwide. As such, numerous hybrids have been developed. However, DNA sequence resources and genome information for Lagerstroemia are limited, hindering evolutionary inferences regarding interspecific relationships. We report the complete plastid genome of Lagerstroemia fauriei. To our knowledge, this is the first reported whole plastid genome within Lythraceae. This genome is 152,440 bp in length with 38% GC content and consists of two single-copy regions separated by a pair of 25,793 bp inverted repeats. The large single copy and the small single copy regions span 83,921 bp and 16,933 bp, respectively. The genome contains 129 genes, including 17 located in each inverted repeat. Phylogenetic analysis of genera sampled from Geraniaceae, Myrtaceae, and Onagraceae corroborated the sister relationship between Lythraceae and Onagraceae. The plastid genomes of L. fauriei and several other Lythraceae species lack the rpl2 intron, indicating an early loss of this intron within the Lythraceae lineage. The plastid genome of L. fauriei provides a much needed genetic resource for further phylogenetic research in Lagerstroemia and Lythraceae. Highly variable markers were identified for use in phylogenetic, barcoding and conservation genetic applications.

15.

Background

Species of Paris Sect. Marmorata are valuable medicinal plants that synthesize steroidal saponins with pharmacological activity. However, the wild resources of these species are threatened by over-exploitation before molecular genetic studies have characterized their genomes and evolutionary significance. Thus, complete chloroplast genome sequences of Sect. Marmorata are crucial to understanding the plastome evolution of this section and to facilitating future population genetic studies. Here, we determined chloroplast genomes of Sect. Marmorata and conducted a whole chloroplast genome comparison.

Results

This study presents detailed sequences and structural variations of chloroplast genomes of Sect. Marmorata. Over 40 large repeats and approximately 130 simple sequence repeats, as well as a group of genomic hotspots, were detected. Inverted repeat contraction in this section was inferred by comparing the chloroplast genomes with that of P. verticillata. Additionally, almost all the plastid protein coding genes were found to prefer codons ending with A/U. Mutation bias and selection pressure predominantly shaped the codon bias of most genes. Most of the genes underwent purifying selection, whereas photosynthetic genes experienced relatively relaxed purifying selection.

Conclusions

Repeat sequences and hotspot regions can be scanned to detect intraspecific and interspecific variability, and selected to infer the phylogenetic relationships of Sect. Marmorata and other species in subgenus Daiswa. Mutation and natural selection were the main forces driving the codon bias pattern of most plastid protein coding genes. This study therefore enhances the understanding of the evolution of Sect. Marmorata from the chloroplast genome, and provides genomic insights for genetic analyses of Sect. Marmorata.

16.
The complete plastid genome sequence of the red macroalga Grateloupia taiwanensis S.-M.Lin & H.-Y.Liang (Halymeniaceae, Rhodophyta) is presented here. Comprising 191,270 bp, the circular DNA contains 233 protein-coding genes and 29 tRNA sequences. In addition, several genes previously unknown to red algal plastids are present in the genome of G. taiwanensis. The plastid genomes from G. taiwanensis and another florideophyte, Gracilaria tenuistipitata var. liui, are very similar in sequence and share significant synteny. In contrast, less synteny is shared between G. taiwanensis and the plastid genome representatives of Bangiophyceae and Cyanidiophyceae. Nevertheless, the gene content of all six red algal plastid genomes here studied is highly conserved, and a large core repertoire of plastid genes can be discerned in Rhodophyta.

17.

Background and Aims

Most molecular phylogenetic studies of Orchidaceae have relied heavily on DNA sequences from the plastid genome. Nuclear and mitochondrial loci have only been superficially examined for their systematic value. Since 40% of the genera within Vanilloideae are achlorophyllous mycoheterotrophs, this is an ideal group of orchids in which to evaluate non-plastid gene sequences.

Methods

Phylogenetic reconstructions for Vanilloideae were produced using independent and combined data from the nuclear 18S, 5·8S and 26S rDNA genes and the mitochondrial atpA gene and nad1b-c intron.

Key Results

These new data indicate placements for genera such as Lecanorchis and Galeola, for which plastid gene sequences have been mostly unavailable. Nuclear and mitochondrial parsimony jackknife trees are congruent with each other and previously published trees based solely on plastid data. Because of high rates of sequence divergence among vanilloid orchids, even the short 5·8S rDNA gene provides impressive levels of resolution and support.

Conclusions

Orchid systematists are encouraged to sequence nuclear and mitochondrial gene regions along with the growing number of plastid loci available.

Key words: 26S rDNA, 18S rDNA, 5·8S rDNA, atpA, nad1, orchids, plastid, Vanilla, vanilloid orchids, Vanilloideae

18.
Chloroplast genome sequences are very useful for species identification and phylogenetics. Chuanminshen (Chuanminshen violaceum Sheh et Shan) is an important traditional Chinese medicinal plant, for which the phylogenetic position is still controversial. In this study, the complete chloroplast genome of Chuanminshen violaceum Sheh et Shan was determined. The total size of the Chuanminshen chloroplast genome was 154,529 bp with 37.8% GC content. It has the typical quadripartite structure, with a large single copy region (84,171 bp), a small single copy region (17,800 bp) and a pair of inverted repeats (26,279 bp each). The whole genome harbors 132 genes, which include 85 protein coding genes, 37 tRNA genes, eight rRNA genes, and two pseudogenes. Thirty-nine SSR loci, 32 tandem repeats and 49 dispersed repeats were found. Phylogenetic analyses conducted with MEGA offered new insight into the phylogenetic relationship of Chuanminshen to Apiales species with reported chloroplast genomes.

19.

Background

Spirodela polyrhiza is a species of the order Alismatales, which represents the basal lineage of monocots with more ancestral features than the Poales. The complete sequence of its mitochondrial (mt) genome could provide clues for understanding the evolution of mt genomes in plants.

Methods

The Spirodela polyrhiza mt genome was sequenced from total genomic DNA without physical separation of chloroplast and nuclear DNA using the SOLiD platform. Using a genome copy number sensitive assembly algorithm, the mt genome was successfully assembled. Gap closure and accuracy were determined with PCR products sequenced with the dideoxy method.

Conclusions

This is the most compact monocot mitochondrial genome, with 228,493 bp. A total of 57 genes encode 35 known proteins, 3 ribosomal RNAs, and 19 tRNAs that recognize 15 amino acids. There are about 600 RNA editing sites predicted and three lineage specific protein-coding-gene losses. The mitochondrial genes, pseudogenes, and other hypothetical genes (ORFs) cover 71,783 bp (31.0%) of the genome. Imported plastid DNA accounts for an additional 9,295 bp (4.1%) of the mitochondrial DNA. Absence of transposable element sequences suggests that very few nuclear sequences have migrated into Spirodela mtDNA. Phylogenetic analysis of conserved protein-coding genes suggests that Spirodela shares a common ancestor with other monocots, but there is no obvious synteny between Spirodela and rice mtDNAs. After eliminating genes, introns, ORFs, and plastid-derived DNA, nearly four-fifths of the Spirodela mitochondrial genome is of unknown origin and function. Although it contains a similar chloroplast DNA content and range of RNA editing as other monocots, it is void of nuclear insertions, active gene loss, and comprises large regions of sequences of unknown origin in non-coding regions. Moreover, the lack of synteny with known mitochondrial genomic sequences sheds new light on the early evolution of monocot mitochondrial genomes.

20.

Background  

The magnoliids with four orders, 19 families, and 8,500 species represent one of the largest clades of early diverging angiosperms. Although several recent angiosperm phylogenetic analyses supported the monophyly of magnoliids and suggested relationships among the orders, the limited number of genes examined resulted in only weak support, and these issues remain controversial. Furthermore, considerable incongruence resulted in phylogenetic reconstructions supporting three different sets of relationships among magnoliids and the two large angiosperm clades, monocots and eudicots. We sequenced the plastid genomes of three magnoliids, Drimys (Canellales), Liriodendron (Magnoliales), and Piper (Piperales), and used these data in combination with 32 other angiosperm plastid genomes to assess phylogenetic relationships among magnoliids and to examine patterns of variation of GC content.
