Similar literature
 20 similar articles found (search time: 30 ms)
1.

Background

Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogenetic studies, such as those available in large EST libraries. One rarely used method of inference, gene tree parsimony, can infer species trees from gene families undergoing duplication and loss, but its performance has not been evaluated at a phylogenomic scale for EST data in plants.

Results

A gene tree parsimony analysis based on EST data was undertaken for six angiosperm model species and Pinus, an outgroup. Although a large fraction of the tentative consensus sequences obtained from the TIGR database of ESTs was assembled into homologous clusters too small to be phylogenetically informative, some 557 clusters contained promising levels of information. Based on maximum likelihood estimates of the gene trees obtained from these clusters, gene tree parsimony correctly inferred the accepted species tree with strong statistical support. A slight variant of this species tree was obtained when maximum parsimony was used to infer the individual gene trees instead.

Conclusion

Despite the complexity of the EST data and the relatively small fraction eventually used in inferring a species tree, the gene tree parsimony method performed well in the face of very high apparent rates of duplication.
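The duplication-counting idea behind gene tree parsimony can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes rooted binary trees written as nested tuples, maps each gene-tree node to the smallest species-tree clade that contains its species (the LCA mapping), and counts a duplication whenever a node maps to the same clade as one of its children. The names `count_duplications` and `gtp_score`, the toy trees, and the gene-to-species dictionary are all hypothetical.

```python
def leaf_set(tree):
    """Return the frozenset of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaf_set(left) | leaf_set(right)


def species_clades(species_tree):
    """Collect the leaf set of every node (clade) in the species tree."""
    clades = []

    def walk(node):
        clades.append(leaf_set(node))
        if not isinstance(node, str):
            for child in node:
                walk(child)

    walk(species_tree)
    return clades


def lca_clade(clades, species):
    """Smallest species-tree clade containing all the given species."""
    return min((c for c in clades if species <= c), key=len)


def count_duplications(gene_tree, species_tree, gene_to_species):
    """Count duplications implied by reconciling one gene tree (LCA mapping)."""
    clades = species_clades(species_tree)
    dups = 0

    def map_node(node):
        nonlocal dups
        if isinstance(node, str):
            return lca_clade(clades, frozenset([gene_to_species[node]]))
        left, right = node
        m_left, m_right = map_node(left), map_node(right)
        m_here = lca_clade(clades, m_left | m_right)
        # A node that maps to the same clade as a child implies a duplication.
        if m_here == m_left or m_here == m_right:
            dups += 1
        return m_here

    map_node(gene_tree)
    return dups


def gtp_score(gene_trees, species_tree, gene_to_species):
    """Total duplications over all gene trees; lower is more parsimonious."""
    return sum(count_duplications(g, species_tree, gene_to_species)
               for g in gene_trees)


# Toy data (hypothetical): three species, genes named by species plus copy number.
species_tree = (("A", "B"), "C")
mapping = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
congruent = (("a1", "b1"), "c1")
duplicated = ((("a1", "b1"), ("a2", "b2")), "c1")
print(gtp_score([congruent, duplicated], species_tree, mapping))
```

Under this sketch, a species tree search would evaluate `gtp_score` for each candidate species tree and keep the one with the lowest total; losses are ignored here for brevity.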

2.
Helicosporidia are obligate invertebrate pathogens with a unique and highly adapted mode of infection. The evolutionary history of Helicosporidia has been uncertain, but several recent molecular phylogenetic studies have shown an unexpectedly close relationship to green algae, and specifically to the opportunistic pathogen Prototheca. To date, molecular sequences from Helicosporidia are restricted to those genes used for phylogenetic reconstruction and genes related to the existence and function of its cryptic plastid. We have therefore conducted a small expressed sequence tag (EST) project on Helicosporidium sp., yielding about 700 unique sequences. We have examined the functional distribution of known genes, the distribution of EST abundance, and the prevalence of previously unknown gene sequences. To demonstrate the potential utility of large amounts of data, we have used ribosomal proteins to test whether the phylogenetic position of Helicosporidium inferred from a small number of genes is broadly supported by a large number of genes. We conducted phylogenetic analyses on 69 ribosomal proteins and found that 98% supported the green algal origin of Helicosporidia and 80% supported a specific relationship with Prototheca. Overall, these data multiply the available molecular information from Helicosporidium 100-fold, which should provide the basis for new insights into these unusual but interesting parasites.

3.
4.
5.
6.
7.
We have conducted a preliminary phylogenetic survey of ammonia-oxidizing beta-proteobacteria, using 16S rRNA gene libraries prepared by selective PCR and DNA from acid and neutral soils and polluted and nonpolluted marine sediments. Enrichment cultures were established from samples and analyzed by PCR. Analysis of 111 partial sequences of c. 300 bases revealed that the environmental sequences formed seven clusters, four of which are novel, within the phylogenetic radiation defined by cultured autotrophic ammonia oxidizers. Longer sequences from 13 cluster representatives support their phylogenetic positions relative to cultured taxa. These data suggest that known taxa may not be representative of the ammonia-oxidizing beta-proteobacteria in our samples. Our data provide further evidence that molecular and culture-based enrichment methods can select for different community members. Most enrichments contained novel Nitrosomonas-like sequences whereas novel Nitrosospira-like sequences were more common from gene libraries of soils and marine sediments. This is the first evidence for the occurrence of Nitrosospira-like strains in marine samples. Clear differences between the sequences of soil and marine sediment libraries were detected. Comparison of 16S rRNA sequences from polluted and nonpolluted sediments provided no strong evidence that the community composition was determined by the degree of pollution. Soil clone sequences fell into four clusters, each containing sequences from acid and neutral soils in varying proportions. Our data suggest that some related strains may be present in both samples, but further work is needed to resolve whether there is selection due to pH for particular sequence types.

8.
9.
10.
A detailed assessment of the evolution and phylogenetic utility of two genes, ftsZ and wsp, was used to investigate the origin of male-killing Wolbachia, previously isolated from the ladybird Adalia bipunctata and the butterfly Acraea encedon. The analysis included almost all available sequences of B-group Wolbachia and two outgroup taxa and showed that (1) the two gene regions differ in phylogenetic utility, (2) sequence variation is here correlated with phylogenetic information content, (3) both genes show significant rate heterogeneity between lineages, (4) increased substitution rates are associated with homoplasy in the data, (5) wsp sequences of some taxa appear to be subject to positive selection, and (6) only a limited number of clades can be inferred with confidence due to either lack of phylogenetic information or the presence of homoplasy. With respect to the evolution of male-killing, the two genes nevertheless seemed to provide unbiased information. However, they consistently produce contradictory results. Current data therefore do not permit clarification of the origin of this behavior. In addition, A. bipunctata was found to be a host to two recently diverged strains of male-killing Wolbachia that showed increased substitution rates for both genes. Moreover, the wsp gene, which codes for an outer membrane protein, was found to be subject to positive selection in these taxa. These findings were postulated to be the product of high selection pressures due to antagonistic host-symbiont interactions in this ladybird species. In conclusion, our study demonstrates that the results of a detailed phylogenetic analysis, including characterization of the limitations of such an approach, can serve as a valuable basis for an understanding of the evolution of Wolbachia bacteria. 
Moreover, particular features of gene evolution, such as elevated substitution rates or the presence of positive selection, may provide information about the dynamics of Wolbachia-host associations.

11.
12.

Background

Although the overwhelming majority of genes found in angiosperms are members of gene families, and both gene and genome duplication are pervasive forces in plant genomes, some genes are sufficiently distinct from all other genes in a genome that they can be operationally defined as 'single copy'. Using the gene clustering algorithm MCL-tribe, we have identified a set of 959 genes that are shared as single copy genes in the genomes of Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa. To characterize these genes, we have performed a number of analyses examining GO annotations, coding sequence length, number of exons, number of domains, and presence in distant lineages such as Selaginella and Physcomitrella, together with phylogenetic analyses to estimate copy number in other seed plants and to demonstrate their phylogenetic utility. We then provide examples of how these genes may be used in phylogenetic analyses to reconstruct organismal history, both by using extant coverage in EST databases for seed plants and de novo amplification via RT-PCR in the family Brassicaceae.

Results

There are 959 single copy nuclear genes shared in Arabidopsis, Populus, Vitis and Oryza ["APVO SSC genes"]. The majority of these genes are also present in the Selaginella and Physcomitrella genomes. Public EST sets for 197 species suggest that most of these genes are present across a diverse collection of seed plants, and appear to exist as single or very low copy genes, though exceptions are seen in recently polyploid taxa and in lineages where there is significant evidence for a shared large-scale duplication event. Genes encoding proteins localized in organelles are more commonly single copy than expected by chance, but the evolutionary forces responsible for this bias are unknown. Regardless of the evolutionary mechanisms responsible for the large number of shared single copy genes in diverse flowering plant lineages, these genes are valuable for phylogenetic and comparative analyses. Eighteen of the APVO SSC genes were amplified in the Brassicaceae using RT-PCR and directly sequenced. Alignments of these sequences provide improved resolution of Brassicaceae phylogeny compared to recent studies using plastid and ITS sequences. An analysis of sequences from 13 APVO SSC genes from 69 species of seed plants, derived mainly from public EST databases, yielded a phylogeny that was largely congruent with prior hypotheses based on multiple plastid sequences. Whereas single gene phylogenies that rely on EST sequences have limited bootstrap support as the result of limited sequence information, concatenated alignments result in phylogenetic trees with strong bootstrap support for already established relationships. Overall, these single copy nuclear genes are promising markers for phylogenetics, and contain a greater proportion of phylogenetically informative sites than commonly used protein-coding sequences from the plastid or mitochondrial genomes.

Conclusions

Putatively orthologous, shared single copy nuclear genes provide a vast source of new evidence for plant phylogenetics, genome mapping, and other applications, as well as a substantial class of genes for which functional characterization is needed. Preliminary evidence indicates that many of the shared single copy nuclear genes identified in this study may be well suited as markers for addressing phylogenetic hypotheses at a variety of taxonomic levels.

13.
Large-scale statistical analysis of secondary xylem ESTs in pine   Total citations: 3 (self-citations: 0, citations by others: 3)

14.
Expressed sequence tag (EST) data provide a powerful tool for identification of transcribed DNA sequences. However, as ESTs are relatively short, many exons are poorly covered by ESTs, which reduces the utility of EST data. Recently, signature sequence tag (SST) fingerprints were proposed as an alternative to EST fingerprints. Given a fingerprint set of probes, the SST of a clone is the subset of probes from the fingerprint set that hybridize with the clone. We demonstrate that besides being a powerful technique for screening cDNA libraries, SST technology provides very accurate gene predictions. Even with a small fingerprint set (600-800 probes), SST-based gene recognition outperforms many conventional and EST-based methods. Increasing the size of the fingerprint set to 1500 probes provides almost perfect gene recognition. Even more importantly, SST-based gene predictions miss very few exons and, therefore, provide an opportunity to bypass the cDNA sequencing step on the way from finished genomic sequence to mutation detection in gene-hunting projects. Because SST data can be obtained in a highly parallel and inexpensive way, SST technology has the potential to complement EST technology for gene hunting.
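The SST definition above can be illustrated with a short sketch. It rests on a strong simplifying assumption that is not from the abstract: hybridization is modeled as an exact substring match between a probe and the clone sequence, whereas real hybridization involves complementary strands and tolerates mismatches. The function name `sst_fingerprint` and the toy sequences are hypothetical.

```python
def sst_fingerprint(clone_seq, probes):
    """Return the subset of probes that 'hybridize' with the clone.

    Assumption: a probe hybridizes if it occurs verbatim in the clone
    sequence. This stands in for the laboratory hybridization signal.
    """
    return frozenset(p for p in probes if p in clone_seq)


# Toy fingerprint set and clone (hypothetical data).
probes = ["ACGT", "GGCC", "TTAA"]
clone = "TTACGTAGGCCA"

sst = sst_fingerprint(clone, probes)
print(sorted(sst))
```

Representing the SST as a set makes clone comparison straightforward: two clones from the same gene would be expected to share most of their hybridizing probes.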

15.
16.
17.
18.
A molecular understanding of porcine reproduction is of biological interest and economic importance. Our Midwest Consortium has produced cDNA libraries containing the majority of genes expressed in major female reproductive tissues, and we have deposited into public databases 21,499 expressed sequence tag (EST) gene sequences from the 3′ end of clones from these libraries. These sequences represent 10,574 different genes, based on sequence comparison among these data, and comparison with existing porcine ESTs and genes indicates that as many as 4652 of these EST clusters are novel. In silico analysis identified sequences that are expressed in specific pig tissues or organs and confirmed the broad expression in pig for many genes ubiquitously expressed in human tissues. Furthermore, we have developed computer software to identify sequence similarity of these pig genes with their human counterparts, and to extract the mapping information of these human homologues from genome databases. We demonstrate the utility of this software for comparative mapping by localizing 61 genes on the porcine physical map for Chromosomes (Chrs) 5, 10, and 14. The following Accession numbers were assigned to our deposited sequences: BF701840 – BF704551, BF708383, BF708386 – BF713604, BG322266 – BG322271, BI398567 – BI405235, BQ597354 – BQ605166.

19.
20.
MOTIVATION: A whole set of Expressed Sequence Tags (ESTs) from the Sf9 cell line of Spodoptera frugiperda is presented here for the first time. In this way we aim to identify both conserved and specific genes of this pest species. We also expect this analysis to yield a class of protein sequences that provides a tool to explore genomic features and phylogeny of Lepidoptera. RESULTS: The ESTs include both housekeeping and developmentally regulated genes, and a high percentage of sequences with unknown function. Among the identified ORFs, almost all ribosomal proteins (RPs) were found with high EST redundancy and hence sequence accuracy. The codon usage found among RP genes is, on average, surprisingly much less biased in Lepidoptera than in other organisms. Other Spodoptera genes also displayed a low bias, suggesting a general genome expression feature in this lepidopteran. We also found that the L35A and L36 RP sequences display 40 and 10 amino-acid insertions, respectively, both being present only in insects. Sequence analysis suggests that they are probably not subject to strong selective pressure and may be good phylogenetic markers for Lepidoptera. Most interestingly, the Lepidoptera sequences of 9 RP genes displayed a specific signature different from the canonical one. We conclude that the RP family allows valuable comparative genomics and phylogeny of Lepidoptera. AVAILABILITY: All EST sequence data are available from the private 'Spodo-Base' upon request.
