Similar articles
1.
Insertion and deletion events (indels) provide a suite of markers with enormous potential for molecular phylogenetics. Using many more indel characters than those in previous studies, we here for the first time address the impact of indel inclusion on the phylogenetic inferences of Arctoidea (Mammalia: Carnivora). Based on 6843 indel characters from 22 nuclear intron loci of 16 species of Arctoidea, our analyses demonstrate that when indels were not taken into consideration, both the tree supporting the monophyly of Ursidae and Pinnipedia and the tree supporting the monophyly of Pinnipedia and Musteloidea were recovered, whereas inclusion of indels using three different indel coding schemes gave identical phylogenetic tree topologies supporting the monophyly of Ursidae and Pinnipedia. Our work brings new perspectives on the previously controversial placements among Arctoidea families, and provides another example demonstrating the importance of identifying and incorporating indels in the phylogenetic analyses of introns. In addition, comparison of indel incorporation methods revealed that the three indel coding methods are all advantageous over treating indels as missing data, given that incorporating indels produces consistent results across methods. This is the first report of the impact of different indel coding schemes on phylogenetic reconstruction at the family level in Carnivora, which indicates that indels should be taken into account in future phylogenetic analyses.

2.
The Sepsidae is, with approximately 300 described species, a relatively small family of cyclorrhaphan flies whose behaviour, morphology, and development have been extensively studied. However, currently the only available tree for Sepsidae is more than 10 years old and was based entirely on morphological characters. Here, we present the results of parsimony and Bayesian analyses based on 75 species, ten genes, and morphology. Parsimony and Bayesian analyses produce largely congruent and well‐supported topologies regardless of whether indels are coded as 5th character states, as missing values, or all sites with indels are removed. The tree confirms the monophyly of Sepsidae and identifies the Ropalomeridae as its sister group. With regard to higher‐level relationships, we identify widespread conflict between the morphological and the DNA sequence data. The proposed hypothesis based on both partitions largely reflects the signal in the molecular data. Particularly surprising is the rejection of two relationship hypotheses with strong morphological support, namely the sister group relationship between Orygma and the remaining Sepsidae and the monophyly of the Sepsis species group. Our partitioned Bremer support (PBS) analyses imply that indel coding has a stronger effect on the relative performance of individual gene partitions than the exclusion of alignment‐ambiguous sequences or the location of a gene on the mitochondrial or nuclear genome. However, these analyses also reveal unexpectedly strong fluctuations in PBS values given that indel treatment has only a minor effect on tree topology and jackknife support. These unexpected fluctuations highlight the need for a comparative study across multiple data sets that investigates the influence of conflict and indel treatment on PBS values. © The Willi Hennig Society 2008.
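The three indel treatments named in this abstract (gaps as a 5th character state, gaps as missing values, and removal of all sites with indels) can be illustrated with a minimal sketch. The function name `treat_indels`, the mode labels, and the toy alignment are hypothetical and are not taken from the study; the sketch only assumes an alignment given as equal-length strings with `-` marking gaps.

```python
def treat_indels(alignment, mode):
    """Apply one of three simple indel treatments to an alignment.

    alignment: list of equal-length strings, '-' marks a gap.
    mode: 'fifth_state' keeps '-' as its own character state,
          'missing' recodes '-' as '?',
          'remove' drops every column that contains a gap.
    """
    if mode == "fifth_state":
        # Gaps stay in place and are scored like a fifth nucleotide.
        return list(alignment)
    if mode == "missing":
        # Gaps carry no information; '?' is the usual missing-data symbol.
        return [seq.replace("-", "?") for seq in alignment]
    if mode == "remove":
        # Keep only the columns in which no sequence has a gap.
        keep = [i for i in range(len(alignment[0]))
                if all(seq[i] != "-" for seq in alignment)]
        return ["".join(seq[i] for i in keep) for seq in alignment]
    raise ValueError(f"unknown mode: {mode}")


toy = ["AC-GT", "ACTGT", "A--GT"]  # illustrative data only
for mode in ("fifth_state", "missing", "remove"):
    print(mode, treat_indels(toy, mode))
```

Running the same tree search on each of the three outputs is one way to check, as the abstract does, whether topology depends on indel treatment.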

3.
Insertions and deletions (indels) in chloroplast noncoding regions are common genetic markers to estimate population structure and gene flow, although relatively little is known about indel evolution among recently diverged lineages such as within plant families. Because indel events tend to occur nonrandomly along DNA sequences, recurrent mutations may generate homoplasy for indel haplotypes. This is a potential problem for population studies, because indel haplotypes may be shared among populations after recurrent mutation as well as gene flow. Furthermore, indel haplotypes may differ in fitness and therefore be subject to natural selection detectable as rate heterogeneity among lineages. Such selection could contribute to the spatial patterning of cpDNA haplotypes, greatly complicating the interpretation of cpDNA population structure. This study examined both nucleotide and indel cpDNA variation and divergence at six noncoding regions (psbB-psbH, atpB-rbcL, trnL-trnH, rpl20-5'rps12, trnS-trnG, and trnH-psbA) in 16 individuals from eight species in the Lecythidaceae and a Sapotaceae outgroup. We described patterns of cpDNA changes, assessed the level of indel homoplasy, and tested for rate heterogeneity among lineages and regions. Although regression analysis of branch lengths suggested some degree of indel homoplasy among the most divergent lineages, there was little evidence for indel homoplasy within the Lecythidaceae. Likelihood ratio tests applied to the entire phylogenetic tree revealed a consistent pattern rejecting a molecular clock. Tajima's 1D and 2D tests revealed two taxa with consistent rate heterogeneity, one showing relatively more and one relatively fewer changes than other taxa. In general, nucleotide changes showed more evidence of rate heterogeneity than did indel changes. 
The rate of evolution was highly variable among the six cpDNA regions examined, with the trnS-trnG and trnH-psbA regions showing as much as 10% and 15% divergence within the Lecythidaceae. Deviations from rate homogeneity in the two taxa were constant across cpDNA regions, consistent with lineage-specific rates of evolution rather than cpDNA region-specific natural selection. There is no evidence that indels are more likely than nucleotide changes to experience homoplasy within the Lecythidaceae. These results support a neutral interpretation of cpDNA indel and nucleotide variation in population studies within species such as Corythophora alta.

4.
Genotyping‐by‐sequencing (GBS) and related methods are increasingly used for studies of non‐model organisms from population genetic to phylogenetic scales. We present GIbPSs, a new genotyping toolkit for the analysis of data from various protocols such as RAD, double‐digest RAD, GBS, and two‐enzyme GBS without a reference genome. GIbPSs can handle paired‐end GBS data and is able to assign reads from both strands of a restriction fragment to the same locus. GIbPSs is most suitable for population genetic and phylogeographic analyses. It avoids genotyping errors due to indel variation by identifying and discarding affected loci. GIbPSs creates a genotype database that offers rich functionality for data filtering and export in numerous formats. We performed comparative analyses of simulated and real GBS data with GIbPSs and another program, pyRAD. This program accounts for indel variation by aligning homologous sequences. GIbPSs performed better than pyRAD in several aspects. It required much less computation time and displayed higher genotyping accuracy. GIbPSs retained smaller numbers of loci overall in analyses of real GBS data. It nevertheless delivered more complete genotype matrices with greater locus overlap between individuals and greater numbers of loci sampled in all individuals.

5.
I. Bonnin, J. M. Prosperi, I. Olivieri. Genetics, 1996, 143(4): 1795-1805
Two populations of the selfing annual Medicago truncatula Gaertn. (Leguminosae), each subdivided into three subpopulations, were studied for both metric traits (quantitative characters) and genetic markers (random amplified polymorphic DNA and one morphological, single-locus marker). Hierarchical analyses of variance components show that (1) populations are more differentiated for quantitative characters than for marker loci, (2) the contribution of both within- and among-subpopulation components of variance to overall genetic variance of these characters is reduced as compared to markers, and (3) at the population level, within-population structure is slightly but not significantly larger for markers than for quantitative traits. Under the hypothesis that most markers are neutral, such comparisons may be used to make hypotheses about the strength and heterogeneity of natural selection in the face of genetic drift and gene flow. We thus suggest that in these populations, quantitative characters are under strong divergent selection among populations, and that gene flow is restricted among populations and subpopulations.

6.
7.
We compared four approaches for analyzing three data sets derived from staphylinoid beetles, a superfamily whose known species diversity is roughly comparable to that of vertebrates. One data set is derived from adult morphology and the two molecular data sets are from 12S ribosomal RNA and cytochrome b mitochondrial DNA. We found that taxonomic congruence following conditional data combination, herein called compatible evidence (CE), resolved more nodes compatible with an initial conservative hypothesis than did total evidence (TE), conditional data combination (CDC), or taxonomic congruence (TC). CE sets a base of nodes obtained by CDC analysis and then investigates what further agreement may arise in a universe where these nodes are accepted as given. We suggest that CE75-75 may be appropriate for future studies that aim to both generate a well-corroborated tree and investigate conflicts between data sets, partitions, and characters. CE75-75 is a 75% bootstrap consensus CDC tree followed by combinable-component consensus of a 75% bootstrap consensus of each homogeneous set of partitions having hierarchical structure.

8.
Genetic analyses of population structure can be placed in explicit environmental contexts if appropriate environmental data are available. Here, we use high-coverage and high-resolution oceanographic and genetic sequence data to assess population structure patterns and their potential environmental influences for humpback dolphins in the Western Indian Ocean. We analyzed mitochondrial DNA data from 94 dolphins from the coasts of South Africa, Mozambique, Tanzania and Oman, employing frequency-based and maximum-likelihood algorithms to assess population structure and migration patterns. The genetic data were combined with 13 years of remote sensing oceanographic data of variables known to influence cetacean dispersal and population structure. Our analyses show strong and highly significant genetic structure between all putative populations, except for those in South Africa and Mozambique. Interestingly, the oceanographic data display marked environmental heterogeneity between all sampling areas and a degree of overlap between South Africa and Mozambique. Our combined analyses therefore suggest the occurrence of genetically isolated populations of humpback dolphins in areas that are environmentally distinct. This study highlights the utility of molecular tools in combination with high-resolution and high-coverage environmental data to address questions not only pertaining to genetic population structure, but also to relevant ecological processes in marine species.

9.
10.
Understanding the phylogeography of a species requires not only elucidating patterns of genetic structure among populations, but also identifying the possible evolutionary events creating that structure. The use of a single phylogeographic test or analysis, however, usually provides a picture of genetic structure without revealing the possible underlying evolutionary causes. We used current analytical techniques in a sequential approach to examine genetic structure and its underlying causes in the bogus yucca moth Prodoxus decipiens (Lepidoptera: Prodoxidae). Both historical biogeography and recent human transplantations of the moth's host plants provided a priori expectations of the pattern of genetic structure and its underlying causes. We evaluated these expectations by using a progression of phylogenetic, demographic, and population genetic analyses of mtDNA sequence data from 476 individuals distributed across 25 populations that encompassed the range of P. decipiens. The combination of these analyses revealed that much of the genetic structure has evolved more recently than suggested by historical biogeography, has been influenced by changes in demography, and can be best explained by long distance dispersal and isolation by distance. We suggest that performing a suite of analyses that focus on different temporal scales may be an effective approach to investigating the patterns and causes of genetic structure within species.

11.
We use a multigene data set (the mitochondrial locus and nine nuclear gene regions) to test phylogenetic relationships in the South American "lava lizards" (genus Microlophus) and describe a strategy for aligning noncoding sequences that accounts for differences in tempo and class of mutational events. We focus on seven nuclear introns that vary in size and frequency of multibase length mutations (i.e., indels) and present a manual alignment strategy that incorporates insertions and deletions (indels) for each intron. Our method is based on mechanistic explanations of intron evolution and does not require a guide tree. We also use a progressive alignment algorithm (Probabilistic Alignment Kit; PRANK) that distinguishes insertions from deletions and avoids the "gapcost" conundrum. We describe an approach to selecting a guide tree purged of ambiguously aligned regions and use this to refine PRANK performance. We show that although manual alignment is successful in finding repeat motifs and the most obvious indels, some regions can only be subjectively aligned, and there are limits to the size and complexity of a data matrix for which this approach can be taken. PRANK alignments identified more parsimony-informative indels while simultaneously increasing nucleotide identity in conserved sequence blocks flanking the indel regions. When comparing manual and PRANK alignments with two widely used methods (CLUSTAL, MUSCLE) for the alignment of the most length-variable intron, only PRANK recovered a tree congruent at deeper nodes with the combined data tree inferred from all nuclear gene regions. We take this concordance as an objective function of alignment quality and present a strongly supported phylogenetic hypothesis for Microlophus relationships.
From this hypothesis we show that (1) a coded indel data partition derived from the PRANK alignment contributed significantly to nodal support and (2) the indel data set permitted detection of significant conflict between mitochondrial and nuclear data partitions, which we hypothesize arose from secondary contact of distantly related taxa, followed by hybridization and mtDNA introgression.

12.
Density of taxon sampling and number/kind of characters are central to achieving the ultimate goals in phylogenetic reconstruction: tree robustness and improved accuracy. In molecular phylogenetics, DNA sequence repositories such as GenBank are potential sources for expanding datasets in two dimensions, taxa and characters, to the level of “supermatrices.” However, the issue of missing characters/genomic regions is generally considered a major impediment to this endeavor. We used here the angiosperm order Caryophyllales to systematically address the impact of missing data when expanding taxon sampling and number of characters in phylogenetic reconstruction. Our analyses show that expansion of taxon sampling by ~13-fold resulted in improved phylogenetic assessment of the Caryophyllales despite up to 38% missing data. Expanding number of characters in the dataset by allowing for up to 100-fold increase in amount of missing data and inclusion of entries with about 40% missing genomic regions did not negatively impact tree structure or robustness, but to the contrary improved both. These results are timely regarding the ongoing efforts to achieve detailed assessment of the tree of life.

13.
Sequence data from the noncoding region separating the plastid genes atpbeta and rbcL were gathered for 27 epacrid taxa, representing all previously recognized infrafamilial groups, and four outgroup taxa (Ericaceae), to address several persistent phylogenetic questions in the group. Parsimony analyses were conducted on these data, as well as on a complementary rbcL sequence dataset assembled from the literature and the combined dataset. The atpbeta-rbcL spacer was notable for the high frequency of insertion-deletion mutations (indels); their distributions were coded as binary characters and included as an adjunct matrix in some of the analyses. The phylogenetic patterns derived from the spacer and rbcL data and the combined analyses, both including and excluding the indel data, concur in resolving seven major lineages corresponding to the tribes of Crayn et al. (1998, Aust. J. Bot. 46, 187-200), viz. Prionoteae, Archerieae, Oligarrheneae, Cosmelieae, Richeeae, Epacrideae, and Styphelieae. The relationships of the tribes and within Styphelieae, however, are not convincingly resolved. Minor conflicts in the positions of some taxa between the spacer and the rbcL trees are poorly supported. Among epacrids, the spacer region provided more cladistically informative characters than rbcL and resulted in trees with lower homoplasy. Further, the spacer data, when analyzed alone and when combined with rbcL, resolved several clades that could not be retrieved on rbcL data alone and provided increased support for many other relationships. The evolution of a putative three-base inversion associated with a hairpin secondary structure in the spacer region is discussed in the light of the inferred phylogeny.
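Coding indel distributions as binary characters in an adjunct matrix, as this abstract describes, could be sketched as follows. This is an illustrative scheme under stated assumptions, not necessarily the one the authors used: each distinct gap interval in the alignment becomes one character, a sequence scores `1` if it has exactly that gap and `0` otherwise, and a sequence whose longer gap spans the interval scores `?` because the character cannot be observed there. The function names and toy data are hypothetical, and leading or trailing gaps are treated like any other gap for simplicity.

```python
def gap_intervals(seq):
    """Return the set of half-open (start, end) intervals of '-' runs."""
    intervals, start = set(), None
    for i, ch in enumerate(seq):
        if ch == "-" and start is None:
            start = i                      # a gap run begins here
        elif ch != "-" and start is not None:
            intervals.add((start, i))      # the gap run ended before i
            start = None
    if start is not None:                  # gap run reaches the sequence end
        intervals.add((start, len(seq)))
    return intervals


def code_indels(alignment):
    """Build a binary presence/absence matrix with one column per gap."""
    per_seq = [gap_intervals(s) for s in alignment]
    characters = sorted(set().union(*per_seq))
    matrix = []
    for gaps in per_seq:
        row = ""
        for s, e in characters:
            if (s, e) in gaps:
                row += "1"                 # this exact indel is present
            elif any(gs <= s and e <= ge for gs, ge in gaps):
                row += "?"                 # hidden inside a longer gap
            else:
                row += "0"                 # indel absent
        matrix.append(row)
    return characters, matrix


toy = ["ACGTACGT", "AC--ACGT", "A----CGT"]  # illustrative data only
chars, rows = code_indels(toy)
print(chars)
print(rows)
```

The resulting rows can be appended to the nucleotide matrix as a separate partition, which makes it easy to run analyses both with and without the indel characters.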

14.
The diamondback moth Plutella xylostella (Linnaeus) (Lepidoptera: Plutellidae) is one of the most destructive insect pests of cruciferous plants worldwide. Biological, ecological and genetic studies have indicated that this moth is migratory in many regions around the world. Although outbreaks of this pest occur annually in China and cause heavy damage, little is known concerning its migration. To better understand its migration pattern, we investigated the population genetic structure and demographic history of the diamondback moth by analyzing 27 geographical populations across China using four mitochondrial genes and nine microsatellite loci. The results showed that high haplotype diversity and low nucleotide diversity occurred in the diamondback moth populations, a finding that is typical for migratory species. No genetic differentiation among all populations and no correlation between genetic and geographical distance were found. However, pairwise analysis of the mitochondrial genes has indicated that populations from the southern region were more differentiated than those from the northern region. Gene flow analysis revealed that the effective number of migrants per generation into populations of the northern region is very high, whereas that into populations of the southern region is quite low. Neutrality testing, mismatch distribution and Bayesian Skyline Plot analyses based on mitochondrial genes all revealed that deviation from Hardy-Weinberg equilibrium and sudden expansion of the effective population size were present in populations from the northern region but not in those from the southern region. In conclusion, all our analyses strongly demonstrated that the diamondback moth migrates within China from the southern to northern regions with rare effective migration in the reverse direction. Our research provides a successful example of using population genetic approaches to resolve the seasonal migration of insects.
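The two summary statistics contrasted in this abstract, haplotype diversity and nucleotide diversity, can be computed with a short sketch. It assumes the standard definitions, h = n/(n-1) · (1 - Σ p_i²) for haplotype diversity and the mean number of pairwise differences per site for nucleotide diversity, applied to aligned sequences of equal length. The function names and toy sequences are hypothetical and do not come from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / (len(pairs) * len(seqs[0]))


toy = ["AAAA", "AAAT", "AAAA", "AATA"]  # illustrative data only
print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

Many distinct haplotypes that differ at only a few sites give a high h together with a low pi, which is the pattern the abstract reports.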

15.
Applying microsatellite DNA markers in population genetic studies of the pest moth Helicoverpa armigera is subject to numerous technical problems, such as the high frequency of null alleles, occurrence of size homoplasy, presence of multiple copies of flanking sequence in the genome and the lack of PCR amplification robustness between populations. To overcome these difficulties, we developed exon-primed intron-crossing (EPIC) nuclear DNA markers for H. armigera based on ribosomal protein (Rp) and the Dopa Decarboxylase (DDC) genes and sequenced alleles showing length polymorphisms. Allele length polymorphisms were usually from random indels (insertions or deletions) within introns, although variation of short dinucleotide DNA repeat units was also detected. Mapping crosses demonstrated Mendelian inheritance patterns for these EPIC markers and the absence of both null alleles and allele 'dropouts'. Three examples of allele size homoplasies due to indels were detected in EPIC markers RpL3, RpS6 and DDC, while sequencing of multiple individuals across 11 randomly selected alleles did not detect indel size homoplasies. The robustness of the EPIC-PCR markers was demonstrated by PCR amplification in the related species, H. zea, H. assulta and H. punctigera.

16.
Insertions and deletions (indels) result in sequences of various lengths when homologous gene regions are compared among individuals or species. Although indels are typically phylogenetically informative, occurrence and incorporation of these characters as gaps in intraspecific population genetic data sets are rarely discussed. Moreover, the impact of gaps on estimates of fixation indices, such as F(ST), has not been reviewed. Here, I summarize the occurrence and population genetic signal of indels among 60 published studies that involved alignments of multiple sequences from the mitochondrial DNA (mtDNA) control region of vertebrate taxa. Among 30 studies observing indels, an average of 12% of both variable and parsimony-informative sites were composed of these sites. There was no consistent trend between levels of population differentiation and the number of gap characters in a data block. Across all studies, the average influence on estimates of PhiST was small, explaining only an additional 1.8% of among population variance (range 0.0-8.0%). Studies most likely to observe an increase in PhiST with the inclusion of gap characters were those with < 20 variable sites, but a near equal number of studies with few variable sites did not show an increase. In contrast to studies at interspecific levels, the influence of indels for intraspecific population genetic analyses of control region DNA appears small, dependent upon total number of variable sites in the data block, and related to species-specific characteristics and the spatial distribution of mtDNA lineages that contain indels.

17.
Since the 1920s, population geneticists have had measures that describe how genetic variation is distributed spatially within a species' geographical range. Modern genetic survey techniques frequently yield information on the evolutionary relationships among the alleles or haplotypes as well as information on allele frequencies and their spatial distributions. This evolutionary information is often expressed in the form of an estimated haplotype or allele tree. Traditional statistics of population structure, such as F statistics, do not make use of evolutionary genealogical information, so it is necessary to develop new statistical estimators and tests that explicitly incorporate information from the haplotype tree. One such technique is to use the haplotype tree to define a nested series of branches (clades), thereby allowing an evolutionary nested analysis of the spatial distribution of genetic variation. Such a nested analysis can be performed regarding the geographical sampling locations either as categorical or continuous variables (i.e. some measure of spatial distance). It is shown that such nested phylogeographical analyses have more power to detect geographical associations than traditional, nonhistorical analyses and, as a consequence, allow a broader range of gene-flow parameters to be estimated in a precise fashion. More importantly, such nested analyses can discriminate between phylogeographical associations due to recurrent but restricted gene flow vs. historical events operating at the population level (e.g. past fragmentation, colonization, or range expansion events). Restricted gene flow and historical events can be intertwined, and the cladistic analyses can reconstruct their temporal juxtapositions, thereby yielding great insight into both the evolutionary history and population structure of the species. Examples are given that illustrate these properties, concentrating on the detection of range expansion events.

18.
L. Excoffier, P. E. Smouse. Genetics, 1994, 136(1): 343-359
We formalize the use of allele frequency and geographic information for the construction of gene trees at the intraspecific level and extend the concept of evolutionary parsimony to molecular variance parsimony. The central principle is to consider a particular gene tree as a variable to be optimized in the estimation of a given population statistic. We propose three population statistics that are related to variance components and that are explicit functions of phylogenetic information. The methodology is applied in the context of minimum spanning trees (MSTs) and human mitochondrial DNA restriction data, but could be extended to accommodate other tree-making procedures, as well as other data types. We pursue optimal trees by heuristic optimization over a search space of more than 1.29 billion MSTs. This very large number of equally parsimonious trees underlines the lack of resolution of conventional parsimony procedures. This lack of resolution is highlighted by the observation that equally parsimonious trees yield very different estimates of population genetic diversity and genetic structure, as shown by null distributions of the population statistics, obtained by evaluation of 10,000 random MSTs. We propose a non-parametric test for the similarity between any two trees, based on the distribution of a weighted coevolutionary correlation. The ability to test for tree relatedness leads to the definition of a class of solutions instead of a single solution. Members of the class share virtually all of the critical internal structure of the tree but differ in the placement of singleton branch tips.
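As background to the minimum spanning trees discussed in this abstract, the sketch below builds one MST over a set of haplotypes using Prim's algorithm with Hamming distance between equal-length strings. It is a generic illustration, not the authors' molecular variance parsimony procedure; the function names and toy haplotypes are hypothetical. Ties between equally short edges are broken by index order, so the function returns only one of the possibly many equally parsimonious MSTs, which is exactly the ambiguity the abstract highlights.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(haplotypes):
    """Prim's algorithm; returns a list of (i, j, distance) edges."""
    n = len(haplotypes)
    in_tree = {0}                          # start from the first haplotype
    edges = []
    while len(in_tree) < n:
        # Shortest edge joining a tree node i to a node j outside the tree.
        d, i, j = min(
            (hamming(haplotypes[i], haplotypes[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j, d))
        in_tree.add(j)
    return edges


toy = ["AAAA", "AAAT", "AATT", "TTTT"]  # illustrative data only
tree = minimum_spanning_tree(toy)
print(tree, sum(d for _, _, d in tree))
```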

19.
Indels in DNA sequences frequently affect more than a single nucleotide, creating problems for alignment, character coding and phylogenetic analysis. However, the size and frequency of multiple‐residue indels are not usually tested, and with popular alignment packages their reconstruction is indirectly achieved by reducing the affine (gap extension) cost. We explored the length distribution of indels in intron sequences of the gene Mp20 by modifying the gap opening and gap extension costs. Given a “known” tree for the study group, global homology levels were greatest under low gap cost, with gap extension costs of roughly 0.4‐fold the opening cost. Different approaches to gap coding and weighting suggested that taxonomic congruence was correlated with high frequencies of multiple‐position indels, with a maximum indel length of 2–5 bp and few indels above 15 bp, but also including a proportion of indels > 100 bp. Only a small minority of indels could be reconstructed as single‐position indels. Consequently, tree topologies improved when homologous multinucleotide indels were recoded as binary characters which are otherwise highly homoplastic and weighted characters in single‐position coding. In tree‐generating alignment procedures as implemented in POY, where gap penalty determines the character weight during tree search, the problem of assigning inappropriately high weight to multiple‐residue indels could partly be overcome by setting the extension costs to about 0.4‐fold lower than gap opening costs. We conclude that multiple consecutive gap positions are not independent characters and hence methods for parsimony reconstruction of long indels are required. Finally, we also observed a general lack of correlation between taxonomic and character congruence, demonstrating the difficulties of applying congruence criteria to decide among competing alignments.
This highlights the value of recent model‐based alignment procedures which can implement the statistical distributions of indel size classes, and do not rely on potentially circular strategies for optimizing overall congruence. © The Willi Hennig Society 2006.
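The effect of an affine gap cost with an extension cost of roughly 0.4-fold the opening cost can be made concrete with a small sketch. The cost convention used here is an assumption, since alignment packages differ: the first gap position is charged the opening cost and every further position is charged the extension cost. The function names and default values are hypothetical.

```python
def affine_gap_cost(length, gap_open=1.0, extension_ratio=0.4):
    """Cost of one indel spanning `length` consecutive positions.

    Assumed convention: the first position costs `gap_open`; each
    additional position costs `extension_ratio * gap_open`.
    """
    if length < 1:
        return 0.0
    return gap_open + (length - 1) * extension_ratio * gap_open


def linear_gap_cost(length, gap_open=1.0):
    """Cost when every gap position is an independent character."""
    return length * gap_open


# Compare the weight given to indels of increasing length.
for length in (1, 5, 15, 100):
    print(length, affine_gap_cost(length), linear_gap_cost(length))
```

Under the linear scheme the weight of a long indel grows at the full opening cost per position, whereas the affine scheme discounts each extra position, which reflects the abstract's point that consecutive gap positions are not independent characters.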

20.
Subtle morphological differences can be essential to diagnosing closely related species, and an understanding of the genetic basis of these characters can contribute to understanding their divergences. We used voucher specimens from previous genetic analyses of population structure to subsequently analyse genome-wide associations linking morphology to genetic variation in spruce budworms, a group of economically important and morphologically similar forest pests. In particular, we assessed the taxonomic value and genetic architecture of two morphological traits (wing pattern and genitalic spicule abundance) that have been reported to differ among spruce budworm species. Our results suggest that phallic spicule number has greater taxonomic utility than wing pattern for distinguishing Choristoneura fumiferana (Clemens) from Choristoneura occidentalis occidentalis Freeman and Choristoneura occidentalis biennis Freeman. However, there was considerable overlap among taxa for all phenotypic characters analysed. In a genome-wide association study, wing pattern variation was significantly associated with four single nucleotide polymorphism (SNP) loci, including two adjacent SNPs. One SNP was flanked by sequence resembling RNA-directed DNA polymerase from mobile element jockey-like. This locus is a promising candidate for the study of wing pattern development in spruce budworms, as jockey-like transposable elements and polymerases have documented roles in wing spot production in other Lepidoptera. Our study links classical taxonomic characters and genomic data to provide insights into the potential genetic architecture of species differences. It also demonstrates previously untapped morphological and taxonomic value in voucher specimens from earlier molecular genetic analyses.
