Similar articles
20 similar articles found (search time: 31 ms)
1.
Stephens and Donnelly have introduced a simple yet powerful importance sampling scheme for computing the likelihood in population genetic models. Fundamental to the method is an approximation to the conditional probability of the allelic type of an additional gene, given those currently in the sample. As noted by Li and Stephens, the product of these conditional probabilities for a sequence of draws that gives the frequency of allelic types in a sample is an approximation to the likelihood, and can be used directly in inference. The aim of this note is to demonstrate the high level of accuracy of the "product of approximate conditionals" (PAC) likelihood when used with microsatellite data. Results obtained on simulated microsatellite data show that this strategy leads to a negligible bias over a wide range of the scaled mutation parameter theta. Furthermore, the sampling variance of likelihood estimates and the computation time are both lower than those obtained with importance sampling over the whole range of theta. It follows that this approach represents an efficient substitute for importance sampling (IS) algorithms in computer-intensive (e.g. MCMC) inference methods in population genetics.
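
As a hedged illustration of the "product of approximate conditionals" idea described in this abstract, the approximation can be written as follows. The notation is an assumption introduced here, not taken from the abstract: $h_1, \dots, h_n$ are the sampled genes taken in some order, and $\hat{\pi}$ is the approximate conditional probability of the next allelic type given those already drawn.

```latex
% Sketch only: h_k, \hat{\pi} and n are notation assumed for illustration.
L_{\mathrm{PAC}}(\theta)
  \;=\; \hat{\pi}(h_1 \mid \theta)\,
        \prod_{k=1}^{n-1} \hat{\pi}\bigl(h_{k+1} \mid h_1, \dots, h_k;\, \theta\bigr)
```

Because $\hat{\pi}$ is only an approximation, the value of this product generally depends on the order in which the genes are taken, so in practice it is usually averaged over several random orderings.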

2.
T H Lam, M Shen, J-M Chia, S H Chan, E C Ren. Heredity, 2013, 111(2): 131-138
Genetic rearrangement by recombination is one of the major driving forces for genome evolution, and recombination is known to occur in non-random, discrete recombination sites within the genome. Mapping of recombination sites has proved to be difficult, particularly in the human MHC region, which is complicated by both population variation and highly polymorphic HLA genes. To overcome these problems, HLA-typed individuals from three representative populations (Asian, European and African) were used to generate phased HLA haplotypes. Extended haplotype homozygosity (EHH) plots constructed from the phased haplotype data revealed discrete EHH drops corresponding to recombination events, and these signatures were observed to be different for each population. Surprisingly, the majority of recombination sites detected are unique to each population, rather than being common. Unique recombination sites account for 56.8% (21/37 of total sites) in the Asian cohort, 50.0% (15/30 sites) in Europeans and 63.2% (24/38 sites) in Africans. Validation carried out at a known sperm typing recombination site of 45 kb (HLA-F-telomeric) showed that EHH was an efficient method to narrow the recombination region to 826 bp, and this was further refined to 660 bp by resequencing. This approach significantly enhanced mapping of the genomic architecture within the human MHC, and will be useful in studies to identify disease risk genes.
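
To make the EHH statistic mentioned in this abstract more concrete, here is a minimal Python sketch. It assumes the common definition of EHH as the probability that two randomly chosen chromosomes are identical across a marker interval; the function name `ehh` and the string encoding of haplotypes are hypothetical, and this is not the authors' pipeline. The caller is assumed to pass only chromosomes that already carry the core haplotype of interest.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, end):
    """EHH across the marker interval between index `core` and index `end`.

    `haplotypes` is a list of equal-length sequences (e.g. strings of '0'/'1'),
    assumed to be pre-filtered to chromosomes carrying the same core haplotype.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    lo, hi = min(core, end), max(core, end)
    # Group chromosomes that are identical over the whole interval.
    groups = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    # Fraction of chromosome pairs that fall into the same group.
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


if __name__ == "__main__":
    haps = ["0011", "0011", "0010", "0110"]
    for end in range(4):
        print(end, ehh(haps, 0, end))
```

Plotting this value against increasing distance from the core gives an EHH decay curve; a sharp drop between two adjacent markers is the kind of signature the abstract associates with a recombination site.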

3.
Allele frequency differences across populations can provide valuable information both for studying population structure and for identifying loci that have been targets of natural selection. Here, we examine the relationship between recombination rate and population differentiation in humans by analyzing two uniformly-ascertained, whole-genome data sets. We find that population differentiation as assessed by inter-continental F_ST shows negative correlation with recombination rate, with F_ST reduced by 10% in the tenth of the genome with the highest recombination rate compared with the tenth of the genome with the lowest recombination rate (P ≪ 10^−12). This pattern cannot be explained by the mutagenic properties of recombination and instead must reflect the impact of selection in the last 100,000 years since human continental populations split. The correlation between recombination rate and F_ST has a qualitatively different relationship for F_ST between African and non-African populations and for F_ST between European and East Asian populations, suggesting varying levels or types of selection in different epochs of human history.

4.
We describe an efficient NMR triple resonance approach that correlates, at high resolution, protein side-chain and backbone resonances. It relies on the combination of two strategies: joint evolution of aliphatic side-chain proton/carbon coherences using a backbone N–H based HCcoNH reduced dimensionality (RD) experiment and non-uniform sampling (NUS) in two indirect dimensions. A typical data set containing such correlation information can be acquired in 2 days, at very high resolution unfeasible for conventional 4D HCcoNH-TOCSY experiments. The resonances of the aliphatic side-chain protons are unambiguously assigned to their attached carbons through the analysis of the ‘sum’ and ‘difference’ spectra. This approach circumvents the tedious process of manual resonance assignments using HCcH-TOCSY data, while providing additional resolving power of backbone N–H signals. A simple peak-list based algorithm has been implemented in the IBIS software for rapid automated backbone and side-chain assignments.

5.
Release from parasites, pathogens or predators (i.e. enemies) is a widely cited ‘rule of thumb’ to explain the proliferation of nonindigenous species in their introduced regions (i.e. the ‘enemy release hypothesis’, or ERH). Indeed, profound effects of some parasites and predators on host populations are well documented. However, some support for the ERH comes from studies that find a reduction in the species richness of enemies in the introduced range, relative to the native range, of particular hosts. For example, data on helminth parasites of the European starling in both its native Eurasia and in North America support a reduction of parasites in the latter. However, North American ‘founder’ starlings were likely not chosen randomly from across Eurasia. This could result in an overestimation of enemy release since enemies affect their hosts on a population level. We control for the effects of subsampling colonists and find, contrary to previous reports, no evidence that introduced populations of starlings experienced a reduction in the species richness of helminth parasites after colonization of North America. These results highlight the importance of choosing appropriate contrast groups in biogeographical analyses of biological invasions to minimize the confounding effects of ‘propagule biases’.

6.
DNA sequencing technologies provide unprecedented opportunities to analyze within-host evolution of microorganism populations. Often, within-host populations are analyzed via pooled sequencing of the population, which contains multiple individuals or “haplotypes.” However, current next-generation sequencing instruments, in conjunction with single-molecule barcoded linked-reads, cannot distinguish long haplotypes directly. Computational reconstruction of haplotypes from pooled sequencing has been attempted in virology, bacterial genomics, metagenomics, and human genetics, using algorithms based on either cross-host genetic sharing or within-host genomic reads. Here, we describe PoolHapX, a flexible computational approach that integrates information from both genetic sharing and genomic sequencing. We demonstrated that PoolHapX outperforms state-of-the-art tools tailored to specific organismal systems, and is robust to within-host evolution. Importantly, together with barcoded linked-reads, PoolHapX can infer whole-chromosome-scale haplotypes from 50 pools each containing 12 different haplotypes. By analyzing real data, we uncovered dynamic variations in the evolutionary processes of within-patient HIV populations previously unobserved in single position-based analysis.

7.
We analyze recombination in C. jejuni using MLST data from isolates taken from wild birds, cattle, wild rabbits, and water in a 100-km² study region in Cheshire, UK. We use a recent approximate likelihood method for inference, based on combining likelihood information from all pairs of segregating (polymorphic) sites in the data. We find substantial evidence for recombination, but only for recombination with short tract lengths, of around 225–750 bp. We estimate that the rate of recombination is of a similar magnitude to the rate of mutation. [Reviewing Editor: Dr. Magnus Nordborg]

8.
Li H, Stephan W. PLoS Genetics, 2006, 2(10): e166
An important goal of population genetics is to determine the forces that have shaped the pattern of genetic variation in natural populations. We developed a maximum likelihood method that allows us to infer demographic changes and detect recent positive selection (selective sweeps) in populations of varying size from DNA polymorphism data. Applying this approach to single nucleotide polymorphism data at more than 250 noncoding loci on the X chromosome of Drosophila melanogaster from an (ancestral) African population and a (derived) European population, we found that the African population expanded about 60,000 y ago and that the European population split off from the African lineage about 15,800 y ago, thereby suffering a severe population size bottleneck. We estimated that about 160 beneficial mutations (with selection coefficients s between 0.05% and 0.5%) were fixed in the euchromatic portion of the X in the African population since population size expansion, and about 60 mutations (with s around 0.5%) in the diverging European lineage.

9.
Archeogenetics has been revolutionary, revealing insights into demographic history and recent positive selection. However, most studies to date have ignored the nonrandom association of genetic variants at different loci (i.e. linkage disequilibrium). This may be in part because basic properties of linkage disequilibrium in samples from different times are still not well understood. Here, we derive several results for summary statistics of haplotypic variation under a model with time-stratified sampling: (1) The correlation between the number of pairwise differences observed between time-staggered samples (πΔt) in models with and without strict population continuity; (2) The product of the linkage disequilibrium coefficient, D, between ancient and modern samples, which is a measure of haplotypic similarity between modern and ancient samples; and (3) The expected switch rate in the Li and Stephens haplotype copying model. The latter has implications for genotype imputation and phasing in ancient samples with modern reference panels. Overall, these results provide a characterization of how haplotype patterns are affected by sample age, recombination rates, and population sizes. We expect these results will help guide the interpretation and analysis of haplotype data from ancient and modern samples.

10.
Using sequence data from seven nuclear loci in 385 isolates of the haploid, plant parasitic, ascomycete fungus, Sclerotinia, divergence times of populations and of species were distinguished. The evolutionary history of haplotypes on both population and species scales was reconstructed using a combination of parsimony, maximum likelihood and coalescent methods, implemented in a specific order. Analysis of site compatibility revealed recombination blocks from which alternative (marginal) networks were inferred, reducing uncertainty in the network due to recombination. Our own modifications of Templeton and co-workers' cladistic inference method and a coalescent approach detected the same phylogeographic processes. Assuming neutrality and a molecular clock, the boundary between divergent populations and species is an interval of time between coalescence (to a common ancestor) of populations and coalescence of species.

11.
The patterns of genetic variation within and among individuals and populations can be used to make inferences about the evolutionary forces that generated those patterns. Numerous population genetic approaches have been developed in order to infer evolutionary history. Here, we present the “Two-Two (TT)” and the “Two-Two-outgroup (TTo)” methods, two closely related approaches for estimating divergence time based on coalescent theory. They rely on sequence data from two haploid genomes (or a single diploid individual) from each of two populations. Under a simple population-divergence model, we derive the probabilities of the possible sample configurations. These probabilities form a set of equations that can be solved to obtain estimates of the model parameters, including population split times, directly from the sequence data. This transparent and computationally efficient approach to inferring population divergence time makes it possible to estimate time scaled in generations (assuming a mutation rate), and not as a compound parameter of genetic drift. Using simulations under a range of demographic scenarios, we show that the method is relatively robust to migration and that the TTo method can alleviate biases that can appear from drastic ancestral population size changes. We illustrate the utility of the approaches with some examples, including estimating split times for pairs of human populations as well as providing further evidence for the complex relationship among Neandertals and Denisovans and their ancestors.

12.
Both biological populations and fault tolerant evolvable hardware systems need to respond rapidly to changes in their dynamic environmental niche. Such changes can be caused by a disturbance event or fault occurring. Here I examine evolutionary algorithms, based on eukaryote sexual selection, which allow different levels of recombination of ‘genes’. The differences in recombination are based on ‘genes’ related to the optimisation process being either linked on a single ‘chromosome’ or being present on separate ‘chromosomes’. When genes are present on separate chromosomes the initial rate of evolution of a randomly generated population is faster than if the genes are linked on the same chromosome. However, when the optimisation problem is changed during the optimisation period, indicating a disturbance or fault occurring, the initial fitness of the linked population is higher and the rate of optimisation immediately after the disturbance is more rapid than for the non-linked populations. The genotypic and phenotypic diversity of the linked populations are also significantly higher immediately prior to the disturbance event. I propose this diversity provides the necessary variation to allow more rapid evolution following a disturbance. The results demonstrate the importance of population diversity in response to change, supporting theory from conservation biology.

13.
Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally ‘unrelated’ individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of IBD sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First, SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors and the detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.
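
The switch error rate used in this abstract to compare phasing methods can be sketched in Python as below. This is a minimal illustration under stated assumptions, not SHAPEIT2's or duoHMM's own evaluation code: the function name `switch_error_rate` and the string encoding are hypothetical, and the inferred haplotypes are assumed to be consistent with the true genotype at every site.

```python
def switch_error_rate(true_pair, inferred_pair):
    """Fraction of consecutive heterozygous sites whose relative phase is wrong.

    Each argument is a pair of equal-length haplotype sequences for one
    individual. Only sites where the two true haplotypes differ carry
    phase information, so only those sites are examined.
    """
    t1, t2 = true_pair
    i1, _ = inferred_pair
    het_sites = [k for k in range(len(t1)) if t1[k] != t2[k]]
    if len(het_sites) < 2:
        return 0.0  # no pair of heterozygous sites, so no phase to get wrong
    # At each heterozygous site, does inferred haplotype 1 follow true haplotype 1?
    follows_t1 = [i1[k] == t1[k] for k in het_sites]
    # A switch occurs whenever that answer changes between neighbouring sites.
    switches = sum(a != b for a, b in zip(follows_t1, follows_t1[1:]))
    return switches / (len(het_sites) - 1)


if __name__ == "__main__":
    truth = ("0101", "1010")
    print(switch_error_rate(truth, ("0110", "1001")))
```

Swapping the two inferred haplotypes as a whole does not count as an error under this definition, because only changes in phase between neighbouring heterozygous sites are tallied.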

14.
Haplotyping as perfect phylogeny: a direct approach. (Cited 4 times in total: 0 self-citations, 4 citations by others)
A full haplotype map of the human genome will prove extremely valuable as it will be used in large-scale screens of populations to associate specific haplotypes with specific complex genetic-influenced diseases. A haplotype map project has been announced by NIH. The biological key to that project is the surprising fact that some human genomic DNA can be partitioned into long blocks where genetic recombination has been rare, leading to strikingly fewer distinct haplotypes in the population than previously expected (Helmuth, 2001; Daly et al., 2001; Stephens et al., 2001; Friss et al., 2001). In this paper we explore the algorithmic implications of the no-recombination in long blocks observation, for the problem of inferring haplotypes in populations. This assumption, together with the standard population-genetic assumption of infinite sites, motivates a model of haplotype evolution where the haplotypes in a population are assumed to evolve along a coalescent, which as a rooted tree is a perfect phylogeny. We consider the following algorithmic problem, called the perfect phylogeny haplotyping problem (PPH), which was introduced by Gusfield (2002) - given n genotypes of length m each, does there exist a set of at most 2n haplotypes such that each genotype is generated by a pair of haplotypes from this set, and such that this set can be derived on a perfect phylogeny? The approach taken by Gusfield (2002) to solve this problem reduces it to established, deep results and algorithms from matroid and graph theory. Although that reduction is quite simple and the resulting algorithm nearly optimal in speed, taken as a whole that approach is quite involved, and in particular, challenging to program. Moreover, anyone wishing to fully establish, by reading existing literature, the correctness of the entire algorithm would need to read several deep and difficult papers in graph and matroid theory.
However, as stated by Gusfield (2002), many simplifications are possible and the list of "future work" in Gusfield (2002) began with the task of developing a simpler, more direct, yet still efficient algorithm. This paper accomplishes that goal, for both the rooted and unrooted PPH problems. It establishes a simple, easy-to-program, O(nm^2)-time algorithm that determines whether there is a PPH solution for input genotypes and produces a linear-space data structure to represent all of the solutions. The approach allows complete, self-contained proofs. In addition to algorithmic simplicity, the approach here makes the representation of all solutions more intuitive than in Gusfield (2002), and solves another goal from that paper, namely, to prove a nontrivial upper bound on the number of PPH solutions, showing that that number is vastly smaller than the number of haplotype solutions (each solution being a set of n pairs of haplotypes that can generate the genotypes) when the perfect phylogeny requirement is not imposed.
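
To illustrate the perfect phylogeny requirement in this abstract, the sketch below checks whether a candidate set of binary haplotypes can be derived on an (unrooted) perfect phylogeny, using the standard four-gamete condition. It is not the paper's PPH algorithm, which works from unphased genotypes; it only verifies a proposed haplotype set. The function name `admits_perfect_phylogeny` and the '0'/'1' string encoding are assumptions made for illustration.

```python
from itertools import combinations


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on binary haplotypes given as equal-length '0'/'1' strings.

    A set of binary haplotypes fits an unrooted perfect phylogeny exactly
    when no pair of sites exhibits all four gametes 00, 01, 10 and 11.
    """
    if not haplotypes:
        return True
    num_sites = len(haplotypes[0])
    for a, b in combinations(range(num_sites), 2):
        # Collect the distinct allele combinations seen at this pair of sites.
        gametes = {(h[a], h[b]) for h in haplotypes}
        if len(gametes) == 4:
            return False  # all four gametes present: sites a and b conflict
    return True


if __name__ == "__main__":
    print(admits_perfect_phylogeny(["000", "010", "110"]))
    print(admits_perfect_phylogeny(["00", "01", "10", "11"]))
```

This check examines every pair of the m sites across the haplotypes, so it also runs in time proportional to nm^2 for n genotypes resolved into 2n haplotypes; the hard part of PPH, which the paper addresses, is choosing the phasing so that the check can succeed.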

15.
Genetic variability in stress tolerance (heat, desiccation, and hypoxia) and fitness (virulence and reproduction potential) among natural populations of Steinernema carpocapsae was assessed by estimating phenotypic differences. Significant differences were observed in stress tolerance among populations. Populations isolated from North Carolina showed significantly more stress tolerance than those isolated from Ohio. Significant differences were also observed in populations isolated from the same locality. Survival of infective juveniles after exposure to 40°C for 2 h ranged from 37 to 82%. A threefold difference was observed in infective juvenile survival following exposure to osmotic desiccation or hypoxic conditions. Several populations tested were superior to the most widely used strain (‘All’ strain) in stress tolerance traits, with one population, KMD33, being superior to the ‘All’ strain in all traits. Fitness as expressed by virulence and reproductive potential differed significantly among populations but showed less variability than the stress tolerance traits. All populations tested had a reproductive potential greater than or similar to that of the ‘All’ strain and most of them caused >60% mortality of wax moth larvae, Galleria mellonella. The high genetic variability in stress tolerance among natural populations suggests the feasibility of using selection for genetic improvement of these traits.

16.
We present a statistical model for patterns of genetic variation in samples of unrelated individuals from natural populations. This model is based on the idea that, over short regions, haplotypes in a population tend to cluster into groups of similar haplotypes. To capture the fact that, because of recombination, this clustering tends to be local in nature, our model allows cluster memberships to change continuously along the chromosome according to a hidden Markov model. This approach is flexible, allowing for both "block-like" patterns of linkage disequilibrium (LD) and gradual decline in LD with distance. The resulting model is also fast and, as a result, is practicable for large data sets (e.g., thousands of individuals typed at hundreds of thousands of markers). We illustrate the utility of the model by applying it to dense single-nucleotide-polymorphism genotype data for the tasks of imputing missing genotypes and estimating haplotypic phase. For imputing missing genotypes, methods based on this model are as accurate as or more accurate than existing methods. For haplotype estimation, the point estimates are slightly less accurate than those from the best existing methods (e.g., for unrelated Centre d'Etude du Polymorphisme Humain individuals from the HapMap project, switch error was 0.055 for our method vs. 0.051 for PHASE) but require a small fraction of the computational cost. In addition, we demonstrate that the model accurately reflects uncertainty in its estimates, in that probabilities computed using the model are approximately well calibrated. The methods described in this article are implemented in a software package, fastPHASE, which is available from the Stephens Lab Web site.

17.
The phenoxy herbicides 2,4-D and dicamba are released daily into the environment in large amounts. The mechanisms of genotoxicity and mutagenicity of these herbicides are poorly understood, and the available genotoxicity data are controversial. There is a cogent need for a novel genotoxicity monitoring system that could both provide reliable information at the molecular level and complement existing systems. We employed the transgenic Arabidopsis thaliana ‘point mutation’ and ‘recombination’ plants to monitor the genetic effects of the herbicides 2,4-D and dicamba. We found that both herbicides had a significant effect on the frequency of homologous recombination and A→G mutation. Neither herbicide affected the T→G mutation frequency. Interestingly, these transgenic biomonitoring plants were able to detect the presence of phenoxy herbicides at concentrations that were lower than the guideline levels for Drinking Water Quality. The results of our studies suggest that our transgenic system may be ideal for the evaluation of the genotoxicity of herbicide-contaminated water. Moreover, the unique ability of the plants to detect both double-strand breaks (homologous recombination) and point mutations provides tremendous potential in the study of molecular mechanisms of genotoxicity and mutagenicity of phenoxy herbicides.

18.
The locus for Friedreich ataxia (FRDA), a severe neurodegenerative disease, is tightly linked to markers D9S5 and D9S15, and analysis of rare recombination events has suggested the order cen–FRDA–D9S5–D9S15–qter. We report here the construction of a YAC contig extending 800 kb centromeric to D9S5 and the isolation of five new microsatellite markers from this region. In order to map these markers with respect to the FRDA locus, all within a 1-cM confidence interval, we sought to increase the genetic information of available FRDA families by considering homozygosity by descent and association with founder haplotypes in isolated populations. This approach allowed us to identify one phase-known recombination and one probable historic recombination on haplotypes from Réunion Island patients, both of which place three of the five markers proximal to FRDA. This represents the first identification of close FRDA flanking markers on the centromeric side. The two other markers allowed us to narrow the breakpoint of a previously identified distal recombination that is >180 kb from D9S5 (26P). Taken together, the results place the FRDA locus in a 450-kb interval, which is small enough for direct search of candidate genes. A detailed rare cutter restriction map and a cosmid contig covering this interval were constructed and should facilitate the search of genes in this region.

19.
Recombination is a fundamental evolutionary force. Therefore the population recombination rate ρ plays an important role in the analysis of population genetic data; however, it is notoriously difficult to estimate. This difficulty applies both to the accuracy of commonly used estimates and to the computational effort required to obtain them. Some particularly popular methods are based on approximations to the likelihood. They require considerably less computational effort than the full-likelihood method with not much less accuracy. Nevertheless, the computation of these approximate estimates can still be very time consuming, in particular when the sample size is large. Although auxiliary quantities for composite likelihood estimates can be computed in advance and stored in tables, these tables need to be recomputed if either the sample size or the mutation rate θ changes. Here we introduce a new method based on regression combined with boosting as a model selection technique. For large samples, it requires much less computational effort than other approximate methods, while providing similar levels of accuracy. Notably, for a sample of hundreds or thousands of individuals, the estimate of ρ using regression can be obtained on a single personal computer within a couple of minutes while other methods may need a couple of days or months (or even years). When the sample size is smaller (n ≤ 50), our new method remains computationally efficient but produces biased estimates. We expect the new estimates to be helpful when analyzing large samples and/or many loci with possibly different mutation rates.

20.
The identification of pure indigenous fish from hybridised populations represents a key issue in fisheries management and conservation biology. In the present study an approach for selection of purebred marble trout (Salmo trutta marmoratus C.) individuals out of admixed populations was set up and assessed. In a first step, baseline data sets of pure marble trout and pure brown trout specimens based on twelve microsatellite loci were used to simulate five consecutive generations of admixture. The baseline and the resulting simulation data sets were then combined with data of a ‘real’ hybridised marble trout population to perform a single individual assignment test as implemented in STRUCTURE. By this procedure the assignment approach was calibrated and it was possible to compare admixture coefficients obtained for individuals from different populations. The ranking of individual admixture coefficients on a plot and comparison with simulated data revealed that the test population was composed of pure marble trout individuals, first generation hybrids between marble trout and brown trout, and hybrid backcross specimens between both groups. However, by defining a critical q-value of 0.1 and additionally integrating individual sequence data of the mtDNA control region, it was possible to identify individuals that could be selected for the establishment of a pure marble trout strain.
