Similar literature
 20 similar articles found (search time: 15 ms)
1.
The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the “width” of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software.
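The haplotype frequency spectrum that this abstract refers to can be illustrated with a minimal sketch. This is not the composite likelihood method itself, only the summary it works from. The function name, the string encoding of haplotypes, and the window convention are assumptions made for illustration.

```python
from collections import Counter


def haplotype_frequency_spectrum(haplotypes, start, end):
    """Frequencies of the distinct haplotypes in the SNP window [start, end).

    `haplotypes` is a list of equal-length strings, one per sampled chromosome
    (a hypothetical encoding). Frequencies are returned in descending order.
    """
    if not haplotypes:
        return []
    # Count each distinct haplotype restricted to the window.
    counts = Counter(h[start:end] for h in haplotypes)
    n = len(haplotypes)
    return sorted((c / n for c in counts.values()), reverse=True)


sample = ["0101", "0101", "0101", "0111", "1000"]
print(haplotype_frequency_spectrum(sample, 0, 4))  # [0.6, 0.2, 0.2]
```

A window dominated by one or a few high-frequency haplotypes, compared with the genome-wide average spectrum, is the kind of distortion a sweep is expected to produce.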

2.
We analyze patterns of genetic variability of populations in the presence of a large seedbank with the help of a new coalescent structure called the seedbank coalescent. This ancestral process appears naturally as a scaling limit of the genealogy of large populations that sustain seedbanks, if the seedbank size and individual dormancy times are of the same order as those of the active population. Mutations appear as Poisson processes on the active lineages and potentially at a reduced rate also on the dormant lineages. The presence of “dormant” lineages leads to qualitatively altered times to the most recent common ancestor and nonclassical patterns of genetic diversity. To illustrate this we provide a Wright–Fisher model with a seedbank component and mutation, motivated by recent models of microbial dormancy, whose genealogy can be described by the seedbank coalescent. Based on our coalescent model, we derive recursions for the expectation and variance of the time to most recent common ancestor, number of segregating sites, pairwise differences, and singletons. Estimates (obtained by simulations) of the distributions of commonly employed distance statistics, in the presence and absence of a seedbank, are compared. The effect of a seedbank on the expected site-frequency spectrum is also investigated using simulations. Our results indicate that the presence of a large seedbank considerably alters the distribution of some distance statistics, as well as the site-frequency spectrum. Thus, one should be able to detect from genetic data the presence of a large seedbank in natural populations.

3.
4.
Clonal populations accumulate mutations over time, resulting in different haplotypes. Deep sequencing of such a population in principle provides information to reconstruct these haplotypes and the frequency at which the haplotypes occur. However, this reconstruction is technically not trivial, especially in clonal systems with a relatively low mutation frequency. The low number of segregating sites in those systems adds ambiguity to the haplotype phasing and thus precludes the reconstruction of genome-wide haplotypes based on sequence overlap information. Therefore, we present EVORhA, a haplotype reconstruction method that complements phasing information in the non-empty read overlap with the frequency estimations of inferred local haplotypes. As was shown with simulated data, as soon as read lengths and/or mutation rates become restrictive for state-of-the-art methods, the use of this additional frequency information allows EVORhA to still reliably reconstruct genome-wide haplotypes. On real data, we show the applicability of the method in reconstructing the population composition of evolved bacterial populations and in decomposing mixed bacterial infections from clinical samples.

5.
In order to improve the h-index in terms of its accuracy and sensitivity to the form of the citation distribution, we propose the new bibliometric index . The basic idea is to define, for any author with a given number of citations, an “ideal” citation distribution which represents a benchmark in terms of number of papers and number of citations per publication, and to obtain an index which increases its value when the real citation distribution approaches its ideal form. The method is very general because the ideal distribution can be defined differently according to the main objective of the index. In this paper we propose to define it by a “squared-form” distribution: this is consistent with many popular bibliometric indices, which reach their maximum value when the distribution is basically a “square”. This approach generally rewards the more regular and reliable researchers, and it seems to be especially suitable for dealing with common situations such as applications for academic positions. To show the advantages of the -index some mathematical properties are proved and an application to real data is proposed.  相似文献   

6.
De A, Durrett R. Genetics, 2007, 176(2): 969-981
The symmetric island model with D demes and equal migration rates is often chosen for the investigation of the consequences of population subdivision. Here we show that a stepping-stone model has a more pronounced effect on the genealogy of a sample. For samples from a small geographical region commonly used in genetic studies of humans and Drosophila, there is a shift of the frequency spectrum that decreases the number of low-frequency-derived alleles and skews the distribution of statistics of Tajima, Fu and Li, and Fay and Wu. Stepping-stone spatial structure also changes the two-locus sampling distribution and increases both linkage disequilibrium and the probability that two sites are perfectly correlated. This may cause a false prediction of cold spots of recombination and may confuse haplotype tests that compute probabilities on the basis of a homogeneously mixing population.

7.
The frequency distribution of pairwise differences between sequences of mtDNA has recently been used to estimate the size of human populations before and after a hypothetical episode of rapid population growth and the time at which the population grew. To test the internal consistency of this method, we used three different sets of human mtDNA data and the corresponding demographic parameters estimated from the distribution of pairwise differences to determine by simulation the expected number of segregating sites, S, and its empirical distribution. The results indicate that the observed values of S are significantly lower than expected in two of three cases under the assumption of the infinite-sites model. Further simulations in which mutations were allowed to occur more than once at the same site and in which there was variation in mutation rate among sites show that the expected number of segregating sites can be much lower than under the infinite-sites assumption. Nevertheless, the observed value of S is still significantly different from the value expected under the expansion hypothesis in two of three cases.
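The two summaries compared in this abstract, the distribution of pairwise differences (the mismatch distribution) and the number of segregating sites S, can be computed from an alignment as in the sketch below. The function names and the toy alignment are illustrative assumptions, and the sequences are assumed to be aligned and of equal length.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_distribution(seqs):
    """Counter mapping 'number of differences' to 'number of sequence pairs'."""
    return Counter(pairwise_differences(a, b) for a, b in combinations(seqs, 2))


def segregating_sites(seqs):
    """Number of alignment columns that show more than one state (S)."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


toy = ["AAAA", "AAAT", "AATT", "TATT"]  # hypothetical alignment
print(sorted(mismatch_distribution(toy).items()))  # [(1, 3), (2, 2), (3, 1)]
print(segregating_sites(toy))  # 3
```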

8.
9.
In experiments with many statistical tests there is a need to balance type I and type II error rates while taking multiplicity into account. In the traditional approach, the nominal α-level, such as 0.05, is adjusted by the number of tests, i.e., 0.05 is divided by that number. Assuming that some proportion of tests represent “true signals”, that is, originate from a scenario where the null hypothesis is false, power depends on the number of true signals and the respective distribution of effect sizes. One way to define power is for it to be the probability of making at least one correct rejection at the assumed α-level. We advocate an alternative way of establishing how “well-powered” a study is. In our approach, useful for studies with multiple tests, the ranking probability is controlled, defined as the probability of making at least a given number of correct rejections while rejecting a given number of hypotheses with the smallest P-values. The two approaches are statistically related. Probability that the smallest P-value is a true signal (i.e., ) is equal to the power at the level , to an excellent approximation. Ranking probabilities are also related to the false discovery rate and to the Bayesian posterior probability of the null hypothesis. We study properties of our approach when the effect size distribution is replaced for convenience by a single “typical” value taken to be the mean of the underlying distribution. We conclude that its performance is often satisfactory under this simplification; however, substantial imprecision is to be expected when is very large and is small. Precision is largely restored when three values with the respective abundances are used instead of a single typical effect size value.
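The traditional definition of power described in this abstract, the probability of at least one correct rejection at the multiplicity-adjusted level, can be sketched as follows. The sketch rests on assumptions that are not in the abstract: the tests are independent, each is a one-sided z-test, and every true signal shares a single "typical" effect size expressed as the mean of the test statistic. All function names are hypothetical, and the ranking probability itself is not implemented here.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def bonferroni_level(alpha, n_tests):
    """Per-test significance level after dividing alpha by the number of tests."""
    return alpha / n_tests


def per_test_power(effect, level):
    """Power of a one-sided z-test whose statistic has mean `effect`, sd 1."""
    z_crit = STD_NORMAL.inv_cdf(1 - level)
    return 1 - STD_NORMAL.cdf(z_crit - effect)


def power_at_least_one(effect, level, n_true):
    """P(at least one of n_true independent true signals is rejected)."""
    return 1 - (1 - per_test_power(effect, level)) ** n_true


level = bonferroni_level(0.05, 1000)
print(power_at_least_one(effect=3.0, level=level, n_true=10))
```

Under these assumptions the power grows with the number of true signals, which is one reason a study can look well-powered by this definition while any individual signal remains unlikely to be detected.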

10.
Previously available primer sets for detecting anaerobic ammonium-oxidizing (anammox) bacteria are inefficient, resulting in a very limited database of such sequences, which limits knowledge of their ecology. To overcome this limitation, we designed a new primer set that was 100% specific in the recovery of ~700-bp 16S rRNA gene sequences with >96% homology to the “Candidatus Scalindua” group of anammox bacteria, and we detected this group at all sites studied, including a variety of freshwater and marine sediments and permafrost soil. A second primer set was designed that exhibited greater efficiency than previous primers in recovering full-length (1,380-bp) sequences related to “Ca. Scalindua,” “Candidatus Brocadia,” and “Candidatus Kuenenia.” This study provides evidence for the widespread distribution of anammox bacteria in that it detected closely related anammox 16S rRNA gene sequences in 11 geographically and biogeochemically diverse freshwater and marine sediments.

11.
The impact of land use intensity on the diversity of arbuscular mycorrhizal fungi (AMF) was investigated at eight sites in the “three-country corner” of France, Germany, and Switzerland. Three sites were low-input, species-rich grasslands. Two sites represented low- to moderate-input farming with a 7-year crop rotation, and three sites represented high-input continuous maize monocropping. Representative soil samples were taken, and the AMF spores present were morphologically identified and counted. The same soil samples also served as inocula for “AMF trap cultures” with Plantago lanceolata, Trifolium pratense, and Lolium perenne. These trap cultures were established in pots in a greenhouse, and AMF root colonization and spore formation were monitored over 8 months. For the field samples, the numbers of AMF spores and species were highest in the grasslands, lower in the low- and moderate-input arable lands, and lowest in the lands with intensive continuous maize monocropping. Some AMF species occurred at all sites (“generalists”); most of them were prevalent in the intensively managed arable lands. Many other species, particularly those forming sporocarps, appeared to be specialists for grasslands. Only a few species were specialized on the arable lands with crop rotation, and only one species was restricted to the high-input maize sites. In the trap culture experiment, the rate of root colonization by AMF was highest with inocula from the permanent grasslands and lowest with those from the high-input monocropping sites. In contrast, AMF spore formation was slowest with the former inocula and fastest with the latter inocula. In conclusion, the increased land use intensity was correlated with a decrease in AMF species richness and with a preferential selection of species that colonized roots slowly but formed spores rapidly.

12.
DNA barcoding with the mitochondrial COI gene reveals distinct haplotype subgroups within the monophyletic and parthenogenetic nematode species, Mesocriconema xenoplax. Biological attributes of these haplotype groups (HG) have not been explored. An analysis of M. xenoplax from 40 North American sites representing both native plant communities and agroecosystems was conducted to identify possible subgroup associations with ecological, physiological, or geographic factors. A dataset of 132 M. xenoplax specimens was used to generate sequences of a 712 bp region of the cytochrome oxidase subunit I gene. Maximum-likelihood and Bayesian phylogenies recognized seven COI HG (≥99/0.99 bootstrap value/posterior probability). Species delimitation metrics largely supported the genetic integrity of the HG. Discriminant function analysis of HG morphological traits identified stylet length, total body length, and stylet knob width as the strongest distinguishing features among the seven groups, with stylet length as the strongest single distinguishing morphological feature. Multivariate analysis identified land cover, ecoregion, and maximum temperature as predictors of 53.6% of the total variation (P = 0.001). Within land cover, HG categorized under “herbaceous,” “woody wetlands,” and “deciduous forest” were distinct in DAPC and RDA analyses and were significantly different (analysis of molecular variance P = 0.001). These results provide empirical evidence for molecular, morphological, and ecological differentiation associated with HG within the monophyletic clade that represents the species Mesocriconema xenoplax.

13.
Primary open angle glaucoma (POAG) is a multi-factorial optic disc neuropathy characterized by accelerating damage of the retinal ganglion cells and atrophy of the optic nerve head. The vulnerability of the optic nerve to the damage leading to POAG has been postulated to result from oxidative stress and mitochondrial dysfunction. In this study, we investigated the possible involvement of mitochondrial genomic variants in 101 patients and 71 controls by direct sequencing of the entire mitochondrial genome. The variable positions in the mtDNA with respect to the revised Cambridge Reference Sequence (rCRS) have been designated “Segregating Sites”. The segregating sites present only in the patients or controls have been designated “Unique Segregating Sites (USS)”. The population mutation rate (θ = 4N_eμ) as estimated by Watterson’s θ (θ_w), considering only the USS, was significantly higher among the patients (p = 9.8×10^-15) compared to controls. The difference in θ_w and the number of USS were more pronounced when restricted to the coding region (p<1.31×10^-21 and p = 0.006607, respectively). Further analysis of the region revealed that non-synonymous variations were significantly higher in Complex I among the patients (p = 0.0053). Similar trends were retained when USS was considered only within complex I (frequency 0.49 vs 0.31 with p<0.0001 and mutation rate p-value <1.49×10^-43) and ND5 within its gene cluster (frequency 0.47 vs 0.23 with p<0.0001 and mutation rate p-value <4.42×10^-47). ND5 is involved in the proton pumping mechanism. Incidentally, glaucomatous trabecular meshwork cells have been reported to be more sensitive to inhibition of complex I activity. Thus mutations in ND5, expected to inhibit complex I activity, could lead to generation of oxidative stress and favor glaucomatous condition.
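Watterson's estimator, which this abstract uses to compare patients and controls, divides the number of segregating sites S by the harmonic number a_n = 1 + 1/2 + ... + 1/(n-1) for a sample of n sequences. A minimal sketch follows. The function name and the optional per-site scaling are assumptions, and the study's restriction to unique segregating sites is not reproduced here.

```python
def watterson_theta(num_segregating_sites, num_sequences, sequence_length=None):
    """Watterson's theta = S / a_n, optionally scaled per site.

    a_n is the sum of 1/i for i = 1 .. n-1, where n is the sample size.
    """
    if num_sequences < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, num_sequences))
    theta = num_segregating_sites / a_n
    if sequence_length is not None:
        theta /= sequence_length  # per-site estimate
    return theta


# Toy numbers, not taken from the study: n = 4 gives a_n = 11/6.
print(round(watterson_theta(11, 4), 6))  # 6.0
```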

14.
Herein, we evaluated the concordance of population inferences and conclusions resulting from the analysis of short mitochondrial fragments (i.e., partial or complete D-Loop nucleotide sequences) versus complete mitogenome sequences for 53 bobwhites representing six ecoregions across TX and OK (USA). Median joining (MJ) haplotype networks demonstrated that analyses performed using small mitochondrial fragments were insufficient for estimating the true (i.e., complete) mitogenome haplotype structure, corresponding levels of divergence, and maternal population history of our samples. Notably, discordant demographic inferences were observed when mismatch distributions of partial (i.e., partial D-Loop) versus complete mitogenome sequences were compared, with the reduction in mitochondrial genomic information content observed to encourage spurious inferences in our samples. A probabilistic approach to variant prediction for the complete bobwhite mitogenomes revealed 344 segregating sites corresponding to 347 total mutations, including 49 putative nonsynonymous single nucleotide variants (SNVs) distributed across 12 protein coding genes. Evidence of gross heteroplasmy was observed for 13 bobwhites, with 10 of the 13 heteroplasmies involving one moderate to high frequency SNV. Haplotype network and phylogenetic analyses for the complete bobwhite mitogenome sequences revealed two divergent maternal lineages (d_XY = 0.00731; F_ST = 0.849; P < 0.05), thereby supporting the potential for two putative subspecies. However, the diverged lineage (n = 103 variants) almost exclusively involved bobwhites geographically classified as Colinus virginianus texanus, which is discordant with the expectations of previous geographic subspecies designations. Tests of adaptive evolution for functional divergence (MKT), frequency distribution tests (D, F_S) and phylogenetic analyses (RAxML) provide no evidence for positive selection or hybridization with the sympatric scaled quail (Callipepla squamata) as being explanatory factors for the two bobwhite maternal lineages observed. Instead, our analyses support the supposition that two diverged maternal lineages have survived from pre-expansion to post-expansion population(s), with the segregation of some slightly deleterious nonsynonymous mutations.

15.
The Magnificent Frigatebird Fregata magnificens has a pantropical distribution, nesting on islands along the Atlantic and Pacific coasts. In the Caribbean, there is little genetic structure among colonies; however, the genetic structure among the colonies off Brazil and its relationship with those in the Caribbean are unknown. In this study, we used mtDNA and microsatellite markers to infer population structure and evolutionary history in a sample of F. magnificens individuals collected in Brazil, Grand Connétable (French Guyana), and Barbuda. Virtually all Brazilian individuals had the same mtDNA haplotype. There was no haplotype sharing between Brazil and the Caribbean, though Grand Connétable shared haplotypes with both regions. A Bayesian clustering analysis using microsatellite data found two genetic clusters: one associated with Barbuda and the other with the Brazilian populations. Grand Connétable was more similar to Barbuda but had ancestry from both clusters, corroborating its “intermediate” position. The Caribbean and Grand Connétable populations showed higher genetic diversity and effective population size compared to the Brazilian population. Overall, our results are in good agreement with an effect of marine winds in isolating the Brazilian meta-population.

16.
Familial Mediterranean fever (FMF) is an autosomal recessive disease causing attacks of fever and serositis. The FMF gene (designated “MEF”) is on 16p, with the gene order 16cen–D16S80–MEF–D16S94–D16S283–D16S291–16pter. Here we report the association of FMF susceptibility with alleles at D16S94, D16S283, and D16S291 among 31 non-Ashkenazi Jewish families (14 Moroccan, 17 non-Moroccan). We observed highly significant associations at D16S283 and D16S291 among the Moroccan families. For the non-Moroccans, only the allelic association at D16S94 approached statistical significance. Haplotype analysis showed that 18/25 Moroccan FMF chromosomes, versus 0/21 noncarrier chromosomes, bore a specific haplotype for D16S94–D16S283–D16S291. Among non-Moroccans this haplotype was present in 6/26 FMF chromosomes versus 1/28 controls. Both groups of families are largely descended from Jews who fled the Spanish Inquisition. The strong haplotype association seen among the Moroccans is most likely a founder effect, given the recent origin and genetic isolation of the Moroccan Jewish community. The lower haplotype frequency among non-Moroccan carriers may reflect differences both in history and in population genetics.

17.
Highly polymorphic genes with central roles in lymphocyte mediated immune surveillance are grouped together in the major histocompatibility complex (MHC) in higher vertebrates. Generally, across vertebrate species the class II MHC DRA gene is highly conserved with only limited allelic variation. Here however, we provide evidence of trans-species polymorphism at the DRA locus in domestic sheep (Ovis aries). We describe variation at the Ovar-DRA locus that is far in excess of anything described in other vertebrate species. The divergent DRA allele (Ovar-DRA*0201) differs from the sheep reference sequences by 20 nucleotides, 12 of which appear non-synonymous. Furthermore, DRA*0201 is paired with an equally divergent DRB1 allele (Ovar-DRB1*0901), which is consistent with an independent evolutionary history for the DR sub-region within this MHC haplotype. No recombination was observed between the divergent DRA and B genes in a range of breeds and typical levels of MHC class II DR protein expression were detected at the surface of leukocyte populations obtained from animals homozygous for the DRA*0201, DRB1*0901 haplotype. Bayesian phylogenetic analysis groups Ovar-DRA*0201 with DRA sequences derived from species within the Oryx and Alcelaphus genera rather than clustering with other ovine and caprine DRA alleles. Tests for Darwinian selection identified 10 positively selected sites on the branch leading to Ovar-DRA*0201, three of which are predicted to be associated with the binding of peptide antigen. As the Ovis, Oryx and Alcelaphus genera have not shared a common ancestor for over 30 million years, the DRA*0201 and DRB1*0901 allelic pair is likely to be of ancient origin and present in the founding population from which all contemporary domestic sheep breeds are derived. The conservation of the integrity of this unusual DR allelic pair suggests some selective advantage which is likely to be associated with the presentation of pathogen antigen to T-cells and the induction of protective immunity.

18.
Misawa K, Tajima F. Genetics, 1997, 147(4): 1959-1964
Knowing the amount of DNA polymorphism is essential to understand the mechanism of maintaining DNA polymorphism in a natural population. The amount of DNA polymorphism can be measured by the average number of nucleotide differences per site (π), the proportion of segregating (polymorphic) sites (s) and the minimum number of mutations per site (s*). Since the latter two quantities depend on the sample size, θ is often used as a measure of the amount of DNA polymorphism, where θ = 4Nμ, N is the effective population size and μ is the neutral mutation rate per site per generation. It is known that θ estimated from π, s and s* under the infinite site model can be biased when the mutation rate varies among sites. We have therefore developed new methods for estimating θ under the finite site model. Using computer simulations, it has been shown that the new methods give almost unbiased estimates even when the mutation rate varies among sites substantially. Furthermore, we have also developed new statistics for testing neutrality by modifying Tajima's D statistic. Computer simulations suggest that the new test statistics can be used even when the mutation rate varies among sites.
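For orientation, the classic Tajima's D that this abstract takes as its starting point compares two estimates of θ: the mean number of pairwise differences and S divided by a harmonic number. The sketch below implements that standard infinite-sites statistic only. It is not the modified finite-site statistics the authors propose, and the function name and string encoding of sequences are assumptions.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Classic Tajima's D for aligned, equal-length sequences.

    Returns None when D is undefined (fewer than 3 sequences or S = 0).
    """
    n = len(seqs)
    s = sum(len(set(col)) > 1 for col in zip(*seqs))  # segregating sites
    if n < 3 or s == 0:
        return None
    # Mean number of pairwise differences (pi, not scaled per site).
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


print(tajimas_d(["1000", "0100", "0010", "0001"]))  # negative: only singletons
print(tajimas_d(["1111", "1111", "0000", "0000"]))  # positive: intermediate frequencies
```

An excess of rare variants pushes D below zero and an excess of intermediate-frequency variants pushes it above zero, which is why mutation-rate variation among sites, the concern raised in the abstract, can bias the test.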

19.
Untranslated gene regions (UTRs) play an important role in controlling gene expression. 3′-UTRs are primarily targeted by microRNA (miRNA) molecules that form complex gene regulatory networks. Cancer genomes are replete with non-coding mutations, many of which are connected to changes in tumor gene expression that accompany the development of cancer and are associated with resistance to therapy. Therefore, variants that arise in 3′-UTRs during cancer progression should be analysed to predict their phenotypic effect on gene expression, e.g., by evaluating their impact on miRNA target sites. Here, we analyze 3′-UTR variants in DICER1 and DROSHA genes in the context of myelodysplastic syndrome (MDS) development. The key features of this analysis include an assessment of both “canonical” and “non-canonical” types of mRNA-miRNA binding and tissue-specific profiling of miRNA interactions with wild-type and mutated genes. As a result, we obtained a list of DICER1 and DROSHA variants likely altering the miRNA sites and, therefore, potentially leading to the observed tissue-specific gene downregulation. All identified variants have low population frequency consistent with their potential association with pathology progression.

20.