Similar articles
1.
SUMMARY: GWAsimulator implements a rapid moving-window algorithm to simulate genotype data for case-control or population samples from genomic SNP chips. For case-control data, the program generates cases and controls according to a user-specified multi-locus disease model, and can simulate specific regions if desired. The program uses phased genotype data as input and has the flexibility of simulating genotypes for different populations and different genomic SNP chips. When the HapMap phased data are used, the simulated data have local LD patterns similar to those of the HapMap data. As genome-wide association (GWA) studies become increasingly popular and new GWA data analysis methods are being developed, we anticipate that GWAsimulator will be an important tool for evaluating the performance of new GWA analysis methods. AVAILABILITY: The C++ source code, executables for Linux, Windows and MacOS, manual, example data sets and analysis program are available at http://biostat.mc.vanderbilt.edu/GWAsimulator

2.
The background linkage disequilibrium (LD) in genetic isolates is of great interest in human genetics. Although many empirical studies have evaluated the background LD in European isolates, such as the Finnish and Sardinians, few data from other regions, such as Asia, have been reported. To evaluate the extent of background LD in East Asian genetic isolates, we analyzed the X chromosome in the Japanese population and in four Mongolian populations (Khalkh, Khoton, Uriankhai, and Zakhchin), the demographic histories of which are quite different from one another. Fisher's exact test revealed that the Japanese and Khalkh, which are the expanded populations, had the same or a relatively lower level of LD than did the Finnish, European American, and Sardinian populations. In contrast, the Khoton, Uriankhai, and Zakhchin populations, which have kept their population size constant, had a higher background LD. These results were consistent with previous genetic anthropological studies in European isolates and indicate that the Japanese and Khalkh populations could be utilized in the fine mapping of both complex and monogenic diseases, whereas the Khoton, Uriankhai, and Zakhchin populations could play an important role in the initial mapping of complex disease genes.

3.
Several previous studies concluded that linkage disequilibrium (LD) in livestock populations from developed countries originated from the impact of strong selection. Here, we assessed the extent of LD in a cattle population from western Africa that was bred in an extensive farming system. The analyses were performed on 363 individuals in a Bos indicus x Bos taurus population using 42 microsatellite markers on BTA04, BTA07 and BTA13. A high level of expected heterozygosity (0.71), a high mean number of alleles per locus (9.7) and a mild shift in Hardy-Weinberg equilibrium were found. Linkage disequilibrium extended over shorter distances than what has been observed in cattle from developed countries. Effective population size was assessed using two methods; both methods produced large values: 1388 when considering heterozygosity (assuming a mutation rate of 10⁻³) and 2344 when considering LD on whole linkage groups (assuming a constant population size over generations). However, analysing the decay of LD as a function of marker spacing indicated a decreasing trend in effective population size over generations. This decrease could be explained by increasing selective pressure and/or by an admixture process. Finally, LD extended over small distances, which suggested that whole-genome scans will require a large number of markers. However, association studies using such populations will be effective.

4.
We compared the accuracies of four genomic-selection prediction methods as affected by marker density, level of linkage disequilibrium (LD), quantitative trait locus (QTL) number, sample size, and level of replication in populations generated from multiple inbred lines. Marker data on 42 two-row spring barley inbred lines were used to simulate high and low LD populations from multiple inbred line crosses: the first included many small full-sib families and the second was derived from five generations of random mating. True breeding values (TBV) were simulated on the basis of 20 or 80 additive QTL. Methods used to derive genomic estimated breeding values (GEBV) were random regression best linear unbiased prediction (RR–BLUP), Bayes-B, a Bayesian shrinkage regression method, and BLUP from a mixed model analysis using a relationship matrix calculated from marker data. Using the best methods, accuracies of GEBV were comparable to accuracies from phenotype for predicting TBV without requiring the time and expense of field evaluation. We identified a trade-off between a method's ability to capture marker-QTL LD vs. marker-based relatedness of individuals. The Bayesian shrinkage regression method primarily captured LD, the BLUP methods captured relationships, while Bayes-B captured both. Under most of the study scenarios, mixed-model analysis using a marker-derived relationship matrix (BLUP) was more accurate than methods that directly estimated marker effects, suggesting that relationship information was more valuable than LD information. When markers were in strong LD with large-effect QTL, or when predictions were made on individuals several generations removed from the training data set, however, the ranking of method performance was reversed and BLUP had the lowest accuracy.

5.
Li Y, Li Y, Wu S, Han K, Wang Z, Hou W, Zeng Y, Wu R. Genetics 2007, 176(3):1811-1821
Analysis of population structure and organization with DNA-based markers can provide important information regarding the history and evolution of a species. Linkage disequilibrium (LD) analysis based on allelic associations between different loci is emerging as a viable tool to unravel the genetic basis of population differentiation. In this article, we derive the EM algorithm to obtain the maximum-likelihood estimates of the linkage disequilibria between dominant markers, to study the patterns of genetic diversity for a diploid species. The algorithm was expanded to estimate and test linkage disequilibria of different orders among three dominant markers and can be technically extended to manipulate an arbitrary number of dominant markers. The feasibility of the proposed algorithm is validated by an example of population genetic studies of hickory trees, native to southeastern China, using dominant random amplified polymorphic DNA markers. Extensive simulation studies were performed to investigate the statistical properties of this algorithm. The precision of the estimates of linkage disequilibrium between dominant markers was compared with that between codominant markers. Results from simulation studies suggest that three-locus LD analysis displays increased power of LD detection relative to two-locus LD analysis. This algorithm is useful for studying the pattern and amount of genetic variation within and among populations.
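To make the idea of an EM estimate for dominant markers more concrete, here is a minimal two-locus sketch. It is a generic textbook-style EM written for illustration, not the authors' implementation: it assumes random mating (Hardy-Weinberg proportions), two loci only, and phenotypes coded as (band at locus A, band at locus B). The names `em_dominant_ld`, `phenotype` and `HAPS`, and the example counts, are hypothetical.

```python
from itertools import product

HAPS = ["AB", "Ab", "aB", "ab"]  # upper case = dominant (band-producing) allele


def phenotype(h1, h2):
    """Dominant phenotype of a diplotype: 1 if at least one dominant allele is present."""
    a = 1 if "A" in (h1[0], h2[0]) else 0
    b = 1 if "B" in (h1[1], h2[1]) else 0
    return (a, b)


def em_dominant_ld(counts, n_iter=500):
    """Estimate haplotype frequencies and D from dominant-marker phenotype counts.

    counts maps a phenotype (a, b) to the number of individuals observed.
    Assumes random mating; this is an illustrative sketch only.
    """
    n = sum(counts.values())
    freq = {h: 0.25 for h in HAPS}
    for _ in range(n_iter):
        # E-step: group ordered diplotypes by the phenotype they produce,
        # weighting each by its probability under the current frequencies.
        by_pheno = {}
        for h1, h2 in product(HAPS, repeat=2):
            by_pheno.setdefault(phenotype(h1, h2), []).append(
                (h1, h2, freq[h1] * freq[h2])
            )
        expected = {h: 0.0 for h in HAPS}
        for ph, n_ph in counts.items():
            pairs = by_pheno.get(ph, [])
            total = sum(w for _, _, w in pairs)
            if n_ph == 0 or total == 0:
                continue
            for h1, h2, w in pairs:
                share = n_ph * w / total  # expected individuals with this diplotype
                expected[h1] += share
                expected[h2] += share
        # M-step: each individual carries two haplotypes.
        freq = {h: expected[h] / (2 * n) for h in HAPS}
    p_a = freq["AB"] + freq["Ab"]
    p_b = freq["AB"] + freq["aB"]
    return freq, freq["AB"] - p_a * p_b


# Hypothetical counts: 60 with both bands, 15 + 15 with one band, 10 with none.
freq, d = em_dominant_ld({(1, 1): 60, (1, 0): 15, (0, 1): 15, (0, 0): 10})
```

Because heterozygotes cannot be told apart from dominant homozygotes, the E-step has to spread each phenotype class over every diplotype that could have produced it, which is why precision is expected to be lower than with codominant markers.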

6.
Genetic association studies increasingly rely on the use of linkage disequilibrium (LD) tag SNPs to reduce genotyping costs. We developed a software package TAGster to select, evaluate and visualize LD tag SNPs both for single and multiple populations. We implement several strategies to improve the efficiency of current LD tag SNP selection algorithms: (1) we modify the tag SNP selection procedure of Carlson et al. to improve selection efficiency and further generalize it to multiple populations. (2) We propose a redundant SNP elimination step to speed up the exhaustive tag SNP search algorithm proposed by Qin et al. (3) We present an additional multiple population tag SNP selection algorithm based on the framework of Howie et al., but using our modified exhaustive search procedure. We evaluate these methods using resequenced candidate gene data from the Environmental Genome Project and show improvements in both computational and tagging efficiency. AVAILABILITY: The software package TAGster is freely available at http://www.niehs.nih.gov/research/resources/software/tagster/
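For readers unfamiliar with LD-based tag SNP selection, the following is a simplified sketch of greedy binning in the spirit of the Carlson et al. procedure mentioned above. It is not TAGster's code and does not reproduce the modifications described in the abstract; the function name `greedy_tag_snps`, the r² threshold of 0.8 and the toy matrix are assumptions made for illustration.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Greedy LD binning on a symmetric matrix of pairwise r^2 values.

    Repeatedly picks the SNP that captures the most remaining SNPs at the
    threshold, records it as the tag for that bin, and removes the bin.
    Returns a list of (tag_index, sorted_bin_members).
    """
    remaining = set(range(len(r2)))
    bins = []
    while remaining:
        def captured(i):
            # SNPs still unassigned that SNP i tags (itself included).
            return {j for j in remaining if j == i or r2[i][j] >= threshold}

        # Ties are broken in favour of the lowest index.
        best = max(sorted(remaining), key=lambda i: len(captured(i)))
        members = captured(best)
        bins.append((best, sorted(members)))
        remaining -= members
    return bins


# Toy example: SNPs 0-2 are in strong mutual LD, as are SNPs 3-4.
toy_r2 = [
    [1.00, 0.90, 0.90, 0.10, 0.10],
    [0.90, 1.00, 0.90, 0.10, 0.10],
    [0.90, 0.90, 1.00, 0.10, 0.10],
    [0.10, 0.10, 0.10, 1.00, 0.85],
    [0.10, 0.10, 0.10, 0.85, 1.00],
]
tags = greedy_tag_snps(toy_r2)
```

In this toy case two tags are enough to cover five SNPs, which is the kind of genotyping saving that tag SNP selection aims for.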

7.
Observed linkage disequilibrium (LD) between genetic markers in different populations descended independently from a common ancestral population can be used to estimate their absolute time of divergence, because the correlation of LD between populations will be reduced each generation by an amount that, approximately, depends only on the recombination rate between markers. Although drift leads to divergence in allele frequencies, it has less effect on divergence in LD values. We derived the relationship between LD and time of divergence and verified it with coalescent simulations. We then used HapMap Phase II data to estimate time of divergence between human populations. Summed over large numbers of pairs of loci, we find a positive correlation of LD between African and non-African populations at levels of up to ~0.3 cM. We estimate that the observed correlation of LD is consistent with an effective separation time of approximately 1,000 generations or ~25,000 years before present. The most likely explanation for such relatively low separation times is the existence of substantial levels of migration between populations after the initial separation. Theory and results from coalescent simulations confirm that low levels of migration can lead to a downward bias in the estimate of separation time.

8.
Linkage disequilibrium in the domesticated pig
Nsengimana J, Baret P, Haley CS, Visscher PM. Genetics 2004, 166(3):1395-1404
This study investigated the extent of linkage disequilibrium (LD) in two genomic regions (on chromosomes 4 and 7) in five populations of domesticated pigs. LD was measured with D' and tested for significance with the Fisher exact test. Effects of genetic (linkage) distance, chromosome, population, and their interactions on D' were tested both through a linear model analysis of covariance and by a theoretical nonlinear model. The overall result was that (1) the distance explained most of the variability of D', (2) the effect of chromosome was significant, and (3) the effect of population was significant. The significance of the chromosome effect may have resulted from selection and the significance of the population effect illustrates the effects of population structures and effective population sizes on LD. These results suggest that mapping methods based on LD may be valuable even with only moderately dense marker spacing in pigs.
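As a small illustration of the D' statistic used above, the sketch below computes D, the absolute value of Lewontin's D' and r² for two biallelic loci from haplotype counts. The function name `ld_stats` and the example counts are hypothetical and not taken from the study, which worked with its own marker data.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, |D'|, r^2) for two biallelic loci from four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = p_AB - p_A * p_B  # raw disequilibrium coefficient
    # D' scales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical counts of the four haplotypes AB, Ab, aB, ab.
d, d_prime, r2 = ld_stats(40, 10, 10, 40)
```

D' reaches 1 whenever at least one of the four haplotypes is absent, whereas r² reaches 1 only when two haplotypes are absent, so the two measures can rank marker pairs differently.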

9.
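Lefua costata is a cold-water fish distributed north of the Huai River, and analysis of its genetic structure can reveal how it has responded to environmental change. We analyzed the phylogeography and genetic diversity of L. costata in China based on 211 mitochondrial D-loop sequences, with samples collected from 18 sites in 9 river systems. Haplotype analysis identified 55 haplotypes in total, showing high haplotype diversity (h=0.9304) and high nucleotide diversity (π=0.0087). ...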

10.
Effectiveness of marker-assisted selection (MAS) and quantitative trait loci (QTL) mapping using population-wide linkage disequilibrium (LD) between markers and QTL depends on the extent of LD and how it declines with distance in a population. Because marker-QTL LD cannot be observed directly, the objective of this study was to evaluate alternative measures of observable LD between multi-allelic markers as predictors of usable LD of multi-allelic markers with presumed biallelic QTL. Observable LD between marker pairs was evaluated using eight existing measures and one new measure. These consisted of two pooled and standardized measures of LD between pairs of alleles at two markers based on Lewontin's LD measure, two pooled measures of squared correlations between alleles, one standardized measure using Hardy-Weinberg heterozygosities, and four measures based on the chi-square statistic for testing for association between alleles at two loci. In simulated populations with a range of LD generated by drift and a range of marker polymorphism, marker-marker LD measured by a standardized chi-square statistic (denoted χ²′) was found to be the best predictor of usable marker-QTL LD for a group of multi-allelic markers. Estimates of the level and decline of marker-marker LD with distance obtained from χ²′ were linearly and highly correlated with usable LD of those markers with QTL across population structures and marker polymorphism. Corresponding relationships were poorer for the other marker-marker LD measures. Therefore, when LD is generated by drift, χ²′ is recommended to quantify the amount and extent of usable LD in a population for QTL mapping and MAS based on multi-allelic markers.

11.
The pattern of linkage disequilibrium (LD) is affected by a number of factors, including population demography. High LD is seen in populations with a relatively limited and constant size, presumably because of genetic drift. We have examined the extent of LD among over 300 genome-wide pattern microsatellite loci in 29 populations from around the world. The pattern of LD varied between populations, with a larger extent of LD in populations with limited size relative to larger populations. In addition, the LD between 88 less well-spaced microsatellite markers from 10 different genomic regions was examined in the Sami compared with the general Swedish population. For these markers, increased LD extending up to 5 Mb was detected in the Sami. The amount of LD also differed between the chromosomal regions. The amount of LD in the Sami makes this population suitable for the mapping of complex genetic traits. Åsa Johansson and Veronika Vavruch-Nilsson contributed equally to the report.

12.
Population-based methods for the genetic mapping of adaptive traits and the analysis of natural selection require that the population structure and demographic history of a species are taken into account. We characterized geographic patterns of genetic variation in the model plant Arabidopsis thaliana by genotyping 115 genome-wide single nucleotide polymorphism (SNP) markers in 351 accessions from the whole species range using a matrix-assisted laser desorption/ionization time-of-flight assay, and by sequencing of nine unlinked short genomic regions in a subset of 64 accessions. The observed frequency distribution of SNPs is not consistent with a constant-size neutral model of sequence polymorphism due to an excess of rare polymorphisms. There is evidence for a significant population structure as indicated by differences in genetic diversity between geographic regions. Accessions from Central Asia have a low level of polymorphism and an increased level of genome-wide linkage disequilibrium (LD) relative to accessions from the Iberian Peninsula and Central Europe. Cluster analysis with the structure program grouped Eurasian accessions into K=6 clusters. Accessions from the Iberian Peninsula and from Central Asia constitute distinct populations, whereas Central and Eastern European accessions represent admixed populations in which genomes were reshuffled by historical recombination events. These patterns likely result from a rapid postglacial recolonization of Eurasia from glacial refugial populations. Our analyses suggest that mapping populations for association or LD mapping should be chosen from a regional rather than a species-wide sample or identified genetically as sets of individuals with similar average genetic distances. Electronic Supplementary Material: Supplementary material is available for this article and is accessible for authorized users.

13.
MOTIVATION: Cancer is well known to be the end result of somatic mutations that disrupt normal cell division. The number of such mutations that have to be accumulated in a cell before cancer develops depends on the type of cancer. The waiting time T(m) until the appearance of m mutations in a cell is thus an important quantity in population genetics models of carcinogenesis. Such models are often difficult to analyze theoretically because of the complex interactions of mutation, drift and selection. They are also computationally expensive to simulate because of the large number of cells and the low mutation rate. RESULTS: We develop an efficient algorithm for simulating the waiting time T(m) until m mutations under a population genetics model of cancer development. We use an exact algorithm to simulate evolution of small cell populations and a coarse-grained τ-leaping approximation to handle large populations. We compared our hybrid simulation algorithm with the exact algorithm in small populations and with available asymptotic results for large populations. The comparison suggested that our algorithm is accurate and computationally efficient. We used the algorithm to study the waiting time for up to 20 mutations under a Moran model with variable population sizes. Our new algorithm may be useful for studying realistic models of carcinogenesis, which incorporate variable mutation rates and fitness effects.

14.
Multilocus genotyping of microbial pathogens has revealed a range of population structures, with some bacteria showing extensive recombination and others showing almost complete clonality. The population structure of the protozoan parasite Plasmodium falciparum has been harder to evaluate, since most studies have used a limited number of antigen-encoding loci that are known to be under strong selection. We describe length variation at 12 microsatellite loci in 465 infections collected from 9 locations worldwide. These data reveal dramatic differences in parasite population structure in different locations. Strong linkage disequilibrium (LD) was observed in six of nine populations. Significant LD occurred in all locations with prevalence <1% and in only two of five of the populations from regions with higher transmission intensities. Where present, LD results largely from the presence of identical multilocus genotypes within populations, suggesting high levels of self-fertilization in populations with low levels of transmission. We also observed dramatic variation in diversity and geographical differentiation in different regions. Mean heterozygosities in South American countries (0.3-0.4) were less than half those observed in African locations (0.76-0.8), with intermediate heterozygosities in the Southeast Asia/Pacific samples (0.51-0.65). Furthermore, variation was distributed among locations in South America (F_ST = 0.364) and within locations in Africa (F_ST = 0.007). The intraspecific patterns of diversity and genetic differentiation observed in P. falciparum are strikingly similar to those seen in interspecific comparisons of plants and animals with differing levels of outcrossing, suggesting that similar processes may be involved. The differences observed may also reflect the recent colonization of non-African populations from an African source, and the relative influences of epidemiology and population history are difficult to disentangle. These data reveal a range of population structures within a single pathogen species and suggest intimate links between patterns of epidemiology and genetic structure in this organism.

15.
Two dinucleotide short tandem-repeat polymorphisms (STRPs) and a polymorphic Alu element spanning a 22-kb region of the PLAT locus on chromosome 8p12-q11.2 were typed in 1,287-1,420 individuals originating from 30 geographically diverse human populations, as well as in 29 great apes. These data were analyzed as haplotypes consisting of each of the dinucleotide repeats and the flanking Alu insertion/deletion polymorphism. The global pattern of STRP/Alu haplotype variation and linkage disequilibrium (LD) is informative for the reconstruction of human evolutionary history. Sub-Saharan African populations have high levels of haplotype diversity within and between populations, relative to non-Africans, and have highly divergent patterns of LD. Non-African populations have both a subset of the haplotype diversity present in Africa and a distinct pattern of LD. The pattern of haplotype variation and LD observed at the PLAT locus suggests a recent common ancestry of non-African populations, from a small population originating in eastern Africa. These data indicate that, throughout much of modern human history, sub-Saharan Africa has maintained both a large effective population size and a high level of population substructure. Additionally, Papua New Guinean and Micronesian populations have rare haplotypes observed otherwise only in African populations, suggesting ancient gene flow from Africa into Papua New Guinea, as well as gene flow between Melanesian and Micronesian populations.

16.
The human dopaminergic system is a significant focal point of study in the fields of neuropsychiatry and pharmacology, and it is also a promising nuclear DNA marker in studies of human genome diversity. In this study, we assayed six polymorphic markers in the dopamine D2 receptor gene (DRD2) in 482 unrelated individuals from nine ethnic populations of India. Our results demonstrate that the six markers are highly polymorphic in all populations and the constructed haplotypes show a high level of heterozygosity. Out of the eight possible three-site haplotypes, only three were shared by all populations. The haplotypes exhibited fairly high frequencies across multiple populations; the Kurumba population showed all eight three-site haplotypes. The ancestral haplotype (B2-D2-A1) was observed at high frequency only in the Siddi population. Haplotypes based on all six markers revealed 16 haplotypes, of which only 6 are common, with a frequency greater than 5% in at least one of the nine populations. But only three haplotypes were shared by all nine populations, with the cumulative frequency ranging from 80.8% (Kurumba) to 96.6% (Onge). Great variation in levels of linkage disequilibrium (LD) was detected, ranging from complete LD in the Badaga to virtually no LD in the Siddi. This range of LD likely reflects different population histories, such as African ancestry in the Siddi and recent founding events in the population isolates, Badaga and Kota.

17.
Use of genetic methods to estimate effective population size (Ne) is rapidly increasing, but all approaches make simplifying assumptions unlikely to be met in real populations. In particular, all assume a single, unstructured population, and none has been evaluated for use with continuously distributed species. We simulated continuous populations with local mating structure, as envisioned by Wright's concept of neighborhood size (NS), and evaluated performance of a single-sample estimator based on linkage disequilibrium (LD), which provides an estimate of the effective number of parents that produced the sample (Nb). Results illustrate the interacting effects of two phenomena, drift and mixture, that contribute to LD. Samples from areas equal to or smaller than a breeding window produced estimates close to the NS. As the sampling window increased in size to encompass multiple genetic neighborhoods, mixture LD from a two-locus Wahlund effect overwhelmed the reduction in drift LD from incorporating offspring from more parents. As a consequence, the estimate of Nb never approached the global Ne, even when the geographic scale of sampling was large. Results indicate that caution is needed in applying standard methods for estimating effective size to continuously distributed populations.
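To show roughly how a single-sample LD estimator turns observed LD into an effective size, the sketch below uses the classic first-order approximation for unlinked loci under random mating, E[r²] ≈ 1/(3Ne) + 1/S, where S is the number of individuals sampled. This is an assumption chosen for illustration: published estimators usually add bias corrections that are not reproduced here, and the function name `ne_from_ld` and the example values are hypothetical.

```python
def ne_from_ld(mean_r2, sample_size):
    """First-order LD-based estimate of effective size for unlinked loci.

    Subtracts the sampling contribution 1/S from the mean r^2 and inverts
    E[r^2_drift] ~ 1 / (3 * Ne). Returns infinity when no drift signal remains.
    """
    drift_r2 = mean_r2 - 1.0 / sample_size
    if drift_r2 <= 0:
        return float("inf")
    return 1.0 / (3.0 * drift_r2)


# Hypothetical input: mean r^2 of 0.03 across locus pairs in a sample of 50.
estimate = ne_from_ld(0.03, 50)
```

Any extra LD that does not come from drift, such as the mixture LD described above, inflates the mean r² and therefore pushes the estimate downward, which is consistent with the estimate failing to reach the global Ne.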

18.
Linkage disequilibrium (LD) mapping is commonly used as a fine mapping tool in human genome mapping and has been used with some success for initial disease gene isolation in certain isolated inbred human populations. An understanding of the population history of domestic dog breeds suggests that LD mapping could be routinely utilized in this species for initial genome-wide scans. Such an approach offers significant advantages over traditional linkage analysis. Here, we demonstrate, using canine copper toxicosis in the Bedlington terrier as the model, that LD mapping could be reasonably expected to be a useful strategy in low-resolution, genome-wide scans in pure-bred dogs. Significant LD was demonstrated over distances up to 33.3 cM. It is very unlikely, for a number of reasons discussed, that this result could be extrapolated to the rest of the genome. It is, however, consistent with the expectation given the population structure of canine breeds and, in this breed at least, with the hypothesis that it may be possible to utilize LD in a genome-wide scan. In this study, LD mapping confirmed the location of the copper toxicosis in Bedlington terrier gene (CT-BT) and was able to do so in a population that was refractory to traditional linkage analysis.

19.
The relationship between linkage disequilibrium (LD) and recombination fraction can be used to infer the pattern of genetic variation and evolutionary process in humans and other systems. We described a computational framework to construct a linkage–LD map from commonly used biallelic, single-nucleotide polymorphism (SNP) markers for outcrossing plants by which the decline of LD is visualized with genetic distance. The framework was derived from an open-pollinated (OP) design composed of plants randomly sampled from a natural population and seeds from each sampled plant, enabling simultaneous estimation of the LD in the natural population and recombination fraction due to allelic co-segregation during meiosis. We modified the framework to infer evolutionary pasts of natural populations using those marker types that are segregating in a dominant manner, given their role in creating and maintaining population genetic diversity. A sophisticated two-level EM algorithm was implemented to estimate and retrieve the missing information of segregation characterized by dominant-segregating markers such as single methylation polymorphisms. The model was applied to study the relationship between linkage and LD for a non-model outcrossing species, a gymnosperm species, Torreya grandis, naturally distributed in the mountains of southeastern China. The linkage–LD map constructed from various types of molecular markers opens a powerful gateway for studying the history of plant evolution.

20.
Li MH, Merilä J. Molecular Ecology 2011, 20(14):2916-2928
Information about the levels of linkage disequilibrium (LD) in wild animal populations is still limited, and this is true particularly with respect to possible interpopulation variation in the levels of LD. We compared the levels and extent of LD at the genome‐wide scale in three Siberian jay (Perisoreus infaustus) populations, two of which (Kuusamo and Ylläs) represented outbred populations within the main distribution area of the species, whereas the third (Suupohja) was a semi‐isolated, partially inbred population at the margin of the species’ distribution area. Although extensive long‐range LD (>20 cM) was observed in all three populations, LD generally decayed to background levels at a distance of 1–5 cM or c. 200–600 kb. The degree and extent of LD differed markedly between populations but aligned closely with both observed levels of within‐population genetic variation and expectations based on population history. The levels of LD were highest in the most inbred population with strong population substructure (Suupohja), compared with the two outbred populations. Furthermore, the decay of LD with increasing distance was slower in Suupohja, compared with the other two populations. By demonstrating that levels of LD can vary greatly over relatively short geographical distances within a species, these results suggest that prospects for association mapping differ from population to population. In this example, the prospects are best in the Suupohja population, given that minimized marker genotyping and a minimum marker spacing of 1–5 cM (c. 200–600 kb) would be sufficient for a whole genome scan for detecting QTL.
