Found 20 similar articles (search time: 171 ms)
1.
MOTIVATION: With the availability of large-scale, high-density single-nucleotide polymorphism markers and information on haplotype structures and frequencies, a great challenge is how to take advantage of haplotype information in the association mapping of complex diseases in case-control studies. RESULTS: We present a novel approach for association mapping based on directly mining haplotypes (i.e. phased genotype pairs) produced from case-control data or case-parent data via a density-based clustering algorithm, which can be applied to whole-genome screens as well as candidate-gene studies in small genomic regions. The method directly explores the sharing of haplotype segments in affected individuals that are rarely present in normal individuals. The measure of sharing between two haplotypes is defined by a new similarity metric that combines the length of the shared segments and the number of common alleles around any marker position of the haplotypes, which is robust against recent mutations/genotype errors and recombination events. The effectiveness of the approach is demonstrated by using both simulated datasets and real datasets. The results show that the algorithm is accurate for different population models and for different disease models, even for genes with small effects, and it outperforms some recently developed methods.
2.
Cardon LR 《Human heredity》2000,50(6):350-358
A multiple-regression model is described for the detection of linkage disequilibrium in quantitative trait loci. The model is developed for application to large numbers of single nucleotide polymorphism (SNP) markers genotyped on small nuclear families. Parental data are not required by the method, although it provides a direct means to test quantitative trait locus-marker allele association and to determine whether any such association is attributable to linkage disequilibrium or population admixture. Analytical expectations for the regression coefficients are derived, allowing direct interpretation of the parameter estimates. Simulation studies indicate a substantial improvement in power over classical linkage studies of sibling pairs and show the effects of population admixture on the model outcomes.
3.
Brunilda Balliu Jeanine J. Houwing‐Duistermaat Stefan Böhringer 《Biometrical journal. Biometrische Zeitschrift》2019,61(3):747-768
Marginal tests based on individual SNPs are routinely used in genetic association studies. Studies have shown that haplotype‐based methods may provide more power in disease mapping than methods based on single markers when, for example, multiple disease‐susceptibility variants occur within the same gene. A limitation of haplotype‐based methods is that the number of parameters increases exponentially with the number of SNPs, inducing a commensurate increase in the degrees of freedom and weakening the power to detect associations. To address this limitation, we introduce a hierarchical linkage disequilibrium model for disease mapping, based on a reparametrization of the multinomial haplotype distribution, where every parameter corresponds to the cumulant of each possible subset of a set of loci. This hierarchy present in the parameters enables us to employ flexible testing strategies over a range of parameter sets: from standard single SNP analyses through the full haplotype distribution tests, reducing degrees of freedom and increasing the power to detect associations. We show via extensive simulations that our approach maintains the type I error at nominal level and has increased power under many realistic scenarios, as compared to single SNP and standard haplotype‐based studies. To evaluate the performance of our proposed methodology in real data, we analyze genome‐wide data from the Wellcome Trust Case‐Control Consortium.
4.
This work develops a population-genetics model for polymorphic chromosome inversions. The model precisely describes how an inversion changes the nature of and approach to linkage equilibrium. The work also describes algorithms and software for allele-frequency estimation and linkage analysis in the presence of an inversion. The linkage algorithms implemented in the software package Mendel estimate recombination parameters and calculate the posterior probability that each pedigree member carries the inversion. Application of Mendel to eight Centre d'Etude du Polymorphisme Humain pedigrees in a region containing a common inversion on 8p23 illustrates its potential for providing more-precise estimates of the location of an unmapped marker or trait gene. Our expanded cytogenetic analysis of these families further identifies inversion carriers and increases the evidence of linkage.
5.
Stuart J. E. Baird 《Molecular ecology resources》2015,15(5):1017-1019
Linkage disequilibrium (LD, association of allelic states across loci) is poorly understood by many evolutionary biologists, but as technology for multilocus sampling improves, we ignore LD at our peril. If we sample variation at 10 loci in an organism with 20 chromosomes, we can reasonably treat them as 10 ‘independent witnesses’ of the evolutionary process. If instead, we sample variation at 1000 loci, many are bound to be close together on a chromosome. With only one or two crossovers per meiosis, associations between close neighbours decay so slowly that even LD created far in the past will not have dissipated, so we cannot treat the 1000 loci as independent witnesses (Barton 2011). This means that as marker density on genomes increases, classic analyses assuming independent loci become mired in the problem of overconfidence: if 1000 independent witnesses are assumed, and that number should be much lower, any conclusion will be overconfident. This is of special concern because our literature suffers from a strong publication bias towards confident answers, even when they turn out to be wrong (Knowles 2008). In contrast, analyses that take into account associations across loci both control for overconfidence and can inform us about LD generating events far in the past, for example human/Neanderthal admixture (Fu et al. 2014). With increased marker density, biologists must increase their awareness of LD and, in this issue of Molecular Ecology Resources, Kemppainen et al. (2015) make software available that can only help in this process: LDna allows patterns of LD in a data set to be explored using tools borrowed from network analysis. This has great potential, but realizing that potential requires understanding LD.
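The slow decay of LD between close neighbours can be made concrete with the standard textbook recursion for a large random-mating population, D_t = D_0 (1 - c)^t, where c is the recombination fraction between two loci. The sketch below is an illustration under that assumption only; it is not taken from the article or from LDna, and the function names are hypothetical.

```python
import math


def ld_after_generations(d0: float, c: float, generations: int) -> float:
    """Expected LD coefficient after t generations: D_t = D_0 * (1 - c)**t.

    Assumes an idealised large random-mating population with no selection,
    drift, or migration (an assumption of this sketch).
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction c must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1.0 - c) ** generations


def ld_half_life(c: float) -> float:
    """Number of generations needed for D to halve at recombination fraction c."""
    if not 0.0 < c <= 0.5:
        raise ValueError("recombination fraction c must lie in (0, 0.5]")
    return math.log(0.5) / math.log(1.0 - c)


if __name__ == "__main__":
    # Unlinked loci (c = 0.5) lose half their LD every generation,
    # whereas tightly linked loci (c = 0.001) retain it for hundreds.
    for c in (0.5, 0.01, 0.001):
        print(f"c = {c}: half-life of about {ld_half_life(c):.1f} generations")
```

Under this model, loci with c = 0.5 halve their LD in one generation, while loci with c = 0.001 need several hundred generations, which is why densely sampled markers cannot be treated as independent.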
6.
We describe the use of multivariate regression for testing allelic association in the presence of linkage, using marker genotype data from sibships. The test is valid, provided that the correct mean structure is modeled but does not require the correlation structure within families to be specified. The test can be implemented using standard statistical software such as the SAS programming language. In a simulation study, we evaluated this new test in comparison with one from a standard, matched-case-control analysis. First, we noted that the genetic effect needed to be quite extreme before residual familial correlation due to linkage led to false inference using the standard, matched-pair analysis. Second, we showed that under examples of extreme residual familial correlation, the new test had the correct test size. Third, we found that the test was more powerful than the sibship disequilibrium test of Horvath and Laird. Finally, we concluded that although the standard analysis may lead to correct inference for practical purposes, the new test is valid, even under extreme residual familial correlation and with no cost in power at the causal locus.
7.
We illustrate how homozygosity of haplotypes can be used to measure the level of disequilibrium between two or more markers. An excess of either homozygosity or heterozygosity signals a departure from the gametic phase equilibrium: We describe the specific form of dependence that is associated with high (low) homozygosity and derive various linkage disequilibrium measures. They feature a clear biological interpretation, can be used to construct tests, and are standardized to allow comparison across loci and populations. They are particularly advantageous to measure linkage disequilibrium between highly polymorphic markers.
8.
We present a mathematically precise formulation of total linkage disequilibrium between multiple loci as the deviation from probabilistic independence and provide explicit formulas for all higher-order terms of linkage disequilibrium, thereby combining J. Dausset et al.'s 1978 definition of linkage disequilibrium with H. Geiringer's 1944 approach. We recursively decompose higher-order linkage disequilibrium terms into lower-order ones. Our greatest simplification comes from defining linkage disequilibrium at a single locus as allele frequency at that locus. At each level, decomposition of linkage disequilibrium is mathematically equivalent to number theoretic compositions of positive integers; i.e., we have converted a genetic decomposition into a mathematical decomposition.
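As an illustration of what "deviation from probabilistic independence" and "decomposing higher-order terms into lower-order ones" mean, the standard textbook two- and three-locus coefficients are written out below. The notation is an assumption of this note (p for allele and haplotype frequencies, D for disequilibrium coefficients) and may differ from the article's own formulas.

```latex
% Two loci: deviation of the haplotype frequency from independence
D_{AB} = p_{AB} - p_A \, p_B

% Three loci: what remains after removing all lower-order contributions
D_{ABC} = p_{ABC} - p_A D_{BC} - p_B D_{AC} - p_C D_{AB} - p_A \, p_B \, p_C

% With the single-locus convention D_A = p_A, the haplotype frequency becomes
% a sum of products of disequilibrium terms over groupings of the loci:
p_{ABC} = D_A D_B D_C + D_A D_{BC} + D_B D_{AC} + D_C D_{AB} + D_{ABC}
```

The last line shows why treating a single-locus allele frequency as a first-order disequilibrium term simplifies the bookkeeping: every term on the right-hand side has the same form.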
9.
10.
Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes
The genotyping of closely spaced single-nucleotide polymorphism (SNP) markers frequently yields highly correlated data, owing to extensive linkage disequilibrium (LD) between markers. The extent of LD varies widely across the genome and drives the number of frequent haplotypes observed in small regions. Several studies have illustrated the possibility that LD or haplotype data could be used to select a subset of SNPs that optimize the information retained in a genomic region while reducing the genotyping effort and simplifying the analysis. We propose a method based on the spectral decomposition of the matrices of pairwise LD between markers, and we select markers on the basis of their contributions to the total genetic variation. We also modify Clayton's "haplotype tagging SNP" selection method, which utilizes haplotype information. For both methods, we propose sliding window-based algorithms that allow the methods to be applied to large chromosomal regions. Our procedures require genotype information about a small number of individuals for an initial set of SNPs and selection of an optimum subset of SNPs that could be efficiently genotyped on larger numbers of samples while retaining most of the genetic variation in samples. We identify suitable parameter combinations for the procedures, and we show that a sample size of 50-100 individuals achieves consistent results in studies of simulated data sets in linkage equilibrium and LD. When applied to experimental data sets, both procedures were similarly effective at reducing the genotyping requirement while maintaining the genetic information content throughout the regions. We also show that haplotype-association results that Hosking et al. obtained near CYP2D6 were almost identical before and after marker selection.
11.
A new strategy for studying the genome structure and organization of natural populations is proposed on the basis of a combined analysis of linkage and linkage disequilibrium using known polymorphic markers. This strategy exploits a random sample drawn from a panmictic natural population and the open-pollinated progeny of the sample. It is established on the principle of gene transmission from the parental to progeny generation during which the linkage between different markers is broken down due to meiotic recombination. The strategy has power to simultaneously capture the information about the linkage of the markers (as measured by recombination fraction) and the degree of their linkage disequilibrium created at a historic time. Simulation studies indicate that the statistical method implemented by the Fisher-scoring algorithm can provide accurate and precise estimates for the allele frequencies, recombination fractions, and linkage disequilibria between different markers. The strategy has great implications for constructing a dense linkage disequilibrium map that can facilitate the identification and positional cloning of the genes underlying both simple and complex traits.
12.
Knowledge of the extent and range of linkage disequilibrium (LD), defined as non-random association of alleles at two or more loci, in animal populations is extremely valuable in localizing genes affecting quantitative traits, identifying chromosomal regions under selection, studying population history, and characterizing/managing genetic resources and diversity. Two commonly used LD measures, r² and D', and their permutation-based adjustments, were evaluated using genotypes of more than 6,000 pigs from six commercial lines (two terminal sire lines and four maternal lines) at ~4,500 autosomal SNPs (single nucleotide polymorphisms). The results indicated that permutation only partially removed the dependency of D' on allele frequency and that r² is a considerably more robust LD measure. The maximum r² was derived as a function of allele frequency. Using the same genotype dataset, the extent of LD in these pig populations was estimated for all possible syntenic SNP pairs using r² and the ratio of r² over its theoretical maximum. As expected, the extent of LD was highest for SNP pairs in tightest linkage and decreased as their map distance increased. The level of LD found in these pig populations appears to be lower than previously implied in several other studies using microsatellite genotype data. For all pairs of SNPs approximately 3 centiMorgan (cM) apart, the average r² was equal to 0.1. Based on the average population-wise LD found in these six commercial pig lines, we recommend a spacing of 0.1 to 1 cM for a whole genome association study in pig populations.
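The two LD measures compared in this abstract have standard definitions for a pair of biallelic loci, and the sketch below computes them from phased haplotype frequencies. It is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the inputs are assumed to be known haplotype and allele frequencies rather than unphased genotypes, and `r2_max` is one straightforward way to obtain the largest r² attainable at given allele frequencies.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic loci.

    p_ab is the frequency of the A-B haplotype; p_a and p_b are the
    frequencies of allele A at locus 1 and allele B at locus 2.
    """
    d = p_ab - p_a * p_b
    # D' scales D by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two allelic indicators.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


def r2_max(p_a: float, p_b: float) -> float:
    """Largest r^2 attainable at the given allele frequencies."""
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        return 0.0
    d_max_pos = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    d_max_neg = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return max(d_max_pos, d_max_neg) ** 2 / denom


if __name__ == "__main__":
    # A rare allele A that always occurs with B: D' is at its ceiling,
    # yet r^2 stays well below 1 because the allele frequencies differ.
    d, d_prime, r2 = ld_measures(p_ab=0.2, p_a=0.2, p_b=0.5)
    print(d, d_prime, r2, r2 / r2_max(0.2, 0.5))
```

The example highlights why the two measures can disagree: D' reaches 1 whenever one of the four haplotypes is absent, while r² also depends on how well the allele frequencies match, which motivates reporting r² relative to its theoretical maximum.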
13.
Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium (total citations: 52; self-citations: 0; citations by others: 52)
Carlson CS Eberle MA Rieder MJ Yi Q Kruglyak L Nickerson DA 《American journal of human genetics》2004,74(1):106-120
Common genetic polymorphisms may explain a portion of the heritable risk for common diseases. Within candidate genes, the number of common polymorphisms is finite, but direct assay of all existing common polymorphisms is inefficient, because genotypes at many of these sites are strongly correlated. Thus, it is not necessary to assay all common variants if the patterns of allelic association between common variants can be described. We have developed an algorithm to select the maximally informative set of common single-nucleotide polymorphisms (tagSNPs) to assay in candidate-gene association studies, such that all known common polymorphisms either are directly assayed or exceed a threshold level of association with a tagSNP. The algorithm is based on the r² linkage disequilibrium (LD) statistic, because r² is directly related to statistical power to detect disease associations with unassayed sites. We show that, at a relatively stringent r² threshold (r² > 0.8), the LD-selected tagSNPs resolve >80% of all haplotypes across a set of 100 candidate genes, regardless of recombination, and tag specific haplotypes and clades of related haplotypes in nonrecombinant regions. Thus, if the patterns of common variation are described for a candidate gene, analysis of the tagSNP set can comprehensively interrogate for main effects from common functional variation. We demonstrate that, although common variation tends to be shared between populations, tagSNPs should be selected separately for populations with different ancestries.
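The selection criterion described here (every common SNP is either assayed directly or exceeds an r² threshold with some tagSNP) can be approached greedily. The sketch below is a simplified illustration of that idea, not the authors' published implementation: it assumes a precomputed symmetric matrix of pairwise r² values, a hypothetical function name, and a simple rule for breaking ties (lowest index first).

```python
def greedy_tag_snps(r2: list[list[float]], threshold: float = 0.8) -> list[tuple[int, list[int]]]:
    """Greedily pick tagSNPs from a symmetric pairwise r^2 matrix.

    Returns a list of (tag_index, sorted_indices_covered_by_that_tag).
    A SNP is covered by a tag when it is the tag itself or when its r^2
    with the tag exceeds the threshold.
    """
    untagged = set(range(len(r2)))
    bins: list[tuple[int, list[int]]] = []
    while untagged:
        best, best_cover = -1, set()
        for i in sorted(untagged):
            cover = {j for j in untagged if j == i or r2[i][j] > threshold}
            # Strict '>' keeps the lowest-index SNP when several cover equally many.
            if len(cover) > len(best_cover):
                best, best_cover = i, cover
        bins.append((best, sorted(best_cover)))
        untagged -= best_cover
    return bins


if __name__ == "__main__":
    # Hypothetical r^2 values for four SNPs: SNP 0 is strongly correlated
    # with SNPs 1 and 2, and SNP 3 is correlated with nothing.
    r2_matrix = [
        [1.00, 0.90, 0.85, 0.10],
        [0.90, 1.00, 0.50, 0.05],
        [0.85, 0.50, 1.00, 0.02],
        [0.10, 0.05, 0.02, 1.00],
    ]
    print(greedy_tag_snps(r2_matrix))
```

In this toy input, SNPs 1 and 2 are only weakly correlated with each other, yet both are covered by SNP 0, so two assays stand in for four sites.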
14.
Aerts J Megens HJ Veenendaal T Ovcharenko I Crooijmans R Gordon L Stubbs L Groenen M 《Cytogenetic and genome research》2007,117(1-4):338-345
Many of the economically important traits in chicken are multifactorial and governed by multiple genes located at different quantitative trait loci (QTLs). The optimal marker density to identify these QTLs in linkage and association studies is largely determined by the extent of linkage disequilibrium (LD) around them. In this study, we investigated the extent of LD on two chromosomes in a white layer and two broiler chicken breeds. Pairwise levels of LD were calculated for 33 and 36 markers on chromosomes 10 and 28, respectively. We found that useful LD (i.e. an r² value higher than 0.3) in Nutreco chicken breed E5 (inbred) can extend to around 1 cM on chromosomes 10 and 28, although in a second region on chromosome 28 it extends to about 2.5 cM. The extent in breed Nutreco E3 (outbred) was very short in chromosome 10 (15 kb) but very much larger on chromosome 28, particularly in one region of depressed heterozygosity. The layer breed E2 (inbred) showed an extent of useful LD up to 4 cM on chromosome 10; the extent on chromosome 28 could not be assessed due to an erratic pattern of LD on that chromosome, although in one region LD appears to be in the order of 0.8 cM. This indicates that there may be very large differences in patterns of LD between different chicken breeds and different genomic regions.
15.
Daniel B. Sloan Peter D. Fields Justin C. Havird 《Proceedings. Biological sciences / The Royal Society》2015,282(1815)
There is extensive evidence from model systems that disrupting associations between co-adapted mitochondrial and nuclear genotypes can lead to deleterious and even lethal consequences. While it is tempting to extrapolate from these observations and make inferences about the human-health effects of altering mitonuclear associations, the importance of such associations may vary greatly among species, depending on population genetics, demographic history and other factors. Remarkably, despite the extensive study of human population genetics, the statistical associations between nuclear and mitochondrial alleles remain largely uninvestigated. We analysed published population genomic data to test for signatures of historical selection to maintain mitonuclear associations, particularly those involving nuclear genes that encode mitochondrial-localized proteins (N-mt genes). We found that significant mitonuclear linkage disequilibrium (LD) exists throughout the human genome, but these associations were generally weak, which is consistent with the paucity of population genetic structure in humans. Although mitonuclear LD varied among genomic regions (with especially high levels on the X chromosome), N-mt genes were statistically indistinguishable from background levels, suggesting that selection on mitonuclear epistasis has not preferentially maintained associations involving this set of loci at a species-wide level. We discuss these findings in the context of the ongoing debate over mitochondrial replacement therapy.
16.
Maniatis N Collins A Gibson J Zhang W Tapper W Morton NE 《American journal of human genetics》2004,74(5):846-855
Recently, metric linkage disequilibrium (LD) maps that assign an LD unit (LDU) location for each marker have been developed (Maniatis et al. 2002). Here we present a multiple pairwise method for positional cloning by LD within a composite likelihood framework and investigate the operating characteristics of maps in physical units (kb) and LDU for two bodies of data (Daly et al. 2001; Jeffreys et al. 2001) on which current ideas of blocks are based. False-negative indications of a disease locus (type II error) were examined by selecting one single-nucleotide polymorphism (SNP) at a time as causal and taking its allelic count (0, 1, or 2, for the three genotypes) as a pseudophenotype, Y. By use of regression and correlation, association between every pseudophenotype and the allelic count of each SNP locus (X) was based on an adaptation of the Malecot model, which includes a parameter for location of the putative gene. By expressing locations in kb or LDU, greater power for localization was observed when the LDU map was fitted. The efficiency of the kb map, relative to the LDU map, to describe LD varied from a maximum of 0.87 to a minimum of 0.36, with a mean of 0.62. False-positive indications of a disease locus (type I error) were examined by simulating an unlinked causal SNP and the allele count was used as a pseudophenotype. The type I error was in good agreement with Wald's likelihood theorem for both metrics and all models that were tested. Unlike tests that select only the most significant marker, haplotype, or haploset, these methods are robust to large numbers of markers in a candidate region. Contrary to predictions from tagging SNPs that retain haplotype diversity, the sample with smaller size but greater SNP density gave less error. The locations of causal SNPs were estimated with the same precision in blocks and steps, suggesting that block definition may be less useful than anticipated for mapping a causal SNP. 
These results provide a guide to efficient positional cloning by SNPs and a benchmark against which the power of positional cloning by haplotype-based alternatives may be measured.
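For orientation, the Malecot model referred to in this abstract is usually written as an exponential decline of association with distance. The form below is the commonly cited one and is given here as an assumption of this note; the exact parametrization used in the article, in particular how the location parameter enters, may differ.

```latex
% Expected association \rho between a marker SNP and the putative causal site
% as a function of their separation d (in kb or in LDU):
\rho(d) = (1 - L)\, M \, e^{-\varepsilon d} + L
% M           : association at zero distance (intercept)
% L           : residual association at large distance (asymptote)
% \varepsilon : rate of exponential decline with distance
% d = |S_i - S|, with S_i the map location of SNP i and S the estimated
% location of the putative causal site.
```

Fitting S as a free parameter is what turns a description of LD decay into a localization method: the estimate of S is the map position at which the observed associations best follow the exponential decline.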
17.
18.
An approach to the investigation of the evolution of quantitative traits on the basis of analysis of two-locus marginal systems dynamics has been developed. It has been shown that under stabilizing selection the "quasi-stationary" state is quickly reached and maintained continuously. The "quasi-stationary" state is characterized by small changes in allele frequencies and by linkage disequilibrium that significantly decreases genotypic variance. Equations defining the role of linkage disequilibrium in the stationary state of mutation-selection balance are derived.
19.
GOLD--graphical overview of linkage disequilibrium (total citations: 38; self-citations: 0; citations by others: 38)
SUMMARY: We describe a software package that provides a graphical summary of linkage disequilibrium in human genetic data. It allows for the analysis of family data and is well suited to the analysis of dense genetic maps. AVAILABILITY: http://www.well.ox.ac.uk/asthma/GOLD CONTACT: goncalo@well.ox.ac.uk
20.
Inferences about linkage disequilibrium. (total citations: 32; self-citations: 0; citations by others: 32)
B S Weir 《Biometrics》1979,35(1):235-254
Existing theory for inferences about linkage disequilibrium is restricted to a measure defined on gametic frequencies. Unless gametic frequencies are directly observable, they are inferred from genotypic frequencies under the assumption of random union of gametes. Primary emphasis in this paper is given to genotypic data, and disequilibrium coefficients are defined for all subsets of two or more of the four genes, two at each of two loci, carried by an individual. Linkage disequilibrium coefficients are defined for genes within and between gametes, and methods of estimating and testing these coefficients are given for gametic data. For genotypic data, when coupling and repulsion double heterozygotes cannot be distinguished, Burrows' composite measure of linkage disequilibrium is discussed. In particular, the estimate for this measure and hypothesis tests based on it are compared to the usual maximum likelihood estimate of gametic linkage disequilibrium, and corresponding likelihood ratio or contingency chi-square tests. General use of the composite measure, whether or not random union of gametes is an appropriate assumption, is recommended. Attention is given to small samples, where the non-normality of gene frequencies will have greatest effect on methods of inference based on normal theory. Even tools such as Fisher's z-transformation for the correlation of gene frequencies are found to perform quite satisfactorily.
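Burrows' composite measure can be computed directly from unphased genotypes, because it never needs to distinguish coupling from repulsion double heterozygotes. The sketch below uses the identity that the composite coefficient equals half the covariance between the per-individual allele counts (0, 1, or 2) at the two loci. It is a minimal illustration: the function name is hypothetical, and the plain 1/n normalisation is an assumption of this sketch (small-sample corrections are not applied).

```python
def burrows_composite_ld(a_counts: list[int], b_counts: list[int]) -> float:
    """Composite LD from unphased genotypes at two biallelic loci.

    a_counts[i] and b_counts[i] are the numbers of A and B alleles (0, 1, 2)
    carried by individual i. Equivalent to
        (2*n_AABB + n_AABb + n_AaBB + n_AaBb / 2) / n - 2 * p_A * p_B,
    which is half the (1/n) covariance of the two allele counts.
    """
    n = len(a_counts)
    if n == 0 or n != len(b_counts):
        raise ValueError("need two non-empty genotype vectors of equal length")
    if any(g not in (0, 1, 2) for g in list(a_counts) + list(b_counts)):
        raise ValueError("allele counts must be 0, 1, or 2")
    mean_a = sum(a_counts) / n
    mean_b = sum(b_counts) / n
    mean_ab = sum(a * b for a, b in zip(a_counts, b_counts)) / n
    return (mean_ab - mean_a * mean_b) / 2.0


if __name__ == "__main__":
    # Hypothetical genotypes: A and B alleles always travel together.
    print(burrows_composite_ld([2, 2, 0, 0, 1], [2, 2, 0, 0, 1]))
    # Allele counts at the two loci vary independently: composite LD is zero.
    print(burrows_composite_ld([2, 2, 0, 0], [2, 0, 2, 0]))
```

Because only allele counts enter the calculation, double heterozygotes contribute the same term whatever their phase, which is what makes the measure usable when gametic frequencies cannot be observed.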