Found 20 similar articles (search time: 0 ms)
1.
MOTIVATION: With the availability of large-scale, high-density single-nucleotide polymorphism markers and information on haplotype structures and frequencies, a great challenge is how to take advantage of haplotype information in the association mapping of complex diseases in case-control studies. RESULTS: We present a novel approach for association mapping based on directly mining haplotypes (i.e. phased genotype pairs) produced from case-control data or case-parent data via a density-based clustering algorithm, which can be applied to whole-genome screens as well as candidate-gene studies in small genomic regions. The method directly explores the sharing of haplotype segments in affected individuals that are rarely present in normal individuals. The measure of sharing between two haplotypes is defined by a new similarity metric that combines the length of the shared segments and the number of common alleles around any marker position of the haplotypes, which is robust against recent mutations/genotype errors and recombination events. The effectiveness of the approach is demonstrated by using both simulated datasets and real datasets. The results show that the algorithm is accurate for different population models and for different disease models, even for genes with small effects, and it outperforms some recently developed methods.
2.
Cardon LR 《Human heredity》2000,50(6):350-358
A multiple-regression model is described for the detection of linkage disequilibrium in quantitative trait loci. The model is developed for application to large numbers of single nucleotide polymorphism (SNP) markers genotyped on small nuclear families. Parental data are not required by the method, although it provides a direct means to test quantitative trait locus-marker allele association and to determine whether any such association is attributable to linkage disequilibrium or population admixture. Analytical expectations for the regression coefficients are derived, allowing direct interpretation of the parameter estimates. Simulation studies indicate a substantial improvement in power over classical linkage studies of sibling pairs and show the effects of population admixture on the model outcomes.
3.
We describe the use of multivariate regression for testing allelic association in the presence of linkage, using marker genotype data from sibships. The test is valid, provided that the correct mean structure is modeled but does not require the correlation structure within families to be specified. The test can be implemented using standard statistical software such as the SAS programming language. In a simulation study, we evaluated this new test in comparison with one from a standard, matched-case-control analysis. First, we noted that the genetic effect needed to be quite extreme before residual familial correlation due to linkage led to false inference using the standard, matched-pair analysis. Second, we showed that under examples of extreme residual familial correlation, the new test had the correct test size. Third, we found that the test was more powerful than the sibship disequilibrium test of Horvath and Laird. Finally, we concluded that although the standard analysis may lead to correct inference for practical purposes, the new test is valid, even under extreme residual familial correlation and with no cost in power at the causal locus.
4.
We illustrate how homozygosity of haplotypes can be used to measure the level of disequilibrium between two or more markers. An excess of either homozygosity or heterozygosity signals a departure from gametic phase equilibrium. We describe the specific form of dependence that is associated with high (low) homozygosity and derive various linkage disequilibrium measures. They feature a clear biological interpretation, can be used to construct tests, and are standardized to allow comparison across loci and populations. They are particularly advantageous for measuring linkage disequilibrium between highly polymorphic markers.
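The idea of comparing observed haplotype homozygosity with its expectation under equilibrium can be sketched as follows. This is a minimal illustration, assuming phased haplotypes given as tuples of alleles; the function names are hypothetical, and the raw excess returned here is not one of the standardized measures derived in the paper.

```python
from collections import Counter


def homozygosity(items):
    """Probability that two draws (with replacement) from `items` are identical."""
    counts = Counter(items)
    n = len(items)
    return sum((c / n) ** 2 for c in counts.values())


def haplotype_homozygosity_excess(haplotypes):
    """Return (observed, expected, excess) haplotype homozygosity.

    `haplotypes` is a list of equal-length tuples, one allele per locus.
    The expected value assumes independence between loci, so it is the
    product of the single-locus homozygosities.
    """
    observed = homozygosity(haplotypes)
    expected = 1.0
    for locus_alleles in zip(*haplotypes):  # iterate over loci (columns)
        expected *= homozygosity(locus_alleles)
    return observed, expected, observed - expected
```

With haplotypes `[("A", "B"), ("A", "B"), ("a", "b"), ("a", "b")]` the observed homozygosity exceeds the expected value, whereas a sample containing all four allele combinations equally often shows no excess.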
5.
6.
Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes
The genotyping of closely spaced single-nucleotide polymorphism (SNP) markers frequently yields highly correlated data, owing to extensive linkage disequilibrium (LD) between markers. The extent of LD varies widely across the genome and drives the number of frequent haplotypes observed in small regions. Several studies have illustrated the possibility that LD or haplotype data could be used to select a subset of SNPs that optimize the information retained in a genomic region while reducing the genotyping effort and simplifying the analysis. We propose a method based on the spectral decomposition of the matrices of pairwise LD between markers, and we select markers on the basis of their contributions to the total genetic variation. We also modify Clayton's "haplotype tagging SNP" selection method, which utilizes haplotype information. For both methods, we propose sliding window-based algorithms that allow the methods to be applied to large chromosomal regions. Our procedures require genotype information about a small number of individuals for an initial set of SNPs and selection of an optimum subset of SNPs that could be efficiently genotyped on larger numbers of samples while retaining most of the genetic variation in samples. We identify suitable parameter combinations for the procedures, and we show that a sample size of 50-100 individuals achieves consistent results in studies of simulated data sets in linkage equilibrium and LD. When applied to experimental data sets, both procedures were similarly effective at reducing the genotyping requirement while maintaining the genetic information content throughout the regions. We also show that haplotype-association results that Hosking et al. obtained near CYP2D6 were almost identical before and after marker selection.
7.
We present a mathematically precise formulation of total linkage disequilibrium between multiple loci as the deviation from probabilistic independence and provide explicit formulas for all higher-order terms of linkage disequilibrium, thereby combining J. Dausset et al.'s 1978 definition of linkage disequilibrium with H. Geiringer's 1944 approach. We recursively decompose higher-order linkage disequilibrium terms into lower-order ones. Our greatest simplification comes from defining linkage disequilibrium at a single locus as allele frequency at that locus. At each level, decomposition of linkage disequilibrium is mathematically equivalent to number theoretic compositions of positive integers; i.e., we have converted a genetic decomposition into a mathematical decomposition.
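For orientation, the two- and three-locus cases can be written out in the standard textbook notation. This is an illustration in assumed notation ($p$ for allele and haplotype frequencies, $D$ for disequilibrium terms), not a reproduction of the paper's own formulas.

```latex
\begin{align}
D_{AB} &= p_{AB} - p_A p_B, \\
p_{ABC} - p_A p_B p_C &= D_{ABC} + p_A D_{BC} + p_B D_{AC} + p_C D_{AB}.
\end{align}
```

The left-hand side of the second line is the total deviation from independence at three loci; the right-hand side splits it into a third-order term plus second-order terms weighted by single-locus allele frequencies, which is the kind of recursive decomposition the abstract describes.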
8.
A new strategy for studying the genome structure and organization of natural populations is proposed on the basis of a combined analysis of linkage and linkage disequilibrium using known polymorphic markers. This strategy exploits a random sample drawn from a panmictic natural population and the open-pollinated progeny of the sample. It is established on the principle of gene transmission from the parental to the progeny generation, during which the linkage between different markers is broken down by meiotic recombination. The strategy has the power to simultaneously capture information about the linkage of the markers (as measured by recombination fraction) and the degree of their linkage disequilibrium created at a historic time. Simulation studies indicate that the statistical method implemented by the Fisher-scoring algorithm can provide accurate and precise estimates for the allele frequencies, recombination fractions, and linkage disequilibria between different markers. The strategy has great implications for constructing a dense linkage disequilibrium map that can facilitate the identification and positional cloning of the genes underlying both simple and complex traits.
9.
Daniel B. Sloan Peter D. Fields Justin C. Havird 《Proceedings. Biological sciences / The Royal Society》2015,282(1815)
There is extensive evidence from model systems that disrupting associations between co-adapted mitochondrial and nuclear genotypes can lead to deleterious and even lethal consequences. While it is tempting to extrapolate from these observations and make inferences about the human-health effects of altering mitonuclear associations, the importance of such associations may vary greatly among species, depending on population genetics, demographic history and other factors. Remarkably, despite the extensive study of human population genetics, the statistical associations between nuclear and mitochondrial alleles remain largely uninvestigated. We analysed published population genomic data to test for signatures of historical selection to maintain mitonuclear associations, particularly those involving nuclear genes that encode mitochondrial-localized proteins (N-mt genes). We found that significant mitonuclear linkage disequilibrium (LD) exists throughout the human genome, but these associations were generally weak, which is consistent with the paucity of population genetic structure in humans. Although mitonuclear LD varied among genomic regions (with especially high levels on the X chromosome), N-mt genes were statistically indistinguishable from background levels, suggesting that selection on mitonuclear epistasis has not preferentially maintained associations involving this set of loci at a species-wide level. We discuss these findings in the context of the ongoing debate over mitochondrial replacement therapy.
10.
Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium (cited by: 52; self-citations: 0; citations by others: 52)
Carlson CS Eberle MA Rieder MJ Yi Q Kruglyak L Nickerson DA 《American journal of human genetics》2004,74(1):106-120
Common genetic polymorphisms may explain a portion of the heritable risk for common diseases. Within candidate genes, the number of common polymorphisms is finite, but direct assay of all existing common polymorphisms is inefficient, because genotypes at many of these sites are strongly correlated. Thus, it is not necessary to assay all common variants if the patterns of allelic association between common variants can be described. We have developed an algorithm to select the maximally informative set of common single-nucleotide polymorphisms (tagSNPs) to assay in candidate-gene association studies, such that all known common polymorphisms either are directly assayed or exceed a threshold level of association with a tagSNP. The algorithm is based on the r² linkage disequilibrium (LD) statistic, because r² is directly related to statistical power to detect disease associations with unassayed sites. We show that, at a relatively stringent r² threshold (r² > 0.8), the LD-selected tagSNPs resolve >80% of all haplotypes across a set of 100 candidate genes, regardless of recombination, and tag specific haplotypes and clades of related haplotypes in nonrecombinant regions. Thus, if the patterns of common variation are described for a candidate gene, analysis of the tagSNP set can comprehensively interrogate for main effects from common functional variation. We demonstrate that, although common variation tends to be shared between populations, tagSNPs should be selected separately for populations with different ancestries.
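The selection rule described above (every SNP is either a tagSNP or exceeds an r² threshold with one) can be sketched with a simple greedy binning loop. This is a minimal sketch under stated assumptions, not the authors' implementation: sites are assumed biallelic and phased, alleles are coded 0/1, and the function names and tie-breaking rule (alphabetical order) are hypothetical.

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic sites, from phased alleles coded 0/1."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for x, y in zip(hap_a, hap_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b  # classical two-locus disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0


def greedy_tag_snps(sites, threshold=0.8):
    """Greedily bin SNPs; return a list of (tag, sorted bin members).

    `sites` maps a SNP name to its list of 0/1 alleles across haplotypes.
    """
    names = list(sites)
    r2 = {(a, b): r_squared(sites[a], sites[b]) for a in names for b in names}
    untagged = set(names)
    bins = []
    while untagged:
        # Pick the SNP that covers the most still-untagged SNPs;
        # ties go to the alphabetically first name.
        best = max(
            sorted(untagged),
            key=lambda s: sum(r2[s, t] >= threshold for t in untagged),
        )
        members = {t for t in untagged if r2[best, t] >= threshold}
        members.add(best)  # a monomorphic site still forms its own bin
        bins.append((best, sorted(members)))
        untagged -= members
    return bins
```

Each pass removes one bin, so the number of tagSNPs returned equals the number of bins; raising `threshold` generally produces more, smaller bins.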
11.
Aerts J Megens HJ Veenendaal T Ovcharenko I Crooijmans R Gordon L Stubbs L Groenen M 《Cytogenetic and genome research》2007,117(1-4):338-345
Many of the economically important traits in chicken are multifactorial and governed by multiple genes located at different quantitative trait loci (QTLs). The optimal marker density to identify these QTLs in linkage and association studies is largely determined by the extent of linkage disequilibrium (LD) around them. In this study, we investigated the extent of LD on two chromosomes in a white layer and two broiler chicken breeds. Pairwise levels of LD were calculated for 33 and 36 markers on chromosomes 10 and 28, respectively. We found that useful LD (i.e. an r² value higher than 0.3) in Nutreco chicken breed E5 (inbred) can extend to around 1 cM on chromosomes 10 and 28, although in a second region on chromosome 28 it extends to about 2.5 cM. The extent in breed Nutreco E3 (outbred) was very short in chromosome 10 (15 kb) but very much larger on chromosome 28, particularly in one region of depressed heterozygosity. The layer breed E2 (inbred) showed an extent of useful LD up to 4 cM on chromosome 10; the extent on chromosome 28 could not be assessed due to an erratic pattern of LD on that chromosome, although in one region LD appears to be in the order of 0.8 cM. This indicates that there may be very large differences in patterns of LD between different chicken breeds and different genomic regions.
12.
Maniatis N Collins A Gibson J Zhang W Tapper W Morton NE 《American journal of human genetics》2004,74(5):846-855
Recently, metric linkage disequilibrium (LD) maps that assign an LD unit (LDU) location for each marker have been developed (Maniatis et al. 2002). Here we present a multiple pairwise method for positional cloning by LD within a composite likelihood framework and investigate the operating characteristics of maps in physical units (kb) and LDU for two bodies of data (Daly et al. 2001; Jeffreys et al. 2001) on which current ideas of blocks are based. False-negative indications of a disease locus (type II error) were examined by selecting one single-nucleotide polymorphism (SNP) at a time as causal and taking its allelic count (0, 1, or 2, for the three genotypes) as a pseudophenotype, Y. By use of regression and correlation, association between every pseudophenotype and the allelic count of each SNP locus (X) was based on an adaptation of the Malecot model, which includes a parameter for location of the putative gene. By expressing locations in kb or LDU, greater power for localization was observed when the LDU map was fitted. The efficiency of the kb map, relative to the LDU map, to describe LD varied from a maximum of 0.87 to a minimum of 0.36, with a mean of 0.62. False-positive indications of a disease locus (type I error) were examined by simulating an unlinked causal SNP and the allele count was used as a pseudophenotype. The type I error was in good agreement with Wald's likelihood theorem for both metrics and all models that were tested. Unlike tests that select only the most significant marker, haplotype, or haploset, these methods are robust to large numbers of markers in a candidate region. Contrary to predictions from tagging SNPs that retain haplotype diversity, the sample with smaller size but greater SNP density gave less error. The locations of causal SNPs were estimated with the same precision in blocks and steps, suggesting that block definition may be less useful than anticipated for mapping a causal SNP. 
These results provide a guide to efficient positional cloning by SNPs and a benchmark against which the power of positional cloning by haplotype-based alternatives may be measured.
13.
14.
LDA--a java-based linkage disequilibrium analyzer (cited by: 7; self-citations: 0; citations by others: 7)
SUMMARY: We describe an integrated Java-based program that provides elaborate graphic and plain-text output of pairwise linkage disequilibrium (LD) analysis of single-nucleotide polymorphism genotype data. It is most suitable for molecular geneticists who focus on estimating LD measures, testing statistical significance, and predicting the extent of LD. AVAILABILITY: The software is available at: http://www.chgb.org.cn/lda/lda.htm. SUPPLEMENTARY INFORMATION: Detailed tutorials, the LDA help system, and examples are distributed with the LDA software. For Macintosh OS X users, JRE version 1.4 can be downloaded from http://connect.apple.com.
15.
Background
A major QTL for fatness and growth, denoted FAT1, has previously been detected on pig chromosome 4q (SSC4q) using a Large White – wild boar intercross. Progeny that carried the wild boar allele at this locus had higher fat deposition, shorter length of carcass, and reduced growth. The position and the estimated effects of the FAT1 QTL for growth and fatness have been confirmed in a previous study. In order to narrow down the QTL interval we have traced the inheritance of the wild boar allele associated with high fat deposition through six additional backcross generations.
Results
Progeny-testing was used to determine the QTL genotype for 10 backcross sires being heterozygous for different parts of the broad FAT1 region. The statistical analysis revealed that five of the sires were segregating at the QTL, two were negative while the data for three sires were inconclusive. We could confirm the QTL effects on fatness/meat content traits but not for the growth traits implying that growth and fatness are controlled by distinct QTLs on chromosome 4. Two of the segregating sires showed highly significant QTL effects that were as large as previously observed in the F2 generation. The estimates for the remaining three sires, which were all heterozygous for smaller fragments of the actual region, were markedly smaller. With the sample sizes used in the present study we cannot with great confidence determine whether these smaller effects in some sires are due to chance deviations, epistatic interactions or whether FAT1 is composed of two or more QTLs, each one with a smaller phenotypic effect. Under the assumption of a single locus, the critical region for FAT1 has been reduced to a 3.3 cM interval between the RXRG and SDHC loci.
Conclusion
We have further characterized the FAT1 QTL on pig chromosome 4 and refined its map position considerably, from a QTL interval of 70 cM to a maximum region of 20 cM and a probable region as small as 3.3 cM. The flanking markers for the small region are RXRG and SDHC and the orthologous region of FAT1 in the human genome is located on HSA1q23.3 and harbors approximately 20 genes. Our strategy to further refine the map position of this major QTL will be i) to type new markers in our pigs that are recombinant in the QTL interval and ii) to perform Identity-By-Descent (IBD) mapping across breeds that have been strongly selected for lean growth.
16.
GOLD--graphical overview of linkage disequilibrium (cited by: 38; self-citations: 0; citations by others: 38)
SUMMARY: We describe a software package that provides a graphical summary of linkage disequilibrium in human genetic data. It allows for the analysis of family data and is well suited to the analysis of dense genetic maps. AVAILABILITY: http://www.well.ox.ac.uk/asthma/GOLD CONTACT: goncalo@well.ox.ac.uk
17.
Inferences about linkage disequilibrium. (cited by: 32; self-citations: 0; citations by others: 32)
B S Weir 《Biometrics》1979,35(1):235-254
Existing theory for inferences about linkage disequilibrium is restricted to a measure defined on gametic frequencies. Unless gametic frequencies are directly observable, they are inferred from genotypic frequencies under the assumption of random union of gametes. Primary emphasis in this paper is given to genotypic data, and disequilibrium coefficients are defined for all subsets of two or more of the four genes, two at each of two loci, carried by an individual. Linkage disequilibrium coefficients are defined for genes within and between gametes, and methods of estimating and testing these coefficients are given for gametic data. For genotypic data, when coupling and repulsion double heterozygotes cannot be distinguished, Burrows' composite measure of linkage disequilibrium is discussed. In particular, the estimate for this measure and hypothesis tests based on it are compared to the usual maximum likelihood estimate of gametic linkage disequilibrium, and corresponding likelihood ratio or contingency chi-square tests. General use of the composite measure, whether or not random union of gametes is an appropriate assumption, is recommended. Attention is given to small samples, where the non-normality of gene frequencies will have greatest effect on methods of inference based on normal theory. Even tools such as Fisher's z-transformation for the correlation of gene frequencies are found to perform quite satisfactorily.
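The composite measure mentioned above can be computed directly from unphased genotypes, which is why it does not require distinguishing the two kinds of double heterozygote. The sketch below is an illustration under stated assumptions: genotypes are coded as counts (0, 1, or 2) of a reference allele at each locus, the function name is hypothetical, and the plain moment form is used without any small-sample correction.

```python
def composite_ld(geno_a, geno_b):
    """Composite disequilibrium from unphased genotypes coded 0/1/2.

    With allele counts X and Y at the two loci, the composite coefficient
    equals half the covariance of X and Y across individuals; this matches
    n_AB / n - 2 * p_A * p_B, where
    n_AB = 2*n_AABB + n_AABb + n_AaBB + n_AaBb / 2.
    """
    n = len(geno_a)
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    mean_ab = sum(x * y for x, y in zip(geno_a, geno_b)) / n
    return (mean_ab - mean_a * mean_b) / 2
```

For example, four individuals with genotypes `[2, 2, 0, 0]` at both loci give a value of 0.5, while `[2, 2, 0, 0]` paired with `[2, 0, 2, 0]` gives 0.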
18.
An approach to the investigation of the evolution of quantitative traits on the basis of analysis of two-locus marginal systems dynamics has been developed. It has been shown that under stabilizing selection the "quasi-stationary" state is quickly reached and maintained continuously. The "quasi-stationary" state is characterized by small changes in allele frequencies and by linkage disequilibrium that significantly decreases genotypic variance. Equations defining the role of linkage disequilibrium in the stationary state of mutation-selection balance are derived.
19.
20.
Thomas A 《Human heredity》2007,64(1):16-26
We review recent developments of MCMC integration methods for computations on graphical models for two applications in statistical genetics: modelling allelic association and pedigree based linkage analysis. We discuss and illustrate estimation of graphical models from haploid and diploid genotypes, and the importance of MCMC updating schemes beyond what is strictly necessary for irreducibility. We then outline an approach combining these methods to compute linkage statistics when alleles at the marker loci are in linkage disequilibrium. Other extensions suitable for analysis of SNP genotype data in pedigrees are also discussed and programs that implement these methods, and which are available from the author's web site, are described. We conclude with a discussion of how this still experimental approach might be further developed.