Found 20 similar articles (search time: 15 ms)
1.
Yan A Meng Yi Yu L Adrienne Cupples Lindsay A Farrer Kathryn L Lunetta 《BMC bioinformatics》2009,10(1):78
Background
Single nucleotide polymorphisms (SNPs) may be correlated due to linkage disequilibrium (LD). Association studies look for both direct and indirect associations with disease loci. In a Random Forest (RF) analysis, correlation between a true risk SNP and SNPs in LD may lead to diminished variable importance for the true risk SNP. One approach to address this problem is to select SNPs in linkage equilibrium (LE) for analysis. Here, we explore alternative methods for dealing with SNPs in LD: change the tree-building algorithm by building each tree in an RF only with SNPs in LE, modify the importance measure (IM), and use haplotypes instead of SNPs to build an RF.
2.
3.
Linkage disequilibrium testing when linkage phase is unknown (cited by: 2; self-citations: 0; citations by others: 2)
Schaid DJ 《Genetics》2004,166(1):505-512
Linkage disequilibrium, the nonrandom association of alleles from different loci, can provide valuable information on the structure of haplotypes in the human genome and is often the basis for evaluating the association of genomic variation with human traits among unrelated subjects. However, the linkage phase of genetic markers measured on unrelated subjects is typically unknown, and so measurement of linkage disequilibrium, and testing whether it differs significantly from the null value of zero, require statistical methods that can account for the ambiguity of unobserved haplotypes. A common method to test whether linkage disequilibrium differs significantly from zero is the likelihood-ratio statistic, which assumes Hardy-Weinberg equilibrium of the marker phenotype proportions. We show, by simulations, that this approach can be grossly biased, with either extremely conservative or liberal type I error rates. In contrast, we use simulations to show that a composite statistic, proposed by Weir and Cockerham, maintains the correct type I error rates, and, when comparisons are appropriate, has power similar to that of the likelihood-ratio statistic. We extend the composite statistic to allow for more than two alleles per locus, providing a global composite statistic, which is a strong competitor to the usual likelihood-ratio statistic.
4.
Sham PC Ao SI Kwan JS Kao P Cheung F Fong PY Ng MK 《Bioinformatics (Oxford, England)》2007,23(1):129-131
We have developed an online program, WCLUSTAG, for tag SNP selection that allows the user to specify variable tagging thresholds for different SNPs. Tag SNPs are selected such that a SNP with user-specified tagging threshold C will have a minimum R² of C with at least one tag SNP. This flexible feature is useful for researchers who wish to prioritize genomic regions or SNPs in an association study. AVAILABILITY: The online WCLUSTAG program is available at http://bioinfo.hku.hk/wclustag/
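The tagging criterion in this abstract (each SNP must reach its own threshold C in R² with at least one tag) can be illustrated with a small sketch. This is not WCLUSTAG's algorithm, which the abstract does not describe; it is a plain greedy set-cover stand-in, and the function name `greedy_tag_snps`, the matrix layout, and the tie-breaking rule are all assumptions made for illustration.

```python
def greedy_tag_snps(r2, thresholds):
    """Pick tag SNPs greedily so every SNP meets its own R^2 threshold.

    r2         -- square list of lists; r2[t][s] is the R^2 between SNPs t and s
    thresholds -- thresholds[s] is the tagging threshold C required by SNP s
    Hypothetical helper: a greedy set-cover sketch, not the WCLUSTAG method.
    """
    n = len(thresholds)
    # covers[t]: SNPs whose threshold would be satisfied if t were a tag.
    # A SNP always covers itself, which guarantees the loop terminates.
    covers = [
        {s for s in range(n) if s == t or r2[t][s] >= thresholds[s]}
        for t in range(n)
    ]
    untagged = set(range(n))
    tags = []
    while untagged:
        # Choose the SNP covering the most still-untagged SNPs
        # (ties broken in favour of the lowest index).
        best = max(range(n), key=lambda t: (len(covers[t] & untagged), -t))
        tags.append(best)
        untagged -= covers[best]
    return tags


if __name__ == "__main__":
    r2 = [
        [1.0, 0.9, 0.2],
        [0.9, 1.0, 0.2],
        [0.2, 0.2, 1.0],
    ]
    # Uniform threshold: SNP 0 tags SNP 1, and SNP 2 tags itself.
    print(greedy_tag_snps(r2, [0.8, 0.8, 0.8]))
    # Stricter threshold on SNP 1: only SNP 1 itself satisfies it.
    print(greedy_tag_snps(r2, [0.8, 0.95, 0.8]))
```

Raising the threshold of a single SNP changes which SNPs are chosen as tags, which is the kind of prioritization the abstract refers to.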
5.
Boyles AL Scott WK Martin ER Schmidt S Li YJ Ashley-Koch A Bass MP Schmidt M Pericak-Vance MA Speer MC Hauser ER 《Human heredity》2005,59(4):220-227
OBJECTIVES: Describe the inflation in nonparametric multipoint LOD scores due to inter-marker linkage disequilibrium (LD) across many markers with varied allele frequencies. METHOD: Using simulated two-generation families with and without parents, we conducted nonparametric multipoint linkage analysis with 2 to 10 markers with minor allele frequencies (MAF) of 0.5 and 0.1. RESULTS: Misspecification of population haplotype frequencies by assuming linkage equilibrium caused inflated multipoint LOD scores due to inter-marker LD when parental genotypes were not included. Inflation increased as more markers in LD were included and decreased as markers in equilibrium were added. When marker allele frequencies were unequal, the r² measure of LD was a better predictor of inflation than D'. CONCLUSION: This observation strongly supports the evaluation of LD in multipoint linkage analyses, and further suggests that unaccounted for LD may be suspected when two-point and multipoint linkage analyses show a marked disparity in regions with elevated r² measures of LD. Given the increasing popularity of high-density genome-wide SNP screens, inter-marker LD should be a concern in future linkage studies.
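To make the contrast between r² and D' concrete, the following minimal sketch computes both from haplotype and allele frequencies using the standard textbook definitions for two biallelic loci. The function name `ld_measures` and its argument names are assumptions for illustration; the example frequencies (0.5 and 0.1) merely echo the MAFs mentioned in the abstract and are not the study's simulated data.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    Hypothetical helper using the standard definitions of D, D' and r^2.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # D' rescales D by the largest |D| attainable for these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two loci's allele indicators.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Unequal allele frequencies: every B haplotype also carries A.
    d, d_prime, r2 = ld_measures(p_ab=0.1, p_a=0.5, p_b=0.1)
    print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

With allele frequencies of 0.5 and 0.1 and complete association, D' reaches 1 while r² is only about 0.11, which shows how the two measures can diverge when allele frequencies are unequal.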
6.
The extent of haplotype ambiguity in a string of single-nucleotide polymorphisms (SNPs) was quantified by Hodge et al. [Nat Genet 1999;21:360]. In their measure, the level of ambiguity increases with increasing numbers of loci and as loci become more polymorphic. That work assumed linkage equilibrium (LE). However, linkage disequilibrium (LD) provides additional information about the haplotypes at a site, thereby diluting the level of ambiguity. The ambiguity vanishes altogether when LD reaches its maximum value. Here, we introduce the ambiguity measure, Phi, to allow for LD (between pairs of SNPs). We derive the formula $\Phi = 4x_2x_3$ for ambiguity in individuals, where $x_1$, $x_2$, $x_3$ and $x_4$ are the probabilities of the $A_1A_2$, $A_1B_2$, $B_1A_2$ and $B_1B_2$ haplotypes, respectively, and w.l.o.g. $x_1x_4 \ge x_2x_3$. Alternatively, Phi can be expressed in terms of the allele frequencies and the LD parameter delta. We also extend the formula to triads of two parents plus one child. We estimate our measure Phi for relevant SNPs in the published lipoprotein lipase (LPL) gene dataset [Clark et al., Am J Hum Genet 1998;63:595; Nickerson et al., Nat Genet 1998;19:233], obtaining values ranging from a low of 0 to a high of 0.11 among adjacent pairs of sites. In genome-wide LD studies to map common disease genes, a dense map of SNPs may be utilized to detect association between a marker and disease. Therefore, the measurement of ambiguity can potentially help investigators to determine a more efficient map, designed to minimize ambiguity and subsequent information loss.
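The formula in this abstract can be worked through numerically. The sketch below assumes that the "w.l.o.g." condition amounts to relabeling alleles so that the smaller of the two products x1·x4 and x2·x3 enters the formula, giving Phi = 4·min(x1·x4, x2·x3). That reading, and the function name `ambiguity_phi`, are assumptions rather than something the abstract states outright.

```python
def ambiguity_phi(x1, x2, x3, x4):
    """Pairwise haplotype ambiguity Phi from four haplotype frequencies.

    x1..x4 -- frequencies of haplotypes A1A2, A1B2, B1A2, B1B2 (must sum to 1)
    Assumes the abstract's w.l.o.g. ordering x1*x4 >= x2*x3 can be enforced
    by relabeling, so the smaller product is the one used.
    """
    if min(x1, x2, x3, x4) < 0 or abs(x1 + x2 + x3 + x4 - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must be non-negative and sum to 1")
    return 4.0 * min(x1 * x4, x2 * x3)


if __name__ == "__main__":
    # Linkage equilibrium with equal allele frequencies.
    print(ambiguity_phi(0.25, 0.25, 0.25, 0.25))
    # Maximal LD: only two haplotypes exist, so phase is never ambiguous.
    print(ambiguity_phi(0.5, 0.0, 0.0, 0.5))
```

Under this reading, Phi equals the double-heterozygote probability at linkage equilibrium and falls to zero at maximal LD, consistent with the abstract's statement that ambiguity vanishes when LD reaches its maximum.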
7.
Robert Lawrence Aaron G Day-Williams Richard Mott John Broxholme Lon R Cardon Eleftheria Zeggini 《BMC bioinformatics》2009,10(1):367-5
Background
A number of tools for the examination of linkage disequilibrium (LD) patterns between nearby alleles exist, but none are available for quickly and easily investigating LD at longer ranges (>500 kb). We have developed a web-based query tool (GLIDERS: Genome-wide LInkage DisEquilibrium Repository and Search engine) that enables the retrieval of pairwise associations with r² ≥ 0.3 across the human genome for any SNP genotyped within HapMap phase 2 and 3, regardless of distance between the markers.
8.
9.
Mehar S Khatkar Matthew Hobbs Markus Neuditschko Johann Sölkner Frank W Nicholas Herman W Raadsma 《BMC bioinformatics》2010,11(1):171
Background
Recent developments of high-density SNP chips across a number of species require accurate genetic maps. Despite rapid advances in genome sequence assembly and availability of a number of tools for creating genetic maps, the exact genome location for a number of SNPs from these SNP chips still remains unknown. We have developed a locus ordering procedure based on linkage disequilibrium (LODE) which provides estimation of the chromosomal positions of unaligned SNPs and scaffolds. It also provides an alternative means for verification of genetic maps. We exemplified LODE in cattle.
10.
Miller JM Poissant J Kijas JW Coltman DW;International Sheep Genomics Consortium 《Molecular ecology resources》2011,11(2):314-322
The development of genomic resources for wild species is still in its infancy. However, cross-species utilization of technologies developed for their domestic counterparts has the potential to unlock the genomes of organisms that currently lack genomic resources. Here, we apply the OvineSNP50 BeadChip, developed for domestic sheep, to two related wild ungulate species: the bighorn sheep (Ovis canadensis) and the thinhorn sheep (Ovis dalli). Over 95% of the domestic sheep markers were successfully genotyped in a sample of fifty-two bighorn sheep while over 90% were genotyped in two thinhorn sheep. Pooling the results from both species identified 868 single-nucleotide polymorphisms (SNPs): 570 were detected in bighorn sheep and 330 in thinhorn sheep. The total panel of SNPs was able to discriminate between the two species, assign population of origin for bighorn sheep and detect known relationship classes within one population of bighorn sheep. Using an informative subset of these SNPs (n=308), we examined the extent of genome-wide linkage disequilibrium (LD) within one population of bighorn sheep and found that high levels of LD persist over 4 Mb.
11.
Wang L Luzynski K Pool JE Janoušek V Dufková P Vyskočilová MM Teeter KC Nachman MW Munclinger P Macholán M Piálek J Tucker PK 《Molecular ecology》2011,20(14):2985-3000
Theory predicts that naturally occurring hybrid zones between genetically distinct taxa can move over space and time as a result of selection and/or demographic processes, with certain types of hybrid zones being more or less likely to move. Determining whether a hybrid zone is stationary or moving has important implications for understanding evolutionary processes affecting interactions in hybrid populations. However, direct observations of hybrid zone movement are difficult to make unless the zone is moving rapidly. Here, evidence for movement in the house mouse Mus musculus domesticus × Mus musculus musculus hybrid zone is provided using measures of LD and haplotype structure among neighbouring SNP markers from across the genome. Local populations of mice across two transects in Germany and the Czech Republic were sampled, and a total of 1301 mice were genotyped at 1401 markers from the nuclear genome. Empirical measures of LD provide evidence for extinction and (re)colonization in single populations and, together with simulations, suggest hybrid zone movement because of either geography-dependent asymmetrical dispersal or selection favouring one subspecies over the other.
12.
Experimental designs for reliable detection of linkage disequilibrium in unstructured random population association studies
Ball RD 《Genetics》2005,170(2):859-873
A method is given for design of experiments to detect associations (linkage disequilibrium) in a random population between a marker and a quantitative trait locus (QTL), or gene, with a given strength of evidence, as defined by the Bayes factor. Using a version of the Bayes factor that can be linked to the value of an F-statistic with an existing deterministic power calculation makes it possible to rapidly evaluate a comprehensive range of scenarios, demonstrating the feasibility, or otherwise, of detecting genes of small effect. The Bayes factor is advocated for use in determining optimal strategies for selecting candidate genes for further testing or applications. The prospects for fine-scale mapping of QTL are reevaluated in this framework. We show that large sample sizes are needed to detect small-effect genes with a respectable-sized Bayes factor, and to have good power to detect a QTL allele at low frequency it is necessary to have a marker with similar allele frequency near the gene.
13.
Prediction of multi-locus inbreeding coefficients and relation to linkage disequilibrium in random mating populations (cited by: 1; self-citations: 0; citations by others: 1)
An algorithm to predict the level of identity by descent simultaneously at multiple loci is presented, which can in principle be extended to any number of loci. The model assumes a random mating population, with random association of haplotypes. The relationship is shown between coefficients of multi-locus identity or non-identity by descent and moments of multi-locus linkage disequilibrium. Thus, these moments can be computed from the multi-locus identity or, using algorithms derived previously to predict the disequilibria moments, vice versa. The results can be applied to predict multi-locus identity in, for example, gene mapping.
14.
Mapping and linkage disequilibrium analysis with a genome-wide collection of SNPs that detect polymorphism in cultivated tomato (cited by: 2; self-citations: 0; citations by others: 2)
Robbins MD Sim SC Yang W Van Deynze A van der Knaap E Joobeur T Francis DM 《Journal of experimental botany》2011,62(6):1831-1845
The history of tomato (Solanum lycopersicum L.) improvement includes genetic bottlenecks, wild species introgressions, and divergence into distinct market classes. This history makes tomato an excellent model to investigate the effects of selection on genome variation. A combination of linkage mapping in two F2 populations and physical mapping with emerging genome sequence data was used to position 434 PCR-based markers including SNPs. Three hundred and forty markers were used to genotype 102 tomato lines representing wild species, landraces, vintage cultivars, and contemporary (fresh market and processing) varieties. Principal component analysis confirmed genetic divergence between market classes of cultivated tomato (P < 0.0001). A genome-wide survey indicated that linkage disequilibrium (LD) decays over 6-8 cM when all cultivated tomatoes, including vintage and contemporary, were considered together. Within contemporary processing varieties, LD decayed over 6-14 cM, and decay was over 3-16 cM within fresh market varieties. Significant inter-chromosomal (gametic phase) LD was detected in both fresh market and processing varieties between chromosomes 2 and 3, and 2 and 4, but in distinct chromosomal locations for each market class. Additional LD was detected between chromosomes 3 and 4, 3 and 11, and 4 and 6 in fresh market varieties and chromosomes 3 and 12 in processing varieties. These results suggest that breeding practices for market specialization in tomato have led to a genetic divergence between fresh market and processing types.
15.
We describe the use of multivariate regression for testing allelic association in the presence of linkage, using marker genotype data from sibships. The test is valid, provided that the correct mean structure is modeled but does not require the correlation structure within families to be specified. The test can be implemented using standard statistical software such as the SAS programming language. In a simulation study, we evaluated this new test in comparison with one from a standard, matched-case-control analysis. First, we noted that the genetic effect needed to be quite extreme before residual familial correlation due to linkage led to false inference using the standard, matched-pair analysis. Second, we showed that under examples of extreme residual familial correlation, the new test had the correct test size. Third, we found that the test was more powerful than the sibship disequilibrium test of Horvath and Laird. Finally, we concluded that although the standard analysis may lead to correct inference for practical purposes, the new test is valid, even under extreme residual familial correlation and with no cost in power at the causal locus.
16.
Allele frequency matching between SNPs reveals an excess of linkage disequilibrium in genic regions of the human genome
Significant interest has emerged in mapping genetic susceptibility for complex traits through whole-genome association studies. These studies rely on the extent of association, i.e., linkage disequilibrium (LD), between single nucleotide polymorphisms (SNPs) across the human genome. LD describes the nonrandom association between SNP pairs and can be used as a metric when designing maximally informative panels of SNPs for association studies in human populations. Using data from the 1.58 million SNPs genotyped by Perlegen, we explored the allele frequency dependence of the LD statistic r² both empirically and theoretically. We show that average r² values between SNPs unmatched for allele frequency are always limited to much less than 1 (theoretical approximately 0.46 to 0.57 for this dataset). Frequency matching of SNP pairs provides a more sensitive measure for assessing the average decay of LD and generates average r² values across nearly the entire informative range (from 0 to 0.89 through 0.95). Additionally, we analyzed the extent of perfect LD (r² = 1.0) using frequency-matched SNPs and found significant differences in the extent of LD in genic regions versus intergenic regions. The SNP pairs exhibiting perfect LD showed a significant bias for derived, nonancestral alleles, providing evidence for positive natural selection in the human genome.
17.
Aerts J Megens HJ Veenendaal T Ovcharenko I Crooijmans R Gordon L Stubbs L Groenen M 《Cytogenetic and genome research》2007,117(1-4):338-345
Many of the economically important traits in chicken are multifactorial and governed by multiple genes located at different quantitative trait loci (QTLs). The optimal marker density to identify these QTLs in linkage and association studies is largely determined by the extent of linkage disequilibrium (LD) around them. In this study, we investigated the extent of LD on two chromosomes in a white layer and two broiler chicken breeds. Pairwise levels of LD were calculated for 33 and 36 markers on chromosomes 10 and 28, respectively. We found that useful LD (i.e. an r² value higher than 0.3) in Nutreco chicken breed E5 (inbred) can extend to around 1 cM on chromosomes 10 and 28, although in a second region on chromosome 28 it extends to about 2.5 cM. The extent in breed Nutreco E3 (outbred) was very short in chromosome 10 (15 kb) but very much larger on chromosome 28, particularly in one region of depressed heterozygosity. The layer breed E2 (inbred) showed an extent of useful LD up to 4 cM on chromosome 10; the extent on chromosome 28 could not be assessed due to an erratic pattern of LD on that chromosome, although in one region LD appears to be in the order of 0.8 cM. This indicates that there may be very large differences in patterns of LD between different chicken breeds and different genomic regions.
18.
A new strategy for studying the genome structure and organization of natural populations is proposed on the basis of a combined analysis of linkage and linkage disequilibrium using known polymorphic markers. This strategy exploits a random sample drawn from a panmictic natural population and the open-pollinated progeny of the sample. It is established on the principle of gene transmission from the parental to progeny generation during which the linkage between different markers is broken down due to meiotic recombination. The strategy has power to simultaneously capture the information about the linkage of the markers (as measured by recombination fraction) and the degree of their linkage disequilibrium created at a historic time. Simulation studies indicate that the statistical method implemented by the Fisher-scoring algorithm can provide accurate and precise estimates for the allele frequencies, recombination fractions, and linkage disequilibria between different markers. The strategy has great implications for constructing a dense linkage disequilibrium map that can facilitate the identification and positional cloning of the genes underlying both simple and complex traits.
19.
GOLD--graphical overview of linkage disequilibrium (cited by: 38; self-citations: 0; citations by others: 38)
SUMMARY: We describe a software package that provides a graphical summary of linkage disequilibrium in human genetic data. It allows for the analysis of family data and is well suited to the analysis of dense genetic maps. AVAILABILITY: http://www.well.ox.ac.uk/asthma/GOLD CONTACT: goncalo@well.ox.ac.uk
20.
Daniel B. Sloan Peter D. Fields Justin C. Havird 《Proceedings. Biological sciences / The Royal Society》2015,282(1815)
There is extensive evidence from model systems that disrupting associations between co-adapted mitochondrial and nuclear genotypes can lead to deleterious and even lethal consequences. While it is tempting to extrapolate from these observations and make inferences about the human-health effects of altering mitonuclear associations, the importance of such associations may vary greatly among species, depending on population genetics, demographic history and other factors. Remarkably, despite the extensive study of human population genetics, the statistical associations between nuclear and mitochondrial alleles remain largely uninvestigated. We analysed published population genomic data to test for signatures of historical selection to maintain mitonuclear associations, particularly those involving nuclear genes that encode mitochondrial-localized proteins (N-mt genes). We found that significant mitonuclear linkage disequilibrium (LD) exists throughout the human genome, but these associations were generally weak, which is consistent with the paucity of population genetic structure in humans. Although mitonuclear LD varied among genomic regions (with especially high levels on the X chromosome), N-mt genes were statistically indistinguishable from background levels, suggesting that selection on mitonuclear epistasis has not preferentially maintained associations involving this set of loci at a species-wide level. We discuss these findings in the context of the ongoing debate over mitochondrial replacement therapy.