Found 20 similar articles (search time: 15 ms)
1.
Single-nucleotide polymorphisms (SNPs) are rapidly replacing microsatellites as the markers of choice for genetic linkage studies and many other studies of human pedigrees. Here, we describe an efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees. Using a gene-counting algorithm suitable for pedigree data, our approach enables rapid estimation of allele and haplotype frequencies within clusters of tightly linked markers. In addition, with the use of a hidden Markov model, our approach allows for multipoint pedigree analysis with large numbers of SNP markers organized into clusters of markers in LD. Simulation results show that our approach resolves previously described biases in multipoint linkage analysis with SNPs that are in LD. An updated version of the freely available Merlin software package uses the approach described here to perform many common pedigree analyses, including haplotyping and haplotype frequency estimation, parametric and nonparametric multipoint linkage analysis of discrete traits, variance-components and regression-based analysis of quantitative traits, calculation of identity-by-descent or kinship coefficients, and case selection for follow-up association studies. To illustrate the possibilities, we examine a data set that provides evidence of linkage of psoriasis to chromosome 17.
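The gene-counting idea mentioned in this abstract can be illustrated with a much simpler case than the pedigree-aware algorithm in Merlin, which is not reproduced here. The sketch below assumes unrelated individuals and two biallelic SNPs, where only double heterozygotes have ambiguous phase. The function name, the genotype-table layout, and the example counts are all hypothetical.

```python
def em_haplotype_frequencies(genotype_counts, iterations=50):
    """Estimate two-SNP haplotype frequencies by gene counting (EM).

    genotype_counts[i][j] is the number of unrelated individuals carrying
    i copies of allele A at SNP 1 and j copies of allele B at SNP 2.
    """
    g = genotype_counts
    n_haplotypes = 2 * sum(sum(row) for row in g)
    if n_haplotypes == 0:
        raise ValueError("no genotypes supplied")

    # Haplotype copies that can be counted without ambiguity.
    known = {
        "AB": 2 * g[2][2] + g[2][1] + g[1][2],
        "Ab": 2 * g[2][0] + g[2][1] + g[1][0],
        "aB": 2 * g[0][2] + g[1][2] + g[0][1],
        "ab": 2 * g[0][0] + g[1][0] + g[0][1],
    }
    double_hets = g[1][1]  # phase unknown: AB/ab or Ab/aB

    freqs = {h: 0.25 for h in known}  # arbitrary starting point
    for _ in range(iterations):
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        # E-step: expected share of double heterozygotes that are AB/ab.
        p_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: recount haplotypes and renormalise.
        counts = {
            "AB": known["AB"] + double_hets * p_cis,
            "ab": known["ab"] + double_hets * p_cis,
            "Ab": known["Ab"] + double_hets * (1 - p_cis),
            "aB": known["aB"] + double_hets * (1 - p_cis),
        }
        freqs = {h: c / n_haplotypes for h, c in counts.items()}
    return freqs


# Hypothetical genotype table (rows: copies of A, columns: copies of B).
example_counts = [[4, 2, 0], [2, 6, 2], [0, 2, 4]]
example_freqs = em_haplotype_frequencies(example_counts)
```

A fixed iteration count keeps the sketch short; a practical implementation would more likely stop when the frequencies change by less than a tolerance.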
2.
Evaluation of linkage disequilibrium measures between multi-allelic markers as predictors of linkage disequilibrium between markers and QTL (cited by: 7; self-citations: 0; citations by others: 7)
Effectiveness of marker-assisted selection (MAS) and quantitative trait loci (QTL) mapping using population-wide linkage disequilibrium (LD) between markers and QTL depends on the extent of LD and how it declines with distance in a population. Because marker-QTL LD cannot be observed directly, the objective of this study was to evaluate alternative measures of observable LD between multi-allelic markers as predictors of usable LD of multi-allelic markers with presumed biallelic QTL. Observable LD between marker pairs was evaluated using eight existing measures and one new measure. These consisted of two pooled and standardized measures of LD between pairs of alleles at two markers based on Lewontin's LD measure, two pooled measures of squared correlations between alleles, one standardized measure using Hardy-Weinberg heterozygosities, and four measures based on the chi-square statistic for testing for association between alleles at two loci. In simulated populations with a range of LD generated by drift and a range of marker polymorphism, marker-marker LD measured by a standardized chi-square statistic (denoted χ²′) was found to be the best predictor of usable marker-QTL LD for a group of multi-allelic markers. Estimates of the level and decline of marker-marker LD with distance obtained from χ²′ were linearly and highly correlated with usable LD of those markers with QTL across population structures and marker polymorphism. Corresponding relationships were poorer for the other marker-marker LD measures. Therefore, when LD is generated by drift, χ²′ is recommended to quantify the amount and extent of usable LD in a population for QTL mapping and MAS based on multi-allelic markers.
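The abstract does not give the formula for its standardized chi-square measure. As a rough illustration only, the sketch below assumes one common standardization: the chi-square statistic of the haplotype contingency table divided by the number of haplotypes times (l − 1), where l is the smaller number of alleles at the two markers. That formula, the function name, and the example tables are assumptions, not taken from the paper.

```python
def standardized_chi_square(haplotype_table):
    """Chi-square of an allele-by-allele haplotype table, scaled to [0, 1].

    haplotype_table[i][j] is the count of haplotypes carrying allele i at
    marker 1 and allele j at marker 2. Assumed scaling: chi2 / (n * (l - 1)).
    """
    row_totals = [sum(row) for row in haplotype_table]
    col_totals = [sum(col) for col in zip(*haplotype_table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(haplotype_table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    # l: number of observed alleles at the less polymorphic marker.
    l = min(sum(1 for t in row_totals if t > 0),
            sum(1 for t in col_totals if t > 0))
    if l < 2:
        raise ValueError("both markers must be polymorphic")
    return chi2 / (n * (l - 1))


# Hypothetical haplotype counts for two biallelic markers.
biallelic = standardized_chi_square([[30, 10], [10, 30]])
# Hypothetical three-allele markers in complete association.
complete = standardized_chi_square([[10, 0, 0], [0, 10, 0], [0, 0, 10]])
```

Under this assumed scaling, the value for two biallelic markers coincides with the familiar r², which makes it a convenient sanity check.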
3.
Collins-Schramm HE Phillips CM Operario DJ Lee JS Weber JL Hanson RL Knowler WC Cooper R Li H Seldin MF 《American journal of human genetics》2002,70(3):737-750
Mapping by admixture linkage disequilibrium (MALD) is a potentially powerful technique for the mapping of complex genetic diseases. The practical requirements of this method include (a) a set of markers spanning the genome that have large allele-frequency differences between the parental ethnicities contributing to the admixed population and (b) an understanding of the extent of admixture in the study population. To this end, a DNA-pooling technique was used to screen microsatellite and diallelic insertion/deletion markers for allele-frequency differences between putative representatives of the parental populations of the admixed Mexican American (MA) and African American (AA) populations. Markers with promising pooled differences were then confirmed by individual genotyping in both the parental and admixed populations. For the MA population, screening of >600 markers identified 151 ethnic-difference markers (EDMs) with delta > 0.30 (where delta is the absolute value of each allele-frequency difference between two populations, summed over all marker alleles and divided by two) that are likely to be useful for MALD analysis. For the AA population, analysis of >400 markers identified 97 EDMs. In addition, individual genotyping of these markers in Pima Amerindians, Yavapai Amerindians, European American (EA) individuals, Africans from Zimbabwe, MA individuals, and AA individuals, as well as comparison to the CEPH genotyping set, suggests that the differences between subpopulations of an ethnicity are small for many markers with large interethnic differences. Estimates of admixture that are based on individual genotyping of these markers are consistent with a 60% EA:40% Amerindian contribution to MA populations and with a 20% EA:80% African contribution to AA populations. Taken together, these data suggest that EDMs with large interpopulation and small intrapopulation differences can be readily identified for MALD studies in both AA and MA populations.
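The delta statistic defined in this abstract is simple enough to compute directly. A minimal sketch follows; the function name and the allele frequencies are hypothetical and are not data from the study.

```python
def allele_frequency_delta(freqs_a, freqs_b):
    """Half the sum, over all alleles, of |frequency in A - frequency in B|.

    Each argument maps allele name -> frequency in one population; an
    allele missing from a population is treated as having frequency 0.
    """
    alleles = set(freqs_a) | set(freqs_b)
    return 0.5 * sum(
        abs(freqs_a.get(allele, 0.0) - freqs_b.get(allele, 0.0))
        for allele in alleles
    )


# Hypothetical microsatellite with three alleles in two parental populations.
population_1 = {"A1": 0.70, "A2": 0.20, "A3": 0.10}
population_2 = {"A1": 0.30, "A2": 0.30, "A3": 0.40}
delta = allele_frequency_delta(population_1, population_2)

# A marker would pass the abstract's screening threshold if delta > 0.30.
is_candidate_edm = delta > 0.30
```

For a diallelic marker the two absolute differences are equal, so delta reduces to the plain difference in frequency of either allele.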
4.
Li Yi Zhang Suzanne Marchand Nicholas A. Tinker François Belzile 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2009,119(1):43-52
Diversity Array Technology (DArT) markers were used to investigate the genetic diversity, population structure, and extent of linkage disequilibrium (LD) on a genome-wide level in Canadian barley (Hordeum vulgare L.). Approximately 1,000 DArT markers were polymorphic and scored with high confidence among a collection of 170 barley lines composed mostly of Canadian cultivars and breeding lines. The reproducibility of DArT markers proved very high, as 99.9% of allele calls were identical among seven replicated samples. The polymorphism information content (PIC) of DArT markers ranged between 0.04 and 0.50 with an average of 0.38. Using principal coordinate analysis (PCoA), most lines fell into one of two major groups reflecting inflorescence type (two-row versus six-row). Within these two large groups, evidence of geographic clustering of genotypes was also observed. A cluster analysis (Unweighted Pair Group Method with Arithmetic Mean, UPGMA) suggested the existence of three subgroups within the two-row group and four subgroups within the six-row group. An analysis of molecular variance (AMOVA) revealed highly significant (P < 0.001) genetic variance within subgroups, among subgroups, and among groups. Values of LD, expressed as r², declined with increasing genetic distance, and mean values of r² fell below 0.2 for markers located 2.6 cM apart. Approximately 8% of marker pairs located on the same chromosome and 3.4% of pairs located on different chromosomes were in LD (r² > 0.2). Within both the subsets of two-row and six-row lines, LD extended slightly further (3.5 cM) than for the entire set, while 7.5% of intra-chromosomal locus pairs and <2% of inter-chromosomal pairs were in LD. We discuss the implications of these findings with regard to the prospects of association mapping of complex traits in barley.
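For readers unfamiliar with the LD statistic used here, r² between two biallelic markers is the squared correlation of their allele scores across lines. The sketch below assumes each marker is scored 0/1 (as for a dominant present/absent marker in inbred lines); the function name and the scores are hypothetical, not data from the study.

```python
def ld_r_squared(scores_x, scores_y):
    """Squared Pearson correlation between two equal-length score vectors."""
    if len(scores_x) != len(scores_y) or not scores_x:
        raise ValueError("score vectors must be non-empty and equal length")
    n = len(scores_x)
    mean_x = sum(scores_x) / n
    mean_y = sum(scores_y) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(scores_x, scores_y)) / n
    var_x = sum((x - mean_x) ** 2 for x in scores_x) / n
    var_y = sum((y - mean_y) ** 2 for y in scores_y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("a monomorphic marker has no defined r^2")
    return cov * cov / (var_x * var_y)


# Hypothetical 0/1 scores for two markers across eight lines.
marker_1 = [1, 1, 1, 0, 0, 0, 1, 0]
marker_2 = [1, 1, 0, 0, 0, 0, 1, 1]
r2 = ld_r_squared(marker_1, marker_2)

# The abstract counts a pair as being in LD when r^2 exceeds 0.2.
in_ld = r2 > 0.2
```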
5.
Assessment of linkage disequilibrium in potato genome with single nucleotide polymorphism markers (cited by: 6; self-citations: 0; citations by others: 6)
The extent of linkage disequilibrium (LD) is an important factor in designing association mapping experiments. Unlike other plant species that have been analyzed so far for the extent of LD, cultivated potato (Solanum tuberosum L.), an outcrossing species, is a highly heterozygous autotetraploid. The favored genotypes of modern cultivars are maintained by vegetative propagation through tubers. As a first step in the LD analysis, we surveyed both coding and noncoding regions of 66 DNA fragments from 47 accessions for single nucleotide polymorphism (SNP). In the process, we combined information from the potato SNP database with experimental SNP detection. The total length of all analyzed fragments was >25 kb, and the number of screened sequence bases reached almost 1.4 million. Average nucleotide polymorphism (θ = 11.5 × 10⁻³) and diversity (π = 14.6 × 10⁻³) were high compared to the other plant species. The overall Tajima's D value (0.5) was not significant, but indicates a deficit of low-frequency alleles relative to expectation. To eliminate the possibility that an elevated D value occurs due to population subdivision, we assessed the population structure with probabilistic statistics. The analysis did not reveal any significant subdivision, indicating a relatively homogeneous population structure. However, the analysis of individual fragments revealed the presence of subgroups in the fragment closely linked to the R1 resistance gene. Data pooled from all fragments show relatively fast decay of LD in the short range (r² = 0.208 at 1 kb) but slow decay afterward (r² = 0.137 at approximately 70 kb). The estimate from our data indicates that LD in potato declines below 0.10 at a distance of approximately 10 cM. We speculate that two conflicting factors play a vital role in shaping LD in potato: the outcrossing mating type and the very limited number of meiotic generations.
6.
Effectiveness of marker-assisted selection (MAS) and quantitative trait locus (QTL) mapping using population-wide linkage disequilibrium (LD) between markers and QTLs depends on the extent of LD and how it declines with distance between markers and QTLs in a population. Marker-QTL LD can be predicted from LD between markers. Our previous work evaluated LD measures between multi-allelic markers as predictors of usable LD of multi-allelic markers with QTLs. Since single nucleotide polymorphisms (SNPs) are the current marker of choice for high-density genotyping and LD-mapping of QTLs, the objective of this study was to use LD between multi-allelic markers to predict LD among biallelic SNPs or between SNPs and QTLs. Observable LD between multi-allelic markers was evaluated using nine measures. These included two pooled and standardized measures of LD between pairs of alleles at two markers based on Lewontin's LD measure, two pooled measures of squared correlations between alleles, one standardized measure using Hardy-Weinberg heterozygosities, and four measures based on the chi-square statistic for testing for association between alleles at two loci. The standardized chi-square measure that best predicted usable LD between multi-allelic markers and QTLs, based on our previous work, overestimated usable SNP-SNP or SNP-QTL LD. Instead, three other measures were found to be good predictors of usable SNP-SNP or SNP-QTL LD when LD is generated by drift. Therefore, the LD measure between multi-allelic markers that is best for predicting usable LD in a population depends on the type of markers (i.e. multi-allelic or biallelic) that will eventually be used for QTL mapping or MAS.
7.
Maniatis N Collins A Gibson J Zhang W Tapper W Morton NE 《American journal of human genetics》2004,74(5):846-855
Recently, metric linkage disequilibrium (LD) maps that assign an LD unit (LDU) location for each marker have been developed (Maniatis et al. 2002). Here we present a multiple pairwise method for positional cloning by LD within a composite likelihood framework and investigate the operating characteristics of maps in physical units (kb) and LDU for two bodies of data (Daly et al. 2001; Jeffreys et al. 2001) on which current ideas of blocks are based. False-negative indications of a disease locus (type II error) were examined by selecting one single-nucleotide polymorphism (SNP) at a time as causal and taking its allelic count (0, 1, or 2, for the three genotypes) as a pseudophenotype, Y. By use of regression and correlation, association between every pseudophenotype and the allelic count of each SNP locus (X) was based on an adaptation of the Malecot model, which includes a parameter for location of the putative gene. By expressing locations in kb or LDU, greater power for localization was observed when the LDU map was fitted. The efficiency of the kb map, relative to the LDU map, to describe LD varied from a maximum of 0.87 to a minimum of 0.36, with a mean of 0.62. False-positive indications of a disease locus (type I error) were examined by simulating an unlinked causal SNP and the allele count was used as a pseudophenotype. The type I error was in good agreement with Wald's likelihood theorem for both metrics and all models that were tested. Unlike tests that select only the most significant marker, haplotype, or haploset, these methods are robust to large numbers of markers in a candidate region. Contrary to predictions from tagging SNPs that retain haplotype diversity, the sample with smaller size but greater SNP density gave less error. The locations of causal SNPs were estimated with the same precision in blocks and steps, suggesting that block definition may be less useful than anticipated for mapping a causal SNP. 
These results provide a guide to efficient positional cloning by SNPs and a benchmark against which the power of positional cloning by haplotype-based alternatives may be measured.
8.
Most linkage programs assume linkage equilibrium among multiple linked markers. This assumption may lead to bias for tightly linked markers where strong linkage disequilibrium (LD) exists. We used simulated data from Genetic Analysis Workshop 14 to examine the possible effect of LD on multipoint linkage analysis. Single-nucleotide polymorphism packets from a non-disease-related region that was generated with LD were used for both model-free and parametric linkage analyses. Results showed that high LD among markers can induce false-positive evidence of linkage for affected sib-pair analysis when parental data are missing. Bias can be eliminated with parental data and can be reduced when additional markers not in LD are included in the analyses.
9.
Dissecting linkage disequilibrium in African-American genomes: roles of markers and individuals (cited by: 1; self-citations: 0; citations by others: 1)
Xu S Huang W Wang H He Y Wang Y Wang Y Qian J Xiong M Jin L 《Molecular biology and evolution》2007,24(9):2049-2058
Substantial increases of linkage disequilibrium (LD), both in magnitude and in range, have been observed in recently admixed populations such as African-Americans (AfA). On the other hand, it has also been shown that LD in AfAs is very similar to that in Africans. In this study, we attempted to resolve these contradictory observations by conducting a systematic examination of the LD structure in AfAs, genotyping a sample of AfA individuals at 24,341 single nucleotide polymorphisms (SNPs) spanning almost the entire chromosome 21, with an average density of 1.5 kb/SNP. The overall LD in AfAs is similar to that in African populations and much less than that in European populations. Even when the ancestry-informative markers (AIMs) were used, extended LD in AfA was found to be limited to a certain magnitude range (0.2 ≤ r² ≤ 0.8) and a certain distance range, that is, between-marker distances of more than 200 kb. Furthermore, the inclusion of AfA individuals with predominantly African ancestry was found to reduce the overall magnitude of LD. Elevation of LD in the AfA population, compared with its parental populations, can be observed only at markers with large allele frequency differences between the 2 parental populations, and only in limited scenarios. AfA individuals of wholly African ancestry contribute little to the extended LD in the AfA population, and further genotyping or association analysis conducted using only admixed individuals may lead to higher statistical power and possibly reduced cost.
10.
Dense SNP maps can be highly informative for linkage studies. But when parental genotypes are missing, multipoint linkage scores can be inflated in regions with substantial marker-marker linkage disequilibrium (LD). Such regions were observed in the Affymetrix SNP genotypes for the Genetic Analysis Workshop 14 (GAW14) Collaborative Study on the Genetics of Alcoholism (COGA) dataset, providing an opportunity to test a novel simulation strategy for studying this problem. First, an inheritance vector (with or without linkage present) is simulated for each replicate, i.e., locations of recombinations and transmission of parental chromosomes are determined for each meiosis. Then, two sets of founder haplotypes are superimposed onto the inheritance vector: one set that is inferred from the actual data and which contains the pattern of LD; and one set created by randomly selecting parental alleles based on the known allele frequencies, with no correlation (LD) between markers. Applying this strategy to a map of 176 SNPs (66 Mb of chromosome 7) for 100 replicates of 116 sibling pairs, significant inflation of multipoint linkage scores was observed in regions of high LD when parental genotypes were set to missing, with no linkage present. Similar inflation was observed in analyses of the COGA data for these affected sib pairs with parental genotypes set to missing, but not after reducing the marker map until r² between any pair of markers was
11.
Mapping genes by drift-generated linkage disequilibrium (cited by: 1; self-citations: 0; citations by others: 1)
12.
We describe the use of multivariate regression for testing allelic association in the presence of linkage, using marker genotype data from sibships. The test is valid, provided that the correct mean structure is modeled but does not require the correlation structure within families to be specified. The test can be implemented using standard statistical software such as the SAS programming language. In a simulation study, we evaluated this new test in comparison with one from a standard, matched-case-control analysis. First, we noted that the genetic effect needed to be quite extreme before residual familial correlation due to linkage led to false inference using the standard, matched-pair analysis. Second, we showed that under examples of extreme residual familial correlation, the new test had the correct test size. Third, we found that the test was more powerful than the sibship disequilibrium test of Horvath and Laird. Finally, we concluded that although the standard analysis may lead to correct inference for practical purposes, the new test is valid, even under extreme residual familial correlation and with no cost in power at the causal locus.
13.
Michael Dean Jean A. Amos Jennifer Lynch Giovanni Romeo Marcella Devoto Ken Ward Dicky Halley Ben Oostra Maurizio Ferrari Silvia Russo Bruce S. Weir Paula B. Finn Francis S. Collins Michael C. Iannuzzi 《Human genetics》1990,85(3):275-278
Summary: Three polymorphic DNA markers surrounding the D7S8 locus were tested for their usefulness in the diagnosis of cystic fibrosis (CF) by linkage analysis. The markers correspond to the loci D7S424 and D7S426. These polymorphisms were studied by centers in the U.S., the United Kingdom, the Netherlands, and Italy, using samples from populations throughout Europe and North America. The additional information provided by these probes increased the heterogeneity of the region from 50% to 58% and was essential for a completely informative diagnosis in one family. A very high degree of linkage disequilibrium was found between these markers, which span a distance of approximately 250 kb. In addition, linkage disequilibrium with CF was noted. Significant heterogeneity of linkage disequilibrium was found among the populations, both for the marker-marker pairs and between the markers and CF.
14.
A new strategy for studying the genome structure and organization of natural populations is proposed on the basis of a combined analysis of linkage and linkage disequilibrium using known polymorphic markers. This strategy exploits a random sample drawn from a panmictic natural population and the open-pollinated progeny of the sample. It is established on the principle of gene transmission from the parental to progeny generation during which the linkage between different markers is broken down due to meiotic recombination. The strategy has power to simultaneously capture the information about the linkage of the markers (as measured by recombination fraction) and the degree of their linkage disequilibrium created at a historic time. Simulation studies indicate that the statistical method implemented by the Fisher-scoring algorithm can provide accurate and precise estimates for the allele frequencies, recombination fractions, and linkage disequilibria between different markers. The strategy has great implications for constructing a dense linkage disequilibrium map that can facilitate the identification and positional cloning of the genes underlying both simple and complex traits.
15.
Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes
The genotyping of closely spaced single-nucleotide polymorphism (SNP) markers frequently yields highly correlated data, owing to extensive linkage disequilibrium (LD) between markers. The extent of LD varies widely across the genome and drives the number of frequent haplotypes observed in small regions. Several studies have illustrated the possibility that LD or haplotype data could be used to select a subset of SNPs that optimize the information retained in a genomic region while reducing the genotyping effort and simplifying the analysis. We propose a method based on the spectral decomposition of the matrices of pairwise LD between markers, and we select markers on the basis of their contributions to the total genetic variation. We also modify Clayton's "haplotype tagging SNP" selection method, which utilizes haplotype information. For both methods, we propose sliding window-based algorithms that allow the methods to be applied to large chromosomal regions. Our procedures require genotype information about a small number of individuals for an initial set of SNPs and selection of an optimum subset of SNPs that could be efficiently genotyped on larger numbers of samples while retaining most of the genetic variation in samples. We identify suitable parameter combinations for the procedures, and we show that a sample size of 50-100 individuals achieves consistent results in studies of simulated data sets in linkage equilibrium and LD. When applied to experimental data sets, both procedures were similarly effective at reducing the genotyping requirement while maintaining the genetic information content throughout the regions. We also show that haplotype-association results that Hosking et al. obtained near CYP2D6 were almost identical before and after marker selection.
16.
Daniel B. Sloan Peter D. Fields Justin C. Havird 《Proceedings. Biological sciences / The Royal Society》2015,282(1815)
There is extensive evidence from model systems that disrupting associations between co-adapted mitochondrial and nuclear genotypes can lead to deleterious and even lethal consequences. While it is tempting to extrapolate from these observations and make inferences about the human-health effects of altering mitonuclear associations, the importance of such associations may vary greatly among species, depending on population genetics, demographic history and other factors. Remarkably, despite the extensive study of human population genetics, the statistical associations between nuclear and mitochondrial alleles remain largely uninvestigated. We analysed published population genomic data to test for signatures of historical selection to maintain mitonuclear associations, particularly those involving nuclear genes that encode mitochondrial-localized proteins (N-mt genes). We found that significant mitonuclear linkage disequilibrium (LD) exists throughout the human genome, but these associations were generally weak, which is consistent with the paucity of population genetic structure in humans. Although mitonuclear LD varied among genomic regions (with especially high levels on the X chromosome), N-mt genes were statistically indistinguishable from background levels, suggesting that selection on mitonuclear epistasis has not preferentially maintained associations involving this set of loci at a species-wide level. We discuss these findings in the context of the ongoing debate over mitochondrial replacement therapy.
17.
18.
Daryl J Somers Travis Banks Ron Depauw Stephen Fox John Clarke Curtis Pozniak Curt McCartney 《Génome》2007,50(6):557-567
Bread wheat and durum wheat were examined for linkage disequilibrium (LD) using microsatellite markers distributed across the genome. The allele database consisted of 189 bread wheat accessions genotyped at 370 loci and 93 durum wheat accessions genotyped at 245 loci. A significance level of p < 0.001 was set for all comparisons. The bread and durum wheat collections showed that 47.9% and 14.0% of all locus pairs were in LD, respectively. LD was more prevalent between loci on the same chromosome compared with loci on independent chromosomes and was highest between adjacent loci. Only a small fraction (bread wheat, 0.9%; durum wheat, 3.2%) of the locus pairs in LD showed R² values > 0.2. The LD between adjacent locus pairs extended (R² > 0.2) approximately 2-3 cM, on average, but some regions of the bread and durum wheat genomes showed high levels of LD (R² = 0.7 and 1.0, respectively) extending 41.2 and 25.5 cM, respectively. The wheat collections were clustered by similarity into subpopulations using unlinked microsatellite data and the software Structure. Analysis within subpopulations showed 14- to 16-fold fewer locus pairs in LD, higher R² values for those pairs in LD, and LD extending further along the chromosome. The data suggest that LD mapping of wheat can be performed with simple sequence repeats to a resolution of <5 cM.
19.
Aerts J Megens HJ Veenendaal T Ovcharenko I Crooijmans R Gordon L Stubbs L Groenen M 《Cytogenetic and genome research》2007,117(1-4):338-345
Many of the economically important traits in chicken are multifactorial and governed by multiple genes located at different quantitative trait loci (QTLs). The optimal marker density to identify these QTLs in linkage and association studies is largely determined by the extent of linkage disequilibrium (LD) around them. In this study, we investigated the extent of LD on two chromosomes in a white layer and two broiler chicken breeds. Pairwise levels of LD were calculated for 33 and 36 markers on chromosomes 10 and 28, respectively. We found that useful LD (i.e. an r² value higher than 0.3) in Nutreco chicken breed E5 (inbred) can extend to around 1 cM on chromosomes 10 and 28, although in a second region on chromosome 28 it extends to about 2.5 cM. The extent in breed Nutreco E3 (outbred) was very short in chromosome 10 (15 kb) but very much larger on chromosome 28, particularly in one region of depressed heterozygosity. The layer breed E2 (inbred) showed an extent of useful LD up to 4 cM on chromosome 10; the extent on chromosome 28 could not be assessed due to an erratic pattern of LD on that chromosome, although in one region LD appears to be in the order of 0.8 cM. This indicates that there may be very large differences in patterns of LD between different chicken breeds and different genomic regions.
20.